EDA and Stepping Away from Tableau

I spent a lot of time over the past year or so hearing about Jupyter Notebook without really knowing what it was. In the past two weeks, I’ve gradually begun to spend so much time in Jupyter Notebook that it feels strange not to have a notebook running at any given time.

I’m still learning about its limitations and what I can and can’t do in it, but so far I’ve been enjoying myself, as you could probably guess from how much time I’ve been spending in it.

Originally this was going to be a portfolio post. Instead, I just want to talk about the EDA (exploratory data analysis) ups and downs I’ve had with the customer churn project I picked up for this portfolio builder cohort.

I also wanted to touch briefly on a data analysis challenge I was given as part of a test for a job interview.

When I started my data analysis journey, I was entirely self-taught. I spent a lot of time working on programming and learning SQL and Python, but the actual deep analysis didn’t happen until my first job in data analysis last year. Even then I was still using Excel. Then I learned Tableau, along with its possibilities and limits.

When it came time for the EDA for my portfolio builder program, I was hesitant to leave Tableau because I was a little unsure of Python and what I could achieve with it. That’s the biggest part: I didn’t know everything Python, pandas, NumPy, Matplotlib, and Plotly, among other packages, could do for me. There’s so much there, but nothing felt like it was collected in one single place.

And then I found a gallery for it: the Python Graph Gallery. That was a total game changer for me.

Still, I swear that there’s more that you don’t see on that website. There are so many possibilities in general.

I was a Tableau fangirl before, and I still have a love for it for the simpler things I want to get done or look at. Python has a much larger ecosystem of libraries, and even if I sometimes have to slog through functions (which I’m not as strong with), I think I prefer it.

I’m just talking about EDA now, but I did want to mention that the interview challenge required a projection, and I had a very hard time doing it in Python. I’ll get into that at a later time.

But literally, the same day, an hour or two later, after I had wrapped everything up and sent it off… I realized it probably would have been better to do it in R.

So my next journey is very likely to be getting into R and all of ITS functionalities and what else can be done there.

Next week I might talk about the difference between Jupyter Notebook and Google Colab. Anaconda in general has shown me a lot of other options for IDEs and I’m really excited to get into all of them just to see what I can do.

It can absolutely be difficult learning how to program. I’m lucky enough that I have the basics down to a point, but that doesn’t mean it doesn’t get super frustrating when the code gets me down. And this is before getting into anything interactive or animated. I want to try and branch out and find other visuals not just to help analysis, but to show the story in the data in a way that’s unique and grabs you and can be translated to anyone, regardless of their data literacy.

I am hoping that next week I’ll be able to provide some actual portfolio analyses. I realize I have a lot of cleaning left to do in the grocery dataset, so that’s my next project… After I finish working through the Residency Program for Women in Data!

That’s also another post for another time. For now, I’ll see you all next time, and until then, I wish you swift and easy data cleaning!
