The Analyst’s Toolkit

Over the past year, I’ve worked with and spoken to people who have strong opinions about which tool to use for data cleaning and visualization. There was the interviewer who did not want me to use Tableau, the portfolio builder class that discouraged cleaning directly in Excel, and the recruiter who put more emphasis on Tableau than anything else, contradicting that first interviewer (they were at separate companies).

It wasn’t until I was interviewing with Home Depot that I started to put words to what I was actually doing. The hiring manager gave me the phrase for it: “using whatever tool fits the need.”

That was already my natural inclination, and going through the Google Data Analytics Certification has reinforced it.

The certification program is meant to be for everyone, including true beginners. They start with the foundations of spreadsheets, and rely pretty heavily on that. At least, they have so far and I’m almost done with Course 4.

I’ll get into the Certification program from Google in a different post, but I wanted to mention it to give more weight to the point I’m trying to make. I’ve spent so much time in the past year letting other people’s opinions sway me, because I didn’t have as much knowledge of analysis.

Now, I feel like I have a strong foundation, and a strong understanding of how to get things done.

Obviously every dataset is going to be different, but that’s the point. I wouldn’t use SQL for my Frognalysis, because there were only 88 entries. Then again, I wouldn’t just pore through the spreadsheet for the Video Game Time Series I was poking at, either, since it has over 20,000 entries.

I tried to use Python and pandas for my Frognalysis. It’s not that it didn’t work, but pandas pays off most on larger datasets, so it didn’t really fit the need for 88 entries. Meanwhile, I plugged the frog dataset into Tableau and, within an hour or so of tinkering, had a clean dashboard with as many insights as I could pull from it.

Don’t limit yourself to one language, or one tool. Learn everything you can get your mitts on. Python, Excel, Google Sheets, R, Looker, Power BI, Tableau… Every one of these things is going to have some kind of functionality that will get you further in your analysis of a tricky dataset. The more you learn, the wider you learn, the better equipped you are to deal with all sorts of datasets.

I don’t feel pushed toward any particular industry, and I have no qualms about where I’ll go or where I’ll end up: I’m not worried about that. My interests are as varied as my skills, and my portfolio reflects that.

It is a lot like what the hiring manager from Home Depot said to me: you ask the questions, then you figure out how to answer them. Is it easier for you to poke around in pandas and get an idea of the shape of the data that way? Or is it second nature for you to scroll through Excel and do a “Find and Replace” instead?
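For anyone who hasn’t tried the pandas route, here is a minimal sketch of what “getting the shape of the data” and a “Find and Replace” can look like there. The frog rows, column names, and the "grn" typo below are all made up for illustration; they are not my actual Frognalysis data.

```python
import io

import pandas as pd

# Hypothetical stand-in data; the real frog dataset is not shown in this post.
raw_csv = io.StringIO(
    "species,color,length_cm\n"
    "Tree Frog,grn,4.5\n"
    "Bullfrog,green,15.0\n"
    "Poison Dart Frog,blue,2.5\n"
    "Tree Frog,grn,\n"
)

df = pd.read_csv(raw_csv)

# Get a feel for the shape: row and column counts, column types, missing values.
n_rows, n_cols = df.shape
print(n_rows, n_cols)
print(df.dtypes)
missing = df.isna().sum()
print(missing)

# The pandas take on a spreadsheet "Find and Replace":
# swap every exact "grn" entry in the color column for "green".
df["color"] = df["color"].replace("grn", "green")
color_counts = df["color"].value_counts()
print(color_counts)
```

On a handful of rows like this, scrolling through a spreadsheet is probably just as quick. The same few lines start to pay off once the file gets too long to eyeball.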

None of these are bad things, and that took me a while to realize. Everyone will have different ideas for where you should be and what specific tools you should use, but don’t limit yourself. Get your hands dirty in what you know, and dabble until you’re comfortable in things that can help.

Anyway, the bigger the toolbox you have, the better equipped you are to handle special cases, both in personal and professional projects.

The Google Data Analytics Certification is helping with the toolbox, and I’ll write about that soon, I think. Overall it’s just felt like a refresher course for me, but it’s been good. I’m halfway through it already!

Until next week, I wish you the very best on your data journey, wherever you are. And happy 2023!
