It’s hard not to make that title super long. This is the first part of my Beginner’s Guide to a Career in Data series. You don’t have to read these in the order they’re published; feel free to jump around or only read what’s relevant to you. Whatever helps!
One of the first things I wanted to cover is skills you already have that can apply to data careers. A lot of the time, the biggest part of transitioning careers is learning what you need to thrive in those roles. For a lot of data-focused jobs, that includes the query language SQL, a Business Intelligence tool like Power BI or Tableau, and, most of all, Microsoft Excel.
Excel has long been at the forefront of spreadsheet and analysis software. It’s evolved over decades to include a lot of different functions, including one I wasn’t aware of until very recently: predictions, through its forecasting functions.
The reason I put this on here is that Excel is used across a lot of different professions. In Admin specifically, which is where I’m transitioning from, I spent a lot of time on projects in Excel, and that gave me a good basic understanding of the application.
Excel isn’t the only thing that will give you a good foundation for a transition: there are a lot of basic skills that carry over from other roles as well.
Soft skills, specifically communication (written and verbal), are among the most sought after. The reason is that you have to know how to communicate your findings to someone who isn’t a data person and who may have different interests or goals than you.
Stakeholders won’t really be interested in how cool you thought it was to get something to work in Tableau. They want to know how it will help their overall goal, and whether it’ll help them achieve anything more or different than before. They want to know what they have to do.
One of the things I always felt played to my strengths was having an interest in technical writing. For years I explained technical writing as “simplifying things for people who aren’t technical,” and that is essentially it. You take a generally technical or very mathematical approach and you find a way to communicate it to someone who doesn’t share that background, in a way they’ll understand.
You do have to consider what they might know, and break things down enough that they won’t get lost, while also expecting them to be familiar with other concepts.
On the point of communication, one of the more surprising things I’ve had to learn lately has been PowerPoint.
Presenting your findings in a neat, orderly fashion that’s easy to follow and isn’t overcrowded is incredibly important. If you’re good at putting presentations together in a way that’s not just visually appealing but also communicates well, you’re on the right track.
Terminology/Jargon
This glossary from dataquest.io is comprehensive, even if it isn’t exhaustive, and covers a lot more terms you might see from me. Here I wanted to cover the ones I felt were the biggest, the most confusing, or the most likely to be in your skill set already.
BI
Business Intelligence. When you see someone, especially me, saying “BI Tools” or “BI Analysis”, the BI refers to Business Intelligence.
Data Wrangling
This is the process of getting data into a usable form. That can mean making data yourself, collecting it from a resource, or finding a CSV with data, and then cleaning and reshaping it so it can be analyzed. It could also involve web scraping, which means using programming to pull information from websites.
Data Mining
Data mining is very broad: it’s the process of analyzing data to find patterns and get insights from it. It could probably be rolled into Exploratory Data Analysis.
I’ll admit that I was a little surprised to find this out, because my experience with data mining was with games, where it meant looking for things (images, names of items) that were left in the code, or elsewhere in a new patch or download, for the game. In that space, that definitely counts: you’re pulling insights from information in a different way than just visualizing.
EDA – Exploratory Data Analysis
My exposure to this phrase came with the machine learning course I took. Like I said, it could be rolled into Data Mining, but EDA also includes getting the shape of the data, finding out which values are null, seeing what could be dropped or kept, and seeing how big the dataset is in general. It goes from getting a good feel for the data to visualizing the different values and insights.
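If you’re curious what those first EDA checks look like in code, here’s a minimal sketch using only Python’s standard library. The sample data and the `summarize` function are made up for illustration; in practice you would more likely do this with a library like pandas.

```python
import csv
import io
from collections import Counter

# Hypothetical sample data standing in for a CSV file you might load.
RAW_CSV = """customer_id,plan,monthly_charge,churned
1,basic,20.0,no
2,premium,,yes
3,basic,22.5,no
4,,35.0,yes
"""

def summarize(csv_text):
    """Return the shape of the data and the count of empty values per column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    missing = Counter()
    for row in rows:
        for col in columns:
            # Treat absent or blank cells as missing values.
            if row[col] is None or row[col].strip() == "":
                missing[col] += 1
    return {
        "rows": len(rows),
        "columns": len(columns),
        "missing": {col: missing[col] for col in columns},
    }

summary = summarize(RAW_CSV)
print(summary)
```

Knowing how many rows, columns, and empty cells you have is usually the first thing that tells you what can be kept and what might need to be dropped or filled in.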
ETL – Extract, Transform, Load
This is one I know people will be familiar with. ETL sounds intimidating, but it can be as simple as finding or wrangling data (extract), removing “NA” or “NaN” from cells or putting a 0 where there’s no value at all (transform), and then loading it into your platform of choice, such as Excel, a SQL database, a Jupyter Notebook, or Google Colab (load).
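The three steps can be sketched in a few lines of Python. This is only an illustration under some assumptions: the sample data, the `sales` table, and the function names are all hypothetical, and it loads into an in-memory SQLite database so nothing is saved to disk. Filling gaps with 0 follows the description above; whether 0 is the right replacement depends on what the column means.

```python
import csv
import io
import sqlite3

# Hypothetical raw export with the kinds of gaps described above.
RAW_CSV = """item,units_sold
apples,10
pears,NA
plums,
kiwis,NaN
"""

def extract(csv_text):
    """Extract: read the raw rows out of the CSV text."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Transform: replace 'NA', 'NaN', and blanks with 0, and convert to int."""
    placeholders = {"", "NA", "NaN"}
    cleaned = []
    for row in rows:
        value = (row["units_sold"] or "").strip()
        units = 0 if value in placeholders else int(value)
        cleaned.append((row["item"], units))
    return cleaned

def load(records, connection):
    """Load: write the cleaned records into a database table."""
    connection.execute("CREATE TABLE sales (item TEXT, units_sold INTEGER)")
    connection.executemany("INSERT INTO sales VALUES (?, ?)", records)
    connection.commit()

connection = sqlite3.connect(":memory:")  # in-memory database, nothing saved to disk
load(transform(extract(RAW_CSV)), connection)
result = connection.execute(
    "SELECT item, units_sold FROM sales ORDER BY item"
).fetchall()
print(result)
```

If you have ever cleaned up a spreadsheet before handing it to someone else, you have done a manual version of the same thing.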
Related term: ETL Pipeline
I almost didn’t put this term in here, but I did want to mention it in relation to ETL. An ETL pipeline can be a complex process done with different programming languages (I mostly know it to be done with Python). Where I’ve run into it most is Natural Language Processing, which is in the Machine Learning space, but pipelines are used for all kinds of data.
ETL pipelines, or data pipelines, are usually created by Data Engineers. They often use cloud-based services like AWS to store the data, which is then loaded from the cloud into a Business Intelligence (BI) tool like Tableau or Power BI to be visualized and mined.
A/B Testing
Admittedly, I have a strange relationship with A/B testing, because it can mean a couple of different things. For me, A/B testing has meant taking a binary value in a dataset, such as gender or senior citizen status, and testing it against another value, such as customer churn.
A/B testing can get more involved, though. At its core it boils down to how people react to one choice over another, such as two different colors on a web page.
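As a rough illustration of the web page example, here’s a sketch of one common way to compare two versions: a two-proportion z-test, which asks whether the difference in response rates is bigger than chance alone would explain. The function name and all the numbers are made up, and this is only one of several ways to analyze an A/B test.

```python
import math

def two_proportion_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    # Pooled rate: the overall conversion rate if A and B were really the same.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up numbers: a blue button (A) versus a green button (B).
z, p_value = two_proportion_z_test(120, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

A small p-value suggests the difference between the two versions is unlikely to be pure chance, which is the kind of answer a stakeholder can act on.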
Modeling
Modeling, like most machine learning concepts, has roots in Statistics. I loved working on modeling back in school. A lot of these terms can be used to describe things on a spectrum from very simple to complex and detailed.
Modeling can mean creating a linear regression model with a handful of specific data points, the way you would in school, or building one from a large dataset. The general takeaway from modeling is finding relationships in the data, such as how strongly two values are correlated.
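For the simple end of that spectrum, here’s a minimal sketch of a linear regression fitted by ordinary least squares, the same calculation you might have done by hand in a statistics class. The `fit_line` function and the data are invented for illustration.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: hours studied versus test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 67, 72]

slope, intercept = fit_line(hours, scores)
predicted = slope * 6 + intercept  # estimated score after 6 hours
print(f"score is roughly {slope:.1f} * hours + {intercept:.1f}")
print(f"predicted score for 6 hours: {predicted:.1f}")
```

The same idea scales up: larger datasets and more complex models still come down to describing how one value changes as another does.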
Database
Like modeling, databases can be thought of as being on a spectrum. For smaller companies, and often non-profit organizations, a database can be housed in Excel and be as simple as having basic data and reports there.
Larger companies or more tech-forward organizations tend toward specific database software like MongoDB, Oracle, MySQL, Microsoft SQL Server, and others. (A larger list can be found here.)
Related term: Dataframe
Dataframe is something you’ll see if you program with pandas in Python. A DataFrame is the table that holds the data from the CSV or XLS file you load in: a collection of all of that data, organized into rows and columns.
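To make that concrete, here’s a small sketch that assumes pandas is installed. The data is made up, and it’s read from a string instead of a real file so the example stands on its own.

```python
import io

import pandas as pd

# Hypothetical data standing in for a CSV file on disk.
RAW_CSV = """name,department,salary
Ana,Admin,48000
Ben,Sales,
Cy,Admin,52000
"""

# read_csv turns the CSV text into a DataFrame of rows and columns.
df = pd.read_csv(io.StringIO(RAW_CSV))

print(df.shape)         # (number of rows, number of columns)
print(df.isna().sum())  # count of missing values in each column

# Average salary per department, a bit like a pivot table in Excel.
average_salary = df.groupby("department")["salary"].mean()
print(average_salary)
```

If you’re comfortable with tables and pivot tables in Excel, a DataFrame will feel familiar quickly.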
Database Admin
It almost feels sacrilegious to be so basic with this description, but it still counts. If you were in charge of keeping a database clean and making sure that new entries weren’t breaking formulas anywhere else (if you were using Excel, like me), then you were the admin of the database.
This extends to more complex things as well, such as managing something in a SQL database or anything else that houses organized data.
Visualization
Though it almost doesn’t need a description, I wanted to add this in. If you’ve ever made charts or graphs of any kind, you have experience with visualization!
———————————–
So many of these sound intimidating, and again, if you click through to the dataquest.io link, you’ll see even more that might make your head spin. When it comes down to it, though, these are relatively simple concepts with big names attached.
My hope for this series is to help give you more confidence in your skills and where you stand right now, whether you just began your transition or you’re just curious about it. It’s possible to get into this field, and you don’t have to be an expert at first. You have opportunities across the internet to join groups and programs that will help you hone these skills.
In a lot of ways, you might already have a really good baseline to come in on a new career.
The next part of the guide I’ll be posting will go more in depth about universal skills: dashboarding, visualizations, reports, presentations. If there’s anything specific you want me to cover, please let me know!
Until then, see you in the next post!