Distinctions between Data Analysts, Data Scientists, and Data Engineers

My first job out of college was as a Data Analyst for Salt Lake County. My team wrote SQL against on-premises Microsoft SQL Server databases to build dashboards in Power BI and provide data to internal stakeholders. Little did I know that this job would begin my obsession with the data industry. Since then, I’ve had the great opportunity to rub shoulders with bright and talented data professionals, particularly after I moved to the startup world. There I ran into job titles that I had never even heard of, like DevOps Engineer, Data Engineer, Data Scientist, and ML Ops Engineer. Although we could write another article on all these job titles (which we probably will), this blog post will focus on three typical personas and how their responsibilities and skill sets overlap or differ. These personas are:

  1. Data Analyst
  2. Data Engineer
  3. Data Scientist

If you are thinking of making a career change or trying something new, this blog should give you a good idea of what you’ll need to learn to jump into that dream job!

Overview

The following Venn diagram summarizes the overlap between these three positions. Feel free to refer back to it as you read each persona section.

[Figure: data practitioner Venn diagram]

Each Persona section contains 3 sub-sections:

  1. Overview: Describes the day-to-day responsibilities of the persona
  2. The Skills: Lists the technical skills typically required of that persona
  3. Skill Gaps: Lists the technical skills that the persona is missing when compared to the other two personas

Data Analyst

As a Data Analyst, I spent most days writing and productionizing SQL. This SQL was either part of a larger data model or embedded directly in a dashboard (Tableau, Power BI, Sigma, Looker, etc.). The consumers of those dashboards and reports were usually internal to the company (i.e., I didn’t have to talk to external customers), so there was usually a lot of back and forth with them. Communication is vital for this role. You need to be able to explain complex and messy data to key decision-makers at the company (this usually includes CEOs, VPs, Directors, and so on).

Furthermore, you need to be able to establish stakeholder trust. If you feed stakeholders incorrect data in the form of a pretty dashboard, you lose credibility. On the other hand, if you accurately communicate something complex in a way they can digest quickly, you will be their favorite person at the company.
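To make the kind of SQL described above more concrete, here is a minimal sketch of a query that could sit behind a dashboard tile. It is written in Python with the built-in sqlite3 module so that it runs anywhere; the `orders` table, its columns, and every row in it are made up for illustration, and a real setup would point at a warehouse or a SQL Server instance instead.

```python
import sqlite3

# Hypothetical table and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, region TEXT, order_month TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (region, order_month, amount) VALUES (?, ?, ?)",
    [
        ("East", "2023-01", 50.0),
        ("East", "2023-01", 75.0),
        ("East", "2023-02", 20.0),
        ("West", "2023-01", 100.0),
    ],
)

# The aggregate a dashboard might chart: order count and revenue
# per region and month.
DASHBOARD_QUERY = """
SELECT region,
       order_month,
       COUNT(*)    AS order_count,
       SUM(amount) AS total_amount
FROM orders
GROUP BY region, order_month
ORDER BY region, order_month
"""


def monthly_summary(connection):
    """Return one row per region and month, ready to feed a chart."""
    return connection.execute(DASHBOARD_QUERY).fetchall()


summary = monthly_summary(conn)
for row in summary:
    print(row)
```

Keeping the query in one named place makes it easier to review and test, which goes a long way toward the stakeholder trust mentioned above.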

The Skills

If you want to land a gig as a Data Analyst, these are the must-have skills:

  • Interpersonal & Effective Communication
  • SQL
  • Data Modeling
  • Creating Dashboards & Visualizations
  • Storytelling

Skill Gaps

If you are a Data Analyst and want to become a Data Engineer, these are the skills you may be missing and need to work on:

  • Python Programming
  • ETL Orchestration in a tool like Apache Airflow
  • Version Control (git)

If you are a Data Analyst and want to become a Data Scientist, these are the skills you may be missing and need to work on:

  • Building Statistical Models in R or Python
  • Understanding/Interpreting the results of those models

Data Engineer

For me, “Data Engineer” always had a nice ring to it. I was naturally drawn to the work because I was interested in object-oriented programming, and I was jealous of those who could bend Python and Java to their will to make their work more accessible and efficient. As a Data Engineer, your days will be spent building and maintaining ETL/ELT pipelines (E = Extract, T = Transform, L = Load). Somebody at the company needs data that lives in an obscure system (some other database, S3 buckets, SFTP servers, HTTP endpoints, etc.), and it’s your job to ensure they get it validated, neatly modeled, and well-defined. There is less storytelling with this persona than with the Analyst and Scientist personas, but you will build the foundation upon which the Analysts and Scientists query their data. Much of the time, they will be your customers.
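As a rough illustration of the Extract, Transform, and Load steps, here is a minimal sketch that uses only the Python standard library. The CSV text stands in for an export from some source system, and the `customers` table, the column names, and the validation rules are all assumptions made up for this example. A production pipeline would typically run inside an orchestrator and load into a real warehouse.

```python
import csv
import io
import sqlite3
from datetime import date

# Made-up export standing in for data pulled from a source system.
RAW_CSV = """customer_id,signup_date,plan
1,2023-01-05,Pro
2,2023-01-09,free
,2023-02-01,pro
3,not-a-date,Free
"""


def extract(raw_text):
    """Extract: parse the raw CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(records):
    """Transform: validate types, normalize values, set aside bad rows."""
    clean, rejected = [], []
    for record in records:
        try:
            customer_id = int(record["customer_id"])
            signup = date.fromisoformat(record["signup_date"])
        except ValueError:
            # Missing id or malformed date: keep the row for inspection.
            rejected.append(record)
            continue
        plan = record["plan"].strip().lower()
        clean.append((customer_id, signup.isoformat(), plan))
    return clean, rejected


def load(connection, rows):
    """Load: write the validated rows into the target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS customers ("
        "customer_id INTEGER PRIMARY KEY, signup_date TEXT, plan TEXT)"
    )
    connection.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    connection.commit()


conn = sqlite3.connect(":memory:")
clean_rows, rejected_rows = transform(extract(RAW_CSV))
load(conn, clean_rows)
print(f"loaded {len(clean_rows)} rows, rejected {len(rejected_rows)}")
```

Setting rejected rows aside instead of silently dropping them is one possible design choice; it gives you something concrete to show the Analysts and Scientists when they ask why a record is missing.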

The Skills

If you want to land a gig as a Data Engineer, these are must-have skills:

  • Python or Scala
  • SQL
  • Data Modeling
  • Deep Understanding of ETL/ELT
  • Version Control with git

Skill Gaps

If you are a Data Engineer and want to become a Data Analyst, these are the skills you may be missing and need to work on:

  • Storytelling
  • Dashboarding & Visualizing Data
  • Interpersonal & Communication Skills

If you are a Data Engineer and want to become a Data Scientist, these are the skills you may be missing and need to work on:

  • Building Statistical Models in R or Python
  • Understanding/Interpreting the results of those models

Data Scientist

This is the hottest job title of our generation. Unfortunately, it’s also one of the most misunderstood. I cannot tell you how often I’ve seen companies rush to hire a Data Scientist when they have no Data Analysts or Data Engineers who have curated the data that Data Scientists need to build their statistical models. Data Scientists are typically statisticians who build models that predict critical events the company is interested in, such as customer attrition, customer expansion, or marketing campaign outcomes. Much like how we can guess a person’s weight given inputs like height, age, and gender, we can somewhat effectively guess these events of interest given inputs like customer logins, purchases, or demographics. However, to build these models, Data Scientists need data, and that data is usually procured from Data Analysts and Data Engineers.
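To give a feel for what predicting an event from inputs looks like, here is a small sketch of a logistic regression for customer attrition, fitted with plain stochastic gradient descent in pure Python. The training rows are synthetic and made up for this example, the feature choice (logins and purchases) simply mirrors the inputs mentioned above, and the learning rate and epoch count are arbitrary assumptions. In practice a Data Scientist would more likely reach for a library in R or Python than hand-roll the fitting loop.

```python
import math

# Synthetic, made-up data: (logins_per_month, purchases) -> 1 if the
# customer churned, 0 if they stayed.
TRAINING = [
    ((1.0, 0.0), 1),
    ((2.0, 0.0), 1),
    ((3.0, 1.0), 1),
    ((10.0, 2.0), 0),
    ((12.0, 3.0), 0),
    ((15.0, 4.0), 0),
]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, features):
    """Estimated probability that a customer with these features churns."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def fit(data, learning_rate=0.01, epochs=2000):
    """Fit weights and bias by stochastic gradient descent on log loss."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            error = predict(weights, bias, features) - label
            weights = [
                w - learning_rate * error * x
                for w, x in zip(weights, features)
            ]
            bias -= learning_rate * error
    return weights, bias


weights, bias = fit(TRAINING)
quiet_customer = predict(weights, bias, (1.0, 0.0))
active_customer = predict(weights, bias, (15.0, 4.0))
print(f"churn risk, quiet customer:  {quiet_customer:.2f}")
print(f"churn risk, active customer: {active_customer:.2f}")
```

Building the model is only half the job. Reading the fitted weights (for instance, whether more logins push the churn estimate down) is the interpretation skill listed under the skill gaps.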

The Skills

If you want to land a job as a Data Scientist, these are must-have skills:

  • Programming in R or Python
  • SQL
  • Understanding how to build Statistical Models (Regressions, Decision Trees, etc.)

Skill Gaps

If you are a Data Scientist and want to become a Data Analyst, these are the skills you may be missing and need to work on:

  • Data Modeling & Warehousing

If you are a Data Scientist and want to become a Data Engineer, these are the skills you may be missing and need to work on:

  • Data Modeling & Warehousing
  • Using Python for ETL/ELT vs. Statistical Modeling