In the good old days, before the Big Data and Cloud revolutions, companies built classical Business Intelligence departments. Everything was simple in those days. There were three major roles, well known and recognized by everybody: DBA, ETL Developer and BI/Report Developer. Of course, we exclude managers, scrum masters and, let’s say, all the “soft” roles from our considerations. We are focusing here on people doing “actual work”, whose hands are dirty with data. I guess the responsibilities of each of those classical roles are clear to everybody, but let’s say a few words about them so that we can later compare them with modern data roles.
So the traditional data roles were DBA, ETL Developer and BI/Report Developer.
Data Stewards – GDPR!
Additionally, years ago, also before the big data revolution, one more role appeared: Data Steward. Do you remember the short hype around this position? Companies realized they were collecting so much data, and building such complex data models, that nobody had an overview of the whole data landscape. Data Stewards and their Data Governance practices also addressed data security and privacy issues. When the EU introduced the GDPR, data organizations responded by appointing GDPR officers, but let’s leave that for another post.
Do you remember the last time you saw a job offer for any of the traditional data positions? Me neither. Nobody would apply for an ETL Developer position today. No organization is looking for a Report Developer these days. Today everybody wants to be a Data Scientist. By the way, I recently came across a slightly malicious explanation of who Data Scientists actually are: they are the folks who are “better engineers than statisticians and better statisticians than engineers” [1].
Jokes aside. Let’s try to enumerate all modern data roles and briefly describe them:
Data Engineers – convert raw data into usable information
Let’s start with one of the two most popular data roles – Data Engineer.
Data Engineers – in the most basic sense, a Data Engineer uses tools like SQL and Python to make data ready for data scientists. They build data pipelines that source and transform the data into the structures needed for analysis. Apart from the modern tool stack, doesn’t it sound a bit like a good old ETL Developer?
However, Data Analytics platforms and practices have expanded while we still use only two terms to describe the people behind them, so far more responsibilities now fall under the Data Engineer label.
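To make the basic “SQL and Python” part of the job a bit more tangible, here is a minimal sketch of such a pipeline. It is only an illustration under my own assumptions: the table names (`raw_orders`, `daily_revenue`), the sample rows and the `run_pipeline` helper are all made up, and an in-memory SQLite database stands in for a real data platform.

```python
import sqlite3

# Hypothetical raw data, as it might arrive from a source system (all text).
RAW_ORDERS = [
    ("2024-01-05", "alice", "19.90"),
    ("2024-01-05", "bob", "5.00"),
    ("2024-01-06", "alice", "10.10"),
]


def run_pipeline(conn, raw_rows):
    """Load raw rows, then build a daily revenue table ready for analysis."""
    # Load: store the raw rows unchanged.
    conn.execute(
        "CREATE TABLE raw_orders (order_date TEXT, customer TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", raw_rows)
    # Transform: cast text amounts to numbers and aggregate per day.
    conn.execute(
        """
        CREATE TABLE daily_revenue AS
        SELECT order_date,
               COUNT(*) AS orders,
               ROUND(SUM(CAST(amount AS REAL)), 2) AS revenue
        FROM raw_orders
        GROUP BY order_date
        """
    )
    return conn.execute(
        "SELECT order_date, orders, revenue FROM daily_revenue ORDER BY order_date"
    ).fetchall()


conn = sqlite3.connect(":memory:")
daily = run_pipeline(conn, RAW_ORDERS)
for row in daily:
    print(row)
```

Real pipelines add scheduling, incremental loads and error handling on top, but the shape (source the data, transform it, store it in a structure built for analysis) stays the same.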
The next four roles can be understood as spin-offs from the Data Engineer role or simply as parts of it. Often they are all referred to simply as Data Engineers.
Data Integration Engineers – these guys focus on integrating data from multiple sources into a central data store (e.g. a data lake), that is, on the EL (Extract and Load) part of ELT processes. Peanuts for ETL Developers.
Data Platform Engineers – a role focused on running the infrastructure (nowadays mostly cloud) for data processing, and on spinning up the tools and configuring the services that make up a specific data platform. Security is also one of the main responsibilities. Think of it as a role that doesn’t primarily use the data platform to produce data, but makes it possible for others to do so. Like a DBA after a solid cloud course.
DataOps Engineers – everybody who reads the Faro Blog knows exactly what this means, but let’s simply say that a DataOps Engineer is a Data Engineer who uses DevOps practices (continuous integration, automated testing, continuous deployment, monitoring, orchestration and infrastructure as code) and works in accordance with agile and lean methodologies. Nowadays, many Data Engineers are actually DataOps Engineers without even realizing it.
Analytics Engineers – a rarely used title, but the goal is to take the processed and cleaned core data sets and turn them into something meaningful for data analysts or business users. Something similar to a DWH or BI developer, right?
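Of the DevOps practices mentioned for DataOps Engineers, automated testing is probably the easiest one to show in a few lines. The sketch below is only an assumption of what such a check might look like: the `check_quality` function, the field names and the sample rows are hypothetical, not taken from any particular tool.

```python
def check_quality(rows, required=("order_date", "customer", "amount")):
    """Return a list of human-readable problems found in the rows."""
    problems = []
    for i, row in enumerate(rows):
        # Every required field must be present and non-empty.
        for field in required:
            if row.get(field) in (None, ""):
                problems.append(f"row {i}: missing {field}")
        # Numeric amounts must not be negative.
        amount = row.get("amount")
        if isinstance(amount, (int, float)) and amount < 0:
            problems.append(f"row {i}: negative amount")
    return problems


good = [{"order_date": "2024-01-05", "customer": "alice", "amount": 19.9}]
bad = [{"order_date": "", "customer": "bob", "amount": -5.0}]

# In a CI job, assertions like these would run automatically on every change.
assert check_quality(good) == []
assert check_quality(bad) == [
    "row 0: missing order_date",
    "row 0: negative amount",
]
```

The point is less the check itself than where it runs: once it is part of continuous integration, a broken data set stops the deployment instead of surprising somebody in a report.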
Data Scientists – next generation of data analytics experts
Data Scientists – analytics professionals who are responsible for collecting, analyzing and interpreting data to help drive decision-making in an organization. A data scientist writes code and combines it with statistical knowledge to create insights from data. A bit like a BI Developer.
Data Modelers – they translate complex business data into user-friendly data models. They first design logical data models in dedicated tools, which are later mapped to physical data models. They existed before the big data and cloud era.
Data Analysts – they review data to identify key insights into a business's customers and ways the data can be used to solve problems. They also communicate this information to company leadership and other stakeholders. Also nothing new: I used to meet them in traditional BI departments.
AI & Machine Learning Engineers – they focus on building machine learning models and deploying them to production. Finally something new! A few years ago, sophisticated AI and ML algorithms were used only by real scientists (Ph.D. and above) or very specialized companies. By the way, back then those algorithms were called Data Mining.
What I like about the new names for data roles is the wide use of “engineers”. I’m glad that the IT world has finally promoted us! More seriously, in my opinion calling us engineers is more accurate than calling us developers: our work, tool stacks and everyday challenges are very different from those our colleagues, the software developers, face.
So, as we can see, traditional roles have evolved and split into somewhat more specialized roles, as the technology stacks in our Data Analytics solutions have become a bit more complex and, in many cases, our data departments have expanded. However, the modern roles are not as different from the old ones as they may look at first glance. If you are still filling one of the traditional roles, I have good news for you: when you decide to step up in your career, you will have at least a few options to choose from, and your new position will not turn your professional life upside down.
Sources: