Decoding the Roles in Data and ML Teams


Introduction

As an instructor for data professionals, I often find myself in the awkward position of being asked questions that don’t have a clear answer. For example, an aspiring data scientist once asked me, “What’s the difference in day-to-day tasks between a Data Engineer and a Data Scientist, or between a Data Scientist and an ML Engineer?”

It feels strange when I can’t give a simple, concise answer to such straightforward questions. The truth is, there’s so much context involved that it’s nearly impossible to provide a direct answer without oversimplifying.

That’s why I decided to write this blog post—to shed light on the nuances of different roles within data teams and provide the context needed to make the answers to these kinds of questions clear and intuitive.

Data Team Roles Composition

Let’s get one thing out of the way: in my opinion, roles should be well-defined and agreed upon within a team, even if people end up wearing multiple hats. Roles aren’t restrictions; they’re just a way to group tasks and responsibilities. No one has to fit perfectly into one box. Someone might do both Data Engineering and Data Science tasks—and that’s fine.

Roles serve two purposes, depending on the context:

  1. In a team, they group tasks together to ensure clarity and efficiency.

  2. For an individual, they group skills—since skills are essentially the ability to handle tasks.

With that in mind, I like to group roles into three major categories: Business, Data, and Machine Learning (ML). These categories help illustrate the main focus of each role and their interdependencies.

For example, to work on ML tasks, you first need data. To work effectively with data, you need a solid understanding of the business. This hierarchy is often overlooked in the industry. You can’t even talk about digital transformation without clear business processes in place. Similarly, you can’t adopt ML and AI without a proper data infrastructure.

The Five Key Roles

In a well-rounded data team, there are five key roles, which align intuitively with the categories above:

  • Business Category:

    • Business Analyst: Bridges the gap between business operations and data. They document requirements, semantics, and specifications for data and ML products.
  • Data Category:

    • Data Engineer: Focuses on building a platform that reliably delivers high-quality data on time.

    • Data Analyst: Works on delivering insights and data products to end users.

  • ML Category:

    • Data Scientist: Extracts hidden value from data using statistics and domain knowledge.

    • ML Engineer: Makes ML models designed by Data Scientists available to other software systems or end users.

Each role has a unique focus, but their work often depends on the deliverables of others. For instance, a Data Engineer needs clear business requirements from a Business Analyst to build a meaningful data platform. The same holds for the rest of the roles. The graph below draws those dependencies.
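The same dependencies can also be sketched in code. The snippet below is a minimal illustration, not a definitive model of any team. Only two edges come straight from the role descriptions above (the Data Engineer relying on the Business Analyst, and the ML Engineer relying on the Data Scientist). The other two are my assumptions, inferred from the Business, Data, ML hierarchy. The names `ROLE_DEPENDENCIES`, `delivery_order`, and `upstream_roles` are hypothetical and exist only for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each role maps to the roles whose deliverables it consumes.
ROLE_DEPENDENCIES = {
    "Business Analyst": set(),
    "Data Engineer": {"Business Analyst"},   # stated in the text
    "Data Analyst": {"Data Engineer"},       # assumption: insights need the data platform
    "Data Scientist": {"Data Engineer"},     # assumption: ML work needs data first
    "ML Engineer": {"Data Scientist"},       # stated in the role description
}


def delivery_order(dependencies):
    """Return roles ordered so each one appears after the roles it depends on."""
    return list(TopologicalSorter(dependencies).static_order())


def upstream_roles(role, dependencies):
    """Return every role that `role` depends on, directly or indirectly."""
    seen = set()
    stack = list(dependencies[role])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(dependencies[current])
    return seen


order = delivery_order(ROLE_DEPENDENCIES)
print(order)
print(sorted(upstream_roles("ML Engineer", ROLE_DEPENDENCIES)))
```

Under these assumed edges, the ML Engineer ends up depending on every role except the Data Analyst, which is one way to see why ML work stalls when the business and data layers are missing.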

The Gap between Position Titles & Actual Work

Here’s something I’ve said before, but it’s worth repeating: the title on a job description doesn’t always reflect the actual work. During interviews, it’s crucial to dig into the day-to-day tasks you’ll be handling.

For example, I’ve seen roles labeled “Data Engineer” that had nothing to do with engineering!

Another common issue is people ending up in roles that don’t align with their goals. This often happens when job seekers rely too much on titles instead of understanding the role’s actual responsibilities. If you find yourself in a job that doesn’t fit your expectations, it’s worth considering a change. It’s usually the best decision for both you and the organization.

That said, the gap between a title and the actual work doesn’t always have to be a bad thing. Some people, like myself, enjoy being generalists. Although my primary role is “Data Engineer”, I love wearing multiple hats and working across the spectrum of data roles. Whether this approach works for you often depends on the organization.

In my experience, smaller organizations tend to offer more flexibility, allowing you to take on diverse responsibilities and make a broader impact. On the flip side, larger organizations, such as those with thousands of employees, often prioritize specialists over generalists. While working for a big-name company such as one of the FAANG firms might look great on a résumé, it might not suit everyone’s working style.

Ultimately, it’s a matter of personal taste. The key is finding a place where you feel comfortable, aligned with your priorities, and able to work toward your technical and career goals.

Conclusion

Understanding the differences between data team roles isn’t just about satisfying curiosity—it’s about making informed career choices and building better teams.

If you’re aspiring to work in the data field, focus on the skills and tasks that align with your interests, not just the titles. And if you’re already part of a team, make sure roles and responsibilities are clear to everyone.

At the end of the day, whether you’re working in Business, Data, or ML, it all comes down to collaboration. When everyone knows their part, the whole team thrives.
