With more digital data entering the world every day, the first jobs to emerge were those of data scientists. Today there is so much data that artificial intelligence is not only growing, it is also getting smarter each day. When building data-driven products, you need a data science team: data scientists, data engineers and product owners, to name a few. But how do you put together a solid team for developing data-driven products? And which roles are most crucial? With this article we hope to give you some insight.
Every role has its own focus, and they are all equally important. You have to make sure everyone is connected and working together, even though everyone is developing their own part. ‘Without good communication, the things being created won’t really work. Collaboration is key. Whatever solution they are working on, it needs to be done by the whole team,’ according to Romain Huet, Senior Data Scientist at TMC.
Romain created a sheet (see image) with the essential roles for a data science team. The sheet gives insight into each team member, how they contribute and who works alongside whom. So which roles are crucial for a data science team, and what do they do? We’ll give you an example of how a team works when building a platform tool such as Spotify.
When building a platform tool, you need to have stories and tasks available. Scrum masters make sure everyone on the team knows about these and is aligned. Together with the product owner (and the rest of the team) they define tasks and organize them in a roadmap. They also check and further refine the tasks needed for building the product, and they make sure everyone works as a team and knows their responsibilities. Furthermore, they help the team devise the best tasks according to the roadmap.
Possible next steps
When building a tool like Spotify, business managers get in touch with stakeholders. Their goal is to improve the product and its impact on the market. They think about all possible options and ask: ‘What could be done to complete our vision based on the market?’ Business managers report, and they put questions to (and answer questions from) the business intelligence specialists. Business intelligence specialists focus on how to improve the business and make it more profitable. For Spotify, they could think about offering several subscription types. They also study and evaluate the business models of competitors and try to figure out what can be done to compete with them. Using Tableau, they create dashboards that automate (daily or weekly) reports and visualize data. When are people using Spotify? And what do they use most? With this data, better decisions can be made. Business intelligence specialists are then able to give better advice to the business manager and discuss possible next steps.
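To make the two questions above concrete, here is a minimal sketch in Python. It is an illustration only: the event list, its fields and the function names are invented for this example, and in practice such figures would come from a reporting tool like Tableau on top of real usage data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical usage events (timestamp, feature used); invented sample data.
events = [
    ("2024-03-04T08:15:00", "playlist"),
    ("2024-03-04T08:40:00", "podcast"),
    ("2024-03-04T18:05:00", "playlist"),
    ("2024-03-04T18:30:00", "playlist"),
    ("2024-03-04T18:55:00", "search"),
]


def usage_by_hour(rows):
    """Count events per hour of the day: when are people using the product?"""
    return Counter(datetime.fromisoformat(ts).hour for ts, _ in rows)


def usage_by_feature(rows):
    """Count events per feature: what do people use most?"""
    return Counter(feature for _, feature in rows)


by_hour = usage_by_hour(events)
by_feature = usage_by_feature(events)

print("Busiest hour:", by_hour.most_common(1))
print("Most used feature:", by_feature.most_common(1))
```

With this sample data the busiest hour is 18:00 and the most used feature is the playlist. A dashboard would refresh the same kind of counts every day or week.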
Showcasing solutions
The responsibility of product owners is to see what can be done to make a product. They are constantly looking for data that answers their questions. They also get feedback from data analysts, which helps in developing and defining the product. Product owners keep business managers aligned and manage their expectations. Last but not least, data scientists showcase solutions to product owners to show what can be done to make the product. In short: product owners make sure a product becomes what it should be.
Data analysts find out whether something can be done, and what, based on the data available within the company. They use Python and Tableau to turn sales information into insights that help management in its decision making. They use Python and Snowflake to automate existing reporting and turn it into better solutions. By checking what is happening, they show product owners and business intelligence specialists whether building a product is possible. If so, they will tell them how. For instance, they evaluate customer feedback on a product and the impact of the tool on the market. After this, they discuss with product owners whether there is data that can help. For example: we need a new feature for reviews, so what do users want or need? Data analysts write queries that answer these questions.
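As a rough illustration of such a query, the sketch below uses Python with an in-memory SQLite database. The `feedback` table, its columns and its rows are made up for this example. A real team would run a comparable query against its own warehouse, such as Snowflake.

```python
import sqlite3

# Hypothetical feedback table; schema and rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feedback (user_id INTEGER, topic TEXT, rating INTEGER)")
conn.executemany(
    "INSERT INTO feedback VALUES (?, ?, ?)",
    [
        (1, "reviews", 2),
        (2, "reviews", 3),
        (3, "playlists", 5),
        (4, "reviews", 1),
        (5, "search", 4),
    ],
)

# Which topics do users mention most, and how satisfied are they with each?
rows = conn.execute(
    """
    SELECT topic, COUNT(*) AS mentions, AVG(rating) AS avg_rating
    FROM feedback
    GROUP BY topic
    ORDER BY mentions DESC, topic
    """
).fetchall()

for topic, mentions, avg_rating in rows:
    print(topic, mentions, avg_rating)
```

In this sample, reviews are mentioned most often and get the lowest average rating, which is the kind of signal a data analyst would take back to the product owner.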
Data scientists work side by side with data analysts (as well as data engineers and machine learning engineers). They make dashboards and proofs of concept (PoCs), get access to insights from data engineers and work with the company’s data. Data scientists check which data is already there and which questions have already been answered, so that the tool that gets built will actually be used. Do they conclude that better data is needed? Then they must interact with data engineers.
Working with the right data
Moving on to the machine learning engineers. They need to work with data engineers and consolidate all pipelines. They take models to scale and put them into a production environment, for example when Spotify wants to scale up. They work alongside software engineers on the backend to optimize technologies, and they collaborate with data engineers on the infrastructure.
Data engineers create and work with databases. They make sure data scientists get access to the data they need for building a tool. Without data you don’t know what to do or where to start. You might say this is the most crucial role in the data science team. That is why it is important that data engineers know how to build and structure a database. When someone runs a query, for example, it needs to be efficient. And data engineers make sure others have the right data to query.
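How the structure of a database affects query efficiency can be sketched with a small example. The Python snippet below uses an in-memory SQLite database. The `plays` table, the index name and the data are assumptions made for this illustration, not part of any real system. It compares the query plan for the same query before and after adding an index.

```python
import sqlite3

# Hypothetical table of play events; names and data are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plays (user_id INTEGER, track_id INTEGER, played_at TEXT)")
conn.executemany(
    "INSERT INTO plays VALUES (?, ?, ?)",
    [(n % 100, n, "2024-03-04") for n in range(1000)],
)

query = "SELECT COUNT(*) FROM plays WHERE user_id = ?"


def plan(connection, sql, params):
    """Return SQLite's description of how it intends to run the query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable detail.
    return " ".join(row[-1] for row in rows)


before = plan(conn, query, (42,))

# Structuring the data for the expected lookups: index the filter column.
conn.execute("CREATE INDEX idx_plays_user ON plays (user_id)")
after = plan(conn, query, (42,))

count = conn.execute(query, (42,)).fetchone()[0]

print("Without index:", before)
print("With index:   ", after)
print("Plays for user 42:", count)
```

Without the index, SQLite has to scan the whole table. With it, the plan refers to `idx_plays_user` and only the matching rows are read. The result of the query is the same either way; only the amount of work changes.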
Once you have your tool available and stakeholders are on board, you want to make it accessible to the public. This is when a solution (hopefully) goes into production and the proof of concept has to prove itself. Putting the tool into production allows users to actually start using it. Once it is on the market, you also need to make it attractive to the public. To win over your audience, the software engineers need to make sure the product looks nice and is easy to use. That’s where front-end software engineers come in, or, if possible, a UI or UX designer who thinks about the look and feel.
Building your team
Before building a data science team, it is important to figure out what you are creating and what you are looking for. A common mistake is that people start by looking for data scientists. Not because they need one (or two or three), but because everyone else is looking for them. Even though data scientists may be in short supply, you first need to make sure you have data to work with. And you won’t have the right data without a data engineer. So, what is the first thing you do when building your data science team? Hire a data engineer. Good luck!
Bram Thelen
Director Data Science | Nanotechnology | Physics, Netherlands
Tel: +31 (0)6 52 89 25 70