How to Manage a Data Science Team in a Startup

The five pillars to keep your team focused

Jorge Peñalva

Published in

Building Lang.ai

6 min read · Apr 12, 2018

Introduction

When you start a new company, you face many uncertainties and it is definitely going to be tough. There is a lot to learn along the way. With the growth of AI and big companies investing billions of dollars in research, some of us are crazy enough to build a deep-tech product, which requires setting up the right structure for research and innovation.

As a startup, organizing a research team (in our case Data Science, and more specifically NLP) is complex because you need a long-term vision while you have a lot of short-term needs. We decided to share how we organize and manage our team at lang.ai, hoping it can help other people or small companies like us that are on the same path.

The reason is that we believe in a world where not all deep-tech innovation comes from Microsoft, Amazon and Google. Our view is that deep-tech projects do not succeed or compete based only on the resources invested, but also on how well the talent is aligned with the vision of the company and focused on a unique value proposition for the community and/or clients.

How we do it

These are the five pillars we maintain for our Data Science team:

1. Transparency and Market Orientation

At each Data Science meeting I share the recent developments in the market, the challenges, and the projects we are facing commercially. It helps the team to know that we are building the core of a technology that is useful for companies and individuals. Some projects or prospects also give us really good ideas that would make our product and technology better for their use case.

At the same time, once we sign a client, we all have an idea of what challenges we will face and how the client is going to use the product.

2. Being Product-Oriented

The product team is aware of the Data Science roadmap, which is open to the whole company. That way, once a new data science breakthrough is made, the product team has already prepared the corresponding functionality in the product. Our lead technical product manager and CTO both work hand in hand with the Data Science team in order to deeply understand the new advances and how they impact the frontend and backend of our infrastructure.

3. Horizontal Management

The team discusses which topics make the most sense for continuing our research. Ultimately it is a decision taken together in a research strategy session held every semester, in which the Data Science team, CTO and CEO participate.

In order to decide, we take three main things into account:

The market
The current research in the field
The technical difficulty

4. Specialized Leadership

We have the following leadership structure for our daily work:

A technical leader: Coordinates where the research is going and guides/helps the rest of the team in their specific tasks.
A SCRUM leader: Helps the rest of the team organize and make the most of their time to meet the goals.
A team leader: In our case, both the CEO and CTO are responsible for keeping an overall view of team and individual performance with regard to our cultural values (transparency, innovation, quality and eagerness).
Every semester, we review the technical goals and the team/culture goals to assess the results and health of the team. Performance metrics are also shared individually with each member of the team in a 1-on-1 meeting, where everyone can give feedback and propose ideas for improving how the work is done, which we can then try out in the next semester.

5. Adapting Agile for Research teams

We use JIRA and Confluence in the team as the tools to organize the work.

Here is the breakdown of how the team is organized:

2–3 months Epics with one owner: We define higher level research topics that we are focusing on, and each Epic has one owner. These epics are public for the organization.
1 month sprints/issues: Each Epic is divided into issues. Research issues usually involve a lot of work and several possible paths, so we have decided that a month is a good unit for working on them.
1 week sessions: Every week we hold the weekly stand-up. It is a meeting where everyone shares what they are working on, which allows for knowledge sharing, collaboration, and a common understanding of the team's work.

Apart from that, we usually hold:

1-on-1 reviews with the technical leader or teammates: This helps to shed light on complex tasks from different points of view. It also spreads knowledge across the team and lets us reuse the skills we have learnt on other tasks.
Product and technical sessions with the PM and CTO: This has proven to be amazing for us and for the speed at which we ship new product functionality. Even before the research is finished, the technical product manager knows what we are going to build, and we set up sessions with the corresponding teams to understand how a research feature will impact the product. That way, once the research is finished, you may already have all the product integration prepared!

Closing the Customer Feedback Loop

As a product-oriented team, we need to make sure that our deliverables have a real impact on the problems we are trying to solve. There is no better way to understand whether we are performing as expected than keeping a close eye on how our customers use our product.

For example, we have a visualization in our product that shows clusters of intents based on semantic similarity:

After the initial launch, we realized that the clustering strategy was creating one really big cluster and a lot of smaller ones. Since we have an Epic owner for the clustering algorithm, the product team notified him about the issue. He knew what changes to apply to the algorithm to address it and, as a result, the improvement was in production in less than a week.

It took less than a week to modify the algorithm and make the backend and frontend changes.

Second clustering strategy after user feedback
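
The imbalance described above, one very large cluster next to many small ones, is the kind of symptom that can be checked directly from the cluster labels. The following is only a minimal sketch and not our actual clustering code: the function name `cluster_size_report` and the `dominance_threshold` and `small_cluster_size` parameters are hypothetical choices made for illustration.

```python
from collections import Counter


def cluster_size_report(labels, dominance_threshold=0.5, small_cluster_size=3):
    """Summarize how evenly items are spread across clusters.

    labels: one cluster label per item (any hashable value).
    dominance_threshold: hypothetical cut-off; if the largest cluster holds
        at least this share of all items, the result is flagged as imbalanced.
    small_cluster_size: hypothetical cut-off; clusters with at most this many
        items are counted as "small".
    """
    if not labels:
        raise ValueError("labels must not be empty")

    sizes = Counter(labels)
    total = len(labels)
    largest_label, largest_size = sizes.most_common(1)[0]
    largest_share = largest_size / total
    small_clusters = sum(1 for size in sizes.values() if size <= small_cluster_size)

    return {
        "n_clusters": len(sizes),
        "largest_label": largest_label,
        "largest_share": largest_share,
        "small_clusters": small_clusters,
        "is_imbalanced": largest_share >= dominance_threshold,
    }


# Made-up example: one cluster with 14 items and five tiny ones.
labels = [0] * 14 + [1, 1, 2, 3, 4, 5]
report = cluster_size_report(labels)
print(report)
```

A check like this could run after every clustering job so that the Epic owner hears about a dominant cluster before a customer does. Sensible thresholds depend on the data and would need tuning.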

Maintaining Consistent Innovation

We can define consistent innovation as the ability of a team to repeatedly add value to the business. In order to avoid working only on what are called "keep the lights on" activities, we need the right mechanisms to foster innovation.

Innovation Spaces

We created a monthly meeting where one or two people in the team prepare a presentation about a recent research paper, its implications, and why it could be useful for our future. So far it has proven to be a very productive way of bringing the latest research to the team.

Flexibility

Research work cannot be modeled and estimated as precisely as development work. Flexibility is therefore needed, as some Epics may take longer than 2–3 months and some issues longer than a one-month sprint. Just split them into the best units you can find. The purpose of research and innovation is innovating, not executing, and you need the right environment for it.

Conclusion

Is this what they call an AI-first company? I am really not sure, but what we know is that so far this has proven to be an efficient way to organize and manage our research, focusing on the long term without losing sight of our customers. We really hope this can help other companies at a similar stage.

Check out the other articles in our Building Lang.ai publication. We write about Machine Learning, Software Development, and our Company Culture.

Credits: Banner icon by Chameleon Design from the Noun Project