Data-driven, data-informed, data-centric. These are buzzwords you might have heard being thrown around, but you don’t know what they really mean. Truth is, you will find many articles using these terms interchangeably, but I am going to argue in this article that there is a difference between all of them.
Being data-informed is the lowest level of data integration. A company is data-informed when the following conditions hold:
- The company is collecting data from multiple sources. The data is organised and documented.
- The staff is aware of the data and knows how to access the data.
- The decision makers are using dashboards, or at least Excel.
So, companies that are data-informed make use of data, but they don't really make use of data science.
Data-driven is the next stage of data integration. When a company is data-driven, it is actively using data in order to make decisions. This presupposes that the company is using data science. Examples of being data-driven include companies using recommender systems to make suggestions to their customers, or firms doing algorithmic trading. There are various degrees of being data-driven, with some companies handing more control over to data science than others. Data science, however, can be a double-edged sword. When properly used, it can lead to sound and well-informed decisions. When improperly used, the same data can lead not only to poor decisions, but to poor decisions made with high confidence, which in turn can lead to erroneous and expensive actions. This is why it is important to consult a data scientist and set up a data strategy from the first day.
Being data-centric is the last stage of data integration. This is when data science is at the core of the business. There are whole divisions dedicated to different data-related tasks. The decision makers have a 360-degree view of the whole business and how the data is used. These are some points that indicate a company is data-centric:
- Data engineering is a separate department, with the responsibility of storing data and making sure the infrastructure is scalable. Data engineers are also responsible for extracting the relevant pieces of data.
- The data science department is focused only on the analysis of data and research (and not software dev, or data engineering).
- There is a Chief Data Officer.
- There are strict processes in place that communicate the results of the data science research to the decision makers, and then make sure these are applied in practice.
The last point is especially important. Data science is a core process in the company, and the focus is on actually having an impact. This happens only when data scientists have the necessary freedom, and when there are also processes in place that ensure the communication of results and the application of new models.
Examples of bad implementations of this are:
- Data scientists trying to do everything themselves. This can be great for saving money, but you are sacrificing innovation. Unless you are the CEO of a small company, this is not a very good strategy.
- Bad communication channels between the data scientists and the decision makers. Many interesting projects or models are just lost in the process.
- No formal process for putting models into production. Every project is treated differently. Time is lost while data scientists and developers try to agree on how the model should be implemented. When the implemented model is tested, the results differ from the expected ones, due to miscommunication.
The state of being data-centric describes the situation of some of the biggest players in the market such as Google, Microsoft or Amazon. However, this doesn’t mean that a small company cannot aspire to reach this goal as well. If you are interested in learning more about the subject, make sure to check out my new book, “The Decision Maker’s Handbook to Data Science”, where I talk about data science culture and other related topics such as hiring and managing data scientists.
So, in summary, what are the differences between being data-informed, data-driven and data-centric?
- Data-informed companies are collecting and using data, but only through simple techniques (e.g. charts).
- Data-driven companies are using algorithms in decision making.
- Data-centric companies place data science at the core (e.g. Google).
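If it helps to see the three stages as a checklist, here is a rough sketch in Python. The class, the field names and the function are hypothetical and only mirror the bullet points in this article. In particular, requiring every data-centric indicator to hold is my simplifying assumption here; in practice these are signs to look for, not a strict test.

```python
from dataclasses import dataclass


@dataclass
class CompanyProfile:
    # Data-informed conditions (hypothetical field names)
    collects_documented_data: bool
    staff_can_access_data: bool
    uses_dashboards_or_excel: bool
    # Data-driven condition
    algorithms_in_decision_making: bool
    # Data-centric indicators
    separate_data_engineering: bool
    dedicated_data_science_department: bool
    has_chief_data_officer: bool
    formal_process_from_research_to_practice: bool


def data_stage(p: CompanyProfile) -> str:
    """Return the stage of data integration suggested by the checklist."""
    informed = (
        p.collects_documented_data
        and p.staff_can_access_data
        and p.uses_dashboards_or_excel
    )
    if not informed:
        return "not yet data-informed"
    if not p.algorithms_in_decision_making:
        return "data-informed"
    # Assumption: all four indicators must hold to count as data-centric.
    centric = (
        p.separate_data_engineering
        and p.dedicated_data_science_department
        and p.has_chief_data_officer
        and p.formal_process_from_research_to_practice
    )
    return "data-centric" if centric else "data-driven"


# A company that uses recommender systems but has no Chief Data Officer
# and no formal process for applying research results.
example = CompanyProfile(
    collects_documented_data=True,
    staff_can_access_data=True,
    uses_dashboards_or_excel=True,
    algorithms_in_decision_making=True,
    separate_data_engineering=True,
    dedicated_data_science_department=True,
    has_chief_data_officer=False,
    formal_process_from_research_to_practice=False,
)
print(data_stage(example))  # data-driven
```

Each stage builds on the previous one, which is why the checks are ordered from data-informed upwards.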