Working full-time as a data scientist for almost four years has helped me clear up some of the misconceptions I had about this role and the buzz built around this field. I have summarized them in ten points. Needless to say, these are all totally subjective:
- Data science may be the sexiest job of the 21st century, but we still spend most of our time cleaning data, although the situation has improved in recent years.
- Without data engineers on the team, we would also spend time collecting and organising data. A big shoutout to them!
- Between quality and quantity of data, always go for both.
- Data is the most valuable asset of a company, and algorithms are the most valuable asset of a data scientist. Knowing how and when to combine the two is a competitive advantage.
- Ego can be a powerful enemy in your career. The moment you accept that you don’t know everything is the moment you start learning.
- AutoML-style packages are fairly good solutions for the vast majority of problems, but they won’t replace data scientists soon. They save you time when assessing the performance of multiple models on a specific use case. However, they are unable to identify and remove biases in the data, or to extract insights not just from the data itself but also by connecting the dots with other sources of data or information related to the business.
- “All models are wrong, but some are useful” (George Box). For data scientists, this means we need to make some assumptions about the data in order to find meaningful patterns, or to be able to use models that in some cases wouldn’t otherwise make sense.
- Data labelers are totally underrated.
- Sometimes, logic rules work better than any machine learning model, so spend your resources wisely.
- Let the data speak for itself. If you want a specific insight that our work doesn’t support, you probably don’t need a data scientist.
Feel free to give your take in the comments 😉