The most important part of data science is its application, and that means all kinds of applications. Yes, you read that correctly: all kinds of applications, machine learning being just one example.

The data revolution

By 2010, with so much data available, it became possible to train machines with a data-driven approach rather than a knowledge-driven one. All the theoretical papers on recurrent neural networks and support vector machines became feasible: something that could change the way we live and how we experience things in the world. Deep learning is no longer an academic concept confined to a thesis. It has become a tangible, useful kind of machine learning that affects our daily lives. As a result, machine learning and AI dominated the media, overshadowing every other aspect of data science, such as exploratory analysis, metrics, analytics, ETL, experimentation, A/B testing, and what was traditionally called business intelligence.

Data Science - General Perception

So now the general public thinks of data scientists as researchers focused on machine learning and artificial intelligence, while the industry hires data scientists as analysts. That is a misalignment. The reason for it is that, yes, most of these scientists could probably work on more technical problems, but big companies such as Google, Facebook and Netflix have so much low-hanging fruit for improving their products that they do not need advanced machine learning or statistical knowledge to find these impacts in their analysis.

A good data scientist is about more than complex models

Being a good data scientist is not about how advanced your models are. It is about the impact you can have with your work. You are not a data cruncher; you are a problem solver. You are a strategist. Companies will hand you their most ambiguous and challenging problems and expect you to guide them in the right direction.

The job of a data scientist begins with the collection of data. This includes user-generated content, instrumentation, sensors, external data and logging.

The next aspect of the data scientist's role is to move and store that data. This involves reliable data flow, infrastructure, pipelines, ETL, and both structured and unstructured data storage.

Moving up through the work required of a data scientist, the next step is to explore and transform the data. This set of work covers data preparation, anomaly detection and cleaning.
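To make the explore-and-transform stage a little more concrete, here is a minimal sketch in Python. It is only an illustration: the sample readings, the function names `clean` and `flag_anomalies`, and the 3.5 threshold are assumptions made for this example, and the modified z-score based on the median absolute deviation is just one common way to spot outliers.

```python
from statistics import median


def clean(values):
    """Drop missing entries (None) and coerce the rest to float."""
    return [float(v) for v in values if v is not None]


def flag_anomalies(values, threshold=3.5):
    """Return the values whose modified z-score exceeds the threshold.

    The score uses the median absolute deviation (MAD), which is less
    distorted by the outliers themselves than the mean and standard deviation.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against, so nothing is flagged
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical sensor readings with two missing values and one obvious outlier.
raw = [10.1, None, 9.8, 10.3, 55.0, 10.0, None, 9.9]
cleaned = clean(raw)
anomalies = flag_anomalies(cleaned)
print(cleaned)
print(anomalies)
```

In real projects this step is usually done with a data-frame library, but the idea is the same: remove what is unusable first, then look for values that do not fit the rest.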

Next in the hierarchy of a data scientist's work is aggregating and labeling the data. This work includes metrics, analytics, aggregates, segments, training data and features.
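As a small illustration of what a metric aggregated by segment can look like, the sketch below computes a conversion rate per segment. The event records, the field names `segment` and `purchased`, and the function `conversion_by_segment` are all hypothetical and exist only for this example.

```python
from collections import defaultdict

# Hypothetical visit events: one row per visit, purchased is 1 or 0.
events = [
    {"segment": "mobile", "purchased": 1},
    {"segment": "mobile", "purchased": 0},
    {"segment": "mobile", "purchased": 1},
    {"segment": "mobile", "purchased": 0},
    {"segment": "desktop", "purchased": 1},
    {"segment": "desktop", "purchased": 0},
    {"segment": "desktop", "purchased": 0},
    {"segment": "desktop", "purchased": 0},
]


def conversion_by_segment(rows):
    """Aggregate visits into a conversion rate (purchases / visits) per segment."""
    totals = defaultdict(lambda: [0, 0])  # segment -> [purchases, visits]
    for row in rows:
        totals[row["segment"]][0] += row["purchased"]
        totals[row["segment"]][1] += 1
    return {seg: purchases / visits for seg, (purchases, visits) in totals.items()}


rates = conversion_by_segment(events)
print(rates)
```

The same per-segment numbers can serve as a dashboard metric or, later on, as features for a model.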

Learning and optimization is the next set of tasks for data scientists. It includes simple machine learning algorithms, A/B tests, and experiments.
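An A/B test often comes down to asking whether two conversion rates really differ. The sketch below uses a pooled two-proportion z-test, which is one standard choice among several. The visitor and conversion counts are invented for the example, and the function name `two_proportion_z_test` is an assumption, not something from a particular library.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates.

    Uses the pooled standard error, which assumes both groups share one rate
    under the null hypothesis and that the samples are reasonably large.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical experiment: variant A converts 200 of 2000 visitors, variant B 260 of 2000.
z, p = two_proportion_z_test(200, 2000, 260, 2000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference is unlikely to be noise alone, but the decision to ship still depends on how large the effect is and what it costs.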

At the top of the hierarchy is the most complex work of data scientists: artificial intelligence and deep learning.