What is data science?

Data science is the science and art of extracting information, knowledge and inferences from structured and unstructured data, with the aim of producing descriptive, predictive and prescriptive analytical findings. It spans many facets: data research and technology, statistical modelling and computation, data visualization and, finally, business/data consultation.


The term Data Science is everywhere, but what is so important about it? Nowadays, without data, you are just another person with an opinion. Data has famously been called the crude oil of the 21st century. So great! We have data, but why science?

Let me put it this way! Data is like crude oil: it is available everywhere, but in its raw form its uses are very limited. So what makes crude oil so powerful? Its ability to be refined into petrol. Until data can be transformed into meaningful insights, it is only as useful as unrefined crude oil.

This is where Data Science comes into the picture, with its ability to break down the data, clean it and bring out patterns. These insights are in turn used to make business decisions and add value to the system.

Data is willing to talk to you if you listen, so why not just listen? :wink:

Data science collects, analyzes, and interprets data to derive insights relevant to a given problem. The process draws on everything from mathematics and algorithms to statistical analysis and machine learning. The aim is to bring out the story that the numerical and categorical data want to tell, the relationships among the variables, and how the attributes can be grouped and organized.

In marketing, it can help to upsell, cross-sell, or estimate a customer’s lifetime value; in healthcare, it can help with disease prediction and with measuring how effective a medicine is; in automation, it drives advances in self-driving cars; in social media, it powers sentiment analysis and digital marketing. The possibilities are endless.

Data goes through a whole life cycle before one can extract relevant information from it.

Data Collection - This stage covers collecting data from various sources such as surveys, social media, live audiences, regular data entries, etc. This is where data acquisition happens: the more data, the clearer the picture.

Data Cleaning - This is the maintenance stage of data. It includes categorizing the data in a way that is relevant to the problem statement and dealing with anomalies such as inconsistencies and missing values. It also includes data modelling and summarization. The aim is to have a clean data set to work with.

Data Analysis - This is where the inspection of the information begins. The aim is to look for any patterns in the data that can support qualitative research. Statistical analysis and proper visualization make the findings more presentable. This stage also notes the outliers and highlights the points that need further investigation.

Data Modeling - This is the process of analyzing and organizing how data is collected, stored and updated. It’s an essential aspect of any organization that wants to use large amounts of data. A data scientist uses various powerful tools and techniques to work out the best way to create a data model for a business.
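
To make the cleaning and analysis stages above a little more concrete, here is a minimal Python sketch using only the standard library. Everything in it is an assumption made for illustration: the records, the field names (`customer`, `age`, `spend`), the helper functions, and the "three times the median" outlier rule are all hypothetical, and dropping incomplete records is just one of several possible ways to handle missing data.

```python
from statistics import mean, median

# Hypothetical raw records, as they might arrive from a survey or data-entry form.
RAW_RECORDS = [
    {"customer": "A", "age": "34", "spend": "120.50"},
    {"customer": "B", "age": "", "spend": "80.00"},      # missing age
    {"customer": "C", "age": "29", "spend": "n/a"},      # inconsistent spend entry
    {"customer": "D", "age": "41", "spend": "95.25"},
    {"customer": "E", "age": "38", "spend": "2500.00"},  # unusually large spend
    {"customer": "F", "age": "25", "spend": "110.00"},
]


def clean(records):
    """Data cleaning: keep only records whose age and spend parse as numbers."""
    cleaned = []
    for row in records:
        try:
            cleaned.append({
                "customer": row["customer"],
                "age": int(row["age"]),
                "spend": float(row["spend"]),
            })
        except ValueError:
            # Missing or inconsistent value: drop the record (an assumed policy).
            continue
    return cleaned


def summarize(records):
    """Data analysis: basic summary statistics of the spend column."""
    values = [r["spend"] for r in records]
    return {"count": len(values), "mean": mean(values), "median": median(values)}


def find_outliers(records, factor=3.0):
    """Flag customers whose spend exceeds `factor` times the median spend."""
    med = median(r["spend"] for r in records)
    return [r["customer"] for r in records if r["spend"] > factor * med]


cleaned = clean(RAW_RECORDS)
summary = summarize(cleaned)
outliers = find_outliers(cleaned)

print(summary)
print(outliers)
```

The median is used for the outlier rule because a single extreme value pulls the mean up sharply, while the median barely moves. In a real project a library such as pandas would usually take the place of these hand-written helpers.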

So, for data and data science, to borrow from a famous line by Professor Albus Dumbledore: “data will help those who look into it” :wink: