Why is SQL necessary for Data Science?

gagandeep · 9 December 2021 10:21

I need a deep understanding of this domain.

rajanikant-ghate · 9 December 2021 12:28

As a data scientist, you should be able to slice the data as needed. Most big data has to be subsampled for research. Some big companies have the luxury of a data engineer to do that. However, it is often faster and simpler if the data scientist can pull that data out directly, either through a cloud interface that supports SQL syntax or through API connections set up by DevOps that allow SQL code in a .py file.
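To make the "SQL code in a .py file" idea concrete, here is a minimal sketch of pulling a subsample with a query. It assumes an in-memory SQLite database from Python's standard library standing in for the real warehouse; the `orders` table, its columns, and the `pull_sample` helper are hypothetical names made up for illustration, not anything from a specific company setup.

```python
import sqlite3


def pull_sample(conn, country, every_nth=10):
    """Return every n-th row (by id) for one country: a simple, repeatable subsample."""
    query = """
        SELECT id, country, amount
        FROM orders
        WHERE country = ? AND id % ? = 0
        ORDER BY id
    """
    return conn.execute(query, (country, every_nth)).fetchall()


# Hypothetical in-memory table standing in for a large production table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, country TEXT, amount REAL)")
countries = ["IN", "US", "DE"]
rows = [(i, countries[i % 3], float(i)) for i in range(1, 1001)]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

# Slice by country and thin the result before loading it for research.
sample = pull_sample(conn, "IN")
print(len(sample), sample[:2])
```

With a real database you would swap the connection for whatever driver or API your team provides; the filtering and sampling logic stays in the SQL, so only the rows you need leave the database.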

In some cases, a data scientist may also work interchangeably as a data analyst.

chirag-garg · 12 December 2021 05:05

In data science, data analysis, and related fields, the first step is to make data available. The data pipeline consists of four main steps: sourcing, wrangling, building a model, and production. SQL is often used in the first two steps, for storing and retrieving structured data.

Essentially, SQL (meaning an RDBMS) is one of the most effective ways of storing and retrieving structured data for the first two steps of a data pipeline.
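As a rough illustration of the sourcing and wrangling steps, the sketch below stores a few structured records and then cleans and aggregates them in a single query. It assumes SQLite via Python's built-in `sqlite3` module; the `readings` table and its values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Sourcing: store structured records in a relational table.
conn.execute("CREATE TABLE readings (sensor TEXT, temperature REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 20.0), ("a", 22.0), ("b", None), ("b", 30.0), ("c", 18.0)],
)

# Wrangling: drop missing values and aggregate per sensor in one query.
summary = conn.execute(
    """
    SELECT sensor, COUNT(*) AS n, AVG(temperature) AS mean_temp
    FROM readings
    WHERE temperature IS NOT NULL
    GROUP BY sensor
    ORDER BY sensor
    """
).fetchall()
print(summary)
```

The cleaned, aggregated rows can then be handed to the model-building step, for example by loading them into a DataFrame.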