Is SQL Needed to be a Data Scientist?

SQL is one of the four most important skills I personally test candidates on when interviewing. No, you do not need to be a stored procedure or cursor expert, but you should not confuse the WHERE clause with the HAVING clause either.
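
To make the WHERE versus HAVING distinction concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table and its rows are made up purely for illustration; the point is only that WHERE filters individual rows before grouping, while HAVING filters the groups after aggregation.

```python
import sqlite3

# Hypothetical in-memory table, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("alice", 50.0),
        ("alice", 200.0),
        ("bob", 60.0),
        ("bob", 70.0),
        ("carol", 500.0),
    ],
)

# WHERE is applied to individual rows BEFORE grouping:
# only single orders above 100 are summed per customer.
where_rows = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "WHERE amount > 100 "
    "GROUP BY customer ORDER BY customer"
).fetchall()

# HAVING is applied to whole groups AFTER aggregation:
# all orders are summed, then customers totalling above 100 are kept.
having_rows = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "GROUP BY customer HAVING SUM(amount) > 100 "
    "ORDER BY customer"
).fetchall()

print(where_rows)   # [('alice', 200.0), ('carol', 500.0)]
print(having_rows)  # [('alice', 250.0), ('bob', 130.0), ('carol', 500.0)]
conn.close()
```

Note how bob disappears from the first result (no single order above 100) but shows up in the second (his orders total 130).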

There are a few reasons why at least above-average SQL knowledge is important. Some of them:

  1. For a first-cut view of the raw data, you will mostly interact with a database. If you lack SQL skills, viewing the raw data as it is stored in the tables becomes difficult.
  2. You will never be handed a ready-made master data set on which to run a machine learning algorithm. The data scientist has to prepare this master data set, and doing so means joining multiple data sources (mostly tables). Here you need a good grasp of SQL.
  3. There are a lot of packages and interfaces in analytics tools like R and Python that bridge databases (Oracle, MS SQL Server, Hive, etc.) and the analytics tool or platform. All of them run on SQL queries to pull and push data both ways, which demands decent SQL skills.
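
As a rough sketch of what the three points above look like in practice, the snippet below uses Python's built-in sqlite3 module as a stand-in for whichever database you actually connect to. The `customers` and `transactions` tables, their columns, and their rows are all hypothetical. It takes a first look at the raw rows, then joins the two tables into one row per customer, which is the kind of master data set a model would be trained on.

```python
import sqlite3

# In-memory SQLite stands in for a real database here; the tables
# and values below are hypothetical and exist only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE transactions (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'north'), (2, 'south'), (3, 'north');
    INSERT INTO transactions VALUES (1, 20.0), (1, 35.0), (2, 10.0);
    """
)

# Point 1: a first-cut look at the raw rows as stored in the table.
preview = conn.execute(
    "SELECT customer_id, amount FROM transactions "
    "ORDER BY customer_id, amount LIMIT 2"
).fetchall()

# Point 2: join the sources into one row per customer.
# LEFT JOIN keeps customers with no transactions (customer 3).
master_query = """
    SELECT c.customer_id,
           c.region,
           COUNT(t.amount)              AS n_transactions,
           COALESCE(SUM(t.amount), 0.0) AS total_amount
    FROM customers AS c
    LEFT JOIN transactions AS t ON t.customer_id = c.customer_id
    GROUP BY c.customer_id, c.region
    ORDER BY c.customer_id
"""

# Point 3: the query result is pulled into Python through the connection.
master_rows = conn.execute(master_query).fetchall()

print(preview)
for row in master_rows:
    print(row)
conn.close()
```

Whatever package sits between your analytics tool and the database, the part you write yourself is still the SQL string, so a wrong join type or a missing GROUP BY column will quietly give you a wrong master data set.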