Describe the ETL process.

Extraction, transformation, and loading (ETL) operations are essential for feeding a data warehouse, a business intelligence system, or a big data platform. An ETL process gathers data from operational systems and pre-processes it for analysis by reporting and analytics tools, while remaining largely transparent to users of the business intelligence platform. ETL procedures therefore determine the correctness and timeliness of the whole business intelligence platform. They consist of three steps:

  • Extraction of the data from production applications and databases (ERP, CRM, RDBMS, files, etc.)
  • Transformation of this data to reconcile it across source systems, perform calculations or string parsing, enrich it with external lookup information, and match the format required by the target system (third normal form, star schema, slowly changing dimensions, etc.)
  • Loading of the resulting data into the business intelligence (BI) applications: Data Warehouse or Enterprise Data Warehouse, Data Marts, Online Analytical Processing (OLAP) applications or “cubes”, etc.
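
The three steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a production pipeline: the CSV sample, the `COUNTRY_NAMES` lookup, the `fact_orders` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the real warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source extract: a CSV export from an operational system.
SOURCE_CSV = """order_id,customer,amount,country_code
1, Alice ,19.90,us
2,Bob,5.00,de
3,alice,10.10,US
"""

# Hypothetical external lookup used to enrich rows during transformation.
COUNTRY_NAMES = {"US": "United States", "DE": "Germany"}


def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean strings, cast types, and enrich with lookup data."""
    out = []
    for row in rows:
        code = row["country_code"].strip().upper()
        out.append((
            int(row["order_id"]),
            row["customer"].strip().title(),   # reconcile name variants
            round(float(row["amount"]), 2),    # cast text to a number
            COUNTRY_NAMES.get(code, "Unknown"),
        ))
    return out


def load(rows, conn):
    """Load: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders ("
        "order_id INTEGER PRIMARY KEY, customer TEXT, "
        "amount REAL, country TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO fact_orders VALUES (?, ?, ?, ?)", rows
    )
    conn.commit()


def run_etl(conn):
    """Run the three steps in order against the given target connection."""
    load(transform(extract(SOURCE_CSV)), conn)
```

Calling `run_etl(sqlite3.connect(":memory:"))` fills the target table, after which a reporting query such as `SELECT customer, SUM(amount) FROM fact_orders GROUP BY customer` can run without knowing anything about the source format. Keeping the steps as separate functions mirrors the list above and lets each one be tested or replaced independently.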