What are the top-level concepts of Azure Data Factory?

  • Pipeline: A logical grouping of the processing steps that together perform a task. It acts as a carrier in which these individual processes, called activities, take place.

  • Activities: Activities represent the processing steps in a pipeline. A pipeline can have one or more activities, and an activity can be any processing step, e.g. querying a dataset or moving a dataset from one source to another.
  • Datasets: Sources of data. In simple words, a dataset is a data structure that points to the data your activities work with.
  • Linked services: These store the connection information that Data Factory needs in order to connect to an external resource.

For example: to connect to SQL Server you need a connection string. The linked service stores that connection string, while the datasets identify the source and the destination of your data.
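To make the relationship between these concepts concrete, here is a minimal sketch of a linked service and a dataset expressed as Python dicts mirroring Data Factory's JSON definitions. The names (`AzureSqlLinkedService`, `OrdersTable`, the server and table names) are illustrative assumptions, not values from the text:

```python
import json

# Hedged sketch: names and connection details are illustrative.
# The linked service holds the connection information.
linked_service = {
    "name": "AzureSqlLinkedService",
    "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
            # The connection string lives here, not in the dataset
            # or the pipeline.
            "connectionString": "Server=tcp:myserver.database.windows.net;Database=mydb;"
        },
    },
}

# The dataset points at the linked service by name and identifies
# the concrete data within that store.
dataset = {
    "name": "OrdersTable",
    "properties": {
        "type": "AzureSqlTable",
        "linkedServiceName": {
            "referenceName": "AzureSqlLinkedService",
            "type": "LinkedServiceReference",
        },
        "typeProperties": {"tableName": "Orders"},
    },
}

print(json.dumps(dataset, indent=2))
```

The split matters in practice: many datasets can reuse one linked service, so the connection string is stored once.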

Azure Data Factory can be thought of as an orchestration tool: you select different services and place them in a specific order, and that ordered set of steps is called a pipeline.

For example, you can have a Copy activity where you specify a source and a destination (sink) to which the data will be copied (e.g. a SQL database).
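The Copy example above can be sketched as a pipeline definition with a single Copy activity. The pipeline and dataset names are assumptions; the source/sink split is how a Copy activity expresses "from here, to there":

```python
import json

# Hedged sketch: dataset names (SourceBlob, SqlSinkTable) are
# illustrative placeholders.
pipeline = {
    "name": "CopyBlobToSqlPipeline",
    "properties": {
        "activities": [
            {
                "name": "CopyBlobToSql",
                "type": "Copy",
                # Input and output datasets reference the data on
                # each side of the copy.
                "inputs": [{"referenceName": "SourceBlob", "type": "DatasetReference"}],
                "outputs": [{"referenceName": "SqlSinkTable", "type": "DatasetReference"}],
                "typeProperties": {
                    "source": {"type": "BlobSource"},  # where the data comes from
                    "sink": {"type": "SqlSink"},       # where it will be copied to
                },
            }
        ]
    },
}

print(json.dumps(pipeline, indent=2))
```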

In other instances you can create a linked service to a data source, which Azure Databricks then connects to in order to transform that data. If you are just copying data from point A to point B, the data movement is performed by a Data Factory component called the Integration Runtime.
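The "copy, then transform with Databricks" flow is where orchestration order shows up: the transform step declares a dependency on the copy step so it only runs after the copy succeeds. A minimal sketch, with all names and the notebook path as illustrative assumptions:

```python
import json

# Hedged sketch: activity names, linked service name, and notebook
# path are illustrative placeholders.
copy_activity = {
    "name": "CopyRawData",
    "type": "Copy",
    "inputs": [{"referenceName": "RawBlob", "type": "DatasetReference"}],
    "outputs": [{"referenceName": "StagingTable", "type": "DatasetReference"}],
    "typeProperties": {"source": {"type": "BlobSource"}, "sink": {"type": "SqlSink"}},
}

transform_activity = {
    "name": "TransformWithDatabricks",
    "type": "DatabricksNotebook",
    # dependsOn is what orders the pipeline: this activity waits
    # until CopyRawData has succeeded.
    "dependsOn": [
        {"activity": "CopyRawData", "dependencyConditions": ["Succeeded"]}
    ],
    "linkedServiceName": {
        "referenceName": "DatabricksLinkedService",
        "type": "LinkedServiceReference",
    },
    "typeProperties": {"notebookPath": "/transform/clean_orders"},
}

pipeline = {
    "name": "IngestAndTransform",
    "properties": {"activities": [copy_activity, transform_activity]},
}

print(json.dumps(pipeline, indent=2))
```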