Azure Data Factory
Components
Linked Service
Defines the connection information that Azure Data Factory needs to connect to external resources, such as a data source. Azure Data Factory uses linked services for two purposes: to represent a data store or to represent a compute resource.
Data Set
Datasets represent data structures within the data store referenced by the Linked Service object.
A dataset is a named view of data that simply points to or references the data you want to use in your activities as inputs and outputs.
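The relationship between a dataset and its linked service can be sketched in a few lines of Python. This is a simplified illustration, not the exact Azure Data Factory JSON schema: the names `ExampleBlobStorage` and `ExampleSalesCsv`, the dictionary keys, and the helper `resolve_linked_service` are all assumptions made for the example.

```python
# Simplified, illustrative model of a linked service and a dataset.
# Keys and names are hypothetical; this is not the exact ADF schema.
linked_service = {
    "name": "ExampleBlobStorage",            # hypothetical name
    "type": "AzureBlobStorage",              # kind of data store
    "connection": "<connection-string-placeholder>",
}

dataset = {
    "name": "ExampleSalesCsv",               # hypothetical name
    "linked_service": "ExampleBlobStorage",  # reference by name, no credentials
    "location": {"container": "raw", "file": "sales.csv"},
}


def resolve_linked_service(dataset, linked_services):
    """Return the linked service a dataset refers to; raise KeyError if absent."""
    by_name = {ls["name"]: ls for ls in linked_services}
    return by_name[dataset["linked_service"]]


resolved = resolve_linked_service(dataset, [linked_service])
print(resolved["type"])
```

The dataset holds only a name and a location; the connection details stay in the linked service, so several datasets can share one connection.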
Activities
A single processing step in a pipeline. Azure Data Factory supports three types of activity: data movement, data transformation, and control activities.
Activities define the actions performed on the data, and each one falls into one of the three categories above.
Pipelines
A logical grouping of activities that perform a specific unit of work. These activities together perform a task. The advantage of using a pipeline is that you can more easily manage the activities as a set instead of as individual items.
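To illustrate how a pipeline groups activities into one unit of work, the sketch below models a pipeline as a list of activities with dependencies and derives a valid run order. It is a minimal model under stated assumptions: the pipeline and activity names, the `depends_on` key, and the `execution_order` helper are hypothetical and do not reproduce the Azure Data Factory definition format.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline with one activity from each category.
pipeline = {
    "name": "ExampleDailyLoad",
    "activities": [
        {"name": "CopyRawSales", "category": "data movement",
         "depends_on": []},
        {"name": "TransformSales", "category": "data transformation",
         "depends_on": ["CopyRawSales"]},
        {"name": "NotifyOnDone", "category": "control",
         "depends_on": ["TransformSales"]},
    ],
}


def execution_order(pipeline):
    """Return activity names in an order that respects their dependencies."""
    # Map each activity to the set of activities that must finish first.
    graph = {a["name"]: set(a["depends_on"]) for a in pipeline["activities"]}
    return list(TopologicalSorter(graph).static_order())


order = execution_order(pipeline)
print(order)
```

Because the activities live in one pipeline object, they can be scheduled, monitored, and ordered as a set rather than one by one.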
Data Flows
Enable data engineers to develop data transformation logic without writing code. Data flows run as activities within Azure Data Factory pipelines on scaled-out Apache Spark clusters.
Integration Runtimes
The integration runtime is the compute infrastructure that Azure Data Factory uses to provide the following data integration capabilities across different network environments: data flow, data movement, activity dispatch, and SSIS package execution. In Azure Data Factory, an integration runtime provides the bridge between the activity and linked services.
ADF provides a cloud-based data integration service that orchestrates the movement and transformation of data between various data stores and compute resources.
You can create and schedule data-driven workflows (called pipelines) that ingest data from disparate data stores.
You can build complex ETL processes that transform data visually with data flows or by using compute services such as Azure HDInsight Hadoop, Azure Databricks, and Azure Synapse Analytics.