Streaming ETL with Azure Data Factory and CDC – Creating a Data Source Connection in Azure Data Factory

In this series we look at building a streaming ETL pipeline with Azure Data Factory and CDC. This is Part 5, Creating a Data Source Connection in Azure Data Factory. The rest of the series is below.

  1. Enabling CDC
  2. Setting up Audit Tables
  3. Provisioning Azure Data Factory
  4. Provisioning Azure Blob Storage
  5. Create Data Source Connection in ADF
  6. Create Incremental Pipeline in ADF
  7. Create a Parameter Driven Pipeline
  8. Create a Rolling Trigger

This series uses the AdventureWorks database. For more information on how to get that set up, see my YouTube video on downloading and restoring the database.

  1. Create a new Dataset and select SQL Server as the data store.
  2. Name it DimProperty and select your self-hosted Integration Runtime for the local SQL Server. For the table, select the CDC table.
  3. Create another new Dataset, this time selecting Azure Blob Storage with the DelimitedText format.
  4. Name it csv_DimProperty and select New Linked Service.
  5. Name the blob storage linked service “DataLake” to match your storage account and point it at the storage account in your subscription.
  6. Select the “datalake” container from the file path section (or type it in), set “First row as header”, and select OK to complete.
  7. We should now have both datasets in the Resources section for the property CDC transfer. Select Publish All at the top to save your changes. A scripted equivalent of these steps is sketched after this list.
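If you prefer to script these objects rather than click through ADF Studio, the same datasets and linked service can be created with the azure-mgmt-datafactory Python SDK. The sketch below is a minimal example, not the exact setup from this post: the resource group, factory name, SQL Server linked service name (“SqlServerCdc”), CDC table name, and connection string are all placeholder assumptions you would replace with your own values.

```python
# Minimal sketch: create the two datasets from this post with the
# azure-mgmt-datafactory SDK instead of the ADF Studio UI.
# Resource group, factory, linked service, and table names below are
# illustrative placeholders -- substitute your own values.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureBlobStorageLinkedService,
    AzureBlobStorageLocation,
    DatasetResource,
    DelimitedTextDataset,
    LinkedServiceReference,
    LinkedServiceResource,
    SqlServerTableDataset,
)

SUBSCRIPTION_ID = "<your-subscription-id>"
RG_NAME = "rg-etl"          # assumed resource group
DF_NAME = "adf-cdc-demo"    # assumed data factory name

adf = DataFactoryManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# 1. SQL Server dataset pointing at the CDC change table. "SqlServerCdc" is
#    assumed to be the linked service that uses your self-hosted integration
#    runtime; the table name is a placeholder for your CDC table.
dim_property = DatasetResource(
    properties=SqlServerTableDataset(
        linked_service_name=LinkedServiceReference(
            type="LinkedServiceReference", reference_name="SqlServerCdc"
        ),
        table_name="cdc.dbo_DimProperty_CT",
    )
)
adf.datasets.create_or_update(RG_NAME, DF_NAME, "DimProperty", dim_property)

# 2. "DataLake" linked service for the blob storage account.
datalake_ls = LinkedServiceResource(
    properties=AzureBlobStorageLinkedService(
        connection_string="DefaultEndpointsProtocol=https;AccountName=<storage-account>;AccountKey=<key>"
    )
)
adf.linked_services.create_or_update(RG_NAME, DF_NAME, "DataLake", datalake_ls)

# 3. DelimitedText (CSV) dataset in the "datalake" container, with the first
#    row treated as a header.
csv_dim_property = DatasetResource(
    properties=DelimitedTextDataset(
        linked_service_name=LinkedServiceReference(
            type="LinkedServiceReference", reference_name="DataLake"
        ),
        location=AzureBlobStorageLocation(container="datalake"),
        column_delimiter=",",
        first_row_as_header=True,
    )
)
adf.datasets.create_or_update(RG_NAME, DF_NAME, "csv_DimProperty", csv_dim_property)
```

Scripting the definitions this way also means the datasets can live in source control alongside the rest of the factory, but the UI steps above achieve exactly the same result.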
