{"id":1022,"date":"2021-01-27T01:23:51","date_gmt":"2021-01-27T01:23:51","guid":{"rendered":"http:\/\/jackofalltradesmasterofsome.com\/blog\/?p=1022"},"modified":"2023-02-28T14:52:27","modified_gmt":"2023-02-28T14:52:27","slug":"streaming-etl-with-azure-data-factory-and-incremental-pipeline-azure-data-factory","status":"publish","type":"post","link":"http:\/\/jackofalltradesmasterofsome.com\/blog\/2021\/01\/27\/streaming-etl-with-azure-data-factory-and-incremental-pipeline-azure-data-factory\/","title":{"rendered":"Streaming ETL with Azure Data Factory and CDC &#8211; Creating an Incremental Pipeline in Azure Data Factory"},"content":{"rendered":"\n<p>In this series we look at building a Streaming ETL with Azure Data Factory and CDC &#8211; Creating an Incremental Pipeline in Azure Data Factory. This is Part 6, The rest of the series is below.<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2021\/01\/25\/streaming-etl-with-azure-data-factory-and-cdc-enabling-cdc\/\">Enabling CDC<\/a><\/li><li><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2021\/01\/25\/streaming-etl-with-azure-data-factory-and-cdc-setting-up-audit-tables\/\">Setting up Audit Tables<\/a><\/li><li><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2021\/01\/25\/streaming-etl-with-azure-data-factory-and-cdc-provisioning-azure-data-factory\/\">Provisioning Azure Data Factory<\/a><\/li><li><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2021\/01\/25\/streaming-etl-with-azure-data-factory-and-cdc-provisioning-provisioning-azure-blob-storage\/\">Provisioning Azure Blog Storage<\/a><\/li><li><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2021\/01\/27\/streaming-etl-with-azure-data-factory-and-cdc-creating-a-data-source-connection-in-azure-data-factory\/\">Create Data Source Connection in ADF<\/a><\/li><li><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2021\/01\/27\/streaming-etl-with-azure-data-factory-and-incremental-pipeline-azure-data-factory\/\">Create Incremental Pipeline in ADF<\/a><\/li><li><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2021\/01\/27\/streaming-etl-with-azure-data-factory-and-create-a-parameter-driver-pipeline\/\">Create a Parameter Driven Pipeline<\/a><\/li><li><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2021\/01\/27\/streaming-etl-with-azure-data-factory-and-cdc-creating-the-rolling-etl-window\/\">Create a Rolling Trigger<\/a><\/li><\/ol>\n\n\n\n<p><em>Side Note: Want to learn <a href=\"https:\/\/subscribepage.io\/8daystosqlcourse\">SQL <\/a>or <a href=\"https:\/\/subscribepage.io\/Py2Y8D\">Python <\/a>for free. In less then 10 minutes a day and less than an hour total? Signup for my free classes delivered daily right to your email inbox for free!  <\/em><\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/subscribepage.io\/8daystosqlcourse\"><img loading=\"lazy\" decoding=\"async\" width=\"558\" height=\"280\" src=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2023\/02\/image-1.png\" alt=\"\" class=\"wp-image-1305\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2023\/02\/image-1.png 558w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2023\/02\/image-1-300x151.png 300w\" sizes=\"auto, (max-width: 558px) 100vw, 558px\" \/><\/a><\/figure>\n\n\n\n<figure class=\"wp-block-image size-full\"><a href=\"https:\/\/subscribepage.io\/Py2Y8D\"><img loading=\"lazy\" decoding=\"async\" width=\"572\" height=\"281\" src=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2023\/02\/image-2.png\" alt=\"\" class=\"wp-image-1306\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2023\/02\/image-2.png 572w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2023\/02\/image-2-300x147.png 300w\" sizes=\"auto, (max-width: 572px) 100vw, 572px\" \/><\/a><\/figure>\n\n\n\n<p>Now back to the article&#8230;<\/p>\n\n\n\n<p>This series uses the Adventureworks database. For more information on how to get that set up see my Youtube video for <a href=\"https:\/\/www.youtube.com\/watch?v=JsrQGU7rhYA\">Downloading <\/a>and <a href=\"https:\/\/www.youtube.com\/watch?v=YL4-wMriI8s\">Restoring<\/a> the database.  <\/p>\n\n\n\n<p>Creating an Incremental Pipeline in Azure Data Factory<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>Create a new Dataset as a SQL Server and name it \u201cIncremental_Load\u201d<\/li><\/ol>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"347\" height=\"113\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-45.png\" alt=\"\" class=\"wp-image-1048\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-45.png 347w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-45-300x98.png 300w\" sizes=\"auto, (max-width: 347px) 100vw, 347px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"234\" height=\"128\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-46.png\" alt=\"\" class=\"wp-image-1049\"\/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>From the Activities, get the lookup and move it to the main window. This task will be used to track the new changes in the CDC table for a certain time frame. Name the Lookup \u201cGetChangeCount\u201d <\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"456\" height=\"255\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-47.png\" alt=\"\" class=\"wp-image-1050\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-47.png 456w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-47-300x168.png 300w\" sizes=\"auto, (max-width: 456px) 100vw, 456px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>In the properties, add the customer query and\nreplace with the correct table name. The query \u201cSELECT capture_instance FROM\ncdc.change_tables\u201d will give you the names of the CDC tables and can be tested\nin SQL Management Sudio <\/li><\/ul>\n\n\n\n<p>DECLARE&nbsp; @from_lsn\nbinary(10), @to_lsn binary(10);&nbsp; <\/p>\n\n\n\n<p>SET @from_lsn =sys.fn_cdc_get_min_lsn(&#8216;dbo_DimProduct&#8217;);&nbsp; <\/p>\n\n\n\n<p>SET @to_lsn = sys.fn_cdc_map_time_to_lsn(&#8216;largest less\nthan or equal&#8217;,&nbsp; GETDATE());<\/p>\n\n\n\n<p>SELECT count(1) changecount FROM\ncdc.fn_cdc_get_all_changes_dbo_DimProduct(@from_lsn, @to_lsn, &#8216;all&#8217;)<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Preview data will show the result of this query. This will show how many changes have been recorded in<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"409\" height=\"321\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-48.png\" alt=\"\" class=\"wp-image-1051\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-48.png 409w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-48-300x235.png 300w\" sizes=\"auto, (max-width: 409px) 100vw, 409px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>In the activities, expand iteration and add the If condition to the flow.<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"412\" height=\"277\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-49.png\" alt=\"\" class=\"wp-image-1052\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-49.png 412w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-49-300x202.png 300w\" sizes=\"auto, (max-width: 412px) 100vw, 412px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>Name it \u201cHasChangedRow\u201d and in the Properties window add the code \u201c@greater(int(activity(&#8216;GetChangeCount&#8217;).output.firstRow.changecount),0)\u201d and select the Pencil next to the True Condition. <\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"434\" height=\"218\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-50.png\" alt=\"\" class=\"wp-image-1053\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-50.png 434w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-50-300x151.png 300w\" sizes=\"auto, (max-width: 434px) 100vw, 434px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>Add the Copy Activity from the Move &amp; Transform Activity and name it \u201cCopy Incremental Data\u201d.<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"427\" height=\"309\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-51.png\" alt=\"\" class=\"wp-image-1054\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-51.png 427w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-51-300x217.png 300w\" sizes=\"auto, (max-width: 427px) 100vw, 427px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>In the Source Tab, set the Source to your SQL Dataset and use the following query replacing the highlighted if needed. Select Preview to see the results.<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"394\" height=\"213\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-52.png\" alt=\"\" class=\"wp-image-1055\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-52.png 394w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-52-300x162.png 300w\" sizes=\"auto, (max-width: 394px) 100vw, 394px\" \/><\/figure>\n\n\n\n<p>DECLARE @from_lsn binary(10),\n@to_lsn binary(10); <\/p>\n\n\n\n<p>SET @from_lsn\n=sys.fn_cdc_get_min_lsn(&#8216;dbo_DimProduct&#8217;);\n<\/p>\n\n\n\n<p>SET @to_lsn = sys.fn_cdc_map_time_to_lsn(&#8216;largest\nless than or equal&#8217;, GETDATE());<\/p>\n\n\n\n<p>SELECT * FROM\ncdc.fn_cdc_get_all_changes_dbo_DimProduct(@from_lsn, @to_lsn, &#8216;all&#8217;)<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>In the Sink tab, select your CSV Blob source. <\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"362\" height=\"95\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-53.png\" alt=\"\" class=\"wp-image-1056\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-53.png 362w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-53-300x79.png 300w\" sizes=\"auto, (max-width: 362px) 100vw, 362px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li><\/li><\/ul>\n\n\n\n<ol class=\"wp-block-list\"><li>You can now debug the main pipeline and check in your storage account to see if data was move to capture the rows from CDC. <\/li><\/ol>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"151\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-54.png\" alt=\"\" class=\"wp-image-1057\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-54.png 624w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2021\/01\/image-54-300x73.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<p>Streaming ETL with Azure Data Factory and CDC &#8211; Creating an Incremental Pipeline in Azure Data Factory<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this series we look at building a Streaming ETL with Azure Data Factory and CDC &#8211; Creating an Incremental Pipeline in Azure Data Factory. This is Part 6, The rest of the series is below. Enabling CDC Setting up Audit Tables Provisioning Azure Data Factory Provisioning Azure Blog Storage Create Data Source Connection in [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":822,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,28,27,11,1],"tags":[104],"class_list":["post-1022","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-analytics","category-azure","category-big-data","category-data-warehouse","category-uncategorized","tag-azure-data-factory"],"_links":{"self":[{"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts\/1022","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/comments?post=1022"}],"version-history":[{"count":3,"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts\/1022\/revisions"}],"predecessor-version":[{"id":1315,"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts\/1022\/revisions\/1315"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/media\/822"}],"wp:attachment":[{"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/media?parent=1022"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/categories?post=1022"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/tags?post=1022"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}