{"id":210,"date":"2018-12-17T15:10:04","date_gmt":"2018-12-17T15:10:04","guid":{"rendered":"http:\/\/jackofalltradesmasterofsome.com\/blog\/?p=210"},"modified":"2020-04-15T14:21:14","modified_gmt":"2020-04-15T14:21:14","slug":"the-modern-data-warehouse-azure-data-lake-and-u-sql-to-combine-data","status":"publish","type":"post","link":"http:\/\/jackofalltradesmasterofsome.com\/blog\/2018\/12\/17\/the-modern-data-warehouse-azure-data-lake-and-u-sql-to-combine-data\/","title":{"rendered":"The Modern Data Warehouse; Azure Data Lake and U-SQL to combine data"},"content":{"rendered":"\n<p>The modern data warehouse will need to use Azure Data Lake and U-SQL to combine data. Begin by navigating to your Azure Portal and searching for the Data Lake Analytics Resource. Let\u2019s start by creating a new Data Lake. Don\u2019t worry, this service only charges on data in and out, not just remaining on like an HDInsights cluster so you should not be charged anything and we will not need to spin up and spin down services like we did earlier. <\/p>\n\n\n\n<!--more-->\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"339\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-5.png\" alt=\"\" class=\"wp-image-211\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-5.png 624w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-5-300x163.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>You will need to give it a unique name as well as tie it to a pay as you go subscription. A data lake will also need a data lake storage layer as well. You can keep the naming as is or rename it if you wish. It is recommended to keep the storage encrypted. Deployment may take a few minutes.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"528\" height=\"247\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-6.png\" alt=\"\" class=\"wp-image-212\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-6.png 528w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-6-300x140.png 300w\" sizes=\"auto, (max-width: 528px) 100vw, 528px\" \/><\/figure>\n\n\n\n<p><\/p>\n\n\n\n<p>Let\u2019s navigate to the data explore and create a new folder to put our sample data. Upload the same 3 files from the happiness index to the new folder. <\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"623\" height=\"214\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-7.png\" alt=\"\" class=\"wp-image-213\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-7.png 623w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-7-300x103.png 300w\" sizes=\"auto, (max-width: 623px) 100vw, 623px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"178\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-8.png\" alt=\"\" class=\"wp-image-214\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-8.png 624w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-8-300x86.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<p>Take a look through your data explorer at the tables. It should contain a master database similar to as if you were running a traditional SQL Server.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"438\" height=\"259\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-9.png\" alt=\"\" class=\"wp-image-215\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-9.png 438w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-9-300x177.png 300w\" sizes=\"auto, (max-width: 438px) 100vw, 438px\" \/><\/figure>\n\n\n\n<p> <\/p>\n\n\n\n<p>Create a new U-SQL Job that creates the new database in the\ncatalog as well as a schema and a table. <\/p>\n\n\n\n<p>CREATE DATABASE IF\nNOT EXISTS testdata;<\/p>\n\n\n\n<p>USE DATABASE testdata;<\/p>\n\n\n\n<p>CREATE SCHEMA IF NOT EXISTS happiness;<\/p>\n\n\n\n<p>CREATE TABLE happiness.placeholderdata<\/p>\n\n\n\n<p>( <\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp; Region string,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; HappinessRank float,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; HappinessScore float,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\nLowerConfidenceInterval float,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\nUpperConfidenceInterval float,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Economy_GDPperCapita float,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Family float,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Health_LifeExpectancy\nfloat,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\nFreedomTrust_GovernmentCorruption&nbsp;\nfloat,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; GenerosityDystopiaResidual\nfloat,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp; INDEX clx_Region CLUSTERED(Region ASC) DISTRIBUTED BY HASH(Region)<\/p>\n\n\n\n<p>);<\/p>\n\n\n\n<p>While running and once completed, Azure will present the execution\ntree. With the new table created, we can now load data into that table.<\/p>\n\n\n\n<p>USE DATABASE testdata;<\/p>\n\n\n\n<p>@log = <\/p>\n\n\n\n<p>EXTRACT Region string,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; HappinessRank float,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; HappinessScore float,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\nLowerConfidenceInterval float,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\nUpperConfidenceInterval float,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Economy_GDPperCapita float,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Family float,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Health_LifeExpectancy\nfloat,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\nFreedomTrust_GovernmentCorruption&nbsp;\nfloat,<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;\nGenerosityDystopiaResidual float<\/p>\n\n\n\n<p>FROM &#8220;\/sampledata\/{*}.csv&#8221;<\/p>\n\n\n\n<p>USING Extractors.Text(&#8216;,&#8217;,silent:true);<\/p>\n\n\n\n<p>INSERT INTO happiness.placeholderdata<\/p>\n\n\n\n<p>SELECT * FROM @log;<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img loading=\"lazy\" decoding=\"async\" width=\"575\" height=\"356\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-11.png\" alt=\"\" class=\"wp-image-217\" srcset=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-11.png 575w, http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2018\/12\/image-11-300x186.png 300w\" sizes=\"auto, (max-width: 575px) 100vw, 575px\" \/><\/figure>\n\n\n\n<p>Once the data is loaded, we can go query the data in the data\nexplorer to see what it looks like. The script will run the select and output\nthe data to a file to be browsed. <\/p>\n\n\n\n<p>@table = SELECT * FROM [testdata].[happiness].[placeholderdata];<\/p>\n\n\n\n<p>OUTPUT @table<\/p>\n\n\n\n<p>&nbsp;&nbsp;&nbsp;&nbsp;TO &#8220;\/OUTPUTS\/Sampledataquery.csv&#8221;<\/p>\n\n\n\n<p>\u00a0\u00a0\u00a0\u00a0USING Outputters.Csv(); <\/p>\n\n\n<p><br>\n<br>\n<!--StartFragment--><\/p>\n\n\n<p>Be sure to check out my full online class on the topic. A hands on walk through of a Modern Data Architecture using Microsoft Azure. For beginners and experienced business intelligence experts alike, learn the basic of navigating the Azure Portal to building an end to end solution of a modern data warehouse using popular technologies such as SQL&nbsp;Database, Data Lake, Data Factory, Data Bricks, Azure Synapse Data Warehouse and Power BI. Link to the class can be <a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/08\/modern-data-architecture-using-microsoft-azure-online-class-and-free-ebook\/\">found here<\/a> or directly <a href=\"https:\/\/www.udemy.com\/course\/modern-data-architecture-using-microsoft-azure\/learn\/lecture\/18527998#overview\">here<\/a>.  <\/p>\n\n\n<p><!--StartFragment--><\/p>\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/08\/modern-data-architecture-part-1-navigating-the-azure-portal\/\">Part 1 &#8211; Navigating the Azure Portal<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-2-resource-groups-and-subscriptions\/\">Part 2 &#8211; Resource Groups and Subscriptions<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-3-creating-data-lake-storage\/\">Part 3 &#8211; Creating Data Lake Storage<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-4-setting-up-a-sql-server\/\">Part 4 &#8211; Setting up an Azure SQL Server<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-5-loading-data-lake-with-data-factory\/\">Part 5 &#8211; Loading Data Lake with Azure Data Factory<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-6-configuring-and-setting-up-data-bricks\/\">Part 6 &#8211; Configuring and Setting up Data Bricks<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-7-staging-data-into-data-lake\/\">Part 7 &#8211; Staging data into Data Lake<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/14\/modern-data-architecture-part-8-provisioning-a-synapsis-sql-data-warehouse\/\">Part 8 = Provisioning a Synapse SQL Data Warehouse<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/14\/modern-data-architecture-part-9-loading-data-synapse-data-warehouse\/\">Part 9 &#8211; Loading Data into Azure Data Synapse Data Warehouse<\/a><\/p>\n\n\n<p><!--EndFragment--><br>\n<br>\n<\/p>","protected":false},"excerpt":{"rendered":"<p>The modern data warehouse will need to use Azure Data Lake and U-SQL to combine data. Begin by navigating to your Azure Portal and searching for the Data Lake Analytics Resource. Let\u2019s start by creating a new Data Lake. Don\u2019t worry, this service only charges on data in and out, not just remaining on like [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":135,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,28,27,11,29],"tags":[44,15,60,42,62,61],"class_list":["post-210","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-analytics","category-azure","category-big-data","category-data-warehouse","category-hdinsight","tag-azure","tag-big-data","tag-business-intelligence","tag-cloud","tag-data-lake","tag-usql"],"_links":{"self":[{"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts\/210","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/comments?post=210"}],"version-history":[{"count":7,"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts\/210\/revisions"}],"predecessor-version":[{"id":779,"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts\/210\/revisions\/779"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/media\/135"}],"wp:attachment":[{"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/media?parent=210"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/categories?post=210"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/tags?post=210"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}