{"id":638,"date":"2020-04-13T19:25:01","date_gmt":"2020-04-13T19:25:01","guid":{"rendered":"http:\/\/jackofalltradesmasterofsome.com\/blog\/?p=638"},"modified":"2020-04-14T17:26:43","modified_gmt":"2020-04-14T17:26:43","slug":"modern-data-architecture-part-6-configuring-and-setting-up-data-bricks","status":"publish","type":"post","link":"https:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-6-configuring-and-setting-up-data-bricks\/","title":{"rendered":"Modern Data Architecture \u2013 Part 6 \u2013 Configuring and setting up Data Bricks"},"content":{"rendered":"\n<p>Modern Data Architecture \u2013 Part 6 \u2013 Configuring and setting up Data Bricks<\/p>\n\n\n\n<p>Azure Databricks is an Apache Spark-based analytics platform optimized for the Microsoft Azure cloud services platform. Databricks is an industry-leading, cloud-based data engineering tool used for processing and transforming massive quantities of data and exploring the data through machine learning models.<\/p>\n\n\n\n<p>Azure Data Bricks provides a highly flexible data framework that lets developers build an entire ETL framework using Python and SQL. In this lab, we will explore how to stage data from your SQL Server to your Data Lake.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Creating a Data Bricks Resource<\/h3>\n\n\n\n<ol class=\"wp-block-list\"><li>In the search bar, search for \u201cData Bricks\u201d and select \u201c+ Add\u201d<ol><li>Subscription \u2013 Your training subscription<\/li><\/ol><ol><li>Resource Group \u2013 \u201ctraining_resourcegroup_<em>yourname<\/em>\u201d<\/li><\/ol><ol><li>Workspace Name \u2013 \u201ctraining_databricks_<em>yourname<\/em>\u201d<\/li><\/ol><ol><li>Location \u2013 East US 2<\/li><\/ol><ol><li>Pricing Tier \u2013 Standard<ol><li>You are charged only while Data Bricks is in use. 
No charges will be incurred if you do not have a cluster running.<\/li><\/ol><\/li><\/ol><ol><li>Select \u201cReview and Create\u201d to complete.<\/li><\/ol><\/li><\/ol>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"467\" height=\"124\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-56.png\" alt=\"\" class=\"wp-image-705\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-56.png 467w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-56-300x80.png 300w\" sizes=\"auto, (max-width: 467px) 100vw, 467px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"447\" height=\"275\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-57.png\" alt=\"\" class=\"wp-image-706\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-57.png 447w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-57-300x185.png 300w\" sizes=\"auto, (max-width: 447px) 100vw, 447px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>Once deployment completes, you will be able to launch the Data Bricks application. <ul><li>This service runs its own portal and toolset and integrates with Azure through single sign-on authentication. 
Select \u201cLaunch Workspace\u201d to launch the tool.<\/li><\/ul><\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"287\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-58.png\" alt=\"\" class=\"wp-image-707\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-58.png 624w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-58-300x138.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Navigating Data Bricks<\/h3>\n\n\n\n<ol class=\"wp-block-list\"><li>Review the menu on the left-hand side.<\/li><li>The two main sections we will work with are Workspace and Clusters. Data Bricks code can only run on an active cluster, so we will create one first.<\/li><\/ol>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"321\" height=\"322\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-59.png\" alt=\"\" class=\"wp-image-708\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-59.png 321w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-59-300x300.png 300w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-59-150x150.png 150w\" sizes=\"auto, (max-width: 321px) 100vw, 321px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>Select \u201cClusters\u201d, then \u201cCreate Cluster\u201d<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"260\" height=\"177\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-60.png\" alt=\"\" class=\"wp-image-709\"\/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>Enter the details for 
the cluster<ul><li>Cluster Name \u2013 \u201ctraining_cluster_<em>yourname<\/em>\u201d<\/li><\/ul><ul><li>Cluster Mode \u2013 Standard<\/li><\/ul><ul><li>Pool \u2013 None<\/li><\/ul><ul><li>Data Bricks Runtime \u2013 Scala 2.11, Spark 2.4.4<\/li><\/ul><ul><li>Autopilot Options (<strong>Very important: this will shut down your clusters when not in use to minimize cost<\/strong>)<ul><li>Disable &#8211; Enable Autoscaling<\/li><\/ul><ul><li>Enable &#8211; Terminate after <strong>15<\/strong> minutes of inactivity<\/li><\/ul><\/li><\/ul><ul><li>Worker Type \u2013 Standard DS2 <ul><li>Workers \u2013 1 (since our lab is a low-volume, low-frequency workload, we do not need a lot of horsepower)<\/li><\/ul><\/li><\/ul><\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"373\" height=\"305\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-61.png\" alt=\"\" class=\"wp-image-710\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-61.png 373w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-61-300x245.png 300w\" sizes=\"auto, (max-width: 373px) 100vw, 373px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>Select \u201cCreate Cluster\u201d to complete. 
The cluster will take a few minutes to start.<\/li><\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Create and Set Up a Workspace<\/h3>\n\n\n\n<ol class=\"wp-block-list\"><li>Open Workspace from the left navigation menu<\/li><li>Right-click under the \u201ctrash\u201d icon, select \u201cCreate-&gt; Folder\u201d, and name it \u201cstaging\u201d<\/li><\/ol>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"273\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-62.png\" alt=\"\" class=\"wp-image-711\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-62.png 624w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-62-300x131.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>Right-click the folder, select \u201cCreate-&gt;Notebook\u201d, and name it \u201cData_Lake_Staging\u201d<ul><li>Language \u2013 Python<\/li><\/ul><ul><li>Cluster \u2013 \u201ctraining_cluster_<em>yourname<\/em>\u201d<\/li><\/ul><\/li><li>This will launch a new notebook for your code.<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"198\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-63.png\" alt=\"\" class=\"wp-image-712\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-63.png 624w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-63-300x95.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<p> Be sure to check out my full online class on the topic, a hands-on walkthrough of a Modern Data Architecture using Microsoft Azure. 
Whether you are a beginner or an experienced business intelligence expert, you will learn everything from the basics of navigating the Azure Portal to building an end-to-end modern data warehouse solution using popular technologies such as SQL\u00a0Database, Data Lake, Data Factory, Data Bricks, Azure Synapse Data Warehouse and Power BI. A link to the class can be <a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/08\/modern-data-architecture-using-microsoft-azure-online-class-and-free-ebook\/\">found here<\/a> or directly <a href=\"https:\/\/www.udemy.com\/course\/modern-data-architecture-using-microsoft-azure\/learn\/lecture\/18527998#overview\">here<\/a>. <\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/08\/modern-data-architecture-part-1-navigating-the-azure-portal\/\">Part 1 &#8211; Navigating the Azure Portal<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-2-resource-groups-and-subscriptions\/\">Part 2 &#8211; Resource Groups and Subscriptions<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-3-creating-data-lake-storage\/\">Part 3 &#8211; Creating Data Lake Storage<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-4-setting-up-a-sql-server\/\">Part 4 &#8211; Setting up an Azure SQL Server<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-5-loading-data-lake-with-data-factory\/\">Part 5 &#8211; Loading Data Lake with Azure Data Factory<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-6-configuring-and-setting-up-data-bricks\/\">Part 6 &#8211; Configuring and Setting up Data Bricks<\/a><\/p>\n\n\n\n<p><a 
href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-7-staging-data-into-data-lake\/\">Part 7 &#8211; Staging Data into Data Lake<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/14\/modern-data-architecture-part-8-provisioning-a-synapsis-sql-data-warehouse\/\">Part 8 &#8211; Provisioning a Synapse SQL Data Warehouse<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/14\/modern-data-architecture-part-9-loading-data-synapse-data-warehouse\/\">Part 9 &#8211; Loading Data into Azure Synapse Data Warehouse<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Modern Data Architecture \u2013 Part 6 \u2013 Configuring and setting up Data Bricks Azure Databricks is an Apache Spark-based analytics platform optimized for the Microsoft Azure cloud services platform. 
Databricks is an industry-leading, cloud-based data engineering tool used for processing and transforming massive quantities of data and exploring the data through machine learning models Azure [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":624,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,28,27,11,85],"tags":[],"class_list":["post-638","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-analytics","category-azure","category-big-data","category-data-warehouse","category-sql-server"],"_links":{"self":[{"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts\/638","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/comments?post=638"}],"version-history":[{"count":6,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts\/638\/revisions"}],"predecessor-version":[{"id":769,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts\/638\/revisions\/769"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/media\/624"}],"wp:attachment":[{"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/media?parent=638"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/categories?post=638"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/tags?post=638"}],
"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}