{"id":630,"date":"2020-04-13T18:59:15","date_gmt":"2020-04-13T18:59:15","guid":{"rendered":"http:\/\/jackofalltradesmasterofsome.com\/blog\/?p=630"},"modified":"2020-04-14T17:25:57","modified_gmt":"2020-04-14T17:25:57","slug":"modern-data-architecture-part-3-creating-data-lake-storage","status":"publish","type":"post","link":"https:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-3-creating-data-lake-storage\/","title":{"rendered":"Modern Data Architecture \u2013 Part 3 \u2013 Creating Data Lake Storage"},"content":{"rendered":"\n<p>Modern Data Architecture \u2013 Part 3 \u2013 Creating Data Lake Storage<\/p>\n\n\n\n<p>A data lake is\na system or repository of data stored in its natural\/raw format, usually object\nblobs or files. A data lake is usually a single store of all enterprise data\nincluding raw copies of source system data and transformed data used for tasks\nsuch as reporting, visualization, advanced analytics and machine learning. A\ndata lake can include structured data from relational databases (rows and\ncolumns), semi-structured data (CSV, logs, XML, JSON), unstructured data\n(emails, documents, PDFs) and binary data (images, audio, video).<\/p>\n\n\n\n<p>Azure data\nstorage is extremely cost-effective and can be used to store vast amounts of\ndata for a small monthly cost. No cost will be incurred for empty containers.<\/p>\n\n\n\n<p>From the Azure\nPortal, we will create a new Storage Account. You may see a resource for Data\nLake Gen1 Storage, but the go-forward standard at the time of creating this\ncontent is a default Storage Account. Azure Storage is a Microsoft-managed\nservice providing cloud storage that is highly available, secure, durable,\nscalable, and redundant. Azure Storage includes Azure Blobs (objects), Azure\nData Lake Storage Gen2, Azure Files, Azure Queues, and Azure Tables. 
The cost\nof your storage account depends on the usage and the options you choose below.<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>Search for Storage Account in your Portal and select \u201c+ Add\u201d at the top left.<\/li><\/ol>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"568\" height=\"160\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-13.png\" alt=\"\" class=\"wp-image-657\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-13.png 568w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-13-300x85.png 300w\" sizes=\"auto, (max-width: 568px) 100vw, 568px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"423\" height=\"225\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-14.png\" alt=\"\" class=\"wp-image-658\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-14.png 423w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-14-300x160.png 300w\" sizes=\"auto, (max-width: 423px) 100vw, 423px\" \/><\/figure>\n\n\n\n<ol class=\"wp-block-list\"><li>From the Storage Account Window, fill in the following details and select \u201cReview + create\u201d to complete the wizard.<ol><li>Subscription \u2013 The subscription you previously set up<\/li><\/ol><ol><li>Resource Group \u2013 \u201ctraining_resourcegroup_<em>yourname<\/em>\u201d or the group you previously created<\/li><\/ol><ol><li>Storage Account Name \u2013 Storage account names cannot contain underscores and must be 3 to 24 characters of lowercase letters and numbers only, so name the storage account \u201ctrainingsa<em>yourname<\/em>\u201d<\/li><\/ol><ol><li>Location \u2013 East US 2 or the region you belong to. 
It is best practice to keep all resources in the same region, because latency between services that need to communicate with one another increases slightly with distance.<\/li><\/ol><ol><li>Performance \u2013 Standard<\/li><\/ol><ol><li>Account Kind \u2013 StorageV2 (general purpose v2)<\/li><\/ol><ol><li>Replication \u2013 Read-access geo-redundant storage (RA-GRS)<\/li><\/ol><ol><li>Access Tier \u2013 Hot<\/li><\/ol><\/li><\/ol>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"404\" height=\"436\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-15.png\" alt=\"\" class=\"wp-image-659\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-15.png 404w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-15-278x300.png 278w\" sizes=\"auto, (max-width: 404px) 100vw, 404px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>After a short time the storage account will be provisioned and will appear in your dashboard. No cost will be associated with this resource until actual data is moved and stored here. 
<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"624\" height=\"176\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-16.png\" alt=\"\" class=\"wp-image-660\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-16.png 624w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-16-300x85.png 300w\" sizes=\"auto, (max-width: 624px) 100vw, 624px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\">Creating a Folder Structure<\/h3>\n\n\n\n<ol class=\"wp-block-list\"><li>Select \u201cStorage Explorer (preview)\u201d<\/li><\/ol>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"251\" height=\"289\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-17.png\" alt=\"\" class=\"wp-image-661\"\/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>Right-click \u201cblob containers\u201d and select \u201cCreate blob container\u201d. Name it \u201croot\u201d and leave it \u201cprivate\u201d.<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"254\" height=\"143\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-18.png\" alt=\"\" class=\"wp-image-662\"\/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>Create a new virtual folder<ul><li>A virtual folder does not actually exist in Azure until you paste, drag or upload blobs into it. To paste a blob into a virtual folder, copy the blob before creating the folder. 
<\/li><\/ul><\/li><li>From the Datafiles folder for this lab, upload the file \u201cplaceholderfile\u201d into a folder called \u201craw\u201d.<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"225\" height=\"352\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-19.png\" alt=\"\" class=\"wp-image-663\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-19.png 225w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-19-192x300.png 192w\" sizes=\"auto, (max-width: 225px) 100vw, 225px\" \/><\/figure>\n\n\n\n<ul class=\"wp-block-list\"><li>Complete the same exercise for a folder named \u201cstaging\u201d.<\/li><\/ul>\n\n\n\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"562\" height=\"150\" src=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-20.png\" alt=\"\" class=\"wp-image-664\" srcset=\"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-20.png 562w, https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-content\/uploads\/2020\/04\/image-20-300x80.png 300w\" sizes=\"auto, (max-width: 562px) 100vw, 562px\" \/><\/figure>\n\n\n\n<p>Be sure to check out my full online class on the topic: a hands-on walkthrough of a Modern Data Architecture using Microsoft Azure. For beginners and experienced business intelligence experts alike, it covers everything from the basics of navigating the Azure Portal to building an end-to-end modern data warehouse solution using popular technologies such as SQL\u00a0Database, Data Lake, Data Factory, Databricks, Azure Synapse Data Warehouse and Power BI. 
A link to the class can be <a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/08\/modern-data-architecture-using-microsoft-azure-online-class-and-free-ebook\/\">found here<\/a>, or you can go to it directly <a href=\"https:\/\/www.udemy.com\/course\/modern-data-architecture-using-microsoft-azure\/learn\/lecture\/18527998#overview\">here<\/a>.<\/p>\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/08\/modern-data-architecture-part-1-navigating-the-azure-portal\/\">Part 1 &#8211; Navigating the Azure Portal<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-2-resource-groups-and-subscriptions\/\">Part 2 &#8211; Resource Groups and Subscriptions<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-3-creating-data-lake-storage\/\">Part 3 &#8211; Creating Data Lake Storage<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-4-setting-up-a-sql-server\/\">Part 4 &#8211; Setting up an Azure SQL Server<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-5-loading-data-lake-with-data-factory\/\">Part 5 &#8211; Loading Data Lake with Azure Data Factory<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-6-configuring-and-setting-up-data-bricks\/\">Part 6 &#8211; Configuring and Setting up Data Bricks<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/13\/modern-data-architecture-part-7-staging-data-into-data-lake\/\">Part 7 &#8211; Staging data into Data Lake<\/a><\/p>\n\n\n\n<p><a 
href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/14\/modern-data-architecture-part-8-provisioning-a-synapsis-sql-data-warehouse\/\">Part 8 &#8211; Provisioning a Synapse SQL Data Warehouse<\/a><\/p>\n\n\n\n<p><a href=\"http:\/\/jackofalltradesmasterofsome.com\/blog\/2020\/04\/14\/modern-data-architecture-part-9-loading-data-synapse-data-warehouse\/\">Part 9 &#8211; Loading Data into Azure Data Synapse Data Warehouse<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Modern Data Architecture \u2013 Part 3 \u2013 Creating Data Lake Storage A data lake is a system or repository of data stored in its natural\/raw format, usually object blobs or files. A data lake is usually a single store of all enterprise data including raw copies of source system data and transformed data used for [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":624,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[4,28,11,85],"tags":[],"class_list":["post-630","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-analytics","category-azure","category-data-warehouse","category-sql-server"],"_links":{"self":[{"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts\/630","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/comments?post=630"}],"version-history":[{"count":5,"href":"https:\/\/jackofalltrades
masterofsome.com\/blog\/wp-json\/wp\/v2\/posts\/630\/revisions"}],"predecessor-version":[{"id":766,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/posts\/630\/revisions\/766"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/media\/624"}],"wp:attachment":[{"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/media?parent=630"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/categories?post=630"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jackofalltradesmasterofsome.com\/blog\/wp-json\/wp\/v2\/tags?post=630"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}