SQL Server – jack of all trades master of some – https://jackofalltradesmasterofsome.com/blog – Consultant, Real Estate, Author, Business Intelligence

Power BI Paginated Reports vs SSRS and Tutorial
https://jackofalltradesmasterofsome.com/blog/2020/09/23/power-bi-paginated-reports-vs-ssrs-and-tutorial/ – Wed, 23 Sep 2020

If you long for the pixel-perfect reports of SSRS or Cognos but find yourself lost in the world of Power BI and Tableau, this post is for you. Sometimes these newer tools simply cannot produce the exactly formatted, canned reports that the older tools delivered to business intelligence consumers. Paginated Reports is SQL Server Reporting Services surfaced through Power BI, giving you the best of both worlds, so here is Power BI Paginated Reports vs SSRS and a tutorial.

Side Note: Want to learn SQL or Python for free, in less than 10 minutes a day and less than an hour total? Sign up for my free classes, delivered daily right to your email inbox!

Now back to the article…

1. What Are Paginated Reports

Paginated reports are essentially SSRS reports that can be designed and deployed through Power BI. You can add custom logic to nearly every element and line your reports up to the exact pixel on the page, giving you total command of the look and feel, which allows for polished reports when printing and exporting.

2. How to create Paginated Reports?

You can create reports by downloading the free Power BI Report Builder tool to your desktop. This is a standalone tool, separate from Power BI Desktop, and the reports it produces are compatible with the Power BI Service.

3. What Data Sources Work with Paginated Reports

Data source support for Paginated Reports is more limited than the vast connector library of Power BI. Report data is pulled at run time and is not stored in an underlying model the way Power BI datasets are.

Current sources include the following

  • Azure SQL Database and Data Warehouse
  • Azure Analysis Services (via SSO)
  • SQL Server via a gateway
  • SQL Server Analysis Services via a gateway
  • Power BI Premium Datasets
  • Oracle
  • Teradata

4. Deploying and Sharing Reports to Users

Paginated Reports can be deployed to the Power BI Service. From here you can leverage the ability to put a report in a workspace and let users subscribe to the report. Reports will be automatically emailed to users based on the subscription sending a PDF directly to the consumer.

5. Exporting from the Power BI Service

Paginated Reports can be exported in the following formats

  • Excel
  • Microsoft Word
  • Microsoft PowerPoint
  • PDF
  • CSV
  • XML
  • MHTML

6. Licensing for Paginated Reports in Power BI

Unfortunately, Paginated Reports in Power BI are not free, in the same way you need a SQL Server license for SSRS reports. You will need either a Power BI Embedded license or a Power BI Premium capacity (P1, P2 or P3).

7. Limitations on Paginated Reports

  • Reports are not interactive. Similar to SSRS, reports are essentially static once rendered, with the exception of tables that can expand and collapse
  • You must pay for a license to use these in production
  • No access to Custom Fonts
  • No sharing data sets across reports

If you enjoyed this post on Power BI Paginated Reports vs SSRS and Tutorial and want to learn more, check out my Udemy class on getting started with the tool!

Modern Data Architecture – Part 9 – Loading Data into Synapse Data Warehouse
https://jackofalltradesmasterofsome.com/blog/2020/04/14/modern-data-architecture-part-9-loading-data-synapse-data-warehouse/ – Tue, 14 Apr 2020

Now that we have provisioned a Synapse Analytics environment, we are ready to begin loading data into it. In lab 7, we loaded a single table, sales.customers, to our data lake. To complete this lab, you will need to repeat that exercise for the tables listed below. To make it easier, the Datafiles folder included with this lab also contains CSV files you can upload directly to the raw folder of the data lake.

Tables

Sales.customers

Sales.order_items

Sales.orders

Sales.staffs

Sales.stores

  1. Open or navigate back to your Data Bricks environment
  • We will create a script similar to the one in lab 7, but in reverse: we will connect to the data lake first and then load the data into our data warehouse.
  • Create a new notebook called “loading_customers”

Block 1

Description – This section gives your cluster the storage account access key so that we can pull data from the data lake in the next step. This is the quick and easy way to accomplish it, but your access key is visible and shared in your code, which is not best practice. For production, be sure to use secret access keys; a sketch using a secret scope follows after this block.

Insert your storage account name and your access key respectively.

Code

%scala

spark.conf.set("fs.azure.account.key.trainingsavimal.dfs.core.windows.net", "youraccesskey")

To get "youraccesskey", we are going to need some additional information. From the Azure portal, navigate back to your storage account "trainingsayourname" and select "Access Keys".
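For production, a better pattern is to pull the key from a Databricks secret scope instead of pasting it into the notebook. The sketch below assumes a secret scope named "training-scope" holding a key named "storage-account-key"; both names are placeholders you would create yourself (for example with the Databricks CLI).

%scala

// Sketch only: read the storage key from a secret scope instead of hard coding it.
// "training-scope" and "storage-account-key" are hypothetical names - create your own first.
val storageKey = dbutils.secrets.get(scope = "training-scope", key = "storage-account-key")
spark.conf.set("fs.azure.account.key.trainingsavimal.dfs.core.windows.net", storageKey)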

Block 2

Description – This reads the customers file from the data lake and stores it in a dataframe in Data Bricks memory.

Code

%scala

val df = spark.read.option("header", "true").csv("abfss://root@trainingsavimal.dfs.core.windows.net/raw/customers.csv")

Block 3

Description – This will display the contents of what has now been stored in the dataframe. Code to manipulate or transform the data can be added at this point if needed; a small example follows below.

Code

%scala

df.show()
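If you do want to shape the data before loading it, a minimal sketch is shown below. The column names (customer_id, email) are assumed to match the BikeStores customers file; adjust them to whatever your dataframe actually contains.

%scala

// Example transformation only - column names are assumed from the BikeStores sample data.
import org.apache.spark.sql.functions._

val dfClean = df
  .filter(col("customer_id").isNotNull)            // drop rows missing the key
  .withColumn("email", lower(trim(col("email"))))  // normalize email casing and whitespace
dfClean.show(5)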

Block 4

Description – Similar to before, we now connect to our SQL Server instance, but this time we connect to our Synapse database instead of our SQL Server database.

Code

%scala

Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver")

val jdbcHostname = "training-sqlserver-vimal.database.windows.net"
val jdbcPort = 1433
val jdbcDatabase = "training_sqlpool_vimal"

// Create the JDBC URL without passing in the user and password parameters.
val jdbcUrl = s"jdbc:sqlserver://${jdbcHostname}:${jdbcPort};database=${jdbcDatabase}"

import java.util.Properties

val connectionProperties = new Properties()
connectionProperties.put("user", "sqlserveradmin")
connectionProperties.put("password", "Password1234")

val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
connectionProperties.setProperty("Driver", driverClass)

Block 5

Description – We can now write the contents of our dataframe to our Synapse database.

Code

%scala

df.write.mode("append").jdbc(jdbcUrl, "customers", connectionProperties)

  • Head back to SQL Server Management Studio and query the customers table. You should now see data available in this table. A quick check you can run from the notebook itself is sketched below.
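If you would rather verify from the notebook, a minimal sanity check (assuming the jdbcUrl and connectionProperties defined in the earlier blocks are still in scope) is to read the table back and count the rows:

%scala

// Optional sanity check: read the target table back from Synapse and count the rows.
val loaded = spark.read.jdbc(jdbcUrl, "customers", connectionProperties)
println(s"Rows in customers: ${loaded.count()}")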

Be sure to check out my full online class on the topic: a hands-on walkthrough of a Modern Data Architecture using Microsoft Azure. For beginners and experienced business intelligence experts alike, learn the basics of navigating the Azure Portal and build an end-to-end modern data warehouse solution using popular technologies such as SQL Database, Data Lake, Data Factory, Data Bricks, Azure Synapse Data Warehouse and Power BI. The link to the class can be found here or directly here.



Part 1 – Navigating the Azure Portal

Part 2 – Resource Groups and Subscriptions

Part 3 – Creating Data Lake Storage

Part 4 – Setting up an Azure SQL Server

Part 5 – Loading Data Lake with Azure Data Factory

Part 6 – Configuring and Setting up Data Bricks

Part 7 – Staging data into Data Lake

Part 8 – Provisioning a Synapse SQL Data Warehouse

Part 9 – Loading Data into Azure Synapse Data Warehouse



Modern Data Architecture – Part 8 – Provisioning a Synapse SQL Data Warehouse
https://jackofalltradesmasterofsome.com/blog/2020/04/14/modern-data-architecture-part-8-provisioning-a-synapsis-sql-data-warehouse/ – Tue, 14 Apr 2020

Azure Synapse is a limitless analytics service that brings together enterprise data warehousing and Big Data analytics. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources—at scale. Azure Synapse brings these two worlds together with a unified experience to ingest, prepare, manage, and serve data for immediate business intelligence and machine learning needs.

With Azure Synapse, data professionals can query both relational and non-relational data using the familiar SQL language. This can be done using either serverless on-demand queries for data exploration and ad hoc analysis or provisioned resources for your most demanding data warehousing needs. A single service for any workload.

  1. From the Azure Portal search, find and select "Azure Synapse Analytics", then select "+ Add"
  • From the setup screen, configure the basic settings
    • Subscription – The subscription you previously set up
    • Resource Group – "training_resourcegroup_yourname"
    • SQL pool name – "training_sqlpool_yourname"
    • Server – "training-sqlserver-yourname"
    • Performance – Be sure to scale this down to DW100c to limit cost, as a dedicated pool incurs charges whenever it is running, even when not being queried
  • Select “Review and Create” to provision the resource. This may take a few minutes to complete.
  • Once completed, the dashboard for your new Synapse environment lets you pause and resume the service, which will help with cost and scaling. Under the common tasks you can explore some options, but we do not have any data yet, so we will come back to this later in the labs.
  • Open SQL Server Management Studio and log back in using the same server credentials from lab 4, Setting up a SQL Server.
  • You will now see your Synapse SQL pool listed as a database on the server
  • Run the script "Create_DW_Tables.sql" located in the Datafiles folder included with this lab. This will create the empty tables we need for later in the lab. Be sure to set the database to "training_sqlpool_yourname" before running the query. Once complete, you can refresh your database and the tables will be listed on the left-hand side.

Be sure to check out my full online class on the topic: a hands-on walkthrough of a Modern Data Architecture using Microsoft Azure. For beginners and experienced business intelligence experts alike, learn the basics of navigating the Azure Portal and build an end-to-end modern data warehouse solution using popular technologies such as SQL Database, Data Lake, Data Factory, Data Bricks, Azure Synapse Data Warehouse and Power BI. The link to the class can be found here or directly here.



Part 1 – Navigating the Azure Portal

Part 2 – Resource Groups and Subscriptions

Part 3 – Creating Data Lake Storage

Part 4 – Setting up an Azure SQL Server

Part 5 – Loading Data Lake with Azure Data Factory

Part 6 – Configuring and Setting up Data Bricks

Part 7 – Staging data into Data Lake

Part 8 – Provisioning a Synapse SQL Data Warehouse

Part 9 – Loading Data into Azure Synapse Data Warehouse



Modern Data Architecture – Part 7 – Staging Data into Data Lake
https://jackofalltradesmasterofsome.com/blog/2020/04/13/modern-data-architecture-part-7-staging-data-into-data-lake/ – Mon, 13 Apr 2020

In Lab 5, we demonstrated loading data from our sample SQL Server into our data lake. Although that tool is very handy, it introduces another service and set of tools that need to be provisioned and monitored. It is best to use ADF more as a workflow orchestration tool and use a single tool to handle all of your ETL. Data Bricks provides a robust tool for this task. We will investigate how to query data from your SQL Server and load it into your data lake.

For this lab, we will be hard coding the access keys and credentials directly into the code. The optional section at the bottom demonstrates how to configure Azure secret keys, which is the correct way to do this in production.

Connect and Query SQL Server

  1. Create a new notebook in your Data Bricks workspace and name it "staging_customers"

Copy the query below and paste it into your Data Bricks window, replacing the highlighted sections with your database information. Most of our code will be written in Scala to start. We will place each piece of code into its own cell to make the job a bit easier to read and test. Copy and paste the following code into your notebook. We will use hard-coded credentials and keys for the lab, but in the real world, be sure to configure secret key access only.

Example

Block 1

Description – This sets the initial driver for our database connection

Code

%scala

Class.forName("com.microsoft.sqlserver.jdbc.SQLServerDriver")

Block 2

Description – We configure and test the connection. Replace the items highlighted below with your SQL Server name, database name, username and password respectively.

Code

%scala

val jdbcHostname = "training-sqlserver-vimal.database.windows.net"
val jdbcPort = 1433
val jdbcDatabase = "training_sqldatabase_vimal"

// Create the JDBC URL without passing in the user and password parameters.
val jdbcUrl = s"jdbc:sqlserver://${jdbcHostname}:${jdbcPort};database=${jdbcDatabase}"

import java.util.Properties

val connectionProperties = new Properties()
connectionProperties.put("user", "sqlserveradmin")
connectionProperties.put("password", "Password123")

val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
connectionProperties.setProperty("Driver", driverClass)

Block 3

Description – This allows us to review and test the connection to our sales.customers table in the database

Code

%scala

val customer = spark.read.jdbc(jdbcUrl, "sales.customers", connectionProperties)

Block 4

Description – Select the data from the table for display

Code

%scala

customer.select("*").show()
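If you only need a subset of the table, the JDBC reader also accepts a subquery in place of the table name, so the filtering happens on the SQL Server side rather than in Spark. A minimal sketch, assuming the BikeStores column names:

%scala

// Example only: push a filtering query down to SQL Server instead of pulling the full table.
// The subquery must be wrapped in parentheses and given an alias ("t").
val customersWithEmail = spark.read.jdbc(
  jdbcUrl,
  "(SELECT customer_id, first_name, last_name, email FROM sales.customers WHERE email IS NOT NULL) t",
  connectionProperties)
customersWithEmail.show(5)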

Block 5

Description – This section mounts your data lake storage in Data Bricks so that we can push data to the data lake in the next step. Insert your blob container name, your storage account name, the destination folder, and your access key respectively.

Code

%scala

dbutils.fs.mount(
  source = "wasbs://root@trainingsavimal.blob.core.windows.net/raw",
  mountPoint = "/mnt/raw",
  extraConfigs = Map("fs.azure.account.key.trainingsavimal.blob.core.windows.net" -> "youraccesskey"))

To get "youraccesskey", we are going to need some additional information. From the Azure portal, navigate back to your storage account "trainingsayourname" and select "Access Keys". Note that the mount command will fail if the mount point already exists; a simple guard is sketched below.
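One practical note: dbutils.fs.mount throws an error if /mnt/raw is already mounted, for example when you re-run the notebook. A simple guard, shown only as a sketch, is to check the existing mounts first and unmount if needed before running the mount again.

%scala

// Sketch: make the mount step re-runnable by unmounting /mnt/raw if it already exists.
if (dbutils.fs.mounts().exists(_.mountPoint == "/mnt/raw")) {
  dbutils.fs.unmount("/mnt/raw")
}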


Block 6

Description – This last section pushes your data to a CSV file in the data lake. Upon completion, you can see your new data file in storage via the Storage Explorer. There will also be a folder containing some of the data transfer metadata.

Code

%scala

customer.coalesce(1)
  .write.format("com.databricks.spark.csv")
  .option("header", "true")
  .save("/mnt/raw/customers.csv")
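Note that when saving this way, customers.csv is typically created as a folder containing a single part-*.csv data file plus some commit metadata files, which is the metadata the description above refers to. A quick way to see exactly what landed in the lake, sketched here with dbutils:

%scala

// List what the write actually produced under the mounted raw folder.
dbutils.fs.ls("/mnt/raw/customers.csv").foreach(f => println(s"${f.name}  ${f.size} bytes"))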

Be sure to check out my full online class on the topic: a hands-on walkthrough of a Modern Data Architecture using Microsoft Azure. For beginners and experienced business intelligence experts alike, learn the basics of navigating the Azure Portal and build an end-to-end modern data warehouse solution using popular technologies such as SQL Database, Data Lake, Data Factory, Data Bricks, Azure Synapse Data Warehouse and Power BI. The link to the class can be found here or directly here.



Part 1 – Navigating the Azure Portal

Part 2 – Resource Groups and Subscriptions

Part 3 – Creating Data Lake Storage

Part 4 – Setting up an Azure SQL Server

Part 5 – Loading Data Lake with Azure Data Factory

Part 6 – Configuring and Setting up Data Bricks

Part 7 – Staging data into Data Lake

Part 8 – Provisioning a Synapse SQL Data Warehouse

Part 9 – Loading Data into Azure Synapse Data Warehouse



Modern Data Architecture – Part 6 – Configuring and Setting up Data Bricks
https://jackofalltradesmasterofsome.com/blog/2020/04/13/modern-data-architecture-part-6-configuring-and-setting-up-data-bricks/ – Mon, 13 Apr 2020

Azure Databricks is an Apache Spark-based analytics platform optimized for the Microsoft Azure cloud services platform. Databricks is an industry-leading, cloud-based data engineering tool used for processing and transforming massive quantities of data and exploring that data through machine learning models.

Azure Data Bricks provides a highly flexible data framework, allowing developers to build an entire ETL framework using languages such as Scala, Python and SQL. In this lab, we will explore how to stage data from your SQL Server to your data lake.

Creating a Data Bricks Resource

  1. In the search bar, search for “Data Bricks” and select “+ Add”
    1. Subscription – Your training subscription
    1. Resource Group – “training_resourcegroup_yourname
    1. Workspace Name – “training_databricks_yourname
    1. Location – East US 2
    1. Pricing Tier – Standard
      1. Charges are only incurred while Data Bricks is in use. No charges accrue if you do not have a cluster running.
    1. Select "Review and Create" to complete
  • Once completed you will be able to launch the data bricks application.
    • This service runs its own portal and toolset and is interfaced with Azure via single sign-on authentication. Select "Launch Workspace" to launch the tool.

Navigating Data Bricks

  1. Review the menu on the left-hand side
  2. The two main sections we will be working with will be Workspace and Clusters. All data bricks code can only run on an active cluster so we will create one first.
  • Select “Clusters-Create Cluster”
  • Enter the details for the cluster
    • Cluster Name – training_cluster_yourname
    • Cluster Mode – Standard
    • Pool – None
    • Data Bricks Runtime – Scala 2.11, Spark 2.4.4
    • Auto Pilot Options (Very Important, this will shut down your clusters when not in use to ensure cost is minimized)
      • Disable – Enable Autoscaling
      • Enable – Terminate after 15 minutes of Inactivity
    • Worker Type – Standard DS2
      • Workers – 1 (since our lab is a low-volume, low-frequency workload, we do not need a lot of horsepower)
  • Select “Create Cluster” to Complete. This will take a few minutes to complete.

Create and Setup a Workspace

  1. Open a Workspace from the left navigation menu
  2. Right click under the “trash” icon and select “Create-> Folder” and name it “staging”
  • Right click the folder and select “Create->Notebook” and name it “Data_Lake_Staging”
    • Language – Python
    • Cluster – “training_cluster_yourname”
  • This will launch a new notebook for your code. A quick first cell to confirm everything is wired up is sketched below.
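Before writing the real staging code, a quick smoke test cell (just a suggestion, not part of the official lab steps) confirms the notebook is attached to the running cluster. The %scala magic lets you run Scala in a cell even though the notebook's default language was set to Python.

%scala

// Confirm the notebook is attached to a running cluster before writing real code.
println(s"Spark version: ${spark.version}")
println(s"Default parallelism: ${sc.defaultParallelism}")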

Be sure to check out my full online class on the topic: a hands-on walkthrough of a Modern Data Architecture using Microsoft Azure. For beginners and experienced business intelligence experts alike, learn the basics of navigating the Azure Portal and build an end-to-end modern data warehouse solution using popular technologies such as SQL Database, Data Lake, Data Factory, Data Bricks, Azure Synapse Data Warehouse and Power BI. The link to the class can be found here or directly here.



Part 1 – Navigating the Azure Portal

Part 2 – Resource Groups and Subscriptions

Part 3 – Creating Data Lake Storage

Part 4 – Setting up an Azure SQL Server

Part 5 – Loading Data Lake with Azure Data Factory

Part 6 – Configuring and Setting up Data Bricks

Part 7 – Staging data into Data Lake

Part 8 – Provisioning a Synapse SQL Data Warehouse

Part 9 – Loading Data into Azure Synapse Data Warehouse



Modern Data Architecture – Part 5 – Loading Data Lake with Data Factory
https://jackofalltradesmasterofsome.com/blog/2020/04/13/modern-data-architecture-part-5-loading-data-lake-with-data-factory/ – Mon, 13 Apr 2020

The Azure Data Factory (ADF) is a service designed to allow developers to integrate disparate data sources.  It is a platform somewhat like SSIS in the cloud to manage the data you have both on-prem and in the cloud.

It provides access to on-premises data in SQL Server and cloud data in Azure Storage (Blob and Tables) and Azure SQL Database.  Access to on-premises data is provided through a data management gateway that connects to on-premises SQL Server databases.

It is not a drag-and-drop interface like SSIS.  Instead, data processing is enabled initially through Hive, Pig and custom C# activities.  Such activities can be used to clean data, mask data fields, and transform data in a wide variety of complex ways.

Creating a Data Factory Resource

  1. In the search bar, search for “Data Factories”
  • Select “+ Add” to create a new Data Factory
  • Fill in the necessary details for your data factory and select “Create”
    • Name – “training-datafactories-yourname
      • Name does not allow underscores, use dashes instead
    • Version – V2
    • Subscription – The default subscription you set up for the demo
    • Resource Group “training_resourcegroup_yourname
    • Location – “East US 2” or the same as your resource group
    • Enable Git – Uncheck this for the demo. In real development, you will want to link your code to a repository for disaster recovery.
  • After a short loading period, your Data Factory resource will be provisioned and listed in your dashboard.
  • Click on the new Data Factory Resource to view its details. Select the “Author and Monitor” button to navigate to the editor.

Setting Up a Data Source

  1. Select "Create a Pipeline" from the editor. In pipelines we can create data flows between data sources and data destinations. For this example, we will load locally stored data into our newly provisioned data lake and schedule the job to run automatically. In real-world scenarios, the data source could be an FTP path or a shared drive. We will also create a data source from a SQL database to the data lake, as we would in a real-world scenario.
  2. Select Datasets->New dataset.
  • In the new datasets section, select “SQL Server”
  • In Set Properties set the name as “Source_Database” and select a + New Linked Service
  • Complete the remaining details for the SQL Data Connection
    • Name “Training_SQLDatabase_yourname
    • Connect via integration run time – Leave as default "AutoResolveIntegrationRuntime"
    • Connection String
      • From Azure Subscription
      • Server name – “training-sqlserver-yourname
        • Enter all the remaining database credentials from the previous lab and select “Test Connection” to test.
        • If you receive a connection error based on IP Firewall, record the IP address and whitelist it on the database similar to how it was done in lab 4.
  • In the last properties window, set the Table Name to “sales.customers”.
  • Click “OK” to save.
  • Create a new connection, but this time select "Azure Blob Storage"
  • Select the delimited text (CSV) format, name it "Destination_CSV", and select "+ New Linked Service"
  1. In the New linked servers window enter the necessary details
    1. Name – “training_storage_yourname
    1. Connect via integration run time – Leave as default "AutoResolveIntegrationRuntime"
    1. Authentication method – Account Key
    1. Account selection method – From Azure Subscription
      1. Select the training storage you created in earlier labs
  • Once complete, select test connection and create.
  • In the Set Properties window, set the file path to "root/raw"
  • Check "First row as header" and select "OK" to save.
  • Navigate to the connections tab in the details and select “Test Connection” to test.

Create a Data Transfer

  1. From the Pipeline tab, in the "Move and Transform" section of the Activities pane, drag and drop a "Copy Data" task onto the main window. In the properties editor below, select "Source" and choose "Source_Database"
  • Select "Query" and enter a query that pulls the customer data from the source, for example a simple SELECT * FROM sales.customers. Use Preview Data to validate that the query works as expected.
  • Click on “Sink” and select “Destination_CSV”

Publishing and Running

  1. Select publish at the top of the window.
  • Select “debug” to run the job.
  • Once the job has succeeded, navigate back to your Storage account and view the storage browser. The new file “customer.csv” should now be available.

Setting up the job to run automatically

  1. Back in Azure Data Factory, select "Add Trigger -> New/Edit" from your Pipeline
  • Create a new trigger
    • Type – Schedule
    • Start Date – Today's date
    • Recurrence – Every 1 month (we don't need this to run frequently yet)
    • End – No end
  • Select "OK" to save and "Publish" to push your changes.
  • From the triggers menu at the bottom, you can now see your new trigger and activate and deactivate it manually.

Be sure to check out my full online class on the topic: a hands-on walkthrough of a Modern Data Architecture using Microsoft Azure. For beginners and experienced business intelligence experts alike, learn the basics of navigating the Azure Portal and build an end-to-end modern data warehouse solution using popular technologies such as SQL Database, Data Lake, Data Factory, Data Bricks, Azure Synapse Data Warehouse and Power BI. The link to the class can be found here or directly here.



Part 1 – Navigating the Azure Portal

Part 2 – Resource Groups and Subscriptions

Part 3 – Creating Data Lake Storage

Part 4 – Setting up an Azure SQL Server

Part 5 – Loading Data Lake with Azure Data Factory

Part 6 – Configuring and Setting up Data Bricks

Part 7 – Staging data into Data Lake

Part 8 – Provisioning a Synapse SQL Data Warehouse

Part 9 – Loading Data into Azure Synapse Data Warehouse



Modern Data Architecture – Part 4 – Setting up a SQL Server
https://jackofalltradesmasterofsome.com/blog/2020/04/13/modern-data-architecture-part-4-setting-up-a-sql-server/ – Mon, 13 Apr 2020

Before we move on to loading data into our Azure Data Lake, we need a data source to simulate pulling data from, so as to mirror a production system. For this exercise, we will walk through creating your own SQL Server database and loading it with sample data.

This is the first resource we are creating that bills on an ongoing basis in the background, so we will want to be sure to deprovision it once we are done with it. The tier we will be using is the Basic database at about $5.00 a month, so this lab should only cost a few cents.

Prerequisites

You will want to install and set up SQL Server Management Studio prior to completing this section.

Create a SQL Server and SQL Database in Azure

  1. In the search bar, search for “SQL Server”
  • Select “+ Add” to create a new SQL Server
  • Fill in the necessary details and select “Review and Create”
    • Subscription – The default subscription you set up for the demo
    • Resource Group – The training group resource group “training_resourcegroup_yourname” you created for the lab
    • Database Details
      • Database Name – “training_sqldatabase_yourname”
      • Server – Create New
        • Server Name – “training-sqlserver-yourname
          • Underscores are not allowed, use dashes
        • Server Admin Login – create a unique id
          • Record this somewhere as we will need it for later in the lab
          • sqlserveradmin for this lab sample if needed
        • Password – secure password
          • Record this somewhere as we will need it for later in the lab
        • Location – US East 2
  • Want to Use SQL Elastic Pool – No
  • Compute + Storage
    • Select Configure.
    • Select Basic. This will ensure you are using the cheapest tier of databases to mitigate costs.
  • Once completed, the SQL Server and Database resource will be available from the dashboard. Click on the server to navigate to details.

Setting Security and Firewalls

  • Click on the server name to begin to configure database rules
  • Select “Show Firewall Settings”
  • Select “Add client IP” to whitelist the current IP of your computers location and select “Save”. Add any other IP that may need access. A simple google search of “what is my IP” will return the IP address required.
  1. Set Allow Azure Services and resources to access server to “Yes”. This will allow Azure Data Factory and other data connections to connect.

Creating a Sample Database

  1. Open SQL Server Management Studio. In the login prompt, enter the server name from Azure and the username and password that was configured. Be sure to use “SQL Server Authentication” in the Authentication Setting
  • Upon successful login, your databases should be available
  • Select "New Query" and set the database to your training database
  • From the Sample Data Folder, run the following two scripts. Be sure the database connection does not revert back to “master” as updated in the previous step.
    • BikeStores Sample Database – create objects
    • BikeStores Sample Database – load data
  • Tables will now appear in your database which you can query and review once the scripts complete running.

Be sure to check out my full online class on the topic: a hands-on walkthrough of a Modern Data Architecture using Microsoft Azure. For beginners and experienced business intelligence experts alike, learn the basics of navigating the Azure Portal and build an end-to-end modern data warehouse solution using popular technologies such as SQL Database, Data Lake, Data Factory, Data Bricks, Azure Synapse Data Warehouse and Power BI. The link to the class can be found here or directly here.



Part 1 – Navigating the Azure Portal

Part 2 – Resource Groups and Subscriptions

Part 3 – Creating Data Lake Storage

Part 4 – Setting up an Azure SQL Server

Part 5 – Loading Data Lake with Azure Data Factory

Part 6 – Configuring and Setting up Data Bricks

Part 7 – Staging data into Data Lake

Part 8 – Provisioning a Synapse SQL Data Warehouse

Part 9 – Loading Data into Azure Synapse Data Warehouse



Modern Data Architecture – Part 3 – Creating Data Lake Storage
https://jackofalltradesmasterofsome.com/blog/2020/04/13/modern-data-architecture-part-3-creating-data-lake-storage/ – Mon, 13 Apr 2020

A data lake is a system or repository of data stored in its natural/raw format, usually object blobs or files. A data lake is usually a single store of all enterprise data including raw copies of source system data and transformed data used for tasks such as reporting, visualization, advanced analytics and machine learning. A data lake can include structured data from relational databases (rows and columns), semi-structured data (CSV, logs, XML, JSON), unstructured data (emails, documents, PDFs) and binary data (images, audio, video).

Azure data storage is extremely cost effective and can be used to store vast amounts of data for a small monthly cost. No cost will be incurred for empty data buckets.

From your Azure Portal we will create a new Storage Account. You may see a resource for Data Lake Gen1 Storage, but the go forward standard at the time of creating this content is a default Storage Account. Azure Storage is a Microsoft-managed service providing cloud storage that is highly available, secure, durable, scalable, and redundant. Azure Storage includes Azure Blobs (objects), Azure Data Lake Storage Gen2, Azure Files, Azure Queues, and Azure Tables. The cost of your storage account depends on the usage and the options you choose below.

  1. Search for Storage Account from your Portal and select “+ Add” at the top left
  1. From the Storage Account Window, fill in the following details and select “Create and Review” to complete the wizard.
    1. Subscription – The subscription you previously setup
    1. Resource Group – “training_resourcegroup_yourname” or the group you previously created
    1. Storage Account Name – This value does not allow underscores, uppercase letters, or names longer than 24 characters, so name the storage account "trainingsayourname"
    1. Location – US East 2 or the region you belong to. It is best practice to keep the same region for latency issues as services that need to communicate with one another are slightly affected by distance
    1. Performance – Standard
    1. Account Kind – Storage V2
    1. Replication – Read-Access Geo-Redundant
    1. Access Tier – Hot
  • After a short duration the storage group will be provisioned and will appear in your dashboard. No cost will be associated to this resource until actual data is moved and stored here.

Creating a Folder Structure

  1. Select "Storage Explorer (preview)"
  • Right click “blob containers” and select “Create blob container”. Name it “root” and leave it “private”
  • Create a new virtual folder
    • This will create a virtual folder. A virtual folder does not actually exist in Azure until you paste, drag or upload blobs into it. To paste a blob into a virtual folder, copy the blob before creating the folder.
  • From the Datafiles folder for this lab, upload the file "placeholderfile" into a folder called "raw".
  • Complete the same exercise for a folder named “staging”

Be sure to check out my full online class on the topic: a hands-on walkthrough of a Modern Data Architecture using Microsoft Azure. For beginners and experienced business intelligence experts alike, learn the basics of navigating the Azure Portal and build an end-to-end modern data warehouse solution using popular technologies such as SQL Database, Data Lake, Data Factory, Data Bricks, Azure Synapse Data Warehouse and Power BI. The link to the class can be found here or directly here.



Part 1 – Navigating the Azure Portal

Part 2 – Resource Groups and Subscriptions

Part 3 – Creating Data Lake Storage

Part 4 – Setting up an Azure SQL Server

Part 5 – Loading Data Lake with Azure Data Factory

Part 6 – Configuring and Setting up Data Bricks

Part 7 – Staging data into Data Lake

Part 8 – Provisioning a Synapse SQL Data Warehouse

Part 9 – Loading Data into Azure Synapse Data Warehouse



Modern Data Architecture – Part 2 – Resource Groups and Subscriptions
https://jackofalltradesmasterofsome.com/blog/2020/04/13/modern-data-architecture-part-2-resource-groups-and-subscriptions/ – Mon, 13 Apr 2020

Resource Groups in Azure are a mechanism for grouping a collection of services and assets in Azure together. Resource groups can assist in automated provisioning, deprovisioning, monitoring, and access control, as well as provide logical groupings for business processes, units or even environments.

A subscription is an object that represents a “folder” that you can put resources in. Subscriptions are tied to tenants. One tenant can have many subscriptions, but not vice versa. An Azure subscription has a trust relationship with Azure Active Directory (Azure AD). A subscription trusts Azure AD to authenticate users, services, and devices. You can associate and manage the directory using a different Azure subscription. All your users have a single home directory for authentication. Billing and cost is tied to a subscription.

Adding a subscription (Not needed for labs)

  1. Navigate to your subscriptions by using the search. The list will auto populate after typing a few characters.
  • Click on Add. This will navigate you to the Microsoft Azure Billing Portal as subscriptions are tied to billing accounts and credit cards.
  • You can select the type of subscription you require. Review which works best for you; the dev/test offer provides slightly favorable pricing for training and education.
  • Once you complete your billing, registration and acceptance of terms and conditions, you will have access to this subscription in your Azure Portal.

Creating a Resource Group

  1. Navigate to Resource groups by using the search. The list will auto-populate after typing a few characters.
  • Click on “Add”
  • On the following screen, give your resource group a new unique name and select the subscription you would like it tied to.
    • For this lab we will be using US East 2 for all of our regions to keep everything lined up to one geography. You can change and have resources in cross regions but it is best to keep all items lined up.
    • For all naming conventions we are going to follow the same standard to keep it easy to remember: "training_resourcetype_yourname".
  • Hit "Review and Create" and complete the workflow to create your new resource group. The top right of the screen will show the deployment status of your new resource. Once it is complete, your new group should be ready to use.

Be sure to check out my full online class on the topic: a hands-on walkthrough of a Modern Data Architecture using Microsoft Azure. For beginners and experienced business intelligence experts alike, learn the basics of navigating the Azure Portal and build an end-to-end modern data warehouse solution using popular technologies such as SQL Database, Data Lake, Data Factory, Data Bricks, Azure Synapse Data Warehouse and Power BI. The link to the class can be found here or directly here.

Part 1 – Navigating the Azure Portal

Part 2 – Resource Groups and Subscriptions

Part 3 – Creating Data Lake Storage

Part 4 – Setting up an Azure SQL Server

Part 5 – Loading Data Lake with Azure Data Factory

Part 6 – Configuring and Setting up Data Bricks

Part 7 – Staging data into Data Lake

Part 8 – Provisioning a Synapse SQL Data Warehouse

Part 9 – Loading Data into Azure Synapse Data Warehouse

Modern Data Architecture – Part 1 – Navigating the Azure Portal
https://jackofalltradesmasterofsome.com/blog/2020/04/08/modern-data-architecture-part-1-navigating-the-azure-portal/ – Wed, 08 Apr 2020

Logging into Azure and the Portal

  1. Open up a web browser and navigate to https://azure.microsoft.com/en-us/ and click on “Portal” to log in.
  • You will be navigated to the authentication portal to log in.
    • You can either sign up for a new account which comes with free credit for a year or use a designated account for your organization.
  • Once logged in, you will be navigated to the Azure Portal.
    • The center portion will outline the most recent resources you have worked with or have access to
    • On the left-hand side, you can navigate to resources or have the ability to create new ones
    • The search at the top can also be used to navigate to different resources
  • Explore more services by clicking the "More Services" button. From here you can find helpful documentation in addition to seeing all the other offerings Azure provides, including free ones!

Be sure to check out my full online class on the topic: a hands-on walkthrough of a Modern Data Architecture using Microsoft Azure. For beginners and experienced business intelligence experts alike, learn the basics of navigating the Azure Portal and build an end-to-end modern data warehouse solution using popular technologies such as SQL Database, Data Lake, Data Factory, Data Bricks, Azure Synapse Data Warehouse and Power BI. The link to the class can be found here.

Part 1 – Navigating the Azure Portal

Part 2 – Resource Groups and Subscriptions

Part 3 – Creating Data Lake Storage

Part 4 – Setting up an Azure SQL Server

Part 5 – Loading Data Lake with Azure Data Factory

Part 6 – Configuring and Setting up Data Bricks

Part 7 – Staging data into Data Lake

Part 8 – Provisioning a Synapse SQL Data Warehouse

Part 9 – Loading Data into Azure Synapse Data Warehouse
