Category: Data Warehouse
-
The Modern Data Warehouse; Running Hive Queries in Visual Studio to combine data
In previous posts we have looked at storing data files to blob storage and using PowerShell to spin up an HDInsight Hadoop cluster. We have also installed some basic software that will help us get going once the services are provisioned. Now that the basics are ready, it is time to process some of that…
-
Setting up tools to work with HDInsight and run Hive Queries – Azure Data Lake Tools and Azure Storage Browser
Two tools that will make life a bit simpler if you are going to be working with HDInsight and Azure blob storage are “Azure Data Lake and Stream Analytics Tools for Visual Studio” and Azure Storage Browser. Azure Data Lake and Stream Analytics Tools for Visual Studio: To run Hive Queries, you’re going…
-
Big Data for The Rest of Us. Affordable and Modern Business Intelligence Architecture – Adding Lifecycles to your S3 buckets to save cost and retain data forever!
I wanted to keep this post short since, as I mentioned in the previous post about cloud storage, our use case is already an affordable one. Still, it makes sense to touch on strategies for moving files to other storage tiers to make sure we are maximizing our cost savings vs.…
-
Big Data for The Rest of Us. Affordable and Modern Business Intelligence Architecture – Auto uploading and syncing your data using AWS S3
The first step in any data warehouse project is getting the data into a staging environment. In the traditional data warehouse, this required an ETL process to pick up data files from a local folder or FTP, and in some cases, a direct SQL connection to source systems to then load into a dedicated staging…
-
Big Data for The Rest of Us. Affordable and Modern Business Intelligence Architecture – An Introduction using AWS
If you google the use cases for Big Data, you will usually find references to scenarios such as web click analytics, streaming data, or even IoT sensor data, but most organizations' data needs and data sources never fall into any of these categories. However, that does not mean they are not great candidates for a…
-
The Modern Data Warehouse; The low-cost solution using Big Data with HDInsight and PowerShell
Organizations have been reluctant to transition their current business intelligence solutions from the traditional data warehouse infrastructure to big data for many reasons, but two of those reasons are false barriers to entry: cost and the complexity of provisioning environments. In the post below, we will cover how to use PowerShell to commission and decommission…
-
Accelerating the Staging Process for your Data Warehouse
Real Estate Data Warehouse – Accelerating the Staging Process: The script below can be used to build a staging environment for any industry, not just real estate-related databases. The specifics of a real estate data warehouse will be covered in a future blog post. It will allow you to Accelerating the Staging Process for…
-
Getting Data from Yardi Log Shipping Part 2
Until recently, obtaining data externally from Yardi was limited to two methods. The first was via a nightly backup of the database; the second, if you were paying for private cloud, was via direct SQL query access over VPN. Both worked very well, but both also had some minor flaws. We also have…