Two tools that are going to make life a bit simpler if you are going to be working with HDInsights and Azure blog storage are “Azure Data Lake and Stream Analytic Tools for Visual Studio” and Azure Storage Browser.
Azure Data Lake and Stream Analytic Tools for Visual Studio
- To run Hive Queries, you’re going to need to install Azure Data Lake and Stream Analytic Tools to your version of Visual Studio, sometimes referred to as HDInsight tools for Visual Studio or Azure Data Lake tools for Visual Studio. You can install directly from Visual studio by selection Tools -> Get Tools and Features.
- Once you have Visual Studio Open, navigate to your server explorer and verify you are connected to the right Azure Subscription. You should now see your cluster information as created in the previous blog post on “Provisioning HDInsight Clusters using PowerShell”
You should be able to browse and interact with your storage from Visual Studio as well as the Azure dashboard at this point but another helpful tool to install will be “Azure Storage Explorer”. It can be found and installed from the following location.
Data explorer can be used to create new folders and upload your data files and this does make that job a little easier. Please note, when creating new folders, the actual folder will not be created until you upload files using this tool.
That should be it! From the Server explorer, you can now see the HDInsight Cluster that was spun up as well as the storage accounts where files will be stored. In the next post we will look at building and running Hive queries against your data files to get them ready for reporting.