Azure SQL Data Warehouse is a fast, flexible, and secure cloud data warehouse tuned for running complex queries quickly across petabytes of data. The Azure DWH is built on Parallel Data Warehouse technology, which was already the pinnacle of SQL Server DWH workloads. Moving this technology to the cloud made it available to the majority of companies, enabling them to use this workhorse as a powerful engine to drive their analytical needs: scaling down, or even pausing the service, when it isn’t needed, and scaling up when the workload demands more raw power. Gen2 is five times faster than the already blazing fast Gen1.
5 times faster, how is this even possible?
The performance of massively parallel processing (MPP) data warehouse workloads is typically determined by:
- I/O bandwidth to storage
- repartitioning speed, also known as shuffle speed.
To solve these issues, Gen2 introduces two new features.
First, each SQL DW compute node is equipped with a dynamic cache that holds the recently accessed SQL Server columnar storage segments and is available over the network to the other Azure SQL DW nodes. No additional setup or configuration is required.
Equally important, the presence of this cache still allows you to pause or resize the data warehouse instance to save money. When a SQL DW instance resumes, it populates the cache again from Azure Storage as data is queried.
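The repopulate-on-demand behaviour can be pictured as a read-through cache. The sketch below is a toy model only, not how SQL DW implements its cache: the class name `SegmentCache`, the least-recently-used eviction policy, the capacity, and the segment names are all assumptions made for illustration.

```python
from collections import OrderedDict


class SegmentCache:
    """Toy read-through cache with least-recently-used eviction (assumed policy)."""

    def __init__(self, capacity, fetch_from_storage):
        self.capacity = capacity
        self.fetch_from_storage = fetch_from_storage  # called only on a miss
        self.segments = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, segment_id):
        if segment_id in self.segments:
            self.hits += 1
            self.segments.move_to_end(segment_id)  # mark as most recently used
            return self.segments[segment_id]
        self.misses += 1
        data = self.fetch_from_storage(segment_id)
        self.segments[segment_id] = data
        if len(self.segments) > self.capacity:
            self.segments.popitem(last=False)  # evict the least recently used
        return data

    def pause(self):
        """Pausing empties the cache; the remote storage still holds the data."""
        self.segments.clear()


# Hypothetical stand-in for remote storage.
remote_storage = {f"seg-{i}": f"rows-{i}" for i in range(5)}
cache = SegmentCache(capacity=3, fetch_from_storage=remote_storage.__getitem__)

cache.read("seg-0")  # miss: fetched from remote storage
cache.read("seg-0")  # hit: served from the local cache
cache.pause()        # cache is emptied, no data is lost
cache.read("seg-0")  # miss again: repopulated on demand
print(cache.hits, cache.misses)  # prints: 1 2
```

The first query after a resume pays the cost of going back to storage; later queries on the same segments are served from the cache again.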
The second feature is Replicated Tables. Previously, smaller domain, reference, or dimension tables in a SQL Data Warehouse instance used the default round-robin distribution. During query execution, data was copied to each compute node, forcing queries to run longer. Furthermore, system resources were taken away from other queries on the system to move the data. With Replicated Tables, the data is available on all compute nodes, so data movement is eliminated and queries run faster.
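As a rough sketch of what switching a small dimension table to a replicated one looks like, the Python helper below builds a CREATE TABLE AS SELECT statement with `DISTRIBUTION = REPLICATE`. The helper functions and the table names (`DimRegion`, `DimRegion_Replicated`) are hypothetical and only serve the illustration; check the generated statement against the official documentation before running it on a real instance.

```python
def quote_identifier(name):
    """Bracket-quote a T-SQL identifier, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def replicate_table_ctas(schema, source_table, target_table):
    """Build a CTAS statement that copies a table into a replicated one."""
    source = f"{quote_identifier(schema)}.{quote_identifier(source_table)}"
    target = f"{quote_identifier(schema)}.{quote_identifier(target_table)}"
    return (
        f"CREATE TABLE {target}\n"
        "WITH (CLUSTERED COLUMNSTORE INDEX, DISTRIBUTION = REPLICATE)\n"
        f"AS SELECT * FROM {source};"
    )


# Hypothetical table names, for illustration only.
statement = replicate_table_ctas("dbo", "DimRegion", "DimRegion_Replicated")
print(statement)
```

Creating the replicated copy under a new name leaves the original round-robin table in place until the copy has been verified.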
How much faster? Well…
Up to 5x Better performance
Gen2 gives you major speed increases on both fronts at the same price level. To put it bluntly, on an average data warehouse workload we see a 5.4 times performance improvement along with 4 times more concurrency.
As if that weren’t enough, Gen2 also removes some important boundaries.
Unlimited Columnstore and the power to use it
Gen2 gives you unlimited columnstore storage capacity and the compute power to deliver acceptable query performance on ever larger data. The additional capabilities of the Azure SQL DW Compute Optimized Gen2 tier are specifically focused on this area. Alongside unlimited data in SQL’s columnar format, it has new performance tiers that increase the compute capacity even further.
4x Higher Concurrency
Like every MPP data warehousing system, SQL DW has a limit on the number of concurrent queries that can be processed and executed, which sometimes leads to a suboptimal user experience. The Gen2 tier raises this limit to 128 concurrent queries, four times the concurrency of the previous generation.
How to get it?
You can seamlessly upgrade from Gen1 to Gen2 using the following documentation.
What can we do for you?
As Kohera we are very proud to be part of the partner network for the Azure DWH.
Native third-party connectors
Although we prefer the Microsoft tools like Power BI or SSRS, Azure SQL Data Warehouse also works with other data integration and business intelligence solutions such as Informatica, Talend, Tableau, MicroStrategy, Qlik, and Alteryx.
This means that you can now connect Azure SQL Data Warehouse Gen2 to, for example, Tableau through the existing native SQL Server connector. The good thing about this is that a native connection is tuned for performance and does not require custom configuration or coding; just point to your data and go!
Other important new features on the Azure DWH (both on Gen 1 & Gen 2)
SQL DW now supports Azure Monitor, a built-in monitoring service that consumes performance and health telemetry for your data warehouse. Azure Monitor not only lets you monitor your data warehouse within the Azure portal; its tight integration with other Azure services also lets you monitor your entire data analytics solution within a single interface.
Enhanced Pause button
The pause feature for SQL DW enables you to reduce and manage operating costs for your data warehouse by turning off compute during times of little to no activity. This feature now detects actively running queries and provides a warning before issuing the pause command. Pausing also cancels all sessions to immediately quiesce your data warehouse before shutting it down, which can sometimes interrupt your end-user applications.
Now with a simple click of the pause button in the Azure portal, you can detect the number of running queries so you can make an informed decision on when to pause.
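The same "warn before pausing" logic is easy to mirror in your own automation. The sketch below is a minimal illustration and not the portal's implementation: the function name `pause_decision` is hypothetical, and how you obtain the number of running queries is deliberately left out.

```python
def pause_decision(running_queries, confirmed=False):
    """Return (should_pause, message) for a pause request.

    running_queries: number of queries currently executing (hypothetical
        input; obtaining it is outside the scope of this sketch).
    confirmed: True once the operator accepts that sessions get cancelled.
    """
    if running_queries < 0:
        raise ValueError("running_queries cannot be negative")
    if running_queries == 0:
        return True, "No running queries; safe to pause."
    if confirmed:
        return True, f"Pausing; {running_queries} running queries will be cancelled."
    return False, f"Warning: {running_queries} queries are still running."


for count, confirmed in [(0, False), (3, False), (3, True)]:
    should_pause, message = pause_decision(count, confirmed)
    print(should_pause, message)
```

Requiring an explicit confirmation when queries are still running keeps a scheduled pause from silently cancelling user sessions.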
Thanks to the integration with Azure Analysis Services, creating a model from SQL DW is extremely easy and can even be done through the Azure portal. This enables you to achieve high concurrency and performance for your BI dashboards, to offload the provisioned capacity for SQL DW, and to lower your overall data warehouse cost.
TDE – BYOK
Transparent Data Encryption (TDE) with Bring Your Own Key (BYOK) support is now generally available for Azure SQL Database and Azure SQL Data Warehouse.