
Azure SQL Data Warehouse “Gen 2”: a major game changer


Azure SQL Data Warehouse is a fast, flexible, and secure cloud data warehouse tuned for running complex queries quickly across petabytes of data. The Azure DWH is built on parallel data warehouse technology, which was already the pinnacle for SQL Server DWH workloads. Moving this technology to the cloud made it available to most companies, enabling them to use this workhorse as a powerful engine for their analytical needs: they can scale down, or even pause the service, when it is not needed, and scale up when the workload demands more raw power. Gen 2 is five times faster than the already blazing fast Gen 1.

5 times faster, how is this even possible?

The performance of massively parallel processing (MPP) DWH workloads is typically determined by:

  • I/O bandwidth to storage
  • repartitioning speed, also known as shuffle speed.

To address both bottlenecks, Gen 2 introduces two new features:

Cache

First of all, each SQL DW compute node is equipped with a dynamic cache containing the recently accessed SQL Server columnar storage segments. The cache is available via the network to the other Azure SQL DW nodes, and no additional setup or configuration is required.

Equally important, the presence of this cache still allows you to pause or resize the data warehouse instance to save money. When a SQL DW instance resumes, it populates the cache again from Azure Storage as data is queried.
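The behaviour described above can be illustrated with a small toy model. This is only a sketch: the class name SegmentCache, the least-recently-used eviction policy, and the fake_storage helper are assumptions made for the illustration, not a description of how the real Gen 2 cache is implemented.

```python
from collections import OrderedDict


class SegmentCache:
    """Toy cache of columnstore segments (hypothetical, illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._segments = OrderedDict()
        self.remote_reads = 0  # reads that had to go to remote storage

    def read(self, segment_id, load_from_storage):
        if segment_id in self._segments:
            # Cache hit: mark the segment as recently used.
            self._segments.move_to_end(segment_id)
            return self._segments[segment_id]
        # Cache miss: fetch from storage and remember the segment.
        self.remote_reads += 1
        data = load_from_storage(segment_id)
        self._segments[segment_id] = data
        if len(self._segments) > self.capacity:
            # Assumed policy: evict the least recently used segment.
            self._segments.popitem(last=False)
        return data

    def clear(self):
        """Model a pause and resume: the cache starts out empty again."""
        self._segments.clear()


def fake_storage(segment_id):
    # Stand-in for a read from Azure Storage.
    return f"segment-{segment_id}"


cache = SegmentCache(capacity=2)
cache.read(1, fake_storage)  # miss, loaded from storage
cache.read(1, fake_storage)  # hit, served from the cache
reads_before_pause = cache.remote_reads

cache.clear()                # pause and resume
cache.read(1, fake_storage)  # miss again, cache is repopulated on demand
reads_after_resume = cache.remote_reads
```

In this model the first query after a resume pays the storage read again, and later queries on the same segment are served from the cache.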

Replicated tables

Previously, SQL Data Warehouse instances containing smaller domain, reference, or dimension tables used the default round robin distribution. During query execution, data was copied to each compute node, making queries run longer. Furthermore, system resources were taken away from other queries on the system to move the data. With Replicated Tables, the data is available on all compute nodes, so data movement is eliminated and queries run faster.
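To make the difference concrete, here is a minimal simulation of the data movement involved. It is a sketch under stated assumptions: the node count, the 100-row dimension table, and the function names are hypothetical, and the model only counts rows copied, not the actual cost inside SQL DW.

```python
NUM_NODES = 4  # hypothetical number of compute nodes


def round_robin(rows, num_nodes):
    """Spread rows evenly over the nodes, regardless of their content."""
    placement = [[] for _ in range(num_nodes)]
    for i, row in enumerate(rows):
        placement[i % num_nodes].append(row)
    return placement


def rows_copied_for_join(placement):
    """Count the rows that must be copied so every node holds the full table."""
    all_rows = set()
    for node_rows in placement:
        all_rows.update(node_rows)
    # Each node needs every row it does not already hold locally.
    return sum(len(all_rows) - len(set(node_rows)) for node_rows in placement)


# A small dimension table of (key, label) rows.
dimension = [(key, f"country-{key}") for key in range(100)]

round_robin_placement = round_robin(dimension, NUM_NODES)
replicated_placement = [list(dimension) for _ in range(NUM_NODES)]

moved_round_robin = rows_copied_for_join(round_robin_placement)
moved_replicated = rows_copied_for_join(replicated_placement)
```

With round robin, every node holds only a quarter of the table and has to receive the rest at query time; with a replicated table nothing has to move. In SQL DW itself, a table is replicated by specifying DISTRIBUTION = REPLICATE in the WITH clause of CREATE TABLE.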

How much faster? Well…

Up to 5x Better performance

Gen 2 gives you major speed increases on both levels at the same price level. To put it bluntly, on an average data warehouse workload we see a 5.4 times performance improvement while achieving 4 times more concurrency.

As if this weren’t enough, Gen 2 also removes some important boundaries.

Unlimited Columnstore and the power to use it

Gen 2 gives you unlimited columnstore storage capacity and the compute power to deliver acceptable query performance on ever larger data. The additional capabilities of the Azure SQL DW Compute Optimized Gen2 tier are specifically focused on this area. Together with the unlimited data in SQL’s columnar format, it has new performance tiers which increase the compute capacity even further.

4x Higher Concurrency

As in every MPP data warehousing system, there is a limit to the number of concurrent queries that can be processed and executed, which sometimes leads to a suboptimal user experience. The Gen2 tier raises this limit to 128 concurrent queries, four times the concurrency of the previous generation.

How to get it?

You can seamlessly upgrade from the Gen 1 to the Gen 2, using the following documentation.

What can we do for you?

At Kohera, we are very proud to be part of the partner network for the Azure DWH.

Something else

Native third party connectors.

Although we prefer the Microsoft tools like Power BI or SSRS, Azure SQL Data Warehouse also works with other data integration and business intelligence solutions such as Informatica, Talend, Tableau, MicroStrategy, Qlik, and Alteryx.

This means that you can now connect Azure SQL Data Warehouse Gen2 to, for example, Tableau through the existing native SQL Server connector. The good thing about this is that a native connection is tuned for performance and does not involve custom configuration or coding; just point to your data and go!

Other important new features on the Azure DWH (both on Gen 1 & Gen 2)

Monitoring

SQL DW now supports Azure Monitor, a built-in monitoring service that consumes performance and health telemetry for your data warehouse. Azure Monitor not only enables you to monitor your data warehouse within the Azure portal; its tight integration between Azure services also enables you to monitor your entire data analytics solution within a single interface.

Enhanced Pause button

The pause feature for SQL DW enables you to reduce and manage operating costs for your data warehouse by turning off compute during times of little to no activity. This feature now detects actively running queries and provides a warning before issuing the pause command. Pausing cancels all sessions to immediately quiesce your data warehouse before shutting it down, which can sometimes interrupt your end user applications.

Now with a simple click of the pause button in the Azure portal, you can detect the number of running queries so you can make an informed decision on when to pause.
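If you automate pausing instead of clicking the button, the same check can be scripted. The sketch below is an illustration only: the pause_decision function and its return values are hypothetical, and the query against the sys.dm_pdw_exec_requests view is an assumed way to count running queries, not what the portal does internally.

```python
# Assumed query for counting running requests; run it with the SQL
# client of your choice and pass the result to pause_decision().
ACTIVE_QUERY_COUNT_SQL = """
SELECT COUNT(*)
FROM sys.dm_pdw_exec_requests
WHERE status = 'Running';
"""


def pause_decision(active_queries, force=False):
    """Decide what to do before pausing (hypothetical helper).

    Returns "pause" when nothing is running, "wait" when queries are
    still active, and "pause-and-cancel" when the caller accepts that
    running sessions will be cancelled.
    """
    if active_queries < 0:
        raise ValueError("active_queries cannot be negative")
    if active_queries == 0:
        return "pause"
    if force:
        return "pause-and-cancel"
    return "wait"


decision_idle = pause_decision(0)
decision_busy = pause_decision(3)
decision_forced = pause_decision(3, force=True)
```

Keeping the decision in a small pure function makes it easy to test without a live data warehouse; only the count has to come from the database.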

Azure Analytics

Thanks to the integration with Azure Analysis Services, creating a model from SQL DW is extremely easy and can even be done through the Azure portal. This enables you to achieve high concurrency and performance for your BI dashboards, to offload work from the provisioned capacity for SQL DW, and to lower your overall data warehouse cost.

TDE – BYOK

Transparent Data Encryption (TDE) with Bring Your Own Key (BYOK) support is now generally available for Azure SQL Database and Azure SQL Data Warehouse.

