
Natively Stored Procs and In-memory tables: NextGen loading!

Today I was creating a dummy database and wanted to fill it with a certain amount of data. Since I was going to use this database for demo purposes, I created the tables with primary keys, foreign keys and some indexes. The way I was loading data turned out to be quite slow, so I decided to load the data into my database using natively compiled stored procedures and in-memory tables, since this should be the fastest way. The result really baffled me!

Some background information: I’m using a Dell Inspiron 5220 with 16GB of RAM. My virtual machine has access to all cores but is capped at 90%. It has 4096MB of memory, of which 3584MB is allocated to SQL Server. The VM is a Server Core installation running SQL Server 2016 on SSDs.

In the first scenario I loaded my data into SQL Server without using in-memory tables or natively compiled stored procedures. A lot of code follows, but this way you can test it yourself. The following script was used.

-- CREATE DATABASE PART
ALTER DATABASE REALTIMEANALYTICS SET SINGLE_USER WITH ROLLBACK IMMEDIATE

USE MASTER
GO
DROP DATABASE IF EXISTS RealtimeAnalytics
GO
CREATE DATABASE [RealtimeAnalytics]
CONTAINMENT = NONE
ON  PRIMARY
( NAME = N'RealtimeAnalytics_1', FILENAME = N'G:\DATA\RealtimeAnalytics_1.mdf' , SIZE = 131072KB , FILEGROWTH = 131072KB ),
( NAME = N'RealtimeAnalytics_2', FILENAME = N'G:\DATA\RealtimeAnalytics_2.ndf' , SIZE = 131072KB , FILEGROWTH = 131072KB ),
( NAME = N'RealtimeAnalytics_3', FILENAME = N'G:\DATA\RealtimeAnalytics_3.ndf' , SIZE = 131072KB , FILEGROWTH = 131072KB ),
( NAME = N'RealtimeAnalytics_4', FILENAME = N'G:\DATA\RealtimeAnalytics_4.ndf' , SIZE = 131072KB , FILEGROWTH = 131072KB )
LOG ON
( NAME = N'RealtimeAnalytics_log', FILENAME = N'C:\Data\LOG\RealtimeAnalytics_log.ldf' , SIZE = 131072KB , FILEGROWTH = 131072KB )
GO

This creates a database called RealtimeAnalytics with 4 data files. Next, I add 4 tables with keys referencing each other.

 

-- CREATE TABLE PART
CREATE TABLE dbo.Clients
(
    ClientID int identity(-2147483648,1) PRIMARY KEY NOT NULL,
    ClientName varchar(200),
    ClientCreditcard varchar(200)
)
GO
CREATE TABLE dbo.SalesPersons
(
    SalesPersonID int identity(-2147483648,1) PRIMARY KEY NOT NULL,
    SalesPersonName varchar(300),
    SalesPersonDiscount int
)
GO
CREATE TABLE dbo.Item
(
    ItemID int identity(-2147483648,1) PRIMARY KEY NOT NULL,
    ItemName varchar(300),
    ItemPrice numeric(15,2)
)
GO
CREATE TABLE dbo.Production
(
    ProductionID int identity(-2147483648,1) PRIMARY KEY NOT NULL,
    ClientID int NOT NULL FOREIGN KEY REFERENCES dbo.Clients(ClientID),
    SalesPersonID int NOT NULL FOREIGN KEY REFERENCES dbo.SalesPersons(SalesPersonID),
    AmountOfItemsSold int,
    ItemID int NOT NULL FOREIGN KEY REFERENCES dbo.Item(ItemID),
    DateOfSale datetime NOT NULL DEFAULT(GETDATE())
)
GO

After creating the tables, I insert data into them in a while loop. It’s not the best solution, but, hey, I will do exactly the same thing with the natively compiled procedure.

 

---- FILL THE TABLES WITH DATA
DECLARE @Count int = 1
--FILL CLIENTS
WHILE @Count <= 1000
BEGIN
INSERT INTO dbo.Clients
select 'Name' + cast(@Count as varchar), '1111-1111-111'+cast(@Count as varchar)
SET @Count = @Count +1
END
SET @Count = 1
--FILL SalesPersons
WHILE @Count <= 10
BEGIN
INSERT INTO dbo.SalesPersons
select 'Name' + cast(@Count as varchar), FLOOR(RAND()*(50-1)+1)
SET @Count = @Count +1
END
SET @Count = 1
--FILL Items
WHILE @Count <= 800
BEGIN
INSERT INTO dbo.Item
select 'Item' + cast(@Count as varchar),cast(RAND()*(200000-1) as numeric(15,2))
SET @Count = @Count +1
END
SET @Count = 1
DECLARE @ClientID int
DECLARE @SalesPersonID int
DECLARE @ItemID int
--FILL Production
WHILE @Count <= 2000000
BEGIN
SET @ClientID = (select ClientID from dbo.Clients where ClientName = 'Name'+CAST(FLOOR(RAND()*(1000-1)+1)AS VARCHAR))
SET @SalesPersonID = (select SalesPersonID from dbo.SalesPersons where SalesPersonName = 'Name'+CAST(FLOOR(RAND()*(10-1)+1)AS VARCHAR))
SET @ItemID = (select ItemID from dbo.Item where ItemName = 'Item'+CAST(FLOOR(RAND()*(800-1)+1)as varchar))

INSERT INTO dbo.Production
select @ClientID,@SalesPersonID,FLOOR(RAND()*(100-1)+1),@ItemID,dateadd(second, -floor(rand()*(604800-1)+1),getdate())
SET @Count = @Count +1
END
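As an aside, the row-by-row loop above is easy to read but slow on disk-based tables, because every iteration is its own insert. For comparison, a set-based insert can fill a table like Clients in a single statement. This is just a sketch I did not use in the test, built on a row-number CTE over sys.all_objects:

```sql
-- Sketch (not used in the test above): set-based fill of dbo.Clients.
-- Generates 1000 rows in one insert instead of 1000 single-row inserts.
;WITH Tally AS
(
    SELECT TOP (1000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects
)
INSERT INTO dbo.Clients (ClientName, ClientCreditcard)
SELECT 'Name' + CAST(n AS varchar(10)),
       '1111-1111-111' + CAST(n AS varchar(10))
FROM Tally;
```

The same pattern would work for the other tables, but since the point here is to compare the identical loop with and without In-Memory OLTP, I kept the loop.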

When we run this script, we see the following workload on the C-drive. The SSD is getting hammered, but CPU usage is quite fine.
[Screenshot: disk and CPU workload while the script runs]

I decided to stop the script after about an hour and a half, because this was taking way too long.
[Screenshot: the script stopped after about an hour and a half]

 

When I checked how many records were inserted into the Production table, I saw the following result.
[Screenshot: row count in the Production table]
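For reference, the row count can be checked with a plain COUNT, or, without scanning a large table, from the partition statistics DMV:

```sql
-- Exact count (scans the table)
SELECT COUNT(*) AS RowsInserted FROM dbo.Production;

-- Count from metadata, no scan needed
SELECT SUM(row_count) AS RowsInserted
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('dbo.Production')
  AND index_id IN (0, 1);  -- heap or clustered index only
```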

In all that time, only about 1 000 000 rows were inserted, which is unacceptable. So I thought: let’s use those improved in-memory tables and natively compiled stored procedures and see what the difference is. I edited the script with the following code.

First, I enabled the database for In-Memory OLTP by adding a memory-optimized filegroup and container:

use master  
go  
alter database RealtimeAnalytics add filegroup RealtimeAnalytics_mod contains memory_optimized_data  
go  
-- adapt filename as needed  
alter database RealtimeAnalytics add file (name='RealtimeAnalytics_mod', filename='G:\data\RealtimeAnalytics_mod') to filegroup RealtimeAnalytics_mod  
go
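If you want to verify that the filegroup was added correctly, it shows up in sys.filegroups with type FX:

```sql
-- A memory-optimized data filegroup has type 'FX'
SELECT name, type, type_desc
FROM RealtimeAnalytics.sys.filegroups
WHERE type = 'FX';
```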

I then created the same table structure, but memory-optimized.

--MEMORY OPTIMIZED 
use RealtimeAnalytics  
go  
CREATE TABLE dbo.Clients_MON
(
    ClientID int identity(1,1) PRIMARY KEY NONCLUSTERED NOT NULL,
    ClientName varchar(200),
    ClientCreditcard varchar(200)
)
WITH (MEMORY_OPTIMIZED = ON)
GO

CREATE TABLE dbo.SalesPersons_MON
(
    SalesPersonID int identity(1,1) PRIMARY KEY NONCLUSTERED NOT NULL,
    SalesPersonName varchar(300),
    SalesPersonDiscount int
)
WITH (MEMORY_OPTIMIZED = ON)
GO

CREATE TABLE dbo.Item_MON
(
    ItemID int identity(1,1) PRIMARY KEY NONCLUSTERED NOT NULL,
    ItemName varchar(300),
    ItemPrice numeric(15,2)
)
WITH (MEMORY_OPTIMIZED = ON)
GO

CREATE TABLE dbo.Production_MON
(
    ProductionID int identity(1,1) PRIMARY KEY NONCLUSTERED NOT NULL,
    ClientID int NOT NULL FOREIGN KEY REFERENCES dbo.Clients_MON(ClientID),
    SalesPersonID int NOT NULL FOREIGN KEY REFERENCES dbo.SalesPersons_MON(SalesPersonID),
    AmountOfItemsSold int,
    ItemID int NOT NULL FOREIGN KEY REFERENCES dbo.Item_MON(ItemID),
    DateOfSale datetime NOT NULL DEFAULT(GETDATE())
)
WITH (MEMORY_OPTIMIZED = ON)
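These tables use the default DURABILITY = SCHEMA_AND_DATA, so the data is still logged and persisted. For a pure staging scenario like this one, SCHEMA_ONLY should be even faster, because logging and checkpointing of the data are skipped entirely; the trade-off is that the table is empty again after a restart. A sketch for the Clients staging table (the _ND name is just my own example):

```sql
-- Sketch: non-durable variant of the staging table.
-- SCHEMA_ONLY = only the table definition survives a restart, not the data.
CREATE TABLE dbo.Clients_MON_ND
(
    ClientID int identity(1,1) PRIMARY KEY NONCLUSTERED NOT NULL,
    ClientName varchar(200),
    ClientCreditcard varchar(200)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY)
```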

Notice that I can’t use an identity seed of -2147483648 here: memory-optimized tables only support IDENTITY(1,1), more info on that here. Why would they do that? But this is out of scope, so let me give you the stored procedure I created. It does exactly the same thing as the previous test, only now as a natively compiled stored procedure.

USE RealtimeAnalytics
GO

create procedure dbo.FillDatabase  
with native_compilation, schemabinding, execute as owner  
as  
begin atomic  
with (transaction isolation level=snapshot, language=N'us_english')  
  
DECLARE @Count int = 1

--FILL CLIENTS
WHILE @Count <= 1000
BEGIN
INSERT INTO dbo.Clients_MON
select 'Name' + cast(@Count as varchar), '1111-1111-111'+cast(@Count as varchar)
SET @Count = @Count +1
END

SET @Count = 1
--FILL SalesPersons
WHILE @Count <= 10
BEGIN
INSERT INTO dbo.SalesPersons_MON
select 'Name' + cast(@Count as varchar), FLOOR(RAND()*(50-1)+1)
SET @Count = @Count +1
END

SET @Count = 1
--FILL Items
WHILE @Count <= 800
BEGIN
INSERT INTO dbo.Item_MON
select 'Item' + cast(@Count as varchar),cast(RAND()*(200000-1) as numeric(15,2))
SET @Count = @Count +1
END

SET @Count = 1

DECLARE @ClientID int
DECLARE @SalesPersonID int
DECLARE @ItemID int
--FILL Production
WHILE @Count <= 2000000
BEGIN
SET @ClientID = (select ClientID from dbo.Clients_MON where ClientName = 'Name'+CAST(FLOOR(RAND()*(1000-1)+1)AS VARCHAR))
SET @SalesPersonID = (select SalesPersonID from dbo.SalesPersons_MON where SalesPersonName = 'Name'+CAST(FLOOR(RAND()*(10-1)+1)AS VARCHAR))
SET @ItemID = (select ItemID from dbo.Item_MON where ItemName = 'Item'+CAST(FLOOR(RAND()*(800-1)+1)as varchar))

INSERT INTO dbo.Production_MON
select @ClientID,@SalesPersonID,FLOOR(RAND()*(100-1)+1),@ItemID,dateadd(second, -floor(rand()*(604800-1)+1),getdate())
SET @Count = @Count +1
END
End

 

After this, I created the following statement to load the data into the in-memory tables and then copy it to my normal tables with an INSERT … SELECT:

EXEC dbo.FillDatabase
GO
SET IDENTITY_INSERT dbo.Clients ON
insert into dbo.Clients(ClientID,ClientName,ClientCreditcard) 
select ClientID,ClientName,ClientCreditcard from Clients_MON
SET IDENTITY_INSERT dbo.Clients OFF
GO
SET IDENTITY_INSERT dbo.SalesPersons ON
insert into dbo.SalesPersons (SalesPersonID,SalesPersonName,SalesPersonDiscount)
select SalesPersonID,SalesPersonName,SalesPersonDiscount from SalesPersons_MON
SET IDENTITY_INSERT dbo.SalesPersons OFF
GO
SET IDENTITY_INSERT dbo.Item ON
insert into dbo.Item (ItemID,ItemName,ItemPrice)
select ItemID,ItemName,ItemPrice from Item_MON
SET IDENTITY_INSERT dbo.Item OFF
GO
SET IDENTITY_INSERT dbo.Production ON
insert into dbo.Production (ProductionID,ClientID,SalesPersonID,AmountOfItemsSold,ItemID,DateOfSale)
select ProductionID,ClientID,SalesPersonID,AmountOfItemsSold,ItemID,DateOfSale from Production_MON
SET IDENTITY_INSERT dbo.Production OFF
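Keep in mind that the _MON staging tables hold all those rows in memory until they are cleaned up. You can check how much memory they consume via the In-Memory OLTP DMVs, and drop them once the copy is done (Production_MON first, because of the foreign keys):

```sql
-- Memory consumed per memory-optimized table in the current database
SELECT OBJECT_NAME(object_id) AS TableName,
       memory_used_by_table_kb,
       memory_used_by_indexes_kb
FROM sys.dm_db_xtp_table_memory_stats;

-- Free the memory after the data has been copied to the disk-based tables
DROP TABLE dbo.Production_MON;
DROP TABLE dbo.Item_MON;
DROP TABLE dbo.SalesPersons_MON;
DROP TABLE dbo.Clients_MON;
```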

 

While executing the last statement, we can see that the workload has changed: the disk is no longer getting hammered. That is to be expected, because now we are hammering our memory instead, with more CPU usage as a result!

[Screenshot: workload during the in-memory load]

After about 8 minutes I got a ping that my query had executed. 8 MINUTES!!!!
[Screenshot: execution time of the in-memory load]

 

So, doing exactly the same thing but loading the data into memory first gave me my result in 8 minutes and 48 seconds, while the other way (no in-memory tables and native procs) had only inserted 1 000 000 rows after 1 hour and 30 minutes. That is an excellent performance gain if you ask me! To conclude, let’s just say that memory-optimized tables and natively compiled stored procedures are fast, very fast, extremely fast! If you have a high insert rate and your table can’t keep up, I would suggest trying this scenario, because you might greatly benefit from it.

Thanks for reading and stay tuned!
