
Natively Compiled Stored Procs and In-Memory Tables: NextGen loading!

Today I was creating a dummy database and wanted to fill it with a certain amount of data. Since I was going to use this database for demo purposes, I created the tables with primary keys, foreign keys and some indexes. The way I was loading the data turned out to be quite slow, so I decided to load it using natively compiled stored procedures and in-memory tables instead, since this should be the fastest way. The result really baffled me!

Some background information: I’m using a Dell Inspiron 5220 with 16 GB of RAM. My virtual machine has access to all cores but is capped at 90%. It has 4096 MB of memory, of which 3584 MB is allocated to SQL Server. The VM is a Server Core installation with SQL Server 2016, running on SSDs.
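As a side note: if you want to check the memory cap on your own instance before reproducing this, a quick catalog query does the trick (this check is my addition, not part of the original setup):

-- Show the configured memory cap for the instance
SELECT name, value_in_use
FROM sys.configurations
WHERE name = N'max server memory (MB)'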

In the first scenario I loaded my data into SQL Server without using in-memory tables or natively compiled stored procedures. A lot of code will follow, but this way you can test it yourself. I used the following script.

-- CREATE DATABASE PART
USE master
GO
-- Kick everyone out and drop any old copy of the database first
IF DB_ID('RealtimeAnalytics') IS NOT NULL
ALTER DATABASE RealtimeAnalytics SET SINGLE_USER WITH ROLLBACK IMMEDIATE
GO
DROP DATABASE IF EXISTS RealtimeAnalytics
GO
CREATE DATABASE [RealtimeAnalytics]
CONTAINMENT = NONE
ON  PRIMARY
( NAME = N'RealtimeAnalytics_1', FILENAME = N'G:\DATA\RealtimeAnalytics_1.mdf' , SIZE = 131072KB , FILEGROWTH = 131072KB ),
( NAME = N'RealtimeAnalytics_2', FILENAME = N'G:\DATA\RealtimeAnalytics_2.ndf' , SIZE = 131072KB , FILEGROWTH = 131072KB ),
( NAME = N'RealtimeAnalytics_3', FILENAME = N'G:\DATA\RealtimeAnalytics_3.ndf' , SIZE = 131072KB , FILEGROWTH = 131072KB ),
( NAME = N'RealtimeAnalytics_4', FILENAME = N'G:\DATA\RealtimeAnalytics_4.ndf' , SIZE = 131072KB , FILEGROWTH = 131072KB )
LOG ON
( NAME = N'RealtimeAnalytics_log', FILENAME = N'C:\Data\LOG\RealtimeAnalytics_log.ldf' , SIZE = 131072KB , FILEGROWTH = 131072KB )
GO

This creates a database called RealtimeAnalytics with four data files. Next, I add four tables with foreign keys between them.


-- CREATE TABLE PART
USE RealtimeAnalytics
GO
CREATE TABLE dbo.Clients
(
ClientID int IDENTITY(-2147483648,1) PRIMARY KEY NOT NULL,
ClientName varchar(200),
ClientCreditcard varchar(200)
)
GO
CREATE TABLE dbo.SalesPersons
(
SalesPersonID int IDENTITY(-2147483648,1) PRIMARY KEY NOT NULL,
SalesPersonName varchar(300),
SalesPersonDiscount int
)
GO
CREATE TABLE dbo.Item
(
ItemID int IDENTITY(-2147483648,1) PRIMARY KEY NOT NULL,
ItemName varchar(300),
ItemPrice numeric(15,2)
)
GO
CREATE TABLE dbo.Production
(
ProductionID int IDENTITY(-2147483648,1) PRIMARY KEY NOT NULL,
ClientID int NOT NULL FOREIGN KEY REFERENCES dbo.Clients(ClientID),
SalesPersonID int NOT NULL FOREIGN KEY REFERENCES dbo.SalesPersons(SalesPersonID),
AmountOfItemsSold int,
ItemID int NOT NULL FOREIGN KEY REFERENCES dbo.Item(ItemID),
DateOfSale datetime NOT NULL DEFAULT(GETDATE())
)
GO

After creating the tables, I insert data into them in a while loop. It’s not the best solution but, hey, I will do exactly the same thing with the natively compiled procedure, so the comparison stays fair.

 

---- FILL THE TABLES WITH DATA
DECLARE @Count int = 1
--FILL CLIENTS
WHILE @Count <= 1000
BEGIN
INSERT INTO dbo.Clients
select 'Name' + cast(@Count as varchar), '1111-1111-111'+cast(@Count as varchar)
SET @Count = @Count +1
END
SET @Count = 1
--FILL SalesPersons
WHILE @Count <= 10
BEGIN
INSERT INTO dbo.SalesPersons
select 'Name' + cast(@Count as varchar), FLOOR(RAND()*(50-1)+1)
SET @Count = @Count +1
END
SET @Count = 1
--FILL Items
WHILE @Count <= 800
BEGIN
INSERT INTO dbo.Item
select 'Item' + cast(@Count as varchar),cast(RAND()*(200000-1) as numeric(15,2))
SET @Count = @Count +1
END
SET @Count = 1
DECLARE @ClientID int
DECLARE @SalesPersonID int
DECLARE @ItemID int
--FILL Production
WHILE @Count <= 2000000
BEGIN
SET @ClientID = (select ClientID from dbo.Clients where ClientName = 'Name'+CAST(FLOOR(RAND()*(1000-1)+1)AS VARCHAR))
SET @SalesPersonID = (select SalesPersonID from dbo.SalesPersons where SalesPersonName = 'Name'+CAST(FLOOR(RAND()*(10-1)+1)AS VARCHAR))
SET @ItemID = (select ItemID from dbo.Item where ItemName = 'Item'+CAST(FLOOR(RAND()*(800-1)+1)as varchar))

INSERT INTO dbo.Production
select @ClientID,@SalesPersonID,FLOOR(RAND()*(100-1)+1),@ItemID,dateadd(second, -floor(rand()*(604800-1)+1),getdate())
SET @Count = @Count +1
END

When we run this script, the load lands almost entirely on the C drive, where the log file lives: the SSD is getting hammered while CPU usage stays quite modest. That makes sense, since every single-row INSERT commits its own transaction and thus forces a log flush per row.

I decided to stop the script after about an hour and a half, because this was taking way too long.
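As an aside, and not something I tested for this post: because each row commits separately, batching the commits usually speeds up a disk-based load considerably. A rough sketch of that idea, reusing the Production loop from above:

-- Hypothetical variant of the Production loop: commit every 10 000 rows instead of per row
DECLARE @Count int = 1
DECLARE @ClientID int, @SalesPersonID int, @ItemID int
BEGIN TRAN
WHILE @Count <= 2000000
BEGIN
SET @ClientID = (select ClientID from dbo.Clients where ClientName = 'Name'+CAST(FLOOR(RAND()*(1000-1)+1) AS varchar))
SET @SalesPersonID = (select SalesPersonID from dbo.SalesPersons where SalesPersonName = 'Name'+CAST(FLOOR(RAND()*(10-1)+1) AS varchar))
SET @ItemID = (select ItemID from dbo.Item where ItemName = 'Item'+CAST(FLOOR(RAND()*(800-1)+1) AS varchar))
INSERT INTO dbo.Production
select @ClientID,@SalesPersonID,FLOOR(RAND()*(100-1)+1),@ItemID,dateadd(second, -floor(rand()*(604800-1)+1),getdate())
IF @Count % 10000 = 0
BEGIN
COMMIT TRAN -- one log flush per batch instead of per row
BEGIN TRAN
END
SET @Count = @Count + 1
END
COMMIT TRAN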


I then checked how many records had actually been inserted into the Production table.
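The check itself is nothing more than a row count:

-- How many rows made it into Production before the script was stopped?
SELECT COUNT(*) AS RowsInserted FROM dbo.Production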

In all that time, only about 1,000,000 rows had been inserted, which is unacceptable. So I thought: let’s use those improved in-memory tables and natively compiled stored procedures and see what the difference is. I edited the script with the following code.

To start, I enabled the database for memory-optimized data by adding a memory-optimized filegroup and a container:

use master  
go  
alter database RealtimeAnalytics add filegroup RealtimeAnalytics_mod contains memory_optimized_data  
go  
-- adapt filename as needed  
alter database RealtimeAnalytics add file (name='RealtimeAnalytics_mod', filename='G:\data\RealtimeAnalytics_mod') to filegroup RealtimeAnalytics_mod  
go
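If you want to verify that the filegroup and container were created correctly (a sanity check I’m adding here, not part of the original script), the catalog views will show them:

-- List the filegroups and files of the database
USE RealtimeAnalytics
GO
SELECT fg.name AS filegroup_name, fg.type_desc, f.name AS file_name, f.physical_name
FROM sys.filegroups AS fg
LEFT JOIN sys.database_files AS f ON f.data_space_id = fg.data_space_id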

I then created the same table structure, but as memory-optimized tables.

--MEMORY OPTIMIZED
use RealtimeAnalytics
go
CREATE TABLE dbo.Clients_MON
(
ClientID int IDENTITY(1,1) PRIMARY KEY NONCLUSTERED NOT NULL,
ClientName varchar(200),
ClientCreditcard varchar(200)
)
with (memory_optimized=on)
GO

CREATE TABLE dbo.SalesPersons_MON
(
SalesPersonID int IDENTITY(1,1) PRIMARY KEY NONCLUSTERED NOT NULL,
SalesPersonName varchar(300),
SalesPersonDiscount int
)
with (memory_optimized=on)
GO

CREATE TABLE dbo.Item_MON
(
ItemID int IDENTITY(1,1) PRIMARY KEY NONCLUSTERED NOT NULL,
ItemName varchar(300),
ItemPrice numeric(15,2)
)
with (memory_optimized=on)
GO

CREATE TABLE dbo.Production_MON
(
ProductionID int IDENTITY(1,1) PRIMARY KEY NONCLUSTERED NOT NULL,
ClientID int NOT NULL FOREIGN KEY REFERENCES dbo.Clients_MON(ClientID),
SalesPersonID int NOT NULL FOREIGN KEY REFERENCES dbo.SalesPersons_MON(SalesPersonID),
AmountOfItemsSold int,
ItemID int NOT NULL FOREIGN KEY REFERENCES dbo.Item_MON(ItemID),
DateOfSale datetime NOT NULL DEFAULT(GETDATE())
)
with (memory_optimized=on)
GO

Notice that I can’t seed my identity at -2147483648: memory-optimized tables only support IDENTITY(1,1) (more info on that here). Why would they do that? But that discussion is out of scope, so let’s continue with the stored procedure I created. It does exactly the same thing as the previous test, only now as a natively compiled stored procedure.
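To make that restriction concrete, the following definition (hypothetical Seed_Test_MON table, my example) is rejected, while the same table with IDENTITY(1,1) works fine:

-- Fails: memory-optimized tables only accept IDENTITY(1,1)
CREATE TABLE dbo.Seed_Test_MON
(
ID int IDENTITY(-2147483648,1) PRIMARY KEY NONCLUSTERED NOT NULL
)
with (memory_optimized=on)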

USE RealtimeAnalytics
GO

create procedure dbo.FillDatabase  
with native_compilation, schemabinding, execute as owner  
as  
begin atomic  
with (transaction isolation level=snapshot, language=N'us_english')  
  
DECLARE @Count int = 1

--FILL CLIENTS
WHILE @Count <= 1000
BEGIN
INSERT INTO dbo.Clients_MON
select 'Name' + cast(@Count as varchar), '1111-1111-111'+cast(@Count as varchar)
SET @Count = @Count +1
END

SET @Count = 1
--FILL SalesPersons
WHILE @Count <= 10
BEGIN
INSERT INTO dbo.SalesPersons_MON
select 'Name' + cast(@Count as varchar), FLOOR(RAND()*(50-1)+1)
SET @Count = @Count +1
END

SET @Count = 1
--FILL Items
WHILE @Count <= 800
BEGIN
INSERT INTO dbo.Item_MON
select 'Item' + cast(@Count as varchar),cast(RAND()*(200000-1) as numeric(15,2))
SET @Count = @Count +1
END

SET @Count = 1

DECLARE @ClientID int
DECLARE @SalesPersonID int
DECLARE @ItemID int
--FILL Production
WHILE @Count <= 2000000
BEGIN
SET @ClientID = (select ClientID from dbo.Clients_MON where ClientName = 'Name'+CAST(FLOOR(RAND()*(1000-1)+1)AS VARCHAR))
SET @SalesPersonID = (select SalesPersonID from dbo.SalesPersons_MON where SalesPersonName = 'Name'+CAST(FLOOR(RAND()*(10-1)+1)AS VARCHAR))
SET @ItemID = (select ItemID from dbo.Item_MON where ItemName = 'Item'+CAST(FLOOR(RAND()*(800-1)+1)as varchar))

INSERT INTO dbo.Production_MON
select @ClientID,@SalesPersonID,FLOOR(RAND()*(100-1)+1),@ItemID,dateadd(second, -floor(rand()*(604800-1)+1),getdate())
SET @Count = @Count +1
END
END
GO


After this, I used the following statements to load the data into the in-memory tables and then move it over to my regular disk-based tables with INSERT ... SELECT, using IDENTITY_INSERT to preserve the generated IDs:

EXEC dbo.FillDatabase
GO
SET IDENTITY_INSERT dbo.Clients ON
insert into dbo.Clients (ClientID, ClientName, ClientCreditcard)
select ClientID, ClientName, ClientCreditcard from dbo.Clients_MON
SET IDENTITY_INSERT dbo.Clients OFF
GO
SET IDENTITY_INSERT dbo.SalesPersons ON
insert into dbo.SalesPersons (SalesPersonID, SalesPersonName, SalesPersonDiscount)
select SalesPersonID, SalesPersonName, SalesPersonDiscount from dbo.SalesPersons_MON
SET IDENTITY_INSERT dbo.SalesPersons OFF
GO
SET IDENTITY_INSERT dbo.Item ON
insert into dbo.Item (ItemID, ItemName, ItemPrice)
select ItemID, ItemName, ItemPrice from dbo.Item_MON
SET IDENTITY_INSERT dbo.Item OFF
GO
SET IDENTITY_INSERT dbo.Production ON
insert into dbo.Production (ProductionID, ClientID, SalesPersonID, AmountOfItemsSold, ItemID, DateOfSale)
select ProductionID, ClientID, SalesPersonID, AmountOfItemsSold, ItemID, DateOfSale from dbo.Production_MON
SET IDENTITY_INSERT dbo.Production OFF
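Once the copy has finished, the _MON tables still keep all those rows in memory. If you no longer need them, you can reclaim that memory by dropping the staging objects. This cleanup is my suggestion, not part of the original script; note that the natively compiled procedure is schemabound to the tables, so it has to go first:

-- Optional cleanup: the proc is schemabound to the _MON tables, so drop it first
DROP PROCEDURE dbo.FillDatabase
GO
-- Production_MON has foreign keys to the other _MON tables, so it goes next
DROP TABLE dbo.Production_MON
DROP TABLE dbo.Clients_MON
DROP TABLE dbo.SalesPersons_MON
DROP TABLE dbo.Item_MON
GO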


While executing the last statement, the workload looks completely different: the disk is no longer getting hammered. That is to be expected, because now we are hammering memory instead, and CPU usage rises accordingly.


After about 8 minutes I got a ping that my query had executed. 8 MINUTES!!!!


So, for doing exactly the same thing but loading the data into memory first, the result was in my disk-based tables after 8 minutes and 48 seconds, whereas the classic approach (no in-memory tables and no natively compiled procedures) had only managed about 1,000,000 rows after an hour and a half. That is an excellent performance gain if you ask me! To conclude, let’s just say that memory-optimized tables and natively compiled stored procedures are fast, very fast, extremely fast! If you have a high insert rate and your table can’t keep up, I suggest giving this scenario a try, because you may well benefit greatly from it.
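If you want to time a run yourself, a simple wall-clock wrapper around the procedure call is enough (my addition, not part of the original test):

-- Rough wall-clock timing of the native load
DECLARE @Start datetime2 = SYSDATETIME()
EXEC dbo.FillDatabase
SELECT DATEDIFF(SECOND, @Start, SYSDATETIME()) AS DurationInSeconds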

Thanks for reading and stay tuned!
