What can go wrong when you combine memory-optimized tempdb with the resource governor?
If you are using SQL Server 2019 you’ll see that one of the new features allows you to convert some tempdb system tables in in-memory OLTP. If your system is overloaded with a busy tempdb, you can reduce contention by eliminating locking on the system tables. Just enable the feature by running a single command and restarting the instance:
ALTER SERVER CONFIGURATION SET MEMORY_OPTIMIZED TEMPDB_METADATA = ON;
And all your tempdb contention problems magically go away. Microsoft warns you about some minor limitations of this feature, but most of us won’t have any problems turning this on. (Read: I’m not one of those people.)
Conventional tables are stored as pages in the buffer pool. Whenever there is memory pressure, the storage engine will evict the oldest pages to make some room. Your tables may be cached in memory in full, partly or kicked out of your memory completely. When you need a page that hasn’t been cached in the buffer pool, the storage engine will read it back into memory from the disk. As all of this happens in the background, for all intents and purposes, the only size limit you have is how much disk space you can provide.
In-memory OLTP works differently. Memory optimized tables need to be stored in memory in their entirety and can’t be evicted from memory. SQL also doesn’t allow the operating system to start paging memory out to the system swap file for memory optimized tables. This means the maximum size of these tables is limited. When all free memory is exhausted, any operations that require more memory will fail.
You can restrict the buffer pool maximum size by setting the maximum server memory option. Memory optimized tables, however, use the resource governor to restrict their size. Unless you explicitly bind a database to a different resource pool, the required memory will be taken from the default resource pool. This means that, if you use the resource governor to restrict the total amount of memory allocated to a pool, you are not only limiting memory grants for queries but also how much memory optimized data you can store in that pool.
When you enable the in-memory tempdb metadata feature, you effectively restrict the size of tempdb metadata. Behind the scenes this metadata, which used to be conventional tables, have now been turned into memory optimized tables. Except Microsoft doesn’t mention anywhere in their documentation under which resource pool this metadata is placed.
On a new SQL instance, you get an internal pool which you can’t change and a default pool. Unless you explicitly tell SQL to bind sessions to another pool, all queries end up running in the default pool. The same goes for databases that you haven’t bound to another pool; any In-memory OLTP data gets placed on the default pool.
For my example, I created a SQL instance with a resource pool named bucket to which I wanted to bind a copy of the StackOverflow database. Microsoft provides documentation on how to explicitly bind a database to a pool:
EXEC sp_xtp_bind_db_resource_pool 'StackOverflow', 'bucket';
A binding has been created. Take database 'StackOverflow' offline and then bring it back online to begin using resource pool 'bucket'.
I tried binding tempdb to a user-created pool, but unfortunately SQL doesn’t allow this for system databases:
EXEC sp_xtp_bind_db_resource_pool 'tempdb', 'bucket';
Msg 41371, Level 16, State 1, Procedure sys.sp_xtp_bind_db_resource_pool_internal, Line 1 [Batch Start Line 0]
Binding to a resource pool is not supported for system database 'tempdb'. This operation can only be performed on a user database.
So which resource pool is tempdb using?
SELECT a.name, a.database_id, a.resource_pool_id, b.name as pool_name FROM sys.databases a left join sys.dm_resource_governor_resource_pools b on a.resource_pool_id = b.pool_id
Databases that don’t have an explicit bind to a resource pool show up as NULL. According to Microsoft’s documentation, they get bound to the default pool. But does this also apply to system databases, or do these get bound to the internal pool?
There is no DMV that makes an easy overview of the memory usage per resource pool, which is then broken down per database. As system objects aren’t counted in some DMVs, finding this is even harder. If you really want to get to the bottom of this rabbit hole, you need to look at memory addresses to figure out which XTP objects belong to which memory broker.
Since we’re on an idle test system and the only database using memory optimized tables is tempdb, we can look at the total XTP memory usage per pool, XTP usage per database and to see if we can find any correlation.
First, we need to look at the memory usage per pool and filter for XTP objects:
SELECT b.name as pool_name, a.pool_id, a.allocations_kb FROM sys.dm_os_memory_brokers a join sys.dm_resource_governor_resource_pools b on a.pool_id = b.pool_id WHERE a.memory_broker_type = 'MEMORYBROKER_FOR_XTP'
Now, we need to find the XTP memory usage per database. The DMV for this is per database, so we need to run this query for each database:
select sum(allocated_bytes)/1024 as sum_allocated_kb, sum(used_bytes)/1024 as sum_used_kb from sys.dm_db_xtp_memory_consumers;'
For every database this returns NULL, except on tempdb:
Jackpot! 87MB allocated for in-memory OLTP on tempdb and I have 95MB allocated for in-memory OLTP on the default pool.
I logged on to a customer’s system of which I suspected it was having issues because of this. They pretty much have the same setup as my test system: SQL2019 CU4, In-memory tempdb metadata is enabled on their SQL instance and there is no other database using in-memory OLTP. I ran the same two queries to confirm my suspicions.
That’s 56 gigs worth of tempdb metadata!
On this instance, the resource governor is used to differentiate between high and low priority queries and any unclassified logins have their queries sent to the default pool. Given that their database has a mixed OLTP and batch processing workload, it was necessary to separate these queries out to different resource groups and limit the amount of memory for the pools to prevent buffer pool trashing. They ran out of memory in the default pool and the log file was getting spammed with errors:
Another observation on this system is that these errors were occurring on SPIDs that were using the other pools. Queries were failing across all pools, So it wasn’t contained to just the default pool. When the engine is unable to store any more tempdb metadata, tempdb is effectively locked up, which affects everything on the SQL instance.
This created a perfect storm for them:
Luckily, the instance didn’t crash and managed to recover on its own after a minute.
I advised them to add an extra resource pool for unclassified queries and modify the resource governor classifier function so that it would send all unknown logins to this new pool. This was to ensure that queries with large memory grants don’t trigger another OOM condition. Since the default resource pool won’t be used by user sessions anymore, they can then increase the memory limit again on the default pool without trashing their buffer pool.
Except for metadata, tempdb was virtually empty and this memory usage has been creeping up steadily over the past few weeks without going back down. So, I urged them to open a case with Microsoft support as this looks like a memory leak.
I’m still going to use the new in-memory tempdb metadata feature, but be sure to pay special attention to setups where you combine this with the resource governor. As a DBA, just imagine how your server would react if queries with huge memory grants start coming in. I’d also advise to have monitoring tools in place to catch issues creeping up like this, before they become a real problem. In a perfect world, developers write perfect queries and database engines don’t have bugs. But, it’s not a perfect world…
In-memory tempdb metadata stores its data in the default resource pool. If you run out of memory on the default pool, tempdb locks up.
Cookie | Duration | Description |
---|---|---|
ARRAffinity | session | ARRAffinity cookie is set by Azure app service, and allows the service to choose the right instance established by a user to deliver subsequent requests made by that user. |
ARRAffinitySameSite | session | This cookie is set by Windows Azure cloud, and is used for load balancing to make sure the visitor page requests are routed to the same server in any browsing session. |
cookielawinfo-checkbox-advertisement | 1 year | Set by the GDPR Cookie Consent plugin, this cookie records the user consent for the cookies in the "Advertisement" category. |
cookielawinfo-checkbox-analytics | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics". |
cookielawinfo-checkbox-functional | 11 months | The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional". |
cookielawinfo-checkbox-necessary | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary". |
cookielawinfo-checkbox-others | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other. |
cookielawinfo-checkbox-performance | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance". |
CookieLawInfoConsent | 1 year | CookieYes sets this cookie to record the default button state of the corresponding category and the status of CCPA. It works only in coordination with the primary cookie. |
elementor | never | The website's WordPress theme uses this cookie. It allows the website owner to implement or change the website's content in real-time. |
viewed_cookie_policy | 11 months | The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data. |
Cookie | Duration | Description |
---|---|---|
__cf_bm | 30 minutes | Cloudflare set the cookie to support Cloudflare Bot Management. |
pll_language | 1 year | Polylang sets this cookie to remember the language the user selects when returning to the website and get the language information when unavailable in another way. |
Cookie | Duration | Description |
---|---|---|
_ga | 1 year 1 month 4 days | Google Analytics sets this cookie to calculate visitor, session and campaign data and track site usage for the site's analytics report. The cookie stores information anonymously and assigns a randomly generated number to recognise unique visitors. |
_ga_* | 1 year 1 month 4 days | Google Analytics sets this cookie to store and count page views. |
_gat_gtag_UA_* | 1 minute | Google Analytics sets this cookie to store a unique user ID. |
_gid | 1 day | Google Analytics sets this cookie to store information on how visitors use a website while also creating an analytics report of the website's performance. Some of the collected data includes the number of visitors, their source, and the pages they visit anonymously. |
ai_session | 30 minutes | This is a unique anonymous session identifier cookie set by Microsoft Application Insights software to gather statistical usage and telemetry data for apps built on the Azure cloud platform. |
CONSENT | 2 years | YouTube sets this cookie via embedded YouTube videos and registers anonymous statistical data. |
vuid | 1 year 1 month 4 days | Vimeo installs this cookie to collect tracking information by setting a unique ID to embed videos on the website. |
Cookie | Duration | Description |
---|---|---|
ai_user | 1 year | Microsoft Azure sets this cookie as a unique user identifier cookie, enabling counting of the number of users accessing the application over time. |
VISITOR_INFO1_LIVE | 5 months 27 days | YouTube sets this cookie to measure bandwidth, determining whether the user gets the new or old player interface. |
YSC | session | Youtube sets this cookie to track the views of embedded videos on Youtube pages. |
yt-remote-connected-devices | never | YouTube sets this cookie to store the user's video preferences using embedded YouTube videos. |
yt-remote-device-id | never | YouTube sets this cookie to store the user's video preferences using embedded YouTube videos. |
yt.innertube::nextId | never | YouTube sets this cookie to register a unique ID to store data on what videos from YouTube the user has seen. |
yt.innertube::requests | never | YouTube sets this cookie to register a unique ID to store data on what videos from YouTube the user has seen. |
Cookie | Duration | Description |
---|---|---|
WFESessionId | session | No description available. |