This post might sound a bit like a game of Cluedo, but it happened on a production SQL Server…
But now you want to prove it
This post might sound a bit like a game of Cluedo, but it happened on a production SQL Server…
The issue showed itself on a hosted production VM with 64GB of Ram, where SQL could allocate memory from 32GB to 55GB. The agreements with the hosting partner clearly mentioned that while the CPU could be overcommitted, memory couldn’t.
All my alerts suddenly started to show that the machine was using all the available and became completely unresponsive (I couldn’t even open the Event Viewer to unload the logs).
Remote connection with management studio gave me this worrisome error:
But ping responses proved to be ok (less than 3 ms) so I didn’t suspect memory, page life expectancy gave a hint that something was certainly not OK
The Microsoft Performance dashboard showed worrying sings, so I RDP’d onto the server, and saw the horror of any SQL DBA this SQL Server was enduring Swapping Hell… whilst it should never even have to use its swap file… Ok something or someone murdered my SQL Server… let’s start the CSI investigation…
In the Recource Monitor we see that SQL is “only” getting ±20GB and is struggling to run. Swap file is overloaded and the server is running like a snail with arthritis.
So I suspected VMWare ballooning was the cruel pit here, but as this was especially agreed upon that this machine wouldn’t use memory ballooning I had to prove that it was happening Unfortunately as this is a hosted environment I could not check the ESX console to see the ballooning parameters, so I had to look for another way to find the smoking gun.
Luckily I found good documentation on the VMWare site explaining that there is a command file in the VMWare tools folder that just could to the trick, so I tried this on our murder(ed) victim and Hurray! It gave me the proof I needed
So the murder weapon was known… VM Ware’s ballooning was to blame…
Cluedo summary
– Murder victim: Production Server
– Killed in VMWare ESX environment
– By the VM Ware ballooning service
Now I can go to the hosting partner and ask them to fix this, could become interesting because in the contracts, we especially agreed on not turning on ballooning on production SQL Servers ;-)
On a side note, normally VMWare should have released memory when the (SQL) Server started to ask more resources, but somehow it didn’t, so I also send them the following link to a known issue where the Balloon driver does not releases its memory when using default parameters:
Balloon driver retains hold on memory causing virtual machine guest operating system performance issues (1003470)
I’m interested how this is going to go on ;-)
On the other hand, the VMWareToolBoxCMD stat seems to be extremely useful, in many situations, so I pasted the most interesting ones from the complete list here:
© 2023 Kohera
Crafted by
© 2022 Kohera
Crafted by
Cookie | Duration | Description |
---|---|---|
ARRAffinity | session | ARRAffinity cookie is set by Azure app service, and allows the service to choose the right instance established by a user to deliver subsequent requests made by that user. |
ARRAffinitySameSite | session | This cookie is set by Windows Azure cloud, and is used for load balancing to make sure the visitor page requests are routed to the same server in any browsing session. |
cookielawinfo-checkbox-advertisement | 1 year | Set by the GDPR Cookie Consent plugin, this cookie records the user consent for the cookies in the "Advertisement" category. |
cookielawinfo-checkbox-analytics | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics". |
cookielawinfo-checkbox-functional | 11 months | The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional". |
cookielawinfo-checkbox-necessary | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary". |
cookielawinfo-checkbox-others | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other. |
cookielawinfo-checkbox-performance | 11 months | This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance". |
CookieLawInfoConsent | 1 year | CookieYes sets this cookie to record the default button state of the corresponding category and the status of CCPA. It works only in coordination with the primary cookie. |
elementor | never | The website's WordPress theme uses this cookie. It allows the website owner to implement or change the website's content in real-time. |
viewed_cookie_policy | 11 months | The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data. |
Cookie | Duration | Description |
---|---|---|
__cf_bm | 30 minutes | Cloudflare set the cookie to support Cloudflare Bot Management. |
pll_language | 1 year | Polylang sets this cookie to remember the language the user selects when returning to the website and get the language information when unavailable in another way. |
Cookie | Duration | Description |
---|---|---|
_ga | 1 year 1 month 4 days | Google Analytics sets this cookie to calculate visitor, session and campaign data and track site usage for the site's analytics report. The cookie stores information anonymously and assigns a randomly generated number to recognise unique visitors. |
_ga_* | 1 year 1 month 4 days | Google Analytics sets this cookie to store and count page views. |
_gat_gtag_UA_* | 1 minute | Google Analytics sets this cookie to store a unique user ID. |
_gid | 1 day | Google Analytics sets this cookie to store information on how visitors use a website while also creating an analytics report of the website's performance. Some of the collected data includes the number of visitors, their source, and the pages they visit anonymously. |
ai_session | 30 minutes | This is a unique anonymous session identifier cookie set by Microsoft Application Insights software to gather statistical usage and telemetry data for apps built on the Azure cloud platform. |
CONSENT | 2 years | YouTube sets this cookie via embedded YouTube videos and registers anonymous statistical data. |
vuid | 1 year 1 month 4 days | Vimeo installs this cookie to collect tracking information by setting a unique ID to embed videos on the website. |
Cookie | Duration | Description |
---|---|---|
ai_user | 1 year | Microsoft Azure sets this cookie as a unique user identifier cookie, enabling counting of the number of users accessing the application over time. |
VISITOR_INFO1_LIVE | 5 months 27 days | YouTube sets this cookie to measure bandwidth, determining whether the user gets the new or old player interface. |
YSC | session | Youtube sets this cookie to track the views of embedded videos on Youtube pages. |
yt-remote-connected-devices | never | YouTube sets this cookie to store the user's video preferences using embedded YouTube videos. |
yt-remote-device-id | never | YouTube sets this cookie to store the user's video preferences using embedded YouTube videos. |
yt.innertube::nextId | never | YouTube sets this cookie to register a unique ID to store data on what videos from YouTube the user has seen. |
yt.innertube::requests | never | YouTube sets this cookie to register a unique ID to store data on what videos from YouTube the user has seen. |
Cookie | Duration | Description |
---|---|---|
WFESessionId | session | No description available. |