Index maintenance on a VLDB – how to tame the beast


Day-to-day maintenance of large data volumes can be difficult. In a previous post I talked about how to handle DBCC CHECKDB on a multi-terabyte database. Today I will tackle another issue I came across while dealing with multi-terabyte databases: index maintenance.

Like most of you, my absolute favorite solution when it comes to index maintenance is the Ola Hallengren solution. It is a well-written and solid maintenance solution, but I noticed that on VLDBs it can fail to get past the fragmentation-checking stage of an index. In order to tame the VLDB beast, I wrote my own script for handling index maintenance. This script can be found here.


How does the script work?

Maintenance solutions usually restart daily, even if not all indexes were reached during the maintenance window. To cope with this, we wrote a VLDB maintenance script that divides the full index maintenance workload over a few days by creating a few tables. These tables keep track of your index maintenance progress (checks and rebuilds/reorganizes), allowing you to check and rebuild all indexes. Only when all indexes have been checked will the maintenance cycle start over.
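Conceptually, the tracking works along these lines. This is a minimal sketch: IndexFragTable is one of the script's real tables, but the column names and the reset logic shown here are illustrative assumptions, not the script's actual schema.

```sql
-- Sketch only: column names (Checked, Fragmentation, ...) are hypothetical.
-- Pick the next unprocessed index from the tracking table.
SELECT TOP (1) SchemaName, TableName, IndexName, Fragmentation
FROM dbo.IndexFragTable
WHERE Checked = 0                -- not yet handled in this cycle
ORDER BY Fragmentation DESC;     -- worst fragmentation first

-- After a successful rebuild/reorganize, mark the index as done so the
-- next nightly run continues where this one left off.
UPDATE dbo.IndexFragTable
SET Checked = 1, LastMaintained = SYSDATETIME()
WHERE SchemaName = @SchemaName
  AND TableName  = @TableName
  AND IndexName  = @IndexName;

-- Only when every index has been checked does the cycle reset.
IF NOT EXISTS (SELECT 1 FROM dbo.IndexFragTable WHERE Checked = 0)
    UPDATE dbo.IndexFragTable SET Checked = 0;   -- start a new cycle
```

Because the progress lives in a table rather than in the session, a maintenance window that ends mid-cycle simply resumes the next night.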

Another issue arises when the transaction log grows larger than the index being rebuilt: the TLOG builds up and fills your disk, with all the problems that come with it. For this reason, we built a check into the script that skips the index if there is insufficient storage space available at that moment or in total. Instead, the index will be reported in the reporting table, allowing you to check which indexes are not getting rebuilt and try to get them rebuilt in another way.
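A check along these lines can be done with the sys.dm_db_log_space_usage DMV. This is a sketch: the headroom factor and the comparison against the index size are assumptions for illustration, not the script's exact logic.

```sql
-- Sketch: skip the rebuild when the log doesn't have room for it.
-- The 2x headroom factor below is an illustrative assumption.
DECLARE @IndexSizeMB bigint = 51200;   -- size of the index to rebuild (example value)

DECLARE @FreeLogMB bigint;
SELECT @FreeLogMB = (total_log_size_in_bytes - used_log_space_in_bytes) / 1048576
FROM sys.dm_db_log_space_usage;        -- log usage of the current database

IF @FreeLogMB < @IndexSizeMB * 2
BEGIN
    PRINT 'Skipping index: insufficient log space; logging to reporting table instead.';
    -- e.g. INSERT INTO dbo.IndexRebuildColumnStatLoggingTable (...) VALUES (...);
END
```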


The parameters

As with any script of this kind, I defined some parameters you have to fill in to create the index maintenance.

SET @ManagementDatabase = 'master' --> Database in which objects have to be created

This is the database you will use to create the objects needed for the script to work.

SET @Database = 'WideWorldImporters' --> Database which will be maintained

This is the database on which the maintenance has to be done.

SET @AmountOfMemory = 30 --> Percentage of memory to be used for Online Rebuild Operations (0-100) (Enterprise only)

This is the percentage of memory available to an online rebuild, ensuring a single rebuild does not consume all your memory.

SET @ColumnStatisticsMaintenance = 'Yes' -->Do you want to maintain Column Statistics? (Yes/No)

If you also want to maintain the column statistics, set this to Yes.

SET @MaxRunTime = 300 --> Amount of minutes the Maintenance can run each night (300 = 5hours)

The number of minutes your script can run each night; after this amount of time, no new statements will be launched.

SET @JobStartHour = 1 --> e.g. 1: the job will start at 01:00 (possible options: 1 through 24)

The start hour of the scheduled jobs created by the script.

SET @LockTimeout = 360 --> Set the locktimeout in seconds

The amount of time in seconds a lock can be held by the index rebuild/reorganize before the operation is cancelled.

SET @WithOnline = 'YES' -->Rebuild indexes with the ONLINE rebuild option (YES OR NO)

This gives you the choice whether or not to use online rebuilds (on Enterprise Edition I would suggest YES, but I have had one case where it had to be NO).
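Put together, a statement the script might emit for one index could look roughly like this. The index and table names are illustrative, and the exact rebuild options the script generates may differ.

```sql
-- Honor the @LockTimeout parameter: give up after 360 seconds waiting on a lock.
SET LOCK_TIMEOUT 360000;   -- value is in milliseconds

-- ONLINE = ON requires Enterprise Edition; with @WithOnline = 'NO' the
-- script would omit it. Index and table names below are illustrative.
ALTER INDEX IX_Orders_OrderDate
ON Sales.Orders
REBUILD WITH (ONLINE = ON, SORT_IN_TEMPDB = ON);
```

With the lock timeout in place, a rebuild that gets stuck behind a long-running transaction fails fast instead of blocking the rest of the maintenance window.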


Created objects


The ColumnStatistics table contains the workload for all Column Statistics. The script uses this table to define whether or not it should resample the statistics. The IndexFragTable contains all information the script needs to rebuild or reorganize your indexes and where it should continue since its last rebuild. The IndexRebuildColumnStatLoggingTable is the most important table for you because it contains all information on failed rebuilds, error messages and unchecked indexes.


Stored Procedures

The dbo.IndexStatsOptimizer is the main stored procedure, which will be executed to handle your index maintenance. The dbo.UpdateIndexStatsTable updates your table, adds new indexes, and resets after the cycle has been run.



Job

This job will execute the stored procedures. If you need to make an alteration to your schedule, this is the place to go. You can also change your parameters here by altering the dbo.IndexStatsOptimizer stored procedure.


Availability Groups

The script has been made AG-aware; the only thing you have to do is execute it on both nodes.


I made a mistake

If you made a mistake with your parameters, just re-execute the script: it will recreate all objects. If you made an error with the name of your database, you will have to manually remove all of the objects listed above.


I have had a lot of table changes in my database

The script checks daily for new tables and indexes and adds them to the list of indexes to be checked and rebuilt. It also looks for deleted tables and removes them from the list.
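The daily sync can be pictured as two set operations against the tracking table. The column names are again hypothetical; the catalog views are standard.

```sql
-- Add indexes that exist in the database but not yet in the tracking table.
INSERT INTO dbo.IndexFragTable (SchemaName, TableName, IndexName, Checked)
SELECT s.name, t.name, i.name, 0
FROM sys.indexes i
JOIN sys.tables  t ON t.object_id = i.object_id
JOIN sys.schemas s ON s.schema_id = t.schema_id
WHERE i.name IS NOT NULL   -- skip heaps, which have no index name
  AND NOT EXISTS (SELECT 1 FROM dbo.IndexFragTable f
                  WHERE f.SchemaName = s.name
                    AND f.TableName  = t.name
                    AND f.IndexName  = i.name);

-- Remove rows whose table or index no longer exists.
DELETE f
FROM dbo.IndexFragTable f
WHERE NOT EXISTS (SELECT 1
                  FROM sys.indexes i
                  JOIN sys.tables  t ON t.object_id = i.object_id
                  JOIN sys.schemas s ON s.schema_id = t.schema_id
                  WHERE s.name = f.SchemaName
                    AND t.name = f.TableName
                    AND i.name = f.IndexName);
```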

If you have any questions, feel free to ask. Thanks for reading and enjoy the script!
Stay tuned!
