The magic of Scalar-valued function inlining in SQL 2019

When one of my clients asked me for a reason to upgrade to SQL 2019, the first thing that came to mind is Scalar-valued function inlining. Reason? Well, for some reason developers like stuffing their SQL with SVFs, and that… makes… it… so… slow. Are SVFs really that bad? And is scalar-valued function inlining useful at all? Let’s walk through it together and find out!

First off, I’m using the opensource database StackOverflow2013. I’m going to start off with creating a SVF, which’ll just count the amount of votes in the Votes table for a specific UserId.

USE tempdb
CREATE OR ALTER FUNCTION dbo.svf_YesWeCan(@UserId INT)
RETURNS INT
WITH RETURNS NULL ON NULL INPUT
AS
BEGIN
DECLARE @Count INT
SELECT @Count = COUNT(*)
FROM dbo.Votes
WHERE UserId = @UserID 
RETURN @Count
END
GO

Make sure we’re using an older compat level, I’ll be using 130 for this example.

ALTER DATABASE StackOverflow2013
SET COMPATIBILITY_LEVEL = 130

And I also created this index to support my SVF.

CREATE INDEX NCI_UserId ON dbo.Votes( UserId )

And I’m adding a data filter, just so I don’t have to wait for too long.

SET STATISTICS TIME,IO ON
SELECT dbo.svf_YesWeCan(Id),*
FROM dbo.Users
WHERE CreationDate > '20130501' AND CreationDate < '20130601'
GO

If we look at the query plan afterwards:

(98144 rows affected)
Table ‘Users’. Scan count 1, logical reads 44492, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 4, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

(1 row affected)

SQL Server Execution Times:
CPU time = 6860 ms,  elapsed time = 7153 ms.

But…, that doesn’t even look that bad? So are scalar-valued functions not really an issue? Well, a careful eye would have noticed we’re missing something here. The SVF I made queries data from the Votes table, yet nowhere in the stats or in the plan we can see the votes. Let’s go over some downsides of Scalar valued functions.

  • The execution plan lies to you. You won’t find a full detail of what the SVF does, except for a Compute Scalar operator.
  • Everything has to be done row by row.
  • Your query is unable to go parallel.

So, in short, you’ll have a single core query, plowing through all your rows, and then, not admitting what it did in the plan. Now, how can this be fixed? Currently one of the most used fixes for this (if you can change the code) is to turn the SVF into an inlined table valued function, like this:

CREATE OR ALTER FUNCTION dbo.TVF_YesWeCan
(
@UserId INT
)
RETURNS TABLE
AS
RETURN
(
SELECT COUNT(*) Total
FROM dbo.Votes
WHERE UserId = @UserID
)GO

And then, we can use it in our query:

SELECT TVF.Total,*
FROM dbo.Users
CROSS APPLY dbo.TVF_YesWeCan(Id) TVF
WHERE CreationDate > '20130501'

(98144 rows affected)

Table ‘Votes’. Scan count 9, logical reads 92352, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

Table ‘Users’. Scan count 9, logical reads 44994, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 4, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

(1 row affected)

SQL Server Execution Times:
CPU time = 15175 ms,  elapsed time = 3545 ms.

While this plan looks more complicated, it’s a lot more transparent to us. We can also see that the query went parallel, which is good for this query because it allows it to go much faster.

“But I’m not allowed to change any code in my current environment” Fair point! This is where scalar-valued function inlining comes in. We alter the compat level to SQL 2019.

ALTER DATABASE StackOverflow2013
SET COMPATIBILITY_LEVEL = 150

We run the same query as before with the SVF:

SET STATISTICS TIME,IO ON
SELECT dbo.svf_YesWeCan(Id),*
FROM dbo.Users
WHERE CreationDate > '20130501' AND CreationDate < '20130601'
GO

(98144 rows affected)

Table ‘Votes’. Scan count 98144, logical reads 294690, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

Table ‘Users’. Scan count 1, logical reads 44492, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 4, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

(1 row affected)

SQL Server Execution Times:
CPU time = 718 ms,  elapsed time = 1496 ms.

What kind of magic is this? While the plan looks more complicated, the SQL optimizer knows what to do and creates the best ‘good-enough-plan’ for execution!

If we compare the different execution times for the queries:

 

SVF compat 1307153 ms
iTVF3545 ms
SVF compat 1501496 ms

 

“Does this mean I can use SVF everywhere now without my DBAs complaining?” Probably not. The feature isn’t perfect and it definitely has some limitations, but that’s a bit too much for this post. However, I’m a big fan of trialing your current workload on a SQL2019, and perhaps this magic will do wonders for you as well!