Two Code Patterns That Don’t Virtualize Well

Last Updated 6 years ago

Virtualization used to be a really Big Deal™ for database admins: we had to do a lot of careful planning to get a virtualization project done right. These days, virtualization is more and more of a no-brainer: most apps make the transition just fine. Every now and then, though, an exception pops up – usually after the project has already gone live and failed.

Most user-defined functions don’t virtualize well.

User-defined functions (UDFs) accept parameters and output a value or a table. Here’s an example of a scalar function that calculates how many badges a user has earned in the Stack Overflow database:

CREATE FUNCTION dbo.ScalarFunction ( @uid INT )
RETURNS BIGINT
    WITH RETURNS NULL ON NULL INPUT,
         SCHEMABINDING
AS
    BEGIN
        DECLARE @BCount BIGINT;
        SELECT  @BCount = COUNT_BIG(*)
        FROM    dbo.Badges AS b
        WHERE   b.UserId = @uid
        GROUP BY b.UserId;
        RETURN @BCount;
    END;
GO

And here’s an example of a query that calls that function:

SELECT TOP 1000 
u.DisplayName, 
dbo.ScalarFunction(u.Id)
FROM dbo.Users AS u;

User-defined functions are really common because good developers are taught to package their code for easy reusability. Put it in a function, and then call that function from everywhere.

However, most functions have a dark secret: queries that call them are single-threaded. (This starts to get a little better with some types of functions in SQL Server 2017.)

That means that for CPU-bound queries with scalar functions, single-core CPU speed is incredibly important. If a long-running query can only use one CPU core, and that core is suddenly 25% slower, then your query is suddenly 25% slower.
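If you're allowed to touch the code, one common way around the single-threaded restriction is to rewrite the scalar function as an inline table-valued function, which SQL Server expands into the calling query much like a view. Here's a minimal sketch against the same tables. The name dbo.BadgeCountInline is made up for this example, and an inline function only makes the query eligible for parallelism, so compare the plans and results on your own system before trusting it.

```sql
-- Hypothetical inline rewrite of dbo.ScalarFunction (name invented for this sketch).
CREATE FUNCTION dbo.BadgeCountInline ( @uid INT )
RETURNS TABLE
    WITH SCHEMABINDING
AS
RETURN
    SELECT  COUNT_BIG(*) AS BadgeCount
    FROM    dbo.Badges AS b
    WHERE   b.UserId = @uid
    GROUP BY b.UserId;   -- no matching rows means an empty result, not a zero
GO

-- OUTER APPLY keeps users with no badges and returns NULL for them,
-- matching what the scalar version returns in that case.
SELECT TOP 1000
        u.DisplayName,
        bc.BadgeCount
FROM    dbo.Users AS u
OUTER APPLY dbo.BadgeCountInline(u.Id) AS bc;
```

The GROUP BY plus OUTER APPLY combination is there only to keep the original behavior of returning NULL for users without badges. If a zero is acceptable, dropping the GROUP BY is simpler.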

To successfully virtualize these:

Track SOS_SCHEDULER_YIELD closely with something like the Power BI Dashboard for DBAs
Get the fastest cores possible (think 3.5GHz or faster)
Avoid CPU overcommitment – normally, VM admins like putting multiple VMs per core, especially given SQL Server’s licensing costs
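If you don't have a dashboard handy, a rough way to keep an eye on SOS_SCHEDULER_YIELD is to sample the wait stats DMV yourself. This is just a sketch of that idea, not the dashboard's own query: the numbers are cumulative since the last restart (or since the wait stats were cleared), so the useful figure is the difference between two samples taken a few minutes apart.

```sql
-- Requires VIEW SERVER STATE permission.
-- Run this twice, some minutes apart, and compare the two samples.
SELECT  ws.wait_type,
        ws.waiting_tasks_count,
        ws.wait_time_ms,
        ws.signal_wait_time_ms,
        SYSDATETIME() AS sample_time
FROM    sys.dm_os_wait_stats AS ws
WHERE   ws.wait_type = N'SOS_SCHEDULER_YIELD';
```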

IO-latency-sensitive apps don’t virtualize well.

We’ve all been taught that our code should work in sets, not row-by-agonizing-row. However, if you work one row at a time, you can become really sensitive to transaction log file latency.

One of my (least) favorite examples was an app server that had:

C# app running on an app server
It called SQL Server to log a row in a logging database table to say it was starting processing (which waited on the log file to harden)
The C# app would do some processing
It would call SQL Server back and update that one row to say it was done
Wash, rinse, and repeat millions of times in a single-threaded fashion

As a result, every added millisecond of latency meant huge time increases for their nightly jobs. They’d long ago understood that it was a problem, so they’d put that database’s log file on really cheap, consumer-grade NVMe SSDs, which meant that they had sub-millisecond latency.
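To get a feel for that pattern, here's a rough T-SQL imitation of it. Everything in it is an assumption made for illustration: the table dbo.ProcessingLog, its columns, and the iteration count are invented, the real app did this from C#, and the sketch assumes default (full) durability so that each commit waits on the log.

```sql
-- Hypothetical logging table, invented for this sketch.
CREATE TABLE dbo.ProcessingLog
(
    Id         INT IDENTITY(1, 1) PRIMARY KEY CLUSTERED,
    StartedAt  DATETIME2 NOT NULL,
    FinishedAt DATETIME2 NULL
);
GO

SET NOCOUNT ON;
DECLARE @i INT = 0, @Id INT;

WHILE @i < 10000    -- arbitrary iteration count for the sketch
BEGIN
    -- Autocommit: this INSERT is its own transaction, so it waits
    -- for its log records to harden before returning.
    INSERT dbo.ProcessingLog ( StartedAt ) VALUES ( SYSDATETIME() );
    SET @Id = SCOPE_IDENTITY();

    -- (The real app did its C# processing here.)

    -- Second autocommit transaction, second wait on the log file.
    UPDATE dbo.ProcessingLog
    SET    FinishedAt = SYSDATETIME()
    WHERE  Id = @Id;

    SET @i += 1;
END;
```

With two commits per item, each extra millisecond of log write latency adds about two milliseconds per item. As a purely illustrative figure, a million items would take roughly 2,000 seconds longer, or a bit over half an hour.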

But when they virtualized that application, the log file moved from local SSD out to the shared storage. They’d purchased pretty good storage – but even that couldn’t compete with the extremely low latency they could get locally.

To successfully virtualize these:

Look out for single-row, single-threaded processes (and ideally, write those to work in parallel batches)
Track WRITELOG waits closely before the migration
Load test the vulnerable processes before going live, making sure your jobs still finish in an acceptable time window
Consider putting databases like that on separate volumes so their performance characteristics can be tuned separately
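For the WRITELOG tracking, a starting point might look like the two queries below: one for the wait itself, one for average write latency on each log file. Both read cumulative numbers since the last restart, so take a baseline on the physical box before the migration and compare it against the same queries on the VM. The column aliases are my own.

```sql
-- Requires VIEW SERVER STATE permission.
-- How much time sessions have spent waiting on log writes.
SELECT  ws.wait_type,
        ws.waiting_tasks_count,
        ws.wait_time_ms,
        CASE WHEN ws.waiting_tasks_count > 0
             THEN 1.0 * ws.wait_time_ms / ws.waiting_tasks_count
        END AS avg_wait_ms
FROM    sys.dm_os_wait_stats AS ws
WHERE   ws.wait_type = N'WRITELOG';

-- Average write latency per transaction log file.
SELECT  DB_NAME(vfs.database_id) AS database_name,
        mf.physical_name,
        vfs.num_of_writes,
        vfs.io_stall_write_ms,
        CASE WHEN vfs.num_of_writes > 0
             THEN 1.0 * vfs.io_stall_write_ms / vfs.num_of_writes
        END AS avg_write_latency_ms
FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN    sys.master_files AS mf
        ON  mf.database_id = vfs.database_id
        AND mf.file_id = vfs.file_id
WHERE   mf.type_desc = N'LOG'
ORDER BY avg_write_latency_ms DESC;
```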

And alert management about technical debt.

Both of these cases involve code that isn’t so great – code that was shipped to get a feature out the door and bring revenue in. That’s technical debt.

Ward Cunningham’s analogy about technical debt is one of the most effective ways I’ve seen to communicate the issue to management.

13 Comments

Wyatt
January 25, 2018 8:55 am

I see this time and time again. Some organization does a round of hardware upgrades. Since they want to virtualize a bunch of servers (most of which aren’t SQL Server), they buy a big host with as many cores as they can get in it, but those cores are slower than if they bought fewer-core, faster CPUs. Then they put a VM on the host with SQL Server on it, licensed at the guest level. Except they’re paying the same to license 8 2.0 GHz cores as they could have paid to license 8 3.0 GHz cores. They probably could have gotten the same performance with only 6 3.0 GHz cores, which would be cheaper to license.

Randy Minder
January 25, 2018 11:48 am

Are you saying all UDFs are single-threaded or just scalar UDFs?

- Brent Ozar
  January 25, 2018 11:56 am
  
  Randy – go ahead and read the post closely, and there’s a link to more information, too. Enjoy!
  
Randy Minder
January 25, 2018 12:37 pm

I thought I did read it pretty carefully. Here you say, “However, most functions have a dark secret: queries that call them are single-threaded.” And then you say, “That means that for CPU-bound queries with scalar functions, single-core CPU speed is incredibly important”. Your first statement doesn’t mention scalar and your second does. And as far as reading the link, well, like most of your hard-working readers, I’m pretty busy. 🙂

- Brent Ozar
  January 25, 2018 12:53 pm
  
  Randy – no problem! Whenever you’ve got the time to read and learn, the post will be there for ya.
  
  But just like you, I’m kinda busy myself. I wish I could spoon feed everyone every little piece of knowledge, but… at some point, ya gotta pick up the spoon yourself. 😉
  
Randy Datillo
January 25, 2018 8:41 pm

Brent, that’s an awfully snarky comment. Do you realize how condescending that is to all of us readers?

- Brent Ozar
  January 26, 2018 2:37 am
  
  Yep.
  
  And for someone who claims to be too busy to read, you sure do have a lot of time available to have a discussion in comments. If you invested just a little of that in reading the post and the supporting material in the links, you’d be in much better shape.
  
- T P
  January 26, 2018 10:35 am
  
  “Do you realize how condescending that is to all of us readers?”
  How so? Brent was talking to Randy Minder, not us.
  
  - Brent Ozar
    January 26, 2018 11:14 am
    
    Yep, Randy doesn’t realize that many/most other readers are chuckling when they see comments like that.
    
Steve Knox
February 3, 2018 11:00 pm

This could have just been called “Two Code Patterns That Don’t Scale Well.”
Use these code patterns on a physical box which is otherwise properly sized for the application, and they’ll bring that box to a halt as their use scales up as well.
Virtualization really has nothing to do with it — it just highlights the fact that IT has been hiding poorly-performing code by overspending on system resources for decades.

- Brent Ozar
  February 4, 2018 3:28 am
  
  Steve – interesting, I wouldn’t have called a $250 SSD overspending (especially in relation to a SAN.)
  
- Endi
  February 9, 2018 6:33 am
  
  > it just highlights the fact that IT has been hiding poorly-performing code by overspending on system resources for decades.
  
  not only poorly performing, it’s poorly designed. This is the root cause. Logging from C# app to a relational, transactional database? That’s idiocy. Perfect example of technical debt.
  
  - Brent Ozar
    February 9, 2018 1:17 pm
    
    If only every app had a perfect design from the beginning, eh?
    
    Holler if you need me. I’ll be over here in the real world. 😉
    