Scalar functions and multi-statement table-valued functions are notorious performance killers. They hide in execution plans, their cost is under-estimated, the row estimates are way off, they cause queries to go single-threaded, I could go on and on. The Connect item about their performance has historically been one of the top-voted complaints out there.
At Summit 2017, Joe Sack said Microsoft had heard the community, and was working on a way to inline scalar functions. Now, thanks to a newly published paper, we know more about how they’re doing it. Folks from Microsoft and the Gray Systems Lab wrote Froid: Optimization of Imperative Programs in a Relational Database (17-page PDF, and slightly easier-to-digest 12-page PDF).
It’s just a little technical, but if you’re a query tuner, you’ll probably find it a clearly written, enjoyable read. (I hate most academic papers, but this one was an easy read.)
What Froid Does: Turn Scalars Into Subqueries
If the entire body of an imperative UDF can be expressed as a single relational expression R, then any query that invokes this UDF can be transformed into a query with R as a nested sub-query in place of the UDF.
At first, it sounds like they’re turning single-statement scalar functions into subqueries a la the APPLY statement. That might not sound too exciting because most of the scalars out there are multi-statement. However, keep reading, and they say they’ve already got it working with DECLARE, SET, SELECT, IF/ELSE, RETURN, UDFs (nested/recursive functions), EXISTS, and ISNULL. The 17-page version of the paper includes some decently sized functions as examples of what Froid can tackle.
Froid replaces the scalar UDF operator in the calling query with the newly constructed relational expression as a scalar sub-query.
Froid (French for cold) is the name they’ve given the technique. Not a big fan of that myself – I keep hearing it as either Freud or fraud, depending on my level of enthusiasm. (“Sometimes a subquery is just a subquery.”) Anyhoo, onwards.
That one statement comes with a lot of interesting gotchas that they discuss throughout the paper:
- You won’t see the scalar function in the plan (just like we don’t currently see single-statement inline table-valued functions by name in the plan – we just see their effects, kinda like views)
- The function and the underlying tables might have different permissions (you could have permissions on the function but not the tables, or vice versa, which makes compilation & execution a little trickier)
- Code that doesn’t get used can just get removed outright (just like SQL Server can do join elimination)
- Costs and row estimates are now useful inside the plan
Froid is an extensible framework, designed in a way that makes it straightforward to incrementally add support for more languages and imperative constructs.
They mention C#, Java, R, and Python as other candidates. Given the initial (kinda duct-tape-y) implementation of R & Python in SQL Server 2016/2017, this interests me a lot: if we’re gonna use other languages in the database, I’d much rather have them be first class citizens instead of separate services.
How Froid Was Tested: With Azure SQL DB Data
I absolutely adore Azure SQL DB, but maybe not for the reason you expect. See, now that Microsoft is charging money to host your databases, they’re starting to catch on to the performance bottlenecks. They’re probably looking at their own code and going, “Dang, if we fixed this scalar UDF problem, queries would run X% faster, and we’d be able to cut our hosting bills by 14%, and therefore we’d be able to pass the savings on to…actually, we could just keep the savings.”
An example of that pops up in the paper:
We have analyzed several customer workloads from Azure SQL Database to measure the applicability of Froid with its currently supported constructs. We are primarily interested in databases that make good use of UDFs and hence, we considered the top 100 databases in decreasing order of the number of UDFs present in them. Cumulatively, these 100 databases had 85329 scalar UDFs, out of which Froid was able to handle 51047 (59.8%).
Think about that for a second.
Yeah yeah yeah, the top 100 databases had an average of 853 UDFs in them each. Sure, that’s horrifying. But set that aside – I actually don’t think that’s so bad given that it’s the top 100 databases worldwide.
Focus on this line for a second:
…we considered the top 100 databases in decreasing order of the number of UDFs present in them.
That means Microsoft looked at all Azure SQL DBs, sorted them by the number of UDFs, and then analyzed those customers’ code.
I know sometimes people get all tin-foil-hat-conspiracy, but this does mean that Microsoft developers are reading your databases and your code. Yes, they’re using it to make the product better – but this is a good example of the fact that in the cloud, your hosting partner can (and does) read your data, your code, and your workloads. (Later paragraphs talk about how they replayed customer workloads to gauge performance improvements with Froid.)
It’s a tradeoff: Microsoft reads your data, but they use it to make the product better. Is it worth it? Those poor folks with 853 functions would probably say yes, but obviously, you can’t trust their judgment to begin with, right? Those bozos have 853 functions per database. C’mon. They can’t even be trusted to pick what we’re having for lunch.
No, seriously, the performance improvements are fantastic, of course – the same kinds of order-of-magnitude improvements that you’d normally get with rewriting the code to avoid scalars and multi-statement TVFs altogether. The engine just goes faster without changing the way you drive. That’s pretty awesome.
When Froid Might Ship: Azure, SQL 2017, and SQL vNext
The paper includes this gem:
Froid is implemented in SQL Server 2017 in about 1.5k lines of code. For our experiments, SQL Server 2017 with Froid was run on Windows Server 2012(R2).
Don’t get all excited, now, speedy: that doesn’t mean they’re going to ship it in a 2017 cumulative update, although a performance tuner can dream.
Remember, Microsoft has a track record of being properly paranoid when it comes to shipping dramatic execution changes – like how batch mode is only available for queries with columnstore indexes in them. If query execution could result in incorrect results – and it has – then they’re right to be paranoid. So, it wouldn’t surprise me to see this kind of rollout come very slowly just like it did with batch mode execution. They surely won’t go wild and crazy and just release it in a 2017 CU that affects all scalar UDFs.
They’ll probably ship it in Azure SQL DB first – and hey, we can probably guess which 100 databases will get it first.
To learn more about Froid, head to the 44th International Conference on Very Large Data Bases 2018 on August 27-31 in Rio de Janeiro, Brazil. I’m kidding, of course – the paper was submitted for that, but the sessions haven’t been picked yet. And let’s be honest, if one of you goes to Brazil, you’re not going to step foot in the conference hall. You people are party animals. I’ll see you at the beach. Spring break!
Erik Says: Look, I’m willing to trade in a lot of blog posts and training material about scalar UDFs for them not to kill performance for unsuspecting developers out there anymore. I’d also love to have a very tangible reason to give clients to upgrade to 201(?). For all the things I love about 2016 and especially 2017, most of our clients wouldn’t see a lot of ROI with stuff needing batch mode to work, etc. (even with the tricky workarounds).