When In-Memory OLTP came out, I worked with it briefly, and I remember coming away thinking, “Who in their right mind would actually use this?” I was so horrified that I wrote a presentation about it and gave it at a couple of conferences:
Many laughs were had, but obviously that didn’t make me any friends at Microsoft, ha ha, ho ho. I figured I was done with Hekaton, and I wouldn’t ever have to see it again because y’all all saw the video, and you’d never be crazy enough to implement that feature.
Well, here it is in 2023, and recently I’ve talked to a couple of architects who wish they could go back in time and watch that video. In both cases, they suffered from the same issue.
The short story is that the more data you put into durable In-Memory OLTP tables – and even just 5GB of data can hit this issue – the more your startups, failovers, and restores turn into long stories, to the point where other databases on your SQL Server are practically unusable.
Setting up the 5GB Votes table
To demonstrate the problem, we’ll start with a large copy of the Stack Overflow database. I’ll drop my nonclustered indexes just to make the next screenshot more clear, and then I’ll list the tables in the database to find a good candidate for the demo.
The highlighted table, dbo.Votes, has about 150 million rows and takes up about 5GB of space. That’s not big data by any means, and we can easily fit it in memory on our 8-core, 60GB RAM server. Let’s migrate it into In-Memory OLTP in order to make performance go faster:
ALTER DATABASE StackOverflow
SET COMPATIBILITY_LEVEL = 130, /* Or you can go higher, too */
MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT = ON;
ALTER DATABASE StackOverflow
ADD FILEGROUP StackOverflow_ram_fg CONTAINS MEMORY_OPTIMIZED_DATA;
ALTER DATABASE StackOverflow
ADD FILE (name='StackOverflow_ram',
TO FILEGROUP StackOverflow_ram_fg;
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
CREATE TABLE [dbo].[Votes_InMemory_1](
[Id] [int] NOT NULL PRIMARY KEY NONCLUSTERED,
[PostId] [int] NOT NULL,
[UserId] [int] NULL,
[BountyAmount] [int] NULL,
[VoteTypeId] [int] NOT NULL,
[CreationDate] [datetime] NOT NULL)
WITH (MEMORY_OPTIMIZED = ON,
DURABILITY = SCHEMA_AND_DATA);
/* Takes about 11 minutes on my machine: */
INSERT INTO [dbo].[Votes_InMemory_1]
(Id, PostId, UserId, BountyAmount, VoteTypeId, CreationDate)
SELECT Id, PostId, UserId, BountyAmount, VoteTypeId, CreationDate
After migrating that 5GB table to In-Memory OLTP, it’s taking up 5GB RAM, right? Well, no:
Try 28GB RAM. That’s why we have a sp_Blitz check to warn about high memory usage for In-Memory OLTP:
As to why an uncompressed 5GB rowstore table takes up 28GB RAM, that’s a topic for another blog post. (Sometimes, I imagine rowstore tables being introduced today – people would think they’re the most amazing thing ever.)
What happens when SQL Server restarts?
For most of us, when we restart SQL Server, our biggest concerns are things like dropping connections, losing performance metrics, and starting over again with a fresh buffer pool. In-Memory OLTP users have a bigger problem: when the SQL Server restarts, CPU goes straight to 100%, and stays there:
Ouch. Why is CPU so high? Because with In-Memory OLTP, SQL Server starts up the database by reading the In-Memory OLTP tables up into something we call “memory.” It’s using less memory than before, thank goodness – it’s “only” 13GB – but 13GB of data still takes a long time to pull from disk and reconstruct in memory. (I’m not going to go into technical terms like crash recovery phases here because it’s not necessary to convey the overall problem.)
You can see exactly how long it takes in the SQL Server logs. In this case, the StackOverflow database recovery took 50 seconds – and in this case, because our server’s got really fast storage, the storage was able to deliver data so quickly that it pegged CPU at 100% for all 50 seconds.
During startup, wait stats show a lot of wait times on XTP_PREEMPTIVE_TASK and SLEEP_DB_STARTUP:
There’s also blocking, which is kinda funny – anything that needs to query sys.databases gets blocked, like IntelliSense queries running in the background:
This happens during failovers, restores, and attaches, too.
Startups, failover clustered instance failovers, and restores all exhibit the same problem: they need to start up the database, and doing that means reconstructing the In-Memory OLTP data from scratch.
This is particularly problematic for shops that use Always On Availability Groups. Say you want to add a new database into an existing cluster, and the new database happens to use In-Memory OLTP. You restore it on the primary, and … boom, the server hits 100% CPU usage for an extended period of time, which affected all the running queries on existing databases on that server.
Here’s what CPU looks like at the end of a database restore:
CPU was relatively low while the restore was processing, but once SQL Server was done writing the files, CPU went straight to 100%. Why? Because it needed to bring the database online, which meant reading the In-Memory OLTP data from disk and reconstructing the table.
Even if you just attach an In-Memory OLTP database – a process that is normally near-instantaneous – you bring SQL Server to its knees while it reads through that data and populates memory.
In my example, I don’t have indexes on the In-Memory OLTP table, but if I did, the situation would be even worse. Indexes on these tables are only kept in memory, not on disk, so they’re reconstructed from scratch at startup, failover, restore, and attach time.
Bottom line: the more In-Memory OLTP durable data you have,
and the more databases you have that use it,
the worse this gets.
Trace flags 3408 and 3459 don’t help on SQL Server 2022, at least.
Among other sources, Konstantin Taranov’s excellent trace flag list reports that trace flag 3408 forces all databases to use just one thread when starting up. I don’t think I’ve ever needed to use that trace flag, but it doesn’t appear to help here. In my lab, I set up 3408 in the startup config, restarted the SQL Server, and CPU still went straight to 100%:
Furthermore, 3408 doesn’t appear to have the desired effect on SQL Server 2022 (and I didn’t bother checking on other versions, since it wouldn’t have helped my client, as they had a lot of databases with In-Memory OLTP.) Databases are still starting up with multiple threads, which would drive CPU to 100%:
Same problem with trace flag 3459, which is supposed to disable parallel redo for AGs, but that’s unrelated to database startup, as we can see by the flames:
And just because someone’s going to ask, no, setting MAXDOP to 1 has no effect on system processes like bringing databases online.
So how do we get faster startups, failovers, and restores?
In theory, you could add CPUs. In practice, that’s a really expensive way to solve this problem, and it doesn’t work when you have several (or heaven forbid, dozens) of databases that use In-Memory OLTP. SQL Server doesn’t just start up one database at a time – it starts them in groups, which means you can saturate lots of CPUs on restart. If you need to dive into the internals of this, here are a couple of resources:
- Which databases are started up first, and how recovery is parallelized
- Availability Group secondary redo and performance, then How many worker threads are used by Availability Groups – these are relevant because if you have a lot of databases using In-Memory OLTP, and if you chose In-Memory OLTP to reduce latch contention due to high concurrency writes, then you’re probably going to have to worry about parallel redo on the secondaries
In theory, you could use slower storage. Bear with me for a second: if your primary concern was that other databases on the SQL Server were unusable while In-Memory OLTP databases came online, you could actually put the In-Memory OLTP filegroup on slower storage. In this 5GB Votes table example, that does actually lower CPU usage during most of the process, only driving it to 100% near the end of the process:
(Yes, I actually tested that for one particular client, who was curious.) However, that also makes the In-Memory OLTP databases take even longer to come online! In my example with a 5GB table, the database took ~80 seconds to come online instead of ~50 – making your RTO goals tougher to meet.
In practice, minimize what you keep in In-Memory OLTP durable tables. A few ways to do that:
- Make them schema-only instead, and don’t keep the data around. Yes, you lose all data when the SQL Server goes down, but if you’ve been using this feature for temporary data anyway, like session state, reconsider whether you need to keep it.
- Sweep older data out to conventional tables. One of Microsoft’s design patterns for this feature is to use In-Memory OLTP tables only for ingestion, but then after the data’s been absorbed into the database, archive it.
- Use regular tables instead, not In-Memory OLTP. Conventional tables don’t interrupt startup.