Let’s finish up Research Paper Week with something we’re all going to need to read over the next year or two. I know, it says Azure SQL DB, but you boxed-product folks will be interested in this one too: Constant Time Recovery in Azure SQL DB by Panagiotis Antonopoulos, Peter Byrne, Wayne Chen, Cristian Diaconu, Raghavendra Thallam Kodandaramaih, Hanuma Kodavalla, Prashanth Purnananda, Adrian-Leonard Radu, Chaitanya Sreenivas Ravella, and Girish Mittur Venkataramanappa (2019).
This one covers a new feature for both Azure SQL DB and SQL Server that:
- Stores the version store in the user database file(s) rather than TempDB, which
- Facilitates near-instant transaction rollbacks, even for huge transactions
- Lets you clear the transaction log much faster, even when transactions are open
Those of you who are familiar with the magic of Read Committed Snapshot Isolation (RCSI) and its use of the version store in TempDB are probably slapping your heads right about now going, “Ah, it’s obvious!” You’re going to be able to hit the paper running – no need to keep reading.
For the rest of you, I’ll try to condense the relevant parts of Kendra’s intro to RCSI here in a few sentences. When you enable RCSI or SI, SQL Server automatically starts storing versions of rows in TempDB. If you want to hold a lock on a row, SQL Server stores the original unmodified row over in TempDB. This means that if someone wants to read (not write) the row that you currently have locked, the SQL Server can simply hand them the unmodified copy out of TempDB and skirt around your lock. This means writers don’t block readers, which is what makes RCSI so massively popular for fixing blocking issues.
The downsides of RCSI include more workloads happening in TempDB, which causes two problems: slower TempDB access, and larger TempDB files. For years, folks have asked for the ability to use a different TempDB per user database to work around problems like this, but Microsoft went in a totally different direction and decided to let you store the version store in the user database instead.
This is way better than separate TempDBs (for this purpose) because having the original rows inside the user database opens up new possibilities for faster transaction rollback and quicker transaction log truncation, as the authors go into in the paper.
If you’re considering enabling Constant Time Recovery in SQL Server 2019 (when it ships), you should read the paper to understand the risks and drawbacks of a feature like this. This is the kind of feature that will put the cutting in cutting edge: I would fully expect to see a lot of rollback, concurrency, and backup/restore bugs fixed in the early Cumulative Updates for 2019. I’m not saying it’s a bad feature – it certainly looks cool – but there are lots of risks when you make a change this big.
If you enjoy this paper, you’ll probably also enjoy Socrates: The New SQL Server in the Cloud. It explains how Azure SQL DB Hyperscale works, and boy, does it have intriguing new ways of handling data and log files.
I hope you enjoyed Research Paper Week – I certainly did – and I want to stop for a moment to thank all those academics and research professionals out there who’ve done such a fantastic job putting very complex thoughts down into easier-to-understand words for the rest of us. These papers involve hundreds or thousands of hours worth of research, action, writing, and testing, and they’re all available to you for free. How awesome is that?