The SQL Server 2016 PDF data sheet is out, giving us the first public look at the next round of features in Microsoft’s database product.
Quick disclaimer: I’m a Microsoft MVP and Certified Master, and we’ve got clients that work with Microsoft. As such, I’m bound by a few non-disclosure agreements. For this post, I’m going to take off my regular hat, and put on my reverse-engineering hat.
“In-memory enhancements: 30X in-memory OLTP + 100X in-memory ColumnStore = Real-time operational analytics” – SQL Server 2014 brought us each of these features individually. ColumnStore indexes really can get you 100X performance on your analytical warehouse-type queries, and Hekaton may or may not give you rocket speed transactional performance. Both of these storage types are very specialized – for serious performance reasons, you don’t want to run analytical queries against your Hekaton tables, and you don’t really want to do transactional delete/update/insert activity against your ColumnStore tables. The graphics in the brochure show one transaction going into both tables. While this might at a glance seem like you get the best of both worlds, keep in mind that both of these say “In-memory” – and you’ll probably need two worlds’ worth of memory to pull this off. Fortunately, servers like that are available today.
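If Microsoft really does let one table serve both purposes, I'd guess the syntax builds on what's already in the product – an inline columnstore index declared on a memory-optimized table. This is purely a sketch based on existing CREATE TABLE syntax; the table and index names are made up:

```sql
-- Hypothetical: a Hekaton (memory-optimized) OLTP table with a columnstore
-- index layered on top, so analytical queries hit the same in-memory data.
CREATE TABLE dbo.Orders (
    OrderID    INT IDENTITY PRIMARY KEY NONCLUSTERED,
    CustomerID INT       NOT NULL,
    OrderTotal MONEY     NOT NULL,
    OrderDate  DATETIME2 NOT NULL,
    INDEX ccs_Orders CLUSTERED COLUMNSTORE
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
```

Note that both the row store and the columnstore copies live in memory – which is exactly why you'd need those two worlds' worth of RAM.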
“Always Encrypted technology helps protect your data at rest and in motion, on-premises and in the cloud, with master keys sitting with the application, without any application changes.” In the past, we’ve basically had two places we could encrypt or decrypt data:
- In the database – in which case, the database administrator could see private data, thereby giving security professionals nightmares
- In the application – which also gave the security pros nightmares, because our developers would be rolling their own encryption code
The key phrase here says “protect your data at rest and in motion…without any application changes.” Microsoft’s implying app-level encryption, keeping those prying DBAs out. However, if the data is encrypted before it hits the database, you need to consider the following scenarios:
- Analytics queries – your analytics and reporting apps will have to be equipped to decrypt the data if you need to show it onscreen
- Range queries – if you need to see all of the users whose last names are like Smi%, you need to decrypt them
- Replication – if you want to copy the data from one SQL Server to another, it usually needs to be decrypted on the way out
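If this works the way column-level encryption features usually do, the declaration might look something like this – the table, column, and key names here are all hypothetical:

```sql
-- Hypothetical: the SSN column is encrypted with a key the SQL Server
-- never sees in plaintext; only the application can decrypt values.
CREATE TABLE dbo.Patients (
    PatientID INT PRIMARY KEY,
    SSN CHAR(11) COLLATE Latin1_General_BIN2
        ENCRYPTED WITH (
            COLUMN_ENCRYPTION_KEY = MyColumnKey,
            ENCRYPTION_TYPE = DETERMINISTIC,
            ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'
        ) NOT NULL
);
```

Deterministic encryption would at least allow equality lookups (the same plaintext always produces the same ciphertext), but a range or LIKE ‘Smi%’ query still can’t be evaluated server-side – the ciphertext doesn’t sort like the plaintext.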
“High Availability…with the ability to have up to 3 synchronous replicas, DTC support, and round-robin load balancing of the secondaries.” No reading between the lines necessary here – these are hotly requested features for AlwaysOn Availability Groups.
“Manage document data with native JSON support.” While this has also been hotly requested, I’d caution you to be careful what you ask for. Users demanded XML support inside SQL Server, and then proceeded to use SQL Server as an XML query engine, sending CPU through the roof. SQL Server is one of the world’s most expensive application servers.
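“Native” here probably doesn’t mean a new JSON data type – more likely functions that operate on plain NVARCHAR, the way Microsoft has hinted. A sketch of what querying might look like:

```sql
-- Hypothetical: JSON stored as plain NVARCHAR, queried with built-in functions.
DECLARE @json NVARCHAR(MAX) =
    N'{"name": "Smith", "orders": [{"id": 1}, {"id": 2}]}';

-- Pull a single scalar value out of the document:
SELECT JSON_VALUE(@json, '$.name') AS CustomerName;

-- Shred an array into a rowset:
SELECT id
FROM OPENJSON(@json, '$.orders') WITH (id INT '$.id');
```

And just like with XML, every one of those function calls burns CPU on your expensive database server instead of on a cheap app tier.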
“Scale and Manage – Enhanced performance, scalability and usability across SQL Server Enterprise Information Management tools and Analysis Services.” Read this one carefully, and note which column it’s in – “Deeper Insights Across Data.” While you might read this as a better/faster/stronger SQL Server Management Studio, that’s not what Enterprise Information Management means.
“Powerful Insights on any device – Business insights through rich visualizations on mobile devices. Native apps for Windows, iOS and Android. New modern reports for all browsers.” Microsoft recently acquired DataZen, a reporting tool that focuses on mobile devices, and promptly gave it away free to most Enterprise Edition customers. The simplest way to accomplish this feature would be to simply throw DataZen’s existing code in free with the SQL Server boxed product. Like any acquisition, I wouldn’t expect this to become a seamless part of SQL Server for a year or two at least. (Think of Microsoft’s acquisition of DATAllegro back in 2008, which it gradually transitioned into Parallel Data Warehouse, aka APS. Some will say, “Well, that was a tougher product to build,” but native apps for mobile BI ain’t an easy nut to crack either – as evidenced by Microsoft’s difficulty in getting to that sweet pistachio. Look, this analogy had to go off the rails at some point, we all saw that coming.)
“Advanced Analytics at massive scale – Built-in advanced analytics provide the scalability and performance benefits of running your “R” algorithms directly in SQL Server.” Just like you can run your C# code in the database, too. Or your XML parsing. Or your JSON. If you wanna do your processing in a $7k USD per core database server, Microsoft wants to empower you to do it. I would too, if I were wearing their exceedingly good-looking and expensive shoes. I like the way they think.
“Stretch Database technology keeps more of your customers’ historical data at your fingertips by transparently [sic] and stretching your warm and cold OLTP data to Microsoft Azure on-demand without application changes.” Database admins often come to me and say, “I’ve got a ton of data in a table, but most of it just isn’t queried that often. I need to archive it. Will table partitioning help?” Sadly, it usually doesn’t, because good table partitioning requires changes to your application. If you want great performance, you have to make sure your WHERE clause specifically excludes the archival partitions. Otherwise, SQL Server isn’t terribly good at excluding those partitions, and it ends up scanning all of the partitions.
Microsoft sounds like they’re moving the cold table partitions up into Azure storage. That had damn well better not be the only portion of the solution that they deliver – they have gotta deliver better partition elimination. If not, this feature is going to be legendarily bad, because scanning partitions up in Azure is going to be even worse than scanning partitions locally. Questions around a feature like this would include:
- How will backups work?
- How will restores work?
- How will DBCCs work?
- How will data be moved from on-premises to Azure?
- Can multiple servers attach to the Azure partitions? (For example, if you regularly back up production and restore it over to development, how will that work?)
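To see why partition elimination matters so much here, consider a hypothetical Orders table partitioned by date (the names and dates are made up):

```sql
-- Without a predicate on the partitioning column, SQL Server touches
-- every partition -- and with Stretch, "every partition" includes Azure.
SELECT SUM(OrderTotal)
FROM dbo.Orders
WHERE CustomerID = 42;

-- An explicit date filter lets SQL Server skip the cold partitions entirely:
SELECT SUM(OrderTotal)
FROM dbo.Orders
WHERE CustomerID = 42
  AND OrderDate >= '2015-01-01';
```

Today, that second WHERE clause is an application change – exactly the thing the datasheet promises you won’t need.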
“Temporal Database: Track historical changes” – Whenever Jeremiah shows me an Oracle feature, I get so jealous. Oracle’s got this really cool feature where you can query a table to see what it looked like at 5PM yesterday, or at the start of the last financial quarter. It ain’t free/cheap – after all, you have to store all the historical data to make those queries work. For scenarios that need that level of archival detail, though, that’s killer.
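If Microsoft follows the ANSI SQL:2011 temporal syntax (the same standard Oracle and IBM lean on), a system-versioned table might look something like this – table and column names are my own invention:

```sql
-- Hypothetical: SQL Server tracks every row's validity period and keeps
-- prior versions in a history table automatically.
CREATE TABLE dbo.Accounts (
    AccountID INT PRIMARY KEY,
    Balance   MONEY NOT NULL,
    ValidFrom DATETIME2 GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo   DATETIME2 GENERATED ALWAYS AS ROW END   NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
) WITH (SYSTEM_VERSIONING = ON);

-- What did the table look like at 5PM yesterday?
SELECT *
FROM dbo.Accounts
FOR SYSTEM_TIME AS OF '2015-05-04 17:00:00';
```

The storage bill for all that history is on you, of course – same as with Oracle.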
“Row Level Security: Apply fine-grained access control to table rows based on users rights” – I blogged about this when it was unveiled in Azure SQL Database.
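In the Azure SQL Database version, you define a predicate function and bind it to a table with a security policy. A minimal sketch, assuming the on-premises syntax matches Azure’s (function and table names are hypothetical):

```sql
-- Predicate function: a row is visible only to the sales rep who owns it.
CREATE FUNCTION dbo.fn_SalesFilter(@SalesRep AS SYSNAME)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN SELECT 1 AS AccessResult
       WHERE @SalesRep = USER_NAME();
GO

-- The policy applies the filter to every query against dbo.Orders.
CREATE SECURITY POLICY dbo.SalesPolicy
ADD FILTER PREDICATE dbo.fn_SalesFilter(SalesRep) ON dbo.Orders
WITH (STATE = ON);
```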
“Dynamic Data Masking: Real-time obfuscation of data to prevent unauthorized access.” – We’ve got clients who work with payroll, health, and financial data. They need to restore their production databases into their development and QA environments, but it’s critical that the developers not see personally identifiable information like social security numbers. These numbers are stored in the database. So what’s a DBA to do? Currently, the solution involves scrambling the data as soon as the restore completes, but that’s painful.
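If this works like the masking features in other platforms, the mask would be declared right on the column, and unprivileged users would see obfuscated values at query time. A sketch with made-up names and mask formats:

```sql
-- Hypothetical: developers querying this table see XXX-XX-1234 instead of
-- the real social security number; privileged users see the real value.
CREATE TABLE dbo.Employees (
    EmployeeID INT PRIMARY KEY,
    FullName   NVARCHAR(100),
    SSN   CHAR(11)      MASKED WITH (FUNCTION = 'partial(0, "XXX-XX-", 4)') NOT NULL,
    Email NVARCHAR(256) MASKED WITH (FUNCTION = 'email()')
);
```

Note the word “obfuscation” in the datasheet – the real values are still stored in the database, so this isn’t a substitute for encryption, just a way to keep them off developers’ screens.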
“Enhanced Database Caching: Cache data with automatic, multiple TempDB files per instance in multi-core environments” – First off, I would just like to point out that Microsoft uses my preferred capitalization of TempDB. Take that, rest of Brent Ozar Unlimited®. But back to the issue at hand – TempDB (BOOM!) configuration has long been a sore spot for database administrators. TempDB defaults to just one data file, no matter how many cores you have. In a perfect world, Microsoft would fix the SGAM and PFS page contention problem. Or they could just duct tape it by automatically adding multiple files when necessary. Sounds like they chose the latter. I’m okay with that.
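Today, the manual duct tape looks like this – checking your file count, then adding data files yourself (the file path below is hypothetical):

```sql
-- How many TempDB data files do you have right now? Default: just one.
SELECT name, physical_name
FROM tempdb.sys.database_files
WHERE type_desc = 'ROWS';

-- The manual fix SQL Server 2016 sounds like it will automate:
-- multiple equally-sized data files to spread allocation page contention.
ALTER DATABASE tempdb
ADD FILE (NAME = tempdev2,
          FILENAME = 'T:\TempDB\tempdev2.ndf',
          SIZE = 4GB);
```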
“Enterprise-grade Analysis Services” – wait, what was it before?
“PowerBI with on-premises data: New interactive query with Analysis Services. Customer data stays behind your firewall.” – PowerBI looks great, but it’s for Office365 customers only right now. A lot of our clients have smiled and nodded past that. The cloud is indeed the future, and most of us believe most of the data will end up there eventually, but Microsoft’s acknowledging the reality that for most of the market, it’s still the future, not the present.
“Easy migration of on-premises SQL Server: Simple point and click migration to Azure.” Migrating databases isn’t the problem. Moving all of the application servers and related services up to the cloud, that’s where the hard work is.
And now it’s off to the afternoon SQL Server keynote here at Ignite! Wow, this is really turning out to be the conference to be at if you want to see the future of SQL Server.