This morning at the PASS Summit, we have the pleasure of listening to Professor David DeWitt talk about Hekaton internals.
I’m actually not going to liveblog this – I’m going to sit back and take in the presentation as an attendee because it’s going to be so damn good, and I’m not going to be able to do justice to it in a live blog post. I need to explain why.
Yesterday, a few of us bloggers were given an advance question-and-answer session with him to talk about – well, whatever. Here’s what I asked him, and keep in mind that the answers are paraphrased. I took notes as fast as I could, and it’s possible, probable, even guaranteed that I misheard things, so don’t take this as a word-for-word transcription. I’m trying to maintain the spirit of what he said.
BGO: Out of your accomplishments this year, what are you the most proud of?
Shipping PDWv2 with Polybase.
Seeing Hekaton emerge in SQL Server 2014 CTP2. I didn’t have ownership of that project at the very beginning – I got it after 3 months – and then I owned it for a year and a half. Seeing it come out the door was exciting.
I’ve been working on this keynote since July 1st. I went canoeing in the Arctic on the 1st of August, and I had to have it done by then, so the month of July was spent banging the talk out. In 75 minutes with 77 slides (with complex animations), I’m trying to explain to the PASS Summit audience something that I would normally cover in a couple of lectures to students. It’s going to be complex.
Hekaton is totally different than the relational engine. How Hekaton stores data is just as different from the regular engine as the column store engine is different. Just as we saw Apollo’s column store indexing folded into the mainstream engine over time, we may fold in Hekaton improvements over time too.
BGO: What do you enjoy about speaking at Summit?
The high of doing this. It’s a very appreciative audience, unlike an undergraduate audience. <laughter> In a college environment, I don’t really want people to have laptops in class. They’re probably shopping online.
PASS is a great environment where you can tell people are here because they want to learn voluntarily. It’s all volunteers. Volunteers make such a big commitment to the event.
At the same time, it’s not all fun. There are periods like 2 weeks before where I’m incredibly stressed out. Two years ago I got a 5 on my session feedback evaluations, and then last year I got a 3. What am I going to get this year? I’m really stressed about that.
(Note from Brent: I really do get the vibe that DeWitt cares passionately about the session materials and how PASS attendees receive them. He’s not under fire from Microsoft to produce amazing materials – he is just totally self-motivated to beat the audience’s expectations.)
BGO: What’s the toughest part about your job?
Not being able to ship stuff as fast as I want. I’m not at a startup. I’ve come to appreciate what it means to be part of a company that prides itself on delivering really high quality software. SQL Server has a sterling reputation for really high quality. I’ve learned so much about the testing process.
In the upcoming release of PDW, we’ve got a feature coming, and it’s really important to me. It’s what most people would consider a small thing, but it’s very important to me. Unfortunately, we can’t enable it – we can’t ship it to the public because we don’t have enough time to test it. That’s frustrating, but it’s fair, and the bar for testing PDW isn’t even as high as regular SQL Server testing. The bar for SQL Server engine testing is incredibly high.
I could do with 2-3x the resources I have.
I’m old enough to get Medicare, but I still have lots of good ideas. Mike Stonebraker turned 70 this fall. Mike was my graduate TA for my first graduate class. I’ve known him for 40 years. He’s had a lot of successful startups, and he doesn’t need to work, but he has 4 startups and still goes to work every day.
BGO: Is there anything you regret not doing?
I’m envious of Stonebraker and all the startups he’s done.
I was part of Vertica, so I’ve never worked on the Microsoft column store stuff because of non-competes. But being part of a startup would be really gratifying. It takes guts, has challenges, and I’m not sure I would have been successful, but that’s the one thing I regret.
And I wish I would have been better at mathematics.
BGO: What’s the one thing you want people to take away from the keynote?
Building Hekaton was really a serious long term endeavor. We’ve been at it a full 5 years. It was a big deal. It could be the basis for a lot of new SQL products down the road.
For relational database storage, columnar stuff was really the first chink in the armor. It’s processed in vectors, the vectors get combined with bit masking, and we use a lot of different query processing techniques. More chinks are coming.
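(An aside that is not from DeWitt: here’s a rough sketch of what “processed in vectors, combined with bit masking” can look like. It’s a toy in plain Python, not how SQL Server’s column store actually works. The table, the column names, and the helper functions `predicate_mask` and `selected_rows` are all made up for illustration. Each predicate scans a single column and produces a bitmask, and the masks are ANDed together before any rows are touched.)

```python
# Hypothetical column-oriented table: one list (vector) per column,
# all the same length. Names and values are invented for illustration.
columns = {
    "region": ["east", "west", "east", "north", "west", "east"],
    "amount": [120, 40, 75, 300, 210, 15],
}


def predicate_mask(values, predicate):
    """Scan one column vector; return an int whose bit i is set if row i matches."""
    mask = 0
    for i, value in enumerate(values):
        if predicate(value):
            mask |= 1 << i
    return mask


def selected_rows(mask, row_count):
    """Turn a bitmask back into the list of matching row positions."""
    return [i for i in range(row_count) if (mask >> i) & 1]


# Evaluate each predicate against its own column, independently.
east_mask = predicate_mask(columns["region"], lambda r: r == "east")
large_mask = predicate_mask(columns["amount"], lambda a: a >= 50)

# Combine the per-column results with a single bitwise AND.
both_mask = east_mask & large_mask

# Only now read the values for the rows that survived both filters.
rows = selected_rows(both_mask, len(columns["amount"]))
total = sum(columns["amount"][i] for i in rows)
```

(The point of the sketch is the shape of the work: each filter reads only the one column it needs, and combining filters is a cheap bitwise operation rather than a row-by-row comparison.)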
Look at what’s happening with computer hardware, specifically memory prices. There’s a chance technology will drive us to a place where we have large amounts of memory and some non-volatile RAM, and we may end up with database systems whose databases are all in memory or near memory (NVRAM). In the next 10-20 years, the memory-intensive, core-intensive Hekaton approach could become the de facto way of storing data, as opposed to the disk-based product (the way SQL Server stores data & logs today).
My hope is that people will take those slides and study them carefully for the exam. <laughter> I want them to read them closely and understand why we did it this way.
BGO: So it sounds like it’s not a one-and-done feature like so many others we’ve seen. Development is actively ongoing, and there are still more investments being made here?
Apollo (column store indexes) came out in SQL Server 2012, and in SQL Server 2014, it’s v2 with updates and investments. PDW v2 is out with more features – except for that one small feature that’s my favorite that we can’t ship. These storage investments aren’t one-and-done – we’re focusing on these.
Hekaton CTP 1 had hash indexes only, and CTP2 adds B-tree indexes. There’s a white paper coming out on the index types we’ll support.
We have lots of exciting things in the language hopper for Hekaton. We’re broadening the language and data type support in Hekaton V2.
BGO: Can I quote you on that? I want to make sure I can actually blog that Hekaton V2 is going to have expanded language support.
My thoughts on what’s about to go down
For the keynote, DeWitt’s tackling something really challenging. How do you teach database internals – and not just regular internals, but really all-new internals – to a very wide audience? In this room, we’ve got database administrators, BI developers, database developers, and managers. Many of us in here don’t regularly work with latching problems in SQL Server’s current engine, much less a new one.
I admire what he’s trying to achieve, and having read the slide deck, I admire how he’s going to do it. The snark department is going to make fun of his clip art, but pay close attention. In the next 75 minutes, you’re going to learn internals of both the current engine and the Hekaton one. You’re probably not going to deploy Hekaton v1 for existing applications, but if Microsoft continues making payments on this vision, you’re probably going to want v2. Today’s session will explain why.