PASS Summit Keynote Notes (#sqlpass)



It’s Friday, it’s 8:30 AM, and it’s time for another keynote.

8:30 AM – Bill Graziano just rode out on a trike!

8:40 – Two sessions will be repeated today due to high demand. At 10:45 in room 604 they’re repeating the clustering session, and 2:30 in room 6E is SQL Server on SAN.

8:45 – The PASS Board Election results are in! The new board members are Douglas McDowell, Lynda Rab and Andy Warren! That means Tom LaRock, Pat Wright and Louis Davidson didn’t make it. Here’s some blog links from each candidate’s experiences:

8:47 – now announcing winners of the SQL Heroes contest.

8:50 – The winner: BIDS Helper with a perfect score of 55! I wish I could tell you the rest, but the slides rolled off the screen within 30 seconds. Damn. Update: thanks to Adam Machanic, they’re:

  1. BIDS Helper – I’ve heard this mentioned repeatedly over the week by presenters
  2. Extended Events Manager
  3. SSISUnit
  4. CDC Helper
  5. QPee tools by Jason Massie! Woohoo!

8:55 – Dell reviewing the Microsoft BI platform reference architecture with the entire BI stack including PerformancePoint.

9:00 – Bill Graziano back up onstage to introduce Dr. David DeWitt from Microsoft.

9:03 – David’s great – he’s identifying all the reasons why his presentation will suck: he doesn’t have a motorcycle, he doesn’t have new presentations, etc! Hilarious. Love it.

9:10 – He’s going through the basics of why linear scaleup is a good solution, and how the numbers work.

9:12 – Ebay has a couple of >1 petabyte warehouses with 85 disk drives for storage. As a former SAN guy, I’m guessing I heard those numbers wrong, because that sounds ridiculously impossible. (Update – that numbers is 8500 disk drives, which also sounds ridiculous but on the other extreme, hahaha.)

Dr David DeWitt Onstage
Dr David DeWitt Onstage

9:15 – DATAllegro works because it’s shared-nothing: no shared memory, no shared SAN, etc. Runs with commodity gear like a bunch of rackmount servers connected via gigabit networks or Infiniband. This same hardware model is used by all the major search engines.

9:20 – Interesting – he’s getting ready to talk about design alternatives to shared-nothing cluster approaches. I like it when a vendor is honest and says, “This is another way to do it other than what we’re talking about.” He’s covering horizontal partitioning now.

9:23 – Brilliant. He’s using animated pictures to show how the different partitioning methods work when they’re loading data. As he shows round-robin partitioning, the loaded rows populate onto the different visual database servers. It occurs to me that he’s never really left education – he understands how to convey complicated information to a wide audience.

Dr David DeWitt Explains Partitioning
Dr David DeWitt Explains Partitioning

9:30 – He’s explaining about how you scale out joins in shared-nothing systems and how relational operators affect the joins. This is amazing stuff.

9:32 – With Project Madison, reading execution plans is going to be a lot more interesting: we’re going to need to pay more attention to how each individual thread works, because the load really needs to be balanced. Right now, if you have a parallel operation broken out across 8 threads, for example, you’re not too concerned whether one thread finishes faster. When each thread is running on a separate machine, each with a separate set of data, you need to pay close attention to what kinds of data you’re putting on each machine. This is mesmerizing for engine guys like me.

9:38 – He’s talking about the role of indexes. I’m not even going to try to blog this from here on out. If you’re into query execution plans, if you’re into data warehousing or if you’re into scale-out architectures, you definitely want to watch the recording of this presentation. He’s brilliant and it has a lot of impact on people who will use Project Madison.

9:47 – I said I wasn’t going to keep blogging, but man, this is really, really important. Your indexing strategies, especially your clustered index strategy, is going to be dramatically different. There’s some similarities with OLTP partitioning – scenarios where you partitioned in order to gain concurrency benefits – but man, you gotta be paying attention. I can easily see where a Project Madison implementation will benefit from having a person specifically dedicated to determining the correct partitioning keys for a month or more, just testing and benchmarking. Schema changes will also require careful planning to make sure the data lands on the right nodes for querying speed.

9:52 – When the engine has to join two large tables that are partitioned in different ways, it’s called Table Repartitioning. Each node splits its part of the table out into N parts (N being the number of nodes) and then splits that data out to all of the other nodes. It’s done by splitting, shuffling and combining. Networking is fast enough that it’s not a bottleneck, they say. Can’t wait to see that part in action – technologically impressive!

9:55 – Thinking about concurrency problems, I’m amazed that this stuff works. It’s shuffling data between nodes in order to recombine for a query, and yet you could have inserts/updates/deletes going on at the same time too. I’d love to play with benchmarks to see when Madison starts to pay off: for example, if you have a 16-core box with 128gb of memory, how many Madison nodes would you need to keep up when there’s heavy insert/update/delete activity going on? I’m not saying Madison doesn’t pay off – I’m sure it will – but it’s a neat theoretical question to ask. (But it depends on the activity and the schema, of course – it’d cost a lot to find the answer, and the answer would only be valid in limited scenarios.)

9:57 – This has awesome backup implications. It’s been said before, but it bears repeating: on a scaled-out Madison system, the dimensions will be copied across all of the nodes, but fact tables will be spread out and not copied everywhere. Now, for the new part: I wonder how backups will work. With data spread out so far, the amount of throughput coming from Madison to the backup system will be insanely, screamingly fast. Toss in a compression product like LiteSpeed, and wow, I bet I could back up terabytes in minutes. I’m going to need a lot of storage speed on the target end just to keep up!

10:00 – Partition skew = an unbalanced amount of data in specific partitions (now nodes). The engine can either has functions (almost random) or range partitioning (manual setup). Since the node with the most data has the slowest response time, and that determines the query’s total execution time, then you have to pay attention to partition skew or your queries will slow down.

10:02 – DeWitt’s Gray Systems Lab is working with the DATAllegro team to build a new “world-class parallel optimizer” to make it easier to work around partition skew, among other things. Wow – brain power. “Microsoft intends to become the premier supplier of scalable database systems for data warehousing.” Interesting – that wouldn’t appear to put them in competition for lower-end scale-out systems.

10:03 – Done. Wow, he was awesome for engine guys like me! Off to the sessions…

Other bloggers took some great notes during the keynote:

Previous Post
PASS Summit 2008 Thursday Recap (#sqlpass)
Next Post
PASS Session on Analytics by Donald Farmer (#sqlpass)

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.