PASS Summit 2007: Day One Notes


My notes on the sessions I’ve attended so far:

Ted Kummert: Keynote on SQL 2008

I’d already seen most of this PowerPoint deck during a South Florida SQL meeting at Microsoft, but the demos kicked butt. Finally, we have IntelliSense in Management Studio! Woohoo! There was other fun stuff, but I’ve already forgotten it.

One odd note: they slipped in a slide saying the easiest transition to SQL 2008 will be to move to SQL 2005 now. The wording strongly hinted that SQL 2000 users need to get with the program, or that maybe we won’t be able to migrate directly to SQL 2008. That could be a problem, because I’d planned on skipping SQL 2005 for the majority of our third party apps like Project, SharePoint, Mercury Quality Center, etc. SQL 2008’s transparent database mirroring is a big selling point, and I’ll put the time into migrating programs to it for that alone, but it’s a tougher sell with 2005’s not-so-transparent mirroring that requires application changes.

Kalen Delaney: The Hidden Costs of Concurrency

Kalen explained the differences between optimistic and pessimistic concurrency. In a nutshell, pessimistic concurrency’s problem is that it blocks other queries that try to read what you’re updating. Both concurrency modes hold locks on what they’re updating, but pessimistic concurrency blocks SELECT statements as well.

The solution, switching from the default pessimistic concurrency to optimistic concurrency, requires enabling snapshot isolation. That causes SQL Server to store versions of updated and inserted rows in TempDB, so every update and insert incurs a lot more I/O. Reads are slower too, because if a process needs an older version of a record, that version has to be fetched out of TempDB.

The sudden increased use of TempDB makes a database administrator’s job much more difficult: they have to carefully watch the row versioning storage in TempDB and size their files appropriately. Run out of space in TempDB for versions, and everything grinds to a halt. Perfmon counters under SQLServer:Transactions track the version store load, such as how much space the version store is using compared to user TempDB objects.
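If you’d rather watch it from T-SQL instead of Perfmon, something like this works on SQL 2005; this is my own sketch, not from her session, and sys.dm_db_file_space_usage only tracks TempDB:

-- rough version store vs. user object space in TempDB (pages are 8KB each)
SELECT SUM(version_store_reserved_page_count) * 8 AS version_store_kb,
       SUM(user_object_reserved_page_count) * 8 AS user_objects_kb
FROM tempdb.sys.dm_db_file_space_usage;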

She made the case for using Read Committed Snapshot Isolation instead of what she calls “full” Snapshot Isolation because it requires less overhead.
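For reference, flipping those switches looks like this; the database name is my own placeholder, and turning on RCSI needs a moment with no other connections in the database:

-- Read Committed Snapshot Isolation: changes default READ COMMITTED behavior
ALTER DATABASE MyDataWarehouse SET READ_COMMITTED_SNAPSHOT ON;
-- "full" Snapshot Isolation: sessions must explicitly request SNAPSHOT level
ALTER DATABASE MyDataWarehouse SET ALLOW_SNAPSHOT_ISOLATION ON;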

I’d love to enable RCSI on my data warehouse, but I’d need to put some time and testing into validating TempDB speed and size first. Not low-hanging fruit.

Kalen did a great job of conveying difficult concepts with simple demos, and I would recommend her sessions and her books to highly technical DBAs. This was one of those sessions that I really wanted to come to PASS to see, because this stuff is pretty esoteric.

SQL CAT: Optimizing OLTP: Lessons Learned from MySpace.com

This one was a bit of a surprise, because Hala Al-Adwan from MySpace showed up to talk about the challenges she’s had with SQL 2005. She’s worked closely with the Microsoft SQL Customer Advisory Team on managing her 275+ SQL Servers. That’s right, 275 HP DL585 4-way dual-cores with 32-64GB of RAM, and they’re adding 3 SQL Servers a day. Humina humina. And you think you have replication problems.

They use SQL 2005 Standard Edition x64, and the drawback of Standard Edition x64 is that it won’t lock pages in memory. Enterprise does, and Standard 32-bit does, but Standard x64 doesn’t.

MySpace ran into connection denials because the same bits of code that manage IO also manage connections. This was an eye-opener for me, because I’ve heard rumblings from my data warehouse users that they’ve recently had connection denials during random times of the day. As soon as I get back to the office, I’ll be doing some in-depth performance monitoring.

As part of that problem, the SQL Customer Advisory Team recommended that transaction log writes happen in 2 milliseconds or less. It’s good to monitor wait queues, but for t-log drives, low wait queues aren’t enough.
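To see where my own t-log writes stand when I get back, I can check the file stats DMV; this is my sketch, not something from their session, and file_id 2 is the log file on databases with a single log file:

-- average write latency per database log file, in milliseconds
SELECT DB_NAME(database_id) AS database_name,
       1.0 * io_stall_write_ms / NULLIF(num_of_writes, 0) AS avg_log_write_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL)
WHERE file_id = 2;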

They highly recommended having a clustered index on every single table regardless of size. They gave great charts and statistics about the impact of adding a clustered index: it improved speed under all circumstances, even inserts. I took a camera phone picture of that slide just to send to my BI manager because he still questions that one.
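The change itself is simple; here’s a hypothetical heap getting a clustered key (the table and column names are mine, not theirs):

-- hypothetical warehouse heap getting a clustered index
CREATE CLUSTERED INDEX CX_FactSales ON dbo.FactSales (DateKey, FactSalesID);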

They’ve seen a problem with hotspots when inserting into tables with an identity field, but only with insert rates of >4500 inserts/second and >100 concurrent users. The fix is in the August cumulative update for SQL 2005, which I’ll be promptly applying on my dev & QA servers when I get back, because I’m pretty sure we’ve seen that issue with our ETL nightly loads. We run a lot of loads concurrently into the same tables.

The one shocker: the query optimizer will only do index seeks down to the level of the index where the user uses an inequality clause. That’s a mouthful, so lemme say it another way. Imagine a table with columns Col1, Col2, Col3 and Col4, all integers. It has an index on those columns, in order. A user writes this query:

SELECT * FROM MyTable WHERE Col1 = 100 AND Col2 > 350 AND Col3 = 135 AND Col4 = 151

The query optimizer will not seek on all four fields in the index, only Col1 and Col2. Col2 has an inequality clause (less than, greater than, not equal to, etc.), and that’s where the optimizer draws the line; the remaining columns just get checked against the rows it finds. Whoa. Gotta look into that.
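If I’m reading that right, the workaround is to order the index so the equality columns come first; this is my guess at the fix, not something from their slides:

-- original index: (Col1, Col2, Col3, Col4) - the seek stops at Col2's inequality
-- reordered so every equality column comes before the range column:
CREATE INDEX IX_MyTable_EqualityFirst ON MyTable (Col1, Col3, Col4, Col2);

With the reordered index, the optimizer can seek on Col1, Col3, and Col4, then range-scan just the qualifying Col2 values.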

Eric Hanson: Data Warehousing with SQL 2008

Eric specializes in the database engine, so he focused on the engine improvements for warehousing. I was most interested in the star join query plan improvements, and those alone would be worth the upgrade process. We use a lot of star join queries, and the query plans he showed looked exactly like the kinds of problems we’re having: inability to section out partitions, inability to filter down large subsets of data, etc.

He mentioned a new MERGE syntax for T-SQL that will do “upserts”: automatic inserts, updates, and deletes in a single statement, perfect for nightly ETL work.
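He showed syntax along these lines; the table and column names here are my own placeholders, so don’t hold me to the details until the docs ship:

MERGE dbo.DimCustomer AS target
USING dbo.StagingCustomer AS source
    ON target.CustomerKey = source.CustomerKey
WHEN MATCHED THEN
    UPDATE SET target.CustomerName = source.CustomerName
WHEN NOT MATCHED THEN
    INSERT (CustomerKey, CustomerName)
    VALUES (source.CustomerKey, source.CustomerName);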

He also covered a benefit of using indexed views for aggregates. Data warehouses use aggregate tables either by populating them manually as a part of the nightly ETL process, or by using indexed views. Indexed views have an advantage I hadn’t thought of: they will automatically be used in query plans whenever the engine detects that they’d speed up a query. That’s not the case with the aggregate tables we build manually, because the engine has no idea that those tables relate to our fact tables.
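Here’s a rough sketch of an aggregate indexed view; FactSales and its columns are invented, and indexed views come with strings attached like SCHEMABINDING and COUNT_BIG(*):

CREATE VIEW dbo.SalesByDate
WITH SCHEMABINDING
AS
SELECT DateKey,
       SUM(SalesAmount) AS TotalSales,  -- SalesAmount assumed NOT NULL, a requirement for SUM here
       COUNT_BIG(*) AS RowCnt           -- COUNT_BIG(*) is required when the view has GROUP BY
FROM dbo.FactSales
GROUP BY DateKey;
GO
-- the unique clustered index is what actually materializes the view
CREATE UNIQUE CLUSTERED INDEX IX_SalesByDate ON dbo.SalesByDate (DateKey);

One caveat I remember: the automatic plan matching is an Enterprise Edition feature; Standard Edition only uses an indexed view when you hit it directly with a NOEXPAND hint.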

SQL 2008 will include database compression and backup compression, two things I’d heard previously but I’ll mention here in case anybody hasn’t heard it. An attendee asked Eric if this would phase out third party compression utilities like Quest Litespeed and Idera SQLsafe, and Eric danced around it well. He said SQL 2008’s compression will be a first version, and the third party utilities may be more mature with more features, so they may still be worth the money. Good answer.

Bill Ramos: Performance Monitoring Using Management Studio

SQL Server 2008 also steps on the toes of third-party performance monitoring products by collecting the same types of runtime statistics DBAs pay extra for now. Bill demoed a very, very green version of the code; it has a long way to go, but it shows potential.

I was surprised that Microsoft would start up this part of the product at this point in the SQL lifecycle. Microsoft just got done sticking a fork in Microsoft Operations Manager 2005, and the whole tagline around System Center 2007 is that admins should monitor whole applications, not just single servers. For example, I don’t just want to know that my SQL Server is having problems; I want to know which applications are affected, and how that hurts the uptime of my entire application.

Performance Studio, on the other hand, was very server-centric and application-ignorant.

Plus, why implement this whole thing in the SQL stack? Why not include this inside of System Center? I don’t get it.

This was the one seminar I actually left early. I just didn’t see a long-term future for this piece of the product, at least not in its current incarnation. I was a little disappointed because I’d expected something else – not quite sure what.

At that point, I went back to the hotel room for dinner (Quizno’s, blah) and an early bedtime.  Lots of great stuff on the agenda tomorrow.

On to my Day Two Notes


2 Comments

  • You took some pretty good notes today. I agree that Kalen Delaney is a great speaker. Glad you are enjoying the conference.

  • Regarding the upgrade to SQL 2008 from SQL 2000: Microsoft always has supported (and presumably will continue to support) easy upgrades from the prior two releases, and that will be the case with SQL 2008 as well. I believe the best-tested upgrade path will almost always be the direct upgrade from the prior release. Even though MS will work hard to test from two releases back, I suspect that if they must focus more resources on one path, it would be the most recent upgrade. We should always test upgrades very well, but if I were upgrading from two releases back, I would expect more issues and manual work than I would generally expect from the most recent release.

    As a reminder, this is my opinion only.

    Thanks for attending PASS this year – it was a great week!

