Blog

manzanitaThank you for your web visit. We’re out of the office and will be back on Monday, August 3.

During this period we will have limited access to our email. We’ll still have full access to Twitter, Facebook, and Instagram. We apologize in advance about that, because we’re going to be posting a lot of photos about this year’s company retreat.

This year, we’re spending a week on the Oregon coast – specifically, Manzanita, an adorable little town in Tillamook County – population 598. We’ve rented a couple of houses for us, our spouses, and our dogs. (In the past, we’ve done our retreat on a cruise ship, and a couple of times in a beachfront house in Cabo.)

For immediate assistance, please contact Microsoft Support at 1-800-642-7676.

10 comments ↑ Back to top

I have to deliver a lot of bad news. It’s pretty much my full time job. Nobody calls us when things are going well. (Well, they tweet us, because that’s free.)

Sometimes, that bad news is very dangerous – especially for the company as a whole, or its customers.

In today’s consulting line, I’m working with a group of developers on a performance issue. We’re using sp_BlitzCache™ to check out the top resource-intensive queries on their system, and figuring out how to make them go faster.

The Conversation

Me: “It looks like this query returns the customer’s email address, password, birth date, and address.”

Larry: “Yeah, it’s the profile page. We show them their account.”

Me: “So this is stored in the database, and it’s not encrypted.”

Larry: “Yeah.”

Me: “And we’re on the development server that everybody in the company has access to, right?”

Larry: “Everybody.”

Me: “Okay, let’s stop for a second. I have to cover my own butt. Gimme a few minutes to document this, and I’ll include it as part of your written findings. I know you didn’t call me in for a security review, but I’d be a bad consultant if I didn’t put that in writing.”

"What do you mean the whole world knows that my password is p0ll0?"

“What do you mean the whole world knows that my password is p0ll0?”

What That Line Does

Putting a big security risk in writing is a career-limiting move.

This line helps you defuse that bomb.

Nobody wants to have something like this in writing – especially a written document that gets forwarded up the management chain. They’re going to want you to stop writing and just ignore it, but this line establishes that it’s not really your call to make. Nobody wants to be bad at their job – but ignoring huge, dangerous risks would make you bad at your job.

What Happens Next: The Easy Way

Larry: “OK, cool.”

Sometimes the rest of the team isn’t happy about the risk either, and they’re dying to have someone else champion the cause.

And you know what’s funny? The developers and sysadmins are often quietly high-fiving each other while you’re writing it down. But if there’s a manager in the room…

What Happens Next: The Hard Way

Moe the Manager: “Wait, that’s not why we brought you here. Let’s focus on the problem at hand.”

Me: “I totally understand. This will only take a few moments for me to document, and we can work a few extra minutes at the end of the day. I won’t bill you at all for this. I just have to write this down to cover my butt.”

Moe the Manager: “I can’t really have something like this in writing.”

Me: “Believe me, I totally understand why you’d say that. If I was in your shoes, I’d say the exact same thing. But I have to protect myself – say for a second that another Target, Anthem, AshleyMadison, or whatever happens here, and it gets out that I was the database consultant, and that the personally identifiable data was out in the wide open, and I didn’t tell you about the risk.”

Moe the Manager: “We won’t tell anybody.”

Me: “You say that now, but if you got hacked and the personal data got out, Curly the CEO would tell your lawyers to point the blame at me. They’d say I was a bad consultant because I didn’t alert you about this risk. I have to include this in my written findings to cover my butt. You can feel totally free to ignore it if everybody else in the company is comfortable with that risk, but I have to put it in writing or I’m a bad consultant.”

I’ve actually lost a couple of clients when this line went downhill, but you have to be careful with your own reputation. Nobody wants to hire the DBA who was on duty when one of these incidents went down.

For more fun, read more of my favorite consulting lines.

 

4 comments ↑ Back to top

I’m delighted and honored to be presenting two sessions at the 2015 PASS Summit in Seattle: “Living and Dying by Dynamic SQL” and “SQL Server Mystery: Dead Reports Don’t Talk”. Since I’ve never done either of these sessions at the Summit before, here’s a little more information about the sessions to help you decide whether to attend:

  • Although the titles differ, these are both murder mystery sessions, modeled after a murder mystery dinner party. Except we won’t be serving food.

    "Where were you on the night of the 27th?"

    “Where were you on the night of the 27th?”

  • Like a murder mystery party, you’ll be conversing with the people around you. You’ll discuss clues and work as a group to solve the mystery.
  • The suspects will be sitting in the crowd too; you may end up next to one of them.
  • The mystery loops through a pattern 3-4 times: topic – interview – clue. This means every 2-3 minutes, we’re moving on to the next step and keeping things lively.
  • The two main goals of the session are for you to learn more about 1) the technical topic and 2) the people you’re sitting with. The PASS Summit is as much (if not more so) a networking event as it is an educational event. I want to emphasize both points in the same session.
  • In the coming months, I’ll be retweeting people at Cromulent Technologies — people you’ve never heard of. If you’re thinking of coming to my session(s), pay special attention to these retweets. Cromulent Technologies is the workplace of the principal characters in our murder mysteries.
  • If you’re familiar with other speakers in the SQL Server community, you’ll probably recognize a few of the suspects.

I promise you, you’ve never seen SQL Server presentations quite like these. I hope to see you there! Got questions? Ask away in the comments!

Brent says: I love stuff like this that breaks up the monotony of conference sessions. Make me think in new ways and keep the conference lively – that keeps me engaged.

4 comments ↑ Back to top

i like questions!

Because I don’t always like talking to myself, and robots are sometimes murderous and insane. So when this one came up in the comments of my previous post, I thought it would make a good follow-up blog. It’s a really good question, and definitely one I found myself asking quite a bit when I first started writing queries.

I hope you like big words, veljasije!

I hope you like big words, veljasije!

the fancy pants answer

Is that this isn’t a processing order issue, but rather a set processing issue. When you group by two columns, the resulting set makes ordering by a column outside of that set ambiguous.

Huh?

Think about it. If two columns of values are spread out over a range of unique IDs in a third column, which ID belongs to which grouped set if you group by the two other columns? The highest ID? The lowest ID? Something in between?

demonstrate my syntax

A quick example using a take on FizzBuzz. We’ll call it NYHCBuzz.

;WITH E1(N) AS (
    SELECT NULL UNION ALL SELECT NULL UNION ALL SELECT NULL UNION ALL 
    SELECT NULL UNION ALL SELECT NULL UNION ALL SELECT NULL UNION ALL 
    SELECT NULL UNION ALL SELECT NULL UNION ALL SELECT NULL UNION ALL 
    SELECT NULL  ),                          
E2(N) AS (SELECT NULL FROM E1 a, E1 b, E1 c, E1 d, E1 e, E1 f, E1 g, E1 h, E1 i, E1 j),
Numbers AS (SELECT TOP (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS N FROM E2)
SELECT  
        IDENTITY (BIGINT, 1,1) AS [ID] , 
        ISNULL(CONVERT(DATE, DATEADD(MINUTE, -[N].[N], GETDATE())),     '1900-01-01') AS [SomeDate] ,
        CASE WHEN n.[N] % 15 = 0 THEN 'Sheer Terror'
             WHEN n.[N] % 5 = 0 THEN 'Madball'
             WHEN n.[N] % 3 = 0 THEN 'Skarhead'
             ELSE 'Warzone' END AS [NYHCBuzz]
INTO [LowerEastSide]
FROM    [Numbers] [N]
ORDER BY [N] DESC;

SELECT * 
FROM [dbo].[LowerEastSide] AS [les]
ORDER BY [les].[ID];

And here’s a little snippet of what you end up with. This hopefully makes my point a bit more clear. You can’t order by both ID 11 and 26, for example. They would be part of the same group, and they’re buried in a range with hundreds of other values.

Ordering by the ID column wouldn’t make sense. Which ID would each grouped set be associated with?

Who knows? Not even SQL.

I like this playlist.

I like this playlist.

Running this will get you a big fat error.

SELECT [les].[SomeDate], [les].[NYHCBuzz], COUNT_BIG(*) AS [Records]
FROM [dbo].[LowerEastSide] AS [les]
GROUP BY [les].[SomeDate], [les].[NYHCBuzz]
ORDER BY [les].[ID]


Msg 8127, Level 16, State 1, Line 26
Column "dbo.LowerEastSide.ID" is invalid in the ORDER BY clause because it is not contained in either an aggregate function or the GROUP BY clause.

So what can you do? You can’t group by the ID column, it’s a unique value per row. You’ll just get every row and a count of 1. That’s dumb and ugly.

You can work around it by telling SQL how to order within each group, with a query like this.

SELECT [les].[SomeDate], [les].[NYHCBuzz], COUNT_BIG(*) AS [Records]
FROM [dbo].[LowerEastSide] AS [les]
GROUP BY [les].[SomeDate], [les].[NYHCBuzz]
ORDER BY MIN([les].[ID]) --Look at me

Adding the MIN() function to the ORDER BY tells SQL to take the lowest ID per grouped set for ordering. If you’re curious as to how that helps, look at the results with this query.

SELECT MIN([les].[ID]) AS [MinID], [les].[SomeDate], [les].[NYHCBuzz], COUNT_BIG(*) AS [Records]
FROM [dbo].[LowerEastSide] AS [les]
GROUP BY [les].[SomeDate], [les].[NYHCBuzz]
ORDER BY MIN([les].[ID])

You can see pretty easily what SQL is doing here. For every grouped set of dates and names, you’re also grabbing the minimum ID value. The aggregation works to provide each set with a related value to also order by.

I would go to this show.

I would go to this show.

You can use any aggregate function of your choosing, really. I don’t know if any of these have a good business use case, but taking a look at the changes to both output ordering and ID column values is instructive when it comes to learning exactly how the aggregation is necessary to provide order to grouped sets.

SELECT MIN([les].[ID]) AS [MinID], [les].[SomeDate], [les].[NYHCBuzz], COUNT_BIG(*) AS [Records]
FROM [dbo].[LowerEastSide] AS [les]
GROUP BY [les].[SomeDate], [les].[NYHCBuzz]
ORDER BY MIN([les].[ID])

SELECT MAX([les].[ID]) AS [MaxID], [les].[SomeDate], [les].[NYHCBuzz], COUNT_BIG(*) AS [Records]
FROM [dbo].[LowerEastSide] AS [les]
GROUP BY [les].[SomeDate], [les].[NYHCBuzz]
ORDER BY MAX([les].[ID])

SELECT AVG([les].[ID]) AS [AvgID], [les].[SomeDate], [les].[NYHCBuzz], COUNT_BIG(*) AS [Records]
FROM [dbo].[LowerEastSide] AS [les]
GROUP BY [les].[SomeDate], [les].[NYHCBuzz]
ORDER BY AVG([les].[ID])

SELECT SUM([les].[ID]) AS [SumID], [les].[SomeDate], [les].[NYHCBuzz], COUNT_BIG(*) AS [Records]
FROM [dbo].[LowerEastSide] AS [les]
GROUP BY [les].[SomeDate], [les].[NYHCBuzz]
ORDER BY SUM([les].[ID])
Just can't aggregate enough.

Just can’t aggregate enough.

We cover all sorts of cool stuff like this (and more!) at Advanced Querying and Indexing. There’s even food. And prizes. And almost no one ever self-immolates, so that’s a big plus.

0 comments ↑ Back to top

For the last few years, we’ve hosted a Tuesday webcast to talk SQL Server.

We’re giving it a vacation in August, and then starting September 2, Office Hours is coming back on Wednesdays.

This is easier for us because of the way our SQL Critical Care® service works. We’re typically working with clients interactively on Monday and Tuesday, and then Wednesday we take a break to write the findings recap. It’s easier for us to take a midday break on Wednesdays.

Here’s a sampling of some of the questions we’ve answered at Office Hours:

  • What are your recommendations for service accounts? Unique domain accounts for each instance/service, shared domain accounts, or something else?
  • Can you explain why one database can be added to AG with just a full backup and another would need a transaction log backup as well?
  • Working on my first AG setup in 2012. I don’t quite understand how the listener works with DNS. Does the listener name get registered with each AG IP?
  • We’re looking into Amazon Web Services cloud with SQL Server, any thoughts or tips or things to be aware of?
  • Ever ran into an instance where database gets detached automatically after server restart?
  • What would you recommend as MAXDOP for a server with 80 cores?

To pick our brains and learn from other questions/answers, register for Office Hours here.

6 comments ↑ Back to top

This sounds really cheesy, but I’m honestly excited to be presenting again this year at kCura Relativity Fest 2015.

Here’s what I’ll be talking about:

How to Check Your SQL Server’s Health

The Abstract: You’re a system or database administrator responsible for the uptime and performance of Relativity’s SQL Servers, but you’ve never received professional training on SQL Server and you need to figure out if it’s safe. I’ve built up an array of free tools over the 15-plus years I’ve been working with SQL Server, and I’ll show you how to use them in this demo-packed session.

Why I’m Presenting It: For the last few years, kCura has sent me around to dozens of the biggest Relativity shops around to do a one-day SQL Critical Care®. I want to teach you that exact same process – and trust me, it’s way easier than you think. I’m not going to teach you how to put fires out – but I *am* going to teach you how to tell in just a few minutes if your server is on fire or not, and whether it’s a small trash can fire – or a big, blazing three-alarm monster.

Buying the Right Hardware for SQL Server

The Abstract: Ever wonder if there was a simple list of instructions for picking out the perfect infrastructure for Relativity at your organization? Wonder if you should use mirroring, replication, clustering, or AlwaysOn Availability Groups? Are you better off with solid state or shared storage? Is virtualization an option? Microsoft MVP Brent Ozar has worked with dozens of Relativity customers, and he’ll simplify your infrastructure options in this one-hour session.

Why I’m Presenting It: I see the same situation over and over again: your Relativity infrastructure started out small – just one SQL Server in the corner that you didn’t worry too much about – but man, has that thing grown. It’s become really important to the business, and people are stuffing data in there like it’s some kind of Chipotle data burrito. You need to buy exactly one replacement environment, and it’s gotta be right the first time. You don’t buy a lot of SQL Servers, so you want simple independent advice that you can take home to management. I’m going to hand you a few sketched-out options with budget ranges.

Want to Learn More About Relativity Fest?

Check out this video, and then head over to RelativityFest.com to register. It sells out, so move fast.

2 comments ↑ Back to top

i’ve seen this happen

Especially if you’ve just inherited a database, or started using a vendor application. This can also be the result of inexperienced developers having free reign over index design.

Unless you’re running regular health checks on your indexes with something like our sp_BlitzIndex® tool, you might not catch immediately that you have a heap of HEAPs in your database.

You may be even further flummoxed upon finding that someone thoughtfully created Primary Keys with Nonclustered Indexes on them, yet no Clustered Indexes. Unless the original developer is still around, the intent may not be clear.

Using this code snippet, you can quickly identify tables that were created with Nonclustered Indexes on the Primary Key, and no Clustered Index. Another way to spot this potential issue might be looking for RID lookups, or Table Scans in your Plan Cache. Wide Nonclustered Indexes may also be present to compensate for the lack of a good Clustered Index.

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT QUOTENAME(SCHEMA_NAME([t].[schema_id])) + '.' + QUOTENAME([t].[name]) AS [Table] ,
 QUOTENAME(OBJECT_NAME([kc].[object_id])) AS [IndexName] ,
 CAST((SUM([a].[total_pages]) * 8 / 1024.0 ) AS DECIMAL(18,2))AS [IndexSizeMB]
FROM [sys].[tables] [t]
INNER JOIN [sys].[indexes] [i]
ON [t].[object_id] = [i].[object_id]
INNER JOIN [sys].[partitions] [p]
ON [i].[object_id] = [p].[object_id]
 AND [i].[index_id] = [p].[index_id]
INNER JOIN [sys].[allocation_units] [a]
ON [a].[container_id] = CASE WHEN [a].[type] IN ( 1, 3 ) THEN [p].[hobt_id]
 WHEN [a].[type] = 2 THEN [p].[partition_id]
 END
INNER JOIN [sys].[key_constraints] AS [kc]
ON [t].[object_id] = [kc].[parent_object_id]
WHERE (
 [i].[name] IS NOT NULL
 AND OBJECTPROPERTY([kc].[object_id], 'CnstIsNonclustKey') = 1 --Unique Constraint or Primary Key can qualify
 AND OBJECTPROPERTY([t].[object_id], 'TableHasClustIndex') = 0 --Make sure there's no Clustered Index, this is a valid design choice
 AND OBJECTPROPERTY([t].[object_id], 'TableHasPrimaryKey') = 1 --Make sure it has a Primary Key and it's not just a Unique Constraint
 AND OBJECTPROPERTY([t].[object_id], 'IsUserTable') = 1 --Make sure it's a user table because whatever, why not? We've come this far
 )
GROUP BY [t].[schema_id] ,
 [t].[name] ,
 OBJECT_NAME([kc].[object_id])
ORDER BY SUM([a].[total_pages]) * 8 / 1024.0 DESC;

There are times when heaps are a valid choice

ETL or staging tables are the most common examples of when raw insert throughput is necessary, and a Clustered Index may not be.

But when tables are designed to be queried against and lack a Clustered Index, it’s usually a problem to be fixed.

Brent says: and remember, kids, SQL Server won’t suggest a clustered index.

Kendra says: if you think this probably didn’t happen to you, that’s a good sign you should doublecheck.

27 comments ↑ Back to top

During some testing with SQL Server 2014’s new cardinality estimator, I noticed something fun: the new CE can give you different index recommendations than the old one.

I’m using the public Stack Overflow database export, and I’m running this Jon Skeet comparison query from Data.StackExchange.com. (Note that it has something a little tricky at the top – it’s using a local variable for the @UserId, which itself makes for a different execution plan. When literals are used in the query, the behavior is different, but that’s another story for another blog post.)

First, here are the two different execution plans, both of which do about 1mm logical reads:

With the new cardinality estimator (compat 2014)

With the new cardinality estimator (compat 2014)

With the old CE (compat 2012)

With the old CE (compat 2012)

It’s a really subtle difference in the plans – at first glance, just looks like the 2014 CE removed a single operator – but the big difference is in the number of estimated rows returned:

  • Old CE estimated 82 rows returned
  • New CE estimated 352,216 rows returned

In actuality, 166 rows get returned with this particular input variable – the new CE is just flat out making bad guesses on this data.

Here are the different index recommendations:

CREATE NONCLUSTERED INDEX NewCE_OwnerUserId_Includes
ON [dbo].[Posts] ([OwnerUserId])
INCLUDE ([ParentId],[Score]);

CREATE NONCLUSTERED INDEX OldCE_OwnerUserId_PostTypeId_Includes
ON [dbo].[Posts] ([OwnerUserId],[PostTypeId])
INCLUDE ([ParentId],[Score]);

And when I run sp_BlitzIndex® after doing a little load testing, both missing index recommendations show up in the DMVs:

sp_BlitzIndex® output

sp_BlitzIndex® output

But surely the new CE’s recommendation is better. We’ll create just the one it recommends, and the resulting execution plan does 57k logical reads. Both the new CE and the old CE produce an identical plan, albeit with wildly different row count estimates (old 83, new says 37,423, actual is 166):

Execution plan with the new CE's recommended index

Execution plan with the new CE’s recommended index

HAHAHA, now the new CE agrees that it needs the index recommended by the old CE in the first place. So let’s remove the new CE’s recommendation, and only create the old CE’s recommended index. Both the old and new CE choose to use it:

With the old CE's recommendation

With the old CE’s recommendation

And even better, the old CE’s recommendation results in only 175 logical reads.

So what’s the takeaway? If you’re relying on the execution plan’s missing index recommendations for fast performance tuning, you’re not going to get the best results – no matter which cardinality estimator you’re using. With 2014, the recommendations are different, not necessarily better.

The real keys are knowing how to do it yourself, and we teach those in the Advanced Querying and Indexing 5-day in-person class next month.

3 comments ↑ Back to top

Holy cow, people, we have a lot of free stuff coming your way. Here’s just some of the highlights:

  • Digging Into THREADPOOL Waits
  • How Do You Fix a Slow TempDB?
  • Advanced Shared Storage
  • How to Prove Hardware is the Problem
  • SQL Server High Availability Options Explained
  • Watch SQL Server Break and Explode
  • The Art of Stress-Free Database Administration
  • Prove It! Collecting The Right Performance Metrics
  • What’s New in SQL Server 2016

To sign up for ’em all, just put in your contact info here. You don’t even have to check any boxes. Learnin’ doesn’t get much easier than that. These webcasts won’t be recorded, so be there or be square.

9 comments ↑ Back to top

… is to use a SQL Server where a robot does it for you!

Let’s take a look at how you can do this in AmazonRDS’s hosted SQL Server these days.

 Normally, restoring transaction logs is super tedious.

You’ve got to restore all the files in the right order, for every single database. Even if you script it, if you’ve got a bunch of databases you need to restore, it’s a giant pain.  AmazonRDS has essentially done all the scripting for you.

Amazon RDS makes sure it can always restore your SQL Server to a point in time

When you set up an RDS SQL Server instance, it asks you how long you’d like to keep backups, and if you have a preferred full backup window. And that’s all it asks. Transaction log backups are taken every 5 minutes to support point in time restores, no matter what you want.

Amazon RDS Instance Backup Settings

So I wondered, what happens if someone creates a database in the wrong recovery model by accident? Does this break the transaction log backups?

Nope! I created a database named DB1_CreatedInSIMPLE using the SIMPLE recovery model. Check out what showed up in the RDS logs right afterward:

RDS Recovery Model Set Back to Full

Alrighty then, we’ll do it your way

How Recently Can I Restore?

You can see log backups in the “Recent Events & Logs” above, or you can just look at the latest restore time for your instance. Since transaction log backups are every five minutes, you can’t always restore to one minute ago.

RDS Last Restore Time on Instance

OK, Let’s Restore This Thing! How Easy Is it?

I wasn’t lying when I said it was easy.  You just highlight your instance in the console and select “Restore to Point in Time” under instance actions.

Restore RDS Instance to Point in Time

Restore RDS Instance to Point in Time

This is going to create a whole new instance using the backups they’ve been taking. You get to pick the time you want to restore from. This will vary depending on how long the instance has existed, and what your backup window is set at. (Mine’s at 7 days, it can go up to 35 days.)

Restore RDS Instance to Point in Time- set time

You also get to pick the instance name you’re going to restore to, along with all sorts of properties and security settings:

Restore RDS Instance to Point in Time- set new instance name

Setting up the restored instance isn’t immediate. RDS has to set up a new Windows instance and apply a whole lot of your backups, and that takes time.

Restore may take a bit of time

After you get all your settings done, you click Launch to get your new RDS instance in motion to be created:

Restoring is launching a new instance

Our instance is Cooking

The instance goes through multiple phases. It automatically backs up the instance as part of the process if you’ve got backups enabled, just like any other instance.

New restored instance is creating

Once that completes, I can connect

Here I am, connected to my original and restored instances via SSMS on my desktop!

Connected to both RDS instances via SSMS

SQL Server in RDS Isn’t Perfect, But It Does a Lot For You

I am not a founding member of Team Cloud. But I meet a lot of people who struggle making sure that backups are being taken properly and that they can restore. They’re frequently teams of developers who don’t want to learn to deal with backups and restore. They also want things like database mirroring without having to learn it, and RDS handles that for them, too.

This isn’t for everyone. RDS limits the number of databases you can have on an instance. It doesn’t do all maintenance for you — you have to set up your own CHECKDB jobs, for instance, which you can do in the SQL Server Agent. You don’t get tools like Database Mail inside the SQL Server, or the ability to see the SQL Server Error log via SSMS (don’t worry, you can get to that and other logs in the AWS Console). You can’t use every version of SQL Server out there (2008R2 and 2012 only, currently). You can’t do Availability Groups in Amazon RDS, either, just database mirroring. (I’m a big fan of mirroring, so I’m not all that sad about that.)

Putting it All Together

Database as a service is a pretty compelling option these days. It doesn’t take away the role of the DBA, and it doesn’t handle everything for you — but the flip side of that is that you get a lot of flexibility and customization.

And dang, I love the ease of restoring that instance to a point in time.

Brent says: what I love about the cloud is that it takes away the parts of the DBA job that I never really liked in the first place.

Jeremiah says: I’m with Brent – I love getting rid of the parts of the job I never liked. Cloud lets me do that. And RDS lets me ease up management on a certain class of servers even more

Erik says:  I, for one, welcome our new Robot DBA overlords.

25 comments ↑ Back to top
css.php