RAM and Sympathy

With the release date for SQL Server 2016 finally announced

Everyone can start gearing up to gaze upon its far shores from the 2008R2 instance they can’t or won’t upgrade for various reasons. I’m excited for a lot of the improvements and enhancements coming along, and generally hope I’m wrong about customer adoption.

One annoyance with the new release is the increase in CPU capacity for Standard Edition, with no increase in RAM capacity. You can now have up to 24 cores on your Standard Edition box. Yep, another $16k in licensing! And they’ll all be reading data from disk. Don’t kid yourself about Buffer Pool Extensions saving the day; nothing is going to beat having your data cached in memory. How many people on Standard Edition have CPU-bound workloads?

Alright, now set MAXDOP and Cost Threshold to the right values. Anyone left?

Alright, check your missing index requests. Anyone left?
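
If you're wondering where those knobs live, here's a minimal sketch of setting both values; the numbers are placeholders, since the right MAXDOP depends on your core count and NUMA layout, and 50 is just a common starting point for Cost Threshold:

EXEC sys.sp_configure 'show advanced options', 1;
RECONFIGURE;
/* Placeholder values: tune for your own hardware and workload. */
EXEC sys.sp_configure 'max degree of parallelism', 8;
EXEC sys.sp_configure 'cost threshold for parallelism', 50;
RECONFIGURE;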

But Enterprise needs to be different

It’s already different. It already has a ton of features, including a plethora that smaller shops can’t or won’t ever touch. Full blown AGs, Hekaton, Page/Row Compression, ColumnStore, Online Index Create/Rebuild, Encryption, really, the list goes on and on. And c’mon, the HA/DR parts are what define Enterprise software to me.

24 cores and nothing on.

Having a fast ship is way different from having a ship that’s hard to sink.

So what’s the solution?

Microsoft needs to make money. I get it. There’s no such thing as a free etc. But do they really need to make Enterprise licensing money off of people who will never use a single Enterprise feature? Should a small shop with a lot of data really have to make a $5000 jump per core just to cache another 128-256GB of data? That seems unreasonable to me. RAM is cheap. Licensing is not.

I wouldn’t suggest à la carte pricing, because licensing is already complicated enough. What could make sense is offering higher memory limits to shops with Software Assurance. Say up to 512GB on Standard Edition. That way, Microsoft can still manage to keep the lights on, and smaller shops that don’t need all the pizzazz and razzmatazz of Enterprise Edition can still hope to cache a reasonable amount of their data.

If Microsoft doesn’t start keeping up with customer reality, customers may start seeking cheaper and less restrictive solutions.

Thanks for reading!

Brent says: Adding 8 more cores to Standard Edition answers a question no one was asking. It’s almost like raising the number of available indexes per table to 2,000 – hardly anybody’s going to actually do that, and the ones who do are usually ill-advised. (Don’t get me wrong – there’s some good stuff in 2016 Standard – but this ain’t one of ’em.)

Creating Tables and Stored Procedures in TempDB – Permanently

No, not #tables – actual tables. Here’s how:

USE tempdb;
GO

/* This one is only available during my session: */
CREATE TABLE #myTempTable (ID INT IDENTITY(1,1), Stuffing VARCHAR(100));
GO

/* This one is global, meaning it's available to other sessions: */
CREATE TABLE ##myTempTable (ID INT IDENTITY(1,1), Stuffing VARCHAR(100));
GO

/* You can create both of those at the same time. They're different. */

/* This one is just like a user table, but in TempDB: */
CREATE TABLE dbo.myTempTable (ID INT IDENTITY(1,1), Stuffing VARCHAR(100));
GO

The first one disappears when my session is over, but the latter two persist until the SQL Server is restarted.

Why would you ever do the latter two? Say you need to share data between sessions or between different applications, or you need staging tables for a data warehouse, or just faster tables that live on local SSDs in a cluster (as opposed to slower shared storage), or you wanna build a really crappy caching tier.

If you use global temp tables or user-space tables, though, you have to check for duplicates before creating your tables. Local temp tables are just all yours, and you can have a thousand users with local temp tables of the exact same name.
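
A minimal sketch of that existence check, reusing the global table from above:

IF OBJECT_ID('tempdb..##myTempTable') IS NULL
    CREATE TABLE ##myTempTable (ID INT IDENTITY(1,1), Stuffing VARCHAR(100));
GO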

Next up, the ever-so-slightly different magic of temporary stored procedures:

USE tempdb;
GO

/* This one is only available during my session: */
CREATE PROC #usp_myTempWorker AS
  SELECT * FROM sys.databases;
GO

/* This one is global, meaning it's available to other sessions,
    but ONLY as long as my session is available: */
CREATE PROC ##usp_myTempWorker AS
  SELECT * FROM sys.databases;
GO

/* This one is just like a user stored proc, but in TempDB: */
CREATE PROC dbo.usp_myTempWorker AS
  SELECT * FROM sys.databases;
GO

Here, the first TWO disappear when my session is over, and only the latter one sticks around. Diabolical. So the ## temp stored proc doesn’t really help me here because I can never tell when the creator’s session is going to finish. (Not God. His session keeps right on going.)

So why would you ever create stored procedures – temporary or user – in TempDB? You might not have permissions in the user databases, or might not be technically allowed to change things, or maybe you’ve got monitoring queries that you want to hide, or you want to create procs temporarily to check parameter sniffing issues.

All of the above will disappear when the SQL Server is restarted – or will they? Not if you create them permanently in the model database, which is the source of TempDB’s creation when SQL Server restarts:

USE model;
GO
CREATE PROC dbo.usp_myTempWorker AS
  SELECT * FROM sys.databases;
GO
CREATE TABLE dbo.myTempTable (ID INT IDENTITY(1,1), Stuffing VARCHAR(100));
GO

/* Now restart your SQL Server, and check in TempDB */
USE tempdb;
GO
EXEC dbo.usp_myTempWorker;
GO
SELECT * FROM dbo.myTempTable;
GO

Why would you ever wanna do this? Well, say you need to make sure that, uh, in case … look, I’m just an idea man. Somebody, somewhere, is looking for a really bad idea. That’s what I’m here for.

Where Clauses and Empty Tables

Sometimes SQL is the presentation layer

And when it is, you end up doing a lot of concatenating. This isn’t about performance, or about trying to talk you out of SQL as the presentation layer; it’s just something you should keep in mind. SQL is a confusing language when you’re just starting out. Heck, sometimes it’s even confusing when you’ve been doing it for a long time.

Let’s say you have a website that stores files, and when a user logs in you use a temp table to track session actions as a sort of audit trail, which you dump out into a larger table when they log out. Your audit only cares about folders they have files stored in, not empty ones.

Here’s a couple of tables to get us going.

IF OBJECT_ID('tempdb..#aggy') IS NOT NULL
    DROP TABLE #aggy;

WITH x1 AS (
    SELECT TOP (100)
        ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS ID
    FROM sys.messages AS m
    CROSS JOIN sys.messages AS m2
)
SELECT x1.ID,
    DATEADD(DAY, x1.ID * -1, CAST(GETDATE() AS DATE)) AS CreateDate,
    'C:\temp\' + CAST(HASHBYTES('MD5', NCHAR(x1.ID)) AS VARCHAR(32)) + '.gif' AS [Path]
INTO #aggy
FROM x1;

IF OBJECT_ID('tempdb..#usersessioninfo') IS NOT NULL
    DROP TABLE #usersessioninfo;

CREATE TABLE #usersessioninfo
    (LastActionID INT IDENTITY(1,1), UserID INT, UserMessage VARCHAR(100), MessageDetails VARCHAR(100));

And then we’ll stick some data into our session table like this.

INSERT [#usersessioninfo]
( [UserID] , [UserMessage] , [MessageDetails] )
SELECT 
@@SPID AS [UserID],
'Welcome to your folder!' AS [UserMessage],
'You have stored #' +
CAST(COUNT(*) AS VARCHAR(100)) +
' files in the last 30 days, starting on ' + 
CAST(MIN([a].[CreateDate]) AS VARCHAR(20)) + 
' ending on ' +
CAST(MAX([a].[CreateDate]) AS VARCHAR(20)) +
'.' AS [MessageDetails]
FROM [#aggy] AS [a]
WHERE [a].[CreateDate] >= GETDATE() -30

Everything looks great!

Select max blah blah blah

But if your table is empty…

You may find yourself with a bunch of junk you don’t care about! Empty folders. Contrived examples. Logic problems. Stay in school.

TRUNCATE TABLE [#aggy]

INSERT [#usersessioninfo]
( [UserID] , [UserMessage] , [MessageDetails] )
SELECT 
@@SPID AS [UserID],
'Welcome to your folder!' AS [UserMessage],
'You have stored #' +
CAST(COUNT(*) AS VARCHAR(100)) +
' files in the last 30 days, starting on ' + 
CAST(MIN([a].[CreateDate]) AS VARCHAR(20)) + 
' ending on ' +
CAST(MAX([a].[CreateDate]) AS VARCHAR(20)) +
'.' AS [MessageDetails]
FROM [#aggy] AS [a]
WHERE [a].[CreateDate] >= GETDATE() -30

What do you think is going to happen? We truncated the table, so there’s nothing in there. Our WHERE clause should just skip everything because there are no dates to qualify.

NULLs be here!

Darn. Dang. Gosh be hecked. These are words I really say when writing SQL.

That obviously didn’t work! You’re gonna need to do something a little different.

Having having bo baving banana fana fo faving

One of the first things I was ever really proud of was using the HAVING clause to show my boss duplicate records. This was quickly diminished by him asking me to then remove duplicates based on complicated logic.

HAVING is also pretty cool because it’s processed after the WHERE clause, so rows that make it past the WHERE can still be filtered out later on down the line. For our purposes, it will keep anything from being inserted, because our COUNT is a big fat 0. Zero. Zer-roh.

INSERT [#usersessioninfo]
( [UserID] , [UserMessage] , [MessageDetails] )
SELECT 
@@SPID AS [UserID],
'Welcome to your folder!',
'You have # ' +
CAST(COUNT(*) AS VARCHAR(100)) +
' files, starting on ' + 
CAST(MIN([a].[CreateDate]) AS VARCHAR(20)) + 
' ending on ' +
CAST(MAX([a].[CreateDate]) AS VARCHAR(20)) +
' in the last 30 days.'
FROM [#aggy] AS [a]
WHERE [a].[CreateDate] >= GETDATE() -30
HAVING COUNT(*) > 0

This inserts 0 rows, which is what we wanted. No longer auditing empty folders! Hooray! Everybody dance drink now!

Mom will be so proud

Not only did you stay out of jail, but you wrote some SQL that worked correctly.

Thanks for reading!

SQL Interview Question: “Talk me through this query.”

Last month’s post “For Technical Interviews, Don’t Ask Questions, Show Screenshots” was a surprise hit, and lots of folks asked for more details about the types of screenshots I’d show. Over the next few weeks, I’ll share a few more.

Normally I’d show this query as a screenshot, but for easier copy/pasting into comments, I’m showing it as code here.

CREATE PROC dbo.usp_ByCategory @Category NVARCHAR(20) AS
IF @Category = NULL SET @Category = 'Default'
SELECT i.itemID, i.itemName,
   COALESCE(po.Price, i.Price, 0) AS Price
FROM Items I
   LEFT OUTER JOIN PriceOverrides po
   ON i.itemID = po.itemID
   AND po.SaleStartDate >= GETDATE()
   AND po.SaleEndDate <= GETDATE()
WHERE i.Category = @Category

I’d say to the job candidate, “You’ve been asked to take a quick look at this code as part of a deployment. Explain what the business purpose of the code is, and tell me if there’s anything that concerns you.”

After a few days, I’ll follow up with my own thoughts in the comments.

SQL Server 2016 Release Date: June 1, 2016

It’s the news we’ve all been waiting for!

Microsoft just announced the SQL Server 2016 Release Date: June 1, 2016.

This PDF lays out the differences between editions, and here’s a few points that stand out:

  • Standard Edition now goes up to 24 cores, and still just 128GB max memory
  • Query Store is Enterprise Edition only (see update below)
  • Always Encrypted is Enterprise only, thereby killing its adoption rate among ISVs
  • In-memory analytics, R integration are Enterprise only
  • Business Intelligence Edition is gone with the wind
  • According to the newly released TPC-H benchmark Executive Summary, Enterprise Edition still costs around $7k USD per core

Great news! Let me know what you think in the comments.

UPDATE 6:30PM – Microsoft unveiled a more detailed feature comparison by edition, and this one says Query Store will be available in all editions (including Express!)

[Video] Office Hours 2016/04/27

This week, Brent, Erik, Jessica, Angie, and Tara discuss SQL service packs, partial restores, breaking replication, backups, as well as their favorite TV shows.

Here’s the video on YouTube:

You can register to attend next week’s Office Hours, or subscribe to our podcast to listen on the go.

If you prefer to listen to the audio:

Transcript:

Jessica Connors: Let’s start with the service pack question from Jay H. He asks, “When applying a SQL service pack to the passive node in a cluster, should one put it into pause mode, or is it okay to simply run the service pack?”

Tara Kizer: I’ve never put it into pause mode. I’ve patched hundreds, maybe thousands of servers.

Brent Ozar: Well but do you patch the passive? I mean you patch the passive and there’s no services running on it so you really don’t have to worry about it.

Tara Kizer: Yeah, nope.

Brent Ozar: Yeah, am I going to have to worry about something failing over to it in the middle of the service pack? I’ve never done it either. Talk about a first-world problem there. So I’ve never paused it. I don’t know, if I was probably going to get ambitious, like PowerShell script the whole thing out, I would probably put in some kind of pause there but I’m just not that guy.

Jessica Connors: What’s this about Terabyteasaurus Rex?

Brent Ozar: That is a presentation that I do about very large databases. There are things that you want to do to avoid very large databases in the first place. Then once you have one, how to cope with the aftermath. It’s like having your own little pet dinosaur. You have to plan for things ahead of time and don’t bite off more than you can chew.

Angie Walker: Like a Tamagotchi.

Brent Ozar: Like a Tamagotchi, yes. Only you can’t carry it around in your pocket.

Jessica Connors: Is that a … presentation you’re doing coming up?

Brent Ozar: That one’s actually from our regular training classes. I’m doing it for IDERA. IDERA pays us to periodically do free presentations for the community. We give them like our menu of private training courses and just go, “Here, which one do you want to buy for the community?” They go buy one and we give it out for free to everybody and off we go. So it’s really awesome how we work that with vendors.

Jessica Connors: Nice.

Brent Ozar: Everybody wins.

Jessica Connors: All right. This question is pretty vague from Abdullah. He says, “Hello. We have just received new hardware with 124 cores and 1 terabyte of memory to host many of our SQL instances.”

Tara Kizer: Dang.

Brent Ozar: Big spender.

Erik Darling: Bad news: the only licensed …edition.

[Laughter]

Brent Ozar: Wow. One word, virtualization. Virtualization. Normally, I’d never buy a host that big but I just wouldn’t do instance stacking. I wouldn’t run one Windows OS and then a lot of instance stacking. Have you guys ever run multiple instances on a server and what problems have you seen from it?

Erik Darling: Not willingly. I’ve inherited them and it’s always been like, okay, we’re going to really stifle this one which is less important because this other one is just choking it out anyway.

Tara Kizer: I used to have a four node cluster with eleven SQL instances on it.

Brent Ozar: Oh.

Tara Kizer: Yeah, the biggest challenge was installing patches. There was always some node that suddenly had to be rebooted. This was back before 2008. It was horrible. Even after you rebooted the server it would say, “Oh, that server needs a reboot.” It would take an act, a miracle, for all four nodes to agree it’s now time to patch. I would reboot like 20 times before it would say it. It was horrible.

Jessica Connors: Oh boy. Did you end up running them all? Did you end up consolidating?

Tara Kizer: We ended up upgrading to newer versions where you could install it passively. Back then the SQL instance had to be in the right state on all four nodes, or not the SQL instance exactly, but you had to patch all four nodes at the same time. So all four had to agree that it was ready to be patched.

Brent Ozar: Miserable.

Tara Kizer: It was horrible.

Brent Ozar: Still even today Windows patching is a giant pain in the rear. Do you want to take down all of the instances at once just in order to patch Windows? Windows patches come out kind of fast and furious. So virtualization adds that little bit layer of lower performance, and I’m not even going to go down the is virtualization slower or not. But worst case scenario, it’s a little slower and you deal with that. But then you get so much better management. Holy smokes. It’s way easier.

Jessica Connors: All right. Let’s talk about partial restores. “Can you recommend a good article on doing partial restores? I want to move my historical data to a separate NDF file so I can restore just the active portion right away so the users can use the database then restore the historical part.”

Tara Kizer: Do we have an article? I mean I know someone who does a session.

Brent Ozar: Who?

Tara Kizer: I’ve seen it twice.

Brent Ozar: Is it available publicly? Is there somewhere, a blog we can link to?

Tara Kizer: I don’t know. I’m sure she has a blog. Kimberly Tripp has a whole session on it. She’ll bring all these USBs, and she’ll unplug them to simulate losing a drive, and the database stays online because it’s all in memory. It’s really cool, but it has to do with the partial restores as well.

Brent Ozar: I bet if you go to hit Google and you do like “partial restore site:SQLSkills.com” I bet she’s got blog posts on it too because it’s one of her more famous demos. We talk about it in our training classes but I was just sitting there thinking, I don’t think we have a single public post on it.

Erik Darling: I was working on something similar when I was messing around with some foreign key stuff but SQL outsmarted me so I didn’t end up writing the post.

Tara Kizer: Erik will take care of that in the next 15 minutes though.

[Laughter]

Brent Ozar: Erik “the blogger” Darling.

Erik Darling: Oh, get out of here.

Brent Ozar: There’s Bob Pusateri, @SQLBob on Twitter. Bob Pusateri has an article too on his site to help move your stuff into a historical file group. He has really nice scripts that help you rebuild objects onto other file groups which is way trickier than it looks like. There’s things like off-row data that don’t move by default.

Erik Darling: If you’re on a version prior to 2012, a lot of that stuff is still offline, especially for large VARCHAR and NVARCHAR types.
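
If you want to experiment with the piecemeal approach the question describes, here’s a rough sketch; the database, filegroup, and backup names are made up, and the exact sequence depends on your recovery model and whether the historical filegroup is read-only:

/* Restore the primary and active filegroups first, leaving history offline: */
RESTORE DATABASE MyBigDb
    FILEGROUP = 'PRIMARY', FILEGROUP = 'Active'
    FROM DISK = 'X:\Backups\MyBigDb.bak'
    WITH PARTIAL, NORECOVERY;
RESTORE DATABASE MyBigDb WITH RECOVERY; /* users can get in now */

/* Later, bring the historical filegroup back online: */
RESTORE DATABASE MyBigDb
    FILEGROUP = 'Historical'
    FROM DISK = 'X:\Backups\MyBigDb.bak'
    WITH RECOVERY;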

Jessica Connors: All right. A question from John. He says, “When looking at statistics properties, it gives the date and time when statistics were last updated. How can I tell if that statistics object was last updated using a full scan or a sample scan, and what that sample scan value was?”

Erik Darling: I know all of this.

Jessica Connors: Do you?

Erik Darling: Off the top of my head. So if you run DBCC SHOW_STATISTICS on the table and index that you’re interested in, you can either run the whole command or you can run it with stat header which will just give you the top row of output. There will be two columns in the stat header part. One will be rows and one will be rows sampled. If rows equal rows sampled, then you did a full scan. If rows sampled is less than rows, then it used a percentage. You can also hit a function sys.dm_db_stats_properties. If you cross apply that with sys.stats, you can pass an object ID and the stats ID and that will also tell you rows sampled versus rows in the table. So you can figure out all that stuff there. If you want to calculate a percentage, just put a calculation in your queries. That’s the way to tell.
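
Here’s a minimal sketch of both checks, against a hypothetical dbo.MyTable and index IX_MyIndex; sys.dm_db_stats_properties requires 2008 R2 SP2 or 2012 SP1 and up:

/* Option one: compare Rows to Rows Sampled in the stats header: */
DBCC SHOW_STATISTICS ('dbo.MyTable', 'IX_MyIndex') WITH STAT_HEADER;

/* Option two: the DMV route, with the sample percentage calculated for you: */
SELECT s.name, sp.last_updated, sp.rows, sp.rows_sampled,
    100.0 * sp.rows_sampled / NULLIF(sp.rows, 0) AS sample_percent
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE s.object_id = OBJECT_ID('dbo.MyTable');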

Jessica Connors: Thanks, Brent Ozar.

[Laughter]

Jessica Connors: Why are you still Brent Ozar?

Brent Ozar: My alter ego. He’s logging in first to start the webcast because I’m shlumpy, lazy, and I don’t show up on time. So god bless him.

Erik Darling: Brent is taking cold drugs and hanging around.

Brent Ozar: Yes.

Jessica Connors: Got ya. Okay, question from Cameron. He says, “If you want to purposely break replication…”

Brent Ozar: What?

Jessica Connors: Why would you purposely break replication?

Brent Ozar: Is he trying to drop the microphone on his way out the door?

Angie Walker: Should we continue answering this question?

[Laughter]

Jessica Connors: “Is it better to unsubscribe from the slave database or should you delete the publication from the master?”

Brent Ozar: Oh, he wants to do a restore.

Tara Kizer: I haven’t used that term, slave database. I assume he’s referring to the subscriber, for a restore.

Brent Ozar: Subscriber.

Tara Kizer: I just right-click on the publication and say drop it and it takes care of everything.

Brent Ozar: Like it’s hot.

Tara Kizer: But no, you don’t delete the publication from the… I’m confused by the terms that he’s using.

Brent Ozar: I bet, so one thing I would say is I would do Tara’s approach rather than trying to remove any subscriber’s setups because what happens if you leave the publisher’s setup and somebody doesn’t do the restore? Like I’m assuming you’re trying to restore the publisher, not the subscriber. You can leave a hosed up replication setup behind. So as long as there’s only one subscriber, just delete it at the publisher.

Tara Kizer: If you end up with the hosed situation, you can run sp_removedbpublisher, something like that. It’s remove something. Remove db something. That will just clean up anything that was left behind.

Brent Ozar: That’s how you know somebody has worked with replication before.

Tara Kizer: Yes.

Brent Ozar: She’s like looking up the rest.

Tara Kizer: Failed over to the DR site and forgot to drop replication beforehand, and it was orphaned at that point. It’s like, what is that command?

Brent Ozar: Oh god.

Jessica Connors: Tara has the best stories.

Tara Kizer: Lost lots of sleep.

Brent Ozar: God bless.

Jessica Connors: Yeah, you could have your own podcast. Yeah, that’s what he says, he says it’s just to do a restore. He’s not in there just for the sake of breaking stuff now that we know his first and last name.

Tara Kizer: I know how to break replication too. He meant drop it.
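
For the record, the command Tara is reaching for is sp_removedbreplication, which cleans out replication metadata left behind in a database; the database name here is a placeholder:

EXEC sys.sp_removedbreplication @dbname = N'MyOrphanedDatabase';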

Jessica Connors: All right. Let’s talk about stack dumps. Chris has a question. He says, “My SQL error logs show a stack dump. Total server memory is 16GB, max memory is set just above 13GB, LPIM is enabled. Available memory went to zero just before the crash according to my monitoring software. I’m thinking I should lower the max memory or disable LPIM. What do you think?”

Tara Kizer: You need to figure out what is using the memory, but I don’t think your 2.5GB is enough for the OS. The 2.5GB that’s left over, I mean.

Brent Ozar: Yeah, when Tara says find out what’s using the memory, it’s not SQL Server, or at least not the engine. It could be other things in SQL Server like Integration Services or Analysis Services or whatever, but you set max memory and you turned lock pages in memory on. So that’s cool. That amount is locked, but now, and people will often say, “I should turn on lock pages in memory. That way I don’t have to worry if something else needs RAM.” Hell yeah you do. You just did. You just suffered a crash because of it. SQL Server couldn’t back down on the amount of memory it needed. So now your fun journey begins to go troubleshoot what is using that extra memory. What would you guys use in order to find out what tools or what apps are using the extra memory?

Tara Kizer: Well I had low memory on my laptop on Monday during a client session. So after I was finally able to investigate it after the call, in either the app log or the system log, it told me the top three processes that were using the memory. It was two SQL Server instances and WebEx; those were the top three. I’m not too sure if that would be seen if your actual server crashed though, but maybe. There might be low memory alerts in there leading up to the crash.

Brent Ozar: You’re right on the thing in saying lock pages in memory, should I maybe turn it off. I would while I’m doing the investigating. Just leave it off to prevent it, because this other app is probably going to fire up again before you have the chance to fix it.
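
To rule out the SQL Server side while you investigate, the memory clerk DMV is a quick first stop. A minimal sketch, assuming SQL Server 2012 or later where the pages_kb column exists:

/* Which memory clerks are holding the most memory right now? */
SELECT TOP (10) type, SUM(pages_kb) / 1024 AS memory_mb
FROM sys.dm_os_memory_clerks
GROUP BY type
ORDER BY memory_mb DESC;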

Jessica Connors: All right. So let’s talk about what to do when your backups fail. Fred has a question. He says, “CHECKDB is successful and our backups always complete successfully, but trying to restore the backup gives an error that the backup failed, the data is invalid. Any thoughts on where to look?” He’s running SQL 2008 R2 Enterprise. Not using any Enterprise-only features. Four backups later we usually get a good backup that we can restore from.

Erik Darling: My first question is: are you running backups with checksums enabled, and do you have page verification turned on for that database? Because you could at least narrow down where the issue is happening. So if you have page verification turned on, SQL will start writing checksums to your pages to make sure that nothing wonky happens to them on disk. Then if you run backups with checksum, SQL will check those checksums as it does the full backup. So you at least have something verifying there that it’s not SQL and that it’s something with your disk media, or something happening when you’re transferring files over. The only time I’ve ever seen that happen was when I was restoring a backup on a dev server and it turned out that one of the disks on the dev server was actually suffering from some malady, some lergy. So that was the issue on that. So I would absolutely verify that it’s not something happening on the primary instance, and then my investigation would be on whatever hardware I have on the instance I’m trying to restore it to.

Brent Ozar: I’ve seen it when you write to crappy storage for your backups, like the backup reports the writes as successful, but then the data is trash when you go to read it. But I would like to say, just like the great singer Meat Loaf, two out of three ain’t bad. Four out of five successful backups, that’s not such a bad number. 80 percent, that’s a passing score. You probably didn’t need that one out of five. It’s probably not that big of a deal. But yeah, I would just try immediately after the backup finishes, try doing a restore with VERIFYONLY, either from the same SQL Server or from another SQL Server, and that will at least tell you if the backup file is hosed.
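
Here’s a sketch of the safety net Erik and Brent describe; the database name and paths are placeholders:

/* Have SQL Server checksum pages as it writes them: */
ALTER DATABASE MyDatabase SET PAGE_VERIFY CHECKSUM;

/* Verify those checksums while taking the backup: */
BACKUP DATABASE MyDatabase
    TO DISK = 'X:\Backups\MyDatabase.bak'
    WITH CHECKSUM;

/* Cheap sanity check on the finished file: */
RESTORE VERIFYONLY
    FROM DISK = 'X:\Backups\MyDatabase.bak'
    WITH CHECKSUM;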

Jessica Connors: All right. Back to replication.

Brent Ozar: How come nobody ever says, “What do you guys think about flowers? What’s your favorite kind of chocolate?”

Erik Darling: How much do you like [inaudible: Casa Playa]?

Brent Ozar: It’s dreamy.

Jessica Connors: Let’s see. Question from John. He says, “Is it possible to mirror via transactional replication a SQL 2008 R2 database to a SQL 2016 database?”

Brent Ozar: Wow. I bet it would be.

Tara Kizer: I think so since 2016 goes all the way down to 2005 compatibility level.

Brent Ozar: Yeah, I bet you could.

Erik Darling: The only thing I’ve ever seen stand in the way is backwards stuff.

Tara Kizer: Mirroring and transactional replication, replication doesn’t really care about versions. Mirroring does.

Brent Ozar: Yeah, he should be fine.

Tara Kizer: Either way, it should be fine.

Jessica Connors: Can you just upgrade both to 2016?

Brent Ozar: I bet he’s so tired of running with his 2008 R2 box and he’s like just trying to give his users something that has nice, new functionality on 2016. That’s probably what it is. He’s like, “Here, go query over here. It’s really nice and fun.” Maybe he’s got nice nonclustered column store indexes on the table over there, make his queries fast. Maybe that’s what it is.

Jessica Connors: Kanye or Wanye West.

Brent Ozar: Wanye. [Laughter] Oh, Wayne, you are never going to live that down, Wanye.

Jessica Connors: I think that’s a good question. Where is Richie?

Brent Ozar: Oh, he’s in Disneyworld.

Jessica Connors: Of course he is. He’s always getting lost at Disneyworld.

Tara Kizer: Driving home.

Angie Walker: Yeah, he’s on the road.

Jessica Connors: Let’s see here. Question from Jay H. He says, “Last year, after applying a couple of particular Windows KB updates, issues arose with JDBC TLS connections and they had to be removed. Has Microsoft fixed this and can the updates now be applied?”

Brent Ozar: I remember seeing Aaron Bertrand blog about this. This is one of those SSL-and-the-connection-string things I’ve never paid too much attention to, but I think Aaron Bertrand blogged about this. Other people are nodding like we vaguely have seen something along these lines.

Erik Darling: Yeah. I’ve just seen some stuff sitting around 2016 with TLS 1.1 and 1.2 having some weirdness bug things.

Brent Ozar: Yeah, we don’t touch it with a ten-foot pole. If you search for SQL Server TLS Aaron Bertrand, and Aaron is A-A-R-O-N Bertrand, I bet you’re going to find a blog post in there where he went into detail on that. Because like anything else with SQL Server updates, Aaron Bertrand looks over those with a magnifying glass and a fine-tooth comb.

Erik Darling: In a kilt.

Brent Ozar: In a kilt. So Rusty Householder already replied with the answer, blogs.SQLSentry.com TLS support. I’m going to put that in the resources for everybody there. But yeah, it is a thing that Aaron blogged about.

Jessica Connors: Let’s see. This is the last comment/question. People have been pretty quiet today. From Chris Wood, he says, “Thanks for the help on the blocking.” You helped Chris with blocking?

Brent Ozar: I believe we did last week I think. I think we did.

Jessica Connors: Via Critical Care?

Brent Ozar: Oh no, it was a question about—I remember this. It was a database restore involving Relativity. He was doing a database restore on Relativity and I think we posted the question on Stack Exchange as well. sp_WhoIsActive showed blocking and we couldn’t figure out which query it was that was doing the blocking. Turned out it was a system SPID that was doing a full text crawl. So when he finished doing the restore, it did a full text crawl and it locked some of the tables in the database. People weren’t allowed to access them. Awesome. Got to love that. People are like wow…

Erik Darling: I could have answered that one.

Brent Ozar: Oh could you, have you had that same problem?

Erik Darling: Yeah, embarrassingly.

Brent Ozar: Unbelievable.

Jessica Connors: Angie is getting a shout out. Were you posting on #SQLhelp?

Angie Walker: Nope. [Laughter] Apparently I have an impersonator. I didn’t think there were any other Angies out there. Oh, Tara, ah. It’s pretty hard to tell us apart, I know.

Brent Ozar: Just one of us cartoons, all our cartoons look alike.

Erik Darling: I get mistaken for Jessica all the time.

[Laughter]

Erik Darling: They’re like, “Hey we need to buy some stuff.” I’m like…

Brent Ozar: Unsubscribe.

Jessica Connors: Yeah. Let’s see, Brent. Brent’s an awesome name. He says, “Any experience with Experian Correct Address, which has to be installed on the database server for SQL CLR? Do you have any experience there?”

Brent Ozar: Oh, I’m vaguely remembering that this calls a web service. That it goes and validates people’s addresses. The way that it does it, whenever you want to like validate someone’s address you call an extended stored procedure, a CLR stored proc and it goes off and it calls a web service. So it’s possible that this is working on one node and not another because of firewall rules or network permissions. Windows Firewall, UAC, I mean, it could be almost anything that involves accessing the interwebs.

Erik Darling: I hate to say it but this is actually something that Master Data Services is good at.

Brent Ozar: Really?

Erik Darling: Yeah. You can do like address lookups and have like post office integration where you can get whatever like you know a post office valid address for a thing is, it can validate that. I don’t know a ton about it because I’ve only ever seen it in action a couple times but that’s actually something that Master Data Services does well which I feel filthy saying.

Brent Ozar: That’s a couple times more than me. I’m guessing it didn’t require CLR then, it was probably just stuff built into the database server?

Erik Darling: Yeah, but I don’t know how it was called so it still might have been CLR but it was integrated with Master Data Services. So it was like SQL friendly. It didn’t need to call out to anything else. It was already like built in somewhere.

Brent Ozar: Doing its thing.

Erik Darling: Yeah.

Jessica Connors: Let’s see. I think we talked about this last week. Nick Johnson, he says, “I found a couple articles that talk about how compatibility level 90 (2005) does not work in SQL 2014. Do you guys have any confirmation on that? Even though 2014 in fact shows it, it doesn’t work.”

Erik Darling: You can’t upgrade directly, isn’t that it? Or is that from 2000?

Brent Ozar: I can’t remember either.

Angie Walker: I think in general, you can’t do more than two versions, right? You couldn’t go from 2005 to 2012 or straight to 2014. You’d have to make the hop in between.

Tara Kizer: You can’t restore but the compatibility level is there. So on 2014 you can go all the way down to 2005 compatibility level. Like he’s saying, the option is there but apparently some articles are saying it doesn’t work. I don’t know. I see it.

Brent Ozar: Yeah, I vaguely remember that during the release process, like it worked in SSMS but the ALTER didn’t work, like the ALTER DATABASE didn’t work. I think people thought, “It’s not going to work when it finally releases,” like that they’re going to yank 2005 compat. But I’m pretty sure it still does work because I distribute Stack Overflow in that format too and Doug was asking questions about that this week. It should work fine. I don’t know if it runs in 2016, if 2005 compat mode runs in 2016. No clue. And who the hell would use that? Why would I swear in the middle of a podcast? Who knows. Don’t do that. Don’t do that. By that, I mean 2005 compat mode, not swearing. You should totally swear. You’re a grown person.

[Laughter]

Jessica Connors: All right, Abdullah asks, or states, that he’s in the process of building a DR site on a multisite topology. Any recommendations for SAN replication?

Erik Darling: The fast one.

Tara Kizer: It’s going to be expensive.

Brent Ozar: And the cheap one.

[Laughter]

Brent Ozar: Yeah, you don’t get much of a choice. It’s whatever brand your SAN supports. I think EMC makes stuff as well, like appliances that will replicate between SANs. But cha-ching.

Erik Darling: Also make sure that you are deleting old snapshots and copies because you can find yourself—depending on like how big these copies are, deleting them could take a very long time so you want to make sure that you have enough room to both copy and have other stuff being deleted off. Because deleting 20 terabytes of SAN replication snapshots is time consuming no matter what.

Jessica Connors: All right, well I guess while we’re on the topic of that, do we like async DB mirroring as a DR strategy?

Tara Kizer: Yes.

Erik Darling: Yes.

Tara Kizer: I do. I used it for years. When I joined a company three years ago, we were on SQL 2000 using log shipping, and we only upgraded to 2005. We were relieved to get rid of log shipping just due to the amount of work it took to fail over to the DR site for a lot of servers. It was a lot of work. We were real excited about mirroring. We used asynchronous mirroring between our two sites. They were about 300 miles apart. We could not have done synchronous database mirroring because of the drastic performance degradation that you would have. But async mirroring worked great. We failed over to the DR site regularly, two, three times a year. Ran production out of it for a few weeks, then failed back. It works great. All you have to do is, when you want to fail over, set it to synchronous mode, let it catch up, then do your failover and set it back to async.
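
Tara’s failover dance, sketched in T-SQL; the database name is a placeholder, and the first two commands run on the current principal:

/* 1. Flip to synchronous mode and wait for the mirror to catch up: */
ALTER DATABASE MyDatabase SET PARTNER SAFETY FULL;

/* 2. Once the mirroring state shows SYNCHRONIZED, fail over: */
ALTER DATABASE MyDatabase SET PARTNER FAILOVER;

/* 3. Then set it back to async, this time on the new principal: */
ALTER DATABASE MyDatabase SET PARTNER SAFETY OFF;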

Jessica Connors: There’s a Trump DBA handle on Twitter. Are you guys familiar with this?

Brent Ozar: Yes. I encourage humor in the SQL Server community. It’s not me doing it. That’s somebody else doing it. But I’m always like, if people want to have fun, that’s kind of cool.

Jessica Connors: What about DBA Reactions? Are you still involved with that?

Brent Ozar: I am. I took a brief hiatus from it while other people kept submitting stuff. Other people were submitting so many things. So I just went in and approved them all the time. Then of course I fell back in love with GIFs. I’m like, “Oh, let me go in and look and see because it’s been a while.” Oh my god, there’s so many good GIFs these days. So I started queueing them up again. I think I’ve already gotten them written through most of next week.

Jessica Connors: Oh, so you’re still doing that.

Brent Ozar: Yeah, I love it. I have a disturbing amount of fun with it.

Jessica Connors: They used to get a newsletter, the DBA Reactions.

Brent Ozar: I cut it back to like only once a week. It was going Tuesdays and Thursdays. Now it’s down to either just Tuesday or Thursday, I forget which one because I didn’t want to overwhelm people’s email boxes. There are like 5,000 people signed up to get this in their email box every week.

Jessica Connors: Yeah. I remember the time that somebody called for SQL Critical Care and they’re like “I heard about you guys from DBA Reactions. I love … I’m like, “we made a sale.”

[Laughter]

Brent Ozar: Well and at that point they’re going to call us because anybody who’s crazy enough to go, “I like DBA Reactions, they’re my people,” they already know exactly what we’re like. They know exactly how we work.

Jessica Connors: That’s fair. Cool, well you guys are being fairly quiet today so I think we’ll end there.

Brent Ozar: Thanks everybody for hanging out with us. We will see you guys next week.

Updated the StackOverflow SQL Server Database Torrent to 2016-03

The StackOverflow XML Data Dump was recently updated with 2016-03 data, so I’ve updated our torrent of the SQL Server database version of the Stack Overflow data dump.

Fun facts about the database and its real-world-ness:

  • 95GB in size
  • 29,499,660 posts spanning 2008-07-31 to 2016-03-06
  • 5,277,831 users spanning ages from -972 to 96 (just like real world data, you can’t trust it)
  • 46,306,538 comments (227 of which have the F-bomb)
  • Every table has a clustered key on an Id identity field, and has relationships to other tables’ Ids (again, much more real-world-ish)
  • Lots of lumpy data distribution and sizes, making it fun for parameter sniffing demos
  • Case-sensitive collation (because if you’re going to share scripts online, you want to get used to testing them on case sensitive servers – this stuff exists out in the real world)
  • 1,305% cooler than AdventureWorks

Here’s how I built the torrent:

128GB USB3 flash drives with the StackOverflow database that we use in our training classes

In our AWS lab, we have an m4.large (2 cores, 8GB RAM) VM with SQL Server 2005. We use that for testing behaviors – even though 2005 isn’t supported anymore, sometimes it’s helpful to hop in and see how things used to work.

I still use 2005 to create the dump because I want the widest possible number of folks to be able to use it. (This is the same reason I don’t make the database smaller with table compression – that’s an Enterprise Edition feature, and not everybody can use that.) You can attach this database to a SQL 2005, 2008, 2008R2, 2012, or 2014 instance and it’s immediately usable. Keep in mind, though, that it attaches at a 2005 or similar compatibility level. If you want 2014’s new cardinality estimator, you’ll need to set your compat level to 2014 after you attach the database.
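
If you do want the newer cardinality estimator after attaching, it’s a one-liner; 120 is the SQL Server 2014 compatibility level:

ALTER DATABASE StackOverflow SET COMPATIBILITY_LEVEL = 120;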

I downloaded the Stack Exchange data dump on that 2005 VM. It’s a little confusing because the Archive.org page says it was uploaded on 1/21/2014, but that’s just the first date the file was published. The top update date of March 1, 2016 is the current version you’ll get if you use the download links at the top right of the page.

To make the import run faster, I shut the VM down, then changed its instance type to the largest supported m4 – an M4 Deca Extra Large with 40 cores and 160GB RAM for $4.91/hour – and booted it back up. (Don’t forget to revisit your SQL Server’s max memory, MAXDOP, and TempDB settings when you make changes like this.)

I created an empty StackOverflow database, then fired up the Stack Overflow Data Dump Importer (SODDI), an open source tool that reads the XML data dump files and does batch inserts into a SQL Server database. I pasted in a connection string pointing to my SQL Server – ConnectionStrings.com makes this easy – and off it went:

SODDI importing the StackOverflow XML dump

The import finished in about 25 minutes, although it turns out the extra cores didn’t really help here – SODDI is single-threaded per import file:

Using a few threads while we import a few files

After SODDI finished, I stopped the SQL Server service so I could access the ~95GB data and log files directly, and then used 7-Zip set to ultra compression and 32 cores, and the CPU usage told a slightly different story:

Whammo, lots of active cores and 16+GB memory used

After creating the 7z file, I shut down the EC2 VM and adjusted it back down to m4.large. I created a torrent with uTorrent, then hopped over to my Whatbox. Whatbox sells seedboxes – virtual machines that stay online and seed your torrent for you. They’re relatively inexpensive – around $10-$30/mo depending on the plan, and I just go for unlimited traffic to make sure the database is always available.

To double-check my work, I fired up my home BitTorrent client, downloaded the torrent, extracted it, and attached the database in my home lab. Presto, working 95GB StackOverflow database.

Now, you can go grab our torrent of the SQL Server database version of the Stack Overflow data dump. Enjoy!

My Favorite Database Disaster Stories

The statute of limitations has passed, so this week on SQL Server Radio, I get together with Guy Glantser and Matan Yungman to talk about our favorite oops moments.

I talked about my very first database disaster ever – done back when I was in my 20s and working for a photo studio, using Xenix, long before I ever thought I wanted to be a database administrator. (Yes, kids, Microsoft had their own Unix thirty years ago, and suddenly I feel really, really old. No, I wasn’t using it thirty years ago.)

This episode was so much fun because we recorded it in-person, together, gathered around a table in Tel Aviv when I was there for SQLSaturday Israel 2016. I really love talking to these guys, and I think you can hear how fun the chemistry is on the podcast.

Head on over and listen to our disaster stories, and when you’re done, check out my classic post 9 Ways to Lose Your Data.

One weird trick for managing a bunch of servers

Let’s face it, most people don’t have just one SQL Server

How many they tell Microsoft they have is another matter, but let the record show that I don’t condone licensing dishonesty. But going one step further, most places… Well, they’re ‘lucky’ if they have one DBA, never mind a team.

Everyone else: Give me your network people, your sysadmin, your huddled SAN group yearning to breathe free, the wretched refuse of your teeming developers.

Doing things on one server is aggravating enough. Doing things on a bunch of servers is even worse. Given some of today’s HA/DR features (I’m looking at you, Availability Groups, with your lack of a mechanism to sync anything outside of user databases. Rude.) people are more and more likely to have lots of SQL Servers that they need to tend to.

Sometimes just keeping track of them is impossible. If you’re one guy with 20 servers, have fun scrolling through the connection list in SSMS trying to remember which one is which. Because people name things well, right? Here’s SQLVM27\Instance1, SQLVM27\Instance2, SQLVM27\Instance3, and that old legacy accounting database is around here somewhere.

Register it and forget it

But don’t actually forget it. If you forget it and it goes offline, people will look at you funny. Turns out people don’t like offline servers much.

So what’s someone to do with all these servers? Register them! Hidden deep in the View menu of SSMS is the Registered Servers window.

Hi there, handsome.

It will look pretty barren at first, just an empty folder. But you’ll fill it up quick, I’m sure. Can never have enough servers around, you know.

It’s pretty easy to populate: you can right-click on the Local Server Group folder, or on servers you’re connected to in Object Explorer.

This

That

Either way, you get the same dialog box to add a server in. You can give it a friendly name if you want! Maybe WIN03-SQL05\Misc doesn’t tell a good story.

Joy of joys

And if you hip and hop over to the Connection Properties tab, you can set all sorts of nifty stuff up. The biggest one for me was to give different types of servers different colored tabs that the bottom of SSMS is highlighted with. It’s the one you’re probably looking at now that’s a putrid yellow-ish color and tells you you’re connected and that your query has been executing for three hours. Reassuring. Anyway, I’d use this to differentiate dev from prod servers. Just make sure to choose light colors, because the black text doesn’t show up on dark colors too well.

Wonder of wonders

Another piece of advice here is not to mix servers on different major (and sometimes minor) versions. The reason is that this feature gives you the ability to query multiple servers at once. If you’re looking at DMVs, they can have different columns in them, and you’ll just get an error. Even a simple query to sys.databases will throw you a bonk between 2012 and 2014.

By the planets!

I changed my mind. I hate planets.

Even if you’re running 2008R2, there are some pretty big differences in DMVs between SP1 and SP3. Microsoft has been known to change stuff in CUs (I’m looking at you, Extended Events).

On the plus side, you can use your multi-server connection to SELECT @@VERSION to help you decide how you should group them. If they have something better in common, like participating in Log Shipping, Mirroring, an AG, etc., all the better.
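
Here’s a quick multi-server query for that grouping exercise; run it against a server group and SSMS tacks a Server Name column onto the results so you can sort out who’s who:

SELECT SERVERPROPERTY('ProductVersion') AS product_version,
    SERVERPROPERTY('ProductLevel') AS service_pack,
    SERVERPROPERTY('Edition') AS edition;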

Insight

But my favorite thing, because I was a devotee to the Blitz line of stored procedures even before I got paid to like them, was that I could install them on ALL OF MY SERVERS AT ONCE! This was especially useful when updates came out. You know what it’s like to put a stored proc on 20 servers one at a time? Geeeeeet outta here!

Peanuts.

Check that out. It’s on both of my servers. At once. That means simultaneously, FYI. If you have a DBA or Admin database that you keep on all your servers to hold your fancy pants scripts and tools, this is an awesome way to make sure they all have the latest and greatest.

You’re already better at your job

Even though this feature came out in 2008, I hardly see anyone using it. I found it really helpful for comparing indexes and query plans across app servers that held different client data. It also exposes far less than Linked Servers; you need to worry less about access and level of privilege.

Just don’t forget to export your list if you change laptops!

Thanks for reading!

When Should You Hire a Consultant for Amazon RDS?

Powered By Somebody Else’s Database on Somebody Else’s Computer

You’re hosting your SQL Server databases in Amazon RDS, and performance has been getting slower over time. You’re not sure if it’s storage IOPs, instance size, SQL Server configuration, queries, or indexes. What’s the easiest way to find out?

Ask a few questions:

Are you using SQL Server Enterprise Edition? The smallest EE server, a db.r3.2xlarge, costs about $4,241 per month on demand (which isn’t the cheapest way to buy, of course). The next size up doubles the cost, which means if the performance tuning efforts could drop you down by just one instance size, a consulting engagement would pay for itself well within two months.

Are you using mirroring for multi-AZ protection? If so, the delays required for sequential writes between availability zones may be your biggest bottleneck for inserts, updates, and deletes. Check your wait types with sp_AskBrent, and if the top ones are database mirroring, then the data changes aren’t likely to get faster with hardware tuning. Increased IOPs might help – but it takes deeper digging to get to that conclusion. It’s time to look at reducing your change rate in the database. If, on the other hand, your biggest bottleneck consists of select queries, consulting can help.
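
A sketch of that wait check, assuming the @Seconds and @ExpertMode parameters from the sp_AskBrent of that era; here it samples load for 30 seconds and reports what the server spent its time waiting on:

/* Sample waits for 30 seconds and report the top ones: */
EXEC dbo.sp_AskBrent @Seconds = 30, @ExpertMode = 1;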

Are you locked into a long-reserved instance? You can sell reserved EC2 instances on the secondary market, but you can’t sell RDS instances as of this writing. If you’re having performance problems on it, this is definitely a time to call for consulting help fast. You want to avoid dumping the smaller instance and jumping into another commitment if a growing customer base or slowing code base could mean yet another instance type change.

Or are you running a single Standard Edition instance in just one AZ? Try standing up another RDS instance – but this time with the largest Standard Edition instance type you can get, around $12/hour as of this writing. Run the same types of queries against it, and within a couple hundred bucks of experimentation, you can get an idea of whether or not hardware will be a cheap enough solution. Granted – your time isn’t free – but it’s cheaper than a consulting engagement.

These questions help you figure out when it’s just cheaper to throw more virtual hardware at it.
