Blog

Your users are frustrated because the app is too slow, but you can’t change the queries. Maybe it’s a third party app, or maybe you’re using generated code, or maybe you’re just not allowed to change it. Take heart – there’s still hope.

I do this every week, and I’ll share my proven methodologies to performance tune with indexes, SQL Server configuration switches, and hardware.

In this one-day workshop on Friday, December 4 at the Microsoft office in Chevy Chase, you’ll learn:

Simplified internals – how SQL Server turns your query into an execution plan
Wait stats solutions – a list of fixes for most common bottlenecks
Plan cache analysis – how to discover the most resource-intensive queries so you know if indexes and hardware will work
Monitoring metrics – what DMVs and Perfmon counters to monitor, and what thresholds to alert on
Hardware warning signs – how to prove your server is undersized for your workloads
Tools and scripts – the best free and paid tools to make your tuning job easier

This session is for developers, DBAs, and consultants who have to make SQL Server go faster. You should be comfortable writing queries and creating tables, but not as confident about interpreting SQL Server’s DMVs and diagnostic data. I’ll train you the same way I train my own clients.

This session will be demo-heavy, and you can play along. Bring a laptop running SQL Server 2005 or newer, with at least 100GB of free drive space. We’ll distribute the Stack Overflow demo databases on USB drives for you to restore, and you can follow along with the demos. You’ll get the load test scripts, run them against your laptop, and see how we solve various challenges.

Space is limited to 100 attendees. Get your spot now for $199, and then register for SQLSaturday DC too. See you there!

SQL Server Features I’d Like To See, Oracle Edition

Last Updated March 18, 2016

Erik Darling

Oracle, SQL Server

BUT FRANCE HAS A PONY

I really like SQL Server. Most of the time. Okay, so most of the time I like SQL Server most of the time. Don’t get me wrong, if I had to go back through the career-time continuum and pick a RDBMS to work with, I’d probably still choose it over Oracle. Probably. And, because I don’t exclusively grow facial hair from my neck, I wouldn’t be allowed to choose PostgreSQL. They’d kick me off the mailing list.

Just kidding. You’re all handsome rogues. We could have had a nice life together, staring longingly into each other’s shoes and trying to implement parallelism.

I’d have DB2 here, but the cost of entry to the Developer Edition is rather steep. So, you know, I’m sure it’s great! But no. Though I would be really happy if Microsoft implemented ANSI Standard constructs into T-SQL half as fast as IBM does.

I have poked at Oracle and PostgreSQL a bit, and found they have some really cool stuff. Heresy, right?

Check out some of these Oracle gadgets and tell me they wouldn’t make your life a whole lot easier.

In no particular order:

Table restores! Built in! I’m very surprised we never got a feature like this. You can do it with a 3rd party tool like Dell LiteSpeed.

Adaptive Plans! Go to the link and read the second paragraph. Read it twice. Wipe the drool off your face.

UPDATE: Geoff Patterson has created a Connect item to get Adaptive Plans for SQL Server.

In-Database Row Archiving! You know all that stuff you do with partitions that Oracle already does better? Where you’re basically praying for partition elimination to not undo the two weeks of work you put in to setting up this partitioned table that developers are writing horrible MERGE upserts to? Yeah. You can just tell the engine to not pay attention to rows you don’t care about anymore when it accesses the index. Fancy that.

Bitmap Indexes! It’s kind of like a filtered index, except for all values of a highly non-selective column.

Materializing CTEs! Even though it’s undocumented, we use plenty of undocumented stuff in SQL Server to get the job done. This is really cool to me, since I’ve discussed this limitation in CTEs before. I’d love to see a way to do this in SQL with the same behavior; not having to create temp tables. It would be a nice way to get around issues with caching statistics for temp tables, and especially since MS is still fixing bugs around temp tables.

Are there more? Yeah, but this is a blog. Go grab a VirtualBox and read the documentation if you’re interested in learning more.

Training Plan for a SysAdmin to Manage Availability Groups

Last Updated April 9, 2017

Always On Availability Groups

How hard is it for a systems administrator who’s used to running SQL Server on Windows Clusters to tackle Availability Groups? Our example system administrator knows a bit of TSQL and their way around Management Studio, but is pretty new to performance tuning.

Well, it might be harder than you think. First, let’s look at the skills needed to succeed. Then let’s tackle a sample training plan to get those skills.

Do You Have the Skills to Manage AGs? Take this Test!

sp_Blitz® Sanity Test

Have you eliminated all warnings regarding backups, CHECKDB, alerts, and other administration basics that our free sp_Blitz® script warns about? (Not thought about eliminating them, actually eliminated them.)
Have you established clear RPO and RTO for the databases you’re considering for AGs and made sure your backups and restore plans are meeting them? (Not an estimate, actual sign-offs by business owners.)

Incident Management Skills Test

Do you regularly use techniques to learn what’s running and how long it’s been waiting at any given point in time that are NOT SQL Server’s Activity Monitor or sp_who/sp_who2? (Sorry for the bias, but Activity Monitor is full of lies, and those SQL Server 2000 tools don’t really help anymore.)
Do you have a documented incident response process to collect data when performance or availability suffers that at least 2 people can run?
Do you regularly execute a process to perform root cause analysis when performance is bad or the SQL Server is offline? (Only answer “yes” if it has successfully led to a real root cause 3 times in the last year.)

Performance Tuning Skills Test

Do you regularly analyze your SQL Server wait statistics and baseline them? (A “yes” means that you know what your top 10 bottlenecks mean, what “poison” waits are and what they mean if they show up, and you could quickly identify any changes in your top waits and react.)
Do you know the top 10 queries that run against your SQL Server by CPU, and have you tuned their indexes?

SQL Server Scripting/Configuration Skills Test

Do you have the scripting skills to be able to handle managing jobs across all nodes? Jobs must all run everywhere, and be smart enough to know who is the primary replica.
Do you have the scripting skills and security knowledge to be able to manage linked servers and logins across all replicas? (Don’t say yes if you haven’t done it before– it’s trickier than you think!)

Windows Clustering Skills Test

Have you successfully executed basic Windows clustering management tasks with the version of Windows you plan to go live on? These tasks are: configuring quorum, getting clean cluster validation reports, and using the Cluster Failover Manager.
Have you practiced advanced cluster rescue scenarios at least twice within the past year? (These tasks are: forcing quorum, generating and reading cluster logs)

Testing and Patching Skills Test

Are you subscribed to and do you actively review all hotfixes that come out for your version of Windows Failover Clustering and SQL Server?
Do you test and apply hotfixes and selective updates to a test environment on a regular basis as part of a patching deployment for production?

Here’s the deal: to do well with Availability Groups you need to honestly say yes to every single question on this list.

Closing the Gap: A Training Plan

The right training plan for Availability Groups includes learning from others as well as hands-on experience. To make this work, you need to be devoting at minimum 12 full days per quarter to learning. That’s at least one full day a week, and even then, your time may be very, very tight.

First Quarter: Learn the Pitfalls of AGs and Build your Incident Response Process

If you didn’t score two “yes’s” on the sp_Blitz® sanity test, that’s priority #1. Make sure you have completely covered that section before moving on.

Next, make sure you understand why you need to learn so much, and also make sure you really need AGs! Dig in and learn the basics about Availability Groups and why they’re so tricky, and how to choose the right HA/DR solution in SQL Server. Options:

Free option: Get reading on our blog, we’ve got tons of articles.
Video training option: Take our online course – The DBA’s Guide to High Availability and Disaster Recovery

Next, get working in production and build your incident response kit. It will take work and time to get incident response and root cause analysis really working, but this is an absolute pre-requisite for working with AGs.

Second Quarter: Baseline and Analyze Wait Stats and top Queries

Focus on establishing lightweight monitoring and baselining your top waits and queries in SQL Server.

Free tools: sp_BlitzFirst® helps you with wait information and sp_BlitzCache® helps you with queries
Books and blog posts: We have a list to get you started here
Video training option: Our performance tuning classes help.

Third Quarter: Build a Prototype and Get Advanced Training

At this point, you’ve learned a lot about how to understand what SQL Server is telling you. It’s time to start understanding Availability Groups specifically.

Review your notes from what you learned about looking out for in AGs– and make sure you plan the right version of Windows and SQL Server that’s really right to use.
Set up a prototype environment. You need the same hardware and storage that you’re going to use in production. This isn’t a temporary environment, either: to live well with AGs you need a pre-production environment!
Restore full sized production databases to your AG environment and generate activity against it.
Make the environment fail! You need to cause failures in multiple places and validate that your incident and performance management tools really work in this brave new world.
Oh, and work on all those scripting skills, too.

At this point in the process, it’s time to learn from others and really make sure you’re on the right track. You can do this in multiple ways:

Attend a SQL Server conference that lets you attend “lessons learned” style panels– such as the SQL PASS Summit
Attend an advanced SQL Server DBA training class like our Senior DBA course

Fourth Quarter: Reassess Your Progress and Fill the Gaps

You’ve come a long way. But you still had a job to do with other tasks in it.

Step back and re-take the quiz. How are you executing on all of those tasks, and how many people can execute on them? Return to the areas where you’ve scored the weakest and build those skills up.

Yep, it Takes a Year

Availability Groups are a very cool, cutting edge tool in SQL Server. They’re expensive, but people are willing to pay the price because availability and performance really matter for them.

But if you’re not careful, the tool that you’re using to try to improve performance can slow you down– or take you offline entirely. And if you don’t have really strong incident management and performance tuning skills, you’ll have no idea if the problem is the AG or something completely unrelated.

The good news is that if you have a foundations in system administration, you know some TSQL, and you really focus your efforts, you can learn these skills in a year!

Not Sure Where to Find the Time?

Show your manager this post and talk about your answers to the questions above, and the fact that you need to fill the gap. Be clear about areas where you and your team are weak and strong, and what risks you’re concerned about if you take on more complex technology that you can’t handle. It may not always be obvious, but usually your manager wants you to succeed as much as you do!

We need your help.

Last Updated October 8, 2015

Doctors Without Borders is a truly awesome charity. Medical professionals volunteer their own personal time to go to war-torn countries and solve issues of world health.

SQL Server community member, MCM, and all-around-good-guy Argenis Fernandez organizes Argenis Without Borders, a giving event to help. Last year, the SQL Server community helped raise $13,000 for this well-deserving charity.

Now it’s your turn. Please donate $25 if you can. It would mean a lot to me, and I know it’d mean a lot to people all over the world who urgently need medical help. You’ve got a great job – it’s your turn to give back.

Donate now. Seriously, right now.

AlwaysOn AG Databases Need Bigger Log Files

Last Updated February 9, 2017

Always On Availability Groups, SQL Server

Doug demonstrates what he's going to do to the failed replica. — Doug demonstrates what he’s going to do to the failed replica.

One of your SQL Servers is going to fail.

When one of your AG members goes down, what happens next is just like opening a new SSMS window and typing BEGIN TRAN. From this moment forwards, the transaction log starts growing.

And growing. And growing.

SQL Server is waiting for that failed replica to come back online and start synchronizing again. The transaction log has to contain all of the data necessary to bring that once-failed replica back up to speed, to the current point in time. It doesn’t matter whether this replica is sync or async, important failover partner or an unimportant reporting server.

After every 4 hours of outage troubleshooting (at 4 hours in, 8 hours, 12 hours), I look at the facts:

How much free space do I have left on the log file drive?
How long will it take to reseed the failed replica by using a full backup?
Based on these, should I remove the failed replica from the Availability Group?

This means at minimum, my AG replicas all need enough space to have 4 hours of transaction log activity. I can buy myself more time by disabling index defrag/rebuild jobs as soon as a replica fails, too. (In a perfect world, I’d code that into my index rebuild jobs, but that’s left as an exercise for the reader.)

After the emergency finishes, I don’t bother shrinking the log files back down – because after all, another replica is going to fail. It’s not a matter of if – it’s just a matter of when.

Knowing that all this is going to happen, it makes sense to pre-grow/size your transaction log files large enough to handle at least 4 hours of activity without a log backup. After all, when I’m troubleshooting an availability emergency, I don’t want to waste time troubleshooting a performance emergency. If my AG databases are constantly freezing up while they grow the log files, it’s going to be a bad distraction from the problem I really need to focus on.

Make Technical Decisions Easily With This One Trick

Last Updated September 30, 2015

Jeremiah Peschka

Decisions are emotional, right? Brent loves Sanka and I love Postum. We fight about it all the time. But when we wanted to settle the debate once and for all, we engineered a test to figure out who was right. You can do the same thing to take the personal investment out of technical decisions at work.

Check Yourself at the Door

The first thing you need to remember is that your opinions are just as valid everyone else involved. You need to move those aside and be ready to be wrong.

If the other people involved don’t want to play along, just tell them “Heck, I’d be happy to be wrong because I’ll learn something.” It’s not a bet or a contest, you’re just offering up your willingness to be wrong. Being wrong is great, especially when it’s your turn to be right later on.

Test, Test, Test

The next step to making that decision is to figure out a test. This test has to depend on your different opinions. The purpose of this test is to get your opinions out of the conversation.

Doing this correctly is really hard. You need to figure out:

What are both sides saying?
Which metrics will prove both points?
What won’t prove anyone’s points?
What’s the fastest way to test both options?
What’s a realistic scale to for testing?
What’s the worst that could happen if either side is right?
What’s the worst that could happen if either side is wrong?
If you can only run one test, which test should you run?

Hey, You Said This Was Easy!

All of this sounds like a lot of work. It turns out being factually right is just as much work as being factually wrong. If you really want to make sure that you’re choosing the right solution to a problem you need to figure out which option is the most right way to solve the problem – both solutions could be good, but one just might be better. The only way to get to proof is to test everything.

The next time there’s an argument on your team, or between two teams, figure out the best way to test each side’s ideas instead of spending your time arguing about which solution is the best.

Of course, you could always just fight in the parking lot like school kids. I hear that works well, too.

Amazon EC2 Dedicated Hosts: Much Cheaper SQL Server Licensing

Last Updated February 13, 2017

Amazon Web Services (AWS), Licensing, SQL Server

When you build SQL Servers in your data center, and you license the host with Enterprise Edition, you get the ability to run as many virtual SQL Servers as you want on that hardware.

In the cloud, though, you haven’t had that luxury. In the past, with Azure VMs and Amazon Web Services’ EC2 VMs, you’ve had to license each individual VM. You weren’t able to guarantee that your VMs all lived on the same host, so you had to pay through the nose.

Soon, Amazon is fixing that with the ability to get your own EC2 Dedicated Hosts.

This is gonna be huge. You can license one AWS host, and then run as many SQL Server VMs on there as you want. Use the same cool AWS management tools, and dramatically cut your costs.

No word on release dates or pricing yet.

I’m on the Away From the Keyboard Podcast

Last Updated February 13, 2017

Blogging, SQL Server

Cecil Phillip (@cecilphillip) and Richie Rump (@jorriss) interviewed me for their podcast, Away From the Keyboard.

In the half-hour discussion, we talked about:

Why I’m a huge believer in giving away stuff for free
How I got started with Topspeed Clarion, then moved to SQL Server
When I started blogging, and how that turned into a consulting company

Head on over to AFTK and listen, and stay tuned for part 2.

How to Download the Stack Overflow Database

Last Updated July 21, 2022

Stack Overflow

164

I use a Microsoft SQL Server version of the public Stack Overflow data export for my blog posts and training classes because it’s way more interesting than a lot of sample data sets out there. It’s easy to learn, has just a few easy-to-understand tables, and has real-world data distributions for numbers, dates, and strings. Plus, it’s open source and no charge for you – just choose your size:

Small: 10GB database as of 2010: 1GB direct download, or torrent or magnet. Expands to a ~10GB database called StackOverflow2010 with data from the years 2008 to 2010. If all you need is a quick, easy, friendly database for demos, and to follow along with code samples here on the blog, this is all you probably need.
Medium: 50GB database as of 2013: 10GB direct download, or torrent or magnet. Expands to a ~50GB database called StackOverflow2013 with data from 2008 to 2013 data. I use this in my Fundamentals classes because it’s big enough that slow queries will actually be kinda slow.
Large: current 430GB database as of 2022-06: 54GB torrent (magnet.) Expands to a ~430GB SQL Server 2016 database. Because it’s so large, I only distribute it with BitTorrent, not direct download links.
For my training classes: specialized copy as of 2018/06: 47GB torrent (magnet.) Expands to a ~180GB SQL Server 2016 database with queries and indexes specific to my training classes. Because it’s so large, I only distribute it with BitTorrent, not direct download links.

After you download it, extract the .7Zip files with 7Zip. (I use that for max compression to keep the downloads a little smaller.) The extract will have the database MDF, NDFs (additional data files), LDF, and a Readme.txt file. Don’t extract the files directly into your SQL Server’s database directories – instead, extract them somewhere else first, and then move or copy them into the SQL Server’s database directories. You’re going to screw up the database over time, and you’re going to want to start again – keep the original copy so you don’t have to download it again.

Then, attach the database. It’s in Microsoft SQL Server 2016 format (2008 for the older torrents), so you can attach it to any 2016 or newer instance. It doesn’t use any Enterprise Edition features like partitioning or compression, so you can attach it to Developer, Standard, or Enterprise Edition. If your SSMS crashes or throws permissions errors, you likely tried extracting the archive directly into the database directory, and you’ve got permissions problems on the data/log files.

As with the original data dump, this is provided under cc-by-sa 4.0 license. That means you are free to share this database and adapt it for any purpose, even commercially, but you must attribute it to the original authors (not me):

What’s Inside the StackOverflow Database

I want you to get started quickly while still keeping the database size small, so:

All tables have a clustered index on Id, an identity field
No other indexes are included (nonclustered or full text)
The log file is small, and you should grow it out if you plan to build indexes or modify data
It only includes StackOverflow.com data, not data for other Stack sites

To get started, here’s a few helpful links:

This Meta.SE post explains the database schema.
If you want to learn how to tune queries, Data.StackExchange.com is a fun source for queries written by other people.
For questions about the data, check the data-dump tag on Meta.StackExchange.com.

Past Versions

I also keep past versions online too in case you need to see a specific version for a demo.

2021-02 – 54GB torrent (magnet.) Expands to a ~401GB SQL Server 2016 database.
2020-06 – 46GB torrent (magnet.) Expands to a ~381GB SQL Server 2008 database. This is the last export that can be used with SQL Server 2014 & prior.
2019-12 – 52GB torrent (magnet.) Expands to a ~361GB SQL Server 2008 database.
2019-09 – 43GB torrent (magnet.) Expands to a ~352GB SQL Server 2008 database. This is the last export licensed with the cc-by-sa 3.0 license.
2019-06 – 40GB torrent (magnet.) Expands to a ~350GB SQL Server 2008 database.
2018-12 – 41GB torrent (magnet.) Expands to a ~323GB SQL Server 2008 database.
2018-09 – 39GB torrent (magnet.) Expands to a ~312GB SQL Server 2008 database.
2018-06 – 38GB torrent (magnet.) Expands to a ~304GB SQL Server 2008 database. Starting with this version & newer, the giant PostHistory table is included. As you can probably guess by the name, this would make for excellent partitioning and archival demos. As you might not guess, the NVARCHAR(MAX) datatypes of the Comment and Text fields make those demos rather…challenging.
2017-12 – 19GB torrent (magnet.) Expands to a ~137GB SQL Server 2008 database.
2017-08 – 16GB torrent (magnet), 122GB SQL Server 2008 database. Starting with this version & newer, each table’s Id fields are identity fields. This way we can run real-life-style insert workloads during my Mastering Query Tuning class. (Prior to this version, the Id fields were just INTs, so you needed to select the max value or some other trick to generate your own Ids.)
2017-06 – 16GB torrent (magnet), 118GB SQL Server 2008 database. Starting with this torrent & newer, I broke this up into multiple SQL Server data files, each in their own 7z file, to make compression / decompression / distribution a little easier. You need all of those files to attach the database.
2017-01 – 14GB torrent (magnet), 110GB SQL Server 2008 database
2016-03 – 12GB torrent (magnet), 95GB SQL Server 2005 database
2015-08 – 9GB torrent (magnet), 70GB SQL Server 2005 database

Why are Some Sizes/Versions Only On BitTorrent?

BitTorrent is a peer-to-peer file distribution system. When you download a torrent, you also become a host for that torrent, sharing your own bandwidth to help distribute the file. It’s a free way to get a big file shared amongst friends.

The download is relatively large, so it would be expensive for me to host on a server. For example, if I hosted it in Amazon S3, I’d have to pay around $5 USD every time somebody downloaded the file. I like you people, but not quite enough to go around handing you dollar bills. (As it is, I’m paying for multiple seedboxes to keep these available, heh.)

Some corporate firewalls understandably block BitTorrent because it can use a lot of bandwidth, and it can also be used to share pirated movies/music/software/whatever. If you have difficulty running BitTorrent from work, you’ll need to download it from home instead.

Does Creating an Indexed View Require Exclusive Locks on an Underlying Table?

Last Updated May 16, 2019

Indexing, Locking, Blocking, and Isolation Levels, SQL Server

An interesting question came up in our SQL Server Performance Tuning course in Chicago: when creating an indexed view, does it require an exclusive lock on the underlying table or tables?

Let’s test it out with a simple indexed view run against a non-production environment. (AKA, a VM on my laptop running SQL Server 2014.)

Tracing the Locks

To trace the locks, I just used one of the built-in XEvents templates tor tracing locks, added a filter so it only looked at locks from my current session (53), and set it to write to an event file. The query scripts out like this:

CREATE EVENT SESSION [LockTrace] ON SERVER 
ADD EVENT sqlserver.lock_acquired (
     SET collect_database_name=(1),collect_resource_description=(1)
    ACTION(sqlserver.query_hash)
    WHERE (
         [package0].[greater_than_uint64]([database_id],(4)) 
         AND [package0].[equal_boolean]([sqlserver].[is_system],(0)) 
         AND [package0].[equal_uint64]([sqlserver].[session_id],(53))
         )
    ) 
ADD TARGET package0.event_file (SET filename = N'S:\Xevents\Traces\LockTrace.xel')
WITH (MAX_MEMORY=4096 KB, EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS, MAX_DISPATCH_LATENCY=30 SECONDS, MAX_EVENT_SIZE=0 KB, MEMORY_PARTITION_MODE=NONE, TRACK_CAUSALITY=OFF, STARTUP_STATE=OFF)
GO

CREATE EVENT SESSION [LockTrace] ON SERVER

ADD EVENT sqlserver.lock_acquired (

SET collect_database_name=(1),collect_resource_description=(1)

ACTION(sqlserver.query_hash)

WHERE (

[package0].[greater_than_uint64]([database_id],(4))

AND [package0].[equal_boolean]([sqlserver].[is_system],(0))

AND [package0].[equal_uint64]([sqlserver].[session_id],(53))

)

ADD TARGET package0.event_file (SET filename = N'S:\Xevents\Traces\LockTrace.xel')

WITH (MAX_MEMORY=4096 KB, EVENT_RETENTION_MODE=ALLOW_SINGLE_EVENT_LOSS, MAX_DISPATCH_LATENCY=30 SECONDS, MAX_EVENT_SIZE=0 KB, MEMORY_PARTITION_MODE=NONE, TRACK_CAUSALITY=OFF, STARTUP_STATE=OFF)

Creating the Indexed View

I test it out by creating and dropping the indexed view this way:

CREATE VIEW dbo.IndexMe
WITH SCHEMABINDING
AS
SELECT
PostTypeId,
COUNT_BIG(*) as CT
FROM dbo.Posts
GROUP BY PostTypeId;
GO

-- This is where it becomes an indexed view.
CREATE UNIQUE CLUSTERED INDEX ix_indexme on dbo.IndexMe (PostTypeId)
GO

DROP INDEX ix_indexme on dbo.IndexMe;
GO

CREATE VIEW dbo.IndexMe

WITH SCHEMABINDING

SELECT

PostTypeId,

COUNT_BIG(*) as CT

FROM dbo.Posts

GROUP BY PostTypeId;

-- This is where it becomes an indexed view.

CREATE UNIQUE CLUSTERED INDEX ix_indexme on dbo.IndexMe (PostTypeId)

DROP INDEX ix_indexme on dbo.IndexMe;

Now let’s analyze the locks!

I stop my trace and open up the event file. I click around in the magic XEvents toolbar and group by object_id and mode (that’s the lock mode). And here’s what I get:

The low object_ids are internal system tables. Object_id 526624919 =dbo.Posts. Object_id 843150049 = dbo.IndexMe.

Translating this out:

dbo.Posts required S and SCH_S locks
dbo.IndexMe required X, SCH_S, and SCH_M locks

We didn’t require exclusive locks on the underlying table. I did require shared locks, and creating this indexed view could require lots of IO or be part of a blocking chain with other transactions. But creating the indexed view only needed exclusive locks on the indexed view itself.

SQL Server 2016 CTP2.4: Maintenance Plan Changes

Last Updated February 13, 2017

CHECKDB and Corruption, SQL Server

You can download the next preview of SQL Server 2016 today, and the announcement blog post touted a few improvements to maintenance plans. Oddly, the screenshots in the announcement were apparently done with Microsoft Paint, so I fired up a VM to see how things really look.

The DBCC CHECKDB step in the wizard now has checkboxes for physical only and tablock:

The index reorg task now lets you do LOB compaction, scan index stats, and then only optimize the index based on statistics:

The rebuild task has similar options in the GUI, plus options about keeping the index online:

Updating statistics – I honestly don’t remember what older versions looked like. (I don’t step through that wizard often.)

The options on the full backup task got some attention as well:

While improvements are usually a good thing, I gotta be honest: these screens are pretty haphazard. Stuff doesn’t line up, warnings don’t fit into boxes, and it feels terribly slapped together. It doesn’t give me a high level of confidence, certainly not enough to steer me away from Ola Hallengren’s much more powerful and flexible free scripts. I get it – not everybody wants to use add-on scripts – but I’m just not filled with confidence when I see these screens.

That could mean we’re still quite a ways away from a SQL Server 2016 release, which is good news and bad news.

Forgotten Maintenance – Cycling the SQL Server Error Log

Last Updated September 25, 2015

Jeremiah Peschka

Most of us get caught up in fragmentation, finding the slowest queries, and looking at new features. We forget the little things that make managing a SQL Server easier – like cylcing the SQL Server error logs.

What’s the Error Log?

The SQL Server error log is a file that is full of messages generated by SQL Server. By default this tells you when log backups occurred, other informational events, and even contains pieces and parts of stack dumps. In short, it’s a treasure trove of information. When SQL Server is in trouble, it’s nice to have this available as a source of information during troubleshooting.

Unfortunately, if the SQL Server error log gets huge, it can take a long time to read the error log – it’s just a file, after all, and the GUI has to read that file into memory.

Keep the SQL Server Error Log Under Control

It’s possible to cycle the SQL Server error log. Cycling the error log starts a new file, and there are only two times when this happens.

When SQL Server is restarted.
When you execute sp_cycle_errorlog

When SQL Server cycles the error log, the current log file is closed and a new one is opened. By default, these files are in your SQL Server executables directory in the MSSQL\LOG folder. Admittedly, you don’t really need to know where these are unless you want to see how much room they take up.

SQL Server keeps up to 6 error log files around by default. You can easily change this. Open up your copy of SSMS and:

Expand the “Management” folder.
Right click on “SQL Server Logs”
Select “Configure”
Check the box “Limit the number of error log files before they are recycled”
Pick some value to put in the “Maximum number of error log failes” box
Click “OK”

It’s just that easy! Admittedly, you have to do this on every SQL Server that you have, so you might just want to click the “Script” button so you can push the script to multiple SQL Servers.

Automatically Rotating the SQL Server Error Log

You can set up SQL Server to automatically rotate your error logs. This is the easiest part of this blog post, apart from closing the window.

To cycle error logs on a regular basis, restart your SQL Server nightly.

Only joking.

You can set up a SQL Agent job with a T-SQL step. All it has to do is EXEC sp_cycle_errorlog. Schedule the SQL Agent job to run as frequently as you’d like and you’re good to go. The upside of this approach is that it’s automatic and the SQL Server error logs will be more granular, making it easier to find the error messages you’re looking for.

It’s Just That Easy!

Cycling the SQL Server error log is easy – you just need a regularly scheduled agent job. Rotating the logs makes it easier to find error messages. Let’s face it – you’re only looking for error messages when there’s a problem. That’s all there is to rotating the error logs.

Window Functions and Cruel Defaults

Last Updated March 12, 2018

Erik Darling

SQL Server, T-SQL

My First Post Here…

Well, my first technical post, was about how the default index creation method is OFFLINE. If you want that sweet, sweet Enterpri$e Edition ONLINE goodness, you need to specify it. It’s been a while since that one; almost six months to the day. So here’s another one!

But Window Functions Are Awesome

Heck yeah they are. And how. Boy howdy. Etc. You get the point. I’m enthusiastic. What can be cruel about them? Glad you asked!

Window Functions, according to the almighty ANSI Standard, have two ways of framing data: RANGE and ROWS. Without getting into the implementation differences between the ANSI Standard and Microsoft’s versions, or any performance differences between the two, there’s a funny difference in how they handle aggregations when ordered by non-unique values. A simple example using the Stack Overflow database follows.

SELECT OwnerUserId,
 CAST(CreationDate AS DATE) AS DumbedDownDate,
 Score,
 SUM(Score) OVER ( ORDER BY CAST(CreationDate AS DATE)) AS Not_Specified,
 SUM(Score) OVER ( ORDER BY CAST(CreationDate AS DATE)RANGE UNBOUNDED PRECEDING ) AS Range_Specified,
 SUM(Score) OVER ( ORDER BY CAST(CreationDate AS DATE)ROWS UNBOUNDED PRECEDING ) AS Rows_Specified
FROM dbo.Posts
WHERE OwnerUserId = 1
AND CAST(CreationDate AS DATE) BETWEEN '2008-08-01' AND '2008-08-31'
ORDER BY CAST(CreationDate AS DATE);

SELECT OwnerUserId,

CAST(CreationDate AS DATE) AS DumbedDownDate,

Score,

SUM(Score) OVER ( ORDER BY CAST(CreationDate AS DATE)) AS Not_Specified,

SUM(Score) OVER ( ORDER BY CAST(CreationDate AS DATE)RANGE UNBOUNDED PRECEDING ) AS Range_Specified,

SUM(Score) OVER ( ORDER BY CAST(CreationDate AS DATE)ROWS UNBOUNDED PRECEDING ) AS Rows_Specified

FROM dbo.Posts

WHERE OwnerUserId = 1

AND CAST(CreationDate AS DATE) BETWEEN '2008-08-01' AND '2008-08-31'

ORDER BY CAST(CreationDate AS DATE);

For the month of August, Year of Our Codd 2008, we’re getting a running total of the score for posts by UserId 1. Who is UserId 1? I’ll never tell. But back to the syntax! In the first SUM, we’re not specifying anything, for the next two we’re specifying RANGE and then ROWS. Why? REASONS! And why am I casting the CreateDate column as a date? MORE REASONS!

Before you scroll down, think for a second:

If I don’t specify RANGE or ROWS, which will SQL Server use?
If I left the CreateDate column as DATETIME, what ~~eff~~ ~~aff~~ difference would it make to the output?

Do you see a pattern forming here?

When we don’t specify RANGE or ROWS, well, SQL Server is nice enough to pick RANGE for us. “Nice”.

Deep breaths, Erik. Deep breaths.

You should also notice the difference in how each different method aggregates data. When the ordering column has duplicates, RANGE, and by extension, the default method, will SUM all the values for that group at once. When ROWS is specified as the framing method, you see the running total that most people are after.

And, of course, if all the values were unique, they’d do the same thing.

SELECT OwnerUserId,
 CreationDate,
 Score,
 SUM(Score) OVER ( ORDER BY CreationDate ) AS Not_Specified,
 SUM(Score) OVER ( ORDER BY CreationDate RANGE UNBOUNDED PRECEDING ) AS Range_Specified,
 SUM(Score) OVER ( ORDER BY CreationDate ROWS UNBOUNDED PRECEDING ) AS Rows_Specified
FROM dbo.Posts
WHERE OwnerUserId = 1
AND CAST(CreationDate AS DATE) BETWEEN '2008-08-01' AND '2008-08-31'
ORDER BY CreationDate;

SELECT OwnerUserId,

CreationDate,

Score,

SUM(Score) OVER ( ORDER BY CreationDate ) AS Not_Specified,

SUM(Score) OVER ( ORDER BY CreationDate RANGE UNBOUNDED PRECEDING ) AS Range_Specified,

SUM(Score) OVER ( ORDER BY CreationDate ROWS UNBOUNDED PRECEDING ) AS Rows_Specified

FROM dbo.Posts

WHERE OwnerUserId = 1

AND CAST(CreationDate AS DATE) BETWEEN '2008-08-01' AND '2008-08-31'

ORDER BY CreationDate;

Wrap. It. Up.

This one is pretty self explanatory. If you’re lucky enough to be on SQL Server 2012 or greater, and you’re using Window Functions to their full T-SQL potential, it’s was easier to calculate running totals. Just be careful how you write your code.

If you like this sort of stuff, Check out Doug’s new video series, T-SQL Level Up. There are next to zero fart jokes in it.

Announcing Our 2016 In-Person Training Class Schedule

Last Updated April 9, 2017

We just finished up the last of our 2015 training classes, and the reviews have been glowing:

“I have attended many training classes in my 19 years as a DBA, including Oracle and SQL Server. This is the first class where 75+% of the information I learned is applicable to my current work environment.” – Christina Z.

“I thought the class was much more useful than the Microsoft Official Curriculum. Those classes seem to teach to the book and instructors often can’t answer questions that aren’t included in the curriculum. You guys have real-world experience that you can share with the students. Very valuable!” – Kris C.

“They’re really best of the best on the market. And always open to QA sessions. And they asked do you have any questions? And ALWAYS have answers!!!!” – Vlad B.

And now, registration is open for next year’s training class lineup. Drum roll please:

Advanced Querying and Indexing – 5 days, $3,995

Jan 11-15, Newark
August 8-12, Portland

SQL Server Performance Troubleshooting – ~~4 days~~ now expanded to 5 days, $3,995:

February 22-26, San Diego
April 11-15, Newark
June 6-10, Chicago
August 15-19, Portland
December 5-9, Philadelphia

The Senior DBA Class of 2016 – 5 days, $3,995

March 7-11, Chicago
May 2-6, Denver
August 1-5, Philadelphia

We’ve been updating each class to reflect new technologies and techniques, and we’re excited to share what we’ve been teaching our clients. And of course, if you’d like to bring one of us onsite to teach a custom class lineup, check out our training catalog and email us.

Need to Migrate or Upgrade SQL Server? Register for our upcoming webcast!

Last Updated October 16, 2015

Change is exciting, and a little scary, too. If you’re looking to upgrade or migrate your SQL Server, head on over to our events page and get registered for our upcoming free webcast on Thursday, Oct 15 at 10:00AM Central, 11:00AM Eastern.

SQL Server Migration Checklist (Sponsored by Veeam)

It’s time to upgrade your SQL Server! But do you know exactly what you need to do? Join this free webinar with Microsoft Certified Master Kendra Little to get her checklist to make your migration go smoothly. You’ll learn what you need to script and document from your current instance, which settings you should change after your migration, required tests for new installations, and better alternatives to in-place upgrades. Whether your SQL Server 2005 instance is hitting end of life or you’re planning an upgrade to the latest and greatest, this migration checklist will save you time and frustration. Register now.

Missed the Webcast, but want the Migration Checklist?

Download the SQL Server Migration Checklist eBook as part of our free First Responder Kit.

Announcing: Level Up Your T-SQL

Last Updated April 9, 2017

Doug Lane

You spend at least an hour or two each day writing T-SQL code. You have a pretty good hunch that it’s not as fast or crisp as it should be, and you wonder if you’re missing a few techniques. You’ve heard about date tables and windowing functions, but you’re just not comfortable with ’em yet.

I’m Doug Lane of Brent Ozar Unlimited, and my new online training class is for you. It’s loaded with T-SQL goodies like using computed columns, how to replace cursors, and more. Here’s a sneak preview:

It’s unlike any T-SQL training you’ve ever seen (seriously!), and it’s totally free. I’d love to hear what you think.

Our customers, who got free access to it as a thank-you, have already started leaving reviews:

“Great job on delivering information with a wink and a nod. It really held my interest.” – Debby

“Doug is an exceptional speaker, and uses humour in effective ways to convey difficult concepts. I learnt a thing or two from this course.” – Randolph

“If only l had seen this course when l started out my SQL journey! I love the teaching style it works well for me. Then there are those simple time saving tips slipped in just at the right time. The analogy about banking the coins, genius wish l had thought of it! Teaching pace it timed perfectly. Overall looking forward to seeing more Robert

“This is perhaps the nerdiest thing I’ve seen all year.” – Wulfgar

“Great starter for thinking set-based!” – Gustavo

“I loved the Set Based sections with the examples of the cursors and how you could change the code to make them set based operations. But I think my favorite part was the windowing functions. Now I have good examples of how they work and when they are a good fit.” – Colin

“This course is funny and filled with a lot of good information. A great recap for any DBA. And a gold mine for anybody less experimented with coding T-SQL. The Windowing functions part is especially helpful. I suggest this course to anybody writing T-SQL.” – Benoit

What’s More Popular: SQL Server 2014, or SQL Server 2005?

Last Updated September 6, 2015

Sure, you’d much rather work with SQL Server 2014, but what’s more prevalent out in the real world? At Dell DBA Days, I saw a chart that shocked me:

That chart shows the number of servers running Dell Spotlight Essentials, a free SQL Server monitoring tool. Note that SQL Server 2014 has 4% of the population, about 1/4 as much as SQL Server 2005.

But the terrifying part: for every 2 SQL Server 2014 instances, there’s one SQL Server 2000 instance still kicking around! That’s not exactly a rapid adoption rate.

I wonder why we aren’t we upgrading our SQL Server 2000 instances? (And I’m not pointing the figure at you – it’s likely the business that’s making you keep it around.)

Filtered Indexes and IS NOT NULL

Last Updated February 13, 2017

Indexing, SQL Server

Filtered indexes can be tricky. Just getting your queries to use the filtered index can be a real pain. Parameterization may mean it’s “unsafe” to use the filtered index. The optimizer might want the filtered column in your key or include column list when it doesn’t seem like it should have to be there. The weather might be cloudy.

But there’s one type of filtered index that everyone can love for a couple different reasons: IS NOT NULL.

“IS NOT NULL” filter for a selective query…

Here’s an example index. We’re using the StackOverflow sample database and creating the index only on Posts which are closed (a small subset):

CREATE INDEX ix_Posts_Score_ClosedDate_INCLUDES_FILTERED on dbo.Posts 
    (Score, FavoriteCount, ClosedDate) 
    INCLUDE (Id, Title, Tags, CommentCount, OwnerUserId, LastEditDate, LastEditorDisplayName)
    WHERE (ClosedDate IS NOT NULL);
GO

CREATE INDEX ix_Posts_Score_ClosedDate_INCLUDES_FILTERED on dbo.Posts

(Score, FavoriteCount, ClosedDate)

INCLUDE (Id, Title, Tags, CommentCount, OwnerUserId, LastEditDate, LastEditorDisplayName)

WHERE (ClosedDate IS NOT NULL);

Here’s a query looking for Posts with a ClosedDate in a given range. Note that the query does NOT say “ClosedDate is NOT NULL”, it’s just specifying a value range:

CREATE PROCEDURE #ClosedPostsByScoreAndFavorites
    @Start DATETIME,
    @End DATETIME,
    @Score INT,
    @FavoriteCount INT
AS
    SELECT
        p.Id,
        u.DisplayName,
        p.Title,
        p.Tags,
        p.CommentCount,
        p.FavoriteCount,
        p.Score,
        p.ClosedDate,
        p.LastEditDate,
        p.LastEditorDisplayName
    FROM dbo.Posts as p
    JOIN dbo.Users as u on p.OwnerUserId=u.Id
    WHERE 
        p.ClosedDate between @Start and @End
        and p.Score > @Score
        and p.FavoriteCount > @FavoriteCount
GO

CREATE PROCEDURE #ClosedPostsByScoreAndFavorites

@Start DATETIME,

@End DATETIME,

@Score INT,

@FavoriteCount INT

SELECT

p.Id,

u.DisplayName,

p.Title,

p.Tags,

p.CommentCount,

p.FavoriteCount,

p.Score,

p.ClosedDate,

p.LastEditDate,

p.LastEditorDisplayName

FROM dbo.Posts as p

JOIN dbo.Users as u on p.OwnerUserId=u.Id

WHERE

p.ClosedDate between @Start and @End

and p.Score > @Score

and p.FavoriteCount > @FavoriteCount

We give the query a run and look at the plan….

EXEC #ClosedPostsByScoreAndFavorites @Start='2011-01-01', @End='2014-01-01', @Score=2, @FavoriteCount=2;
GO

1 2	EXEC #ClosedPostsByScoreAndFavorites @Start='2011-01-01', @End='2014-01-01', @Score=2, @FavoriteCount=2; GO

Woooo, it matches my index!

The highlighted index is ix_Posts_Score_ClosedDate_INCLUDES_FILTERED

And now for my next trick… what if you need to guarantee uniqueness for a column that allows NULLS?

We’re entering controversial territory here, so I’m going to be careful. SQL Server treats NULL as a value. This is a really big deal to some people, and I totally understand if you want to vent in the comments about how that’s a terrible thing and it shouldn’t be that way, and SQL Server stole your pickup truck. But it’s the SQL Server we live in.

Treating NULLs as a value means that I can’t create a unique index or a unique constraint on a column that allows NULLs if it has more than one row with NULL in it. More than one NULL means I have duplicates.

A filtered unique index can get you around the problem, like this:

CREATE UNIQUE INDEX uq_UsersAllowNulls_DisplayName on dbo.UsersAllowNulls ( DisplayName )
    WHERE DisplayName IS NOT NULL;
GO

CREATE UNIQUE INDEX uq_UsersAllowNulls_DisplayName on dbo.UsersAllowNulls ( DisplayName )

WHERE DisplayName IS NOT NULL;

Yay for indexes!

We’ve got tons of material on indexes, start reading more over here.

Getting Started with Diskspd

Last Updated September 15, 2015

Jeremiah Peschka