Blog

SQL Server, Writing and Presenting

When I’m writing a presentation or blog post, I often start here:

It’s a photo of me in my office in Dallas, Texas in 2004. When I look at that picture, I remember everything like it was yesterday. I can talk at length about everything on the bookshelf, on my desk, in my drawers (the desk drawers, that is).

I can tell you what technology problems I was struggling with, plus what problems my manager was concerned about. I remember what I knew, and what I didn’t know yet. I can recite the web sites I frequented.

Next, I can turn the mental camera around and see exactly what’s outside my office door: my developers and my support team. I can tell you what they rocked at and what they wanted training on. I can remember how we decorated their cubes for their birthdays – covering Julian’s stuff in aluminum foil, building a princess’ castle for Hima.

The funniest thing, though, is that I didn’t remember any of this until I rediscovered this photo several years ago. All of a sudden, everything was clear to me.

And I realized who I was writing for.

Now, it’s really easy for me to scope my presentations and blog posts because I’m writing for 2004 Brent. 2004 Brent hadn’t studied databases and tried to turn them inside out – he just needed to store data and get it back out quickly. He wasn’t on a first name basis with book authors and MVPs – he didn’t even know what an MVP was.

You need to take this picture today.

Set up your camera with a self-timer or get a colleague to shoot a few pictures of yourself sitting in your work environment. Get pictures of the books on your shelf, the stuff on your desk, and maybe take a screenshot of your task list. Write yourself a one-page note covering:

The stuff you’re comfortable with
The stuff you’re uncomfortable with
The things you want to learn this year
The things you learned recently that surprised you

Stash these pictures and words away in a time capsule folder somewhere. A few years from now, when you’re writing a presentation covering something you’ve learned, get these back out. Think about what you knew and didn’t know, and that’s your target audience. Before you use a term or acronym, think back and ask, “Did 2013 Me know that? If not, lemme introduce the topic.”

When you’re writing, remember that you’re never writing for your current self. You’re writing for the past version of you. Having these pictures and words will help you define your audience.

The Basics of Database Sharding

Last Updated May 1, 2013

Jeremiah Peschka

There are many ways to scale out your database; many of these techniques require advanced management and expensive add-ons or editions. Database sharding is a flexible way of scaling out a database. In this presentation, Jeremiah Peschka explains how to scale out using database sharding, covers basic techniques, and shares some of the pitfalls. This talk is for senior DBAs, database architects, and software architects who are interested in scaling out their database.

More resources are available over in our sharding article.

5 Things About Fillfactor

Last Updated February 13, 2017

Index Maintenance

Are you a page half full, or a page half empty kind of person?

I’ll never think “I’ve seen it all” when it comes to SQL Server– there’s just always someone waiting around the corner who’s found a weird new thing to do with it. But there are some things I really wish I could stop finding.

One of those things I hate finding is bad fillfactor settings. Fillfactor can be a useful tool to help performance, but it’s often a performance killer if you use it incorrectly.

A quick introduction: what is fillfactor, and how is it set?

“Fillfactor” is a setting for indexes in SQL Server. When you create or rebuild an index, you can tell SQL Server what percentage of each 8K data page used in the “leaf” level of the index it should fill up.

In other words, if you set a 90% fillfactor when you rebuild a clustered index, SQL Server will try to leave 10% of each leaf page empty. The empty space will be present on the page as it’s written on disk, and it’ll take up the same amount of space as it sits in memory.

By default, SQL Server uses a 100% fillfactor and tries to fill up all the pages in indexes as close to full as it can. Depending on how many rows actually fit on the page, your mileage may vary.

There are two ways to set fillfactor in SQL Server:

At the SQL Server instance level using a sys.configurations setting for fill factor. (I don’t recommend using this option, for reasons I’ll describe below.)
At an individual index level by specifying fillfactor when you create or rebuild the index.

Index maintenance jobs sometimes automate the second of these options. If you’re using an index maintenance stored procedure you picked up from around the web, you may be setting fillfactor without realizing it.

Why do people love to set fillfactor?

DBAs and developers often read that lowering the fillfactor improves performance by reducing page splits. Perhaps they’re trying to fix a performance problem, or perhaps they’re feeling paranoid. They either lower fillfactor too much on some indexes, or apply a fillfactor change to all indexes.

Here’s the scoop: it’s true that the default fillfactor of 100% isn’t always good. If I fill my pages to the brim, and then go back and need to insert a row onto that page, it won’t fit. To make the data fit and preserve the logical structure of the index, SQL Server will have to do a bunch of complicated things (a “bad” type of page split), including:

Add a new page
Move about half the data to the new page
Mark the data that was moved on the old page so it’s not valid anymore
Update page link pointers on existing pages to point to the new page

And yep, that’s a lot of work. It generates log records and causes extra IO. And yes, if you have this happen a lot, you might want to lower the fillfactor in that index a bit to help make it happen less often.

Where do people mess up fillfactor?

Here’s the thing: having a bunch of empty space on your data pages is ALSO bad for performance. Your data is more spread out so you probably have to read more pages into memory. You waste space in cache that’s just sitting there empty. That’s not only not awesome, it can be TERRIBLE in many cases.

This is particularly wasteful because not all indexes are prone to “bad” page splits. Let’s say I have a clustered index on an incrementing INT or BIGINT identity value. I insert loads of new rows and values are rarely updated or deleted. In this case I can fill my pages very full because I’m always putting new rows at the “end” of the index. Adding these new pages aren’t bad page splits– although unfortunately they are counted in the “page splits/sec” performance counter, which makes it very tricky to find just the “bad” splits.

I frequently find that people have put a fillfactor setting of 80 or below on all the indexes in a database. This can waste many GB of space on disk and in memory. This wasted space causes extra trips to storage, and the whole thing drags down the performance of your queries.

Best practices for setting fillfactor

Here’s some simple advice on how to set fillfactor safely:

Don’t set the system wide value for fillfactor. It’s very unlikely that this will help your performance more than it hurts.
Get a good index maintenance solution that checks index fragmentation and only acts on indexes that are fairly heavily fragmented. Have the solution log to a table. Look for indexes that are frequently fragmented. Consider lowering the fillfactor gradually on those individual indexes using a planned change to rebuild the index. When you first lower fillfactor, consider just going to 95 and reassessing the index after a week or two of maintenance running again. (Depending on your version and edition of SQL Server, the rebuild may need to be done offline. Reorganize can’t be used to set a new fillfactor.)

This second option may sound nitpicky, but in most environments it only takes a few minutes to figure out where you need to make a change. You can do it once a month. And it’s worth it– because nobody wants their database performance to slow down and realize that they’ve been causing extra IO by leaving many gigabytes of space in memory needlessly empty.

Now that we’ve covered the basics, how about those five things? Here’s the fine print on how fillfactor is implemented, and what it does and doesn’t impact.

1) Books Online warned you

The most painful thing about finding bad fillfactor settings is that the Microsoft team has tried to warn people about how bad fillfactor settings can hurt performance. Unfortunately, most people don’t seem to find the warning.

Check out this quote in Books Online: “For example, a fill factor value of 50 can cause database read performance to decrease by two times. “

That’s pretty clear, right? So if I set a fillfactor of 70 (when I don’t need it), I risk decreasing performance by 30%. That doesn’t sound great, either. Because of this delicate balance, follow the best practices above.

2) Fillfactor does not apply to heaps

The fillfactor setting only applies to indexes, not to all tables. If you have a table which does not have a clustered index, it’s called a “heap” in SQL Server. Heaps are weird in several ways. One of those ways is that fillfactor doesn’t apply– not even the fillfactor you set at the instance level. (Page splits don’t work the same way either, but this is not a good reason to have heaps.)

3) Fillfactor doesn’t impact new pages inserted at the end of an index

SQL Server only uses fillfactor when you’re creating, rebuilding, or reorganizing an index. It does not use fillfactor if it’s allocating a fresh new page at the end of the index.

Let’s look at the example of a clustered index where the key is an increasing INT identity value again. We’re just inserting rows and it’s adding new pages at the end of the index. The index was created with an 70% fillfactor (which maybe wasn’t a good idea). As inserts add new pages, those pages are filled as much as possible– likely over 70%. (It depends on the row size and how many can fit on the page.)

4) Fillfactor does not apply to LOB pages

Fillfactor applies to in-row data pages. When you create a table, depending on the data types, you have options as to when some large types get stored on-row, or off-row. When data is not stored in-row, fillfactor settings don’t apply to those special pages.

A general rule of thumb is that if you infrequently read large object columns, it’s better to keep those columns off-row. You will incur extra IO every time you need to fetch off-row data. but it keeps the frequently accessed in-row columns of your index more efficient.

5) Someone may have changed fillfactor without you realizing it

Once fillfactor is set on an index, it stays there. Further rebuilds or reorganizations of the index maintain that fillfactor unless you specify a different value. It’s easy for a change in fillfactor to sneak in. Unless you check for indexes with a fillfactor set, you might not realize what’s going on in your database.

Like anything else, there’s weird exceptions where in very rare cases, setting a super-low fillfactor on a very heavily updated table (which is probably small), can help reduce contention. These cases are very rare. Often there’s a better long term change in the application layer that would handle the issue.

Bottom line: If you find you’re using a low fill factor and you don’t know exactly why it was needed, always look carefully to see if you’re hurting performance more than helping it. And if you’re using the same fillfactor on all of your indexes, you’re probably wasting lots of valuable memory.

How to check your fillfactor today

Our free sp_BlitzIndex® tool looks for all sorts of insanity in your indexes at the database level. It will snoop through your metadata and warn you about low fillfactor values, plus let you know exactly how big those indexes have grown. Check it out today.

Why Most People Don’t Follow Best Practices

Last Updated February 9, 2017

CHECKDB and Corruption, SQL Server

Processes and Practices

"I don't care about best practices, I do what I want!" — “I don’t care about best practices, I do what I want!”

We write a lot about best practices, we teach customers best practices, and we publish popular references for best practices about all sorts of things like performance monitoring, SQL Server Setup, and Availability Groups.

But the internet is a sea of information and many SQL Server instances have unique needs. How does a SQL developer or DBA decide what’s a best practice?

How Most People Define Best Practices

In the real world, the #1 definition of a Best Practice in effect this: “Something we’ve been doing for years without noticing any problems.”

That’s just how human nature works. We all have a set of things we do that are “normal”, and we usually put them into our mental Best Practice Bucket. If we start to have problems, we assume it’s not due to anything “normal”.

Unfortunately, human nature makes people persist all sorts of bad practices. I find everything in the wild from weekly reboots to crazy settings in Windows and SQL Server that damage performance and can cause outages. When I ask why the settings are in place, I usually hear a story that goes like this:

Once upon a time, in a land far far away there was a problem
The people of the land were very unhappy
A bunch of changes were made
Some of the changes were recommended by someone on the internet. We think.
The problem went away
The people of the land were happier
We hunkered down and just hoped the problem would never come back
The people of the land have been growing more and more unhappy over time again

Most of the time “best practices” are implemented to try and avoid pain rather than to configure things well. And most of the time they aren’t thought out in terms of long term performance. Most people haven’t really implemented any best practices, they’ve just reacted to situations.

Why Best Practices Can Get You Fired

There’s one thing you can count on: many of yesterday’s best practices aren’t true today. Maybe the change you made long ago was a best practice at the time (or at least not an unusual or bad practice), but it may be hurting you now.

It doesn’t matter if something used to be a best practice or not. If times and technology have changed and you haven’t kept up, that outdated best practice can still get you fired if you don’t learn to let go.

What You Should Do Today

We try to make it easy for people to find out where their configuration veers away from current best practices.

Here’s what you should do today: run our free tool that checks your SQL Servers for signs you’re not following a best practice.

Look carefully and critically at everything that it points out. Don’t dismiss any of the items by saying, “oh, but we need that because we’ve always done it that way.” If it’s something you’ve been doing a long time, question it even more.

3 Things You Need to Start Doing to Your Database Server

Last Updated February 13, 2017

Jes Schultz

Taking care of your health is incredibly important (says the marathon runner). Eating right, exercising, and getting plenty of rest help you stay healthy and perform optimally.

Apple Cupcake — Does “an apple a day keep the doctor away” apply to cupcake apples?

Your database server’s health is important to your company’s health. You need to take regular backups, check for consistency, and keep indexes tuned for optimal performance. Here are a few tips to make sure your database stays in top shape.

Back up System Databases

The master, model, and msdb databases on every SQL Server are integral parts of the system.

Master contains information such as logins, linked servers, and information about all other databases on the server. Without this database, SQL Server can’t start up. If you don’t have a current backup of this database and a disaster occurs, you lose this information.

Model is used as a template for every new database that is created on your system. There are certain default settings you may adjust, such as file sizes, autogrowth, and the recovery model. If you don’t take a backup of this and a disaster occurs, when you set up the server again you have to remember to reconfigure all of these settings, rather than restoring them.

Msdb contains all of the information about SQL Server Agent operators, jobs, and alerts. It also holds backup and restore history tables. If this isn’t backed up and a disaster occurs, you will need to recreate all of your Agent items. I had to do this once – it was one of the worst weeks in my first DBA job.

Ensure that your backup job covers these databases, because in the event of a disaster, you want to restore them rather than rebuilding all of that information.

Run DBCC CHECKDB On All of Your Databases

You need to run DBCC CHECKDB to check for consistency errors in your database –yes, corruption. If you’re not running this on system databases, or only running it on “important” databases, you run the risk of not finding corruption in a database.

As already explained, the system databases are incredibly important. Combine not running DBCC CHECKDB with not taking a backup, and you can have a disaster without a hardware failure of any sort.

Only checking those databases you feel are “important” is dangerous. Who defines “important”? Is archived data as important as live data? To the person that uses that information for forecasting and trending, it is. Is a system used for reporting once a quarter as important as production data? To the person that has to use those reports to file taxes, it is. If a database is truly “not important”, and isn’t being used, it should be backed up and removed from the server – not sitting out there, taking up space and resources.

Index Maintenance

Indexes are the closest thing to a turbo button that SQL Server has. Proper indexes on tables will help your select queries run faster, and won’t hinder the performance of inserts, updates, and deletes. But putting an index or four on a table when it is released to production, then doing nothing further, is not productive. Table structure, the data in the tables, and the queries from users and applications change over time.

As data is added, updated, and removed, the pages in the index will become fragmented. Fragmentation leads to poor performance. The more scattered across the disk the data is, the harder storage has to work to retrieve it. Regular, consistent reorganization and rebuilding of the indexes can combat this.

Are you wondering where to start? Ola Hallengren has a brilliant set of maintenance scripts, including index maintenance, which can help you out!

Keep Your Server Healthy

By performing regular maintenance on your SQL Servers, you’ll ensure that your data is secure and it is performing optimally. Much like you don’t want to wait until you suffer a heart attack to see a doctor, you don’t want to wait until a disaster to find the weak points in your server.

Answers to Common High Availability and Disaster Recovery Questions

Last Updated December 5, 2017

Backup and Recovery, Clustering

In our recent webcast on HA & DR basics for DBAs, we got a bunch of questions that couldn’t be answered fast enough. Here you go:

Q: Do any of these techniques work for companies with just 1-2 IT staff?

To keep things simple, I’d check out virtualization replication like VMware SRM. SRM isn’t necessarily easy, but if you use that one technique, you can protect all of your virtual machines no matter what they’re running. That’s easier than learning different HA/DR techniques for a lot of different applications.

Q: For 1,000 databases on an instance, what uses more resources – mirroring or log shipping?

Technically, the answer is going to be log shipping because you’re probably already doing transaction log backups on those databases anyway. However, when you’re running thousands of databases per instance, several other interesting issues arise. How do you react to newly added databases? Can you back up all 1,000 in a timely manner, or do you need to roll your own transaction log backup jobs to run simultaneously? What’s your RPO/RTO? There’s a lot of neat questions that determine the real answer at this scale. Feel free to contact us for consulting help – we’ve got clients who run as many as 10,000 databases per instance, and we can help you learn lessons the easy way instead of the hard way.

Q: What happens if my DR is in a different datacenter, on another domain?

Your life is harder. Many built-in solutions become more challenging, and even just getting your users to be able to connect to the database become more challenging. This is a scenario where I like to step back and ask, “What problem are we trying to solve by using different domains?”

Q: Do my HA and DR servers have to be on the same subnet or VLAN?

No, all of SQL Server’s HA/DR features work on different subnets and VLANs.

Q: Do you have to kill all SQL Server connections before restoring a database?

Yes. Restoring a database – even a similar copy – means starting over with all-new data. Restoring transaction logs means SQL Server will be jumping around all over inside the database, and it doesn’t have time to deal with your queries. 😉

Q: Does clustering work for SSAS, SSIS, and SSRS?

Microsoft sells these things inside the same box as SQL Server, but for all practical purposes, you should consider them different products. They’re like free toys that just come in the same box, but they’re wildly different. The techniques you use to protect the SQL Server engine won’t be the same as the techniques you use for the rest.

Q: Can you read a database mirror?

Not directly, but you can take a snapshot of it and read the snapshot. I rarely see this used, though, because in order to refresh the snapshot, you have to drop it – and that means kicking everybody out of the snapshot. Users don’t seem to be a big fan of getting their queries killed.

Q: What Windows Server edition is needed for clustering and AlwaysOn Availability Groups?

You can use any version of Windows Server 2012, but if you’re still using Windows Server 2008R2, you’ll need Enterprise Edition.

Q: How do you get hosts in different data centers into the same cluster?

Clusters don’t need shared storage anymore. This is why I wrote the post Everything You Know About Clustering Is Wrong. Things really have changed a lot in the last several years around clustering.

Q: How well do these features work over a slow connection?

If you’ve got enough bandwidth to keep up with copying transaction log backups as they’re happening, you can stay up to date. If you can’t, they don’t work well.

Q: Can you install SQL Server Enterprise on Windows Standard?

Yes.

Q: I just joined the webcast and it’s almost over. Can you send me the slides?

You’d be surprised how often I get this question. This is why we make our material available for free on our YouTube channel. I don’t usually share the slide decks, though, because I’ve had some nasty incidents of people taking my slide decks, plagiarizing them, and presenting them as their own material.

Q: I’m using an AlwaysOn Availability Group. If I fail over to an asynchronous replica, will there be data loss?

Probably. If your primary has been doing a lot of work, and hasn’t been able to send all of that work to the asynchronous secondary, you’re doing to lose that data. It’s up to you to monitor the DMVs to see how far behind the replicas are.

Q: Is (feature A) better than (feature B)?

Yes.

Q: Really?

No. They both suck.

Q: Seriously, I need an answer.

Getting to the right answer means knowing your RPO, RTO, business needs, staff capabilities, hardware, network throughput, and more. If you ask a stranger to guess the right answer for you with just five minutes of background information, odds are it’s going to be the wrong answer. You’re going to have to buckle down and do some research into the features and your business needs, or you can bring in an experienced outsider who does this for a living. This is what we do, and we can help you get to the right answer as quickly as possible, and probably for less money than you’d expect.

Q: Is there a way of automating failovers for SQL Agent jobs for HA purposes?

If you’re using clustering, then this is built in. If you’re using a solution that does failovers at the user database level, then things get trickier because you have to know which jobs you want to fail over. For example, you probably don’t want backup jobs failing over (because you can just run those jobs on all nodes), but you might want a few database-based jobs failing over. That’s left as an exercise for the reader, though, because every shop’s needs are different – often even different between database servers in the same shop.

Q: What’s the right quorum configuration for my cluster?

Check out the excellent Books Online quorum sections for Windows Server 2012 and Windows Server 2008/R2. It’s a lot of reading, but it’s worth it – the quorum decision is a big one.

Q: Can I integrate clustering with (mirroring/log shipping/replication/AlwaysOn AGs)?

Clustering is probably the strongest feature in terms of interoperability with other features. It’s been around a long time, so it’s got a good interoperability story for pretty much everything.

Q: How can one second RTO be achieved in a DR scenario?

By having both the production and disaster recovery systems live at all times, accepting queries. At that point, it’s up to the application logic to try both servers to see which one is available fastest. Some applications do this by timing out their database connections after just one second. Other applications do it by trying both database servers simultaneously and taking the first available connection.

Q: Can you back up a SQL Server to SQL Azure?

Not in the sense that you can restore directly into SQL Azure (or as it’s known now, Windows Azure SQL Database.) However, you can use Windows Azure as a backup solution if you’re so inclined.

Q: Azure is a French word meaning blue. Americans are not pronouncing French correctly.

French is the most beautiful language in the world – when it’s spoken by the French. When the rest of us try it, we sound like we’re coughing up an unfiltered Marlboro. Thank you for your cheese, wine, and wonderful words like lingerie and ménage à trois. You are a beautiful country, and the rest of us apologize for the sounds we make when confronted with French.

Q: Can you cluster inside VMware?

Yes, but if you’re asking the question, you shouldn’t be doing it.

Q: What are the best practice recommendations for achieving 99.999% uptime?

Obviously this is way beyond what I can answer quickly, but the best advice I can give is to remember that there’s three parts to every solution: people, process, and technology. You need written processes that cover exactly how to react to an outage, and you need to rehearse and improve those processes constantly. Car racing teams practice over and over to get the fastest pit stop possible so that their race car spends more time driving and less time sitting. DBAs need to do the same.

Q: What HA option is Brent’s personal favorite, and why?

I like failover clustering because it protects everything in the instance – jobs, logins, and the server name itself – with as little manual intervention as possible. It still has weaknesses in a few single points of failure, and it’s not completely easy to implement and manage. I think the benefits still outweigh the costs.

How Much Cache Do You Have?

Last Updated May 8, 2013

Jeremiah Peschka

Without looking in your wallet, do you know how much cash you have? Most of us know within a few dollars. Now, without looking in your SQL Server, do you know much data is cached in memory? You probably don’t and that’s okay; you shouldn’t know how much data SQL Server is caching in memory. We can’t control how much data SQL Server is caching, but we can control how we cache data.

Different Types of Cache

There are a lot of different ways to approach caching. One of the most prevalent ways involves thinking about cache in two different levels (much like CPU cache): first level cache and second level cache.

First level cache lives in the application and second level cache is in a separate service — Green means go!

First level cache is an immediate, short-lived cache that works within a single session to attempt to minimize database calls. Unfortunately, first level cache is only used for the duration of a current session or transaction (depending on your terminology). This is very short lived and it’s only useful to the current process. While helpful, first level cache has a limited scope.

There’s another type of cache: second level cache. Second level cache exists outside of the current process and can be shared between multiple transactions, processes, servers, or even applications. When we talk about adding cache to an application, we really mean second level cache.

A Bit of Cache

Even the most basic of ORMs have a little a bit of cache available. The first level cache is used as a short lived buffer to reduce the amount of work that the ORM has to do. First level cache is used for caching objects in the current transaction and query text. Although this cache can be helpful for the current process, this cache isn’t shared across multiple processes or even multiple database batches. If we want to have a more robust cache, we have to look elsewhere.

ORMs like Entity Framework or the LLBLGen Framework don’t have a second level cache. It’s up to developers to add a cache when and where they need it. This exposes developers to additional concerns like cache invalidation, cache updates, and query caching. All of these features and functionality may not be necessary, but that’s an acceptable trade off – it’s up to developers to implement cache features in ways that support application requirements.

Although it takes up developer time, building the second level cache yourself has the benefit of creating a cache that’s suited to the application’s requirements. For many application level features, this is good enough. It’s important, though, that developers pick a caching layer capable of meeting their operational requirements. Operational requirements include horizontal scalability, redundancy and fail over, recovery of cached data, or customizable cache expiration on an object-by-object basis.

These basic ORMs aren’t really all that basic – they have full features in other parts of the ORM, but they only offer basic support for automatic caching through the ORM.

A Lot of Cache

You’ve got memory. You want to use it to cache data. What’s the easiest way to do that?

One of the easiest approaches to adding caching to your application is to use a framework that supports it out of the box. A number of ORMs, including both Hibernate and NHibernate, provide this support. Enabling cache is easy – just change a few lines in a configuration file and the cache will be available to your application. Things start getting tricky, though, when you examine the richness of the caching that’s provided by these tools.

Power comes with a price. When you’re getting starting with tools like Hibernate or NHibernate, there’s a lot to take in and many developers overlook these features. Developers can choose on an object by object basis which caching strategy should be applied. Based on business requirements we can choose to treat certain cacheable objects as read only while others can be used as a read/write cache. Some objects can be cached while others bypass the secondary cache entirely – there’s a lot of complexity for developers to manage.

While this can be overwhelming, this flexibility serves a purpose – not all features of an application have the same requirements. Some features can serve old data to users, other features need to be up to the minute or up to the second. Giving developers the ability to make these choices means that there is a choice to be made. Even if it’s a difficult one, developers can choose how the application behaves and can tailor performance and functionality to business requirements.

Making the Choice

If you’ve already got an existing project and you’re planning on adding a caching layer, don’t think that you have to re-implement your data access layer just to get better support for caching. Both approaches have their benefits and it’s far more important to be aware of which data needs to be cached and the best way to cache it.

Cloud Alternatives to Microsoft SQL Server

Last Updated February 13, 2017

Amazon Web Services (AWS), Cloud Computing, SQL Server

When people say “cloud”, they’re simplifying a lot of different solutions into a single catchphrase. Let’s break out the different options and compare them.

1. SQL Server in Amazon EC2 and Azure VMs

Amazon EC2 is a virtualization platform. Amazon buys servers, installs their secret sauce software, and rents you Windows virtual machines by the hour. Microsoft offers a similar product, Windows Azure Virtual Machines, that just went officially live.

You can rent a blank Windows VM without SQL Server installed, and then install SQL Server yourself just like you would on-premise. That’s a licensing mess, though – you have to use your existing SQL Server licenses or buy new ones for your VMs. That doesn’t make much financial sense. Instead, Amazon and Microsoft will rent you a Windows VM with SQL Server already configured, and your hourly fee includes the SQL Server licensing.

SQL Server runs just as it would in your own datacenter, which means you can use this as a disaster recovery option for your on-premise SQL Servers. You can do log shipping or database mirroring up to SQL in the cloud, running in Amazon EC2 or Microsoft Azure. When disaster strikes, fail over to your EC2/MS infrastructure, and you’re off and running.

The term “cloud” conjures up images of easy scalability and redundancy, but that’s not really the case here. We’re talking about a single virtual machine. This isn’t much different from running SQL Server in a VMware or Hyper-V guest in your own datacenter or in a colo datacenter. You can use all your traditional tools and techniques to manage SQL Server, which is both a pro and a con. If you need to patch it or scale out to multiple servers, there’s no tools included here. I still consider this the cloud, though, because the infrastructure and licensing are managed by somebody else. It’s easy to get started with one – or a hundred – virtual SQL Servers with no initial investment.

This method is the most conventional, and as I explain other options, I’m going to move from conventional to really-far-out-there. SQL in EC2 or MS VMs just works – it’s easy to understand and leverage without changing your code or your techniques – but it doesn’t bring a lot of the cloud’s benefits.

2. Amazon RDS for SQL Server

Instead of running SQL Server in an EC2 virtual machine, let’s start giving up a little bit of control in order to get more of cloud benefits. The next layer of clouds is Amazon Relational Database Service (RDS). Here, Amazon builds a Windows VM, installs SQL Server, configures it, and manages both Windows and the SQL Server service for you.

This is still the real, true blue SQL Server you know and love – all of your commands still work exactly the same as you’re used to, as long as you don’t try to access the server’s local drives directly. (Example: you can’t upload a flat file to the server’s C drive and then try to BCP data from that file into SQL Server.)

Amazon RDS is kinda like they’re the DBA, and you’re a very, very powerful developer. You can create and drop databases and users, but you can’t Remote Desktop into the SQL Server, nor can you access the drives.

Amazon RDS takes a few job duties away from you:

Storage management – want faster storage? Just pick (and pay for) more IOPs. There’s no arguing with the SAN guy.
Monitoring – Amazon CloudWatch tracks performance metrics and sends you emails when they’re outside of your thresholds.
Patching – You pick the major/minor versions you want and when you want patches applied.
Backups – You pick a time window for the full backups, and Amazon manages it using storage snapshots. You can restore from snapshots just by pointing and clicking in the management console.

But herein lies the first real compromise: you can’t restore from anything except snapshots. You can’t upload your own database backup file and restore it. To get data into Amazon RDS, you’ll want to export it to a file, upload that file to Amazon S3 (cloud-based file storage), and then import it. This also means you can’t use Amazon RDS as a participant in log shipping or database mirroring.

Microsoft doesn’t have a competitor to Amazon RDS for SQL Server today. Well, I say that, but some shops already manage their SQL Servers this way – they have an internal admin team that manages Windows and SQL. Departments get access to create & drop databases, change code, etc, but they don’t get access to the server’s desktop or backups. This doesn’t really compete with Amazon RDS, though – RDS is for companies who are too small to have this kind of internal engineering infrastructure. (Or for companies that want to get rid of this large engineering burden, I suppose.)

3. VMware vFabric Data Director

vFabric Data Director is a lot like running Amazon RDS in your own datacenter, but you can control the base Windows virtual machines. You build a Windows template to be used by default whenever a new SQL Server is created. VMware vFabric manages the implementation details for backups and high availability.

vFabric also supports Oracle, PostgreSQL, and Hadoop for a single pane of glass to create and manage your database servers. When someone in your company wants a new database instance, the sysadmins open up vFabric, configure it, and within a few minutes, it’s up and running.

vFabric makes sense for 100% virtualization shops who aren’t interested in moving their databases up to the cloud, but they want easier database management integrated into their virtualization tools.

Microsoft doesn’t have a competitor to VMware vFabric Data Director today. In theory, you could build your own alternative using System Center and a whole lotta scripting. That is left as an exercise for the reader.

4. Microsoft Windows Azure SQL Database (WASD)

The artist formerly known as SQL Azure takes the least conventional approach of all. While it’s technically built on Microsoft SQL Server, that’s like saying your pants are built on your underpants. They both cover your back end, but you can’t use them interchangeably.

Microsoft essentially built a new product designed for common database storage requirements. Like your underpants, you only get a minimum of feature and command coverage here. The new features and commands Microsoft has been adding to the boxed product for the last couple of versions just aren’t available in WASD including partitioning, Resource Governor, Service Broker, and CLR. But hey – are you really using those anyway? Most folks aren’t.

Rather than building large databases, WASD encourages developers to shard out their data across multiple smaller databases. While on-premise SQL Servers have had techniques to do this in the past, Microsoft started over and developed a new technique that makes more sense for cloud implementations. Again, though – we’re talking about a difference from the boxed product, something else that developers have to learn differently. As we’ve gone up this ladder into the clouds, we’ve been handling our problems differently. WASD’s partitioning technique is a good reminder that once you’ve gone this far up into the cloud, you’re dealing with something very different from SQL Server. You’re not going to take large volumes of code written for SQL Server and simply point them at WASD – you’re going to be doing a lot of testing and code changes.

Like Amazon RDS, there’s no backup/restore functionality here to get your existing data into the cloud. You’ll be exporting your data to a file, loading it into the cloud, and then…staying there. There’s no database mirroring or log shipping to/from Azure SQL Databases to on-premise SQL Servers.

Which One is Right for You?

If you’ve got an existing app, and you just want to cut costs without changing code, you can get started today with any of the first three options. They support the same T-SQL commands and datatypes you already know and love.

If you’ve got an existing app, and you’re willing to make code changes & do testing, you might be able to save even more plus gain new flexibility by going to Windows Azure SQL Database. In most cases, though, the cost savings won’t come anywhere near the costs required for the code changes and code review. We haven’t seen a case yet where the tradeoff made sense for our clients. I’m sure it’s out there – the perfect example would be a small amount of code that the developers know very well, can easily modify, and produces large server loads in short bursts.

If you’re building a new app from scratch, then let’s take a step back and survey the database industry as a whole. This is an incredible time to be in databases and there’s a bunch of really good options we didn’t even cover here.

Learn more about our SQL Server in Amazon cloud resources page.

Brent Answers Your Virtualization Questions

Last Updated April 12, 2013

SQL Server, Virtualization

I’m doing a 1-hour open Q&A session on May 8th for the PASS Virtualization Virtual Chapter. Bring your VMware and Hyper-V questions about setup, performance, management, monitoring, or whatever, and I’ll answer ’em.

You can even get a head start here – post your questions in the comments below, and I’ll build slides to answer ’em ahead of time. That way you can make sure you get the best answer possible. (Well, from me anyway, ha ha ho ho.)

Then come join us on the webcast and hear the answers. See you there!

3 Things You Need to Stop Doing to Your Database Server

Last Updated April 9, 2013

Jes Schultz

No one knows everything, especially about SQL Server. When you encounter a problem, especially for the first time, you may not know how to solve it. I’ve seen some crazy approaches taken to fix problems – like duct tape on a car. Many times, instead of fixing the root cause of a problem, such as a poorly-performing query, a server-level change is made.

Here are three rolls of duct tape to avoid.

car1 — I hope Brent can see out of the windshield. Don’t tell him I took the company car.

Clearing the Plan Cache Regularly

“We found that the system would be running fine, then a user would run a report with a certain set of parameters, and it would slow down. Running DBCC FREEPROCACHE every day solved that problem.”

When you run DBCC FREEPROCACHE, all the query execution plans in memory are dropped. The next time every query is run, SQL Server will have to compile a plan, and this is expensive. CPU usage will increase. Queries will run slower the first time they are executed. Is it worth dumping the hundreds or thousands of plans that are saved for reuse for the sake of one or two that are bad?

If a query runs acceptably most of the time, but runs very slowly at other times, check into what else is running on the server at the same time, and also investigate statistics and parameter sniffing. Often, you’ll find that a simple change can fix the problem – without taking drastic measures that affect the whole server.

Shrinking Files Regularly

“Tempdb grows from its normal 25-30 GB to 100 GB every Wednesday night. Our drive is only 100 GB in size, so I set up a job to shrink it back down every Thursday morning.”

Or, your log file doubles in size once a week, so you create a job to run DBCC SHRINKFILE weekly, too.

If this happens, there is something running in your database that is causing the data or log file to grow. A large index is rebuilt frequently. A batch job is run to insert a large amount of data all at once, instead of in sections. You need to find the source of the problem and fix it. Constantly shrinking and regrowing files leads to fragmentation, and fragmentation leads to poor performance.

Combining Development and Production Databases on the Same Server

“Our database performance has been really bad the last month. We are testing new features in our development database, which is on the same server as our production database, and I think that might be the problem.”

When you write new code, you need to test it first. You absolutely should have separate development and production databases. You also should keep them on separate servers. When they share a server, they share resources, and poorly performing queries that are being tested in development can bring production to a screeching halt.

You have many options for setting up a development server. Trying to find hardware? Use an older server that has reached the end of its production lifecycle, or, if you have a virtual environment, request a virtual SQL Server. Worried about licensing? For development only, you can purchase a copy of SQL Server Developer Edition for about $50 per user.

Learning from the Past

These are only three of the valuable lessons I’ve learned from working with SQL Server. Finding the root cause of a performance problem and solving it is a much better long-term solution than implementing a short-term workaround.

sp_Blitz® Version 18 Out: Lots of Bug Fixes

Last Updated March 26, 2014

There’s a new version in town. v18 adds new checks looking for:

Backups stored on the same drive letters as user databases (Thanks @AlinSelicean!)
Agent jobs that don’t have failure notifications set up (Thanks @Thirster42!)
Shows free drive space per drive letter if you pass @CheckServerInfo = 1 in. We don’t consider this a problem – we just show additional info if you use this parameter.
Added the elevated database permissions check back in (whoops! dropped it accidentally)

Plus bug fixes and improvements including:

Lots and lots of typos
Ignore max file sizes for filestream data files
Switched a lot of @@SERVERNAME usage out with SERVERPROPERTY(‘ServerName’) because in some SQL Server cloud implementations, those don’t match, and it’s okay.
Changed database name variables to be NVARCHAR(128) to handle those monster SharePoint databases
Improved the non-aligned-indexes check to return results even if the index hasn’t been used

And more. Version 17 also added a neat new exclusions table parameter – if you’re the kind of DBA who wants to automate sp_Blitz® data collection throughout the environment, but you want to skip certain checks or certain databases, you can do that now. I haven’t written up full instructions on that, but the basic idea is pretty obvious within reading the first 100-200 lines of the code.

As always, you can get the latest version and instructions from https://www.brentozar.com/blitz/. If you’d like to contribute code, you can email us at Help@BrentOzar.com. I’m still working through a backlog of about 20 more new check contributions, and we’ve got some fun stuff coming – including a thank-you page listing the contributors and their work.

DBAs vs Devs: ORMs, Caching & Access to Prod

Last Updated January 5, 2019

Developers and database administrators frequently clash on major questions about how to scale application performance. Are they just being stubborn, or is one side really right? Jeremiah and I recently gave a talk on this topic to the AtlantaMDF User group and we wanted to open this debate to the world.

Presenting *Developers vs DBAs* on three screens at the Atlanta MDF User Group
Photo courtesy of Aaron Nelson

Someone’s Right and Someone’s Wrong

Developers and DBAs are usually being stubborn, and for good reason. DBAs tend to be tasked with being the protectors of an environment, while developers are on a mission to ship more features, make changes, and bring in more money. Each side has a reason for their views, but on most controversial topics, one side actually is more right than the other.

ORMs

Most DBAs believe that Object Relational Mapping tools (ORMs) write terrible code and that stored procedures are a better data access layer. This belief is widespread because database administrators struggled for years with early versions of NHibernate and Entity Framework.

Some of the growing pains hurt: everything from type mismatches to hilariously long and tangled queries have burned people. But developers embrace ORMs for good reasons. These tools really do help build, test, and ship features more quickly. If you become a SQL Server performance tuning specialist, you can get the best of both worlds– but you have to let go of some of your hangups about ugly SQL queries.

Application Caching

It’s hard to argue against application caching… at least unless you’ve tried to implement it and come up frustrated. DBAs argue that developers should cache everything, while developers can tell you how that isn’t as easy as it sounds.

We know that usually there IS a place for caching, you just need to know how to find the “biggest bang for your buck” opportunities. (We’ve got some slides and scripts below to get you started.)

Access to Production

Who hasn’t had this argument over and over? Usually developers are fighting to get into production and DBAs are fighting to keep them out. Lots of time gets wasted on this topic needlessly.

We created tools like sp_BlitzIndex® to help people share information for performance tuning more easily. You just gotta find the right tools and the right level of access that lets everyone do their jobs (and keep their jobs).

Get the Scripts

Download ’em here.

Join Us Live

We train developers and DBAs diagnose the real bottlenecks in their SQL Servers and use data to get beyond the problems we just described. Join us at a live training event to learn how to performance tune SQL Server.

Catching SQL Server System Object Changes

Last Updated March 3, 2019

Let’s say you get your hands on a brand new version of SQL Server that you haven’t played with before, and you want to know what system objects have changed – DMVs, DMFs, system stored procs, yadda yadda yadda. Sure, you could read the documentation – but ain’t nobody got time for that, and they’re probably leaving out the juicy stuff, right? Here’s what I do (thanks to improvement suggestions from Kendra):

First, create a linked server between the new version and the old version. This lets us query across servers and join things together. In my examples, I’ll be running my queries on the new SQL Server, which I’ll be calling local, and the linked server will point to the remote server.

Identify new system objects with this query:

/* Objects - master database */
SELECT oNew.*
FROM [172.16.189.131].master.sys.all_objects oNew
LEFT OUTER JOIN master.sys.all_objects oOld
ON oNew.object_id = oOld.object_id AND oNew.is_ms_shipped = 1
WHERE oOld.name IS NULL
AND oNew.is_ms_shipped = 1
ORDER BY oNew.type_desc, oNew.name;
GO

/* Objects - master database */

SELECT oNew.*

FROM [172.16.189.131].master.sys.all_objects oNew

LEFT OUTER JOIN master.sys.all_objects oOld

ON oNew.object_id = oOld.object_id AND oNew.is_ms_shipped = 1

WHERE oOld.name IS NULL

AND oNew.is_ms_shipped = 1

ORDER BY oNew.type_desc, oNew.name;

We’re getting all local system objects that don’t have a matching system object on the other (remote) server. The “172.16.189.131” is the server name for our older SQL Server box that we’re comparing against – our linked server name. The results give you the new functions, stored procs, you name it:

New-Objects-2 — Ooo! New Shiny System Objects!

We’re joining on object_id, but that’s a gamble – Microsoft could use different object_ids between versions if they wanted to. If we wanted to filter those out, we could join on name, schema_id, and type_desc (view, stored proc, etc), but that gets a little sketchy. For example, there’s now two CHECK_CONSTRAINTS views in SQL Server 2012 with different object_ids, and we wouldn’t catch that if we were only joining on names. In the below screenshot, the top result set is from SQL Server 2012, and the bottom is from SQL Server 2008R2:

Check-Constraints — One of these servers is not like the others, one of these servers just doesn’t belong

Next, we want to find new or changed columns in existing objects. Here’s how:

/* New columns in existing objects - master */
SELECT oNew.type_desc, oNew.name, cNew.*
FROM [172.16.189.131].master.sys.all_objects oNew
INNER JOIN [172.16.189.131].master.sys.all_columns cNew ON oNew.object_id = cNew.object_id
INNER JOIN master.sys.all_objects oOld ON oNew.object_id = oOld.object_id
LEFT OUTER JOIN master.sys.all_columns cOld
ON cNew.object_id = cOld.object_id AND cNew.column_id = cOld.column_id --AND cLocal.name = cRemote.name COLLATE DATABASE_DEFAULT
WHERE cOld.name IS NULL
AND oNew.is_ms_shipped = 1
ORDER BY oNew.type_desc, oNew.name, cNew.column_id;
GO

/* New columns in existing objects - master */

SELECT oNew.type_desc, oNew.name, cNew.*

FROM [172.16.189.131].master.sys.all_objects oNew

INNER JOIN [172.16.189.131].master.sys.all_columns cNew ON oNew.object_id = cNew.object_id

INNER JOIN master.sys.all_objects oOld ON oNew.object_id = oOld.object_id

LEFT OUTER JOIN master.sys.all_columns cOld

ON cNew.object_id = cOld.object_id AND cNew.column_id = cOld.column_id --AND cLocal.name = cRemote.name COLLATE DATABASE_DEFAULT

WHERE cOld.name IS NULL

AND oNew.is_ms_shipped = 1

ORDER BY oNew.type_desc, oNew.name, cNew.column_id;

Note that we’ve commented out a join on name – if you’re worried that existing columns might have changed names, that’s how you’d catch it. The results:

New-DMV-columns — I see what you did there.

Presto – we can see what new instrumentation fields Microsoft is giving us. This helps script authors do a better job of testing their changes across multiple versions. When someone sends me an sp_Blitz® improvement that relies on a DMV or field I’m not familiar with, I can quickly check multiple servers to see if it’s available across all of my lab servers, or which version it appeared in.

Here are several scripts that can catch all kinds of changes:

/* Objects - master database */
SELECT oNew.*
FROM [172.16.189.131].master.sys.all_objects oNew
LEFT OUTER JOIN master.sys.all_objects oOld
ON oNew.object_id = oOld.object_id AND oNew.is_ms_shipped = 1
WHERE oOld.name IS NULL
AND oNew.is_ms_shipped = 1
ORDER BY oNew.type_desc, oNew.name;
GO


/* Objects - msdb database */
SELECT oNew.*
FROM [172.16.189.131].msdb.sys.all_objects oNew
LEFT OUTER JOIN msdb.sys.all_objects oOld
ON oNew.name = oOld.name COLLATE DATABASE_DEFAULT AND oNew.is_ms_shipped = 1
WHERE oOld.name IS NULL
AND oNew.is_ms_shipped = 1
AND oNew.type_desc NOT IN ('DEFAULT_CONSTRAINT', 'FOREIGN_KEY_CONSTRAINT', 'PRIMARY_KEY_CONSTRAINT', 'UNIQUE_CONSTRAINT')
ORDER BY oNew.type_desc, oNew.name;
GO

/* New columns in existing objects - master */
SELECT oNew.type_desc, oNew.name, cNew.*
FROM [172.16.189.131].master.sys.all_objects oNew
INNER JOIN [172.16.189.131].master.sys.all_columns cNew ON oNew.object_id = cNew.object_id
INNER JOIN master.sys.all_objects oOld ON oNew.object_id = oOld.object_id
LEFT OUTER JOIN master.sys.all_columns cOld
ON cNew.object_id = cOld.object_id AND cNew.column_id = cOld.column_id --AND cLocal.name = cRemote.name COLLATE DATABASE_DEFAULT
WHERE cOld.name IS NULL
AND oNew.is_ms_shipped = 1
ORDER BY oNew.type_desc, oNew.name, cNew.column_id;
GO

/* New columns in existing objects - msdb */
SELECT oNew.type_desc, oNew.name, cNew.*
FROM [172.16.189.131].msdb.sys.all_objects oNew
INNER JOIN [172.16.189.131].msdb.sys.all_columns cNew ON oNew.object_id = cNew.object_id
INNER JOIN msdb.sys.all_objects oOld ON oNew.name = oOld.name COLLATE DATABASE_DEFAULT AND oNew.schema_id = oOld.schema_id
LEFT OUTER JOIN msdb.sys.all_columns cOld
ON oOld.object_id = cOld.object_id AND cNew.column_id = cOld.column_id
WHERE cOld.name IS NULL
AND oNew.is_ms_shipped = 1
ORDER BY oNew.type_desc, oNew.name, cNew.column_id;
GO

/* New messages */
SELECT mNew.*, '<li>' + CAST(mNew.message_id AS VARCHAR(10)) + ': ' + mNew.text + '</li>' AS text_html
FROM [172.16.189.131].master.sys.messages mNew
LEFT OUTER JOIN master.sys.messages mOld
ON mNew.message_id = mOld.message_id AND mNew.language_id = mOld.language_id
WHERE mOld.message_id IS NULL
AND mNew.language_id = 1033
ORDER BY mNew.message_id;
GO

/* Configurations */
SELECT cNew.*
FROM [172.16.189.131].master.sys.configurations cNew
LEFT OUTER JOIN master.sys.configurations cOld ON cNew.configuration_id = cOld.configuration_id
WHERE cOld.configuration_id IS NULL;


/* Perfmon counters */
SELECT cNew.*, cOld.*
FROM [172.16.189.131].master.sys.dm_os_performance_counters cNew
LEFT OUTER JOIN master.sys.dm_os_performance_counters cOld 
    ON SUBSTRING(TRIM(cNew.object_name),CHARINDEX(':',TRIM(cNew.object_name))+1,LEN(TRIM(cNew.object_name))) = SUBSTRING(TRIM(cOld.object_name),CHARINDEX(':',TRIM(cOld.object_name))+1,LEN(TRIM(cOld.object_name)))
   AND TRIM(cNew.counter_name) = TRIM(cOld.counter_name)
--   AND TRIM(cNew.instance_name) = TRIM(cOld.instance_name)
WHERE cOld.object_name IS NULL
ORDER BY cNew.object_name, cNew.counter_name;

/* Objects - master database */

SELECT oNew.*

FROM [172.16.189.131].master.sys.all_objects oNew

LEFT OUTER JOIN master.sys.all_objects oOld

ON oNew.object_id = oOld.object_id AND oNew.is_ms_shipped = 1

WHERE oOld.name IS NULL

AND oNew.is_ms_shipped = 1

ORDER BY oNew.type_desc, oNew.name;

/* Objects - msdb database */

SELECT oNew.*

FROM [172.16.189.131].msdb.sys.all_objects oNew

LEFT OUTER JOIN msdb.sys.all_objects oOld

ON oNew.name = oOld.name COLLATE DATABASE_DEFAULT AND oNew.is_ms_shipped = 1

WHERE oOld.name IS NULL

AND oNew.is_ms_shipped = 1

AND oNew.type_desc NOT IN ('DEFAULT_CONSTRAINT', 'FOREIGN_KEY_CONSTRAINT', 'PRIMARY_KEY_CONSTRAINT', 'UNIQUE_CONSTRAINT')

ORDER BY oNew.type_desc, oNew.name;

/* New columns in existing objects - master */

SELECT oNew.type_desc, oNew.name, cNew.*

FROM [172.16.189.131].master.sys.all_objects oNew

INNER JOIN [172.16.189.131].master.sys.all_columns cNew ON oNew.object_id = cNew.object_id

INNER JOIN master.sys.all_objects oOld ON oNew.object_id = oOld.object_id

LEFT OUTER JOIN master.sys.all_columns cOld

ON cNew.object_id = cOld.object_id AND cNew.column_id = cOld.column_id --AND cLocal.name = cRemote.name COLLATE DATABASE_DEFAULT

WHERE cOld.name IS NULL

AND oNew.is_ms_shipped = 1

ORDER BY oNew.type_desc, oNew.name, cNew.column_id;

/* New columns in existing objects - msdb */

SELECT oNew.type_desc, oNew.name, cNew.*

FROM [172.16.189.131].msdb.sys.all_objects oNew

INNER JOIN [172.16.189.131].msdb.sys.all_columns cNew ON oNew.object_id = cNew.object_id

INNER JOIN msdb.sys.all_objects oOld ON oNew.name = oOld.name COLLATE DATABASE_DEFAULT AND oNew.schema_id = oOld.schema_id

LEFT OUTER JOIN msdb.sys.all_columns cOld

ON oOld.object_id = cOld.object_id AND cNew.column_id = cOld.column_id

WHERE cOld.name IS NULL

AND oNew.is_ms_shipped = 1

ORDER BY oNew.type_desc, oNew.name, cNew.column_id;

/* New messages */

SELECT mNew.*, '<li>' + CAST(mNew.message_id AS VARCHAR(10)) + ': ' + mNew.text + '</li>' AS text_html

FROM [172.16.189.131].master.sys.messages mNew

LEFT OUTER JOIN master.sys.messages mOld

ON mNew.message_id = mOld.message_id AND mNew.language_id = mOld.language_id

WHERE mOld.message_id IS NULL

AND mNew.language_id = 1033

ORDER BY mNew.message_id;

/* Configurations */

SELECT cNew.*

FROM [172.16.189.131].master.sys.configurations cNew

LEFT OUTER JOIN master.sys.configurations cOld ON cNew.configuration_id = cOld.configuration_id

WHERE cOld.configuration_id IS NULL;

/* Perfmon counters */

SELECT cNew.*, cOld.*

FROM [172.16.189.131].master.sys.dm_os_performance_counters cNew

LEFT OUTER JOIN master.sys.dm_os_performance_counters cOld

ON SUBSTRING(TRIM(cNew.object_name),CHARINDEX(':',TRIM(cNew.object_name))+1,LEN(TRIM(cNew.object_name))) = SUBSTRING(TRIM(cOld.object_name),CHARINDEX(':',TRIM(cOld.object_name))+1,LEN(TRIM(cOld.object_name)))

AND TRIM(cNew.counter_name) = TRIM(cOld.counter_name)

-- AND TRIM(cNew.instance_name) = TRIM(cOld.instance_name)

WHERE cOld.object_name IS NULL

ORDER BY cNew.object_name, cNew.counter_name;

Choosing a Presentation Tool

Last Updated March 29, 2013

Writing and Presenting

If your New Year’s resolution is to start presenting at user groups and conferences, congratulations! You’re about to embark on a fulfilling journey that will enrich the lives of thousands of people. It’s a blast. Now let’s put some thought into the tool you’re going to use.

You’re going to be working on these same sessions for years – growing them, expanding them, building them into all-day training. Don’t pick a flash-in-the-pan presentation technology that might not be here in a couple of years. Pick the most reliable technology you can find.

If you’re lucky, you can turn your presentations into an entire company. When that happens, you want your employees to be able to give your presentations to clients when possible. Don’t use a hard-to-understand technology – use the simplest, most straightforward way to get your point across.

If you’re unlucky, your laptop will fail right before you walk onstage. Save all your presentations on a USB drive (mine’s on my keychain) so that when disaster strikes, you can turn to the nearest attendee and say, “Mind if I borrow your laptop to present? You’ve got (technology x) on it, right?” Attendees love to say yes, but you’re not going to have the time or bandwidth to download & install new software, and you shouldn’t be installing anything on a stranger’s laptop anyway. I present about Microsoft technologies, so my audience usually has PowerPoint installed. I’ve presented from attendee laptops more than once.

7228110070_4a87e0b22e_z — My presentation gear at work. Photo by Michael Kappel.

Use a technology that allows you to move around the stage while advancing slides. When you just get started presenting, you’ll probably stand behind the podium, gripping it tightly, fearing that you’ll fall over if you let go. After a few sessions, you’ll gain the confidence to move around the stage and use positioning just like actors do. You’ll want a technology that lets you use a remote control. I use the $60 Logitech R800 because it’s also got a built-in countdown timer. I can glance down to see how much time I’ve got left, and it vibrates when I start running out of time.

Use a technology that allows for easily exportable, not-easily-editable content. I export my sessions to PDF and give ’em away to attendees. If you give away the native format, some attendees will take your session, edit out your name, and re-present it as their own. Of course, if you’re okay with that (and I am for some of my sessions), then take the other tack – use a technology that your attendees will all have, and will be able to quickly edit and re-present.

The export needs to stand on its own, including in a printed version. If your technology relies on animations to get the point across, and the export doesn’t include the animations, it won’t work. Personally, I do all my animations by having a separate slide for each step. That way even if you’re reading along in a printed handout, you can follow what I’m doing. At all-day training sessions, I’m amazed at how many attendees love following along in a printed copy and writing notes on it. Many attendees don’t have laptops/tablets that can last all day on battery for note-taking, and many conferences don’t have power outlets for every attendee during all-day training sessions.

A couple/few years into your journey, you’re going to be so proud of your chosen technology. You’ll have polished that knife to a sharp edge. At that point, it’s time to step back and pick up a new knife. Try a new tool, and start sharpening that too. The more knives you have, the better chef you’ll be, because different knives work better for different foods.

Every ~~knife~~ technology will fail on you. The slides, the demo code, the services, the remote, the laptop, the projector, the microphone, all of it. If you’ve polished multiple knives, you’ll be completely comfortable when one tool fails. I’ll never forget the time when I was presenting slides via projector about disaster recovery for databases, and midway through my session, the entire conference center’s power went out. I grabbed the whiteboard markers, eager to sketch out the concepts for the rest of the session. Those moments make your reputation as a presenter.

Having said all that, here’s the tools I use:

Text editor – I storyboard my sessions in a text editor first, using one line per slide. I write down the things I need to teach during the presentation, then I start adding and arranging, turning it into a story. When I’m done arranging the story, then I decide which portions need demos, diagrams, pictures, etc. If there’s an overwhelming theme, I try to pick just that one method of delivery. For example, if your session is 80% demos, dump the slides and just put comments in the code. Zoom in onscreen to show those comments in large text. If I can get the entire session delivered in just one tool, it makes the session easier for attendees to digest. If I do have to use two tools (like slides & demos, or slides & whiteboard) then I want to minimize the number of transitions back & forth.

Microsoft PowerPoint – I’m not a big fan of it, but it’s the de facto standard in the MS database world. Many MS conferences require me to use a PowerPoint template, plus I have to upload my slides to them for approval, and they’ll make edits and send it back. This just comes down to knowing your audience and picking a tool all the attendees will have. At our company, we’ve started breaking up our slides into mini-decks that our employees can reuse and present to clients. For example, I might have a 1-hour session on tuning databases, and then rip out two 15-minute sections of that to turn into mini-decks. When a client has a question and there’s a minideck to answer it, the employee can whip it out and give the client the best answer possible.

Demo code – think of this as a standalone presentation tool. If you can do a whole session in here (including the title, about-me slide, and resource slide), do it.

Whiteboard – I’ve always casually used this knife to handle attendee Q&A live, but I’m starting to polish it. I’m picking between iPad teaching tools to find one that lets me zoom in & out, record the session, write with a keyboard, etc. I want to get to the point where I can deliver a 1-hour session entirely via iPad whiteboard projected onscreen, and get good feedback from the attendees that it was the best way to learn a particular concept.

Want more tips like this? Check out our past presenting posts.

Saving Session State (video)

Last Updated February 13, 2017

Jeremiah Peschka

Architecture

Session state frequently ends up on a busy SQL Server. What seemed like a good idea in development turns into a problem in production. While there are valid business reasons for persisting session state to permanent storage; there are equally valid reasons to avoid using SQL Server as the permanent storage. We’ll investigate why session state poses problems for SQL Server and cover an alternate solution that allows for persistent session state. This talk is for developers and DBAs who want a better way to safely track ASP.NET session state.

Links and References

An Introduction to SQL Server IO for Developers (video)

Last Updated March 22, 2013

SQL Server, Videos

Developers frequently make a big mistake: they tune out on discussions of storage performance and IO. Don’t fall into this trap! Understanding how storage impacts SQL Server performance helps you tune the right parts of your code and focus on what really matters. In this 30 minute video, Microsoft Certified Master Kendra Little demonstrates why developers need to understand the IO impact on writes and reads in SQL Server. She’ll show you how simple configuration changes can sometimes save hundreds of hours of developer time.

Interested in the tools or references in the video? Check out the links at the bottom of this post.

Links and References

Learn more about Hammerora for SQL Server, the load testing tool I use in the video
Gather scripts on wait stats using our SQL Server First Responder Kit scripts
Check out sp_whoisactive, the stored procedure by Adam Machanic I use to check on what’s currently running in the demo

Log Shipping FAQ

Last Updated March 18, 2017

Jes Schultz

Log Shipping, SQL Server

390

I’ve been working with and talking about SQL Server Log Shipping a lot lately!

I’ve gotten a lot of great questions about log shipping, so I’ve put together an FAQ.

What editions of SQL Server is log shipping available in?
- 2012 – Enterprise, Business Intelligence, Standard, and Web
- 2008R2 – Datacenter, Enterprise, Standard, Web, and Workgroup
- 2008 – Enterprise, Standard, Web, and Workgroup
- 2005 – Enterprise, Standard, and Workgroup
Does the secondary need to be licensed?
- I am not the licensing police, and I am not Microsoft – check with your licensing representative to clarify your exact situation. Generally, you can have one warm standby server. However, the second someone starts using it for reporting, testing, or anything else, you need to license it like any other server.
Log shipping is compatible with backup compression. What edition of SQL Server do I need to take advantage of compression?
- 2012 – Enterprise, Business Intelligence, or Standard
- 2008R2 – Datacenter, Enterprise, or Standard
- 2008 – Enterprise
- 2005 – Not available
When log shipping is set up, Agent jobs are created to alert me if a backup, copy, or restore fails. How do I get notified?
- You need to go into the Agent job, pull up Notifications, and choose your method – email an operator, or write to the event log, for example.
Are my logins shipped from the primary to the secondary?
- No, they are not. You’ll need to set up a separate method to sync the logins.

Does this replace, or can it be combined with, our existing daily full and log backups?
- TL; DR – no.
- You’ll still want to take regular full and/or differential backups. Log shipping only takes one full backup – at the beginning – and that’s only if you specify that it does so. It can also be initialized from an existing full backup.
- Taking two log backups in separate jobs will break the log chain, however. If you implement log shipping, it will replace your current transaction log backup job.
What’s the difference between the secondary being in “Restoring” vs. “Standby”?
- Restoring means the database is not accessible. Standby means it is read-only. You make this decision when you set up the log shipping.
- If the database is in Standby mode, users can query it – except when a log backup is being restored. You need to decide if a restore job will disconnect users, or if the restore is delayed until after the users are disconnected.

What other questions do you have?

Interested in learning more about backups and recovery?

Grab my Backup & Recovery Step By Step training for more information on why backups and restores are so important, and how to perform them!

Databases Five Years from Today

Last Updated April 24, 2018

Architecture

Five years from now, in March 2018, what will be different about databases? I’m a Microsoft SQL Server guy, so keep that in mind when reading my thoughts.

Looking Back at 2008-2013

The big story here for me was virtualization. In mid-2008, Microsoft added support for SQL Server running under Hyper-V, and later they added support for VMware as well. Today, almost all of my clients have at least one virtual SQL Server. They typically start with development/QA/test instances, then disaster recovery, then production.

For the remaining physical servers, cheaper memory and solid state storage meant you could get insane performance out of commodity hardware. Microsoft got caught without a good scale-out story, but cheap scale-up prices meant it didn’t matter. I don’t know whether this was luck or strategy, but good on ’em either way. In 2008, I’d never would have believed that 2013 Brent would have a dozen clients with 2TB of memory per server. Over the last couple of years, I’ve had a few companies think they needed to switch database platforms in order to get the performance they need, but the real answer has always been quick (and relatively cheap) changes to code, configuration, and hardware.

Management tools didn’t really change at all, and that’s a story in and of itself. Microsoft made several attempts to change how DBAs and developers interact with databases – Visual Studio Data Dude, DACPACs, Utility Control Points. When SQL 2008R2 came out, I wrote about why these features wouldn’t really have an impact, and today, I’m not surprised that they’re just not getting any serious adoption. Microsoft threw in the towel on Data Dude and replaced it with SQL Server Data Tools, but didn’t include all of the functionality. I don’t see a lot of developer confidence in Microsoft’s short attention span here, so tooling hasn’t been a big story. (We did get an all-new SQL Server Management Studio under the hood, but Microsoft went to great pains to ensure it looked/worked basically the same as the old one, so…yeah.)

Business Intelligence (BI) got a lot of headlines, but here we’ve got another hyper-distraction story. Microsoft threw so many different tools against the wall that the naming even became a joke – does PowerPivotPointPro use the XVelocitySuperMart in v3.14? I don’t envy the BI pros who have to keep up with this jumbled mess of licenses, features, and names, but I do think Microsoft is heading in the right direction. The combination of Excel, SharePoint, columnar storage, and hella fast laptops means Microsoft is in a good spot to give insight to managers. It just wasn’t a huge revolution in 2008-2013 because the stories and products kept changing.

Looking Forward at 2013-2018

When our servers were physical, they had a built-in expiration date. The hardware support would grind to an end, and we’d be under pressure to migrate them onto more reliable hardware. We often included a SQL Server version upgrade in that same project.

Those days are over. The combination of virtualization and SQL 2005/2008 will leave an interesting legacy challenge for DBAs. Once your SQL Server is virtualized, it’s really easy to get it off old hardware – just VMotion or LiveMigrate it to another host. You can do it even while it’s still powered on. Does that old version need some more horsepower? Shut it down, add a couple of virtual CPUs and more memory, and power it back on. What used to be a big ugly maintenance project is now a matter of just a reboot.

This means you’ll be supporting SQL Server 2005 and 2008 forever.

SQL Server 2000 has thankfully (mostly) already been exterminated from serious production work. Its lack of management tools and mainstream support means it’s painful to troubleshoot, so most of us have already migrated production work to 2005 and 2008. Support for those newer versions doesn’t end for years, so settle in and get comfy. Sure, SQL Server 2005 and 2008 have bugs, and they’re missing cool features like backup compression in Standard Edition, but for the most part, they just work. Businesses will stick with ’em for most applications because they don’t see enough compelling features in 2012.

In SQL Server 2012 and beyond, we’ve got:

AlwaysOn Availability Groups – high availability, disaster recovery, and scale-out reads
Hekaton – in-memory storage with optimized stored procedures and new data formats on disk
Column store indexes – faster data retrieval for certain kinds of queries

Call me maybe crazy, but I don’t see really widespread adoption for any of these. To do them right, we’ve gotta make changes to application code. The changes won’t pay off for the majority of customers, so it’s risk without much reward. Don’t get me wrong – when you need this kind of speed, then you need it, and the features are fantastic. I do see widespread adoption coming in 2013-2018 for AlwaysOn, but only for high availability and disaster recovery, not the scale-out reads part.

The virtualization/SQL2005-is-good-enough combination also means we’re not going to see massive, widespread migrations from on-premise SQL Servers to cloud services like SQL Azure. (We’re also not going to see people call it by its official product name, Microsoft Windows Azure SQL Database.) Your app would require code changes to make that switch, and code changes are risky.

New development, on the other hand, means you can pick the features and compatibility you want. In those environments…I still don’t see a lot of widespread SQL Azure adoption coming. If I’m a developer building a new app from the ground up, I’m going to pick the cheapest, easiest database possible. These days, that’s probably PostgreSQL. Like SQL Server, it’s a platform that is available in a lot of different delivery mechanisms – on-premise on bare metal, on-premise virtualized, colo boxes, cloud-based services, etc. It’s open source, and it has all kinds of cool features we don’t get in SQL Server. I’m not changing horses mid-stream, but if I was starting again as a developer, I’d go that route.

What Do You See Changing?

When you look at your company’s needs, the job you do, and the tools you use, what do you see coming down the pike? In 2018, what will be the defining thing that changed your work?

Update 2018/03 – hey, I did pretty well! Here are my thoughts on what happened, and what’s coming next.

You don’t have a Big Data problem.

Last Updated March 7, 2013