Tag Archive: performance

Video: Heaps in SQL Server: Performance, Maintenance, and Your Sanity

Do you use heaps in SQL Server? How can heaps impact your query performance and database maintenance? In this 30 minute webcast I will show you how heaps behave differently than than tables with clustered indexes in SQL Server. You’ll learn how to identify heaps, how to measure their size, and how to plan a change to convert your heaps to clustered indexes.

This talk is aimed toward DBAs and developers who have worked with SQL Server >= 1 year. I cover LOTS of dynamic management views and tools to view table structures– it’s OK if you haven’t used them all before, you’ll still be able to get the concepts.

Want the scripts used in the talk? Scroll on down– scripts are included in this post below the video.

Kendra Little

Kendra specializes in high availability and performance tuning. She is a Microsoft Certified Master in SQL Server-- the highest technical SQL Server Certification available. Kendra loves databases and software development more than long walks on the beach. Those cartoons in her blog posts? She draws 'em all. Read more and contact Kendra.

More Posts - Website

Follow Me:
TwitterFacebookLinkedInGoogle PlusYouTube

Picking Your Cache

Answering questions about performance is almost becoming a catechism: “How can I make my database faster?” Use a caching server. “Which queries should I cache?” All of them. A lot of people stop there. They implement a cache in their application using some kind of in-memory hash table and call it a day. There’s more to it than that, of course. Not all caching solutions are cut from the same cloth.

Back to Basics: Memcached

Memcached is a big hash table: it’s a key/value store that lives entirely in RAM. Memcached has a few basic commands that correspond to the basic CRUD commands in any database but in memcached CRUD works by finding data based on a specific primary key – you can’t search through RAM. Like many key/value stores, memcached doesn’t have any ability to query based on the data that is stored in the value portion of the key/value pair.

What’s more, the memcached server has no native support for high availability features – it is just a cache. If the cache server goes down, it goes down. Losing a cache server might not sound terrible, but if your application depends on a fast cache for performance then losing a cache server can cause a critical performance problem.

What’s the solution to high availability with memcached? There isn’t an out of the box solution for high availability with memcached. Different drivers can use [consistent hashing][ch] to spread data across multiple memcached servers and there are libraries that support memcached replication between data centers, but none of these features are baked right in to the product. Ultimately it’s up to you to implement this yourself.

The upside of memcached is that it has been in development since 2003, it speaks a well known protocol, and many developers have run into it before. Amazon’s ElastiCache even speaks the memcached protocol; if you move into the cloud, there’s already something waiting for you. The memcached documentation also includes suggestions on ways to cache SQL queries, so your developers will have a leg up on when they start examining caching. There are also plugins for many languages, frameworks, and products – there’s a lot of support for memcached out there.

The Middle Ground: Flexible Tools

No matter which software stack your development team is using, there is going to be at least one caching solution that they can pick up and run with. Not all caching tools are as simple as memcached, others are more feature filled; Microsoft have created AppFabric Cache and developers in the Java world can use Ehcache. These two packages have the concept of availability baked into their core – both Ehcache and AppFabric Cache can cluster right out of the box. There are multiple advantages with this approach: high availability isn’t a bolt-on that depends on a third party library and management becomes much easier.

Since we mainly talk about SQL Server around this place, I’m going to keep talking about AppFabric Cache from here.

Although AppFabric Cache has only been around for a short period of time, it is being used in a number of places. Caching services are available in Azure, there’s an ASP.NET session state provider, and I’ve personally used AppFabric Cache to supply fast paging in large reports.One of the primary advantages of using AppFabric Cache is that it’s very easy for Windows admins to configure and administer – everything is handled using similar tools that administrators are already familiar with.

Like memcached, AppFabric Cache supplies a simple set of APIs to read, write, or delete data. The simplicity of AppFabric Cache makes it a logical choice for developers working with SQL Server and the Microsoft stack. On top of its simplicity, AppFabric Cache adds two features to make developers’ jobs easier. The first feature is cache expiration. Instead of relying on the cache to expire data when it is no longer used, developers can specify a time to live when saving a value to cache. As data is read, the expiration can be refreshed, but if data isn’t read for a long time, it will be marked as expired and will no longer be able to be read. The second feature is high availability – developers can save some values in multiple places at once, ensuring that data will survive the failure of a cache server.

On a feature by feature level, AppFabric Cache provides features that meet the needs of almost every application. It still has one downside – there’s no way to query data in an ad hoc fashion. Every data access is key/value. It’s possible to use different pools in the cache, but those are only separations in a logical sense. If you need to retrieve a range of data, your only option is to build inverted indexes: key/value pairs where the key is the index key (e.g. the state of Oregon) and the value is a list of indexes values (e.g. all zip codes in Oregon).

Fast and Furious: An In Memory Database

In memory databases have several advantages over pure caches. A cache is a simple key/value database and the value is nothing more than a collection of bytes. Databases, however, offer increased functionality – range scans, sorting, strongly typed data, and a host of rich commands.

Redis is a fast in-memory database. At first glance, you might think that Redis is a lot like a a cache – it looks in many ways like a simple in memory key-value store. However, Redis hides a lot of additional power: sorted sets, lists, queues, and replication are all supported features. By combining these features, it becomes possible to use Redis as something more than a cache – it becomes the primary database for fast querying. It’s easy to store user session properties in a hash, a user’s last 50 viewed pages in a list, and any number of objects as simple strings.

Once you start working with an in memory database like Redis, it’s important to start thinking of creative ways to use your database; Redis can be used as far more than a glorified key-value database. Developers are using Redis for realtime metrics, analytics, and Redis has even been called the AK–47 of databases because it’s simple, powerful, and reliable.

Wrapping Up

No matter how you choose to cache, make sure that you start caching in your applications as soon as you can. Yes, it does add extra work for developers to implement caching. But tis’ not as much work as many people would like to think. Careful and judicious use of caching will have immediate benefits for application performance – database CPU, memory, and I/O requirements will decrease and application response times will improve.

Jeremiah Peschka

Jeremiah Peschka has worked as a database and emerging technology expert at Quest Software where he researched new trends and technologies in the world of data storage. Over the course of his career he’s worked with companies across many industries as a system administrator, developer, and DBA. He’s been involved with all aspects of application development and deployment. He likes cheesecake, coffee, and ice cream.

More Posts - Website

Follow Me:
TwitterFacebook

How to Tell if TempDB Is a Performance Problem

Years ago, I came across an article about a gentleman who made a cat camera. Without boring you too much, this camera gave him a detailed view (including GPS info) about what his cats got up to all day. I got excited about the idea of finding out what my pets did all day. Not having the means to build a cat camera of my own and being far too cheap to buy one, I rigged up a web cam at home to watch the cats all day. It turns out that my pets do nothing all day. They slept and ate and slept and ate until I got home. At which point, they kept doing the exact same thing.

In a wonderful bit of quantum boringness, it turns out that I had no idea what kind of chaos my pets were causing until I directly observed them. I had no idea if they were actually moving things around in the house or if I was simply forgetting where I was putting things (turns out I’m very forgetful). TempDB is a lot like a basket of cats – you don’t know for sure that it’s causing your problems, you have some sneaking suspicions, but you’re not sure how to prove anything.

Watching Over TempDB

If TempDB is like a basket of cats, we need to watch what it’s doing; there’s no telling when it’s going to go from adorable to shredding the drapes. Knowing what to watch in TempDB is just as important as knowing that you should even be watching TempDB at all.

How Many Cats Do I Have? (Watching TempDB Free Space)

Excessive TempDB usage isn’t necessarily a sign that TempDB is a problem, but it is an indicator that you have problems worth looking into. When TempDB starts getting full, it’s an indicator that there’s a lot of temporary object creation as well as out of memory sorting and joining going on in the database. None of these things are bad, but they’re indicators that we should be taking a closer look at TempDB.

There’s no hard and fast metric for what you should do when your TempDB data file is large, but it’s a good indicator that you can stand to do one of a few things:

  1. Enable Instant File Initialization
  2. Add Multiple TempDB Files

These changes won’t always cure the problem, but they are starting points. Waiting for TempDB to grow can be a cause of performance problems and enabling Instant File Initialization makes it possible to quickly grow TempDB data files. Using multiple TempDB files uses more storage bandwidth, reduces file contention, and adds magical pixie dust to your queries.

What Are My Cats Doing? (Monitoring TempDB Usage)

The next step, after you know how much TempDB you’re using, is to find out how TempDB is being used. TempDB is used for a few distinct things: joins, aggregations, sorting, the version store, temporary tables (and table variables), and table/index spooling. While these are different operations, they all consume TempDB space. Understanding how your applications use TempDB is critical to understanding if TempDB is causing performance problems.

This is where things get more complicated; there’s never a right or wrong answer, but TempDB usage varies heavily by application and workload. Sometimes even the same application, with different customer workloads, can have wildly different TempDB usage characteristics. By monitoring TempDB through a variety of DMO calls, server side traces, and performance counters it’s possible to get an accurate picture of the health and utilization of TempDB over time. Through some careful DMO/DMV scripting it’s even possible trace who the biggest consumers of TempDB are back to the stored procedure or query that’s using TempDB.

Just like trying to watch a basket of cats through a webcam, you can only catch quick glimpses of what’s going on. This process makes it possible to capture a sample of what’s going on inside TempDB at any moment, but it’s only for a quick moment. The DMOs to monitor TempDB only look at the currently running queries, there is no historical record. The best way to get an accurate picture of what’s happening is to sample these DMOs on a regular basis and sample aggressively during peak performance periods. You won’t catch every query this way but you should be able to catch most.

Benchmark, Rinse, Repeat

Whenever I talk about performance tuning or general SQL Server problems, I always advise people to benchmark everything that they can. Having a steady baseline is the only way to verify that changes are having a positive effect on performance. Without a performance baseline in place, all you have to go on is a feeling that things are faster. Unfortunately, feelings don’t translate into quantifiable numbers (unless you’re trying to quantify how you feel about a basket of cats).

Establishing a performance baseline is one of my favorite parts of working with clients. As we go through the health check, I work with our clients to figure out where it hurts and help them build a solution. I cover how they can use the baseline to keep monitoring their system. With these tools in place, it’s easy to monitor a system’s health over time.

Focusing on performance metrics makes it easy to see which parts of an application are causing performance problems. It TempDB usage spikes after a change to a few stored procedures, it’s easy to identify the problem when you have a baseline established.

Determining whether or not TempDB is a performance problem boils down to establishing a baseline, monitoring performance before and after changes, and carefully making changes until acceptable performance levels are reached. This may involve adding more TempDB data files, forcing memory grant allocations, or using solid state drives for TempDB.

Resources

Jeremiah Peschka

Jeremiah Peschka has worked as a database and emerging technology expert at Quest Software where he researched new trends and technologies in the world of data storage. Over the course of his career he’s worked with companies across many industries as a system administrator, developer, and DBA. He’s been involved with all aspects of application development and deployment. He likes cheesecake, coffee, and ice cream.

More Posts - Website

Follow Me:
TwitterFacebook

You’ll be Reunited with an Old Friend in 2012

I may love a challenge, but I also like things that are quick and easy.

There’s one particular quick and easy feature from SQL Server 2005 which I thought was gone for good.

But there are signs it’s going to return.

Any guesses what I’m talking about?

Nope, it’s not Notification Services. (But that would be kinda funny.)

It’s better.

DBAs Like Data.

If you work with databases, you probably like graphs and trends. You probably enjoy tools that help you analyze the state of a server quickly.

This is why DBAs love the DMVs.

And this is why DBAs will probably also love the….

SQL Server 2012 Performance Dashboard Reports

I spied an announcement for this SQL PASS Session:

I don’t know about you, but I really liked these reports from 2005. They were handy, friendly, and easy to use. And they’re great because they’re accessible, and they lead people into learning more about the DMVs and building their skills for tuning SQL Server.

What Do You Hope is in the 2012 Version?

Were you a fan of the 2005 version of this report? What do you hope is updated or added in the new version?

Kendra Little

Kendra specializes in high availability and performance tuning. She is a Microsoft Certified Master in SQL Server-- the highest technical SQL Server Certification available. Kendra loves databases and software development more than long walks on the beach. Those cartoons in her blog posts? She draws 'em all. Read more and contact Kendra.

More Posts - Website

Follow Me:
TwitterFacebookLinkedInGoogle PlusYouTube

Better Living Through Caching

The fastest query is one you never execute.

The premise is that one of the slowest parts of starting up an application isn’t starting the application itself, it’s loading the initial application state. This can become a problem when you’re loading many copies of your application on many servers, especially you’re in the cloud and paying for CPU cycles. In that article, a commenter proposes reading application start up state from a serialized blob; basically a chunk of memory written to disk. The trick is that the serialized blob is stored in cache rather than on disk or in a database. Sometimes you need to hit disk in order to refresh the cache, but the general idea is that all configuration info is stored in a single binary object that can be quickly read and used to start up an application to a known good state.

Caching for More Than Start Up Times

Once you start caching application start state, it’s natural to look for more places to introduce additional caching. Remember, the fastest query is the one that you never execute.

Most people already know that they can add caching to their application to improve performance and get around slower parts of the system. There are a number of well understood design patterns that focus around caching and its place in software architecture. A lot of people don’t take this one step further and use caching as a trick to avoid down time when they roll out updates.

You might be thinking “Wait a minute, doesn’t my database/SAN/operating system have some kind of cache?” You’re right, it does. Storage cache is your last line of defense before reading from disk. Why not cache things in your application and skip the network hit?

So what happens when you need to update the application? In the past you probably scheduled an outage in the middle of the night. Or maybe you performed rolling outages from server to server and then slowly brought features online across groups of servers. However you did it, it’s complicated, requires down time, and you need to have a rollback plan; rollbacks on large databases can take a lot of time.

What if instead of just caching configuration to avoid slow start up, you start caching all data (or as much as can fit into memory)? You’re doing that already, right? Why mention it again?

If you’re caching data already, it seems logical that your application is written with multiple tiers. Those tiers are probably divided out by application or by service. If so, there’s a lot of logical separation between different features and functionality. You might even be calling a read/write API as if it were a service provided by a third party. This is a perfect example of how you can cache your reads and avoid hitting lower layers of the application; the front end never needs to know that anything exists apart from the services that provide data.

If you can cache data at the service level, you can theoretically take your back end systems offline for maintenance and bring them back online with minimal disruption to your users. Ideally, there would be no disruption. You could queue up modifications during your maintenance window and then commit them to the database once the updated database, services, or features are back online.

The Beauty of Isolation

By isolating features and layers from each other, you can make your applications more responsive. Rather than relying on servers to respond quickly during application start times, you can make it possible to load binary configuration data from cache. Frequently run queries can be served even faster by caching results in memory. Down times can even be avoided by caching reads and writes during the maintenance window. Of course, caching writes can be difficult. You can start by caching reads and keep your application up most of your users; it’s better than shutting everyone out completely.


To learn more about caching on Windows, read up on AppFabric Cache. On the *nix side of things, there’s the tried and true memcache. More novel and exotic solutions exist, but AppFabric Cache and memcache are great places to get started.

Jeremiah Peschka

Jeremiah Peschka has worked as a database and emerging technology expert at Quest Software where he researched new trends and technologies in the world of data storage. Over the course of his career he’s worked with companies across many industries as a system administrator, developer, and DBA. He’s been involved with all aspects of application development and deployment. He likes cheesecake, coffee, and ice cream.

More Posts - Website

Follow Me:
TwitterFacebook