Scale Up or Scale Out? Scaling SQL Server
Want your app to be able to handle more users? Here are tips, tricks, and links to comprehensive references on how to scale SQL Server– either Scale Up or Scale Out.
Do You Really Have a Scale Problem?
It could be that you just have a bad implementation. First, make sure you’re not wasting your time.
- Jeremiah explains Five Things that Fix Bad SQL Server Performance
- sp_Blitz: Uncover your SQL Server’s dirty little secrets
- Scripts to check your wait stats, plan cache, and currently running activity in SQL Server are here.
Want to grow upwards, using big hardware, big tables, and lots of queries running on a single instance?
- Brent gives an overview of What It Means to Scale Up
- Sargability – Brent shows why %string% is slow — and this can be a real problem in a scale up environment
- sp_BlitzIndex – Get insight into your toughest indexing challenges
- Table Partitioning: Will it help you scale up, or become your biggest nightmare?
- Performance Considerations of Datatypes – study by Michelle Ufford about the impact of choosing the right datatype.
- Isolation Levels: Kendra’s list of everything you might want to know about Isolation Levels (complete with poster)
- Locking and Blocking: Find lock related perf counters, how to run the blocked process report, and more.
- Kendra tells you how to implement Snapshot or Read Committed Snapshot Isolation
- List of all Microsoft RSS Feeds for KB Updates
- Making backups faster: “A Technical Case Study: Fast and Reliable Backup and Restore of Multi-Terabytes Database over the Network” on the SQL Customer Advistory Team (SQLCAT) list of whitepapers
Scale Out: Availability Groups, Sharding, and More
Instead of just building upwards, what are your options for distributing your data outwards in SQL Server?
- Brent gives you an introduction to Scaling Out SQL Server
- SQL Server 2012 AlwaysOn: Setup checklist, introduction, and much much more
- Jeremiah talks about Sharding in SQL Server
- If you’re using availability groups, they’re grounded in failover clusters. Get familiar with:
- It can be tricky to find out if a failover happened with an availability group. Learn more from the SQL Server PFE team.
- Jeremiah talks about Availability, Data Locality and Peer to Peer Replication
- Jeremiah explains why Conflict Resolution is a Big Issue with Peer to Peer Replication
- Lucene Full Text Search – this open source product is cheaper, faster, and more scalable than SQL Server’s built-in full text search. It’s much easier to scale this out with multiple boxes rather than trying to scale SQL Server. There’s also Lucene.NET for .NET folks.
- Adding Reliability to Your Infrastructure – Brent explains why, all other things being equal, the more moving parts you add, the less reliable your system gets.
Scale Out: Caching
Application caching sounds great, but is it really all that? (Well, yes, it actually is.)
- Brent says: “The fastest query is the one you never make“
- Jeremiah helps you pick the right caching tool – When your app needs data, check the cache tier first, then only query if the data isn’t in cache. Caching options include Redis, Memcached, and Windows AppFabric Caching.
- Finding Queries to Cache (script)
- Jeremiah and Kendra debate in DBAs vs Devs: ORMs, Caching, and Access to Prod (slideshow has a section on caching trickery)
- Caching in the Distributed Environment (msdn)
Testing and Performance Monitoring
You need to test and monitor no matter how you scale, but you need to become a serious pro at performance monitoring when you scale up.
- Random vs Synchronized Load Testing – users sometimes hit your application in waves, with several people doing the same thing at the exact same time. Find out if that will cause a problem rather than just having a bunch of people randomly click around in your app.
- Using Perfmon for Performance Tuning SQL Server – if you want to scale, you have to find out what bottleneck is currently holding you back from getting more performance. Brent explains what counters to look at and how to analyze them.
Lessons Learned Scaling StackOverflow
StackOverflow serves millions of pageviews per day with a SQL Server back end consisting of plain old commodity hardware with local drives. Here’s some of the lessons we’ve learned along the way.
- Capture & Test Every Query Before Upgrading – we upgraded to fix one problem, and ended up with a different problem because 2008 handled a specific kind of query differently.
- Scaling Full Text Search in an OLTP Environment – we ran into a problem with SQL 2008’s new FTS and got help from Microsoft.
- High Scalability Profile of StackOverflow – talks about the architecture and hardware in use.
- Cache-Warming Query – wildly unsupported, but gives you a rough idea of how to select data from all of your indexes in a particular database to warm the buffer pool. Doesn’t work with filtered indexes, and make sure you have enough memory to cache the entire database.
StackOverflow’s Shared Scaling Tips
- Blog.ServerFault.com – the sysadmins behind scaling StackOverflow’s infrastructure. It’s not purely dedicated to SQL – it touches a lot of parts of systems administration – but there’s a lot of great information in here. For an example, start with Different Views of the Same Problem.
- Sam Saffron’s Blog – Sam is a rocket surgeon coder who’s doing great stuff to keep performance fast.
- MVC Mini Profiler – like running a client-side trace that can tell you which queries are slow (or running too often). To see how it can work in your application, check out Scott Hanselman’s MiniProfiler walkthrough.
- Blog.StackOverflow.com – more general blog that includes the StackExchange Podcast. There’s some scalability info in here, but not as much as the ServerFault guys.
Scaling SQL Server in the Cloud
- SQL Azure is the Flat Pack Database – Microsoft’s SQL Azure is touted as a scalable cloud-based database – but what does that really mean?
- Bad Storage Performance in Amazon EC2 – cloud doesn’t always equal speed. Benchmark test your Amazon EC2 SQL Servers to find out why.
More SQL Server Scalability Resources
Read more on performance tuning.
Want to learn even more? We’ve got video training explaining it! In our 90 minute video series How to Think Like the SQL Server Engine, you’ll learn:
- The differences between clustered and nonclustered indexes
- How (and when) to make a covering index
- The basics of execution plans
- What determines sargability
- How SQL Server estimates query memory requirements
- What parameter sniffing means, and why it’s not always helpful
For $29, you get 18 months of access to the videos for one person. You can watch them at work, at home, even on your iPad. Learn more about it now.