You’ve heard that shrinking a database is bad because it introduces both external and internal fragmentation, it causes blocking, it causes transaction log growth while it runs, and it’s slow and single-threaded. You understand that if it’s just a matter of 10-20-30% of a database, and the database is only 100-200GB, you might as well just leave the space there, because you’re gonna end up using it anyway.
But your situation is different:
- Your database is 1TB or larger
- You’ve deleted 50% of the data
- You have 500GB+ empty space
- You’re never going to need that space because you’re now doing regular deletions and archiving
You’ve been tasked with reducing the database size to make restores to other environments easier, like when you need to restore to a QA or development environment. You don’t want those servers to need so much useless space.
Don’t shrink. Do this instead:
- Add a new filegroup, add new empty files in it, and set it to be the new default filegroup
- Move objects to the new filegroup using CREATE INDEX … WITH (DROP_EXISTING = ON, ONLINE = ON) commands
- When you’ve moved all the objects over, shrink the now-mostly-empty files in the old filegroup with DBCC SHRINKFILE, and they’ll shrink super-quick
- Just leave it there – this is your new database layout
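The steps above might look roughly like this. All the names here are hypothetical placeholders – the database MyDB, the filegroup NewFG, the file path, and the dbo.MyTable clustered index – so swap in your own:

```sql
-- 1. Add a new filegroup with an empty file, and make it the default:
ALTER DATABASE MyDB ADD FILEGROUP NewFG;
ALTER DATABASE MyDB ADD FILE
    (NAME = NewFG_1, FILENAME = 'D:\Data\MyDB_NewFG_1.ndf', SIZE = 100GB)
    TO FILEGROUP NewFG;
ALTER DATABASE MyDB MODIFY FILEGROUP NewFG DEFAULT;

-- 2. Move each object by recreating its clustered index on the new filegroup:
CREATE UNIQUE CLUSTERED INDEX PK_MyTable
    ON dbo.MyTable (Id)
    WITH (DROP_EXISTING = ON, ONLINE = ON)
    ON NewFG;

-- 3. Once everything has moved, shrink the old, now-mostly-empty file
--    (target size is in MB):
DBCC SHRINKFILE (MyDB_Data, 1024);
```

You’ll need one CREATE INDEX per index (clustered and nonclustered) on each table you want to move, which is why scripting this out matters.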
This approach has several key advantages over shrinking:
- It can use parallelism: you can specify how many cores to use with MAXDOP hints. This is really helpful on servers with MAXDOP = 1 at the server level because your query-level hint will override it, even overriding it upwards.
- It can move a lot more than one 8KB page at a time. Heck, you can even do several index recreations simultaneously on different threads if you want to, really churning through the work quickly.
- You pick the days/times for each object: maybe some of your objects see very heavy concurrency during certain time windows, and you want to avoid touching those until a maintenance window.
- You pick the settings for each object: rebuilding an index with ONLINE = ON is slower. Some of your objects might be archive tables or unused during certain days/times, so you can use ONLINE = OFF on those to get faster performance.
But it does have a few drawbacks, too:
- It’s faster, but it’s also higher load: shrinking a database is a low-overhead process – it’s hard for anybody to notice you moving just one 8KB page at a time, with just one CPU core. Index recreations? Buckle up: people are gonna notice when you throw a ton of CPU cores at the problem and really start hammering your storage. This is just the flip side of finishing faster: if you wanna finish faster, you’re gonna do more work in less time.
- This also means it generates more logs, faster: because we’re moving so much data and it’s a fully logged process, this can present problems for transaction log backups, log file sizes, database mirroring, Always On Availability Groups, and storage/VM replication. Ease into this gradually, starting with your smaller tables first, so you can see the impact it’s having on your transaction log sizes.
- ONLINE = ON isn’t fully online: even online index recreations need a momentary schema mod lock in order to finish their work. Thankfully, since SQL Server 2014, we’ve had the WAIT_AT_LOW_PRIORITY option to help mitigate that problem.
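To illustrate that mitigation, here’s the same hypothetical index recreation from earlier, this time waiting up to 5 minutes for the schema modification lock at low priority, and killing the blockers (rather than giving up itself) if it’s still stuck after that:

```sql
CREATE CLUSTERED INDEX CX_Orders
    ON dbo.Orders (OrderDate)
    WITH (DROP_EXISTING = ON,
          ONLINE = ON (WAIT_AT_LOW_PRIORITY
              (MAX_DURATION = 5 MINUTES, ABORT_AFTER_WAIT = BLOCKERS)))
    ON NewFG;
```

ABORT_AFTER_WAIT also accepts SELF (the index operation gives up) and NONE (keep waiting at normal priority), so pick whichever failure mode your workload can tolerate.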
- You have to do some prep work: it’s easy to run DBCC SHRINKDATABASE, but the index recreation approach will take much more work if you want to leverage all the cool advantages I discussed above.
Thankfully, there’s help on that last one: Bob Pusateri has an in-depth writeup and a script to help.