You may have read that you need to have more than one data file in SQL Server’s tempdb. This can happen even if you’re using blazing fast storage. If you create a lot of tiny objects in tempdb you may hit a bottleneck on special pages that SQL Server uses internally to allocate all those objects. For certain workloads, adding more tempdb files speeds up SQL Server.
Most database administrators aren’t sure how to tell when they need to add more tempdb files. Here’s how you can find out for yourself.
Tempdb stress: my virtual machine setup
I stressed out my tempdb on a virtual machine with 6GB of memory, 4 virtual CPUs, Windows Server 2012 R2 and SQL Server 2014. The virtual machine is backed by solid state storage. My tempdb was configured with a single data file.
How I ran my stress test against tempdb
I used ostress.exe. This command line utility lets you easily run a query multiple times across many threads. You can get ostress.exe for free by downloading the RML utilities.
Here’s the ostress.exe command that I called:
ostress.exe -E -d"tempdb" -Q"exec dbo.tempdbstress" -n5 -r300 -b -q
The dbo.tempdbstress procedure looks like this:
CREATE PROCEDURE dbo.tempdbstress
SET NOCOUNT ON;
SELECT TOP(5000) a.name, replicate(a.status,4000) as col2
FROM master..spt_values a
CROSS JOIN master..spt_values b OPTION (MAXDOP 1);
Finding GAM and PFS contention in tempdb with sp_BlitzFirst® and Wait Stats
While this is running, look at waits in two ways. In one session, take a sample of waits with sp_BlitzFirst®:
exec sp_BlitzFirst @ExpertMode=1, @Seconds=10;
I see PAGELATCH_UP, PAGELATCH_SH and PAGELATCH_EX waits:
I also have a lot of just plain old CPU wait (the SOS_SCHEDULER_YIELD), because I’m bashing at only 4 virtual CPUs (and I’m running ostress.exe on the same VM as SQL Server).
Finding GAM and PFS contention in tempdb with sp_whoisactive
This lets you see the contention if you catch it at the right instant. Bonus: you can confirm exactly what type of pages its occuring on! I can see here that this is explicitly PFS waits. “PFS” means “page free space”: poor tempdb is hitting a hot spot on one of those special pages just recording how much space is available and where.
And here I can see the GAM waits. “GAM” means “Global Allocation Map”. That’s another one of those special pages that can get hot and slow things down in tempdb. It’s tracking where objects are allocated. (There’s also “SGAM” or “Shared Global Allocation Map” pages that you might see.)
Both of these types of waits can be alleviated if you add more data files to tempdb, because then you’ll get more PFS pages and GAM/SGAM pages across the files.
How to tell if you need more tempdb data files
First of all, if you just have one tempdb file, consider adding more files as preventive medicine. Don’t go crazy with this. The formula recommended by Microsoft in KB 2154845 to use one tempdb data file for each logical processor up to 8 processors is a good place to start.
Look at waits that have occurred since startup. Do you have PAGELATCH_UP or PAGELATCH_EX waits? (Note that these don’t have an “IO” in them. PAGEIOLATCH waits are different.)
If you do, that’s not necessarily 100% proof that the issue is in tempdb, but it’s a good indicator. If you don’t, well, it’s probably not a big issue.
If you do find these waits, dig deeper:
- Identify when the PAGELATCH_% waits are growing the most, and dig into those periods with sp_whoisactive. You can log sp_whoisactive results to a table
- Look at your top queries. The queries that are part of the issue might be suitable for caching in an application tier.
Tempdb file configuration reminders
Remember: this only applies to tempdb data files. You only need one tempdb log file.
Also remember that the tempdb data files will be used proportionally with their size. This means that you should keep the files equally sized. You can use trace flag 1117 to force all tempdb files to grow at the same time when one grows, but this will apply that behavior to every database on the instance (not just tempdb).
You can add tempdb files without restarting the SQL Server instance. However, we’ve seen everything from Anti-Virus attacking the new files to unexpected impacts from adding tempdb files. And if you need to shrink an existing file, it may not shrink gracefully when you run DBCC SHRINKFILE. For best results, deploy the configuration first to your pre-production environments, use a change request, and perform the change in a planned maintenance window.