Right now, people are waiting for your question. They can’t wait to help you out, and they’ll give you incredibly good advice for free. Here’s how to get it:
1. Boil the problem down to one question in one sentence. Technology is complicated, and your question probably has a lot of moving parts. If you can’t fit your question into one sentence, that’s a sign that it’s actually multiple questions, or that you don’t understand the issue quite well enough yet. Strip out as many moving parts as possible. Often, simply by getting the question down to one sentence, I find that I’ve answered it myself – the rubber duck problem solving effect Jeff Atwood wrote about.
2. Type that sentence into a search engine. After all, this is what your answerers are going to do. You probably aren’t going to find the answer you want (you probably are, but I’m being kind here), but you want to look at the kinds of responses that were posted. Look at the followup questions other people asked, try that same solution, and compile the evidence that you gather.
3. Provide additional evidence in an accessible format. Even though you’ve boiled down your question to one sentence, that doesn’t mean it’s the only thing you’re going to provide. Start your email/post with that one-sentence-question, and finish it the same way, but in the middle, put all the juicy details about the things you tried. If you’re asking about a slow query on a database server, that means including the full query and the execution plan. If you’re asking about how to design a data model, include your idea. Your goal is to make it as easy as possible for people to copy/paste your work into their own environment, make changes, and then most importantly, copy/paste the right answer from their environment into the answer. For long copy/pastes, check out Pastebin.
4. Never say “my client” in the question. If you’re a consultant or a student and you’re tasked with getting the answer, don’t reveal that in the question. It really pisses the rest of us off because you’re asking us to do your work. I’m not saying you can’t ask questions; you certainly can, and I encourage it – but don’t rub our noses in it. You can avoid a lot of negative feedback by using terms like “we” instead of “my client.”
5. Ask as many strangers as possible simultaneously. Post your question on a Q&A site like StackOverflow, Serverfault, DBA.StackExchange.com, or SQLServerCentral. There are thousands of people sitting around hitting refresh on these sites, waiting for questions to come in, eager to help you. This is a much more effective solution than emailing one or two people directly. I even have an email template set up that instructs folks to post their questions online – I’d rather teach people to fish than feed them one answer at a time.
6. Follow up promptly. After posting the question, set yourself a repeating 15-minute alarm on your phone. When it goes off, check for followup questions or comments on your question. Answer them fully and promptly. These are strangers who are taking time out of their day to help you – treat them with thanks and respect.
7. If you don’t get the answers you need after 24 hours, escalate it. If it’s a SQL Server question, post a link to it on Twitter with the #SQLhelp hash tag and a link to your full question. Email it to your friends and resources, but hold off for 24 hours first – you want to respect their time.
I’m a bit of a researcher. When I want to know about a topic, I start off by doing a lot of reading. After I take a lot in, I want to make sure I understand the details. At that point, I try to think like a scientist. I come up with ways to test what I’m learning and see it in action. My process isn’t revolutionary, but it goes something like this:
- Formulate a question.
- Pose a hypothesis.
- Form a prediction.
- Run a test.
- Analyze the results.
I can’t do this for everything, but I love it when it works.
Today’s story is about how I used this process to learn about how SQL Server writes to transaction log files.
This won’t be a super-practical post that will change your life. I’m not going to make any recommendations here, other than widely published best practices. But I promise to show you a tool that may help you learn a lot about how SQL Server works.
The Question: How Large are the Writes to a Transaction Log File?
A widely published SQL Server configuration best practice is to format your log file drives with a 64KB allocation unit size. There are exceptions to this for certain storage subsystems— you should always check the documentation from your storage provider, and you can also run tests with tools like SQLIO to determine how you can get the best performance with your storage implementation. (Different SAN configurations and settings like RAID stripe size make a difference when it comes to performance with a given allocation unit configuration.)
My question was: what does the allocation unit size mean in terms of how much SQL Server writes to disk when I issue a write against a transaction log? Do I always have to do the same unit of work as my allocation unit size?
In other words, am I always going to do a minimum of 64KB of IO against my log file? That seems like a lot, so I doubt it! But if not 64KB, then how much?
First, Some Background on Allocation Unit Size
The “Allocation Unit Size” on a drive is also called the “cluster size.” These are pretty much synonymous, and the fact that we have two ways to talk about this can make things confusing.
The default allocation unit size on a Windows drive is 4KB. When you format the drive, you can set that to a larger size – again, 64KB is recommended for SQL Server data, log, and tempdb files. The allocation unit size is basically the smallest unit of storage that any individual file can take up. If I set the allocation unit size on a drive to 64KB and then create a tiny text file which only takes up 11 bytes, that one file will take up 64KB on disk.
So 64KB isn’t a great allocation unit size for a drive with many tiny text files, but our SQL Server log files are all much larger than 64KB – each file takes up many allocation units.
Hypothesis: We’re Writing to the Transaction Log in Small Chunks
My initial guess was this: even though I’m setting my allocation unit size to 64KB, the way that SQL Server issues the writes doesn’t write out 64KB each time. That would just be too high!
I did some reading to learn more about this from the SQL Server perspective. In the “Determining I/O Capacity” section in the Best Practices Whitepaper, it states, “Log writes are sequential in nature with a varying size, which is dependent on the nature of the workload (sector aligned up to 60 KB).”
So we’ve got a varying size, and it’s “sector aligned.” Bob Dorr explains more about sectors and transaction log writes in his SQL Server I/O Presentation: “Any time the physical flush takes place it occurs on a sector boundary. The FlushToLSN will attempt to pack as many active log records into sector aligned boundaries and write on the sector size. The sectors each contain a parity bit that the log can use to detect the valid sectors flushed before a crash recovery.”
Now we’re getting somewhere! So, in theory, the smallest write should hit ONE disk sector. A disk sector is a physical sector on a drive – it’s set by the storage manufacturer, and isn’t something you configure. Most drives have 512-byte sectors, but newer “Advanced Format” drives have 4KB sectors.
To recap our terminology and put things together: if you format a single (“old school”) drive with a 64KB allocation unit size, each 64KB cluster will have 128 physical sectors. Each sector will have 512 bytes.
Based on this, my prediction is that SQL Server will write 512 bytes to the log file for a single tiny write transaction. Ideally, I want to observe this IO outside of SQL Server, not inside of it – SQL Server might think it’s issuing a small IO while Windows actually sees it doing more.
My Test: Watching Writes with Process Monitor
It didn’t take long to find a tool to monitor IO with. I am a big fan of the SysInternals suite of tools. Over the years SysInternals tools have helped me solve many problems and learn a lot.
Process Monitor does just what the name says: it helps you monitor the behavior of processes. As it monitors it streams a lot of information to the client, and I always get a bit nervous if I have to run it in production due to the volume. (Make sure you apply very good filters!)
However, for a lab environment it’s just candy: easy to use, quite intuitive, and highly configurable. I fired up Process Monitor and filtered out all processes except for sqlservr.exe. (Protip: all the cool kids drop vowels near the end of words. Just call me “Kendr.”)
This first filter helps reduce the abundant flow of information coming from Process Monitor. It helped me focus in on my quarry.
I was just interested in seeing IO to a log file. Next, I filtered out everything except for writes to the log file for a database named InsertTest, which was located at the path “T:\MSSQL\Data\InsertTest_log.ldf.”
I also validated in a command prompt that my T drive is formatted with a 64KB allocation unit size, and Windows understands my drive to have 512 bytes per sector. Sure enough, 64KB!
Setting Up A Database to Test
I wanted to do a very small write and see how much IO was issued, so I created the following table:
USE InsertTest;
GO
CREATE TABLE lilbittytable (tinycol BIT);
GO
That’s pretty darn small, right?
I started collecting data with Process Monitor and I ran an insert into my table:
USE InsertTest;
GO
INSERT lilbittytable (tinycol)
SELECT 1;
GO
And, what do you know, here’s the write I saw:
Since I hadn’t defined an explicit transaction, the insert auto-committed as soon as it ran. This required that the row be flushed to the log. This write went into my log file with a length of 512 – and that length is reported by Process Monitor in bytes.
This verified my theory: tiny transaction log writes are indeed written in a unit that corresponds to a single physical disk sector, as defined by the drive itself. The answer to my original question is that the size of data written is NOT influenced by the allocation unit size you set when you format a drive.
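One hedged follow-up experiment (my suggestion, not part of the original trace): wrap several inserts in an explicit transaction. The log records aren’t hardened to disk until COMMIT, so Process Monitor should show fewer, larger sector-aligned writes instead of one 512-byte write per statement:

```sql
--Hypothetical follow-up experiment: batch inserts in one explicit transaction.
--The flush to the .ldf should happen at COMMIT rather than after each statement.
USE InsertTest;
GO
BEGIN TRAN;
INSERT lilbittytable (tinycol) SELECT 1;
INSERT lilbittytable (tinycol) SELECT 0;
INSERT lilbittytable (tinycol) SELECT 1;
COMMIT;  --watch Process Monitor here
GO
```

If the packed log records exceed one sector, you’d expect the write length to round up to the next 512-byte boundary rather than grow byte by byte.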
There are lots of questions that the information I showed today may raise for you. What happens when you write larger records, or have multiple sessions doing writes at the same time? Does this all correspond to what you see inside the SQL Server transaction log when you read it with the (undocumented) fn_dblog function? What sizes of writes do you see happening against the data files after a checkpoint?
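For the fn_dblog question above, here’s a minimal sketch of how to peek at individual log records. fn_dblog is undocumented and unsupported, so run it only against a test system; the column names shown here are the commonly seen ones and may vary by version:

```sql
--Undocumented, unsupported: read the active portion of the transaction log.
--[Log Record Length] is the size of each logical log record, which you can
--compare against the sector-sized physical writes Process Monitor reports.
SELECT [Current LSN], Operation, Context, [Log Record Length]
FROM fn_dblog(NULL, NULL);
```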
Good news: it’s really easy for you to find out! Firing up Process Monitor on a test environment is super easy, and will take you less time than reading this post did. Give it a shot, and happy exploring.
Your doctor monitors your heart rate, blood pressure, and weight. You’re a server doctor, and there are vital metrics you should watch about your SQL Servers. Even if your servers are healthy today, learn the ten most important Perfmon metrics you should be tracking so you’ll know if things are getting better or worse. Microsoft Certified Master Brent Ozar will help you along your internship in this 30-minute video:
The links we referenced:
- Our Perfmon Tutorial – how to slice & dice your metrics in Excel.
- Best Practices for Running SQL Server on VMware – more in-depth configuration tweaks for VMware.
- How to Make SQL Queries Run Faster – high disk latency numbers don’t necessarily mean you have a storage problem.
If you liked this video, check out our free upcoming live webcasts now.
Here at Brent Ozar Unlimited®, we offer a server performance checkup to clients. As the newest member of the team, I wasn’t sure what to expect when running one. I found out that it’s pretty cool. I pretend we’re Special Forces operatives, strapped into our parachutes, doing a HALO jump over client servers. (This homemade parachute gets caught on my chair a lot.)
This checkup is about learning as much as it is fixing servers. What have I learned? What is it like to be a consultant, to be deep-diving into a variety of environments?
Variety is the Spice of Life
Even though they all use the same software, every SQL Server environment is different. Each one has a different mix of features installed. But have you worked in one company using clustering, replication, mirroring, and log shipping? How about full-text indexing and FileStream? How many different hardware configurations do you have? How many SAN vendors does one company use?
The checkups are enjoyable because I get to see a variety of configurations. There is always something new to learn. Seeing how something works in one environment can give me ideas for fixing or improving something in another.
Teaching Best Practices
Let me take you back a few years, to when I was handed the DBA title for the first time because the DBA left the company. At that time, I could write T-SQL and knew SSRS fairly well. However, I didn’t know auto-shrink and xp_cmdshell were bad. This isn’t knowledge we’re born with, it isn’t covered in most college textbooks, and a programmer won’t know. But I learned. Now, I get to pass on that knowledge.
Our sp_Blitz® script will check for things like databases in Full recovery without log backups being run, users with sysadmin rights, user databases on the C: drive, and auto-shrink enabled. When I see things that aren’t best practices, I can point them out and teach the users why. I can encourage them to change the settings, if applicable, or discuss why they have it set that way. They learn something new, and I get to impart my knowledge. We all win.
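As an illustration of one of those checks – a simplified sketch of my own, not the actual sp_Blitz® code – here’s a query that flags databases in the Full recovery model with no log backup in the last 24 hours:

```sql
--Simplified sketch (not the real sp_Blitz® logic): databases in
--FULL recovery with no log backup finishing in the last 24 hours.
SELECT d.name, d.recovery_model_desc
FROM sys.databases d
WHERE d.recovery_model_desc = 'FULL'
  AND d.name NOT IN (
      SELECT b.database_name
      FROM msdb.dbo.backupset b
      WHERE b.type = 'L' /* log backup */
        AND b.backup_finish_date > DATEADD(HOUR, -24, GETDATE())
  );
```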
Our very own Brent Ozar wrote the sp_Blitz® script. But there are a lot of other people’s queries we’ve built on. Adam Machanic, Amit Banerjee, and Dave Levy have provided inspiration. We recommend Adam Machanic’s sp_WhoIsActive, and Ola Hallengren’s maintenance scripts.
There are many SQL Server resources available to you, whether they are blogs, videos, books, or scripts. We don’t reinvent the wheel each time we come to a problem, and you don’t need to either. As Brent has blogged before, we send our clients to other people’s blogs for resources.
Ready the Parachutes!
Running a performance checkup is a great experience for me and the client. Solving problems and teaching people at the same time is the most rewarding job I’ve had! I think a huge factor to being successful as a consultant is having the desire to constantly learn. If that sounds like you, there may be a career path for you to explore.
Wanna get your storage learn on? VMware has a well-laid-out explanation of the pros and cons of different ways to connect to shared storage. The guide covers the four storage protocols, but let’s get you a quick background primer first.
iSCSI, NFS, FC, and FCoE Basics
iSCSI means you map your storage over TCP/IP. You typically put in dedicated Ethernet network cards and a separate network switch. Each server and each storage device has its own IP address(es), and you connect by specifying an IP address where your drive lives. In Windows, each drive shows up in Computer Management as a hard drive, and you format it. This is called block storage.
NFS means you access a file share like \\MyFileServerName\MyShareName, and you put files on it. In Windows, this is a mapped network drive. You access folders and files there, but you don’t see the mapped network drive in Computer Management as a local drive letter. You don’t get exclusive access to NFS drives. You don’t need a separate network cable for NFS – you just access your file shares over whatever network you want.
Fibre Channel is a lot like iSCSI, except it uses fiberoptic cables instead of Ethernet cables. It’s a separate dedicated network just for storage, so you don’t have to worry as much about performance contention – although you do still have to worry.
Fibre Channel Over Ethernet runs the FC protocol over Ethernet cables, specifically 10Gb Ethernet. This gained niche popularity because you can use just one network (10Gb Ethernet) for both regular network traffic and storage network traffic rather than having one set of switches for fiber and one set for Ethernet.
Now that you’re armed with the basics, check out VMware’s PDF guide, then read on for my thoughts.
What I See in the Wild
1Gb iSCSI is cheap as all get out, and just as slow. It’s a great way to get started with virtualization because you don’t usually need much storage throughput anyway – your storage is constrained by multiple VMs sharing the same spindles, so you’re getting random access, and it’s slow anyway. It’s really easy to configure 1Gb iSCSI because you’ve already got a 1Gb network switch infrastructure. SQL Server on 1Gb iSCSI sucks, though – you’re constrained big time during backups, index rebuilds, table scans, etc. These are large sequential operations that can easily saturate a 1Gb pipe, and storage becomes your bottleneck in no time.
NFS is the easiest way to manage virtualization, and I see a lot of success with it. It’s probably an easy way to manage SQL clusters, too, but I’m not about to go there yet. It’s just too risky if you’re using the same network for both data traffic and storage traffic – a big stream of sudden network traffic (like backups) over the same network pipes is a real danger for SQL Server’s infamous 15 second IO errors. Using 10Gb Ethernet mitigates this risk, though.
Fibre Channel is the easiest way to maximize performance because you rule out the possibility of data traffic interfering with storage traffic. It’s really hard to troubleshoot, and requires a dedicated full time SAN admin, but once it’s in and configured correctly, it’s happy days for the DBA.
Want to learn more? We’ve got video training. Our VMware, SANs, and Hardware for SQL Server DBAs Training Video is a 5-hour training video series explaining how to buy the right hardware, configure it, and set up SQL Server for the best performance and reliability. Here’s a preview:
This question came up when working with a client last week. The team mentioned they were planning to move some tables into a new filegroup. The new filegroup is intended to help quickly restore data in cases when a monthly process goes off the rails and leaves the data in some tables in a bad state.
Over the years I’ve worked on a lot of backup and restore plans with SQL Server, and I learned the hard way that filegroup level restores don’t do exactly what you might think.
In this post, we’ll talk through a sample problem. I’ll explain why filegroup level backups don’t fit the bill to solve this problem, and then talk through a couple of alternatives that may make life easier.
Our Problem: We Need to Restore the Data for Some of the Tables in a Database
Let’s say you manage a database named ImportantDB. Some tables are updated daily in ImportantDB, and other tables are only updated once per month. The tables updated once per month are updated by a legacy application and you don’t have access to the source code.
Periodically, the monthly load will fail due to issues outside of SQL Server. When this occurs, the recommended procedure is to reset the monthly tables to the state they were in prior to the monthly load job beginning, and restart the process. (Essentially, you start over.)
Restoring the entire database takes a long time, and might remove changes to the “daily” tables as well.
Attempt 1: Filegroup Level Backups
In SQL Server we have the ability to back up and restore individual filegroups. In Enterprise Edition, SQL Server also has the ability to have the Primary and select other filegroups online and responding to queries while you are restoring individual filegroups.
For situations like this, it seems like filegroup level backups and restores will be the answer to our problem. The plan we make looks like this:
- Move the monthly tables into a filegroup named MONTHLYTABLES
- Set the filegroup as READ_ONLY
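Sketched in T-SQL, that setup might look like this (the file path, table, and column names are hypothetical stand-ins):

```sql
--Hypothetical setup: add the filegroup, give it a file, move the monthly
--tables onto it, then mark it read-only.
ALTER DATABASE ImportantDB ADD FILEGROUP MONTHLYTABLES;
ALTER DATABASE ImportantDB
    ADD FILE (NAME = 'MonthlyTables1',
              FILENAME = 'T:\MSSQL\Data\ImportantDB_Monthly1.ndf')
    TO FILEGROUP MONTHLYTABLES;
--Moving a table means recreating its clustered index on the new filegroup.
CREATE UNIQUE CLUSTERED INDEX cx_MonthlyTable
    ON dbo.MonthlyTable (MonthlyTableId)
    WITH (DROP_EXISTING = ON)
    ON MONTHLYTABLES;
ALTER DATABASE ImportantDB MODIFY FILEGROUP MONTHLYTABLES READ_ONLY;
```

Note that setting a filegroup READ_ONLY requires exclusive access to the database, so this is a maintenance-window operation.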
Whenever we need to load data into the filegroup, we plan to:
- Take a filegroup level backup of MONTHLYTABLES
- Set the MONTHLYTABLES filegroup as READ_WRITE
- Start the process to change data
- If there are any issues with the data load, restore over the filegroup from the READ_ONLY backup
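The load-day commands for that plan would look roughly like this (backup paths are hypothetical):

```sql
--Hypothetical load-day sequence for the plan above.
BACKUP DATABASE ImportantDB
    FILEGROUP = 'MONTHLYTABLES'
    TO DISK = 'T:\Backup\ImportantDB_Monthly.bak';
ALTER DATABASE ImportantDB MODIFY FILEGROUP MONTHLYTABLES READ_WRITE;
--...run the monthly load...
--On failure, we'd like to run this restore and be done:
RESTORE DATABASE ImportantDB
    FILEGROUP = 'MONTHLYTABLES'
    FROM DISK = 'T:\Backup\ImportantDB_Monthly.bak';
```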
Why Filegroup Level Restores Don’t Solve This Problem
The restore sequence described above in SQL Server won’t work, by design. SQL Server’s mission with filegroup level restores is described in Books Online here:
Each filegroup-restore sequence restores and recovers one or more offline filegroups to a point consistent with the database.
As soon as the MONTHLYTABLES filegroup is made READ_WRITE, we can no longer restore over it with a backup made from a prior point in time and bring it online. We can perform the filegroup level RESTORE if we’re in the appropriate recovery model, but in order to bring it online we also have to restore additional log backups to bring it to a point in time that is consistent with the rest of the database. (Don’t believe me? Check out Tibor Karaszi’s fine blog on the topic with sample code. Tibor’s examples are for the full recovery model— play around with things and you’ll see that you can’t make it work in the simple recovery model, either.)
Why does it have to be this way? It’s because SQL Server doesn’t track transaction status by filegroups, and its mission is to ensure transactional consistency within a database. If we could restore individual filegroups to different points in time and bring everything online, we wouldn’t have a way to ensure consistency.
Alternative 1: Database Snapshots
If we’re running SQL Server Enterprise Edition, we can automate a solution to our process by using a database snapshot. Our monthly load procedure would look like this:
- Create a database snapshot
- Start the process to change data
- If there are any issues with the data load, either:
- Revert to the snapshot (this will impact the whole database)
- Truncate and reload the monthly tables using the snapshot as a data source
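A sketch of what that could look like (logical file name, paths, and table names are hypothetical; database snapshots require Enterprise Edition):

```sql
--Hypothetical: create a snapshot before the monthly load starts.
--The snapshot needs one sparse file per data file in ImportantDB.
CREATE DATABASE ImportantDB_PreLoad
    ON (NAME = 'ImportantDB',
        FILENAME = 'T:\MSSQL\Data\ImportantDB_PreLoad.ss')
    AS SNAPSHOT OF ImportantDB;

--Option 1 on failure: revert the WHOLE database to the snapshot.
--RESTORE DATABASE ImportantDB
--    FROM DATABASE_SNAPSHOT = 'ImportantDB_PreLoad';

--Option 2 on failure: reload just the monthly tables from the snapshot.
--TRUNCATE TABLE ImportantDB.dbo.MonthlyTable;
--INSERT ImportantDB.dbo.MonthlyTable
--SELECT * FROM ImportantDB_PreLoad.dbo.MonthlyTable;
```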
This solution will work, but depending on the size of the monthly tables, it may have some performance problems. The database snapshot uses IO to preserve the original copy of every page changed in the monthly tables. If the monthly tables are very large, we’ll end up issuing a lot of writes against our snapshot file as well as the database we are repairing.
If it’s OK to lose any other data that has changed since the snapshot was taken, the revert option may work for us— provided that we don’t mind dropping any full text catalogs and rebuilding the transaction log.
The only time I really like a database snapshot option is when I’m 100% sure that other processes have not been updating data. That just isn’t the case for a lot of databases, so we may need to seek other options.
Alternative 2: Views
We could also solve this problem by moving the monthly tables into a new database named ImportantMonthly. Then, we would replace the original tables in ImportantDB with views. A simple view of “SELECT [columnnames] from ImportantMonthly.schema.table” would be put in place for each table.
With the views defined in this way, they would function just like tables and allow insert, update, delete, and select statements to be run against the base tables.
This would allow the process to become:
- Take a full database backup of ImportantMonthly
- Start the process to change data
- If there are any issues with the data load, restore over the ImportantMonthly database
The main gotcha to be aware of when using views in this manner is that if the definition of the tables in ImportantMonthly changes, you may be required to refresh the metadata for the views with the procedure sp_refreshview.
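Here’s a minimal sketch of the view approach, with hypothetical table and column names:

```sql
--Hypothetical: the monthly table now lives in ImportantMonthly,
--and ImportantDB keeps a same-named view pointing at it.
USE ImportantDB;
GO
CREATE VIEW dbo.MonthlyTable
AS
SELECT MonthlyTableId, LoadDate, Amount
FROM ImportantMonthly.dbo.MonthlyTable;
GO
--If the base table's definition changes, refresh the view's metadata:
EXEC sp_refreshview 'dbo.MonthlyTable';
```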
Alternative 3: Synonyms
Similar to the view solution, we could move the tables into ImportantMonthly, and then replace the original tables in ImportantDB with synonyms.
Synonyms in SQL Server don’t have the same metadata issue as views, but there is one important thing to know: you cannot reference a synonym that is on the other side of a linked server.
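The synonym version of the same idea, again with hypothetical names:

```sql
--Hypothetical: replace the local table with a synonym. Queries against
--dbo.MonthlyTable in ImportantDB resolve to ImportantMonthly at runtime,
--with no metadata to refresh when the base table changes.
USE ImportantDB;
GO
CREATE SYNONYM dbo.MonthlyTable
    FOR ImportantMonthly.dbo.MonthlyTable;
```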
The Devil is in the Details
This example highlights one thing to me: when you try to solve a problem with just one kind of tool, you have to make compromises. To build a strong, reliable solution and mitigate the risks, database administrators need the collaboration of whoever controls the application code for the database, whether that be a vendor or an in-house development team.
Do you use heaps in SQL Server? How can heaps impact your query performance and database maintenance? In this 30-minute webcast I will show you how heaps behave differently than tables with clustered indexes in SQL Server. You’ll learn how to identify heaps, how to measure their size, and how to plan a change to convert your heaps to clustered indexes.
This talk is aimed toward DBAs and developers who have worked with SQL Server for a year or more. I cover LOTS of dynamic management views and tools to view table structures – it’s OK if you haven’t used them all before; you’ll still be able to get the concepts.
Want the scripts used in the talk? Scroll on down– scripts are included in this post below the video.
/* Author: Brent Ozar Unlimited® Purpose: Demo some of the peculiarities of heaps in SQL Server. This script is for use on TEST servers (not production). It creates and drops a database and uses undocumented stored procedures. References: We run this script in our video demo on heaps. Check out the video for interpretation of the script results and more. Usage notes: This demo uses both undocumented and documented procedures in SQL Server. The undocumented procedures are NOT meant to be run against production they are widely used, but be careful with all undocumented procedures and use at your own risk. ("Undocumented" means Microsoft doesn't officially support them.) This demo also uses sys.dm_db_database_page_allocations, which is a new undocumented DMV in SQL Server 2012. If you're testing these scripts against an earlier version of SQL Server, you can use DBCC IND (also undocumented) to see the same information. We include an example of DBCC IND with the syntax in the demo. */ SET STATISTICS IO OFF; SET STATISTICS TIME OFF; GO --Recreate the HeapsOfHeaps database for testing. --This drops the database - use with care. (Hopefully you --do not have production databases with that name.) WHILE @@TRANCOUNT > 0 BEGIN ROLLBACK END IF DB_ID('HeapsOfHeaps') IS NOT NULL BEGIN USE master; ALTER DATABASE HeapsOfHeaps SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DROP DATABASE HeapsOfHeaps; END GO --Create and use the database. --This creates it in the default location with default sizes-- change if you wish. CREATE DATABASE HeapsOfHeaps; GO USE HeapsOfHeaps; SET NOCOUNT ON; SET STATISTICS IO OFF; SET STATISTICS TIME OFF; GO --Create a heap! CREATE TABLE dbo.DataPile ( datapileid BIGINT IDENTITY NOT NULL, col1 VARCHAR(1024) NOT NULL ); GO --Insert 1000 rows. DECLARE @i INT = 1; BEGIN TRAN WHILE @i <= 1000 BEGIN INSERT dbo.dataPile (col1) SELECT REPLICATE('A',200) ; SELECT @i = @i + 1; END COMMIT; GO --Here is a query that will show if you have heaps. 
--You want to look for Index_Id = 0. SELECT sc.name AS [Schema Name] , so.name AS [Table_Name] FROM sys.indexes si JOIN sys.objects so ON si.object_id = so.object_id JOIN sys.schemas sc ON so.schema_id = sc.schema_id WHERE so.is_ms_shipped = 0 /* Not shipped by Microsoft */ AND si.index_id = 0 /* Index Id 0 = A Heap */ AND so.type = 'U'; /* User table */ GO --You can also look at the table and TRY to find indexes exec sp_helpindex 'DataPile'; GO --Look at the structure of our heap with DBCC IND. --This is a special system command (undocumented!). --PageType 10= an IAM page (Index Allocation Map). --Parameters: DatabaseName, TableName, IndexID DBCC IND('HeapsOfHeaps', 'DataPile',0) GO --In SQL 2012 we can also query a new DMV to see allocations! --This DMV is also undocumented. --Parameters: DatabaseId, ObjectId, IndexId, PartitionId, mode select * from sys.dm_db_database_page_allocations(db_id(),object_id('DataPile'),0,null,'DETAILED'); GO --We can also get information about the table's fragmentation. --This is a documented procedure, but I'm running it in detailed mode so it will look --at all the pages in the active table. Be very careful if you run this in detailed mode --against a production database. SELECT alloc_unit_type_desc, index_depth, page_count, avg_page_space_used_in_percent, record_count, forwarded_record_count FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('DataPile'), 0, NULL,'detailed'); GO --sys.dm_db_partition_stats looks at the metadata. select used_page_count, in_row_used_page_count, reserved_page_count, row_count from sys.dm_db_partition_stats where object_name(object_id) = 'DataPile'; --Count how many reads we have to do to scan the table. --Run the select, then go to the Messages tab. --Look at the number of "logical reads" we did. SET STATISTICS IO ON; SET STATISTICS TIME ON; SELECT * FROM dbo.dataPile; GO --Now make half of the rows have larger values in them. 
UPDATE dbo.DataPile SET col1=REPLICATE('B',1000) WHERE dataPileid % 2 = 0; GO --How many reads does it take to scan the table now? --Run the select, then go to the Messages tab. --Look at the number of "logical reads" we did. SELECT * FROM dbo.dataPile; GO --Why did we do so many more logical reads???? --Let's find some information by looking at the table's fragmentation now. --Note: look at the record_count column-- we didn't insert any records! --It's counting wrong by the number of forwarded records. --(This is documented in Books Online, by the way.) SELECT alloc_unit_type_desc, index_depth, page_count, avg_page_space_used_in_percent, record_count, forwarded_record_count FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('DataPile'), 0, NULL,'detailed'); GO --We can see something special happened reading, as well. --Check out forwarded_fetch_count. select leaf_insert_count, leaf_update_count, forwarded_fetch_count from sys.dm_db_index_operational_stats(db_id(),object_id('DataPile'),0,null); GO --What about deletes? --Delete all of the rows from the table, except for five. DELETE FROM dbo.datapile where datapileid > 5; GO --OK, we only have FIVE rows in this table now. We started with 1000. --How many reads does it take to read now? --Run the select, go to the messages tab, and look at logical reads. SELECT * FROM dbo.dataPile; GO --What changed? --We got rid of MOST of the forwarded record pointers. --But we still have 80+ pages allocated! For only five rows! SELECT alloc_unit_type_desc, index_depth, page_count, avg_page_space_used_in_percent, record_count, forwarded_record_count FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('DataPile'), 0, NULL,'detailed'); GO --sys.dm_db_partition_stats agrees that we still have LOTS of pages. --It gets the row count right, by the way. 
select used_page_count, in_row_used_page_count, reserved_page_count, row_count from sys.dm_db_partition_stats where object_name(object_id) = 'DataPile'; GO --Create a nonclustered index on our heap. CREATE UNIQUE NONCLUSTERED INDEX ix_datapile_nc_datapileid ON dbo.DataPile (datapileid); GO --Now look at what pages are being used for the nonclustered index. --We are using the undocumented new DMV again. --We have an INDEX page type instead of a DATA page type. */ select * from sys.dm_db_database_page_allocations(db_id(),object_id('DataPile'),3,null,'DETAILED'); GO --Now, take the index page ID: --This is the allocated_page_page_id for the row where page_type_desc=INDEX_PAGE --Plug it into DBCC PAGE. --This is another undocumented procedure that lets us look at page data. --Check out the values in the HEAP RID Column and record them. DBCC TRACEON (3604); DBCC PAGE (HeapsOfHeaps, 1, 284,3) GO --Now rebuild the heap. --We can do that-- we can rebuild a heap! (On SQL Server 2008 and above.) --Prior to SQL Server 2008, the most popular way to de-fragment a heap is to --add and remove a clustered index--- but it has the same impact seen here. ALTER TABLE dbo.DataPile REBUILD; GO --Now, check out our nonclustered index... select allocated_page_page_id, page_type_desc from sys.dm_db_database_page_allocations(db_id(),object_id('DataPile'),3,null,'DETAILED'); GO --Hey, wait a second, the PAGE IDs are different for the Non-Clustered index. --And we didn't rebuild the non-clustered index--- we just asked for a rebuild of the heap! --But yet our non-clustered index shows us that it is on totally new pages now. --What's up with that???? DBCC TRACEON (3604); DBCC PAGE (HeapsOfHeaps, 1, 370,3) GO -- Compare those HEAP RIDS to what you recorded before. --We explain why this is and what this means in the video. --Wow, rebuilding a big heap with nonclustered indexes could --cause LOTS and LOTS of unexpected IO. 
--Really, the times when we want to have a heap in SQL Server are the exception rather than the rule.
--We like to default to clustered indexes in SQL Server---
--unless we have performance tests showing that heaps are better for that use case!
WANT TO BE AN INDEXING PRO? WE’VE GOT TRAINING!
For $299 you can get 18 months of access on your desktop, laptop, iPad, or iPhone to more than six hours of video training on SQL Server indexes, chock full of demo scripts and quizzes. Get more info or buy it now!
When I first started going to conferences, I thought all the value was up on the screen. I took copious amounts of notes in every session. I asked questions. I paid close attention. When the sessions were over, I packed up and went home, eager to implement what I’d learned. Of course, reality set in when I got back home. I had a gazillion emails and help desk tickets, and by the time I dug my way back out, I’d forgotten a lot of what I’d learned. My notes sat dormant on my hard drive for weeks or months.
I’ve since learned that the most valuable things at conferences are peers, not PowerPoints.
I started the Free-Con to connect and inspire people. It’s a free one-day event before conferences, and it completely ignores the conference itself. We focus on improving our content, our people skills, and our sense of community. I only invite a limited number of community content creators, but they’re from all walks of life – new bloggers, MVPs, and corporate spokespeople.
The next one is this Friday, May 18th in downtown Chicago, lining up with SQLSaturday 119. We’ve had a few last-minute cancellations, so now’s your chance to sneak in.
What to Expect
Free-Con feels like a panel discussion without an audience – just one big panel. We’ve got a great list of attendees with all kinds of different experience, and everybody wants to share and help each other. To promote discussion, I’ve got a list of topics and short presentations ready, but expect to spend most of the time interacting with each other.
FreeCon Chicago’s agenda includes:
- Your Resume is Backwards – Your resume is organized by company, then position, then skill. What if we organized it the other way and looked for gaps?
- Lean Methodology – “Lean” is part of a manufacturing strategy for minimizing waste. I’ll explain why it’s relevant to your career and your day-to-day work.
- Zen and V12 Maintenance – Can your work bring you closer to people who inspired you?
- Tufte’s Supergraphics in Action - Edward Tufte wrote about how to reinvent presentations using large printed handouts. Software manufacturers are starting to do a surprisingly good job of it. We’ll examine one and see how to turn it into a full-blown presentation.
- And more – like group discussions around where we find inspiration and how we can improve SQLSaturday.
The event will start at 9AM and last until 4PM. With this many bloggers in one room, we’ve got an incredible opportunity to build relationships and blog posts. We’ll cover three ways to get other bloggers involved in your blog, and then we’ll actually do it live. (Wait, not “do it” – although, I mean, if you find one of the other bloggers attractive, I suppose you could. Just don’t blog about that.)
Doors will open at 8, and if you make the list, you’re welcome to show up early. A continental breakfast will be served at 9am, and you can’t come to Chicago without having Giordano’s pizza for lunch, so that’s catered in too.
Who To Expect
This event is about you meeting your peers – not just SQL Server peers, but other people who want to build their brand online to get better jobs. I picked people I enjoy spending time with, and if I like hanging out with you, I’m pretty sure you’re going to like the other attendees. You have a lot in common, and you’ll do a better job at it if you’ve got friends you can ask for advice and help.
The attendees for this Free-Con are:
- Argenis Fernandez (Blog – @DBArgenis)
- Bill Lescher (@BLescher)
- Bob Pusateri (Blog – @SQLBob)
- Christina Leo (Blog – @ChristinaLeo)
- Eric Harrison (LinkedIn)
- Garima Sharma (LinkedIn)
- Hope Foley (Blog – @Hope_Foley)
- Jason Fay (Blog – @JFay_DBA)
- John Mazzolini (LinkedIn – @JMazzolini)
- Josh Fennessy (Blog – @JoshuaFennessy)
- Louis Fritz (LinkedIn)
- Norman Kelm (Blog – @NormanKelm)
- Ramin Surya (LinkedIn – @RSurya)
- Scott Ellis (LinkedIn)
- Tim Ford (Blog – @SQLAgentMan)
- Tim Radney (Blog – @TRadney)
- Tom Norman (LinkedIn – @tjnorman57)
Former Free-Con attendees include:
- Aaron Bertrand (Blog – @AaronBertrand)
- Andy Leonard (Blog – @AndyLeonard)
- Crys Manson (Blog – @CrysManson)
- David Stein (Blog – @Made2Mentor)
- Grant Fritchey (Blog – @GFritchey)
- Jeremiah Peschka (Blog – @Peschkaj)
- John Robel (Blog – @JohnRobel)
- Jorge Segarra (Blog – @SQLChicken)
- Karen Lopez (Blog – @DataChick)
- Kendra Little (Blog – @Kendra_Little)
- Mike Walsh (Blog – @Mike_Walsh)
- Noel McKinney (Blog – @NoelMcKinney)
- Rob Drysdale (Blog – @ProjMgr)
- Steve Jones (Blog – @Way0utwest)
- Stuart Ainsworth (Blog – @StuartA)
- Tom LaRock (Blog – @SQLRockstar)
- Yanni Robel (Blog – @YanniRobel)
What Past Attendees Said
“I got to pick the brains of some really smart, really talented, really motivated, really funny, really smart people. For someone who is relatively new to the database world, and really new to speaking and blogging, this was like a year’s worth of lessons crammed into a day. I got to sit next to people that I never thought I could and ask them anything about their career, their experiences, for advice, and listen to their stories. I cannot tell you how cool that was.”
“For me, FreeCon defined some things that have been running around in my brain for the last 18 months or so. It also inspired me to pump up my blogging, to try to improve my game and my brand. I’ve done a very large number of actions in the two weeks since FreeCon went down. Many of them are already bearing fruit, for example, I’m now hosted on my own domain. Others may bear fruit in the near term, and I have EXTREMELY high hopes for these. Still more are the high level goals that I’ve started to define that will likely take me years to deliver.”
“It was a terrific event and I learned a lot. In actuality, there’ll be a free PDF up on my blog fairly soon that is a direct result of attending. And there’ll likely be a second PDF download available before the end of April.”
Update – Filled Up!
Thanks for your emails – we’re now filled, and the attendees above are the final list.
I got my start in the hospitality business by working at distressed properties. Distressed is the polite term that hotel management companies use when a hotel is in financial trouble, has health violations, and nobody in their right mind would stay there. Thankfully, there are a lot of tourists who are not in their right mind. Think about that the next time you pull into a hotel after a long day of travel and say, “Just give me a room.” Here’s an example of a distressed hotel’s pool (and yes, I worked at this hotel):
That particular hotel had been hit by a tornado, and the owners were fighting with the insurance company over who was going to pick up the tab for the damage. After the tornado struck, the first management company simply closed the restaurant and shut the power – without emptying or cleaning the freezers. I had the joyous task of getting maintenance men to clean out hundreds of pounds of rotting steak.
When I say “distressed hotel,” it’s not just the building that’s distressed.
At distressed properties, a pattern emerged pretty quickly:
- The hotel owners would hire the management company because things weren’t going well
- The management company brought in their own General Manager
- The GM would bring in a bunch of their friends for management positions
- Things would get better – for a while
- The hotel owners would fire the old management company and go back to the first step
- The GM would leave the property, take their friends along, and go to another property
Each time the management company changed and the GM left, I noticed that the GMs were bringing a lot of friends along for the ride. When the third new GM (Tom) came in, I made it a point to be buddy-buddy with him. He was thankful for the help, and he taught me a ton about the hotel business in a matter of months.
When the owners fired this new management company, Tom headed off to Georgia to manage yet another property, but this time something different happened. The phone rang, and Tom was on the other end.
Tom: “I’ve just taken over this hotel in Savannah. You wanna come down here and manage the front desk staff?”
I said, “I have no idea how to do that.”
Tom: “Neither did the last guy here, but at least I know I can work with you, and you’ll work hard for me.”
Sold. I packed my car and moved down to join him. It was the easiest promotion I’d ever gotten. That job was never posted in the newspaper, and if it had been, I’d have never known to look in Savannah’s newspaper – and I wouldn’t have made the cut over all the other applicants anyway. Tom called me directly and hired me because I’d already gone through a long interview with him – I’d worked for him.
Your Coworkers Are Interviewing You Right Now
There’s a pretty good chance that if your manager quits, you might not want to work for them somewhere else. You probably also only have one manager, and you don’t want to put all your eggs in that one basket. After all, they may never leave the company.
Your coworkers, on the other hand, are plentiful and ambitious. I bet at least one of your coworkers leaves this nasty hole every year, and they break free to get a better job somewhere else. Wouldn’t it be nice if they called you up a few weeks later and gave you the inside track on a new job? It’s the easiest way to get ahead, and you’ll be working with at least one person who already knows and trusts you.
Recently, I asked you to write a testimonial for the Incredible DBA Team, and I asked you how it made you change the way you see your job. Now it’s time to go back, read the comments, and think about how you can generate some of those same testimonials for yourself. Your fellow developers, DBAs, sysadmins, SAN admins, managers, project managers, and end users are your very best network.
Who do you want to take you along? What can you do today to make sure they’d want to take you along for their adventures? How do you get them to write testimonials like these?
Whether you’re a developer or a DBA, you’re probably frustrated with query performance. Why is the app running slow? Why does query performance vary so much? Why can’t your queries run consistently fast?
You’ll learn the tricks of performance tuning in this three-hour session as Brent covers:
- How to find the worst queries on the server in 60 seconds
- What statistics are and how to tell if they’re causing problems
- How to think like the engine and build your own execution plan
- Why filters are cheap and sorts are expensive
- When you should throw indexes at the problem, and exactly when to stop
- Why temp tables might be the fastest way to fix many problems
- How to identify parameter sniffing problems and fix ‘em for good
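As a taste of that first bullet, here's a minimal sketch of the idea using the documented sys.dm_exec_query_stats and sys.dm_exec_sql_text DMVs. The TOP 10 cutoff and the CPU-time ordering are just illustrative choices, not the exact script from the session, and remember this DMV only sees plans still in the cache:

```sql
--Top 10 cached queries by total CPU time.
--Anything evicted from the plan cache (or run since a restart cleared it) won't show up.
SELECT TOP 10
    qs.total_worker_time / 1000 AS total_cpu_ms,
    qs.execution_count,
    qs.total_logical_reads,
    --Carve the individual statement out of the batch text using the byte offsets.
    SUBSTRING(st.text,
        (qs.statement_start_offset / 2) + 1,
        ((CASE qs.statement_end_offset
              WHEN -1 THEN DATALENGTH(st.text)
              ELSE qs.statement_end_offset
          END - qs.statement_start_offset) / 2) + 1) AS query_text
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY qs.total_worker_time DESC;
```

Swap the ORDER BY to total_logical_reads if you're hunting read-heavy queries instead of CPU hogs.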
This session is for developers and DBAs with 1-4 years of experience who are comfortable writing queries and adding indexes, but aren’t happy with the results they’re getting.
Want to Watch the Whole Thing?
Here are your options:
- Join me in a live webcast for $249
- Buy the video right now for $149
- Leave a comment below. I’ll draw two random winners tomorrow (Tuesday 5/15) at 11am Eastern, announce their names on our Tuesday webcast, and give them free access to both the webcast and the video. One entry per person – US residency not required. Update – the contest is closed. We felt generous and gave away four complete packages to Andrew, Dustin Laine, Randy Jaye, and Yu Chen. Congrats, and if you didn’t win – hey, you can always buy in.