How Does SQL Server Store Data?

Last Updated 7 years ago

Let’s step back and take a look at the big picture. (Today, I’m writing for beginners, so you advanced gurus can go ahead and close the browser now. I’m going to simplify things and leave a lot out in order to get some main points across. Don’t well-actually me.)

Microsoft SQL Server databases are stored on disk in two files: a data file and a log file.
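If you want to see those two files for yourself, a query along these lines should list them for whatever database you're connected to. Treat it as a minimal sketch: sys.database_files is a documented catalog view, while the size_mb alias is just a name made up for this example.

```sql
-- Run this in the database you want to inspect.
-- size is reported in 8KB pages, and 128 pages make 1MB.
SELECT name,
       type_desc,            -- ROWS = data file, LOG = log file
       physical_name,
       size / 128 AS size_mb
FROM sys.database_files;
```

A brand new database will normally show one ROWS file and one LOG file; bigger databases can have more of each.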

What’s Stored in the Data File (MDF)

Let’s start with a simple table. If you want to follow along with my code, this will work on SQL Server 2005 & newer, but please do it in a brand new database rather than reusing one of your existing ones. We’ll be looking at the log file later, and you won’t be able to quickly find the relevant entries in a sea of unrelated ones. Off we go:

CREATE TABLE dbo.Friends (id INT IDENTITY(1,1), FriendName VARCHAR(30));
GO
INSERT dbo.Friends (FriendName) VALUES ('Brent Ozar');
INSERT dbo.Friends (FriendName) VALUES ('Jeremiah Peschka');
INSERT dbo.Friends (FriendName) VALUES ('Jes Schultz Borland');
INSERT dbo.Friends (FriendName) VALUES ('Kendra Little');
GO

We now have a table. I was going to say you’ve got four friends, but we’re not your friends. Let’s take this slow, alright? We just met. You can start by buying us a drink first. Let’s see how the table is stored in SQL Server behind the scenes – look under the table, as it were:

DBCC IND('MyDatabaseName', 'Friends', -1);

This command is totally safe to run – it just lists out where SQL Server is storing your data. Replace ‘MyDatabaseName’ with your database’s name. The result is a list of pages where SQL Server stored the Friends table.
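To peek inside one of those pages, something like the following usually works. It's a sketch, not gospel: DBCC PAGE is undocumented (like DBCC IND), and the file number 1 and page number 100 below are placeholders, not values from this post. Swap in a PageFID and PagePID pair from your own DBCC IND output.

```sql
-- Send DBCC output to the query window instead of the error log.
DBCC TRACEON(3604);

-- Arguments: database name, file number, page number, detail level (0-3).
-- 1 and 100 are placeholders: use the PageFID and PagePID values that
-- DBCC IND returned for a row where PageType = 1 (a data page).
DBCC PAGE('MyDatabaseName', 1, 100, 3);
```

At detail level 3, the output includes the page header followed by each row on the page, so you should be able to spot the friend names you inserted.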

Data Files Are Broken Up Into 8KB Pages

These pages are the smallest unit of storage both in memory and on disk. When we write the very first row into a table, SQL Server allocates an 8KB page to store that row – and maybe a few more rows, depending on the size of our data. In our Friends example, each of our rows is small, so we can cram a bunch of ’em onto a page. If we had bigger rows, they might take up multiple pages even just to store one row. For example, if you added a VARCHAR(MAX) field and stuffed it with data, it would span multiple pages.

Each page is dedicated to just one table. If we add several different small tables, they’ll each be stored on their own pages, even if they’re really small tables.
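For a quick count of how many pages a table is using, a query like this one is a reasonable starting point. It assumes the dbo.Friends table from earlier; used_kb is just an illustrative alias.

```sql
-- Page and row counts for dbo.Friends in the current database.
SELECT OBJECT_NAME(object_id) AS table_name,
       index_id,                       -- 0 = heap, 1 = clustered index
       row_count,
       used_page_count,
       used_page_count * 8 AS used_kb  -- each page is 8KB
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('dbo.Friends');
```

Create a second tiny table and run it again with that table's name: it gets its own pages rather than sharing with Friends.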

If we shut down the SQL Server, started it back up again, and then issued the following query:

SELECT * FROM dbo.Friends WHERE FriendName = 'Brent Ozar'

SQL Server would check to see what page the dbo.Friends table is on, then read our entire 8KB page from disk, and cache that 8KB page in memory. I say “entire” as if it’s a big deal, but I want to make a point here: pages are stored identically both in memory and on disk, and they’re the smallest unit of caching. If you use SQL Server’s data compression, the page stays compressed in memory too, and the data is only uncompressed as it’s read to satisfy a query – you get the benefit of compression in memory as well as on disk.
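You can get a rough idea of how much of the current database is sitting in that cache with a query like this. It's a sketch that assumes you have VIEW SERVER STATE permission; the column aliases are made up for the example.

```sql
-- One row per 8KB page currently cached in memory for this database.
SELECT COUNT(*)     AS cached_pages,
       COUNT(*) * 8 AS cached_kb   -- 8KB per page
FROM sys.dm_os_buffer_descriptors
WHERE database_id = DB_ID();
```

Run it right after a restart, then again after the SELECT against dbo.Friends, and the count should go up as pages get read in.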

What happens if we change a data page? For example, if we issue the following command, what happens:

INSERT dbo.Friends (FriendName) VALUES ('Lady Gaga');

That’s where the log file comes in.

What’s Stored in the Log File (LDF)

The log file is a sequential record of what we did to the data. SQL Server writes down, start to finish, what we’re trying to do to those helpless, innocent data pages.

Your first reaction is probably, “Wow, I never want to look in there because my users do horrible, unspeakable things to my database server.” Good news – SQL Server doesn’t need to log the SELECT statements because we’re not affecting the data, and that’s usually where the worst nastiness happens. Bad news – even if you did want to look in the log file, SQL Server doesn’t give you an easy way to do it. The log file exists for SQL Server, not for you.

When we insert, update, or delete rows in our table, SQL Server first writes that activity into the log file (LDF). The log file must get hardened to disk before SQL Server says the transaction is committed.
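If you're curious what that logged activity looks like, the undocumented fn_dblog function can show it. This is a sketch for a throwaway test database only: fn_dblog is unsupported, and the filter assumes the dbo.Friends table from earlier.

```sql
-- fn_dblog is undocumented and unsupported: test databases only.
-- The two NULLs mean "no starting LSN, no ending LSN", i.e. the whole active log.
SELECT [Current LSN],
       Operation,          -- e.g. LOP_INSERT_ROWS for an insert
       Context,
       [Transaction ID],
       AllocUnitName
FROM fn_dblog(NULL, NULL)
WHERE AllocUnitName LIKE 'dbo.Friends%';
```

This is why a brand new database helps: in a busy one, those few rows would be buried under everyone else's log records.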

But not the change to the data page – that part doesn’t have to hit the disk right away. See, SQL Server knows you’re the kind of person who makes lots of changes to the same data, over and over. You’re a busy person with things to do and data to trash. SQL Server can keep the same data page in memory for a while, and then flush it out to disk later – as long as the log file was written.
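Those changed-in-memory pages are often called dirty pages, and you can count them. The following is a minimal sketch, assuming VIEW SERVER STATE permission and a test database where running a manual CHECKPOINT is harmless.

```sql
-- Pages changed in memory but not yet written to the data file.
SELECT COUNT(*) AS dirty_pages
FROM sys.dm_os_buffer_descriptors
WHERE database_id = DB_ID()
  AND is_modified = 1;

-- Ask SQL Server to write this database's dirty pages to disk now.
CHECKPOINT;

-- Same count again; it should drop once the checkpoint finishes.
SELECT COUNT(*) AS dirty_pages
FROM sys.dm_os_buffer_descriptors
WHERE database_id = DB_ID()
  AND is_modified = 1;
```

SQL Server runs checkpoints on its own schedule too, so you don't normally need to issue them by hand.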

When Windows crashes hard or somebody pulls the power cables out from under your SQL Server, SQL Server will use the database’s log file on startup. SQL uses the log file to reconcile the state of the data file, deciding which transactions should be applied to the data file and which ones should be rolled back.

How the Data File and Log File are Accessed

This starts to point to a significant storage difference between these two files.

Log files are written to sequentially, start to finish. SQL Server doesn’t jump around – it just makes a little to-do list and keeps right on going. Eventually when it reaches the end of the log file, it’ll either circle back around to the beginning and start again, or it’ll add additional space at the end and keep on writing. Either way, though, we’re talking about sequential writes. It’s not that we never read the log file – we do, like when we perform transaction log backups. However, these are the exception rather than the norm.
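To see how full the log is, and whether anything is stopping SQL Server from circling back to the beginning, something like this is a reasonable first look. It's a sketch using documented commands against the current database.

```sql
-- Log size and percent used for every database on the instance.
DBCC SQLPERF(LOGSPACE);

-- What, if anything, is preventing reuse of this database's log space.
SELECT name,
       recovery_model_desc,
       log_reuse_wait_desc   -- NOTHING means the log is free to wrap around
FROM sys.databases
WHERE database_id = DB_ID();
```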

Data files, on the other hand, are a jumbled mess of stuff. You’ve got tables and pages all over the place, and your users are making unpredictable changes all over the place. SQL Server’s access for data files tends to be random, and it’s a combination of both reads and writes. The more memory your server has, the fewer data file reads happen – SQL Server will cache the data pages in memory and just work off that cache rather than reading over and over. This is why we often suggest stuffing your SQL Server with as much memory as you can afford; it’s cheaper than buying good storage.
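To compare how the two files are actually being used, a query along these lines shows read and write counts per file. It's a sketch for the current database; the numbers are cumulative, so they reflect everything since the counters were last reset (typically the last restart).

```sql
-- Read and write activity per file for the current database.
SELECT mf.name,
       mf.type_desc,               -- ROWS = data file, LOG = log file
       vfs.num_of_reads,
       vfs.num_of_writes,
       vfs.num_of_bytes_read,
       vfs.num_of_bytes_written
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id;
```

On a typical server you'd expect the log file to be write-heavy, and the data file to show a mix of reads and writes.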

More Resources for SQL Server Data Storage

Want to learn more? We’ve got video training explaining it! In our free 90-minute video series How to Think Like the SQL Server Engine, you’ll learn:

The differences between clustered and nonclustered indexes
How (and when) to make a covering index
The basics of execution plans
What determines sargability
How SQL Server estimates query memory requirements
What parameter sniffing means, and why it’s not always helpful

60 Comments

sohn
February 20, 2013 11:26 am

Is it just me, or does the video only have audio in the right speaker?

- William Andrus
  February 20, 2013 12:16 pm
  
  It does seem to be on one side only.
  
  - Brent Ozar
    February 20, 2013 12:17 pm
    
    Sorry about that guys! Believe me when I say that stereo audio wouldn’t improve the knowledge from the session. 😉
    
William Andrus
February 20, 2013 12:18 pm

“SQL uses the log file to reconcile the state of the data file, deciding which transactions should be applied to the log file and which ones should be rolled back.”

Shouldn’t it be that it is applied to the data file, not log?

- Brent Ozar
  February 20, 2013 12:19 pm
  
  Great catch! I’ll edit that.
  
Rob Kraft
February 21, 2013 6:03 am

Brent does a great job explaining the basics of data storage in the video. I recommend all developers with any concern about the security of data in SQL Server check out this video. Well done Brent!

Chris Page
February 21, 2013 9:05 am

If writing to logs is done in a sequential manner is there much benefit to be had in using SSDs for log files?

- Brent Ozar
  February 21, 2013 9:08 am
  
  Chris – that’s a fantastic question, and the answer requires thinking a little out of the box – or rather, out of the database. How many databases do you have on the server, and are they all active at the same time? It’s fairly unusual that I see a server with only one active database, so SSDs can end up making a LOT of sense for the log files. However, if you’ve only got one database, like in a data warehouse scenario, you can usually get the throughput you need from 4-6 hard drives in a RAID 10 setup.
  
Matt
February 21, 2013 4:44 pm

The Smiths guitarist was Johnny Marr 🙂

William Meitzen
February 25, 2013 7:49 am

Is this blog post (largely) true for SQL Server 2000?

- Brent Ozar
  February 25, 2013 10:24 am
  
  William – yep, but I’d caution against putting more time into learning SQL Server 2000. It’s well on its way to the graveyard.
  
Merrill Aldrich
March 1, 2013 4:08 pm

I am grateful to Brent for trying and featuring a preview version of this tool – and happy to announce the first public version is out!

http://sqlblog.com/blogs/merrill_aldrich/archive/2013/03/01/public-release-sql-server-file-layout-viewer.aspx

Louie Bao
March 26, 2013 4:44 pm

Surely I can’t be the first one to ACTUALLY follow the examples?

Msg 207, Level 16, State 1, Line 1
Invalid column name ‘Name’.

- Brent Ozar
  March 26, 2013 4:47 pm
  
  Louie – HAHAHA, yeah, you could totally be the first person. When I’m reading a blog, I usually just read the examples and assume that they work. You’re probably #1. Nice find! I tweaked that.
  
Lelala
May 4, 2013 11:26 am

Any idea, why they came up with this 8kb thingy?
Why not 16kb?
Why not 4kb?
Or is it because on early 32-bit systems, 1st-level-cache of most CPUs was 8kb?

Regards

- Richard Peninger
  September 2, 2020 10:58 am
  
  I don’t know but I’ll bet it’s early 32-bit systems.
  
Kerry Wilson
July 25, 2013 12:44 pm

My boss had just asked me yesterday about our SQL backups. I found the data file and the log file. The data file has a timestamp of 07/21/13 @ 22:33 which is the last time I ran the backup procedure using the SQL Enterprise Manager. However, the log file is timestamped at 8:12 this morning (07/25). It is now 12:42 in the afternoon. Why does the log file not have a more current timestamp?? Where are the transactions that have occurred since 8:12 this morning??

- Brent Ozar
  July 25, 2013 12:47 pm
  
  Hi, Kerry. Unfortunately this kind of fast on-demand troubleshooting doesn’t work well in blog comments. Your best bet will be to call Microsoft if you need that question answered quickly.
  
Sachin Boda
November 16, 2013 11:07 am

Hi,

I learned lots of things from your blogs. My question: if one table has more than one page, are all the pages for that table stored sequentially or not?

- Brent Ozar
  November 16, 2013 11:31 am
  
  Hi, Sachin. Let’s think through that. If you’ve got a database that has a lot of tables in it, and you go back to add data for a table that already exists, will there be free pages right there next to it, ready to be used?
  
  - Sachin Boda
    November 16, 2013 11:54 am
    
    I don’t know whether a page will be available or not; that’s why I’m asking you, buddy. But if one table has 10 pages and the 10th page only has 4KB used, then my data will start being stored on the 10th page. Am I right?
    
    - Brent Ozar
      November 16, 2013 12:00 pm
      
      Your answer is in the first several words of your question. You don’t know if the page will be available or not – therefore, you can’t predict ahead of time where your data will be. Sometimes the question is the answer, buddy. 😉
      
      - Chris Hagelstein
        April 30, 2015 9:53 pm
        
        Is there any way to “force” SQL Server to write sequentially? Some function, perhaps, which adds intelligence to determine if the data can be added to a 4KB-filled page, and not grab an empty 8KB page?
      - Brent Ozar
        May 1, 2015 8:44 am
        
        Chris – that would be a phenomenally bad investment since SQL Server doesn’t know how the data is laid out on modern storage. There’s no such thing as sequential blocks when we talk about shared storage and SSDs. Neat idea though!
Sachin Boda
November 16, 2013 11:12 am

If the hard drive crashes or there’s a power failure, then after restarting the server, how does the log file work to make the data file consistent?

- Brent Ozar
  November 16, 2013 11:32 am
  
  Hi, Sachin. That’s a great question, and it’s well beyond the scope of something we can answer quickly in a blog comment. It sounds like you’re on a fun journey to start learning the internals of databases. There’s plenty of SQL Server internals books available to tackle questions like those – just make sure that as you go on that journey, you’re focusing your learning on the parts that will help you get ahead the most. Enjoy!
  
Faisal
November 17, 2013 3:10 am

Hey Brent, you surely did a fine job.

Thank you very much!

Al
March 12, 2014 10:54 am

I’m still confused about the term “committed” in regards to a transaction. When you say a transaction is committed, does that mean it is written to the transaction log file on disk or the data file on the disk?

Thanks

- Brent Ozar
  March 12, 2014 8:42 pm
  
  When SQL Server gets the data into the transaction log, it tells the application that the transaction has been committed. The data page doesn’t have to make it to disk. If SQL Server crashes after the transaction was committed but before the data page makes it to disk, that’s cleaned up during the startup process of SQL Server. SQL reads the transaction log, figures out which things it needs to redo and which things it needs to undo.
  
  - kumar
    June 16, 2015 11:59 pm
    
    Brent –
    
    When will the committed transaction records be moved from the LDF to the MDF? As far as my understanding goes, the checkpoint process moves all dirty pages from the log buffer/buffer cache to the LDF file, and I’m a bit confused about which process moves committed records from the LDF to the MDF.
    
    - Brent Ozar
      June 17, 2015 5:29 am
      
      Kumar – things don’t move from the LDF to MDF. This is a little beyond the scope of something I can tackle fast in a blog post comment, but if you’re interested in those internal details, now’s the time to pick up a book on SQL Server internals. Enjoy!
      
    - Apisak Srihamat
      August 26, 2016 2:31 am
      
      Hi Kumar,
      
      I think you, and anyone else interested in how things move from the LDF to the MDF, can simply watch the video explaining log internals and maintenance at http://download.microsoft.com/download/2/4/4/244C21DD-7601-4DF5-8ADC-0EC4C46BBD46/HDI-ITPro-TechNet-winvideo-MCM_04_LogFilesLecture.wmv
      
      Many more videos at https://technet.microsoft.com/en-us/dn912438
      
Conlan Patrek
July 9, 2014 12:39 pm

Are functions and stored procedures stored in the page structure right next to the data?

Just curious! Thanks!

- Erik
  February 5, 2015 10:32 am
  
  If you’re using SQL 2012 or higher, there is an undocumented DMF that will do the same thing as DBCC IND, but is ultimately more useful:
  
  sys.dm_db_database_page_allocations
  
Ana
April 7, 2015 1:16 pm

Hey Brent,

I was just wondering what is the process used to pull data page from disk to buffer i.e. Data cache? How does it happen? Is the page physically moved from disk to buffer and called physicalio or is there any technical term for it?

Thanks,
Ana

Vasan
June 19, 2015 9:06 pm

“The log file is a sequential record of what we did to the data.”
Are these log records saved in pages too? I’m trying to find the allocation unit size of the log file. Pages (8KB) are the smallest unit of work for reads and writes in the data file, but I’m wondering whether that holds true for the log file.

- Kendra Little
  June 19, 2015 9:52 pm
  
  It’s not the same 8K pages that database pages use. Writes can be up to 60KB (not quite the same as 64, but similar) — I wrote more on this here: https://www.brentozar.com/archive/2012/05/how-big-your-log-writes-spying-on-sql-server-transaction-log/
  
  - MG
    September 28, 2015 1:37 pm
    
    Kendra, my biggest question is about the inner workings of commit. So, when a transaction issues a COMMIT, the process forces SQL to write the transaction to the log disk. Then upon completion of that successful write, it is classified as committed. Is this correct?
    
    - Brent Ozar
      September 28, 2015 1:55 pm
      
      MG – let’s zoom back a little. Even before the transaction issues a commit, the data that it’s changing along the way will be written to the log file.
      
      - MG
        September 28, 2015 2:00 pm
        
        Thanks Brent. I’ve found plenty of info out there that led me to believe it worked differently. So the transaction being written to disk is NOT done only at checkpoint, and is NOT done by the issuing of COMMIT.
GDB
October 23, 2015 11:43 am

Great article!
I know it has been out there for some time, but I am fairly new to SQL.
I have been meaning to research this process for a while, and now feel comfortable having some of my knowledge gaps filled in.

Am@
December 13, 2015 1:29 am

Great job, bud, but I want to know:
Are pages stored in a non-contiguous manner?
I mean, let’s suppose we need the data, and as you said, it’s stored in pages.
Now if we apply joins, we get combined data from two or more tables, so there’s a link between those two.
So are those pages allocated contiguously?

Joseph
April 15, 2016 8:00 pm

The explanation about pages is a bit confusing

“Each page is dedicated to just one table. If we add several different small tables, they’ll each be stored on their own pages, even if they’re really small tables.”

Ok but what about the scenario where the table is large and composed of multiple pages? Then multiple pages are dedicated to one table?

- Brent Ozar
  April 16, 2016 5:49 am
  
  Yep!
  
amitab
May 29, 2016 12:54 pm

Hi to all, I am from Hyderabad, India. I am new to SQL Server. Can anyone suggest which books are useful for getting clear information about the data storage architecture? And where can I get those books? Please.

- Brent Ozar
  May 30, 2016 7:04 am
  
  Amitab – sure, start with the books here: https://www.brentozar.com/archive/2008/08/recommended-books-for-sql-server-dbas/
  
  And you can get them on Amazon.
  
Rashmith
July 21, 2016 6:50 am

How is an image stored in the database?

- Brent Ozar
  July 21, 2016 6:52 am
  
  There’s a few different ways of storing images in SQL Server – varbinary fields, filestream, and filetable.
  
- Rashmith
  July 21, 2016 6:53 am
  
  Can we store the image in a database table?
  
Sadimeh
October 30, 2016 10:14 pm

After the transaction is committed, the data will be moved to the data file.

My question is: if the data is moved to the data file after a commit, then why do we need to take a log backup to truncate the log file? The log file should free up space after moving the data to the data file. Why is a log backup needed to free the space?

- Brent Ozar
  October 31, 2016 8:36 am
  
  Sadimeh – you need to be able to restore the database to a point in time, and the data in the log file helps you do that. If you don’t need point-in-time restore capabilities, then you’d want SIMPLE recovery model instead of FULL.
  
soundharya
December 28, 2016 11:02 am

Could you please explain how the data is stored, and in what format?

- Brent Ozar
  December 29, 2016 8:27 am
  
  Soundharya – for details, pick up a book on internals.
  
Steve
November 22, 2017 12:28 pm

Is there ANY way possible to place a “flat” file (sequential data file) onto a SQL Server database without that data file becoming a sql server table?

- Brent Ozar
  November 22, 2017 12:29 pm
  
  Steve – elaborate more about the problem you’re trying to solve. There’s several ways to do this.
  
Charles Anderson
February 27, 2019 9:38 pm

Great article, It is helpful in problem-solving of SQL Server Store Data.

Vaibhav
April 3, 2019 5:11 am

First of all great article Brent!!
Can someone explain how function and stored procedure definitions are stored internally? I mean, in which data page structure, or is it the same pages where the data is stored?

- Brent Ozar
  April 3, 2019 5:16 am
  
  Vaibhav – thanks, glad you liked it. I’ve never looked at stored proc definition storage because I can’t do anything about it – you might find the answers to things like that in an internals book, though.
  
  - Vaibhav
    April 3, 2019 6:10 am
    
    No worries, just curious. In fact, I didn’t find anything there either, but I assume it’s just data pages. Thanks Brent!
    