Blog

There’s a secret to learning about SQL Server’s table partitioning feature: you need to get your hands on some code and really play with it in a test environment.

In this tutorial, Kendra will walk you through test scripts that set up a sample table partitioning environment. These scripts demonstrate gotchas to look out for and the cool features that can dramatically change your database’s performance.

Want to run the demos yourself? Download Table Partitioning Tutorial: Scripts.

Part 1: Our Mission

How can you use the demo and scripts from today in the best way possible? Find out here. 3.5 minutes

Part 2: How to create your test database and helper objects

It’s demotime! We create a test database with views that will help us quickly see critical information about out partitioned object. 3.5 minutes

Part 3: How to create partitioned objects

Next up we create a partition function. We create filegroups and files, and use a partition scheme to map the filegroups. We then create a partitioned object and indexes. Our helper views (created in Part 2), help us make sure things went as expected. 15 minutes

Part 4: Switch IN! How to use partition switching to add data to a partitioned table

Now for the cool stuff. In this session we explore how partition switching can allow us to snap a pile of data quickly into a partitioned table– and a major gotcha which can derail the whole process. 12 minutes

Part 5: Switch OUT! How to move data out of a partitioned table and merge boundary points

Deleting data is never fun… unless you’re switching data out of a partitioned table. But make sure to mind those best practices. In this section we show how to remove a lot of data from a partitioned table quickly and safely. 8 minutes

Part 6: Table partitioning takeaways

What are the big things we covered that you need to remember? Make sure you got all the important points here. 3 minutes

Learn More

The good news: we’ve got tons more information on table partitioning at http://brentozar.com/go/partitioning

The bad news: things get even more complicated than you see in this post. If you’re looking at switching out and merging partitions, before you get started on that read my post here on best practices and merging boundary points.

Looking for the scripts? Download Table Partitioning Tutorial: Scripts.

See http://BrentOzar.com/go/eula for our End User Licensing Agreement which governs use of the scripts.

↑ Back to top
  1. Kendra,
    Excellent demo and scripts. Thanks for posting it.

    Ameena

  2. Hi Kendra,
    thanks
    Excellent article!

    Itzik Ben-Gan uses “Virtual Auxiliary Table of Numbers” in his articles and books.
    for example here http://www.sqlmag.com/article/sql-server/virtual-auxiliary-table-of-numbers-103056
    I think you have a typo… Create a taily table with 4 million rows… What for? You do not use more than 5,000 rows.

    More comments are comming.

    • Whenever I load up a numbers table I just throw in a ton of rows in case I want to increase the rowcount by some factor. I love this method of creating the numbers table because it’s so fast!

  3. Thanks. I will go through these videos as soon as I can.

    FYI,
    I could not recognize Kendra as first time I have seen the glasses missing:-)

    Regards
    Chandan

  4. Kendra, I just discoverd you last week and have gone through a number of your videos. I first found you when I was looking for info on SQL DBA jobs and I stumbled across your video “Getting Your First DBA Job”. I use to develope DB systems long ago and in a product called FOCUS. Became a Master in it but it wasn’t Relational. Took another job that exposed me to SQL, DB2 and Relational Database design. I always wanted to work only as a DBA but just never got the opportunity, lost job, struggled to get back into IT, finally landed back on my feet. My current role is ETL Informatica / Cognos but would love an entry level / Mid level DBA job. I’ve been reading all the MS MCITP books related to SQL Server database technology. You’re Video “Getting Your First” you mention that the DBA industry is (((HOT))). I agree, never wanted out of IT, but got caught in the down economy. Now I can’t seem to find anything that a guy that’s been out as long as I, in the DBA areana. I know I can do the job if there is a DBA shop set up in a Team Environment. Thought maybe you could help point me in a direction. I’m willing to relocate for the right $ and Location. Thougths. I’m taking a chance that this gets to you..
    Dale.

  5. Kendra,

    Thank you! Excellent work.

  6. Pingback: Four Underutilized Features of SQL Server Enterprise Edition (and One of Standard Edition) | The SQL Herald

  7. Hello,

    I have a couple questions.

    In your demo you mentioned that # of filegroups = # of ranges + 1. Yet, all files are created on the same logical volume. Does it affect performance to map all partitions to less number of filegroups or just to single filegroup?

    Also, I understand why you recommend to create a ‘sandwich’ type of partitioning. But how can one leave first and last partition empty while loading data with BCP utility?

    And thank you for the great step-by-step videos!

  8. First of all congratulations on an amazing set of videos that gets us “very interested” in the world of table partitioning.

    The videos are amazing as they explain (and we can test) how partitioning works, what needs to be taken care of and some of the things we need to pay attention when working with partitioned objects.

    One thing that I haven’t seen discussed that much is “Partitioning Existing Tables”. If you ever have time it would be nice if you explained your recommended approach on this subject.

    Let’s say you have a database with tables that are growing very large (and after plenty of analysis you’ve decided that partition will help), which are becoming very difficult to maintain. What would be better?
    - Create a new (and partitioned) table, dump all the rows and start fresh?
    - Partition the existing object and all its indexes?
    - Would the same steps apply for this scenario?

    Thanks for any future comments on this and once again, congrats and keep it up.
    You just earned a new fan :)

    • Thanks for the comment, Nestor! I’m glad you enjoyed it.

      For transitioning an existing partitioned table, I just always seek the path that’s least likely to end me up watching a production database with a long rollback in progress and nobody able to access the tables. I’ve just been there too often. So for this reason I prefer to load everything up in a new object and do a cutover to it whenever possible– and often this leads to the smallest downtime as well, with the best rollback opportunities.

      This tends to be relatively straightforward for tables that are partitioned by date, where new data coming in all goes into more recent partitions. It can be much trickier when data can be modified across all partitions of the table. Michael Swart did some blog posts on strategies for migrating tables online that might be helpful in making this work: http://michaeljswart.com/2012/04/modifying-tables-online-part-1-migration-strategy/

      Sometimes partitioning the existing data can be OK. I would just be really careful testing everything on a restored copy of the database first to make sure that the amount of IO, log, etc available is all OK, that the runtime is acceptable, and that if you need to roll it back you still have a plan in that scenario.

      But I do have to admit, the “change the one copy of the data” approach just gives me the heebie jeebies. Man, do I hate watching a rollback on that one copy of the data.

      Edit: if you have fancy storage and a TON of data you can get really creative to make this work using SAN snapshots and storage on secondary servers to offload some of the CPU usage/locking issues. Making notes to do a blog post discussing some options!

  9. Great stuff Kendra, very informative and clearly explained. Thanks for taking the time to do this, it is really appreciated.

    I am about to start on a mini project partitioning some large production tables. What I have learnt here will be invaluable.

    Paul

  10. Curious about how switch outs relate to foreign key constraints. If I were to partition by client in a multi-client database and I needed to delete a client, I presume I would need to partition all my tables accordingly and switch out the data in a cascading manner from table to table, i.e. transactions table first, subscriptions table second, users table third….

    Perhaps it is best to simply use partitioning on the lowest level tables (which are usually the highest volume) to avoid this.

    • I would start much farther back — What are you trying to gain by partitioning by client? Is it specifically for a data loading/archive project, or are you trying to solve a query performance problem?

      • Well Kendra, I only started contemplating partitioning today so I haven’t given it a whole lot of thought yet. One of the possible advantages of doing it by client is that we can bulk load new clients into the system without interrupting existing clients. Many clients bring historical data when they join our software and importing 500,000 records without other clients noticing can be tricky. The same thing with deleting. Currently deleting a large client effectively locks the database for several hours so it can only be done during major releases with scheduled downtime.

        We are also looking at partitioning by date which strikes me as the more common approach. It is pseudo-archiving.

        We don’t currently have major issues that demand using partitioning but we have a team of developers working on a new version of our software and are future-proofing. In fact, we don’t even use Enterprise edition right now.

  11. Pingback: Switched Off! | SQL On The Fly

  12. Pingback: SQL Server – Table Partitioning | paulbrewer

  13. Pingback: SQL Server Table Partitioning | paulbrewer

  14. Hi Kendra,
    I watched your presentations and developed a slightly contrived ‘sliding window’ scenario script. It took about 2 hours, over 1 hour was wasted fumbling around with constraints on a table being switched in.

    This worked:
    ALTER TABLE dbo.HeartBeatIn ADD CONSTRAINT [Partition_PK_GE60] CHECK (Partition_PK >= 60)

    This was the original constraint and didn’t work:
    ALTER TABLE dbo.HeartBeatIn ADD CONSTRAINT [Partition_PK_GE60] CHECK (Partition_PK > 59)

    My ‘sliding window’ script, based loosely on your presentation is on http://www.paulbrewer.net . Your presentation was really useful and informative, thanks.

  15. Pingback: T-SQL Tuesday #46 : Table Partitioning to Archive | Andrew McDermid

  16. Hi Kendra!

    After I SWITCHED OUT, Can I delete the orphan filegroup/file?

    Ah! Nice Article!

  17. Hi, stumbled across your site today and it’s very informative. Trying to think of my current problem and how to address as i’m sure it’s not a new problem!

    I would like to partition a large Fact table by day as per your example. In my scenario I receive feeds for the latest day at different points across the day but need to load them as and when they arrive so they are available to users. As soon as i load one feed then the partition is no longer empty and any attempt to switch subsequent feeds for today in will fail with the Partition x must be empty.

    At the moment I think I need 2 tables along lines of
    tblFacts_current ( for all of todays data, drip fed )
    tblFacts_allOtherData (table partition on day for anything older than today)

    At the end of today I know i have all required data, switch in todays data to tblFacts_allOtherData, drop and recreate ready for the next days data.

    Am i thinking of this problem in the wrong way? Sure it’s a problem others have encountered.
    Thanks

    • Hi Ceri,

      Your approach could work. Build a prototype and check out if a partitioned view including tblFacts_current and tblFacts_allOtherData might help simplify some of your read queries with the model you’re describing.

      With your model, you could still have contention between loads of incoming data and reads of current day’s data. The way around that would be to just use a partitioning granularity smaller than one day so you could switch in each feed as it comes (perhaps a composite partitioning key using a computed column to partition by day source). But that adds complexity, particularly because making sure your queries use the partitioning key properly for elimination.

      So there are pros and cons to different approaches.

      Hope this helps!
      Kendra

  18. Pingback: Instant File Initialization and Database Growths | SQL On The Fly

  19. I loved the video series, but I had a question. I am trying to create an automated sliding window and I can’t seem to get it working correctly. Do you happen to have a video on automating things or a script to assist me?

  20. in a word this was AWESOME!!!

    It was VERY helpful with dealing with a Billion rows (yes with a B) table but are there any guidelines for partition size 10 million rows, 100 million rows?

  21. Thanks for this Detailed Post.. Appreciate the Time taken…It was very useful to start with…

  22. Very, Very, Very informative video, Thanks so much.

    However, i can’t let go the idea that uniqueness is so very difficult to enforce. Take your example, I believe you had “OrderName” as one column in the PK. The other column is OrderDate which is the partitioning column. I am pretty old school, i don’t like over-identifying the PK. Moreover, in data warehouse environment most dims and fact tables are set up with simple PKs, which makes partitioning very difficult unless we come up with a PK alogrithm that ensures uniqueness and can also be used as a partition column. Are there any best practices or good articles that discuss methods on enforcing uniqueness in a pK column in a partitioned table?

    • Hi Joe,

      You can enforce uniqueness by doing a non-aligned nonclustered index: those don’t have to contain the partitioning key. But then you lose the ability to switch partitions in and out as long as they’re enabled. So in terms of enforcing uniqueness, it’s a rock and hard place situation. I don’t know of any articles on it just because the limitation is so hard to deal with if you want to use partition switching.

      Kendra

  23. Hi Kendra,
    In your example script, RANGE RIGHT is used to partition data in multipe file groups. The script includes the steps needed to switch data in but doesn’t include a sequence of actions to switch old data out. I think there is a problem using multiple, ascending file groups to partition data when RANGE is RIGHT. An ALTER INDEX REBUILD is required to complete the physical movement of the data, potentially a very expensive operation on a large table. I’ve reproduced the problem here Thanks – http://paulbrewer.wordpress.com/2014/03/06/table-partitioning-bug-merge-range-switch-out/

    • Hi Paul,

      I believe this is technically functioning as designed (unfortunately).

      Can you give my post here on merging boundary points a read? I think it’s consistent with what you’re seeing, and in the post I link to where this is documented in Books Online. http://www.brentozar.com/archive/2013/01/best-practices-table-partitioning-merging-boundary-points/

      I didn’t set this example up with much about merging boundary points because of the complexity– I wanted to start things off simple. However, I think I should at least add a note/link at the end to that alternate point about merging so readers know it gets complicated. Doing that now.

      Kendra

      • I found and read your post about merging boundaries after posting a comment here, it’s very good and explains it well. As you say, it’s a really complicated subject but I realised my mistake, using the wrong boundary. I followed your rule and maintain an empty partition at the start and end of the partition range. When you switch out the oldest data (last but one partition), you need to merge the now empty, last but one partition with the empty last one (for RANGE LEFT & RANGE RIGHT). Then there is no data movement between file groups, in my lab I merged at the wrong boundary point. Thanks very much all the info about partitioning, I use your posts as a guideline and reference. Best wishes

        • I’m so glad it helped! And thank you very much for commenting on this post, I didn’t realize I didn’t have a direct link to the splitting issue on this post. Now that I’ve added that with a note to the bottom of the page, it might keep help someone else learn more quickly with less confusion.

  24. Pingback: Sliding Window Table Partitioning – A painful learning curve…… « Paul Brewer

  25. I am having a slight issue with this script. When I attempt to run:
    CREATE PARTITION SCHEME DailyPS
    AS PARTITION DailyPF
    TO (DailyFG1, DailyFG2, DailyFG3, DailyFG4, DailyFG5, DailyFG6)

    all of my partition_ranges end up in the first file group. Did I miss something along the way?

    • Sorry it took me so long to get to your comment!

      I’m a little confused by the question — it sounds like none of the commands are throwing errors?

      I just dug out the script and re-tested, and on SQL Server 2014 I’m getting what I got before at this point — five range values which are mapped to six filegroups, all of which are empty. (We haven’t created a partitioned object yet).

      Having PRIMARY show up is really odd. Are you seeing that from the results of the query “SELECT * FROM ph.FileGroupDetail;”, or in some other way?

  26. Pingback: Can you partition a temporary table? | SQL RNNR

  27. Pingback: Can you partition a temporary table? - SQL Server - SQL Server - Toad World

  28. Pingback: Gilbert Quevauvilliers – BI blog | SSIS – Adding Partitions to a Partitioned Table and switching out older data

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

css.php