T-SQL Tuesday is a monthly blogothon where we get together and write about a different topic. I’m hosting this month’s episode.
Your mission: write a blog post about your favorite data type, and schedule it for next Tuesday, March 9.
Mad about money? Feverish about float? Tell us what you use your favorite data type for, and what people need to know before they get started using it. Data types have so many interesting edge cases, like when Aaron Bertrand taught us to use lower case for our data types. (I still don’t do that, but for the record, I do feel guilty.)
The rules:
- Your blog post has to go live on Tuesday, March 9
- Include the T-SQL Tuesday logo, and if somebody clicks on the logo, it has to link back to this post
- Include a link to this blog post, or after you publish it, leave a comment here so I can track you down for the roundup blog post next week
- If you want to host a future T-SQL Tuesday episode, tweet @way0utwest or leave a comment on this post
Don’t worry about somebody else picking the same data type – this is the beauty of blogging. Readers are here for your unique take on things, in your voice. And don’t worry about repeating things that are in the documentation – nobody reads that. You could seriously copy/paste the contents of the documentation and you’d get comments saying, “Whoa, I never knew that.”
36 Comments. Leave new
“Unto each DBA is measured but a single choice – more critical post go live when you know what should have been than during design – When all things seemed possible.” Shakespeare misinterpreted for the modern “_____ insert current gender identity here ”
I know mine. Check back next Tuesday. I have to go buy some Dr. Seuss books. My childhood is offensive.
HAHAHA, awesome!
My favorite? Int, of course.
The simplest and most common used datatype 😛
The point is to write a blog post about why. 😉
XML
Because even while indexed, is a cool way to get my CPU to 100%
I freakin’ love it! Awesome post, George! Couldn’t have said it better myself.
The new XML data type. Ahh, we all have different points of view with different pain points.
Who will say GUID.
Who will blog about “User defined data type”
Mine is varchar(max) and char. Yes check amount as char worked for credits ‘-87.35’ and saved formatting but all the extra code that grew up around the column to support that was cumbersome.
You won’t hear me say “GUID” as a bad thing on this or any other thread. Yup… it takes 16 bytes (nothing compared XML tag bloat, for example). Yep… they difficult to type (why would you ever need to, though? Copy’n’Paste still works). Yep… they don’t sort as humans would expect them to sort (so what?) Yep… having them as a clustered key can cause some extra fragmentation in Non-Clustered indexes but so does virtually everything else.
They also don’t need a central authority to issue, don’t need a central authority to prevent dupes even across dozens of different machines/applications (although you still need to check them), don’t require an extra trip to the server to figure out what IDENTITY value was just generated, are a nearly perfect solution for merging multiple sources into a single entity, they eliminate the “end of index” and all other hot spots, and they do something else that you’re probably not aware of… they prevent all page splits (even so-called “good” page splits) for months at a time and can also go months at a time with <1% logical fragmentation even after inserting more than a half million new rows into a GUID Keyed Clustered Index that already contains more than 3 million rows.
In other words, the infamous "GUIDs Fragment a Lot" is actually incorrect. What causes them to fragment a lot is people using supposed "Best Practice" Index Maintenance in a fatal exercise to maintain them, which was never intended to be a "Best Practice" (and basically says that right in the documentation), and is actually a "Worst Practice".
And, yeah… I can prove and demonstrate it all using code and a special tool I created that actually shows you what the index looks like at the page level for all pages in a single graph. If you have a user group that you'd like me to give the presentation to, I'll be happy to do so. You can reach me via LinkedIn. Warning… the presentation moves very quickly from sub-topic to sub-topic but it does a deep dive on all the perceived problems (Hint: One of them is NOT fragmentation!), demonstrable advantages (one of them IS to support high insert and "ExpAnsive" update rates with virtually no fragmentation), demonstrations on how "Best Practice" Index Maintenance is the real problem, and a deep myth-crushing dive on REORGANIZE and REBUILD including but not limited to their effects on the transaction log file in both the Standard and Enterprise Editions (Hint: you'll never be able to justify using REORGANIZE ever again) and that takes time to do… the presentation is 2.5 hours long.
And, you'll also learn why "Ever-Increasing" indexes may not be your friend unless you know how to find and fix an insidious problem that everyone knows about but never pays any attention to. Instead, they make the serious mistake of lowering the Fill Factor, which exacerbates the problem.
And yeah… I'm also working on a demo that does the same thing for more than 5 million rows of inserts with <1% fragmentation on GUID Keyed Clustered Index.
Hey, Brent… do you take guest speakers? 😉
I like your blog and would like to see more.
Would like to see your presentation at SQL Saturday or User group.
That said – there should be a manual 99% fool proof way to create human identifiable GUIDs. The Internet keeps things organized. SEARS and Roebuck had numbers 103.xxx washer. 273.27xxx bolt action riffle, 158.18 sewing machines. 440-250 a two story cape cod home [yes Sears sold complete homes. Wal*Mart has the biggest selection of coffins [trivia the great Alex Trebek never put on Jerpardy] )
GUIDs could be simpler but not easier than they are now.
Question – how anonymous are GUIDs ? Any idea what can be learned about the source machine ?
VIN numbers for cars.
GUIDs based on NEWSEQUENTIALID() have a whole raft of problems. They have all the DISAVANTAGES (yeah, you heard that right) of “Ever-Increasing Indexes”, none of the ADVANTAGES of Random GUIDs (yup, you heard that right again), and a few of the own disadvantages To your point, they also have the same security issue that the old Type 1 GUIDs that SQL server uses eons ago… they contain the MAC address of one of the NIC cards on whatever box they were generated on. Oddly enough, that’s also about the only advantage they have over the current Type 4 GUIDs (NEW_ID()), which are “just” astronomically large random values with no guarantee of being unique no matter how rare collisions actually are (A one in a billion chance out of every 103 Trillion that are generated according to some documentation I read).
So, to enhance the answer your question, there’s virtually nothing that someone is going to learn about your box(es) if Type 4 Random GUIDs are used. There are claims that the next number in the pseudo-random sequence can be derived (it IS based on a formula) but I’ve no need to do such a thing and I’m not using it for any type of crypto so I don’t care if someone can. 😀 They’re also awesome for quickly generating millions of rows of Random but Constrained test data of all kinds.
Shifting gears a bit, manually creating GUIDs to mean something might be a good idea but, remember, they will no longer be random and that destroys their ability to actually help prevent fragmentation (yeah… you heard that right, as well), prevent hot spots in the index, and a couple of other things. They also take up way too much space for things like part and catalog numbers.
I also find the 5 items that you cited to be a great combination…they’re all essential and dependent on each other although I’ve not yet needed the sewing machine to mend a bullet hole. 😀
Alright, Aaron Bertrand’s post has convinced me that lower case is the right case for data types. But if you use upper case keywords, then you have to write “varchar(MAX)”. And who could live with that.
I wish I’d seen this sooner. I would have railed against MySQL for completely disregarding the entire concept of a relational database and creating the Enum datatype.
Thanks for hosting, Brent.
My post is about the date data type, at http://blogs.lobsterpot.com.au/2021/03/09/beware-the-width-of-the-covering-range/
Another post from me (only two – not going to write another one today)
This one is about datetimeoffset, but also looks at strings. http://blogs.lobsterpot.com.au/2021/03/09/datetimeoffset-uniqueness/
Thanks again for hosting.
My submission: https://sqlkover.com/tsql2sday-136-your-favorite-data-type/
Thanks for hosting Brent.
Here’s my contribution for this month https://www.kevinrchant.com/2021/03/09/t-sql-tuesday-136-bit-versus-other-data-types/
Brent, thanks for hosting this month. Here’s my contribution:
https://www.michaelscalise.com/2021/03/09/t-sql-tuesday-136-blog-about-your-favorite-data-type-or-least-favorite/
Thanks Brent! I am a new blogger (started this year), and I blog in English and Portuguese. Here are my contributions for this month’s theme:
EN: https://camilahenrique.com/2021/03/09/t-sql-tuesday-my-least-favorite-sql-data-type-date/
PT: https://camilahenrique.com/2021/03/09/t-sql-tuesday-meu-tipo-de-dados-menos-favorito-em-sql-date/
Thanks for hosting Brent. Here’s my first ever contribution:
https://barneylawrence.com/2021/03/09/ive-got-99-0991-problems-but-floating-point-precision-aint-one/
Thanks Brent! Here’s mine:
https://sqlblog.org/2021/03/09/my-two-least-favorite-data-types
Thanks for hosting, Brent! Here’s mine: https://flxsql.com/t-sql-tuesday-136-your-favorite-or-least-favorite-data-type/
Thanks for hosting Brent! Here is my contribution:
https://glennsqlperformance.com/2021/03/09/t-sql-tuesday-136-my-favorite-data-type/
Thank you Brent for hosting!
Here is my post:
https://dbanuggets.com/2021/03/09/t-sql-tuesday-136-least-favorite-data-type/
Thank you Brent for hosting! Dusting off my rusty blogging skills: https://contextq.com/2021/03/09/t-sql-tuesday-136-my-favorite-data-type/ in favor of the weird beast, SQL Server Spatial Geography Types.
Now, with code samples!
Thanks for this month’s topic Brent, my post is at https://jqmartin.info/2021/03/09/t-sql-tuesday-136-my-favourite-data-type/.
It’s always a surprise whether or not pingbacks work. So this comment might be redundant.
Here’s my rant about float: https://sqlserverfast.com/blog/hugo/2021/03/9874/
Thanks for hosting, Brent!
My post is: https://callihandata.com/2021/03/09/t-sql-tuesday-136-your-favorite-or-least-favorite-data-type/
Cheers Brent!
My first TSQL2sday post: https://sqlrider.net/my-favourite-data-type-t-sql-tuesday-136/
Thanks for hosting Brent! Here’s mine! Good luck with the summary.
https://sqlstudies.com/2021/03/09/right-sizing-int-t-sql-tuesday-136/
Thanks for hosting Brent!
https://www.justinjbird.me/2021/tsql-tuesday-136/
Finally getting around to listing my contribution here: https://debthedba.wordpress.com/2021/03/09/t-sql-tuesday-136-favorite-data-types/
Thanks for hosting!
Thanks for hosting, this was a fun one.
https://www.travis-page.com/t-sql2sday/T-SQL-Tuesday-136-Your-Favorite-Datatype/
Thanks for participating, everybody! Y’all had a ton of great posts. The wrap-up post will go live on BrentOzar.com next Tuesday. Cheers!
Hey Brent, thank you for posting. Linking here as I do not see the pingback
https://www.bronowski.it/blog/2021/03/t-sql-tuesday-136-sql_variant-to-the-rescue/
Perfect, thanks!
Hello Brent, thanks for hosting. I’m late to the party – I forgot to paste my link here after publishing my post. It’s my first ever blog post and it’s a take on this idea from an ETL dev’s perspective: https://kkosno.wixsite.com/sqlcorner/post/ssis-and-sql-server-varchar-as-a-handy-data-type