Furthermore, I think we’ve all been taking them for granted.
Integers are clean, elegant datatypes that do exactly what they’re supposed to. They’re the Golden Retrievers of the database world: they have a few shortcomings, but so, so few. And so for this month’s T-SQL Tuesday, I wanna share my love for integers with you.
1. Integers have clear contents right from the start.
You can declare an integer variable or column datatype, then set it to a string – and it’ll work great as long as the string can be converted into a number:
But if you try to stuff something into an integer that doesn’t belong, you get a clear error right away:
I can guarantee that the only thing in there is gonna be whole numbers, and that I won’t have to play any of the finicky games involved with floating point math or someone trying to store dashes in there to format their social security numbers. If you wanna do formatting on integers, you can – there are functions to help – but it’s usually pretty clear that any formatting should be done on the application side.
2. Integers rarely have implicit conversion issues.
If someone passes in a string, or even a unicode string, and tries to do an equality search of a number field, SQL Server is smart enough to convert the string to an integer first. SQL Server even leverages statistics to understand what the string number will produce in a search!
CREATE INDEX Reputation ON dbo.Users(Reputation);
CREATE OR ALTER PROC dbo.usp_GetUsers @Reputation NVARCHAR(100) AS
WHERE Reputation = @Reputation;
EXEC usp_GetUsers N'2';
The resulting execution plan is beautiful because SQL Server converted the parameter, not the contents of the table, and accurately estimated the number of rows:
3. Integers are small – and compress down even smaller.
Integers take up just four bytes of space, and if a few billion values isn’t enough for you, their big brother BIGINTs take up just eight bytes. Eight bytes! I see people constantly slapping NVARCHAR(100) all over the place, storing un-normalized descriptions in row after row after row, just flushing space down the drain when they could be linking to a table with the handful of descriptions that they always use.
Need to save space? Row compression is a piece of cake to implement, and even BIGINT columns are just automatically stored in the smallest possible number of bytes. Row compression makes it easy to justify using BIGINTs as identity columns on tables where you’re worried about a large number of rows in the future because they just don’t take any extra space to store.
Got a lot of repeated values, and want even more compression? Enable page compression.
And when it’s time to build fact tables for reporting systems and data warehouses, buckle up: columnstore indexes bring really bananas levels of compression, like 70% or higher with repeated integers like order line item quantities, prices, or part numbers.
4. Integers even work in URLs.
If you’ve surfed Stack Overflow, you might have noticed how the URLs have integers in them, like this:
Well, you know what’s really cool? Lop off everything after the numbers, like this:
And go to that URL:
Stack Overflow automatically pulls up the full URL. Wanna know how they do it? Everything after the number is extraneous, because the number corresponds to the actual ID they’re storing in the Posts table! You can see the database version of the question you’re looking at by running a query:
WHERE Id = 7395915;
Try doing that with your fancypants strings, bucko, and welcome to URL encoding hell.