SQL Server has this fancy feature called constraints. Database constraints are used to restrict the domain (the range of allowed values) of a given attribute. That’s just a funny way of saying: through a set of carefully crafted rules, we control the shape of our universe.
Our Test Table
We’re going to be testing with the following tables:[code lang=”sql”] CREATE TABLE sandwiches
id INT IDENTITY(1,1),
A unique constraint guarantees that one and only one row in the table can have a specific value. If our application requires there is only one sandwich called “The Super Big Elvis”, we need some way to make sure that the database can’t contain two “The Super Big Elvis” sandwiches.
A naive approach would involve application code checking the database for the existence of data. This approach has several problems:
- Every application that connects to the database must either duplicate this logic or else use a centralized service to check.
- There’s no guarantee that another application won’t check for data and save faster.
Instead of requiring our applications to handle this integrity, we can push the uniqueness requirement down to the database using a unique constraint:[code lang=”sql”] ALTER TABLE sandwiches
ADD CONSTRAINT UQ_sandwiches
UNIQUE ( title );
There’s one important gotcha with SQL Server: only one
NULL value is allowed per column (or set of columns) in the unique constraint. If you had a table with two columns,
b, there are only three possibilities for rows that involve
If you need a uniqueness constraint to ignore
NULL values, then you’ll need to create a unique filtered index like this:
ON customers(first_name, last_name)
WHERE first_name IS NOT NULL
AND last_name IS NOT NULL;
Now we can at least make sure that the uniqueness applies to people with both a first name and last name.
The Primary Key Constraint
A primary key is a logical construct – it’s a set of attributes that cannot be duplicated by any other row in the table. The primary key doesn’t have to be a single column, it can be a composite key – as long as the columns uniquely identify a single row it’s a valid primary key.
Don’t confuse a primary key with a clustered index, though. In SQL Server, the default behavior is to create a primary key as a clustered index – the clustered index defines the physical order of data in the table. This has led to a slew of iffy advice like “never use natural primary keys” or “don’t use GUIDs as primary keys”.
When thinking about primary keys as constraints, concern yourself first with identifying the unique characteristics of the rows for your application. Follow that up with physical optimization decisions once you’re ready to implement the physical model.[code lang=”sql”] ALTER TABLE sandwiches
ADD CONSTRAINT pk_sandwiches
PRIMARY KEY (id);
Aww yeah, we’ve got a primary key. The default behavior for SQL Server is to create a clustered index under the PK if one doesn’t exist already. If you need to use a different index as the clustered index (e.g. the PK is a GUID and the clustered index is an integer) just make sure that you create the clustered index before you create the PK or that you specific the primary key as a non-clustered index:[code lang=”sql”] ALTER TABLE sandwiches
DROP CONSTRAINT pk_sandwiches;
ALTER TABLE sandwiches
ADD unique_id UNIQUEIDENTIFIER DEFAULT(NEWID())
CREATE UNIQUE CLUSTERED INDEX CX_sandwiches ON sandwiches(unique_id);
ALTER TABLE sandwiches
ADD CONSTRAINT pk_sandwiches
PRIMARY KEY NONCLUSTERED (id) ;
What do PKs buy us? A primary key gets you the same thing as a unique constraint with the added benefit of saying, “This is the primary identifier for this row. Other attribute combinations may be unique, but this combination of attributes identifies this entity in the database.”
Check constraints give you more flexibility than the other forms of constraints. Both unique constraints and primary key constraints operate on a list of columns that we supply – they are limited to simple combinations of columns. Check constraints, however, give you something more.
On the surface, as check constraint is just as simple as any other constraint – the statement in the check constraint has to evaluate to
true. Seems simple, right?
The upside of this approach is that “any statement” means that any valid T-SQL statement can be part of the check constraint… so long as the T-SQL statement doesn’t rely on access to another table. So, we can do something like this:[code lang=”sql”] CREATE FUNCTION fn_check_sandwich_name
IF LTRIM(RTRIM(@sandwich_name)) = ‘Big Mac’
We want to avoid lawsuits with a certain fast food chain, so this function should keep us safe. In order to use the function a check constraint we can just do the following:[code lang=”sql”] ALTER TABLE sandwiches
ADD CONSTRAINT CK_sandwich_name
CHECK (dbo.fn_check_sandwich_name(title) = 1);
Let’s try it out:[code lang=”sql”] INSERT INTO sandwiches (title, price) VALUES (‘The Big Ozar’, 12.88);
INSERT INTO sandwiches (title, price) VALUES (‘Big Mac’, 2.99);
The first row will successfully inserted into the
sandwiches table. The second insert attempt will fail – the
title matches the failure condition in the function
There’s at least one scenario that this doesn’t catch:
NULLs. With code in its current state, we can insert
NULL in the
title column of
Functions used as check constraints should explicitly check for
NULL inputs; without taking
NULL into account, it’s possible to create a check constraint that will allow incorrect data to end up in a table. Although these data rules should be pushed down to the underlying schema where it’s easy to declare a column as
NOT NULL, any check constraints that operate on incoming data need to make sure they account for
About That World…
Using a combination of table constraints, we can build a complete understanding of the totality of the data in the database without having to execute a single query. Constraints let us push universal rules about the total state of data down into the database where all applications can take advantage of them and where those rules only need to be maintained in one place.