Adventures In Foreign Keys 2: Common Foreign Key Errors

This week, we’re all about foreign keys. Yesterday’s post covered scripts to set up the Stack Overflow database to get ready, eliminating data that would violate FK relationships.

You Had Two Jobs!

Let’s say I wanted to implement Foreign Keys to do two things

  1. If a user deletes their account, all of their badges, comments, and posts will also get deleted
  2. If a user deletes their post, all of the comments and votes will also get deleted

Remember that the data dump is cleaned so user voting isn’t recorded, so we’ll steer clear of that. I also have no idea how to simply implement the removal of badges someone may have gotten from an associated post, so I’m leaving that out. The badge stays in the picture!

For the post and vote deletion — in real life, this would also be really complicated. We’d have to adjust user reputation, and info we maintain about how many up and down votes the user has cast. We’d also have to restore reputation to users if an answer they downvoted was deleted. This is well beyond what foreign keys can do well, or easily. That’s implementing business logic, which would be a better fit for a trigger, if we’re sticking to doing it in the database.

What can foreign keys do for us, then?

Foreign Keys offer you the option to have updates and deletes cascade from the parent (reference) table to the child table. We’ll look at how exactly what works in the next post, but for now let’s just try to get the foreign keys set up.

Issues, Tissues

In order to make testing this easier, I added some code to our stored procedure that drops indexes.

Now it can also drop foreign key, unique, and check constraints.

You can thank me later.

First, let’s get badges deleted if a user self-immolates.

Everything looks good so far! Let’s get comments deleted, too.

Also no problem. Okay, now let’s get votes deleted if a post goes away.

Alright, now let’s get posts deleted if a user cashes out.

This goes less well.

Orderly

If you’ve ever hit this error, you’ve no doubt landed on The Least Helpful Page On The Internet®

If you haven’t, I’ll summarize it for you: DON’T DO WHAT YOU’RE TRYING TO DO

Now, when I was testing this, I found that I’d hit that error in different places if I created things in a different order.

As a for-instance, if I start from the bottom up, I hit it trying to add a constraint from comments to posts.

It seems that I don’t have a straightforward way of getting both things to happen. I could try chaining things differently…

But the PostId column in Votes isn’t unique — the only truly unique columns in Stack tables are the Id columns. I’ll get this error.

I can’t create a unique index or constraint just on PostId anywhere, and even the combination of PostId and other columns can be non-unique.

I can make a unique index on UserId and Id, though.

But now I’m dealing with a multi-key foreign key, which’ll have some drawbacks that we’ll explore in a later post.

More locally, we’ll run into this error:

No foreign keys allowed on identity columns.

You Ain’t Got No Alibi

By now, we’re starting to ask ourselves if these cascading actions are worth it. If we don’t implement them, we can have referential integrity out the wazoo. And those trigger thingamajigs are starting to look mighty attractive. They don’t have all these “rules” to deal with, and we can offload the deletions there. Heck, if we’re using a stored procedure, we don’t even need triggers. We can just fire off deletes to any tables we care about from there.

Foreign Keys with cascading actions: Strike One. In the next post, we’ll learn why I’d give them a lifetime ban from baseball. Thanks for reading!

Previous Post
Adventures In Foreign Keys 1: Setting Up Foreign Keys in Stack Overflow
Next Post
Adventures In Foreign Keys 3: Why Cascading Deletes Perform Slowly

4 Comments. Leave new

  • There seems to be a copy/paste issue resulting in fk_posts_users_id being created twice, once for “Okay, now let’s get votes deleted if a post goes away” and once for “Alright, now let’s get posts deleted if a user cashes out.” It’s the wrong code for that first one.

    Reply
  • Looking forward to the next part in this series as I suspect it’s about performance. We run into _that_ fun all the time when trying to clear out the history in the SSISDB. MS decided to implement cascading deletes so deleting 1 row can affect millions of rows behind the scenes. :/

    I haven’t tried to nest anything this deep. Whenever our devs try to reach for the convenience of a cascading delete, I encourage calling a stored procedure instead. We can control the order and size and don’t run into crazy issues like the one above.

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.