Indexes For Kids

Kid Tested

When you work with something a lot, you tend to extrapolate concepts from it out to everything you see. That one person collecting tickets is a serial zone. That food truck line is a HEAP. The empty subway car that smells like a toilet is Replication.

Stuff like that.

If you have one or more kids, you might even see database concepts pop up with them, too. Their messy room is index fragmentation. Their tiny inefficient hands are like cursors. Their dirty diapers are like Replication.


I don’t sit around lecturing my kid about database stuff. It’s not that I don’t love what I do, it’s just that I honestly do hope there are job openings for princess astronaut doctors by the time she’s old enough to start looking.

But then we got this puzzle! And it’s like, the perfect way to explain why indexes can make things faster.

I mean, just look at it.

The wires are in this picture to annoy Richie.
The wires are in this picture to annoy Richie.


It even joins together. Lord help me. Here it comes.

If you’re already cool with indexes

This post isn’t for you, unless you’re super bored.

See, each of those stacks is like a column. Or a field. And each of those letters and pictures is like a row. Or a record.

Now that your database terminology OCD requires medication

The end goal is to match each one up. They even have matching patterns. It’s really something.

Here are your DBA choices for matching them:

  • If they’re both shuffled, you pick a stack and look for the corresponding piece in the other stack. This is horribly slow and procedural, even if you start to cheat and make matches when you remember you’ve seen a piece recently.
  • You can organize one alphabetically, much like adding an index, but you still have to wander through the unsorted stack. That’s not much better, is it? You’ve turned into a human Nested Loops Join. Half-indexed data doesn’t help you much here.
  • Last, you can organize both stacks alphabetically. This is obviously the most efficient. It’s almost like join elimination. You can mash both stacks together without looking.

This is a lot like what Brent teaches in How To Think Like The Engine. Any WHERE clause or JOIN that isn’t indexed means you’re scanning something. This doesn’t mean I’m totally against scans, it just means I’d rather SQL scan indexed data than unindexed data.

Towards entropy

If this puzzle were like your data, it probably wouldn’t be a 1:1 match. I’m not insulting your data, mind you. Some customers signed up, but they didn’t order anything.

Similarly, if we were missing any puzzle pieces, we’d know because we just sorted them all. If you don’t, trust me, you end up looking through the stack like four times thinking you lost your mind not being able to find the Xylophone. Ordering the data first means you know xylophone is missing, so when X comes up, you discard it.

Inevitably, one of these pieces will end up broken or in the toilet.

And this, my dear friends, is why you always make backups.

Thanks for reading!

Previous Post
Your Favorite Bugs & Enhancement Requests: #TSQL2sday 86 Roundup
Next Post
The 2017 Data Professional Salary Survey Results

18 Comments. Leave new

Leave a Reply

Your email address will not be published.

Fill out this field
Fill out this field
Please enter a valid email address.