Blog

What Developers Need to Know About Designing Databases

One of my favorite tips, and I never thought of it that way. Code is agile, and databases are brittle. It seems easy to refactor the database in the beginning, but as your app grows and more stuff interfaces with it, life gets a lot harder.

Johan means you should use the right data types, like using date or time when you’re just storing one of those. Plus, if you know a field is nullable or unique or relates to a field in another table, tell the database by setting up constraints or foreign key relationships. This can actually make your queries faster because SQL Server uses this knowledge to build better execution plans.

For example, DATE is a narrower field than DATETIME, requiring less storage, so you can pack more rows per page on your date indexes.

NVARCHAR is Unicode, which SQL Server uses 2x the storage space for as compared to just plain VARCHAR. This means you can cache half as many rows in memory, and your storage IO takes twice as long. If you’re not seriously going to store Unicode, stick with VARCHAR. (Then of course right-size your fields as opposed to using MAX, which can be difficult to index and may get stored off-row depending on data size and config options.) On a related note:

While database vendors will say, “Sure, you can use our database as a file server!” remember that the licensing on databases is a lot more expensive than licensing on a file server. Plus, these big files hit the transaction log, making your smaller/faster transactions compete for log file access, and slowing everything down.

Primary keys (and clustering keys) should be unique, narrow, static, and ever-increasing. Don’t go redesigning an existing database to change this, because it’s probably not your biggest bottleneck, but when starting from scratch, keep these guidelines in mind.

I wrote triggers a lot too when I was a developer, but as I moved into the database side, I realized how tough they are to scale. They’re synchronous, and they can cause concurrency problems. Consider moving to an asynchronous solution that doesn’t do as much locking/blocking.

Narrow one-field indexes are usually less useful because you’re selecting more than one field. Plus, every index you add is like another copy of the table. Insert a record into a table with ten nonclustered indexes, and you’ll be doing eleven (or more) writes to disk each time.

SQL Server Management Studio’s index suggestions have no concerns whatsoever about adding overhead to your inserts/updates/deletes, and they’re often supersets or subsets of existing indexes.

When you first get started with a new database, don’t go crazy adding indexes with guesses about what will get queried. Indexes are easy to add later – wait to see how the queries actually shape up, and which ones get run the most often.

What Developers Need to Know About Writing Queries

This is one of my favorite tips. SQL is really easy to learn, but the underlying server code behaves very differently than you might expect. We’ve gotta think set-based. On a related note:

Learn more about isolation levels.

SQL Server licensing is about $2k USD per core for Standard Edition, and $7k per core for Enterprise Edition. These are usually the most expensive cores you’ve got in the shop. Aaron’s guidance is based on minimizing the amount of CPU clock cycles we need to burn in the database tier, and thinking about moving those to the easier-to-scale-out app tier.

Sometimes the business design requires outer joins, but beware that you can get much worse execution plans as your data grows.

Databases perform very differently at scale, whether it be scale of data or scale of underlying hardware or scale of simultaneous queries.

What Features, Commands, Datatypes, Etc. Should Be Generally Avoided

If you say BETWEEN December 1st and December 31st, SQL Server sees that last parameter as the very beginning of December 31st. Any data from, say, 1AM on December 31st and afterwards is going to be excluded.

Simplicity is a big theme here. Manhandling the engine around with hints usually results in worse performance overall.

We’re not saying nested views are wrong – sometimes they can work really well – but KBK is pointing out that you may end up dragging a bunch more tables into your query inadvertently.

Always good advice – but it’s advice for managers rather than developers, right?

Sometimes, a scan is easier – especially for grouping.

For more details, watch our There’s Something About NOLOCK video.

The Best One-Line Things to Improve Databases and Apps

That’s actually how I got started in database administration – I got tired of learning the Language Du Jour because I’m not all that bright. I have so much respect for developers who can rapidly learn new languages and frameworks and then deploy them effectively. I switched to databases because I could learn one language that’s been the same for decades, and it’s even used by multiple back end platforms. I’m lazy. On a related note:

Nick is from StackExchange, the guys who built Opserver, an open source monitoring system. They’re brilliant about watching the top resource-intensive queries on the database server, and you should be too. When something new pops up, you need to know why it’s suddenly using more resources than you expect.

Query execution plans are road maps for how SQL Server will process the data for your query. Learn more about those.

You don’t have to know how to FIX the bottlenecks, necessarily, but you have to know what’s going to be the pain point as you scale.

The best DBAs are partners, not predators. When you find a good one, they’re willing to help make you look like a star.

I’m all about this. By the time somebody brings me in for advice, the best practices guidelines aren’t usually working for them, and they need to learn how to bend the rules.

Live Video of the Twitter Chat

Here’s the 50-minute video of the Twitter chat where we built this list, plus covered a lot more tips:

I apologize for the audio – WebEx’s audio codecs are horrendous. (I’ve got a Blue Yeti microphone.)

Sign Up for the Next One:
What DBAs Need to Know About Monitoring

Page life expectancy. Disk queue length. Page splits. There’s so much old bogus advice out there around SQL Server monitoring, and it’s time to update the rules. Join Microsoft Certified Master Brent Ozar as we collaboratively write an e-book with new advice on monitoring. You can contribute too – we’ll be working together over Twitter using #DellSQL.

We’ll discuss and share:

  • What metrics every DBA should monitor
  • Which metrics are meaningless
  • Our favorite free DMV scripts

Register now to watch it live on Thursday, November 14th.

↑ Back to top
  1. I missed this event on twitter, so here are my contributions:

    Scalar Functions in a select statement are executed for every row, using SQL Server as inefficiently as possible, so a 10ms process done for 1,000,000 rows is the cause of that 10 second long query no one likes. They work great in OO, not in SQL. It’s a different language and needs to be respected as such.

    Indexing should be understood better, specifically looking at the data logically and finding reusable indexes. That almost always helps more then two “perfect” special-use indexes that compete for cache and both need updated. You talk up reusable code, and that just saves your time. This is saving server time on inserts/updates/deletes, using cache more effectively, reducing the load and amount of data on disk, and reducing the size of backups.

    Ask the DBA for help, and don’t be ashamed to do so. Keep in mind that you’d expect the DBA to ask you for help if we were fooling around with OO code. You’re respecting specialties to make future maintenance easier for the DBA, not harassing someone who doesn’t want to be bothered.

    Document where procs are used when you create them. We tend to tune things because they were on our list of top 20 most expensive queries then go back and ask you to figure out where it’s being used. I would prefer to go straight to QA and say “I just improve this screen, can you prove my work with real-life examples?”

    Ask us what we did to improve performance and why we did it. It’s almost giving us permission to brag, so we love it. On your side, you learn how we turn something slow into something fast so that you can improve at making something fast from the start.

    Bribe us. When we help you with something big, buy us lunch. When you want to impress your users with blazing fast performance, offer to watch our kids for a night. Remember, a bribable DBA is a controllable DBA. The guys at Brent Ozar unlimited can be controlled with Bacon, the girls can be controlled with post-run cupcakes, and I personally prefer to answer your questions over a Hibachi Lunch-and-Learn session. Know the person you’re asking to help and you’ll get better help.

  2. Pingback: .NET Security Book Featured In The Daily Six Pack: October 31, 2013

  3. Pingback: My links of the week – November 3, 2013 | R4

  4. Pingback: What developers need to know about SQL Server (brentozar.com) | Cedar Hill Tech Notes

  5. Pingback: What Developers Need to Know about SQL – Introduction - - OdedCoster.com/blog

  6. Pingback: What Developers Need to Know about SQL – Set Based Programming - - OdedCoster.com/blog

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>

css.php