What Developers Need to Know About Designing Databases
Understand that changing database design at a later date is harder than changing code. Getting it right the first time is important #dellsql— Gail Shaw (@SQLintheWild) October 24, 2013
One of my favorite tips, and I never thought of it that way. Code is agile, and databases are brittle. It seems easy to refactor the database in the beginning, but as your app grows and more stuff interfaces with it, life gets a lot harder.
Johan means you should use the right data types, like using date or time when you’re just storing one of those. Plus, if you know a field is nullable or unique or relates to a field in another table, tell the database by setting up constraints or foreign key relationships. This can actually make your queries faster because SQL Server uses this knowledge to build better execution plans.
For example, DATE is a narrower field than DATETIME, requiring less storage, so you can pack more rows per page on your date indexes.
NVARCHAR is Unicode, which SQL Server uses 2x the storage space for as compared to just plain VARCHAR. This means you can cache half as many rows in memory, and your storage IO takes twice as long. If you’re not seriously going to store Unicode, stick with VARCHAR. (Then of course right-size your fields as opposed to using MAX, which can be difficult to index and may get stored off-row depending on data size and config options.) On a related note:
While database vendors will say, “Sure, you can use our database as a file server!” remember that the licensing on databases is a lot more expensive than licensing on a file server. Plus, these big files hit the transaction log, making your smaller/faster transactions compete for log file access, and slowing everything down.
Not all types are good for primary keys #dellsql— Rui Ribeiro (@ruirrib) October 24, 2013
Primary keys (and clustering keys) should be unique, narrow, static, and ever-increasing. Don’t go redesigning an existing database to change this, because it’s probably not your biggest bottleneck, but when starting from scratch, keep these guidelines in mind.
I wrote triggers a lot too when I was a developer, but as I moved into the database side, I realized how tough they are to scale. They’re synchronous, and they can cause concurrency problems. Consider moving to an asynchronous solution that doesn’t do as much locking/blocking.
Narrow one-field indexes are usually less useful because you’re selecting more than one field. Plus, every index you add is like another copy of the table. Insert a record into a table with ten nonclustered indexes, and you’ll be doing eleven (or more) writes to disk each time.
SQL Server Management Studio’s index suggestions have no concerns whatsoever about adding overhead to your inserts/updates/deletes, and they’re often supersets or subsets of existing indexes.
You can't design indexes based on the table structure alone. You need to know the common queries that run against it. #dellsql— Gail Shaw (@SQLintheWild) October 24, 2013
When you first get started with a new database, don’t go crazy adding indexes with guesses about what will get queried. Indexes are easy to add later – wait to see how the queries actually shape up, and which ones get run the most often.
@BrentO Standard warnings about premature optimization apply – sort of. Don't let your design be too brittle on day 1.— Vicky Harp (@vickyharp) October 24, 2013
What Developers Need to Know About Writing Queries
This is one of my favorite tips. SQL is really easy to learn, but the underlying server code behaves very differently than you might expect. We’ve gotta think set-based. On a related note:
SQL Server licensing is about $2k USD per core for Standard Edition, and $7k per core for Enterprise Edition. These are usually the most expensive cores you’ve got in the shop. Aaron’s guidance is based on minimizing the amount of CPU clock cycles we need to burn in the database tier, and thinking about moving those to the easier-to-scale-out app tier.
beware of outer joins #dellsql— Rui Ribeiro (@ruirrib) October 24, 2013
Sometimes the business design requires outer joins, but beware that you can get much worse execution plans as your data grows.
Databases perform very differently at scale, whether it be scale of data or scale of underlying hardware or scale of simultaneous queries.
What Features, Commands, Datatypes, Etc. Should Be Generally Avoided
If you say BETWEEN December 1st and December 31st, SQL Server sees that last parameter as the very beginning of December 31st. Any data from, say, 1AM on December 31st and afterwards is going to be excluded.
use the principle of least privilege. You don't need db_owner, much less dbo or sa. #dellsql— K. Brian Kelley (@kbriankelley) October 24, 2013
Simplicity is a big theme here. Manhandling the engine around with hints usually results in worse performance overall.
If you're going to nest views, look at and understand the query plan that will result. #dellsql— K. Brian Kelley (@kbriankelley) October 24, 2013
We’re not saying nested views are wrong – sometimes they can work really well – but KBK is pointing out that you may end up dragging a bunch more tables into your query inadvertently.
Always good advice – but it’s advice for managers rather than developers, right?
Sometimes, a scan is easier – especially for grouping.
NOLOCK is *not* always (or even usually) the answer. #dellsql— K. Brian Kelley (@kbriankelley) October 24, 2013
For more details, watch our There’s Something About NOLOCK video.
The Best One-Line Things to Improve Databases and Apps
Learning SQL Is an opportunity to expand your skill set. Embrace it like you run to the flavor of the week language/ORM #dellsql— Bill Fellows (@billinkc) October 24, 2013
That’s actually how I got started in database administration – I got tired of learning the Language Du Jour because I’m not all that bright. I have so much respect for developers who can rapidly learn new languages and frameworks and then deploy them effectively. I switched to databases because I could learn one language that’s been the same for decades, and it’s even used by multiple back end platforms. I’m lazy. On a related note:
Data integrity. A database that stores garbage is useless. Apps can enforce integrity but database should as well. #dellsql— Gail Shaw (@SQLintheWild) October 24, 2013
Nick is from StackExchange, the guys who built Opserver, an open source monitoring system. They’re brilliant about watching the top resource-intensive queries on the database server, and you should be too. When something new pops up, you need to know why it’s suddenly using more resources than you expect.
Query execution plans are road maps for how SQL Server will process the data for your query. Learn more about those.
You don’t have to know how to FIX the bottlenecks, necessarily, but you have to know what’s going to be the pain point as you scale.
Include DBAs early on in a development project and keep them in the loop for better results. #DellSQL— K. Brian Kelley (@kbriankelley) October 24, 2013
The best DBAs are partners, not predators. When you find a good one, they’re willing to help make you look like a star.
I’m all about this. By the time somebody brings me in for advice, the best practices guidelines aren’t usually working for them, and they need to learn how to bend the rules.
Live Video of the Twitter Chat
Here’s the 50-minute video of the Twitter chat where we built this list, plus covered a lot more tips:
I apologize for the audio – WebEx’s audio codecs are horrendous. (I’ve got a Blue Yeti microphone.)
Sign Up for the Next One:
What DBAs Need to Know About Monitoring
Page life expectancy. Disk queue length. Page splits. There’s so much old bogus advice out there around SQL Server monitoring, and it’s time to update the rules. Join Microsoft Certified Master Brent Ozar as we collaboratively write an e-book with new advice on monitoring. You can contribute too – we’ll be working together over Twitter using #DellSQL.
We’ll discuss and share:
- What metrics every DBA should monitor
- Which metrics are meaningless
- Our favorite free DMV scripts