Building PollGab.com: Designing the Data Model

PollGab
9 Comments

In the last post, I talked about how Richie and I sketched out the goals of PollGab.com, the site where you can post questions for other folks to answer during their live streams.

I challenged you to decide:

  1. What data needs to be stored to make all this work?
  2. What database or persistence layer should we use?

So let’s talk through each of ’em.

1. What data needs to be stored?

The 3 major components are:

  • Rooms – with attributes like the room name, URL, room owner email
  • Messages – including the message, person’s name, person’s email address. Initially, we didn’t want email address, but we added it and made it optional just so we could show Gravatar icons next to peoples’ names.
  • Votes – and this is where your opinions will play out in the comments. The data professional in you wants to capture the details of every vote, and you might want to add a separate normalized Votes table for it. However, for the sake of this app, we don’t need granular tracking of votes. We just need to increment the number of votes on a message – and for that, it makes sense to just have a VoteCount column on the Messages object. I know. You’re not happy about that. But remember, we have a goal to track as little data as possible.

Here’s one of the early data model sketches from Richie’s notebook:

Richie’s sketch adds a fourth data object, connection, that’s required for tracking each of the browsers connected at the moment. We don’t need that for evil purposes – we just need to push updates back to them so their screen can refresh in real time.

Just for clarity’s sake, this isn’t the exact structure of PollGab’s current data model – but I like showing this early draft because it shows the thought process. Data models change over time, especially when you’re building a brand new app like this one where we can iterate quickly, and we don’t need to deal with queries coming from other systems, like reporting tools.

There are additional things to think about as well, like banning users and making payments, but I’ll keep it simple for the sake of this discussion. (And really, in a modern app, you don’t want to hold the payments yourself – that’s what services like Stripe are for.)

In terms of the way we load and query this data, here are a few sample use cases:

  • View the list of rooms here
  • View the list of messages for one room (like Office Hours)
  • Insert a new message
  • Upvote a message
  • Flag a message
  • Delete all messages in a room

You’ll notice that the views are fairly tightly scoped: we’re not doing things like analytical reports.

2. What database or persistence layer should we use?

I know, most of us here use the Microsoft data platform to make a living. However, dating back several years ago when Richie built PasteThePlan.com, then when he built SQL ConstantCare®, we’ve long believed that where we can avoid using a relational database, we should.

NoSQL systems are generally cheaper and easier to manage than relational databases. Yeah, I said it. Come at me. Try hosting production-quality, auto-scaling relational databases for $3.43 per month. It’s damn hard.

So like PasteThePlan (but unlike SQL ConstantCare®), we ended up using Amazon DynamoDB. I know it’s marketing, but the main boxes on the DynamoDB product page do a pretty good job of summing up its advantages:

DynamoDB does have drawbacks for most of us in the audience: we can’t just whip open SSMS or Azure Data Studio, write a query with joins and grouping, and get reports out. However, for the sake of this web site, we just don’t need reporting across multiple rooms or questions.

During the design phase, I specifically said that any kind of marketing analytics were completely out of scope. I recognize that ad agencies and podcasting companies might want to do centralized purchases for lots of podcasts or rooms, and then get analytical data to know which users are asking the most questions, or what kinds of questions get the most upvotes. They’re just not our target market for the initial product release.

This is a cloud-native, locked-in app.

We designed this from the ground up to work exclusively with Amazon DynamoDB. This means we’re stuck there, for better or worse. If Amazon DynamoDB goes down, PollGab’s going down with it. As a small business owner, that’s a risk I’m totally willing to take. DynamoDB has much higher uptime than any server I’ve personally been involved with building and maintaining.

I’m sharing this with you because I want you, dear reader, to understand why startups and small businesses might choose to go cloud-native and build their applications without traditional relational database systems. I love databases! I still absolutely love ’em. But I’d rather not be on call for fixing them when (not if) they break.

I’m also being fine locked into AWS because we chose to keep building with Amazon Lambda, the same hosting and processing architecture that we chose for PasteThePlan and SQL ConstantCare®. Lambda lets you write short-lived functions, and you don’t have to know anything about the operating system or servers that run your code. When someone hits a URL, the code runs, period. When lots of people come in, AWS will gradually spin up more… somethings… that run your code. That’s just not really our problem.

Could we have used Azure or Google services instead? Absolutely! But at the time we started building this, Richie had already built up 5 solid years of experience with DynamoDB and Lambda, so we were able to start building more quickly without learning new tooling.

Could we have used a conventional server-based application, like a web site in IIS or a .NET app? Absolutely! But we wanted to be able to completely outsource uptime management and troubleshooting to Amazon, and that’s where Lambda makes things easier. The best evidence for that is that as Richie continues to build stuff, he’s able to spend an absolute minimum of time managing our already existing apps like PasteThePlan and SQL ConstantCare.

What questions do you have about PasteThePlan’s data model or persistence layer? Let us know in the comments.


Building PollGab.com: Design a Database for Live Session Questions

PollGab
19 Comments

Years ago when we did online presentations or live Office Hours webcasts, attendees would type their questions into the “Questions” section of whatever webcast platform we were using – WebEx, GoToWebinar, Zoom.

This sucked because:

  • People would put questions all over the place (Questions, Chat, email, etc)
  • It was hard to keep track of which questions were still left to be answered
  • People couldn’t write questions in advance
  • People kept copy/pasting the same questions over and over because they’d think we hadn’t seen them
  • We didn’t have a way to tell which questions other folks wanted us to answer

So when I decided to start taking Office Hours more seriously a couple of years ago, Richie and I sketched out a better way.

That way is www.PollGab.com, and it’s what Richie’s been working on for the past year. You might have seen links to it during my Office Hours:

Viewers can post messages, upvote messages that they’d like to see me discuss, and flag inappropriate messages.

To get an idea of how it works, you can see my room, see the list of active rooms, and create your own room for free. If you run into problems or have a question, email Help@PollGab.com.

When you’re clicking around in there, if we’ve done our job right, things are going to seem pretty intuitive and obvious. However, it takes a lot of planning and work to make an app seem effortless, so let’s talk about some of the design goals we came up with.

The data updates automatically. If you sit there watching a popular room, you can see new questions coming in, and see questions move around on the leaderboard, automatically without touching Refresh on your browser. In a a perfect world, you should be able to load a room’s page just once, and watch the action unfold.

Viewers & voters don’t need to log in. In today’s GDPR-sensitive world, we want to store as little data as practical, and delete it as quickly as possible. We do need a way for room owners to be recognized so we can let them see the list of flagged questions, un-flag them, ban problem users, or set a question as “now showing”, plus some other neat tools I’ll mention this week.

The lack of logins does mean people can game the system: if you keep spawning new incognito browser windows, you can repeatedly upvote your own question. For now, we’re not really concerned about that. (Gotta pick your battles when you’re building a new site.)

If anyone flags a message, it disappears from everyone’s screens. Given that we’re talking about the Internet, where people are endlessly creative and foul, we wanted to make it easy for the audience to help us police. On the flip side, if the room owner is looking at the list of flagged questions, and decides to UN-flag a question (because it’s appropriate and safe), then that message reappears automatically. Plus, it can’t be flagged again, because the room owner has already believed it to be appropriate.

The room owner has additional controls. They can stop taking messages, or clear the message queue entirely.

Tag, it’s your turn:
design the database.

Let’s pretend that someone sketched all this out for you and asked you the following questions:

  1. What data needs to be stored to make all this work?
  2. What database or persistence layer should we use?

I’m curious to hear what you think. Let me know in the comments, and then in the next blog post, we’ll discuss what we chose to do.


Big Data Clusters: Out of Aces

SQL Server
16 Comments

When this feature was announced in 2018, I wrote:

<sarcasm> It’s like linked servers, but since they don’t perform well, we need to scale out across containers. </sarcasm>

I just didn’t get it, and I continued:

I like that Microsoft is making a risky bet, planting a flag where nobody else is, saying, “We’re going to be at the center of the new modern data warehouse.” What they’re proposing is hard work – we all know first-hand the terrible performance and security complexities of running linked server queries, and this is next-level-harder. It’s going to take a lot of development investments to make this work well

The thing with bets, as the great strategist K. R. Rogers wrote, is you gotta know when to fold ’em, know when to walk away, and know when to run.

Discussing SQL Server licensing

Today, Big Data Clusters died in its sleep.

Back then, I said I liked that Microsoft was making risky bets. They’ve proven time and again that they’re willing to continue to do that with SQL Server, throwing all kinds of crazy features against the wall. Big Data Clusters in Kubernetes. Machine Learning Services on Linux. Calling Java code from T-SQL. I look at a lot of these bets and think, “Uh, I don’t think you can win a poker game with two threes, a Subway discount card, and a happy birthday card from Grandma. That’s not how this works.”

The great part about these utterly wacko bets is that they don’t break the database engine. Microsoft SQL Server is still a robust, powerful persistence layer with a lot of good features. Microsoft hasn’t made dumb moves like saying, “We’re going to replace T-SQL with interpretive dance,” or “We’re not going to run queries if they don’t end in a semicolon.” They’ve figured out that they have to keep the core engine working better and better with each release, too.

If you ask me, five nines is a better hand.

There’s a lesson in here when you gamble on your own career. I beat this drum a lot, but I’m gonna say it again: you only have so many hours in the day to learn new skills. Make sure every hour counts.


Meme Week: Setting Fill Factor to Fix Fragmentation

Index Maintenance
27 Comments

I just shake my head when I see people setting fill factor in an attempt to “fix” fragmentation.

The default fill factor is 100%. That means during an index rebuild, SQL Server packs 100% of your 8KB pages with sweet, juicy, golden brown and delicious data.

But somehow, some people have come to believe that’s bad, and to “fix” it, they should set fill factor to a lower number like 80% or 70%. But in doing so, they’re setting fragmentation to 20%-30%. They’re telling SQL Server to leave 20%-30% free space on every single page during an index rebuild.

That’s internal fragmentation – empty space on pages – and it’s bad. It makes your database 20-30% larger, makes your table scans take 20-30% longer, your maintenance jobs take 20-30% longer, and makes your memory 20-30% smaller.

And the very people who keep playing around with fill factor are the ones who wonder why their queries aren’t getting faster, why their maintenance is taking longer, and why their fragmentation isn’t getting better. The jobs still keep taking forever every night, and they don’t connect the dots that they’re the problem.


Meme Week: NOLOCK? NOSQL.

When you find yourself using NOLOCK, consider NOSQL instead.

NOLOCK has 4 big problems:

  • Your query can see the same rows twice
  • Your query can skip rows altogether
  • Your query can see data that was never committed
  • Your query can fail altogether

If you’re okay with all 4 of those, and you find yourself REGULARLY applying NOLOCK, you probably don’t need the expense of a relational database at all. You’d probably be better off with an eventual consistency platform that trades accurate results for speed.


Meme Week: SET DEADLOCK_PRIORITY HIGH

Hey, this seems harmless.

Seriously, don’t SET DEADLOCK_PRIORITY HIGH.

Even setting it to low can be a bad idea. I’ve heard people say, “Well, I’ve got a big process that runs nightly, but if it gets in anybody’s way, I would rather just kill my process than slow someone else down.” Keep in mind that SQL Server is still doing all that work until it hits the deadlock – and if we’re talking about an hours-long process, there can still be hours of work involved in rolling back your process to let someone else win the deadlock.

When you find yourself going near the DEADLOCK_PRIORITY option, take a step back and question every decision that brought you to this point.


Meme Week: Queue in the Database

Architecture
11 Comments

When you have a hammer, everything looks like a nail. Unfortunately, a lot of people don’t even know how to use a hammer correctly.

You really shouldn’t be doing queues in the database because:

  • They’re a blocking nightmare, constantly locking rows in highly concurrent tables
  • They’re a logging nightmare, constantly pouring data into the transaction log
  • They’re an HA/DR nightmare, constantly causing slow writes to your sync replicas
  • They’re a backup nightmare, constantly changing pages that bloat your differential backups
  • They’re a licensing nightmare, causing a lot of work on an expensive relational database for tables that have no relational joins to anything else whatsoever
  • There are simple, cheap queueing alternatives like Amazon SQS and a few in Azure

If you really gotta do queues in SQL Server, read how to use locking hints.


Meme Week: SQL Server 2012 as of 2022

This week, I’m writing blog posts that can be summed up in one image. Here’s the first one:

You could probably stop reading right there, but if you’re not convinced, allow me to elaborate.

SQL Server 2012 support ends July 12.

No, there’s not a year in that sentence, because it’s this year.

No, there’s no qualification about “mainstream support” because that’s the extended support end date.

Sure, there are extended security updates for a few more years, but those are only security bugs. Within 141 days, your SQL Server 2012 boxes are going to be unsupported, period, full stop, end of sentence. If you call Microsoft for support, they’re going to give you best effort, but at the end of the day, you shouldn’t be surprised if they tell you that the “fix” is to upgrade your SQL Server.

And you know what? I think that’s completely fair. SQL Server 2012 is a decade old at this point.

It wasn’t a bad version!

Okay, maybe it's deniable.
Undeniably attractive. Okay, maybe a little deniable.

I don’t want you to think I was disappointed in 2012. To the contrary, I think it was a groundbreaking release, especially looking back at 2008 and 2008R2. My 2008 R2 review noted that it was really just more of the same from 2008, whereas 2012 was really different.

In 2012, I was incredibly excited when Always On AvailabilityGroups (did I capitalize and space that right?) revolutionized high availability and disaster recovery. However, the SQL Server 2012 execution was outright terrible.

2012’s columnstore indexes? No way, Jose: they made the table read-only.

But this stuff represented great down payments on technology that Microsoft has continued to invest in. Today, in 2022, columnstore indexes and AGs are solid features that…

(listens to earpiece)

One moment please…

(continues listening to earpiece)

I’m being informed that Always On Availability Groups are still painful, but at least they’re widespread. (Did I phrase that right?)

Look, stop distracting me. The point is SQL Server 2012 is bad in 2022. If you’re still running 2012 in production today, it’s time to start having the discussion with the business. Here’s how I’d start: “Is it okay if application ___ is running on an unsupported version of SQL Server?”


What Trace Flags Are You Using, and Why?

Most of the time, when you need to change SQL Server’s behavior – like how it handles memory or parallelism – there are easy-to-use switches to get the job done. In SQL Server Management Studio, just right-click on the server’s name, go into Properties, and there are all kinds of things you can mess with. Most of the time, you’re going to make things worse, but whatever.

Every now and then, when you work with Microsoft support, you might hit a specialized situation where you need a very unusual change in SQL Server’s behavior. These settings aren’t well-documented, nor are they easy to enable via the GUI. These settings are typically implemented via trace flags.

Trace flags are edge cases. At least, that’s the idea. If you read through Konstantin Taranov’s excellent open source trace flag documentation, you’ll see why.

Back in the 2000s, a few settings turned out to be fairly important, like how TempDB handles page allocations. For a while back then, folks would recommend that everybody implement trace flags like 1117 and 1118. Microsoft took note of that, and eventually changed SQL Server’s normal behavior so that those trace flags weren’t necessary for TempDB anymore.

Konstantin’s documentation has a list of recommended trace flags, but I’m curious: if you’re running a trace flag that is NOT in Konstantin’s list of defaults, which one is it, and what prompted you to turn it on?


[Video] Office Hours: Q&A on the Beach at Sunset

Videos
0

You posted and voted on questions at https://pollgab.com/room/brento and I answer ’em on the beach. It’s not your eyes: the camera’s hunting for focus a lot in this video.

Here’s what we covered:

  • 00:00 Introductions
  • 01:01 RoJo: Please comment on the Cloud. Too much push at my company to the Cloud without backing to do so. Lots of issues, costs, delays to get there. Supposed savings (less maintenance) seem non-existent. Less performant. We already own SQL server so ‘renting’ a new location seems like waste
  • 02:58 Latka Gravas: Hi Brent, my coworker has sp_whoisactive logging to a table every 30 seconds. Is this is an acceptable amount of observer overhead?
  • 04:57 dan heile: for the answer question sessions – we have an archive database that is growing in size. we were going to look at row or page level compression but the sys engineers said that with our pure or netapp flash storage that all compression is done at the san level. thought?
  • 07:27 Dan Griswold: Hi Brent, do you know of any gotchas when upgrading to a new SQL Server server version and then running the old database in compatibility mode. I’m considering using this method to get off my older servers and then worry about upgrading the compatibility mode later. Thanks!
  • 08:42 Desi: I am seeing too many “sleeping” connections in SQL from application. Can this be a cause of concern? SQL version is 2017 Enterprise edition
  • 09:24 Gustav: Thank you for the highly informative Office Hours sessions. The sunrises are an additional bonus. Will there be a “Watch Brent Code” stream at some point considering your schedule? Those too are great to watch. How you “tackle” a challenge that needs to be solved.
  • 10:28 Hatsune: Does parameterizing TOP ( SELECT TOP (@MaxRows) ) contribute to potential query performance problems?
  • 11:25 Frank Poncherello: Hi Brent, Is Resource Governor ever a good traffic cop for a busy multi-tenant OLTP system with multi-terabyte DB?
  • 13:22 Career pro tip
  • 13:43 Ahmet: Do you know of a good page that shows the new features by SQL Server version?
  • 14:30 Nedim: We sometimes see high disk IO utilization for one of our 15 SAN drives but not sure which queries are driving the IO. Is there a good way to determine which queries are targeting tables for a given SAN drive for a particular point in time?
  • 15:35 Mike: Hi Brent, what are your thoughts regarding onprem SQL Server backup to azure storage functionality? Any issues with it?
  • 17:20 Don’tBotherAsking: Hi Brent, love your work. A friend improved a slow running vendor query for the business with a covering index. Query now completes in fraction of the time. But query optimizer now says reason for early termination is ‘time out’ instead of ‘good enough’. Is this fix sub-optimal?
  • 18:20 Sunset

Fundamentals Week: Your First Official SQL Server Class.

Company News
1 Comment

You love learning from me. You’ve attended some of my free online streams, read my blog posts, and you use the First Responder Kit. You enjoy my laid-back, humorous approach to sharing what I’ve learned over the years.

You love live classes. You’ve tried watching recordings or stepping through demos yourself, but…you just can’t block out the time, and you can’t stay motivated while you’re watching a recording. You like the fast-paced energy of seeing me live, asking me questions, and seeing me respond to your chats.

You hate conferences that feel like Zoom meetings. You’ve tried attending a few online conferences, but they felt just like sitting through one meeting after another. They didn’t dive deeply into any one subject – they just hopped around from one topic to another, and many of those topics just weren’t relevant to you. They were good for free stuff, but…

You’re ready for Fundamentals Week. In March, you’ll spend a week with me learning about indexes, query tuning, columnstore, how I use the First Responder Kit, and more. I’m teaching it two weeks in a row – one week in US-friendly times 9AM-5PM Eastern, and one week in Europe-friendly times 8:00-16:00 UTC.

To keep prices low, recordings are not included – this event is really just for folks who like to learn live. If you want the recordings, check out my Recorded Class Season Pass.

The class lineup and registration is open now. See you there!


Who’s Hiring in the Database Community? February 2022 Edition

Who's Hiring
31 Comments

The pandemic is causing so many folks to rethink what they do for a living, where they do it, who they do it for, and in light of yesterday’s salary survey results, how much money they earn doing it. It’s been really fulfilling to hear stories of employers who found candidates here, and candidates who found a new opportunity. So let’s keep ‘er going for a while longer.

Is your company hiring for a database position as of February 2022? Do you wanna work with the kinds of people who read this blog? Let’s set up some rapid networking here. If your company is hiring, leave a comment.

The rules:

  • Your comment must include the job title, and either a link to the full job description, or the text of it. It doesn’t have to be a SQL Server DBA job, but it does have to be related to databases. (We get a pretty broad readership here – it can be any database.)
  • An email address to send resumes, or a link to the application process – if I were you, I’d put an email address because you may want to know that applicants are readers here, because they might be more qualified than the applicants you regularly get.
  • Please state the location and include REMOTE and/or VISA when that sort of candidate is welcome. When remote work is not an option, include ONSITE.
  • Please only post if you personally are part of the hiring company—no recruiting firms or job boards. Only one post per company. If it isn’t a household name, please explain what your company does.
  • Commenters: please don’t reply to job posts to complain about something. It’s off topic here.
  • Readers: please only email if you are personally interested in the job.

If your comment isn’t relevant or smells fishy, I’ll delete it. If you have questions about why your comment got deleted, or how to maximize the effectiveness of your comment, contact me.

Each month, I publish a new post in the Who’s Hiring category here so y’all can get the latest opportunities.

If you want to be notified as people leave new jobs here, there’s a “subscribe without commenting” option where you leave a comment. I don’t use that for subscribing you to anything else.


[Video] Office Hours at Sunrise (No Really)

Videos
0

I’ve got a really sensitive lens (Sigma 16mm f/1.4) on my camera, so often when I record these at sunrise, it’s already too bright for you to see the sun. This time, I went down to the beach a little earlier, and through the video, you can see the light really pick up.

Here’s what we covered:

  • 00:00 Introductions
  • 00:31 Radek: Hi Brent, do you know a person who, is doing exactly what you do, but for production DBAs, not performance tuning folks? I cannot find up-to-date sessions/courses dedicated to DBAs that cover that part of the job
  • 02:14 Arnold: What are the top causes of data corruption in SQL Server?
  • 03:28 marcus the german: Hi Brent, a friend of mine 😉 wants to know if it’s possible to tell which page is on which datafile (db with multiple datafiles)?
  • 04:38 Hamid: What are the pros / cons of putting responder kit / sp_whoisactive in master database as opposed to some kind of dedicated DBA database?
  • 06:04 Hilal: What are the risks of killing the SPID associated with a long running transaction? Is terminating the offending app any better?
  • 07:42 Meiko: Hi Brent, I think sp_blitzFirst is reporting “Forwarded Fetches/Sec High” for temp tables created by sprocs / jobs on our system. How do we find the offending jobs / sp’s amongst our many jobs / sp’s?
  • 09:16 Mark: Hi Brent, what are the pros / cons of running two SQL instances on the same bare metal hardware?
  • 10:09 Preben: Hi Brent, have you ever used delayed durability to significantly reduce writelog waits? Are there any undocumented issues with it?
  • 11:29 I don’t wanna DBA Today: I want to setup a read-only SQL Server to run reports off of since we are currently crushing our production server. We have 2016 Standard and were looking at the best option, be it Basic Always On versus Replication. Only a subset of the production DBs will be used. Thanks!
  • 12:17 Wrap-up, fishing in Cabo

If you’d like to ask your own question, post it at https://pollgab.com/room/brento/ and upvote the ones you’d like to see me cover at my next Office Hours.


[Video] Office Hours: Ask Me Anything About SQL Server

Videos
0

You ask questions at https://pollgab.com/room/brento and I answer ’em from the beach in Cabo San Lucas:

Here’s what we covered:

  • 00:00 Introductions
  • 00:27 Paul: What are your thoughts on being a Microsoft MVP?
  • 04:33 Lance: I have a Server running SCCM that gets ‘unsafe assembly’ errors. Are unsafe assemblies something that needs to be worried about?
  • 04:52 Arslan: Hi Brent, what are the top issues you see related to customers running SQL CLR?
  • 06:02 Anil: Can bare metal SQL Server safely utilize NAS storage (slower / cheaper) for data files or is it recommended to only use SAN storage (faster / more expensive)?
  • 07:13 Hasan: Hi Brent, what are the origins of SQL Bits? Do you have any epic stories from SQL Bits?
  • 10:02 Jim: How does PostgreSQL query tuning compare with SQL Server query tuning?
  • 10:35 Samir: Do we ever have to worry about disk (not page) fragmentation with SQL Server?
  • 11:48 Wrap-up

[Video] Office Hours: Ask Me Anything About SQL Server in Cabo

Videos
4 Comments

I watched the sun rise from my balcony in Cabo this morning and answer your questions from https://pollgab.com/room/brento.

Here’s what we covered:

  • 00:00 Introductions
  • 00:42 Student: Hi Brent! Why is full backup sometimes slow and sometimes fast in similar situations? This problem will be resolved after restarting the server but it comes back again.
  • 02:58 Frank Peters Netherlands: What is your opinion of the Polybase feature in SQL 2019? I read it could possibly replace ETL and perform better than Linked Servers due to being able to scale out. Or are there still some major (performance) drawbacks?
  • 04:50 Scooter: Hi Brent, not a question… Just wanted to say how much we appreciate your training, office hours, and tools. Thanks for all you do for the SQL Server community.
  • 05:30 Paul: Hi Brent, what are the common performance issues you run into with queries generated by ORM’s such as Entity Framework? How do you deal with them?
  • 06:46 Mattia Nocerino: DB is 130 GB, biggest table (log) 56 GB. Inserting 1 row at a time, no updates/deletes. Columnscore gave me a D. SELECTs are mostly executed on recent data. Thinking about keeping last year worth of data and move the rest to another DB, to improve RTO. What is your recommendation
  • 08:32 Saverio: Nowadays it is common for (my friend’s) servers to have almost exclusively SSDs instead of HDDs. What are your thoughs about maintenance (Statistics Update and Index Rebuild) from this point of view? It’s still a big concern or something less important?
  • 09:41 Myron: Which is better in your opinion Azure, VM or physical hardware for a new SQL Server install?
  • 10:55 Youssef: What are your recommendations for keeping SQL agent jobs in sync between primary and secondary always on servers?
  • 12:07 Mr. Griffith: Have you ever been hired as an expert witness in a lawsuit? We’ve got a vendor telling us NOLOCK is the solution to the blocking problems for our mission-critical application..
  • 12:38 Byron: What are the top mistakes you see from clients with respect to sql backups and integrity checks?
  • 14:58 Erdem: Hi Brent, is it ok to use ‘DBCC DROPCLEANBUFFERS’ to simulate running a cold – slow query on production?
  • 16:00 Maciej: Hi Brent, my friend recently learned (to his own surprise) that Microsoft fixed (in SSMS 18.9) the bug with IntelliSense error message while connecting to SQL Server using DAC. Do you have a bug in SSMS and/or SQL Server that you treat (at this point) as an old friend? 😉
  • 18:26 Old DBA: I had a coworker tell me that stored procedures are no longer the way people are doing sql and that we need to stop. I feel he is absolutely wrong, what are your thoughts?
  • 20:31 Maksim Bondarenko: Hi Brent! My friend says that when we use RCSI we shouldn’t use FILLFACTOR=100 because 14-byte version pointers added to the modified rows. What do you think about that?
  • 22:07 Theo: How often do you recommend taking inventory of all company SQL servers and their configuration?
  • 23:25 Fred: To what extent should the developers be involved in creation of new NC indexes on prod? Let them run sp_blitzindex?
  • 25:33 Wrap-up

When Should You Use DESC in Indexes?

Indexing
8 Comments

The short answer is that if your query orders columns by a mix of ascending and descending order, back to back, then the index usually needs to match that same alternating order.

Now, for the long answer. When you create indexes, you can either create them in ascending order – which is the default:

Or descending order:

If your query orders the data in descending order, but your index is in ascending order, like this:

SQL Server can still use the index. I’ll demonstrate using the Stack Overflow database:

SQL Server can scan the index backwards, from the highest reputation to the lowest reputation value. You can see proof by right-clicking on the Index Scan and clicking Properties:

There’s a wee bit of a gotcha: backwards range scans can’t go parallel, as Paul White mentions in this post, but that’s rarely a dealbreaker.

In simple scenarios like this, you don’t need to screw around with DESCending order on your indexes.

Things change when you order by multiple columns in alternating orders.

Let’s say you want to build a leaderboard of the top Stack Overflow users. You want them in Reputation order descending, highest reputation first, and you want a tiebreaker column. If two people have the same Reputation score, you want to sort them by DisplayName. You’ve created an index to help:

Now things are a little different, as you can see in the actual execution plan:

SQL Server flat out ignored the index, read all 9 million rows in the table, and sorted all of them. The stats IO & time is horrifying, too:

For those of you who are horrified by this plan and don’t believe me, you’re welcome to reproduce it at home. I’m using a large Stack Overflow database, SQL Server 2019, 2019 compat mode.

SQL Server *COULD* use the index for this, mind you, as we can see with a hint, but the actual execution plan becomes even more terrifyingly insane:

And the statistics are bonkers:

You and I would design a better plan.

The execution plan you and I *WISH* we would get out of SQL Server is:

  1. Scan the index from highest reputation to lowest, reading the first 100 rows and then immediately stopping. All 100 of the rows will match – remember, our query doesn’t even have a WHERE clause. Just read the first 100 rows, and call it a day.
  2. Sort those 100 rows by our tiebreaker column, DisplayName.
  3. Do 100 key lookups to satisfy the SELECT *.

However, it’s possible that the top 200 rows (not 100) all have exactly the same Reputation score. To handle that edge case scenario, we’d have to modify our execution plan a little:

  1. Scan the index from highest reputation to lowest, reading the first 100 rows
  2. Check the 101st row, and see if its Reputation matches the 100th row. If it does, go read another row from the index. Repeat that process until there are no more ties.
  3. Sort the 100-110 (or 150 or whatever) rows by our tiebreaker column, DisplayName.
  4. Do 100 key lookups to satisfy the SELECT *.

That would work, but…SQL Server just doesn’t build queries that way. The operator that sorts the data is a different operator than the one that’s reading the data:

The sort happens long after the data is read. And there’s another problem: key lookups are usually bound directly to their index operations. I think I can count on one hand the number of times in my life when I’ve seen SQL Server do an index operation (like a seek or scan), then another operation (like a sort), and then go back to the same table to do a key lookup. (Usually it involves an indexed view – more on that in another post.)

So if SQL Server uses the index’s sort of Reputation, then it insists on doing these stupid key lookups at the same time, reading more 8KB pages than there are in the table.

The fix: use DESC in the index to remove the sort.

If we build the sort into the index itself, then we don’t have to worry about tiebreakers on the TOP 100 ORDER BY Reputation DESC, DisplayName sort. Let’s try an index with explicit ordering built in:

This actual execution plan is way more elegant:

The query uses our new DESC index and runs way faster with zero CPU:

The takeaway: if your query orders columns by a mix of ascending and descending order, back to back, then the index usually needs to match that same alternating order. I don’t see this problem too often, and I can see why the missing index recommendations don’t fire off here. Querying by alternating sort orders on a single table is fairly unusual.

Alternating sort orders on multiple tables is a different story altogether. I describe that problem in the Indexing for Joins module in Fundamentals of Index Tuning class.


[Video] Office Hours: Ask Me Anything About SQL Server

Videos
1 Comment

Let’s pick up right where we left off yesterday. I’ve got more time and champagne, so let’s keep going through your top-voted questions from PollGab.com/room/brento.

Here’s what we covered:

  • 00:00 Introductions
  • 00:39 Wilma: Poor code & tuning issues. Evidence = Database Performance Analyzer & sp_blitz. App is the customer’s idea. The dev has got golden balls (but writes poor queries), server, network and SQL always to blame. Any suggestions on how to present my findings with out losing a customer?
  • 03:15 Doug: What are the common gotcha’s associated with running SQL in a VM environment?
  • 04:36 Richard: Is it ever OK to use a non-unique date/time column (CreationDate in this case) for a new clustered index on a pre-existing HEAP table?
  • 05:10 Kyle: Does the 5 x 5 rule for new Non-Clustered indexes apply to both OLTP and OLAP tables?
  • 06:17 Dwight: What are the ways to find all queries targeting a given table?
  • 07:54 Bruce: Hello Brent, when should you play a good cop vs bad cop towards the dev executing performance destabilizing queries on the production sql server?
  • 11:11 BiggieDB: Hi Brent and all the party people! I am working on tuning up code and see the developer put, on every join, forceseek hint. I know we are bossing the optimizer around there, but is there ever a good case to use this technique?
  • 12:14 Bruce: Hello Brent, what are the hidden costs of PostgreSQL?
  • 13:40 Dwight: Hello Brent, what is the Microsoft true-up process like for SQL server? 14:30 SwedeDBA: Hi Brent, I have a friend that is searching for unused indexes on a DB in an availability group with readable secondaries. He wonders if the result from SP_BlitzIndex shows all index usage on the readable secondaries so he can securely drop the unused indexes? Thanx man!
  • 15:00 Bill: Hello Brent, have you ever considered working for Microsoft?
  • 21:00 Wally: Hi Brent, is there a good way to find root cause for when page life expectancy drops significantly? Is this a worthy tracking metric?
  • 21:59 Shawn: What are your favorite non technical books?
  • 25:34 Wrap-up, discussion of Dead by Daylight

[Video] Office Hours: Ask Me Anything About SQL Server

Videos
0

I’m back in San Diego, so let’s sit out on the balcony, enjoy a tasty beverage, and go through your top-voted questions from PollGab.com/room/brento.

Here’s what we covered:

  • 00:00 Carl: Hi Brent, what is your opinion of the graph db support starting in SQL2017? Is this another doomed SQL feature?
  • 02:06 Bob: Brent, sometimes when we run sp_BlitzCache for a given sp, the top resultset shows that we couldn’t find a plan for this query but the bottom resultset shows prioritized findings such as Functions, blocking, etc. Can the bottom resultset findings still be relied upon?
  • 03:00 juggler314: My friend has some old tables with no great candidates for a clustered index which are currently heaps. Is it better to use something like a timestamp which may not be unique (but already exists and is used), or just add an auto-increment internal field to be the key?
  • 04:52 Jim: Hi Brent, do you have any recommended one stop tools/scripts that capture server configuration (db files, db settings, memory configuration, os version, sql version, etc) for DR purposes? Love your office hours.
  • 07:28 Ken: Hi Brent, is there a good way to handle multi db backup/restores when the transactions span multiple db’s?
  • 09:48 Drowning in SQL Pool: Hi Brent, when it comes to upgrading compatibility levels on databases after you have left an upgrade stew for a few weeks, Should you upgrade system databases? and if so should you do these first, wait a week and then do user databases? Or tackle these all at the same time?
  • 10:35 Chris: Hi Brent, what are the pros/cons of using NUMA based servers for bare metal SQL?
  • 12:25 Todd LeBod: Hi Brent, do you know of anyone (successfully) using Stretch Database, and is Microsoft still pushing this (making enhancements, bug fixes, etc)?
  • 14:11 Wrap-up, discussing Cabo and San Diego

Updated First Responder Kit and Consultant Toolkit for January 2022

In this release, sp_Blitz shows some information about clusters and AGs, sp_AllNightLog and sp_DatabaseRestore get some love, and more.

How I Use the First Responder Kit
Wanna watch me use it? Take the class.

To get the new version:

Consultant Toolkit Changes

I updated it to this month’s First Responder Kit, but no changes to querymanifest.json or the spreadsheet. If you’ve customized those, no changes are necessary this month: just copy your spreadsheet and querymanifest.json into the new release’s folder.

sp_AllNightLog Changes

  • Enhancement: now works in any database as long as Ola Hallengren’s scripts are installed in the same database. (#3009, thanks Nick Fotopoulos.)
  • Enhancement: there’s a new configuration record to allow data/log files to be moved to the instance’s default directory. (#3046, thanks Dale Hirt.)

sp_Blitz Changes

  • Enhancement: now displays information about clusters and Availability Groups. (#2976, thanks Ragatilao.)
  • Fix: xp_regread no longer errors out if registry keys aren’t found. (#2836, thanks ScottL1969 and Andreas Jordan.)

sp_BlitzIndex Changes

  • Enhancement: faster visualization of nonclustered columnstore indexes on wide tables. (#3043, thanks Razvan Socol.)
  • Fix: total_forwarded_fetch_count column wasn’t included in the results if you saved the results to a table. (#3053, thanks Pollus Brodeur and Vladimir Vissoultchev.)

sp_DatabaseRestore Changes

  • Enhancement: the @StopAt parameter is now a real point in time restore, not just stopping at the transaction log before that point, but going up to that moment in time in the last log backup. (#3038, thanks Daniel van der Meulen.)

Bonus changes: Anthony Green kept the SQL Server versions file up to date, and Konstantin Taranov improved the repo’s readme.md.

For Support

When you have questions about how the tools work, talk with the community in the #FirstResponderKit Slack channel. Be patient: it’s staffed by volunteers with day jobs. If it’s your first time in the community Slack, get started here.

When you find a bug or want something changed, read the contributing.md file.

When you have a question about what the scripts found, first make sure you read the “More Details” URL for any warning you find. We put a lot of work into documentation, and we wouldn’t want someone to yell at you to go read the fine manual. After that, when you’ve still got questions about how something works in SQL Server, post a question at DBA.StackExchange.com and the community (that includes me!) will help. Include exact errors and any applicable screenshots, your SQL Server version number (including the build #), and the version of the tool you’re working with.


Error Handling Quiz Week: Where Are You Handling Errors?

T-SQL
11 Comments

This week’s series on error handling has been an eye opener for many of us. We’ve laughed. We’ve cried. We’ve screamed in horror. We’ve read the documentation.

I don’t blame you. This is a really confusing topic, and for many of us, it’s the first time we’ve actually looked at the mechanics of how transactions, TRY/CATCH, and error handling work. I know it’s the first time because a while back, I asked you how often your newly written code checks for errors. Most of the time, we’re not doing it at all.

That was enlightening (and a bummer) for sure, but I followed it up with another poll. When we DO check for errors, where are we doing it – in a TRY/CATCH block, in some other form of T-SQL, or in the app code?

When you think about the results, remember that this is nowhere near a valid survey – it’s just a quick question to open a discussion. This poll is actually why I wrote the blog post series – I had a pretty good hunch people didn’t understand how TRY/CATCH really works, and given that most of you aren’t using it, I figured I should talk about how complex it is.

The responders aren’t a wide sample. Most of my audience consists of database people, not pure application developers. Also keep in mind that if you’re writing C# code that accesses the database via Entity Framework or some other ORM, you’re not writing T-SQL queries either. In that case, error handling would obviously be in the app code, not TRY/CATCH or other forms of T-SQL.

I had to have the “I don’t check for errors” answer given how widespread the no-checking answers were on the prior poll. It helps remind the rest of my audience what kind of real-world code is out there, and it gives the YOLO crowd a way to answer without pestering me for replies.

Twitter polls only let you pick one answer, not mark-all-that-applies, and many folks responded with replies saying that they sometimes check in different places (or multiple places) depending on the kind of queries they’re writing. That’s great! And given what we saw with the complexity today, you can see why it’s helpful to try to catch errors in more than just one place.

As an industry, I think we’ve got some work to do around improving our handling of errors, transactions, and database failovers. I’m not saying every query needs to automatically retry deadlocks, handle database failovers, and survive out-of-drive-space issues, but there are a few easy things we can do to make our apps more graceful and fault-tolerant.