This morning, I wanted to write a few new “background noise” queries for my Mastering classes. In those classes, the students run live workloads against the Stack Overflow database and troubleshoot performance issues, and the more of a varied workload I’ve got running, the more realistic it feels. For this session, I specifically wanted to write quick queries – queries that would produce real-world style data in a few seconds, in the style a user would actually write them.
I streamed the whole thing live in Twitch, and you can watch. – or in YouTube if your company blocks Twitch.
If you wanna jump around to different parts of the video, here are the key points where I started new queries:
- 7min:33sec in: I introduce the topic of what I’m trying to do. The first query fetches the tags a user works on the most often and shows their scores. I happen to run across Gordon Linoff, a prolific answerer.
- 26m:08sec: I let y’all tell me what queries you want me to write, and then I start working on finding the top tags in a date range.
- 34m:54sec: I explain why I love COALESCE and how a friend of mine ended up putting a seasickness pill up their butt.
- 42m:30sec: I code a leaderboard of the top questions with the most answers.
- 1h:04m: I get a great question about parameter sniffing on update statements, so I code a demo showing it live.
- 1h:41m: I find the most recent answers from the top 10 most active folks, and explain why you have to use hard-coded dates rather than GETDATE() when working with older Stack Overflow exports.
- 1h:57m: I write a query to find average response times for a given date range & tag, and build it atop a view to simulate what users would do, and then write another proc atop that view to find the fastest answers in a date range.
Enjoyed this session? Follow me on Twitch, YouTube, or Facebook to be alerted whenever I’m streaming.
Here are the scripts from the demo:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 |
CREATE OR ALTER PROC dbo.usp_GetTagsForUser @UserId INT AS BEGIN SELECT TOP 10 COALESCE(p.Tags, pQ.Tags) AS Tag, SUM(p.Score) AS TotalScore, COUNT(*) AS TotalPosts FROM dbo.Users u INNER JOIN dbo.Posts p ON u.Id = p.OwnerUserId LEFT OUTER JOIN dbo.Posts pQ ON p.ParentId = pQ.Id WHERE u.Id = @UserId GROUP BY COALESCE(p.Tags, pQ.Tags) ORDER BY SUM(p.Score) DESC; END GO /* Mehow - top tags AbusedSysadmin question w/most answers DavidHooey - how many of my questions have been answered DamnTank - last 10 posts by the top 10 posters Bilbo - average answer response time Gdonufrio - fastest answer */ CREATE OR ALTER PROC dbo.usp_RptTopTags @StartDate DATETIME, @EndDate DATETIME, @SortOrder NVARCHAR(20)= 'Quantity' AS BEGIN /* Changelog: 2020/05/28 Mehow - I needed a quick report for the sales team. */ SELECT TOP 250 pQ.Tags, COUNT(*) AS TotalPosts, SUM(pQ.Score + COALESCE(pA.Score, 0)) AS TotalScore, SUM(pQ.ViewCount) AS TotalViewCount FROM dbo.Posts pQ LEFT OUTER JOIN dbo.Posts pA ON pQ.Id = pA.ParentId /* Answers join up to questions on this */ WHERE pQ.CreationDate >= @StartDate AND pQ.CreationDate < @EndDate AND pQ.PostTypeId = 1 GROUP BY pQ.Tags ORDER BY CASE WHEN @SortOrder = 'Quantity' THEN COUNT(*) WHEN @SortOrder = 'Score' THEN SUM(pQ.Score + pA.Score) WHEN @SortOrder = 'ViewCount' THEN SUM(pQ.ViewCount) ELSE COUNT(*) END DESC; END GO EXEC usp_RptTopTags @StartDate = '2010-11-10', @EndDate = '2010-11-11'; GO CREATE OR ALTER PROC dbo.usp_RptPostLeaderboard @StartDate DATETIME, @EndDate DATETIME, @PostTypeName VARCHAR(50) AS BEGIN /* Changelog: 2020/05/29 Jonathan9375 - make it work for multiple PostTypes 2020/05/28 AbusedSysadmin - New social media project to display viral questions. */ SELECT TOP 250 * FROM dbo.PostTypes pt INNER JOIN dbo.Posts p ON pt.Id = p.PostTypeId WHERE p.CreationDate >= @StartDate AND p.CreationDate < @EndDate AND pt.Type = @PostTypeName ORDER BY AnswerCount DESC; END GO EXEC usp_RptPostLeaderboard @StartDate = '2010/05/01', @EndDate = '2010/05/02', @PostTypeName = 'Question' GO CREATE OR ALTER PROC dbo.usp_RptQuestionsAnsweredForUser @UserId INT AS BEGIN /* Changelog: 2020/05/29 David Hooey - PM also wants to show a percentage 2020/05/28 David Hooey - Product manager wants to show each user's number of questions */ WITH MyQuestions AS ( SELECT pQ.Id AS QuestionId, pQ.AnswerCount FROM dbo.Users u INNER JOIN dbo.Posts pQ ON u.Id = pQ.OwnerUserId /* My questions */ INNER JOIN dbo.PostTypes pt ON pQ.PostTypeId = pt.Id AND pt.Type = 'Question' WHERE u.Id = @UserId ), MyAggregates AS ( SELECT COUNT(*) AS MyQuestions, SUM(CASE WHEN AnswerCount > 0 THEN 1 ELSE 0 END) AS Answered FROM MyQuestions ) SELECT MyQuestions, Answered, 100.0 * Answered / MyQuestions AS AnsweredPercent FROM MyAggregates END GO EXEC usp_RptQuestionsAnsweredForUser @UserId = 26837; EXEC usp_RptQuestionsAnsweredForUser @UserId = 9527192; GO CREATE OR ALTER PROC dbo.usp_DashboardFromTopUsers @AsOf DATETIME = '2018-06-03' AS BEGIN /* Changelog: 2020/05/28 DamnTank - last 10 posts by the top 10 posters */ CREATE TABLE #RecentlyActiveUsers (Id INT, DisplayName NVARCHAR(40), Location NVARCHAR(100)); INSERT INTO #RecentlyActiveUsers SELECT TOP 10 u.Id, u.DisplayName, u.Location FROM dbo.Users u WHERE EXISTS (SELECT * FROM dbo.Posts WHERE OwnerUserId = u.Id AND CreationDate >= DATEADD(DAY, -7, @AsOf)) ORDER BY u.Reputation DESC; SELECT TOP 100 u.DisplayName, u.Location, pAnswer.Body, pAnswer.Score, pAnswer.CreationDate FROM #RecentlyActiveUsers u INNER JOIN dbo.Posts pAnswer ON u.Id = pAnswer.OwnerUserId WHERE pAnswer.CreationDate >= DATEADD(DAY, -7, @AsOf) ORDER BY pAnswer.CreationDate DESC; END GO EXEC usp_DashboardFromTopUsers GO CREATE OR ALTER VIEW dbo.AverageAnswerResponseTime AS SELECT pQ.Id, pQ.Tags, pQ.CreationDate AS QuestionDate, DATEDIFF(SECOND, pQ.CreationDate, pA.CreationDate) AS ResponseTimeSeconds FROM dbo.Posts pQ INNER JOIN dbo.Posts pA ON pQ.AcceptedAnswerId = pA.Id WHERE pQ.PostTypeId = 1; GO CREATE OR ALTER PROC dbo.usp_RptAvgAnswerTimeByTag @StartDate DATETIME, @EndDate DATETIME, @Tag NVARCHAR(50) AS BEGIN /* Changelog: 2020/05/29 James Randell - fixing bugs left over from Bilbo 2020/05/28 Bilbo Baggins - find out when fast answers are coming in */ SELECT TOP 100 YEAR(QuestionDate) AS QuestionYear, MONTH(QuestionDate) AS QuestionMonth, AVG(ResponseTimeSeconds * 1.0) AS AverageResponseTimeSeconds FROM dbo.AverageAnswerResponseTime r WHERE r.QuestionDate >= @StartDate AND r.QuestionDate < @EndDate AND r.Tags = @Tag GROUP BY YEAR(QuestionDate), MONTH(QuestionDate) ORDER BY YEAR(QuestionDate), MONTH(QuestionDate); END GO EXEC usp_RptAvgAnswerTimeByTag @StartDate = '2013-01-01', @EndDate = '2013-01-08', @Tag = '<sql-server>' GO CREATE OR ALTER PROC dbo.usp_RptFastestAnswers @StartDate DATETIME, @EndDate DATETIME, @Tag NVARCHAR(50) AS BEGIN /* Changelog: 2020/05/28 Gabriele D'Onufrio - looking for the fastest answer fingers in the West, possibly fraud */ SELECT TOP 10 r.ResponseTimeSeconds, pQuestion.Title, pQuestion.CreationDate, pQuestion.Body, uQuestion.DisplayName AS Questioner_DisplayName, uQuestion.Reputation AS Questioner_Reputation, pAnswer.Body AS Answer_Body, pAnswer.Score AS Answer_Score, uAnswer.DisplayName AS Answerer_DisplayName, uAnswer.Reputation AS Answerer_Reputation FROM dbo.AverageAnswerResponseTime r INNER JOIN dbo.Posts pQuestion ON r.Id = pQuestion.Id INNER JOIN dbo.Users uQuestion ON pQuestion.OwnerUserId = uQuestion.Id INNER JOIN dbo.Posts pAnswer ON pQuestion.AcceptedAnswerId = pAnswer.Id INNER JOIN dbo.Users uAnswer ON pAnswer.OwnerUserId = uAnswer.Id WHERE r.QuestionDate >= @StartDate AND r.QuestionDate < @EndDate AND r.Tags = @Tag ORDER BY r.ResponseTimeSeconds ASC; END GO EXEC usp_RptFastestAnswers @StartDate = '2013-01-01', @EndDate = '2014-01-08', @Tag = '<sql-server>' GO /* License: Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) More info: https://creativecommons.org/licenses/by-sa/4.0/ You are free to: * Share - copy and redistribute the material in any medium or format * Adapt - remix, transform, and build upon the material for any purpose, even commercially Under the following terms: * Attribution - You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. * ShareAlike - If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original. */ |
3 Comments. Leave new
What a shame. My organization has blocked access to Twitch.
Ouch. You might have to finally cough up and buy some Internet at home. Trust me, it’s worth it – opens up a whole new world when you can surf on your own time, too.
Could only watch it for an hour or so, but will definitely watch it later! Very nice you have Twitch now! By the way, you really have to work on your pronunciation of ‘stroopwafel’. 😛