If we’ve gotta take the database down for maintenance – perhaps a version upgrade, perhaps upgrading our own code, maybe scaling up the hardware – when’s the best time to do it?
For this week’s query exercise, the business has asked us to identify 3 1-hour periods with the least user activity, in this format:
- Mondays, 1:00-1:59
- Saturdays, 23:00-23:59
- Sundays, 2:00-2:59
Pop open the Stack Overflow database, and for the sake of simplicity for this exercise:
- We’re just going to look at the users table. We’ll judge load by times when the fewest number of users are created, judging by the users.creationdate column.
- We’re not dealing with time zones.
- We’re only looking for one-hour time ranges that start with 00 and end with 59, like 1:00-1:59. We’re using a little shorthand there and not typing out the seconds, but the point is that we don’t want you saying “the best time is 2:15:05 to 3:14:04.” That’s delightfully ambitious – but let’s keep it simple here. Not everyone has as much free time as you, bud.
Pick a 28-day window so that you’re looking at several weeks of trending activity, and come up with your one-hour preferences. (I expect folks to come up with widely different answers depending on which 28-day window they pick, and which copy of the database they’re using. That’s okay! The goal is to write the query, and then check your logic against the gotchas that I discuss next week.)
The performance of your query isn’t a concern here – this isn’t a query we’ll be running on a regular basis. We just need to know the quietest times in the database.
You can post your answers in this blog post’s comments, and discuss each others’ ideas. We’ll revisit your answers in this post. Have fun!
46 Comments. Leave new
/*I had a similar question a while ago and am curious if my solution was a good one.
The stackoverflow database i used is the 10GB 2010 version. */
/* First i look to see if there are edge cases in the data
Because of much higher numbers in september 2008 we avoid that month */
SELECT CAST(u.CreationDate AS date), COUNT(*)
FROM dbo.Users AS u
GROUP BY CAST(u.CreationDate AS date)
ORDER BY 2 DESC
/* I see no further data issues apart from maybe deleted users */
SELECT u.CreationDate, COUNT(*)
FROM dbo.Users AS u
GROUP BY u.CreationDate
ORDER BY 2 DESC
SELECT u.CreationDate FROM dbo.Users AS u WHERE CreationDate IS NULL
SELECT u.CreationDate FROM dbo.Users AS u WHERE CreationDate = wh.HourStart
AND CAST(u.CreationDate AS time) = ‘2010-11-01’
AND u.CreationDate < '2010-11-29'
GROUP BY ResultText
), Ranking AS(
SELECT
ResultText
, UsersCreated
, DENSE_RANK() OVER(ORDER BY UsersCreated) AS DenseRank /* In the case there are ties */
FROM UsersCreated
)
SELECT
ResultText
, UsersCreated
, DenseRank
FROM Ranking
WHERE DenseRank <= 3
/* This excersise actually made me realize the query i delivered earlier has the potential for errors since i did not account for NULL results and ties. Apart from the (potential) errors i am making and not realizing */
My query got messed up while posting. I chopped it up in multiple posts.
/*I had a similar question a while ago and am curious if my solution was a good one.
The stackoverflow database i used is the 10GB 2010 version. */
/* First i look to see if there are edge cases in the data
Because of much higher numbers in september 2008 we avoid that month */
SELECT CAST(u.CreationDate AS date), COUNT(*)
FROM dbo.Users AS u
GROUP BY CAST(u.CreationDate AS date)
ORDER BY 2 DESC
/* I see no further data issues apart from maybe deleted users */
SELECT u.CreationDate, COUNT(*)
FROM dbo.Users AS u
GROUP BY u.CreationDate
ORDER BY 2 DESC
SELECT u.CreationDate FROM dbo.Users AS u WHERE CreationDate IS NULL
SELECT u.CreationDate FROM dbo.Users AS u WHERE CreationDate < '2008-01-01'
SELECT MIN(u.CreationDate), MAX(u.CreationDate) FROM dbo.Users AS u
/* To make the grouping easier i create a temp table to join the data on */
DROP TABLE IF EXISTS #Hours
CREATE TABLE #Hours(
HourStart time NOT NULL PRIMARY KEY
, HourEnd time NOT NULL
)
INSERT INTO #Hours(HourStart, HourEnd)
VALUES (’00:00:00.000′, ’01:00:00.000′)
, (’01:00:00.000′, ’02:00:00.000′)
, (’02:00:00.000′, ’03:00:00.000′)
, (’03:00:00.000′, ’04:00:00.000′)
, (’04:00:00.000′, ’05:00:00.000′)
, (’05:00:00.000′, ’06:00:00.000′)
, (’06:00:00.000′, ’07:00:00.000′)
, (’07:00:00.000′, ’08:00:00.000′)
, (’08:00:00.000′, ’09:00:00.000′)
, (’09:00:00.000′, ’10:00:00.000′)
, (’10:00:00.000′, ’11:00:00.000′)
, (’11:00:00.000′, ’12:00:00.000′)
, (’12:00:00.000′, ’13:00:00.000′)
, (’13:00:00.000′, ’14:00:00.000′)
, (’14:00:00.000′, ’15:00:00.000′)
, (’15:00:00.000′, ’16:00:00.000′)
, (’16:00:00.000′, ’17:00:00.000′)
, (’17:00:00.000′, ’18:00:00.000′)
, (’18:00:00.000′, ’19:00:00.000′)
, (’19:00:00.000′, ’20:00:00.000′)
, (’20:00:00.000′, ’21:00:00.000′)
, (’21:00:00.000′, ’22:00:00.000′)
, (’22:00:00.000′, ’23:00:00.000′)
, (’23:00:00.000′, ’23:59:59.9999999′)
DROP TABLE IF EXISTS #WeekDays
CREATE TABLE #WeekDays(
WeekDayName varchar(10) PRIMARY KEY
)
INSERT INTO #WeekDays(WeekDayName)
VALUES
(‘Monday’)
, (‘Tuesday’)
, (‘Wednesday’)
, (‘Thursday’)
, (‘Friday’)
, (‘Saturday’)
, (‘Sunday’)
DROP TABLE IF EXISTS #WeekDayHours
CREATE TABLE #WeekDayHours(
WeekDayName varchar(10) NOT NULL
, HourStart time NOT NULL
, HourEnd time NOT NULL
, ResultText varchar(100) NOT NULL
, PRIMARY KEY (WeekDayName, HourStart)
)
INSERT INTO #WeekDayHours
SELECT
WeekDayName
, HourStart
, HourEnd
/* We create the resulting text here since we cant do it in the final table. We are doing a FULL OUTER JOIN there in case of 0 records created */
/* The result string will have leading zero’s but i hope the business wont notice so i dont have to write out the substrings (LTRIM wont work to get rid of the leading 0 since i installed SQL server 2019 on my own PC)*/
, CONCAT(WeekDayName , ‘s, ‘, LEFT(CAST(HourStart AS varchar(16)), 5), ‘-‘, LEFT(CAST(HourEnd AS varchar(16)), 5)) AS ResultText
FROM #WeekDays
CROSS JOIN #Hours
/* I join the users table using the created column on there respective time windows and day of the week
Then i aggregate to the day of the week and the time windows.
First i used an ORDER BY and TOP(3) but i noticed i had a tie on my third result. So i switched to using DENSE_RANK and a WHERE clause to also show the tie.
*/
;WITH UsersCreated AS(
SELECT
ResultText
, COUNT(*) AS UsersCreated
FROM dbo.Users AS u
FULL OUTER JOIN #WeekDayHours AS wh ON DATENAME(WEEKDAY, u.CreationDate) = wh.WeekDayName /* FULL OUTER JOIN in case there are NULL values */
AND CAST(u.CreationDate AS time) >= wh.HourStart
AND CAST(u.CreationDate AS time) = ‘2010-11-01’
AND u.CreationDate < '2010-11-29'
GROUP BY ResultText
), Ranking AS(
SELECT
ResultText
, UsersCreated
, DENSE_RANK() OVER(ORDER BY UsersCreated) AS DenseRank /* In the case there are ties */
FROM UsersCreated
)
SELECT
ResultText
, UsersCreated
, DenseRank
FROM Ranking
WHERE DenseRank <= 3
/* This excersise actually made me realize the query i delivered earlier has the potential for errors since i did not account for NULL results and ties. Apart from the (potential) errors i am making and not realizing */
Remko, just some friendly advice: before you answer the next one, you may want to set a 24-hour timer. Write your first version, test it, then sleep on it before posting. That’s 5 answers you’ve posted in the span of a few hours, hahah. The query’s in good shape now though!
/*
Utilized an older StackOverflow restore dated 2020-10-27 according to the zips
Used a 28-day range starting 2018-06-01
I checked not only the avg # of new users for each weekday/hour pair in the 28-day window, but also did a LEAD check on the next hour to see if our maintenance window ran OT due to a contingency
Best windows appear to be
Sunday, 00:00 – 00:59 (with the least amount of delta in users created the next hour)
Sunday, 01:00 – 01:59 (but with a large amount of delta in case the window runs OT)
Saturday, 00:00 – 00:59 (with slightly larger delta than the first one)
*/
DECLARE @Start datetime2(0)=’2018-06-01′
CREATE TABLE #HourRanges
(HourRangeStart datetime2(0),
HourRangeEnd datetime2(0),
AggregateLabel varchar(25)
)
;WITH Permutations(HourRangeStart,HourRangeEnd,HourNumber) AS
(SELECT
DATEADD(HOUR,HOURNUMBER,DATEADD(DAY,DAYNUMBER,@Start)),
DATEADD(SECOND,59,DATEADD(MINUTE,59,DATEADD(HOUR,HOURNUMBER,DATEADD(DAY,DAYNUMBER,@Start)))),
HourNumber
FROM
(SELECT TOP 24 HourNumber=ROW_NUMBER() OVER (ORDER BY (SELECT ‘1’ A))-1 FROM sys.objects) Hours
CROSS JOIN
(SELECT TOP 28 DayNumber=ROW_NUMBER() OVER (ORDER BY (SELECT ‘1’ A))-1 FROM sys.objects) Days
)
INSERT INTO #HourRanges(HourRangeStart,HourRangeEnd,AggregateLabel)
SELECT
HourRangeStart,HourRangeEnd,FORMAT(HourRangeStart,’dddd, HH:mm’) + ‘ – ‘ + FORMAT(HourRangeEnd,’HH:mm’)
FROM
Permutations
;WITH FullRange(EarliestStart,LatestEnd) AS
(SELECT MIN(HourRangeStart),MAX(HourRangeEnd) FROM #HourRanges)
SELECT TOP 3 AggregateLabel,AVG(Tally) AvgTally,AVG(NextTally) AvgTally_NextHour,AVG(NextTally-Tally) AvgTally_Delta FROM (
SELECT AggregateLabel,HourRangeStart,HourRangeEnd,Tally,
LEAD(Tally,1) OVER (ORDER BY HourRangeStart,HourRangeEnd) NextTally
FROM (
SELECT DISTINCT
AggregateLabel,HourRangeStart,HourRangeEnd
, SUM(CAST(1 as float)) OVER (PARTITION BY HourRangeStart,HourRangeEnd) Tally
FROM
[StackOverflow].[dbo].[Users] U
INNER JOIN
#HourRanges H
ON
U.CreationDate>=H.HourRangeStart AND U.CreationDate=F.EarliestStart AND U.CreationDate<=F.LatestEnd
) Base
) Base2
GROUP BY AggregateLabel
ORDER BY AvgTally,AvgTally_Delta
DROP TABLE #HourRanges
This doesn’t compile – you’re referencing F.EarliestStart and F.LatestEnd, but there’s no object in here aliased as F.
Hi everyone!
Before sharing the query, some context:
– I am using Brent training classes Stackoverflow DB: specialized copy as of 2018/06.
– Because business/exercise question relates to a recent period, I am using a 28 day time window that matches to this period, as close as possible according to available data. Specifically, I am querying the most recent creation date of an user that is available and then subtracting 28 days to it, which should give us what we want – I mean, there is no point to infer user creation activity on older dates, as the data distribution is quite different (tested).
My query is as follows:
— =================================================================
— Author: Sergio Torrinha
—
— Creation Date: 19-01-2024
— Description: This query is intended to answer Brent Ozar exercice available here: https://www.brentozar.com/archive/2024/01/query-exercise-find-the-best-time-for-maintenance/
—
— =================================================================
USE StackOverflow;
DECLARE
— fetching date of most recent user created
@MostRecentUserDate AS DATE = ( SELECT MAX( CreationDate ) FROM dbo.Users )
;
WITH UsersCreated AS (
— CTE that returns the number of users created per hour for each day of the week, in most recent 28 days
SELECT
DATENAME( WEEKDAY, CreationDate ) WeekDayName
,DATEPART(HOUR, CreationDate) HourNbr
,COUNT(1) TotalUsersCreated
FROM dbo.Users
WHERE
— filtering Users table, to fetch users created in most recent 28 days of records
— this will allow us to focus on most recent user creation activity to comply with exercice requests
CreationDate >= DATEADD(DAY, – 28, @MostRecentUserDate )
GROUP BY DATENAME( WEEKDAY, CreationDate ), DATEPART(HOUR, CreationDate)
),
UserCreationActivityRanking AS (
— CTE that returns the ranking of less user creation activity on a weekday and hourly level
SELECT
g.*
— the first 3 records, will indicate the days and hour numbers with less user creation activity
,DENSE_RANK() OVER( ORDER BY TotalUsersCreated ) LessActiveDay
FROM (
SELECT
WeekDayName
,HourNbr
,TotalUsersCreated
— the nbr 1 ranking will derive the less active hour within each weekday
,DENSE_RANK() OVER( PARTITION BY WeekDayName ORDER BY TotalUsersCreated ) LessActiveHourWithinDay
FROM UsersCreated
) g
WHERE LessActiveHourWithinDay = 1
)
SELECT
WeekDayName
–this will nicely format the hour intervals to comply with exercice/business requirement
,CAST( HourNbr AS VARCHAR(2) ) + ‘:00-‘ + CAST( HourNbr AS VARCHAR(8) ) + ‘:59’ HourInterval
FROM UserCreationActivityRanking
WHERE
— filtering only top 3 less active day/hours
LessActiveDay < 4
;
Looking forward to see what's wrong with it. =]
Thank you
/*
We need to find at least three 1 hour time slots over a period of 28 days with the least activity to decide which slots we can use for maintenance.
This is a classic “Gaps and Islands” question, so we need to set up a range containing all the possible hours in this 28 day window.
24 * 28 = 672 distinct hours, so setting up a range of 1000 unique values is enough.
Our range will start at: 2013-11-01
Based on the period 2013-11-01 – 2013-11-28
The best windows are in order:
Sunday, 1:00-1:59
Saterday, 5:00-5:59
Sunday, 0:00-0:59
*/
/* Populate DateRange with values */
DECLARE @StartDateTime datetime = N’2013-11-01 00:00:00′;
DECLARE @EndDateTime datetime = DATEADD(DAY, 28, @StartDateTime);
/* Thanks Omar on the SQL discord server for this beauty */
WITH Ten AS
(
SELECT
[Ten] = v.[Ten] /* Range 0-9 */
,[Weigth] = COUNT(v.[Ten]) OVER () /* 10 */
FROM
(
VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)
) AS v ([Ten])
), Hundred AS
(
SELECT
[Hundred] = Ten.[Ten] + (Ten.[Weigth] * Hundred.[Ten]) /* Range 0-99 */
,[Weigth] = Ten.[Weigth] * Ten.[Weigth] /* 10 * 10 = 100 */
FROM Ten AS Ten
CROSS APPLY Ten AS Hundred
), Thousand as
(
SELECT
[Thousand] = Ten.[Ten] + (Ten.[Weigth] * Hundred.[Hundred]) /* Range 0-999 */
,[Weigth] = Ten.[Weigth] * Hundred.[Weigth] /* 10 * 100 = 1,000 */
FROM Ten AS Ten
CROSS APPLY Hundred AS Hundred
), Dates AS
(
SELECT
[Date] = DATEADD(HOUR, [Thousand], @StartDateTime)
,[Window] = CASE DATEPART(WEEKDAY, DATEADD(HOUR, [Thousand], @StartDateTime))
WHEN 1 THEN N’Sunday’
WHEN 2 THEN N’Monday’
WHEN 3 THEN N’Tuesday’
WHEN 4 THEN N’Wednesday’
WHEN 5 THEN N’Thursday’
WHEN 6 THEN N’Friday’
WHEN 7 THEN N’Saterday’
END
+ N’, ‘
+ CAST(DATEPART(HOUR, DATEADD(HOUR, [Thousand], @StartDateTime)) AS nvarchar(2)) + N’:00-‘ +
+ CAST(DATEPART(HOUR, DATEADD(HOUR, [Thousand], @StartDateTime)) AS nvarchar(2)) + N’:59′
FROM Thousand
WHERE DATEADD(HOUR, [Thousand], @StartDateTime) < @EndDateTime
)
SELECT TOP 3
d.[Window]
FROM dbo.[Users] AS u
LEFT OUTER JOIN Dates AS d
ON DATEADD(HOUR, DATEPART(HOUR, u.[CreationDate]), CAST(CAST(u.[CreationDate] AS date) AS datetime)) = d.[Date] /* Strip the minutes, seconds, etc. from the CreationDate leaving only the date and the hour which we can match against our date range CTE */
GROUP BY d.[Window]
ORDER BY COUNT(*) ASC;
The Dates CTE can be further simplified with with a DATENAME instead of the CASE:
Dates AS
(
SELECT
[Date] = DATEADD(HOUR, [Thousand], @StartDateTime)
,[Window] = DATENAME(WEEKDAY, DATEADD(HOUR, [Thousand], @StartDateTime)) + N’, ‘
+ CAST(DATEPART(HOUR, DATEADD(HOUR, [Thousand], @StartDateTime)) AS nvarchar(2)) + N’:00-‘ +
+ CAST(DATEPART(HOUR, DATEADD(HOUR, [Thousand], @StartDateTime)) AS nvarchar(2)) + N’:59′
FROM Thousand
WHERE DATEADD(HOUR, [Thousand], @StartDateTime) < @EndDateTime
)
Messed up the LEFT JOIN and the Count (like Thomas Franz), it should have been the Date CTE on the Left and dbo.Users on the right, and the COUNT needs to be some field in the users table:
/*
We need to find at least three 1 hour time slots over a period of 28 days with the least activity to decide which slots we can use for maintenance.
This is a classic “Gaps and Islands” question, so we need to set up a range containing all the possible hours in this 28 day window.
24 * 28 = 672 distinct hours, so setting up a range of 1000 unique values is enough.
Our range will start at: 2013-11-01
Based on the period 2013-11-01 – 2013-11-28
The best windows are in order:
Sunday, 1:00-1:59
Saterday, 5:00-5:59
Sunday, 0:00-0:59
*/
/* Populate DateRange with values */
DECLARE @StartDateTime datetime = N’2013-11-01 00:00:00′;
DECLARE @EndDateTime datetime = DATEADD(DAY, 28, @StartDateTime);
/* Thanks Omar on the SQL discord server for this beauty */
WITH Ten AS
(
SELECT [Ten] = v.[Ten] /* Range 0-9 */, [Weigth] = COUNT(v.[Ten]) OVER () /* 10 */
FROM (VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) AS v ([Ten])
), Hundred AS
(
SELECT [Hundred] = Ten.[Ten] + (Ten.[Weigth] * Hundred.[Ten]) /* Range 0-99 */, [Weigth] = Ten.[Weigth] * Ten.[Weigth] /* 10 * 10 = 100 */
FROM Ten AS Ten CROSS APPLY Ten AS Hundred
), Thousand as
(
SELECT [Thousand] = Ten.[Ten] + (Ten.[Weigth] * Hundred.[Hundred]) /* Range 0-999 */, [Weigth] = Ten.[Weigth] * Hundred.[Weigth] /* 10 * 100 = 1,000 */
FROM Ten AS Ten CROSS APPLY Hundred AS Hundred
), Dates AS
(
SELECT [Date] = DATEADD(HOUR, [Thousand], @StartDateTime)
,[Window] = DATENAME(WEEKDAY, DATEADD(HOUR, [Thousand], @StartDateTime)) + N’, ‘
+ CAST(DATEPART(HOUR, DATEADD(HOUR, [Thousand], @StartDateTime)) AS nvarchar(2)) + N’:00-‘ +
+ CAST(DATEPART(HOUR, DATEADD(HOUR, [Thousand], @StartDateTime)) AS nvarchar(2)) + N’:59′
FROM Thousand
WHERE DATEADD(HOUR, [Thousand], @StartDateTime) < @EndDateTime
)
SELECT TOP 3 d.[Window]
FROM Dates AS d
LEFT OUTER JOIN dbo.[Users] AS u
ON d.[Date] = DATEADD(HOUR, DATEPART(HOUR, u.[CreationDate]), CAST(CAST(u.[CreationDate] AS date) AS datetime)) /* Strip the minutes, seconds, etc. from the CreationDate leaving only the date and the hour which we can match against our date range CTE */
GROUP BY d.[Window]
ORDER BY COUNT(u.[Id]) ASC;
Got to “love” the SQL Execution plan on this one though:
https://www.brentozar.com/pastetheplan/?id=BkNlGADta
SQL Server couldn’t even get the guess for the users table remotely right XD (123285650% Actual 2465713 of Estimate 2) thanks to the DATEADD(HOUR, DATEPART(HOUR, u.[CreationDate]), CAST(CAST(u.[CreationDate] AS date) AS datetime)) in the join
So I went in and optimize the query a little, by storing the User Activity in a Temp table we get an exec plan that is a little nicer:
https://www.brentozar.com/pastetheplan/?id=rkMp8AwKT
(Even Clippy is “suggesting” things now)
In terms of execution time it didn’t matter, I just have some spare time will waiting for my ETL test job to finish.
/*
We need to find at least three 1 hour time slots over a period of 28 days with the least activity to decide which slots we can use for maintenance.
This is a classic “Gaps and Islands” question, so we need to set up a range containing all the possible hours in this 28 day window.
24 * 28 = 672 distinct hours, so setting up a range of 1000 unique values is enough.
Our range will start at: 2013-11-01
Based on the period 2013-11-01 – 2013-11-28
The best windows are in order:
Sunday, 1:00-1:59
Saterday, 5:00-5:59
Sunday, 0:00-0:59
*/
/* Populate DateRange with values */
DECLARE @StartDateTime datetime = N’2013-11-01 00:00:00′;
DECLARE @EndDateTime datetime = DATEADD(DAY, 28, @StartDateTime);
IF OBJECT_ID(N’tempdb..#UserActivity’) IS NOT NULL
DROP TABLE #UserActivity;
CREATE TABLE #UserActivity
(
[ActivityDate] datetime
);
INSERT INTO #UserActivity ([ActivityDate])
SELECT [ActivityDate] = DATEADD(HOUR, DATEPART(HOUR, [CreationDate]), CAST(CAST([CreationDate] AS date) AS datetime)) /* Strip the minutes, seconds, etc. from the CreationDate leaving only the date and the hour which we can match against our date range CTE */
FROM dbo.[Users]
WHERE [CreationDate] >= @StartDateTime AND [CreationDate] < @EndDateTime;
/* Thanks Omar on the SQL discord server for this beauty */
WITH Ten AS
(
SELECT [Ten] = v.[Ten] /* Range 0-9 */, [Weigth] = COUNT(v.[Ten]) OVER () /* 10 */
FROM(VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)) AS v ([Ten])
), Hundred AS
(
SELECT [Hundred] = Ten.[Ten] + (Ten.[Weigth] * Hundred.[Ten]) /* Range 0-99 */, [Weigth] = Ten.[Weigth] * Ten.[Weigth] /* 10 * 10 = 100 */
FROM Ten AS Ten CROSS APPLY Ten AS Hundred
), Thousand as
(
SELECT [Thousand] = Ten.[Ten] + (Ten.[Weigth] * Hundred.[Hundred]) /* Range 0-999 */, [Weigth] = Ten.[Weigth] * Hundred.[Weigth] /* 10 * 100 = 1,000 */
FROM Ten AS Ten CROSS APPLY Hundred AS Hundred
), Dates AS
(
SELECT [Date] = DATEADD(HOUR, [Thousand], @StartDateTime)
,[Window] = DATENAME(WEEKDAY, DATEADD(HOUR, [Thousand], @StartDateTime)) + N', '
+ CAST(DATEPART(HOUR, DATEADD(HOUR, [Thousand], @StartDateTime)) AS nvarchar(2)) + N':00-' +
+ CAST(DATEPART(HOUR, DATEADD(HOUR, [Thousand], @StartDateTime)) AS nvarchar(2)) + N':59'
FROM Thousand
WHERE DATEADD(HOUR, [Thousand], @StartDateTime) < @EndDateTime
)
SELECT TOP 3 d.[Window]
FROM Dates AS d
LEFT OUTER JOIN #UserActivity AS u ON d.[Date] = u.[ActivityDate]
GROUP BY d.[Window]
ORDER BY COUNT(u.[ActivityDate]) ASC;
IF OBJECT_ID(N'tempdb..#UserActivity') IS NOT NULL
DROP TABLE #UserActivity;
I made a very complicated query before, there were some steps in the query that were totally not needed and I went and optimized it even further:
Eliminated the temp table completed as well as all CTEs. Although the temp table can possibly knock down the 15 megabyte memory grant even further… but 15 Mb isn’t *that* bad 😀
/*
We need to find at least three 1 hour time slots over a period of 28 days with the least activity to decide which slots we can use for maintenance.
This is a classic ” Gaps and Islands” question, so we need to set up a range containing all the possible hours in the per day window.
Our range will start at: 2013-11-01
Based on the period 2013-11-01 – 2013-11-28
The best windows are in order:
Sunday, 1:00-1:59
Saterday, 5:00-5:59
Sunday, 0:00-0:59
*/
SELECT TOP 3 v.[DayName] + N’, ‘ + HoursInTheDay.[HourWindow] + N’00-‘ + HoursInTheDay.[HourWindow] + N’59’
FROM (VALUES
(N’Monday’), (N’Tuesday’), (N’Wednesday’), (N’Thursday’), (N’Friday’), (N’Saturday’), (N’Sunday’)
) AS v ([DayName])
CROSS APPLY (
SELECT v.[HourWindow]
FROM (VALUES
( N’0:’), ( N’1:’), ( N’2:’), ( N’3:’), ( N’4:’), ( N’5:’), ( N’6:’), ( N’7:’),
( N’8:’), ( N’9:’), (N’10:’), (N’11:’), (N’12:’), (N’13:’), (N’14:’), (N’15:’),
(N’16:’), (N’17:’), (N’18:’), (N’19:’), (N’20:’), (N’21:’), (N’22:’), (N’23:’)
) AS v ([HourWindow])
) AS HoursInTheDay
OUTER APPLY (
SELECT [Activity] = 1
FROM dbo.[Users] AS u
WHERE u.[CreationDate] >= N’2013-11-01 00:00:00.000′
AND u.[CreationDate] < DATEADD(DAY, 28, N'2013-11-01 00:00:00.000')
AND v.[DayName] + N', ' + HoursInTheDay.[HourWindow] + N'00-' + HoursInTheDay.[HourWindow] + N'59' = DATENAME(WEEKDAY, u.[CreationDate]) + N', '
+ CAST(DATEPART(HOUR, u.[CreationDate]) AS nvarchar(2)) + N':00-'
+ CAST(DATEPART(HOUR, u.[CreationDate]) AS nvarchar(2)) + N':59'
) AS u
GROUP BY v.[DayName] + N', ' + HoursInTheDay.[HourWindow] + N'00-' + HoursInTheDay.[HourWindow] + N'59'
ORDER BY COUNT(u.[Activity]) ASC;
The exec plan for those interested 🙂
https://www.brentozar.com/pastetheplan/?id=ry6KiGdFa
Short (compared to all the other scripts) and simple and fast (< 1 sec on the StackOverFlow2013):
DECLARE @period_start DATETIME = '20120403' — CreationDate is a datetime, so we should use the same datatype here
SELECT t.weekday_name, t.start_hour, COUNT(*) AS created_users
FROM (VALUES ('Monday', 1)
, ('Saturday', 23)
, ('Sunday', 2)
) AS t (weekday_name, start_hour)
INNER JOIN dbo.Users AS u
ON u.CreationDate BETWEEN @period_start AND DATEADD(DAY, 29, @period_start) — this would give us 0:00:00 of the 29th day too, but since no of the start_hours in t is = 0, this is no problem
AND DATENAME(WEEKDAY, u.CreationDate) = t.weekday_name
AND DATEPART(HOUR, u.CreationDate) = t.start_hour
GROUP BY t.weekday_name, t.start_hour
ORDER BY created_users
;
Used 29 instead of 28 days by mistake, but this should be no problem to change 🙂
small correction: to be accurate, I should have had used a LEFT JOIN instead of INNER JOIN and COUNT(u.CreationDate) instead of COUNT(*), since there could be an hour without any new user in one of the three time frames (e.g. Monday, 2008-08-04 01:00 – 01:59:59).
Yep, this is a classic “Gaps and Islands” question so a LEFT JOIN is required.
Shouldn’t this be a list of all weekdays and all hours per day?
“FROM (VALUES (‘Monday’, 1)
, (‘Saturday’, 23)
, (‘Sunday’, 2)
) AS t (weekday_name, start_hour)”
Martin – bingo, yeah, Thomas is missing a whole lot of days/hours in here.
True, I misunderstood the text in the goal and thought, that Business told me that only this 3 time frames would be allowed. And in this case – why should I bother to check all other 165 hours, where exists already a No-Go-Sign from business?
PS: there would be much fewer spam if you would allow us to edit our orginal posts 🙂
You’re absolutely right about editing comments! It didn’t even occur to me. I was already on the hunt for something that would let y’all have syntax highlighting in the comments, but you’ve prompted me to go poke around in there again and see what I can do to raise the game here, heh.
[…] Query Exercise: Find the Best Time for Maintenance (Brent Ozar) […]
Hi folks!
Database: StackOverflow2013 (50GB)
Not sure if I didn’t take something into account in my query but here it is.
Best window appear to be (with ties):
Sunday, 4:00-4:59
Thursday, 6:00-6:59
Monday, 6:00-6:59
Friday, 5:00-5:59
Friday, 2:00-2:59
DECLARE @StartDateTime datetime = N’2008-08-01 00:00:0.000′;
DECLARE @EndDateTime datetime = DATEADD(DAY, 28, @StartDateTime);
WITH CTE AS
(
SELECT
CreationDate,
DATEPART(HH, CreationDate) AS ‘Hour’,
DATENAME(DW, CreationDate) AS ‘DayName’
FROM dbo.Users
WHERE CreationDate BETWEEN @StartDateTime AND @EndDateTime
)
SELECT TOP 3 WITH TIES DayName, Hour, COUNT(CreationDate) AS ‘CreatedUsersCount’,
DayName + ‘, ‘ + CAST(Hour AS VARCHAR) + ‘:00-‘ + CAST(Hour AS VARCHAR) + ‘:59’ AS ‘Formated’
FROM CTE
GROUP BY DayName, Hour
ORDER BY COUNT(CreationDate);
GO
You’re missing something. If within the 28 days you are considering on every Monday between 2:00-2:59 no accounts have been created (although unlikely) you’ll missing that combination from your results while that would be the best window as there is no activity then.
Bingo!
My dog ate that part 😀 So that was the tricky thing inside..
Corrected, hope it’s fine now.
DROP TABLE IF EXISTS #Dates;
GO
CREATE TABLE #Dates ( [Date] VARCHAR(10), [hour] INT )
DECLARE @startDate datetime = ‘2008-08-01 00:00:0.000’
DECLARE @endDate datetime = DATEADD(DAY, 28, @StartDate)
WHILE @startDate < @endDate
BEGIN
INSERT INTO #Dates ([Date], [hour]) VALUES (CAST(@startDate AS date),DATEPART(HH,@startDate));
SET @startDate = DATEADD(HOUR, 1, @startDate);
END
— SOLUTION
;WITH CTE AS (
SELECT
d.*
,CASE WHEN u.CreationDate IS NULL THEN 0 ELSE 1 END AS 'Existance'
FROM #Dates d
LEFT JOIN dbo.Users u
ON d.Date = CAST(u.CreationDate AS date) AND d.hour = DATEPART(HH, u.creationdate)
)
SELECT TOP 3 WITH TIES DATENAME(DW, Date) +', ' + CAST(Hour AS VARCHAR) + ':00-' + CAST(Hour AS VARCHAR) + ':59' AS 'Formated'
FROM CTE
GROUP BY DATENAME(DW, Date), hour
ORDER BY SUM(Existance);
DROP TABLE IF EXISTS #Dates;
GO
Select top 3 DATEPART(DAY,CreationDate)d,DATEPART(HOUR,CreationDate)h,count(*)c
from UsersCreated
where CreationDate >=’12/1/2023′ and CreationDate <'12/28/2023'
group by DATEPART(DAY,CreationDate),DATEPART(HOUR,CreationDate )
order by 3
While your query is nice and short, you’re missing something.
If within the 28 days you are considering on every Monday between 2:00-2:59 no accounts have been created (although unlikely) you’ll missing that combination from your results while that would be the best window as there is no activity then.
fair point it would have to be all mondays during that time. In a low utilization db could add some code to catch the zeros.
few errors above guess that is what i get for going quick lol
same idea this should be a very simple query noting this isn’t something that will be run very often.
Weekday here rather than day takes Day, then added some formatting for the time
Select top 3 datename(weekday,GETDATE())d,cast(DATEPART(HOUR,getdate())as varchar) +’:00-‘+cast(DATEPART(HOUR,getdate())as varchar)+’:59’h,count(*)c
from UsersCreated
where CreationDate >=’12/1/2023? and CreationDate <'12/29/2023'
group by datename(weekday,GETDATE()),cast(DATEPART(HOUR,getdate())as varchar) +':00-'+cast(DATEPART(HOUR,getdate())as varchar)+':59'
order by 3
few errors above guess that is what i get for going quick lol
same idea this should be a very simple query noting this isn’t something that will be run very often.
Weekday here rather than day takes Day, then added some formatting for the time
Select top 3 datename(weekday,GETDATE())d,cast(DATEPART(HOUR,getdate())as varchar) +’:00-‘+cast(DATEPART(HOUR,getdate())as varchar)+’:59’h,count(*)c
from UsersCreated
where CreationDate >=’12/1/2023? and CreationDate <'12/29/2023'
group by datename(weekday,GETDATE()),cast(DATEPART(HOUR,getdate())as varchar) +':00-'+cast(DATEPART(HOUR,getdate())as varchar)+':59'
order by 3
USE StackOverflow2013
GO
;WITH dates AS (
SELECT DATEADD ( DAY, ROW_NUMBER () OVER ( ORDER BY weeks.n ) – 1, ‘2010-02-01’ ) AS date
FROM ( VALUES (1), (2), (3), (4) ) weeks (n)
CROSS JOIN ( VALUES (1), (2), (3), (4), (5), (6), (7) ) days (n)
)
, hourly_windows AS (
SELECT DATEADD ( HOUR, hours.n + am_pm_offset.i , d.date ) AS start_datetime
, DATEADD ( HOUR, hours.n + am_pm_offset.i + 1, d.date ) AS end_datetime
FROM dates d
CROSS JOIN ( VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11)) hours ( n )
CROSS JOIN ( VALUES (0), (12)) am_pm_offset(i)
)
SELECT TOP ( 3 ) FORMAT ( hw.start_datetime, ‘D’ , ‘en-US’ ) AS date_fmt
, FORMAT ( hw.start_datetime, ‘HH:mm’, ‘en-US’ ) + ‘-‘
+ FORMAT ( DATEADD ( MINUTE, -1, hw.end_datetime ), ‘HH:mm’, ‘en-US’ ) AS time_range_24hr_fmt
, ( SELECT COUNT ( 1 / 0 )
FROM dbo.Users u
WHERE u.CreationDate >= hw.start_datetime
AND u.CreationDate < hw.end_datetime /* Can't use BETWEEN */
) AS activity
FROM hourly_windows hw
GROUP BY hw.start_datetime
, hw.end_datetime
ORDER BY activity ASC
, hw.start_datetime ASC;
Just realized that the GROUP BY clause in the main query is no longer needed as I moved the aggregation into a subquery.
Third time’s the charm! Sorry, left out the most important requirement. Here’s the updated query:
USE StackOverflow2013;
GO
;WITH params AS (
SELECT ‘2010-09-01’ AS four_weeks_starting_on
)
, dates AS (
SELECT DATEADD ( DAY, ROW_NUMBER () OVER ( ORDER BY weeks.n ) – 1, p.four_weeks_starting_on ) AS date
FROM ( VALUES (1), (2), (3), (4) ) weeks (n)
CROSS JOIN ( VALUES (1), (2), (3), (4), (5), (6), (7) ) days (n)
CROSS JOIN params p
)
, hourly_windows AS (
SELECT DATEADD ( HOUR, hours.n + am_pm_offset.i , d.date ) AS start_datetime
, DATEADD ( HOUR, hours.n + am_pm_offset.i + 1, d.date ) AS end_datetime
FROM dates d
CROSS JOIN ( VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11)) hours ( n )
CROSS JOIN ( VALUES (0), (12)) am_pm_offset(i)
)
, hourly_activity AS (
SELECT hw.start_datetime
, hw.end_datetime
, ( SELECT COUNT ( 1 / 0 )
FROM dbo.Users u
WHERE u.CreationDate >= hw.start_datetime
AND u.CreationDate < hw.end_datetime /* Can't use BETWEEN */
) AS activity
FROM hourly_windows hw
)
SELECT TOP ( 3 ) p.four_weeks_starting_on
, FORMAT ( ha.start_datetime, 'dddd', 'en-US' ) AS day_of_week
, FORMAT ( ha.start_datetime, 'HH:mm', 'en-US' ) + '-'
+ FORMAT ( DATEADD ( MINUTE, -1, ha.end_datetime ), 'HH:mm', 'en-US' ) AS time_range_24hr_fmt
, MAX ( ha.activity ) AS max_activity
FROM hourly_activity ha
CROSS JOIN params p
GROUP BY p.four_weeks_starting_on
, DATEPART ( WEEKDAY, ha.start_datetime )
, FORMAT ( ha.start_datetime, 'dddd', 'en-US' )
, FORMAT ( ha.start_datetime, 'HH:mm', 'en-US' ) + '-'
+ FORMAT ( DATEADD ( MINUTE, -1, ha.end_datetime ), 'HH:mm', 'en-US' )
ORDER BY max_activity ASC
, day_of_week
, time_range_24hr_fmt;
/*
four_weeks_starting_on day_of_week time_range_24hr_fmt max_activity
2010-09-01 Sunday 07:00-07:59 9
2010-09-01 Saturday 23:00-23:59 10
2010-09-01 Saturday 07:00-07:59 11
*/
I’m sure Brent or others will point out something I missed, but this seems to perform reasonably well, should account for “missing” hours, and creates an output that exactly matches Brent’s example.
/* using StackOverflow 2013 DB */
/* check range of values, pick slice used in final query within range */
SELECT MIN(CreationDate) AS MinDT, MAX(CreationDate) AS MaxDT FROM Users;
/*
create a lookup table that has the display text
result output should look like: “Mondays, 1:00-1:59”
don’t use DATENAME because Brent wants English/US day names (though we could use SET LANGUAGE)
IF we did use DATENAME, be sure to append “, ” to match Brent’s desired output text
*/
CREATE TABLE #dt(Wkdy tinyint, Hr tinyint, DayHourText varchar(30) );
DECLARE @wd tinyint = 1, @hr tinyint = 0
SET NOCOUNT ON;
WHILE @wd <= 7
BEGIN
WHILE @hr = ‘2013-04-01 00:00’ AND u.CreationDate < '2013-04-29 00:00' /* April Fool's day to clock tick BEFORE 29th day after */
GROUP BY d.DayHourText
ORDER BY UserCreations ASC;
/* 94% of work is Clustered Index Scan on Users table; it does use push-down predicate on CreationDate range */
DROP TABLE #dt;
Well, that post got chopped. Let’s try this in a couple steps. Part 1 of 2 (hopefully).
/* using StackOverflow 2013 DB */
/* check range of values, pick slice used in final query within range */
SELECT MIN(CreationDate) AS MinDT, MAX(CreationDate) AS MaxDT FROM Users;
/* create a lookup table that has the display text
result output should look like: “Mondays, 1:00-1:59”
don’t use DATENAME because Brent wants English/US day names (though we could use SET LANGUAGE)
IF we did use DATENAME, be sure to append “, ” to match Brent’s desired output text
*/
CREATE TABLE #dt(Wkdy tinyint, Hr tinyint, DayHourText varchar(30) );
DECLARE @wd tinyint = 1, @hr tinyint = 0
SET NOCOUNT ON;
WHILE @wd <= 7
BEGIN
WHILE @hr <= 23
BEGIN
INSERT INTO #dt(Wkdy, Hr, DayHourText)
VALUES(@wd, @hr,
CASE
WHEN @wd = 1 THEN 'Sundays, '
WHEN @wd = 2 THEN 'Mondays, '
WHEN @wd = 3 THEN 'Tuesdays, '
WHEN @wd = 4 THEN 'Wednesdays, '
WHEN @wd = 5 THEN 'Thursdays, '
WHEN @wd = 6 THEN 'Fridays, '
WHEN @wd = 7 THEN 'Saturdays, '
END + CONVERT(varchar(30), @hr) + ':00-' + CONVERT(varchar(30), @hr) + ':59'
);
SET @hr = @hr + 1;
END
SET @hr = 0;
SET @wd = @wd + 1;
END;
SET NOCOUNT OFF;
CREATE CLUSTERED INDEX cx_dt ON #dt(Hr, Wkdy) WITH (DATA_COMPRESSION=PAGE);
OK, cool. Now, if anyone in the world runs this, will it return the same result? (Hint: is 1 always Sunday?)
Yes, because part 2 of the code (below) uses “SET DATEFIRST 7;” to force Sunday to be weekday 1. Bazinga!
And part 2 of 2.
/* SELECT @@DATEFIRST */
SET DATEFIRST 7; /* I don’t care if Payroll thinks the week starts on Fridays */
/*
Get first 3; if there are ties, one’s as good as the other
Not dealing with time zones (e.g., if CreationDate is UTC rather than local, result will be UTC hours
DO NOT make an index on CreationDate for one-time query
Users.CreationDate is datetime so no string interpretation issues in DATEPART
Users.Id is PK – COUNT that; COUNT(*) returns 1 instead of 0 for OJ non-matched values
*/
SELECT TOP 3 d.DayHourText, COUNT(u.Id) AS UserCreations
FROM #dt AS d LEFT OUTER JOIN dbo.Users u
ON d.Wkdy = DATEPART(weekday,u.CreationDate) AND d.Hr = DATEPART(hour,u.CreationDate)
AND u.CreationDate >= ‘2013-04-01 00:00’ AND u.CreationDate < '2013-04-29 00:00' /* April Fool's day to clock tick BEFORE 29th day after */
GROUP BY d.DayHourText
ORDER BY UserCreations ASC;
–
DECLARE @Slots TABLE
( SWeekDay TINYINT
, SHour TINYINT )
— Populate Slot Table
INSERT @Slots
SELECT w.value, h.value
FROM GENERATE_SERIES(1, 7) w
JOIN GENERATE_SERIES(0, 23) h
ON 1 = 1
—
SELECT TOP 3
WeekDay = CASE s.SWeekDay
WHEN 1 THEN ‘Sunday’
WHEN 2 THEN ‘Monday’
WHEN 3 THEN ‘Tuesday’
WHEN 4 THEN ‘Wednesday’
WHEN 5 THEN ‘Thrusday’
WHEN 6 THEN ‘Friday’
WHEN 7 THEN ‘Saturday’
END
, Hour = CONCAT(S.SHour, ‘:00 – ‘, s.SHour, ‘:59′)
, Activity = COUNT(*)
FROM @Slots
s
LEFT
JOIN ( SELECT wd = DATEPART(dw, CreationDate)
, h = DATEPART(hh, CreationDate)
FROM dbo.Users
WHERE CreationDate BETWEEN N’2013-11-01 00:00:00.000′ AND DATEADD(DAY, 28, N’2013-11-01 00:00:00.000’)
) u
ON u.wd = s.SWeekDay
AND u.h = s.SHour
GROUP BY s.SWeekDay
, s.SHour
ORDER BY COUNT(*)
, s.SWeekDay
, s.SHour
OK, cool. Now, if anyone in the world runs this, will it return the same result? (Hint: is 1 always Sunday?)
Any where in the world and language:
—
SET LANGUAGE German
DECLARE @StartDate DATE = ‘2013-11-01’
—
SELECT TOP 3
DayOfTheWeek = DATENAME(WEEKDAY, MIN(u.CreationDate))
, Hour = CONCAT(S.SHour, ‘:00 – ‘, s.SHour, ‘:59’)
, Activity = COUNT(*)
FROM ( SELECT SWeekDay = w.value, SHour = h.value
FROM GENERATE_SERIES(1, 7) w
JOIN GENERATE_SERIES(0, 23) h
ON 1 = 1
) s
LEFT
JOIN ( SELECT wd = DATEPART(dw, CreationDate)
, h = DATEPART(hh, CreationDate)
, CreationDate
FROM dbo.Users
WHERE CreationDate BETWEEN @StartDate AND DATEADD(DAY, 28, @StartDate)
) u
ON u.wd = s.SWeekDay
AND u.h = s.SHour
GROUP BY s.SWeekDay
, s.SHour
ORDER BY COUNT(*)
, s.SWeekDay
, s.SHour
Nicely done! Great work.
Working when there is no activity on any hour.
—
SET LANGUAGE German
DECLARE @StartDate DATETIME = ‘Nov 01, 2013’
—
SELECT TOP 3
DayOfTheWeek = MIN(s.Sdate)
, Hour = CONCAT(S.SHour, ‘:00 – ‘, s.SHour, ‘:59’)
, Activity = COUNT(CreationDate)
FROM ( SELECT SWeekDay = DATEPART(WEEKDAY, DATEADD(DAY, w.value – 1, ‘1900-01-01’)) — 1900-01-01 is Monday
, Sdate = DATENAME(WEEKDAY, DATEADD(DAY, w.value – 1, ‘1900-01-01’) )
, SHour = h.value
FROM GENERATE_SERIES(1, 7) w
JOIN GENERATE_SERIES(0, 23) h
ON 1 = 1
) s
LEFT
JOIN ( SELECT wd = DATEPART(dw, CreationDate)
, h = DATEPART(hh, CreationDate)
, CreationDate
FROM dbo.Users
WHERE CreationDate >= @StartDate AND CreationDate < DATEADD(DAY, 28, @StartDate)
–AND NOT (DATEPART(dw, CreationDate) = 5 and DATEPART(hh, CreationDate) = 5) — To Test Day/Hour without Activity
) u
ON u.wd = s.SWeekDay
AND u.h = s.SHour
GROUP BY s.SWeekDay
, s.SHour
ORDER BY COUNT(*)
, s.SWeekDay
, s.SHour