Watch Brent Tune Queries at SQLSaturday Oslo

This morning, I had the honor of presenting virtually at SQLSaturday Oslo. The session title, Watch Brent Tune Queries, explains it all. Enjoy!

In the session, I used the 2018-06 Stack Overflow database, but any recent full-sized copy (not the 2010-2013 copies) will work. Here’s the query I worked with:

And for more sessions like this, check out the Watch Brent Tune Queries series page.

Previous Post
3 Ways to Run DBCC CHECKDB Faster
Next Post
SQL ConstantCare® Population Report: Summer 2020

8 Comments. Leave new

  • what happen at the end?
    how can I watch the rest?

    Reply
    • I’m not sure what you mean “the rest” – it was a one-hour session, and you see the whole thing up there in the video.

      Reply
      • I was expected to see you tune the query and it runs very fast
        I understand the concept of B.E. C.R.E.E.P.Y. but willing to see final optimization on this specific query.
        P.S. you are exceptional presenter 🙂 but I dont like new sound effects 🙂 it was hard to hear your voice when sound effect is playing

        Reply
  • Wojciech Sawicki
    August 31, 2020 5:18 am

    I have some doubts.
    select Id from Users u where u.DisplayName=’Jon Skeet’
    gives 22656

    SELECT SUM(p.Score) AS PostsScore
    FROM dbo.Posts p where p.OwnerUserId =22656 and
    p.CreationDate BETWEEN ‘20110901’ AND ‘20111001’
    gives 7947
    But result
    EXEC rpt_TopUsers_ByLocation
    @Location = N’Reading, United Kingdom’,
    @StartDate = ‘2011-09-01’, @EndDate = ‘2011-10-01’
    GO
    is
    Reputation DisplayName AboutMe PostsScore CommentsScore
    1047863 Jon Skeet … 9687393 677784

    A little more.
    Why?

    SELECT count(*)
    FROM dbo.Users u
    LEFT OUTER JOIN dbo.Posts p ON u.Id = p.OwnerUserId AND p.CreationDate BETWEEN ‘20110901’ AND ‘20111001’
    — LEFT OUTER JOIN dbo.Comments c ON u.Id = c.UserId AND c.CreationDate BETWEEN ‘20110901’ AND ‘20111001’
    WHERE u.Id = 22656

    =744

    SELECT count(*)
    FROM dbo.Users u
    — LEFT OUTER JOIN dbo.Posts p ON u.Id = p.OwnerUserId AND p.CreationDate BETWEEN ‘20110901’ AND ‘20111001’
    LEFT OUTER JOIN dbo.Comments c ON u.Id = c.UserId AND c.CreationDate BETWEEN ‘20110901’ AND ‘20111001’
    WHERE u.Id = 22656
    =1219

    SELECT count(*)
    FROM dbo.Users u
    LEFT OUTER JOIN dbo.Posts p ON u.Id = p.OwnerUserId AND p.CreationDate BETWEEN ‘20110901’ AND ‘20111001’
    LEFT OUTER JOIN dbo.Comments c ON u.Id = c.UserId AND c.CreationDate BETWEEN ‘20110901’ AND ‘20111001’
    WHERE u.Id = 22656
    =906936 (1219*744)

    Join Users Posts gives not unique u.Id and for each row we join dbo.Comments

    if we change logic:
    join Users Posts sum PostsScore get top 1000 and result (each row have unique Id) join with Comments
    It works much faster 🙂

    Reply
    • Yep, very good! That’s one of the goals of webcasts like this: to get you to download the Stack Overflow database, use these queries for practice, and learn to do performance tuning yourself. Good job!

      Reply
  • Original query is wrong -> produces duplicates:
    select top 1000 u.Reputation, u.DisplayName, u.AboutMe,
    sum(p.Score) as PostsScore,
    sum(c.Score) as CommentsScore
    from dbo.Users u
    left join dbo.Posts p on u.Id
    = p.OwnerUserId and
    p.CreationDate between @startdate and @enddate
    left join dbo.Comments c on u.Id
    = c.UserId and
    c.CreationDate between @startdate and @enddate
    where u.Location = @Location
    –and u.DisplayName != N’Jon Skeet’
    group by u.Reputation, u.DisplayName, u.AboutMe
    order by sum(p.Score) desc

    Example:
    select top 1000 u.Reputation, u.DisplayName, u.AboutMe,
    p.Id
    as PostId,
    p.Score as PostsScore,

    c.Id
    as CommentId,
    c.Score as CommentsScore,
    u.Id
    as UserId
    from dbo.Users u
    left join dbo.Posts p on u.Id
    = p.OwnerUserId and
    p.CreationDate between ‘2009-06-01’ and ‘2009-08-01’
    left join dbo.Comments c on u.Id
    = c.UserId and
    c.CreationDate between ‘2009-06-01’ and ‘2009-08-01′
    where u.Location = N’Reading, United Kingdom’
    and u.DisplayName = N’YiSh’

    Data:
    – user has 2 posts and 2 comments and there are 4 rows so the sum will be higher than normal
    Reputation DisplayName AboutMe PostId PostsScore CommentId CommentsScore UserId
    300 YiSh 1130449 3 956428 0 94814
    300 YiSh 1130449 3 956448 0 94814
    300 YiSh 1138599 1 956428 0 94814
    300 YiSh 1138599 1 956448 0 94814

    If you try with user:
    and u.DisplayName = N’Graeme Bradbury’
    This user has 11 posts and 7 comments and original query returns 77 rows. So sums are wrong.

    There is one user and you have join with 11 posts so there are 11 records and the you joins it with 7 comments and you have 77 records.

    In original query user has:
    3515 Graeme Bradbury Windows Azure Contractor 210 33

    But when you drill down to records in each table it should have:
    6 3515 Graeme Bradbury Windows Azure Contractor 30 3

    Posts score is multiplied by 7 (count on comments) and comment score is multiplied by 11 (count on Posts).

    Reply
  • My implementation:
    – I was working on StackOverflow 2010
    – original query was 2:08 minutes
    – after indexing it was 0:51 minute
    – the main problem is with aggregation cause it takes
    – after switching to temp tables – 0 seconds
    – also original query is wrong – there are duplicates -> see separate comment
    ——————————————————————————————————————–
    create or alter proc dbo.rpt_topusers_bylocation
    @Location nvarchar(100), @startdate date, @enddate date
    as
    begin
    create table #Users (Id int, Reputation int, DisplayName nvarchar(40), AboutMe nvarchar(max))
    create table #Posts(OwnerUserId int, score int)
    create table #Comments(UserId int, score int)
    –users
    insert into #Users
    select u.Id
    , u.Reputation, u.DisplayName, u.AboutMe
    from dbo.Users u
    where u.Location = @Location
    –posts
    insert into #Posts
    select p.OwnerUserId, sum(p.Score) as PostsScore
    from dbo.Posts p
    where p.OwnerUserId in (select ID from #Users)
    and p.CreationDate between @startdate and @enddate
    group by p.OwnerUserId
    –comments
    insert into #Comments
    select c.UserId, sum(c.Score) as CommentsScore
    from dbo.Comments c
    where c.UserId in (select ID from #Users)
    and c.CreationDate between @startdate and @enddate
    group by c.UserId
    –main select
    select top 1000 u.Reputation, u.DisplayName, u.AboutMe,
    p.score as PostsScore,
    c.score as CommentsScore
    from #Users u
    left join #Comments c on u.Id
    = c.UserId
    left join #Posts p on u.Id
    = p.OwnerUserId
    order by p.score desc

    end
    go

    Reply

Leave a Reply

Your email address will not be published. Required fields are marked *

Fill out this field
Fill out this field
Please enter a valid email address.