Query Plan Oddities: Two Identical Missing Indexes

Last Updated December 9, 2017

As I’ve been building labs for my Mastering Query Tuning class, I’ve been spelunking through data.stackexchange.com. It’s where anybody can write queries against the Stack Overflow databases, and share them with friends.

For example, Daniel Vandersluis wrote a query to see how many edits he has:

-- How many edits do I have?
-- Counts how many edits a user has, broken into Question and Answer edits. 
-- Also shows how close you are to getting each editor badge.
DECLARE @UserId INT = 26837;

WITH qaedits AS (
  SELECT
  (
    SELECT COUNT(*) FROM Posts
    WHERE PostTypeId = 1
    AND LastEditorUserId = Users.Id
  ) AS QuestionEdits,
  (
    SELECT COUNT(*) FROM Posts
    WHERE PostTypeId = 2
    AND LastEditorUserId = Users.Id
  ) AS AnswerEdits
  FROM Users
  WHERE Id = @UserId
),

edits AS (
  SELECT QuestionEdits, AnswerEdits, QuestionEdits + AnswerEdits AS TotalEdits
  FROM qaedits
)

SELECT QuestionEdits, AnswerEdits, TotalEdits,
  CASE WHEN TotalEdits >= 1 THEN 'Received' ELSE '0%' END AS EditorBadge,
  CASE WHEN TotalEdits >= 100
    THEN 'Received'
    ELSE Cast(TotalEdits AS varchar) + '%'
  END AS StrunkAndWhiteBadge,
  CASE WHEN TotalEdits >= 600
    THEN 'Received'
    ELSE Cast(TotalEdits / 6 AS varchar) + '%'
  END AS CopyEditorBadge
FROM edits;

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

32

33

34

35

36

37

-- How many edits do I have?

-- Counts how many edits a user has, broken into Question and Answer edits.

-- Also shows how close you are to getting each editor badge.

DECLARE @UserId INT = 26837;

WITH qaedits AS (

SELECT

(

SELECT COUNT(*) FROM Posts

WHERE PostTypeId = 1

AND LastEditorUserId = Users.Id

) AS QuestionEdits,

(

SELECT COUNT(*) FROM Posts

WHERE PostTypeId = 2

AND LastEditorUserId = Users.Id

) AS AnswerEdits

FROM Users

WHERE Id = @UserId

),

edits AS (

SELECT QuestionEdits, AnswerEdits, QuestionEdits + AnswerEdits AS TotalEdits

FROM qaedits

)

SELECT QuestionEdits, AnswerEdits, TotalEdits,

CASE WHEN TotalEdits >= 1 THEN 'Received' ELSE '0%' END AS EditorBadge,

CASE WHEN TotalEdits >= 100

THEN 'Received'

ELSE Cast(TotalEdits AS varchar) + '%'

END AS StrunkAndWhiteBadge,

CASE WHEN TotalEdits >= 600

THEN 'Received'

ELSE Cast(TotalEdits / 6 AS varchar) + '%'

END AS CopyEditorBadge

FROM edits;

When I run that on my unindexed copy of the Stack Overflow database, I get a missing index recommendation – no surprise there:

Zooming in a little, note the impact – SQL Server is kinda saying, “Wow, if I had this index, the query would be 49.9998% faster!”

Hmm. That’s an unusual impact number.

Now if you’ve been index tuning for a while, you might know that SSMS only shows the first missing index recommendation in the plan – not all of them. Therefore, when you see a missing index, it’s prudent to view the plan’s XML to see if there might be more.

And in fact, in this plan, there are TWO missing indexes – kinda:

That’s two missing index recommendations, for the same index, in the same plan, each of which will make a 49.9998% improvement in our plan.

So why is that? Take a closer look at our original query and note that it has two nearly identical CTEs at the beginning:

WITH qaedits AS (
  SELECT
  (
    SELECT COUNT(*) FROM Posts
    WHERE PostTypeId = 1
    AND LastEditorUserId = Users.Id
  ) AS QuestionEdits,
  (
    SELECT COUNT(*) FROM Posts
    WHERE PostTypeId = 2
    AND LastEditorUserId = Users.Id
  ) AS AnswerEdits
  FROM Users
  WHERE Id = @UserId
),

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

WITH qaedits AS (

SELECT

(

SELECT COUNT(*) FROM Posts

WHERE PostTypeId = 1

AND LastEditorUserId = Users.Id

) AS QuestionEdits,

(

SELECT COUNT(*) FROM Posts

WHERE PostTypeId = 2

AND LastEditorUserId = Users.Id

) AS AnswerEdits

FROM Users

WHERE Id = @UserId

),

Nearly identical – but different, one filtering for PostTypeId 1, the other for 2.

For each of those, when SQL Server optimized that part of the query, it thought, “Wow, an index on PostTypeId and LastEditorUserId sure would help here!” Therefore, it recommends two missing indexes – each of which will make the query 49.9998% faster.

That’s such a nifty example of how SQL Server breaks queries down into parts and analyzes each of them independently, even generating missing index requests independently.

Related Experiments

Erik mused – would turning on Forced Parameterization help? Here’s the plan with forced parameterization turned on, and no, that didn’t help either – we still get two identical missing index hints.

How about using the same PostTypeId in both CTEs, so it’s effectively the same query? Nope, that doesn’t help either.

Using the same variable, @PostTypeId in each CTE? Nope, same problem. In every case, SQL Server optimizes each CTE independently, generating a separate (identical) missing index for each one.

Just For Fun: Missing Index Doublemint Twins

Which led me to have a little fun – I modified both Daniel’s CTEs to have an additional join to the Comments table. Left is before, right is after with the new join:

Left is before, right is with the new join to Comments

Now, when I run the query, I get FOUR missing index requests – two identical ones on Posts, and two identical ones on Comments:

But of course, when you look at the graphical plan, all you see is the first missing index request with a 39.5856% improvement – which isn’t right either:

And that’s why you can’t trust the impact numbers, let alone the graphical plans’ recommendations or even the field order.

Query Plan Oddities: Two Identical Missing Indexes

Related Experiments

Just For Fun: Missing Index Doublemint Twins

Related

1 Comment. Leave new

Leave a Reply Cancel reply

Hi! I’m Brent Ozar.

Query Plan Oddities: Two Identical Missing Indexes

Related Experiments

Just For Fun: Missing Index Doublemint Twins

Related

1 Comment. Leave new

Leave a Reply Cancel reply

Hi! I’m Brent Ozar.

Get Free SQL Stuff