Can AI Rewrite Bad Queries in Seconds? Kinda.
When I see horrific code, stuff that would take a lot of manual labor to fix, I like to lob it over to an LLM like ChatGPT just to see how it does.
For example, on a call with a client, I opened up one of their slowest pieces of code to find that it was running a cursor, processing data row by row rather than working in sets. To simulate it, I’ve written up a few stored procedures against the Stack Overflow database:
- Posts_Update_AnswerCount takes a single PostId, and updates the AnswerCount of that row
- Posts_Update_CommentCount does the same thing, but for a different column
- Post_Update_AllCounts calls both of the above for a list of PostIds, looping through them one post at a time
Transact-SQL
CREATE OR ALTER PROC dbo.Posts_Update_AnswerCount
    @PostId INT
AS
BEGIN
    UPDATE dbo.Posts
    SET AnswerCount = (SELECT SUM(1) FROM dbo.Posts pA WHERE pA.ParentId = @PostId)
    WHERE Id = @PostId;
END
GO

CREATE OR ALTER PROC dbo.Posts_Update_CommentCount
    @PostId INT
AS
BEGIN
    UPDATE dbo.Posts
    SET CommentCount = (SELECT SUM(1) FROM dbo.Comments c WHERE PostId = @PostId)
    WHERE Id = @PostId;
END
GO

CREATE OR ALTER PROC dbo.Post_Update_AllCounts
    @PostIds VARCHAR(MAX)
AS
BEGIN
    DECLARE @Current_PostId INT;

    DECLARE postids_cursor CURSOR FOR
        SELECT value FROM string_split(@PostIds, ',');

    OPEN postids_cursor;
    FETCH NEXT FROM postids_cursor INTO @Current_PostId;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        EXEC dbo.Posts_Update_AnswerCount @Current_PostId;
        EXEC dbo.Posts_Update_CommentCount @Current_PostId;
        FETCH NEXT FROM postids_cursor INTO @Current_PostId;
    END

    CLOSE postids_cursor;
    DEALLOCATE postids_cursor;
END
GO
When I look at that code, sure, conceptually I know what work has to be done – but it’s stupid manual labor. So I copied that code into a text file and asked ChatGPT, “Please rewrite this Microsoft SQL Server code into a more performant version.” That’s about as simple a prompt as you can get!
Here’s the code ChatGPT 4o came up with in the first pass, and you can read its full response to see its justifications:
Transact-SQL
CREATE OR ALTER PROC dbo.Posts_Update_AllCounts
    @PostIds VARCHAR(MAX)
AS
BEGIN
    -- Convert the comma-separated string into a table
    DECLARE @PostIdTable TABLE (PostId INT);
    INSERT INTO @PostIdTable (PostId)
    SELECT CAST(value AS INT)
    FROM string_split(@PostIds, ',');

    -- Update AnswerCount for all specified posts
    UPDATE p
    SET AnswerCount = sub.AnswerCount
    FROM dbo.Posts p
    INNER JOIN (
        SELECT pA.ParentId, COUNT(*) AS AnswerCount
        FROM dbo.Posts pA
        INNER JOIN @PostIdTable t ON pA.ParentId = t.PostId
        GROUP BY pA.ParentId
    ) sub ON p.Id = sub.ParentId;

    -- Update CommentCount for all specified posts
    UPDATE p
    SET CommentCount = sub.CommentCount
    FROM dbo.Posts p
    INNER JOIN (
        SELECT c.PostId, COUNT(*) AS CommentCount
        FROM dbo.Comments c
        INNER JOIN @PostIdTable t ON c.PostId = t.PostId
        GROUP BY c.PostId
    ) sub ON p.Id = sub.PostId;
END;
GO
Is it better than what we started with? Yes. Is it good enough that I wanna put it into production? No, because it’s still making two separate passes over the exact same rows in the Posts table, updating them one column at a time.
I replied to ChatGPT with, “Can you make it even faster and more efficient?” The result:
Transact-SQL
CREATE OR ALTER PROC dbo.Posts_Update_AllCounts
    @PostIds VARCHAR(MAX)
AS
BEGIN
    -- Parse the comma-separated PostIds into a table variable
    DECLARE @PostIdTable TABLE (PostId INT PRIMARY KEY);
    INSERT INTO @PostIdTable (PostId)
    SELECT DISTINCT CAST(value AS INT)
    FROM string_split(@PostIds, ',');

    -- Use a common table expression (CTE) for answer and comment counts
    WITH AnswerCounts AS (
        SELECT pA.ParentId AS PostId, COUNT(*) AS AnswerCount
        FROM dbo.Posts pA
        WHERE pA.ParentId IN (SELECT PostId FROM @PostIdTable)
        GROUP BY pA.ParentId
    ),
    CommentCounts AS (
        SELECT c.PostId, COUNT(*) AS CommentCount
        FROM dbo.Comments c
        WHERE c.PostId IN (SELECT PostId FROM @PostIdTable)
        GROUP BY c.PostId
    )
    -- Perform updates in a single pass
    UPDATE p
    SET AnswerCount = ISNULL(ac.AnswerCount, 0),
        CommentCount = ISNULL(cc.CommentCount, 0)
    FROM dbo.Posts p
    LEFT JOIN AnswerCounts ac ON p.Id = ac.PostId
    LEFT JOIN CommentCounts cc ON p.Id = cc.PostId
    WHERE p.Id IN (SELECT PostId FROM @PostIdTable);
END;
GO
Now we’re getting somewhere! I love that it used CTEs with null protection, too, to handle the case where a post might have no answers or comments. I gave it one last shot by asking, “Could you make it faster, simpler, and easier to read?” but it reverted back to the two-update solution.
I then passed the same code and same prompts over to Google’s Gemini 2.0 Flash Thinking Experimental model. Its first result was similar to ChatGPT’s in that it had two separate update statements, but after prompting again to make it faster and more efficient, it came to this conclusion:
Transact-SQL
CREATE OR ALTER PROC dbo.Post_Update_AllCounts
    @PostIds VARCHAR(MAX)
AS
BEGIN
    -- Using a table variable for better performance than a cursor
    DECLARE @PostIdTable TABLE (PostId INT PRIMARY KEY);
    INSERT INTO @PostIdTable (PostId)
    SELECT value
    FROM STRING_SPLIT(@PostIds, ',');

    -- Update AnswerCounts and CommentCounts in a single, highly efficient operation
    UPDATE p
    SET AnswerCount = ISNULL(ac.AnswerCount, 0),
        CommentCount = ISNULL(cc.CommentCount, 0)
    FROM dbo.Posts p
    INNER JOIN @PostIdTable pt ON p.Id = pt.PostId
    LEFT JOIN (
        SELECT ParentId, COUNT(*) AS AnswerCount
        FROM dbo.Posts
        WHERE ParentId IN (SELECT PostId FROM @PostIdTable)
        GROUP BY ParentId
    ) AS ac ON p.Id = ac.ParentId
    LEFT JOIN (
        SELECT PostId, COUNT(*) AS CommentCount
        FROM dbo.Comments
        WHERE PostId IN (SELECT PostId FROM @PostIdTable)
        GROUP BY PostId
    ) AS cc ON p.Id = cc.PostId;
END
GO
Instead of using a pair of CTEs like ChatGPT 4o, Gemini Flash 2.0 Thinking decided to use a pair of left join subqueries. I’m amused that both platforms chose to dump the contents of string_split into a table variable rather than using a temp table or using the string_split directly in the update’s where clause.
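For reference, the temp table variant that neither model chose might look something like this sketch. This is my illustration, not output from either LLM; whether the temp table actually helps depends on your workload, since temp tables get statistics that table variables mostly don't, which can change the cardinality estimates on the joins:

Transact-SQL

CREATE OR ALTER PROC dbo.Post_Update_AllCounts
    @PostIds VARCHAR(MAX)
AS
BEGIN
    -- Temp table instead of a table variable: it gets statistics,
    -- which can help the optimizer when the list of PostIds is large.
    CREATE TABLE #PostIds (PostId INT PRIMARY KEY);
    INSERT INTO #PostIds (PostId)
    SELECT DISTINCT CAST(value AS INT)
    FROM STRING_SPLIT(@PostIds, ',');

    -- Same single-pass update shape as the LLM rewrites above
    UPDATE p
    SET AnswerCount = ISNULL(ac.AnswerCount, 0),
        CommentCount = ISNULL(cc.CommentCount, 0)
    FROM dbo.Posts p
    INNER JOIN #PostIds pt ON p.Id = pt.PostId
    LEFT JOIN (
        SELECT pA.ParentId, COUNT(*) AS AnswerCount
        FROM dbo.Posts pA
        INNER JOIN #PostIds t ON pA.ParentId = t.PostId
        GROUP BY pA.ParentId
    ) AS ac ON p.Id = ac.ParentId
    LEFT JOIN (
        SELECT c.PostId, COUNT(*) AS CommentCount
        FROM dbo.Comments c
        INNER JOIN #PostIds t ON c.PostId = t.PostId
        GROUP BY c.PostId
    ) AS cc ON p.Id = cc.PostId;
END
GO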
In both cases, the LLM (ChatGPT or Gemini) is able to change the query, but it may take multiple prompts to get to the best result – and the results don’t necessarily get better with every prompt. Today, these tools work best in the hands of someone who actually knows the results that they’re looking for, like you, dear reader.
That’s not to say you shouldn’t use them! This example here represents a really good starting point use case. In the client’s example (and this has happened several times), the LLM was able to give me a good-enough starting point in seconds, saving me a lot of time with manual labor on the rewrite. I still needed to do manual labor – I just had to do less, and that’s a win, because, uh, I’m lazy.
Hi! I’m Brent Ozar.
I make Microsoft SQL Server go faster. I love teaching, travel, cars, and laughing. I’m based out of Las Vegas. He/him. I teach SQL Server training classes, or if you haven’t got time for the pain, I’m available for consulting too.
12 Comments
“amused that both platforms chose to dump the contents of string_split into a table variable rather than using a temp table”
Does “amused” mean it was cheeky and found a good solution that goes against common advice, or does it mean it should have used a temp table?
Check out my Fundamentals of Query Tuning class to find out! WINK WINK
I knew that was coming 🙂
Being lazy is, in my opinion, the most important attribute of a good DBA. If you’re running something by hand that you can automate, you’re not lazy enough. If there’s a clearly defined troubleshooting path for a common issue and you don’t have it in a script, you’re not lazy enough.
I’ve used it for debugging silly syntax errors and for checking whether a statement is portable to another database platform. It’s not bad. It’s kind of like the old 4GLs like Focus and other natural languages in the (distant) past. It’s perfectly fine if you don’t ask it to do too much.
I actually have been involved in doing this exact same thing. I’ve started a project to fix some of our ETL databases that have not been touched in over 12 years. They are a god-awful mess, written by a Lotus Notes guy with no SQL experience at all.
It just so happens that I started this project the day after I finished “Fundamentals of Query Tuning” and 2 weeks after completing “Fundamentals of Index Tuning,” which were a great help in doing this and really reinforced the training from those courses.
I’ve been using AWS’s Claude and ChatGPT, and I have to say Claude has given me the better results. Not perfect all the time, because LLMs are never perfect, but results that were more accurate for the task. (IMO)
Here is Claude’s result after 2 tweaks of the 1st response.
CREATE OR ALTER PROC dbo.Posts_Update_Counts
@PostIds VARCHAR(MAX) = NULL,
@SinglePostId INT = NULL
AS
BEGIN
SET NOCOUNT ON;
WITH PostIdsCTE AS (
SELECT DISTINCT CAST(value AS INT) AS PostId
FROM STRING_SPLIT(ISNULL(@PostIds, CAST(@SinglePostId AS VARCHAR(20))), ',')
WHERE value IS NOT NULL
),
AnswerCounts AS (
SELECT
ParentId,
COUNT(*) OVER(PARTITION BY ParentId) as answer_count
FROM dbo.Posts WITH (NOLOCK)
WHERE ParentId IN (SELECT PostId FROM PostIdsCTE)
),
CommentCounts AS (
SELECT
PostId,
COUNT(*) OVER(PARTITION BY PostId) as comment_count
FROM dbo.Comments WITH (NOLOCK)
WHERE PostId IN (SELECT PostId FROM PostIdsCTE)
)
UPDATE p WITH (ROWLOCK)
SET
AnswerCount = ISNULL(a.answer_count, 0),
CommentCount = ISNULL(c.comment_count, 0)
FROM dbo.Posts p
INNER JOIN PostIdsCTE pid ON p.Id = pid.PostId
LEFT JOIN (
SELECT DISTINCT ParentId, answer_count
FROM AnswerCounts
) a ON p.Id = a.ParentId
LEFT JOIN (
SELECT DISTINCT PostId, comment_count
FROM CommentCounts
) c ON p.Id = c.PostId
OPTION (RECOMPILE);
END;
GO
Just for comparison’s sake, I asked Copilot to do the same thing, and here are the results:
CREATE OR ALTER PROC dbo.Posts_Update_AllCounts
@PostIds VARCHAR(MAX)
AS
BEGIN
-- Convert the comma-separated string into a table
DECLARE @PostIdTable TABLE (PostId INT);
INSERT INTO @PostIdTable (PostId)
SELECT CAST(value AS INT)
FROM string_split(@PostIds, ',');
-- Update AnswerCount and CommentCount for all specified posts
MERGE dbo.Posts AS p
USING (
SELECT t.PostId,
ISNULL(a.AnswerCount, 0) AS AnswerCount,
ISNULL(c.CommentCount, 0) AS CommentCount
FROM @PostIdTable t
LEFT JOIN (
SELECT pA.ParentId, COUNT(*) AS AnswerCount
FROM dbo.Posts pA
GROUP BY pA.ParentId
) a ON t.PostId = a.ParentId
LEFT JOIN (
SELECT c.PostId, COUNT(*) AS CommentCount
FROM dbo.Comments c
GROUP BY c.PostId
) c ON t.PostId = c.PostId
) AS src
ON p.Id = src.PostId
WHEN MATCHED THEN
UPDATE SET
p.AnswerCount = src.AnswerCount,
p.CommentCount = src.CommentCount;
END;
GO
First pass.
Hey Brent, I just used the same code from your blog and asked Amazon Q, and here’s the version it produced in the first round. Looks like another good variation from the other LLMs, and it uses CROSS APPLY instead of LEFT JOINs:
CREATE OR ALTER PROCEDURE dbo.Post_Update_AllCounts
@PostIds VARCHAR(MAX)
AS
BEGIN
UPDATE p
SET p.AnswerCount = (
SELECT COUNT(*)
FROM dbo.Posts pA
WHERE pA.ParentId = p.Id
),
p.CommentCount = (
SELECT COUNT(*)
FROM dbo.Comments c
WHERE c.PostId = p.Id
)
FROM dbo.Posts p
CROSS APPLY STRING_SPLIT(@PostIds, ',') AS ids
WHERE p.Id = CAST(ids.value AS INT);
END
GO
Hi Brent and thanks for the post!
My personal experience is that AI can do a surprisingly good job, as long as you give it enough instructions. I’ve been testing various services quite a lot in the past months, and it has taught me a lot about being patient and clear with the AI “helpers”. In that sense, I guess they’re quite similar to people 🙂
So, to me, it’s been a journey of finding the correct balance of lazy. Most of the time I would rather not re-write all of that T-SQL, but these days I put effort into writing the prompt (which is still less effort than re-writing all the T-SQL). If you specify what the query should look like, what parts you need optimized, and what it should use (temp tables, CTEs, etc.), you’ll get much better code out of it.
And that goes for follow-up prompts as well. If you just ask it to “make it faster” or “make it easier to read,” and keep asking, it’ll eventually go completely crazy and give you complete garbage. I’ve tested this; it goes bad surprisingly quickly! But if you’re specific, like “make this part of the code more efficient by doing X,” that’s where it can make your work much faster.
Just curious whether paying for LLM service subscriptions is part of the process here. Do you have a sense as to which ones make sense to pay for, or to ask my employer to pay for?
I find many of these not fully correct: when using SET var1 = ISNULL(count1, 0), you are setting it to zero because there is currently no count. However, prior to your update there could have been an existing count, which you have now set to zero. If you used SET var1 = ISNULL(count1, var1), you would preserve the prior value when there is no current count. I suspect one cannot blame the LLM/AI, as the issue is present in the initial query given to it.
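To sketch the behavior the commenter describes, here’s a hypothetical variant of the single-pass UPDATE (assuming the same AnswerCounts and CommentCounts CTEs and @PostIdTable from the rewrites above): COALESCE falls back to the column’s existing value instead of zero when a post has no matching rows, which preserves any prior count rather than overwriting it.

Transact-SQL

-- Hypothetical variant: keep the column's current value instead of
-- forcing it to zero when no answers/comments are found.
UPDATE p
SET AnswerCount  = COALESCE(ac.AnswerCount, p.AnswerCount),
    CommentCount = COALESCE(cc.CommentCount, p.CommentCount)
FROM dbo.Posts p
LEFT JOIN AnswerCounts  ac ON p.Id = ac.PostId
LEFT JOIN CommentCounts cc ON p.Id = cc.PostId
WHERE p.Id IN (SELECT PostId FROM @PostIdTable);

Note that whether zeroing or preserving is “correct” depends on intent: the original cursor version’s SUM(1) subquery produces NULL, not zero, when no rows match, so none of these rewrites exactly reproduce the original behavior.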