Never Judge A Query By Its Cost

Signs and Numbers

When tuning queries, or even finding queries to tune, there’s a rather misguided desire to look for queries with a high cost, or to judge improvement by how far the query cost drops. The problem is that no matter what you’re looking at, costs are estimates, and they often don’t reflect how long a query runs or the actual work involved in processing it.

You typically need to observe the query running and gather more precise metrics on what it’s really doing to judge things.
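
As a minimal sketch of what gathering those metrics can look like in a query window, the session settings below ask SQL Server to report CPU time, elapsed time, and logical reads for each statement. The query in the middle is only a placeholder against dbo.Users, and the QualifyingUsers alias is made up for illustration; swap in whatever you're actually tuning.

```sql
-- Report CPU/elapsed time and per-table reads for statements in this session.
SET STATISTICS TIME, IO ON;

-- Placeholder query: replace with the statement you are tuning.
SELECT COUNT(*) AS QualifyingUsers
FROM   dbo.Users AS u
WHERE  u.CreationDate > '20180401'
AND    u.Reputation > 100;

-- Turn the extra output back off when you are done.
SET STATISTICS TIME, IO OFF;
```

The numbers show up in the Messages tab in SSMS, not in the result grid.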

Good Example

I have these two indexes:

CREATE INDEX ix_users
    ON dbo.Users ( CreationDate, Reputation, Id );
GO

CREATE INDEX ix_posts
    ON dbo.Posts ( OwnerUserId, Id )
    INCLUDE ( PostTypeId )
    WHERE PostTypeId = 1;
GO

And when I run this query, I would expect the optimizer to use both of them to help with the join and where clause, and then do a Key Lookup against the clustered index to get the columns my nonclustered index on Users doesn’t cover.

SELECT u.*, --Everything from Users
       p.Id AS PostId
FROM   dbo.Users AS u
JOIN   dbo.Posts AS p
    ON p.OwnerUserId = u.Id
WHERE  u.CreationDate > '20180401'
AND    u.Reputation > 100
AND    p.PostTypeId = 1
ORDER BY u.Id;

After all, there are only about 3000 rows in the Users table that qualify for my predicates, and that only turns into about 7300 rows when joined to Posts. That seems like one of those “no brainer” places to do a key lookup, but it doesn’t happen.

Grinchy.

The metrics we get from stats time/io look like this (trimmed for brevity):

Table 'Posts'. Scan count 5, logical reads 36740
Table 'Users'. Scan count 5, logical reads 150649
Table 'Worktable'. Scan count 0, logical reads 0

 SQL Server Execution Times:
   CPU time = 8172 ms,  elapsed time = 4736 ms.

So about 5 seconds of wall clock time, 8 seconds of CPU time, and 150k logical reads from Users.

If we hint our index to force the optimizer to use it, we get the plan we’re after.

SELECT u.*,
       p.Id AS PostId
FROM   dbo.Users AS u WITH (INDEX = ix_users) --TAKE A HINT
JOIN   dbo.Posts AS p
    ON p.OwnerUserId = u.Id
WHERE  u.CreationDate > '20180401'
AND    u.Reputation > 100
AND    p.PostTypeId = 1
ORDER BY u.Id;

Fancy Feast

The metrics we get from stats time/io look like this (trimmed for brevity):

Table 'Users'. Scan count 5, logical reads 11076
Table 'Worktable'. Scan count 0, logical reads 0
Table 'Posts'. Scan count 5, logical reads 36964
Table 'Worktable'. Scan count 0, logical reads 0

 SQL Server Execution Times:
   CPU time = 1187 ms,  elapsed time = 780 ms.

Interesting! We’re down to 780 milliseconds of wall clock time, about 1.2 seconds of CPU time, and a lot fewer reads from the Users table: roughly 11k instead of 150k.

This is clearly the faster plan, so what gives?

Scrooged

The much slower query has a lower cost!

Fraudity

The optimizer assigns a fairly high cost to the key lookup. This comes from two things:

The optimizer assumes data comes from a cold buffer cache, meaning no data is in memory
The optimizer hates I/O, especially random I/O

I’m gonna go out on a limb and say the costing algorithm still thinks about disks like they’re creaky old spinning rust, and that it’s not hip to SSDs. I don’t know this for a fact, and I’d love to hear otherwise.

When you put it all together, you get a query plan that prefers to do one sequential scan of the clustered index, because the optimizer assumes no data is in memory, and because it thinks doing a Key Lookup via random I/O, or even sorting data first, would be more expensive.
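
If you want to see what that cold-cache assumption looks like in practice, one option is to empty the buffer pool before each test run so both plans really do start from disk. This is only a sketch for a test server you control: DBCC DROPCLEANBUFFERS evicts cached data pages for the whole instance, so it has no business running in production.

```sql
-- TEST SERVERS ONLY: this affects every database on the instance.
-- Flush dirty pages to disk so they become clean and can be dropped.
CHECKPOINT;

-- Remove all clean pages from the buffer pool to simulate a cold cache.
DBCC DROPCLEANBUFFERS;

-- Capture timings and reads for whatever runs next in this session.
SET STATISTICS TIME, IO ON;

-- Now re-run the hinted and unhinted queries, clearing the cache before each.
```

With a cold cache, the STATISTICS IO output will also show physical reads and read-ahead reads alongside the logical reads.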

Costs can be helpful to figure out why a certain plan was chosen, but they shouldn’t be used to judge what the best plan is.
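
To make that concrete when you're hunting for queries to tune, here's a rough sketch that ranks cached queries by CPU actually burned and shows the estimated cost next to it for comparison. The column aliases are made up for illustration, it needs VIEW SERVER STATE permission, and the cost expression only reads the first statement in each cached plan, so treat that column as approximate for multi-statement batches.

```sql
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (20)
       st.text AS BatchText,
       qs.execution_count,
       -- total_worker_time and total_elapsed_time are reported in microseconds.
       qs.total_worker_time   / qs.execution_count / 1000. AS AvgCpuMs,
       qs.total_elapsed_time  / qs.execution_count / 1000. AS AvgElapsedMs,
       qs.total_logical_reads / qs.execution_count         AS AvgLogicalReads,
       -- Estimated cost of the first statement in the cached plan (approximate).
       qp.query_plan.value('(//StmtSimple/@StatementSubTreeCost)[1]', 'float') AS EstimatedCost
FROM   sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_worker_time DESC; -- rank by work actually done, not by cost
```

Sorting by measured CPU (or reads, or elapsed time) puts the queries that really hurt at the top, whatever cost the optimizer gave them.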

Thanks for reading!

Brent says: in SQL Server’s defense, I bet most plan caches are chock full of beautiful query plans where the costs accurately reflect the amount of work involved. Say maybe 99.9% of your queries get perfectly crafted plans. Still, that’s 1 in 1,000 plans that go south, and those are the ones you’re typically focused on tuning. You’re tuning them because the estimates are wrong – and that’s why you don’t wanna put faith in those estimated cost numbers. If they were accurate, you probably wouldn’t be tuning ’em.
