Quantcast
Channel: Erik Darling – Brent Ozar Unlimited®
Viewing all articles
Browse latest Browse all 370

Never Judge A Query By Its Cost

$
0
0

Signs and Numbers

When tuning queries, or even finding queries to tune, there’s a rather misguided desire to look for queries with a high cost, or judge improvement by lowering query cost. The problem is that no matter what you’re looking at, costs are estimates, and often don’t reflect how long a query runs for or the actual work involved in processing the query.

You typically need to observe the query running and gather more precise metrics on what it’s really doing to judge things.

Good Example

I have these two indexes:

CREATE INDEX ix_users
        ON dbo.Users ( CreationDate, Reputation, Id );
    GO
    
    CREATE INDEX ix_posts
        ON dbo.Posts ( OwnerUserId, Id )
        INCLUDE ( PostTypeId )
        WHERE PostTypeId = 1;
    GO

And when I run this query, I would expect to use both of them to help the join/where clause, and then do a Key Lookup to get the columns not covered by my nonclustered index on users from the clustered index.

SELECT u.*, --Everything from Users
	       p.Id AS PostId
	FROM   dbo.Users AS u
	JOIN   dbo.Posts AS p
	ON p.OwnerUserId = u.Id
	WHERE  u.CreationDate > '20180401'
	AND    u.Reputation > 100
	AND    p.PostTypeId = 1
    ORDER BY u.Id;

After all, there are only about 3000 rows in the Users table that qualify for my predicates, and that only turns into about 7300 rows when joined to Posts. That seems like one of those “no brainer” places to do a key lookup, but it doesn’t happen.

Grinchy.

The metrics we get from stats time/io look like this (trimmed for brevity):

Table 'Posts'. Scan count 5, logical reads 36740
Table 'Users'. Scan count 5, logical reads 150649
Table 'Worktable'. Scan count 0, logical reads 0

 SQL Server Execution Times:
   CPU time = 8172 ms,  elapsed time = 4736 ms.

So about 5 seconds of wall clock time, 8 seconds of CPU time, and 150k logical reads from Users.

If we hint our index, to make the optimizer choose it, we get the plan we’re after.

SELECT u.*, 
	       p.Id AS PostId
	FROM   dbo.Users AS u WITH (INDEX = ix_Users) --TAKE A HINT
	JOIN   dbo.Posts AS p
	ON p.OwnerUserId = u.Id
	WHERE  u.CreationDate > '20180401'
	       AND u.Reputation > 100
	       AND p.PostTypeId = 1
    ORDER BY u.Id;

Fancy Feast

The metrics we get from stats time/io look like this (trimmed for brevity):

Table 'Users'. Scan count 5, logical reads 11076
Table 'Worktable'. Scan count 0, logical reads 0
Table 'Posts'. Scan count 5, logical reads 36964
Table 'Worktable'. Scan count 0, logical reads 0

 SQL Server Execution Times:
   CPU time = 1187 ms,  elapsed time = 780 ms.

Interesting! We’re down to 780 milliseconds of wall clock time, about 1.1 seconds of CPU time, and what appears to be a lot fewer reads from the Users table.

This is clearly the faster plan, so what gives?

Scrooged

The much slower query has a lower cost!

Fraudity

The optimizer associates a fairly high cost to the key lookup. This comes from two things:

  1. The optimizer assumes data comes from a cold buffer cache, meaning no data is in memory
  2. The optimizer hates I/O, especially random I/O

I’m gonna go out on a limb and say the costing algorithm still thinks about disks like they’re creaky old spinning rust, and that it’s not hip to SSDs. I don’t know this for a fact, and I’d love to hear otherwise. When you put it all together, you get a query plan that prefers to do one sequential scan of the clustered index because no data is in memory, and because doing a Key Lookup via random I/O, or even sorting data first would be more expensive.

Costs can be helpful to figure out why a certain plan was chosen, but they shouldn’t be used to judge what the best plan is.

Thanks for reading!

Brent says: in SQL Server’s defense, I bet most plan caches are chock full of beautiful query plans where the costs accurately reflect the amount of work involved. Say maybe 99.9% of your queries get perfectly crafted plans. Still, that’s 1 in 1,000 plans that go south, and those are the ones you’re typically focused on tuning. You’re tuning them because the estimates are wrong – and that’s why you don’t wanna put faith in those estimated cost numbers. If they were accurate, you probably wouldn’t be tuning ’em.


Viewing all articles
Browse latest Browse all 370

Trending Articles