Channel: Erik Darling – Brent Ozar Unlimited®

Hash Join Memory Grant Factors

Buskets

Much like Sorts, Hash Joins need a certain amount of memory to operate efficiently, meaning without spilling, or at least without spilling too much.

And to a similar degree, the number of rows and columns passed to the Hashing operator matters where the memory grant is concerned. This doesn’t mean Hashing is bad, but you may need to take some extra steps when tuning queries that use it.

The reasons are pretty obvious when you think about the context of a Hash operation, whether it’s a join or aggregation.

  1. All rows from the build side have to arrive at the operator (in parallel plans, usually after a bitmap filter)
  2. The hashing function gets applied to join or grouping columns
  3. In a join, the hashed values from the probe side are checked against the hash table built from the build side
  4. In some cases, the actual values need to be checked as a residual

During all that nonsense, all the columns that you SELECT get dragged along for the ride.

Here’s a quick example!

This query doesn’t return any rows, because Jon Skeet hadn’t hit 1 million rep in the data dump I’m using (Stack Overflow 2010).

/*Returns nothing*/
SELECT u.DisplayName
FROM   dbo.Users AS u
JOIN   dbo.Posts AS p
    ON u.Id = p.OwnerUserId
WHERE  u.Reputation >= 1000000;

Despite that, the query asks for about 7 MB of memory to run. This seems to be the lowest memory grant I could get the optimizer to ask for.

Hashtastic
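
If you'd rather not dig through plan properties, one way to check the grant after the fact is the plan cache. This is a rough sketch, not the method used for the screenshots in this post. It assumes a build where sys.dm_exec_query_stats exposes the grant columns (SQL Server 2016 and later, as far as I know) and that you have VIEW SERVER STATE. The LIKE filters are just a guess at how to find the demo queries, so adjust them to taste.

```sql
/*Most recent memory grants for cached queries that touch dbo.Posts*/
SELECT   TOP (10)
         st.text AS query_text,
         qs.last_grant_kb,        /*what the query was granted*/
         qs.last_used_grant_kb,   /*what it actually used*/
         qs.last_ideal_grant_kb,  /*what it would have liked*/
         qs.last_execution_time
FROM     sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE    st.text LIKE N'%dbo.Posts%'               /*assumption: matches the demo queries*/
AND      st.text NOT LIKE N'%dm_exec_query_stats%' /*keep this query out of its own results*/
ORDER BY qs.last_execution_time DESC;
```

The values are in KB, so the base grant described above should show up as roughly 7,000.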

If we drop the Reputation filter down a bit so some rows get returned, the memory grant stays the same.

SELECT u.DisplayName
FROM   dbo.Users AS u
JOIN   dbo.Posts AS p
    ON u.Id = p.OwnerUserId
WHERE  u.Reputation >= 500000;

That’s why I’m calling 7 MB the “base” grant here. That, and if I drop the Reputation filter lower to let more people in, the grant goes up.

SELECT u.DisplayName
FROM   dbo.Users AS u
JOIN   dbo.Posts AS p
    ON u.Id = p.OwnerUserId
WHERE  u.Reputation >= 400000;

Creepin and creepin and creepin

But we can also get a grant higher than the base by requesting more columns.

SELECT u.DisplayName, p.Title
FROM   dbo.Users AS u
JOIN   dbo.Posts AS p
    ON u.Id = p.OwnerUserId
WHERE  u.Reputation >= 500000;

ARF ARF

This is more easily accomplished by selecting string data. Again, just like with Sorts, we don’t need to actually sort by string data for the memory grant to go up. We just need to make it pass through a memory-consuming operator.
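
One likely reason strings hurt so much: as far as I know, the optimizer sizes the grant from declared column widths rather than from the data actually stored, and it assumes variable-length columns are about half full. That's general SQL Server behavior as I understand it, not something measured in this post. Here's a quick sketch for checking how the two string columns in the query above are declared. The table and column names come from the demo query; nothing else is assumed about the schema.

```sql
/*Declared sizes of the string columns from the demo query*/
SELECT   OBJECT_NAME(c.object_id) AS table_name,
         c.name AS column_name,
         t.name AS type_name,
         c.max_length /*in bytes; nvarchar is 2 bytes per character, -1 means MAX*/
FROM     sys.columns AS c
JOIN     sys.types AS t
    ON c.user_type_id = t.user_type_id
WHERE    c.object_id IN (OBJECT_ID(N'dbo.Users'), OBJECT_ID(N'dbo.Posts'))
AND      c.name IN (N'DisplayName', N'Title')
ORDER BY table_name, column_name;
```

The wider the declared column, the more memory each row is assumed to need on its way through the Hash Join.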

Thanks for reading!

Brent says: you remember how, in the beginning of your career, some old crusty DBA told you to avoid SELECT *? Turns out they were right.

