FINALLY…
This is the last post I’ll write about foreign keys for a while. Maybe ever.
Let’s face it, most developers probably find them more annoying than useful, and if you didn’t implement them when you first started designing your database, you’re not likely to go back and start trying to add them in.
Depending on the size of your tables, and how much bad stuff has leaked in, it could be quite a project to fix the data, add the constraints, and then finally… Get a bunch of app errors, and realize why they weren’t there in the first place.
Now you’ve annoyed the developers. They’ll be coming with torches and pitchforks for the sa password.
But My Join Elimination!
Let’s start with known limitations!
- Multi-column foreign keys
- Untrusted foreign keys
- If you uh… need a column from the other table
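If you want to check the first two items before you get your hopes up, the catalog views will tell you. This is just a rough sketch against `sys.foreign_keys` and `sys.foreign_key_columns`, assuming the child table is `dbo.Badges` like in the examples in this post. You want `is_disabled = 0`, `is_not_trusted = 0`, and `key_columns = 1`.

```sql
/* Sketch: list foreign keys on dbo.Badges along with the properties
   that matter for join elimination. The table name is an assumption
   taken from the examples in this post. */
SELECT fk.name AS constraint_name,
       OBJECT_NAME(fk.parent_object_id) AS child_table,
       OBJECT_NAME(fk.referenced_object_id) AS parent_table,
       fk.is_disabled,     /* 1 = constraint is not being enforced */
       fk.is_not_trusted,  /* 1 = existing rows were never validated */
       COUNT(*) AS key_columns /* more than 1 = multi-column key */
FROM sys.foreign_keys AS fk
JOIN sys.foreign_key_columns AS fkc
    ON fkc.constraint_object_id = fk.object_id
WHERE fk.parent_object_id = OBJECT_ID(N'dbo.Badges')
GROUP BY fk.name,
         fk.parent_object_id,
         fk.referenced_object_id,
         fk.is_disabled,
         fk.is_not_trusted;
```

If a constraint comes back untrusted, something like `ALTER TABLE dbo.Badges WITH CHECK CHECK CONSTRAINT fk_badges_users_id;` asks SQL Server to re-validate the existing rows. On a big table that can take a while, and it'll fail if bad data has leaked in.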
And sometimes, just sometimes, the optimizer will monkey your wrench. Here’s a funny example!
That’s Not Funny.
Here’s our constraint:
ALTER TABLE dbo.Badges ADD CONSTRAINT fk_badges_users_id FOREIGN KEY (UserId) REFERENCES dbo.Users(Id);
This query gets join elimination, because we followed all the rules.
SELECT b.UserId, COUNT(*) AS records
FROM dbo.Users AS u
JOIN dbo.Badges AS b
    ON u.Id = b.UserId
WHERE b.Date >= '20120101'
GROUP BY b.UserId
HAVING COUNT(*) > 1000;
Our constraint is trusted, it’s on one column, and we’re only referencing the Badges table in the important parts of the query: the select list and where clause.
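If you'd rather not squint at query plans, one hedged way to confirm the join went away is to turn on `SET STATISTICS IO` and see which tables get read. This is a minimal sketch that reuses the query above. The exact numbers will depend on your copy of the database, so I'm not going to pretend to know them.

```sql
/* Sketch: report logical reads per table for the query.
   If the join to dbo.Users was eliminated, the messages output
   should only list reads against Badges (plus any worktables). */
SET STATISTICS IO ON;

SELECT b.UserId, COUNT(*) AS records
FROM dbo.Users AS u
JOIN dbo.Badges AS b
    ON u.Id = b.UserId
WHERE b.Date >= '20120101'
GROUP BY b.UserId
HAVING COUNT(*) > 1000;

SET STATISTICS IO OFF;
```

If a line for `Users` shows up in the messages, the optimizer kept the join. The actual execution plan will tell you the same thing: look for any operator that touches the Users table.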
We’re very excited. Very excited indeed about the prospects of join elimination.
We’re so excited that now we wanna write our query to only filter down to a range of users.
SELECT b.UserId, COUNT(*) AS records
FROM dbo.Users AS u
JOIN dbo.Badges AS b
    ON u.Id = b.UserId
WHERE b.Date >= '20120101'
AND   b.UserId BETWEEN 1 AND 25000
GROUP BY b.UserId
HAVING COUNT(*) > 1000;
Now, if you look at that query, there are a whole lot of “b” aliases. I’m not asking for anything but a join to the Users table, just like before.
The last query got join elimination. This one doesn’t.
That’s an interesting choice, to say the least.
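For what it's worth, when the optimizer won't drop the join for you, you can drop it yourself. This is only a sketch, and it leans on an assumption: the foreign key stays enabled and trusted, so every non-NULL `UserId` in Badges has exactly one matching row in Users, and the join can't add or remove rows. If that ever stops being true, this version and the joined version can return different results.

```sql
/* Sketch: the same question asked of Badges alone.
   Assumes fk_badges_users_id is enabled and trusted, so the join
   to dbo.Users would not change which rows come back. */
SELECT b.UserId, COUNT(*) AS records
FROM dbo.Badges AS b
WHERE b.Date >= '20120101'
AND   b.UserId BETWEEN 1 AND 25000
GROUP BY b.UserId
HAVING COUNT(*) > 1000;
```

Of course, if the join is coming from a view or an ORM, you may not get to make that choice, which is where join elimination was supposed to help in the first place.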
But I know what you’re thinking. Between is awful. Aaron said so.
How About Equality?
Let’s look for just one user. This’ll go better, right?
SELECT b.UserId, COUNT(*) AS records
FROM dbo.Users AS u
JOIN dbo.Badges AS b
    ON u.Id = b.UserId
WHERE b.Date >= '20120101'
AND   b.UserId = 22656
GROUP BY b.UserId
HAVING COUNT(*) > 1000;
About That Join Elimination…
Sometimes, even if you do everything right, it doesn’t work out. I’m not doing anything particularly tricky, here, either.
If a Bouncer-American can find this in a few minutes of messing around, imagine what you can come up with.
Thanks for reading!