Physical Joins

There are three physical join types in SQL Server they are Nested Loops, Hash Match Join and Merge Join.


Hash Match Join

Hash Match join works by having two inputs, SQL server during the build phase will take one of these inputs and will build an in memory table.  Hashs are calculated based on the join input key the whole row is stored in a hash bucket.  Once the hash has been built SQL server will  perform the probe stage SQL server calculates the join key hash for each row in the second input to check if it matches the hash table that was created in the first build phase, SQL server will then check to make sure that the join keys actual match it does this becuase encase of a hash match collison.  Hash Match Join uses tempdb to join large datasets, this physical join is also a blocking operator.

Merge Join

The Merge Join is the fastest physical join, it works by iterating over the rows from the two inputs to comapre them, however the two inputs need to be sorted first either by index sorting or SQL server explcitly sorting the data. A Many-to-Many join creates a worktable in tempdb and SQL server will use this to comapre, once it’s gotten past the duplicate Many-to-Many values the worktable is deleted.

Nested Loops

SQL server will iterate over the rows from the inner input using the outer input to compare the values for a match.  Nested Loops join is CPU intensive, to help with performance it’s best to have the inner table pre-sorted.  Nested loops are used when the data is small or SQL server may use a nested loops join incorectly if you have incorrect estimates.





Leave a Reply

Your email address will not be published. Required fields are marked *