6 years ago Joe Obbish (imported from SE)

Consider the following query that inserts rows from a source table only if they aren’t already in the target table:

One possible plan shape includes a merge join and an eager spool. The eager spool operator is present to solve the Halloween Problem:

first plan

On my machine, the above code executes in about 6900 ms. Repro code to create the tables is included at the bottom of the question. If I’m dissatisfied with performance I might try to load the rows to be inserted into a temp table instead of relying on the eager spool. Here’s one possible implementation:

The new code executes in about 4400 ms. I can get actual plans and use Actual Time Statistics™ to examine where time is spent at the operator level. Note that asking for an actual plan adds significant overhead for these queries so totals will not match the previous results.

The query plan with the eager spool seems to spend significantly more time on the insert and spool operators compared to the plan that uses the temp table.

Why is the plan with the temp table more efficient? Isn’t an eager spool mostly just an internal temp table anyway? I believe I am looking for answers that focus on internals. I’m able to see how the call stacks are different but can’t figure out the big picture.

I am on SQL Server 2017 CU 11 in case someone wants to know. Here is code to populate the tables used in the above queries:

Top Answer
6 years ago Paul White (imported from SE)

This is what I call Manual Halloween Protection.

You can find an example of it being used with an update statement in my article Optimizing Update Queries. One has to be a bit careful to preserve the same semantics, for example by locking the target table against all concurrent modifications while the separate queries execute, if that is relevant in your scenario.

Why is the plan with the temp table more efficient? Isn’t an eager spool mostly just an internal temp table anyway?

A spool has some of the characteristics of a temporary table, but the two are not exact equivalents. In particular, a spool is essentially a row-by-row unordered insert to a b-tree structure. It does benefit from locking and logging optimizations, but does not support bulk load optimizations.

Consequently, one can often get better performance by splitting the query in a natural way: Bulk loading the new rows into a temporary table or variable, then performing an optimized insert (without explicit Halloween Protection) from the temporary object.

Making this separation also allows you extra freedom to tune the read and write portions of the original statement separately.

As a side note, it is interesting to think about how the Halloween Problem might be addressed using row versions. Perhaps a future version of SQL Server will provide that feature in suitable circumstances.


As Michael Kutz alluded to in a comment, you could also explore the possibility of exploiting the hole-filling optimization to avoid explicit HP. One way to achieve this for the demo is to create a unique index (clustered if you like) on the ID column of A_HEAP_OF_MOSTLY_NEW_ROWS.

With that guarantee in place the optimizer can use hole-filling and rowset sharing:

MERGE plan

While interesting, you will still be able to achieve better performance in many cases by employing carefully-implemented Manual Halloween Protection.

This room is for discussion about this question.

Once logged in you can direct comments to any contributor here.

Enter question or answer id or url (and optionally further answer ids/urls from the same question) from

Separate each id/url with a space. No need to list your own answers; they will be imported automatically.