Joe Obbish (imported from SE)
I'm asking this question in order to better understand the optimizer's behavior and to understand the limits around index spools. Suppose that I put integers from 1 to 10000 into a heap:

    truncate table X_10000;
    FROM master..spt_values t1
    CROSS JOIN master..spt_values t2;

And force a nested loop join with `MAXDOP 1`:

    SELECT *
    FROM X_10000 a
    INNER JOIN X_10000 b ON a.ID = b.ID

This is a rather unfriendly action to take towards SQL Server. Nested loop joins often aren't a good choice when both tables don't have any relevant indexes. Here's the plan:

[![bad query][1]][1]

The query takes 13 seconds on my machine with 100000000 rows fetched from the table spool. However, I don't see why the query has to be slow. The query optimizer has the ability to create indexes on the fly through [index spools][2]. This query seems like it would be a perfect candidate for an index spool.

The following query returns the same results as the first one, has an index spool, and finishes in less than a second:

    SELECT *
    FROM X_10000 a
    CROSS APPLY (SELECT TOP (9223372036854775807) b.ID FROM X_10000 b WHERE a.ID = b.ID) ca

[![workaround 1][3]][3]

This query also has an index spool and finishes in less than a second:

    SELECT *
    FROM X_10000 a
    INNER JOIN X_10000 b ON a.ID >= b.ID AND a.ID <= b.ID

[![workaround 2][4]][4]

Why doesn't the original query have an index spool? Is there any set of documented or undocumented hints or trace flags that will give it an index spool? I did find [this related question][5], but it doesn't fully answer my question and I can't get the mysterious trace flag to work for this query.

Top Answer
Paul White (imported from SE)
As you know, the optimizer's search is not exhaustive. It tries things that make sense in context, and which frequently pay dividends on real queries. Forcing a loop join between two single-column unindexed heap tables is not such a scenario. That said, here are some details:

SQL Server likes to transform applies to joins early, because it knows more tricks with joins. Later on, it may explore converting the join back to an apply. The [difference between the two][1] being correlated parameters (outer references). Applies make sense when there is a suitable index on the inner side. Your example has no indexes, so the optimizer is not persuaded to explore translation to an apply.

A simple (non-apply) join has the join predicate on the join operator instead of outer references. The spool optimization for a non-apply is typically a lazy table spool, since there is no predicate on the inner side, only at the join.

The optimizer does not consider building an index on the fly to enable an apply; rather the sequence of events is usually the reverse: transform to apply because a good index exists.

You can sometimes encourage an apply rather than a join by using `APPLY` syntax in your query. The undocumented trace flag 9114 can assist in this by dissuading the optimizer from translating a logical apply to a join up front. For example:

    SELECT * 
    FROM dbo.X_1000 AS a
    CROSS APPLY (SELECT * FROM dbo.X_1000 AS b WHERE b.ID = a.ID) AS b

[![Spool plan][2]][2]

An index spool is favoured for apply because the outer reference means the selection is applied on the inner side of the join. You will often see this via `SelToIndexOnTheFly` but other paths exist. See my article [The Eager Index Spool and The Optimizer](


This room is for discussion about this question.

Once logged in you can direct comments to any contributor here.

Enter question or answer id or url (and optionally further answer ids/urls from the same question) from

Separate each id/url with a space. No need to list your own answers; they will be imported automatically.