sql-server add tag
meme (imported from SE)
Spoolum
--
In SQL Server there are several kinds of spools. The two that I'm interested are [Table Spool][1]s and [Index spools][2], *outside of modification queries*.

Read only queries, particularly on the inner side of a Nested Loops join, may use either a Table or Index spool to potentially reduce I/O and improve query performance. These spools can be [Eager][3] or [Lazy][4]. Just like you and me.

My questions are:

 - Which factors go into the choice of Table vs. Index Spool
 - Which factors go into the choice between Eager and Lazy Spools


  [1]: https://sqlserverfast.com/epr/table-spool/
  [2]: https://sqlserverfast.com/epr/index-spool/
  [3]: https://www.red-gate.com/simple-talk/sql/learn-sql-server/operator-of-the-week-spools-eager-spool/
  [4]: https://www.red-gate.com/simple-talk/sql/learn-sql-server/showplan-operator-of-the-week-lazy-spool/
Top Answer
Joe Obbish (imported from SE)
This is a little broad but I think I understand the True Question and will answer accordingly. Just going to talk about table vs index spool though. I don't think it's quite correct to view there as being a choice between table and index spools. As you know, it's possible in a single subtree to get an index spool, a table spool, or both an index spool and a table spool. I believe it's generally correct to say that you get an index spool under the following conditions:

 1. The query optimizer has a reason to transform a join into an apply
 2. The query optimizer actually performs the transform to the apply
 3. The query optimizer uses the rule to add an index spool (at minimum the index spool must be safe to use)
 4. The plan with the index spool is selected

You can see most of these with simple demos. Start by creating a pair of heaps:

    DROP TABLE IF EXISTS dbo.X_10000_VARCHAR_901;
    CREATE TABLE dbo.X_10000_VARCHAR_901 (ID VARCHAR(901) NOT NULL);
    
    INSERT INTO dbo.X_10000_VARCHAR_901 WITH (TABLOCK)
    SELECT TOP (10000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM master..spt_values t1
    CROSS JOIN master..spt_values t2;
    
    
    DROP TABLE IF EXISTS dbo.X_10000_VARCHAR_800;
    CREATE TABLE dbo.X_10000_VARCHAR_800 (ID VARCHAR(800) NOT NULL);
    
    INSERT INTO dbo.X_10000_VARCHAR_800 WITH (TABLOCK)
    SELECT TOP (10000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM master..spt_values t1
    CROSS JOIN master..spt_values t2;

For the first query, there's nothing to seek on:

    SELECT *
    FROM dbo.X_10000_VARCHAR_901 a
    CROSS JOIN dbo.X_10000_VARCHAR_901 b
    OPTION (MAXDOP 1);

So there's no reason for the optimizer to transform the join into an apply. You end up with a table spool due to costing reasons. So this query fails the first test.

[![enter image description here][1]][1]

For the next query, it's fair to expect that the optimizer has a reason to consider an apply:

    SELECT *
    FROM dbo.X_10000_VARCHAR_901 a
    INNER JOIN dbo.X_10000_VARCHAR_901 b ON a.ID = b.ID 
    OPTION (LOOP JOIN, MAXDOP 1);

But it's not meant to be:

[![enter image description here][2]][2]

This query fails the second test. A complete explanation is [here][3]. Quoting the most relevant part:

> The optimizer does not consider building an index on the fly to enable
> an apply; rather the sequence of events is usually the reverse:
> transform to apply because a good index exists.

I can rewrite the query to encourage the optimizer consider an apply:

    SELECT *
    FROM dbo.X_10000_VARCHAR_901 a
    INNER JOIN dbo.X_10000_VARCHAR_901 b ON a.ID >= b.ID AND a.ID <= b.ID
    OPTION (MAXDOP 1);

But there's still no index spool:

[![enter image description here][4]][4]

This query fails the third test. In SQL Server 2014 there was an index key length limit of 900 bytes. This was extended in SQL Server 2016 but only for nonclustered indexes. [The index for a spool is a clustered index so the limit remains at 900 bytes][5]. In any case, the index spool rule can't be applied because it could lead to an error during query execution.

Reducing the data type length to 800 finally provides a plan with an index spool:

[![enter image description here][6]][6]

The index spool plan, not surprisingly, is costed significantly cheaper than a plan with no spool: 89.7603 units vs 598.832 units. You can see the difference with the undocumented `QUERYRULEOFF BuildSpool` query hint:

[![enter image description here][7]][7]

This isn't a complete answer, but hopefully it's some of what you were looking for.


  [1]: https://i.stack.imgur.com/rtZSm.png
  [2]: https://i.stack.imgur.com/5x6nk.png
  [3]: https://dba.stackexchange.com/a/173944/104717
  [4]: https://i.stack.imgur.com/BwYBZ.png
  [5]: https://sqlperformance.com/2019/09/sql-performance/nested-loops-joins-performance-spools
  [6]: https://i.stack.imgur.com/U8pZQ.png
  [7]: https://i.stack.imgur.com/Bfpf6.png

This room is for discussion about this question.

Once logged in you can direct comments to any contributor here.

Enter question or answer id or url (and optionally further answer ids/urls from the same question) from

Separate each id/url with a space. No need to list your own answers; they will be imported automatically.