Where do this Constant Scan and Left Outer Join come from in a trivial SELECT query plan? - Databases

TopAnswers Databases

Meta

Databases

TeX

Code Golf

APL

C++

.net

db<>fiddle

Java

*nix

PHP

PowerShell

Python

Rust

टेक्-मराठी

Typst

Web Client Dev

Web Server Dev

Where do this Constant Scan and Left Outer Join come from in a trivial SELECT query plan?

sql-server azure-sql-database add tag

sharptooth (imported from SE)

I have this table:

    CREATE TABLE [dbo].[Accounts] (
        [AccountId] UNIQUEIDENTIFIER UNIQUE NOT NULL DEFAULT NEWID(),
        -- WHATEVER other columns
    );
    GO
    CREATE UNIQUE CLUSTERED INDEX [AccountsIndex]
        ON [dbo].[Accounts]([AccountId] ASC);
    GO

This query:

    DECLARE @result UNIQUEIDENTIFIER
    SELECT @result = AccountId FROM Accounts WHERE AccountId='guid-here'

executes with a query plan consisting of a single Index Seek - as expected:

    SELECT <---- Clustered Index Seek

This query does the same:

    DECLARE @result UNIQUEIDENTIFIER
    SET @result = (SELECT AccountId FROM Accounts WHERE AccountId='guid-here')

but it's executed with a plan where result of Index Seek is Left Outer Joined with result of some Constant Scan and then fed into Compute Scalar:

    SELECT <--- Compute Scalar <--- Left Outer Join <--- Constant Scan
                                          ^
                                          |------Clustered Index Seek

What's that extra magic? What does that Constant Scan followed by Left Outer Join do?

Top Answer

Paul White (imported from SE)

The semantics of the two statements are different:

* The first does not set the value of the variable if no row is found.
* The second always sets the variable, including to null if no row is found.

The Constant Scan produces an empty row (with no columns!) that will result in the variable being updated in case nothing matches from the base table. The left join ensures the empty row survives the join. Variable assignment can be thought of as happening at the root node of the execution plan.

## Using `SELECT @result`

    -- Set initial value
    DECLARE @result uniqueidentifier = {guid 'FE2CA909-1162-4C6C-A7AC-33B257E28539'};

    -- @result does not change
    SELECT @result = AccountId 
    FROM Accounts 
    WHERE AccountId={guid '7AD4D33C-1ED7-4183-B7F3-48C33D666525'};

    SELECT @result;

[![Result 1][1]][1]

## Using `SET @result`

    -- Set initial value
    DECLARE @result uniqueidentifier = {guid 'FE2CA909-1162-4C6C-A7AC-33B257E28539'};

    -- @result set to null
    SET @result = 
    (
        SELECT AccountId 
        FROM Accounts 
        WHERE AccountId={guid '7AD4D33C-1ED7-4183-B7F3-48C33D666525'}
    );

    SELECT @result;

[![Result 2][2]][2]

### Execution plans

[![SELECT assignment][3]][3]
*No row arrives at the root node, so no assignment occurs.*

[![SET assignment][4]][4]
*A row always arrives at the root node, so variable assignment occurs.*

---

The extra Constant Scan and Nested Loops Left Outer Join are nothing to be concerned about. The join in particular is cheap since it is guaranteed to encounter one row on its outer input, and at most one row (in your example) on the inner input.

There are other ways to ensure a row is generated from the subquery to ensure a variable assignment occurs. One is to use a redundant scalar aggregate (no group by clause):

    -- Set initial value
    DECLARE @result uniqueidentifier = {guid 'FE2CA909-1162-4C6C-A7AC-33B257E28539'};

    -- @result set to null
    SET @result = 
        (
            SELECT MAX(AccountId)
            FROM Accounts 
            WHERE AccountId={guid '7AD4D33C-1ED7-4183-B7F3-48C33D666525'} 
        );
    SELECT @result;

[![Result 3][5]][5]

[![Scalar aggregate execution plan][6]][6]

Notice the scalar aggregate produces a row even though it receives no input.

Documentation:

* [SET @local_variable (Transact-SQL)][7]  
* [SELECT @local_variable (Transact-SQL)][8]

> If the SELECT statement returns no rows, the variable retains its present value. If expression is a scalar subquery that returns no value, the variable is set to NULL.

> For assigning variables, we recommend that you use SET @local_variable instead of SELECT @local_variable.

Further reading:

* [Fun with Scalar and Vector Aggregates][9]


  [1]: https://i.stack.imgur.com/p1Qxy.png
  [2]: https://i.stack.imgur.com/UaxFU.png
  [3]: https://i.stack.imgur.com/apLml.png
  [4]: https://i.stack.imgur.com/aSK8U.png
  [5]: https://i.stack.imgur.com/v4HAW.png
  [6]: https://i.stack.imgur.com/3PWRH.png
  [7]: https://msdn.microsoft.com/en-NZ/library/ms189484.aspx
  [8]: https://msdn.microsoft.com/en-us/library/ms187330.aspx
  [9]: https://www.sql.kiwi/2012/03/fun-with-aggregates.html

1 Answer