Can you explain this execution plan? - Databases

TopAnswers Databases

Meta

Databases

TeX

Code Golf

APL

C++

.net

db<>fiddle

Java

*nix

PHP

PowerShell

Python

Rust

टेक्-मराठी

Typst

Web Client Dev

Web Server Dev

Can you explain this execution plan?

sql-server sql-server-2008-r2 add tag

Andrew Savinykh (imported from SE)

I was researching something else when I came across this thing. I was generating test tables with some data in it and running different queries to find out how different ways to write queries affects execution plan. Here is the script that I used to generate random test data:

	IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID('t') AND type in (N'U'))
	DROP TABLE t
	GO

	CREATE TABLE t 
	(
	 c1 int IDENTITY(1,1) NOT NULL 
	,c2 int NULL
	) 
	GO

	insert into t
	select top 1000000 a from
	(select t1.number*2048 + t2.number a, newid() b
	from [master]..spt_values t1 
	cross join  [master]..spt_values t2
	where t1.[type] = 'P' and t2.[type] = 'P') a
	order by b
	GO

	update t set c2 = null
	where c2 < 2048 * 2048 / 10
	GO


	CREATE CLUSTERED INDEX pk ON [t] (c1)
	GO

	CREATE NONCLUSTERED INDEX i ON t (c2)
	GO


Now, given this data, I invoked the following query:

    select * 
    from t 
    where 
          c2 < 1048576 
       or c2 is null
    ;

To my great surprise, the execution plan that was generated for this query, was [this][1]. (Sorry for the external link, it's too large to fit here). 


Can someone explain to me what's up with all these "[Constant Scans][2]" and "[Compute Scalars][3]"? What's happening?

![Plan][4]

      |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1010], [Expr1011], [Expr1012]))
           |--Merge Interval
           |    |--Sort(TOP 2, ORDER BY:([Expr1013] DESC, [Expr1014] ASC, [Expr1010] ASC, [Expr1015] DESC))
           |         |--Compute Scalar(DEFINE:([Expr1013]=((4)&[Expr1012]) = (4) AND NULL = [Expr1010], [Expr1014]=(4)&[Expr1012], [Expr1015]=(16)&[Expr1012]))
           |              |--Concatenation
           |                   |--Compute Scalar(DEFINE:([Expr1005]=NULL, [Expr1006]=NULL, [Expr1004]=(60)))
           |                   |    |--Constant Scan
           |                   |--Compute Scalar(DEFINE:([Expr1008]=NULL, [Expr1009]=(1048576), [Expr1007]=(10)))
           |                        |--Constant Scan
           |--Index Seek(OBJECT:([t].[i]), SEEK:([t].[c2] > [Expr1010] AND [t].[c2] < [Expr1011]) ORDERED FORWARD)

  [1]: https://docs.google.com/spreadsheet/pub?key=0At0rnZg7VspEdFpuc29qWmYwbnlLdVdsZkVRamxJQ0E&single=true&gid=0&output=html
  [2]: http://msdn.microsoft.com/en-us/library/ms188318.aspx
  [3]: http://msdn.microsoft.com/en-us/library/ms178082.aspx
  [4]: https://i.stack.imgur.com/QTNL2.jpg

Top Answer

Martin Smith (imported from SE)

The constant scans each produce a single in-memory row with no columns. The top compute scalar outputs a single row with 3 columns

Expr1005 |   Expr1006 |   Expr1004
---------| -----------| -----------
NULL     |   NULL     |   60

The bottom compute scalar outputs a single row with 3 columns

Expr1008   | Expr1009  |  Expr1007
-----------| ----------| -----------
NULL       | 1048576   |     10

The concatenation operator Unions these 2 rows together and outputs the 3 columns but they are now renamed 

Expr1010  |  Expr1011  |  Expr1012
----------| -----------| -----------
NULL      |  NULL      |  60
NULL      |  1048576   |  10

The `Expr1012` column is a set of flags [used internally to define certain seek properties for the Storage Engine][1].

The next compute scalar along outputs 2 rows

Expr1010   | Expr1011   | Expr1012  |  Expr1013  |  Expr1014  |  Expr1015
-----------| -----------| ----------| -----------| -----------| -----------
NULL       | NULL       | 60        |  True      |  4         |  16            
NULL       | 1048576    | 10        |  False     |  0         |  0      


The last three columns are defined as follows and are just used for sorting purposes prior to presenting to the Merge Interval Operator

``` none
[Expr1013] = Scalar Operator(((4)&[Expr1012]) = (4) AND NULL = [Expr1010]), 
[Expr1014] = Scalar Operator((4)&[Expr1012]), 
[Expr1015] = Scalar Operator((16)&[Expr1012])
```

`Expr1014` and `Expr1015` just test whether certain bits are on in the flag. 
`Expr1013` appears to return a boolean column true if both the bit for `4` is on and `Expr1010` is `NULL`. 

From trying other comparison operators in the query I get these results:

| Operator | Expr1010 | Expr1011 | Flags (Dec) | 32 | 16 | 8 | 4 | 2 | 1 |
|----------|----------|----------|-------------|----|----|---|---|---|---|
| >        | 1048576  | NULL     |           6 |  0 |  0 | 0 | 1 | 1 | 0 |
| >=       | 1048576  | NULL     |          22 |  0 |  1 | 0 | 1 | 1 | 0 |
| <=       | NULL     | 1048576  |          42 |  1 |  0 | 1 | 0 | 1 | 0 |
| <        | NULL     | 1048576  |          10 |  0 |  0 | 1 | 0 | 1 | 0 |
| =        | 1048576  | 1048576  |          62 |  1 |  1 | 1 | 1 | 1 | 0 |
| IS NULL  | NULL     | NULL     |          60 |  1 |  1 | 1 | 1 | 0 | 0 |

From which I infer that Bit 4 means "Has start of range" (as opposed to being unbounded) and Bit 16 means the start of the range is inclusive.

This 6 column result set is emitted from the `SORT` operator sorted by 
`Expr1013 DESC, Expr1014 ASC, Expr1010 ASC, Expr1015 DESC`. Assuming `True` is represented by `1` and `False` by `0` the previously represented resultset is already in that order.

Based on my previous assumptions the net effect of this sort is to present the ranges to the merge interval in the following order

     ORDER BY 
              HasStartOfRangeAndItIsNullFirst,
              HasUnboundedStartOfRangeFirst,
              StartOfRange,
              StartOfRangeIsInclusiveFirst

The merge interval operator outputs 2 rows

Expr1010   | Expr1011  |  Expr1012
-----------| ----------| -----------
NULL       | NULL      |  60
NULL       | 1048576   |  10


For each row emitted a range seek is performed 

``` none
Seek Keys[1]: Start:[dbo].[t].c2 > Scalar Operator([Expr1010]), 
              End: [dbo].[t].c2 < Scalar Operator([Expr1011])
```

So it would appear as though two seeks are performed. One apparently `> NULL AND < NULL` and one  `> NULL AND < 1048576`. However the flags that get passed in appear to modify this to `IS NULL` and `< 1048576` respectively.

60 is for a comparison with NULL.  The range boundary expressions use NULL to represent 'unbounded' at either end.  The seek is always exclusive i.e. `Start: > Expr & End: < Expr` rather than inclusive using `>=` and `<=`.

If you change the query slightly to:

    select *
    from t 
    where 
          c2 > 1048576 
       or c2 = 0
    ;

Then the plan looks much simpler with an index seek with multiple seek predicates. 

The plan shows `Seek Keys` 

``` none
Start: c2 >= 0, End: c2 <= 0, 
Start: c2 > 1048576
```

The explanation for why this simpler plan cannot be used for the case in the OP is given by Paul White in the comments to the [earlier linked blog post][1].

An index seek with multiple predicates cannot mix different types of comparison predicate (ie. `Is` and `Eq` in the case in the OP). This is just a current limitation of the product (and is presumably the reason why the equality test in the last query `c2 = 0` is implemented using `>=` and `<=` rather than just the straightforward equality seek you get for the query `c2 = 0 OR c2 = 1048576`.

  [1]: https://sqlkiwi.blogspot.com/2012/01/dynamic-seeks-and-hidden-implicit-conversions.html
  [2]: https://dba.stackexchange.com/users/1192/paul-white

1 Answer