I was optimising a query on SQL Server and ran into something I was not expecting. There is a table `tblEvent` in the database, among other columns it has `IntegrationEventStateId` and `ModifiedDateUtc`. There is an index by these columns:

```
create index IX_tblEvent_IntegrationEventStateId_ModifiedDateUtc
on dbo.tblEvent (
	IntegrationEventStateId,
	ModifiedDateUtc
)
```

When I execute the following statement:

```
select *
from dbo.tblEvent e
where
	e.IntegrationEventStateId = 1
	or e.IntegrationEventStateId = 2
	or e.IntegrationEventStateId = 5
	or (e.IntegrationEventStateId = 4 and e.ModifiedDateUtc >= dateadd(minute, -5, getutcdate()))
```

I get [this execution plan](https://www.brentozar.com/pastetheplan/?id=ry40ir3I8) (note the index does NOT get used):

[![enter image description here][1]][1]

But when I execute this statement:

```
select *
from dbo.tblEvent e
where
	1 = e.IntegrationEventStateId
	or 2 = e.IntegrationEventStateId
	or 5 = e.IntegrationEventStateId
	or (4 = e.IntegrationEventStateId and e.ModifiedDateUtc >= dateadd(minute, -5, getutcdate()))
```

I get [this execution plan](https://www.brentozar.com/pastetheplan/?id=SJcShB2LI) (note the index DOES get used):

[![enter image description here][2]][2]

The only difference between the two statements is the order of comparisons in the `where` clause. Can anyone please explain why I get different execution plans?

```none
Microsoft SQL Server 2016 (SP1-CU5) (KB4040714) - 13.0.4451.0 (X64) Sep 5 2017 16:12:34 Copyright (c) Microsoft Corporation 
Standard Edition (64-bit) on Windows Server 2012 R2 Standard 6.3 <X64> (Build 9600: ) (Hypervisor).
```

DB compatibility level: SQL Server 2016 (130). The behaviour is repeatable, I ran the queries multiple times and consistently got the above execution plans.

## Repro

    CREATE TABLE dbo.tblEvent
    (
       EventId                 INT IDENTITY PRIMARY KEY,
       IntegrationEventStateId INT,
       ModifiedDateUtc         DATETIME,
       OtherCol                CHAR(1),
       index IX_tblEvent_IntegrationEventStateId_ModifiedDateUtc(IntegrationEventStateId, ModifiedDateUtc)
    );
    
    INSERT INTO dbo.tblEvent
    SELECT TOP 356525 3,
                      DATEADD(SECOND, ROW_NUMBER() OVER (ORDER BY @@SPID)%63424, GETUTCDATE()),
                      'A'
    FROM   sys.all_objects o1,
           sys.all_objects o2;
    
    UPDATE STATISTICS dbo.tblEvent WITH FULLSCAN
    
      
    select *
    from dbo.tblEvent e 
    where
        e.IntegrationEventStateId = 1
        or e.IntegrationEventStateId = 2
        or e.IntegrationEventStateId = 5
        or (e.IntegrationEventStateId = 4 and e.ModifiedDateUtc >= dateadd(minute, -5, getutcdate()))
    
    
    select *
    from dbo.tblEvent e
    where
        1 = e.IntegrationEventStateId
        or 2 = e.IntegrationEventStateId
        or 5 = e.IntegrationEventStateId
        or (4 = e.IntegrationEventStateId and e.ModifiedDateUtc >= dateadd(minute, -5, getutcdate()))

## Original table

```
CREATE TABLE [dbo].[tblEvent]
(
[EventId] [int] NOT NULL IDENTITY(1, 1),
[EventTypeId] [int] NOT NULL,
[ScorecardId] [int] NULL,
[ScorecardAreaId] [int] NULL,
[AreaId] [int] NULL,
[ScorecardTopicId] [int] NULL,
[TopicId] [int] NULL,
[ScorecardRequirementId] [int] NULL,
[RequirementId] [int] NULL,
[DocumentId] [int] NULL,
[FileId] [int] NULL,
[TopicTitle] [nvarchar] (100) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[ScorecardTopicStatus] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[RequirementText] [nvarchar] (500) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[ScorecardRequirementStatus] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[DocumentName] [nvarchar] (260) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[CreatedByUserId] [int] NOT NULL,
[CreatedByUserSessionId] [int] NOT NULL,
[CreatedDateUtc] [datetime2] (4) NOT NULL CONSTRAINT [DF__tblEvent__Create__0737E4A2] DEFAULT (sysutcdatetime()),
[CreatedDateLocal] [datetime2] (4) NOT NULL CONSTRAINT [DF__tblEvent__Create__082C08DB] DEFAULT (sysdatetime()),
[ModifiedByUserId] [int] NOT NULL,
[ModifiedByUserSessionId] [int] NOT NULL,
[ModifiedDateUtc] [datetime2] (4) NOT NULL CONSTRAINT [DF__tblEvent__Modifi__09202D14] DEFAULT (sysutcdatetime()),
[ModifiedDateLocal] [datetime2] (4) NOT NULL CONSTRAINT [DF__tblEvent__Modifi__0A14514D] DEFAULT (sysdatetime()),
[IsDeleted] [bit] NOT NULL,
[RowVersion] [timestamp] NOT NULL,
[ScorecardRequirementPriority] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[AffectedUserId] [int] NULL,
[UserId] [int] NULL,
[CorrelationId] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[IntegrationEventStateId] [int] NULL,
[IntegrationEventId] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[IntegrationEventContent] [nvarchar] (max) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[IntegrationEventType] [nvarchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[IntegrationEventTryCount] [int] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[tblEvent] ADD CONSTRAINT [PK_dbo.tblEvent] PRIMARY KEY CLUSTERED ([EventId]) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_tblEvent_IntegrationEventStateId_ModifiedDateUtc] ON [dbo].[tblEvent] ([IntegrationEventStateId], [ModifiedDateUtc]) ON [PRIMARY]
GO
ALTER TABLE [dbo].[tblEvent] ADD CONSTRAINT [FK_dbo.tblEvent_dbo.tblEventType_EventTypeId] FOREIGN KEY ([EventTypeId]) REFERENCES [dbo].[tblEventType] ([EventTypeId])
GO
ALTER TABLE [dbo].[tblEvent] ADD CONSTRAINT [FK_dbo.tblEvent_dbo.tblIntegrationEventState_IntegrationEventStateId] FOREIGN KEY ([IntegrationEventStateId]) REFERENCES [dbo].[tblIntegrationEventState] ([IntegrationEventStateId])
GO
```


  [1]: https://i.stack.imgur.com/nTA7o.png
  [2]: https://i.stack.imgur.com/mKBXR.png
  [3]: https://i.stack.imgur.com/eYOuo.png
Top Answer
Paul White
There are a number of issues here, but the most important is cardinality estimation (CE).

The newer ("default") CE model has a hard time with the predicates when it tries to compute selectivity against the histogram with no matching steps.

For example, the initial cardinality estimate returns a selectivity of 1.0 (all rows) for:

```
select *
from dbo.tblEvent e
where
    1 = e.IntegrationEventStateId
    or 2 = e.IntegrationEventStateId
    or 5 = e.IntegrationEventStateId
    or (4 = e.IntegrationEventStateId and e.ModifiedDateUtc >= dateadd(minute, -5, getutcdate()))
```

...as shown using trace flags 3604 and 2363:

```none
Begin selectivity computation

Input tree:

  LogOp_Select
      CStCollBaseTable(ID=1, CARD=356525 TBL: dbo.tblEvent AS TBL: e)
      ScaOp_Logical x_lopOr
          ScaOp_Comp x_cmpEq
              ScaOp_Identifier QCOL: [e].IntegrationEventStateId
              ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=1)
          ScaOp_Comp x_cmpEq
              ScaOp_Identifier QCOL: [e].IntegrationEventStateId
              ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=2)
          ScaOp_Comp x_cmpEq
              ScaOp_Identifier QCOL: [e].IntegrationEventStateId
              ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=5)
          ScaOp_Logical x_lopAnd
              ScaOp_Comp x_cmpGe
                  ScaOp_Identifier QCOL: [e].ModifiedDateUtc
                  ScaOp_Identifier COL: ConstExpr1001 
              ScaOp_Comp x_cmpEq
                  ScaOp_Identifier QCOL: [e].IntegrationEventStateId
                  ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=4)

Plan for computation:

  CSelCalcCombineFilters_ExponentialBackoff (OR)
      CSelCalcCombineFilters_ExponentialBackoff (AND)
          CSelCalcColumnInInterval
              Column: QCOL: [e].ModifiedDateUtc
          CSelCalcColumnInInterval
              Column: QCOL: [e].IntegrationEventStateId
      CSelCalcColumnInInterval
          Column: QCOL: [e].IntegrationEventStateId

Loaded histogram for column QCOL: [e].ModifiedDateUtc from stats with id 3
Loaded histogram for column QCOL: [e].IntegrationEventStateId from stats with id 2

Selectivity: 1

Stats collection generated: 

  CStCollFilter(ID=2, CARD=356525)
      CStCollBaseTable(ID=1, CARD=356525 TBL: dbo.tblEvent AS TBL: e)

End selectivity computation
```

When cost-based optimization starts, and the input tree is in a slightly different form, the CE is asked to compute the selectivity of the simpler predicates:

```none
Begin selectivity computation

Input tree:

  LogOp_Select
      CStCollBaseTable(ID=1, CARD=356525 TBL: dbo.tblEvent AS TBL: e)
      ScaOp_Logical x_lopOr
          ScaOp_Comp x_cmpEq
              ScaOp_Identifier QCOL: [e].IntegrationEventStateId
              ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=1)
          ScaOp_Comp x_cmpEq
              ScaOp_Identifier QCOL: [e].IntegrationEventStateId
              ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=2)
          ScaOp_Comp x_cmpEq
              ScaOp_Identifier QCOL: [e].IntegrationEventStateId
              ScaOp_Const TI(int,ML=4) XVAR(int,Not Owned,Value=5)

Plan for computation:

  CSelCalcColumnInInterval
      Column: QCOL: [e].IntegrationEventStateId

Selectivity: 1

Stats collection generated: 

  CStCollFilter(ID=3, CARD=356525)
      CStCollBaseTable(ID=1, CARD=356525 TBL: dbo.tblEvent AS TBL: e)

End selectivity computation
```

This is the equivalent of:

```
SELECT *
FROM dbo.tblEvent AS TE 
WHERE TE.IntegrationEventStateId IN (1, 2, 5);
```

In both cases, the CE assesses that 100% of the rows will match, despite there being no histogram steps for the values 1, 2, or 5 (the sample data has values of 3 only). It is tempting to blame the `CSelCalcColumnInInterval` calculator for this, as it seems to treat {1, 2, 5} as a single interval {1:5}.

As is often the case, the "legacy" CE does a better (more detailed) job here, so you should find the following hint will produce much better plans:

```
OPTION (USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION'));
```

With the repro data, this produces a single seek and key lookup as one would hope.

!["legacy" CE plan][1]

Note that the seek performs four seeking operations, one for each disjoint predicate.

```none
[1] Seek Keys[1]: Prefix: IntegrationEventStateId = 1
[2] Seek Keys[1]: Prefix: IntegrationEventStateId = 2
[3] Seek Keys[1]: Prefix: IntegrationEventStateId = 4, Start: ModifiedDateUtc >= dateadd(minute,(-5),getutcdate())
[4] Seek Keys[1]: Prefix: IntegrationEventStateId = 5
```

The new CE is designed to be more predictable, and easier to maintain/extend than the original CE. The "legacy" one had bits bolted on to it and refinements made over a long period of time. That complexity has benefits and pitfalls. Regressions and lower-quality estimates are somewhat expected with the newer CE. This ought to improve over time, but we are not there yet. I would view the behaviour shown here as a limitation of the calculator. Perhaps they will fix it.

See [Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator](https://docs.microsoft.com/en-us/previous-versions/dn673537(v=msdn.10)).

## Plan shape

The question of why the plan shape depends on textual representation is more of a side issue. The compilation process does contain logic (e.g. rule `SelPredNorm`) to rewrite predicates into a normalized form, and both repro queries are successfully rewritten to the same tree. This is done for various internal purposes, including index and computed column matching, and to make logical simplification easier to process.

Unfortunately, the rewritten form is only used prior to cost-based optimization. The input to the cost-based optimizer retains the differences in textual order present in the original query.

I believe this is intentional, and done to prevent unexpected plan changes. People sometimes write queries in slightly different, and unusual, ways to achieve a particular plan shape. If the optimizer suddenly started defeating those attempts as logically redundant, people would be upset. This is arguably less of an issue with things like query store, and more efficient plan forcing, but these are relatively recent innovations.

In other words, the plans are different because people have relied on different text producing different plans in the past, and changing that now would be too disruptive.

  [1]: https://i.stack.imgur.com/6d48S.png

This room is for discussion about this question.

Once logged in you can direct comments to any contributor here.

Enter question or answer id or url (and optionally further answer ids/urls from the same question) from

Separate each id/url with a space. No need to list your own answers; they will be imported automatically.