sql-server sql-server-2014 sql-server-2017 add tag
daniel hutmacher (imported from SE)
I have two tables with identically named, typed, and indexed key columns. One of the them has a *unique* clustered index, the other one has a *non-unique*.

## The test setup

Setup script, including some realistic statistics:

    DROP TABLE IF EXISTS #left;
    DROP TABLE IF EXISTS #right;

    CREATE TABLE #left (
        a       char(4) NOT NULL,
        b       char(2) NOT NULL,
        c       varchar(13) NOT NULL,
        d       bit NOT NULL,
        e       char(4) NOT NULL,
        f       char(25) NULL,
        g       char(25) NOT NULL,
        h       char(25) NULL
        --- and a few other columns
    );
    
    CREATE UNIQUE CLUSTERED INDEX IX ON #left (a, b, c, d, e, f, g, h)
    
    UPDATE STATISTICS #left WITH ROWCOUNT=63800000, PAGECOUNT=186000;
    
    CREATE TABLE #right (
        a       char(4) NOT NULL,
        b       char(2) NOT NULL,
        c       varchar(13) NOT NULL,
        d       bit NOT NULL,
        e       char(4) NOT NULL,
        f       char(25) NULL,
        g       char(25) NOT NULL,
        h       char(25) NULL
        --- and a few other columns
    );
    
    CREATE CLUSTERED INDEX IX ON #right (a, b, c, d, e, f, g, h)
    
    UPDATE STATISTICS #right WITH ROWCOUNT=55700000, PAGECOUNT=128000;

## The repro

When I join these two tables on their clustering keys, I expect a one-to-many MERGE join, like so:

    SELECT *
    FROM #left AS l
    LEFT JOIN #right AS r ON
        l.a=r.a AND
        l.b=r.b AND
        l.c=r.c AND
        l.d=r.d AND
        l.e=r.e AND
        l.f=r.f AND
        l.g=r.g AND
        l.h=r.h
    WHERE l.a='2018';

This is the query plan I want:

[![This is what I want.][1]][1]

(Never mind the warnings, they have to do with the fake statistics.)

However, if I change the order of the columns around in the join, like so:

    SELECT *
    FROM #left AS l
    LEFT JOIN #right AS r ON
        l.c=r.c AND     -- used to be third
        l.a=r.a AND     -- used to be first
        l.b=r.b AND     -- used to be second
        l.d=r.d AND
        l.e=r.e AND
        l.f=r.f AND
        l.g=r.g AND
        l.h=r.h
    WHERE l.a='2018';

... this happens:

[![The query plan after changing the declared column order in the join.][2]][2]

The Sort operator seems to order the streams according to the declared order of the join, i.e. `c, a, b, d, e, f, g, h`, which adds a blocking operation to my query plan.

## Things I've looked at

* I've tried changing the columns to `NOT NULL`, same results.
* The original table was created with `ANSI_PADDING OFF`, but creating it with `ANSI_PADDING ON` does not affect this plan.
* I tried an `INNER JOIN` instead of `LEFT JOIN`, no change.
* I discovered it on a 2014 SP2 Enterprise, created a repro on a 2017 Developer (current CU).
* Removing the WHERE clause on the leading index column does generate the good plan, but it kind of affects the results.. :)

## Finally, we get to the question

* Is this intentional?
* Can I eliminate the sort without changing the query (which is vendor code, so I'd really rather not...). I can change the table and indexes.



  [1]: https://i.stack.imgur.com/HnXXA.png
  [2]: https://i.stack.imgur.com/qw7bT.png
Top Answer
Paul White (imported from SE)
>Is this intentional?

It is by design, yes. The best public source for this assertion was unfortunately lost when Microsoft retired the Connect feedback site, obliterating many useful comments from developers on the SQL Server team.

Anyway, the current optimizer design does not *actively seek* to avoid unnecessary sorts *per se*. This is most often encountered with windowing functions and the like, but can also be seen with other operators that are sensitive to ordering, and in particular to preserved ordering between operators.

Nevertheless, the optimizer is quite good (in many cases) at avoiding unnecessary sorting, but this outcome normally occurs for reasons other than aggressively trying different ordering combinations. In that sense, it is not so much a question of 'search space' as it is of the complex interactions between orthogonal optimizer features that have been shown to increase general plan quality at acceptable cost.

For example, sorting can often be avoided simply by matching an ordering requirement (e.g. top-level `ORDER BY`) to an existing index. Trivially in your case that could mean adding `ORDER BY l.a, l.b, l.c, l.d, l.e, l.f, l.g, l.h;` but this is an over-simplification (and unacceptable because you do not want to change the query).

More generally, each memo group may be associated with required or desired properties, which may include input ordering. When there is no obvious reason to *enforce* a particular order (e.g. to satisfy an `ORDER BY`, or to ensure correct results from an order-sensitive physical operator), there is an element of 'luck' involved. I wrote more about the specifics of that as it pertains to merge join (in union or join mode) in [Avoiding Sorts with Merge Join Concatenation][1]. Much of that goes beyond the supported surface area of the product, so treat it as informational, and subject to change.

In your particular case, yes, you may adjust the indexing [as jadarnel27 suggests][2] to avoid the sorts; though there is little reason to actually prefer a merge join here. You could also hint a choice between hash or loop physical join with `OPTION(HASH JOIN, LOOP JOIN)` using a Plan Guide without changing the query, depending on your knowledge of the data, and the trade-off between best, worst, and average-case performance.

Finally, as a curiosity, note that the sorts can be avoided with a simple `ORDER BY l.b`, at the cost of a potentially less efficient many-to-many merge join on `b` alone, with a complex residual. I mention this mostly as an illustration of the interaction between optimizer features I mentioned previously, and the way top-level requirements can propagate.

  [1]: https://sqlperformance.com/2014/09/t-sql-queries/avoiding-sorts-merge-join-concatenation
  [2]: https://dba.stackexchange.com/a/215099
Answer #2
Josh Darnell (imported from SE)
> Can I eliminate the sort without changing the query (which is vendor code, so I'd really rather not...). I can change the table and indexes.

If you can change the indexes, then changing the order of the index on `#right` to match the order of the filters in the join removes the sort (for me):

    CREATE CLUSTERED INDEX IX ON #right (c, a, b, d, e, f, g, h)

Surprisingly (to me, at least), this results in neither query ending up with a sort.

> Is this intentional?

Looking at the output from [some weird trace flags][2], there's an interesting difference in the final Memo structure:

[![screenshot of final memo structure for each query][1]][1]

As you can see in the "Root Group" at the top, both queries have the option to use a Merge Join as the main physical operation to execute this query.

## Good query

The join *without* the sort is driven by group 29 option 1 and group 31 option 1 (each of which are range scans on the indexes involved).  It's filtered by group 27 (not shown), which is the series of logical comparison operations that filter the join.

## Bad query

The one *with* the sort is driven by the (new) options 3 that each of those two groups (29 and 31) has.  Option 3 performs a physical sort on the results of the range scans mentioned previously (option 1 of each of those groups).

## Why?

For some reason, the option to use 29.1 and 31.1 directly as sources for the merge join is not even available to the optimizer in the second query.  Otherwise, I think it would be listed under the root group among the other options.  If it were available at all, then it would definitely pick those over the massively more expensive sort operations.

I can only conclude that either:

- this is a bug (or more likely a limitation) in the optimizer's search algorithm
  - changing the indexes and joins to only have 5 keys removes the sort for the second query (6, 7, and 8 keys all have the sort).  
  - This implies that the search space with 8 keys is so large that the optimizer just doesn't have time to identify the non-sort solution as a viable option before it terminates early with the "good enough plan found" reason
  - it does seem a little buggy to me that the order of the join conditions influences the optimizer's search process this much, but really that's a bit over my head
- the sort is required in order to ensure correctness in the results
  - this one seems unlikely, since the query *can* run without the sort when there are fewer keys, or the keys are specified in a different order

Hopefully someone can come along and explain *why* the sort is required, but I thought the difference in the Memo building was interesting enough to post as an answer.

  [1]: https://i.stack.imgur.com/QQB5x.png
  [2]: http://sqlblog.com/blogs/paul_white/archive/2012/04/28/query-optimizer-deep-dive-part-1.aspx

This room is for discussion about this question.

Once logged in you can direct comments to any contributor here.

Enter question or answer id or url (and optionally further answer ids/urls from the same question) from

Separate each id/url with a space. No need to list your own answers; they will be imported automatically.