The following is a simplification of a performance problem encountered with the Query Store: CREATE TABLE #tears ( plan_id bigint NOT NULL ); INSERT #tears (plan_id) VALUES (1); SELECT T.plan_id FROM #tears AS T LEFT JOIN sys.query_store_plan AS QSP ON QSP.plan_id = T.plan_id; The `plan_id` column is documented as being the primary key of [`sys.query_store_plan`][1], but the execution plan does not use [join elimination][2] as would be expected: 0. No attributes are being projected from the DMV. 0. The DMV primary key `plan_id` cannot duplicate rows from the temporary table 0. A `LEFT JOIN` is used, so no rows from `T` can be eliminated. [Execution plan][3] [![plan graphic][4]][4] Why is this, and what can be done to obtain join elimination here? [1]: https://docs.microsoft.com/en-us/sql/relational-databases/system-catalog-views/sys-query-store-plan-transact-sql?view=sql-server-2017 [2]: https://blogs.msmvps.com/robfarley/2008/11/08/join-simplification-in-sql-server/ [3]: https://www.brentozar.com/pastetheplan/?id=HyObYM_w7 [4]: https://i.stack.imgur.com/VHXax.png
The documentation is a little misleading. The DMV is a non-materialized view, and does not have a primary key as such. The underlying definitions are a little complex but a simplified definition of `sys.query_store_plan` is: CREATE VIEW sys.query_store_plan AS SELECT PPM.plan_id -- various other attributes FROM sys.plan_persist_plan_merged AS PPM LEFT JOIN sys.syspalvalues AS P ON P.class = 'PFT' AND P.[value] = plan_forcing_type; Further, `sys.plan_persist_plan_merged` is also a view, though one needs to connect via the Dedicated Administrator Connection to see its definition. Again, simplified: CREATE VIEW sys.plan_persist_plan_merged AS SELECT P.plan_id as plan_id, -- various other attributes FROM sys.plan_persist_plan P -- NOTE - in order to prevent potential deadlock -- between QDS_STATEMENT_STABILITY LOCK and index locks WITH (NOLOCK) LEFT JOIN sys.plan_persist_plan_in_memory PM ON P.plan_id = PM.plan_id; The indexes on `sys.plan_persist_plan` are: ``` none ╔════════════════════════╦══════════════════════════════════════╦═════════════╗ ║ index_name ║ index_description ║ index_keys ║ ╠════════════════════════╬══════════════════════════════════════╬═════════════╣ ║ plan_persist_plan_cidx ║ clustered, unique located on PRIMARY ║ plan_id ║ ║ plan_persist_plan_idx1 ║ nonclustered located on PRIMARY ║ query_id(-) ║ ╚════════════════════════╩══════════════════════════════════════╩═════════════╝ ``` So `plan_id` is constrained to be unique on `sys.plan_persist_plan`. Now, `sys.plan_persist_plan_in_memory` is a streaming table-valued function, presenting a tabular view of data only held in internal memory structures. As such, it does not have any unique constraints. At heart, the query being executed is therefore equivalent to: DECLARE @t1 table (plan_id integer NOT NULL); DECLARE @t2 table (plan_id integer NOT NULL UNIQUE CLUSTERED); DECLARE @t3 table (plan_id integer NULL); SELECT T1.plan_id FROM @t1 AS T1 LEFT JOIN ( SELECT T2.plan_id FROM @t2 AS T2 LEFT JOIN @t3 AS T3 ON T3.plan_id = T2.plan_id ) AS Q1 ON Q1.plan_id = T1.plan_id; ...which does not produce join elimination: [![no join elimination][1]][1] Getting right to the core of the issue, the problem is the inner query: DECLARE @t2 table (plan_id integer NOT NULL UNIQUE CLUSTERED); DECLARE @t3 table (plan_id integer NULL); SELECT T2.plan_id FROM @t2 AS T2 LEFT JOIN @t3 AS T3 ON T3.plan_id = T2.plan_id; ...clearly the left join might result in rows from `@t2` being duplicated because `@t3` has no uniqueness constraint on `plan_id`. Therefore, the join cannot be eliminated: [![inner query][2]][2] To workaround this, we can explicitly tell the optimizer that we do not require any duplicate `plan_id` values: ``` DECLARE @t2 table (plan_id integer NOT NULL UNIQUE CLUSTERED); DECLARE @t3 table (plan_id integer NULL); SELECT DISTINCT -- added T2.plan_id FROM @t2 AS T2 LEFT JOIN @t3 AS T3 ON T3.plan_id = T2.plan_id; ``` The outer join to `@t3` can now be eliminated: [![join eliminated][3]][3] Applying that to the real query: ``` SELECT DISTINCT T.plan_id FROM #tears AS T LEFT JOIN sys.query_store_plan AS QSP ON QSP.plan_id = T.plan_id; ``` Equally, we could add `GROUP BY T.plan_id` instead of the `DISTINCT`. Anyway, the optimizer can now correctly reason about the `plan_id` attribute all the way down through the nested views, and eliminate both outer joins as desired: [![Both joins eliminated][4]][4] Note that making `plan_id` unique in the temporary table would not be sufficient to obtain join elimination, since it would not preclude incorrect results. We must explicitly reject duplicate `plan_id` values from the final result to allow the optimizer to work its magic here. [1]: https://i.stack.imgur.com/iVQdv.png [2]: https://i.stack.imgur.com/19qlT.png [3]: https://i.stack.imgur.com/EyM0S.png [4]: https://i.stack.imgur.com/b0yx2.png