What characteristics lead to an execution plan benefiting from accelerated plan forcing? Are there any situations where accelerated plan forcing should be disabled (in practice or in theory)?
SQL Server 2019 introduced a new database scoped configuration option called `ACCELERATED_PLAN_FORCING` (APF). Little has been said about it in terms of announcements, although it is [documented]:
> Enables an optimized mechanism for query plan forcing, applicable to all forms of plan forcing, such as Query Store Force Plan, Automatic Tuning, or the USE PLAN query hint. The default is ON.
It's also notable in that it is on by default, which implies to me that the SQL Server team think there is little downside to using the feature. In fact, the docs even go on to warn against turning it off:
> ❕ Note
> It is not recommended to disable accelerated plan forcing.
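As a quick sanity check before experimenting, the current value of the setting can be read from the database scoped configurations catalog view. This is a sketch assuming SQL Server 2019 or later, where the option appears under its documented name:

```sql
-- Check whether accelerated plan forcing is enabled for the current database.
-- value = 1 means ON (the default).
SELECT [name], [value]
FROM sys.database_scoped_configurations
WHERE [name] = N'ACCELERATED_PLAN_FORCING';
```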
In pursuit of seeing the feature in action, I found a big AdventureWorks2014 query [online], and started adding more joins and subqueries to it. It looks like this:
```sql
SELECT ...  -- select list trimmed for space
FROM Sales.SalesOrderDetail sod
INNER JOIN Sales.SalesOrderHeader soh ON sod.SalesOrderID = soh.SalesOrderID
INNER JOIN Sales.CreditCard cc ON cc.CreditCardID = soh.CreditCardID
INNER JOIN Sales.CurrencyRate cr ON cr.CurrencyRateID = soh.CurrencyRateID
LEFT JOIN Sales.Currency crn ON crn.CurrencyCode = cr.FromCurrencyCode AND crn.ModifiedDate = cr.ModifiedDate
INNER JOIN Sales.SalesPerson sp ON soh.SalesPersonID = sp.BusinessEntityID
LEFT JOIN Sales.SalesTerritory st ON soh.TerritoryID = st.TerritoryID
INNER JOIN Sales.SalesTerritoryHistory sth ON st.TerritoryID = sth.TerritoryID AND sth.StartDate = DATEADD(dd, st.SalesLastYear % 10, GETDATE())
INNER JOIN Person.BusinessEntity be ON sp.BusinessEntityID = be.BusinessEntityID
INNER JOIN Person.Person p ON p.BusinessEntityID = be.BusinessEntityID
INNER JOIN Person.BusinessEntityAddress bea ON bea.BusinessEntityID = be.BusinessEntityID
INNER JOIN Person.[Address] a ON bea.AddressID = a.AddressID
LEFT JOIN Person.StateProvince spr ON a.StateProvinceID = spr.StateProvinceID
INNER JOIN
(
    SELECT m = MAX(tha.TransactionDate), tha.ProductID
    FROM Production.TransactionHistoryArchive tha
    GROUP BY tha.ProductID
) tha ON tha.ProductID = sod.ProductID
INNER JOIN
(
    SELECT spsbfy.SalesPersonID
    FROM Sales.vSalesPersonSalesByFiscalYears spsbfy
    WHERE spsbfy. IS NOT NULL
      AND spsbfy. IS NOT NULL
      AND spsbfy. IS NULL
) pe ON pe.SalesPersonID = soh.SalesPersonID
WHERE
    cc.CardType IN ('SuperiorCard', 'Vista') AND
    p.PersonType = 'IN' AND
    (soh.SalesOrderNumber > 'SO' OR spr.CountryRegionCode IN ('FR', 'US'))
```
Getting an estimated plan for that query takes around 100-150 ms on my machine. I then take the XML for that plan and paste it into an `OPTION (USE PLAN N'xml goes here')` query hint. For space reasons, I won't include that in the post body, but you can grab it [here] if you're interested.
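For reference, the general shape of attaching a captured plan looks like the following. This is a sketch: the string literal must contain the complete `<ShowPlanXML>` document captured from the estimated plan, truncated here, and the query shown is a stand-in rather than the full test query.

```sql
-- Sketch: forcing a previously captured plan via USE PLAN.
-- The N'...' literal must hold the entire ShowPlanXML document (truncated here).
SELECT soh.SalesOrderID
FROM Sales.SalesOrderHeader soh
OPTION (USE PLAN N'<ShowPlanXML xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan" ...>...</ShowPlanXML>');
```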
Getting an estimated plan with the hint in place, and APF disabled:
```sql
ALTER DATABASE SCOPED CONFIGURATION SET ACCELERATED_PLAN_FORCING = OFF;
```
Results in a compile time of 600-650 ms: around 4x higher than the compile time without the hint.
Enabling the feature:
```sql
ALTER DATABASE SCOPED CONFIGURATION SET ACCELERATED_PLAN_FORCING = ON;
```
Causes compile time to drop back to 150 ms. Clearly this query is a good candidate for the feature.
However, if I remove the last two `INNER JOIN`s on subqueries (aliased `tha` and `pe`), that plan does not benefit from APF at all (compile time of the forced plan is the same with APF on or off). The query, including the `USE PLAN` hint, is [here] for convenience.
So why do the additional `INNER JOIN`s on subqueries make a difference in the effectiveness of this feature?
Is it purely based on plan *size* (like, in KB)?
Or is that just an indirect correlation, and the reason is more along number of joins / tables accessed, number of indexes to choose from, number of stats loaded, etc?
As the documentation suggests, this feature should almost always be beneficial (or at least no worse) compared with the previous mechanism.
## Prior mechanism
Prior to SQL Server 2019, guided plan search was restricted to the *search2* phase of query optimization. This is also known as *full optimization*, where all exploration rules are available and deeper search is enabled. Compilation was always *started* in *search2* rather than progressing through the earlier stages first.
This might seem sensible, but *search2* is normally tried last for good reasons. Having all exploration rules available means a large number of tasks can be generated, even with guided search active. The additional rules enabled in *search2* also tend to be more computationally expensive in themselves, as well as possibly generating a large number of new alternatives, which might themselves match again on expensive rules. Finally, the deeper search allows alternatives to be generated from alternatives beyond the *search1* distance limit of 5.
Example rules applied only in *search2* include:
* `OJOJSwitch - (A OJ B) OJ C -> (A OJ C) OJ B`
* `ReorderLOJN - (A LOJ B) LOJ C -> A LOJ (B LOJ C)`
## New mechanism
The new mechanism can apply guided search during *search1* (aka *quick plan*), which is where the majority of final plans are found anyway. If an equivalent to the guide cannot be found in *search1*, *search2* may be needed. When *search2* is needed, the time taken will usually be much less than under the old mechanism because the search is much more limited and targeted using the new arrangement. Note the database must be at compatibility level 150 to use APF.
Your example queries would both qualify for *search2* on complexity grounds, but the best cost plan found after *search0* and *search1* is low enough that entry to *search2* is not justified. More than that, the search even 'timed out' during *search1*. In other words, the optimizer stopped searching because a good enough cost plan was found.
Being able to apply guided search during *search1* is a win here because the cost of parsing and validating the XML showplan is more than repaid by finding a guided plan more quickly than via regular search.
The previous mechanism would have tried to find a matching plan via *search2*, which was frequently slower overall for the reasons mentioned previously.
You can monitor the effectiveness of the new mechanism using the Debug channel extended event `quickstage_useplan_statistics`.
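One way to capture that event is a minimal event session like the following. This is a sketch: the session name and ring buffer target are my own choices, and the event lives in the Debug channel as described above.

```sql
-- Sketch: capture plan-forcing statistics from the Debug channel event.
CREATE EVENT SESSION [APF_Monitor] ON SERVER
ADD EVENT sqlserver.quickstage_useplan_statistics
ADD TARGET package0.ring_buffer;

ALTER EVENT SESSION [APF_Monitor] ON SERVER STATE = START;

-- ...run the forced-plan compilations, then inspect the ring buffer target...

ALTER EVENT SESSION [APF_Monitor] ON SERVER STATE = STOP;
DROP EVENT SESSION [APF_Monitor] ON SERVER;
```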
On my machine, the full query produced the following event data:
* compilecpu_time_ms = 224 (from the plan guide)
* fullopt_time_ms = 0 (we didn't need *search2*)
* quickstage_successful = True (we found a match in *search1*)
* quickstage_time_ms = 41 (time spent in the new mechanism)
* quickstage_useplan_enabled = True (config setting is on)
* useplan_successful = True (plan found)
Note the times do not account for the XML parsing and validation time. You can see this using undocumented trace flag 8675 (with 3604 or 3605 as usual), if you're interested. When guided search is active, this produces output starting with something like:
```
USE PLAN hint parsing and validation time: 604 ms
```
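To reproduce that output, the trace flags can be applied per query with `QUERYTRACEON` while requesting an estimated plan. This is a sketch: the query shown is a placeholder, and undocumented trace flags like 8675 are unsupported and may change or disappear without notice.

```sql
-- Sketch: TF 3604 sends trace output to the client messages tab;
-- TF 8675 reports optimizer stage and timing information.
SELECT ...  -- the hinted query goes here
OPTION
(
    USE PLAN N'<ShowPlanXML ...>...</ShowPlanXML>',
    QUERYTRACEON 8675,
    QUERYTRACEON 3604
);
```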
### Test timings
The *estimated* execution plans showed 124 ms for the unhinted full query and 142 ms for the accelerated hinted query. Without acceleration, the guided plan took 774 ms. The extended event shows most of this time was spent in *search2*:
* fullopt_time_ms = 666
For the query with the derived tables commented out:
* compilecpu_time_ms = 103
* fullopt_time_ms = 0
* quickstage_successful = True
* quickstage_time_ms = 25
* quickstage_useplan_enabled = True
* useplan_successful = True
The *estimated* execution plans showed 91 ms for the unhinted cut-down query and 88 ms for the accelerated hinted version. Without acceleration, the guided plan took 91 ms. The extended event shows:
* fullopt_time_ms = 22
So it makes little difference either way for this query with a small database like AdventureWorks.
## Your question
> What characteristics lead to an execution plan benefiting from accelerated plan forcing?
This is really the wrong question, because you should only force a plan when you have no other choice: for example, when natural optimization selects a terrible plan and all other tuning efforts have failed. Remember that you will need to monitor the forced plan as data volumes and distributions change over time, and when schema changes occur.
Anyway, that said:
If you do *need* to force a general plan shape, you should do no worse using the new mechanism than you did before. Sometimes it will be a big win. It is difficult to assess in advance because it depends on how long the old mechanism would have spent in *search2*, and whether the new one can find a match during *search1*. Plan forcing will still sometimes be slower than compiling a fresh plan without guidance. Remember the overhead of XML parsing and validation. Still, the new mechanism might make plan forcing as a way to reduce compilation time more viable than it used to be.
Microsoft don't document the exploration rules that can only be applied in *search2* (and the details have changed over time). This makes it even harder to link to query features. At best, this would be a weak relationship because there is no reliable way to match declarative T-SQL with the state of the internal query tree at any point during the optimization process, or how far a particular optimization session might progress through the available stages.
To avoid being completely frustrating, I will say that complex aggregates (capable of being split or pulled/pushed around the tree) and multiple outer joins can lead to a lot of extra time being spent in *search2*. Queries with high estimated cost are also more likely to make it to full opt, and progress to deeper levels of recursive exploration.