I'm considering creating a filtered index in my [copy of the Stack Overflow database]. Something like this, for example:
CREATE UNIQUE NONCLUSTERED INDEX IX_DisplayName_Filtered
ON dbo.Users (DisplayName)
WHERE Reputation > 400000;
Should I always add the column in the filtering expression (`Reputation` in this example) to the key or includes for the index, or is having it in the filtering expression good enough?
For various reasons, it's always better to have the filtering column as part of the index: either in the keys, or in the includes
The following are some specific examples of filtered index query problems that are resolved by including the filtering columns in the index.
## Key lookups when the query predicate doesn't match the filter expression
First of all, [the documentation] has this to say about including filter expression columns:
> - A column in the filtered index expression should be a key or included column in the filtered index definition if the query predicate uses the column in a comparison that is not equivalent to the filtered index expression.
So if you have an inequality filter expression like `Reputation > 400000`, but your *query* uses a predicate like `WHERE Reputation > 400000 AND Reputation < 450000;`, the filtered index might still be used - but a key lookup will be required to satisfy the query's predicate.
Including the `Reputation` column in the index (key or includes) removes the need for this lookup.
See Erik Darling's post [Filtered Indexes: Just Add Includes] for additional details and an example of this situation.
Another example of this can be found in Paul White's answer here: [Unnecessary key lookup being performed while using filtered index]
## Key lookups when the filtering column is included in the resultset
The documentation goes on to say this:
> - A column in the filtered index expression should be a key or included column in the filtered index definition if the column is in the query result set.
This might feel like it goes without saying, but just to be complete: if your queries include the filtering column in the final resultset, you should probably include them in the index (key or includes).
## Poor row estimates when using equality expressions
There are cases where useful row estimates based on actual statistics can be eliminated during the optimization process (specifically when the query plan produced by the optimizer is converted to a physical execution plan). Including the filtering column can prevent these more-accurate estimates from being discarded.
More details, and an example, can be found in Paul White's answer here: [Incorrect row estimation given with a filtered index]
An additional example can be found here on dba.se: [Query using a filtered index but wrong estimated number of rows]
## Key lookups when using `IS NULL` in the filtering expression
Creating an index with a filtering expression that uses `IS NULL` can produce a completely unnecessary key lookup. See this question, and the related bug report on SQL Server's feedback site: [Why filtered index on IS NULL value is not used?]
As you might have guessed, the workaround presented is to add the filtering column as an included column in the filtered index.