Someone said it's preferable to craft your queries to avoid duplicate key exceptions, but I'm not convinced that's more performant than just setting `IGNORE_DUP_KEY = ON` for the index.

My goal is to ensure a row or set of rows exists for one or more users before attempting to update those rows. I do this so that when I attempt to update the row with an update statement like the one below, and no rows are affected, it's because the `[Count]` portion of the predicate wasn't satisfied, as opposed to the row not existing at all (i.e. the `[ID]` portion of the predicate not being satisfied):

```
UPDATE [Inventory]
SET [Count] = [Count] + 1
WHERE [ID] = 3
AND ([Count] + 1) <= @MaxInventory
```

I could run `EXISTS(SELECT 1 FROM [Inventory] WHERE [ID] = 3)` to check for that single row, and only insert the row if it does not exist. That simply avoids unnecessary inserts. The insert, if necessary, would still have to contend with concurrent transactions, so duplicate key exceptions can still occur.

I'm curious whether it's more performant to just turn on `IGNORE_DUP_KEY` in this scenario, rather than allowing the error to be thrown and caught. Specifically, I'm curious whether just attempting to insert the record and letting it ignore duplicate keys is as fast as, or possibly even faster than, running an exists check.

This becomes even more important when I'm checking for and initializing multiple records at once. For example, if I need to ensure records for thousands of users exist in a single update statement, the logic would be much simpler if I just ran that insert statement up-front, letting it ignore duplicate keys. Avoiding duplicates would be more complex, because I'd have to first query the table for which records don't exist, then attempt to add just those records (again, ignoring duplicate keys). Just inserting may be faster, even if all the records exist.

I could meet it halfway and check whether *any* of the records are missing, such as with a left join or a `COUNT` comparison, but why bother if the insert ignoring duplicate keys is just faster?

**Is it a good idea to use `IGNORE_DUP_KEY` and just attempt inserts instead of bothering with checking for row existence ahead of time? If not, why?**
It's definitely an atypical setting to enable for an index. I don't think you're going to find many people who jump up and talk about using it.

Of course, there are a couple of helpful posts on the subject by Paul White:

- [IGNORE_DUP_KEY slower on clustered indexes][1]
- [A creative use of IGNORE_DUP_KEY][2]

If your concern is around upsert patterns (or something similar), this article by Michael Swart is really informative:

- [SQL Server UPSERT Patterns and Antipatterns][3]

[1]: https://sqlperformance.com/2019/04/sql-performance/ignore_dup_key-slower-clustered-indexes
[2]: https://www.sql.kiwi/2013/02/a-creative-use-of-ignore-dup-key.html
[3]: https://michaeljswart.com/2017/07/sql-server-upsert-patterns-and-antipatterns/
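For anyone who hasn't seen the option in use, here is a minimal sketch of the "just insert and ignore duplicates" approach from the question. The temporary table `#Inventory`, the `@UserIDs` table variable, and the value of `@MaxInventory` are all made up for illustration; only the column names and the update statement come from the question. Treat it as a demonstration of the behaviour, not a recommendation.

```sql
-- Hypothetical demo table modelled on the [Inventory] table in the question.
-- DROP TABLE IF EXISTS requires SQL Server 2016 or later.
DROP TABLE IF EXISTS #Inventory;

CREATE TABLE #Inventory
(
    ID      integer NOT NULL
        PRIMARY KEY CLUSTERED WITH (IGNORE_DUP_KEY = ON),
    [Count] integer NOT NULL DEFAULT (0)
);

-- Hypothetical set of users whose rows must exist before the update.
DECLARE @UserIDs table (ID integer NOT NULL PRIMARY KEY);
INSERT @UserIDs (ID) VALUES (1), (2), (3);

-- First run: none of the keys exist yet, so all three rows are inserted.
INSERT #Inventory (ID)
SELECT U.ID
FROM @UserIDs AS U;

-- Second run: every key already exists. The duplicate rows are discarded
-- with the informational message "Duplicate key was ignored." and the
-- statement completes without raising an error.
INSERT #Inventory (ID)
SELECT U.ID
FROM @UserIDs AS U;

-- Assumed limit, for illustration only.
DECLARE @MaxInventory integer = 10;

UPDATE #Inventory
SET [Count] = [Count] + 1
WHERE ID = 3
AND ([Count] + 1) <= @MaxInventory;

-- The row is known to exist, so zero here would mean the [Count]
-- part of the predicate was not satisfied.
SELECT @@ROWCOUNT AS RowsUpdated;
```

Whether this is faster than an existence check is exactly what the linked posts discuss, so it is worth measuring on your own system.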
Every feature has a use case, and `IGNORE_DUP_KEY` is no exception.

As I explain in [IGNORE_DUP_KEY slower on clustered indexes][1], this option only potentially benefits performance (on a clustered index) when the number of duplicate key exceptions is sufficiently small.

The break-even point is system-dependent, and requires realistic testing to properly evaluate. Try not to make assumptions or judgements in advance. Test each of the robust implementation alternatives carefully, and choose the one that makes most sense in your scenario.

As an example of a reason for a performance surprise, consider that the index insertion point has to be located at some stage, so repeating that operation may be cheaper than expected.

When SQL Server can use the ["rowset sharing" optimization][2], there is even less overhead. The engine locates the insertion point once when checking for existence, and keeps that direct reference for the insert operation.

Though not directly relevant to the question, I should mention some caveats to using `IGNORE_DUP_KEY`:

* It is not respected by the insert activity of `MERGE` statements.
* It cannot be added to an existing index.
* It is not as 'discoverable' as an explicit existence test.

I generally prefer separate statements to `MERGE`, but you should test it for your intended usage. `MERGE` can combine an insert and update into a single statement, while taking advantage of the [hole-filling optimization][3] and *rowset sharing*.

[1]: https://sqlperformance.com/2019/04/sql-performance/ignore_dup_key-slower-clustered-indexes
[2]: https://sqlperformance.com/2015/12/sql-plan/optimizing-update-queries
[3]: https://sqlperformance.com/2013/02/sql-plan/halloween-problem-part-3
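To make "explicit existence test" concrete, one of the alternatives you might include in your testing looks something like the sketch below. The temporary table `#Inventory`, the `@UserIDs` table variable, the value of `@MaxInventory`, and the choice of `UPDLOCK` and `SERIALIZABLE` hints are assumptions made for illustration; this is one possible robust formulation, not the only one.

```sql
-- Hypothetical demo table: same shape as the question's [Inventory],
-- but WITHOUT IGNORE_DUP_KEY on the primary key.
-- DROP TABLE IF EXISTS requires SQL Server 2016 or later.
DROP TABLE IF EXISTS #Inventory;

CREATE TABLE #Inventory
(
    ID      integer NOT NULL PRIMARY KEY CLUSTERED,
    [Count] integer NOT NULL DEFAULT (0)
);

-- Hypothetical set of users whose rows must exist before the update.
DECLARE @UserIDs table (ID integer NOT NULL PRIMARY KEY);
INSERT @UserIDs (ID) VALUES (1), (2), (3);

-- Assumed limit, for illustration only.
DECLARE @MaxInventory integer = 10;

BEGIN TRANSACTION;

-- Insert only the rows that do not exist yet.
-- On a shared permanent table, the UPDLOCK and SERIALIZABLE hints are
-- intended to stop a concurrent session inserting the same key between
-- the existence check and the insert. They make no practical difference
-- on a session-private temporary table like this one.
INSERT #Inventory (ID)
SELECT U.ID
FROM @UserIDs AS U
WHERE NOT EXISTS
(
    SELECT 1
    FROM #Inventory AS I WITH (UPDLOCK, SERIALIZABLE)
    WHERE I.ID = U.ID
);

UPDATE #Inventory
SET [Count] = [Count] + 1
WHERE ID = 3
AND ([Count] + 1) <= @MaxInventory;

-- The row is known to exist, so zero here would mean the [Count]
-- part of the predicate was not satisfied.
SELECT @@ROWCOUNT AS RowsUpdated;

COMMIT TRANSACTION;
```

Compare the execution plans and timings of this form against the `IGNORE_DUP_KEY` version, using your real table and a realistic proportion of existing rows.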