What is the purpose of this Uniq1002 column in this index scan?

sql-server sql-server-2012 add tag

Hannah Vernon (imported from SE)

Take the following repro:

```
USE tempdb;

IF OBJECT_ID(N'dbo.t', N'U') IS NOT NULL
DROP TABLE dbo.t
GO
CREATE TABLE dbo.t
(
    id int NOT NULL 
        PRIMARY KEY 
        NONCLUSTERED 
        IDENTITY(1,1)
    , col1 datetime NOT NULL
    , col2 varchar(800) NOT NULL
    , col3 tinyint NULL
    , col4 sysname NULL
);

INSERT INTO dbo.t (
      col1
    , col2
    , col3
    , col4
    ) 
SELECT TOP(100000) 
      CONVERT(datetime, 
         DATEADD(DAY, CONVERT(int, CRYPT_GEN_RANDOM(1)), '2000-01-01 00:00:00'))
    , replicate('A', 800)
    , sc2.bitpos
    , CONVERT(sysname, CHAR(65 + CRYPT_GEN_RANDOM(1) % 26) 
        + CHAR(65 + CRYPT_GEN_RANDOM(1) % 26) 
        + CHAR(65 + CRYPT_GEN_RANDOM(1) % 26))
FROM sys.syscolumns sc
    CROSS JOIN sys.syscolumns sc2;
```

Here I'm adding a clustered index onto a set of columns that are not unique, and typical single-column non-clustered index:

```
CREATE CLUSTERED INDEX t_cx 
ON dbo.t (col1, col2, col3);

CREATE INDEX t_c1 ON dbo.t(col4); 
```

This query forces SQL Server to do a lookup into the clustered index.  Please forgive the use of the index hint, it was the quickest way to get the repro:

```
SELECT id
    , col1
    , col2
    , col3
FROM dbo.t aad WITH (INDEX = t_c1)
WHERE col4 = N'JSB'
    AND col1 > N'2019-05-30 00:00:00';
```

The [actual query plan](https://www.brentozar.com/pastetheplan/?id=ryfqhopaV) shows a non-existent column in the Output List for the nonclustered index scan:

[![enter image description here][1]][1]

Ostensibly, this represents the uniqifier used in the non-unique clustered index.  Is that the case?  Is a column named like that *always* the clustered index uniqifier?


  [1]: https://i.stack.imgur.com/CKANG.png

Top Answer

Paul White (imported from SE)

>Ostensibly, this represents the uniqifier used in the non-unique clustered index. Is that the case?

Yes.

>What is the purpose of this Uniq1002 column in this index scan?

Each row in the nonclustered index must be associated with *exactly one row* in the base table so that *Bookmark Lookups* (RID or Key) work correctly. This mapping is provided by the "row locator".

For heap tables, the row locator is the RID. For clustered row store tables, it is the clustering key (including the *uniquifier* where necessary).

For the *Key Lookup* in your plan to work, it must have access to the row locator. This includes the *uniquifier*, so it must be emitted by the nonclustered index scan.

The *uniquifier* is stored in the variable-length portion of the row so it only takes up space when needed (i.e. when a duplicate key actually exists).

>Is a column named like that always the clustered index uniquifier?

Yes. The uniquifier column is always named `UniqXXXX`. The row locator associated with heap tables is named `BmkXXXX`. The row locator for a columnstore table is named `ColStoreLocXXXX`.

---

## Observing the uniquifier

It is possible to directly observe the values of the uniquifier on SQL Server versions that contain a functional [`query_trace_column_values` Extended Event][1].

This undocumented and unsupported event is in the *Debug* channel. It was introduced with SQL Server 2016, and stopped working around CU11 of SQL Server 2017.

For example:

```
CREATE TABLE #T (c1 integer NULL INDEX ic1 CLUSTERED, c2 integer NULL INDEX ic2 UNIQUE, c3 integer NULL);
GO
INSERT #T
    (c1, c2, c3)
VALUES 
    (100, 101, 0),
    (100, 102, 1),
    (100, 103, 2);
GO
DBCC TRACEON (2486);
SET STATISTICS XML ON;
SELECT T.* FROM #T AS T WITH (INDEX(ic2));
SET STATISTICS XML OFF;
DBCC TRACEOFF (2486);
GO
DROP TABLE #T;
```

Has the plan:

[![Plan][2]][2]

It produces event output like the following on SQL Server 2016:

[![Event output][3]][3]


  [1]: https://www.sqlshack.com/query-trace-column-values/
  [2]: https://i.stack.imgur.com/e1liR.png
  [3]: https://i.stack.imgur.com/W5knS.png

Answer #2

Hannah Vernon (imported from SE)

In order for SQL Server to create a *non-unique* clustered index, a hidden "column" is added to the physical structure of the clustered index.  That hidden column is known as the uniqifier, and as its name implies, provides a mechanism to ensure that every row in the clustered index is unique.

When you see that column show up in a query plan, it's a great indicator that the clustering key columns have not been defined as unique.  Possibly that's because the combination of columns is known to be not unique.  It's also possible the designer of the table simply forgot to add the `UNIQUE` qualifier to the `CREATE CLUSTERED INDEX` statement.

In fact, if we recreate the repro above with a unique clustered index, the `Uniq1002` column no longer appears in the query plan:

```
USE tempdb;

IF OBJECT_ID(N'dbo.t', N'U') IS NOT NULL
DROP TABLE dbo.t
GO
CREATE TABLE dbo.t
(
    id int NOT NULL 
        PRIMARY KEY 
        NONCLUSTERED 
        IDENTITY(1,1)
    , col1 datetime NOT NULL
    , col2 varchar(800) NOT NULL
    , col3 int NULL
    , col4 sysname NULL
);

INSERT INTO dbo.t (
      col1
    , col2
    , col3
    , col4
    ) 
SELECT TOP(100000) 
      CONVERT(datetime, DATEADD(DAY, CONVERT(int, CRYPT_GEN_RANDOM(1)), '2000-01-01 00:00:00'))
    , replicate('A', 800)
    , CONVERT(int, CRYPT_GEN_RANDOM(4))
    , CONVERT(sysname, CHAR(65 + CRYPT_GEN_RANDOM(1) % 26) 
        + CHAR(65 + CRYPT_GEN_RANDOM(1) % 26) 
        + CHAR(65 + CRYPT_GEN_RANDOM(1) % 26))
FROM sys.syscolumns sc
    CROSS JOIN sys.syscolumns sc2;
```


Here's the UNIQUE clustered index:

```
CREATE UNIQUE CLUSTERED INDEX t_cx 
ON dbo.t (col1, col2, col3);

CREATE INDEX t_c1 ON dbo.t(col4); 
```

And the query:
```
SELECT id
    , col1
    , col2
    , col3
FROM dbo.t aad WITH (INDEX = t_c1)
WHERE col4 = N'JSB'
    AND col1 > N'2019-05-30 00:00:00';
```

The plan now shows this for the non-clustered index scan output columns:

[![enter image description here][1]][1]

When a non-unique clustered index is created, the uniqifier is automatically added.  The uniqifier is also added to every non-clustered index, even though you can't "see" it by looking at the properties of the index, or by "scripting" the index.

The uniqifier is a four-byte column containing an integer that is automatically incremented behind the scenes for each row inserted into the table.  The first row inserted doesn't require a uniqifier; only rows added after the first row have the uniqifier present.


  [1]: https://i.stack.imgur.com/DrXA6.png

2 Answers