Why does SQL Server require the datatype length to be the same when using UNPIVOT? - Databases

TopAnswers Databases

Meta

Databases

TeX

Code Golf

APL

C++

.net

db<>fiddle

Java

*nix

PHP

PowerShell

Python

Rust

टेक्-मराठी

Typst

Web Client Dev

Web Server Dev

Why does SQL Server require the datatype length to be the same when using UNPIVOT?

sql-server add tag

taryn (imported from SE)

When applying the [`UNPIVOT`][1] function to data that is not normalized, SQL Server requires that the datatype and length be the same.  I understand why the datatype must be the same but why does UNPIVOT require the length to be the same?

Let's say that I have the following sample data that I need to unpivot:

    CREATE TABLE People
    (
    	PersonId int, 
    	Firstname varchar(50), 
    	Lastname varchar(25)
    )
    
    INSERT INTO People VALUES (1, 'Jim', 'Smith');
    INSERT INTO People VALUES (2, 'Jane', 'Jones');
    INSERT INTO People VALUES (3, 'Bob', 'Unicorn');

If I attempt to UNPIVOT the `Firstname` and `Lastname` columns similar to:

    select PersonId, ColumnName, Value  
    from People
    unpivot
    (
      Value 
      FOR ColumnName in (FirstName, LastName)
    ) unpiv;

SQL Server generates the error:

> Msg 8167, Level 16, State 1, Line 6

>The type of column "Lastname" conflicts with the type of other columns specified in the UNPIVOT list.

In order to resolve the error, we must use a subquery to first cast the `Lastname` column to have the same length as `Firstname`:

    select PersonId, ColumnName, Value  
    from
    (
      select personid, 
        firstname, 
        cast(lastname as varchar(50)) lastname
      from People
    ) d
    unpivot
    (
      Value FOR 
      ColumnName in (FirstName, LastName)
    ) unpiv;

See [SQL Fiddle with Demo][2]


Prior to UNPIVOT being introduced in SQL Server 2005, I would use a `SELECT` with `UNION ALL` to unpivot the `firstname`/`lastname` columns and the query would run without the need to convert the columns to the same length:

    select personid, 'firstname' ColumnName, firstname value
    from People
    union all
    select personid, 'LastName', LastName
    from People;

See [SQL Fiddle with Demo][3].  

We are also able to successfully unpivot the data using `CROSS APPLY` without having the same length on the datatype:

    select PersonId, columnname, value
    from People
    cross apply
    (
    	select 'firstname', firstname union all
    	select 'lastname', lastname
    ) c (columnname, value);

See [SQL Fiddle with Demo][4].

I have read through MSDN but I didn't find anything explaining the reasoning for forcing the length on the datatype to be the same.  

What is the logic behind requiring the same length when using UNPIVOT?  


  [1]: http://msdn.microsoft.com/en-us/library/ms177410%28v=sql.105%29.aspx
  [2]: http://sqlfiddle.com/#!3/e49ef/2
  [3]: http://sqlfiddle.com/#!3/e49ef/3
  [4]: http://sqlfiddle.com/#!3/e49ef/8

Top Answer

Paul White

>*What is the logic behind requiring the same length when using UNPIVOT?*

This question may only be truly answerable by the people who worked on the implementation of `UNPIVOT`. You might be able to obtain this by [contacting them for support][1]. The following is my understanding of the reasoning, which may not be 100% accurate:

---

T-SQL contains any number of instances of weird semantics and other counter-intuitive behaviours. Some of these will eventually go away as part of deprecation cycles, but others may never be 'improved' or 'fixed'. Quite aside from anything else, applications exist that depend on these behaviours, so backward compatibility has to be preserved.

The rules for implicit conversions, and expression type derivation account for a significant proportion of the weirdness mentioned above. I do not envy the testers who have to ensure that the weird (and often undocumented) behaviours are preserved (under all combinations of `SET` session values and so on) for new versions.

That said, there is no good reason not to make improvements, and avoid past mistakes, when introducing new language features (with obviously no backward compatibility baggage). New features like recursive common table expressions (as mentioned by [Andriy M][2] in a comment) and `UNPIVOT` were free to have relatively sane semantics and clearly-defined rules.

There will be a range of views as to whether including the length in the type is taking explicit typing too far, but personally I welcome it. In my view, the types `varchar(25)` and `varchar(50)` are **not** the same, any more than `decimal(8)` and `decimal(10)` are. Special casing string type conversion complicates things unnecessarily and adds no real value, in my opinion.

One could argue that only implicit conversions that might lose data should be required to be explicitly stated, but there are edge-cases there too. Ultimately, a conversion is going to be needed, so we might as well make it explicit.

If the implicit conversion from `varchar(25)` to `varchar(50)` were allowed, it would just be another (most likely hidden) implicit conversion, with all the usual weird edge cases and `SET` setting sensitivities. Why not make the implementation the simplest and most explicit possible? (Nothing is perfect, however, and it is a shame that hiding `varchar(25)` and `varchar(50)` inside a `sql_variant` is allowed.)

Rewriting the `UNPIVOT` with `APPLY` and `UNION ALL` avoids the (better) type behaviour because the rules for `UNION` are subject to backward compatibility, and are documented in Books Online as allowing different types so long as they are comparable using implicit conversion (for which the arcane rules of data type precedence are used, and so on).

The workaround involves being explicit about the data types and adding explicit conversions where necessary. This looks like progress to me :)

One way to write the explicitly-typed workaround:

```sql
SELECT
    U.PersonId,
    U.ColumnName,
    U.Value
FROM dbo.People AS P
CROSS APPLY
(
    VALUES (CONVERT(varchar(50), Lastname))
) AS CA (Lastname)
UNPIVOT
(
    Value FOR
    ColumnName IN (P.Firstname, CA.Lastname)
) AS U;
```

Recursive CTE example:

```
-- Fails
WITH R AS
(
    SELECT Dummy = 'A row'
    UNION ALL
    SELECT 'Another row'
    FROM R
    WHERE Dummy = 'A row'
)
SELECT Dummy
FROM R;
    
-- Succeeds
WITH R AS
(
    SELECT Dummy = CONVERT(varchar(11), 'A row')
    UNION ALL
    SELECT CONVERT(varchar(11), 'Another row')
    FROM R
    WHERE Dummy = 'A row'
)
SELECT Dummy
FROM R;
```

Finally, note that the rewrite using `CROSS APPLY` in the question is not quite the same as the `UNPIVOT`, because it does not reject `NULL` attributes.


  [1]: https://docs.microsoft.com/en-us/sql/sql-server/sql-server-get-help
  [2]: https://dba.stackexchange.com/users/6965/andriy-m

1 Answer