Here are three simple tests that read the same data, yet report very different logical reads:

### Setup

The following script creates a test table with 100 identical rows, each containing an *xml* column with enough data to ensure it is stored off row. In my test database, the length of the *xml* generated is 20,204 bytes for each row.

```
-- Conditional drop
IF OBJECT_ID(N'dbo.XMLTest', N'U') IS NOT NULL
    DROP TABLE dbo.XMLTest;
GO

-- Create test table
CREATE TABLE dbo.XMLTest
(
    ID integer IDENTITY PRIMARY KEY,
    X xml NULL
);
GO

-- Add 100 wide xml rows
DECLARE @X xml;

SET @X =
(
    SELECT TOP (100) *
    FROM sys.columns AS C
    FOR XML PATH ('row'), ROOT ('root'), TYPE
);

INSERT dbo.XMLTest (X)
SELECT TOP (100) @X
FROM sys.columns AS C;

-- Flush dirty buffers
CHECKPOINT;
```

### Tests

The following three tests read the *xml* column with:

1. A plain `SELECT` statement
2. Assigning the *xml* to a variable
3. Using `SELECT INTO` to create a temporary table

```
-- No row count messages or graphical plan
-- Show I/O statistics
SET NOCOUNT ON;
SET STATISTICS XML OFF;
SET STATISTICS IO ON;
GO

PRINT CHAR(10) + '=== Plain SELECT ===='
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;

SELECT XT.X
FROM dbo.XMLTest AS XT;
GO

PRINT CHAR(10) + '=== Assign to a variable ===='
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;

DECLARE @X xml;

SELECT @X = XT.X
FROM dbo.XMLTest AS XT;
GO

PRINT CHAR(10) + '=== SELECT INTO ===='
IF OBJECT_ID(N'tempdb..#T', N'U') IS NOT NULL
    DROP TABLE #T;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;

SELECT XT.X
INTO #T
FROM dbo.XMLTest AS XT;
GO

SET STATISTICS IO OFF;
```

### Results

The output is:

```none
=== Plain SELECT ====

Table 'XMLTest'. Scan count 1, logical reads 3, physical reads 1, read-ahead reads 0, lob logical reads 795, lob physical reads 37, lob read-ahead reads 796.

=== Assign to a variable ====

Table 'XMLTest'. Scan count 1, logical reads 3, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

=== SELECT INTO ====

Table 'XMLTest'. Scan count 1, logical reads 3, physical reads 1, read-ahead reads 0, lob logical reads 300, lob physical reads 37, lob read-ahead reads 400.
```

### Questions

* Why are the LOB reads so different?
* Surely the exact same data was read in each test?
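(For context, and not part of the tests above: a sketch of how the number of LOB pages actually allocated to the table can be checked, using the `lob_used_page_count` column of `sys.dm_db_partition_stats`.)

```
-- Count LOB pages allocated to dbo.XMLTest
-- (lob_used_page_count covers the off-row xml allocation units)
SELECT
    SUM(PS.lob_used_page_count) AS lob_pages
FROM sys.dm_db_partition_stats AS PS
WHERE PS.[object_id] = OBJECT_ID(N'dbo.XMLTest', N'U');
```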
Not all reads are equal. SQL Server knows that accessing LOB data is expensive, and tries to avoid it when possible. There are also detailed differences in the way the LOB data is read in each case:

### Summary

The numbers are different because:

* The plain select reads the LOB in packet-sized *chunks*
* The variable assignment test *does not read* the LOB at all
* The "select into" test reads the LOB in *whole pages*

### Detail

1. **Plain `SELECT`**

    [![Select plan][1]][1]

    The Clustered Index Scan does not read any LOB data. It only assigns a storage engine LOB *handle*. The handle is not used until control returns to the root of the plan, where the current row's LOB content is read in TDS packet-sized chunks and streamed to the client.

    Logical reads count the number of times a page is touched, so:

    *The number of reads reported equals the number of chunked reads performed, plus one for each LOB page transition.*

    A logical read is counted at the start of each chunk, as the process touches the page corresponding to the current position of the stream. Where packets are smaller than a database page (the usual case), several logical reads are counted for the same page. If the packet size were so large that the entire LOB fit in one chunk, the number of logical reads reported would equal the number of LOB pages.

2. **Variable assignment**

    [![Variable plan][2]][2]

    The Clustered Index Scan assigns a LOB *handle* as before. At the root of the plan, the LOB handle is copied to the variable. The LOB data itself is never accessed (zero LOB reads), because the variable is never read. Even if it were, the data would only be fetched via the LOB handle last assigned.

    *There are no LOB reads because the LOB data is never accessed.*

3. **`SELECT INTO`**

    [![Select Into Plan][3]][3]

    This plan uses the bulk rowset provider to copy the LOB data from the source table to the new table. It processes a complete LOB page on each read (no streaming or chunking).

    *The number of logical reads corresponds to the number of LOB pages in the test table.*

  [1]: https://i.stack.imgur.com/pGEAe.png
  [2]: https://i.stack.imgur.com/V8cwI.png
  [3]: https://i.stack.imgur.com/3DlKX.png
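One way to probe the deferred-read behaviour in the variable assignment case is to actually consume the variable after the scan. This is a sketch, not one of the original tests: if the last-assigned LOB handle is only dereferenced when the variable is read, any LOB reads should surface against the statement that consumes the variable, not the assignment itself.

```
SET STATISTICS IO ON;
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS;

DECLARE @X xml;

-- Assignment only: zero lob logical reads, as in test 2
SELECT @X = XT.X
FROM dbo.XMLTest AS XT;

-- Consuming the variable dereferences the last-assigned LOB handle
SELECT DATALENGTH(@X) AS xml_bytes;

SET STATISTICS IO OFF;
```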