SQL Server table columns under the hood

October 20th, 2011

You probably can easily answer a question like ‘What columns does this table have?’. Whether you use the SSMS object explorer, or sp_help, or you query sys.column, the answer is fairly easy to find. But what is I ask ‘What are the physical columns of this table?’. Huh? Is there any difference? Lets see.

At the logical layer tables have exactly the structure you declare it in your CREATE TABLE statement, and perhaps modifications from ALTER TABLE statements. This is the layer at which you can look into sys.columns and see the table structure, or look at the table in SSMS object explorer and so on and so forth. But there is also a lower layer, the physical layer of the storage engine where the table might have surprisingly different structure from what you expect.

Inspecting the physical table structure

To view the physical table structure you must use the undocumented system internals views: sys.system_internals_partitions and sys.system_internals_partition_columns:

Read the rest of this entry »

Understanding Hash, Sort and Exchange Spill events

October 19th, 2011

Certain SQL Server query execution operations are calibrated to perform best by using a (somewhat) large amount of memory as intermediate storage. The Query Optimizer will choose a plan and estimate the cost based on these operators using this memory scratch-pad. But this is, of course, only an estimate. At execution the estimates may prove wrong and the plan must continue despite not having enough memory. In such an event, these operators spill to disk. When a spill occurs, the memory scratch-pad is flushed into tempdb and more data is accommodated in the (now) free memory. When the data flushed to tempdb is needed again, is read from disk. Needless to say, spilling into tempdb is order of magnitude slower than using just the memory scratch-pad. Monitoring for spilling is specially important in ETL jobs, since these occurrences may cause the ETL job to stretch for many more minutes, sometimes even hours. For an exhaustive discussion of ETL, including some references to spills, see The Data Loading Performance Guide.

Read the rest of this entry »

T-SQL functions do no imply a certain order of execution

August 10th, 2011

Looking at this question on StackOverflow: Conversion failed when converting from a character string to uniqueidentifier error in SQL Server one can see a reproducible example where a string split UDF works fine in the SELECT statement, but it gives a conversion error in the DELETE statement. Certainly, a bug, right? Actually, not.

The issue at hand is in fact remarkably similar to another common misconception around T-SQL I had to debunk some time ago, see On SQL Server boolean operator short-circuit: that C like operator short-circuit is guaranteed in T-SQL (hint: it isn’t, read the linked article to see a clear counter example).

In the StackOverlow post the misconception is that order of declaration implies order of execution, that the function is evaluated somehow separately from the rest of the query and some sort of temporary result is created that is then used in the overall query execution. This understanding comes naturally to the imperative procedural language mindset of developers trained in C, C++, C# and other similar languages. But in the SQL Server declarative language that is T-SQL, your intuition is actually wrong. To illustrate I will give a simple counter-example, reusing the code from my earlier boolean short-circuit article:

Read the rest of this entry »

Online Index Operations for indexes containing LOB columns

August 5th, 2011

SQL Server supports online index and table rebuild operations which allow for maintenance operations to occur w/o significant downtime. While a table is being rebuild, or a new index is being built on it, the table is fully utilizable. It can be queried and any updates done to the table while the online rebuild operation is occurring will be contained in the final rebuilt table. A detailed explanation on how these online rebuild operations work can be found in the Online Indexing Operations in SQL Server 2005 white paper. But Online Index Build operations in SQL Server 2005, 2008 and 2008 R2 do not support tables that contain LOB columns, attempting to do so would trigger an error:

Read the rest of this entry »

How to Multicast messages with SQL Server Service Broker

July 20th, 2011

Starting with SQL Server 11 the the SEND verb has a new syntax and accepts multiple dialogs handles to send on:

   ON CONVERSATION [(]conversation_handle [,.. @conversation_handle_n][)]
   [ MESSAGE TYPE message_type_name ]
   [ ( message_body_expression ) ]
[ ; ]

Read the rest of this entry »

Online non-NULL with values column add in SQL Server 2012

July 13th, 2011

Prior to SQL Server 2012 when you add a new non-NULLable column with default values to an existing table a size-of data operation occurs: every row in the table is updated to add the default value of the new column. For small tables this is insignificant, but for large tables this can be so problematic as to completely prohibit the operation. But starting with SQL Server 2012 the operation is, in most cases, instantaneous: only the table metadata is changed, no rows are being updated.

Lets look at a simple example, we’ll create a table with some rows and then add a non-NULL column with default values. First create and populate the table:

Read the rest of this entry »

How to update a table with a columnstore index

July 13th, 2011

In my previous article How to use columnstore indexes in SQL Server we’ve seen how to create a columnstore index on a table and how certain queries can significantly reduce the IO needed and thus increase in performance by leveraging this new feature. But once a columnstore index is added to a table the table becomes read-only as it cannot be updated. Trying to insert a new row in the table will result in an error:

insert into sales ([date],itemid, price, quantity) values ('20110713', 1,1.0,1);

Msg 35330, Level 15, State 1, Line 1
INSERT statement failed because data cannot be updated in a table with a columnstore index. Consider disabling the columnstore index before issuing the INSERT statement, then rebuilding the columnstore index after INSERT is complete.

Read the rest of this entry »

How to use columnstore indexes in SQL Server

July 13th, 2011

Column oriented storage is the data storage of choice for data warehouse and business analysis applications. Column oriented storage allows for a high data compression rate and as such it can increase processing speed primarily by reducing the IO needs. Now SQL Server allows for creating column oriented indexes (called COLUMNSTORE indexes) and thus brings the benefits of this highly efficient BI oriented indexes in the same engine that runs the OLTP workload. The syntax for creating columnstore indexes is described on MSDN at CREATE COLUMNSTORE INDEX. Lets walk trough a very simple example of how to create and use a columnstore index. First lets have a dummy sales table:

Read the rest of this entry »

Download and Upload images from SQL Server via ASP.Net MVC

December 28th, 2010

A frequent question that pops up on discussion forums is how to serve an image that is stored in a SQL Server table from an ASP.Net application. Unfortunately the answer is almost always wrong, as the prevalent solution involves copying the entire image file into memory before returning it to the client. This solution works fine when tested with a light load and returning few small images. But in production environment the memory consumption required by all those image files stored as byte arrays in memory causes serious performance degradation. A good solution must use streaming semantics, transferring the data in small chunks from the SQL Server to the HTTP returned result.

The SqlClient components do offer streaming semantics for large result sets, including large BLOB fields, but the client has to specifically ask for it. The ‘secret ingredient’ is the passing in the CommandBehavior.SequentialAccess flag to the SqlCommand.ExecuteReader:

Provides a way for the DataReader to handle rows that contain columns with large binary values. Rather than loading the entire row, SequentialAccess enables the DataReader to load data as a stream. You can then use the GetBytes or GetChars method to specify a byte location to start the read operation, and a limited buffer size for the data being returned.

An ASP.Net MVC virtual Media folder backed by SQL Server

Lets say we want to have a virtual Media folder in an ASP.Net MVC site, serving the files from a SQL Server database. A GET request for an URL like "http://site/Media/IMG0042.JPG" should return the content of the file named IMG0042.JPG from the database. A POST request to the URL "http://site/Media" which contains an embedded file should insert this new file in the database and redirect the response to the newly added file virtual path. This is how our upload HTML form looks like:

Read the rest of this entry »

SqlDependency based caching of LINQ Queries

August 4th, 2010

Query Notifications is the SQL Server feature that allows a client to subscribe to notifications that are sent when data in the database changes irrelevant of how that change occurs. I have talked before about how Query Notifications works in the article The Mysterious Notification. This feature was designed specifically for client side cache invalidation: the applications runs a query, gets back the result and stores it in the cache. Whenever the result is changed because data was updated, the application will be notified and it can invalidate the cached result.

Leveraging Query Notifications from the managed clients is very easy due to the dedicated SqlDependency class that takes care of a lot of the details needed to be set up in place in order to be able to receive these notifications. But the MSDN examples and the general community know how with SqlDepenendency is geared toward straight forward usage, by attaching it to a SqlCommand object.

Read the rest of this entry »