<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>rusanu.com</title>
	<atom:link href="http://rusanu.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://rusanu.com</link>
	<description>RUSANU CONSULTING LLC</description>
	<lastBuildDate>Tue, 11 Jun 2013 21:07:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>SQL Server Clustered Columnstore Indexes at TechEd 2013</title>
		<link>http://rusanu.com/2013/06/11/sql-server-clustered-columnstore-indexes-at-teched-2013/</link>
		<comments>http://rusanu.com/2013/06/11/sql-server-clustered-columnstore-indexes-at-teched-2013/#comments</comments>
		<pubDate>Tue, 11 Jun 2013 13:31:05 +0000</pubDate>
		<dc:creator>remus</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[Columnstore]]></category>
		<category><![CDATA[SQL 2014]]></category>

		<guid isPermaLink="false">http://rusanu.com/?p=1799</guid>
		<description><![CDATA[Now that the TechEd 2013 presentations are up online on Channel9 you can check out Brian Mitchell&#8217;s session What&#8217;s New for Columnstore Indexes and Batch Mode Processing. Brian does a great job at presenting the new updatable clustered columnstore indexes and the enhancements done to the vectorized query execution (aka. batch mode). Besides the TechEd [...]]]></description>
			<content:encoded><![CDATA[<p>Now that the TechEd 2013 presentations are <a href="http://channel9.msdn.com/Events/TechEd/NorthAmerica/2013" target="_blank">up online on Channel9</a> you can check out Brian Mitchell&#8217;s session <a href="http://channel9.msdn.com/Events/TechEd/NorthAmerica/2013/DBI-B322" target="_blank">What&#8217;s New for Columnstore Indexes and Batch Mode Processing</a>. Brian does a great job of presenting the new updatable clustered columnstore indexes and the enhancements made to the vectorized query execution (a.k.a. batch mode). Besides the TechEd presentation there is another excellent resource available online right now for your education on the topic: the <a href="http://www.sigmod.org/2013/" target="_blank">SIGMOD 2013</a> paper <a href="http://research.microsoft.com/apps/pubs/default.aspx?id=193599" target="_blank">Enhancements to SQL Server Column Stores</a>. Besides the obvious updatability, this paper cites some more improvements that are available in clustered columnstores:
<ul>
<li><a href="#indexbuild">Improved Index Build</a></li>
<li><a href="#sampling">Sampling Support</a></li>
<li><a href="#bookmark">Bookmark Support</a></li>
<li><a href="#ddl">Schema modification support</a></li>
<li><a href="#shortstrings">Support for short strings</a></li>
<li><a href="#mixed">Mixed Execution Mode</a></li>
<li><a href="#hash">Hash Join support</a></li>
<li><a href="#hash">Improvements in Bitmap filters</a></li>
<li><a href="#archival">Archival support</a></li>
</ul>
<p>It should be no surprise to anyone studying columnar storage that the updatable clustered columnstores coming with the next version of SQL Server are based on deltastores. I talked before about the <a href="http://rusanu.com/2012/05/29/inside-the-sql-server-2012-columnstore-indexes/">SQL Server 2012 Columnstore internals</a> and I explained why the highly compressed format that makes columnar storage so fast also makes it basically impossible to update in-place. The technique of having a &#8216;deltastore&#8217; which stores updates and, during scans, merging the updates with the columnar data is not new and is employed by several of the columnar storage solution vendors.</p>
<p><!-- more --></p>
<h1>Deltastores</h1>
<p class="callout float-right">Deltastores are ordinary B-Trees that store uncompressed row groups of the clustered columnstore</p>
<p>Columnstores introduce a new unit of organization, a row-group. A row-group is a logical unit that groups up to 2<sup>20</sup> rows (about 1 million rows). In SQL Server 2012 the row-groups were implicit and there was no catalog view to show them. As you can see in Brian&#8217;s presentation, SQL Server 2014 adds a new catalog view: <tt>sys.column_store_row_groups</tt>. This catalog view shows the state of each row group for all columnstores (including non-clustered ones). Updatable clustered columnstores can show the row groups in COMPRESSED state or in OPEN/CLOSED state. The OPEN/CLOSED row groups are deltastores (yes, there could be multiple deltastores per columnstore, see the above mentioned SIGMOD 2013 paper). OPEN row groups are ready to accept more inserts while CLOSED row groups have filled up and are awaiting compression. The structure of a deltastore is explained in the SIGMOD paper:</p>
<blockquote><p>A delta store contains the same columns as the corresponding column store index. The B-tree key is a unique integer row ID generated by the system (column stores do not have unique keys).</p></blockquote>
<p>If you wonder what a <i>unique integer row ID generated by the system</i> actually is, remember how <a href="http://msdn.microsoft.com/en-us/library/ms190639(v=sql.105).aspx" target="_blank">uniqueifier columns</a> work. Deltastores are managed entirely by the engine; there is no DDL to control the creation and deletion of deltastores. The engine creates a new deltastore whenever it needs one to handle inserts and closes it when full (at 1 million rows), and a background process called the Tuple Mover compresses these closed deltastores into columnar storage format.</p>
<p>When handling deltastores the columnar storage advantages are dimmed. Deltastores are row mode storage so the entire row has to be read, not only the column(s) of interest. Segment elimination does not occur for deltastores since the deltastores do not have metadata about min and max values contained inside each column. Parallel scans will distribute the deltastores among the threads so that multiple deltastores are scanned in parallel, but there is no attempt at parallelism inside a single deltastore. At a maximum size of 1 million rows they&#8217;re simply too small to justify the engineering complications of handling parallelism inside the deltastores. All this is explained in the SIGMOD paper:</p>
<blockquote><p>Parallel scans assign each delta store to a single thread of execution. A single delta store is too small to justify scanning in parallel but multiple delta stores can be scanned in parallel. Scanning delta stores is slower than scanning data in columnar format because complete records have to be read and not just the columns needed by the query.</p></blockquote>
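<p>To make the OPEN/CLOSED life cycle concrete, here is a minimal sketch in Python. It is a toy model, not how the engine is implemented: the <tt>RowGroup</tt> and <tt>ToyColumnstore</tt> names are hypothetical, and the demo uses a tiny capacity instead of the real 2<sup>20</sup> rows.</p>

```python
from dataclasses import dataclass, field

ROW_GROUP_CAPACITY = 2 ** 20    # rows per row group, per the text above


@dataclass
class RowGroup:
    state: str = "OPEN"         # OPEN, CLOSED or COMPRESSED
    rows: list = field(default_factory=list)


class ToyColumnstore:
    """Hypothetical model of deltastore creation and closing."""

    def __init__(self, capacity=ROW_GROUP_CAPACITY):
        self.capacity = capacity
        self.row_groups = []

    def trickle_insert(self, row):
        # Reuse the current OPEN deltastore, or create one on demand.
        open_groups = [g for g in self.row_groups if g.state == "OPEN"]
        group = open_groups[0] if open_groups else self._new_deltastore()
        group.rows.append(row)
        if len(group.rows) == self.capacity:
            group.state = "CLOSED"      # full: now awaits the Tuple Mover

    def _new_deltastore(self):
        group = RowGroup()
        self.row_groups.append(group)
        return group

    def states(self):
        return [(g.state, len(g.rows)) for g in self.row_groups]


store = ToyColumnstore(capacity=4)      # tiny capacity keeps the demo short
for i in range(6):
    store.trickle_insert({"id": i})
print(store.states())                   # [('CLOSED', 4), ('OPEN', 2)]
```

<p>The only point of the sketch is that nothing in the caller decides when a deltastore is created or closed; the store does that by itself as rows arrive.</p>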
<h1>Tuple Mover</h1>
<p>The background Tuple Mover process is responsible for compressing full deltastores. The Tuple Mover is an online operation: it does not prevent data reads from the deltastores being compressed. This is described in the SIGMOD paper:</p>
<blockquote><p>The Tuple Mover reads one closed delta store at a time and starts building the corresponding compressed segments. During this time scans continue to see and read the delta store. When the Tuple Mover has finished compressing the delta store, the newly created segments are made visible and the delta store is made invisible in a single, atomic operation. New scans will see and scan the compressed format. The Tuple Mover then waits for all scans still operating in the delta store to drain after which the delta store is removed.</p></blockquote>
<p>Concurrent deletes or updates <b>are blocked</b> while the Tuple Mover compresses a deltastore. Concurrent inserts are not blocked by the Tuple Mover. As for the &#8216;background process&#8217; part of the Tuple Mover, the closest analogy is the <a href="http://www.sqlskills.com/blogs/paul/inside-the-storage-engine-ghost-cleanup-in-depth/" target="_blank">Ghost Cleanup process</a>. Veterans know that Ghost Cleanup is never tuned right for <i>my</i> job: it is always either too aggressive for some users or too slow for others. Will Tuple Mover suffer from the same problems? I don&#8217;t expect it to, primarily because the unit of work is big. It takes time to accumulate 1 million rows through single, row-by-row inserts.</p>
<p>The removal of compressed deltastores is very efficient as it is basically a deallocation (as efficient as a DROP). This is important so that the Tuple Mover does not generate excessive logging during compression.</p>
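<p>The compress-then-swap behavior can be sketched as follows. This is an illustration under simplifying assumptions: row groups are plain dictionaries, &#8216;compression&#8217; is only a pivot from rows to per-column lists, and the function names are made up for the example.</p>

```python
def to_segments(rows):
    """Pivot row-mode records into one value list (segment) per column."""
    columns = rows[0].keys()
    return {name: [row[name] for row in rows] for name in columns}


def tuple_mover_pass(row_groups):
    """Compress CLOSED deltastores; OPEN ones are left alone."""
    for index, group in enumerate(row_groups):
        if group["state"] != "CLOSED":
            continue
        # Readers can keep scanning group["rows"] while this is built.
        segments = to_segments(group["rows"])
        # Publish the compressed form with a single reference swap, the toy
        # counterpart of the atomic visibility switch described above.
        row_groups[index] = {"state": "COMPRESSED", "segments": segments}
    return row_groups


groups = [
    {"state": "CLOSED", "rows": [{"k": 1, "v": "a"}, {"k": 2, "v": "b"}]},
    {"state": "OPEN", "rows": [{"k": 3, "v": "c"}]},
]
tuple_mover_pass(groups)
print(groups[0])    # {'state': 'COMPRESSED', 'segments': {'k': [1, 2], 'v': ['a', 'b']}}
print(groups[1]["state"])   # OPEN
```

<p>Dropping the old deltastore is just letting go of the old dictionary, which mirrors the &#8216;deallocation&#8217; remark above.</p>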
<h1>Delete Bitmaps</h1>
<p>The deleted bitmap is another B-Tree associated with the clustered columnstore. There is only one deleted bitmap for the entire columnstore; it covers all the row-groups (all segments). Only compressed segments use a deleted bitmap. The DELETE operation is in effect an insert into the deleted bitmap: the row-group and tuple number of the deleted row are inserted into the deleted bitmap. Scans (reads) honor the deleted bitmap by filtering out any row (tuple) marked as deleted. It is recommended that clustered columnstore indexes that have seen a large number of deletes be rebuilt in order to restore the &#8216;health&#8217; of their segments by removing the deleted rows. UPDATE operations on clustered columnstores are <i>always</i> split updates, meaning the Query Optimizer will create a plan that contains a delete and an insert for any UPDATE.</p>
<p>Delete bitmaps do not cover deltastores. As B-Trees, the deltastores support direct delete so they implement the delete by removing the deleted row.</p>
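<p>A small Python sketch of the idea, assuming a set of (row_group_id, tuple_id) pairs stands in for the deleted bitmap B-Tree and a plain list stands in for the deltastore (all names are illustrative):</p>

```python
segments = {0: ["ann", "bob", "cid"], 1: ["dan", "eve"]}   # row_group_id: values
deleted_bitmap = set()      # holds (row_group_id, tuple_id) pairs
deltastore = []             # row-mode storage for newly inserted rows


def delete(row_group_id, tuple_id):
    # A delete is an insert into the bitmap; segment data is untouched.
    deleted_bitmap.add((row_group_id, tuple_id))


def update(row_group_id, tuple_id, new_value):
    delete(row_group_id, tuple_id)      # split update: delete ...
    deltastore.append(new_value)        # ... plus a trickle insert


def scan():
    for row_group_id, values in segments.items():
        for tuple_id, value in enumerate(values):
            if (row_group_id, tuple_id) not in deleted_bitmap:
                yield value
    yield from deltastore


delete(0, 1)
update(1, 0, "dana")
print(list(scan()))         # ['ann', 'cid', 'eve', 'dana']
print(segments[0])          # ['ann', 'bob', 'cid'] (deleted row still stored)
```

<p>The second print shows why a rebuild is eventually worthwhile: deleted rows keep occupying space in the compressed segments until then.</p>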
<h1>Bulk Insert</h1>
<blockquote><p>Large bulk insert operations do not insert rows into delta stores but convert batches of rows directly into columnar format. The operation buffers rows until a sufficient number of rows has accumulated, converts them into columnar format, and writes the resulting segments and dictionaries to disk. This is very efficient; it reduces the IO requirements and immediately produces the columnar format needed for fast scans. The downside is the large memory space needed to buffer rows.</p></blockquote>
<p>For clustered columnstores bulk insert performance is critical, as it is the bread and butter of the data warehousing ETL scenarios. You are going to have to give attention to your ETL pipeline and drive to achieve the optimal directly compressed format. This requires the bulk insert to be able to upload close to 1 million rows <b>per partition</b> in each batch. If the bulk insert uses batches that are too small then the result will be sub-optimal deltastores instead of the optimized compressed segments. With SSIS you will have to pay attention to the data flow buffer size as the default 10MB is way too small for achieving efficient columnstore bulk inserts. Of course there will be cases when there simply isn&#8217;t enough data to upload and in such cases the bulk insert will result in a deltastore instead of compressed columnstores. The deltastores will be left OPEN and subsequent bulk insert operations will reuse them, fill them up, close them and leave them for the Tuple Mover to compress. While achieving the directly compressed format during bulk insert is desirable, you should not go out of your way to bend your ETL and business logic just to achieve it. Going through the intermediate deltastore will remedy itself automatically once sufficient data accumulates.</p>
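<p>The <b>per partition</b> part is easy to overlook, so here is a rough sketch of the bookkeeping. The function name and the <tt>direct_threshold</tt> parameter are assumptions made for the example; the exact row count at which the engine switches to direct compression is not taken from this post.</p>

```python
from collections import Counter


def plan_bulk_insert(partition_keys, direct_threshold):
    """Return, per partition, where the rows of one batch would land."""
    plan = {}
    for partition, row_count in Counter(partition_keys).items():
        if row_count >= direct_threshold:
            target = "compressed segment"
        else:
            target = "deltastore"
        plan[partition] = (row_count, target)
    return plan


# One batch of 1,050,000 rows that spans two partitions.
batch = ["2013-05"] * 1_000_000 + ["2013-06"] * 50_000
plan = plan_bulk_insert(batch, direct_threshold=1_000_000)
print(plan["2013-05"])      # (1000000, 'compressed segment')
print(plan["2013-06"])      # (50000, 'deltastore')
```

<p>Even though the batch as a whole is large, the rows that fall into the second partition are too few on their own and would go through a deltastore.</p>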
<p class="callout float-left">INSERT &#8230; SELECT &#8230; targeting clustered columnstore tables is a bulk insert operation</p>
<p>Enjoy! SELECT &#8230; INTO has always been a bulk (and minimally logged) operation, but that can only create a row-mode heap. With INSERT &#8230; SELECT being now optimized for clustered columnstore indexes to use the bulk insert API, your ETL pipeline can construct the switch-in data directly into columnar storage format in one single pass, w/o having to resort to a rebuild. This is, again, described in the SIGMOD paper.</p>
<h1>Trickle Insert</h1>
<p>Trickle inserts are all inserts that do not come through the bulk insert API. <tt>INSERT INTO ... VALUES ...</tt> statements are trickle inserts. Trickle inserts are always handled by a deltastore and they can never create directly compressed data. UPDATE statements (as well as MERGE) result in split updates (delete and insert) and will insert in trickle mode.</p>
<h2><a name="indexbuild">Improved Index Build</a></h2>
<p>Clustered columnstore index build is smart. It solves several problems in new ways:
<dl>
<dt>Relevant Dictionaries</dt>
<dd>Columnar storage columns use a global dictionary, shared by all segments, and local dictionaries shared by specific segments. The more relevant the global dictionary (the more actual data values are covered by it) the better the data compression achieved, as the secondary dictionaries are smaller or even not needed. In SQL Server 2012 the global dictionary was just the first dictionary built, so it could contain skewed data resulting in poor relevance. With clustered columnstore index build the entire data is first sampled in a first stage, global dictionaries are built for all columns that require them, and then the build proper starts. This creates much better global dictionaries and results in significant storage (compression) improvement.</dd>
<dt>Minimize blocking</dt>
<dd>This problem is not specific to clustered columnstore indexes: during an <b>offline</b> index rebuild, there is no reason to block <b>reads</b>. Yes, there is a risk of deadlock at the end of the index rebuild, but there are ways to solve that problem. With clustered columnstore indexes <b>offline rebuild operations are semi-online</b>, meaning reads are allowed but updates are blocked.</dd>
<dt>Workload Variation</dt>
<dd>Columnstore index build is very memory intensive. The traditional execution model determines the DOP at the beginning of execution and then the query executes with the given DOP. With the improved columnstore build the actual execution DOP varies as the build progresses and the build process can actively and voluntarily reduce its DOP (by &#8216;parking&#8217; execution threads) in order to adjust to low memory conditions.</dd>
</dl>
<h2><a name="sampling">Sampling Support</a></h2>
<p>Required by the aforementioned two-stage index build in order to create relevant global dictionaries. Similar to how heap and B-Tree sampling selects entire pages to sample, columnstores can select row groups (segments) to sample.</p>
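<p>Why sampling helps the global dictionary can be shown with a toy example. The data and the &#8216;every other row group&#8217; sampling policy below are assumptions chosen for illustration; the real sampling policy is not described here.</p>

```python
row_groups = [
    ["WA", "WA", "WA", "WA", "WA"],     # skewed first row group
    ["CA", "NY", "WA", "TX", "CA"],
    ["NY", "TX", "CA", "OR", "WA"],
]


def coverage(dictionary, groups):
    """Fraction of all values that the dictionary can encode."""
    values = [value for group in groups for value in group]
    return sum(value in dictionary for value in values) / len(values)


# SQL Server 2012 style: the global dictionary is whatever was built first.
first_built = set(row_groups[0])

# Two-stage style: sample whole row groups first, then build the dictionary.
sampled = {value for group in row_groups[::2] for value in group}

print(round(coverage(first_built, row_groups), 2))   # 0.47
print(coverage(sampled, row_groups))                 # 1.0
```

<p>Values not covered by the global dictionary have to go into local dictionaries, which is the storage cost the two-stage build tries to avoid.</p>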
<h2><a name="bookmark">Bookmark Support</a></h2>
<p>Required to implement split updates. The heap bookmark is a physical locator (file_id:page_id:slot_id), the B-Tree bookmark is the actual key value and the clustered columnstore bookmark is the (row_group_id:tuple_id) pair. In deltastores the tuple_id is the value of the uniqueifier column so the bookmark is located efficiently with a seek operation. Having bookmarks also frees the optimizer to explore additional options like eager spool.</p>
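<p>As a sketch of how a (row_group_id:tuple_id) bookmark could be resolved, assume compressed row groups are addressed by position and deltastores by their integer row ID key. The containers and names below are hypothetical stand-ins, not engine structures.</p>

```python
from typing import NamedTuple


class ColumnstoreBookmark(NamedTuple):
    row_group_id: int
    tuple_id: int


compressed = {0: ["ann", "bob"]}        # row_group_id: rows by position
deltastores = {1: {7001: "cid"}}        # row_group_id: {row ID key: row}


def lookup(bookmark):
    """Resolve a bookmark to a row in either kind of row group."""
    if bookmark.row_group_id in deltastores:
        # Deltastore: the tuple_id is the B-Tree key, so this is a seek.
        return deltastores[bookmark.row_group_id][bookmark.tuple_id]
    # Compressed row group: the tuple_id is a position inside the segment.
    return compressed[bookmark.row_group_id][bookmark.tuple_id]


print(lookup(ColumnstoreBookmark(0, 1)))        # bob
print(lookup(ColumnstoreBookmark(1, 7001)))     # cid
```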
<h2><a name="ddl">Schema modification support</a></h2>
<p>Add column, alter column, drop column are supported by clustered columnstore indexes.</p>
<h2><a name="shortstrings">Support for short strings</a></h2>
<p>Short strings, like US state abbreviations, are frequent in fact tables and previously only dictionary encoding was available for them, which is inefficient. Now short strings can be encoded by value, w/o requiring a dictionary.</p>
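<p>One way to picture &#8216;encoding by value&#8217; is to pack the characters of a short string into an integer, so no dictionary lookup is needed to get the string back. The scheme below is an assumption made for illustration; the actual on-disk encoding is not described in this post.</p>

```python
def encode_by_value(text, width=2):
    """Pack a short ASCII string into an integer, padding with zero bytes."""
    raw = text.encode("ascii").ljust(width, b"\x00")
    if len(raw) > width:
        raise ValueError("string too long to encode by value")
    return int.from_bytes(raw, "big")


def decode_by_value(number, width=2):
    """Reverse encode_by_value without consulting any dictionary."""
    return number.to_bytes(width, "big").rstrip(b"\x00").decode("ascii")


code = encode_by_value("WA")
print(code)                     # 22337
print(decode_by_value(code))    # WA
```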
<h2><a name="mixed">Mixed Execution Mode</a></h2>
<p>Queries can now execute in a mixture of batch-mode and row-mode stages. Special adapter operators can exchange rows into batches and vice-versa. This gives the optimizer freedom to mix batch-mode with unsupported operators w/o having to resort to reverting the entire query to row-mode.</p>
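<p>The adapter idea can be sketched with two Python generators, one packing rows into batches and one unpacking them. The function names and the batch size are illustrative assumptions, not engine internals.</p>

```python
from itertools import islice


def rows_to_batches(rows, batch_size):
    """Adapter: group a row-at-a-time stream into batches."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def batches_to_rows(batches):
    """Adapter: flatten batches back into a row-at-a-time stream."""
    for batch in batches:
        yield from batch


batches = list(rows_to_batches(range(5), batch_size=2))
print(batches)                              # [[0, 1], [2, 3], [4]]
print(list(batches_to_rows(batches)))       # [0, 1, 2, 3, 4]
```

<p>With adapters like these on the boundaries, a single operator that only understands rows no longer forces the whole plan out of batch mode.</p>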
<h2><a name="hash">Hash Join and Bitmap filters improvements</a></h2>
<p>The hash join is the preferred join of data warehousing workloads, dominated by large data sets and aggregations. The improved batch-mode hash join handles inner, outer, semi and anti-semi joins, i.e. the entire spectrum of possible join operators (don&#8217;t confuse them with the join <i>syntax</i>). I recommend going over the relevant sections of the SIGMOD paper for details on this topic.</p>
<h2><a name="archival">Archival support</a></h2>
<p>Simply put, add another layer of compression using XPress8 (a variant of <a href="http://en.wikipedia.org/wiki/LZ77_and_LZ78" target="_blank">LZ77</a>) over the already compressed segments and dictionaries. Recommended for cold data, it can be applied per partition and can give up to an additional 66% reduction in size (YMMV).</p>
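<p>The principle of a second, general-purpose compression pass can be tried out in Python. Note the swapped-in technique: <tt>zlib</tt> (DEFLATE, also from the LZ77 family) stands in for XPress8 here, and the toy segment is made up, so the sizes say nothing about real columnstore data.</p>

```python
import zlib

# A toy "segment": dictionary-encoded column values, already fairly compact.
segment = bytes([1, 2, 3, 4, 2, 2, 1, 3] * 4096)

archived = zlib.compress(segment, 9)    # extra pass, applied to cold data
restored = zlib.decompress(archived)    # this cost is paid again on reads

print(len(segment), len(archived), restored == segment)
```

<p>The trade-off is the usual one: less storage in exchange for extra CPU when the data is read back, which is why it suits cold partitions.</p>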
]]></content:encoded>
			<wfw:commentRss>http://rusanu.com/2013/06/11/sql-server-clustered-columnstore-indexes-at-teched-2013/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Registry bloat after SQL Server 2012 SP1 installation</title>
		<link>http://rusanu.com/2013/02/15/registry-bloat-after-sql-server-2012-sp1-installation/</link>
		<comments>http://rusanu.com/2013/02/15/registry-bloat-after-sql-server-2012-sp1-installation/#comments</comments>
		<pubDate>Fri, 15 Feb 2013 12:55:09 +0000</pubDate>
		<dc:creator>remus</dc:creator>
				<category><![CDATA[SQL 2012]]></category>
		<category><![CDATA[Troubleshooting]]></category>

		<guid isPermaLink="false">http://rusanu.com/?p=1728</guid>
		<description><![CDATA[SQL Server 2012 installation has the potential to leave an msiexec.exe installer process running after the installation finishes, as described in Windows Installer starts repeatedly after you install SQL Server 2012 SP1: After you install SQL Server 2012 SP1 on a computer, the Windows Installer (Msiexec.exe) process is repeatedly started to repair certain assemblies. Additionally, [...]]]></description>
			<content:encoded><![CDATA[<p>SQL Server 2012 installation has the potential to leave an <tt>msiexec.exe</tt> installer process running after the installation finishes, as described in <a href="http://support.microsoft.com/kb/2793634">Windows Installer starts repeatedly after you install SQL Server 2012 SP1</a>:</p>
<blockquote><p>After you install SQL Server 2012 SP1 on a computer, the Windows Installer (Msiexec.exe) process is repeatedly started to repair certain assemblies. Additionally, the following events are logged in the Application log:<br />
EventId: 1004<br />
Source: MsiInstaller<br />
Description: Detection of product &#8216;{A7037EB2-F953-4B12-B843-195F4D988DA1}&#8217;, feature &#8216;SQL_Tools_Ans&#8217;, Component &#8216;{0CECE655-2A0F-4593-AF4B-EFC31D622982}&#8217; failed. The resource &#8216;&#8217; does not exist.</p>
<p>EventId: 1001<br />
Source: MsiInstaller<br />
Description: Detection of product &#8216;{A7037EB2-F953-4B12-B843-195F4D988DA1}&#8217;, feature &#8216;SQL_Tools_Ans&#8217; failed during request for component &#8216;{6E985C15-8B6D-413D-B456-4F624D9C11C2}&#8217;</p>
<p>When this issue occurs, you experience high CPU usage.<br />
Cause</p>
<p>This issue occurs because the SQL Server 2012 components reference mismatched assemblies. This behavior causes native image generation to fail repeatedly on certain assemblies. Therefore, a repair operation is initiated on the installer package.
</p></blockquote>
<p>But this problem has a much more sinister side effect: it causes growth of the HKLM\Software registry hive. Except for the System hive, all the other registry hives are still restricted in size to a max of 2GB, see <a href="http://msdn.microsoft.com/en-us/library/windows/desktop/ms724881%28v=vs.85%29.aspx">Registry Storage Space</a>:<br />
<blockquote><p>Views of the registry files are mapped in paged pool memory&#8230;The maximum size of a registry hive is 2 GB, except for the system hive.</p></blockquote>
<p>As the runaway msiexec process bloats the Software registry hive your system may approach the maximum hive size and, even before that maximum is reached, the large hive will consume more and more of the very precious kernel paged pool memory. The system may start to exhibit erratic behavior, complaining about low &#8216;resources&#8217;, closing connections and other symptoms. For an example of how this erratic behavior may manifest, read <a href="http://blogs.msdn.com/b/sqljourney/archive/2012/10/25/why-the-registry-size-can-cause-problems-with-your-sql-2012-alwayson-setup.aspx">Why the registry size can cause problems with your SQL 2012 AlwaysOn/Failover Cluster setup</a>.</p>
<p class="callout float-right">If you installed SQL Server 2012 SP1 recently follow the steps in KB2793634 for a fix</p>
<p>The biggest problem with the erratic behavior caused by registry bloat is that it occurs long after the SQL Server 2012 SP1 installation, and it is quite difficult, even for expert users, to trace the server&#8217;s erratic behavior back to the SP1 installation. If you are uncertain whether this issue is affecting you, run <tt>dir %SystemRoot%\system32\config</tt> and check the size of the <tt>SOFTWARE</tt> file, which is the storage of this registry hive. If you are indeed dealing with a bloated registry hive, visit <a href="http://support.microsoft.com/kb/2498915">KB2498915: How to Compress &#8220;Bloated&#8221; Registry Hives</a>:</p>
<blockquote><p>
1)  Boot from a <a href="http://technet.microsoft.com/en-us/library/cc766093(WS.10).aspx">WinPE disk</a>.<br />
2)  Open regedit while booted in WinPE, load the bloated hive under HKLM.  (e.g. HKLM\Bloated)<br />
3)  Once the bloated hive has been loaded, export the loaded hive as a &#8220;Registry Hive&#8221; file with a unique name.  (e.g. %windir%\system32\config\compressedhive)<br />
      a)  You can use dir from a command line to verify the old and new sizes of the registry hives.<br />
4) Unload the bloated hive from regedit. (If you get an error here, close the registry editor. Then reopen the registry editor and try again.)<br />
5) Rename the hives so that you will boot with the compressed hive.<br />
e.g.<br />
<tt>c:\windows\system32\config\ren software software.old</tt><br />
<tt>c:\windows\system32\config\ren compressedhive software</tt>
</p></blockquote>
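<p>If you prefer a script over <tt>dir</tt> for the size check, something along these lines should do. It is a sketch: the <tt>hive_usage</tt> helper is made up for this example, the limit is the 2 GB figure quoted earlier, and reading the file size may need an elevated prompt.</p>

```python
import os

HIVE_LIMIT_BYTES = 2 * 1024 ** 3    # the 2 GB hive limit quoted above


def hive_usage(path, limit=HIVE_LIMIT_BYTES):
    """Return (size in bytes, fraction of the limit) for a hive file."""
    size = os.path.getsize(path)
    return size, size / limit


if __name__ == "__main__":
    root = os.environ.get("SystemRoot", r"C:\Windows")
    hive = os.path.join(root, "System32", "config", "SOFTWARE")
    if os.path.exists(hive):        # only present on a Windows machine
        size, fraction = hive_usage(hive)
        print(f"{size} bytes, {fraction:.1%} of the limit")
```

<p>Running it before and after the compression steps gives a quick confirmation that the hive actually shrank.</p>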
<p><b>Update</b>: At the time of writing this the SQL Server SP1 download page recommends installing <a href="http://support.microsoft.com/kb/2793634">KB2793634</a>.  Kind folk that have been in the trenches and had to deal with this problem have also given me more feedback about the symptoms and solutions. I&#8217;ve been told that if the registry is <i>already</i> bloated due to the msiexec issue then applying <a href="http://support.microsoft.com/kb/2793634">KB2793634</a> will not be enough. The fix only prevents further auto-restarts of msiexec, but it does not compress the registry. The following registry keys *may* be already huge:</p>
<ul>
<li><tt>HKLM\Software\Wow6432Node\Microsoft\.NETFramework\v2.0.50727\NGENService</tt></li>
<li><tt>HKLM\SOFTWARE\Microsoft\.NETFramework\v2.0.50727\NGENService</tt></li>
<li><tt>HKLM\Software\Wow6432Node\Microsoft\.NETFramework\v4.0.30319\NGENService</tt></li>
<li><tt>HKLM\SOFTWARE\Microsoft\.NETFramework\v4.0.30319\NGENService</tt></li>
</ul>
<p> Symptoms of registry hive bloat include, but are not limited to:</p>
<ul>
<li>Cluster service is crashing and &#8220;The Group Policy Client service failed the logon. Insufficient system resources exist to complete the requested service&#8221; are logged into the event viewer log.</li>
<li>It is impossible to perform additional installations, or you can do this just shortly after rebooting the server.</li>
<li>Users with domain accounts are unable to log on to the machine. When they log in a temporary profile is loaded.</li>
<li>In Server Manager the Roles and Features are not listed anymore.</li>
<li>Many actions are prevented with the error message &#8220;Insufficient system resources exist to complete the requested service&#8221;.</li>
</ul>
<p>The recommended action is to open a support case and contact CSS. If you are brave, can&#8217;t afford CSS support contracts, and you are convinced that the problems you&#8217;re experiencing are due to the SQL Server 2012 SP1 registry bloat, you may try the following <b>at your own risk</b>:</p>
<ol>
<li>Install the <a href="http://support.microsoft.com/kb/2793634">KB2793634</a> hot fix, as per the procedure described in the KB article.</li>
<li>From <tt>cmd</tt> prompt run <tt>%SystemRoot%\Microsoft.Net\Framework\v4.0.30319\ngen.exe executequeueditems</tt> and <tt>%SystemRoot%\Microsoft.Net\Framework64\v4.0.30319\ngen.exe executequeueditems</tt></li>
<li>Compress the registry hive using a WinPE disk or from the Windows recovery mode (press <tt>F8</tt> at the boot screen). Note that the Windows system on which you attempt to compress the hive must have the <a href="http://support.microsoft.com/kb/973817">KB973817</a> installed: <i>The Reg.exe utility does not compress a registry key when the utility saves a registry key to a hive file on a computer that is running Windows Server 2008, Windows Vista, Windows 7 or Windows Server 2008 R2</i>.</li>
<li>After the compression procedure is complete, from a <tt>cmd</tt> prompt run <tt>SFC /SCANNOW</tt>. See <a href="http://support.microsoft.com/kb/929833">KB929833</a> for more details.</li>
</ol>
<p>In the case when the registry hive bloat prevents even the installation of the SP1 hotfix you may attempt increasing the hive size to 4GB, see <a href="http://technet.microsoft.com/en-us/library/cc963194.aspx"><tt>RegistrySizeLimit</tt></a>. Editing this particular registry key to a wrong value can result in your system blue-screening during boot due to BAD_SYSTEM_CONFIG_INFO kernel panic. Each boot. Use at your own risk. The correct value to put is <tt>0xFFFFFFFF</tt>.</p>
<p class="callout float-right">If you plan to upgrade an RTM SQL Server 2012 instance to SP1 apply the slipstream SP1+CU2</p>
<p>If you have an RTM instance of SQL Server 2012 and plan to upgrade it to SP1 it is recommended to create a slipstream SP1+CU2 installation and apply this instead. See <a href="http://msdn.microsoft.com/en-us/library/hh231670.aspx">Product Updates in SQL Server 2012 Installation</a>; in SQL Server 2012 the slipstream functionality has been renamed to Product Updates:</p>
<blockquote><p>The Product Update feature replaces the Slipstream functionality that was available in SQL Server 2008 PCU1. Therefore the command-line parameters, /PCUSource and /CUSource, associated with Slipstream functionality should no longer be used. The parameters will continue to work, but may be removed in a future release of SQL Server Setup. The /UpdateSource parameter combines the functionality of the Slipstream parameters.</p></blockquote>
<p>Frankly I think the term &#8216;slipstream&#8217; was less ambiguous than &#8216;product update&#8217; but then who am I to complain&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://rusanu.com/2013/02/15/registry-bloat-after-sql-server-2012-sp1-installation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to enable Selective XML indexes in SQL Server 2012 SP1</title>
		<link>http://rusanu.com/2013/02/11/how-to-enable-selective-xml-indexes-in-sql-server-2012-sp1/</link>
		<comments>http://rusanu.com/2013/02/11/how-to-enable-selective-xml-indexes-in-sql-server-2012-sp1/#comments</comments>
		<pubDate>Mon, 11 Feb 2013 10:39:37 +0000</pubDate>
		<dc:creator>remus</dc:creator>
				<category><![CDATA[SQL 2012]]></category>
		<category><![CDATA[Tutorials]]></category>
		<category><![CDATA[sqlqctive xml index]]></category>
		<category><![CDATA[upgrade]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://rusanu.com/?p=1716</guid>
		<description><![CDATA[SQL Server 2012 SP1 has shipped a great enhancement to XML: Selective XML Indexes. When properly used these indexes can speed up the searching of XML columns tremendously, at little disk/size cost: The selective XML index feature lets you promote only certain paths from the XML documents to index. At index creation time, these paths [...]]]></description>
			<content:encoded><![CDATA[<p>SQL Server 2012 SP1 has shipped a great enhancement to XML: <a href="http://msdn.microsoft.com/en-us/library/jj670108.aspx">Selective XML Indexes</a>. When properly used these indexes can speed up the searching of XML columns tremendously, at little disk/size cost:</p>
<blockquote><p>The selective XML index feature lets you promote only certain paths from the XML documents to index. At index creation time, these paths are evaluated, and the nodes that they point to are shredded and stored inside a relational table in SQL Server. This feature uses an efficient mapping algorithm developed by Microsoft Research in collaboration with the SQL Server product team. This algorithm maps the XML nodes to a single relational table, and achieves exceptional performance while requiring only modest storage space.</p></blockquote>
<p>However, at the time of this post, the documentation on MSDN omits a crucial requirement: the database has to be enabled to support these new indexes. Because this is an on-disk format change shipped in a service pack release, the engine cannot use the new selective XML indexes unless explicitly allowed. The database version must also change to a version that the RTM SQL Server 2012 does not recognize, so that the database is not accidentally attached/restored on an RTM instance of SQL Server 2012 that would not comprehend the new Selective XML Index objects and would panic (undefined behavior). To enable Selective XML Indexes in the database you must run <a href="http://msdn.microsoft.com/en-us/library/jj670102.aspx"><tt>sp_db_selective_xml_index</tt></a>:</p>
<blockquote><p>Enables and disables Selective XML Index functionality on a SQL Server database. If called without any parameters, the stored procedure returns 1 if the Selective XML Index is enabled on a particular database.</p></blockquote>
<p><code class="prettyprint lang-sql">
<pre>
EXECUTE sys.sp_db_selective_xml_index
    @db_name = N'AdventureWorks2012'
  , @action = N'true';
GO
</pre>
<p></code></p>
<p>Be aware that once upgraded to support Selective XML indexes this database can no longer be attached or restored on an RTM instance of SQL Server 2012. This applies to log shipping, Database Mirroring and AlwaysOn relationships, which will break when this upgrade is performed. <i>If</i> all the partners in the log shipping, DBM session or AG are also upgraded to SQL Server 2012 SP1 you can re-enable the relationship <b>after</b> you enabled Selective XML Indexes.</p>
]]></content:encoded>
			<wfw:commentRss>http://rusanu.com/2013/02/11/how-to-enable-selective-xml-indexes-in-sql-server-2012-sp1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to enable and disable a queue using SMO</title>
		<link>http://rusanu.com/2013/02/06/how-to-enable-and-disable-a-queue-using-smo/</link>
		<comments>http://rusanu.com/2013/02/06/how-to-enable-and-disable-a-queue-using-smo/#comments</comments>
		<pubDate>Wed, 06 Feb 2013 12:22:19 +0000</pubDate>
		<dc:creator>remus</dc:creator>
				<category><![CDATA[Samples]]></category>
		<category><![CDATA[Tutorials]]></category>
		<category><![CDATA[sharedmanagementobject]]></category>
		<category><![CDATA[smo]]></category>

		<guid isPermaLink="false">http://rusanu.com/?p=1713</guid>
		<description><![CDATA[The SMO object model for SQL Server ServiceQueue does allow one to enable or disable a queue, but the property that modifies the queue status is not intuitive, it is IsEnqueueEnabled: Gets or sets the Boolean property that specifies whether the queue is enabled. This property matches the catalog view column is_enqueue_enabled in sys.service_queues but [...]]]></description>
			<content:encoded><![CDATA[<p>The SMO object model for SQL Server <a href="http://msdn.microsoft.com/en-us/library/microsoft.sqlserver.management.smo.broker.servicequeue.aspx"><tt>ServiceQueue</tt></a> does allow one to enable or disable a queue, but the property that modifies the queue status has an unintuitive name, <a href="http://msdn.microsoft.com/en-us/library/microsoft.sqlserver.management.smo.broker.servicequeue.isenqueueenabled.aspx"><tt>IsEnqueueEnabled</tt></a>:</p>
<blockquote><p>Gets or sets the Boolean property that specifies whether the queue is enabled.</p></blockquote>
<p>This property matches the catalog view column <tt>is_enqueue_enabled</tt> in <a href="http://msdn.microsoft.com/en-us/library/ms187795.aspx"><tt>sys.service_queues</tt></a> but bears little resemblance to the T-SQL statement used to enable or disable a queue: <tt>ALTER QUEUE ... WITH STATUS = {ON|OFF}</tt></p>
<p>For example the following SMO code snippet:</p>
<p><code class="prettyprint lang-sql">
<pre>
            Server server = new Server("...");
            Database db = server.Databases["msdb"];
            ServiceQueue sq = new ServiceQueue(db.ServiceBroker, "foo");
            sq.Create();
            sq.IsEnqueueEnabled = false;
            sq.Alter();
            sq.IsEnqueueEnabled = true;
            sq.Alter();
</pre>
<p></code></p>
<p>generates the following T-SQL:</p>
<p><code class="prettyprint lang-sql">
<pre>
CREATE QUEUE [dbo].[foo];
ALTER QUEUE [dbo].[foo]  WITH STATUS = OFF ...;
ALTER QUEUE [dbo].[foo] WITH STATUS = ON ...;
</pre>
<p></code></p>
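<p>To confirm the effect on the server side you can look at the catalog view mentioned above. A minimal sketch, assuming the queue <tt>foo</tt> was created in <tt>msdb</tt> as in the SMO snippet:</p>
<pre><code class="prettyprint lang-sql">
use [msdb];
go

-- is_enqueue_enabled is the column the SMO property maps to;
-- is_receive_enabled is shown alongside it for comparison
select [name], [is_enqueue_enabled], [is_receive_enabled]
from sys.service_queues
where [name] = N'foo';
go
</code></pre>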
]]></content:encoded>
			<wfw:commentRss>http://rusanu.com/2013/02/06/how-to-enable-and-disable-a-queue-using-smo/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SQL Server backup to URL</title>
		<link>http://rusanu.com/2013/01/25/sql-server-backup-to-url/</link>
		<comments>http://rusanu.com/2013/01/25/sql-server-backup-to-url/#comments</comments>
		<pubDate>Fri, 25 Jan 2013 12:24:15 +0000</pubDate>
		<dc:creator>remus</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[SQL 2012]]></category>

		<guid isPermaLink="false">http://rusanu.com/?p=1706</guid>
		<description><![CDATA[With the SQL Server 2012 SP1 CU2 release a new important feature was added: ability to back up and restore a database straight from Azure Blob storage: This feature released in SQL Server 2012 SP1 CU2, enables SQL Server backup and restore directly to the Windows Azure Blob service. This feature can be used to [...]]]></description>
			<content:encoded><![CDATA[<p>With the SQL Server 2012 SP1 CU2 release a new important feature was added: ability to <a href="http://msdn.microsoft.com/en-us/library/jj919148.aspx">back up and restore a database straight from Azure Blob storage</a>:</p>
<blockquote><p>This feature released in SQL Server 2012 SP1 CU2, enables SQL Server backup and restore directly to the Windows Azure Blob service. This feature can be used to backup SQL Server databases on an on-premises instance or an instance of SQL Server running a hosted environment such as Windows Azure Virtual Machine. Backup to cloud offers benefits such as availability, limitless geo-replicated off-site storage, and ease of migration of data to and from the cloud. </p></blockquote>
<p>The syntax for the new feature is straightforward. You must first create a credential for accessing your Azure Blob storage:</p>
<blockquote><p>SQL Server requires Windows Azure account name and access key authentication to be stored in a SQL Server Credential. This information is used to authenticate to the Windows Azure account when it performs backup or restore operations.</p></blockquote>
<p><code class="prettyprint lang-sql">
<pre>
CREATE CREDENTIAL mycredential WITH IDENTITY = 'mystorageaccount'
       ,SECRET = '&lt;storage access key&gt;' ;

BACKUP DATABASE AdventureWorks2012
      TO URL = 'https://mystorageaccount.blob.core.windows.net/mycontainer/db.bak'
      WITH CREDENTIAL = 'mycredential'
     ,STATS = 5;

RESTORE DATABASE AdventureWorks2012
     FROM URL = 'https://mystorageaccount.blob.core.windows.net/mycontainer/db.bak'
     WITH CREDENTIAL = 'mycredential';
</pre>
<p></code></p>
<p>Full backups, differential backups, log backups, filegroup backups and compressed backups are all supported. The only notable restriction is that backing up to two locations simultaneously is not allowed. Here is the complete list of limitations:</p>
<blockquote><ul>
<li>The maximum backup size supported is 1 TB.</li>
<li>Backup to or restoring from the Windows Azure Blob storage service by using SQL Server Management Studio Backup or Restore wizard is not currently enabled</li>
<li>Creating a logical device name is not supported. So adding URL as a backup device using sp_dumpdevice or through SQL Server Management Studio is not supported</li>
<li>Appending to existing backup blobs is not supported. Backups to an existing Blob can only be overwritten by using the WITH FORMAT option</li>
<li>Backup to multiple blobs in a single backup operation is not supported</li>
<li>Specifying a block size with BACKUP is not supported.</li>
<li>Specifying a block size for restores might be required in certain scenarios</li>
<li>Specifying MAXTRANSFERSIZE is not supported.</li>
<li>Specifying backupset options &#8211; RETAINDAYS and EXPIREDATE are not supported.</li>
<li>SQL Server has a maximum limit of 259 characters for a backup device name. The BACKUP TO URL consumes 36 characters for the required elements used to specify the URL – ‘https://.blob.core.windows.net//.bak’, leaving 223 characters for account, container, and blob names put together.</li>
</ul>
</blockquote>
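<p>As a sketch of how the supported backup types and the overwrite rule look in practice (the blob name <tt>db_log.trn</tt> is an assumption made for illustration; the credential and container follow the example above):</p>
<pre><code class="prettyprint lang-sql">
-- Compressed log backup written to a new blob
BACKUP LOG AdventureWorks2012
      TO URL = 'https://mystorageaccount.blob.core.windows.net/mycontainer/db_log.trn'
      WITH CREDENTIAL = 'mycredential'
     ,COMPRESSION
     ,STATS = 5;

-- Appending is not supported: to reuse an existing blob, overwrite it WITH FORMAT
BACKUP DATABASE AdventureWorks2012
      TO URL = 'https://mystorageaccount.blob.core.windows.net/mycontainer/db.bak'
      WITH CREDENTIAL = 'mycredential'
     ,FORMAT
     ,STATS = 5;
</code></pre>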
<p>I recommend going over <a href="http://msdn.microsoft.com/en-us/library/jj919149.aspx">Backup and Restore to Azure Blob Best Practices</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://rusanu.com/2013/01/25/sql-server-backup-to-url/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Case Sensitive collation sort order</title>
		<link>http://rusanu.com/2012/11/23/case-sensitive-collation-sort-order/</link>
		<comments>http://rusanu.com/2012/11/23/case-sensitive-collation-sort-order/#comments</comments>
		<pubDate>Fri, 23 Nov 2012 09:15:30 +0000</pubDate>
		<dc:creator>remus</dc:creator>
				<category><![CDATA[Tutorials]]></category>
		<category><![CDATA[case sensitive]]></category>
		<category><![CDATA[collation]]></category>
		<category><![CDATA[sort]]></category>
		<category><![CDATA[sql]]></category>
		<category><![CDATA[sql server]]></category>

		<guid isPermaLink="false">http://rusanu.com/?p=1680</guid>
		<description><![CDATA[A recent inquiry from one of our front line CSS engineers had me look into how case sensitive collations decide the sort order. Consider a simple question like How should the values 'a 1', 'a 2', 'A 1' and 'A 2' sort? create table [test] ( [col] varchar(10) collate Latin1_General_CS_AS); go insert into [test] ([col]) [...]]]></description>
			<content:encoded><![CDATA[<p>A recent inquiry from one of our front line CSS engineers had me look into how case sensitive collations decide the sort order. Consider a simple question like <i>How should the values <tt>'a 1'</tt>, <tt>'a 2'</tt>, <tt>'A 1'</tt> and <tt>'A 2'</tt> sort?</i></p>
<pre><code class="prettyprint lang-sql">
create table [test] (
	[col] varchar(10)
		collate Latin1_General_CS_AS);
go

insert into [test] ([col]) values
	('a 1'),
	('a 2'),
	('A 1'),
	('A 2');
go

select [col]
from [test]
order by [col];
</code></pre>
<p>Here are two possible outputs:</p>
<div>
<div style="display: block; float: left; margin-left:  50px;"><a href="http://rusanu.com/wp-content/uploads/2012/11/Variant1.png"><img src="http://rusanu.com/wp-content/uploads/2012/11/Variant1.png" alt="" title="Variant1" width="165" height="130" class="alignleft size-full wp-image-1683"  /></a></div>
<div style="display: block; float: right; margin-right:  50px;"><a href="http://rusanu.com/wp-content/uploads/2012/11/Variant2.png"><img src="http://rusanu.com/wp-content/uploads/2012/11/Variant2.png" alt="" title="Variant2" width="158" height="129" class="alignright size-full wp-image-1684"></a></div>
<p>
</div>
<p style="clear: both">Which one is correct? A programmer will choose the first order: <tt>'a 1'</tt>, <tt>'a 2'</tt>, <tt>'A 1'</tt>, <tt>'A 2'</tt>. After all, if one were to implement a string comparison routine it would compare character by character until a difference is encountered, and <tt>'a'</tt> sorts ahead of <tt>'A'</tt>. But this answer is wrong. The correct sort order is <tt>'a 1'</tt>, <tt>'A 1'</tt>, <tt>'a 2'</tt>, <tt>'A 2'</tt>! And if you ran the query in SQL Server you certainly got the second output. But look again at the sort order and focus on just the first character:</p>
<p><a href="http://rusanu.com/wp-content/uploads/2012/11/Variant3.png"><img src="http://rusanu.com/wp-content/uploads/2012/11/Variant3.png" alt="" title="Variant3" width="158" height="129" class="alignleft size-full wp-image-1690" /></a></p>
<p class="callout float-right">By default, the <a href="http://www.unicode.org/reports/tr10/#Scope" target="_blank">algorithm</a> makes use of three fully-customizable levels. For the Latin script, these levels correspond roughly to: alphabetic ordering, diacritic ordering, case ordering</p>
<p>So, in a case sensitive collation, is <tt>'a'</tt> ahead of or after <tt>'A'</tt> in the sort order? The image shows them actually interleaved: <tt>'a'</tt>, <tt>'A'</tt>, <tt>'a'</tt>, <tt>'A'</tt>. What&#8217;s going on? The answer is that collation sort order is a little more nuanced than just comparing characters until a difference is encountered. This is described in the <a href="http://www.unicode.org/reports/tr10">Unicode Technical Standard #10: UNICODE COLLATION ALGORITHM</a>. And yes, the same algorithm is applied for non-Unicode types (VARCHAR) too. The algorithm gives different weights to character differences and case differences: a difference in alphabetic order is more important than one in case order. To compare the sort order of two strings the algorithm works more like the following:</p>
<ul>
<li>Compare every character in case <b>insensitive</b>, accent <b>insensitive</b> manner. If a difference is found, this decides the sort order. If no difference is found, continue.</li>
<li>Compare every character in case <b>insensitive</b>, accent <b>sensitive</b> manner. If a difference is found, this decides the sort order. If no difference is found, continue.</li>
<li>Compare every character in case <b>sensitive</b> manner (we already know from the step above there is no accent difference). If a difference is found, this decides the sort order. If no difference is found the strings are equal.</li>
</ul>
<p>Needless to say the real algorithm does not need to traverse the strings 3 times, but the logic is equivalent to the above. And remember that when the strings have different lengths the comparison pads the shorter string with spaces and compares up to the length of the longest string. Combined with the case sensitivity rules this leads to a somewhat surprising result when using an inequality in a <tt>WHERE</tt> clause:</p>
<pre><code class="prettyprint lang-sql">
select [col]
from [test]
where [col] > 'A'
order by [col];
</code></pre>
<p><a href="http://rusanu.com/wp-content/uploads/2012/11/Variant2.png"><img src="http://rusanu.com/wp-content/uploads/2012/11/Variant2.png" alt="" title="Variant2" width="158" height="129" class="alignright size-full wp-image-1684"></a></p>
<p>That&#8217;s right, we got back all 4 rows, including those that start with <tt>'a'</tt>. This surprises some, but is the correct result. <tt>'a&nbsp;1'</tt> should be in the result, even though <tt>'a'</tt> is &lt; <tt>'A'</tt>. If you follow the algorithm above: first we pad the shorter string with spaces, so the comparison is between <tt>'a&nbsp;1'</tt> and <tt>'A&nbsp;&nbsp;'</tt>. Then we do the first pass comparison, which is only alphabetic order, case insensitive and accent insensitive, character by character: <tt>'a'</tt> and <tt>'A'</tt> are equal, <tt>'&nbsp;'</tt> and <tt>'&nbsp;'</tt> are equal, but <tt>'1'</tt> is &gt; <tt>'&nbsp;'</tt>. The comparison stops: we found an alphabetic order difference, so <tt>'a&nbsp;1'</tt> &gt; <tt>'A&nbsp;&nbsp;'</tt>, the row qualifies and is included in the result. Ditto for <tt>'a&nbsp;2'</tt>.</p>
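<p>The weighting of the levels can be checked directly with a few comparisons. This is only a sketch: the column aliases are made up for illustration, and the explicit <tt>collate</tt> clause stands in for the column collation of the <tt>[test]</tt> table:</p>
<pre><code class="prettyprint lang-sql">
select
	-- case only decides when nothing else differs: 'a' sorts ahead of 'A'
	case when 'a' &lt; 'A' collate Latin1_General_CS_AS
		then 'yes' else 'no' end as [a_before_A],
	-- the alphabetic difference ('1' vs '2') outweighs the case difference
	case when 'A 1' &lt; 'a 2' collate Latin1_General_CS_AS
		then 'yes' else 'no' end as [A1_before_a2],
	-- the shorter string is padded with spaces, so '1' is compared with ' '
	case when 'a 1' &gt; 'A' collate Latin1_General_CS_AS
		then 'yes' else 'no' end as [a1_after_A];
</code></pre>
<p>Given the sort order discussed above, all three columns should come back as <tt>yes</tt>.</p>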
]]></content:encoded>
			<wfw:commentRss>http://rusanu.com/2012/11/23/case-sensitive-collation-sort-order/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Handling exceptions that occur during the RECEIVE statement in activated procedures</title>
		<link>http://rusanu.com/2012/10/15/handling-exceptions-that-occur-during-the-receive-statement-in-activated-procedures/</link>
		<comments>http://rusanu.com/2012/10/15/handling-exceptions-that-occur-during-the-receive-statement-in-activated-procedures/#comments</comments>
		<pubDate>Mon, 15 Oct 2012 11:43:18 +0000</pubDate>
		<dc:creator>remus</dc:creator>
				<category><![CDATA[Tutorials]]></category>
		<category><![CDATA[error 9617]]></category>
		<category><![CDATA[error handling]]></category>
		<category><![CDATA[service broker]]></category>
		<category><![CDATA[sql server]]></category>
		<category><![CDATA[The service queue is currently disabled]]></category>
		<category><![CDATA[transact-sql]]></category>

		<guid isPermaLink="false">http://rusanu.com/?p=1657</guid>
		<description><![CDATA[The typical SQL Server activation procedure is contains a WHILE (1=1) loop and exit conditions based on checking @@ROWCOUNT. Error handling is done via a BEGIN TRY ... BEGIN CATCH block. This pattern is present in many Service Broker articles on the web, including this web site, in books and in Microsoft samples: create procedure [...]]]></description>
			<content:encoded><![CDATA[<p>The typical SQL Server activation procedure contains a <tt>WHILE (1=1)</tt> loop and exit conditions based on checking <tt>@@ROWCOUNT</tt>. Error handling is done via a <tt>BEGIN TRY ... BEGIN CATCH</tt> block. This pattern is present in many Service Broker articles on the web, including this web site, in books and in Microsoft samples:</p>
<p><code language="SQL">
<pre>
create procedure [&lt;procedure name&gt;]
as
declare @dialog_handle uniqueidentifier
 , @message_type_name sysname
 , @message_body varbinary(max);
set nocount on;

while(1=1)
begin
 begin transaction;
 begin try;
  receive top(1)
   @dialog_handle = conversation_handle
   , @message_type_name = message_type_name
   , @message_body = message_body
  from [&lt;queue name&gt;];
  if @@rowcount = 0
  begin
   rollback;
   break;
  end
  if @message_type_name = N'&lt;my message type&gt;'
  begin
   -- process the message here
                        ...
  end
  else if @message_type_name = N'http://schemas.microsoft.com/SQL/ServiceBroker/Error'
     or @message_type_name = N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
  begin
   end conversation @dialog_handle;
  end
  commit transaction;
 end try
 begin catch
  declare @error_number int = ERROR_NUMBER()
   , @error_message nvarchar(4000) = ERROR_MESSAGE()
   , @xact_state int = XACT_STATE();
  if @xact_state = -1 or @xact_state = 1
  begin
   rollback;
  end
  -- log the error here
               ....
 end catch
end
go
</pre>
<p></code></p>
<p>This pattern, though, contains a problem: it handles a disabled queue very poorly, and hence it handles poison messages very poorly.</p>
<h2>Error 9617 The service queue &#8220;&#8230;&#8221; is currently disabled</h2>
<p><span id="more-1657"></span></p>
<p>Attempting to <tt>RECEIVE</tt> from a disabled queue will raise error 9617 but will not interrupt the batch. The above procedure&#8217;s error handling will catch the exception, eventually logging it somewhere, and then it will resume and loop again, hitting error 9617 again. Your ERRORLOG file will likely grow quickly and the CPU will be busy due to the tight loop, all to no avail. Furthermore, the constantly executing <tt>RECEIVE</tt> and the activated procedure may even prevent you from re-enabling the queue, because it is being locked. The only way to stop such a runaway procedure is to <a href="http://msdn.microsoft.com/en-us/library/ms173730.aspx"><tt>KILL</tt></a> it.</p>
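<p>Before killing anything it helps to confirm that the queue really is disabled and whether an activated task is still running. A minimal sketch, with <tt>&lt;queue name&gt;</tt> as the same placeholder used in the procedure above:</p>
<pre><code class="prettyprint lang-sql">
-- Is the queue disabled? (0 means disabled)
select [name], [is_receive_enabled], [is_enqueue_enabled]
from sys.service_queues
where [name] = N'&lt;queue name&gt;';

-- Which activated procedures are currently running, and under which session id
select [spid], [database_id], [queue_id], [procedure_name]
from sys.dm_broker_activated_tasks;

-- Once the runaway session is stopped, re-enable the queue
alter queue [&lt;queue name&gt;] with status = on;
</code></pre>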
<p class="callout float-left">Separate the exception handling of the RECEIVE statement</p>
<p>You can try several solutions. You could check for the queue state before issuing the <tt>RECEIVE</tt>. You could special case error 9617 in the <tt>CATCH</tt> block. But if error 9617 requires special handling, are you sure you have covered all the special cases? I would recommend a different approach. What happens here is that your error handling does not differentiate between an exception that occurred in the <tt>RECEIVE</tt> statement itself and an exception that occurred in the processing of the messages returned by <tt>RECEIVE</tt>. Differentiating these two cases allows for separate handling: if the <tt>RECEIVE</tt> statement itself has raised an exception then it is better to abandon and exit the loop. If the exception occurred in the message processing then, after the exception is handled based on business specific rules, it is OK to dequeue more messages and issue <tt>RECEIVE</tt> again:<br />
<code language="SQL">
<pre>
create procedure [&lt;procedure name&gt;]
as
declare @dialog_handle uniqueidentifier
 , @message_type_name sysname
 , @message_body varbinary(max)
 , @error_number int
 , @error_message nvarchar(4000)
 , @xact_state int;
set nocount on;

while(1=1)
begin
 set  @dialog_handle = null;
 begin transaction;
 begin try;
  receive top(1)
   @dialog_handle = conversation_handle
   , @message_type_name = message_type_name
   , @message_body = message_body
  from [&lt;queue name&gt;];
 end try
begin catch
  -- this catch block handles errors in the RECEIVE statement
  set @error_number = ERROR_NUMBER();
  set @error_message = ERROR_MESSAGE();
  set @xact_state = XACT_STATE();
  if @xact_state = -1 or @xact_state = 1
  begin
   rollback;
  end
  -- log the error here
  ...
  break;
end catch
if @dialog_handle is null
begin
  rollback;
  break;
end
begin try
  if @message_type_name = N'&lt;my message type&gt;'
  begin
   -- process the message here
   ...
  end
  else if @message_type_name = N'http://schemas.microsoft.com/SQL/ServiceBroker/Error'
     or @message_type_name = N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
  begin
   end conversation @dialog_handle;
  end
  commit transaction;
 end try
 begin catch
  -- this catch block handles errors in the message processing
  set @error_number = ERROR_NUMBER();
  set @error_message = ERROR_MESSAGE();
  set @xact_state = XACT_STATE();
  if @xact_state = -1 or @xact_state = 1
  begin
   rollback;
  end
  -- log the error here
 ...
 end catch
end
go
</pre>
<p></code></p>
<h2>Why loop in the first place?</h2>
<p>The idea of the loop inside the activated stored procedure is that once activated, a procedure should <tt>RECEIVE</tt> until it drains the message queue and only then exit. But here is the deal: the <a href="http://rusanu.com/2008/08/03/understanding-queue-monitors/">Queue Monitor</a> that launched the activated procedure already loops for you! It will keep calling the activated procedure until the procedure issues a <tt>RECEIVE</tt> statement that returns no rows. So you can get rid of the <tt>WHILE (1=1)</tt> loop, which simplifies the procedure and makes it more robust. If we do so, we no longer need to distinguish between errors that occur in the <tt>RECEIVE</tt> statement and errors that occur in the message processing:</p>
<p><code language="SQL">
<pre>
create procedure [&lt;procedure name&gt;]
as
declare @dialog_handle uniqueidentifier
 , @message_type_name sysname
 , @message_body varbinary(max)
 , @error_number int
 , @error_message nvarchar(4000)
 , @xact_state int;
set nocount on;
 begin transaction;
 begin try;
  receive top(1)
   @dialog_handle = conversation_handle
   , @message_type_name = message_type_name
   , @message_body = message_body
  from [&lt;queue name&gt;];

  if @message_type_name = N'&lt;my message type&gt;'
  begin
   -- process the message here
   ...
  end
  else if @message_type_name = N'http://schemas.microsoft.com/SQL/ServiceBroker/Error'
     or @message_type_name = N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
  begin
   end conversation @dialog_handle;
  end
  commit transaction;
 end try
 begin catch
  set @error_number = ERROR_NUMBER();
  set @error_message = ERROR_MESSAGE();
  set @xact_state = XACT_STATE();
  if @xact_state = -1 or @xact_state = 1
  begin
   rollback;
  end
  -- log the error here
 end catch
go
</pre>
<p></code></p>
<p>Notice there is no longer a need for the <tt>@@ROWCOUNT</tt> check: it is implicit in the testing of <tt>@message_type_name</tt>, which will be NULL if no message was received. There is only one <tt>BEGIN TRY ... BEGIN CATCH ...</tt> block, which makes for a procedure that is easier to comprehend, test and present. And with the <tt>WHILE (1=1)</tt> loop eliminated there are fewer chances for a bug causing an infinite loop, as was the case with error 9617.</p>
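<p>For completeness, this is roughly how such a procedure is attached to the queue so that the Queue Monitor launches it. A sketch only: the placeholders are the ones used above, and the values chosen for <tt>max_queue_readers</tt> and <tt>execute as</tt> are assumptions you should adapt to your environment:</p>
<pre><code class="prettyprint lang-sql">
alter queue [&lt;queue name&gt;]
 with activation (
   status = on
   , procedure_name = [&lt;procedure name&gt;]
   , max_queue_readers = 1   -- assumption: a single reader
   , execute as owner);      -- assumption: run as the queue owner
go
</code></pre>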
<h1>Deploying blog samples in production</h1>
<p>Over time I learned that a lot of times readers deploy in production the code samples straight from the blogs they read. You have to keep in mind that a blog example is written first and foremost to illustrate the point of the blog post and not to handle a real life production environment. In my production code there must be error handling, exception logging, performance measurement instrumentation in place (with run-time knobs to enable or disable the instrumentation, or to dial the level of detail up or down) and so on and so forth. Also in a real production case I would try to leverage batch processing of messages as described in <a href="http://rusanu.com/2006/10/16/writing-service-broker-procedures/">Writing Service Broker Procedures</a>. When processing batch messages using a cursor the individual message processing is an excellent candidate for using a transaction savepoint, in the manner described in <a href="http://rusanu.com/2009/06/11/exception-handling-and-nested-transactions/">Exception Handling and Nested Transactions</a>. Use your judgement and remember that this code here is a sample, only provided to guide you in the right direction.</p>
]]></content:encoded>
			<wfw:commentRss>http://rusanu.com/2012/10/15/handling-exceptions-that-occur-during-the-receive-statement-in-activated-procedures/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to shrink the SQL Server log</title>
		<link>http://rusanu.com/2012/07/27/how-to-shrink-the-sql-server-log/</link>
		<comments>http://rusanu.com/2012/07/27/how-to-shrink-the-sql-server-log/#comments</comments>
		<pubDate>Fri, 27 Jul 2012 07:52:41 +0000</pubDate>
		<dc:creator>remus</dc:creator>
				<category><![CDATA[Troubleshooting]]></category>

		<guid isPermaLink="false">http://rusanu.com/?p=1577</guid>
		<description><![CDATA[I noticed that my database log file has grown to 200Gb. I tried to shrink it but is still 200Gb. How can I shrink the log and reduce the file size? The problem is that even after you discover about DBCC SHRINKFILE and attempt to reduce the log size, the command seems not to work [...]]]></description>
			<content:encoded><![CDATA[<blockquote><p>I noticed that my database log file has grown to 200Gb. I tried to shrink it but is still 200Gb. How can I shrink the log and reduce the file size?</p></blockquote>
<p>The problem is that even after you discover <tt>DBCC SHRINKFILE</tt> and attempt to reduce the log size, the command seems not to work at all and leaves the log at the same size as before. What is happening?</p>
<p>If you look back at <a href="http://rusanu.com/2012/01/17/what-is-an-lsn-log-sequence-number/">What is an LSN: Log Sequence Number</a> you will see that LSNs are basically pointers (offsets) inside the log file. There is one level of indirection (the VLF sequence number) and then the rest of the LSN is basically an offset inside the Virtual Log File (the VLF). The log is always defined by two LSNs: the head of the log (where new log records will be placed) and the tail of the log (the oldest log record of interest). Generating log activity (i.e. any updates in the database) advances the head of the log LSN. The tail of the log advances when the database log is being backed up (this is a simplification, more on it later).</p>
<p><span id="more-1577"></span><br />
<a href="http://rusanu.com/wp-content/uploads/2012/07/Truncate-log-1.png"><img src="http://rusanu.com/wp-content/uploads/2012/07/Truncate-log-1.png" alt="" title="Truncate log-1" width="600" class="aligncenter size-full wp-image-1594" /></a></p>
<p class="callout float-right">Use <tt>DBCC LOGINFO</tt> to view the layout of the VLFs inside the log file</p>
<p>The image above shows a typical log file layout. To understand the layout of VLFs inside your LDF log file use <tt>DBCC LOGINFO</tt>. The head of the log moves ahead as transactions generate log records, occupying more of the free space ahead in the current VLF. If the current VLF fills up, a new empty VLF can be used. If there are no empty VLFs the system must grow the log LDF file and create a new VLF to allow for more log records to be written. This is when the physical LDF file actually increases and takes more space on disk:</p>
<p><a href="http://rusanu.com/wp-content/uploads/2012/07/Truncate-log-2.png"><img src="http://rusanu.com/wp-content/uploads/2012/07/Truncate-log-2.png" alt="" title="Truncate log-2" width="600" class="aligncenter size-full wp-image-1597" /></a></p>
<p class="callout float-right">Advancing the head of the log can cause LDF file growth when no free VLFs are available</p>
<p>In the image above some more transaction activity resulted in the head of the log advancing forward. As VLF 2 filled up, the system had to allocate a new VLF by growing the physical log LDF file. The unused space in VLF 1 cannot be used as long as there is even a single active log record in it; the <a href="http://rusanu.com/2012/01/17/what-is-an-lsn-log-sequence-number/">What is an LSN: Log Sequence Number</a> article explains why this is the case. If we now take a backup of the log, <b>truncation</b> occurs. Truncation is described as &#8216;deleting the log records&#8217; but no actual physical deletion needs to occur. It simply means the tail of the log will move forward: a database property that contains the LSN of the tail of the log gets updated with the new LSN representing the new tail of the log position:</p>
<p><a href="http://rusanu.com/wp-content/uploads/2012/07/Truncate-log-3.png"><img src="http://rusanu.com/wp-content/uploads/2012/07/Truncate-log-3.png" alt="" title="Truncate log-3" width="600" class="aligncenter size-full wp-image-1601" /></a></p>
<p>As the head of the log continues to advance due to normal database activity, it now has room to grow by re-using the free VLF 1. This will cause an active log wrap around: the log starts reusing free VLFs inside the LDF file and advancing the head of the log no longer causes the physical LDF file to grow:</p>
<p><a href="http://rusanu.com/wp-content/uploads/2012/07/Truncate-log-4.png"><img src="http://rusanu.com/wp-content/uploads/2012/07/Truncate-log-4.png" alt="" title="Truncate log-4" width="600"  class="aligncenter size-full wp-image-1611" /></a></p>
<p>In this configuration if the head of the log continues to advance it has room to grow in the VLF 1. But, unless further truncation occurs, if it fills VLF 1 in order to continue to move ahead the head of the log will require yet again the LDF file to grow in size to accommodate a new free VLF:</p>
<p><a href="http://rusanu.com/wp-content/uploads/2012/07/Truncate-log-5.png"><img src="http://rusanu.com/wp-content/uploads/2012/07/Truncate-log-5.png" alt="" title="Truncate log-5" width="600" class="aligncenter size-full wp-image-1614" /></a></p>
<p>If we truncate the log now the tail of the log would apparently follow the path taken by the head of the log and eventually catch up with the current head of the log. In effect all that happens is that the database property that contains the LSN which is the current tail of the log gets updated with the new LSN. The apparent &#8216;path&#8217; is just an effect of the nature of the LSN structure.<a href="http://rusanu.com/wp-content/uploads/2012/07/Truncate-log-6.png"><img src="http://rusanu.com/wp-content/uploads/2012/07/Truncate-log-6.png" alt="" title="Truncate log-6" width="600" class="aligncenter size-full wp-image-1618" /></a></p>
<p>In this state the LDF file contains 3 empty VLFs and only a small active log portion. However, the physical LDF file cannot be reduced in size; any attempt to shrink the file will fail. As a general rule a file (any file) can only be shrunk (reduced in size) by removing data at the end of the file (basically by reducing the file length). It is not possible to &#8216;delete&#8217; from the beginning of a file or from the middle of a file. Unlike with data MDF files, the system cannot afford to move records around in order to free space at the end of the file; remember that LSNs are basically an offset inside the file. The active portion of the log cannot be moved, because then all the records in the active portion of the log would suddenly become invalid.</p>
<p class="callout float-left">If the last VLF in a log file is inactive, the log file can shrink</p>
<p>So if the LDF file can reduce its size (shrink) only by reducing its length, and active VLFs cannot move, it is clear what condition <tt>DBCC SHRINKFILE</tt> requires in order to succeed: the last VLF in the file must be free (inactive). Recursively apply this condition to the VLFs left after removing the last one and we have the answer to <i>how much</i> the log can shrink. And this is the explanation why almost always when you <i>care</i> about this information, you are in a bad state in which you <i>cannot</i> shrink the log file. It is just that the typical sequence of events leads exactly to this situation:</p>
<ul>
<li>Full recovery model gets enabled on the database</li>
<li>No maintenance plan is in place and log backups are not taken</li>
<li>The large log LDF file is noticed</li>
<li>After a frantic search on internet, the solution of taking a log backup is found</li>
<li>Shrinking the log fails to yield any reduction in the log LDF file size</li>
</ul>
<p>This sequence of actions causes the head of the log to advance forward, creating new VLFs as the log file expands. When the situation is detected and the log backups are taken, the tail of the log catches up as truncation (logical freeing) occurs. But in the end the head of the log is located in the last added VLF, which is exactly at the end of the log file. Since the last VLF is active, no shrink can occur, as explained above, despite a large portion of the log file being free (inactive). To make progress in this situation one must first cause the head of the log to <i>move forward</i> so that it fills the last VLF and then it wraps around and reuses the free VLF(s) at the beginning of the file. So, perhaps counter intuitively, you must actually <i>generate</i> more log activity in order to be able to shrink the log file. On an active database this log activity will occur naturally from ordinary use, but if there is no activity then you must cause some. For example, do some updates in a transaction and roll back. Repeat this until the head of the log has wrapped around and is located in the VLFs at the beginning of the file. As soon as this happens, take another log backup to cause truncation which will move the tail of the log forward, following the head of the log and wrapping around. Now the VLFs at the end of the file are inactive and <tt>DBCC SHRINKFILE</tt> can actually succeed:</p>
<p><a href="http://rusanu.com/wp-content/uploads/2012/07/Truncate-log-7.png"><img src="http://rusanu.com/wp-content/uploads/2012/07/Truncate-log-7.png" alt="" title="Truncate log-7" width="600"  class="aligncenter size-full wp-image-1638" /></a></p>
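<p>The wrap-around can be sketched with a toy model. This is only an illustration in Python, not how SQL Server tracks the log: the 8 equally sized VLFs, the index-based head and tail, and the function names are all assumptions made for the example.</p>

```python
def active_vlfs(tail, head, count):
    """Indexes of the VLFs in use, walking circularly from tail to head."""
    span = (head - tail) % count
    return {(tail + i) % count for i in range(span + 1)}


def shrinkable_vlfs(tail, head, count):
    """Number of inactive VLFs at the physical end of the file."""
    return count - 1 - max(active_vlfs(tail, head, count))


COUNT = 8  # hypothetical log file made of 8 VLFs, numbered 0..7

# No log backups were taken: tail in VLF 0, head in the last VLF.
print(shrinkable_vlfs(tail=0, head=7, count=COUNT))  # 0
# First log backup: the tail catches up, but the head is still in VLF 7.
print(shrinkable_vlfs(tail=7, head=7, count=COUNT))  # 0
# More log activity makes the head wrap around into VLF 1.
print(shrinkable_vlfs(tail=7, head=1, count=COUNT))  # 0
# Another log backup moves the tail after the head: VLFs 2..7 are free.
print(shrinkable_vlfs(tail=1, head=1, count=COUNT))  # 6
```

<p>Only the last call reports anything to remove, which mirrors the sequence described above: wrap the head first, then truncate, then shrink.</p>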
<h2>DBCC LOGINFO</h2>
<p>Here is an example output from running the <tt>DBCC LOGINFO</tt> command:</p>
<pre>
RecoveryUnitId FileId  FileSize  StartOffset  FSeqNo Status Parity CreateLSN
-------------- ------  --------  -----------  ------ ------ ------ -----------------
0              2       253952    8192         242    0      128    0
0              2       278528    262144       243    0      128    0
0              2       311296    540672       241    0      64     30000000028200477
0              2       262144    851968       236    0      128    34000000025600489
0              2       262144    1114112      237    0      128    35000000010400007
0              2       262144    1376256      238    0      128    35000000023400481
0              2       262144    1638400      239    0      128    36000000013600418
0              2       262144    1900544      240    0      128    36000000036800482
0              2       262144    2162688      244    0      128    73000000037600584
0              2       262144    2424832      245    0      128    74000000009400581
0              2       262144    2686976      246    0      128    74000000034200584
0              2       327680    2949120      247    0      128    75000000013600585
0              2       327680    3276800      248    2      128    76000000001600581
(13 row(s) affected)
</pre>
<p>This example shows 13 VLFs in the file. 12 are inactive (status is 0) and only the very last one is active (status 2). The active VLF starts at offset 3276800 in the file and has a size of 327680. A quick peek at <a href="http://msdn.microsoft.com/en-us/library/ms174397.aspx"><tt>sys.database_files</tt></a> reveals that the log file size is 440. The file size is in 8k pages (even for the log, although the log file structure has nothing to do with the 8k data page of the MDF/NDF data files) so the file size is 3604480 bytes, which matches exactly the end of our last VLF (3276800 + 327680). So indeed the log file is 3.4 MB in size, of which 3.12 MB are inactive and the last 320 KB are active (in use by the last VLF). Not that I would ever recommend shrinking a 3.4 MB log, but if we tried to shrink this file it would yield no results because the very last VLF is active.</p>
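<p>The arithmetic is easy to double check. The snippet below is just a sketch in Python that copies the FileSize, StartOffset and Status columns from the output above; the helper name <tt>reclaimable_bytes</tt> is mine and is not part of any SQL Server tool.</p>

```python
PAGE_SIZE = 8192  # sys.database_files reports the size in 8 KB pages

# (FileSize, StartOffset, Status) copied from the 13 rows shown above
vlfs = [
    (253952, 8192, 0), (278528, 262144, 0), (311296, 540672, 0),
    (262144, 851968, 0), (262144, 1114112, 0), (262144, 1376256, 0),
    (262144, 1638400, 0), (262144, 1900544, 0), (262144, 2162688, 0),
    (262144, 2424832, 0), (262144, 2686976, 0), (327680, 2949120, 0),
    (327680, 3276800, 2),
]


def reclaimable_bytes(vlfs):
    """Sum the sizes of the inactive (status 0) VLFs at the end of the file."""
    total = 0
    for size, _offset, status in reversed(vlfs):
        if status != 0:
            break  # first active VLF from the end stops the shrink
        total += size
    return total


last_size, last_offset, _status = vlfs[-1]
print(440 * PAGE_SIZE)          # 3604480, the file size in bytes
print(last_offset + last_size)  # 3604480, the end of the last VLF
print(reclaimable_bytes(vlfs))  # 0, because the last VLF is active
```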
<h2>Tail of the log</h2>
<p>I said before that the tail of the log is moved forward by log backups, but that is a simplification. Advancing the tail of the log is a tad more complex because the tail is not a precise LSN: it is the lowest LSN of <i>any</i> consumer that needs to look at the log records. Such consumers are:</p>
<dl>
<dt>Active Transactions</dt>
<dd>Any transaction that has not yet committed needs to retain all the log from when the transaction started, in case it has to roll back. Remember that rollback is implemented by reading back all the transaction log records and generating a compensating (undo) action for each.</dd>
<dt>Log Backup</dt>
<dd>Log backup needs to retain all the log generated since the previous log backup until the next log backup runs and copies it into the backup file.</dd>
<dt>Transactional Replication</dt>
<dd>The transactional replication publisher agent has to read the log to identify changes that occurred and have to be distributed to the subscribers.</dd>
<dt>Database Mirroring, AlwaysOn</dt>
<dd>Both these technologies work by literally copying the log to the partners, so they require the log to be retained until it is copied over.</dd>
<dt>Other</dt>
<dd>If you check the <tt>log_reuse_wait_desc</tt> column documentation in <a href="http://msdn.microsoft.com/en-us/library/ms345414.aspx" target="_blank">Factors That Can Delay Log Truncation</a> you will see all the other processes that can retain the log tail, things like CHECKPOINT, an active backup operation, a database snapshot being created, a log scan (<tt>fn_dblog()</tt>) etc. All these will cause the tail of the log to stay in place (not progress forward).</dd>
</dl>
<p>The tail of the log will be the <i>lowest</i> LSN from any of the LSNs required by the processes above. Whenever the tail of the log advances forward the log is said to be <i>truncated</i>, as described in <a href="http://msdn.microsoft.com/en-us/library/ms189085.aspx" target="_blank">Transaction Log Truncation</a>.</p>
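<p>Put differently, the tail is a simple minimum. Here is a small Python sketch of that rule; the consumer names and the integer LSNs are made up for the example (real LSNs are not plain integers), and <tt>log_tail</tt> is a hypothetical helper.</p>

```python
def log_tail(required_lsns):
    """Return (consumer, lsn) for the consumer that holds back truncation."""
    consumer = min(required_lsns, key=required_lsns.get)
    return consumer, required_lsns[consumer]


# Hypothetical oldest LSN still needed by each consumer
required = {
    "active transaction": 1200,
    "log backup": 950,
    "replication": 1100,
    "mirroring": 1250,
}

print(log_tail(required))      # ('log backup', 950)

required["log backup"] = 1300  # a log backup has just completed
print(log_tail(required))      # ('replication', 1100)
```

<p>Taking the log backup moved the tail forward, but only as far as the next slowest consumer.</p>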
<h2>Further reading</h2>
<p>Everything I described in this article was discussed before, there is no original contribution here (maybe the pictures&#8230;). This subject is covered very well and here are <i>some</i> links to related articles:</p>
<ul>
<li><a href="http://msdn.microsoft.com/en-us/library/ms178037.aspx">Shrinking the Transaction Log</a></li>
<li><a href="http://msdn.microsoft.com/en-us/library/ms365418.aspx">Manage the Size of the Transaction Log File</a></li>
<li><a href="http://technet.microsoft.com/en-us/magazine/2009.02.logging.aspx">Understanding Logging and Recovery in SQL Server</a></li>
<li><a href="http://sqlskills.com/BLOGS/PAUL/post/Inside-the-Storage-Engine-More-on-the-circular-nature-of-the-log.aspx">Inside the Storage Engine: More on the circular nature of the log</a></li>
<li><a href="http://sqlblog.com/blogs/kalen_delaney/archive/2009/12/21/exploring-the-transaction-log-structure.aspx">Geek City: Exploring the Transaction Log Structure</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://rusanu.com/2012/07/27/how-to-shrink-the-sql-server-log/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Inside the SQL Server 2012 Columnstore Indexes</title>
		<link>http://rusanu.com/2012/05/29/inside-the-sql-server-2012-columnstore-indexes/</link>
		<comments>http://rusanu.com/2012/05/29/inside-the-sql-server-2012-columnstore-indexes/#comments</comments>
		<pubDate>Tue, 29 May 2012 10:38:46 +0000</pubDate>
		<dc:creator>remus</dc:creator>
				<category><![CDATA[Columnstore]]></category>
		<category><![CDATA[SQL 2012]]></category>
		<category><![CDATA[Tutorials]]></category>
		<category><![CDATA[columnstore]]></category>
		<category><![CDATA[internals]]></category>
		<category><![CDATA[sql]]></category>
		<category><![CDATA[sql server]]></category>
		<category><![CDATA[sql server 2012]]></category>

		<guid isPermaLink="false">http://rusanu.com/?p=1463</guid>
		<description><![CDATA[Columnar storage has established itself as the de-facto option for Business Intelligence (BI) storage. The traditional row-oriented storage of RDBMS was designed for fast single-row oriented OLTP workloads and it has problems handling the large volume range oriented analytical processing that characterizes BI workloads. But what is columnar storage and, more specifically, how does SQL [...]]]></description>
			<content:encoded><![CDATA[<p>Columnar storage has established itself as the de-facto option for Business Intelligence (BI) storage. The traditional row-oriented storage of RDBMS was designed for fast single-row oriented OLTP workloads and it has problems handling the large volume range oriented analytical processing that characterizes BI workloads. But what <i>is</i> columnar storage and, more specifically, how does SQL Server 2012 implement columnar storage with the new <tt>COLUMNSTORE</tt> indexes?</p>
<p><span id="more-1463"></span></p>
<h2>Column Oriented Storage</h2>
<p>The defining characteristic of columnar storage is the ability to read the values of a particular column of a table without having to read the values of all the other columns. In row-oriented storage this is impossible because the individual column values are physically stored grouped in rows on the pages and reading a page in order to read a column value <i>must</i> fetch the entire page in memory, thus automatically reading all the other columns in the row:</p>
<p><a href="http://rusanu.com/wp-content/uploads/2012/05/row-oriented-reads.png"><img src="http://rusanu.com/wp-content/uploads/2012/05/row-oriented-reads.png" alt="" title="row-oriented-reads" width="600" class="aligncenter size-full wp-image-1465" /></a></p>
<p>This image shows how in row oriented storage a query that only needs to read the values of column <tt>A</tt> needs to pay the penalty of reading the entire pages, including the unnecessary columns B, C, D and E, simply because of the physical format of the record and page. If you&#8217;re not familiar with the row format I strongly recommend Paul Randal&#8217;s article <a href="http://www.sqlskills.com/blogs/paul/post/Inside-the-Storage-Engine-Anatomy-of-a-record.aspx" target="_blank">Inside the Storage Engine: Anatomy of a Record</a>.</p>
<p>By contrast the column oriented storage stores the data in a format that groups columns together on disk, so reading values from a single column incurs only the minimal IO needed to fetch in the column required by the query:</p>
<p><a href="http://rusanu.com/wp-content/uploads/2012/05/column-oriented-reads.png"><img src="http://rusanu.com/wp-content/uploads/2012/05/column-oriented-reads.png" alt="" title="column-oriented-reads" width="447" height="454" class="aligncenter size-full wp-image-1467" /></a></p>
<p>The most common question I hear about columnar storage is <i>how is a row stored?</i> In other words, if we store all the values of a column together, then how do we re-create a row? I.e. if the column Name has the values &#8216;John Doe&#8217; and &#8216;Joe Public&#8217;, while the Date_of_Birth column contains values &#8216;19650112&#8217; and &#8216;19680415&#8217; then when was John born, and when was Joe? The answer is that the position of the value in the column indicates to which row it belongs. So row 1 consists of the first value in the Name column and the first value in Date_of_Birth, while row 2 is the second value in each column and so on.</p>
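<p>A minimal sketch of this positional rule, in Python and using the same two example rows. The dictionary-of-lists layout and the function names are assumptions for illustration only, not the actual on-disk format.</p>

```python
# Each column is stored on its own; a row is identified only by its position.
columns = {
    "Name": ["John Doe", "Joe Public"],
    "Date_of_Birth": ["19650112", "19680415"],
}


def get_row(columns, position):
    """Rebuild a row from the value at the same position in every column."""
    return {name: values[position] for name, values in columns.items()}


def scan_column(columns, name):
    """Read a single column without touching any of the others."""
    return columns[name]


print(get_row(columns, 0))  # {'Name': 'John Doe', 'Date_of_Birth': '19650112'}
print(scan_column(columns, "Date_of_Birth"))  # ['19650112', '19680415']
```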
<p>By physically separating the values from individual columns into their own pages the engine can read only the columns needed. This reduces the IO for queries that:</p>
<ul>
<li>read only a small subset of columns from all the columns of the table (no <tt>SELECT *</tt>)</li>
<li>read many rows (scans and range scans)</li>
</ul>
<p class="callout float-right">Columnar storage reads from disk only the data for the columns referenced by the query</p>
<p>Not surprisingly this type of query is the typical BI query pattern. Note that a column qualifies as &#8216;read&#8217; not only if it is projected in the result set, but also if it is referenced in the <tt>WHERE</tt> clause, or in a join, or anywhere else in the query. Typical BI queries aggregate fact values over large ranges defined by some dimension subsets (e.g. &#8220;sum of sales in the NW region in the Sep.-Nov. period&#8221;) and thus they tend to create exactly the queries that the columnar storage loves. The fact that BI workloads use a pattern of queries that references only few columns but many rows, combined with the ability of the columnar storage to read from disk only the columns actually referenced by the query, is the first and foremost advantage of the column store technology.</p>
<p>An OLTP workload by contrast tends to read entire rows (all columns) and only one or very few rows at a time. Columnar storage loses its appeal in OLTP workloads and in fact the columnar storage can be significantly slower than row oriented storage for OLTP workloads.</p>
<h2>Compression</h2>
<p>Because column oriented storage groups values from the same column together there is a new beneficial side effect: data becomes more compressible. Compression can be deployed on row-oriented storage too, see <a href="http://msdn.microsoft.com/en-us/library/cc280464.aspx" target="_blank">Page Compression Implementation</a>, but with column oriented storage the data is more homogeneous, as it contains only values from a single column. Since compression ratio is subject to the data entropy (how homogeneous the data is) it follows that columnar storage format is more compressible than the same data represented in row oriented storage format.</p>
<p class="callout float-left">Columnstores target large datasets for which compression yields most benefits</p>
<p>Since the columnstore format is targeted explicitly at BI workloads and large datasets, it is justifiable to invest significantly more development effort into enhancing compression benefits: deploy several alternative compression algorithms for the engine to choose from, leverage operations directly in the compressed format <i>without</i> decompressing the data first. Row oriented storage has to strike a balance between compression benefits and its runtime cost in consideration of the typical OLTP workload.</p>
<p>Columnstores embrace compression fully and go to great lengths to achieve a high compression ratio because in the expected BI workload the savings in IO from better compression more than offset the runtime CPU loss. The individual compression techniques vary between the various columnar storage implementations, but you can make a safe bet that some common techniques are used by most implementations:</p>
<ul>
<li><a href="http://en.wikipedia.org/wiki/Dictionary_coder" target="_blank">Dictionary Encoding</a></li>
<li><a href="http://en.wikipedia.org/wiki/Huffman_coding" target="_blank">Huffman Encoding</a></li>
<li><a href="http://en.wikipedia.org/wiki/Run-length_encoding" target="_blank">Run Length Encoding</a></li>
<li><a href="http://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Welch" target="_blank">Lempel-Ziv-Welch</a></li>
</ul>
<p>In addition domain specific compression is used, leveraging the knowledge of the types and values being stored. For example <a href="http://en.wikipedia.org/wiki/Column-oriented_DBMS#Compression" target="_blank">Wikipedia</a> mentions the benefits of sorting the data in order to yield better compression:</p>
<blockquote><p>To improve compression, sorting rows can also help. For example, using bitmap indexes, sorting can improve compression by an order of magnitude.[6] To maximize the compression benefits of the lexicographical order with respect to run-length encoding, it is best to use low-cardinality columns as the first sort keys.</p></blockquote>
<p>Note that I am not saying to apply the advice from the above-mentioned Wikipedia article directly to SQL Server 2012. Specifically, columnstore indexes do not require you to &#8216;use low-cardinality keys first&#8217;; the order in which you specify keys in SQL Server 2012 columnstore indexes is irrelevant.</p>
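<p>To see why sorted data helps run-length encoding in general, here is a toy encoder in Python. It is a generic illustration of the technique from the Wikipedia quote, with made-up sample values, and says nothing about how SQL Server itself encodes segments.</p>

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


regions = ["NW", "SE", "NW", "SE", "NW", "SE", "NW", "NW"]

print(len(run_length_encode(regions)))     # 7 runs for 8 values
print(run_length_encode(sorted(regions)))  # [('NW', 5), ('SE', 3)]
```

<p>The unsorted column barely compresses, while the sorted one collapses into two runs.</p>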
<h2>Batch Mode Processing</h2>
<p class="callout float-left">Iterating over hundreds of virtual function calls for each row processed is EXPENSIVE</p>
<p>Reading only the data for the columns referenced by the query and extensive use of compression are techniques used by all columnar storage engines. But SQL Server 2012 also brings something new: a completely new query execution engine optimized for BI queries. To understand why this was necessary and why it has a big performance impact one needs to understand first how query processing works on SQL Server. Paul White has a good series of introductory articles into query processing starting with <a href="http://sqlblog.com/blogs/paul_white/archive/2012/04/28/query-optimizer-deep-dive-part-1.aspx" target="_blank">Query Optimizer Deep Dive &#8211; Part 1</a> and I recommend these articles if you want a deeper introduction into SQL Server query processing. For our purposes it suffices to understand that a query execution consists of iterating over a tree of operators. Query execution is basically a loop that takes the query execution tree top operator and calls a method called <tt>GetNextRow</tt> until the method returns end-of-file. This top operator in turn will have child operators on which it calls <tt>GetNextRow</tt> to produce the data, and these child operators have further child operators and so on and so forth, all the way to the operators at the bottom of the tree that implement actual physical access to the data. Whether it is a simple scan, a seek, a nested loop join, a hash join, a sort, a where filter, <i>any</i> operator behaves the same. Which also implies that this <tt>GetNextRow</tt> call is a <a href="http://en.wikipedia.org/wiki/Virtual_method_table" target="_blank">virtual function call</a>. Sometimes these execution trees can be hundreds of operators deep, which means that there will be hundreds of virtual function calls in order to produce one row, followed by again hundreds of calls to produce the next row, then again for the next row and so on. For the transactional workload of OLTP this is not a big deal, since queries are supposed to produce only a few rows and iterate only over a small range of the data (a direct seek or a small range scan). But analytical BI queries, even if they produce a small result set containing aggregates, need to scan extremely large sets of data and the cost of traversing deep execution sub-trees to retrieve rows one by one quickly adds up, especially when the calls are all going through a v-table indirection.</p>
<p class="callout float-right">Batch mode execution amortizes the cost of calling the operators by requesting many rows in a single call</p>
<p>Enter the new batch mode operators. In batch mode instead of calling a function that returns one row at a time, the query operators call a function that returns many rows at a time: a batch. The benefits of doing so are not obvious for non-programmers, but believe you me, this can result in speed improvements of 2 and even 3 orders of magnitude.</p>
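<p>A rough way to picture the amortization is to count the calls. The Python sketch below models a single operator only and is not the SQL Server execution engine; the batch size of 1000 rows and the function names are assumptions chosen for the example.</p>

```python
def row_mode_sum(rows):
    """Pull one row per call; returns (total, number_of_calls)."""
    calls = 0
    total = 0
    for value in rows:  # one operator call per row
        calls += 1
        total += value
    return total, calls


def batch_mode_sum(rows, batch_size=1000):
    """Pull one batch per call; returns (total, number_of_calls)."""
    calls = 0
    total = 0
    for start in range(0, len(rows), batch_size):  # one call per batch
        calls += 1
        total += sum(rows[start:start + batch_size])
    return total, calls


rows = list(range(100000))
print(row_mode_sum(rows))    # (4999950000, 100000)
print(batch_mode_sum(rows))  # (4999950000, 100)
```

<p>Same result, a thousand times fewer calls, and in a real plan that saving is multiplied by the depth of the operator tree.</p>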
<p>But batch mode execution requires everybody to play the same tune: all operators have to implement the batch mode execution. Row mode operators cannot interact with batch mode operators nor the other way around. For best performance a query plan has to run in batch mode from top to bottom. Since not all operations were implemented in batch mode by the time SQL Server 2012 RTM shipped, sometimes there will be restrictions that prevent batch mode execution. The Query Optimizer can resort to a conversion operator from batch mode to row mode and it can create plans that have sub trees in batch mode even if the upper part of the tree is in row mode. This mixed mode can still yield significant performance improvements, provided that most data processing still occurs in batch mode (e.g. aggregation occurs in batch mode and the result of aggregation is consumed in row mode). Eric Hanson has some very useful tricks for how to achieve batch mode processing in some edgier conditions at <a href="http://social.technet.microsoft.com/wiki/contents/articles/4995.sql-server-columnstore-performance-tuning.aspx#Ensuring_use_of_the_Fast_Batch_Mode_of_Query_Execution" target="_blank">Ensuring Use of the Fast Batch Mode of Query Execution</a>. Most tricks rely on splitting the query in two: a subquery that does the bulk of the work in batch mode and then projects the aggregated results into an outer query that uses some non batch mode friendly operator like OUTER JOIN, NOT IN, IN, EXISTS, Scalar Aggregates or DISTINCT aggregates.</p>
<h2>SQL Server 2012 Columnar Storage</h2>
<p class="callout float-right">COLUMNSTORE indexes use the same <a href="http://en.wikipedia.org/wiki/PowerPivot" target="_blank">xVelocity</a> technology as PowerPivot</p>
<p>Prior to SQL Server 2012 the columnar storage technology offering from Microsoft was restricted to the BI analytical line of products: PowerPivot and Microsoft SQL Server Analysis Services. Both of these products use xVelocity (formerly known as VertiPaq) to create highly compressed in-memory columnar databases. The COLUMNSTORE index of SQL Server 2012 uses the same technology, but adapted to the SQL Server product, specifically to the SQL Server storage and memory model. MOLAP servers store the entire columnar database as a single monolithic file and load it entirely in memory. Such a usage pattern would not work along with the complex memory management of the SQL Server buffer pool. Besides, the storage options of SQL Server are capped at the maximum of 2GB size of a BLOB value and that would make for a really small columnar storage database. So COLUMNSTORE indexes instead use the xVelocity technology on smaller shards of the data called <i>segments</i>:</p>
<p><a href="http://rusanu.com/wp-content/uploads/2012/05/column-segments.png"><img src="http://rusanu.com/wp-content/uploads/2012/05/column-segments.png" alt="" title="column-segments" width="466" height="598" class="aligncenter size-full wp-image-1518" /></a></p>
<p>A segment represents a group of consecutive rows in the columnstore index. For example segment 1 will have rows from 1 to 1000000, segment 2 will have rows from 1000001 to 2000000, segment 3 from 2000001 and so on. Each column will have an entry for each segment in <a href="http://msdn.microsoft.com/en-us/library/gg509105.aspx" target="_blank"><tt>sys.column_store_segments</tt></a>. Note that at the time of writing this the MSDN entry says <i>&#8220;Contains a row for each column in a columnstore index&#8221;</i> which is incorrect, it should say <i>&#8220;Contains a row for each column <b>in each segment</b> in a columnstore index&#8221;</i>. I have talked before about how columnstore indexes store data in the LOB allocation unit of the table, see <a href="http://rusanu.com/2011/07/13/how-to-use-columnstore-indexes-in-sql-server/" target="_blank">How to use columnstore indexes in SQL Server</a>. Each column segment will be a BLOB value in this allocation unit. In effect you can think about <tt>sys.column_store_segments</tt> as having a VARBINARY(MAX) column containing the actual segment data, with the amendment that the storage of this VARBINARY(MAX) column comes from the columnstore index LOB allocation unit and not from the sys.column_store_segments own LOB allocation unit. So it should be clear now what the main limitation of a column segment is: it cannot exceed 2GB in size, since this is the maximum size of a VARBINARY(MAX) column value. A column segment is uniformly <i>encoded</i>: for example if the column segment uses a dictionary encoding then all values in the segment are encoded using a dictionary encoding representation.</p>
<p class="callout float-left">Dictionaries are always used to encode strings and may be used for non-string columns that have few distinct values</p>
<p>Besides column segments a columnstore consists of another data storage element: dictionaries. Dictionaries are widely used in columnar storage as a means to efficiently encode large data types, like strings. The values stored in the column segments will be just entry numbers in the dictionary, and the actual values are stored in the dictionary. This technique can yield very good compression for repeated values, but yields bad results if the values are all distinct (the required storage actually <i>increases</i>). This is what makes large columns (strings) with distinct values very poor candidates for columnstore indexes. Columnstore indexes contain separate dictionaries for each column and string columns contain two types of dictionaries:</p>
<dl>
<dt>Primary Dictionary</dt>
<dd>This is a global dictionary used by <i>all</i> segments of a column.</dd>
<dt>Secondary Dictionary</dt>
<dd>This is an overflow dictionary for entries that did not fit in the primary dictionary. It can be <i>shared</i> by several segments of a column: the relation between dictionaries and column segments is one-to-many.</dd>
</dl>
<p><a href="http://rusanu.com/wp-content/uploads/2012/05/Dictionaries.png"><img src="http://rusanu.com/wp-content/uploads/2012/05/Dictionaries.png" alt="" title="Dictionaries" width="600" class="aligncenter size-full wp-image-1532" /></a></p>
<p>The image above illustrates the use of primary and secondary dictionaries. All segments of the column reference the primary dictionary. Segments 1, 2 and 3 use the secondary dictionary with ID 1. Segment 4 does not need a secondary dictionary and segment 5 uses the secondary dictionary with ID 2.</p>
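<p>The basic idea of dictionary encoding, entry numbers in the segment and actual values in the dictionary, can be sketched in a few lines of Python. This is a generic illustration with a single dictionary and made-up values; it does not reproduce the xVelocity format or the primary/secondary split.</p>

```python
def dictionary_encode(values):
    """Return (dictionary, ids): the distinct values and one entry number per row."""
    dictionary = []
    positions = {}
    ids = []
    for value in values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        ids.append(positions[value])
    return dictionary, ids


def dictionary_decode(dictionary, ids):
    """Look every entry number up in the dictionary to get the values back."""
    return [dictionary[i] for i in ids]


cities = ["Seattle", "Redmond", "Seattle", "Seattle", "Redmond"]
dictionary, ids = dictionary_encode(cities)
print(dictionary)  # ['Seattle', 'Redmond']
print(ids)         # [0, 1, 0, 0, 1]
```

<p>With repeated values the segment shrinks to small integers. With all distinct values the dictionary is as large as the original data and the entry numbers are pure overhead.</p>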
<p>Information about the dictionaries used by a columnstore can be found in <a href="http://msdn.microsoft.com/en-us/library/gg492082.aspx" target="_blank"><tt>sys.column_store_dictionaries</tt></a> catalog view. Again, at the time of writing the MSDN explanation on it is wrong, it should read &#8220;Contains a row for each <b>dictionary</b> in an xVelocity memory optimized columnstore index&#8221;. Information about which dictionary is used by each segment is available as the <tt>primary_dictionary_id</tt> and <tt>secondary_dictionary_id</tt> columns in <tt>sys.column_store_segments</tt>. Remember that:</p>
<ul>
<li>Not all columns use dictionaries.</li>
<li>Non-string columns may use a primary dictionary.</li>
<li>A string column will always have a primary dictionary and some segments may use a secondary dictionary.</li>
</ul>
<p>The storage of the dictionaries is very similar to the storage of the column segments: think of it as a VARBINARY(MAX) column in <tt>sys.column_store_dictionaries</tt> with the actual physical storage provided by the columnstore LOB allocation unit. The same 2GB size limit that applies to column segments applies to dictionaries.</p>
<h2><a name="segment_elimination"></a>Segment Elimination</h2>
<p class="callout float-right">The min and max value of the data in the segment is stored in metadata, which enables the query processing to skip entire segments</p>
<p>The <tt>sys.column_store_segments</tt> catalog view shows the columns <tt>min_data_id</tt> and <tt>max_data_id</tt> which are metadata about the min and max values stored in the column segment. The values in the catalog view cannot be interpreted directly because they have to be decoded based on the column segment encoding type and possibly the dictionary used. Knowing that a column segment contains only values between the min and the max, query processing can decide to skip an entire segment. Since the min and the max are stored in the metadata it is not necessary to even load the segment from disk, saving both IO and processing CPU. Segment elimination thus works much like <a href="http://sqlskills.com/blogs/conor/post/An-Introduction-to-Partition-Elimination.aspx" target="_blank">partition elimination</a> works in classic row stores (BTrees and Heaps), but the columnstore segment elimination is actually better:</p>
<ul>
<li>Columnstore segments min and max values are stored in metadata for <i>every</i> column. Partition elimination works only on predicates that reference the partitioning column, but segment elimination works on any predicates that reference any column.</li>
<li>Columnstore segments min and max values are from the <i>actual</i> data in the segment. Partition elimination works not on actual data values but on partition definition boundaries, which may be much larger than the actual values present in the partition. Thus segment elimination kicks in more often.</li>
<li>Segments are much smaller than partitions, so simple probability tells us that small segments will often be eliminated in cases where a row store would have had to scan the <i>entire</i> partition, even when much of the data does <i>not</i> qualify.</li>
</ul>
<p>Where segment elimination really shines is time series. The nature of time series implies that segments will tend to have one column (the date column) with a very narrow range of values, and each segment will have a different range of values: segment 1 will have sales from Jan. 1st, segment 2 will have from Jan. 1st and 2nd, segment 3 from Jan. 2nd and 3rd, segment 4 from Jan. 3rd only and so on. So queries that filter on a time range (which are the overwhelming type of queries on time series) will benefit massively from segment elimination, reading and scanning only a small subset of the data. But be aware that in order to get this benefit the columnstore index must be built in a way that results in the data being ordered by the time column. Remember that columnstore indexes <b>have no key</b>. The gritty details are described in <a href="http://social.technet.microsoft.com/wiki/contents/articles/ensuring-your-data-is-sorted-or-nearly-sorted-by-date-to-benefit-from-date-range-elimination.aspx" target="_blank">Ensuring Your Data is Sorted or Nearly Sorted by Date to Benefit from Date Range Elimination</a>. For more complex scenarios that involve multiple dimensions you should consider aggregating the values into complex values, see <a href="http://social.technet.microsoft.com/wiki/contents/articles/5823.multi-dimensional-clustering-to-maximize-the-benefit-of-segment-elimination.aspx" target="_blank">Multi-Dimensional Clustering to Maximize the Benefit of Segment Elimination</a>. As a side note I find it interesting that a similar technique of aggregating values into complex keys applies to such a different technology as SimpleDB, see <a href="http://practicalcloudcomputing.com/post/712653349/simpledb-essentials-for-high-performance-users-part-1" target="_blank">SimpleDB Essentials for High Performance Users</a>.</p>
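<p>The time series case can be sketched as a range overlap test on the segment metadata. The Python below is only an illustration: the tuples stand in for already decoded min and max values (the real <tt>min_data_id</tt> and <tt>max_data_id</tt> need decoding first), the dates are made up to match the Jan. 1st to 3rd example, and <tt>segments_to_scan</tt> is a hypothetical helper.</p>

```python
# Hypothetical decoded metadata: (segment_id, min_date, max_date)
segments = [
    (1, 20120101, 20120101),
    (2, 20120101, 20120102),
    (3, 20120102, 20120103),
    (4, 20120103, 20120103),
]


def segments_to_scan(segments, low, high):
    """Return ids of the segments whose [min, max] range overlaps [low, high]."""
    return [seg_id for seg_id, seg_min, seg_max in segments
            if seg_min <= high and low <= seg_max]


print(segments_to_scan(segments, 20120103, 20120103))  # [3, 4]
print(segments_to_scan(segments, 20120101, 20120101))  # [1, 2]
```

<p>A query for a single day touches two segments out of four here. If the data were not ordered by date every segment would span the whole range and nothing could be skipped.</p>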
<p>Of course if you have a partitioned columnstore index then the query can leverage both partition elimination and segment elimination.</p>
<h1>How to use Columnstore Indexes</h1>
<p class="callout float-right">Columnstore indexes support only scan operations</p>
<p>If you remember one thing from this article, let it be this: a columnstore index is not an index. In the traditional row storage an index is a <a href="http://en.wikipedia.org/wiki/B-tree" target="_blank">b-tree</a> structure which supports seek and range scan operations. Columnstores are much more like the row store <a href="http://msdn.microsoft.com/en-us/library/ms188270%28v=sql.105%29.aspx" target="_blank">heaps</a> and support only end-to-end scans. They do not have a key and do not offer any order of rows. Yet, on large data sets, they can offer exceptional performance for analytical workloads due to the factors mentioned above: reading only the columns referenced by the query, high compression, segment elimination and fast batch mode processing. I&#8217;ve seen dramatic examples of queries that on highly optimized row storage could not run below 17 seconds and a columnstore could run the same query in under 18 milliseconds! Performance improvements by factors of tens and even hundreds are the norm you should be looking for with columnstores.</p>
<p class="callout float-left">Add all columns to the columnstore index</p>
<p>Compared with traditional row store indexing strategies, the columnstore indexing rules are straightforward: pick a large table that is usually the fact table in a star schema data warehouse and add a columnstore index that includes every column. Without key order, ASCENDING or DESCENDING clauses, fill factors, filtered indexes, INCLUDE columns or any other of the bells and whistles of row store indexes, the decision is much simpler. Consider the columnstore index as an <i>alternative</i> to the base table.</p>
<p>Of course the biggest columnstore drawback with the SQL Server 2012 release is the inability to update data. Once a columnstore index is added to a table the table becomes read-only. The best way to circumvent the problem is to use the ability to do fast partition switch-in operations, which allow for partitioned columnstore indexes to be updated. See <a href="http://rusanu.com/2011/07/13/how-to-update-a-table-with-a-columnstore-index/">How to update a table with a columnstore index</a>.</p>
<h1>Recommended Reading</h1>
<p>The SQL Server social Wiki on MSDN has some excellent resources for columnstores:<br/><br />
<a href="http://social.technet.microsoft.com/wiki/contents/articles/4995.sql-server-columnstore-performance-tuning.aspx" target="_blank">SQL Server Columnstore Performance Tuning</a><br/><br />
<a href="http://social.technet.microsoft.com/wiki/contents/articles/3540.sql-server-columnstore-index-faq-en-us.aspx" target="_blank">SQL Server Columnstore Index FAQ</a><br/><br />
<a href="http://social.technet.microsoft.com/wiki/contents/articles/achieving-fast-parallel-columnstore-index-builds.aspx" target="_blank">Achieving Fast Parallel Columnstore Index Builds</a><br/><br />
<a href="http://social.technet.microsoft.com/wiki/contents/articles/work-around-performance-issues-for-columnstores-related-to-strings.aspx" target="_blank">Work Around Performance Issues for Columnstores Related to Strings</a><br/><br />
<a href="http://social.technet.microsoft.com/wiki/contents/articles/5022.use-outer-join-with-columnstores-and-still-get-the-benefit-of-batch-processing.aspx" target="_blank">Use Outer Join with Columnstores and Still Get the Benefit of Batch Processing</a><br/><br />
<a href="http://social.technet.microsoft.com/wiki/contents/articles/7404.using-statistics-with-columnstore-indexes.aspx" target="_blank">Using Statistics with Columnstore Indexes</a></p>
]]></content:encoded>
			<wfw:commentRss>http://rusanu.com/2012/05/29/inside-the-sql-server-2012-columnstore-indexes/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>1000 Consecutive days on StackOverflow</title>
		<link>http://rusanu.com/2012/03/17/1000-consecutive-days-on-stackoverflow/</link>
		<comments>http://rusanu.com/2012/03/17/1000-consecutive-days-on-stackoverflow/#comments</comments>
		<pubDate>Sun, 18 Mar 2012 00:29:42 +0000</pubDate>
		<dc:creator>remus</dc:creator>
				<category><![CDATA[Announcements]]></category>
		<category><![CDATA[comment]]></category>
		<category><![CDATA[stackoverflow]]></category>

		<guid isPermaLink="false">http://rusanu.com/?p=1450</guid>
		<description><![CDATA[A few days ago I noticed that my StackOverflow profile shows 995 consecutive visited days. So naturally I started thinking about what it means to come back every day for a thousand days in a row. Looking back at the post I wrote almost 3 years ago to the day: stackoverflow.com: how to execute well [...]]]></description>
			<content:encoded><![CDATA[<p>A few days ago I noticed that my StackOverflow profile shows <a href="http://stackoverflow.com/users/105929/remus-rusanu">995 consecutive visited days</a>. So naturally I started thinking about what it means to come back every day for a thousand days in a row. Looking back at the post I wrote almost 3 years ago to the day, <a href="http://rusanu.com/2009/05/18/stackoverflowcom-how-to-execute-well-on-a-good-idea/">stackoverflow.com: how to execute well on a good idea</a>, I can say that not much has changed: StackOverflow (and now the entire StackExchange network) is first and foremost a great community. The technical execution and the social nurturing of the site make for a low-friction environment that invites and rewards contribution, and it keeps getting better and better.</p>
]]></content:encoded>
			<wfw:commentRss>http://rusanu.com/2012/03/17/1000-consecutive-days-on-stackoverflow/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
