Show all index and heap access operators in the plan cache

January 27th, 2012

with xmlnamespaces (default 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
select x.value(N'@NodeId',N'int') as NodeId
	, x.value(N'@PhysicalOp', N'sysname') as PhysicalOp
	, x.value(N'@LogicalOp', N'sysname') as LogicalOp
	, ox.value(N'@Database',N'sysname') as [Database]
	, ox.value(N'@Schema',N'sysname') as [Schema]
	, ox.value(N'@Table',N'sysname') as [Table]
	, ox.value(N'@Index',N'sysname') as [Index]
	, ox.value(N'@IndexKind',N'sysname') as [IndexKind]
	, x.value(N'@EstimateRows', N'float') as EstimateRows
	, x.value(N'@EstimateIO', N'float') as EstimateIO
	, x.value(N'@EstimateCPU', N'float') as EstimateCPU
	, x.value(N'@AvgRowSize', N'float') as AvgRowSize
	, x.value(N'@TableCardinality', N'float') as TableCardinality
	, x.value(N'@EstimatedTotalSubtreeCost', N'float') as EstimatedTotalSubtreeCost
	, x.value(N'@Parallel', N'tinyint') as DOP
	, x.value(N'@EstimateRebinds', N'float') as EstimateRebinds
	, x.value(N'@EstimateRewinds', N'float') as EstimateRewinds
	, st.*
	, pl.query_plan
from sys.dm_exec_query_stats as st
cross apply sys.dm_exec_query_plan (st.plan_handle) as pl
cross apply pl.query_plan.nodes('//RelOp[./*/Object/@Database]') as op(x)
cross apply op.x.nodes('./*/Object') as ob(ox)


I recently needed a query to look into the current query plan cache and locate all actual data access operators (index scans, index seeks, table scans). This is the query I used, I decided to place it here if someone else find it useful and, more importantly, so that I can find it again when I needed it…

SQL Server table columns under the hood

October 20th, 2011

You probably can easily answer a question like ‘What columns does this table have?’. Whether you use the SSMS object explorer, or sp_help, or you query sys.column, the answer is fairly easy to find. But what is I ask ‘What are the physical columns of this table?’. Huh? Is there any difference? Lets see.

At the logical layer tables have exactly the structure you declare it in your CREATE TABLE statement, and perhaps modifications from ALTER TABLE statements. This is the layer at which you can look into sys.columns and see the table structure, or look at the table in SSMS object explorer and so on and so forth. But there is also a lower layer, the physical layer of the storage engine where the table might have surprisingly different structure from what you expect.

Inspecting the physical table structure

To view the physical table structure you must use the undocumented system internals views: sys.system_internals_partitions and sys.system_internals_partition_columns:

select p.index_id, p.partition_number,
	pc.leaf_null_bit,
	coalesce(cx.name, c.name) as column_name,
	pc.partition_column_id,
	pc.max_inrow_length,
	pc.max_length,
	pc.key_ordinal,
	pc.leaf_offset,
	pc.is_nullable,
	pc.is_dropped,
	pc.is_uniqueifier,
	pc.is_sparse,
	pc.is_anti_matter
from sys.system_internals_partitions p
join sys.system_internals_partition_columns pc
	on p.partition_id = pc.partition_id
left join sys.index_columns ic
	on p.object_id = ic.object_id
	and ic.index_id = p.index_id
	and ic.index_column_id = pc.partition_column_id
left join sys.columns c
	on p.object_id = c.object_id
	and ic.column_id = c.column_id
left join sys.columns cx
	on p.object_id = cx.object_id
	and p.index_id in (0,1)
	and pc.partition_column_id = cx.column_id
where p.object_id = object_id('...')
order by index_id, partition_number;

Lets inspect some simple table structures:

create table users (
	user_id int not null identity(1,1),
	first_name varchar(100) null,
	last_name varchar(100) null,
	birth_date datetime null);

Running our query after, of course, we specify object_id('users'):

We can see that the physical structure is very much as we expected: the physical rowset has 4 physical columns, of the expected types and sizes. One thing to notice is that, although the column order is the one we specified, the columns are layout on disk in a different order: the user_id is stored in row at offset 4, followed by the birth_date at offset 8 and then by the two variable length columns (first_name and last_name) that have negative offsets, an indication that they reside in the variable size portion of the row. This should be of no surprise as we know that the row format places all fixed length columns first in the row, ahead of the variable length columns.

Adding a clustered index

Our table right now is a heap, we should make user_id a primary key, and lets check the table structure afterward:

alter table users add constraint pk_users_user_id primary key (user_id);


As we can see, the table has changed from a heap into a clustered index (index_id changed from 0 to 1) and the user_id column has become part of the key.

Non-Unique clustered indexes

Now lets say that we want a different clustered index, perhaps by birth_date (maybe our application requires frequent range scans on this field). We would change the primary key into a non-clustered primary key (remember, the primary key and the clustered index do not have to be the same key) and add a clustered index by the birth_date column:

alter table users drop constraint pk_users_user_id;
create clustered index cdx_users on users (birth_date);
alter table users add constraint
	pk_users_user_id primary key nonclustered  (user_id);


Several changes right there:

  • A new index has appeared (index_id 2), which was expected, this is the non-clustered index that now enforces the primary key constraint on column user_id.
  • The table has a new physical column, with partition_column_id 0. This new column has no name, because it is not visible in the logical table representation (you cannot select it, nor update it). This is an uniqueifier column (is_uniqueifier is 1) because we did not specify that our clustered index in unique:

    If the clustered index is not created with the UNIQUE property, the Database Engine automatically adds a 4-byte uniqueifier column to the table. When it is required, the Database Engine automatically adds a uniqueifier value to a row to make each key unique. This column and its values are used internally and cannot be seen or accessed by users.



    Klaus Aschenbrenner has a good series of articles that explain uniqueifier columns in more detail.

  • The non-clustered index that enforces the primary key constraint has 3 physical columns: user_id and two unnamed ones. This is because each row in the non-clustered index contains the corresponding clustered index row key values, in our case the birth_date column and the uniqueifier column.

Online operations and anti-matter columns

Is the uniqueifier column the only hidden column in a table? No. Lets rebuild our non-clustered index, making sure we specify it to be an online operation:

alter index pk_users_user_id on users rebuild with (online=on);


Notice how the non-clustered index now has one more column, a new column that has the is_antimatter value 1. As you probably guess, this is an anti-matter column. For an in-depth explanation of what is the purpose of the anti-matter column I recommend reading Online Indexing Operations in SQL Server 2005:

During the build phase, rows in the new “in-build” index may be in an intermediate state called antimatter. This mechanism allows concurrent DELETE statements to leave a trace for the index builder transaction to avoid inserting deleted rows. At the end of the index build operation all antimatter rows should be cleared.

Note that even after the online operation finishes and the antimatter rows are removed, the antimatter column will be part of the physical rowset structure.

There could exist more hidden columns in the table, for example if we enable change tracking:

alter table users ENABLE CHANGE_TRACKING;


Enabling change tracking on our table has added one more hidden column.

Table structure changes

Next lets look at some table structure modifying operations: adding, dropping and modifying columns. Were going to start anew with a fresh table for our examples:

create table users  (
	user_id int not null identity(1,1) primary key,
	first_name char(100) null,
	last_name char(100) null,
	birth_date datetime null);
go


Next lets modify some columns: we want to make the birth_date not nullable and reduce the length of the first_name to 75:

alter table users alter column birth_date datetime not null;
alter table users alter column first_name char(75);
go


Notice how the two alter operations ended up in adding a new column each, and dropping the old column. But also note how the null bit and the leaf_offset values have stayed the same. This means that the column was added ‘in place’, replacing the dropped column. This is a metadata only operation that did not modify any record in the table, it simply changed how the data in the existing records is interpreted.

But now we figure out the 75 characters length is wrong and we want to change it back to 100. Also, the birth_date column probably doesn’t need hours, so we can change it to a date type:

alter table users alter column birth_date date not null;
alter table users alter column first_name char(100);
go


The birth_date column has changed type and now it requires only 3 bytes, but the change occurred in-place, just as the nullability change we did before: it remains at the same offset in the record and it has the same null bit. However, the first_name column was moved from offset 8 to offset 211, and the null bit was changed from 4 to 5. Because the first_name column has increased in size it cannot occupy the same space as before in the record and the record has effectively increased to accommodate the new first_name column. This happened despite the fact that the first_name column was originally of size 100 so in theory it could reclaim the old space it used in the record, but the is simply a too corner case for the storage engine to consider.

By now we figure that the fixed length 100 for the first_name and last_name columns was a poor choice, so we would like to change them to more appropriate variable length columns:

alter table users alter column first_name varchar(100);
alter table users alter column last_name varchar(100);
go


The type change from fixed length column to variable length column cannot be done in-place, so the first_name and last_name columns get new null bit values.They also have negative leaf_offset values, which is typical for variable length columns as they don’t occupy fixed positions in the record.

Next lets change the length of the variable columns:

alter table users alter column first_name varchar(250);
alter table users alter column last_name varchar(250);
go

For this change the column length was modified without dropping the column and adding a new one. An increase of size for a variable length columns is one of the operations that is really altering the physical column, not dropping the column and adding a new one to replace it. However, a decrease in size, or a nullability change, does again drop and add the column, as we can quickly check now:

alter table users alter column first_name varchar(100);
alter table users alter column last_name varchar(250) not null;
go


Finally, lets say we tried to add two more fixed length columns, but we we undecided on the name and length so we added a couple of columns, then deleted them and added again a new one:

alter table users add mid_name char(50);
alter table users add surname char(25);
alter table users drop column mid_name;
alter table users drop column surname;

alter table users add middle_name char(100);
go


This case is interesting because it shows how adding a new fixed column can reuse the space left in the row by dropped fixed columns, but only if the dropped columns are the last fixed columns. In our table the columns mid_name and surname where originally added at offset 211 and 261 respectively and their length added up to 75 bytes. After we dropped them, the middle_name column we added is placed at offset 211, thus reusing the space formerly occupied by the dropped columns. This happens even though the length of the newly added column is 100 bytes, bigger than the 75 bytes occupied by the dropped columns before.

By now, our 5 column table has actually 15 columns in the physical storage format, 10 dropped columns and 5 usable columns. The table uses 412 bytes of fixed space for the 3 fixed length columns that have a total length of only 112 bytes. The variable length columns that now store the first_name and last_name are stored in the record after these 412 reserved bytes for fixed columns that are now dropped. Since the records always consume all the reserved fixed size, this is quite wasteful. How do we reclaim it? Rebuild the table:

alter table users rebuild;
go


As you can see the rebuild operation got rid off the dropped columns and now the physical storage layout is compact, aligned with the logical layout. So whenever doing changes to a table structure remember that at the storage layer most changes are cumulative, they are most times implemented by dropping a column and adding a new column back, with the new type/length/nullability. Whenever possible the a modified column reuses the space in the record and newly added columns may reuse space previously used by dropped columns.

Partitioned Tables

When partitioning is taken into account another dimension of the physical table structure is revealed. To start, lets consider a typical partitioned table:

create partition function pf (date) as range for values ('20111001');
go

create partition scheme ps as partition pf all to ([PRIMARY]);
go

create table sales (
	sale_date date not null,
	item_id int not null,
	store_id int not null,
	price money not null)
	on ps(sale_date);
go


As we know, the partitioned tables are, from a physical point of view, a collection of rowsets that belong to the same logical object (the 'table'). Looking under the hood shows that our table has two rowsets, and they have identical column structure. Typically new partitions are added by the ETL process that uploads data into a staging table that gets switched in using a fast partition switch operation:

create table sales_staging (
	sale_date date not null,
	item_id int not null,
	store_id int not null,
	price numeric(10,2) not null,
	constraint check_partition_range
		check (sale_date = '20111002'))
	on [PRIMARY];

alter partition scheme ps next used [PRIMARY];
alter partition function pf() split range ('20111002');
go

alter table sales_staging switch to sales partition $PARTITION.PF('20111002');
go
Msg 4944, Level 16, State 1, Line 2

ALTER TABLE SWITCH statement failed because column 'price' has data type numeric(10,2) in source table 'test.dbo.sales_staging' which is different from its type money in target table 'test.dbo.sales'.

OK, so we made a mistake in the staging table, so lets correct it, then try to switch it in again:

alter table sales_staging alter column price money not null;
alter table sales_staging switch to sales partition $PARTITION.PF('20111002');
go


We see that the partition switch operation has switched in the rowset of the staging table into the second partition of the sales table. But since the staging table had an operation that modified a column type, which in effect has dropped the numeric price columns and added a new price column of the appropriate money type, the second rowset of the partitioned table now has 5 columns, including a dropped one. The partition switch operation brings into the partitioned table the staging table as is, dropped columns and all. What that means is that partitions of the partitioned table can have, under the hood, a completely different physical structure from one another. Normally this should be of little concern and just a courisity to woe the newbies at DBA cocktail parties, but this problem has a darker side, known as KB2504090:

This issue occurs because the accessor that SQL Server uses to insert data into different partitions recognizes the metadata changes incorrectly. When data is inserted into the new partition that is created after a column is dropped, the number of the maximum nullable columns in the new partition may be one less than the number of the maximum nullable columns in the old partition.

If you ever find your partitioned tables not to have an uniform internal structure across all partitions, I recommend you rebuild the partition that is different. This simple query can be used to look at the number of physical columns in each partition of a table:

select count(*) as count_columns,
	index_id,
	partition_number
from sys.system_internals_partitions p
join sys.system_internals_partition_columns pc
	on p.partition_id = pc.partition_id
where p.object_id = object_id('sales')
group by index_id, partition_number;
go


This query shows that partitions 1 and 3 have 4 physical columns, while partition 2 has 5. We can rebuild partition 2:

alter table sales rebuild partition = 2;
go

With this operation we have rebuild the partition number 2 from scratch and removed all dropped columns in the process.

Note that my query above would only detect differences in the number of physical columns, but two partitions can still have a very different physical layout even if they have the same number of physical columns, eg. one can have physical column number 9 dropped and physical column number 10 used, while the other can have the opposite. Use your judgement when looking at the internal physical column layout to understand if they are truly the same.

Understanding Hash, Sort and Exchange Spill events

October 19th, 2011

Certain SQL Server query execution operations are calibrated to perform best by using a (somewhat) large amount of memory as intermediate storage. The Query Optimizer will choose a plan and estimate the cost based on these operators using this memory scratch-pad. But this is, of course, only an estimate. At execution the estimates may prove wrong and the plan must continue despite not having enough memory. In such an event, these operators spill to disk. When a spill occurs, the memory scratch-pad is flushed into tempdb and more data is accommodated in the (now) free memory. When the data flushed to tempdb is needed again, is read from disk. Needless to say, spilling into tempdb is order of magnitude slower than using just the memory scratch-pad. Monitoring for spilling is specially important in ETL jobs, since these occurrences may cause the ETL job to stretch for many more minutes, sometimes even hours. For an exhaustive discussion of ETL, including some references to spills, see The Data Loading Performance Guide.

Hash Warning Event

Hash recursion occurs when the build input does not fit into available memory, resulting in the split of input into multiple partitions that are processed separately. If any of these partitions still do not fit into available memory, it is split into subpartitions, which are also processed separately. This splitting process continues until each partition fits into available memory or until the maximum recursion level is reached (displayed in the IntegerData data column).

This is one of the most common spills. Hash Join is a darling of the Query Optimizer, as it a very fast operator that can join two unsorted sources efficiently, requiring a single pass over each source. Not only that, but it can also be used for duplicate removal (eg. DISTINCT clause) and grouping aggregates. See Understanding Hash Joins for more details.

Unfortunately it also requires a lot of memory. Hash spill occurrences are usually an indication of bad cardinality estimates that are in turn caused, most times, by missing or outdated statistics. The first action to do, when faced with Hash Join spill warnings, is to update (or create) stats on the columns involved. If the problem persists then you must resort to a different join type. Since the optimizer would had already picked a better join if it could, you need to look at the problem from a different angle: how could you help the optimizer to pick a different join, w/o forcing it?

The optimizer would pick a LOOP if it could seek the inner side for a reasonable number of times, so perhaps you need an index on the inner side to allow a seek and/or an index on the outer side that would filter the produced output early so it can reduce the number of probes in the inner side. See Understanding Nested Loops Joins for more details.

The other alternative is a MERGE join, but merge requires the input on both sides to be sorted. Adding an index to each of the sources on both sides of the join that provides the order guarantee would likely woe the optimizer into using the MERGE join. See Understanding Merge Joins for more details.

Sort Warning Event

The Sort Warnings event class indicates that sort operations do not fit into memory. This does not include sort operations involving the creation of indexes, only sort operations within a query (such as an ORDER BY clause used in a SELECT statement).

If a query requires an order guarantee (eg. it has an ORDER BY clause, or it projects a function like ROW_NUMBER) and there is no index to guarantee the order then there is little choice to have: the execution must sort the input before proceeding. If the input is small then the sort occurs in memory and is very cheap, but in this case no spill warning would occur. The Query Optimizer is usually smart to postpone the sorting until all filtering is complete, so if it still spills it usually means there is no more filtering that can be applied. The solution when faced with Sort spills is usually to add a covering index that provides the desired order.

Another case of Sort spill warning is when the Query Optimizer goes creative and adds a sort into a plan that technically does not require an order guarantee. Such events are though very rare and esoteric. The action to take really depends from case to case.

Exchange Spill Event

The Exchange Spill event class indicates that communication buffers in a parallel query plan have been temporarily written to the tempdb database. This occurs rarely and only when a query plan has multiple range scans.

You’ll probably never going to hit this problem. If you do, I recommend you go over the recommendations in the linked article: Exchange Spill Event

Conclusion

Technically there is one more spill class: the spool spill. Since spools are *meant* to spill, the presence of a spool spill is usually less of a concern.

The purpose of this article is to show that there is ample documentation available on MSDN regarding these spill events. The tempdb spills are easily detectable and reasonably explained in the product documentation. Presence of spills may indicate potential performance problems as a spill involves disk reads and writes and is many times slower than the corresponding in-memory-only operation. They also add overload to tempdb and may cause contention.

This server supports version 662 and earlier…

November 23rd, 2010

A new error started showing up in SQL Server 2008 SP2 installations:

The database cannot be opened because it is version 661. This server supports version 662 and earlier. A downgrade path is not supported.

661 sure is earlier than 662, so what seems to be the problem? This error message is a bit misleading. SQL Server 2008 supports database version 655 and earlier. But with support for 15000 partitions in SQL Server 2008 SP2, databases enabled for 15000 partitions are upgraded to version 662. This upgrade is necessary to prevent an SQL Server 2008 R2 instance from attaching a database that has more than 1000 partitions in it, since the code in R2 RTM does not understand 15000 partitions and the effects would be unpredictable. So SQL Server 2008 SP2 does indeed support version 662, but it does not support version 661. This behavior is explained in the Support for 15000 Partitions.docx document, although the database versions involved are not explicitly called out.

So the error message above should be really read as:

The database cannot be opened because it is version 661. This server supports versions 662, 655 and earlier than 655. A downgrade path is not supported

With this information the correct resolution can be achieved: the user is trying to attach a SQL Server 2008 R2 database (v. 661) to an SQL Server 2008 SP2 instance. This is not supported. User has to either upgrade the SQL Server 2008 SP2 instance to SQL Server 2008 R2, or it has to attach the database back to a R2 instance and copy out the data from the database into SQL Server 2008 instance database, eg. using the Import and Export Wizard.

The puzzle of U locks in deadlock graphs

May 12th, 2010

In a stackoverflow.com question the user has asked how come a SELECT statement could own a U mode lock?

S-U-X deadlock graph

S-U-X deadlock graph

The deadlock indeed suggests that the deadlock victim, a SELECT statement, is owning an U lock on the PK_B index. Why would a SELECT own an U lock? The query had no table hints and was a standalone query, not part of a multi-statement transaction that could had aquired the U lock in previous staements.

Turns out that the SELECT was actually not owning any U lock. The deadlock graph files (the *.xdl files) are in fact XML files and they can be opened as XML and inspected, for a little more detail than the visual deadlock graph visualizer permits. Here is the actual resource list in the deadlock XML:

<resource-list>
   <keylock hobtid="72057594052411392" dbid="10"
         objectname="A" indexname="PK_A" id="lock17ed4040"
        mode="X" associatedObjectId="72057594052411392">
    <owner-list>
     <owner id="process4f5d000" mode="X"/>
    </owner-list>
    <waiter-list>
     <waiter id="processfa3c8e0" mode="S" requestType="wait"/>
    </waiter-list>
   </keylock>
   <keylock hobtid="72057594051166208" dbid="10"
        objectname="B" indexname="PK_B" id="lock22ea3940"
        mode="U" associatedObjectId="72057594051166208">
    <owner-list>
     <owner id="processfa3c8e0" mode="S"/>
    </owner-list>
    <waiter-list>
     <waiter id="process4f5d000" mode="X" requestType="convert"/>
    </waiter-list>
   </keylock>
  </resource-list>

As you can see, the resource lock22ea3940 is owned by the process processfa3c8e0 (the SELECT) indeed, but is owned in S mode. The process process4f5d000 (the UPDATE) is requesting this resource for a convert from U to X mode. So the true deadlock is like this:

  • SELECT owns a lock on the row in PK_B in S mode
  • SELECT wants a lock on the row in PK_A in S mode
  • UPDATE owns a lock on the row in PK_A in X mode
  • UPDATE also owns a U lock on the PK_B row. (S and U modes are compatible)
  • UPDATE is requesting a convert of the U lock it has on the row on PK_B to X mode

As you can see, there is no mysterious U lock owned by the SELECT. There is an U lock on the row in PK_B, but is owned by the UPDATE, which is requesting a convert to X for it. The fact that the resource is showned in the deadlock graph viewer in SSMS as being ‘Owner mode: U’ and pointing to the SELECT is simply an artifact of how SSMS displays the deadlock graph.

The lesson to take home is that the visual graphic deadlock graph display is usefull only to have a cursory glance at the deadlock cycle. The true meat and potatoes are in the XML, which has a lot more information. Not to mention that the information in the XML is actually correct, which helps investigation…

SQL Server 2008 R2 Express database size limit: 10GB

April 28th, 2010

The SQL Server 2008 R2 Express editions has increased the database size limit to 10Gb from the previous limit of 4Gb. This is great news for many developers, as the 4Gb limitation was by far the most difficult barrier preventing Express adoption. With today’s rate of generating data, the 4Gb limit was just plain small.

All the other limitations of SQL Server Express stay in place:

CPU
SQL Server Express only uses once CPU socket. It will use all cores and any Hyper-Threading logical processor in that socket though.
Memory
SQL Server Express limits the size of the data buffer pool to 1Gb.
Replication
SQL Server Express can only participate as a subscriber in a replication topology.
Service Broker
Two SQL Server Express instances cannot exchange Service Broker messages directly, the messages have to be routed through a higher level SKU.
SQL Agent
SQL Server Express does not have an Agent service and as such it cannot run Agent scheduled jobs.

The Bizzaro Guide to SQL Server Performance

March 31st, 2010

Some say performance troubleshooting is a difficult science that blends just the right amount of patience, knowledge and experience. But I say forget all that, a few bullet points can get you a long way in fixing any problem you encounter. Is more important to find a google SEO friendly result that gives simplistic advice. Most importantly, good advice never contains the words ‘It depends’. Without further ado, here is my bulletproof SQL Server optimization guide:

  • Always trust your gut feeling. Avoid doing costly and unnecessary measurements. They may lead down the treacherous path of the scientific method. A gut feeling is always easier to explain and this improves communication. Measurements require use of complicated notions not everybody understands, so they lead to conflicts in the team.
  • High CPU utilization is caused by index fragmentation. Because the distance between database pages increases, the processor needs more cycles to reference the pages in the buffer pool.
  • Low CPU utilization is caused by index fragmentation. As the index fragments get smaller, they fit better into the processor L2 cache and this results in fewer cycles needed to access the row slots in the page. Because the data is in the cache the processor idles next cycles, resulting in low CPU utilization.
  • High Avg. Disk Sec. per Transfer is caused by index fragmentation. When indexes are fragmented the disk controller has to reorder the IO scatter-gather requests to put them in descending order. Needles to say, this operation increases the transfer times in geometric progression, because all the commercial disk controllers use bubble sort for this operation.
  • High memory consumption is caused by index fragmentation. This is fairly trivial and well known, but I’ll repeat it here: as the number of index fragments increases more pointers are needed to keep track of each fragment. Pointers are stored in virtual memory and virtual memory is very large, and this causes high memory consumption.
  • Syntax errors are caused by index fragmentation. Because the syntax is verified using the metadata catalogs, high fragmentation in the database can leave gaps in the syntax. This is turn causes the parser to generate syntax errors on perfectly valid statements like SECLET and UPTADE.
  • Covering indexes can lead to index fragmentation. Covering indexes are the indexes used by the query optimizer to cover itself in case the plan has execution faults. Because they are so often read they wear off and start to fragment.
  • Index fragmentation can be resolved by shrinking the database. As the data pages are squeezed tighter during the shrinking, they naturally realign themselves in the correct order.

There you have it, the simplest troubleshooting guide. Since most performance problems are caused by index fragmentation, all you have to do is shrink the database to force the pages to re-align correctly, and this will resolve the performance problem.

Happy April 1st everyone!

What deprecated features am I using?

January 12th, 2010

select instance_name as [Deprecated Feature]
  , cntr_value as [Frequency Used]
from sys.dm_os_performance_counters
where object_name = 'SQLServer:Deprecated Features'
and cntr_value > 0
order by cntr_value desc

Quick way to tell which deprecated feature are used on a running instance of SQL Server. The performance counters reset at each server start up, so the interogation is relevant only after the server was up for some time. This will not tell you where is the usage comming from, but will give you a very quick idea what deprecated features are used most frequently by your apps.

If the SQL Server is a named instance, you have to query the proper counter category: 'MSSQL$<instancename>:Deprecated Features'

System pagefile size on machines with large RAM

November 22nd, 2009

Irrelevant of the size of the RAM, you still need a pagefile at least 1.5 times the amount of physical RAM. This is true even if you have a 1 TB RAM machine, you’ll need 1.5 TB pagefile on disk (sounds crazy, but is true)

When a process asks for MEM_COMMIT memory via VirtualAlloc/VirtualAllocEx, the requested size needs to be reserved in the pagefile. This was true in the first Win NT system, and is still true today see Managing Virtual Memory in Win32:

When memory is committed, physical pages of memory are allocated and space is reserved in a pagefile.

Bare some extreme odd cases, SQL Server will always ask for MEM_COMMIT pages. And given the fact that SQL uses a Dynamic Memory Management policy that reserves upfront as much buffer pool as possible (reserves and commits in terms of VAS), SQL Server will request at start up a huge reservation of space in the pagefile. If the pagefile is not properly sized errors 801/802 will start showing up in SQL’s ERRORLOG file and operations.

This always causes some confusion, as administrators erroneously assume that a large RAM eliminates the need for a pagefile. In truth the contrary happens, a large RAM increases the need for pagefile, just because of the inner workings of the Windows NT memory manager. Although reserved pagefile is, hopefully, never used, the problem of reserving such a huge pagefile file can be quite serious and needs to be accounted for during capacity planning.

Fix slow application startup due to code sign validation

July 24th, 2009

Sometimes you are faced with applications that seem to take ages to start up. Usually they freeze for about 30-40 seconds and then all of the sudden they come to live. This happens for both native and managed application and it sometimes manifest as an IIS/ASP/ASP.Net AppPool starting up slow on the first request. The very first thing I always suspect is code signing verification. When a signed module is checked the certificate verification engine may consider that the Certificate Revocation List (CRL) it posses is obsolete and attempt to download a new one. For this it connects to the internet. The problem occurs when the connectivity is either slow, or blocked for some reason. By default the verification engine will time out after 15 seconds and resume with the old, obsolete, CRL it has. The timeout can occur several times, adding up to start up times of even minutes. This occurs completely outside of the control of the application being started, its modules are not even properly wired up in memory so there is no question of application code yet running.

The information on this subject is scarce to say the least. Luckily there is an TechNet article that describes not only the process occuring, but also the controlling parameters: Certificate Revocation and Status Checking. To fix the problem on computers with poor internet conectivity, registry settings have to be modified in the HKLM\SOFTWARE\Microsoft\Cryptography\OID\EncodingType 0\CertDllCreateCertificateChainEngine\Config key:

ChainUrlRetrievalTimeoutMilliseconds
This is each individual CRL check call timeout. If is 0 or not present the default value of 15 seconds is used. Change this timeout to a reasonable value like 200 milliseconds.
ChainRevAccumulativeUrlRetrievalTimeoutMilliseconds
This is the aggregate CRL retrieval timeout. If set to 0 or not present the default value of 20 seconds is used. Change this timeout to a value like 500 milliseconds.

With these two changes the code signing verification engine will timeout the CRL refresh operation in 500 milliseconds. If the connectivity to the certificate authority site is bad, this will dramatically increase the application start up times for code signed applications.