How to shrink the SQL Server log

July 27th, 2012

I noticed that my database log file has grown to 200Gb. I tried to shrink it but is still 200Gb. How can I shrink the log and reduce the file size?

The problem is that even after you discover about DBCC SHRINKFILE and attempt to reduce the log size, the command seems not to work at all and leaves the log at the same size as before. What is happening?

If you look back at What is an LSN: Log Sequence Number you will see that LSNs are basically pointers (offsets) inside the log file. There is one level of indirection (the VLF sequence number) and then the rest of the LSN is basically an offset inside the Virtual Log File (the VLF). The log is always defined by the two LSNs: the head of the log (where new log records will be placed) and the tail of the log (what is the oldest log record of interest). Generating log activity (ie. any updates in the database) advance the head of the log LSN number. The tail of the log advances when the database log is being backed up (this is a simplification, more on it later).

Read the rest of this entry »

Inside the SQL Server 2012 Columnstore Indexes

May 29th, 2012

Columnar storage has established itself as the de-facto option for Business Intelligence (BI) storage. The traditional row-oriented storage of RDBMS was designed for fast single-row oriented OLTP workloads and it has problems handling the large volume range oriented analytical processing that characterizes BI workloads. But what is columnar storage and, more specifically, how does SQL Server 2012 implement columnar storage with the new COLUMNSTORE indexes?

Read the rest of this entry »

1000 Consecutive days on StackOverflow

March 17th, 2012

Few days ago I noticed that my StackOverflow profile shows 995 consecutive visited days. So naturally I started thinking about what does it mean to come back every day for a thousand days in a row. Looking back at the post I wrote almost 3 years ago to the day: how to execute well on a good idea I can say that not much has changed: StackOverflow (and now the entire StackExchange network) is first and foremost a great community. The technical execution and the social nurturing of the site makes for a low friction environment that invites and rewards contribution, and it keeps getting better and better.

Adding a nullable column can update the entire table

February 16th, 2012

In a previous article Online non-NULL with values column add in SQL Server 2012 I talked about how adding a non-null column with default values is now an online operation in SQL Server 2012 and I mentioned how the situation when the newly added column may increase the rowsize can result in the operation being performed offline:

In the case when the newly added column increases the maximum possible row size over the 8060 bytes limit the column cannot be added online.

In this article I want to show you how such a situation can arise and how it impacts even the case that prior to SQL Server 2012 was always online, namely adding a nullable column. Lets consider the following example:

Read the rest of this entry »

Show all index and heap access operators in the plan cache

January 27th, 2012

I recently needed a query to look into the current query plan cache and locate all actual data access operators (index scans, index seeks, table scans). This is the query I used, I decided to place it here if someone else find it useful and, more importantly, so that I can find it again when I needed it:

Read the rest of this entry »

What is an LSN: Log Sequence Number

January 17th, 2012

LSNs, or Log Sequence Numbers, are explained on MSDN at Introduction to Log Sequence Numbers:

Every record in the SQL Server transaction log is uniquely identified by a log sequence number (LSN). LSNs are ordered such that if LSN2 is greater than LSN1, the change described by the log record referred to by LSN2 occurred after the change described by the log record LSN.

There are several places where LSNs are exposed. For example sys.database_recovery_status has the columns last_log_backup_lsn and fork_point_lsn, sys.database_mirroring has the mirroring_failover_lsn column and the msdb table backupset contains first_lsn, last_lsn, checkpoint_lsn, database_backup_lsn, fork_point_lsn and differential_base_lsn. Not surprisingly all these places where LSNs are exposed are related to backup and recovery (mirroring is a form of recovery). The LSN is exposed a numeric(25,0) value that can be compared: a bigger LSN number value means a later log sequence number and therefore it can indicate if more log needs to be backed up or recovered.

Yet we can dig deeper. Look at the LSN as exposed by the fn_dblog function:

Read the rest of this entry »

SQL Server table columns under the hood

October 20th, 2011

You probably can easily answer a question like ‘What columns does this table have?’. Whether you use the SSMS object explorer, or sp_help, or you query sys.column, the answer is fairly easy to find. But what is I ask ‘What are the physical columns of this table?’. Huh? Is there any difference? Lets see.

At the logical layer tables have exactly the structure you declare it in your CREATE TABLE statement, and perhaps modifications from ALTER TABLE statements. This is the layer at which you can look into sys.columns and see the table structure, or look at the table in SSMS object explorer and so on and so forth. But there is also a lower layer, the physical layer of the storage engine where the table might have surprisingly different structure from what you expect.

Inspecting the physical table structure

To view the physical table structure you must use the undocumented system internals views: sys.system_internals_partitions and sys.system_internals_partition_columns:

Read the rest of this entry »

Understanding Hash, Sort and Exchange Spill events

October 19th, 2011

Certain SQL Server query execution operations are calibrated to perform best by using a (somewhat) large amount of memory as intermediate storage. The Query Optimizer will choose a plan and estimate the cost based on these operators using this memory scratch-pad. But this is, of course, only an estimate. At execution the estimates may prove wrong and the plan must continue despite not having enough memory. In such an event, these operators spill to disk. When a spill occurs, the memory scratch-pad is flushed into tempdb and more data is accommodated in the (now) free memory. When the data flushed to tempdb is needed again, is read from disk. Needless to say, spilling into tempdb is order of magnitude slower than using just the memory scratch-pad. Monitoring for spilling is specially important in ETL jobs, since these occurrences may cause the ETL job to stretch for many more minutes, sometimes even hours. For an exhaustive discussion of ETL, including some references to spills, see The Data Loading Performance Guide.

Read the rest of this entry »

T-SQL functions do no imply a certain order of execution

August 10th, 2011

Looking at this question on StackOverflow: Conversion failed when converting from a character string to uniqueidentifier error in SQL Server one can see a reproducible example where a string split UDF works fine in the SELECT statement, but it gives a conversion error in the DELETE statement. Certainly, a bug, right? Actually, not.

The issue at hand is in fact remarkably similar to another common misconception around T-SQL I had to debunk some time ago, see On SQL Server boolean operator short-circuit: that C like operator short-circuit is guaranteed in T-SQL (hint: it isn’t, read the linked article to see a clear counter example).

In the StackOverlow post the misconception is that order of declaration implies order of execution, that the function is evaluated somehow separately from the rest of the query and some sort of temporary result is created that is then used in the overall query execution. This understanding comes naturally to the imperative procedural language mindset of developers trained in C, C++, C# and other similar languages. But in the SQL Server declarative language that is T-SQL, your intuition is actually wrong. To illustrate I will give a simple counter-example, reusing the code from my earlier boolean short-circuit article:

Read the rest of this entry »

Online Index Operations for indexes containing LOB columns

August 5th, 2011

SQL Server supports online index and table rebuild operations which allow for maintenance operations to occur w/o significant downtime. While a table is being rebuild, or a new index is being built on it, the table is fully utilizable. It can be queried and any updates done to the table while the online rebuild operation is occurring will be contained in the final rebuilt table. A detailed explanation on how these online rebuild operations work can be found in the Online Indexing Operations in SQL Server 2005 white paper. But Online Index Build operations in SQL Server 2005, 2008 and 2008 R2 do not support tables that contain LOB columns, attempting to do so would trigger an error:

Read the rest of this entry »