How to prevent conversation endpoint leaks

March 31st, 2014

One of the most common complains about using Service Broker in production is when administrators discover, usually after some months of usage, that sys.conversations_endpoints grows out of control with CLOSED conversations that are never cleaned up. I will show how this case occurs and what to do to fix it.

Read the rest of this entry »

How to read and interpret the SQL Server log

March 10th, 2014

The SQL Server transaction log contains the history of every action that modified anything in the database. Reading the log is often the last resort when investigating how certain changes occurred. It is one of the main forensic tools at your disposal when trying to identify the author of an unwanted change. Understanding the log and digging through it for information is pretty hard core and definitely not for the faint of heart. And the fact that the output of ::fn_dblog can easily go into millions of rows does not help either. But I’ll try to give some simple practical examples that can go a long way into helping sort through all the information and dig out what you’re interested in.

Read the rest of this entry »

How to analyse SQL Server performance

February 24th, 2014

So you have this SQL Server database that your application uses and it somehow seems to be slow. How do you troubleshoot this problem? Where do you look? What do you measure? I hope this article will answer enough questions to get you started so that you can identify the bottlenecks yourself, or know what to search for to further extend your arsenal and knowledge.

SQL Server clustered columnstore Tuple Mover

December 2nd, 2013

The updateable clustered columnstore indexes introduced with SQL Server 2014 rely on a background task called the Tuple Mover to periodically compress deltastores into the more efficient columnar format. Deltastores store data in the traditional row-mode (they are B-Trees) and as such are significantly more expensive to query that the compressed columnar segments. How more expensive? They are equivalent to storing the data in an uncompressed Heap and, due to small size (max 1048576 rows per deltastore rowset), they get little traction from parallelism and from read-aheads. It is important for your upload, initial seed and day-to-day ETL activities to achieve a healthy columnstore index, meaning no deltastores or only a few deltastores. To achieve this desired state of a healthy columnstore it is of paramount importance to understand how deltastores are created and removed.

Read the rest of this entry »

Using Hive local mode with YARN

November 19th, 2013

Most examples on how to run Hive in local mode mention using SET mapred.job.tracker=local. But with YARN this setting no longer has the desired effect, Hive queries are still executed with the M/R framework. Stepping through the Hadoop 23 shims reveals that the decision to split into local mode is driven by different configuration setting:


  @Override
  public boolean isLocalMode(Configuration conf) {
    return "local".equals(conf.get("mapreduce.framework.name"));
  }

Running Hive in localmode is important in debugging, is much easier to attach a remote to the local spawned Java process than to a real cluster, even a local one-node cluster. Thejas has a this short note about running hive in local mode, but following the advise there does not work on YARN clusters like the more recent Hortonworks sandboxes.

Services running under domain account should add dnscache service as dependency

August 27th, 2013

Recently I had to investigate a deadlocked Windows Server 2012 machine. Any attempt to start or stop a service on this machine would freeze in an infinite wait. I did not know about the “Analyze Wait Chain” feature in the Task Manager (new since Windows 7), but turns out is quite a life saver. This feature uses the Wait Chain Traversal debugging API and is, on the record, Matt Pietrek’s favourite Windows 7 feature. Simply using the Task Manager I was able to see than many programs were waiting on the “Services and Controller app” service, which is the SCM (the Service Control Manager):


Read the rest of this entry »

Understanding how SQL Server executes a query

August 1st, 2013

If you are a developer writing applications that use SQL Server and you are wondering what exactly happens when you ‘run’ a query from your application, I hope this article will help you write better database code and will help you get started when you have to investigate performance problems.

Requests

SQL Server is a client-server platform. The only way to interact with the back-end database is by sending requests that contain commands for the database. The protocol used to communicate between your application and the database is called TDS (Tabular Data Sream) and is described on MSDN in the Technical Document [MS-TDS]: Tabular Data Stream Protocol. The application can use one of the several client-side implementations of the protocol: the CLR managed SqlClient, OleDB, ODBC, JDBC, PHP Driver for SQL Server or the open source FreeTDS implementation. The gist of it is that when your application whats the database to do anything it will send a request over the TDS protocol. The request itself can take several forms:

Read the rest of this entry »

SQL Server Clustered Columnstore Indexes at TechEd 2013

June 11th, 2013

Now that the TechEd 2013 presentations are up online on Channel9 you can check out Brian Mitchell’s session What’s New for Columnstore Indexes and Batch Mode Processing. Brian does a great job at presenting the new updatable clustered columnstore indexes and the enhancements done to the vectorized query execution (aka. batch mode). Besides the TechEd presentation there is also another excellent resource available online right now for your education on the topic: the SIGMOD 2013 paper Enhancements to SQL Server Column Stores. Besides the obvious updatability, this paper cites some more improvements that are available in clustered columnstores:

It should be no surprise to anyone studying columnar storage that the updatable clustered columnstores coming with the next version of SQL Server are based on deltastores. I talked before about the SQL Server 2012 Columnstore internals and I explained why the highly compressed format that makes columnar storage so fast it also makes it basically impossible to update in-place. The technique of having a ‘deltastore’ which stores updates and, during scans, merge the updates with the columnar data is not new and is employed by several of the columnar storage solution vendors.

Read the rest of this entry »

Registry bloat after SQL Server 2012 SP1 installation

February 15th, 2013

SQL Server 2012 installation has the potential to leave an msiexec.exe installer process running after the installation finishes, as described in Windows Installer starts repeatedly after you install SQL Server 2012 SP1:

After you install SQL Server 2012 SP1 on a computer, the Windows Installer (Msiexec.exe) process is repeatedly started to repair certain assemblies. Additionally, the following events are logged in the Application log:
EventId: 1004
Source: MsiInstaller
Description: Detection of product ‘{A7037EB2-F953-4B12-B843-195F4D988DA1}’, feature ‘SQL_Tools_Ans’, Component ‘{0CECE655-2A0F-4593-AF4B-EFC31D622982}’ failed. The resource”does not exist.

EventId: 1001
Source: MsiInstaller
Description: Detection of product ‘{A7037EB2-F953-4B12-B843-195F4D988DA1}’, feature ‘SQL_Tools_Ans’ failed during request for component ‘{6E985C15-8B6D-413D-B456-4F624D9C11C2}’

When this issue occurs, you experience high CPU usage.
Cause

This issue occurs because the SQL Server 2012 components reference mismatched assemblies. This behavior causes native image generation to fail repeatedly on certain assemblies. Therefore, a repair operation is initiated on the installer package.

But the this problem has a much sinister side effect: it causes growth of the HKLM\Software registry hive. Except for the System hive, all the other registry hives are still restricted in size to a max of 2GB, see Registry Storage Space:

Views of the registry files are mapped in paged pool memory…The maximum size of a registry hive is 2 GB, except for the system hive.

Read the rest of this entry »

How to enable Selective XML indexes in SQL Server 2012 SP1

February 11th, 2013

SQL Server 2012 SP1 has shipped a great enhancement to XML: Selective XML Indexes. When properly used these indexes can speed up the searching of XML columns tremendously, at little disk/size cost:

The selective XML index feature lets you promote only certain paths from the XML documents to index. At index creation time, these paths are evaluated, and the nodes that they point to are shredded and stored inside a relational table in SQL Server. This feature uses an efficient mapping algorithm developed by Microsoft Research in collaboration with the SQL Server product team. This algorithm maps the XML nodes to a single relational table, and achieves exceptional performance while requiring only modest storage space.

Read the rest of this entry »