System pagefile size on machines with large RAM

November 22nd, 2009

Irrelevant of the size of the RAM, you still need a pagefile at least 1.5 times the amount of physical RAM. This is true even if you have a 1 TB RAM machine, you’ll need 1.5 TB pagefile on disk (sounds crazy, but is true)

When a process asks for MEM_COMMIT memory via VirtualAlloc/VirtualAllocEx, the requested size needs to be reserved in the pagefile. This was true in the first Win NT system, and is still true today see Managing Virtual Memory in Win32:

When memory is committed, physical pages of memory are allocated and space is reserved in a pagefile.

Bare some extreme odd cases, SQL Server will always ask for MEM_COMMIT pages. And given the fact that SQL uses a Dynamic Memory Management policy that reserves upfront as much buffer pool as possible (reserves and commits in terms of VAS), SQL Server will request at start up a huge reservation of space in the pagefile. If the pagefile is not properly sized errors 801/802 will start showing up in SQL’s ERRORLOG file and operations.

This always causes some confusion, as administrators erroneously assume that a large RAM eliminates the need for a pagefile. In truth the contrary happens, a large RAM increases the need for pagefile, just because of the inner workings of the Windows NT memory manager. Although reserved pagefile is, hopefully, never used, the problem of reserving such a huge pagefile file can be quite serious and needs to be accounted for during capacity planning.

Fix slow application startup due to code sign validation

July 24th, 2009

Sometimes you are faced with applications that seem to take ages to start up. Usually they freeze for about 30-40 seconds and then all of the sudden they come to live. This happens for both native and managed application and it sometimes manifest as an IIS/ASP/ASP.Net AppPool starting up slow on the first request. The very first thing I always suspect is code signing verification. When a signed module is checked the certificate verification engine may consider that the Certificate Revocation List (CRL) it posses is obsolete and attempt to download a new one. For this it connects to the internet. The problem occurs when the connectivity is either slow, or blocked for some reason. By default the verification engine will time out after 15 seconds and resume with the old, obsolete, CRL it has. The timeout can occur several times, adding up to start up times of even minutes. This occurs completely outside of the control of the application being started, its modules are not even properly wired up in memory so there is no question of application code yet running.

The information on this subject is scarce to say the least. Luckily there is an TechNet article that describes not only the process occuring, but also the controlling parameters: Certificate Revocation and Status Checking. To fix the problem on computers with poor internet conectivity, registry settings have to be modified in the HKLM\SOFTWARE\Microsoft\Cryptography\OID\EncodingType 0\CertDllCreateCertificateChainEngine\Config key:

ChainUrlRetrievalTimeoutMilliseconds
This is each individual CRL check call timeout. If is 0 or not present the default value of 15 seconds is used. Change this timeout to a reasonable value like 200 milliseconds.
ChainRevAccumulativeUrlRetrievalTimeoutMilliseconds
This is the aggregate CRL retrieval timeout. If set to 0 or not present the default value of 20 seconds is used. Change this timeout to a value like 500 milliseconds.

With these two changes the code signing verification engine will timeout the CRL refresh operation in 500 milliseconds. If the connectivity to the certificate authority site is bad, this will dramatically increase the application start up times for code signed applications.

%%lockres%% collision probability magic marker: 16,777,215

May 29th, 2009

@jrowlandjones blogged about a dubious deadlock case. I recommend this article as is correct and presents a somewhat esoteric case of deadlock: the resource hash collision. The lock manager in SQL Server doesn’t know what it locks, it just locks ‘resources’ (basically strings). It is the job of higher level components like the the access methods of the storage engine to present the ‘resource’ to the lock manager and ask for the desired lock. When locking rows in a heap or a b-tree the storage engine will synthesize a ‘resource’ from the record identifier. Since these resources have a limited length, the storage engine has to reduce the effective length of a key to the maximum length is allowed to present to the lock manager, and that means that the record’s key will be reduced to 6 bytes. This is achieved by hashing the key into a 6 byte hash value. Nothing spectacular here.

But if you have a table with a key of length 50 bytes and its reduced to 6 bytes, you may hit a collision. So how likely is this to happen?

On 6 bytes there are 281,474,976,710,656 distinct possible values. Its a pretty big number? Not that big actually. If we meet at a party and I say ‘I bet somebody in the room shares your birthday’ would you bet against me? You probably should :) What if I change my question to ‘I bet there are two people in this room that share the birthday!’? Now I will probably take your money. This is called a ‘meet-in-the-middle‘ attack in cryptography and basically it says that you get a 50% collision probability at half the hash length. So the SQL %%lockres%% hash will produce two records with same hash, with a 50% probability, out of the table, any table, of only 16,777,215 record. That suddenly doesn’t look like a cosmic constant, does it? And keep in mind that this is the absolutely maximum value, where the key has a perfectly random distribution. In reality keys are not that random after all. Take Jame’s example: a datetime (8 bytes), a country code (1 byte), a group code (2 bytes) and a random id (4 bytes). From these 15 bytes quite a few are actually constant: eg. every date between 2000 and 2010 has the first 4 bytes identical (0×00) and the 5th byte only has two possible values (0×08 or 0×09). If from the other codes (country, group) we use only 50% of the possible values, then in effect we use, generously, just 10 bytes of the 15 bytes of the key. This means the table has a 50% collision probability at only about 11 million records. considering that he was doing a ‘paltry’ 900 million records upload, no wonder he got collisions…

Have you met  ? Say hello to my BOM

May 21st, 2009

I recently had to look at a problem where a programmer writing a tool to parse some XML produced by a service written by me was complaining that my service serves invalid XML. All my documents seemed mangled and started with . I couldn’t help but smile. So what is this mysterious sequence ? Well, lets check the ISO-8859-1 character encoding, commonly known as ‘Latin alphabet’ and often confused with the Windows code page 1252: ï is the character corresponding to 0xEF (decimal 239), » is 0xBB (decimal 187) and ¿ is 0xBF (decimal 191). So the mysterious sequence is 0xEFBBBF. Does it look familiar now? It should, this is the UTF-8 Byte Order Mark. Moral: if you consume and parse XML, make sure you consume it as XML, not as text. All XML libraries I know of correctly understand and parse the BOM. The only problems I’ve seen are from hand written ‘parsers’ that treat XML as a string (and most often fail to accommodate namespaces too…).

Read/Write deadlock

May 16th, 2009

How does a simple SELECT deadlock with an UPDATE? Surprisingly, they can deadlock even on well tuned systems that does not do spurious table scans. The answer is very simple: when the read and the write use two distinct access paths to reach the same key and they use them in reverse order. Lets consider a simple example: we have a table with a clustered index and an non-clustered index. The reader (T1) seeks a key in the non-clustered index and then needs to look up the clustered index to retrieve an additional column required by the SELECT projection list. The writer (T2) is updating the clustered index and then needs to perform an index maintenance operation on the non-clustered index. So T1 holds an S lock for the key K on the non-clustered index and wants an S lock on the same key K on the clustered index. T2 has an X lock on the key K on the clustered index and wants an X lock on same key K on the non-clustered index. Deadlock, T1 will be chosen as a victim. So you see, there are no complex queries involved, no suboptimal scan operations, no lock escalation nor page locks involved. Simple, correctly written queries may deadlock if doing read/write operations on the same key on a table with two indexes. Lets show this in an example:

use tempdb;
go

create table TestDeadlock (TestId int identity(1,1),
		constraint cdxTestDealockTestId primary key clustered (TestId),
	ColA varchar(10) not null,
	ColB varchar(10) not null);
go

create index idxTestDeadlockColA on TestDeadlock(ColA);
go

Now lets populate the table with some random data:

set nocount on;
insert into TestDeadlock (ColA, ColB) values ('A', 'B');

declare @i int;
select @i = 0;
while @i < 10000
begin
	insert into TestDeadlock (ColA, ColB) values (rand()*99999, rand()*99999);
	select @i = @i+1;
end
go

Now open two query windows, and on the first one start this loop (the Reader):

set nocount on;
while (1=1)
begin
	declare @B varchar(10);
	select @B = ColB from TestDeadlock where ColA='A';
end

On the second one, start this loop (the Writer):

set nocount on;
while (1=1)
begin
	update TestDeadlock set ColA='A' where TestId=1;
	update TestDeadlock set ColA='B' where TestId=1;
end

Now switch back to the first query window and there you have it:

Msg 1205, Level 13, State 51, Line 6
Transaction (Process ID 52) was deadlocked on lock resources with another process
and has been chosen as the deadlock victim. Rerun the transaction.

Looking at the deadlock graph will show exactly the situation I described:

SPID 52 has an KEY S lock on idxTestDeadlockColA and wants an KEY S lock on cdxTestDeadlockTestId, SPID 53 has the KEY X lock on cdxTestDeadlockTestId and wants a KEY X lock on dxTestDeadlockColA. SPID 52 is chosen as victim since it performed less writes than SPID 53.

Why is this kind of deadlock not more prevalent? The key ingredient is that both T1 and T2 have to go after the same ‘key’. On a real life situation, this is just a matter of probabilities. If the write application pattern is to write new keys and to look up old ones, it is unlikely to happen. If the number of keys is very large then the probability of overlap is low (of course it increases around ‘hot spots’, for example a hot post on your website). In my recent experience on of the main culprits for these deadlocks are the hit counters prevalent in many content management systems (increment number of hits a page or image was shown).

Unfortunately things are not always so easy to understand and to explain. The same problem can manifest itself in the disguise of a join. The key learning to take home is this: neither the reads nor the writes do not acquire locks in an atomic fashion on all indexes of a table (that would be impossible, of course!). Whenever two transactions acquire these locks in a different order from one another, they open themselves to a deadlock.

The proper fix varies, of course. Things to consider are:

  • eliminate unnecessary columns from reader’s projection so he does not have to look up the clustered index
  • add required columns as contained columns to the non-clustered index to make the index covering, again so that the reader does not have look up the clustered index
  • avoid updates that have to maintain the non-clustered index
  • One final note: why did I add 10000 dummy record in my example? Because otherwise the optimizer would see that the clustered index has only one page and would choose a plan that scans this index instead of seeking the non-clustered index.

CLR Memory Leak

January 19th, 2009

On a recent client engagement I had to investigate what appeared a to be a memory leak in a managed application. The program was running for about a week now and it appeared to slowly degrade in performance over time. Although it appeared healthy at only about 245 MB of memory used, I decided to investigate. The fastest way in my opinion to track down leaks in a running environment is to attach Windbg and use SOS:

The heap stats will dump all the types along with the number of objects allocated and the the total memory consumed, sorted by memory consumed. Usually the last entry in the list is type that is being leaked. In my case some 2 million objects were present so the leak was quite obvious.

One you tracked down the type being leaked it is usually very easy to find the cause and fix it just by a simple code analysis, but if you are having problems with that the solution is to find a few objects of the leaked type. Use:

!dumpheap -type <LeakedTypeName>

and then dump the tree references that prevents the Garbage Collector from reclaiming the object:

!gcroot <AddressOfObject>

There is an excellent post on the subject from Rico Mariani that details the process: http://blogs.msdn.com/ricom/archive/2004/12/10/279612.aspx.

There are also various SOS cheat sheets out there, the one I found refreshingly simple and succinct is this one: http://geekswithblogs.net/.netonmymind/archive/2006/03/14/72262.aspx

In my case the leak cause was very obvious and was caused by not removing an event handler from an object event after the object was no longer needed. Unfortunately this pattern of leak is very frequent because of the default event handling code generated by pressing the Tab key in Visual Studio.

The following snippet of code illustrates the problem I’m talking about:

       class Worker
        {
            public void Work(object state)
            {
                // do some work here ...
                OnFinished();
            }

            private void OnFinished()
            {
                if (null != Finished)
                {
                    Finished(thisEventArgs.Empty);
                }
            }

            public event EventHandler Finished;
        }

        static int RunningCount;
        static AutoResetEvent eventDone;

        static void Main(string[] args)
        {
            RunningCount = 10000;
            eventDone = new AutoResetEvent(false);
            for (int i = 0; i < 10000; ++i)
            {
                Worker leaked = new Worker();
                leaked.Finished += new EventHandler(leaked_Finished);
                ThreadPool.QueueUserWorkItem(new WaitCallback(leaked.Work));
            }
            eventDone.WaitOne();
        }

        static void leaked_Finished(object sender, EventArgs e)
        {
            int count = Interlocked.Decrement(ref RunningCount);
            if (0 == count)
            {
                eventDone.Set();
            }
        }

To correct the problem in my example you have to remove the handler from the Worker Finiched event when it completes:

        class Worker
        {
            public void Work(object state)
            {
                 // do some work here ...
                OnFinished();
            }

            private void OnFinished()
            {
                if (null != Finished)
                {
                    Finished(thisEventArgs.Empty);
                }
            }

            public event EventHandler Finished;
        }

        static int RunningCount;
        static AutoResetEvent eventDone;
        static EventHandler callbackFinished;

        static void Main(string[] args)
        {
            RunningCount = 10000;
            eventDone = new AutoResetEvent(false);
            callbackFinished = new EventHandler(leaked_Finished);
            for (int i = 0; i < 10000; ++i)
            {
                Worker leaked = new Worker();
                leaked.Finished += callbackFinished;
                ThreadPool.QueueUserWorkItem(new WaitCallback(leaked.Work));
            }
            eventDone.WaitOne();
        }

        static void leaked_Finished(object sender, EventArgs e)
        {
            Worker leaked = (Worker)sender;
            leaked.Finished -= callbackFinished;
            int count = Interlocked.Decrement(ref RunningCount);
            if (0 == count)
            {
                eventDone.Set();
            }
        }

Replacing Service Certificates that are near expiration

November 26th, 2008

Service Broker services use certificates for authenticating message origin and for encrypting messages. I have explained in detail how this authentication works in my earlier post Conversation Authentication. The certificates used for service authentication are most times self-signed certificates created directly by SQL Server using CREATE CERTIFICATE and they expire by default one year after creation. When these certificates expire they have to be replaced and this article is goal is to help you do thisreplacement with no impact on production systems.

Identifying the certificates used by services

Here is a recap of the criteria Service Broker uses to pick a certificate to represent an service identity:

  1. Service owner and certificate owner must be the same database principal
  2. Certificate private key must be encrypted with the database master key
  3. Certificate start date and expiry dates must be valid
  4. Certificate must be ACTIVE FOR BEGIN_DIALOG = ON

If there are more certificates that satisfy the above criteria then Service Broker will pick the one with the latest expiration date. The following query will show the certificates that can be used by Service Broker and the services they are be used for:

select s.name, c.name, c.start_date, c.expiry_date
	from sys.services s
	join sys.certificates c on s.principal_id = c.principal_id
	where c.pvt_key_encryption_type = 'MK'
		and c.is_active_for_begin_dialog = 1
		and GETUTCDATE() BETWEEN c.start_date AND c.expiry_date
		and s.service_id > 2;

The query filters out the QueryNotifications and EventNotifications services because these two services do not pick certificates base don the service owner but instead they pick the ones used by the owner of the notification subscription.

Read the rest of this entry »

Event ID 833: I/O requests taking longer than 15 seconds

October 28th, 2008

The error 833 is usually associated with hardware or system driver problems and the typical recommendation is to replace the hardware or update the drivers and firmware used. However there is a common scenario that leads to this problem when your hardware is fine and sound.

Read the rest of this entry »

Certificate Not Yet Valid

August 25th, 2008

So you are running into a blocking issue, try to find a solution and you gets back a reply like ‘Sorry, I cannot repro your problem’. You go back and try again, and sure the problem is still there. You dig deeper, go to the Advanced search options on Google, but still cannot find the answer. Desperately, you try again and, surprise, the problem has vanished. Sounds familiar?

I know of one such problem that appears again and again when discussing Service Broker, Database Mirroring and, to a lesser extent, SQL Server 2005 Cryptographic facilities. I want to show you what the problem is, what’s causing it and even drill a bit into how something so basic has slipped into the product.

Read the rest of this entry »

Error Handling and Activation

August 13th, 2008

I have previously talked here about the queue monitors and the role they play in launching activated procedures. If you recall, I’ve mentioned that the Queue Monitors will enter the NOTIFIED state after they launch a stored procedure and not launch again the procedure until the Queue Monitor notices that RECEIVE statements are being issued against the queue. In an older post I have also talked about how difficult is to get error handling right, and in particular cast and convention errors. This may seem a trivial problem but in the Service Broker programs it is actually a serious problem because of the frequent conversation to and from XML. These two separate issues can actually ‘cooperate’ into a somehow surprising behavior. Namely if the activated procedure hits an error before it completes the RECEIVE statement, the Queue Monitor will stay in NOTIFIED state and won’t activate again the procedure. Although this looks similar to the typical poison message behavior when the queue is automatically disabled, this is a different issue. And unlike the poison message case, in the case when the Queue Monitor is stranded in NOTIFIED state you can not get a notification that the queue is no longer functional.

This post continues with an example showing how a relatively safe activated procedure can end up in this situation.

Read the rest of this entry »