Exception handling and nested transactions

June 11th, 2009

I wanted to use a template for writing procedures that behave as intuitively as possible in regard to nested transactions. My goals were:

  • The procedure template should wrap all the work done in the procedure in a transaction.
  • The procedures should be able to call each other and the calee should nest its transaction inside the outer caller transaction.
  • The procedure should only rollback its own work in case of exception, if possible.
  • The caller should be able to resume and continue even if the calee rolled back ts work.

My solution is to use a either a transactions or a savepoint, depending on the value of @@TRANCOUNT at procedure start. The procedures start a new transaction if no transaction is pending. Otherwise they simply create a savepoint. On exit the procedure commits the transaction they started (if they started one), otherwise they simply exit. On exception, if the transaction is not doomed, the procedure either rolls back (if it started the transaction), or rolls back to the savepoint it created (if calee already provided a transaction).

Because of the use of savepoints this template does not work in all situations, since there are cases like distributed transactions, that cannot be mixed with savepoints. But for the average run of the mill procedures, this template has saved me very well.

create procedure [usp_my_procedure_name]
as
begin
	set nocount on;
	declare @trancount int;
	set @trancount = @@trancount;
	begin try
		if @trancount = 0
			begin transaction
		else
			save transaction usp_my_procedure_name;

		-- Do the actual work here
	
lbexit:
		if @trancount = 0	
			commit;
	end try
	begin catch
		declare @error int, @message varchar(4000), @xstate int;
		select @error = ERROR_NUMBER(), @message = ERROR_MESSAGE(), @xstate = XACT_STATE();
		if @xstate = -1
			rollback;
		if @xstate = 1 and @trancount = 0
			rollback
		if @xstate = 1 and @trancount > 0
			rollback transaction usp_my_procedure_name;

		raiserror ('usp_my_procedure_name: %d: %s', 11, 1, @error, @message) ;
		return;
	end catch	
end

%%lockres%% collision probability magic marker: 16,777,215

May 29th, 2009

@jrowlandjones blogged about a dubious deadlock case. I recommend this article as is correct and presents a somewhat esoteric case of deadlock: the resource hash collision. The lock manager in SQL Server doesn’t know what it locks, it just locks ‘resources’ (basically strings). It is the job of higher level components like the the access methods of the storage engine to present the ‘resource’ to the lock manager and ask for the desired lock. When locking rows in a heap or a b-tree the storage engine will synthesize a ‘resource’ from the record identifier. Since these resources have a limited length, the storage engine has to reduce the effective length of a key to the maximum length is allowed to present to the lock manager, and that means that the record’s key will be reduced to 6 bytes. This is achieved by hashing the key into a 6 byte hash value. Nothing spectacular here.

But if you have a table with a key of length 50 bytes and its reduced to 6 bytes, you may hit a collision. So how likely is this to happen?

On 6 bytes there are 281,474,976,710,656 distinct possible values. Its a pretty big number? Not that big actually. If we meet at a party and I say ‘I bet somebody in the room shares your birthday’ would you bet against me? You probably should :) What if I change my question to ‘I bet there are two people in this room that share the birthday!’? Now I will probably take your money. This is called a ‘meet-in-the-middle‘ attack in cryptography and basically it says that you get a 50% collision probability at half the hash length. So the SQL %%lockres%% hash will produce two records with same hash, with a 50% probability, out of the table, any table, of only 16,777,215 record. That suddenly doesn’t look like a cosmic constant, does it? And keep in mind that this is the absolutely maximum value, where the key has a perfectly random distribution. In reality keys are not that random after all. Take Jame’s example: a datetime (8 bytes), a country code (1 byte), a group code (2 bytes) and a random id (4 bytes). From these 15 bytes quite a few are actually constant: eg. every date between 2000 and 2010 has the first 4 bytes identical (0×00) and the 5th byte only has two possible values (0×08 or 0×09). If from the other codes (country, group) we use only 50% of the possible values, then in effect we use, generously, just 10 bytes of the 15 bytes of the key. This means the table has a 50% collision probability at only about 11 million records. considering that he was doing a ‘paltry’ 900 million records upload, no wonder he got collisions…

Have you met  ? Say hello to my BOM

May 21st, 2009

I recently had to look at a problem where a programmer writing a tool to parse some XML produced by a service written by me was complaining that my service serves invalid XML. All my documents seemed mangled and started with . I couldn’t help but smile. So what is this mysterious sequence ? Well, lets check the ISO-8859-1 character encoding, commonly known as ‘Latin alphabet’ and often confused with the Windows code page 1252: ï is the character corresponding to 0xEF (decimal 239), » is 0xBB (decimal 187) and ¿ is 0xBF (decimal 191). So the mysterious sequence is 0xEFBBBF. Does it look familiar now? It should, this is the UTF-8 Byte Order Mark. Moral: if you consume and parse XML, make sure you consume it as XML, not as text. All XML libraries I know of correctly understand and parse the BOM. The only problems I’ve seen are from hand written ‘parsers’ that treat XML as a string (and most often fail to accommodate namespaces too…).

stackoverflow.com: how to execute well on a good idea

May 18th, 2009

Some time ago I started noticing on my google searches a newcomer: stackoverflow.com. At first I dismissed this as yet another SEO hack to divert traffic to some re-syndicated content of the old user groups and forums. But I was wrong. Turns out stackoverflow.com is an enterprise backed by well known industry names like Joel Spolsky of Joel on Software fame. Apparently I’ve been living in a cave, since this new site is quite popular and even doing a Stack Overflow DevDays tour!

Now being that I’m a forums and newsgroup addict with a long history of MSDN abuse, I had to join this one and start showing off my amazing knowledge and wit. OK, you can all stop laughing now. I am a noob, so what? I still love to answer questions ;) and not actually knowing the answer has never a stop for me. Imagination is more important than knowledge!

I spend now about 4 days on stackoverflow.com and I must say that I’m impressed. First of all, they do offer innovation in how the content is gathered and presented. The hierarchical forums model was obviously obsolete and it was showing its age: new users have a hard time figuring out which forum is the right one, monitoring the questions is difficult as the volume increases, there are often questions that obviously span multiple topics and picking the one forum for it severely restricts the exposure of the said question, and implicitly the quality of responses. Instead stackoverflow.com goes for a tags based model. When you ask a question you choose tags relevant for it and you can mix and match tags as diverse as linq, objective-c and php in one single question.

Now tags based contents isn’t exactly new, but the way is executed on stackoverflow.com takes it to a new level. Of course they offer tags based browsing of topics. But they also keep track of the tags you must often interact with (ask or answer). You can browse tags within current tags to get the questions that cover multiple tags of your interest. And the tags system is completely open, anyone can create new tags and they even have awards for successful tags.

The next innovation idea I like is how they try to blend the line between wiki and forums. Topics that prove to be popular and have a good answer can be promoted to wiki entries. This makes the entry serve the same reference role the ’sticky’ posts serve in forums, but with better functionality. Also rooted in wikis (and craigslist too) is the idea of member provided social policing for the content: answers get voted up or down by community members. Not only that, but questions also get to be voted up or down, which is something I have not seen elsewhere. And ultimately questions can be closed, responses deleted. How is this different from the forum administrators? These people are not administrators, are just ordinary community members. You gain reputation, you earn privileges.

The reputation system is not new, by now almost any community forum has a reputation points system in place. But with stackoverflow.com they added also a system of badges that I feel comes straight from the video games world of achievements and vanity awards : you get bronze, silver or gold badges for achieving tasks in the stackoverflow.com ecosystem. You get your Teacher badge for answering a question and receiving an Up vote, you get the Student badge for answering a question that receives an up vote, or even a gold badge of Great Answer if it gets voted up 100 times. Now these are, of course, vanity awards. We all know though how efficient they are in keeping users hooked in! Me, I’m eager to get my Critic badge…

The only serious thing missing is the RSS syndication of views, but I hear is in the plans.

But most impressive is the quality of execution on these good ideas. The site is fast and responsive. It provides suggestion to similar questions as you type yours. It provides fast navigation to your questions and answers. Visual notifications for changes since your previous check. Suggestions for related topics (ie. common tags). Is true that I don’t know how many users it carries. Judging from the ~30K Teacher badges, I’d guess some 50k users registered and active, as a conservative estimate.

Is also nice to see such an effort started from a grassroots movement, and not from the political sponsorship of an industry player. Today’s developer has to deal in the course of a single day with an NSConnection question, related to an issue of Appache .htaccess mod_rewrite and PHP cookie handling and resulting in a SQL Server access problem. A site like the Social on MSDN would not happily sponsor and encourage such questions, nor would it nurture and grow the community leaders that can answer such end-to-end and cross platform questions.

Read/Write deadlock

May 16th, 2009

How does a simple SELECT deadlock with an UPDATE? Surprisingly, they can deadlock even on well tuned systems that does not do spurious table scans. The answer is very simple: when the read and the write use two distinct access paths to reach the same key and they use them in reverse order. Lets consider a simple example: we have a table with a clustered index and an non-clustered index. The reader (T1) seeks a key in the non-clustered index and then needs to look up the clustered index to retrieve an additional column required by the SELECT projection list. The writer (T2) is updating the clustered index and then needs to perform an index maintenance operation on the non-clustered index. So T1 holds an S lock for the key K on the non-clustered index and wants an S lock on the same key K on the clustered index. T2 has an X lock on the key K on the clustered index and wants an X lock on same key K on the non-clustered index. Deadlock, T1 will be chosen as a victim. So you see, there are no complex queries involved, no suboptimal scan operations, no lock escalation nor page locks involved. Simple, correctly written queries may deadlock if doing read/write operations on the same key on a table with two indexes. Lets show this in an example:

use tempdb;
go

create table TestDeadlock (TestId int identity(1,1),
		constraint cdxTestDealockTestId primary key clustered (TestId),
	ColA varchar(10) not null,
	ColB varchar(10) not null);
go

create index idxTestDeadlockColA on TestDeadlock(ColA);
go

Now lets populate the table with some random data:

set nocount on;
insert into TestDeadlock (ColA, ColB) values ('A', 'B');

declare @i int;
select @i = 0;
while @i < 10000
begin
	insert into TestDeadlock (ColA, ColB) values (rand()*99999, rand()*99999);
	select @i = @i+1;
end
go

Now open two query windows, and on the first one start this loop (the Reader):

set nocount on;
while (1=1)
begin
	declare @B varchar(10);
	select @B = ColB from TestDeadlock where ColA='A';
end

On the second one, start this loop (the Writer):

set nocount on;
while (1=1)
begin
	update TestDeadlock set ColA='A' where TestId=1;
	update TestDeadlock set ColA='B' where TestId=1;
end

Now switch back to the first query window and there you have it:

Msg 1205, Level 13, State 51, Line 6
Transaction (Process ID 52) was deadlocked on lock resources with another process
and has been chosen as the deadlock victim. Rerun the transaction.

Looking at the deadlock graph will show exactly the situation I described:

SPID 52 has an KEY S lock on idxTestDeadlockColA and wants an KEY S lock on cdxTestDeadlockTestId, SPID 53 has the KEY X lock on cdxTestDeadlockTestId and wants a KEY X lock on dxTestDeadlockColA. SPID 52 is chosen as victim since it performed less writes than SPID 53.

Why is this kind of deadlock not more prevalent? The key ingredient is that both T1 and T2 have to go after the same ‘key’. On a real life situation, this is just a matter of probabilities. If the write application pattern is to write new keys and to look up old ones, it is unlikely to happen. If the number of keys is very large then the probability of overlap is low (of course it increases around ‘hot spots’, for example a hot post on your website). In my recent experience on of the main culprits for these deadlocks are the hit counters prevalent in many content management systems (increment number of hits a page or image was shown).

Unfortunately things are not always so easy to understand and to explain. The same problem can manifest itself in the disguise of a join. The key learning to take home is this: neither the reads nor the writes do not acquire locks in an atomic fashion on all indexes of a table (that would be impossible, of course!). Whenever two transactions acquire these locks in a different order from one another, they open themselves to a deadlock.

The proper fix varies, of course. Things to consider are:

  • eliminate unnecessary columns from reader’s projection so he does not have to look up the clustered index
  • add required columns as contained columns to the non-clustered index to make the index covering, again so that the reader does not have look up the clustered index
  • avoid updates that have to maintain the non-clustered index
  • One final note: why did I add 10000 dummy record in my example? Because otherwise the optimizer would see that the clustered index has only one page and would choose a plan that scans this index instead of seeking the non-clustered index.