Recycling Conversations

May 3rd, 2007

In my previous post Reusing Conversations I promised I’ll follow up with a solution to the question about how to end the conversations that are reused for the data-push scenario (logging, auditing, ETL for DW etc).

First of all, why end them in the first place? The answer is to minimize exposure in case of problems. If a problem happens and a dialog has to be destroyed (i.e. errored), then we want to minimize the potential number of messages that we have to re-process or that are potential lost. But wait a minute, how come we’re talking about message loss, doesn’t Service Broker guarantee delivery? The truth is that no system can ever guarantee delivery under all conditions. Human error can happen resulting in misconfiguration, applications have bugs that can send invalid messages (e.g. fail XML validation); hardware can be catastrophically lost w/o any possibility to restore a database/host and so on and so forth. Reusing a single dialog to send all your messages (or one per @@spid) is like putting all your eggs in one basket. If that dialog was trailing millions of messages behind, all those messages have to be reprocessed or are lost.

Another reason to recycle the dialogs is to allow RETENTION on queues. Enabling RETENTION keeps a copy of every message sent to/from that queue for the duration of the dialog. If a dialog is kept open for unlimited duration, the queue will continuously grow.

Also we want to recycle dialogs to deal with load balancing scenarios, since a dialog is sticky to one service instance we would like to periodically start new dialogs so they ‘stick’ to new instances of from the available target service instances.

So how should we end this dialogs? My recommendation is to end them based on time. Choose a time for how long a dialog should be used, depending on the expected number of messages traffic. Then you can use the Service Broker conversation timers to end the dialog when it’s time is done. I also believe that the timer should be handled by the initiator, but the conversation should be ended by the target. Given that the pattern we’re talking about (one way data push) involves sending only messages from initiator to target, the initiator is not allowed to end the conversation because that will create a fire-and-forget message pattern, a very undesirable message exchange pattern, see My recommendation is to have a special message in the contract used by the two services that informs that target that the initiator is done sending on this dialog. Lets call this message [EndOfStream]. The target would respond to this message by ending its side of the conversation, thus sending an EndDialog message to the initiator. The initiator responds to this message by ending its side (issuing the END CONVERSATION verb).

To implement what I described above we would need to modify the original example from to create a conversation timer when the dialog is created and to add an activated procedure that handles the timer message, as well as the EndDialog and Error messages. One interesting issue is how to handle the timer message. After sending the [EndOfStream] message, the initiator should not send any more messages on that dialog. This is easily achieved by removing the dialog handle from the [SessionConversations] table. The next invocation of usp_Send will create a new one, just as desired. But there is a race condition there, an usp_Send procedure might be executing already and trying to send a message on that conversation. Even though the dialog was removed from the [SessionConversations] table, the concurrent usp_Send will successfully send on the dialog that just sent an [EndOfStream] message. Since the target responds to the [EndOfstream] message by ending the conversation, the message sent is most likely lost as the target will ignore it! To solve this situation, the usp_Send procedure must guarantee the stability of the conversation it looked up in the [SessionConversations] table. So the SELECT from the [SessionConversations] table has to be REPEATABLE, and this can be achieved with an appropriate query hint (HOLDLOCK). Update: Several readers have deployed this into production and found out that HOLDLOCK is not enough and UPDLOCK behaves better. The code is updated to reflect this.

But by doing this we introduce a new problem: the activated procedure handling the timer message and the usp_Send procedure might deadlock each other! One will hold a lock on the conversation and try to update the [SessionConversations] table, while the other will hold the lock on the [SessionConversations] table and try to acquire the lock on the conversation: guaranteed deadlock. To solve this problem, will solve it as most deadlock problems are solved: acquire the locks in the same order in both procedures. Since the usp_Send must acquire the locks in the order row-in-[SessionConversations]-table-first-conversation-next, the activated procedure must do the same. That is: delete the row from [SessionConversations] first, send the [EndOfStream] message next.

So here is the code, the modified usp_Send and the initiator side activated procedure.

-- This table associates the current connection (@@SPID)
-- with a conversation. The association key also
-- contains the conversation parameteres:
-- from service, to service, contract used
CREATE TABLE [SessionConversations] (
, PRIMARY KEY (SPID, FromService, ToService, OnContract)
, UNIQUE (Handle));

-- SEND procedure. Will lookup to reuse an existing conversation
-- or start a new in case no conversation exists or the conversation
-- cannot be used
@fromService SYSNAME,
@toService SYSNAME,
@onContract SYSNAME,
@messageType SYSNAME,
@messageBody VARBINARY(MAX))
DECLARE @counter INT;
SELECT @counter = 1;
-- Will need a loop to retry in case the conversation is
-- in a state that does not allow transmission
WHILE (1=1)
-- Seek an eligible conversation in [SessionConversations]
SELECT @handle = Handle
FROM [SessionConversations] WITH (UPDLOCK)
AND FromService = @fromService
AND ToService = @toService
AND OnContract = @OnContract;
IF @handle IS NULL
-- Need to start a new conversation for the current @@spid
FROM SERVICE @fromService
TO SERVICE @toService
ON CONTRACT @onContract
-- Set an one hour timer on the conversation
INSERT INTO [SessionConversations]
(SPID, FromService, ToService, OnContract, Handle)
(@@SPID, @fromService, @toService, @onContract, @handle);
-- Attempt to SEND on the associated conversation
MESSAGE TYPE @messageType
SELECT @error = @@ERROR;
IF @error = 0
-- Successful send, just exit the loop
SELECT @counter = @counter+1;
IF @counter > 10
-- We failed 10 times in a row, something must be broken
N'Failed to SEND on a conversation for more than 10 times. Error %i.'
, 16, 1, @error) WITH LOG;
-- Delete the associated conversation from the table and try again
DELETE FROM [SessionConversations]
WHERE Handle = @handle;
SELECT @handle = NULL;

CREATE PROCEDURE usp_SenderActivation
DECLARE @messageTypeName SYSNAME;
@handle = conversation_handle,
@messageTypeName = message_type_name,
@messageBody = message_body
FROM [sender_queue];
IF @handle IS NOT NULL
-- Delete the message from the [SessionConversations] table
-- before sending the [EndOfStream] message. The order is
-- important to avoid deadlocks. Strictly speaking, we should
-- only delete if the message type is timer or error, but is
-- simpler and safer to just delete always
DELETE FROM [SessionConversations]
WHERE [Handle] = @handle;
IF @messageTypeName = N''
ELSE IF @messageTypeName = N''
ELSE IF @messageTypeName = N''
-- Insert your error handling here. Could send a notification or
-- store the error in a table for further inspection
-- We’re gonna log the error into the ERRORLOG and Eventvwr.exe
DECLARE @description NVARCHAR(4000);
@error = CAST(@messageBody AS XML).value(
'(//ssb:Error/ssb:Code)[1]', 'INT'),
@description = CAST(@messageBody AS XML).value(
'(//ssb:Error/ssb:Description)[1]', 'NVARCHAR(4000)')
RAISERROR(N'Received error Code:%i Description:"%s"',
 16, 1, @error, @description) WITH LOG;

-- attach the uspSenderActivation procedure with the sender’s queue
ALTER QUEUE [sender_queue]
PROCEDURE_NAME = [usp_SenderActivation],

7 Responses to “Recycling Conversations”

  1. [...] using Service Broker as a pure data-push one-way communication channel: Reusing Conversations and Recycling Conversations. I originally did not plan for this third part, but I was more than once asked the same question: [...]

  2. prasanth says:

    Very much impressed by the information provided about how to reuse the Conversation s and also reuse. I have a question to ask. My senderprocedure is called from a website where unknown number of instances may be created at the same time. And also the receiver service should process messages immediatly and need parallel processing there. How to apply both reusing (recycling ) conversations and parallelism in my case.

    If i were to use

    SELECT @handle = Handle

    FROM [SessionConversations] WITH (HOLDLOCK)


    AND FromService = @fromService

    AND ToService = @toService

    AND OnContract = @OnContract;

    my sender process will have same SPID and will get the same handle and eventually all the Messages sent to the receiver will come under one conversation. Then i cannot have parallel processing even with more queue readers.

  3. remus says:

    Hello Prasanth,
    Is there any relation between the messages being sent? If no, then you can augment the sending procedure to round-robin between several available conversations to achieve what you need. Also to consider if there are more than one sender to the same back-end, in which case the back-end could process in parallel conversations from different senders, w/o any change to the sender.
    On the other hand if the messages are related in any way, then you should look into using the relation between messages as the conversation, that is send all related messages on same conversation and open a new conversation for each set of related messages.
    If you have more questions you can use the contact form on the site to send me a mail, since mail is somehow easier to manage than this blog comments :)

  4. John says:

    Nice post ! Very usefull.
    In the code, when selecting from the conversation handles table, you wrote UPDLOCK but in the text, you wrote HOLDLOCK … Which one should be used ? Is this a typo ? What would be the difference between the two ?

  5. remus says:

    Hello John,
    Thanks for reading this. The original article had HOLDLOCK in both places. Later I got feedback from readers that actually deployed this solution that under high load the UPDLOCK is necessary. I have updated the article.

  6. [...] how long should a conversation be used for? [...]

  7. [...] when you want to do this. Why?… well there are a few issues described in admirable detail here, but invalid XML/ poisoned message were my main problem. The way this is usually handled is you try [...]