Resending messages
December 3rd, 2007This article is a continuation of my two articles on using Service Broker as a pure data-push one-way communication channel: Reusing Conversations and Recycling Conversations. I originally did not plan for this third part, but I was more than once asked the same question: if a conversation is in error what happens to the messages that were sent but not yet delivered?
The short answer is that messages that are still pending are in the sender’s database sys.transmission_queue system table so when an Error message is received the sender can scan this table and resend each message that is still pending. An example of such procedure is not difficult to code:
-- Resends all pending messages in sys.transmission_queue
-- belonging to an old colversation on a new conversation
--
CREATE PROCEDURE usp_ResendPending (
@oldConversation UNIQUEIDENTIFIER,
@newConversation UNIQUEIDENTIFIER)
AS
BEGIN
DECLARE @mt SYSNAME;
DECLARE @body VARBINARY;
-- Must declare a cursor to iterate over
-- all the pending messages
-- It is important to keep the message order
-- and to keep the original message type
DECLARE cursorPending CURSOR
LOCAL
FORWARD_ONLY
READ_ONLY
FOR SELECT [message_type_name],
[message_body]
FROM sys.[transmission_queue]
WHERE [conversation_handle] = @oldConversation
ORDER BY message_sequence_number;
OPEN cursorPending;
FETCH NEXT FROM cursorPending
INTO @mt, @body;
WHILE (@@FETCH_STATUS = 0)
BEGIN
-- Resend the message on the new conversation
IF (@body IS NOT NULL)
BEGIN
-- If the @body is not null it must be sent explicitly
SEND ON CONVERSATION (@newConversation)
MESSAGE TYPE @mt
(@body);
END
ELSE
BEGIN
-- Messages with no body must *not* specify the body,
-- cannot send a NULL value argument
--
SEND ON CONVERSATION (@newConversation)
MESSAGE TYPE @mt;
END
FETCH NEXT FROM cursorPending
INTO @mt, @body;
END
CLOSE cursorPending;
DEALLOCATE cursorPending;
END
GO
Now the interesting question here is not how to do this but when to do this. My previous article had this code that was dealing with error messages:
ELSE IF @messageTypeName = N'http://schemas.microsoft.com/SQL/ServiceBroker/Error' BEGIN -- Insert your error handling here. Could send a notification or -- store the error in a table for further inspection -- We’re gonna log the error into the ERRORLOG and Eventvwr.exe -- DECLARE @error INT; DECLARE @description NVARCHAR(4000); WITH XMLNAMESPACES ('http://schemas.microsoft.com/SQL/ServiceBroker/Error' AS ssb) SELECT @error = CAST(@messageBody AS XML).value( '(//ssb:Error/ssb:Code)[1]', 'INT'), @description = CAST(@messageBody AS XML).value( '(//ssb:Error/ssb:Description)[1]', 'NVARCHAR(4000)'); RAISERROR(N'Received error Code:%i Description:"%s"', 16, 1, @error, @description) WITH LOG;
-- Insert call to usp_ResendPending here
-- END CONVERSATION has to be issued AFTER the call to usp_ResendPending END CONVERSATION @handle; END
I’ve already highlighted the place where the call to the resending procedure should be inserted. Important note: the place where the END CONVERSATION is issued in code has moved from my earlier post because once a conversation is ended all pending messages in sys.transmission_queue are deleted. So now your sender is going to resend any pending message in case it receives an error. But what sort of errors may the sender receive? Now that is no longer a trivial problem.
Service Broker conversation errors
The message_body of a conversation error message contains the error code and error description. There are two types of errors one conversation can receive:
· application errors that were sent by the other endpoint of the conversation using the END CONVERSATION … WITH ERROR … verb. The error code for these errors is a positive number.
· system errors that were sent by the Service Broker infrastructure when an error condition has occurred. The error code for these errors is a negative number and its value is negated value of a SQL Server error code from sys.messages.
I cannot comment anything about the application error values because, of course, they are application specific and is probably you that defined them in the first place. But I can give details on the system error codes what they mean:
–8494 You do not have permission to access the service ‘%.*ls’.
This system error is sent to a conversation that is trying to reach a service that is denying access to the sender.
–8489 The dialog has exceeded the specified LIFETIME.
This system error is sent to a conversation when the lifetime specified during BEGIN DIALOG has expired. Note that all conversations have a lifetime, if you omitted it then the lifetime is the 231 seconds (that is roughly 68 years)
–8462 The remote conversation endpoint is either in a state where no more messages can be exchanged, or it has been dropped.
This system error is sent on a conversation that continued to send messages to a peer that has already closed it’s endpoint.
–8470 Remote service has been dropped.
This system error is sent on each active conversation that was initiated from or targeted a service that was DROPed from the database
–8469 Remote service has been altered.
This system error is sent on each active conversation that was initiated from or targeted a service that was ALTERed
–8487 The remote service contract has been dropped
This system error is sent on each active conversation that was initiated using the contract that is being DROPed from the database
–8490 Cannot find the remote service ‘%.*ls’ because it does not exist.
This system error is sent to a conversation that is trying to reach a service in a specific broker instance and the service does not exist. This can happen if the optional broker instance is specified during BEGIN DIALOG
–8425 The service contract ‘%.*ls’ is not found.
This system error is sent to a conversation that is using a contract not installed on the peer database
–8408 Target service ‘%.*ls’ does not support contract ‘%.*ls’.
This system error is sent to a conversation that is using a contract not bound to the target service
–8428 The message type “%.*ls” is not found.
This system error is sent to a conversation that is using a message type not found in the target database
–8498 The remote service has sent a message of type ‘%.*ls’ that is not part of the local contract.
This system error is sent to a conversation that is using a message not bound to the contract used
–8430 The message body failed the configured validation.
This system error is sent to a conversation that is using a message type validation (none, empty, well_formed_xml) that does not agree with the peer validation for the same message type.
–8457 The message received was sent by the initiator of the conversation, but the message type ‘%.*ls’ is marked SENT BY TARGET in the contract.
This system error is sent to a conversation that has sent a message from initiator but the message is marked as SENT BY TARGET in the contract
–8437 The message received was sent by a Target service, but the message type ‘%.*ls’ is marked SENT BY INITIATOR in the contract.
This system error is sent to a conversation that has sent a message from target but the message is marked as SENT BY INITIATOR in the contract
–8495 The conversation has already been acknowledged by another instance of this service.
This system error is sent to a conversation that is sending a reply to an initiator but the initiator is rejecting this peer because has already established conversation with another peer. This scenario can happen only in certain conditions when the load-balancing routes of Service Broker are used.
–9616 A message of type ‘%.*ls’ was received and failed XML validation. %.*ls This occurred in the message with Conversation ID ‘%.*ls’, Initiator: %d, and Message sequence number: %I64d.
This system error is sent to a conversation that has sent a message type marked as conforming to a certain XML schema but the payload has failed to pass the XML validation for the said schema
–9719 The database for this conversation endpoint is attached or restored.
This system error is sent to any active conversation when the ALTER DATABASE … SET ERROR_BROKER_CONVERSATIONS is issued. The text of the error message comes from the fact that this ALTER statement is usually issued in conjunction with restore or attach operations.
–28052 Cannot decrypt session key while regenerating master key with FORCE option.
This system error is sent to a conversation when the database master key was regenerated with FORCE and the conversation sessions keys (encrypted with the database master key) are lost
Although the list of system errors is quite large (and I actually left out three more errors that I really think too much of a corner case to worth mention) we can distinguish three cases:
1. Errors that we can recover from immediately and can continue sending messages, including resend any pending messages. These are 8489 LIFETIME expired, 8462 Peer has closed, 9719 attach/restore and 28052 Master key regenerate.
2. Errors that are the result of a breach of contract between the two parties (initiator and target). There is no point in retrying in this case since these problems indicate either an application defect or a deployment configuration problem. The list of these errors is 8494 Access denied, 8490 Service does not exist and 8425, 8408, 8428, 8498, 8430, 8457, 8437, 9616 all of which are a form of service contract agreement violation. In case you wonder how can one hit errors like 8437 and why isn’t SQL Server itself enforcing the correct SENT BY TARGET/SENT BY INITIATOR when the SEND verb is issued this is because the message type and contract definition may be different between the initiator and target database.
3. Error that are the result of a database DROP or ALTER operation and the retry result is undefined. These are 8470, 8487 and 8469.
The appropriate action for each case depends on you application semantics. For my example of reusing/recycling dialogs I would only take the action of resending pending messages for errors in the first case 8489, 8462, 9719 and 28052:
...
ELSE IF @messageTypeName = N'http://schemas.microsoft.com/SQL/ServiceBroker/Error'
BEGIN
-- Insert your error handling here. Could send a notification or
-- store the error in a table for further inspection
-- We’re gonna log the error into the ERRORLOG and Eventvwr.exe
--
DECLARE @error INT;
DECLARE @description NVARCHAR(4000);
WITH XMLNAMESPACES ('http://schemas.microsoft.com/SQL/ServiceBroker/Error' AS ssb)
SELECT
@error = CAST(@messageBody AS XML).value(
'(//ssb:Error/ssb:Code)[1]', 'INT'),
@description = CAST(@messageBody AS XML).value(
'(//ssb:Error/ssb:Description)[1]', 'NVARCHAR(4000)');
RAISERROR(N'Received error Code:%i Description:"%s"',
16, 1, @error, @description) WITH LOG;
IF (@error IN (-8489, -8462, -9719, -28052))
BEGIN
-- Insert call to usp_ResendPending here
END
-- END CONVERSATION has to be issued AFTER the call to usp_ResendPending
END CONVERSATION @handle;
END
Resending on any of the other errors would most likely result immediately in an error being sent back because the condition that caused the first error would simply repeat itself, being SEND access denied or service contract agreement violation or whatever.
Idempotent Messages
Ultimately a warning for caution if you choose to resend pending messages. The fact that a message is pending in sys.transmission_queue is not a guarantee that the target has not received that message. It only indicates that an acknowledgment was not received back and/or the sender did not yet delete the pending message. So if you decide to add resending logic in your application be careful that the sender is now prone to receiving duplicate data as the same message payload is successfully delivered twice. One way to achieve this is to make the payload your messages carry idempotent, but how to achieve this is completely application specific, so I cannot give any generic advice on this (but hey, you could choose to use my consulting services and get a problem specific answer J).
[…] Resending messagesBy remusThis article is a continuation of my two articles on using Service Broker as a pure data-push one-way communication channel: Reusing Conversations and Recycling Conversations. I originally did not plan for this third part, …rusanu.com – http://rusanu.com […]
[…] subscription was dropped, I have covered the system error 8470 and others in my earlier post Resending messages But of course, the real question is why is this happening? Is there some coding error on the part […]
Maybe an obvious question but where does the @newConversation value come from?
The value passed in for the @newConversation parameter is a new dialog handle created the same way the [usp_Send] procedure creates new dialog handles in the original article: http://rusanu.com/2007/05/03/recycling-conversations/
If I’ve followed the other blog posts correctly, on the sender_queue usp_SenderActivation is activated by an error occurring (receives an Error message). usp_SenderActivation then calls usp_ResendPending within a transaction if (@error IN (-8489, -8462, -9719, -28052)). It seems only one conversation_handle is available in the usp_SenderActivation example, is the sample missing a create new dialog code block and insert into SessionConversations table to be complete?
[…] how to treat conversation errors? […]