Replacing Service Certificates that are near expirationNovember 26th, 2008
Service Broker services use certificates for authenticating message origin and for encrypting messages. I have explained in detail how this authentication works in my earlier post Conversation Authentication. The certificates used for service authentication are most times self-signed certificates created directly by SQL Server using CREATE CERTIFICATE and they expire by default one year after creation. When these certificates expire they have to be replaced and this article is goal is to help you do thisreplacement with no impact on production systems.
Identifying the certificates used by services
Here is a recap of the criteria Service Broker uses to pick a certificate to represent an service identity:
- Service owner and certificate owner must be the same database principal
- Certificate private key must be encrypted with the database master key
- Certificate start date and expiry dates must be valid
- Certificate must be ACTIVE FOR BEGIN_DIALOG = ON
If there are more certificates that satisfy the above criteria then Service Broker will pick the one with the latest expiration date. The following query will show the certificates that can be used by Service Broker and the services they are be used for:
select s.name, c.name, c.start_date, c.expiry_date from sys.services s join sys.certificates c on s.principal_id = c.principal_id where c.pvt_key_encryption_type = 'MK' and c.is_active_for_begin_dialog = 1 and GETUTCDATE() BETWEEN c.start_date AND c.expiry_date and s.service_id > 2;
The query filters out the QueryNotifications and EventNotifications services because these two services do not pick certificates base don the service owner but instead they pick the ones used by the owner of the notification subscription.
You can run this query in any database that has services establishing remote dialogs and quickly see if the certificates used are about to expire.
When I explained how to replace the certificates for endpoints I showed that the replacement can go smooth and with no downtime for production if it the replacement process takes care of first deploying the certificate to the peers and only after that starts using the new certificate. Same is true for certificates used by services. The replacement process goes as follows:
- Create a new certificate to replace the old one but mark the certificate as ACTIVE FOR BEGIN_DIALOG=OFF.
- Deploy the certificate to the other databases that establish dialogs with the service(s) that are having the certificate replaced.
- Activate the new certificate by turning ACTIVE FOR BEGIN_DIALOG=ON
- Drop the old certificate
I have two services that are exchanging messages over authenticated and encrypted conversations. The initiator service is named MyService and is hosted on the machine REMUSRX64. The target service is named ImageProcessor and is hosted on the machine VSQL2K5EXPRESS. First thing I’m doing is to check if the certificate used by MyService is near expiration:
select s.name, c.name, c.start_date, c.expiry_date, c.thumbprint from sys.services s join sys.certificates c on s.principal_id = c.principal_id where c.pvt_key_encryption_type = 'MK' and c.is_active_for_begin_dialog = 1 and GETUTCDATE() BETWEEN c.start_date AND c.expiry_date and s.service_id > 2;
The certificate will expire on Dec 12, 2008, which is close enough to warrant replacement now. So I will create a new certificate to replace this one, but I will be careful not to enable it for Service Broker by specifying ACTIVE FOR BEGIN_DIALOG=OFF:
create certificate MyService_2009_certificate authorization dbo with subject = 'My Service Identity', start_date = '2008-11-26' active for begin_dialog=off;
In my case the service MyService is owned by dbo so I used the authorization clause to specify dbo as the newly created certificate owner.
Now that I have the new certificate I can go ahead and deploy it on VSQL2K5EXPRESS. Certificate deployment is done by exporting the certificate into a .cer file, transferring the file to the partner system by an out-of-band method (eg. a file transfer, by email or via a ftp operation) and then importing the certificate into the target database:
backup certificate MyService_2009_certificate to file = 'c:\shared\MyService_2009_certificate.cer'
I next copy the file over to the VSQL2K5EXPRESS system:
In order to restore the certificate on the VSQL2K5EXPRESS system I need to know database principal used on VSQL2K5EXPRESS to represent the identity of MyService. In other words, I need to find out the owner, on VSQL2K5EXPRESS, of the existing certificate still used by MyService:
select user_name(principal_id) from sys.certificates where thumbprint = 0x1EF52285C7300D3F2DCB5160E93E2963DAB59D8D
I can now import the new certificate into VSQL2K5EXPRESS:
create certificate MyService_2009_Certificate authorization MyService_user from file = 'c:\temp\myservice_2009_certificate.cer'
Now the new certificate is deployed and ready to use. Because it is owned by the user MyService_user on VSQL2K5EXPRESS I do not need to grant SEND permission, the user already has this permission. The only action left to do is to enable the new certificate on REMUSRX64 for ACTIVE FOR BEGIN_DIALOG=ON so that Service Broker starts using this new certificate:
alter certificate MyService_2009_certificate with active for begin_dialog = on;
This is it. All new conversations initiated by MyService after this change are going to use the new certificate for authentication and encryption. Because we already deployed the certificate to the partner service there is no production downtime as the target is prepared to accept the new certificate.
Why isn’t necessary to mark the new certificate ACTIVE FOR BEGIN_DIALOG=OFF on VSQL2K5EXPRESS during the deployment? Because REMUSRX64 already has the certificate. So even if VSQL2K5EXPRESS does use the new certificate (and indeed it will use it to encrypt replies) it is OK, because REMUSRX64 already has this new certificate and can decrypt the replies.
OK, so this is how you replace the certificate used by initiator. How about the target, how does the procedure change? It doesn’t. The procedure is absolutely identical, but instead of starting from the initiator service certificate you start from the target. Even though the service authentication model is asymmetric and there are differences between the original deployment in the initiator vs. target role (think REMOTE SERVICE BINDING on initiator vs. SEND permission on target), the steps to replace the certificates used are absolutely identical.
One thing worth mentioning is how to look for target’s certificates about to expire on initiator:
select r.remote_service_name, c.name, c.start_date, c.expiry_date, c.thumbprint from sys.remote_service_bindings r join sys.certificates c on r.remote_principal_id = c.principal_id where pvt_key_encryption_type = 'NA' and GETUTCDATE() between c.start_date and c.expiry_date
This query will show certificates associated with remote service bindings and their start date and expiry date. If you see a certificate near expiration then you should connect to the system hosting the service shown in remote_service_name column and start from there the replacement process. If you are not the administrator of that system then all you can do is contact the administrator of said system and notify her that the certificate is about to expire.
Anonymous conversations use only one certificate, that of the target. Since they are intended for cases when the target cannot possibly know all the initiators that are connecting to him, the procedure described above will not work since is impossible to deploy the new certificate to all the partners without knowing them. Since for anonymous conversations the initiator services must have the target service’s certificate the usual deployment strategy involves the public posting of the certificate used by the target so that any potential initiator can deploy it before initiating the conversation with the target. In practice this usually means either a Web page, a FTP location or an intranet share from where the target service can be downloaded. Since that certificate will expire too, the administrator of the target system must also take necessary precautions to replace it. The procedure in this case will work like this:
- A new certificate is created on the target system.
- The new certificate is posted publicly with a note specifying that starting from a certain date in the future it will be the new certificate accepted.
- Initiators have a time period allowed for downloading and deploying the new certificate. They enable the new certificate immediately (ACTIVE FOR BEGIN_DIALOG=ON) and can drop the old certificate from their systems.
- At the announced date the old certificate will expire. Any client that had neglected to deploy the new certificate will stop working.
The target service in the anonymous case needs not bother with the ACTIVE FOR BEGIN_DIALOG clause. This clause controls only when Service Broker is actively picking a certificate to use and in anonymous conversations the target never picks one because the replies are encrypted with the initiator’s session keys, as I described in more detail in my Endpoint Authentication article.
Given the fact that in anonymous cases the replacement process is dependent on all the initiator’s doing deployment of the new certificate, it is best if the replacement is planned and started well ahead of expiration. A good policy would be to create certificates valid for one year and post the new replacement every 6 months. After a new certificate is announced for the next 6 months both the old certificate and the new one are accepted by target. After 6 month a new one is created, the former ‘new’ now becomes the old and the former ‘old’ becomes invalid. In such a scheme the target always accept two certificates.