SQL Server 2008 R2 Express database size limit: 10GB

April 28th, 2010

The SQL Server 2008 R2 Express edition has increased the database size limit to 10GB, from the previous limit of 4GB. This is great news for many developers, as the 4GB limit was by far the most significant barrier to Express adoption. With today's rate of generating data, the 4GB limit was just plain small.

All the other limitations of SQL Server Express stay in place:

CPU
SQL Server Express uses only one CPU socket. It will use all cores and any Hyper-Threading logical processors in that socket though.
Memory
SQL Server Express limits the size of the data buffer pool to 1GB.
Replication
SQL Server Express can only participate as a subscriber in a replication topology.
Service Broker
Two SQL Server Express instances cannot exchange Service Broker messages directly; the messages have to be routed through a higher level SKU.
SQL Agent
SQL Server Express does not have an Agent service and as such it cannot run Agent scheduled jobs.
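
As an aside, a quick way to check which edition and version an instance is actually running is the SERVERPROPERTY function:

select serverproperty('Edition') as [Edition],
	serverproperty('EngineEdition') as [EngineEdition],
	serverproperty('ProductVersion') as [ProductVersion];

An EngineEdition value of 4 indicates one of the Express editions.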

How to change database mirroring encryption with minimal downtime

April 23rd, 2010

SQL Server Database Mirroring uses an encrypted connection to ship the data between the principal and the mirror. By default RC4 encryption is enabled and used. The endpoint can be configured to use AES encryption instead, or no encryption at all. The overhead of RC4 encryption is quite small; the overhead of AES encryption is slightly bigger, but still not significant. Under well understood conditions, like inside a secured data center, encryption can be safely turned off for a 5-10% increase in mirroring traffic speed. Note that even with encryption turned off, the traffic is still cryptographically signed (HMAC). Traffic signing cannot be turned off.

To change the encryption used by an endpoint, one has to run the ALTER ENDPOINT … FOR DATABASE_MIRRORING (ENCRYPTION = {DISABLED|SUPPORTED|REQUIRED}) statement. Two endpoints must have compatible encryption settings to be able to communicate. The following table shows the compatibility matrix of encryption settings:

            DISABLED         SUPPORTED    REQUIRED
DISABLED    CLEAR            CLEAR        (no connection)
SUPPORTED   CLEAR            ENCRYPTED    ENCRYPTED
REQUIRED    (no connection)  ENCRYPTED    ENCRYPTED

The default setting for an endpoint is ENCRYPTION = REQUIRED, which enforces encryption and refuses to connect to an endpoint that has disabled encryption.
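
For example, to turn encryption off on an endpoint (the endpoint name Mirroring below is hypothetical; use the name reported by sys.database_mirroring_endpoints):

alter endpoint [Mirroring]
	for database_mirroring (encryption = disabled);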

Changing encryption settings on an existing endpoint

If you have a running mirroring session and want to change the settings to squeeze the extra 5-10% you can expect from removing RC4 encryption, then chances are you deployed the endpoint with the default encryption setting, namely REQUIRED. If you don't know the current endpoint settings you can always check the sys.database_mirroring_endpoints metadata catalog. When encryption is REQUIRED the encryption_algorithm column is one of 1, 2, 5 or 6. When encryption is SUPPORTED the encryption_algorithm column is one of 3, 4, 7 or 8. When encryption is DISABLED the encryption_algorithm is 0 and the is_encryption_enabled column changes to 0. To force the traffic to be unencrypted at least one endpoint has to have ENCRYPTION = DISABLED.
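
A quick query to inspect the current settings (the column names below are those of the catalog view):

select name, encryption_algorithm, encryption_algorithm_desc,
	is_encryption_enabled
	from sys.database_mirroring_endpoints;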

When you run the ALTER ENDPOINT statement and change the encryption settings, the endpoint is stopped and restarted during the ALTER statement, and all existing connections are disconnected. A database mirroring session will typically re-connect immediately and not react to this short disruption in any fashion visible to the user.

The safest way to change an existing mirroring session that uses encryption to no longer encrypt the traffic, when there is no witness, is the following (a T-SQL sketch of the steps follows the list):

  1. Change the mirror endpoint to SUPPORTED
  2. Change the principal endpoint to DISABLED
  3. Change the mirror endpoint to DISABLED
  4. Verify that the connections are unencrypted by checking the encryption_algorithm column in the sys.dm_db_mirroring_connections DMV.
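
In T-SQL the sequence is simply the same ALTER statement run against each instance (again, the endpoint name Mirroring is hypothetical):

-- on the mirror:
alter endpoint [Mirroring] for database_mirroring (encryption = supported);
-- on the principal:
alter endpoint [Mirroring] for database_mirroring (encryption = disabled);
-- back on the mirror:
alter endpoint [Mirroring] for database_mirroring (encryption = disabled);
-- verify, on either partner:
select encryption_algorithm, encryption_algorithm_desc
	from sys.dm_db_mirroring_connections;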

If the mirroring session involves a witness, then it too must have its endpoint set to a compatible encryption setting, using the same ALTER statements as above:

  1. Change the witness endpoint to SUPPORTED
  2. Change the mirror endpoint to SUPPORTED
  3. Change the principal endpoint to DISABLED
  4. Change the witness endpoint to DISABLED
  5. Change the mirror endpoint to DISABLED
  6. Verify that the connections are unencrypted by checking the encryption_algorithm column in the sys.dm_db_mirroring_connections DMV.

Note that if automatic failover is enabled, it is possible for an automatic failover to occur at the moment the principal endpoint is changed, because while the principal endpoint restarts the mirror and the witness briefly have a quorum between themselves.

How to troubleshoot if something goes wrong

Attach Profiler to all instances involved in the mirroring session and open a trace that listens for the Audit Database Mirroring Login event class (on SQL Server 2005 use the Audit Broker Login event class instead; it traces the database mirroring sessions too). If you made a mistake during the ALTER ENDPOINT changes and ended up with incompatible settings, an event will be generated and visible in Profiler. The event text will contain an error message explaining why the endpoints cannot connect.

What is Remus up to?

April 1st, 2010

Some of you already know this: I am again an FTE with Microsoft. Over the past months I was a vendor on a contract for a very interesting project with Microsoft IT, and getting back in touch with the cool stuff that goes on inside Microsoft rekindled the passion for bold projects with big impact. Since March 29 I'm working again as a developer with what is officially known as the SQL RDBMS Core Team. I'm no longer involved with Service Broker though; I now work on the Access Methods: indexes, B-Trees, heaps and, let's not forget, DBCC. Fun stuff.

The Bizzaro Guide to SQL Server Performance

March 31st, 2010

Some say performance troubleshooting is a difficult science that blends just the right amount of patience, knowledge and experience. But I say forget all that: a few bullet points can get you a long way in fixing any problem you encounter. It is more important to find a Google SEO-friendly result that gives simplistic advice. Most importantly, good advice never contains the words 'it depends'. Without further ado, here is my bulletproof SQL Server optimization guide:

  • Always trust your gut feeling. Avoid doing costly and unnecessary measurements. They may lead down the treacherous path of the scientific method. A gut feeling is always easier to explain and this improves communication. Measurements require use of complicated notions not everybody understands, so they lead to conflicts in the team.
  • High CPU utilization is caused by index fragmentation. Because the distance between database pages increases, the processor needs more cycles to reference the pages in the buffer pool.
  • Low CPU utilization is caused by index fragmentation. As the index fragments get smaller, they fit better into the processor L2 cache and this results in fewer cycles needed to access the row slots in the page. Because the data is in the cache, the processor idles during the next cycles, resulting in low CPU utilization.
  • High Avg. Disk Sec. per Transfer is caused by index fragmentation. When indexes are fragmented the disk controller has to reorder the IO scatter-gather requests to put them in descending order. Needless to say, this operation increases the transfer times in geometric progression, because all the commercial disk controllers use bubble sort for this operation.
  • High memory consumption is caused by index fragmentation. This is fairly trivial and well known, but I’ll repeat it here: as the number of index fragments increases more pointers are needed to keep track of each fragment. Pointers are stored in virtual memory and virtual memory is very large, and this causes high memory consumption.
  • Syntax errors are caused by index fragmentation. Because the syntax is verified using the metadata catalogs, high fragmentation in the database can leave gaps in the syntax. This in turn causes the parser to generate syntax errors on perfectly valid statements like SECLET and UPTADE.
  • Covering indexes can lead to index fragmentation. Covering indexes are the indexes used by the query optimizer to cover itself in case the plan has execution faults. Because they are so often read they wear out and start to fragment.
  • Index fragmentation can be resolved by shrinking the database. As the data pages are squeezed tighter during the shrinking, they naturally realign themselves in the correct order.

There you have it, the simplest troubleshooting guide. Since most performance problems are caused by index fragmentation, all you have to do is shrink the database to force the pages to re-align correctly, and this will resolve the performance problem.

Happy April 1st everyone!

Using tables as Queues

March 26th, 2010

A very common question asked on all programming forums is how to implement queues based on database tables. This is actually not a trivial question: implementing a queue backed by a table is notoriously difficult, error prone and susceptible to deadlocks. Because queues are usually needed as a link between various processing stages in a workflow, they operate in highly concurrent environments where multiple processes enqueue rows into the table while multiple processes attempt to dequeue those rows. This concurrency creates correctness, scalability and performance challenges.

But since SQL Server 2005 introduced the OUTPUT clause, using tables as queues is no longer a hard problem. This fact is called out in the OUTPUT Clause topic in BOL:

You can use OUTPUT in applications that use tables as queues, or to hold intermediate result sets. That is, the application is constantly adding or removing rows from the table… Other semantics may also be implemented, such as using a table to implement a stack.

The reason the OUTPUT clause is critical is that it offers an atomic destructive read operation: it removes the dequeued row and returns it to the caller in one single statement.
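
For illustration, a minimal sketch of such a destructive read (the queue table name and columns are hypothetical; the ROWLOCK and READPAST hints are a common choice for concurrent dequeue, not a requirement):

-- dequeue the oldest row and return it to the caller, atomically
with cte as (
	select top(1) Id, Payload
	from dbo.MyQueue with (rowlock, readpast)
	order by Id)
delete from cte
	output deleted.Id, deleted.Payload;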

Performance comparison of varchar(max) vs. varchar(N)

March 22nd, 2010

The question of comparing the MAX types (VARCHAR, NVARCHAR, VARBINARY) with their non-max counterparts is often asked, but the answer usually gravitates around the storage differences. I'd like to address the point that these types also have inherent, intrinsic performance differences that are not driven by the different storage characteristics. In other words, simply comparing and manipulating variables and columns in T-SQL can yield different performance when VARCHAR(MAX) is used vs. VARCHAR(N).

Assignment

First let's compare simple assignment: assign a value to a VARBINARY(8000) variable in a tight loop:
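
The full benchmark code is beyond this excerpt, but a minimal sketch of such a timing loop could look like this (the iteration count and the value assigned are illustrative):

declare @v varbinary(8000);
declare @i int;
declare @start datetime;
set @i = 0;
set @start = getdate();
while @i < 1000000
begin
	-- the assignment being measured
	set @v = 0x1234567890;
	set @i = @i + 1;
end
select datediff(ms, @start, getdate()) as [elapsed ms];

Running the same loop with a VARBINARY(MAX) variable lets you compare the pure assignment cost of the two types.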

Dealing with Large Queues

March 9th, 2010

On a project I'm currently involved with we have to handle a constant influx of audit messages for processing. The messages come from about 50 SQL Server Express instances located in data centers around the globe, delivered via Service Broker into a processing queue hosted on a mirrored database, where an activated procedure shreds the message payload into relational tables. These tables are in turn replicated with transactional replication into a data warehouse database. After that the messages are deleted from the processing servers, as replication is set up not to replicate deletes. The system must handle a constant average rate of about 200 messages per second, 24×7, with spikes going up to 2000-3000 messages per second over periods of minutes to an hour.

When dealing with these relatively high volumes, it is inevitable that queues will grow during the 2000 msgs/sec spikes and drain back to empty when the incoming rate stabilizes again at the normal 200 msgs/sec rate. Service Broker also does an excellent job at handling periods with no connectivity: it retains the audit messages and quickly delivers them when connectivity is restored.

What I noticed though is that sometimes the processing of the received messages could hit a threshold from which it could not recover. The queue processing would slow down to a rate below the incoming rate, and from that point on the queue would just grow. I want to detail a bit the reason why this can happen and what I did to alleviate the problem.
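
To catch this situation early it helps to watch the queue depth over time. Service Broker queues can be queried like tables, so one cheap way to sample the depth (the queue name here is hypothetical) is a simple count:

select count(*) as [queued messages]
	from dbo.ProcessingQueue with (nolock);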

What deprecated features am I using?

January 12th, 2010

select instance_name as [Deprecated Feature] , 
	cntr_value as [Frequency Used] 
	from sys.dm_os_performance_counters 
	where object_name = 'SQLServer:Deprecated Features' 
	and cntr_value > 0 
	order by cntr_value desc;

This is a quick way to tell which deprecated features are used on a running instance of SQL Server. The performance counters reset at each server start up, so the interrogation is relevant only after the server has been up for some time. This will not tell you where the usage is coming from, but it will give you a very quick idea of which deprecated features are used most frequently by your apps.

If the SQL Server is a named instance, you have to query the proper counter category: 'MSSQL$<instancename>:Deprecated Features'
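
For example, for a hypothetical named instance called MYINSTANCE:

select instance_name as [Deprecated Feature] , 
	cntr_value as [Frequency Used] 
	from sys.dm_os_performance_counters 
	where object_name = 'MSSQL$MYINSTANCE:Deprecated Features' 
	and cntr_value > 0 
	order by cntr_value desc;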

System pagefile size on machines with large RAM

November 22nd, 2009

Regardless of the size of the RAM, you still need a pagefile at least 1.5 times the amount of physical RAM. This is true even if you have a 1 TB RAM machine: you'll need a 1.5 TB pagefile on disk (sounds crazy, but it is true).

When a process asks for MEM_COMMIT memory via VirtualAlloc/VirtualAllocEx, the requested size needs to be reserved in the pagefile. This was true in the first Windows NT version, and it is still true today; see Managing Virtual Memory in Win32:

When memory is committed, physical pages of memory are allocated and space is reserved in a pagefile.

Barring some extremely odd cases, SQL Server will always ask for MEM_COMMIT pages. And given the fact that SQL Server uses a Dynamic Memory Management policy that reserves upfront as much buffer pool as possible (reserves and commits, in terms of VAS), SQL Server will request at start up a huge reservation of space in the pagefile. If the pagefile is not properly sized, errors 801/802 will start showing up in SQL Server's ERRORLOG file and operations will start failing.
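
On SQL Server 2008 and later you can see how much virtual address space the engine has reserved and committed via the sys.dm_os_process_memory DMV:

select virtual_address_space_reserved_kb,
	virtual_address_space_committed_kb
	from sys.dm_os_process_memory;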

This always causes some confusion, as administrators erroneously assume that a large RAM eliminates the need for a pagefile. In truth the contrary happens: a large RAM increases the need for a pagefile, just because of the inner workings of the Windows NT memory manager. Although the reserved pagefile space is, hopefully, never used, the problem of reserving such a huge pagefile can be quite serious and needs to be accounted for during capacity planning.

bugcollect.com: better customer support

October 28th, 2009

I am a developer. I write applications for fun and profit, and I've been doing this basically my whole professional life. Over the years I've learned that it is important to understand the problems my users face: what are the most common issues, how often do they happen, who is most affected. I have tried the approach of logging into a text file and then asking my users to send me the log file. I've tried sending mail automatically from my application. It was useful, but my inbox just doesn't scale to the hundreds of messages that may happen after a … stormy release.

This is why I have created for myself an online service for application crash reporting. Applications can submit incident reports online and the service will collect them, aggregate them and do some initial analysis. I have been using this service in my applications over the past year and I think that if I find it so useful, perhaps you will too. So I've invested more resources into this, made it into a commercial product and put it out for everyone: http://bugcollect.com.

After an application is ready and published, bugcollect.com offers a private channel for collecting logging and crash reporting information. bugcollect.com analyzes crash reports and aggregates similar problems into buckets, grouping incidents reported by the same source and helping the development team focus on the most frequent crashes and problems. Developers get immediate feedback if a new release has a problem, and they don't have to ask users for more information. Developers can also set up a response to an incident bucket; this response will be sent by bugcollect.com in reply to any new incident report that falls into the same bucket. The application can then interpret this response and display feedback to the user, e.g. it can inform the user about a new download that fixes the problem.

bugcollect.com reporting differs from system crash reporting like iPhone crash reports, Mac 'send to Apple' or Windows Dr. Watson because it is application initiated: an application can decide to submit a report any time it wishes, typically in an exception catch block. All reports submitted to bugcollect.com are private and can be viewed only by the account owner, the application development team.

bugcollect.com features a public RESTful XML-based API for submitting reports. Client libraries are already available for .NET and Java, as well as appender components for log4net and log4j. More client libraries are under development and an iPhone library will be made available soon.