SIGMOD 2011 Webcasts

June 23rd, 2011

This year SIGMOG keynotes are available online at https://services.choruscall.eu/links/sigmod1106.html

  • June 13th: A Quest for Beauty and Wealth (or, Business Processes for Database Researchers)
  • June 14th: Internet-Scale Storage (I highly recommend this session)
  • June 15th: SIGMOD Awards (this includes a segment on the SQL Server team)
  • June 16th: Managing Scientific Data

Scale out SQL Server by using Reliable Messaging

June 1st, 2011

How do large-scale sites and applications remain SQL-based? is a recent article from Michael Rys (Blog|Twitter) that highlights the relational SQL Server based web-scale deployment at MySpace. I have talked before about how MySpace uses Service Broker as a reliable messaging backbone to power the communications between +1000 databases, allowing them to scale-out and partition the user information into individual shards. Here are some more details about this architecture:

This new article uses the MySpace deployment as a case study to counter balance the claim that large web-scale deployments require the use of NoSQL storage because relational database cannot scale. BTW I know the SQL vs. NoSQL discussion is more subtle, but I won’t enter into details here. I think a good read on that topic is NoSQL vs. RDBMS: Let the flames begin!.

Why is reliable messaging a key tenet of implementing a data-partitioned scale-out application? Consider a typical example of an modern web-scale application: users connect, get authenticated and view their own profile, but they are also interested in the status updates or wall messages from their network of friends or followers. It is easy to see how one partitions the user profile data, but how do you partition the user ‘walls’? User A is in a partition hosted on node 1, while user B is in a partition hosted by node 2, when User A update his status, how does this show up on User B’s wall?

One option is to have the application update the wall of User B when User A changes his status. But this prevents scalability, because now writes have to occur on many nodes and the application has to orchestrate all these writes. Think Lady GaGa updating her status, the application has to update the wall of every follower.

Another option is to have the application read the status from User A when displaying the wall of User B. This also doesn’t scale, because reads now occur on many nodes. All those Lady GaGa followers refreshing their wall page have to read from the one node hosting her status, and the node is soon overwhelmed.

A solution is to replicate the write on User A’s status onto User B’s wall. The application only updates the User A status, and the infrastructure propagates the this update to User B’s wall.

But traditional replication was historically designed for replicating entire data sets of fixed schema over static topologies, and it falls short of replicating web-scale deployments of hundreds and thousands of nodes:

  • it depends too tightly on physical location and it cannot adapt to rapid changes of topology (nodes being added and removed)
  • its based on schema defined filtering which is difficult to map to the complex application specific data routing conditions (this update goes to node 2 because User B follows User A, but that update goes to node 3 because User D follows User C)
  • its very sensitive to schema changes making application upgrade roll outs a big challenge

Messaging is designed with application-to-application communication in mind and has different semantics that are more friendly on large scale-out deployments:

  • Logical routing to isolate application from topology changes
  • Protocol versioning information allows side-by-side deployments making application upgrade roll outs possible
  • Data schema can change more freely as peers are shielded from changes by keeping the communication protocol unchanged

With messaging in place the application updates the status of User A and the drops a message into a local outbound queue to notify all friends of A. The reliable messaging infrastructure dispatches this message to all nodes interested and processing of this message results in an update of User B wall. MySpace uses Service Broker as the reliable messaging infrastructure, and they make up for the lack of publish/subscribe by adding a router-dispatcher unit: all messages are sent to this dispatcher and the dispatcher in turn figures out how many subscribers have to be notified, and sends individual messages to them. See the slides linked at the beginning of my post for more details. As a side note, I see that StackExchange is also embarking on the route of creating a message bus for their pub/sub infrastructure, but they use Redis as a message store, see async Redis await BookSleeve.

Robert Scoble had a post MySpace’s death spiral: insiders say it’s due to bets on Los Angeles and Microsoft on which it claims that MySpace insiders blame their loss to Facebook to, amongst other things, the complexity of this infrastructure:

Workers inside MySpace tell me that this infrastructure, which they say has “hundreds of hacks to make it scale that no one wants to touch” is hamstringing their ability to really compete.

I do no know if the sources quoted by the article are right or wrong, and I was never personally involved with the MySpace deployment, but since I was so closely involved in building SQL Service Broker and as SSB is one of the critical components used by MySpace infrastructure, I am understandably interested in this discussion. Service Broker has a notoriously steep learning curve, there are no tools for administer, monitor and troubleshoot Service Broker deployments, and there is absolutely no support in the client programming stack (the managed .Net Framework). This is why all solutions that deploy Service Broker that I know of are large enterprise shops that are staffed with architects that understand the set of unique advantages this technology brings, and are willing to venture into uncharted waters despite the prospect of being impossible to hire development and ops talent with Service Broker expertise. But also the feedback I hear from these deployments is almost always positive: once deployed, Service Broker just works. Someone told me that the Service Broker solution they had was the only piece of technology that “did not break in the last 3 years” and that covers an upgrade from SQL Server 2005 to SQL Server 2008. See Is Service Broker in SQL Server 2008 compatible with the one in SQL Server 2005? to see how Service Broker helps address infrastructure upgrade problems. Personally I am not surprised that Service Broker turns out to be a cornerstone of the response SQL Server has to give to the NoSQL challenge, but this vitally unheard of technology (89 questions tagged service-broker on StackOverflow at time of writing this) will be quite a surprise answer for many.

How to determine the database version of an MDF file

April 4th, 2011

If you have an MDF file laying around and you don’t know what version is, how can you determine the version, preferably without altering the file? You could try to attach it to a SQL Server instance, but if you’re not careful the operation may end up upgrading the MDF file to that instance’s current version. Even if it doesn’t upgrade it because it happens to be the same version as the one supported by that instance, it can still run recovery on the database and thus alter the file. Is there a way to determine the version in a non destructive fashion?

Lets look into a database and dump a page. Not any page, but page 9 of the primary filegroup. This page happens to be the database boot page:

dbcc traceon(3604)
dbcc page(1,1,9,3);

the important bits of information are these:

...
Memory Dump @0x00000000124FA060
0000000000000000:   0000a405 95029502 00000000 00000000 †..¤.•.•.........
...
DBINFO @0x00000000124FA060
...
dbi_dbid = 1                         dbi_status = 65544                   dbi_nextid = 1723153184
dbi_dbname = master                  dbi_maxDbTimestamp = 4000            dbi_version = 661
dbi_createVersion = 661              dbi_ESVersion = 0
...

So at offset 0×60 on the page (96 decimal, which we know is the size of the page header) there is a DBINFO structure. This structure contains the dbi_version 661, which is 0×295 in hex. In the DBCC PAGE output this would show as 9502, and we can see that there are two such values at offset 0×64 and 0×66. So the database version is stored in the two bytes at offset 0×64 on the 9th page of the primary filegroup. The 9 page will be at offset 0×12000 in the file (just take the page size, 8k, and multiply it by 9). So the version of the MDF will be the DWORD value at offset 0×12064 in the file.

There are many ways you can read these bytes using your favourite hex file viewer/editor. Even Powershell can do it:

PS > get-content -Encoding Byte "...\foo.mdf" | select-object -skip 0x12064 -first 2
149
2
PS > 2*256+149
661

The first Powershell line dumped the two bytes at offset 0×12064 in the file: 149 and 2 (Powersell displays the value in decimal encoding…). Convert the two bytes to a DWORD gives us the MDF file version: 661, which is the SQL Server 2008 R2 version.

Erland’s Sommarskog adds another gem to his treasure chest

February 21st, 2011

There are probably no articles that I reference more often than Erland’s. There are questions on forums, discussion groups and stackoverflow that come back over and over, again and again. For some of these topics Elands has prepared articles that are, ultimately, the article to point anybody for more details on that topic. His coverage is deep, the articles are well structured, and they pretty much exhaust the topic. If somebody still have questions about that topic after reading the article, then I say that person knows too much about SQL Servero … or not enough. I’m writing this blog entry not so much as to announce his latest addition Slow in the Application, Fast in SSMS? Understanding Performance Mysteries (is hard for me to believe anyone that follows my blog does not know about them…) but more to create for myself a memo pad from where to pick up the link to the articles whenever I need them:

The Curse and Blessings of Dynamic SQL
Dynamic-SQL, the technique of generating and execute SQL code on-the-fly is sometimes irreplaceable, sometimes abused, and almost always done in a SQL-injection prone, dangerous, way. This article covers the pros and cons of this technique, when to use and when not, how to use it and what pitfalls to avoid.
Arrays and Lists in SQL Server
In lack of a true array type in the Transact-SQL language, developers have resorted to all sort of workarounds to pass sets of data as parameters to Transact-SQL procedures and batches: strings containing comma delimited items, XML variables, temp tables, table value parameters. Erland has a series of articles covering this topic in SQL Server 2008 (where table valued parameters are available), SQL Server 2005 (where XML data type became available) and SQL Server 2000 (the Dark Age).
How to Share Data Between Stored Procedures
All you need to know about how to share data between procedures and how to choose among the various solutions: table valued parameters, table valued functions, temp tables, XML data type, INSERT-EXEC, OPENQUERY, cursor variables, CLR.
Error Handling
Another series that covers error handling in Transact-SQL in SQL Server 2005 and later using TRY-CATCH blocks as well as SQL Server 2000 before TRY-CATCH blocks were available.
Giving Permissions through Stored Procedures
Code signing is, by a large margin, the perfect way to effectively grant granular control to non-privileged users for specific operations in SQL Server 2005 and later. This powerful mechanism is though seldom used, because of the arcane ‘black magic’ that goes into getting the sequence of operations correct to effectively sign a procedure, and because of the complication that arrise from the EXECUTE AS impersonation context required for signed procedures. This article goes to great length in shading some light on this esoteric topic.
Slow in the Application, Fast in SSMS?
This article describes parameter-sniffing, the process of ‘tuning’ a query plan to a specific value of the input parameters, and how this process can sometime ‘pollute’ the query plan cache with a plan that is optimized for a specific set of parameters, but performs poorly on other parameter values. The article describes what is parameter sniffing, how to identify when it happens, and how to fix it and prevent it.

FILESTREAM MVC: Download and Upload images from SQL Server

February 6th, 2011

In a previous article I have shown how it is possible to use efficient streaming semantics when Download and Upload images from SQL Server via ASP.Net MVC. In this article I will go over an alternative approach that relies on the FILESTREAM column types introduced in SQL Server 2008.

What is FILESTREAM?

FILESTREAM storage is a new option available in SQL Server 2008 and later that allows for BLOB columns to be stored directly on the file system as individual files. As files, the data is accessible through the Win32 file access API like ReadFile and WriteFile. But at the same time the same data is available through the normal T-SQL operations like SELECT or UPDATE. Not only that, but the data is contained logically in the database so it will be contained in a database backup, it is subject to ordinary transaction commit and rollback behavior, it is searched by SQL Server FullText indexes and it follows the normal SQL Server security access rules: if you are granted SELECT permission on the table, then you can open the file to read. There are some restrictions, eg. a database with FILESTREAM cannot be mirrored. For a full list of restrictions and limitations, see Using FILESTREAM with Other SQL Server Features. Note that SQL Server Express edition does support FILESTREAM storage.

Another attribute of the FILESTREAM storage is the size limitation: normal BLOB column values have a maximum size of 2Gb. FILESTREAM columns are limited only by the volume size limit of the file system. 2Gb may seem like a large value, but consider that a media file like an HD movie stream can easily go up to a size of 5Gb.

Using FILESTREAM

One way to use FILESTREAM columns is to treat them as ordinary BLOB values and manipulate them through T-SQL. The one restriction in place is that the efficient partial update syntax for BLOBs is not supported. That is, one cannot issue UPDATE table SET column.WRITE(...) WHERE ... on a FILESTREAM column. But where FILESTREAM storage begins to shine is when accessed through the system file API. This allows the application to efficiently read, write and seek in a large BLOB value, just as it would in a file. In fact, the application does read, write and seek in a file :) .

Native Win32 applications use a new API function OpenSqlFilestream that opens a HANDLE that can be then used with the file IO API functions. Managed applications use the new SqlFileStream class that exposes a Stream based on the underlying FILESTREAM value. Both the native and the manged API require as input two special values, a PathName and a Transaction Context that must be obtained previously from the SQL Server using T-SQL.

The requirement to provide a Transaction Context when manipulating a FILESTREAM value through the file IO API highlights another aspect of working with this new type: a T-SQL transaction must be started and must be kept open while the file is manipulated, and then it must be committed. If you think about it such a requirement is to be expected, since we said that FILESTREAM columns are subject to normal T-SQL transaction commit and rollback semantics, including when they are manipulated through the Windows file read/write API.

FILESTREAM based images table

In order to have FILESTREAM column, we must have a special filegroup in our database, a FILESTREAM filegroup. You can either add a new filegroup to your existing database, or you can create the database with a FILESTREAM filegroup from scratch. Also the FILESTREAM feature must be enabled on the SQL Server instance. For all the details, see Getting Started with FILESTREAM Storage. For my project, I’m simply going to create a new database with a FILESTREAM filegroup:


create database images
	on (name='images_data', filename='c:\temp\images.mdf')
	, filegroup FS contains FILESTREAM
              (name = 'images_files', filename='c:\temp\images_files')
	log on (name='images_log', filename='c:\temp\images.ldf');
go

The media table used by our MVC project is going to be similar to the one used in the previous article but the content column will have the FILESTREAM attribute added. A table that has FILESTREAM columns is required to have a ROWGUIDCOL column, so we’re going to add one of those as well:


create table media (
	[media_id] int not null identity(1,1),
	[file_name] varchar(256),
	[content_type] varchar(256),
	[content_coding] varchar(256),
	[media_rowguid] uniqueidentifier not null
               ROWGUIDCOL UNIQUE default newsequentialid() ,
	[content] varbinary(max) filestream,
        constraint pk_media_id primary key([media_id]),
	constraint unique_file_name unique ([file_name]));
go

FILESTREAM based IMediaRepository

If you haven’t read the previous article Download and Upload images from SQL Server via ASP.Net MVC yet, now is a good time to do it. I am going to reuse the same code and simply provide a new implementation for the IMediaRepository interface, an implementation that works with FILESTREAM storage:


    /// <summary>
    /// SQL Server FILESTREAM based implementation of the IMediaRepository
    /// </summary>
    public class FileStreamMediaRepository: IMediaRepository
    {
        /// <summary>
        /// Gets an open connection to the SQL Server back end
        /// </summary>
        /// <returns>the SqlConneciton object, ready to use</returns>
        private SqlConnection GetConnection()
        {
            SqlConnectionStringBuilder scsb = new SqlConnectionStringBuilder(
                ConfigurationManager.ConnectionStrings["Images"].ConnectionString);
            scsb.Pooling = true;
            scsb.AsynchronousProcessing = true;
            SqlConnection conn = new SqlConnection(scsb.ConnectionString);
            conn.Open();
            return conn;
        }

        /// <summary>
        /// Gets a file from the SQL repository
        /// </summary>
        /// <param name="fileName">filename to retrieve</param>
        /// <param name="file">Output, the model for the file if found</param>
        /// <returns>True if the file is found, False if not</returns>
        public bool GetFileByName(string fileName, out FileDownloadModel file)
        {
            SqlConnection conn = GetConnection();
            SqlTransaction trn = conn.BeginTransaction();
            try
            {
                SqlCommand cmd = new SqlCommand(
                    @"SELECT file_name,
	                    content_type,
                        content_coding,
                        DATALENGTH (content) as content_length,
	                content.PathName() as path,
                        GET_FILESTREAM_TRANSACTION_CONTEXT ()
                    FROM media
                    WHERE file_name = @fileName;", conn, trn);
                SqlParameter paramFilename = new SqlParameter(
                            @"fileName", SqlDbType.VarChar, 256);
                paramFilename.Value = fileName;
                cmd.Parameters.Add(paramFilename);
                using (SqlDataReader reader = cmd.ExecuteReader())
                {
                    if (false == reader.Read())
                    {
                        reader.Close();
                        trn.Dispose();
                        conn.Dispose();
                        trn = null;
                        conn = null;
                        file = null;
                        return false;
                    }

                    string contentDisposition = reader.GetString(0);
                    string contentType = reader.GetString(1);
                    string contentCoding = reader.IsDBNull(2) ? null : reader.GetString(2);
                    long contentLength = reader.GetInt64(3);
                    string path = reader.GetString(4);
                    byte[] context = reader.GetSqlBytes(5).Buffer;

                    file = new FileDownloadModel
                    {
                        FileName = contentDisposition,
                        ContentCoding = contentCoding,
                        ContentType = contentType,
                        ContentLength = contentLength,
                        Content = new MvcResultSqlFileStream
                        {
                            SqlStream = new SqlFileStream(path, context, FileAccess.Read),
                            Connection = conn,
                            Transaction = trn
                        }
                    };
                    conn = null; // ownership transfered to the stream
                    trn = null;
                    return true;
                }
            }
            finally
            {
                if (null != trn)
                {
                    trn.Dispose();
                }
                if (null != conn)
                {
                    conn.Dispose();
                }
            }
        }

        /// <summary>
        /// Adds a file to the SQL repository
        /// </summary>
        /// <param name="file">POST-ed file to be added</param>
        /// <param name="fileName">The filename part of the uploaded file path</param>
        public void PostFile(HttpPostedFileBase file, out string fileName)
        {
            fileName = Path.GetFileName(file.FileName);

            using (SqlConnection conn = GetConnection ())
            {
                using (SqlTransaction trn = conn.BeginTransaction ())
                {
                    SqlCommand cmdInsert = new SqlCommand(
@"insert into media
    (file_name, content_type, content_coding, content)
output
	INSERTED.content.PathName(),
    GET_FILESTREAM_TRANSACTION_CONTEXT ()
values
    (@content_disposition, @content_type, @content_coding, 0x)", conn, trn);
                    cmdInsert.Parameters.Add("@content_disposition", SqlDbType.VarChar, 256);
                    cmdInsert.Parameters["@content_disposition"].Value = fileName;
                    cmdInsert.Parameters.Add("@content_type", SqlDbType.VarChar, 256);
                    cmdInsert.Parameters["@content_type"].Value = file.ContentType;
                    cmdInsert.Parameters.Add("@content_coding", SqlDbType.VarChar, 256);
                    cmdInsert.Parameters["@content_coding"].Value = DBNull.Value;

                    string path = null;
                    byte[] context = null;

                    // cmdInsert is an INSERT command that uses the OUTPUT clause
                    // Thus we use the ExecuteReader to get the
                    // result set from the output columns
                    //
                    using (SqlDataReader rdr = cmdInsert.ExecuteReader())
                    {
                        rdr.Read();
                        path = rdr.GetString(0);
                        context = rdr.GetSqlBytes(1).Buffer;
                    }

                    using (SqlFileStream sfs = new SqlFileStream(
                                     path, context, FileAccess.Write))
                    {
                        file.InputStream.CopyTo(sfs);
                    }
                    trn.Commit ();
                }
            }
        }
    }

The implementation is pretty straight forward. The PostFile methods starts a transaction, inserts a row in the images table and obtains the PahtName and FILESTREAM transaction context to the newly inserted row, and then opens a SqlFileStream and copies in the uploaded file directly into the file that backs the FILESTREAM column. The GetFileByName method obtains the PathName and FILESTREAM transaction context of the desired file and then returns a SqlFileStream that accesses directly the file in which the FILESTREAM value is stored.

The only thing requiring explanation is the MvcResultSqlFileStream class. Because the SqlFileStream uses a transaction context, it is required to keep the SqlConnection and the SqlTransaction open until the SqlFileStream finishes accessing the content. In the PostFile method this requirement is achieved easily because the entire usage of the SqlFileStream complete within the stack frame, but the GetFileByName method has to return a Stream that in turn is passed into a FileStreamResult by the MediaController class and there is no control over how the MVC framework will use this returned ActionResult. My solution is to wrap the SqlFileStream in a new Stream derived class (since SqlFileStream is sealed, it cannot be inherited...) and then handle the resource management in this derived class' Dispose:


    /// <summary>
    /// Implementation of System.IO.Stream based on a SqlFileStream
    /// Disposes the connection, transaction and SqlFileStream objects
    /// </summary>
    public class MvcResultSqlFileStream: Stream
    {
        public SqlFileStream SqlStream { get; set; }
        public SqlConnection Connection { get; set; }
        public SqlTransaction Transaction { get; set; }
        public override bool CanRead
        {
            get { return SqlStream.CanRead; }
        }

        public override bool CanSeek
        {
            get { return SqlStream.CanSeek; }
        }

        public override bool CanWrite
        {
            get { return SqlStream.CanWrite; }
        }

        public override void Flush()
        {
            SqlStream.Flush ();
        }

        public override long Length
        {
            get { return SqlStream.Length; }
        }

        public override long Position
        {
            get
            {
                return SqlStream.Position;
            }
            set
            {
                SqlStream.Position = value;
            }
        }

        public override int Read(byte[] buffer, int offset, int count)
        {
            return SqlStream.Read (buffer, offset, count);
        }

        public override long Seek(long offset, SeekOrigin origin)
        {
            return SqlStream.Seek(offset, origin);
        }

        public override void SetLength(long value)
        {
            SqlStream.SetLength(value);
        }

        public override void Write(byte[] buffer, int offset, int count)
        {
            SqlStream.Write(buffer, offset, count);
        }

        protected override void Dispose(bool disposing)
        {
            if (disposing)
            {
                SqlStream.Dispose();
                Transaction.Dispose();
                Connection.Dispose();
            }
            base.Dispose(disposing);
        }
    }

This extra class feels rather unnecessary, but unfortunately is required because the MVC framework has no proper handling of resources passed out from the controller. If you ask me ActionResult should implement IDisposable, but then of course nobody asks me ;)

You can download the source for the entire MVC project from here.

Conclusion

Handling media content stored in the database using FILESTREAM storage makes for a nicer developer experience. The FILESTREAM content can be accessed using streaming semantics based on the SqlFileStream class, or using the Win32 file IO API for native applications. By comparison the previous article that relied on straight T-SQL to manipulate the BLOB values had to do some very unintuitive tricks in order to achieve streaming semantics, specially for uploading a file. FILESTREAM storage simplifies development and is supported in every SQL Server edition, including Express.

If you intend to deploy FILESTREAM in your environment I recommend going over some Knowledge Base and white paper articles first:

How to pass a NULL value in a message to a queue in SQL Server

January 15th, 2011

The SEND Transact-SQL verb does not allow to send a NULL message body, attempting to do so will result in error:


Msg 8433, Level 16, State 1, Line 11
The message body may not be NULL.  A zero-length UNICODE or binary string is allowed.

But there are ways to send a NULL message body. One way is to completely omit the message body argument:


SEND ON CONVERSATION @handle MESSAGE TYPE [...];

Another way is to send a 0 length message body, which will be enqueued as a NULL message body in the target queue:


SEND ON CONVERSATION @handle MESSAGE TYPE [...] (0x);
SEND ON CONVERSATION @handle MESSAGE TYPE [...] ('');
SEND ON CONVERSATION @handle MESSAGE TYPE [...] (N'');

All three forms above will enqueue the same message body: NULL. This is true for both binary messages (VALIDATION = NONE) and for XML messages (VALIDATION=WELL_FORMED_XML).

Here is a short test script showing this:


create message type [BINARY] validation = none;
create message type [XML] validation = well_formed_xml;
go

create contract [TEST] (
	[BINARY] sent by initiator,
	[XML] sent by initiator);
go

create queue Sender;
create service Sender on queue Sender;
go

create queue Receiver;
create service Receiver on queue Receiver  ([TEST]);
go

declare @h uniqueidentifier;

begin dialog conversation @h
from service [Sender]
to service N'Receiver', N'current database'
on contract [TEST]
with encryption = off;

send on conversation @h message type [BINARY];
send on conversation @h message type [BINARY] (0x);

send on conversation @h message type [XML];
send on conversation @h message type [XML] ('');
send on conversation @h message type [XML] (N'');

receive * from [Receiver];
go

The received message_body column have a NULL value for all 5 messages sent.

Download and Upload images from SQL Server via ASP.Net MVC

December 28th, 2010

A frequent question that pops up on discussion forums is how to serve an image that is stored in a SQL Server table from an ASP.Net application. Unfortunately the answer is almost always wrong, as the prevalent solution involves copying the entire image file into memory before returning it to the client. This solution works fine when tested with a light load and returning few small images. But in production environment the memory consumption required by all those image files stored as byte arrays in memory causes serious performance degradation. A good solution must use streaming semantics, transferring the data in small chunks from the SQL Server to the HTTP returned result.

The SqlClient components do offer streaming semantics for large result sets, including large BLOB fields, but the client has to specifically ask for it. The ‘secret ingredient’ is the passing in the CommandBehavior.SequentialAccess flag to the SqlCommand.ExecuteReader:

Provides a way for the DataReader to handle rows that contain columns with large binary values. Rather than loading the entire row, SequentialAccess enables the DataReader to load data as a stream. You can then use the GetBytes or GetChars method to specify a byte location to start the read operation, and a limited buffer size for the data being returned.

An ASP.Net MVC virtual Media folder backed by SQL Server

Lets say we want to have a virtual Media folder in an ASP.Net MVC site, serving the files from a SQL Server database. A GET request for an URL like "http://site/Media/IMG0042.JPG" should return the content of the file named IMG0042.JPG from the database. A POST request to the URL "http://site/Media" which contains an embedded file should insert this new file in the database and redirect the response to the newly added file virtual path. This is how our upload HTML form looks like:

<form method="post" action="/Media" enctype="multipart/form-data">
<input type="file" name="file" id="file"/>
<input type="submit" name="Submit" value="Submit"/>
</form>

We’ll start by adding a special route in Global.asax.cs that will act like a virtual folder for file download requests:

            routes.MapRoute(
                "Media",
                "Media/{filename}",
                new { controller = "Media", action = "GetFile" },
                new { filename = @"[^/?*:;{}\\]+" });

Note that the upload will be handled by the default MVC route if we add an Index() method to our controller that will handle the POST.

For our GET requests we need a FileDownloadModel class to represent the requested file properties. for this sample, we don’t need a POST model since we’re only going to have have a single input field, the uploaded file. We’re going to use a Repository interface that declares two methods: GetFileByName returns from the repository a FileDownloadModel given a file name, and PutFile will accept an uploaded file and put it in the repository.

    public class FileDownloadModel
    {
        public string FileName {get; internal set;}
        public string ContentType { get; internal set; }
        public string ContentCoding { get; internal set; }
        public long ContentLength { get; internal set; }
        public Stream Content { get; internal set; }
    }

    public interface IMediaRepository
    {
        bool GetFileByName(
            string fileName,
            out FileDownloadModel file);

        void PostFile(
            HttpPostedFileBase file,
            out string fileName);
    }

With this Repository interface we can actually code our MediaController class:

    public class MediaController : Controller
    {
        public IMediaRepository Repository { get; set; }

        public MediaController()
        {
            Repository = new SqlMediaRepository();
        }

        [HttpPost]
        public ActionResult Index()
        {
            string fileName;
            Repository.PostFile(Request.Files[0], out fileName);
            return new RedirectResult("/Media/" + fileName);
        }

        [HttpGet]
        public ActionResult GetFile(string fileName)
        {
            FileDownloadModel model;
            if (false == Repository.GetFileByName(
                fileName,
                out model))
            {
                return new HttpNotFoundResult
                {
                    StatusDescription = String.Format(
                        "File {0} not found",
                        fileName)
                };
            }

            if (null != model.ContentCoding)
            {
                Response.AddHeader(
                    "Content-Encoding",
                    model.ContentCoding);
            }
            Response.AddHeader(
                "Content-Length",
                model.ContentLength.ToString ());

            Response.BufferOutput = false;

            return new FileStreamResult(
                model.Content,
                model.ContentType);
        }
    }

In have hard coded the Repository implementation to SqlMediaRepository, a class we’ll create shortly. A real project would probably use the Dependency Injection or Inversion of Control patterns, perhaps using Castle Windsor for example. For brevity, I will skip these details, there are plenty of blogs and articles describing how to do it.

Note the use of the FileStreamResult return, which is an MVC supplied action for returning a download of an arbitrary Stream object. Which also brings us to the next point, we need to implement a Stream that read content from a SqlDataReader.

A SqlDataReader based Stream

We now need an implementation of the abstract Stream class that can stream out a BLOB column from a SqlDataReader. Wait, you say, doesn’t SqlBytes already have a Stream property that reads a BLOB from a result as a Stream? Unfortunately this small remark makes this class useless for our purposes:

Getting or setting the Stream property loads all the data into memory. Using it with large value data can cause an OutOfMemoryException.

So we’re left with implementing a Stream based on a SqlDataReader BLOB field, a Stream that returns the content of the BLOB using the proper GetBytes calls and does not load the entire BLOB in memory. Fortunately this is fairly simple as we only need to implement a handful of methods:

    public class SqlReaderStream: Stream
    {
        private SqlDataReader reader;
        private int columnIndex;
        private long position;

        public SqlReaderStream(
            SqlDataReader reader,
            int columnIndex)
        {
            this.reader = reader;
            this.columnIndex = columnIndex;
        }

        public override long Position
        {
            get { return position; }
            set { throw new NotImplementedException(); }
        }

        public override int Read(byte[] buffer, int offset, int count)
        {
            long bytesRead = reader.GetBytes(columnIndex, position, buffer, offset, count);
            position += bytesRead;
            return (int)bytesRead;
        }

        public override bool CanRead
        {
            get { return true; }
        }

        public override bool CanSeek
        {
            get { return false; }
        }

        public override bool CanWrite
        {
            get { return false; }
        }

        public override void Flush()
        {
            throw new NotImplementedException();
        }

        public override long Length
        {
            get { throw new NotImplementedException(); }
        }

        public override long Seek(long offset, SeekOrigin origin)
        {
            throw new NotImplementedException();
        }

        public override void SetLength(long value)
        {
            throw new NotImplementedException();
        }

        public override void Write(byte[] buffer, int offset, int count)
        {
            throw new NotImplementedException();
        }

        protected override void Dispose(bool disposing)
        {
            if (disposing && null != reader)
            {
                reader.Dispose();
                reader = null;
            }
            base.Dispose(disposing);
        }
    }

As you can see, we need only to return the proper responses to CanRead (yes), CanWrite (no) and CanSeek (also no), keep track of our current Position and we need implement Read to fetch more bytes from the reader, using GetReader.

We’re also overriding the Dispose(bool disposing) method. This is because we’re going to have to close the SqlDataReader when the content stream transfer is complete. If this is the first time you see this signature of the Dispose method, then you must read Implementing a Dispose Method

Streaming Upload of BLOB data

Just like retrieving large BLOB data from SQL Server poses challenges to avoid creating full blown in-memory copies of the entire BLOB, similar issues arrive when attempting to insert a BLOB. The best solution is actually quite convoluted. It involves sending the data to the server in chunks, and using the in-place BLOB UPDATE syntax. The MSDN has this to say in the Remarks section:

Use the .WRITE (expression, @Offset, @Length) clause to perform a partial or full update of varchar(max), nvarchar(max), and varbinary(max) data types. For example, a partial update of a varchar(max) column might delete or modify only the first 200 characters of the column, whereas a full update would delete or modify all the data in the column. For best performance, we recommend that data be inserted or updated in chunk sizes that are multiples of 8040 bytes.

To implement such semantics, we will write a second Stream implementation, this time for uploads:

    public class SqlStreamUpload: Stream
    {
        public SqlCommand InsertCommand { get; set; }
        public SqlCommand UpdateCommand { get; set; }
        public SqlParameter InsertDataParam { get; set; }
        public SqlParameter UpdateDataParam { get; set; }

        public override bool CanRead
        {
            get { return false; }
        }

        public override bool CanSeek
        {
            get { return false; }
        }

        public override bool CanWrite
        {
            get { return true; }
        }

        public override void Flush()
        {
        }

        public override long Length
        {
            get { throw new NotImplementedException(); }
        }

        public override long Position
        {
            get; set;
        }

        public override int Read(byte[] buffer, int offset, int count)
        {
            throw new NotImplementedException();
        }

        public override long Seek(long offset, SeekOrigin origin)
        {
            throw new NotImplementedException();
        }

        public override void SetLength(long value)
        {
            throw new NotImplementedException();
        }

        public override void Write(byte[] buffer, int offset, int count)
        {
            byte[] data = buffer;
            if (offset != 0 ||
                count != buffer.Length)
            {
                data = new byte[count];
                Array.Copy(buffer, offset, data, 0, count);
            }
            if (0 == Position &&
                null != InsertCommand)
            {
                InsertDataParam.Value = data;
                InsertCommand.ExecuteNonQuery();
            }
            else
            {
                UpdateDataParam.Value = data;
                UpdateCommand.ExecuteNonQuery();
            }
            Position += count;
        }
    }

This Stream implementation uses two SqlCommand objects: an InsertCommand to save the very first chunk, and an UpdateCommand to save the subsequent chunks. Note that the chunk sizes (the optimal 8040 bytes) is nowhere specified, that is easily achieved by wrapping the SqlStreamUpload in a BufferedStream instance.

The MEDIA table

create table media (
	[media_id] int not null identity(1,1),
	[file_name] varchar(256),
	[content_type] varchar(256),
	[content_coding] varchar(256),
	[content] varbinary(max),
        constraint pk_media_id primary key([media_id]),
	constraint unique_file_name unique ([file_name]));

This table contains the downloadable media files. Files are identified by name, so the names have an unique constraint. I’ve added a IDENTITY primary key, because in a CMS these files are often references from other parts of the application and and INT is a shorter reference key than a file name. The content_type field is needed to know the type of file: “image/png”, “image/jpg” etc (see the officially registered IANA Media types. The content_coding field is needed if the files are stored compressed and a Content-Encoding HTTP header with the tag “gzip” or “deflate” has to be added to the response. Note that most image types (JPG, PNG) don’t compress well, as these file formats already include a compression algorithm and when compressed again with the common HTTP transfer algorithms (gzip, deflate, compress) they usually increase in size.

The SqlMediaRepository

The final piece of the puzzle: an implementation of the IMediaRepository interface that uses a SQL Server back end for the files storage:

    public class SqlMediaRepository: IMediaRepository
    {
        private SqlConnection GetConnection()
        {
            SqlConnectionStringBuilder scsb = new SqlConnectionStringBuilder(
                ConfigurationManager.ConnectionStrings["Images"].ConnectionString);
            scsb.Pooling = true;
            SqlConnection conn = new SqlConnection(scsb.ConnectionString);
            conn.Open();
            return conn;
        }

        public void PostFile(
            HttpPostedFileBase file,
            out string fileName)
        {
            fileName = Path.GetFileName(file.FileName);

            using (SqlConnection conn = GetConnection())
            {
                using (SqlTransaction trn = conn.BeginTransaction())
                {
                    SqlCommand cmdInsert = new SqlCommand(
                        @"INSERT INTO media (
                            file_name,
                            content_type,
                            content_coding,
                            content)
                        values (
                            @content_disposition,
                            @content_type,
                            @content_coding,
                            @data);", conn, trn);
                    cmdInsert.Parameters.Add("@data", SqlDbType.VarBinary, -1);
                    cmdInsert.Parameters.Add("@content_disposition", SqlDbType.VarChar, 256);
                    cmdInsert.Parameters["@content_disposition"].Value = fileName;
                    cmdInsert.Parameters.Add("@content_type", SqlDbType.VarChar, 256);
                    cmdInsert.Parameters["@content_type"].Value = file.ContentType;
                    cmdInsert.Parameters.Add("@content_coding", SqlDbType.VarChar, 256);
                    cmdInsert.Parameters["@content_coding"].Value = DBNull.Value;

                    SqlCommand cmdUpdate = new SqlCommand(
                            @"UPDATE media
                            SET content.write (@data, NULL, NULL)
                            WHERE file_name = @content_disposition;", conn, trn);
                    cmdUpdate.Parameters.Add("@data", SqlDbType.VarBinary, -1);
                    cmdUpdate.Parameters.Add("@content_disposition", SqlDbType.VarChar, 256);
                    cmdUpdate.Parameters["@content_disposition"].Value = fileName;

                    using (Stream uploadStream = new BufferedStream(
                        new SqlStreamUpload
                        {
                            InsertCommand = cmdInsert,
                            UpdateCommand = cmdUpdate,
                            InsertDataParam = cmdInsert.Parameters["@data"],
                            UpdateDataParam = cmdUpdate.Parameters["@data"]
                        }, 8040))
                    {
                        file.InputStream.CopyTo(uploadStream);
                    }
                    trn.Commit();
                }
            }
        }

        public bool GetFileByName(string fileName, out FileDownloadModel file)
        {
            SqlConnection conn = GetConnection();
            try
            {
                SqlCommand cmd = new SqlCommand(
                    @"SELECT file_name,
	                    content_type,
                        content_coding,
                        DATALENGTH (content) as content_length,
	                    content
                    FROM media
                    WHERE file_name = @fileName;", conn);
                SqlParameter paramFilename = new SqlParameter(
                      @"fileName", SqlDbType.VarChar, 256);
                paramFilename.Value = fileName;
                cmd.Parameters.Add(paramFilename);
                SqlDataReader reader = cmd.ExecuteReader(
                    CommandBehavior.SequentialAccess |
                    CommandBehavior.SingleResult |
                    CommandBehavior.SingleRow |
                    CommandBehavior.CloseConnection);
                if (false == reader.Read())
                {
                    reader.Dispose();
                    conn = null;
                    file = null;
                    return false;
                }

                string contentDisposition = reader.GetString(0);
                string contentType = reader.GetString(1);
                string contentCoding = reader.IsDBNull(2) ? null : reader.GetString(2);
                long contentLength = reader.GetInt64(3);
                Stream content = new SqlReaderStream(reader, 4);

                file = new FileDownloadModel
                {
                    FileName = contentDisposition,
                    ContentCoding = contentCoding,
                    ContentType = contentType,
                    ContentLength = contentLength,
                    Content = content
                };
                conn = null; // ownership transfered to the reader/stream
                return true;
            }
            finally
            {
                if (null != conn)
                {
                    conn.Dispose();
                }
            }
        }
    }

Typically MVC applications tend to use a LINQ based Repository. In this case though I could not leverage LINQ because of the peculiar requirements of implementing efficient streaming for large BLOBs. So this Repository uses plain vanilla SqlClient code.

The GetFileByName method gets the one row from the media table and returns a FileDownloadModel with a SqlReaderStream object that is wrapping the SELECT command result. Note that I could not deploy the typical “using” pattern for the disposable SqlConnection because the connection must remain open until the command result SqlDataReader finishes streaming in the BLOB, and the streaming will be controlled by the MVC framework executing our ActionResult after the GetFileByName is completed. The connection will be closed when the SqlReaderStream is disposed, because of the CommandBehavior.CloseConnection flag. The stream will be disposed by the FileStreamResult.WriteFile method.

The PostFile method creates two SqlCommand statements, one to insert the first chunk of data along with the other relevant columns, and an update command that uses BLOB partial updates to write the subsequent chunks. A SqlStreamUpload object is then using these two SqlCommand to efficiently stream in the uploaded file. The intermediate BufferedStream is used to create upload chunks of the critical size of 8040 bytes (see above). If the content would have to be compressed, this is where it would happen, a GZipStream would be placed in front of the Bufferedstream in order to compress the uploaded file, and an the “@content_coding” parameter would have to be set as to “gzip”.

HTTP caching

I have left out from this implementation appropriate HTTP caching control. HTTP caching is extremely important in high volume traffic sites, there is no better way to optimize an HTTP request processing that to never receive said request at all and have the user-agent cache or some intermediate proxy serve a response instead. Our application would have to add appropriate ETag and Cache-Control HTTP headers and the MediaController would need to have an action for the HEAD requests. The IMediaRepository interface would need a new method to get all the file properties without the actual content. For the moment, I’ll leave these as an exercise for the reader…

To BLOB or Not to BLOB

This article does not try to address the fundamental question whether you should store the images in the database to start with. Arguments can be made both for and against this. Russel Sears, Catherine van Ingen and Jim Gray have published a research paper in 2006 on their performance comparison analysis between storing files in the database and storing them in the file system: To BLOB or Not To BLOB: Large Object Storage in a Database or a Filesystem?. They concluded that:

The study indicates that if objects are larger than one megabyte on average, NTFS has a clear advantage over SQL Server. If the objects are under 256 kilobytes, the database has a clear advantage. Inside this range, it depends on how write intensive the workload is, and the storage age of a typical replica in the system.

However their study was not comparing how to serve the files as HTTP responses. The fact that the web server can efficiently serve a file directly from the file system without any code being run in the web application changes the equation quite a bit, and tilts the performance strongly in favor of the file system. But if you already have considered the pros and cons and decided that the advantages of a consistent backup/restore and strong referential integrity warrants database stored BLOBs, then I hope this article highlights an efficient way to return those BLOBs as HTTP responses.

This server supports version 662 and earlier…

November 23rd, 2010

A new error started showing up in SQL Server 2008 SP2 installations:

The database cannot be opened because it is version 661. This server supports version 662 and earlier. A downgrade path is not supported.

661 sure is earlier than 662, so what seems to be the problem? This error message is a bit misleading. SQL Server 2008 supports database version 655 and earlier. But with support for 15000 partitions in SQL Server 2008 SP2, databases enabled for 15000 partitions are upgraded to version 662. This upgrade is necessary to prevent an SQL Server 2008 R2 instance from attaching a database that has more than 1000 partitions in it, since the code in R2 RTM does not understand 15000 partitions and the effects would be unpredictable. So SQL Server 2008 SP2 does indeed support version 662, but it does not support version 661. This behavior is explained in the Support for 15000 Partitions.docx document, although the database versions involved are not explicitly called out.

So the error message above should be really read as:

The database cannot be opened because it is version 661. This server supports versions 662, 655 and earlier than 655. A downgrade path is not supported

With this information the correct resolution can be achieved: the user is trying to attach a SQL Server 2008 R2 database (v. 661) to an SQL Server 2008 SP2 instance. This is not supported. User has to either upgrade the SQL Server 2008 SP2 instance to SQL Server 2008 R2, or it has to attach the database back to a R2 instance and copy out the data from the database into SQL Server 2008 instance database, eg. using the Import and Export Wizard.

TRY CATCH THROW: Error handling changes in T-SQL

November 22nd, 2010

When SQL Server 2005 introduced BEGIN TRY and BEGIN CATCH syntax, it was a huge improvement over the previous error handling based on @@ERROR check after each statement. Finally, T-SQL joined the rank of programming languages, no more just a data access language. Experience has shown that exception handling leads to better code compared to error checks. Yes, SEH is slower, but is basically impossible to maintain the code discipline to check @@ERROR after every operation, so exception handling is just so much easier to get right. And besides, @@ERROR never had such a masterpiece article to guide you trough like A Crash Course on the Depths of Win32™ Structured Exception Handling.

But when trying to use the new TRY/CATCH exception handling in T-SQL code, one problem quickly became apparent: the CATCH block was masking the original error metadata: error number/severity/state, error text, origin line and so on. Within a CATCH block the code was only allowed to raise a *new* error. Sure, the original error information could be passed on in the raised error message, but only as a message. The all important error code was changed. This may seem like a minor issue, but turns out to have a quite serious cascading effect: the caller now has to understand the new error codes raised by your code, instead of the original system error codes. If the application code was prepared to handle deadlocks (error code 1205) in a certain way (eg. retry the transaction), with a T-SQL TRY/CATCH block the deadlock error code would all of the sudden translate into something above 50000.

Read the rest of this entry »

AlwaysOn: High-Availability and reads Scale-Out

November 11th, 2010

Along with SQLPASS Summit 2010 announcements on SQL Server “Denali” features the MSDN site has published preliminary content on some of these features. Going over the “HADR” Overview (SQL Server) content we can get an early idea about this feature. This post summarizes the AlwaysOn technology, and compares it with its predecessor and close cousin, Database Mirroring. For brevity, I am intentionally omitting a lot of details.

The AlwaysOn technology in SQL Server “Denali”, also known by the project name “HADR” and often called Hadron, it is a huge improvement over its predecessor, Database Mirroring. Like Mirroring, AlwaysOn is also based on physical replication of database by shipping over the transaction log. In fact, it is not only similar to Database Mirroring but actually using the DBM technologies to replicate the database. The steps to set up AlwaysOn contain the steps to set up a Mirroring sessions, and the mirroring endpoints, catalog views and DMVs are still used to set up and monitor AlwaysOn. But AlwaysOn brings three more Aces to the table to make an unbeatable play:

Availability Groups
Databases with dependencies on one another fail over together, as a group.
Multiple Secondaries
AlwaysOn will allow for multiple standby replicas for each availability group.
Readable Secondaries
The standby replicas are accessible for read-only operations.

The following table shows the main differences between AlwaysOn and Database Mirroring:

Feature Mirrroring AlwaysOn
Unit of failover Single Database Availability Group
Secondary Access Database Snapshots Read-Only access
Quorum and failover Witness Windows Server Clustering
Number of secondaries Exactly one Number of nodes in the cluster

Availability Groups

DBM considers each mirrored database as an individual entity:

But often application are deployed on several databases contained in an instance and there are tight dependencies between them (eg. cross database queries and procedures). With DBM one had to set up complicated logic to force failover of all related databases if one individual database was occurring a failover event. AlwaysOn introduces Availability Groups that allows explicit declaration of such dependencies. A failover occurs always as an entire group, all databases in the group fail over together become available on the new primary host.

The individual databases inside an Availability Group are still replicated using the Database Mirroring technology:

Multiple Secondaries

Database Mirroring is a private affair between two hosts. A mirroring session can only have on Principal and one Mirror, and they can switch roles. From a High Availability point of view the risk involved in losing one of the partner was fairly serious. Until the operations were set in place to either add a new partner or bring back the old partner online, the system would run at the risk of loosing availability on any further incident. With AlwaysOn there can be multiple stand-by secondary availability groups, on multiple hosts. This allows for a much safer operations, where multiple failures could be survived without loss of availability. This reduces the cost of achieving ‘five nines’ availability numbers, and eases planning, deployment and operating of mission critical systems.

The figure bellow show a possible AlwaysOn deployment on a 4 node cluster:

  • Availability Group 1 contains 2 databases and 3 nodes of the cluster have joined this availability group. The SQL Server instance on node A is the the primary replica and the ones nodes B and C each host an availability replica.
  • Availability Group 2 contains one database, it has the primary availability group running on the SQL Server instance on node B of the cluster and an availability secondary replica on the instance on node A of the cluster.
  • Availability Group 3 contains five databases, it has the primary availability group running on the SQL Server instance on node D of the cluster and an availability secondary replica on the instance on node C of the cluster.
  • Availability Group 4 contains two databases, it has the primary availability group running on the SQL Server instance on node B of the cluster and an availability secondary replica on the instance on node D of the cluster.

Readable Secondaries

Secondary replicas are readable in real-time. All read-only operations are allowed on databases in the secondary availability groups. Coupled with the capability to have multiple secondaries, this gives a very nice solution for Scale Out reads. Unlike Database Mirroring database snapshot solution, that was giving a point-in-time snapshot view of the database, readable secondaries allow for real query access to the content of the secondary database, in real-time for up-to-date changes. Access is read-only, no updates are permitted, and all queries run automatically under snapshot isolation model (lock hints and explicitly set isolation levels are ignored). The queries always get a transactionally consistent view of the data, as fresh as allowed by the underlying mirroring session that continuously updates the database. With synchronous mirroring this means that the scale-out achieves the all elusive golden standard of reads that are always perfectly consistent with the last committed writes *on any replica*, with no lag! So far this was impossible to achieve with any other scale-out technology [Correction 11/22: As Gopal points out bellow, this is still not achievable because synchronous mirroring only requires the shipped log to be hardened to disk, not redone. You can get pretty close, but but the application still needs to be prepared to handle stale reads].

The following image shows a possible AlwaysOn reads Scale-Out deployed on a 3 node cluster to serve a web farm. Read requests can be server by any of the availability replicas, including the Primary Replica. Write requests have to be all routed to the Primary Replica:

What gives

Database Mirroring has always been a viable choice for small businesses. Available in Standard Edition, running on commodity hardware and with minimal licensing cost requirements, it was basically dirt cheap to implement. AlwaysOn is on a different league. Its availability is no longer based on quorum decided by a witness, as in Database Mirroring, but instead is based on Windows Server Failover Clustering. WSFC is a premium feature available only on Enterprise and DataCenter editions of the Windows Server family, and is supported only on hardware that has passed the Validate a Configuration Wizard. Small businesses can still deploy basic Database Mirroring, but they will have to support a significantly higher entry cost to enjoy the advantages of the AlwaysOn technology.