SQL Server Links

To BLOB or not to BLOB
One of the most frequently asked questions on discussion forums is whether to store media files in the database as BLOBs or to store them on disk and keep just the file name in the database. This paper compares the performance characteristics of NTFS with those of storing a BLOB in the database and concludes, perhaps surprisingly, that “if objects are larger than one megabyte on average, NTFS has a clear advantage over SQL Server. If the objects are under 256 kilobytes, the database has a clear advantage”.
Previously committed rows might be missed if NOLOCK hint is used
For all the NOLOCK hint aficionados out there: this is why you never do dirty reads.
SQL Server I/O Basics
This old SQL Server 2000 article still applies to SQL Server 2005, 2008 and 2008 R2. It explains the inner workings of SQL Server I/O at the most detailed level. There is a complementary SQL Server 2005 document, SQL Server I/O Basics, Chapter 2, which goes over the SQL Server 2005 specifics. You need to read the SQL Server 2000 article first, as the 2005 one does not cover the same concepts again and assumes you have read the first document.
Troubleshooting SQL Server connectivity
Service Broker in the real world
  • Developing Large Scale Web Applications and Services
  • MySpace DB Overview
  • Real Time Analytics with SQL Server 2008 R2 StreamInsight
  • March Madness on Demand
  • SQL vs. NoSQL

  • The End of an Architectural Era (It’s Time for a Complete Rewrite)
  • Life beyond Distributed Transactions: an Apostate’s Opinion
  • H-Store: A High-Performance, Distributed Main Memory Transaction Processing System
  • Dynamo: Amazon’s Highly Available Key-value Store
RDBMS Historic Links

    System R: Relational Approach to Database Management
    System R is the grandfather of all relational databases today. This is the original paper that introduced System R to the world, back in 1976. All of today’s RDBMSs inherit their most fundamental traits from System R:

    • Disk oriented storage
    • Indexing structures
    • Multithreading to amortize latency
    • Lock based concurrency
    • Log based recovery
    Write-Ahead Logging
    Write-Ahead Logging is the cornerstone of recoverability in most RDBMSs. If you ever wanted to understand how a database log works, this is the paper to read: ARIES: A Transaction Recovery Method Supporting Fine-Granularity Locking and Partial Rollbacks Using Write-Ahead Logging.
    Volcano-An Extensible and Parallel Query Evaluation System
    This paper describes the typical row-mode, pull-driven query execution model, where each operator asks its children for the next tuple.
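
    The pull-driven model can be sketched in a few lines. This is only a minimal illustration, not code from the paper: the `Scan`, `Filter` and `Project` classes and the `run` helper are hypothetical names, and each operator exposes the open/next/close interface, returning `None` when it has no more tuples.

```python
class Scan:
    """Leaf operator: hands out rows from an in-memory list, one per next() call."""
    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0

    def open(self):
        self._pos = 0

    def next(self):
        if self._pos >= len(self._rows):
            return None  # end of stream
        row = self._rows[self._pos]
        self._pos += 1
        return row

    def close(self):
        pass


class Filter:
    """Pulls tuples from its child until one satisfies the predicate."""
    def __init__(self, child, predicate):
        self._child = child
        self._predicate = predicate

    def open(self):
        self._child.open()

    def next(self):
        while True:
            row = self._child.next()
            if row is None or self._predicate(row):
                return row

    def close(self):
        self._child.close()


class Project:
    """Keeps only the requested columns of each tuple pulled from its child."""
    def __init__(self, child, columns):
        self._child = child
        self._columns = columns

    def open(self):
        self._child.open()

    def next(self):
        row = self._child.next()
        if row is None:
            return None
        return {c: row[c] for c in self._columns}

    def close(self):
        self._child.close()


def run(plan):
    """Drives the plan from the root: the consumer pulls, operators never push."""
    plan.open()
    out = []
    row = plan.next()
    while row is not None:
        out.append(row)
        row = plan.next()
    plan.close()
    return out


orders = [{"name": "disk", "qty": 4}, {"name": "tape", "qty": 1}, {"name": "ram", "qty": 2}]
plan = Project(Filter(Scan(orders), lambda r: r["qty"] > 1), ["name"])
print(run(plan))  # [{'name': 'disk'}, {'name': 'ram'}]
```

    Note that no operator knows what sits below it: the root asks for one tuple at a time and the request cascades down the tree.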
    The Five Minute Rule
    Ever wondered where the 300-second buffer pool page life expectancy threshold comes from? The original research was done back in 1986 and simple economics were involved: given the cost of memory vs. the cost of mass storage in 1986, it was best to keep a page in memory if it was going to be accessed again in less than 5 minutes. The paper was revisited in 1997 and 2007, and the follow-ups found that the changes in technology and cost were canceling each other out and kept the equilibrium point at about 5 minutes.
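
    The economics can be sketched as a break-even calculation. The formula below is the form usually quoted from the follow-up paper, as best I recall it, and the input numbers are illustrative assumptions rather than figures taken from this page; the function and parameter names are made up for the example.

```python
def break_even_interval_seconds(pages_per_mb_ram,
                                accesses_per_second_per_disk,
                                price_per_disk_drive,
                                price_per_mb_ram):
    """Reference interval below which caching a page in RAM is cheaper
    than paying for the disk arm needed to re-read it.

    (pages per MB of RAM / disk accesses per second)
        * (price per disk drive / price per MB of RAM)
    """
    technology_ratio = pages_per_mb_ram / accesses_per_second_per_disk
    economic_ratio = price_per_disk_drive / price_per_mb_ram
    return technology_ratio * economic_ratio


# Illustrative inputs (assumptions): 8 KB pages, so 128 pages per MB;
# 64 random accesses per second per disk; $2000 per drive; $15 per MB of RAM.
interval = break_even_interval_seconds(128, 64, 2000, 15)
print(round(interval))  # 267 seconds, i.e. roughly five minutes
```

    The two ratios pull in opposite directions over time, which is why the result can stay in the same neighborhood even as the individual prices and speeds change.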

    The 1995 SQL Reunion: People, Projects, and Politics
    This paper is actually a group interview with most of the people involved in the original release of the very first RDBMS.
    AlphaSort: A Cache-Sensitive Parallel External Sort
    Although this is not an RDBMS paper, I chose to post it here nonetheless. This paper shows the importance of writing cache-conscious algorithms for modern processors. Its importance is even greater today, as multi-core systems become prevalent and parallel tasks are written by everybody in day-to-day programming.
    Failure Trends in a Large Disk Population
    That moment you fear, when your database disk all of a sudden stops responding, may be much closer than you think. Did you test your Disaster Recovery plans and procedures? Oh, and perhaps you should read this also: Disk Failures in the Real World: What Does an MTTF of 1,000,000 Hours Mean to You?

    Reference Links