Blog

An overview of working with files in SQL Server

When you are responsible for databases on a project (as an architect, administrator or developer), you sometimes have to store documents (Files) which are linked to structured data. These data often have to be accessible outside SQL Server (Windows API’s). Storing BLOBs in the database, especially the ones over 1MB, can consume large amounts of file space and expensive server resources. Within SQL Server and Azure, there are different solutions to solve these kind of issues.

Solution 1: Storing the network path (all versions)

The most basic way, and also the only way possible on PRE-2012 SQLServer versions, is to create a table (e.g. ‘Documents’) with a path column that contains the complete file path and name.

Disadvantages

  • Lack of transaction consistency: executed actions are not part of the transaction
  • Lack of backup: files cannot be backed up and restored with other SQL Server data
  • Application development is more complex
  • More difficult to manage

Solution 2: Remote BLOB Store (Using FileStream – 2008R2 and later)

SQL Server Remote BLOB Store (RBS) is an optional add-on component that lets database administrators store binary large objects in commodity storage solutions, instead of directly on the main database server. This way of BLOB storage is also used by SharePoint. Aside to the Native SQL BLOB Provider (Default) since SQL Server 2005, there is also a Filestream BLOB Provider since SQL Server 2008R2. This is the only Remote BLOB provider that ships with SQL Server 2008 and it allows you to store BLOB data on any local hard drive on the SQL Server. With a third party provider Store (Storage Point), it gives you the ability to use FileStream towards the Cloud.

Disadvantages

  • The implemenatation of Transparent Data Encryption (TDE) is impossible
  • Database mirroring is impossible
  • Using the FileStream BLOB provider to store data on a network share or a mapped drive is impossible. A physical disk on the server is requiered, except if you use 3rd party (vendor specific providers (NetApp, ECM2, Storage point (Cloud)). Be aware that individual RBS providers may or may not support native SQL backups.
  • 3rd party BLOB stores are also not supported in AlwaysOn Availability Groups
  • It’s quite cumbersome to set up

Note: RBS can be downloaded from the SQL Server 2008 R2 Feature Pack page. RBS is not included on the SQL Server 2008 R2 installation media, and is not installed by the SQL Server 2008 R2 Setup program.

Architectural design with RBS and Storage Point:

Solution 3: FileTable (Using FileStream – 2012 and later)

Since SQL Server 2012, Microsoft has prvided a table of files (also referred to as a FileTable) for applications that require file and directory storage in the database. A FileTable is a specialized user table with a pre-defined schema that stores FILESTREAM data, as well as file and directory hierarchy information and file attributes. This is, in my opinion, the biggest advantage. The files as such are available by the normal Windows File and Directory management, but you can keep track of it in SQL Server, including a hierarchical name space of directories, files and file attributes. Every row in a File Table represents a file or directory.

Note: FileTables do not support Memory-Mapped files (e.g. Notepad, Paint files).
With SQLServer Filetables, you can perform 3 actions: create, alter and drop

Setup

  • FILESTREAM must be enabled (Instance Level): SQL Server Configuration Manager -> Properties -> Filestream -> Set the desired options

EXEC sp_configure filestream_access_level, 2
RECONFIGURE
GO

0 = Disables FILESTREAM support for this instance.
1 = Enables FILESTREAM for Transact-SQL access.
2 = Enables FILESTREAM for Transact-SQL and Win32 streaming access.

  • FILESTREAM-Enabled Database
CREATE DATABASE Documents
ON PRIMARY
(Name = Documents,
FILENAME = 'D:\FileData\Documents.mdf'),
FILEGROUP FTFG CONTAINS FILESTREAM
(NAME = FileTableFS,
FILENAME='D:\FileData\FS')
LOG ON
(Name = DocumentsLog,
FILENAME = 'D:\FileData\ DocumentsLog.ldf')
WITH FILESTREAM (NON_TRANSACTED_ACCESS = FULL,
DIRECTORY_NAME = N'Documents');
GO

Note: Can also been done with an ALTER DATABASE command on an existing DB

Disadvantages

  • The implemenatation of Transparent Data Encryption (TDE) is impossible
  • Database mirroring is impossible
  • Performance is rather poor on frequently small updates to the Files
  • FileTable functionality is only partially supported by Always On Availability Groups. After a failover, FileTable data is accessible on the primary replica, but not on readable secondary replicas.
  • Partitioning is not supported on FileTables
  • Replication and related features are not supported with FileTables
  • In the current release of this feature, storing FileStream data in Azure Storage is not supported. You can store Filestream data in an Azure storage integrated local database, but you cannot move Filestream data between machines using Azure Storage.

Since Filestream is currently (current release of SQL 2016) not supported by Azure, you need to go to:

  • RBS with the 3th party provider of Storage Point
  • Use the Azure document storage service, which offloads the network traffic, but doesn’t give you the transaction (ACID), Backup/Restore, … integration.

In this case we’re back at the basics and store:

    • Collection – String
    • DocumentId – UniqueIdentifier
    • UserId – String

Advantages

  • Traffic doesn’t need to pass the local network: when a Business Service needs to have a location to upload or download a file from the user, it asks for an URL from the Document Storage Service. The Document Storage Service provides the service with a URL, that is secure and temporarily available.
  • Cleanup build in: when a file is first uploaded it will be sent to a temporary location. After a certain time the files in temporary storage will be cleaned up. This prevents a buildup of unintended files where the user changes his mind about a file and doesn’t want to continue.