Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 13 - Database Configuration
[.NET, C#, OpenSource, Design, Testing, PostgreSQL, SQL Server, Dapper]
This is Part 13 of a series on Designing, Building & Packaging A Scalable, Testable .NET Open Source Component.
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 1 - Introduction
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 2 - Basic Requirements
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 3 - Project Setup
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 4 - Types & Contracts
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 5 - Component Implementation
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 6 - Mocking & Behaviour Tests
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 7 - Sequence Verification With Moq
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 8 - Compressor Implementation
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 9 - Encryptor Implementation
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 10 - In Memory Storage
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 11 - SQL Server Storage
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 12 - PostgreSQL Storage
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 13 - Database Configuration (This Post)
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 14 - Virtualizing Infrastructure
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 15 - Test Organization
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 16 - Large File Consideration
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 17 - Large File Consideration On PostgreSQL
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 18 - Azure Blob Storage
- Designing, Building & Packaging A Scalable, Testable .NET Open Source Component - Part 19 - Testing Azure Blob Storage Locally
In our last post, we implemented the PostgreSQL storage engine.
In this post we shall look at the considerations around the database itself, regardless of the engine.
A couple of questions come to mind:
- In which database will the table that stores the files be? Should it have a dedicated database, or will the table be part of the host application’s database?
- How will the table(s) and index(es) be created? Will they be automated, or will they someone need to create them manually.
Let us tackle them one by one.
Which Database
We will leave the decision of which database to the host application. Given we are providing a connection string, the host can pass the details of the database to use to our component.
Schema & Objects Creation
We will also leave the creation of the table and index(es) to the user. In other words, this will not be automated. Why?
- Automation requires system administration rights, or at least roles for
CREATE TABLE
` / `ALTER TABLE, which implies escalated permission levels. It is not reasonable (or wise) to expect such a component to be working with such permissions. - It is always a good thing for any component, tool or software to extend visibility of exactly what is being done on the server. It is good to know exactly what SQL is being ran.
What we will do is provide scripts that can be ran to create the objects, that can be ran in advance. And when the connection string is being passed to the component, a low privilege user can be used.
Below is the script for SQL Server:
-- TABLE
-- Create table if it doesn't exist
IF NOT EXISTS (SELECT *
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = 'Files'
AND TABLE_SCHEMA = 'dbo')
BEGIN
CREATE TABLE dbo.Files
(
FileID UNIQUEIDENTIFIER PRIMARY KEY NOT NULL,
Name NVARCHAR(500) NOT NULL,
Extension NVARCHAR(10) NOT NULL,
DateUploaded DATETIME2 NOT NULL,
OriginalSize INT NOT NULL,
PersistedSize INT NOT NULL,
CompressionAlgorithm TINYINT NOT NULL,
EncryptionAlgorithm TINYINT NOT NULL,
Hash BINARY(32) NOT NULL,
Data VARBINARY(MAX)
);
END
GO
-- INDEXES
IF NOT EXISTS (SELECT 1
FROM sys.indexes
WHERE name = 'IX_Files_Metadata'
AND object_id = OBJECT_ID('dbo.Files'))
BEGIN
CREATE NONCLUSTERED INDEX IX_Files_Metadata
ON dbo.Files (FileID)
INCLUDE (
Name,
Extension,
DateUploaded,
OriginalSize,
PersistedSize,
CompressionAlgorithm,
EncryptionAlgorithm,
Hash
);
END
GO
Below is the script for PostgreSQL:
-- Table creation
CREATE TABLE IF NOT EXISTS public.files
(
fileid UUID PRIMARY KEY NOT NULL,
name VARCHAR(500) NOT NULL,
extension VARCHAR(10) NOT NULL,
dateuploaded TIMESTAMPTZ NOT NULL,
originalsize INT NOT NULL,
persistedsize INT NOT NULL,
compressionalgorithm SMALLINT NOT NULL,
encryptionalgorithm SMALLINT NOT NULL,
hash BYTEA NOT NULL,
data BYTEA
);
-- Index creation
CREATE INDEX IF NOT EXISTS ix_files_metadata
ON public.files (fileid)
INCLUDE (name, extension, dateuploaded, originalsize, persistedsize, compressionalgorithm, encryptionalgorithm, hash);
Prior to use, this script can be run with the rights of a higher privilege user.
What if you don’t want to use the name Files
in SQL Server, or files
in PostgreSQL?
We can rewrite the scripts to allow changing of the table name, so that at configuration the user can decide which name to use.
For SQL Server:
-- Modify the table name for use
DECLARE @TableName NVARCHAR(128) = 'Files';
DECLARE @SchemaName NVARCHAR(128) = 'dbo';
DECLARE @FullTableName NVARCHAR(256) = QUOTENAME(@SchemaName) + '.' + QUOTENAME(@TableName);
DECLARE @Sql NVARCHAR(MAX);
-- Check if table exists
IF NOT EXISTS (SELECT *
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = @TableName
AND TABLE_SCHEMA = @SchemaName)
BEGIN
SET @Sql = '
CREATE TABLE ' + @FullTableName + ' (
FileID UNIQUEIDENTIFIER PRIMARY KEY NOT NULL,
Name NVARCHAR(500) NOT NULL,
Extension NVARCHAR(10) NOT NULL,
DateUploaded DATETIME2 NOT NULL,
OriginalSize INT NOT NULL,
PersistedSize INT NOT NULL,
CompressionAlgorithm TINYINT NOT NULL,
EncryptionAlgorithm TINYINT NOT NULL,
Hash BINARY(32) NOT NULL,
Data VARBINARY(MAX)
);';
EXEC sp_executesql @Sql;
END;
-- Check if index exists
IF NOT EXISTS (SELECT 1
FROM sys.indexes
WHERE name = 'IX_' + @TableName + '_Metadata'
AND object_id = OBJECT_ID(@FullTableName))
BEGIN
SET @Sql = '
CREATE NONCLUSTERED INDEX' + ' IX_' + @TableName + '_Metadata
ON ' + @FullTableName + ' (FileID)
INCLUDE (
Name,
Extension,
DateUploaded,
OriginalSize,
PersistedSize,
CompressionAlgorithm,
EncryptionAlgorithm,
Hash
);';
EXEC sp_executesql @Sql;
END;
For PostgreSQL:
DO $$
DECLARE
-- Change this to your desired table name
table_name text := 'files';
BEGIN
EXECUTE format('
CREATE TABLE IF NOT EXISTS public.%I (
fileid UUID PRIMARY KEY NOT NULL,
name VARCHAR(500) NOT NULL,
extension VARCHAR(10) NOT NULL,
dateuploaded TIMESTAMPTZ NOT NULL,
originalsize INT NOT NULL,
persistedsize INT NOT NULL,
compressionalgorithm SMALLINT NOT NULL,
encryptionalgorithm SMALLINT NOT NULL,
hash BYTEA NOT NULL,
data BYTEA
);
', table_name);
EXECUTE format('
CREATE INDEX IF NOT EXISTS ix_%I_metadata
ON public.%I (fileid)
INCLUDE (name, extension, dateuploaded, originalsize, persistedsize, compressionalgorithm, encryptionalgorithm, hash);
', table_name,table_name);
END $$;
The name of the table will be passed in the connection string.
For SQL Server it will look like this:
DataSource=myserver;uid=mylogin;pwd=mypass;database=mydatabase
And for PostgreSQL
Host=myserver;Username=mylogin;Password=mypass;Database=mydatabase
Which begs the question - what if we omit the database?
The code will actually still work - provided the other parameters are correct.
The problem is the table will be created in the system databases - master
for SQL Server and postgres
for PostgreSQL.
We will need to parse the connection string and ensure there is a database passed.
We can write two parsers for this, using the respective implementations of the DbConnectionStringBuilder for parsing.
For SQL Server:
public sealed class SqlServerConnectionStringParser
{
private readonly string _connectionString;
public SqlServerConnectionStringParser(string connectionString)
{
_connectionString = connectionString;
}
public string Database
{
get
{
var builder = new SqlConnectionStringBuilder(_connectionString);
return builder.InitialCatalog;
}
}
}
For PostgreSQL
public sealed class PostgreSQLConnectionStringParser
{
private readonly string _connectionString;
public PostgreSQLConnectionStringParser(string connectionString)
{
_connectionString = connectionString;
}
public string? Database
{
get
{
var builder = new NpgsqlConnectionStringBuilder(_connectionString);
return builder.Database;
}
}
}
Finally we can use the parser in the constructor of the IStorageEngine
implementations.
SQL Server:
/// <summary>
/// Constructor, taking the connection string
/// </summary>
/// <param name="connectionString"></param>
public SqlServerStorageEngine(string connectionString)
{
_connectionString = connectionString;
// Parse the connection string
var parser = new SqlServerConnectionStringParser(connectionString);
if (string.IsNullOrEmpty(parser.Database))
throw new ArgumentException($"{nameof(parser.Database)} cannot be null or empty");
}
PostgreSQL:
/// <summary>
/// Constructor, taking the connection string
/// </summary>
/// <param name="connectionString"></param>
public PosgrgreSQLStorageEngine(string connectionString)
{
_connectionString = connectionString;
// Parse the connection string for a database
var parser = new PostgreSQLConnectionStringParser(connectionString);
if (string.IsNullOrEmpty(parser.Database))
throw new ArgumentException($"{nameof(parser.Database)} cannot be null or empty");
}
Thus we can ensure that the user has the responsibility to configure the database beforehand before using the component.
TLDR
In this post we have written scripts for the configuration of the SQL Server and PostgreSQL storage engines, as well as code to ensure a database is specified at startup.
The code is in my GitHub.
Happy hacking!