This is Part 23 of a series on Designing, Building & Packaging A Scalable, Testable .NET Open Source Component.

Our last post in the series looked at how to refactor the AzureBlobStorageEngine to perform asynchronous initialization.

This post will examine the refactoring required to enable the AmazonS3StorageEngine to utilize asynchronous initialization.

The constructor for the Amazon3StorageEngine currently looks like this:

public AmazonS3StorageEngine(string username, string password, string amazonLocation, string dataContainerName,
    string metadataContainerName)
{
    // Configuration for the amazon s3 client
    var config = new AmazonS3Config
    {
        ServiceURL = amazonLocation,
        ForcePathStyle = true
    };

    _dataContainerName = dataContainerName;
    _metadataContainerName = metadataContainerName;
    _client = new AmazonS3Client(username, password, config);
    _utility = new TransferUtility(_client);
}

We then have a InitializeAsync method that looks like this:

public static async Task<AmazonS3StorageEngine> InitializeAsync(string username, string password,
    string amazonLocation,
    string dataContainerName,
    string metadataContainerName, CancellationToken cancellationToken = default)
{
    var engine = new AmazonS3StorageEngine(username, password, amazonLocation, dataContainerName,
        metadataContainerName);

    // Configuration for the amazon s3 client
    var config = new AmazonS3Config
    {
        ServiceURL = amazonLocation,
        ForcePathStyle = true
    };

    var client = new AmazonS3Client(username, password, config);
    // Check if the metadata bucket exists
    if (!await AmazonS3Util.DoesS3BucketExistV2Async(client, metadataContainerName))
    {
        var request = new PutBucketRequest
        {
            BucketName = metadataContainerName,
            UseClientRegion = true
        };

        await client.PutBucketAsync(request, cancellationToken);
    }

    // Check if the data bucket exists
    if (!await AmazonS3Util.DoesS3BucketExistV2Async(client, dataContainerName))
    {
        var request = new PutBucketRequest
        {
            BucketName = dataContainerName,
            UseClientRegion = true
        };

        await client.PutBucketAsync(request, cancellationToken);
    }

    return engine;
}

As before, our first step is to make the constructor private, so that we control the initialization of the object.

private AmazonS3StorageEngine(string username, string password, string amazonLocation, string dataContainerName,
    string metadataContainerName)
{
    // Configuration for the amazon s3 client
    var config = new AmazonS3Config
    {
        ServiceURL = amazonLocation,
        ForcePathStyle = true
    };

    _dataContainerName = dataContainerName;
    _metadataContainerName = metadataContainerName;
    _client = new AmazonS3Client(username, password, config);
    _utility = new TransferUtility(_client);
}

Next, we make the InitializeAsync static and refactor it to construct and return an AmazonS3StorageEngine object. We also need to pass all the parameters required to call the constructor.

public static async Task<AmazonS3StorageEngine> InitializeAsync(string username, string password,
    string amazonLocation,
    string dataContainerName,
    string metadataContainerName, CancellationToken cancellationToken = default)
{
    var engine = new AmazonS3StorageEngine(username, password, amazonLocation, dataContainerName,
        metadataContainerName);

    // Configuration for the amazon s3 client
    var config = new AmazonS3Config
    {
        ServiceURL = amazonLocation,
        ForcePathStyle = true
    };

    var client = new AmazonS3Client(username, password, config);
    // Check if the metadata bucket exists
    if (!await AmazonS3Util.DoesS3BucketExistV2Async(client, metadataContainerName))
    {
        var request = new PutBucketRequest
        {
            BucketName = metadataContainerName,
            UseClientRegion = true
        };

        await client.PutBucketAsync(request, cancellationToken);
    }

    // Check if the data bucket exists
    if (!await AmazonS3Util.DoesS3BucketExistV2Async(client, dataContainerName))
    {
        var request = new PutBucketRequest
        {
            BucketName = dataContainerName,
            UseClientRegion = true
        };

        await client.PutBucketAsync(request, cancellationToken);
    }

    return engine;
}

Again, we have some repetition here - we are creating another AmazonS3Client, but this one is specifically to perform initialization work, as well as its necessary configuration.

The last step is the methods that all currently call InitializeAsync:

/// <inheritdoc />
public async Task<FileMetadata> StoreFileAsync(FileMetadata metaData, Stream data,
    CancellationToken cancellationToken = default)
{
    // Initialize
    await InitializeAsync(cancellationToken);

    // Upload the data and the metadata in parallel
    await Task.WhenAll(
        _utility.UploadAsync(new MemoryStream(Encoding.UTF8.GetBytes(JsonSerializer.Serialize(metaData))),
            _metadataContainerName, metaData.FileId.ToString(), cancellationToken),
        _utility.UploadAsync(data, _dataContainerName, metaData.FileId.ToString(), cancellationToken)
    );
    return metaData;
}

/// <inheritdoc />
public async Task<FileMetadata> GetMetadataAsync(Guid fileId, CancellationToken cancellationToken = default)
{
    // Initialize
    await InitializeAsync(cancellationToken);

    //Verify file exists
    if (!await FileExistsAsync(fileId, _metadataContainerName, cancellationToken))
        throw new FileNotFoundException($"File {fileId} not found");

    // Create a request
    var request = new GetObjectRequest
    {
        BucketName = _metadataContainerName,
        Key = fileId.ToString()
    };

    // Retrieve the data
    using var response = await _client.GetObjectAsync(request, cancellationToken);
    await using var responseStream = response.ResponseStream;
    var memoryStream = new MemoryStream();
    await responseStream.CopyToAsync(memoryStream, cancellationToken);

    // Reset position
    memoryStream.Position = 0;
    using var reader = new StreamReader(memoryStream);
    var content = await reader.ReadToEndAsync(cancellationToken);
    return JsonSerializer.Deserialize<FileMetadata>(content) ?? throw new FileNotFoundException();
}

/// <inheritdoc />
public async Task<Stream> GetFileAsync(Guid fileId, CancellationToken cancellationToken = default)
{
    // Initialize
    await InitializeAsync(cancellationToken);

    //Verify file exists
    if (!await FileExistsAsync(fileId, _dataContainerName, cancellationToken))
        throw new FileNotFoundException($"File {fileId} not found");

    // Create a request
    var request = new GetObjectRequest
    {
        BucketName = _dataContainerName,
        Key = fileId.ToString()
    };

    // Retrieve the data
    using var response = await _client.GetObjectAsync(request, cancellationToken);
    await using var responseStream = response.ResponseStream;
    var memoryStream = new MemoryStream();
    await responseStream.CopyToAsync(memoryStream, cancellationToken);
    // Reset position
    memoryStream.Position = 0;
    return memoryStream;
}

/// <inheritdoc />
public async Task DeleteFileAsync(Guid fileId, CancellationToken cancellationToken = default)
{
    // Initialize
    await InitializeAsync(cancellationToken);

    //Verify file exists
    if (!await FileExistsAsync(fileId, _dataContainerName, cancellationToken))
        throw new FileNotFoundException($"File {fileId} not found");

    // Delete metadata and data in parallel
    await Task.WhenAll(
        _client.DeleteObjectAsync(_metadataContainerName, fileId.ToString(), cancellationToken),
        _client.DeleteObjectAsync(_dataContainerName, fileId.ToString(), cancellationToken));
}

/// <inheritdoc />
public async Task<bool> FileExistsAsync(Guid fileId, CancellationToken cancellationToken = default)
{
    // Initialize
    await InitializeAsync(cancellationToken);

    return await FileExistsAsync(fileId, _dataContainerName, cancellationToken);
}

private async Task<bool> FileExistsAsync(Guid fileId, string containerName,
    CancellationToken cancellationToken = default)
{
    try
    {
        await _client.GetObjectMetadataAsync(containerName, fileId.ToString(), cancellationToken);
        return true;
    }
    catch (AmazonS3Exception ex)
    {
        if (ex.StatusCode == HttpStatusCode.NotFound)
        {
            return false;
        }

        throw;
    }
}

We can remove all those calls, so that they now look like this:

/// <inheritdoc />
public async Task<FileMetadata> StoreFileAsync(FileMetadata metaData, Stream data,
    CancellationToken cancellationToken = default)
{
    // Upload the data and the metadata in parallel
    await Task.WhenAll(
        _utility.UploadAsync(new MemoryStream(Encoding.UTF8.GetBytes(JsonSerializer.Serialize(metaData))),
            _metadataContainerName, metaData.FileId.ToString(), cancellationToken),
        _utility.UploadAsync(data, _dataContainerName, metaData.FileId.ToString(), cancellationToken)
    );
    return metaData;
}

/// <inheritdoc />
public async Task<FileMetadata> GetMetadataAsync(Guid fileId, CancellationToken cancellationToken = default)
{
    //Verify file exists
    if (!await FileExistsAsync(fileId, _metadataContainerName, cancellationToken))
        throw new FileNotFoundException($"File {fileId} not found");

    // Create a request
    var request = new GetObjectRequest
    {
        BucketName = _metadataContainerName,
        Key = fileId.ToString()
    };

    // Retrieve the data
    using var response = await _client.GetObjectAsync(request, cancellationToken);
    await using var responseStream = response.ResponseStream;
    var memoryStream = new MemoryStream();
    await responseStream.CopyToAsync(memoryStream, cancellationToken);

    // Reset position
    memoryStream.Position = 0;
    using var reader = new StreamReader(memoryStream);
    var content = await reader.ReadToEndAsync(cancellationToken);
    return JsonSerializer.Deserialize<FileMetadata>(content) ?? throw new FileNotFoundException();
}

/// <inheritdoc />
public async Task<Stream> GetFileAsync(Guid fileId, CancellationToken cancellationToken = default)
{
    //Verify file exists
    if (!await FileExistsAsync(fileId, _dataContainerName, cancellationToken))
        throw new FileNotFoundException($"File {fileId} not found");

    // Create a request
    var request = new GetObjectRequest
    {
        BucketName = _dataContainerName,
        Key = fileId.ToString()
    };

    // Retrieve the data
    using var response = await _client.GetObjectAsync(request, cancellationToken);
    await using var responseStream = response.ResponseStream;
    var memoryStream = new MemoryStream();
    await responseStream.CopyToAsync(memoryStream, cancellationToken);
    // Reset position
    memoryStream.Position = 0;
    return memoryStream;
}

/// <inheritdoc />
public async Task DeleteFileAsync(Guid fileId, CancellationToken cancellationToken = default)
{
    //Verify file exists
    if (!await FileExistsAsync(fileId, _dataContainerName, cancellationToken))
        throw new FileNotFoundException($"File {fileId} not found");

    // Delete metadata and data in parallel
    await Task.WhenAll(
        _client.DeleteObjectAsync(_metadataContainerName, fileId.ToString(), cancellationToken),
        _client.DeleteObjectAsync(_dataContainerName, fileId.ToString(), cancellationToken));
}

/// <inheritdoc />
public async Task<bool> FileExistsAsync(Guid fileId, CancellationToken cancellationToken = default)
{
    return await FileExistsAsync(fileId, _dataContainerName, cancellationToken);
}

private async Task<bool> FileExistsAsync(Guid fileId, string containerName,
    CancellationToken cancellationToken = default)
{
    try
    {
        await _client.GetObjectMetadataAsync(containerName, fileId.ToString(), cancellationToken);
        return true;
    }
    catch (AmazonS3Exception ex)
    {
        if (ex.StatusCode == HttpStatusCode.NotFound)
        {
            return false;
        }

        throw;
    }
}

Again, we derive the benefit that we are sure that once we receive the created AmazonS3StorageEngine, it has been correctly configured in terms of setting up all the necessary bucket objects.

Our tests, we can confirm, still pass.

AmazonsS3RefactorTests

TLDR

We have refactored the AmazonS3StorageEngine to ensure that it is initialized correctly upon object creation.

The code is in my GitHub.

Happy hacking!