Now I will adapt the Repository from my previous post to use blob storage this time. I'll only walk through the changes I'm making here.
Constructor
First, we need to create a CloudBlobContainer instance in the constructor. Please note that blob storage container names must be lowercase.
public class ProjectRepository
{
    private CloudTable table;
    private CloudBlobContainer container;

    public ProjectRepository()
    {
        var connectionString = "...";
        CloudStorageAccount storageAccount = CloudStorageAccount.Parse(connectionString);

        var tableClient = storageAccount.CreateCloudTableClient();
        this.table = tableClient.GetTableReference("Project");
        this.table.CreateIfNotExists();

        var blobClient = storageAccount.CreateCloudBlobClient();
        this.container = blobClient.GetContainerReference("project");
        this.container.CreateIfNotExists();
    }

    // ...
}
Insert
Next, for the Insert method, we no longer store the document in a property of the ElasticTableEntity object. Instead we serialize the document to JSON, upload it to blob storage as a file, and set the ContentType of that blob to application/json. For the blob name (or path), the pattern I'm using looks like this: {document-type}/{partition-key}/{row-key}.
public void Insert(Project project)
{
    project.Id = Guid.NewGuid();
    var document = JsonConvert.SerializeObject(project, Newtonsoft.Json.Formatting.Indented);

    var partitionKey = project.Owner.ToString();
    var rowKey = project.Id.ToString();
    UploadDocument(partitionKey, rowKey, document);

    dynamic entity = new ElasticTableEntity();
    entity.PartitionKey = partitionKey;
    entity.RowKey = rowKey;
    entity.Name = project.Name;
    entity.StartDate = project.StartDate;
    entity.TotalTasks = project.Tasks.Count();

    this.table.Execute(TableOperation.Insert(entity));
}

private void UploadDocument(string partitionKey, string rowKey, string document)
{
    // Forward slashes keep the blob under the "project/{partition-key}/" virtual
    // directory that the List method relies on.
    var filename = string.Format("project/{0}/{1}.json", partitionKey, rowKey);
    var blockBlob = this.container.GetBlockBlobReference(filename);

    using (var memory = new MemoryStream())
    using (var writer = new StreamWriter(memory))
    {
        writer.Write(document);
        writer.Flush();
        memory.Seek(0, SeekOrigin.Begin);
        blockBlob.UploadFromStream(memory);
    }

    blockBlob.Properties.ContentType = "application/json";
    blockBlob.SetProperties();
}
Load
For the Load method we can build the blob name from the PartitionKey and RowKey, then download the document from blob storage. In DownloadDocument I'm using a MemoryStream and a StreamReader to read the serialized document back as a string.
public Project Load(string partitionKey, string rowKey)
{
    var blobName = string.Format("project/{0}/{1}.json", partitionKey, rowKey);
    var document = this.DownloadDocument(blobName);
    return JsonConvert.DeserializeObject<Project>(document);
}

private string DownloadDocument(string blobName)
{
    var blockBlob = this.container.GetBlockBlobReference(blobName);

    using (var memory = new MemoryStream())
    using (var reader = new StreamReader(memory))
    {
        blockBlob.DownloadToStream(memory);
        memory.Seek(0, SeekOrigin.Begin);
        return reader.ReadToEnd();
    }
}
List
In the first List method we want to get all documents from the same partition. We can do that directly with the ListBlobs method of CloudBlobDirectory. For the ListWithTasks method we still need to query table storage first to find out which documents contain at least one task. The returned entities give us the RowKey of those documents, so we can simply call the Load method we just saw.
public IEnumerable<Project> List(string partitionKey)
{
    var listItems = this.container
        .GetDirectoryReference("project/" + partitionKey)
        .ListBlobs();

    return listItems.OfType<CloudBlockBlob>()
        .Select(x => this.DownloadDocument(x.Name))
        .Select(document => JsonConvert.DeserializeObject<Project>(document));
}

public IEnumerable<Project> ListWithTasks(string partitionKey)
{
    var query = new TableQuery<ElasticTableEntity>()
        .Select(new[] { "RowKey" })
        .Where(TableQuery.CombineFilters(
            TableQuery.GenerateFilterCondition("PartitionKey",
                QueryComparisons.Equal, partitionKey),
            TableOperators.And,
            TableQuery.GenerateFilterConditionForInt("TotalTasks",
                QueryComparisons.GreaterThan, 0)));

    dynamic entities = this.table.ExecuteQuery(query).ToList();

    foreach (var entity in entities)
        yield return this.Load(partitionKey, entity.RowKey);
}
Update
To update a document, we now also need to serialize the new version and upload it to blob storage.
public void Update(Project project)
{
    var document = JsonConvert.SerializeObject(project, Newtonsoft.Json.Formatting.Indented);

    var partitionKey = project.Owner.ToString();
    var rowKey = project.Id.ToString();
    UploadDocument(partitionKey, rowKey, document);

    dynamic entity = new ElasticTableEntity();
    entity.PartitionKey = partitionKey;
    entity.RowKey = rowKey;
    entity.ETag = "*";
    entity.Name = project.Name;
    entity.StartDate = project.StartDate;
    entity.TotalTasks = project.Tasks != null ? project.Tasks.Count() : 0;

    this.table.Execute(TableOperation.Replace(entity));
}
Delete
Finally, deleting a document now also requires us to call Delete on the CloudBlockBlob reference obtained from the container.
public void Delete(Project project)
{
    dynamic entity = new ElasticTableEntity();
    entity.PartitionKey = project.Owner.ToString();
    entity.RowKey = project.Id.ToString();
    entity.ETag = "*";

    this.table.Execute(TableOperation.Delete(entity));

    this.DeleteDocument(entity.PartitionKey, entity.RowKey);
}

public void Delete(string partitionKey, string rowKey)
{
    dynamic entity = new ElasticTableEntity();
    entity.PartitionKey = partitionKey;
    entity.RowKey = rowKey;
    entity.ETag = "*";

    this.table.Execute(TableOperation.Delete(entity));

    this.DeleteDocument(partitionKey, rowKey);
}

private void DeleteDocument(string partitionKey, string rowKey)
{
    // Same forward-slash naming as UploadDocument so the right blob is removed.
    var blobName = string.Format("project/{0}/{1}.json", partitionKey, rowKey);
    var blockBlob = this.container.GetBlockBlobReference(blobName);
    blockBlob.Delete(DeleteSnapshotsOption.IncludeSnapshots);
}
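To tie the pieces together, here is a hypothetical usage sketch. The Project and ProjectTask classes come from the previous post; I'm assuming here that Owner is a Guid that serves as the partition key, so treat the property names and shapes as illustrative rather than exact.

// Hypothetical usage sketch: Project and ProjectTask are assumed from the
// previous post, with Owner treated as a Guid used as the partition key.
var repository = new ProjectRepository();
var owner = Guid.NewGuid();

var project = new Project
{
    Owner = owner,
    Name = "Azure Storage Spike",
    StartDate = DateTime.UtcNow,
    Tasks = new List<ProjectTask>()
};

repository.Insert(project);

// Read the document back from blob storage and update it.
var loaded = repository.Load(owner.ToString(), project.Id.ToString());
loaded.Name = "Azure Storage Spike (renamed)";
repository.Update(loaded);

// List every project for this owner, then clean up.
var projects = repository.List(owner.ToString()).ToList();
repository.Delete(owner.ToString(), project.Id.ToString());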
Conclusion
Using both the Table and Blob storage services, we get the best of both worlds: we can query on a document's properties with table storage, and we can store documents larger than 64 KB in blob storage. Of course, almost every operation on my Repository now requires two calls to Azure. Currently those are made sequentially, waiting for the first call to complete before making the second one. I should fix that by using the asynchronous variants of the storage client methods, like the BeginDelete/EndDelete method pair on CloudBlockBlob.
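As a rough illustration of that idea, here's a minimal sketch (not part of the Repository above) of an asynchronous delete. It assumes the classic storage client's APM methods (CloudTable.BeginExecute/EndExecute and CloudBlockBlob.BeginDelete/EndDelete) wrapped with Task.Factory.FromAsync, and it omits the DeleteSnapshotsOption used in the synchronous version for brevity.

// Minimal sketch of an asynchronous delete; requires System.Threading.Tasks.
// Both the table call and the blob call are started before either is awaited,
// so they run concurrently instead of sequentially.
public Task DeleteAsync(string partitionKey, string rowKey)
{
    dynamic entity = new ElasticTableEntity();
    entity.PartitionKey = partitionKey;
    entity.RowKey = rowKey;
    entity.ETag = "*";

    // ElasticTableEntity implements ITableEntity, so the delete operation
    // can be built the same way as in the synchronous methods.
    ITableEntity tableEntity = entity;
    var tableTask = Task.Factory.FromAsync(
        this.table.BeginExecute, this.table.EndExecute,
        TableOperation.Delete(tableEntity), null);

    var blobName = string.Format("project/{0}/{1}.json", partitionKey, rowKey);
    var blockBlob = this.container.GetBlockBlobReference(blobName);
    var blobTask = Task.Factory.FromAsync(
        blockBlob.BeginDelete, blockBlob.EndDelete, null);

    // Snapshot deletion is omitted here for brevity.
    return Task.WhenAll(tableTask, blobTask);
}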
I hope this post gives you ideas for new and clever ways to use the Windows Azure Storage Services in your projects.
See also
- Using Azure Table Storage with dynamic table entities
- Document oriented database with Azure Table Storage Service