Learn Develop for Azure storage (AZ-204) with Interactive Flashcards
Master key concepts in Develop for Azure storage with this set of flashcards, each pairing an exam topic with a detailed explanation.
Perform operations on Cosmos DB containers and items using the SDK
Azure Cosmos DB SDK enables developers to programmatically interact with containers and items through various operations. The SDK is available for multiple languages including .NET, Java, Python, and JavaScript.
To begin working with Cosmos DB, you first establish a connection using CosmosClient by providing the endpoint URI and primary key. From there, you can access databases and containers.
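A minimal connection sketch, assuming the .NET SDK (Microsoft.Azure.Cosmos); the endpoint, key, and names below are placeholders:

```csharp
using Microsoft.Azure.Cosmos;

// Placeholder endpoint and key; in production, prefer Azure AD credentials
// and reuse a single CosmosClient instance for the application's lifetime.
CosmosClient client = new CosmosClient(
    "https://myaccount.documents.azure.com:443/",
    "<primary-key>");

Database database = client.GetDatabase("mydb");
Container container = database.GetContainer("mycontainer");
```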
Container Operations:
- Create containers using CreateContainerIfNotExistsAsync() method, specifying container properties like partition key path and throughput settings
- Read container properties using ReadContainerAsync()
- Replace container settings with ReplaceContainerAsync()
- Delete containers using DeleteContainerAsync()
- Query containers using GetContainerQueryIterator()
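A short sketch of a few of these container operations, continuing from the connection example above (names are placeholders):

```csharp
// Create a container partitioned on /customerId with 400 RU/s throughput.
ContainerResponse created = await database.CreateContainerIfNotExistsAsync(
    new ContainerProperties(id: "orders", partitionKeyPath: "/customerId"),
    throughput: 400);
Container orders = created.Container;

// Read the container's current properties.
ContainerResponse read = await orders.ReadContainerAsync();
Console.WriteLine(read.Resource.PartitionKeyPath);

// Delete the container when it is no longer needed.
await orders.DeleteContainerAsync();
```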
Item Operations:
- Create items using CreateItemAsync<T>(), where T is your data model class. You pass the item object and its partition key value (the SDK can also infer the key from the item itself)
- Read individual items with ReadItemAsync<T>() using the item ID and partition key
- Replace existing items using ReplaceItemAsync<T>()
- Upsert items (create or update) with UpsertItemAsync<T>()
- Delete items using DeleteItemAsync<T>()
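A sketch of these item operations against the orders container from the previous example, using a hypothetical Order model:

```csharp
// Hypothetical model; Cosmos DB items require an "id" property.
public class Order
{
    public string id { get; set; }
    public string customerId { get; set; }
    public decimal Total { get; set; }
}

var order = new Order { id = "order-1", customerId = "customer-42", Total = 99.95m };

// Create (throws CosmosException 409 Conflict if the id already exists).
await orders.CreateItemAsync(order, new PartitionKey(order.customerId));

// Read a single item by id and partition key.
ItemResponse<Order> response =
    await orders.ReadItemAsync<Order>("order-1", new PartitionKey("customer-42"));

// Upsert: creates if missing, replaces if present.
response.Resource.Total = 120.00m;
await orders.UpsertItemAsync(response.Resource, new PartitionKey("customer-42"));

// Delete by id and partition key.
await orders.DeleteItemAsync<Order>("order-1", new PartitionKey("customer-42"));
```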
Querying Items:
The SDK supports SQL-like queries through GetItemQueryIterator<T>(). You can create QueryDefinition objects with parameterized queries for security and performance. Results are returned as pages that you iterate through using FeedIterator.
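For example, a parameterized query over the hypothetical Order items above might look like this:

```csharp
// Parameterized queries avoid injection and allow the query plan to be reused.
QueryDefinition query = new QueryDefinition(
        "SELECT * FROM c WHERE c.Total >= @minTotal")
    .WithParameter("@minTotal", 50);

FeedIterator<Order> iterator = orders.GetItemQueryIterator<Order>(query);

while (iterator.HasMoreResults)
{
    FeedResponse<Order> page = await iterator.ReadNextAsync();
    foreach (Order item in page)
    {
        Console.WriteLine($"{item.id}: {item.Total}");
    }
}
```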
Best Practices:
- Always specify partition keys for optimal performance
- Use async/await patterns for non-blocking operations
- Implement proper exception handling for CosmosException
- Configure appropriate RequestOptions for consistency levels and throughput
- Dispose CosmosClient properly or use singleton pattern
- Leverage bulk operations for high-volume scenarios using AllowBulkExecution option
The SDK also supports transactional batch operations within the same partition key, enabling atomic operations across multiple items. This ensures data consistency when performing related modifications together.
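A sketch of a transactional batch, reusing the hypothetical Order model; both operations commit or roll back together because they share the same partition key:

```csharp
TransactionalBatch batch = orders.CreateTransactionalBatch(
    new PartitionKey("customer-42"));

batch.CreateItem(new Order { id = "order-2", customerId = "customer-42", Total = 10m });
batch.UpsertItem(new Order { id = "order-1", customerId = "customer-42", Total = 125m });

TransactionalBatchResponse result = await batch.ExecuteAsync();
if (!result.IsSuccessStatusCode)
{
    // The entire batch was rolled back; inspect per-operation results as needed.
    Console.WriteLine(result.StatusCode);
}
```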
Set the appropriate consistency level for Cosmos DB operations
Azure Cosmos DB offers five consistency levels that provide a spectrum between strong consistency and eventual consistency, allowing developers to make precise tradeoffs between read consistency, availability, latency, and throughput.
**Strong Consistency**: Guarantees linearizability, meaning reads always return the most recent committed version of an item. This level offers the highest consistency but may impact latency and availability across regions.
**Bounded Staleness**: Guarantees that reads lag behind writes by at most K versions of an item or a time interval T. This is ideal for applications requiring strong consistency with some tolerance for staleness, particularly in multi-region scenarios.
**Session Consistency**: The default and most widely used level. It guarantees read-your-writes, monotonic reads, and monotonic writes within a single client session. Perfect for user-centric applications where each user needs to see their own updates.
**Consistent Prefix**: Ensures reads never see out-of-order writes. If writes occur in order A, B, C, reads will see A, then A-B, then A-B-C, never seeing B before A.
**Eventual Consistency**: Provides the weakest consistency but offers the lowest latency and highest availability. Reads may return stale data, but eventually all replicas converge.
**Setting Consistency Level**:
You can configure the default consistency at the account level through the Azure Portal or programmatically. For individual read operations you can override the default with RequestOptions, but only to relax it; a per-request level cannot be stronger than the account setting:

```csharp
// Relax consistency for this read only; per-request overrides can
// weaken, but never strengthen, the account-level default.
ItemRequestOptions options = new ItemRequestOptions
{
    ConsistencyLevel = ConsistencyLevel.Eventual
};
await container.ReadItemAsync<MyItem>(id, partitionKey, options);
```
**Best Practices**:
- Use Session consistency for most scenarios as it balances performance and consistency
- Choose Strong when financial transactions require absolute accuracy
- Select Eventual for high-throughput scenarios where temporary inconsistency is acceptable
- Consider Bounded Staleness for global applications needing predictable consistency guarantees
Remember that relaxing consistency improves performance and availability while reducing RU consumption: reads at Strong or Bounded Staleness consume roughly twice the request units of the same reads at the more relaxed levels.
Implement Cosmos DB change feed notifications
Azure Cosmos DB Change Feed is a powerful feature that enables you to track and respond to changes in your Cosmos DB containers in near real time. The change feed exposes changed documents in the order they were modified, making it ideal for building reactive applications and data pipelines.
To implement change feed notifications, you have several approaches:
**1. Azure Functions Trigger:**
The simplest method is using Azure Functions with a Cosmos DB trigger. The function automatically executes whenever documents are inserted or updated in your container. You configure this through the function.json bindings or using attributes in your code, specifying the database name, container name, and a lease container for tracking progress.
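A minimal sketch of an in-process C# function, assuming v4 of the Cosmos DB extension and reusing the hypothetical Order model from the first card; database, container, and connection names are placeholders:

```csharp
using System.Collections.Generic;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class OrdersChanged
{
    [FunctionName("OrdersChanged")]
    public static void Run(
        [CosmosDBTrigger(
            databaseName: "mydb",
            containerName: "orders",
            Connection = "CosmosDBConnection",   // app setting holding the connection
            LeaseContainerName = "leases",
            CreateLeaseContainerIfNotExists = true)]
        IReadOnlyList<Order> changes,
        ILogger log)
    {
        foreach (Order doc in changes)
        {
            log.LogInformation("Changed document: {Id}", doc.id);
        }
    }
}
```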
**2. Change Feed Processor Library:**
For more control, use the Change Feed Processor in the .NET SDK. This involves creating a ChangeFeedProcessor instance that monitors your container. You define a delegate handler that processes each batch of changes. The processor handles partitioning, checkpointing, and load balancing across multiple instances automatically.
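A sketch under the same assumptions (an orders container, the Order model, and a pre-created leases container):

```csharp
Container leases = client.GetDatabase("mydb").GetContainer("leases");
Container monitored = client.GetDatabase("mydb").GetContainer("orders");

ChangeFeedProcessor processor = monitored
    .GetChangeFeedProcessorBuilder<Order>(
        processorName: "orderProcessor",
        onChangesDelegate: async (IReadOnlyCollection<Order> changes, CancellationToken ct) =>
        {
            // Batch handler; real work (and awaits) goes here.
            foreach (Order item in changes)
            {
                Console.WriteLine($"Changed: {item.id}");
            }
        })
    .WithInstanceName("host-1")        // unique per host; enables load balancing
    .WithLeaseContainer(leases)
    .Build();

await processor.StartAsync();
// ... keep the host alive until shutdown ...
await processor.StopAsync();
```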
**3. Pull Model:**
The SDK also supports a pull-based approach where you explicitly request changes using FeedIterator. This gives you fine-grained control over when and how you read changes from specific partitions.
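A pull-model sketch over the same monitored container:

```csharp
FeedIterator<Order> pull = monitored.GetChangeFeedIterator<Order>(
    ChangeFeedStartFrom.Beginning(),
    ChangeFeedMode.LatestVersion);

while (pull.HasMoreResults)
{
    FeedResponse<Order> page = await pull.ReadNextAsync();
    if (page.StatusCode == System.Net.HttpStatusCode.NotModified)
    {
        // Caught up; persist page.ContinuationToken and poll again later.
        break;
    }
    foreach (Order item in page)
    {
        Console.WriteLine($"Changed: {item.id}");
    }
}
```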
**Key Implementation Steps:**
- Create a lease container to store checkpoints and coordinate between multiple readers
- Configure the start time or continuation token to specify where to begin reading
- Implement error handling and retry logic for resilience
- Process changes idempotently since the same change might be delivered multiple times
**Common Use Cases:**
- Real-time data synchronization between services
- Triggering notifications or workflows based on data changes
- Maintaining materialized views or search indexes
- Event sourcing and audit logging
The change feed captures inserts and updates but does not track deletions. For delete scenarios, implement soft deletes using a TTL property or deletion flag.
Set and retrieve Blob Storage properties and metadata
Azure Blob Storage allows developers to work with properties and metadata to manage and organize blobs effectively. Properties are system-defined attributes that describe the blob's characteristics, while metadata consists of user-defined name-value pairs that provide custom information about the blob.
**Blob Properties:**
Properties include system-managed values such as ContentType, ContentLength, LastModified, ETag, and ContentEncoding. These help identify the blob's format, size, and state. You can retrieve and modify certain properties using the Azure SDK or REST API.
**Setting Properties:**
Using the Azure.Storage.Blobs SDK in .NET, you can set properties through the BlobClient class:
```csharp
BlobClient blobClient = containerClient.GetBlobClient("myblob.txt");
BlobHttpHeaders headers = new BlobHttpHeaders
{
    ContentType = "text/plain",
    CacheControl = "max-age=3600"
};
// Note: SetHttpHeadersAsync replaces the blob's HTTP headers as a set;
// any header left unset here is cleared.
await blobClient.SetHttpHeadersAsync(headers);
```
**Retrieving Properties:**
To fetch properties, use the GetPropertiesAsync method:
```csharp
BlobProperties properties = await blobClient.GetPropertiesAsync();
Console.WriteLine($"Content Type: {properties.ContentType}");
Console.WriteLine($"Last Modified: {properties.LastModified}");
```
**Blob Metadata:**
Metadata enables you to attach custom key-value pairs to blobs for categorization, tagging, or tracking purposes.
**Setting Metadata:**
```csharp
IDictionary<string, string> metadata = new Dictionary<string, string>
{
    { "author", "JohnDoe" },
    { "category", "documents" }
};
await blobClient.SetMetadataAsync(metadata);
```
**Retrieving Metadata:**
```csharp
BlobProperties props = await blobClient.GetPropertiesAsync();
foreach (var item in props.Metadata)
{
    Console.WriteLine($"{item.Key}: {item.Value}");
}
```
**REST API Approach:**
You can also use HTTP headers with REST calls. Properties use x-ms-blob-* headers, while metadata uses x-ms-meta-* prefixed headers.
Both properties and metadata are essential for building robust storage solutions, enabling better organization, caching strategies, and application-specific data management in Azure Blob Storage.
Perform Blob Storage data operations using the SDK
Azure Blob Storage SDK provides developers with powerful tools to perform data operations programmatically. The SDK is available for multiple languages including .NET, Java, Python, and JavaScript, enabling seamless integration with your applications.
To get started, you need to install the Azure.Storage.Blobs NuGet package for .NET applications. First, create a BlobServiceClient using your storage account connection string or Azure Active Directory credentials for authentication.
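A minimal setup sketch; the account name, container name, and connection string are placeholders:

```csharp
using Azure.Identity;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

// Option 1: connection string from configuration.
BlobServiceClient service = new BlobServiceClient("<connection-string>");

// Option 2: Azure Active Directory via DefaultAzureCredential.
BlobServiceClient serviceAad = new BlobServiceClient(
    new Uri("https://myaccount.blob.core.windows.net"),
    new DefaultAzureCredential());
```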
Key operations include:
**Container Operations:**
- Create containers using BlobContainerClient.CreateIfNotExistsAsync()
- Delete containers with DeleteAsync()
- List containers using GetBlobContainersAsync()
- Set access policies and metadata
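Continuing the sketch above, creating and listing containers might look like this:

```csharp
BlobContainerClient container = service.GetBlobContainerClient("mycontainer");
await container.CreateIfNotExistsAsync();

// Enumerate all containers in the account.
await foreach (BlobContainerItem c in service.GetBlobContainersAsync())
{
    Console.WriteLine(c.Name);
}
```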
**Blob Upload Operations:**
- Upload files using BlobClient.UploadAsync() method
- Stream large files efficiently with OpenWriteAsync() (exposed on BlockBlobClient)
- Set blob properties like content type and cache control
- Add custom metadata to blobs
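An upload sketch with headers and metadata set in the same call (the file name is illustrative):

```csharp
BlobClient blob = container.GetBlobClient("report.pdf");

var uploadOptions = new BlobUploadOptions
{
    HttpHeaders = new BlobHttpHeaders { ContentType = "application/pdf" },
    Metadata = new Dictionary<string, string> { { "department", "finance" } }
};

using FileStream stream = File.OpenRead("report.pdf");
await blob.UploadAsync(stream, uploadOptions);
```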
**Blob Download Operations:**
- Download blobs using DownloadAsync() or DownloadToAsync()
- Stream content for large files using OpenReadAsync()
- Access blob properties and metadata
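Downloading the same blob, either to a file or as a stream:

```csharp
// Download straight to a local file.
await blob.DownloadToAsync("report-copy.pdf");

// Or open a read stream for large content, avoiding full buffering.
using Stream reader = await blob.OpenReadAsync();
```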
**Blob Management:**
- List blobs in a container using GetBlobsAsync()
- Copy blobs between containers or storage accounts
- Delete blobs with DeleteAsync() or DeleteIfExistsAsync()
- Create snapshots for point-in-time copies
- Set blob tiers (Hot, Cool, Archive)
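For example, a management sketch that lists blobs and moves stale ones to the Cool tier:

```csharp
await foreach (BlobItem item in container.GetBlobsAsync())
{
    // Demote blobs untouched for 30+ days (the threshold is illustrative).
    if (item.Properties.LastModified < DateTimeOffset.UtcNow.AddDays(-30))
    {
        await container.GetBlobClient(item.Name)
            .SetAccessTierAsync(AccessTier.Cool);
    }
}
```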
**Advanced Features:**
- Implement leases for concurrency control
- Use batch operations for multiple blob actions
- Configure retry policies for resilience
- Implement progress tracking for uploads and downloads
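As one example of these features, a lease sketch for exclusive write access (GetBlobLeaseClient lives in Azure.Storage.Blobs.Specialized):

```csharp
using Azure.Storage.Blobs.Specialized;

BlobLeaseClient lease = blob.GetBlobLeaseClient();

// Leases run 15-60 seconds, or infinite with TimeSpan.FromSeconds(-1).
BlobLease acquired = await lease.AcquireAsync(TimeSpan.FromSeconds(30));
// ... other clients cannot modify the blob while the lease is held ...
await lease.ReleaseAsync();
```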
Error handling is essential when working with the SDK. Wrap operations in try-catch blocks to handle RequestFailedException for storage-specific errors.
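For instance (the 404 filter and file name are illustrative):

```csharp
try
{
    await blob.DownloadToAsync("report-copy.pdf");
}
catch (Azure.RequestFailedException ex) when (ex.Status == 404)
{
    // ErrorCode carries the storage-specific code, e.g. "BlobNotFound".
    Console.WriteLine($"Download failed: {ex.ErrorCode}");
}
```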
For optimal performance, use async methods throughout your code and consider parallel operations for bulk transfers. The SDK also supports cancellation tokens for long-running operations, allowing users to cancel requests when needed.
Implement storage policies and data lifecycle management
Implementing storage policies and data lifecycle management in Azure is essential for optimizing costs and maintaining efficient storage operations. Azure Blob Storage offers built-in lifecycle management capabilities that allow you to automatically transition data between access tiers and delete outdated content based on predefined rules.
Lifecycle management policies are JSON-based rule sets that define actions to perform on blobs when certain conditions are met. These policies support three primary actions: transitioning blobs to cooler storage tiers (Hot to Cool, Cool to Archive), deleting blobs, and deleting blob snapshots or versions.
To create a lifecycle policy, you define rules with filters and actions. Filters specify which blobs the rule applies to using prefix matches and blob types. Actions define what happens when age-based conditions are satisfied, measured in days since the blob was created, last modified, or (if access tracking is enabled) last accessed.
For example, you might configure a policy to move blobs older than 30 days from Hot tier to Cool tier, transition blobs older than 90 days to Archive tier, and delete blobs that exceed 365 days. This tiered approach significantly reduces storage costs for infrequently accessed data.
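A policy implementing that example might look like the following JSON (the rule name and prefix are placeholders):

```json
{
  "rules": [
    {
      "enabled": true,
      "name": "age-based-tiering",
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": [ "blockBlob" ],
          "prefixMatch": [ "logs/" ]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": { "daysAfterModificationGreaterThan": 30 },
            "tierToArchive": { "daysAfterModificationGreaterThan": 90 },
            "delete": { "daysAfterModificationGreaterThan": 365 }
          }
        }
      }
    }
  ]
}
```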
You can implement these policies through the Azure Portal, Azure CLI, PowerShell, REST API, or ARM templates. The Azure Storage SDK also provides programmatic access for managing these policies within your applications.
Best practices include analyzing your data access patterns before implementing policies, using Azure Storage Analytics to understand blob access frequency, and testing policies in development environments first. Consider implementing soft delete before aggressive deletion policies to provide recovery options.
Immutable storage policies offer another dimension of data governance, allowing you to store business-critical data in a WORM (Write Once, Read Many) state. These policies ensure regulatory compliance and protect against accidental or malicious modifications.
Combining lifecycle management with access tier optimization and immutability policies creates a comprehensive data governance strategy that balances cost efficiency with data protection requirements.