Recommend a high availability solution for semi-structured and unstructured data
When designing high availability solutions for semi-structured and unstructured data in Azure, architects must weigh several key services and strategies.
For semi-structured data such as JSON, XML, or key-value pairs, Azure Cosmos DB is usually the first choice. It offers multi-region replication with automatic failover, and its SLA reaches 99.999% availability for reads and writes when the account spans multiple regions (the write figure assumes multi-region writes are enabled). Cosmos DB supports multiple consistency levels, allowing architects to balance availability against data consistency based on business requirements.
For unstructured data such as documents, images, videos, and logs, Azure Blob Storage with geo-redundant storage (GRS) or geo-zone-redundant storage (GZRS) protects against regional outages. GRS replicates data to a secondary region hundreds of miles away, while GZRS combines zone redundancy within the primary region with geo-replication. For read access during regional outages, the RA-GRS and RA-GZRS options let applications read from the secondary region.
Azure Data Lake Storage Gen2 serves hybrid scenarios where semi-structured and unstructured data coexist, offering hierarchical namespace capabilities with the same redundancy options as Blob Storage. For document-centric workloads, combining Azure AI Search (formerly Azure Cognitive Search) with these storage solutions keeps content both searchable and highly available.
Key architectural recommendations include deploying storage accounts across multiple regions, using Azure Front Door or Traffic Manager for intelligent routing, and establishing backup policies with Azure Backup. Monitoring through Azure Monitor, with alerts for replication lag or availability issues, supports proactive management.
The solution should also incorporate soft delete and versioning for data protection, along with immutable storage policies for compliance requirements. Cost optimization involves selecting appropriate access tiers (hot, cool, archive) while maintaining redundancy requirements based on recovery time objectives (RTO) and recovery point objectives (RPO) defined by the business.
Why This Topic Is Important
In modern cloud architectures, semi-structured data (JSON, XML, logs) and unstructured data (images, videos, documents, backups) represent a significant portion of enterprise data. Ensuring high availability for this data is critical for business continuity, as downtime can lead to revenue loss, customer dissatisfaction, and operational disruptions. The AZ-305 exam tests your ability to recommend appropriate solutions that balance availability, cost, and performance requirements.
What Is High Availability for Semi-Structured and Unstructured Data?
High availability (HA) refers to systems designed to remain operational and accessible for extended periods, typically measured as a percentage of uptime (e.g., 99.9% or 99.99%). For semi-structured and unstructured data in Azure, this involves:
• Azure Blob Storage - Primary service for unstructured data
• Azure Data Lake Storage Gen2 - Optimized for big data analytics
• Azure Cosmos DB - For semi-structured data requiring global distribution
• Azure Files - Managed file shares for unstructured data
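To make the uptime percentages concrete, the following minimal sketch converts an availability target into the downtime it allows per year. The function name `allowed_downtime_hours` is illustrative, and the calculation assumes a 365-day year; actual SLAs are typically measured per billing month and have their own definitions of downtime.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (assumption)


def allowed_downtime_hours(availability_percent: float) -> float:
    """Maximum yearly downtime, in hours, permitted by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# Compare the targets that commonly appear in exam scenarios.
for target in (99.9, 99.99, 99.999):
    hours = allowed_downtime_hours(target)
    print(f"{target}% -> {hours:.2f} h/year ({hours * 60:.1f} min/year)")
```

Under these assumptions, 99.9% allows roughly 8.76 hours of downtime per year, while 99.999% allows only a little over 5 minutes, which is why each extra "nine" usually demands a more redundant (and more expensive) design.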
How It Works: Storage Redundancy Options
Local Redundancy:
• LRS (Locally Redundant Storage) - 3 copies within a single datacenter (99.999999999% durability)
• ZRS (Zone-Redundant Storage) - 3 copies across availability zones in one region
Geo-Redundancy:
• GRS (Geo-Redundant Storage) - 6 copies: 3 in primary region, 3 in secondary region
• GZRS (Geo-Zone-Redundant Storage) - Combines ZRS in primary region with LRS in secondary
• RA-GRS / RA-GZRS - Read-access versions allowing reads from secondary region
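The redundancy options above can be summarized as a small lookup table. The sketch below is a study aid only: the `Redundancy` class, the `OPTIONS` table, and the helper names are hypothetical and are not part of any Azure SDK. The `copy_survives` helper answers a durability question (does at least one copy remain?), not whether the data stays immediately accessible without a failover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Redundancy:
    name: str
    zone_redundant: bool      # primary copies spread across availability zones
    geo_replicated: bool      # three more copies kept in a secondary region
    secondary_readable: bool  # RA-* variants expose a read-only secondary


# Hypothetical study table built from the definitions in the list above.
OPTIONS = {
    "LRS": Redundancy("LRS", False, False, False),
    "ZRS": Redundancy("ZRS", True, False, False),
    "GRS": Redundancy("GRS", False, True, False),
    "RA-GRS": Redundancy("RA-GRS", False, True, True),
    "GZRS": Redundancy("GZRS", True, True, False),
    "RA-GZRS": Redundancy("RA-GZRS", True, True, True),
}


def total_copies(option: Redundancy) -> int:
    """Three copies in the primary region, plus three more if geo-replicated."""
    return 6 if option.geo_replicated else 3


def copy_survives(option: Redundancy, failure: str) -> bool:
    """Return True if at least one copy outlives the given failure scope."""
    if failure == "node":
        return True  # every option keeps three copies on separate hardware
    if failure == "datacenter":
        return option.zone_redundant or option.geo_replicated
    if failure == "region":
        return option.geo_replicated
    raise ValueError(f"unknown failure scope: {failure}")


for opt in OPTIONS.values():
    print(opt.name, total_copies(opt), copy_survives(opt, "region"))
```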
Key Azure Services and Their HA Features
Azure Blob Storage:
• Supports all redundancy options (LRS, ZRS, GRS, GZRS, and the read-access variants RA-GRS and RA-GZRS)
• Soft delete and versioning for data protection
• Object replication to asynchronously copy block blobs to a storage account in another region
• Access tiers: Hot, Cool, Cold, Archive
Azure Data Lake Storage Gen2:
• Built on Blob Storage with hierarchical namespace
• Supports the same redundancy options as Blob Storage, including ZRS, GRS, and GZRS
• Ideal for analytics workloads requiring HA
Azure Cosmos DB:
• Multi-region writes for maximum availability
• Automatic failover capabilities
• Five consistency levels to balance availability and consistency
• 99.999% availability SLA with multi-region configuration
Azure Files:
• Supports LRS, ZRS, GRS, and GZRS (geo-redundant options apply to standard file shares)
• Azure File Sync for hybrid scenarios
• Premium tier for low-latency requirements
How to Choose the Right Solution
Consider these factors when recommending HA solutions:
1. RPO (Recovery Point Objective) - How much data loss is acceptable?
2. RTO (Recovery Time Objective) - How quickly must services recover?
3. Read requirements during outages - Use RA-GRS/RA-GZRS if needed
4. Cost constraints - GRS costs more than LRS
5. Compliance requirements - Data residency regulations may affect choices
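The failure-scope part of these factors can be expressed as a simple decision rule. The sketch below is a deliberately simplified illustration: `recommend_redundancy` and its parameters are hypothetical names, and the rule ignores cost, RPO/RTO targets, and data residency, all of which would need to be weighed in a real recommendation.

```python
def recommend_redundancy(
    survive_region_outage: bool,
    survive_zone_outage: bool,
    read_during_outage: bool,
) -> str:
    """Pick a storage redundancy option from three yes/no requirements.

    Simplified study aid: cost, compliance, and RPO/RTO are not modeled.
    """
    if survive_region_outage:
        # Geo-replication is required; add zone redundancy if requested.
        base = "GZRS" if survive_zone_outage else "GRS"
        # Read access to the secondary only exists on geo-replicated options.
        return f"RA-{base}" if read_during_outage else base
    # No regional protection needed: choose between zone and local redundancy.
    return "ZRS" if survive_zone_outage else "LRS"


# Example: regional protection plus reads during an outage, no zone requirement.
print(recommend_redundancy(True, False, True))  # RA-GRS
```

Note that `read_during_outage` has no effect without regional protection in this sketch, because only the geo-replicated options have a secondary region to read from.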
Exam Tips: Answering Questions on This Topic
• Tip 1: When a question mentions regional failure protection, choose GRS, GZRS, or their read-access variants
• Tip 2: If the scenario requires reading data during primary region outage, select RA-GRS or RA-GZRS
• Tip 3: For datacenter-level protection within a region, ZRS is the appropriate choice
• Tip 4: When cost optimization is emphasized and datacenter or regional outages are acceptable risks, LRS may be sufficient
• Tip 5: Questions about semi-structured data with global distribution often point to Azure Cosmos DB with multi-region writes
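• Tip 6: Look for keywords such as "analytics workloads" and "big data", which suggest Azure Data Lake Storage Gen2
• Tip 7: Remember that GZRS (or RA-GZRS) offers the strongest combination of availability and durability for blob storage, because it pairs zone redundancy in the primary region with geo-replication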
• Tip 8: If a question mentions file shares with HA requirements, Azure Files with appropriate redundancy is the answer
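• Tip 9: For Cosmos DB questions, understand that strong consistency across multiple regions increases write latency and can reduce availability compared to eventual consistency, and that strong consistency cannot be combined with multi-region writes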
• Tip 10: Always match the solution to stated SLA requirements - 99.99% vs 99.999% availability needs affect your recommendation
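To illustrate the read-access variants from Tip 2, the sketch below shows one way an application might fall back to the secondary region when the primary is unreachable. With RA-GRS or RA-GZRS, the secondary Blob endpoint uses the account name with a `-secondary` suffix. Everything else here is an assumption for illustration: `read_with_fallback`, the injected `fetch` callable, and the simulated outage are hypothetical, and a real application would more likely rely on the retry options of the Azure Storage client library.

```python
from typing import Callable


def blob_endpoints(account: str) -> tuple[str, str]:
    """Return the (primary, secondary) Blob endpoints for a storage account."""
    primary = f"https://{account}.blob.core.windows.net"
    secondary = f"https://{account}-secondary.blob.core.windows.net"
    return primary, secondary


def read_with_fallback(
    account: str, path: str, fetch: Callable[[str], bytes]
) -> bytes:
    """Try the primary endpoint first, then the read-only secondary."""
    primary, secondary = blob_endpoints(account)
    try:
        return fetch(f"{primary}/{path}")
    except ConnectionError:
        # Secondary data is replicated asynchronously, so it may be
        # slightly behind the primary (this lag is the effective RPO).
        return fetch(f"{secondary}/{path}")


def simulated_fetch(url: str) -> bytes:
    """Stand-in for an HTTP client: pretends the primary region is down."""
    if "-secondary." not in url:
        raise ConnectionError("primary region unavailable (simulated)")
    return b"report contents"


# "examplestore" and the blob path are made-up names for the demonstration.
print(read_with_fallback("examplestore", "docs/report.pdf", simulated_fetch))
```

Because the secondary endpoint is read-only, this pattern covers reads only; writes still require the primary region or an account failover.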