Azure Table Storage: Complete Guide for DP-900 Exam
Azure Table Storage is a core non-relational (NoSQL) data service in Microsoft Azure that stores large amounts of structured data as schema-less key-value entities. Understanding it is essential for the DP-900: Microsoft Azure Data Fundamentals exam, where it is a frequently tested topic under non-relational data workloads on Azure.
Why Is Azure Table Storage Important?
Azure Table Storage plays a significant role in the Azure ecosystem for several reasons:
- Massive Scalability: It can store terabytes of structured data and serve them efficiently, making it ideal for web-scale applications.
- Cost-Effectiveness: Compared to traditional relational databases or even Azure Cosmos DB, Table Storage is significantly cheaper for simple key-value lookup scenarios.
- Schema-less Design: Each row (entity) in a table can have a different set of properties, providing flexibility that relational databases cannot offer without schema migrations.
- Fast Key-Based Access: Data is indexed by a combination of PartitionKey and RowKey, enabling extremely fast lookups.
- Part of Azure Storage Account: It is included in general-purpose Azure Storage accounts, so no provisioning is needed beyond creating the storage account.
What Is Azure Table Storage?
Azure Table Storage is a NoSQL key-value store hosted in the cloud. It stores data as collections of entities within tables. Here are the key concepts:
1. Storage Account: The top-level resource. Azure Table Storage exists as a service within an Azure Storage Account.
2. Table: A collection of entities. Unlike relational database tables, Azure tables do not enforce a fixed schema. You can have multiple tables within a single storage account.
3. Entity: An entity is analogous to a row in a relational database. Each entity is a set of properties (key-value pairs). An entity can hold up to 252 custom properties and can be up to 1 MB in size.
4. Properties: A property is a name-value pair. Each entity always has three system properties:
- PartitionKey: A string value that identifies the partition to which the entity belongs. This is the primary mechanism for distributing data across storage nodes.
- RowKey: A string value that uniquely identifies the entity within its partition. Together with the PartitionKey, it forms the composite primary key.
- Timestamp: A system-maintained value that records when the entity was last modified.
5. Schema-less: Different entities within the same table can have different sets of properties. There is no need to define the structure of the data in advance.
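The concepts above can be pictured with a small in-memory model. This is only a sketch for study purposes, not the Azure SDK: the `table` dictionary and the `upsert_entity` helper are hypothetical names, and in the real service the Timestamp is set by the server rather than by client code.

```python
from datetime import datetime, timezone

# Toy model of one table: entities are keyed by the composite
# primary key (PartitionKey, RowKey). This is not the Azure SDK.
table = {}

def upsert_entity(table, partition_key, row_key, **custom_properties):
    """Insert or replace an entity and refresh its Timestamp."""
    entity = {
        "PartitionKey": partition_key,
        "RowKey": row_key,
        # Assumption for the sketch: the real service maintains Timestamp itself.
        "Timestamp": datetime.now(timezone.utc),
    }
    entity.update(custom_properties)
    table[(partition_key, row_key)] = entity
    return entity

# Two entities in the same table with different property sets (schema-less).
upsert_entity(table, "customers", "001", name="Ada", email="ada@example.com")
upsert_entity(table, "customers", "002", name="Grace", loyalty_points=120)

# The same RowKey in a different partition identifies a different entity.
upsert_entity(table, "suppliers", "001", company="Contoso")

print(len(table))  # 3
```

Keying the dictionary on the pair of keys mirrors the idea that neither key alone identifies an entity.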
How Does Azure Table Storage Work?
Data Partitioning:
Azure Table Storage uses the PartitionKey to distribute data across multiple storage nodes automatically. All entities with the same PartitionKey belong to the same partition and are served together by the same storage node. This design enables:
- Scalability: Azure can spread different partitions across many servers to handle high volumes of data and requests.
- Performance: Queries that target a single partition are fast because they only need to access one node.
- Entity Group Transactions (Batch Operations): You can perform atomic batch transactions on entities that share the same PartitionKey, allowing up to 100 operations in a single batch.
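The batch rules described above (one shared PartitionKey, at most 100 operations, all-or-nothing) can be sketched as follows. This is a toy in-memory model under stated assumptions, not how the service is implemented: `apply_batch` is a hypothetical helper, and the table is again a plain dictionary keyed by (PartitionKey, RowKey).

```python
import copy

MAX_BATCH_OPERATIONS = 100  # limit stated in the text above

def apply_batch(table, operations):
    """Apply (partition_key, row_key, properties) upserts atomically.

    Toy model: `table` is a dict keyed by (PartitionKey, RowKey).
    """
    if len(operations) > MAX_BATCH_OPERATIONS:
        raise ValueError("a batch may contain at most 100 operations")
    partition_keys = {pk for pk, _, _ in operations}
    if len(partition_keys) > 1:
        raise ValueError("all entities in a batch must share one PartitionKey")

    staged = copy.deepcopy(table)  # work on a copy so a failure changes nothing
    for pk, rk, props in operations:
        if not isinstance(props, dict):
            raise TypeError("properties must be a dict")
        staged[(pk, rk)] = {"PartitionKey": pk, "RowKey": rk, **props}
    # Only reached when every operation succeeded.
    table.clear()
    table.update(staged)

orders = {}
apply_batch(orders, [
    ("order-42", "line-1", {"sku": "A"}),
    ("order-42", "line-2", {"sku": "B"}),
])
try:
    # Two different PartitionKeys in one batch are rejected.
    apply_batch(orders, [("order-42", "line-3", {}), ("order-43", "line-1", {})])
except ValueError as error:
    print("rejected:", error)
print(len(orders))  # 2
```

Staging the changes on a copy is just one simple way to show the all-or-nothing behavior in a few lines.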
Data Access:
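- The most efficient query is a point query, which specifies both the PartitionKey and RowKey. Because the two keys together form the indexed primary key, it locates a single entity directly without scanning.
- A partition scan specifies the PartitionKey but not the RowKey, so the service reads through the entities in that one partition and returns those that match any additional filter.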
- A table scan does not specify a PartitionKey, which means Azure must scan the entire table — this is the least efficient query type.
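The difference between the three query types can be illustrated by counting how many entities each one has to read. The sketch below is a toy in-memory layout, not the service itself: the `partitions` data and the three function names are made up for illustration.

```python
# Toy layout: partitions[PartitionKey][RowKey] -> entity (a dict of properties).
partitions = {
    "sensor-a": {"0001": {"temp": 20.5}, "0002": {"temp": 21.0}},
    "sensor-b": {"0001": {"temp": 18.2}},
    "sensor-c": {"0001": {"temp": 25.1}, "0002": {"temp": 30.4}},
}

def point_query(partition_key, row_key):
    """Both keys known: go straight to one entity."""
    entity = partitions.get(partition_key, {}).get(row_key)
    found = [entity] if entity is not None else []
    return found, len(found)

def partition_scan(partition_key, predicate=lambda e: True):
    """Only PartitionKey known: read every entity in that one partition."""
    rows = partitions.get(partition_key, {})
    return [e for e in rows.values() if predicate(e)], len(rows)

def table_scan(predicate):
    """No PartitionKey: read every entity in every partition."""
    examined, matches = 0, []
    for rows in partitions.values():
        for entity in rows.values():
            examined += 1
            if predicate(entity):
                matches.append(entity)
    return matches, examined

def is_warm(entity):
    return entity["temp"] >= 21

for label, (matches, examined) in [
    ("point query", point_query("sensor-a", "0002")),
    ("partition scan", partition_scan("sensor-a", is_warm)),
    ("table scan", table_scan(is_warm)),
]:
    print(f"{label}: {len(matches)} match(es), {examined} entities examined")
```

With this sample data the point query reads 1 entity, the partition scan reads 2, and the table scan reads all 5, which is the ordering the exam expects you to know.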
Supported Data Types:
Properties in Azure Table Storage support several data types, including String, Int32, Int64, Double, Boolean, DateTime, GUID, and Binary (byte arrays up to 64 KB).
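As a memory aid, the sketch below maps Python values onto the type names listed above. The mapping and the `table_type_for` helper are illustrative assumptions, not the behavior of any Azure client library, which may choose types differently.

```python
import uuid
from datetime import datetime

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1
MAX_BINARY_BYTES = 64 * 1024  # Binary limit stated in the text above

def table_type_for(value):
    """Return the Table Storage type name that fits a Python value."""
    # bool is tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        if INT32_MIN <= value <= INT32_MAX:
            return "Int32"
        if INT64_MIN <= value <= INT64_MAX:
            return "Int64"
        raise ValueError("integer does not fit in Int64")
    if isinstance(value, float):
        return "Double"
    if isinstance(value, str):
        return "String"
    if isinstance(value, datetime):
        return "DateTime"
    if isinstance(value, uuid.UUID):
        return "GUID"
    if isinstance(value, (bytes, bytearray)):
        if len(value) > MAX_BINARY_BYTES:
            raise ValueError("Binary values are limited to 64 KB")
        return "Binary"
    raise TypeError(f"unsupported property type: {type(value).__name__}")

for sample in (True, 7, 2**40, 3.14, "hello", uuid.uuid4(), b"\x00\x01"):
    print(type(sample).__name__, "->", table_type_for(sample))
```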
Access Methods:
- REST API: Azure Table Storage exposes a RESTful OData endpoint for CRUD operations.
- Azure SDKs: Client libraries are available for .NET, Java, Python, JavaScript/Node.js, and more.
- Azure Storage Explorer: A desktop GUI tool for browsing and managing table data.
- Azure Portal: Basic table management is available through the portal.
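To make the REST access method more concrete, the sketch below builds the address of a single entity in the form used by the Table service endpoint. The `entity_url` helper and the account and table names are hypothetical, the URL assumes the public Azure cloud endpoint, and a real request would also need authentication and the required headers, which are omitted here.

```python
from urllib.parse import quote

def entity_url(account, table, partition_key, row_key):
    """Build the REST address of a single entity (the target of a point query)."""
    def odata_literal(key):
        # OData string literals escape a single quote by doubling it;
        # other unsafe characters are percent-encoded.
        return quote(key.replace("'", "''"), safe="'")
    return (
        f"https://{account}.table.core.windows.net/"
        f"{table}(PartitionKey='{odata_literal(partition_key)}',"
        f"RowKey='{odata_literal(row_key)}')"
    )

print(entity_url("myaccount", "Customers", "Smith", "001"))
# https://myaccount.table.core.windows.net/Customers(PartitionKey='Smith',RowKey='001')
```

Note how both keys appear in the address, which is why a point query can target exactly one entity.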
Azure Table Storage vs. Azure Cosmos DB Table API:
Microsoft also offers the Azure Cosmos DB Table API (now named Azure Cosmos DB for Table), which is API-compatible with Azure Table Storage but provides additional capabilities:
- Global distribution and multi-region replication
- Single-digit millisecond latency with SLA guarantees
- Automatic secondary indexing on all properties
- Five consistency models (Strong, Bounded Staleness, Session, Consistent Prefix, Eventual)
- Higher throughput capacity
Azure Table Storage, by contrast, is simpler, cheaper, and only indexes by PartitionKey and RowKey. For the exam, you should know that Cosmos DB Table API is the premium alternative to Azure Table Storage, and applications can often migrate from Table Storage to Cosmos DB Table API with minimal code changes.
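The comparison above boils down to a simple rule of thumb, sketched here as a small function. The feature labels and the `suggest_table_service` helper are invented for this illustration and only encode the distinctions listed in this guide, not an official decision procedure.

```python
# Requirements that, per the comparison above, point to Cosmos DB Table API.
COSMOS_ONLY_FEATURES = {
    "global_distribution",
    "latency_sla",
    "secondary_indexes",
    "tunable_consistency",
}

def suggest_table_service(requirements):
    """Return (service name, sorted list of requirements that drove the choice)."""
    needed = COSMOS_ONLY_FEATURES & set(requirements)
    if needed:
        return "Azure Cosmos DB Table API", sorted(needed)
    return "Azure Table Storage", []

print(suggest_table_service({"low_cost", "key_value_lookups"}))
print(suggest_table_service({"low_cost", "global_distribution"}))
```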
Common Use Cases for Azure Table Storage:
- Storing user profile data for web applications
- Storing device telemetry and IoT data
- Address books and contact information
- Logging and diagnostic data
- Configuration settings and metadata
- Any scenario requiring a simple, flexible, high-volume key-value store
Limitations to Be Aware Of:
- Maximum entity size is 1 MB
- Maximum number of properties per entity is 255 (including the 3 system properties)
- No support for secondary indexes (only PartitionKey and RowKey are indexed)
- No support for complex queries, joins, or relationships
- No foreign keys or referential integrity
- Limited query capabilities compared to SQL or even Cosmos DB
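Some of these limits can be checked before an entity is ever sent to the service. The sketch below is a hypothetical client-side check (`limit_violations` is not part of any Azure library) covering the property count and the 64 KB Binary limit. It does not check the 1 MB total, because how the service measures entity size is not described in this guide.

```python
SYSTEM_PROPERTIES = {"PartitionKey", "RowKey", "Timestamp"}
MAX_PROPERTIES = 255            # includes the 3 system properties
MAX_BINARY_BYTES = 64 * 1024    # per Binary property

def limit_violations(entity):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key in ("PartitionKey", "RowKey"):
        if not isinstance(entity.get(key), str):
            problems.append(f"{key} must be present and a string")
    custom = [name for name in entity if name not in SYSTEM_PROPERTIES]
    if len(custom) + len(SYSTEM_PROPERTIES) > MAX_PROPERTIES:
        problems.append(f"{len(custom)} custom properties exceeds the limit of 252")
    for name, value in entity.items():
        if isinstance(value, (bytes, bytearray)) and len(value) > MAX_BINARY_BYTES:
            problems.append(f"{name}: Binary value larger than 64 KB")
    return problems

ok = {"PartitionKey": "p", "RowKey": "r", "name": "Ada"}
too_wide = {"PartitionKey": "p", "RowKey": "r", **{f"c{i}": i for i in range(253)}}
print(limit_violations(ok))        # []
print(limit_violations(too_wide))
```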
Exam Tips: Answering Questions on Azure Table Storage
Tip 1 - Know the Key Components: Always remember the core hierarchy: Storage Account → Table → Entity → Properties. If a question asks what uniquely identifies an entity, the answer is the combination of PartitionKey + RowKey.
Tip 2 - Understand PartitionKey and RowKey: This is the most commonly tested concept. The PartitionKey determines how data is distributed across servers, and the RowKey uniquely identifies an entity within a partition. Together, they form the composite primary key. Questions may ask which key determines data distribution (answer: PartitionKey).
Tip 3 — Schema-less Nature: If a question describes a scenario where different records need different sets of columns or properties, and a low-cost NoSQL solution is needed, Azure Table Storage is often the correct answer.
Tip 4 - Differentiate from Cosmos DB Table API: If the question mentions requirements like global distribution, guaranteed low latency, multiple consistency levels, or automatic indexing of all properties, the answer is likely Azure Cosmos DB Table API, not plain Azure Table Storage. If the requirement is simply low-cost, basic key-value storage, the answer is Azure Table Storage.
Tip 5 - Differentiate from Blob Storage: Azure Table Storage stores structured, tabular, non-relational data. Azure Blob Storage stores unstructured data like images, videos, and documents. If the question involves structured key-value data, choose Table Storage.
Tip 6 - Differentiate from Relational Databases: If a question scenario requires joins, complex queries, foreign keys, stored procedures, or strict schemas, the answer is a relational service like Azure SQL Database, not Table Storage. Table Storage is for simple, fast, schema-less lookups.
Tip 7 - Batch Transactions: Remember that batch (Entity Group) transactions are only supported for entities within the same partition (same PartitionKey). If a question asks about atomic operations across partitions, the answer is that this is not supported in Azure Table Storage.
Tip 8 - Query Performance: Know that point queries (specifying both PartitionKey and RowKey) are the fastest. Table scans (no PartitionKey specified) are the slowest and least efficient. Exam questions may test your understanding of which query pattern performs best.
Tip 9 — Part of Azure Storage Account: Azure Table Storage is not a standalone service. It is one of four data services within an Azure Storage Account (alongside Blob, Queue, and File storage). Questions may reference this grouping.
Tip 10 — Size Limits: Remember the 1 MB per entity limit and the 255-property maximum. If a question describes entities larger than 1 MB, Table Storage would not be appropriate.
By mastering these concepts and tips, you will be well prepared to answer DP-900 exam questions on Azure Table Storage.