DynamoDB Global Secondary Indexes (GSIs) are powerful features that allow you to query data using alternate keys beyond the primary key of your base table. Unlike Local Secondary Indexes, GSIs can have a completely different partition key and sort key from the main table, providing maximum flexibility for query patterns.
A GSI behaves like a separate table that DynamoDB manages automatically. When you write data to your base table, DynamoDB asynchronously propagates the change to every associated GSI. Because of this delay, reads from a GSI are always eventually consistent; strongly consistent reads are not supported.
Key characteristics of GSIs include:
1. **Partition Key Flexibility**: You can choose any top-level scalar attribute (String, Number, or Binary) as the partition key, which makes queries on attributes outside the base table's key efficient.
2. **Optional Sort Key**: You can optionally define a sort key to enable range queries on the index.
3. **Projected Attributes**: You control which attributes are copied to the index. Options include KEYS_ONLY, INCLUDE (specific attributes), or ALL attributes.
4. **Separate Throughput**: In provisioned capacity mode, each GSI has its own read and write capacity units, independent of the base table. This is crucial for capacity planning.
5. **Sparse Indexes**: If an item lacks the GSI key attribute, it won't appear in the index. This creates sparse indexes useful for filtering specific data subsets.
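The projection and sparse-index behavior in the list above can be illustrated with a toy in-memory model. This is only a sketch of the observable behavior, not how DynamoDB implements indexes; the function name `build_index_view` and the sample attributes (`OrderID`, `CustomerID`, `Total`, `Notes`) are hypothetical.

```python
def build_index_view(items, table_keys, index_keys, projection="KEYS_ONLY", include=()):
    """Toy model: return the entries a GSI would hold for the given items."""
    entries = []
    for item in items:
        # Sparse index: items missing any index key attribute are skipped.
        if any(key not in item for key in index_keys):
            continue
        if projection == "ALL":
            entries.append(dict(item))
            continue
        # KEYS_ONLY keeps the table keys and the index keys; INCLUDE adds more.
        wanted = set(table_keys) | set(index_keys)
        if projection == "INCLUDE":
            wanted |= set(include)
        entries.append({k: v for k, v in item.items() if k in wanted})
    return entries


orders = [
    {"OrderID": "o1", "CustomerID": "c1", "Total": 40, "Notes": "gift"},
    {"OrderID": "o2", "Total": 15},  # no CustomerID, so it is not indexed
]

keys_only = build_index_view(orders, ["OrderID"], ["CustomerID"])
with_total = build_index_view(orders, ["OrderID"], ["CustomerID"], "INCLUDE", ["Total"])
```

In this model the second order never reaches the index because it has no `CustomerID`, and the `Notes` attribute is only copied when the projection is ALL.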
Best practices for GSIs:
- Design indexes based on your application's access patterns
- Consider storage costs since data is duplicated
- Monitor GSI throttling separately from base table metrics
- Project only necessary attributes to minimize storage and costs
By default, a table can have up to 20 GSIs. GSIs can be added or removed after table creation, which makes them flexible for evolving application requirements.
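As a rough sketch of adding a GSI to an existing table, the helper below builds the keyword arguments for DynamoDB's UpdateTable operation. The helper name `add_gsi_params`, the table and index names, and the capacity values are all hypothetical; the sketch also assumes string-typed key attributes and provisioned capacity mode.

```python
def add_gsi_params(table_name, index_name, partition_key, sort_key=None,
                   read_units=5, write_units=5):
    """Build keyword arguments for an UpdateTable call that creates a GSI."""
    key_schema = [{"AttributeName": partition_key, "KeyType": "HASH"}]
    # Every index key attribute must be declared; "S" (string) is assumed here.
    attribute_definitions = [{"AttributeName": partition_key, "AttributeType": "S"}]
    if sort_key is not None:
        key_schema.append({"AttributeName": sort_key, "KeyType": "RANGE"})
        attribute_definitions.append({"AttributeName": sort_key, "AttributeType": "S"})
    return {
        "TableName": table_name,
        "AttributeDefinitions": attribute_definitions,
        "GlobalSecondaryIndexUpdates": [{
            "Create": {
                "IndexName": index_name,
                "KeySchema": key_schema,
                "Projection": {"ProjectionType": "KEYS_ONLY"},
                # The index gets its own capacity, separate from the table's.
                "ProvisionedThroughput": {
                    "ReadCapacityUnits": read_units,
                    "WriteCapacityUnits": write_units,
                },
            }
        }],
    }


params = add_gsi_params("Orders", "CustomerOrderDateIndex", "CustomerID", "OrderDate")
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("dynamodb").update_table(**params)`. Removing an index uses the same call with a `{"Delete": {"IndexName": ...}}` entry in place of `Create`.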
For the AWS Developer exam, understand that GSI writes consume additional write capacity, and queries against GSIs return eventually consistent data. Proper GSI design is essential for building scalable, cost-effective DynamoDB applications.
DynamoDB Global Secondary Indexes (GSI) - Complete Guide
Why Global Secondary Indexes Are Important
Global Secondary Indexes (GSIs) are crucial for DynamoDB because they allow you to query data using attributes other than the table's primary key. In real-world applications, you often need to access the same data in multiple ways. For example, an orders table might use OrderID as the primary key, but you also need to find all orders by a specific customer or within a date range. GSIs make this possible efficiently.
What is a Global Secondary Index?
A Global Secondary Index is an index with a partition key and optional sort key that can be different from those on the base table. The term global means that queries on the index can span all of the data in the base table, across all partitions. Local Secondary Indexes, by contrast, must use the same partition key as the base table.
Key characteristics of GSIs:
- Can have a different partition key AND sort key from the base table
- Can be created at any time (during or after table creation)
- Have their own provisioned throughput settings (separate from the base table)
- Support eventual consistency only (not strongly consistent reads)
- Maximum of 20 GSIs per table
- Stored separately from the base table data
How Global Secondary Indexes Work
When you create a GSI, DynamoDB automatically maintains the index. Here is the process:
1. Index Creation: You define the partition key, optional sort key, and which attributes to project into the index.
2. Data Projection: You choose what data to include:
   - KEYS_ONLY: Only the base table's primary key and the index key attributes
   - INCLUDE: Specific attributes you specify plus keys
   - ALL: All attributes from the base table
3. Automatic Synchronization: When items are written to the base table, DynamoDB asynchronously updates all GSIs. This is why GSIs only support eventual consistency.
4. Querying: You can query the GSI using its partition key (required) and sort key (optional), just like querying a regular table.
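The querying step can be sketched for the orders scenario mentioned earlier (find a customer's orders within a date range). The helper below only builds the parameters for DynamoDB's Query operation; the helper name, the index name `CustomerOrderDateIndex`, and the attribute names `CustomerID` and `OrderDate` are assumptions for illustration, and string-typed keys are assumed.

```python
def gsi_query_params(table_name, index_name, pk_name, pk_value,
                     sk_name=None, sk_range=None):
    """Build keyword arguments for a Query call against a GSI."""
    # The index partition key is required and must be an equality condition.
    expression = "#pk = :pk"
    names = {"#pk": pk_name}
    values = {":pk": {"S": pk_value}}
    # The index sort key is optional; here it narrows the result to a range.
    if sk_name is not None and sk_range is not None:
        expression += " AND #sk BETWEEN :lo AND :hi"
        names["#sk"] = sk_name
        values[":lo"] = {"S": sk_range[0]}
        values[":hi"] = {"S": sk_range[1]}
    # ConsistentRead is left unset: GSIs only serve eventually consistent reads.
    return {
        "TableName": table_name,
        "IndexName": index_name,
        "KeyConditionExpression": expression,
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }


query = gsi_query_params(
    "Orders", "CustomerOrderDateIndex",
    "CustomerID", "c-123",
    "OrderDate", ("2024-01-01", "2024-01-31"),
)
```

With boto3 available, `boto3.client("dynamodb").query(**query)` would run the request. The only difference from a base table query is the `IndexName` parameter.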
Provisioned Throughput Considerations
GSIs have their own Read Capacity Units (RCUs) and Write Capacity Units (WCUs):
- Write operations to the base table consume WCUs from both the table AND each affected GSI
- If a GSI runs out of write capacity, the base table writes will be throttled
- RCUs for GSIs are calculated based on projected item sizes, not base table item sizes
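A back-of-the-envelope calculation makes these rules concrete. The sketch below is simplified: it assumes a brand-new item is written (so each affected index receives exactly one write), it ignores per-entry storage overhead, and the function and index names are hypothetical. It relies on the standard unit sizes: one WCU per 1 KB written, and one RCU per 4 KB read with strong consistency, which is halved for eventually consistent reads.

```python
import math

WRITE_UNIT_BYTES = 1024  # one WCU covers a write of up to 1 KB
READ_UNIT_BYTES = 4096   # one RCU covers a strongly consistent read of up to 4 KB


def put_wcus(base_item_bytes, projected_bytes_per_gsi):
    """WCUs for writing one new item: (table cost, cost per affected GSI)."""
    table_cost = math.ceil(base_item_bytes / WRITE_UNIT_BYTES)
    index_costs = {
        name: math.ceil(size / WRITE_UNIT_BYTES)
        for name, size in projected_bytes_per_gsi.items()
    }
    return table_cost, index_costs


def gsi_query_rcus(projected_item_sizes):
    """RCUs for one Query on a GSI (always eventually consistent)."""
    # Query sums the sizes of the returned entries, then rounds up to 4 KB.
    total_bytes = sum(projected_item_sizes)
    return math.ceil(total_bytes / READ_UNIT_BYTES) * 0.5


# A 2,500-byte item with one narrow index and one ALL-projection index.
table_cost, index_costs = put_wcus(2500, {"ByCustomer": 300, "ByStatus": 2500})

# Ten results: 200-byte projected entries versus full 2,000-byte entries.
narrow_read = gsi_query_rcus([200] * 10)
wide_read = gsi_query_rcus([2000] * 10)
```

Under these assumptions the single write costs 3 WCUs on the table plus 1 and 3 WCUs on the two indexes, and the query over narrow projected entries costs 0.5 RCU compared with 2.5 RCUs for full-size entries.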
To query users by department, create a GSI:
- GSI Partition Key: Department
- GSI Sort Key: JoinDate
- Projection: ALL
Now you can efficiently query all users in a department sorted by join date.
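A minimal sketch of this index definition, expressed as parameters for DynamoDB's CreateTable operation, might look like the following. The table name `Users`, the base table key `UserID`, the index name `DepartmentJoinDateIndex`, the string attribute types, and the capacity values are all assumptions, since the example does not specify them.

```python
def users_table_params():
    """Build CreateTable arguments for a users table with a department GSI."""
    throughput = {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5}
    return {
        "TableName": "Users",
        # Only key attributes (table or index) are declared up front.
        "AttributeDefinitions": [
            {"AttributeName": "UserID", "AttributeType": "S"},
            {"AttributeName": "Department", "AttributeType": "S"},
            {"AttributeName": "JoinDate", "AttributeType": "S"},
        ],
        "KeySchema": [{"AttributeName": "UserID", "KeyType": "HASH"}],
        "ProvisionedThroughput": dict(throughput),
        "GlobalSecondaryIndexes": [{
            "IndexName": "DepartmentJoinDateIndex",
            "KeySchema": [
                {"AttributeName": "Department", "KeyType": "HASH"},
                {"AttributeName": "JoinDate", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
            # Separate capacity settings for the index.
            "ProvisionedThroughput": dict(throughput),
        }],
    }


table_params = users_table_params()
```

With boto3 available, `boto3.client("dynamodb").create_table(**table_params)` would create the table and index together. Storing `JoinDate` as an ISO 8601 string (for example `2024-03-15`) is one way to make the string sort order match chronological order.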
Exam Tips: Answering Questions on DynamoDB Global Secondary Indexes
1. GSI vs LSI Recognition: If a question mentions needing a different partition key, the answer is GSI. LSIs share the same partition key as the base table.
2. Consistency Model: Remember that GSIs support eventual consistency only. If a question requires strongly consistent reads on an alternate key, GSI is not the solution.
3. Throughput and Throttling: Questions about throttling often involve GSIs. If writes to a table are being throttled but the table has capacity, check if a GSI is under-provisioned.
4. Creation Timing: GSIs can be added after table creation; LSIs cannot. This is a common exam topic.
5. Projection Strategy: Choose KEYS_ONLY or INCLUDE to reduce storage costs and write capacity consumption. Only use ALL when you frequently need most attributes.
6. Sparse Indexes: GSIs only contain items where the index key attributes exist. This creates sparse indexes useful for filtering specific subsets of data.
7. Cost Awareness: Every GSI adds write costs. For exam scenarios involving cost optimization, the correct answer typically uses fewer GSIs with carefully chosen projections.
8. Limit Knowledge: Know that the default limit is 20 GSIs per table. Questions may test this limit.