Data sources in Terraform allow you to fetch and reference information from external sources or existing infrastructure that is managed outside of your current Terraform configuration. They enable you to query data from providers, remote state files, or external APIs to use within your configuration.
A data block is the configuration construct used to define a data source. The syntax follows this pattern:
data "provider_type" "local_name" {
  # query parameters
}
For example, to retrieve information about an existing AWS AMI:
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }
}
Key characteristics of data sources include:
1. Read-only operations: Data sources only read information; they do not create, modify, or destroy infrastructure resources.
2. Dependency management: Terraform automatically determines when to fetch data based on dependencies in your configuration.
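3. Refresh behavior: Terraform reads data sources during the planning phase, so they are queried again on every terraform plan or terraform apply. If a data source's arguments depend on values that are not known until apply, the read is deferred to the apply phase.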
4. Reference syntax: You reference data source attributes using data.type.name.attribute format, such as data.aws_ami.ubuntu.id.
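The dependency and refresh behavior described above can be sketched as follows. This is an illustrative fragment rather than a recommended layout: the resource name main, the CIDR range, and the choice of the aws_route_table data source are assumptions made for the example. Because vpc_id refers to a resource attribute, Terraform reads the data source only after the VPC exists; on the first run that ID is unknown at plan time, so the read is deferred to apply.

```hcl
# A managed resource: Terraform creates this VPC.
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16" # hypothetical range
}

# A data source that depends on the resource above. It looks up the
# main route table that AWS creates automatically for the VPC.
data "aws_route_table" "main" {
  vpc_id = aws_vpc.main.id

  filter {
    name   = "association.main"
    values = ["true"]
  }
}

# Reference syntax: data.<type>.<name>.<attribute>
output "main_route_table_id" {
  value = data.aws_route_table.main.id
}
```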
Common use cases for data sources:
- Querying existing infrastructure details like VPC IDs, subnet information, or security groups
- Fetching the latest AMI IDs for EC2 instances
- Reading outputs from remote Terraform state files using terraform_remote_state
- Retrieving secrets from external secret management systems
- Looking up availability zones in a region
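As one possible illustration of the remote state use case, the sketch below reads an output from another configuration. The bucket name, state key, region, and the assumption that the other configuration declares an output named vpc_id are all hypothetical.

```hcl
# Read the outputs published by a separate Terraform configuration.
data "terraform_remote_state" "network" {
  backend = "s3"

  config = {
    bucket = "example-terraform-state"   # hypothetical bucket
    key    = "network/terraform.tfstate" # hypothetical state key
    region = "us-east-1"                 # hypothetical region
  }
}

# Use one of those outputs here. This assumes the other
# configuration declares: output "vpc_id" { ... }
resource "aws_security_group" "app" {
  name   = "app"
  vpc_id = data.terraform_remote_state.network.outputs.vpc_id
}
```

Only root-level outputs of the other configuration are visible this way; its resources are not exposed directly.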
Data sources are essential for integrating Terraform configurations with existing infrastructure, enabling dynamic configurations that adapt to current environment states, and promoting code reusability by avoiding hardcoded values. They bridge the gap between managed and unmanaged resources in your infrastructure.
Data Sources and Data Blocks in Terraform
What Are Data Sources in Terraform?
Data sources in Terraform allow you to fetch or query information from external sources or existing infrastructure that was created outside of your current Terraform configuration. They enable you to reference resources that already exist in your cloud provider or other systems.
Why Are Data Sources Important?
Data sources are crucial because they:
• Allow you to reference existing infrastructure that Terraform does not manage
• Enable dynamic configuration by fetching current values from your provider
• Help avoid hardcoding values like AMI IDs, VPC IDs, or availability zones
• Facilitate integration between different Terraform configurations or workspaces
• Support the principle of keeping configurations DRY (Don't Repeat Yourself)
How Data Blocks Work
Data sources are defined using the data block. The syntax follows this pattern:
data "provider_type" "local_name" {
  # Query parameters
}
For example, to fetch information about an existing AWS AMI:
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }
}
You reference data source attributes using: data.provider_type.local_name.attribute
Example: data.aws_ami.ubuntu.id
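A minimal sketch of how that reference might be used in a resource is shown here. The data block is repeated so the fragment stands alone; the resource name web and the instance type are assumptions chosen for illustration.

```hcl
# Look up the most recent matching Ubuntu 20.04 image.
data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"]

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }
}

# Launch an instance from whatever image the lookup returned,
# instead of hardcoding an AMI ID.
resource "aws_instance" "web" {
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t3.micro" # hypothetical size
}
```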
Key Differences: Data Sources vs Resources
• Resources create, update, and delete infrastructure
• Data sources only read information - they are read-only
• Resources use the resource block; data sources use the data block
• Data sources do not manage lifecycle; they simply query existing data
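The contrast can be seen side by side in a small sketch. It assumes a VPC tagged Name = shared-vpc already exists outside this configuration; the tag value, the local names, and the CIDR range are hypothetical.

```hcl
# Data source: looks up a VPC that already exists. Nothing is
# created, changed, or destroyed by this block.
data "aws_vpc" "existing" {
  tags = {
    Name = "shared-vpc" # hypothetical tag value
  }
}

# Resource: Terraform creates this subnet and manages its lifecycle.
resource "aws_subnet" "app" {
  vpc_id     = data.aws_vpc.existing.id
  cidr_block = "10.0.1.0/24" # hypothetical; must fall inside the VPC's range
}
```

Running terraform destroy on this configuration would remove the subnet but leave the looked-up VPC untouched.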
Common Use Cases
• Looking up the latest AMI ID for a specific operating system
• Fetching availability zones in a region
• Referencing an existing VPC or subnet
• Getting information about an existing IAM role or policy
• Reading secrets from a secrets manager
Exam Tips: Answering Questions on Data Sources and Data Blocks
1. Remember the syntax: Data sources always start with the data keyword, not resource
2. Reference format: Know that data source attributes are accessed using data.TYPE.NAME.ATTRIBUTE - the prefix "data." distinguishes them from resources
3. Read-only nature: If a question asks about modifying or creating infrastructure, data sources are not the answer - they only retrieve information
4. Dependencies: Data sources can depend on resources and vice versa; Terraform handles the dependency graph
5. Filtering: Many data sources support filters to narrow down results; understand that filters help select specific resources
6. When to use: Choose data sources when the question involves existing infrastructure not managed by current Terraform code
7. Provider requirement: Data sources require their respective provider to be configured, just like resources
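8. Refresh behavior: Data sources are read during the planning phase, on every terraform plan and terraform apply; if their arguments depend on values not known until apply, the read is deferred to the apply phase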
9. Multiple results: Some data sources return single items while others (like aws_availability_zones) return lists
10. Watch for trick questions: Ensure you distinguish between creating new resources versus querying existing ones
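Tips 7 and 9 can be tied together in one short sketch. The region and the output names are assumptions for illustration; the point is that the data source needs a configured provider and that its names attribute is a list, so it is indexed or measured rather than used as a single value.

```hcl
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

# The data source below is served by this provider configuration.
provider "aws" {
  region = "us-east-1" # hypothetical region
}

# Returns a list of zone names rather than a single item.
data "aws_availability_zones" "available" {
  state = "available"
}

# Pick one element from the list.
output "first_zone" {
  value = data.aws_availability_zones.available.names[0]
}

# Count the elements in the list.
output "zone_count" {
  value = length(data.aws_availability_zones.available.names)
}
```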