Custom IAM Policies and Least Privilege
Custom IAM Policies and Least Privilege are fundamental concepts in AWS security and governance, especially critical for Data Engineers managing sensitive data pipelines and infrastructure.
**Custom IAM Policies** are JSON-based documents that define granular permissions for AWS resources. Unlike AWS-managed policies, custom policies are created by administrators to address specific organizational requirements. They consist of key elements: Effect (Allow/Deny), Action (specific API operations), Resource (targeted AWS resources via ARNs), and optional Conditions (contextual constraints like IP ranges, time, or tags). For example, a data engineer might create a custom policy that allows read-only access to specific S3 buckets containing production data while denying access to buckets with PII data.
**Least Privilege** is a security principle stating that users, roles, and services should be granted only the minimum permissions necessary to perform their tasks, and nothing more. This reduces the blast radius of potential security breaches and limits accidental or malicious data exposure.
In practice, implementing least privilege involves:
1. **Starting with zero permissions** and incrementally adding only what's needed.
2. **Using IAM Access Analyzer** to review and refine policies based on actual usage patterns.
3. **Leveraging resource-level permissions** to restrict actions to specific datasets, tables, or pipelines rather than broad service-level access.
4. **Applying conditions** such as `aws:RequestedRegion`, `s3:prefix`, or `aws:PrincipalTag` to further narrow access.
5. **Regularly auditing permissions** using tools like IAM Access Advisor and AWS CloudTrail.
For data engineering workflows, this means Glue jobs should have roles that only access required S3 paths, Redshift users should only query permitted schemas, and Lambda functions should interact with only designated data stores. Key best practices include using **service control policies (SCPs)** at the organization level, implementing **permission boundaries** to cap maximum permissions, and employing **tag-based access control** for dynamic and scalable policy management. Together, custom IAM policies and least privilege form the backbone of a robust data governance strategy on AWS.
Custom IAM Policies and Least Privilege – AWS Data Engineer Associate Guide
Why Custom IAM Policies and Least Privilege Matter
In any AWS environment—especially one handling sensitive data pipelines, analytics workloads, and ETL processes—security is paramount. The principle of least privilege states that every identity (user, role, or service) should be granted only the minimum permissions necessary to perform its intended function. Custom IAM policies are the mechanism through which you enforce this principle with precision. Without them, organizations often rely on AWS-managed policies that may be overly permissive, increasing the blast radius of a compromised credential or misconfigured service.
For data engineers, this is critically important because data pipelines often touch multiple services—Amazon S3, AWS Glue, Amazon Redshift, Amazon Kinesis, AWS Lambda, and more. A single overly broad policy could expose sensitive data, allow unauthorized transformations, or permit unintended deletions across your entire data lake.
What Are Custom IAM Policies?
AWS Identity and Access Management (IAM) policies are JSON documents that define permissions. There are several types:
• AWS Managed Policies – Pre-built by AWS, covering common use cases (e.g., AmazonS3ReadOnlyAccess). They are convenient but often grant broader access than needed.
• Customer Managed Policies – Created and maintained by you. These are reusable across multiple identities within your account and offer full control over the permissions granted.
• Inline Policies – Embedded directly within a single user, group, or role. Useful for strict one-to-one relationships between a policy and an identity but harder to manage at scale.
A custom IAM policy is either a customer managed policy or an inline policy that you write yourself to precisely define what actions are allowed or denied, on which resources, and under what conditions.
How Custom IAM Policies Work
Every IAM policy document consists of one or more statements. Each statement includes:
• Effect – Either Allow or Deny.
• Action – The specific API actions (e.g., s3:GetObject, glue:StartJobRun, redshift-data:ExecuteStatement).
• Resource – The ARN(s) of the specific AWS resources the statement applies to (e.g., a particular S3 bucket or Glue database).
• Condition (optional) – Additional constraints such as IP address restrictions, time-based access, MFA requirements, tags, or encryption requirements.
Example – A Least Privilege Policy for a Glue ETL Job:
Imagine a Glue job that reads from an S3 source bucket and writes to a destination bucket. A least privilege custom policy would look like:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::source-data-bucket",
        "arn:aws:s3:::source-data-bucket/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:PutObject"],
      "Resource": "arn:aws:s3:::destination-data-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": ["glue:GetDatabase", "glue:GetTable", "glue:GetPartitions"],
      "Resource": "*"
    }
  ]
}
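Notice how the policy grants only the specific actions needed (read from source, write to destination, read Glue catalog metadata) rather than granting broad s3:* or glue:* permissions. The Glue statement still uses "Resource": "*" for brevity; in a production policy, scope it to the ARNs of the specific catalog, databases, and tables the job reads.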
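As a rough illustration of how such a policy might be generated per pipeline rather than hand-edited, the sketch below builds the same document in Python for any pair of bucket names. The function name build_glue_etl_policy and its parameters are hypothetical and chosen for this example only; it only produces the JSON and does not call AWS.

```python
import json


def build_glue_etl_policy(source_bucket: str, destination_bucket: str) -> dict:
    """Return a policy document shaped like the Glue ETL example above.

    Hypothetical helper: only the bucket names vary, the actions stay fixed.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Read-only access to the source bucket and its objects
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{source_bucket}",
                    f"arn:aws:s3:::{source_bucket}/*",
                ],
            },
            {   # Write-only access to objects in the destination bucket
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{destination_bucket}/*",
            },
            {   # Read Glue Data Catalog metadata (resource left broad, as in the example)
                "Effect": "Allow",
                "Action": ["glue:GetDatabase", "glue:GetTable", "glue:GetPartitions"],
                "Resource": "*",
            },
        ],
    }


policy = build_glue_etl_policy("source-data-bucket", "destination-data-bucket")
print(json.dumps(policy, indent=2))
```

Generating the document from a template like this keeps the action list identical across jobs, so a review only has to check which buckets each job was given.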
Key Concepts for Least Privilege Implementation
1. Scope Actions Narrowly – Instead of s3:*, specify exactly which S3 actions are required (e.g., s3:GetObject, s3:PutObject).
2. Scope Resources Specifically – Avoid using "Resource": "*" whenever possible. Target specific bucket ARNs, Glue database ARNs, Redshift cluster ARNs, etc.
3. Use Conditions for Additional Guardrails – Add conditions such as:
• aws:SourceVpc or aws:SourceVpce to restrict access to a specific VPC
• s3:x-amz-server-side-encryption to enforce encryption
• aws:PrincipalTag or aws:ResourceTag for tag-based access control (ABAC)
• aws:MultiFactorAuthPresent for sensitive operations
4. Use Explicit Deny Statements – Deny statements override any Allow, making them powerful guardrails. For example, you can deny deletion of production S3 objects regardless of other policies.
5. Leverage IAM Access Analyzer – AWS IAM Access Analyzer can generate least privilege policies based on actual CloudTrail activity. This is extremely useful for refining broad policies into precise ones.
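6. Use Service Control Policies (SCPs) – In AWS Organizations, SCPs define the maximum permissions available to the member accounts they are attached to. They act as guardrails and do not grant permissions themselves, so they complement IAM policies at the organizational level.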
7. Permissions Boundaries – These act as a maximum permissions ceiling for an IAM entity. Even if a policy grants broad access, the permissions boundary limits what is actually effective.
8. Review and Rotate Regularly – Use IAM credential reports and Access Advisor to identify unused permissions and remove them.
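Points 1 and 2 above lend themselves to a simple automated check. The following is a minimal sketch, not an AWS tool: find_wildcards is a hypothetical helper that scans the Allow statements of a policy document for service-wide actions such as s3:* and for "Resource": "*". It ignores NotAction, NotResource, and Condition blocks, so treat its output as a starting point for review rather than a verdict.

```python
from typing import List


def _as_list(value):
    # IAM accepts either a single string/object or a list in these fields
    return [value] if isinstance(value, (str, dict)) else list(value)


def find_wildcards(policy: dict) -> List[str]:
    """Flag Allow statements that use broad actions or a wildcard resource."""
    findings = []
    for index, statement in enumerate(_as_list(policy.get("Statement", []))):
        if statement.get("Effect") != "Allow":
            continue  # Deny statements with wildcards are usually intentional
        for action in _as_list(statement.get("Action", [])):
            if action == "*" or action.endswith(":*"):
                findings.append(f"statement {index}: broad action '{action}'")
        for resource in _as_list(statement.get("Resource", [])):
            if resource == "*":
                findings.append(f"statement {index}: wildcard resource '*'")
    return findings


broad_policy = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}],
}
scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::source-data-bucket/*",
        }
    ],
}

for finding in find_wildcards(broad_policy):
    print(finding)
print(find_wildcards(scoped_policy))
```

For real policies, IAM Access Analyzer's policy validation and policy generation features cover far more cases than a string check like this one.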
How Policy Evaluation Works
AWS evaluates policies using the following logic:
1. By default, all requests are implicitly denied.
2. An explicit Allow in a policy overrides the implicit deny.
3. An explicit Deny always overrides any Allow.
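4. Multiple policy types are evaluated together: identity-based policies, resource-based policies, SCPs, permissions boundaries, and session policies. SCPs, permissions boundaries, and session policies only limit permissions and never grant them, so a request must still be allowed by an identity-based or resource-based policy.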
Understanding this evaluation order is critical for exam questions that present scenarios where access is unexpectedly denied or allowed.
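The first three rules can be sketched as a toy evaluator. This is a deliberately simplified model and not how AWS implements evaluation: it looks only at Effect, Action, and Resource, uses shell-style wildcard matching from Python's fnmatch module as an approximation of IAM wildcards, and ignores conditions, principals, SCPs, permissions boundaries, and session policies. The bucket name prod-data-lake is a made-up example.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    return [value] if isinstance(value, (str, dict)) else list(value)


def _statement_matches(statement: dict, action: str, resource: str) -> bool:
    # Action names are compared case-insensitively, resource ARNs case-sensitively
    action_ok = any(
        fnmatchcase(action.lower(), pattern.lower())
        for pattern in _as_list(statement["Action"])
    )
    resource_ok = any(
        fnmatchcase(resource, pattern) for pattern in _as_list(statement["Resource"])
    )
    return action_ok and resource_ok


def evaluate(policies: list, action: str, resource: str) -> str:
    """Return 'ExplicitDeny', 'Allow', or 'ImplicitDeny' for one request."""
    decision = "ImplicitDeny"  # rule 1: everything starts out denied
    for policy in policies:
        for statement in _as_list(policy["Statement"]):
            if not _statement_matches(statement, action, resource):
                continue
            if statement["Effect"] == "Deny":
                return "ExplicitDeny"  # rule 3: a matching Deny always wins
            decision = "Allow"  # rule 2: a matching Allow lifts the implicit deny
    return decision


broad_allow = {"Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}]}
delete_guardrail = {
    "Statement": [
        {
            "Effect": "Deny",
            "Action": "s3:DeleteObject",
            "Resource": "arn:aws:s3:::prod-data-lake/*",
        }
    ]
}
obj = "arn:aws:s3:::prod-data-lake/sales/2024/part-0.parquet"

print(evaluate([broad_allow, delete_guardrail], "s3:GetObject", obj))
print(evaluate([broad_allow, delete_guardrail], "s3:DeleteObject", obj))
print(evaluate([delete_guardrail], "s3:GetObject", obj))
```

The second call shows why an explicit Deny works as a guardrail: the broad Allow matches the delete request too, but the Deny takes precedence regardless of the order in which the policies are listed.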
Common Data Engineering Scenarios
• S3 Data Lake Access – Grant Glue crawlers read-only access to specific S3 prefixes; grant ETL jobs write access only to processed-data prefixes.
• Cross-Account Data Sharing – Use resource-based policies (S3 bucket policies) combined with IAM role assumption to allow another account's pipeline to access your data with limited permissions.
• Redshift Spectrum – The IAM role attached to Redshift needs s3:GetObject and s3:ListBucket on the specific external data buckets, plus Glue catalog read permissions.
• Kinesis Data Streams – Producer roles need kinesis:PutRecord and kinesis:PutRecords on specific streams; consumer roles need kinesis:GetRecords, kinesis:GetShardIterator, and kinesis:DescribeStream.
• Lake Formation – Works on top of IAM by adding fine-grained column-level and row-level permissions to the Glue Data Catalog, further enforcing least privilege for analytics queries.
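To make the Kinesis scenario concrete, here is a small sketch that emits separate producer and consumer policies using exactly the actions listed above. The helper name kinesis_stream_policy, the role labels, and the stream ARN (account 111122223333, stream clickstream) are placeholders for illustration; real consumers may need further actions depending on the client library they use.

```python
import json

# Actions per role, taken from the Kinesis Data Streams scenario above
KINESIS_ACTIONS = {
    "producer": ["kinesis:PutRecord", "kinesis:PutRecords"],
    "consumer": [
        "kinesis:GetRecords",
        "kinesis:GetShardIterator",
        "kinesis:DescribeStream",
    ],
}


def kinesis_stream_policy(stream_arn: str, role: str) -> dict:
    """Build a policy limited to one stream and one role ('producer' or 'consumer')."""
    if role not in KINESIS_ACTIONS:
        raise ValueError(f"unknown role: {role!r}")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": KINESIS_ACTIONS[role],
                "Resource": stream_arn,  # a single stream, never "*"
            }
        ],
    }


# Placeholder ARN for illustration only
stream_arn = "arn:aws:kinesis:us-east-1:111122223333:stream/clickstream"
producer_policy = kinesis_stream_policy(stream_arn, "producer")
consumer_policy = kinesis_stream_policy(stream_arn, "consumer")
print(json.dumps(producer_policy, indent=2))
print(json.dumps(consumer_policy, indent=2))
```

Keeping producers and consumers on separate roles means a compromised producer credential cannot read the stream, and a compromised consumer cannot write to it.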
Exam Tips: Answering Questions on Custom IAM Policies and Least Privilege
1. Always choose the most restrictive answer that still works. If one option uses s3:* and another uses s3:GetObject and s3:PutObject, choose the more specific one. The exam rewards least privilege thinking.
2. Watch for "Resource": "*" traps. If a question asks about best practices and one answer specifies exact ARNs while another uses wildcards, the specific ARN answer is almost always correct.
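3. Understand the difference between identity-based and resource-based policies. For cross-account access, a resource-based policy (like an S3 bucket policy) can grant access directly to a principal in another account without role assumption, but that principal's own account must still allow the action in an identity-based policy. The alternative is to assume a role in the resource-owning account, which requires the role's trust policy to name the calling account or principal.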
4. Know when to use Deny statements. Questions may describe a scenario where a broad managed policy is attached but certain dangerous actions (like s3:DeleteObject) must be prevented. An explicit Deny in a custom policy is the correct approach.
5. Recognize IAM Access Analyzer scenarios. If a question asks how to generate or refine least privilege policies based on actual usage, IAM Access Analyzer is the answer. If it asks about identifying external access to resources, Access Analyzer also handles that.
6. Understand Permissions Boundaries. Exam questions may describe a delegated admin scenario where you want to allow a team to create roles but limit the permissions those roles can have. Permissions boundaries are the solution.
7. Conditions are key differentiators. If a question requires access control based on tags, encryption status, VPC, IP range, or MFA, look for the answer that uses IAM policy conditions—not network-level controls alone.
8. Lake Formation vs. IAM for data governance. For fine-grained access control at the database, table, column, or row level in a data lake, Lake Formation is the preferred answer over raw IAM policies on S3.
9. Service-linked roles and AWS-managed policies are not customizable. If the question asks about customizing permissions for a specific use case, the answer involves creating a custom IAM policy (customer managed or inline), not modifying an AWS-managed one.
10. Policy evaluation logic matters. If a scenario describes conflicting policies (one allows, one denies), remember: explicit Deny always wins. If no explicit Allow exists, access is denied by default. SCPs can restrict even account-root-level actions in member accounts of an organization.
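The permissions boundary scenario in tip 6 comes down to an intersection: an action is effective only if both the role's identity-based policy and its boundary allow it. The sketch below models just that idea under simplifying assumptions (Allow statements only, wildcard matching approximated with Python's fnmatch, no Deny statements or conditions). The bucket prefix analytics-team- and the account ID are invented for the example.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    return [value] if isinstance(value, (str, dict)) else list(value)


def allows(policy: dict, action: str, resource: str) -> bool:
    """True if any Allow statement in the policy matches the request."""
    for statement in _as_list(policy["Statement"]):
        if statement["Effect"] != "Allow":
            continue
        action_ok = any(
            fnmatchcase(action.lower(), pattern.lower())
            for pattern in _as_list(statement["Action"])
        )
        resource_ok = any(
            fnmatchcase(resource, pattern)
            for pattern in _as_list(statement["Resource"])
        )
        if action_ok and resource_ok:
            return True
    return False


def effective_allow(identity_policy: dict, boundary: dict, action: str, resource: str) -> bool:
    # Both documents must allow the request; the boundary alone grants nothing
    return allows(identity_policy, action, resource) and allows(boundary, action, resource)


# A role the delegated team created with an overly broad policy
team_role_policy = {"Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}]}
# The boundary the central admins require on every role the team creates
boundary = {
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": "arn:aws:s3:::analytics-team-*/*",
        }
    ]
}

print(effective_allow(team_role_policy, boundary, "s3:GetObject",
                      "arn:aws:s3:::analytics-team-raw/events.parquet"))
print(effective_allow(team_role_policy, boundary, "iam:CreateUser",
                      "arn:aws:iam::111122223333:user/new-user"))
```

Even though the team's policy allows everything, only the S3 reads and writes inside the boundary remain effective, which is the behavior the delegated admin scenario relies on.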
By mastering these concepts, you will be well-prepared to handle any question on custom IAM policies and least privilege on the AWS Data Engineer Associate exam.