All posts by Jonathan Jenkyn

Implementing least privilege access for Amazon Bedrock

Post Syndicated from Jonathan Jenkyn original https://aws.amazon.com/blogs/security/implementing-least-privilege-access-for-amazon-bedrock/

Generative AI applications often involve a combination of various services and features—such as Amazon Bedrock and large language models (LLMs)—to generate content and to access potentially confidential data. This combination requires strong identity and access management controls and is special in the sense that those controls need to be applied on various levels. In this blog post, you will review the scenarios and approaches where you can apply least privilege access to applications using Amazon Bedrock. To fully benefit from the guidance in this post, you need an understanding of AWS APIs, AWS Identity and Access Management (IAM) policies, and AWS security services.

Let’s start by defining the principle of least privilege (PoLP): The PoLP is a security concept that advises granting the minimal level of access—or permissions—necessary for users, programs, or systems to perform their tasks. The main idea is that the fewer permissions an entity has, the lower the risk of malicious or accidental damage. Applying the PoLP to your use of AWS serves two purposes:

  • Security: By limiting access, you reduce the potential impact of a security incident. If a user or service has minimal permissions, the scope for any damage can be significantly reduced.
  • Operational simplicity: Managing permissions can become complex if not properly managed and maintained. Applying the PoLP to your access controls early helps keep configurations as manageable as possible. Finally, there are regulatory frameworks that require separation of duty between roles and a documented strategy for access controls, which can be achieved in part by adhering to the PoLP.

Amazon Bedrock is a fully managed AWS service that makes high-performing foundation models (FMs) available through a single unified API. You use Amazon Bedrock through AWS APIs, which expose actions for the control plane and administration such as the configuration of Amazon Bedrock Guardrails and Amazon Bedrock Agents, in addition to data plane functional actions such as inference.

Generally, the path to using Amazon Bedrock for a production workload includes the following stages:

  • Model selection: Decide on the required features (Retrieval Augmented Generation (RAG), fine-tuning, and so on), evaluate and select a model, and approve a EULA if necessary.
  • Model adaptation: Prompt engineering, integration of Amazon Bedrock into the application, and addition of model customization if desired.
  • Model testing: Validate and test the solution.
  • Model operation: Deploy the solution and make it available. Monitor and operate the solution.

In the following sections, we go through each phase and outline how you can apply the PoLP.

Model selection

In this phase, you choose the features and models that are needed to fulfill your requirements and define how you will apply the PoLP. These can include, for example, model customization, Retrieval Augmented Generation (RAG) or the use of agents.

Security should be integrated into the design so that the defined controls can be implemented during the development phase. One approach to define the necessary security controls is threat modeling. Doing this exercise early in the process will simplify the upcoming phases. The results can be used later to decide on the required guardrails, potential changes to the architecture, and test cases.

In this phase, you will also decide how the solution should be deployed. Customers typically operate in a multi-account setup; therefore, the selection of target organizational units (OUs) and accounts is required. We recommend creating a new OU for generative AI applications. For details, see the deep-dive chapter on generative AI in the AWS Security Reference Architecture. We will talk later about service control policies (SCPs) and how they can be used to restrict permissions. The generative AI OU is a good place to enforce those guardrails.

Amazon Bedrock provides access to a variety of high-performing FMs from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon. In this stage, you need to choose the models that you’ll use and approve them. With a third-party FM, approval might include accepting a EULA. You can limit identities and the models that they can subscribe to in order to follow compliance with EULAs that have been reviewed by your legal department. The following is an example of an SCP that allows account operators to enable all Anthropic FMs and a single Meta Llama FM.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAcceptingModelEULAs",
      "Effect": "Allow",
      "Action": [
        "aws-marketplace:Subscribe"
      ],
      "Resource": "*",
      "Condition": {
        "ForAnyValue:StringEquals": {
          "aws-marketplace:ProductId": [
            "c468b48a-84df-43a4-8c46-8870630108a7",
            "b0eb9475-3a2c-43d1-94d3-56756fd43737",
            "prod-6dw3qvchef7zy",
            "prod-m5ilt4siql27k",
            "prod-ozonys2hmmpeu",
            "prod-fm3feywmwerog",
            "prod-2c2yc2s3guhqy"
          ]
        }
      }
    },
    {
      "Sid": "AllowUnsubscribingFromModels",
      "Effect": "Allow",
      "Action": [
        "aws-marketplace:Unsubscribe",
        "aws-marketplace:ViewSubscriptions"
      ],
      "Resource": "*"
    }
  ]
}

While this approach works well if you’re only allowlisting actions, you might have highly privileged users that already have broad access to AWS Marketplace APIs. In such a case, you can follow a deny all except a few approach. Such a policy, using the same models as before, would look like the following example:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyAcceptingAllExceptCertainModelEULAs",
      "Effect": "Deny",
      "Action": [
        "aws-marketplace:Subscribe"
      ],
      "Resource": "*",
      "Condition": {
        "ForAllValues:StringNotEquals": {
          "aws-marketplace:ProductId": [
            "c468b48a-84df-43a4-8c46-8870630108a7",
            "b0eb9475-3a2c-43d1-94d3-56756fd43737",
            "prod-6dw3qvchef7zy",
            "prod-m5ilt4siql27k",
            "prod-ozonys2hmmpeu",
            "prod-fm3feywmwerog",
            "prod-2c2yc2s3guhqy"
          ]
        }
      }
    },
    {
      "Sid": "DenyUnsubscribingAllExceptCertainModels",
      "Effect": "Deny",
      "Action": [
        "aws-marketplace:Unsubscribe",
        "aws-marketplace:ViewSubscriptions"
      ],
      "Resource": "*",
      "Condition": {
        "ForAllValues:StringNotEquals": {
          "aws-marketplace:ProductId": [
            "c468b48a-84df-43a4-8c46-8870630108a7",
            "b0eb9475-3a2c-43d1-94d3-56756fd43737",
            "prod-6dw3qvchef7zy",
            "prod-m5ilt4siql27k",
            "prod-ozonys2hmmpeu",
            "prod-fm3feywmwerog",
            "prod-2c2yc2s3guhqy"
          ]
        }
      }
    }
  ]
}

You can find the required product IDs used in the condition in Grant IAM permissions to request access to Amazon Bedrock foundation models.

Model adaptation

In this phase, the solution is built—that is, code is written. This is mostly identical to traditional software development, however there are some areas specific to generative AI, such as prompt engineering, prompt guardrails, model monitoring, and agent design. In this post, we focus solely on the identity and access management aspects.

Adaptation is the phase where the detailed permission sets are created. Data perimeters can be used as a conceptual tool to define and implement guardrails. Because data perimeters are typically coarse grained, they aren’t sufficient to achieve the goal of the PoLP. However, in combination with fine-grained policies, they support a defense-in-depth approach. The following data perimeters exist:

  • Identity: Only trusted identities are allowed in my network, only trusted identities can access my resources.
  • Resource: My identities can access only trusted resources, only trusted resources can be accessed from my network.
  • Network: My identities can access resources only from expected networks, my resources can only be accessed from expected networks.

For applications that use Amazon Bedrock, you can use a virtual private cloud (VPC) network construct with Amazon Virtual Private Cloud (Amazon VPC) to host them. Doing so means that you can then use AWS PrivateLink to create VPC endpoints for both data and control plane APIs. Using PrivateLink to create endpoints, it’s possible to provide access to Amazon Bedrock for VPC-bound compute resources (such as Amazon Elastic Compute Cloud (Amazon EC2), or AWS Lambda) without the need for an internet gateway. In other words, you can deploy these resources entirely in private subnets. By using resource-based policies on these endpoints, you can restrict the principals, actions, resources, and conditions related to making API calls.

Let’s assume you have a VPC with an EC2 instance running in a private subnet hosting an application that uses Amazon Bedrock model invocations and have created an interface VPC endpoint to connect to the Amazon Bedrock data plane. The EC2 instance is configured to use an instance profile using the <rolename> IAM role and needs to be able to invoke a single Anthropic’s Claude Instant FM through an Amazon Bedrock InvokeModel API call. You can apply the PoLP to the containing VPC, and thus the EC2 instance, with a custom policy on the Amazon Bedrock interface VPC endpoint. To use the following policy in your own account, replace the default interface VPC endpoint policy with the following example, replacing <rolename> with the role you want to allow and <account-id> with your 12-digit account number.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowInvokingClaudeInstantV1Models",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<account-id>:role/<rolename>"
      },
      "Action": [
        "bedrock:InvokeModel"
      ],
      "Resource": "arn:aws:bedrock:*::foundation-model/anthropic.claude-instant-v1"
    }
  ]
}

Check out Security Objective 2: Implement a data perimeter using VPC endpoint policies for more information about this data perimeter approach.

You can define the allowed models that can be used in Amazon Bedrock directly in the policy. However, if you have multiple applications that use Amazon Bedrock, you might have to update multiple policies when a new model is allowed to be used. To complement the data perimeter approach, you can add an SCP to limit the models that can be used for inference. Because Amazon Bedrock is using a simple API (InvokeModel and Converse) for inference, a condition element in an IAM policy can be used to deny the use of unapproved models. Note that while the two policies (the SCP and the VPC endpoint policy) look similar, they work differently: VPC endpoint policies are enforced provided that the network path through PrivateLink is enforced; SCPs are applied to principals within the account or OU they’re attached to. Be extra careful if the calling identity resides outside of your account, because only the VPC endpoint policy will apply.

For example, imagine that you wanted to block the invocation of all Anthropic FMs across your organizations within AWS Organizations, in all AWS Regions. The following SCP example applied to the OUs or AWS accounts in scope would achieve that outcome:

{
  "Version": "2012-10-17",
  "Statement": {
    "Sid": "DenyInferenceForAnthropicModels",
    "Effect": "Deny",
    "Action": [
      "bedrock:InvokeModel",
      "bedrock:InvokeModelWithResponseStream"
    ],
    "Resource": [
      "arn:aws:bedrock:*::foundation-model/anthropic.*"
    ]
  }
}

You can use the same pattern to access data that’s needed for your application, such as data residing in Amazon Simple Storage Service (Amazon S3).

Model customization

A solution built on Amazon Bedrock might include model customization. The common denominator of the different customization approaches is that they include data, which is assumed to be confidential and thus in-scope for applying the PoLP. Here, we take a scenario where data is stored in Amazon S3 and can be encrypted using a customer managed AWS Key Management Service (AWS KMS) key.

Measures can be taken on multiple levels, as conceptualized in data perimeters: network, identity, and resource. Amazon Bedrock model customization uses service roles, which allows you to apply fine-grained and least-privilege access end-to-end. These service roles will be assumed by the Amazon Bedrock service principal, so that it can execute actions on your behalf. To allow the Amazon Bedrock service principal to assume the role in your account, you need to attach a trust policy to the role.

Let’s imagine that you’re running an Amazon Bedrock customization job in the us-east-1 (N. Virginia) Region. Using the following trust policy example will allow only the Amazon Bedrock service principal to assume your role.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowBedrockServicePrincipalUnderConditions",
      "Effect": "Allow",
      "Principal": {
        "Service": "bedrock.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "<account-id>"
        },
        "ArnEquals": {
          "aws:SourceArn": "arn:aws:bedrock:us-east-1:<account-id>:model-customization-job/*"
        }
      }
    }
  ]
}

Make sure to replace <account-id> in the preceding example trust policy with your own 12-digit account number. The policy contains a condition that provides cross-service confused deputy prevention by adding the aws:SourceAccount condition. The confused deputy problem is a situation where an entity that doesn’t have permission to perform an action can coerce a more privileged entity to perform the action. In AWS, cross-service impersonation can result in the confused deputy problem. Cross-service impersonation can occur when one service (the calling service) calls another service (the called service). AWS provides tools to help you protect your data for all services with service principals that have been given access to resources in your account. Both the aws:SourceArn and aws:SourceAccount global condition context keys in the role’s trust policy limit the permissions that Amazon Bedrock gives another service (in the preceding case, to the customization job) to the resource. aws:SourceArn is the more restrictive approach here, because it defines the specific source of the assume request, and not just the AWS account.

You should provide only the permissions that are required to fulfill the model customization task. For example, imagine that you want to limit access to your training data, the validation data bucket, and the output bucket (where Amazon Bedrock will deliver output metrics). The following policy, attached to that same service role, provides only those permissions. Replace the <training-bucket> placeholder with the bucket name that contains your training data, <validation-bucket> with your validation bucket, and <output-bucket> with the bucket where you want to store metrics.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAccessToTrainingAndValidationBucket",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<training-bucket>",
        "arn:aws:s3:::<training-bucket>/*",
        "arn:aws:s3:::<validation-bucket>",
        "arn:aws:s3:::<validation-bucket>/*"
      ]
    },
    {
      "Sid": "AllowAccessToOutputBucket",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::<output-bucket>",
        "arn:aws:s3:::<output-bucket>/*"
      ]
    }
  ]
}

Complementing this approach, we recommend using a VPC for the model customization job to restrict access to the training data. Technically, this again involves a VPC endpoint resource policy because the network is using interface VPC endpoints to access your S3 bucket. This allows you to define another network control, specifically an S3 bucket policy that only allows access through a specific VPC endpoint. So, for the situation where you want to limit access for the customization job itself, you can apply a bucket policy such as the following example, replacing <training-bucket> with the bucket name that contains your training data, and <vpce-id> with the ID of the VPC endpoint that resides in your VPC:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AccessToSpecificVPCEOnly",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",      
      "Resource": [
        "arn:aws:s3:::<training-bucket>",
        "arn:aws:s3:::<training-bucket>/*"
      ],
      "Condition": {
        "StringNotEquals": {
          "aws:SourceVpce": "<vpce-id>"
        }
      }
    }
  ]
}

In addition, you would restrict the principals that can access your VPC endpoint and the actions they’re allowed to take in Amazon S3. For simplicity, we’re omitting an example policy here because it’s very similar to the one we have in the Amazon Bedrock invocation section earlier in this post.

If you need to enforce encryption in Amazon S3 using a customer managed AWS KMS key (SSE-KMS), you will need to do the following:

  • Update the bucket policy with a statement denying unencrypted content being uploaded.
  • Update the KMS key policy to allow the service role to decrypt and describe the key.

The next policy example should be added to the bucket policy and demonstrates how to deny unencrypted objects being added to an S3 bucket. Again, replace <training-bucket> with the name of the S3 bucket that contains your training data:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyObjectsThatAreNotSSEKMS",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::<training-bucket>/*",
      "Condition": {
        "Null": {
          "s3:x-amz-server-side-encryption-aws-kms-key-id": "true"
        }
      }
    }
  ]
}

Finally, in the KMS key policy, you need a statement similar to the following to allow the Amazon Bedrock service role access to the KMS key. Replace <account-id> with your 12-digit account number and <bedrock-service-role> with the role you created, which will be assumed by the Amazon Bedrock service principal. Make sure to only give the required access to decrypt data with the KMS key to the IAM role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowUseOfKeyByBedrockRole",
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::<account-id>:role/<bedrock-service-role>"
      },
      "Action": [
        "kms:Decrypt",
        "kms:DescribeKey"
      ],
      "Resource": "*"
    }
  ]
}

Amazon Bedrock can also encrypt a customized model with a customer managed KMS key. Amazon Bedrock uses KMS key grants to encrypt the customized model and to decrypt it later when you deploy it for inference. Therefore, you need to grant the same IAM role permissions to create KMS key grants in the KMS key policy. The KMS key you use for this purpose is typically different than the one you used to encrypt the training data to allow fine-grained permissions on both keys.

So, let’s imagine that you want to use two different roles to encrypt and decrypt the customized models. To allow the role that executes the model customization job to use your KMS key, you need to add the following policy statements to the KMS key policy, replacing <account-id> with your 12-digit account number, <region> with the Region where you run Amazon Bedrock, <bedrock-model-customization-role> with the role name you use to run the model customization job, and <invocation-role> with the name of the role you use for inference.

{
  "Version": "2012-10-17",
  "Id": "PermissionsCustomModelKey",
  "Statement": [
    {
      "Sid": "PermissionsEncryptCustomModel",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::<account-id>:role/<bedrock-model-customization-role>"
        ]
      },
      "Action": [
        "kms:Decrypt",
        "kms:GenerateDataKey",
        "kms:DescribeKey",
        "kms:CreateGrant"
      ],
      "Resource": "*",
      "Condition": {
        "StringLike": {
          "kms:ViaService": [
            "bedrock.<region>.amazonaws.com"
          ]
        }
      }
    },
    {
      "Sid": "PermissionsDecryptModel",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::<account-id>:role/<invocation-role>"
        ]
      },
      "Action": [
        "kms:Decrypt"
      ],
      "Resource": "*",
      "Condition": {
        "StringLike": {
          "kms:ViaService": [
            "bedrock.<region>.amazonaws.com"
          ]
        }
      }
    }
  ]
}

By using KMS key grants, you can revoke the permissions you granted to the service role after the customization job is done, thus reducing the permissions to least privilege. Also, Amazon Bedrock uses secondary KMS key grants for model encryption, which means that they’re automatically retired as soon as the operation that Amazon Bedrock performs on behalf of the customer is completed. The Encryption of model customization jobs and artifacts describes in more detail how grants are used.

To completement these IAM policy guardrails, you can add network controls to reduce the scope of the permissions of the process. Because we focus on IAM policies in this post, we won’t go into details here but only mention how the process works.

When you start a model customization job, a model training job is triggered within the model deployment account. The training job takes a base model from its S3 bucket, then connects to the S3 bucket that holds the customization training data to start the customization. This can be done through your VPC, where you specify a VPC configuration such as subnets and security groups, and the training job places an elastic network interface (ENI) into that VPC as specified. A request to the S3 bucket to read the training data now adheres to whatever routing rules are present in the VPC for that ENI. The VPC routing and security group attached to the ENI can be used to limit networking access to the model customization job.

Amazon Bedrock Agents

Amazon Bedrock Agents offers the capability to build and configure autonomous agents for applications. You can find more information about Amazon Bedrock Agents in Automate tasks in your application using AI agents.

Using an Amazon Bedrock agent also provides certain security properties that are applied to an inference task. For example, at the time of writing, there is no IAM condition key for the bedrock:InvokeModel API to require an Amazon Bedrock guardrail being attached to that same call. However, you can require that inferences are invoked through a call to an agent that has specific Amazon Bedrock guardrails configured.

Let’s say that you want to create a role that explicitly is only allowed to invoke Amazon Bedrock models through a specific Amazon Bedrock agent. The following IAM principal permissions policy example implies that the Amazon Bedrock agent specified has approved Amazon Bedrock guardrails configured. Again replace <region>, <account-id>, <bedrock-agent-id>, and <bedrock-agent-alias-id> with the values of your Amazon Bedrock agent.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAgentInvocation",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeAgent"
      ],
      "Resource": "arn:aws:bedrock:<region>:<account-id>:agent-alias/<bedrock-agent-id>/<bedrock-agent-alias-id>"
    },
    {
      "Sid": "DenyDirectInvocation",
      "Effect": "Deny",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
        "bedrock:CreateModelInvocationJob"
      ],
      "Resource": "arn:aws:bedrock:*::foundation-model/*"
    }
  ]
}

Provided that the Amazon Bedrock agent is configured by a systems administrator or operator with an approved Amazon Bedrock guardrail, the principal with the preceding policy attached to it will be able to invoke it with a prompt, and won’t directly invoke an Amazon Bedrock model. This strategy for making sure that Amazon Bedrock guardrails are applied to all Amazon Bedrock invocations is currently not possible with the bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream APIs, because they don’t have a condition key to match an Amazon Bedrock guardrail to. In addition, denying bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream also denies the Converse APIs and StartAsyncInvoke APIs, so there’s no need to add these separately to the Deny statement.

Because this strategy verifies the use of specific Amazon Bedrock guardrails, you can also use it for the enforcement of specific prompts, IAM service roles, knowledge bases, prompt and completion content restrictions, KMS keys, and FMs in inference invocations. For this approach to be effective, you need to also limit the principals that can create and update Amazon Bedrock agent configurations. Again, this can be restricted using an IAM policy, which is attached to only specific principals.

The following is an example IAM policy statement that gives an attached IAM principal the ability to update the configuration of a specific Amazon Bedrock agent, replacing <region>, <account-id>, and <agent-id> with the Region, account and identifier, and agent identifier that you’re using. If you want this to apply to all agents, replace <agent-id> with an asterisk (*).

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowUpdatingBedrockAgents",
      "Effect": "Allow",
      "Action": [
        "bedrock:DisassociateAgentKnowledgeBase",
        "bedrock:GetAgent*",
        "bedrock:ListAgent*",
        "bedrock:PrepareAgent",
        "bedrock:TagResource",
        "bedrock:UntagResource",
        "bedrock:UpdateAgent*"
      ],
      "Resource": [
        "arn:aws:bedrock:<region>:<account-id>:agent/<agent-id>"
      ]
    }
  ]
}

Where agents aren’t suitable, users or applications performing inference against the Amazon Bedrock models will need permissions to call the bedrock:InvokeModel, bedrock:InvokeModelWithResponseStream, or bedrock:CreateModelInvocationJob actions. In these cases, it can be desirable to limit the target models, following an allowlisting approach. Also, such permissions would only be attached to roles or applications that need to use them.

The following is an example of such a policy that restricts invocation to Anthropic’s Claude Instant.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowInvokationOnAnthropicClaudeInstantV1",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
        "bedrock:CreateModelInvocationJob"
      ],
      "Resource": "arn:aws:bedrock:*::foundation-model/anthropic.claude-instant-v1"
    }
  ]
}

You can include detective or reactive controls using Amazon CloudWatch EventBridge rules to detect model invocations that don’t use appropriate Amazon Bedrock guardrails, but that’s outside the scope of this post.

Model testing

Testing is the last step before the solution is deployed. Types of tests include unit tests, integration tests, user acceptance tests, penetration testing, and more. In this phase, you can again verify that the permissions that were assigned are indeed least privilege.

Especially in functional tests where data is involved, it’s important to consider that the data used for testing might be confidential. This is typically true when no synthetic test data is generated for the testing process. Controls to restrict access to the data and logs that are produced and might contain pieces of this data need to be the same as you will apply in a production environment. That you are only testing the solution doesn’t automatically mean that data access controls aren’t needed.

As discussed earlier in this post, controls are activated not only on identities, but also on the network and on resources. All of these should be validated and their effectiveness confirmed in this phase. Tests include but aren’t limited to:

  • Validate that you can only perform allowed actions in Amazon Bedrock through VPC endpoints, and that actions that don’t use VPC endpoints are blocked.
  • Validate the effectiveness of the resource policies on VPC endpoints by making sure that they can only be used by authenticated and authorized principals.
  • When using knowledge bases, validate that only the Amazon Bedrock service principal can access them.
  • When using Amazon Bedrock guardrails, evaluate their effectiveness. Because of the nature of generative AI applications, the diversity in input and output data can be big. Therefore, make sure to test guardrails with a reasonably large number of prompts.
  • If model invocation logging is activated, validate that logs are correctly written and protected with IAM permissions and encryption.
  • Validate that only the required personnel can access these logs, because they might contain sensitive data. Consider automatically sanitizing and forwarding them to a new CloudWatch log group.
  • Validate that all relevant Amazon Bedrock API calls are being properly logged in AWS CloudTrail, and that you can effectively monitor and alert on any suspicious activity.
  • Make sure that sensitive information—such as prompts and responses—isn’t being stored in the CloudTrail logs or in any trace output.

The threat modelling that you have potentially created in the design phase can provide valuable inputs that you can use for security-related test cases.

Model operation

In this phase, the solution is finally deployed into production. Operators need Amazon Bedrock control plane permissions to provision and manage Amazon Bedrock resources such as agents, guardrails, prompt libraries, and knowledge bases. They should only get the control plane permissions to provision and manage the Amazon Bedrock features that are being used. These same operators should have access to invoke or configure the Amazon Bedrock service features or their resources (Amazon Bedrock Agents, Amazon Bedrock Guardrails, and prompt libraries) only through an authorized pipeline. This immutable infrastructure approach restricts human users from creating situations or configurations that would otherwise allow unapproved access to the data plane, untracked changes to the control plane, or potentially disruptive updates to your application.

Alternatively, to reduce the assigned permissions to the absolute minimum, automated deployments using pipelines and pipeline roles can be used. This will not only provide versioned infrastructure but also adheres to PoLP by not providing access to human identities.

After deployment, the solution is live and being accessed by real users. At this point, topics such as monitoring, logging, and incident response become relevant. While Amazon Bedrock by default doesn’t store inference or response data, it’s recommended that you activate logging of those elements to constantly verify the accuracy of your generative AI application.

By using this solution, you reduce access to a minimum in the following areas:

  • Logged prompts and responses
  • Data available through knowledge bases, RAG, or similar sources
  • The ability to change the infrastructure

Use multiple, dedicated, least privileged roles for each task. This helps reduce the permissions scope to a minimum. Also, because least privilege enforces using a specific role for a specific task, it reduces the risk of unintended changes by requiring the assumption of a specific role.

By following the AWS Security Reference Architecture, security monitoring data is consolidated in a central security account. This allows a comprehensive, central overview of the security posture of your infrastructure.

Logging sensitive information

An important operational aspect is logging potentially sensitive data that’s sent to or received from the LLM. While Amazon Bedrock doesn’t store prompts or responses, you can use model invocation logging to collect invocation logs, model input data, and model output data for all invocations in your AWS account used in Amazon Bedrock. Model invocation logging isn’t enabled by default. After it’s enabled, prompts, completions, or both for all invocations of all approved models are then logged to the configured log destination. Valid log destinations for prompt and completion logs are Amazon S3 and CloudWatch. When writing logs to these destinations, they can optionally be encrypted using a supplied KMS key.

The contents of these logs might contain sensitive information in the prompt provided by the user or the reply generated by the model. As such, access to these logs should be restricted to personnel and machine processes that require and are authorized to access this classification of data. There are strategies such as using Amazon Macie on the Amazon S3 logs and CloudWatch Logs data protection capabilities to detect, monitor, and redact this sensitive information from logs, but that’s outside the scope of this post.

Even with Amazon Bedrock guardrails in place, the contents of these logs contain the pre-guardrailed user input, and so you must assume that these prompt and completion logs contain sensitive information. In this case, best practice is to encrypt log data with a KMS key, apply a data protection policy to the log group, and define at least three IAM roles:

  1. BasicCompletionLogReviewer: An IAM role whose sole purpose is to access and review the redacted version of these logs.
  2. SensitiveDataCompletionLogReviewer: A restricted IAM role whose sole purpose is to access and review the unredacted version of these logs.
  3. CompletionLogAdmin: A restricted IAM role whose sole purpose is to create, view, and delete data protection policies that can send audit findings to Amazon S3 and CloudWatch destinations.

To allow reading log events in a specific log group, use a policy such as the following and attach it to the BasicCompletionLogReviewer role, replacing <region>, <account-id>, <log-group-name>, and <alias-name> with values that match your CloudWatch log group and the KMS key that encrypts it.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowReadingMaskedLogStream",
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogStreams",
        "logs:GetLogEvents"
      ],
      "Resource": "arn:aws:logs:<region>:<account-id>:log-group:<log-group-name>:*"
    },
    {
      "Sid": "AllowDecryptOfLogEvents",
      "Effect": "Allow",
      "Action": [
        "kms:Decrypt"
      ],
      "Resource": "arn:aws:kms:<region>:<account-id>:alias/<alias-name>"
    }

  ]
}

With an active data protection policy in place, the preceding policy won’t allow access to the unredacted version of these logs. To allow access to the unredacted versions to the SensitiveDataCompletionLogReviewer role, you need to add an additional action, replacing <region>, <account-id>, <log-group-name>, and <alias-name> with values that match your CloudWatch log group and the KMS key that encrypts it.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowReadingMaskedLogEvents",
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogStreams",
        "logs:GetLogEvents",
        "logs:Unmask"
      ],
      "Resource": "arn:aws:logs:<region>:<account-id>:log-group:<log-group-name>:*"
    },
    {
      "Sid": "AllowDecryptOfLogEvents",
      "Effect": "Allow",
      "Action": [
        "kms:Decrypt"
      ],
      "Resource": "arn:aws:kms:<region>:<account-id>:alias/<alias-name>"
    }
  ]
}

The policy for the CompletionLogAdmin role requires different permissions; the following sample policy allows a user to create, view, and delete data protection policies that can send audit findings to all three types of audit destinations. It doesn’t permit the user to view unmasked data. This policy will look like the following example, replacing <delivery-stream-id>, <bucket-name>, <log-group-name>, and <alias-name> with the values that match your setup. Note that this includes a statement that explicitly denies the attached role access to decrypt the logs with the configured KMS key, aligning with the PoLP:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowLogGroupsManagement1",
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogDelivery",
        "logs:PutResourcePolicy",
        "logs:DescribeLogGroups",
        "logs:DescribeResourcePolicies"
      ],
      "Resource": "*"
    },
    {
      "Sid": "AllowLogGroupsManagement2",
      "Effect": "Allow",
      "Action": [
        "logs:GetDataProtectionPolicy",
        "logs:DeleteDataProtectionPolicy",
        "logs:PutDataProtectionPolicy",
        "s3:PutBucketPolicy",
        "firehose:TagDeliveryStream",
        "s3:GetBucketPolicy"
      ],
      "Resource": [
        "arn:aws:firehose:::deliverystream/<delivery-stream-id>",
        "arn:aws:s3:::<bucket-name>",
        "arn:aws:logs:::log-group:<log-group-name>:*"
      ]
    },
    {
      "Sid": "AllowListKMSKeys",
      "Effect": "Allow",
      "Action": [
        "kms:ListKeys",
        "kms:ListAliases"
      ],
      "Resource": "*"
    },
    {
      "Sid": "DenyDecryptOfLogEvents",
      "Effect": "Deny",
      "Action": [
        "kms:Decrypt"
      ],
      "Resource": "arn:aws:kms:<region>:<account-id>:alias/<alias-name>"
    }

  ]
}

This approach helps to avoid inadvertent access to or exposure of this sensitive data source and upholds the PoLP by separating duties.

Review IAM permissions on a periodic basis

Managing permissions is an ongoing effort because requirements and functionality change over time. Therefore, we recommend regularly reviewing the assigned permissions and verifying that they aren’t overly permissive. For example, if you have a Lambda function that makes API calls to Amazon Bedrock, and changes are made to that function that require additional permissions (perhaps the use of a new model), then it’s acceptable to update the policy attached to the IAM role that the function uses. It’s not always obvious that permissions for using the earlier model are still needed in the same policy; or permissions might be widened unnecessarily to include all models. When applying the PoLP, it’s important that policies be tested at the time they’re deployed to make sure that they meet the exact application needs and no more, but also that the presumed needs are reviewed periodically.

Using AWS IAM Access Analyzer, you can review and simulate proposed changes to IAM policies to ensure their suitability for a given application or function. You can also use IAM Access Analyzer to review unused permissions over time. This gives system operators an opportunity to inspect and then inform the removal of unused permissions in policies used with Amazon Bedrock applications. Remember that some permissions are dormant and ready for periodic use, such as incident response, recovery and other rare use cases, so your review shouldn’t assume that an unused permission is unnecessary, but an opportunity to review the need for the permission.

Finally, align monitoring of new Amazon Bedrock APIs with your IAM strategy. Especially when using denylisting approaches, it’s important to consider that services will announce new APIs, capabilities, and FMs over time. An example for this was the announcement of the new Converse API. This API provides functionality similar to Invoke, but in a consistent and thus simpler way. Considering such changes is therefore an integral part of your regular policy review processes.

Strong identity and access management is a journey, not a one-time action.

Conclusion

In this post, we have demonstrated some ways that you can apply the principle of least privilege (PoLP) to large language model (LLM)-based applications that use Amazon Bedrock. We have discussed the security considerations of each phase of the development lifecycle and provided examples that you can use as a starting point to implement your own PoLP strategy. It’s important that security doesn’t start late in the process; think about risks and the required actions as early as possible to make sure that your strategy is effective when your application goes live.

Finally, remember that the field of generative AI is moving quickly. We believe that it has the potential to transform virtually every customer experience. From a security perspective, this means that the threat landscape will change and evolve over time. Make sure to constantly adapt to new risks; evaluate and integrate them into your PoLP strategy.

Your AWS account team and specialists are happy to assist you on this journey.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
 

Jonathan Jenkyn
Jonathan Jenkyn

Jonathan (“JJ”) Jenkyn is a Sr Security Assurance Solution Architect with AWS Security Assurance Services. With over 30 years of experience, he is a proven security leader who delivers robust cloud security outcomes. JJ is also an active member of the AWS People with Disabilities affinity group and enjoys running, cycling, and spending time with his family.
Michael Tschannen
Michael Tschannen

Michael is an EMEA Sr. Specialist SA Security with AWS, based in Switzerland. After working as a penetration tester early in his career, he decided to move to the defensive side. Since then, he’s been protecting customers and their data every day by making security and privacy an integral part of every solution he builds.

Architecting for database encryption on AWS

Post Syndicated from Jonathan Jenkyn original https://aws.amazon.com/blogs/security/architecting-for-database-encryption-on-aws/

In this post, I review the options you have to protect your customer data when migrating or building new databases in Amazon Web Services (AWS). I focus on how you can support sensitive workloads in ways that help you maintain compliance and regulatory obligations, and meet security objectives.

Understanding transparent data encryption

I commonly see enterprise customers migrating existing databases straight from on-premises to AWS without reviewing their design. This might seem simpler and faster, but they miss the opportunity to review the scalability, cost-savings, and feature capability of native cloud services. A straight lift and shift migration can also create unnecessary operational overheads, carry-over unneeded complexity, and result in more time spent troubleshooting and responding to events over time.

One example is when enterprise customers who are using Transparent Data Encryption (TDE) or Extensible Key Management (EKM) technologies want to reuse the same technologies in their migration to AWS. TDE and EKM are database technologies that encrypt and decrypt database records as the records are written and read to the underlying storage medium. Customers use TDE features in Microsoft SQL Server, Oracle 10g and 11g, and Oracle Enterprise Edition to meet requirements for data-at-rest encryption. This shouldn’t mean that TDE is the requirement. It’s infrequent that an organizational policy or compliance framework specifies a technology such as TDE in the actual requirement. For example, the Payment Card Industry Data Security Standard (PCI-DSS) standard requires that sensitive data must be protected using “Strong cryptography with associated key-management processes and procedures.” Nowhere does PCI-DSS endorse or require the use of a specific technology.

Understanding risks

It’s important that you understand the risks that encryption-at-rest mitigates before selecting a technology to use. Encryption-at-rest, in the context of databases, generally manages the risk that one of the disks used to store database data is physically stolen and thus compromised. In on-premises scenarios, TDE is an effective technology used to manage this risk. All data from the database—up to and including the disk—is encrypted. The database manages all key management and cryptographic operations. You can also use TDE with a hardware security module (HSM) so that the keys and cryptography for the database are managed outside of the database itself. In TDE implementations, the HSM is used only to manage the key encryption keys (KEK), and not the data encryption keys (DEK) themselves. The DEKs are in volatile memory in the database at runtime, and so the cryptographic operations occur on the database itself.

You can also use native operating system encryption technologies such as dm-crypt or LUKS (Linux Unified Key Setup). Dm-crypt is a full disk encryption (FDE) subsystem in Linux kernel version 2.6 and beyond. Dm-crypt can be used on its own or with LUKS as an extension to add more features. When using dm-crypt, the operating system kernel is responsible for encrypting and decrypting data as it’s written and read from the attached volumes. This would achieve the same outcome as TDE—data written and read to the disk volume is encrypted, and the risk related to physical disk compromise is managed. DEKs are in runtime memory of the machine running the database.

With some TDE implementations, you can encrypt tables, rows, columns, and cells with different DEKs to achieve granular separation of duties between operators. Customers can then configure TDE to authorize access to each DEK based on database login credentials and job function, helping to manage risks associated with unauthorized access. However, the most common configuration I’ve seen is to rely on whole database encryption when using TDE. This configuration gives similar protection against the identified risks as dm-crypt with LUKS used without an HSM, since the DEKs and KEKs are stored within the instance in both cases and the result is that the database data on disk is encrypted.

Using encryption to manage data at rest risks in AWS

When you move to AWS, you gain additional security capabilities that can simplify your security implementations. Since the announcement of the AWS Key Management Service (AWS KMS) in 2014, it has been tightly integrated with Amazon Elastic Block Store (Amazon EBS), Amazon Simple Storage Service (Amazon S3), and dozens of other services on AWS. This means that data is encrypted on disk by checking a single check box. Furthermore, you get the benefits of AWS KMS for key management and cryptographic operations, while being transparent to the Amazon Elastic Compute Cloud (Amazon EC2) instance where the data is being encrypted and decrypted. For simplicity, the authorization for access to the data is managed entirely by AWS Identity and Access Management (IAM) and AWS KMS key resource policies.

If you need more granular access control to the data, you can use the AWS Encryption SDK to encrypt data at the application layer. That provides the same effect as TDE cell-level protection, with a FIPS140-2 Level 2 validated HSM, as might be required by a recognizing standard.

If you must use a FIPS140-2 Level 3 validated HSM to meet more stringent compliance standards or regulations, then you can use the Custom Key Store capability of AWS KMS to achieve that—again in a transparent way. This option has a trade-off, as there is additional operational overhead in terms of managing an AWS CloudHSM cluster.

Many customers choose to migrate their database into the managed Amazon Relational Database Service (Amazon RDS), rather than managing the database instance themselves. Like the Amazon EC2 service, RDS uses Amazon EBS volumes for its data storage, and so can seamlessly use AWS KMS for encryption at rest functionality. When you do so, your management overhead for the protection of data-at-rest reduces to almost zero. This lets you focus on business value while AWS is responsible for the management of your database and the protection of the underlying data. The next section reviews this option and others in more detail.

You can review the available Amazon RDS database engines and versions via the Amazon RDS User Guide documentation, or by running the following AWS Command Line Interface (AWS CLI) command:

aws rds describe-db-engine-versions --query "DBEngineVersions[].DBEngineVersionDescription" --region <regionIdentifier>

Recommended Solutions

If you’re moving an existing database to AWS, you have the following solutions for data at rest encryption. I go into more detail for each option below.

Table 1 – Encryption options

Option Database management Host Encryption Key management
1 Amazon managed Amazon RDS Amazon EBS AWS KMS
2 Amazon managed Amazon RDS Amazon EBS AWS KMS Custom Key Store
3 Customer managed Amazon EC2 Amazon EBS AWS KMS
4 Customer managed Amazon EC2 Amazon EBS AWS KMS Custom Key Store
5 Customer managed Amazon EC2 Amazon EBS LUKS
6 Customer managed Amazon EC2 Database Database TDE
7 Customer managed Amazon EC2 Database CloudHSM

Option 1 – Using Amazon RDS with Amazon EBS encryption and key management provided by AWS KMS

This approach uses the Amazon RDS service where AWS manages the operating system and database engine. You can configure this service to be a highly scalable resource spanning multiple Availability Zones within an AWS Region to provide resiliency. AWS KMS manages the keys that are used to encrypt the attached Amazon EBS volumes at rest.

Note: This configuration is recommended as your default database encryption approach.

Benefits

  • No key management requirement on host; key management is automated and performed by AWS KMS
  • Meets FIPS140-2 Level 2 validation requirements
  • Simple vertical and horizontal scalability
  • Snapshots for recovery are encrypted automatically
  • AWS manages the patching, maintenance, and configuration of the operating system and database engine
  • Well-recognized configuration, with support offered through AWS Support
  • AWS KMS costs are comparatively low

Challenges

  • Dependent on Amazon RDS supported engines and versions
  • Might require additional controls to manage unauthorized access at table, row, column, or cell level

Option 2 – Using Amazon RDS with Amazon EBS encryption and key management provided by AWS KMS custom key store

This approach uses the Amazon RDS service where AWS manages the operating system and database engine. You can configure this service to be a highly scalable resource spanning multiple Availability Zones within a Region to provide resiliency. CloudHSM keys are used via AWS KMS service integration to encrypt the Amazon EBS volumes at rest.

Note: This configuration is recommended where FIPS140-2 Level 3 validation is a specified compliance requirement.

Benefits

  • No key management requirement on host; key management is performed by AWS KMS
  • Meets FIPS140-2 Level 3 validation requirements
  • Simple vertical and horizontal scalability
  • Snapshots for recovery are encrypted automatically
  • AWS manages the patching, maintenance, and configuration of the database engine
  • Well-recognized configuration with support offered through AWS Support

Challenges

  • Dependent on Amazon RDS supported engines and versions
  • You are responsible for provisioning, configuration, scaling, maintenance, and costs of running CloudHSM cluster
  • Might require additional controls to manage unauthorized access at table, row, column or cell level

Option 3 – Customer-managed database platform hosted on Amazon EC2 with Amazon EBS encryption and key management provided by KMS

In this approach, the key difference is that you’re responsible for managing the EC2 instances, operating systems, and database engines. You can still configure your databases to be highly scalable resources spanning multiple Availability Zones within a Region to provide resiliency, but it takes more effort. AWS KMS manages the keys that are used to encrypt the attached Amazon EBS volumes at rest.

Note: This configuration is recommended when Amazon RDS doesn’t support the desired database engine type or version.

Benefits

  • A 1:1 relationship for migration of database engine configuration
  • Key rotation and management is handled transparently by AWS
  • Data encryption keys are managed by the hypervisor, not by your EC2 instance
  • AWS KMS costs are comparatively low

Challenges

  • You’re responsible for patching and updates of the database engine and OS
  • Might require additional controls to manage unauthorized access at table, row, column, or cell level

Option 4 – Customer-managed database platform hosted on Amazon EC2 with Amazon EBS encryption and key management provided by KMS custom key store

In this approach, you are again responsible for managing the EC2 instances, operating systems, and database engines. You can still configure your databases to be highly scalable resources spanning multiple Availability Zones within a Region to provide resiliency, but it takes more effort. And similar to Option 2, CloudHSM keys are used via AWS KMS service integration to encrypt the Amazon EBS volumes at rest.

Note: This configuration is recommended when Amazon RDS doesn’t support the desired database engine type or version and when FIPS140-2 Level 3 compliance is required.

Benefits

  • A 1:1 relationship for migration of database engine configuration
  • Data encryption keys managed by the hypervisor, not by your EC2 instance
  • Keys managed by FIPS140-2 Level 3 validated HSM

Challenges

  • You’re responsible for provisioning, configuration, scaling, maintenance, and costs of running CloudHSM cluster
  • You’re responsible for patching and updates of the database engine and OS
  • Might require additional controls to manage unauthorized access at table, row, column, or cell level

Option 5 – Customer-managed database platform hosted on Amazon EC2 with Amazon EBS encryption and key management provided by LUKS

In this approach, you’re still responsible for managing the EC2 instances, operating systems, and database engines. You also need to install LUKS onto the Linux instance to manage the encryption of data on Amazon EBS.

Benefits

  • A 1:1 relationship for migration of database engine configuration
  • Transparent encryption is managed by OS with LUKS

Challenges

  • You’re responsible for patching and updates of the database engine and OS
  • Data encryption keys are managed directly on the EC2 instance, and not a dedicated key management system
  • Scaling must be vertical, which is slow and costly
  • LUKS is supported through open-source licensing
  • Support for backup and recovery is LUKS specific, and require additional consideration
  • Might require additional controls to manage unauthorized access at table, row, column or cell level

Note: This approach limits you to only Linux instances and requires the most technical knowledge and effort on your part. Options, such as BitLocker and SQL Server Always Encrypted, exist for Windows hosts, and the complexity and challenges are similar to those of LUKS.

Option 6 – Customer-managed database platform hosted on Amazon EC2 with database encryption and key management provided by TDE

In this approach, you’re still responsible for managing the EC2 instances, operating systems, and database engines. However, instead of encrypting the Amazon EBS volume where the database is stored, you use TDE wallet keys managed by the database engine to encrypt and decrypt records as they are stored and retrieved.

Benefits

  • A 1:1 relationship for migration of database engine configuration
  • Table, row, column, and cell level encryption are managed by TDE, reducing end point risks relating to unauthorized access

Challenges

  • You’re responsible for patching and updates of the database engine and OS
  • Costly license for TDE feature
  • Data encryption keys are managed directly on the EC2 instance
  • Scaling is dependent on TDE functionality and Amazon EC2 scaling
  • Support is split between AWS and a third-party database vendor
  • Cannot share snapshots

Note: This approach is not available with Amazon RDS.

Option 7 – Customer-managed database platform hosted on Amazon EC2 with database encryption performed by TDE and key management provided by CloudHSM

In this approach, you’re still responsible for managing the EC2 instances, operating systems, and database engines. However, instead of encrypting the Amazon EBS volume where the database is stored, you use TDE wallet keys managed by a CloudHSM cluster to encrypt and decrypt records as they are stored and retrieved.

Benefits

  • A 1:1 relationship for migration of database engine configuration
  • Wallet keys (KEK) are managed by a FIPS140-2 Level 3 validated HSM
  • Table, row, column, and cell level encryption are managed by TDE, reducing end point risks relating to unauthorized access

Challenges

  • You’re responsible for patching and updates of the database engine and OS
  • Costly license for TDE feature
  • You are responsible for provisioning, configuration, scaling, maintenance, and costs of running CloudHSM cluster
  • Integration and support of CloudHSM with TDE might vary
  • Scaling is dependent on TDE functionality, Amazon EC2 scaling, and CloudHSM cluster.
  • Data encryption keys are managed on EC2 instance
  • Support is split between AWS and a third-party database vendor
  • Cannot share snapshots

Note: This approach is not available with Amazon RDS.

Summary

While you can operate in AWS similar to how you operate in your on-premises environment, the preceding configurations and recommendations show how you can significantly reduce your challenges and increase your benefits by using cloud-native security services like AWS KMS, Amazon RDS, and CloudHSM. Specifically, using Amazon RDS with Amazon EBS volumes encrypted by AWS KMS provides a highly scalable, resilient, and secure way to manage your keys in AWS.

While there might be some architectural redesign and configuration work needed to move an on-premises database into Amazon RDS, you can leverage AWS services to help you meet your compliance requirements with less effort. By offloading the OS and database maintenance responsibility to AWS, you simultaneously reduce operational friction and increase security. By migrating this way, you can benefit from the scalability and resilience of the AWS global infrastructure and expertise. Lastly, to get started with migrating your database to AWS, I encourage you to use the AWS Database Migration Service.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Jonathan Jenkyn

Jonathan is a Senior Security Growth Strategies Consultant with AWS Professional Services. He’s an active member of the People with Disabilities affinity group, and has built several Amazon initiatives supporting charities and social responsibility causes. Since 1998, he has been involved in IT Security at many levels, from implementation of cryptographic primitives to managing enterprise security governance. Outside of work, he enjoys running, cycling, fund-raising for the BHF and Ipswich Hospital Charity, and spending time with his wife and 5 children.

Author

Scott Conklin

Scott is a Senior Security Consultant with AWS Professional Services (Global Specialty Practice). Based out of Chicago with 4 years tenure, he is an avid distance runner, crypto nerd, lover of unicorns, and enjoys camping, nature, playing Minecraft with his 3 kids, and binge watching Amazon Prime with his wife.

How to use trust policies with IAM roles

Post Syndicated from Jonathan Jenkyn original https://aws.amazon.com/blogs/security/how-to-use-trust-policies-with-iam-roles/

November 3, 2022: We updated this post to fix some syntax errors in the policy statements and to add additional use cases.

August 30, 2021: This post is currently being updated. We will post another note when it’s complete.


AWS Identity and Access Management (IAM) roles are a significant component of the way that customers operate on Amazon Web Service (AWS). In this post, we will dive into the details of how role trust policies work and how you can use them to restrict how your roles are assumed.

There are several different scenarios where you might use IAM roles on AWS:

  • An AWS service or resource accesses another AWS resource in your account – When an AWS resource needs access to other AWS services, functions, or resources, you can create a role that has appropriate permissions for use by that AWS resource. Services like AWS Lambda and Amazon Elastic Container Service (Amazon ECS) assume roles to deliver temporary credentials to your code that’s running in them.
  • An AWS service generates AWS credentials to be used by devices running outside AWS
    AWS IAM Roles Anywhere, AWS IoT Core, and AWS Systems Manager hybrid instances can deliver role session credentials to applications, devices, and servers that don’t run on AWS.
  • An AWS account accesses another AWS account – This use case is commonly referred to as a cross-account role pattern. It allows human or machine IAM principals from one AWS account to assume this role and act on resources within a second AWS account. A role is assumed to enable this behavior when the resource in the target account doesn’t have a resource-based policy that could be used to grant cross-account access.
  • An end user authenticated with a web identity provider or OpenID Connect (OIDC) needs access to your AWS resources – This use case allows identities from Facebook or OIDC providers such as GitHub, Amazon Cognito, or other generic OIDC providers to assume a role to access resources in your AWS account.
  • A customer performs workforce authentication using SAML 2.0 federation – This occurs when customers federate their users into AWS from their corporate identity provider (IdP) such as Okta, Microsoft Azure Active Directory, or Active Directory Federation Services (ADFS), or from AWS Single Sign-On (AWS SSO).

An IAM role is an IAM principal whose entitlements are assumed in one of the preceding use cases. An IAM role differs from an IAM user as follows:

  • An IAM role can’t have long-term AWS credentials associated with it. Rather, an authorized principal (an IAM user, AWS service, or other authenticated identity) assumes the IAM role and inherits the permissions assigned to that role.
  • Credentials associated with an IAM role are temporary and expire.
  • An IAM role has a trust policy that defines which conditions must be met to allow other principals to assume it.

Managing access to IAM roles

Let’s dive into how you can control access to IAM roles by understanding the policy types that you can apply to an IAM role.

There are three circumstances where policies are used for an IAM role:

  • Trust policy – The trust policy defines which principals can assume the role, and under which conditions. A trust policy is a specific type of resource-based policy for IAM roles. The trust policy is the focus of the rest of this blog post.
  • Identity-based policies (inline and managed) – These policies define the permissions that the user of the role is able to perform (or is denied from performing), and on which resources.
  • Permissions boundary – A permissions boundary is an advanced feature for using a managed policy to set the maximum permissions for a role. A principal’s permissions boundary allows it to perform only the actions that are allowed by both its identity-based permissions policies and its permissions boundaries. You can use permissions boundaries to delegate permissions management tasks, such as IAM role creation, to non-administrators so that they can create roles in self-service.

For the rest of this post, you’ll learn how to enforce the conditions under which roles can be assumed by configuring their trust policies.

An example of a simple trust policy

A common use case is when you need to provide access to a role in account A to assume a role in Account B. To facilitate this, you add an entry in the role in account B’s trust policy that allows authenticated principals from account A to assume the role through the sts:AssumeRole API call.

Important: If you reference :root in an IAM role’s trust policy, you might allow more principals to assume your role than you intended, so it’s a best practice to use the Principal element or conditions to only allow specific principals or paths to assume a role. Later in this post, we show you how to limit this access to more specific principals.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:root"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

This trust policy has the same structure as other IAM policies with Effect, Action, and Condition components. It also has the Principal element, but no Resource element. This is because the resource is the IAM role itself. For the same reason, the Action element will only ever be set to relevant actions for role assumption.

Note: The suffix :root in the policy’s Principal element equates to the principals in the account, not the root user of that account.

Using the Principal element to limit who can assume a role

In a trust policy, the Principal element indicates which other principals can assume the IAM role. In the preceding example, 111122223333 represents the AWS account number for the auditor’s AWS account. This allows a principal in the 111122223333 account with sts:AssumeRole permissions to assume this role.

To allow a specific IAM role to assume a role, you can add that role within the Principal element. For example, the following trust policy would allow only the IAM role LiJuan from the 111122223333 account to assume the role it is attached to.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:role/LiJuan"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

The principals included in the Principal element can be a principal defined within the IAM documentation, and can refer to an AWS or a federated principal. You can’t use a wildcard (“*” or “?”) within the Principal element for a trust policy, other than one case which we will cover later. You must define precisely which principal you are referring to because there is a translation that occurs when you submit your trust policy that ties it to each principal’s ID. For more information, see Why is there an unknown principal format in my IAM resource-based policy?

If an IAM role has a principal from the same account in its trust policy directly, that principal doesn’t need an explicit entitlement in its identity-attached policy to assume the role.

Using the Condition element in a trust policy

The Condition element in a role trust policy sets additional requirements for the Principal trying to assume the role. The Condition element is a flexible way to reduce the set of users that are able to assume the role without necessarily specifying the principals.

Condition elements of role trust policies behave identically to condition elements in identity-based policies and other resource policies on AWS.

Using SAML identity federation on AWS

Federated users from a SAML 2.0 compliant IdP are given permissions to access AWS accounts through the use of IAM roles. The mapping of which enterprise users get which roles is established within the directory used by the SAML 2.0 IdP and is placed inside the signed SAML assertion by the IdP.

The Principal element of a role trust policy for SAML federation contains the ARN of the SAML IdP in the same AWS account. IdPs in other accounts can’t be referenced. Roles assumed by SAML federation can use SAML-specific condition keys in their role trust policy.

A role trust policy for a role to be assumed by SAML places the ARN of the SAML IDP in the Principal element, and checks the intended audience (SAML:aud) of the SAML assertion. Setting the audience condition is important because it means that only SAML assertions intended for AWS can be used to assume a role:

{
    "Version": "2012-10-17",
    "Statement": {
      "Effect": "Allow",
      "Action": "sts:AssumeRoleWithSAML",
      "Principal": {"Federated": "arn:aws:iam::account-id:saml-provider/PROVIDER-NAME"},
      "Condition": {"StringEquals": {"SAML:aud": "https://signin.aws.amazon.com/saml"}}
    }
  }

The AWS documentation covers creating roles for SAML 2.0 federation in detail. For information about how to manage the role trust policies of roles assumed by SAML from multiple AWS Regions for resiliency, see the blog post How to use regional SAML endpoints for failover.

For federating workforce access to AWS, you can use AWS IAM Identity Center (successor to AWS Single Sign-On) to broker access to IAM roles through SAML. Roles managed by IAM Identity Center can’t have their trust policy modified by IAM directly.

SAML IDPs used in a role trust policy must be in the same account as the role is.

Assuming a role with WebIdentity

Roles can also be assumed with tokens issued by web identity providers and OpenID Connect (OIDC) compliant providers.

After you’ve created an OpenID Connect identity provider in your account, you can configure roles to be assumed by that OpenID Connect identity provider.

The following is a trust policy that allows a role to be assumed by the identity provider auth.example.com where the value of the sub claim is equal to Administrator and the aud is equal to MyappWebIdentity.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:aws:iam::11112222333:oidc-provider/auth.example.com"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "auth.example.com:sub": "Administrator",
  "auth.example.com:aud": "MyappWebIdentity"
                }
            }
        }
    ]
}

The condition keys used for roles assumed by OIDC identity providers are always prefixed with the name of the OIDC identity provider (for example, auth.example.com). So to use claims in the ID Token like aud, sub, and amr, they are prefixed to become auth.example.com:aud, auth.example.com:sub, and auth.example.com:amr, respectively, in a trust policy to be evaluated as a condition key. Only ID Token claims listed in the STS documentation can be used in role trust policies as condition keys.

It’s important to set the:aud condition in role trust policies to help verify that the tokens being used to assume roles in your AWS accounts are tokens that are intended to be used for that purpose, and are for your application or tenant if your web identity provider is a public or multi-tenant identity provider, such as Google or GitHub.

Amazon Elastic Kubernetes Service (Amazon EKS) clusters have OIDC identity provider capabilities and can be used to assume roles in AWS accounts.

OIDC identity providers used to assume a role must be in the same AWS account as the role.

Limiting role use based on an identifier

At times you might need to give a third-party access to your AWS resources. Suppose that you hired a third-party company, Example Corp, to monitor your AWS account and help optimize costs. To track your daily spending, Example Corp needs access to your AWS resources, so you allow them to assume an IAM role in your account. However, Example Corp also tracks spending for other customers, and there could be a configuration issue in the Example Corp environment that allows another customer to compel Example Corp to attempt to take an action in your AWS account, even though that customer should only be able to take the action in their own account. This is referred to as the cross-account confused deputy problem. This section shows you a way to mitigate this risk.

The following trust policy requires that principals from the Example Corp AWS account, 444455556666, have provided a special string, called an external ID, when making their request to assume the role. Adding this condition reduces the risk that someone from the 444455556666 account will assume this role by mistake. This string is configured by specifying an ExternalID conditional context key.

External IDs should be generated by the third-party assuming your role, like Example Corp, and associated with the other assume role calls to assume a given customer’s role by Example Corp. By doing this, other Example Corp customers won’t be able to compel Example Corp to assume your roles on their behalf because they can’t force Example Corp to use your external ID through their tenant even if they become aware of your external ID.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::444455556666:role/ExampleCorpRole"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "ExampleUniquePhrase"
        }
      }
    }
  ]
}

The external IDs should be unique for every customer of a service provider. AWS doesn’t treat external IDs as secrets—they can be seen by anyone with entitlements to view a role’s trust policy.

If you assume roles in your customers’ accounts, it’s a best practice to generate unique external ID values on behalf of your customers and associate them with your customers, and you shouldn’t allow your customers to specify an external ID.

Roles with the sts:ExternalId condition can’t be assumed through the AWS console, unless there is another Allow statement without that condition.

Limiting role use based on IP addresses or CIDR ranges

You can put IP address conditions into a role trust policy to limit the networks from which a role can be assumed. For example, you can limit role assumption to a corporate network or VPN range. The following example trust policy will only allow the role to be assumed if the call is made from within the 203.0.113.0/24 CIDR range.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:role/IpBoundedRole"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "IpAddress": {
          "aws:SourceIp": "203.0.113.0/24"
        }
      }
    }
  ]
}

By using aws:SourceIP in the trust policy, you limit where the role can be assumed from, but this doesn’t limit where the credentials can be used from after they are assumed. To restrict where the credentials can be used from, you can use aws:SourceIP as a condition within the principal’s identity-based policy or the service control policies that apply to it. For more information on restricting where credentials can be used from, see Establishing a data perimeter on AWS.

Limiting role use based on tags

You can use IAM tagging capabilities to build flexible and adaptive trust policies. You can use an attribute-based access control (ABAC) model for assuming IAM roles in the same way that you can for accessing objects in an Amazon Simple Storage Service (Amazon S3) bucket. You can build trust policies that only permit principals that have already been tagged with a specific key and value to assume a specific role. The following example trust policy requires that IAM principals in the AWS account 111122223333 have the value of their principal tag department match the value of the IAM role’s tag owningDepartment.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalTag/department": "${aws:ResourceTag/owningDepartment}"
        }
      }
    }
  ]
}

As an example, in the preceding policy, if the role is tagged with an owningDepartment of finance, then only principals within account 111122223333 who have a tag department with a value of finance will be able to assume the role.

When using ABAC, it’s important to have governance around who can set tags on resources, principals, and sessions. If someone can change or modify tags on principals, resources, or sessions, they might be able to access resources that you didn’t intend them to. Principals from AWS accounts outside of your control might have different tag governance practices than your own organization, and you should take this into account when using principal tags as part of cross-account role assumption. You can use tag policies to help govern tags within your organization, and later in this blog post, we show how to manage tags set on assumption by using role trust policies.

For more information, see the Attribute-Based Access Control (ABAC) for AWS page.

Limiting role assumption to only principals within your organization

Since its announcement in 2016, the vast majority of enterprise customers that we work with use AWS Organizations. This AWS service allows you to create an organizational structure for your accounts by creating logical boundaries/organizational units that allow grouping of AWS accounts that need common guardrails applied. You can use the PrincipalOrgID condition key to limit the use of roles solely to principals within your organization in AWS Organizations.

The following example shows a policy that denies assumption of this role except by AWS services or by principals that are a member of the o-abcd12efg1 organization. This statement can be broadly applied to prevent someone outside your AWS organization from assuming your roles.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Principal": {
                "AWS": "*"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
                "StringNotEquals": {
                    "aws:PrincipalOrgID": "o-abcd12efg1"
                },
                "Bool": {
                    "aws:PrincipalIsAWSService": "false"
                }
            }
        }
    ]
}

In the preceding example, the StringNotEquals operator denies access to this role by a principal that doesn’t belong to a member account of the specified organization.

AWS roles that you intend to use with AWS services need to be able to be assumed by those AWS services. In the preceding example, we added the aws:PrincipalIsAWSService condition key so that an AWS service principal isn’t impacted by the explicit Deny statement. All principals, including AWS services, are still required to have an explicit Allow statement in a role’s trust policy to assume that role. Requests to customer resources by AWS service principals do so with the aws:PrincipalIsAWSService condition key set to a value of true, which means that the preceding Deny statement won’t apply to a service principal, but an Allow statement will let a service principal assume the role.

You can also use the aws:PrincipalOrgPaths condition key to limit role assumption to member accounts within a specific OU of an organization if you want role assumption to be more fine-grained.

Enforcing invariants with Deny statements

Only allowing principals in your organization to assume your roles is an example of a security invariant. Security invariants are security principles that you want to always apply. Deny statements are useful in trust policies to restrict conditions under which you would never want a role to be assumable. In AWS authorization, the presence of an applicable Deny statement overrides an applicable Allow statement. So having a Deny statement with conditions in it that should never be met such as allowing a role to be assumed by a principal outside of your organization is powerful.

Setting the source identity on role sessions to help trace actions in CloudTrail

You can configure a role session to have a source identity when assumed. This is most common when customers federate users into IAM through SAML2.0 or Web Identity/OpenID Connect to assume roles. You can configure your IdP to set the SourceIdentity attribute on the role session. Setting the source identity causes AWS CloudTrail logs for actions taken by this role session to contain the source identity so that you can trace actions taken by roles back to the user that assumed them. The SourceIdentity attribute also follows that role session if it assumes another role.

To set a source identity, you need to grant the IdP the sts:SetSourceIdentity entitlement in the role’s trust policy.

{
    "Version": "2012-10-17",
    "Statement": {
      "Effect": "Allow",
      "Action": ["sts:AssumeRoleWithSAML","sts:SetSourceIdentity"],
      "Principal": {"Federated": "arn:aws:iam::111122223333:saml-provider/PROVIDER-NAME"},
      "Condition": {"StringEquals": {"SAML:aud": "https://signin.aws.amazon.com/saml"}}
    }
  }

In order for a role session that has a SourceIdentity set to assume a second role, it must also have the sts:SetSourceIdentity entitlement in that second role’s trust policy. If it doesn’t, the first role won’t be able to assume the second role.

You can also use the sts:SourceIdentity condition key to enforce that the SourceIdentity attribute that is being set conforms to an expected standard:

            "Condition": {
                "StringLike": {"sts:SourceIdentity": "*@example.org"}
            }

In the preceding example, for the Condition element, all requests must contain @example.org.

Setting tags on role sessions

You can set tags on role sessions, which can then be used in IAM and resource policy authorization decisions. Tags on role sessions are evaluated with the same condition key that tags on IAM roles are: aws:PrincipalTag/TagKey. Tag values that are set when a role is assumed have precedence over tag values that are attached to the role.

If you’re basing authorization on principal tags in your AWS accounts, it’s important that you control who can set the session tags and principal tags in your accounts so that access isn’t granted to unintended parties.

The ability to tag a role session must be granted in a role’s trust policy using the sts:TagSession permission, and you can use conditions and condition keys to restrict which tags can be set to which values.

The following is an example statement for a role trust policy that allows a principal from account 111122223333 to assume the role and requires that the three session tags for Project, CostCenter and Department are set. The Department tag must have a value of either Engineering or Marketing. The third Condition statement allows the Project and Department tags to be set as transitive when the role is assumed. Because conditions for the tags are in the same Allow statement as the AssumeRole entitlement, these tags are required to be set.

        {
            "Effect": "Allow",
            "Action": ["sts:TagSession","sts:AssumeRole"],
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},
            "Condition": {
                "StringLike": {
                    "aws:RequestTag/Project": "*",
                    "aws:RequestTag/CostCenter": "*"
                },
                "StringEquals": {
                    "aws:RequestTag/Department": [
                        "Engineering",
                        "Marketing"
                    ]
                },
                "ForAllValues:StringEquals": {
                    "sts:TransitiveTagKeys": [
                        "Project",
                        "Department"
                    ]
                }
            }
        }

When a role session assumes another role, transitive tags from the calling role session are set to the same value within the subsequent role session. For more information, see Chaining roles with session tags.

You can use Deny statements with the sts:TagSession operation to restrict certain tags from being set. In the following example, attempts to tag a session with an Admin tag would be denied:

{
    "Effect": "Deny",
    "Action": "sts:TagSession",
    "Principal": {"AWS": "*"},
    "Condition": {
        "Null": {
            "aws:RequestTag/Admin": false
        }
    }
}

In the following example statements, we deny tagging operations on role sessions where the Team tag is equal to Admin, but we allow the setting of a different tag value.

{
    "Effect": "Deny",
    "Action": "sts:TagSession",
    "Principal": {"AWS": "*"},
    "Condition": {
        "StringEquals": {
            "aws:RequestTag/Team": "Admin"
        }
    }
},
{
    "Effect": "Allow",
    "Action": "sts:TagSession",
    "Principal": {"AWS": "*"},
    "Condition": {
        "StringLike": {
            "aws:RequestTag/Team": "*"
        }
    }
}

What happens when a role in a trust policy is deleted

When you specify a role in the Principal element of a trust policy, AWS uses that role’s unique RoleId to make the authorization decision. If the ExampleCorpRole role from the earlier policy examples was deleted and re-created in account 111122223333, then the unique RoleId would be different, and the new ExampleCorpRole wouldn’t be able to assume the roles that trusted it in the Principal element.

When a role is deleted, the trust policy of the remaining roles that referenced this now-deleted role will show the unique RoleId it trusted in the Principal element when viewed:

"Principal": {
				"AWS": "AROA1234567123456D"
			}

Because the policy references a now-invalid RoleID, it can’t be modified until the invalid RoleID is removed from it. You can retrieve the original role ARNs by looking at CloudTrail logs for UpdateAssumeRolePolicy and CreateRole events for a role and reading the trust policy from the log entries.

For more information about using the Principal element in policy statements, see IAM role principals.

Principals placed inside the Condition block of a trust policy statement are not referenced to their RoleID but instead use the ARN of the role. The following trust policy statement would allow the ExampleCorpRole to assume the role that trusted it, even if the ExampleCorpRole role was deleted and re-created.

  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
    "ArnEquals": {
      "aws:PrincipalArn": "arn:aws:iam::111122223333:role/ExampleCorpRole"
    }
  }
    }
  ]
}

When creating a role trust policy, you should determine the behavior that you want to occur when a role is deleted. Your organization’s security posture might dictate that a deleted and re-created role should no longer be able to assume a role in your account, so using a specific principal in the Principal element is appropriate. Or you might want to allow the role to be assumed in the event that a given principal is deleted and re-created.

If you use the aws:PrincipalArn condition with a principal of :root to allow role assumption within the same account, the principal doing the assuming will need the sts:AssumeRole action in its identity-based policy.

Wildcarding principals

Earlier we noted that wildcards can’t be placed in the Principal element of a policy as part of an ARN. However, wildcards can be used in the Condition block of a policy, so wildcarding is possible with the ArnLike and StringLike condition operators. This is useful when you don’t know the specific roles, but you do have other controls that limit the path where known roles are created, such as delegated administrator models. The following policy allows a role from account 111122223333 in the path OpsRoles to assume it.

  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:root"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
    "ArnLike": {
      "aws:PrincipalArn": "arn:aws:iam::111122223333:role/OpsRoles/*"
    }
  }
    }
  ]
}

It’s a best practice to restrict role assumption to specific paths or principals instead of allowing an entire account where possible.

Using multiple statements

So far, the examples in this post have been single policy statements. Trust policies, like other policies on AWS, can have multiple statements up to the quota for role trust policy length.

You can combine multiple statements together to create complex role trusts like the following, which allows ExampleRole to assume a role and tag the session, but only from the network range 203.0.113.0/24 while forbidding that the Admin tag be set:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": [
                    "arn:aws:iam::111122223333:role/ExampleRole"
                ]
            },
            "Action": [
                "sts:AssumeRole",
                "sts:TagSession"
            ],
            "Condition": {
                "IpAddress": {
                    "aws:SourceIp": "203.0.113.0/24"
                }
            }
        },
        {
            "Effect": "Deny",
            "Action": "sts:TagSession",
            "Principal": {
                "AWS": "*"
            },
            "Condition": {
                "Null": {
                    "aws:RequestTag/Admin": false 
                }
            }
        }
    ]
}

Although it’s possible to use multiple statements, it’s a best practice that you don’t use roles for unrelated purposes, and that you don’t share roles across different AWS services. It’s also a best practice to use different IAM roles for different use cases and AWS services, and to avoid situations where different principals have access to the same IAM role.

Working with services that deliver role-session credentials

IAM Roles Anywhere, AWS IoT Core, and Systems Manager can deliver AWS role session credentials to devices, servers, and applications running outside of AWS. These roles are assumed by AWS services and delivered to your devices, servers, and applications after they authenticate to their respective AWS services.

For more information about these services and their requirements, see the following documentation:

Role chaining

When a role assumes another role, it’s called role chaining. Sessions created by role chaining have a maximum lifetime of 1 hour regardless of the maximum session length that a role is configured to allow.

Roles that are assumed by other means are not considered role chaining and are not subject to this restriction.

Conclusion

In this post, you learned how to craft trust policies for your IAM roles to restrict their assumption by specific principals and under certain conditions, and to combine multiple statements with different conditions. You also learned how to use features like source identity and session tags, how to protect against the cross-account confused deputy problem, and the nuances of the Principal element. You now have the tools that you need to build robust and effective trust policies for roles in your organization.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Jonathan Jenkyn

Jonathan is a Senior Security Growth Strategies Consultant with AWS Professional Services. He’s an active member of the People with Disabilities affinity group, and has built several Amazon initiatives supporting charities and social responsibility causes. Since 1998, he has been involved in IT Security at many levels, from implementation of cryptographic primitives to managing enterprise security governance. Outside of work, he enjoys running, cycling, fund-raising for the BHF and Ipswich Hospital Charity, and spending time with his wife and 5 children.

Liam Wadman

Liam Wadman

Liam is a Solutions Architect with the Identity Solutions team. When he’s not building exciting solutions on AWS or helping customers, he’s often found in the hills of British Columbia on his Mountain Bike. Liam points out that you cannot spell LIAM without IAM.