Amazon Web Services (AWS) is pleased to announce that the Winter 2024 System and Organization Controls (SOC) 1 report is now available. The report covers 183 services over the 12-month period from January 1, 2024, to December 31, 2024, giving customers a full year of assurance. This report demonstrates our continuous commitment to adhere to the heightened expectations for cloud service providers.
AWS strives to continuously bring services into the scope of its compliance programs to help customers meet their architectural and regulatory needs. Customers can reach out to their AWS account team if they have any questions or feedback about SOC compliance.
To learn more about AWS compliance and security programs, see AWS Compliance Programs. As always, we value feedback and questions; reach out to the AWS Compliance team through the Contact Us page.
If you have feedback about this post, submit comments in the Comments section below.
Every organization strives to empower teams to drive innovation while safeguarding their data and systems from unintended access. For organizations that have thousands of Amazon Web Services (AWS) resources spread across multiple accounts, organization-wide permissions guardrails can help maintain secure and compliant configurations. For example, some AWS services support resource-based policies that can be used to grant identities permissions to perform actions on the resources they’re attached to. With the management of resource-based policies frequently delegated to application owners, central security teams use permissions guardrails to help ensure that possible misconfigurations don’t lead to unintended access to these resources.
In this post, we discuss how you can use resource control policies (RCPs) to centrally restrict access to resources. We demonstrate how RCPs can help improve your security posture while allowing even more freedom to developers in managing their resources, thus reducing friction between central security and application teams. Using a sample use case, we uncover key considerations for designing and effectively implementing RCPs in your organization at scale.
We recommend implementing permissions guardrails, including RCPs, using the following iterative process, which consists of five phases (as shown in Figure 1).
This phased approach helps ensure an effective integration of RCPs into your security strategy, improving your security posture while helping to maintain business continuity. Let’s explore each phase of RCP implementation in detail and outline key considerations for an effective implementation strategy.
Phase 1: Examine your security control objectives
The first step in implementing RCPs is identifying areas where RCPs can help improve your security posture or optimize the implementation of controls for your organization’s specific security control objectives.
Your control objectives can be influenced by a variety of factors such as compliance and regulatory requirements, legal and contractual obligations, types of workloads, data classification, and your organization’s threat model. After your control objectives are well-defined and prioritized, identify those that can be achieved using RCPs.
Like SCPs, RCPs are designed to establish coarse-grained access controls, security invariants that rarely change and serve as always-on boundaries across a wide range of AWS resources in your accounts. RCPs aren’t for managing fine-grained access controls. You will keep using policies such as resource-based and identity-based policies to apply least-privilege permissions.
More specifically, the following are key control objectives that you can achieve using RCPs:
Establish a data perimeter around your AWS resources. For example, you can use RCPs to help ensure that only trusted identities can access your AWS resources.
Mitigate the cross-service confused deputy risk. You can use RCPs to help ensure that your AWS resources are accessed by AWS services only on behalf of your organization.
Apply consistent access controls to your AWS resources regardless of the identities accessing them. For example, you can use RCPs to help ensure your Amazon Simple Storage Service (Amazon S3) buckets require TLS v1.2 or higher for in-transit encryption.
Let’s begin with the scenario illustrated in Figure 2. Your company’s central cloud team manages your corporate AWS Organizations organization, which consists of two corporate AWS accounts. An IAM principal in Account A should be able to assume an IAM role in Account B to perform day-to-day operations. To align to the broader control objective of Only trusted identities can access my resources, the central security team wants to make sure that the IAM role in Account B (my resource) can only be assumed by IAM principals that belong to their organization (trusted identities).
Figure 2: Simple scenario depicting a trusted identity accessing an IAM role
One way of achieving this control objective is to follow the principle of least-privilege and make sure that the role trust policy, the resource-based policy attached to the IAM role, only allows access to identities that require that access. The following is an example trust policy that grants permissions to Role A in Account A to assume Role B in Account B.
In organizations that have only a few accounts, central teams typically manage these policies. While this centralized governance model helps ensure that trust policies applied to roles are always restricted to trusted identities, it can also impede the productivity of application teams when operating at a greater scale.
Assume that your company has started growing its cloud footprint so much that your central security team now must achieve the same control objective with hundreds of IAM roles that are spread across multiple AWS accounts, as demonstrated in Figure 3.
Figure 3: Restricting access by managing individual IAM role trust policies
At this scale, we see organizations delegating permissions management to application teams to better support the growth of their business and empower developers to innovate faster. While central security teams no longer have full control over the permissions granted to resources across AWS accounts, they must make sure that access is aligned with their organization’s security standard. For example, they might want to make sure that the GrantCrossAccountAccess statement that is now managed by developers doesn’t inadvertently grant access to an account that doesn’t belong to their organization. Previously, central security teams typically achieved this by developing automated mechanisms to insert a standard statement into all trust policies. This statement helped ensure that access remained bounded to their organization, even when developers configured broad access permissions for their roles. The following is an example trust policy where a developer granted permissions to an external account through the GrantCrossAccountAccess statement. However, because of the RestrictAccessToMyOrg statement added to the policy by the central security team, the external account will be unable to use these permissions.
The RestrictAccessToMyOrg statement uses the aws:PrincipalOrgID and aws:PrincipalIsAWSService condition keys to restrict access to principals within your organization or to AWS service principals. The BoolIfExists operator with the aws:PrincipalIsAWSService condition key is required if the roles you’re applying a control to are service roles that are used by AWS services to perform operations on your behalf. When an AWS service assumes a service role, it uses its AWS service principal, an identity that is owned by AWS and that does not belong to your organization.
The central security teams could, for example, use AWS Config rules to detect misconfigurations and then use AWS Config remediation to automatically add the RestrictAccessToMyOrg statement to the IAM roles’ trust policies when new IAM roles are created or their trust policies are changed. Even though the addition of the RestrictAccessToMyOrg statement to trust policies can be automated, RCPs can greatly simplify enforcement of such coarse-grained controls in a multi-account environment.
Phase 2: Design permissions guardrails
Central security teams can implement permissions guardrails by creating an RCP that centrally blocks external access to IAM roles. The RCP that you will implement contains similar restrictions to the RestrictAccessToMyOrg statement that you used in the IAM trust policy.
Like SCPs, you attach the RCP to an account, organizational unit (OU), or the root of your organization. After being attached, the RCP automatically applies to applicable resources—in this case, IAM roles—within the scope of that AWS Organizations entity. This centralized approach alleviates the need to modify hundreds of trust policies across multiple accounts, lowering the operational overhead for central security teams and helping ensure consistent access controls are applied at scale. RCPs also help you achieve separation of duties with developers still managing their least-privilege permissions in trust policies and administrators applying coarse-grained access controls in RCPs. If developers make configuration mistakes while managing permissions for their applications, the preventative access controls implemented using RCPs will help ensure that they stay within your organization’s access control guidelines. See How AWS enforcement code logic evaluates requests to allow or deny access to understand how different policy types impact the authorization process.
If you’re transitioning existing controls from resource-based policies to RCPs, use the opportunity to reassess the control design based on your current control objectives and the additional benefits offered by RCPs. For example, your previous controls might have been limited to specific resource types, such as IAM roles in this use case, or to particular accounts, such as those storing the most sensitive data. RCPs enable you to extend controls to additional resources across your entire organization, reducing operational overhead through centralized management of permissions guardrails.
While designing your RCPs, consider the following guidelines.
Design for operational excellence
A key foundation for effectively implementing and operating permissions guardrails like RCPs is organizing your AWS environment using multiple accounts. Account boundaries and strategic placement of workloads across them allow you to apply tailored access controls that align with data sensitivity and specific access requirements. Grouping accounts into OUs within AWS Organizations enables more effective access control, even in scenarios where cross-account access is required. Figure 4 illustrates an example organization structure, demonstrating how RCPs can be applied at various levels of the organizational hierarchy to adhere to the security requirements of different workloads.
Figure 4: A sample organization with RCPs applied at various levels
When operating at scale, consider delegating policy management to a central security account in your organization. With AWS Organizations resource-based delegation, central teams don’t need access to the management account for any SCP or RCP related changes or troubleshooting.
Establishing clear governance helps you define how to implement and continuously manage RCPs within your organization. This includes the operating model, change management processes, and exceptions handling procedures. RCPs provide authorization controls similar to SCPs and therefore should integrate with your existing governance framework rather than requiring separate oversight. For example, if your change management process requires two-person approval for SCP changes, you should consider applying the same approval process for RCP implementation. You should also adopt the same mechanisms you currently use to prevent unauthorized changes or detect drifts in your policies.
Plan for exceptions
There might be scenarios where you have a few resources that should be accessible publicly or by identities that don’t belong to your organization. If you’re organizing your resources across multiple accounts and OUs based on their compliance requirements or a common set of controls, then you most likely have such resources in a dedicated set of accounts or OUs, such as the Public Data OU in Figure 4. These accounts or OUs can have applicable policies that account for their unique access requirements.
Another option to accommodate these scenarios is to use the aws:ResourceAccount or aws:ResourceOrgPaths condition key to exclude certain accounts from the control. For example, the following policy will deny access to identities outside your organization from assuming IAM roles unless the identity is an AWS service principal or the role that is being accessed belongs to Account A.
There also might be situations where your company’s trusted partners or acquisitions need to be granted an exception for access to a subset of your company’s resources distributed across multiple accounts. For example, your company might integrate with Cloud Security Posture Management (CSPM) tools that assume roles in your accounts to assess your accounts’ security posture, as shown in Figure 5.
Figure 5: Representative view of granting exceptions to trusted partners
When implementing a control with an RCP that by default will apply to all resources of the entity it’s attached to, you can manage resource specific exceptions using the aws:ResourceTag condition key. In addition, use the aws:PrincipalAccount context key to conditionally grant exceptions based on the AWS account ID of the trusted partner.
Let’s examine the two statements in the preceding RCP:
RestrictAccessToMyOrgExceptTaggedRoles
This statement helps ensure that your roles can only be assumed by identities that belong to your organization or by AWS service principals, unless a role is tagged with partner-access-exception set to trusted-partner.
RestrictAccessForTaggedRoles
This statement further restricts access by helping ensure that the roles that have the partner-access-exception tag can only be assumed by identities that belong to your trusted partner account.
If you have a well-known, tightly scoped set of resources that need to be excluded, you can also use the IAM policy element, NotResource, to list the Amazon Resource Names (ARNs) of resources to exclude from the control.
When implementing tag-based exception processes, establishing strict controls over tag management is key. Unauthorized modifications of tags on resources, principals, or sessions could impact your security posture by enabling unintended access. You should implement controls to help prevent unauthorized tag manipulation. For example, the following SCP restricts the use of the partner-access-exception tag to the admin role so that unauthorized users cannot alter the control by attaching, detaching, or modifying the tag.
You should also make sure that the partner-access-exception tag cannot be passed as a session tag when identities assume roles. See the sample RCP in the data perimeter policy examples repository.
Phase 3: Anticipate potential impacts
Before rolling out RCPs, you need to understand their potential impact on your organization. Introducing new policies or modifying existing ones without proper validation can disrupt your security-productivity balance. Be aware that overly restrictive policies might inadvertently impede legitimate data flows that are essential for achieving your business objectives.
Another effective method to assess impact is to review and analyze your account activity using AWS CloudTrail. For example, if you centralize all your CloudTrail logs in an S3 bucket, you can use Amazon Athena to query these logs. Specifically, look for STS API calls made against your IAM roles by identities outside your organization. Then, compare the results with your list of known trusted partners and those you have already accounted for in your RCPs. Based on this analysis, determine if you need to add the partner-access-exception tag to additional IAM roles and further refine the policy before enforcement. This is essential to ensure trusted partner integrations continue to function as expected when you enforce your RCPs. Furthermore, use this analysis to identify any illegitimate access patterns in your environment and plan for necessary remediations, further enhancing your security posture as part of RCP implementation.
As you transition into the implementation phase, consider the following key factors to promote a smooth rollout while enhancing your security posture.
Deployment automation and integration
Use your existing deployment pipelines to implement RCPs, the same as you do for SCPs. This approach will minimize operational overhead while maintaining consistency in the deployment of your controls.
As with SCPs, AWS strongly advises against attaching RCPs in production environments without thoroughly testing the impact that the policies have on resources in your accounts. Follow standard CI/CD processes and begin your RCP rollout in lower environments by attaching them to individual test accounts or OUs first. After you validate that the controls behave as excepted, gradually promote the RCPs to upper environments.
If your goal is to transition an existing control from resource-based policies to RCPs, keep your resource-based policies in place while conducting the progressive rollout. After you have completed rolling out your RCPs and confirmed that they operate as expected, you can consider deactivating the automation you used to apply the control using resource-based policies. This approach lets you deploy RCPs without impacting your existing security posture or disrupting business workflows.
Additionally, consider deploying RCPs to a subset of resources or accounts first to limit the scope of impact and provide an opportunity to test and refine your deployment and operational processes. You can follow your standard prioritization approach to define deployment waves, for example, start with resources or accounts that store sensitive data or pose the highest risk, based on your current operational practices and other controls that might be in place. For additional best practices, see OPS06-BP03 Employ safe deployment strategies in the AWS Well-Architected Framework: Operation Excellence Pillar whitepaper.
Phase 5: Monitor permissions guardrails
Finally, establish monitoring processes to help ensure that controls for preventing external access to your resources operate as expected. You can use the same tools you used for impact analysis. For example, you can use IAM Access Analyzer external access findings to understand the impact of your RCPs on resource permissions. This information will help you verify that your RCPs are crafted in accordance with your intent and plan remediation actions, if required. You can also set alerts for occurrences of unintended access patterns observed in your CloudTrail logs.
Furthermore, follow the phased approach outlined in this post to regularly review and update your controls to help ensure that they align with evolving business and security objectives. Consider factors such as organizational changes, changes in partner relationships, data criticality shifts, and opportunities for expanding your RCP coverage. This continuous improvement process helps maintain the effectiveness of your security controls while supporting business growth and transformation.
Conclusion
In this post, we discussed how to effectively implement coarse-grained access controls on AWS resources at scale using RCPs. You can use the phased implementation approach described here to achieve your security control objectives while minimizing the risk of disrupting your business workflows. You can apply the same approach to implement other preventative controls, such as SCPs, across your multi-account environment.
Remember that RCPs, like SCPs, provide a powerful mechanism for enforcing coarse-grained controls across multiple accounts in your organization. They don’t replace your least-privilege controls and should be part of a broader, multi-layered approach to data security that includes other well-architected security design principles.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
Amazon Web Services (AWS) provides service reference information in JSON format to help you automate policy management workflows. With the service reference information, you can access available actions across AWS services from machine-readable files. The service reference information helps to address a key customer need: keeping up with the ever-growing list of services and actions in AWS. As new services launch and existing services expand their capabilities, you can now conveniently identify and incorporate available actions, resources, and condition keys for each AWS service into your policy authoring and validation workflows. As your business expands and your AWS footprint grows, you might decide to automate your policy management workflows. With the service authorization reference, you can build custom tools to make it easier to evaluate and use new actions, resources, and condition keys that AWS services introduce.
Getting started with service reference information
The service reference information is static information about the actions, resources, and condition keys available for each service in AWS. To obtain the list of AWS services for which reference information is available, go to the following URL: https://servicereference.us-east-1.amazonaws.com/v1/service-list.json
This URL endpoint provides a JSON file that contains an up-to-date catalog of AWS services with available reference information. By querying this endpoint, you can retrieve the most current list of services supported by the AWS Service Reference Information feature.
To retrieve the list of actions, resources, and condition keys for a specific AWS service, go to the following URL: https://servicereference.us-east-1.amazonaws.com/v1/<service-name>/<service-name>.json
Replace <service-name> with the name of the desired AWS service (for example, “s3” for Amazon Simple Storage service (Amazon S3) or “ec2” for Amazon Elastic Compute Cloud (Amazon EC2)). This URL endpoint provides a JSON file that contains the comprehensive list of actions, resources, and condition keys that are available for that particular service.
The following example shows the format of the output from the service-list.json file, which contains the service names and URLs for each service’s reference information:
You can navigate to the service information page by using the url field to view the list of permissions for the service. You can also download the JSON file to use in your policy authoring workflows. For example, you can download the permissions for Amazon S3 by following this URL: https://servicereference.us-east-1.amazonaws.com/v1/s3/s3.json
The following example shows a partial output of the permissions for Amazon S3. The AWS Identity and Access Management (IAM) actions are available in JSON format, and each action is its own JSON object. The Name field for those objects provides the name of the IAM action, the ActionConditionKeys field provides the available condition keys for this action, and the Resources field provides the available resources for this action.
What can you build with the service reference information?
Let’s explore how you can make use of the service reference information through practical examples. To help you get started, here are two custom tools that use the service reference information. You can find these tools in our GitHub repository, ready for you to use and adapt to your specific needs. You can download the source code for these tools by visiting the following links:
The SCP pre-processor provides a convenient way to write SCPs. You run the SCP pre-processor as a command-line tool. The tool takes a single, monolithic JSON file and runs a series of transformations and optimizations, then outputs a collection of valid service control policies that fit within policy size quotas. The tool uses AWS service reference information data in order to optimize lists of IAM actions.
Notification tool for new or removed IAM actions
You might find yourself needing to update various policies throughout your AWS environment when new IAM actions or services are released. You can use this tool to notify you when new services or new actions are added or removed. It works by downloading the service reference information and comparing it to the previous version of the file when the tool last ran. You can use these notifications to perform actions like automatically updating IAM policies when new actions are added or manually reviewing the notifications for new, sensitive actions.
The AWS service reference information makes it easier for you to create automation for policy authoring and validation. By providing the AWS service actions reference in JSON format, this feature enables you to create custom tools for policy authoring and management.
We’re excited to know what kind of policy authoring tools you can think up.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
The new IRAP report includes an additional six AWS services that are now assessed at the PROTECTED level under IRAP. This brings the total number of services assessed at the PROTECTED level to 164.
The following are the six newly assessed services:
AWS has developed an IRAP documentation pack to help Australian customers and their partners plan, architect, and assess risk for their workloads when they use AWS Cloud services.
The IRAP pack on AWS Artifact also includes newly updated versions of the AWS Consumer Guide and the whitepaper Reference Architectures for ISM PROTECTED Workloads in the AWS Cloud.
Reach out to your AWS representatives to let us know which additional services you would like to see in scope for upcoming IRAP assessments. We strive to bring more services into scope at the PROTECTED level under IRAP to support your requirements.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
This blog post explains the specific steps needed to integrate WorkMail with Okta via IAM IdC. WorkMail customers who use other 3rd party identity providers that are compatible with IdC will also find this post useful alongside the AWS documentation for their specific identity source. Alternatively, WorkMail organization administrators who do not use an external identity provider, but wish to enable multi-factor-authentication (MFA) for WorkMail viaIAM IdC will find guidance in a different blog post.
By integrating WorkMail with Okta via IAM IdC, users benefit from the security, convenience, and familiarity of SSO when accessing their WorkMail inbox and calendars via the WorkMail web-app. WorkMail organization administrators benefit from the security and efficiency of managing WorkMail user additions, modifications, or removals via the Okta admin tools. More detailed information about this integration can be found in the IAM IdC documentation as well as in the WorkMail documentation.
Introduction
Email remains a critical business communication tool, yet it is also one of the most targeted entry points for cyber threats. A single compromised email account can expose sensitive business data, damage an organization’s reputation, and serve as a gateway for further cyber attacks. As traditional username and password authentication becomes increasingly inadequate, organizations must adopt stronger access controls to protect their users and communications.
You can enhance your WorkMail organization’s security and simplify user authentication with SSO and a familiar login experience by integrating WorkMail with IAM IdC and Okta. This (and similar) integrations provide a more seamless authentication experience and helps administrators enforce consistent security policies across their organization. The integration between WorkMail, IAM IdC and Okta supports these industry standards:
Security Assertion Markup Language (SAML) – Which provides secure authentication and authorization by exchanging identity and privilege information between identity providers and WorkMail.
WorkMail authentication options
When you first set up WorkMail, you use the built-in user directory which relies upon standard username | password authentication for the WorkMail web browser client as well as popular desktop and mobile email clients such as Microsoft Outlook, Apple Mail and Thunderbird.
By default, WorkMail uses username | password authentication
The integration between IAM Identity Center and Okta uses the SCIM protocol to provision and map user attributes in Okta to named attributes in IAM IdC. Once enabled, user and group information from Okta is automatically synchronized with IAM IdC. IAM IdC in turn uses the SAML protocol to provide Okta users with a familiar and convenient SSO experience.
After integrating IAM IdC with Okta, WorkMail uses:
Okta SSO for users of the WorkMail web-app.
Personal access tokens (PAT) for users of desktop &/or mobile email applications.
After integration with IdC and Okta, WorkMail uses SSO (web-app) and username | PAT (desktop/mobile) for authentication
Prerequisites for the WorkMail, IAM IdC and Okta integration
Admin access to an AWS account
You can evaluate the integration in this post for a limited period of time using an AWS Free Tier Account
Admin access to an Amazon WorkMail Organization
The integration uses the email domain that is configured as WorkMail’s default
Admin access to Amazon IAM Identity Center
Or admin access to the IdC organization account
Your WorkMail and IAM Identity Center must be in the same AWS region
IAM Role with the security credentials needed to deploy AWS infrastructure via the AWS CDK
Administrator access to a fully functional Okta Identity account populated with users and (optionally) group(s) that you want to integrate and synchronize with IAM IdC and WorkMail.
Access to the AWS Samples GitHub repository, where you’ll find sample code that deploys via AWS CDK. This code sample deploys a customizable AWS Lambda initially configured to perform user synchronization between IAM IdC and WorkMail every 15 minutes.
A source code editor such as Visual Studio Code with your AWS admin credentials.
Test user access to WorkMail with and without Okta SSO (web-app) and Personal Access Token (desktop & mobile email clients)
Notify WorkMail users of upcoming changes to login procedures.
Switch WorkMail Authentication Mode to IAM IdC.
Detailed Configuration Guidance
The instructions that follow walk you through the steps needed to integrate your Okta system with IAM IdC. Once IAM IdC is synchronizing with Okta, you’ll integrate IAM IdC with WorkMail. This process will keep your WorkMail organization’s users and groups synchronized with Okta. Once fully enabled, your WorkMail users will use their Otka SSO credentials to access the WorkMail web-client or a username and unique Personal Access Tokens to access desktop &/or mobile email clients. Let’s get started!
Configure Okta integration with IAM Identity Center (these steps have been adapted from this blog post)
Open the Okta admin landing page in a new browser (tab).
Click the Applications Section in the left menu.
Click on Applications in the dropdown menu.
Click Browse App Catalog.
Type AWS IAM Identity Center in the search box.
Click on the app that matches AWS IAM Identity Center.
Click Add Integration to initiate the integration process.
Type a memorable name (like Workshop AWS IAM Identity Center) for the application label, and click Done.
Click on the Sign On tab, scroll down to view SAML Signing Certificates.
Click Actions to download the certificate to your local machine. Keep the Okta Admin dashboard Open, you will need the SAML information found here to configure IdC in the next section.
Configure AWS IAM IdC
In a new browser window (or tab), open the IAM IdC console in the same region as your WorkMail Organization (we’re using us-east-1).
If this is your first time accessing IAM IdC, you’ll be greeted with the IdC console home page prompting you to “Enable IAM Identity Center”. Click Enable.
Check that you are in the correct region, click Enable (note – this will enable an Organization instance of IdC, which is recommended. If you want to use an Account instance of IdC, please see the documentation).
Once IdC has been enabled, scroll down to the IAM Identity Center setup section and click Confirm Identity Source.
Click Actions, and select Change identity source from dropdown menu.
Select External identity provider and click Next.
Upload the SAML certificate you downloaded from Okta (above, step xx)
Open the Okta dashboard browser/tab. Under the SAML 2.0 configuration, scroll down and expand > More details.
Copy the Sign on URL from Okta SAML and paste the URL into the IdP sign-in URL field in IdC.
Copy the Issuer URL from Okta SAML and paste the URL into the IdP sign-in URL field in IdC.
Click Next.
Type ACCEPT in the confirmation text box & click Change identity source.
Switch back to the IAM IdC console browser window/tab. On the Settings page, locate the Automatic provisioning information box, and then choose Enable. Leave this browser window/tab open as you’ll need it to copy the values for the SCIM endpoint and Access token into Okta shortly (note – IdC Automatic provisioning uses SCIM protocol to provision users from Okta).
Return to the Okta admin dashboard (you should have left it open in another browser window or tab). Under IAM Identity Center applications, select the Workshop AWS IAM Identity Center and open the Provisioning tab.
Under Integration, click Configure API Integration.
Select the checkbox next to Enable API integration to enable provisioning.
Refer to the IAM IcC console browser window/tab you left open and copy (from IdC) and paste (into Okta) the values for:
Base URL field – paste the IdC SCIM endpoint value (make sure there is no trailing forward slash at the end of the URL).
API Token field, enter the Access token value.
Import Groups should be checked
Click Test API Credentials to verify the credentials entered are valid (the message AWS IAM Identity Center was verified successfully! should appear).
Save.
Under Settings, choose To App, click Edit and select Enable for Create Users, Update User Attributes and Deactivate Users. Click Save.
In the Okta admin dashboard’s left-hand menu pane, click on Directory.
Choose Groups to see the list of existing groups.
Click on the Add Group button to create a new group called workmail_users.
Fill in the required details for the new group, including:
Group Name: Enter workmail_users for the group.
Group Description (optional): Provide a description for the group’s purpose (WorkMail_users).
Save the New Group
Reload the page to see the newly created group
Click Assign people
Add Okta users to the workmail_users group
Click Save
In the Okta admin dashboard’s left-hand menu pane, click on Applications and open the AWS IAM Identity Center application.
Click the Assignments tab, choose Assign, then choose Assign to people.
Choose the Okta users that you want to have access to WorkMail via IAM Identity Center app.
For each user, click Assign, review the user info, and click Save then Go Back
When all Okta users for whom you want WorkMail users created/synced have been assigned, click Done to start the process of provisioning the users into IAM Identity Center.
On the Assignments page, choose Assign, and then choose Assign to groups.
Click the Okta group (workmail_users) that you want to have access to WorkMail via IAM Identity Center.
Click Assign, review the selections, click Save and Go Back. Click Done. This starts the process of provisioning the users in the workmail_users group into IAM Identity Center.
Choose the Push Groups tab.
Choose the workmail_users group that contains all the users that you assigned to the IAM Identity Center app.
If the group is not found in the list, choose the Find groups by name option by clicking (+)Push Groups Menu.
Enter the group name ( workmail_users) that you created in the previous step and select it from the search results
Click Save. The group status changes to “pushing” then “Active” (this can take several minutes), once the Okta workmail_users group and all its members have been pushed to IAM Identity Center.
Return to the browser window/tab with the IAM Identity Center console.
In the left navigation, select Groups (or Users) (you should see the Group (or users) list that was just populated by the Okta “push” action).
Enable IAM Identity Center in Amazon WorkMail
Open a new browser window (tab) to the WorkMail console in the same region as IdC, and open your WorkMail Organization.
In WorkMail’s left navigation rail click Identity Center
Click on Multi-factor authentication setup guide, Click Enable Identity Center, read the default configuration notice ,and click Enable
Under Identity Center Settings, click the Authentication mode tab and ensure it is set for WorkMail directory and Identity center (this is the default).
This mode is needed for testing and to allow desktop & mobile email client users to retrieve their unique personal access tokens.
Once the integration with IdC is successfully tested, and those WorkMail users who rely on desktop or mobile email clients for access have had the opportunity to login to the WorkMail web app to retrieve a PAT, you will change the Authentication mode to Identity center only.
Deploy the sample solution from GitHub
Solution prerequisites:
– AWS Account – AWS IAM user with Administrator permissions – Python (> v3.x) and Pip fully installed and configured on your computer – AWS CLI (v2) installed and configured on your computer – AWS CDK (v2) installed and configured on your computer – Sample CDK project that creates and deploys an AWS Lambda in your AWS account to perform users synchronization between IAM IdC and WorkMail.
The instructions that follow guide you in deploying a sample solution for the AWS Samples Serverless-Mail repository that programmatically creates/syncs WorkMail users from IAM Identity Center using the AWS CDK CLI. The sample uses naming conventions used in this blog. For example, the Lambda only syncs those users found in the IdC Group named "workmail_users". If your IdC group has a different name, or you want to sync multiple IdC groups with WorkMail, you will need to modify the sample code before proceeding. (BE AWARE: The sample code base is an Open Source starter project designed to provide a demonstration and a base to start from for specific use cases. It should not be considered fully Production-ready, nor should it be deployed in a production environment. Refer to the README.md and License.md for more information.)
Clone the sample project to your computer (git clone https://github.com/aws-samples/serverless-mail/tree/main/idc_okta-workmail). CD into the ‘idc_okta-workmail‘ directory.
Create a virtual environment for packaging in Python (Docker must already be running locally) with python3 -m venv .venv
Activate the virtual environment with source .venv/bin/activate
Make sure cdk.json references the correct path to Python (by default cdk.json refers to .venv/bin/python app.py). If you need to locate Python, run the which python command.
Install the project dependencies with pip3 install -r requirements.txt
Check the AWS CLI local credentials and region with aws sts get-caller-identity . If you need to change any parameter, run aws configure
Prepare the `app.py` file by running python3 get-my-variables.py . The ‘get-my-variables.py’ script fetches the necessary variables from your AWS account and create the `app.py` file needed by the CDK. You may want to review the auto-generated `app.py` file for accuracy, especially if the AWS account has been used for other IdC or WorkMail related projects. The app.py file requires the following account variables:
AWS accountId
AWS Region – the region must be the same for IdC and WorkMail.
IDENTITYSTORE_ID – – found in the Identity Center console under settings or via the AWS CLI aws identitystore list-groups --identity-store-id {identity_store_id}
IDENTITY_CENTER_INSTANCE_ARN – found in the Identity Center console under settings or via the AWS CLI aws sso-admin list-instances
IDENTITY_CENTER_APPLICATION_ARN – found in the Identity Center console under Applications or via the AWS CLI aws sso-admin list-applications --instance-arn {instance_arn}`
WORKMAIL_ORGANIZATION_ID – found in the WorkMail console under Organizations or via the AWS CLI aws workmail list-organizations
OKTA_GROUP_ID_TO_ASSIGN_TO_WORKMAIL – found in the Identity Center console under Groups > General Information or via the AWS CLI aws identitystore list-groups --identity-store-id {identity_store_id}
Bootstrap the package (this step may take several minutes to complete) with cdk bootstrap
Synthesize the package (this step may take several minutes to complete) with cdk synthesize
Deploy your package (this step may take several minutes to complete) with cdk deploy and follow the prompts in the AWS CDK CLI.
Once deployment to your AWS account starts, you can view the deployment status in the CloudFormation console.
Confirm WorkMail Authentication mode
WorkMail’s default Authentication mode should be set to WorkMail directory and Identity center. Don’t change this yet. This mode will allow WorkMail web-app users to continue to login to the WorkMail client directly, without Okta SSO. WorkMail’s Personal access token configuration default is set to active, and token lifespan set to 365 days. You can change this if your security needs differ from the default. PATs are used by desktop and mobile email clients to login to WorkMail. Leaving WorkMail’s default Authentication mode set to WorkMail directory allows desktop and mobile email clients to continue to login to the WorkMail with their username and password, without yet requiring a PAT instead of password.
Conduct a few spot tests to verify WorkMail web-app users can properly access their WorkMail accounts via:
The Amazon WorkMail web application URL (found on the WorkMail organization page)
Your organization’s unique AWS access portal URL (found on the setting page of the IAM IdC dashboard).
This will open a new browser tab that redirects to the Okta user login page.
Once user has authenticated, this page will redirect to the AWS access portal with linked application tiles to all of the AWS applications that have been integrated with IAM Identity Center.
Click on the WorkMail logo, and the WorkMail web-app will load.
Desktop or mobile email software users need to create a WorkMail Personal Access Token (PAT) to access WorkMail. These users must retrieve their own PATs after logging into the WorkMail web-client. Open the Webmail login link found on the WorkMail Organization page under User Login
In the WorkMail web-app,
click the settingsgear (in top right)
Click Personal Access token and enter a token name (i.e. desktop-thunderbird-pat)
Click Create token.
Click Create token.
Copy the token value (this is the only time you can retrieve this token value).
Open your desktop or mobile email software, enter your username and your PAT (the PAT replaces your existing user password).
In settings, click Personal Access token and Create token
Enter a token name (typically the device on which you’ll use this PAT) and select create token.
Copy the token value (this is the only time you can retrieve this token value).
Open your desktop or mobile email software, enter your username and your PAT (the PAT replaces your existing user password), and test your new login (username | PAT).
Notify your WorkMail users of upcoming changes to login procedures
Once you have tested the integration between WorkMail and IAM Identity Center with a few test users, you should prepare your WorkMail users for the increased login security. For example, you could send them an email that explains:
The organization is adding an additional login security step to help protect their inboxes.
Inform them that they should anticipate an email from the AWS IAM Identity Center with info about the upcoming implementation of MFA for web-app users and PATs for desktop and mobile client users.
Users should accept the invitation and create a new password for the AWS IAM Identity Center.
Inform them that once WorkMail MFA is enabled, all WorkMail web-app users will be required to use their username, password and MFA.
Inform them that once WorkMail PATs are enabled, all WorkMail desktop and mobile email client users will need to login to the WorkMail web client (with MFA) via the AWS access portal URL, create one PAT per software client (the same PAT can not be used on desktop and mobile). They then must update their desktop or mobile email software to use their username and PAT, instead of their current password. Explain that the PAT is now their email client application password and their personal desktop or mobile email software passwords will no longer work.
Provide users with a way to request support.
Switch WorkMail Authentication Mode to Identity Center only
Once you are satisfied that your WorkMail users have incorporated Okta SSO and/or PATs into their WorkMail login routines, the WorkMail administrator should disable the WorkMail directory Authentication mode found in the WorkMail console under Organization > Identity Center.
Optional – Ask several trusted users to test their ability to login to WorkMail web app via Okta credentials.
Congratulations, your WorkMail organization’s authentication is now being managed by IAM Identity Center and your external identity provider, Okta.
Conclusion
By integrating IAM Identity Center with Okta Identity and WorkMail, you can provide your users with the convenience of Okta SSO for the WorkMail web-app. This allows your organization to improve it’s email security posture, better safeguard sensitive communications and protect you WorkMail organization against unauthorized access. Furthermore, this integration reduces admin overhead as user access to WorkMail is allowed, or revoked via the Okta administration dashboard.
Take control of your email communications today with Amazon WorkMail:
Join the Conversation: Connect with other administrators and security professionals on the AWS re:Post community to share insights and learn best practices.
The IAR provides management and technical information security controls to help establish, implement, maintain, and continuously improve information assurance. AWS alignment with IAR requirements demonstrates our ongoing commitment to adhere to the heightened expectations for cloud service providers. As such, IAR-regulated customers can continue to use AWS services with confidence.
Independent third-party auditors from BDO evaluated AWS for the period of November 1, 2023, to October 31, 2024. The assessment report that illustrates the status of AWS compliance is available through AWS Artifact. AWS Artifact is a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact.
AWS strives to continuously bring services into the scope of its compliance programs to help you meet your architectural and regulatory needs. If you have questions or feedback about IAR compliance, reach out to your AWS account team.
To learn more about our compliance and security programs, see AWS Compliance Programs. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.
If you have feedback about this post, submit comments in the Comments section below.
AWS Key Management Service (AWS KMS) is pleased to launch key-level filtering for AWS KMS API usage in Amazon CloudWatch metrics, providing enhanced visibility to help customers improve their operational efficiency and aid in security and compliance risk management.
AWS KMS currently publishes account-level AWS KMS API usage metrics to Amazon CloudWatch, enabling you to monitor and manage your API usage. However, if you’re using numerous KMS keys, pinpointing the ones with the highest request rate quota usage or significant API costs becomes challenging. For example, if you have more than 10 active KMS keys in your account, prior to this launch you would have needed to build a custom CloudTrail and Amazon Athena based solution to locate which specific keys are driving the majority of API usage and costs. With the new CloudWatch metrics, which are available under the AWS/KMS namespace in CloudWatch, you can track, understand, and set alerts on detailed API usage at the individual KMS key level without building a costly customized solution.
This blog post explores several use cases to help you better take advantage of these newly introduced CloudWatch metrics to manage your AWS KMS API usage and costs. The use cases cover viewing and understanding your API usage at the key level, and creating CloudWatch alerts to detect unintentional runaway usage.
Overview of new CloudWatch metrics for KMS keys
With CloudWatch metrics for KMS keys, you can now do the following:
View the API usage for a specific KMS key, filtered by individual API operations (for example, Encrypt, Decrypt, or GenerateDataKey).
See the aggregated usage across cryptographic operations for a given KMS key.
Set up an alarm if a specific KMS key exceeds a specified threshold on a single API operation, or a set of API operations.
This streamlined approach allows you to quickly monitor, understand, and troubleshoot the API usage patterns of your KMS keys, without the overhead of the previous multi-step process. Let’s detail how these key-level API usage metrics can be used in two real-world examples.
Example 1: How to locate the KMS keys that consume the most API usage quota or contribute the most API charges
When you surpass your AWS KMS API request rate quotas, you can view your AWS KMS API utilization within the Service Quotas console. However, you might still find it cumbersome to identify the KMS keys that consume the largest amount of your request quota. When you receive the AWS KMS API charges that exceed your expectation, you can check the detailed billing usage in each AWS Region in Cost Explorer, but you cannot easily locate the KMS keys with the most API charges. This process becomes even more challenging when you manage a large number of KMS keys.
With the key-level API usage CloudWatch metrics, you can use the advanced metric query option to query CloudWatch Metrics Insights data with a user-friendly dialect of SQL to locate the KMS keys that consume the largest portion of the API usage quota or contribute the most API charges.
Walkthrough
To use Amazon CloudWatch Metrics Insights to identify the top 20 KMS keys that have the most cryptographic API usage up to the last 3 hours, complete the following steps:
In the navigation pane, choose Metrics, and then choose All metrics.
Choose the Multi source query tab.
For the data source, choose CloudWatch Metrics Insights.
You can enter the following example query in Editor view:
Note: In Builder view, the metric namespace, metric name, filter by, group by, order by, and limit options are shown. In Editor view, the same options as in Builder view are shown in query format.
SELECT SUM(SuccessfulRequest)
FROM SCHEMA("AWS/KMS", KeyArn, Operation)
GROUP BY KeyArn
ORDER BY MAX () DESC
LIMIT 20
Choose Run in the Editor view or Graph query in the Builder view.
Example 2: How to set a new detailed alarm on unintentional runaway AWS KMS API usage
Running big data processing workflows that read Amazon Simple Storage Service (Amazon S3) files encrypted by KMS keys is a common scenario for analytics, business reporting, or machine learning projects. Typically, these workflows read a limited number of files from S3 on each invocation. However, misconfigured workflows could unintentionally read a large number of S3 files, which could result in exceeding your AWS KMS API request rate quotas or incurring undesirable charges due to spiky AWS KMS API usage. Historically, to address this issue, you would have had to build a customized alarm system by following these steps: 1) send AWS CloudTrail events generated by AWS KMS to Amazon CloudWatch Logs; 2) write queries in Amazon CloudWatch Logs Insights to track your API request usage; and 3) enable anomaly detection on the corresponding CloudWatch Log Insights math expression.
Now, with key-level API usage CloudWatch metrics, you can directly enable anomaly detection on these metrics to set up alarms for anomalous AWS KMS API usage patterns. This provides a more streamlined and efficient way to monitor and detect potential runaway workflows. By using these CloudWatch metrics and anomaly detection capabilities, you can proactively identify and address unintended increases in AWS KMS API usage, helping to avoid unexpected charges or service disruptions in your analytics, reporting, or machine learning pipelines.
Walkthrough
Consider a scenario where you have an analytics workflow that runs frequently, which uses the Decrypt AWS KMS API operation on a KMS key to decrypt and read data from S3. You would like to enable anomaly detection on the KMS key to trigger an alarm when the Decrypt call volume to the specific KMS key sees a discernible trend or pattern. To do so, complete the following steps:
In the navigation pane, choose Metrics, and then choose All metrics.
Choose KMS, and then choose KeyArn, Operation.
In the search bar, enter the Amazon Resource Name (ARN) of the key, and then choose Search. Select the CloudWatch metric you would like to enable anomaly detection for.
Navigate to Graphed metrics, and using the Statistic and Period drop-down lists, choose the statistic and period that you would like to monitor. Then you can enable anomaly detection by selecting the Pulse icon.
Figure 1: How to enable anomaly detection on a SuccessfulRequest metric
Figure 2: Anomaly detection is enabled on the SuccessfulRequest metric. The gray band illustrates the expected range of values and the anomaly is in red
Conclusion
This blog post highlighted the newly introduced key-level filtering capability for the AWS KMS API usage in CloudWatch. We showed two real-world use cases to demonstrate how you can use the new CloudWatch metrics. These use cases include improving operational visibility, setting up proactive alarms on anomalies in KMS API usage patterns, and potentially tracking detailed key usage for compliance purposes.
If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread in the AWS Key Management Service re:Post.
Every day, I talk with security leaders who are navigating a critical balancing act. On one side, their organizations are moving faster than ever, adopting transformative technologies like generative AI and expanding their cloud footprint. On the other, they’re working to maintain strong security controls and visibility across an increasingly complex landscape. We all know that adding more tools and controls isn’t sustainable. We need a different approach to security at scale.
re:Inforce 2025: Your roadmap to security that powers innovation
This is what shaped our vision for AWS re:Inforce 2025. When done right, security at scale becomes a business accelerator, helping organizations move faster and more confidently in the cloud. This is more than just a philosophy; it’s a practical reality I’ve seen proven time and again by our customers, and it’s what we want to help every organization achieve.
At re:Inforce, we’ll share a vision for simplifying security at scale that’s deeply rooted in our experiences supporting millions of customers worldwide. We’ll explore how organizations are building inherently resilient applications that can withstand modern threats while accelerating innovation. I’m particularly excited to showcase real customer examples and architectural patterns that demonstrate how security better enables your business goals.
An environment built for learning cloud security
There’s a reason we created re:Inforce as a dedicated in-person security event. While I love our broader AWS events, security practitioners need space to dive deep into implementation details, ask tough questions, and work through complex scenarios. At re:Inforce, you can grab a whiteboard with the engineers who built our security services, collaborate with security partners, and schedule personal time with our leaders to tackle your specific security needs. It’s the kind of environment where real learning happens.
We’ve designed multiple learning paths to meet you wherever you are in your security journey. With over 250 technical sessions, you’ll find content that matches your needs – whether you’re looking to automate security controls, align development and security teams, or transform your security operations. You’ll find interactive workshops where you’ll build solutions in real-time, small-group technical deep-dives, hands-on labs where you can test new approaches, and solution-building sessions with AWS experts. Best of all, 70% of our content is at advanced or expert level, making sure you get the detailed implementation guidance you need.
I invite you to join us for three days that will transform how you think about and implement security in the cloud. Registration is now open, and I encourage you to secure your spot early—based on previous years, spots will fill up quickly. Join us to explore how simplified, scalable cloud security can fuel your organization’s future.Register today with the code SECBLObhZzr9 to receive a limited time $300 USD discount, while supplies last.
If you have feedback about this post, submit comments in the Comments section below.
Containerization offers organizations significant benefits such as portability, scalability, and efficient resource utilization. However, managing access control and authorization for containerized workloads across diverse environments—from on-premises to multi-cloud setups—can be challenging.
This blog post explores four architectural patterns that use Amazon Verified Permissions for application authorization in Kubernetes environments. Verified Permissions is a scalable permissions management and fine-grained authorization service for your applications.
In this blog post, we cover the following patterns and discuss their trade-offs:
Calling Verified Permissions from an Amazon API Gateway API fronting your application in Kubernetes
Calling Verified Permissions from a Kubernetes Ingress controller component
Calling Verified Permissions from a sidecar container running in the same pod as the application container
Calling Verified Permissions from the application container
Understanding these patterns and their implications can help you implement secure and consistent authorization mechanisms across your entire infrastructure without compromising the scalability, portability, and resource efficiency of your containerized workloads.
Consistent authorization through centralized policy management
Access to application resources can be secured more effectively with a centralized and consistent approach to authorization. Especially in containerized environments with distributed architectures and shared resources, traditional access control methods, like embedding authorization logic within individual application code or relying on local access control policies, can become difficult to manage and prone to errors. This becomes even more challenging when you have a combination of on-premises and cloud setups.
A centralized authorization solution empowers developers to implement consistent access control across individual components of an application efficiently. Benefits include reduced duplicate work, an improved security posture, and lower complexity in managing and enforcing access control policies.
Verified Permissions benefits in a containerized environment
Amazon Verified Permissions provides several key benefits as an external authorization service:
Benefits for the platform engineering team – Centralized authorization enables platform engineering teams to implement, maintain, and govern authorization policies across the organization without requiring changes to individual applications. This aligns with modern platform engineering practices, where platform engineering teams can provide authorization as a service to application teams, promoting consistent security standards while reducing the operational burden on development teams.
Consistent authorization across environments – With Verified Permissions, you can define and manage access control policies in a centralized location. This makes it easier to apply consistent authorization rules across your entire infrastructure, including on-premises deployments and different cloud environments.
Simplified application development – Externalizing authorization logic from applications reduces development complexity. Developers can focus on core application functionality without having to implement and maintain authorization mechanisms within each service or component. This separation of concerns promotes code modularity, reusability, and faster iteration cycles.
Scalable and highly available – Verified Permissions is a managed service, designed to be scalable and highly available out of the box. As your containerized workloads grow in scale and complexity, Verified Permissions can handle increasing authorization request volumes while maintaining performance and availability.
Fine-grained access control – Verified Permissions supports attribute-based access control (ABAC) and role-based access control (RBAC). This allows you to define granular policies in the open source Cedar language based on various attributes like user roles, resource properties, environmental factors, and more.
Integration patterns for authorization
Kubernetes provides many options for architecting applications. Therefore, there are multiple locations in a typical architecture where authorization decisions can be enforced, as shown in Figure 1.
Figure 1: Integration points for authorization in containerized workloads
The workflow is as follows:
API Gateway. Organizations can use entry points to the application outside of the Kubernetes cluster, such as an API gateway, to obtain an authorization decision. In AWS, Amazon API Gateway enables customers to use authorizer Lambda functions to send an authorization request to Verified Permissions.
Ingress controller. The Kubernetes API defines Ingress objects, which provide load balancing and routing functions on layer 7. Common Ingress controllers like Traefik offer the option to integrate external authorization services.
Sidecar proxy container. You can intercept every request routed to the application by using a sidecar container running in the same pod as the application container. This sidecar container calls Verified Permissions for authorization decisions.
Application container. Developers can use the Amazon SDK to communicate with Verified Permissions from inside the application when an authorization decision is needed.
In the following sections, we explore each of these patterns in detail, examining their implementation, use cases, and specific considerations. At the end of our discussion, we provide a comprehensive comparison table to help you choose the most appropriate pattern based on factors such as scalability, performance, maintenance overhead, and specific use case requirements. This will help you make an informed decision about which pattern best suits your application’s needs.
Authorization workflow
Independent of which of the four mentioned options for authorization you choose, the overall authorization workflow, shown in Figure 2, will stay the same.
Figure 2: Authorization workflow with Amazon Verified Permissions
The workflow is as follows:
Authentication. The user first authenticates with an identity provider to obtain a JSON Web Token (JWT). You can configure the identity provider to write relevant information like user roles, tenant ID, or other needed user attributes into the JWT. You can then use this information later to make an authorization decision.
API request. The user makes a request to your application that includes the JWT.
Authorization information. Your application extracts the relevant information that is needed to make an authorization decision from the request. This can include principal information from the JWT, information about the resource that the user requests, and what action the user wants to perform.
(Optional) Policy information point lookup. Depending on your policies, you might need additional information in order for Verified Permissions to make an authorization decision. For example, you can query ownership details for a document from a database.
Authorization decision. You then send the relevant information to Verified Permissions, which returns a decision stating whether the request is permitted or forbidden.
Authorization enforcement. You then enforce the decision from Verified Permissions in your application by allowing or denying an action. For a REST API, this would result in sending back an HTTP 403 forbidden status if the request was denied, or processing the request if it was allowed and sending an HTTP 200 OK status.
Authorization outside of the cluster by using Amazon API Gateway
In this pattern, authorization decisions are made at the API gateway layer before requests reach the Kubernetes cluster. When a request arrives at the API gateway, it triggers an authorization check with Verified Permissions to evaluate the request against defined policies. Based on the Verified Permissions response, the gateway either forwards the request to the containerized application or denies access.
This pattern excels in scenarios where you need coarse-grained access control that can be enforced with information accessible at the API level (such as an HTTP header or ID or access token) and that supports RBAC and ABAC. Consider a document management application where different users have access to different documents based on group membership or identity attribute.
This approach to authorization works consistently regardless of whether your application runs in containers, virtual machines, or serverless environments. The API gateway acts as a unified control point for enforcing access policies across backend services.
For implementations that use Amazon API Gateway specifically, you can use Lambda authorizers to integrate with Verified Permissions. For each incoming API request, API Gateway invokes the authorizer Lambda function, which makes a call to Verified Permissions to evaluate the request against the defined authorization policies, as shown in Figure 3.
Figure 3: Integration of Amazon Verified Permissions in Amazon API Gateway
Another option to implement coarse-grained access control in use cases as described in the previous section is to use a Kubernetes Ingress layer. Some customers prefer Kubernetes-native solutions, especially if they need to run Kubernetes clusters within and outside of AWS.
Kubernetes provides an API to create and maintain Ingress objects, operating at layer 7 (the application layer), which enables routing decisions based on HTTP attributes. This layer 7 capability makes Ingress controllers ideal for implementing authorization checks.
One Kubernetes Ingress controller that supports external authorization is Traefik Proxy. With this feature, you can delegate authorization decisions to an external service like Verified Permissions before routing requests to the application container.
Assuming that the authorization endpoint is backed by a service in the same Kubernetes cluster, the architecture looks as shown in Figure 4.
Figure 4: Integration of Amazon Verified Permissions in a Kubernetes Ingress
The workflow is as follows:
Authenticated users access the service through an Elastic Load Balancer of type Network Load Balancer (NLB).
The NLB—operating at layer 4—exposes a Kubernetes Ingress inside the cluster that provides layer 7 capabilities. The Ingress object is implemented by an Ingress controller that supports external authorization, as described earlier.
The Ingress forwards the request—or parts of it—that needs authorization to a local authorization service in the cluster. We use a dedicated authorization service in this architecture because the Ingress backend service allows an external endpoint to be called for authorization.
The authorization service is deployed into its own Kubernetes namespace with a dedicated Kubernetes service account. EKS Pod Identity provides the ability to link the service account in this namespace to an AWS Identity and Access Management (IAM) role that grants access to Verified Permissions by injecting temporary AWS access credentials into the pod at runtime.
The authorization service extracts relevant information from the request and sends it to Verified Permissions for an authorization decision.
The Ingress for the backend service awaits the response of the authorization service and forwards it to the backend service, if access is granted. The Ingress expects the authorization service to respond with HTTP status code 200 for authorized requests. If the Ingress receives HTTP status code 403, the requester is not allowed to access the requested resource, and the Ingress will block the request at this stage.
Only authorized requests are forwarded to registered backend pods.
Because integration with external authorization services is not part of the Kubernetes Ingress API, you need to consult the documentation of the Ingress controller that you decide to use to determine the availability of this feature and its implementation details. Forward authentication of the Traefik Kubernetes Ingress supports this pattern and can be configured with the vendor-specific annotations described in the Traefik documentation.
Authorization in sidecar containers
Not all Ingress controllers support integration with an external authorization service. Amazon Elastic Kubernetes Service (Amazon EKS) customers might prefer the AWS Load Balancer Controller to manage the lifecycle of NLBs and ALBs for their services. Customers can continue using their existing Ingress controller, even if it does not support calling external authorization services today. You can move the authorization of requests behind the Ingress layer with the sidecar container pattern.
Sidecar containers are a common pattern for extending an application’s functionality in Kubernetes. A sidecar is a container running in the same pod as the application it relates to. This means that the sidecar and application follow the same lifecycle and share resources, such as the network ID. This pattern is a good fit when the authorization logic is service-specific. Because the authorization service is deployed alongside the application, this pattern also provides better support in situations where changes to the application demand changes in the authorization logic.
Consider a document management system where access control depends on document metadata and team structures. When a user attempts to edit a document, the sidecar queries the document’s metadata, such as the classification level, tags, and department ownership. The sidecar can also check the organizational team hierarchy to understand reporting relationships and access privileges. This context enables fine-grained authorization decisions that consider not just who the user is, but also, for example, their organizational context or the individual document’s metadata.
Although it’s possible to configure sidecar proxies such as Envoy for individual pods manually, the more convenient option is to introduce a service mesh. A service mesh provides a control plane to manage proxies, including centralized configuration, automated injection of sidecars, and an Ingress layer for traffic routing. Istio is a popular option for a service mesh in Kubernetes.
The diagram in Figure 5 shows the deployment architecture to implement authorization with Verified Permissions in a service mesh.
Figure 5: Integration of Amazon Verified Permissions in a Kubernetes Ingress
The workflow is as follows:
Authenticated users access the application through an NLB.
The request is routed through an Ingress in the Kubernetes cluster.
The Ingress forwards the request directly to the backend service.
Pods of the backend service consist of multiple containers. Each request is routed through an Envoy proxy first.
The Envoy proxy forwards the request to a co-located container running the authorization service.
Pod Identity is used to map an IAM role to a Kubernetes service account bound to the pod, which enables the authorization sidecar to invoke Verified Permissions for an authorization decision. Note that each container in this pod has access to the IAM credentials that are mapped to the service account.
The Envoy proxy awaits the response of the authorization sidecar and blocks or forwards the request to the backend container, depending on the Verified Permissions authorization decision.
When Istio is deployed into a Kubernetes cluster, it introduces Custom Resource Definitions (CRDs) for managing the service mesh. The authorization workflow can be implemented using the ServiceEntry CRD and an Istio Authorization Policy. The authorization service running as a local container in the application pod becomes a registered service entry in the mesh. This service entry can then be configured in an authorization policy as the target for request authorization in the proxy. For more details, see the External Authorization section in the Istio documentation.
Application container
When it comes to integrating Verified Permissions directly within your application container, you have the advantage of fine-grained control over authorization decisions at the application level. This approach allows for more context-aware authorization checks and can be useful when you need to make authorization decisions based on application-specific data that you can query from a policy information point.
Unlike the sidecar pattern, where authorization happens before the request reaches your application code, this approach lets you gather the necessary context from your application state, databases, or other services before making the authorization call. This is particularly valuable when the authorization logic is deeply intertwined with business logic or requires data that’s only available within the application context. This pattern also supports minimizing the number of authorization requests, if, for example, only a subset of requests processed by a monolithic service require authorization.
However, it’s important to note that this tight coupling between authorization and business logic makes the system more brittle and susceptible to breakage when functional or business logic changes occur. This means that modifications to your application code might require careful consideration of their impact on the authorization logic, potentially increasing maintenance complexity.
The architecture for authorization requests from the application container is shown in Figure 6.
Figure 6: Integration of Amazon Verified Permissions in the application container
The workflow is as follows:
Authenticated users access the application through an Elastic Load Balancer—either Application Load Balancer (ALB) or NLB depending on workload requirements.
The Kubernetes service or Ingress for the backend application is directly registered at the ALB or NLB by the AWS Load Balancer Controller.
Requests are directly routed to a pod that is backing the service.
The backend application’s logic is responsible for identifying whether a request needs authorization. The backend uses an IAM role injected at runtime through Pod Identity, when an authorization decision from Verified Permission is needed. The backend application returns HTTP status code 403 if the decision is a deny; otherwise it will continue processing the request.
You now have a set of patterns to introduce authorization into your containerized workloads. You need to consider multiple factors and understand the trade-offs that come with each pattern to identify the best option for a given scenario. In the following table, we list certain areas with influence on your architectural decisions.
Granularity of authorization decisions
API gateway
Authorization on API or service level
Decision based on information from HTTP request
Suitable for consistent coarse-grained authorization
Ingress controller
Authorization on API or service level
Decision based on information from HTTP request
Suitable for consistent coarse-grained authorization
Sidecar proxy
Authorization on service level
Decision based on information from HTTP request and service domain (such as policy information point or static service-specific rules)
Suitable for service-specific authorization with low or mid-level complexity for decisions
Application container
Authorization on code level
Decision based on information from HTTP request and arbitrary business logic
Unauthorized requests don’t consume application pod resources
Sidecar proxy
Authorization services increases resource demand of each pod in the cluster
Unauthorized requests consume application pod resources
Application container
Authorization service consumes resources only when authorization is performed
Unauthorized requests consume CPU cycles of application logic until the authorization logic is triggered
Scalability
API gateway
Fully managed serverless service
Ingress controller
Scaling of Ingress pods needs to be defined
Cluster auto-scaling or capacity planning for compute resources in cluster needed
Sidecar proxy
Existing scaling policies can be leveraged
Application container
Existing scaling policies can be leveraged
Performance
API gateway
Invokes Verified Permission in AWS for each request that needs authorization
Supports caching to reduce number of requests to Verified Permission out of the box
Ingress controller
Invokes Verified Permission in AWS for each request that needs authorization
(Optional) Integration of avp-local-agent to minimize number of requests to Verified Permissions
Sidecar proxy
Invokes Verified Permissions in AWS for each request that needs authorization
(Optional) Integration of avp-local-agent to minimize number of requests to Verified Permissions
Application container
Invokes Verified Permissions in AWS only if the business logic processing a request requires authorization
(Optional) Integration of avp-local-agent to minimize number of requests to Verified Permissions
Cost
API gateway
Consumption-based costs depending on requests received and data transferred out, and optionally cache size
Consumption-based costs to invoke Verified Permissions
Enabling a cache potentially reduces costs per requests
Ingress controller
Adds to infrastructure costs depending on the underlying compute resources needed to run the fleet of pods for Ingress and authorization service
Consumption-based costs to invoke Verified Permissions, which can be minimized by integrating avp-local-agent
Sidecar proxy
Adds to infrastructure costs depending on the underlying compute resources needed to run the sidecar containers for the proxy and authorization service in each pod
Consumption-based costs to invoke Verified Permissions, which can be minimized by integrating avp-local-agent
Application container
Typically minimal additional costs, since authorization code shares the application’s resources
Consumption-based costs to invoke Verified Permissions, which can be minimized by integrating avp-local-agent
Portability
API gateway
Portability is limited and depends on API Gateway with functionality for custom authorization
Ingress controller
Portable across Kubernetes environments with Ingress controller supporting external authorization
Sidecar proxy
Portable across Kubernetes environments
Application container
Highly portable without dependencies to underlying components
Complexity
API gateway
Offered as central component by platform engineering team to offload complexity of authorization from product teams
Changes in authorization service impact product teams
Ingress controller
Offered as central component by platform engineering team to offload complexity of authorization from product teams
Changes in authorization service impact product teams
Sidecar proxy
Platform engineering teams provide standardized patterns (and optionally implementation) for authorization that can be integrated and implemented by product teams
Increases autonomy of individual teams
Application container
Full responsibility for authorization lies with the product teams
Increases autonomy of individual teams
Not all services have the same requirements for authorization. You can also combine the patterns discussed in this post. You can, for example, put a basic authorization workflow with coarse-grained access control in front of the majority of your services. You can then rely on sidecar proxies with policy information points to inject additional, dynamic context into authorization decisions for specific services. Lastly, if certain use cases of a service demand complex authorization decisions, you can fall back to application-level authorization for specific parts of your code base.
Conclusion
In this blog post, we explored four patterns for integrating Verified Permissions into a containerized environment. We discussed the benefits and considerations of implementing Verified Permissions at different levels: from an API gateway outside of Kubernetes clusters, by means of Ingress controllers and sidecar proxies as network components inside the Kubernetes cluster, to authorization within the application itself. We saw how each pattern offers unique advantages. We also discussed considerations for finding a suitable option for your situation and when to combine patterns.
By using Verified Permissions, organizations can implement consistent, fine-grained authorization across their containerized workloads, whether they’re running on-premises, in the cloud, or in hybrid environments. This centralized approach to authorization can enhance security and simplify policy management and application development.
[Amazon Simple Email Service] provides a secure email solution that scales with your business needs. Unfortunately, all email systems, including Amazon SES, remain the primary target for spammers and bad actors due to email’s widespread use and accessibility.
While SES offers powerful features for application-based email sending, its SMTP credentials require careful management to prevent unauthorized access. Compromised credentials enable bad actors to send malicious emails through legitimate domains, which can bypass security filters and damage sender reputation.
To protect your SES implementation, you must encrypt SMTP credentials during storage and transmission. Additionally, implementing role-based access controls helps restrict credential access to authorized personnel only. Regular credential rotation at fixed intervals, typically every 90 days, minimizes potential security breaches. Automating this rotation process eliminates human error and ensures consistent security practices across your organization.
Problem Statement
Imagine you are the administrator for a large financial institution. You recently began using Amazon SES to send email from two dozen on-premises servers. Your email servers authenticate with SES using SMTP credentials to access the SES SMTP interface. Your organization’s security policies mandate regular credential rotation, including the ability to rotate them on-demand. How can you automate SMTP credential rotation such that you can meet your organization’s security policies?
This blog post will present two solutions that automate the secure management and automatic rotation of SMTP credentials for Amazon SES. Each will help enhance email security, comply with regulations, and minimize operational overhead.
Both solutions provide SES customers who use SMTP with additional tools to improve email security, ensure compliance, and reduce operational overhead. You can deploy the option that best suites your needs by following the guidance in this blog post.
If your environment supports automated rotation, AWS Systems Manager Documents (SSM Documents) can help by providing pre-defined or custom automation workflows for securely managing secrets rotation, deploy Option 1.
If your environment does not support automated rotation, you can still implement an auditable, managed rotation solution by storing your secrets in AWS Systems Manager Parameter Store by deploying Option 2.
As a pay-per-use platform, the underlying AWS services used in either deployment option will only charge you for the resources that you actually consume. You can leverage the AWS Pricing Calculator to estimate the run-time costs for your specific workload. Alternatively, you can work directly with your AWS account team to understand the pricing for these solutions.
Getting SES SMTP Credentials
To send emails through the Amazon SES SMTP interface, email servers must first authenticate with SES using dedicated SES SMTP credentials. Typically, a systems administrator logs into the AWS SES console, clicks the Create SMTP Credentials button, and navigates to the AWS Identity and Access Management] (IAM) console. There, the administrator creates an IAM user with permissions for SES. The administrator then uses the IAM user’s secret access key to generate the SES SMTP password, which they use to configure their email servers or SMTP-enabled applications for use with SES.
The SES SMTP interface authenticates requests using an SMTP credential derived from an IAM user’s access key ID and secret access key. Since temporary access keys cannot be used to derive SES SMTP credentials, you must deploy and regularly rotate a long-lived key.
While the manual process of creating SES SMTP credentials works for a small number of credentials, it becomes cumbersome for customers with numerous email servers or strict password rotation policies. These customers may find the automated credential rotation mechanisms described in the following solutions better suited to their production needs.
Option 1 – Fully Automated Credential Rotation:
The fully automated version of this solution uses a custom Lambda function to create an SMTP password, which is stored in AWS Secrets Manager. AWS Secrets Manager’s built-in rotation feature then triggers the rotation of SES SMTP credentials. AWS Systems Manager Documents utilize AWS Systems Manager Agents to automatically make the changes to the authentication configuration on email servers.
The key advantages of using AWS Systems Manager to make the email server configuration changes include:
Ability to deploy changes to on-premises and Amazon EC2 hosts, allowing rotation of secrets across a hybrid estate. Customization of the document to specific email software configurations. Targeting the secret (SMTP credential) rotation document on all email servers based on tags.
Let’s dive deep into Option 1 – Fully Automated Credential Rotation.
How Option 1 works:
Refer to the image above for the workflow:
AWS Secrets Manager initiates a rotation request, either on a schedule or via an authorized user’s request, triggering the “rotation Lambda” to rotate the SES SMTP credentials.
The SES Secret Rotation Function Lambda (see figure x above):
a. Creates a new IAM secret access key for the designated SES IAM user, derives the new SES SMTP password, and stores it in AWS Secrets Manager.
b. Connects to SES to verify the new SMTP password can authenticate.
c. Initiates an AWS Systems Manager Run Command to update the new SMTP password on target email servers using a pre-configured Systems Manager Document.
d. (and e.) Monitors the status of the Systems Manager Document execution until all updates complete successfully
f. Deletes the old IAM access and secret access keys.
With this fully automated solution, SES SMTP credentials can be rotated on a schedule or triggered manually, with no impact to email service uptime.
Deploying the Fully Automated Solution in Your AWS Account (Option 1)
Prerequisites for the Fully Automated Solution
AWS Account Access, typically with admin-level permission to allow for the deployment.
Alternatively, you can use the AWS CLI from the AWS CloudShell in your browser.
Clone the Github repository (for this solution, you only need the README.md and sesautomaticrotation.yaml files found in /ses-credential-rotation/automatic-rotation).
Note – We follow the principles of least privilege in this solution. The CloudFormation templates we’ve supplied require you to specify an identity, or configuration-set resource to use in the SES sending operation. You can find guidance on defining these values at Actions, resources, and condition keys for Amazon SES. Additionally, we’ve limited the IAM User to the ses:SendRawEmail action, which you can adjust as appropriate).
Console access to your AWS SES account that is properly configured to send emails via at least one verified identity.
Target email server(s) properly configured to send email via SES using SES SMTP authentication.
The AWS Systems Manager agent(s) must be correctly installed and configured on your target email server(s) as detailed in Setting up AWS Systems Manager.
The target email servers must be decorated with the tag (“SSMServerTag“) and value (“SSMServerTagValue“). These values allow the Systems Manager Document to identify them.
We use the tag “EmailServer” and the value “True” in our example, but you can use any tag and value that you wish).
An email address (or list) to receive SMTP rotation notifications.
Console access to your AWS Secrets Manager.
Console access to your AWS Systems Manager.
Deployment Steps
Clone the GitHub repository to your IDE
If using AWS CloudShell, ensure you are in the same region as your AWS Systems and Secrets Manager
Update the appropriate AWS Systems Manager sample document created by the CloudFormation Template to reflect your email server environments. These can be found in the AWS Systems Manager console under Documents > Owned by me
The ExampleWindowsIISSMTPSESpasswordrotator sample provides an example for Microsoft Windows hosts using the runPowerShellScript action to update the server’s SMTP credentials.
The ExamplePostfixSESpasswordrotator sample provides an example for Linux hosts using the runShellScript action to update the server’s SMTP credentials.
The partially automated version uses a custom AWS Lambda function to create an SMTP password, which is stored in AWS Systems Manager Parameter Store. This solution simplifies credential rotation, where manual changes must be conducted by support staff. By wrapping the manual change process with AWS Step Functions, you can ensure a robust and auditable process to regularly rotate the SES SMTP credential.
How Option 2 works:
The credential rotation AWS Step Function creates a new SES SMTP credential and updates it in AWS Systems Manager Parameter Store.
It retrieves a list of servers from an Amazon DynamoDB table and launches a manual confirmation AWS Step Function execution for each server to initiate and track the manual step.
The manual confirmation AWS Step Function emails the designated address, requesting support staff to arrange the rotation. The email includes a link specific to that server.
The person completing the manual change confirms back to the AWS Step Function via the link that the rotation is complete.
Once the rotation on a server is confirmed, the manual confirmation AWS Step Function for that server is marked as complete.
After all server rotations are complete, the credential rotation AWS Step Function continues, disabling the old SES SMTP credential and deleting it after a few days.
AWS Step Function executions can last up to 365 days, providing sufficient time for the manual rotation and confirmation.
The screenshot below shows a graphical representation of the credential rotation AWS Step Function execution status, providing a real-time view of the rotation progress.
You can also track the status of individual servers via the manual rotation step function execution list.
The partially automated solution for rotating Amazon SES SMTP credentials is illustrated and detailed below:
Refer to the image above for the option 2 workflow:
EventBridge Scheduler Trigger: An EventBridge scheduler rule triggers a custom Starter Function Lambda (SF Lambda) on the last day of every 3rd month (this can be adjusted to suit your needs in the CloudFormation template).
Credential Rotation Step Function: The Starter Function Lambda triggers the Credential Rotation AWS Step Function, providing a clearly defined name to facilitate auditing (“password-rotation-dd-mm-yy“).
Credential Rotation Step Function Actions:
Creates a new IAM (Identity and Access Management) secret access key for the SES IAM user.
Triggers the SMTP Credential Generator Lambda to derive the SES SMTP password from the newly created IAM secret access key (using the algorithm provided in the SES documentation.
Reads a list of servers that are utilizing this credential from a DynamoDB table.
Manual Confirmation Step Function:
For each server, a manual confirmation AWS Step Function is triggered, sending a message on the Amazon Simple Notification Service (SNS) topic.
The SNS notification prompts the server administrator via email to manually rotate the SMTP credentials on the on-premises email server.
The server administrator uses a link in the email to confirm the credential has been rotated and tested on the server.
The link triggers the Confirmation Lambda exposed via API Gateway, which marks the ManualConfirmation step function as complete.
Credential Rotation Completion: The CredentialRotation step function waits until all manual confirmation step functions have completed before proceeding.
Old IAM Access Key Deletion: Once confirmation has been received for all servers, the step function deletes the old IAM access key.
Deployment
To deploy the partially automated solution in your AWS account, you will need the following prerequisites:
Prerequisites for the Partially Automated Solution
AWS Account Access, typically with admin-level permission to allow for the deployment.
SES enabled, configured, and properly sending emails.
External email server(s) currently configured to use SES with SMTP.
Administrator email address to receive notifications.
AWS Secrets Manager and AWS Systems Manager set up.
AWS Systems Manager agent(s) correctly installed and configured on your target email servers as detailed in Setting up AWS Systems Manager.
Amazon EC2 instance with Postfix configured to send emails through SES
Target email servers must be decorated with a tag (“SSMServerTag“) and value (“SSMServerTagValue“) that will be used to identify them by the Systems Manager Document (we used “server” and “email”)
AWS Parameter Store and AWS Step Functions.
Once you have the prerequisites in place, follow the instructions in the GitHub project.
Conclusion
Implementing an automated credential rotation process for Amazon SES SMTP enhances security and compliance, streamlines operations, and reduces the risk of downtime and human error. By leveraging AWS Secrets Manager and AWS Systems Manager (option 1) or AWS Systems Manager Parameter Store and Step Functions (option 2), organizations can centralize SES SMTP credential management, maintain an audit trail, and quickly update email application servers with new SMTP credentials.
Need additional guidance?
Join the conversation and connect with other administrators and security professionals on the AWS re:Post community to share insights and learn best practices.
This alignment with DESC requirements demonstrates our continued commitment to adhere to the heightened expectations for CSPs. Government customers of AWS can run their applications in AWS Cloud-certified Regions with confidence.
The independent third-party auditor (BSI) issued the Certificate of Compliance to AWS on behalf of DESC on January 23, 2025. The Certificate of Compliance that illustrates the compliance status of AWS is available through AWS Artifact. AWS Artifact is a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact.
The certification includes 11 additional services in scope, for a total of 98 services. This is a 13% year-on-year increase in the number of services in the Middle East (UAE) Region that are in scope of the DESC CSP certification. For up-to-date information, including when additional services are added, see the AWS Services in Scope by Compliance Program webpage and choose DESC CSP.
AWS strives to continuously bring services into the scope of its compliance programs to help you adhere to your architectural and regulatory needs. If you have questions or feedback about DESC compliance, reach out to your AWS account team.
To learn more about our compliance and security programs, see AWS Compliance Programs. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.
If you have feedback about this post, submit comments in the Comments section below.
We’ve added four additional AWS services to the audit scope since the last certification issued on November 29, 2024. These are the four additional services:
For a full list of AWS services that are certified under ISO and CSA STAR, see the AWS ISO and CSA STAR Certified page. You can also access the certifications in the AWS Management Console through AWS Artifact.
If you have feedback about this post, submit comments in the Comments section below.
By implementing the PBHVA control overlay over a CCCS Medium baseline, you can better protect your organization’s most critical assets from potential threats and vulnerabilities, providing continuity of essential government operations and safeguarding sensitive information.
Understanding CCCS PBHVA overlay requirements
The CCCS PBHVA overlay consists of 137 controls designed to protect high-value assets, including 69 new controls and 68 controls from CCCS Medium. These controls provide enhanced data protection, particularly for integrity and availability, and are based on NIST SP 800-53 Revision 5.
Key findings from the Coalfire assessment
Coalfire’s assessment found that the LZA on AWS solution significantly supports CCCS PBHVA overlay compliance requirements:
71 percent of in-scope controls (97 of 137) are supported by the AWS contribution to compliance in the shared responsibility model
The solution uses over 35 AWS services to provide comprehensive security capabilities
Strong network segmentation is achieved through network account and network-boundary VPC design
Infrastructure-as-code (IaC) enables reliable build and deployment results
The 29 percent of controls not addressed by the LZA are on the customer side of the shared responsibility model. They are addressed in the customer’s application stack or as non-technical controls such as policies and procedures.
Key security capabilities
The LZA solution implements several critical security features:
While the LZA solution provides significant compliance support, organizations should note:
The solution alone does not guarantee compliance
Organizations must implement their own policies, standards, and procedures
A thorough understanding of the shared responsibility model is essential
The AWS Landing Zone Accelerator Verified Reference Architecture documentation is available for customer download in AWS Artifact. This resource can help organizations reduce the time and effort required to deploy an environment that aligns with CCCS PBHVA overlay requirements.
Conclusion
The Coalfire assessment confirms that the LZA on AWS solution provides effective support for CCCS PBHVA overlay compliance objectives. However, organizations should remember that compliance is an ongoing process that requires active management and cannot be achieved through technology alone.
For more information about implementing the Landing Zone Accelerator for CCCS PBHVA overlay requirements, contact your AWS account team or the AWS Public Sector team directly.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
As your Amazon Web Services (AWS) environment grows, you might develop a need to grant cross-account access to resources. This could be for various reasons, such as enabling centralized operations across multiple AWS accounts, sharing resources across teams or projects within your organization, or integrating with third-party services. However, granting cross-account access requires careful consideration of your security, availability, and manageability requirements.
In this blog post, we explore four different ways to grant cross-account access using resource-based policies. Each method has its own unique tradeoffs, and the best choice depends on your specific requirements and use case.
Evaluating different techniques for granting cross-account access
Your choice of how to specify the principal in a resource-based policy impacts some aspects of both the confidentiality and the availability of your solution. Understanding this impact and making the right tradeoffs for your use case is the focus of this post.
An example scenario
Imagine that you have an S3 bucket in your AWS account (Account A) that needs to be accessed by different principals in another AWS account (Account B). For this scenario, we assume that the principals in Account B have the necessary access to S3 in their identity-based policies, and we will focus on authoring the resource-based policies in Account A. While the methods explained here use Amazon S3, the concepts discussed apply to all AWS services that support resource-based policies. In the following sections, we walk through four different ways to grant cross-account access in this scenario and discuss the tradeoffs of each.
Method 1: Grant access to a specific IAM role using the Principal element of the resource-based policy
In this example, you use an S3 bucket policy to grant access to a specific IAM role (RoleFromAccountB) in Account B by specifying the IAM role’s Amazon Resource Name (ARN) in the Principal element of the policy in Account A.
Using this bucket policy, if someone in Account B deletes or recreates the role (RoleFromAccountB), then that role can no longer access the amzn-s3-demo-bucket-account-a bucket, even if that role is recreated with the same name. The reason is that when you save this policy, the role ARN is mapped to the unique ID of the role, which looks something like this: AROADBQP57FF2AEXAMPLE. You will see a role identifier in the Principal element of your resource-based policies if you view them after you delete the role that they referenced.
This behavior is intentional. The resource-based policy only allows the specific instance of the role that you set as principal at the time of policy creation. This helps prevent unintended access to your resources if you delete a role, but forget to update your resource-based policy to remove that role. This behavior can also cause an availability risk because the role (RoleFromAccountB) will have a new unique ID when it is recreated and will no longer have access to the bucket. Roles can be recreated for a number of reasons, including accidentally when you use tools such as infrastructure as code.
You might consider choosing this method if:
You own the roles in both Account A and Account B and can control the creation and deletion of these roles.
You want your resource-based policy in Account A to stop granting access when the specified role (RoleFromAccountB) is deleted.
You prioritize granular access control over potential availability concerns if the role (RoleFromAccountB) is deleted.
Method 2: Grant access to an account using the Principal element of the resource-based policy
In this example, you grant access to a specific account in the Principal element of the resource-based policy. This resource-based policy of Account A allows any user or role from Account B that also has an identity-based policy that grants them access to read the objects.
Note: You can use either "Principal": {"AWS": "111122223333"} or "Principal": {"AWS": "arn:aws:iam::111122223333:root"} in the Principal element. They are equivalent, and the long-form ARN does not represent the root user.
This resource-based policy helps avoid the potential availability issue discussed for Method 1. If a role in Account B that needs to have access to the bucket is recreated, it will still have access after the recreation of that role. This is because you don’t specify a role in the Principal element—instead, you specify an account. If you use Method 2, you must be comfortable delegating access control decisions to the owner of that account.
This approach explicitly delegates access control decisions to IAM in the other account (Account B). Principals in Account B have access to this bucket if allowed by their identity-based policies.
You might consider choosing this method if:
You need to grant access to many principals in Account B.
You want to delegate the access decision in the account where the principal exists (Account B).
You prioritize ease of management and availability over granular access control.
Method 3: Grant access to a specific IAM role using the aws:PrincipalArn condition
This method expands on Method 2 and adds a condition that grants access only to a specific IAM role. Similar to Method 2, you use the account number as the value of the Principal element, but also use the aws:PrincipalArn condition key to limit access to a specific principal in Account B.
The aws:PrincipalArn condition key is a global condition key that compares the ARN of the principal that made the request with the ARN that you specify in the policy. For IAM roles, the request context returns the ARN of the role, not the ARN of the user that assumed the role.
This policy comes with the same availability benefits as the policy in Method 2: access to this resource will survive role recreation. This is because the role is translated to its unique identifier only when it is used in the Principal element. It is not translated to a unique identifier when it is used in a condition. If the role (RoleFromAccountB) in Account B is recreated, accidentally or intentionally, the policy will continue to grant access because the role matches the role ARN specified in the condition key of the resource-based policy in Account A. As a result, Method 3 provides a balanced approach to availability and security.
You might consider choosing this method if:
You are comfortable that this policy will continue to grant access to the role specified in the aws:PrincipalArn condition key if that role (RoleFromAccountB) is recreated.
You don’t own the Account B you are granting access to and don’t control when that role may be recreated.
You want a balance of availability and confidentiality.
Method 4: Grant access to an entire AWS Organizations organization
This method is focused on a different use case and is not an alternative to the methods listed earlier. Use this method if you have a resource (an S3 bucket, in this example) that you want to share with your entire organization, but not share with anyone outside of it.
There is no way to specify an organization by using the Principal element of a resource-based policy, so you must use the aws:PrincipalOrgId condition key to restrict access to a specific organization. In this policy, you specify a wildcard in the Principal element, which says that anyone can access the bucket. Then the condition reduces “anyone” to just those AWS account principals that belong to the specified organization and have an identity-based policy that allows them access.
You then add an additional conditional block that compares the aws:PrincipalAccount condition key to the aws:ResourceAccount condition key by using a policy variable. This extra conditional block is optional and excludes the account that owns the bucket (Account A) from the allow statement. The reason for using this extra conditional block is so that principals in Account A still require an allow statement in their identity-based policy to access this bucket. If you choose to exclude this aws:PrincipalAccount comparison, principals in Account A are granted access to the bucket without an explicit allow statement in their identity-based policy. Policy evaluation logic only requires either the identity-based policy or the resource-based policy (but not both) to allow a request when the principal and resource are in the same account.
You might consider choosing this method if:
You have a shared resource that should be accessible to your entire organization.
Conclusion
Choosing a method to grant cross-account access requires careful consideration of your requirements and use case. Each of the four methods discussed in this blog post has its own advantages and tradeoffs. By understanding these methods and their implications, you can decide on the most appropriate approach to grant cross-account access to your AWS resources. Remember to regularly review and audit your resource-based policies to verify that they align with your security and access requirements.
Many customers want to seamlessly integrate their on-premises Kubernetes workloads with AWS services, implement hybrid workloads, or migrate to AWS. Previously, a common approach involved creating long-term access keys, which posed security risks and is no longer recommended. While solutions such as Kubernetes secrets vault and third-party options exist, they fail to address the underlying issue effectively.
One option to connect your on-premises Kubernetes workloads to AWS APIs is to use the service account issuer discovery feature. This allows the Kubernetes API server to act as an OpenID Connect (OIDC) identity provider and be federated with AWS Identity and Access Management (IAM). However, this approach requires public internet access to the Kubernetes API server, which might not be desirable for some customers.
To help eliminate the need for long-term access keys or exposing the Kubernetes API server to the public internet, AWS has introduced AWS IAM Roles Anywhere. This feature enables secure, seamless integration of on-premises Kubernetes workloads with AWS services, promoting robust security practices and minimizing potential risks associated with long-term credentials or public exposure.
IAM Roles Anywhere enables workloads outside of AWS to access AWS resources by exchanging X.509 bound identities for temporary AWS credentials. With IAM Roles Anywhere, you can use the same IAM roles and policies as your AWS workloads to access AWS resources, promoting consistency.
IAM Roles Anywhere can be combined with a standard public key infrastructure solution. In this blog post, we use AWS Private Certificate Authority, which has several advantages over using a self-signed certificate authority (CA). First, it reduces operational and management overhead, because AWS manages the CA for you. Second, the cryptographic key material can be stored in hardware security modules or at least vaulted, which helps you protect your private CA against key compromises. Additionally, certificates can be short-lived, which aligns with dynamic Kubernetes environments where pod lifetimes are typically shorter than traditional servers.
We also demonstrate how to integrate IAM Roles Anywhere without modifying your existing workload Docker files, and how to automate the X.509 certificate lifecycle with cert-manager and an AWS Private CA backend in short-lived certificate mode. By using these capabilities, you can seamlessly integrate your on-premises Kubernetes workloads with AWS services, promoting robust security practices, minimizing risks associated with long-term credentials, and helping to ensure a streamlined, consistent access management experience.
“Why should I prefer X.509 certificates over IAM access keys?” Access keys are long-term credentials that must be rotated regularly to minimize the risk of unauthorized access. They need to be securely deployed onto servers hosting applications that use them, requiring procedures for secure transfer and deletion of transient copies. As the number of applications and access keys grows, tracking and managing them becomes operationally challenging.
In contrast, X.509 certificates use public key infrastructure (PKI). The private key is generated directly on the application server and doesn’t leave it. Only a certificate signing request, which doesn’t contain secrets, is sent to the CA for signing and returning the certificate. This alleviates the need for securely transmitting secret keys.
However, you can argue that X.509 certificates are also long-lived credentials. This concern is valid, but not necessarily true. As demonstrated by projects such as Let’s Encrypt, it’s possible to reduce certificate lifetimes from years to months by implementing automation for certificate renewal. After such a mechanism is in place, certificate lifetimes can be further limited to days or even hours.
In this post, we introduce mutually authenticated Transport Layer Security (mTLS), which uses certificates for high-assurance bidirectional authentication. Certificates are used to establish trust between the client and server, making sure that both parties are authenticated and authorized to communicate securely. By implementing mTLS, you can achieve a higher level of security and trust in your communication channels, mitigating potential risks associated with unauthorized access or man-in-the-middle attacks. Here, we implement ephemeral certificates that are tied to the lifecycle of pods. When a pod is started, a certificate is automatically created, and it expires after a short period of time unless it’s actively in use by the pod, in which case it’s automatically renewed by the cert-manager. This approach verifies that certificates are only valid for the duration of the pod’s lifetime, minimizing the potential risk associated with long-lived credentials. Additionally, IAM Roles Anywhere supports certificate revocation list (CRL) checks, allowing you to perform explicit revocation of certificates if required. This feature provides an additional layer of security, enabling you to revoke access promptly in case of compromised credentials or other security concerns.
Throughout this post, we assume that you have a basic understanding of IAM Roles Anywhere. For more information you can see this blog post. Furthermore, we assume that you are familiar with Kubernetes, kubectl,Helm, and cert-manager.
Solution overview
This solution assumes that you have an existing Kubernetes cluster running outside of AWS.
Figure 1 shows the high-level architecture of our solution. An on-premises Kubernetes cluster accessing AWS APIs using IAM Roles Anywhere with X.509 certificates issued by AWS Private CA in short-lived-certificate mode.
Figure 1: High level architecture of on-premises Kubernetes accessing AWS APIs
Here’s how the solution works, as shown in Figure 1:
When you set up your AWS Private CA as a trusted source and establish a specific profile, IAM Roles Anywhere will validate and accept authentication requests that use certificates issued by your AWS Private CA.
cert-manager, deployed into your Kubernetes cluster, orchestrates the issuance of AWS Private CA certificates to authorized pods.
Each pod uses IAM Roles Anywhere to create an AWS session using its private key and X.509 certificate obtained from cert-manager.
Let’s explore the different parts of the architecture in more detail.
AWS Private CA short lived credentials
AWS Private CA offers a short-lived certificate, where the validity period is limited to 7 days or fewer. You can see this AWS Blog to learn how to use AWS Private CA short-lived certificates. This new mode can be used to issue certificates for your Kubernetes pods and benefit from lower costs of operations. By synchronizing the certificate lifecycle with the lifecycle of the pod, you can minimize the operational overhead for this solution. To help meet requirements for auditability and transparency, you can use the audit report feature to list the issued certificates in a machine readable format.
IAM Roles Anywhere
Figure 2 shows a detailed overview of the components involved in authentication with IAM Roles Anywhere.
Figure 2: Components of IAM Roles Anywhere
IAM Roles Anywhere allows you to obtain temporary security credentials for workloads that run outside of AWS. Your workloads must use a certificate issued by a trusted PKI CA to authenticate with IAM Roles Anywhere. You establish trust between IAM Roles Anywhere and your CA by creating a trust anchor that points to the root of the CA.
cert-manager
Figure 3 shows a detailed overview of the cert-manager setup used in this post, including the aws-privateca-issuer add-on for the integration of AWS Private CA.
Figure 3: Detailed overview of cert-manager setup
cert-manager is a tool for managing X.509 certificates in Kubernetes. As shown in Figure 3, cert-manager will make sure that certificates are valid and up-to-date and attempt to renew them before they expire. By using add-ons, you can configure different backends for issuing X.509 certificates. In this post, we explore how to integrate cert-manager with AWS Private CA using the aws-privateca-issuer add-on. The aws-privateca-issuer add-on defines two custom resources, AWSPCAIssuer and AWSPCAClusterIssuer, which are used to configure the link to AWS Private CA. They are similar to the Issuer and ClusterIssuer resources that come with cert-manager, but specific to aws-privateca-issuer.
After the AWSPCAIssuer or AWSPCAClusterIssuer is available, aws-privateca-issuer authenticates towards AWS APIs using temporary security credentials obtained from IAM Roles Anywhere. cert-manager watches for the certificate resource, which references to an AWSPCAIssuer, which in turn references to AWS Private CA. aws-privatca-issuer requests a certificate from AWS Private CA. The auto-generated private key and the signed certificate are stored in Kubernetes secrets.
Using certificates and secrets
cert-manager supports multiple ways of integrating into your Kubernetes workloads. You can use certificate resources, which represent a human-readable definition of a certificate signing request (CSR) and contain information on certificate lifespan and renewal time. When using a certificate, the auto-generated private key and the signed certificate are stored in Kubernetes secrets.
With this option, an X.509 certificate is issued manually and saved as a secret. After a PKI is configured as an issuer, a certificate resource is created to automate the renewal of the certificate. With the certificate resource, the lifecycle of certificates is decoupled from the lifecycle of the pods that use them. This allows you to bootstrap the X.509 certificate even before the trusted PKI is deployed.
Using the CSI driver
Another way of integrating cert-manager is by using a CSI driver. In this case, the certificate lifecycle is bound to the lifecycle of the pod. An X.509 certificate and private key are mounted into a predefined folder where your workloads can read them. On pod creation, cert-manager automatically creates a private key and requests a certificate for the configured trusted PKI. When the pod is deleted, the private key and certificate are also deleted and become invalid because they aren’t renewed by cert-manager.
In this post, we use the CSI driver approach for workloads to create ephemeral certificates for IAM Roles Anywhere.
Workload configuration
Figure 4 shows a detailed view of how pods can be configured to use IAM Roles Anywhere without needing to change the underlying Docker images by using a sidecar that provides an IMDSv2 endpoint that mimics the behavior in the Amazon Elastic Compute Cloud (Amazon EC2) instance metadata endpoint.
Figure 4: Pod configuration using a sidecar
As shown in Figure 4, when using a certificate resource, the auto-generated private key and the signed certificate are stored in Kubernetes secrets and mounted into the pod. When using the CSI driver, a private key is generated locally (for the pod), a certificate is requested from cert-manager based on the given attributes and is issued by AWSPCAIssuer, and the certificates are mounted directly into the pod with no intermediate secret being created.
IAM Roles Anywhere uses the CreateSession API to authenticate requests with a SigV4a signature using the private key and its associated X.509 certificate. This exchange provides a IAM role session credential, as if you had assumed the IAM role. The aws_signing_helper binary is provided to call the CreateSession API from the command line. In this post, a sidecar container that provides an IMDSv2 endpoint to the workload container is used. This container uses the aws_signing_helper binary and uses its serve command.
This way, applications using AWS SDKs can use the AWS_EC2_METADATA_SERVICE_ENDPOINT environment variable to set the instance metadata endpoint to the correct port on the localhost interface. The X.509 certificate and private key are provided as files to the sidecar container.
Solution deployment
In this section, we show the steps needed to deploy the solution in your AWS account.
Prerequisites
To deploy the solution in this post, make sure that you have the following in place:
An AWS account and IAM permissions for IAM, IAM Roles Anywhere, and AWS Private CA
Latest stable Kubernetes
kubectl (matching your Kubernetes version)
Helm 3
jq
Note: As an alternative to using the AWS CLI, you can use the AWS Controllers for Kubernetes (ACK) service controller for AWS Private CA for creating and managing CertificateAuthority, Certificate, and CertificateAuthorityActivation resources directly within your Kubernetes cluster. After establishing your CA hierarchy using the ACK controller, you can proceed with the subsequent steps involving IAM Roles Anywhere integration, aws-privateca-issuer, and cert-manager as described in this post.
Step 1 – AWS Private CA
Set up a root CA in AWS Private CA, which will issue short lived certificates for your pods. In this example you use only one CA; for production environments, you should check the considerations for designing CA hierarchies. Start by using the AWS CLI to create a configuration.
The command will return a CertificateAuthorityArn, which you will need for further commands, so export it for later use. Replace <region> with your AWS Region.
At this point your root CA is set up and ready to use. The next step is to configure IAM Roles Anywhere.
Start by defining a trust anchor that will refer to your newly created AWS Private CA and export the trustAnchorArn. Replace <value-of-trustAnchorArn> with the Amazon Resource Name (ARN) value of your IAM Roles Anywhere trust anchor.
Create an IAM role to be used by the aws-privateca-issuer cert-manager plugin. This role needs to include the actions sts:AssumeRole, sts:SetSourceIdentity and sts:TagSession, which are required by IAMRA. Replace <TA_ID> with your trust anchor.
Note: You should specify a PrincipalTag with the CN. Furthermore, it should be scoped to the IAMRA service principal. This further restricts authorization based on attributes that are extracted from the X.509 certificate and provides an additional layer of security by helping to ensure that even if an unauthorized party gains access to a valid certificate, they cannot assume the role unless the certificate’s CN matches the specified value.
Attach an inline policy that allows the role request certificates from your PCA and retrieve these. Note that there is a condition limiting the AWS Private CA templates to only allow EndEntityCertificate.
The created role iamra-issuer will only be used by the aws-privateca-issuer to integrate with AWS Private CA. You should repeat the process of creating IAM roles and IAMRA profiles for your workloads. it’s recommended to create a separate IAM role for each workload and limit its use with condition statements in the trust policy, checking for the workload identity and trust anchor (for example, matching the common name). Furthermore, it’s important that you add IAMRA to the trust policy and allow the aforementioned actions. Best practice with IAM roles is to apply least-privilege permissions.
Step 3 – Create the init container
To integrate IAM Roles Anywhere within your Kubernetes environment, you need to provide an IMDSv2 endpoint to your application containers by running the aws_signing_helper binary as a sidecar. You also need to configure your applications using an environment variable to use the new instance metadata endpoint. To do so, build a Docker image that works as a sidecar.
In this step, create a basic image that fulfills the preceding requirements. In your environment, you might want to adapt this example to use your own base image and implement your image hardening processes.
Copy the following script and save it as init.sh.
#!/bin/sh
if [[ -z "$TRUST_ANCHOR_ARN" ]]; then
echo "Must provide TRUST_ANCHOR_ARN environment variable." 1>&2
exit 1
fi
if [[ -z "$PROFILE_ARN" ]]; then
echo "Must provide PROFILE_ARN environment variable." 1>&2
exit 1
fi
if [[ -z "$ROLE_ARN" ]]; then
echo "Must provide ROLE_ARN environment variable." 1>&2
exit 1
fi
echo "starting IMDSv2 endpoint with aws_signing_helper ..."
/aws_signing_helper serve \
--certificate /iamra/tls.crt \
--private-key /iamra/tls.key \
--trust-anchor-arn $TRUST_ANCHOR_ARN \
--profile-arn $PROFILE_ARN \
--role-arn $ROLE_ARN
This script is the entry point of the sidecar container. It expects the environment variables TRUST_ANCHOR_ARN, PROFILE_ARN, and ROLE_ARN, which are required by aws_signing_helper. It also expects an X.509 certificate and its private key in the folder /iamra, which will be mounted in a later stage during pod initialization. Finally, it invokes the aws_signing_helper with the serve directive which creates an IMDSv2 endpoint listening on 9911 by default. This can be customized using the --port parameter.
Now let’s inspect the Docker file.
Note: At the time of writing, we used the alpine3.17.0 image. Use a hardened base image that’s designed to be secure and aligns with the requirements of your environment.
FROM alpine:3.17.0
COPY init.sh .
RUN apk add --no-cache libc6-compat libgcc wget
RUN wget https://rolesanywhere.amazonaws.com/releases/1.3.0/X86_64/Linux/aws_signing_helper
RUN chmod +x /aws_signing_helper /init.sh
RUN ln -s /lib/libc.musl-x86_64.so.1 /lib/libresolv.so.2
ENTRYPOINT ["/bin/sh", "-c", "/init.sh"]
This Docker file copies the init.sh and downloads the aws_signing_helper binary. The init.sh script is defined as an entry point to the container. Dynamic libraries required by aws_signing_helper are installed using Alpine Linux package manager (Apk).
Now build the docker image, sign in to it, and push it for later use. For the following commands replace <my-docker-registry> with the hostname of your local registry or use an ECR Repository.
In this step, install cert-manager into your cluster and configure aws-privateca-issuer using a manually bootstrapped certificate. cert-manager-approver-policy is used to control which certificates can be requested by the workloads. Then, set up the cert-manager CSI driver to automatically provision X.509 certificates for your workload pods.
Start with the cert-manager setup:
Add the cert-manager repository to Helm and install the chart.
Note: At the time of writing, we used cert-manager version 1.16.2. Check for the latest stable version.
Now, install the cert-manager aws-privateca-issuer plugin. This integration connects cert-manager with AWS Private CA and lets you issue short-lived certificates automatically. Currently, aws-privateca-issuer Helm chart doesn’t support IAMRA natively. So, you’re going to use the same init-container to set up IAMRA as for the workload pods.
You need to issue the first X.509 certificate for aws-privateca-issuer IAMRA manually. Later, cert-manager will renew it automatically.
Create the bootstrap certificate. When asked for a common name, enter iamra-issuer.
The previous command will create an RSA private key named iamra.key and a certificate signing request name iamra.csr. Now you need to call AWS Private CA to issue the bootstrap certificate.
Set the validity period of the certificate to 1 day so that cert-manager will replace it after it’s set up. The IAM role that’s performing this action must have permissions to AWS Certificate Manager (ACM), IAM, and IAM Roles Anywhere to complete the setup.
You’re ready to install the aws-privateca-issuer. You need to modify the Helm chart because it doesn’t currently support IAMRA. You will render the Helm chart into YAML manifests, which are then adapted for IAMRA.
Install the Helm repository and render the charts into a file.
Add your previously built image as a sidecar and replace the environment variables with your exported values. Search for the deployment definition and add the following section:
Apply your modified manifest to install aws-privateca-issuer and verify the deployment you have modified. It should show that one pod is ready and available.
kubectl apply -f privateca-issuer.yaml
kubectl get deployment -n cert-manager -l app.kubernetes.io/name=aws-privateca-issuer
NAME READY UP-TO-DATE AVAILABLE AGE
iamra-aws-privateca-issuer 1/1 1 1 4d10h
Define an AWSPCAIssuer, which will be used for renewal of the manually bootstrapped certificate for the aws-privateca-issuer add-on.
Note: At the time of writing, we used awspca cert-manager API version v1beta1. Check for the latest stable version.
After at least one AWSPCAIssuer or AWSPCAClusterIssuer is available, aws-privateca-issuer is going to authenticate towards AWS APIs by calling sts.get-caller-identity and verify the authentication method. You can verify this using its log files. It should print the assumed role.
Now, you can create a cert-manager Certificate resource that represents a desired certificate that should be issued by the referenced cert-manager Issuer. It combines information of a CSR with details on the validity period and renewal.
In Step 4, sub-step 9, you created an AWSPCAIssuer named iamra-cm-issuer. You then used this AWSPCAIssuer to renew the manually bootstrapped certificate for the aws-privateca-issuer.
In Step 4, sub-step 11, you created the certificate iamra-privateca-issuer-cert, which is used by the aws-privateca-issuer.
In this step, you will deploy the sample workload. When deploying the sample workload, make sure to repeat the process of creating IAM roles and IAMRA profiles (from Step 2), the AWSPCAIssuer (Step 4, sub-step 9), and the CertificateRequestPolicy (Step 4, sub-step 11) for the certificate request.
To test the deployment, you can use kubectl exec to access the iamra-sidecar container. Navigate to the iamra directory and check if the certificate and key are mounted.
Command: kubectl exec -it acmpca-csi-test – sh ls | grep iamra
Output: You should see iam-roles-anywhere-s3-full-access in caller-identity.
Command: $aws s3 ls
Output: You should be able to list the S3 bucket based on the permissions associated with the assumed role.
Summary
In this post, you learned about a solution for securely connecting on-premises Kubernetes workloads to AWS services using IAM Roles Anywhere. The approach alleviates the need for long-term access keys or public internet exposure of the Kubernetes API server. By using this solution for containerized and full stack applications, you can benefit from:
Enhanced security: Use short-lived X.509 certificates instead of long-term credentials.
Simplified management: Automate the certificate lifecycle with cert-manager and AWS Private CA.
Seamless integration: No modifications are required to existing workload Docker files.
Consistent policies: Use the same IAM roles and policies across AWS and on premises.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
When it comes to controlling incoming (ingress) and outgoing (egress) network traffic, organizations typically focus heavily on inbound traffic controls—carefully restricting what traffic can enter their network perimeter. However, this approach addresses only inbound security challenges. Modern applications rely heavily on third-party code through operating systems, libraries, and packages. This dependency can create potential security vulnerabilities. If these components are compromised, affected workloads might attempt to connect to unauthorized command and control servers or send sensitive data to unauthorized destinations on the internet.
This is why implementing strong outbound traffic controls—particularly through domain-based allowlisting—has become a critical security best practice. Rather than allowing unrestricted outbound access or maintaining an ever-growing denylist of low-reputation domains, many organizations are shifting to domain-based allowlisting. This approach restricts outbound communications to explicitly trusted domains, reduces potential risk surfaces, and helps to protect against both known and unknown threats. However, manually identifying and maintaining these allowlists has traditionally been a complex and time-consuming process.
AWS Network Firewall automated domain lists improve visibility into network traffic patterns and simplify outbound traffic control management. This feature provides analytics for HTTP and HTTPS network traffic, helping organizations understand domain usage patterns. It also automates firewall log analysis to create rules based on your network traffic. By combining increased visibility with automation, this feature enhances your security awareness and helps to improve the effectiveness of your firewall rules.
In this blog post, we’ll guide you through the implementation of the AWS Network Firewall automated domain list feature, providing a detailed overview, step-by-step instructions, and best practices to optimize your network security.
Overview of automated domain lists and traffic insights
Domain-based security allows you to control network traffic based on the domain names that your applications and users are trying to access. This approach offers a more intuitive and flexible way to create firewall rules, focusing on the destinations your network is trying to reach rather than just IP addresses. However, effectively configuring and managing firewall rules remains challenging for some customers, especially in large environments where connected devices, applications, and traffic patterns are continuously growing and changing. Organizations might struggle to keep up with these changes, leading to outdated or ineffective firewall rules and policies that are either too permissive, exposing the network to risks, or too restrictive, blocking legitimate traffic.
Let’s explore how automated domain lists address these challenges through various use cases and benefits:
Preventive and detective security controls
Domain control through allowlisting – Establishing domain allowlists aligns with the security principle of least privilege for network traffic. A least-privilege model adjusts the scope of what a workload can do across the network, from infinite and undefined to scoped-down and well-defined, enabling better insight into potentially risky behaviors. By limiting outbound connections to only approved domains, organizations can more effectively control and monitor workload communications.
Rule audit and compliance – Domain allowlisting makes it clear which domains are allowed, supporting alignment with standards like the Payment Card Industry Data Security Standard (PCI DSS), Health Insurance Portability and Accountability Act (HIPAA), Cybersecurity Maturity Model Certification (CMMC), and General Data Protection Regulation (GDPR).
Preventive controls enable detection – Preventive controls also act as detective controls, establishing a baseline for normal domain access patterns. With a domain allowlist in place, security teams can better detect workloads that show signs of unauthorized activity.
Incident response support – Domain reporting provides the latest list of workload domains accessed, enabling quick identification of potentially malicious domains during security incidents. This information helps teams prioritize which workloads may need immediate attention.
Operational value
Initial firewall setup and management – Automated allowlisting involves analyzing existing traffic patterns and recommending domain-based rules, which simplifies the process of establishing baseline firewall rules. This helps organizations quickly deploy effective security policies, potentially reducing the time and expertise needed for initial firewall configuration and ongoing management.
Application modernization – Allowlisting supports adjusting firewall rules to accommodate rapidly changing traffic patterns in microservices and containerized environments, helping security to keep pace with evolving architectures.
Cross-environment consistency – Allowlisting enables consistent firewall rule creation and management across multi-cloud and hybrid environments, regardless of where applications or data reside.
How the automated domain list feature works
Automated domain lists work by analyzing your HTTP and HTTPS traffic, generating reports on frequently accessed domains, and providing a convenient way to create rules based on actual network traffic patterns. To begin using automated domain lists in AWS Network Firewall, sign in to the AWS Management Console, access the Network Firewall service, and either work with an existing firewall or create a new one. Then follow the rest of the steps in this post.
Step 1: Enable traffic analysis mode to capture HTTP and HTTPS traffic domain logs
After you’ve selected a firewall, in the left navigation pane, choose Configure advanced settings. Select the Enable traffic analysis mode checkbox to enable it, as shown in Figure 1. Network Firewall uses this logging mode to collect data on observed domains for HTTP and HTTPS traffic to create domain reports.
Figure 1: Enabling traffic analysis mode for a firewall
To stop collecting data on frequently accessed domains in your network traffic, clear the checkbox to disable traffic analysis mode, as shown in Figure 2. Note that if you disable traffic analysis mode, you won’t be able to generate domain reports.
Figure 2: Disabling traffic analysis mode
Once traffic analysis mode is enabled, you’re ready to generate a domain report based on observed network traffic. Next, you can go to the Monitoring and observability tab and choose Create report.
Figure 3: Traffic analysis mode enabled: Now you’re ready to generate domain-based reports
Step 2: Create a domain report
The domain report summarizes the HTTP and HTTPS traffic observed by your firewall for up to 30 days (or for the duration since firewall activation if less than 30 days). Select the checkbox next to each traffic analysis type you want to include in the report—HTTP, HTTPS, or both.
Important: Use your monthly domain report to examine 30 days of traffic behavior. Each report type (HTTP, HTTPS) is available once every 30 days at no additional cost.
Figure 4: Create a domain report that includes traffic analysis types HTTP, HTTPS, or both
To see the status of your domain report, go to the Reports section in the console for your specific firewall. When the report is ready, you can review the report directly in the console or download it, as shown in Figure 5.
Figure 5: The list of domain reports in the Reports section of the console for your specific firewall
Step 3: Review the report details
The report details include the traffic type (HTTP or HTTPS) and the observation period (start and end dates). By default, the report covers the last 30 days, or the entire period since traffic analysis was enabled if that is less than 30 days. The report also shows these details:
The Domain list shows domains that are a fully qualified domain name (FQDN) observed in the network traffic, such as aws.com or subdomain.aws.com.
The Access attempt count refers to the overall count of connection requests to the domain, including both successful and failed attempts.
The Unique sources field shows the number of distinct source IP addresses connected to the domain, indicating its popularity. For example, if one workload connects to aws.com, then count = 1; if 1000 workloads connect to aws.com, then count = 1,000.
The First accessed field shows when the domain was first seen in your traffic, while Last accessed shows when it was most recently seen. This includes both successful and failed attempts to access the domain.
The Protocol field indicates how the domain was observed—through either HTTP or HTTPS traffic (in other words, HTTP headers or a TLS handshake).
An example report is shown in Figure 6.
Figure 6: Example domain report details: 30-day analysis
Step 4: (Optional) Create a domain list rule group
You can copy the list of observed domains from the report to a stateful domain list rule group and update your firewall policy. To do so, in the Report details section, choose Create domain list group to use the firewall policy wizard to create or update your firewall rules. The selected domains are automatically copied to a domain list rule group, as shown in Figure 7. For detailed instructions, see the AWS Network Firewall documentation.
Figure 7: Option to copy over the observed domain lists and create a domain list rule group using the firewall policy wizard
Best practices for implementing domain allowlists
When you implement domain allowlisting, consider the following guidelines for operational success. We recommend that you also consult your own internal compliance and security policies.
Start with a strategy of generous allowlisting:
Begin with broader and more generous allowlist rules rather than a more refined list, initially, to reduce the risk of accidently blocking legitimate domains.
Focus on getting to a Default Deny policy so that you can benefit from its risk surface reduction.
Create flexible rules for trusted domains, including second-level domains and top-level domains, such as allowing access to subdomains under your registered second-level domain. Or allow access to second-level domains under top-level domains that your organization trusts—for example, .mil, .gov, or .edu.
Remember that even a broad allowlist provides better security than having no allowlist at all.
Make iterative improvements:
After you establish an initial generous allowlist and Default Deny rules, evaluate the rules to determine which areas you might want to start narrowing down further. Use alert rules before pass rules in order to log the specific domains a pass rule might be allowing access to.
Adjust logging levels based on domain trust levels and monitoring requirements.
Review and update rules based on operational insights and changing requirements.
Take a pragmatic and iterative approach to rule refinement rather than attempting to make the ruleset very strict.
Consider setting up proactive alerts for denied domains accessed by critical workloads.
Monitor logs to identify potential allowlist additions or changes.
Additional considerations:
After you enable traffic analysis mode, the automated domain lists feature provides visibility into your network traffic, reporting on observed connections. Although it doesn’t distinguish between allowed and blocked traffic, the domain list report can help you identify the most critical domains to include in your firewall rules.
The domain traffic data used to generate the list of domain recommendations is available for up to the last 30 days after traffic analysis has been enabled. This allows you to focus on the most relevant and recent network activity when optimizing your firewall policies.
Data collection for automated domain lists is opt-in and performed independently of the firewall policy and logging configuration. Enabling the feature doesn’t impact the performance of the firewall itself.
Conclusion
With AWS Network Firewall automated domain lists, you can simplify your firewall management process, create more effective rules based on actual traffic patterns, and maintain a strong security posture with less manual effort. This feature helps you address common challenges such as keeping up with rapidly changing application landscapes, managing security across complex environments, and adhering to compliance requirements. To learn more about Network Firewall and its features, see the product page and service documentation.
At Amazon Web Services (AWS), earning trust isn’t just a goal—it’s one of our core Leadership Principles that guide every decision we make. As CISO of AWS, I’ve seen firsthand how this commitment to earning trust shapes our culture, our services, and our daily interactions with customers. Customers choose AWS over other providers because they trust us to provide the most secure infrastructure and services, and to tell them exactly how data is being protected.
To make that information even easier to find, we’re launching the AWS Trust Center, a new online resource that shares how we approach securing your assets in the cloud. The AWS Trust Center is a window into our security practices, compliance programs, and data protection controls that demonstrates how we work to earn your trust every day.
Building on a foundation of trust
Security has been our top priority since day one. When we launched AWS in 2006, we designed our infrastructure as the most secure cloud computing environment available. We knew we couldn’t just offer the same level of security as existing on-premises infrastructure. To earn customer trust, we had to disrupt and exceed the exacting industry standards of the world’s most security-conscious organizations. We continue to constantly reinforce security in our everyday decision making. With the Trust Center, we’re making it easier for you to understand how we protect your workloads, safeguard your data, and help you meet your compliance goals.
The Trust Center reflects our belief that more easily accessible information builds and maintains trust. Whether you’re looking for information about our data center controls, checking compliance certifications, or reviewing our shared responsibility model, you’ll now find the security and compliance information you need in one central location.
A single source of truth for security and compliance
In the Trust Center, you’ll find information about our approach to security at every level—from our physical data centers to our cloud infrastructure and our portfolio of cloud services. We’re including documentation about our security services and tools, helping you to understand both how we secure the cloud and how we help you secure your workloads within it. You’ll also find information about our compliance programs, including the certifications and attestations we maintain globally. This is valuable for teams working in regulated industries who need to demonstrate compliance to auditors and regulators.
The Trust Center highlights information about our data protection and privacy practices. Customers can learn how we protect your data and how we manage encryption. Further, we understand that customers are concerned about who can access their data and under what circumstances. We’ve consolidated detailed information about our operator access controls, which are designed to use the principle of least privilege at their core. You’ll learn about our zero-access designs for key services like AWS Key Management Service (AWS KMS) and Amazon Elastic Compute Cloud (Amazon EC2), our use of forward access sessions (FAS) to cryptographically enforce customer authorization, and our global monitoring systems.
The Trust Center provides a central place to find information about service health and security events, so you have the information you need to maintain operational excellence. You can stay up-to-date on security bulletins, and check real-time service health status. If you need to report a security concern or conduct your own security assessment, we’ve made those processes even easier to find. Resources are organized for ease of access, with direct links to the agreements, documentation, and resources you need to make informed decisions about your cloud security posture.
Empowering customers to drive secure innovation
What excites me most about the Trust Center is how it removes barriers for our customers. With detailed security information, easier links to compliance documentation, and operational insights now at your fingertips, you can move faster and innovate with confidence.
As we continue to innovate and expand AWS services, we’re committed to enhancing the Trust Center with the latest security information. This is a living resource that will evolve alongside our cloud and services. Maintaining your trust isn’t just about what we’ve built today—it’s about demonstrating, through both our commitments and delivery, that we’re worthy of being your trusted security partner. That’s our commitment to you, and that’s what the AWS Trust Center represents.
We invite you to explore the AWS Trust Center today, and we look forward to continuing to earn your trust, every day.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
February 12, 2025: This post was republished to include new services and features that have launched since the original publication date of June 11, 2020.
Encryption is a critical component of a defense-in-depth security strategy that uses multiple defensive mechanisms to protect workloads, data, and assets. As organizations look to innovate while building trust with customers, they need to meet critical compliance requirements and improve data security. Encryption, when used correctly, adds a layer of protection against unauthorized access that can help you strengthen data protection, adhere to regulations and standards, and enhance the security of communications.
How and why does encryption work?
Encryption works by using an algorithm with a key to convert data into unreadable data (ciphertext) that can only become readable again with the right key. For example, a simple phrase like “Hello World!” may look like “1c28df2b595b4e30b7b07500963dc7c” when encrypted. There are several different types of encryption algorithms, all using different types of keys. A strong encryption algorithm relies on mathematical properties to produce ciphertext that can’t be decrypted using any practically available amount of computing power without also having the necessary key. Therefore, protecting and managing the keys becomes a critical part of any encryption solution.
Encryption as part of your security strategy
An effective security strategy begins with stringent access control and continuous work to define the least privilege necessary for persons or systems accessing data. When using the AWS Cloud, you adopt the model of shared responsibility. You are responsible for managing your own access control policies. Encryption is a critical component of a defense-in-depth strategy because it can mitigate weaknesses in your primary access control mechanism. What if an access control mechanism fails and allows access to the raw data on disk or traveling along a network link? If the data is encrypted using a strong key, as long as the decryption key is not on the same system as your data, it is computationally infeasible for a bad actor to decrypt your data.
To show how infeasible this is, let’s consider the Advanced Encryption Standard (AES) with 256-bit keys (AES-256). It’s the strongest industry-adopted and government-approved algorithm for encrypting data. AES-256 is the technology we use to encrypt data in AWS, including Amazon Simple Storage Service (S3) server-side encryption. It would take at least a trillion years to break using current (and foreseeable future) computing technology. Current research suggests that even the future availability of quantum-based computing won’t sufficiently reduce the time it would take to break AES-256 encryption.
But what if you mistakenly create overly permissive access policies on your data? A well-designed encryption and key management system can also help prevent this from becoming an issue, because it separates access to the decryption key from access to your data.
Requirements for an encryption solution
To get the most from an encryption solution, you need to think about two things:
Protecting keys at rest: Are the systems using encryption keys secured so the keys can never be used outside the system? In addition, do these systems implement encryption algorithms correctly to produce strong ciphertexts that cannot be decrypted without access to the right keys?
Independent key management: Is the authorization to use encryption independent from how access to the underlying data is controlled?
There are third-party solutions that you can bring to AWS to help meet these requirements. However, these systems can be difficult and expensive to operate at scale. AWS offers a range of options to simplify encryption and key management.
Protecting keys at rest
When you use third-party key management solutions, it can be difficult to gauge the risk of your plaintext keys leaking and being used outside the solution. The keys have to be stored somewhere, and you can’t always know or audit all the ways those storage systems are secured from unauthorized access. The combination of technical complexity and the necessity of making the encryption usable without degrading performance or availability means that choosing and operating a key management solution can present difficult tradeoffs. The best practice to maximize key security is using a hardware security module (HSM). This is a specialized computing device that has several security controls built into it to help prevent encryption keys from leaving the device in a way that could allow an adversary to access and use those keys.
One such control in modern HSMs is tamper response, in which the device detects physical or logical attempts to access plaintext keys without authorization, and destroys the keys before the attack succeeds. Because you can’t install and operate your own hardware in AWS datacenters, AWS offers two services using HSMs with tamper response to protect customers’ keys: AWS Key Management Service (AWS KMS), which manages a fleet of HSMs on the customer’s behalf, and AWS CloudHSM, which gives customers the ability to manage their own HSMs. Each service can create keys on your behalf, or you can import keys from your on-premises systems to be used by each service.
The keys in AWS KMS or AWS CloudHSM can be used to encrypt data directly, or to protect other keys that are distributed to applications that directly encrypt data. The technique of encrypting encryption keys is called envelope encryption, and it enables encryption and decryption to happen on the computer where the plaintext customer data exists, rather than sending the data to the HSM each time. For very large data sets (e.g., a database), it’s not practical to move gigabytes of data between the data set and the HSM for every read/write operation. Instead, envelope encryption allows a data encryption key to be distributed to the application when it’s needed. The “master” keys in the HSM are used to encrypt a copy of the data key so the application can store the encrypted key alongside the data encrypted under that key. Once the application encrypts the data, the plaintext copy of data key can be deleted from its memory. The only way for the data to be decrypted is if the encrypted data key, which is only a few hundred bytes in size, is sent back to the HSM and decrypted.
The process of envelope encryption is used in AWS services in which data is encrypted on a customer’s behalf (which is known as server-side encryption) to minimize performance degradation. If you want to encrypt data in your own applications (client-side encryption), you’re encouraged to use envelope encryption with AWS KMS or AWS CloudHSM. Both services offer client libraries and SDKs to add encryption functionality to their application code and use the cryptographic functionality of each service. The AWS Encryption SDK is an example of a tool that can be used anywhere, not just in applications running in AWS. To make it easier for customers to encrypt data in databases like Amazon DynamoDB, we built the AWS Database Encryption SDK. The AWS Database Encryption SDK is a set of software libraries that enable you to use client-side encryption in your database design, including record-level encryption of database items. Today, the AWS Database Encryption SDK supports Amazon DynamoDB with attribute-level encryption.
Because implementing encryption algorithms and HSMs is critical to get right, all vendors of HSMs should have their products validated by a trusted third party. HSMs in both AWS KMS and AWS CloudHSM are validated under the National Institute of Standards and Technology’s FIPS 140 program, the standard for evaluating cryptographic modules. This validates the secure design and implementation of cryptographic modules, including functions related to ports and interfaces, authentication mechanisms, physical security and tamper response, operational environments, cryptographic key management, and electromagnetic interference/electromagnetic compatibility (EMI/EMC). Encryption using a FIPS 140 level 3 validated cryptographic module is often a requirement for other security-related compliance schemes like FedRamp and HIPAA-HITECH in the U.S., or the international payment card industry standard (PCI-DSS).
Independent key management
While AWS KMS and AWS CloudHSM can protect plaintext master keys on your behalf, you are still responsible for managing access controls to determine who can cause which encryption keys to be used under which conditions. One advantage of using AWS KMS is that the policy language you use to define access controls on keys is the same one you use to define access to all other AWS resources. Note that the language is the same, not the actual authorization controls. You need a mechanism for managing access to keys that is different from the one you use for managing access to your data. AWS KMS provides that mechanism by allowing you to assign one set of administrators who can only manage keys and a different set of administrators who can only manage access to the underlying encrypted data. Configuring your key management process in this way helps provide separation of duties you need to avoid accidentally escalating privilege to decrypt data to unauthorized users. For even further separation of control, AWS CloudHSM offers an independent policy mechanism to define access to keys.
In 2022, AWS KMS launched support for external key stores (XKS), a feature that allows you to store AWS KMS customer managed keys on an HSM that you operate on premises or at a location of your choice. At a high level, AWS KMS forwards requests for encryption and decryption to your HSM. Your key material never leaves your HSM. This can help you unblock use cases for a small portion of highly regulated workloads where encryption keys should be stored and used outside of an AWS data center. However, XKS forces a significant shift in the shared responsibility model—you now have responsibility for the durability, throughput, latency, and availability of your KMS key. If that key is lost or destroyed, you could permanently lose access to data, and if an XKS key becomes unavailable, all workloads in AWS that are dependent on that XKS key will be inaccessible.
Even with the ability to separate key management from data management, you can still verify that you have configured access to encryption keys correctly. AWS KMS is integrated with AWS CloudTrail so you can audit who used which keys, for which resources, and when. This provides granular vision into your encryption management processes, which is typically much more in-depth than on-premises audit mechanisms. Audit events from AWS CloudHSM can be sent to Amazon CloudWatch, the AWS service for monitoring and alarming third-party solutions you operate in AWS.
Encrypting data at rest and in transit
AWS services that handle customer data, encrypt data that is sent from one system to another—known as data in transit—provide options to encrypt data at rest. AWS services that offer encryption at rest using AWS KMS or AWS CloudHSM use AES-256. None of these services store plaintext encryption keys at rest—that’s a function that only AWS KMS and AWS CloudHSM may perform using their FIPS 140 level 3 validated HSMs. This architecture helps minimize the unauthorized use of keys.
When encrypting data in transit, AWS services use the Transport Layer Security (TLS) protocol to provide encryption between your application and the AWS service. Most commercial solutions use an open source project called OpenSSL for their TLS needs. OpenSSL has roughly 500,000 lines of code with at least 70,000 of those implementing TLS. The code base is large, complex, and difficult to audit. Moreover, when OpenSSL has bugs, the global developer community is challenged to not only fix and test the changes, but also to make sure that the resulting fixes themselves do not introduce new flaws.
AWS’s response to challenges with the TLS implementation in OpenSSL was to develop our own implementation of TLS, known as s2n, or signal to noise. We released s2n in June 2015, which we designed to be small and fast. The goal of s2n is to provide you with network encryption that is easier to understand and that is fully auditable. We released and licensed it under the Apache 2.0 license and hosted it on GitHub.
We also designed s2n to be analyzed using automated reasoning to test for safety and correctness using mathematical logic. Through this process, known as formal methods, we verify the correctness of the s2n code base every time we change the code. We also automated these mathematical proofs, which we regularly re-run to ensure the desired security properties are unchanged with new releases of the code. Automated mathematical proofs of correctness are an emerging trend in the security industry, and AWS uses this approach for a wide variety of our mission-critical software.
Similarly, in 2022, we released s2n-quic, an open-source Rust implementation of the QUIC protocol that was added to our set of AWS encryption open source libraries. QUIC is an encrypted transport protocol designed for performance and is the foundation of HTTP/3. It is specified in a set of IETF standards that were ratified in May 2021. Amazon CloudFront HTTP/3 support is built on top of s2n-quic, due to its emphasis on performance and efficiency. You can learn more about s2n-quic in this Security Blog post.
Implementing TLS requires using encryption keys and digital certificates that assert the ownership of those keys. AWS Certificate Manager and AWS Private Certificate Authority are two services that can simplify the issuance and rotation of digital certificates across your infrastructure that needs to offer TLS endpoints. Both services use a combination of AWS KMS and AWS CloudHSM to generate and/or protect the keys used in the digital certificates they issue.
Encrypting data in use
You might also have use cases for protecting data that is actively being used by federated learning models or other applications. Cryptographic computing—a set of technologies that allow computations to be performed on encrypted data, so that sensitive data is not exposed—is a methodology for protecting data in use.
Consider the example of an insurance company that works with other companies to develop machine learning models for insurance fraud detection. You might need to use sensitive data about your customers as training data for your models, but you don’t want to share your customer data in plaintext form with the other companies. Cryptographic computing gives organizations a way to train models collaboratively without exposing plaintext data about their customers to each other, or to a cloud provider like AWS. You can read more about cryptographic computing in this AWS Security Blog post.
Today, you can see cryptographic computing at work in AWS Clean Rooms, a service that helps companies and their partners more easily and securely analyze and collaborate on their collective datasets—all without sharing or copying one another’s underlying data. AWS Clean Rooms has a feature called Cryptographic Computing for AWS Clean Rooms (C3R) that cryptographically protects your data even while it is being processed by an AWS Clean Rooms collaboration.
The role of end-to-end encryption in secure communications
End-to-end encryption (E2EE) is a method of secure communication between two or more parties that combines encryption in transit and encryption at rest to protect data from unauthorized access, interception, or tampering. Decryption happens only on the parties you intend to communicate with, and no service providers in between. Every call, message, and file is encrypted with a unique private key and remains protected in transit. Unauthorized parties can’t access communication content, because they don’t have the private key required to decrypt the data.
AWS Wickr is an end-to-end encrypted messaging and collaboration service that protects one-to-one and group messaging, voice and video calling, file sharing, screen sharing, and location sharing with 256-bit encryption. With Wickr, each message gets a unique AES private encryption key and a unique Elliptic-curve Diffie–Hellman (ECDH) public key to negotiate the key exchange with recipients. Message content—including text, files, audio, or video—is encrypted on the sending device (your iPhone, for example) by using the message-specific AES key. This key is then exchanged by using the ECDH key exchange mechanism, so that only intended recipients can decrypt the message.
Quantum computing and post-quantum cryptography
Quantum computing is a field of technology that uses quantum mechanics to solve complex problems faster than on classical computers. Quantum computers are able to solve certain types of problems faster by taking advantage of quantum mechanical effects, such as superposition and quantum interference. For cryptography, this has implications that affect traditional encryption mechanisms such as asymmetric key encryption, which is often used for protecting data in transit (TLS) or creating hash-based signatures to verify the integrity and authenticity of a message or file. Quantum computers, if they are performant and stable enough, could theoretically compromise the security of asymmetric key algorithms like RSA, Elliptic Curve Cryptography (ECC), or Diffie-Hellman key agreement schemes. Based on current research, symmetric key algorithms like AES are not considered to be at risk from a quantum computer, because the key length of 256 bits is already sufficient to compensate for a decrease in cryptographic key strength posed by quantum algorithms.
AWS gives customers the option of evaluating post-quantum algorithms alongside traditional algorithms, using hybrid schemes that make use of both classic cryptography and newer post-quantum cryptographic (PQC) algorithms that are designed to be resistant to quantum computer threats. AWS has taken the first step in deploying PQC by implementing ML-KEM, a module lattice-based key encapsulation mechanism, within AWS-LC, our open source FIPS-140-3 validated cryptographic library. AWS-LC is the core cryptographic library used throughout AWS. Specifically, AWS-LC is used in s2n-tls, our open source TLS implementation used across AWS services with HTTPS-based endpoints.
AWS provides customers the ability to encrypt everything, everywhere. Customers can encrypt data at rest, in transit, and in memory, with a few clicks in the AWS Management Console, or an AWS API call. Services like Amazon Simple Storage Service (Amazon S3) encrypt new objects by default, and also support the use of customer managed AWS KMS keys to give customers more control over their encryption keys. Importantly, AWS KMS uses techniques like envelope encryption and highly scalable key management infrastructure to enable AWS services like Amazon S3 or Amazon Elastic Block Store (Amazon EBS) to encrypt data with minimal performance impact to customer applications.
AWS is also consistently working to improve the performance and security of our customers’ data as it moves between networks or devices. As of June 2024, all AWS API endpoints support TLS 1.3 and require at least TLS 1.2 or higher. By using TLS 1.3, you can decrease your connection time by removing one network round trip for every connection request, and can benefit from some of the most modern and secure cryptographic cipher suites available today.
Customers who require memory encryption can use AWS Graviton, our custom-built family of processors based on ARM. AWS Graviton2, AWS Graviton3, and AWS Graviton3E support always-on memory encryption. The encryption keys are securely generated within the host system, do not leave the host system, and are destroyed when the host is rebooted or powered down. Memory encryption is also supported for other instance types; see the EC2 documentation for more details.
As part of our AWS Digital Sovereignty Pledge, we commit to continue to innovate and invest in additional controls for encryption features so that our customers can encrypt everything, everywhere with encryption keys managed inside or outside the AWS Cloud.
Summary
At AWS, security is our top priority. We are committed to helping you control how your data is used, who has access to it, and how it is protected. By building and supporting encryption tools that work both on and off the cloud, we help you secure your data and enable compliance across your environment. We put security at the center of everything we do to make sure that you can protect your data using best-of-breed security technology in a cost-effective way.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS KMS forum or the AWS CloudHSM forum, or contact AWS Support.
Generative AI applications often involve a combination of various services and features—such as Amazon Bedrock and large language models (LLMs)—to generate content and to access potentially confidential data. This combination requires strong identity and access management controls and is special in the sense that those controls need to be applied on various levels. In this blog post, you will review the scenarios and approaches where you can apply least privilege access to applications using Amazon Bedrock. To fully benefit from the guidance in this post, you need an understanding of AWS APIs, AWS Identity and Access Management (IAM) policies, and AWS security services.
Let’s start by defining the principle of least privilege (PoLP): The PoLP is a security concept that advises granting the minimal level of access—or permissions—necessary for users, programs, or systems to perform their tasks. The main idea is that the fewer permissions an entity has, the lower the risk of malicious or accidental damage. Applying the PoLP to your use of AWS serves two purposes:
Security: By limiting access, you reduce the potential impact of a security incident. If a user or service has minimal permissions, the scope for any damage can be significantly reduced.
Operational simplicity: Managing permissions can become complex if not properly managed and maintained. Applying the PoLP to your access controls early helps keep configurations as manageable as possible. Finally, there are regulatory frameworks that require separation of duty between roles and a documented strategy for access controls, which can be achieved in part by adhering to the PoLP.
Amazon Bedrock is a fully managed AWS service that makes high-performing foundation models (FMs) available through a single unified API. You use Amazon Bedrock through AWS APIs, which expose actions for the control plane and administration such as the configuration of Amazon Bedrock Guardrails and Amazon Bedrock Agents, in addition to data plane functional actions such as inference.
Generally, the path to using Amazon Bedrock for a production workload includes the following stages:
Model selection: Decide on the required features (Retrieval Augmented Generation (RAG), fine-tuning, and so on), evaluate and select a model, and approve a EULA if necessary.
Model adaptation: Prompt engineering, integration of Amazon Bedrock into the application, and addition of model customization if desired.
Model testing: Validate and test the solution.
Model operation: Deploy the solution and make it available. Monitor and operate the solution.
In the following sections, we go through each phase and outline how you can apply the PoLP.
Model selection
In this phase, you choose the features and models that are needed to fulfill your requirements and define how you will apply the PoLP. These can include, for example, model customization, Retrieval Augmented Generation (RAG) or the use of agents.
Security should be integrated into the design so that the defined controls can be implemented during the development phase. One approach to define the necessary security controls is threat modeling. Doing this exercise early in the process will simplify the upcoming phases. The results can be used later to decide on the required guardrails, potential changes to the architecture, and test cases.
In this phase, you will also decide how the solution should be deployed. Customers typically operate in a multi-account setup; therefore, the selection of target organizational units (OUs) and accounts is required. We recommend creating a new OU for generative AI applications. For details, see the deep-dive chapter on generative AI in the AWS Security Reference Architecture. We will talk later about service control policies (SCPs) and how they can be used to restrict permissions. The generative AI OU is a good place to enforce those guardrails.
Amazon Bedrock provides access to a variety of high-performing FMs from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon. In this stage, you need to choose the models that you’ll use and approve them. With a third-party FM, approval might include accepting a EULA. You can limit identities and the models that they can subscribe to in order to follow compliance with EULAs that have been reviewed by your legal department. The following is an example of an SCP that allows account operators to enable all Anthropic FMs and a single Meta Llama FM.
While this approach works well if you’re only allowlisting actions, you might have highly privileged users that already have broad access to AWS Marketplace APIs. In such a case, you can follow a deny all except a few approach. Such a policy, using the same models as before, would look like the following example:
In this phase, the solution is built—that is, code is written. This is mostly identical to traditional software development, however there are some areas specific to generative AI, such as prompt engineering, prompt guardrails, model monitoring, and agent design. In this post, we focus solely on the identity and access management aspects.
Adaptation is the phase where the detailed permission sets are created. Data perimeters can be used as a conceptual tool to define and implement guardrails. Because data perimeters are typically coarse grained, they aren’t sufficient to achieve the goal of the PoLP. However, in combination with fine-grained policies, they support a defense-in-depth approach. The following data perimeters exist:
Identity: Only trusted identities are allowed in my network, only trusted identities can access my resources.
Resource: My identities can access only trusted resources, only trusted resources can be accessed from my network.
Network: My identities can access resources only from expected networks, my resources can only be accessed from expected networks.
For applications that use Amazon Bedrock, you can use a virtual private cloud (VPC) network construct with Amazon Virtual Private Cloud (Amazon VPC) to host them. Doing so means that you can then use AWS PrivateLink to create VPC endpoints for both data and control plane APIs. Using PrivateLink to create endpoints, it’s possible to provide access to Amazon Bedrock for VPC-bound compute resources (such as Amazon Elastic Compute Cloud (Amazon EC2), or AWS Lambda) without the need for an internet gateway. In other words, you can deploy these resources entirely in private subnets. By using resource-based policies on these endpoints, you can restrict the principals, actions, resources, and conditions related to making API calls.
Let’s assume you have a VPC with an EC2 instance running in a private subnet hosting an application that uses Amazon Bedrock model invocations and have created an interface VPC endpoint to connect to the Amazon Bedrock data plane. The EC2 instance is configured to use an instance profile using the <rolename> IAM role and needs to be able to invoke a single Anthropic’s Claude Instant FM through an Amazon Bedrock InvokeModel API call. You can apply the PoLP to the containing VPC, and thus the EC2 instance, with a custom policy on the Amazon Bedrock interface VPC endpoint. To use the following policy in your own account, replace the default interface VPC endpoint policy with the following example, replacing <rolename> with the role you want to allow and <account-id> with your 12-digit account number.
You can define the allowed models that can be used in Amazon Bedrock directly in the policy. However, if you have multiple applications that use Amazon Bedrock, you might have to update multiple policies when a new model is allowed to be used. To complement the data perimeter approach, you can add an SCP to limit the models that can be used for inference. Because Amazon Bedrock is using a simple API (InvokeModel and Converse) for inference, a condition element in an IAM policy can be used to deny the use of unapproved models. Note that while the two policies (the SCP and the VPC endpoint policy) look similar, they work differently: VPC endpoint policies are enforced provided that the network path through PrivateLink is enforced; SCPs are applied to principals within the account or OU they’re attached to. Be extra careful if the calling identity resides outside of your account, because only the VPC endpoint policy will apply.
For example, imagine that you wanted to block the invocation of all Anthropic FMs across your organizations within AWS Organizations, in all AWS Regions. The following SCP example applied to the OUs or AWS accounts in scope would achieve that outcome:
A solution built on Amazon Bedrock might include model customization. The common denominator of the different customization approaches is that they include data, which is assumed to be confidential and thus in-scope for applying the PoLP. Here, we take a scenario where data is stored in Amazon S3 and can be encrypted using a customer managed AWS Key Management Service (AWS KMS) key.
Measures can be taken on multiple levels, as conceptualized in data perimeters: network, identity, and resource. Amazon Bedrock model customization uses service roles, which allows you to apply fine-grained and least-privilege access end-to-end. These service roles will be assumed by the Amazon Bedrock service principal, so that it can execute actions on your behalf. To allow the Amazon Bedrock service principal to assume the role in your account, you need to attach a trust policy to the role.
Let’s imagine that you’re running an Amazon Bedrock customization job in the us-east-1 (N. Virginia) Region. Using the following trust policy example will allow only the Amazon Bedrock service principal to assume your role.
Make sure to replace <account-id> in the preceding example trust policy with your own 12-digit account number. The policy contains a condition that provides cross-service confused deputy prevention by adding the aws:SourceAccount condition. The confused deputy problem is a situation where an entity that doesn’t have permission to perform an action can coerce a more privileged entity to perform the action. In AWS, cross-service impersonation can result in the confused deputy problem. Cross-service impersonation can occur when one service (the calling service) calls another service (the called service). AWS provides tools to help you protect your data for all services with service principals that have been given access to resources in your account. Both the aws:SourceArn and aws:SourceAccount global condition context keys in the role’s trust policy limit the permissions that Amazon Bedrock gives another service (in the preceding case, to the customization job) to the resource. aws:SourceArn is the more restrictive approach here, because it defines the specific source of the assume request, and not just the AWS account.
You should provide only the permissions that are required to fulfill the model customization task. For example, imagine that you want to limit access to your training data, the validation data bucket, and the output bucket (where Amazon Bedrock will deliver output metrics). The following policy, attached to that same service role, provides only those permissions. Replace the <training-bucket> placeholder with the bucket name that contains your training data, <validation-bucket> with your validation bucket, and <output-bucket> with the bucket where you want to store metrics.
Complementing this approach, we recommend using a VPC for the model customization job to restrict access to the training data. Technically, this again involves a VPC endpoint resource policy because the network is using interface VPC endpoints to access your S3 bucket. This allows you to define another network control, specifically an S3 bucket policy that only allows access through a specific VPC endpoint. So, for the situation where you want to limit access for the customization job itself, you can apply a bucket policy such as the following example, replacing <training-bucket> with the bucket name that contains your training data, and <vpce-id> with the ID of the VPC endpoint that resides in your VPC:
In addition, you would restrict the principals that can access your VPC endpoint and the actions they’re allowed to take in Amazon S3. For simplicity, we’re omitting an example policy here because it’s very similar to the one we have in the Amazon Bedrock invocation section earlier in this post.
If you need to enforce encryption in Amazon S3 using a customer managed AWS KMS key (SSE-KMS), you will need to do the following:
Update the bucket policy with a statement denying unencrypted content being uploaded.
Update the KMS key policy to allow the service role to decrypt and describe the key.
The next policy example should be added to the bucket policy and demonstrates how to deny unencrypted objects being added to an S3 bucket. Again, replace <training-bucket> with the name of the S3 bucket that contains your training data:
Finally, in the KMS key policy, you need a statement similar to the following to allow the Amazon Bedrock service role access to the KMS key. Replace <account-id> with your 12-digit account number and <bedrock-service-role> with the role you created, which will be assumed by the Amazon Bedrock service principal. Make sure to only give the required access to decrypt data with the KMS key to the IAM role:
Amazon Bedrock can also encrypt a customized model with a customer managed KMS key. Amazon Bedrock uses KMS key grants to encrypt the customized model and to decrypt it later when you deploy it for inference. Therefore, you need to grant the same IAM role permissions to create KMS key grants in the KMS key policy. The KMS key you use for this purpose is typically different than the one you used to encrypt the training data to allow fine-grained permissions on both keys.
So, let’s imagine that you want to use two different roles to encrypt and decrypt the customized models. To allow the role that executes the model customization job to use your KMS key, you need to add the following policy statements to the KMS key policy, replacing <account-id> with your 12-digit account number, <region> with the Region where you run Amazon Bedrock, <bedrock-model-customization-role> with the role name you use to run the model customization job, and <invocation-role> with the name of the role you use for inference.
By using KMS key grants, you can revoke the permissions you granted to the service role after the customization job is done, thus reducing the permissions to least privilege. Also, Amazon Bedrock uses secondary KMS key grants for model encryption, which means that they’re automatically retired as soon as the operation that Amazon Bedrock performs on behalf of the customer is completed. The Encryption of model customization jobs and artifacts describes in more detail how grants are used.
To completement these IAM policy guardrails, you can add network controls to reduce the scope of the permissions of the process. Because we focus on IAM policies in this post, we won’t go into details here but only mention how the process works.
When you start a model customization job, a model training job is triggered within the model deployment account. The training job takes a base model from its S3 bucket, then connects to the S3 bucket that holds the customization training data to start the customization. This can be done through your VPC, where you specify a VPC configuration such as subnets and security groups, and the training job places an elastic network interface (ENI) into that VPC as specified. A request to the S3 bucket to read the training data now adheres to whatever routing rules are present in the VPC for that ENI. The VPC routing and security group attached to the ENI can be used to limit networking access to the model customization job.
Amazon Bedrock Agents
Amazon Bedrock Agents offers the capability to build and configure autonomous agents for applications. You can find more information about Amazon Bedrock Agents in Automate tasks in your application using AI agents.
Using an Amazon Bedrock agent also provides certain security properties that are applied to an inference task. For example, at the time of writing, there is no IAM condition key for the bedrock:InvokeModel API to require an Amazon Bedrock guardrail being attached to that same call. However, you can require that inferences are invoked through a call to an agent that has specific Amazon Bedrock guardrails configured.
Let’s say that you want to create a role that explicitly is only allowed to invoke Amazon Bedrock models through a specific Amazon Bedrock agent. The following IAM principal permissions policy example implies that the Amazon Bedrock agent specified has approved Amazon Bedrock guardrails configured. Again replace <region>, <account-id>, <bedrock-agent-id>, and <bedrock-agent-alias-id> with the values of your Amazon Bedrock agent.
Provided that the Amazon Bedrock agent is configured by a systems administrator or operator with an approved Amazon Bedrock guardrail, the principal with the preceding policy attached to it will be able to invoke it with a prompt, and won’t directly invoke an Amazon Bedrock model. This strategy for making sure that Amazon Bedrock guardrails are applied to all Amazon Bedrock invocations is currently not possible with the bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream APIs, because they don’t have a condition key to match an Amazon Bedrock guardrail to. In addition, denying bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream also denies the Converse APIs and StartAsyncInvoke APIs, so there’s no need to add these separately to the Deny statement.
Because this strategy verifies the use of specific Amazon Bedrock guardrails, you can also use it for the enforcement of specific prompts, IAM service roles, knowledge bases, prompt and completion content restrictions, KMS keys, and FMs in inference invocations. For this approach to be effective, you need to also limit the principals that can create and update Amazon Bedrock agent configurations. Again, this can be restricted using an IAM policy, which is attached to only specific principals.
The following is an example IAM policy statement that gives an attached IAM principal the ability to update the configuration of a specific Amazon Bedrock agent, replacing <region>, <account-id>, and <agent-id> with the Region, account and identifier, and agent identifier that you’re using. If you want this to apply to all agents, replace <agent-id> with an asterisk (*).
Where agents aren’t suitable, users or applications performing inference against the Amazon Bedrock models will need permissions to call the bedrock:InvokeModel, bedrock:InvokeModelWithResponseStream, or bedrock:CreateModelInvocationJob actions. In these cases, it can be desirable to limit the target models, following an allowlisting approach. Also, such permissions would only be attached to roles or applications that need to use them.
The following is an example of such a policy that restricts invocation to Anthropic’s Claude Instant.
You can include detective or reactive controls using Amazon CloudWatch EventBridge rules to detect model invocations that don’t use appropriate Amazon Bedrock guardrails, but that’s outside the scope of this post.
Model testing
Testing is the last step before the solution is deployed. Types of tests include unit tests, integration tests, user acceptance tests, penetration testing, and more. In this phase, you can again verify that the permissions that were assigned are indeed least privilege.
Especially in functional tests where data is involved, it’s important to consider that the data used for testing might be confidential. This is typically true when no synthetic test data is generated for the testing process. Controls to restrict access to the data and logs that are produced and might contain pieces of this data need to be the same as you will apply in a production environment. That you are only testing the solution doesn’t automatically mean that data access controls aren’t needed.
As discussed earlier in this post, controls are activated not only on identities, but also on the network and on resources. All of these should be validated and their effectiveness confirmed in this phase. Tests include but aren’t limited to:
Validate that you can only perform allowed actions in Amazon Bedrock through VPC endpoints, and that actions that don’t use VPC endpoints are blocked.
Validate the effectiveness of the resource policies on VPC endpoints by making sure that they can only be used by authenticated and authorized principals.
When using knowledge bases, validate that only the Amazon Bedrock service principal can access them.
When using Amazon Bedrock guardrails, evaluate their effectiveness. Because of the nature of generative AI applications, the diversity in input and output data can be big. Therefore, make sure to test guardrails with a reasonably large number of prompts.
If model invocation logging is activated, validate that logs are correctly written and protected with IAM permissions and encryption.
Validate that only the required personnel can access these logs, because they might contain sensitive data. Consider automatically sanitizing and forwarding them to a new CloudWatch log group.
Validate that all relevant Amazon Bedrock API calls are being properly logged in AWS CloudTrail, and that you can effectively monitor and alert on any suspicious activity.
Make sure that sensitive information—such as prompts and responses—isn’t being stored in the CloudTrail logs or in any trace output.
The threat modelling that you have potentially created in the design phase can provide valuable inputs that you can use for security-related test cases.
Model operation
In this phase, the solution is finally deployed into production. Operators need Amazon Bedrock control plane permissions to provision and manage Amazon Bedrock resources such as agents, guardrails, prompt libraries, and knowledge bases. They should only get the control plane permissions to provision and manage the Amazon Bedrock features that are being used. These same operators should have access to invoke or configure the Amazon Bedrock service features or their resources (Amazon Bedrock Agents, Amazon Bedrock Guardrails, and prompt libraries) only through an authorized pipeline. This immutable infrastructure approach restricts human users from creating situations or configurations that would otherwise allow unapproved access to the data plane, untracked changes to the control plane, or potentially disruptive updates to your application.
Alternatively, to reduce the assigned permissions to the absolute minimum, automated deployments using pipelines and pipeline roles can be used. This will not only provide versioned infrastructure but also adheres to PoLP by not providing access to human identities.
After deployment, the solution is live and being accessed by real users. At this point, topics such as monitoring, logging, and incident response become relevant. While Amazon Bedrock by default doesn’t store inference or response data, it’s recommended that you activate logging of those elements to constantly verify the accuracy of your generative AI application.
By using this solution, you reduce access to a minimum in the following areas:
Logged prompts and responses
Data available through knowledge bases, RAG, or similar sources
The ability to change the infrastructure
Use multiple, dedicated, least privileged roles for each task. This helps reduce the permissions scope to a minimum. Also, because least privilege enforces using a specific role for a specific task, it reduces the risk of unintended changes by requiring the assumption of a specific role.
By following the AWS Security Reference Architecture, security monitoring data is consolidated in a central security account. This allows a comprehensive, central overview of the security posture of your infrastructure.
Logging sensitive information
An important operational aspect is logging potentially sensitive data that’s sent to or received from the LLM. While Amazon Bedrock doesn’t store prompts or responses, you can use model invocation logging to collect invocation logs, model input data, and model output data for all invocations in your AWS account used in Amazon Bedrock. Model invocation logging isn’t enabled by default. After it’s enabled, prompts, completions, or both for all invocations of all approved models are then logged to the configured log destination. Valid log destinations for prompt and completion logs are Amazon S3 and CloudWatch. When writing logs to these destinations, they can optionally be encrypted using a supplied KMS key.
The contents of these logs might contain sensitive information in the prompt provided by the user or the reply generated by the model. As such, access to these logs should be restricted to personnel and machine processes that require and are authorized to access this classification of data. There are strategies such as using Amazon Macie on the Amazon S3 logs and CloudWatch Logs data protection capabilities to detect, monitor, and redact this sensitive information from logs, but that’s outside the scope of this post.
Even with Amazon Bedrock guardrails in place, the contents of these logs contain the pre-guardrailed user input, and so you must assume that these prompt and completion logs contain sensitive information. In this case, best practice is to encrypt log data with a KMS key, apply a data protection policy to the log group, and define at least three IAM roles:
BasicCompletionLogReviewer: An IAM role whose sole purpose is to access and review the redacted version of these logs.
SensitiveDataCompletionLogReviewer: A restricted IAM role whose sole purpose is to access and review the unredacted version of these logs.
CompletionLogAdmin: A restricted IAM role whose sole purpose is to create, view, and delete data protection policies that can send audit findings to Amazon S3 and CloudWatch destinations.
To allow reading log events in a specific log group, use a policy such as the following and attach it to the BasicCompletionLogReviewer role, replacing <region>, <account-id>, <log-group-name>, and <alias-name> with values that match your CloudWatch log group and the KMS key that encrypts it.
With an active data protection policy in place, the preceding policy won’t allow access to the unredacted version of these logs. To allow access to the unredacted versions to the SensitiveDataCompletionLogReviewer role, you need to add an additional action, replacing <region>, <account-id>, <log-group-name>, and <alias-name> with values that match your CloudWatch log group and the KMS key that encrypts it.
The policy for the CompletionLogAdmin role requires different permissions; the following sample policy allows a user to create, view, and delete data protection policies that can send audit findings to all three types of audit destinations. It doesn’t permit the user to view unmasked data. This policy will look like the following example, replacing <delivery-stream-id>, <bucket-name>, <log-group-name>, and <alias-name> with the values that match your setup. Note that this includes a statement that explicitly denies the attached role access to decrypt the logs with the configured KMS key, aligning with the PoLP:
This approach helps to avoid inadvertent access to or exposure of this sensitive data source and upholds the PoLP by separating duties.
Review IAM permissions on a periodic basis
Managing permissions is an ongoing effort because requirements and functionality change over time. Therefore, we recommend regularly reviewing the assigned permissions and verifying that they aren’t overly permissive. For example, if you have a Lambda function that makes API calls to Amazon Bedrock, and changes are made to that function that require additional permissions (perhaps the use of a new model), then it’s acceptable to update the policy attached to the IAM role that the function uses. It’s not always obvious that permissions for using the earlier model are still needed in the same policy; or permissions might be widened unnecessarily to include all models. When applying the PoLP, it’s important that policies be tested at the time they’re deployed to make sure that they meet the exact application needs and no more, but also that the presumed needs are reviewed periodically.
Using AWS IAM Access Analyzer, you can review and simulate proposed changes to IAM policies to ensure their suitability for a given application or function. You can also use IAM Access Analyzer to review unused permissions over time. This gives system operators an opportunity to inspect and then inform the removal of unused permissions in policies used with Amazon Bedrock applications. Remember that some permissions are dormant and ready for periodic use, such as incident response, recovery and other rare use cases, so your review shouldn’t assume that an unused permission is unnecessary, but an opportunity to review the need for the permission.
Finally, align monitoring of new Amazon Bedrock APIs with your IAM strategy. Especially when using denylisting approaches, it’s important to consider that services will announce new APIs, capabilities, and FMs over time. An example for this was the announcement of the new Converse API. This API provides functionality similar to Invoke, but in a consistent and thus simpler way. Considering such changes is therefore an integral part of your regular policy review processes.
Strong identity and access management is a journey, not a one-time action.
Conclusion
In this post, we have demonstrated some ways that you can apply the principle of least privilege (PoLP) to large language model (LLM)-based applications that use Amazon Bedrock. We have discussed the security considerations of each phase of the development lifecycle and provided examples that you can use as a starting point to implement your own PoLP strategy. It’s important that security doesn’t start late in the process; think about risks and the required actions as early as possible to make sure that your strategy is effective when your application goes live.
Finally, remember that the field of generative AI is moving quickly. We believe that it has the potential to transform virtually every customer experience. From a security perspective, this means that the threat landscape will change and evolve over time. Make sure to constantly adapt to new risks; evaluate and integrate them into your PoLP strategy.
Your AWS account team and specialists are happy to assist you on this journey.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
Amazon EKS introduced Pod Identity in 2023 as a feature that streamlines the process of configuring IAM permissions for Kubernetes applications. Pod Identity simplifies the identity management of applications running on top of Amazon EKS by allowing you to set up permissions directly through EKS interfaces, reducing the number of steps required and alleviating the need to switch between the EKS and IAM services. Pod Identity enables the use of a single IAM role across multiple clusters without updating trust policies, and it supports role session tags for more granular access control. This approach not only simplifies policy management by allowing the reuse of permission policies across roles, but also enhances security by enabling access to AWS resources based on matching tags.
Background
When your applications that are running on top of Amazon EKS require sensitive information like credentials to access a database, or a key to authenticate through an API, we call this kind of information secrets. You can secure, store, and manage secrets in AWS Secrets Manager. ASCP allows Kubernetes applications to securely retrieve the secrets stored in Secrets Manager and Systems Manager Parameter Store. Previously, ASCP relied on IAM roles for service accounts (IRSA) for authentication. Although IRSA provided improvements over previous methods, Pod Identity offers even greater security and simplicity.
Pod Identity provides a more secure and efficient way to integrate IAM roles with applications running on EKS, granting more granular AWS permissions to individual Pods, so that you don’t need instance-level credentials or IRSA.
This integration provides the following key benefits over using IRSA:
Enhanced security: Pod Identity provides a more granular and secure way to manage permissions at the Pod level, alleviating the need to expose the IAM role annotation on Kubernetes ServiceAccount objects.
Simplified configuration:Pod Identity streamlines and simplifies the setup process, reducing the potential for misconfiguration, especially in high-scale environments.
Improved operation: Pod Identity reduces the operational overhead compared to previous methods, centralizing the management with the AWS API.
Native EKS integration: Pod Identity is the new standard for IAM integration for applications running on EKS and provides a more cohesive experience.
Solution overview
With this integration, ASCP uses Pod Identity to authenticate and authorize access to AWS services. When a Pod requires access to a secret, the workflow is as shown in Figure 1.
Figure 1: The workflow performed by the Pod Identity and ASCP integration to provide access to a secret stored on AWS Secrets Manager for a Pod running on Amazon EKS
The workflow is as follows:
A user creates an IAM role and a Pod Identity association between the IAM role and the Kubernetes ServiceAccount assigned to the Pod.
EKS API validates the Pod Identity association, allowing ASCP to use this role to authenticate with AWS services.
If authorized, the Pod Identity agent allows ASCP to assume the IAM role assigned to the Pod through the use of a ServiceAccount token.
The Pod retrieves the requested secrets values and makes them available to the Pod through the use of a mounted volume.
Prerequisites
You need to have the following prerequisites in place in order to implement this solution:
An Amazon EKS cluster (version 1.24 or later)
Pod Identity enabled on your cluster
Access to two AWS accounts (for cross-account access)
This guide presents two scenarios: single-account setup and cross-account setup. Complete the single-account steps in Account A before proceeding to the cross-account configuration, which involves Account B. The cross-account setup builds on the single-account foundation to demonstrate secure secrets management across AWS accounts.
Amazon EKS cluster setup
Before you start, you’ll need to set up an Amazon EKS cluster with the required add-ons in a single account (Account A).
(Optional) Use the following commands to set environment variables and create an Amazon EKS cluster:
kubectl --namespace=kube-system get pods -l "app=secrets-store-csi-driver"
You should see this expected output:
NAME READY STATUS RESTARTS AGE
csi-secrets-store-secrets-store-csi-driver-fcxf6 3/3 Running 0 41s
csi-secrets-store-secrets-store-csi-driver-j9wqh 3/3 Running 0 41s
Now that you’ve set up your Amazon EKS cluster, in this use case, you’ll create an AWS Secrets Manager secret in the same account where the cluster and the applications reside (Account A).
Create a secret in AWS Secrets Manager with tags kubernetes-namespace and eks-cluster-name within the same AWS account as the EKS cluster:
In the preceding example IAM role permission policy, the use of conditions with kubernetes-namespace and eks-cluster-name tags helps enforce fine-grained access control by specifying that secrets can only be accessed by Pods from specific namespace and clusters. This allows secretsmanager actions only if the secret is tagged with the matching kubernetes-namespace and eks-cluster-name values.
Notice that the preceding example IAM role trust policy allows the Amazon EKS Pod Identity service (pods.eks.amazonaws.com) to assume the role and tag the session. These actions are necessary for Pod Identity to function correctly, enabling Pods to securely access AWS resources.
Finally, apply the role and trust policy:
aws iam create-role --role-name ascp-podidentity --assume-role-policy-document file://trust.json
aws iam put-role-policy --role-name ascp-podidentity --policy-name ascp-podidentity --policy-document file://policy.json
Note the ARN of the new role. You will use it in the next step.
Create a Kubernetes ServiceAccount and add the Pod Identity association between the ServiceAccount and the IAM role:
Create a SecretProviderClass to use the newly created secret in your Amazon EKS cluster.
cat << EOF | kubectl apply -f -
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
name: aws-secrets-manager
spec:
provider: aws
parameters:
objects: |
- objectName: "secret-a" # Secret name or ARN to be mounted to the Pod
objectType: "secretsmanager"
usePodIdentity: "true" # Indicator to use Pod Identity instead of IRSA
EOF
Create a new deployment to consume your newly created secret using Pod Identity as a mounted volume.
cat << EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: application-a
name: application-a
spec:
replicas: 1
selector:
matchLabels:
app: application-a
template:
metadata:
labels:
app: application-a
spec:
containers:
- image: public.ecr.aws/amazonlinux/amazonlinux:2023-minimal
name: amazonlinux
args:
- infinity
command:
- sleep
volumeMounts: - name: secrets-store-inline mountPath: "/mnt/secret" # Directory where secret will be mounted to readOnly: true volumes: - name: secrets-store-inline csi: driver: secrets-store.csi.k8s.io readOnly: true volumeAttributes: secretProviderClass: "aws-secrets-manager" # Match SecretProviderClass name created serviceAccountName: serviceaccount-a # Specify service account created
EOF
Validate that the deployment was created successfully and confirm the secret was mounted correctly.
kubectl get pods -l app=application-a
kubectl exec -it $(kubectl get pods -l app=application-a -o name) -- cat /mnt/secret/secret-a
You should see this expected output:
NAME READY STATUS RESTARTS AGE
application-a-b98d44bb8-bzvn4 1/1 Running 0 3s
{"user":"user1","password":"passwd1"}%
Working with cross-account secrets through resource policies
For this second use case, you’ll also use the Amazon EKS cluster on Account A, and create a new Secrets Manager secret in a different account (Account B).
In order to access a secret in a different account, you can’t use the default AWS Key Management Service (AWS KMS) key, but will use aws/secretsmanager to encrypt this secret. So you need to first create a new AWS KMS key that allows cross-account access.
On Account B
Create a customer managed key on AWS KMS with cross-account permissions:
Create a SecretProviderClass to use the cross-account secret created in Account B in your Amazon EKS cluster:
cat << EOF | kubectl apply -f -
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
name: aws-secrets-manager-cross
spec:
provider: aws
parameters:
objects: |
- objectName: "$SECRET_CROSS_ARN" # Full ARN of the Secret in the ACCOUNT B to be mounted on Pod
objectType: "secretsmanager"
usePodIdentity: "true" # Indicator to use Pod Identity instead of IRSA
EOF
Create a new deployment to consume your newly created secret using Pod Identity as a mounted volume:
cat << EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
labels:
app: application-b
name: application-b
spec:
replicas: 1
selector:
matchLabels:
app: application-b
template:
metadata:
labels:
app: application-b
spec:
containers:
- image: public.ecr.aws/amazonlinux/amazonlinux:2023-minimal
name: amazonlinux
args:
- infinity
command:
- sleep
volumeMounts: - name: secrets-store-cross mountPath: "/mnt/secret" # Directory where secret will be mounted to readOnly: true volumes: - name: secrets-store-cross csi: driver: secrets-store.csi.k8s.io readOnly: true volumeAttributes: secretProviderClass: "aws-secrets-manager-cross" # Match SecretProviderClass name created for cross-account access serviceAccountName: serviceaccount-b # Specify service account created for cross-account access
EOF
Validate that the deployment was created successfully and confirm the mounted secret:
kubectl get pods -l app=application-b
kubectl exec -it $(kubectl get pods -l app=application-b -o name) -- cat /mnt/secret/$SECRET_CROSS_ARN
Note that the volume name is the secret ARN, because it’s a cross-account secret.
You should see this expected output:
NAME READY STATUS RESTARTS AGE
application-b-67b755444f-ngrhv 1/1 Running 0 8s
"This is a Cross Account Secret"%
Conclusion
The integration of ASCP with Pod Identity marks a significant step forward in secrets management for Amazon EKS. It offers enhanced security, simplified configuration, and improved operations. We encourage all EKS users to explore this new integration and take advantage of its benefits.
The integration of ASCP with Pod Identity offers these benefits over IRSA:
Simplified setup: With Pod Identity, you don’t need to create and manage service accounts for each application.
Enhanced security: Pod Identity provides more granular control over permissions at the Pod level.
Improved scalability: Pod Identity is easier to implement in large-scale environments.
Consistent AWS experience: Pod Identity aligns more closely with AWS best practices for IAM management.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
The collective thoughts of the interwebz
Manage Consent
To provide the best experiences, we use technologies like cookies to store and/or access device information. Consenting to these technologies will allow us to process data such as browsing behavior or unique IDs on this site. Not consenting or withdrawing consent, may adversely affect certain features and functions.
Functional
Always active
The technical storage or access is strictly necessary for the legitimate purpose of enabling the use of a specific service explicitly requested by the subscriber or user, or for the sole purpose of carrying out the transmission of a communication over an electronic communications network.
Preferences
The technical storage or access is necessary for the legitimate purpose of storing preferences that are not requested by the subscriber or user.
Statistics
The technical storage or access that is used exclusively for statistical purposes.The technical storage or access that is used exclusively for anonymous statistical purposes. Without a subpoena, voluntary compliance on the part of your Internet Service Provider, or additional records from a third party, information stored or retrieved for this purpose alone cannot usually be used to identify you.
Marketing
The technical storage or access is required to create user profiles to send advertising, or to track the user on a website or across several websites for similar marketing purposes.