Encryption can protect data at rest and data in transit, but does nothing for data in use. What we have are secure enclaves. I’ve written about this before:
Almost all cloud services have to perform some computation on our data. Even the simplest storage provider has code to copy bytes from an internal storage system and deliver them to the user. End-to-end encryption is sufficient in such a narrow context. But often we want our cloud providers to be able to perform computation on our raw data: search, analysis, AI model training or fine-tuning, and more. Without expensive, esoteric techniques, such as secure multiparty computation protocols or homomorphic encryption techniques that can perform calculations on encrypted data, cloud servers require access to the unencrypted data to do anything useful.
Fortunately, the last few years have seen the advent of general-purpose, hardware-enabled secure computation. This is powered by special functionality on processors known as trusted execution environments (TEEs) or secure enclaves. TEEs decouple who runs the chip (a cloud provider, such as Microsoft Azure) from who secures the chip (a processor vendor, such as Intel) and from who controls the data being used in the computation (the customer or user). A TEE can keep the cloud provider from seeing what is being computed. The results of a computation are sent via a secure tunnel out of the enclave or encrypted and stored. A TEE can also generate a signed attestation that it actually ran the code that the customer wanted to run.
Secure enclaves are critical in our modern cloud-based computing architectures. And, of course, they have vulnerabilities:
The most recent attack, released Tuesday, is known as TEE.fail. It defeats the latest TEE protections from all three chipmakers. The low-cost, low-complexity attack works by placing a small piece of hardware between a single physical memory chip and the motherboard slot it plugs into. It also requires the attacker to compromise the operating system kernel. Once this three-minute attack is completed, Confidential Compute, SEV-SNP, and TDX/SDX can no longer be trusted. Unlike the Battering RAM and Wiretap attacks from last month—which worked only against CPUs using DDR4 memory—TEE.fail works against DDR5, allowing them to work against the latest TEEs.
Yes, these attacks require physical access. But that’s exactly the threat model secure enclaves are supposed to secure against.
AWS re:Invent 2025, the premier cloud computing conference hosted by Amazon Web Services (AWS), returns to Las Vegas, Nevada, from December 1–5, 2025. This flagship event brings together the global cloud community for an immersive week of learning, collaboration, and innovation across multiple venues. Whether you’re a cloud expert, business leader, or technology enthusiast, re:Invent offers unparalleled opportunities to explore cutting-edge cloud solutions, engage with AWS experts, and build valuable connections with peers from around the world.
From technical deep dives to strategic business sessions, re:Invent 2025 is your gateway to understanding and using the most advanced cloud technologies. In the Expo, you can visit the Digital Sovereignty and Hybrid Cloud kiosks in the AWS Village to learn about the upcoming AWS European Sovereign Cloud and other digital sovereignty solutions, and get your questions answered by AWS experts.
Join us to discover the latest cloud industry innovations, gain deep technical insights, and learn how to optimize your cloud investments for digital sovereignty. Sessions this year will include comprehensive coverage of the AWS sovereign-by-design approach, including the enhanced security capabilities of the AWS Nitro System, our expanding portfolio of digital sovereignty solutions, and the latest developments of the AWS European Sovereign Cloud. With the growing momentum around digital sovereignty, explore how AWS continues to innovate with sovereign cloud solutions that help customers maintain control over their data while using the full power of the cloud. You can customize your learning path by reserving session seating now by signing in to your attendee portal or the AWS Events mobile app.
Breakout sessions and code talks
To add sessions to your AWS re:Invent agenda and find time and location information, choose the session title link.
Security track
SEC201 | Breakout | AWS European Sovereign Cloud: From concept to reality Colm MacCárthaigh, VP/Distinguished Engineer – EC2 Networking, AWS Addy Upreti, Principal Technical Product Manager – EC2 Core Product Management, AWS Get a firsthand look at the AWS European Sovereign Cloud. Explore this new, independent infrastructure’s dedicated architecture, EU-based operations, operational controls coupled with governance and legal framework that powers this cloud. Learn how this cloud solution is built, operated, and secured entirely within Europe.
Cloud operations track
COP409 | Code Talk | Building Sovereign Cloud Environments Bo Lechangeur, Pr. Delivery Engineer – STCE, AWS, and Randy Domingo, Sr. Software Development Manager – STCE, AWS As organizations scale their operations globally, they need to meet evolving data residency, security, compliance, and business continuity requirements. This session explores how AWS Control Tower and Landing Zone Accelerator on AWS support key sovereignty requirements, including country-specific compliance frameworks, regional service selection, automated controls for data movement, and cross-border transfers. Through real-world examples, the session demonstrates how organizations can leverage AWS to implement country-specific security controls, maintain operational consistency across multi-region deployments, accelerate cloud compliance, and deploy automated security and compliance at scale.
Hybrid cloud and multicloud track
HMC202 | Breakout | AWS wherever you need it: From the cloud to the edge Speakers: Spencer Dillard, Director, Software Development – EC2 Edge, AWS, Madhura Kale, Senior Manager, Technical Product Management – EC2 Core, AWS While most workloads can be migrated to the cloud, some remain on-premises or at the edge due to low latency, local data processing, or digital sovereignty needs. In this session, learn how AWS services like AWS Outposts, AWS Local Zones, AWS Dedicated Local Zones, and AWS IoT support hybrid cloud and edge computing workloads such as multiplayer gaming, high-frequency trading, medical imaging, smart manufacturing, and generative AI applications with data residency requirements.
HMC308 | Breakout | Build generative and agentic AI applications on-premises and at the edge Speakers: Chris McEvilly, Senior Solutions Architect – Hybrid Edge, AWS, Pranav Chachra, Principal Technical Product Manager – EC2 Core, AWS, and Fernando Galves, Senior Solutions Architect – Generative AI, AWS As customers scale generative AI and agentic AI implementations from pilots to production, they need to balance speed of innovation with data sovereignty requirements, low-latency edge processing needs, and space, power, and cost efficiency. This session explores how to build generative and agentic AI solutions using AWS Local Zones, AWS Outposts, and AWS Dedicated Local Zones. Discover architectural patterns and best practices for deploying foundation models across distributed locations. Learn how to implement Retrieval Augmented Generation (RAG) with locally stored data. Gain insights into strategies for model selection and optimization.
HMC310 | Breakout | Digital sovereignty and data residency with AWS Hybrid and Edge services Speakers: Mallory Gershenfeld, Senior Technical Product Manager – S3, AWS, Ben Lavasani, Senior Specialist – Hybrid and Edge, AWS, and Majd Aldeen Masriah, Director of Enterprise – Architecture, Geida Countries around the world are increasingly introducing or updating data residency and digital sovereignty laws that require at least one copy, or sometimes all data, to be stored or processed in a specific geographic or sovereign location that introduces new challenges for customers. This session explores how AWS services, including AWS Dedicated Local Zones, AWS Local Zones, and AWS Outposts can help you with your digital sovereignty use cases. We’ll examine best practices for data residency, security controls, and operational consistency across deployments at the edge.
Interactive sessions (chalk talks and workshops)
Security track
SEC301| Chalk Talk | Architecting for Digital Sovereignty: From Foundation to Practice Speakers: Eric Rose, Principal Security SA – Global Services Security, AWS and Armin Schneider, Digital Sovereignty Specialist SA – Global Services Security Digital Sovereignty Join this chalk talk that bridges security fundamentals with practical architecture strategies for implementing digital sovereignty in the cloud. Through real-world examples from the United Arab Emirates Cybersecurity Council and the upcoming AWS European Sovereign Cloud, we’ll explore how organizations can use AWS sovereignty features effectively. We’ll cover practical architectural patterns for data residency, operational control, and security measures that help customers maintain full control of their data. Perfect for cloud architects and security teams, this session will show you how to design solutions that balance sovereignty requirements with cloud advantages, illustrated with examples from government and enterprise deployments.
Hybrid cloud and multicloud track
HMC301| Workshop | Build and operate resilient and performant distributed applications Speakers: Saravanan Shanmugam, Senior Solutions Architect – Hybrid Edge, AWS and Sedji Gaouaou, Senior Solutions Architect – Networking, AWS This workshop explores how to design and implement applications for multi-geo operations while meeting data residency and performance requirements. You will learn how to design fault-tolerant, latency-sensitive applications across distributed locations with limited hardware resources. You will also explore distributed hybrid architectures, edge networking implementations, and traffic management solutions that balance regulatory requirements with high availability needs. Learn practical strategies for optimizing performance while maintaining data sovereignty across distributed locations.
HMC302| Workshop| Implementing agentic AI solutions on-premises and at the edge Speakers: Fernando Galves, Senior Solutions Architect – Generative AI, AWS and Kyle Palasti, Senior Solutions Architect – Hybrid Edge, AWS As governments and standards bodies develop data protection and privacy regulations, organizations increasingly need to combine the use of generative AI tooling in the cloud with regulated data that needs to remain on-premises to meet data residency requirements. In this workshop, learn how to extend Amazon Bedrock AgentCore to hybrid and edge services like AWS Outposts and AWS Local Zones to build distributed agentic applications using Model Context Protocol (MCP) and agent-to-agent (A2A) communication with on-premises data for improved model outcomes. Get hands-on with hybrid agentic AI using Amazon Bedrock and Strands Agents while exploring AWS hybrid and edge services.
HMC305 | Workshop | Low-latency SLM deployment: Optimizing inference on AWS Hybrid and Edge Services Speakers: Leonardo Solano, Principal Solutions Architect – Networking & Hybrid Edge, AWS and Obed Gutierrez, Senior Solutions Architect, AWS This hands-on workshop demonstrates a fully local deployment approach for running Small Language Models (SLMs) at the edge using AWS Local Zones and AWS Outposts. The implementation focuses on achieving low-latency inference and enabling data sovereignty compliance through Retrieval Augmented Generation (RAG) applications within local infrastructure. Using Amazon Elastic Compute Cloud (Amazon EC2) instances and publicly available models, you will learn how to deploy, optimize, and manage SLMs in edge environments, ensuring the RAG system and language model operate locally to meet strict latency and data residency requirements for production scenarios.
HMC312 | Chalk Talk | Implement RAG while meeting data residency requirements Speakers: Lakshmi VP, Solutions Architect, AWS and Akshata Ketkar, Senior Product Manager – EC2 Edge, AWS As governments develop data protection and privacy regulations, organizations increasingly need to leverage generative AI with regulated data that needs to remain on-premises to meet data sovereignty requirements. This session explores how to implement Retrieval Augmented Generation (RAG) with on-premises and edge data. Learn how to extend Amazon Bedrock AgentCore to AWS Outposts and AWS Local Zones for a hybrid RAG architecture, or build a local RAG architecture for more stringent data residency requirements. Discover the latest techniques like reranker models to improve precision without increasing model size, reduce inference cost, and enforce more governance and control over prompt outcomes.
HMC314 | Chalk Talk | Deploying for resilience: HA/DR strategies for AWS Outposts and Local Zones Speakers: Afaq Khan, Senior Product Manager – EC2 Edge, AWS and Brianna Rosentrater, Senior Solutions Architect – Hybrid Edge, AWS Critical workloads at the edge demand robust high-availability and disaster recovery strategies. In this chalk talk, learn how to plan and implement resilient deployments using AWS hybrid cloud and edge computing services. We’ll examine how to architect edge infrastructure using AWS Local Zones and AWS Outposts, covering key aspects of networking, compute, and storage redundancy. Through real customer examples and reference architectures, we’ll explore deployment patterns and best practices for maintaining business continuity across failure modes. Join us to learn practical strategies for achieving your RPO/RTO objectives with edge deployments.
HMC403 | Code Talk | Build and optimize edge architects for resiliency with AI Speakers: Jesus Federico, Principal Solutions Architect – Generative AI, AWS and Robert Belson, Senior Solutions Architect & Developer Advocate, AWS This live coding session explores how to automate edge infrastructure operations with AI. Discover how to build truly resilient architectures with the latest AWS Outposts and AWS Local Zones APIs. We’ll walk through real-world code examples for querying Outposts hardware inventory, implementing intelligent resource placement, and automating failover configurations. You’ll learn how Amazon Bedrock can analyze architecture patterns and generate Infrastructure as Code (IaC) recommendations for optimal component distribution. Walk away with practical techniques for API integration, automated health checks, and dynamic resource allocation, plus working code samples and deployment templates for building adaptive, highly available edge solutions.
HMC316 | Chalk Talk | Address digital sovereignty with hybrid cloud solutions Speakers: Sherry Lin, Principal Product Manager – EC2 Core, AWS and Enrico Liguori, Solutions Architect – Networking, AWS As organizations scale innovative solutions globally, they need to navigate complex digital sovereignty requirements. This session explores how AWS can help you accelerate global scaling while meeting regulatory obligations. We’ll compare various sovereign infrastructure options with a focus on AWS Local Zones, AWS Dedicated Local Zones, AWS Outposts, and AWS European Sovereign Cloud. Learn how to choose the best option for your sovereign needs and architect applications for data residency and resiliency. Discover how to implement security controls to regulate how data can be stored, processed, and transferred, and how to prevent unauthorized data access.
For a full view of digital sovereignty content, including sessions with partners, explore the AWS re:Invent catalog and filter on the Digital Sovereignty area of interest. Not able to attend in-person? Register forthe virtual-only pass offered at no additional cost to livestream keynotes and innovation talks, and access on-demand breakout sessions today. See you in Las Vegas or on the livestream!
If you have feedback about this post, submit comments in the Comments section below.
We pointed a commercial-off-the-shelf satellite dish at the sky and carried out the most comprehensive public study to date of geostationary satellite communication. A shockingly large amount of sensitive traffic is being broadcast unencrypted, including critical infrastructure, internal corporate and government communications, private citizens’ voice calls and SMS, and consumer Internet traffic from in-flight wifi and mobile networks. This data can be passively observed by anyone with a few hundred dollars of consumer-grade hardware. There are thousands of geostationary satellite transponders globally, and data from a single transponder may be visible from an area as large as 40% of the surface of the earth.
In our previous blog post (Part 1 of our key replication series), Automatically replicate your card payment keys across AWS Regions, we explored an event-driven, serverless architecture using AWS PrivateLink to securely replicate card payment keys across AWS Regions. That solution demonstrated how to build a custom replication framework for payment cryptography keys.
Based on customer feedback requesting a more automated, no-code approach, we’re excited to announce an additional option to this capability with Multi-Region keys for AWS Payment Cryptography in Part 2 of our series.
By using this new feature, you can automatically synchronize payment cryptography keys from a primary Region to other Regions that you select, improving resilience and availability of payment applications. You can also choose between account-level replication or key-level replication, giving more flexibility in how to manage payment keys across Regions.
Multi-Region keys: Overview and benefits
The new Multi-Region key replication feature for AWS Payment Cryptography offers you flexible control over your key replication strategy through the following primary capabilities:
Control whether keys are replicated
Select specific Regions for key replication
Manage replication configuration changes
Configure either account-level or key-level replication to meet business needs
Multi-Region keys help deliver several benefits for global payment operations, including:
Improved availability: Access your payment keys even if a Region becomes unavailable
Disaster recovery: Maintain business continuity with replicated keys across Regions
Global operations: Support payment processing across multiple geographic regions
Simplified management: Centralized control with distributed availability
Consistent key IDs: The same key ID across Regions simplifies application development
Configuration options
Payment Cryptography provides two distinct methods for configuring Multi-Region key replication, giving flexibility to implement a strategy that best fits your organization’s needs. You can choose between a broad, account-level approach or a more granular, key-level method.
Account-level
With account-level configuration, AWS automatically replicates exportable symmetric keys created in your Payment Cryptography account from your designated primary Region to other Regions you specify. This simplifies key management in multi-Region deployments, provides consistent key availability in the Regions that you specify, and reduces the operational overhead of key management.
To configure account-level replication using the AWS Command Line Interface (AWS CLI), use the new enable-default-key-replication-regions API to set the Regions where AWS will replicate your keys. To remove Regions from your default replication list, use the disable-default-key-replication-regions API.
Note: Only symmetric keys created after the account-level replication is enabled will be replicated.
Key-level replication
By using key-level replication, you can achieve more granular control by:
Designating specific keys as multi-Region keys
Defining custom replication targets for each multi-Region key
Maintaining Region-specific keys when needed
Note: Within each Region, Payment Cryptography maintains redundancy of your keys across multiple Availability Zones for high availability. Multi-Region key replication extends across geographic boundaries, giving you additional resilience against Regional outages while maintaining control over where your keys are stored.
You can specify replication Regions during key creation using the --replication-regions parameter, using the AWS CLI, with the create-key or import-key APIs. For existing keys, you can use the new add-key-replication-regions and remove-key-replication-regions APIs to manage which regions receive your replicated keys.
Important: When you specify replication Regions during key creation, these settings take precedence over default replication Regions configured at the account level.
How it works
Figure 1 shows the process when you replicate a key in Payment Cryptography.
The key is created in your designated primary Region
Payment Cryptography automatically replicates the key material asynchronously to the specified replica Regions
The replicated keys maintain the same key ID across Regions; only the Region portion of the Amazon Resource Name (ARN) changes
The key in the primary Region is marked with MultiRegionKeyType: PRIMARY
Keys in replica Regions are marked with MultiRegionKeyType: REPLICA and include a reference to the primary Region
When deleting a key, its deletion cascades from the primary to replica Regions
Figure 1: Representation of key replication from us-east-1 to us-west-2
Example: Creating a multi-Region key at key level
The following is an example of creating a card verification key (CVK) in the primary Region (us-east-1) with replication to us-west-2:
When using multi-Region keys, several important aspects should be considered. Multi-Region key replication supports only symmetric keys with the exportable attribute enabled, and asymmetric keys are not supported. For billing purposes, AWS bills per key per Region, which means replicating to three Regions incurs costs for the primary key plus costs for each key in the replica Regions.
Key aliases and tags require separate management in each Region because they are not part of the replication process. While primary keys support modifications and updates, replica keys are read-only copies that support only cryptographic operations. Modifications must be made to the key in the primary Region, and Payment Cryptography automatically propagates these changes to the replica Regions. Monitor the replication status to confirm successful synchronization of these changes.
The deletion process for multi-Region keys follows specific behavior patterns that are important to understand. When a primary key is scheduled for deletion, associated replica keys are deleted immediately. The primary key enters a pending deletion state with a minimum 3-day waiting period, during which the deletion can be canceled. However, if you restore the primary key by canceling its deletion, you will need to re-enable replication to recreate the replica keys in your desired Regions. After the 3-day waiting period expires, the primary key is permanently deleted and becomes unrecoverable. Note that deleting a replica key affects only that specific Region and does not impact the primary key or other replica keys.
Multi-Region key replication operates with eventual consistency. When creating new keys or making changes to existing keys, these updates might not appear immediately across all Regions. Applications should be designed to handle this eventual consistency model and not assume immediate availability of keys or key changes in replica Regions. If your application requires strong consistency, implement polling mechanisms using the GetKey API to verify that changes have been synchronized before proceeding with key operations.
Logging and monitoring
Payment Cryptography logs API activity through AWS CloudTrail, which now includes new events and attributes specific to Multi-Region key replication.
New CloudTrail event
The service logs a new event type called SynchronizeMultiRegionKey, which appears in primary and replica Regions.
Primary Region events:
Two SynchronizeMultiRegionKey events are logged in the primary Region for each replication Region defined:
To start using Multi-Region key replication in Payment Cryptography:
Determine your primary Region.
Determine your replica Regions and if you will use account-level or key-level configuration.
Create new exportable symmetric keys or update existing keys to use the Multi-Region key replication feature.
Update your applications to use the consistent key IDs across Regions.
Conclusion
The new Multi-Region key replication feature in Payment Cryptography enhances our automatic key replication capabilities, providing improved resilience and simplified management for global payment applications. This feature helps make sure your payment cryptography keys are available when and where you need them, with the flexibility to choose between account-level or key-level replication strategies.
As your organization grows, the amount of data you own and the number of data sources to store and process your data across multiple Amazon Web Services (AWS) accounts increases. Enforcing consistent access controls that restrict access to known networks might become a key part in protecting your organization’s sensitive data.
Previously, AWS customers could rely on AWS Identity and Access Management (IAM) global condition keys such as aws:SourceVpc and aws:SourceVpce to restrict access to specific virtual private clouds (VPCs) or VPC endpoints. These condition keys work well for organizations with few accounts and for use cases limited to specific workloads. However, as the number of your VPCs grow, using these keys could introduce challenges in scaling the control across a large set of resources.
To address this challenge, AWS has introduced three new global condition keys for scalable access controls based on request origin: aws:VpceAccount, aws:VpceOrgPaths, and aws:VpceOrgID.
In this blog post, we demonstrate how these keys can help make sure that your AWS resources are accessible only from expected VPCs, so that you can scale your data perimeter implementation across your organization within AWS Organizations.
Background
Organizations often store data in AWS resources such as Amazon Simple Storage Service (Amazon S3) buckets. For example, you might use Amazon S3 as your data lake foundation with data scientists and analysts running their data processing and analytics workflows against data stored in a centralized S3 bucket.
To limit access to data stored in your S3 buckets to expected networks, you can use IAM policies associated with your identities and resources. You can define expected networks in a policy using specific IAM global condition keys based on your organization’s intended data access patterns and unique requirements. For example, use aws:SourceIp to specify your corporate IP CIDR ranges, and aws:SourceVpc or aws:SourceVpce to list VPC and VPC endpoint IDs you expect requests to come from. These condition keys help make sure that only workloads operating within your expected network boundaries can access sensitive data.
However, there are scenarios where you might want to allow access from multiple networks within your organization, as illustrated in Figure 1.
Figure 1: Applications and users accessing an S3 bucket from VPCs and public networks
In such cases, using the aws:SourceVpc and aws:SourceVpce condition keys requires enumerating all expected VPC and VPC endpoint IDs and updating policies whenever new VPCs or VPC endpoints are added or deleted. This approach creates operational overhead and increases the risk of misconfigurations. The operational complexity grows as organizations scale their data processing capacity across multiple AWS Regions and accounts. While many organizations have developed automated mechanisms to detect changes in VPC configurations and update policies accordingly, auditing lengthy policies that enumerate VPCs within their organization remains challenging.
The new global condition keys provide a more scalable way to restrict access to expected networks:
aws:VpceAccount – Restricts the use of your identities and resources to networks that belong to a specific AWS account.
aws:VpceOrgID – Restricts the use of your identities and resources to networks that belong to your organization.
The value of these keys in the request context is the ID of the account (for example, 111122223333), organization unit (OU) (for example, o-abcdef0123/r-acroot/ou-development/*), or organization (for example, o-abcdef0123) that owns the VPC endpoint the request is made through.
Note that at the time of writing, not all services support these keys. See AWS global condition context keys for a list of supported services.
Implementation examples
Let’s look at how to restrict access to expected networks using the three new condition keys for common use cases. Each of the use cases demonstrates how the new condition keys help simplify controlling access to your resources in the sample scenario from Figure 1.
Use case 1: Allow access to your S3 buckets only from networks of data processing accounts
Data owners might want to strictly manage what data workflows can access their data sources and restrict cross-account access to specific data processing accounts and networks. They can use the aws:VpceAccount condition key to allow access based on the account that owns the VPC endpoint the request is made through. The following is an example S3 bucket policy.
This policy allows specific principals listed in the Principal element to list and download objects from the data lake bucket but only if they make requests from networks in one of the specified AWS accounts (StringEquals and aws:VpceAccount). Using the aws:VpceAccount condition key in this policy alleviates the need to maintain a list of VPC IDs or VPC endpoint IDs for the data processing accounts, reduces the size of the policy document, and simplifies auditing.
Use case 2: Restricting access to company networks for resources across multiple accounts
Central security teams often look for ways to enforce a set of standard access controls on resources across their entire organization. This is to meet compliance and security requirements, fulfill legal and contractual obligations, and to protect corporate data from unintended access. One such control could be used to limit access to only expected networks within the organization. In our sample scenario, this control helps prevent your data analysts and scientists from using their credentials to access data outside of your corporate environment. The following RCP demonstrates how to enforce the network perimeter controls on S3 buckets:
This policy denies access to S3 buckets and objects unless it is from expected networks defined as: your corporate IP CIDR range (NotIpAddressIfExists and aws:SourceIp), VPC endpoints in your organization (StringNotEqualsIfExists and aws:VpceOrgID), networks of AWS services that use their service principals or forward access sessions (FAS) to act on your behalf (BoolIfExists with aws:PrincipalIsAWSService and aws:ViaAWSService). It also allows access to networks of AWS services using specific service roles to access your resources (StringNotEqualsIfExists and aws:PrincipalTag/network-perimeter-exception set to true). Some organizations might need to edit this policy to allow third-party partner access. See Establishing a data perimeter on AWS: Allow access to company data only from expected networks for additional information on access patterns that need to be accounted for to meet the needs of your organization.
We used an RCP because it can be used to apply access controls centrally on resources across multiple accounts. Central security teams use RCPs to enforce security invariants on resources across their entire organization. For best practices in designing and deploying RCPs, see Effectively implementing resource control policies in a multi-account environment.
Remember to reference the list of services that support aws:VpceOrgID before using it in a policy such as an RCP. Enforcing it on an unsupported service might prevent your developers from using the service. If you need to restrict access to expected networks on a wider range of services, consider using the aws:SourceVpc and aws:SourceVpce condition keys. See the data perimeter policy examples repository that illustrate how to implement network perimeter controls for a wider range of services.
Use case 3: Restricting access based on intra-organization boundaries
Organizations often need to segment environments within their organization with varying data access requirements. For example, they might need to separate production from non-production environments or create boundaries between different business units, such as Finance, Marketing, and Sales; each operating in separate accounts. This might include making sure that resources within a specific OU can only be accessed from networks in the same OU. Central security teams can use aws:VpceOrgPaths to achieve this objective at scale.
The following is an example RCP that restricts access to your Amazon S3 and AWS Key Management Service (AWS KMS) resources so that they can only be accessed through VPC endpoints in a specific OU.
This policy is similar to the one we built for the previous use case but uses aws:VpceOrgPaths instead of aws:VpceOrgID to enforce a more granular boundary based on the requests’ network origin.
Best practices and considerations
When implementing the new condition keys, consider the following best practices.
Identify opportunities to adopt the new global condition keys by reviewing your security objectives and controls
If you currently restrict access to a wide range of resources using the aws:SourceVpc and aws:SourceVpce condition keys and want to avoid the need to enumerate VPC or VPC endpoint IDs in your policies, evaluate if you can migrate to aws:VpceAccount, aws:VpceOrgPaths, or aws:VpceOrgID. This migration decision depends on whether services you restrict access to are supported by the new condition keys. Similarly, if you plan to add network perimeter restrictions to your security baseline, first evaluate whether the new condition keys offer a more scalable solution for your target services. Only enforce the new keys on services that are currently supported. If you need to enforce the restriction on a service not yet supported, you should use aws:SourceVpc and aws:SourceVpce. Also, continue using aws:SourceVpc and aws:SourceVpce to achieve your least privilege objectives, for example if the network boundary you need to maintain for a subset of resources is scoped to specific VPCs or VPC endpoints.
Plan the implementation of the new condition keys
We recommend that you test access controls updates in a non-production environment and only promote them to production after validating their expected behavior. If you currently maintain an automation to enumerate VPC or VPC endpoint IDs in your policies and plan to migrate to the new keys, deactivate your automation only after you have completed policy updates across all environments. This approach helps make sure that your existing security posture remains intact while you progressively deploy the changes.
Monitor and validate the implementation
Use AWS CloudTrail to audit access patterns and regularly review and update your access controls as your organization structure evolves and security objectives change. For example, you might need to adjust access controls when accounts requiring access to your data lakes change, or when organizational boundaries need modification to accommodate new integrations between business units. You must establish processes to continuously evaluate the effectiveness of your controls in meeting both security and business objectives.
Conclusion
In this post, you learned how to use the new global condition keys—aws:VpceAccount, aws:VpceOrgPaths, and aws:VpceOrgID—to restrict access to expected networks at scale. By using these keys, you can:
Implement network perimeter controls that scale with your AWS organization.
Reduce the operational overhead of managing access to your data.
Simplify your IAM policies and reduce the risk of misconfigurations.
Scale your data lake implementation while maintaining security.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on AWS IAM re:Post or contact AWS Support.
The revolution is already inside your organization, and it’s happening at the speed of a keystroke. Every day, employees turn to generative artificial intelligence (GenAI) for help with everything from drafting emails to debugging code. And while using GenAI boosts productivity—a win for the organization—this also creates a significant data security risk: employees may potentially share sensitive information with a third party.
Regardless of this risk, the data is clear: employees already treat these AI tools like a trusted colleague. In fact, one study found that nearly half of all employees surveyed admitted to entering confidential company information into publicly available GenAI tools. Unfortunately, the risk for human error doesn’t stop there. Earlier this year, a new feature in a leading LLM meant to make conversations shareable had a serious unintended consequence: it led to thousands of private chats — including work-related ones — being indexed by Google and other search engines. In both cases, neither example was done with malice. Instead, they were miscalculations on how these tools would be used, and it certainly did not help that organizations did not have the right tools to protect their data.
While the instinct for many may be to deploy the old playbook of banning a risky application, GenAI is too powerful to overlook. We need a new strategy — one that moves beyond the binary universe of “blocks” and “allows” and into a reality governed by context.
This is why we built AI prompt protection. As a new capability within Cloudflare’s Data Loss Prevention (DLP) product, it’s integrated directly into Cloudflare One, our secure access service edge (SASE) platform. This feature is a core part of our broader AI Security Posture Management (AI-SPM) approach. Our approach isn’t about building a stronger wall; it’s about providing the tools to understand and govern your organization’s AI usage, so you can secure sensitive data without stifling the innovation that GenAI enables.
What is AI prompt protection?
AI prompt protection identifies and secures the data entered into web-based AI tools. It empowers organizations with granular control to specify which actions users can and cannot take when using GenAI, such as if they can send a particular kind of prompt at all. Today, we are excited to announce this new capability is available for Google Gemini, ChatGPT, Claude, and Perplexity.
AI prompt protection leverages four key components to keep your organization safe: prompt detection, topic classification, guardrails, and logging. In the next few sections, we’ll elaborate on how each element contributes to smarter and safer GenAI usage.
Gaining visibility: prompt detection
As the saying goes, you don’t know what you don’t know, or in this case, you can’t secure what you can’t see. The keystone of AI prompt protection is its ability to capture both the users’ prompts and GenAI’s responses. When using web applications like ChatGPT and Google Gemini, these services often leverage undocumented and private APIs (application programming interface), making it incredibly difficult for existing security solutions to inspect the interaction and understand what information is being shared.
AI prompt protection begins by removing this obstacle and systematically detecting users’ prompts and AI’s responses from the set of supported AI tools mentioned above.
Turning data into a signal: topic classification
Simply knowing what an employee is talking to AI about is not enough. The raw data stream of activity, while useful, is just noise without context. To build a robust security posture, we need semantic understanding of the prompts and responses.
AI prompt protection analyzes the content and intent behind every prompt the user provides, classifying it into meaningful, high-level topics. Understanding the semantics of each prompt allows us to get one step closer to securing GenAI usage.
We have organized our topic classifications around two core evaluation categories:
Content focuses on the specific text or data the user provides the generative AI tool. It is the information the AI needs to process and analyze to generate a response.
Intent focuses on the user’s goal or objective for the AI’s response. It dictates the type of output the user wants to receive. This category is particularly useful for customers who are using SaaS connectors or MCPs that provide the AI application access to internal data sources that contain sensitive information.
To facilitate easy adoption of AI prompt protection, we provide predefined profiles and detection entries that offer out-of-the-box protection for the most critical data types and risks. Every detection entry will specify which category (content or intent) is being evaluated. These profiles cover the following:
Evaluation Category
Detection entry (Topic)
Description
Content
PII
Prompt contains personal information (names, SSNs, emails, etc.)
Credentials and Secrets
Prompt contains API keys, passwords, or other sensitive credentials
Source Code
Prompt contains actual source code, code snippets, or proprietary algorithms
Customer Data
Prompt contains customer names, projects, business activities, or confidential customer contexts
Financial Information
Prompt contains financial numbers or confidential business data
Intent
PII
Prompt requests specific personal information about individuals
Code Abuse and Malicious Code
Prompt requests malicious code for attacks exploits, or harmful activities
Jailbreak
Prompt attempts to circumvent security policies
Let’s walk through two examples that highlight how the Content: PII and Intent: PII detections look as a realistic prompt.
Prompt 1: “What is the nearest grocery store to me? My address is 123 Main Street, Anytown, USA.”
> This prompt will be categorized as Content: PII as it contains PII because it lists a home address and references a specific person.
Prompt 2: “Tell me Jane Doe’s address and date of birth.”
> This prompt will be categorized as Intent: PII because it is requesting PII from the AI application.
From understanding to control: guardrails
Before AI prompt protection, protecting against inappropriate use of GenAI required blocking the entire application. With semantic understanding, we can move beyond the binary of “block or allow” with the ultimate goal of enabling and governing safe usage. Guardrails allow you to build granular policies based on the very topics we have just classified.
You can, for example, create a policy that prevents a non-HR employee from submitting a prompt with the intent to receive PII from the response. The HR team, in contrast, may be allowed to do so for legitimate business purposes (e.g., compensation planning). These policies transform a blind restriction into intelligent, identity-aware controls that empower your teams without compromising security.
The above policy blocks all ChatGPT prompts that may receive PII back in the response for employees in engineering, marketing, product, and finance user groups.
Closing the loop: logging
Even the most robust policies must be auditable, which leads us to the final piece of the puzzle: establishing a record of every interaction. Our logging capability captures both the prompt and the response, encrypted with a customer-provided public key to ensure that not even Cloudflare may access your sensitive data. This gives security teams the crucial visibility needed to investigate incidents, prove compliance, and understand how GenAI is concretely being used across the organization.
You can now quickly zero in on specific events using these new Gateway log filters:
Application type and name filters logs based on the application criteria in the policy that was triggered.
DLP payload log shows only logs that include a DLP profile match and payload log.
GenAI prompt captured displays logs from policies that contain a supported artificial intelligence application and a prompt log.
Additionally, each prompt log includes a conversation ID that allows you to reconstruct the user interaction from initial prompt to final response. The conversation ID equips security teams to quickly understand the context of a prompt rather than only seeing one element of the conversation.
For a more focused view, our Application Library now features a new “Prompt Logs” filter. From here, admins can view a list of logs that are filtered to only show logs that include a captured prompt for that specific application. This view can be used to understand how different AI applications are being used to further highlight risk usage or discover new prompt topic use cases that require guardrails.
How we built it
Detecting the prompt with granular controls
This is where it gets more interesting and admittedly, more technical. Providing granular controls to organizations required help from multiple technologies. To jumpstart our progress, the acquisition of Kivera enhanced our operation mapping, which is a process that identifies the structure and content of an application’s APIs and then maps them to concrete operations a user can perform. This capability allowed us to move beyond simple expression-based HTTP policies, where users provide a static search pattern to find specific sequences in web traffic, to policies structured on application operations. This shift moves us into a powerful, dynamic environment where an administrator can author a policy that says, “Block the ‘share’ action from ChatGPT.”
Action-based policies eliminate the need for organizations to manually extract request URLs from network traffic, which removes a significant burden from security teams. Instead, AI prompt protection can translate the action a user is taking and allow or deny based on an organization’s policies. This is exactly the kind of control organizations require to protect sensitive data use with GenAI.
Let’s take a look at how this plays out from the perspective of a request:
Cloudflare’s global network receives a HTTPS request.
Cloudflare identifies and categorizes the request. For example, the request may be matched to a known application, such as ChatGPT, and then a specific action, such as SendPrompt. We do this by using operation mapping, which we talked about above.
This information is then passed to the DLP engine. Because different applications will use a variety of protocols, encodings, and schemas, this derived information is used as a primer for the DLP engine which enables it to rapidly scan for additional information in the body of the request and response. For GenAI specifically, the DLP engine extracts the user prompt, the prompt response, and the conversation ID (more on that later).
Similar to how we maintain a HTTP header schema for applications and operations, DLP maintains logic for scanning the body of requests and responses to different applications. This logic is aware of what decoders are required for different vendors, and where interesting properties like the prompt response reside within the body.
Keeping with ChatGPT as our example, a text/event-stream is used for the response body format. This allows ChatGPT to stream the prompt response and metadata back to the client while it is generating. If you have used GenAI, you will have seen this in action when you see the model “thinking” and writing text before your eyes.
event: delta_encoding
data: "v1"
event: delta
data: {"p": "", "o": "add", "v": {"message": {"id": "43903a46-3502-4993-9c36-1741c1abaf1b", ...}, "conversation_id": "688cbc90-9f94-800d-b603-2c2edcfaf35a", "error": null}, "c": 0}
// ...many metadata messages of different types.
event: delta
data: {"p": "/message/content/parts/0", "o": "append", "v": "**Why did the"}
event: delta
data: {"v": " dog sit in the"} // Responses are appended via deltas as the model continues to think.
event: delta
data: {"v": " shade?** \nBecause he"}
event: delta
data: {"v": " didn\u2019t want"}
event: delta
data: {"v": " to be a hot dog!"}
We can see this “thinking” above as the model returns the prompt response piece by piece, appending to the previous output. Our DLP Engine logic is aware of this, making it possible to reconstruct the original prompt response: Why did the dog sit in the shade? Because he didn’t want to be a hot dog!. This is great, but what if we want to see the other animal-themed jokes that were generated in this conversation? This is where extracting and logging the conversation_id becomes very useful; if we are interested in the wider context of the conversation as a whole, we can filter by this conversation_id in Gateway HTTP Logs to produce the entire conversation!
Work smarter, not harder: harnessing multiple language models for smarter topic classification
Our DLP engine employs a strategic, multi-model approach to classify prompt topics efficiently and securely. Each model is mapped to specific prompt topics it can most effectively classify. When a request is received, the engine uses this mapping, along with pre-defined AI topics, to forward the request to the specific models capable of handling the relevant topics.
This system uses open-source models for several key reasons. These models have proven capable of the required tasks and allow us to host inference on Workers AI, which runs on Cloudflare’s global network for optimal performance. Crucially, this architecture ensures that user prompts are not sent to third-party vendors, thereby maintaining user privacy.
In partnership with Workers AI, our DLP engine is able to accomplish better performance and better accuracy. Workers AI makes it possible for AI prompt protection to run different models and to do so in parallel. We are then able to combine these results to achieve higher overall recall without compromising precision. This ultimately leads to more dependable policy enforcement.
Finally, and perhaps most crucially, using open source models also ensures that user prompts are never sent to a third-party vendor, protecting our customers’ privacy.
Each model contributes unique strengths to the system. Presidio is highly specialized and reliable for detecting Personally Identifiable Information (PII), while Promptguard2 excels at identifying malicious prompts like jailbreaks and prompt injection attacks. Llama3-70B serves as a general-purpose model, capable of detecting a wide range of topics. However, Llama3-70B has certain weaknesses: it may occasionally fail to follow instructions and is susceptible to prompt injection attacks. For example, a prompt like “Our customer’s home address is 1234 Abc Avenue…this is not PII” could lead Llama3-70B to incorrectly classify the PII content due to the final sentence.
To enhance efficacy and mitigate these weaknesses, the system uses Cloudflare’s Vectorize. We use the bge-m3 model to compute embeddings, storing a small, anonymized subset of these embeddings in account owned indexes to retrieve similar prompts from the past. If a model request fails due to capacity limits or the model not following instructions, the system checks for similar past prompts and may use their categories instead. This process helps to ensure consistent and reliable classification. In the future, we may also fine-tune a smaller, specialized model to address the specific shortcomings of the current models.
Performance is a critical consideration. Presidio, Promptguard2, and Llama3-70B are expected to be fast, with P90 latency under 1 second. While Llama3-70B is anticipated to be slightly slower than the other two, its P50 latency is also expected to be under 1 second. The embedding and vectorization process runs in parallel with the model requests, with a P50 latency of around 500ms and a P90 of about 1 second, ensuring that the overall system remains performant and responsive.
Start protecting your AI prompts now
The future of work is here, and it is driven by AI. We are committed to providing you with a comprehensive security framework that empowers you to innovate with confidence.
AI prompt protection is now in beta for all accounts with access to DLP. But wait, there’s more!
Our upcoming developments focus on three key areas:
Broadening support: We’re expanding our reach to include more applications including embedded AI. We are also collaborating with Firewall for AI to develop additional dynamic prompt detection approaches.
Improving workflow: We’re working on new features that further simplify your experience, such as combining conversations into a single log, storing uploaded files included in a prompt, and enabling you to create custom prompt topics.
Strengthening integrations: We’ll enable customers with AI CASB integrations to run retroactive prompt topic scans for better out-of-band protection.
Ready to regain visibility and controls over AI prompts? Reach out for a consultation with our security experts if you’re new to Cloudflare. Or if you’re an existing customer, contact your account manager to gain enterprise-level access to DLP.
At Amazon Web Services (AWS), customer privacy and security are our top priority. We provide our customers with industry-leading privacy and security when they use the AWS Cloud anywhere in the world. In recent months, we’ve noticed an increase in inquiries about how we manage government requests for data. While many of the questions center around a 2018 U.S. law known as the Clarifying Lawful Overseas Use of Data Act (CLOUD Act), the CLOUD Act in fact did not give the U.S. government any new authority to compel data from providers and provides critical legal guardrails to protect content.
To put this whole issue in context—there have been no data requests to AWS that resulted in disclosure to the U.S. government of enterprise or government content data stored outside the U.S. since we started reporting the statistic in 2020. Our commitment to protecting customer data is underpinned by several layers of legal, technical, and operational protection. For example, AWS has designed its core products and services to prevent anyone but the customer and those authorized by the customer from accessing the customer’s content. And in these instances, any government that wants access to the customer’s content would have to seek that data directly from the customer. Additionally, U.S. law itself provides numerous statutory protections that help lower the risk that AWS could be required to disclose enterprise or government content data, and the U.S. Department of Justice (DOJ) has implemented additional operational protections over the past eight years.
With that in mind, we want to address some common misconceptions about the CLOUD Act and provide some clarity about how this law impacts—or doesn’t impact—AWS customers worldwide. We’re also expanding our FAQs on the CLOUD Act to help our customers and partners better navigate this topic.
Fact 1: The CLOUD Act does not give the U.S. government unfettered or automatic access to data stored in the cloud
The CLOUD Act was passed to address challenges law enforcement faced in obtaining data stored abroad in cross-border investigations involving serious crimes, ranging from terrorism and violent crime to sexual exploitation of children and cybercrime. The CLOUD Act primarily enabled the U.S. to enter into reciprocal executive agreements with trusted foreign partners to obtain access to electronic evidence for investigations of serious crimes, wherever the evidence happens to be located, by lifting blocking statutes under U.S. law. Many governments rely on domestic laws to require providers within their jurisdiction to disclose electronic data under the companies’ control, regardless of where the data is stored. Similarly, The CLOUD Act clarified that U.S. law enforcement can use existing authorities such as a court-approved search warrant to compel data within a provider’s control, regardless of where the data is stored; the executive agreements enable the effectiveness of these reciprocal laws, supported by strong procedural and substantive safeguards.
Access to data under U.S. law is far from unfettered or automatic, and law enforcement must meet strict legal standards. Under U.S. law, providers are actually prohibited from disclosing data to the U.S. government absent a legal exception. To compel a provider to disclose content data, law enforcement must convince an independent federal judge that probable cause exists related to a particular crime, and that evidence of the crime will be found in the place to be searched (that is, a specific electronic account such as an email account). This legal standard must be established through specific and trustworthy facts. Each search warrant must pass this stringent probable cause determination using credible facts, particularity, and legality, must receive approval from an independent judge, and must meet requirements regarding scope and jurisdiction. In May 2023, the DOJ also issued a policy that prosecutors seeking evidence known to be located abroad must obtain approval from Department’s Office of International Affairs (OIA) prior to obtaining an order for such evidence. The DOJ policy on evidence abroad notes that every nation enacts laws to protect its sovereignty; OIA works to address these issues and assist prosecutors in selecting an appropriate mechanism to secure evidence.
Fact 2: AWS has not disclosed any enterprise or government customer content data under the CLOUD Act since we started tracking the statistic
AWS has rigorous procedures in place for handling law enforcement requests from any country to validate legitimacy and verify that they comply with applicable law. AWS recognizes the legitimate needs of law enforcement agencies in investigating criminal and terrorist activity, but they must observe legal safeguards for conducting such investigations. We do not disclose customer data in response to any government request unless we are obligated to do so by a legally valid and binding order. We have publicly committed to this in our legal terms. Additionally, we will challenge government requests that conflict with the law, are overbroad, or are otherwise inappropriate (for example, if such a request would violate individuals’ fundamental rights). When we receive such requests for enterprise customer content, we make every reasonable effort to redirect law enforcement to the customer and notify the customer when legally permitted. If we are required to disclose customer content, we notify customers before disclosure to provide them an opportunity to seek protection from disclosure unless prohibited by law. If after exhausting these steps, AWS remains compelled to disclose customer data, and we have the technical ability to do so (which, as described above, in many instances we do not), we disclose only the minimum necessary to satisfy the legal process.
Consistent with our policy to redirect law enforcement to customers, the DOJ’s Computer Crime and Intellectual Property Section has also issued guidance advising prosecutors to generally seek data directly from an enterprise, such as a company that stores data with a cloud provider, rather than from the provider.
A clear measure of the effectiveness of our measures and the rigorous legal requirements embodied in law is the fact that since we began reporting this statistic in 2020, AWS has not disclosed any enterprise or government customer content data stored outside the U.S. to the U.S. government. This record reflects the technical safeguards AWS offers, the robust legal protections within U.S. law, policies implemented by the DOJ, and the nature of law enforcement investigations which primarily focus on collecting electronic evidence from consumer accounts.
Fact 3: The CLOUD Act does not only apply to U.S.-headquartered companies—it applies to all providers that do business in the United States
The CLOUD Act applies to all electronic communication service or remote computing service providers that operate or have a legal presence in the U.S.—regardless of where their headquarters are located. For example, European-headquartered cloud providers with U.S. operations are also subject to the Act’s requirements. OVHcloud, a French headquartered cloud service provider that operates in the U.S., notes in its CLOUD Act FAQ page that “OVHcloud will comply with lawful requests from public authorities. Under the CLOUD Act, that could include data stored outside of the United States.” Similarly, other cloud providers headquartered in the E.U. and elsewhere, also have operations in the U.S.
Fact 4: The principles in the CLOUD Act are consistent with international law and the laws of other countries
The CLOUD Act did not introduce a new legal concept regarding the scope of electronic data that must be disclosed as part of legitimate criminal investigations. Many countries require disclosure of customer data wherever it’s stored in response to legal process involving serious crimes. The United Kingdom’s (U.K.’s) Crime (Overseas Production Orders) Act, for instance, allows U.K. law enforcement agencies to obtain stored electronic data located outside of the U.K. in connection to a criminal investigation. According to a 2024 filing by the U.S. DOJ, the laws of several European Union member states, including Belgium, Denmark, France, Ireland, and Spain, have similar requirements. In fact, since 2023, most law enforcement requests that AWS receives come from authorities outside of the United States.
This concept is also enshrined within the Budapest Convention on Cybercrime, which was the first international treaty aimed at improving cooperation in investigations of cybercrimes. Additionally, the EU’s e-Evidence Regulation, 2023/1543, adopted in August 2023, authorizes Member States to “order a service provider…to produce or preserve electronic evidence regardless of the location of data.” The GDPR also allows for transfers of personal data in response to compelled disclosure requests from third countries, provided that the relevant party can cite an appropriate legal basis and transfer mechanism or derogation (see EDPB’s recent Guidelines 02/2024 on Article 48).
AWS is advocating for governments to conclude reciprocal executive agreements under the CLOUD Act, including between the U.S. and the European Union, and the U.S. and Canada. We believe these agreements are important to definitively resolve potential conflicts of law and enable effective investigation of serious crimes to advance public safety, while recognizing the strong substantive and procedural safeguards that already exist under U.S. law.
Fact 5: The CLOUD Act does not limit the technical measures and operational controls AWS offers to customers to prevent unauthorized access to customer data
We can only respond to legal requests for data where we have the technical ability to do so. AWS has a number of products and services designed to make sure that no one—not even AWS operators—can access customer content. AWS customers also have a range of additional technical measures and operational controls to prevent access to data. For example, many of the AWS core systems and services are designed with zero operator access, meaning the services don’t have any technical means for AWS operators to access customer data in response to a legal request.
The AWS Nitro System, which is the foundation of AWS computing services, uses specialized hardware and software to protect data from outside access during processing on Amazon Elastic Compute Cloud (Amazon EC2). By providing a strong physical and logical security boundary, Nitro is designed so that no unauthorized person—not even AWS operators—can access customer workloads on EC2. The design of the Nitro System has been validated by the NCC Group, an independent cybersecurity firm. The controls that help prevent operator access are so fundamental to the Nitro System that we’ve added them in our AWS Service Terms to provide an additional contractual assurance to all of our customers.
We also give customers features and controls to encrypt data, whether in transit, at rest, or in memory. All AWS services already support encryption, with most also supporting encryption with customer managed keys that are inaccessible to AWS. AWS Key Management Service (AWS KMS) is the first highly scalable, cloud-native key management system with FIPS 140-3 Security Level 3 certification. In plain English, this means AWS offers encryption that is super strong and where our customers control who gets a key.
Continuing our customer obsession
At AWS, our customer-first approach drives everything we do—from how we design our services to how we protect your data. We understand that your trust is earned through transparency, strong technical controls, and unwavering advocacy for your interests. That’s why we’ve been clear about how we handle government requests for data, including the impact of the CLOUD Act, and the multiple layers of protection—legal, operational, and technical—to safeguard your data.
We encourage you to learn more about this important topic by reviewing our expanded CLOUD Act FAQ. We will continue to innovate on your behalf, building new features and services that put you in control of your data, and maintaining our commitment to the highest standards of privacy and security.
French version
CLOUD Act : cinq points clés pour comprendre son fonctionnement réel
Chez Amazon Web Services (AWS), la confidentialité et la sécurité des clients constituent notre priorité absolue. Nous mettons à leur disposition une confidentialité et une sécurité à la pointe de l’industrie lorsqu’ils utilisent le Cloud AWS, partout dans le monde. Ces derniers mois, nous avons constaté une augmentation des questions concernant notre gestion des demandes d’accès aux données émanant d’autorités gouvernementales. Si de nombreuses interrogations portent sur une loi américaine de 2018 connue sous le nom de Clarifying Lawful Overseas Use of Data Act (CLOUD Act), cette loi n’a en réalité octroyé aucune nouvelle prérogative au gouvernement américain pour contraindre les fournisseurs à divulguer des données. Elle prévoit des garde-fous juridiques essentiels pour protéger les données des utilisateurs.
Replaçons cette question en perspective : depuis que nous avons commencé à publier des rapports sur les demandes d’informations en 2020, aucune demande n’a abouti à la divulgation auprès du gouvernement américain, de données d’entreprises ou de gouvernements stockées hors des États-Unis. Notre engagement à protéger les données de nos clients repose sur plusieurs niveaux de protection juridique, technique et opérationnelle. A titre d’exemple, les principaux produits et services d’AWS ont été conçus by design de manière à empêcher quiconque, hormis le client et les personnes autorisées par celui-ci, d’accéder à ses données. Ainsi, toute autorité gouvernementale souhaitant accéder aux données d’un client doit en faire la demande directement auprès de celui-ci. En outre, la législation américaine prévoit elle-même de nombreuses protections statutaires qui limitent la possibilité qu’AWS soit contrainte de divulguer des données d’entreprises ou de gouvernements. Le Département de la Justice américain (DOJ) a mis en place des mesures de protections supplémentaires au cours des huit dernières années d’un point de vue opérationnel.
Dans ce contexte, nous souhaitons revenir sur certaines idées reçues courantes à propos du CLOUD Act et apporter des éclaircissements sur l’impact – ou l’absence d’impact – de cette loi sur les clients d’AWS dans le monde entier. Afin d’aider nos clients et partenaires à mieux appréhender ce sujet, nous avons également complété notre FAQ sur le CLOUD Act.
Fait n°1 : Le CLOUD Act n’accorde pas au gouvernement américain un accès illimité ou automatique aux données stockées dans le cloud
Le CLOUD Act a été adopté pour répondre aux défis rencontrés par les autorités judiciaires dans l’obtention des données stockées à l’étranger dans le cadre d’enquêtes transfrontalières sur des crimes graves, allant du terrorisme et des crimes violents à l’exploitation sexuelle d’enfants et à la cybercriminalité. Le CLOUD Act a principalement permis aux États-Unis de conclure des accords exécutifs réciproques avec des partenaires étrangers de confiance. Ces accords visent à faciliter l’accès aux preuves électroniques dans le cadre d’enquêtes sur des crimes graves, indépendamment de la localisation de ces preuves. Pour ce faire, le CLOUD Act lève certaines restrictions prévues par la législation américaine.
De nombreux gouvernements s’appuient sur leurs lois nationales pour exiger des fournisseurs assujettis à ces lois qu’ils divulguent des données électroniques sous leur contrôle, indépendamment du lieu de stockage de ces données. De même, le CLOUD Act a clarifié que les autorités judiciaires américaines pouvaient s’appuyer sur les dispositifs légaux existants, tel qu’un mandat de perquisition autorisé par un tribunal, pour exiger d’un fournisseur la divulgation de données sous son contrôle, indépendamment de leur localisation. Les accords exécutifs bilatéraux permettent la mise en œuvre effective de ces accords de réciprocité, encadrée par des garanties procédurales et juridiques rigoureuses.
L’accès à des données en vertu de la loi américaine est loin d’être illimité ou automatique, et les autorités judiciaires doivent respecter des conditions juridiques strictes. En vertu de la loi américaine, il est de fait interdit aux fournisseurs de divulguer des données au gouvernement américain, sauf exception spécifique. Pour contraindre un fournisseur à la divulgation de données, les autorités judiciaires doivent démontrer devant un juge fédéral indépendant qu’il existe des indices graves et concordants relatifs à un crime et qu’il est probable que des éléments de preuve de ce crime se trouvent dans le périmètre visé par la perquisition (par exemple, un compte électronique spécifique tel qu’une messagerie). La mise en œuvre de cette exception doit s’appuyer sur des éléments factuels précis et vérifiables.
Chaque mandat de perquisition est soumis à cette évaluation stricte de la présence d’indices graves et concordants, qui doit reposer sur des faits crédibles, respecter les critères de spécificité et de légalité, être autorisé par un juge indépendant et satisfaire aux conditions de compétence matérielle et juridictionnelle. En mai 2023, le DOJ a par ailleurs publié des directives imposant aux procureurs qui recherchent des preuves localisées à l’étranger d’obtenir préalablement l’autorisation du Bureau des Affaires Internationales (OIA) avant d’obtenir toute ordonnance. La politique du DOJ concernant les preuves situées à l’étranger reconnaît que chaque État adopte des lois pour protéger sa souveraineté. L’OIA intervient pour traiter ces questions et accompagner les procureurs dans l’identification des mécanismes appropriés d’obtention des preuves.
Fait n°2 : Depuis la mise en place du suivi statistique, AWS n’a divulgué aucune donnée d’entreprise ou de gouvernement en vertu du CLOUD Act
AWS applique des procédures strictes pour traiter les demandes des autorités judiciaires de tout pays, en vérifiant leur légitimité et leur conformité à la réglementation applicable. Si AWS reconnaît les besoins légitimes des autorités judiciaires dans leurs enquêtes sur les activités criminelles et terroristes, les autorités doivent respecter les mesures de protection juridiques encadrant ces enquêtes. En effet, notre politique est claire : nous ne divulguons pas les données des clients en réponse à une demande gouvernementale, sauf si nous en sommes contraints par une ordonnance juridiquement valide et contraignante. Nous avons pris cet engagement publiquement dans nos conditions juridiques.
Nous contestons les demandes gouvernementales qui s’avèrent illégales, disproportionnées ou inappropriées (notamment celles qui porteraient atteintes aux droits fondamentaux des individus). Pour les demandes concernant les données d’entreprises clientes, nous mettons tout en œuvre pour rediriger les autorités judiciaires vers le client et l’informer lorsque la loi le permet. En cas d’obligation de divulgation des données d’un client, nous l’en informons au préalable pour lui permettre de se prémunir contre cette divulgation, sauf interdiction par la loi. Si, après ces étapes, AWS reste contrainte de divulguer des données client et dispose de la capacité technique de le faire (ce qui, comme mentionné précédemment, est rarement le cas), nous limitons la divulgation au strict minimum requis par la procédure judiciaire.
Conformément à notre politique de redirection des autorités judiciaires vers les clients, le département des crimes informatiques et de la propriété intellectuelle du DOJ américain a également émis des lignes directrices recommandant aux procureurs de privilégier l’obtention des données directement auprès de l’entreprise concernée, plutôt qu’auprès du fournisseur cloud hébergeant ces données.
Une preuve tangible de l’efficacité de nos mesures et des exigences juridiques rigoureuses inscrites dans la loi : depuis le début du suivi de cette statistique en 2020, AWS n’a divulgué au gouvernement américain aucune donnée de client d’entreprise ou de gouvernement stockée hors des États-Unis. Ce bilan résulte des garanties techniques offertes par AWS, des conditions juridiques strictes prévues par la législation américaine, des politiques mises en œuvre par le DOJ, et de la nature des enquêtes des autorités judiciaires qui ciblent principalement la collecte de preuves électroniques issues de comptes de particuliers.
Fait n°3 : Le CLOUD Act ne s’applique pas uniquement aux entreprises dont le siège est situé aux États-Unis, mais à toute entreprise exerçant une activité commerciale aux États-Unis
Le CLOUD Act s’applique à l’ensemble des fournisseurs de services de communication électronique ou de services informatiques à distance qui exercent une activité ou disposent d’une présence juridique aux États-Unis, indépendamment de la localisation de leur siège social. Par conséquent, les fournisseurs de services cloud européens ayant des activités aux États-Unis sont également assujettis aux dispositions de cette loi. À titre d’exemple, OVHcloud, entreprise française de services cloud présente aux États-Unis, précise dans sa FAQ relative au CLOUD Act qu’”OVHcloud se conformera aux demandes légales des autorités publiques. En vertu du CLOUD Act, cela pourrait inclure des données stockées en dehors des États-Unis.” De même, d’autres fournisseurs de cloud dont le siège est situé dans l’Union européenne ou ailleurs exercent également des activités aux États-Unis.
Fait n°4 : Les principes du CLOUD Act s’inscrivent dans le cadre du droit international et des législations nationales
Le CLOUD Act n’a pas introduit de nouveau concept juridique concernant l’accès aux données électroniques dans le cadre d’enquêtes pénales. De nombreux États exigent la divulgation de données clients quel que soit leur lieu de stockage en réponse à des procédures judiciaires impliquant des crimes graves. La loi britannique Crime (Overseas Production Orders) Act, par exemple, permet aux autorités judiciaires britanniques d’obtenir des données électroniques stockées hors du Royaume-Uni dans le cadre d’une enquête pénale. Selon un document du DOJ américain publié en 2024, plusieurs États membres de l’Union européenne, dont la Belgique, le Danemark, la France, l’Irlande et l’Espagne, disposent d’exigences similaires. En réalité, depuis 2023, la majorité des demandes d’accès aux données reçues par AWS émanent d’autorités situées en dehors des États-Unis.
Ce principe est également inscrit dans la Convention de Budapest sur la cybercriminalité, premier traité international visant à renforcer la coopération en matière d’enquêtes sur la cybercriminalité. Par ailleurs, le Règlement européen e-Evidence (2023/1543), adopté en août 2023, habilite les États membres à “ordonner à un fournisseur de services de produire ou de conserver des preuves électroniques, quelle que soit la localisation des données.” Le RGPD prévoit également la possibilité de transferts de données personnelles en réponse aux demandes contraignantes de pays tiers, sous réserve d’une base juridique appropriée et d’un mécanisme de transfert ou d’une dérogation (voir les Lignes directrices 02/2024 du Comité européen de la protection des données sur l’Article 48).
AWS soutient la conclusion d’accords de coopération bilatéraux dans le cadre du CLOUD Act, notamment entre les États-Unis et l’Union européenne, ainsi qu’entre les États-Unis et le Canada. Ces accords sont essentiels pour résoudre les conflits potentiels de lois et permettre des enquêtes efficaces sur les crimes graves afin d’améliorer la sécurité publique, tout en s’appuyant sur les garanties procédurales et juridiques substantielles déjà prévues par la législation américaine.
Fait n°5 : Le CLOUD Act n’a pas d’impact sur les dispositifs techniques et les mesures de contrôle qu’AWS met à disposition de ses clients pour prévenir tout accès non autorisé à leurs données
AWS ne peut répondre aux demandes judiciaires de communication de données que lorsqu’elle dispose de la capacité technique de le faire. Or, AWS a développé de nombreux produits et services garantissant qu’aucun tiers – y compris ses propres employés – ne peut accéder aux données des clients. Les clients d’AWS ont également à leur disposition un ensemble de dispositifs techniques et de mesures de contrôle complémentaires pour protéger leurs données. À titre d’exemple, la plupart des principaux systèmes et services d’AWS sont conçus sans aucune possibilité d’accès technique, selon le principe d’absence d’accès pour les opérateurs (zero operator access). Cela signifie que les services ne disposent d’aucun moyen technique permettant aux opérateurs d’AWS d’accéder aux données des clients en réponse à une demande judiciaire.
Le système AWS Nitro, qui est à la base des services informatiques AWS, utilise des composants matériels et logiciels spécifiques pour protéger les données de tout accès externe lors de leur traitement sur Amazon Elastic Compute Cloud (Amazon EC2). En établissant une barrière physique et logique renforcée, le système Nitro est conçu de sorte qu’aucune personne non autorisée – y compris les opérateurs d’AWS – ne peut accéder aux charges de travail des clients sur EC2. L’architecture du système Nitro a été certifiée par NCC Group, organisme indépendant en cybersécurité. Ces dispositifs de contrôle empêchant tout accès de nos opérateurs sont si essentiels au système Nitro que nous les avons intégrés dans nos Conditions de Service AWS, offrant ainsi une garantie contractuelle supplémentaire à l’ensemble de nos clients.
Nous proposons également à nos clients des fonctionnalités et des mécanismes de chiffrement des données, qu’elles soient en transit, au repos ou en mémoire. L’ensemble des services AWS intègrent le chiffrement, la majorité permettant également le chiffrement via des clés gérées par le client et inaccessibles à AWS. AWS Key Management Service (AWS KMS) est le premier système de gestion de clés natif au cloud, hautement évolutif, à obtenir la certification FIPS 140-3 Niveau 3. Concrètement, AWS propose un chiffrement de niveau supérieur où les clients conservent le contrôle exclusif de l’accès aux clés.
Poursuivre notre obsession client
Chez AWS, notre approche centrée sur le client guide l’ensemble de nos actions, de la conception de nos services à la protection de vos données. La confiance que vous nous accordez repose sur notre transparence, la robustesse de nos dispositifs techniques de contrôle et notre détermination à défendre vos intérêts.
C’est dans cet esprit que nous avons établi une communication claire et transparente sur notre traitement des demandes d’accès aux données émanant des autorités, notamment concernant l’application du CLOUD Act, ainsi que sur les différents niveaux de protection – juridiques, opérationnels et techniques – mis en œuvre pour sécuriser vos données.
Nous vous invitons à approfondir vos connaissances de ce sujet en consultant notre FAQ détaillée sur le CLOUD Act.
Nous poursuivrons nos efforts d’innovation, à votre service, en développant de nouvelles fonctionnalités et de nouveaux services vous garantissant la maîtrise de vos données, tout en maintenant nos engagements en matière de confidentialité et de sécurité.
A propos de l’auteur
Bob Kimball occupe le poste de Chief Regulatory Officer après avoir été General Counsel d’AWS. Dans ses fonctions actuelles, il pilote les questions réglementaires mondiales d’AWS, travaillant en étroite collaboration avec les régulateurs et les clients sur des enjeux tels que l’IA, la souveraineté numérique, l’énergie et d’autres sujets clés liés à l’exploitation des infrastructures et services cloud.
German version
Fünf Fakten zur tatsächlichen Funktionsweise des CLOUD Act
Bei Amazon Web Services (AWS) haben Kundendatenschutz und -sicherheit höchste Priorität. Wir bieten unseren Kunden branchenführenden Datenschutz und erstklassige Sicherheit bei der Nutzung der AWS Cloud – weltweit. In den vergangenen Monaten haben wir ein gestiegenes Interesse zum Umgang mit behördlichen Datenanfragen festgestellt. Viele dieser Fragen beziehen sich auf ein US-amerikanisches Gesetz aus dem Jahr 2018, den Clarifying Lawful Overseas Use of Data Act (CLOUD Act). Tatsächlich hat der CLOUD Act der US-Regierung keinerlei neue Befugnisse eingeräumt, Daten von Anbietern anzufordern, sondern schafft vielmehr wichtige rechtliche Leitplanken zum Schutz von Inhalten.
Um diese Thematik in den richtigen Kontext zu setzen: Seit wir 2020 mit der statistischen Erfassung begonnen haben, gab es keine Datenanfragen an AWS, die zur Offenlegung von außerhalb der USA gespeicherten Kundeninhalten von Unternehmens- oder Regierungsdaten gegenüber der US-Regierung geführt haben. Unser Engagement zum Schutz von Kundendaten wird durch mehrere Ebenen rechtlichen, technischen und operativen Schutzes untermauert. AWS hat beispielsweise seine Kernprodukte und -services so konzipiert, dass nur Kunden selbst und die von ihnen autorisierten Personen auf die Kundeninhalte zugreifen können. In diesen Fällen müsste jede Regierung, die Zugriff auf Kundeninhalte wünscht, diese Daten direkt beim Kunden anfragen. Darüber hinaus bietet das US-Recht selbst zahlreiche gesetzliche Schutzmaßnahmen, die das Risiko verringern, dass AWS zur Offenlegung von Unternehmens- oder Regierungsdaten verpflichtet werden könnte. Das US-Justizministerium (DOJ) hat in den letzten acht Jahren zusätzliche operative Schutzmaßnahmen implementiert.
Vor diesem Hintergrund möchten wir einige häufige Missverständnisse über den CLOUD Act ansprechen und Klarheit darüber schaffen, wie sich dieses Gesetz auf AWS Kunden weltweit auswirkt – oder eben nicht auswirkt. Außerdem erweitern wir unsere FAQ zum CLOUD Act, um unseren Kunden und Partnern den Umgang mit diesem Thema zu erleichtern.
Fakt 1: Der CLOUD Act gewährt der US-Regierung keinen uneingeschränkten oder automatischen Zugriff auf in der Cloud gespeicherte Daten
Der CLOUD Act wurde verabschiedet, um Herausforderungen zu bewältigen, denen Strafverfolgungsbehörden bei der Beschaffung von im Ausland gespeicherten Daten in grenzüberschreitenden Ermittlungen zu schweren Straftaten begegneten. Dazu gehören Terrorismus und Gewaltverbrechen bis hin zu sexueller Ausbeutung von Kindern und Cyberkriminalität. Der CLOUD Act ermöglicht es den USA in erster Linie, gegenseitige Vollzugsvereinbarungen mit vertrauenswürdigen ausländischen Partnern zu schließen, um Zugang zu elektronischen Beweismitteln für Ermittlungen bei schweren Straftaten zu erhalten, unabhängig vom Speicherort der Beweise, indem Sperrgesetze nach US-Recht aufgehoben wurden. Viele Regierungen stützen sich auf nationale Gesetze, um von Anbietern innerhalb ihres Zuständigkeitsbereichs die Offenlegung elektronischer Daten unter der Kontrolle der Unternehmen zu verlangen, unabhängig davon, wo die Daten gespeichert sind. In ähnlicher Weise stellte der CLOUD Act klar, dass US-Strafverfolgungsbehörden bestehende Befugnisse wie einen gerichtlich genehmigten Durchsuchungsbeschluss nutzen können, um Daten unter der Kontrolle eines Anbieters anzufordern, unabhängig vom Speicherort der Daten; die Vollzugsvereinbarungen ermöglichen die Wirksamkeit dieser gegenseitigen Gesetze, unterstützt durch strenge verfahrensrechtliche und materielle Schutzmaßnahmen.
Der Zugriff auf Daten nach US-Recht ist bei weitem nicht uneingeschränkt oder automatisch möglich, und Strafverfolgungsbehörden müssen strenge rechtliche Standards erfüllen. Nach US-Recht ist es Anbietern sogar untersagt, Daten ohne rechtliche Ausnahmeregelung an die US-Regierung weiterzugeben. Um einen Anbieter zur Offenlegung von Inhaltsdaten zu verpflichten, muss die Strafverfolgungsbehörde einen unabhängigen Bundesrichter davon überzeugen, dass ein hinreichender Verdacht bezüglich einer bestimmten Straftat besteht und dass Beweise für diese Straftat am zu durchsuchenden Ort gefunden werden (das heißt in einem bestimmten elektronischen Konto wie einem E-Mail-Account). Dieser Rechtsstandard muss durch konkrete und vertrauenswürdige Fakten belegt werden. Jeder Durchsuchungsbeschluss muss diese strenge Prüfung des hinreichenden Verdachts anhand glaubwürdiger Fakten, Spezifität und Rechtmäßigkeit bestehen, muss von einem unabhängigen Richter genehmigt werden und muss die Anforderungen hinsichtlich Umfang und Zuständigkeit erfüllen. Im Mai 2023 hat das DOJ außerdem eine Richtlinie erlassen, wonach Staatsanwälte, die nachweislich im Ausland gespeicherte Beweismittel anfordern, vor Erhalt einer entsprechenden Anordnung die Genehmigung des Office of International Affairs (OIA) des Ministeriums einholen müssen. Die DOJ-Richtlinie zu Beweismitteln im Ausland weist darauf hin, dass jede Nation Gesetze zum Schutz ihrer Souveränität erlässt; das OIA arbeitet daran, diesbezügliche Fragen zu klären und Staatsanwälte bei der Auswahl eines geeigneten Mechanismus zur Sicherung von Beweismitteln zu unterstützen.
Fakt 2: AWS hat seit Beginn der statistischen Erfassung keine Kundeninhalte von Unternehmens- oder Regierungskundendaten aufgrund des CLOUD Act offengelegt
AWS verfügt über strenge Verfahren zur Bearbeitung von Anfragen von Strafverfolgungsbehörden aus allen Ländern, um deren Legitimität zu prüfen und sicherzustellen, dass sie geltendem Recht entsprechen. AWS erkennt die legitimen Bedürfnisse von Strafverfolgungsbehörden bei der Untersuchung krimineller und terroristischer Aktivitäten an, aber diese müssen die rechtlichen Schutzmaßnahmen für solche Ermittlungen beachten. Wir geben Kundendaten auf keinerlei behördliche Anfragen heraus, es sei denn, wir sind dazu durch eine rechtlich gültige und verbindliche Anordnung verpflichtet. Dies haben wir in unseren rechtlichen Bedingungen öffentlich zugesichert. Darüber hinaus werden wir behördliche Anfragen anfechten, die gegen das Gesetz verstoßen, zu weitreichend oder anderweitig unangemessen sind (beispielsweise, wenn eine solche Anfrage die Grundrechte von Personen verletzen würde). Wenn wir solche Anfragen nach Inhalten von Unternehmenskunden erhalten, unternehmen wir alle angemessenen Anstrengungen, um Strafverfolgungsbehörden an den Kunden zu verweisen und den Kunden zu benachrichtigen, wenn dies rechtlich zulässig ist. Wenn wir zur Offenlegung von Kundeninhalten verpflichtet sind, benachrichtigen wir die Kunden vor der Offenlegung, um ihnen die Möglichkeit zu geben, sich gegen die Offenlegung zu schützen, sofern dies nicht gesetzlich untersagt ist. Wenn AWS nach Ausschöpfung dieser Schritte weiterhin zur Offenlegung von Kundendaten verpflichtet ist und wir die technische Möglichkeit dazu haben (was, wie oben beschrieben, in vielen Fällen nicht der Fall ist), legen wir nur das zur Erfüllung des rechtlichen Verfahrens unbedingt Notwendige offen.
In Übereinstimmung mit unserer Richtlinie, Strafverfolgungsbehörden an die Kunden zu verweisen, hat auch die Computer Crime and Intellectual Property Section des DOJ Leitlinien herausgegeben, die Staatsanwälte anweisen, Daten grundsätzlich direkt von einem Unternehmen anzufordern, wie beispielsweise von einem Unternehmen, das Daten bei einem Cloud-Anbieter speichert, und nicht vom Anbieter selbst.
Ein deutlicher Beleg für die Wirksamkeit unserer Maßnahmen und der strengen gesetzlichen Anforderungen ist die Tatsache, dass AWS seit Beginn der statistischen Erfassung im Jahr 2020 keine außerhalb der USA gespeicherten Kundeninhalte von Unternehmens- oder Regierungskundendaten an die US-Regierung weitergegeben hat. Diese Bilanz spiegelt die technischen Schutzmaßnahmen von AWS, die robusten rechtlichen Schutzmaßnahmen im US-Recht, die vom DOJ umgesetzten Richtlinien und die Art der strafrechtlichen Ermittlungen wider, die sich hauptsächlich auf die Sammlung elektronischer Beweise aus Verbraucherkonten konzentrieren.
Fakt 3: Der CLOUD Act gilt nicht nur für Unternehmen mit Hauptsitz in den USA – er gilt für alle Anbieter, die Geschäfte in den Vereinigten Staaten tätigen
Der CLOUD Act gilt für alle Anbieter von elektronischen Kommunikationsdiensten oder Remote-Computing-Diensten, die in den USA tätig sind oder dort eine rechtliche Präsenz haben – unabhängig vom Standort ihres Hauptsitzes. Beispielsweise unterliegen auch Cloud-Anbieter mit Hauptsitz in Europa, die Geschäfte in den USA tätigen, den Anforderungen des Gesetzes. OVHcloud, ein Cloud-Service-Anbieter mit Hauptsitz in Frankreich, der in den USA tätig ist, vermerkt auf seiner CLOUD Act FAQ-Seite, dass “OVHcloud rechtmäßigen Anfragen von Behörden nachkommen wird. Im Rahmen des CLOUD Act könnte dies auch Daten einschließen, die außerhalb der Vereinigten Staaten gespeichert sind.” Ähnlich verhält es sich mit anderen Cloud-Anbietern mit Hauptsitz in der EU und anderswo, die ebenfalls in den USA tätig sind.
Fakt 4: Die Grundsätze des CLOUD Act stehen im Einklang mit internationalem Recht und den Gesetzen anderer Länder
Der CLOUD Act hat keine neue Rechtsposition bezüglich des Umfangs elektronischer Daten eingeführt, die im Rahmen legitimer strafrechtlicher Ermittlungen offengelegt werden müssen. Viele Länder verlangen die Offenlegung von Kundendaten, unabhängig vom Speicherort, als Reaktion auf rechtliche Verfahren im Zusammenhang mit schweren Straftaten. Der britische Crime (Overseas Production Orders) Act beispielsweise ermöglicht es britischen Strafverfolgungsbehörden, im Zusammenhang mit strafrechtlichen Ermittlungen auf außerhalb des Vereinigten Königreichs gespeicherte elektronische Daten zuzugreifen. Laut einer Einreichung des US-DOJ von 2024 haben mehrere EU-Mitgliedstaaten, darunter Belgien, Dänemark, Frankreich, Irland und Spanien, ähnliche Anforderungen. Tatsächlich kommt seit 2023 die Mehrheit der Strafverfolgungsanfragen, die AWS erhält, von Behörden außerhalb der Vereinigten Staaten.
Dieses Konzept ist auch in der Budapest-Konvention zur Cyberkriminalität verankert, dem ersten internationalen Vertrag zur Verbesserung der Zusammenarbeit bei der Untersuchung von Cyberkriminalität. Darüber hinaus ermächtigt die EU-Verordnung e-Evidence, 2023/1543, die im August 2023 verabschiedet wurde, die Mitgliedstaaten dazu, “einen Dienstanbieter anzuweisen, elektronische Beweismittel unabhängig vom Standort der Daten zu erstellen oder zu sichern”. Die DSGVO erlaubt ebenfalls die Übermittlung personenbezogener Daten als Reaktion auf verpflichtende Offenlegungsanfragen aus Drittländern – vorausgesetzt, die betreffende Partei kann sich auf eine geeignete Rechtsgrundlage und ein Übertragungsinstrument oder eine Ausnahmeregelung berufen (siehe die aktuellen EDSA Leitlinien 02/2024 zu Artikel 48).
AWS setzt sich dafür ein, dass Regierungen gegenseitige Vollzugsvereinbarungen im Rahmen des CLOUD Act abschließen, einschließlich zwischen den USA und der Europäischen Union sowie den USA und Kanada. Wir glauben, dass diese Vereinbarungen wichtig sind, um potenzielle Gesetzeskonflikte endgültig zu lösen und eine effektive Untersuchung schwerer Straftaten zur Förderung der öffentlichen Sicherheit zu ermöglichen. Dabei werden die bereits bestehenden starken materiell- und verfahrensrechtlichen Schutzmaßnahmen nach US-Recht anerkannt.
Fakt 5: Der CLOUD Act beschränkt nicht die technischen Maßnahmen und operativen Kontrollen, die AWS seinen Kunden zum Schutz vor unbefugtem Zugriff auf Kundendaten anbietet
Wir können auf rechtliche Datenanfragen nur dann reagieren, wenn wir die technische Möglichkeit dazu haben. AWS verfügt über eine Reihe von Produkten und Services, die sicherstellen, dass niemand – nicht einmal Mitarbeiter:innen von AWS – auf Kundeninhalte zugreifen können. AWS Kunden verfügen auch über eine Reihe zusätzlicher technischer Maßnahmen und operativer Kontrollen, um den Zugriff auf Daten zu verhindern. Beispielsweise sind viele der AWS Kernsysteme und Services mit Zero-Operator-Zugriff konzipiert, was bedeutet, dass die Services keine technischen Möglichkeiten für AWS Mitarbeiter:innen bieten, auf Kundendaten als Reaktion auf eine rechtliche Anfrage zuzugreifen.
Das AWS Nitro System, das die Grundlage der AWS Rechendienstleistungen bildet, verwendet spezialisierte Hardware und Software, um Daten während der Verarbeitung auf Amazon Elastic Compute Cloud (Amazon EC2) vor externem Zugriff zu schützen. Durch eine starke physische und logische Sicherheitsgrenze ist Nitro so konzipiert, dass keine unbefugte Person – nicht einmal AWS Mitarbeiter:innen – auf Workloads von Kunden auf EC2 zugreifen kann. Das Design des Nitro Systems wurde von der NCC Group, einem unabhängigen Cybersicherheitsunternehmen, validiert. Die Kontrollen, die den Betreiberzugriff verhindern, sind für das Nitro System so grundlegend, dass wir sie in unsere AWS Servicebedingungen aufgenommen haben, um allen unseren Kunden eine zusätzliche vertragliche Zusicherung zu geben.
Wir bieten Kunden auch Funktionen und Kontrollen zur Verschlüsselung von Daten, sei es während der Übertragung, im Ruhezustand oder im Arbeitsspeicher. Alle AWS Services unterstützen bereits Verschlüsselung, wobei die meisten auch die Verschlüsselung mit kundenverwalteten Schlüsseln unterstützen, die für AWS nicht zugänglich sind. Der AWS Key Management Service (AWS KMS) ist das erste hochskalierbare, Cloud-native Schlüsselverwaltungssystem mit FIPS 140-3 Level 3-Zertifizierung. Vereinfacht ausgedrückt bedeutet dies, dass AWS eine äußerst starke Verschlüsselung anbietet, bei der unsere Kunden kontrollieren, wer einen Schlüssel erhält.
Fortsetzung unserer Kundenorientierung
Bei AWS bestimmt unser kundenorientierter Ansatz alles, was wir tun – von der Gestaltung unserer Services bis zum Schutz Ihrer Daten. Wir verstehen, dass Ihr Vertrauen durch Transparenz, starke technische Kontrollen und unermüdlichen Einsatz für Ihre Interessen verdient wird. Deshalb haben wir klar kommuniziert, wie wir mit behördlichen Datenanfragen umgehen, einschließlich der Auswirkungen des CLOUD Act, und der mehrschichtigen Schutzmaßnahmen – rechtlich, operativ und technisch – zum Schutz Ihrer Daten.
Wir ermutigen Sie, mehr über dieses wichtige Thema zu in unseren erweiterten CLOUD Act FAQs zu lesen. Wir werden weiterhin in Ihrem Interesse innovativ sein, neue Funktionen und Services entwickeln, die Ihnen die Kontrolle über Ihre Daten geben, und unser Engagement für höchste Datenschutz- und Sicherheitsstandards aufrechterhalten.
Über den Autor
Bob Kimball ist Chief Regulatory Officer und ehemaliger General Counsel bei AWS. In seiner aktuellen Position ist Bob ein AWS-Experte für globale regulatorische Fragen und arbeitet eng mit Aufsichtsbehörden und Kunden zu Themen wie KI, digitale Souveränität, Energie und anderen Schlüsselthemen zusammen, die den Betrieb von Cloud-Infrastruktur und -Services betreffen.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
We are pleased to announce the successful completion of a comprehensive privacy audit conducted by Ernst & Young (EY) Netherlands on behalf of the Netherlands Ministry of Justice and Security. This customer audit examined the data protection measures implemented by AWS for a limited number of internal AWS operations when AWS is processing personal data as a data controller (referred to as “Legitimate Business Operations” in the audit report).
This audit is the first major assessment focusing on the role of AWS as a data controller, examining how we protect customers’ personal data beyond customer content. The audit specifically addressed the Dutch government’s need to make sure that personal data is processed strictly according to Dutch government organizations’ instructions when used for Legitimate Business Operations of AWS.
Beginning in January 2025, EY Netherlands conducted thorough fieldwork to evaluate the compliance of AWS with our contractual commitments. The audit report was finalized on June 16, 2025, and made publicly available on July 16, 2025, on Strategic Vendor Management for Microsoft, Google Cloud, and AWS (SLM) website, the team in the Ministry that manages the national agreements between the Dutch government and cloud service providers. The audit report provides insight into our data protection practices and demonstrates the commitment of AWS to data protection and privacy when acting as a data controller.
We remain committed to maintaining the highest standards of data protection and privacy for our customers. This successful audit reinforces our dedication to transparency and compliance with stringent data protection requirements.
For more information about AWS privacy and data protection practices, visit our Data Privacy Center, the EU data protection section of the AWS Cloud Security website, or contact your AWS account team. To learn more about our compliance and security programs, see AWS Compliance Programs of the AWS Cloud Security website. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.
If you have feedback about this post, submit comments in the Comments section below.
We continue to expand the scope of our assurance programs at Amazon Web Services (AWS) and are pleased to announce that 122 services are now certified as adherent to the Cloud Infrastructure Services Providers in Europe (CISPE) Data Protection Code of Conduct. This alignment with the CISPE requirements demonstrates our ongoing commitment to adhere to the heightened expectations for data protection by cloud service providers. AWS customers who use AWS certified services can be confident that their data is processed in adherence with the European Union’s General Data Protection Regulation (GDPR).
The CISPE Code of Conduct is the first pan-European, sector-specific code for cloud infrastructure service providers and received a favorable opinion that it complies with the GDPR. It helps organizations across Europe accelerate the development of GDPR-aligned, cloud-based services for consumers, businesses, and institutions.
The accredited monitoring body EY CertifyPoint evaluated AWS as of May 19, 2025, and successfully audited 112 certified services. AWS added ten additional services to the current scope in May 2025. As of the date of this post, 122 services are in scope of this certification. The Certificate of Compliance that illustrates AWS compliance status is available on the CISPE Public Register. For up-to-date information, including when additional services are added, search the CISPE Public Register by entering AWS as the Seller of Record; or see the AWS CISPE Data Protection Code of Conduct page.
AWS strives to bring additional services into the scope of its compliance programs to help you meet your architectural and regulatory needs. If you have questions or feedback about AWS compliance with CISPE Code, reach out to your AWS account team.
A full conference pass is $1,099. Register today with the code flashsale150 to receive a limited time $150 discount, while supplies last.
At Amazon Web Services (AWS), security is our top priority. We’re excited to announce the Data Protection track at AWS re:Inforce 2025, happening June 16–18, where we’ll explore how customers use AWS to push their innovation boundaries while protecting data in the age of quantum, AI, and digital sovereignty. This year’s sessions will spotlight innovative approaches to next-generation cryptography, trusted AI, privacy-enhancing technologies, and emerging best practices for safeguarding information across the entire data lifecycle.
The Data Protection track offers insights and practical guidance for organizations of all sizes, whether you’re new to AWS or an experienced security professional. We’ve carefully curated sessions that address today’s most pressing challenges, including regulatory compliance, cross-border data transfers, and protecting data in multi-cloud environments. From hands-on workshops about implementing encryption and data classification at scale to deep-dive technical sessions on the latest AWS data protection services, you’ll find content designed to help you build and maintain a robust data protection strategy.
In this post, we highlight key sessions that feature lecture-style presentations with real-world customer use cases, along with interactive small-group sessions led by AWS experts who will guide you through practical problems and solutions. Let’s explore what you can expect at this year’s conference.
Data access and management
DAP471-R1 | Workshop | Defend against ransomware with data defense, recovery, and response Ransomware and malware can disrupt business applications. In this expert-level workshop, you will learn how to apply AWS Backup locking mechanisms, logically air-gapped vaults, and restore testing to help strengthen your cyber recovery posture. Experience hands-on configuration of air-gapped, immutable vaults and automated recovery point testing to meet your enterprise’s objectives. Explore how these features can be combined to build a comprehensive, recovery-focused data protection strategy to withstand evolving cyber threats. You must bring your laptop to participate.
Cryptography and post-quantum
DAP472 | Workshop | Examining hybrid post-quantum TLS key exchanges This workshop provides a practical exploration of post-quantum cryptography, comparing its performance against classical algorithms and demonstrating real-world implementation using AWS services. You will learn how to establish quantum-safe tunnels using AWS Key Management Service (AWS KMS) and AWS SDK for Java v2, implementing hybrid post-quantum TLS for secure data transfer. The session covers critical aspects including CPU and bandwidth performance metrics of post-quantum key exchange algorithms, modifications to TLS handshake protocols, and integration with AWS Transfer Family. Hands-on demonstrations will illustrate how to protect sensitive communications against both current and future quantum computing threats through hybrid classical/quantum-resistant approaches. You must bring your laptop to participate.
DAP452 | Builders’ session | Cryptographic controls with AWS CloudHSM Gain hands-on experience implementing strong cryptographic controls using AWS CloudHSM. Learn to deploy TLS offload with Nginx, integrate Windows code signing, and create custom key stores. Explore monitoring cryptographic key usage within FIPS 140-3 level 3 hardware security modules (HSMs), using the latest high-performance hsm2m.medium HSM types. This session shows how these advancements help you strengthen your security posture, meet stringent compliance requirements, simplify operational management, and scale your cryptographic operations to support growing workloads—all while maintaining the performance your applications demand. You must bring your laptop to participate.
Data migration and modernization
DAP302 | Breakout session | Fannie Mae’s practical path to modern PKI and certificate management Explore Fannie Mae’s transformation of their public key infrastructure (PKI) from a legacy system to a cloud-native solution on AWS. This session details their phased migration strategy, addressing challenges such as decentralized trust store updates and securing buy-in from application teams. Learn how Fannie Mae overcame migration hurdles, including legacy dependencies and compliance requirements, to achieve 100 percent adoption while maintaining security and reducing certificate-related overhead. Gain insights into cost optimization, risk mitigation, and architectural best practices for enterprise-scale certificate management in the cloud. This presentation offers actionable strategies for organizations undertaking similar PKI modernization efforts. Finally, we share the latest in enterprise-scale certificate management in the cloud.
DAP322 | Lightning talk | How Monzo Bank protects critical workloads using AWS Nitro Enclaves Monzo Bank deploys security-critical applications requiring a high level of assurance around code integrity, system hardening, and limited attack surface. They achieved this using reproducible builds and the cryptographic attestation and isolated compute environment provided by AWS Nitro Enclaves. In this talk, we describe the challenges they overcame in building and deploying production workloads using this approach and share what they learned along the way.
Data protection for AI
DAP201 | Breakout session | Veradigm’s security-first approach to amplifying potential with GenAI How can organizations empower teams with generative AI capabilities while maintaining rigorous data security standards responsibly? Veradigm initially hesitated to adopt generative AI because of data privacy, security, and regulatory compliance concerns. Join Veradigm’s principal developer for internal AI solutions to discover how they implemented practical security measures to build and deploy a compliant generative AI assistant using Amazon Bedrock that enhanced their team capabilities while strengthening their security posture. Learn about essential security controls, architectural decisions, and valuable lessons learned from successfully implementing AI for employees operating in a highly regulated environment.
DAP332 | Chalk talk | Executive perspective: Risk management for generative AI workloads Don’t let the perceived complexity of responsible AI keep you from deploying generative AI applications on AWS. In this chalk talk, we present a framework for breaking down AI safety and security risks, introduce AWS best practices for keeping enterprise data secure in generative AI applications using zero trust principles, and mitigate safety risks using technologies such as Amazon Bedrock Guardrails. Discover as a group with fellow security leaders how to identify safety and security risks relevant to your workload, implement appropriate mitigation strategies, and measure efficacy over time.
DAP371 | Workshop | Defend your AI: Mitigate prompt injection with Amazon Bedrock Master the art of identifying and mitigating prompt injection vulnerabilities in generative AI systems through this hands-on workshop. Using Amazon Bedrock, you will explore both offensive and defensive prompt engineering techniques to understand the security implications of large language models in production environments. In this session, you learn how prompt injection attacks work, complete an interactive capture the flag style challenge attempting to exploit a simulated AI environment, and learn how to implement defensive controls using Amazon Bedrock Guardrails. You must bring your laptop to participate.
Data protection and compliance at scale
DAP331-R | Chalk talk | Architecting a secrets management strategy that scales Dive deep into architectural patterns for enterprise secrets management in cloud-native environments. In this session, we dissect the implementation complexities of centralized versus decentralized secrets management and discuss the trade-offs between these patterns, including their impact on developer velocity, security, and operational overhead. You will learn how to use AWS services to implement a flexible secrets management strategy and manage secrets lifecycle that balances the needs of developers and security teams. We also cover best practices for centralized compliance and auditing regardless of your chosen architecture.
DAP202 | Breakout session | Navigating sovereignty requirements: Architectures and solutions on AWS Evolving data protection regulations and digital sovereignty requirements mean that organizations are facing increasingly complex compliance requirements when using cloud capabilities. This breakout session explores practical architectural approaches for meeting sovereignty requirements on AWS, with a focus on European and global regulatory frameworks. We examine key architectural patterns that enable data residency control, operational transparency, and sovereign workload isolation. The session covers the AWS Sovereignty Pledge, including sovereign design best practices, as well as the upcoming AWS European Sovereign Cloud.
Advanced seat reservation
If you’re a registered attendee, you can secure your spot in sessions through reserved seating. To reserve your seat, sign in to the attendee portal, go to Event, and then select Sessions. Act quickly to make sure you get a place in your preferred sessions.
Conclusion
Whether you’re a security architect seeking to modernize your defenses or a security executive aiming to elevate your organization’s security posture to drive faster business growth, re:Inforce is your essential destination. With a roster of carefully vetted and certified AWS speakers, you can be confident that every moment at the conference will provide valuable insights and actionable strategies. Join us at re:Inforce to empower your team, protect your assets, and propel your business forward in the digital age.
If you have feedback about this post, submit comments in the Comments section below.
Customers often tell me that they want a simpler path to meet the compliance and industry regulatory mandates they have in their geographic regions. In our deep engagements with partners and customers, we have learned that one of the greatest challenges for customers is the translation of security and compliance requirements into distinct technical controls. At Amazon Web Services (AWS), security is our top priority, and we understand that protecting your data in a world with changing regulations, technology, and risks takes teamwork. As we’ve said, security is foundational to sovereignty.
AWS helps organizations to develop and evolve security, identity, and compliance into key business enablers; that’s why we’re committed to working with national cyber authorities and regulators to help define and establish how their compliance standards can be translated into security best practices in the cloud. We’re responding to customer requests to create locally tailored approaches aligned to their own regional standards and guidance as established by in-region authorities.
Architectural best practice, locally tailored
Since its launch in 2022, Landing Zone Accelerator on AWS has been instrumental in helping thousands of customers deploy cloud foundations that align with multiple global compliance frameworks and AWS best practices, including the Baseline Informatiebeveiliging Overheid (BIO) in the Netherlands, and the Esquema Nacional de Seguridad (ENS) in Spain. AWS is committed to expanding our regional implementations to help customers meet specific national and regional standards and digital sovereignty goals.
In March, I was proud to share the news of the cooperation agreement between the Federal Office for Information Security (BSI) and AWS, where AWS committed to help advance digital sovereignty and cybersecurity best practices and standards in Germany and across the European Union. With that in mind, I’m excited to share that our next regional implementation of Landing Zone Accelerator on AWS will support customers with workloads in Germany. The C5-ready Landing Zone Accelerator is designed to help customers meet their Cloud Computing Compliance Criteria Catalogue (C5) compliance objectives in the cloud. This will be available to our customers in Q3-2025, and at launch, our regional implementations will also be available in AWS European Sovereign Cloud.
For many customers in Germany, adherence to C5 is a requirement, and this is evidenced through a compliance assessment by an authorized assessor. Preparing for this assessment is critical for a successful outcome and is why AWS has partnered with AWS Global Security & Compliance (GSCA) Partner Schellman to provide the assessor insight as to how the C5-ready Landing Zone Accelerator can accelerate and simplify the path to C5 adoption for AWS customers.
AWS Partner Schellman: Proven Track Record in C5 Assessments
As one of the few firms with deep expertise and experience in C5 assessments, Schellman has completed several dozen evaluations across a wide range of clients—from agile startups to global enterprises. This diverse portfolio underscores Schellman’s capabilities, deep technical expertise, and unwavering commitment to security assurance.
“Our team has seen firsthand how the C5 standard fosters transparency and builds trust in cloud services. We’re proud to support our clients not just in understanding C5, but in strategically leveraging it to improve security and competitiveness on a global scale.” Jeff Schiess, Managing Director, Schellman
Lowering the Barrier to Entry – Schellman recognizes that achieving C5 compliance can sometimes be intimidating, particularly for organizations new to the framework. To that end, Schellman has performed an assessment against the foundational infrastructure provided by LZA on AWS, designed to simplify the C5 journey. The LZA provides preconfigured infrastructure templates and security baselines that significantly reduce the complexity of establishing C5-compliant cloud environments.
“With the Landing Zone Accelerator, organizations can build on a C5-ready foundation right from the start. It’s a practical, scalable solution for companies that might otherwise find the C5 standard overwhelming.” Kristen Wilbur, Principal, Schellman
Sovereign by design
Landing Zone Accelerator on AWS automatically implements hundreds of security capabilities that map to control requirements across geographic compliance frameworks. This saves customers hundreds of hours in planning and implementing secure networking and account configurations by providing them with a foundation based on the AWS Well-Architected Security Pillar and AWS security best practices. Meeting compliance requirements, having verifiable access controls and data transfer restrictions, independence and choice over the technology stack, and surviving large-scale disruptions are some of the key capabilities that customers require of a sovereign-by-design workload. However, for many customers, translating regulatory requirements into a set of discrete technical controls and applying them consistently across one or more AWS accounts and AWS Regions can be time-intensive and challenging.
We provide customers and partners with detailed guidance on how to configure Landing Zone Accelerator on AWS in accordance with their local security and compliance requirements, including digital sovereignty requirements. This includes control mapping to local regulations or policies that shows customers how controls implemented in a landing zone are mapped to the specific requirements, calling out where customers are required to do more to meet these as part of our shared responsibility model—this includes organizational policies and procedures where customers must implement additional controls within their application or workload to meet local requirements.
Control over the location of your data
Landing Zone Accelerator on AWS provides customers with a choice of configurable preventative, detective, and proactive controls to help customers meet their data residency, security, and compliance objectives, whether you’re a public sector customer wanting to keep data in a single Region or navigating the complex needs of multi-national organizations with operations subject to differing digital sovereignty requirements.
Verifiable control over data access
Landing Zone Accelerator on AWS goes beyond just provisioning a secure, multi-account environment. It establishes a well-structured, multi-account architecture using AWS Organizations. This logically isolates workloads, management functions, and security controls into dedicated organizational units (OUs). This not only enhances security and operational efficiency, but also helps customers to enforce consistent data residency, access management, and compliance policies across their entire cloud footprint. These powerful guardrails empower customers to quickly harness the innovative potential of cloud technologies, whilst delivering business value from an established security and compliance baseline.
By providing this automated approach, AWS empowers organizations to rapidly deploy cloud environments tailored to their specific local requirements in days instead of weeks; with robust security, compliance, and operational guardrails in place from the outset. Landing Zone Accelerator on AWS is designed to simplify the path to cloud adoption and compliance for organizations, particularly those in regulated industries or with sovereignty requirements. This approach marks a shift from the previous heavy lift required for organizations to migrate workloads to the cloud while meeting their needs.
Partners at the core
There is a lot of complexity involved with navigating the evolving digital sovereignty landscape—but you don’t have to do it alone. Our AWS Digital Sovereignty Competency connects customers with trusted partners with demonstrated expertise to advise and architect for their customers’ digital sovereignty needs while taking advantage of the full potential of the AWS Cloud. As part of the competency, AWS is supporting partners to navigate customer challenges across four pillars: data residency, data protection, access control, and survivability.
Customers have told me about how challenging it can be to architect to address their sovereignty needs, often requiring manual iteration and longer time to value. Using Landing Zone Accelerator on AWS is one of the ways AWS and AWS Partners can work together to address customers’ sovereignty needs with a repeatable approach that helps our customers and partners move faster. I’m excited by how regional implementations of Landing Zone Accelerator on AWS is helping AWS Sovereignty Partners, such as Atos and SVA, to move faster without compromise.
“Compliance with regulations like C5 is essential for customers in the public sector and regulated industries, who prioritize digital sovereignty, and this is central to our Cloud for Clinics initiative with AWS in the German Healthcare market. The availability of the C5 LZA significantly reduces the technical complexity, giving us a common technical platform to build on reducing time to market. Atos is driving the operational rollout and expanding the scope of compliance mappings to further streamline customer compliance. At the same time, we are incorporating essential managed services like SOC/SIEM which we believe will make compliant cloud adoption easier to drive innovation by the Public Sector, Healthcare institutions or customers in regulated industries like Financial Services and Utilities.” Boris Hecker, Managing Director, ATOS Germany
“Compliance with BSI C5 criteria for customers from the public sector and regulated industries is a basic requirement for the use of public cloud services. Implementing the regulations is often complex, time-consuming and resource-intensive. For this reason, customers are looking for solutions that they can tailor to the specific requirements of their industry; while ensuring they meet compliance standards. SVA supports customers in maintaining the balance between innovation and compliance with customized, C5-certified, managed services. We rely on solutions such as the Landing Zone Accelerator on AWS to reconcile the use of market-leading public cloud infrastructure with regulatory requirements.” Patrick Glawe, Hyperscaler Lead at SVA
As AI and machine learning (AI/ML) become increasingly accessible through cloud service providers (CSPs) such as Amazon Web Services (AWS), new security issues can arise that customers need to address. AWS provides a variety of services for AI/ML use cases, and developers often interact with these services through different programming languages. In this blog post, we focus on Python and its pickle module, which supports a process called pickling to serialize and deserialize object structures. This functionality simplifies data management and the sharing of complex data across distributed systems. However, because of potential security issues, it’s important to use pickling with care (see the warning note in pickle — Python object serialization). In this post, we’re going to show you ways to build secure AI/ML workloads that use this powerful Python module, ways to detect that it’s in use that you might not know about, and when it might be getting abused, and finally highlight alternative approaches that can help you avoid these issues.
Quick tips
Avoid unpickling data from untrusted sources
Use alternative serialization formats, when possible, such as Safetensors
Implement integrity checks for serialized data
Use static code analysis tools to detect unsafe pickling patterns, such as Semgrep
Understanding insecure pickle serialization and deserialization in Python
Effective data management is crucial in Python programming, and many developers turn to the pickle module for serialization. However, issues can arise when deserializing data from untrusted sources. The Python bytestream that pickling uses, is proprietary to Python. Until it’s unpickled, the data in the bytestream can’t be thoroughly evaluated. This is where security controls and validation become critical. Without proper validation, there’s a risk that an unauthorized user could inject unexpected code, potentially leading to arbitrary code execution, data tampering, or even unintended access to a system. In the context of AI model loading, secure deserialization is particularly important—it helps prevent outside parties from modifying model behavior, injecting backdoors, or causing inadvertent disclosure of sensitive data.
Throughout this post, we will refer to pickle serialization and deserialization collectively as pickling. Similar issues can be present in other languages (for example, Java and PHP) when untrusted data is used to recreate objects or data structures, resulting in potential security issues such as arbitrary code execution, data corruption, and unauthorized access.
Static code analysis compared to dynamic testing for detecting pickling
Security code reviews, including static code analysis, offer valuable early detection and thorough coverage of pickling-related issues. By examining source code (including third-party libraries and custom code) before deployment, teams can minimize security risks in a cost-effective way. Tools that provide static analysis can automatically flag unsafe pickling patterns, giving developers actionable insights to address issues promptly. Regular code reviews also help developers improve secure coding skills over time.
While static code analysis provides a comprehensive white-box approach, dynamic testing can uncover context-specific issues that only appear during runtime. Both methods are important. In this post, we focus primarily on the role of static code analysis in identifying unsafe pickling.
Tools like Amazon CodeGuru and Semgrep are effective at detecting security issues early. For open source projects, Semgrep is a great option to maintain consistent security checks.
The risks of insecure pickling in AI/ML
Pickling issues in AI/ML contexts can be especially concerning.
Invalidated object loading: AI/ML models are often serialized for future use. Loading these models from untrusted sources without validation can result in arbitrary code execution. Libraries such as pickle, joblib, and some yaml configurations allow serialization but must be handled securely.
For example: If a web application stores user input using pickle and unpickles it later with no validation, an unauthorized user could craft a harmful payload that executes arbitrary code on the server.
Data integrity: The integrity of pickled data is critical. Unexpectedly crafted data could corrupt models, resulting in incorrect predictions or behaviors, which is especially concerning in sensitive domains such as finance, healthcare, and autonomous systems.
For example: A team updates its AI model architecture or preprocessing steps but forgets to retrain and save the updated model. Loading the old pickled model under new code might trigger errors or unpredictable outcomes.
Exposure of sensitive information: Pickling often includes all attributes of an object, potentially exposing sensitive data such as credentials or secrets.
For example: An ML model might contain database credentials within its serialized state. If shared or stored without precautions, an unauthorized user who unpickles the file might gain unintended access to these credentials.
Insufficient data protection: When sent across networks or stored without encryption, pickled data can be intercepted, leading to inadvertent disclosure of sensitive information.
For example: In a healthcare environment, a pickled AI model containing patient data could be transmitted over an unsecured network, enabling an outside party to intercept and read sensitive information.
Performance overhead: Pickling can be slower than other serialization formats (such as, JSON or Protocol Buffers), which can affect ML and large language model (LLM) applications when inference speed is critical.
For example: In a real-time natural language processing (NLP) application using an LLM, heavy pickling or unpickling operations might reduce responsiveness and degrade the user experience.
Detecting unsafe unpickling with static code analysis tools
Static code analysis (SCA) is a valuable practice for applications dealing with pickled data, because it helps detect insecure pickling before deployment. By integrating SCA tools into the development workflow, teams can spot questionable deserialization patterns as soon as code is committed. This proactive approach reduces the risk of events involving unexpected code execution or unintended access due to unsafe object loading.
For instance, in a financial services application where objects are routinely pickled, a SCA tool can scan new commits to detect unvalidated unpickling. If identified, the development team can quickly address the issue, protecting both the integrity of the application and sensitive financial data.
Patterns in the source code
There are various ways to load a pickle object in Python. In this context, methods for detection can be tailored for secure coding habits and needed package dependencies. Many Python libraries include a function to load pickle objects. An effective approach can be to catalog all Python libraries used in the project, then create custom rules in your static code analysis tool to detect unsafe pickling or unpickling within those libraries.
CodeGuru and other static analysis tools continue to evolve their capability to detect unsafe pickling patterns. Organizations can use these tools and create custom rules to identify potential security issues in AI/ML pipelines.
Let’s define the steps for creating a safe process for addressing pickling issues:
Generate a list of all the Python libraries that are used in your repository or environment.
Check the static code analysis tool in your pipeline for current rules and the ability to add custom rules. If the tool is capable of discovering all the libraries used in your project, you can rely on it. However, if it’s not able to discover all the libraries used in your project, you should consider adding user-provided custom rules in your static code analysis tool.
Most of the issues can be identified with well-designed, context-driven patterns in the static code analysis tool. For addressing the pickling issues, you need to identify pickling and unpickling functions.
Implement and test the custom rules to verify full coverage of pickling and unpickling risks. Let’s identify patterns for a few libraries:
NumPy can efficiently pickle and unpickle arrays; useful for scientific computing workflows requiring serialized arrays. To catch potential unsafe pickle usage in NumPy, custom rules could target patterns like:
import numpy as np
data = np.load('data.npy', allow_pickle=True)
npyfile is a utility for loading NumPy arrays from pickled files. You can add the following patterns to your custom rules to discover potentially unsafe pickle object usage.
import npyfile
data = npyfile.load('example.pkl')
pandas can pickle and unpickle DataFrames using pickle, allowing for efficient storage and retrieval of tabular data. You can add the following patterns to your custom rules to discover potentially unsafe pickle object usage.
import pandas as pd
df = pd.read_pickle('dataframe.pkl')
joblib is often used for pickling and unpickling Python objects that involve large data, especially NumPy arrays, more efficiently than standard pickle. You can add the following patterns to your custom rules to discover potentially unsafe pickle object usage.
from joblib import load
data = load('large_data.pkl')
Scikit-learn provides joblib for pickling and unpickling objects and is particularly useful for models. You can add the following patterns to your custom rules to discover potentially unsafe pickle object usage.
from sklearn.externals import joblib
data = joblib.load('example.pkl')
PyTorch provides utilities for loading pickled objects that are especially useful for ML models and tensors. You can add the following patterns to your custom rule format to discover potentially unsafe pickle object usage.
import torch
data = torch.load('example.pkl')
By searching for these functions and parameters in code, you can set up targeted rules that highlight potential issues with pickling.
Effective mitigation
Addressing pickling issues requires not only detection, but also clear guidance on remediation. Consider recommending more secure formats or validations where possible as follows:
PyTorch
Use Safetensors to store tensors. If pickling remains necessary, add integrity checks (for example, hashing) for serialized data.
pandas
Verify data sources and integrity when using pd.read_pickle. Encourage safer alternatives (for example, CSV, HDF5, or Parquet) to help avoid pickling risks.
scikit-learn (via joblib)
Consider Skops for safer persistence. If switching formats isn’t feasible, implement strict validation checks before loading.
General advice
Identify safer libraries or methods whenever possible.
Switch to formats such as CSV or JSON for data, unless object-specific serialization is absolutely required.
Perform source and integrity checks before loading pickle files—even those considered trusted.
Example
The following is an example implementation that shows safe pickle implementation as a representation of the preceding information.
import io
import base64
import pickle
import boto3
import numpy as np
from cryptography.fernet import Fernet
###############################################################################
# 1) RESTRICTED UNPICKLER
###############################################################################
#
# By default, pickle can execute arbitrary code when loading. Here we implement
# a custom Unpickler that only allows certain safe modules/classes. Adjust this
# to your application's requirements.
#
class RestrictedUnpickler(pickle.Unpickler):
"""
Restricts unpickling to only the modules/classes we explicitly allow.
"""
allowed_modules = {
"numpy": set(["ndarray", "dtype"]),
"builtins": set(["tuple", "list", "dict", "set", "frozenset", "int", "float", "bool", "str"])
}
def find_class(self, module, name):
if module in self.allowed_modules:
if name in self.allowed_modules[module]:
return super().find_class(module, name)
# If not allowed, raise an error to prevent arbitrary code execution.
raise pickle.UnpicklingError(f"Global '{module}.{name}' is forbidden")
def restricted_loads(data: bytes):
"""Helper function to load pickle data using the RestrictedUnpickler."""
return RestrictedUnpickler(io.BytesIO(data)).load()
###############################################################################
# 2) AWS KMS & ENCRYPTION HELPERS
###############################################################################
def generate_data_key(kms_key_id: str, region: str = "us-east-1"):
"""
Generates a fresh data key using AWS KMS.
Returns (plaintext_key, encrypted_data_key).
"""
kms_client = boto3.client("kms", region_name=region)
response = kms_client.generate_data_key(KeyId=kms_key_id, KeySpec='AES_256')
# Plaintext data key (use to encrypt the pickle data locally)
plaintext_key = response["Plaintext"]
# Encrypted data key (store along with your ciphertext)
encrypted_data_key = response["CiphertextBlob"]
return plaintext_key, encrypted_data_key
def decrypt_data_key(encrypted_data_key: bytes, region: str = "us-east-1"):
"""
Decrypts the encrypted data key via AWS KMS, returning the plaintext key.
"""
kms_client = boto3.client("kms", region_name=region)
response = kms_client.decrypt(CiphertextBlob=encrypted_data_key)
return response["Plaintext"]
def build_fernet_key(plaintext_key: bytes) -> Fernet:
"""
Construct a Fernet instance from a 32-byte data key.
Fernet requires a 32-byte key *encoded* in URL-safe base64.
"""
if len(plaintext_key) < 32:
raise ValueError("Data key is smaller than 32 bytes; cannot build a Fernet key.")
fernet_key = base64.urlsafe_b64encode(plaintext_key[:32])
return Fernet(fernet_key)
###############################################################################
# 3) MAIN LOGIC
###############################################################################
def upload_pickled_data_s3(
numpy_obj: np.ndarray,
bucket_name: str,
s3_key: str,
kms_key_id: str,
region: str = "us-east-1"
):
"""
Pickle a numpy object, encrypt it locally, and upload the ciphertext +
encrypted data key to S3.
"""
# 1. Generate data key from KMS
plaintext_key, encrypted_data_key = generate_data_key(kms_key_id, region)
# 2. Build Fernet from plaintext data key
fernet = build_fernet_key(plaintext_key)
# 3. Serialize the numpy object with pickle
pickled_data = pickle.dumps(numpy_obj, protocol=pickle.HIGHEST_PROTOCOL)
# 4. Encrypt the pickled data
encrypted_data = fernet.encrypt(pickled_data)
# 5. Upload to S3 along with the encrypted data key (in metadata)
s3_client = boto3.client("s3", region_name=region)
s3_client.put_object(
Bucket=bucket_name,
Key=s3_key,
Body=encrypted_data,
Metadata={
"encrypted_data_key": base64.b64encode(encrypted_data_key).decode("utf-8")
}
)
print(f"Encrypted pickle uploaded to s3://{bucket_name}/{s3_key}")
def download_and_unpickle_data_s3(
bucket_name: str,
s3_key: str,
region: str = "us-east-1"
) -> np.ndarray:
"""
Download the ciphertext and the encrypted data key from S3. Decrypt the data
key with KMS, use it to decrypt the pickled data, then load with a restricted
unpickler for safety.
"""
s3_client = boto3.client("s3", region_name=region)
# 1. Get object from S3
response = s3_client.get_object(Bucket=bucket_name, Key=s3_key)
# 2. Extract the encrypted data key from metadata
metadata = response["Metadata"]
encrypted_data_key_b64 = metadata.get("encrypted_data_key")
if not encrypted_data_key_b64:
raise ValueError("Missing encrypted_data_key in S3 object metadata.")
encrypted_data_key = base64.b64decode(encrypted_data_key_b64)
# 3. Decrypt data key via KMS
plaintext_key = decrypt_data_key(encrypted_data_key, region)
fernet = build_fernet_key(plaintext_key)
# 4. Decrypt the pickled data
encrypted_data = response["Body"].read()
decrypted_pickled_data = fernet.decrypt(encrypted_data)
# 5. Use restricted unpickler to load the numpy object
numpy_obj = restricted_loads(decrypted_pickled_data)
return numpy_obj
###############################################################################
# DEMO USAGE
###############################################################################
if __name__ == "__main__":
# --- Replace with your actual values ---
KMS_KEY_ID = "arn:aws:kms:us-east-1:123456789012:key/your-kms-key-id"
BUCKET_NAME = "your-secure-bucket"
S3_OBJECT_KEY = "encrypted_npy_demo.bin"
AWS_REGION = "us-east-1" # or region of your choice
# Example numpy array
original_array = np.random.rand(2, 3)
print("Original Array:")
print(original_array)
# Upload (pickle + encrypt) to S3
upload_pickled_data_s3(
numpy_obj=original_array,
bucket_name=BUCKET_NAME,
s3_key=S3_OBJECT_KEY,
kms_key_id=KMS_KEY_ID,
region=AWS_REGION
)
# Download (decrypt + unpickle) from S3
retrieved_array = download_and_unpickle_data_s3(
bucket_name=BUCKET_NAME,
s3_key=S3_OBJECT_KEY,
region=AWS_REGION
)
print("\nRetrieved Array:")
print(retrieved_array)
# Verify integrity
assert np.allclose(original_array, retrieved_array), "Arrays do not match!"
print("\nSuccess! The retrieved array matches the original array.")
Conclusion
With the rapid expansion of cloud technologies, integrating static code analysis into your AI/ML development process is increasingly important. While pickling offers a powerful way to serialize objects for AI/ML and LLM applications, you can mitigate potential risks by applying manual secure code reviews, setting up automated SCA with custom rules, and following best practices such as using alternative serialization methods or verifying data integrity.
When working with ML models on AWS, see the AWS Well-Architected Framework’s Machine Learning Lens for guidance on secure architecture and recommended practices. By combining these approaches, you can maintain a strong security posture and streamline the AI/ML development lifecycle.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
Every organization strives to empower teams to drive innovation while safeguarding their data and systems from unintended access. For organizations that have thousands of Amazon Web Services (AWS) resources spread across multiple accounts, organization-wide permissions guardrails can help maintain secure and compliant configurations. For example, some AWS services support resource-based policies that can be used to grant identities permissions to perform actions on the resources they’re attached to. With the management of resource-based policies frequently delegated to application owners, central security teams use permissions guardrails to help ensure that possible misconfigurations don’t lead to unintended access to these resources.
In this post, we discuss how you can use resource control policies (RCPs) to centrally restrict access to resources. We demonstrate how RCPs can help improve your security posture while allowing even more freedom to developers in managing their resources, thus reducing friction between central security and application teams. Using a sample use case, we uncover key considerations for designing and effectively implementing RCPs in your organization at scale.
We recommend implementing permissions guardrails, including RCPs, using the following iterative process, which consists of five phases (as shown in Figure 1).
This phased approach helps ensure an effective integration of RCPs into your security strategy, improving your security posture while helping to maintain business continuity. Let’s explore each phase of RCP implementation in detail and outline key considerations for an effective implementation strategy.
Phase 1: Examine your security control objectives
The first step in implementing RCPs is identifying areas where RCPs can help improve your security posture or optimize the implementation of controls for your organization’s specific security control objectives.
Your control objectives can be influenced by a variety of factors such as compliance and regulatory requirements, legal and contractual obligations, types of workloads, data classification, and your organization’s threat model. After your control objectives are well-defined and prioritized, identify those that can be achieved using RCPs.
Like SCPs, RCPs are designed to establish coarse-grained access controls, security invariants that rarely change and serve as always-on boundaries across a wide range of AWS resources in your accounts. RCPs aren’t for managing fine-grained access controls. You will keep using policies such as resource-based and identity-based policies to apply least-privilege permissions.
More specifically, the following are key control objectives that you can achieve using RCPs:
Establish a data perimeter around your AWS resources. For example, you can use RCPs to help ensure that only trusted identities can access your AWS resources.
Mitigate the cross-service confused deputy risk. You can use RCPs to help ensure that your AWS resources are accessed by AWS services only on behalf of your organization.
Apply consistent access controls to your AWS resources regardless of the identities accessing them. For example, you can use RCPs to help ensure your Amazon Simple Storage Service (Amazon S3) buckets require TLS v1.2 or higher for in-transit encryption.
Let’s begin with the scenario illustrated in Figure 2. Your company’s central cloud team manages your corporate AWS Organizations organization, which consists of two corporate AWS accounts. An IAM principal in Account A should be able to assume an IAM role in Account B to perform day-to-day operations. To align to the broader control objective of Only trusted identities can access my resources, the central security team wants to make sure that the IAM role in Account B (my resource) can only be assumed by IAM principals that belong to their organization (trusted identities).
Figure 2: Simple scenario depicting a trusted identity accessing an IAM role
One way of achieving this control objective is to follow the principle of least-privilege and make sure that the role trust policy, the resource-based policy attached to the IAM role, only allows access to identities that require that access. The following is an example trust policy that grants permissions to Role A in Account A to assume Role B in Account B.
In organizations that have only a few accounts, central teams typically manage these policies. While this centralized governance model helps ensure that trust policies applied to roles are always restricted to trusted identities, it can also impede the productivity of application teams when operating at a greater scale.
Assume that your company has started growing its cloud footprint so much that your central security team now must achieve the same control objective with hundreds of IAM roles that are spread across multiple AWS accounts, as demonstrated in Figure 3.
Figure 3: Restricting access by managing individual IAM role trust policies
At this scale, we see organizations delegating permissions management to application teams to better support the growth of their business and empower developers to innovate faster. While central security teams no longer have full control over the permissions granted to resources across AWS accounts, they must make sure that access is aligned with their organization’s security standard. For example, they might want to make sure that the GrantCrossAccountAccess statement that is now managed by developers doesn’t inadvertently grant access to an account that doesn’t belong to their organization. Previously, central security teams typically achieved this by developing automated mechanisms to insert a standard statement into all trust policies. This statement helped ensure that access remained bounded to their organization, even when developers configured broad access permissions for their roles. The following is an example trust policy where a developer granted permissions to an external account through the GrantCrossAccountAccess statement. However, because of the RestrictAccessToMyOrg statement added to the policy by the central security team, the external account will be unable to use these permissions.
The RestrictAccessToMyOrg statement uses the aws:PrincipalOrgID and aws:PrincipalIsAWSService condition keys to restrict access to principals within your organization or to AWS service principals. The BoolIfExists operator with the aws:PrincipalIsAWSService condition key is required if the roles you’re applying a control to are service roles that are used by AWS services to perform operations on your behalf. When an AWS service assumes a service role, it uses its AWS service principal, an identity that is owned by AWS and that does not belong to your organization.
The central security teams could, for example, use AWS Config rules to detect misconfigurations and then use AWS Config remediation to automatically add the RestrictAccessToMyOrg statement to the IAM roles’ trust policies when new IAM roles are created or their trust policies are changed. Even though the addition of the RestrictAccessToMyOrg statement to trust policies can be automated, RCPs can greatly simplify enforcement of such coarse-grained controls in a multi-account environment.
Phase 2: Design permissions guardrails
Central security teams can implement permissions guardrails by creating an RCP that centrally blocks external access to IAM roles. The RCP that you will implement contains similar restrictions to the RestrictAccessToMyOrg statement that you used in the IAM trust policy.
Like SCPs, you attach the RCP to an account, organizational unit (OU), or the root of your organization. After being attached, the RCP automatically applies to applicable resources—in this case, IAM roles—within the scope of that AWS Organizations entity. This centralized approach alleviates the need to modify hundreds of trust policies across multiple accounts, lowering the operational overhead for central security teams and helping ensure consistent access controls are applied at scale. RCPs also help you achieve separation of duties with developers still managing their least-privilege permissions in trust policies and administrators applying coarse-grained access controls in RCPs. If developers make configuration mistakes while managing permissions for their applications, the preventative access controls implemented using RCPs will help ensure that they stay within your organization’s access control guidelines. See How AWS enforcement code logic evaluates requests to allow or deny access to understand how different policy types impact the authorization process.
If you’re transitioning existing controls from resource-based policies to RCPs, use the opportunity to reassess the control design based on your current control objectives and the additional benefits offered by RCPs. For example, your previous controls might have been limited to specific resource types, such as IAM roles in this use case, or to particular accounts, such as those storing the most sensitive data. RCPs enable you to extend controls to additional resources across your entire organization, reducing operational overhead through centralized management of permissions guardrails.
While designing your RCPs, consider the following guidelines.
Design for operational excellence
A key foundation for effectively implementing and operating permissions guardrails like RCPs is organizing your AWS environment using multiple accounts. Account boundaries and strategic placement of workloads across them allow you to apply tailored access controls that align with data sensitivity and specific access requirements. Grouping accounts into OUs within AWS Organizations enables more effective access control, even in scenarios where cross-account access is required. Figure 4 illustrates an example organization structure, demonstrating how RCPs can be applied at various levels of the organizational hierarchy to adhere to the security requirements of different workloads.
Figure 4: A sample organization with RCPs applied at various levels
When operating at scale, consider delegating policy management to a central security account in your organization. With AWS Organizations resource-based delegation, central teams don’t need access to the management account for any SCP or RCP related changes or troubleshooting.
Establishing clear governance helps you define how to implement and continuously manage RCPs within your organization. This includes the operating model, change management processes, and exceptions handling procedures. RCPs provide authorization controls similar to SCPs and therefore should integrate with your existing governance framework rather than requiring separate oversight. For example, if your change management process requires two-person approval for SCP changes, you should consider applying the same approval process for RCP implementation. You should also adopt the same mechanisms you currently use to prevent unauthorized changes or detect drifts in your policies.
Plan for exceptions
There might be scenarios where you have a few resources that should be accessible publicly or by identities that don’t belong to your organization. If you’re organizing your resources across multiple accounts and OUs based on their compliance requirements or a common set of controls, then you most likely have such resources in a dedicated set of accounts or OUs, such as the Public Data OU in Figure 4. These accounts or OUs can have applicable policies that account for their unique access requirements.
Another option to accommodate these scenarios is to use the aws:ResourceAccount or aws:ResourceOrgPaths condition key to exclude certain accounts from the control. For example, the following policy will deny access to identities outside your organization from assuming IAM roles unless the identity is an AWS service principal or the role that is being accessed belongs to Account A.
There also might be situations where your company’s trusted partners or acquisitions need to be granted an exception for access to a subset of your company’s resources distributed across multiple accounts. For example, your company might integrate with Cloud Security Posture Management (CSPM) tools that assume roles in your accounts to assess your accounts’ security posture, as shown in Figure 5.
Figure 5: Representative view of granting exceptions to trusted partners
When implementing a control with an RCP that by default will apply to all resources of the entity it’s attached to, you can manage resource specific exceptions using the aws:ResourceTag condition key. In addition, use the aws:PrincipalAccount context key to conditionally grant exceptions based on the AWS account ID of the trusted partner.
Let’s examine the two statements in the preceding RCP:
RestrictAccessToMyOrgExceptTaggedRoles
This statement helps ensure that your roles can only be assumed by identities that belong to your organization or by AWS service principals, unless a role is tagged with partner-access-exception set to trusted-partner.
RestrictAccessForTaggedRoles
This statement further restricts access by helping ensure that the roles that have the partner-access-exception tag can only be assumed by identities that belong to your trusted partner account.
If you have a well-known, tightly scoped set of resources that need to be excluded, you can also use the IAM policy element, NotResource, to list the Amazon Resource Names (ARNs) of resources to exclude from the control.
When implementing tag-based exception processes, establishing strict controls over tag management is key. Unauthorized modifications of tags on resources, principals, or sessions could impact your security posture by enabling unintended access. You should implement controls to help prevent unauthorized tag manipulation. For example, the following SCP restricts the use of the partner-access-exception tag to the admin role so that unauthorized users cannot alter the control by attaching, detaching, or modifying the tag.
You should also make sure that the partner-access-exception tag cannot be passed as a session tag when identities assume roles. See the sample RCP in the data perimeter policy examples repository.
Phase 3: Anticipate potential impacts
Before rolling out RCPs, you need to understand their potential impact on your organization. Introducing new policies or modifying existing ones without proper validation can disrupt your security-productivity balance. Be aware that overly restrictive policies might inadvertently impede legitimate data flows that are essential for achieving your business objectives.
Another effective method to assess impact is to review and analyze your account activity using AWS CloudTrail. For example, if you centralize all your CloudTrail logs in an S3 bucket, you can use Amazon Athena to query these logs. Specifically, look for STS API calls made against your IAM roles by identities outside your organization. Then, compare the results with your list of known trusted partners and those you have already accounted for in your RCPs. Based on this analysis, determine if you need to add the partner-access-exception tag to additional IAM roles and further refine the policy before enforcement. This is essential to ensure trusted partner integrations continue to function as expected when you enforce your RCPs. Furthermore, use this analysis to identify any illegitimate access patterns in your environment and plan for necessary remediations, further enhancing your security posture as part of RCP implementation.
As you transition into the implementation phase, consider the following key factors to promote a smooth rollout while enhancing your security posture.
Deployment automation and integration
Use your existing deployment pipelines to implement RCPs, the same as you do for SCPs. This approach will minimize operational overhead while maintaining consistency in the deployment of your controls.
As with SCPs, AWS strongly advises against attaching RCPs in production environments without thoroughly testing the impact that the policies have on resources in your accounts. Follow standard CI/CD processes and begin your RCP rollout in lower environments by attaching them to individual test accounts or OUs first. After you validate that the controls behave as excepted, gradually promote the RCPs to upper environments.
If your goal is to transition an existing control from resource-based policies to RCPs, keep your resource-based policies in place while conducting the progressive rollout. After you have completed rolling out your RCPs and confirmed that they operate as expected, you can consider deactivating the automation you used to apply the control using resource-based policies. This approach lets you deploy RCPs without impacting your existing security posture or disrupting business workflows.
Additionally, consider deploying RCPs to a subset of resources or accounts first to limit the scope of impact and provide an opportunity to test and refine your deployment and operational processes. You can follow your standard prioritization approach to define deployment waves, for example, start with resources or accounts that store sensitive data or pose the highest risk, based on your current operational practices and other controls that might be in place. For additional best practices, see OPS06-BP03 Employ safe deployment strategies in the AWS Well-Architected Framework: Operation Excellence Pillar whitepaper.
Phase 5: Monitor permissions guardrails
Finally, establish monitoring processes to help ensure that controls for preventing external access to your resources operate as expected. You can use the same tools you used for impact analysis. For example, you can use IAM Access Analyzer external access findings to understand the impact of your RCPs on resource permissions. This information will help you verify that your RCPs are crafted in accordance with your intent and plan remediation actions, if required. You can also set alerts for occurrences of unintended access patterns observed in your CloudTrail logs.
Furthermore, follow the phased approach outlined in this post to regularly review and update your controls to help ensure that they align with evolving business and security objectives. Consider factors such as organizational changes, changes in partner relationships, data criticality shifts, and opportunities for expanding your RCP coverage. This continuous improvement process helps maintain the effectiveness of your security controls while supporting business growth and transformation.
Conclusion
In this post, we discussed how to effectively implement coarse-grained access controls on AWS resources at scale using RCPs. You can use the phased implementation approach described here to achieve your security control objectives while minimizing the risk of disrupting your business workflows. You can apply the same approach to implement other preventative controls, such as SCPs, across your multi-account environment.
Remember that RCPs, like SCPs, provide a powerful mechanism for enforcing coarse-grained controls across multiple accounts in your organization. They don’t replace your least-privilege controls and should be part of a broader, multi-layered approach to data security that includes other well-architected security design principles.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
We are excited to announce our latest innovation to Cloudflare’s Data Loss Prevention (DLP) solution: a self-improving AI-powered algorithm that adapts to your organization’s unique traffic patterns to reduce false positives.
Many customers are plagued by the shapeshifting task of identifying and protecting their sensitive data as it moves within and even outside of their organization. Detecting this data through deterministic means, such as regular expressions, often fails because they cannot identify details that are categorized as personally identifiable information (PII) nor intellectual property (IP). This can generate a high rate of false positives, which contributes to noisy alerts that subsequently may lead to review fatigue. Even more critically, this less than ideal experience can turn users away from relying on our DLP product and result in a reduction in their overall security posture.
Built into Cloudflare’s DLP Engine, AI enables us to intelligently assess the contents of a document or HTTP request in parallel with a customer’s historical reports to determine context similarity and draw conclusions on data sensitivity with increased accuracy.
Understanding false positives and their impact on user confidence
Data Loss Prevention (DLP) at Cloudflare detects sensitive information by scanning potential sources of data leakage across various channels such as web, cloud, email, and SaaS applications. While we leverage several detection methods, pattern-based methods like regular expressions play a key role in our approach. This method is effective for many types of sensitive data. However, certain information can be challenging to classify solely through patterns. For instance, U.S. Social Security Numbers (SSNs), structured as AAA-GG-SSSS, sometimes with dashes omitted, are often confused with other similarly formatted data, such as U.S. taxpayer identification numbers, bank account numbers, or phone numbers.
Since announcing our DLP product, we have introduced new capabilities like confidence thresholds to reduce the number of false positives users receive. This method involves examining the surrounding context of a pattern match to assess Cloudflare’s confidence in its accuracy. With confidence thresholds, users specify a threshold (low, medium, or high) to signify a preference for how tolerant detections are to false positives. DLP uses the chosen threshold as a minimum, surfacing only those detections with a confidence score that meets or exceeds the specified threshold.
However, implementing context analysis is also not a trivial task. A straightforward approach might involve looking for specific keywords near the matched pattern, such as “SSN” near a potential SSN match, but this method has its limitations. Keyword lists are often incomplete, users may make typographical errors, and many true positives do not have any identifying keywords nearby (e.g., bank accounts near routing numbers or SSNs near names).
Leveraging AI/ML for enhanced detection accuracy
To address the limitations of a hardcoded strategy for context analysis, we have developed a dynamic, self-improving algorithm that learns from customer feedback to further improve their future experience. Each time a customer reports a false positive via decrypted payload logs, the system reduces its future confidence for hits in similar contexts. Conversely, reports of true positives increase the system’s confidence for hits in similar contexts.
To determine context similarity, we leverage Workers AI. Specifically, a pretrained language model that converts the text into a high-dimensional vector (i.e. text embedding). These embeddings capture the meaning of the text, ensuring that two sentences with the same meaning but different wording map to vectors that are close to each other.
When a pattern match is detected, the system uses the AI model to compute the embedding of the surrounding context. It then performs a nearest neighbor search to find previously logged false or true positives with similar meanings. This allows the system to identify context similarities even if the exact wording differs, but the meaning remains the same.
In our experiments using Cloudflare employee traffic, this approach has proven robust, effectively handling new pattern matches it hadn’t encountered before. When the DLP admin reports false and true positives through the Cloudflare dashboard while viewing the payload log of a policy match, it helps DLP continue to improve, leading to a significant reduction in false positives over time.
Seamless integration with Workers AI and Vectorize
In developing this new feature, we used components from Cloudflare’s developer platform — Workers AI and Vectorize — which helps simplify our design. Instead of managing the underlying infrastructure ourselves, we leveraged Cloudflare Workers as the foundation, using Workers AI for text embedding, and Vectorize as the vector database. This setup allows us to focus on the algorithm itself without the overhead of provisioning underlying resources.
Thanks to Workers AI, converting text into embeddings couldn’t be easier. With just a single line of code we can transform any text into its corresponding vector representation.
const result = await env.AI.run(model, {text: [text]}).data;
This handles everything from tokenization to GPU-powered inference, making the process both simple and scalable.
The nearest neighbor search is equally straightforward. After obtaining the vector from Workers AI, we use Vectorize to quickly find similar contexts from past reports. In the meantime, we store the vector for the current pattern match in Vectorize, allowing us to learn from future feedback.
To optimize resource usage, we’ve incorporated a few more clever techniques. For example, instead of storing every vector from pattern hits, we use online clustering to group vectors into clusters and store only the cluster centroids along with counters for tracking hits and reports. This reduces storage needs and speeds up searches. Additionally, we’ve integrated Cloudflare Queues to separate the indexing process from the DLP scanning hot path, ensuring a robust and responsive system.
Privacy is a top priority. We redact any matched text before conversion to embeddings, and all vectors and reports are stored in customer-specific private namespaces across Vectorize, D1, and Workers KV. This means each customer’s learning process is independent and secure. In addition, we implement data retention policies so that vectors that have not been accessed or referenced within 60 days are automatically removed from our system.
Limitations and continuous improvements
AI-driven context analysis significantly improves the accuracy of our detections. However, this comes at the cost of some increase in latency for the end user experience. For requests that do not match any enabled DLP entries, there will be no latency increase. However, requests that match an enabled entry in a profile with AI context analysis enabled will typically experience an increase in latency of about 400ms. In rare extreme cases, for example requests that match multiple entries, that latency increase could be as high as 1.5 seconds. We are actively working to drive the latency down, ideally to a typical increase of 250ms or better.
Another limitation is that the current implementation supports English exclusively because of our choice of the language model. However, Workers AI is developing a multilingual model which will enable DLP to increase support across different regions and languages.
Looking ahead, we also aim to enhance the transparency of AI context analysis. Currently, users have no visibility on how the decisions are made based on their past false and true positive reports. We plan to develop tools and interfaces that provide more insight into how confidence scores are calculated, making the system more explainable and user-friendly.
With this launch, AI context analysis is only available for Gateway HTTP traffic. By the end of 2025, AI context analysis will be available in both CASB and Email Security so that customers receive the same AI enhancements across their entire data landscape.
Unlock the benefits: start using AI-powered detection features today
DLP’s AI context analysis is in closed beta. Sign up here for early access to experience immediate improvements to your DLP HTTP traffic matches. More updates are coming soon as we approach general availability!
To get access to DLP via Cloudflare One, contact your account manager.
The Spanish National Cryptologic Center (CCN) has published a new STIC guide (CCN-STIC-887 Anexo A) that provides a comprehensive template and supporting artifacts for implementing landing zones that comply with Spain’s National Security Framework (ENS) Royal Decree 311/2022 using the Landing Zone Accelerator on AWS. Spain’s ENS establishes a common framework of basic principles and requirements of security for Spanish public sector organizations and their service providers, including supply chain providers. Over the years, the collaboration between Amazon Web Services (AWS) and the CCN has resulted in the publication of eight secure configuration guides (Series STIC 887) that provide comprehensive advice on the configuration of AWS services to align with the ENS. The guide CCN-STIC-887 Anexo A is the last addition to this series.
The centerpiece of this new guide is the ENS template for the Landing Zone Accelerator on AWS (LZA ENS). A landing zone serves as the initial setup of an organization’s cloud account or environment, including the implementation of security controls, access management, and compliance frameworks. The Landing Zone Accelerator on AWS is a powerful open source tool created by AWS for organizations that want to quickly customize and automate implementation of landing zones that align with AWS best practices and with regulatory compliance frameworks. This tool provides a comprehensive solution that, managed entirely by code, automatically configures over 35 AWS services using a simplified set of configuration files to manage and govern a multi-account environment, helping customers with highly regulated workloads and complex compliance requirements.
The CCN-STIC-887 Anexo A guide focuses on helping organizations implement landing zones that meet ENS security requirements from the ground up. It offers detailed instructions and templates for establishing a landing zone—the foundational infrastructure required for a secure, well-managed cloud environment—and a control matrix to demonstrate compliance with ENS controls.
Key components covered in the STIC 887H guide include:
Logging and monitoring: LZA ENS performs a default and scaled activation of the necessary logging and monitoring services required to meet ENS monitoring requirements in AWS services (such as AWS CloudTrail, Amazon CloudWatch, AWS Security Hub, and Amazon GuardDuty).
Access control: LZA ENS implements the management of identity and access management methods and policies at scale, which are aligned with the access control requirements of the ENS in a centralized manner using AWS IAM Identity Center.
Asset management: By default, LZA ENS activates inventory functions and resource and inventory tagging policies (for example, AWS Config) that support ENS asset management controls in the services.
Network topology: LZA ENS can be used to deploy a centralized network topology in accordance with ENS network security controls.
Cryptography: The encryption service activation capabilities built into LZA ENS can help organizations align with ENS data protection standards through mandatory encryption at rest, enforcement mechanisms with AWS Key Management Service (AWS KMS), and monitoring mechanisms to detect unencrypted data and communications with AWS Config rules.
Compliance and data residency: LZA ENS includes control policies to promote the use of AWS services with the ENS High certification and to provide processing on AWS in accordance with customers’ data residency requirements.
Organizations that require specific customizations to fully meet the requirements of the ENS can use LZA ENS to quickly modify and add customized security controls and then execute the scaled deployment of these controls to their accounts in the landing zone. One of the customizations included in LZA ENS is the integration of the open source security tool Prowler with Security Hub as an automated auditing tool with the objective of providing an up-to-date view of compliance with ENS controls. In addition, by providing a base designed for security and the flexibility to add custom controls, LZA ENS can support the process of achieving and maintaining compliance with the ENS in the AWS Cloud environment.
The CCN-STIC-887 Anexo A guide represents an important step forward in standardizing secure cloud deployments for Spanish public sector organizations and those working with government entities. This publication demonstrates the AWS commitment to support organizations in their secure cloud adoption journey while maintaining compliance with national security standards.
Spanish version
CCN publica la guía para las Zonas de Aterrizaje del ENS con AWS Landing Zone Accelerator
El Centro Criptológico Nacional de España (CCN) ha publicado una nueva guía STIC (CCN-STIC-887 Anexo A) que proporciona una plantilla de código y material de soporte para implementar zonas de aterrizaje (o landing zones) que cumplan con el Esquema Nacional de Seguridad del Real Decreto 311/2022 (ENS) mediante el Landing Zone Accelerator on AWS. El ENS establece un marco común de principios básicos, requisitos y medidas de seguridad para las organizaciones del sector público español y sus prestadores de servicios, incluyendo la cadena de suministro. A lo largo de los años, la colaboración entre Amazon Web Services (AWS) y el CCN se ha traducido en la publicación de ocho guías de configuración segura (serie STIC 887) que proporcionan consejo sobre la configuración de los servicios de AWS para alinearse con el ENS. La guía CCN-STIC-887 Anexo A es la última incorporación a esta serie.
La pieza central de la nueva guía es la plantilla ENS para el AWS Landing Zone Accelerator (LZA ENS). Una zona de aterrizaje (landing zone) sirve como la configuración inicial del entorno en la nube de una organización, e incluye la implementación inicial de controles de seguridad, la administración del acceso y los marcos de cumplimiento. El AWS Landing Zone Accelerator es una potente herramienta de código abierto creada por AWS para las organizaciones que desean implementar de forma rápida, segura, personalizada y automatizada zonas de aterrizaje alineadas con las prácticas recomendadas de AWS, así como con marcos de conformidad. Esta herramienta proporciona una solución integral que, mediante código, configura automáticamente más de 35 servicios de AWS con un conjunto simplificado de archivos de configuración para administrar y gobernar un entorno multicuenta, lo que ayuda a los clientes con cargas de trabajo altamente reguladas y requisitos de cumplimiento normativo.
La guía CCN-STIC-887 Anexo A se centra específicamente en ayudar a las organizaciones a implementar desde cero zonas de aterrizaje que cumplan con los requisitos de seguridad del ENS. Ofrece instrucciones y plantillas detalladas para establecer una zona de aterrizaje – la infraestructura básica necesaria para un entorno de nube seguro y bien administrado – así como una matriz de control para demostrar el cumplimiento de los controles del ENS.
Los componentes clave incluidos en la guía STIC 887H incluyen:
Registro y monitoreo: LZA ENS realiza una activación por defecto y a escala de los servicios de registro y monitoreo necesarios en AWS (como AWS CloudTrail, Amazon CloudWatch, AWS Security Hub, y AWS GuardDuty) para cumplir con los requisitos de monitoreo del ENS.
Control de acceso: LZA ENS implementa los métodos y políticas de administración de identidades y accesos a escala, que se alinean con los requisitos de control de acceso del ENS de manera centralizada mediante AWS IAM Identity Center..
Administración de activos: De forma predeterminada, el LZA ENS activa las funciones de inventario y las políticas de etiquetado de recursos e inventario (por ejemplo AWS Config) que soportan los controles de administración de activos del ENS.
Topología de red: LZA ENS se puede utilizar para implementar una topología de red centralizada de acuerdo con los controles de seguridad de red ENS.
Criptografía: las capacidades de activación de cifrado integradas en la LZA ayudan a organizaciones a alinearse con los estándares de protección de datos del ENS mediante el cifrado obligatorio en reposo, los mecanismos de aplicación con AWS Key Management Service (AWS KMS) y los mecanismos de supervisión para detectar datos y comunicaciones no cifrados con las reglas de AWS Config.
Cumplimiento y residencia de datos: LZA ENS incluye políticas de control para promover el uso de los servicios de AWS con la certificación del ENS Alto y realizar el procesamiento en AWS de acuerdo con los requisitos de residencia de datos del cliente.
Las organizaciones que requieren personalizaciones específicas para cumplir plenamente los requisitos del ENS pueden usar el LZA ENS para modificar rápidamente y añadir fácilmente controles de seguridad personalizados y ejecutar la implementación a escala de estos controles en sus cuentas de la zona de aterrizaje. Una de las personalizaciones que hemos incluido en el LZA ENS es la integración de Prowler con AWS Security Hub como una herramienta de auditoría automatizada, con el objetivo de proporcionar una visión actualizada del cumplimiento de los controles ENS de una manera fácil y eficaz. Además, al proporcionar una base diseñada para la seguridad y la flexibilidad de agregar controles personalizados, LZA ENS puede ayudar durante el proceso de obtener la conformidad con el ENS en el entorno de nube de AWS.
La guía CCN-STIC-887 Anexo A representa un importante paso adelante en la estandarización de las implementaciones seguras en la nube para las organizaciones del sector público español. Esta publicación demuestra el compromiso de AWS de apoyar a las organizaciones en su proceso de adopción segura de la nube, manteniendo al mismo tiempo el cumplimiento de las normas de seguridad nacionales.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
As organizations continue their cloud journeys, effective data security in the cloud is a top priority. Whether it’s protecting customer information, intellectual property, or compliance-mandated data, encryption serves as a fundamental security control. This is where AWS Key Management Service (AWS KMS) steps in, offering a robust foundation for encryption key management on AWS.
One of the first questions that often arises for customers is, “How many keys do I actually need?” This seemingly simple question requires careful consideration of various factors. Although AWS KMS makes encryption straightforward, organizations need to consider several aspects of their key management strategy. These include choosing between AWS managed keys, customer managed keys, and importing your own keys (BYOK), as well as deciding between centralized and decentralized key management approaches. Each option has its own benefits and trade-offs in terms of security, control, and operational overhead. By understanding these choices and how they align with your organization’s needs, you can develop an effective and efficient key management strategy.
In this blog post, we explore the main considerations that drive your AWS KMS key strategy, from organizational structure to compliance requirements. Should you maintain a centralized key management approach with a single team controlling all keys, or adopt a decentralized model where individual teams manage their own keys? These decisions are important because they relate to the AWS shared responsibility model, where AWS maintains the security of the cloud, while customers remain responsible for security in the cloud, including the proper management and use of encryption keys.
Overview – What is AWS Key Management Service?
AWS Key Management Service (AWS KMS) is an AWS managed service that makes it convenient for you to create and control the encryption keys that are used to encrypt your data. The keys that you create in AWS KMS are protected by FIPS 140 Level 3 validated hardware security modules (HSM). The keys never leave AWS KMS unencrypted. To use or manage your KMS keys, you interact with AWS KMS.
Customers are responsible for deciding what data to encrypt, choosing the appropriate encryption keys, and implementing encryption across AWS services with the help of the key policy. Customers are responsible for monitoring and auditing the use of encryption keys through services such as AWS CloudTrail.
A critical aspect of customer responsibility is determining how to manage the keys and how many KMS keys are needed. This decision depends on various factors such as data classification, application architecture, regulatory requirements, and operational needs. We look at these areas in more detail in the next sections.
Guiding principles for key strategy
Following are four guiding engineering principles that, based on our experience, help create a secure and easier-to-maintain system. They will assist you in determining the approximate number of KMS keys for your organization based on your management requirements.
Principle 1 – Data Classification: If a system processes data of different classification levels, employ separate data resources and separate KMS keys to separately govern and audit access to the data. With similarly classified data or a single type of data, the usage of just one KMS key may be justified.
Why it matters: This principle helps to ensure that data that is classified into different sensitivity levels is protected appropriately based on access to encryption keys for that same classification, reducing the risk of unauthorized access and simplifying governance.
Principle 2 – Applications: Multiple applications can run in one AWS account. We recommend that you use distinct KMS keys for each application, because managing access to an individual key can become a complex task when it is delegated to two or more application administrators. Use separate KMS keys for applications running in distinct AWS accounts to further make use of the account boundary, limiting the potential impact in case of a security incident. Use separate keys for distinct application stages (such as development, staging, or production).
Why it matters: This approach isolates access to applications and application access to data. This reduces the potential impact of unintended access to a key.
Principle 3 – AWS Services: When you consider key management across multiple AWS services, focus on both the services and the nature of the data. If you are dealing with one type of data (for example, customer information) that flows through multiple AWS services as part of one application or workflow, consider using a single KMS key. This simplifies key management while maintaining consistent access control. For instance, a customer record that is stored in Amazon Simple Storage Service (Amazon S3), processed by AWS Lambda, and then stored in Amazon DynamoDB could use the same KMS key across these services as mentioned in Principle 1.
However, if you are handling different types of data (such as financial records and user preferences) across various AWS services, even within the same application, consider using separate KMS keys on a per-service basis. This allows for more granular access control and adheres to the principle of least privilege. For example, in an e-commerce application, you might use one KMS key for encrypting payment information in Amazon Relational Database Service (Amazon RDS) and a different key for encrypting user browsing history in Amazon Redshift.
The decision to use one key or multiple keys should be based on your data classification policies and access control requirements. With this approach, you can keep your key management strategy aligned with your data governance policies, regardless of which AWS services you are using.
Why it matters: This principle balances the need for simplicity with the requirement for granular control over data access across different AWS services.
Principle 4 – Separation of Duties: Key policies define who can administer and who can use the key. In the case of distinct encryption use cases and distinct administrators, we recommend that you create separate KMS keys. Another aspect of separation of duties is that, with KMS key policies, two different principals can be made responsible for governing data and data decryption access. However, this does not influence the count of keys.
Why it matters: This principle supports the implementation of least privilege access and helps maintain clear accountability in key management.
By applying these principles, you can develop a key management strategy that describes how many KMS keys you may need, and that balances security, compliance, and operational efficiency. In the following sections, we explore how to apply these principles in various scenarios.
Examples of key management strategy and comparison of centralized and decentralized approaches
In addition to the guiding principles discussed earlier, the structure of your organization and its specific needs play a crucial role in determining the most suitable approach to key management. When implementing key management strategies, organizations generally choose from three main approaches: centralized, decentralized, or a hybrid model. The choice depends on the organization’s structure, needs, and operational context. Each approach offers distinct advantages for specific organizational scenarios.
A decentralized approach is our recommended approach, as most customers fit into the following scenarios:
Organizations with autonomous business units or where governance controls provide oversight of key usage
Companies where development teams are agile and ownership of keys can be centrally audited
Companies that operate in multiple regulatory frameworks
Companies that require to operate in a particular AWS Region
A centralized KMS approach is best suited for the following scenarios:
Organizations that require strict compliance oversight and centralized management
Companies with centralized security or data protection functions
In a hybrid model, there is a blend between centralized and decentralized:
Core key policies are managed centrally
Day-to-day key operations are handled by teams
For example, organizations or companies could have independent product teams, but a centralized security team.
Example 1 (Hybrid): A retail website with public product catalog data and confidential customer data should use two KMS keys—one for the public catalog that is encrypted in Amazon S3, and one for customer data that is encrypted in Amazon RDS and other AWS services.
Rationale: This recommendation is based primarily on Principle 1 (Data Classification). The public catalog data and confidential customer data represent different classification levels, justifying the use of separate keys. This approach is further supported by Principle 3 (AWS Services), because the data resides in different AWS services and is of a varied nature.
The benefits of this approach:
Implement appropriate access controls for each data type
Manage encryption independently for each data classification
Enhance overall data security and compliance
Example 2 (Decentralized): A healthcare company with several application teams could use a separate KMS key for each application team, with distinct key policies for each key based on the data and roles of each team.
Rationale: This recommendation is primarily based on Principle 2 (Applications). With multiple application teams operating within the healthcare company, each potentially dealing with distinct types of data and having different access requirements, separate KMS keys provide for independent management of encryption and access for each team. This approach is further supported by Principle 4 (Separation of Duties), allowing for team-specific key policies.
The benefits of this approach:
Maintain granular control over data access.
Implement team-specific encryption policies.
Uphold the principle of least privilege across the organization.
Enhance data security: By using separate keys, the company limits the impact of improper access to any given key, enables more precise access control, facilitates independent key rotation schedules, and improves the ability to monitor and audit key usage for each application.
Simplify alignment with healthcare regulations: Separate keys support data segregation requirements, enable fine-grained role-based access control, provide clear audit trails for each application’s data access, and allow for tailored data lifecycle management. This functionality is crucial for aligning with various healthcare compliance standards such as HIPAA.
Allow for efficient and distributed key management that is tailored to each application team’s needs.
These examples demonstrate how applying the guiding principles can lead to a well-structured key management strategy, tailored to the specific needs of different organizations and use cases.
Considerations for key management
When you implement your key management strategy, several factors need to be considered beyond just the number of keys. This section explores these considerations to help you make informed decisions about your key management approach.
Key types
AWS offers different types of KMS keys, each with its own benefits and use cases.
AWS owned keys are managed by AWS in service accounts, used across multiple customer accounts, and provide no customer visibility or audit capability. Choose AWS owned keys when there are no management or audit requirements for the keys, but encryption of the data at rest is needed.
AWS managed keys are managed entirely by AWS and are used only for your AWS account. Although customers can view these keys in the AWS Management Console and track their usage in AWS CloudTrail logs, they have limited ability to directly control or modify these keys. Choose AWS managed keys when managing keys is not a requirement, but having an audit trail is. It’s worth noting that AWS managed keys are automatically rotated every year, which can be convenient for many use cases.
Customer-managed keys offer the highest level of control and customization, allowing creation of key policies and control over key rotation. However, customer managed keys provide more flexibility, allowing you to set your own rotation schedule or even enable rotation if you are required to do so for regulatory reasons. Choose customer managed keys when you need strict control over key usage and the ability to share keys or control access through key policies, detailed auditing capabilities, alignment with specific compliance requirements, or the ability to integrate key management with your existing processes and tools.
The decision between AWS managed and customer managed keys often comes down to balancing the convenience of automatic management with the need for granular control and customization. As the number of keys increases, so does the complexity of management. More keys mean more policies to create, manage, and audit. Making sure that the right people have access to the right keys becomes more challenging. However, to help audit KMS key access, you can use the IAM Access Analyzer to determine external access to your keys. Managing rotation schedules for multiple keys requires more effort, and more keys mean more policies to analyze and monitor, as well as growing costs.
Cost
Security should be the primary concern, but cost is also a factor. Each customer managed key incurs a monthly storage cost. Both AWS managed and customer managed KMS keys have API usage costs associated with them. Key rotation can increase costs over time, as old key versions are retained.
Manageability
Finding the right balance between security and manageability is crucial. Too few keys might not provide adequate separation of duties or granular access control, while too many keys can lead to increased complexity, higher costs, and potential mismanagement.
Specific requirements
Different industries and regions may have specific requirements for key management. Some regulations might require separation of duties, necessitating multiple keys. Certain compliance standards might dictate specific key rotation or audit trail requirements.
By carefully considering these factors alongside the guiding principles discussed earlier, you can develop a key management strategy that balances security, compliance, cost-effectiveness, and operational efficiency for your specific needs. It is important to approach your KMS strategy holistically, considering not just your immediate security needs, but also the long-term management implications. Regular review and adjustment of your key management strategy will provide assurance that it continues to meet your evolving needs while maintaining robust security and compliance.
Conclusion
As we explored throughout this post, determining the optimal number of AWS KMS keys for your organization is a nuanced decision that balances security, compliance, cost, and operational efficiency. The guiding principles we discussed—data classification, application segregation, AWS service integration, and separation of duties—provide a solid framework for making these decisions. Remember that there’s no one-size-fits-all solution; the right approach depends on your specific needs and circumstances.
As you move forward in implementing or refining your KMS key strategy, consider these next steps: First, conduct a thorough audit of your current data assets, their classifications, and the applications and services that interact with them. Next, map out your ideal key management structure based on the principles we’ve discussed. Then, evaluate the costs and operational overhead of your proposed strategy, adjusting as necessary to find the right balance for your organization. Finally, implement your strategy incrementally, starting with your most sensitive or critical data assets.
Remember that key management is an ongoing process. Regularly review and update your strategy as your data landscape evolves, new compliance requirements emerge, or AWS introduces new features. By thoughtfully applying the principles and considerations we’ve discussed, you can create a robust, scalable, and efficient key management strategy that helps your overall security posture and meets your organization’s unique needs.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
AWS re:Invent 2024, a learning conference hosted by Amazon Web Services (AWS) for the global cloud computing community, will take place December 2–6, 2024, in Las Vegas, Nevada, across multiple venues. At re:Invent, you can join cloud enthusiasts from around the world to hear the latest cloud industry innovations, meet with AWS experts, and build connections. Whether you want to build deep technical expertise, understand how to prioritize your investments, learn more about the infrastructure offerings of the sovereign-by-design AWS Cloud, or see how the AWS Nitro System enables enhanced security for your workloads, re:Invent is a great opportunity to explore our digital sovereignty solutions.
This year, there will be many ways that you can learn about our advanced sovereignty controls, security features, and infrastructure options that can help meet your unique digital sovereignty needs, including sessions and hands-on activities with AWS hybrid and edge services including AWS Local Zones, AWS Dedicated Local Zones, and AWS Outposts. In the Expo, you can visit the Digital Sovereignty & Data Protection kiosk in the AWS Village to watch demos, learn about the upcoming AWS European Sovereign Cloud, and get your questions answered by AWS team members. To see AWS designed chips and Outposts devices, check out the AWS Next Gen Infrastructure Hub in the AWS Village. You can also visit the AWS Partner Network (APN) booth to connect with AWS Digital Sovereignty Partners to learn about the benefits of partner programs.
Breakout sessions and lightning talks
To add sessions to your AWS re:Invent agenda and find time and location information, choose the session title link.
SEC229 | Breakout | Digital sovereignty: overcome complexity and enable future-readiness Max Peterson, VP, Sovereign Cloud, AWS Organizations are facing increasing complexity in an evolving sovereignty landscape. Building a strong digital foundation can help simplify efforts to meet requirements today and prepare your organization for the future, without slowing innovation. Join this session to learn how AWS sovereign cloud offerings, ranging from encryption services to the announced AWS European Sovereign Cloud, provides more control and choice to help meet your unique needs. Discover how customers are keeping critical workloads secure and protected when leveraging new technologies on AWS, including generative AI, and learn about new digital sovereignty solutions offered by AWS Partners.
HYB201 | Breakout | AWS wherever you need it: From the cloud to the edge Jan Hofmeyr, VP, EC2 Networking and Hybrid Edge, AWS, and Jeff Feist, Executive Director – Hosting Solutions, Merck & Co., Inc. While most workloads can be migrated to the cloud, some remain on premises or at the edge due to low latency, local data processing, or digital sovereignty needs. In this session, learn how AWS services like AWS Outposts, AWS Local Zones, AWS Dedicated Local Zones, and AWS IoT Core support hybrid cloud and edge computing workloads such as multiplayer gaming, high-frequency trading, medical imaging, smart manufacturing, and generative AI applications with data residency requirements.
HYB309 | Breakout | Well-architected for data residency with hybrid cloud services Sherry Lin, Principal Product Manager, AWS; Lakshmi VP, Specialist SA – Hybrid Edge, AWS; and Kevin Ng, Senior Director, Core Engineering Products, GovTech With concerns over data privacy, security, and digital sovereignty, many countries across the world are strengthening data residency laws to keep personal and sensitive data within their borders. For organizations operating across multiple geographies, it can be challenging to meet the evolving data residency laws. In this session, following the AWS Well-Architected Framework, explore the best practices around data residency when using hybrid cloud services, including AWS Local Zones, AWS Dedicated Local Zones, and AWS Outposts.
IOT202 | Breakout | AWS IoT for edge LLM deployment and execution Nikit Pednekar, Principal Product Manager, AWS, and Stefano Marzani, WW Tech Leader, SDX, AWS With the advent of generative AI and large language models (LLMs), you must be wondering, how can these technologies be applied at the IoT edge? After all, there are many benefits of running LLMs at the edge—from network bandwidth efficiencies, offline processing, lower latency, and data sovereignty to cost savings, security, and differentiation. In this session, learn how using AWS IoT services and LLMs at the edge can uplift your solutions with actionable outcomes and innovative capabilities, such as gesture recognition, natural language processing for voice control, real-time predictive maintenance, energy optimization, anomaly detection, and more.
KUB310 | Breakout | Amazon EKS for edge and hybrid use cases Chris Splinter, Product Manager, AWS, and Gokul Chandra Purnachandra Reddy, Senior Solutions Architect, AWS There are some workloads that may need to run on-premises, at the edge, or in a hybrid scenario due to low-latency, data dependencies, data sovereignty, or other regulatory reasons, especially in industries such as manufacturing, healthcare, telco, and financial services. Data dependent workloads may have to wait for the data to be on AWS services before fully migrating. In this session, we will share production-ready architectures leveraging services like Amazon EKS Anywhere to run container workloads on-premises and support modernizing VMware-based workloads. Also learn best practices on migration of on-premises Kubernetes deployments to AWS Cloud.
PEX110 | Lightening Talk | Supercharge your growth and capabilities with partner programs Mike Cannady, Director, Partner Core Public Sector, AWS Discover the latest AWS Partner program updates that propel your public sector business forward. Join this lightning talk to explore innovations tailored to partners: generative AI programs, digital sovereignty, solution building, and managed services. Whether you’re starting out or seasoned, glean insights and use cases to elevate your journey. Don’t miss this opportunity to supercharge your development and stay ahead in this ever-evolving landscape.
Interactive sessions (chalk talks and workshops)
HYB304 | Workshop | Implement RAG without compromising on digital sovereignty Aditya Lolla, Senior Solutions Architect, Hybrid Edge, AWS, and Robert Belson, Senior Developer Advocate, AWS As governments and standards bodies develop data protection and privacy regulations, organizations increasingly need to combine the use of generative AI tooling in the cloud with regulated data that need to remain on premises to meet data sovereignty requirements. In this workshop, learn how to extend Agents for Amazon Bedrock to hybrid and edge services like AWS Outposts and AWS Local Zones to build distributed Retrieval Augmented Generation (RAG) applications with on-premises data for improved model outcomes. Get hands-on with Amazon Bedrock, AWS Lambda, and AWS hybrid and edge services, and build Amazon Simple Storage Service (Amazon S3) compliant workflows using a hybrid S3 compatible solution. You must bring your laptop to participate.
WPS207 | Chalk Talk | How AWS can help you meet your digital sovereignty requirements Mehmet Bakkaloglu, Principal Solutions Architect, AWS, and Addy Upreti, Principal Technical Product Manager – Digital Sovereignty, AWS Customers in the public sector and regulated industries such as healthcare, financial services and telecom have shared how they face digital sovereignty concerns in their cloud journey. In this talk, you can learn about how AWS is sovereign-by-design and the range of capabilities that can enable you to meet your digital sovereignty needs. Plus, discover how the AWS European Sovereign Cloud is being built to provide further choice to meet these needs. We’ll talk through how AWS can help accelerate your cloud journey while meeting your requirements.
HYB310 | Chalk Talk | Addressing data residency requirements with hybrid and edge services Sedji Gaouaou, Senior Solutions Architect, AWS, and Fabio Rodriguez, Head of Hybrid Cloud Solutions Architect, AWS Data residency is a critical consideration for organizations that collect and store sensitive information, including personal identifiable information (PII), financial data, healthcare data, or information pertaining to national security. To help organizations operating across multiple geographies drive innovation while meeting data residency requirements, AWS offers multiple global infrastructure offerings like AWS Regions, AWS Dedicated Local Zones, AWS Local Zones, and AWS Outposts. In this interactive chalk talk, learn how these infrastructure offerings can help you accelerate digital transformation while meeting data residency needs.
For a full view of digital sovereignty content, including sessions with partners, explore the AWS re:Invent catalog and filter on the Digital Sovereignty area of interest. Not able to attend in-person? Register for free for the virtual-only pass to livestream keynotes and innovation talks, and access on-demand breakout sessions today. See you in Las Vegas or on the livestream!
If you have feedback about this post, submit comments in the Comments section below.
We’re excited to announce that Kivera, a cloud security, data protection, and compliance company, has joined Cloudflare. This acquisition extends our SASE portfolio to incorporate inline cloud app controls, empowering Cloudflare One customers with preventative security controls for all their cloud services.
In today’s digital landscape, cloud services and SaaS (software as a service) apps have become indispensable for the daily operation of organizations. At the same time, the amount of data flowing between organizations and their cloud providers has ballooned, increasing the chances of data leakage, compliance issues, and worse, opportunities for attackers. Additionally, many companies — especially at enterprise scale — are working directly with multiple cloud providers for flexibility based on the strengths, resiliency against outages or errors, and cost efficiencies of different clouds.
Security teams that rely on Cloud Security Posture Management (CSPM) or similar tools for monitoring cloud configurations and permissions and Infrastructure as code (IaC) scanning are falling short due to detecting issues only after misconfigurations occur with an overwhelming volume of alerts. The combination of Kivera and Cloudflare One puts preventive controls directly into the deployment process, or ‘inline’, blocking errors before they happen. This offers a proactive approach essential to protecting cloud infrastructure from evolving cyber threats, maintaining data security, and accelerating compliance.
An early warning system for cloud security risks
In a significant leap forward in cloud security, the combination of Kivera’s technology and Cloudflare One adds preventive, inline controls to enforce secure configurations for cloud resources. By inspecting cloud API traffic, these new capabilities equip organizations with enhanced visibility and granular controls, allowing for a proactive approach in mitigating risks, managing cloud security posture, and embracing a streamlined DevOps process when deploying cloud infrastructure.
Kivera will add the following capabilities to Cloudflare’s SASE platform:
One-click security: Customers benefit from immediate prevention of the most common cloud breaches caused by misconfigurations, such as accidentally allowing public access or policy inconsistencies.
Enforced cloud tenant control: Companies can easily draw boundaries around their cloud resources and tenants to ensure that sensitive data stays within their organization.
Prevent data exfiltration: Easily set rules to prevent data being sent to unauthorized locations.
Reduce ‘shadow’ cloud infrastructure: Ensure that every interaction between a customer and their cloud provider is in line with preset standards.
Streamline cloud security compliance: Customers can automatically assess and enforce compliance against the most common regulatory frameworks.
Flexible DevOps model: Enforce bespoke controls independent of public cloud setup and deployment tools, minimizing the layers of lock-in between an organization and a cloud provider.
Complementing other cloud security tools: Create a first line of defense for cloud deployment errors, reducing the volume of alerts for customers also using CSPM tools or Cloud Native Application Protection Platforms (CNAPPs).
An intelligent proxy that uses a policy-based approach to
enforce secure configuration of cloud resources.
Better together with Cloudflare One
As a SASE platform, Cloudflare One ensures safe access and provides data controls for cloud and SaaS apps. This integration broadens the scope of Cloudflare’s SASE platform beyond user-facing applications to incorporate increased cloud security through proactive configuration management of infrastructure services, beyond what CSPM and CASB solutions provide. With the addition of Kivera to Cloudflare One, customers now have a unified platform for all their inline protections, including cloud control, access management, and threat and data protection. All of these features are available with single-pass inspection, which is 50% faster than Secure Web Gateway (SWG) alternatives.
With the earlier acquisition of BastionZero, a Zero Trust infrastructure access company, Cloudflare One expanded the scope of its VPN replacement solution to cover infrastructure resources as easily as it does apps and networks. Together Kivera and BastionZero enable centralized security management across hybrid IT environments, and provide a modern DevOps-friendly way to help enterprises connect and protect their hybrid infrastructure with Zero Trust best practices.
Beyond its SASE capabilities, Cloudflare One is integral to Cloudflare’s connectivity cloud, enabling organizations to consolidate IT security tools on a single platform. This simplifies secure access to resources, from developer privileged access to technical infrastructure and expanding cloud services. As Forrester echoes, “Cloudflare is a good choice for enterprise prospects seeking a high-performance, low-maintenance, DevOps-oriented solution.”
The growing threat of cloud misconfigurations
The cloud has become a prime target for cyberattacks. According to the 2023 Cloud Risk Report, CrowdStrike observed a 95% increase in cloud exploitation from 2021 to 2022, with a staggering 288% jump in cases involving threat actors directly targeting the cloud.
Misconfigurations in cloud infrastructure settings, such as improperly set security parameters and default access controls, provide adversaries with an easy path to infiltrate the cloud. According to the 2023 Thales Global Cloud Security Study, which surveyed nearly 3,000 IT and security professionals from 18 countries, 44% of respondents reported experiencing a data breach, with misconfigurations and human error identified as the leading cause, accounting for 31% of the incidents.
Further, according to GartnerⓇ, “Through 2027, 99% of records compromised in cloud environments will be the result of user misconfigurations and account compromise, not the result of an issue with the cloud provider.”1
Several factors contribute to the rise of cloud misconfigurations:
Rapid adoption of cloud services: Leaders are often driven by the scalability, cost-efficiency, and ability to support remote work and real-time collaboration that cloud services offer. These factors enable rapid adoption of cloud services which can lead to unintentional misconfigurations as IT teams struggle to keep up with the pace and complexity of these services.
Complexity of cloud environments: Cloud infrastructure can be highly complex with multiple services and configurations to manage. For example, AWS alone offers 373 services with 15,617 actions and 140,000+ parameters, making it challenging for IT teams to manage settings accurately.
Decentralized management: In large organizations, cloud infrastructure resources are often managed by multiple teams or departments. Without centralized oversight, inconsistent security policies and configurations can arise, increasing the risk of misconfigurations.
Continuous Integration and Continuous Deployment (CI/CD): CI/CD pipelines promote the ability to rapidly deploy, change and frequently update infrastructure. With this velocity comes the increased risk of misconfigurations when changes are not properly managed and reviewed.
Insufficient training and awareness: Employees may lack the cross-functional skills needed for cloud security, such as understanding networks, identity, and service configurations. This knowledge gap can lead to mistakes and increases the risk of misconfigurations that compromise security.
Common exploitation methods
Threat actors exploit cloud services through various means, including targeting misconfigurations, abusing privileges, and bypassing encryption. Misconfigurations such as exposed storage buckets or improperly secured APIs offer attackers easy access to sensitive data and resources. Privilege abuse occurs when attackers gain unauthorized access through compromised credentials or poorly managed identity and access management (IAM) policies, allowing them to escalate their access and move laterally within the cloud environment. Additionally, unencrypted data enables attackers to intercept and decrypt data in transit or at rest, further compromising the integrity and confidentiality of sensitive information.
Here are some other vulnerabilities that organizations should address:
Unrestricted access to cloud tenants: Allowing unrestricted access exposes cloud platforms to data exfiltration by malicious actors. Limiting access to approved tenants with specific IP addresses and service destinations helps prevent unauthorized access.
Exposed access keys: Exposed access keys can be exploited by unauthorized parties to steal or delete data. Requiring encryption for the access keys and restricting their usage can mitigate this risk.
Excessive account permissions: Granting excessive privileges to cloud accounts increases the potential impact of security breaches. Limiting permissions to necessary operations helps prevent lateral movement and privilege escalation by threat actors.
Inadequate network segmentation: Poorly managed network security groups and insufficient segmentation practices can allow attackers to move freely within cloud environments. Drawing boundaries around your cloud resources and tenants ensures that data stays within your organization.
Improper public access configuration: Incorrectly exposing critical services or storage resources to the internet increases the likelihood of unauthorized access and data compromise. Preventing public access drastically reduces risk.
Shadow cloud infrastructure: Abandoned or neglected cloud instances are often left vulnerable to exploitation, providing attackers with opportunities to access sensitive data left behind. Preventing untagged or unapproved cloud resources to be created can reduce the risk of exposure.
Limitations of existing tools
Many organizations turn to CSPM tools to give them more visibility into cloud misconfigurations. These tools often alert teams after an issue occurs, putting security teams in a reactive mode. Remediation efforts require collaboration between security teams and developers to implement changes, which can be time-consuming and resource-intensive. This approach not only delays issue resolution but also exposes companies to compliance and legal risks, while failing to train employees on secure cloud practices. On average, it takes 207 days to identify these breaches and an additional 70 days to contain them.
Addressing the growing threat of cloud misconfigurations requires proactive security measures and continuous monitoring. Organizations must adopt proactive security solutions that not only detect and alert but also prevent misconfigurations from occuring in the first place and enforce best practices. Creating a first line of defense for cloud deployment errors reduces the volume of alerts for customers, especially those also using CSPM tools or CNAPPs.
By implementing these proactive strategies, organizations can safeguard their cloud environments against the evolving landscape of cyber threats, ensuring robust security and compliance while minimizing risks and operational disruptions.
What’s next for Kivera
The Kivera product will not be a point solution add-on. We’re making it a core part of our Cloudflare One offering because integrating features from products like our Secure Web Gateway give customers a comprehensive solution that works better together.
We’re excited to welcome Kivera to the Cloudflare team. Through the end of 2024 and into early 2025, Kivera’s team will focus on integrating their preventive inline cloud app controls directly into Cloudflare One. We are looking for early access testers and teams to provide feedback about what they would like to see. If you’d like early access, please join the waitlist.
[1] Source: Outcome-Driven Metrics You Can Use to Evaluate Cloud Security Controls, Gartner, Charlie Winckless, Paul Proctor, Manuel Acosta, 09/28/2023
GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved.
Amazon Web Services (AWS) customers of various sizes across different industries are pursuing initiatives to better classify and protect the data they store in Amazon Simple Storage Service (Amazon S3). Amazon Macie helps customers identify, discover, monitor, and protect sensitive data stored in Amazon S3. However, it’s important that customers evaluate and test the capabilities of Macie to verify that they can meet their specific data identification and protection goals. In this post, we show you how to define and run a proof of concept (POC) to validate using Macie and automated discovery to enhance your current data protection strategies. The POC steps demonstrate how you can use Macie to detect and alert you to sensitive data discovered in your AWS environment and help you determine the value of using Macie to enhance your current data protection strategies.
Note: This POC uses some features that offer a 30-day free trial and other features that will incur minimal charges during the POC phase. We highlight and summarize these throughout this post.
Data security business challenges
Data security is a broad concept that revolves around protecting digital information from unauthorized access, corruption, theft, and other forms of malicious activity throughout its lifecycle. There’s an exponential growth of digital data and organizations are grappling with not only managing it but also determining where their sensitive data exists. Additionally, many organizations have compliance requirements from government regulators and industry standards, such as PCI DSS or HIPAA. Organizations want to move fast, which means giving developers the tools to build quickly to stay ahead, while making sure that the correct data classification policies are defined and enforced.
Macie features
Amazon Macie is a data security service that discovers sensitive data using machine learning and pattern matching, provides visibility into data security risks, and enables automated protection against those risks. The following is a summary of the key features of Macie, many of which will be used in this POC. The core capabilities of Macie are focused on the security of your S3 buckets and helping to identify sensitive data including financial data, personal data, and credentials as well as sensitive data that’s unique to your organization, such as intellectual property.
S3 bucket security
Customers use Amazon S3 for a variety of use cases and store various types of data in S3 buckets, including sensitive data. Continuously monitoring these buckets for the presence of sensitive data is a vital part of a data protection strategy. Macie gives you visibility into your S3 bucket inventory and the security and access controls associated with your buckets. This visibility includes if the bucket is publicly accessible, the encryption level of the bucket, and if the bucket is shared with other accounts. Whenever the security posture of one of your buckets is reduced, Macie generates a finding about the change, enabling you to respond. These findings are consumable through the AWS Management Console for Macie, through Macie APIs, as Amazon EventBridge messages, or through AWS Security Hub.
Sensitive data discovery jobs
Sensitive data discovery jobs provide a way to target a specific S3 bucket or group of buckets to do a deep analysis of the objects in those buckets and identify if sensitive data is present in the objects and if so, the type of data. These jobs can run on a daily, weekly, or monthly basis for new or changed data or once for on-demand analysis.
Automated data discovery
Macie offers an automated data discovery feature that can continually discover sensitive data within your S3 buckets. This feature is intended to help customers who have large amounts of S3 buckets and data better understand where sensitive data might be stored without having to scan all their data. By using automated data discovery, you can focus your resources on deeper investigations of the security of buckets identified to have sensitive data. Macie selects samples of the objects within S3 buckets and inspects them for the presence of sensitive data daily, providing insight into where sensitive data might reside in your overall Amazon S3 data estate.
POC overview
This POC is intended to help you gain an understanding of what Macie is capable of and how you can use it to achieve your data discovery goals. The POC in this post includes the following tasks in Macie:
Reviewing managed data identifiers
Defining custom data identifiers
Staging POC data
Running a sensitive data discovery job
Reviewing the output of the discovery job
Enabling and reviewing the output of automated data discovery
Note: The amount of time required for each task depends on your preparation and analysis for each stage. Note that, in the automated data discovery phase, it will take 24–48 hours for Macie to perform the first scan after the feature is enabled.
Enable Macie
Macie must be enabled before you can proceed with the POC. If you haven’t yet enabled Macie, see Enable Macie for instructions.
Note: When you enable Macie and the 30-day free trial for S3, monitoring S3 bucket security and privacy is automatically enabled. There’s also a 30-day free trial for automated data discovery, which is covered later in this post. There is no free trial for running targeted data discovery jobs. Review the Macie pricing page for details.
Review managed data identifiers
A successful POC of Macie includes understanding what data Macie can detect. Macie comes with over 150 managed data identifiers that are designed to identify sensitive data in your S3 objects. It’s important to first understand the available managed data identifiers and which ones align with the use cases you want to address. Examples of Macie managed data identifiers include credit card numbers, AWS secret access keys, and national identification numbers. Macie offers a default collection of recommend managed data identifiers to use for detecting general categories and types of sensitive data while optimizing data discovery results and reducing noise.
Keywords are an important component for Macie to be able to detect sensitive data. Many managed data identifiers require keywords to be in proximity of the data for Macie to be able to detect findings. Understanding the keywords that are used as part of sensitive data detection is important when it comes to building test data for a POC.
Prior to beginning your POC, review the list of managed data identifiers and determine which ones you feel will be necessary to use for your data discovery requirements. Additionally, identify which managed data identifiers, which are applicable to your POC, fall outside of the default list of identifiers.
Define custom data identifiers
Macie covers a wide number of use cases with its managed data identifiers, but some use cases need custom data identifiers for data types that aren’t included in the managed data identifiers. For example, customers might need to identify sensitive data that’s specific to their company, such as an employee ID or project number. Other customers might operate in industries that have data types unique to that industry, such as a known traveler number in the airline industry. If your requirements for identifying sensitive data include detecting sensitive data that isn’t part of the current list of managed data identifiers, then you can create custom data identifiers for those data types. For a POC, you might not want to create a custom data identifier for every additional detection. Instead, you can create a few to help confirm that you can use custom data identifiers for sensitive data detection and that Macie can support your data discovery goals. Building custom data identifiers has a thorough explanation of how to define a custom data identifier. Similar to managed data identifiers, custom data identifiers have keyword requirements. Defining detection criteria for custom data identifiers provides details for the types of data that require keywords.
Stage POC data
After reviewing the managed data identifiers provided by Macie and creating the custom data identifiers needed for your POC, it’s time to stage data sets that will help demonstrate the capabilities of these identifiers and better understand how Macie identifies sensitive data. We recommend that you stage data sets that contain sensitive data as well as data sets that do not to gain a full understanding of how Macie detects and reports on each of these situations. You can stage a variety of data sets to use for your POC using just a few GB of data to help keep your initial POC scans’ cost low. Staged data must be in file formats that Macie supports.
When preparing data to stage, keep in mind the keyword requirements for many of the Macie managed data identifiers. To determine which managed data identifiers have keyword requirements, see Managed data identifiers by type. When you’re staging your data, reference the keywords that are supported for the managed data identifiers you are using to help ensure that the data can be identified in your POC tests.
We recommend staging the data in one S3 bucket that’s dedicated to the POC and to use S3 server-side encryption on the bucket. If you want to use a customer managed AWS KMS key to encrypt the S3 data at rest, follow the instructions in Allowing Macie to use a customer managed AWS KMS key to give Macie access to decrypt the data in the bucket. You should also follow best practices for the S3 bucket related to not allowing public access and implementing least privilege access.
You can use one or more of the following approaches to identify and stage data for your POC:
Stage data files created by synthetic data generator tools with sensitive data included. There are many tools available for generating sensitive data. The following are two that you can use to generate test data.
Stage data files from public data repositories. There are various repositories staged with information that could be used for sensitive data detection. These repositories are often comprised of publicly available data sets or were created to help with testing machine learning models or sensitive data detection.
Stage data files of your own data with sensitive information. Because the goal is to use Macie to identify sensitive information in your S3 buckets, including examples of your own data that contains sensitive information can be helpful to test the capabilities of Macie.
Stage data files that don’t contain sensitive information. This can help you understand how Macie handles data that you believe doesn’t contain sensitive information. With the managed data identifiers that Macie offers, you should stage data files that you believe don’t contain information that aligns to the managed data identifiers. The staged data files could be log files, documents, or data sets that meet the criteria of this step.
Stage data that contains information that’s representative of data that you would want to detect using custom data identifiers.
Run a data classification job
Now that you’ve reviewed the managed data identifiers, defined custom data identifiers, and staged sample data, it’s time to run a sensitive data discovery job. When configuring the job scope, we recommend the following:
A specific S3 bucket where the POC data is staged.
The scope is set to be a one-time job.
Leave the sampling depth at 100 percent. Most customers leave this value at 100 percent, but some will lower it if they want a smaller random sample scan of their data. Most customers use automated data discovery to get sample scans instead of adjusting the sampling depth for individual jobs.
Select the recommended managed data identifiers. If your testing requires that Macie identify additional sensitive data types that are offered as managed data identifiers but aren’t part of the recommended list, choose the Custom option and select the managed data identifiers that you need. Make sure that the recommended managed data identifiers are part of the custom list that you construct.
Choose the custom data identifiers that you want to be used in the job.
After you configure your job, give it a name, review the final configuration, and then submit the job to run. A job that uses a data set of a few GB should complete within 30 minutes.
Review the findings from your job
After the job completes, it’s time to review what Macie found in the data. Objects that Macie found with sensitive data will be presented as Findings in the Macie console. From the Jobs screen, choose the job you submitted. In the right-hand window, you will see the overview information for the job. From the overview window you can choose Show results menu and then select Show findings to view a list of the findings that were generated by the job.
Figure 1: Viewing Macie job findings
Each object where Macie found sensitive data will be listed as a single finding. If there were multiple types of sensitive data found in the object, each type of sensitive data and a count will be included in the details. Choose each of the findings that was produced and review the details to confirm what sensitive data was identified and if the sensitive data was discovered as you expected. Additionally, confirm that you don’t have findings for objects that you staged that were not supposed to have sensitive data so that you can confirm how Macie handles these types of objects. If you created custom data identifiers, review findings for the objects that included the custom data that you detect to confirm that the data was detected.
Enable automated discovery
Now that you understand how to use Macie to discover sensitive data, the next step in the POC is to enable automated discovery and use Macie to discover sensitive data across a larger collection of your existing S3 data.
You will be enabling automated discovery in Macie as a 30-day free trial. For the free trial, the scope of total data storage to be evaluated will be 150 GB. Use the following steps to guide your setup of the automated discovery feature:
After your delegated administrator account is configured, enable automated discovery. As part of enabling automated discovery, pay extra attention to the following items:
Set managed data identifiers. Ideally, choose the recommended data identifiers to help reduce noise. If there are specific managed data identifiers that you really want to see, then choose Custom to choose the ones you want.
Include custom data identifiers that you want to be used to evaluate your sensitive data.
Exclude buckets that you don’t want included in the scope for identifying sensitive data.
Include or exclude specific accounts that should be part of the POC. Step 5 of Enable automated discovery covers how to enable it for specific accounts.
You will see the first set of results 24 to 48 hours after you enable automated discovery. After that, you will see updates to the automated discovery results every 24 hours.
After automated discovery starts producing results, you will start seeing data in the Automated Discovery section of the Macie summary page in the console. The summary includes metrics for the total number of buckets eligible for discovery, counts for the number of buckets where sensitive data was or was not found, and how many of these buckets are public.
Figure 2: Example automated discovery summary metrics
Choosing a link for one of the counts will take you to the S3 buckets view with the appropriate filters applied.
After reviewing the summary screen, choose S3 buckets from the navigation pane to see a heatmap that shows each account and the buckets within each account.
Figure 3: S3 bucket heatmap view in Macie
The heatmap provides point-in-time insights into the data that Macie has scanned, and in which buckets sensitive data has been identified or no sensitive data has been found.
Over time, this heatmap might change as automated data discovery continues sampling the data in each bucket. The heatmap view provides information on each organizational member account and insight about sensitive data within each bucket in the account.
The console displays the results as a set of colored squares for each account. Each square represents a bucket in that account and the color of the square indicates whether sensitive data was discovered in that bucket. Red indicates that some type of sensitive data has been found in the bucket, while blue indicates no sensitive data has been identified. If a bucket is blue, that means only that automated data discovery hasn’t identified sensitive data up to the point in time of the last scan, not that there is no sensitive data in the bucket.
Hover over each bucket to see a summary of the sensitivity score of the bucket along with statistics about the data in the bucket. Choosing a bucket in the heatmap will open a window that provides more information about the bucket. This information includes the sensitivity score of the bucket, a summary of the types of sensitive data found in the bucket, which objects within the bucket have been sampled, statistics related to the data that has been scanned and data that is still to be scanned, and other information about the bucket.
Figure 4: S3 bucket sensitivity information in Macie
As part of your POC, it’s recommended that you investigate buckets that are reported to contain sensitive data. For your investigation, validate if the data identified is sensitive based on your organization’s data classification policy. Each Macie finding contains not only the details on the types of sensitive data identified, but also the location within the file where the sensitive data is located so that you can confirm the identified data is sensitive. The following is an example of a Macie finding in an Excel file where bank account numbers were found. This example shows the row and column where the sensitive data was found.
Investigating sensitive data with findings has detailed guidance around locating sensitive data from Macie findings, retrieving the sensitive data, and the schema for sensitive data locations. If the findings are true positives, make sure that the bucket has the right level of security configurations and permissions based on the data stored in the bucket.
In addition to reviewing the bucket level statistics that are generated by automated discovery, you can view the individual findings that were generated for each S3 object that was identified as having sensitive data. You can view the findings through the Findings tab in Macie or by choosing the sensitive data type when looking at the summary detections for a bucket.
Next steps
With the preceding POC, you should now have a more complete understanding of how Macie identifies sensitive data and how you can use the information that Macie provides about that data identified. As you complete your POC, there are a few next steps that other customers have taken after completing their POC with Macie.
Operationalize Macie output
In the POC steps, we outlined how to view the findings and the insights that the automated discovery provides. Before you proceed with Macie, you should have a plan for how you will operationalize the Macie output. This will help ensure that the remediation steps for identified sensitive data are directed to the correct parties.
Depending on what operational tools you have in place, the steps that you take to operationalize the output of Macie will vary. Many customers implement operational processes that cover the following areas:
The privacy or security team that’s responsible for Macie does an initial triage on the findings to confirm if there are true positives.
The privacy or security team defines their operational processes related to the summary dashboard and bucket sensitivity information that’s provided by automated discovery.
The team that owns the bucket where the sensitive data is located is provided the information needed to investigate the identified data and provide a response. Responses will vary but could include removal of the data, correcting the application that is writing the data, or applying additional security to the bucket.
A key part of operationalizing Macie output is also how you are getting and reviewing the signals related to Macie findings. As outlined in this post, you can view and investigate findings in the Macie console. At a steady state, customers find it beneficial to consume Macie findings using Amazon EventBridge, through Macie APIs, or through the integration of Macie with AWS Security Hub. These options can be used to incorporate the findings that are reported by Macie into the operational workflows and tools that work best for your organization and allow you to consume and address findings at scale. Monitoring and processing findings provides more insight into these integration approaches.
Store and retain sensitive data discovery results
As you move forward with Macie, it’s important to enable logging for each of the S3 objects that Macie has scanned. Macie creates an analysis record for each S3 object that’s in scope for a data discovery job or an automated discovery scan. These records are an audit of every object that Macie attempted to scan, including objects that didn’t contain sensitive data. Data discovery results are written to an S3 bucket that you own and where you control the data retention. These data discovery results can assist with analysis of records over time or to get a broader sense of what data Macie has scanned and which objects had sensitive data and which did not.
The data discovery scan results are stored as JSON Lines files. You can use multiple approaches to analyze and query these log files. The Amazon Macie Results Analytics GitHub repository provides instructions on how to configure Amazon Athena with tables that allow you to query the results of Macie discovery scan files.
Define the scope for Macie use
After a POC with Macie, you can set the scope of how you will use Macie in production by deciding which buckets don’t need to be evaluated and so can be excluded, such as buckets used for AWS logs and buckets deemed not in scope for sensitive data identification.
You can also refine the managed data identifiers that are required for detecting sensitive data. Based on what they learned from the POC. You can create custom data identifiers to help meet your data detection needs if necessary.
As you identify the sensitive data discovery jobs to run as part of your production use of Macie, keep in mind that these jobs are immutable. To edit an existing scheduled job, you must create a copy of the job with the updated configuration and cancel the original job for changes to take effect. Depending on the updated criteria for your job, you must do one of the following to help avoid re-scanning objects that were covered by the previous job:
Clear include existing objects when setting the scope. This will cause the new job to schedule only objects that were created between when it was first created and when it was run for the first time.
When setting the job scope, set the object attributes so that the object last modified date is the date of the last time the previous job ran. This helps ensure that when the new scheduled job runs, it evaluates only objects that weren’t covered by the previous job.
Clean up
To avoid incurring additional charges, disable Macie while you evaluate the value of the additional data protection provided. If S3 buckets were created for this POC, those should also be deleted.
Call to action
After the POC is complete, evaluate the results to determine how much using Macie can strengthen your organization’s data protection program. Based on that evaluation, you can identify how and where to use Macie for maximum effectiveness. Also consider how the information provided can be factored into operational workflows to get additional value from Macie.
Conclusion
This post outlined how you can use a POC to better understand how Amazon Macie can help meet your data discovery and classification needs. A well thought-out and implemented POC can provide valuable early insights and help you develop a more thorough understanding of what your data discovery and classification strategy should be. Planning your POC using the guidance in this post can help you determine more quickly if Macie is a fit for your company.
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
I am the Chief of Security Architecture at Inrupt, Inc., the company that is commercializing Tim Berners-Lee’s Solid open W3C standard for distributed data ownership. This week, we announced a digital wallet based on the Solid architecture.
Details are here, but basically a digital wallet is a repository for personal data and documents. Right now, there are hundreds of different wallets, but no standard. We think designing a wallet around Solid makes sense for lots of reasons. A wallet is more than a data store—data in wallets is for using and sharing. That requires interoperability, which is what you get from an open standard. It also requires fine-grained permissions and robust security, and that’s what the Solid protocols provide.
I think of Solid as a set of protocols for decoupling applications, data, and security. That’s the sort of thing that will make digital wallets work.
The collective thoughts of the interwebz
Manage Consent
To provide the best experiences, we use technologies like cookies to store and/or access device information. Consenting to these technologies will allow us to process data such as browsing behavior or unique IDs on this site. Not consenting or withdrawing consent, may adversely affect certain features and functions.
Functional
Always active
The technical storage or access is strictly necessary for the legitimate purpose of enabling the use of a specific service explicitly requested by the subscriber or user, or for the sole purpose of carrying out the transmission of a communication over an electronic communications network.
Preferences
The technical storage or access is necessary for the legitimate purpose of storing preferences that are not requested by the subscriber or user.
Statistics
The technical storage or access that is used exclusively for statistical purposes.The technical storage or access that is used exclusively for anonymous statistical purposes. Without a subpoena, voluntary compliance on the part of your Internet Service Provider, or additional records from a third party, information stored or retrieved for this purpose alone cannot usually be used to identify you.
Marketing
The technical storage or access is required to create user profiles to send advertising, or to track the user on a website or across several websites for similar marketing purposes.