Tag Archives: security

AWS KMS is now FIPS 140-2 Security Level 3. What does this mean for you?

2023-11-07 Rushir Patel

Post Syndicated from Rushir Patel original https://aws.amazon.com/blogs/security/aws-kms-now-fips-140-2-level-3-what-does-this-mean-for-you/

AWS Key Management Service (AWS KMS) recently announced that its hardware security modules (HSMs) received Federal Information Processing Standards (FIPS) 140-2 Security Level 3 certification from the U.S. National Institute of Standards and Technology (NIST). For organizations that rely on AWS cryptographic services, this higher security level validation has several benefits, including simpler setup and operation. In this post, we share more details about the recent change in FIPS validation status for AWS KMS and explain how this change benefits customers who use AWS cryptographic services.

Background on NIST FIPS 140

The FIPS 140 framework provides guidelines and requirements for cryptographic modules that protect sensitive information. FIPS 140 is the industry standard in the US and Canada and is recognized around the world as providing authoritative certification and validation for the way that cryptographic modules are designed, implemented, and tested against NIST cryptographic security guidelines.

Organizations follow FIPS 140 to help ensure that their cryptographic security is aligned with government standards. FIPS 140 validation is also required in certain fields such as manufacturing, healthcare, and finance and is included in several industry and regulatory compliance frameworks, such as the Payment Card Industry Data Security Standard (PCI DSS), the Federal Risk and Authorization Management Program (FedRAMP), and the Health Information Trust Alliance (HITRUST) framework. FIPS 140 validation is recognized in many jurisdictions around the world, so organizations that operate globally can use FIPS 140 certification internationally.

For more information on FIPS Security Levels and requirements, see FIPS Pub 140-2: Security Requirements for Cryptographic Modules.

What FIPS 140-2 Security Level 3 means for AWS KMS and you

Until recently, AWS KMS had been validated at Security Level 2 overall and at Security Level 3 in the following four sub-categories:

Cryptographic module specification
Roles, services, and authentication
Physical security
Design assurance

The latest certification from NIST means that AWS KMS is now validated at Security Level 3 overall in each sub-category. As a result, AWS assumes more of the shared responsibility model, which will benefit customers for certain use cases. Security Level 3 certification can assist organizations seeking compliance with several industry and regulatory standards. Even though FIPS 140 validation is not expressly required in a number of regulatory regimes, maintaining stronger, easier-to-use encryption can be a powerful tool for complying with FedRAMP, U.S. Department of Defense (DOD) Approved Product List (APL), HIPAA, PCI, the European Union’s General Data Protection Regulation (GDPR), and the ISO 27001 standard for security management best practices and comprehensive security controls.

Customers who previously needed to meet compliance requirements for FIPS 140-2 Level 3 on AWS were required to use AWS CloudHSM, a single-tenant HSM solution that provides dedicated HSMs instead of managed service HSMs. Now, customers who were using CloudHSM to help meet their compliance obligations for Level 3 validation can use AWS KMS by itself for key generation and usage. Compared to CloudHSM, AWS KMS is typically lower cost and easier to set up and operate as a managed service, and using AWS KMS shifts the responsibility for creating and controlling encryption keys and operating HSMs from the customer to AWS. This allows you to focus resources on your core business instead of on undifferentiated HSM infrastructure management tasks.

AWS KMS uses FIPS 140-2 Level 3 validated HSMs to help protect your keys when you request the service to create keys on your behalf or when you import them. The HSMs in AWS KMS are designed so that no one, not even AWS employees, can retrieve your plaintext keys. Your plaintext keys are never written to disk and are only used in volatile memory of the HSMs while performing your requested cryptographic operation.
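As a rough illustration of asking the service to create a key on your behalf and then using it, here is a minimal Python sketch. It assumes the `kms` argument behaves like `boto3.client("kms")`; the function names and the key description are hypothetical and are not part of the original post.

```python
from typing import Any


def encrypt_with_new_key(kms: Any, plaintext: bytes, description: str) -> dict:
    """Create a symmetric KMS key and encrypt a small payload with it.

    `kms` is assumed to behave like boto3.client("kms"); it is passed in
    so that the function can be exercised without AWS credentials.
    """
    # Ask AWS KMS to generate the key material on your behalf.
    key = kms.create_key(
        Description=description,
        KeySpec="SYMMETRIC_DEFAULT",
        KeyUsage="ENCRYPT_DECRYPT",
    )
    key_id = key["KeyMetadata"]["KeyId"]

    # Encrypt accepts only small payloads (up to 4 KB) for symmetric keys.
    encrypted = kms.encrypt(KeyId=key_id, Plaintext=plaintext)
    return {"key_id": key_id, "ciphertext": encrypted["CiphertextBlob"]}


def decrypt(kms: Any, ciphertext: bytes) -> bytes:
    """Decrypt a ciphertext that was produced by a symmetric KMS key."""
    # For symmetric keys, KMS identifies the key from the ciphertext itself.
    return kms.decrypt(CiphertextBlob=ciphertext)["Plaintext"]
```

With real credentials you would pass `boto3.client("kms")` as the first argument. The plaintext key never appears in this code; only the key ID and the ciphertext are handled by the caller.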

The FIPS 140-2 Level 3 certified HSMs in AWS KMS are deployed in all AWS Regions, including the AWS GovCloud (US) Regions. The China (Beijing) and China (Ningxia) Regions do not support the FIPS 140-2 Cryptographic Module Validation Program. AWS KMS uses Office of the State Commercial Cryptography Administration (OSCCA) certified HSMs to protect KMS keys in China Regions. The certificate for the AWS KMS FIPS 140-2 Security Level 3 validation is available on the NIST Cryptographic Module Validation Program website.

As with many industry and regulatory frameworks, FIPS 140 is evolving. NIST approved and published an updated version of the standard, FIPS 140-3, which supersedes FIPS 140-2. The U.S. government has begun transitioning to the FIPS 140-3 cryptography standard, with NIST announcing that it will retire all FIPS 140-2 certificates on September 22, 2026. NIST recently validated AWS-LC under FIPS 140-3 and is currently evaluating AWS KMS and certain instance types of AWS CloudHSM under the FIPS 140-3 standard. To check the status of these evaluations, see the NIST Modules In Process List.

For more information on FIPS 140-3, see FIPS Pub 140-3: Security Requirements for Cryptographic Modules.

Legal Disclaimer

This document is provided for the purposes of information only; it is not legal advice, and should not be relied on as legal advice. Customers are responsible for making their own independent assessment of the information in this document. This document: (a) is for informational purposes only, (b) represents current AWS product offerings and practices, which are subject to change without notice, and (c) does not create any commitments or assurances from AWS and its affiliates, suppliers or licensors. AWS products or services are provided “as is” without warranties, representations, or conditions of any kind, whether express or implied. The responsibilities and liabilities of AWS to its customers are controlled by AWS agreements, and this document is not part of, nor does it modify, any agreement between AWS and its customers.

AWS encourages its customers to obtain appropriate advice on their implementation of privacy and data protection environments, and more generally, applicable laws and other obligations relevant to their business.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Aggregating, searching, and visualizing log data from distributed sources with Amazon Athena and Amazon QuickSight

2023-11-02 Pratima Singh

Post Syndicated from Pratima Singh original https://aws.amazon.com/blogs/security/aggregating-searching-and-visualizing-log-data-from-distributed-sources-with-amazon-athena-and-amazon-quicksight/

Customers using Amazon Web Services (AWS) can use a range of native and third-party tools to build workloads based on their specific use cases. Logs and metrics are foundational components in building effective insights into the health of your IT environment. In a distributed and agile AWS environment, customers need a centralized and holistic solution to visualize the health and security posture of their infrastructure.

You can effectively categorize the members of the teams involved using the following roles:

Executive stakeholder: Owns and operates with their support staff and has total financial and risk accountability.
Data custodian: Aggregates related data sources while managing cost, access, and compliance.
Operator or analyst: Uses security tooling to monitor, assess, and respond to related events such as service disruptions.

In this blog post, we focus on the data custodian role. We show you how you can visualize metrics and logs centrally with Amazon QuickSight irrespective of the service or tool generating them. We use Amazon Simple Storage Service (Amazon S3) for storage, AWS Glue for cataloguing, and Amazon Athena for querying the data and creating structured query language (SQL) views for QuickSight to consume.

Target architecture

This post guides you towards building a target architecture in line with the AWS Well-Architected Framework. The tiered and multi-account target architecture, shown in Figure 1, uses account-level isolation to separate responsibilities across the various roles identified above and makes access management more defined and specific to those roles. The workload accounts generate the telemetry around the applications and infrastructure. The data custodian account is where the data lake is deployed and collects the telemetry. The operator account is where the queries and visualizations are created.

Throughout the post, we mention AWS services that reduce the operational overhead in one or more stages of the architecture.

Figure 1: Data visualization architecture

Ingestion

Irrespective of the technology choices, applications and infrastructure configurations should generate metrics and logs that report on resource health and security. The format of the logs depends on which tool and which part of the stack is generating the logs. For example, the format of log data generated by application code can capture bespoke and additional metadata deemed useful from a workload perspective as compared to access logs generated by proxies or load balancers. For more information on types of logs and effective logging strategies, see Logging strategies for security incident response.

Amazon S3 is a scalable, highly available, durable, and secure object storage that you will use as the storage layer. To build a solution that captures events agnostic of the source, you must forward data as a stream to the S3 bucket. Based on the architecture, there are multiple tools you can use to capture and stream data into S3 buckets. Some tools support integration with S3 and directly stream data to S3. Resources like servers and virtual machines need forwarding agents such as Amazon Kinesis Agent, Amazon CloudWatch agent, or Fluent Bit.

Amazon Kinesis Data Streams provides a scalable data streaming environment. Using on-demand capacity mode eliminates the need for capacity provisioning and capacity management for streaming workloads. For log data and metric collection, you should use on-demand capacity mode, because log data generation can be unpredictable depending on the requests that are being handled by the environment. Amazon Kinesis Data Firehose can convert the format of your input data from JSON to Apache Parquet before storing the data in Amazon S3. Parquet is a columnar format with built-in compression, and using its native partitioning and compression allows for faster queries than JSON-formatted objects.
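The two settings described above can be sketched in Python as follows. This is a minimal sketch, not a complete delivery pipeline: the `kinesis` argument is assumed to behave like `boto3.client("kinesis")`, and the function names, the Snappy compression choice, and all argument values are illustrative assumptions.

```python
from typing import Any


def create_on_demand_stream(kinesis: Any, stream_name: str) -> None:
    """Create a Kinesis data stream that uses on-demand capacity mode."""
    # No ShardCount is passed: in on-demand mode the service manages capacity.
    kinesis.create_stream(
        StreamName=stream_name,
        StreamModeDetails={"StreamMode": "ON_DEMAND"},
    )


def parquet_conversion_config(role_arn: str, database: str, table: str, region: str) -> dict:
    """Build the JSON-to-Parquet conversion settings for a Firehose S3 destination.

    The schema comes from an AWS Glue table; every argument value is a
    placeholder that you would replace with your own resources.
    """
    return {
        "Enabled": True,
        # Parse incoming records as JSON.
        "InputFormatConfiguration": {"Deserializer": {"OpenXJsonSerDe": {}}},
        # Write Parquet; Snappy compression is an assumption for this sketch.
        "OutputFormatConfiguration": {
            "Serializer": {"ParquetSerDe": {"Compression": "SNAPPY"}}
        },
        "SchemaConfiguration": {
            "RoleARN": role_arn,
            "DatabaseName": database,
            "TableName": table,
            "Region": region,
        },
    }
```

The returned dictionary is intended to be used as the `DataFormatConversionConfiguration` value inside the `ExtendedS3DestinationConfiguration` that you pass when you create a Firehose delivery stream.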

Scalable data lake

Use AWS Lake Formation to build, secure, and manage the data lake to store log and metric data in S3 buckets. We recommend using tag-based access control and named resources to share the data in your data store across accounts so that operators can build visualizations. Data custodians should configure access for relevant datasets to the operators who can use Athena to perform complex queries and build compelling data visualizations with QuickSight, as shown in Figure 2. For cross-account permissions, see Use Amazon Athena and Amazon QuickSight in a cross-account environment. You can also use Amazon DataZone to build additional governance and share data at scale within your organization. Note that the data lake is different to and separate from the Log Archive bucket and account described in Organizing Your AWS Environment Using Multiple Accounts.
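To make tag-based access control more concrete, the following Python sketch builds the arguments for a Lake Formation grant that gives a principal read access to every table carrying a given LF-tag. The function name, the tag key and values, and the principal are hypothetical, and the permission set (SELECT and DESCRIBE) is an assumption about what a read-only operator might need.

```python
from typing import Iterable


def lf_tag_grant(principal: str, tag_key: str, tag_values: Iterable[str]) -> dict:
    """Build keyword arguments for a tag-based Lake Formation grant.

    `principal` is an IAM ARN or an AWS account ID for cross-account sharing;
    all values used here are placeholders.
    """
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal},
        "Resource": {
            "LFTagPolicy": {
                # Apply the grant to tables whose LF-tags match the expression.
                "ResourceType": "TABLE",
                "Expression": [{"TagKey": tag_key, "TagValues": list(tag_values)}],
            }
        },
        # Read-only access: enough to query with Athena, not to modify data.
        "Permissions": ["SELECT", "DESCRIBE"],
    }
```

Assuming a client created with `boto3.client("lakeformation")`, the result could be passed as `lakeformation.grant_permissions(**lf_tag_grant(...))`. Building the arguments separately keeps the grant easy to review before it is applied.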

Figure 2: Account structure

Amazon Security Lake

Amazon Security Lake is a fully managed security data lake service. You can use Security Lake to automatically centralize security data from AWS environments, SaaS providers, on-premises, and third-party sources into a purpose-built data lake that’s stored in your AWS account. Using Security Lake reduces the operational effort involved in building a scalable data lake, as the service automates the configuration and orchestration for the data lake with Lake Formation. Security Lake automatically transforms logs into a standard schema, the Open Cybersecurity Schema Framework (OCSF), and parses them into a standard directory structure, which allows for faster queries. For more information, see How to visualize Amazon Security Lake findings with Amazon QuickSight.

Querying and visualization

Figure 3: Data sharing overview

After you’ve configured cross-account permissions, you can use Athena as the data source to create a dataset in QuickSight, as shown in Figure 3. You start by signing up for a QuickSight subscription. There are multiple ways to sign in to QuickSight; this post uses AWS Identity and Access Management (IAM) for access. To use QuickSight with Athena and Lake Formation, you first must authorize connections through Lake Formation. After permissions are in place, you can add datasets. You should verify that you’re using QuickSight in the same AWS Region as the Region where Lake Formation is sharing the data. You can do this by checking the Region in the QuickSight URL.
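If you want to script the Region check, a small helper such as the following can extract the Region from a QuickSight URL. It assumes that the URL host has the form `<region>.quicksight.aws.amazon.com`; that format and the function names are assumptions made for this sketch, so verify them against the URL you see in your own browser.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumed host suffix of QuickSight console URLs.
_QUICKSIGHT_SUFFIX = ".quicksight.aws.amazon.com"


def quicksight_region(url: str) -> Optional[str]:
    """Return the Region prefix of a QuickSight URL, or None if it does not match."""
    host = urlparse(url).hostname or ""
    if host.endswith(_QUICKSIGHT_SUFFIX):
        # Everything before the suffix is treated as the Region name.
        return host[: -len(_QUICKSIGHT_SUFFIX)] or None
    return None


def same_region(url: str, lake_formation_region: str) -> bool:
    """Check that QuickSight is open in the Region where the data is shared."""
    return quicksight_region(url) == lake_formation_region
```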

You can start with basic queries and visualizations as described in Query logs in S3 with Athena and Create a QuickSight visualization. Depending on the nature and origin of the logs and metrics that you want to query, you can use the examples published in Running SQL queries using Amazon Athena. To build custom analytics, you can create views with Athena. Views in Athena are logical tables that you can use to query a subset of data. Views help you to hide complexity and minimize maintenance when querying large tables. Use views as a source for new datasets to build specific health analytics and dashboards.
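As an illustration of creating a view, the sketch below builds a `CREATE OR REPLACE VIEW` statement and submits it to Athena. The `athena` argument is assumed to behave like `boto3.client("athena")`; the view, table, and column names in the usage note are hypothetical.

```python
from typing import Any, Sequence


def create_view_sql(view: str, source_table: str, columns: Sequence[str], where: str) -> str:
    """Build a statement that exposes a filtered subset of a table as a view.

    Identifiers are interpolated directly, so pass only trusted values.
    """
    column_list = ", ".join(columns)
    return (
        f"CREATE OR REPLACE VIEW {view} AS "
        f"SELECT {column_list} FROM {source_table} WHERE {where}"
    )


def run_query(athena: Any, sql: str, database: str, output_location: str) -> str:
    """Submit a query to Athena and return its execution ID."""
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        # Athena writes query results to this S3 location.
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

For example, `create_view_sql("server_errors", "access_logs", ["request_time", "status", "path"], "status >= 500")` produces a view of failed requests that you could then select as the source for a new QuickSight dataset.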

You can also use Amazon QuickSight Q to get started on your analytics journey. Powered by machine learning, Q uses natural language processing to provide insights into the datasets. After the dataset is configured, you can use Q to give you suggestions for questions to ask about the data. Q understands business language and generates results based on relevant phrases detected in the questions. For more information, see Working with Amazon QuickSight Q topics.

Conclusion

Logs and metrics offer insights into the health of your applications and infrastructure. It’s essential to build visibility into the health of your IT environment so that you can understand what good health looks like and identify outliers in your data. These outliers can be used to identify thresholds and feed into your incident response workflow to help identify security issues. This post helps you build out a scalable centralized visualization environment irrespective of the source of log and metric data.

This post is part 1 of a series that helps you dive deeper into the security analytics use case. In part 2, How to visualize Amazon Security Lake findings with Amazon QuickSight, you will learn how you can use Security Lake to reduce the operational overhead involved in building a scalable data lake and centralizing log data from SaaS providers, on-premises, AWS, and third-party sources into a purpose-built data lake. You will also learn how you can integrate Athena with Security Lake and create visualizations with QuickSight of the data and events captured by Security Lake.

Part 3, How to share security telemetry per Organizational Unit using Amazon Security Lake and AWS Lake Formation, dives deeper into how you can query security posture using AWS Security Hub findings integrated with Security Lake. You will also use the capabilities of Athena and QuickSight to visualize security posture in a distributed environment.


Refine permissions for externally accessible roles using IAM Access Analyzer and IAM action last accessed

2023-11-01 Nini Ren

Post Syndicated from Nini Ren original https://aws.amazon.com/blogs/security/refine-permissions-for-externally-accessible-roles-using-iam-access-analyzer-and-iam-action-last-accessed/

When you build on Amazon Web Services (AWS) across accounts, you might use an AWS Identity and Access Management (IAM) role to allow an authenticated identity from outside your account—such as an IAM entity or a user from an external identity provider—to access the resources in your account. IAM roles have two types of policies attached to them: a trust policy that allows access to an external entity, and a permissions policy that defines what actions the role can take. This blog post focuses on how to use AWS Identity and Access Management Access Analyzer cross-account access findings and IAM action last accessed information to refine the permissions policies of your IAM roles that have a trust policy.

IAM Access Analyzer helps you set, verify, and refine permissions. To learn more about how IAM Access Analyzer guides you toward least-privilege permissions, visit Using AWS IAM Access Analyzer. Action last accessed information helps you identify unused permissions and refine the access of your IAM roles to only the actions they use. IAM now provides action last accessed information for more than 140 services such as Amazon Kinesis Data Streams and Data Firehose, Amazon DynamoDB, and Amazon Simple Queue Service (Amazon SQS).

This blog post walks you through how to use IAM Access Analyzer and action last accessed to refine the required permissions for your IAM roles that have a trust policy, which allows entities outside of your account to assume a role and access your resources.

Use IAM roles to grant access to an external entity

You can create an IAM role that grants permissions for an entity outside your account to access the resources in your account. For example, if you’re an application developer, you might grant cross-account access to your AWS resources by using a role and attaching a trust policy to the role.

To allow an external entity access to your resources by using a role, you first create a role with a role trust policy to grant access to entities outside your account, and then grant permissions that specify which actions the role can take. The external entities can then assume the role in your account and access your resources based on the permissions you granted to the role. See Cross-account access using roles for more information.
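The first half of this process, creating a role with a trust policy, might look like the following Python sketch. The `iam` argument is assumed to behave like `boto3.client("iam")`; the function names, the account ID in the usage note, and the optional external ID condition are illustrative assumptions.

```python
import json
from typing import Any, Optional


def cross_account_trust_policy(external_account_id: str, external_id: Optional[str] = None) -> dict:
    """Build a role trust policy that lets another account assume the role."""
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{external_account_id}:root"},
        "Action": "sts:AssumeRole",
    }
    if external_id is not None:
        # Optionally require the caller to present an agreed external ID.
        statement["Condition"] = {"StringEquals": {"sts:ExternalId": external_id}}
    return {"Version": "2012-10-17", "Statement": [statement]}


def create_external_role(iam: Any, role_name: str, trust_policy: dict) -> str:
    """Create the role and return its ARN; permissions are attached separately."""
    response = iam.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(trust_policy),
    )
    return response["Role"]["Arn"]
```

For example, `cross_account_trust_policy("111122223333")` trusts a placeholder external account. The trust policy controls only who can assume the role; what the role can do is defined by the permissions policy that you attach afterward.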

You should restrict the access of roles that grant access outside of your account to just the permissions required to perform a specific task.

Use IAM Access Analyzer cross-account access findings to identify roles that grant access to external entities

When you use role trust policies to grant account access to entities outside your account, those entities can access and take the allowed actions on your resources. IAM Access Analyzer continuously monitors your account to identify the resources in your account that can be accessed from outside your account and helps you verify whether the access permissions meet your intent. For the example in this post, if you were to add a new trust policy to your ApplicationRole to grant permissions to an external account to access an application in your account, IAM Access Analyzer would let you know that ApplicationRole is accessible by entities from outside your account.
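You can also retrieve these findings programmatically. The following sketch lists the roles that have active findings; it assumes that the `analyzer` argument behaves like `boto3.client("accessanalyzer")` and that you already have an analyzer ARN. The function name is hypothetical.

```python
from typing import Any, List


def externally_accessible_roles(analyzer: Any, analyzer_arn: str) -> List[str]:
    """Return the ARNs of IAM roles that have active external-access findings."""
    roles = set()
    token = None
    while True:
        kwargs = {
            "analyzerArn": analyzer_arn,
            # Restrict results to active findings about IAM roles.
            "filter": {
                "resourceType": {"eq": ["AWS::IAM::Role"]},
                "status": {"eq": ["ACTIVE"]},
            },
        }
        if token:
            kwargs["nextToken"] = token
        page = analyzer.list_findings(**kwargs)
        roles.update(finding["resource"] for finding in page.get("findings", []))
        # Keep requesting pages until the service stops returning a token.
        token = page.get("nextToken")
        if not token:
            return sorted(roles)
```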

Use IAM action last accessed information to identify and remove unused permissions

After you’ve identified the IAM roles that grant access to entities outside your account, review what those roles can do and remove unused permissions. You can use action last accessed to show you the latest timestamp when your IAM role used an action, analyze its access permissions, and remove unused permissions.
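Action last accessed information is also available through the IAM API. The sketch below requests an action-level report for a role and returns the tracked actions that have no last accessed timestamp. It assumes that the `iam` argument behaves like `boto3.client("iam")`; the function name is hypothetical, and pagination of the report is left out for brevity.

```python
import time
from typing import Any, List


def unused_actions(iam: Any, role_arn: str, poll_seconds: float = 1.0) -> List[str]:
    """Return tracked actions that the role has not used in the tracking period."""
    job_id = iam.generate_service_last_accessed_details(
        Arn=role_arn,
        Granularity="ACTION_LEVEL",
    )["JobId"]

    # The report is generated asynchronously, so poll until it is ready.
    while True:
        details = iam.get_service_last_accessed_details(JobId=job_id)
        if details["JobStatus"] != "IN_PROGRESS":
            break
        time.sleep(poll_seconds)
    if details["JobStatus"] != "COMPLETED":
        raise RuntimeError(f"last accessed job ended with status {details['JobStatus']}")

    unused = []
    for service in details.get("ServicesLastAccessed", []):
        for action in service.get("TrackedActionsLastAccessed", []):
            # An action without a timestamp was not used in the tracking period.
            if "LastAccessedTime" not in action:
                unused.append(f"{service['ServiceNamespace']}:{action['ActionName']}")
    return sorted(unused)
```

Only actions that action last accessed tracks appear in the report, so an empty result does not prove that every permission on the role is in use.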

Refine permissions for externally accessible roles by using IAM Access Analyzer cross-account access findings and action last accessed information

This example demonstrates how you can combine the information from IAM Access Analyzer cross-account access findings and IAM action last accessed information to identify roles that can be assumed from outside your account, review unused and unnecessary actions, and reduce the permissions available to external roles.

To view action last accessed information in the IAM console

Open the AWS Management Console and go to the IAM console, and then select Access analyzer in the navigation pane.
If you’ve already created an analyzer, go to Step 3. Otherwise, follow Identify Unintended Resource Access with IAM Access Analyzer to create an analyzer.
Review your findings on the IAM Access Analyzer tab.
Under Active findings, for Filter active findings, enter AWS::IAM::Role. The list of Active findings shows you the roles that can be accessed by entities outside your account.

Figure 1: Findings filtered by resource types

Under the Finding ID column, select a finding for a role (for example, ApplicationRole) that you want to review.
A new page for the Finding ID will appear. Choose the resource ARN link in the Resource field under the Details section.

Figure 2: Findings page

A new page for the role will appear. Select the Access Advisor tab to review the last accessed information of your services for this role. This tab displays the AWS services to which the role has permissions. Action last accessed reports the actions listed in the IAM action last accessed information services and actions. The tracking period for services is the last 400 days—fewer if your AWS Region began tracking within the last 400 days. Learn more about Where AWS tracks last accessed information.

Figure 3: Last accessed information of allowed services

In this exercise, we will use DynamoDB as an example. Under Allowed services, for Search, enter Amazon DynamoDB and under the Service column, choose Amazon DynamoDB. This will take you to a new section titled Allowed management actions for Amazon DynamoDB, which displays the action last accessed information of your role for DynamoDB. The Action column displays each action to which the role has permissions, the Last accessed column displays the timestamp of when access was last attempted, and the Region accessed column displays the Region in which access was last attempted.
You can sort the actions by choosing the arrow next to Last accessed.

Figure 4: Action last accessed information for Amazon DynamoDB

Because you want to remove unused permissions, filter for all unused actions for the role by selecting Services not accessed from the Last accessed dropdown list. This will show you the actions that haven’t been accessed during the tracking period.

Figure 5: Action last accessed information ordered by not accessed

To return to the service view, choose Back to Allowed services and then select the Permissions tab. Select the plus sign to the left of DynamoDBAccess to see the JSON of the customer managed policy.

Figure 6: The JSON code of the customer managed policy

Choose Edit, remove dynamodb:*, and replace it with only the actions that have been used recently, such as DescribeTable and DescribeKinesisStreamingDestination. Not all actions are reported by action last accessed. Review the list of actions that action last accessed information reports and when action last accessed started tracking the action for the service in an AWS Region.
Choose Next and then Save changes. Return to the Access Advisor tab to confirm that all the retained permissions have been used recently.
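The policy edit in the preceding steps can be expressed as a small transformation on the policy document. This is a minimal sketch that assumes the policy's Statement element is a list and that wildcard access appears as a literal `dynamodb:*` entry; the function name and the sample policy in the usage note are hypothetical.

```python
import copy
from typing import Iterable


def replace_wildcard(policy: dict, service: str, used_actions: Iterable[str]) -> dict:
    """Return a copy of the policy with `<service>:*` narrowed to the used actions."""
    refined = copy.deepcopy(policy)  # leave the caller's policy untouched
    wildcard = f"{service}:*"
    narrowed = [f"{service}:{name}" for name in used_actions]
    for statement in refined["Statement"]:
        actions = statement.get("Action")
        if actions is None:
            continue  # for example, statements that use NotAction
        if isinstance(actions, str):
            actions = [actions]
        if wildcard in actions:
            # Keep the other actions and swap the wildcard for the used ones.
            kept = [action for action in actions if action != wildcard]
            statement["Action"] = kept + narrowed
    return refined
```

For example, calling `replace_wildcard(policy, "dynamodb", ["DescribeTable", "DescribeKinesisStreamingDestination"])` on a policy whose only action is `dynamodb:*` yields a policy that allows just those two actions. Because not all actions are reported by action last accessed, review the result before you save it.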

Conclusion

In this post, you learned how to use IAM Access Analyzer and action last accessed information to identify and refine permissions for externally accessible roles in your journey toward least privilege. You first used IAM Access Analyzer cross-account access findings to identify IAM roles that can be accessed from outside your account. You then used IAM action last accessed information to review the permissions those roles are using and to remove unused permissions.

For more information about IAM Access Analyzer cross-account findings, see Findings for public and cross-account access. For more information about action last accessed information, see Things to know about last accessed information and the IAM action last accessed information services and actions.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS re:Post or contact AWS Support.

Evolving cyber threats demand new security approaches – The benefits of a unified and global IT/OT SOC

2023-10-30 Stuart Gregg

Post Syndicated from Stuart Gregg original https://aws.amazon.com/blogs/security/evolving-cyber-threats-demand-new-security-approaches-the-benefits-of-a-unified-and-global-it-ot-soc/

In this blog post, we discuss some of the benefits and considerations organizations should think through when looking at a unified and global information technology and operational technology (IT/OT) security operations center (SOC). Although this post focuses on the IT/OT convergence within the SOC, you can use the concepts and ideas discussed here when thinking about other environments such as hybrid and multi-cloud, Industrial Internet of Things (IIoT), and so on.

The scope of assets has vastly expanded as organizations transition to remote work and as interconnectivity increases through the Internet of Things (IoT) and edge devices, such as cyber-physical systems, coming online from around the globe. For many organizations, the IT and OT SOCs were separate, but there is a strong argument for convergence, which provides better context for the business outcomes of being able to respond to unexpected activity. In the ten security golden rules for IIoT solutions, AWS recommends deploying security audit and monitoring mechanisms across OT and IIoT environments, collecting security logs, and analyzing them using security information and event management (SIEM) tools within a SOC. SOCs are used to monitor, detect, and respond; this has traditionally been done separately for each environment. In this blog post, we explore the benefits and potential trade-offs of the convergence of these environments for the SOC. Although organizations should carefully consider the points raised throughout this blog post, the benefits of a unified SOC outweigh the potential trade-offs: visibility into the full threat chain propagating from one environment to another is critical for organizations as daily operations become more connected across IT and OT.

Traditional IT SOC

Traditionally, the SOC was responsible for security monitoring, analysis, and incident management of the entire IT environment within an organization—whether on-premises or in a hybrid architecture. This traditional approach has worked well for many years and ensures the SOC has the visibility to effectively protect the IT environment from evolving threats.

Note: Organizations should be aware of the considerations for security operations in the cloud which are discussed in this blog post.

Traditional OT SOC

Traditionally, OT, IT, and cloud teams have worked on separate sides of the air gap as described in the Purdue model. This can result in siloed OT, IIoT, and cloud security monitoring solutions, creating potential gaps in coverage or missing context that could otherwise have improved the response capability. To realize the full benefits of IT/OT convergence, IIoT, IT and OT must collaborate effectively to provide a broad perspective and the most effective defense. The convergence trend applies to newly connected devices and to how security and operations work together.

As organizations explore how industrial digital transformation can give them a competitive advantage, they’re using IoT, cloud computing, artificial intelligence and machine learning (AI/ML), and other digital technologies. This increases the potential threat surface that organizations must protect and requires a broad, integrated, and automated defense-in-depth security approach delivered through a unified and global SOC.

Without full visibility and control of traffic entering and exiting OT networks, the operations function might not be able to get full context or information that can be used to identify unexpected events. If a control system or connected assets such as programmable logic controllers (PLCs), operator workstations, or safety systems are compromised, threat actors could damage critical infrastructure and services or compromise data in IT systems. Even in cases where the OT system isn’t directly impacted, the secondary impacts can result in OT networks being shut down due to safety concerns over the ability to operate and monitor OT networks.

The SOC helps improve security and compliance by consolidating key security personnel and event data in a centralized location. Building a SOC is a significant undertaking because it requires a substantial upfront and ongoing investment in people, processes, and technology. However, the value of an improved security posture is an important consideration when weighed against these costs.

In many OT organizations, operators and engineering teams may not be used to focusing on security; in some cases, organizations set up an OT SOC that’s independent from their IT SOC. Many of the capabilities, strategies, and technologies developed for enterprise and IT SOCs apply directly to the OT environment, such as security operations (SecOps) and standard operating procedures (SOPs). While there are clearly OT-specific considerations, the SOC model is a good starting point for a converged IT/OT cybersecurity approach. In addition, technologies such as a SIEM can help OT organizations monitor their environment with less effort and time to deliver maximum return on investment. For example, by bringing IT and OT security data into a SIEM, IT and OT stakeholders share access to the information needed to complete security work.

Benefits of a unified SOC

A unified SOC offers numerous benefits for organizations. It provides broad visibility across the entire IT and OT environments, enabling coordinated threat detection, faster incident response, and immediate sharing of indicators of compromise (IoCs) between environments. This allows for better understanding of threat paths and origins.

Consolidating data from IT and OT environments in a unified SOC can bring economies of scale with opportunities for discounted data ingestion and retention. Furthermore, managing a unified SOC can reduce overhead by centralizing data retention requirements, access models, and technical capabilities such as automation and machine learning.

Operational key performance indicators (KPIs) developed within one environment can be used to enhance another, promoting operational efficiency such as reducing mean time to detect security events (MTTD). A unified SOC enables integrated and unified security, operations, and performance, which supports comprehensive protection and visibility across technologies, locations, and deployments. Sharing lessons learned between IT and OT environments improves overall operational efficiency and security posture. A unified SOC also helps organizations adhere to regulatory requirements in a single place, streamlining compliance efforts and operational oversight.
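As a simple worked example of one such KPI, MTTD can be computed as the average delay between when an event occurred and when it was detected. The sketch below assumes that each event is represented as a pair of timestamps; that representation and the function name are assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def mean_time_to_detect(events: Iterable[Tuple[datetime, datetime]]) -> timedelta:
    """Average the detection delay over (occurred_at, detected_at) pairs."""
    delays = [detected - occurred for occurred, detected in events]
    if not delays:
        raise ValueError("at least one event is required")
    # Sum the individual delays and divide by the number of events.
    return sum(delays, timedelta()) / len(delays)
```

Computing the same metric over IT and OT events from one dataset makes it possible to compare detection performance between the two environments.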

By using a security data lake and advanced technologies like AI/ML, organizations can build resilient business operations, enhancing their detection and response to security threats.

Creating cross-functional teams of IT and OT subject matter experts (SMEs) helps bridge the cultural divide and foster collaboration, enabling the development of a unified security strategy. Implementing an integrated and unified SOC can improve the maturity of industrial control systems (ICS) for IT and OT cybersecurity programs, bridging the gap between the domains and enhancing overall security capabilities.

Considerations for a unified SOC

There are several important aspects of a unified SOC for organizations to consider.

First, the separation of duty is crucial in a unified SOC environment. It’s essential to verify that specific duties are assigned to individuals based on their expertise and job function, allowing the most appropriate specialists to work on security events for their respective environments. Additionally, the sensitivity of data must be carefully managed. Robust access and permissions management is necessary to restrict access to specific types of data, so that only authorized analysts can access and handle sensitive information. You should implement a clear AWS Identity and Access Management (IAM) strategy following security best practices across your organization to verify that the separation of duties is enforced.

Another critical consideration is the potential disruption to operations during the unification of IT and OT environments. To promote a smooth transition, careful planning is required to minimize any loss of data, visibility, or disruptions to standard operations. It’s crucial to recognize the differences in IT and OT security. The unique nature of OT environments and their close ties to physical infrastructure require tailored cybersecurity strategies and tools that address the distinct missions, challenges, and threats faced by industrial organizations. A copy-and-paste approach from IT cybersecurity programs will not suffice.

Furthermore, the level of cybersecurity maturity often varies between IT and OT domains. Investment in cybersecurity measures might differ, resulting in OT cybersecurity being relatively less mature compared to IT cybersecurity. This discrepancy should be considered when designing and implementing a unified SOC. Baselining the technology stack from each environment, defining clear goals, and carefully architecting the solution can help ensure this discrepancy has been accounted for. After the solution has moved into the proof-of-concept (PoC) phase, you can start testing for readiness to move the convergence to production.

You also must address the cultural divide between IT and OT teams. Lack of alignment between an organization’s cybersecurity policies and procedures and its ICS and OT security objectives can impact the ability to secure both environments effectively. Bridging this divide through collaboration and clear communication is essential. This is discussed in more detail in the post on managing organizational transformation for successful IT/OT convergence.

Unified IT/OT SOC deployment

Figure 1 shows the deployment that would be expected in a unified IT/OT SOC. This is a high-level view of a unified SOC. In part 2 of this post, we will provide prescriptive guidance on how to design and build a unified and global SOC on AWS using AWS services and AWS Partner Network (APN) solutions.

Figure 1: Unified IT/OT SOC architecture

The parts of the IT/OT unified SOC are the following:

Environment: There are multiple environments, including a traditional IT on-premises organization, OT environment, cloud environment, and so on. Each environment represents a collection of security events and log sources from assets.

Data lake: A centralized place for data collection, normalization, and enrichment to verify that raw data from the different environments is standardized into a common schema. The data lake should support data retention and archiving for long-term storage.

Visualize: The SOC includes multiple dashboards based on organizational and operational needs. Dashboards can cover scenarios for multiple environments including data flows between IT and OT environments. There are also specific dashboards for the individual environments to cover each stakeholder’s needs. Data should be indexed in a way that allows humans and machines to query the data to monitor for security and performance issues.

Security analytics: Security analytics are used to aggregate and analyze security signals and to generate higher-fidelity alerts. They also contextualize OT signals against concurrent IT signals and against threat intelligence from reputable sources.

Detect, alert, and respond: Alerts can be set up for events of interest based on data across both individual and multiple environments. Machine learning should be used to help identify threat paths and events of interest across the data.
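To make the data lake and analytics parts more concrete, the following is a minimal sketch of normalizing IT and OT records into a common schema and then pairing events that occur close together in time. The raw field names (`epoch`, `hostname`, `plc_id`, and so on), the schema, and the five-minute window are all assumptions chosen for illustration. They are not a format defined by AWS or by this post.

```python
from datetime import datetime, timezone


def normalize_it_event(raw):
    """Map a hypothetical IT log record onto the common schema."""
    return {
        "timestamp": datetime.fromtimestamp(raw["epoch"], tz=timezone.utc).isoformat(),
        "environment": "IT",
        "asset": raw["hostname"],
        "event_type": raw["action"],
    }


def normalize_ot_event(raw):
    """Map a hypothetical OT alarm record onto the same schema."""
    return {
        "timestamp": raw["time_utc"],
        "environment": "OT",
        "asset": raw["plc_id"],
        "event_type": raw["alarm"],
    }


def correlate(events, window_seconds=300):
    """Pair IT and OT events that occur within window_seconds of each other."""
    it_events = [e for e in events if e["environment"] == "IT"]
    ot_events = [e for e in events if e["environment"] == "OT"]
    pairs = []
    for it_event in it_events:
        for ot_event in ot_events:
            gap = abs(
                datetime.fromisoformat(it_event["timestamp"])
                - datetime.fromisoformat(ot_event["timestamp"])
            ).total_seconds()
            if gap <= window_seconds:
                pairs.append((it_event["asset"], ot_event["asset"], gap))
    return pairs


events = [
    normalize_it_event({"epoch": 1696150800, "hostname": "eng-workstation-7", "action": "failed_login"}),
    normalize_ot_event({"time_utc": "2023-10-01T09:02:00+00:00", "plc_id": "plc-line-3", "alarm": "config_change"}),
]
pairs = correlate(events)
print(pairs)
```

Once both sources share field names and a timestamp format, the same query or alert rule can span environments, which is what makes cross-environment threat paths visible.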

Conclusion

Throughout this blog post, we’ve talked through the convergence of IT and OT environments from the perspective of optimizing your security operations. We looked at the benefits and considerations of designing and implementing a unified SOC.

Visibility into the full threat chain propagating from one environment to another is critical for organizations as daily operations become more connected across IT and OT. A unified SOC is the nerve center for incident detection and response and can be one of the most critical components in improving your organization’s security posture and cyber resilience.

If unification is your organization’s goal, you must fully consider what this means and design a plan for what a unified SOC will look like in practice. Running a small proof of concept and migrating in steps often helps with this process.

In the next blog post, we will provide prescriptive guidance on how to design and build a unified and global SOC using AWS services and AWS Partner Network (APN) solutions.


Mask and redact sensitive data published to Amazon SNS using managed and custom data identifiers

2023-10-25 Otavio Ferreira

Post Syndicated from Otavio Ferreira original https://aws.amazon.com/blogs/security/mask-and-redact-sensitive-data-published-to-amazon-sns-using-managed-and-custom-data-identifiers/

Today, we’re announcing a new capability for Amazon Simple Notification Service (Amazon SNS) message data protection. In this post, we show you how you can use this new capability to create custom data identifiers to detect and protect domain-specific sensitive data, such as your company’s employee IDs. Previously, you could only use managed data identifiers to detect and protect common sensitive data, such as names, addresses, and credit card numbers.

Overview

Amazon SNS is a serverless messaging service that provides topics for push-based, many-to-many messaging for decoupling distributed systems, microservices, and event-driven serverless applications. As applications become more complex, it can become challenging for topic owners to manage the data flowing through their topics. These applications might inadvertently start sending sensitive data to topics, increasing regulatory risk. To mitigate the risk, you can use message data protection to protect sensitive application data using built-in, no-code, scalable capabilities.

To discover and protect data flowing through SNS topics with message data protection, you can associate data protection policies with your topics. Within these policies, you can write statements that define which types of sensitive data you want to discover and protect. Within each policy statement, you can then define whether you want to act on data flowing inbound to an SNS topic or outbound to an SNS subscription, the AWS accounts or specific AWS Identity and Access Management (IAM) principals the statement applies to, and the actions you want to take on the sensitive data found.

Now, message data protection provides three actions to help you protect your data. First, the audit operation reports on the amount of sensitive data found. Second, the deny operation helps prevent the publishing or delivery of payloads that contain sensitive data. Third, the de-identify operation can mask or redact the sensitive data detected. These no-code operations can help you adhere to a variety of compliance regulations, such as the Health Insurance Portability and Accountability Act (HIPAA), the Federal Risk and Authorization Management Program (FedRAMP), the General Data Protection Regulation (GDPR), and the Payment Card Industry Data Security Standard (PCI DSS).

This message data protection feature coexists with the message data encryption feature in SNS, both contributing to an enhanced security posture of your messaging workloads.

Managed and custom data identifiers

After you add a data protection policy to your SNS topic, message data protection uses pattern matching and machine learning models to scan your messages for sensitive data, then enforces the data protection policy in real time. The types of sensitive data are referred to as data identifiers. These data identifiers can be either managed by Amazon Web Services (AWS) or custom to your domain.

Managed data identifiers (MDI) are organized into five categories:

Personally identifiable information (PII) includes data identifiers such as name, home address, email address, and social security number
Protected health information (PHI) examples include healthcare card number, prescription drug code, and healthcare procedure code
Financial information examples include bank account number, credit card number, and credit card magnetic strip data
Credential information examples include AWS secret access key, SSH private key, and PGP private key
Device information examples include IP addresses

In a data protection policy statement, you refer to a managed data identifier using its Amazon Resource Name (ARN), as follows:

{
    "Name": "__example_data_protection_policy",
    "Description": "This policy protects sensitive data in expense reports",
    "Version": "2021-06-01",
    "Statement": [{
        "DataIdentifier": [
            "arn:aws:dataprotection::aws:data-identifier/CreditCardNumber"
        ],
        "..."
    }]
}

Custom data identifiers (CDI), on the other hand, enable you to define custom regular expressions in the data protection policy itself, then refer to them from policy statements. Using custom data identifiers, you can scan for business-specific sensitive data, which managed data identifiers can’t. For example, you can use a custom data identifier to look for company-specific employee IDs in SNS message payloads. Internally, SNS has guardrails to make sure custom data identifiers are safe and that they add only low single-digit millisecond latency to message processing.

In a data protection policy statement, you refer to a custom data identifier using only the name that you have given it, as follows:

{
    "Name": "__example_data_protection_policy",
    "Description": "This policy protects sensitive data in expense reports",
    "Version": "2021-06-01",
    "Configuration": {
        "CustomDataIdentifier": [{
            "Name": "MyCompanyEmployeeId", "Regex": "EID-\\d{9}-US"
        }]
    },
    "Statement": [{
        "DataIdentifier": [
            "arn:aws:dataprotection::aws:data-identifier/CreditCardNumber",
            "MyCompanyEmployeeId"
        ],
        "..."
    }]
}

Note that custom data identifiers can be used in conjunction with managed data identifiers, as part of the same data protection policy statement. In the preceding example, both MyCompanyEmployeeId and CreditCardNumber are in scope.
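Before attaching a custom data identifier to a topic, it can help to check the regular expression locally. The sketch below uses Python's `re` module as a stand-in, on the assumption that it treats this simple pattern the same way SNS does. The sample IDs are illustrative, and the policy shown is only a fragment without a Statement. The sketch also shows that `json.dumps` escapes the backslash for you when the policy is built programmatically.

```python
import json
import re

# The custom data identifier pattern from the example policy
EMPLOYEE_ID_PATTERN = r"EID-\d{9}-US"

# Illustrative sample values: only the first has the expected shape
samples = ["EID-123456789-US", "CID-000012348-CA", "EID-12345-US"]
matches = [s for s in samples if re.fullmatch(EMPLOYEE_ID_PATTERN, s)]
print(matches)  # ['EID-123456789-US']

# Policy fragment (no Statement) built as a Python dict, then serialized to JSON
policy_fragment = {
    "Name": "__example_data_protection_policy",
    "Version": "2021-06-01",
    "Configuration": {
        "CustomDataIdentifier": [
            {"Name": "MyCompanyEmployeeId", "Regex": EMPLOYEE_ID_PATTERN}
        ]
    },
}
serialized = json.dumps(policy_fragment, indent=4)
print(serialized)  # the backslash appears doubled in the JSON text
```

Writing the pattern as a raw Python string and letting the JSON serializer handle escaping avoids hand-editing backslashes in the policy document.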

For more information, see Data Identifiers, in the SNS Developer Guide.

Inbound and outbound data directions

In addition to the DataIdentifier property, each policy statement also sets the DataDirection property (whose value can be either Inbound or Outbound) as well as the Principal property (whose value can be any combination of AWS accounts, IAM users, and IAM roles).

When you use message data protection for data de-identification and set DataDirection to Inbound, instances of DataIdentifier published by the Principal are masked or redacted before the payload is ingested into the SNS topic. This means that every endpoint subscribed to the topic receives the same modified payload.

When you set DataDirection to Outbound, on the other hand, the payload is ingested into the SNS topic as-is. Then, instances of DataIdentifier are either masked, redacted, or kept as-is for each subscribing Principal in isolation. This means that each endpoint subscribed to the SNS topic might receive a different payload from the topic, with different sensitive data de-identified, according to the data access permissions of its Principal.

The following snippet expands the example data protection policy to include the DataDirection and Principal properties.

{
    "Name": "__example_data_protection_policy",
    "Description": "This policy protects sensitive data in expense reports",
    "Version": "2021-06-01",
    "Configuration": {
        "CustomDataIdentifier": [{
            "Name": "MyCompanyEmployeeId", "Regex": "EID-\\d{9}-US"
        }]
    },
    "Statement": [{
        "DataIdentifier": [
            "MyCompanyEmployeeId",
            "arn:aws:dataprotection::aws:data-identifier/CreditCardNumber"
        ],
        "DataDirection": "Outbound",
        "Principal": [ "arn:aws:iam::123456789012:role/ReportingApplicationRole" ],
        "..."
    }]
}

In this example, ReportingApplicationRole is the authenticated IAM principal that called the SNS Subscribe API at subscription creation time. For more information, see How do I determine the IAM principals for my data protection policy? in the SNS Developer Guide.
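The difference between the two directions can be sketched as a small local simulation. This is not how SNS is implemented. The `route` function, the `ExpenseApplicationRole` publisher, and the string-replacing `deidentify` stub are hypothetical placeholders used only to show who receives which payload.

```python
REPORTING = "arn:aws:iam::123456789012:role/ReportingApplicationRole"
PAYMENT = "arn:aws:iam::123456789012:role/PaymentApplicationRole"
PUBLISHER = "arn:aws:iam::123456789012:role/ExpenseApplicationRole"  # hypothetical publisher


def deidentify(text):
    """Placeholder for whatever de-identification the statement configures."""
    return text.replace("4539894458086459", "#" * 16)


def route(payload, statement, publisher, subscribers):
    """Return the payload each subscriber would receive under one policy statement."""
    applies_to = set(statement["Principal"])
    if statement["DataDirection"] == "Inbound":
        # De-identified once, before ingestion, so every subscriber sees the same payload
        ingested = deidentify(payload) if publisher in applies_to else payload
        return {sub: ingested for sub in subscribers}
    # Outbound: ingested as-is, then de-identified per subscribing principal
    return {sub: deidentify(payload) if sub in applies_to else payload for sub in subscribers}


message = "My credit card number is 4539894458086459"
outbound = route(message, {"DataDirection": "Outbound", "Principal": [REPORTING]}, PUBLISHER, [REPORTING, PAYMENT])
inbound = route(message, {"DataDirection": "Inbound", "Principal": [PUBLISHER]}, PUBLISHER, [REPORTING, PAYMENT])
print(outbound[PAYMENT])
print(outbound[REPORTING])
print(inbound[PAYMENT])
```

With the Outbound statement, only the reporting subscription receives the modified payload. With the Inbound statement, both subscriptions do.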

Operations for data de-identification

To complete the policy statement, you need to set the Operation property, which informs the SNS topic of the action that it should take when it finds instances of DataIdentifier in the outbound payload.

The following snippet expands the data protection policy to include the Operation property, in this case using the Deidentify object, which in turn supports masking and redaction.

{
    "Name": "__example_data_protection_policy",
    "Description": "This policy protects sensitive data in expense reports",
    "Version": "2021-06-01",
    "Configuration": {
        "CustomDataIdentifier": [{
            "Name": "MyCompanyEmployeeId", "Regex": "EID-\\d{9}-US"
        }]
    },
    "Statement": [{
        "Principal": [
            "arn:aws:iam::123456789012:role/ReportingApplicationRole"
        ],
        "DataDirection": "Outbound",
        "DataIdentifier": [
            "MyCompanyEmployeeId",
            "arn:aws:dataprotection::aws:data-identifier/CreditCardNumber"
        ],
        "Operation": { "Deidentify": { "MaskConfig": { "MaskWithCharacter": "#" } } }
    }]
}

In this example, the MaskConfig object instructs the SNS topic to mask instances of MyCompanyEmployeeId and CreditCardNumber in Outbound messages to subscriptions created by ReportingApplicationRole, using the MaskWithCharacter value, which in this case is the hash symbol (#). Alternatively, you could have used the RedactConfig object, which would have instructed the SNS topic to remove the sensitive data from the payload entirely.

The following snippet shows how the outbound payload is masked, in real time, by the SNS topic.

// original message published to the topic:
My credit card number is 4539894458086459

// masked message delivered to subscriptions created by ReportingApplicationRole:
My credit card number is ################

For more information, see Data Protection Policy Operations, in the SNS Developer Guide.
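To compare masking and redaction side by side, the following sketch emulates both locally with Python's `re` module. It is an approximation under stated assumptions. The 16-digit pattern is a crude stand-in for the managed CreditCardNumber identifier, which is not a plain regular expression, and the empty `RedactConfig` object is an assumed shape, because this post only names that object.

```python
import re

EMPLOYEE_ID = r"EID-\d{9}-US"  # custom data identifier from the example policy
CARD_NUMBER = r"\b\d{16}\b"    # stand-in only; not the managed identifier's real logic


def deidentify(text, operation):
    """Emulate a Deidentify operation on text: mask with a character, or redact."""
    config = operation["Deidentify"]
    for pattern in (EMPLOYEE_ID, CARD_NUMBER):
        if "MaskConfig" in config:
            char = config["MaskConfig"]["MaskWithCharacter"]
            # Replace each match with the mask character, preserving its length
            text = re.sub(pattern, lambda m: char * len(m.group()), text)
        elif "RedactConfig" in config:
            # Remove each match from the payload
            text = re.sub(pattern, "", text)
    return text


message = "My credit card number is 4539894458086459"
masked = deidentify(message, {"Deidentify": {"MaskConfig": {"MaskWithCharacter": "#"}}})
redacted = deidentify(message, {"Deidentify": {"RedactConfig": {}}})  # assumed shape
print(masked)    # My credit card number is ################
print(redacted)
```

Masking keeps the length and position of the value visible to the subscriber, while redaction leaves no trace of it, so the choice depends on whether downstream parsers expect the field to be non-empty.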

Applying data de-identification in a use case

Consider a company where managers use an internal expense report management application to review and approve expense reports from employees. Initially, this application depended only on an internal payment application, which in turn connected to an external payment gateway. However, this workload eventually became more complex, because the company also started paying expense reports filed by external contractors. At that point, the company built a mobile application that external contractors could use to view their approved expense reports. An important business requirement for this mobile application was that specific financial and PII data needed to be de-identified in the externally displayed expense reports. Specifically, both the credit card number used for the payment and the ID of the internal employee who approved the payment had to be masked.

Figure 1: Expense report processing application

To distribute the approved expense reports to both the payment application and the reporting application that backed the mobile application, the company used an SNS topic with a data protection policy. The policy has only one statement, which masks credit card numbers and employee IDs found in the payload. This statement applies only to the IAM role that the company used for subscribing the AWS Lambda function of the reporting application to the SNS topic. This access permission configuration enabled the Lambda function from the payment application to continue receiving the raw data from the SNS topic.

The data protection policy from the previous section addresses this use case. Thus, when a message representing an expense report is published to the SNS topic, the Lambda function in the payment application receives the message as-is, whereas the Lambda function in the reporting application receives the message with the financial and PII data masked.

Deploying the resources

You can apply a data protection policy to an SNS topic using the AWS Management Console, AWS Command Line Interface (AWS CLI), AWS SDK, or AWS CloudFormation.
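As one example of the SDK route, here is a minimal sketch that builds the example policy as a Python dict and applies it with the boto3 SNS client's `put_data_protection_policy` call. The topic ARN and Region in the commented-out call are hypothetical, and `boto3` is imported lazily so the policy can be built and inspected without the SDK or AWS credentials.

```python
import json


def build_policy(role_arn):
    """Build the example data protection policy for one subscribing IAM role."""
    return {
        "Name": "__example_data_protection_policy",
        "Description": "This policy protects sensitive data in expense reports",
        "Version": "2021-06-01",
        "Configuration": {
            "CustomDataIdentifier": [
                {"Name": "MyCompanyEmployeeId", "Regex": r"EID-\d{9}-US"}
            ]
        },
        "Statement": [{
            "Principal": [role_arn],
            "DataDirection": "Outbound",
            "DataIdentifier": [
                "MyCompanyEmployeeId",
                "arn:aws:dataprotection::aws:data-identifier/CreditCardNumber",
            ],
            "Operation": {"Deidentify": {"MaskConfig": {"MaskWithCharacter": "#"}}},
        }],
    }


def apply_policy(topic_arn, policy):
    """Attach the policy to an SNS topic (requires boto3 and AWS credentials)."""
    import boto3  # lazy import: only needed when actually calling AWS

    sns = boto3.client("sns")
    sns.put_data_protection_policy(
        ResourceArn=topic_arn,
        DataProtectionPolicy=json.dumps(policy),
    )


policy = build_policy("arn:aws:iam::123456789012:role/ReportingApplicationRole")
print(json.dumps(policy, indent=4))
# Hypothetical topic ARN; uncomment to apply against a real topic:
# apply_policy("arn:aws:sns:us-east-1:123456789012:ApprovalTopic", policy)
```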

To automate the provisioning of the resources and the data protection policy of the example expense management use case, we’re going to use CloudFormation templates. You have two options for deploying the resources:

Run the /templates/message_data_protection_cdi/deploy script in the aws-sns-samples repository in GitHub.
Alternatively, use the following four CloudFormation templates, in order. Allow time for each stack to complete before deploying the next stack.

Deploy using the individual CloudFormation templates in sequence

1. Prerequisites template: This first template provisions two IAM roles with a managed policy that enables them to create SNS subscriptions and configure the subscriber Lambda functions. You will use these provisioned IAM roles in steps 3 and 4 that follow.
2. Topic owner template: The second template provisions the SNS topic along with its access policy and data protection policy.
3. Payment subscriber template: The third template provisions the Lambda function and the corresponding SNS subscription that make up the Payment application stack. When prompted, select the PaymentApplicationRole in the Permissions panel before running the template. Moreover, the CloudFormation console will require you to acknowledge that a CloudFormation transform might require access capabilities.
4. Reporting subscriber template: The final template provisions the Lambda function and the SNS subscription that make up the Reporting application stack. When prompted, select the ReportingApplicationRole in the Permissions panel before running the template. Moreover, the CloudFormation console will require, once again, that you acknowledge that a CloudFormation transform might require access capabilities.

Figure 2: Select IAM role

Now that the application stacks have been deployed, you’re ready to start testing.

Testing the data de-identification operation

Use the following steps to test the example expense management use case.

In the Amazon SNS console, select the ApprovalTopic, then choose to publish a message to it.

In the SNS message body field, enter the following message payload, representing an external contractor expense report, then choose to publish this message:

{
    "expense": {
        "currency": "USD",
        "amount": 175.99,
        "category": "Office Supplies",
        "status": "Approved",
        "created_at": "2023-10-17T20:03:44+0000",
        "updated_at": "2023-10-19T14:21:51+0000"
    },
    "payment": {
        "credit_card_network": "Visa",
        "credit_card_number": "4539894458086459"
    },
    "reviewer": {
        "employee_id": "EID-123456789-US",
        "employee_location": "Seattle, USA"
    },
    "contractor": {
        "employee_id": "CID-000012348-CA",
        "employee_location": "Vancouver, CAN"
    }
}

In the CloudWatch console, select the log group for the PaymentLambdaFunction, then choose to view its latest log stream. Now look for the log stream entry that shows the message payload received by the Lambda function. You will see that no data has been masked in this payload, as the payment application requires raw financial data to process the credit card transaction.

Still in the CloudWatch console, select the log group for the ReportingLambdaFunction, then choose to view its latest log stream. Now look for the log stream entry that shows the message payload received by this Lambda function. You will see that the values for the credit_card_number property and the reviewer’s employee_id property have been masked, which keeps the financial and PII data from leaking into the external reporting application. The contractor’s employee_id is unchanged because it doesn’t match the regular expression of the custom data identifier.

{
    "expense": {
        "currency": "USD",
        "amount": 175.99,
        "category": "Office Supplies",
        "status": "Approved",
        "created_at": "2023-10-17T20:03:44+0000",
        "updated_at": "2023-10-19T14:21:51+0000"
    },
    "payment": {
        "credit_card_network": "Visa",
        "credit_card_number": "################"
    },
    "reviewer": {
        "employee_id": "################",
        "employee_location": "Seattle, USA"
    },
    "contractor": {
        "employee_id": "CID-000012348-CA",
        "employee_location": "Vancouver, CAN"
    }
}

As shown, different subscribers received different versions of the message payload, according to their sensitive data access permissions.
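If you copy the two payloads out of the log streams, a short script can confirm exactly which fields were changed. The sketch below is a hypothetical helper, not part of the sample repository. It walks two nested dictionaries and lists the dotted paths whose values differ, using abbreviated versions of the payloads above.

```python
def masked_fields(original, received, path=""):
    """Return dotted paths of values that differ between two nested dicts."""
    changed = []
    for key, value in original.items():
        here = f"{path}.{key}" if path else key
        if isinstance(value, dict):
            changed += masked_fields(value, received[key], here)
        elif value != received[key]:
            changed.append(here)
    return changed


# Abbreviated payloads: as published, and as received by the reporting function
published = {
    "payment": {"credit_card_network": "Visa", "credit_card_number": "4539894458086459"},
    "reviewer": {"employee_id": "EID-123456789-US", "employee_location": "Seattle, USA"},
    "contractor": {"employee_id": "CID-000012348-CA", "employee_location": "Vancouver, CAN"},
}
reporting_received = {
    "payment": {"credit_card_network": "Visa", "credit_card_number": "################"},
    "reviewer": {"employee_id": "################", "employee_location": "Seattle, USA"},
    "contractor": {"employee_id": "CID-000012348-CA", "employee_location": "Vancouver, CAN"},
}

changed = masked_fields(published, reporting_received)
print(changed)  # ['payment.credit_card_number', 'reviewer.employee_id']
```

Running the same comparison against the payload logged by the payment function should return an empty list, because that subscription receives the message as-is.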

Cleaning up the resources

After testing, avoid incurring usage charges by deleting the resources that you created. Open the CloudFormation console and delete the four CloudFormation stacks that you created during the walkthrough.

Conclusion

This post showed how you can use Amazon SNS message data protection to discover and protect sensitive data published to or delivered from your SNS topics. The example use case shows how to create a data protection policy that masks messages delivered to specific subscribers if the payloads contain financial or personally identifiable information.

For more details, see message data protection in the SNS Developer Guide. For information on costs, see SNS pricing.


AWS Digital Sovereignty Pledge: Announcing a new, independent sovereign cloud in Europe

2023-10-25 Matt Garman

Post Syndicated from Matt Garman original https://aws.amazon.com/blogs/security/aws-digital-sovereignty-pledge-announcing-a-new-independent-sovereign-cloud-in-europe/


From day one, Amazon Web Services (AWS) has always believed it is essential that customers have control over their data, and choices for how they secure and manage that data in the cloud. Last year, we introduced the AWS Digital Sovereignty Pledge, our commitment to offering AWS customers the most advanced set of sovereignty controls and features available in the cloud. We pledged to work to understand the evolving needs and requirements of both customers and regulators, and to rapidly adapt and innovate to meet them. We committed to expanding our capabilities to allow customers to meet their digital sovereignty needs, without compromising on the performance, innovation, security, or scale of the AWS Cloud.

AWS offers the largest and most comprehensive cloud infrastructure globally. Our approach from the beginning has been to make AWS sovereign-by-design. We built data protection features and controls in the AWS Cloud with input from financial services, healthcare, and government customers—who are among the most security- and data privacy-conscious organizations in the world. This has led to innovations like the AWS Nitro System, which powers all our modern Amazon Elastic Compute Cloud (Amazon EC2) instances and provides a strong physical and logical security boundary to enforce access restrictions so that nobody, including AWS employees, can access customer data running in Amazon EC2. The security design of the Nitro System has also been independently validated by the NCC Group in a public report.

With AWS, customers have always had control over the location of their data. In Europe, customers that need to comply with European data residency requirements have the choice to deploy their data to any of our eight existing AWS Regions (Ireland, Frankfurt, London, Paris, Stockholm, Milan, Zurich, and Spain) to keep their data securely in Europe. To run their sensitive workloads, European customers can leverage the broadest and deepest portfolio of services, including AI, analytics, compute, database, Internet of Things (IoT), machine learning, mobile services, and storage. To further support customers, we’ve innovated to offer more control and choice over their data. For example, we announced further transparency and assurances, and new dedicated infrastructure options with AWS Dedicated Local Zones.

Announcing the AWS European Sovereign Cloud

When we speak to public sector and regulated industry customers in Europe, they share how they are facing incredible complexity and changing dynamics with an evolving sovereignty landscape. Customers tell us they want to adopt the cloud, but are facing increasing regulatory scrutiny over data location, European operational autonomy, and resilience. We’ve learned that these customers are concerned that they will have to choose between the full power of AWS or feature-limited sovereign cloud solutions. We’ve had deep engagements with European regulators, national cybersecurity authorities, and customers to understand how the sovereignty needs of customers can vary based on multiple factors, like location, sensitivity of workloads, and industry. These factors can impact their workload requirements, such as where their data can reside, who can access it, and the controls needed. AWS has a proven track record of innovation to address specialized workloads around the world.

Today, we’re excited to announce our plans to launch the AWS European Sovereign Cloud, a new, independent cloud for Europe, designed to help public sector organizations and customers in highly regulated industries meet their evolving sovereignty needs. We’re designing the AWS European Sovereign Cloud to be separate and independent from our existing Regions, with infrastructure located wholly within the European Union (EU), with the same security, availability, and performance our customers get from existing Regions today. To deliver enhanced operational resilience within the EU, only EU residents who are located in the EU will have control of the operations and support for the AWS European Sovereign Cloud. As with all current Regions, customers using the AWS European Sovereign Cloud will benefit from the full power of AWS with the same familiar architecture, expansive service portfolio, and APIs that millions of customers use today. The AWS European Sovereign Cloud will launch its first AWS Region in Germany available to all European customers.

The AWS European Sovereign Cloud will be sovereign-by-design, and will be built on more than a decade of experience operating multiple independent clouds for the most critical and restricted workloads. Like existing Regions, the AWS European Sovereign Cloud will be built for high availability and resiliency, and powered by the AWS Nitro System, to help ensure the confidentiality and integrity of customer data. Customers will have the control and assurance that AWS will not access or use customer data for any purpose without their agreement. AWS gives customers the strongest sovereignty controls among leading cloud providers. For customers with enhanced data residency needs, the AWS European Sovereign Cloud is designed to go further and will allow customers to keep all metadata they create (such as the roles, permissions, resource labels, and configurations they use to run AWS) in the EU. The AWS European Sovereign Cloud will also be built with separate, in-Region billing and usage metering systems.

Delivering operational autonomy

The AWS European Sovereign Cloud will provide customers the capability to meet stringent operational autonomy and data residency requirements. To deliver enhanced data residency and operational resilience within the EU, the AWS European Sovereign Cloud infrastructure will be operated independently from existing AWS Regions. To assure independent operation of the AWS European Sovereign Cloud, only personnel who are EU residents, located in the EU, will have control of day-to-day operations, including access to data centers, technical support, and customer service.

We’re taking learnings from our deep engagements with European regulators and national cybersecurity authorities and applying them as we build the AWS European Sovereign Cloud, so that customers using the AWS European Sovereign Cloud can meet their data residency, operational autonomy, and resilience requirements. For example, we are looking forward to continuing to partner with Germany’s Federal Office for Information Security (BSI).

“The development of a European AWS Cloud will make it much easier for many public sector organizations and companies with high data security and data protection requirements to use AWS services. We are aware of the innovative power of modern cloud services and we want to help make them securely available for Germany and Europe. The C5 (Cloud Computing Compliance Criteria Catalogue), which was developed by the BSI, has significantly shaped cybersecurity cloud standards and AWS was in fact the first cloud service provider to receive the BSI’s C5 testate. In this respect, we are very pleased to constructively accompany the local development of an AWS Cloud, which will also contribute to European sovereignty, in terms of security.”
— Claudia Plattner, President of the German Federal Office for Information Security (BSI)

Control without compromise

Though separate, the AWS European Sovereign Cloud will offer the same industry-leading architecture built for security and availability as other AWS Regions. This will include multiple Availability Zones (AZs), infrastructure that is placed in separate and distinct geographic locations, with enough distance to significantly reduce the risk of a single event impacting customers’ business continuity. Each AZ will have multiple layers of redundant power and networking to provide the highest level of resiliency. All AZs in the AWS European Sovereign Cloud will be interconnected with fully redundant, dedicated metro fiber, providing high-throughput, low-latency networking between AZs. All traffic between AZs will be encrypted. Customers who need more options to address stringent isolation and in-country data residency needs will be able to use Dedicated Local Zones or AWS Outposts to deploy AWS European Sovereign Cloud infrastructure in locations they select.

Continued AWS investment in Europe

The AWS European Sovereign Cloud represents continued AWS investment in Europe. AWS is committed to innovating to support European values and Europe’s digital future. We drive economic development through investing in infrastructure, jobs, and skills in communities and countries across Europe. We are creating thousands of high-quality jobs and investing billions of euros in European economies. Amazon has created more than 100,000 permanent jobs across the EU. Some of our largest AWS development teams are located in Europe, with key centers in Dublin, Dresden, and Berlin. As part of our continued commitment to contribute to the development of digital skills, we will hire and develop additional local personnel to operate and support the AWS European Sovereign Cloud.

Customers, partners, and regulators welcome the AWS European Sovereign Cloud

In the EU, hundreds of thousands of organizations of all sizes and across all industries are using AWS – from start-ups, to small and medium-sized businesses, to the largest enterprises, including telecommunication companies, public sector organizations, educational institutions, and government agencies. Organizations across Europe support the introduction of the AWS European Sovereign Cloud.

“As the market leader in enterprise application software with strong roots in Europe, SAP and AWS have long collaborated on behalf of customers to accelerate digital transformation around the world. The AWS European Sovereign Cloud provides further opportunities to strengthen our relationship in Europe by enabling us to expand the choices we offer to customers as they move to the cloud. We appreciate the ongoing partnership with AWS, and the new possibilities this investment can bring for our mutual customers across the region.”
— Peter Pluim, President, SAP Enterprise Cloud Services and SAP Sovereign Cloud Services

“The new AWS European Sovereign Cloud can be a game-changer for highly regulated business segments in the European Union. As a leading telecommunications provider in Germany, our digital transformation focuses on innovation, scalability, agility, and resilience to provide our customers with the best services and quality. This will now be paired with the highest levels of data protection and regulatory compliance that AWS delivers, and with a particular focus on digital sovereignty requirements. I am convinced that this new infrastructure offering has the potential to boost cloud adaptation of European companies and accelerate the digital transformation of regulated industries across the EU.”
— Mallik Rao, Chief Technology and Information Officer, O₂ Telefónica in Germany

“Deutsche Telekom welcomes the announcement of the AWS European Sovereign Cloud, which highlights AWS’s dedication to continuous innovation for European businesses. The AWS solution will provide greater choice for organizations when moving regulated workloads to the cloud and additional options to meet evolving digital governance requirements in the EU.”
— Greg Hyttenrauch, Senior Vice President, Global Cloud Services at T-Systems

“Today, we stand at the cusp of a transformative era. The introduction of the AWS European Sovereign Cloud does not merely represent an infrastructural enhancement, it is a paradigm shift. This sophisticated framework will empower Dedalus to offer unparalleled services for storing patient data securely and efficiently in the AWS cloud. We remain committed, without compromise, to serving our European clientele with best-in-class solutions underpinned by trust and technological excellence.”
— Andrea Fiumicelli, Chairman, Dedalus

“At de Volksbank, we believe in investing in a better Netherlands. To do this effectively, we need to have access to the latest technologies in order for us to continually be innovating and improving services for our customers. For this reason, we welcome the announcement of the European Sovereign Cloud which will allow European customers to easily demonstrate compliance with evolving regulations while still benefitting from the scale, security, and full suite of AWS services.”
— Sebastiaan Kalshoven, Director IT/CTO, de Volksbank

“Eviden welcomes the launch of the AWS European Sovereign Cloud. This will help regulated industries and the public sector address the requirements of their sensitive workloads with a fully featured AWS cloud wholly operated in Europe. As an AWS Premier Tier Services Partner and leader in cybersecurity services in Europe, Eviden has an extensive track record in helping AWS customers formalize and mitigate their sovereignty risks. The AWS European Sovereign Cloud will allow Eviden to address a wider range of customers’ sovereignty needs.”
— Yannick Tricaud, Head of Southern and Central Europe, Middle East, and Africa, Eviden, Atos Group

“We welcome the commitment of AWS to expand its infrastructure with an independent European cloud. This will give businesses and public sector organizations more choice in meeting digital sovereignty requirements. Cloud services are essential for the digitization of the public administration. With the “German Administration Cloud Strategy” and the “EVB-IT Cloud” contract standard, the foundations for cloud use in the public administration have been established. I am very pleased to work together with AWS to practically and collaboratively implement sovereignty in line with our cloud strategy.”
— Dr. Markus Richter, CIO of the German federal government, Federal Ministry of the Interior

Our commitments to our customers

We remain committed to giving our customers the control and choices they need to meet their evolving digital sovereignty needs. We continue to innovate on sovereignty features, controls, and assurances globally, without compromising on the full power of AWS.

You can find out more about the AWS European Sovereign Cloud and our customers in the press release and on our European Digital Sovereignty website. You can also get more information in the AWS News Blog.


French

AWS Digital Sovereignty Pledge: A new, independent sovereign cloud in Europe

Since its inception, Amazon Web Services (AWS) has believed it is essential that customers have control over their data and can choose how they secure and manage it in the cloud. Last year, we announced the AWS Digital Sovereignty Pledge, our commitment to offering AWS customers the most advanced set of sovereignty controls and features available in the cloud. We pledged to work to understand the evolving needs and requirements of our customers and regulators, and to adapt and innovate rapidly to meet them. We committed to expanding our offering so that customers can meet their digital sovereignty needs without compromising on the performance, innovation, security, or breadth of the AWS Cloud.

AWS offers the largest and most comprehensive cloud infrastructure in the world. From the start, our approach has been to make AWS sovereign-by-design. We built data protection features and controls in the AWS Cloud with input from customers in financial services, healthcare, and the public sector, who are among the most security- and data privacy-conscious organizations in the world. This has led to innovations such as the AWS Nitro System, which powers all our modern Amazon Elastic Compute Cloud (Amazon EC2) instances and provides a strong physical and logical security boundary to enforce access restrictions so that nobody, including AWS employees, can access customer data processed in Amazon EC2. The security design of the Nitro System has also been independently validated by NCC Group in a public report.

With AWS, customers have always had control over the location of their data. In Europe, customers that need to comply with European data residency requirements can choose to deploy their data in any of our eight existing AWS Regions (Ireland, Frankfurt, London, Paris, Stockholm, Milan, Zurich, and Spain) to keep their data securely in Europe. To run their sensitive workloads, European customers can use the broadest and deepest portfolio of services, from artificial intelligence to analytics, compute to databases, Internet of Things (IoT), machine learning, mobile services, and storage. To further support customers, we have innovated to offer more choice in control over their data. For example, we announced greater transparency and security assurances, as well as new dedicated infrastructure options called AWS Dedicated Local Zones.

Announcing the AWS European Sovereign Cloud
When we speak to public sector and regulated industry customers in Europe, they tell us about the incredible complexity they face in an evolving sovereignty landscape. Customers tell us they want to adopt the cloud but face growing regulatory requirements around data residency, European operational autonomy, and resilience. We hear that these customers fear having to choose between the full power of AWS and feature-limited sovereign cloud solutions. We have engaged in depth with European regulators, national cybersecurity authorities, and customers to understand how these sovereignty needs can vary based on multiple factors, such as location, sensitivity of workloads, and industry. These factors can affect their requirements, such as where their data can reside, who can access it, and the controls needed. AWS has a proven track record of innovating for specialized workloads around the world.

Today, we are pleased to announce the launch of the AWS European Sovereign Cloud, a new, independent cloud for Europe, designed to help public sector organizations and customers in regulated industries meet their evolving sovereignty needs. We are designing the AWS European Sovereign Cloud to be separate and independent from our existing Regions, with infrastructure located wholly within the European Union (EU) and with the same security, availability, and performance that our customers get from existing Regions today. To deliver enhanced operational resilience within the EU, only EU residents who are located in the EU will have control of the operations and support for the AWS European Sovereign Cloud. As with all existing Regions, customers using the AWS European Sovereign Cloud will benefit from the full power of AWS with the same familiar architecture, expansive service portfolio, and APIs that millions of customers use today. The AWS European Sovereign Cloud will launch its first AWS Region in Germany, available to all European customers.

The AWS European Sovereign Cloud will be sovereign-by-design and will build on more than a decade of experience operating independent clouds for the most critical and sensitive workloads. Like existing Regions, the AWS European Sovereign Cloud will be built for high availability and resiliency, and will be based on the AWS Nitro System to ensure the confidentiality and integrity of customer data. Customers will have control of their data and the assurance that AWS will not access it or use it for any purpose without their agreement. AWS gives customers the strongest sovereignty controls among leading cloud providers. For customers with enhanced data residency needs, the AWS European Sovereign Cloud is designed to go further and will allow customers to keep all metadata they create (such as the account roles, permissions, data labels, and configurations they use within AWS) in the EU. The AWS European Sovereign Cloud will also have its own separate billing and usage metering systems.

Delivering operational autonomy
The AWS European Sovereign Cloud will let customers meet stringent operational autonomy and data residency requirements. To improve data residency and operational resilience within the EU, the AWS European Sovereign Cloud infrastructure will be operated independently from existing AWS Regions. To ensure independent operation of the AWS European Sovereign Cloud, only personnel who are EU residents, located in the EU, will control day-to-day operations, including access to data centers, technical support, and customer service.

We are taking what we have learned from our in-depth engagements with European regulators and national cybersecurity authorities and applying it as we build the AWS European Sovereign Cloud, so that customers using it can meet their data residency, operational autonomy, and resilience requirements. For example, we look forward to continuing our partnership with the German Federal Office for Information Security (BSI).

“The development of a European AWS Cloud will make it much easier for many public sector organizations and companies with high data security and data protection requirements to use AWS services. We are aware of the innovative power of modern cloud services and we want to help make them securely available for Germany and Europe. The C5 (Cloud Computing Compliance Criteria Catalogue), which was developed by the BSI, has significantly shaped cybersecurity standards in the cloud, and AWS was in fact the first cloud service provider to receive the BSI’s C5 attestation. In this respect, we are very pleased to constructively accompany the local development of an AWS Cloud, which will also contribute to European sovereignty in terms of security.”
— Claudia Plattner, President of the German Federal Office for Information Security (BSI)

Control without compromise
Though separate, the AWS European Sovereign Cloud will offer the same industry-leading architecture, built to provide the same security and availability as other AWS Regions. This will include multiple Availability Zones (AZs), physical infrastructure placed in separate and distinct geographic locations, with enough distance to significantly reduce the risk of a single event affecting customers’ business continuity. Each AZ will have multiple layers of redundant power and networking to provide the highest level of resiliency. All AZs in the AWS European Sovereign Cloud will be interconnected with fully redundant, dedicated metro fiber, providing high-throughput, low-latency networking between AZs. All traffic between AZs will be encrypted. Customers who want more options to address stringent isolation and in-country data residency needs will be able to use Dedicated Local Zones or AWS Outposts to deploy AWS European Sovereign Cloud infrastructure at the sites of their choice.

Continued AWS investment in Europe
The AWS European Sovereign Cloud is part of continued AWS investment in Europe. AWS is committed to innovating to support European values and Europe’s digital future.

We are creating thousands of skilled jobs and investing billions of euros in the European economy. Amazon has created more than 100,000 permanent jobs in the EU.

We drive economic development by investing in infrastructure, jobs, and skills in communities and countries across Europe. Some of the largest AWS development teams are located in Europe, with key centers in Dublin, Dresden, and Berlin. As part of our continued commitment to contributing to the development of digital skills, we will hire and train additional local personnel to operate and support the AWS European Sovereign Cloud.

Customers, partners, and regulators welcome the AWS European Sovereign Cloud
In the EU, hundreds of thousands of organizations of all sizes and across all industries use AWS, from start-ups and small and medium-sized businesses to large enterprises, including telecommunications companies, public sector organizations, educational institutions, and government agencies. Organizations across Europe welcome the AWS European Sovereign Cloud.

“As the market leader in enterprise management software with strong roots in Europe, SAP has long collaborated with AWS on behalf of its customers to accelerate digital transformation around the world. The AWS European Sovereign Cloud offers new opportunities to strengthen our relationship in Europe by enabling us to expand the choices we offer customers as they move to the cloud. We appreciate the ongoing partnership with AWS and the new possibilities this investment can bring to our mutual customers across the region.”
— Peter Pluim, President, SAP Enterprise Cloud Services and SAP Sovereign Cloud Services

“The new AWS European Sovereign Cloud can be a game-changer for highly regulated business segments in the European Union. As a leading telecommunications provider in Germany, our digital transformation focuses on innovation, scalability, agility, and resilience to provide our customers with the best services and the best quality. This will now be paired with the highest levels of data protection and regulatory compliance that AWS offers, with a particular focus on digital sovereignty requirements. I am convinced that this new infrastructure offering has the potential to boost the adaptation of European companies to the cloud and accelerate the digital transformation of regulated industries across the EU.”
— Mallik Rao, Chief Technology and Information Officer, O2 Telefónica, Germany

“Deutsche Telekom welcomes the announcement of the AWS European Sovereign Cloud, which underscores AWS’s commitment to continuous innovation for European businesses. The AWS solution will offer organizations greater choice when migrating regulated workloads to the cloud, as well as additional options to meet evolving digital governance requirements in the EU.”
— Greg Hyttenrauch, Senior Vice President, Global Cloud Services at T-Systems

“Today, we stand at the dawn of an era of transformation. The launch of the AWS European Sovereign Cloud does not merely represent an infrastructure improvement; it is a paradigm shift. This sophisticated framework will enable Dedalus to offer unparalleled services for storing patient data securely and efficiently in the AWS cloud. We remain committed, without compromise, to serving our European clientele with best-in-class solutions underpinned by trust and technological excellence.”
— Andrea Fiumicelli, Chairman, Dedalus

“At de Volksbank, we believe in investing in the future of the Netherlands. To do this effectively, we need access to the latest technologies so that we can continually innovate and improve the services we offer our customers. That is why we welcome the announcement of the AWS European Sovereign Cloud, which will allow European customers to easily demonstrate compliance with constantly evolving regulations while benefiting from the scale, security, and full suite of AWS services.”
— Sebastiaan Kalshoven, Director IT/CTO, de Volksbank

“Eviden welcomes the launch of the AWS European Sovereign Cloud. It will help regulated industries and the public sector meet the requirements of their most sensitive workloads with a fully featured AWS Cloud operated entirely in Europe. As an AWS Premier Tier Services Partner and a leader in cybersecurity services in Europe, Eviden has extensive experience in helping AWS customers formalize and manage their sovereignty risks. The AWS European Sovereign Cloud will allow Eviden to address a wider range of its customers’ sovereignty needs.”
— Yannick Tricaud, Head of Southern and Central Europe, Middle East and Africa, Eviden, Atos Group

“We welcome AWS’s commitment to expand its infrastructure with an independent European cloud. Businesses and public sector organizations will thus have more choice in meeting digital sovereignty requirements. Cloud services are essential for the digitization of public administration. The ‘German Administration Cloud Strategy’ and the ‘EVB-IT Cloud’ contract standard have laid the foundations for cloud use in public administration. I am very pleased to work with AWS to implement sovereignty practically and collaboratively, in line with our cloud strategy.”
— Markus Richter, CIO of the German federal government, Federal Ministry of the Interior

Our commitments to our customers
We remain committed to giving our customers the control and choices they need to meet their evolving digital sovereignty needs. We continue to innovate on sovereignty features, controls, and assurances globally within AWS, while delivering the full power of AWS without compromise or restriction.

To find out more about the AWS European Sovereign Cloud and learn more about our customers, see our press release and our European digital sovereignty website. You can also get more information by reading the AWS News Blog.

German

AWS Digital Sovereignty Pledge: Announcing the new, independent AWS European Sovereign Cloud

Amazon Web Services (AWS) has always believed that it is important for customers to have full control over their data. Customers should be able to choose how they secure and manage that data in the cloud.

Last year, we introduced our AWS Digital Sovereignty Pledge: our promise to offer all AWS customers, without compromise, the most advanced set of sovereignty controls and features in the cloud. We committed to understanding the changing requirements of customers and regulators and to addressing them with innovative offerings. We are expanding our offering so that customers can meet their digital sovereignty needs without compromising on the performance, innovation, security, and scalability of the AWS Cloud.

AWS offers the largest and most comprehensive cloud infrastructure in the world. From the beginning, we have taken a sovereign-by-design approach to the AWS Cloud. With the help of customers from highly regulated sectors such as financial services, healthcare, and government and public administration, we developed features and controls for data protection and data security. This approach has led to innovations such as the AWS Nitro System, which today is the foundation for all modern Amazon Elastic Compute Cloud (Amazon EC2) instances and for confidential computing on AWS. AWS Nitro relies on a strong physical and logical security boundary and thereby enforces access restrictions that make unauthorized access to customer data in EC2 impossible, including by AWS personnel. NCC Group validated the security design of AWS Nitro in an independent review, published in a public report.

With AWS, customers have always had control over where their data is stored. Customers who must comply with specific European requirements on where data is processed can choose to process their data in any of our eight existing AWS Regions (Frankfurt, Ireland, London, Milan, Paris, Stockholm, Spain, and Zurich) and store it securely within Europe. European customers can run their critical workloads on the world’s broadest and most widely adopted portfolio of services, including AI, analytics, compute, databases, Internet of Things (IoT), machine learning (ML), mobile services, and storage. We have delivered innovations in data management and control to better support our customers. For example, we have announced greater transparency and additional assurances, as well as new dedicated infrastructure options with AWS Dedicated Local Zones.

Announcing the AWS European Sovereign Cloud
Customers from the public sector and regulated industries in Europe repeatedly tell us about the complexity and pace of change they face in the area of sovereignty. We hear from our customers that they want to use the cloud but at the same time must meet additional requirements relating to where data is processed, operational autonomy, and operational sovereignty.

Customers fear that they will have to choose between the full power of AWS and feature-limited sovereign cloud solutions. We have worked closely with regulators, cybersecurity authorities, and customers from Germany and other European countries to understand how sovereignty needs can vary based on factors such as location, workload classification, and industry. These factors can affect workload requirements, such as where the data may reside, who can access it, and which controls are required. AWS has a proven track record, particularly of innovative solutions for processing specialized workloads around the world.

We are pleased to announce the AWS European Sovereign Cloud today: a new, independent cloud for Europe. It is intended to help customers in the public sector and in highly regulated industries (for example, operators of critical infrastructure, or “KRITIS”) meet specific legal requirements on where data is processed and how the cloud is operated. The AWS European Sovereign Cloud will be located and operated in the European Union (EU). It will be physically and logically separate from existing AWS Regions and will offer the same security, availability, and performance as existing AWS Regions. Control over the operation and support of the AWS European Sovereign Cloud will be exercised exclusively by AWS personnel who are EU residents and located in the EU.

As with existing AWS Regions, customers using the AWS European Sovereign Cloud will benefit from the full range of AWS capabilities. These include the familiar architecture, the extensive service portfolio, and the APIs that millions of customers already use today. The AWS European Sovereign Cloud will launch with its first AWS Region in Germany and will be available to all customers in Europe.

The AWS European Sovereign Cloud will be sovereign-by-design and builds on more than ten years of experience operating multiple independent clouds for especially critical and confidential workloads. As with our existing AWS Regions, the AWS European Sovereign Cloud will be designed for high availability and resiliency and will be built on the AWS Nitro System to ensure the confidentiality and integrity of customer data. Customers have control and the assurance that AWS will not access customer data or use it for other purposes without their consent. The AWS European Sovereign Cloud is designed so that not only all customer data but also all metadata that customers create (for example, roles, permissions, resource labels, and configuration information) remains within the EU. The AWS European Sovereign Cloud has independent systems for billing and usage metering.

“The new AWS European Sovereign Cloud can be a game changer for highly regulated business segments in the European Union. As a leading telecommunications provider in Germany, our digital transformation focuses on innovation, scalability, agility, and resilience to offer our customers the best services and the best quality. AWS now combines this with the highest level of data protection in compliance with regulatory requirements, with a particular focus on digital sovereignty requirements. I am convinced that this new infrastructure offering has the potential to drive cloud adoption among European companies and accelerate the digital transformation of regulated industries in the EU.”
— Mallik Rao, Chief Technology and Information Officer at O2 Telefónica in Germany

Ensuring operational autonomy
The AWS European Sovereign Cloud gives customers the ability to meet stringent requirements for operational autonomy and for where data is processed. To ensure data processing and operational sovereignty within the EU, the AWS European Sovereign Cloud infrastructure will be operated independently from existing AWS Regions. To ensure independent operation of the AWS European Sovereign Cloud, only personnel who are EU residents and located in the EU will have control over day-to-day operations. This includes access to data centers, technical support, and customer service.

We are applying what we have learned from our close work with regulators and cybersecurity authorities in Europe as we build the AWS European Sovereign Cloud, so that customers can meet their requirements for control over where their data is stored and processed, operational autonomy, and operational sovereignty. We are pleased to cooperate with the German Federal Office for Information Security (BSI) on the implementation of the AWS European Sovereign Cloud as well:

“Building a European AWS Cloud will make it significantly easier for many public authorities and companies with high data security and data protection requirements to use AWS services. We know the innovative power of modern cloud services, and we want to help make them securely available for Germany and Europe. With the C5 criteria catalogue, the BSI has already significantly shaped cybersecurity in cloud computing, and AWS was in fact the first cloud service provider to receive the BSI’s C5 attestation. In this respect, we are very pleased to constructively accompany, with regard to security, the local development of an AWS Cloud that will also contribute to European sovereignty.”
— Claudia Plattner, President, German Federal Office for Information Security (BSI)

Control without compromise
Although it is operated separately, the AWS European Sovereign Cloud offers the same industry-leading architecture, built for security and availability, as other AWS Regions. This includes multiple Availability Zones (AZs): infrastructure located in separate, distinct geographic locations. This physical separation significantly reduces the risk that an incident at a single location affects the customer’s business operations. Each Availability Zone has an independent power supply and cooling and redundant network connections to provide the highest level of resiliency. In addition, each Availability Zone features strong physical security. All AZs in the AWS European Sovereign Cloud will be interconnected with fully redundant, dedicated metro fiber, enabling high-throughput, low-latency networking between AZs. All traffic between AZs is encrypted. For especially strict requirements on data isolation and in-country data processing, existing offerings such as AWS Dedicated Local Zones or AWS Outposts provide additional options. With these, customers can extend AWS European Sovereign Cloud infrastructure to locations they select.

Kontinuierliche AWS-Investitionen in Deutschland und Europa
Mit der AWS European Sovereign Cloud setzt AWS seine Investitionen in Deutschland und Europa fort. AWS entwickelt Innovationen, um europäische Werte und die digitale Zukunft in Deutschland und Europa zu unterstützen. Wir treiben die wirtschaftliche Entwicklung voran, indem wir in Infrastruktur, Arbeitsplätze und Ausbildung in ganz Europa investieren. Wir schaffen Tausende von hochwertigen Arbeitsplätzen und investieren Milliarden von Euro in europäische Volkswirtschaften. Amazon hat mehr als 100.000 dauerhafte Arbeitsplätze innerhalb der EU geschaffen.

„Die deutsche und europäische Wirtschaft befindet sich auf Digitalisierungskurs. Insbesondere der starke deutsche Mittelstand braucht eine souveräne Digitalinfrastruktur, die höchsten Anforderungen genügt, um auch weiterhin wettbewerbsfähig im globalen Markt zu sein. Für unsere digitale Unabhängigkeit ist wichtig, dass Rechenleistungen vor Ort in Deutschland entstehen und in unseren Digitalstandort investiert wird. Wir begrüßen daher die Ankündigung von AWS, die Cloud für Europa in Deutschland anzusiedeln.“
— Stefan Schnorr, Staatssekretär im deutschen Bundesministerium für Digitales und Verkehr

Einige der größten Entwicklungsteams von AWS sind in Deutschland und Europa angesiedelt, mit Standorten in Aachen, Berlin, Dresden, Tübingen und Dublin. Da wir uns verpflichtet fühlen, einen langfristigen Beitrag zur Entwicklung digitaler Kompetenzen zu leisten, wird AWS zusätzliches Personal vor Ort für die AWS European Sovereign Cloud einstellen und ausbilden.

Kunden, Partner und Aufsichtsbehörden begrüßen die AWS European Sovereign Cloud
In der EU nutzen Hunderttausende Organisationen aller Größen und Branchen AWS – von Start-ups über kleine und mittlere Unternehmen bis hin zu den größten Unternehmen, einschließlich Telekommunikationsunternehmen, Organisationen des öffentlichen Sektors, Bildungseinrichtungen und Regierungsbehörden. Europaweit unterstützen Organisationen die Einführung der AWS European Sovereign Cloud. Für Kunden wird die AWS European Sovereign Cloud neue Möglichkeiten im Cloudeinsatz eröffnen.

„Wir begrüßen das Engagement von AWS, seine Infrastruktur mit einer unabhängigen europäischen Cloud auszubauen. So erhalten Unternehmen und Organisationen der öffentlichen Hand mehr Auswahlmöglichkeiten bei der Erfüllung der Anforderungen an digitale Souveränität. Cloud-Services sind für die Digitalisierung der öffentlichen Verwaltung unerlässlich. Mit der Deutschen Verwaltungscloud-Strategie und dem Vertragsstandard EVB-IT Cloud wurden die Grundlagen für die Cloud-Nutzung in der Verwaltung geschaffen. Ich freue mich sehr, gemeinsam mit AWS Souveränität im Sinne unserer Cloud-Strategie praktisch und partnerschaftlich umzusetzen.“
— Dr. Markus Richter, Staatssekretär im deutschen Bundesministerium des Innern und für Heimat sowie Beauftragter der Bundesregierung für Informationstechnik (CIO des Bundes)

„Als Marktführer für Geschäftssoftware mit starken Wurzeln in Europa arbeitet SAP seit langem im Interesse der Kunden mit AWS zusammen, um die digitale Transformation auf der ganzen Welt zu beschleunigen. Die AWS European Sovereign Cloud bietet weitere Möglichkeiten, unsere Beziehung in Europa zu stärken, indem wir die Optionen erweitern, die wir unseren Kunden beim Wechsel in die Cloud bieten. Wir schätzen die fortlaufende Zusammenarbeit mit AWS und die neuen Möglichkeiten, die diese Investition für unsere gemeinsamen Kunden in der gesamten Region mit sich bringen kann.“
— Peter Pluim, President – SAP Enterprise Cloud Services und SAP Sovereign Cloud Services

„Heute stehen wir an der Schwelle zu einer transformativen Ära. Die Einführung der AWS European Sovereign Cloud stellt nicht nur eine infrastrukturelle Erweiterung dar, sondern ist ein Paradigmenwechsel. Dieses hochentwickelte Framework wird Dedalus in die Lage versetzen, unvergleichliche Dienste für die sichere und effiziente Speicherung von Patientendaten in der AWS-Cloud anzubieten. Wir bleiben kompromisslos dem Ziel verpflichtet, unseren europäischen Kunden erstklassige Lösungen zu bieten, die auf Vertrauen und technologischer Exzellenz basieren.“
— Andrea Fiumicelli, Chairman bei Dedalus

„Die Deutsche Telekom begrüßt die Ankündigung der AWS European Sovereign Cloud, die das Engagement von AWS für fortwährende Innovationen für europäische Unternehmen unterstreicht. Diese AWS-Lösung wird Unternehmen eine noch größere Auswahl bieten, wenn sie kritische Workloads in die AWS-Cloud verlagern, und zusätzliche Optionen zur Erfüllung der sich entwickelnden Anforderungen an die digitale Governance in der EU.“
— Greg Hyttenrauch, Senior Vice President, Global Cloud Services bei T-Systems

„Wir begrüßen die AWS European Sovereign Cloud als neues Angebot innerhalb von AWS, um die komplexesten regulatorischen Anforderungen an die Datenresidenz und betrieblichen Erfordernisse in ganz Europa zu adressieren.“
— Bernhard Wagensommer, Vice President Prinect bei der Heidelberger Druckmaschinen AG

„Die AWS European Sovereign Cloud wird neue Branchenmaßstäbe setzen und sicherstellen, dass Finanzdienstleistungsunternehmen noch mehr Optionen innerhalb von AWS haben, um die wachsenden Anforderungen an die digitale Souveränität hinsichtlich der Datenresidenz und operativen Autonomie in der EU zu erfüllen.“
— Gerhard Koestler, Chief Information Officer bei Raisin

„Mit einem starken Fokus auf Datenschutz, Sicherheit und regulatorischer Compliance unterstreicht die AWS European Sovereign Cloud das Engagement von AWS, die höchsten Standards für die digitale Souveränität von Finanzdienstleistern zu fördern. Dieser zusätzliche robuste Rahmen ermöglicht es Unternehmen wie unserem, in einer sicheren Umgebung erfolgreich zu sein, in der Daten geschützt sind und die Einhaltung höchster Standards leichter denn je wird.“
— Andreas Schranzhofer, Chief Technology Officer bei Scalable Capital

„Die AWS European Sovereign Cloud ist ein wichtiges, zusätzliches Angebot von AWS, das hochregulierten Branchen, Organisationen der öffentlichen Hand und Regierungsbehörden in Deutschland weitere Optionen bietet, um strengste regulatorische Anforderungen an den Datenschutz in der Cloud noch einfacher umzusetzen. Als AWS Advanced Tier Services Partner, AWS Solution Provider und AWS Public Sector Partner beraten und unterstützen wir kritische Infrastrukturen (KRITIS) bei der erfolgreichen Implementierung. Das neue Angebot von AWS ist ein wichtiger Impuls für Innovationen und Digitalisierung in Deutschland.“
— Martin Wibbe, CEO bei Materna

„Als eines der größten deutschen IT-Unternehmen und strategischer AWS-Partner begrüßt msg ausdrücklich die Ankündigung der AWS European Sovereign Cloud. Für uns als Anbieter von Software as a Service (SaaS) und Consulting Advisor für Kunden mit spezifischen Datenschutzanforderungen ermöglicht die Schaffung einer eigenständigen europäischen Cloud, unseren Kunden dabei zu helfen, die Einhaltung sich entwickelnder Vorschriften leichter nachzuweisen. Diese spannende Ankündigung steht im Einklang mit unserer Cloud-Strategie. Wir betrachten dies als Chance, um unsere Partnerschaft mit AWS zu stärken und die Entwicklung der Cloud in Deutschland voranzutreiben.“
— Dr. Jürgen Zehetmaier, CEO von msg

Unsere Verpflichtung gegenüber unseren Kunden
Um Kunden bei der Erfüllung der sich wandelnden Souveränitätsanforderungen zu unterstützen, entwickelt AWS fortlaufend innovative Features, Kontrollen und Zusicherungen, ohne die Leistungsfähigkeit der AWS Cloud zu beeinträchtigen.

Weitere Informationen zur AWS European Sovereign Cloud und über unsere Kunden finden Sie in der Pressemitteilung und auf unserer Website zur europäischen digitalen Souveränität. Sie finden auch weitere Informationen im AWS News Blog.

Italian

AWS Digital Sovereignty Pledge: Annuncio di un nuovo cloud sovrano e indipendente in Europa

Fin dal primo giorno, abbiamo sempre creduto che fosse essenziale che tutti i clienti avessero il controllo sui propri dati e sulle scelte di come proteggerli e gestirli nel cloud. L’anno scorso abbiamo introdotto l’AWS Digital Sovereignty Pledge, il nostro impegno a offrire ai clienti AWS il set più avanzato di controlli e funzionalità di sovranità disponibili nel cloud. Ci siamo impegnati a lavorare per comprendere le esigenze e le necessità in costante evoluzione sia dei clienti che delle autorità di regolamentazione, e per adattarci e innovare rapidamente per soddisfarli. Ci siamo impegnati ad espandere le nostre funzionalità per consentire ai clienti di soddisfare le loro esigenze di sovranità digitale senza compromettere le prestazioni, l’innovazione, la sicurezza o la scalabilità del cloud AWS.

AWS offre l’infrastruttura cloud più grande e completa a livello globale. Il nostro approccio fin dall’inizio è stato quello di rendere il cloud AWS sovrano by design. Abbiamo creato funzionalità e controlli di protezione dei dati nel cloud AWS confrontandoci con i clienti che operano in settori quali i servizi finanziari e l’assistenza sanitaria, che sono in assoluto tra le organizzazioni più attente alla sicurezza e alla privacy dei dati. Ciò ha portato a innovazioni come AWS Nitro System, che alimenta tutte le nostre moderne istanze Amazon Elastic Compute Cloud (Amazon EC2) e fornisce un solido standard di sicurezza fisico e logico-infrastrutturale al fine di imporre restrizioni di accesso in modo che nessuno, compresi i dipendenti AWS, possa accedere ai dati dei clienti in esecuzione in EC2. Il design di sicurezza del sistema Nitro è stato inoltre convalidato in modo indipendente dal gruppo NCC in un report pubblico.

Con AWS i clienti hanno sempre avuto il controllo sulla posizione dei propri dati. I clienti che devono rispettare i requisiti europei di residenza dei dati possono scegliere di distribuire i propri dati in una delle otto regioni AWS esistenti (Irlanda, Francoforte, Londra, Parigi, Stoccolma, Milano, Zurigo e Spagna) per conservare i propri dati in modo sicuro in Europa. Per gestire i propri carichi di lavoro sensibili, i clienti europei possono sfruttare il portafoglio di servizi più ampio e completo, tra cui intelligenza artificiale, analisi ed elaborazione dati, database, Internet of Things (IoT), apprendimento automatico, servizi mobili e storage. Per supportare ulteriormente i clienti, abbiamo introdotto alcune innovazioni per offrire loro maggiore controllo e scelta sulla gestione dei dati. Ad esempio, abbiamo annunciato ulteriore trasparenza e garanzie e nuove opzioni di infrastruttura dedicate con AWS Dedicated Local Zones.

Annuncio AWS European Sovereign Cloud
Quando in Europa parliamo con i clienti del settore pubblico e delle industrie regolamentate, riceviamo continue conferme di come si trovino ad affrontare un’incredibile complessità e le dinamiche mutevoli di un panorama di sovranità in continua evoluzione. I clienti ci dicono che vogliono adottare il cloud, ma si trovano ad affrontare crescenti interventi normativi in relazione alla residenza dei dati, all’autonomia operativa e alla resilienza europea. Abbiamo appreso che questi clienti temono di dover scegliere tra tutta la potenza di AWS e soluzioni cloud sovrane ma con funzionalità limitate. Abbiamo collaborato intensamente con le autorità di regolamentazione europee, le agenzie nazionali per la sicurezza informatica e i nostri clienti per comprendere come le esigenze di sovranità possano variare in base a molteplici fattori come la residenza, la sensibilità dei carichi di lavoro e il settore. Questi fattori possono influire sui requisiti del carico di lavoro, ad esempio dove possono risiedere i dati, chi può accedervi e i controlli necessari. AWS ha una comprovata esperienza di innovazione per affrontare carichi di lavoro specializzati in tutto il mondo.

Oggi siamo lieti di annunciare il nostro programma di lancio dell’AWS European Sovereign Cloud, un nuovo cloud indipendente per l’Europa, progettato per aiutare le organizzazioni del settore pubblico e i clienti in settori altamente regolamentati a soddisfare le loro esigenze di sovranità in continua evoluzione. Stiamo progettando il cloud sovrano europeo AWS in modo che sia separato e indipendente dalle nostre regioni esistenti, con un’infrastruttura situata interamente all’interno dell’Unione Europea (UE), con la stessa sicurezza, disponibilità e prestazioni che i nostri clienti ottengono dalle regioni esistenti oggi. Per garantire una maggiore resilienza operativa all’interno dell’UE, solo i residenti dell’UE che si trovano nell’UE avranno il controllo delle operazioni e il supporto per l’AWS European Sovereign Cloud. Come per tutte le regioni attuali, i clienti che utilizzeranno l’AWS European Sovereign Cloud trarranno vantaggio da tutta la potenza di AWS con la stessa architettura, un ampio portafoglio di servizi e API che milioni di clienti già utilizzano oggi. L’AWS European Sovereign Cloud lancerà la sua prima regione AWS in Germania, disponibile per tutti i clienti europei.

Il cloud sovrano europeo AWS sarà progettato per garantire l’indipendenza operativa e la resilienza all’interno dell’UE e sarà gestito e supportato solamente da dipendenti AWS che si trovano nell’UE e che vi risiedono. Questo design offrirà ai clienti una scelta aggiuntiva per soddisfare le diverse esigenze di residenza dei dati, autonomia operativa e resilienza.

Il cloud sovrano europeo AWS sarà sovrano by design e si baserà su oltre un decennio di esperienza nella gestione di più cloud indipendenti per carichi di lavoro critici e soggetti a restrizioni. Come le regioni esistenti, il cloud sovrano europeo AWS sarà progettato per garantire disponibilità e resilienza elevate e sarà alimentato da AWS Nitro System per contribuire a garantire la riservatezza e l’integrità dei dati dei clienti. I clienti avranno il controllo e la garanzia che AWS non accederà ai loro dati né li utilizzerà per alcuno scopo senza il loro consenso. AWS offre ai clienti i controlli di sovranità più rigorosi tra quelli offerti dai principali cloud provider. Per i clienti con esigenze avanzate di residenza dei dati, il cloud sovrano europeo AWS è progettato per andare oltre, e consentirà ai clienti di conservare tutti i metadati che creano (come le etichette dei dati, le categorie, i ruoli degli account e le configurazioni che utilizzano per eseguire AWS) nell’UE. L’AWS European Sovereign Cloud sarà inoltre realizzato con sistemi separati di fatturazione e misurazione dell’utilizzo a livello regionale.

Garantire autonomia operativa
L’AWS European Sovereign Cloud fornirà ai clienti la capacità di soddisfare rigorosi requisiti di autonomia operativa e residenza dei dati. Per offrire un maggiore controllo sulla residenza dei dati e sulla resilienza operativa all’interno dell’UE, l’infrastruttura AWS European Sovereign Cloud sarà gestita indipendentemente dalle regioni AWS esistenti. Per garantire il funzionamento indipendente dell’AWS European Sovereign Cloud, solo il personale residente nell’UE, situato nell’UE, avrà il controllo delle operazioni quotidiane, compreso l’accesso ai data center, il supporto tecnico e il servizio clienti.

Stiamo attingendo alle nostre profonde collaborazioni con le autorità di regolamentazione europee e le agenzie nazionali per la sicurezza informatica per applicarle nella realizzazione del cloud sovrano europeo AWS, di modo che i clienti che utilizzano AWS European Sovereign Cloud possano soddisfare i loro requisiti di residenza dei dati, di controllo, di autonomia operativa e resilienza. Ne è un esempio la stretta collaborazione con l’Ufficio federale tedesco per la sicurezza delle informazioni (BSI).

“Lo sviluppo di un cloud AWS europeo renderà molto più semplice l’utilizzo dei servizi AWS per molte organizzazioni del settore pubblico e per aziende con elevati requisiti di sicurezza e protezione dei dati. Siamo consapevoli della forza innovativa dei moderni servizi cloud e vogliamo contribuire a renderli disponibili in modo sicuro per la Germania e l’Europa. Il C5 (Cloud Computing Compliance Criteria Catalogue), sviluppato da BSI, ha plasmato in modo significativo gli standard cloud di sicurezza informatica e AWS è stato infatti il primo fornitore di servizi cloud a ricevere l’attestato C5 di BSI. In questo senso, siamo molto lieti di accompagnare in modo costruttivo lo sviluppo locale di un Cloud AWS, che contribuirà anche alla sovranità europea, in termini di sicurezza”.
— Claudia Plattner, Presidente dell’Ufficio federale tedesco per la sicurezza informatica (BSI)

Controllo senza compromessi
Sebbene separato, l’AWS European Sovereign Cloud offrirà la stessa architettura leader del settore creata per la sicurezza e la disponibilità delle altre regioni AWS. Ciò includerà multiple zone di disponibilità (AZ) e un’infrastruttura collocata in aree geografiche separate e distinte, con una distanza sufficiente a ridurre in modo significativo il rischio che un singolo evento influisca sulla continuità aziendale dei clienti. Ogni AZ disporrà di più livelli di alimentazione e rete ridondanti per fornire il massimo livello di resilienza. Tutte le AZ del cloud sovrano europeo AWS saranno interconnesse con fibra metropolitana dedicata e completamente ridondata, che fornirà reti ad alta velocità e bassa latenza tra le AZ. Tutto il traffico tra le AZ sarà crittografato. I clienti che necessitano di più opzioni per far fronte ai rigorosi requisiti di isolamento e residenza dei dati all’interno del Paese potranno sfruttare le zone locali dedicate o AWS Outposts per distribuire l’infrastruttura AWS European Sovereign Cloud nelle località da loro selezionate.

Continui investimenti di AWS in Europa
L’AWS European Sovereign Cloud è parte del continuo impegno di AWS a investire in Europa. AWS si impegna a innovare per sostenere i valori europei e il futuro digitale dell’Europa. Promuoviamo lo sviluppo economico investendo in infrastrutture, posti di lavoro e competenze nelle comunità e nei paesi di tutta Europa. Stiamo creando migliaia di posti di lavoro di alta qualità e investendo miliardi di euro nelle economie europee. Amazon ha creato più di 100.000 posti di lavoro permanenti in tutta l’UE. Alcuni dei nostri team di sviluppo AWS più grandi si trovano in Europa, con centri di eccellenza a Dublino, Dresda e Berlino. Nell’ambito del nostro continuo impegno a contribuire allo sviluppo delle competenze digitali, assumeremo e formeremo ulteriore personale locale per gestire e supportare l’AWS European Sovereign Cloud.

Clienti, partner e autorità di regolamentazione accolgono con favore il cloud sovrano europeo AWS
Nell’UE, centinaia di migliaia di organizzazioni di tutte le dimensioni e in tutti i settori utilizzano AWS, dalle start-up alle piccole e medie imprese, alle grandi imprese, alle società di telecomunicazioni, alle organizzazioni del settore pubblico, agli istituti di istruzione e alle agenzie governative. Organizzazioni di tutta Europa sostengono l’introduzione dell’AWS European Sovereign Cloud.

“In qualità di leader di mercato nel software applicativo aziendale con forti radici in Europa, SAP collabora da tempo con AWS per conto dei clienti per accelerare la trasformazione digitale in tutto il mondo. L’AWS European Sovereign Cloud offre ulteriori opportunità per rafforzare le nostre relazioni in Europa consentendoci di ampliare le scelte che offriamo ai clienti mentre passano al cloud. Apprezziamo la partnership continua con AWS e le nuove possibilità che questo investimento può offrire ai nostri comuni clienti in tutta la regione.”
— Peter Pluim, Presidente, SAP Enterprise Cloud Services e SAP Sovereign Cloud Services

“Il nuovo AWS European Sovereign Cloud può rappresentare un punto di svolta per i segmenti di business altamente regolamentati nell’Unione Europea. In qualità di fornitore leader di telecomunicazioni in Germania, la nostra trasformazione digitale si concentra su innovazione, scalabilità, agilità e resilienza per fornire ai nostri clienti i migliori servizi e la migliore qualità. Ciò sarà ora abbinato ai più alti livelli di protezione dei dati e conformità normativa offerti da AWS e con un’attenzione particolare ai requisiti di sovranità digitale. Sono convinto che questa nuova offerta di infrastrutture abbia il potenziale per stimolare l’adozione del cloud da parte delle aziende europee e accelerare la trasformazione digitale delle industrie regolamentate in tutta l’UE”.
— Mallik Rao, Chief Technology & Information Officer (CTIO) presso O2 Telefónica in Germania

“Deutsche Telekom accoglie l’annuncio dell’AWS European Sovereign Cloud, che evidenzia l’impegno di AWS a un’innovazione continua nel mercato europeo. Questa soluzione AWS offrirà opportunità importanti per le aziende e le organizzazioni nell’ambito della migrazione regolamentata sul cloud e opzioni addizionali per soddisfare i requisiti di sovranità digitale europei in continua evoluzione”.
— Greg Hyttenrauch, Senior Vice President, Global Cloud Services presso T-Systems

“Oggi siamo al culmine di un’era di trasformazione. L’introduzione dell’AWS European Sovereign Cloud non rappresenta semplicemente un miglioramento infrastrutturale, è un cambio di paradigma. Questo sofisticato framework consentirà a Dedalus di offrire servizi senza precedenti per l’archiviazione dei dati dei pazienti in modo sicuro ed efficiente nel cloud AWS. Rimaniamo impegnati, senza compromessi, a servire la nostra clientela europea con soluzioni best-in-class sostenute da fiducia ed eccellenza tecnologica”.
— Andrea Fiumicelli, Presidente di Dedalus

“Noi di de Volksbank crediamo nell’investire per migliorare i Paesi Bassi. Ma perché questo avvenga in modo efficace, dobbiamo avere accesso alle tecnologie più recenti per poter innovare e migliorare continuamente i servizi per i nostri clienti. Per questo motivo, accogliamo con favore l’annuncio dello European Sovereign Cloud che consentirà ai clienti europei di rispettare facilmente la conformità alle normative in evoluzione, beneficiando comunque della scalabilità, della sicurezza e della suite completa dei servizi AWS”.
— Sebastiaan Kalshoven, Direttore IT/CTO della Volksbank

“Eviden accoglie con favore il lancio dell’AWS European Sovereign Cloud, che aiuterà le industrie regolamentate e il settore pubblico a soddisfare i requisiti dei loro carichi di lavoro sensibili con un cloud AWS completo e interamente gestito in Europa. In qualità di partner AWS Premier Tier Services e leader nei servizi di sicurezza informatica in Europa, Eviden ha una vasta esperienza nell’aiutare i clienti AWS a formalizzare e mitigare i rischi di sovranità. L’AWS European Sovereign Cloud consentirà a Eviden di soddisfare una gamma più ampia di esigenze di sovranità dei clienti”.
— Yannick Tricaud, Responsabile Europa meridionale e centrale, Medio Oriente e Africa, Eviden, Gruppo Atos

“Accogliamo con favore l’impegno di AWS di espandere la propria infrastruttura con un cloud europeo indipendente. Ciò offrirà alle imprese e alle organizzazioni del settore pubblico una scelta più ampia nel soddisfare i requisiti di sovranità digitale. I servizi cloud sono essenziali per la digitalizzazione della pubblica amministrazione. Con la “Strategia cloud per l’Amministrazione tedesca” e lo standard contrattuale “EVB-IT Cloud”, sono state gettate le basi per l’utilizzo del cloud nella pubblica amministrazione. Sono molto lieto di collaborare con AWS per implementare in modo pratico e collaborativo la sovranità in linea con la nostra strategia cloud.”
— Dr. Markus Richter, CIO del governo federale tedesco, Ministero federale degli interni

I nostri impegni nei confronti dei nostri clienti
Manteniamo il nostro impegno a fornire ai nostri clienti il controllo e la possibilità di scelta per contribuire a soddisfare le loro esigenze in continua evoluzione in materia di sovranità digitale. Continueremo a innovare le funzionalità, i controlli e le garanzie di sovranità del dato all’interno del cloud AWS globale e a fornirli senza compromessi sfruttando tutta la potenza di AWS.

Puoi scoprire di più sull’AWS European Sovereign Cloud nel Comunicato Stampa o sul nostro sito European Digital Sovereignty. Puoi anche ottenere ulteriori informazioni nel blog AWS News.

Spanish

Compromiso de Soberanía Digital de AWS: anuncio de una nueva nube soberana independiente en la Unión Europea

Desde el primer día, en Amazon Web Services (AWS) siempre hemos creído que es esencial que los clientes tengan el control sobre sus datos y capacidad para proteger y gestionar los mismos en la nube. El año pasado, anunciamos el Compromiso de Soberanía Digital de AWS, nuestra garantía de que ofrecemos a todos los clientes de AWS los controles y funcionalidades de soberanía más avanzados que estén disponibles en la nube. Nos comprometimos a trabajar para comprender las necesidades y los requisitos cambiantes tanto de los clientes como de los reguladores, y a adaptarnos e innovar rápidamente para satisfacerlos. Asimismo, nos comprometimos a ampliar nuestras capacidades para permitir a los clientes satisfacer sus necesidades de soberanía digital sin reducir el rendimiento, la innovación, la seguridad o la escalabilidad de la nube de AWS.

AWS ofrece la infraestructura de nube más amplia y completa del mundo. Nuestro enfoque desde el principio ha sido hacer que AWS sea una nube soberana por diseño. Creamos funcionalidades y controles de protección de datos en la nube de AWS teniendo en cuenta las aportaciones de clientes de sectores como los servicios financieros, sanidad y entidades gubernamentales, que se encuentran entre los más preocupados por la seguridad y la privacidad de los datos en el mundo. Esto ha dado lugar a innovaciones como el sistema Nitro de AWS, que impulsa todas nuestras instancias de Amazon Elastic Compute Cloud (Amazon EC2) y proporciona un límite de seguridad físico y lógico sólido para imponer restricciones de acceso, de modo que nadie, incluidos los empleados de AWS, pueda acceder a los datos de los clientes que se ejecutan en Amazon EC2. El diseño de seguridad del sistema Nitro también ha sido validado de forma independiente por el Grupo NCC en un informe público.

Con AWS, los clientes siempre han tenido el control sobre la ubicación de sus datos. En Europa, los clientes que deben cumplir con los requisitos de residencia de datos europeos tienen la opción de implementar sus datos en cualquiera de las ocho Regiones de AWS existentes (Irlanda, Frankfurt, Londres, París, Estocolmo, Milán, Zúrich y España) para mantener sus datos de forma segura en Europa. Para ejecutar sus cargas de trabajo sensibles, los clientes europeos pueden aprovechar la cartera de servicios más amplia y completa, que incluye inteligencia artificial, análisis, computación, bases de datos, Internet de las cosas (IoT), aprendizaje automático, servicios móviles y almacenamiento. Para apoyar aún más a los clientes, hemos innovado ofreciendo más control y opciones sobre sus datos. Por ejemplo, anunciamos una mayor transparencia y garantías, y nuevas opciones de infraestructura de uso exclusivo con Zonas Locales Dedicadas de AWS.

Anunciamos AWS European Sovereign Cloud
Cuando hablamos con clientes del sector público y de sectores regulados en Europa, nos comparten cómo se enfrentan a una gran complejidad y a una dinámica cambiante en el panorama de la soberanía, que está en constante evolución. Los clientes nos dicen que quieren adoptar la nube, pero se enfrentan a un creciente escrutinio regulatorio en relación con la ubicación de los datos, la autonomía operativa europea y la resiliencia. Sabemos que a estos clientes les preocupa tener que elegir entre toda la potencia de AWS o soluciones de nube soberana con funciones limitadas. Hemos mantenido conversaciones muy provechosas con los reguladores europeos, las autoridades nacionales de ciberseguridad y los clientes para entender cómo las necesidades de soberanía de los clientes pueden variar en función de diferentes factores, como la ubicación, la sensibilidad de las cargas de trabajo y el sector. Estos factores pueden impactar en los requisitos aplicables a sus cargas de trabajo, como dónde pueden residir sus datos, quién puede acceder a ellos y los controles necesarios. AWS tiene un historial comprobado de innovación para abordar cargas de trabajo sensibles o especiales en todo el mundo.

Hoy nos complace anunciar nuestros planes de lanzar la Nube Soberana Europea de AWS, una nueva nube independiente para la Unión Europea, diseñada para ayudar a las organizaciones del sector público y a los clientes de sectores altamente regulados a satisfacer sus necesidades de soberanía en constante evolución. Estamos diseñando la Nube Soberana Europea de AWS para que sea independiente y separada de nuestras Regiones actuales, con una infraestructura ubicada íntegramente dentro de la Unión Europea y con la misma seguridad, disponibilidad y rendimiento que nuestros clientes obtienen en las Regiones actuales. Para ofrecer una mayor resiliencia operativa dentro de la UE, solo los residentes de la UE que se encuentren en la UE tendrán el control de las operaciones y el soporte de la Nube Soberana Europea de AWS. Como ocurre con todas las Regiones actuales, los clientes que utilicen la Nube Soberana Europea de AWS se beneficiarán de toda la potencia de AWS con la misma arquitectura conocida, una amplia cartera de servicios y las API que utilizan millones de clientes en la actualidad. La Nube Soberana Europea de AWS lanzará su primera Región de AWS en Alemania, disponible para todos los clientes en Europa.

La Nube Soberana Europea de AWS será soberana por diseño y se basará en más de una década de experiencia en la gestión de múltiples nubes independientes para las cargas de trabajo más críticas y restringidas. Al igual que las Regiones existentes, la Nube Soberana Europea de AWS se diseñará para ofrecer una alta disponibilidad y resiliencia, y contará con la tecnología del sistema Nitro de AWS, a fin de garantizar la confidencialidad e integridad de los datos de los clientes. Los clientes tendrán el control y la seguridad de que AWS no accederá a los datos de los clientes ni los utilizará para ningún propósito sin su consentimiento. AWS ofrece a los clientes los controles de soberanía más estrictos entre los principales proveedores de servicios en la nube. Para los clientes con necesidades de residencia de datos mejoradas, la Nube Soberana Europea de AWS está diseñada para ir más allá y permitirá a los clientes conservar todos los metadatos que crean (como funciones, permisos, etiquetas de recursos y las configuraciones que utilizan para ejecutar AWS) dentro de la UE. La Nube Soberana Europea de AWS también se construirá con sistemas independientes de facturación y medición del uso dentro de la Región.

Ofreciendo autonomía operativa
La Nube Soberana Europea de AWS proporcionará a los clientes la capacidad de cumplir con los estrictos requisitos de autonomía operativa y residencia de datos que sean de aplicación a cada cliente. Para proporcionar una mejor residencia de los datos y resiliencia operativa en la UE, la infraestructura de la Nube Soberana Europea de AWS se gestionará de forma independiente del resto de las Regiones de AWS existentes. Para garantizar el funcionamiento independiente de la Nube Soberana Europea de AWS, solo el personal residente en la UE y ubicado en la UE tendrá el control de las operaciones diarias, incluido el acceso a los centros de datos, el soporte técnico y el servicio de atención al cliente.

Estamos aprendiendo de nuestras intensas conversaciones con los reguladores europeos y las autoridades nacionales de ciberseguridad, aplicando estos aprendizajes a medida que construimos la Nube Soberana Europea de AWS, de modo que los clientes que la utilicen puedan cumplir sus requisitos de residencia, autonomía operativa y resiliencia de los datos. Por ejemplo, esperamos continuar colaborando con la Oficina Federal de Seguridad de la Información (BSI) de Alemania.

“The development of a European AWS cloud will make it much easier for many public sector organizations and companies with high security and data protection requirements to use AWS services. We are aware of the innovative power of modern cloud services and we want to help make them securely available in Germany and Europe. The C5 (Cloud Computing Compliance Criteria Catalogue), developed by the BSI, has significantly shaped cybersecurity standards in the cloud, and AWS was in fact the first cloud service provider to receive the BSI’s C5 certificate. In this sense, we are pleased to constructively accompany the local development of an AWS cloud, which will also contribute to European sovereignty in terms of security.”
– Claudia Plattner, President of the German Federal Office for Information Security (BSI)

Control without compromise
Though separate, the AWS European Sovereign Cloud will offer the same industry-leading architecture as other AWS Regions, built for security and availability. This will include multiple Availability Zones, with infrastructure placed in separate and distinct geographic locations that are far enough apart to reduce the risk of a single incident affecting customers’ business continuity. Each Availability Zone will have multiple power sources and redundant networking to provide the highest level of resilience. All Availability Zones in the AWS European Sovereign Cloud will be interconnected with fully redundant, dedicated fiber, providing high-throughput, low-latency networking between Availability Zones. All traffic between Availability Zones will be encrypted. Customers who need more options to address stringent isolation and in-country data residency needs will be able to use Dedicated Local Zones or AWS Outposts to deploy AWS European Sovereign Cloud infrastructure in the locations they choose.

Continued AWS investment in Europe
The AWS European Sovereign Cloud represents continued investment by AWS in the EU. AWS is committed to innovating to support the values and digital future of the European Union. We drive economic development by investing in infrastructure, jobs, and skills in communities and countries across Europe. We are creating thousands of high-quality jobs and investing billions of euros in European economies. Amazon has created more than 100,000 permanent jobs across the EU. Some of our largest AWS development teams are located in Europe, with key centers in Dublin, Dresden, and Berlin. As part of our continued commitment to contribute to the development of digital skills, we will hire and train more local personnel to operate and support the AWS European Sovereign Cloud.

Customers, partners, and regulators welcome the AWS European Sovereign Cloud
In the EU, hundreds of thousands of organizations of all sizes and across all industries use AWS, from startups to small and medium-sized businesses and large enterprises, including telecommunications companies, public sector organizations, educational institutions, NGOs, and government agencies. Organizations across Europe support the introduction of the AWS European Sovereign Cloud.

“As the market leader in enterprise application software with strong roots in Europe, SAP has long collaborated with AWS on behalf of customers to accelerate digital transformation around the world. The AWS European Sovereign Cloud provides new opportunities to strengthen our relationship in Europe by enabling us to expand the choices we offer to customers as they move to the cloud. We value the existing partnership with AWS and the new possibilities this investment can bring to our mutual customers across the region.”
– Peter Pluim, President of SAP Enterprise Cloud Services and SAP Sovereign Cloud Services

“The new AWS European Sovereign Cloud can be a game changer for highly regulated business segments in the European Union. As a leading telecommunications provider in Germany, our digital transformation focuses on innovation, scalability, agility, and resilience to provide our customers with the best services and quality. This will now be combined with the highest levels of data protection and regulatory compliance that AWS offers, with a particular focus on digital sovereignty requirements. I am convinced that this new infrastructure offering has the potential to boost cloud adoption by European companies and accelerate the digital transformation of regulated industries across the EU.”
– Mallik Rao, Chief Technology and Information Officer at O2 Telefónica in Germany

“Today, we stand at the cusp of a transformative era. The introduction of the AWS European Sovereign Cloud is not just an infrastructure enhancement; it is a paradigm shift. This sophisticated framework will enable Dedalus to offer unparalleled services for storing patient data securely and efficiently in the AWS Cloud. We remain committed, without compromise, to serving our European clientele with best-in-class solutions underpinned by trust and technological excellence.”
– Andrea Fiumicelli, Chairman of Dedalus

“At de Volksbank, we believe in investing in a better Netherlands. To do this effectively, we need access to the latest technologies so that we can continually innovate and improve services for our customers. For this reason, we welcome the announcement of the European Sovereign Cloud, which will allow European customers to easily demonstrate compliance with evolving regulations while still benefiting from the scale, security, and full suite of AWS services.”
– Sebastian Kalshoven, IT Director and CTO at de Volksbank

“Eviden welcomes the launch of the AWS European Sovereign Cloud. This will help regulated industries and the public sector address the requirements of their sensitive workloads with a full-featured AWS cloud operated exclusively in Europe. As an AWS Premier Tier Services Partner and a leader in cybersecurity services in Europe, Eviden has an extensive track record of helping AWS customers formalize and mitigate their sovereignty risks. The AWS European Sovereign Cloud will allow Eviden to address a wider range of customers’ sovereignty needs.”
– Yannick Tricaud, Head of Southern and Central Europe, Middle East, and Africa at Eviden, Atos Group

Our commitments to our customers
We remain committed to giving our customers the control and choices that help them meet their evolving digital sovereignty needs. We will continue to innovate on sovereignty features, controls, and assurances globally, and to deliver this without compromising on the full power of AWS.

You can learn more about the AWS European Sovereign Cloud and about our customers in our press release and on the European Digital Sovereignty website. You can also find more information on the AWS News Blog.

Introducing the Project Argus Datacenter-ready Secure Control Module design specification

2023-10-16 Xiaomin Shen

Post Syndicated from Xiaomin Shen original http://blog.cloudflare.com/introducing-the-project-argus-datacenter-ready-secure-control-module-design-specification/

Historically, data center servers have used motherboards that included all key components on a single circuit board. The DC-SCM (Datacenter-ready Secure Control Module) decouples server management and security functions from a traditional server motherboard, enabling development of server management and security solutions independent of server architecture. It also provides opportunities for reducing server printed circuit board (PCB) material cost, and allows unified firmware images to be developed.

Today, Cloudflare is announcing that it has partnered with Lenovo to design a DC-SCM for our next-generation servers. The design specification has been published to the OCP (Open Compute Project) contribution database under the name Project Argus.

A brief introduction to baseboard management controllers

A baseboard management controller (BMC) is a specialized processor that can be found in virtually every server product. It allows remote access to the server through a network connection, and provides a rich set of server management features. Some of the commonly used BMC features include server power management, device discovery, sensor monitoring, remote firmware update, system event logging, and error reporting.

In a typical server design, the BMC resides on the server motherboard, along with other key components such as the processor, memory, and CPLD. This was the norm for generations of server products, but that has changed in recent years as motherboards are increasingly optimized for high-speed signal bandwidth, and servers need to support specialized security requirements. This has made it necessary to decouple the BMC and its related components from the server motherboard, and move them to a smaller common form factor module known as the Datacenter-ready Secure Control Module (DC-SCM).

Figure 1 is a picture of a motherboard used on Cloudflare’s previous generation of edge servers. The BMC and its related circuit components are placed on the same printed circuit board as the host CPU.

For Cloudflare’s next generation of edge servers, we are partnering with Lenovo to create a DC-SCM based design. On the left-hand side of Figure 2 is the printed circuit board assembly (PCBA) for the Host Processor Module (HPM). It hosts the CPU, the memory slots, and other components required for the operation and features of the server design. But the BMC and its related circuits have been relocated to a separate PCBA, which is the DC-SCM.

Benefits of DC-SCM based server design

PCB cost reduction

As of today, DDR5 memory runs at 6400 MT/s (mega transfers per second). In the future, DDR5 speed may increase to 7200 MT/s or even 8800 MT/s. Meanwhile, PCIe Gen5 runs at 32 GT/s (giga transfers per second), double the rate of PCIe Gen4. Both DDR5 and PCIe Gen5 are key interfaces for the processors used on our next-generation servers.
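To put those transfer rates in perspective, the following is a rough back-of-the-envelope sketch of the peak bandwidth they imply. The 64-bit DDR5 data path, the x16 PCIe link width, and the 128b/130b line encoding are assumptions brought in for illustration; they are not stated in this post, and real-world throughput will be lower than these theoretical peaks.

```python
# Back-of-the-envelope peak bandwidth from the transfer rates quoted above.
# Assumptions (not from the post): an 8-byte (64-bit) DDR5 data path per DIMM,
# a x16 PCIe link, and 128b/130b line encoding for PCIe Gen4 and Gen5.

def ddr_peak_gb_per_s(mega_transfers_per_s: float, bus_bytes: int = 8) -> float:
    """Peak memory bandwidth in GB/s: transfers per second times bytes per transfer."""
    return mega_transfers_per_s * 1e6 * bus_bytes / 1e9

def pcie_peak_gb_per_s(giga_transfers_per_s: float, lanes: int = 16) -> float:
    """Peak one-direction PCIe bandwidth in GB/s after 128b/130b encoding overhead."""
    payload_fraction = 128 / 130
    bits_per_s = giga_transfers_per_s * 1e9 * payload_fraction * lanes
    return bits_per_s / 8 / 1e9

ddr5_6400 = ddr_peak_gb_per_s(6400)
gen4_x16 = pcie_peak_gb_per_s(16)   # Gen4 rate implied by "double the rate"
gen5_x16 = pcie_peak_gb_per_s(32)

print(f"DDR5-6400 per DIMM: {ddr5_6400:.1f} GB/s")
print(f"PCIe Gen4 x16:      {gen4_x16:.1f} GB/s per direction")
print(f"PCIe Gen5 x16:      {gen5_x16:.1f} GB/s per direction")
```

Each doubling of the per-lane rate doubles the theoretical link bandwidth, which is what pushes the signal-integrity requirements on the host PCB.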

The increasing rates of high-speed IO signals and memory buses are pushing the next generation of server motherboard designs to transition from low-loss to ultra-low loss dielectric printed circuit board (PCB) materials, and to higher layer counts in the PCB. At the same time, the speed of the BMC and its related circuitry is not progressing as quickly. For example, the physical layer interface of the ASPEED AST2600 BMC is only at PCIe Gen2 (5 GT/s).

Ultra-low loss dielectric PCB material and higher PCB layer count are both driving factors for higher PCB cost. Another driving factor of PCB cost is the size of the PCB. In a traditional server motherboard design, the size of the server motherboard is larger, since the BMC and its related circuits are placed on the same PCB as the host CPU.

By decoupling the BMC and its related circuitry from the host processor module (HPM), we can reduce the size of the relatively more expensive PCB for the HPM. The BMC and its related circuitry can be placed on a relatively cheaper PCB, with a reduced layer count and lossier PCB dielectric materials. For example, in the design of Cloudflare’s next generation of servers, the server motherboard PCB needs to be 14 or more layers, whereas the BMC and its related components can be easily routed with 8 or 10 layers of PCB. In addition, the dielectric material used on the DC-SCM PCB is low-loss dielectric, another cost saver compared to the ultra-low loss dielectric materials used on the HPM PCB.

Modularized design enables flexibility

DC-SCM modularizes server management and security components into a common add-in card form factor, enabling developers to move customer-specific solutions from the more complex components, such as motherboards, to the DC-SCM. This provides flexibility for developers to offer multiple customer-specific solutions, without the need to redesign a motherboard for each solution.

Developers are able to reuse the DC-SCM from a previous generation of server design, if the management and security requirements remain the same. This reduces the overall cost of upgrading to a new generation of servers, and has the potential to reduce e-waste when a server is decommissioned.

Likewise, management and security solution upgrades within a server generation can be carried out separately by modifying or replacing the DC-SCM. The more complex components on the HPM do not need to be redesigned. From a data center perspective, it speeds up the upgrade of management and security hardware across multiple server platforms.

Unified interoperable OpenBMC firmware development

The data center secure control interface (DC-SCI) is a standardized hardware interface between the DC-SCM and the host processor module (HPM). It provides a basis for electrical interoperability between different DC-SCM and HPM designs.

This interoperability makes it possible to have a unified firmware image across multiple DC-SCM designs, concentrating development resources on a single firmware rather than an array of them. The publicly-accessible OpenBMC repository provides a perfect platform for firmware developers of different companies to collaborate and develop such unified OpenBMC images. Instead of maintaining a separate BMC firmware image for each platform, we now use a single image that can be applied across multiple server platforms. The device tree specific to each respective server is automatically loaded based on device product information.

Using a unified OpenBMC image significantly simplifies the process of releasing BMC firmware to multiple server platforms. Firmware updates and changes are propagated to all supported platforms in a single firmware release.

Project Argus

The DC-SCM specifications have been driven by the Open Compute Project (OCP) Foundation hardware management workstream, as a way to standardize server management, security, and control features.

Cloudflare has partnered with Lenovo on what we call Project Argus, Cloudflare’s first DC-SCM implementation that fully adheres to the DC-SCM 2.0 specification. In the DC-SCM 2.0 specification, a few design items are left open for implementers to decide on the most suitable architectural choices. With the goal of improving interoperability of Cloudflare DC-SCM designs across server vendors and server designs, Project Argus includes documentation on implementation details and design decisions on form factor, mechanical locking mechanism, faceplate design, DC-SCI pinout, BMC chip, BMC pinout, Hardware Root of Trust (HWRoT), HWRoT pinout, and minimum bootable device tree.

At the heart of the Project Argus DC-SCM is the ASPEED AST2600 BMC System on Chip (SoC), which, when loaded with compatible OpenBMC firmware, provides a rich set of common features necessary for remote server management. The ASPEED AST1060 is used on the Project Argus DC-SCM as the HWRoT solution, providing secure firmware authentication, firmware recovery, and firmware update capability. Project Argus DC-SCM 2.0 uses the Lattice MachXO3D CPLD, with secure boot and dual boot ability, as the DC-SCM CPLD to support a variety of IO interfaces including LTPI, SGPIO, UART, and GPIOs.

The mechanical form factor of Project Argus DC-SCM 2.0 is the horizontal External Form Factor (EFF).

Cloudflare and Lenovo have contributed the Project Argus Design Specification and reference design files to the OCP contribution database. Below is a detailed list of our contributions:

SPI, I2C/I3C, UART, LTPI/SGPIO block diagrams
DC-SCM PCB stackup
DC-SCM Board placements (TOP and BOTTOM layers)
DC-SCM schematic PDF file
DC-SCI pin definition PDF file
Power sequence PDF file
DC-SCM bill of materials Excel spreadsheet
Minimum bootable device tree requirements
Mechanical Drawings PDF files, including card assembly drawing and interlock rail drawing

The security foundation for our Gen 12 hardware

Cloudflare has been innovating around server design for many years, delivering increased performance per watt and reduced carbon footprints. We are excited to integrate Project Argus DC-SCM 2.0 into our next-generation, Cloudflare Gen 12 servers. Stay tuned for more exciting updates on Cloudflare Gen 12 hardware design!

Now available: Building a scalable vulnerability management program on AWS

2023-10-12 Anna McAbee

Post Syndicated from Anna McAbee original https://aws.amazon.com/blogs/security/now-available-how-to-build-a-scalable-vulnerability-management-program-on-aws/

Vulnerability findings in a cloud environment can come from a variety of tools and scans depending on the underlying technology you’re using. Without processes in place to handle these findings, they can begin to mount, often reaching thousands to tens of thousands of findings in a short amount of time. We’re excited to announce the Building a scalable vulnerability management program on AWS guide, which covers how you can build a structured vulnerability management program, operationalize tooling, and scale your processes to handle a large number of findings from diverse sources.

Building a scalable vulnerability management program on AWS focuses on the fundamentals of building a cloud vulnerability management program, including traditional software and network vulnerabilities and cloud configuration risks. The guide covers how to build a successful and scalable vulnerability management program on AWS through preparation, enabling and configuring tools, triaging findings, and reporting.

Targeted outcomes

This guide can help you and your organization with the following:

Develop policies to streamline vulnerability management and maintain accountability.
Establish mechanisms to extend the responsibility of security to your application teams.
Configure relevant AWS services according to best practices for scalable vulnerability management.
Identify patterns for routing security findings to support a shared responsibility model.
Establish mechanisms to report on and iterate on your vulnerability management program.
Improve security finding visibility and help improve overall security posture.

Using the new guide

We encourage you to read the entire guide before taking action or building a list of changes to implement. After you read the guide, assess your current state compared to the action items and check off the items that you’ve already completed in the Next steps table. This will help you assess the current state of your AWS vulnerability management program. Then, plan short-term and long-term roadmaps based on your gaps, desired state, resources, and business needs. Building a cloud vulnerability management program often involves iteration, so you should prioritize key items and regularly revisit your backlog to keep up with technology changes and your business requirements.

Further information

For more information and to get started, see the Building a scalable vulnerability management program on AWS guide.

We greatly value feedback and contributions from our community. To share your thoughts and insights about the guide, your experience using it, and what you want to see in future versions, select Provide feedback at the bottom of any page in the guide and complete the form.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

New whitepaper available: Charting a path to stronger security with Zero Trust

2023-10-11 Quint Van Deman

Post Syndicated from Quint Van Deman original https://aws.amazon.com/blogs/security/new-whitepaper-available-charting-a-path-to-stronger-security-with-zero-trust/

Security is a top priority for organizations looking to keep pace with a changing threat landscape and build customer trust. However, the traditional approach of defined security perimeters that separate trusted from untrusted network zones has proven to be inadequate as hybrid work models accelerate digital transformation.

Today’s distributed enterprise requires a new approach to ensuring the right levels of security and accessibility for systems and data. Security experts increasingly recommend Zero Trust as the solution, but security teams can get confused when Zero Trust is presented as a product, rather than as a security model. We’re excited to share a whitepaper we recently authored with SANS Institute called Zero Trust: Charting a Path To Stronger Security, which addresses common misconceptions and explores Zero Trust opportunities.

Gartner predicts that by 2025, over 60% of organizations will embrace Zero Trust as a starting place for security.

The whitepaper includes context and analysis that can help you move past Zero Trust marketing hype and learn about these key considerations for implementing a successful Zero Trust strategy:

Zero Trust definition and guiding principles
Six foundational capabilities to establish
Four fallacies to avoid
Six Zero Trust use cases
Metrics for measuring Zero Trust ROI

The journey to Zero Trust is an iterative process that is different for every organization. We encourage you to download the whitepaper, and gain insight into how you can chart a path to a multi-layered security strategy that adapts to the modern environment and meaningfully improves your technical and business outcomes. We look forward to your feedback and to continuing the journey together.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

HTTP/2 Rapid Reset: deconstructing the record-breaking attack

2023-10-10 Lucas Pardue

Post Syndicated from Lucas Pardue original http://blog.cloudflare.com/technical-breakdown-http2-rapid-reset-ddos-attack/

Starting on August 25, 2023, we began to notice unusually large HTTP attacks hitting many of our customers. These attacks were detected and mitigated by our automated DDoS system. It was not long, however, before they reached record-breaking sizes, eventually peaking just above 201 million requests per second. This was nearly 3x bigger than our previous biggest attack on record.

It is concerning that the attacker was able to generate such an attack with a botnet of merely 20,000 machines. There are botnets today that are made up of hundreds of thousands or millions of machines. Given that the entire web typically sees only between 1 and 3 billion requests per second, it's not inconceivable that this method could focus an entire web’s worth of requests on a small number of targets.
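The arithmetic behind that concern can be sketched as follows. The peak rate and botnet size are the rounded figures quoted above; the even split of load across machines is a simplifying assumption, since real botnet nodes will vary widely in capacity.

```python
import math

# Rounded figures quoted in the post.
PEAK_RPS = 201_000_000   # peak attack rate, requests per second
BOTNET_SIZE = 20_000     # approximate number of machines in the botnet

def per_machine_rps(peak_rps: int, machines: int) -> float:
    """Average rate each machine must sustain, assuming load is split evenly."""
    return peak_rps / machines

def machines_needed(target_rps: int, rate_per_machine: float) -> int:
    """Machines needed to reach target_rps at the given per-machine rate, rounded up."""
    return math.ceil(target_rps / rate_per_machine)

rate = per_machine_rps(PEAK_RPS, BOTNET_SIZE)
print(f"Average per machine: {rate:.0f} requests per second")

# Botnet size needed to match the low end of the web-wide estimate (1 billion rps).
print(f"Machines for 1 billion rps: {machines_needed(1_000_000_000, rate)}")
```

At roughly ten thousand requests per second per machine, a botnet of about one hundred thousand machines would be enough to reach the lower bound of the web-wide figure, which is well within the size of botnets seen today.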

Detecting and Mitigating

This was a novel attack vector at an unprecedented scale, but Cloudflare's existing protections were largely able to absorb the brunt of the attacks. While initially we saw some impact to customer traffic — affecting roughly 1% of requests during the initial wave of attacks — today we’ve been able to refine our mitigation methods to stop the attack for any Cloudflare customer without it impacting our systems.

We noticed these attacks at the same time two other major industry players — Google and AWS — were seeing the same. We worked to harden Cloudflare’s systems to ensure that, today, all our customers are protected from this new DDoS attack method without any customer impact. We’ve also participated with Google and AWS in a coordinated disclosure of the attack to impacted vendors and critical infrastructure providers.

This attack was made possible by abusing some features of the HTTP/2 protocol and server implementation details (see CVE-2023-44487 for details). Because the attack abuses an underlying weakness in the HTTP/2 protocol, we believe any vendor that has implemented HTTP/2 will be subject to the attack. This includes every modern web server. We, along with Google and AWS, have disclosed the attack method to web server vendors who we expect will implement patches. In the meantime, the best defense is using a DDoS mitigation service like Cloudflare’s in front of any internet-facing web or API server.

This post dives into the details of the HTTP/2 protocol, the feature that attackers exploited to generate these massive attacks, and the mitigation strategies we took to ensure all our customers are protected. Our hope is that by publishing these details other impacted web servers and services will have the information they need to implement mitigation strategies. And, moreover, the HTTP/2 protocol standards team, as well as teams working on future web standards, can better design them to prevent such attacks.

RST attack details

HTTP is the application protocol that powers the Web. HTTP Semantics are common to all versions of HTTP — the overall architecture, terminology, and protocol aspects such as request and response messages, methods, status codes, header and trailer fields, message content, and much more. Each individual HTTP version defines how semantics are transformed into a "wire format" for exchange over the Internet. For example, a client has to serialize a request message into binary data and send it, then the server parses that back into a message it can process.

HTTP/1.1 uses a textual form of serialization. Request and response messages are exchanged as a stream of ASCII characters, sent over a reliable transport layer like TCP, using the following format (where CRLF means carriage-return and linefeed):

 HTTP-message   = start-line CRLF
                   *( field-line CRLF )
                   CRLF
                   [ message-body ]

For example, a very simple GET request for https://blog.cloudflare.com/ would look like this on the wire:

GET / HTTP/1.1CRLFHost: blog.cloudflare.comCRLFCRLF

And the response would look like:

HTTP/1.1 200 OKCRLFServer: cloudflareCRLFContent-Length: 100CRLFContent-Type: text/html; charset=UTF-8CRLFCRLF<100 bytes of data>

This format frames messages on the wire, meaning that it is possible to use a single TCP connection to exchange multiple requests and responses. However, the format requires that each message is sent whole. Furthermore, in order to correctly correlate requests with responses, strict ordering is required, meaning that messages are exchanged serially and cannot be multiplexed. Two GET requests, for https://blog.cloudflare.com/ and https://blog.cloudflare.com/page/2/, would be:

GET / HTTP/1.1CRLFHost: blog.cloudflare.comCRLFCRLFGET /page/2/ HTTP/1.1CRLFHost: blog.cloudflare.comCRLFCRLF

With the responses:

HTTP/1.1 200 OKCRLFServer: cloudflareCRLFContent-Length: 100CRLFContent-Type: text/html; charset=UTF-8CRLFCRLF<100 bytes of data>HTTP/1.1 200 OKCRLFServer: cloudflareCRLFContent-Length: 100CRLFContent-Type: text/html; charset=UTF-8CRLFCRLF<100 bytes of data>
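As a rough illustration of the textual framing described above, here is a minimal Python sketch that serializes two body-less GET requests according to the HTTP-message grammar and splits them back apart in arrival order. The helper names `serialize_request` and `split_pipelined_requests` are made up for this example, and the splitter only handles requests without message bodies; a real parser would also need to honor Content-Length.

```python
CRLF = "\r\n"

def serialize_request(method: str, target: str, host: str) -> bytes:
    """Serialize a body-less request: start-line CRLF, field-line CRLF, then an empty CRLF."""
    message = (
        f"{method} {target} HTTP/1.1{CRLF}"  # start-line
        f"Host: {host}{CRLF}"                # a single field-line
        f"{CRLF}"                            # empty line that ends the header section
    )
    return message.encode("ascii")

def split_pipelined_requests(wire: bytes) -> list[str]:
    """Return the start-line of each back-to-back, body-less request, in order."""
    messages = [m for m in wire.split(b"\r\n\r\n") if m]
    return [m.decode("ascii").split(CRLF)[0] for m in messages]

# Two requests sent one after the other on the same connection.
wire = (
    serialize_request("GET", "/", "blog.cloudflare.com")
    + serialize_request("GET", "/page/2/", "blog.cloudflare.com")
)
start_lines = split_pipelined_requests(wire)
print(start_lines)
```

Nothing in the byte stream identifies which response belongs to which request other than position, which is why the exchange has to stay strictly ordered.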

Web pages require more complicated HTTP interactions than these examples. When visiting the Cloudflare blog, your browser will load multiple scripts, styles and media assets. If you visit the front page using HTTP/1.1 and decide quickly to navigate to page 2, your browser can pick from two options. Either wait for all of the queued up responses for the page that you no longer want before page 2 can even start, or cancel in-flight requests by closing the TCP connection and opening a new connection. Neither of these is very practical. Browsers tend to work around these limitations by managing a pool of TCP connections (up to 6 per host) and implementing complex request dispatch logic over the pool.

HTTP/2 addresses many of the issues with HTTP/1.1. Each HTTP message is serialized into a set of HTTP/2 frames that have type, length, flags, stream identifier (ID) and payload. The stream ID makes it clear which bytes on the wire apply to which message, allowing safe multiplexing and concurrency. Streams are bidirectional. Clients send frames and servers reply with frames using the same ID.
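To make the frame structure concrete, the sketch below packs and unpacks the fixed 9-byte HTTP/2 frame header. The field widths (24-bit length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream ID) and the numeric codes come from the HTTP/2 specification (RFC 9113), not from this post; the helper names are invented for illustration.

```python
import struct

# Frame type, flag, and error codes as defined in RFC 9113.
FRAME_DATA, FRAME_HEADERS, FRAME_RST_STREAM = 0x0, 0x1, 0x3
FLAG_END_STREAM = 0x1
ERROR_CANCEL = 0x8

def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Prefix payload with the 9-byte frame header: length, type, flags, stream ID."""
    if len(payload) >= 2**24:
        raise ValueError("payload too large for a 24-bit length field")
    if not 0 <= stream_id < 2**31:
        raise ValueError("stream ID must fit in 31 bits")
    header = len(payload).to_bytes(3, "big") + struct.pack(">BBI", frame_type, flags, stream_id)
    return header + payload

def unpack_frame_header(header: bytes) -> tuple[int, int, int, int]:
    """Return (length, type, flags, stream_id) from the first 9 bytes of a frame."""
    length = int.from_bytes(header[0:3], "big")
    frame_type, flags, stream_id = struct.unpack(">BBI", header[3:9])
    return length, frame_type, flags, stream_id & 0x7FFFFFFF  # mask the reserved bit

# A RST_STREAM frame cancelling stream 1: the payload is a 32-bit error code.
rst = pack_frame(FRAME_RST_STREAM, 0, 1, struct.pack(">I", ERROR_CANCEL))
print(rst.hex())
print(unpack_frame_header(rst[:9]))
```

Because every frame carries its stream ID, a receiver can interleave frames from many streams on one connection and still reassemble each message.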

In HTTP/2 our GET request for https://blog.cloudflare.com would be exchanged across stream ID 1, with the client sending one HEADERS frame, and the server responding with one HEADERS frame, followed by one or more DATA frames. Client requests always use odd-numbered stream IDs, so subsequent requests would use stream ID 3, 5, and so on. Responses can be served in any order, and frames from different streams can be interleaved.

Stream multiplexing and concurrency are powerful features of HTTP/2. They enable more efficient usage of a single TCP connection. HTTP/2 optimizes resource fetching, especially when coupled with prioritization. On the flip side, making it easy for clients to launch large amounts of parallel work can increase the peak demand for server resources when compared to HTTP/1.1. This is an obvious vector for denial-of-service.

In order to provide some guardrails, HTTP/2 provides a notion of maximum active concurrent streams. The SETTINGS_MAX_CONCURRENT_STREAMS parameter allows a server to advertise its limit of concurrency. For example, if the server states a limit of 100, then only 100 requests can be active at any time. If a client attempts to open a stream above this limit, it must be rejected by the server using a RST_STREAM frame. Stream rejection does not affect the other in-flight streams on the connection.
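A server-side check of this limit might look roughly like the following sketch. The `ConcurrencyLimiter` class and its method names are hypothetical; real HTTP/2 servers track considerably more state per stream.

```python
class ConcurrencyLimiter:
    """Tracks active streams against an advertised SETTINGS_MAX_CONCURRENT_STREAMS value."""

    def __init__(self, max_concurrent_streams: int) -> None:
        self.max_concurrent_streams = max_concurrent_streams
        self.active: set[int] = set()

    def on_request(self, stream_id: int) -> str:
        """Accept the new stream if there is room, otherwise reject it with RST_STREAM."""
        if len(self.active) >= self.max_concurrent_streams:
            return "RST_STREAM"  # only this stream is rejected; others are unaffected
        self.active.add(stream_id)
        return "ACCEPT"

    def on_stream_closed(self, stream_id: int) -> None:
        """Free the slot once the stream is closed."""
        self.active.discard(stream_id)

# The server advertises a limit of 100; the client tries to open 101 streams.
limiter = ConcurrencyLimiter(max_concurrent_streams=100)
results = [limiter.on_request(stream_id) for stream_id in range(1, 203, 2)]
print(results.count("ACCEPT"), results.count("RST_STREAM"))
```

The 101st request is refused, while the 100 streams already in flight stay active.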

The true story is a little more complicated. Streams have a lifecycle. Below is a diagram of the HTTP/2 stream state machine. Client and server manage their own views of the state of a stream. HEADERS, DATA and RST_STREAM frames trigger transitions when they are sent or received. Although the views of the stream state are independent, they are synchronized.

HEADERS and DATA frames include an END_STREAM flag, that when set to the value 1 (true), can trigger a state transition.

Let's work through this with an example of a GET request that has no message content. The client sends the request as a HEADERS frame with the END_STREAM flag set to 1. The client first transitions the stream from idle to open state, then immediately transitions into half-closed state. The client half-closed state means that it can no longer send HEADERS or DATA, only WINDOW_UPDATE, PRIORITY or RST_STREAM frames. It can, however, receive any frame.

Once the server receives and parses the HEADERS frame, it transitions the stream state from idle to open and then half-closed, so it matches the client. The server half-closed state means it can send any frame but receive only WINDOW_UPDATE, PRIORITY or RST_STREAM frames.

The response to the GET contains message content, so the server sends HEADERS with END_STREAM flag set to 0, then DATA with END_STREAM flag set to 1. The DATA frame triggers the transition of the stream from half-closed to closed on the server. When the client receives it, it also transitions to closed. Once a stream is closed, no frames can be sent or received.

Applying this lifecycle back into the context of concurrency, HTTP/2 states:

Streams that are in the "open" state or in either of the "half-closed" states count toward the maximum number of streams that an endpoint is permitted to open. Streams in any of these three states count toward the limit advertised in the SETTINGS_MAX_CONCURRENT_STREAMS setting.
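The lifecycle walked through above can be sketched as a small state machine. This is a simplified model for illustration only: it covers HEADERS, DATA, and RST_STREAM, does not validate every illegal transition, and the class and method names are made up. The "local" and "remote" labels on the half-closed states follow the naming in the HTTP/2 specification.

```python
from enum import Enum

class StreamState(Enum):
    IDLE = "idle"
    OPEN = "open"
    HALF_CLOSED_LOCAL = "half-closed (local)"
    HALF_CLOSED_REMOTE = "half-closed (remote)"
    CLOSED = "closed"

# States that count toward SETTINGS_MAX_CONCURRENT_STREAMS.
COUNTED_STATES = {
    StreamState.OPEN,
    StreamState.HALF_CLOSED_LOCAL,
    StreamState.HALF_CLOSED_REMOTE,
}

class Stream:
    """One endpoint's view of a single stream (simplified)."""

    def __init__(self) -> None:
        self.state = StreamState.IDLE

    def send(self, frame: str, end_stream: bool = False) -> None:
        self._apply(frame, end_stream, StreamState.HALF_CLOSED_LOCAL, StreamState.HALF_CLOSED_REMOTE)

    def receive(self, frame: str, end_stream: bool = False) -> None:
        self._apply(frame, end_stream, StreamState.HALF_CLOSED_REMOTE, StreamState.HALF_CLOSED_LOCAL)

    def _apply(self, frame, end_stream, my_half_closed, other_half_closed) -> None:
        if frame == "RST_STREAM":
            self.state = StreamState.CLOSED  # a reset closes the stream immediately
            return
        if self.state is StreamState.CLOSED:
            raise ValueError("no HEADERS or DATA frames on a closed stream")
        if frame == "HEADERS" and self.state is StreamState.IDLE:
            self.state = StreamState.OPEN
        if end_stream:
            if self.state is StreamState.OPEN:
                self.state = my_half_closed
            elif self.state is other_half_closed:
                self.state = StreamState.CLOSED

    @property
    def counts_toward_limit(self) -> bool:
        return self.state in COUNTED_STATES

# Client view of the GET example: request without content, response with content.
client = Stream()
client.send("HEADERS", end_stream=True)      # idle -> open -> half-closed
after_request = client.state
client.receive("HEADERS", end_stream=False)  # no state change
client.receive("DATA", end_stream=True)      # half-closed -> closed
print(after_request.value, "->", client.state.value)
```

The stream occupies a concurrency slot from the moment the request is sent until the final DATA frame (or a RST_STREAM) closes it.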

In theory, the concurrency limit is useful. However, there are practical factors that hamper its effectiveness, which we will cover later in this post.

HTTP/2 request cancellation

Earlier, we talked about client cancellation of in-flight requests. HTTP/2 supports this in a much more efficient way than HTTP/1.1. Rather than needing to tear down the whole connection, a client can send a RST_STREAM frame for a single stream. This instructs the server to stop processing the request and to abort the response, which frees up server resources and avoids wasting bandwidth.

Let's consider our previous example of three requests. This time the client cancels the request on stream 1 after all of the HEADERS have been sent. The server parses this RST_STREAM frame before it is ready to serve the response and instead only responds to streams 3 and 5:

Request cancellation is a useful feature. For example, when scrolling a webpage with multiple images, a web browser can cancel images that fall outside the viewport, meaning that images entering it can load faster. HTTP/2 makes this behaviour a lot more efficient compared to HTTP/1.1.

A canceled request stream transitions rapidly through the stream lifecycle. The client's HEADERS with END_STREAM flag set to 1 transitions the state from idle to open to half-closed, then RST_STREAM immediately causes a transition from half-closed to closed.

Recall that only streams that are in the open or half-closed state contribute to the stream concurrency limit. When a client cancels a stream, it instantly gets the ability to open another stream in its place and can send another request immediately. This is the crux of what makes CVE-2023-44487 work.
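To make the accounting concrete, here is a minimal sketch of a client that sends requests without waiting for responses. The function name and parameters are hypothetical; the model only tracks which streams count toward the concurrency limit (those in the open or half-closed states) and ignores everything else about the protocol.

```python
def churn(total_requests, max_concurrent_streams, cancel_immediately):
    """Return (requests_sent, peak_concurrency) for a client that never
    waits for a response before trying to open the next stream."""
    active = set()  # streams currently in the open or half-closed state
    sent = 0
    peak = 0
    for i in range(total_requests):
        if len(active) >= max_concurrent_streams:
            break  # a compliant client has to wait for a stream to close
        stream_id = 2 * i + 1  # client-initiated streams use odd IDs
        active.add(stream_id)  # HEADERS: idle -> open -> half-closed
        sent += 1
        peak = max(peak, len(active))
        if cancel_immediately:
            active.remove(stream_id)  # RST_STREAM: half-closed -> closed
    return sent, peak


print(churn(1000, 100, cancel_immediately=False))  # (100, 100)
print(churn(1000, 100, cancel_immediately=True))   # (1000, 1)
```

Without cancellation the client stalls once it reaches the limit of 100 streams. With an immediate reset after every request, the number of streams counted toward the limit never exceeds 1, so all 1000 requests go out regardless of the limit.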

Rapid resets leading to denial of service

HTTP/2 request cancellation can be abused to rapidly reset an unbounded number of streams. When an HTTP/2 server is able to process client-sent RST_STREAM frames and tear down state quickly enough, such rapid resets do not cause a problem. Where issues start to crop up is when there is any kind of delay or lag in tidying up. The client can churn through so many requests that a backlog of work accumulates, resulting in excess consumption of resources on the server.

A common HTTP deployment architecture is to run an HTTP/2 proxy or load-balancer in front of other components. When a client request arrives it is quickly dispatched and the actual work is done as an asynchronous activity somewhere else. This allows the proxy to handle client traffic very efficiently. However, this separation of concerns can make it hard for the proxy to tidy up the in-process jobs. Therefore, these deployments are more likely to encounter issues from rapid resets.

When Cloudflare's reverse proxies process incoming HTTP/2 client traffic, they copy the data from the connection’s socket into a buffer and process that buffered data in order. As each request is read (HEADERS and DATA frames) it is dispatched to an upstream service. When RST_STREAM frames are read, the local state for the request is torn down and the upstream is notified that the request has been canceled. Rinse and repeat until the entire buffer is consumed. However this logic can be abused: when a malicious client started sending an enormous chain of requests and resets at the start of a connection, our servers would eagerly read them all and create stress on the upstream servers to the point of being unable to process any new incoming request.

Something that is important to highlight is that stream concurrency on its own cannot mitigate rapid reset. The client can churn requests to create high request rates no matter the server's chosen value of SETTINGS_MAX_CONCURRENT_STREAMS.

Rapid Reset dissected

Here's an example of rapid reset reproduced using a proof-of-concept client attempting to make a total of 1000 requests. I've used an off-the-shelf server without any mitigations; listening on port 443 in a test environment. The traffic is dissected using Wireshark and filtered to show only HTTP/2 traffic for clarity. Download the pcap to follow along.

It's a bit difficult to see, because there are a lot of frames. We can get a quick summary via Wireshark's Statistics > HTTP2 tool:

The first frame in this trace, in packet 14, is the server's SETTINGS frame, which advertises a maximum stream concurrency of 100. In packet 15, the client sends a few control frames and then starts making requests that are rapidly reset. The first HEADERS frame is 26 bytes long, all subsequent HEADERS are only 9 bytes. This size difference is due to a compression technology called HPACK. In total, packet 15 contains 525 requests, going up to stream 1051.

Interestingly, the RST_STREAM for stream 1051 doesn't fit in packet 15, so in packet 16 we see the server respond with a 404 response. Then in packet 17 the client does send the RST_STREAM, before moving on to sending the remaining 475 requests.

Note that although the server advertised 100 concurrent streams, both packets sent by the client contained a lot more HEADERS frames than that. The client did not have to wait for any return traffic from the server; it was limited only by the size of the packets it could send. No server RST_STREAM frames are seen in this trace, indicating that the server did not observe a concurrent stream violation.

Impact on customers

As mentioned above, as requests are canceled, upstream services are notified and can abort requests before wasting too many resources on them. This was the case with this attack, where most malicious requests were never forwarded to the origin servers. However, the sheer size of these attacks did cause some impact.

First, as the rate of incoming requests reached peaks never seen before, we had reports of increased levels of 502 errors seen by clients. This happened on our most impacted data centers as they were struggling to process all the requests. While our network is meant to deal with large attacks, this particular vulnerability exposed a weakness in our infrastructure. Let's dig a little deeper into the details, focusing on how incoming requests are handled when they hit one of our data centers:

We can see that our infrastructure is composed of a chain of different proxy servers with different responsibilities. In particular, when a client connects to Cloudflare to send HTTPS traffic, it first hits our TLS decryption proxy: it decrypts TLS traffic, processes HTTP 1, 2 or 3 traffic, then forwards it to our "business logic" proxy. This one is responsible for loading all the settings for each customer, then routing the requests correctly to other upstream services — and more importantly in our case, it is also responsible for security features. This is where L7 attack mitigation is processed.

The problem with this attack vector is that it manages to send a lot of requests very quickly in every single connection. Each of them had to be forwarded to the business logic proxy before we had a chance to block it. As the request throughput became higher than our proxy capacity, the pipe connecting these two services reached its saturation level in some of our servers.

When this happens, the TLS proxy can no longer connect to its upstream proxy, which is why some clients saw a bare "502 Bad Gateway" error during the most serious attacks. It is important to note that, as of today, the logs used to create HTTP analytics are also emitted by our business logic proxy. As a consequence, these errors are not visible in the Cloudflare dashboard. Our internal dashboards show that about 1% of requests were impacted during the initial wave of attacks (before we implemented mitigations), with peaks at around 12% for a few seconds during the most serious one on August 29th. The following graph shows the ratio of these errors over a two-hour period while this was happening:

We worked to reduce this number dramatically in the following days, as detailed later on in this post. Thanks both to changes in our stack and to our mitigations, which reduce the size of these attacks considerably, this number is today effectively zero:

499 errors and the challenges for HTTP/2 stream concurrency

Another symptom reported by some customers is an increase in 499 errors. The reason for this is a bit different and is related to the maximum stream concurrency in an HTTP/2 connection detailed earlier in this post.

HTTP/2 settings are exchanged at the start of a connection using SETTINGS frames. In the absence of receiving an explicit parameter, default values apply. Once a client establishes an HTTP/2 connection, it can wait for a server's SETTINGS (slow) or it can assume the default values and start making requests (fast). For SETTINGS_MAX_CONCURRENT_STREAMS, the default is effectively unlimited (stream IDs use a 31-bit number space, and requests use odd numbers, so the actual limit is 1073741824). The specification recommends that a server offer no fewer than 100 streams. Clients are generally biased towards speed, so don't tend to wait for server settings, which creates a bit of a race condition. Clients are taking a gamble on what limit the server might pick; if they pick wrong the request will be rejected and will have to be retried. Gambling on 1073741824 streams is a bit silly. Instead, a lot of clients decide to limit themselves to issuing 100 concurrent streams, with the hope that servers followed the specification recommendation. Where servers pick something below 100, this client gamble fails and streams are reset.
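The figure of 1073741824 follows directly from the size of the stream identifier space. As a quick check of the arithmetic:

```python
STREAM_ID_BITS = 31

# Stream identifiers range over 0 .. 2**31 - 1.
total_ids = 2 ** STREAM_ID_BITS

# Client-initiated requests use the odd identifiers 1, 3, ..., 2**31 - 1,
# which is exactly half of the identifier space.
client_stream_ids = total_ids // 2

print(client_stream_ids)  # 1073741824
```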

There are many reasons a server might reset a stream beyond concurrency limit overstepping. HTTP/2 is strict and requires a stream to be closed when there are parsing or logic errors. In 2019, Cloudflare developed several mitigations in response to HTTP/2 DoS vulnerabilities. Several of those vulnerabilities were caused by a client misbehaving, leading the server to reset a stream. A very effective strategy to clamp down on such clients is to count the number of server resets during a connection, and when that exceeds some threshold value, close the connection with a GOAWAY frame. Legitimate clients might make one or two mistakes in a connection and that is acceptable. A client that makes too many mistakes is probably either broken or malicious and closing the connection addresses both cases.

While responding to DoS attacks enabled by CVE-2023-44487, Cloudflare reduced maximum stream concurrency to 64. Before making this change, we were unaware that clients don't wait for SETTINGS and instead assume a concurrency of 100. Some web pages, such as an image gallery, do indeed cause a browser to send 100 requests immediately at the start of a connection. Unfortunately, the 36 streams above our limit all needed to be reset, which triggered our counting mitigations. This meant that we closed connections on legitimate clients, leading to a complete page load failure. As soon as we realized this interoperability issue, we changed the maximum stream concurrency to 100.
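The interaction between the lowered limit and the reset-counting mitigation can be sketched as follows. This is an illustration under assumptions, not Cloudflare's implementation: the class, the outcome labels, and the threshold of 20 server resets are all hypothetical, and the model assumes the whole burst arrives before any stream completes.

```python
from collections import Counter


class Connection:
    """Server-side view of one HTTP/2 connection (illustrative only)."""

    def __init__(self, max_concurrent_streams, reset_threshold):
        self.max_concurrent_streams = max_concurrent_streams
        self.reset_threshold = reset_threshold  # hypothetical tuning knob
        self.active_streams = 0
        self.server_resets = 0
        self.closed = False

    def on_request(self):
        """Handle a new request stream and return what the server does."""
        if self.closed:
            return "dropped"  # the connection was already closed with GOAWAY
        if self.active_streams < self.max_concurrent_streams:
            self.active_streams += 1
            return "accepted"
        self.server_resets += 1  # over the limit: the server resets the stream
        if self.server_resets > self.reset_threshold:
            self.closed = True  # too many resets: close with GOAWAY
            return "goaway"
        return "reset"


def burst(limit, requests=100, reset_threshold=20):
    """Simulate a client that sends `requests` streams at once."""
    conn = Connection(limit, reset_threshold)
    return Counter(conn.on_request() for _ in range(requests))


print(burst(limit=64))
print(burst(limit=100))
```

With a limit of 64, the 36 excess streams push the reset counter past the assumed threshold and the connection is closed, so a legitimate page load fails. With a limit of 100, the same burst is accepted in full.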

Actions from the Cloudflare side

In 2019 several DoS vulnerabilities were uncovered related to implementations of HTTP/2. Cloudflare developed and deployed a series of detections and mitigations in response. CVE-2023-44487 is a different manifestation of HTTP/2 vulnerability. However, to mitigate it we were able to extend the existing protections to monitor client-sent RST_STREAM frames and close connections when they are being used for abuse. Legitimate client uses for RST_STREAM are unaffected.

In addition to a direct fix, we have implemented several improvements to the server's HTTP/2 frame processing and request dispatch code. Furthermore, the business logic server has received improvements to queuing and scheduling that reduce unnecessary work and improve cancellation responsiveness. Together these lessen the impact of various potential abuse patterns as well as giving more room to the server to process requests before saturating.

Mitigate attacks earlier

Cloudflare already had systems in place to efficiently mitigate very large attacks with less expensive methods. One of them is named "IP Jail". For hyper volumetric attacks, this system collects the client IPs participating in the attack and stops them from connecting to the attacked property, either at the IP level, or in our TLS proxy. This system however needs a few seconds to be fully effective; during these precious seconds, the origins are already protected but our infrastructure still needs to absorb all HTTP requests. As this new botnet has effectively no ramp-up period, we need to be able to neutralize attacks before they can become a problem.

To achieve this we expanded the IP Jail system to protect our entire infrastructure: once an IP is "jailed", not only is it blocked from connecting to the attacked property, but we also forbid the corresponding IPs from using HTTP/2 to any other domain on Cloudflare for some time. As such protocol abuses are not possible using HTTP/1.x, this limits the attacker's ability to run large attacks, while any legitimate client sharing the same IP would only see a very small performance decrease during that time. IP-based mitigations are a very blunt tool, which is why we have to be extremely careful when using them at that scale and seek to avoid false positives as much as possible. Moreover, the lifespan of a given IP in a botnet is usually short, so any long-term mitigation is likely to do more harm than good. The following graph shows the churn of IPs in the attacks we witnessed:

As we can see, many new IPs spotted on a given day disappear very quickly afterwards.

As all these actions happen in our TLS proxy at the beginning of our HTTPS pipeline, this saves considerable resources compared to our regular L7 mitigation system. This allowed us to weather these attacks much more smoothly and now the number of random 502 errors caused by these botnets is down to zero.
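The extended IP Jail behaviour described in this section could look roughly like the sketch below. Everything here is an assumption made for illustration: the class name, the decision labels, the jail duration, and the idea of a single in-memory table are not taken from Cloudflare's systems. The short expiry reflects the point above that botnet IPs churn quickly, so long-lived blocks do more harm than good.

```python
import time


class IpJail:
    """Time-limited jail keyed by client IP (hypothetical sketch)."""

    def __init__(self, jail_seconds, clock=time.monotonic):
        self.jail_seconds = jail_seconds
        self.clock = clock
        self._entries = {}  # ip -> (release_time, attacked_host)

    def jail(self, ip, attacked_host):
        self._entries[ip] = (self.clock() + self.jail_seconds, attacked_host)

    def decide(self, ip, host, protocol):
        """Return 'allow', 'block', or 'no-http2' for a new connection."""
        entry = self._entries.get(ip)
        if entry is None:
            return "allow"
        release_time, attacked_host = entry
        if self.clock() >= release_time:
            del self._entries[ip]  # the jail term has expired
            return "allow"
        if host == attacked_host:
            return "block"  # no connections to the attacked property
        if protocol == "h2":
            return "no-http2"  # other domains: HTTP/2 is refused for now
        return "allow"  # HTTP/1.x to other domains is unaffected
```

A caller would record an IP with `jail(ip, attacked_host)` when the attack detection flags it, then consult `decide()` for each new connection. Injecting the clock makes the expiry logic easy to exercise without waiting.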

Observability improvements

Another front on which we are making changes is observability. Returning errors to clients without being visible in customer analytics is unsatisfactory. Fortunately, a project has been underway to overhaul these systems since long before the recent attacks. It will eventually allow each service within our infrastructure to log its own data, instead of relying on our business logic proxy to consolidate and emit log data. This incident underscored the importance of this work, and we are redoubling our efforts.

We are also working on better connection-level logging, allowing us to spot such protocol abuses much more quickly to improve our DDoS mitigation capabilities.

Conclusion

While this was the latest record-breaking attack, we know it won’t be the last. As attacks continue to become more sophisticated, Cloudflare works relentlessly to proactively identify new threats — deploying countermeasures to our global network so that our millions of customers are immediately and automatically protected.

Cloudflare has provided free, unmetered and unlimited DDoS protection to all of our customers since 2017. In addition, we offer a range of additional security features to suit the needs of organizations of all sizes. Contact us if you’re unsure whether you’re protected or want to understand how you can be.

HTTP/2 Zero-Day Vulnerability Results in Record-Breaking DDoS Attacks

2023-10-10 Grant Bourzikas

Post Syndicated from Grant Bourzikas original http://blog.cloudflare.com/zero-day-rapid-reset-http2-record-breaking-ddos-attack/

Earlier today, Cloudflare, along with Google and Amazon AWS, disclosed the existence of a novel zero-day vulnerability dubbed the “HTTP/2 Rapid Reset” attack. This attack exploits a weakness in the HTTP/2 protocol to generate enormous, hyper-volumetric Distributed Denial of Service (DDoS) attacks. Cloudflare has mitigated a barrage of these attacks in recent months, including an attack three times larger than any previous attack we’ve observed, which exceeded 201 million requests per second (rps). Since the end of August 2023, Cloudflare has mitigated more than 1,100 other attacks with over 10 million rps — and 184 attacks that were greater than our previous DDoS record of 71 million rps.

This zero-day provided threat actors with a critical new tool in their Swiss Army knife of vulnerabilities to exploit and attack their victims at a magnitude that has never been seen before. While at times complex and challenging to combat, these attacks allowed Cloudflare the opportunity to develop purpose-built technology to mitigate the effects of the zero-day vulnerability.

If you are using Cloudflare for HTTP DDoS mitigation, you are protected. And below, we’ve included more information on this vulnerability, and resources and recommendations on what you can do to secure yourselves.

Deconstructing the attack: What every CSO needs to know

In late August 2023, our team at Cloudflare noticed a new zero-day vulnerability, developed by an unknown threat actor, that exploits the standard HTTP/2 protocol — a fundamental protocol that is critical to how the Internet and all websites work. This novel zero-day vulnerability attack, dubbed Rapid Reset, leverages HTTP/2’s stream cancellation feature by sending a request and immediately canceling it over and over.

By automating this trivial “request, cancel, request, cancel” pattern at scale, threat actors are able to create a denial of service and take down any server or application running the standard implementation of HTTP/2. Furthermore, one crucial thing to note about the record-breaking attack is that it involved a modestly-sized botnet, consisting of roughly 20,000 machines. Cloudflare regularly detects botnets that are orders of magnitude larger than this — comprising hundreds of thousands and even millions of machines. For a relatively small botnet to output such a large volume of requests, with the potential to incapacitate nearly any server or application supporting HTTP/2, underscores how menacing this vulnerability is for unprotected networks.

Threat actors used botnets in tandem with the HTTP/2 vulnerability to amplify requests at rates we have never seen before. As a result, our team at Cloudflare experienced some intermittent edge instability. While our systems were able to mitigate the overwhelming majority of incoming attacks, the volume overloaded some components in our network, impacting a small number of customers’ performance with intermittent 4xx and 5xx errors — all of which were quickly resolved.

Once we successfully mitigated these issues and halted potential attacks for all customers, our team immediately kicked off a responsible disclosure process. We entered into conversations with industry peers to see how we could work together to help move our mission forward and safeguard the large percentage of the Internet that relies on our network prior to releasing this vulnerability to the general public.

We cover the technical details of the attack in a separate blog post: HTTP/2 Rapid Reset: deconstructing the record-breaking attack.

How are Cloudflare and the industry thwarting this attack?

There is no such thing as a “perfect disclosure.” Thwarting attacks and responding to emerging incidents requires organizations and security teams to live by an assume-breach mindset — because there will always be another zero-day, new evolving threat actor groups, and never-before-seen novel attacks and techniques.

This “assume-breach” mindset is a key foundation for information sharing and for ensuring, in instances such as this, that the Internet remains safe. While Cloudflare was experiencing and mitigating these attacks, we were also working with industry partners to ensure that the industry at large could withstand this attack.

During the process of mitigating this attack, our Cloudflare team developed and purpose-built new technology to stop these DDoS attacks and further improve our own mitigations for this and other future attacks of massive scale. These efforts have significantly increased our overall mitigation capabilities and resiliency. If you are using Cloudflare, we are confident that you are protected.

Our team also alerted web server software partners who are developing patches to ensure this vulnerability cannot be exploited — check their websites for more information.

Disclosures are never one and done. The lifeblood of Cloudflare is to ensure a better Internet, which stems from instances such as these. When we have the opportunity to work with our industry partners and governments to ensure there are no widespread impacts on the Internet, we are doing our part in increasing the cyber resiliency of every organization no matter the size or vertical.

To gain more of an understanding around mitigation tactics and next steps on patching, register for our webinar.

What are the origins of the HTTP/2 Rapid Reset and these record-breaking attacks on Cloudflare?

It may seem odd that Cloudflare was one of the first companies to witness these attacks. Why would threat actors attack a company that has some of the most robust defenses against DDoS attacks in the world?

The reality is that Cloudflare often sees attacks before they are turned on more vulnerable targets. Threat actors need to develop and test their tools before they deploy them in the wild. Threat actors who possess record-shattering attack methods can have an extremely difficult time testing and understanding how large and effective they are, because they don't have the infrastructure to absorb the attacks they are launching. Because of the transparency that we share on our network performance, and the measurements of attacks they could glean from our public performance charts, this threat actor was likely targeting us to understand the capabilities of the exploit.

But that testing, and the ability to see the attack early, helps us develop mitigations for the attack that benefit both our customers and industry as a whole.

From CSO to CSO: What should you do?

I have been a CSO for over 20 years, on the receiving end of countless disclosures and announcements like this. But whether it was Log4J, SolarWinds, EternalBlue WannaCry/NotPetya, Heartbleed, or Shellshock, all of these security incidents have a commonality: a tremendous explosion that ripples across the world and creates an opportunity to completely disrupt any of the organizations that I have led, regardless of the industry or the size.

Many of these were attacks or vulnerabilities that we may have not been able to control. But regardless of whether the issue arose from something that was in my control or not, what has set any successful initiative I have led apart from those that did not lean in our favor was the ability to respond when zero-day vulnerabilities and exploits like this are identified.

While I wish I could say that Rapid Reset may be different this time around, it is not. I am calling all CSOs — no matter if you’ve lived through the decades of security incidents that I have, or this is your first day on the job — this is the time to ensure you are protected and stand up your cyber incident response team.

We’ve kept the information restricted until today to give as many security vendors as possible the opportunity to react. However, at some point, the responsible thing becomes to publicly disclose zero-day threats like this. Today is that day. That means that after today, threat actors will be largely aware of the HTTP/2 vulnerability, and it will inevitably become trivial to exploit, kicking off the race between defenders and attackers: first to patch vs. first to exploit. Organizations should assume that systems will be tested, and take proactive measures to ensure protection.

To me, this is reminiscent of a vulnerability like Log4J, due to the many variants that are emerging daily, and will continue to come to fruition in the weeks, months, and years to come. As more researchers and threat actors experiment with the vulnerability, we may find different variants with even shorter exploit cycles that contain even more advanced bypasses.

And just like Log4J, managing incidents like this isn’t as simple as “run the patch, now you’re done”. You need to turn incident management, patching, and evolving your security protections into ongoing processes — because the patches for each variant of a vulnerability reduce your risk, but they don’t eliminate it.

I don’t mean to be alarmist, but I will be direct: you must take this seriously. Treat this as a full active incident to ensure nothing happens to your organization.

Recommendations for a New Standard of Change

While no one security event is ever identical to the next, there are lessons that can be learned. CSOs, here are my recommendations that must be implemented immediately. Not only in this instance, but for years to come:

Understand your external and partner network’s external connectivity to remediate any Internet facing systems with the mitigations below.
Understand your existing security protection and capabilities you have to protect, detect and respond to an attack and immediately remediate any issues you have in your network.
Ensure your DDoS protection resides outside of your data center, because if the traffic gets to your data center, it will be difficult to mitigate the DDoS attack.
Ensure you have DDoS protection for Applications (Layer 7) and ensure you have Web Application Firewalls. Additionally, as a best practice, ensure you have complete DDoS protection for DNS, Network Traffic (Layer 3) and API Firewalls.
Ensure web server and operating system patches are deployed across all Internet Facing Web Servers. Also, ensure all automation like Terraform builds and images are fully patched so older versions of web servers are not deployed into production over the secure images by accident.
As a last resort, consider turning off HTTP/2 and HTTP/3 (likely also vulnerable) to mitigate the threat. This is a last resort only, because there will be significant performance issues if you downgrade to HTTP/1.1.
Consider a secondary, cloud-based DDoS L7 provider at perimeter for resilience.

Cloudflare’s mission is to help build a better Internet. If you are concerned with your current state of DDoS protection, we are more than happy to provide you with our DDoS capabilities and resilience for free to mitigate any attempts of a successful DDoS attack. We know the stress that you are facing as we have fought off these attacks for the last 30 days and made our already best in class systems, even better.

If you’re interested in finding out more, we have a webinar coming up with more details on the zero-day and how to respond; you can register here. We also cover the technical details of the attack in a separate blog post: HTTP/2 Rapid Reset: deconstructing the record-breaking attack. Finally, if you’re being targeted or need immediate protection, please contact your local Cloudflare representative or visit https://www.cloudflare.com/under-attack-hotline/.

Uncovering the Hidden WebP vulnerability: a tale of a CVE with much bigger implications than it originally seemed

2023-10-05 Willi Geiger

Post Syndicated from Willi Geiger original http://blog.cloudflare.com/uncovering-the-hidden-webp-vulnerability-cve-2023-4863/

At Cloudflare, we're constantly vigilant when it comes to identifying vulnerabilities that could potentially affect the Internet ecosystem. Recently, on September 12, 2023, Google announced a security issue in Google Chrome, titled "Heap buffer overflow in WebP in Google Chrome," which caught our attention. Initially, it seemed like just another bug in the popular web browser. However, what we discovered was far more significant and had implications that extended well beyond Chrome.

Impact much wider than suggested

The vulnerability, tracked under CVE-2023-4863, was described as a heap buffer overflow in WebP within Google Chrome. While this description might lead one to believe that it's a problem confined solely to Chrome, the reality was quite different. It turned out to be a bug deeply rooted in the libwebp library, which is not only used by Chrome but by virtually every application that handles WebP images.

Digging deeper, this vulnerability was in fact first reported in an earlier CVE from Apple, CVE-2023-41064, although the connection was not immediately obvious. In early September, Citizen Lab, a research lab based out of the University of Toronto, reported on an apparent exploit that was being used to attempt to install spyware on the iPhone of "an individual employed by a Washington DC-based civil society organization." The advisory from Apple was also incomplete, stating that it was a “buffer overflow issue in ImageIO,” and that they were aware the issue may have been actively exploited. Only after Google released CVE-2023-4863 did it become clear that these two issues were linked, and there was a wider vulnerability in WebP.

The vulnerability allows an attacker to create a malformed WebP image file that makes libwebp write data beyond the buffer memory allocated to the image decoder. By writing past the legal bounds of the buffer, it is possible to modify sensitive data in memory, eventually leading to execution of the attacker's code.

WebP, introduced over a decade ago, has gained widespread adoption in various applications, ranging from web browsers to email clients, chat apps, graphics programs, and even operating systems. This ubiquity meant that this vulnerability had far-reaching consequences, affecting a vast array of software and virtually all users of the WebP format.

Understanding the technical details

So what exactly was the issue, how could it be exploited, and how was it shut down? We can get our best clues by looking at the patch that was made to libwebp. This patch fixes a potential out-of-bounds (OOB) error in part of the image decoder (the Huffman tables) with two changes: additional validation of the input data, and a modified dynamic memory allocation model. A deeper dive into libwebp and the WebP image format built on top of it reveals what this means.

WebP is a combination of two different image formats: a lossy format similar to JPEG using VP8 codec, and a lossless format using WebP's custom lossless codec. The bug was in the lossless codec's handling of Huffman coding.

The fundamental idea behind Huffman coding is that using a constant number of bits for every basic unit of information in a dataset – like a pixel color – is not the most efficient representation. We can use a variable number of bits, and assign shortest sequences to the most frequently occurring values, and longer ones to the least common values. The sequences of ones and zeros can be represented as a binary tree, with the shorter, more common codes near the root, and longer, less common codes deeper in the tree. Looking up values in the tree bit by bit is relatively slow. Practical implementations build lookup tables that allow matching many bits at a time.
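As a small illustration of the idea, the sketch below computes Huffman code lengths for the symbols in a piece of data using the textbook merge-the-two-rarest construction. It is a generic example with a made-up function name, not code from libwebp, and it only computes the length of each code rather than the bit patterns.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data):
    """Return {symbol: code length in bits} for the symbols in `data`."""
    freq = Counter(data)
    if len(freq) == 1:
        return {next(iter(freq)): 1}  # a lone symbol still needs one bit
    # Heap entries: (frequency, tie-breaker, symbols in this subtree).
    heap = [(count, i, [sym]) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    tiebreak = len(heap)
    while len(heap) > 1:
        count1, _, syms1 = heapq.heappop(heap)  # the two rarest subtrees
        count2, _, syms2 = heapq.heappop(heap)
        for sym in syms1 + syms2:
            lengths[sym] += 1  # merging pushes these symbols one level deeper
        heapq.heappush(heap, (count1 + count2, tiebreak, syms1 + syms2))
        tiebreak += 1
    return lengths


data = "aaaaaaaabbbc"
lengths = huffman_code_lengths(data)
print(lengths)
print(sum(lengths[sym] for sym in data), "bits with Huffman codes")
print(2 * len(data), "bits with a fixed 2-bit code")
```

For this input the most frequent symbol gets a 1-bit code and the two rarer ones get 2-bit codes, for a total of 16 bits instead of the 24 bits a fixed 2-bit code would need.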

Image files contain compact information about the shape of the Huffman tree, which the decoder uses to reconstruct the tree, and build lookup tables for the codes. The bug in libwebp was in the code building the lookup tables. A specially crafted WebP file can contain a very unbalanced Huffman tree that contains codes much longer than any normal WebP file would have, and this made the function generating lookup tables write data beyond the buffer allocated for the lookup tables. Libwebp had checks for validity of the Huffman tree, but it would write the invalid lookup tables before the consistency check.

The buffer for lookup tables is allocated on the heap. Heap is an area of memory where most of the data of the application is stored. Code that writes data past its buffer allows attackers to modify and corrupt data that happens to be adjacent in memory to the buffer. This can be exploited to make the application misbehave, and eventually start executing code supplied by the attacker.

The fixed version of libwebp first checks that the input data describes a valid internal structure, and only then builds the lookup tables, allocating more memory if necessary so that the buffer is always big enough.
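The "validate first, then allocate to fit" ordering can be sketched like this. It is a hedged illustration of the general technique, not the actual patch: the completeness check used here is the standard Kraft equality for prefix codes, and `MAX_CODE_LENGTH` is an assumed limit chosen for the example.

```python
MAX_CODE_LENGTH = 15  # assumed upper bound for this illustration


def validate_code_lengths(lengths, max_allowed=MAX_CODE_LENGTH):
    """Return the longest code length, or raise ValueError for invalid input.

    `lengths` is a list of per-symbol code lengths; 0 means "symbol unused".
    """
    used = [n for n in lengths if n > 0]
    if not used:
        raise ValueError("no symbols have a code")
    max_len = max(used)
    if max_len > max_allowed:
        raise ValueError("code length exceeds the allowed maximum")
    # Kraft equality: a complete prefix code exactly fills the code space.
    filled = sum(1 << (max_len - n) for n in used)
    if filled != 1 << max_len:
        raise ValueError("code lengths do not describe a complete tree")
    return max_len


def allocate_table(lengths):
    """Allocate the lookup table only after the input has been validated."""
    max_len = validate_code_lengths(lengths)
    return [0] * (1 << max_len)   # sized from the data, not from a guess


print(len(allocate_table([1, 2, 3, 3])))   # 8
```

Nothing is written to the table until the lengths have passed the check, and the table size is derived from the validated input rather than from an assumption about what typical files contain.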

Libwebp is a mature library, maintained by seasoned professionals. But it's written in the C language, which has very few safeguards against programming errors, especially memory use. Despite the care taken in the library's development, a single erroneous assumption led to a critical vulnerability.

Swift action

On the same day that Google's announcement caught our attention, we filed an internal security ticket, to document and address the vulnerability.

Google was initially perplexed about the true source of the problem. They did not release a patched version of libwebp before announcing the vulnerability. We discovered the yet-unreleased patch for libwebp in its repository, and used it to update libwebp in our services. libwebp officially released the patch a day later.

Our image processing services are written in Rust. We've submitted patches to Rust packages that contained a copy of libwebp and filed RustSec advisories for them (RUSTSEC-2023-0061 and RUSTSEC-2023-0062). This ensured that the broader Rust ecosystem was informed and could take appropriate action.

In an interesting turn of events, GitHub's vulnerability scanner was quick to recognize our RustSec reports as the first case of CVE-2023-4863, even before the issue gained widespread attention. This highlights the importance of having robust security reporting mechanisms in place and the vital role that platforms like GitHub play in keeping the open-source community secure.

These quick actions demonstrate how seriously Cloudflare takes this kind of threat. We have a belt-and-suspenders approach to security that limits the binaries we run at our edge to those signed by us, and ensures that all vulnerabilities are identified and remedied as soon as possible. In this case, we have scrutinized our logs, and found no evidence that any attackers attempted to leverage this vulnerability against Cloudflare. We believe this exploit targeted individuals rather than the infrastructure of a company like Cloudflare, but we never take chances with our customers’ data, and so fixed this vulnerability as quickly as possible, before it became well known.

Conclusion

Google has now widened its description of this issue, correctly calling out that all uses of WebP are potentially affected. This widened description was originally filed as yet another new CVE – CVE-2023-5129 – but that was then flagged as a duplicate of the original CVE-2023-4863, and the description of the earlier filing was updated. This incident serves as a reminder of the complex and interconnected nature of the internet ecosystem. What initially seemed like a Chrome-specific problem revealed a much deeper issue that touched nearly every corner of the digital world. The incident also showcased the importance of swift collaboration and the critical role that responsible disclosure plays in mitigating security risks.

For each and every user, it demonstrates the need to keep all browsers, apps and operating systems up to date, and to install recommended security patches. All applications supporting WebP images need to be updated. We've updated our services.

At Cloudflare, we remain committed to enhancing the security of the internet, and incidents like these drive us to continually refine our processes and strengthen our partnerships within the global developer community. By working together, we can make the Internet a safer place for everyone.

Enable Security Hub partner integrations across your organization

2023-10-04 Joaquin Manuel Rinaudo

Post Syndicated from Joaquin Manuel Rinaudo original https://aws.amazon.com/blogs/security/enable-security-hub-partner-integrations-across-your-organization/

AWS Security Hub offers over 75 third-party partner product integrations, such as Palo Alto Networks Prisma, Prowler, Qualys, Wiz, and more, that you can use to send, receive, or update findings in Security Hub.

We recommend that you enable your corresponding Security Hub third-party partner product integrations when you use these partner solutions. By centralizing findings across your AWS and partner solutions in Security Hub, you can get a holistic cross-account and cross-Region view of your security risks. In this way, you can move beyond security reporting and start implementing automations on top of Security Hub that help improve your overall security posture and reduce manual efforts. For example, you can configure your third-party partner offerings to send findings to Security Hub and build standardized enrichment, escalation, and remediation solutions by using Security Hub automation rules, or other AWS services such as AWS Lambda or AWS Step Functions.

To enable partner integrations, you must configure the integration in each AWS Region and AWS account across your organization in AWS Organizations. In this blog post, we’ll show you how to set up a Security Hub partner integration across your entire organization by using AWS CloudFormation StackSets.

Overview

Figure 1 shows the architecture of the solution. The main steps are as follows:

The deployment script creates a CloudFormation template that deploys a stack set across your AWS accounts.
The stack in the member account deploys a CloudFormation custom resource using a Lambda function.
The Lambda function iterates through target Regions and invokes the Security Hub boto3 method enable_import_findings_for_product to enable the corresponding partner integration.

When you add new accounts to the organizational units (OUs), StackSets deploys the CloudFormation stack and the partner integration is enabled.
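The third step can be sketched in Python as follows. This is an assumption-laden illustration rather than the code shipped in the repository: the function name, the `client_factory` parameter, and the error handling are hypothetical, while `enable_import_findings_for_product` is the boto3 Security Hub method named above.

```python
def enable_integration(product_arn_template, regions, client_factory=None):
    """Enable a partner integration in each Region and report the outcome.

    `product_arn_template` contains the literal placeholder <REGION>.
    Returns {region: subscription ARN, or "error: ..." on failure}.
    """
    if client_factory is None:
        import boto3  # imported lazily so the sketch can be tested offline

        def client_factory(region):
            return boto3.client("securityhub", region_name=region)

    results = {}
    for region in regions:
        client = client_factory(region)
        product_arn = product_arn_template.replace("<REGION>", region)
        try:
            response = client.enable_import_findings_for_product(
                ProductArn=product_arn
            )
            results[region] = response["ProductSubscriptionArn"]
        except Exception as exc:  # for example, integration already enabled
            results[region] = f"error: {exc}"
    return results
```

Recording failures per Region instead of aborting on the first one is a design choice for this sketch; it lets a single run report which Regions still need attention.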

Figure 1: Diagram of the solution

Prerequisites

To follow along with this walkthrough, make sure that you have the following prerequisites in place:

Security Hub enabled across an organization in the Regions where you want to deploy the partner integration.
Trusted access with AWS Organizations enabled so that you can deploy CloudFormation StackSets across your organization. For instructions on how to do this, see Activate trusted access with AWS Organizations.
Permissions to deploy CloudFormation StackSets in a delegated administrator account for your organization.
AWS Command Line Interface (AWS CLI) installed.

Walkthrough

Next, we show you how to get started with enabling your partner integration across your organization using the following solution.

Step 1: Clone the repository

In a command line terminal, run the following command to clone the aws-securityhub-deploy-partner-integration GitHub repository:

git clone https://github.com/aws-samples/aws-securityhub-deploy-partner-integration

Step 2: Set up the integration parameters

Open the parameters.json file and configure the following values:
- ProductName — Name of the product that you want to enable.
- ProductArn — The unique Amazon Resource Name (ARN) of the Security Hub partner product. For example, the product ARN for Palo Alto PRISMA Cloud Enterprise is arn:aws:securityhub:<REGION>:188619942792:product/paloaltonetworks/redlock; and for Prowler, it’s arn:aws:securityhub:<REGION>::product/prowler/prowler. To find a product ARN, see Available third-party partner product integrations.
- DeploymentTargets — List of the IDs of the OUs of the AWS accounts that you want to configure. For example, use the unique identifier (ID) for the root to deploy across your entire organization.
- DeploymentRegions — List of the Regions in which you’ve enabled Security Hub, and for which the partner integration should be enabled.
Save the changes and close the file.
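Before deploying, it can help to sanity-check the four values. The sketch below assumes they have already been loaded into a flat Python dict; the actual layout of parameters.json in the repository may differ, and `check_parameters` is a hypothetical helper. The ARN pattern only covers the standard `aws` partition.

```python
import re

# Matches the ARN shapes shown above, with the literal <REGION> placeholder
# and an optional account ID.
ARN_PATTERN = re.compile(
    r"^arn:aws:securityhub:<REGION>:\d*:product/[^/]+/[^/]+$"
)


def check_parameters(params):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if not params.get("ProductName"):
        problems.append("ProductName is empty")
    if not ARN_PATTERN.match(params.get("ProductArn", "")):
        problems.append("ProductArn does not look like a product ARN")
    targets = params.get("DeploymentTargets") or []
    # AWS Organizations OU IDs start with "ou-" and root IDs with "r-".
    if not targets or not all(t.startswith(("ou-", "r-")) for t in targets):
        problems.append("DeploymentTargets must be OU or root IDs")
    if not params.get("DeploymentRegions"):
        problems.append("DeploymentRegions is empty")
    return problems


example = {
    "ProductName": "prowler",
    "ProductArn": "arn:aws:securityhub:<REGION>::product/prowler/prowler",
    "DeploymentTargets": ["r-examplerootid"],   # placeholder ID
    "DeploymentRegions": ["eu-west-1", "us-east-1"],
}
print(check_parameters(example))
```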

Step 3: Deploy the solution

Open a command line terminal of your preference.
Set up your AWS_REGION (for example, export AWS_REGION=eu-west-1) and make sure that your credentials are configured for the delegated administrator account.
Enter the following command to deploy:
```
./setup.sh deploy
```

Step 4: Verify Security Hub partner integration

To test that the product integration is enabled, run the following command in one of the accounts in the organization. Replace <TARGET-REGION> with one of the Regions where you enabled Security Hub.

aws securityhub list-enabled-products-for-import --region <TARGET-REGION>
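The same check can be scripted. The following is a minimal sketch that assumes a boto3 Security Hub client is passed in; `integration_enabled` and the `product_path` argument (the `company/product` suffix of the ARN) are names invented for this example.

```python
def integration_enabled(client, product_path):
    """Return True if a subscription for `product_path` exists in the
    client's Region, following pagination via NextToken."""
    token = None
    while True:
        kwargs = {"NextToken": token} if token else {}
        page = client.list_enabled_products_for_import(**kwargs)
        for arn in page.get("ProductSubscriptions", []):
            if arn.endswith("product-subscription/" + product_path):
                return True
        token = page.get("NextToken")
        if not token:
            return False
```

With real credentials this might be called as `integration_enabled(boto3.client("securityhub", region_name="eu-west-1"), "prowler/prowler")`, once per Region you deployed to.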

Step 5: (Optional) Manage new partners, Regions, and OUs

To add or remove the partner integration in certain Regions or OUs, update the parameters.json file with your desired Regions and OU IDs and repeat Step 3 to redeploy changes to your Security Hub partner integration. You can also directly update the CloudFormation parameters for the securityhub-integration-<PARTNER-NAME> stack from the CloudFormation console.

To enable new partner integrations, create a new parameters.json file version with the partner’s product name and product ARN to deploy a new stack using the deployment script from Step 3. In the next step, we show you how to disable the partner integrations.

Step 6: Clean up

If needed, you can remove the partner integrations by destroying the stack deployed. To destroy the stack, use the command line terminal configured with the credentials for the AWS StackSets delegated administrator account and run the following command:

./setup.sh destroy

You can also delete the stack mentioned in Step 5 directly from the CloudFormation console: open the stack page, select the stack securityhub-integration-<PARTNER-NAME>, and then choose Delete.

Conclusion

In this post, you learned how to enable Security Hub partner integrations across your organization. Now you can configure the partner product of your choice to send, update, or receive Security Hub findings.

You can extend your security automation by using Security Hub automation rules, Amazon EventBridge events, and Lambda functions to start or enrich automated remediation of new ingested findings from partners. For an example of how to do this, see Automated Security Response on AWS.
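As a rough sketch of that pattern, the Lambda handler below picks out high-severity findings from one partner in an EventBridge "Security Hub Findings - Imported" event. The product name, the severity threshold, and the function names are assumptions made for illustration; a real handler would forward the selected findings to your escalation or remediation workflow instead of printing them.

```python
ESCALATE_LABELS = {"HIGH", "CRITICAL"}   # assumed escalation threshold


def select_findings(event, product_name):
    """Return a summary of the findings in `event` worth escalating."""
    findings = event.get("detail", {}).get("findings", [])
    selected = []
    for finding in findings:
        if finding.get("ProductName") != product_name:
            continue
        label = finding.get("Severity", {}).get("Label")
        if label in ESCALATE_LABELS:
            selected.append(
                {
                    "Id": finding.get("Id"),
                    "Title": finding.get("Title"),
                    "Severity": label,
                }
            )
    return selected


def lambda_handler(event, context):
    # "Prowler" is a placeholder; use the partner product you enabled.
    selected = select_findings(event, product_name="Prowler")
    for item in selected:
        print(f"Escalating {item['Severity']} finding {item['Id']}: {item['Title']}")
    return {"escalated": len(selected)}
```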

Developer teams can opt in to configure their own chatbot in AWS Chatbot to receive notifications in Amazon Chime, Slack, or Microsoft Teams channels. Lastly, security teams can use existing bidirectional integrations with Jira Service Management or Jira Core to escalate severe findings to their developer teams.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Security, Identity, & Compliance re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Announcing General Availability for the Magic WAN Connector: the easiest way to jumpstart SASE transformation for your network

2023-10-03 Annika Garbers

Post Syndicated from Annika Garbers original http://blog.cloudflare.com/magic-wan-connector-general-availability/

Today, we’re announcing the general availability of the Magic WAN Connector, a key component of our SASE platform, Cloudflare One. Magic WAN Connector is the glue between your existing network hardware and Cloudflare’s network — it provides a super simplified software solution that comes pre-installed on Cloudflare-certified hardware, and is entirely managed from the Cloudflare One dashboard.

It takes only a few minutes from unboxing to seeing your network traffic automatically routed to the closest Cloudflare location, where it flows through a full stack of Zero Trust security controls before taking an accelerated path to its destination, whether that’s another location on your private network, a SaaS app, or any application on the open Internet.

Since we announced our beta earlier this year, organizations around the world have deployed the Magic WAN Connector to connect and secure their network locations. We’re excited for the general availability of the Magic WAN Connector to accelerate SASE transformation at scale.

When customers tell us about their journey to embrace SASE, one of the most common stories we hear is:

We started with our remote workforce, deploying modern solutions to secure access to internal apps and Internet resources. But now, we’re looking at the broader landscape of our enterprise network connectivity and security, and it’s daunting. We want to shift to a cloud and Internet-centric model for all of our infrastructure, but we’re struggling to figure out how to start.

The Magic WAN Connector was created to address this problem.

Zero-touch connectivity to your new corporate WAN

Cloudflare One enables organizations of any size to connect and secure all of their users, devices, applications, networks, and data with a unified platform delivered by our global connectivity cloud. Magic WAN is the network connectivity “glue” of Cloudflare One, allowing our customers to migrate away from legacy private circuits and use our network as an extension of their own.

Previously, customers have connected their locations to Magic WAN with Anycast GRE or IPsec tunnels configured on their edge network equipment (usually existing routers or firewalls), or plugged into us directly with CNI. But for the past few years, we’ve heard requests from hundreds of customers asking for a zero-touch approach to connecting their branches: We just want something we can plug in and turn on, and it handles the rest.

The Magic WAN Connector is exactly this. Customers receive Cloudflare-certified hardware with our software pre-installed on it, and everything is controlled via the Cloudflare dashboard. What was once a time-consuming, complex process now takes a matter of minutes, enabling robust Zero Trust protection for all of your traffic.

In addition to automatically configuring tunnels and routing policies to direct your network traffic to Cloudflare, the Magic WAN Connector will also handle traffic steering, shaping and failover to make sure your packets always take the best path available to the closest Cloudflare network location — which is likely only milliseconds away. You’ll also get enhanced visibility into all your traffic flows in analytics and logs, providing a unified observability experience across both your branches and the traffic through Cloudflare’s network.

Zero Trust security for all your traffic

Once the Magic WAN Connector is deployed at your network location, you have automatic access to enforce Zero Trust security policies across both public and private traffic.

A secure on-ramp to the Internet

An easy first step to improving your organization’s security posture after connecting network locations to Cloudflare is creating Secure Web Gateway policies to defend against ransomware, phishing, and other threats for faster, safer Internet browsing. By default, all Internet traffic from locations with the Magic WAN Connector will route through Cloudflare Gateway, providing a unified management plane for traffic from physical locations and remote employees.

A more secure private network

The Magic WAN Connector also enables routing private traffic between your network locations, with multiple layers of network and Zero Trust security controls in place. Unlike a traditional network architecture, which requires deploying and managing a stack of security hardware and backhauling branch traffic through a central location for filtering, a SASE architecture provides private traffic filtering and control built-in: enforced across a distributed network, but managed from a single dashboard interface or API.

A simpler approach for hybrid cloud

Cloudflare One enables connectivity for any physical or cloud network with easy on-ramps depending on location type. The Magic WAN Connector provides easy connectivity for branches, but also provides automatic connectivity to other networks including VPCs connected using cloud-native constructs (e.g., VPN Gateways) or direct cloud connectivity (via Cloud CNI). With a unified connectivity and control plane across physical and cloud infrastructure, IT and security teams can reduce overhead and cost of managing multi- and hybrid cloud networks.

Single-vendor SASE dramatically reduces cost and complexity

With the general availability of the Magic WAN Connector, we’ve put the final piece in place to deliver a unified SASE platform, developed and fully integrated from the ground up. Deploying and managing all the components of SASE with a single vendor, versus piecing together different solutions for networking and security, significantly simplifies deployment and management by reducing complexity and potential integration challenges. Many vendors that market a full SASE solution have actually stitched together separate products through acquisition, leading to an un-integrated experience similar to what you would see deploying and managing multiple separate vendors. In contrast, Cloudflare One (now with the Magic WAN Connector for simplified branch functions) enables organizations to achieve the true promise of SASE: a simplified, efficient, and highly secure network and security infrastructure that reduces your total cost of ownership and adapts to the evolving needs of the modern digital landscape.

Evolving beyond SD-WAN

Cloudflare One addresses many of the challenges that were left behind as organizations deployed SD-WAN to help simplify networking operations. SD-WAN provides orchestration capabilities to help manage devices and configuration in one place, as well as last mile traffic management to steer and shape traffic based on more sophisticated logic than is possible in traditional routers. But SD-WAN devices generally don't have embedded security controls, leaving teams to stitch together a patchwork of hardware, virtualized and cloud-based tools to keep their networks secure. They can make decisions about the best way to send traffic out from a customer’s branch, but they have no way to influence traffic hops between the last mile and the traffic's destination. And while some SD-WAN providers have surfaced virtualized versions of their appliances that can be deployed in cloud environments, they don't support native cloud connectivity and can complicate rather than ease the transition to cloud.

Cloudflare One represents the next evolution of enterprise networking, and has a fundamentally different architecture from either legacy networking or SD-WAN. It's based on a "light branch, heavy cloud" principle: deploy the minimum required hardware within physical locations (or virtual hardware within virtual networks, e.g., cloud VPCs) and use low-cost Internet connectivity to reach the nearest "service edge" location. At those locations, traffic can flow through security controls and be optimized on the way to its destination, whether that's another location within the customer's private network or an application on the public Internet. This architecture also enables remote user access to connected networks.

This shift — moving most of the "smarts" from the branch to a distributed global network edge, and leaving only the functions at the branch that absolutely require local presence, delivered by the Magic WAN Connector — solves our customers’ current problems and sets them up for easier management and a stronger security posture as the connectivity and attack landscape continues to evolve.

Aspect	Example	MPLS/VPN Service	SD-WAN	SASE with Cloudflare One
Configuration	New site setup, configuration and management	By MSP through service request	Simplified orchestration and management via centralized controller	Automated orchestration via SaaS portal with a single dashboard
Last mile traffic control	Traffic balancing, QoS, and failover	Covered by MPLS SLAs	Best Path selection available in SD-WAN appliance	Minimal on-prem deployment to control local decision making
Middle mile traffic control	Traffic steering around middle mile congestion	Covered by MPLS SLAs	“Tunnel Spaghetti” and still no control over the middle mile	Integrated traffic management & private backbone controls in a unified dashboard
Cloud integration	Connectivity for cloud migration	Centralized breakout	Decentralized breakout	Native connectivity with Cloud Network Interconnect
Security	Filter in & outbound Internet traffic for malware	Patchwork of hardware controls	Patchwork of hardware and/or software controls	Native integration with user, data, application & network security tools
Cost	Maximize ROI for network investments	High cost for hardware and connectivity	Optimized connectivity costs at the expense of increased hardware and software costs	Decreased hardware and connectivity costs for maximized ROI

Summary of legacy, SD-WAN based, and SASE architecture considerations

Love and want to keep your current SD-WAN vendor? No problem – you can still use any appliance that supports IPsec or GRE as an on-ramp for Cloudflare One.

Ready to simplify your SASE journey?

You can learn more about the Magic WAN Connector, including device specs, specific feature info, onboarding process details, and more at our dev docs, or contact us to get started today.

Birthday Week recap: everything we announced — plus an AI-powered opportunity for startups

2023-10-02 Dina Kozlov

Post Syndicated from Dina Kozlov original http://blog.cloudflare.com/birthday-week-2023-wrap-up/

This year, Cloudflare officially became a teenager, turning 13 years old. We celebrated this milestone with a series of announcements that benefit both our customers and the Internet community.

From developing applications in the age of AI to securing against the most advanced attacks that are yet to come, Cloudflare is proud to provide the tools that help our customers stay one step ahead.

We hope you’ve had a great time following along and for anyone looking for a recap of everything we launched this week, here it is:

Monday

What	In a sentence…
Switching to Cloudflare can cut emissions by up to 96%	Switching enterprise network services from on-prem to Cloudflare can cut related carbon emissions by up to 96%.
Cloudflare Trace	Use Cloudflare Trace to see which rules and settings are invoked when an HTTP request for your site goes through our network.
Cloudflare Fonts	Introducing Cloudflare Fonts. Enhance privacy and performance for websites using Google Fonts by loading fonts directly from the Cloudflare network.
How Cloudflare intelligently routes traffic	Technical deep dive that explains how Cloudflare uses machine learning to intelligently route traffic through our vast network.
Low Latency Live Streaming	Cloudflare Stream’s LL-HLS support is now in open beta. You can deliver video to your audience faster, reducing the latency a viewer may experience on their player to as little as 3 seconds.
Account permissions for all	Cloudflare account permissions are now available to all customers, not just Enterprise. In addition, we’ll show you how you can use them and best practices.
Incident Alerts	Customers can subscribe to Cloudflare Incident Alerts and choose when to get notified based on affected products and level of impact.

Tuesday

What	In a sentence…
Welcome to the connectivity cloud	Cloudflare is the world’s first connectivity cloud — the modern way to connect and protect your cloud, networks, applications and users.
Amazon’s $2bn IPv4 tax — and how you can avoid paying it	Amazon will begin taxing their customers $43 for IPv4 addresses, so Cloudflare will give those $43 back in the form of credits to bypass that tax.
Sippy	Minimize egress fees by using Sippy to incrementally migrate your data from AWS to R2.
Cloudflare Images	All Image Resizing features will be available under Cloudflare Images and we’re simplifying pricing to make it more predictable and reliable.
Traffic anomalies and notifications with Cloudflare Radar	Cloudflare Radar will be publishing anomalous traffic events for countries and Autonomous Systems (ASes).
Detecting Internet outages	Deep dive into how Cloudflare detects Internet outages, the challenges that come with it, and our approach to overcome these problems.

Wednesday

What	In a sentence…
The best place on Region: Earth for inference	Now available: Workers AI, a serverless GPU cloud for AI, Vectorize so you can build your own vector databases, and AI Gateway to help manage costs and observability of your AI applications. Cloudflare delivers the best infrastructure for next-gen AI applications, supported by partnerships with NVIDIA, Microsoft, Hugging Face, Databricks, and Meta.
Workers AI	Launching Workers AI — AI inference as a service platform, empowering developers to run AI models with just a few lines of code, all powered by our global network of GPUs.
Partnering with Hugging Face	Cloudflare is partnering with Hugging Face to make AI models more accessible and affordable to users.
Vectorize	Cloudflare’s vector database, designed to allow engineers to build full-stack, AI-powered applications entirely on Cloudflare's global network — available in Beta.
AI Gateway	AI Gateway helps developers have greater control and visibility in their AI apps, so that you can focus on building without worrying about observability, reliability, and scaling. AI Gateway handles the things that nearly all AI applications need, saving you engineering time so you can focus on what you're building.
You can now use WebGPU in Cloudflare Workers	Developers can now use WebGPU in Cloudflare Workers. Learn more about why WebGPUs are important, why we’re offering them to customers, and what’s next.
What AI companies are building with Cloudflare	Many AI companies are using Cloudflare to build next generation applications. Learn more about what they’re building and how Cloudflare is helping them on their journey.
Writing poems using LLama 2 on Workers AI	Want to write a poem using AI? Learn how to run your own AI chatbot in 14 lines of code, running on Cloudflare’s global network.

Thursday

What	In a sentence…
Hyperdrive	Cloudflare launches a new product, Hyperdrive, that makes existing regional databases much faster by dramatically speeding up queries that are made from Cloudflare Workers.
D1 Open Beta	D1 is now in open beta, and the theme is “scale”: with higher per-database storage limits and the ability to create more databases, we’re unlocking the ability for developers to build production-scale applications on D1.
Pages Build Caching	Build cache is a feature designed to reduce your build times by caching and reusing previously computed project components — now available in Beta.
Running serverless Puppeteer with Workers and Durable Objects	Introducing the Browser Rendering API, which enables developers to utilize the Puppeteer browser automation library within Workers, eliminating the need for serverless browser automation system setup and maintenance.
Cloudflare partners with Microsoft to power their Edge Secure Network	We partnered with Microsoft Edge to provide a fast and secure VPN, right in the browser. Users don’t have to install anything new or understand complex concepts to get the latest in network-level privacy: Edge Secure Network VPN is available on the latest consumer version of Microsoft Edge in most markets, and automatically comes with 5GB of data.
Re-introducing the Cloudflare Workers playground	We are revamping the playground that demonstrates the power of Workers, along with new development tooling, and the ability to share your playground code and deploy instantly to Cloudflare’s global network.
Cloudflare integrations marketplace expands	Introducing the newest additions to Cloudflare’s Integration Marketplace. Now available: Sentry, Momento and Turso.
A Socket API that works across JavaScript runtimes — announcing WinterCG spec and polyfill for connect()	Engineers from Cloudflare and Vercel have published a draft specification of the connect() sockets API for review by the community, along with a Node.js compatible polyfill for the connect() API that developers can start using.
New Workers pricing	Announcing new pricing for Cloudflare Workers, where you are billed based on CPU time, and never for the idle time that your Worker spends waiting on network requests and other I/O.

Friday

What	In a sentence…
Post Quantum Cryptography goes GA	Cloudflare is rolling out post-quantum cryptography support to customers, services, and internal systems to proactively protect against advanced attacks.
Encrypted Client Hello	Announcing a contribution that helps improve privacy for everyone on the Internet. Encrypted Client Hello, a new standard that prevents networks from snooping on which websites a user is visiting, is now available on all Cloudflare plans.
Email Retro Scan	Cloudflare customers can now scan messages within their Office 365 Inboxes for threats. The Retro Scan will let you look back seven days to see what threats your current email security tool has missed.
Turnstile is Generally Available	Turnstile, Cloudflare’s CAPTCHA replacement, is now generally available and available for free to everyone and includes unlimited use.
AI crawler bots	Any Cloudflare user, on any plan, can choose specific categories of bots that they want to allow or block, including AI crawlers. We are also recommending a new standard to robots.txt that will make it easier for websites to clearly direct how AI bots can and can’t crawl.
Detecting zero-days before zero-day	Deep dive into Cloudflare’s approach and ongoing research into detecting novel web attack vectors in our WAF before they are seen by a security researcher.
Privacy Preserving Metrics	Deep dive into the fundamental concepts behind the Distributed Aggregation Protocol (DAP) with examples on how we’ve implemented it into Daphne, our open source aggregator server.
Post-quantum cryptography to origin	We are rolling out post-quantum cryptography support for outbound connections to origins and Cloudflare Workers fetch() calls. Learn more about what we enabled, how we rolled it out in a safe manner, and how you can add support to your origin server today.
Network performance update	Cloudflare’s updated benchmark results regarding network performance plus a dive into the tools and processes that we use to monitor and improve our network performance.

One More Thing

When Cloudflare turned 12 last year, we announced the Workers Launchpad Funding Program – you can think of it like a startup accelerator program for companies building on Cloudflare’s Developer Platform, with no restrictions on your size, stage, or geography.

A refresher on how the Launchpad works: Each quarter, we admit a group of startups who then get access to a wide range of technical advice, mentorship, and fundraising opportunities. That includes our Founders Bootcamp, Open Office Hours with our Solution Architects, and Demo Day. Those who are ready to fundraise will also be connected to our community of 40+ leading global Venture Capital firms.

In exchange, we just ask for your honest feedback. We want to know what works, what doesn’t and what you need us to build for you. We don’t ask for a stake in your company, and we don’t ask you to pay to be a part of the program.

Targum (my startup) was one of the first AI companies (w/ @jamdotdev ) in the Cloudflare workers launchpad!

In return to tons of stuff we got from CF 🙏 they asked for feedback, and my main one was, let me do everything end to end on CF, I don't want to rent GPU servers… https://t.co/0j2ZymXpsL

— Alex Volkov (@altryne) September 27, 2023

Over the past year, we’ve received applications from nearly 60 different countries. We’ve had a chance to work closely with 50 amazing early and growth-stage startups admitted into the first two cohorts, and have grown our VC partner community to 40+ firms and more than $2 billion in potential investments in startups building on Cloudflare.

Next up: Cohort #3! Between recently wrapping up Cohort #2 (check out their Demo Day!), celebrating the Launchpad’s 1st birthday, and the heaps of announcements we made last week, we thought that everyone could use a little extra time to catch up on all the news – which is why we are extending the deadline for Cohort #3 a few weeks to October 13, 2023. AND we’re reserving 5 spots in the class for those who are already using any of last Wednesday’s AI announcements. Just be sure to mention what you’re using in your application.

So once you’ve had a chance to check out the announcements and pour yourself a cup of coffee, check out the Workers Launchpad. Applying is a breeze — you’ll be done long before your coffee gets cold.

Until next time

That’s all for Birthday Week 2023. We hope you enjoyed the ride, and we’ll see you at our next innovation week!

i hate @Cloudflare launch week

most launch weeks are underwhelming

cloudflare always makes me rethink everything i’m doing

— Dax (@thdxr) September 29, 2023

Detecting zero-days before zero-day

2023-09-29 Michael Tremante

Post Syndicated from Michael Tremante original http://blog.cloudflare.com/detecting-zero-days-before-zero-day/

We are constantly researching ways to improve our products. For the Web Application Firewall (WAF), the goal is simple: keep customer web applications safe by building the best solution available on the market.

In this blog post we talk about our approach and ongoing research into detecting novel web attack vectors in our WAF before they are seen by a security researcher. If you are interested in learning about our secret sauce, read on.

This post is the written form of a presentation first delivered at Black Hat USA 2023.

The value of a WAF

Many companies offer web application firewalls and application security products with a total addressable market forecasted to increase for the foreseeable future.

In this space, vendors, including ourselves, often like to boast about the importance of their solution by presenting ever-growing statistics around threats to web applications. Bigger numbers and scarier stats are great ways to justify expensive investments in web security. Taking a few examples from our very own application security report research (see our latest report here):

The numbers above all translate to real value: yes, a large portion of Internet HTTP traffic is malicious, therefore you could mitigate a non-negligible amount of traffic reaching your applications if you deployed a WAF. It is also true that we are seeing a drastic increase in global API traffic, therefore, you should look into the security of your APIs as you are likely serving API traffic you are not aware of. You need a WAF with API protection capabilities. And so on.

There is, however, one statistic often presented that hides a concept more directly tied to the value of a web application firewall:

This brings us to zero-days. The definition of a zero-day may vary depending on who you ask, but is generally understood to be an exploit that is not yet, or has very recently become, widely known with no patch available. High impact zero-days will get assigned a CVE number. These happen relatively frequently and the value can be implied by how often we see exploit attempts in the wild. Yes, you need a WAF to make sure you are protected from zero-day exploits.

But herein hides the real value: how quickly can a WAF mitigate a new zero-day/CVE?

By definition a zero-day is not well known, and a single malicious payload could be the one that compromises your application. From a purist standpoint, if your WAF is not fast at detecting new attack vectors, it is not providing sufficient value.

The faster the mitigation, the better. We refer to this as “time to mitigate”. Any WAF evaluation should focus on this metric.

How fast is fast enough?

24 hours? 6 hours? 30 minutes? Luckily we run one of the world's largest networks, and we can look at some real examples to understand how fast a WAF really needs to be to protect most environments. I specifically mention “most” here as not everyone is the target of a highly sophisticated attack, and therefore, most companies should seek to be protected at least by the time a zero-day is widely known. Anything better is a plus.

Our first example is Log4Shell (CVE-2021-44228), a high-impact, wide-reaching vulnerability that affected Log4j, a popular logging library maintained by the Apache Software Foundation. The vulnerability was disclosed back in December 2021. If you are a security practitioner, you have certainly heard of this exploit.

The proof of concept of this attack was published on GitHub on December 9, 2021, at 15:27 UTC. A tweet followed shortly after. We started observing a substantial amount of attack payloads matching the signatures from about December 10 at 10:00 UTC. That is about 19 hours after the PoC was published.

We blogged extensively about this event if you wish to read further.

Our second example is a little more recent: Atlassian Confluence CVE-2022-26134 from June 2, 2022. In this instance Atlassian published a security advisory pertaining to the vulnerability at 20:00 UTC. We were very fast at deploying mitigations and had rules globally deployed protecting customers at 23:38 UTC, before the four-hour mark.

Although potentially matching payloads were observed before the rules were deployed, these were not confirmed. Exact matches were only observed on 2022-06-03 at 10:30 UTC, over 10 hours after rule deployment. Even in this instance, we provided our observations on our blog.
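The timelines in these two examples can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the timestamps quoted above; the variable and function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

UTC = timezone.utc


def elapsed_hours(start: datetime, end: datetime) -> float:
    """Return the time between two events in hours."""
    return (end - start) / timedelta(hours=1)


# Log4Shell: PoC published, then substantial matching traffic observed.
log4shell_poc = datetime(2021, 12, 9, 15, 27, tzinfo=UTC)
log4shell_attacks = datetime(2021, 12, 10, 10, 0, tzinfo=UTC)

# Confluence CVE-2022-26134: advisory, global rule deployment, first exact matches.
confluence_advisory = datetime(2022, 6, 2, 20, 0, tzinfo=UTC)
confluence_rules = datetime(2022, 6, 2, 23, 38, tzinfo=UTC)
confluence_exact_matches = datetime(2022, 6, 3, 10, 30, tzinfo=UTC)

poc_to_attacks = elapsed_hours(log4shell_poc, log4shell_attacks)
advisory_to_rules = elapsed_hours(confluence_advisory, confluence_rules)
rules_to_matches = elapsed_hours(confluence_rules, confluence_exact_matches)
```

In both cases the window between public disclosure and confirmed exploit traffic is measured in hours, which is the gap that "time to mitigate" has to fit inside.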

The list of examples could go on, but the data tells the same story: for most, as long as you have mitigations in place within a few hours, you are likely to be fine.

That, however, is a dangerous statement to make. Cloudflare protects applications that have some of the most stringent security requirements due to the data they hold and the importance of the service they provide. They could be the one application that is first targeted with the zero-day well before it is widely known. Also, we are a WAF vendor and I would not be writing this post if I thought “a few hours” was fast enough.

Zero (time) is the only acceptable time to mitigate!

Signatures are not enough, but are here to stay

All WAFs on the market today will have a signature-based component. Signatures are great as they can be built to minimize false positives (FPs), their behavior is predictable, and they can be improved over time.

We build and maintain our own signatures provided in the WAF as the Cloudflare Managed Ruleset. This is a set of over 320 signatures (at time of writing) that have been fine-tuned and optimized over the 13 years of Cloudflare’s existence.

Signatures tend to be written in ModSecurity, regex-like syntax or other proprietary language. At Cloudflare, we use wirefilter, a language understood by our global proxy. To use the same example as above, here is what one of our Log4Shell signatures looks like:

Our network, which runs our WAF, also gives us an additional superpower: the ability to test new signatures (or updates to existing ones) on over 64M HTTP/S requests per second at peak. We can tell pretty quickly if a signature is well written or not.

But one of their qualities (low false positive rates), along with the fact that humans have to write them, is the reason we cannot rely solely on signatures to reach zero time to mitigate. Ultimately a signature is limited by the speed at which we can write it, and combined with our goal to keep FPs low, signatures only match things we know and are 100% sure about. Our WAF security analyst team is, after all, limited by human speed while balancing the effectiveness of the rules.

The good news: signatures are a vital component to reach zero time to mitigate, and will always be needed, so the investment remains vital.

Getting to zero time to mitigation

To reach zero time to mitigate we need to rely on some machine learning algorithms. It turns out that WAFs are a great application for this type of technology especially combined with existing signature based systems. In this post I won’t describe the algorithms themselves (subject for another post) but will provide the high level concepts of the system and the steps of how we built it.

Step 1: create the training set

It is a well known fact in data science that the quality of any classification system, including the latest generative AI systems, is highly dependent on the quality of the training set. The old saying “garbage in, garbage out” resonates well.

And this is where our signatures come into play. As these were always written with a low false positive rate in mind, combined with our horizontal WAF deployment on our network, we essentially have access to millions of true positive examples per second to create what is likely one of the best WAF training sets available today.

We also, due to customer configurations and other tools such as Bot Management, have a pretty clear idea of what true negatives look like. In summary, we have a constant flow of training data. Additionally due to our self-service plans and the globally distributed nature of Cloudflare’s service and customer base, our data tends to be very diverse, removing a number of biases that may otherwise be present.

It is important to note at this point that we put a lot of effort into ensuring that we anonymised data, removed PII, and correctly implemented the data boundary settings provided by our data localization suite. We’ve previously published blog posts describing some of our strategies for data collection in the context of WAF use cases.

Step 2: enhance the training set

Simply relying on real traffic data is good, but with a few artificial enhancements the training set can become a lot better, leading to much higher detection efficacy.

In a nutshell we went through the process of generating artificial (but realistic) data to increase the diversity of our data even further by studying statistical distribution of existing real-world data. For example, mutating benign content with random character noise, language specific keywords, generating new benign content and so on.

Some of the methods adopted to improve accuracy are discussed in detail in a prior blog post if you wish to read further.

Step 3: build a very fast classifier

One restriction that often applies to machine learning based classifiers running on inline traffic, like the Cloudflare proxy, is latency performance. To be useful, we need to be able to compute classification “inline” without affecting the user experience for legitimate end users. We don’t want security to be associated with “slowness”.

This required us to fine tune not only the feature set used by the classification system, but also the underlying tooling, so it was both fast and lightweight. The classifier is built using TensorFlow Lite.

At time of writing, our classification model is able to provide a classification output under 1ms at 50th percentile. We believe we can reach 1ms at 90th percentile with ongoing efforts.

Step 4: deploy on the network

Once the classifier is ready, there is still a large amount of additional work needed to deploy on live production HTTP traffic, especially at our scale. Quite a few additional steps need to be implemented starting from a fully formed live HTTP request and ending with a classification output.

The diagram below is a good summary of each step. First and foremost, starting from the raw HTTP request, we normalize it, so it can easily be parsed and processed, without unintended consequences, by the following steps in the pipeline. Second, we extract the features that experimentation and research showed to be most beneficial for our use case. To date we extract over 6k features. We then run inference on the resulting features (the actual classification) and generate outputs for the various attack types we have trained the model for. To date we classify cross site scripting payloads (XSS), SQL injection payloads (SQLi) and remote code execution payloads (RCE). The final step is to consolidate the output in a single WAF Attack Score.
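The four stages can be illustrated with a toy pipeline. This is only a sketch of the shape of the flow, not Cloudflare's implementation: the token lists, the scoring formula, and the choice to consolidate by taking the lowest per-class score are all assumptions made for illustration, and a keyword counter stands in for the trained model.

```python
from urllib.parse import unquote_plus

# Hypothetical token lists standing in for learned features (assumption).
ATTACK_TOKENS = {
    "xss": ("<script", "onerror=", "javascript:"),
    "sqli": ("union select", "' or ", "--"),
    "rce": ("${jndi:", "/bin/sh", "; cat "),
}


def normalize(raw: str) -> str:
    """Stage 1: decode percent-encoding and lowercase into one canonical form."""
    return unquote_plus(raw).lower()


def extract_features(text: str) -> dict[str, int]:
    """Stage 2: count token hits per attack class (toy feature set)."""
    return {
        attack: sum(text.count(token) for token in tokens)
        for attack, tokens in ATTACK_TOKENS.items()
    }


def infer(features: dict[str, int]) -> dict[str, int]:
    """Stage 3: stand-in for model inference; lower score = more likely malicious."""
    return {attack: max(1, 99 - 49 * hits) for attack, hits in features.items()}


def consolidate(scores: dict[str, int]) -> int:
    """Stage 4: collapse per-class scores into one value (assumed: the minimum)."""
    return min(scores.values())


def waf_attack_score(raw_request: str) -> int:
    """Run the four stages in order on a raw request string."""
    return consolidate(infer(extract_features(normalize(raw_request))))
```

Normalizing first matters even in this toy: a percent-encoded payload such as `%3Cscript%3E` only produces a feature hit after decoding.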

Step 5: expose output as a simple interface

To make the system usable we decided the output should be in the same format as our Bot Management system output: a single score that ranges from 1 to 99. Lower scores indicate a higher probability that the request is malicious; higher scores indicate the request is clean.

There are two main benefits of representing the output within a fixed range. First, using the output to BLOCK traffic becomes very easy. It is sufficient to deploy a WAF rule that blocks all HTTP requests with a score lower than $x, for example a rule that blocks all traffic with a score lower than 10 would look like this:

cf.waf.score < 10 then BLOCK

Secondly, deciding what the threshold should be can be done easily by representing the score distributions on your live traffic in colored “buckets”, and then allowing you to zoom in where relevant to validate the correct classification. For example, the graph below shows an attack that we observed against blog.cloudflare.com when we initially started testing the system. This graph is available to all business and enterprise users.
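Both benefits are easy to see in a few lines of code. The sketch below mirrors the blocking rule above and groups observed scores into ranges for review; the bucket labels and boundaries are illustrative assumptions, not the ones used in the dashboard.

```python
from collections import Counter

# Illustrative bucket boundaries (assumption), inclusive on both ends.
BUCKETS = (
    ("likely attack", 1, 20),
    ("uncertain", 21, 50),
    ("likely clean", 51, 99),
)


def action_for(score: int, threshold: int = 10) -> str:
    """Mirror of the rule `cf.waf.score < threshold then BLOCK`."""
    if not 1 <= score <= 99:
        raise ValueError("score must be between 1 and 99")
    return "BLOCK" if score < threshold else "ALLOW"


def bucket_counts(scores: list[int]) -> dict[str, int]:
    """Count how many observed scores fall into each bucket."""
    counts: Counter[str] = Counter()
    for score in scores:
        for label, low, high in BUCKETS:
            if low <= score <= high:
                counts[label] += 1
                break
    return dict(counts)
```

Looking at the bucket counts for real traffic before enabling a blocking rule is a simple way to sanity-check a candidate threshold.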

All that remains is to actually use the score!

Success in the wild

The classifier has been deployed for just over a year on Cloudflare’s network. The main question stated at the start of this post remains: does it work? Have we been able to detect attacks before we’ve seen them? Have we achieved zero time to mitigate?

To answer this we track classification output for new CVEs that fail to be detected by existing Cloudflare Managed Rules. Of course our rule improvement work is always ongoing, but this gives us an idea on how well the system is performing.

And the answer: YES. For all CVEs or bypasses that rely on syntax similar to existing vulnerabilities, the classifier performs very well, and we have observed several instances of it blocking valid malicious payloads that were not detected by our signatures. All of this while keeping false positives very low at a threshold of 15 or below. XSS variations and SQLi CVEs are, in most cases, a fully solved problem if the classifier is deployed.

One recent example is a set of Sitecore vulnerabilities that were disclosed in June 2023 listed below:

CVE	Date	Score	Signature match	Classification match (score less than 10)
CVE-2023-35813	06/17/2023	9.8 CRITICAL	Not at time of announcement	Yes
CVE-2023-33653	06/06/2023	8.8 HIGH	Not at time of announcement	Yes
CVE-2023-33652	06/06/2023	8.8 HIGH	Not at time of announcement	Yes
CVE-2023-33651	06/06/2023	7.5 HIGH	Not at time of announcement	Yes

The CVEs listed above were not detected by Cloudflare Managed Rules, but were correctly detected and classified by our model. Customers that had the score deployed in a rule in June 2023 would have been protected in zero time.

This does not mean there isn’t space for further improvement.

The classification works very well for attack types that are aligned with, or somewhat similar to, existing attack types. If the payload implements a brand-new, never-before-seen syntax, then we still have some work to do. Log4Shell is actually a very good example of this. If another zero-day vulnerability was discovered that leveraged the JNDI Java syntax, we are confident that our customers who have deployed WAF rules using the WAF Attack Score would be safe against it.

We are already working on adding more detection capabilities including web shell detection and open redirects/path traversal.

The perfect feedback loop

I mentioned earlier that our security analyst driven improvements to our Cloudflare Managed Rulesets are not going to stop. Our public changelog is full of activity and there is no sign of slowing down.

There is a good reason for this: the signature based system will remain, and likely eventually be converted to our training set generation tool. But not only that, it also provides an opportunity to speed up improvements by focusing on reviewing malicious traffic that is classified by our machine learning system but not detected by our signatures. The delta between the two systems is now one of the main focuses of attention for our security analyst team. The diagram below visualizes this concept.

It is this delta that is helping our team to further fine tune and optimize the signatures themselves. Both to match malicious traffic that is bypassing the signatures, and to reduce false positives. You can now probably see where this is going as we are starting to build the perfect feedback loop.
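The delta can be pictured as two review queues. This is a minimal sketch under assumptions: the `Observation` record, its field names, and the threshold of 10 are hypothetical, and treating "signature matched but the model scored it clean" as a false-positive candidate is one plausible reading of the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One classified request (hypothetical record shape)."""
    request_id: str
    signature_match: bool
    attack_score: int  # 1-99, lower = more likely malicious


def review_queues(
    observations: list[Observation], threshold: int = 10
) -> tuple[list[str], list[str]]:
    """Split the disagreement between the two systems into analyst queues."""
    # Flagged by the model but missed by signatures: candidates for new rules.
    missed_by_signatures = [
        o.request_id
        for o in observations
        if o.attack_score < threshold and not o.signature_match
    ]
    # Matched by a signature but scored clean: candidates for FP review.
    possible_false_positives = [
        o.request_id
        for o in observations
        if o.signature_match and o.attack_score >= threshold
    ]
    return missed_by_signatures, possible_false_positives
```

Requests on which both systems agree drop out of both queues, so analyst time goes only to the disagreements.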

Better signatures provide a better training set (data). In turn, we can create a better model. The model will provide us with a more interesting delta, which, once reviewed by humans, will allow us to create better signatures. And start over.

We are now working to automate this entire process with the goal of having humans simply review and click to deploy. This is the leading edge for WAF zero-day mitigation in the industry.

Summary

One of the main value propositions of any web application security product is the ability to detect novel attack vectors before they can cause an issue, allowing internal teams time to patch and remediate the underlying codebase. We call this time to mitigate. The ideal value is zero.

We’ve put a lot of effort and research into a machine learning system that augments our existing signature based system to yield very good classification results of new attack vectors the first time they are seen. The system outputs a score that we call the WAF Attack Score. We have validated that for many CVEs, we are indeed able to correctly classify malicious payloads on the first attempt and provide Sitecore CVEs as an example.

Moving forward, we are now automating a feedback loop that will allow us to improve our signatures faster, and then iterate on the model to provide even better detection.

The system is live and available to all our customers in the business or enterprise plan. Log in to the Cloudflare dashboard today to receive instant zero-day mitigation.

Cloudflare is free of CAPTCHAs; Turnstile is free for everyone

2023-09-29 Benedikt Wolters

Post Syndicated from Benedikt Wolters original http://blog.cloudflare.com/turnstile-ga/

For years, we’ve written that CAPTCHAs drive us crazy. Humans give up on CAPTCHA puzzles approximately 15% of the time and, maddeningly, CAPTCHAs are significantly easier for bots to solve than they are for humans. We’ve spent the past three and a half years working to build a better experience for humans that’s just as effective at stopping bots. As of this month, we’ve finished replacing every CAPTCHA issued by Cloudflare with Turnstile, our new CAPTCHA replacement (pictured below). Cloudflare will never issue another visual puzzle to anyone, for any reason.

Now that we’ve eliminated CAPTCHAs at Cloudflare, we want to make it easy for anyone to do the same, even if they don’t use other Cloudflare services. We’ve decoupled Turnstile from our platform so that any website operator on any platform can use it just by adding a few lines of code. We’re thrilled to announce that Turnstile is now generally available, and Turnstile’s ‘Managed’ mode is now completely free to everyone for unlimited use.

Easy on humans, hard on bots, private for everyone

There’s a lot that goes into Turnstile’s simple checkbox to ensure that it’s easy for everyone, preserves user privacy, and does its job stopping bots. Part of making challenges better for everyone means that everyone gets the same great experience, no matter what browser you’re using. Because we do not employ a visual puzzle, users with low vision or blindness get the same easy to use challenge flow as everyone else. It was particularly important for us to avoid falling back to audio CAPTCHAs to offer an experience accessible to everyone. Audio CAPTCHAs are often much worse than even visual CAPTCHAs for humans to solve, with only 31.2% of audio challenges resulting in a three-person agreement on what the correct solution actually is. The prevalence of free speech-to-text services has made it easy for bots to solve audio CAPTCHAs as well, with a recent study showing bots can accurately solve audio CAPTCHAs in over 85% of attempts.

We also created Turnstile to be privacy focused. Turnstile meets ePrivacy Directive, GDPR and CCPA compliance requirements, as well as the strict requirements of our own privacy commitments. In addition, Cloudflare's FedRAMP Moderate authorized package, "Cloudflare for Government" now includes Turnstile. We don’t rely on tracking user data, like what other websites someone has visited, to determine if a user is a human or robot. Our business is protecting websites, not selling ads, so operators can deploy Turnstile knowing that their users’ data is safe.

With all of our emphasis on how easy it is to pass a Turnstile challenge, you would be right to ask how it can stop a bot. If a bot can find all images with crosswalks in grainy photos faster than we can, surely it can check a box as well. Bots definitely can check a box, and they can even mimic the erratic path of human mouse movement while doing so. For Turnstile, the actual act of checking a box isn’t important, it’s the background data we’re analyzing while the box is checked that matters. We find and stop bots by running a series of in-browser tests, checking browser characteristics, native browser APIs, and asking the browser to pass lightweight tests (ex: proof-of-work tests, proof-of-space tests) to prove that it’s an actual browser. The current deployment of Turnstile checks billions of visitors every day, and we are able to identify browser abnormalities that bots exhibit while attempting to pass those tests.

For over one year, we used our Managed Challenge to rotate between CAPTCHAs and our own Turnstile challenge to compare our effectiveness. We found that even without asking users for any interactivity at all, Turnstile was just as effective as a CAPTCHA. Once we were sure that the results were effective at coping with the response from bot makers, we replaced the CAPTCHA challenge with our own checkbox solution. We present this extra test when we see potentially suspicious signals, and it helps us provide an even greater layer of security.

Turnstile is great for fighting fraud

Like all sites that offer services for free, Cloudflare sees our fair share of automated account signups, which can include “new account fraud,” where bad actors automate the creation of many different accounts to abuse our platform. To help combat this abuse, we’ve rolled out Turnstile’s invisible mode to protect our own signup page. This month, we’ve blocked over 1 million automated signup attempts using Turnstile, without a reported false positive or any change in our self-service billings that rely on this signup flow.

Lessons from the Turnstile beta

Over the past twelve months, we’ve been grateful to see how many people are eager to try, then rely on, and integrate Turnstile into their web applications. It’s been rewarding to see the developer community embrace Turnstile as well. We list some of the community created Turnstile integrations here, including integrations with WordPress, Angular, Vue, and a Cloudflare recommended React library. We’ve listened to customer feedback, and added support for 17 new languages, new callbacks, and new error codes.

76,000+ users have signed up, but our biggest single test by far was the Eurovision final vote. Turnstile runs on challenge pages on over 25 million Cloudflare websites. Usually, that makes Cloudflare the far and away biggest Turnstile consumer, until the final Eurovision vote. During that one hour, challenge traffic from the Eurovision voting site outpaced the use of challenge pages on those 25 million sites combined! Turnstile handled the enormous spike in traffic without a hitch.

While a lot went well during the Turnstile beta, we also encountered some opportunities for us to learn. We were initially resistant to disclosing why a Turnstile challenge failed. After all, if bad actors know what we’re looking for, it becomes easier for bots to fool our challenges until we introduce new detections. However, during the Turnstile beta, we saw a few scenarios where legitimate users could not pass a challenge. These scenarios made it clear to us that we need to be transparent about why a challenge failed, to help any individual who might have modified their browser in a way that causes them to get caught by Turnstile. We now publish detailed client-side error codes to surface the reason why a challenge has failed. Two scenarios came up on several occasions that we didn’t expect:

First, we saw that desktop computers at least 10 years old frequently had expired motherboard batteries, and computers with bad motherboard batteries very often keep inaccurate time. This is because without the motherboard battery, a desktop computer’s clock will stop operating when the computer is off. Turnstile checks your computer’s system time to detect when a website operator has accidentally configured a challenge page to be cached, as caching a challenge page will cause it to become impassable. Unfortunately, this same check was unintentionally catching humans who just needed to update the time. When we see this issue, we now surface a clear error message to the end user to update their system time. We’d prefer to never have to surface an error in the first place, so we’re working to develop new ways to check for cached content that won’t impact real people.

Second, we find that a few privacy-focused users often ask their browsers to go beyond standard practices to preserve their anonymity. This includes changing their user-agent (something bots will do to evade detection as well), and preventing third-party scripts from executing entirely. Issues caused by this behavior can now be displayed clearly in a Turnstile widget, so those users can immediately understand the issue and make a conscientious choice about whether they want to allow their browser to pass a challenge.

Although we have some of the most sensitive, thoroughly built monitoring systems at Cloudflare, we did not catch either of these issues on our own. We needed to talk to users affected by the issue to help us understand what the problem was. Going forward, we want to make sure we always have that direct line of communication open. We’re rolling out a new feedback form in the Turnstile widget, to ensure any future corner cases are addressed quickly and with urgency.

Turnstile: GA and Free for Everyone

Announcing Turnstile’s General Availability means that Turnstile is now completely production ready, available for free for unlimited use via our visible widget in Managed mode. Turnstile Enterprise includes SaaS platform support and a visible mode without the Cloudflare logo. Self-serve customers can expect a pay-as-you-go option for advanced features to be available in early 2024. Users can continue to access Turnstile’s advanced features below our 1 million siteverify request limit, as has been the case during the beta. If you’ve been waiting to try Turnstile, head over to our signup page and create an account!

Get the full benefits of IMDSv2 and disable IMDSv1 across your AWS infrastructure

2023-09-28 Saju Sivaji

Post Syndicated from Saju Sivaji original https://aws.amazon.com/blogs/security/get-the-full-benefits-of-imdsv2-and-disable-imdsv1-across-your-aws-infrastructure/

The Amazon Elastic Compute Cloud (Amazon EC2) Instance Metadata Service (IMDS) helps customers build secure and scalable applications. IMDS solves a security challenge for cloud users by providing access to temporary and frequently-rotated credentials, and by removing the need to hardcode or distribute sensitive credentials to instances manually or programmatically. The Instance Metadata Service Version 2 (IMDSv2) adds protections; specifically, IMDSv2 uses session-oriented authentication with the following enhancements:

IMDSv2 requires the creation of a secret token in a simple HTTP PUT request to start the session, which must be used to retrieve information in IMDSv2 calls.
The IMDSv2 session token must be used as a header in subsequent IMDSv2 requests to retrieve information from IMDS. Unlike a static token or fixed header, a session and its token are destroyed when the process using the token terminates. IMDSv2 sessions can last up to six hours.
A session token can only be used directly from the EC2 instance where that session began.
You can reuse a token or create a new token with every request.
Session token PUT requests are blocked if they contain an X-Forwarded-For header.
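The session flow described in the list above can be sketched as two HTTP requests. This is a minimal example, not an official client: the link-local address 169.254.169.254 and the two header names come from the EC2 documentation rather than from this post, the function names are illustrative, and `fetch_metadata` only works when run on an EC2 instance.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"
MAX_TTL_SECONDS = 21600  # six hours, the longest session IMDSv2 allows


def build_token_request(ttl_seconds: int = MAX_TTL_SECONDS) -> urllib.request.Request:
    """PUT request that starts an IMDSv2 session and returns a token."""
    if not 1 <= ttl_seconds <= MAX_TTL_SECONDS:
        raise ValueError("TTL must be between 1 and 21600 seconds")
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(token: str, path: str = "meta-data/instance-id") -> urllib.request.Request:
    """GET request that presents the session token as a header."""
    return urllib.request.Request(
        f"{IMDS_BASE}/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_metadata(path: str = "meta-data/instance-id", timeout: float = 2.0) -> str:
    """Create a session, then read one metadata value with it."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    with urllib.request.urlopen(build_metadata_request(token, path), timeout=timeout) as resp:
        return resp.read().decode()
```

An IMDSv1 call is the same GET without the token header, which is why software that skips the PUT step stops working once IMDSv1 is disabled.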

In a previous blog post, we explained how these new protections add defense-in-depth for third-party and external application vulnerabilities that could be used to try to access the IMDS.

You won’t be able to get the full benefits of IMDSv2 until you disable IMDSv1. While IMDS is provided by the instance itself, the calls to IMDS are from your software. This means your software must support IMDSv2 before you can disable IMDSv1. In addition to AWS SDKs, CLIs, and tools like the SSM agents supporting IMDSv2, you can also use the IMDS Packet Analyzer to pinpoint exactly what you need to update to get your instances ready to use only IMDSv2. These tools make it simpler to transition to IMDSv2 as well as launch new infrastructure with IMDSv1 disabled. All instances launched with AL2023 set the instance to provide only IMDSv2 (IMDSv1 is disabled) by default, with AL2023 also not making IMDSv1 calls.

AWS customers who want to get the benefits of IMDSv2 have told us they want to use IMDSv2 across both new and existing, long-running AWS infrastructure. This blog post shows you scalable solutions to identify existing infrastructure that is providing IMDSv1, how to transition to IMDSv2 on your infrastructure, and how to completely disable IMDSv1. After reviewing this blog, you will be able to set new Amazon EC2 launches to IMDSv2. You will also learn how to identify existing software making IMDSv1 calls, so you can take action to update your software and then require IMDSv2 on existing EC2 infrastructure.

Identifying IMDSv1-enabled EC2 instances

The first step in transitioning to IMDSv2 is to identify all existing IMDSv1-enabled EC2 instances. You can do this in various ways.

Using the console

You can identify IMDSv1-enabled instances using the IMDSv2 attribute column in the Amazon EC2 page in the AWS Management Console.

To view the IMDSv2 attribute column:

Open the Amazon EC2 console and go to Instances.
Choose the settings icon in the top right.
Scroll down to IMDSv2, turn on the slider.
Choose Confirm.

This gives you the IMDS status of your instances. A status of optional means that IMDSv1 is enabled on the instance and required means that IMDSv1 is disabled.

Figure 1: Example of IMDS versions for EC2 instances in the console

Using the AWS CLI

You can identify IMDSv1-enabled instances using the AWS Command Line Interface (AWS CLI) by running the aws ec2 describe-instances command and checking the value of HttpTokens. The HttpTokens value determines which version of IMDS is enabled: optional allows both IMDSv1 and IMDSv2 calls, and required means that only IMDSv2 calls are accepted. As in the console, the optional status indicates that IMDSv1 is enabled on the instance and required indicates that IMDSv1 is disabled.

"MetadataOptions": {
    "State": "applied",
    "HttpEndpoint": "enabled",
    "HttpTokens": "optional",
    "HttpPutResponseHopLimit": 1
},

[ec2-user@ip-172-31-24-101 ~]$ aws ec2 describe-instances | grep '"HttpTokens": "optional"' | wc -l
4
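The grep above gives a count but not the instance IDs. As a hedged alternative, the sketch below parses the JSON that `aws ec2 describe-instances --output json` prints and lists the affected instances; the function names are illustrative, and it assumes the standard Reservations/Instances response layout.

```python
import json


def imds_token_settings(describe_output: str) -> dict[str, str]:
    """Map each instance ID to its HttpTokens value ('optional' or 'required')."""
    data = json.loads(describe_output)
    settings = {}
    for reservation in data.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            options = instance.get("MetadataOptions", {})
            settings[instance["InstanceId"]] = options.get("HttpTokens", "unknown")
    return settings


def imdsv1_enabled_instances(describe_output: str) -> list[str]:
    """Return the sorted IDs of instances that still accept IMDSv1 calls."""
    return sorted(
        instance_id
        for instance_id, tokens in imds_token_settings(describe_output).items()
        if tokens == "optional"
    )
```

For example, save the CLI output to a file and pass its contents to `imdsv1_enabled_instances` to get the list of instances to investigate.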

Using AWS Config

AWS Config continually assesses, audits, and evaluates the configurations and relationships of your resources on AWS, on premises, and on other clouds. The AWS Config rule ec2-imdsv2-check checks whether your Amazon EC2 instance metadata version is configured with IMDSv2. The rule is NON_COMPLIANT if the HttpTokens is set to optional, which means the EC2 instance has IMDSv1 enabled.

Figure 2: Example of noncompliant EC2 instances in the AWS Config console

After this AWS Config rule is enabled, you can set up AWS Config notifications through Amazon Simple Notification Service (Amazon SNS).

Using Security Hub

AWS Security Hub provides detection and alerting capability at the account and organization levels. You can configure cross-Region aggregation in Security Hub to gain insight on findings across Regions. If using AWS Organizations, you can configure a Security Hub designated account to aggregate findings across accounts in your organization.

Security Hub has an Amazon EC2 control ([EC2.8] Amazon EC2 instances should use Instance Metadata Service Version 2 (IMDSv2)) that uses the AWS Config rule ec2-imdsv2-check to check whether the instance metadata version is configured with IMDSv2. The rule is NON_COMPLIANT if HttpTokens is set to optional, which means the EC2 instance has IMDSv1 enabled.

Figure 3: Example of AWS Security Hub showing noncompliant EC2 instances

Using Amazon EventBridge, you can also set up alerting for the Security Hub findings when EC2 instances are noncompliant for IMDSv2. The following event pattern matches those findings.

{
  "source": ["aws.securityhub"],
  "detail-type": ["Security Hub Findings - Imported"],
  "detail": {
    "findings": {
      "ProductArn": ["arn:aws:securityhub:us-west-2::product/aws/config"],
      "Title": ["ec2-imdsv2-check"]
    }
  }
}
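The event pattern above hard-codes the us-west-2 Region in the ProductArn. As a small sketch, assuming you want the same pattern for other Regions, the following Python function builds it from a Region name. The function name build_imdsv2_event_pattern is hypothetical; the pattern fields are copied from the example above.

```python
import json


def build_imdsv2_event_pattern(region: str) -> dict:
    """Build the Security Hub finding pattern shown above for a given Region."""
    return {
        "source": ["aws.securityhub"],
        "detail-type": ["Security Hub Findings - Imported"],
        "detail": {
            "findings": {
                # Only the Region segment of the product ARN changes
                "ProductArn": [f"arn:aws:securityhub:{region}::product/aws/config"],
                "Title": ["ec2-imdsv2-check"],
            }
        },
    }


pattern = build_imdsv2_event_pattern("us-west-2")
print(json.dumps(pattern, indent=2))
```

You can paste the printed JSON into the event pattern editor when you create the EventBridge rule.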

Identifying if EC2 instances are making IMDSv1 calls

Not all of your software will be making IMDSv1 calls; your dependent libraries and tools might already be compatible with IMDSv2. However, to avoid compatibility issues when you require IMDSv2 and disable IMDSv1 entirely, you must check for remaining IMDSv1 calls from your software. After you’ve identified that there are instances with IMDSv1 enabled, investigate whether your software is making IMDSv1 calls. Most applications make IMDSv1 calls at instance launch and shutdown. For long-running instances, we recommend monitoring IMDSv1 calls during a launch or a stop and restart cycle.

You can check whether your software is making IMDSv1 calls by checking the MetadataNoToken metric in Amazon CloudWatch. You can further identify the source of IMDSv1 calls by using the IMDS Packet Analyzer tool.

Steps to check IMDSv1 usage with CloudWatch

Open the CloudWatch console.
Go to Metrics and then All Metrics.
Select EC2 and then choose Per-Instance Metrics.
Search for the MetadataNoToken metric and add it for the instances you’re interested in.

Figure 4: CloudWatch dashboard for MetadataNoToken per-instance metric

You can use expressions in CloudWatch to view account-wide metrics.

SEARCH('{AWS/EC2,InstanceId} MetricName="MetadataNoToken"', 'Maximum')

Figure 5: Using CloudWatch expressions to view account wide metrics for MetadataNoToken

You can combine SEARCH and SORT expressions in CloudWatch to help identify the instances using IMDSv1.

SORT(SEARCH('{AWS/EC2,InstanceId} MetricName="MetadataNoToken"', 'Sum', 300), SUM, DESC, 10)

Figure 6: Another example of using CloudWatch expressions to view account wide metrics
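To make the SORT and SEARCH combination concrete, here is a rough local equivalent in Python. It is a sketch only: it assumes you have already exported MetadataNoToken datapoints per instance into a dictionary (the data below is made up), and it makes no CloudWatch API calls. The function name top_imdsv1_callers is hypothetical.

```python
from typing import Dict, List, Tuple


def top_imdsv1_callers(
    datapoints: Dict[str, List[float]], limit: int = 10
) -> List[Tuple[str, float]]:
    """Rank instances by total MetadataNoToken count, highest first."""
    # Sum each instance's datapoints, mirroring the SUM sort function
    totals = [(instance_id, sum(values)) for instance_id, values in datapoints.items()]
    # Descending order, keeping only the first `limit` entries
    totals.sort(key=lambda item: item[1], reverse=True)
    return totals[:limit]


# Hypothetical 5-minute Sum datapoints per instance
sample_datapoints = {
    "i-0aaaexample": [0, 3, 2],
    "i-0bbbexample": [0, 0, 0],
    "i-0cccexample": [10, 1],
}

ranking = top_imdsv1_callers(sample_datapoints, limit=2)
print(ranking)
```

Instances at the top of the ranking are the ones to investigate first with the IMDS Packet Analyzer described in the next section.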

If you have multiple AWS accounts or use AWS Organizations, you can set up a centralized monitoring account using CloudWatch cross-account observability.

IMDS Packet Analyzer

The IMDS Packet Analyzer is an open source tool that identifies and logs IMDSv1 calls from your software, including software start-up on your instance. This tool can assist in identifying the software making IMDSv1 calls on EC2 instances, allowing you to pinpoint exactly what you need to update to get your software ready to use IMDSv2. You can run the IMDS Packet Analyzer from a command line or install it as a service. For more information, see IMDS Packet Analyzer on GitHub.

Disabling IMDSv1 and maintaining only IMDSv2 instances

After you’ve monitored and verified that the software on your EC2 instances isn’t making IMDSv1 calls, you can disable IMDSv1 on those instances. For all compatible workloads, we recommend using Amazon Linux 2023, which offers several improvements (see launch announcement), including requiring IMDSv2 (disabling IMDSv1) by default.

You can also create and modify AMIs and EC2 instances to disable IMDSv1. The "Configure the AMI" topic provides guidance on how to register a new AMI or change an existing AMI by setting the imds-support parameter to v2.0. If you’re using container services (such as Amazon ECS or Amazon EKS), you might need a higher hop limit to help avoid falling back to IMDSv1. You can use the modify-instance-metadata-options command to make the change. We recommend testing with a hop limit of three in container environments.

To create a new instance

For new instances, you can disable IMDSv1 and enable IMDSv2 by specifying the metadata-options parameter using the run-instances CLI command.

aws ec2 run-instances \
    --image-id <ami-0123456789example> \
    --instance-type c3.large \
    --metadata-options "HttpEndpoint=enabled,HttpTokens=required"

To modify the running instance

aws ec2 modify-instance-metadata-options \
--instance-id <instance-0123456789example> \
--http-tokens required \
--http-endpoint enabled
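If you need to apply the same change to many instances, you can generate one command per instance. The following Python sketch only builds the argument lists; it doesn't run them. The function name build_modify_command and the instance IDs are hypothetical. The optional hop limit uses the --http-put-response-hop-limit option of the same CLI command, which is relevant for the container case mentioned earlier.

```python
import shlex
from typing import List, Optional


def build_modify_command(instance_id: str, hop_limit: Optional[int] = None) -> List[str]:
    """Build the CLI arguments that require IMDSv2 on one instance."""
    command = [
        "aws", "ec2", "modify-instance-metadata-options",
        "--instance-id", instance_id,
        "--http-tokens", "required",
        "--http-endpoint", "enabled",
    ]
    if hop_limit is not None:
        # Only set the hop limit when the caller asks for it
        command += ["--http-put-response-hop-limit", str(hop_limit)]
    return command


# Hypothetical instance IDs; in practice, use the IDs you identified earlier
for instance_id in ["i-0aaaexample", "i-0bbbexample"]:
    args = build_modify_command(instance_id, hop_limit=3)
    print(" ".join(shlex.quote(arg) for arg in args))
```

Printing the commands first lets you review them before passing each argument list to a runner such as subprocess.run.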

To configure a new AMI

aws ec2 register-image \
    --name <my-image> \
    --root-device-name /dev/xvda \
    --block-device-mappings DeviceName=/dev/xvda,Ebs={SnapshotId=<snap-0123456789example>} \
    --imds-support v2.0

To modify an existing AMI

aws ec2 modify-image-attribute \
    --image-id <ami-0123456789example> \
    --imds-support v2.0

Using the console

If you’re using the console to launch instances, after selecting Launch Instance from AWS Console, choose the Advanced details tab, scroll down to Metadata version and select V2 only (token required).

Figure 7: Modifying IMDS version using the console

Using EC2 launch templates

You can use an EC2 launch template as an instance configuration template that an Amazon EC2 Auto Scaling group can use to launch EC2 instances. When creating the launch template using the console, you can specify the Metadata version and select V2 only (token required).

Figure 8: Modifying the IMDS version in the EC2 launch templates

Using CloudFormation with EC2 launch templates

When creating an EC2 launch template using AWS CloudFormation, you must specify the MetadataOptions property to use only IMDSv2 by setting HttpTokens as required.

In this state, retrieving the AWS Identity and Access Management (IAM) role credentials always returns IMDSv2 credentials; IMDSv1 credentials are not available.

{
"HttpEndpoint" : <String>,
"HttpProtocolIpv6" : <String>,
"HttpPutResponseHopLimit" : <Integer>,
"HttpTokens" : "required",
"InstanceMetadataTags" : <String>
}
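The MetadataOptions syntax above sits inside the LaunchTemplateData property of an AWS::EC2::LaunchTemplate resource. As an illustrative sketch, the following Python snippet generates a minimal CloudFormation template in JSON with that nesting. The logical ID ImdsV2LaunchTemplate, the template name, and the function name are hypothetical, and a real launch template would normally include more properties, such as an AMI and instance type.

```python
import json


def launch_template_resource(name: str, hop_limit: int = 1) -> dict:
    """Return a launch template resource that requires IMDSv2."""
    return {
        "Type": "AWS::EC2::LaunchTemplate",
        "Properties": {
            "LaunchTemplateName": name,
            "LaunchTemplateData": {
                "MetadataOptions": {
                    "HttpEndpoint": "enabled",
                    "HttpTokens": "required",  # disables IMDSv1
                    "HttpPutResponseHopLimit": hop_limit,
                }
            },
        },
    }


template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Resources": {
        "ImdsV2LaunchTemplate": launch_template_resource("imdsv2-only-template"),
    },
}

print(json.dumps(template, indent=2))
```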

Using Systems Manager automation runbook

You can run the EnforceEC2InstanceIMDSv2 automation document available in AWS Systems Manager, which will enforce IMDSv2 on the EC2 instance using the ModifyInstanceMetadataOptions API.

Open the Systems Manager console, and then select Automation from the navigation pane.
Choose Execute automation.
On the Owned by Amazon tab, for Automation document, enter EnforceEC2InstanceIMDSv2, and then press Enter.
Choose EnforceEC2InstanceIMDSv2 document, and then choose Next.
For Execute automation document, choose Simple execution.

Note: If you need to run the automation on multiple targets, then choose Rate Control.
For Input parameters, enter the ID of the EC2 instance under InstanceId.
For AutomationAssumeRole, select a role.

Note: To change the target EC2 instance, the AutomationAssumeRole must have ec2:ModifyInstanceMetadataOptions and ec2:DescribeInstances permissions. For more information about creating the assume role for Systems Manager Automation, see Create a service role for Automation.
Choose Execute.

Using the AWS CDK

If you use the AWS Cloud Development Kit (AWS CDK) to launch instances, you can use it to set the requireImdsv2 property to disable IMDSv1 and enable IMDSv2.

new ec2.Instance(this, 'Instance', {
        // <... other parameters>
        requireImdsv2: true,
})

Using AWS SDK

The new clients for AWS SDK for Java 2.x use IMDSv2, and you can use the new clients to retrieve instance metadata for your EC2 instances. See Introducing a new client in the AWS SDK for Java 2.x for retrieving EC2 Instance Metadata for instructions.

Maintain only IMDSv2 EC2 instances

To maintain only IMDSv2 instances, you can implement service control policies and IAM policies that verify that users and software on your EC2 instances can only use instance metadata using IMDSv2. The following policy specifies that RunInstances API calls require the EC2 instance to use only IMDSv2. We recommend implementing this policy after all of the instances in associated accounts are free of IMDSv1 calls and you have migrated all of the instances to use only IMDSv2.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RequireImdsV2",
            "Effect": "Deny",
            "Action": "ec2:RunInstances",
            "Resource": "arn:aws:ec2:*:*:instance/*",
            "Condition": {
                "StringNotEquals": {
                    "ec2:MetadataHttpTokens": "required"
                }
            }
        }
    ]
}
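To reason about which launches this policy blocks, the following Python sketch models the Deny statement's condition in a deliberately simplified way. It is not the IAM evaluation engine. It assumes only the documented behavior that a negated operator such as StringNotEquals also evaluates to true when the condition key is absent from the request context. The function name is_denied and the request contexts are hypothetical.

```python
from typing import Dict

# The Deny statement from the policy above
DENY_STATEMENT = {
    "Sid": "RequireImdsV2",
    "Effect": "Deny",
    "Action": "ec2:RunInstances",
    "Resource": "arn:aws:ec2:*:*:instance/*",
    "Condition": {"StringNotEquals": {"ec2:MetadataHttpTokens": "required"}},
}


def is_denied(statement: dict, request_context: Dict[str, str]) -> bool:
    """Simplified check of whether the StringNotEquals condition applies."""
    for key, expected in statement["Condition"]["StringNotEquals"].items():
        actual = request_context.get(key)  # None when the key is absent
        if actual == expected:
            return False  # value matches, so the Deny does not apply
    return True


print(is_denied(DENY_STATEMENT, {"ec2:MetadataHttpTokens": "required"}))
print(is_denied(DENY_STATEMENT, {"ec2:MetadataHttpTokens": "optional"}))
print(is_denied(DENY_STATEMENT, {}))
```

In this model, only a launch request that explicitly sets the token requirement to required passes; a request that sets it to optional or omits it is denied.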

You can find more details on applicable service control policies (SCPs) and IAM policies in the EC2 User Guide.

Restricting credential usage using condition keys

As an additional layer of defense, you can restrict the use of your Amazon EC2 role credentials to work only when used in the EC2 instance to which they are issued. This control is complementary to IMDSv2 because both can work together. The AWS global condition context keys for EC2 credential control properties (aws:EC2InstanceSourceVPC and aws:EC2InstanceSourcePrivateIPv4) restrict the VPC endpoints and private IPs that can use your EC2 instance credentials, and you can use these keys in service control policies (SCPs) or IAM policies. Examples of these policies are in this blog post.

Conclusion

You won’t be able to get the full benefits of IMDSv2 until you disable IMDSv1. In this blog post, we showed you how to identify IMDSv1-enabled EC2 instances and how to determine if and when your software is making IMDSv1 calls. We also showed you how to disable IMDSv1 on new and existing EC2 infrastructure after your software is no longer making IMDSv1 calls. You can use these tools to transition your existing EC2 instances, and set your new EC2 launches, to use only IMDSv2.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Compute re:Post or contact AWS Support.


Enable external pipeline deployments to AWS Cloud by using IAM Roles Anywhere

2023-09-26 Olivier Gaumond

Post Syndicated from Olivier Gaumond original https://aws.amazon.com/blogs/security/enable-external-pipeline-deployments-to-aws-cloud-by-using-iam-roles-anywhere/

Continuous integration and continuous delivery (CI/CD) services help customers automate deployments of infrastructure as code and software within the cloud. Common native Amazon Web Services (AWS) CI/CD services include AWS CodePipeline, AWS CodeBuild, and AWS CodeDeploy. You can also use third-party CI/CD services hosted outside the AWS Cloud, such as Jenkins, GitLab, and Azure DevOps, to deploy code within the AWS Cloud by using temporary security credentials.

Security credentials allow identities (for example, IAM role or IAM user) to verify who they are and the permissions they have to interact with another resource. The AWS Identity and Access Management (IAM) service authentication and authorization process requires identities to present valid security credentials to interact with another AWS resource.

According to AWS security best practices, where possible, we recommend relying on temporary credentials instead of creating long-term credentials such as access keys. Temporary security credentials, also referred to as short-term credentials, can help limit the impact of inadvertently exposed credentials because they have a limited lifespan and don’t require periodic rotation or revocation. After temporary security credentials expire, AWS will no longer approve authentication and authorization requests made with these credentials.

In this blog post, we’ll walk you through the steps on how to obtain AWS temporary credentials for your external CI/CD pipelines by using IAM Roles Anywhere and an on-premises hosted server running Azure DevOps Services.

Deploy securely on AWS using IAM Roles Anywhere

When you run code on AWS compute services, such as AWS Lambda, AWS provides temporary credentials to your workloads. In hybrid information technology environments, when you want to authenticate with AWS services from outside of the cloud, your external services need AWS credentials.

IAM Roles Anywhere provides a secure way for your workloads — such as servers, containers, and applications running outside of AWS — to request and obtain temporary AWS credentials by using private certificates. You can use IAM Roles Anywhere to enable your applications that run outside of AWS to obtain temporary AWS credentials, helping you eliminate the need to manage long-term credentials or complex temporary credential solutions for workloads running outside of AWS.

To use IAM Roles Anywhere, your workloads require an X.509 certificate, issued by your private certificate authority (CA), to request temporary security credentials from the AWS Cloud.

IAM Roles Anywhere can work with your existing client or server certificates that you issue to your workloads today. In this blog post, our objective is to show how you can use X.509 certificates issued by your public key infrastructure (PKI) solution to gain access to AWS resources by using IAM Roles Anywhere. Here we don’t cover PKI solutions options, and we assume that you have your own PKI solution for certificate generation. In this post, we demonstrate the IAM Roles Anywhere setup with a self-signed certificate for the purpose of the demo running in a test environment.

External CI/CD pipeline deployments in AWS

CI/CD services are typically composed of a control plane and user interface. They are used to automate the configuration, orchestration, and deployment of infrastructure code or software. The code build steps are handled by a build agent that can be hosted on a virtual machine or container running on-premises or in the cloud. Build agents are responsible for completing the jobs defined by a CI/CD pipeline.

For this use case, you have an on-premises CI/CD pipeline that uses AWS CloudFormation to deploy resources within a target AWS account. The CloudFormation template, the pipeline definition, and other files are hosted in a Git repository. The on-premises build agent requires permissions to deploy code through AWS CloudFormation within an AWS account. To make calls to AWS APIs, the build agent needs to obtain AWS credentials from an IAM role. The solution architecture is shown in Figure 1.

Figure 1: Using external CI/CD tool with AWS

To make this deployment secure, the main objective is to use short-term credentials and avoid the need to generate and store long-term credentials for your pipelines. This post walks through how to use IAM Roles Anywhere and certificate-based authentication with Azure DevOps build agents. The walkthrough uses Azure DevOps Services with Microsoft-hosted agents. The same approach can be used with a self-hosted agent or Azure DevOps Server.

IAM Roles Anywhere and certificate-based authentication

IAM Roles Anywhere uses a private certificate authority (CA) for the temporary security credential issuance process. Your private CA is registered with IAM Roles Anywhere through a service-to-service trust. Once the trust is established, you create an IAM role with an IAM policy that can be assumed by your services running outside of AWS. The external service uses a private CA issued X.509 certificate to request temporary AWS credentials from IAM Roles Anywhere and then assumes the IAM role with permission to finish the authentication process, as shown in Figure 2.

Figure 2: Certificate-based authentication for external CI/CD tool using IAM Roles Anywhere

The workflow in Figure 2 is as follows:

The external service uses its certificate to sign and issue a request to IAM Roles Anywhere.
IAM Roles Anywhere validates the incoming signature and checks that the certificate was issued by a certificate authority configured as a trust anchor in the account.
Temporary credentials are returned to the external service, which can then be used for other authenticated calls to the AWS APIs.

Walkthrough

In this walkthrough, you accomplish the following steps:

Deploy IAM roles in your workload accounts.
Create a root certificate to simulate your certificate authority. Then request and sign a leaf certificate to distribute to your build agent.
Configure an IAM Roles Anywhere trust anchor in your workload accounts.
Configure your pipelines to use certificate-based authentication with a working example using Azure DevOps pipelines.

Preparation

You can find the sample code for this post in our GitHub repository. We recommend that you locally clone a copy of this repository. This repository includes the following files:

DynamoDB_Table.template: This template creates an Amazon DynamoDB table.
iamra-trust-policy.json: This trust policy allows the IAM Roles Anywhere service to assume the role and defines the permissions to be granted.
parameters.json: This passes parameters when launching the CloudFormation template.
pipeline-iamra.yml: The definition of the pipeline that deploys the CloudFormation template using IAM Roles Anywhere authentication.
pipeline-iamra-multi.yml: The definition of the pipeline that deploys the CloudFormation template using IAM Roles Anywhere authentication in multi-account environment.

The first step is creating an IAM role in your AWS accounts with the necessary permissions to deploy your resources. For this, you create a role using the AWSCloudFormationFullAccess and AmazonDynamoDBFullAccess managed policies.

When you define the permissions for your actual applications and workloads, make sure to adjust the permissions to meet your specific needs based on the principle of least privilege.

Run the following commands to create the CICDRole in the Dev and Prod AWS accounts.

aws iam create-role --role-name CICDRole --assume-role-policy-document file://iamra-trust-policy.json
aws iam attach-role-policy --role-name CICDRole --policy-arn arn:aws:iam::aws:policy/AmazonDynamoDBFullAccess
aws iam attach-role-policy --role-name CICDRole --policy-arn arn:aws:iam::aws:policy/AWSCloudFormationFullAccess

As part of the role creation, you need to apply the trust policy provided in iamra-trust-policy.json. This trust policy allows the IAM Roles Anywhere service to assume the role with the condition that the Subject Common Name (CN) of the certificate is cicd-agent.example.com. In a later step, you will update this trust policy with the Amazon Resource Name (ARN) of your trust anchor to further restrict how the role can be assumed.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "rolesanywhere.amazonaws.com"
            },
            "Action": [
                "sts:AssumeRole",
                "sts:TagSession",
                "sts:SetSourceIdentity"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:PrincipalTag/x509Subject/CN": "cicd-agent.example.com"
                }
            }
        }
    ]
}

Issue and sign a self-signed certificate

Use OpenSSL to generate and sign the certificate. Run the following commands to generate a root and leaf certificate.

Note: The following procedure has been tested with OpenSSL 1.1.1 and OpenSSL 3.0.8.

# generate key for CA certificate
openssl genrsa -out ca.key 2048

# generate CA certificate
openssl req -new -x509 -days 1826 -key ca.key -subj /CN=ca.example.com \
    -addext 'keyUsage=critical,keyCertSign,cRLSign,digitalSignature' \
    -addext 'basicConstraints=critical,CA:TRUE' -out ca.crt 

# generate key for leaf certificate
openssl genrsa -out private.key 2048

# define extensions for the leaf certificate, then create the signing request
cat > extensions.cnf <<EOF
[v3_ca]
keyUsage = digitalSignature, nonRepudiation, keyEncipherment, dataEncipherment
EOF

openssl req -new -key private.key -subj /CN=cicd-agent.example.com -out iamra-cert.csr

# sign leaf certificate with CA
openssl x509 -req -days 7 -in iamra-cert.csr -CA ca.crt -CAkey ca.key -set_serial 01 -extfile extensions.cnf -extensions v3_ca -out certificate.crt

The following files are needed in further steps: ca.crt, certificate.crt, private.key.

Configure the IAM Roles Anywhere trust anchor and profile in your workload accounts

In this step, you configure the IAM Roles Anywhere trust anchor, the profile, and the role with the associated IAM policy to define the permissions to be granted to your build agents. Make sure to set the permissions specified in the policy to the least privileged access.

To configure the IAM Roles Anywhere trust anchor

Open the IAM console and go to Roles Anywhere.
Choose Create a trust anchor.
Choose External certificate bundle and paste the content of your CA public certificate in the certificate bundle box (the content of the ca.crt file from the previous step). The configuration looks as follows:

Figure 3: IAM Roles Anywhere trust anchor

To follow the security best practice of least privilege, add a condition statement to the IAM role’s trust policy that matches the trust anchor you created. This helps make sure that only certificates issued under that trust anchor can be used to assume the role through IAM Roles Anywhere.

To update the trust policy of the created CICDRole

Open the IAM console, select Roles, then search for CICDRole.
Open CICDRole to update its configuration, and then select Trust relationships.
Replace the existing policy with the following updated policy that includes an additional condition to match on the trust anchor. Replace the ARN ID in the policy with the ARN of the trust anchor created in your account.

Figure 4: IAM Roles Anywhere updated trust policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "rolesanywhere.amazonaws.com"
            },
            "Action": [
                "sts:AssumeRole",
                "sts:TagSession",
                "sts:SetSourceIdentity"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:PrincipalTag/x509Subject/CN": "cicd-agent.example.com"
                },
                "ArnEquals": {
                    "aws:SourceArn": "arn:aws:rolesanywhere:ca-central-1:111111111111:trust-anchor/9f084b8b-2a32-47f6-aee3-d027f5c4b03b"
                }
            }
        }
    ]
}
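Because you need this trust policy in every workload account, each with its own trust anchor ARN, it can help to generate it. The following Python sketch builds both the initial policy and the updated one with the trust anchor condition. The function name build_trust_policy is hypothetical; the policy structure is copied from the examples in this post.

```python
import json
from typing import Optional


def build_trust_policy(common_name: str, trust_anchor_arn: Optional[str] = None) -> dict:
    """Build the IAM Roles Anywhere trust policy for a certificate CN."""
    condition = {
        "StringEquals": {"aws:PrincipalTag/x509Subject/CN": common_name}
    }
    if trust_anchor_arn is not None:
        # Restrict the role to sessions created through this trust anchor
        condition["ArnEquals"] = {"aws:SourceArn": trust_anchor_arn}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "rolesanywhere.amazonaws.com"},
                "Action": ["sts:AssumeRole", "sts:TagSession", "sts:SetSourceIdentity"],
                "Condition": condition,
            }
        ],
    }


policy = build_trust_policy(
    "cicd-agent.example.com",
    "arn:aws:rolesanywhere:ca-central-1:111111111111:trust-anchor/9f084b8b-2a32-47f6-aee3-d027f5c4b03b",
)
print(json.dumps(policy, indent=4))
```

Calling build_trust_policy with only the common name reproduces the initial policy from iamra-trust-policy.json, without the trust anchor condition.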

To create an IAM Roles Anywhere profile and link the profile to CICDRole

Open the IAM console and go to Roles Anywhere.
Choose Create a profile.
In the Profile section, enter a name.
In the Roles section, select CICDRole.
Keep the other options set to default.

Figure 5: IAM Roles Anywhere profile

Configure the Azure DevOps pipeline to use certificate-based authentication

Now that you’ve completed the necessary setup in AWS, you move to the configuration of your pipeline in Azure DevOps. You need to have access to an Azure DevOps organization to complete these steps.

Have the following values ready. They’re needed for the Azure DevOps Pipeline configuration. You need this set of information for every AWS account you want to deploy to.

Trust anchor ARN – Resource identifier for the trust anchor created when you configured IAM Roles Anywhere.
Profile ARN – The identifier of the IAM Roles Anywhere profile you created.
Role ARN – The ARN of the role to assume. This role needs to be configured in the profile.
Certificate – The certificate tied to the profile (in other words, the issued certificate: file certificate.crt).
Private key – The private key of the certificate (private.key).

Azure DevOps configuration steps

The following steps walk you through configuring Azure DevOps.

Create a new project in Azure DevOps.
Add the following files from the sample repository that you previously cloned to the Git Azure repo that was created as part of the project. (The simplest way to do this is to add a new remote to your local Git repository and push the files.)
- DynamoDB_Table.template – The sample CloudFormation template you will deploy
- parameters.json – This passes parameters when launching the CloudFormation template
- pipeline-iamra.yml – The definition of the pipeline that deploys the CloudFormation template using IAM Roles Anywhere authentication
Create a new pipeline:
1. Select Azure Repos Git as your source.
2. Select your current repository.
3. Choose Existing Azure Pipelines YAML file.
4. For the path, enter pipeline-iamra.yml.
5. Select Save (don’t run the pipeline yet).
In Azure DevOps, choose Pipelines, and then choose Library.
Create a new variable group called aws-dev that will store the configuration values to deploy to your AWS Dev environment.
Add variables corresponding to the values of the trust anchor, profile, and role to use for authentication.

Figure 6: Azure DevOps configuration steps: Adding IAM Roles Anywhere variables
Save the group.
Update the permissions to allow your pipeline to use the variable group.

Figure 7: Azure DevOps configuration steps: Pipeline permissions
In the Library, choose the Secure files tab to upload the certificate and private key files that you generated previously.

Figure 8: Azure DevOps configuration steps: Upload certificate and private key
For each file, update the Pipeline permissions to provide access to the pipeline created previously.

Figure 9: Azure DevOps configuration steps: Pipeline permissions for each file
Run the pipeline and validate successful completion. In your AWS account, you should see a stack named my-stack-name that deployed a DynamoDB table.

Figure 10: Verify CloudFormation stack deployment in your account

Explanation of the pipeline-iamra.yml

Here are the different steps of the pipeline:

The first step downloads and installs the credential helper tool that allows you to obtain temporary credentials from IAM Roles Anywhere.

- bash: wget https://rolesanywhere.amazonaws.com/releases/1.0.3/X86_64/Linux/aws_signing_helper; chmod +x aws_signing_helper;
  displayName: Install AWS Signer

The second step uses the DownloadSecureFile built-in task to retrieve the certificate and private key that you stored in the Azure DevOps secure storage.

- task: DownloadSecureFile@1
  name: Certificate
  displayName: 'Download certificate'
  inputs:
    secureFile: 'certificate.crt'

- task: DownloadSecureFile@1
  name: Privatekey
  displayName: 'Download private key'
  inputs:
    secureFile: 'private.key'

The credential helper is configured to obtain temporary credentials by providing the certificate and private key, along with the trust anchor, the IAM Roles Anywhere profile, and the role to assume. Every time the AWS CLI or an AWS SDK needs to authenticate to AWS, it uses this credential helper to obtain temporary credentials.

- bash: |
    aws configure set credential_process "./aws_signing_helper credential-process --certificate $(Certificate.secureFilePath) --private-key $(Privatekey.secureFilePath) --trust-anchor-arn $(TRUSTANCHORARN) --profile-arn $(PROFILEARN) --role-arn $(ROLEARN)" --profile default
    echo "##vso[task.setvariable variable=AWS_SDK_LOAD_CONFIG;]1"
  displayName: Obtain AWS Credentials
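Outside of Azure DevOps, you can assemble the same credential_process value yourself. The following Python sketch builds the command string from its parts, using only the helper options that appear in the pipeline step above. The function name, file paths, profile ARN, and role ARN are hypothetical placeholders.

```python
import shlex
from typing import List


def build_credential_process(
    helper_path: str,
    certificate: str,
    private_key: str,
    trust_anchor_arn: str,
    profile_arn: str,
    role_arn: str,
) -> str:
    """Return the credential_process command line for the signing helper."""
    args: List[str] = [
        helper_path, "credential-process",
        "--certificate", certificate,
        "--private-key", private_key,
        "--trust-anchor-arn", trust_anchor_arn,
        "--profile-arn", profile_arn,
        "--role-arn", role_arn,
    ]
    # Quote each argument so paths with spaces survive shell parsing
    return " ".join(shlex.quote(arg) for arg in args)


command_line = build_credential_process(
    "./aws_signing_helper",
    "/tmp/certificate.crt",
    "/tmp/private.key",
    "arn:aws:rolesanywhere:ca-central-1:111111111111:trust-anchor/9f084b8b-2a32-47f6-aee3-d027f5c4b03b",
    "arn:aws:rolesanywhere:ca-central-1:111111111111:profile/00000000-0000-0000-0000-000000000000",
    "arn:aws:iam::111111111111:role/CICDRole",
)
print(command_line)
```

You can then pass the resulting string to aws configure set credential_process, as the pipeline step does.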

The next step is for troubleshooting purposes. The AWS CLI is used to confirm the current assumed identity in your target AWS account.

- task: AWSCLI@1
  displayName: Check AWS identity
  inputs:
    regionName: 'ca-central-1'
    awsCommand: 'sts'
    awsSubCommand: 'get-caller-identity'

The final step uses the CloudFormationCreateOrUpdateStack task from the AWS Toolkit for Azure DevOps to deploy the CloudFormation stack. Usually, the awsCredentials parameter is used to point the task to the Service Connection with the AWS access keys and secrets. If you omit this parameter, the task looks instead for the credentials in the standard credential provider chain.

- task: CloudFormationCreateOrUpdateStack@1
  displayName: 'Create/Update Stack: Staging-Deployment'
  inputs:
    regionName:     'ca-central-1'
    stackName:      'my-stack-name'
    useChangeSet:   true
    changeSetName:  'my-stack-name-changeset'
    templateFile:   'DynamoDB_Table.template'
    templateParametersFile: 'parameters.json'
    captureStackOutputs: asVariables
    captureAsSecuredVars: false

Multi-account deployments

In this example, the pipeline deploys to a single AWS account. You can quickly extend it to support deployment to multiple accounts by following these steps:

Repeat the steps in Configure the IAM Roles Anywhere trust anchor and profile for each account.
In Azure DevOps, create a variable group with the configuration specific to the additional account.
In the pipeline definition, add a stage that uses this variable group.

The pipeline-iamra-multi.yml file in the sample repository contains such an example.

Cleanup

To clean up the AWS resources created in this article, follow these steps:

Delete the deployed CloudFormation stack in your workload accounts.
Remove the IAM trust anchor and profile from the workload accounts.
Delete the CICDRole IAM role.

Alternative options available to obtain temporary credentials in AWS for CI/CD pipelines

In addition to the IAM Roles Anywhere option presented in this blog, there are two other options to issue temporary security credentials for the external build agent:

Option 1 – Re-host the build agent on an Amazon Elastic Compute Cloud (Amazon EC2) instance in the AWS account and assign an IAM role. (See IAM roles for Amazon EC2.) This option avoids long-term IAM access keys by deploying self-hosted build agents on an AWS compute service (such as Amazon EC2, AWS Fargate, or Amazon Elastic Kubernetes Service (Amazon EKS)) instead of using fully managed or on-premises agents, but it still requires multiple agents for pipelines that need different permissions.
Option 2 – Some DevOps tools support the use of OpenID Connect (OIDC). OIDC is an authentication layer based on open standards that makes it simpler for a client and an identity provider to exchange information. CI/CD tools such as GitHub, GitLab, and Bitbucket provide support for OIDC, which helps you to integrate with AWS for secure deployments and resource access without having to store credentials as long-lived secrets. However, not all CI/CD pipeline tools support OIDC.

Conclusion

In this post, we showed you how to combine IAM Roles Anywhere and an existing public key infrastructure (PKI) to authenticate external build agents to AWS by using short-lived certificates to obtain AWS temporary credentials. We presented the use of Azure Pipelines for the demonstration, but you can adapt the same steps to other CI/CD tools running on premises or in other cloud platforms. For simplicity, the certificate was manually configured in Azure DevOps to be provided to the agents. We encourage you to automate the distribution of short-lived certificates based on an integration with your PKI.

For demonstration purposes, we included the steps of generating a root certificate and manually signing the leaf certificate. For production workloads, you should have access to a private certificate authority to generate certificates for use by your external build agent. If you do not have an existing private certificate authority, consider using AWS Private Certificate Authority.


Set up fine-grained permissions for your data pipeline using MWAA and EKS

2023-09-25 Ulrich Hinze

Post Syndicated from Ulrich Hinze original https://aws.amazon.com/blogs/big-data/set-up-fine-grained-permissions-for-your-data-pipeline-using-mwaa-and-eks/

This is a guest blog post co-written with Patrick Oberherr from Contentful and Johannes Günther from Netlight Consulting.

This blog post shows how to improve security in a data pipeline architecture based on Amazon Managed Workflows for Apache Airflow (Amazon MWAA) and Amazon Elastic Kubernetes Service (Amazon EKS) by setting up fine-grained permissions, using HashiCorp Terraform for infrastructure as code.

Many AWS customers use Amazon EKS to execute their data workloads. The advantages of Amazon EKS include different compute and storage options depending on workload needs, higher resource utilization by sharing underlying infrastructure, and a vibrant open-source community that provides purpose-built extensions. The Data on EKS project provides a series of templates and other resources to help customers get started on this journey. It includes a description of using Amazon MWAA as a job scheduler.

Contentful is an AWS customer and AWS Partner Network (APN) partner. Behind the scenes of their Software-as-a-Service (SaaS) product, the Contentful Composable Content Platform, Contentful uses insights from data to improve business decision-making and customer experience. Contentful engaged Netlight, an APN consulting partner, to help set up a data platform to gather these insights.

Most of Contentful’s application workloads run on Amazon EKS, and knowledge of this service and Kubernetes is widespread in the organization. That’s why Contentful’s data engineering team decided to run data pipelines on Amazon EKS as well. For job scheduling, they started with a self-operated Apache Airflow on an Amazon EKS cluster and later switched to Amazon MWAA to reduce engineering and operations overhead. The job execution remained on Amazon EKS.

Contentful runs a complex data pipeline using this infrastructure, including ingestion from multiple data sources and different transformation jobs, for example using dbt. The whole pipeline shares a single Amazon MWAA environment and a single Amazon EKS cluster. With a diverse set of workloads in a single environment, it is necessary to apply the principle of least privilege, ensuring that individual tasks or components have only the specific permissions they need to function.

By segmenting permissions according to roles and responsibilities, Contentful’s data engineering team was able to create a more robust and secure data processing environment, which is essential for maintaining the integrity and confidentiality of the data being handled.

In this blog post, we walk through setting up the infrastructure from scratch and deploying a sample application using Terraform, Contentful’s tool of choice for infrastructure as code.

Prerequisites

To follow along with this blog post, you need the latest version of the following tools installed:

AWS CLI, configured with access to your AWS account
Terraform CLI
kubectl

Overview

In this blog post, you will create a sample application with the following infrastructure:

Architecture drawing of the sample application deployed in this blog post

The sample Airflow workflow lists objects in the source bucket, temporarily stores this list using Airflow XComs, and writes the list as a file to the destination bucket. This application is executed using Amazon EKS pods, scheduled by an Amazon MWAA environment. You deploy the EKS cluster and the MWAA environment into a virtual private cloud (VPC) and apply least-privilege permissions to the EKS pods using IAM roles for service accounts. The configuration bucket for Amazon MWAA contains runtime requirements, as well as the application code specifying an Airflow Directed Acyclic Graph (DAG).

Initialize the project and create buckets

Create a file main.tf with the following content in an empty directory:

locals {
  region = "us-east-1"
}

provider "aws" {
  region = local.region
}

resource "aws_s3_bucket" "source_bucket" {
  bucket_prefix = "source"
}

resource "aws_s3_object" "dummy_object" {
  bucket  = aws_s3_bucket.source_bucket.bucket
  key     = "dummy.txt"
  content = ""
}

resource "aws_ssm_parameter" "source_bucket" {
  name  = "mwaa_source_bucket"
  type  = "SecureString"
  value = aws_s3_bucket.source_bucket.bucket
}

resource "aws_s3_bucket" "destination_bucket" {
  bucket_prefix = "destination"
  force_destroy = true
}

resource "aws_ssm_parameter" "destination_bucket" {
  name  = "mwaa_destination_bucket"
  type  = "SecureString"
  value = aws_s3_bucket.destination_bucket.bucket
}

This file defines the Terraform AWS provider as well as the source and destination buckets, whose names are exported as AWS Systems Manager parameters. It also tells Terraform to upload an empty object named dummy.txt into the source bucket, so that the Airflow sample application we create later receives a result when it lists the bucket content.

Initialize the Terraform project and download the module dependencies by issuing the following command:

terraform init

Create the infrastructure:

terraform apply

Terraform asks you to acknowledge changes to the environment and then starts deploying resources in AWS. Upon successful deployment, you should see the following success message:

Apply complete! Resources: 5 added, 0 changed, 0 destroyed.

Create VPC

Create a new file vpc.tf in the same directory as main.tf and insert the following:

data "aws_availability_zones" "available" {}

locals {
  cidr = "10.0.0.0/16"
  azs  = slice(data.aws_availability_zones.available.names, 0, 3)
}

module "vpc" {
  name               = "data-vpc"
  source             = "terraform-aws-modules/vpc/aws"
  version            = "~> 4.0"
  cidr               = local.cidr
  azs                = local.azs
  public_subnets     = [for k, v in local.azs : cidrsubnet(local.cidr, 8, k + 48)]
  private_subnets    = [for k, v in local.azs : cidrsubnet(local.cidr, 4, k)]
  enable_nat_gateway = true
}

This file defines the VPC, a virtual network, that will later host the Amazon EKS cluster and the Amazon MWAA environment. Note that we use an existing Terraform module for this, which wraps configuration of underlying network resources like subnets, route tables, and NAT gateways.
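
To check that the public and private ranges in vpc.tf do not overlap, the two cidrsubnet expressions can be reproduced outside Terraform. The following is a minimal Python sketch using the standard ipaddress module. The helper name cidrsubnet is our own and only mirrors the IPv4 behavior of the Terraform function used above; it is not part of the walkthrough.

```python
import ipaddress


def cidrsubnet(prefix: str, newbits: int, netnum: int) -> str:
    """Illustrative re-implementation of Terraform's cidrsubnet() for IPv4."""
    network = ipaddress.ip_network(prefix)
    # Extending the prefix by `newbits` splits the network into 2**newbits subnets.
    subnets = list(network.subnets(prefixlen_diff=newbits))
    return str(subnets[netnum])


vpc_cidr = "10.0.0.0/16"
az_indexes = range(3)  # three Availability Zones, as in local.azs

# Same expressions as in vpc.tf
public_subnets = [cidrsubnet(vpc_cidr, 8, k + 48) for k in az_indexes]
private_subnets = [cidrsubnet(vpc_cidr, 4, k) for k in az_indexes]

print("public: ", public_subnets)
print("private:", private_subnets)
```

The private subnets come out as three /20 blocks (10.0.0.0/20, 10.0.16.0/20, 10.0.32.0/20), and the public subnets as three /24 blocks starting at 10.0.48.0/24, directly after the third private block, which ends at 10.0.47.255.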

Download the VPC module:

terraform init

Deploy the new resources:

terraform apply

Note which resources are being created. By using the VPC module in our Terraform file, much of the underlying complexity is taken away when defining our infrastructure, but it’s still useful to know what exactly is being deployed.

Note that Terraform now handles resources we defined in both files, main.tf and vpc.tf, because Terraform includes all .tf files in the current working directory.

Create the Amazon MWAA environment

Create a new file mwaa.tf and insert the following content:

locals {
  requirements_filename = "requirements.txt"
  airflow_version       = "2.6.3"
  requirements_content  = <<EOT
apache-airflow[cncf.kubernetes]==${local.airflow_version}
EOT
}

module "mwaa" {
  source = "github.com/aws-ia/terraform-aws-mwaa?ref=1066050"

  name              = "mwaa"
  airflow_version   = local.airflow_version
  environment_class = "mw1.small"

  vpc_id             = module.vpc.vpc_id
  private_subnet_ids = slice(module.vpc.private_subnets, 0, 2)

  webserver_access_mode = "PUBLIC_ONLY"

  requirements_s3_path = local.requirements_filename
}

resource "aws_s3_object" "requirements" {
  bucket  = module.mwaa.aws_s3_bucket_name
  key     = local.requirements_filename
  content = local.requirements_content

  etag = md5(local.requirements_content)
}

Like before, we use an existing module to save configuration effort for the Amazon MWAA environment. The module also creates the configuration bucket, which we use to specify the runtime dependency of the application (apache-airflow-cncf-kubernetes) in the requirements.txt file. This package, in combination with the preinstalled package apache-airflow-amazon, enables interaction with Amazon EKS.

Download the MWAA module:

terraform init

Deploy the new resources:

terraform apply

This operation takes 20–30 minutes to complete.

Create the Amazon EKS cluster

Create a file eks.tf with the following content:

module "cluster" {
  source = "github.com/aws-ia/terraform-aws-eks-blueprints?ref=8a06a6e"

  cluster_name    = "data-cluster"
  cluster_version = "1.27"

  vpc_id             = module.vpc.vpc_id
  private_subnet_ids = module.vpc.private_subnets
  enable_irsa        = true

  managed_node_groups = {
    node_group = {
      node_group_name = "node-group"
      desired_size    = 1
    }
  }
  application_teams = {
    mwaa = {}
  }

  map_roles = [{
    rolearn  = module.mwaa.mwaa_role_arn
    username = "mwaa-executor"
    groups   = []
  }]
}

data "aws_eks_cluster_auth" "this" {
  name = module.cluster.eks_cluster_id
}

provider "kubernetes" {
  host                   = module.cluster.eks_cluster_endpoint
  cluster_ca_certificate = base64decode(module.cluster.eks_cluster_certificate_authority_data)
  token                  = data.aws_eks_cluster_auth.this.token
}

resource "kubernetes_role" "mwaa_executor" {
  metadata {
    name      = "mwaa-executor"
    namespace = "mwaa"
  }

  rule {
    api_groups = [""]
    resources  = ["pods", "pods/log", "pods/exec"]
    verbs      = ["get", "list", "create", "patch", "delete"]
  }
}

resource "kubernetes_role_binding" "mwaa_executor" {
  metadata {
    name      = "mwaa-executor"
    namespace = "mwaa"
  }
  role_ref {
    api_group = "rbac.authorization.k8s.io"
    kind      = "Role"
    name      = kubernetes_role.mwaa_executor.metadata[0].name
  }
  subject {
    kind      = "User"
    name      = "mwaa-executor"
    api_group = "rbac.authorization.k8s.io"
  }
}

output "configure_kubectl" {
  description = "Configure kubectl: make sure you're logged in with the correct AWS profile and run the following command to update your kubeconfig"
  value       = "aws eks --region ${local.region} update-kubeconfig --name ${module.cluster.eks_cluster_id}"
}

To create the cluster itself, we take advantage of the Amazon EKS Blueprints for Terraform project. We also define a managed node group with one node as the target size. Note that in cases with fluctuating load, scaling your cluster with Karpenter instead of the managed node group approach shown above makes the cluster scale more flexibly. We used managed node groups primarily because of the ease of configuration.

We define the identity that the Amazon MWAA execution role assumes in Kubernetes using the map_roles variable. After configuring the Terraform Kubernetes provider, we give the Amazon MWAA execution role permissions to manage pods in the cluster.

Download the EKS Blueprints for Terraform module:

terraform init

Deploy the new resources:

terraform apply

This operation takes about 12 minutes to complete.

Create IAM roles for service accounts

Create a file roles.tf with the following content:

data "aws_iam_policy_document" "source_bucket_reader" {
  statement {
    actions   = ["s3:ListBucket"]
    resources = [aws_s3_bucket.source_bucket.arn]
  }
  statement {
    actions   = ["ssm:GetParameter"]
    resources = [aws_ssm_parameter.source_bucket.arn]
  }
}

resource "aws_iam_policy" "source_bucket_reader" {
  name   = "source_bucket_reader"
  path   = "/"
  policy = data.aws_iam_policy_document.source_bucket_reader.json
}

module "irsa_source_bucket_reader" {
  source = "github.com/aws-ia/terraform-aws-eks-blueprints//modules/irsa"

  eks_cluster_id              = module.cluster.eks_cluster_id
  eks_oidc_provider_arn       = module.cluster.eks_oidc_provider_arn
  irsa_iam_policies           = [aws_iam_policy.source_bucket_reader.arn]
  kubernetes_service_account  = "source-bucket-reader-sa"
  kubernetes_namespace        = "mwaa"
  create_kubernetes_namespace = false
}

data "aws_iam_policy_document" "destination_bucket_writer" {
  statement {
    actions   = ["s3:PutObject"]
    resources = ["${aws_s3_bucket.destination_bucket.arn}/*"]
  }
  statement {
    actions   = ["ssm:GetParameter"]
    resources = [aws_ssm_parameter.destination_bucket.arn]
  }
}

resource "aws_iam_policy" "destination_bucket_writer" {
  name   = "irsa_destination_bucket_writer"
  policy = data.aws_iam_policy_document.destination_bucket_writer.json
}

module "irsa_destination_bucket_writer" {
  source = "github.com/aws-ia/terraform-aws-eks-blueprints//modules/irsa"

  eks_cluster_id              = module.cluster.eks_cluster_id
  eks_oidc_provider_arn       = module.cluster.eks_oidc_provider_arn
  irsa_iam_policies           = [aws_iam_policy.destination_bucket_writer.arn]
  kubernetes_service_account  = "destination-bucket-writer-sa"
  kubernetes_namespace        = "mwaa"
  create_kubernetes_namespace = false
}

This file defines two Kubernetes service accounts, source-bucket-reader-sa and destination-bucket-writer-sa, and their permissions against the AWS API, using IAM roles for service accounts (IRSA). Again, we use a module from the Amazon EKS Blueprints for Terraform project to simplify IRSA configuration. Note that both roles only get the minimum permissions that they need, defined using AWS IAM policies.
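
To make the IRSA mechanism more concrete, the following Python sketch builds the kind of IAM trust policy that ties a role to a single Kubernetes service account. It illustrates the general IRSA pattern and is not the exact policy that the Blueprints module generates, which may differ in detail. The function name, the account ID, and the OIDC provider ID are hypothetical placeholders.

```python
import json


def irsa_trust_policy(oidc_provider_arn: str, namespace: str, service_account: str) -> dict:
    """Build a trust policy that lets one Kubernetes service account assume a role."""
    # The OIDC issuer is the part of the provider ARN after "oidc-provider/".
    issuer = oidc_provider_arn.split(":oidc-provider/", 1)[1]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": oidc_provider_arn},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Only tokens issued for this exact service account match.
                        f"{issuer}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        f"{issuer}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


# Hypothetical provider ARN; the real value comes from module.cluster.eks_oidc_provider_arn.
example_arn = (
    "arn:aws:iam::111122223333:oidc-provider/"
    "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE"
)
policy = irsa_trust_policy(example_arn, "mwaa", "source-bucket-reader-sa")
print(json.dumps(policy, indent=2))
```

Because the condition pins the sub claim to one namespace and one service account, a pod running under destination-bucket-writer-sa cannot assume the reader role, and the reverse also holds.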

Download the new module:

terraform init

Deploy the new resources:

terraform apply

Create the DAG

Create a file dag.py defining the Airflow DAG:

from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.eks import EksPodOperator

dag = DAG(
    "dag_with_fine_grained_permissions",
    description="DAG with fine-grained permissions",
    default_args={
        "cluster_name": "data-cluster",
        "namespace": "mwaa",
        "get_logs": True,
        "is_delete_operator_pod": True,
    },
    schedule="@hourly",
    start_date=datetime(2023, 1, 1),
    catchup=False,
)

read_bucket = EksPodOperator(
    task_id="read-bucket",
    pod_name="read-bucket",
    service_account_name="source-bucket-reader-sa",
    image="amazon/aws-cli:latest",
    cmds=[
        "sh",
        "-xc",
        "aws s3api list-objects --output json --bucket $(aws ssm get-parameter --name mwaa_source_bucket --with-decryption --query 'Parameter.Value' --output text)  > /airflow/xcom/return.json",
    ],
    do_xcom_push=True,
    dag=dag,
)

write_bucket = EksPodOperator(
    task_id="write-bucket",
    pod_name="write-bucket",
    service_account_name="destination-bucket-writer-sa",
    image="amazon/aws-cli:latest",
    cmds=[
        "sh",
        "-xc",
        "echo '{{ task_instance.xcom_pull('read-bucket')|tojson }}' > list.json; aws s3 cp list.json s3://$(aws ssm get-parameter --name mwaa_destination_bucket  --with-decryption --query 'Parameter.Value' --output text)",
    ],
    dag=dag,
)

read_bucket >> write_bucket

The DAG is defined to run on an hourly schedule and contains two tasks that run one after the other: read_bucket, which uses the service account source-bucket-reader-sa, and write_bucket, which uses the service account destination-bucket-writer-sa. Both tasks use the EksPodOperator, which schedules them as pods on Amazon EKS and runs their commands in the AWS CLI Docker image. The first task lists files in the source bucket and writes the list to Airflow XCom. The second task reads the list from XCom and stores it in the destination bucket. Note that the service_account_name parameter determines what each task is permitted to do.
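
The effect of service_account_name can be illustrated with a toy permission check. The sketch below is a heavily simplified model of the two policies from roles.tf, not a reproduction of how IAM evaluates requests: it ignores explicit denies, action wildcards, and conditions. The ARNs are hypothetical placeholders, since the real bucket names receive a generated suffix.

```python
from fnmatch import fnmatchcase

# Hypothetical ARNs standing in for the resources created by Terraform.
SOURCE_BUCKET = "arn:aws:s3:::source-example"
DEST_BUCKET = "arn:aws:s3:::destination-example"
SOURCE_PARAM = "arn:aws:ssm:us-east-1:111122223333:parameter/mwaa_source_bucket"
DEST_PARAM = "arn:aws:ssm:us-east-1:111122223333:parameter/mwaa_destination_bucket"

# (action, resource pattern) pairs per service account, mirroring roles.tf.
POLICIES = {
    "source-bucket-reader-sa": [
        ("s3:ListBucket", SOURCE_BUCKET),
        ("ssm:GetParameter", SOURCE_PARAM),
    ],
    "destination-bucket-writer-sa": [
        ("s3:PutObject", DEST_BUCKET + "/*"),
        ("ssm:GetParameter", DEST_PARAM),
    ],
}


def is_allowed(service_account: str, action: str, resource: str) -> bool:
    """Toy evaluation: allow only if a statement matches both action and resource."""
    statements = POLICIES.get(service_account, [])
    return any(
        action == allowed_action and fnmatchcase(resource, allowed_resource)
        for allowed_action, allowed_resource in statements
    )


# The reader may list the source bucket but may not write to the destination bucket.
print(is_allowed("source-bucket-reader-sa", "s3:ListBucket", SOURCE_BUCKET))
print(is_allowed("source-bucket-reader-sa", "s3:PutObject", DEST_BUCKET + "/list.json"))
# The writer may put objects into the destination bucket but may not list the source.
print(is_allowed("destination-bucket-writer-sa", "s3:PutObject", DEST_BUCKET + "/list.json"))
print(is_allowed("destination-bucket-writer-sa", "s3:ListBucket", SOURCE_BUCKET))
```

In this model, a task that runs under the wrong service account, or under one with no policy attached, is denied every call it needs, including the ssm:GetParameter lookup for its bucket name.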

Create a file dag.tf to upload the DAG code to the Amazon MWAA configuration bucket:

locals {
  dag_filename = "dag.py"
}

resource "aws_s3_object" "dag" {
  bucket = module.mwaa.aws_s3_bucket_name
  key    = "dags/${local.dag_filename}"
  source = local.dag_filename

  etag = filemd5(local.dag_filename)
}
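
The etag argument in the resource above is set from filemd5, so that a change to the local dag.py changes the value and Terraform uploads the file again on the next apply. As a rough illustration of that change detection, the following Python sketch computes the same kind of digest with the standard library. The helper mirrors the Terraform function by name only, and the file contents are made up for the example.

```python
import hashlib
import tempfile
from pathlib import Path


def filemd5(path: str) -> str:
    """Hex MD5 digest of a file's bytes, analogous to Terraform's filemd5()."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


with tempfile.TemporaryDirectory() as tmp:
    dag_file = Path(tmp) / "dag.py"

    # First version of the (made-up) file content.
    dag_file.write_text("schedule = '@hourly'\n")
    before = filemd5(str(dag_file))

    # Any edit to the file yields a different digest.
    dag_file.write_text("schedule = '@daily'\n")
    after = filemd5(str(dag_file))

print("digest changed:", before != after)
```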

Deploy the changes:

terraform apply

The Amazon MWAA environment automatically imports the file from the S3 bucket.

Run the DAG

In your browser, navigate to the Amazon MWAA console and select your environment. In the top right-hand corner, select Open Airflow UI. You should see the following:

Screenshot of the MWAA user interface

To trigger the DAG, in the Actions column, select the play symbol and then select Trigger DAG. Click on the DAG name to explore the DAG run and its results.

Navigate to the Amazon S3 console and choose the bucket starting with “destination”. It should contain a file list.json recently created by the write_bucket task. Download the file to explore its content, a JSON list with a single entry.

Clean up

The resources you created in this walkthrough incur AWS costs. To delete the created resources, issue the following command:

terraform destroy

Then approve the changes in the Terraform CLI dialog.

Conclusion

In this blog post, you learned how to improve the security of your data pipeline running on Amazon MWAA and Amazon EKS by narrowing the permissions of each individual task.

To dive deeper, use the working example created in this walkthrough to explore the topic further: What happens if you remove the service_account_name parameter from an Airflow task? What happens if you exchange the service account names in the two tasks?

For simplicity, in this walkthrough we used a flat file structure with Terraform and Python files inside a single directory. We did not adhere to the standard module structure proposed by Terraform, which is generally recommended. In a real-life project, splitting up the project into multiple Terraform projects or modules may also increase flexibility, speed, and independence between teams owning different parts of the infrastructure.

Lastly, make sure to study the Data on EKS documentation, which provides other valuable resources for running your data pipeline on Amazon EKS, as well as the Amazon MWAA and Apache Airflow documentation for implementing your own use cases. Specifically, have a look at this sample implementation of a Terraform module for Amazon MWAA and Amazon EKS, which contains a more mature approach to Amazon EKS configuration and automatic node scaling, as well as networking.

If you have any questions, you can start a new thread on AWS re:Post or reach out to AWS Support.

About the Authors

Ulrich Hinze is a Solutions Architect at AWS. He partners with software companies to architect and implement cloud-based solutions on AWS. Before joining AWS, he worked for AWS customers and partners in software engineering, consulting, and architecture roles for 8+ years.

Patrick Oberherr is a Staff Data Engineer at Contentful with 4+ years of working with AWS and 10+ years in the data field. At Contentful, he is responsible for the infrastructure and operations of the data stack, which is hosted on AWS.

Johannes Günther is a cloud & data consultant at Netlight with 5+ years of working with AWS. He has helped clients across various industries design sustainable cloud platforms and is AWS certified.