Ending TLS Client Authentication Certificate Support in 2026

Post Syndicated from Let's Encrypt original https://letsencrypt.org/2025/05/14/ending-tls-client-authentication.html

Let’s Encrypt will no longer include the “TLS Client Authentication” Extended Key Usage (EKU) in our certificates beginning in 2026. Most users who use Let’s Encrypt to secure websites won’t be affected and won’t need to take any action. However, if you use Let’s Encrypt certificates as client certificates to authenticate to a server, this change may impact you.

To minimize disruption, Let’s Encrypt will roll this change out in multiple stages, using ACME Profiles:

  • Today: Let’s Encrypt already excludes the Client Authentication EKU on our tlsserver ACME profile. You can verify compatibility by issuing certificates with this profile now.
  • October 1, 2025: Let’s Encrypt will launch a new tlsclient ACME profile which will retain the TLS Client Authentication EKU. Users who need additional time to migrate can opt in to this profile.
  • February 11, 2026: the default classic ACME profile will no longer contain the Client Authentication EKU.
  • May 13, 2026: the tlsclient ACME profile will no longer be available and no further certificates with the Client Authentication EKU will be issued.

Once this is completed, Let’s Encrypt will switch to issuing with new intermediate Certificate Authorities which also do not contain the TLS Client Authentication EKU.
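As a planning aid, the staged rollout above can be written down as a small date lookup. This is only a sketch: the function name is hypothetical, and it assumes each change takes effect at the start of the stated date, which the announcement does not spell out.

```python
from datetime import date

# Dates copied from the rollout schedule in this post.
TLSCLIENT_LAUNCH = date(2025, 10, 1)
CLASSIC_DROPS_CLIENT_AUTH = date(2026, 2, 11)
TLSCLIENT_RETIRED = date(2026, 5, 13)


def profiles_with_client_auth(day: date) -> list[str]:
    """Return the ACME profiles expected to still carry the Client Authentication EKU on `day`.

    Assumption: each change applies from the start of the listed date.
    The tlsserver profile never appears because it already excludes the EKU.
    """
    profiles = []
    if day < CLASSIC_DROPS_CLIENT_AUTH:
        profiles.append("classic")
    if TLSCLIENT_LAUNCH <= day < TLSCLIENT_RETIRED:
        profiles.append("tlsclient")
    return profiles


if __name__ == "__main__":
    for day in (date(2025, 6, 1), date(2025, 12, 1), date(2026, 3, 1), date(2026, 6, 1)):
        print(day.isoformat(), profiles_with_client_auth(day))
```

An empty result means no Let’s Encrypt profile would issue a certificate usable for client authentication on that date.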

For some background information, all certificates include a list of intended uses, known as Extended Key Usages (EKU). Let’s Encrypt certificates have included two EKUs: TLS Server Authentication and TLS Client Authentication.

  • TLS Server Authentication is used to authenticate connections to TLS Servers, like websites.
  • TLS Client Authentication is used by clients to authenticate themselves to a server. This feature is not typically used on the web, and is not required on the certificates used on a website.

After this change is complete, only TLS Server Authentication will be available from Let’s Encrypt.

This change is prompted by changes to Google Chrome’s root program requirements, which impose a June 2026 deadline to split TLS Client and Server Authentication into separate PKIs. Many uses of client authentication are better served by a private certificate authority, and so Let’s Encrypt is discontinuing support for TLS Client Authentication ahead of this deadline.

Protect against advanced DNS threats with Amazon Route 53 Resolver DNS Firewall

Post Syndicated from Lawton Pittenger original https://aws.amazon.com/blogs/security/protect-against-advanced-dns-threats-with-amazon-route-53-resolver-dns-firewall/

Every day, millions of applications seamlessly connect users to the digital services they need through DNS queries. These queries act as an interface to the internet’s address book, translating familiar domain names like amazon.com into the IP addresses that computers use to appropriately route traffic. The DNS landscape presents unique security challenges and opportunities in Amazon Virtual Private Cloud (Amazon VPC) environments. First, DNS resolution acts as an early checkpoint that you can use to control network traffic before it even begins. Second, DNS queries in your VPC follow a distinct path through the Amazon Route 53 Resolver that operates independently from your standard internet gateway, bypassing other network security controls.

To address this, Amazon Route 53 Resolver DNS Firewall provides protection for DNS traffic, starting with traditional domain lists where you can explicitly allow or deny DNS resolution of specific domains. Also included are AWS Managed Domain Lists, which automatically block known malicious domains identified through Amazon Threat Intelligence and our trusted security partners. While this approach works effectively to help prevent known threats, sophisticated bad actors are increasingly using techniques that traditional blocklists can’t catch.

Instead of relying solely on static lists, Amazon Route 53 Resolver DNS Firewall Advanced provides intelligent protection alongside these traditional controls. These advanced rules work like a skilled security analyst, watching for suspicious patterns in DNS queries in real time. By examining characteristics such as query length, entropy, and frequency, the service can spot potentially malicious activity even when encountering previously unknown domains. This approach enables detecting and blocking advanced threats like DNS tunneling and domain generation algorithms (DGAs)—techniques that bad actors use to establish hidden communication channels or connect malware to their control servers.

In this post, we take you on a practical journey exploring these DNS-based threats and tools to help prevent them. You’ll learn how to set up effective Route 53 Resolver DNS Firewall Advanced rules, and we provide a ready-to-deploy CloudFormation template with our recommended configurations. Finally, we demonstrate an example of real-world threat detection and show you how the service integrates with AWS Security Hub to improve visibility of alerts. By the time you finish reading this post, you’ll have a clear understanding of how to deploy Route 53 Resolver DNS Firewall rules to add an intelligent, proactive layer of security to your AWS environment.

Understanding the risks of DNS tunneling and DGAs

As mentioned earlier, the Route 53 Resolver provides a service-managed path to the internet that operates independently from your VPC’s internet gateway. While this architecture enables efficient DNS resolution, it can be exploited through techniques such as DNS tunneling. Let’s explore how these techniques work and why they present unique challenges.

DNS tunneling takes advantage of the DNS protocol’s basic function: asking questions about domain names and receiving answers from the authoritative nameserver for the domain. But instead of using DNS for its intended purpose of domain name resolution, tunneling encodes other types of data within DNS queries and responses. For example, rather than simply asking “What is the IP address for example.com?”, a tunneling exploit might embed data within a query like secretdata123.attacker.com, where secretdata123 contains encoded information. This can turn DNS into a two-way command and control channel. Detecting and blocking DNS tunneling is a vital control for stopping data exfiltration and command and control (C2) communications.

DGAs represent a different challenge for DNS security. Rather than using a fixed, predictable domain name that can be quickly blocked, DGAs automatically create many possible domain names using mathematical formulas, which are then used as a destination for C2 traffic. For instance, a DGA might generate domains like xkt7py.com today and mn9qrs.com tomorrow. This makes it difficult to maintain effective blocklists, because the domains change frequently and appear random. Traditional threat intelligence feeds, which rely on identifying and blocking known malicious domains, struggle to keep pace with DGA-generated domains.
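To make the "appear random" point concrete, the following sketch computes two simple signals for a query name: the length of its longest label and that label's Shannon entropy. It is an illustration only, not the algorithm that DNS Firewall Advanced uses, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of `text` in bits per character (0.0 for an empty string)."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def label_stats(query_name: str) -> dict:
    """Summarize a DNS name by its label count and its longest label."""
    labels = [label for label in query_name.rstrip(".").split(".") if label]
    if not labels:
        return {"labels": 0, "longest_label": 0, "entropy": 0.0}
    longest = max(labels, key=len)
    return {
        "labels": len(labels),
        "longest_label": len(longest),
        # DNS names are case-insensitive, so compare on the lowercased label.
        "entropy": shannon_entropy(longest.lower()),
    }


if __name__ == "__main__":
    for name in ("example.com", "secretdata123.attacker.com", "xkt7py.com"):
        print(name, label_stats(name))
```

Long, high-entropy labels are only a rough hint. Short DGA names such as xkt7py.com score low on both measures, which is one reason a real detector needs more than a single statistic.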

How DNS Firewall Advanced works

When examining a domain name, Route 53 Resolver DNS Firewall Advanced looks at multiple characteristics that help distinguish between legitimate and suspicious domains. For example, legitimate domain names typically use real words and follow predictable patterns that are designed to facilitate a human’s ability to recall and enter them accurately. In contrast, domains used for tunneling or generated by DGAs often contain random-looking strings of characters or unusual patterns.

Route 53 Resolver DNS Firewall Advanced builds its intelligence on extensive analysis of real-world domain usage patterns. It learns what legitimate domain names look like by studying the most resolved domains on the internet, combined with actual domain resolution patterns from across AWS. This real-world training data helps establish a baseline for normal domain name characteristics. DNS Firewall Advanced then contrasts these patterns against known techniques used in DNS tunneling and domain generation to identify suspicious activity.

The service analyzes various aspects of each domain name, including:

  • How the domain name is structured and broken into parts
  • The patterns of letters and numbers used
  • How closely the domain resembles natural language
  • The presence of common words versus random character combinations

The service analyzes queries in real time, processing each one in less than a millisecond, which maintains strong security controls without affecting your applications’ performance.

Route 53 Resolver DNS Firewall Advanced has customized protection levels that you can use to choose how aggressively you want to detect and respond to suspicious domains through confidence thresholds:

  • High confidence: This setting focuses on the most obvious threats, minimizing false positives. It’s ideal for production environments where blocking legitimate traffic could be disruptive.
  • Medium confidence: Provides balanced protection, suitable for most environments.
  • Low confidence: Offers the most detection but might require more tuning to avoid false positives. This setting is useful for high-security environments or for initial monitoring to understand traffic patterns.

You can combine these confidence levels with different actions (block or alert) to create a defense strategy that matches your security needs.

Manually create a DNS Firewall Advanced rule

To start, we show you how to manually create a Route 53 Resolver DNS Firewall Advanced rule in the AWS Management Console. This rule will block DNS queries that it has detected to be DNS tunneling with high confidence.

To manually create a rule:

  1. In the Route 53 console, choose Rules in the navigation pane, and then choose Add rule.
    Figure 1: Rules in the Route 53 console

  2. Enter a name for the rule and select DNS Firewall Advanced protections.
    Figure 2: Add a rule

  3. Under DNS Firewall Advanced protection:
    1. Select DNS tunneling detection.
    2. For Confidence threshold, select High.
    3. Leave the Query type empty so that the rule applies to all query types.
    Figure 3: Select DNS protection options

  4. Under Action:
    1. Select Block.
    2. For the response, select OVERRIDE.
    3. For the Record value, enter dns-firewall-advanced-block.
    4. For the Record type, select CNAME.
    5. Choose Add rule.
    Figure 4: Configure actions for the rule

We’ve created an AWS CloudFormation template that deploys the recommended Route 53 Resolver DNS Firewall rules listed below in a DNS Firewall rule group. We recommend this configuration because it provides a balanced security approach: it blocks high-confidence threats immediately while generating alerts for lower-confidence detections.

The inclusion of the AWS Managed Aggregate Threat List is particularly valuable because it combines domains from multiple threat categories (malware, ransomware, botnet, spyware, and DNS tunneling) into a blocklist. This consolidated list includes the domains from other AWS Managed Domain Lists, including those identified by GuardDuty threat intelligence systems, giving you broad protection against known malicious domains while the Route 53 DNS Firewall Advanced rules catch previously unseen threats.

For enterprise environments, you can scale this protection across your entire organization by using AWS Firewall Manager to automatically deploy and manage this rule group configuration consistently across the VPCs in your organization.

  • BLOCK – Aggregate Threat List (domains associated with multiple DNS threat categories including malware, ransomware, botnet, spyware, and DNS tunneling to help block multiple types of threats)
  • BLOCK – DNS Tunneling | Confidence: HIGH
  • BLOCK – DGAs | Confidence: HIGH
  • ALERT – DNS Tunneling | Confidence: LOW
  • ALERT – DGAs | Confidence: LOW
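If you prefer scripting to the console or the template, the four DNS Firewall Advanced rules in this list could be expressed as request parameters for the Route 53 Resolver CreateFirewallRule API. The sketch below only builds the parameter dictionaries. The priorities, rule names, TTL of 0, and the rule group ID in the usage note are assumptions, and the parameter names should be checked against the current boto3 documentation before use. The Aggregate Threat List rule is omitted because it references a managed domain list ID rather than a threat protection type.

```python
import uuid

# Block response mirroring the console steps earlier in this post.
# The TTL is an assumed value; the post does not specify one.
BLOCK_OVERRIDE = {
    "BlockResponse": "OVERRIDE",
    "BlockOverrideDomain": "dns-firewall-advanced-block",
    "BlockOverrideDnsType": "CNAME",
    "BlockOverrideTtl": 0,
}

# (action, threat protection, confidence threshold), in evaluation order.
RECOMMENDED_ADVANCED_RULES = [
    ("BLOCK", "DNS_TUNNELING", "HIGH"),
    ("BLOCK", "DGA", "HIGH"),
    ("ALERT", "DNS_TUNNELING", "LOW"),
    ("ALERT", "DGA", "LOW"),
]


def advanced_rule_kwargs(rule_group_id, priority, action, protection, confidence):
    """Build keyword arguments for one DNS Firewall Advanced rule."""
    kwargs = {
        "CreatorRequestId": str(uuid.uuid4()),
        "FirewallRuleGroupId": rule_group_id,
        "Name": f"{action}-{protection}-{confidence}".lower().replace("_", "-"),
        "Priority": priority,
        "Action": action,
        "DnsThreatProtection": protection,
        "ConfidenceThreshold": confidence,
    }
    if action == "BLOCK":
        kwargs.update(BLOCK_OVERRIDE)
    return kwargs


def recommended_advanced_rules(rule_group_id, first_priority=2):
    """Return kwargs for the four recommended rules, leaving priority 1 free
    for the Aggregate Threat List rule (an assumed numbering)."""
    return [
        advanced_rule_kwargs(rule_group_id, first_priority + offset, *rule)
        for offset, rule in enumerate(RECOMMENDED_ADVANCED_RULES)
    ]
```

Each dictionary could then be passed to boto3.client("route53resolver").create_firewall_rule(**kwargs), using the ID of an existing rule group in place of a placeholder such as "rslvr-frg-EXAMPLE". Placing the BLOCK rules ahead of the ALERT rules keeps high-confidence detections from being handled by the broader alert-only rules first.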

To deploy this rule group using a CloudFormation stack:

  1. Navigate to the CloudFormation console, choose Stacks from the navigation pane. Choose Create Stack in the upper right and select With new resources (standard).
    Figure 5: Create a stack

    Figure 5: Create a stack

  2. Download the CloudFormation template. Select Choose an existing template, select Upload a template file, and then upload the template. Choose Next.
    Figure 6: Use the CloudFormation template

    Figure 6: Use the CloudFormation template

  3. Enter a stack name and choose Next.
    Figure 7: Enter a stack name

    Figure 7: Enter a stack name

  4. Leave the default values for all options, select Next, and then choose Submit.
  5. Navigate to the Route 53 Resolver DNS Firewall by visiting the Amazon VPC console, scroll down to the DNS firewall section, and select the Rule groups tab.
  6. Select the newly created rule group.
  7. Select the Associated VPCs tab, choose Associate VPC, and then associate a VPC you want to protect and choose Associate.
    Figure 8: Associate a VPC

Observability

Route 53 Resolver query logging provides detailed visibility into DNS queries made from resources associated with your VPCs, enabling you to monitor and analyze your DNS traffic for security and compliance purposes. By configuring query logging, you can capture essential information about each DNS request, including the domain name being queried, the record type, the response code, and the originating VPC and instance. Query logging is particularly valuable when used in conjunction with Route 53 Resolver DNS Firewall, because it helps you track blocked queries and fine-tune your security rules based on actual DNS traffic patterns in your environment. The following are examples of log entries generated when DNS Firewall detects and responds to suspicious activities, showing the detailed information available for security analysis and incident response.

Example log entry: DNS tunneling block

The following is an example of a DNS tunneling block.

{
    "version": "1.100000",
    "account_id": "11111111111",
    "region": "us-west-2",
    "vpc_id": "vpc-0fcc85bd45b791d5a",
    "query_timestamp": "2025-02-05T03:54:12Z",
    "query_name": "1WTE4CyL4Vf1LQDDAToimuqFBEtMXyYMsYP8zPgVyTagzSh5PvinuQcL6N8at4A.REZv3VqKU4x43DPcCKAzQk4UKoZjB3nDMukHAuKTtDckTqZ8SDDZ1iXRey6a5sD.mEDMdrzPocS9exqoBQ1xfSuKfvW.1.dnstunnel.com.",
    "query_type": "A",
    "query_class": "IN",
    "rcode": "NXDOMAIN",
    "answers": [
        {
            "Rdata": "dns-firewall-advanced-block.",
            "Type": "CNAME",
            "Class": "IN"
        }
    ],
    "srcaddr": "10.1.0.122",
    "srcport": "41859",
    "transport": "UDP",
    "srcids": {
        "instance": "i-0c738190f19db9a2c"
    },
    "firewall_rule_action": "BLOCK",
    "firewall_rule_group_id": "rslvr-frg-63efa138b43f428b",
    "firewall_protection": "DNS_TUNNELING"
}

Example log entry: DNS tunneling alert

The following is an example of a DNS tunneling alert.

{
    "version": "1.100000",
    "account_id": "11111111111",
    "region": "us-west-2",
    "vpc_id": "vpc-0fcc85bd45b791d5a",
    "query_timestamp": "2025-02-05T04:00:02Z",
    "query_name": "1WTEc8GwFH3qHY8XKjbhXuj43yGShMrhacqwJYSZkSqRQ95sagz64NUpnuj4R8R.S79aru2KRB8d9nCHEPdXWJxGT4aUjVMqtCRSq9EZXRCo8NH5cmLvmcho3hh1mbK.NqGY1X6M4qpMGX6dnTSHuCsZFbf.1.dnstunnel.com.",
    "query_type": "A",
    "query_class": "IN",
    "rcode": "NOERROR",
    "answers": [
        {
            "Rdata": "202.92.34.217",
            "Type": "A",
            "Class": "IN"
        }
    ],
    "srcaddr": "10.1.0.122",
    "srcport": "35116",
    "transport": "UDP",
    "srcids": {
        "instance": "i-0c738190f19db9a2c",
        "resolver_endpoint": "rslvr-out-e20639d3666748f58"
    },
    "firewall_rule_action": "ALERT",
    "firewall_rule_group_id": "rslvr-frg-63efa138b43f428b",
    "firewall_protection": "DNS_TUNNELING"
}
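Log entries like the two above lend themselves to simple triage scripts. The following sketch counts firewall hits by action, protection type, and source. It assumes the logs have been exported with one JSON object per line, which depends on your logging destination, and the function name is hypothetical.

```python
import json
from collections import Counter


def summarize_firewall_hits(log_lines):
    """Count DNS Firewall hits per (action, protection, source).

    Entries without a firewall_rule_action field are ordinary queries and are skipped.
    """
    hits = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        action = entry.get("firewall_rule_action")
        if action is None:
            continue
        # Domain-list rules carry no firewall_protection field, so label them separately.
        protection = entry.get("firewall_protection", "DOMAIN_LIST")
        # Prefer the instance ID; fall back to the source IP address.
        source = entry.get("srcids", {}).get("instance") or entry.get("srcaddr", "unknown")
        hits[(action, protection, source)] += 1
    return hits


if __name__ == "__main__":
    sample = [
        json.dumps({"firewall_rule_action": "BLOCK", "firewall_protection": "DNS_TUNNELING",
                    "srcaddr": "10.1.0.122", "srcids": {"instance": "i-0c738190f19db9a2c"}}),
        json.dumps({"query_name": "example.com.", "srcaddr": "10.1.0.5"}),
    ]
    for key, count in summarize_firewall_hits(sample).items():
        print(key, count)
```

Grouping by source makes it easier to spot a single instance that generates most of the blocked or alerted queries.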

Integration with Security Hub

Security Hub provides you with a view of your security state in AWS and helps you to check your environment against security industry standards and best practices. Security Hub collects security data from across AWS accounts, AWS services, and supported third-party partner products, and helps you to analyze security trends and identify the highest priority security issues. It enables findings from both the Amazon: Route 53 Resolver DNS Firewall – AWS List and Amazon: Route 53 Resolver DNS Firewall Advanced list by default, so you’ll automatically receive these alerts without additional configuration. You only need to manually enable Amazon: Route 53 Resolver DNS Firewall – Custom List findings if you’re using custom domain lists in your rule groups. See Sending findings from Route 53 Resolver DNS Firewall to Security Hub for more information.

The following figure is an example of how Route 53 Resolver DNS Firewall Advanced findings appear in the Security Hub console, providing you with actionable security intelligence directly in your centralized dashboard.

Figure 9: DNS Firewall Advanced findings in Security Hub

Select a finding to view details such as Finding ID, Types, Workflow status, and so on.

Figure 10: Findings details

Conclusion

Amazon Route 53 Resolver DNS Firewall Advanced represents a significant step forward in protecting organizations against sophisticated DNS-based threats. As mentioned, DNS queries sent to the Route 53 Resolver follow a unique path that bypasses traditional AWS security controls like security groups, NACLs, and even AWS Network Firewall—creating a security gap in many environments. Throughout this post, we’ve explored how DNS tunneling and DGA-based exploits take advantage of this blind spot, and how you can use Route 53 Resolver DNS Firewall Advanced to protect from these threats through real-time pattern analysis and anomaly detection. You learned how to configure the service in the AWS console and use the provided CloudFormation template with recommended rules that balance blocking high-confidence threats while alerting on potential issues. And you saw how query logging provides valuable visibility into your DNS traffic and how Security Hub integration centralizes your security findings. Implementing these capabilities helps you protect your infrastructure from sophisticated DNS-based exploits that traditional domain blocklists cannot catch, strengthening your cloud security posture while maintaining operational efficiency.

Lawton Pittenger

Lawton is a Security Solutions Architect at AWS, based in New York City, focused on helping customers implement native AWS security services. Professionally, Lawton has worked in IT security roles, securing cloud environments. Outside of cloud security, his interests include skateboarding, snowboarding, and rock climbing.

Michael Leighty

Michael is a Senior Security Solutions Architect at AWS, based in Atlanta. He specializes in helping customers design and implement effective network security controls, drawing from extensive experience at leading network security vendors. At AWS, he works closely with service teams to drive continuous improvement in security services based on customer needs and feedback.

SMS Onboarding for SaaS, ISV, and Multi-Tenant Applications with AWS End User Messaging

Post Syndicated from Tyler Holmes original https://aws.amazon.com/blogs/messaging-and-targeting/sms-onboarding-for-saas-isv-and-multi-tenant-applications-with-aws-end-user-messaging/

Introduction

SMS messaging continues to be one of the most reliable and effective communication channels. However, for Software as a Service (SaaS) companies, Independent Software Vendors (ISVs), and multi-tenant solution providers looking to incorporate SMS capabilities into their offerings, the journey can be complex and filled with challenges.

This guide is specifically designed for technology providers—whether you’re a SaaS company, an ISV, or any platform that enables your customers to send SMS messages to their end users. Throughout this article, the following terminology will be used:

  • Provider: An organization offering SMS capabilities as part of its product or service
  • Customer: The entities using Provider technology to send SMS messages
  • End User: The recipients who opt in to receive SMS messages from Customers

The landscape of SMS implementation can be complicated, with varying country-specific regulations, lengthy registration processes that can take weeks or even months, different originator types (Long Code, Short Code, Sender ID, etc.) with unique capabilities, and the diverse needs of Customers and End Users. These challenges are amplified when you’re a Provider offering SMS services to your own Customers, who in turn serve their End Users.

By the end of this guide, you’ll understand:

  • How opt-in influences architecture
  • Options for how to structure your SMS offering to Customers
  • Strategies for reducing friction in the SMS implementation process

Let’s dive in.

The Registration Dilemma: Who Owns the Relationship?

One of the most critical decisions for your SMS Originator registration is determining whose information is used to apply. The biggest mistake AWS sees Providers make is not knowing how their relationship with their Customers and their Customers’ End Users affects their architecture and how they complete any registrations that are necessary.

Mobile carriers want to know who will be sending SMS to their customers, how that entity will opt them in, and what content they will be sending. When registering for originators, especially in the United States, you will need to succinctly explain how End Users will opt in and how that data will not be shared with any third parties. Your architecture must be able to support both of those statements.

AWS consistently sees Providers register themselves when obtaining an Originator even though they do not have a relationship with their Customers’ End Users. The decision of whose information belongs in the registration hinges primarily on a fundamental question: Who does the End User believe they’re entering into a relationship with when they provide their phone number?

The most common scenarios are below:

Scenario 1: End Users interact with the Customer’s brand only

In most cases, End Users are completely unaware of your existence as the Provider. They believe they’re opting in to receive messages from your Customer directly. In this scenario:

  • Registration should be completed using the Customer’s information. There are many ways you can facilitate this process, and some ways to reduce this common friction point are discussed later in this post.
  • Messages should appear to come from the Customer, not the Provider. Your service name should not appear in messaging.

Scenario 2: End Users explicitly opt in through the Provider application

In some cases, End Users clearly understand they’re opting in to receive messages via your technology platform, on behalf of your Customer. The opt-in data will not be shared with your Customers, and your brand, as the Provider, will be the named entity in all SMS messages sent.

There are a number of ways that this can happen:

  • End Users could opt in using a widget you build that your Customers install on their site or in their app
  • End Users could opt in using a paper form or verbal script that you supply and that clearly identifies you, the Provider

AWS commonly sees this occurring with Providers that supply:

  • Third-party payment processing
  • Shipping and logistics support
  • Customer service platforms
  • One-Time Password (OTP) capabilities

In this scenario your company name would typically appear in the messaging and registration would use your company information.

NOTE: There are edge cases beyond these two scenarios, and implementing them can be complicated. If you are a Provider and you don’t think that you fit into either scenario above, reach out to your Account Manager, open a case, or speak to a specialist before you start implementing anything.
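The two scenarios and the note above can be condensed into a small decision helper. This is a simplification for illustration: the function, its two inputs, and the enum are hypothetical, and real registrations involve more factors than these two questions.

```python
from enum import Enum


class Registrant(Enum):
    CUSTOMER = "register with the Customer's information"
    PROVIDER = "register with the Provider's information"
    NEEDS_REVIEW = "edge case: talk to your Account Manager or a specialist first"


def choose_registrant(end_user_opts_in_to_provider: bool,
                      opt_in_data_shared_with_customer: bool) -> Registrant:
    """Pick whose information belongs in the originator registration.

    end_user_opts_in_to_provider: True when the End User knowingly opts in
        through the Provider's branded experience (Scenario 2).
    opt_in_data_shared_with_customer: True when the opt-in data is passed on
        to the Customer.
    """
    if not end_user_opts_in_to_provider:
        # Scenario 1: the End User only knows the Customer's brand.
        return Registrant.CUSTOMER
    if not opt_in_data_shared_with_customer:
        # Scenario 2: the Provider is the named entity and keeps the opt-in data.
        return Registrant.PROVIDER
    # Provider-branded opt-in with shared data fits neither scenario cleanly.
    return Registrant.NEEDS_REVIEW


if __name__ == "__main__":
    print(choose_registrant(False, True).value)
    print(choose_registrant(True, False).value)
    print(choose_registrant(True, True).value)
```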

Architectural Models for SMS Implementation

Let’s explore various architectural models for structuring your SMS offering based on your business needs and Customer relationships. Each model is described in three areas: who does the registration and configuration, the Customer’s responsibilities, and the Provider’s responsibilities.

1. “Bring Your Own AWS Account” Model

Who does the registration and configuration?

  • The Customer connects their own AWS account, so the registration and any configuration happens in the Customer account.
  • In this scenario, the information entered in the registration is usually the Customer’s, because it’s their account.

Customer responsibilities:

  • Customer handles all registration and configuration requirements themselves
  • Customer integrates their account with the Provider service
  • Customer manages sending, opt-out lists, etc.
  • Customer pays the AWS bill

Provider responsibilities:

  • The Provider offers a user-friendly interface that calls the AWS End User Messaging Service APIs using the Customer’s credentials.
  • The depth of services offered by the Provider can vary

Best for: Technical Customers who want full control and already use AWS; Providers who want to avoid registration and configuration complexities.

2. Provider Account – Manual Registration and Configuration

Who does the registration and configuration?

  • The Provider owns the account and does not give the Customer a way to submit their own information, so the Provider must enter it
  • The Customer’s information is captured manually
  • The Provider handles the complexity of registration and configuration through the console

Customer responsibilities:

  • Provide necessary information to the Provider for registration purposes

Provider responsibilities:

  • Captures the registration information manually from Customers.
  • Manages the complexity on behalf of your Customers.

This can be implemented either with separate AWS accounts for each Customer or a multi-tenant architecture in a single account.

Best for: Providers with a small number of high-value Customers who need hand-holding through the SMS implementation process.

3. Semi-Automated Solution – Customer Sending

Who does the registration and configuration?

  • The Provider builds a way for the Customer to submit their registration information, which the Provider then programmatically submits to carriers/regulators.

Customer responsibilities:

  • Your platform manages the technical configuration and provides sending capabilities, but the Customer is responsible for maintaining compliance.

Provider responsibilities:

  • You provide a streamlined way for Customers to submit registration information (webhooks, forms, APIs).
  • You programmatically submit the registration data to carriers/regulators.
  • You manage the technical configuration and provide sending capabilities.

Best for: Providers with moderate technical sophistication who want to reduce friction while maintaining separation of regulatory responsibilities.

4. Fully Automated Solution – Provider Sending

Who does the registration and configuration?

  • The Customer’s information is used in the registration, which you handle programmatically.

Customer responsibilities:

  • You handle all technical aspects of registration, but the Customer is still responsible for maintaining messaging compliance.

Provider responsibilities:

  • You provide hosted, customizable Terms & Conditions and Privacy Policies for each Customer that are compliant out of the box.
  • You offer compliant opt-in pathways (web forms, verbal scripts, etc.).
  • You handle all technical aspects of registration.

Best for: Large-scale Providers serving many Customers with varying levels of technical sophistication.

5. Template-Restricted Fully Automated Messaging

Who does the registration and configuration?

  • The Customer’s information is used in the registration, which you handle programmatically.

Customer responsibilities:

  • You manage all regulatory compliance centrally, and the Customer can only personalize specific fields in pre-approved message templates.

Provider responsibilities:

  • You provide a suite of pre-approved message templates.
  • You manage all regulatory compliance centrally.
  • You simplify the registration process since the content is tightly controlled.

Best for: Use cases with predictable messaging needs like appointment reminders, shipping notifications, or one-time passwords.

6. Fully Managed Programs

Who does the registration and configuration?

  • The Customer authorizes you to send messages on their behalf, so you own the relationship with the End User and the registration.

Customer responsibilities:

  • Only required to give you the information you need to send messages to End Users. This could be tracking numbers or other details that the particular use case requires and that are part of the allowed personalization.

Provider responsibilities:

  • You manage all aspects of the End User relationship.
  • You control the entire messaging experience, including opt-in collection and the End User relationship.

Example: A shipping notification service might send messages like: “ShipTrack: Your order from ACME Corp will arrive tomorrow. Track at [link]”

Best for: Specialized use cases where your platform adds significant value as an identified intermediary.

Shaping Your SMS Offering: Strategic Considerations

Pricing Strategies

When incorporating SMS into your product, one of the first considerations is how to structure your pricing. Unlike many digital services with predictable costs, SMS pricing varies significantly based on destination country, originator type, and volume.

AWS End User Messaging Service bills based on volume sent per country, with each country having its own price point. This pricing is determined by the recipient’s handset country code, not their physical location. This means that even if you primarily serve U.S.-based Customers, you may need to account for international rates when recipients have non-U.S. phone numbers.

There are also one-time and ongoing fees to be accounted for. Registrations often have one-time processing fees and Originators can have leasing costs that range from free to more than $1,000 a month for short codes in some countries. Make sure that you think through how those costs will or will not be passed to your Customers.

As you design your pricing model, consider these common volume-based approaches:

  • SMS Credits: Create a standardized credit system where Customers purchase credits regardless of destination country. You would internally manage the conversion between credits and actual costs.
  • Dollar-Based Allocation: Provide Customers with a budget that gets depleted based on actual costs per message sent.
  • Tiered Country Pricing: Group countries into tiers (e.g., Tier 1 for North America, Tier 2 for Western Europe) with different pricing for each tier.
  • Bundled Messaging: Include a certain number of messages in your base subscription with overage fees for additional messages.

Each approach has trade-offs in terms of simplicity, transparency, and risk management. Your decision should align with your overall business model and Customer expectations.
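To make the trade-offs more concrete, here is a minimal Python sketch that combines two of the approaches above: tiered country pricing, with a credit system layered on top of it. The tier assignments, per-message prices, and credit value are invented for illustration and are not actual AWS End User Messaging rates.

```python
from decimal import ROUND_CEILING, Decimal

# Hypothetical tiers and prices, for illustration only (not real AWS rates).
COUNTRY_TIER = {"US": 1, "CA": 1, "GB": 2, "DE": 2}
TIER_PRICE_USD = {1: Decimal("0.01"), 2: Decimal("0.05"), 3: Decimal("0.10")}
DEFAULT_TIER = 3                     # countries not listed fall into the top tier
CREDIT_VALUE_USD = Decimal("0.01")   # assumed value of one credit


def price_per_message(country_code: str) -> Decimal:
    """Tiered country pricing: Customer-facing price for one destination."""
    tier = COUNTRY_TIER.get(country_code.upper(), DEFAULT_TIER)
    return TIER_PRICE_USD[tier]


def credits_for_send(message_counts: dict[str, int]) -> int:
    """SMS credits: convert per-country message counts into whole credits."""
    total = sum(price_per_message(c) * n for c, n in message_counts.items())
    # Round up so that a fractional credit is never given away.
    return int((total / CREDIT_VALUE_USD).to_integral_value(rounding=ROUND_CEILING))


# 100 US messages, 10 UK messages, and 1 message to an unlisted country.
print(credits_for_send({"US": 100, "GB": 10, "BR": 1}))
```

Using `Decimal` rather than floats keeps the credit conversion exact, which matters once these numbers feed into billing.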

Geographic Considerations

Different countries have distinct regulatory requirements for SMS messaging, including:

  • Originator Support: Not all countries support all originator types; view the details here
  • Originator Selection: In cases where multiple types of originators are supported, how do you support your Customer in selecting the right originator for the right use case?
  • Read through this tutorial to help decide what originator(s) are right for your use case(s)
  • Registration: An increasing number of countries require you to register before being allowed to send
  • Quiet hours: Many countries restrict when promotional messages can be sent
  • Content restrictions: Certain types of content (gambling, alcohol, adult content, etc.) may be prohibited or heavily restricted. A more comprehensive list can be found here
  • Template requirements: Some jurisdictions require pre-approval of message templates
  • Sender ID regulations: Rules regarding who can use alphanumeric sender IDs vary widely

As a Provider, you need to decide which countries you’ll support and how you’ll ensure compliance across markets. This decision affects not just your pricing but your entire product architecture, especially if you serve global Customers.
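One way to encode per-country rules such as quiet hours is a small policy table that the sending path consults before each promotional message. The Python sketch below is illustrative only: the `QUIET_HOURS` windows and the two country entries are made-up placeholders rather than legal guidance, and the recipient's UTC offset is assumed to be known already.

```python
from datetime import datetime, time, timedelta, timezone

# Hypothetical quiet-hours windows in local time; placeholders, not legal guidance.
QUIET_HOURS = {
    "US": (time(21, 0), time(8, 0)),   # window wraps past midnight
    "FR": (time(20, 0), time(10, 0)),
}


def in_quiet_hours(country: str, now_utc: datetime, utc_offset_hours: float) -> bool:
    """Return True if a promotional message should be held for this recipient.

    now_utc must be a timezone-aware datetime.
    """
    window = QUIET_HOURS.get(country)
    if window is None:
        return False  # no rule configured for this country
    start, end = window
    recipient_tz = timezone(timedelta(hours=utc_offset_hours))
    local = now_utc.astimezone(recipient_tz).time()
    if start <= end:
        return start <= local < end
    return local >= start or local < end  # window spans midnight


late = datetime(2025, 5, 13, 2, 30, tzinfo=timezone.utc)  # 21:30 at UTC-5
print(in_quiet_hours("US", late, -5))
```

Keeping the rules in data rather than in code makes it easier to add a country, or to correct a window, without touching the sending logic.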

Strategies to Reduce Implementation Friction

Implementing SMS can be complex for your Customers. Here are some strategies that can simplify and/or streamline the process. Some of these can be mixed and matched and could also be used as a value-add or even as a paid offering to your Customers:

Provider-Hosted Privacy Policy and/or Terms & Conditions

Create customizable, compliant templates for Privacy Policies and Terms & Conditions that your Customers can use. This ensures proper disclosure of SMS practices without requiring Customers to update their own legal documents.

Registration Webforms and Workflows

Develop user-friendly webforms that collect all required registration information in a guided process. These can significantly simplify complex registrations like 10DLC brand and campaign registration.

Figures 1-3 below show several examples of compliant forms that could be customized for your use:

Fig. 1

Fig. 2

Fig. 3

Pre-Approved Opt-In Widgets

Create embeddable widgets, such as those in Figures 1-3 above, that implement compliant opt-in processes and that your Customers can add to their websites or apps. These can include all required disclosures and confirmations while being easy to integrate.

Template Libraries

Provide a library of pre-approved message templates for common use cases. This reduces compliance risks and simplifies the sending process for your Customers.
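A template library can also enforce that Customers personalize only the fields you have approved. The following Python sketch shows one possible shape for that check; the template wording, the `appointment_reminder` identifier, and the field names are hypothetical examples rather than carrier-approved content.

```python
import string

# Hypothetical pre-approved template; wording and field names are illustrative.
TEMPLATES = {
    "appointment_reminder": string.Template(
        "$business: Reminder of your appointment on $date at $time. "
        "Reply STOP to opt out."
    ),
}
ALLOWED_FIELDS = {"appointment_reminder": {"business", "date", "time"}}


def render(template_id: str, fields: dict[str, str]) -> str:
    """Fill a pre-approved template, rejecting missing or unexpected fields."""
    allowed = ALLOWED_FIELDS[template_id]
    unexpected = set(fields) - allowed
    missing = allowed - set(fields)
    if unexpected or missing:
        raise ValueError(f"unexpected={sorted(unexpected)} missing={sorted(missing)}")
    return TEMPLATES[template_id].substitute(fields)


print(render("appointment_reminder",
             {"business": "Example Dental", "date": "May 20", "time": "3 PM"}))
```

Because the fixed text lives only in the Provider's library, a Customer cannot change the disclosure or opt-out wording by accident.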

Testing Environments

Create sandbox environments where Customers can test their SMS implementation before going live. This helps catch issues with formatting, opt-in processes, or content compliance.

Documentation and Training

Develop clear documentation and training resources specific to each originator type and use case. This empowers your Customers while reducing support burden.

Conclusion

Incorporating SMS capabilities into your platform can enhance Customer engagement, but the journey can be complex. This guide has explored key considerations to help you navigate it successfully.

This post examined various architectural models, each with tradeoffs between Customer and Provider responsibilities. It reviewed strategic factors like pricing, geographic regulations, and originator types that must be carefully considered.
Finally, it discussed practical strategies that you can use to reduce implementation friction for Customers and simplify the integration process, such as hosted compliance documents, streamlined registration workflows, and pre-approved templates.

The critical first step, though, is understanding the relationship between you as the Provider, your Customers, and their End Users. This shapes whose information is used for originator registration, which in turn defines the SMS experience.

Ultimately, a successful SMS solution requires balancing technical, regulatory, and Customer-centric factors. Leveraging this guidance will equip you to design and deploy an offering that delights your Customers and their End Users.

Patch Tuesday – May 2025

Post Syndicated from Adam Barnett original https://blog.rapid7.com/2025/05/13/patch-tuesday-may-2025/

Microsoft is addressing 77 vulnerabilities this May 2025 Patch Tuesday. Microsoft has evidence of in-the-wild exploitation for five of the vulnerabilities published today, and these are already reflected in CISA KEV. Separately, Microsoft is aware of existing public disclosure for two vulnerabilities published today. This is now the eighth consecutive Patch Tuesday on which Microsoft has published zero-day vulnerabilities without evaluating any of them as critical severity at time of publication. Today also sees the publication of six critical remote code execution (RCE) vulnerabilities. Six browser vulnerabilities have already been published separately this month, and are not included in the total.

Windows Scripting Engine: zero-day RCE

In the majority of cases, the CVSSv3 base score provides a solid sense of the severity of a vulnerability. Sometimes, however, even a correct CVSS assessment can disguise the potential impact of a specific vulnerability. This is arguably the case with CVE-2025-30397, a zero-day RCE vulnerability in the Windows Scripting Engine with a healthy but unremarkable CVSSv3 base score of 7.5. Microsoft is aware of exploitation in the wild. It’s certainly not the worst of the worst (we save that level of alarm for pre-authentication RCE with no requirement for user interaction), and Microsoft assesses attack complexity as high, which is arguably correct. And yet…

The advisory FAQ for CVE-2025-30397 explains that successful exploitation requires an attacker to first prepare the target so that it uses Edge in Internet Explorer Mode, and then cause the user to click a malicious link; there is no mention of a requirement for the user to actively reload the page in Internet Explorer Mode, so we must assume that exploitation requires only that the “Allow sites to be reloaded in Internet Explorer” option is enabled. Users who are most likely to require Internet Explorer compatibility mode in 2025 are surely users at enterprise organizations, where critical business workflows still depend on applications from the dinosaur days when Internet Explorer ruled the roost. No doubt the concept of a plan for migration of all of these applications exists, buried several layers deep in a dusty backlog, but Microsoft would hardly be offering IE compatibility mode until at least 2029 if it didn’t know that a huge swathe of its customer base demands it.

If the pre-requisite conditions are already conveniently in place on the target asset thanks to a well-meaning corporate IT policy, attack complexity is suddenly nice and low. If this vulnerability didn’t have that requirement for environment preparation, the CVSS base score would then be 8.8, which is as close to critical as you can get without actually stepping over the line. As Rapid7 has previously noted on a number of occasions, the MSHTML/Trident scripting engine is still present in Windows; this is true even for assets which have only ever run versions of Windows released well after the end of support for Internet Explorer 11 back in June 2022.
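The jump from 7.5 to 8.8 can be reproduced from the public CVSS v3.1 base score equations. The Python sketch below is illustrative only: it assumes the vector for CVE-2025-30397 is AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H, which is consistent with the 7.5 score and the high attack complexity described above but is not quoted in this post, and it handles only scope-unchanged vectors.

```python
# CVSS v3.1 metric weights for scope-unchanged vectors.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the CVSS v3.1 definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (scaled // 10000 + 1) / 10.0


def base_score(av: str, ac: str, pr: str, ui: str, c: str, i: str, a: str) -> float:
    """Base score for a scope-unchanged vector."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# Assumed vector, first with attack complexity high, then with it low.
print(base_score("N", "H", "N", "R", "H", "H", "H"))  # 7.5
print(base_score("N", "L", "N", "R", "H", "H", "H"))  # 8.8
```

Only the attack complexity weight changes between the two calls (0.44 versus 0.77), which raises the exploitability sub-score from roughly 1.6 to 2.8 while the impact sub-score stays near 5.9.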

Common Log File System: zero-day EoPs

Neither CVE-2025-32701 nor CVE-2025-32706 is the first zero-day vulnerability in the Windows Common Log File System (CLFS) Driver; indeed, they are the latest members of an ongoing dynasty where exploitation typically leads to elevation of privilege to SYSTEM. Credit where credit is due: recent disclosures by Microsoft’s own Threat Intelligence Center (MSTIC), including this month’s CVE-2025-32701, demonstrate that Microsoft is putting serious effort into detecting and rooting out CLFS exploitation. Of course, since Microsoft is aware of exploitation in the wild, we know that someone else got there first, and there’s no reason to suspect that threat actors will stop looking for ways to abuse CLFS any time soon.

Windows Desktop Window Manager: zero-day EoP

If proof were needed that elevation of privilege to SYSTEM will never go out of style, today sees the publication of CVE-2025-30400, which is a zero-day vulnerability in the Windows Desktop Window Manager (DWM). As it happens, tomorrow marks the one-year anniversary of CVE-2024-30051, a previous zero-day EoP vulnerability in DWM.

Visual Studio: zero-day RCE

Today, all current versions of Visual Studio 2022 and 2019 receive patches for CVE-2025-32702, a zero-day RCE where exploitation requires the user to download and open a malicious file. There is nothing obviously remarkable about this, although Microsoft is aware of public disclosure. As usual for a malicious file/link vuln, the word Remote here refers to the location of the attacker, even though exploitation is set in motion by local user action.

Ancillary Function Driver for Winsock: zero-day EoP

Regular Patch Tuesday watchers will recognize the Ancillary Function Driver for Winsock, which is the site of CVE-2025-32709, an elevation of privilege vulnerability for which Microsoft is aware of exploitation. In something of a break with tradition for Patch Tuesday zero-day EoP vulnerabilities, exploitation only leads to administrator privileges rather than all the way to SYSTEM, but no attacker is going to waste too many cycles feeling sad about that.

Defender for Identity: situationally-ironic zero-day spoofing

Today sees the publication of CVE-2025-26685, a zero-day spoofing vulnerability in Microsoft Defender for Identity. The advisory provides puzzle pieces which don’t by themselves add up to anything like a full explanation of the vulnerability; no action is required for remediation, but you can render yourself vulnerable if you insist by opening a case with Microsoft Support to re-enable the legacy NTLM authentication method.

However, the FAQ does offer a link to an article published yesterday: Configure SAM-R to enable lateral movement path detection in Microsoft Defender for Identity. This solid piece of documentation is part of the overall Defender for Identity administration guide, and explains that the lateral movement path detection feature can itself potentially be exploited by an adversary to obtain an NTLM hash.

Exploitation relies on achieving fallback from Kerberos to NTLM; the compromised credentials in this case would be those of the Directory Service Account for Defender for Identity. The new Defender for Identity sensor (version 3.x) is not affected by this issue as it uses different detection methods; at time of writing, the Defender for Identity What’s new? page doesn’t yet describe the 3.x release, but this will presumably receive an update soon.

Microsoft lifecycle update

The next batch of significant Microsoft product lifecycle status changes is due in July 2025, when the SQL Server 2012 ESU program draws to a close, along with support for Visual Studio 2022 17.8 LTSC.

Summary charts

Patch Tuesday - May 2025

Summary tables

Apps vulnerabilities

| CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score |
| --- | --- | --- | --- | --- |
| CVE-2025-29975 | Microsoft PC Manager Elevation of Privilege Vulnerability | No | No | 7.8 |

Azure vulnerabilities

| CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score |
| --- | --- | --- | --- | --- |
| CVE-2025-29972 | Azure Storage Resource Provider Spoofing Vulnerability | No | No | 9.9 |
| CVE-2025-29827 | Azure Automation Elevation of Privilege Vulnerability | No | No | 9.9 |
| CVE-2025-30387 | Document Intelligence Studio On-Prem Elevation of Privilege Vulnerability | No | No | 9.8 |
| CVE-2025-47733 | Microsoft Power Apps Information Disclosure Vulnerability | No | No | 9.1 |
| CVE-2025-33072 | Microsoft msagsfeedback.azurewebsites.net Information Disclosure Vulnerability | No | No | 8.1 |
| CVE-2025-29973 | Microsoft Azure File Sync Elevation of Privilege Vulnerability | No | No | 7 |

Azure Windows vulnerabilities

| CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score |
| --- | --- | --- | --- | --- |
| CVE-2025-27488 | Microsoft Windows Hardware Lab Kit (HLK) Elevation of Privilege Vulnerability | No | No | 6.7 |

Browser vulnerabilities

| CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score |
| --- | --- | --- | --- | --- |
| CVE-2025-29825 | Microsoft Edge (Chromium-based) Spoofing Vulnerability | No | No | 6.5 |
| CVE-2025-4372 | Chromium: CVE-2025-4372 Use after free in WebAudio | No | No | N/A |
| CVE-2025-4096 | Chromium: CVE-2025-4096 Heap buffer overflow in HTML | No | No | N/A |
| CVE-2025-4052 | Chromium: CVE-2025-4052 Inappropriate implementation in DevTools | No | No | N/A |
| CVE-2025-4051 | Chromium: CVE-2025-4051 Insufficient data validation in DevTools | No | No | N/A |
| CVE-2025-4050 | Chromium: CVE-2025-4050 Out of bounds memory access in DevTools | No | No | N/A |

Developer Tools vulnerabilities

| CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score |
| --- | --- | --- | --- | --- |
| CVE-2025-29813 | Azure DevOps Server Elevation of Privilege Vulnerability | No | No | 10 |
| CVE-2025-26646 | .NET, Visual Studio, and Build Tools for Visual Studio Spoofing Vulnerability | No | No | 8 |
| CVE-2025-32702 | Visual Studio Remote Code Execution Vulnerability | No | Yes | 7.8 |
| CVE-2025-21264 | Visual Studio Code Security Feature Bypass Vulnerability | No | No | 7.1 |
| CVE-2025-32703 | Visual Studio Information Disclosure Vulnerability | No | No | 5.5 |

ESU Windows vulnerabilities

| CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score |
| --- | --- | --- | --- | --- |
| CVE-2025-29962 | Windows Media Remote Code Execution Vulnerability | No | No | 8.8 |
| CVE-2025-29966 | Remote Desktop Client Remote Code Execution Vulnerability | No | No | 8.8 |
| CVE-2025-29967 | Remote Desktop Client Remote Code Execution Vulnerability | No | No | 8.8 |
| CVE-2025-32701 | Windows Common Log File System Driver Elevation of Privilege Vulnerability | Yes | No | 7.8 |
| CVE-2025-32706 | Windows Common Log File System Driver Elevation of Privilege Vulnerability | Yes | No | 7.8 |
| CVE-2025-30385 | Windows Common Log File System Driver Elevation of Privilege Vulnerability | No | No | 7.8 |
| CVE-2025-32709 | Windows Ancillary Function Driver for WinSock Elevation of Privilege Vulnerability | Yes | No | 7.8 |
| CVE-2025-32707 | NTFS Elevation of Privilege Vulnerability | No | No | 7.8 |
| CVE-2025-24063 | Kernel Streaming Service Driver Elevation of Privilege Vulnerability | No | No | 7.8 |
| CVE-2025-29831 | Windows Remote Desktop Services Remote Code Execution Vulnerability | No | No | 7.5 |
| CVE-2025-30397 | Scripting Engine Memory Corruption Vulnerability | Yes | No | 7.5 |
| CVE-2025-29969 | MS-EVEN RPC Remote Code Execution Vulnerability | No | No | 7.5 |
| CVE-2025-29833 | Microsoft Virtual Machine Bus (VMBus) Remote Code Execution Vulnerability | No | No | 7.1 |
| CVE-2025-27468 | Windows Kernel-Mode Driver Elevation of Privilege Vulnerability | No | No | 7 |
| CVE-2025-29959 | Windows Routing and Remote Access Service (RRAS) Information Disclosure Vulnerability | No | No | 6.5 |
| CVE-2025-29960 | Windows Routing and Remote Access Service (RRAS) Information Disclosure Vulnerability | No | No | 6.5 |
| CVE-2025-29830 | Windows Routing and Remote Access Service (RRAS) Information Disclosure Vulnerability | No | No | 6.5 |
| CVE-2025-29832 | Windows Routing and Remote Access Service (RRAS) Information Disclosure Vulnerability | No | No | 6.5 |
| CVE-2025-29836 | Windows Routing and Remote Access Service (RRAS) Information Disclosure Vulnerability | No | No | 6.5 |
| CVE-2025-29958 | Windows Routing and Remote Access Service (RRAS) Information Disclosure Vulnerability | No | No | 6.5 |
| CVE-2025-29961 | Windows Routing and Remote Access Service (RRAS) Information Disclosure Vulnerability | No | No | 6.5 |
| CVE-2025-29835 | Windows Remote Access Connection Manager Information Disclosure Vulnerability | No | No | 6.5 |
| CVE-2025-29968 | Active Directory Certificate Services (AD CS) Denial of Service Vulnerability | No | No | 6.5 |
| CVE-2025-29957 | Windows Deployment Services Denial of Service Vulnerability | No | No | 6.2 |
| CVE-2025-30394 | Windows Remote Desktop Gateway (RD Gateway) Denial of Service Vulnerability | No | No | 5.9 |
| CVE-2025-29954 | Windows Lightweight Directory Access Protocol (LDAP) Denial of Service Vulnerability | No | No | 5.9 |
| CVE-2025-29974 | Windows Kernel Information Disclosure Vulnerability | No | No | 5.7 |
| CVE-2025-29837 | Windows Installer Information Disclosure Vulnerability | No | No | 5.5 |
| CVE-2025-29956 | Windows SMB Information Disclosure Vulnerability | No | No | 5.4 |
| CVE-2025-29839 | Windows Multiple UNC Provider Driver Information Disclosure Vulnerability | No | No | 4 |

Microsoft Dynamics vulnerabilities

| CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score |
| --- | --- | --- | --- | --- |
| CVE-2025-47732 | Microsoft Dataverse Remote Code Execution Vulnerability | No | No | 8.7 |
| CVE-2025-29826 | Microsoft Dataverse Elevation of Privilege Vulnerability | No | No | 7.3 |

Microsoft Office vulnerabilities

| CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score |
| --- | --- | --- | --- | --- |
| CVE-2025-30377 | Microsoft Office Remote Code Execution Vulnerability | No | No | 8.4 |
| CVE-2025-30386 | Microsoft Office Remote Code Execution Vulnerability | No | No | 8.4 |
| CVE-2025-32704 | Microsoft Excel Remote Code Execution Vulnerability | No | No | 8.4 |
| CVE-2025-30382 | Microsoft SharePoint Server Remote Code Execution Vulnerability | No | No | 7.8 |
| CVE-2025-29976 | Microsoft SharePoint Server Elevation of Privilege Vulnerability | No | No | 7.8 |
| CVE-2025-29978 | Microsoft PowerPoint Remote Code Execution Vulnerability | No | No | 7.8 |
| CVE-2025-32705 | Microsoft Outlook Remote Code Execution Vulnerability | No | No | 7.8 |
| CVE-2025-29977 | Microsoft Excel Remote Code Execution Vulnerability | No | No | 7.8 |
| CVE-2025-29979 | Microsoft Excel Remote Code Execution Vulnerability | No | No | 7.8 |
| CVE-2025-30375 | Microsoft Excel Remote Code Execution Vulnerability | No | No | 7.8 |
| CVE-2025-30376 | Microsoft Excel Remote Code Execution Vulnerability | No | No | 7.8 |
| CVE-2025-30379 | Microsoft Excel Remote Code Execution Vulnerability | No | No | 7.8 |
| CVE-2025-30381 | Microsoft Excel Remote Code Execution Vulnerability | No | No | 7.8 |
| CVE-2025-30383 | Microsoft Excel Remote Code Execution Vulnerability | No | No | 7.8 |
| CVE-2025-30393 | Microsoft Excel Remote Code Execution Vulnerability | No | No | 7.8 |
| CVE-2025-30384 | Microsoft SharePoint Server Remote Code Execution Vulnerability | No | No | 7.4 |
| CVE-2025-30378 | Microsoft SharePoint Server Remote Code Execution Vulnerability | No | No | 7 |

Microsoft Office ESU Windows vulnerabilities

| CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score |
| --- | --- | --- | --- | --- |
| CVE-2025-30388 | Windows Graphics Component Remote Code Execution Vulnerability | No | No | 7.8 |

System Center vulnerabilities

| CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score |
| --- | --- | --- | --- | --- |
| CVE-2025-26684 | Microsoft Defender Elevation of Privilege Vulnerability | No | No | 6.7 |
| CVE-2025-26685 | Microsoft Defender for Identity Spoofing Vulnerability | No | Yes | 6.5 |

Windows vulnerabilities

| CVE | Title | Exploited? | Publicly disclosed? | CVSSv3 base score |
| --- | --- | --- | --- | --- |
| CVE-2025-29964 | Windows Media Remote Code Execution Vulnerability | No | No | 8.8 |
| CVE-2025-29840 | Windows Media Remote Code Execution Vulnerability | No | No | 8.8 |
| CVE-2025-29963 | Windows Media Remote Code Execution Vulnerability | No | No | 8.8 |
| CVE-2025-30400 | Microsoft DWM Core Library Elevation of Privilege Vulnerability | Yes | No | 7.8 |
| CVE-2025-29970 | Microsoft Brokering File System Elevation of Privilege Vulnerability | No | No | 7.8 |
| CVE-2025-26677 | Windows Remote Desktop Gateway (RD Gateway) Denial of Service Vulnerability | No | No | 7.5 |
| CVE-2025-29971 | Web Threat Defense (WTD.sys) Denial of Service Vulnerability | No | No | 7.5 |
| CVE-2025-29842 | UrlMon Security Feature Bypass Vulnerability | No | No | 7.5 |
| CVE-2025-29838 | Windows ExecutionContext Driver Elevation of Privilege Vulnerability | No | No | 7.4 |
| CVE-2025-29841 | Universal Print Management Service Elevation of Privilege Vulnerability | No | No | 7 |
| CVE-2025-29955 | Windows Hyper-V Denial of Service Vulnerability | No | No | 6.2 |
| CVE-2025-29829 | Windows Trusted Runtime Interface Driver Information Disclosure Vulnerability | No | No | 5.5 |

[$] A look at what’s possible with BPF arenas

Post Syndicated from daroc original https://lwn.net/Articles/1019885/


BPF arenas are areas of memory where the verifier can safely relax its checking of pointers, allowing programmers to write arbitrary data structures in BPF. Emil Tsalapatis reported on how his team has used arenas in writing sched_ext schedulers at the 2025 Linux Storage, Filesystem, Memory-Management, and BPF Summit. His biggest complaint was about the fact that kernel pointers can’t be stored in BPF arenas, something that the BPF developers hope to address, although there are some implementation problems that must be sorted out first.

Nextcloud claims Google is being anticompetitive

Post Syndicated from jzb original https://lwn.net/Articles/1021016/

Nextcloud provides an open-source collaboration platform called Nextcloud Hub, which includes file-sharing and syncing features. The company has written a blog post explaining that Google has revoked a critical permission from the Nextcloud Files app for Android that allows it to sync files to Nextcloud Hub.

Google is stating security concerns as a reason for revoking the
permission. This is hard to believe for us. Nextcloud has had this
feature since its inception in 2016, and we have never heard about any
security concerns from Google about it. Moreover, several Big Tech
apps as well as Google’s own still have this. What we think: Google
owning the platform means they can and are giving themselves
preferential treatment.

Despite multiple appeals since mid-2024, Google has refused to
reinstate the permission, blocking automated Nextcloud file uploads
for millions of users.

The Nextcloud app available via F-Droid does not have this limitation, but the post notes that that is not an option for many users.

AI lifecycle risk management: ISO/IEC 42001:2023 for AI governance

Post Syndicated from Abdul Javid original https://aws.amazon.com/blogs/security/ai-lifecycle-risk-management-iso-iec-420012023-for-ai-governance/

As AI becomes central to business operations, so does the need for responsible AI governance. But how can you make sure that your AI systems are ethical, resilient, and aligned with compliance standards?

ISO/IEC 42001, the international management system standard for AI, offers a framework to help organizations implement AI governance across the lifecycle. In this post, we walk through how ISO/IEC 42001 enables effective AI governance, review the risk management requirements, and explore how you can use threat modeling as a practical technique to meet those expectations.

AI governance

AI governance refers to the organizational structures, policies, and controls that enable AI systems to be used responsibly, ethically, and safely. Governance spans the entire AI lifecycle and includes the following activities:

  • Setting the intended purpose and stakeholder alignment
  • Managing data, models, and deployment risks
  • Designing in explainability, bias mitigation, and traceability
  • Establishing accountability, monitoring, and decommissioning practices

These activities are the foundation of a formal framework that you can use to establish governance processes, identify and manage risk, and implement processes for continuous improvement.

AI lifecycle

While ISO/IEC 42001 provides a framework for AI governance, ISO/IEC 22989:2022 describes what an AI system is and how it evolves. Governance should be implemented at every stage of the AI lifecycle to manage AI risks effectively. According to the ISO/IEC 22989:2022 standard, an organization’s AI lifecycle might include these stages:

  1. Inception: Identifying needs, goals, and feasibility
  2. Design and development: Defining system architecture, data flows, and training models
  3. Verification and validation: Testing and confirming that the system meets requirements and performs as intended
  4. Deployment: Releasing the system into its operational environment
  5. Operation and monitoring: Running the system, logging activity, and monitoring performance and outcomes
  6. Re-evaluation: Assessing whether the system continues to meet objectives under changing conditions
  7. Retirement: Decommissioning the system and addressing long-term data and access risks

Understanding the AI lifecycle, shown in Figure 1 that follows, is critical for identifying and mitigating AI risks. While these seven stages are provided directly in ISO/IEC 22989:2022, your organization might define its AI lifecycle stages differently to suit its business context. We refer to these stages as we explore the components of an AI management system, from initial AI system scoping, through threat monitoring and risk assessment, to monitoring the established governance program.

Figure 1: Example of AI system lifecycle model stages and high-level processes based on ISO/IEC 22989:2022

Risk management in ISO/IEC 42001:2023

After an organization has identified and assessed AI risks (Clause 6.1 of ISO/IEC 42001:2023), operational controls to mitigate those risks must be implemented (Clause 8.2), and those controls and the AI system itself should be continuously monitored, documented, and improved (Clauses 9 and 10). AI impact assessments (AIIAs) are critical in high-risk use cases, complementing baseline risk assessments by focusing on societal, ethical, and legal impacts. AIIAs are like data protection impact assessments (DPIAs) for high-risk personal data processing under many privacy regulations. DPIAs are specifically designed to assess risks to individuals’ privacy and data protection rights under laws such as the GDPR. While AIIAs help organizations maintain responsible AI governance, DPIAs can be used in parallel to help verify that AI systems comply with data protection laws, together providing a holistic view of risks and safeguards across both ethical and legal dimensions.

You are free to select the AIIA tools or methodologies that best fit your use case. Two widely accepted frameworks are:

  • ISO 31000: A general-purpose enterprise risk management standard that helps identify, evaluate, and treat risks in a structured and repeatable way. It aligns well with organizations seeking to embed AI risk into their broader enterprise risk management (ERM) programs.
  • NIST AI Risk Management Framework (AI RMF): A NIST framework specifically designed for AI systems. It introduces tailored concepts such as explainability, robustness, fairness, and accountability, with actionable guidance organized into four core functions: Govern, Map, Measure, and Manage.

ISO/IEC 42001 provides structured methods to conduct risk and impact assessments. Threat modeling tools that can support those assessments include:

  • STRIDE (spoofing, tampering, repudiation, information disclosure, denial of service, and elevation of privilege) aims to make sure that a system meets security requirements for confidentiality, integrity, and availability.
  • DREAD (damage potential, reproducibility, exploitability, affected users, and discoverability) is a framework for assessing the severity of individual threats.
  • OWASP (Open Worldwide Application Security Project) for machine learning (ML) enables analysis of AI system vulnerabilities, adversarial risks, and privacy threats.
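
As a rough illustration of how DREAD can be used to rank threats, the Python sketch below averages five ratings into a single score. The 1-10 scale, the simple average, and the example ratings are assumptions based on common practice; neither ISO/IEC 42001 nor this post prescribes them.

```python
from dataclasses import astuple, dataclass


@dataclass
class DreadRating:
    """One threat rated on the five DREAD factors (assumed scale: 1-10)."""
    damage: int
    reproducibility: int
    exploitability: int
    affected_users: int
    discoverability: int

    def score(self) -> float:
        """Average the five ratings; a higher score means a more severe threat."""
        parts = astuple(self)
        if not all(1 <= p <= 10 for p in parts):
            raise ValueError("each rating must be between 1 and 10")
        return sum(parts) / len(parts)


# Hypothetical ratings for two threats against a generative AI endpoint.
threats = {
    "prompt injection": DreadRating(8, 9, 7, 6, 8),
    "prompt spam (denial of service)": DreadRating(5, 8, 6, 7, 9),
}
for name, rating in sorted(threats.items(), key=lambda kv: kv[1].score(), reverse=True):
    print(f"{name}: {rating.score():.1f}")
```

A score like this is only a prioritization aid; the ratings themselves still depend on the judgment of the people doing the assessment.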

Trustworthy AI is the result of strategic governance, structured methodologies, and technical analysis.

Figure 2 that follows shows the tiered structure of AI risk governance, moving from high-level governance to detailed technical assessments. On the left side, there’s a downward flow representing the increasing depth of controls, while the right side shows an upward scale indicating escalating AI risks.

  • At the top layer, ISO/IEC 42001:2023 defines formal requirements for AI governance, including risk assessment mandates, control implementation, and lifecycle oversight.
  • The middle layer features widely adopted risk assessment methodologies and frameworks, such as ISO 31000 and the NIST AI Risk Management Framework (RMF), which provide structured methods to identify, evaluate, and mitigate AI risks.
  • At the base are detailed threat modeling tools, including STRIDE, DREAD, PASTA, LINDDUN, and OWASP for ML, that support deep analysis of AI systems for vulnerabilities related to security, privacy, data protection, and adversarial threats.

Together, these layers form a comprehensive approach to AI risk governance, aligning strategic oversight with operational and technical defenses.

Figure 2: A layered approach to AI risk management aligned with ISO/IEC 42001. ISO/IEC 42001 defines AI governance for responsible AI

Threat modeling for AI risk identification

Threat modeling identifies AI lifecycle technical risks such as exploit surfaces, adversarial threats, and misuse scenarios that complement organizational risk analysis and impact assessments. This post takes a broader AI lifecycle view, showing you how threat modeling complements other risk strategies within the context of ISO/IEC 42001:2023. Additionally, AWS has published AI threat modeling guidance.

The following table is an example STRIDE threat model for a generative AI resource using AWS services by AI lifecycle stage and risk type. This illustrates technical threat remediation through AWS cloud native governance features.

| STRIDE category | Example threat | Lifecycle stage | Risk type | AWS feature for governance |
| --- | --- | --- | --- | --- |
| Spoofing | A fake identity uses the AI system to generate phishing emails or misinformation | Inception | Security | AWS IAM Identity Center and Amazon Cognito for multi-factor authentication (MFA), Amazon GuardDuty for threat detection |
| Tampering | A malicious prompt injection or API injection alters the model behavior or bypasses filters | Design and development | Integrity | Amazon Bedrock Guardrails, Amazon API Gateway and AWS WAF rules, AWS CloudTrail for input auditing |
| Repudiation | Users deny prompt activity or content creation, and there’s no logging | Verification and validation | Accountability | CloudTrail, Amazon Bedrock invocation logs, Amazon SageMaker ML Lineage Tracking for traceability |
| Information disclosure | Sensitive internal data, such as code or personally identifiable information (PII), is accidentally learned and reproduced by the large language model (LLM) | Operation and monitoring | Privacy, Security | SageMaker Clarify, AWS VPC PrivateLink, AWS Key Management Service (AWS KMS) encryption, Amazon Bedrock data handling commitments |
| Denial of service | Bad actors overload the AI endpoint with prompt spam, degrading service | Deployment | Availability | AWS Shield, API rate limiting using API Gateway, auto scaling with SageMaker endpoints |
| Elevation of privilege | An internal user modifies system prompts or updates to override content filters | Re-evaluation | Ethics and access control | AWS Identity and Access Management (IAM) roles, Amazon Bedrock Guardrails, AWS Config, service control policies (SCPs) |

While STRIDE is used here for illustrative clarity, it’s just one of several threat modeling approaches that can be applied depending on the system context. Other widely recognized methods include:

By integrating these threat modeling practices into ISO/IEC 42001’s risk-based approach, organizations are not just “checking compliance boxes”; they’re operationalizing trustworthy, secure, and accountable AI governance throughout the full system lifecycle.

Threat modeling touchpoints across the AI lifecycle

This post uses the STRIDE threat modeling framework to align specific security threats to each AI lifecycle stage. Each stage is associated with particular threat types, relevant Annex references from ISO/IEC 42001:2023, and examples of what to monitor.

  • Inception (Annex A.8.1): Focuses on spoofing and fake identity input risks.
  • Design and Development (Annex A.9.1): Linked to tampering threats.
  • Verification and Validation (Annex A.7.1): Concerns around repudiation, such as lack of model decision logs.
  • Deployment (Annex A.5.1): Addresses information disclosure vulnerabilities.
  • Operation and Monitoring (Annex A.10.3): Maps to denial-of-service attacks.
  • Re-evaluation (Annex A.8.6): Highlights risks of privilege escalation.

AI threat modeling isn’t a one-time task but must be applied continuously across each lifecycle stage, supported by ISO 42001’s annexes and STRIDE categories.

Figure 3: An illustration of how organizations can use ISO/IEC 42001:2023 as a structured framework for AI risk management, using threat modeling as a key technique across the AI lifecycle

AWS tools for AI governance and risk management

AWS governance service capabilities support the controls required in the Statement of Applicability (SoA) under ISO/IEC 42001. These services and features help organizations operationalize responsible AI practices at scale and align with ISO/IEC 42001’s emphasis on structured, accountable AI lifecycle management.

  • Amazon SageMaker Model Cards: Provides standardized documentation for ML models including purpose, performance, and limitations. In the governance context, model cards help maintain transparency, accountability, and auditability of model behavior and use.
  • Amazon SageMaker Clarify: Detects bias in datasets and models and supports explainability of predictions. This directly supports governance controls related to fairness, non-discrimination, and explainability.
  • Amazon SageMaker Ground Truth: Provides high-quality, human-in-the-loop data labeling workflows. It supports data governance by making sure labeled datasets are accurate, consistent, and traceable.
  • Amazon Bedrock Guardrails: Can be used to define safety filters for generative AI, such as avoiding toxic content or harmful outputs. This facilitates alignment with ethical and content governance policies.
  • AWS CloudTrail and AWS Config: Enable audit logging and continuous monitoring of system changes. These are essential for accountability, traceability, and compliance reporting within AI governance frameworks.
  • AWS Identity and Access Management (IAM), AWS Key Management Service (AWS KMS), and AWS PrivateLink: IAM controls access, AWS KMS provides encryption and key management, and PrivateLink enables private connectivity. These features are critical for enforcing access governance, securing data, and maintaining privacy standards.
  • AWS Generative AI Lens: A part of the AWS Well-Architected Framework tool. It provides structured guidance for evaluating and improving the design of generative AI systems. It helps organizations implement responsible AI practices and manage risks.

Conducting AI impact assessments for high risk use cases

While general risk assessments (Clause 6.1 of ISO/IEC 42001) are required for AI systems, ISO/IEC 42001 also calls for AIIAs in situations where the AI system poses high potential impact to individuals, groups, or society. AIIAs should result in a documented report of identified risks associated with the target AI activity, in addition to the severity of potential negative outcomes. These risks should be integrated into the AI management system (AIMS) and monitored over time. Several stakeholders and specialists might need to provide input in the assessment process, such as legal, risk, compliance, data management, and security teams. Identified risks should be mitigated where possible, and a determination made about whether the residual risk is acceptable.

AIIAs help answer questions such as:

  • Is the AI use justifiable, ethical, and proportionate?
  • Could the system cause discrimination, exclusion, or loss of rights?
  • What safeguards should be built to protect affected people?

An AIIA is required:

  • If the system makes or informs decisions that materially affect people
  • If the system is deployed in sensitive domains (such as healthcare, finance, or public services)
  • If risks to fundamental rights, fairness, or trust are flagged during initial risk assessments
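The three triggers above can be expressed as a simple screening check. The following is a minimal sketch, not a compliance tool: the hash keys and the `aiia_required?` helper are hypothetical names chosen for illustration, and a real intake process would capture far more context.

```ruby
# Hypothetical screening record: each flag mirrors one trigger from the list above.
def aiia_required?(system)
  system[:affects_people_materially] ||
    system[:sensitive_domain] ||
    system[:rights_risk_flagged]
end

docs_search = {
  name: "internal-docs-search",
  affects_people_materially: false,
  sensitive_domain: false,
  rights_risk_flagged: false
}
claims_triage = {
  name: "claims-triage",
  affects_people_materially: true,   # informs decisions about individuals
  sensitive_domain: true,            # for example, finance or healthcare
  rights_risk_flagged: false
}

[docs_search, claims_triage].each do |system|
  status = aiia_required?(system) ? "AIIA required" : "no AIIA trigger met"
  puts "#{system[:name]}: #{status}"
end
```

Any single trigger is enough to require the assessment, which is why the check uses a logical OR rather than a score.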

An AIIA should cover:

  • Purpose and scope of the AI system
  • Stakeholder and impact mapping
  • Legal, ethical, and social risk evaluation
  • Transparency and recourse mechanisms
  • Recommendations for mitigation

AIIA process workflow

Figure 4 illustrates a generic AIIA workflow that includes initiating, scoping, assessing impact, planning mitigation, and documenting the outcome to evaluate how an AI system can affect individuals, groups, and society. Organizations can tailor this process to the AI system context, business objectives, and compliance requirements for their use case.

Figure 4: Sample prescriptive process with key phases on conducting an AIIA

AIIA outcome

AIIA reports should capture the core purpose of the exercise: to evaluate how an AI system might affect individuals, communities, and society at large and to make sure that potential risks are addressed through appropriate mitigation strategies. While formats might vary across industries, an AIIA outcome typically includes key sections such as a summary of the system purpose, a mapping of affected stakeholders, a contextual analysis of legal and social factors, an evaluation of likely impacts (including fairness, bias, and autonomy risks), and a plan for mitigation, oversight, and monitoring. Governance details such as sign-off responsibility and reassessment triggers should also be included.

Whether you’re starting from scratch or adapting an existing template, these foundational elements will help make sure that your documentation supports transparency, accountability, and ethical AI deployment.

Templates:

Mapping AI lifecycle risks to ISO/IEC 42001 controls

After you have identified risks through techniques such as threat modeling and impact assessments, the next step is to make sure that they’re mitigated through the appropriate ISO/IEC 42001 controls. Using the lifecycle stages defined in ISO/IEC 22989:2022, you can map AI risks identified during the threat modeling process to the corresponding ISO/IEC 42001:2023 clauses and Annex A controls. This mapping helps you align your AI development and governance efforts with a standards-based risk framework.

AI lifecycle stage | Identified risk | Relevant ISO/IEC 42001 clauses | Risk mitigation (Annex A controls)
--- | --- | --- | ---
Inception | Spoofing: Impersonation | Clause 4, Clause 5 | A.6.1 (Governance roles), A.5.1
Design and development | Tampering: Unauthorized changes | Clause 6.1, Clause 8.2 | A.8.2, A.9.1
Verification and validation | Repudiation: No traceability | Clause 8.2 | A.8.5, A.7.1
Deployment | Elevation of privilege: Unauthorized model tweaks | Clause 8.2, Clause 9.1 | A.10.2, A.6.1
Operation and monitoring | Denial of service: System overload | Clause 9.1, Clause 10.1 | A.8.3, A.10.3
Re-evaluation | Drift and new threat vectors | Clause 9.3, Clause 10.2 | A.10.2, A.6.4
Retirement | Information disclosure: Residual risks | Clause 8.3, Clause 10.2 | A.9.4, A.5.2
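One way to make this mapping actionable is to keep it as data that a review checklist or ticket template can query. The sketch below copies the rows from the table above; the `control_map` and `controls_for` names are illustrative only and aren't part of any AWS or ISO tooling.

```ruby
# Rows copied from the mapping table above; method names are hypothetical.
def control_map
  {
    "Inception" => { risk: "Spoofing: Impersonation", clauses: %w[4 5], controls: %w[A.6.1 A.5.1] },
    "Design and development" => { risk: "Tampering: Unauthorized changes", clauses: %w[6.1 8.2], controls: %w[A.8.2 A.9.1] },
    "Verification and validation" => { risk: "Repudiation: No traceability", clauses: %w[8.2], controls: %w[A.8.5 A.7.1] },
    "Deployment" => { risk: "Elevation of privilege: Unauthorized model tweaks", clauses: %w[8.2 9.1], controls: %w[A.10.2 A.6.1] },
    "Operation and monitoring" => { risk: "Denial of service: System overload", clauses: %w[9.1 10.1], controls: %w[A.8.3 A.10.3] },
    "Re-evaluation" => { risk: "Drift and new threat vectors", clauses: %w[9.3 10.2], controls: %w[A.10.2 A.6.4] },
    "Retirement" => { risk: "Information disclosure: Residual risks", clauses: %w[8.3 10.2], controls: %w[A.9.4 A.5.2] }
  }
end

# Build a one-line checklist entry for a lifecycle stage.
def controls_for(stage)
  entry = control_map.fetch(stage) { raise ArgumentError, "unknown lifecycle stage: #{stage}" }
  "#{stage}: #{entry[:risk]} -> clauses #{entry[:clauses].join(', ')}; Annex A #{entry[:controls].join(', ')}"
end

puts controls_for("Deployment")
```

Keeping the mapping in one place means that a change to the Statement of Applicability only needs to be made once.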

Maintaining AI governance

Like most technology risk and governance programs, AI management must be continuously monitored and maintained. ISO 42001 requires an organization to have leadership support and sufficient resources to operate effectively over time. This means that AI governance should be built into every process in the AI development and maintenance journey. AIIAs and threat modeling should be conducted at least annually on existing systems, and prior to the deployment of any new AI function. Policies should be reviewed at least annually and after any major change to the AI system. Internal audits should review and monitor compliance with controls continuously, and organizations seeking ISO certification will require annual external audits. Progress toward governance goals and metrics on the status of known AI risks should be reported to the highest level of leadership in a live dashboard, and incidents of negative outcomes related to AI use should be tracked and analyzed to improve the AI system.

Conclusion

Managing AI risk effectively means aligning technical, organizational, and ethical considerations throughout the AI system lifecycle. ISO/IEC 42001 provides structure and accountability. Threat modeling techniques such as STRIDE, MITRE ATLAS, and OWASP for LLM surface deep technical risks. AWS services and features such as SageMaker Model Cards, SageMaker Clarify, and Amazon Bedrock Guardrails help embed governance into layers of AI development.

By combining technical tools, structured assessments, and standards-driven controls, you can build AI systems that are trustworthy, resilient, and aligned with societal expectations.

For additional guidance on achieving, maintaining, and automating compliance in the cloud, contact AWS Security Assurance Services (AWS SAS) or your account team. AWS SAS is a PCI QSAC and HITRUST Assessor Firm that can help by tying together applicable audit standards to AWS service specific features and functionality. They help you build on frameworks such as ISO 42001, PCI DSS, HITRUST CSF, NIST-CSF and Privacy Framework, SOC 2, HIPAA, ISO 27001 and 27701, and more. In addition, AWS Professional Services can also help you plan and map your compliance journey.

Disclaimer: The risk strategies and threat modeling guidance shared in this blog are intended to provide general direction and practical insight into implementing AI risk management under ISO/IEC 42001:2023. However, organizations are responsible for conducting their own context-specific risk assessments, as mandated by the standard. This blog should not be interpreted as an exhaustive approach to or guarantee of compliance with ISO/IEC 42001.

If you have feedback about this post, submit comments in the Comments section below.

Abdul Javid

Abdul is a Senior Security Assurance Consultant and a PECB ISO 42001 Lead Auditor and IAPP Certified AI Governance Professional. He draws on his extensive experience to guide AWS customers on compliance matters. He holds an M.S. in Computer Science from IIT Chicago and numerous certifications from IAPP, AWS, ISO, HITRUST, ISACA, PMI, PCI DSS, and ISC2.

Amber Welch

Amber is a Senior Privacy Consultant with AWS Security Assurance Services and a PECB-certified ISO 42001 Lead Auditor. With extensive security and privacy management experience across industries, she advises AWS customers on compliance and risk management. She holds an M.A. in English – Technical Communication, CIPM and CIPP-E certifications, and is a thought leader in AI privacy, contributing to the AWS Privacy Reference Architecture.

Enhanced remote desktop experience: Amazon DCV with Amazon Linux 2023

Post Syndicated from Madhur Kulkarni original https://aws.amazon.com/blogs/compute/enhanced-remote-desktop-experience-amazon-dcv-with-amazon-linux-2023/

Amazon DCV has evolved as a powerful remote display protocol, enabling secure high-performance remote desktop access and application streaming. This blog talks about how DCV remote display capabilities are now integrated with Amazon Linux 2023 (AL2023).

Overview

This post introduces the new Graphical Desktop with AL2023 and provides an overview of new features available through DCV. The Graphical Desktop comes with GNOME 47 for a smooth UI experience that you can connect to using DCV, enabling remote desktop access from anywhere. It also provides an overview of more tools such as a terminal emulator with Ptyxis for an improved CLI experience, Mozilla Firefox for secure web browsing, an image viewer with Loupe, a text editor, and a file manager for file navigation.

Core features

AL2023 introduces an enhanced desktop experience, specifically tailored for remote access needs, as shown in Figure 1. DCV technology allows you to connect seamlessly to the Graphical Desktop interface with GNOME 47. Users benefit from native DCV protocol support that enables high-performance remote access, featuring dynamic resolution adaptation and hardware-accelerated video encoding. Enhanced security features include advanced encryption and granular access controls.

Although DCV supports multiple desktop environments, the use of GNOME 47 is specifically part of the current AL2023 release. The GNOME 47 system uses Mutter 47.0 as its window manager and compositor, alongside the GTK 4 toolkit for its user interface. This includes window management capabilities that provide more precise control over application placement and sizing, while improved multi-monitor support makes sure that workspaces expand seamlessly across displays. Most importantly, there is a native desktop-like experience with the DCV local features such as clipboard sharing, audio redirection, and multi-monitor capabilities, which deliver a seamless and responsive remote environment.

Figure 1. DCV desktop interface with AL2023

Ptyxis delivers exceptional performance with SSH, SFTP, and TLS/SSL support, as shown in the following figure. You can experience GPU-accelerated text rendering with a crystal-clear display, supporting UTF-8/16 and Unicode 15.0, while maintaining minimal input lag at 60 Hz refresh rates. The DCV 4K support enables high-resolution (3840 x 2160 pixels) remote desktop streaming, which allows users to work with graphically intensive applications while maintaining excellent visual quality. Ptyxis is deeply integrated with GNOME through D-Bus and GNOME I/O (GIO) interfaces, providing access to global search and system notifications. Users can use advanced session management with JSON-based configurations, tab groups, split views up to 16 panes, and automatic session restoration. The terminal includes full 256-color and true-color support, compatible with Bash, Zsh, and Fish shells, while maintaining robust connection stability.

Figure 2. Terminal Emulator

Firefox in AL2023 is optimized specifically for remote desktop scenarios, as shown in the following figure. The browser features hardware-accelerated rendering and WebGL 2.0 support, delivering smooth graphics and responsive page loading. Enhanced browser capabilities provide better performance for 3D applications and interactive web content. Users can experience optimized video playback with minimal frame drops and improved synchronization, which are particularly important for remote streaming needs. Integration with DCV streaming technology enables efficient resource usage and provides a local-like experience when accessing remote workstations, featuring seamless audio-video synchronization, smooth multi-monitor support, and native peripheral device integration.

Figure 3. Mozilla Firefox with DCV

More features

The GNOME Text Editor seamlessly integrates with AL2023, providing a modern, distraction-free interface for coding and text editing within the DCV environment, as shown in the following figure. As the default text editor in the AL2023 GNOME desktop, it offers essential features such as syntax highlighting, dark/light themes, and autosave functionality, making it ideal for remote development work.

Figure 4. GNOME Text Editor

Loupe offers a sleek and intuitive image viewing experience in AL2023 when accessed through DCV, as shown in the following figure. It features a clean interface with smooth animations, efficient image loading, and gesture support, all while maintaining responsive performance over the DCV remote desktop connection. This makes it ideal for viewing and basic image manipulation tasks.

Figure 5. Image Viewer features and options

The GNOME File manager in AL2023 provides a robust and intuitive interface for managing files and folders when accessed through the DCV remote desktop environment, as shown in the following figure. It offers essential features such as drag-and-drop functionality, list and grid views, file search, and seamless integration with cloud storage, all while maintaining responsive performance over the DCV optimized remote connection protocol. You can use DCV to upload files to and download files from DCV session storage. For instructions on how to enable and configure session storage, go to Enabling session storage in the DCV Administrator Guide.

Figure 6. File Manager

Conclusion

The Amazon DCV team is committed to delivering the best remote desktop experience possible, and these enhancements demonstrate that commitment. In this post, we demonstrated how our integrated solution, from the intuitive GNOME 47 interface to the powerful terminal capabilities of Ptyxis, creates a seamless remote workspace. Using these improvements allows you to enhance productivity and the overall user experience in remote desktop environments. These enhanced capabilities are a significant step forward in remote computing, providing tools and optimizations designed to meet the evolving needs of today’s distributed and flexible work environments.

For a deeper dive into setup and advanced configurations, you should review our comprehensive DCV admin guides, which provide detailed information to help you maximize the potential of these new features.

Zero-copy, Coordination-free approach to OpenSearch Snapshots

Post Syndicated from Sachin Kale original https://aws.amazon.com/blogs/big-data/zero-copy-coordination-free-approach-to-opensearch-snapshots/

Amazon OpenSearch Service provides automated hourly snapshots as a critical backup and recovery mechanism for customer data. These snapshots serve as point-in-time backups that you can use to restore your OpenSearch domains to a previous state, helping to ensure data durability and business continuity. While this functionality is essential, it’s equally important that the snapshot process operates seamlessly without impacting the domain’s core operations. The snapshot workflow must be efficient enough to maintain optimal performance of search and indexing operations, preserve the domain’s ability to scale with growing workloads, and support overall cluster stability.

In this blog post, we tell you how we enhanced the snapshot efficiency in Amazon OpenSearch Service while carefully maintaining these critical operational aspects. These snapshot optimizations are enabled for all OpenSearch optimized instance family (OR1, OR2, OM2) domains from version 2.17 onwards.

Background

In the traditional snapshot mechanism of OpenSearch, the process involves uploading incremental segment files from each shard to Amazon Simple Storage Service (Amazon S3). The workflow begins when the cluster manager node initiates the snapshot creation and coordinates with the nodes holding primary shards to capture their respective snapshots. Throughout this process, data nodes continuously communicate with the cluster manager node to report their snapshot progress. To provide resilience against leader failures, the cluster state maintains detailed tracking of all in-progress snapshots. This state is shared with all data nodes. However, this approach introduces significant communication overhead, especially in large-scale deployments.

Consider a cluster with M nodes and N primary shards. Each snapshot operation requires at least N cluster state updates, with M*N transport calls flowing to and from the cluster manager node to the data nodes (comprising one cluster state update for each primary shard and M transport calls for each update), as shown in the following diagram. In large domains with hundreds of nodes and thousands of shards, this intensive communication pattern can potentially overwhelm the cluster manager node, impacting its ability to handle other critical cluster management tasks.
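To make the scale of this overhead concrete, the following sketch applies the M-nodes, N-primary-shards model from the paragraph above. It is a back-of-the-envelope calculation only: `snapshot_overhead` is a hypothetical helper, and the figures are lower bounds implied by the description, not measurements.

```ruby
# Lower-bound coordination cost of one traditional snapshot, using the
# model described above: N cluster state updates, each fanned out to M nodes.
def snapshot_overhead(nodes:, primary_shards:)
  {
    cluster_state_updates: primary_shards,     # at least one per primary shard
    transport_calls: nodes * primary_shards    # M transport calls per update
  }
end

[[10, 100], [10, 10_000], [100, 100_000]].each do |m, n|
  cost = snapshot_overhead(nodes: m, primary_shards: n)
  puts "M=#{m}, N=#{n}: #{cost[:cluster_state_updates]} state updates, " \
       "#{cost[:transport_calls]} transport calls"
end
```

Because the cost grows with the product of node count and shard count, doubling both quadruples the number of transport calls that the cluster manager has to handle.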

Traditional Snapshot

The OpenSearch optimized instance family introduced a significant advancement in data durability and snapshot efficiency. Built to deliver high throughput with 11 nines of durability, OpenSearch optimized instances maintain a copy of all indexed data in Amazon S3. This architectural design eliminated the need to re-upload data during snapshot creation. Instead, the system references the existing data checkpoint in the snapshot metadata. Data checkpoints track the state of data on shards at a given point in time to help ensure consistency and durability. We also prevent cleaning up data from Amazon S3 that is referenced in the snapshot metadata. This approach made snapshots substantially more lightweight and faster compared to the conventional method.

The improved snapshot flow with OpenSearch optimized instances, also called shallow snapshot v1, manages checkpoint referencing by creating explicit lock files for each checkpoint of a given shard. This flow is illustrated in the following diagram, where in the fourth step, instead of uploading segment data, we upload a checkpoint lock file.

Shallow Snapshot V1

While this approach successfully addressed the data redundancy issue by replacing segment data uploads with checkpoint lock file creation, it introduced its own set of challenges. The communication overhead between nodes remained unchanged during snapshot creation and deletion operations. Additionally, the system creates lock files for every shard in each snapshot, regardless of whether the shard receives active traffic or not. This design choice generated an excessive number of remote store calls to create a lock file per shard during snapshot operations, which is particularly problematic for larger OpenSearch domains.
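As a rough illustration of how quickly those lock files add up, the sketch below multiplies shards by snapshots, following the one-lock-file-per-shard-per-snapshot behavior described above. The helper name and the 24-snapshots-per-day figure (from the hourly schedule mentioned earlier) are assumptions for illustration.

```ruby
# Shallow snapshot v1 writes one lock file per shard for every snapshot,
# whether or not the shard changed since the last snapshot.
def v1_lock_files(shards:, snapshots:)
  shards * snapshots
end

# Hourly automated snapshots over one day on a 10,000-shard domain.
puts v1_lock_files(shards: 10_000, snapshots: 24)   # prints 240000
```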

Revised shallow snapshot (v2)

At its core, shallow snapshot v2 reimagines how we handle data backup in OpenSearch. Shallow snapshot v2 takes a more intelligent approach by implementing a timestamp-based referencing system that reduces data duplication while eliminating the communication overhead. In shallow snapshot v2, as shown in the following diagram, instead of putting an explicit lock on the remote store checkpoint file of a shard, it puts an implicit lock based on the timestamp of the snapshot and of the checkpoint file. We track these snapshot timestamps in pinned timestamp files and upload them to the remote store. With this implicit lock, the checkpoints that match with timestamps in pinned timestamp files aren’t cleaned up from Amazon S3. With this architectural change, data nodes don’t need to send shard updates to the cluster manager, avoiding the subsequent cluster state updates. The snapshot restoration process works by reading a pinned timestamp file corresponding to your snapshot, which helps the data node locate and download the correct version of data from Amazon S3.
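The implicit lock can be pictured as a retention rule applied during cleanup. The following is a simplified model, not the OpenSearch implementation: checkpoints and pinned timestamps are plain integers, and the matching rule (keep the newest checkpoint at or before each pinned timestamp, plus the latest checkpoint) is an assumption made for illustration. `retained_checkpoints` is a hypothetical name.

```ruby
# ASSUMPTION: a checkpoint "matches" a pinned timestamp when it is the newest
# checkpoint uploaded at or before that timestamp. Timestamps are integers.
def retained_checkpoints(checkpoints, pinned_timestamps)
  sorted = checkpoints.sort
  keep = []
  keep << sorted.last unless sorted.empty?        # latest checkpoint stays live
  pinned_timestamps.each do |pin|
    match = sorted.select { |ts| ts <= pin }.last
    keep << match if match                        # implicitly locked by the pin
  end
  keep.uniq.sort
end

checkpoints = [100, 200, 300, 400]   # checkpoint upload times for one shard
pinned      = [250, 310]             # two snapshots, taken at t=250 and t=310

kept  = retained_checkpoints(checkpoints, pinned)
stale = checkpoints - kept
puts "kept=#{kept.inspect} eligible_for_cleanup=#{stale.inspect}"
```

In this model, taking a snapshot only records a timestamp. No per-shard lock file is written, and no shard has to report progress to the cluster manager.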

Key benefits

Let’s explore the major advantages of using shallow snapshot v2.

Performance improvements

The performance benefits of shallow snapshot v2 are substantial and multifaceted. By minimizing the amount of data that needs to be uploaded to the remote store and the number of cluster state updates that need to be communicated between nodes during snapshot creation, the system significantly reduces I/O and network operations. This reduction translates to faster snapshot creation times and lower system resource utilization during backup operations.

The evaluations shown in the following table were performed to assess the influence on snapshot operations when the domain experiences significant load.

Number of nodes | Number of shards | Snapshot creation time: traditional | Snapshot creation time: shallow snapshot v1 | Snapshot creation time: shallow snapshot v2
--- | --- | --- | --- | ---
10 | 100 | 15–20 minutes | 1–2 minutes | <1 second
10 | 10,000 | 30–40 minutes | 5–10 minutes | <5 seconds
100 | 100,000 | >1 hour | >1 hour | <10 seconds

Scalability

With a fixed number of inter-node communication calls during snapshot creation, the snapshot creation time stays in single-digit seconds even as the node, index, and shard counts grow. When tested on 1,000 nodes in an Amazon OpenSearch Service domain, shallow snapshot v2 creation time was observed to be between 10–20 seconds. For organizations managing large Amazon OpenSearch Service domains, shallow snapshot v2 offers particular advantages. The reduced storage cost from shallow snapshots and the faster snapshot creation times from shallow snapshot v2 make it possible to maintain more frequent backups without overwhelming storage resources or impacting system performance.

Architectural simplification

The architectural improvements in shallow snapshot v2 go beyond performance optimization. The new implementation features a more streamlined and maintainable codebase, reducing the effort needed to debug issues and implement future enhancements. The simplified architecture reduces the complexity of the snapshot and restore process, leading to more reliable operations and fewer potential points of failure. For use cases that require frequent backups, such as compliance-driven scenarios or development environments, this means that you can establish a lower recovery point objective for disaster recovery. Shallow snapshot v2’s efficient handling of incremental changes makes it possible to maintain more granular backup schedules without performance penalties.

Storage efficiency

The cornerstone of shallow snapshot v2 is its innovative approach to storage management. Instead of creating multiple copies of unchanged data, the system maintains smart references to existing data blocks. This implicit timestamp-based reference-counting mechanism avoids creating explicit locks per shard. In environments where storage resources are at a premium, the storage efficiency of shallow snapshot v2 can lead to significant cost savings. The reference-based approach helps ensure optimal use of available storage space while maintaining comprehensive backup coverage.

Looking ahead

The introduction of shallow snapshot v2 marks the beginning of our journey toward more efficient data backup solutions. Building upon the framework created by shallow snapshot v2, we can implement additional features such as point-in-time recovery (PITR), better cluster state integration, and various performance optimizations.

Conclusion

Shallow Snapshot V2 represents a significant advancement in OpenSearch’s backup capabilities. By combining storage efficiency, improved performance, and architectural simplification, it provides a robust solution for modern data backup challenges. If you’re using an instance type from the optimized instance family, shallow snapshot v2 is already enabled for you. Whether you’re using a large-scale domain or working within storage constraints, shallow snapshot v2 offers tangible benefits for your Amazon OpenSearch Service domains.


About the Authors

Sachin Kale is a senior software development engineer at AWS working on OpenSearch.

Bukhtawar Khan is a Principal Engineer working on Amazon OpenSearch Service. He is interested in building distributed and autonomous systems. He is a maintainer and an active contributor to OpenSearch.

Gaurav Bafna is a Senior Software Engineer working on OpenSearch at Amazon Web Services. He is fascinated by solving problems in distributed systems. He is a maintainer and an active contributor to OpenSearch.

GitHub Issues search now supports nested queries and boolean operators: Here’s how we (re)built it

Post Syndicated from Deborah Digges original https://github.blog/developer-skills/application-development/github-issues-search-now-supports-nested-queries-and-boolean-operators-heres-how-we-rebuilt-it/


Originally, Issues search was limited by a simple, flat structure of queries. But with advanced search syntax, you can now construct searches using logical AND/OR operators and nested parentheses, pinpointing the exact set of issues you care about.

Building this feature presented significant challenges: ensuring backward compatibility with existing searches, maintaining performance under high query volume, and crafting a user-friendly experience for nested searches. We’re excited to take you behind the scenes to share how we took this long-requested feature from idea to production.

Here’s what you can do with the new syntax and how it works behind the scenes

Issues search now supports building queries with logical AND/OR operators across all fields, with the ability to nest query terms. For example, is:issue state:open author:rileybroughten (type:Bug OR type:Epic) finds all issues that are open AND were authored by rileybroughten AND are either of type Bug or Epic.

Screenshot of an Issues search query involving the logical OR operator.
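To illustrate the semantics of that example query, here is a small sketch that evaluates a hand-built nested structure against in-memory issues. It isn't how Issues search executes queries (those are translated to Elasticsearch, as described later); the node shapes (`[:and, ...]`, `[:or, ...]`, `[:filter, key, value]`) and the `matches?` helper are hypothetical.

```ruby
# Hand-built tree for:
#   is:issue state:open author:rileybroughten (type:Bug OR type:Epic)
query = [:and,
         [:filter, "is", "issue"],
         [:filter, "state", "open"],
         [:filter, "author", "rileybroughten"],
         [:or, [:filter, "type", "Bug"], [:filter, "type", "Epic"]]]

# Recursively evaluate a node against one issue (a hash of field => value).
def matches?(node, issue)
  op, *rest = node
  case op
  when :and then rest.all? { |child| matches?(child, issue) }
  when :or  then rest.any? { |child| matches?(child, issue) }
  when :filter
    key, value = rest
    issue[key] == value
  else
    raise ArgumentError, "unknown node type: #{op.inspect}"
  end
end

issues = [
  { "is" => "issue", "state" => "open", "author" => "rileybroughten", "type" => "Bug" },
  { "is" => "issue", "state" => "open", "author" => "rileybroughten", "type" => "Task" },
  { "is" => "issue", "state" => "closed", "author" => "rileybroughten", "type" => "Epic" }
]
issues.each { |issue| puts "#{issue['type']} (#{issue['state']}): #{matches?(query, issue)}" }
```

Only the first issue satisfies every top-level condition, because the parenthesized OR group is itself just one of the terms joined by AND.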

How did we get here?

Previously, as mentioned, Issues search only supported a flat list of query fields and terms, which were implicitly joined by a logical AND. For example, the query assignee:@me label:support new-project translated to “give me all issues that are assigned to me AND have the label support AND contain the text new-project.”

But the developer community has been asking for more flexibility in issue search, repeatedly, for nearly a decade now. They wanted to be able to find all issues that had either the label support or the label question, using the query label:support OR label:question. So, we shipped an enhancement towards this request in 2021, when we enabled an OR style search using a comma-separated list of values.

However, they still wanted the flexibility to search this way across all issue fields, and not just the labels field. So we got to work. 

Technical architecture and implementation

The architecture of the Issues search system (and the changes needed to build this feature).

From an architectural perspective, we swapped out the existing search module for Issues (IssuesQuery) with a new search module (ConditionalIssuesQuery) that was capable of handling nested queries while continuing to support existing query formats.

This involved rewriting IssuesQuery, the search module that parsed query strings and mapped them into Elasticsearch queries.

Search Architecture

To build a new search module, we first needed to understand the existing search module, and how a single search query flowed through the system. At a high level, when a user performs a search, there are three stages in its execution:

  1. Parse: Breaking the user input string into a structure that is easier to process (like a list or a tree)
  2. Query: Transforming the parsed structure into an Elasticsearch query document, and making a query against Elasticsearch.
  3. Normalize: Mapping the results obtained from Elasticsearch (JSON) into Ruby objects for easy access and pruning the results to remove records that had since been removed from the database.

Each stage presented its own challenges, which we’ll explore in more detail below. The Normalize step remained unchanged during the re-write, so we won’t dive into that one.

Parse stage

The user input string (the search phrase) is first parsed into an intermediate structure. The search phrase could include:

  • Query terms: The relevant words the user is trying to find more information about (ex: “models”)
  • Search filters: These restrict the set of returned search documents based on some criteria (ex: “assignee:Deborah-Digges”)

Example search phrases:

  • Find all issues assigned to me that contain the word “codespaces”:
    • is:issue assignee:@me codespaces
  • Find all issues with the label documentation that are assigned to me:
    • assignee:@me label:documentation

The old parsing method: flat list

When only flat, simple queries were supported, it was sufficient to parse the user’s search string into a list of search terms and filters, which would then be passed along to the next stage of the search process.
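To make the flat-list approach concrete, here is a minimal sketch of what such parsing could look like. It is not GitHub's implementation: the names FlatQuery and parse_flat are hypothetical, and the sketch assumes that any whitespace-separated token of the form attribute:value is a filter and everything else is a free-text term.

```ruby
# Hypothetical sketch of flat-list parsing; names are illustrative only.
FlatQuery = Struct.new(:filters, :terms)

def parse_flat(input)
  filters = []
  terms = []

  input.split.each do |token|
    # Split on the first colon only, so values may themselves contain colons.
    attribute, value = token.split(":", 2)
    if value.nil? || value.empty? || attribute.empty?
      terms << token # plain search term, or a malformed filter kept as text
    else
      filters << { attribute: attribute, value: value }
    end
  end

  FlatQuery.new(filters, terms)
end

query = parse_flat("assignee:@me label:support new-project")
puts query.filters.inspect
puts query.terms.inspect
```

Because every element of the resulting lists is implicitly joined by AND, a structure like this has no place to record grouping or OR, which is the limitation the tree-based parser addresses.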

The new parsing method: abstract syntax tree

As nested queries may be recursive, parsing the search string into a list was no longer sufficient. We changed this component to parse the user’s search string into an Abstract Syntax Tree (AST) using the parsing library parslet.

We defined a grammar (a PEG or Parsing Expression Grammar) to represent the structure of a search string. The grammar supports both the existing query syntax and the new nested query syntax, to allow for backward compatibility.

A simplified PEG for boolean expressions, written for the parslet parser, is shown below:

require "parslet"

class Parser < Parslet::Parser
  rule(:space)  { match[" "].repeat(1) }
  rule(:space?) { space.maybe }

  rule(:lparen) { str("(") >> space? }
  rule(:rparen) { str(")") >> space? }

  rule(:and_operator) { str("and") >> space? }
  rule(:or_operator)  { str("or")  >> space? }

  rule(:var) { str("var") >> match["0-9"].repeat(1).as(:var) >> space? }

  # The primary rule deals with parentheses.
  rule(:primary) { lparen >> or_operation >> rparen | var }

  # Note that the following rules are both right-recursive.
  rule(:and_operation) {
    (primary.as(:left) >> and_operator >>
      and_operation.as(:right)).as(:and) |
    primary }

  rule(:or_operation) {
    (and_operation.as(:left) >> or_operator >>
      or_operation.as(:right)).as(:or) |
    and_operation }

  # We start at the lowest precedence rule.
  root(:or_operation)
end

For example, this user search string:
is:issue AND (author:deborah-digges OR author:monalisa)
would be parsed into the following AST:

{
  "root": {
    "and": {
      "left": {
        "filter_term": {
          "attribute": "is",
          "value": [
            {
              "filter_value": "issue"
            }
          ]
        }
      },
      "right": {
        "or": {
          "left": {
            "filter_term": {
              "attribute": "author",
              "value": [
                {
                  "filter_value": "deborah-digges"
                }
              ]
            }
          },
          "right": {
            "filter_term": {
              "attribute": "author",
              "value": [
                {
                  "filter_value": "monalisa"
                }
              ]
            }
          }
        }
      }
    }
  }
}

Query stage

Once the query is parsed into an intermediate structure, the next steps are to:

  1. Transform this intermediate structure into a query document that Elasticsearch understands
  2. Execute the query against Elasticsearch to obtain results

Executing the query in step 2 remained the same between the old and new systems, so let’s only go over the differences in building the query document below.

The old query generation: linear mapping of filter terms using filter classes

Each filter term (Ex: label:documentation) has a class that knows how to convert it into a snippet of an Elasticsearch query document. During query document generation, the correct class for each filter term is invoked to construct the overall query document.

The new query generation: recursive AST traversal to generate Elasticsearch bool query

We recursively traversed the AST generated during parsing to build an equivalent Elasticsearch query document. The nested structure and boolean operators map nicely to Elasticsearch’s boolean query, with the AND, OR, and NOT operators mapping to the must, should, and must_not clauses.

We re-used the building blocks for the smaller pieces of query generation to recursively construct a nested query document during the tree traversal.
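The recursive traversal can be illustrated with a short sketch. This is an assumption-laden simplification, not the production code: to_bool_query and filter_clause are hypothetical names, the leaf builder emits a plain terms clause on the raw attribute name (the real system uses a dedicated class per filter and, for example, resolves authors to IDs), and only AND and OR nodes in the shape of the AST shown earlier are handled.

```ruby
require "json"

# Hypothetical leaf builder: turns one filter_term node into a terms clause.
def filter_clause(term)
  values = term["value"].map { |v| v["filter_value"] }
  { "terms" => { term["attribute"] => values } }
end

# Walks the AST depth-first and wraps each operator in a bool query.
def to_bool_query(node)
  if node.key?("and")
    children = node["and"].values_at("left", "right").map { |c| to_bool_query(c) }
    { "bool" => { "must" => children } }
  elsif node.key?("or")
    children = node["or"].values_at("left", "right").map { |c| to_bool_query(c) }
    # minimum_should_match makes the OR strict: at least one branch must match.
    { "bool" => { "should" => children, "minimum_should_match" => 1 } }
  elsif node.key?("filter_term")
    filter_clause(node["filter_term"])
  else
    raise ArgumentError, "unsupported node: #{node.keys.inspect}"
  end
end

author = lambda do |login|
  { "filter_term" => { "attribute" => "author", "value" => [{ "filter_value" => login }] } }
end
ast = {
  "and" => {
    "left" => { "filter_term" => { "attribute" => "is", "value" => [{ "filter_value" => "issue" }] } },
    "right" => { "or" => { "left" => author.call("deborah-digges"), "right" => author.call("monalisa") } }
  }
}
puts JSON.pretty_generate("query" => to_bool_query(ast))
```

Each operator node produces exactly one bool clause whose children are built by the same function, so the nesting of the output mirrors the nesting of the parsed query.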

Continuing from the example in the parsing stage, the AST would be transformed into a query document that looked like this:

{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "must": [
              {
                "bool": {
                  "must": {
                    "prefix": {
                      "_index": "issues"
                    }
                  }
                }
              },
              {
                "bool": {
                  "should": {
                    "terms": {
                      "author_id": [
                        "<DEBORAH_DIGGES_AUTHOR_ID>",
                        "<MONALISA_AUTHOR_ID>"
                      ]
                    }
                  }
                }
              }
            ]
          }
        }
      ]
    }
    // SOME TERMS OMITTED FOR BREVITY
  }
}

With this new query document, we execute a search against Elasticsearch. This search now supports logical AND/OR operators and parentheses to search for issues in a more fine-grained manner.

Considerations

Issues is one of the oldest and most heavily used features on GitHub. Changing core functionality like Issues search, a feature that averages nearly 2000 queries per second (QPS), or almost 160M queries a day, presented a number of challenges to overcome.

Ensuring backward compatibility

Issue searches are often bookmarked, shared among users, and linked in documents, making them important artifacts for developers and teams. Therefore, we wanted to introduce this new capability for nested search queries without breaking existing queries for users. 

We validated the new search system before it even reached users by:

  • Testing extensively: We ran our new search module against all unit and integration tests for the existing search module. To ensure that the GraphQL and REST API contracts remained unchanged, we ran the tests for the search endpoint both with the feature flag for the new search system enabled and disabled.
  • Validating correctness in production with dark-shipping: For 1% of issue searches, we ran the user’s search against both the existing and new search systems in a background job, and logged differences in responses. By analyzing these differences we were able to fix bugs and missed edge cases before they reached our users.
    • We weren’t sure at the outset how to define “differences,” but we settled on “number of results” for the first iteration. In general, we reasoned that a user would be surprised by the new search system if the two searches, run within a second or less of each other, returned a different number of results.
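The dark-shipping comparison described above can be sketched roughly as follows. Everything here is illustrative: compare_counts, sampled?, and the Mismatch struct are hypothetical names, the two search systems are stand-in lambdas, and the background job and logging used in production are left out.

```ruby
# Hypothetical record of a disagreement between the two search systems.
Mismatch = Struct.new(:query, :old_count, :new_count)

# Returns true for roughly `rate` of calls (1% by default).
def sampled?(rate: 0.01, rng: Random)
  rng.rand < rate
end

# Runs the same query against both systems and compares only result counts.
def compare_counts(query, old_search:, new_search:)
  old_count = old_search.call(query).size
  new_count = new_search.call(query).size
  return nil if old_count == new_count

  Mismatch.new(query, old_count, new_count)
end

# Stand-in search functions for illustration only.
old_search = ->(_query) { [101, 102, 103] }
new_search = ->(_query) { [101, 102] }

if sampled?(rate: 1.0) # force sampling on for the demo
  mismatch = compare_counts("label:support", old_search: old_search, new_search: new_search)
  puts "mismatch: #{mismatch.to_a.inspect}" if mismatch
end
```

Comparing counts rather than full result sets keeps the check cheap and tolerant of ordering differences, at the cost of missing cases where both systems return the same number of different issues.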

Preventing performance degradation

We expected more complex nested queries to use more resources on the backend than simpler queries, so we needed to establish a realistic baseline for nested queries, while ensuring no regression in the performance of existing, simpler ones.

For 1% of Issue searches, we ran equivalent queries against both the existing and the new search systems. We used scientist, GitHub’s open source Ruby library for carefully refactoring critical paths, to compare the performance of equivalent queries and confirm that there was no regression.

Preserving user experience

We didn’t want users to have a worse experience than before just because more complex searches were possible.

We collaborated closely with product and design teams to ensure usability didn’t decrease as we added this feature by:

  • Limiting the number of nested levels in a query to five. From customer interviews, we found this to be a sweet spot for both utility and usability.
  • Providing helpful UI/UX cues: We highlight the AND/OR keywords in search queries, and provide users with the same auto-complete feature for filter terms in the UI that they were accustomed to for simple flat queries.
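A nesting limit like the one described above could be enforced with a check along these lines. This is only a sketch under stated assumptions: it measures nesting as the deepest level of parentheses in the raw query string, ignores the possibility of parentheses inside quoted values, and uses hypothetical names (MAX_NESTING, paren_depth, within_nesting_limit?). The actual system may count nesting differently, for example on the parsed tree.

```ruby
MAX_NESTING = 5 # limit taken from the text; how depth is measured is assumed

# Returns the deepest level of parentheses reached in the query string.
def paren_depth(query)
  depth = 0
  deepest = 0
  query.each_char do |ch|
    if ch == "("
      depth += 1
      deepest = depth if depth > deepest
    elsif ch == ")"
      depth -= 1
      raise ArgumentError, "unbalanced parentheses" if depth.negative?
    end
  end
  raise ArgumentError, "unbalanced parentheses" unless depth.zero?

  deepest
end

def within_nesting_limit?(query, limit: MAX_NESTING)
  paren_depth(query) <= limit
end

puts within_nesting_limit?("is:issue AND (author:monalisa OR (label:bug AND type:Epic))")
```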

Minimizing risk to existing users

For a feature that is used by millions of users a day, we needed to be intentional about rolling it out in a way that minimized risk to users.

We built confidence in our system by:

  • Limiting blast radius: To gradually build confidence, we only integrated the new system in the GraphQL API and the Issues tab for a repository in the UI to start. This gave us time to collect, respond to, and incorporate feedback without risking a degraded experience for all consumers. Once we were happy with its performance, we rolled it out to the Issues dashboard and the REST API.
  • Testing internally and with trusted partners: As with every feature we build at GitHub, we tested this feature internally for the entire period of its development by shipping it to our own team during the early days, and then gradually rolling it out to all GitHub employees. We then shipped it to trusted partners to gather initial user feedback.

And there you have it, that’s how we built, validated, and shipped the new and improved Issues search!

Feedback

Want to try out this exciting new functionality? Head to our docs to learn about how to use boolean operators and parentheses to search for the issues you care about!

If you have any feedback for this feature, please drop us a note on our community discussions.

Acknowledgements

Special thanks to AJ Schuster, Riley Broughten, Stephanie Goldstein, Eric Jorgensen, Mike Melanson, and Laura Lindeman for the feedback on several iterations of this blog post!

The post GitHub Issues search now supports nested queries and boolean operators: Here’s how we (re)built it appeared first on The GitHub Blog.

Mapping AWS security services to MITRE frameworks for threat detection and mitigation

Post Syndicated from Pratima Singh original https://aws.amazon.com/blogs/security/mapping-aws-security-services-to-mitre-frameworks-for-threat-detection-and-mitigation/

In the cloud security landscape, organizations benefit from aligning their controls and practices with industry standard frameworks such as MITRE ATT&CK®, MITRE Engage™, and MITRE D3FEND™. MITRE frameworks are structured, openly accessible models that document threat actor behaviors to help organizations improve threat detection and response.

Figure 1: Interaction between the various MITRE frameworks

Figure 1 showcases how the frameworks interact with each other to identify threatening behavior and provide actionable defensive measures. MITRE ATT&CK provides insights into threat actor behavior while D3FEND translates insights from ATT&CK into actionable defensive measures. MITRE Engage uses both ATT&CK and D3FEND to plan proactive engagement strategies that disrupt threat actor activity. As organizations use AWS to enhance their operational capabilities, implementing comprehensive security strategies becomes an important part of cloud adoption.

This blog post explores how AWS security services align with the MITRE frameworks to provide a systematic approach for threat detection and mitigation. We’ll examine how organizations can use AWS security tools such as Amazon GuardDuty, Amazon Security Lake, and AWS Security Hub in conjunction with MITRE frameworks to implement security controls across different stages of their cloud security operations.

Understanding MITRE frameworks

Today’s security teams face increasingly sophisticated threats, with actors continuously evolving their tactics, techniques, and procedures (TTPs). To help organizations strengthen their security posture, industry frameworks such as MITRE ATT&CK, D3FEND, and Engage provide structured methodologies for understanding and responding to these threats.

Understanding these threats through a risk lifecycle approach is crucial for security teams. This structured methodology enables teams to detect anomalies early, map threats to known risk stages, and implement proactive defense mechanisms. By following a risk lifecycle approach, organizations can enhance threat intelligence, improve incident response, and minimize dwell time, ultimately strengthening their security posture against evolving cyber threats.

The integration of MITRE ATT&CK, D3FEND, and Engage frameworks offers organizations a comprehensive approach across the security operations lifecycle. At the foundation, MITRE ATT&CK provides a common language for describing threat actor TTPs. This knowledge base is invaluable during threat modeling and risk assessment, helping teams identify potential vulnerabilities and threat vectors.

Building upon ATT&CK, MITRE D3FEND complements the tactical knowledge with a framework for defensive countermeasures. It suggests proactive security controls, such as implementing least privilege access or securing system configurations. This allows organizations to align their defenses directly with known exploit patterns.

MITRE Engage then adds a layer of active defense capabilities. It guides security teams in planning and implementing strategies that can help in three ways, potentially at the same time. First, defenders can expose threat actors by detecting them as they attempt to access or operate on infrastructure. Second, defenders can impose costs by causing threat actors to focus on fake infrastructure rather than legitimate assets. Finally, defenders can set up enticing fake targets to lure threat actors into exploiting them and thereby revealing tradecraft.

A MITRE operation that was run in conjunction with a partner might clarify how this is valuable. MITRE worked with a partner to set up a fake network to appear as a specific type of entity. The goal was to elicit TTPs from a specific advanced persistent threat (APT) for which MITRE and the partner had a recent malware sample. MITRE ran the sample on the fake network and observed the APT’s activities. From that operation, MITRE gathered a list of specific TTPs that were executed by a script in a particular order that helped the partner develop a novel analytic. Plus, in reviewing event traces, MITRE found a flaw in a well-known security tool that missed a specific type of process-tampering event. This was disclosed to the vendor, who fixed that in later versions. Finally, every minute of operating in this environment imposed a cost on the APT by diverting resources from real victims. Full details of the exercise were presented at Shmoocon 2022.

As we move through the security operations lifecycle, these three MITRE frameworks continue to work in concert:

  • During detection and monitoring, ATT&CK informs threat hunting and log analysis and correlation, D3FEND strengthens real-time detection and anomaly tracking, and Engage enables strategic detection through deception techniques.
  • When responding to incidents, ATT&CK helps map incident progression, D3FEND automates response actions, and Engage provides methods to gather additional intelligence about threat activities.
  • In the post-incident phase, ATT&CK helps map the incident chain for better detection tuning, D3FEND refines security controls, and Engage expands deception tactics based on lessons learned. By integrating these efforts, organizations can implement a systematic approach to security operations that combines tactical knowledge, defensive measures, and strategic engagement capabilities.

Aligning AWS to MITRE frameworks

AWS offers a broad set of cloud services with high security at global scale, and has proven experience helping businesses innovate faster. Customers use AWS services in various configurations to build solutions for their bespoke business needs. A fundamental aspect of using AWS is understanding the Shared Responsibility Model, shown in Figure 2 that follows.

Figure 2: AWS Shared Responsibility Model

AWS is responsible for security of the cloud, while customers are responsible for security in the cloud. This means that AWS is responsible for protecting the infrastructure that runs the services offered in the AWS Cloud, while customer responsibility is determined by the AWS Cloud services that a customer selects. As customers embark on their cloud security journey, we help them understand two important concepts of cloud-scale environments:

  • Interconnected resources and configurations: Cloud architectures consist of interconnected entities—ranging from virtual machines using Amazon Elastic Compute Cloud (Amazon EC2) to serverless functions using AWS Lambda. To help customers maintain visibility and control, AWS offers native tools designed for cloud-scale management.
  • Dynamic access management and least privilege: Cloud environments require robust authentication mechanisms and fine-grained permissions. AWS provides comprehensive identity and access management tools to implement least privilege access and manage dynamic workloads effectively.

To support our customers’ security needs, AWS offers native security services that align with industry-standard frameworks like MITRE ATT&CK, D3FEND, and Engage. Here’s how these services map across the security lifecycle:

For threat modeling and risk assessment, Security Lake aggregates logs for MITRE ATT&CK-based analytics, while Amazon Inspector scans for vulnerabilities mapped to threat actor techniques. Amazon Macie detects sensitive data exposure across AWS resources.

When implementing preventive controls, least privilege access is fundamental. AWS Identity and Access Management (IAM) and AWS Organizations provide capabilities to enforce least privilege across your AWS environment. You can use IAM permissions and service control policies (SCPs) to build an identity perimeter. AWS Web Application Firewall (AWS WAF) provides application-layer protections, and you can use AWS Secrets Manager, a service for centrally managing the lifecycle of secrets, to store honey tokens.

Honey tokens act as digital decoys that simulate legitimate credentials or sensitive data, enticing threat actors to reveal their presence when they interact with them. When triggered, these tokens generate real-time alerts and detailed event logs, enabling swift investigation and deeper insights into threat actor tactics. Deploying honey tokens on AWS involves creating decoy credentials or sensitive data entries that serve no legitimate purpose yet are closely monitored for unauthorized access attempts. One common approach is to use Secrets Manager to store fake secrets that mimic real credentials. When such tokens are accessed, the service generates detailed event logs with AWS CloudTrail and Amazon CloudWatch. You can continuously monitor these logs and events and configure alerts that notify you if the decoys are ever accessed.
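As a rough illustration of monitoring for honey token access, the sketch below checks whether a single CloudTrail event record represents a read of a decoy secret. It assumes the record has already been parsed into a Ruby hash, that the decoy is identified by the secretId value in requestParameters (which may be a name or an ARN, depending on what the caller supplied), and that the decoy name shown is purely hypothetical. Delivery of events and alerting are out of scope here.

```ruby
# Hypothetical decoy secret identifiers; replace with your own decoys.
DECOY_SECRET_IDS = ["decoy/prod-db-admin"].freeze

# Returns true when the CloudTrail record is a GetSecretValue call on a decoy.
def honey_token_access?(event)
  event["eventSource"] == "secretsmanager.amazonaws.com" &&
    event["eventName"] == "GetSecretValue" &&
    DECOY_SECRET_IDS.include?(event.dig("requestParameters", "secretId"))
end

sample_event = {
  "eventSource" => "secretsmanager.amazonaws.com",
  "eventName" => "GetSecretValue",
  "sourceIPAddress" => "203.0.113.10", # documentation-range address
  "requestParameters" => { "secretId" => "decoy/prod-db-admin" }
}

if honey_token_access?(sample_event)
  puts "ALERT: decoy secret read from #{sample_event['sourceIPAddress']}"
end
```

Because a decoy secret has no legitimate consumers, any match is worth investigating, which is what makes this signal low-noise compared with monitoring real secrets.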

During the detection and monitoring phase, GuardDuty identifies unusual activity patterns across your AWS accounts and workloads, Amazon Detective helps investigate these anomalies by analyzing root causes and plotting out the incident scope in an interactive way, while Security Hub centralizes security alerts and enables automated responses across your environment.

For incident response, containment, and recovery, Lambda and Step Functions help automate responses when security events occur. AWS Shield and WAF work together to provide real-time threat mitigation against denial-of-service type threats like distributed denial of service (DDoS), while Security Lake and Detective provide the necessary data and tools for conducting thorough forensic analysis. In 2024, AWS announced the AWS Security Incident Response service that uses automated monitoring and investigation through the AWS Customer Incident Response Team to prepare for, respond to, and recover from security events. You can use the service to augment your cloud-based security response function aligned with AWS security best practices.

By blocking malicious traffic, Shield and WAF provide real-time DDoS mitigation. AWS deception tactics could include redirecting threat actors to honeypots or deploying decoy Amazon Simple Storage Service (Amazon S3) files to enhance engagement strategies, like the honey token deployment and storage using Secrets Manager explained earlier in this post. Post incident, Security Lake and Detective assist in forensic analysis, while Security Hub and IAM policies refine security controls based on past exploit trends. MITRE Engage tactics can further evolve by analyzing honeypot interactions. By integrating these AWS security services, you can detect, prevent, and deceive threat actors effectively, strengthening your organization’s overall security posture. The following table maps MITRE lifecycle stages to AWS services and tools.

Lifecycle stage | AWS tools for MITRE ATT&CK (detect and map) | AWS tools for MITRE D3FEND (prevent and contain) | AWS tools for MITRE Engage (deceive and disrupt)
--- | --- | --- | ---
Threat modeling and risk assessment | Security Lake, Amazon Inspector, Macie, and Security Hub | IAM policies and AWS WAF | Secrets Manager and honey tokens
Detection and monitoring | GuardDuty, CloudTrail, and Security Hub | Detective, and auto-remediation using AWS services such as Amazon EventBridge, Lambda, and Step Functions | Fake IAM users and decoy Amazon S3 files
Incident response and containment | Step Functions, Lambda, GuardDuty, AWS Security Incident Response, and Detective | Auto-block using AWS WAF, multi-factor authentication (MFA) enforcement, and AWS Security Incident Response | Redirect exploits to honeypots
Post-incident and intelligence | Analyze and correlate logs with Security Lake, Amazon Athena, and Detective | IAM hardening and AWS Config | Adaptive deception traps

You can use Table 1 as a guide to understand how AWS services map to the various lifecycle stages in the incident response lifecycle. We will now demonstrate how GuardDuty, an AWS security service that continuously monitors your AWS accounts and workloads to provide automated threat detection, works in line with the MITRE ATT&CK framework.

GuardDuty: MITRE framework integration in action

In 2024, AWS worked extensively with MITRE to create new techniques and sub-techniques, and to update some of the existing detection objects in the MITRE ATT&CK cloud matrix. The work that AWS did with MITRE drew from real-world threat actor techniques performed against AWS customers and helped to provide more detailed information and specific detections on how threat actors abuse AWS services. For example, AWS threat detection teams observed a new sub-technique in the cloud environment (T1485.001 | Data Destruction: Lifecycle-Triggered Deletion) where threat actors could modify lifecycle policies for S3 buckets to delete all objects stored in the bucket. This technique, along with associated mitigations, detections, and references, was submitted back to the MITRE ATT&CK framework.

AWS security services such as AWS Security Incident Response and GuardDuty use MITRE ATT&CK to provide threat intelligence and detailed information on threats identified in an AWS account. You can examine how these AWS security services integrate with MITRE ATT&CK through a specific example. GuardDuty Extended Threat Detection helps customers with contextual threat detection in their AWS environment and aligns the signals with the MITRE ATT&CK lifecycle. GuardDuty automatically detects and correlates individual findings with connected resources to produce an attack sequence finding. Consider an attack sequence finding generated by GuardDuty detecting data compromise in your AWS account. We will use this as an example in this post.

To begin, the finding summary includes a textual description of the sequence of events and the TTPs detected, as shown in Figure 3. It also shows a summary of the observed TTP identifiers, AWS API calls, and IP addresses.

Figure 3: GuardDuty finding summary visible in the service console

As seen in Figure 4, every attack sequence finding highlights the signals and the MITRE tactic associated with the activity. The finding shown in Figure 4 shows the full lifecycle of the threat from discovery to impact.

Figure 4: Signals and MITRE tactics alignment

Diving deeper into each signal reveals the specific MITRE tactic associated with the activity and the technique identifier. Another interesting feature is that you can see the correlation between the AWS API call associated with the resources involved in the attack sequence and the user agent.

Figure 5 shows one of the signals associated with the attack sequence in the previous finding. A data exfiltration activity has been reported because of the nature of the AWS API call (s3:GetObject) and the user agent (Kali Linux) that was used to perform the activity. The level of detail for each signal is contextual based on the type of activity and tactic.

Figure 5: Details for a single signal within a GuardDuty attack sequence finding

Figure 6 shows another signal from the same finding, but in this case the level of detail includes the malicious IP lists and suspicious network activity detected in relation to the signal and associated resources.

Figure 6: Details of TTPs associated with an indicator within a GuardDuty attack sequence finding

This information can be downloaded in a JSON-formatted file. The information from the JSON document can be used to automate responses and remediations for the detections.

Conclusion

AWS security services work together to support the implementation of MITRE frameworks—ATT&CK for threat detection, D3FEND for preventative security, and Engage for threat actor engagement across the cybersecurity lifecycle. As demonstrated through the GuardDuty Extended Threat Detection example, these integrations provide customers with practical, actionable security capabilities across their AWS environment. The alignment of AWS security services with MITRE frameworks helps you build security operations using industry-standard methodologies, implement automated detection and response capabilities, maintain visibility across your AWS environment, and continuously enhance your security controls.

Through this integration of AWS security services with MITRE frameworks, you can implement comprehensive security operations that evolve with your organization’s business needs. To get started, visit the GuardDuty console to enable Extended Threat Detection, and explore our documentation to learn more about implementing these security capabilities in your AWS environment. Join us at AWS re:Inforce 2025 to learn more about AWS security services, including deep dives into the integration of Amazon GuardDuty with MITRE frameworks and hands-on workshops with AWS security experts.

If you have feedback about this post, submit comments in the Comments section below.

Pratima Singh

Pratima is a Security Specialist Solutions Architect with AWS, based out of Sydney, Australia. She is a security enthusiast who enjoys helping customers find innovative solutions to complex business challenges. Outside of work, Pratima enjoys going on long drives and spending time with her family at the beach.

Contributors

Special thanks to Dr. Stanley Barr, Senior Principal Scientist at MITRE, and Jess Modini, former Advisory Solutions Architect at AWS, who made significant contributions to this post.

Vendor-Agnostic Security: The Key To Smarter Risk Management

Post Syndicated from Michael Chroney original https://blog.rapid7.com/2025/05/13/vendor-agnostic-security-the-key-to-smarter-risk-management/

Security teams are investing in more tools than ever – but visibility into real risk is still elusive. Why? Because too many tools are locked inside closed ecosystems that don’t share data or context.

A vendor-agnostic security strategy changes that. It gives you the flexibility to integrate best-in-class tools, eliminate blind spots, and build a stronger, more agile cybersecurity program. It’s also a core enabler of modern frameworks like continuous threat exposure management (CTEM).

In this post, we’ll explore how a vendor-agnostic approach, powered by exposure assessment platforms (EAPs), helps you manage risk smarter – by unifying your attack surface and helping your team focus on what matters most.

The risks of vendor lock-in in cybersecurity

Security teams rely on a mix of tools from different vendors. According to the 2023 Gartner® Technology Adoption Roadmap for Large Enterprises Survey, “cybersecurity leaders indicated that on average their organizations had 43 tools in their cybersecurity product portfolios, and 5% of the leaders indicated their organizations had over 100 tools”. When those tools don’t speak the same language, you’re left with siloed data and a fragmented security strategy. That’s how blind spots are born – and how critical vulnerabilities slip through the cracks.

On top of that, being locked into a single vendor makes it costly and complicated to switch solutions, often forcing organizations to stick with suboptimal tools. Instead of driving innovation, you have limited options that lead to unnecessary spending on add-ons that may not fully meet your needs.

How a vendor-agnostic approach powers CTEM

CTEM is designed to be proactive, contextual, and continuous. It’s about knowing what exposures exist, which ones to prioritize, and how to remediate them – before attackers take advantage. To get the most out of CTEM, your security framework needs to be as flexible as the threats you’re defending against.

That means looking beyond a single vendor’s lens. A vendor-agnostic approach helps you:

  • Ingest data from anywhere across endpoints, cloud, identities, networks, threat intel, and more.
  • Correlate and prioritize with context – so your team can focus on what’s urgent and actionable.
  • Act faster across teams with remediation workflows that plug into existing tools and processes.

Unlocking CTEM with exposure assessment platforms

This is where EAPs make a real difference. These platforms unify and enrich data from across your hybrid environment, continuously identifying and prioritizing exposures – like vulnerabilities and misconfigurations – across a wide range of asset types. This gives security teams the context they need to act with clarity and confidence.

With a vendor-agnostic EAP, security teams can:

  • Continuously discover exposures across hybrid environments
  • Prioritize based on actual risk, not just raw severity scores
  • Correlate findings across sources to surface exploitable attack paths
  • Enable confident, fast decisions using context like business criticality and threat intel

It’s a centralized command center for everything that puts your organization at risk – and helps provide insight into what you can do about it.

Real-world example: Why risk context matters

Let’s say your team spots a misconfiguration in a firewall. On its own, that might trigger a red flag. But without deeper context, it’s hard to know if it’s actually a risk – or just noise.

Now imagine you can instantly cross-reference that misconfiguration with endpoint telemetry. If those endpoints aren’t exposed or already have compensating controls in place, you can safely deprioritize the issue. But if it opens the door to vulnerable assets? You’ve got the clarity (and urgency) to act.

That level of insight is only possible with a centralized, vendor-agnostic platform that brings together telemetry from across your environment. It filters out the noise and empowers your team to make informed, high-impact decisions.
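
The cross-referencing in this example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a product feature: the `Endpoint` record, its field names, and the `triage` function are hypothetical stand-ins for whatever telemetry your tools actually export.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Endpoint:
    host: str
    exposed_by_rule: bool       # reachable because of the misconfigured firewall rule (assumed field)
    vulnerable: bool            # endpoint telemetry reports a known vulnerability (assumed field)
    compensating_control: bool  # e.g. another control already mitigates the exposure (assumed field)


def triage(endpoints: List[Endpoint]) -> Tuple[str, List[str]]:
    """Decide whether a firewall misconfiguration needs action.

    Returns "act" plus the affected hosts if any exposed, vulnerable endpoint
    lacks a compensating control; otherwise returns "deprioritize".
    """
    at_risk = [
        e.host
        for e in endpoints
        if e.exposed_by_rule and e.vulnerable and not e.compensating_control
    ]
    decision = "act" if at_risk else "deprioritize"
    return decision, at_risk


telemetry = [
    Endpoint("build-01", exposed_by_rule=False, vulnerable=True, compensating_control=False),
    Endpoint("web-02", exposed_by_rule=True, vulnerable=True, compensating_control=True),
    Endpoint("db-03", exposed_by_rule=True, vulnerable=True, compensating_control=False),
]

decision, hosts = triage(telemetry)
print(decision, hosts)
```

Here only `db-03` is both exposed and unprotected, so the misconfiguration warrants action. Without that host in the telemetry, the same function would return "deprioritize".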

Key takeaways

Strengthen your organization’s overall security posture by adopting a vendor-agnostic strategy that helps your team:

  • Break free from vendor lock-in for more flexibility and control
  • Unify security tools to drive a more effective CTEM program
  • Enhance decision-making with EAPs
  • Extract more value from the tools and telemetry you already have

Build a future-ready cybersecurity strategy

Rapid7’s Exposure Command embraces a vendor-agnostic approach to provide a unified, transparent view of your security landscape. It aggregates telemetry and risk signals from across your existing tools – endpoint, cloud, identity, vulnerability management, and more – so you can:

  • Uncover blind spots hidden in fragmented vendor ecosystems
  • Correlate and contextualize risk with a unified, real-time view
  • Streamline decisions and accelerate remediation with automated workflows and prioritization

By moving to a vendor-agnostic approach with Rapid7, you’re not just reducing risk – you’re building a security program that’s resilient, scalable, and built for what’s next.


1 Gartner, Infrastructure Security Primer for 2025, John Watts, Franz Hinner, 29 January 2025 (For Gartner subscribers only)

GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved.
