Tag Archives: security

The anatomy of a ransomware event targeting data residing in Amazon S3

Post Syndicated from Megan O'Neil original https://aws.amazon.com/blogs/security/anatomy-of-a-ransomware-event-targeting-data-in-amazon-s3/

Ransomware events have significantly increased over the past several years and captured worldwide attention. Traditional ransomware events affect mostly infrastructure resources like servers, databases, and connected file systems. However, there are also non-traditional events that you may not be as familiar with, such as ransomware events that target data stored in Amazon Simple Storage Service (Amazon S3). There are important steps you can take to help prevent these events, and to identify possible ransomware events early so that you can take action to recover. The goal of this post is to help you learn about the AWS services and features that you can use to protect against ransomware events in your environment, and to investigate possible ransomware events if they occur.

Ransomware is a type of malware that bad actors can use to extort money from entities. The actors can use a range of tactics to gain unauthorized access to their target’s data and systems, including but not limited to taking advantage of unpatched software flaws, misuse of weak credentials or previous unintended disclosure of credentials, and using social engineering. In a ransomware event, a legitimate entity’s access to their data and systems is restricted by the bad actors, and a ransom demand is made for the safe return of these digital assets. There are several methods actors use to restrict or disable authorized access to resources including a) encryption or deletion, b) modified access controls, and c) network-based Denial of Service (DoS) attacks. In some cases, after the target’s data access is restored by providing the encryption key or transferring the data back, bad actors who have a copy of the data demand a second ransom—promising not to retain the data in order to sell or publicly release it.

In the next sections, we’ll describe several important stages of your response to a ransomware event in Amazon S3, including detection, response, recovery, and protection.

Observable activity

The most common initial access vector that leads to a ransomware event targeting data in Amazon S3, as observed by the AWS Customer Incident Response Team (CIRT), is unintended disclosure of AWS Identity and Access Management (IAM) access keys. Another likely cause is an application with a software flaw that is hosted on an Amazon Elastic Compute Cloud (Amazon EC2) instance with an attached IAM instance profile and associated permissions, and that is using Instance Metadata Service Version 1 (IMDSv1). In this case, an unauthorized user might be able to use AWS Security Token Service (AWS STS) session keys from the IAM instance profile for your EC2 instance to ransom objects in S3 buckets. In this post, we will focus on the most common scenario, which is unintended disclosure of static IAM access keys.

Detection

After a bad actor has obtained credentials, they iterate through AWS API actions to discover the type of access that the exposed IAM principal has been granted. Bad actors can do this in multiple ways, which can generate different levels of activity. This activity might alert your security teams because of an increase in API calls that result in errors. Other times, if a bad actor’s goal is to ransom S3 objects, then the API calls will be specific to Amazon S3. If access to Amazon S3 is permitted through the exposed IAM principal, then you might see an increase in API actions such as s3:ListBuckets, s3:GetBucketLocation, s3:GetBucketPolicy, and s3:GetBucketAcl.
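
As a quick check, you can search recent CloudTrail management events for this kind of enumeration activity from the AWS CLI. The following is a minimal sketch; the time range is a placeholder, and for larger volumes of data a dedicated trail queried with Amazon Athena (shown later in this post) scales better.

# List recent ListBuckets calls recorded in CloudTrail event history
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=ListBuckets \
  --start-time 2022-11-01T00:00:00Z \
  --end-time 2022-11-04T00:00:00Z \
  --query 'Events[].{time:EventTime,user:Username,source:EventSource}'

# Repeat for other discovery calls, for example GetBucketAcl or GetBucketPolicy
aws cloudtrail lookup-events \
  --lookup-attributes AttributeKey=EventName,AttributeValue=GetBucketAcl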

Analysis

In this section, we’ll describe where to find the log and metric data to help you analyze this type of ransomware event in more detail.

When a ransomware event targets data stored in Amazon S3, often the objects stored in S3 buckets are deleted, without the bad actor making copies. This is more like a data destruction event than a ransomware event where objects are encrypted.

There are several logs that will capture this activity. You can enable AWS CloudTrail event logging for Amazon S3 data, which allows you to review the activity logs to understand read and delete actions that were taken on specific objects.
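
If you haven’t already enabled S3 data event logging on a trail, you can do so with the AWS CLI. The following is an illustrative sketch; the trail name and bucket ARN are placeholders, and because logging data events for all objects adds cost, you should scope the selector to the buckets that matter to you.

# Log S3 object-level (data) events for a specific bucket on an existing trail
aws cloudtrail put-event-selectors \
  --trail-name my-trail \
  --event-selectors '[{
    "ReadWriteType": "All",
    "IncludeManagementEvents": true,
    "DataResources": [{
      "Type": "AWS::S3::Object",
      "Values": ["arn:aws:s3:::amzn-s3-demo-bucket/"]
    }]
  }]'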

In addition, if you have enabled Amazon CloudWatch metrics for Amazon S3 prior to the ransomware event, you can use the sum of the BytesDownloaded metric to gain insight into abnormal transfer spikes.
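
The sketch below shows one way to pull the BytesDownloaded request metric with the AWS CLI. It assumes that request metrics were already enabled on the bucket with a filter named EntireBucket; the bucket name and time range are placeholders.

# Sum of bytes downloaded from the bucket in 5-minute periods
aws cloudwatch get-metric-statistics \
  --namespace AWS/S3 \
  --metric-name BytesDownloaded \
  --dimensions Name=BucketName,Value=amzn-s3-demo-bucket Name=FilterId,Value=EntireBucket \
  --statistics Sum \
  --period 300 \
  --start-time 2022-11-03T00:00:00Z \
  --end-time 2022-11-04T00:00:00Z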

Another way to gain information is to use the region-DataTransfer-Out-Bytes metric, which shows the amount of data transferred from Amazon S3 to the internet. This metric is enabled by default and is associated with your AWS billing and usage reports for Amazon S3.

For more information, see the AWS CIRT team’s Incident Response Playbook: Ransom Response for S3, as well as the other publicly available response frameworks available at the AWS customer playbooks GitHub repository.

Response

Next, we’ll walk through how to respond to the unintended disclosure of IAM access keys. Based on the business impact, you may decide to create a second set of access keys to replace all legitimate use of those credentials so that legitimate systems are not interrupted when you deactivate the compromised access keys. You can deactivate the access keys by using the IAM console or through automation, as defined in your incident response plan. However, you also need to document specific details for the event within your secure and private incident response documentation so that you can reference them in the future. If the activity was related to the use of an IAM role or temporary credentials, you need to take an additional step and revoke any active sessions. To do this, in the IAM console, you choose the Revoke active session button, which will attach a policy that denies access to users who assumed the role before that moment. Then you can delete the exposed access keys.
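
For reference, here is a hedged example of deactivating and later deleting an exposed access key with the AWS CLI. The user name and access key ID are placeholders; deactivate first so you can re-enable the key if a legitimate workload breaks, and delete it only after your investigation is complete.

# Deactivate the exposed access key (reversible)
aws iam update-access-key \
  --user-name compromised-user \
  --access-key-id AKIAIOSFODNN7EXAMPLE \
  --status Inactive

# After confirming nothing legitimate depends on it, delete the key (irreversible)
aws iam delete-access-key \
  --user-name compromised-user \
  --access-key-id AKIAIOSFODNN7EXAMPLE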

In addition, you can use the AWS CloudTrail dashboard and event history (which includes 90 days of logs) to review the IAM-related activities by the compromised IAM user or role. Your analysis can show potential persistent access that might have been created by the bad actor. You can also use the IAM console to look at the IAM credential report (this report is updated every 4 hours) to review activity such as access key last used, user creation time, and password last used. Alternatively, you can use Amazon Athena to query the CloudTrail logs for the same information. See the following example of an Athena query that takes an IAM user Amazon Resource Name (ARN) and shows activity for a particular time frame.

SELECT eventtime, eventname, awsregion, sourceipaddress, useragent
FROM cloudtrail
WHERE useridentity.arn = 'arn:aws:iam::1234567890:user/Name' AND
-- Enter timeframe
(event_date >= '2022/08/04' AND event_date <= '2022/11/04')
ORDER BY eventtime ASC
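
The credential report mentioned above can also be generated and downloaded with the AWS CLI. This is a minimal sketch; the report is returned base64-encoded, the output file name is arbitrary, and generation may take a moment.

# Request a fresh credential report
aws iam generate-credential-report

# Download and decode the report to CSV for review
aws iam get-credential-report --query 'Content' --output text | base64 --decode > credential-report.csv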

Recovery

After you’ve removed access from the bad actor, you have multiple options to recover data, which we discuss in the following sections. Keep in mind that there is currently no undelete capability for Amazon S3, and AWS does not have the ability to recover data after a delete operation. In addition, many of the recovery options require configuration upon bucket creation.

S3 Versioning

Using versioning in S3 buckets is a way to keep multiple versions of an object in the same bucket, which gives you the ability to restore a particular version during the recovery process. You can use the S3 Versioning feature to preserve, retrieve, and restore every version of every object stored in your buckets. With versioning, you can recover more easily from both unintended user actions and application failures. Versioning-enabled buckets can help you recover objects from accidental deletion or overwrite. For example, if you delete an object, Amazon S3 inserts a delete marker instead of removing the object permanently. The previous version remains in the bucket and becomes a noncurrent version. You can restore the previous version. Versioning is not enabled by default and incurs additional costs, because you are maintaining multiple copies of the same object. For more information about cost, see the Amazon S3 pricing page.
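
The following sketch shows how you might enable S3 Versioning on an existing bucket with the AWS CLI; the bucket name is a placeholder.

# Enable versioning so that deletes create delete markers instead of removing data
aws s3api put-bucket-versioning \
  --bucket amzn-s3-demo-bucket \
  --versioning-configuration Status=Enabled

# Confirm the change
aws s3api get-bucket-versioning --bucket amzn-s3-demo-bucket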

AWS Backup

Using AWS Backup gives you the ability to create and maintain separate copies of your S3 data under separate access credentials that can be used to restore data during a recovery process. AWS Backup provides centralized backup for several AWS services, so you can manage your backups in one location. AWS Backup for Amazon S3 provides you with two options: continuous backups, which allow you to restore to any point in time within the last 35 days; and periodic backups, which allow you to retain data for a specified duration, including indefinitely. For more information, see Using AWS Backup for Amazon S3.

Protection

In this section, we’ll describe some of the preventative security controls available in AWS.

S3 Object Lock

You can add another layer of protection against object changes and deletion by enabling S3 Object Lock for your S3 buckets. With S3 Object Lock, you can store objects using a write-once-read-many (WORM) model, which can help prevent objects from being deleted or overwritten for a fixed amount of time or indefinitely.
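
As an illustration, the sketch below applies a default Object Lock retention rule with the AWS CLI. The bucket name, mode, and retention period are placeholders; Object Lock requires S3 Versioning, and compliance-mode retention cannot be shortened or removed once set, so test these settings carefully before applying them broadly.

# Apply a default 30-day compliance-mode retention to new objects in the bucket
aws s3api put-object-lock-configuration \
  --bucket amzn-s3-demo-bucket \
  --object-lock-configuration '{
    "ObjectLockEnabled": "Enabled",
    "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Days": 30}}
  }'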

AWS Backup Vault Lock

Similar to S3 Object Lock, which adds additional protection to S3 objects, if you use AWS Backup you can consider enabling AWS Backup Vault Lock, which enforces the same WORM setting for all the backups you store and create in a backup vault. AWS Backup Vault Lock helps you prevent inadvertent or malicious delete operations, even by the AWS account root user.
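
A hedged example of locking a backup vault with the AWS CLI follows; the vault name and retention values are placeholders. Note that after the changeable-for-days window passes, the lock configuration becomes immutable, so review the settings before that point.

# Enforce minimum and maximum retention on all recovery points in the vault
aws backup put-backup-vault-lock-configuration \
  --backup-vault-name my-s3-backup-vault \
  --min-retention-days 30 \
  --max-retention-days 365 \
  --changeable-for-days 3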

Amazon S3 Inventory

To make sure that your organization understands the sensitivity of the objects you store in Amazon S3, you should inventory your most critical and sensitive data across Amazon S3 and make sure that the appropriate bucket configuration is in place to protect and enable recovery of your data. You can use Amazon S3 Inventory to understand what objects are in your S3 buckets, and the existing configurations, including encryption status, replication status, and object lock information. You can use resource tags to label the classification and owner of the objects in Amazon S3, and take automated action and apply controls that match the sensitivity of the objects stored in a particular S3 bucket.
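
The sketch below configures a weekly S3 Inventory report with the AWS CLI. The bucket names, report ID, and optional fields are placeholders, and the destination bucket needs a bucket policy that allows Amazon S3 to write inventory reports to it.

# Publish a weekly CSV inventory, including encryption and Object Lock details
aws s3api put-bucket-inventory-configuration \
  --bucket amzn-s3-demo-bucket \
  --id weekly-inventory \
  --inventory-configuration '{
    "Destination": {"S3BucketDestination": {
      "Bucket": "arn:aws:s3:::amzn-s3-demo-inventory-destination",
      "Format": "CSV"
    }},
    "IsEnabled": true,
    "Id": "weekly-inventory",
    "IncludedObjectVersions": "Current",
    "Schedule": {"Frequency": "Weekly"},
    "OptionalFields": ["EncryptionStatus", "ObjectLockMode", "ReplicationStatus"]
  }'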

MFA delete

Another preventative control you can use is to enforce multi-factor authentication (MFA) delete in S3 Versioning. MFA delete provides added security and can help prevent the accidental or malicious permanent deletion of object versions, by requiring the user who initiates the delete action to prove physical or virtual possession of an MFA device with an MFA code. This adds an extra layer of friction and security to the delete action.
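
MFA delete can only be configured by the bucket owner’s root user through the CLI or API. The following sketch is illustrative; the bucket name, MFA device ARN, and code are placeholders.

# Run with root user credentials; supply the MFA device ARN and a current code
aws s3api put-bucket-versioning \
  --bucket amzn-s3-demo-bucket \
  --versioning-configuration Status=Enabled,MFADelete=Enabled \
  --mfa "arn:aws:iam::111122223333:mfa/root-account-mfa-device 123456"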

Use IAM roles for short-term credentials

Because many ransomware events arise from unintended disclosure of static IAM access keys, AWS recommends that you use IAM roles that provide short-term credentials, rather than using long-term IAM access keys. This includes using identity federation for your developers who are accessing AWS, using IAM roles for system-to-system access, and using IAM Roles Anywhere for hybrid access. For most use cases, you shouldn’t need to use static keys or long-term access keys. Now is a good time to audit and work toward eliminating the use of these types of keys in your environment. Consider taking the following steps:

  1. Create an inventory across all of your AWS accounts and identify each IAM user, when their credentials were last rotated and last used, and the attached policies (see the sketch after this list).
  2. Disable and delete all AWS account root access keys.
  3. Rotate the credentials and apply MFA to the user.
  4. Re-architect to take advantage of temporary role-based access, such as IAM roles or IAM Roles Anywhere.
  5. Review attached policies to make sure that you’re enforcing least privilege access, including removing wildcards from policies.
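
To support step 1, the hedged sketch below lists the access keys for a user and checks when each was last used; the user name and key ID are placeholders, and you would repeat this (or script it) across your accounts and users.

# List the access keys attached to a user
aws iam list-access-keys --user-name example-user

# Check when a specific key was last used, and for which service and Region
aws iam get-access-key-last-used --access-key-id AKIAIOSFODNN7EXAMPLE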

Server-side encryption with customer managed KMS keys

Another protection you can use is to implement server-side encryption with AWS Key Management Service (SSE-KMS) and use customer managed keys to encrypt your S3 objects. Using a customer managed key requires you to apply a specific key policy around who can encrypt and decrypt the data within your bucket, which provides an additional access control mechanism to protect your data. You can also centrally manage AWS KMS keys and audit their usage with an audit trail of when the key was used and by whom.
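
Here is a minimal sketch of setting SSE-KMS with a customer managed key as the default encryption for a bucket; the bucket name and KMS key ARN are placeholders.

# Use a customer managed KMS key as the bucket's default encryption
aws s3api put-bucket-encryption \
  --bucket amzn-s3-demo-bucket \
  --server-side-encryption-configuration '{
    "Rules": [{
      "ApplyServerSideEncryptionByDefault": {
        "SSEAlgorithm": "aws:kms",
        "KMSMasterKeyID": "arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab"
      },
      "BucketKeyEnabled": true
    }]
  }'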

GuardDuty protections for Amazon S3

You can enable Amazon S3 protection in Amazon GuardDuty. With S3 protection, GuardDuty monitors object-level API operations to identify potential security risks for data in your S3 buckets. This includes findings related to anomalous API activity and unusual behavior related to your data in Amazon S3, and can help you identify a security event early on.
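
You can turn on S3 protection for an existing GuardDuty detector from the CLI, as sketched below; the detector ID is a placeholder, and the feature name shown assumes the newer features-based API.

# Find your detector ID, then enable S3 protection on it
aws guardduty list-detectors
aws guardduty update-detector \
  --detector-id 12abc34d567e8fa901bc2d34e56789f0 \
  --features '[{"Name": "S3_DATA_EVENTS", "Status": "ENABLED"}]'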

Conclusion

In this post, you learned about ransomware events that target data stored in Amazon S3. By taking proactive steps, you can identify potential ransomware events quickly, and you can put in place additional protections to help you reduce the risk of this type of security event in the future.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Security, Identity and Compliance re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Authors

Megan O’Neil

Megan is a Principal Specialist Solutions Architect focused on threat detection and incident response. Megan and her team enable AWS customers to implement sophisticated, scalable, and secure solutions that solve their business challenges.

Karthik Ram

Karthik is a Senior Solutions Architect with Amazon Web Services based in Columbus, Ohio. He has a background in IT networking, infrastructure architecture, and security. At AWS, Karthik helps customers build secure and innovative cloud solutions, solving their business problems using data-driven approaches. Karthik’s area of depth is cloud security, with a focus on threat detection and incident response (TDIR).

Kyle Dickinson

Kyle is a Sr. Security Solutions Architect, specializing in threat detection and incident response. He focuses on working with customers to respond to security events with confidence. He also hosts AWS on Air: Lockdown, a livestream security show. When he’s not doing that, he enjoys hockey, BBQ, and trying to convince his Shih Tzu that he is, in fact, not a large dog.

Cloudflare’s handling of a bug in interpreting IPv4-mapped IPv6 addresses

Post Syndicated from Lucas Ferreira original https://blog.cloudflare.com/cloudflare-handling-bug-interpreting-ipv4-mapped-ipv6-addresses/

In November 2022, our bug bounty program received a critical and very interesting report. The report stated that certain types of DNS records could be used to bypass some of our network policies and connect to ports on the loopback address (e.g. 127.0.0.1) of our servers. This post will explain how we dealt with the report, how we fixed the bug, and the outcome of our internal investigation to see if the vulnerability had been previously exploited.

RFC 4291 defines ways to embed an IPv4 address into IPv6 addresses. One of the methods defined in the RFC is to use IPv4-mapped IPv6 addresses, which have the following format:

   |                80 bits               | 16 |      32 bits        |
   +--------------------------------------+----+---------------------+
   |0000..............................0000|FFFF|    IPv4 address     |
   +--------------------------------------+----+---------------------+

In IPv6 notation, the corresponding mapping for 127.0.0.1 is ::ffff:127.0.0.1 (RFC 4038).

The researcher was able to use DNS entries based on mapped addresses to bypass some of our controls and access ports on the loopback address or non-routable IPs.

This vulnerability was reported on November 27 to our bug bounty program. Our Security Incident Response Team (SIRT) was contacted, and incident response activities began shortly after the report was filed. A hotpatch was deployed three hours later to prevent exploitation of the bug.

Date               Time (UTC)   Activity
27 November 2022   20:42        Initial report to Cloudflare's bug bounty program
                   21:04        SIRT on-call is paged
                   21:15        SIRT manager on call starts working on the report
                   21:22        Incident declared, team is assembled, and debugging starts
                   23:20        A hotfix is ready and deployment starts
                   23:47        Team confirms that the hotfix is deployed and working
                   23:58        Team investigates whether other products are affected; Load Balancers and Spectrum are potential targets, and both are found to be unaffected by the vulnerability
28 November 2022   21:14        A permanent fix is ready
29 November 2022   21:34        Permanent fix is merged

Blocking exploitation

Immediately after the vulnerability was reported to our Bug Bounty program, the team began working to understand the issue and find ways to quickly block potential exploitation. It was determined that the fastest way to prevent exploitation would be to block the creation of the DNS records required to execute the attack.

The team then began to implement a patch to prevent the creation of DNS records that include IPv6 addresses that map loopback or RFC 1918 (internal) IPv4 addresses. The fix was fully deployed and confirmed three hours after the report was filed. We later realized that this change was insufficient because records hosted on external DNS servers could also be used in this attack.

The exploit

The exploit provided consisted of two parts: a DNS entry and a Cloudflare Worker. The DNS entry was an AAAA record pointing to ::ffff:127.0.0.1:

exploit.example.com AAAA ::ffff:127.0.0.1

The worker included the following code:

export default {
    async fetch(request) {
        const requestJson = await request.json()
        return fetch(requestJson.url, requestJson)
    }
}

The Worker was given a custom URL such as proxy.example.com.

With that setup, it was possible to make the worker attempt connections on the loopback interface of the server where it was running. The call would look like this:

curl https://proxy.example.com/json -d '{"url":"http://exploit.example.com:80/url_path"}'

The attack could then be scripted to attempt to connect to multiple ports on the server.
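
Using the curl call above, such a scan could be as simple as a small shell loop; this is an illustrative sketch built on the hypothetical proxy.example.com and exploit.example.com names from the report.

# Probe a range of ports on the loopback interface through the Worker
for port in 22 80 443 3306 6379 8080; do
  curl -s "https://proxy.example.com/json" \
    -d "{\"url\":\"http://exploit.example.com:${port}/\"}" \
    -o /dev/null -w "port ${port}: HTTP %{http_code}\n"
done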

It was also found that a similar setup could be used with other IPv4 addresses to attempt connections into internal services. In this case, the DNS entry would look like:

exploit.example.com AAAA ::ffff:10.0.0.1

This exploit would allow an attacker to connect to services running on the loopback interface of the server. If the attacker was able to bypass the security and authentication mechanisms of a service, it could impact the confidentiality and integrity of data. For services running on other servers, the attacker could also use the worker to attempt connections and map the services available over the network. As in most networks, Cloudflare’s network policies and ACLs must leave a few ports accessible, and an attacker could have reached those ports by using this exploit.

Investigation

We started an investigation to understand the root cause of the problem and created a proof-of-concept that allowed the team to debug the issue. At the same time, we started a parallel investigation to determine if the issue had been previously exploited.

It all happened when two bugs collided.

The first bug happened in our internal DNS system which is responsible for mapping hostnames to IP addresses of our customers’ origin servers (the DNS system). When the DNS system tried to answer a query for the DNS record from exploit.example.com, it serialized the IP as a string. The Golang net library used for DNS automatically converted the IP ::ffff:10.0.0.1 to string “10.0.0.1”. However, the DNS system still treated it as an IPv6 address. So a query response {ipv6: “10.0.0.1”} was returned.

The second bug was in our internal HTTP system (the proxy) which is responsible for forwarding HTTP traffic to customer’s origin servers. The bug happened in how the proxy validates this DNS response, {ipv6: “10.0.0.1”}. The proxy has two deny lists of IPs that are not allowed to be used, one for IPv4 and one for IPv6. These lists contain localhost IPs and private IPs. The bug was that the proxy system compared the address 10.0.0.1 against the IPv6 deny list because the address was in the “ipv6” section. Naturally the address didn’t match any entry in the deny list. So the address was allowed to be used as an origin IP address.

The second investigation team searched through the logs and found no evidence of previous exploitation of this vulnerability. The team also checked Cloudflare DNS for entries using IPv4-mapped IPv6 addresses and determined that all the existing entries had been used for testing purposes. As of now, there are no signs that this vulnerability could have been previously used against Cloudflare systems.

Remediating the vulnerability

To address this issue, we implemented a fix in the proxy service so that it validates an IP address against the deny list for the parsed address family, not the deny list for the IP family that the DNS API response claimed. We confirmed in both our test and production environments that the fix prevents the issue from happening again.

Beyond maintaining a bug bounty program, we regularly perform internal security reviews and hire third-party firms to audit the software we develop. But it is through our bug bounty program that we receive some of the most interesting and creative reports. Each report has helped us improve the security of our services. We invite those that find a security issue in any of Cloudflare’s services to report it to us through HackerOne.

Define a custom session duration and terminate active sessions in IAM Identity Center

Post Syndicated from Ron Cully original https://aws.amazon.com/blogs/security/define-a-custom-session-duration-and-terminate-active-sessions-in-iam-identity-center/

Managing access to accounts and applications requires a balance between delivering simple, convenient access and managing the risks associated with active user sessions. Based on your organization’s needs, you might want to make it simple for end users to sign in and to operate long enough to get their work done, without the disruptions associated with requiring re-authentication. You might also consider shortening the session to help meet your compliance or security requirements. At the same time, you might want to terminate active sessions that your users don’t need, such as sessions for former employees, sessions for which the user failed to sign out on a second device, or sessions with suspicious activity.

With AWS IAM Identity Center (successor to AWS Single Sign-On), you now have the option to configure the appropriate session duration for your organization’s needs while using new session management capabilities to look up active user sessions and revoke unwanted sessions.

In this blog post, I show you how to use these new features in IAM Identity Center. First, I walk you through how to configure the session duration for your IAM Identity Center users. Then I show you how to identify existing active sessions and terminate them.

What is IAM Identity Center?

IAM Identity Center helps you securely create or connect your workforce identities and manage their access centrally across AWS accounts and applications. IAM Identity Center is the recommended approach for workforce identities to access AWS resources. In IAM Identity Center, you can integrate with an external identity provider (IdP), such as Okta Universal Directory, Microsoft Azure Active Directory, or Microsoft Active Directory Domain Services, as an identity source or you can create users directly in IAM Identity Center. The service is built on the capabilities of AWS Identity and Access Management (IAM) and is offered at no additional cost.

IAM Identity Center sign-in and sessions

You can use IAM Identity Center to access applications and accounts and to get credentials for the AWS Management Console, AWS Command Line Interface (AWS CLI), and AWS SDK sessions. When you log in to IAM Identity Center through a browser or the AWS CLI, an AWS access portal session is created. When you federate into the console, IAM Identity Center uses the session duration setting on the permission set to control the duration of the session.

Note: The access portal session duration for IAM Identity Center differs from the IAM permission set session duration, which defines how long a user can access their AWS account through the AWS Management Console.

Before the release of the new session management feature, the AWS access portal session duration was fixed at 8 hours. Now you can configure the session duration for the AWS access portal in IAM Identity Center from 15 minutes to 7 days. The access portal session duration determines how long the user can access the portal, applications, and accounts, and run CLI commands without re-authenticating. If you have an external IdP connected to IAM Identity Center, the access portal session duration will be the lesser of either the session duration that you set in your IdP or the session duration defined in IAM Identity Center. Users can access accounts and applications until the access portal session expires and initiates re-authentication.

When users access accounts or applications through IAM Identity Center, it creates an additional session that is separate but related to the AWS access portal session. AWS CLI sessions use the AWS access portal session to access roles. The duration of console sessions is defined as part of the permission set that the user accessed. When a console session starts, it continues until the duration expires or the user ends the session. IAM Identity Center-enabled application sessions re-verify the AWS access portal session approximately every 60 minutes. These sessions continue until the AWS access portal session terminates, until another application-specific condition terminates the session, or until the user terminates the session.

To summarize:

  • After a user signs in to IAM Identity Center, they can access their assigned roles and applications for a fixed period, after which they must re-authenticate.
  • If a user accesses an assigned permission set, the user has access to the corresponding role for the duration defined in the permission set (or by the user terminating the session).
  • The AWS CLI uses the AWS access portal session to access roles. The AWS CLI refreshes the IAM permission set in the background. The CLI job continues to run until the access portal session expires.
  • If users access an IAM Identity Center-enabled application, the user can retain access to an application for up to an hour after the access portal session has expired.

Note: IAM Identity Center doesn’t currently support session management capabilities for Active Directory identity sources.

For more information about session management features, see Authentication sessions in the documentation.
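
As an illustration of the CLI behavior described above, the following sketch configures an AWS CLI profile against IAM Identity Center and signs in to create an access portal session; the profile name is a placeholder, and the configure command will prompt you for your own start URL and Region.

# One-time setup: associate a CLI profile with your IAM Identity Center instance
aws configure sso --profile my-sso-profile

# Start (or renew) the AWS access portal session used by the CLI
aws sso login --profile my-sso-profile

# Calls with this profile work until the access portal session expires
aws sts get-caller-identity --profile my-sso-profile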

Configure session duration

In this section, I show you how to configure the session duration for the AWS access portal in IAM Identity Center. You can choose a session duration between 15 minutes and 7 days.

Session duration is a global setting in IAM Identity Center. After you set the session duration, the maximum session duration applies to IAM Identity Center users.

To configure session duration for the AWS access portal:

  1. Open the IAM Identity Center console.
  2. In the left navigation pane, choose Settings.
  3. On the Settings page, choose the Authentication tab.
  4. Under Authentication, next to Session settings, choose Configure.
  5. For Configure session settings, choose a maximum session duration from the list of pre-defined session durations in the dropdown. To set a custom session duration, select Custom duration, enter the length for the session in minutes, and then choose Save.
Figure 1: Set access portal session duration

Congratulations! You have just modified the session duration for your users. This new duration will take effect on each user’s next sign-in.

Find and terminate AWS access portal sessions

With this new release, you can find active portal sessions for your IAM Identity Center users, and if needed, you can terminate the sessions. This can be useful in situations such as the following:

  • A user no longer works for your organization or was removed from projects that gave them access to applications or permission sets that they should no longer use.
  • If a device is lost or stolen, the user can contact you to end the session. This reduces the risk that someone will access the device and use the open session.

In these cases, you can find a user’s active sessions in the AWS access portal, select the session that you’re interested in, and terminate it. Depending on the situation, you might also want to deactivate sign-in for the user from the system before revoking the user’s session. You can deactivate sign-in for users in the IAM Identity Center console or in your third-party IdP.

If you first deactivate the user’s sign-in in your IdP, and then deactivate the user’s sign-in in IAM Identity Center, deactivation will take effect in IAM Identity Center without synchronization latency. However, if you deactivate the user in IAM Identity Center first, then it is possible that the IdP could activate the user again. By first deactivating the user’s sign-in in your IdP, you can prevent the user from signing in again when you revoke their session. This action is advisable when a user has left your organization and should no longer have access, or if you suspect a valid user’s credentials were stolen and you want to block access until you reset the user’s passwords.

Termination of the access portal session does not affect active permission set sessions that were started from the access portal. An IAM role session assumed from the access portal lasts as long as the duration specified in the permission set. For AWS CLI sessions, it can take up to an hour for the CLI session to terminate after the access portal session is terminated.

Tip: Activate multi-factor authentication (MFA) wherever possible. MFA offers an additional layer of protection to help prevent unauthorized individuals from gaining access to systems or data.

To manage active access portal sessions in the AWS access portal:

  1. Open the IAM Identity Center console.
  2. In the left navigation pane, choose Users.
  3. On the Users page, choose the username of the user whose sessions you want to manage. This takes you to a page with the user’s information.
  4. On the user’s page, choose the Active sessions tab. The number in parentheses next to Active sessions indicates the number of current active sessions for this user.
    Figure 2: View active access portal sessions

  5. Select the sessions that you want to delete, and then choose Delete session. A dialog box appears that confirms you’re deleting active sessions for this user.
    Figure 3: Delete selected active sessions

  6. Review the information in the dialog box, and if you want to continue, choose Delete session.

Conclusion

In this blog post, you learned how IAM Identity Center manages sessions, how to modify the session duration for the AWS access portal, and how to view, search, and terminate active access portal sessions. I also shared some tips on how to think about the appropriate session duration for your use case and related steps that you should take when terminating sessions for users who shouldn’t have permission to sign in again after their session has ended.

With this new feature, you now have more control over user session management. You can use the console to set configurable session lengths based on your organization’s security requirements and desired end-user experience, and you can also terminate sessions, enabling you to manage sessions that are no longer needed or potentially suspicious.

To learn more, see Manage IAM Identity Center integrated application sessions.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Ron Cully

Ron is a Principal Product Manager at AWS where he has led feature and roadmap planning for workforce identity products for over 6 years. He has over 25 years of experience leading networking and directory related product delivery. Ron is passionate about delivering solutions to help make it easier for you to migrate identity-aware workloads, simplify resource and application authorization, and give people a simple sign-in and access experience in the cloud.

Palak Arora

Palak is a Senior Product Manager at AWS Identity. She has over eight years of cybersecurity experience, with a specialization in the Identity and Access Management (IAM) domain. She has helped various customers across different sectors define their enterprise and customer IAM roadmap and strategy, and improve their overall technology risk landscape.

How to set up ongoing replication from your third-party secrets manager to AWS Secrets Manager

Post Syndicated from Laurens Brinker original https://aws.amazon.com/blogs/security/how-to-set-up-ongoing-replication-from-your-third-party-secrets-manager-to-aws-secrets-manager/

Secrets managers are a great tool to securely store your secrets and provide access to secret material to a set of individuals, applications, or systems that you trust. Across your environments, you might have multiple secrets managers hosted on different providers, which can increase the complexity of maintaining a consistent operating model for your secrets. In these situations, centralizing your secrets in a single source of truth, and replicating subsets of secrets across your other secrets managers, can simplify your operating model.

This blog post explains how you can use your third-party secrets manager as the source of truth for your secrets, while replicating a subset of these secrets to AWS Secrets Manager. By doing this, you will be able to use secrets that originate in, and are managed from, your third-party secrets manager in Amazon Web Services (AWS) applications or in AWS services that use Secrets Manager secrets.

I’ll demonstrate this approach in this post by setting up a sample open-source HashiCorp Vault to create and maintain secrets, and by creating a replication mechanism that enables you to use these secrets in AWS through AWS Secrets Manager. Although this post uses HashiCorp Vault as an example, you can also modify the replication mechanism to use secrets managers from other providers.

Important: This blog post is intended to provide guidance that you can use when planning and implementing a secrets replication mechanism. The examples in this post are not intended to be run directly in production, and you will need to take security hardening requirements into consideration before deploying this solution. As an example, HashiCorp provides tutorials on hardening production vaults.

You can use these links to navigate through this post:

Why and when to consider replicating secrets
Two approaches to secrets replication
Replicate secrets to AWS Secrets Manager with the pull model
Solution overview
Set up the solution
Step 1: Deploy the solution by using the AWS CDK toolkit
Step 2: Initialize the HashiCorp Vault
Step 3: Update the Vault connection secret
Step 4: (Optional) Set up email notifications for replication failures
Test your secret replication
Update a secret
Secret replication logic
Use your secret
Manage permissions
Options for customizing the sample solution

Why and when to consider replicating secrets

The primary use case for this post is for customers who are running applications on AWS and are currently using a third-party secrets manager to manage their secrets, hosted on-premises, in the AWS Cloud, or with a third-party provider. These customers typically have existing secrets vending processes, deployment pipelines, and procedures and processes around the management of these secrets. Customers with such a setup might want to keep their existing third-party secrets manager and have a set of secrets that are accessible to workloads running outside of AWS, as well as workloads running within AWS, by using AWS Secrets Manager.

Another use case is for customers who are in the process of migrating workloads to the AWS Cloud and want to maintain a (temporary) hybrid form of secrets management. By replicating secrets from an existing third-party secrets manager, customers can migrate their secrets to the AWS Cloud one-by-one, test that they work, integrate the secrets with the intended applications and systems, and once the migration is complete, remove the third-party secrets manager.

Additionally, some AWS services, such as Amazon Relational Database Service (Amazon RDS) Proxy, AWS Direct Connect MACsec, and AD Connector seamless join (Linux), only support secrets from AWS Secrets Manager. Customers can use secret replication if they have a third-party secrets manager and want to be able to use third-party secrets in services that require integration with AWS Secrets Manager. That way, customers don’t have to manage secrets in two places.

Two approaches to secrets replication

In this post, I’ll discuss two main models to replicate secrets from an external third-party secrets manager to AWS Secrets Manager: a pull model and a push model.

Pull model
In a pull model, you can use AWS services such as Amazon EventBridge and AWS Lambda to periodically call your external secrets manager to fetch secrets and updates to those secrets. The main benefit of this model is that it doesn’t require any major configuration to your third-party secrets manager. The AWS resources and mechanism used for pulling secrets must have appropriate permissions and network access to those secrets. However, there could be a delay between the time a secret is created and updated and when it’s picked up for replication, depending on the time interval configured between pulls from AWS to the external secrets manager.
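
For example, the scheduled pull could be wired up with an EventBridge rule that invokes a Lambda function on a fixed interval. This is a hedged sketch with placeholder names and ARNs, not the exact resources that the CDK project later in this post creates for you.

# Create a schedule that fires every 30 minutes
aws events put-rule \
  --name secret-replication-schedule \
  --schedule-expression "rate(30 minutes)"

# Point the schedule at the replication Lambda function
aws events put-targets \
  --rule secret-replication-schedule \
  --targets "Id"="replication-lambda","Arn"="arn:aws:lambda:eu-west-1:111122223333:function:secret-replication"

# Allow EventBridge to invoke the function
aws lambda add-permission \
  --function-name secret-replication \
  --statement-id allow-eventbridge \
  --action lambda:InvokeFunction \
  --principal events.amazonaws.com \
  --source-arn arn:aws:events:eu-west-1:111122223333:rule/secret-replication-schedule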

Push model
In this model, rather than periodically polling for updates, the external secrets manager pushes updates to AWS Secrets Manager as soon as a secret is added or changed. The main benefit of this is that there is minimal delay between secret creation, or secret updating, and when that data is available in AWS Secrets Manager. The push model also minimizes the network traffic required for replication since it’s a unidirectional flow. However, this model adds a layer of complexity to the replication, because it requires additional configuration in the third-party secrets manager. More specifically, the push model is dependent on the third-party secrets manager’s ability to run event-based push integrations with AWS resources. This will require a custom integration to be developed and managed on the third-party secrets manager’s side.

This blog post focuses on the pull model to provide an example integration that requires no additional configuration on the third-party secrets manager.

Replicate secrets to AWS Secrets Manager with the pull model

In this section, I’ll walk through an example of how to use the pull model to replicate your secrets from an external secrets manager to AWS Secrets Manager.

Solution overview

Figure 1: Secret replication architecture diagram

The architecture shown in Figure 1 consists of the following main steps, numbered in the diagram:

  1. A Cron expression in Amazon EventBridge invokes an AWS Lambda function every 30 minutes.
  2. To connect to the third-party secrets manager, the Lambda function, written in NodeJS, fetches a set of user-defined API keys belonging to the secrets manager from AWS Secrets Manager. These API keys have been scoped down to give read-only access to secrets that should be replicated, to adhere to the principle of least privilege. There is more information on this in Step 3: Update the Vault connection secret.
  3. The third step has two variants depending on where your third-party secrets manager is hosted:
    1. The Lambda function is configured to fetch secrets from a third-party secrets manager that is hosted outside AWS. This requires sufficient networking and routing to allow communication from the Lambda function.

      Note: Depending on the location of your third-party secrets manager, you might have to consider different networking topologies. For example, you might need to set up hybrid connectivity between your external environment and the AWS Cloud by using AWS Site-to-Site VPN or AWS Direct Connect, or both.

    2. The Lambda function is configured to fetch secrets from a third-party secrets manager running on Amazon Elastic Compute Cloud (Amazon EC2).

    Important: To simplify the deployment of this example integration, I’ll use a secrets manager hosted on a publicly available Amazon EC2 instance within the same VPC as the Lambda function (3b). This minimizes the additional networking components required to interact with the secrets manager. More specifically, the EC2 instance runs an open-source HashiCorp Vault. In the rest of this post, I’ll refer to the HashiCorp Vault’s API keys as Vault tokens.

  4. The Lambda function compares the version of the secret that it just fetched from the third-party secrets manager against the version of the secret that it has in AWS Secrets Manager (by tag). The function will create a new secret in AWS Secrets Manager if the secret does not exist yet, and will update it if there is a new version. The Lambda function will only consider secrets from the third-party secrets manager for replication if they match a specified prefix. For example, hybrid-aws-secrets/.
  5. In case there is an error synchronizing a secret, an email notification is sent to the email addresses that are subscribed to the Amazon Simple Notification Service (Amazon SNS) topic that the solution deploys. This sample application uses email notifications with Amazon SNS as an example, but you could also integrate with services like ServiceNow, Jira, Slack, or PagerDuty. Learn more about how to use webhooks to publish Amazon SNS messages to external services.

Set up the solution

In this section, I walk through deploying the pull model solution displayed in Figure 1 using the following steps:
Step 1: Deploy the solution by using the AWS CDK toolkit
Step 2: Initialize the HashiCorp Vault
Step 3: Update the Vault connection secret
Step 4: (Optional) Set up email notifications for replication failures

Step 1: Deploy the solution by using the AWS CDK toolkit

For this blog post, I’ve created an AWS Cloud Development Kit (AWS CDK) script, which can be found in this AWS GitHub repository. Using the AWS CDK, I’ve defined the infrastructure depicted in Figure 1 as Infrastructure as Code (IaC), written in TypeScript, ready for you to deploy and try out. The AWS CDK is an open-source software development framework that allows you to write your cloud application infrastructure as code using common programming languages such as TypeScript, Python, Java, Go, and so on.

Prerequisites:

To deploy the solution, the following should be in place on your system:

  1. Git
  2. Node (version 16 or higher)
  3. jq
  4. AWS CDK Toolkit. Install using npm (included in Node setup) by running npm install -g aws-cdk in a local terminal.
  5. An AWS access key ID and secret access key configured, because this setup will interact with your AWS account. See Configuration basics in the AWS Command Line Interface User Guide for more details.
  6. Docker installed and running on your machine

To deploy the solution

  1. Clone the CDK script for secret replication.
    git clone https://github.com/aws-samples/aws-secrets-manager-hybrid-secret-replication-from-hashicorp-vault.git SecretReplication
  2. Use the cloned project as the working directory.
    cd SecretReplication
  3. Install the required dependencies to deploy the application.
    npm install
  4. Adjust any configuration values for your setup in the cdk.json file. For example, you can adjust the secretsPrefix value to change which prefix is used by the Lambda function to determine the subset of secrets that should be replicated from the third-party secrets manager.
  5. Bootstrap your AWS environments with some resources that are required to deploy the solution. With correctly configured AWS credentials, run the following command.
    cdk bootstrap

    The core resources created by bootstrapping are an Amazon Elastic Container Registry (Amazon ECR) repository for the AWS Lambda Docker image, an Amazon Simple Storage Service (Amazon S3) bucket for static assets, and AWS Identity and Access Management (IAM) roles with corresponding IAM policies. You can find a full list of the resources by going to the CDKToolkit stack in AWS CloudFormation after the command has finished.

  6. Deploy the infrastructure.
    cdk deploy

    This command deploys the infrastructure shown in Figure 1 for you by using AWS CloudFormation. For a full list of resources, you can view the SecretsManagerReplicationStack in AWS CloudFormation after the deployment has completed.

Note: If your local environment does not have a terminal that allows you to run these commands, consider using AWS Cloud9 or AWS CloudShell.

After the deployment has finished, you should see an output in your terminal that looks like the one shown in Figure 2. If successful, the output provides the IP address of the sample HashiCorp Vault and its web interface.

Figure 2: AWS CDK deployment output

Step 2: Initialize the HashiCorp Vault

As part of the output of the deployment script, you will be given a URL to access the user interface of the open-source HashiCorp Vault. To simplify accessibility, the URL points to a publicly available Amazon EC2 instance running the HashiCorp Vault user interface as shown in step 3b in Figure 1.

Let’s look at the HashiCorp Vault that was just created. Go to the URL in your browser, and you should see the Raft Storage initialize page, as shown in Figure 3.

Figure 3: HashiCorp Vault Raft Storage initialize page

The vault requires an initial configuration to set up storage and get the initial set of root keys. You can go through the steps manually in the HashiCorp Vault’s user interface, but I recommend that you use the initialise_vault.sh script that is included as part of the SecretsManagerReplication project instead.

Using the HashiCorp Vault API, the initialization script will automatically do the following:

  1. Initialize the Raft storage to allow the Vault to store secrets locally on the instance.
  2. Create an initial set of unseal keys for the Vault. Importantly, for demo purposes, the script uses a single key share. For production environments, it’s recommended to use multiple key shares so that multiple shares are needed to reconstruct the root key, in case of an emergency.
  3. Store the unseal keys in init/vault_init_output.json in your project.
  4. Unseal the HashiCorp Vault by using the unseal keys generated earlier.
  5. Enable two key-value secrets engines:
    1. An engine named after the prefix that you’re using for replication, defined in the cdk.json file. In this example, this is hybrid-aws-secrets. We’re going to use the secrets in this engine for replication to AWS Secrets Manager.
    2. An engine called super-secret-engine, which you’re going to use to show that your replication mechanism does not have access to secrets outside the engine used for replication.
  6. Create three example secrets, two in hybrid-aws-secrets and one in super-secret-engine.
  7. Create a read-only policy that allows read-only access to only the secrets that should be replicated. You can review this policy in the init/replication-policy-payload.json file after the script has finished running.
  8. Create a new vault token that has the read-only policy attached so that it can be used by the AWS Lambda function later on to fetch secrets for replication.

To run the initialization script, go back to your terminal, and run the following command.
./initialise_vault.sh

The script will then ask you for the IP address of your HashiCorp Vault. Provide the IP address (excluding the port) and press Enter. Enter y so that the script creates a couple of sample secrets.

If everything is successful, you should see an output that includes tokens to access your HashiCorp Vault, similar to that shown in Figure 4.

Figure 4: Initialize HashiCorp Vault bash script output

The setup script outputs two tokens: one root token that you will use for administrator tasks, and a read-only token that will be used to read secret information for replication. Make sure that you can access these tokens while you’re following the rest of the steps in this post.

Note: The root token is only used for demonstration purposes in this post. In your production environments, you should not use root tokens for regular administrator actions. Instead, you should use scoped down roles depending on your organizational needs. In this case, the root token is used to highlight that there are secrets under super-secret-engine/ which are not meant for replication. These secrets cannot be seen, or accessed, by the read-only token.

Go back to your browser and refresh your HashiCorp Vault UI. You should now see the Sign in to Vault page. Sign in using the Token method, and use the root token. If you don’t have the root token in your terminal anymore, you can find it in the init/vault_init_output.json file.

After you sign in, you should see the overview page with three secrets engines enabled for you, as shown in Figure 5.

Figure 5: HashiCorp Vault secrets engines overview

If you explore hybrid-aws-secrets and super-secret-engine, you can see the secrets that were automatically created by the initialization script. For example, first-secret-for-replication, which contains a sample key-value secret with the key secrets and value manager.

If you navigate to Policies in the top navigation bar, you can also see the aws-replication-read-only policy, as shown in Figure 6. This policy provides read-only access to only the hybrid-aws-secrets path.

Figure 6: Read-only HashiCorp Vault token policy

The read-only policy is attached to the read-only token that we’re going to use in the secret replication Lambda function. This policy is important because it scopes down the access that the Lambda function obtains by using the token to a specific prefix meant for replication. For secret replication we only need to perform read operations. This policy ensures that we can read, but cannot add, alter, or delete any secrets in HashiCorp Vault using the token.

You can verify the read-only token permissions by signing into the HashiCorp Vault user interface using the read-only token rather than the root token. Now, you should only see hybrid-aws-secrets. You no longer have access to super-secret-engine, which you saw in Figure 5. If you try to create or update a secret, you will get a permission denied error.

Great! Your HashiCorp Vault is now ready to have its secrets replicated from hybrid-aws-secrets to AWS Secrets Manager. The next section describes a final configuration that you need to do to allow access to the secrets in HashiCorp Vault by the replication mechanism in AWS.

Step 3: Update the Vault connection secret

To allow secret replication, you must give the AWS Lambda function access to the HashiCorp Vault read-only token that was created by the initialization script. To do that, you need to update the vault-connection-secret that was initialized in AWS Secrets Manager as part of your AWS CDK deployment.

For demonstration purposes, I’ll show you how to do that by using the AWS Management Console, but you can also do it programmatically by using the AWS Command Line Interface (AWS CLI) or AWS SDK with the update-secret command.
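
If you prefer the CLI route, a hedged sketch follows. The secret name matches the one created by the CDK project; the token value is a placeholder, and if the secret’s JSON contains keys other than vaultToken, you should keep them intact when you write the new secret string.

# Inspect the current secret JSON so you can preserve its other keys
aws secretsmanager get-secret-value \
  --secret-id hybrid-aws-secrets/vault-connection-secret \
  --query 'SecretString' --output text

# Write the secret back with the read-only Vault token filled in
aws secretsmanager update-secret \
  --secret-id hybrid-aws-secrets/vault-connection-secret \
  --secret-string '{"vaultToken": "hvs.EXAMPLE-READ-ONLY-TOKEN"}'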

To update the Vault connection secret (console)

  1. In the AWS Management Console, go to AWS Secrets Manager > Secrets > hybrid-aws-secrets/vault-connection-secret.
  2. Under Secret Value, choose Retrieve Secret Value, and then choose Edit.
  3. Update the vaultToken value to contain the read-only token that was generated by the initialization script.
Figure 7: AWS Secrets Manager – Vault connection secret page

Step 4: (Optional) Set up email notifications for replication failures

As highlighted in Figure 1, the Lambda function will send an email by using Amazon SNS to a designated email address whenever one or more secrets fails to be replicated. You will need to configure the solution to use the correct email address. To do this, go to the cdk.json file at the root of the SecretReplication folder and adjust the notificationEmail parameter to an email address that you own. Once done, deploy the changes using the cdk deploy command. Within a few minutes, you’ll get an email requesting you to confirm the subscription. Going forward, you will receive an email notification if one or more secrets fails to replicate.

Test your secret replication

You can either wait up to 30 minutes for the Lambda function to be invoked automatically to replicate the secrets, or you can manually invoke the function.
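
To invoke the function from the CLI instead of the console, you can use a sketch like the following; the exact function name in your account will carry a generated suffix, so look it up first.

# Find the full name of the replication function
aws lambda list-functions \
  --query "Functions[?starts_with(FunctionName, 'SecretsManagerReplication-SecretReplication')].FunctionName"

# Invoke it with an empty test event and inspect the response
aws lambda invoke \
  --function-name <full-function-name-from-previous-command> \
  --cli-binary-format raw-in-base64-out \
  --payload '{}' \
  response.json
cat response.json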

To test your secret replication

  1. Open the AWS Lambda console and find the Secret Replication function (the name starts with SecretsManagerReplication-SecretReplication).
  2. Navigate to the Test tab.
  3. For the test event action, select Create new event, create an event using the default parameters, and then choose the Test button on the right-hand side, as shown in Figure 8.
Figure 8: AWS Lambda – Test page to manually invoke the function

This will run the function. You should see a success message, as shown in Figure 9. If this is the first time the Lambda function has been invoked, you will see in the results that two secrets have been created.

Figure 9: AWS Lambda function output

You can find the corresponding logs for the Lambda function invocation in a Log group in AWS CloudWatch matching the name /aws/lambda/SecretsManagerReplication-SecretReplicationLambdaF-XXXX.

To verify that the secrets were added, navigate to AWS Secrets Manager in the console, and in addition to the vault-connection-secret that you edited before, you should now also see the two new secrets with the same hybrid-aws-secrets prefix, as shown in Figure 10.

Figure 10: AWS Secrets Manager overview – New replicated secrets

For example, if you look at first-secret-for-replication, you can see the first version of the secret, with the secret key secrets and secret value manager, as shown in Figure 11.

Figure 11: AWS Secrets Manager – New secret overview showing values and version number

Success! You now have access to the secret values that originate from HashiCorp Vault in AWS Secrets Manager. Also, notice how there is a version tag attached to the secret. This is something that is necessary to update the secret, which you will learn more about in the next two sections.

Update a secret

It’s a recommended security practice to rotate secrets frequently. The Lambda function in this solution not only replicates secrets when they are created — it also periodically checks if existing secrets in AWS Secrets Manager should be updated when the third-party secrets manager (HashiCorp Vault in this case) has a new version of the secret. To validate that this works, you can manually update a secret in your HashiCorp Vault and observe its replication in AWS Secrets Manager in the same way as described in the previous section. You will notice that the version tag of your secret gets updated automatically when there is a new secret replication from the third-party secrets manager to AWS Secrets Manager.
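
If you prefer the command line over the Vault UI for this test, a hedged sketch using the Vault CLI follows; the address, token, and new key-value pair are placeholders, and a write-capable token (such as the demo root token) is required because the read-only replication token cannot write.

# Point the Vault CLI at the demo instance and authenticate with a write-capable token
export VAULT_ADDR="http://203.0.113.10:8200"
export VAULT_TOKEN="<root-token-from-init/vault_init_output.json>"

# Write a new version of the secret; the next replication run should pick it up
vault kv put hybrid-aws-secrets/first-secret-for-replication secrets=manager-v2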

Secret replication logic

This section explains the logic behind the secret replication in more detail. Consider the following diagram, which outlines the overall logic implemented in the Lambda function.

Figure 12: State diagram for secret replication logic

This diagram highlights that the Lambda function will first fetch a list of secret names from the HashiCorp Vault. Then, the function will get a list of secrets from AWS Secrets Manager, matching the prefix that was configured for replication. AWS Secrets Manager will return a list of the secrets that match this prefix and will also return their metadata and tags. Note that the function has not fetched any secret material yet.

Next, the function will loop through each secret name returned by HashiCorp Vault and check whether the secret exists in AWS Secrets Manager:

  • If there is no secret that matches that name, the function will fetch the secret material from HashiCorp Vault, including the version number, and create a new secret in AWS Secrets Manager. It will also add a version tag to the secret to match the version.
  • If there is already a secret matching that name in AWS Secrets Manager, the Lambda function will first fetch the metadata for that secret from HashiCorp Vault. This is required to get the version number of the secret, because the version number was not exposed when the function initially listed the secrets from HashiCorp Vault. If the secret version in HashiCorp Vault does not match the version value of the secret in AWS Secrets Manager (for example, the version in HashiCorp Vault is 2, and the version in AWS Secrets Manager is 1), an update is required to get the values synchronized again. Only now will the Lambda function fetch the actual secret material from HashiCorp Vault and update the secret in AWS Secrets Manager, including the version number in the tag.

The Lambda function fetches metadata about the secrets, rather than just fetching the secret material from HashiCorp Vault straight away. Typically, secrets don’t update very often. If this Lambda function is called every 30 minutes, then it should not have to add or update any secrets in the majority of invocations. By using metadata to determine whether you need the secret material to create or update secrets, you minimize the number of times secret material is fetched both from HashiCorp Vault and AWS Secrets Manager.
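
To make that decision logic more concrete, here is a minimal Python sketch of the create-or-update flow described above. The boto3 calls against AWS Secrets Manager are real, but the vault client and its methods (list_secret_names, get_secret, get_secret_metadata) are hypothetical stand-ins for whatever client library your third-party secrets manager provides; the hybrid-aws-secrets prefix and the version tag follow the conventions used in this post.

import boto3

secretsmanager = boto3.client("secretsmanager")
PREFIX = "hybrid-aws-secrets"

def get_replicated_secrets():
    """Return {secret_name: version_tag_value}, using metadata only (no secret material)."""
    paginator = secretsmanager.get_paginator("list_secrets")
    replicated = {}
    for page in paginator.paginate(Filters=[{"Key": "name", "Values": [PREFIX]}]):
        for secret in page["SecretList"]:
            tags = {tag["Key"]: tag["Value"] for tag in secret.get("Tags", [])}
            replicated[secret["Name"]] = tags.get("version")
    return replicated

def replicate(vault):  # 'vault' is a hypothetical client for the third-party secrets manager
    replicated = get_replicated_secrets()
    for name in vault.list_secret_names():            # names only, no secret material yet
        target = f"{PREFIX}/{name}"
        if target not in replicated:
            secret = vault.get_secret(name)            # material and version fetched together
            secretsmanager.create_secret(
                Name=target,
                SecretString=secret["value"],
                Tags=[{"Key": "version", "Value": str(secret["version"])}],
            )
        else:
            vault_version = str(vault.get_secret_metadata(name)["version"])
            if replicated[target] != vault_version:    # versions differ, so an update is needed
                secret = vault.get_secret(name)        # only now fetch the secret material
                secretsmanager.update_secret(SecretId=target, SecretString=secret["value"])
                secretsmanager.tag_resource(
                    SecretId=target,
                    Tags=[{"Key": "version", "Value": vault_version}],
                )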

Note: The AWS Lambda function has permissions to pull certain secrets from HashiCorp Vault. It is important to thoroughly review the Lambda code and any subsequent changes to it to prevent leakage of secrets. For example, you should ensure that the Lambda function does not get updated with code that unintentionally logs secret material outside the Lambda function.

Use your secret

Now that you have created and replicated your secrets, you can use them in your AWS applications or AWS services that are integrated with Secrets Manager. For example, you can use the secrets when you set up connectivity for a proxy in Amazon RDS, as follows.

To use a secret when creating a proxy in Amazon RDS

  1. Go to the Amazon RDS service in the console.
  2. In the left navigation pane, choose Proxies, and then choose Create Proxy.
  3. On the Connectivity tab, you can now select first-secret-for-replication or second-secret-for-replication, which were created by the Lambda function after replicating them from the HashiCorp Vault.
Figure 13: Amazon RDS Proxy – Example of using replicated AWS Secrets Manager secrets

It is important to remember that the consumers of the replicated secrets in AWS Secrets Manager will require scoped-down IAM permissions to use the secrets and AWS Key Management Service (AWS KMS) keys that were used to encrypt the secrets. For example, see Step 3: Create IAM role and policy on the Set up shared database connections with Amazon RDS Proxy page.

Manage permissions

Due to the sensitive nature of the secrets, it is important that you scope down the permissions to the minimum required, to prevent inadvertent access to your secrets. The setup adopts a least-privilege permission strategy, where only the necessary actions are explicitly allowed on the resources that are required for replication. However, you should review the permissions in accordance with your security standards.

In the architecture of this solution, there are two main places where you control access to the management of your secrets in Secrets Manager.

Lambda execution IAM role: The IAM role assumed by the Lambda function during execution contains the appropriate permissions for secret replication. There is an additional safety measure, which explicitly denies any action to a resource that is not required for the replication. For example, the Lambda function only has permission to publish to the Amazon SNS topic that is created for the failed replications, and will explicitly deny a publish action to any other topic. Even if someone accidentally adds an allow to the policy for a different topic, the explicit deny will still block this action.
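
The following Python (boto3) sketch illustrates that allow-plus-explicit-deny pattern for the SNS publish permission. The role name and topic ARN are placeholders, not the names used by the actual CDK stack.

import json
import boto3

iam = boto3.client("iam")

# Placeholder values; in the sample solution the role and topic are created by the CDK stack.
ROLE_NAME = "SecretReplicationLambdaRole"
REPLICATION_TOPIC_ARN = "arn:aws:sns:eu-west-1:111122223333:secret-replication-failures"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # Allow publishing only to the topic used for failed-replication notifications.
            "Effect": "Allow",
            "Action": "sns:Publish",
            "Resource": REPLICATION_TOPIC_ARN,
        },
        {
            # Explicitly deny publishing to every other topic. An explicit deny always wins,
            # even if a broader allow is later added to the role by mistake.
            "Effect": "Deny",
            "Action": "sns:Publish",
            "NotResource": REPLICATION_TOPIC_ARN,
        },
    ],
}

iam.put_role_policy(
    RoleName=ROLE_NAME,
    PolicyName="sns-publish-least-privilege",
    PolicyDocument=json.dumps(policy),
)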

AWS KMS key policy: When other services need to access the replicated secret in AWS Secrets Manager, they need permission to use the hybrid-aws-secrets-encryption-key AWS KMS key. You need to allow the principal these permissions through the AWS KMS key policy. Additionally, you can manage permissions to the AWS KMS key for the principal through an identity policy. For example, this is required when accessing AWS KMS keys across AWS accounts. See Permissions for AWS services in key policies and Specifying KMS keys in IAM policy statements in the AWS KMS Developer Guide.

Options for customizing the sample solution

The solution that was covered in this post provides an example for replication of secrets from HashiCorp Vault to AWS Secrets Manager using the pull model. This section contains additional customization options that you can consider when setting up the solution, or your own variation of it.

  1. Depending on the solution that you’re using, you might have access to different metadata attached to the secrets, which you can use to determine if a secret should be updated. For example, if you have access to data that represents a last_updated_datetime property, you could use this to infer whether or not a secret ought to be updated.
  2. It is a recommended practice to not use long-lived tokens wherever possible. In this sample, I used a static Vault token to give the Lambda function access to the HashiCorp Vault. Depending on the solution that you're using, you might be able to implement better authentication and authorization mechanisms. For example, HashiCorp Vault supports authentication with AWS IAM, rather than a static token.
  3. This post addressed the creation of secrets and updating of secrets, but for your production setup, you should also consider deletion of secrets. Depending on your requirements, you can choose to implement a strategy that works best for you to handle secrets in AWS Secrets Manager once the original secret in HashiCorp Vault has been deleted. In the pull model, you could consider removing a secret in AWS Secrets Manager if the corresponding secret in your external secrets manager is no longer present.
  4. In the sample setup, the same AWS KMS key is used to encrypt both the environment variables of the Lambda function, and the secrets in AWS Secrets Manager. You could choose to add an additional AWS KMS key (which would incur additional cost), to have two separate keys for these tasks. This would allow you to apply more granular permissions for the two keys in the corresponding KMS key policies or IAM identity policies that use the keys.

Conclusion

In this blog post, you’ve seen how you can approach replicating your secrets from an external secrets manager to AWS Secrets Manager. This post focused on a pull model, where the solution periodically fetched secrets from an external HashiCorp Vault and automatically created or updated the corresponding secret in AWS Secrets Manager. By using this model, you can now use your external secrets in your AWS Cloud applications or services that have an integration with AWS Secrets Manager.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Secrets Manager re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Laurens Brinker

Laurens is a Software Development Engineer working for AWS Security and is based in London. Previously, Laurens worked as a Security Solutions Architect at AWS, where he helped customers running their workloads securely in the AWS Cloud. Outside of work, Laurens enjoys cycling, a casual game of chess, and building open source projects.

CVE-2022-47929: traffic control noqueue no problem?

Post Syndicated from Frederick Lawler original https://blog.cloudflare.com/cve-2022-47929-traffic-control-noqueue-no-problem/

USER namespaces power the functionality of our favorite tools such as docker, podman, and kubernetes. We wrote about Linux namespaces back in June and explained them like this:

Most of the namespaces are uncontroversial, like the UTS namespace which allows the host system to hide its hostname and time. Others are complex but straightforward – NET and NS (mount) namespaces are known to be hard to wrap your head around. Finally, there is this very special, very curious USER namespace. USER namespace is special since it allows the (typically unprivileged) owner to operate as "root" inside it. It's a foundation for tools like Docker to not operate as true root, and for things like rootless containers.

Due to its nature, allowing unprivileged users access to USER namespace always carried a great security risk. With its help the unprivileged user can in fact run code that typically requires root. This code is often under-tested and buggy. Today we will look into one such case where USER namespaces are leveraged to exploit a kernel bug that can result in an unprivileged denial of service attack.

Enter Linux Traffic Control queue disciplines

In 2019, we were exploring leveraging Linux Traffic Control’s queue discipline (qdisc) to schedule packets for one of our services with the Hierarchy Token Bucket (HTB) classful qdisc strategy. Linux Traffic Control is a user-configured system to schedule and filter network packets. Queue disciplines are the strategies in which packets are scheduled. In particular, we wanted to filter and schedule certain packets from an interface, and drop others into the noqueue qdisc.

noqueue is a special case qdisc, such that packets are supposed to be dropped when scheduled into it. In practice, this is not the case. Linux handles noqueue such that packets are passed through and not dropped (for the most part). The documentation states as much. It also states that “It is not possible to assign the noqueue queuing discipline to physical devices or classes.” So what happens when we assign noqueue to a class?

Let’s write some shell commands to show the problem in action:

1. $ sudo -i
2. # dev=enp0s5
3. # tc qdisc replace dev $dev root handle 1: htb default 1
4. # tc class add dev $dev parent 1: classid 1:1 htb rate 10mbit
5. # tc qdisc add dev $dev parent 1:1 handle 10: noqueue

  1. First we need to log in as root because that gives us CAP_NET_ADMIN to be able to configure traffic control.
  2. We then assign a network interface to a variable. These can be found with ip a. Virtual interfaces can be located by calling ls /sys/devices/virtual/net. These will match with the output from ip a.
  3. Our interface is currently assigned to the pfifo_fast qdisc, so we replace it with the HTB classful qdisc and assign it the handle of 1:. We can think of this as the root node in a tree. The “default 1” configures this such that unclassified traffic will be routed directly through this qdisc which falls back to pfifo_fast queuing. (more on this later)
  4. Next we add a class to our root qdisc 1:, assign it to the first leaf node 1 of root 1: 1:1, and give it some reasonable configuration defaults.
  5. Lastly, we add the noqueue qdisc to our first leaf node in the hierarchy: 1:1. This effectively means traffic routed here will be scheduled to noqueue.

Assuming our setup executed without a hitch, we will receive something similar to this kernel panic:

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor instruction fetch in kernel mode
...
Call Trace:
<TASK>
htb_enqueue+0x1c8/0x370
dev_qdisc_enqueue+0x15/0x90
__dev_queue_xmit+0x798/0xd00
...
</TASK>

We know that the root user is responsible for setting qdisc on interfaces, so if root can crash the kernel, so what? We just do not apply noqueue qdisc to a class id of a HTB qdisc:

# dev=enp0s5
# tc qdisc replace dev $dev root handle 1: htb default 1
# tc class add dev $dev parent 1: classid 1:2 htb rate 10mbit // A
// B is missing, so anything not filtered into 1:2 will be pfifo_fast

Here, we leveraged the default case of HTB where we assign a class id 1:2 to be rate-limited (A), and implicitly did not set a qdisc to another class such as id 1:1 (B). Packets queued to (A) will be filtered to HTB_DIRECT and packets queued to (B) will be filtered into pfifo_fast.

Because we were not familiar with this part of the codebase, we notified the mailing lists and created a ticket. The bug did not seem all that important to us at that time.

Fast-forward to 2022, we are pushing USER namespace creation hardening. We extended the Linux LSM framework with a new LSM hook: userns_create to leverage eBPF LSM for our protections, and encourage others to do so as well. Recently while combing our ticket backlog, we rethought this bug. We asked ourselves, “can we leverage USER namespaces to trigger the bug?” and the short answer is yes!

Demonstrating the bug

The exploit can be performed with any classful qdisc that assumes a struct Qdisc.enqueue function to not be NULL (more on this later), but in this case, we are demonstrating just with HTB.

$ unshare -rU --net
$ dev=lo
$ tc qdisc replace dev $dev root handle 1: htb default 1
$ tc class add dev $dev parent 1: classid 1:1 htb rate 10mbit
$ tc qdisc add dev $dev parent 1:1 handle 10: noqueue
$ ping -I $dev -w 1 -c 1 1.1.1.1

We use the “lo” interface to demonstrate that this bug is triggerable with a virtual interface. This is important for containers because they are fed virtual interfaces most of the time, and not the physical interface. Because of that, we can use a container to crash the host as an unprivileged user, and thus perform a denial of service attack.

Why does that work?

To understand the problem a bit better, we need to look back to the original patch series, but specifically this commit that introduced the bug. Before this series, achieving noqueue on interfaces relied on a hack that would set a device qdisc to noqueue if the device had a tx_queue_len = 0. The commit d66d6c3152e8 (“net: sched: register noqueue qdisc”) circumvents this by explicitly allowing noqueue to be added with the tc command without needing to get around that limitation.

The way the kernel checks whether we are in a noqueue case is simply to check whether a qdisc has a NULL enqueue() function. Recall from earlier that noqueue does not necessarily drop packets in practice? When that check fails, the subsequent logic handles the noqueue functionality. To make the check fail, the author had to cheat a reassignment from noop_enqueue() to NULL by setting enqueue = NULL in the init function, which is called well after register_qdisc() at runtime.

Here is where classful qdiscs come into play. In this call path, the enqueue function is no longer NULL: it is now set to HTB's enqueue (in our example), so the kernel is allowed to enqueue the struct skb to a queue by calling htb_enqueue(). Once in there, HTB performs a lookup to pull in the qdisc assigned to a leaf node, and eventually attempts to queue the struct skb to the chosen qdisc, which ultimately reaches this function:

include/net/sch_generic.h

static inline int qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch,
				struct sk_buff **to_free)
{
	qdisc_calculate_pkt_len(skb, sch);
	return sch->enqueue(skb, sch, to_free); // sch->enqueue == NULL
}

We can see that the enqueueing process is fairly agnostic to physical/virtual interfaces. The permissions and validation checks are done when adding a queue to an interface, which is why classful qdiscs assume the child qdisc's enqueue function is not NULL. This knowledge leads us to a few solutions to consider.

Solutions

We had a few solutions ranging from what we thought was best to worst:

  1. Follow tc-noqueue documentation and do not allow noqueue to be assigned to a classful qdisc
  2. Instead of checking for NULL, check for struct noqueue_qdisc_ops, and reset noqueue back to noop_enqueue
  3. For each classful qdisc, check for NULL and fallback

While we ultimately went for the first option, "disallow noqueue for qdisc classes", the third option creates a lot of churn in the code and does not solve the problem completely: future qdisc implementations could forget that important check, and so could the maintainers. However, the reason for passing on the second option is a bit more interesting.

The reason we did not follow that approach is that we first needed to answer these questions:

Why not allow noqueue for classful qdiscs?

This contradicts the documentation. The documentation does have some precedent for not being totally followed in practice, but we will need to update that to reflect the current state. This is fine to do, but does not address the behavior change problem other than remove the NULL dereference bug.

What behavior changes if we do allow noqueue for qdiscs?

This is harder to answer because we need to determine what that behavior should be. Currently, when noqueue is applied as the root qdisc for an interface, the path is to essentially allow packets to be processed. Claiming a fallback for classes is a different matter. They may each have their own fallback rules, and how do we know what is the right fallback? Sometimes in HTB the fallback is pass-through with HTB_DIRECT, sometimes it is pfifo_fast. What about the other classes? Perhaps instead we should fall back to the default noqueue behavior as it is for root qdiscs?

We felt that going down this route would only add confusion and additional complexity to queuing. We could also make an argument that such a change could be considered a feature addition and not necessarily a bug fix. Suffice it to say, adhering to the current documentation seems to be the more appealing approach to prevent the vulnerability now, while something else can be worked out later.

Takeaways

First and foremost, apply this patch as soon as possible. Also consider hardening USER namespaces on your systems: set sysctl -w kernel.unprivileged_userns_clone=0 (which only lets root create USER namespaces on Debian kernels), set sysctl -w user.max_user_namespaces=[number] for a process hierarchy, or consider backporting these two patches: security_create_user_ns() and the SELinux implementation (now in Linux 6.1.x), which allow you to protect your systems with either eBPF or SELinux. If you are sure you're not using USER namespaces, in extreme cases you might consider turning the feature off with CONFIG_USERNS=n. This is just one example of many where namespaces are leveraged to perform an attack, and more are sure to crop up in varying levels of severity in the future.

Special thanks to Ignat Korchagin and Jakub Sitnicki for code reviews and helping demonstrate the bug in practice.

Reduce risk by implementing HttpOnly cookie authentication in Amazon API Gateway

Post Syndicated from Marc Borntraeger original https://aws.amazon.com/blogs/security/reduce-risk-by-implementing-httponly-cookie-authentication-in-amazon-api-gateway/

Some web applications need to protect their authentication tokens or session IDs from cross-site scripting (XSS). It’s an Open Web Application Security Project (OWASP) best practice for session management to store secrets in the browsers’ cookie store with the HttpOnly attribute enabled. When cookies have the HttpOnly attribute set, the browser will prevent client-side JavaScript code from accessing the value. This reduces the risk of secrets being compromised.

In this blog post, you’ll learn how to store access tokens and authenticate with HttpOnly cookies in your own workloads when using Amazon API Gateway as the client-facing endpoint. The tutorial in this post will show you a solution to store OAuth2 access tokens in the browser cookie store, and verify user authentication through Amazon API Gateway. This post describes how to use Amazon Cognito to issue OAuth2 access tokens, but the solution is not limited to OAuth2. You can use other kinds of tokens or session IDs.

The solution consists of two decoupled parts:

  1. OAuth2 flow
  2. Authentication check

Note: This tutorial takes you through detailed step-by-step instructions to deploy an example solution. If you prefer to deploy the solution with a script, see the api-gw-http-only-cookie-auth GitHub repository.

Prerequisites

You should not incur any costs when you deploy the application from this tutorial, because the services you're going to use are included in the AWS Free Tier. However, be aware that small charges may apply if you have other workloads running in your AWS account and exceed the free tier. Make sure to clean up the resources from this tutorial after deployment.

Solution architecture

This solution uses Amazon Cognito, Amazon API Gateway, and AWS Lambda to build a solution that persists OAuth2 access tokens in the browser cookie store. Figure 1 illustrates the solution architecture for the OAuth2 flow.

Figure 1: OAuth2 flow solution architecture

  1. A user authenticates by using Amazon Cognito.
  2. Amazon Cognito has an OAuth2 redirect URI pointing to your API Gateway endpoint and invokes the integrated Lambda function oAuth2Callback.
  3. The oAuth2Callback Lambda function makes a request to the Amazon Cognito token endpoint with the OAuth2 authorization code to get the access token.
  4. The Lambda function returns a response with the Set-Cookie header, instructing the web browser to persist the access token as an HttpOnly cookie. The browser will automatically interpret the Set-Cookie header, because it’s a web standard. HttpOnly cookies can’t be accessed through JavaScript—they can only be set through the Set-Cookie header.

After the OAuth2 flow, you are set up to issue and store access tokens. Next, you need to verify that users are authenticated before they are allowed to access your protected backend. Figure 2 illustrates how the authentication check is handled.

Figure 2: Authentication check solution architecture

  1. A user requests a protected backend resource. The browser automatically attaches HttpOnly cookies to every request, as defined in the web standard.
  2. The Lambda function oAuth2Authorizer acts as the Lambda authorizer for HTTP APIs. It validates whether requests are authenticated. If requests include the proper access token in the request cookie header, then it allows the request.
  3. API Gateway only passes through requests that are authenticated.

Amazon Cognito is not involved in the authentication check, because the Lambda function can validate the OAuth2 access tokens by using a JSON Web Token (JWT) validation check.

1. Deploying the OAuth2 flow

In this section, you’ll deploy the first part of the solution, which is the OAuth2 flow. The OAuth2 flow is responsible for issuing and persisting OAuth2 access tokens in the browser’s cookie store.

1.1. Create a mock protected backend

As shown in Figure 2, you need a backend to protect. For the purposes of this post, you create a mock backend by creating a simple Lambda function with a default response.

To create the Lambda function

  1. In the Lambda console, choose Create function.

    Note: Make sure to select your desired AWS Region.

  2. Choose Author from scratch as the option to create the function.
  3. In the Basic information section, as shown in Figure 3, enter or select the required values. For Function name, enter getProtectedResource.
  4. Choose Create function.
    Figure 3: Configuring the getProtectedResource Lambda function

The default Lambda function code returns a simple Hello from Lambda message, which is sufficient to demonstrate the concept of this solution.

1.2. Create an HTTP API in Amazon API Gateway

Next, you create an HTTP API by using API Gateway. Either an HTTP API or a REST API will work. In this example, choose HTTP API because it’s offered at a lower price point (for this tutorial you will stay within the free tier).

To create the API Gateway API

  1. In the API Gateway console, under HTTP API, choose Build.
  2. On the Create and configure integrations page, as shown in Figure 4, choose Add integration, then enter or select the following values:
    • Select Lambda.
    • For Lambda function, select the getProtectedResource Lambda function that you created in the previous section.
    • For API name, enter a name. In this example, I used MyApp.
    • Choose Next.
    Figure 4: Configuring API Gateway integrations and API name

  3. On the Configure routes page, as shown in Figure 5, enter or select the following values:
    • For Method, select GET.
    • For Resource path, enter / (a single forward slash).
    • For Integration target, select the getProtectedResource Lambda function.
    • Choose Next.
    Figure 5: Configuring API Gateway routes

  4. On the Configure stages page, keep all the default options, and choose Next.
  5. On the Review and create page, choose Create.
  6. Note down the value of Invoke URL, as shown in Figure 6.
    Figure 6: Note down the invoke URL

Now it’s time to test your API Gateway API. Paste the value of Invoke URL into your browser. You’ll see the following message from your Lambda function: Hello from Lambda.

1.3. Use Amazon Cognito

You’ll use Amazon Cognito user pools to create and maintain a user directory, and add sign-up and sign-in to your web application.

To create an Amazon Cognito user pool

  1. In the Amazon Cognito console, choose Create user pool.
  2. On the Authentication providers page, as shown in Figure 7, for Cognito user pool sign-in options, select Email, then choose Next.
    Figure 7: Configuring authentication providers

  3. In the Multi-factor authentication pane of the Configure Security requirements page, as shown in Figure 8, choose your MFA enforcement. For this example, choose No MFA to make it simpler for you to test your solution. However, in production for data sensitive workloads you should choose Require MFA – Recommended. Choose Next.
    Figure 8: Configuring MFA

  4. On the Configure sign-up experience page, keep all the default options and choose Next.
  5. On the Configure message delivery page, as shown in Figure 9, choose your email provider. For this example, choose Send email with Cognito to make it simple to test your solution. In production workloads, you should choose Send email with Amazon SES – Recommended. Choose Next.
    Figure 9: Configuring email

  6. In the User pool name section of the Integrate your app page, as shown in Figure 10, enter or select the following values:
    1. For User pool name, enter a name. In this example, I used MyUserPool.
      Figure 10: Configuring user pool name

    2. In the Hosted authentication pages section, as shown in Figure 11, select Use the Cognito Hosted UI.
      Figure 11: Configuring hosted authentication pages

    3. In the Domain section, as shown in Figure 12, for Domain type, choose Use a Cognito domain. For Cognito domain, enter a domain name. Note that domains in Cognito must be unique. Make sure to enter a unique name, for example by appending random numbers at the end of your domain name. For this example, I used https://http-only-cookie-secured-app.
      Figure 12: Configuring an Amazon Cognito domain

    4. In the Initial app client section, as shown in Figure 13, enter or select the following values:
      • For App type, keep the default setting Public client.
      • For App client name, enter a friendly name. In this example, I used MyAppClient.
      • For Client secret, keep the default setting Don’t generate a client secret.
      • For Allowed callback URLs, enter <API_GW_INVOKE_URL>/oauth2/callback, replacing <API_GW_INVOKE_URL> with the invoke URL you noted down from API Gateway in the previous section.
        Figure 13: Configuring the initial app client

    5. Choose Next.
  7. Choose Create user pool.

Next, you need to retrieve some Amazon Cognito information for later use.

To note down Amazon Cognito information

  1. In the Amazon Cognito console, choose the user pool you created in the previous steps.
  2. Under User pool overview, make note of the User pool ID value.
  3. On the App integration tab, under Cognito Domain, make note of the Domain value.
  4. Under App client list, make note of the Client ID value.
  5. Under App client list, choose the app client name you created in the previous steps.
  6. Under Hosted UI, make note of the Allowed callback URLs value.

Next, create the user that you will use in a later section of this post to run your test.

To create a user

  1. In the Amazon Cognito console, choose the user pool you created in the previous steps.
  2. Under Users, choose Create user.
  3. For Email address, enter [email protected]. For this tutorial, you don’t need to send out actual emails, so the email address does not need to actually exist.
  4. Choose Mark email address as verified.
  5. For password, enter a password you can remember (or even better: use a password generator).
  6. Remember the email and password for later use.
  7. Choose Create user.
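
Alternatively, you can create the test user programmatically. The sketch below is a rough Python (boto3) equivalent of the console steps; the user pool ID, email address, and password are placeholders that you should replace with your own values.

import boto3

cognito = boto3.client("cognito-idp")

USER_POOL_ID = "<your user pool ID>"       # placeholder
EMAIL = "testuser@example.com"             # placeholder test address
PASSWORD = "<a strong password>"           # placeholder

# Create the user with a verified email address and suppress the invitation email.
cognito.admin_create_user(
    UserPoolId=USER_POOL_ID,
    Username=EMAIL,
    UserAttributes=[
        {"Name": "email", "Value": EMAIL},
        {"Name": "email_verified", "Value": "true"},
    ],
    MessageAction="SUPPRESS",
)

# Set a permanent password so the user is not forced through the temporary-password flow.
cognito.admin_set_user_password(
    UserPoolId=USER_POOL_ID,
    Username=EMAIL,
    Password=PASSWORD,
    Permanent=True,
)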

1.4. Create the Lambda function oAuth2Callback

Next, you create the Lambda function oAuth2Callback, which is responsible for issuing and persisting the OAuth2 access tokens.

To create the Lambda function oAuth2Callback

  1. In the Lambda console, choose Create function.

    Note: Make sure to select your desired Region.

  2. For Function name, enter oAuth2Callback.
  3. For Runtime, select Node.js 16.x.
  4. For Architecture, select arm64.
  5. Choose Create function.

After you create the Lambda function, you need to add the code. Create a new folder on your local machine and open it with your preferred integrated development environment (IDE). Add the package.json and index.js files, as shown in the following examples.

package.json

{
  "name": "oAuth2Callback",
  "version": "0.0.1",
  "dependencies": {
    "axios": "^0.27.2",
    "qs": "^6.11.0"
  }
}

In a terminal at the root of your created folder, run the following command.

$ npm install

In the index.js example code that follows, be sure to replace the placeholders with your values.

index.js

const qs = require("qs");
const axios = require("axios").default;
exports.handler = async function (event) {
  const code = event.queryStringParameters?.code;
  if (code == null) {
    return {
      statusCode: 400,
      body: "code query param required",
    };
  }
  const data = {
    grant_type: "authorization_code",
    client_id: "<your client ID from Cognito>",
    // The redirect has already happened, but you still need to pass the URI for validation, so a valid oAuth2 access token can be generated
    redirect_uri: encodeURI("<your callback URL from Cognito>"),
    code: code,
  };
  // Every Cognito instance has its own token endpoints. For more information check the documentation: https://docs.aws.amazon.com/cognito/latest/developerguide/token-endpoint.html
  const res = await axios.post(
    "<your App Client Cognito domain>/oauth2/token",
    qs.stringify(data),
    {
      headers: {
        "Content-Type": "application/x-www-form-urlencoded",
      },
    }
  );
  return {
    statusCode: 302,
    // These headers are returned as part of the response to the browser.
    headers: {
      // The Location header tells the browser it should redirect to the root of the URL
      Location: "/",
      // The Set-Cookie header tells the browser to persist the access token in the cookie store
      "Set-Cookie": `accessToken=${res.data.access_token}; Secure; HttpOnly; SameSite=Lax; Path=/`,
    },
  };
};

Along with the HttpOnly attribute, you pass along two additional cookie attributes:

  • Secure – Indicates that cookies are only sent by the browser to the server when a request is made with the https: scheme.
  • SameSite – Controls whether or not a cookie is sent with cross-site requests, providing protection against cross-site request forgery attacks. You set the value to Lax because you want the cookie to be set when the user is forwarded from Amazon Cognito to your web application (which runs under a different URL).

For more information, see Using HTTP cookies on the MDN Web Docs site.

Afterwards, upload the code to the oAuth2Callback Lambda function as described in Upload a Lambda Function in the AWS Toolkit for VS Code User Guide.

1.5. Configure an OAuth2 callback route in API Gateway

Now, you configure API Gateway to use your new Lambda function through a Lambda proxy integration.

To configure API Gateway to use your Lambda function

  1. In the API Gateway console, under APIs, choose your API name. For me, the name is MyApp.
  2. Under Develop, choose Routes.
  3. Choose Create.
  4. Enter or select the following values:
    • For method, select GET.
    • For path, enter /oauth2/callback.
  5. Choose Create.
  6. Choose GET under /oauth2/callback, and then choose Attach integration.
  7. Choose Create and attach an integration.
    • For Integration type, choose Lambda function.
    • For Lambda function, choose oAuth2Callback from the last step.
  8. Choose Create.
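
If you'd rather script this route setup, a rough Python (boto3) sketch of the same configuration follows. The API ID, Region, account ID, and Lambda ARN are placeholders; the console normally grants the Lambda invoke permission for you, which is why it is added explicitly here.

import boto3

apigw = boto3.client("apigatewayv2")
lambda_client = boto3.client("lambda")

API_ID = "a1b2c3d4"                                        # placeholder HTTP API ID
REGION = "eu-west-1"                                       # placeholder
ACCOUNT_ID = "111122223333"                                # placeholder
LAMBDA_ARN = f"arn:aws:lambda:{REGION}:{ACCOUNT_ID}:function:oAuth2Callback"

# Create a Lambda proxy integration for the oAuth2Callback function.
integration = apigw.create_integration(
    ApiId=API_ID,
    IntegrationType="AWS_PROXY",
    IntegrationUri=LAMBDA_ARN,
    PayloadFormatVersion="2.0",
)

# Route GET /oauth2/callback to that integration.
apigw.create_route(
    ApiId=API_ID,
    RouteKey="GET /oauth2/callback",
    Target=f"integrations/{integration['IntegrationId']}",
)

# Allow API Gateway to invoke the Lambda function (the console adds this for you).
lambda_client.add_permission(
    FunctionName="oAuth2Callback",
    StatementId="apigw-invoke-oauth2-callback",
    Action="lambda:InvokeFunction",
    Principal="apigateway.amazonaws.com",
    SourceArn=f"arn:aws:execute-api:{REGION}:{ACCOUNT_ID}:{API_ID}/*/*/oauth2/callback",
)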

Your route configuration in API Gateway should now look like Figure 14.

Figure 14: Routes for API Gateway

2. Testing the OAuth2 flow

Now that you have the components in place, you can test your OAuth2 flow. You test the OAuth2 flow by invoking the login on your browser.

To test the OAuth2 flow

  1. In the Amazon Cognito console, choose your user pool name. For me, the name is MyUserPool.
  2. Under the navigation tabs, choose App integration.
  3. Under App client list, choose your app client name. For me, the name is MyAppClient.
  4. Choose View Hosted UI.
  5. In the newly opened browser tab, open your developer tools, so you can inspect the network requests.
  6. Log in with the email address and password you set in the previous section. Change your password, if you’re asked to do so. You can also choose the same password as you set in the previous section.
  7. You should see your Hello from Lambda message.

To test that the cookie was accurately set

  1. Check your browser network tab in the browser developer settings. You’ll see the /oauth2/callback request, as shown in Figure 15.
    Figure 15: Callback network request

    The response headers should include a set-cookie header, as you specified in your Lambda function. With the set-cookie header, your OAuth2 access token is set as an HttpOnly cookie in the browser, and access is prohibited from any client-side code.

  2. Alternatively, you can inspect the cookie in the browser cookie storage, as shown in Figure 16.

  3. If you want to retry the authentication, navigate in your browser to your Amazon Cognito domain that you chose in the previous section and clear all site data in the browser developer tools. Do the same with your API Gateway invoke URL. Now you can restart the test with a clean state.

3. Deploying the authentication check

In this section, you’ll deploy the second part of your application: the authentication check. The authentication check makes it so that only authenticated users can access your protected backend. The authentication check works with the HttpOnly cookie, which is stored in the user’s cookie store.

3.1. Create the Lambda function oAuth2Authorizer

This Lambda function checks that requests are authenticated.

To create the Lambda function

  1. In the Lambda console, choose Create function.

    Note: Make sure to select your desired Region.

  2. For Function name, enter oAuth2Authorizer.
  3. For Runtime, select Node.js 16.x.
  4. For Architecture, select arm64.
  5. Choose Create function.

After you create the Lambda function, you need to add the code. Create a new folder on your local machine and open it with your preferred IDE. Add the package.json and index.js files as shown in the following examples.

package.json

{
  "name": "oAuth2Authorizer",
  "version": "0.0.1",
  "dependencies": {
    "aws-jwt-verify": "^3.1.0"
  }
}

In a terminal at the root of your created folder, run the following command.

$ npm install

In the index.js example code, be sure to replace the placeholders with your values.

index.js

const { CognitoJwtVerifier } = require("aws-jwt-verify");
function getAccessTokenFromCookies(cookiesArray) {
  // cookieStr contains the full cookie definition string: "accessToken=abc"
  for (const cookieStr of cookiesArray) {
    const cookieArr = cookieStr.split("accessToken=");
    // After splitting you should get an array with 2 entries: ["", "abc"] - Or only 1 entry in case it was a different cookie string: ["test=test"]
    if (cookieArr[1] != null) {
      return cookieArr[1]; // Returning only the value of the access token without cookie name
    }
  }
  return null;
}
// Create the verifier outside the Lambda handler (= during cold start),
// so the cache can be reused for subsequent invocations. Then, only during the
// first invocation, will the verifier actually need to fetch the JWKS.
const verifier = CognitoJwtVerifier.create({
  userPoolId: "<your user pool ID from Cognito>",
  tokenUse: "access",
  clientId: "<your client ID from Cognito>",
});
exports.handler = async (event) => {
  if (event.cookies == null) {
    console.log("No cookies found");
    return {
      isAuthorized: false,
    };
  }
  // Cookies array looks something like this: ["accessToken=abc", "otherCookie=Random Value"]
  const accessToken = getAccessTokenFromCookies(event.cookies);
  if (accessToken == null) {
    console.log("Access token not found in cookies");
    return {
      isAuthorized: false,
    };
  }
  try {
    await verifier.verify(accessToken);
    return {
      isAuthorized: true,
    };
  } catch (e) {
    console.error(e);
    return {
      isAuthorized: false,
    };
  }
};

After you add the package.json and index.js files, upload the code to the oAuth2Authorizer Lambda function as described in Upload a Lambda Function in the AWS Toolkit for VS Code User Guide.

3.2. Configure the Lambda authorizer in API Gateway

Next, you configure your authorizer Lambda function to protect your backend. This way you control access to your HTTP API.

To configure the authorizer Lambda function

  1. In the API Gateway console, under APIs, choose your API name. For me, the name is MyApp.
  2. Under Develop, choose Routes.
  3. Under / (a single forward slash) GET, choose Attach authorization.
  4. Choose Create and attach an authorizer.
  5. Choose Lambda.
  6. Enter or select the following values:
    • For Name, enter oAuth2Authorizer.
    • For Lambda function, choose oAuth2Authorizer.
    • Clear Authorizer caching. For this tutorial, you disable authorizer caching to make testing simpler. See the section Bonus: Enabling authorizer caching for more information about enabling caching to increase performance.
    • Under Identity sources, choose Remove.

      Note: Identity sources are ignored for your Lambda authorizer. These are only used for caching.

    • Choose Create and attach.
  7. Under Develop, choose Routes to inspect all routes.

Now your API Gateway routes should be configured as shown in Figure 17.

Figure 17: API Gateway route configuration

4. Testing the OAuth2 authorizer

You did it! From your last test, you should still be authenticated. So, if you open the API Gateway Invoke URL in your browser, you'll be greeted by your protected backend.

In case you are not authenticated anymore, you’ll have to follow the steps again from the section Testing the OAuth2 flow to authenticate.

When you inspect the HTTP request that your browser makes in the developer tools as shown in Figure 18, you can see that authentication works because the HttpOnly cookie is automatically attached to every request.

Figure 18: Browser requests include HttpOnly cookies

To verify that your authorizer Lambda function works correctly, paste the same Invoke URL you noted previously in an incognito window. Incognito windows do not share the cookie store with your browser session, so you see a {"message":"Forbidden"} error message with HTTP response code 403 – Forbidden.
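
You can also reproduce both cases from a script. The following Python sketch (using the requests library) sends one request without a cookie and one with a manually supplied accessToken cookie; the invoke URL and the token value are placeholders.

import requests

INVOKE_URL = "https://a1b2c3d4.execute-api.eu-west-1.amazonaws.com"   # placeholder invoke URL

# Without the cookie, the Lambda authorizer rejects the request.
anonymous = requests.get(INVOKE_URL)
print(anonymous.status_code, anonymous.text)        # expected: 403 {"message":"Forbidden"}

# With a valid access token in the accessToken cookie, the request is allowed.
authenticated = requests.get(
    INVOKE_URL,
    cookies={"accessToken": "<paste a valid access token here>"},     # placeholder token
)
print(authenticated.status_code, authenticated.text)  # expected: 200 and the Hello from Lambda message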

Cleanup

Delete all unwanted resources to avoid incurring costs.

To delete the Amazon Cognito domain and user pool

  1. In the Amazon Cognito console, choose your user pool name. For me, the name is MyUserPool.
  2. Under the navigation tabs, choose App integration.
  3. Under Domain, choose Actions, then choose Delete Cognito domain.
  4. Confirm by entering your custom Amazon Cognito domain, and choose Delete.
  5. Choose Delete user pool.
  6. Confirm by entering your user pool name (in my case, MyUserPool), and then choose Delete.

To delete your API Gateway resource

  1. In the API Gateway console, select your API name. For me, the name is MyApp.
  2. Under Actions, choose Delete and confirm your deletion.

To delete the AWS Lambda functions

  1. In the Lambda console, select all three of the Lambda functions you created.
  2. Under Actions, choose Delete and confirm your deletion.

Bonus: Enabling authorizer caching

As mentioned earlier, you can enable authorizer caching to help improve your performance. When caching is enabled for an authorizer, API Gateway uses the authorizer’s identity sources as the cache key. If a client specifies the same parameters in identity sources within the configured Time to Live (TTL), then API Gateway uses the cached authorizer result, rather than invoking your Lambda function.

To enable caching, your authorizer must have at least one identity source. To cache by the cookie request header, you specify $request.header.cookie as the identity source. Be aware that caching will be affected if you pass along additional HttpOnly cookies apart from the access token.
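
If you want to switch the authorizer to cached mode later without clicking through the console, a minimal Python (boto3) sketch along these lines updates it in place. The API ID, authorizer ID, and TTL are placeholders.

import boto3

apigw = boto3.client("apigatewayv2")

API_ID = "a1b2c3d4"        # placeholder; look it up with get_apis()
AUTHORIZER_ID = "e5f6g7"   # placeholder; look it up with get_authorizers(ApiId=API_ID)

apigw.update_authorizer(
    ApiId=API_ID,
    AuthorizerId=AUTHORIZER_ID,
    # Cache on the cookie request header, so identical cookies reuse the cached result.
    IdentitySource=["$request.header.cookie"],
    # Cache authorizer results for five minutes.
    AuthorizerResultTtlInSeconds=300,
)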

For more information, see Working with AWS Lambda authorizers for HTTP APIs in the Amazon API Gateway Developer Guide.

Conclusion

In this blog post, you learned how to implement authentication by using HttpOnly cookies. You used Amazon API Gateway and AWS Lambda to persist and validate the HttpOnly cookies, and you used Amazon Cognito to issue OAuth2 access tokens. If you want to try an automated deployment of this solution with a script, see the api-gw-http-only-cookie-auth GitHub repository.

The application of this solution to protect your secrets from potential cross-site scripting (XSS) attacks is not limited to OAuth2. You can protect other kinds of tokens, sessions, or tracking IDs with HttpOnly cookies.

In this solution, you used Node.js for your Lambda functions to implement authentication. But HttpOnly cookies are widely supported by many programming frameworks. You can find more implementation options on the OWASP Secure Cookie Attribute page.

Although this blog post gives you a tutorial on how to implement HttpOnly cookie authentication in API Gateway, it may not meet all your security and functional requirements. Make sure to check your business requirements and talk to your stakeholders before you adopt techniques from this blog post.

Furthermore, it’s a good idea to continuously test your web application, so that cookies are only set with your approved security attributes. For more information, see the OWASP Testing for Cookies Attributes page.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Amazon API Gateway re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Marc Borntraeger

Marc is a Solutions Architect in healthcare, based in Zurich, Switzerland. He helps security-sensitive customers such as hospitals to re-innovate themselves with AWS.

Visualize AWS WAF logs with an Amazon CloudWatch dashboard

Post Syndicated from Diana Alvarado original https://aws.amazon.com/blogs/security/visualize-aws-waf-logs-with-an-amazon-cloudwatch-dashboard/

AWS WAF is a web application firewall service that helps you protect your applications from common exploits that could affect your application’s availability and your security posture. One of the most useful ways to detect and respond to malicious web activity is to collect and analyze AWS WAF logs. You can perform this task conveniently by sending your AWS WAF logs to Amazon CloudWatch Logs and visualizing them through an Amazon CloudWatch dashboard.

In this blog post, I’ll show you how to use Amazon CloudWatch to monitor and analyze AWS WAF activity using the options in CloudWatch metrics, Contributor Insights, and Logs Insights. I’ll also walk you through how to deploy this solution in your own AWS account by using an AWS CloudFormation template.

Prerequisites

This blog post builds on the concepts introduced in the blog post Analyzing AWS WAF Logs in Amazon CloudWatch Logs. There we introduced how to natively set up AWS WAF logging to Amazon CloudWatch logs, and discussed the basic options that are available for visualizing and analyzing the data provided in the logs.

The only AWS services that you need to turn on for this solution are Amazon CloudWatch and AWS WAF. The solution assumes that you’ve previously set up AWS WAF log delivery to Amazon CloudWatch Logs. If you have not done so, follow the instructions for AWS WAF logging destinations – CloudWatch Logs.
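
If you still need to enable log delivery and prefer to do it programmatically, the following Python (boto3) sketch shows the equivalent API calls. The web ACL ARN, Region, and log group name are placeholders; note that CloudWatch Logs destinations for AWS WAF must use a log group whose name starts with aws-waf-logs-.

import boto3

logs = boto3.client("logs")
wafv2 = boto3.client("wafv2")
sts = boto3.client("sts")

WEB_ACL_ARN = "arn:aws:wafv2:us-east-1:111122223333:regional/webacl/MyWebACL/EXAMPLE-ID"  # placeholder
LOG_GROUP_NAME = "aws-waf-logs-myapp"   # the aws-waf-logs- prefix is required
REGION = "us-east-1"                    # placeholder

logs.create_log_group(logGroupName=LOG_GROUP_NAME)

account_id = sts.get_caller_identity()["Account"]
log_group_arn = f"arn:aws:logs:{REGION}:{account_id}:log-group:{LOG_GROUP_NAME}"

# Point AWS WAF logging for the web ACL at the CloudWatch Logs log group.
wafv2.put_logging_configuration(
    LoggingConfiguration={
        "ResourceArn": WEB_ACL_ARN,
        "LogDestinationConfigs": [log_group_arn],
    }
)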

You will need to provide the following parameters for the CloudFormation template:

  • CloudWatch log group name for the AWS WAF logs
  • The AWS Region for the logs
  • The name of the AWS WAF web access control list (web ACL)

Solution overview

The architecture of the solution is outlined in Figure 1. The solution takes advantage of the native integration available between AWS WAF and CloudWatch, which simplifies the setup and management of this solution.

Figure 1: Solution architecture

In the solution, the logs are sent to CloudWatch (when you enable log delivery). From there, they’re ready to be consumed by all the different service options that CloudWatch offers, including the ones that we’ll use in this solution: CloudWatch Logs Insights and Contributor Insights.

Deploy the solution

Choose the following Launch stack button to launch the CloudFormation stack in your account.

Launch Stack

You’ll be redirected to the CloudFormation service in the AWS US East (N. Virginia) Region, which is the default Region to deploy this solution, although this can vary depending on where your web ACL is located. You can change the Region as preferred. The template will spin up multiple cloud resources, such as the following:

  • CloudWatch Logs Insights queries
  • CloudWatch Contributor Insights visuals
  • CloudWatch dashboard

The solution is quickly deployed to your account and is ready to use in less than 30 minutes. You can use the solution when the status of the stack changes to CREATE_COMPLETE.

As a measure to control costs, you can also choose whether to create the Contributor Insights rules and enable them by default. For more information on costs, see the Cost considerations section later in this post.

Explore and validate the dashboard

When the CloudFormation stack is complete, you can choose the Outputs tab in the CloudFormation console and then choose the dashboard link. This will take you to the CloudWatch service in the AWS Management Console. The dashboard time range presents information for the last hour of activity by default and can go up to one week, but keep in mind that Contributor Insights has a maximum time range of 24 hours. You can also select a different dashboard refresh interval, from 10 seconds up to 15 minutes.

The dashboard provides the following information from CloudWatch.

  • WAF_top_terminating_rules – This rule shows the top rules where the requests are being terminated by AWS WAF. This can help you understand the main cause of blocked requests.
  • WAF_top_ips – This rule shows the top source IPs for requests. This can help you understand if the traffic and activity that you see is spread across many IPs or concentrated in a small group of IPs.
  • WAF_top_countries – This rule shows the main source countries for the IPs in the requests. This can help you visualize where the traffic is originating.
  • WAF_top_user_agents – This rule shows the main user agents that are being used to generate the requests. This will help you isolate problematic devices or identify potential false positives.
  • WAF_top_uri – This rule shows the main URIs in the requests that are being evaluated. This can help you identify if one specific path is the target of activity.
  • WAF_top_http – This rule shows the HTTP methods used for the requests examined by AWS WAF. This can help you understand the pattern of behavior of the traffic.
  • WAF_top_referrer_hosts – This rule shows the main referrer from which requests are being sent. This can help you identify incorrect or suspicious origins of requests based on the known application flow.
  • WAF_top_rate_rules – This rule shows the main rate rules being applied to traffic. It helps understand volumetric activity identified by AWS WAF.
  • WAF_top_labels – This rule shows the top labels found in logs. This can help you visualize the main rules that are matching on the requests evaluated by AWS WAF.

The dashboard also provides the following information from the default CloudWatch metrics sent by AWS WAF.

  • AllowedvsBlockedRequests – This metric shows the number of all blocked and allowed requests. This can help you understand the number of requests that AWS WAF is actively blocking.
  • Bot Requests vs non-Bot requests – This visual shows the number of requests identified as bots versus non-bots (if you’re using AWS WAF Bot Control).
  • All Requests – This metric shows the number of all requests, separated by bot and non-bot origin. This can help you understand all requests that AWS WAF is evaluating.
  • CountedRequests – This metric shows the number of all counted requests. This can help you understand the requests that are matching a rule but not being blocked, and aid the decision of a configuration change during the testing phase.
  • CaptchaRequests – This metric shows requests that go through the CAPTCHA rule.

Figure 2 shows an example of how the CloudWatch dashboard displays the data within this solution. You can rearrange and customize the elements within the dashboard as needed.

You can review each of the queries and rules deployed with this solution. You can also customize these baseline queries and rules to provide more detailed information or to add custom queries and rules to the solution code. For more information on how to build queries and use CloudWatch Logs and Contributor Insights, see the CloudWatch documentation.

Use the dashboard for monitoring

After you’ve set up the dashboard, you can monitor the activity of the sites that are protected by AWS WAF. If suspicious activity is reported, you can use the visuals to understand the traffic in more detail, and drive incident response actions as needed.

Let’s consider an example of how to use your new dashboard and its data to drive security operations decisions. Suppose that you have a website that sells custom clothing at a bargain price. It has a sign-up link to receive offers, and you’re getting reports of unusual activity by the application team. By looking at the metrics for the web ACL that protects the site, you can see the main country for source traffic and the contributing URIs, as shown in Figure 3. You can also see that most of the activity is being detected by rules that you have in place, so you can set the rules to block traffic, or if they are already blocking, you can just monitor the activity.

You can use the same visuals to decide whether an AWS WAF rule with high activity can be changed to autoblock suspicious web traffic without affecting valid customer traffic. By looking at the top terminating rules and cross-referencing information, such as source IPs, user agents, top URIs, and other request identifiers, you can understand the traffic pattern and activity of different applications and endpoints. From here, you can investigate further by using specific queries with CloudWatch Logs Insights.

Operational and security management with CloudWatch Logs Insights

You can use CloudWatch Logs Insights to interactively search and analyze log data in Amazon CloudWatch Logs using advanced queries to effectively investigate operational issues and security incidents.

Examine a bot reported as a false positive

You can use CloudWatch Logs Insights to identify requests that have specific labels to understand where the traffic is originating from based on source IP address and other essential event details. A simple example is investigating requests flagged as potential false positives.

Imagine that you have a reported false positive request that was flagged as a non-browser by AWS WAF Bot Control. You can run the non-browser user agent query that was created by the provided template on CloudWatch Logs Insights, as shown in the following example, and then verify the source IPs for the top hits for this rule group. Or you can look for a specific request that has been flagged as a false positive, in order to review the details and make adjustments as needed.

fields @timestamp, httpRequest.clientIp
| filter @message like "awswaf:managed:aws:botcontrol:signal:non_browser_user_agent"
| parse @message '"labels":[*]' as Labels
| stats count(*) as requestCount by httpRequest.clientIp
| display @timestamp, httpRequest.clientIp, httpRequest.uri, Labels
| sort requestCount desc
| limit 10

The non-browser user agent query also allows you to confirm whether this request has other rule hits that were in count mode and were non-terminating; you can do this by examining the labels. If there are multiple rules matching the requests, that can be an indicator of suspicious activity.

If you have a CAPTCHA challenge configured on the endpoint, you can also look at CAPTCHA responses. The CaptchaTokenqueryDefinition query provided in this solution uses a variation of the preceding format, and can display the main IPs from which bad tokens are being sent. An example query is shown following, along with the query results in Figure 4. If you have signals from non-browser user agents and CAPTCHA tokens missing, then that is a strong indicator of suspicious activity.

fields @timestamp, httpRequest.clientIp 
| filter captchaResponse.failureReason = "TOKEN_MISSING" 
| stats count(*) as requestCount by httpRequest.clientIp, httpRequest.country 
| sort requestCount desc 
| limit 10
Figure 4: Main IP addresses and number of counts for CAPTCHA responses

This information can provide an indication of the main source of activity. You can then use other visuals, like top user agents or top referrers, to provide more context to the information and inform further actions, such as adding new rules to the AWS WAF configuration.

You can adapt the queries provided in the sample solution to other use cases by using the fields provided in the left-hand pane of CloudWatch Logs Insights.
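
As one illustration, the following adapted query (a sketch that assumes the standard AWS WAF log fields, such as action and terminatingRuleId) surfaces the top URIs and terminating rules for blocked requests, which can be helpful when you are tuning rules for a specific endpoint:

fields @timestamp, httpRequest.uri
| filter action = "BLOCK"
| stats count(*) as requestCount by httpRequest.uri, terminatingRuleId
| sort requestCount desc
| limit 10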

Cost considerations

Configuring AWS WAF to send logs to Amazon CloudWatch Logs doesn’t incur an additional cost. The cost incurred is for the use of the CloudWatch features and services, such as log storage and retention, Contributor Insights rules enabled, Logs Insights queries run, matched log events, and CloudWatch dashboards. For detailed information on the pricing of these features, see the CloudWatch Logs pricing information. You can also get an estimate of potential costs by using the AWS pricing calculator for CloudWatch.

One way to help offset the cost of these CloudWatch features and services is to restrict use of the dashboard and enforce a log retention policy for AWS WAF that keeps it cost effective. Running the queries and monitoring only on an as-needed basis also helps reduce costs: by enabling the Contributor Insights rules only when you need them, you limit both the queries that run and the log events that they match. AWS WAF also provides the option to filter the logs that are sent when logging is enabled. For more information, see AWS WAF log filtering.

Conclusion

In this post, you learned how to use a pre-built CloudWatch dashboard to monitor AWS WAF activity by using metrics and Contributor Insights rules. The dashboard can help you identify traffic patterns and activity, and you can use the sample Logs Insights queries to explore the log information in more detail, examine false positives, and investigate suspicious activity for rule tuning.

For more information on AWS WAF and the features mentioned in this post, see the AWS WAF documentation.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on AWS WAF re:Post.

Want more AWS Security news? Follow us on Twitter.

Diana Alvarado

Diana is a Senior Security Solutions Architect at AWS. She is passionate about helping customers solve difficult cloud challenges, and she has a soft spot for all things logs.

United Arab Emirates IAR compliance assessment report is now available with 58 services in scope

Post Syndicated from Ioana Mecu original https://aws.amazon.com/blogs/security/united-arab-emirates-iar-compliance-assessment-report-is-now-available-with-58-services-in-scope/

Amazon Web Services (AWS) is pleased to announce the publication of our compliance assessment report on the Information Assurance Regulation (IAR) established by the Telecommunications and Digital Government Regulatory Authority (TDRA) of the United Arab Emirates. The report covers the AWS Middle East (UAE) Region, with 58 services in scope of the assessment.

The IAR provides management and technical information security controls to establish, implement, maintain, and continuously improve information assurance. AWS alignment with IAR requirements demonstrates our ongoing commitment to adhere to the heightened expectations for cloud service providers. As such, IAR-regulated customers can use AWS services with confidence.

Independent third-party auditors from BDO evaluated AWS for the period of November 1, 2021, to October 31, 2022. The assessment report illustrating the status of AWS compliance is available through AWS Artifact. AWS Artifact is a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact.

For up-to-date information, including when additional services are added, see AWS Services in Scope by Compliance Program and choose IAR.

AWS strives to continuously bring services into the scope of its compliance programs to help you meet your architectural and regulatory needs. If you have questions or feedback about IAR compliance, reach out to your AWS account team.

To learn more about our compliance and security programs, see AWS Compliance Programs. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.

 
If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Ioana Mecu

Ioana is a Security Audit Program Manager at AWS based in Madrid, Spain. She leads security audits, attestations, and certification programs across Europe and the Middle East. Ioana previously worked in risk management, security assurance, and technology audits in the financial sector for 15 years.

Gokhan Akyuz

Gokhan is a Security Audit Program Manager at AWS based in Amsterdam, Netherlands. He leads security audits, attestations, and certification programs across Europe and the Middle East. Gokhan has more than 15 years of experience in IT and cybersecurity audits and controls implementation in a wide range of industries.

Investing in security to protect data privacy

Post Syndicated from Emily Hancock original https://blog.cloudflare.com/investing-in-security-to-protect-data-privacy/

If you’ve made it to 2023 without ever receiving a notice that your personal information was compromised in a security breach, consider yourself lucky. In a best case scenario, bad actors only got your email address and name – information that won’t cause you a huge amount of harm. Or in a worst-case scenario, maybe your profile on a dating app was breached and intimate details of your personal life were exposed publicly, with life-changing impacts. But there are also more hidden, insidious ways that your personal data can be exploited. For example, most of us use an Internet Service Provider (ISP) to connect to the Internet. Some of those ISPs are collecting information about your Internet viewing habits, your search histories, your location, etc. – all of which can impact the privacy of your personal information as you are targeted with ads based on your online habits.

You also probably haven’t made it to 2023 without hearing at least something about Internet privacy laws around the globe. In some jurisdictions, lawmakers are driven by a recognition that the right to privacy is a fundamental human right. In other locations, lawmakers are passing laws to address the harms their citizens are concerned about – data breaches and the mining of data about private details of people’s lives to sell targeted advertising. At the core of most of this legislation is an effort to give users more control over their personal data. And many of these regulations require data controllers to ensure adequate protections are in place for cross-border data transfers. In recent years, however, we’ve seen an increasing number of regulators interpreting these regulations in a way that would leave no room for cross-border data transfers. These interpretations are problematic – not only are they harmful to global commerce, but they also disregard the idea that data might be more secure if cross-border data transfers are allowed. Some regulators instead assert that personal data will be safer if it stays within their borders because their law protects privacy better than that of another jurisdiction.

So with Data Privacy Day 2023 just a few days away on January 28, we think it’s important to focus on all the ways that security measures and privacy-enhancing technologies help keep personal data private, and on why those measures are so much more critical to protecting privacy than merely implementing the requirements of data protection laws or keeping data in a jurisdiction because regulators think that jurisdiction has stronger laws than another.

The role of data security in protecting personal information

Most data protection regulations recognize the role security plays in protecting the privacy of personal information. That’s not surprising. An entity’s efforts to follow a data protection law’s requirements for how personal data should be collected and used won’t mean much if a third party can access the data for their own malicious purposes.

The laws themselves provide few specifics about what security is required. For example, the EU General Data Protection Regulation (“GDPR”) and similar comprehensive privacy laws in other jurisdictions require data controllers (the entities that collect your data) to implement “reasonable and appropriate” security measures. But it’s almost impossible for regulators to require specific security measures because the security landscape changes so quickly. In the United States, state security breach laws don’t require notification if the data obtained is encrypted, suggesting that encryption is at least one way regulators think data should be protected.

Enforcement actions brought by regulators against companies that have experienced data breaches provide other clues for what regulators think are “best practices” for ensuring data protection. For example, on January 10 of this year, the U.S. Federal Trade Commission entered into a consent order with Drizly, an online alcohol sales and delivery platform, outlining a number of security failures that led to a data breach that exposed the personal information of about 2.5 million Drizly users and requiring Drizly to implement a comprehensive security program that includes a long list of intrusion detection and logging procedures. In particular, the FTC specifically requires Drizly to implement “…(c) data loss prevention tools; [and] (d) properly configured firewalls” among other measures.

What many regulatory post-breach enforcement actions have in common is the requirement of a comprehensive security program that includes a number of technical measures to protect data from third parties who might seek access to it. The enforcement actions tend to be data location-agnostic, however. It’s not important where the data might be stored – what is important is the right security measures are in place. We couldn’t agree more wholeheartedly.

Cloudflare’s portfolio of products and services helps our customers put protections in place to thwart would-be attackers from accessing their websites or corporate networks. By making it less likely that users’ data will be accessed by malicious actors, Cloudflare’s services can help organizations save millions of dollars, protect their brand reputations, and build trust with their users. We also spend a great deal of time working to develop privacy-enhancing technologies that directly support the ability of individual users to have a more privacy-preserving experience on the Internet.

Cloudflare is most well-known for its application layer security services – Web Application Firewall (WAF), bot management, DDoS protection, SSL/TLS, Page Shield, and more. As the FTC noted in its Drizly consent order, firewalls can be a critical line of defense for any online application. Think about what happens when you go through security at an airport – your body and your bags are scanned for something bad that might be there (e.g. weapons or explosives), but the airport security personnel are not inventorying or recording the contents of your bags. They’re simply looking for dangerous content to make sure it doesn’t make its way onto an airplane. In the same way, the WAF looks at packets as they are being routed through Cloudflare’s network to make sure the Internet equivalent of weapons and explosives are not delivered to a web application. Governments around the globe have agreed that these quick security scans at the airport are necessary to protect us all from bad actors. Internet traffic is the same.

We embrace the critical importance of encryption in transit. In fact, we see encryption as so important that in 2014, Cloudflare introduced Universal SSL to support SSL (and now TLS) connections to every Cloudflare customer. And at the same time, we recognize that blindly passing along encrypted packets would undercut some of the very security that we’re trying to provide. Data privacy and security are a balance. If we let encrypted malicious code get to an end destination, then the malicious code may be used to access information that should otherwise have been protected. If data isn’t encrypted in transit, it’s at risk for interception. But by supporting encryption in transit and ensuring malicious code doesn’t get to its intended destination, we can protect private personal information even more effectively.

Let’s take another example – In June 2022, Atlassian released a Security Advisory relating to a remote code execution (RCE) vulnerability affecting Confluence Server and Confluence Data Center products. Cloudflare responded immediately to roll out a new WAF rule for all of our customers. For customers without this WAF protection, all the trade secret and personal information on their instances of Confluence were potentially vulnerable to data breach. These types of security measures are critical to protecting personal data. And it wouldn’t have mattered if the personal data were stored on a server in Australia, Germany, the U.S., or India – the RCE vulnerability would have exposed data wherever it was stored. Instead, the data was protected because a global network was able to roll out a WAF rule immediately to protect all of its customers globally.

Global network to thwart global attacks

The power of a large, global network is often overlooked when we think about using security measures to protect the privacy of personal data. Regulators who would seek to wall off their countries from the rest of the world as a method of protecting data privacy often miss how such a move can impact the security measures that are even more critical to keeping private data protected from bad actors.

Global knowledge is necessary to stop attacks that could come from anywhere in the world. Just as an international network of counterterrorism units helps to prevent physical threats, the same approach is needed to prevent cyberthreats. The most powerful security tools are built upon identified patterns of anomalous traffic, coming from all over the world. Cloudflare’s global network puts us in a unique position to understand the evolution of global threats and anomalous behaviors. To empower our customers with preventative and responsive cybersecurity, we transform global learnings into protections, while still maintaining the privacy of good-faith Internet users.

For example, Cloudflare’s tools to block threats at the DNS or HTTP level, including DDoS protection for websites and Gateway for enterprises, allow users to further secure their entities beyond customized traffic rules by screening for patterns of traffic known to contain phishing or malware content. We use our global network to improve our identification of vulnerabilities and malicious content and to roll out rules in real time that protect everyone. This ability to identify and instantly protect our customers from security vulnerabilities that they may not have yet had time to address reduces the possibility that their data will be compromised or that they will otherwise be subjected to nefarious activity.

Similarly, Cloudflare’s Bot Management product only increases in accuracy with continued use on the global network: it detects and blocks traffic coming from likely bots before feeding back learnings to the models backing the product. And most importantly, we minimize the amount of information used to detect these threats by fingerprinting traffic patterns and forgoing reliance on PII. Our Bot Management products are successful because of the sheer number of customers and amount of traffic on our network. With approximately 20 percent of all websites protected by Cloudflare, we are uniquely positioned to gather the signals that traffic is from a bad bot and interpret them into actionable intelligence. This diversity of signal and scale of data on a global platform is critical to help us continue to evolve our bot detection tools. If the Internet were fragmented – preventing data from one jurisdiction being used in another – more and more signals would be missed. We wouldn’t be able to apply learnings from bot trends in Asia to bot mitigation efforts in Europe, for example.

A global network is equally important for resilience and effective security protection, a reality that the war in Ukraine has brought into sharp relief. In order to keep their data safe, the Ukrainian government was required to change their laws to remove data localization requirements. As Ukraine’s infrastructure came under attack during Russia’s invasion, the Ukrainian government migrated their data to the cloud, allowing it to be preserved and easily moved to safety in other parts of Europe. Likewise, Cloudflare’s global network played an important role in helping maintain Internet access inside Ukraine. Sites in Ukraine at times came under heavy DDoS attack, even as infrastructure was being destroyed by physical attacks. With bandwidth limited, it was important that the traffic that was getting through inside Ukraine was useful traffic, not attack traffic. Instead of allowing attack traffic inside Ukraine, Cloudflare’s global network identified it and rejected it in the countries where the attacks originated. Without the ability to inspect and reject traffic outside of Ukraine, the attack traffic would have further congested networks inside Ukraine, limiting network capacity for critical wartime communications.

Although the situation in Ukraine reflects the country’s wartime posture, Cloudflare’s global network provides the same security benefits for all of our customers. We use our entire network to deliver DDoS mitigation, with a  network capacity of over 172 Tbps, making it possible for our customers to stay online even in the face of the largest attacks. That enormous capacity to protect customers from attack is the result of the global nature of Cloudflare’s network, aided by the ability to restrict attack traffic to the countries where it originated. And a network that stays online is less likely to have to address the network intrusions and data loss that are frequently connected to successful DDoS attacks.

Zero Trust security for corporate networks

Some of the biggest data breaches in recent years have happened as a result of something pretty simple – an attacker uses a phishing email or social engineering to get an employee of a company to visit a site that infects the employee’s computer with malware or enter their credentials on a fake site that lets the bad actor capture the credentials and then use those to impersonate the employee and log into a company’s systems. Depending on the type of information compromised, these kinds of data breaches can have a huge impact on individuals’ privacy. For this reason, Cloudflare has invested in a number of technologies designed to protect corporate networks, and the personal data on those networks.

As we noted during our recent CIO week, the FBI’s latest Internet Crime Report shows that business email compromise and email account compromise, a subset of malicious phishing campaigns, are the most costly – with U.S. businesses losing nearly $2.4 billion. Cloudflare has invested in a number of Zero Trust solutions to help fight this very problem:

  • Link Isolation means that when an employee clicks a link in an email, it will automatically be opened using Cloudflare’s Remote Browser Isolation technology that isolates potentially risky links, downloads, or other zero-day attacks from impacting that user’s computer and the wider corporate network.
  • With our Data Loss Prevention tools, businesses can identify and stop exfiltration of data.
  • Our Area 1 solution identifies phishing attempts, emails containing malicious code, and emails containing ransomware payloads and prevents them from landing in the inbox of unsuspecting employees.

These Zero Trust tools, combined with the use of hardware keys for multi-factor authentication, were key in Cloudflare’s ability to prevent a breach by an SMS phishing attack that targeted more than 130 companies in July and August 2022. Many of these companies reported the disclosure of customer personal information as a result of employees falling victim to this SMS phishing effort.

And remember the Atlassian Confluence RCE vulnerability we mentioned earlier? Cloudflare remained protected not only due to our rapid update of our WAF rules, but also because we use our own Cloudflare Access solution (part of our Zero Trust suite) to ensure that only individuals with Cloudflare credentials are able to access our internal systems. Cloudflare Access verified every request made to a Confluence application to ensure it was coming from an authenticated user.

All of these Zero Trust solutions require sophisticated machine learning to detect patterns of malicious activity, and none of them require data to be stored in a specific location to keep the data safe. Thwarting these kinds of security threats isn’t only important for protecting organizations’ internal networks from intrusion – it’s also critical for keeping large-scale data sets private for the benefit of millions of individuals.

Cutting-edge technologies

Cloudflare’s security services enable our customers to screen for cybersecurity risks on Cloudflare’s network before those risks can reach the customer’s internal network. This helps protect our customers and our customers’ data from a range of cyber threats. By doing so, Cloudflare’s services are essentially fulfilling a privacy-enhancing function in themselves. From the beginning, we have built our systems to ensure that data is kept private, even from us, and we have made public policy and contractual commitments about keeping that data private and secure. But beyond securing our network for the benefit of our customers, we’ve invested heavily in new technologies that aim to secure communications from bad actors; the prying eyes of ISPs or other man-in-the-middle machines that might find your Internet communications of interest for advertising purposes; or government entities that might want to crack down on individuals exercising their freedom of speech.

For example, Cloudflare operates part of Apple’s iCloud Private Relay system, which ensures that no single party handling user data has complete information on both who the user is and what they are trying to access. Instead, a user’s original IP address is visible to the access network (e.g. the coffee shop you’re sitting in, or your home ISP) and the first relay (operated by Apple), but the server or website name is encrypted and not visible to either. The first relay hands encrypted data to a second relay (e.g. Cloudflare), but is unable to see “inside” the traffic to Cloudflare. And the Cloudflare-operated relays know only that it is receiving traffic from a Private Relay user, but not specifically who or their client IP address. Cloudflare relays then forward traffic on to the destination server.

And of course any post on how security measures enable greater data privacy would be remiss if it failed to mention Cloudflare’s privacy-first 1.1.1.1 public resolver. By using 1.1.1.1, individuals can search the Internet without their ISPs seeing where they are going. Unlike most DNS resolvers, 1.1.1.1 does not sell user data to advertisers.

Together, these many technologies and security measures protect personal data from many types of threats to privacy – behavioral advertising, man-in-the-middle attacks, malicious code, and more. On this Data Privacy Day 2023, we urge regulators to recognize that the emphasis currently being placed on data localization has perhaps gone too far – and has foreclosed the many benefits cross-border data transfers can have for data security and, therefore, data privacy.

Armed to Boot: an enhancement to Arm’s Secure Boot chain

Post Syndicated from Derek Chamorro original https://blog.cloudflare.com/armed-to-boot/

Over the last few years, there has been a rise in the number of attacks that affect how a computer boots. Most modern computers use a specification called Unified Extensible Firmware Interface (UEFI) that defines a software interface between an operating system (e.g. Windows) and platform firmware (e.g. disk drives, video cards). There are security mechanisms built into UEFI that ensure that platform firmware can be cryptographically validated and boot securely through an application called a bootloader. This firmware is stored in non-volatile SPI flash memory on the motherboard, so it persists on the system even if the operating system is reinstalled and drives are replaced.

This creates a ‘trust anchor’ used to validate each stage of the boot process, but, unfortunately, this trust anchor is also a target for attack. In these UEFI attacks, malicious code is loaded onto a compromised device early in the boot process. This means that malware can change configuration data, establish persistence by ‘implanting’ itself, and bypass security measures that are only loaded at the operating system stage. So, while UEFI-anchored secure boot protects the bootloader, it does not protect the UEFI firmware itself.

Because of this growing trend of attacks, we began the process of cryptographically signing our UEFI firmware as a mitigation step. While our existing solution is platform specific to our x86 AMD server fleet, we did not have a similar solution to UEFI firmware signing for Arm. To determine what was missing, we had to take a deep dive into the Arm secure boot process.

Read on to learn about the world of Arm Trusted Firmware Secure Boot.

Arm Trusted Firmware Secure Boot

Arm defines a trusted boot process through an architecture called Trusted Board Boot Requirements (TBBR), or Arm Trusted Firmware (ATF) Secure Boot. TBBR works by authenticating a series of cryptographically signed binary images, each containing a different stage or element of the system boot process to be loaded and executed. Every bootloader (BL) stage accomplishes a different step in the initialization process:

BL1

BL1 determines the boot path (cold boot or warm boot), initializes the architecture (exception vectors, CPU initialization, and control register setup), and initializes the platform (enabling watchdog processes, the MMU, and DDR initialization).

BL2

BL2 prepares initialization of the Arm Trusted Firmware (ATF), the stack responsible for setting up the secure boot process. After ATF setup, the console is initialized, memory is mapped for the MMU, and message buffers are set for the next bootloader.

BL3

The BL3 stage has multiple parts, the first being initialization of runtime services that are used in detecting system topology. After initialization, there is a handoff between the ATF ‘secure world’ boot stage to the ‘normal world’ boot stage that includes setup of UEFI firmware. Context is set up to ensure that no secure state information finds its way into the normal world execution state.

Each image is authenticated by a public key, which is stored in a signed certificate and can be traced back to a root key stored on the SoC in one time programmable (OTP) memory or ROM.

TBBR was originally designed for cell phones. This established a reference architecture on how to build a “Chain of Trust” from the first ROM executed (BL1) to the handoff to “normal world” firmware (BL3). While this creates a validated firmware signing chain, it has caveats:

  1. SoC manufacturers are heavily involved in the secure boot chain, while the customer has little involvement.
  2. A unique SoC SKU is required per customer. With one customer this could be easy, but most manufacturers have thousands of SKUs.
  3. The SoC manufacturer is primarily responsible for end-to-end signing and maintenance of the PKI chain. This adds complexity to the process, requiring USB key fobs for signing.
  4. It doesn’t scale outside the manufacturer.

What this tells us is that what was built for cell phones doesn’t scale for servers.

If we were involved 100% in the manufacturing process, then this wouldn’t be as much of an issue, but we are a customer and consumer. As a customer, we have a lot of control of our server and block design, so we looked at design partners that would take some of the concepts we were able to implement with AMD Platform Secure Boot and refine them to fit Arm CPUs.

Amping it up

We partnered with Ampere and tested their Altra Max single socket rack server CPU (code named Mystique), which provides high performance with incredible power efficiency per core, much of what we were looking for in reducing power consumption. These are only a small subset of its specs, but notably Ampere backported various features into the Altra Max, including speculative attack mitigations for Meltdown and Spectre (variants 1 and 2) from the Armv8.5 instruction set architecture, giving Altra the "+" designation in its ISA.

Ampere does implement a signed boot process similar to the ATF signing process mentioned above, but with some slight variations. We’ll explain it a bit to help set context for the modifications that we made.

Ampere Secure Boot

Figure: Arm processor boot sequence as implemented by Ampere

The diagram above shows the Arm processor boot sequence as implemented by Ampere. The System Control Processors (SCP) comprise the System Management Processor (SMpro) and the Power Management Processor (PMpro). The SMpro is responsible for features such as secure boot and BMC communication, while the PMpro is responsible for power features such as Dynamic Frequency Scaling and on-die thermal monitoring.

At power-on reset, the SCP runs the system management bootloader from ROM and loads the SMpro firmware. After initialization, the SMpro spawns the power management stack on the PMpro and the ATF threads. The ATF BL2 and BL31 bring up processor resources such as DRAM and PCIe. After this, control is passed to the BL33 BIOS.

Authentication flow

At power on, the SMpro firmware reads Ampere’s public key (ROTPK) from the SMpro key certificate in SCP EEPROM, computes a hash and compares this to Ampere’s public key hash stored in eFuse. Once authenticated, Ampere’s public key is used to decrypt key and content certificates for SMpro, PMpro, and ATF firmware, which are launched in the order described above.

The SMpro public key will be used to authenticate the SMpro and PMpro images and the ATF keys, which in turn will authenticate the ATF images. This cascading set of authentications originates with the Ampere root key, whose hash is stored on the chip in an electronic fuse, or eFuse. An eFuse can be programmed only once, setting the content to be read-only; it cannot be tampered with or modified.

This is the original hardware root of trust used for signing system (secure world) firmware. When we looked at this, after referencing the signing process we had with AMD PSB and knowing there was a large enough one-time-programmable (OTP) region within the SoC, we thought: why can’t we insert our key hash in here?

Single Domain Secure Boot

Single Domain Secure Boot takes the same authentication flow and adds a hash of the customer public key (Cloudflare firmware signing key in this case) to the eFuse domain. This enables the verification of UEFI firmware by a hardware root of trust. This process is performed in the already validated ATF firmware by BL2. Our public key (dbb) is read from UEFI secure variable storage, a hash is computed and compared to the public key hash stored in eFuse. If they match, the validated public key is used to decrypt the BL33 content certificate, validating and launching the BIOS, and remaining boot items. This is the key feature added by SDSB. It validates the entire software boot chain with a single eFuse root of trust on the processor.

Building blocks

With a basic understanding of how Single Domain Secure Boot works, the next logical question is “How does it get implemented?”. We ensure that all UEFI firmware is signed at build time, but this process can be better understood if broken down into steps.

Ampere, our original design manufacturer (ODM), and we each play a role in the execution of SDSB. First, we generate certificates for a public-private key pair using our internal, secure PKI. The public key side is provided to the ODM as dbb.auth and dbu.auth in UEFI secure variable format. Ampere provides a reference Software Release Package (SRP) including the baseboard management controller, system control processor, UEFI, and complex programmable logic device (CPLD) firmware to the ODM, who customizes it for their platform. The ODM generates a board file describing the hardware configuration, and also customizes the UEFI to enroll dbb and dbu to secure variable storage on first boot.

Once this is done, we generate a UEFI.slim file using the ODM’s UEFI ROM image, Arm Trusted Firmware (ATF) and Board File. (Note: This differs from AMD PSB insofar as the entire image and ATF files are signed; with AMD PSB, only the first block of boot code is signed.) The entire .SLIM file is signed with our private key, producing a signature hash in the file. This can only be authenticated by the correct public key. Finally, the ODM packages the UEFI into .HPM format compatible with their platform BMC.

In parallel, we provide the debug fuse selection and a hash of our DER-formatted public key. Ampere uses this information to create a special version of the SCP firmware, known as Security Provisioning (SECPROV), in .slim format. This firmware is run one time only, to program the debug fuse settings and public key hash into the SoC eFuses. Ampere delivers the SECPROV .slim file to the ODM, who packages it into a .hpm file compatible with the BMC firmware update tooling.

Fusing the keys

During system manufacturing, firmware is pre-programmed into storage ICs before placement on the motherboard. Note that the SCP EEPROM contains the SECPROV image, not standard SCP firmware. After a system is first powered on, an IPMI command is sent to the BMC which releases the Ampere processor from reset. This allows SECPROV firmware to run, burning the SoC eFuse with our public key hash and debug fuse settings.

Final manufacturing flow

Once our public key has been provisioned, manufacturing proceeds by re-programming the SCP EEPROM with its regular firmware. Once the system powers on, ATF detects there are no keys present in secure variable storage and allows UEFI firmware to boot, regardless of signature. Since this is the first UEFI boot, it programs our public key into secure variable storage and reboots. ATF is validated by Ampere’s public key hash as usual. Since our public key is present in dbb, it is validated against our public key hash in eFuse and allows UEFI to boot.

Validation

The first part of validation requires observing successful destruction of the eFuses. This imprints our public key hash into a dedicated, immutable memory region, not allowing the hash to be overwritten. Upon automatic or manual issue of an IPMI OEM command to the BMC, the BMC observes a signal from the SECPROV firmware, denoting eFuse programming completion. This can be probed with BMC commands.

When the eFuses have been blown, validation continues by observing the boot chain of the other firmware. Corruption of the SCP, ATF, or UEFI firmware obstructs boot flow and boot authentication and will cause the machine to fail booting to the OS. Once firmware is in place, happy path validation begins with booting the machine.

Upon first boot, firmware boots in the following order: BMC, SCP, ATF, and UEFI. The BMC, SCP, and ATF firmware can be observed via their respective serial consoles. The UEFI will automatically enroll the dbb and dbu files to the secure variable storage and trigger a reset of the system.

After observing the reset, the machine should successfully boot to the OS if the feature is executed correctly. For further validation, we can use the UEFI shell environment to extract the dbb file and compare the hash against the hash submitted to Ampere. After successfully validating the keys, we flash an unsigned UEFI image. An unsigned UEFI image causes authentication failure at bootloader stage BL3-2. The ATF firmware undergoes a boot loop as a result. Similar results will occur for a UEFI image signed with incorrect keys.

Updated authentication flow

On all subsequent boot cycles, the ATF will read secure variable dbb (our public key), compute a hash of the key, and compare it to the read-only Cloudflare public key hash in eFuse. If the computed and eFuse hashes match, our public key variable can be trusted and is used to authenticate the signed UEFI. After this, the system boots to the OS.

Let’s boot!

We were unable to get a machine without the feature enabled to demonstrate the set-up of the feature since we have the eFuse set at build time, but we can demonstrate what it looks like to go between an unsigned BIOS and a signed BIOS. What we would have observed with the set-up of the feature is a custom BMC command to instruct the SCP to burn the ROTPK into the SOC’s OTP fuses. From there, we would observe feedback to the BMC detailing whether burning the fuses was successful. Upon booting the UEFI image for the first time, the UEFI will write the dbb and dbu into secure storage.

As you can see, after flashing the unsigned BIOS, the machine fails to boot.

Despite the lack of visibility into the failure to boot, there are a few things going on under the hood. The SCP (System Control Processor) still boots.

  1. The SCP image holds a key certificate with Ampere’s generated ROTPK and the SCP key hash. SCP will calculate the ROTPK hash and compare it against the burned OTP fuses. In the failure case, where the hash does not match, you will observe a failure as you saw earlier. If successful, the SCP firmware will proceed to boot the PMpro and SMpro. Both the PMpro and SMpro firmware will be verified and proceed with the ATF authentication flow.
  2. The conclusion of the SCP authentication is the passing of the BL1 key to the first stage bootloader via the SCP HOB (hand-off block) to proceed with the standard three-stage bootloader ATF authentication mentioned previously.
  3. At BL2, the dbb is read out of the secure variable storage and used to authenticate the BL33 certificate and complete the boot process by booting the BL33 UEFI image.

Still more to do

In recent years, management interfaces on servers, like the BMC, have been the target of cyber attacks including ransomware, implants, and disruptive operations. Access to the BMC can be local or remote. With remote vectors open, there is potential for malware to be installed on the BMC via network interfaces. With compromised software on the BMC, malware or spyware could maintain persistence on the server. An attacker might be able to update the BMC directly using flashing tools such as flashrom or socflash without the same level of firmware resilience established at the UEFI level.

The future state involves using host CPU-agnostic infrastructure to enable a cryptographically secure host prior to boot time. We will look to incorporate a modular approach that has been proposed by the Open Compute Project’s Data Center Secure Control Module (DC-SCM) 2.0 specification. This will allow us to standardize our Root of Trust, sign our BMC, and assign physically unclonable function (PUF) based identity keys to components and peripherals to limit the use of OTP fusing. OTP fusing creates a problem when trying to "e-cycle" or reuse machines, as you cannot truly remove a machine identity.

Unlocking security updates for transitive dependencies with npm

Post Syndicated from Bryan Dragon original https://github.blog/2023-01-19-unlocking-security-updates-for-transitive-dependencies-with-npm/

Dependabot helps developers secure their software with automated security updates: when a security advisory is published that affects a project dependency, Dependabot will try to submit a pull request that updates the vulnerable dependency to a safe version if one is available. Of course, there’s no rule that says a security vulnerability will only affect direct dependencies—dependencies at any level of a project’s dependency graph could become vulnerable.

Until recently, Dependabot did not address vulnerabilities on transitive dependencies, that is, on the dependencies sitting one or more levels below a project’s direct dependencies. Developers would encounter an error message in the GitHub UI and they would have to manually update the chain of ancestor dependencies leading to the vulnerable dependency to bring it to a safe version.

Screenshot of the warning a user sees when they try to update the version of a project when its dependencies sitting one or more levels below a project’s direct dependencies were out of date. The message reads, "Dependabot cannot update minimist to a non-vulnerable version."

Internally, this would show up as a failed background job due to an update-not-possible error—and we would see a lot of these errors.

Understanding the challenge

Dependabot offers two strategies for updating dependencies: scheduled version updates and security updates. With version updates, the explicit goal is to keep project dependencies updated to the latest available version, and Dependabot can be configured to widen or increase a version requirement so that it accommodates the latest version. With security updates, Dependabot tries to make the most conservative update that removes the vulnerability while respecting version requirements. In this post we’ll be looking at security updates.

As an example, let’s say we have a repository with security updates enabled that contains an npm project with a single dependency on react-scripts@^4.0.3.

Not all package managers handle version requirements in the same way, so let’s quickly refresh. A version requirement like ^4.0.3 (a “caret range”) in npm permits updates to versions that don’t change the leftmost nonzero element in the MAJOR.MINOR.PATCH semver version number. The version requirement ^4.0.3, then, can be understood as allowing versions greater than or equal to 4.0.3 and less than 5.0.0.
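
As a quick illustration (a sketch added for this post, using the semver package that npm itself relies on; it isn’t part of the original example), you can check how a caret range resolves:

// Check caret-range semantics with the semver package.
const semver = require('semver')

console.log(semver.satisfies('4.0.3', '^4.0.3')) // true
console.log(semver.satisfies('4.9.1', '^4.0.3')) // true: minor and patch updates are allowed
console.log(semver.satisfies('5.0.0', '^4.0.3')) // false: a major version change is not allowed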

On March 18, 2022, a high-severity security advisory was published for node-forge, a popular npm package that provides tools for writing cryptographic and network-heavy applications. The advisory impacts versions earlier than 1.3.0, the patched version released the day before the advisory was published.

While we don’t have a direct dependency on node-forge, if we zoom in on our project’s dependency tree we can see that we do indirectly depend on a vulnerable version:

react-scripts@^4.0.3             4.0.3
  - webpack-dev-server@3.11.1    3.11.1
    - selfsigned@^1.10.7         1.10.14
      - node-forge@^0.10.0       0.10.0

In order to resolve the vulnerability, we need to bring node-forge from 0.10.0 to 1.3.0, but a sequence of conflicting ancestor dependencies prevents us from doing so:

  • 4.0.3 is the latest version of react-scripts permitted by our project
  • 3.11.1 is the only version of webpack-dev-server permitted by react-scripts@4.0.3
  • 1.10.14 is the latest version of selfsigned permitted by webpack-dev-server@3.11.1
  • 0.10.0 is the latest version of node-forge permitted by selfsigned@1.10.14

This is the point at which the security update would fail with an update-not-possible error. The challenge is in finding the version of selfsigned that permits node-forge@1.3.0, the version of webpack-dev-server that permits that version of selfsigned, and so on up the chain of ancestor dependencies until we reach react-scripts.

How we chose npm

When we set out to reduce the rate of update-not-possible errors, the first thing we did was pull data from our data warehouse in order to identify the greatest opportunities for impact.

JavaScript is the most popular ecosystem that Dependabot supports, both by Dependabot enablement and by update volume. In fact, more than 80% of the security updates that Dependabot performs are for npm and Yarn projects. Given their popularity, improving security update outcomes for JavaScript projects promised the greatest potential for impact, so we focused our investigation there.

npm and Yarn both include an operation that audits a project’s dependencies for known security vulnerabilities, but currently only npm natively has the ability to additionally make the updates needed to resolve the vulnerabilities that it finds.

After a successful engineering spike to assess the feasibility of integrating with npm’s audit functionality, we set about productionizing the approach.

Tapping into npm audit

When you run the npm audit command, npm collects your project’s dependencies, makes a bulk request to the configured npm registry for all security advisories affecting them, and then prepares an audit report. The report lists each vulnerable dependency, the dependency that requires it, the advisories affecting it, and whether a fix is possible—in other words, almost everything Dependabot should need to resolve a vulnerable transitive dependency.

node-forge  <=1.2.1
Severity: high
Open Redirect in node-forge - https://github.com/advisories/GHSA-8fr3-hfg3-gpgp
Prototype Pollution in node-forge debug API. - https://github.com/advisories/GHSA-5rrq-pxf6-6jx5
Improper Verification of Cryptographic Signature in node-forge - https://github.com/advisories/GHSA-cfm4-qjh2-4765
URL parsing in node-forge could lead to undesired behavior. - https://github.com/advisories/GHSA-gf8q-jrpm-jvxq
fix available via `npm audit fix --force`
Will install react-scripts@5.0.1, which is a breaking change
node_modules/node-forge
  selfsigned  1.1.1 - 1.10.14
  Depends on vulnerable versions of node-forge
  node_modules/selfsigned

There were two ways in which we had to supplement npm audit to meet our requirements:

  1. The audit report doesn’t include the chain of dependencies linking a vulnerable transitive dependency, which a developer may not recognize, to a direct dependency, which a developer should recognize. The last step in a security update job is creating a pull request that removes the vulnerability and we wanted to include some context that lets developers know how changes relate to their project’s direct dependencies.
  2. Dependabot performs security updates for one vulnerable dependency at a time. (Updating one dependency at a time keeps diffs to a minimum and reduces the likelihood of introducing breaking changes.) npm audit and npm audit fix, however, operate on all project dependencies, which means Dependabot wouldn’t be able to tell which of the resulting updates were necessary for the dependency it’s concerned with.

Fortunately, there’s a JavaScript API for accessing the audit functionality underlying the npm audit and npm audit fix commands via Arborist, the component npm uses to manage dependency trees. Since Dependabot is a Ruby application, we wrote a helper script that uses the Arborist.audit() API and can be invoked in a subprocess from Ruby. The script takes as input a vulnerable dependency and a list of security advisories affecting it and returns as output the updates necessary to remove the vulnerabilities as reported by npm.

To meet our first requirement, the script uses the audit results from Arborist.audit() to perform a depth-first traversal of the project’s dependency tree, starting with direct dependencies. This top-down, recursive approach allows us to maintain the chain of dependencies linking the vulnerable dependency to its top-level ancestor(s) (which we’ll want to mention later when creating a pull request), and its worst-case time complexity is linear in the total number of dependencies.

function buildDependencyChains(auditReport, name) {
  const helper = (node, chain, visited) => {
    if (!node) {
      return []
    }
    if (visited.has(node.name)) {
      // We've already seen this node; end path.
      return []
    }
    if (auditReport.has(node.name)) {
      const vuln = auditReport.get(node.name)
      if (vuln.isVulnerable(node)) {
        return [{ fixAvailable: vuln.fixAvailable, nodes: [node, ...chain.nodes] }]
      } else if (node.name == name) {
        // This is a non-vulnerable version of the advisory dependency; end path.
        return []
      }
    }
    if (!node.edgesOut.size) {
      // This is a leaf node that is unaffected by the vuln; end path.
      return []
    }
    return [...node.edgesOut.values()].reduce((chains, { to }) => {
      // Only prepend current node to chain/visited if it's not the project root.
      const newChain = node.isProjectRoot ? chain : { nodes: [node, ...chain.nodes] }
      const newVisited = node.isProjectRoot ? visited : new Set([node.name, ...visited])
      return chains.concat(helper(to, newChain, newVisited))
    }, [])
  }
  return helper(auditReport.tree, { nodes: [] }, new Set())
}

To meet our second requirement of operating on one vulnerable dependency at a time, the script takes advantage of the fact that the Arborist constructor accepts a custom audit registry URL to be used when requesting bulk advisory data. We initialize a mock audit registry server using nock that returns only the list of advisories (in the expected format) for the dependency that was passed into the script and we tell the Arborist instance to use it.

const arb = new Arborist({
  auditRegistry: 'http://localhost:9999',
  // ...
})

const scope = nock('http://localhost:9999')
  .persist()
  .post('/-/npm/v1/security/advisories/bulk')
  .reply(200, convertAdvisoriesToRegistryBulkFormat(advisories))
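
Putting these pieces together, here is a minimal sketch of how an audit for a single dependency might be driven (it assumes the buildDependencyChains function and the mock audit registry shown above; the exact wiring in Dependabot’s helper script differs):

// Run the audit against the mock registry, then trace the vulnerable
// dependency back to its top-level ancestors for the pull request description.
async function auditSingleDependency(name) {
  await arb.audit()
  return buildDependencyChains(arb.auditReport, name)
}

auditSingleDependency('node-forge').then((chains) => {
  for (const { nodes, fixAvailable } of chains) {
    const path = nodes.map((node) => `${node.name}@${node.version}`).join(' > ')
    console.log(`${path} (fix available: ${JSON.stringify(fixAvailable)})`)
  }
})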

We see both of these use cases—linking a vulnerable dependency to its top-level ancestor and conducting an audit for a single package or a particular set of vulnerabilities—as opportunities to extend Arborist and we’re working on integrating them upstream.

Back in the Ruby code, we parse and verify the audit results emitted by the helper script, accounting for scenarios such as a dependency being downgraded or removed in order to fix a vulnerability, and we incorporate the updates recommended by npm into the remainder of the security update job.

With a viable update path in hand, Dependabot is able to make the necessary updates to remove the vulnerability and submit a pull request that tells the developer about the transitive dependency and its top-level ancestor.

Screenshot of an open pull request that tells the developer about the transitive dependency and its top-level ancestor. The pull request is titled "Bump node-forge and react-scripts" and has a message from Dependabot that reads, "Merging this pull request will resolve 6 Dependabot alerts on node-forge including a high severity alert."

Caveats

When npm audit decides that a vulnerability can only be fixed by changing major versions, it requires use of the force option with npm audit fix. When the force option is used, npm will update to the latest version of a package, even if it means jumping several major versions. This breaks with Dependabot’s previous security update behavior. It also achieves our goal: to unlock conflicting dependencies in order to bring the vulnerable dependency to an unaffected version. Of course, you should still always review the changelog for breaking changes when jumping minor or major versions of a package.

Impact

We rolled out support for transitive security updates with npm in September 2022. Now, having a full quarter of data with the changes in place, we’re able to measure the impact: between Q1Y22 and Q4Y22 we saw a 42% reduction in update-not-possible errors for security updates on JavaScript projects. 🎉

If you have Dependabot security updates enabled on your npm projects, there’s nothing extra for you to do—you’re already benefiting from this improvement.

Looking ahead

I hope this post illustrates some of the considerations and trade-offs that are necessary when making improvements to an established system like Dependabot. We prefer to leverage the native functionality provided by package managers whenever possible, but as package managers come in all shapes and sizes, the approach may vary substantially from one ecosystem to the next.

We hope other package managers will introduce functionality similar to npm audit and npm audit fix that Dependabot can integrate with and we look forward to extending support for transitive security updates to those ecosystems as they do.

C5 Type 2 attestation report now available with 156 services in scope

Post Syndicated from Julian Herlinghaus original https://aws.amazon.com/blogs/security/c5-type-2-attestation-report-now-available-with-156-services-in-scope/

We continue to expand the scope of our assurance programs at Amazon Web Services (AWS), and we are pleased to announce that AWS has successfully completed the 2022 Cloud Computing Compliance Controls Catalogue (C5) attestation cycle with 156 services in scope. This alignment with C5 requirements demonstrates our ongoing commitment to adhere to the heightened expectations for cloud service providers. AWS customers in Germany and across Europe can run their applications on AWS Regions in scope of the C5 report with the assurance that AWS aligns with C5 requirements.

The C5 attestation scheme is backed by the German government and was introduced by the Federal Office for Information Security (BSI) in 2016. AWS has adhered to the C5 requirements since their inception. C5 helps organizations demonstrate operational security against common cyberattacks when using cloud services within the context of the German Government’s Security Recommendations for Cloud Computing Providers.

Independent third-party auditors evaluated AWS for the period October 1, 2021, through September 30, 2022. The C5 report illustrates AWS’ compliance status for both the basic and additional criteria of C5. Customers can download the C5 report through AWS Artifact. AWS Artifact is a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact.

AWS has added the following 16 services to the current C5 scope:

At present, the services offered in the Frankfurt, Dublin, London, Paris, Milan, Stockholm and Singapore Regions are in scope of this certification. For up-to-date information, see the AWS Services in Scope by Compliance Program page and choose C5.

AWS strives to continuously bring services into the scope of its compliance programs to help you meet your architectural and regulatory needs. If you have questions or feedback about C5 compliance, reach out to your AWS account team.

To learn more about our compliance and security programs, see AWS Compliance Programs. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Julian Herlinghaus

Julian is a Manager in AWS Security Assurance based in Berlin, Germany. He leads third-party and customer security audits across Europe and specifically the DACH region. He has previously worked as Information Security department lead of an accredited certification body and has multiple years of experience in information security and security assurance & compliance.

Andreas Terwellen

Andreas is a senior manager in security audit assurance at AWS, based in Frankfurt, Germany. His team is responsible for third-party and customer audits, attestations, certifications, and assessments across Europe. Previously, he was a CISO in a DAX-listed telecommunications company in Germany. He also worked for different consulting companies managing large teams and programs across multiple industries and sectors.

AWS achieves HDS certification in two additional Regions

Post Syndicated from Janice Leung original https://aws.amazon.com/blogs/security/aws-achieves-hds-certification-in-two-additional-regions/

We’re excited to announce that two additional AWS Regions—Asia Pacific (Jakarta) and Europe (Milan)—have been granted the Health Data Hosting (Hébergeur de Données de Santé, HDS) certification. This alignment with HDS requirements demonstrates our continued commitment to adhere to the heightened expectations for cloud service providers. AWS customers who handle personal health data can use HDS-certified Regions with confidence to manage their workloads.

The following 18 Regions are in scope for this certification:

  • US East (Ohio)
  • US East (Northern Virginia)
  • US West (Northern California)
  • US West (Oregon)
  • Asia Pacific (Jakarta)
  • Asia Pacific (Seoul)
  • Asia Pacific (Mumbai)
  • Asia Pacific (Singapore)
  • Asia Pacific (Sydney)
  • Asia Pacific (Tokyo)
  • Canada (Central)
  • Europe (Frankfurt)
  • Europe (Ireland)
  • Europe (London)
  • Europe (Milan)
  • Europe (Paris)
  • Europe (Stockholm)
  • South America (São Paulo)

Introduced by the French governmental agency for health, Agence Française de la Santé Numérique (ASIP Santé), the HDS certification aims to strengthen the security and protection of personal health data. Achieving this certification demonstrates that AWS provides a framework for technical and governance measures to secure and protect personal health data, governed by French law.

Independent third-party auditors evaluated and certified AWS on January 13, 2023. The Certificate of Compliance that demonstrates AWS compliance status is available on the Agence du Numérique en Santé (ANS) website and AWS Artifact. AWS Artifact is a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact.

For up-to-date information, including when additional Regions are added, see the AWS Compliance Programs page, and choose HDS.

AWS strives to continuously bring services into the scope of its compliance programs to help you meet your architectural and regulatory needs. If you have questions or feedback about HDS compliance, reach out to your AWS account team.

To learn more about our compliance and security programs, see AWS Compliance Programs. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.

If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

Janice Leung

Janice is a security audit program manager at AWS, based in New York. She leads security audits across Europe and previously worked in security assurance and technology risk management in the financial industry for 11 years.

How to revoke federated users’ active AWS sessions

Post Syndicated from Matt Howard original https://aws.amazon.com/blogs/security/how-to-revoke-federated-users-active-aws-sessions/

When you use a centralized identity provider (IdP) for human user access, changes that an identity administrator makes to a user within the IdP won’t invalidate the user’s existing active Amazon Web Services (AWS) sessions. This is due to the nature of session durations that are configured on assumed roles. This situation presents a challenge for identity administrators.

In this post, you’ll learn how to revoke access to specific users’ sessions on AWS assumed roles through the use of AWS Identity and Access Management (IAM) policies and service control policies (SCPs) via AWS Organizations.

Session duration overview

When you configure IAM roles, you have the option of configuring a maximum session duration that specifies how long a session is valid. By default, the temporary credentials provided to the user will last for one hour, but you can change this to a value of up to 12 hours.

When a user assumes a role in AWS by using their IdP credentials, that role’s credentials will remain valid for the length of their session duration. It’s convenient for end users to have a maximum session duration set to 12 hours, because this prevents their sessions from frequently timing out and then requiring re-login. However, a longer session duration also poses a challenge if you, as an identity administrator, attempt to revoke or modify a user’s access to AWS from your IdP.

For example, user John Doe is leaving the company and you want to verify that John has his privileges within AWS revoked. If John has access to IAM roles with long session durations, then he might have residual access to AWS despite having his session revoked or his user identity deleted within the IdP. Perhaps John assumed a role for his daily work at 8 AM and then you revoked his credentials within the IdP at 9 AM. Because John had already assumed an AWS role, he would still have access to AWS through that role for the duration of the configured session, which would be until 8 PM if the session was configured for 12 hours. Therefore, as a security best practice, AWS recommends that you do not set the session duration longer than is needed. This example is displayed in Figure 1.

Figure 1: Session duration overview
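
If you want to audit or tighten the maximum session duration on a role, you can also do so programmatically. The following is a minimal sketch using the AWS SDK for Python (Boto3); the role name is a placeholder for illustration, and you should confirm the duration that fits your own access patterns before changing it.

import boto3

iam = boto3.client("iam")

ROLE_NAME = "roleexample"  # placeholder role name for illustration

# Inspect the current maximum session duration (in seconds) for the role.
role = iam.get_role(RoleName=ROLE_NAME)["Role"]
print(f"Current MaxSessionDuration: {role['MaxSessionDuration']} seconds")

# Reduce the maximum session duration to one hour (3600 seconds) so that
# assumed-role credentials expire sooner after access is revoked in the IdP.
iam.update_role(RoleName=ROLE_NAME, MaxSessionDuration=3600)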

In order to restrict access despite the session duration being active, you could update the roles that are assumable from an IdP with a deny-all policy or delete the role entirely. However, this is a disruptive action for the users that have access to this role. If the role was deleted or the policy was updated to deny all, then users would no longer be able to assume the role or access their AWS environment. Instead, the recommended approach is to revoke access based on the specific user’s principalId or sourceIdentity values.

The principalId is the unique identifier for the entity that made the API call. When requests are made with temporary credentials, such as assumed roles through IdPs, this value also includes the session name, such as [email protected]. The sourceIdentity identifies the original user identity that is making the request, such as a user who is authenticated through SAML federation from an IdP. As a best practice, AWS recommends that you configure this value within the IdP, because this improves traceability for user sessions within AWS. You can find more information on this functionality in the blog post, How to integrate AWS STS SourceIdentity with your identity provider.
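
To illustrate where these values come from, the following sketch shows the SourceIdentity parameter on an STS AssumeRole call made with the AWS SDK for Python (Boto3). In a federated setup, the IdP normally sets this attribute for you through a SAML assertion or OIDC token; the role ARN, session name, and source identity below are hypothetical values for illustration only.

import boto3

sts = boto3.client("sts")

# Hypothetical role ARN and identity values. In a federated setup the IdP
# sets SourceIdentity for you; this call shows the equivalent STS API
# parameter so you can see where the value originates.
response = sts.assume_role(
    RoleArn="arn:aws:iam::111122223333:role/roleexample",
    RoleSessionName="JohnDoe-session",
    SourceIdentity="JohnDoe",
    DurationSeconds=3600,
)

creds = response["Credentials"]
print("AssumedRoleUser:", response["AssumedRoleUser"]["Arn"])
print("Credentials expire at:", creds["Expiration"])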

Identify the principalId and sourceIdentity by using CloudTrail

You can use AWS CloudTrail to review the actions taken by a user, role, or AWS service that are recorded as events. In the following procedure, you will use CloudTrail to identify the principalId and sourceIdentity contained in the CloudTrail record contents for your IdP assumed role.

To identify the principalId and sourceIdentity by using CloudTrail

  1. Assume a role in AWS by signing in through your IdP.
  2. Perform an action such as creating an S3 bucket.
  3. Navigate to the CloudTrail service.
  4. In the navigation pane, choose Event History.
  5. For Lookup attributes, choose Event name. For Event name, enter CreateBucket.
    Figure 2: Looking up the CreateBucket event in the CloudTrail event history

  6. Select the corresponding event record and review the event details. An example showing the userIdentity element is as follows.

"userIdentity": {
	"type": "AssumedRole",
	"principalId": 
"AROATVGBKRLCHXEXAMPLE:[email protected]",
	"arn": "arn:aws:sts::111122223333:assumed-
role/roleexample/[email protected]",
	"accountId": "111122223333",
	"accessKeyId": "ASIATVGBKRLCJEXAMPLE",
	"sessionContext": {
		"sessionIssuer": {
			"type": "Role",
			"principalId": "AROATVGBKRLCHXEXAMPLE",
			"arn": 
"arn:aws:iam::111122223333:role/roleexample",
			"accountId": "111122223333",
			"userName": "roleexample"
		},
		"webIdFederationData": {},
		"attributes": {
			"creationDate": "2022-07-05T15:48:28Z",
			"mfaAuthenticated": "false"
		},
		"sourceIdentity": "[email protected]"
	}
}

In this event record, you can see that principalId is “AROATVGBKRLCHXEXAMPLE:[email protected]” and sourceIdentity is specified as “[email protected]”. Now that you have these values, let’s explore how you can revoke access by using SCPs and IAM policies.
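
You can also retrieve the same values programmatically. The following sketch, which assumes the AWS SDK for Python (Boto3) and appropriate CloudTrail permissions, looks up recent CreateBucket events and prints the principalId and sourceIdentity from each record.

import json
import boto3

cloudtrail = boto3.client("cloudtrail")

# Look up recent CreateBucket events, mirroring the console steps above.
response = cloudtrail.lookup_events(
    LookupAttributes=[
        {"AttributeKey": "EventName", "AttributeValue": "CreateBucket"}
    ],
    MaxResults=10,
)

for event in response["Events"]:
    record = json.loads(event["CloudTrailEvent"])
    identity = record.get("userIdentity", {})
    principal_id = identity.get("principalId")
    source_identity = identity.get("sessionContext", {}).get("sourceIdentity")
    print(f"principalId: {principal_id}, sourceIdentity: {source_identity}")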

Use an SCP to deny users based on IdP user name or revoke session token

First, you will create an SCP, a policy that can be applied to an organization to offer central control of the maximum available permissions across the accounts in the organization. More information on SCPs, including steps to create and apply them, can be found in the AWS Organizations User Guide.

The SCP will have a deny-all statement with a condition for aws:userid, which evaluates the principalId field, and a condition for aws:SourceIdentity, which evaluates the sourceIdentity field. In the following example SCP, the users John Doe and Mary Major are prevented from accessing AWS in member accounts, regardless of their session duration, because each action is checked against their aws:userid and aws:SourceIdentity values and denied accordingly.

SCP to deny access based on IdP user name


{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                "StringLike": {
                    "aws:userid": [
                        "*:[email protected]",
                        "*:[email protected]"
                    ]
                }
            }
        },
        {
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "aws:SourceIdentity": [
                        "[email protected]",
                        "[email protected]"
                    ]
                }
            }
        }
    ]
}
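
If you prefer to manage SCPs programmatically, the following sketch creates and attaches a policy like the one above by using the AWS Organizations API through the AWS SDK for Python (Boto3). The user names and the target ID are placeholders; substitute the principalId session names and source identities from your own CloudTrail records, plus the root, OU, or account that you want the SCP applied to.

import json
import boto3

orgs = boto3.client("organizations")

# The deny statements from the example SCP above. The user names and the
# target ID below are placeholders for illustration.
scp_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {"StringLike": {"aws:userid": ["*:JohnDoe", "*:MaryMajor"]}},
        },
        {
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {"StringEquals": {"aws:SourceIdentity": ["JohnDoe", "MaryMajor"]}},
        },
    ],
}

policy = orgs.create_policy(
    Name="DenyRevokedFederatedUsers",
    Description="Deny access for specific federated users by userid and source identity",
    Type="SERVICE_CONTROL_POLICY",
    Content=json.dumps(scp_document),
)

# Attach the SCP to a root, OU, or member account (placeholder target ID).
orgs.attach_policy(
    PolicyId=policy["Policy"]["PolicySummary"]["Id"],
    TargetId="r-examplerootid",
)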

Use an IAM policy to revoke access in the AWS Organizations management account

SCPs do not affect users or roles in the AWS Organizations management account and instead only affect the member accounts in the organization. Therefore, using an SCP alone to deny access may not be sufficient. However, identity administrators can revoke access in a similar way within their management account by using the following procedure.

To create an IAM policy in the management account

  1. Sign in to the AWS Management Console by using your AWS Organizations management account credentials.
  2. Follow these steps to use the JSON policy editor to create an IAM policy. Use the JSON of the SCP shown in the preceding section, SCP to deny access based on IdP user name, in the IAM JSON editor.
  3. Follow these steps to add the IAM policy to roles that IdP users may assume within the account.
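
A minimal sketch of the preceding procedure using the AWS SDK for Python (Boto3) is shown below. It assumes that you have saved the SCP JSON from the previous section to a local file; the file name and role name are placeholders.

import boto3

iam = boto3.client("iam")

# Reuse the SCP JSON from the preceding section as an IAM policy document.
# The file name and role name below are placeholders for illustration.
with open("deny-revoked-federated-users.json") as f:
    policy_document = f.read()

policy = iam.create_policy(
    PolicyName="DenyRevokedFederatedUsers",
    PolicyDocument=policy_document,
)

# Attach the policy to each role in the management account that IdP users can assume.
iam.attach_role_policy(
    RoleName="roleexample",
    PolicyArn=policy["Policy"]["Arn"],
)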

Revoke active sessions when role chaining

At this point, the user’s actions on the IdP-assumable roles within the AWS organization have been blocked. However, there is still an edge case if the target users use role chaining (that is, use an IdP-assumed role credential to assume a second role) with a different RoleSessionName than the one assigned by the IdP. In a role chaining situation, the users will still have access by using the cached credentials for the second role.

This is where the sourceIdentity field is valuable. After a source identity is set, it is present in requests for AWS actions that are taken during the role session. The value that is set persists when a role is used to assume another role (role chaining). The value that is set cannot be changed during the role session. Therefore, it’s recommended that you configure the sourceIdentity field within the IdP as explained previously. This concept is shown in Figure 3.

Figure 3: Role chaining with sourceIdentity configured

A user assumes an IAM role via their IdP (#1), and the CloudTrail record displays sourceIdentity: [email protected] (#2). When the user assumes a new role within AWS (#3), that CloudTrail record continues to display sourceIdentity: [email protected] despite the principalId changing (#4).

However, if a second role is assumed in the account through role chaining and the sourceIdentity is not set, then it’s recommended that you revoke the issued session tokens for the second role. In order to do this, you can use the SCP at the end of this section, SCP to revoke active sessions for assumed roles. When you use this policy, the issued credentials related to the specified roles will be revoked for the users currently using them. Only users who were not denied through the previous SCP or IAM policies restricting their aws:userid will be able to reassume the target roles and obtain new temporary credentials.

If you take this approach, you will need to apply an SCP across the organization’s member accounts. The SCP must list the human-assumable roles used for role chaining and set a token issue time for when you want users’ access revoked. (Normally, this time would be set to the present time to immediately revoke access, but there might be circumstances in which you wish to revoke access at a future date, such as when a user moves to a new project or team and therefore requires different access levels.) In addition, you will need to follow the same procedures in your management account by creating a customer-managed policy that uses the same JSON with the condition statement for aws:PrincipalArn removed. Then attach the customer-managed policy to the individual roles that are human-assumable through role chaining.

SCP to revoke active sessions for assumed roles


{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Sid": "RevokeActiveSessions",
			"Effect": "Deny",
			"Action": [
				"*"
			],
			"Resource": [
				"*"
			],
			"Condition": {
				"StringEquals": {
					"aws:PrincipalArn": [
						"arn:aws:iam::<account-id>:role/<role-name>",
						"arn:aws:iam::<account-id>:role/<role-name>"
					]
				},
				"DateLessThan": {
					"aws:TokenIssueTime": "2022-06-01T00:00:00Z"
				}
			}
		}
	]
}
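
Because the aws:TokenIssueTime cutoff is normally set to the present moment, it can be convenient to generate the policy document programmatically. The following Python sketch builds the SCP shown above with the current UTC time; the role ARN is a placeholder, and you would still create and attach the resulting document as described earlier.

import json
from datetime import datetime, timezone

# Set the cutoff to the present moment so that session tokens issued before
# now are denied. The role ARN below is a placeholder for illustration.
cutoff = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

revoke_sessions_scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "RevokeActiveSessions",
            "Effect": "Deny",
            "Action": ["*"],
            "Resource": ["*"],
            "Condition": {
                "StringEquals": {
                    "aws:PrincipalArn": ["arn:aws:iam::111122223333:role/roleexample"]
                },
                "DateLessThan": {"aws:TokenIssueTime": cutoff},
            },
        }
    ],
}

print(json.dumps(revoke_sessions_scp, indent=4))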

Conclusion and final recommendations

In this blog post, I demonstrated how you can revoke a federated user’s active AWS sessions by using SCPs and IAM policies that restrict the use of the aws:userid and aws:SourceIdentity condition keys. I also shared how you can handle a role chaining situation with the aws:TokenIssueTime condition key.

This exercise demonstrates the importance of configuring the session duration parameter on IdP assumed roles. As a security best practice, you should set the session duration to no longer than what is needed to perform the role. In some situations, that could mean an hour or less in a production environment and a longer session in a development environment. Regardless, it’s important to understand the impact of configuring the maximum session duration in the user’s environment and also to have proper procedures in place for revoking a federated user’s access.

This post also covered the recommendation to set the sourceIdentity for assumed roles through the IdP. This value cannot be changed during role sessions and therefore persists when a user conducts role chaining. Following this recommendation minimizes the risk that a user might have assumed another role with a different session name than the one assigned by the IdP and helps prevent the edge case scenario of revoking active sessions based on TokenIssueTime.

You should also consider other security best practices, described in the Security Pillar of the AWS Well-Architected Framework, when you revoke users’ AWS access. For example, rotate credentials such as IAM access keys in situations where they are regularly used and shared among users. The example solutions in this post would not have prevented a user from performing AWS actions if that user had IAM access keys configured for a separate IAM user in the environment. Organizations should limit long-lived security credentials such as IAM keys and instead rotate them regularly or avoid their use altogether. Also, the concept of least privilege is highly important: limit the access that users have and scope it solely to the requirements needed to perform their job functions. Lastly, you should adopt a centralized identity provider coupled with the AWS IAM Identity Center (successor to AWS Single Sign-On) service in order to centralize identity management and avoid the need for multiple credentials for users.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Identity and Access Management re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Matt Howard

Matt is a Principal Technical Account Manager (TAM) for AWS Enterprise Support. As a TAM, Matt provides advocacy and technical guidance to help customers plan and build solutions using AWS best practices. Outside of AWS, Matt enjoys spending time with family, sports, and video games.

Three key security themes from AWS re:Invent 2022

Post Syndicated from Anne Grahn original https://aws.amazon.com/blogs/security/three-key-security-themes-from-aws-reinvent-2022/

AWS re:Invent returned to Las Vegas, Nevada, November 28 to December 2, 2022. After a virtual event in 2020 and a hybrid 2021 edition, spirits were high as over 51,000 in-person attendees returned to network and learn about the latest AWS innovations.

Now in its 11th year, the conference featured 5 keynotes, 22 leadership sessions, and more than 2,200 breakout sessions and hands-on labs at 6 venues over 5 days.

With well over 100 service and feature announcements—and innumerable best practices shared by AWS executives, customers, and partners—distilling highlights is a challenge. From a security perspective, three key themes emerged.

Turn data into actionable insights

Security teams are always looking for ways to increase visibility into their security posture and uncover patterns to make more informed decisions. However, as AWS Vice President of Data and Machine Learning, Swami Sivasubramanian, pointed out during his keynote, data often exists in silos; it isn’t always easy to analyze or visualize, which can make it hard to identify correlations that spark new ideas.

“Data is the genesis for modern invention.” – Swami Sivasubramanian, AWS VP of Data and Machine Learning

At AWS re:Invent, we launched new features and services that make it simpler for security teams to store and act on data. One such service is Amazon Security Lake, which brings together security data from cloud, on-premises, and custom sources in a purpose-built data lake stored in your account. The service, which is now in preview, automates the sourcing, aggregation, normalization, enrichment, and management of security-related data across an entire organization for more efficient storage and query performance. It empowers you to use the security analytics solutions of your choice, while retaining control and ownership of your security data.

Amazon Security Lake has adopted the Open Cybersecurity Schema Framework (OCSF), which AWS cofounded with a number of organizations in the cybersecurity industry. The OCSF helps standardize and combine security data from a wide range of security products and services, so that it can be shared and ingested by analytics tools. More than 37 AWS security partners have announced integrations with Amazon Security Lake, enhancing its ability to transform security data into a powerful engine that helps drive business decisions and reduce risk. With Amazon Security Lake, analysts and engineers can gain actionable insights from a broad range of security data and improve threat detection, investigation, and incident response processes.

Strengthen security programs

According to Gartner, by 2026, at least 50% of C-Level executives will have performance requirements related to cybersecurity risk built into their employment contracts. Security is top of mind for organizations across the globe, and as AWS CISO CJ Moses emphasized during his leadership session, we are continuously building new capabilities to help our customers meet security, risk, and compliance goals.

In addition to Amazon Security Lake, several new AWS services announced during the conference are designed to make it simpler for builders and security teams to improve their security posture in multiple areas.

Identity and networking

Authorization is a key component of applications. Amazon Verified Permissions is a scalable, fine-grained permissions management and authorization service for custom applications that simplifies policy-based access for developers and centralizes access governance. The new service gives developers a simple-to-use policy and schema management system to define and manage authorization models. The policy-based authorization system that Amazon Verified Permissions offers can shorten development cycles by months, provide a consistent user experience across applications, and facilitate integrated auditing to support stringent compliance and regulatory requirements.

Additional services that make it simpler to define authorization and service communication include Amazon VPC Lattice, an application-layer service that consistently connects, monitors, and secures communications between your services, and AWS Verified Access, which provides secure access to corporate applications without a virtual private network (VPN).

Threat detection and monitoring

Monitoring for malicious activity and anomalous behavior just got simpler. Amazon GuardDuty RDS Protection expands the threat detection capabilities of GuardDuty by using tailored machine learning (ML) models to detect suspicious logins to Amazon Aurora databases. You can enable the feature with a single click in the GuardDuty console, with no agents to manually deploy, no data sources to enable, and no permissions to configure. When RDS Protection detects a potentially suspicious or anomalous login attempt that indicates a threat to your database instance, GuardDuty generates a new finding with details about the potentially compromised database instance. You can view GuardDuty findings in AWS Security Hub, Amazon Detective (if enabled), and Amazon EventBridge, allowing for integration with existing security event management or workflow systems.

To bolster vulnerability management processes, Amazon Inspector now supports AWS Lambda functions, adding automated vulnerability assessments for serverless compute workloads. With this expanded capability, Amazon Inspector automatically discovers eligible Lambda functions and identifies software vulnerabilities in application package dependencies used in the Lambda function code. Actionable security findings are aggregated in the Amazon Inspector console, and pushed to Security Hub and EventBridge to automate workflows.

Data protection and privacy

The first step to protecting data is to find it. Amazon Macie now automatically discovers sensitive data, providing continual, cost-effective, organization-wide visibility into where sensitive data resides across your Amazon Simple Storage Service (Amazon S3) estate. With this new capability, Macie automatically and intelligently samples and analyzes objects across your S3 buckets, inspecting them for sensitive data such as personally identifiable information (PII), financial data, and AWS credentials. Macie then builds and maintains an interactive data map of your sensitive data in S3 across your accounts and Regions, and provides a sensitivity score for each bucket. This helps you identify and remediate data security risks without manual configuration and reduce monitoring and remediation costs.

Encryption is a critical tool for protecting data and building customer trust. The launch of the end-to-end encrypted enterprise communication service AWS Wickr offers advanced security and administrative controls that can help you protect sensitive messages and files from unauthorized access, while working to meet data retention requirements.

Management and governance

Maintaining compliance with regulatory, security, and operational best practices as you provision cloud resources is key. AWS Config rules, which evaluate the configuration of your resources, have now been extended to support proactive mode, so that they can be incorporated into infrastructure-as-code continuous integration and continuous delivery (CI/CD) pipelines to help identify noncompliant resources prior to provisioning. This can significantly reduce time spent on remediation.

Managing the controls needed to meet your security objectives and comply with frameworks and standards can be challenging. To make it simpler, we launched comprehensive controls management with AWS Control Tower. You can use it to apply managed preventative, detective, and proactive controls to accounts and organizational units (OUs) by service, control objective, or compliance framework. You can also use AWS Control Tower to turn on Security Hub detective controls across accounts in an OU. This new set of features reduces the time that it takes to define and manage the controls required to meet specific objectives, such as supporting the principle of least privilege, restricting network access, and enforcing data encryption.

Do more with less

As we work through macroeconomic conditions, security leaders are facing increased budgetary pressures. In his opening keynote, AWS CEO Adam Selipsky emphasized the effects of the pandemic, inflation, supply chain disruption, energy prices, and geopolitical events that continue to impact organizations.

Now more than ever, it is important to maintain your security posture despite resource constraints. Citing specific customer examples, Selipsky underscored how the AWS Cloud can help organizations move faster and more securely. By moving to the cloud, agricultural machinery manufacturer Agco reduced costs by 78% while increasing data retrieval speed, and multinational HVAC provider Carrier Global experienced a 40% reduction in the cost of running mission-critical ERP systems.

“If you’re looking to tighten your belt, the cloud is the place to do it.” – Adam Selipsky, AWS CEO

Security teams can do more with less by maximizing the value of existing controls, and bolstering security monitoring and analytics capabilities. Services and features announced during AWS re:Invent—including Amazon Security Lake, sensitive data discovery with Amazon Macie, support for Lambda functions in Amazon Inspector, Amazon GuardDuty RDS Protection, and more—can help you get more out of the cloud and address evolving challenges, no matter the economic climate.

Security is our top priority

AWS re:Invent featured many more highlights on a variety of topics, such as Amazon EventBridge Pipes and the pre-announcement of GuardDuty EKS Runtime protection, as well as Amazon CTO Dr. Werner Vogels’ keynote, and the security partnerships showcased on the Expo floor. It was a whirlwind week, but one thing is clear: AWS is working harder than ever to make our services better and to collaborate on solutions that ease the path to proactive security, so that you can focus on what matters most—your business.

For more security-related announcements and on-demand sessions, see A recap for security, identity, and compliance sessions at AWS re:Invent 2022 and the AWS re:Invent Security, Identity, and Compliance playlist on YouTube.

If you have feedback about this post, submit comments in the Comments section below.

Anne Grahn

Anne is a Senior Worldwide Security GTM Specialist at AWS based in Chicago. She has more than a decade of experience in the security industry, and has a strong focus on privacy risk management. She maintains a Certified Information Systems Security Professional (CISSP) certification.

Paul Hawkins

Paul helps customers of all sizes understand how to think about cloud security so they can build the technology and culture where security is a business enabler. He takes an optimistic approach to security and believes that getting the foundations right is the key to improving your security posture.

CIO Week 2023 recap

Post Syndicated from James Chang original https://blog.cloudflare.com/cio-week-2023-recap/

In our Welcome to CIO Week 2023 post, we talked about wanting to start the year by celebrating the work Chief Information Officers do to keep their organizations safe and productive.

Over the past week, you learned about announcements addressing all facets of your technology stack – including new services, betas, strategic partnerships, third party integrations, and more. This recap blog summarizes each announcement and labels what capability is generally available (GA), in beta, or on our roadmap.

We delivered on critical capabilities requested by our customers – such as even more comprehensive phishing protection and deeper integrations with the Microsoft ecosystem. Looking ahead, we also described our roadmap for emerging technology categories like Digital Experience Monitoring and our vision to make it exceedingly simple to route traffic from any source to any destination through Cloudflare’s network.

Everything we launched is designed to help CIOs accelerate their pursuit of digital transformation. In this blog, we organized our announcement summaries based on the three feelings we want CIOs to have when they consider partnering with Cloudflare:

  1. CIOs now have a simpler roadmap to Zero Trust and SASE: We announced new capabilities and tighter integrations that make it easier for organizations to adopt Zero Trust security best practices and move towards aspirational architectures like Secure Access Service Edge (SASE).
  2. CIOs have access to the right technology and channel partners: We announced integrations and programming to help organizations access the right expertise to modernize IT and security at their own pace with the technologies they already use.
  3. CIOs can streamline a multi-cloud strategy with ease: We announced new ways to connect, secure, and accelerate traffic across diverse cloud environments.

Thank you for following CIO Week, Cloudflare’s first of many Innovation Weeks in 2023. It can be hard to keep up with our pace of innovation sometimes, but we hope that reading this blog and registering for our recap webinar will help!

If you want to speak with us about how to modernize your IT and security and make life easier for your organization’s CIO, fill out the form here.

Simplifying your journey to Zero Trust and SASE

Securing access
These blog posts are focused on making it faster, easier, and safer to connect any user to any application with the granular controls and comprehensive visibility needed to achieve Zero Trust.

  • Beta: Introducing Digital Experience Monitoring – Cloudflare Digital Experience Monitoring will be an all-in-one dashboard that helps CIOs understand how critical applications and Internet services are performing across their entire corporate network. Sign up for beta access.
  • Beta: Weave your own global, private, virtual Zero Trust network on Cloudflare with WARP-to-WARP – With a single click, any device running Cloudflare’s device client, WARP, in your organization can reach any other device running WARP over a private network. Sign up for beta access.
  • GA: New ways to troubleshoot Cloudflare Access ‘blocked’ messages – Investigate ‘allow’ or ‘block’ decisions based on how a connection was made with the same level of ease that you can troubleshoot user identity within Cloudflare’s Zero Trust platform.
  • Beta: One-click data security for your internal and SaaS applications – Secure sensitive data by running application sessions in an isolated browser and control how users interact with sensitive data – now with just one click. Sign up for beta access.
  • GA: Announcing SCIM support for Cloudflare Access & Gateway – Cloudflare’s ZTNA (Access) and SWG (Gateway) services now support the System for Cross-domain Identity Management (SCIM) protocol, making it easier for administrators to manage identity records across systems.
  • GA: Cloudflare Zero Trust: The Most Exciting Ping Release Since 1983 – Cloudflare Zero Trust administrators can use familiar debugging tools that use the ICMP protocol (like Ping, Traceroute, and MTR) to test connectivity to private network destinations.

Threat defense
These blog posts are focused on helping organizations filter, inspect, and isolate traffic to protect users from phishing, ransomware, and other Internet threats.

  • GA: Email Link Isolation: your safety net for the latest phishing attacks – Email Link Isolation is your safety net for the suspicious links that end up in inboxes and that users may click. This added protection turns Cloudflare Area 1 into the most comprehensive email security solution when it comes to protecting against phishing attacks.
  • GA: Bring your own certificates to Cloudflare Gateway – Administrators can use their own custom certificates to apply HTTP, DNS, CASB, DLP, RBI and other filtering policies.
  • GA: Announcing Custom DLP profiles – Cloudflare’s Data Loss Prevention (DLP) service now offers the ability to create custom detections, so that organizations can inspect traffic for their most sensitive data.
  • GA: Cloudflare Zero Trust for Managed Service Providers – Learn how the U.S. Federal Government and other large Managed Service Providers (MSPs) are using Cloudflare’s Tenant API to apply security policies like DNS filtering across the organizations they manage.

Secure SaaS environments
These blog posts are focused on maintaining consistent security and visibility across SaaS application environments, in particular to protect leaks of sensitive data.

  • Roadmap: How Cloudflare CASB and DLP work together to protect your data – Cloudflare Zero Trust will introduce capabilities between our CASB and DLP services that will enable administrators to peer into the files stored in their SaaS applications and identify sensitive data inside them.
  • Roadmap: How Cloudflare Area 1 and DLP work together to protect data in email – Cloudflare is combining capabilities from Area 1 Email Security and Data Loss Prevention (DLP) to provide complete data protection for corporate email.
  • GA: Cloudflare CASB: Scan Salesforce and Box for security issues – Cloudflare CASB now integrates with Salesforce and Box, enabling IT and security teams to scan these SaaS environments for security risks.

Accelerating and securing connectivity
In addition to product capabilities, blog posts in this section highlight speed and other strategic benefits that organizations realize with Cloudflare.

  • Why do CIOs choose Cloudflare One? – As part of CIO Week, we spoke with the leaders of some of our largest customers to better understand why they selected Cloudflare One. Learn six thematic reasons why.
  • Cloudflare is faster than Zscaler – Cloudflare is 38-55% faster at delivering Zero Trust experiences than Zscaler, as validated by third party testing.
  • GA: Network detection and settings profiles for the Cloudflare One agent – Cloudflare’s device client (WARP) can now securely detect pre-configured locations and route traffic based on the needs of the organization for that location.

Making Cloudflare easier to use
These blog posts highlight innovations across the Cloudflare portfolio, and outside the Zero Trust and SASE categories, to help organizations secure and accelerate traffic with ease.

  • Preview any Cloudflare product today – Enterprise customers can now start previewing non-contracted services with a single click in the dashboard.
  • GA: Improved access controls: API access can now be selectively disabled – Cloudflare is making it easier for account owners to view and manage the access their users have on an account by allowing them to restrict API access to the account.
  • GA: Zone Versioning is now generally available – Zone Versioning allows customers to safely manage zone configuration by versioning changes and choosing how and when to deploy those changes to defined environments of traffic.
  • Roadmap: Cloudflare Application Services for private networks: do more with the tools you already love – Cloudflare is unlocking operational efficiencies by working on integrations between our Application Services to protect Internet-facing websites and our Cloudflare One platform to protect corporate networks.

Collaborating with the right partners

In addition to new programming for our channel partners, these blog posts describe deeper technical integrations that help organizations work more efficiently with the IT and security tools they already use.

  • GA: Expanding our Microsoft collaboration: Proactive and automated Zero Trust security for customers – Cloudflare announced four new integrations between Microsoft Azure Active Directory (Azure AD) and Cloudflare Zero Trust that reduce risk proactively. These integrated offerings increase automation, allowing security teams to focus on threats versus implementation and maintenance.
  • Beta: API-based email scanning – Now, Microsoft Office 365 customers can deploy Area 1 cloud email security via Microsoft Graph API. This feature enables O365 customers to quickly deploy the Area 1 product via API, with onboarding through the Microsoft Marketplace coming in the near future.
  • GA: China Express: Cloudflare partners to boost performance in China for corporate networks – China Express is a suite of offerings designed to simplify connectivity and improve performance for users in China and developed in partnership with China Mobile International and China Broadband Communications.
  • Beta: Announcing the Authorized Partner Service Delivery Track for Cloudflare One – Cloudflare announced the limited availability of a new specialization track for our channel and implementation partners, designed to help develop their expertise in delivering Cloudflare One services.

Streamlining your multi-cloud strategy

These blog posts highlight innovations that make it easier for organizations to simply ‘plug into’ Cloudflare’s network and send traffic from any source to any destination.

  • Beta: Announcing the Magic WAN Connector: the easiest on-ramp to your next generation network – Cloudflare is making it even easier to get connected with the Magic WAN Connector: a lightweight software package you can install in any physical or cloud network to automatically connect, steer, and shape any IP traffic. Sign up for early access.
  • GA: Cloud CNI privately connects your clouds to Cloudflare – Customers using Google Cloud Platform, Azure, Oracle Cloud, IBM Cloud, and Amazon Web Services can now open direct connections from their private cloud instances into Cloudflare.
  • Cloudflare protection for all your cardinal directions – This blog post recaps how definitions of corporate network traffic have shifted and how Cloudflare One provides protection for all traffic flows, regardless of source or destination.

Expanding our Microsoft collaboration: proactive and automated Zero Trust security for customers

Post Syndicated from Abhi Das original https://blog.cloudflare.com/expanding-our-collaboration-with-microsoft-proactive-and-automated-zero-trust-security/

As CIOs navigate the complexities of stitching together multiple solutions, we are extending our partnership with Microsoft to create one of the best Zero Trust solutions available. Today, we are announcing four new integrations between Azure AD and Cloudflare Zero Trust that reduce risk proactively. These integrated offerings increase automation allowing security teams to focus on threats versus implementation and maintenance.

What is Zero Trust and why is it important?

Zero Trust is an overused term in the industry and creates a lot of confusion. So, let’s break it down. Zero Trust architecture emphasizes the “never trust, always verify” approach. One way to think about it is that in the traditional security perimeter or “castle and moat” model, you have access to all the rooms inside the building (e.g., apps) simply by having access to the main door (typically a VPN). In the Zero Trust model, you would need to obtain access to each locked room (or app) individually rather than relying only on access through the main door. Some key components of the Zero Trust model are identity, e.g., Azure AD (who is making the request); apps, e.g., an SAP instance or a custom app on Azure (which application is being accessed); policies, e.g., Cloudflare Access rules (who can access which application); devices, e.g., a laptop managed by Microsoft Intune (the security of the endpoint requesting access); and other contextual signals.

Zero Trust is even more important today since companies of all sizes are faced with an accelerating digital transformation and an increasingly distributed workforce. Moving away from the castle and moat model, to the Internet becoming your corporate network, requires security checks for every user accessing every resource. As a result, all companies, especially those whose use of Microsoft’s broad cloud portfolio is increasing, are adopting a Zero Trust architecture as an essential part of their cloud journey.

Cloudflare’s Zero Trust platform provides a modern approach to authentication for internal and SaaS applications. Most companies likely have a mix of corporate applications – some that are SaaS and some that are hosted on-premise or on Azure. Cloudflare’s Zero Trust Network Access (ZTNA) product as part of our Zero Trust platform makes these applications feel like SaaS applications, allowing employees to access them with a simple and consistent flow. Cloudflare Access acts as a unified reverse proxy to enforce access control by making sure every request is authenticated, authorized, and encrypted.

Cloudflare Zero Trust and Microsoft Azure Active Directory

We have thousands of customers using Azure AD and Cloudflare Access as part of their Zero Trust architecture. Our partnership with Microsoft, announced last year, strengthened security without compromising performance for our joint customers. Cloudflare’s Zero Trust platform integrates with Azure AD, providing a seamless application access experience for your organization’s hybrid workforce.

As a recap, the integrations we launched solved two key problems:

  1. For on-premise legacy applications, Cloudflare’s participation as Azure AD secure hybrid access partner enabled customers to centrally manage access to their legacy on-premise applications using SSO authentication without incremental development. Joint customers now easily use Cloudflare Access as an additional layer of security with built-in performance in front of their legacy applications.
  2. For apps that run on Microsoft Azure, joint customers can integrate Azure AD with Cloudflare Zero Trust and build rules based on user identity, group membership and Azure AD Conditional Access policies. Users will authenticate with their Azure AD credentials and connect to Cloudflare Access with just a few simple steps using Cloudflare’s app connector, Cloudflare Tunnel, that can expose applications running on Azure. See guide to install and configure Cloudflare Tunnel.

Recognizing Cloudflare’s innovative approach to Zero Trust and Security solutions, Microsoft awarded us the Security Software Innovator award at the 2022 Microsoft Security Excellence Awards, a prestigious classification in the Microsoft partner community.

But we aren’t done innovating. We listened to our customers’ feedback, and to address their pain points, we are announcing several new integrations.

Microsoft integrations we are announcing today

The four new integrations we are announcing today are:

1. Per-application conditional access: Azure AD customers can use their existing Conditional Access policies in Cloudflare Zero Trust.

Azure AD allows administrators to create and enforce policies on both applications and users using Conditional Access. It provides a wide range of parameters that can be used to control user access to applications (e.g. user risk level, sign-in risk level, device platform, location, client apps, etc.). Cloudflare Access now supports Azure AD Conditional Access policies per application. This allows security teams to define their security conditions in Azure AD and enforce them in Cloudflare Access.

For example, customers might have tighter levels of control for an internal payroll application and hence will have specific Conditional Access policies in Azure AD. However, for a general-information application such as an internal wiki, customers might enforce less stringent Azure AD Conditional Access policies. In this case, both app groups and the relevant Azure AD Conditional Access policies can be plugged into Cloudflare Zero Trust seamlessly without any code changes.

2. SCIM: Autonomously synchronize Azure AD groups between Cloudflare Zero Trust and Azure AD, saving hundreds of hours in the CIO org.

Cloudflare Access policies can use Azure AD to verify a user’s identity and provide information about that user (e.g., first/last name, email, group membership, etc.). These user attributes are not always constant and can change over time. When a user retains access to sensitive resources that they should no longer have, it can have serious consequences.

Often when user attributes change, an administrator needs to review and update all access policies that may include the user in question. This makes for a tedious process and an error-prone outcome.

The SCIM (System for Cross-domain Identity Management) specification ensures that user identities across entities using it are always up-to-date. We are excited to announce that joint customers of Azure AD and Cloudflare Access can now enable SCIM user and group provisioning and deprovisioning. It will accomplish the following:

  • The IdP policy group selectors are now pre-populated with Azure AD groups and will remain in sync. Any changes made to the policy group will instantly reflect in Access without any overhead for administrators.

  • When a user is deprovisioned on Azure AD, all the user’s access is revoked across Cloudflare Access and Gateway. This ensures that change is made in near real time thereby reducing security risks.

3. Risky user isolation: Helps joint customers add an extra layer of security by isolating high-risk users (based on Azure AD signals), such as contractors, in browser-isolated sessions via Cloudflare’s RBI product.

Azure AD classifies users into low-, medium-, and high-risk groups based on the many data points it analyzes. Users may move from one risk group to another based on their activities. Users can be deemed risky based on many factors, such as the nature of their employment (for example, contractors), risky sign-in behavior, credential leaks, and so on. While these users are high-risk, there is a low-risk way to provide access to resources and apps while the user is assessed further.

We now support integrating Azure AD groups with Cloudflare Browser Isolation. When a user is classified as high-risk on Azure AD, we use this signal to automatically isolate their traffic with our Azure AD integration. This means a high-risk user can access resources through a secure and isolated browser. If the user were to move from high-risk to low-risk, the user would no longer be subjected to the isolation policy applied to high-risk users.

4. Secure joint Government Cloud customers: Helps Government Cloud customers achieve better security with centralized identity and access management via Azure AD, and adds an additional layer of security by connecting them to the Cloudflare global network rather than opening their applications up to the whole Internet.

Via Secure Hybrid Access (SHA) program, Government Cloud (‘GCC’) customers will soon be able to integrate Azure AD with Cloudflare Zero Trust and build rules based on user identity, group membership and Azure AD conditional access policies. Users will authenticate with their Azure AD credentials and connect to Cloudflare Access with just a few simple steps using Cloudflare Tunnel that can expose applications running on Microsoft Azure.

“Digital transformation has created a new security paradigm resulting in organizations accelerating their adoption of Zero Trust. The Cloudflare Zero Trust and Azure Active Directory joint solution has been a growth enabler for Swiss Re by easing Zero Trust deployments across our workforce allowing us to focus on our core business. Together, the joint solution enables us to go beyond SSO to empower our adaptive workforce with frictionless, secure access to applications from anywhere. The joint solution also delivers us a holistic Zero Trust solution that encompasses people, devices, and networks.”
– Botond Szakács, Director, Swiss Re

A cloud-native Zero Trust security model has become an absolute necessity as enterprises continue to adopt a cloud-first strategy. Cloudflare and Microsoft have jointly developed robust product integrations to help security and IT leaders and CIO teams prevent attacks proactively, dynamically control policy and risk, and increase automation in alignment with Zero Trust best practices.
– Joy Chik, President, Identity & Network Access, Microsoft

Try it now

Interested in learning more about how our Zero Trust products integrate with Azure Active Directory? Take a look at this extensive reference architecture that can help you get started on your Zero Trust journey and then add the specific use cases above as required. Also, check out this joint webinar with Microsoft that highlights our joint Zero Trust solution and how you can get started.

What next

We are just getting started. We want to keep innovating and making the joint Cloudflare Zero Trust and Microsoft Security solution better at solving your problems. Please give us feedback on what else you would like us to build as you continue using this joint solution.

API-based email scanning

Post Syndicated from Ayush Kumar original https://blog.cloudflare.com/api-based-email-scanning/

The landscape of email security is constantly changing. One aspect that remains consistent is the reliance on email as the starting point for the majority of threat campaigns. Attackers often start with a phishing campaign to gather employee credentials which, if successful, are used to exfiltrate data, siphon money, or perform other malicious activities. This threat remains ever present even as companies transition to moving their email to the cloud using providers like Microsoft 365 or Google Workspace.

In our pursuit to help build a better Internet and tackle online threats, Cloudflare offers email security via our Area 1 product to protect all types of email inboxes – from cloud to on premise. The Area 1 product analyzes every email an organization receives and uses our threat models to assess if the message poses risk to the customer. For messages that are deemed malicious, the Area 1 platform will even prevent the email from landing in the recipient’s inbox, ensuring that there is no chance for the attempted attack to be successful.

We try to provide customers with the flexibility to deploy our solution in whatever way they find easiest. Continuing in this pursuit to make our solution as turnkey as possible, we are excited to announce our open beta for Microsoft 365 domain onboarding via the Microsoft Graph API. We know that domains onboarded via API offer quicker deployment times and more flexibility. This onboarding method is one of many, so customers can now deploy domains how they see fit without losing Area 1 protection.

Onboarding Microsoft 365 Domains via API

Cloudflare Area 1 provides customers with many deployment options. Whether it is Journaling + BCC (where customers send a copy of each email to Area 1), Inline/MX records (where another hop is added via MX records), or Secure Email Gateway Connectors (where Area 1 directly interacts with a SEG), Area 1 provides customers with flexibility with how they want to deploy our solution. However, we have always recommended customers to deploy using MX records.

Adding this extra hop and having domains point to Area 1 allows the service to provide protection with sinkholing, making sure that malicious emails don’t reach the destination email inbox. However, we recognized that configuring Area 1 as the first hop (i.e., changing the MX records) may require sign-offs from other teams inside organizations and can lead to additional cycles. Organizations are also left waiting for this inline change to propagate in DNS (known as DNS propagation time). We know our customers want to be protected ASAP while they make these necessary adjustments.

With Microsoft 365 onboarding, the process of adding protection requires less configuration steps and waiting time. We now use the Microsoft Graph API to evaluate all messages associated with a domain. This allows for greater flexibility for operation teams to deploy Area 1.

For example, a customer of Area 1 who is heavily involved in M&A transactions due to the nature of their industry benefits from being able to deploy Area 1 quickly using the Microsoft API. Before API onboarding, IT teams spent time juggling the handover of various acquisition assets. Assigning new access rights, handing over ownership, and other tasks took time to execute, leaving mailboxes unsecured. However, now when the customer acquires a new entity, they can use API onboarding to quickly add protection for the domains they just acquired. This allows them to have protection on the email addresses associated with the new domain while they work on completing the other tasks at hand. How our API onboarding process works can be seen below.

Once we are authorized to read incoming messages from Microsoft 365, we will start processing emails and firing detections on suspected emails. This new onboarding process is significantly faster and only requires a few clicks to get started.
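
For readers curious about what this kind of access looks like, the following is a minimal sketch that uses the Microsoft Graph API to list recent messages in a mailbox with an app-only token. This only illustrates the style of read access granted during API onboarding, not Area 1’s implementation; the tenant ID, application credentials, and mailbox address are placeholders, and the registered application would need the Mail.Read application permission.

import requests

# Placeholders for illustration only.
TENANT_ID = "your-tenant-id"
CLIENT_ID = "your-app-client-id"
CLIENT_SECRET = "your-app-client-secret"
MAILBOX = "user@example.com"

# Acquire an app-only token via the OAuth 2.0 client credentials flow.
token_resp = requests.post(
    f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token",
    data={
        "client_id": CLIENT_ID,
        "client_secret": CLIENT_SECRET,
        "scope": "https://graph.microsoft.com/.default",
        "grant_type": "client_credentials",
    },
)
access_token = token_resp.json()["access_token"]

# List the most recent messages in the mailbox.
messages = requests.get(
    f"https://graph.microsoft.com/v1.0/users/{MAILBOX}/messages?$top=10",
    headers={"Authorization": f"Bearer {access_token}"},
)
for message in messages.json().get("value", []):
    print(message["receivedDateTime"], message["subject"])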

To start the process, choose which domain you would like to onboard via API. Then within the UI, you can navigate to “Domains & Routing” within the settings. After adding a new domain and choosing API scan, you can follow our setup wizard to authorize Area 1 to start reading messages.

API scan

Within a few minutes of authorization, your organization will now be protected by Area 1.

Ready to scan

Looking Ahead

This onboarding process is part of our continual efforts to provide customers with best of class email protection. With our API onboarding we provide customers with increased flexibility to deploy our solution. As we look forward, our Microsoft 365 API onboarding opens the door for other capabilities.

Our team is now looking to add the ability to retroactively scan emails that were sent before Area 1 was installed. This provides the opportunity for new customers to clean up any old emails that could still pose a risk for the organization. We are also looking to provide more levers for organizations who want to have more control on which mailboxes are scanned with Area 1. Soon customers will be able to designate within the UI which mailboxes will have their incoming email scanned by Area 1.

We also currently limit the deployment type of each domain to one type (i.e. a domain can either be onboarded using MX records or API). However, we are now looking at providing customers with the ability to do hybrid deployments, using both API + MX records. This combinatorial approach not only provides the greatest flexibility but also provides the maximum coverage.

There are many things in the pipeline that the Area 1 team is looking to bring to customers in 2023 and this open beta lets us build these new capabilities.

All customers can join the open beta so if you are interested in onboarding a new domain using this method, follow the steps above and get Area 1 protection on your Microsoft 365 Domains.

Email Link Isolation: your safety net for the latest phishing attacks

Post Syndicated from Joao Sousa Botto original https://blog.cloudflare.com/area1-eli-ga/

Email is one of the most ubiquitous and also most exploited tools that businesses use every single day. Baiting users into clicking malicious links within an email has been a particularly long-standing tactic for the vast majority of bad actors, from the most sophisticated criminal organizations to the least experienced attackers.

Even though this is a commonly known approach to gaining account access or committing fraud, users are still being tricked into clicking malicious links that, in many cases, lead to exploitation. The reason is simple: even the best-trained users (and security solutions) cannot always distinguish a good link from a bad one.

On top of that, securing employees’ mailboxes often results in multiple vendors, complex deployments, and a huge drain of resources.

Email Link Isolation turns Cloudflare Area 1 into the most comprehensive email security solution when it comes to protecting against phishing attacks. It rewrites links that could be exploited, keeps users vigilant by alerting them to the uncertainty around the website they’re about to visit, and protects against malware and vulnerabilities through the user-friendly Cloudflare Browser Isolation service. Also, in true Cloudflare fashion, it’s a one-click deployment.

With more than a couple dozen customers in beta and over one million links protected (so far), we can now clearly see the significant value and potential that this solution can deliver. To extend these benefits to more customers and continue to expand on the multitude of ways we can apply this technology, we’re making Email Link Isolation generally available (GA) starting today.

Email Link Isolation is included with the Cloudflare Area 1 Enterprise plan at no extra cost, and can be enabled with three clicks:

1. Log in to the Area 1 portal.

2. Go to Settings (the gear icon).

3. On Email Configuration, go to Email Policies > Link Actions.

4. Scroll to Email Link Isolation and enable it.

Defense in layers

Applying multiple layers of defense becomes ever more critical as threat actors continuously look for ways to navigate around each security measure and develop more complex attacks. One of the best examples that demonstrates these evolving techniques is a deferred phishing attack, where an embedded URL is benign when the email reaches your email security stack and eventually your users’ inbox, but is later weaponized post-delivery.

To combat evolving email-borne threats, such as malicious links, Area 1 continually updates its machine learning (ML) models to account for all potential attack vectors, and leverages post-delivery scans and retractions as additional layers of defense. And now, customers on the Enterprise plan also have access to Email Link Isolation as one last defense – a safety net.

The key to successfully adding layers of security is to use a strong Zero Trust suite, not a disjointed set of products from multiple vendors. Users need to be kept safe without disrupting their productivity – otherwise they’ll start seeing important emails being quarantined or run into a poor experience when accessing websites, and soon enough they’ll be the ones looking for ways around the company’s security measures.

Built to avoid productivity impacts

Email Link Isolation provides an additional layer of security with virtually no disruption to the user experience. It’s smart enough to decide which links are safe, which are malicious, and which are still dubious. Those dubious links are then changed (rewritten to be precise) and Email Link Isolation keeps evaluating them until it reaches a verdict with a high degree of confidence. When a user clicks on one of those rewritten links, Email Link Isolation checks for a verdict (benign or malign) and takes the corresponding action – benign links open in the local browser as if they hadn’t been changed, while malign links are prevented from opening altogether.
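
To picture the click-time flow, here is a conceptual sketch written as a Cloudflare Worker. It is not the actual Email Link Isolation implementation: the LINK_VERDICTS store, the url query parameter, and the isolation URL format are illustrative assumptions.

```typescript
// Conceptual sketch of click-time verdict routing for a rewritten link.
// Not Area 1's implementation: the KV namespace, query parameter, and
// Browser Isolation URL below are illustrative assumptions.
export interface Env {
  LINK_VERDICTS: KVNamespace; // hypothetical store of url -> verdict
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const target = new URL(request.url).searchParams.get("url");
    if (!target) return new Response("Missing link", { status: 400 });

    // The verdict may have changed since delivery, so check at click time.
    const verdict = (await env.LINK_VERDICTS.get(target)) ?? "unknown";

    if (verdict === "benign") {
      // Safe: send the user straight to the original destination.
      return Response.redirect(target, 302);
    }
    if (verdict === "malicious") {
      // Known bad: block the navigation outright.
      return new Response("This link has been blocked.", { status: 403 });
    }

    // Uncertain: show an interstitial; "Proceed" opens an isolated browser.
    const isolated = `https://isolation.example/browse?url=${encodeURIComponent(target)}`;
    return new Response(
      `<h1>Careful!</h1>
       <p>This website looks suspicious. Do not enter passwords or personal
       information unless you know and fully trust it.</p>
       <a href="${isolated}">Proceed in an isolated browser</a>`,
      { headers: { "content-type": "text/html" } }
    );
  },
};
```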

Most importantly, when Email Link Isolation is unable to confidently determine a verdict based on all available intelligence, an interstitial page is presented to ask the user to be extra vigilant. The interstitial page calls out that the website is suspicious, and that the user should refrain from entering any personal information and passwords unless they know and fully trust the website. Over the last few months of beta, we’ve seen that over two thirds of users don’t proceed to the website after seeing this interstitial – that’s a good thing!

For the users that still want to navigate to the website after seeing the interstitial page, Email Link Isolation uses Cloudflare Browser Isolation to automatically open the link in an isolated browser running in Cloudflare’s closest data center to the user. This delivers an experience virtually indistinguishable from using the local browser, thanks to our Network Vector Rendering (NVR) technology and Cloudflare’s expansive, low-latency network. By opening the suspicious link in an isolated browser, the user is protected against potential browser attacks (including malware, zero days, and other types of malicious code execution).

In a nutshell, the interstitial page is displayed when Email Link Isolation is uncertain about the website, and provides another layer of awareness and protection against phishing attacks. Then, Cloudflare Browser Isolation is used to protect against malicious code execution when a user decides to still proceed to such a website.

What we’ve seen in the beta

As expected, the percentage of rewritten links that users actually click is quite small (a single-digit percentage). That’s because the majority of such links are not delivered in messages the users are expecting, and aren’t coming from trusted colleagues or partners. So, even when a user clicks on such a link, they will often see the interstitial page and decide not to proceed any further. We see that less than half of all clicks lead to the user actually visiting the website (in Browser Isolation, to protect against malicious code that could otherwise execute behind the scenes).

You may be wondering why we’re not seeing a larger number of clicks on these rewritten links. The answer is quite simply that Email Link Isolation is indeed that last layer of protection against attack vectors that may have evaded other lines of defense. Virtually all the well-crafted phishing attacks that try to trick users into clicking malicious links are already stopped by Area 1 email security, and such messages don’t reach users’ inboxes.

The balance is very positive. Across all the customers using the Email Link Isolation beta in production, including some in the Fortune 500, we received no negative feedback on the user experience. That means we’re meeting one of the most challenging goals: providing additional security without negatively affecting users and without adding the burden of tuning and administration to SOC and IT teams.

One interesting thing we uncovered is how valuable our customers are finding our click-time inspection of link shorteners. The fact that a shortened URL (e.g. bit.ly) can be modified at any time to point to a different website has been making some of our customers anxious. Email Link Isolation inspects the link at time of click, evaluates the actual website that it’s about to open, and proceeds to open it locally, block it, or present the interstitial page as appropriate. We’re now working on full link shortener coverage through Email Link Isolation.
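
Conceptually, click-time inspection of a shortener means following its redirect chain at the moment of the click and evaluating the final destination instead of the wrapper. The sketch below illustrates that idea only; resolveFinalUrl and lookupVerdict are hypothetical names, not part of the product.

```typescript
// Illustrative sketch: resolve a shortener's redirect chain at click time so
// the final destination can be evaluated, not just the bit.ly-style wrapper.
async function resolveFinalUrl(url: string, maxHops = 10): Promise<string> {
  let current = url;
  for (let hop = 0; hop < maxHops; hop++) {
    // redirect: "manual" stops fetch from following Location headers itself.
    // HEAD avoids downloading bodies; some servers only answer GET.
    const res = await fetch(current, { method: "HEAD", redirect: "manual" });
    const location = res.headers.get("location");
    if (res.status < 300 || res.status >= 400 || !location) {
      return current; // no further redirect: this is the real destination
    }
    current = new URL(location, current).toString(); // handles relative Location
  }
  throw new Error("Too many redirects");
}

// Usage (hypothetical): evaluate the resolved destination, not the wrapper.
// const finalUrl = await resolveFinalUrl("https://bit.ly/abc123");
// const verdict = await lookupVerdict(finalUrl); // reputation check lives elsewhere
```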

All built on Cloudflare

Cloudflare’s intelligence drives the decisions about what gets rewritten, and we have earlier signals than others. Email Link Isolation has been built on Cloudflare’s unique capabilities in many areas.

First, Cloudflare sees enough Internet traffic for us to confidently identify new, low-confidence, and potentially dangerous domains earlier than anyone else. Leveraging Cloudflare intelligence for this early signal is key to the user experience, so that we don’t add speed bumps to legitimate websites that are part of our users’ daily routines. Next, we use Cloudflare Workers to process this data and serve the interstitial without introducing frustrating delays for the user. And finally, only Cloudflare Browser Isolation can protect against malicious code with a low-latency experience that is invisible to end users and feels like a local browser.

If you’re not yet a Cloudflare Area 1 customer, start your free trial and phishing risk assessment here.

How Cloudflare Area 1 and DLP work together to protect data in email

Post Syndicated from Ayush Kumar original https://blog.cloudflare.com/dlp-area1-to-protect-data-in-email/

Threat prevention is not limited to keeping external actors out; it also means keeping sensitive data in. Most organizations do not realize how much confidential information resides within their email inboxes. Employees handle vast amounts of sensitive data on a daily basis, such as intellectual property, internal documentation, PII, and payment information, and often share this information internally via email, making email one of the largest stores of confidential information within a company. It comes as no shock that organizations worry about the accidental or malicious egress of sensitive data and often address these concerns by instituting strong Data Loss Prevention policies. Cloudflare makes it easy for customers to manage the data in their email inboxes with Area 1 Email Security and Cloudflare One.

Cloudflare One, our SASE platform that delivers network-as-a-service (NaaS) with Zero Trust security natively built in, connects users to enterprise resources and offers a wide variety of ways to secure corporate traffic, including inspecting the data transferred to your corporate email. Area 1 Email Security, as part of our composable Cloudflare One platform, delivers the most complete data protection for your inbox and forms a cohesive solution when combined with additional services such as Data Loss Prevention (DLP). With the ability to easily adopt and implement Zero Trust services as needed, customers have the flexibility to layer on defenses based on their most critical use cases. In the case of Area 1 + DLP, the combination can collectively and preemptively address the most pressing use cases that represent high-risk areas of exposure for organizations. Together, these products provide defense in depth for your corporate data.

Preventing egress of cloud email data via HTTPS

Email provides a readily available outlet for corporate data, so why let sensitive data reach email in the first place? An employee can accidentally attach an internal file rather than a public white paper to a customer email, or, worse, attach a document containing the wrong customer’s information.

With Cloudflare Data Loss Prevention (DLP) you can prevent the upload of sensitive information, such as PII or intellectual property, to your corporate email. DLP is offered as part of Cloudflare One, which runs traffic from data centers, offices, and remote users through the Cloudflare network. As traffic traverses Cloudflare, we offer protections including validating identity and device posture and filtering corporate traffic.

Cloudflare One offers HTTP(S) filtering, enabling you to inspect and route traffic to your corporate applications. Cloudflare Data Loss Prevention (DLP) leverages the HTTP filtering abilities of Cloudflare One. You can apply rules to your corporate traffic and route traffic based on information in an HTTP request. There are a wide variety of options for filtering, such as domain, URL, application, HTTP method, and many more. You can use these options to segment the traffic you wish to scan with DLP. All of this is done with the performance of our global network and managed from one control plane.

You can apply DLP policies to corporate email applications, such as G Suite or O365. As an employee attempts to upload an attachment to an email, the upload is inspected for sensitive data and then allowed or blocked according to your policy.
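
At its core, that inspection step is a match of the upload’s content against sensitive-data detectors. The sketch below illustrates the idea with two common detector styles (Luhn-validated card numbers and the US SSN format); it is a conceptual example, not Cloudflare DLP’s engine or policy syntax.

```typescript
// Conceptual illustration of a DLP check on an outbound upload; this is not
// Cloudflare DLP's engine or policy syntax, just the underlying idea.
function luhnValid(digits: string): boolean {
  let sum = 0;
  let double = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48;
    if (double) { d *= 2; if (d > 9) d -= 9; }
    sum += d;
    double = !double;
  }
  return sum % 10 === 0;
}

function containsSensitiveData(body: string): string | null {
  // Candidate credit card numbers: 13-19 digit runs, validated with Luhn.
  const cardLike = body.match(/\b\d{13,19}\b/g) ?? [];
  if (cardLike.some(luhnValid)) return "credit card number";

  // US Social Security numbers in the common 123-45-6789 format.
  if (/\b\d{3}-\d{2}-\d{4}\b/.test(body)) return "social security number";

  return null;
}

// Hypothetical enforcement point: block the attachment upload if the
// content matches a detector, otherwise let it through.
function enforceUploadPolicy(body: string): "allow" | "block" {
  return containsSensitiveData(body) ? "block" : "allow";
}
```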

Inside your corporate email, Area 1 extends these core data protection principles in the following ways:

Enforcing data security between partners

With Cloudflare’s Area 1, you can also enforce strong TLS standards. Having TLS configured adds an extra layer of security, as it ensures that emails are encrypted, preventing attackers from reading sensitive information or changing the message if they intercept the email in transit (an on-path attack). This is especially useful for G Suite customers whose internal emails still traverse the public Internet in front of prying eyes, or for customers who have contractual obligations to communicate with partners over SSL/TLS.

Area 1 makes it easy to enforce TLS. From the Area 1 portal, you can configure Partner Domain(s) TLS by navigating to “Partner Domains TLS” within “Domains & Routing” and adding a partner domain with which you want to enforce TLS. If TLS is required, then all emails from that domain sent without TLS will be automatically dropped. Our enforcement requires strong TLS rather than best effort, making sure that all traffic is encrypted with strong ciphers and preventing a malicious attacker from decrypting any intercepted emails.
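
The enforcement logic itself is simple to reason about: if a sending domain is on the partner list, anything that did not arrive over strong TLS is dropped. The sketch below is a conceptual illustration only; the session fields, cipher allow-list, and helper name are assumptions, not Area 1’s configuration schema.

```typescript
// Conceptual sketch of "Partner Domains TLS" enforcement; field names and
// the allow-lists below are illustrative, not Area 1's configuration schema.
interface InboundSession {
  senderDomain: string;   // e.g. "partner.example" (hypothetical)
  tlsNegotiated: boolean; // did the SMTP session upgrade via STARTTLS?
  tlsVersion?: string;    // e.g. "TLSv1.2", "TLSv1.3"
}

const partnersRequiringTls = new Set(["partner.example"]); // hypothetical list
const strongTlsVersions = new Set(["TLSv1.2", "TLSv1.3"]);

function shouldAcceptMessage(session: InboundSession): boolean {
  if (!partnersRequiringTls.has(session.senderDomain)) return true;
  // Require strong TLS, not best effort: drop anything unencrypted or weak.
  return (
    session.tlsNegotiated &&
    session.tlsVersion !== undefined &&
    strongTlsVersions.has(session.tlsVersion)
  );
}
```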

Stopping passive email data loss

Organizations often forget that exfiltration can also happen without any email ever being sent. Attackers who are able to compromise a company account can passively sit by, monitoring all communications and picking out information manually.

Once an attacker has reached this stage, it is incredibly difficult to know that an account is compromised and what information is being tracked. Indicators like email volume, IP address changes, and others do not work, since the attacker is not taking any actions that would arouse suspicion. At Cloudflare, we have a strong thesis on preventing these account takeovers before they take place, so no attacker is able to fly under the radar.

In order to stop account takeovers before they happen, we place great emphasis on filtering emails that pose a risk of stealing employee credentials. The most common attack vector used by malicious actors is the phishing email. Given its high impact in accessing confidential data when successful, it’s no shock that this is the go-to tool in the attacker’s toolkit. Phishing emails pose little threat to an email inbox protected by Cloudflare’s Area 1 product. Area 1’s models assess whether a message is a suspected phishing email by analyzing different metadata. Anomalies detected by the models, like domain proximity (how close a domain is to the legitimate one) or the sentiment of the email, can quickly determine whether an email is legitimate. If Area 1 determines an email to be a phishing attempt, we automatically retract it and prevent the recipient from receiving it in their inbox, ensuring that the employee’s account remains uncompromised and cannot be used to exfiltrate data.
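
As a simplified illustration of the domain proximity signal mentioned above, a lookalike domain can be flagged when it sits within a small edit distance of a domain the organization trusts. The sketch below uses a plain Levenshtein distance with hypothetical trusted domains; the real models combine many more signals, so treat this as a toy example.

```typescript
// Illustrative "domain proximity" signal: flag sender domains that are a
// small edit distance away from domains the organization trusts. This is
// one simplified ingredient, not Area 1's detection models.
function levenshtein(a: string, b: string): number {
  const dp: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let prev = dp[0];
    dp[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const tmp = dp[j];
      dp[j] = a[i - 1] === b[j - 1] ? prev : 1 + Math.min(prev, dp[j], dp[j - 1]);
      prev = tmp;
    }
  }
  return dp[b.length];
}

const trustedDomains = ["example.com", "examplebank.com"]; // hypothetical

function looksLikeLookalike(senderDomain: string): boolean {
  return trustedDomains.some((trusted) => {
    const distance = levenshtein(senderDomain, trusted);
    // Distance 1-2 (e.g. "examp1e.com") is suspicious; 0 is the real domain.
    return distance > 0 && distance <= 2;
  });
}
```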

Attackers looking to exfiltrate data from an organization also often rely on employees clicking on links sent to them via email. These links can point to online forms that look innocuous on the surface but serve to gather sensitive information. Attackers can use these websites to run scripts that gather information about the visitor without any interaction from the employee, which is a serious concern since a single errant click can lead to the exfiltration of sensitive information. Other malicious links can contain exact copies of websites the user is accustomed to accessing. However, these links are a form of phishing where the credentials entered by the employee are sent to the attacker rather than logging them into the website.

Area 1 covers this risk by providing Email Link Isolation as part of our email security offering. With Email Link Isolation, Area 1 looks at every link sent and assesses its domain authority. For anything that’s on the margin (a link we cannot confidently say is safe), Area 1 will launch a headless Chromium browser and open the link there with no interruption. This way, any malicious scripts that execute will run on an isolated instance far from the company’s infrastructure, stopping the attacker from getting company information. This is all accomplished instantaneously and reliably.

Stopping ransomware

Attackers have many tools in their arsenal to try to compromise employee accounts. As we mentioned above, phishing is a common threat vector, but it’s far from the only one. At Area 1, we are also vigilant in preventing the propagation of ransomware.

A common mechanism that attackers use to disseminate ransomware is to disguise attachments by renaming them. A ransomware payload could be renamed from petya.7z to Invoice.pdf to try to trick an employee into downloading the file. Depending on how urgent the email made this invoice seem, the employee could blindly try to open the attachment on their computer, causing the organization to suffer a ransomware attack. Area 1’s models detect these mismatches and stop malicious attachments from arriving in their target’s inbox.
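
A mismatch check of this kind boils down to comparing the claimed file extension with the attachment’s magic bytes. The sketch below covers only a handful of signatures and is a conceptual illustration, not Area 1’s payload models.

```typescript
// Conceptual mismatch check: does the attachment's claimed extension agree
// with its magic bytes? A few common signatures only; this is an
// illustration, not Area 1's payload models.
const signatures: Record<string, number[]> = {
  pdf: [0x25, 0x50, 0x44, 0x46],  // "%PDF"
  zip: [0x50, 0x4b, 0x03, 0x04],  // also docx/xlsx containers
  "7z": [0x37, 0x7a, 0xbc, 0xaf], // 7-Zip archive
  exe: [0x4d, 0x5a],              // "MZ" (Windows PE)
};

function matchesSignature(bytes: Uint8Array, sig: number[]): boolean {
  return sig.every((b, i) => bytes[i] === b);
}

function extensionMismatch(filename: string, bytes: Uint8Array): boolean {
  const ext = filename.split(".").pop()?.toLowerCase() ?? "";
  const expected = signatures[ext];
  if (!expected) return false; // unknown extension: no opinion here
  // "Invoice.pdf" that actually starts with a 7z or MZ header is suspicious.
  return !matchesSignature(bytes, expected);
}

// e.g. extensionMismatch("Invoice.pdf", firstBytesOfAttachment) === true
// would be a strong signal that the payload was renamed.
```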

A successful ransomware campaign can not only stunt the daily operations of any company, but can also lead to the loss of local data if the encryption cannot be reversed. Cloudflare’s Area 1 product has dedicated payload models that analyze not only the attachment extension but also the hashed value of the attachment, comparing it to known ransomware campaigns. Once Area 1 finds an attachment it deems to be ransomware, we prevent the email from going any further.

Cloudflare’s DLP vision

We aim for Cloudflare products to give you the layered security you need to protect your organization, whether it’s malicious attempts to get in or sensitive data getting out. As email continues to be one of the largest stores of corporate data, it is crucial for companies to have strong DLP policies in place to prevent data loss. With Area 1 and Cloudflare One working together, we at Cloudflare can give organizations more confidence in their DLP policies.

If you are interested in these email security or DLP services, contact us for a conversation about your security and data protection needs.

Or if you currently subscribe to Cloudflare services, consider reaching out to your Cloudflare customer success manager to discuss adding additional email security or DLP protection.