Tag Archives: cognito

Protect your Amazon Cognito user pool with AWS WAF

Post Syndicated from Maitreya Ranganath original https://aws.amazon.com/blogs/security/protect-your-amazon-cognito-user-pool-with-aws-waf/

Many of our customers use Amazon Cognito user pools to add authentication, authorization, and user management capabilities to their web and mobile applications. You can enable the built-in advanced security in Amazon Cognito to detect and block the use of credentials that have been compromised elsewhere, and to detect unusual sign-in activity and then prompt users for additional verification or block sign-ins. Additionally, you can associate an AWS WAF web access control list (web ACL) with your user pool to allow or block requests to Amazon Cognito user pools, based on security rules.

In this post, we’ll show how you can use AWS WAF with Amazon Cognito user pools and provide a sample set of rate-based rules and advanced AWS WAF rule groups. We’ll also show you how to test and tune the rules to help protect your user pools from common threats.

Rate-based rules for Amazon Cognito user pool endpoints

The following are endpoints exposed publicly by an Amazon Cognito user pool that you can protect with AWS WAF:

  • Hosted UI — These endpoints are listed in the OIDC and hosted UI API reference. Cognito creates these endpoints when you assign a domain to your user pool. Your users will interact with these endpoints when they use the Hosted UI web interface directly, or when your application calls Cognito OAuth endpoints such as Authorize or Token.
  • Public API operations — These generate a request to Cognito API actions that are either unauthenticated or authenticated with a session string or access token, but not with AWS credentials.

A good way to protect these endpoints is to deploy rate-based AWS WAF rules. These rules will detect and block requests with high rates that could indicate an attempt to exceed your Amazon Cognito API request rate quotas and that could subsequently impact requests from legitimate users.

When you apply rate limits, it helps to group Amazon Cognito API actions into four action categories. You can set specific rate limits per action category giving you traffic visibility for each category.

  • User Creation — This category includes operations that create new users in Cognito. Setting a rate limit for this category provides visibility for traffic of these operations and threats such as fake users being created in Cognito, which drives up your Monthly Active User (MAU) costs for Cognito.
  • Sign-in — This category includes operations to initiate a sign-in operation. Setting a rate limit for this category can provide visibility into the abuse of these operations. This could indicate high frequency, automated attempts to guess user credentials, sometimes referred to as credential stuffing.
  • Account Recovery — This category includes operations to recover accounts, including “forgot password” flows. Setting a rate limit for this category can provide visibility into the abuse of these operations, malicious activity can include: sending fake reset attempts, which might result in emails and SMS messages being sent to users.
  • Default — This is a catch-all rate limit that applies to an operation that is not in one of the prior categories. Setting a default rate limit can provide visibility and mitigation from request flooding attacks.

Table 1 below shows selected Hosted UI endpoint paths (the equivalent of individual API actions) and the recommended rate-based rule limit category for each.

Table 1: Amazon Cognito Hosted UI URL paths mapped to action categories

Hosted UI URL path Authentication method Action category
/signup Unauthenticated User Creation
/confirmUser Confirmation code User Creation
/resendcode Unauthenticated User Creation
/login Unauthenticated Sign-in
/oauth2/authorize Unauthenticated Sign-in
/forgotPassword Unauthenticated Account Recovery
/confirmForgotPassword Confirmation code Account Recovery
/logout Unauthenticated Default
/oauth2/revoke Refresh token Default
/oauth2/token Auth code, or refresh token, or client credentials Default
/oauth2/userInfo Access token Default
/oauth2/idpresponse Authorization code Default
/saml2/idpresponse SAML assertion Default

Table 2 below shows selected Cognito API actions and the recommended rate-based rule category for each.

Table 2: Selected Cognito API actions mapped to action categories

API action name Authentication method Action category
SignUp Unauthenticated User Creation
ConfirmSignUp Confirmation code User Creation
ResendConfirmationCode Unauthenticated User Creation
InitiateAuth Unauthenticated Sign-in
RespondToAuthChallenge Unauthenticated Sign-in
ForgotPassword Unauthenticated Account Recovery
ConfirmForgotPassword Confirmation code Account Recovery
AssociateSoftwareToken Access token or session Default
VerifySoftwareToken Access token or session Default

Additionally, the rate-based rules we provide in this post include the following:

  • Two IP sets that represent allow lists for IPv4 and IPv6. You can add IPs that represent your trusted source IP addresses to these IP sets so that other AWS WAF rules don’t apply to requests that originate from these IP addresses.
  • Two IP sets that represent deny lists for IPv4 and IPv6. Add IPs to these IP sets that you want to block in all cases, regardless of the result of other rules.
  • An AWS managed IP reputation rule group: The AWS managed IP reputation list rule group contains rules that are based on Amazon internal threat intelligence, to identify IP addresses typically associated with bots or other threats. You can limit requests that match rules in this rule group to a specific rate limit.

Deploy rate-based rules

You can deploy the rate-based rules described in the previous section by using the AWS CloudFormation template that we provide here.

To deploy rate-based rules using the template

  1. (Optional but recommended) If you want to enable AWS WAF logging and resources to analyze request rates, create an Amazon Simple Storage Service (Amazon S3) bucket in the same AWS Region as your Amazon Cognito user pool, with a bucket name starting with the prefix aws-waf-logs-. If you previously created an S3 bucket for AWS WAF logs, you can choose to reuse it, or you can create a new bucket to store AWS WAF logs for Amazon Cognito.
  2. Choose the following Launch Stack button to launch a CloudFormation stack in your account.

    Launch Stack

    Note: The stack will launch in the N. Virginia (us-east-1) Region. To deploy this solution into other AWS Regions, download the solution’s CloudFormation template and deploy it to the selected Region.

    This template creates the following resources in your AWS account:

    • A rule group for the rate-based rules, according to the limits shown in Tables 1 and 2.
    • Four IP sets for an allow list and deny list for IPv4 and IPv6 addresses.
    • A web ACL that includes the rule group that is created, IP set based rules, and the AWS managed IP reputation rule group.
    • (Optional) The template enables AWS WAF logging for the web ACL to an S3 bucket that you specify.
    • (Optional) The template creates resources to help you analyze AWS WAF logs in S3 to calculate peak request rates that you can use to set rate limits for the rate-based rules.
  3. Set the template parameters as needed. The following table shows the default values for the parameters. We recommend that you deploy the template with the default values and with TestMode set to Yes so that all rules are set to Count. This allows all requests but emits Amazon CloudWatch metrics and AWS WAF log events for each rule that matches. You can then follow the guidance in the next section to analyze the logs and tune the rate limits to match the traffic patterns to your user pool. When you are satisfied with the unique rate limits for each parameter, you can update the stack and set TestMode to No to start blocking requests that exceed the rate limits.

    The rate limits for AWS WAF rate-based rules are configured as the number of requests per 5-minute period per unique source IP. The value of the rate limit can be between 100 and 2,000,000,000 (2 billion).

    Table 3: Default values for template parameters

    Parameter name Description Default value Allowed values
    Request rate limits by action category
    UserCreationRateLimit Rate limit applied to User Creation actions 2000 100–2,000,000,000
    SignInRateLimit Rate limit applied to Sign-in actions 4000 100–2,000,000,000
    AccountRecoveryRateLimit Rate limit applied to Account Recovery actions 1000 100–2,000,000,000
    IPReputationRateLimit Rate limit applied to requests that match the AWS Managed IP reputation list 1000 100–2,000,000,000
    DefaultRateLimit Default rate limit applied to actions that are not in any of the prior categories 6000 100–2,000,000,000
    Test mode
    TestMode Set to Yes to test rules by overriding rule actions to Count. Set to No to apply the default actions for rules after you’ve tested the impact of these rules. Yes Yes or No
    AWS WAF logging and rate analysis
    EnableWAFLogsAndRateAnalysis Set to Yes to enable logging for the AWS WAF web ACL to an S3 bucket and create resources for request rate analysis. Set to No to disable AWS WAF logging and skip creating resources for rate analysis. If No, the rest of the parameter values in this section are ignored. If Yes, choose values for the rest of the parameters in this section. Yes Yes or No
    WAFLogsS3Bucket The name of an existing S3 bucket where AWS WAF logs are delivered. The bucket name must start with aws-waf-logs- and can end with any suffix.
    Only used if the parameter EnableWAFLogsAndRateAnalysis is set to Yes.
    None Name of an existing S3 bucket that starts with the prefix aws-waf-logs-
    DatabaseName The name of the AWS Glue database to create, which will contain the request rate analysis tables created by this template. (Important: The name cannot contain hyphens.)
    Only used if the parameter EnableWAFLogsAndRateAnalysis is set to Yes.
    rate_analysis
    WorkgroupName The name of the Amazon Athena workgroup to create for rate analysis.
    Only used if the parameter EnableWAFLogsAndRateAnalysis is set to Yes.
    rate_analysis
    WAFLogsTableName The name of the AWS Glue table for AWS WAF logs.
    Only used if the parameter EnableWAFLogsAndRateAnalysis is set to Yes.
    waf_logs
    WAFLogsProjectionStartDate The earliest date to analyze AWS WAF logs, in the format YYYY/MM/DD (example: 2023/02/28).
    Only used if the parameter EnableWAFLogsAndRateAnalysis is set to Yes.
    None Set this to the current date, in the format YYYY/MM/DD
  4. Wait for the CloudFormation template to be created successfully.
  5. Go to the AWS WAF console and choose the web ACL created by the template. It will have a name ending with CognitoWebACL.
  6. Choose the Associated AWS resources tab, and then choose Add AWS resource.
  7. For Resource type, choose Amazon Cognito user pool, and then select the Amazon Cognito user pools that you want to protect with this web ACL.
  8. Choose Add.

Now that your user pool is being protected by the rate-based rules in the web ACL you created, you can proceed to tune the rate-based rule limits by analyzing AWS WAF logs.

Tune AWS WAF rate-based rule limits

As described in the previous section, the rate-based rules give you the ability to set separate rate limit values for each category of Amazon Cognito API actions.

Although the CloudFormation template has default starting values for these rate limits, it is important that you tune these values to match the traffic patterns for your user pool. To begin the tuning process, deploy the template with default values for all parameters, including Yes for TestMode. This overrides all rule actions to Count, allowing all requests but emitting CloudWatch metrics and AWS WAF log events for each rule that matches.

After you collect AWS WAF logs for a period of time (this period can vary depending on your traffic, from a couple of hours to a couple of days), you can analyze them, as shown in the next section, to get peak request rates to tune the rate limits to match observed traffic patterns for your user pool.

Query AWS WAF logs to calculate peak request rates by request type

You can calculate peak request rates by analyzing information that is present in AWS WAF logs. One way to analyze these is to send AWS WAF logs to S3 and to analyze the logs by using SQL queries in Amazon Athena. If you deploy the template in this post with default values, it creates the resources you need to analyze AWS WAF logs in S3 to calculate peak requests rates by request type.

If you are instead ingesting AWS WAF logs into your security information and event management (SIEM) system or a different analytics environment, you can create equivalent queries by using the query language for your SIEM or analytics environment to get similar results.

To access and edit the queries built by the CloudFormation template for use

  1. Open the Athena console and switch to the Athena workgroup that was created by the template (the default name is rate_analysis).
  2. On the Saved queries tab, choose the query named Peak request rate per 5-minute period by source IP and request category. The following SQL query will be loaded into the edit panel.
    -- Gets the top 5 source IPs sending the most requests in a 5-minute period per request category
    ‐‐ NOTE: change the start and end timestamps to match the duration of interest
    SELECT request_category, from_unixtime(time_bin*60*5) AS date_time, client_ip, request_count FROM (
      SELECT *, row_number() OVER (PARTITION BY request_category ORDER BY request_count DESC, time_bin DESC) AS row_num FROM (
        SELECT
          CASE
            WHEN ip_reputation_labels.name IN (
              'awswaf:managed:aws:amazon-ip-list:AWSManagedIPReputationList',
              'awswaf:managed:aws:amazon-ip-list:AWSManagedReconnaissanceList',
              'awswaf:managed:aws:amazon-ip-list:AWSManagedIPDDoSList'
            ) THEN 'IPReputation'
            WHEN target.value IN (
              'AWSCognitoIdentityProviderService.InitiateAuth',
              'AWSCognitoIdentityProviderService.RespondToAuthChallenge'
            ) THEN 'SignIn'
            WHEN target.value IN (
              'AWSCognitoIdentityProviderService.ResendConfirmationCode',
              'AWSCognitoIdentityProviderService.SignUp',
              'AWSCognitoIdentityProviderService.ConfirmSignUp'
            ) THEN 'UserCreation'
            WHEN target.value IN (
              'AWSCognitoIdentityProviderService.ForgotPassword',
              'AWSCognitoIdentityProviderService.ConfirmForgotPassword'
            ) THEN 'AccountRecovery'
            WHEN httprequest.uri IN (
              '/login',
              '/oauth2/authorize'
            ) THEN 'SignIn'
            WHEN httprequest.uri IN (
              '/signup',
              '/confirmUser',
              '/resendcode'
            ) THEN 'UserCreation'
            WHEN  httprequest.uri IN (
              '/forgotPassword',
              '/confirmForgotPassword'
            ) THEN 'AccountRecovery'
            ELSE 'Default'
          END AS request_category,
          httprequest.clientip AS client_ip,
          FLOOR("timestamp"/(1000*60*5)) AS time_bin,
          COUNT(*) AS request_count
        FROM waf_logs
          LEFT OUTER JOIN UNNEST(FILTER(httprequest.headers, h -> h.name = 'x-amz-target')) AS t(target) ON TRUE
          LEFT OUTER JOIN UNNEST(FILTER(labels, l -> l.name like 'awswaf:managed:aws:amazon-ip-list:%')) AS t(ip_reputation_labels) ON TRUE
        WHERE
          from_unixtime("timestamp"/1000) BETWEEN TIMESTAMP '2022-01-01 00:00:00' AND TIMESTAMP '2023-01-01 00:00:00'
        GROUP BY 1, 2, 3
        ORDER BY 1, 4 DESC
      )
    ) WHERE row_num <= 5 ORDER BY request_category ASC, row_num ASC
  3. Scroll down to Line 48 in the Query Editor and edit the timestamps to match the start and end time of the time window of interest.
  4. Run the query to calculate the top 5 peak request rates per 5-minute period by source IP and by action category.

The results show the action category, source IP, time, and count of requests. You can use the request count to tune the rate limits for each action category.

The lowest rate limit you can set for AWS WAF rate-based rules is 100 requests per 5-minute period. If your query results show that the peak request count is less than 100, set the rate limit as 100 or higher.

After you have tuned the rate limits, you can apply the changes to your web ACL by updating the CloudFormation stack.

To update the CloudFormation stack

  1. On the CloudFormation console, choose the stack you created earlier.
  2. Choose Update. For Prepare template, choose Use current template, and then choose Next.
  3. Update the values of the parameters with rate limits to match the tuned values from your analysis.
  4. You can choose to enable blocking of requests by setting TestMode to No. This will set the action to Block for the rate-based rules in the web ACL and start blocking traffic that exceeds the rate limits you have chosen.
  5. Choose Next and then Next again to update the stack.

Now the rate-based rules are updated with your tuned limits, and requests will be blocked if you set TestMode to No.

Protect endpoints with user interaction

Now that we’ve covered the bases with rate-based rules, we’ll show you some more advanced AWS WAF rules that further help protect your user pool. We’ll explore two sample scenarios in detail, and provide AWS WAF rules for each. You can use the rules provided as a guideline to build others that can help with similar use cases.

Rules to verify human activity

The first scenario is protecting endpoints where users have interaction with the page. This will be a browser-based interaction, and a human is expected to be behind the keyboard. This scenario applies to the Hosted UI endpoints such as /login, /signup, and /forgotPassword, where a CAPTCHA can be rendered on the user’s browser for the user to solve. Let’s take the login (sign-in) endpoint as an example, and imagine you want to make sure that only actual human users are attempting to sign in and you want to block bots that might try to guess passwords.

To illustrate how to protect this endpoint with AWS WAF, we’re sharing a sample rule, shown in Figure 1. In this rule, you can take input from prior rules like the Amazon IP reputation list or the Anonymous IP list (which are configured to Count requests and add labels) and combine that with a CAPTCHA action. The logic of the rule says that if the request matches the reputation rules (and has received the corresponding labels) and is going to the /login endpoint, then the AWS WAF action should be to respond with a CAPTCHA challenge. This will present a challenge that increases the confidence that a human is performing the action, and it also adds a custom label so you can efficiently identify and have metrics on how many requests were matched by this rule. The rule is provided in the CloudFormation template and is in JSON format, because it has advanced logic that cannot be displayed by the console. Learn more about labels and CAPTCHA actions in the AWS WAF documentation.

Figure 1: Login sample rule flow

Figure 1: Login sample rule flow

Note that the rate-based rules you created in the previous section are evaluated before the advanced rules. The rate-based rules will block requests to the /login endpoint that exceed the rate limit you have configured, while this advanced rule will match requests that are below the rate limit but match the other conditions in the rule.

Rules for specific activity

The second scenario explores activity on specific application clients within the user pool. You can spot this activity by monitoring the logs provided by AWS WAF, or other traffic logs like Application Load Balancer (ALB) logs. The application client information is provided in the call to the service.

In the Amazon Cognito user pool in this scenario, we have different application clients and they’re constrained by geography. For example, for one of the application clients, requests are expected to come from the United States at or below a certain rate. We can create a rule that combines the rate and geographical criteria to block requests that don’t meet the conditions defined.

The flow of this rule is shown in Figure 2. The logic of the rule will evaluate the application client information provided in the request and the geographic information identified by the service, and apply the selected rate limit. If blocked, the rule will provide a custom response code by using HTTP code 429 Too Many Requests, which can help the sender understand the reason for the block. For requests that you make with the Amazon Cognito API, you could also customize the response body of a request that receives a Block response. Adding a custom response helps provide the sender context and adjust the rate or information that is sent.

Figure 2: AppClientId sample rule flow

Figure 2: AppClientId sample rule flow

AWS WAF can detect geo location with Region accuracy and add specific labels for the location. These can then be used in other rule evaluations. This rule is also provided as a sample in the CloudFormation template.

Advanced protections

To build on the rules we’ve shared so far, you can consider using some of the other intelligent threat mitigation rules that are available as managed rules—namely, bot control for common or targeted bots. These rules offer advanced capabilities to detect bots in sensitive endpoints where automation or non-browser user agents are not expected or allowed. If you receive machine traffic to the endpoint, these rules will result in false positives that would need to be tuned. For more information, see Options for intelligent threat mitigation.

The sample rule flow in Figure 3 shows an example for our Hosted UI, which builds on the first rule we built for specific activity and adds signals coming from the Bot Control common bots managed rule, in this case the non-browser-user-agent label.

Figure 3: Login sample rule with advanced protections

Figure 3: Login sample rule with advanced protections

Adding the bot detection label will also add accuracy to the evaluation, because AWS WAF will consider multiple different sources of information when analyzing the request. This can also block attacks that come from a small set of IPs or easily recognizable bots.

We’ve shared this rule in the CloudFormation template sample. The rule requires you to add AWS WAF Bot Control (ABC) before the custom rule evaluation. ABC has additional costs associated with it and should only be used for specific use cases. For more information on ABC and how to enable it, see this blog post.

After adding these protections, we have a complete set of rules for our Hosted UI–specific needs; consider that your traffic and needs might be different. Figure 4 shows you what the rule priority looks like. All rules except the last are included in the provided CloudFormation template. Managed rule evaluations need to have higher priority and be in Count mode; this way, a matching request can get labels that can be evaluated further down the priority list by using the custom rules that were created. For more information, see How labeling works.

Figure 4: Summary of the rules discussed in this post

Figure 4: Summary of the rules discussed in this post

Conclusion

In this post, we examined the different protections provided by the integration between AWS WAF and Amazon Cognito. This integration makes it simpler for you to view and monitor the activity in the different Amazon Cognito endpoints and APIs, while also adding rate-based rules and IP reputation evaluations. For more specific use cases and advanced protections, we provided sample custom rules that use labels, as well as an advanced rule that uses bot control for common bots. You can use these advanced rules as examples to create similar rules that apply to your use cases.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the re:Post with tag AWS WAF or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Maitreya Ranganath

Maitreya is an AWS Security Solutions Architect. He enjoys helping customers solve security and compliance challenges and architect scalable and cost-effective solutions on AWS.

Diana Alvarado

Diana Alvarado

Diana is Sr security solutions architect at AWS. She is passionate about helping customers solve difficult cloud challenges, she has a soft spot for all things logs.

Analyze Amazon Cognito advanced security intelligence to improve visibility and protection

Post Syndicated from Diana Alvarado original https://aws.amazon.com/blogs/security/analyze-amazon-cognito-advanced-security-intelligence-to-improve-visibility-and-protection/

As your organization looks to improve your security posture and practices, early detection and prevention of unauthorized activity quickly becomes one of your main priorities. The behaviors associated with unauthorized activity commonly follow patterns that you can analyze in order to create specific mitigations or feed data into your security monitoring systems.

This post shows you how you can analyze security intelligence from Amazon Cognito advanced security features logs by using AWS native services. You can use the intelligence data provided by the logs to increase your visibility into sign-in and sign-up activities from users, this can help you with monitoring, decision making, and to feed other security services in your organization, such as a web application firewall or security information and event management (SIEM) tool. The data can also enrich available security feeds like fraud detection systems, increasing protection for the workloads that you run on AWS.

Amazon Cognito advanced security features overview

Amazon Cognito provides authentication, authorization, and user management for your web and mobile apps. Your users can sign in to apps directly with a user name and password, or through a third party such as social providers or standard enterprise providers through SAML 2.0/OpenID Connect (OIDC). Amazon Cognito includes additional protections for users that you manage in Amazon Cognito user pools. In particular, Amazon Cognito can add risk-based adaptive authentication and also flag the use of compromised credentials. For more information, see Checking for compromised credentials in the Amazon Cognito Developer Guide.

With adaptive authentication, Amazon Cognito examines each user pool sign-in attempt and generates a risk score for how likely the sign-in request is from an unauthorized user. Amazon Cognito examines a number of factors, including whether the user has used the same device before or has signed in from the same location or IP address. A detected risk is rated as low, medium, or high, and you can determine what actions should be taken at each risk level. You can choose to allow or block the request, require a second authentication factor, or notify the user of the risk by email. Security teams and administrators can also submit feedback on the risk through the API, and users can submit feedback by using a link that is sent to the user’s email. This feedback can improve the risk calculation for future attempts.

To add advanced security features to your existing Amazon Cognito configuration, you can get started by using the steps for Adding advanced security to a user pool in the Amazon Cognito Developer Guide. Note that there is an additional charge for advanced security features, as described on our pricing page. These features are applicable only to native Amazon Cognito users; they aren’t applicable to federated users who sign in with an external provider.

Solution architecture

Figure 1: Solution architecture

Figure 1: Solution architecture

Figure 1 shows the high-level architecture for the advanced security solution. When an Amazon Cognito sign-in event is recorded by AWS CloudTrail, the solution uses an Amazon EventBridge rule to send the event to an Amazon Simple Queue Service (Amazon SQS) queue and batch it, to then be processed by an AWS Lambda function. The Lambda function uses the event information to pull the sign-in security information and send it as logs to an Amazon Simple Storage Service (Amazon S3) bucket and Amazon CloudWatch Logs.

Prerequisites and considerations for this solution

This solution assumes that you are using Amazon Cognito with advanced security features already enabled, the solution does not create a user pool and does not activate the advanced security features on an existing one.

The following list describes some limitations that you should be aware of for this solution:

  1. This solution does not apply to events in the hosted UI, but the same architecture can be adapted for that environment, with some changes to the events processor.
  2. The Amazon Cognito advanced security features support only native users. This solution is not applicable to federated users.
  3. The admin API used in this solution has a default rate limit of 30 requests per second (RPS). If you have a higher rate of authentication attempts, this API call might be throttled and you will need to implement a re-try pattern to confirm that your requests are processed.

Implement the solution

You can deploy the solution automatically by using the following AWS CloudFormation template.

Choose the following Launch Stack button to launch a CloudFormation stack in your account and deploy the solution.

Select this image to open a link that starts building the CloudFormation stack

You’ll be redirected to the CloudFormation service in the US East (N. Virginia) Region, which is the default AWS Region, to deploy this solution. You can change the Region to align it to where your Cognito User Pool is running.

This template will create multiple cloud resources including, but not limited to, the following:

  • An EventBridge rule for sending the Amazon Cognito events
  • An Amazon SQS queue for sending the events to Lambda
  • A Lambda function for getting the advanced security information based on the authentication events from CloudTrail
  • An S3 bucket to store the logs

In the wizard, you’ll be asked to modify or provide one parameter, the existing Cognito user pool ID. You can get this value from the Amazon Cognito console or the Cognito API.

Now, let’s break down each component of the solution in detail.

Sending the authentication events from CloudTrail to Lambda

Cognito advanced security features supports the CloudTrail events: SignUp, ConfirmSignUp, ForgotPassword, ResendConfirmationCode, InitiateAuth and RespondToAuthChallenge. This solution will focus on the sign-in event InitiateAuth as an example.

The solution creates an EventBridge rule that will run when an event is identified in CloudTrail and send the event to an SQS queue. This is useful so that events can be batched up and decoupled for Lambda to process.

The EventBridge rule uses Amazon SQS as a target. The queue is created by the solution and uses the default settings, with the exception that Receive message wait time is set to 20 seconds for long polling. For more information about long polling and how to manually set up an SQS queue, see Consuming messages using long polling in the Amazon SQS Developer Guide.

When the SQS queue receives the messages from EventBridge, these are sent to Lambda for processing. Let’s now focus on understanding how this information is processed by the Lambda function.

Using Lambda to process Amazon Cognito advanced security features information

In order to get the advanced security features evaluation information, you need authentication details that can only be obtained by using the Amazon Cognito identity provider (IdP) API call admin_list_user_auth_events. This API call requires a username to fetch all the authentication event details for a specific user. For security reasons, the username is not logged in CloudTrail and must be obtained by using other event information.

You can use the Lambda function in the sample solution to get this information. It’s composed of three main sequential actions:

  1. The Lambda function gets the sub identifiers from the authentication events recorded by CloudTrail.
  2. Each sub identifier is used to get the user name through an API call to list_users.
  3. 3. The sample function retrieves the last five authentication event details from advanced security features for each of these users by using the admin_list_user_auth_events API call. You can modify the function to retrieve a different number of events, or use other criteria such as a timestamp or a specific time period.

Getting the user name information from a CloudTrail event

The following sample authentication event shows a sub identifier in the CloudTrail event information, shown as sub under additionalEventData. With this sub identifier, you can use the ListUsers API call from the Cognito IdP SDK to get the user name details.

{
"eventVersion": "1.XX",
"userIdentity": {
"type": "Unknown",
"principalId": "Anonymous"
},
"eventTime": "2022-01-01T11:11:11Z",
"eventSource": "cognito-idp.amazonaws.com",
"eventName": "InitiateAuth",
"awsRegion": "us-east-1",
"sourceIPAddress": "xx.xx.xx.xx",
"userAgent": "Mozilla/5.0 (xxxx)",
"requestParameters": {
"authFlow": "USER_SRP_AUTH",
"authParameters": "HIDDEN_DUE_TO_SECURITY_REASONS",
"clientMetadata": {},
"clientId": "iiiiiiiii"
},
"responseElements": {
"challengeName": "PASSWORD_VERIFIER",
"challengeParameters": {
"SALT": "HIDDEN_DUE_TO_SECURITY_REASONS",
"SECRET_BLOCK": "HIDDEN_DUE_TO_SECURITY_REASONS",
"USER_ID_FOR_SRP": "HIDDEN_DUE_TO_SECURITY_REASONS",
"USERNAME": "HIDDEN_DUE_TO_SECURITY_REASONS",
"SRP_B": "HIDDEN_DUE_TO_SECURITY_REASONS"
}
},
"additionalEventData": {
"sub": "11110b4c-1f4264cd111"
},
"requestID": "xxxxxxxx",
"eventID": "xxxxxxxxxx",
"readOnly": false,
"eventType": "AwsApiCall",
"managementEvent": true,
"recipientAccountId": "xxxxxxxxxxxxx",
"eventCategory": "Management"
}

Listing authentication events information

After the Lambda function obtains the username, it can then use the Cognito IdP API call admin_list_user_auth_events to get the advanced security feature risk evaluation information for each of the authentication events for that user. Let’s look into the details of that evaluation.

The authentication event information from Amazon Cognito advanced security provides information for each of the categories evaluated and logs the results. Those results can then be used to decide whether the authentication attempt information is useful for the security team to be notified or take action. It’s recommended that you limit the number of events returned, in order to keep performance optimized.

The following sample event shows some of the risk information provided by advanced security features; the options for the response syntax can be found in the CognitoIdentityProvider API documentation.

}
]
at the bottom, so
"AuthEvents": [
{
"EventId": "1111111”,
"EventType": "SignIn",
"CreationDate": 111111.111,
"EventResponse": "Pass",
"EventRisk": {
"RiskDecision": "NoRisk",
"CompromisedCredentialsDetected": false
},
"ChallengeResponses": [
{
"ChallengeName": "Password",
"ChallengeResponse": "Success"
}
],
"EventContextData": {
"IpAddress": "72.xx.xx.xx",
"DeviceName": "Firefox xx
"City": "Axxx",
"Country": "United States"
}
}
]

The event information that is returned includes the details that are highlighted in this sample event, such as CompromisedCredentialsDetected, RiskDecision, and RiskLevel, which you can evaluate to decide whether the information can be used to enrich other security monitoring services.

Logging the authentication events information

You can use a Lambda extensions layer to send logs to an S3 bucket. Lambda still sends logs to Amazon CloudWatch Logs, but you can disable this activity by removing the required permissions to CloudWatch on the Lambda execution role. For more details on how to set this up, see Using AWS Lambda extensions to send logs to custom destinations.

Figure 2 shows an example of a log sent by Lambda. It includes execution information that is logged by the extension, as well as the information returned from the authentication evaluation by advanced security features.

Figure 2: Sample log information sent to S3

Figure 2: Sample log information sent to S3

Note that the detailed authentication information in the Lambda execution log is the same as the preceding sample event. You can further enhance the information provided by the Lambda function by modifying the function code and logging more information during the execution, or by filtering the logs and focusing only on high-risk or compromised login attempts.

After the logs are in the S3 bucket, different applications and tools can use this information to perform automated security actions and configuration updates or provide further visibility. You can query the data from Amazon S3 by using Amazon Athena, feed the data to other services such as Amazon Fraud Detector as described in this post, mine the data by using artificial intelligence/machine learning (AI/ML) managed tools like AWS Lookout for Metrics, or enhance visibility with AWS WAF.

Sample scenarios

You can start to gain insights into the security information provided by this solution in an existing environment by querying and visualizing the log data directly by using CloudWatch Logs Insights. For detailed information about how you can use CloudWatch Logs Insights with Lambda logs, see the blog post Operating Lambda: Using CloudWatch Logs Insights.

The CloudFormation template deploys the CloudWatch Logs Insights queries. You can view the queries for the sample solution in the Amazon CloudWatch console, under Queries.

To access the queries in the CloudWatch console

  1. In the CloudWatch console, under Logs, choose Insights.
  2. Choose Select log group(s). In the drop-drown list, select the Lambda log group.
  3. The query box should show the pre-created query. Choose Run query. You should then see the query results in the bottom-right panel.
  4. (Optional) Choose Add to dashboard to add the widget to a dashboard.

CloudWatch Logs Insights discovers the fields in the auth event log automatically. As shown in Figure 3, you can see the available fields in the right-hand side Discovered fields pane, which includes the Amazon Cognito information in the event.

Figure 3: The fields available in CloudWatch Logs Insights

Figure 3: The fields available in CloudWatch Logs Insights

The first query, shown in the following code snippet, will help you get a view of the number of requests per IP, where the advanced security features have determined the risk decision as Account Takeover and the CompromisedCredentialsDetected as true.

fields @message
| filter @message like /INFO/
| filter AuthEvents.0.EventType like 'SignIn'
| filter AuthEvents.0.EventRisk.RiskDecision like "AccountTakeover" and 
AuthEvents.0.EventRisk.CompromisedCredentialsDetected =! "false"
| stats count(*) as RequestsperIP by AuthEvents.2.EventContextData.IpAddress as IP
| sort desc

You can view the results of the query as a table or graph, as shown in Figure 4.

Figure 4: Sample query results for CompromisedCredentialsDetected

Figure 4: Sample query results for CompromisedCredentialsDetected

Using the same approach and the convenient access to the fields for query, you can explore another use case, using the following query, to view the number of requests per IP for each type of event (SignIn, SignUp, and forgot password) where the risk level was high.

fields @message
| filter @message like /INFO/
| filter AuthEvents.0.EventRisk.RiskLevel like "High"
| stats count(*) as RequestsperIP by AuthEvents.0.EventContextData.IpAddress as IP, 
AuthEvents.0.EventType as EventType
| sort desc

Figure 5 shows the results for this EventType query.

Figure 5: The sample results for the EventType query

Figure 5: The sample results for the EventType query

In the final sample scenario, you can look at event context data and query for the source of the events for which the risk level was high.

fields @message
| filter @message like /INFO/
| filter AuthEvents.0.EventRisk.RiskLevel like 'High'
| stats count(*) as RequestsperCountry by AuthEvents.0.EventContextData.Country as Country
| sort desc

Figure 6 shows the results for this RiskLevel query.

Figure 6: Sample results for the RiskLevel query

Figure 6: Sample results for the RiskLevel query

As you can see, there are many ways to mix and match the filters to extract deep insights, depending on your specific needs. You can use these examples as a base to build your own queries.

Conclusion

In this post, you learned how to use security intelligence information provided by Amazon Cognito through its advanced security features to improve your security posture and practices. You used an advanced security solution to retrieve valuable authentication information using CloudTrail logs as a source and a Lambda function to process the events, send this evaluation information in the form of a log to CloudWatch Logs and S3 for use as an additional security feed for wider organizational monitoring and visibility. In a set of sample use cases, you explored how to use CloudWatch Logs Insights to quickly and conveniently access this information, aggregate it, gain deep insights and use it to take action.

To learn more, see the blog post How to Use New Advanced Security Features for Amazon Cognito User Pools.

 
If you have feedback about this post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

Diana Alvarado

Diana Alvarado

Diana is Sr security solutions architect at AWS. She is passionate about helping customers solve difficult cloud challenges, she has a soft spot for all things logs.

How to automatically build forensic kernel modules for Amazon Linux EC2 instances

Post Syndicated from Jonathan Nguyen original https://aws.amazon.com/blogs/security/how-to-automatically-build-forensic-kernel-modules-for-amazon-linux-ec2-instances/

In this blog post, we will walk you through the EC2 forensic module factory solution to deploy automation to build forensic kernel modules that are required for Amazon Elastic Compute Cloud (Amazon EC2) incident response automation.

When an EC2 instance is suspected to have been compromised, it’s strongly recommended to investigate what happened to the instance. You should look for activities such as:

  • Open network connections
  • List of running processes
  • Processes that contain injected code
  • Memory-resident infections
  • Other forensic artifacts

When an EC2 instance is compromised, it’s important to take action as quickly as possible. Before you shut down the EC2 instance, you first need to capture the contents of its volatile memory (RAM) in a memory dump because it contains the instance’s in-progress operations. This is key in determining the root cause of compromise.

In order to capture volatile memory in Linux, you can use a tool like Linux Memory Extractor (LiME). This requires you to have the kernel modules that are specific to the kernel version of the instance for which you want to capture volatile memory. We also recommend that you limit the actions you take on the instance where you are trying to capture the volatile memory in order to minimize the set of artifacts created as part of the capture process, so you need a method to build the tools for capturing volatile memory outside the instance under investigation. After you capture the volatile memory, you can use a tool like Volatility2 to analyze it in a dedicated forensics environment. You can use tools like LiME and Volatility2 on EC2 instances that use x86, x64, and Graviton instance types.

Prerequisites

This solution has the following prerequisites:

Solution overview

The EC2 forensic module factory solution consists of the following resources:

Figure 1 shows an overview of the EC2 forensic module factory solution workflow.

Figure 1: Automation to build forensic kernel modules for an Amazon Linux EC2 instance

Figure 1: Automation to build forensic kernel modules for an Amazon Linux EC2 instance

The EC2 forensic module factory solution workflow in Figure 1 includes the following numbered steps:

  1. A Step Functions workflow is started, which creates a Step Functions task token and invokes the first Lambda function, createEC2module, to create EC2 forensic modules.
    1. A Step Functions task token is used to allow long-running processes to complete and to avoid a Lambda timeout error. The createEC2module function runs for approximately 9 minutes. The run time for the function can vary depending on any customizations to the createEC2module function or the SSM document.
  2. The createEC2module function launches an EC2 instance based on the Amazon Machine Image (AMI) provided.
  3. Once the EC2 instance is running, an SSM document is run, which includes the following steps:
    1. If a specific kernel version is provided in step 1, this kernel version will be installed on the EC2 instance. If no kernel version is provided, the default kernel version on the EC2 instance will be used to create the modules.
    2. If a specific kernel version was selected and installed, the system is rebooted to use this kernel version.
    3. The prerequisite build tools are installed, as well as the LiME and Volatility2 packages.
    4. The LiME kernel module and the Volatility2 profile are built.
  4. The kernel modules for LiME and Volatility2 are put into the S3 bucket.
  5. Upon completion, the Step Functions task token is sent to the Step Functions workflow to invoke the second cleanupEC2module Lambda function to terminate the EC2 instance that was launched in step 2.

Solution deployment

You can deploy the EC2 forensic module factory solution by using either the AWS Management Console or the AWS Cloud Development Kit (AWS CDK).

Option 1: Deploy the solution with AWS CloudFormation (console)

Sign in to your preferred security tooling account in the AWS Management Console, and choose the following Launch Stack button to open the AWS CloudFormation console pre-loaded with the template for this solution. It will take approximately 10 minutes for the CloudFormation stack to complete.

Select this image to open a link that starts building the CloudFormation stack

Option 2: Deploy the solution by using the AWS CDK

You can find the latest code for the EC2 forensic module factory solution in the ec2-forensic-module-factory GitHub repository, where you can also contribute to the sample code. For instructions and more information on using the AWS CDK, see Get Started with AWS CDK.

To deploy the solution by using the AWS CDK

  1. To build the app when navigating to the project’s root folder, use the following commands.
    npm install -g aws-cdk
    npm install
  2. Run the following commands in your terminal while authenticated in your preferred security tooling AWS account. Be sure to replace <INSERT_AWS_ACCOUNT> with your account number, and replace <INSERT_REGION> with the AWS Region that you want the solution deployed to.
    cdk bootstrap aws://<INSERT_AWS_ACCOUNT>/<INSERT_REGION>
    cdk deploy

Run the solution to build forensic kernel objects

Now that you’ve deployed the EC2 forensic module factory solution, you need to invoke the Step Functions workflow in order to create the forensic kernel objects. The following is an example of manually invoking the workflow, to help you understand what actions are being performed. These actions can also be integrated and automated with an EC2 incident response solution.

To manually invoke the workflow to create the forensic kernel objects (console)

  1. In the AWS Management Console, sign in to the account where the solution was deployed.
  2. In the AWS Step Functions console, select the state machine named create_ec2_volatile_memory_modules.
  3. Choose Start execution.
  4. At the input prompt, enter the following JSON values.
    {
    "AMI_ID": "ami-0022f774911c1d690",
    "kernelversion":"kernel-4.14.104-95.84.amzn2.x86_64"
    }
  5. Choose Start execution to start the workflow, as shown in Figure 2.
    Figure 2: Step Functions step input example to build custom kernel version using Amazon Linux 2 AMI ID

    Figure 2: Step Functions step input example to build custom kernel version using Amazon Linux 2 AMI ID

Workflow progress

You can use the AWS Management Console to follow the progress of the Step Functions workflow. If the workflow is successful, you should see the image when you view the status of the Step Functions workflow, as shown in Figure 3.

Figure 3: Step Functions workflow success example

Figure 3: Step Functions workflow success example

Note: The Step Functions workflow run time depends on the commands that are being run in the SSM document. The example SSM document included in this post runs for approximately 9 minutes. For information about possible Step Functions errors, see Error handling in Step Functions.

To verify that the artifacts are built

  1. After the Step Functions workflow has successfully completed, go to the S3 bucket that was provisioned in the EC2 forensic module factory solution.
  2. Look for two prefixes in the bucket for LiME and Volatility2, as shown in Figure 4.
    Figure 4: S3 bucket prefix for forensic kernel modules

    Figure 4: S3 bucket prefix for forensic kernel modules

  3. Open each tool name prefix in S3 to find the actual module, such as in the following examples:
    • LiME example: lime-4.14.104-95.84.amzn2.x86_64.ko
    • Volatility2 example: 4.14.104-95.84.amzn2.x86_64.zip

Now that the objects have been created, the solution has successfully completed.

Incorporate forensic module builds into an EC2 AMI pipeline

Each organization has specific requirements for allowing application teams to use various EC2 AMIs, and organizations commonly implement an EC2 image pipeline using tools like EC2 Image Builder. EC2 Image Builder uses recipes to install and configure required components in the AMI before application teams can launch EC2 instances in their environment.

The EC2 forensic module factory solution we implemented here makes use of an existing EC2 instance AMI. As mentioned, the solution uses an SSM document to create forensic modules. The logic in the SSM document could be incorporated into your EC2 image pipeline to create the forensic modules and store them in an S3 bucket. S3 also allows additional layers of protection such as enforcing default bucket encryption with an AWS Key Management Service Customer Managed Key (CMK), verifying S3 object integrity with checksum, S3 Object Lock, and restrictive S3 bucket policies. These protections can help you to ensure that your forensic modules have not been modified and are only accessible by authorized entities.

It is important to note that incorporating forensic module creation into an EC2 AMI pipeline will build forensic modules for the specific kernel version used in that AMI. You would still need to employ this EC2 forensic module solution to build a specific forensic module version if it is missing from the S3 bucket where you are creating and storing these forensic modules. The need to do this can arise if the EC2 instance is updated after the initial creation of the AMI.

Incorporate the solution into existing EC2 incident response automation

There are many existing solutions to automate incident response workflow for quarantining and capturing forensic evidence for EC2 instances, but the majority of EC2 incident response automation solutions have a single dependency in common, which is the use of specific forensic modules for the target EC2 instance kernel version. The EC2 forensic module factory solution in this post enables you to be both proactive and reactive when building forensic kernel modules for your EC2 instances.

You can use the EC2 forensic module factory solution in two different ways:

  1. Ad-hoc – In this post, you walked through the solution by running the Step Functions workflow with specific parameters. You can do this to build a repository of kernel modules.
  2. Automated – Alternatively, you can incorporate this solution into existing automation by invoking the Step Functions workflow and passing the AMI ID and kernel version. An example could be the following:
    1. An existing EC2 incident response solution attempts to get the forensic modules to capture the volatile memory from an S3 bucket.
    2. If the specific kernel version is missing in the S3 bucket, the solution updates the automation to StartExecution on the create_ec2_volatile_memory_modules state machine.
    3. The Step Functions workflow builds the specific forensic modules.
    4. After the Step Functions workflow is complete, the EC2 incident response solution restarts its workflow to get the forensic modules to capture the volatile memory on the EC2 instance.

Now that you have the kernel modules, you can both capture the volatile memory by using LiME, and then conduct analysis on the memory dump by using a Volatility2 profile.

To capture and analyze volatile memory on the target EC2 instance (high-level steps)

  1. Copy the LiME module from the S3 bucket holding the module repository to the target EC2 instance.
  2. Capture the volatile memory by using the LiME module.
  3. Stream the volatile memory dump to a S3 bucket.
  4. Launch an EC2 forensic workstation instance, with Volatility2 installed.
  5. Copy the Volatility2 profile from the S3 bucket to the appropriate location.
  6. Copy the volatile memory dump to the EC2 forensic workstation.
  7. Run analysis on the volatile memory with Volatility2 by using the specific Volatility2 profile created for the target EC2 instance.

Automated self-service AWS solution

AWS has also released the Automated Forensics Orchestrator for Amazon EC2 solution that you can use to quickly set up and configure a dedicated forensics orchestration automation solution for your security teams. The Automated Forensics Orchestrator for Amazon EC2 allows you to capture and examine the data from EC2 instances and attached Amazon Elastic Block Store (Amazon EBS) volumes in your AWS environment. This data is collected as forensic evidence for analysis by the security team.

The Automated Forensics Orchestrator for Amazon EC2 creates the foundational components to enable the EC2 forensic module factory solution’s memory forensic acquisition workflow and forensic investigation and reporting service. Both the Automated Forensics Orchestrator for Amazon EC2, and the EC2 forensic module factory, are hosted in different GitHub projects. And you will need to reconcile the expected S3 bucket locations for the associated modules:

Customize the EC2 forensic module factory solution

The SSM document pulls open-source packages to build tools for the specific Linux kernel version. You can update the SSM document to your specific requirements for forensic analysis, including expanding support for other operating systems, versions, and tools.

You can also update the S3 object naming convention and object tagging, to allow external solutions to reference and copy the appropriate kernel module versions to enable the forensic workflow.

Clean up

If you deployed the EC2 forensic module factory solution by using the Launch Stack button in the AWS Management Console or the CloudFormation template ec2_module_factory_cfn, do the following to clean up:

  1. In the AWS CloudFormation console for the account and Region where you deployed the solution, choose the Ec2VolModules stack.
  2. Choose the option to Delete the stack.

If you deployed the solution by using the AWS CDK, run the following command.

cdk destroy

Conclusion

In this blog post, we walked you through the deployment and use of the EC2 forensic module factory solution to use AWS Step Functions, AWS Lambda, AWS Systems Manager, and Amazon EC2 to create specific versions of forensic kernel modules for Amazon Linux EC2 instances.

The solution provides a framework to create the foundational components required in an EC2 incident response automation solution. You can customize the solution to your needs to fit into an existing EC2 automation, or you can deploy this solution in tandem with the Automated Forensics Orchestrator for Amazon EC2.

If you have feedback about this post, submit comments in the Comments section below. If you have any questions about this post, start a thread on re:Post.

Want more AWS Security news? Follow us on Twitter.

Jonathan Nguyen

Jonathan Nguyen

Jonathan is a Shared Delivery Team Senior Security Consultant at AWS. His background is in AWS Security with a focus on threat detection and incident response. Today, he helps enterprise customers develop a comprehensive security strategy and deploy security solutions at scale, and he trains customers on AWS Security best practices.

David Hoang

David Hoang

David is a Shared Delivery Team Security Consultant at AWS. His background is in AWS security, with a focus on automation. David designs, builds, and implements scalable enterprise solutions with security guardrails that use AWS services.

Implement step-up authentication with Amazon Cognito, Part 2: Deploy and test the solution

Post Syndicated from Salman Moghal original https://aws.amazon.com/blogs/security/implement-step-up-authentication-with-amazon-cognito-part-2-deploy-and-test-the-solution/

This solution consists of two parts. In the previous blog post Implement step-up authentication with Amazon Cognito, Part 1: Solution overview, you learned about the architecture and design of a step-up authentication solution that uses AWS services such as Amazon API Gateway, Amazon Cognito, Amazon DynamoDB, and AWS Lambda to protect privileged API operations. In this post, you will use a reference implementation to deploy and test the step-up authentication solution in your AWS account.

Solution deployment

The step-up authentication solution discussed in Part 1 uses a reference implementation that you can use for demonstration and learning purposes. You can also review the implementation code in the step-up-auth GitHub repository. The reference implementation includes a web application that you can use in the following sections to test the step-up implementation. Additionally, the implementation contains a sample privileged API action /transfer and a non-privileged API action /info, and two step-up authentication solution API operations /initiate-auth, and /respond-to-challenge. The web application invokes these API operations to demonstrate how to perform step-up authentication.

Deployment prerequisites

The following are prerequisites for deployment:

  1. The Node.js runtime and the node package manager (npm) are installed on your machine. You can use a package manager for your platform to install these. Note that the reference implementation code was tested using Node.js v16 LTS.
  2. The AWS Cloud Development Kit (AWS CDK) is installed in your environment.
  3. The AWS Command Line Interface (AWS CLI) is installed in your environment.
  4. You must have AWS credentials files that contain a profile with your account secret key and access key to perform the deployment. Make sure that your account has enough privileges to create, update, or delete the following resources:
  5. A two-factor authentication (2FA) mobile application, such as Google Authenticator, is installed on your mobile device.

Deploy the step-up solution

You can deploy the solution by using the AWS CDK, which will create a working reference implementation of the step-up authentication solution.

To deploy the solution

  1. Build the necessary resources by using the build.sh script in the deployment folder. Run the build script from a terminal window, using the following command:
    cd deployment && ./build.sh
  2. Deploy the solution by using the deploy.sh script that is present in the deployment folder, using the following command. Be sure to replace the required environment variables with your own values.
    export AWS_REGION=<your AWS Region of choice, for example us-east-2>
    export AWS_ACCOUNT=<your account number>
    export AWS_PROFILE=<a valid profile in .aws/credentials that contains the secret/access key to your account>
    export NODE_ENV=development
    export ENV_PREFIX=dev

    The account you specify in the AWS_ACCOUNT environment variable is used to bootstrap the AWS CDK deployment. Set AWS_PROFILE to point to your profile. Make sure that your account has sufficient privileges, as described in the prerequisites.

    The NODE_ENV environment variable can be set to development or production. This variable controls the log output that the Lambda functions generate. The ENV_PREFIX environment variable allows you to prefix all resources with a tag, which enables a multi-tenant deployment of this solution.

  3. Still in the deployment folder, deploy the stack by using the following command:
    ./deploy.sh
  4. Make note of the CloudFront distribution URL that follows Sample Web App URL, as shown in Figure 1. In the next section, you will use this CloudFront distribution URL to load the sample web app in a web browser and test the step-up solution
    Figure 1: The output of the deployment process

    Figure 1: The output of the deployment process

After the deployment script deploy.sh completes successfully, the AWS CDK creates the following resources in your account:

  • An Amazon Cognito user pool that is used as a user registry.
  • An Amazon API Gateway API that contains three resources:
    • A protected resource that requires step-up authentication.
    • An initiate-auth resource to start the step-up challenge response.
    • A respond-to-challenge resource to complete the step-up challenge.
  • An API Gateway Lambda authorizer that is used to protect API actions.
  • The following Amazon DynamoDB tables:
    • A setting table that holds the configuration mapping of the API operations that require elevated privileges.
    • A session table that holds temporary, user-initiated step-up sessions and their current status.
  • A React web UI that demonstrates how to invoke a privileged API action and go through step-up authentication.

Test the step-up solution

In order to test the step-up solution, you’ll use the sample web application that you deployed in the previous section. Here’s an overview of the actions you’ll perform to test the flow:

  1. In the AWS Management Console, create items in the setting DynamoDB table that point to privileged API actions. After the solution deployment, the setting DynamoDB table is called step-up-auth-setting-<ENV_PREFIX>. For more information about ENV_PREFIX variable usage in a multi-tenant environment, see Deploy the step-up solution earlier in this post.

    As discussed, in the Data design section in Part 1 of this series, the Lambda authorizer treats all API invocations as non-privileged (that is, they don’t require step-up authentication) unless there is a matching entry for the API action in the setting table. Additionally, you can switch a privileged API action to a non-privileged API action by simply changing the stepUpState attribute in the setting table. Create an item in the DynamoDB table for the sample /transfer API action and for the sample /info API action. The /transfer API action will require step-up authentication, whereas the /info API action will be a non-privileged invocation that does not require step-up authentication. Note that there is no need to define a non-privileged API action in the table; it is there for illustration purposes only.

  2. If you haven’t already, install Google Authenticator or a similar two-factor authentication (2FA) application on your mobile device.
  3. Using the sample web application, register a new user in Amazon Cognito.
  4. Log in to the sample web application by using the registered new user.
  5. Configure the preferred multi-factor authentication (MFA) settings for the logged in user in the application. This step is necessary so that Amazon Cognito can challenge the user with a one-time password (OTP).
  6. Using the sample web application, invoke the sample /transfer privileged API action that requires step-up authentication.
  7. The Lambda authorizer will intercept the API request and return a 401 Unauthorized response status code that the sample web application will handle. The application will perform step-up authentication by prompting you to provide additional security credentials, specifically the OTP. To complete the step-up authentication, enter the OTP, which is sent through short service message (SMS) or by using an authenticator mobile app.
  8. Invoke the sample /transfer privileged API action again in the sample web application, and verify that the API invocation is successful.

The following instructions assume that you’ve installed a 2FA mobile application, such as Google Authenticator, on your mobile device. You will configure the 2FA application in the following steps and use the OTP from this mobile application when prompted to enter the step-up challenge. You can configure Amazon Cognito to send you an SMS with the OTP. However, you must be aware of the Amazon Cognito throttling limits. See the Additional considerations section in Part 1 of this series. Read these limits carefully, especially if you set the user’s preferred MFA setting to SMS.

To test the step-up authentication solution

  1. Open the Amazon DynamoDB console and log in to your AWS account.
  2. On the left nav pane, under Tables, choose Explore items. In the right pane, choose the table named step-up-auth-setting* and choose Create item, as shown in Figure 2.
    Figure 2: Choose the step-up-auth-setting* table and choose Create item button

    Figure 2: Choose the step-up-auth-setting* table and choose Create item button

  3. In the Edit item screen as shown in Figure 3, ensure that JSON is selected, and the Attributes button for View DynamoDB JSON is off.
    Figure 3: Edit an item in the table - select JSON and turn off View DynamoDB JSON button

    Figure 3: Edit an item in the table – select JSON and turn off View DynamoDB JSON button

  4. To create an entry for the /info API action, copy the following JSON text:
    {
       "id": "/info",
       "lastUpdateTimestamp": "2021-08-23T08:25:29.023Z",
       "stepUpState": "STEP_UP_NOT_REQUIRED",
       "createTimestamp": "2021-08-23T08:25:29.023Z"
    }
  5. Paste the copied JSON text for the /info API action in the Attributes text area, as shown in Figure 4, and choose Create item.
    Figure 4: Create an entry for the /info API action

    Figure 4: Create an entry for the /info API action

  6. To create an entry for the /transfer API action, copy the following JSON text:
    {
       "id": "/transfer",
       "lastUpdateTimestamp": "2021-08-23T08:22:12.436Z",
       "stepUpState": "STEP_UP_REQUIRED",
       "createTimestamp": "2021-08-23T08:22:12.436Z"
    }
  7. Paste the copied JSON text for the /transfer API action in the Attributes text area, as shown in Figure 4, and choose Create item.
    Figure 5: Create an entry for the /transfer API action

    Figure 5: Create an entry for the /transfer API action

  8. Open your web browser and load the CloudFront URL that you made note of in step 4 of the Deploy the step-up solution procedure.
  9. On the login screen of the sample web application, enter the information for a new user. Make sure that the email address and phone numbers are valid. Choose Register. You will be prompted to enter a verification code. Check your email for the verification code, and enter it at the sample web application prompt.
  10. You will be sent back to the login screen. Log in as the user that you just registered. You will see the welcome screen, as shown in Figure 6.
    Figure 6: Welcome screen of the sample web application

    Figure 6: Welcome screen of the sample web application

  11. In the left nav pane choose Setting, choose the Configure button to the right of Software Token, as shown in Figure 7. Use your mobile device camera to capture the QR code on the screen in your 2FA application, for example Google Authenticator.
    Figure 7: Configure Software Token screen with QR code

    Figure 7: Configure Software Token screen with QR code

  12. Enter the temporary code from the 2FA application into the web application and choose Submit. You will see the message Software Token successfully configured!
  13. Still in the Setting menu, next to Select Preferred MFA, choose Software Token. You will see the message User preferred MFA set to Software Token, as shown in Figure 8.
    Figure 8: Completed Software Token setup

    Figure 8: Completed Software Token setup

  14. In the left nav pane choose StepUp Auth. In the right pane, choose Invoke Transfer API. You should see Response: 401 authorization challenge, as shown in Figure 9.
    Figure 9: The step-up API invocation returns an authorization challenge

    Figure 9: The step-up API invocation returns an authorization challenge

  15. On your mobile device, open the 2FA application, copy the OTP code from the 2FA application, and enter the code into the Enter OTP field, as shown in Figure 9. Choose Submit.
  16. This sends the OTP to the respond-to-challenge endpoint. After the OTP is verified, the endpoint will return a success or failure message. Figure 10 shows a successful OTP verification. You are prompted to invoke the /transfer privileged API action again.
    Figure 10: The OTP prompt during step-up API invocation

    Figure 10: The OTP prompt during step-up API invocation

  17. Invoke the transfer API action again by choosing Invoke Transfer API. You should see a success message as shown in Figure 11.
    Figure 11: A successful step-up API invocation

    Figure 11: A successful step-up API invocation

    Congratulations! You’ve successfully performed step-up authentication.

Conclusion

In the previous post in this series, Implement step-up authentication with Amazon Cognito, Part 1: Solution overview, you learned about the architecture and implementation details for the step-up authentication solution. In this blog post, you learned how to deploy and test the step-up authentication solution in your AWS account. You deployed the solution by using scripts from the step-up-auth GitHub repository that use the AWS CDK to create resources in your account for Amazon Cognito, Amazon API Gateway, a Lambda authorizer, and Amazon DynamoDB. Finally, you tested the end-to-end solution on a sample web application by invoking a privileged API action that required step-up authentication. Using the 2FA application, you were able to pass in an OTP to complete the step-up authentication and subsequently successfully invoke the privileged API action.

For more information about AWS Cognito user pools and the new console experience, watch the video Amazon Cognito User Pools New Console Walkthrough on the AWS channel on YouTube. And for more information about how to protect your API actions with fine-grained access controls, see the blog post Building fine-grained authorization using Amazon Cognito, API Gateway, and IAM.

If you have feedback about this post, submit comments in the Comments section below. If you have any questions about this post, start a thread on the Amazon Cognito forum.

Want more AWS Security news? Follow us on Twitter.

Salman Moghal

Salman Moghal

Salman is a Principal Consultant in AWS Professional Services, based in Toronto, Canada. He helps customers in architecting, developing, and reengineering data-driven applications at scale, with a sharp focus on security.

Thomas Ross

Thomas Ross

Thomas is a Software Engineering student at Carleton University. He worked at AWS as a Professional Services Intern and a Software Development Engineer Intern in Amazon Aurora. He has an interest in almost anything related to technology, especially systems at high scale, security, distributed systems, and databases.

Ozair Sheikh

Ozair Sheikh

Ozair is a senior product leader for Sponsored Display in Amazon ads, based in Toronto, Canada. He helps advertisers and Ad Tech API Partners build campaign management solutions to reach customers across the purchase journey. He has over 10 years of experience in API management and security, with an obsession for delivering highly secure API products.

Mahmoud Matouk

Mahmoud Matouk

Mahmoud is a Principal Solutions Architect with the Amazon Cognito team. He helps AWS customers build secure and innovative solutions for various identity and access management scenarios.

Implement step-up authentication with Amazon Cognito, Part 1: Solution overview

Post Syndicated from Salman Moghal original https://aws.amazon.com/blogs/security/implement-step-up-authentication-with-amazon-cognito-part-1-solution-overview/

In this blog post, you’ll learn how to protect privileged business transactions that are exposed as APIs by using multi-factor authentication (MFA) or security challenges. These challenges have two components: what you know (such as passwords), and what you have (such as a one-time password token). By using these multi-factor security controls, you can implement step-up authentication to obtain a higher level of security when you perform critical transactions. In this post, we show you how you can use AWS services such as Amazon API Gateway, Amazon Cognito, Amazon DynamoDB, and AWS Lambda functions to implement step-up authentication by using a simple rule-based security model for your API resources.

Previously, identity and access management solutions have attempted to deliver step-up authentication by retrofitting their runtimes with stateful server-side management, which doesn’t scale in the modern-day stateless cloud-centered application architecture. We’ll show you how to use a pluggable, stateless authentication implementation that integrates into your existing infrastructure without compromising your security or performance. The Amazon API Gateway Lambda authorizer is a pluggable serverless function that acts as an intermediary step before an API action is invoked. This Lambda authorizer, coupled with a small SDK library that runs in the authorizer, will provide step-up authentication.

This solution consists of two blog posts. This is Part 1, where you’ll learn about the step-up authentication solution architecture and design. In the next post, Implement step-up authentication with Amazon Cognito, Part 2: Deploy and test the solution, you’ll learn how to use a reference implementation to test the step-up authentication solution.

Prerequisites

The reference architecture in this post uses a purpose-built step-up authorization workflow engine, which uses a custom SDK. The custom SDK uses the DynamoDB service as a persistent layer. This workflow engine is generic and can be used across any API serving layers, such as API Gateway or Elastic Load Balancing (ELB) Application Load Balancer, as long as the API serving layers can intercept API requests to perform additional actions. The step-up workflow engine also relies on an identity provider that is capable of issuing an OAuth 2.0 access token.

There are three parts to the step-up authentication solution:

  1. An API serving layer with the capability to apply custom logic before applying business logic.
  2. An OAuth 2.0–capable identity provider system.
  3. A purpose-built step-up workflow engine.

The solution in this post uses Amazon Cognito as the identity provider, with an API Gateway Lambda authorizer to invoke the step-up workflow engine, and DynamoDB as a persistent layer used by the step-up workflow engine. You can see a reference implementation of the API Gateway Lambda authorizer in the step-up-auth GitHub repository. Additionally, the purpose-built step-up workflow engine provides two API endpoints (or API actions), /initiate-auth and /respond-to-challenge, which are realized using the API Gateway Lambda authorizer, to drive the API invocation step-up state.

Note: If you decide to use an API serving layer other than API Gateway, or use an OAuth 2.0 identity provider besides Amazon Cognito, you will have to make changes to the accompanying sample code in the step-up-auth GitHub repository.

Solution architecture

Figure 1 shows the high-level reference architecture.

Figure 1: Step-up authentication high-level reference architecture

Figure 1: Step-up authentication high-level reference architecture

First, let’s talk about the core components in the step-up authentication reference architecture in Figure 1.

Identity provider

In order for a client application or user to invoke a protected backend API action, they must first obtain a valid OAuth token or JSON web token (JWT) from an identity provider. The step-up authentication solution uses Amazon Cognito as the identity provider. The step-up authentication solution and the accompanying step-up API operations use the access token to make the step-up authorization decision.

Protected backend

The step-up authentication solution uses API Gateway to protect backend resources. API Gateway supports several different API integration types, and you can use any one of the supported API Gateway integration types. For this solution, the accompanying sample code in the step-up-auth GitHub repository uses Lambda proxy integration to simulate a protected backend resource.

Data design

The step-up authentication solution relies on two DynamoDB tables, a session table and a setting table. The session table contains the user’s step-up session information, and the setting table contains an API step-up configuration. The API Gateway Lambda authorizer (described in the next section) checks the setting table to determine whether the API request requires a step-up session. For more information about table structure and sample values, see the Step-up authentication data design section in the accompanying GitHub repository.

The session table has the DynamoDB Time to Live (TTL) feature enabled. An item stays in the session table until the TTL time expires, when DynamoDB automatically deletes the item. The TTL value can be controlled by using the environment variable SESSION_TABLE_ITEM_TTL. Later in this post, we’ll cover where to define this environment variable in the Step-up solution design details section; and we’ll cover how to set the optimal value for this environment variable in the Additional considerations section.

Authorizer

The step-up authentication solution uses a purpose-built request parameter-based Lambda authorizer (also called a REQUEST authorizer). This REQUEST authorizer helps protect privileged API operations that require a step-up session.

The authorizer verifies that the API request contains a valid access token in the HTTP Authorization header. Using the access token’s JSON web token ID (JTI) claim as a key, the authorizer then attempts to retrieve a step-up session from the session table. If a session exists and its state is set to either STEP_UP_COMPLETED or STEP_UP_NOT_REQUIRED, then the authorizer lets the API call through by generating an allow API Gateway Lambda authorizer policy. If the set-up state is set to STEP_UP_REQUIRED, then the authorizer returns a 401 Unauthorized response status code to the caller.

If a step-up session does not exist in the session table for the incoming API request, then the authorizer attempts to create a session. It first looks up the setting table for the API configuration. If an API configuration is found and the configuration status is set to STEP_UP_REQUIRED, it indicates that the user must provide additional authentication in order to call this API action. In this case, the authorizer will create a new session in the session table by using the access token’s JTI claim as a session key, and it will return a 401 Unauthorized response status code to the caller. If the API configuration in the setting table is set to STEP_UP_DENY, then the authorizer will return a deny API Gateway Lambda authorizer policy, therefore blocking the API invocation. The caller will receive a 403 Forbidden response status code.

The authorizer uses the purpose-built auth-sdk library to interface with both the session and setting DynamoDB tables. The auth-sdk library provides convenient methods to create, update, or delete items in tables. Internally, auth-sdk uses the DynamoDB v3 Client SDK.

Initiate auth endpoint

When you deploy the step-up authentication solution, you will get the following two API endpoints:

  1. The initiate step-up authentication endpoint (described in this section).
  2. The respond to step-up authentication challenge endpoint (described in the next section).

When a client receives a 401 Unauthorized response status code from API Gateway after invoking a privileged API operation, the client can start the step-up authentication flow by invoking the initiate step-up authentication endpoint (/initiate-auth).

The /initiate-auth endpoint does not require any extra parameters, it only requires the Amazon Cognito access_token to be passed in the Authorization header of the request. The /initiate-auth endpoint uses the access token to call the Amazon Cognito API actions GetUser and GetUserAttributeVerificationCode on behalf of the user.

After the /initiate-auth endpoint has determined the proper multi-factor authentication (MFA) method to use, it returns the MFA method to the client. There are three possible values for the MFA methods:

  • MAYBE_SOFTWARE_TOKEN_STEP_UP, which is used when the MFA method cannot be determined.
  • SOFTWARE_TOKEN_STEP_UP, which is used when the user prefers software token MFA.
  • SMS_STEP_UP, which is used when the user prefers short message service (SMS) MFA.

Let’s take a closer look at how /initiate-auth endpoint determines the type of MFA methods to return to the client. The endpoint calls Amazon Cognito GetUser API action to check for user preferences, and it takes the following actions:

  1. Determines what method of MFA the user prefers, either software token or SMS.
  2. If the user’s preferred method is set to software token, the endpoint returns SOFTWARE_TOKEN_STEP_UP code to the client.
  3. If the user’s preferred method is set to SMS, the endpoint sends an SMS message with a code to the user’s mobile device. It uses the Amazon Cognito GetUserAttributeVerificationCode API action to send the SMS message. After the Amazon Cognito API action returns success, the endpoint returns SMS_STEP_UP code to the client.
  4. When the user preferences don’t include either a software token or SMS, the endpoint checks if the response from Amazon Cognito GetUser API action contains UserMFASetting response attribute list with either SOFTWARE_TOKEN_MFA or SMS_MFA keywords. If the UserMFASetting response attribute list contains SOFTWARE_TOKEN_MFA, then the endpoint returns SOFTWARE_TOKEN_STEP_UP code to the client. If it contains SMS_MFA keyword, then the endpoint invokes the Amazon Cognito GetUserAttributeVerificationCode API action to send the SMS message (as in step 3). Upon successful response from the Amazon Cognito API action, the endpoint returns SMS_STEP_UP code to the client.
  5. If the UserMFASetting response attribute list from Amazon Cognito GetUser API action does not contain SOFTWARE_TOKEN_MFA or SMS_MFA keywords, then the endpoint looks for phone_number_verified attribute. If found, then the endpoint sends an SMS message with a code to the user’s mobile device with verified phone number. The endpoint uses the Amazon Cognito GetUserAttributeVerificationCode API action to send the SMS message (as in step 3). Otherwise, when no verified phone is found, the endpoint returns MAYBE_SOFTWARE_TOKEN_STEP_UP code to the client.

The flowchart shown in Figure 2 illustrates the full decision logic.

Figure 2: MFA decision flow chart

Figure 2: MFA decision flow chart

Respond to challenge endpoint

The respond to challenge endpoint (/respond-to-challenge) is called by the client after it receives an appropriate MFA method from the /initiate-auth endpoint. The user must respond to the challenge appropriately by invoking /respond-to-challenge with a code and an MFA method.

The /respond-to-challenge endpoint receives two parameters in the POST body, one indicating the MFA method and the other containing the challenge response. Additionally, this endpoint requires the Amazon Cognito access token to be passed in the Authorization header of the request.

If the MFA method is SMS_STEP_UP, the /respond-to-challenge endpoint invokes the Amazon Cognito API action VerifyUserAttribute to verify the user-provided challenge response, which is the code that was sent by using SMS.

If the MFA method is SOFTWARE_TOKEN_STEP_UP or MAYBE_SOFTWARE_TOKEN_STEP_UP, the /respond-to-challenge endpoint invokes the Amazon Cognito API action VerifySoftwareToken to verify the challenge response that was sent in the endpoint payload.

After the user-provided challenge response is verified, the /respond-to-challenge endpoint updates the session table with the step-up session state STEP_UP_COMPLETED by using the access_token JTI. If the challenge response verification step fails, no changes are made to the session table. As explained earlier in the Data design section, the step-up session stays in the session table until the TTL time expires, when DynamoDB will automatically delete the item.

Deploy and test the step-up authentication solution

If you want to test the step-up authentication solution at this point, go to the second part of this blog, Implement step-up authentication with Amazon Cognito, Part 2: Deploy and test the solution. That post provides instructions you can use to deploy the solution by using the AWS Cloud Development Kit (AWS CDK) in your AWS account, and test it by using a sample web application.

Otherwise, you can continue reading the rest of this post to review the details and code behind the step-up authentication solution.

Step-up solution design details

Now let’s dig deeper into the step-up authentication solution. Figure 3 expands on the high-level solution design in the previous section and highlights the sequence of events that must take place to perform step-up authentication. In this section, we’ll break down these sequences into smaller parts and discuss each by going over a detailed sequence diagram.

Figure 3: Step-up authentication detailed reference architecture

Figure 3: Step-up authentication detailed reference architecture

Let’s group the step-up authentication flow in Figure 3 into three parts:

  1. Create a step-up session (steps 1-6 in Figure 3)
  2. Initiate step-up authentication (steps 7-8 in Figure 3)
  3. Respond to the step-up challenge (steps 9-12 in Figure 3)

In the next sections, you’ll learn how the user’s API requests are handled by the step-up authentication solution, and how the user state is elevated by going through an additional challenge.

Create a step-up session

After the user successfully logs in, they create a step-up session when invoking a privileged API action that is protected with the step-up Lambda authorizer. This authorizer determines whether to start a step-up challenge based on the configuration within the DynamoDB setting table, which might create a step-up session in the DynamoDB session table. Let’s go over steps 1–6, shown in the architecture diagram in Figure 3, in more detail:

  • Step 1 – It’s important to note that the user must authenticate with Amazon Cognito initially. As a result, they must have a valid access token generated by the Amazon Cognito user pool.
  • Step 2 – The user then invokes a privileged API action and passes the access token in the Authorization header.
  • Step 3 – The API action is protected by using a Lambda authorizer. The authorizer first validates the token by invoking the Amazon Cognito user pool public key. If the token is invalid, a 401 Unauthorized response status code can be sent immediately, prompting the client to present a valid token.
  • Step 4 – The authorizer performs a lookup in the DynamoDB setting table to check whether the current request needs elevated privilege (also known as step-up privilege). In the setting table, you can define which API actions require elevated privilege. You can additionally bundle API operations into a group by defining the group attribute. This allows you to further isolate privileged API operations, especially in a large-scale deployment.
  • Step 5 – If an API action requires elevated privilege, the authorizer will check for an existing step-up session for this specific user in the session table. If a step-up session does not exist, the authorizer will create a new entry in the session table. The key for this table will be the JTI claim of the access_token (which can be obtained after token verification).
  • Step 6 – If a valid session exists, then authorization will be given. Otherwise an unauthorized access response (401 HTTP code) will be sent back from the Lambda authorizer, indicating that the user requires elevated privilege.

Figure 4 highlights these steps in a sequence diagram.

Figure 4: Sequence diagram for creating a step-up session

Figure 4: Sequence diagram for creating a step-up session

Initiate step-up authentication

After the user receives a 401 Unauthorized response status code from invoking the privileged API action in the previous step, the user must call the /initiate-auth endpoint to start step-up authentication. The endpoint will return the response to the user or the client application to supply the temporary code. Let’s go over steps 7 and 8, shown in the architecture diagram in Figure 3, in more detail:

  • Step 7 – The client application initiates a step-up action by calling the /initiate-auth endpoint. This action is protected by the API Gateway built-in Amazon Cognito authorizer, and the client needs to pass a valid access_token in the Authorization header.
  • Step 8 – The call is forwarded to a Lambda function that will initiate the step-up action with the end user. The function first calls the Amazon Cognito API action GetUser to find out the user’s MFA settings. Depending on which MFA type is enabled for the user, the function uses different Amazon Cognito API operations to start the MFA challenge. For more details, see the Initiate auth endpoint section earlier in this post.

Figure 5 shows these steps in a sequence diagram.

Figure 5: Sequence diagram for invoking /initiate-auth to start step-up authentication

Figure 5: Sequence diagram for invoking /initiate-auth to start step-up authentication

Respond to the step-up challenge

In the previous step, the user receives a challenge code from the /initiate-auth endpoint. Depending on the type of challenge code, user must respond by sending a one-time password (OTP) to the /respond-to-challenge endpoint. The /respond-to-challenge endpoint invokes an Amazon Cognito API action to verify the OTP. Upon successful verification, the /respond-to-challenge endpoint marks the step-up session in the session table to STEP_UP_COMPLETED, indicating that the user now has elevated privilege. At this point, the user can invoke the privileged API action again to perform the elevated business operation. Let’s go over steps 9–12, shown in the architecture diagram in Figure 3, in more detail:

  • Step 9 – The client application presents an appropriate screen to the user to collect a response to the step-up challenge. The client application calls the /respond-to-challenge endpoint that contains the following:
    1. An access_token in the Authorization header.
    2. A step-up challenge type.
    3. A response provided by the user to the step-up challenge.

    This endpoint is protected by the API Gateway built-in Amazon Cognito authorizer.

  • Step 10 – The call is forwarded to the Lambda function, which verifies the response by calling the Amazon Cognito API action VerifyUserAttribute (in the case of SMS_STEP_UP) or VerifySoftwareToken (in the case of SOFTWARE_TOKEN_STEP_UP), depending on the type of step-up action that was returned from the /initiate-auth API action. The Amazon Cognito response will indicate whether verification was successful.
  • Step 11 – If the Amazon Cognito response in the previous step was successful, the Lambda function associated with the /respond-to-challenge endpoint inserts a record in the session table by using the access_token JTI as key. This record indicates that the user has completed step-up authentication. The record is inserted with a time to live (TTL) equal to the lesser of these values: the remaining period in the access_token timeout, or the default TTL value that is set in the Lambda function as a configurable environment variable, SESSION_TABLE_ITEM_TTL. The /respond-to-challenge endpoint returns a 200 status code after successfully updating the session table. It returns a 401 Unauthorized response status code if the operation failed or if the Amazon Cognito API calls in the previous step failed. For more information about the optimal value for the SESSION_TABLE_ITEM_TTL variable, see the Additional considerations section later in this post.
  • Step 12 – The client application can re-try the original call (using the same access token) to the privileged API operations, and this call should now succeed because an active step-up session exists for the user. Calls to other privileged API operations that require step-up should also succeed, as long as the step-up session hasn’t expired.

Figure 6 shows these steps in a sequence diagram.

Figure 6: Invoke the /respond-to-challenge endpoint to complete step-up authentication

Figure 6: Invoke the /respond-to-challenge endpoint to complete step-up authentication

Additional considerations

This solution uses several Amazon Cognito API operations to provide step-up authentication functionality. Amazon Cognito applies rate limiting on all API operations categories, and rapid calls that exceed the assigned quota will be throttled.

The step-up flow for a single user can include multiple Amazon Cognito API operations such as GetUser, GetUserAttributeVerificationCode, VerifyUserAttribute, and VerifySoftwareToken. These Amazon Cognito API operations have different rate limits. The effective rate, in requests per second (RPS), that your privileged and protected API action can achieve will be equivalent to the lowest category rate limit among these API operations. When you use the default quota, your application can achieve 25 SMS_STEP_UP RPS or up to 50 SOFTWARE_TOKEN_STEP_UP RPS.

Certain Amazon Cognito API operations have additional security rate limits per user per hour. For example, the GetUserAttributeVerificationCode API action has a limit of five calls per user per hour. For that reason, we recommend 15 minutes as the minimum value for SESSION_TABLE_ITEM_TTL, as this will allow a single user to have up to four step-up sessions per hour if needed.

Conclusion

In this blog post, you learned about the architecture of our step-up authentication solution and how to implement this architecture to protect privileged API operations by using AWS services. You learned how to use Amazon Cognito as the identity provider to authenticate users with multi-factor security and API Gateway with an authorizer Lambda function to enforce access to API actions by using a step-up authentication workflow engine. This solution uses DynamoDB as a persistent layer to manage the security rules for the step-up authentication workflow engine, which helps you to efficiently manage your rules.

In the next part of this post, Implement step-up authentication with Amazon Cognito, Part 2: Deploy and test the solution, you’ll deploy a reference implementation of the step-up authentication solution in your AWS account. You’ll use a sample web application to test the step-up authentication solution you learned about in this post.

 
If you have feedback about this post, submit comments in the Comments section below. If you have any questions about this post, start a thread on the Amazon Cognito forum.

Want more AWS Security news? Follow us on Twitter.

Salman Moghal

Salman Moghal

Salman is a Principal Consultant in AWS Professional Services, based in Toronto, Canada. He helps customers in architecting, developing, and reengineering data-driven applications at scale, with a sharp focus on security.

Thomas Ross

Thomas Ross

Thomas is a Software Engineering student at Carleton University. He worked at AWS as a Professional Services Intern and a Software Development Engineer Intern in Amazon Aurora. He has an interest in almost anything related to technology, especially systems at high scale, security, distributed systems, and databases.

Ozair Sheikh

Ozair Sheikh

Ozair is a senior product leader for Sponsored Display in Amazon ads, based in Toronto, Canada. He helps advertisers and Ad Tech API Partners build campaign management solutions to reach customers across the purchase journey. He has over 10 years of experience in API management and security, with an obsession for delivering highly secure API products.

Mahmoud Matouk

Mahmoud Matouk

Mahmoud is a Principal Solutions Architect with the Amazon Cognito team. He helps AWS customers build secure and innovative solutions for various identity and access management scenarios.

Amazon Cognito launches support for in-Region integration with Amazon SES and Amazon SNS

Post Syndicated from Amit Jha original https://aws.amazon.com/blogs/security/amazon-cognito-launches-support-for-in-region-integration-with-amazon-ses-and-amazon-sns/

We are pleased to announce that in all AWS Regions that support Amazon Cognito, you can now integrate Amazon Cognito with Amazon Simple Email Service (Amazon SES) and Amazon Simple Notification Service (Amazon SNS) in the same Region. By integrating these services in the same Region, you can more easily achieve lower latency, and remove cross-Region dependencies in your architecture. Amazon Cognito lets you add authentication, authorization, and user management to your web and mobile apps. Amazon Cognito scales to millions of users and supports sign-in with social identity providers such as Apple, Facebook, Google, and Amazon, and enterprise identity providers that support SAML 2.0 and OpenID Connect (OIDC).

Amazon Cognito launched new console experience in 2021 that makes it even easier for you to manage Amazon Cognito user pools and add sign-in and sign-up functionality to your applications. The new console has now been further enhanced to configure the in-Region Amazon SES options as shown in Figure 1, and Amazon SNS options as shown in Figure 2. Also you can configure the same via Amazon Cognito APIs. Thus you can update your in-Region Amazon SES, Amazon SNS configuration options through the console, API, or CLI. You can use Amazon Cognito in a Region that suits your business requirements and sustainability goals, and extend your Amazon Cognito architecture to additional Regions.

Figure 1: Amazon SES Region drop-down selection with new options

Figure 1: Amazon SES Region drop-down selection with new options

Figure 2: Amazon SNS Region selection drop-down selection with new options

Figure 2: Amazon SNS Region selection drop-down selection with new options

In-Region integration with Amazon SES and Amazon SNS is currently available in all Regions where Amazon SES, Amazon SNS and Amazon Cognito are available. For up to date information, see the AWS Regional Services List. To learn more, see What is Amazon Cognito?.

Frequently asked questions (FAQs)

What Region will Amazon Cognito console default to when I configure Amazon SES and Amazon SNS Regions?

When creating new user pools, the Amazon Cognito console auto-populates the Region to in-Region, but you still have to select the identity. Existing user pools with cross-Region Amazon SES or Amazon SNS integration will not be affected.

Can I update an existing user pool to integrate with Amazon SES or Amazon SNS in the same Region?

Yes, you can change your configuration so that Amazon Cognito integrates with either Amazon SES or Amazon SNS, or both, in the same Region.

What Regions can I use with Amazon Cognito for Amazon SNS and Amazon SES?

For most up-to date mapping of Regions to use, see the table in SMS message settings for Amazon Cognito user pools.

Why should I change from cross-Region to same-Region Amazon SES or Amazon SNS?

Amazon Cognito is designed to scale to millions of users. Your users expect prompt delivery of their messages for multi-factor authentication and account setup. Using Amazon SES and Amazon SNS in the same Region as your user pool improves performance by reducing the round-trip time of the call that Amazon Cognito makes to Amazon SES or Amazon SNS.

What are the key benefits of using in-Region integration?

Availability: Availability is improved as you no longer will have cross-Region dependency for Amazon SES or Amazon SNS.

Latency: Transit time for API requests is most efficient within a single AWS Region.

Usability: Billing, logging, and setup are more transparent when you consolidate resources in the same Region.

Which version of Amazon Cognito user pools console does this change apply to?

This change applies to current version of the new Amazon Cognito user pool console experience. Also this change applies to current version of Amazon Cognito APIs.

Will my current cross-Region integration change?

No. Your AWS resources are your own and will not be changed. If you want to make use of the new in-Region integration, you must update your user pool configuration to integrate with Amazon SES or Amazon SNS in the same AWS Region.

Will I be placed in the SMS sandbox if I change my Amazon SNS Region?

The SMS sandbox status is Region dependent, so whether or not your user pool is in the SMS sandbox depends on the SNS Region you configure in your user pool. When your account is in the SMS sandbox, Amazon Cognito can send SMS text messages only to verified phone numbers and not to all of your users. When you move to a new Region, verified phone numbers will also need to be re-verified. For more information, see SMS message settings for Amazon Cognito user pools.

To find info about whether your user pool is configured in an SNS Region that is in the SMS sandbox, you can view the SmsConfigurationFailure field in DescribeUserPool API.

Which API parameters can developers use to make the in-Region changes?

Amazon SES: verified Amazon SES identities from the new Regions will be allowed through SourceArn parameters in the AWS::Cognito::UserPool EmailConfiguration type, and in the AWS::Cognito:: RiskConfiguration NotifyConfiguration type.

Amazon SNS: There is now a new parameter called SnsRegionM in the SmsConfiguration type in the following APIs:

Will my automation scripts break due to this change?

This change to support in-Region integration will not break your automation scripts. If future updates include changing the default Region value to in-Region, we plan to inform all Amazon Cognito customers about this change with sufficient time to transition to the new default Region value.

Can I revert to my original Region integration if I run into an issue?

Yes, the ability to use Amazon SES or Amazon SNS resources in a different AWS Region is still supported.

Next steps

If your Amazon Cognito user pool is currently configured to make cross-Region calls to Amazon SES or Amazon SNS, you can update your configuration through the console, API, or CLI.

If you have any questions or issues, you can start a new thread on AWS re:Post, contact AWS Support, or your technical account manager (TAM).

Want more AWS Security news? Follow us on Twitter.

Amit Jha

Amit Jha

Amit is a Developer Advocate with focus on Security/Identity. Amit has 18+ years of industry experience as a software developer & Architect. Prior to his current role, he served multiple roles at Microsoft for 11+ years helping large enterprises with Software architecture and custom development consulting.

Security practices in AWS multi-tenant SaaS environments

Post Syndicated from Keith P original https://aws.amazon.com/blogs/security/security-practices-in-aws-multi-tenant-saas-environments/

Securing software-as-a-service (SaaS) applications is a top priority for all application architects and developers. Doing so in an environment shared by multiple tenants can be even more challenging. Identity frameworks and concepts can take time to understand, and forming tenant isolation in these environments requires deep understanding of different tools and services.

While security is a foundational element of any software application, specific considerations apply to SaaS applications. This post dives into the challenges, opportunities and best practices for securing multi-tenant SaaS environments on Amazon Web Services (AWS).

SaaS application security considerations

Single tenant applications are often deployed for a specific customer, and typically only deal with this single entity. While security is important in these environments, the threat profile does not include potential access by other customers. Multi-tenant SaaS applications have unique security considerations when compared to single tenant applications.

In particular, multi-tenant SaaS applications must pay special attention to identity and tenant isolation. These considerations are in addition to the security measures all applications must take. This blog post reviews concepts related to identity and tenant isolation, and how AWS can help SaaS providers build secure applications.

Identity

SaaS applications are accessed by individual principals (often referred to as users). These principals may be interactive (for example, through a web application) or machine-based (for example, through an API). Each principal is uniquely identified, and is usually associated with information about the principal, including email address, name, role and other metadata.

In addition to the unique identification of each individual principal, a SaaS application has another construct: a tenant. A paper on multi-tenancy defines a tenant as a group of one or more users sharing the same view on an application they use. This view may differ for different tenants. Each individual principal is associated with a tenant, even if it is only a 1:1 mapping. A tenant is uniquely identified, and contains information about the tenant administrator, billing information and other metadata.

When a principal makes a request to a SaaS application, the principal provides their tenant and user identifier along with the request. The SaaS application validates this information and makes an authorization decision. In well-designed SaaS applications, this authorization step should not rely on a centralized authorization service. A centralized authorization service is a single point of failure in an application. If it fails, or is overwhelmed with requests, the application will no longer be able to process requests.

There are two key techniques to providing this type of experience in a SaaS application: using an identity provider (IdP) and representing identity or authorization in a token.

Using an Identity Provider (IdP)

In the past, some web applications often stored user information in a relational database table. When a principal authenticated successfully, the application issued a session ID. For subsequent requests, the principal passed the session ID to the application. The application made authorization decisions based on this session ID. Figure 1 provides an example of how this setup worked.

Figure 1 - An example of legacy application authentication.

Figure 1 – An example of legacy application authentication.

In applications larger than a simple web application, this pattern is suboptimal. Each request usually results in at least one database query or cache look up, creating a bottleneck on the data store holding the user or session information. Further, because of the tight coupling between the application and its user management, federation with external identity providers becomes difficult.

When designing your SaaS application, you should consider the use of an identity provider like Amazon Cognito, Auth0, or Okta. Using an identity provider offloads the heavy lifting required for managing identity by having user authentication, including federation, handled by external identity providers. Figure 2 provides an example of how a SaaS provider can use an identity provider in place of the self-managed solution shown in Figure 1.

Figure 2 – An example of an authentication flow that involves an identity provider.

Figure 2 – An example of an authentication flow that involves an identity provider.

Once a user authenticates with an identity provider, the identity provider issues a standardized token. This token is the same regardless of how a user authenticates, which means your application does not need to build in support for multiple different authentication methods tenants might use.

Identity providers also commonly support federated access. Federated access means that a third party maintains the identities, but the identity provider has a trust relationship with this third party. When a customer tries to log in with an identity managed by the third party, the SaaS application’s identity provider handles the authentication transaction with the third-party identity provider.

This authentication transaction commonly uses a protocol like Security Assertion Markup Language (SAML) 2.0. The SaaS application’s identity provider manages the interaction with the tenant’s identity provider. The SaaS application’s identity provider issues a token in a format understood by the SaaS application. Figure 3 provides an example of how a SaaS application can provide support for federation using an identity provider.

Figure 3 - An example of authentication that involves a tenant-provided identity provider

Figure 3 – An example of authentication that involves a tenant-provided identity provider

For an example, see How to set up Amazon Cognito for federated authentication using Azure AD.

Representing identity with tokens

Identity is usually represented by signed tokens. JSON Web Signatures (JWS), often referred to as JSON Web Tokens (JWT), are signed JSON objects used in web applications to demonstrate that the bearer is authorized to access a particular resource. These JSON objects are signed by the identity provider, and can be validated without querying a centralized database or service.

The token contains several key-value pairs, called claims, which are issued by the identity provider. Besides several claims relating to the issuance and expiration of the token, the token can also contain information about the individual principal and tenant.

Sample access token claims

The example below shows the claims section of a typical access token issued by Amazon Cognito in JWT format.

{
  "sub": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
  "cognito:groups": [
"TENANT-1"
  ],
  "token_use": "access",
  "auth_time": 1562190524,
  "iss": "https://cognito-idp.us-west-2.amazonaws.com/us-west-2_example",
  "exp": 1562194124,
  "iat": 1562190524,
  "origin_jti": "bbbbbbbbb-cccc-dddd-eeee-aaaaaaaaaaaa",
  "jti": "cccccccc-dddd-eeee-aaaa-bbbbbbbbbbbb",
  "client_id": "12345abcde",

The principal, and the tenant the principal is associated with, are represented in this token by the combination of the user identifier (the sub claim) and the tenant ID in the cognito:groups claim. In this example, the SaaS application represents a tenant by creating a Cognito group per tenant. Other identity providers may allow you to add a custom attribute to a user that is reflected in the access token.

When a SaaS application receives a JWT as part of a request, the application validates the token and unpacks its contents to make authorization decisions. The claims within the token set what is known as the tenant context. Much like the way environment variables can influence a command line application, the tenant context influences how the SaaS application processes the request.

By using a JWT, the SaaS application can process a request without frequent reference to an external identity provider or other centralized service.

Tenant isolation

Tenant isolation is foundational to every SaaS application. Each SaaS application must ensure that one tenant cannot access another tenant’s resources. The SaaS application must create boundaries that adequately isolate one tenant from another.

Determining what constitutes sufficient isolation depends on your domain, deployment model and any applicable compliance frameworks. The techniques for isolating tenants from each other depend on the isolation model and the applications you use. This section provides an overview of tenant isolation strategies.

Your deployment model influences isolation

How an application is deployed influences how tenants are isolated. SaaS applications can use three types of isolation: silo, pool, and bridge.

Silo deployment model

The silo deployment model involves customers deploying one set of infrastructure per tenant. Depending on the application, this may mean a VPC-per-tenant, a set of containers per tenant, or some other resource that is deployed for each tenant. In this model, there is one deployment per tenant, though there may be some shared infrastructure for cross-tenant administration. Figure 4 shows an example of a siloed deployment that uses a VPC-per-tenant model.

Figure 4 - An example of a siloed deployment that provisions a VPC-per-tenant

Figure 4 – An example of a siloed deployment that provisions a VPC-per-tenant

Pool deployment model

The pool deployment model involves a shared set of infrastructure for all tenants. Tenant isolation is implemented logically in the application through application-level constructs. Rather than having separate resources per tenant, isolation enforcement occurs within the application. Figure 5 shows an example of a pooled deployment model that uses serverless technologies.

Figure 5 - An example of a pooled deployment model using serverless technologies

Figure 5 – An example of a pooled deployment model using serverless technologies

In Figure 5, an AWS Lambda function that retrieves an item from an Amazon DynamoDB table shared by all tenants needs temporary credentials issued by the AWS Security Token Service. These credentials only allow the requester to access items in the table that belong to the tenant making the request. A requester gets these credentials by assuming an AWS Identity and Access Management (IAM) role. This allows a SaaS application to share the underlying infrastructure, while still isolating tenants from one another. See Isolation enforcement depends on service below for more details on this pattern.

Bridge deployment model

The bridge model combines elements of both the silo and pool models. Some resources may be separate, others may be shared. For example, suppose your application has a shared application layer and an Amazon Relational Database Service (RDS) instance per tenant. The application layer evaluates each request and connects to the database for the tenant that made the request.

This model is useful in a situation where each tenant may require a certain response time and one set of resources acts as a bottleneck. In the RDS example, the application layer could handle the requests imposed by the tenants, but a single RDS instance could not.

The decision on which isolation model to implement depends on your customer’s requirements, compliance needs or industry needs. You may find that some customers can be deployed onto a pool model, while larger customers may require their own silo deployment.

Your tiering strategy may also influence the type of isolation model you use. For example, a basic tier customer might be deployed onto pooled infrastructure, while an enterprise tier customer is deployed onto siloed infrastructure.

For more information about different tenant isolation models, read the tenant isolation strategies whitepaper.

Isolation enforcement depends on service

Most SaaS applications will need somewhere to store state information. This could be a relational database, a NoSQL database, or some other storage medium which persists state. SaaS applications built on AWS use various mechanisms to enforce tenant isolation when accessing a persistent storage medium.

IAM provides fine grain access controls access for the AWS API. Some services, like Amazon Simple Storage Service (Amazon S3) and DynamoDB, provide the ability to control access to individual objects or items with IAM policies. When possible, your application should use IAM’s built-in functionality to limit access to tenant resources. See Isolating SaaS Tenants with Dynamically Generated IAM Policies for more information about using IAM to implement tenant isolation.

AWS IAM also offers the ability to restrict access to resources based on tags. This is known as attribute-based access control (ABAC). This technique allows you to apply tags to supported resources, and make access control decisions based on which tags are applied. This is a more scalable access control mechanism than role-based access control (RBAC), because you do not need to modify an IAM policy each time a resource is added or removed. See How to implement SaaS tenant isolation with ABAC and AWS IAM for more information about how this can be applied to a SaaS application.

Some relational databases offer features that can enforce tenant isolation. For example, PostgreSQL offers a feature called row level security (RLS). Depending on the context in which the query is sent to the database, only tenant-specific items are returned in the results. See Multi-tenant data isolation with PostgreSQL Row Level Security for more information about row level security in PostgreSQL.

Other persistent storage mediums do not have fine grain permission models. They may, however, offer some kind of state container per tenant. For example, when using MongoDB, each tenant is assigned a MongoDB user and a MongoDB database. The secret associated with the user can be stored in AWS Secrets Manager. When retrieving a tenant’s data, the SaaS application first retrieves the secret, then authenticates with MongoDB. This creates tenant isolation because the associated credentials only have permission to access collections in a tenant-specific database.

Generally, if the persistent storage medium you’re using offers its own permission model that can enforce tenant isolation, you should use it, since this keeps you from having to implement isolation in your application. However, there may be cases where your data store does not offer this level of isolation. In this situation, you would need to write application-level tenant isolation enforcement. Application-level tenant isolation means that the SaaS application, rather than the persistent storage medium, makes sure that one tenant cannot access another tenant’s data.

Conclusion

This post reviews the challenges, opportunities and best practices for the unique security considerations associated with a multi-tenant SaaS application, and describes specific identity considerations, as well as tenant isolation methods.

If you’d like to know more about the topics above, the AWS Well-Architected SaaS Lens Security pillar dives deep on performance management in SaaS environments. It also provides best practices and resources to help you design and improve performance efficiency in your SaaS application.

Get Started with the AWS Well-Architected SaaS Lens

The AWS Well-Architected SaaS Lens focuses on SaaS workloads, and is intended to drive critical thinking for developing and operating SaaS workloads. Each question in the lens has a list of best practices, and each best practice has a list of improvement plans to help guide you in implementing them.

The lens can be applied to existing workloads, or used for new workloads you define in the tool. You can use it to improve the application you’re working on, or to get visibility into multiple workloads used by the department or area you’re working with.

The SaaS Lens is available in all Regions where the AWS Well-Architected Tool is offered, as described in the AWS Regional Services List. There are no costs for using the AWS Well-Architected Tool.

If you’re an AWS customer, find current AWS Partners that can conduct a review by learning about AWS Well-Architected Partners and AWS SaaS Competency Partners.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Security Hub forum. To start your 30-day free trial of Security Hub, visit AWS Security Hub.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Keith P

Keith is a senior partner solutions architect on the SaaS Factory team.

Andy Powell

Andy is the global lead partner for solutions architecture on the SaaS Factory team.