Tag Archives: AWS CloudTrail

Creating a Multi-Region Application with AWS Services – Part 1, Compute and Security

Post Syndicated from Joe Chapman original https://aws.amazon.com/blogs/architecture/creating-a-multi-region-application-with-aws-services-part-1-compute-and-security/

Building a multi-Region application requires lots of preparation and work. Many AWS services have features to help you build and manage a multi-Region architecture, but identifying those capabilities across 200+ services can be overwhelming.

In this 3-part blog series, we’ll explore AWS services with features to assist you in building multi-Region applications. In Part 1, we’ll build a foundation with AWS security, networking, and compute services. In Part 2, we’ll add in data and replication strategies. Finally, in Part 3, we’ll look at the application and management layers.

Considerations before getting started

AWS Regions are built with multiple isolated and physically separate Availability Zones (AZs). This approach allows you to create highly available Well-Architected workloads that span AZs to achieve greater fault tolerance. There are three general reasons that you may need to expand beyond a single Region:

  • Expansion to a global audience: as an application grows and its user base becomes more geographically dispersed, there can be a need to reduce latencies for different parts of the world.
  • Reducing Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO) as part of a disaster recovery (DR) plan.
  • Local laws and regulations may have strict data residency and privacy requirements that must be followed.

Ensuring security, identity, and compliance

Creating a security foundation starts with proper authentication, authorization, and accounting to implement the principle of least privilege. AWS Identity and Access Management (IAM) operates in a global context by default. With IAM, you specify who can access which AWS resources and under what conditions. For workloads that use directory services, the AWS Directory Service for Microsoft Active Directory Enterprise Edition can be set up to automatically replicate directory data across Regions. This allows applications to reduce lookup latencies by using the closest directory and creates durability by spanning multiple Regions.

Applications that need to securely store, rotate, and audit secrets, such as database passwords, should use AWS Secrets Manager. It encrypts secrets with AWS Key Management Service (AWS KMS) keys and can replicate secrets to secondary Regions to ensure applications are able to obtain a secret in the closest Region.
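
For illustration, replication of an existing secret could be configured programmatically with a boto3 call along the lines of the following sketch; the secret name and Regions are placeholders, not values from this post.

import boto3

secretsmanager = boto3.client("secretsmanager", region_name="us-east-1")

# Replicate an existing secret into secondary Regions so applications there
# can read it locally instead of calling back to the primary Region.
# The secret name and Regions are illustrative placeholders.
response = secretsmanager.replicate_secret_to_regions(
    SecretId="prod/app/db-password",
    AddReplicaRegions=[{"Region": "us-west-2"}, {"Region": "eu-west-1"}],
)
for status in response["ReplicationStatus"]:
    print(status["Region"], status["Status"])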

Encrypt everything all the time

AWS KMS can be used to encrypt data at rest, and is used extensively for encryption across AWS services. By default, keys are confined to a single Region. AWS KMS multi-Region keys can be created to replicate keys to a second Region, which eliminates the need to decrypt and re-encrypt data with a different key in each Region.
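
As a minimal sketch of how this might be set up with boto3 (the Regions and description are placeholders), you create the primary key with the multi-Region flag and then replicate it:

import boto3

kms = boto3.client("kms", region_name="us-east-1")

# Create a multi-Region primary key; its key ID is prefixed with "mrk-".
primary = kms.create_key(
    Description="Example multi-Region key (placeholder)",
    MultiRegion=True,
)
key_id = primary["KeyMetadata"]["KeyId"]

# Replicate the primary key into a second Region. The replica shares the same
# key ID and key material, so data encrypted in one Region can be decrypted
# in the other without re-encryption.
kms.replicate_key(KeyId=key_id, ReplicaRegion="us-west-2")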

AWS CloudTrail logs user activity and API usage. Logs are created in each Region, but they can be centralized from multiple Regions and multiple accounts into a single Amazon Simple Storage Service (Amazon S3) bucket. As a best practice, these logs should be aggregated to an account that is only accessible to required security personnel to prevent misuse.

As your application expands to new Regions, AWS Security Hub can aggregate and link findings to a single Region to create a centralized view across accounts and Regions. These findings are continuously synced between Regions to keep you updated on global findings.
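
As an illustrative sketch, cross-Region aggregation is enabled from the Region you choose as the aggregation Region; the Region and linking mode below are assumptions for the example.

import boto3

# Call Security Hub in the Region you want to use as the aggregation Region.
securityhub = boto3.client("securityhub", region_name="us-east-1")

# Link all current and future Regions so their findings are continuously
# replicated into this aggregation Region.
aggregator = securityhub.create_finding_aggregator(RegionLinkingMode="ALL_REGIONS")
print(aggregator["FindingAggregatorArn"])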

We put these features together in Figure 1.

Figure 1. Multi-Region security, identity, and compliance services

Building a global network

For resources launched into virtual networks in different Regions, Amazon Virtual Private Cloud (Amazon VPC) allows private routing between Regions and accounts with VPC peering. These resources can communicate using private IP addresses and do not require an internet gateway, VPN, or separate network appliances. This works well for smaller networks that only require a few peering connections. However, as the number of peered connections increases, the mesh of peered connections can become difficult to manage and troubleshoot.

AWS Transit Gateway can help reduce these difficulties by creating a central transitive hub to act as a cloud router. A Transit Gateway’s routing capabilities can expand to additional Regions with Transit Gateway inter-Region peering to create a globally distributed private network.
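
For example, an inter-Region peering attachment could be requested with a call along these lines; the transit gateway IDs, account ID, and Regions are placeholders, and the peer side still has to accept the attachment before routes can be added.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Request peering between a local transit gateway and one in another
# Region (and possibly another account). All IDs are placeholders.
attachment = ec2.create_transit_gateway_peering_attachment(
    TransitGatewayId="tgw-0123456789abcdef0",
    PeerTransitGatewayId="tgw-0fedcba9876543210",
    PeerAccountId="111111111111",
    PeerRegion="us-west-2",
)
print(attachment["TransitGatewayPeeringAttachment"]["TransitGatewayAttachmentId"])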

Building a reliable, cost-effective way to route users to distributed Internet applications requires highly available and scalable Domain Name System (DNS) records. Amazon Route 53 does exactly that.

Route 53 routing policies can route traffic to a record with the lowest latency, or automatically fail over a record. If a larger failure occurs, the Route 53 Application Recovery Controller can simplify the monitoring and failover process for application failures across Regions, AZs, and on-premises.
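
As a simplified sketch, latency-based records for two Regional endpoints might be created as follows; the hosted zone ID, record name, and IP addresses are placeholders.

import boto3

route53 = boto3.client("route53")

# Two latency-based records for the same name; Route 53 answers each query
# with the record for the Region that offers the caller the lowest latency.
changes = []
for region, ip_address in [("us-east-1", "192.0.2.10"), ("eu-west-1", "192.0.2.20")]:
    changes.append({
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "app.example.com.",
            "Type": "A",
            "SetIdentifier": f"app-{region}",
            "Region": region,
            "TTL": 60,
            "ResourceRecords": [{"Value": ip_address}],
        },
    })
route53.change_resource_record_sets(
    HostedZoneId="Z0123456789EXAMPLE",
    ChangeBatch={"Comment": "Latency-based routing example", "Changes": changes},
)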

Amazon CloudFront’s content delivery network is truly global, built across 300+ points of presence (PoP) spread throughout the world. Applications that have multiple possible origins, such as across Regions, can use CloudFront origin failover to automatically fail over the origin. CloudFront’s capabilities expand beyond serving content, with the ability to run compute at the edge. CloudFront functions make it easy to run lightweight JavaScript functions, and AWS Lambda@Edge makes it easy to run Node.js and Python functions across these 300+ PoPs.

AWS Global Accelerator uses the AWS global network infrastructure to provide two static anycast IPs for your application. It automatically routes traffic to the closest Region deployment, and if a failure is detected it will automatically redirect traffic to a healthy endpoint within seconds.

Figure 2 brings these features together to create a global network across two Regions.

Figure 2. AWS VPC connectivity and content delivery

Building the compute layer

An Amazon Elastic Compute Cloud (Amazon EC2) instance is based on an Amazon Machine Image (AMI). An AMI specifies instance configurations such as the instance’s storage, launch permissions, and device mappings. When a new standard image needs to be created, EC2 Image Builder can be used to streamline copying AMIs to selected Regions.

Although EC2 instances and their associated Amazon Elastic Block Store (Amazon EBS) volumes live in a single AZ, Amazon Data Lifecycle Manager can automate the process of taking and copying EBS snapshots across Regions. This can enhance DR strategies by providing a relatively easy cold backup-and-restore option for EBS volumes.
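
Data Lifecycle Manager handles the scheduling and retention for you; the copy itself is essentially the EC2 CopySnapshot operation, sketched below with placeholder IDs and Regions.

import boto3

# CopySnapshot is called in the destination (DR) Region and pulls the
# snapshot from the source Region. Data Lifecycle Manager automates taking
# the snapshot and issuing this copy on a schedule.
ec2_dr = boto3.client("ec2", region_name="us-west-2")
copy = ec2_dr.copy_snapshot(
    SourceRegion="us-east-1",
    SourceSnapshotId="snap-0123456789abcdef0",  # placeholder snapshot ID
    Description="Cross-Region DR copy (example)",
)
print(copy["SnapshotId"])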

As an architecture expands into multiple Regions, it can become difficult to track where instances are provisioned. Amazon EC2 Global View helps solve this by providing a centralized dashboard to see Amazon EC2 resources such as instances, VPCs, subnets, security groups, and volumes in all active Regions.

Microservice-based applications that use containers benefit from quicker start-up times. Amazon Elastic Container Registry (Amazon ECR) can help ensure this happens consistently across Regions with private image replication at the registry level. An ECR private registry can be configured for either cross-Region or cross-account replication to ensure your images are ready in secondary Regions when needed.
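
A minimal sketch of enabling registry-level replication might look like the following; the destination Region and account ID are placeholders.

import boto3

ecr = boto3.client("ecr", region_name="us-east-1")

# Configure the private registry to replicate every pushed image to a
# secondary Region (and, optionally, another account's registry).
ecr.put_replication_configuration(
    replicationConfiguration={
        "rules": [
            {"destinations": [{"region": "us-west-2", "registryId": "111111111111"}]}
        ]
    }
)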

We bring these compute layer features together in Figure 3.

Figure 3. AMI and EBS snapshot copy across Regions

Summary

It’s important to create a solid foundation when architecting a multi-Region application. These foundations pave the way for you to move fast in a secure, reliable, and elastic way as you build out your application. In this post, we covered options across AWS security, networking, and compute services that have built-in functionality to take away some of the undifferentiated heavy lifting. We’ll cover data, application, and management services in future posts.

Ready to get started? We’ve chosen some AWS Solutions and AWS Blogs to help you!

Looking for more architecture content? AWS Architecture Center provides reference architecture diagrams, vetted architecture solutions, Well-Architected best practices, patterns, icons, and more!

Field Notes: Analyze Cross-Account AWS KMS Call Usage with AWS CloudTrail and Amazon Athena

Post Syndicated from Abhijit Rajeshirke original https://aws.amazon.com/blogs/architecture/field-notes-analyze-cross-account-aws-kms-call-usage-with-aws-cloudtrail-and-amazon-athena/

Businesses are expanding their footprint on Amazon Web Services (AWS) and are adopting a multi-account strategy to help isolate and manage business applications and data. In the multi-account strategy, it is common to have business applications deployed in one account accessing an Amazon Simple Storage Service (Amazon S3) encrypted bucket from another AWS account.

When an application in an AWS account uses an AWS Key Management Service (AWS KMS) key owned by a different account, it’s known as a cross-account call. For cross-account requests, AWS KMS throttles the account that makes the requests, not the account that owns the AWS KMS key. These requests count toward the request quota of the caller account. Sometimes it’s essential to identify or track cross-account AWS KMS API usage. In this blog, you will learn about use cases to track these requests and steps to identify cross-account AWS KMS calls.

To understand the problem better, consider a scenario where you have multiple AWS accounts set up in a hub and spoke configuration, as shown in the following diagram. Each account is administered by a different administrator. An Amazon S3 data lake is located in the centralized hub account. The data lake bucket is encrypted using server-side encryption with AWS KMS (SSE-KMS) with customer-managed keys. Multiple spoke accounts access datasets from this data lake bucket. When a spoke account uploads or downloads objects from the data lake, Amazon S3 makes a GenerateDataKey (for uploads) or Decrypt (for downloads) API request to AWS KMS on behalf of the spoke account. These API requests count toward the AWS KMS request quota of the spoke account.

In the following diagram (Figure 1), spoke accounts B, C, and D are uploading/downloading files from the encrypted data lake located in hub account A. The related AWS KMS API calls count toward the spoke accounts’ request quotas even though encryption/decryption happens at the data lake S3 bucket. For example, the centralized Amazon S3 data lake is located in hub account A with account ID 111111111111. The Amazon S3 data lake bucket is encrypted using the AWS KMS key with an ARN ending in 3aa3c82a2174.

Spoke account B with account ID 222222222222 is downloading 1,811 files and uploading 749 files from the centralized data lake. A total of 2,560 AWS KMS API calls will be counted against the request quota for account B.

Spoke account C with account ID 333333333333 is downloading 997 files and uploading 271 files from the centralized data lake. A total of 1,268 AWS KMS API calls will be counted against the request quota for account C.

Spoke account D with account ID 444444444444 is downloading 638 files and uploading 306 files from the centralized data lake. A total of 944 AWS KMS API calls will be counted against the request quota for account D.

The spoke and hub accounts are owned by separate business units and administered by different account administrators.

Note: When you configure your bucket to use an S3 Bucket Key for SSE-KMS, you may not see a separate Decrypt or GenerateDataKey request for each file upload or download.

Figure 1: Architecture outlining the hub and spoke accounts

This architecture design works for the following three use cases.

Use case #1:

A spoke account administrator wants to track the individual AWS KMS key-wise encryption/decryption costs using AWS Cost Explorer and cost allocation tags. Tracking costs this way works well for the AWS KMS API calls made within the same spoke account, and the related costs will be displayed under the appropriate cost allocation tags. However, for cross-account AWS KMS API calls, cost allocation tags are not visible outside of the hub account, and the costs will be displayed under the cost allocation tag “None.” Analyzing cross-account AWS KMS API calls helps the administrator determine the approximate percentage usage for each cross-account KMS key.

Use case # 2:

The spoke account has multiple applications, and each application has a unique AWS Identity and Access Management (IAM) principal. The spoke account administrator would like to track encryption/decryption usage. Identifying cross-account calls by IAM principal helps the administrator determine the approximate percentage usage by each IAM principal, and therefore by each application.

Use case # 3:

The spoke account administrator wants to understand how much of the AWS KMS request quota is consumed by calls to specific cross-account KMS keys.

Solution Overview

Let’s discuss how we can track cross-account AWS KMS calls using AWS CloudTrail and Amazon Athena. For this solution, we will reuse your existing CloudTrail trail or create a new trail in the Region where the hub account Amazon S3 data lake is located. As shown in the following diagram, we will use Athena to query the CloudTrail data to identify cross-account AWS KMS calls used for S3 encryption/decryption.

Figure 2: Architecture outlining the CloudTrail and Athena Solution

Prerequisites

For this walkthrough, you should have the following prerequisites:

  • AWS accounts (one for hub and at least one for spoke)
  • An S3 bucket encrypted with SSE-KMS using AWS KMS customer managed keys

Walkthrough

Step 1: Activate AWS CloudTrail for the hub account

CloudTrail is a service that enables governance, compliance, operational auditing, and risk auditing of your AWS account. With CloudTrail, you can log, continuously monitor, and retain account activity across your AWS accounts. If you have already activated CloudTrail, you can reuse the existing trail. If you haven’t, you can activate it using the steps in this tutorial. For the proposed solution, you must enable CloudTrail for management events only; you don’t require CloudTrail data events or Insights events. Also, be aware that you need only a single trail, and creating duplicate trails can increase the service cost.

Note: You can analyze the data in Athena only when the CloudTrail data is available. Any access requests made prior to enabling CloudTrail cannot be analyzed. It takes up to 15 minutes for events to be delivered to CloudTrail, and up to 5 minutes for CloudTrail to write the log files to S3.

Step 2: Create Amazon Athena table to query the CloudTrail data

Amazon Athena is an interactive query service that analyzes data in Amazon S3 using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for the queries that you run. Create an Athena table in any database or default database in a Region where your hub account S3 data lake bucket resides.

If you are using Athena for the first time, follow these steps to create a database. Once the database is created, you need to create an Athena table. Follow these steps to create the table:

  1. Open the Athena built-in query editor.
  2. Copy the following query.
  3. Modify it as suggested.
  4. Run the query.

In the LOCATION and storage.location.template clauses, replace bucket with your CloudTrail bucket name, replace account-id with the hub account’s ID, and replace aws-region with the Region where the data lake S3 bucket is located. For projection.timestamp.range, replace 2020/01/01 with the start date you want to use.

After the query runs successfully, you will see the cloudtrail_logs_region table created in Athena.

CREATE EXTERNAL TABLE cloudtrail_logs_region(
    eventVersion STRING,
    userIdentity STRUCT<
        type: STRING,
        principalId: STRING,
        arn: STRING,
        accountId: STRING,
        invokedBy: STRING,
        accessKeyId: STRING,
        userName: STRING,
        sessionContext: STRUCT<
            attributes: STRUCT<
                mfaAuthenticated: STRING,
                creationDate: STRING>,
            sessionIssuer: STRUCT<
                type: STRING,
                principalId: STRING,
                arn: STRING,
                accountId: STRING,
                userName: STRING>>>,
    eventTime STRING,
    eventSource STRING,
    eventName STRING,
    awsRegion STRING,
    sourceIpAddress STRING,
    userAgent STRING,
    errorCode STRING,
    errorMessage STRING,
    requestParameters STRING,
    responseElements STRING,
    additionalEventData STRING,
    requestId STRING,
    eventId STRING,
    readOnly STRING,
    resources ARRAY<STRUCT<
        arn: STRING,
        accountId: STRING,
        type: STRING>>,
    eventType STRING,
    apiVersion STRING,
    recipientAccountId STRING,
    serviceEventDetails STRING,
    sharedEventID STRING,
    vpcEndpointId STRING
)
PARTITIONED BY (
    `timestamp` string)
ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailSerde'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://bucket/AWSLogs/account-id/CloudTrail/aws-region'
TBLPROPERTIES (
  'projection.enabled'='true',
  'projection.timestamp.format'='yyyy/MM/dd',
  'projection.timestamp.interval'='1',
  'projection.timestamp.interval.unit'='DAYS',
  'projection.timestamp.range'='2020/01/01,NOW',
  'projection.timestamp.type'='date',
  'storage.location.template'='s3://bucket/AWSLogs/account-id/CloudTrail/aws-region/${timestamp}')

Athena screenshot

Step 3: Identify cross account Amazon S3 encryption/decryption calls

Once the Athena table is created, you can run the following query to find cross-account AWS KMS calls made for S3 encryption/decryption.

Query:

SELECT useridentity.accountid as requestor_account_id,
       resources[1].accountid as owner_account_id,
       resources[1].arn as key_arn,
       count(resources) as count
FROM CloudTrail_logs_us_east_2
WHERE eventsource = 'kms.amazonaws.com'
  AND timestamp between '2021/04/01' and '2021/08/30'
  AND eventname in ('Decrypt', 'Encrypt', 'GenerateDataKey')
  AND useridentity.accountid != resources[1].accountid
  AND json_extract(json_extract(requestparameters, '$.encryptionContext'), '$.aws:s3:arn') is not null
GROUP BY useridentity.accountid, resources[1].accountid, resources[1].arn
ORDER BY key_arn, count desc

Result:

The result displays the cross-account AWS KMS calls made for S3 encryption/decryption (that is, where the caller account is not the key owner account) for the time period between April 1, 2021, and August 30, 2021.

Athena screenshot 2

The preceding example shows cross-account AWS KMS API calls generated by downloading/uploading files from the centralized Amazon S3 data lake located in account A (111111111111) from spoke accounts B (222222222222), C (333333333333), and D (444444444444).

These AWS KMS API calls count toward the request quotas of the caller (spoke) accounts even though the key owner is the hub account.

For example:

  • 2,560 AWS KMS API calls count toward the request quota of account B.
  • 1,644 AWS KMS API calls count toward the request quota of account C.
  • 944 AWS KMS API calls count toward the request quota of account D.

Step 4: Identify IAM principal-wise cross-account Amazon S3 encryption/decryption calls.

To identify cross-account Amazon S3 encryption/decryption calls by IAM principal, you can run the following query.

Query:

SELECT useridentity.accountid as requestor_account_id,
useridentity.principalid as requestor_principal,
resources[1].accountid as owner_account_id,
resources[1].arn as key_arn,
count(resources) as count
FROM CloudTrail_logs_us_east_2
WHERE eventsource='kms.amazonaws.com'
AND timestamp between '2021/04/01' and '2021/08/30'
AND eventname in ('Decrypt','Encrypt','GenerateDataKey')
AND useridentity.accountid!= resources[1].accountid
AND json_extract(json_extract(requestparameters , '$.encryptionContext'),'$.aws:s3:arn') is not null
GROUP BY useridentity.accountid,useridentity.principalid,resources[1].accountid,resources[1].arn
ORDER BY requestor_account_id,count desc

Result:

Athena screenshot 3

The preceding result shows cross-account AWS KMS API calls made between hub and spoke accounts, broken down by AWS Identity and Access Management (IAM) principal. For example, account B (222222222222) has two applications, configured with IAM principal IDs ending in 4C5VIMGI2 and 4YFPRTQMP, that access the centralized S3 bucket located in hub account A (111111111111).

For the time period between ‘2021/04/01’ and ‘2021/08/30’, the application configured with the IAM principal ending in 4C5VIMGI2 made 1,622 cross-account AWS KMS API calls. During the same time period, the application configured with the IAM principal ending in 4YFPRTQMP made 936 cross-account AWS KMS API calls.

We can further filter the results to only the KMS key ARN ending in 3aa3c82a2174 to get the application-wise percentage of AWS KMS API calls made to the Amazon S3 centralized data lake from all the spoke accounts.

account ID table

Note: We assume that each application is configured with a unique IAM principal.

Step 5: Identify cross-account Amazon S3 encryption/decryption calls by event.

Query:

SELECT useridentity.accountid as requestor_account_id,
       resources[1].accountid as owner_account_id,
       resources[1].arn as key_arn,
       eventname as eventname,
       count(resources) as count
FROM CloudTrail_logs_us_east_2
WHERE eventsource = 'kms.amazonaws.com'
  AND timestamp between '2021/04/01' and '2021/08/30'
  AND useridentity.accountid != resources[1].accountid
  AND json_extract(json_extract(requestparameters, '$.encryptionContext'), '$.aws:s3:arn') is not null
GROUP BY useridentity.accountid, resources[1].accountid, resources[1].arn, eventname
ORDER BY requestor_account_id, count desc

Result:

Athena screenshot 4

Amazon S3 makes Decrypt API requests when you download files and GenerateDataKey API requests when you upload files to an encrypted S3 bucket. The result shows that:

  • Spoke account B (222222222222) made 1,811 Decrypt API requests to download 1,811 files and 749 GenerateDataKey API requests to upload 749 files.
  • Spoke account C (333333333333) made 1,373 Decrypt API requests to download 1,373 files and 271 GenerateDataKey API requests to upload 271 files.
  • Spoke account D (444444444444) made 638 Decrypt API requests to download 638 files and 306 GenerateDataKey API requests to upload 306 files.

Note: When you configure your bucket to use an S3 Bucket Key for SSE-KMS, you may not see a separate Decrypt or GenerateDataKey request for each file upload or download.

Step 6: Identify all the AWS KMS Calls.

To analyze all the AWS KMS API calls made in the hub account, run the following query.

Query:

SELECT useridentity.accountid as requestor_account_id,
       resources[1].accountid as owner_account_id,
       resources[1].arn as key_arn,
       count(resources) as count
FROM CloudTrail_logs_us_east_2
WHERE eventsource = 'kms.amazonaws.com'
  AND timestamp between '2021/04/01' and '2021/08/30'
GROUP BY useridentity.accountid, resources[1].accountid, resources[1].arn
ORDER BY requestor_account_id, count desc

Result:

Athena screenshot 5

The results show all the AWS KMS API calls made in the hub account, both within the account and across accounts. From this result, we can see that for the centralized S3 data lake (KMS key ARN ending in 3aa3c82a2174), the majority of the calls are cross-account AWS KMS API calls, and only 303 calls were made within the account. You can do further analysis by refining the Amazon Athena queries based on your needs.

Cleaning up

To avoid incurring future charges, delete the resources that are no longer required.

Step 1: Delete the CloudTrail created in hub account

If you created a trail specifically for this solution, you can delete it by following the instructions in this user guide.

Step 2: Drop the Amazon Athena table

Log in to the Amazon Athena console and run the following drop table query:

DROP TABLE <CloudTrail_logs_aws_region_1>

Conclusion

Tracking use of cross-account AWS KMS APIs can be challenging in a multi-account scenario. In this blog, we learned how to use AWS CloudTrail and Amazon Athena to analyze AWS KMS API usage. In a hub and spoke account model, cross-account AWS KMS API requests count toward the spoke account’s quota when the spoke account accesses an SSE-KMS encrypted S3 bucket in the hub account. You learned how to analyze that cross-account AWS KMS API usage using AWS CloudTrail and Amazon Athena. Finally, we learned how to identify all the AWS KMS API calls within an account for a period of time and analyze AWS KMS API traffic both within and across accounts. You can repeat the process and aggregate the data across Regions.

Additional Reading:

Manage your AWS KMS API request rates using Service Quotas and Amazon CloudWatch

Why did my CloudTrail cost and usage increase unexpectedly?

User Guide: Managing CloudTrail Costs

Detect Adversary Behavior in Milliseconds with CrowdStrike and Amazon EventBridge

Post Syndicated from Joby Bett original https://aws.amazon.com/blogs/architecture/detect-adversary-behavior-in-seconds-with-crowdstrike-and-amazon-eventbridge/

By integrating Amazon EventBridge with Falcon Horizon, CrowdStrike has developed a real-time, cloud-based solution that allows you to detect threats in less than a second. This solution uses AWS CloudTrail and EventBridge. CloudTrail allows governance, compliance, operational auditing, and risk auditing of your AWS account. EventBridge is a serverless event bus that makes it easier to build event-driven applications at scale.

In this blog post, we’ll cover the challenges presented by using traditional log file-based security monitoring. We will also discuss how CrowdStrike used EventBridge to create an innovative, real-time cloud security solution that enables high-speed, event-driven alerts that detect malicious actors in milliseconds.

Challenges of log file-based security monitoring

Being able to detect malicious actors in your environment is necessary to stay secure in the cloud. With the growing volume, velocity, and variety of cloud logs, log file-based monitoring makes it difficult to reveal adverse behaviors in time to stop breaches.

When an attack is in progress, a security operations center (SOC) analyst has an average of one minute to detect the threat, ten minutes to understand it, and one hour to contain it. If you cannot meet this 1/10/60 minute rule, you may have a costly breach that may move laterally and explode exponentially across the cloud estate.

Let’s look at a real-life scenario. When a malicious actor attempts a ransom attack that targets high-value data in an Amazon Simple Storage Service (Amazon S3) bucket, it can involve activities in various parts of the cloud services in a brief time window.

These example activities can involve:

  • AWS Identity and Access Management (IAM): account enumeration, disabling multi-factor authentication (MFA), account hijacking, privilege escalation, etc.
  • Amazon Elastic Compute Cloud (Amazon EC2): instance profile privilege escalation, file exchange tool installs, etc.
  • Amazon S3: bucket and object enumeration; impair bucket encryption and versioning; bucket policy manipulation; getObject, putObject, and deleteObject APIs, etc.

With siloed log file-based monitoring, detecting, understanding, and containing a ransom attack while still meeting the 1/10/60 rule is difficult. This is because log files are written in batches, and files are typically only created every 5 minutes. Once the log file is written, it still needs to be fetched and processed. This means that you lose the ability to dynamically correlate disparate activities.

To summarize, top-level challenges of log file-based monitoring are:

  1. Lag time between the breach and the detection
  2. Inability to correlate disparate activities to reveal sophisticated attack patterns
  3. Frequent false positive alarms that obscure true positives
  4. High operational cost of log file synchronizations and reprocessing
  5. Log analysis tool maintenance for fast growing log volume

Security and compliance is a shared responsibility between AWS and the customer. We protect the infrastructure that runs all of the services offered in the AWS Cloud. For abstracted services, such as  Amazon S3, we operate the infrastructure layer, the operating system, and platforms, and customers access the endpoints to store and retrieve data.

You are responsible for managing your data (including encryption options), classifying assets, and using IAM tools to apply the appropriate permissions.

Indicators of attack by CrowdStrike with Amazon EventBridge

In real-world cloud breach scenarios, timeliness of observation, detection, and remediation is critical. CrowdStrike Falcon Horizon IOA is built on an event-driven architecture based on EventBridge and operates at a velocity that can outpace attackers.

CrowdStrike Falcon Horizon IOA performs the following core actions:

  • Observe: EventBridge streams CloudTrail log files across accounts to the CrowdStrike platform as activity occurs. Parallelism is enabled via event bus rules, which enables CrowdStrike to avoid the five-minute lag in fetching the log files and dynamically correlate disparate activities. The CrowdStrike platform observes end-to-end activities from AWS services and infrastructure hosted in the accounts protected by CrowdStrike.
  • Detect: Falcon Horizon invokes indicators of attack (IOA) detection algorithms that reveal adversarial or anomalous activities from the log file streams. It correlates new and historical events in real time while enriching the events with CrowdStrike threat intelligence data. Each IOA is prioritized with the likelihood of activity being malicious via scoring and mapped to the MITRE ATT&CK framework.
  • Remediate: The detected IOA is presented with remediation steps. Depending on the score, applying the remediations quickly can be critical before the attack spreads.
  • Prevent: Unremediated insecure configurations are revealed via indicators of misconfiguration (IOM) in Falcon Horizon. Applying the remediation steps from IOM can prevent future breaches.

Key differentiators of IOA from Falcon Horizon are:

  • Observability of wider attack surfaces with heterogeneous event sources
  • Detection of sophisticated tactics, techniques, and procedures (TTPs) with dynamic event correlation
  • Event enrichment with threat intelligence that aids prioritization and reduces alert fatigue
  • Low latency between malicious activity occurrence and corresponding detection
  • Insight into attacks for each adversarial event from MITRE ATT&CK framework

High-level architecture

Event-driven architectures provide advantages for integrating varied systems over legacy log file-based approaches. For securing cloud attack surfaces against the ever-evolving TTPs, a robust event-driven architecture at scale is a key differentiator.

CrowdStrike maximizes the advantages of event-driven architecture by integrating with EventBridge, as shown in Figure 1. EventBridge allows observing CloudTrail logs in event streams. It also simplifies log centralization from a number of accounts with its direct source-to-target integration across accounts, as follows:

  1. CrowdStrike hosts an EventBridge with central event buses that consume the stream of CloudTrail log events from a multitude of customer AWS accounts.
  2. Within customer accounts, EventBridge rules listen to the local CloudTrail and stream each activity as an event to the centralized EventBridge hosted by CrowdStrike.
  3. CrowdStrike’s event-driven platform detects adversarial behaviors from the event streams in real time. The detection is performed against incoming events in conjunction with historical events. The context that comes from connecting new and historical events minimizes false positives and improves alert efficacy.
  4. Events are enriched with CrowdStrike threat intelligence data that provides additional insight of the attack to SOC analysts and incident responders.

Figure 1. CrowdStrike Falcon Horizon IOA architecture

As data is received by the centralized EventBridge, CrowdStrike relies on unique customer ID and AWS Region in each event to provide integrity and isolation.

EventBridge allows relatively hassle-free customer onboarding by using cross-account rules to transfer customer CloudTrail data into one common event bus, which can then be used to filter and selectively forward the data into the Falcon Horizon platform for analysis and mitigation.
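
A rough sketch of what such a cross-account rule might look like in a source account follows; the rule name, central event bus ARN, and IAM role ARN are placeholders, and the receiving bus would also need a resource policy that allows the source account to put events.

import json
import boto3

events = boto3.client("events", region_name="us-east-1")

# Rule on the source account's default bus that matches CloudTrail API
# activity as it happens.
events.put_rule(
    Name="forward-cloudtrail-activity",
    EventPattern=json.dumps({"detail-type": ["AWS API Call via CloudTrail"]}),
    State="ENABLED",
)

# Target the rule at a central event bus in another account; the role must
# allow events:PutEvents on that bus. Both ARNs are placeholders.
events.put_targets(
    Rule="forward-cloudtrail-activity",
    Targets=[{
        "Id": "central-bus",
        "Arn": "arn:aws:events:us-east-1:999999999999:event-bus/central-security-bus",
        "RoleArn": "arn:aws:iam::222222222222:role/eventbridge-forwarding-role",
    }],
)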

Conclusion

As your organization’s cloud footprint grows, visibility into end-to-end activities in a timely manner is critical for maintaining a safe environment for your business to operate. EventBridge allows event-driven monitoring of CloudTrail logs at scale.

CrowdStrike Falcon Horizon IOA, powered by EventBridge, observes end-to-end cloud activities at high speeds at scale. Paired with targeted detection algorithms from in-house threat detection experts and threat intelligence data, Falcon Horizon IOA combats emerging threats against the cloud control plane with its cutting-edge event-driven architecture.

Related information

How to automate incident response to security events with AWS Systems Manager Incident Manager

Post Syndicated from Sumit Patel original https://aws.amazon.com/blogs/security/how-to-automate-incident-response-to-security-events-with-aws-systems-manager-incident-manager/

Incident response is a core security capability for organizations to develop, and a core element in the AWS Cloud Adoption Framework (AWS CAF). Responding to security incidents quickly is important to minimize their impacts. Automating incident response helps you scale your capabilities, rapidly reduce the scope of compromised resources, and reduce repetitive work by your security team.

In this post, I show you how to use Incident Manager, a capability of AWS Systems Manager, to build an effective automated incident management and response solution to security events.

You’ll walk through three common security-related events and how you can use Incident Manager to automate your response.

  • AWS account root user activity: An Amazon Web Services (AWS) account root user has full access to all your resources for all AWS services, including billing information. It’s therefore essential to adhere to the best practice of using the root user only to create your first IAM user, securely locking away the root user credentials, and using them to perform only a few account and service management tasks. It is also critical to be aware when root user activity occurs in your AWS account.
  • Amazon GuardDuty high severity findings: Amazon GuardDuty is a threat detection service that continuously monitors for malicious or unauthorized behavior to help protect your AWS accounts and workloads. In this blog post, you’ll learn how to initiate an incident response plan whenever a high severity finding is discovered.
  • AWS Config rule change and S3 bucket allowing public access: AWS Config enables continuous monitoring of your AWS resources, making it simple to assess, audit, and record resource configurations and changes. You will use AWS Config to monitor your Amazon Simple Storage Service (S3) bucket ACLs and policies for settings that allow public read or public write access.

Prerequisites

If this is your first time using Incident Manager, follow the initial onboarding steps in Getting prepared with Incident Manager.

Incident Manager can start managing incidents automatically using Amazon CloudWatch or Amazon EventBridge. For the solution in this blog post, you will use EventBridge to capture events and start an incident.

To complete the steps in this walkthrough, you need the following:

Create an Incident Manager response plan

A response plan ties together the contacts, escalation plan, and runbook. When an incident occurs, a response plan defines who to engage, how to engage, which runbook to initiate, and which metrics to monitor. By creating a well-defined response plan, you can save your security team time down the road.

Add contacts

Your contacts should include everyone who might be involved in the incident. Follow these steps to add a contact.

To add contacts

  1. Open the AWS Management Console, and then go to Systems Manager within the console, expand Operations Management, and then expand Incident Manager.
  2. Choose Contacts, and then choose Create contact.

    Figure 1: Adding contact details

  3. On Contact information, enter names and define contact channels for your contacts.
  4. Under Contact channel, you can select Email, SMS, or Voice. You can also add multiple contact channels.
  5. In Engagement plan, specify how fast to engage your responders. In the example illustrated below, the incident responder will be engaged through email immediately (0 minutes) when an incident is detected and then through SMS 10 minutes into an incident. Complete the fields and then choose Create.

    Figure 2: Engagement plan

Create a response plan

Once you’ve created your contacts, you can create a response plan to define how to respond to incidents. Refer to the Best Practices for Response Plans.

Note: (Optional) You can also create an escalation plan that lets you further define the escalation path for your contacts. You can learn more in Create an escalation plan.

To create a response plan

  1. Open the Incident Manager console, and choose Response plans in the left navigation pane.
  2. Choose Create response plan.
  3. Enter a unique and identifiable name for your response plan.
  4. Enter an incident title. The incident title helps to identify an incident on the incidents home page.
  5. Select an appropriate Impact based on the potential scope of the incident.

    Figure 3: Selecting your impact level

  6. (Optional) Choose a chat channel for the incident responders to interact in during an incident. For more information about chat channels, see Chat channels.
  7. (Optional) For Engagement, you can choose any number of contacts and escalation plans. For this solution, select the security team responder that you created earlier as one of your contacts.

    Figure 4: Adding engagements

  8. (Optional) You can also create a runbook that can drive the incident mitigation and response. For further information, refer to Runbooks and automation.
  9. Under Execution permissions, choose Create an IAM role using a template. Under Role name, select the IAM role you created in the prerequisites that allows Incident Manager to run SSM automation documents, and then choose Create response plan.

Monitor AWS account root activity

When you first create an AWS account, you begin with a single sign-in identity that has complete access to all AWS services and resources in the account. This identity is called the root user and is accessed by signing in with the email address and password that you used to create the account.

An AWS account root user has full access to all your resources for all AWS services, including billing information. It is critical to prevent root user access from unauthorized use and to be aware whenever root user activity occurs in your AWS account. For more information about AWS recommendations, see Security best practices in IAM.

To be certain that all root user activity is authorized and expected, it’s important to monitor root API calls to a given AWS account and to be notified when root user activity is detected.

Create an EventBridge rule

Create and validate an EventBridge rule to capture AWS account root activity.

To create an EventBridge rule

  1. Open the EventBridge console.
  2. In the navigation pane, choose Rules, and then choose Create rule.
  3. Enter a name and description for the rule.
  4. For Define pattern, choose Event pattern.
  5. Choose Custom pattern.
  6. Enter the following event pattern:
    {
      "detail-type": [
        "AWS API Call via CloudTrail",
        "AWS Console Sign In via CloudTrail"
      ],
      "detail": {
        "userIdentity": {
          "type": [
            "Root"
          ]
        }
      }
    }
    

  7. For Select targets, choose Incident Manager response plan.
  8. For Response plan, choose SecurityEventResponsePlan, which you created when you set up Incident Manager.
  9. To create an IAM role automatically, choose Create a new role for this specific resource. To use an existing IAM role, choose Use existing role.
  10. (Optional) Enter one or more tags for the rule.
  11. Choose Create.

To validate the rule

  1. Sign in using root credentials.
  2. This console login activity by a root user should invoke the Incident Manager response plan and show an open incident, as illustrated below. The respective contact channels that you defined earlier in your engagement plan will be engaged.

Figure 5: Incident Manager open incidents

Watch for GuardDuty high severity findings

GuardDuty is a monitoring service that analyzes AWS CloudTrail management and Amazon S3 data events, Amazon Virtual Private Cloud (Amazon VPC) flow logs, and Amazon Route 53 DNS logs to generate security findings for your account. Once GuardDuty is enabled, it immediately starts monitoring your environment.

GuardDuty integrates with EventBridge, which can be used to send findings data to other applications and services for processing. With EventBridge, you can use GuardDuty findings to invoke automatic responses to your findings by connecting finding events to targets such as Incident Manager response plan.

Create an EventBridge rule

You’ll use an EventBridge rule to capture GuardDuty high severity findings.

To create an EventBridge rule

  1. Open the EventBridge console.
  2. In the navigation pane, select Rules, and then choose Create rule.
  3. Enter a name and description for the rule.
  4. For Define pattern, choose Event pattern.
  5. Choose Custom pattern.
  6. Enter the following event pattern, which filters on GuardDuty high severity findings:
    {
      "source": ["aws.guardduty"],
      "detail-type": ["GuardDuty Finding"],
      "detail": {
        "severity": [
          7.0,
          7.1,
          7.2,
          7.3,
          7.4,
          7.5,
          7.6,
          7.7,
          7.8,
          7.9,
          8,
          8.0,
          8.1,
          8.2,
          8.3,
          8.4,
          8.5,
          8.6,
          8.7,
          8.8,
          8.9
        ]
      }
    } 
    

  7. For Select targets, choose Incident Manager response plan.
  8. For Response plan, select SecurityEventResponsePlan, which you created when you set up Incident Manager.
  9. To create an IAM role automatically, choose Create a new role for this specific resource. To use an IAM role that you created before, choose Use existing role.
  10. (Optional) Enter one or more tags for the rule.
  11. Choose Create.

To validate the rule

To test and validate whether the above rule is now functional, you can generate sample findings within the GuardDuty console.

  1. Open the GuardDuty console.
  2. In the navigation pane, choose Settings.
  3. On the Settings page, under Sample findings, choose Generate sample findings.
  4. In the navigation pane, choose Findings. The sample findings are displayed on the Current findings page with the prefix [SAMPLE].

Once you have generated sample findings, your Incident Manager response plan will be invoked almost immediately and the engagement plan with your contacts will begin.

You can select an open incident in the Incident Manager console to see additional details from the GuardDuty finding. Figure 6 shows a high severity finding.

Figure 6: Incident Manager open incident for GuardDuty high severity finding

Monitor S3 bucket settings for public access

AWS Config enables continuous monitoring of your AWS resources, making it easier to assess, audit, and record resource configurations and changes. AWS Config does this through rules that define the desired configuration state of your AWS resources. AWS Config provides a number of AWS managed rules that address a wide range of security concerns such as checking that your Amazon Elastic Block Store (Amazon EBS) volumes are encrypted, your resources are tagged appropriately, and multi-factor authentication (MFA) is enabled for root accounts.

Set up AWS Config and EventBridge

You will use AWS Config to monitor your S3 bucket ACLs and policies for violations that could allow public read or public write access. If AWS Config finds a policy violation, it will initiate an Amazon EventBridge rule to invoke your Incident Manager response plan.

To create the AWS Config rule to capture S3 bucket public access

  1. Sign in to the AWS Config console.
  2. If this is your first time in the AWS Config console, refer to the Getting Started guide for more information.
  3. Select Rules from the menu and choose Add Rule.
  4. On the AWS Config rules page, enter S3 in the search box and select the s3-bucket-public-read-prohibited and s3-bucket-public-write-prohibited rules, and then choose Next.

    Figure 7: AWS Config rules

  5. Leave the Configure rules page as default and select Next.
  6. On the Review page, select Add Rule. AWS Config is now analyzing your S3 buckets, capturing their current configurations, and evaluating the configurations against the rules you selected.

To create the EventBridge rule

  1. Open the Amazon EventBridge console
  2. In the navigation pane, choose Rules, and then choose Create rule.
  3. Enter a name and description for the rule.
  4. For Define pattern, choose Event pattern.
  5. Choose Custom pattern.
  6. Enter the following event pattern, which filters on the AWS Config rule s3-bucket-public-write-prohibited being noncompliant:
    {
      "source": ["aws.config"],
      "detail-type": ["Config Rules Compliance Change"],
      "detail": {
        "messageType": ["ComplianceChangeNotification"],
        "configRuleName": ["s3-bucket-public-write-prohibited", ""],
        "newEvaluationResult": {
          "complianceType": [
            "NON_COMPLIANT"
          ]
        }
      }
    }
    

  7. For Select targets, choose Incident Manager response plan.
  8. For Response plan, choose SecurityEventResponsePlan, which you created earlier when setting up Incident Manager.
  9. To create an IAM role automatically, choose Create a new role for this specific resource. To use an existing IAM role, choose Use existing role.
  10. (Optional) Enter one or more tags for the rule.
  11. Choose Create.

To validate the rule

  1. Create a compliant test S3 bucket with no public read or write access through either an ACL or a policy.
  2. Change the ACL of the bucket to allow public listing of objects so that the bucket is non-compliant.

    Figure 8: Amazon S3 console

  3. After a few minutes, you should see the AWS Config rule report the bucket as noncompliant, which invokes the EventBridge rule and therefore your Incident Manager response plan.

Summary

In this post, I showed you how to use Incident Manager to monitor for security events and invoke a response plan via Amazon EventBridge for three scenarios: AWS account root user activity captured by AWS CloudTrail, Amazon GuardDuty high severity findings, and an AWS Config rule detecting an S3 bucket that allows public write access. I demonstrated how you can create an incident management and response plan so that you use the power of the cloud to build automations that respond to and mitigate security incidents in a timely manner. To learn more about Incident Manager, see What Is AWS Systems Manager Incident Manager in the AWS documentation.

If you have feedback about this post, submit comments in the comments section below. If you have questions about this post, start a new thread on the Systems Manager forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Sumit Patel

As a Senior Solutions Architect at AWS, Sumit works with large enterprise customers helping them create innovative solutions to address their cloud challenges. Sumit uses his more than 15 years of enterprise experience to help customers navigate their cloud transformation journey and shape the right dynamics between technology and business.

Using Cloud Fitness Functions to Drive Evolutionary Architecture

Post Syndicated from Hauke Juhls original https://aws.amazon.com/blogs/architecture/using-cloud-fitness-functions-to-drive-evolutionary-architecture/

“It is not the strongest of the species that survives, nor the most intelligent. It is the one that is most adaptable to change.” – often attributed to Charles Darwin

One common strategy for businesses that operate in dynamic market conditions (and thus need to continuously correct their course) is to aim for smaller, independent development teams. Microservices and two-pizza teams at Amazon are prominent examples of this strategy. But having smaller units is not the only success factor: to reduce organizational bottlenecks and make high-quality decisions quickly, these two-pizza teams need to be autonomous in most of their decision making.

Architects can no longer rely on static upfront design to meet the change rate required to be successful in such an environment.

This blog shows enterprise architects a mechanism to align decentralized architectural decision making with overall architecture goals.

Gathering data from your fitness functions

“Evolutionary architecture” was coined by Neal Ford and his colleagues from AWS Partner ThoughtWorks in their work on Building Evolutionary Architectures. It is defined as “supporting guided, incremental change as a first principle across multiple dimensions.”

Fitness functions help you obtain the necessary data to allow for the planned evolution of your architecture. They set measurable values to assess how close your solution is to achieving your set goals.

Fitness functions can and should be adapted as the architecture evolves to guide a desired change process. This provides architects with a tool to guide their teams while maintaining team autonomy.

Example of a regression fitness function in action

You’ve identified shorter time-to-market as a key non-functional requirement, and you want to lower the risk of regressions and rollbacks after deployments. So, you and your team write automated test cases. To ensure that you have a good set of test cases in place, you measure test coverage, which is the percentage of code that is tested automatically. This steers the team toward writing tests that mitigate the risk of regressions, resulting in fewer rollbacks and a shorter time to market.

Fitness functions like this work best when they’re as automated as possible. But how do you acquire the necessary data points to use this mechanism outside of software architecture? We’ll show you how in the following sections.

AWS Cloud services with built-in fitness functions

AWS Cloud services are highly standardized, fully automated via API operations, and are built with observability in mind. This allows you to generate measurements for fitness functions automatically for areas such as availability, responsiveness, and security.

To start building your evolutionary architecture with fitness functions, use something that can be easily measured. AWS has services that can be used as inputs to fitness functions, including:

  • Amazon CloudWatch aggregates logs and metrics to check for availability, responsiveness, and reliability fitness functions.
  • AWS Security Hub provides a comprehensive view of your security alerts and security posture across your AWS accounts. Security architects could, for example, define a fitness function that requires the number of critical and high findings to be zero. Teams would then be guided toward reducing the number of these findings, resulting in better security (see the sketch after this list).
  • AWS Cost Explorer ensures your costs stay in line with value generated.
  • AWS Well-Architected Tool evaluates teams’ architectures in a consistent and repeatable way. The number of high-risk issues identified in a review acts as your fitness function, which can be queried using the API. To improve your architecture based on the results, review the Establishing Feedback Loops Based on the AWS Well-Architected Framework Review blog post.
  • Amazon SageMaker Model Monitor continuously monitors the quality of SageMaker machine learning models in production. Detecting deviations early allows you to take corrective actions like retraining models, auditing upstream systems, or fixing quality issues.
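
As a rough sketch of the Security Hub fitness function mentioned above, the following script counts active critical and high findings so the result can feed the measurement; the Region and filter values are assumptions for illustration.

import boto3

securityhub = boto3.client("securityhub", region_name="us-east-1")

# Count active critical and high findings; the fitness function target is zero.
filters = {
    "SeverityLabel": [
        {"Value": "CRITICAL", "Comparison": "EQUALS"},
        {"Value": "HIGH", "Comparison": "EQUALS"},
    ],
    "RecordState": [{"Value": "ACTIVE", "Comparison": "EQUALS"}],
}
finding_count = 0
for page in securityhub.get_paginator("get_findings").paginate(Filters=filters):
    finding_count += len(page["Findings"])
print(f"Active critical/high Security Hub findings: {finding_count}")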

Using the observability that the cloud provides

Fitness functions can be derived by evaluating the AWS account activity such as configuration changes. AWS CloudTrail is useful for this. It records account activity and service events from most AWS services, which can then be analyzed with Amazon Athena.

Figure 1. Fitness functions provide feedback to engineers via metrics

Example of a cloud fitness function in action

In this example, we implement a fitness function that monitors the operability of your system.

You have had certain outages due to manual tasks in operations, and you have anecdotal evidence that engineers are spending time on manual work during application rollouts. To improve operations, you want to reduce manual interactions via the shell in favor of automation. First, you prevent direct secure shell (SSH) access by blocking SSH traffic via the managed AWS Config rule restricted-ssh. Second, you make use of AWS Systems Manager Session Manager, which provides a secure and auditable way to access Amazon Elastic Compute Cloud (Amazon EC2) instances.

By counting the logged API events in CloudTrail, you can measure the number of shell sessions, as shown in this sample Athena query:

SELECT count(*),
       DATE(from_iso8601_timestamp(eventTime)),
       userIdentity.type,
       eventSource,
       eventName
FROM "cloudtrail_logs_partition_projection"
WHERE readonly = 'false'
  AND eventsource = 'ssm.amazonaws.com'
  AND eventname in ('StartSession',
                    'ResumeSession',
                    'TerminateSession')
GROUP BY DATE(from_iso8601_timestamp(eventTime)),
         userIdentity.type,
         eventSource,
         eventName
ORDER BY DATE(from_iso8601_timestamp(eventTime)) DESC

The number of shell sessions now acts as a fitness function to improve operational excellence through operations as code. Coincidentally, the fitness function you defined also rewards teams that move to serverless compute services such as AWS Fargate or AWS Lambda.
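
To keep the measurement hands-off, a small script could run this query on a schedule and publish the count as a metric for team dashboards. The following sketch is illustrative only: the database name, results bucket, metric namespace, and the simplified (ungrouped) query are assumptions.

import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Simplified version of the query above: a single overall count of shell sessions.
query = """
SELECT count(*) AS shell_sessions
FROM "cloudtrail_logs_partition_projection"
WHERE readonly = 'false'
  AND eventsource = 'ssm.amazonaws.com'
  AND eventname IN ('StartSession', 'ResumeSession', 'TerminateSession')
"""

# Start the query; the database and output location are placeholders.
execution = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={"Database": "default"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results-bucket/fitness-functions/"},
)
query_id = execution["QueryExecutionId"]

# Wait for the query to finish.
state = "QUEUED"
while state in ("QUEUED", "RUNNING"):
    time.sleep(2)
    state = athena.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]["Status"]["State"]

# Row 0 of the result set is the header; row 1 holds the count.
rows = athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
shell_sessions = int(rows[1]["Data"][0]["VarCharValue"])

# Publish the value so teams can track the trendline on a dashboard.
cloudwatch.put_metric_data(
    Namespace="FitnessFunctions",
    MetricData=[{"MetricName": "ShellSessions", "Value": shell_sessions}],
)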

Fitness through exercising

Similar to people, your architecture’s fitness can be improved by exercising. It does not take much equipment, but you need to take the first step. To get started, we encourage you to think of the desired outcomes for your architecture that you can measure (and thus guide) through fitness functions. The following lessons learned will help you focus your goals:

  • Requirements and business goals may differ per domain. Thus, your fitness functions might differ. Work closely with your teams when defining fitness functions.
  • Start by taking something that can be easily measured and communicated as a goal.
  • Focus on a positive trendline rather than absolute values.
  • Make sure you and your teams are using the same metrics and the same way to measure them. We have seen examples where central governance departments had access to data the individual teams did not, leading to frustration on all sides.
  • Ensure that your architecture goals fit well into the current context and time horizon.
  • Continuously re-visit the fitness functions to ensure that they evolve with the changing business goals.

Conclusion

Fitness functions help architects focus on building. Once established, teams can use the data points from fitness functions to make decisions and work towards a common and measurable goal. The architects in turn can use the data points they get from fitness functions to confirm their hypothesis of the current state of the architecture. Get started building your fitness functions today by:

  • Gathering the most important system quality attributes.
  • Beginning with approximately three meaningful fitness functions relying on the API operations available.
  • Building a dashboard that shows progress over time, sharing it with your teams, and relying on this data in your daily work.

Building well-architected serverless applications: Implementing application workload security – part 2

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-implementing-application-workload-security-part-2/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the introduction post for a table of contents and explanation of the example application.

Security question SEC3: How do you implement application security in your workload?

This post continues part 1 of this security question. Previously, I cover reviewing security awareness documentation such as the Common Vulnerabilities and Exposures (CVE) database. I show how to use GitHub security features to inspect and manage code dependencies. I then show how to validate inbound events using Amazon API Gateway request validation.

Required practice: Store secrets that are used in your code securely

Store secrets such as database passwords or API keys in a secrets manager. Using a secrets manager allows for auditing access, easier rotation, and prevents exposing secrets in application source code. There are a number of AWS and third-party solutions to store and manage secrets.

AWS Partner Network (APN) member Hashicorp provides Vault to keep secrets and application data secure. Vault has a centralized workflow for tightly controlling access to secrets across applications, systems, and infrastructure. You can store secrets in Vault and access them from an AWS Lambda function to, for example, access a database. You can use the Vault Agent for AWS to authenticate with Vault, receive the database credentials, and then perform the necessary queries. You can also use the Vault AWS Lambda extension to manage the connectivity to Vault.

AWS Systems Manager Parameter Store allows you to store configuration data securely, including secrets, as parameter values.

AWS Secrets Manager enables you to replace hardcoded credentials in your code with an API call to Secrets Manager to retrieve the secret programmatically. You can protect, rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle. You can also generate secure secrets. By default, Secrets Manager does not write or cache the secret to persistent storage.

Parameter Store integrates with Secrets Manager. For more information, see “Referencing AWS Secrets Manager secrets from Parameter Store parameters.”
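
As a quick illustration of that integration, an application that already reads configuration from Parameter Store can retrieve a Secrets Manager secret through the reserved /aws/reference/secretsmanager/ prefix without calling the Secrets Manager API directly. The secret name below is a placeholder:

import boto3

ssm = boto3.client('ssm')

# Read a Secrets Manager secret through Parameter Store using the reserved prefix.
response = ssm.get_parameter(
    Name='/aws/reference/secretsmanager/demo/database-password',  # placeholder secret name
    WithDecryption=True,
)
secret_value = response['Parameter']['Value']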

To show how Secrets Manager works, deploy the solution detailed in “How to securely provide database credentials to Lambda functions by using AWS Secrets Manager”.

The AWS CloudFormation stack deploys an Amazon RDS MySQL database with a randomly generated password. This is stored in Secrets Manager using a secret resource. A Lambda function behind an API Gateway endpoint returns the record count in a table from the database, using the required credentials. Lambda function environment variables store the database connection details and which secret to return for the database password. The password is not stored as an environment variable, nor in the Lambda function application code.

Lambda environment variables for Secrets Manager

The application flow is as follows:

  1. Clients call the API Gateway endpoint
  2. API Gateway invokes the Lambda function
  3. The Lambda function retrieves the database secrets using the Secrets Manager API
  4. The Lambda function connects to the RDS database using the credentials from Secrets Manager and returns the query results

View the password secret value in the Secrets Manager console, which is randomly generated as part of the stack deployment.

Example password stored in Secrets Manager

The Lambda function includes the following code to retrieve the secret from Secrets Manager. The function then uses it to connect to the database securely.

import json
import os

import boto3
import pymysql

# Connection details come from environment variables; only the name of the
# secret is configured here, never the password itself.
secret_name = os.environ['SECRET_NAME']
rds_host = os.environ['RDS_HOST']
name = os.environ['RDS_USERNAME']
db_name = os.environ['RDS_DB_NAME']
region_name = os.environ['AWS_REGION']  # Region supplied by the Lambda runtime

session = boto3.session.Session()
client = session.client(
    service_name='secretsmanager',
    region_name=region_name
)

# Retrieve the secret value from Secrets Manager at runtime
get_secret_value_response = client.get_secret_value(
    SecretId=secret_name
)
...
secret = get_secret_value_response['SecretString']
j = json.loads(secret)
password = j['password']
...
# Connect to the database using the credentials retrieved from the secret
conn = pymysql.connect(
    rds_host, user=name, passwd=password, db=db_name, connect_timeout=5)

Browsing to the endpoint URL specified in the CloudFormation output displays the number of records. This confirms that the Lambda function has successfully retrieved the secure database credentials and queried the table for the record count.

Lambda function retrieving database credentials

Audit secrets access through a secrets manager

Monitor how your secrets are used to confirm that the usage is expected, and log any changes to them. This helps to ensure that any unexpected usage or change can be investigated, and unwanted changes can be rolled back.

Hashicorp Vault uses Audit devices that keep a detailed log of all requests and responses to Vault. Audit devices can append logs to a file, write to syslog, or write to a socket.

Secrets Manager supports logging API calls with AWS CloudTrail. CloudTrail captures all API calls for Secrets Manager as events. This includes calls from the Secrets Manager console and from code calling the Secrets Manager APIs.

Viewing the CloudTrail event history shows the requests to secretsmanager.amazonaws.com. This shows the requests from the console in addition to the Lambda function.

CloudTrail showing access to Secrets Manager
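
The same event history can also be queried programmatically. The following is a minimal sketch using the CloudTrail LookupEvents API to list recent Secrets Manager activity; the 24-hour window is an example:

from datetime import datetime, timedelta
import boto3

cloudtrail = boto3.client('cloudtrail')

# List Secrets Manager API activity from the last 24 hours.
events = cloudtrail.lookup_events(
    LookupAttributes=[{'AttributeKey': 'EventSource', 'AttributeValue': 'secretsmanager.amazonaws.com'}],
    StartTime=datetime.utcnow() - timedelta(hours=24),
    EndTime=datetime.utcnow(),
)

for event in events['Events']:
    print(event['EventTime'], event['EventName'], event.get('Username'))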

Secrets Manager also works with Amazon EventBridge so you can trigger alerts when administrator-specified operations occur. You can configure EventBridge rules to alert on deleted secrets or secret rotation. You can also create an alert if anyone tries to use a secret version while it is pending deletion. This can identify and alert when there is an attempt to use an out-of-date secret.
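
As an illustration of such a rule, the following sketch matches DeleteSecret calls recorded by CloudTrail and forwards them to a notification target. The rule name and the Amazon SNS topic ARN are assumptions for this example:

import json
import boto3

events = boto3.client('events')

# Match Secrets Manager DeleteSecret calls recorded by CloudTrail.
events.put_rule(
    Name='alert-on-secret-deletion',  # example rule name
    EventPattern=json.dumps({
        'source': ['aws.secretsmanager'],
        'detail-type': ['AWS API Call via CloudTrail'],
        'detail': {
            'eventSource': ['secretsmanager.amazonaws.com'],
            'eventName': ['DeleteSecret'],
        },
    }),
)

# Send matched events to an SNS topic (placeholder ARN) for alerting.
events.put_targets(
    Rule='alert-on-secret-deletion',
    Targets=[{'Id': 'notify-security-team', 'Arn': 'arn:aws:sns:us-east-1:123456789012:security-alerts'}],
)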

Enforce least privilege access to secrets

Access to secrets must be tightly controlled because the secrets contain sensitive information. Create AWS Identity and Access Management (IAM) policies that enable minimal access to secrets to prevent credentials being accidentally used or compromised. Secrets that have policies that are too permissive could be misused by other environments or developers. This can lead to accidental data loss or compromised systems. For more information, see “Authentication and access control for AWS Secrets Manager”.

Rotate secrets frequently

Rotating your workload secrets is important. This prevents misuse of your secrets since they become invalid within a configured time period.

Secrets Manager allows you to rotate secrets on a schedule or on demand. This enables you to replace long-term secrets with short-term ones, significantly reducing the risk of compromise. Secrets Manager creates a CloudFormation stack with a Lambda function to manage the rotation process for you. Secrets Manager has native integrations with Amazon RDS, Amazon Redshift, and Amazon DocumentDB. It populates the function with the Amazon Resource Name (ARN) of the secret. You specify the permissions to rotate the credentials, and how often you want to rotate the secret.

The CloudFormation stack creates a MySecretRotationSchedule resource with a MyRotationLambda function to rotate the secret every 30 days.

MySecretRotationSchedule:
    Type: AWS::SecretsManager::RotationSchedule
    DependsOn: SecretRDSInstanceAttachment
    Properties:
        SecretId: !Ref MyRDSInstanceRotationSecret
        RotationLambdaARN: !GetAtt MyRotationLambda.Arn
        RotationRules:
            AutomaticallyAfterDays: 30
MyRotationLambda:
    Type: AWS::Serverless::Function
    Properties:
        Runtime: python3.7
        Role: !GetAtt MyLambdaExecutionRole.Arn
        Handler: mysql_secret_rotation.lambda_handler
        Description: 'This is a lambda to rotate MySql user passwd'
        FunctionName: 'cfn-rotation-lambda'
        CodeUri: 's3://devsecopsblog/code.zip'
        Environment:
            Variables:
                SECRETS_MANAGER_ENDPOINT: !Sub 'https://secretsmanager.${AWS::Region}.amazonaws.com'

View and edit the rotation settings in the Secrets Manager console.

Secrets Manager rotation settings

Manually rotate the secret by selecting Rotate secret immediately. This invokes the Lambda function, which updates the database password and updates the secret in Secrets Manager.
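
The same on-demand rotation can also be started from the API. A minimal sketch follows; the secret ID is a placeholder:

import boto3

secretsmanager = boto3.client('secretsmanager')

# Start an immediate rotation; the configured rotation Lambda function is invoked.
secretsmanager.rotate_secret(SecretId='demo/database-password')  # placeholder secret ID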

View the updated secret in Secrets Manager, where the password has changed.

Secrets Manager password change

Browse to the endpoint URL to confirm you can still access the database with the updated credentials.

Access endpoint with updated Secret Manager password

You can provide your own code to customize a Lambda rotation function for other databases or services. The code includes the commands required to interact with your secured service to update or add credentials.

Conclusion

Implementing application security in your workload involves reviewing and automating security practices at the application code level. By implementing code security, you can protect against emerging security threats. You can improve the security posture by checking for malicious code, including third-party dependencies.

In this post, I continue from part 1, looking at securely storing, auditing, and rotating secrets that are used in your application code.

In the next post in the series, I start to cover the reliability pillar from the Well-Architected Serverless Lens with regulating inbound request rates.

For more serverless learning resources, visit Serverless Land.

Building well-architected serverless applications: Managing application security boundaries – part 2

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-managing-application-security-boundaries-part-2/

This series uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the nine serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the introduction post for a table of contents and explanation of the example application.

Security question SEC2: How do you manage your serverless application’s security boundaries?

This post continues part 1 of this security question. Previously, I cover how to evaluate and define resource policies, showing what policies are available for various serverless services. I show some of the features of AWS Web Application Firewall (AWS WAF) to protect APIs. I then go through how to control network traffic at all layers. I explain how AWS Lambda functions connect to VPCs, and how to use private APIs and VPC endpoints. I walk through how to audit your traffic.

Required practice: Use temporary credentials between resources and components

Do not share credentials and permissions policies between resources to maintain a granular segregation of permissions and improve the security posture. Use temporary credentials that are frequently rotated and that have policies tailored to the access the resource needs.

Use dynamic authentication when accessing components and managed services

AWS Identity and Access Management (IAM) roles allow your applications to access AWS services securely without requiring you to manage or hardcode the security credentials. When you use a role, you don’t have to distribute long-term credentials such as a user name and password, or access keys. Instead, the role supplies temporary permissions that applications can use when they make calls to other AWS resources. When you create a Lambda function, for example, you specify an IAM role to associate with the function. The function can then use the role-supplied temporary credentials to sign API requests.

Use IAM for authorizing access to AWS managed services such as Lambda or Amazon S3. Lambda also assumes IAM roles, exposing and rotating temporary credentials to your functions. This enables your application code to access AWS services.

Use IAM to authorize access to internal or private Amazon API Gateway API consumers. See this list of AWS services that work with IAM.

Within the serverless airline example used in this series, the loyalty service uses a Lambda function to fetch loyalty points and next tier progress. AWS AppSync acts as the client using an HTTP resolver, via an API Gateway REST API /loyalty/{customerId}/get resource, to invoke the function.

To ensure only AWS AppSync is authorized to invoke the API, IAM authorization is set within the API Gateway method request.

Viewing API Gateway IAM authorization

The IAM role specifies that appsync.amazonaws.com can perform an execute-api:Invoke on the specific API Gateway resource arn:aws:execute-api:${AWS::Region}:${AWS::AccountId}:${LoyaltyApi}/*/*/*

For more information, see “Using an IAM role to grant permissions to applications”.

Use a framework such as the AWS Serverless Application Model (AWS SAM) to deploy your applications. This ensures that AWS resources are provisioned with unique per resource IAM roles. For example, AWS SAM automatically creates unique IAM roles for every Lambda function you create.

Best practice: Design smaller, single purpose functions

Creating smaller, single purpose functions enables you to keep your permissions aligned to least privileged access. This reduces the risk of compromise since the function does not require access to more than it needs.

Create single purpose functions with their own IAM role

Single purpose Lambda functions allow you to create IAM roles that are specific to your access requirements. For example, a large multipurpose function might need access to multiple AWS resources such as Amazon DynamoDB, Amazon S3, and Amazon Simple Queue Service (SQS). Single purpose functions would not need access to all of them at the same time.

With smaller, single purpose functions, it’s often easier to identify the specific resources and access requirements, and grant only those permissions. Additionally, new features are usually implemented by new functions in this architectural design. You can specifically grant permissions in new IAM roles for these functions.

Avoid sharing IAM roles with multiple cloud resources. As permissions are added to the role, these are shared across all resources using this role. For example, use one dedicated IAM role per Lambda function. This allows you to control permissions more intentionally. Even if some functions have the same policy initially, always separate the IAM roles to ensure least privilege policies.

Use least privilege access policies with your users and roles

When you create IAM policies, follow the standard security advice of granting least privilege, or granting only the permissions required to perform a task. Determine what users (and roles) must do and then craft policies that allow them to perform only those tasks.

Start with a minimum set of permissions and grant additional permissions as necessary. Doing so is more secure than starting with permissions that are too lenient and then trying to tighten them later. In the unlikely event of misused credentials, credentials will only be able to perform limited interactions.

To control access to AWS resources, AWS SAM uses the same mechanisms as AWS CloudFormation. For more information, see “Controlling access with AWS Identity and Access Management” in the AWS CloudFormation User Guide.

For a Lambda function, AWS SAM scopes the permissions of your Lambda functions to the resources that are used by your application. You add IAM policies as part of the AWS SAM template. The policies property can be the name of AWS managed policies, inline IAM policy documents, or AWS SAM policy templates.

For example, the serverless airline has a ConfirmBooking Lambda function that has UpdateItem permissions to the specific DynamoDB BookingTable resource.

Parameters:
    BookingTable:
        Type: AWS::SSM::Parameter::Value<String>
        Description: Parameter Name for Booking Table
Resources:
    ConfirmBooking:
        Type: AWS::Serverless::Function
        Properties:
            FunctionName: !Sub ServerlessAirline-ConfirmBooking-${Stage}
            Policies:
                - Version: "2012-10-17"
                  Statement:
                      Action: dynamodb:UpdateItem
                      Effect: Allow
                      Resource: !Sub "arn:${AWS::Partition}:dynamodb:${AWS::Region}:${AWS::AccountId}:table/${BookingTable}"

One of the fastest ways to scope permissions appropriately is to use AWS SAM policy templates. You can reference these templates directly in the AWS SAM template for your application, providing custom parameters as required.

The serverless patterns collection allows you to build integrations quickly using AWS SAM and AWS Cloud Development Kit (AWS CDK) templates.

The booking service uses the SNSPublishMessagePolicy. This policy gives permission to the NotifyBooking Lambda function to publish a message to an Amazon Simple Notification Service (Amazon SNS) topic.

    BookingTopic:
        Type: AWS::SNS::Topic

    NotifyBooking:
        Type: AWS::Serverless::Function
        Properties:
            Policies:
                - SNSPublishMessagePolicy:
                      TopicName: !Sub ${BookingTopic.TopicName}
        …

Auditing permissions and removing unnecessary permissions

Audit permissions regularly to help you identify unused permissions so that you can remove them. You can use last accessed information to refine your policies and allow access to only the services and actions that your entities use. Use the IAM console to view when an IAM role was last used.

IAM last used

Use IAM access advisor to review when an AWS service was last accessed by a specific IAM user or role. You can view last accessed information on the Access Advisor tab in the IAM console. Using this information, you can remove IAM policies and access from your IAM roles.

IAM access advisor
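
The same last accessed data is available programmatically, which is useful for automating periodic reviews. The following is a hedged sketch using the IAM last accessed APIs; the role ARN is a placeholder:

import time
import boto3

iam = boto3.client('iam')

# Generate a last-accessed report for a role (placeholder ARN).
job = iam.generate_service_last_accessed_details(
    Arn='arn:aws:iam::123456789012:role/my-function-role'
)

# Poll for the report to complete.
while True:
    details = iam.get_service_last_accessed_details(JobId=job['JobId'])
    if details['JobStatus'] != 'IN_PROGRESS':
        break
    time.sleep(2)

# List services the role has never used; candidates for permission removal.
for service in details.get('ServicesLastAccessed', []):
    if service.get('TotalAuthenticatedEntities', 0) == 0:
        print('Never used:', service['ServiceName'])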

When creating and editing policies, you can validate them using IAM Access Analyzer, which provides over 100 policy checks. It generates security warnings when a statement in your policy allows access that AWS considers overly permissive. Use the security warnings’ actionable recommendations to help grant least privilege. To learn more about policy checks provided by IAM Access Analyzer, see “IAM Access Analyzer policy validation”.
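
Policy validation can also run in a pipeline before deployment. The following sketch validates an identity policy document with the Access Analyzer ValidatePolicy API; the example policy is intentionally broad to illustrate the findings returned:

import json
import boto3

accessanalyzer = boto3.client('accessanalyzer')

# An intentionally broad example policy, used only to illustrate the findings.
policy = {
    'Version': '2012-10-17',
    'Statement': [{'Effect': 'Allow', 'Action': 'dynamodb:*', 'Resource': '*'}],
}

findings = accessanalyzer.validate_policy(
    policyDocument=json.dumps(policy),
    policyType='IDENTITY_POLICY',
)

for finding in findings['findings']:
    print(finding['findingType'], finding['issueCode'], finding['findingDetails'])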

With AWS CloudTrail, you can use CloudTrail event history to review individual actions your IAM role has performed in the past. Using this information, you can detect which permissions were actively used, and decide to remove permissions.

AWS CloudTrail

To work out which permissions you may need, you can generate IAM policies based on access activity. You configure an IAM role with broad permissions while the application is in development. Access Analyzer reviews your CloudTrail logs. It generates a policy template that contains the permissions that the role used in your specified date range. Use the template to create a policy that grants only the permissions needed to support your specific use case. For more information, see “Generate policies based on access activity”.

IAM Access Analyzer

Conclusion

Managing your serverless application’s security boundaries ensures isolation for, within, and between components. In this post, I continue from part 1, looking at using temporary credentials between resources and components. I cover why smaller, single purpose functions are better from a security perspective, and how to audit permissions. I show how to use AWS SAM to create per-function IAM roles.

For more serverless learning resources, visit https://serverlessland.com.

Using AWS Systems Manager in Hybrid Cloud Environments

Post Syndicated from Shivam Patel original https://aws.amazon.com/blogs/architecture/using-aws-systems-manager-in-hybrid-cloud-environments/

Customers operating in hybrid environments today face tremendous challenges with regard to operational management, security/compliance, and monitoring. Systems administrators have to connect, monitor, patch, and automate across multiple Operating Systems (OS), applications, cloud, and on-premises infrastructure. Each of these scenarios has its own unique vendor and console purpose-built for a specific use case.

Using Hybrid Activations, a capability within AWS Systems Manager, you can manage resources irrespective of where they are hosted. You can securely initiate remote shell connections, automate patch management, and monitor critical metrics. You’re able to gain visibility into networking information and application installations via a single console.
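
As a quick illustration, registering on-premises servers starts with a hybrid activation, which returns an activation code and ID that the SSM Agent on each server uses to register with Systems Manager. The following is a minimal sketch; the IAM service role name and registration limit are placeholders:

import boto3

ssm = boto3.client('ssm')

# Create a hybrid activation; the returned code and ID are used when
# registering the SSM Agent on each on-premises server or VM.
activation = ssm.create_activation(
    Description='Hybrid activation for on-premises VMs',
    DefaultInstanceName='on-prem-vm',
    IamRole='SSMServiceRole',      # placeholder service role name
    RegistrationLimit=10,          # maximum number of managed nodes to register
)
print(activation['ActivationId'], activation['ActivationCode'])

# After registration, hybrid nodes appear alongside EC2 instances (IDs begin with "mi-").
for info in ssm.describe_instance_information()['InstanceInformationList']:
    print(info['InstanceId'], info['PingStatus'], info.get('PlatformName'))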

In this post, we’ll discuss how the Session Manager and Patch Manager capabilities of Systems Manager allow you to securely connect to instances and virtual machines (VMs). You can centrally log session activity for later auditing and automate patch management, across both cloud and on-premises environments, within a single interface.

Session Manager

Session Manager is a fully managed feature of AWS Systems Manager. Session Manager provides secure and auditable instance management without the need to open inbound ports, maintain bastion hosts, or manage SSH keys. The centralized session management capability of Session Manager provides administrators the ability to centrally manage access to all compute instances. Irrespective of where your VM is hosted, the Session Manager session can be initiated from the AWS Management Console or from the AWS Command Line Interface (AWS CLI). When using the CLI, the Session Manager plugin must be installed. The following screenshot shows an example.

Figure 1. Initiating instance management via Session Manager

The session is launched using the default, system-generated ssm-user account. With this account, the system does not prompt for a password when initiating root-level commands. To improve security, OS accounts can be used to launch sessions using the Run As feature of Session Manager.

A session initiated via Session Manager is secure. The data exchange between the client and a managed instance takes place over a secure channel using TLS 1.2. To further improve your security posture, AWS Key Management Service (AWS KMS) encryption can be used to encrypt the session traffic between a client and a managed instance. Encrypting session data with a customer managed key enables sessions to handle confidential data interactions. To use KMS encryption, both the user who starts sessions and the managed instance that they connect to must have permission to use the key. Step-by-step instructions on how to set this up can be found in the Session Manager documentation.

Session Manager integrates with AWS CloudTrail, and this enables security teams to track when a user starts and shuts down sessions. Session Manager can also centrally log all session activity in Amazon CloudWatch or Amazon Simple Storage Service (Amazon S3). This gives system administrators the ability to manage details, such as when the session started, what commands were typed during the session, and when it ended. To configure Session Manager to send logs to CloudWatch and Amazon S3, the instance profile attached to the instance must have permissions to write to CloudWatch and Amazon S3. For the Amazon EC2 instance, this will be the IAM role attached to the instance. For VMs running on VMware Cloud on AWS, or on-premises, this is the IAM role from the “Hybrid Activations” page.

Following, we show an example of a session run on an on-premises instance via Session Manager and the corresponding logs in CloudWatch. The logs are continuously streamed into CloudWatch.

Figure 2. CloudWatch log events for session activity via Session Manager

The following screenshot displays the ipconfig /all command being run remotely within PowerShell of an instance running within VMware Cloud on AWS via Session Manager:

Figure 3. Remote PowerShell session for on-premises VM via Session Manager

Patch Manager

Patch management is vital in maintaining a secure and compliant environment. Patch Manager, a capability of AWS Systems Manager, helps you monitor, select, and deploy operating system and software patches automatically. This can happen across compute running on Amazon EC2, VMware on-premises, or VMware Cloud on AWS instances.

The Patch Manager dashboard shows details such as number of instances, high-level patch compliance summaries, compliance reporting age, and common causes of noncompliance. As Patch Manager performs patching operations, it updates the dashboard with a summary of recent patching operations and a list of recurring patching tasks. This provides the operations team a single unified view into environments and simplifies their monitoring efforts.

Figure 4. Patch Manager dashboard

Figure 5. List of all recurring patching tasks

A patch baseline in Patch Manager defines which patches are approved for installation on your instances. Patch Manager provides predefined patch baselines for each supported operating system and also lets you create your own custom patch baselines. These patch baselines let you maintain patch consistency across your deployments on Amazon EC2, VMware on-premises, and VMware Cloud on AWS.

Custom patch baselines give you greater control over which patches are approved and when they are automatically applied. By using multiple patch baselines with different auto-approval delays or cutoffs, you can test patches in your development environment. Custom patch baselines also let you assign compliance levels to indicate the severity of the compliance violation when a patch is reported as missing.

Figure 6. List of Patch baselines

You can use a patch group to associate a group of instances with a specific patch baseline in Patch Manager. This ensures that you are deploying the appropriate patches with associated patch baseline rules, to the correct set of instances. These instances can be EC2, VMware on-premises, or VMware Cloud on AWS. You can also use patch groups to schedule patching during a specific maintenance window.
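
To show how a custom patch baseline and a patch group fit together, the following sketch creates a baseline that auto-approves security patches seven days after release and associates it with a patch group. The names and operating system are assumptions for this example:

import boto3

ssm = boto3.client('ssm')

# Custom baseline: approve security patches seven days after release.
baseline = ssm.create_patch_baseline(
    Name='AmazonLinux2-Security-Baseline',        # example name
    OperatingSystem='AMAZON_LINUX_2',
    ApprovalRules={
        'PatchRules': [{
            'PatchFilterGroup': {
                'PatchFilters': [{'Key': 'CLASSIFICATION', 'Values': ['Security']}]
            },
            'ApproveAfterDays': 7,
            'ComplianceLevel': 'CRITICAL',
        }]
    },
)

# Instances tagged with Patch Group = "web-servers" will use this baseline.
ssm.register_patch_baseline_for_patch_group(
    BaselineId=baseline['BaselineId'],
    PatchGroup='web-servers',
)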

Patch Manager also provides the ability to scan your instances and VMs running within VMware on-premises and/or VMware Cloud on AWS. It can report compliance adherence based on pre-defined schedules. Patch compliance reports can also be saved to an Amazon S3 bucket of your choice and generated as needed. For reports on a single instance/VM, detailed patch data will be included. For reports run on all instances, a summary of missing patch data will be provided.

The Patch Manager feature of AWS Systems Manager also integrates with AWS Security Hub, a service providing a comprehensive view of your security alerts. It additionally offers security check automation capabilities. In the following image, we show non-compliant instances and servers being reported within AWS Security Hub across EC2, VMware on-premises, and VMware Cloud on AWS:

Figure 7. Non-compliant instances and VMs being reported via AWS Security Hub

Installation and deployment

To ease installation and deployment efforts, the SSM agent is pre-installed on instances created from the following Amazon Machine Images (AMIs):

  • Amazon Linux
  • Amazon Linux 2
  • Amazon Linux 2 ECS-Optimized Base AMIs
  • macOS 10.14.x (Mojave) and 10.15.x (Catalina)
  • Ubuntu Server 16.04, 18.04, and 20.04
  • Windows Server 2008-2012 R2 AMIs published in November 2016 or later
  • Windows Server 2016 and 2019

For other AMIs, and for VMs running in VMware on-premises or VMware Cloud on AWS environments, the agent must be installed manually.

The following architecture diagram shows the solution described in this post:

Figure 8. General example of Systems Manager process flow

  1. Configure Systems Manager: Use the Systems Manager console, SDK, AWS Command Line Interface (AWS CLI), or AWS Tools for Windows PowerShell to configure, schedule, automate, and run actions that you want to perform on your AWS resources.
  2. Verification and processing: Systems Manager verifies the configurations, including permissions, and sends requests to the AWS Systems Manager SSM Agent running on your instances or servers in your hybrid environment. SSM Agent performs the specified configuration changes.
  3. Reporting: SSM Agent reports the status of the configuration changes and actions to Systems Manager in the AWS Cloud. If configured, Systems Manager then sends the status to the user and various AWS services.

Conclusion

In this post, we showcase how AWS Systems Manager can yield a unified view within your hybrid environments. It spans native AWS, VMware on-premises, and VMware Cloud on AWS. The Session Manager and Patch Manager features simplify instance connectivity and patch management. Other native capabilities of AWS Systems Manager allow application and change management, software inventory, remote initiation, and monitoring. We encourage you to use the features discussed in this post to maintain your servers across your hybrid environment.

Operating Lambda: Building a solid security foundation – Part 2

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/operating-lambda-building-a-solid-security-foundation-part-2/

In the Operating Lambda series, I cover important topics for developers, architects, and systems administrators who are managing AWS Lambda-based applications. This two-part series discusses core security concepts for Lambda-based applications.

Part 1 explains the Lambda execution environment and how to apply the principles of least privilege to your workload. This post covers securing workloads with public endpoints, encrypting data, and using AWS CloudTrail for governance, compliance, and operational auditing.

Securing workloads with public endpoints

For workloads that are accessible publicly, AWS provides a number of features and services that can help mitigate certain risks. This section covers authentication and authorization of application users and protecting API endpoints.

Authentication and authorization

Authentication relates to identity and authorization refers to actions. Use authentication to control who can invoke a Lambda function, and then use authorization to control what they can do. For many applications, AWS Identity & Access Management (IAM) is sufficient for managing both control mechanisms.

For applications with external users, such as web or mobile applications, it is common to use JSON Web Tokens (JWTs) to manage authentication and authorization. Unlike traditional, server-based password management, JWTs are passed from the client on every request. They are a cryptographically secure way to verify identity and claims using data passed from the client. For Lambda-based applications, this allows you to secure APIs for each microservice independently, without relying on a central server for authentication.

You can implement JWTs with Amazon Cognito, which is a user directory service that can handle registration, authentication, account recovery, and other common account management operations. For frontend development, Amplify Framework provides libraries to simplify integrating Cognito into your frontend application. You can also use third-party partner services like Auth0.

Given the critical security role of an identity provider service, it’s important to use professional tooling to safeguard your application. It’s not recommended that you write your own services to handle authentication or authorization. Any vulnerabilities in custom libraries may have significant implications for the security of your workload and its data.

Protecting API endpoints

For serverless applications, the preferred way to serve a backend application publicly is to use Amazon API Gateway. This can help you protect an API from malicious users or spikes in traffic.

For authenticated API routes, API Gateway offers both REST APIs and HTTP APIs for serverless developers. Both types support authorization using AWS Lambda, IAM or Amazon Cognito. When using IAM or Amazon Cognito, incoming requests are evaluated and if they are missing a required token or contain invalid authentication, the request is rejected. You are not charged for these requests and they do not count towards any throttling quotas.

Unauthenticated API routes may be accessed by anyone on the public internet so it’s recommended that you limit their use. If you must use unauthenticated APIs, it’s important to protect these against common risks, such as denial-of-service (DoS) attacks. Applying AWS WAF to these APIs can help protect your application from SQL injection and cross-site scripting (XSS) attacks. API Gateway also implements throttling at the AWS account-level and per-client level when API keys are used.

In some cases, the functionality provided by an unauthenticated API can be achieved with an alternative approach. For example, a web application may provide a list of customer retail stores from an Amazon DynamoDB table to users who are not logged in. This request may originate from a frontend web application or from any other source that calls the URL endpoint. This diagram compares three solutions:

Solutions for an unauthenticated API

  1. The unauthenticated API can be called by anyone on the internet. In a denial of service attack, it’s possible to exhaust API throttling limits, Lambda concurrency, or DynamoDB provisioned read capacity on an underlying table.
  2. An Amazon CloudFront distribution in front of the API endpoint with an appropriate time-to-live (TTL) configuration may help absorb traffic in a DoS attack, without changing the underlying solution for fetching the data.
  3. Alternatively, for static data that rarely changes, the CloudFront distribution could serve the data from an S3 bucket.

The AWS Well-Architected Tool provides a Serverless Lens that analyzes the security posture of serverless workloads.

Encrypting data in Lambda-based applications

Managing secrets

For applications handling sensitive data, AWS services provide a range of encryption options for data in transit and at rest. It’s important to identify and classify sensitive data in your workload, and minimize the storage of sensitive data to only what is necessary.

When protecting data at rest, use AWS services for key management and encryption of stored data, secrets and environment variables. Both the AWS Key Management Service and AWS Secrets Manager provide a robust approach to storing and managing secrets used in Lambda functions.

Do not store plaintext secrets or API keys in Lambda environment variables. Instead, use KMS to encrypt environment variables. Also ensure you do not embed secrets directly in function code, or commit these secrets to code repositories.
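
If you do encrypt individual environment variables with a customer managed key, the function must decrypt them at runtime. The following is a hedged sketch; the variable name is a placeholder, and the encryption context assumes the Lambda console encryption helpers were used:

import base64
import os
import boto3

kms = boto3.client('kms')

# DB_PASSWORD holds a base64-encoded ciphertext, not a plaintext secret.
ciphertext = base64.b64decode(os.environ['DB_PASSWORD'])  # placeholder variable name

plaintext = kms.decrypt(
    CiphertextBlob=ciphertext,
    # Encryption context used by the Lambda console encryption helpers (assumption).
    EncryptionContext={'LambdaFunctionName': os.environ['AWS_LAMBDA_FUNCTION_NAME']},
)['Plaintext'].decode('utf-8')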

Using HTTPS securely

HTTPS is encrypted HTTP, using TLS (SSL) to encrypt the request and response, including headers and query parameters. While query parameters are encrypted, URLs may be logged by different services in plaintext, so you should not use these to store sensitive data such as credit card numbers.

AWS services make it easier to use HTTPS throughout your application and it is provided by default in services like API Gateway. Where you need an SSL/TLS certificate in your application, to support features like custom domain names, it’s recommended that you use AWS Certificate Manager (ACM). This provides free public certificates for ACM-integrated services and managed certificate renewal.

Governance controls with AWS CloudTrail

For compliance and operational auditing of application usage, AWS CloudTrail logs activity related to your AWS account usage. It tracks resource changes and usage, and provides analysis and troubleshooting tools. Enabling CloudTrail does not have any negative performance implications for your Lambda-based application, since the logging occurs asynchronously.

Separate from application logging (see chapter 4), CloudTrail captures two types of events:

  • Control plane: These events apply to management operations performed on any AWS resources. Individual trails can be configured to capture read or write events, or both.
  • Data plane: Events performed on the resources, such as when a Lambda function is invoked or an S3 object is downloaded.

For Lambda, you can log who creates and invokes functions, together with any changes to IAM roles. You can configure CloudTrail to log every single activity by user, role, service, and API within an AWS account. The service is critical for understanding the history of changes made to your account and also detecting any unintended changes or suspicious activity.

To research which AWS user interacted with a Lambda function, CloudTrail provides an audit log to find this information. For example, when a new permission is added to a Lambda function, it creates an AddPermission record. You can interpret the meaning of individual attributes in the JSON message by referring to the CloudTrail Record Contents documentation.

CloudTrail data is considered sensitive, so it’s recommended that you protect it with KMS encryption. Any service processing encrypted CloudTrail data must use an IAM policy with kms:Decrypt permission.

By integrating CloudTrail with Amazon EventBridge, you can create alerts in response to certain activities and respond accordingly. With these two services, you can quickly implement an automated detection and response pattern, enabling you to develop mechanisms to mitigate security risks. With EventBridge, you can analyze data in real-time, using event rules to filter events and forward to targets like Lambda functions or Amazon Kinesis streams.

CloudTrail can deliver data to Amazon CloudWatch Logs, which allows you to process multi-Region data in real time from one location. You can also deliver CloudTrail to Amazon S3 buckets, where you can create event source mappings to start data processing pipelines, run queries with Amazon Athena, or analyze activity with Amazon Macie.

If you use multiple AWS accounts, you can use AWS Organizations to manage and govern individual member accounts centrally. You can set an existing trail as an organization-level trail in a primary account that can collect events from all other member accounts. This can simplify applying consistent auditing rules across a large set of existing accounts, or automatically apply rules to new accounts. To learn more about this feature, see Creating a Trail for an Organization.

Conclusion

In this blog post, I explain how to secure workloads with public endpoints and the different authentication and authorization options available. I also show different approaches to exposing APIs publicly.

CloudTrail can provide compliance and operational auditing for Lambda usage. It provides logs for both the control plane and data plane. You can integrate CloudTrail with EventBridge to create alerts in response to certain activities. Customers with multiple AWS accounts can use AWS Organizations to manage trails centrally.

For more serverless learning resources, visit Serverless Land.

Run usage analytics on Amazon QuickSight using AWS CloudTrail

Post Syndicated from Sunil Salunkhe original https://aws.amazon.com/blogs/big-data/run-usage-analytics-on-amazon-quicksight-using-aws-cloudtrail/

Amazon QuickSight is a cloud-native BI service that allows end users to create and publish dashboards in minutes, without provisioning any servers or requiring complex licensing. You can view these dashboards on the QuickSight product console or embed them into applications and websites. After you deploy dashboards, it’s important to assess how they and other assets are being adopted, accessed, and used across various departments or customers.

In this post, we use a QuickSight dashboard to present the following insights:

  • Most viewed and accessed dashboards
  • Most updated dashboards and analyses
  • Most popular datasets
  • Active users vs. idle users
  • Idle authors
  • Unused datasets (wasted SPICE capacity)

You can use these insights to reduce costs and create operational efficiencies in a deployment. The following diagram illustrates this architecture.

Solution components

The following list summarizes the AWS services and resources that this solution uses:

  • AWS CloudTrail logs (CloudTrailMultiAccount) – Capture all API calls for all AWS services across all AWS Regions for this account. You can use AWS Organizations to consolidate trails across multiple AWS accounts.
  • AWS Glue crawlers (QSCloudTrailLogsCrawler, QSProcessedDataCrawler) – Ensure that all CloudTrail data is crawled periodically and that partitions are updated in the AWS Glue Data Catalog.
  • AWS Glue ETL job (QuickSightCloudTrailProcessing) – Reads catalogued data from the crawler, processes, transforms, and stores it in an S3 output bucket.
  • AWS Lambda function (ExtractQSMetadata_func) – Extracts event data using the AWS SDK for Python (Boto3). The event data is enriched with QuickSight metadata objects like users, analyses, datasets, and dashboards.
  • Amazon Simple Storage Service (Amazon S3) buckets (CloudTrailLogsBucket, QuickSight-BIonBI-processed) – One bucket stores CloudTrail data; the other stores processed data.
  • Amazon QuickSight analysis (Quicksight_BI_On_BO_Analysis) – Visualizes the processed data.

 Solution walkthrough

AWS CloudTrail is a service that enables governance, compliance, operational auditing, and risk auditing of your AWS account. You can use CloudTrail to log, continuously monitor, and retain account activity related to actions across your AWS infrastructure. You can define a trail to collect API actions across all AWS Regions. Although we have enabled a trail for all Regions in our solution, the dashboard shows the data for a single Region only.

After you enable CloudTrail, it starts capturing all API actions and then, at 15-minute intervals, delivers logs in JSON format to a configured Amazon Simple Storage Service (Amazon S3) bucket. Before the logs are made available to our ad hoc query engine, Amazon Athena, they must be parsed, transformed, and processed by the AWS Glue crawler and ETL job.

This is handled by an AWS Glue crawler and an AWS Glue ETL job. The crawler crawls through the data every day and populates new partitions in the Data Catalog. The data is later made available as a table on the Athena console for processing by the AWS Glue ETL job QuickSightCloudtrail_GlueJob.txt, which filters the logs and processes only those events where the event source is QuickSight (for example, eventSource = 'quicksight.amazonaws.com').

The following screenshot shows the sample JSON for the QuickSight API calls.

The job processes those events and creates a Parquet file. The file includes the following data points (field name – data type):

  • eventtime – Datetime
  • eventname – String
  • awsregion – String
  • accountid – String
  • username – String
  • analysisname – String
  • date – Date

The processed data is stored in an S3 folder at s3://<BucketName>/processedlogs/. For performance optimization during querying and connecting this data to QuickSight for visualization, these logs are partitioned by date field. For this reason, we recommend that you configure the AWS Glue crawler to detect the new data and partitions and update the Data Catalog for subsequent analysis. We have configured the crawler to run one time a day.

We need to enrich this log data with metadata from QuickSight, such as a list of analyses, users, and datasets. This metadata can be extracted using the describe_analysis, describe_user, and describe_data_set operations in the AWS SDK for Python (Boto3).

We provide an AWS Lambda function that is ideal for this extraction. We configured it to be triggered once a day through Amazon EventBridge. The extracted metadata is stored in the S3 folder at s3://<BucketName>/metadata/.
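
The following is a trimmed-down sketch of such an extraction function. The account ID and bucket name are placeholders, and it uses the list operations for brevity; the describe_* calls mentioned above return fuller detail per object:

import json
import boto3

ACCOUNT_ID = '123456789012'                 # placeholder account ID
BUCKET = 'quicksight-bionbi-processed'      # placeholder metadata bucket

quicksight = boto3.client('quicksight')
s3 = boto3.client('s3')

def lambda_handler(event, context):
    # Extract QuickSight metadata used to enrich the CloudTrail events.
    users = quicksight.list_users(AwsAccountId=ACCOUNT_ID, Namespace='default')['UserList']
    datasets = quicksight.list_data_sets(AwsAccountId=ACCOUNT_ID)['DataSetSummaries']
    analyses = quicksight.list_analyses(AwsAccountId=ACCOUNT_ID)['AnalysisSummaryList']

    metadata = {
        'users': [{'username': u['UserName'], 'role': u['Role'], 'active': u['Active']} for u in users],
        'datasets': [{'datasetid': d['DataSetId'], 'datasetname': d['Name'],
                      'importmode': d['ImportMode']} for d in datasets],
        'analysis': [{'analysisid': a['AnalysisId'], 'analysisname': a['Name']} for a in analyses],
    }

    # Store the extracted metadata for the Athena views to join against.
    s3.put_object(Bucket=BUCKET, Key='metadata/quicksight_metadata.json',
                  Body=json.dumps(metadata, default=str))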

Now that we have processed logs and metadata for enrichment, we need to prepare the data visualization in QuickSight. Athena allows us to build views that can be imported into QuickSight as datasets.

We build the following views based on the tables populated by the Lambda function and the ETL job:

CREATE VIEW vw_quicksight_bionbi 
AS 
  SELECT Date_parse(eventtime, '%Y-%m-%dT%H:%i:%SZ') AS "Event Time", 
         eventname  AS "Event Name", 
         awsregion  AS "AWS Region", 
         accountid  AS "Account ID", 
         username   AS "User Name", 
         analysisname AS "Analysis Name", 
         dashboardname AS "Dashboard Name", 
         Date_parse(date, '%Y%m%d') AS "Event Date" 
  FROM   "quicksightbionbi"."quicksightoutput_aggregatedoutput" 

CREATE VIEW vw_users 
AS 
  SELECT usr.username "User Name", 
         usr.role     AS "Role", 
         usr.active   AS "Active" 
  FROM   (quicksightbionbi.users 
          CROSS JOIN Unnest("users") t (usr)) 

CREATE VIEW vw_analysis 
AS 
  SELECT aly.analysisname "Analysis Name", 
         aly.analysisid   AS "Analysis ID" 
  FROM   (quicksightbionbi.analysis 
          CROSS JOIN Unnest("analysis") t (aly)) 

CREATE VIEW vw_analysisdatasets 
AS 
  SELECT alyds.analysesname "Analysis Name", 
         alyds.analysisid   AS "Analysis ID", 
         alyds.datasetid    AS "Dataset ID", 
         alyds.datasetname  AS "Dataset Name" 
  FROM   (quicksightbionbi.analysisdatasets 
          CROSS JOIN Unnest("analysisdatasets") t (alyds)) 

CREATE VIEW vw_datasets 
AS 
  SELECT ds.datasetname AS "Dataset Name", 
         ds.importmode  AS "Import Mode" 
  FROM   (quicksightbionbi.datasets 
          CROSS JOIN Unnest("datasets") t (ds))

QuickSight visualization

Follow these steps to connect the prepared data with QuickSight and start building the BI visualization.

  1. Sign in to the AWS Management Console and open the QuickSight console.

You can set up QuickSight access for end users through SSO providers such as AWS Single Sign-On (AWS SSO), Okta, Ping, and Azure AD so they don’t need to open the console.

  2. On the QuickSight console, choose Datasets.
  3. Choose New dataset to create a dataset for our analysis.

  4. For Create a Data Set, choose Athena.

In the previous steps, we prepared all our data in the form of Athena views.

  5. Configure permission for QuickSight to access AWS services, including Athena and its S3 buckets. For information, see Accessing Data Sources.

  6. For Data source name, enter QuickSightBIbBI.
  7. Choose Create data source.

  8. On Choose your table, for Database, choose quicksightbionbi.
  9. For Tables, select vw_quicksight_bionbi.
  10. Choose Select.

  11. For Finish data set creation, there are two options to choose from:
    1. Import to SPICE for quicker analytics – Built from the ground up for the cloud, SPICE uses a combination of columnar storage, in-memory technologies enabled through the latest hardware innovations, and machine code generation to run interactive queries on large datasets and get rapid responses. We use this option for this post.
    2. Directly query your data – You can connect to the data source in real time, but if the data query is expected to bring bulky results, this option might slow down the dashboard refresh.
  12. Choose Visualize to complete the data source creation process.

Now you can build your visualization sheets. QuickSight refreshes the data source first. You can also schedule a periodic refresh of your data source.

The following screenshot shows some examples of visualizations we built from the data source.

This dashboard presents us with two main areas for cost optimization:

  • Usage analysis – We can see how analyses and dashboards are being consumed by users. This area highlights an opportunity for cost savings by looking at datasets that have not been used in any analysis for the last 90 days but are still holding a major chunk of SPICE capacity.
  • Account governance – Because author subscriptions are charged on a fixed fee basis, it’s important to monitor if they are actively used. The dashboard helps us identify idle authors for the last 60 days.

Based on the information in the dashboard, we could, for example, remove unused datasets to reclaim SPICE capacity and review idle author subscriptions to save costs.

Conclusion

In this post, we showed how you can use CloudTrail logs to review the use of QuickSight objects, including analyses, dashboards, datasets, and users. You can use the information available in the dashboards to save money on storage and subscriptions, understand the maturity of QuickSight adoption, and more.


About the Author

Sunil Salunkhe is a Senior Solutions Architect working with Strategic Accounts on their vision to leverage the cloud to drive aggressive growth strategies. He practices customer obsession by solving their complex challenges across all aspects of the cloud journey, including scale, security, and reliability. When not working, he enjoys playing cricket and cycling with his wife and son.