Tag Archives: Intermediate (200)

How to use multiple instances of AWS IAM Identity Center

2023-11-20 Laura Reith

Post Syndicated from Laura Reith original https://aws.amazon.com/blogs/security/how-to-use-multiple-instances-of-aws-iam-identity-center/

Recently, AWS launched a new feature that allows deployment of account instances of AWS IAM Identity Center . With this launch, you can now have two types of IAM Identity Center instances: organization instances and account instances. An organization instance is the IAM Identity Center instance that’s enabled in the management account of your organization created with AWS Organizations. This instance is used to manage access to AWS accounts and applications across your entire organization. Organization instances are the best practice when deploying IAM Identity Center. Many customers have requested a way to enable AWS applications using test or sandbox identities. The new account instances are intended to support sand-boxed deployments of AWS managed applications such as Amazon CodeCatalyst and are only usable from within the account and AWS Region in which they were created. They can exist in a standalone account or in a member account within AWS Organizations.

In this blog post, we show you when to use each instance type, how to control the deployment of account instances, and how you can monitor, manage, and audit these instances at scale using the enhanced IAM Identity Center APIs.

IAM Identity Center instance types

IAM Identity Center now offers two deployment types, the traditional organization instance and an account instance, shown in Figure 1. In this section, we show you the differences between the two.

Figure 1: IAM Identity Center instance types

Organization instance of IAM Identity Center

An organization instance of IAM Identity Center is the fully featured version that’s available with AWS Organizations. This type of instance helps you securely create or connect your workforce identities and manage their access centrally across AWS accounts and applications in your organization. The recommended use of an organization instance of Identity Center is for workforce authentication and authorization on AWS for organizations of any size and type.

Using the organization instance of IAM Identity Center, your identity center administrator can create and manage user identities in the Identity Center directory, or connect your existing identity source, including Microsoft Active Directory, Okta, Ping Identity, JumpCloud, Google Workspace, and Azure Active Directory (Entra ID). There is only one organization instance of IAM Identity Center at the organization level. If you have enabled IAM Identity Center before November 15, 2023, you have an organization instance.

Account instances of IAM Identity Center

Account instances of IAM Identity Center provide a subset of the features of the organization instance. Specifically, account instances support user and group assignments initially only to Amazon CodeCatalyst. They are bound to a single AWS account, and you can deploy them in either member accounts of an organization or in standalone AWS accounts. You can only deploy one account instance per AWS account regardless of Region.

You can use account instances of IAM Identity Center to provide access to supported Identity Center enabled application if the application is in the same account and Region.

Account instances of Identity Center don’t support permission sets or assignments to customer managed applications. If you’ve enabled Identity Center before November 15, 2023 then you must enable account instance creation from your management account. To learn more see Enable account instances in the AWS Management Console documentation. If you haven’t yet enabled Identity Center, then account instances are now available to you.

When should I use account instances of IAM Identity Center?

Account instances are intended for use in specific situations where organization instances are unavailable or impractical, including:

You want to run a temporary trial of a supported AWS managed application to determine if it suits your business needs. See Additional Considerations.
You are unable to deploy IAM Identity Center across your organization, but still want to experiment with one or more AWS managed applications. See Additional Considerations.
You have an organization instance of IAM Identity Center, but you want to deploy a supported AWS managed application to an isolated set of users that are distinct from those in your organization instance.

Additional considerations

When working with multiple instances of IAM Identity Center, you want to keep a number of things in mind:

Each instance of IAM Identity Center is separate and distinct from other Identity Center instances. That is, users and assignments are managed separately in each instance without a means to keep them in sync.
Migration between instances isn’t possible. This means that migrating an application between instances requires setting up that application from scratch in the new instance.
Account instances have the same considerations when changing your identity source as an organization instance. In general, you want to set up with the right identity source before adding assignments.
Automating assigning users to applications through the IAM Identity Center public APIs also requires using the applications APIs to ensure that those users and groups have the right permissions within the application. For example, if you assign groups to CodeCatalyst using Identity Center, you still have to assign the groups to the CodeCatalyst space from the Amazon CodeCatalyst page in the AWS Management Console. See the Setting up a space that supports identity federation documentation.
By default, account instances require newly added users to register a multi-factor authentication (MFA) device when they first sign in. This can be altered in the AWS Management Console for Identity Center for a specific instance.

Controlling IAM Identity Center instance deployments

If you’ve enabled IAM Identity Center prior to November 15, 2023 then account instance creation is off by default. If you want to allow account instance creation, you must enable this feature from the Identity Center console in your organization’s management account. This includes scenarios where you’re using IAM Identity Center centrally and want to allow deployment and management of account instances. See Enable account instances in the AWS Management Console documentation.

If you enable IAM Identity Center after November 15, 2023 or if you haven’t enabled Identity Center at all, you can control the creation of account instances of Identity Center through a service control policy (SCP). We recommend applying the following sample policy to restrict the use of account instances to all but a select set of AWS accounts. The sample SCP that follows will help you deny creation of account instances of Identity Center to accounts in the organization unless the account ID matches the one you specified in the policy. Replace <ALLOWED-ACCOUNT_ID> with the ID of the account that is allowed to create account instances of Identity Center:

{
    "Version": "2012-10-17",
    "Statement" : [
        {
            "Sid": "DenyCreateAccountInstances",
            "Effect": "Deny",
            "Action": [
                "sso:CreateInstance"
            ],
            "Resource": "*",
            "Condition": {
                "StringNotEquals": [
                    "aws:PrincipalAccount": ["<ALLOWED-ACCOUNT-ID>"]
                ]
            }
        }
    ]
}

To learn more about SCPs, see the AWS Organizations User Guide on service control policies.

Monitoring instance activity with AWS CloudTrail

If your organization has an existing log ingestion pipeline solution to collect logs and generate reports through AWS CloudTrail, then IAM Identity Center supported CloudTrail operations will automatically be present in your pipeline, including additional account instances of IAM Identity Center actions such as sso:CreateInstance.

To create a monitoring solution for IAM Identity Center events in your organization, you should set up monitoring through AWS CloudTrail. CloudTrail is a service that records events from AWS services to facilitate monitoring activity from those services in your accounts. You can create a CloudTrail trail that captures events across all accounts and all Regions in your organization and persists them to Amazon Simple Storage Service (Amazon S3).

After creating a trail for your organization, you can use it in several ways. You can send events to Amazon CloudWatch Logs and set up monitoring and alarms for Identity Center events, which enables immediate notification of supported IAM Identity Center CloudTrail operations. With multiple instances of Identity Center deployed within your organization, you can also enable notification of instance activity, including new instance creation, deletion, application registration, user authentication, or other supported actions.

If you want to take action on IAM Identity Center events, you can create a solution to process events using additional service such as Amazon Simple Notification Service, Amazon Simple Queue Service, and the CloudTrail Processing Library. With this solution, you can set your own business logic and rules as appropriate.

Additionally, you might want to consider AWS CloudTrail Lake, which provides a powerful data store that allows you to query CloudTrail events without needing to manage a complex data loading pipeline. You can quickly create a data store for new events, which will immediately start gathering data that can be queried within minutes. To analyze historical data, you can copy your organization trail to CloudTrail Lake.

The following is an example of a simple query that shows you a list of the Identity Center instances created and deleted, the account where they were created, and the user that created them. Replace <Event_data_store_ID> with your store ID.

SELECT 
    userIdentity.arn AS userARN, eventName, userIdentity.accountId 
FROM 
    <Event_data_store_ID> 
WHERE 
    userIdentity.arn IS NOT NULL 
    AND eventName = 'DeleteInstance' 
    OR eventName = 'CreateInstance'

You can save your query result to an S3 bucket and download a copy of the results in CSV format. To learn more, follow the steps in Download your CloudTrail Lake saved query results. Figure 2 shows the CloudTrail Lake query results.

Figure 2: AWS CloudTrail Lake query results

If you want to automate the sourcing, aggregation, normalization, and data management of security data across your organization using the Open Cyber Security Framework (OCSF) standard, you will benefit from using Amazon Security Lake. This service helps make your organization’s security data broadly accessible to your preferred security analytics solutions to power use cases such like threat detection, investigation, and incident response. Learn more in What is Amazon Security Lake?

Instance management and discovery within an organization

You can create account instances of IAM Identity Center in a standalone account or in an account that belongs to your organization. Creation can happen from an API call (CreateInstance) from the Identity Center console in a member account or from the setup experience of a supported AWS managed application. Learn more about Supported AWS managed applications.

If you decide to apply the DenyCreateAccountInstances SCP shown earlier to accounts in your organization, you will no longer be able to create account instances of IAM Identity Center in those accounts. However, you should also consider that when you invite a standalone AWS account to join your organization, the account might have an existing account instance of Identity Center.

To identify existing instances, who’s using them, and what they’re using them for, you can audit your organization to search for new instances. The following script shows how to discover all IAM Identity Center instances in your organization and export a .csv summary to an S3 bucket. This script is designed to run on the account where Identity Center was enabled. Click here to see instructions on how to use this script.

. . .
. . .
accounts_and_instances_dict={}
duplicated_users ={}

main_session = boto3.session.Session()
sso_admin_client = main_session.client('sso-admin')
identity_store_client = main_session.client('identitystore')
organizations_client = main_session.client('organizations')
s3_client = boto3.client('s3')
logger = logging.getLogger()
logger.setLevel(logging.INFO)

#create function to list all Identity Center instances in your organization
def lambda_handler(event, context):
    application_assignment = []
    user_dict={}
    
    current_account = os.environ['CurrentAccountId']
 
    logger.info("Current account %s", current_account)
    
    paginator = organizations_client.get_paginator('list_accounts')
    page_iterator = paginator.paginate()
    for page in page_iterator:
        for account in page['Accounts']:
            get_credentials(account['Id'],current_account)
            #get all instances per account - returns dictionary of instance id and instances ARN per account
            accounts_and_instances_dict = get_accounts_and_instances(account['Id'], current_account)
                    
def get_accounts_and_instances(account_id, current_account):
    global accounts_and_instances_dict
    
    instance_paginator = sso_admin_client.get_paginator('list_instances')
    instance_page_iterator = instance_paginator.paginate()
    for page in instance_page_iterator:
        for instance in page['Instances']:
            #send back all instances and identity centers
            if account_id == current_account:
                accounts_and_instances_dict = {current_account:[instance['IdentityStoreId'],instance['InstanceArn']]}
            elif instance['OwnerAccountId'] != current_account: 
                accounts_and_instances_dict[account_id]= ([instance['IdentityStoreId'],instance['InstanceArn']])
    return accounts_and_instances_dict
  . . .  
  . . .
  . . .

The following table shows the resulting IAM Identity Center instance summary report with all of the accounts in your organization and their corresponding Identity Center instances.

AccountId	IdentityCenterInstance
111122223333	d-111122223333
111122224444	d-111122223333
111122221111	d-111111111111

Duplicate user detection across multiple instances

A consideration of having multiple IAM Identity Center instances is the possibility of having the same person existing in two or more instances. In this situation, each instance creates a unique identifier for the same person and the identifier associates application-related data to the user. Create a user management process for incoming and outgoing users that is similar to the process you use at the organization level. For example, if a user leaves your organization, you need to revoke access in all Identity Center instances where that user exists.

The code that follows can be added to the previous script to help detect where duplicates might exist so you can take appropriate action. If you find a lot of duplication across account instances, you should consider adopting an organization instance to reduce your management overhead.

...
#determine if the member in IdentityStore have duplicate
def get_users(identityStoreId, user_dict): 
    global duplicated_users
    paginator = identity_store_client.get_paginator('list_users')
    page_iterator = paginator.paginate(IdentityStoreId=identityStoreId)
    for page in page_iterator:
        for user in page['Users']:
            if ( 'Emails' not in user ):
                print("user has no email")
            else:
                for email in user['Emails']:
                    if email['Value'] not in user_dict:
                        user_dict[email['Value']] = identityStoreId
                    else:
                        print("Duplicate user found " + user['UserName'])
                        user_dict[email['Value']] = user_dict[email['Value']] + "," + identityStoreId
                        duplicated_users[email['Value']] = user_dict[email['Value']]
    return user_dict 
...

The following table shows the resulting report with duplicated users in your organization and their corresponding IAM identity Center instances.

User_email	IdentityStoreId
[email protected]	d-111122223333, d-111111111111
[email protected]	d-111122223333, d-111111111111, d-222222222222
[email protected]	d-111111111111, d-222222222222

The full script for all of the above use cases is available in the multiple-instance-management-iam-identity-center GitHub repository. The repository includes instructions to deploy the script using AWS Lambda within the management account. After deployment, you can invoke the Lambda function to get .csv files of every IAM Identity center instance in your organization, the applications assigned to each instance, and the users that have access to those applications. With this function, you also get a report of users that exist in more than one local instance.

Conclusion

In this post, you learned the differences between an IAM Identity Center organization instance and an account instance, considerations for when to use an account instance, and how to use Identity Center APIs to automate discovery of Identity Center account instances in your organization.

To learn more about IAM Identity Center, see the AWS IAM Identity Center user guide.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on AWS IAM Identity Center re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Download AWS Security Hub CSV report

2023-11-17 Pablo Pagani

Post Syndicated from Pablo Pagani original https://aws.amazon.com/blogs/security/download-aws-security-hub-csv-report/

AWS Security Hub provides a comprehensive view of your security posture in Amazon Web Services (AWS) and helps you check your environment against security standards and best practices. In this post, I show you a solution to export Security Hub findings to a .csv file weekly and send an email notification to download the file from Amazon Simple Storage Service (Amazon S3). By using this solution, you can share the report with others without providing access to your AWS account. You can also use it to generate assessment reports and prioritize and build a remediation roadmap.

When you enable Security Hub, it collects and consolidates findings from AWS security services that you’re using, such as threat detection findings from Amazon GuardDuty, vulnerability scans from Amazon Inspector, S3 bucket policy findings from Amazon Macie, publicly accessible and cross-account resources from AWS Identity and Access Management Access Analyzer, and resources missing AWS WAF coverage from AWS Firewall Manager. Security Hub also consolidates findings from integrated AWS Partner Network (APN) security solutions.

Cloud security processes can differ from traditional on-premises security in that security is often decentralized in the cloud. With traditional on-premises security operations, security alerts are typically routed to centralized security teams operating out of security operations centers (SOCs). With cloud security operations, it’s often the application builders or DevOps engineers who are best situated to triage, investigate, and remediate security alerts.

This solution uses the Security Hub API, AWS Lambda, Amazon S3, and Amazon Simple Notification Service (Amazon SNS). Findings are aggregated into a .csv file to help identify common security issues that might require remediation action.

Solution overview

This solution assumes that Security Hub is enabled in your AWS account. If it isn’t enabled, set up the service so that you can start seeing a comprehensive view of security findings across your AWS accounts.

How the solution works

An Amazon EventBridge time-based event invokes a Lambda function for processing.
The Lambda function gets finding results from the Security Hub API and writes them into a .csv file.
The API uploads the file into Amazon S3 and generates a presigned URL with a 24-hour duration, or the duration of the temporary credential used in Lambda, whichever ends first.
Amazon SNS sends an email notification to the address provided during deployment. This email address can be updated afterwards through the Amazon SNS console.
The email includes a link to download the file.

Figure 1: Solution overview, deployed through AWS CloudFormation

Fields included in the report:

Note: You can extend the report by modifying the Lambda function to add fields as needed.

Solution resources

The solution provided with this blog post consists of an AWS CloudFormation template named security-hub-full-report-email.json that deploys the following resources:

An Amazon SNS topic named SecurityHubRecurringFullReport and an email subscription to the topic.

Figure 2: SNS topic created by the solution
The email address that subscribes to the topic is captured through a CloudFormation template input parameter. The subscriber is notified by email to confirm the subscription. After confirmation, the subscription to the SNS topic is created. Additional subscriptions can be added as needed to include additional emails or distribution lists.

Figure 3: SNS email subscription
The SendSecurityHubFullReportEmail Lambda function queries the Security Hub API to get findings into a .csv file that’s written to Amazon S3. A pre-authenticated link to the file is generated and sends the email message to the SNS topic described above.

Figure 4: Lambda function created by the solution
An IAM role for the Lambda function to be able to create logs in CloudWatch, get findings from Security Hub, publish messages to SNS, and put objects into an S3 bucket.

Figure 5: Permissions policy for the Lambda function
An EventBridge rule that runs on a schedule named SecurityHubFullReportEmailSchedule used to invoke the Lambda function that generates the findings report. The default schedule is every Monday at 8:00 AM UTC. This schedule can be overwritten by using a CloudFormation input parameter. Learn more about creating cron expressions.

Figure 6: Example of the EventBridge schedule created by the solution

Deploy the solution

Use the following steps to deploy this solution in a single AWS account. If you have a Security Hub administrator account or are using Security Hub cross-Region aggregation, the report will get the findings from the linked AWS accounts and Regions.

To deploy the solution

Download the CloudFormation template security-hub-full-report-email.json from our GitHub repository.
Copy the template to an S3 bucket within your target AWS account and Region. Copy the object URL for the CloudFormation template .json file.
On the AWS Management Console, go to the CloudFormation console. Choose Create Stack and select With new resources.

Figure 7: Create stack with new resources
Under Specify template, in the Amazon S3 URL textbox, enter the S3 object URL for the .json file that you uploaded in step 1.

Figure 8: Specify S3 URL for CloudFormation template
Choose Next. On the next page, do the following:
1. Stack name: Enter a name for the stack.
2. Email address: Enter the email address of the subscriber to the Security Hub findings email.
3. RecurringScheduleCron: Enter the cron expression for scheduling the Security Hub findings email. The default is every Monday at 8:00 AM UTC. Learn more about creating cron expressions.
4. SecurityHubRegion: Enter the Region where Security Hub is aggregating the findings.
Figure 9: Enter stack name and parameters
Choose Next.
Keep all defaults in the screens that follow and choose Next.
Check the box I acknowledge that AWS CloudFormation might create IAM resources, and then choose Create stack.

Test the solution

You can send a test email after the deployment is complete. To do this, open the Lambda console and locate the SendSecurityHubFullReportEmail Lambda function. Perform a manual invocation with an event payload to receive an email within a few minutes. You can repeat this procedure as many times as you want.

Conclusion

In this post I’ve shown you an approach for rapidly building a solution for sending weekly findings report of the security posture of your AWS account as evaluated by Security Hub. This solution helps you to be diligent in reviewing outstanding findings and to remediate findings in a timely way based on their severity. You can extend the solution in many ways, including:

Send a file to an email-enabled ticketing service, such as ServiceNow or another security information and event management (SIEM) that you use.
Add links to internal wikis for workflows such as organizational exceptions to vulnerabilities or other internal processes.
Extend the solution by modifying the filters, email content, and delivery frequency.

To learn more about how to set up and customize Security Hub, see these additional blog posts.

If you have feedback about this post, submit comments in the Comments section below. If you have any questions about this post, start a thread on the AWS Security Hub re:Post forum.

Want more AWS Security news? Follow us on Twitter.

Speed up queries with the cost-based optimizer in Amazon Athena

2023-11-17 Darshit Thakkar

Post Syndicated from Darshit Thakkar original https://aws.amazon.com/blogs/big-data/speed-up-queries-with-cost-based-optimizer-in-amazon-athena/

Amazon Athena is a serverless, interactive analytics service built on open source frameworks, supporting open table file formats. Athena provides a simplified, flexible way to analyze petabytes of data where it lives. You can analyze data or build applications from an Amazon Simple Storage Service (Amazon S3) data lake and 30 data sources, including on-premises data sources or other cloud systems using SQL or Python. Athena is built on open source Trino and Presto engines and Apache Spark frameworks, with no provisioning or configuration effort required.

Starting today, the Athena SQL engine uses a cost-based optimizer (CBO), a new feature that uses table and column statistics stored in the AWS Glue Data Catalog as part of the table’s metadata. By using these statistics, CBO improves query run plans and boosts the performance of queries run in Athena. Some of the specific optimizations CBO can employ include join reordering and pushing aggregations down based on the statistics available for each table and column.

TPC-DS benchmarks These benchmarks demonstrate the power of the cost-based optimizer—queries run up to 2x times faster with CBO enabled compared to running the same TPC-DS queries without CBO.

Performance and cost comparison on TPC-DS benchmarks

We used the industry-standard TPC-DS 3 TB to represent different customer use cases. These are representative of workloads with 10 times the stated benchmark size. This means a 3 TB benchmark dataset accurately represents customer workloads on 30–50 TB datasets.

In our testing, the dataset was stored in Amazon S3 in non-compressed Parquet format and the AWS Glue Data Catalog was used to store metadata for databases and tables. Fact tables were partitioned on the date column used for join operations, and each fact table consisted of 2,000 partitions. To help illustrate the performance of CBO, we compare the behavior of various queries and highlight the performance differences between running with CBO enabled vs. disabled.

The following graph illustrates the runtime of queries on the engine with and without CBO.

The following graph presents the top 10 queries from the TPC-DS benchmark with the greatest performance improvement.

Let’s discuss some of the cost-based optimization techniques that contributed to improved query performance.

Cost-based join reordering

Join reordering, an optimization technique used by cost-based SQL optimizers, analyzes different join sequences to select the order that minimizes query runtime by reducing intermediate data processed at each step, lowering memory and CPU requirements.

Let’s talk about query 82 of the TPC-DS 3TB dataset. The query performs inner joins on four tables: item, inventory, date_dim, and store_sales. The store_sales table has 8.6 billion rows and is partitioned by date. The inventory table has 1 billion rows and is also partitioned by date. The item table contains 360,000 rows, and the date_dim table holds 73,000 rows.

Query 82

select  i_item_id ,i_item_desc ,i_current_price
from item, inventory, date_dim, store_sales
where i_current_price between 30 and 30+30
and inv_item_sk = i_item_sk
and d_date_sk=inv_date_sk
and cast(d_date as date) between cast('2002-05-30' as date) and (cast('2002-05-30' as date) +  interval '60' day)
and i_manufact_id in (437,129,727,663)
and inv_quantity_on_hand between 100 and 500
and ss_item_sk = i_item_sk
group by i_item_id,i_item_desc,i_current_price
order by i_item_id
limit 100

Without CBO

Without using CBO, the engine will determine the join order based on the sequence of tables defined in the input query with internal heuristics. The FROM clause of the input query is "from item, inventory, date_dim, store_sales" (all inner joins). After passing through internal heuristics, Athena chose the join order as ((item ⋈ (inventory ⋈ date_dim)) ⋈ store_sales). Despite store_sales being the largest fact table, it’s defined last in the FROM clause and therefore gets joined last. This plan fails to reduce the intermediate join sizes as early as possible, resulting in an increased query runtime. The following diagram shows the join order without CBO and the number of rows flowing through different stages.

With CBO

When using CBO, the optimizer determines the best join order using a variety of data, including statistics as well as join size estimation, join build side, and join type. In this instance, Athena’s selected join order is ((store_sales ⋈ item) ⋈ (inventory ⋈ date_dim)). The largest fact table, store_sales, without being shuffled, is first joined with the item dimension table. The other partitioned table, inventory, is also first joined in-place with the date_dim dimension table. The join with the dimension table acts as a filter on the fact table, which dramatically reduces the input data size of the join that follows. Note that which side a table resides for a join is significant in Athena, because it’s the table on the right that will be built into memory for the join operation. Therefore, we always want to keep the larger table on the left and the smaller table on the right. CBO chose a plan that the left side was 8.6 billion before, and now it’s 13.6 million.

With CBO, the query runtime improved by 25% (from 15 seconds down to 11 seconds) by choosing the optimal join order.

Next, let’s discuss another CBO technique.

Cost-based aggregation pushdown

Aggregation pushdown is an optimization technique used by query optimizers to improve performance. It involves pushing aggregation operations like SUM, COUNT, and AVG into an earlier stage in the query plan, while maintaining the same query semantics. This reduces the amount of data transferred between the stages. By minimizing data processing, aggregation pushdown decreases memory usage, I/O costs, and network traffic.

However, pushing down aggregation is not always beneficial. It depends on the data distribution. For example, grouping on a column with many rows but few distinct values (like gender) before joins works better. Grouping first means aggregating a large number of records into fewer records (just male, female, for example). Grouping after joining means a large number of records have to participate the join before being aggregated. On the other hand, grouping on a high cardinality column is better done after joins. Doing it before risks unnecessary aggregation overhead because each value is likely unique anyway and that step will not result in an earlier reduction in the amount of data transferred between intermediate stages.

Therefore, whether to push down aggregation should be a cost-based decision. Let’s take example of the query 2 run on a 3TB TPC-DS dataset, showing how the aggregation pushdown’s value depends on data distribution. The web_sales table has 2.1 billion rows and the catalog_sales table has 4.23 billion rows. Both tables are partitioned on the date column.

Query 2

with wscs as
 (select sold_date_sk
        ,sales_price
  from (select ws_sold_date_sk sold_date_sk
              ,ws_ext_sales_price sales_price
        from web_sales 
        union all
        select cs_sold_date_sk sold_date_sk
              ,cs_ext_sales_price sales_price
        from catalog_sales)),
 wswscs as 
 (select d_week_seq,
        sum(case when (d_day_name='Sunday') then sales_price else null end) sun_sales,
        sum(case when (d_day_name='Monday') then sales_price else null end) mon_sales,
        sum(case when (d_day_name='Tuesday') then sales_price else  null end) tue_sales,
        sum(case when (d_day_name='Wednesday') then sales_price else null end) wed_sales,
        sum(case when (d_day_name='Thursday') then sales_price else null end) thu_sales,
        sum(case when (d_day_name='Friday') then sales_price else null end) fri_sales,
        sum(case when (d_day_name='Saturday') then sales_price else null end) sat_sales
 from wscs
     ,date_dim
 where d_date_sk = sold_date_sk
 group by d_week_seq)
 select d_week_seq1
       ,round(sun_sales1/sun_sales2,2)
       ,round(mon_sales1/mon_sales2,2)
       ,round(tue_sales1/tue_sales2,2)
       ,round(wed_sales1/wed_sales2,2)
       ,round(thu_sales1/thu_sales2,2)
       ,round(fri_sales1/fri_sales2,2)
       ,round(sat_sales1/sat_sales2,2)
 from
 (select wswscs.d_week_seq d_week_seq1
        ,sun_sales sun_sales1
        ,mon_sales mon_sales1
        ,tue_sales tue_sales1
        ,wed_sales wed_sales1
        ,thu_sales thu_sales1
        ,fri_sales fri_sales1
        ,sat_sales sat_sales1
  from wswscs,date_dim 
  where date_dim.d_week_seq = wswscs.d_week_seq and
        d_year = 2001) y,
 (select wswscs.d_week_seq d_week_seq2
        ,sun_sales sun_sales2
        ,mon_sales mon_sales2
        ,tue_sales tue_sales2
        ,wed_sales wed_sales2
        ,thu_sales thu_sales2
        ,fri_sales fri_sales2
        ,sat_sales sat_sales2
  from wswscs
      ,date_dim 
  where date_dim.d_week_seq = wswscs.d_week_seq and
        d_year = 2001+1) z
 where d_week_seq1=d_week_seq2-53
 order by d_week_seq1

Without CBO

Athena first joins the result of the union all operation on the web_sales table and the catalog_sales table with another table. Only then does it perform aggregation on the joined results. In this example, the amount of data that needed to be joined was huge, resulting in a longer query runtime.

With CBO

Athena utilizes one of the statistics values, the distinct value count, to evaluate the cost implications of pushing down the aggregation vs. not doing so. When a column has many rows but few distinct values, CBO is more likely to push aggregation down. This shrank the qualified rows from web_sales and catalog_sales tables to 2,590 and 3,590 rows, respectively. These aggregated records were then unioned and used to join with the tables. Comparing to the plan without CBO, the records participating in the join from the two large tables dropped from 6.33 billion rows (2.1 billion + 4.23 billion) to just 6,180 rows (2,590 + 3,590). This significantly decreased query runtime.

With CBO, the query runtime improved by 50% (from 37 seconds down to 18 seconds). In summary, CBO helped Athena choose an optimal aggregation pushdown plan, cutting the query time in half compared to not using cost-based optimization.

Conclusion

In this post, we discussed how Athena uses a cost-based optimizer (CBO) in its engine v3 to use table statistics for generating more efficient query run plans. Testing on the TPC-DS benchmark showed an 11% improvement in overall query performance when using CBO compared to without it.

Two key optimization employed by CBO are join reordering and aggregate pushdown. Join reordering reduces intermediate data by intelligently picking the order to join tables based on statistics. Aggregate pushdown decreases intermediate data by pushing aggregations earlier in the plan when beneficial.

In summary, Athena’s new cost-based optimizer significantly speeds up queries by choosing superior run plans. CBO optimizes based on table statistics stored in the AWS Glue Data Catalog. This automatic optimization improves productivity for Athena users through more responsive query performance. To take advantage of optimization techniques of CBO, refer to working with column statistics to generate statistics on the tables and columns in the AWS Glue Data Catalog.

About the Authors

Darshit Thakkar is a Technical Product Manager with AWS and works with the Amazon Athena team based out of Boston, Massachusetts.

Wei Zheng is a Sr. Software Development Engineer with Amazon Athena. He joined AWS in 2021 and has been working on multiple performance improvements on Athena.

Chuho Chang is a Software Development Engineer with Amazon Athena. He has been working on query optimizers for over a decade.

Pathik Shah is a Sr. Analytics Architect on Amazon Athena. He joined AWS in 2015 and has been focusing in the big data analytics space since then, helping customers build scalable and robust solutions using AWS analytics services.

Implement an early feedback loop with AWS developer tools to shift security left

2023-11-17 Barry Conway

Post Syndicated from Barry Conway original https://aws.amazon.com/blogs/security/implement-an-early-feedback-loop-with-aws-developer-tools-to-shift-security-left/

Early-feedback loops exist to provide developers with ongoing feedback through automated checks. This enables developers to take early remedial action while increasing the efficiency of the code review process and, in turn, their productivity.

Early-feedback loops help provide confidence to reviewers that fundamental security and compliance requirements were validated before review. As part of this process, common expectations of code standards and quality can be established, while shifting governance mechanisms to the left.

In this post, we will show you how to use AWS developer tools to implement a shift-left approach to security that empowers your developers with early feedback loops within their development practices. You will use AWS CodeCommit to securely host Git repositories, AWS CodePipeline to automate continuous delivery pipelines, AWS CodeBuild to build and test code, and Amazon CodeGuru Reviewer to detect potential code defects.

Why the shift-left approach is important

Developers today are an integral part of organizations, building and maintaining the most critical customer-facing applications. Developers must have the knowledge, tools, and processes in place to help them identify potential security issues before they release a product to production.

This is why the shift-left approach is important. Shift left is the process of checking for vulnerabilities and issues in the earlier stages of software development. By following the shift-left process (which should be part of a wider application security review and threat modelling process), software teams can help prevent undetected security issues when they build an application. The modern DevSecOps workflow continues to shift left towards the developer and their practices with the aim to achieve the following:

Drive accountability among developers for the security of their code
Empower development teams to remediate issues up front and at their own pace
Improve risk management by enabling early visibility of potential security issues through early feedback loops

You can use AWS developer tools to help provide this continual early feedback for developers upon each commit of code.

Solution prerequisites

To follow along with this solution, make sure that you have the following prerequisites in place:

An AWS account
Access to these AWS services:

Make sure that you have a general working knowledge of the listed services and DevOps practices.

Solution overview

The following diagram illustrates the architecture of the solution.

Figure 1: Solution overview

We will show you how to set up a continuous integration and continuous delivery (CI/CD) pipeline by using AWS developer tools—CodeCommit, CodePipeline, CodeBuild, and CodeGuru—that you will integrate with the code repository to detect code security vulnerabilities. As shown in Figure 1, the solution has the following steps:

The developer commits the new branch into the code repository.
The developer creates a pull request to the main branch.
Pull requests initiate two jobs: an Amazon CodeGuru Reviewer code scan and a CodeBuild job.
1. CodeGuru Reviewer uses program analysis and machine learning to help detect potential defects in your Java and Python code, and provides recommendations to improve the code. CodeGuru Reviewer helps detect security vulnerabilities, secrets, resource leaks, concurrency issues, incorrect input validation, and deviation from best practices for using AWS APIs and SDKs.
2. You can configure the CodeBuild deployment with third-party tools, such as Bandit for Python to help detect security issues in your Python code.
CodeGuru Reviewer or CodeBuild writes back the findings of the code scans to the pull request to provide a single common place for developers to review the findings that are relevant to their specific code updates.

The following table presents some other tools that you can integrate into the early-feedback toolchain, depending on the type of code or artefacts that you are evaluating:

Early feedback – security tools	Usage	License
cfn-guard , cfn-nag , cfn-lint	Infrastructure linting and validation	cfn-guard license, cfn-nag license, cfn-lint license
CodeGuru, Bandit	Python	Bandit license
CodeGuru	Java
npm-audit, Dependabot	npm libraries	Dependabot license

When you deploy the solution in your AWS account, you can review how Bandit for Python has been built into the deployment pipeline by using AWS CodeBuild with a configured buildspec file, as shown in Figure 2. You can implement the other tools in the table by using a similar approach.

Figure 2: Bandit configured in CodeBuild

Walkthrough

To deploy the solution, you will complete the following steps:

Deploy the solution

The first step is to deploy the required resources into your AWS environment by using CloudFormation.

To deploy the solution

Choose the following Launch Stack button to deploy the solution’s CloudFormation template:

The solution deploys in the AWS US East (N. Virginia) Region (us-east-1) by default because each service listed in the Prerequisites section is available in this Region. To deploy the solution in a different Region, use the Region selector in the console navigation bar and make sure that the services required for this walkthrough are supported in your newly selected Region. For service availability by Region, see AWS Services by Region.
On the Quick Create Stack screen, do the following:
1. Leave the provided parameter defaults in place.
2. Scroll to the bottom, and in the Capabilities section, select I acknowledge that AWS CloudFormation might create IAM resources with custom names.
3. Choose Create Stack.
When the CloudFormation template has completed, open the AWS Cloud9 console.
In the Environments table, for the provisioned shift-left-blog-cloud9-ide environment, choose Open, as shown in Figure 3.

Figure 3: Cloud9 environments
The provisioned Cloud9 environment opens in a new tab. Wait for Cloud9 to initialize the two sample code repositories: shift-left-sample-app-java and shift-left-sample-app-python, as shown in Figure 4. For this post, you will work only with the Python sample repository shift-left-sample-app-python, but the procedures we outline will also work for the Java repository.

Figure 4: Cloud9 IDE

Associate CodeGuru Reviewer with a code repository

The next step is to associate the Python code repository with CodeGuru Reviewer. After you associate the repository, CodeGuru Reviewer analyzes and comments on issues that it finds when you create a pull request.

To associate CodeGuru Reviewer with a repository

Open the CodeGuru console, and in the left navigation pane, under Reviewer, choose Repositories.
In the Repositories section, choose Associate repository and run analysis.
In the Associate repository section, do the following:
1. For Select source provider, select AWS CodeCommit.
2. For Repository location,select shift-left-sample-app-python.
In the Run a repository analysis section, do the following, as shown in Figure 5:
1. For Source branch, select main.
2. For Code review name – optional, enter a name.
3. For Tags – optional, leave the default settings.
4. Choose Associate repository and run analysis.
  
  Figure 5: CodeGuru repository configuration
CodeGuru initiates the Full repository analysis and the status is Pending, as shown in Figure 6. The full analysis takes about 5 minutes to complete. Wait for the status to change from Pending to Completed.

Figure 6: CodeGuru full analysis pending

Create a pull request

The next step is to create a new branch and to push sample code to the repository by creating a pull request so that the code scan can be initiated by CodeGuru Reviewer and the CodeBuild job.

To create a new branch

In the Cloud9 IDE, locate the terminal and create a new branch by running the following commands.
```
cd ~/environment/shift-left-sample-app-python
git checkout -b python-test
```
Confirm that you are working from the new branch, which will be highlighted in the Cloud9 IDE terminal, as shown in Figure 7.
```
git branch -v
```
Figure 7: Cloud9 IDE terminal

To create a new file and push it to the code repository

Create a new file called sample.py.
```
touch sample.py
```
Copy the following sample code, paste it into the sample.py file, and save the changes, as shown in Figure 8.
```
import requests

data = requests.get("https://www.example.org/", verify = False)
print(data.status_code)
```
Figure 8: Cloud9 IDE noncompliant code
Commit the changes to the repository.
```
git status
git add -A
git commit -m "shift left blog python sample app update"
```
Note: if you receive a message to set your name and email address, you can ignore it because Git will automatically set these for you, and the Git commit will complete successfully.
Push the changes to the code repository, as shown in Figure 9.
```
git push origin python-test
```
Figure 9: Git push

To create a new pull request

Open the CodeCommit console and select the code repository called shift-left-sample-app-python.
From the Branches dropdown, select the new branch that you created and pushed, as shown in Figure 10.

Figure 10: CodeCommit branch selection
In your new branch, select the file sample.py, confirm that the file has the changes that you made, and then choose Create pull request, as shown in Figure 11.

Figure 11: CodeCommit pull request

A notification appears stating that the new code updates can be merged.
In the Source dropdown, choose the new branch python-test. In the Destination dropdown, choose the main branch where you intend to merge your code changes when the pull request is closed.
To have CodeCommit run a comparison between the main branch and your new branch python-test, choose Compare. To see the differences between the two branches, choose the Changes tab at the bottom of the page. CodeCommit also assesses whether the two branches can be merged automatically when the pull request is closed.
When you’re satisfied with the comparison results for the pull request, enter a Title and an optional Description, and then choose Create pull request. Your pull request appears in the list of pull requests for the CodeCommit repository, as shown in Figure 12.

Figure 12: Pull request

The creation of this pull request has automatically started two separate code scans. The first is a CodeGuru incremental code review and the second uses CodeBuild, which utilizes Bandit to perform a security code scan of the Python code.

Review code scan results and resolve detected security vulnerabilities

The next step is to review the code scan results to identify security vulnerabilities and the recommendations on how to fix them.

To review the code scan results

Open the CodeGuru console, and in the left navigation pane, under Reviewer, select Code reviews.
On the Incremental code reviews tab, make sure that you see a new code review item created for the preceding pull request.

Figure 13: CodeGuru Code review
After a few minutes, when CodeGuru completes the incremental analysis, choose the code review to review the CodeGuru recommendations on the pull request. Figure 14 shows the CodeGuru recommendations for our example.

Figure 14: CodeGuru recommendations
Open the CodeBuild console and select the CodeBuild job called shift-left-blog-pr-Python. In our example, this job should be in a Failed state.

Open the CodeBuild run, and under the Build history tab, select the CodeBuild job, which is in Failed state. Under the Build Logs tab, scroll down until you see the following errors in the logs. Note that the severity of the finding is High, which is why the CodeBuild job failed. You can review the Bandit scanning options in the Bandit documentation.

Test results:
>> Issue: [B501:request_with_no_cert_validation] Call to requests with verify=False disabling SSL certificate checks, security issue.
   Severity: High   Confidence: High
   CWE: CWE-295 (https://cwe.mitre.org/data/definitions/295.html)
   More Info: https://bandit.readthedocs.io/en/1.7.5/plugins/b501_request_with_no_cert_validation.html
   Location: sample.py:3:7

2   
3   data = requests.get("https://www.example.org/", verify = False)
4   print(data.status_code)

Navigate to the CodeCommit console, and on the Activity tab of the pull request, review the CodeGuru recommendations. You can also review the results of the CodeBuild jobs that Bandit performed, as shown in Figure 15.

Figure 15: CodeGuru recommendations and CodeBuild logs

This demonstrates how developers can directly link the relevant information relating to security code scans with their code development and associated pull requests, hence shifting to the left the required security awareness for developers.

To resolve the detected security vulnerabilities

In the Cloud9 IDE, navigate to the file sample.py in the Python sample repository, as shown in Figure 16.

Figure 16: Cloud9 IDE sample.py
Copy the following code and paste it in the sample.py file, overwriting the existing code. Save the update.
```
import requests

data = requests.get("https://www.example.org", timeout=5)
print(data.status_code)
```

Commit the changes by running the following commands.

git status
git add -A
git commit -m "shift left python sample.py resolve security errors"
git push origin python-test

Open the CodeCommit console and choose the Activity tab on the pull request that you created earlier. You will see a banner indicating that the pull request was updated. You will also see new comments indicating that new code scans using CodeGuru and CodeBuild were initiated for the new pull request update.
In the CodeGuru console, on the Incremental code reviews page, check that a new code scan has begun. When the scans are finished, review the results in the CodeGuru console and the CodeBuild build logs, as described previously. The previously detected security vulnerability should now be resolved.
In the CodeCommit console, on the Activity tab, under Activity history, review the comments to verify that each of the code scans has a status of Passing, as shown in Figure 17.

Figure 17: CodeCommit activity history
Now that the security issue has been resolved, merge the pull request into the main branch of the code repository. Choose Merge, and under Merge strategy, select Fast Forward merge.

AWS account clean-up

Clean up the resources created by this solution to avoid incurring future charges.

To clean up your account

Start by deleting the CloudFormation stacks for the Java and Python sample applications that you deployed. In the CloudFormation console, in the Stacks section, select one of these stacks and choose Delete; then select the other stack and choose Delete.

Figure 18: Delete repository stack
To initiate deletion of the Cloud9 CloudFormation stack, select it and choose Delete.
Open the Amazon S3 console, and in the search box, enter shift-left to search for the S3 bucket that CodePipeline used.

Figure 19: Select CodePipeline S3 bucket
Select the S3 bucket, select all of the object folders in the bucket, and choose Delete

Figure 20: Select CodePipeline S3 objects
To confirm deletion of the objects, in the section Permanently delete objects?, enter permanently delete, and then choose Delete objects. A banner message that states Successfully deleted objects appears at the top confirming the object deletion.
Navigate back to the CloudFormation console, select the stack named shift-left-blog, and choose Delete.

Conclusion

In this blog post, we showed you how to implement a solution that enables early feedback on code development through status comments in the CodeCommit pull request activity tab by using Amazon CodeGuru Reviewer and CodeBuild to perform automated code security scans on the creation of a code repository pull request.

We configured CodeBuild with Bandit for Python to demonstrate how you can integrate third-party or open-source tools into the development cycle. You can use this approach to integrate other tools into the workflow.

Shifting security left early in the development cycle can help you identify potential security issues earlier and empower teams to remediate issues earlier, helping to prevent the need to refactor code towards the end of a build.

This solution provides a simple method that you can use to view and understand potential security issues with your newly developed code and thus enhances your awareness of the security requirements within your organization.

It’s simple to get started. Sign up for an AWS account, deploy the provided CloudFormation template through the Launch Stack button, commit your code, and start scanning for vulnerabilities.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on AWS re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Use scalable controls for AWS services accessing your resources

2023-11-16 James Greenwood

Post Syndicated from James Greenwood original https://aws.amazon.com/blogs/security/use-scalable-controls-for-aws-services-accessing-your-resources/

Sometimes you want to configure an AWS service to access your resource in another service. For example, you can configure AWS CloudTrail, a service that monitors account activity across your AWS infrastructure, to write log data to your bucket in Amazon Simple Storage Service (Amazon S3). When you do this, you want assurance that the service will only access your resource on your behalf—you don’t want an untrusted entity to be able to use the service to access your resource. Before today, you could achieve this by using the two AWS Identity and Access Management (IAM) condition keys, aws:SourceAccount and aws:SourceArn. You can use these condition keys to help make sure that a service accesses your resource only on behalf of specific accounts or resources that you trust. However, because these condition keys require you to specify individual accounts and resources, they can be difficult to manage at scale, especially in larger organizations.

Recently, IAM launched two new condition keys that can help you achieve this in a more scalable way that is simpler to manage within your organization:

aws:SourceOrgID — use this condition key to make sure that an AWS service can access your resources only when the request originates from a particular organization ID in AWS Organizations.
aws:SourceOrgPaths — use this condition key to make sure that an AWS service can access your resources only when the request originates from one or more organizational units (OUs) in your organization.

In this blog post, we describe how you can use the four available condition keys, including the two new ones, to help you control how AWS services access your resources.

Background

Imagine a scenario where you configure an AWS service to access your resource in another service. Let’s say you’re using Amazon CloudWatch to observe resources in your AWS environment, and you create an alarm that activates when certain conditions occur. When the alarm activates, you want it to publish messages to a topic that you create in Amazon Simple Notification Service (Amazon SNS) to generate notifications.

Figure 1 depicts this process.

Figure 1: Amazon CloudWatch publishing messages to an SNS topic

In this scenario, there’s a resource-based policy controlling access to your SNS topic. For CloudWatch to publish messages to it, you must configure the policy to allow access by CloudWatch. When you do this, you identify CloudWatch using an AWS service principal, in this case cloudwatch.amazonaws.com.

Cross-service access

This is an example of a common pattern known as cross-service access. With cross-service access, a calling service accesses your resource in a called service, and a resource-based policy attached to your resource grants access to the calling service. The calling service is identified using an AWS service principal in the form <SERVICE-NAME>.amazonaws.com, and it accesses your resource on behalf of an originating resource, such as a CloudWatch alarm.

Figure 2 shows cross-service access.

Figure 2: Cross-service access

When you configure cross-service access, you want to make sure that the calling service will access your resource only on your behalf. That means you want the originating resource to be controlled by someone whom you trust. If an untrusted entity creates their own CloudWatch alarm in their AWS environment, for example, then their alarm should not be able to publish messages to your SNS topic.

If an untrusted entity could use a calling service to access your resource on their behalf, it would be an example of what’s known as the confused deputy problem. The confused deputy problem is a security issue in which an entity that doesn’t have permission to perform an action coerces a more privileged entity (in this case, a calling service) to perform the action instead.

Use condition keys to help prevent cross-service confused deputy issues

AWS provides global condition keys to help you prevent cross-service confused deputy issues. You can use these condition keys to control how AWS services access your resources.

Before today, you could use the aws:SourceAccount or aws:SourceArn condition keys to make sure that a calling service accesses your resource only when the request originates from a specific account (with aws:SourceAccount) or a specific originating resource (with aws:SourceArn). However, there are situations where you might want to allow multiple resources or accounts to use a calling service to access your resource. For example, you might want to create many VPC flow logs in an organization that publish to a central S3 bucket. To achieve this using the aws:SourceAccount or aws:SourceArn condition keys, you must enumerate all the originating accounts or resources individually in your resource-based policies. This can be difficult to manage, especially in large organizations, and can potentially cause your resource-based policy documents to reach size limits.

Now, you can use the new aws:SourceOrgID or aws:SourceOrgPaths condition keys to make sure that a calling service accesses your resource only when the request originates from a specific organization (with aws:SourceOrgID) or a specific organizational unit (with aws:SourceOrgPaths). This helps avoid the need to update policies when accounts are added or removed, reduces the size of policy documents, and makes it simpler to create and review policy statements.

The following table summarizes the four condition keys that you can use to help prevent cross-service confused deputy issues. These keys work in a similar way, but with different levels of granularity.

Use case	Condition key	Value	Allowed operators	Single/multi valued	Example value
Allow a calling service to access your resource only on behalf of an organization that you trust.	`aws:SourceOrgID`	AWS organization ID of the resource making a cross-service access request	String operators	Single-valued key	o-a1b2c3d4e5
Allow a calling service to access your resource only on behalf of an organizational unit (OU) that you trust.	`aws:SourceOrgPaths`	Organization entity paths of the resource making a cross-service access request	Set operators and string operators	Multivalued key	o-a1b2c3d4e5/r-ab12/ou-ab12-11111111/ou-ab12-22222222/
Allow a calling service to access your resource only on behalf of an account that you trust.	`aws:SourceAccount`	AWS account ID of the resource making a cross-service access request	String operators	Single-valued key	111122223333
Allow a calling service to access your resource only on behalf of a resource that you trust.	`aws:SourceArn`	Amazon Resource Name (ARN) of the resource making a cross- service access request	ARN operators (recommended) or string operators	Single-valued key	arn:aws:cloudwatch:eu-west-1:111122223333:alarm:myalarm

When to use the condition keys

AWS recommends that you use these condition keys in any resource-based policy statements that allow access by an AWS service, except where the relevant condition key is not yet supported by the service. To find out whether a condition key is supported by a particular service, see AWS global condition context keys in the AWS Identity and Access Management User Guide.

Note: Only use these condition keys in resource-based policies that allow access by an AWS service. Don’t use them in other use cases, including identity-based policies and service control policies (SCPs), where these condition keys won’t be populated.

Use condition keys for defense in depth

AWS services use a variety of mechanisms to help prevent cross-service confused deputy issues, and the details vary by service. For example, where a calling service accesses an S3 bucket, some services use S3 prefixes to help prevent confused deputy issues. For more information, see the relevant service documentation.

Where supported by the service, AWS recommends that you use the condition keys we describe in this post regardless of whether the service has another mechanism in place to help prevent cross-service confused deputy issues. This helps to make your intentions explicit, provide defense in depth, and guard against misconfigurations.

Example use cases

Let’s walk through some example use cases to learn how to use these condition keys in practice.

First, imagine you’re using Amazon Virtual Private Cloud (Amazon VPC) to manage logically isolated virtual networks. In Amazon VPC, you can configure flow logs, which capture information about your network traffic. Let’s say you want a flow log to write data into an S3 bucket for later analysis. This process is depicted in Figure 3.

Figure 3: Amazon VPC writing flow logs to an S3 bucket

This constitutes another cross-service access scenario. In this case, Amazon VPC is the calling service, Amazon S3 is the called service, the VPC flow log is the originating resource, and the S3 bucket is your resource in the called service.

To allow access, the resource-based policy for your S3 bucket (known as a bucket policy) must allow Amazon VPC to put objects there. The Principal element in this policy specifies the AWS service principal of the service that will access the resource, which for VPC flow logs is delivery.logs.amazonaws.com.

Initial policy without confused deputy prevention

The following is an initial version of the bucket policy that allows Amazon VPC to put objects in the bucket but doesn’t yet provide confused deputy prevention. We’re showing this policy for illustration purposes; don’t use it in its current form.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PARTIAL-EXAMPLE-DO-NOT-USE",
            "Effect": "Allow",
            "Principal": {
                "Service": "delivery.logs.amazonaws.com"
            },
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::<DOC-EXAMPLE-BUCKET>/*"
        }
    ]
}

Note: For simplicity, we only show one of the policy statements that you need to allow VPC flow logs to write to a bucket. In a real-life bucket policy for flow logs, you need two policy statements: one allowing actions on the bucket, and one allowing actions on the bucket contents. These are described in Publish flow logs to Amazon S3. Both policy statements work in the same way with respect to confused deputy prevention.

This policy statement allows Amazon VPC to put objects in the bucket. However, it allows Amazon VPC to do that on behalf of any flow log in any account. There’s nothing in the policy to tell Amazon VPC that it should access this bucket only if the flow log belongs to a specific organization, OU, account, or resource that you trust.

Let’s now update the policy to help prevent cross-service confused deputy issues. For the rest of this post, the remaining policy samples provide confused deputy protection, but at different levels of granularity.

Specify a trusted organization

Continuing with the previous example, imagine that you now have an organization in AWS Organizations, and you want to create VPC flow logs in various accounts within your organization that publish to a central S3 bucket. You want Amazon VPC to put objects in the bucket only if the request originates from a flow log that resides in your organization.

You can achieve this by using the new aws:SourceOrgID condition key. In a cross-service access scenario, this condition key evaluates to the ID of the organization that the request came from. You can use this condition key in the Condition element of a resource-based policy to allow actions only if aws:SourceOrgID matches the ID of a specific organization, as shown in the following example. In your own policy, make sure to replace <DOC-EXAMPLE-BUCKET> and <MY-ORGANIZATION-ID> with your own information.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VPCLogsDeliveryWrite",
            "Effect": "Allow",
            "Principal": {
                "Service": "delivery.logs.amazonaws.com"
            },
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::<DOC-EXAMPLE-BUCKET>/*",
            "Condition": {
                "StringEquals": {
                    "aws:SourceOrgID": "<MY-ORGANIZATION-ID>"
                }
            }
        }
    ]
}

The revised policy states that Amazon VPC can put objects in the bucket only if the request originates from a flow log in your organization. Now, if someone creates a flow log outside your organization and configures it to access your bucket, they will get an access denied error.

You can use aws:SourceOrgID in this way to allow a calling service to access your resource only if the request originates from a specific organization, as shown in Figure 4.

Figure 4: Specify a trusted organization using aws:SourceOrgID

Specify a trusted OU

What if you don’t want to trust your entire organization, but only part of it? Let’s consider a different scenario. Imagine that you want to send messages from Amazon SNS into a queue in Amazon Simple Queue Service (Amazon SQS) so they can be processed by consumers. This is depicted in Figure 5.

Figure 5: Amazon SNS sending messages to an SQS queue

Now imagine that you want your SQS queue to receive messages only if they originate from an SNS topic that resides in a specific organizational unit (OU) in your organization. For example, you might want to allow messages only if they originate from a production OU that is subject to change control.

You can achieve this by using the new aws:SourceOrgPaths condition key. As before, you use this condition key in a resource-based policy attached to your resource. In a cross-service access scenario, this condition key evaluates to the AWS Organizations entity path that the request came from. An entity path is a text representation of an entity within an organization.

You build an entity path for an OU by using the IDs of the organization, root, and all OUs in the path down to and including the OU. For example, consider the organizational structure shown in Figure 6.

Figure 6: Example organization structure

In this example, you can specify the Prod OU by using the following entity path:

o-a1b2c3d4e5/r-ab12/ou-ab12-11111111/ou-ab12-22222222/

For more information about how to construct an entity path, see Understand the AWS Organizations entity path.

Let’s now match the aws:SourceOrgPaths condition key against a specific entity path in the Condition element of a resource-based policy for an SQS queue. In your own policy, make sure to replace <MY-QUEUE-ARN> and <MY-ENTITY-PATH> with your own information.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Allow-SNS-SendMessage",
            "Effect": "Allow",
            "Principal": {
                "Service": "sns.amazonaws.com"
            },
            "Action": "sqs:SendMessage",
            "Resource": "<MY-QUEUE-ARN>",
            "Condition": {
                "Null": {
                    "aws:SourceOrgPaths": "false"
                },
                "ForAllValues:StringEquals": {
                    "aws:SourceOrgPaths": "<MY-ENTITY-PATH>"
                }
            }
        }
    ]
}

Note: aws:SourceOrgPaths is a multivalued condition key, which means it’s capable of having multiple values in the request context. At the time of writing, it contains a single entity path if the request originates from an account in an organization, and a null value if the request originates from an account that’s not in an organization. Because this key is multivalued, you need to use both a set operator and a string operator to compare values.

In this policy, there are two conditions in the Condition block. The first uses the Null condition operator and compares with a false value to confirm that the condition key’s value is not null. The second uses set operator ForAllValues, which returns true if every condition key value in the request matches at least one value in your policy condition, and string operator StringEquals, which requires an exact match with a value specified in your policy condition.

Note: The reason for the null check is that set operator ForAllValues returns true when a condition key resolves to null. With an Allow effect and the null check in place, access is denied if the request originates from an account that’s not in an organization.

With this policy applied to your SQS queue, Amazon SNS can send messages to your queue only if the message came from an SNS topic in a specific OU.

You can use aws:SourceOrgPaths in this way to allow a calling service to access your resource only if the request originates from a specific organizational unit, as shown in Figure 7.

Figure 7: Specify a trusted OU using aws:SourceOrgPaths

Specify a trusted OU and its children

In the previous example, we specified a trusted OU, but that didn’t include its child OUs. What if you want to include its children as well?

You can achieve this by replacing the string operator StringEquals with StringLike. This allows you to use wildcards in the entity path. Using the organization structure from the previous example, the following Condition evaluates to true only if the condition key value is not null and the request originates from the Prod OU or any of its child OUs.

"Condition": {
    "Null": {
        "aws:SourceOrgPaths": "false"
    },
    "ForAllValues:StringLike": {
        "aws:SourceOrgPaths": "o-a1b2c3d4e5/r-ab12/ou-ab12-11111111/ou-ab12-22222222/*"
    }
}

Specify a trusted account

If you want to be more granular, you can allow a service to access your resource only if the request originates from a specific account. You can achieve this by using the aws:SourceAccount condition key. In a cross-service access scenario, this condition key evaluates to the ID of the account that the request came from.

The following Condition evaluates to true only if the request originates from the account that you specify in the policy. In your own policy, make sure to replace <MY-ACCOUNT-ID> with your own information.

"Condition": {
    "StringEquals": {
        "aws:SourceAccount": "<MY-ACCOUNT-ID>"
    }
}

You can use this condition element within a resource-based policy to allow a calling service to access your resource only if the request originates from a specific account, as shown in Figure 8.

Figure 8: Specify a trusted account using aws:SourceAccount

Specify a trusted resource

If you want to be even more granular, you can allow a service to access your resource only if the request originates from a specific resource. For example, you can allow Amazon SNS to send messages to your SQS queue only if the request originates from a specific topic within Amazon SNS.

You can achieve this by using the aws:SourceArn condition key. In a cross-service access scenario, this condition key evaluates to the Amazon Resource Name (ARN) of the originating resource. This provides the most granular form of cross-service confused deputy prevention.

The following Condition evaluates to true only if the request originates from the resource that you specify in the policy. In your own policy, make sure to replace <MY-RESOURCE-ARN> with your own information.

"Condition": {
    "ArnEquals": {
        "aws:SourceArn": "<MY-RESOURCE-ARN>"
    }
}

Note: AWS recommends that you use an ARN operator rather than a string operator when comparing ARNs. This example uses ArnEquals to match the condition key value against the ARN specified in the policy.

You can use this condition element within a resource-based policy to allow a calling service to access your resource only if the request comes from a specific originating resource, as shown in Figure 9.

Figure 9: Specify a trusted resource using aws:SourceArn

Specify multiple trusted resources, accounts, OUs, or organizations

The four condition keys allow you to specify multiple trusted entities by matching against an array of values. This allows you to specify multiple trusted resources, accounts, OUs, or organizations in your policies.

Conclusion

In this post, you learned about cross-service access, in which an AWS service communicates with another AWS service to access your resource. You saw that it’s important to make sure that such services access your resources only on your behalf in order to help avoid cross-service confused deputy issues.

We showed you how to help prevent cross-service confused deputy issues by using two new condition keys aws:SourceOrgID and aws:SourceOrgPaths, as well as the other available condition keys aws:SourceAccount and aws:SourceArn. You learned that you should use these condition keys in any resource-based policy statements that allow access by an AWS service, if the condition key is supported by the service. This helps make sure that a calling service can access your resource only when the request originates from a specific organization, OU, account, or resource that you trust.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on AWS IAM re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Decentralize LF-tag management with AWS Lake Formation

2023-11-16 Ramkumar Nottath

Post Syndicated from Ramkumar Nottath original https://aws.amazon.com/blogs/big-data/decentralize-lf-tag-management-with-aws-lake-formation/

In today’s data-driven world, organizations face unprecedented challenges in managing and extracting valuable insights from their ever-expanding data ecosystems. As the number of data assets and users grow, the traditional approaches to data management and governance are no longer sufficient. Customers are now building more advanced architectures to decentralize permissions management to allow for individual groups of users to build and manage their own data products, without being slowed down by a central governance team. One of the core features of AWS Lake Formation is the delegation of permissions on a subset of resources such as databases, tables, and columns in AWS Glue Data Catalog to data stewards, empowering them make decisions regarding who should get access to their resources and helping you decentralize the permissions management of your data lakes. Lake Formation has added a new capability that further allows data stewards to create and manage their own Lake Formation tags (LF-tags). Lake Formation tag-based access control (LF-TBAC) is an authorization strategy that defines permissions based on attributes. In Lake Formation, these attributes are called LF-Tags. LF-TBAC is the recommended method to use to grant Lake Formation permissions when there is a large number of Data Catalog resources. LF-TBAC is more scalable than the named resource method and requires less permission management overhead.

In this post, we go through the process of delegating the LF-tag creation, management, and granting of permissions to a data steward.

Lake Formation serves as the foundation for these advanced architectures by simplifying security management and governance for users at scale across AWS analytics. Lake Formation is designed to address these challenges by providing secure sharing between AWS accounts and tag-based access control to be able scale permissions. By assigning tags to data assets based on their characteristics and properties, organizations can implement access control policies tailored to specific data attributes. This ensures that only authorized individuals or teams can access and work with the data relevant to their domain. For example, it allows customers to tag data assets as “Confidential” and grant access to that LF-Tag to only those users who should have access to confidential data. Tag-based access control not only enhances data security and privacy, but also promotes efficient collaboration and knowledge sharing.

The need for producer autonomy and decentralized tag creation and delegation in data governance is paramount, regardless of the architecture chosen, whether it be a single account, hub and spoke, or data mesh with central governance. Relying solely on centralized tag creation and governance can create bottlenecks, hinder agility, and stifle innovation. By granting producers and data stewards the autonomy to create and manage tags relevant to their specific domains, organizations can foster a sense of ownership and accountability among producer teams. This decentralized approach allows you to adapt and respond quickly to changing requirements. This methodology helps organizations strike a balance between central governance and producer ownership, leading to improved governance, enhanced data quality, and data democratization.

Lake Formation announced the tag delegation feature to address this. With this feature, a Lake Formation admin can now provide permission to AWS Identity and Access Management (IAM) users and roles to create tags, associate them, and manage the tag expressions.

Solution overview

In this post, we examine an example organization that has a central data lake that is being used by multiple groups. We have two personas: the Lake Formation administrator LFAdmin, who manages the data lake and onboards different groups, and the data steward LFDataSteward-Sales, who owns and manages resources for the Sales group within the organization. The goal is to grant permission to the data steward to be able to use LF-Tags to perform permission grants for the resources that they own. In addition, the organization has a set of common LF-Tags called Confidentiality and Department, which the data steward will be able to use.

The following diagram illustrates the workflow to implement the solution.

The following are the high-level steps:

Grant permissions to create LF-Tags to a user who is not a Lake Formation administrator (the LFDataSteward-Sales IAM role).
Grant permissions to associate an organization’s common LF-Tags to the LFDataSteward-Sales role.
Create new LF-Tags using the LFDataSteward-Sales role.
Associate the new and common LF-Tags to resources using the LFDataSteward-Sales role.
Grant permissions to other users using the LFDataSteward-Sales role.

Prerequisites

For this walkthrough, you should have the following:

An AWS account.
Knowledge of using Lake Formation and enabling Lake Formation to manage permissions to a set of tables.
An IAM role that is a Lake Formation administrator. For this post, we name ours LFAdmin.
Two LF-Tags created by the LFAdmin:
- Key Confidentiality with values PII and Public.
- Key Department with values Sales and Marketing.
An IAM role that is a data steward within an organization. For this post, we name ours LFDataSteward-Sales.
The data steward should have ‘Super’ access to at least one database. In this post, the data steward has access to three databases: sales-ml-data, sales-processed-data, and sales-raw-data.
An IAM role to serve as a user that the data steward will grant permissions to using LF-Tags. For this post, we name ours LFAnalysts-MLScientist.

Grant permission to the data steward to be able to create LF-Tags

Complete the following steps to grant LFDataSteward-Sales the ability to create LF-Tags:

As the LFAdmin role, open the Lake Formation console.
In the navigation pane, choose LF-Tags and permissions under Permissions.

Under LF-Tags, because you are logged in as LFAdmin, you can see all the tags that have been created within the account. You can see the Confidentiality LF-Tag as well as the Department LF-Tag and the possible values for each tag.

On the LF-Tag creators tab, choose Add LF-Tag creators.

For IAM users and roles, enter the LFDataSteward-Sales IAM role.
For Permission, select Create LF-Tag.
If you want this data steward to be able to grant Create LF-Tag permissions to other users, select Create LF-Tag under Grantable permission.
Choose Add.

The LFDataSteward-Sales IAM role now has permissions to create their own LF-Tags.

Grant permission to the data steward to use common LF-Tags

We now want to give permission to the data steward to tag using the Confidentiality and Department tags. Complete the following steps:

As the LFAdmin role, open the Lake Formation console.
In the navigation pane, choose LF-Tags and permissions under Permissions.
On the LF-Tag permissions tab, choose Grant permissions.

Select LF-Tag key-value permission for Permission type.

The LF-Tag permission option grants the ability to modify or drop an LF-Tag, which doesn’t apply in this use case.

Select IAM users and roles and enter the LFDataSteward-Sales IAM role.

Provide the Confidentiality LF-Tag and all its values, and the Department LF-Tag with only the Sales value.
Select Describe, Associate, and Grant with LF-Tag expression under Permissions.
Choose Grant permissions.

This gave the LFDataSteward-Sales role the ability to tag resources using the Confidentiality tag and all its values as well as the Department tag with only the Sales value.

Create new LF-Tags using the data steward role

This step demonstrates how the LFDataSteward-Sales role can now create their own LF-Tags.

As the LFDataSteward-Sales role, open the Lake Formation console.
In the navigation pane, choose LF-Tags and permissions under Permissions.

The LF-Tags section only shows the Confidentiality tag and Department tag with only the Sales value. As the data steward, we want to create our own LF-Tags to make permissioning easier.

Choose Add LF-Tag.

For Key, enter Sales-Subgroups.
For Values¸ enter DataScientists, DataEngineers, and MachineLearningEngineers.
Choose Add LF-Tag.

As the LF-Tag creator, the data steward has full permissions on the tags that they created. You will be able to see all the tags that the data steward has access to.

Associate LF-Tags to resources as the data steward

We now associate resources to the LF-Tags that we just created so that Machine Learning Engineers can have access to the sales-ml-data resource.

As the LFDataSteward-Sales role, open the Lake Formation console.
In the navigation pane, choose Databases.
Select sales-ml-data and on the Actions menu, choose Edit LF-Tags.

Add the following LF-Tags and values:
1. Key Sales-Subgroups with value MachineLearningEngineers.
2. Key Department with value analytics.
3. Key Confidentiality with value Public.
Choose Save.

Grant permissions using LF-Tags as the data steward

To grant permissions using LF-Tags, complete the following steps:

As the LFDataSteward-Sales role, open the Lake Formation console.
In the navigation pane, choose Data lake permissions under Permissions.
Choose Grant.
Select IAM users and roles and enter the IAM principal to grant permission to (for this example, the Sales-MLScientist role).

In the LF-Tags or catalog resources section, select Resources matched by LF-Tags.
Enter the following tag expressions:
1. For the Department LF-Tag, set the Sales value.
2. For the Sales-Subgroups LF-Tag, set the MachineLearningEngineers value.
3. For the Confidentiality LF-Tag, set the Public value.

Because this is a machine learning (ML) and data science user, we want to give full permissions so that they can manage databases and create tables.

For Database permissions, select Super, and for Table permissions, select Super.

Choose Grant.

We now see the permissions granted to the LF-Tag expression.

Verify permissions granted to the user

To verify permissions using Amazon Athena, navigate to the Athena console as the Sales-MLScientist role. We can observe that the Sales-MLScientist role now has access to the sales-ml-data database and all the tables. In this case, there is only one table, sales-report.

Clean up

To clean up your resources, delete the following:

IAM roles that you may have created for the purposes of this post
Any LF-Tags that you created

Conclusion

In this post, we discussed the benefits of decentralized tag management and how the new Lake Formation feature helps implement this. By granting permission to producer teams’ data stewards to manage tags, organizations empower them to use their domain knowledge and capture the nuances of their data effectively. Furthermore, granting permission to data stewards enables them to take ownership of the tagging process, ensuring accuracy and relevance.

The post illustrated the various steps involved in decentralized Lake Formation tag management, such as granting permission to data stewards to create LF-Tags and use common LF-Tags. We also demonstrated how the data steward can create their own LF-Tags, associate the tags to resources, and grant permissions using tags.

We encourage you to explore the new decentralized Lake Formation tag management feature. For more details, see Lake Formation tag-based access control.

About the Authors

Ramkumar Nottath is a Principal Solutions Architect at AWS focusing on Analytics services. He enjoys working with various customers to help them build scalable, reliable big data and analytics solutions. His interests extend to various technologies such as analytics, data warehousing, streaming, data governance, and machine learning. He loves spending time with his family and friends.

Mert Hocanin is a Principal Big Data Architect at AWS within the AWS Lake Formation Product team. He has been with Amazon for over 10 years, and enjoys helping customers build their data lakes with a focus on governance on a wide variety of services. When he isn’t helping customers build data lakes, he spends his time with his family and traveling.

Use generative AI with Amazon EMR, Amazon Bedrock, and English SDK for Apache Spark to unlock insights

2023-11-16 Saurabh Bhutyani

Post Syndicated from Saurabh Bhutyani original https://aws.amazon.com/blogs/big-data/use-generative-ai-with-amazon-emr-amazon-bedrock-and-english-sdk-for-apache-spark-to-unlock-insights/

In this era of big data, organizations worldwide are constantly searching for innovative ways to extract value and insights from their vast datasets. Apache Spark offers the scalability and speed needed to process large amounts of data efficiently.

Amazon EMR is the industry-leading cloud big data solution for petabyte-scale data processing, interactive analytics, and machine learning (ML) using open source frameworks such as Apache Spark, Apache Hive, and Presto. Amazon EMR is the best place to run Apache Spark. You can quickly and effortlessly create managed Spark clusters from the AWS Management Console, AWS Command Line Interface (AWS CLI), or Amazon EMR API. You can also use additional Amazon EMR features, including fast Amazon Simple Storage Service (Amazon S3) connectivity using the Amazon EMR File System (EMRFS), integration with the Amazon EC2 Spot market and the AWS Glue Data Catalog, and EMR Managed Scaling to add or remove instances from your cluster. Amazon EMR Studio is an integrated development environment (IDE) that makes it straightforward for data scientists and data engineers to develop, visualize, and debug data engineering and data science applications written in R, Python, Scala, and PySpark. EMR Studio provides fully managed Jupyter notebooks, and tools like Spark UI and YARN Timeline Service to simplify debugging.

To unlock the potential hidden within the data troves, it’s essential to go beyond traditional analytics. Enter generative AI, a cutting-edge technology that combines ML with creativity to generate human-like text, art, and even code. Amazon Bedrock is the most straightforward way to build and scale generative AI applications with foundation models (FMs). Amazon Bedrock is a fully managed service that makes FMs from Amazon and leading AI companies available through an API, so you can quickly experiment with a variety of FMs in the playground, and use a single API for inference regardless of the models you choose, giving you the flexibility to use FMs from different providers and keep up to date with the latest model versions with minimal code changes.

In this post, we explore how you can supercharge your data analytics with generative AI using Amazon EMR, Amazon Bedrock, and the pyspark-ai library. The pyspark-ai library is an English SDK for Apache Spark. It takes instructions in English language and compiles them into PySpark objects like DataFrames. This makes it straightforward to work with Spark, allowing you to focus on extracting value from your data.

Solution overview

The following diagram illustrates the architecture for using generative AI with Amazon EMR and Amazon Bedrock.

Solution Overview

EMR Studio is a web-based IDE for fully managed Jupyter notebooks that run on EMR clusters. We interact with EMR Studio Workspaces connected to a running EMR cluster and run the notebook provided as part of this post. We use the New York City Taxi data to garner insights into various taxi rides taken by users. We ask the questions in natural language on top of the data loaded in Spark DataFrame. The pyspark-ai library then uses the Amazon Titan Text FM from Amazon Bedrock to create a SQL query based on the natural language question. The pyspark-ai library takes the SQL query, runs it using Spark SQL, and provides results back to the user.

In this solution, you can create and configure the required resources in your AWS account with an AWS CloudFormation template. The template creates the AWS Glue database and tables, S3 bucket, VPC, and other AWS Identity and Access Management (IAM) resources that are used in the solution.

The template is designed to demonstrate how to use EMR Studio with the pyspark-ai package and Amazon Bedrock, and is not intended for production use without modification. Additionally, the template uses the us-east-1 Region and may not work in other Regions without modification. The template creates resources that incur costs while they are in use. Follow the cleanup steps at the end of this post to delete the resources and avoid unnecessary charges.

Prerequisites

Before you launch the CloudFormation stack, ensure you have the following:

An AWS account that provides access to AWS services
An IAM user with an access key and secret key to configure the AWS CLI, and permissions to create an IAM role, IAM policies, and stacks in AWS CloudFormation
The Titan Text G1 – Express model is currently in preview, so you need to have preview access to use it as part of this post

Create resources with AWS CloudFormation

The CloudFormation creates the following AWS resources:

A VPC stack with private and public subnets to use with EMR Studio, route tables, and NAT gateway.
An EMR cluster with Python 3.9 installed. We are using a bootstrap action to install Python 3.9 and other relevant packages like pyspark-ai and Amazon Bedrock dependencies. (For more information, refer to the bootstrap script.)
An S3 bucket for the EMR Studio Workspace and notebook storage.
IAM roles and policies for EMR Studio setup, Amazon Bedrock access, and running notebooks

To get started, complete the following steps:

Choose Launch Stack:
Select I acknowledge that this template may create IAM resources.

The CloudFormation stack takes approximately 20–30 minutes to complete. You can monitor its progress on the AWS CloudFormation console. When its status reads CREATE_COMPLETE, your AWS account will have the resources necessary to implement this solution.

Create EMR Studio

Now you can create an EMR Studio and Workspace to work with the notebook code. Complete the following steps:

On the EMR Studio console, choose Create Studio.
Enter the Studio Name as GenAI-EMR-Studio and provide a description.
In the Networking and security section, specify the following:
- For VPC, choose the VPC you created as part of the CloudFormation stack that you deployed. Get the VPC ID using the CloudFormation outputs for the VPCID key.
- For Subnets, choose all four subnets.
- For Security and access, select Custom security group.
- For Cluster/endpoint security group, choose EMRSparkAI-Cluster-Endpoint-SG.
- For Workspace security group, choose EMRSparkAI-Workspace-SG.
In the Studio service role section, specify the following:
- For Authentication, select AWS Identity and Access Management (IAM).
- For AWS IAM service role, choose EMRSparkAI-StudioServiceRole.
In the Workspace storage section, browse and choose the S3 bucket for storage starting with emr-sparkai-<account-id>.
Choose Create Studio.
When the EMR Studio is created, choose the link under Studio Access URL to access the Studio.
When you’re in the Studio, choose Create workspace.
Add emr-genai as the name for the Workspace and choose Create workspace.
When the Workspace is created, choose its name to launch the Workspace (make sure you’ve disabled any pop-up blockers).

Big data analytics using Apache Spark with Amazon EMR and generative AI

Now that we have completed the required setup, we can start performing big data analytics using Apache Spark with Amazon EMR and generative AI.

As a first step, we load a notebook that has the required code and examples to work with the use case. We use NY Taxi dataset, which contains details about taxi rides.

Download the notebook file NYTaxi.ipynb and upload it to your Workspace by choosing the upload icon.
After the notebook is imported, open the notebook and choose PySpark as the kernel.

PySpark AI by default uses OpenAI’s ChatGPT4.0 as the LLM model, but you can also plug in models from Amazon Bedrock, Amazon SageMaker JumpStart, and other third-party models. For this post, we show how to integrate the Amazon Bedrock Titan model for SQL query generation and run it with Apache Spark in Amazon EMR.

To get started with the notebook, you need to associate the Workspace to a compute layer. To do so, choose the Compute icon in the navigation pane and choose the EMR cluster created by the CloudFormation stack.

Configure the Python parameters to use the updated Python 3.9 package with Amazon EMR:

%%configure -f
{
"conf": {
"spark.executorEnv.PYSPARK_PYTHON": "/usr/local/python3.9.18/bin/python3.9",
"spark.yarn.appMasterEnv.PYSPARK_PYTHON": "/usr/local/python3.9.18/bin/python3.9"
}
}

Import the necessary libraries:

from pyspark_ai import SparkAI
from pyspark.sql import SparkSession
from langchain.chat_models import ChatOpenAI
from langchain.llms.bedrock import Bedrock
import boto3
import os

After the libraries are imported, you can define the LLM model from Amazon Bedrock. In this case, we use amazon.titan-text-express-v1. You need to enter the Region and Amazon Bedrock endpoint URL based on your preview access for the Titan Text G1 – Express model.
```
boto3_bedrock = boto3.client('bedrock-runtime', '<region>', endpoint_url='<bedrock endpoint url>')
llm = Bedrock(
model_id="amazon.titan-text-express-v1",
client=boto3_bedrock)
```
Connect Spark AI to the Amazon Bedrock LLM model for SQL query generation based on questions in natural language:
```
#Connecting Spark AI to the Bedrock Titan LLM
spark_ai = SparkAI(llm = llm, verbose=False)
spark_ai.activate()
```

Here, we have initialized Spark AI with verbose=False; you can also set verbose=True to see more details.

Now you can read the NYC Taxi data in a Spark DataFrame and use the power of generative AI in Spark.

For example, you can ask the count of the number of records in the dataset:

taxi_records.ai.transform("count the number of records in this dataset").show()

We get the following response:

> Entering new AgentExecutor chain...
Thought: I need to count the number of records in the table.
Action: query_validation
Action Input: SELECT count(*) FROM spark_ai_temp_view_ee3325
Observation: OK
Thought: I now know the final answer.
Final Answer: SELECT count(*) FROM spark_ai_temp_view_ee3325
> Finished chain.
+----------+
| count(1)|
+----------+
|2870781820|
+----------+

Spark AI internally uses LangChain and SQL chain, which hide the complexity from end-users working with queries in Spark.

The notebook has a few more example scenarios to explore the power of generative AI with Apache Spark and Amazon EMR.

Clean up

Empty the contents of the S3 bucket emr-sparkai-<account-id>, delete the EMR Studio Workspace created as part of this post, and then delete the CloudFormation stack that you deployed.

Conclusion

This post showed how you can supercharge your big data analytics with the help of Apache Spark with Amazon EMR and Amazon Bedrock. The PySpark AI package allows you to derive meaningful insights from your data. It helps reduce development and analysis time, reducing time to write manual queries and allowing you to focus on your business use case.

About the Authors

Saurabh Bhutyani is a Principal Analytics Specialist Solutions Architect at AWS. He is passionate about new technologies. He joined AWS in 2019 and works with customers to provide architectural guidance for running generative AI use cases, scalable analytics solutions and data mesh architectures using AWS services like Amazon Bedrock, Amazon SageMaker, Amazon EMR, Amazon Athena, AWS Glue, AWS Lake Formation, and Amazon DataZone.

Harsh Vardhan is an AWS Senior Solutions Architect, specializing in analytics. He has over 8 years of experience working in the field of big data and data science. He is passionate about helping customers adopt best practices and discover insights from their data.

Automate and enhance your code security with AI-powered services

2023-11-16 Dylan Souvage

Post Syndicated from Dylan Souvage original https://aws.amazon.com/blogs/security/automate-and-enhance-your-code-security-with-ai-powered-services/

Organizations are increasingly embracing a shift-left approach when it comes to security, actively integrating security considerations into their software development lifecycle (SDLC). This shift aligns seamlessly with modern software development practices such as DevSecOps and continuous integration and continuous deployment (CI/CD), making it a vital strategy in today’s rapidly evolving software development landscape. At its core, shift left promotes a security-as-code culture, where security becomes an integral part of the entire application lifecycle, starting from the initial design phase and extending all the way through to deployment. This proactive approach to security involves seamlessly integrating security measures into the CI/CD pipeline, enabling automated security testing and checks at every stage of development. Consequently, it accelerates the process of identifying and remediating security issues.

By identifying security vulnerabilities early in the development process, you can promptly address them, leading to significant reductions in the time and effort required for mitigation. Amazon Web Services (AWS) encourages this shift-left mindset, providing services that enable a seamless integration of security into your DevOps processes, fostering a more robust, secure, and efficient system. In this blog post we share how you can use Amazon CodeWhisperer, Amazon CodeGuru, and Amazon Inspector to automate and enhance code security.

CodeWhisperer is a versatile, artificial intelligence (AI)-powered code generation service that delivers real-time code recommendations. This innovative service plays a pivotal role in the shift-left strategy by automating the integration of crucial security best practices during the early stages of code development. CodeWhisperer is equipped to generate code in Python, Java, and JavaScript, effectively mitigating vulnerabilities outlined in the OWASP (Open Web Application Security Project) Top 10. It uses cryptographic libraries aligned with industry best practices, promoting robust security measures. Additionally, as you develop your code, CodeWhisperer scans for potential security vulnerabilities, offering actionable suggestions for remediation. This is achieved through generative AI, which creates code alternatives to replace identified vulnerable sections, enhancing the overall security posture of your applications.

Next, you can perform further vulnerability scanning of code repositories and supported integrated development environments (IDEs) with Amazon CodeGuru Security. CodeGuru Security is a static application security tool that uses machine learning to detect security policy violations and vulnerabilities. It provides recommendations for addressing security risks and generates metrics so you can track the security health of your applications. Examples of security vulnerabilities it can detect include resource leaks, hardcoded credentials, and cross-site scripting.

Finally, you can use Amazon Inspector to address vulnerabilities in workloads that are deployed. Amazon Inspector is a vulnerability management service that continually scans AWS workloads for software vulnerabilities and unintended network exposure. Amazon Inspector calculates a highly contextualized risk score for each finding by correlating common vulnerabilities and exposures (CVE) information with factors such as network access and exploitability. This score is used to prioritize the most critical vulnerabilities to improve remediation response efficiency. When started, it automatically discovers Amazon Elastic Compute Cloud (Amazon EC2) instances, container images residing in Amazon Elastic Container Registry (Amazon ECR), and AWS Lambda functions, at scale, and immediately starts assessing them for known vulnerabilities.

Figure 1: An architecture workflow of a developer’s code workflow

Amazon CodeWhisperer

CodeWhisperer is powered by a large language model (LLM) trained on billions of lines of code, including code owned by Amazon and open-source code. This makes it a highly effective AI coding companion that can generate real-time code suggestions in your IDE to help you quickly build secure software with prompts in natural language. CodeWhisperer can be used with four IDEs including AWS Toolkit for JetBrains, AWS Toolkit for Visual Studio Code, AWS Lambda, and AWS Cloud9.

After you’ve installed the AWS Toolkit, there are two ways to authenticate to CodeWhisperer. The first is authenticating to CodeWhisperer as an individual developer using AWS Builder ID, and the second way is authenticating to CodeWhisperer Professional using the IAM Identity Center. Authenticating through AWS IAM Identity Center means your AWS administrator has set up CodeWhisperer Professional for your organization to use and provided you with a start URL. AWS administrators must have configured AWS IAM Identity Center and delegated users to access CodeWhisperer.

As you use CodeWhisperer it filters out code suggestions that include toxic phrases (profanity, hate speech, and so on) and suggestions that contain commonly known code structures that indicate bias. These filters help CodeWhisperer generate more inclusive and ethical code suggestions by proactively avoiding known problematic content. The goal is to make AI assistance more beneficial and safer for all developers.

CodeWhisperer can also scan your code to highlight and define security issues in real time. For example, using Python and JetBrains, if you write code that would write unencrypted AWS credentials to a log — a bad security practice — CodeWhisperer will raise an alert. Security scans operate at the project level, analyzing files within a user’s local project or workspace and then truncating them to create a payload for transmission to the server side.

For an example of CodeGuru in action, see Security Scans. Figure 2 is a screenshot of a CodeGuru scan.

Figure 2: CodeWhisperer performing a security scan in Visual Studio Code

Furthermore, the CodeWhisperer reference tracker detects whether a code suggestion might be similar to particular CodeWhisperer open source training data. The reference tracker can flag such suggestions with a repository URL and project license information or optionally filter them out. Using CodeWhisperer, you improve productivity while embracing the shift-left approach by implementing automated security best practices at one of the principal layers—code development.

CodeGuru Security

Amazon CodeGuru Security significantly bolsters code security by harnessing the power of machine learning to proactively pinpoint security policy violations and vulnerabilities. This intelligent tool conducts a thorough scan of your codebase and offers actionable recommendations to address identified issues. This approach verifies that potential security concerns are corrected early in the development lifecycle, contributing to an overall more robust application security posture.

CodeGuru Security relies on a set of security and code quality detectors crafted to identify security risks and policy violations. These detectors empower developers to spot and resolve potential issues efficiently.

CodeGuru Security allows manual scanning of existing code and automating integration with popular code repositories like GitHub and GitLab. It establishes an automated security check pipeline through either AWS CodePipeline or Bitbucket Pipeline. Moreover, CodeGuru Security integrates with Amazon Inspector Lambda code scanning, enabling automated code scans for your Lambda functions.

Notably, CodeGuru Security doesn’t just uncover security vulnerabilities; it also offers insights to optimize code efficiency. It identifies areas where code improvements can be made, enhancing both security and performance aspects within your applications.

Initiating CodeGuru Security is a straightforward process, accessible through the AWS Management Console, AWS Command Line Interface (AWS CLI), AWS SDKs, and multiple integrations. This allows you to run code scans, review recommendations, and implement necessary updates, fostering a continuous improvement cycle that bolsters the security stance of your applications.

Use Amazon CodeGuru to scan code directly and in a pipeline

Use the following steps to create a scan in CodeGuru to scan code directly and to integrate CodeGuru with AWS CodePipeline.

Note: You must provide sample code to scan.

Scan code directly

Open the AWS Management Console using your organization management account and go to Amazon CodeGuru.
In the navigation pane, select Security and then select Scans.
Choose Create new scan to start your manual code scan.

Figure 3: Scans overview
On the Create Scan page:
1. Choose Choose file to upload your code.
  
  Note: The file must be in .zip format and cannot exceed 5 GB.
2. Enter a unique name to identify your scan.
3. Choose Create scan.
  
  Figure 4: Create scan
After you create the scan, the configured scan will automatically appear in the Scans table, where you see the Scan name, Status, Open findings, Date of last scan, and Revision number (you review these findings later in the Findings section of this post).

Figure 5: Scan update

Automated scan using AWS CodePipeline integration

Still in the CodeGuru console, in the navigation pane under Security, select Integrations. On the Integrations page, select Integration with AWS CodePipeline. This will allow you to have an automated security scan inside your CI/CD pipeline.

Figure 6: CodeGuru integrations
Next, choose Open template in CloudFormation to create a CodeBuild project to allow discovery of your repositories and run security scans.

Figure 7: CodeGuru and CodePipeline integration
The CloudFormation template is already entered. Select the acknowledge box, and then choose Create stack.

Figure 8: CloudFormation quick create stack
If you already have a pipeline integration, go to Step 2 and select CodePipeline console. If this is your first time using CodePipeline, this blog post explains how to integrate it with AWS CI/CD services.

Figure 9: Integrate with AWS CodePipeline
Choose Edit.

Figure 10: CodePipeline with CodeGuru integration
Choose Add stage.

Figure 11: Add Stage in CodePipeline
On the Edit action page:
1. Enter a stage name.
2. For the stage you just created, choose Add action group.
3. For Action provider, select CodeBuild.
4. For Input artifacts, select SourceArtifact.
5. For Project name, select CodeGuruSecurity.
6. Choose Done, and then choose Save.
Figure 12: Add action group

Test CodeGuru Security

You have now created a security check stage for your CI/CD pipeline. To test the pipeline, choose Release change.

Figure 13: CodePipeline with successful security scan

If your code was successfully scanned, you will see Succeeded in the Most recent execution column for your pipeline.

Figure 14: CodePipeline dashboard with successful security scan

Findings

To analyze the findings of your scan, select Findings under Security, and you will see the findings for the scans whether manually done or through integrations. Each finding will show the vulnerability, the scan it belongs to, the severity level, the status of an open case or closed case, the age, and the time of detection.

Figure 15: Findings inside CodeGuru security

Dashboard

To view a summary of the insights and findings from your scan, select Dashboard, under Security, and you will see high level summary of your findings overview and a vulnerability fix overview.

Figure 16:Findings inside CodeGuru dashboard

Amazon Inspector

Your journey with the shift-left model extends beyond code deployment. After scanning your code repositories and using tools like CodeWhisperer and CodeGuru Security to proactively reduce security risks before code commits to a repository, your code might still encounter potential vulnerabilities after being deployed to production. For instance, faulty software updates can introduce risks to your application. Continuous vigilance and monitoring after deployment are crucial.

This is where Amazon Inspector offers ongoing assessment throughout your resource lifecycle, automatically rescanning resources in response to changes. Amazon Inspector seamlessly complements the shift-left model by identifying vulnerabilities as your workload operates in a production environment.

Amazon Inspector continuously scans various components, including Amazon EC2, Lambda functions, and container workloads, seeking out software vulnerabilities and inadvertent network exposure. Its user-friendly features include enablement in a few clicks, continuous and automated scanning, and robust support for multi-account environments through AWS Organizations. After activation, it autonomously identifies workloads and presents real-time coverage details, consolidating findings across accounts and resources.

Distinguishing itself from traditional security scanning software, Amazon Inspector has minimal impact on your fleet’s performance. When vulnerabilities or open network paths are uncovered, it generates detailed findings, including comprehensive information about the vulnerability, the affected resource, and recommended remediation. When you address a finding appropriately, Amazon Inspector autonomously detects the remediation and closes the finding.

The findings you receive are prioritized according to a contextualized Inspector risk score, facilitating prompt analysis and allowing for automated remediation.

Additionally, Amazon Inspector provides robust management APIs for comprehensive programmatic access to the Amazon Inspector service and resources. You can also access detailed findings through Amazon EventBridge and seamlessly integrate them into AWS Security Hub for a comprehensive security overview.

Scan workloads with Amazon Inspector

Use the following examples to learn how to use Amazon Inspector to scan AWS workloads.

Open the Amazon Inspector console in your AWS Organizations management account. In the navigation pane, select Activate Inspector.
Under Delegated administrator, enter the account number for your desired account to grant it all the permissions required to manage Amazon Inspector for your organization. Consider using your Security Tooling account as delegated administrator for Amazon Inspector. Choose Delegate. Then, in the confirmation window, choose Delegate again. When you select a delegated administrator, Amazon Inspector is activated for that account. Now, choose Activate Inspector to activate the service in your management account.

Figure 17: Set the delegated administrator account ID for Amazon Inspector
You will see a green success message near the top of your browser window and the Amazon Inspector dashboard, showing a summary of data from the accounts.

Figure 18: Amazon Inspector dashboard after activation

Explore Amazon Inspector

From the Amazon Inspector console in your delegated administrator account, in the navigation pane, select Account management. Because you’re signed in as the delegated administrator, you can enable and disable Amazon Inspector in the other accounts that are part of your organization. You can also automatically enable Amazon Inspector for new member accounts.

Figure 19: Amazon Inspector account management dashboard
In the navigation pane, select Findings. Using the contextualized Amazon Inspector risk score, these findings are sorted into several severity ratings.
1. The contextualized Amazon Inspector risk score is calculated by correlating CVE information with findings such as network access and exploitability.
2. This score is used to derive severity of a finding and prioritize the most critical findings to improve remediation response efficiency.
Figure 20: Findings in Amazon Inspector sorted by severity (default)

When you enable Amazon Inspector, it automatically discovers all of your Amazon EC2 and Amazon ECR resources. It scans these workloads to detect vulnerabilities that pose risks to the security of your compute workloads. After the initial scan, Amazon Inspector continues to monitor your environment. It automatically scans new resources and re-scans existing resources when changes are detected. As vulnerabilities are remediated or resources are removed from service, Amazon Inspector automatically updates the associated security findings.

In order to successfully scan EC2 instances, Amazon Inspector requires inventory collected by AWS Systems Manager and the Systems Manager agent. This is installed by default on many EC2 instances. If you find some instances aren’t being scanned by Amazon Inspector, this might be because they aren’t being managed by Systems Manager.
Select a findings title to see the associated report.
1. Each finding provides a description, severity rating, information about the affected resource, and additional details such as resource tags and how to remediate the reported vulnerability.
2. Amazon Inspector stores active findings until they are closed by remediation. Findings that are closed are displayed for 30 days.
Figure 21: Amazon Inspector findings report details

Integrate CodeGuru Security with Amazon Inspector to scan Lambda functions

Amazon Inspector and CodeGuru Security work harmoniously together. CodeGuru Security is available through Amazon Inspector Lambda code scanning. After activating Lambda code scanning, you can configure automated code scans to be performed on your Lambda functions.

Use the following steps to configure Amazon CodeGuru Security with Amazon Inspector Lambda code scanning to evaluate Lambda functions.

Open the Amazon Inspector console and select Account management from the navigation pane.
Select the AWS account you want to activate Lambda code scanning in.

Figure 22: Activating AWS Lambda code scanning from the Amazon Inspector Account management console
Choose Activate and select AWS Lambda code scanning.

With Lambda code scanning activated, security findings for your Lambda function code will appear in the All findings section of Amazon Inspector.

Amazon Inspector plays a crucial role in maintaining the highest security standards for your resources. Whether you’re installing a new package on an EC2 instance, applying a software patch, or when a new CVE affecting a specific resource is disclosed, Amazon Inspector can assist with quick identification and remediation.

Conclusion

Incorporating security at every stage of the software development lifecycle is paramount and requires that security be a consideration from the outset. Shifting left enables security teams to reduce overall application security risks.

Using these AWS services — Amazon CodeWhisperer, Amazon CodeGuru and Amazon Inspector — not only aids in early risk identification and mitigation, it empowers your development and security teams, leading to more efficient and secure business outcomes.

For further reading, check out the AWS Well Architected Security Pillar, the Generative AI on AWS page, and more blogs like this on the AWS Security Blog page.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Amazon CodeWhisperer re:Post forum or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Introducing shared VPC support on Amazon MWAA

2023-11-15 John Jackson

Post Syndicated from John Jackson original https://aws.amazon.com/blogs/big-data/introducing-shared-vpc-support-on-amazon-mwaa/

In this post, we demonstrate automating deployment of Amazon Managed Workflows for Apache Airflow (Amazon MWAA) using customer-managed endpoints in a VPC, providing compatibility with shared, or otherwise restricted, VPCs.

Data scientists and engineers have made Apache Airflow a leading open source tool to create data pipelines due to its active open source community, familiar Python development as Directed Acyclic Graph (DAG) workflows, and extensive library of pre-built integrations. Amazon MWAA is a managed service for Airflow that makes it easy to run Airflow on AWS without the operational burden of having to manage the underlying infrastructure. For each Airflow environment, Amazon MWAA creates a single-tenant service VPC, which hosts the metadatabase that stores states and the web server that provides the user interface. Amazon MWAA further manages Airflow scheduler and worker instances in a customer-owned and managed VPC, in order to schedule and run tasks that interact with customer resources. Those Airflow containers in the customer VPC access resources in the service VPC via a VPC endpoint.

Many organizations choose to centrally manage their VPC using AWS Organizations, allowing a VPC in an owner account to be shared with resources in a different participant account. However, because creating a new route outside of a VPC is considered a privileged operation, participant accounts can’t create endpoints in owner VPCs. Furthermore, many customers don’t want to extend the security privileges required to create VPC endpoints to all users provisioning Amazon MWAA environments. In addition to VPC endpoints, customers also wish to restrict data egress via Amazon Simple Queue Service (Amazon SQS) queues, and Amazon SQS access is a requirement in the Amazon MWAA architecture.

Shared VPC support for Amazon MWAA adds the ability for you to manage your own endpoints within your VPCs, adding compatibility to shared and otherwise restricted VPCs. Specifying customer-managed endpoints also provides the ability to meet strict security policies by explicitly restricting VPC resource access to just those needed by your Amazon MWAA environments. This post demonstrates how customer-managed endpoints work with Amazon MWAA and provides examples of how to automate the provisioning of those endpoints.

Solution overview

Shared VPC support for Amazon MWAA allows multiple AWS accounts to create their Airflow environments into shared, centrally managed VPCs. The account that owns the VPC (owner) shares the two private subnets required by Amazon MWAA with other accounts (participants) that belong to the same organization from AWS Organizations. After the subnets are shared, the participants can view, create, modify, and delete Amazon MWAA environments in the subnets shared with them.

When users specify the need for a shared, or otherwise policy-restricted, VPC during environment creation, Amazon MWAA will first create the service VPC resources, then enter a pending state for up to 72 hours, with an Amazon EventBridge notification of the change in state. This allows owners to create the required endpoints on behalf of participants based on endpoint service information from the Amazon MWAA console or API, or programmatically via an AWS Lambda function and EventBridge rule, as in the example in this post.

After those endpoints are created on the owner account, the endpoint service in the single-tenant Amazon MWAA VPC will detect the endpoint connection event and resume environment creation. Should there be an issue, you can cancel environment creation by deleting the environment during this pending state.

This feature also allows you to remove the create, modify, and delete VPCE privileges from the AWS Identity and Access Management (IAM) principal creating Amazon MWAA environments, even when not using a shared VPC, because that permission will instead be imposed on the IAM principal creating the endpoint (the Lambda function in our example). Furthermore, the Amazon MWAA environment will provide the SQS queue Amazon Resource Name (ARN) used by the Airflow Celery Executor to queue tasks (the Celery Executor Queue), allowing you to explicitly enter those resources into your network policy rather than having to provide a more open and generalized permission.

In this example, we create the VPC and Amazon MWAA environment in the same account. For shared VPCs across accounts, the EventBridge rule and Lambda function would exist in the owner account, and the Amazon MWAA environment would be created in the participant account. See Sending and receiving Amazon EventBridge events between AWS accounts for more information.

Prerequisites

You should have the following prerequisites:

An AWS account
An AWS user in that account, with permissions to create VPCs, VPC endpoints, and Amazon MWAA environments
An Amazon Simple Storage Service (Amazon S3) bucket in that account, with a folder called dags

Create the VPC

We begin by creating a restrictive VPC using an AWS CloudFormation template, in order to simulate creating the necessary VPC endpoint and modifying the SQS endpoint policy. If you want to use an existing VPC, you can proceed to the next section.

On the AWS CloudFormation console, choose Create stack and choose With new resources (standard).
Under Specify template, choose Upload a template file.
Now we edit our CloudFormation template to restrict access to Amazon SQS. In cfn-vpc-private-bjs.yml, edit the SqsVpcEndoint section to appear as follows:

   SqsVpcEndoint:
     Type: AWS::EC2::VPCEndpoint
     Properties:
       ServiceName: !Sub "com.amazonaws.${AWS::Region}.sqs"
       VpcEndpointType: Interface
       VpcId: !Ref VPC
       PrivateDnsEnabled: true
       SubnetIds:
        - !Ref PrivateSubnet1
        - !Ref PrivateSubnet2
       SecurityGroupIds:
        - !Ref SecurityGroup
       PolicyDocument:
        Statement:
         - Effect: Allow
           Principal: '*'
           Action: '*'
           Resource: []

This additional policy document entry prevents Amazon SQS egress to any resource not explicitly listed.

Now we can create our CloudFormation stack.

On the AWS CloudFormation console, choose Create stack.
Select Upload a template file.
Choose Choose file.
Browse to the file you modified.
Choose Next.
For Stack name, enter MWAA-Environment-VPC.
Choose Next until you reach the review page.
Choose Submit.

Create the Lambda function

We have two options for self-managing our endpoints: manual and automated. In this example, we create a Lambda function that responds to the Amazon MWAA EventBridge notification. You could also use the EventBridge notification to send an Amazon Simple Notification Service (Amazon SNS) message, such as an email, to someone with permission to create the VPC endpoint manually.

First, we create a Lambda function to respond to the EventBridge event that Amazon MWAA will emit.

On the Lambda console, choose Create function.
For Name, enter mwaa-create-lambda.
For Runtime, choose Python 3.11.
Choose Create function.

For Code, in the Code source section, for lambda_function, enter the following code:

import boto3
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    if event['detail']['status']=="PENDING":
        detail=event['detail']
        name=detail['name']
        celeryExecutorQueue=detail['celeryExecutorQueue']
        subnetIds=detail['networkConfiguration']['subnetIds']
        securityGroupIds=detail['networkConfiguration']['securityGroupIds']
        databaseVpcEndpointService=detail['databaseVpcEndpointService']

        # MWAA does not need to store the VPC ID, but we can get it from the subnets
        client = boto3.client('ec2')
        response = client.describe_subnets(SubnetIds=subnetIds)
        logger.info(response['Subnets'][0]['VpcId'])  
        vpcId=response['Subnets'][0]['VpcId']
        logger.info("vpcId: " + vpcId)       
        
        webserverVpcEndpointService=None
        if detail['webserverAccessMode']=="PRIVATE_ONLY":
            webserverVpcEndpointService=event['detail']['webserverVpcEndpointService']
        
        response = client.describe_vpc_endpoints(
            VpcEndpointIds=[],
            Filters=[
                {"Name": "vpc-id", "Values": [vpcId]},
                {"Name": "service-name", "Values": ["*.sqs"]},
                ],
            MaxResults=1000
        )
        sqsVpcEndpoint=None
        for r in response['VpcEndpoints']:
            if subnetIds[0] in r['SubnetIds'] or subnetIds[0] in r['SubnetIds']:
                # We are filtering describe by service name, so this must be SQS
                sqsVpcEndpoint=r
                break
        
        if sqsVpcEndpoint:
            logger.info("Found SQS endpoint: " + sqsVpcEndpoint['VpcEndpointId'])

            logger.info(sqsVpcEndpoint)
            pd = json.loads(sqsVpcEndpoint['PolicyDocument'])
            for s in pd['Statement']:
                if s['Effect']=='Allow':
                    resource = s['Resource']
                    logger.info(resource)
                    if '*' in resource:
                        logger.info("'*' already allowed")
                    elif celeryExecutorQueue in resource: 
                        logger.info("'"+celeryExecutorQueue+"' already allowed")                
                    else:
                        s['Resource'].append(celeryExecutorQueue)
                        logger.info("Updating SQS policy to " + str(pd))
        
                        client.modify_vpc_endpoint(
                            VpcEndpointId=sqsVpcEndpoint['VpcEndpointId'],
                            PolicyDocument=json.dumps(pd)
                            )
                    break
        
        # create MWAA database endpoint
        logger.info("creating endpoint to " + databaseVpcEndpointService)
        endpointName=name+"-database"
        response = client.create_vpc_endpoint(
            VpcEndpointType='Interface',
            VpcId=vpcId,
            ServiceName=databaseVpcEndpointService,
            SubnetIds=subnetIds,
            SecurityGroupIds=securityGroupIds,
            TagSpecifications=[
                {
                    "ResourceType": "vpc-endpoint",
                    "Tags": [
                        {
                            "Key": "Name",
                            "Value": endpointName
                        },
                    ]
                },
            ],           
        )
        logger.info("created VPCE: " + response['VpcEndpoint']['VpcEndpointId'])
            
        # create MWAA web server endpoint (if private)
        if webserverVpcEndpointService:
            endpointName=name+"-webserver"
            logger.info("creating endpoint to " + webserverVpcEndpointService)
            response = client.create_vpc_endpoint(
                VpcEndpointType='Interface',
                VpcId=vpcId,
                ServiceName=webserverVpcEndpointService,
                SubnetIds=subnetIds,
                SecurityGroupIds=securityGroupIds,
                TagSpecifications=[
                    {
                        "ResourceType": "vpc-endpoint",
                        "Tags": [
                            {
                                "Key": "Name",
                                "Value": endpointName
                            },
                        ]
                    },
                ],                  
            )
            logger.info("created VPCE: " + response['VpcEndpoint']['VpcEndpointId'])

    return {
        'statusCode': 200,
        'body': json.dumps(event['detail']['status'])
    }

Choose Deploy.
On the Configuration tab of the Lambda function, in the General configuration section, choose Edit.
For Timeout, increate to 5 minutes, 0 seconds.
Choose Save.
In the Permissions section, under Execution role, choose the role name to edit the permissions of this function.
For Permission policies, choose the link under Policy name.

Choose Edit and add a comma and the following statement:

{
		"Sid": "Statement1",
		"Effect": "Allow",
		"Action": 
		[
			"ec2:DescribeVpcEndpoints",
			"ec2:CreateVpcEndpoint",
			"ec2:ModifyVpcEndpoint",
            "ec2:DescribeSubnets",
			"ec2:CreateTags"
		],
		"Resource": 
		[
			"*"
		]
}

The complete policy should look similar to the following:

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Action": "logs:CreateLogGroup",
			"Resource": "arn:aws:logs:us-east-1:112233445566:*"
		},
		{
			"Effect": "Allow",
			"Action": [
				"logs:CreateLogStream",
				"logs:PutLogEvents"
			],
			"Resource": [
				"arn:aws:logs:us-east-1:112233445566:log-group:/aws/lambda/mwaa-create-lambda:*"
			]
		},
		{
			"Sid": "Statement1",
			"Effect": "Allow",
			"Action": [
				"ec2:DescribeVpcEndpoints",
				"ec2:CreateVpcEndpoint",
				"ec2:ModifyVpcEndpoint",
               	"ec2:DescribeSubnets",
				"ec2:CreateTags"
			],
			"Resource": [
				"*"
			]
		}
	]
}

Choose Next until you reach the review page.
Choose Save changes.

Create an EventBridge rule

Next, we configure EventBridge to send the Amazon MWAA notifications to our Lambda function.

On the EventBridge console, choose Create rule.
For Name, enter mwaa-create.
Select Rule with an event pattern.
Choose Next.
For Creation method, choose User pattern form.
Choose Edit pattern.

For Event pattern, enter the following:

{
  "source": ["aws.airflow"],
  "detail-type": ["MWAA Environment Status Change"]
}

Choose Next.
For Select a target, choose Lambda function.

You may also specify an SNS notification in order to receive a message when the environment state changes.

For Function, choose mwaa-create-lambda.
Choose Next until you reach the final section, then choose Create rule.

Create an Amazon MWAA environment

Finally, we create an Amazon MWAA environment with customer-managed endpoints.

On the Amazon MWAA console, choose Create environment.
For Name, enter a unique name for your environment.
For Airflow version, choose the latest Airflow version.
For S3 bucket, choose Browse S3 and choose your S3 bucket, or enter the Amazon S3 URI.
For DAGs folder, choose Browse S3 and choose the dags/ folder in your S3 bucket, or enter the Amazon S3 URI.
Choose Next.
For Virtual Private Cloud, choose the VPC you created earlier.
For Web server access, choose Public network (Internet accessible).
For Security groups, deselect Create new security group.
Choose the shared VPC security group created by the CloudFormation template.

Because the security groups of the AWS PrivateLink endpoints from the earlier step are self-referencing, you must choose the same security group for your Amazon MWAA environment.

For Endpoint management, choose Customer managed endpoints.
Keep the remaining settings as default and choose Next.
Choose Create environment.

When your environment is available, you can access it via the Open Airflow UI link on the Amazon MWAA console.

Clean up

Cleaning up resources that are not actively being used reduces costs and is a best practice. If you don’t delete your resources, you can incur additional charges. To clean up your resources, complete the following steps:

Delete your Amazon MWAA environment, EventBridge rule, and Lambda function.
Delete the VPC endpoints created by the Lambda function.
Delete any security groups created, if applicable.
After the above resources have completed deletion, delete the CloudFormation stack to ensure that you have removed all of the remaining resources.

Summary

This post described how to automate environment creation with shared VPC support in Amazon MWAA. This gives you the ability to manage your own endpoints within your VPC, adding compatibility to shared, or otherwise restricted, VPCs. Specifying customer-managed endpoints also provides the ability to meet strict security policies by explicitly restricting VPC resource access to just those needed by their Amazon MWAA environments. To learn more about Amazon MWAA, refer to the Amazon MWAA User Guide. For more posts about Amazon MWAA, visit the Amazon MWAA resources page.

About the author

John Jackson has over 25 years of software experience as a developer, systems architect, and product manager in both startups and large corporations and is the AWS Principal Product Manager responsible for Amazon MWAA.

Writing IAM Policies: Grant Access to User-Specific Folders in an Amazon S3 Bucket

2023-11-14 Dylan Souvage

Post Syndicated from Dylan Souvage original https://aws.amazon.com/blogs/security/writing-iam-policies-grant-access-to-user-specific-folders-in-an-amazon-s3-bucket/

November 14, 2023: We’ve updated this post to use IAM Identity Center and follow updated IAM best practices.

In this post, we discuss the concept of folders in Amazon Simple Storage Service (Amazon S3) and how to use policies to restrict access to these folders. The idea is that by properly managing permissions, you can allow federated users to have full access to their respective folders and no access to the rest of the folders.

Overview

Imagine you have a team of developers named Adele, Bob, and David. Each of them has a dedicated folder in a shared S3 bucket, and they should only have access to their respective folders. These users are authenticated through AWS IAM Identity Center (successor to AWS Single Sign-On).

In this post, you’ll focus on David. You’ll walk through the process of setting up these permissions for David using IAM Identity Center and Amazon S3. Before you get started, let’s first discuss what is meant by folders in Amazon S3, because it’s not as straightforward as it might seem. To learn how to create a policy with folder-level permissions, you’ll walk through a scenario similar to what many people have done on existing files shares, where every IAM Identity Center user has access to only their own home folder. With folder-level permissions, you can granularly control who has access to which objects in a specific bucket.

You’ll be shown a policy that grants IAM Identity Center users access to the same Amazon S3 bucket so that they can use the AWS Management Console to store their information. The policy allows users in the company to upload or download files from their department’s folder, but not to access any other department’s folder in the bucket.

After the policy is explained, you’ll see how to create an individual policy for each IAM Identity Center user.

Throughout the rest of this post, you will use a policy, which will be associated with an IAM Identity Center user named David. Also, you must have already created an S3 bucket.

Note: S3 buckets have a global namespace and you must change the bucket name to a unique name.

For this blog post, you will need an S3 bucket with the following structure (the example bucket name for the rest of the blog is “my-new-company-123456789”):

/home/Adele/ /home/Bob/ /home/David/ /confidential/ /root-file.txt

Figure 1: Screenshot of the root of the my-new-company-123456789 bucket

Your S3 bucket structure should have two folders, home and confidential, with a file root-file.txt in the main bucket directory. Inside confidential you will have no items or folders. Inside home there should be three sub-folders: Adele, Bob, and David.

Figure 2: Screenshot of the home/ directory of the my-new-company-123456789 bucket

A brief lesson about Amazon S3 objects

Before explaining the policy, it’s important to review how Amazon S3 objects are named. This brief description isn’t comprehensive, but will help you understand how the policy works. If you already know about Amazon S3 objects and prefixes, skip ahead to Creating David in Identity Center.

Amazon S3 stores data in a flat structure; you create a bucket, and the bucket stores objects. S3 doesn’t have a hierarchy of sub-buckets or folders; however, tools like the console can emulate a folder hierarchy to present folders in a bucket by using the names of objects (also known as keys). When you create a folder in S3, S3 creates a 0-byte object with a key that references the folder name that you provided. For example, if you create a folder named photos in your bucket, the S3 console creates a 0-byte object with the key photos/. The console creates this object to support the idea of folders. The S3 console treats all objects that have a forward slash (/) character as the last (trailing) character in the key name as a folder (for example, examplekeyname/)

To give you an example, for an object that’s named home/common/shared.txt, the console will show the shared.txt file in the common folder in the home folder. The names of these folders (such as home/ or home/common/) are called prefixes, and prefixes like these are what you use to specify David’s department folder in his policy. By the way, the slash (/) in a prefix like home/ isn’t a reserved character — you could name an object (using the Amazon S3 API) with prefixes such as home:common:shared.txt or home-common-shared.txt. However, the convention is to use a slash as the delimiter, and the Amazon S3 console (but not S3 itself) treats the slash as a special character for showing objects in folders. For more information on organizing objects in the S3 console using folders, see Organizing objects in the Amazon S3 console by using folders.

Creating David in Identity Center

IAM Identity Center helps you securely create or connect your workforce identities and manage their access centrally across AWS accounts and applications. Identity Center is the recommended approach for workforce authentication and authorization on AWS for organizations of any size and type. Using Identity Center, you can create and manage user identities in AWS, or connect your existing identity source, including Microsoft Active Directory, Okta, Ping Identity, JumpCloud, Google Workspace, and Azure Active Directory (Azure AD). For further reading on IAM Identity Center, see the Identity Center getting started page.

Begin by setting up David as an IAM Identity Center user. To start, open the AWS Management Console and go to IAM Identity Center and create a user.

Note: The following steps are for Identity Center without System for Cross-domain Identity Management (SCIM) turned on, the add user option won’t be available if SCIM is turned on.

From the left pane of the Identity Center console, select Users, and then choose Add user.

Figure 3: Screenshot of IAM Identity Center Users page.
Enter David as the Username, enter an email address that you have access to as you will need this later to confirm your user, and then enter a First name, Last name, and Display name.
Leave the rest as default and choose Add user.
Select Users from the left navigation pane and verify you’ve created the user David.

Figure 4: Screenshot of adding users to group in Identity Center.
Now that you’re verified the user David has been created, use the left pane to navigate to Permission sets, then choose Create permission set.

Figure 5: Screenshot of permission sets in Identity Center.
Select Custom permission set as your Permission set type, then choose Next.

Figure 6: Screenshot of permission set types in Identity Center.

David’s policy

This is David’s complete policy, which will be associated with an IAM Identity Center federated user named David by using the console. This policy grants David full console access to only his folder (/home/David) and no one else’s. While you could grant each user access to their own bucket, keep in mind that an AWS account can have up to 100 buckets by default. By creating home folders and granting the appropriate permissions, you can instead allow thousands of users to share a single bucket.

{
 “Version”:”2012-10-17”,
 “Statement”: [
   {
     “Sid”: “AllowUserToSeeBucketListInTheConsole”,
     “Action”: [“s3:ListAllMyBuckets”, “s3:GetBucketLocation”],
     “Effect”: “Allow”,
     “Resource”: [“arn:aws:s3:::*”]
   },
  {
     “Sid”: “AllowRootAndHomeListingOfCompanyBucket”,
     “Action”: [“s3:ListBucket”],
     “Effect”: “Allow”,
     “Resource”: [“arn:aws:s3::: my-new-company-123456789”],
     “Condition”:{“StringEquals”:{“s3:prefix”:[“”,”home/”, “home/David”],”s3:delimiter”:[“/”]}}
    },
   {
     “Sid”: “AllowListingOfUserFolder”,
     “Action”: [“s3:ListBucket”],
     “Effect”: “Allow”,
     “Resource”: [“arn:aws:s3:::my-new-company-123456789”],
     “Condition”:{“StringLike”:{“s3:prefix”:[“home/David/*”]}}
   },
   {
     “Sid”: “AllowAllS3ActionsInUserFolder”,
     “Effect”: “Allow”,
     “Action”: [“s3:*”],
     “Resource”: [“arn:aws:s3:::my-new-company-123456789/home/David/*”]
   }
 ]
}

Now, copy and paste the preceding IAM Policy into the inline policy editor. In this case, you use the JSON editor. For information on creating policies, see Creating IAM policies.

Figure 7: Screenshot of the inline policy inside the permissions set in Identity Center.
Give your permission set a name and a description, then leave the rest at the default settings and choose Next.
Verify that you modify the policies to have the bucket name you created earlier.
After your permission set has been created, navigate to AWS accounts on the left navigation pane, then select Assign users or groups.

Figure 8: Screenshot of the AWS accounts in Identity Center.
Select the user David and choose Next.

Figure 9: Screenshot of the AWS accounts in Identity Center.
Select the permission set you created earlier, choose Next, leave the rest at the default settings and choose Submit.

Figure 10: Screenshot of the permission sets in Identity Center.

You’ve now created and attached the permissions required for David to view his S3 bucket folder, but not to view the objects in other users’ folders. You can verify this by signing in as David through the AWS access portal.

Figure 11: Screenshot of the settings summary in Identity Center.
Navigate to the dashboard in IAM Identity Center and go to the Settings summary, then choose the AWS access portal URL.

Figure 12: Screenshot of David signing into the console via the Identity Center dashboard URL.
Sign in as the user David with the one-time password you received earlier when creating David.

Figure 13: Second screenshot of David signing into the console through the Identity Center dashboard URL.
Open the Amazon S3 console.
Search for the bucket you created earlier.

Figure 14: Screenshot of my-new-company-123456789 bucket in the AWS console.
Navigate to David’s folder and verify that you have read and write access to the folder. If you navigate to other users’ folders, you’ll find that you don’t have access to the objects inside their folders.

David’s policy consists of four blocks; let’s look at each individually.

Block 1: Allow required Amazon S3 console permissions

Before you begin identifying the specific folders David can have access to, you must give him two permissions that are required for Amazon S3 console access: ListAllMyBuckets and GetBucketLocation.

   {
      "Sid": "AllowUserToSeeBucketListInTheConsole",
      "Action": ["s3:GetBucketLocation", "s3:ListAllMyBuckets"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::*"]
   }

The ListAllMyBuckets action grants David permission to list all the buckets in the AWS account, which is required for navigating to buckets in the Amazon S3 console (and as an aside, you currently can’t selectively filter out certain buckets, so users must have permission to list all buckets for console access). The console also does a GetBucketLocation call when users initially navigate to the Amazon S3 console, which is why David also requires permission for that action. Without these two actions, David will get an access denied error in the console.

Block 2: Allow listing objects in root and home folders

Although David should have access to only his home folder, he requires additional permissions so that he can navigate to his folder in the Amazon S3 console. David needs permission to list objects at the root level of the my-new-company-123456789 bucket and to the home/ folder. The following policy grants these permissions to David:

   {
      "Sid": "AllowRootAndHomeListingOfCompanyBucket",
      "Action": ["s3:ListBucket"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::my-new-company-123456789"],
      "Condition":{"StringEquals":{"s3:prefix":["","home/", "home/David"],"s3:delimiter":["/"]}}
   }

Without the ListBucket permission, David can’t navigate to his folder because he won’t have permissions to view the contents of the root and home folders. When David tries to use the console to view the contents of the my-new-company-123456789 bucket, the console will return an access denied error. Although this policy grants David permission to list all objects in the root and home folders, he won’t be able to view the contents of any files or folders except his own (you specify these permissions in the next block).

This block includes conditions, which let you limit under what conditions a request to AWS is valid. In this case, David can list objects in the my-new-company-123456789 bucket only when he requests objects without a prefix (objects at the root level) and objects with the home/ prefix (objects in the home folder). If David tries to navigate to other folders, such as confidential/, David is denied access. Additionally, David needs permissions to list prefix home/David to be able to use the search functionality of the console instead of scrolling down the list of users’ folders.

To set these root and home folder permissions, I used two conditions: s3:prefix and s3:delimiter. The s3:prefix condition specifies the folders that David has ListBucket permissions for. For example, David can list the following files and folders in the my-new-company-123456789 bucket:

/root-file.txt /confidential/ /home/Adele/ /home/Bob/ /home/David/

But David cannot list files or subfolders in the confidential/, home/Adele, or home/Bob folders.

Although the s3:delimiter condition isn’t required for console access, it’s still a good practice to include it in case David makes requests by using the API. As previously noted, the delimiter is a character—such as a slash (/)—that identifies the folder that an object is in. The delimiter is useful when you want to list objects as if they were in a file system. For example, let’s assume the my-new-company-123456789 bucket stored thousands of objects. If David includes the delimiter in his requests, he can limit the number of returned objects to just the names of files and subfolders in the folder he specified. Without the delimiter, in addition to every file in the folder he specified, David would get a list of all files in any subfolders.

Block 3: Allow listing objects in David’s folder

In addition to the root and home folders, David requires access to all objects in the home/David/ folder and any subfolders that he might create. Here’s a policy that allows this:

{
      “Sid”: “AllowListingOfUserFolder”,
      “Action”: [“s3:ListBucket”],
      “Effect”: “Allow”,
      “Resource”: [“arn:aws:s3:::my-new-company-123456789”],
      "Condition":{"StringLike":{"s3:prefix":["home/David/*"]}}
    }

In the condition above, you use a StringLike expression in combination with the asterisk (*) to represent an object in David’s folder, where the asterisk acts as a wildcard. That way, David can list files and folders in his folder (home/David/). You couldn’t include this condition in the previous block (AllowRootAndHomeListingOfCompanyBucket) because it used the StringEquals expression, which would interpret the asterisk (*) as an asterisk, not as a wildcard.

In the next section, the AllowAllS3ActionsInUserFolder block, you’ll see that the Resource element specifies my-new-company/home/David/*, which looks like the condition that I specified in this section. You might think that you can similarly use the Resource element to specify David’s folder in this block. However, the ListBucket action is a bucket-level operation, meaning the Resource element for the ListBucket action applies only to bucket names and doesn’t take folder names into account. So, to limit actions at the object level (files and folders), you must use conditions.

Block 4: Allow all Amazon S3 actions in David’s folder

Finally, you specify David’s actions (such as read, write, and delete permissions) and limit them to just his home folder, as shown in the following policy:

    {
      "Sid": "AllowAllS3ActionsInUserFolder",
      "Effect": "Allow",
      "Action": ["s3:*"],
      "Resource": ["arn:aws:s3:::my-new-company-123456789/home/David/*"]
    }

For the Action element, you specified s3:*, which means David has permission to do all Amazon S3 actions. In the Resource element, you specified David’s folder with an asterisk (*) (a wildcard) so that David can perform actions on the folder and inside the folder. For example, David has permission to change his folder’s storage class. David also has permission to upload files, delete files, and create subfolders in his folder (perform actions in the folder).

An easier way to manage policies with policy variables

In David’s folder-level policy you specified David’s home folder. If you wanted a similar policy for users like Bob and Adele, you’d have to create separate policies that specify their home folders. Instead of creating individual policies for each IAM Identity Center user, you can use policy variables and create a single policy that applies to multiple users (a group policy). Policy variables act as placeholders. When you make a request to a service in AWS, the placeholder is replaced by a value from the request when the policy is evaluated.

For example, you can use the previous policy and replace David’s user name with a variable that uses the requester’s user name through attributes and PrincipalTag as shown in the following policy (copy this policy to use in the procedure that follows):

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Sid": "AllowUserToSeeBucketListInTheConsole",
			"Action": [
				"s3:ListAllMyBuckets",
				"s3:GetBucketLocation"
			],
			"Effect": "Allow",
			"Resource": [
				"arn:aws:s3:::*"
			]
		},
		{
			"Sid": "AllowRootAndHomeListingOfCompanyBucket",
			"Action": [
				"s3:ListBucket"
			],
			"Effect": "Allow",
			"Resource": [
				"arn:aws:s3:::my-new-company-123456789"
			],
			"Condition": {
				"StringEquals": {
					"s3:prefix": [
						"",
						"home/",
						"home/${aws:PrincipalTag/userName}"
					],
					"s3:delimiter": [
						"/"
					]
				}
			}
		},
		{
			"Sid": "AllowListingOfUserFolder",
			"Action": [
				"s3:ListBucket"
			],
			"Effect": "Allow",
			"Resource": [
				"arn:aws:s3:::my-new-company-123456789"
			],
			"Condition": {
				"StringLike": {
					"s3:prefix": [
						"home/${aws:PrincipalTag/userName}/*"
					]
				}
			}
		},
		{
			"Sid": "AllowAllS3ActionsInUserFolder",
			"Effect": "Allow",
			"Action": [
				"s3:*"
			],
			"Resource": [
				"arn:aws:s3:::my-new-company-123456789/home/${aws:PrincipalTag/userName}/*"
			]
		}
	]
}

To implement this policy with variables, begin by opening the IAM Identity Center console using the main AWS admin account (ensuring you’re not signed in as David).
Select Settings on the left-hand side, then select the Attributes for access control tab.

Figure 15: Screenshot of Settings inside Identity Center.
Create a new attribute for access control, entering userName as the Key and ${path:userName} as the Value, then choose Save changes. This will add a session tag to your Identity Center user and allow you to use that tag in an IAM policy.

Figure 16: Screenshot of managing attributes inside Identity Center settings.
To edit David’s permissions, go back to the IAM Identity Center console and select Permission sets.

Figure 17: Screenshot of permission sets inside Identity Center with Davids-Permissions selected.
Select David’s permission set that you created previously.
Select Inline policy and then choose Edit to update David’s policy by replacing it with the modified policy that you copied at the beginning of this section, which will resolve to David’s username.

Figure 18: Screenshot of David’s policy inside his permission set inside Identity Center.

You can validate that this is set up correctly by signing in to David’s user through the Identity Center dashboard as you did before and verifying you have access to the David folder and not the Bob or Adele folder.

Figure 19: Screenshot of David’s S3 folder with access to a .jpg file inside.

Whenever a user makes a request to AWS, the variable is replaced by the user name of whoever made the request. For example, when David makes a request, ${aws:PrincipalTag/userName} resolves to David; when Adele makes the request, ${aws:PrincipalTag/userName} resolves to Adele.

It’s important to note that, if this is the route you use to grant access, you must control and limit who can set this username tag on an IAM principal. Anyone who can set this tag can effectively read/write to any of these bucket prefixes. It’s important that you limit access and protect the bucket prefixes and who can set the tags. For more information, see What is ABAC for AWS, and the Attribute-based access control User Guide.

Conclusion

By using Amazon S3 folders, you can follow the principle of least privilege and verify that the right users have access to what they need, and only to what they need.

See the following example policy that only allows API access to the buckets, and only allows for adding, deleting, restoring, and listing objects inside the folders:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAllS3ActionsInUserFolder",
            "Effect": "Allow",
            "Action": [
                "s3:DeleteObject",
                "s3:DeleteObjectTagging",
                "s3:DeleteObjectVersion",
                "s3:DeleteObjectVersionTagging",
                "s3:GetObject",
                "s3:GetObjectTagging",
                "s3:GetObjectVersion",
                "s3:GetObjectVersionTagging",
                "s3:ListBucket",
                "s3:PutObject",
                "s3:PutObjectTagging",
                "s3:PutObjectVersionTagging",
                "s3:RestoreObject"
            ],
            "Resource": [
		   "arn:aws:s3:::my-new-company-123456789",
                "arn:aws:s3:::my-new-company-123456789/home/${aws:PrincipalTag/userName}/*"
            ],
            "Condition": {
                "StringLike": {
                    "s3:prefix": [
                        "home/${aws:PrincipalTag/userName}/*"
                    ]
                }
            }
        }
    ]
}

We encourage you to think about what policies your users might need and restrict the access by only explicitly allowing what is needed.

Here are some additional resources for learning about Amazon S3 folders and about IAM policies, and be sure to get involved at the community forums:

For a detailed walkthrough of Amazon S3 policies, see Controlling access to a bucket with user policies.
See Listing objects using prefixes and delimiters in Organizing objects using prefixes.
See a sample Amazon S3 API request in the Amazon S3 API Reference.
Blocking public access to your Amazon S3 storage.
Bucket policies and examples.
Our previous blog post about how to grant access to an Amazon S3 bucket.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

AWS CodeBuild adds support for AWS Lambda compute mode

2023-11-09 Ryan Bachman

Post Syndicated from Ryan Bachman original https://aws.amazon.com/blogs/devops/aws-codebuild-adds-support-for-aws-lambda-compute-mode/

AWS CodeBuild recently announced that it supports running projects on AWS Lambda. AWS CodeBuild is a fully managed continuous integration (CI) service that allows you to build and test your code without having to manage build servers. This new compute mode enables you to execute your CI process on the same AWS Lambda base images supported by the AWS Lambda service, providing consistency, performance, and cost benefits. If you are already building and deploying microservices or building code packages, its likely these light-weight and smaller code bases will benefit from the efficiencies gained by using Lambda compute for CI. In this post, I will explain the benefits of this new compute mode as well as provide an example CI workflow to demonstrate these benefits.

Consistency Benefits

One of the key benefits of running AWS CodeBuild projects using the AWS Lambda compute mode is consistency. By building and testing your serverless applications on the same base images AWS Lambda provides, you can ensure that your application works as intended before moving to production. This eliminates the possibility of compatibility issues that may arise when deploying to a different environment. Moreover, because the AWS Lambda runtime provides a standardized environment across all regions, you can build your serverless applications in Lambda on CodeBuild and have the confidence to deploy to any region where your customers are located.

Performance and Cost Benefits

Another significant advantage of running AWS CodeBuild projects on the AWS Lambda compute mode is performance. When you run your project with EC2 compute mode, there may be increased start-up latency that impacts the total CI process. However, because AWS Lambda is designed to process events in near real-time, executing jobs on the AWS Lambda compute mode can result in significant time savings. As a result, by switching to this new compute mode, you save time within your CI process and increase your frequency of integration.

Related to performance, our customers are looking for opportunities to optimize cost and deliver business value at the lowest price point. Customers can see meaningful cost optimization when using Lambda. Lambda-based compute offers better price/performance than CodeBuild’s current Amazon Elastic Compute Cloud (Amazon EC2) based compute — with per-second billing, 6,000 seconds of free tier, and 10GB of ephemeral storage. CodeBuild supports both x86 and Arm using AWS Graviton Processors for the new compute mode. By using the Graviton option, customers can further achieve lower costs with better performance.

Considerations

There are a few things to consider before moving your projects to AWS Lambda compute mode. Because this mode utilizes the same AWS Lambda service used by customers, there is a maximum project duration of 15 minutes and custom build timeouts are not supported. Additionally, local caching, batch builds, and docker image builds are not supported with Lambda compute mode. For a full list of limitations please refer to the AWS CodeBuild User Guide.

Walk-Through

In this walk-through I used a simple React app to showcase the performance benefits of building using CodeBuild’s Lambda compute option. To get started, I created an AWS CodeCommit repository and used npm to create a sample application that I then pushed to my CodeCommit repo.

Figure 1: CodeCommit Repository

Once my sample application is stored in my code repository, I then navigated to the AWS CodeBuild console. Using the console, I created two different CodeBuild projects. I selected my CodeCommit repository for the projects’ source and selected EC2 compute for one and Lambda for the other.

Figure 2: CodeBuild compute mode options

The compute mode is independent from instructions included in the project’s buildspec.yaml, so in order to test and compare the CI job on the two different compute modes, I created a buildspec and pushed that to the CodeCommit repository where I stored my simple React app in the previous step.

buildspec.yaml

version: 0.2
phases:
  build:
    commands:
      - npm install -g yarn
      - yarn
      - npm run build
      - npm run test -- --coverage --watchAll=false --testResultsProcessor="jest-junit"
artifacts:
  name: "build-output"
  base-directory: 'react-app'
  files:
    - "**/*"
reports:
  test-report:
    base-directory: 'react-app'
    files:
      - 'junit.xml'
    file-format: 'JUNITXML'
  coverage-report:
    base-directory: 'react-app'
    files:
      - 'coverage/clover.xml'
    file-format: 'CLOVERXML'

Once my projects were ready, I manually started each project from the console. Below are screenshots of the details of the execution times for each.

Figure 3: EC2 compute mode details

Figure 4: Lambda compute mode details

If you compare the provisioning and build times for this example you will see that the Lambda CodeBuild project executed significantly faster. In this example, the Lambda option was over 50% faster than the same build executed on EC2. If you are frequently building similar microservices that can take advantage of these efficiency gains, it is likely that CodeBuild on Lambda can save you both time and cost in your CI/CD pipelines.

Cleanup

If you followed along the walkthrough, you will want to remove the resources you created. First delete the two projects created in CodeBuild. This can be done by navigating to CodeBuild in the console, selecting each project and then hitting the “Delete build project” button. Next navigate to CodeCommit and delete the repository where you stored the code.

Conclusion

In conclusion, AWS CodeBuild using Lambda compute-mode provides significant benefits in terms of consistency, performance and cost. By building and testing your applications on the same runtime images executed by the AWS Lambda service, you can take advantage of benefits provided by the AWS Lambda service while reducing CI time and costs. We are excited to see how this new feature will help you build and deploy your applications faster and more efficiently. Get started today and take advantage of the benefits of running AWS CodeBuild projects on the AWS Lambda compute mode.

About the Author:

Configure dynamic tenancy for Amazon OpenSearch Dashboards

2023-11-07 Abhi Kalra

Post Syndicated from Abhi Kalra original https://aws.amazon.com/blogs/big-data/configure-dynamic-tenancy-for-amazon-opensearch-dashboards/

Amazon OpenSearch Service securely unlocks real-time search, monitoring, and analysis of business and operational data for use cases like application monitoring, log analytics, observability, and website search. In this post, we talk about new configurable dashboards tenant properties.

OpenSearch Dashboards tenants in Amazon OpenSearch Service are spaces for saving index patterns, visualizations, dashboards, and other Dashboards objects. Users can switch between multiple tenants to access and share index patterns and visualizations.

When users use Dashboards, they select their Dashboards tenant view. There are three types of tenant:

Global tenant – This tenant is shared among all the OpenSearch Dashboard users if they have access to it. This tenant is created by default for all domains.
Private tenant – This tenant is exclusive to each user and can’t be shared. It does not allow you to access routes or index patterns created by the global tenant. Private tenants are usually used for exploratory work.
Custom tenants – Administrators can create custom tenants and assign them to specific roles. Once created, these tenants can then provide spaces for specific groups of users.

One user can have access to multiple tenants, and this property is called multi-tenancy. With the OpenSearch 2.7 launch, administrators can dynamically configure the following tenancy properties:

Enable or disable multi-tenancy.
Enable or disable private tenant.
Change the default tenant.

Why do you need these properties to be dynamic?

Before OpenSearch 2.7, users of open-source OpenSearch, with security permissions, could enable and disable multi-tenancy and private tenant by changing the YAML configuration file and restarting their Dashboards environment. This had some drawbacks:

Users needed to do a Dashboards environment restart, which takes time.
Changing the configuration on large clusters (more than 100 data nodes) was difficult to automate and error-prone.
When configuration changes did not include all nodes due to configuration update failures or a failure to apply changes, the user experience would differ based on which node the request hits.

With OpenSearch 2.7 in Amazon OpenSearch Service, users can change tenancy configurations dynamically from both the REST API and from the Dashboards UI. This provides a faster and more reliable way to manage your Dashboards tenancy.

Introducing a new property: default tenant

Before OpenSearch 2.7, by default, all new users would sign in to their private tenant when accessing OpenSearch Dashboards. With 2.7, we have added a new property, default tenant. Now administrators can set a default tenant for when users sign in to OpenSearch Dashboards, whether it’s their own, private tenant, the global tenant, or a custom tenant.

This feature will serve two basic functions:

Remove confusion among new users who don’t have much experience with OpenSearch Dashboards and tenancy. If their usage of Dashboards is limited to visualizations and small modifications of already existing data in a particular tenant, they don’t have to worry about switching tenants and can access the tenant with required data by default.
Give more control to administrators. Administrators can decide which tenant should be default for all visualization purposes.

Users will sign in to the default tenant only when they are signing in for the first time or from a new browser. For subsequent sign-ins, the user will sign in to the tenant they previously signed in to, which comes from browser storage.

The user sign-in flow is as follows:

Since even a small change in these configurations can impact all the users accessing Dashboards, take care when configuring and changing these features to ensure smooth use of Dashboards.

Default tenancy configurations

The following shows the default tenancy configuration on domain creation.

“multitenancy_enabled” : true
“private_tenant_eabled”: true
“default_tenant”: “”

This means that by default for each new domain, multi-tenancy and private tenant will be enabled and the default tenant will be the global tenant. You can change this configuration after domain creation with admins or with users with access to the right FGAC or IAM roles.

Changing tenancy configurations using APIs

You can use the following API call in OpenSearch 2.7+ to configure tenancy properties. All three tenancy properties are optional:

PUT _plugins/_security/api/tenancy/config 
{
    "multitenancy_enabled":true,
    "private_tenant_enabled":false,
    "default_tenant":"mary_brown"
}

You can use the following API to retrieve the current tenancy configuration:

GET _plugins/_security/api/tenancy/config

Changing tenancy configuration from OpenSearch Dashboards

You can also configure tenancy properties from OpenSearch Dashboards. Amazon OpenSearch Service has introduced the option to configure and manage tenancy from the Getting started tab of the Security page. From the Manage tab of the Multi-tenancy page, admins can choose a tenant to be the default tenant and see tenancy status, which will tell whether a tenant is enabled or disabled. Admins can enable and disable multi-tenancy, private tenant, and choose the default tenant from the configure tab.

Summary

Since the release of OpenSearch 2.7, you can set your tenancy configuration dynamically, using both REST APIs and OpenSearch Dashboards. Dynamic, API-driven tenancy configuration will make use of tenancy features and Dashboards simpler and more efficient for both users and administrators. Administrators will have more control over which tenants are accessible to which users.

We would love to hear from you, especially about how this feature has helped your organization simplify your Dashboards usage. If you have other questions, please leave a comment.

To learn more, please visit the Amazon OpenSearch Service page.

About the authors

Abhi Kalra

Prabhat Chaturvedi

AWS KMS is now FIPS 140-2 Security Level 3. What does this mean for you?

2023-11-07 Rushir Patel

Post Syndicated from Rushir Patel original https://aws.amazon.com/blogs/security/aws-kms-now-fips-140-2-level-3-what-does-this-mean-for-you/

AWS Key Management Service (AWS KMS) recently announced that its hardware security modules (HSMs) were given Federal Information Processing Standards (FIPS) 140-2 Security Level 3 certification from the U.S. National Institute of Standards and Technology (NIST). For organizations that rely on AWS cryptographic services, this higher security level validation has several benefits, including simpler set up and operation. In this post, we will share more details about the recent change in FIPS validation status for AWS KMS and explain the benefits to customers using AWS cryptographic services as a result of this change.

Background on NIST FIPS 140

The FIPS 140 framework provides guidelines and requirements for cryptographic modules that protect sensitive information. FIPS 140 is the industry standard in the US and Canada and is recognized around the world as providing authoritative certification and validation for the way that cryptographic modules are designed, implemented, and tested against NIST cryptographic security guidelines.

Organizations follow FIPS 140 to help ensure that their cryptographic security is aligned with government standards. FIPS 140 validation is also required in certain fields such as manufacturing, healthcare, and finance and is included in several industry and regulatory compliance frameworks, such as the Payment Card Industry Data Security Standard (PCI DSS), the Federal Risk and Authorization Management Program (FedRAMP), and the Health Information Trust Alliance (HITRUST) framework. FIPS 140 validation is recognized in many jurisdictions around the world, so organizations that operate globally can use FIPS 140 certification internationally.

For more information on FIPS Security Levels and requirements, see FIPS Pub 140-2: Security Requirements for Cryptographic Modules.

What FIPS 140-2 Security Level 3 means for AWS KMS and you

Until recently, AWS KMS had been validated at Security Level 2 overall and at Security Level 3 in the following four sub-categories:

Cryptographic module specification
Roles, services, and authentication
Physical security
Design assurance

The latest certification from NIST means that AWS KMS is now validated at Security Level 3 overall in each sub-category. As a result, AWS assumes more of the shared responsibility model, which will benefit customers for certain use cases. Security Level 3 certification can assist organizations seeking compliance with several industry and regulatory standards. Even though FIPS 140 validation is not expressly required in a number of regulatory regimes, maintaining stronger, easier-to-use encryption can be a powerful tool for complying with FedRAMP, U.S. Department of Defense (DOD) Approved Product List (APL), HIPAA, PCI, the European Union’s General Data Protection Regulation (GDPR), and the ISO 27001 standard for security management best practices and comprehensive security controls.

Customers who previously needed to meet compliance requirements for FIPS 140-2 Level 3 on AWS were required to use AWS CloudHSM, a single-tenant HSM solution that provides dedicated HSMs instead of managed service HSMs. Now, customers who were using CloudHSM to help meet their compliance obligations for Level 3 validation can use AWS KMS by itself for key generation and usage. Compared to CloudHSM, AWS KMS is typically lower cost and easier to set up and operate as a managed service, and using AWS KMS shifts the responsibility for creating and controlling encryption keys and operating HSMs from the customer to AWS. This allows you to focus resources on your core business instead of on undifferentiated HSM infrastructure management tasks.

AWS KMS uses FIPS 140-2 Level 3 validated HSMs to help protect your keys when you request the service to create keys on your behalf or when you import them. The HSMs in AWS KMS are designed so that no one, not even AWS employees, can retrieve your plaintext keys. Your plaintext keys are never written to disk and are only used in volatile memory of the HSMs while performing your requested cryptographic operation.

The FIPS 140-2 Level 3 certified HSMs in AWS KMS are deployed in all AWS Regions, including the AWS GovCloud (US) Regions. The China (Beijing) and China (Ningxia) Regions do not support the FIPS 140-2 Cryptographic Module Validation Program. AWS KMS uses Office of the State Commercial Cryptography Administration (OSCCA) certified HSMs to protect KMS keys in China Regions. The certificate for the AWS KMS FIPS 140-2 Security Level 3 validation is available on the NIST Cryptographic Module Validation Program website.

As with many industry and regulatory frameworks, FIPS 140 is evolving. NIST approved and published a new updated version of the 140 standard, FIPS 140-3, which supersedes FIPS 140-2. The U.S. government has begun transitioning to the FIPS 140-3 cryptography standard, with NIST announcing that they will retire all FIPS 140-2 certificates on September 22, 2026. NIST recently validated AWS-LC under FIPS 140-3 and is currently in the process of evaluating AWS KMS and certain instance types of AWS CloudHSM under the FIPS 140-3 standard. To check the status of these evaluations, see the NIST Modules In Process List.

For more information on FIPS 140-3, see FIPS Pub 140-3: Security Requirements for Cryptographic Modules.

Legal Disclaimer

This document is provided for the purposes of information only; it is not legal advice, and should not be relied on as legal advice. Customers are responsible for making their own independent assessment of the information in this document. This document: (a) is for informational purposes only, (b) represents current AWS product offerings and practices, which are subject to change without notice, and (c) does not create any commitments or assurances from AWS and its affiliates, suppliers or licensors. AWS products or services are provided “as is” without warranties, representations, or conditions of any kind, whether express or implied. The responsibilities and liabilities of AWS to its customers are controlled by AWS agreements, and this document is not part of, nor does it modify, any agreement between AWS and its customers.

AWS encourages its customers to obtain appropriate advice on their implementation of privacy and data protection environments, and more generally, applicable laws and other obligations relevant to their business.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Introducing Amazon MWAA support for Apache Airflow version 2.7.2 and deferrable operators

2023-11-07 Manasi Bhutada

Post Syndicated from Manasi Bhutada original https://aws.amazon.com/blogs/big-data/introducing-amazon-mwaa-support-for-apache-airflow-version-2-7-2-and-deferrable-operators/

Amazon Managed Workflow for Apache Airflow (Amazon MWAA) is a managed service that allows you to use a familiar Apache Airflow environment with improved scalability, availability, and security to enhance and scale your business workflows without the operational burden of managing the underlying infrastructure.

Today, we are announcing the availability of Apache Airflow version 2.7.2 environments and support for deferrable operators on Amazon MWAA. In this post, we provide an overview of deferrable operators and triggers, including a walkthrough of an example showcasing how to use them. We also delve into some of the new features and capabilities of Apache Airflow, and how you can set up or upgrade your Amazon MWAA environment to version 2.7.2.

Deferrable operators and triggers

Standard operators and sensors continuously occupy an Airflow worker slot, regardless of whether they are active or idle. For example, even while waiting for an external system to complete a job, a worker slot is consumed. The Gantt chart below, representing a Directed Acyclic Graph (DAG), showcases this scenario through multiple Amazon Redshift operations.

Gantt chart representing DAG idle time

You can see the time each task spends idling while waiting for the Redshift cluster to be created, snapshotted, and paused. With the introduction of deferrable operators in Apache Airflow 2.2, the polling process can be offloaded to ensure efficient utilization of the worker slot. A deferrable operator can suspend itself and resume once the external job is complete, instead of continuously occupying a worker slot. This minimizes queued tasks and leads to a more efficient utilization of resources within your Amazon MWAA environment. The following figure shows a simplified diagram describing the process flow.

After a task has deferred its run, it frees up the worker slot and assigns the check of completion to a small piece of asynchronous code called a trigger. The trigger runs in a parent process called a triggerer, a service that runs an asyncio event loop. The triggerer has the capability to run triggers in parallel at scale, and to signal tasks to resume when a condition is met.

The Amazon provider package for Apache Airflow has added triggers for popular AWS services like AWS Glue and Amazon EMR. In Amazon MWAA environments running Apache Airflow v2.7.2, the management and operation of the triggerer service is taken care of for you. If you prefer not to use the triggerer service, you can change the configuration mwaa.triggerer_enabled. Additionally, you can define how many triggers each triggerer can run in parallel using the configuration parameter triggerer.default_capacity. This parameter defaults to values based on your Amazon MWAA environment class. Refer to the Configuration reference in the User Guide for detailed configuration values.

When to use deferrable operators

Deferrable operators are particularly useful for tasks that submit jobs to systems external to an Amazon MWAA environment, such as Amazon EMR, AWS Glue, and Amazon SageMaker, or other sensors waiting for a specific event to occur. These tasks can take minutes to hours to complete and are primarily idle operators, making them good candidates to be replaced by their deferrable versions. Some additional use cases include:

File system-based operations.
Database operations with long running queries.

Using deferrable operators in Amazon MWAA

To use deferrable operators in Amazon MWAA, ensure you’re running Apache Airflow version 2.7 or greater in your Amazon MWAA environment, and the operators or sensors in your DAGs support deferring. Operators in the Amazon provider package expose a deferrable parameter which you can set to True to run the operator in asynchronous mode. For example, you can use S3KeySensor in asynchronous mode as follows:

wait_for_source_data = S3KeySensor (
task_id="WaitForSourceData",
bucket_name="source_bucket_name",
bucket_key = "object_key",
aws_conn_id="aws_default",
deferrable=True
)

You can also utilize various pre-built deferrable operators available in other provider packages, such as Snowflake and Databricks.

Follow the complete sample code in the GitHub repository to understand how deferrable operators work together. You will be building and orchestrating the data pipeline illustrated in the following figure.

The pipeline consists of three stages:

A S3KeySensor that waits for a dataset to be uploaded in Amazon Simple Storage Service (Amazon S3)
An AWS Glue crawler to classify objects in the dataset and save schemas into the AWS Glue Data Catalog
An AWS Glue job that uses the metadata in the Data Catalog to denormalize the source dataset, create Data Catalog tables based on filtered data, and write the resulting data back to Amazon S3 in separate Apache Parquet files.

Setup and Teardown tasks

It’s common to build workflows that require ephemeral resources, for example an S3 bucket to temporarily store data, databases and corresponding datasets to run quality checks, or a compute cluster to train a model in a machine learning (ML) orchestration pipeline. You need to have these resources properly configured before running work tasks, and after their run, ensure they are torn down. Doing this manually is complex. It may lead to poor readability and maintainability of your DAGs, and leave resources running constantly, thereby increasing costs. With Amazon MWAA support for Apache Airflow version 2.7.2, you can use two new types of tasks to support this scenario: setup and teardown tasks.

Setup and teardown tasks ensure that the resources needed for a work task are set up before the task starts its run and then are taken down after it has finished, even if the work task fails. Any task can be configured as a setup or teardown task. Once configured, they have special visibility in the Airflow UI and also special behavior. The following graph describes a simple data quality check pipeline using setup and teardown tasks.

One option to mark setup_db_instance and teardown_db_instance as setup and teardown tasks is to use the as_teardown() method in the teardown task in the dependencies chain declaration. Note that the method receives the setup task as a parameter:

setup_db_instance >> column_quality_check >> row_count_quality_check >> teardown_db_instance.as_teardown(setups=setup_db_instance)

Another option is to use @setup and @teardown decorators:

from airflow.decorators import setup

@setup
def setup_db_instance():
...
return "Resources fully setup"

setup_db_instance()

After you configure the tasks, the graph view shows your setup tasks with an upward arrow and your teardown tasks with a downward arrow. They’re connected by a dotted line depicting the setup/teardown workflow. Any task between the setup and teardown tasks (such as column_quality_check and row_count_quality_check) are in the scope of the workflow. This arrangement involves the following behavior:

If you clear column_quality_check or row_count_quality_check, both setup_db_instance and teardown_db_instance will be cleared
If setup_db_instance runs successfully, and column_quality_check and row_count_quality_check have completed, regardless of whether they were successful or not, teardown_db_instance will run
If setup_db_instance fails or is skipped, then teardown_db_instance will fail or skip
If teardown_db_instance fails, by default Airflow ignores its status to evaluate whether the pipeline run was successful

Note that when creating setup and teardown workflows, there can be more than one set of setup and teardown tasks, and they can be parallel and nested. Neither setup nor teardown tasks are limited in number, nor are the worker tasks you can include in the scope of the workflow.

Follow the complete sample code in the GitHub repository to understand how setup and teardown tasks work.

When to use setup and teardown tasks

Setup and teardown tasks are useful to improve the reliability and cost-effectiveness of DAGs, ensuring that required resources are created and deleted in the right time. They can also help simplify complex DAGs by breaking them down into smaller, more manageable tasks, improving maintainability. Some use cases include:

Data processing based on ephemeral compute, like Amazon Elastic Compute Cloud (Amazon EC2) instances fleets or EMR clusters
ML model training or tuning pipelines
Extract, transform, and load (ETL) jobs using external ephemeral data stores to share data among Airflow tasks

With Amazon MWAA support for Apache Airflow version 2.7.2, you can start using setup and teardown tasks to improve your pipelines as of today. To learn more about Setup and Teardown tasks, refer to the Apache Airflow documentation.

Secrets cache

To reflect changes to your DAGs and tasks, the Apache Airflow scheduler parses your DAG files continuously, every 30 seconds by default. If you have variables or connections as top-level code (code outside the operator’s execute methods), a request is generated every time the DAG file is parsed, impacting parsing speed and leading to sub-optimal performance in the DAG file processing. If you are running at scale, it has the potential to affect Airflow performance and scalability as the amount of network communication and load on the metastore database increase. If you’re using an alternative secrets backend, such as AWS Secrets Manager, every DAG parse is a new request to that service, increasing costs.

With Amazon MWAA support for Apache Airflow version 2.7.2, you can use secrets cache for variables and connections. Airflow will cache variables and connections locally so that they can be accessed faster during DAG parsing, without having to fetch them from the secrets backend, environments variables, or metadata database. The following diagram describes the process.

Enabling caching will help lower the DAG parsing time, especially if variables and connections are used in top-level code (which is not a best practice). With the introduction of a secrets cache, the frequency of API calls to the backend is reduced, which in turn lowers the overall cost associated with backend access. However, similar to other caching implementations, a secrets cache may serve outdated values until the time to live (TTL) expires.

When to use the secrets cache feature

You should consider using the secrets cache feature to improve performance and reliability, and to reduce the operating costs of your Airflow tasks. This is particularly useful if your DAG frequently retrieves variables or connections in the top-level Python code.

How to use the secrets cache feature on Amazon MWAA

To enable the secrets cache, you can set the secrets.use_cache environment configuration parameter to True. Once enabled, Airflow will automatically cache secrets when they are accessed. The cache will only be used during DAG files parsing, and not during DAG runtime.

You can also control the TTL of stored values for which the cache is considered valid using the environment configuration parameter secrets.cache_ttl_seconds, which is defaulted to 15 minutes.

Running or failed filters and Cluster Activity page

Identifying DAGs in failed state can be challenging for large Airflow instances. You typically find yourself scrolling through pages searching for failures to address. With Apache Airflow version 2.7.2 environments in Amazon MWAA, you can now filter DAGs currently running and DAGs with failed DAG runs. As you can see in the following screenshot, two status tabs, Running and Failed, were added to the UI.

Another advantage of Amazon MWAA environments using Apache Airflow version 2.7.2 is the new Cluster Activity page for environment-level monitoring.

The Cluster Activity page gathers useful data to monitor your cluster’s live and historical metrics. In the top section of the page, you get live metrics on the number of DAGs ready to be scheduled, the top 5 longest running DAGs, slots used in different pools, and components health (meta database, scheduler, and triggerer). The following screenshot shows an example of this page.

The bottom section of the Cluster Activity page includes historical metrics of DAG runs and task instances states.

Set up a new Apache Airflow v2.7.2 environment in Amazon MWAA

Setting up a new Apache Airflow version 2.7.2 environment in Amazon MWAA not only provides new features, but also leverages Python 3.11 and the Amazon Linux 2023 (AL2023) base image, offering enhanced security, modern tooling, and support for the latest Python libraries and features. You can initiate the set up in your account and preferred Region using the AWS Management Console, API, or AWS Command Line Interface (AWS CLI). If you’re adopting infrastructure as code (IaC), you can automate the setup using AWS CloudFormation, the AWS Cloud Development Kit (AWS CDK), or Terraform scripts.

Upon successful creation of an Apache Airflow version 2.7.2 environment in Amazon MWAA, certain packages are automatically installed on the scheduler and worker nodes. For a complete list of installed packages and their versions, refer to this MWAA documentation. You can install additional packages using a requirements file. Beginning with Apache Airflow version 2.7.2, your requirements file must include a --constraints statement. If you do not provide a constraint, Amazon MWAA will specify one for you to ensure the packages listed in your requirements are compatible with the version of Apache Airflow you are using.

Upgrade from older versions of Apache Airflow to Apache Airflow v2.7.2

Take advantage of these latest capabilities by upgrading your older Apache Airflow v2.x-based environments to version 2.7.2 using in-place version upgrades. To learn more about in-place version upgrades, refer to Upgrading the Apache Airflow version or Introducing in-place version upgrades with Amazon MWAA.

Conclusion

In this post, we discussed deferrable operators along with some significant changes introduced in Apache Airflow version 2.7.2, such as the Cluster Activity page in the UI, the cache for variables and connections, and how you can get started using them in Amazon MWAA.

For additional details and code examples on Amazon MWAA, visit the Amazon MWAA User Guide and the Amazon MWAA examples GitHub repo.

Apache, Apache Airflow, and Airflow are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.

About the Authors

Manasi Bhutada is an ISV Solutions Architect based in the Netherlands. She helps customers design and implement well architected solutions in AWS that address their business problems. She is passionate about data analytics and networking. Beyond work she enjoys experimenting with food, playing pickleball, and diving into fun board games.

Hernan Garcia is a Senior Solutions Architect at AWS based in the Netherlands. He works in the Financial Services Industry supporting enterprises in their cloud adoption. He is passionate about serverless technologies, security, and compliance. He enjoys spending time with family and friends, and trying out new dishes from different cuisines.

Deploy Amazon QuickSight dashboards to monitor AWS Glue ETL job metrics and set alarms

2023-11-03 Michael Hamilton

Post Syndicated from Michael Hamilton original https://aws.amazon.com/blogs/big-data/deploy-amazon-quicksight-dashboards-to-monitor-aws-glue-etl-job-metrics-and-set-alarms/

No matter the industry or level of maturity within AWS, our customers require better visibility into their AWS Glue usage. Better visibility can lend itself to gains in operational efficiency, informed business decisions, and further transparency into your return on investment (ROI) when using the various features available through AWS Glue.

As your company grows, you should be able to answer simple questions about your AWS Glue usage, such as the following:

Where am I spending the most with AWS Glue?
Where can I save the most by taking advantage of new AWS Glue features?
What does my overall usage look like using AWS Glue?

AWS offers services such as Amazon QuickSight, a serverless business intelligence (BI) service that lets you centralize this view and even ask natural language questions of your data, using Amazon QuickSight Q. QuickSight can give business leaders and their technology counterparts a common landscape for reporting important details of their usage, providing automated narratives to bridge communication gaps.

In this post, we explore how to combine AWS Glue usage information and metrics with centralized reporting and visualization using QuickSight. This can provide you with a more comprehensive view of your usage and tools to help you dive deep into your AWS Glue job run environment. You have metrics available per job run within the AWS Glue console, but they don’t cover all available AWS Glue job metrics, and the visuals aren’t as interactive compared to the QuickSight dashboard.

Although we don’t cover optimizing your jobs for costs in this post, you can refer to Monitor and optimize cost on AWS Glue for Apache Spark to learn how to fine-tune your AWS Glue jobs for performance, efficiency ,and cost-optimization.

Let’s dive in!

Solution overview

The following diagram illustrates the architecture for the given solution. At a high level, a scheduled event triggers an orchestration flow consisting of multiple data, compute, and analytics resources—the output of which culminates as a set of visuals in a BI dashboard.

solution architecture

Now let’s dig into the technical details involved in this solution.

An AWS Step Functions workflow is scheduled to run once per hour through Amazon EventBridge, which triggers an AWS Lambda function that calls the AWS Glue GetJob and GetJobRun APIs. We parse this data to check for jobs that have succeeded, stopped, or failed in the past hour, as well as any streaming jobs. The metadata is extracted from each job run, including information like runtime, start time, end time, auto scaling, number of workers, and worker type, and is written to an Amazon DynamoDB table with TTL (time to live) enabled to ensure the table doesn’t grow too large.

We move into a parallel state to check two tables that Amazon Athena writes the output of the federated queries to. Athena first checks to make sure the tables exist in Amazon Simple Storage Service (Amazon S3), where the data will be stored. If the tables don’t exist, Athena creates them. One federated query gathers AWS Glue metric data from Amazon CloudWatch metrics; the other gathers data from the DynamoDB table where Lambda writes the AWS Glue job metadata it’s collecting. Both federated queries utilize appropriate filtering in order to only scan the necessary data from each source.

There is a choice state for each branch. If there is no new data to be added to a table in Amazon S3, the state ends and waits for the other to complete. For example, there could be an AWS Glue job that is running while the step is evaluating. In this case, the metrics for the job would be inserted in the table on Amazon S3, but the metadata from DynamoDB wouldn’t arrive until the following hour after the job has succeeded, stopped, or failed.

When new metrics or metadata are found, Athena inserts this data to the metrics or metadata tables in Amazon S3, which are both partitioned by the hour. After the data is inserted, the final steps call the QuickSight CreateIngestion API, which triggers data ingestion into QuickSight SPICE to power interactive analysis. At this point, the workflow has finished running and will run again the following hour.

In the following sections, we show you how to set up the solution, explore the dashboards, and configure alarms.

The code for this solution can be found at the AWS samples GitHub repository.

Prerequisites

You should have the following prerequisites:

An AWS account with AWS Identity and Access Management (IAM) privileges sufficient to create the solution resources
QuickSight Standard or Enterprise Edition with a QuickSight user created
An AWS Cloud9 integrated development environment (IDE) or your local machine using your preferred IDE with the following packages installed:
- Node Package Manager (npm)
- Git CLI
- AWS CDK v2
- AWS CDK Library
- TypeScript
The AWS Cloud Development Kit (AWS CDK) bootstrapped in your target AWS account and Region

Deploy solution resources with the AWS CDK

To provision the resources that build the dashboard and keep it up to date, we provide steps to download and deploy the solution via the AWS CDK. The solution was developed with cost-optimization as a priority, but some resources in the stack will incur costs once deployed.

This solution generates the following resources:

IAM role
EventBridge rule
Step Functions state machine
Lambda function
S3 bucket
Two AWS Glue tables and one AWS Glue database
DynamoDB table
Athena queries invoked by Step Functions
QuickSight data source, dataset, analysis, and dashboard

To deploy the solution, complete the following steps:

Clone the source code from AWS samples GitHub repository to the client:
```
git clone https://github.com/aws-samples/glue-metrics-in-quicksight
```

Bootstrap your AWS CDK app:

cd glue-metrics-in-quicksight
npm i aws-cdk-lib
cdk bootstrap

Deploy the solution with the required parameters:
1. The first parameter is for a new S3 bucket to be created, which holds the AWS Glue metrics and metadata.
2. The second parameter is required in order for QuickSight to assign permissions to the user who will manage the assets. Refer to Managing user access inside Amazon QuickSight to find your existing QuickSight users.
```
cdk deploy --parameters BucketName=New-Unique-Bucket-Name --parameters QuicksightUsername=QuickSight-Existing-User
```

If your deployment fails, make sure you installed the AWS CDK library and rerun cdk deploy after installing:

npm i aws-cdk-lib

The deployment may take up to 10 minutes.

After the solution is deployed, the Step Functions state machine will evaluate once per hour if it should ingest data into QuickSight. You can run some AWS Glue jobs after the stack is deployed and check the QuickSight dashboard in the next hour or two, where the job metadata and metrics will be populated for your analysis.

Explore the dashboard

The dashboard contains two sheets: Glue Jobs and Glue Metrics.

The Glue Jobs sheet includes all of the metadata about your AWS Glue job runs, including AWS Glue for Apache Spark, AWS Glue for Ray, and AWS Glue streaming ETL. Most of the visuals also have a hierarchy that you can drill down into with QuickSight, going as low as each specific job run ID. You can use controls to filter by date, job name, and job run ID.

In the following demonstration, you will see the pivot table, which is a simple view of all our job metadata, including estimated cost per job and job run. We open up a job name and see the different job runs. There is one individual job run that we would like to inspect the metrics on, so we choose the job name and choose View metrics for job run id: <my job run id>. This will take us to the Glue Metrics sheet and automatically filter for the job run ID we want to view.

glue information sheet

The Glue Metrics sheet is built to reflect the documentation we provide in AWS Glue resource monitoring. This documentation helps explain each visual in the dashboard. You can use the Glue Metrics sheet to view aggregated metrics across all jobs, a single job, or down to the job run ID.

To populate the Glue Metrics sheet, your AWS Glue jobs must be enabled to capture metrics in CloudWatch.

glue metrics sheet

Set up alerts

Setting up alerts on measures is also straightforward to do in QuickSight. To do so, choose (right-click) one of the tracked measures on either worksheet and choose Create Alarm. This will bring you to the configuration page to set up the metric you’d like to be alerted on.

quicksight alarm

The dashboard is designed to give you the freedom to alter it and make your own visualizations with the metadata and metrics that are provided to you. If you want even more insight into cost, consider deploying the CUDOS dashboard as well!

Clean up

If you no longer need the dashboard, delete the CDK app:

cdk destroy

Conclusion

In this post, we talked about the importance of having observability of your AWS Glue jobs and provided an AWS CDK app that deploys a QuickSight dashboard for you. We hope this helps you optimize your AWS Glue environment using the insights the dashboard provides. To learn about event-based alerting for your AWS Glue for Apache Spark and Ray jobs, refer to Automate alerting and reporting for AWS Glue job resource usage.

About the authors

Michael Hamilton is a Sr Analytics Solutions Architect focusing on helping enterprise customers in the south east modernize and simplify their analytics workloads on AWS. He enjoys mountain biking and spending time with his wife and three children when not working.

Cody Penta is a Solutions Architect at Amazon Web Services and is based out of Charlotte, NC. He has a focus in security and CDK, and enjoys solving the really difficult problems in the technology world. Off the clock, he loves relaxing in the mountains, coding personal projects, and gaming.

Angus Ferguson is a Solutions Architect at AWS who is passionate about meeting customers across the world, helping them solve their technical challenges. Angus specializes in Data & Analytics with a focus on customers in the financial services industry.

Aggregating, searching, and visualizing log data from distributed sources with Amazon Athena and Amazon QuickSight

2023-11-02 Pratima Singh

Post Syndicated from Pratima Singh original https://aws.amazon.com/blogs/security/aggregating-searching-and-visualizing-log-data-from-distributed-sources-with-amazon-athena-and-amazon-quicksight/

Customers using Amazon Web Services (AWS) can use a range of native and third-party tools to build workloads based on their specific use cases. Logs and metrics are foundational components in building effective insights into the health of your IT environment. In a distributed and agile AWS environment, customers need a centralized and holistic solution to visualize the health and security posture of their infrastructure.

You can effectively categorize the members of the teams involved using the following roles:

Executive stakeholder: Owns and operates with their support staff and has total financial and risk accountability.
Data custodian: Aggregates related data sources while managing cost, access, and compliance.
Operator or analyst: Uses security tooling to monitor, assess, and respond to related events such as service disruptions.

In this blog post, we focus on the data custodian role. We show you how you can visualize metrics and logs centrally with Amazon QuickSight irrespective of the service or tool generating them. We use Amazon Simple Storage Service (Amazon S3) for storage, AWS Glue for cataloguing, and Amazon Athena for querying the data and creating structured query language (SQL) views for QuickSight to consume.

Target architecture

This post guides you towards building a target architecture in line with the AWS Well-Architected Framework. The tiered and multi-account target architecture, shown in Figure 1, uses account-level isolation to separate responsibilities across the various roles identified above and makes access management more defined and specific to those roles. The workload accounts generate the telemetry around the applications and infrastructure. The data custodian account is where the data lake is deployed and collects the telemetry. The operator account is where the queries and visualizations are created.

Throughout the post, I mention AWS services that reduce the operational overhead in one or more stages of the architecture.

Figure 1: Data visualization architecture

Ingestion

Irrespective of the technology choices, applications and infrastructure configurations should generate metrics and logs that report on resource health and security. The format of the logs depends on which tool and which part of the stack is generating the logs. For example, the format of log data generated by application code can capture bespoke and additional metadata deemed useful from a workload perspective as compared to access logs generated by proxies or load balancers. For more information on types of logs and effective logging strategies, see Logging strategies for security incident response.

Amazon S3 is a scalable, highly available, durable, and secure object storage that you will use as the storage layer. To build a solution that captures events agnostic of the source, you must forward data as a stream to the S3 bucket. Based on the architecture, there are multiple tools you can use to capture and stream data into S3 buckets. Some tools support integration with S3 and directly stream data to S3. Resources like servers and virtual machines need forwarding agents such as Amazon Kinesis Agent, Amazon CloudWatch agent, or Fluent Bit.

Amazon Kinesis Data Streams provides a scalable data streaming environment. Using on-demand capacity mode eliminates the need for capacity provisioning and capacity management for streaming workloads. For log data and metric collection, you should use on-demand capacity mode, because log data generation can be unpredictable depending on the requests that are being handled by the environment. Amazon Kinesis Data Firehose can convert the format of your input data from JSON to Apache Parquet before storing the data in Amazon S3. Parquet is naturally compressed, and using Parquet native partitioning and compression allows for faster queries compared to JSON formatted objects.

Scalable data lake

Use AWS Lake Formation to build, secure, and manage the data lake to store log and metric data in S3 buckets. We recommend using tag-based access control and named resources to share the data in your data store to share data across accounts to build visualizations. Data custodians should configure access for relevant datasets to the operators who can use Athena to perform complex queries and build compelling data visualizations with QuickSight, as shown in Figure 2. For cross-account permissions, see Use Amazon Athena and Amazon QuickSight in a cross-account environment. You can also use Amazon DataZone to build additional governance and share data at scale within your organization. Note that the data lake is different to and separate from the Log Archive bucket and account described in Organizing Your AWS Environment Using Multiple Accounts.

Figure 2: Account structure

Amazon Security Lake

Amazon Security Lake is a fully managed security data lake service. You can use Security Lake to automatically centralize security data from AWS environments, SaaS providers, on-premises, and third-party sources into a purpose-built data lake that’s stored in your AWS account. Using Security Lake reduces the operational effort involved in building a scalable data lake, as the service automates the configuration and orchestration for the data lake with Lake Formation. Security Lake automatically transforms logs into a standard schema—the Open Cybersecurity Schema Framework (OCSF) — and parses them into a standard directory structure, which allows for faster queries. For more information, see How to visualize Amazon Security Lake findings with Amazon QuickSight.

Querying and visualization

Figure 3: Data sharing overview

After you’ve configured cross-account permissions, you can use Athena as the data source to create a dataset in QuickSight, as shown in Figure 3. You start by signing up for a QuickSight subscription. There are multiple ways to sign in to QuickSight; this post uses AWS Identity and Access Management (IAM) for access. To use QuickSight with Athena and Lake Formation, you first must authorize connections through Lake Formation. After permissions are in place, you can add datasets. You should verify that you’re using QuickSight in the same AWS Region as the Region where Lake Formation is sharing the data. You can do this by checking the Region in the QuickSight URL.

You can start with basic queries and visualizations as described in Query logs in S3 with Athena and Create a QuickSight visualization. Depending on the nature and origin of the logs and metrics that you want to query, you can use the examples published in Running SQL queries using Amazon Athena. To build custom analytics, you can create views with Athena. Views in Athena are logical tables that you can use to query a subset of data. Views help you to hide complexity and minimize maintenance when querying large tables. Use views as a source for new datasets to build specific health analytics and dashboards.

You can also use Amazon QuickSight Q to get started on your analytics journey. Powered by machine learning, Q uses natural language processing to provide insights into the datasets. After the dataset is configured, you can use Q to give you suggestions for questions to ask about the data. Q understands business language and generates results based on relevant phrases detected in the questions. For more information, see Working with Amazon QuickSight Q topics.

Conclusion

Logs and metrics offer insights into the health of your applications and infrastructure. It’s essential to build visibility into the health of your IT environment so that you can understand what good health looks like and identify outliers in your data. These outliers can be used to identify thresholds and feed into your incident response workflow to help identify security issues. This post helps you build out a scalable centralized visualization environment irrespective of the source of log and metric data.

This post is part 1 of a series that helps you dive deeper into the security analytics use case. In part 2, How to visualize Amazon Security Lake findings with Amazon QuickSight, you will learn how you can use Security Lake to reduce the operational overhead involved in building a scalable data lake and centralizing log data from SaaS providers, on-premises, AWS, and third-party sources into a purpose-built data lake. You will also learn how you can integrate Athena with Security Lake and create visualizations with QuickSight of the data and events captured by Security Lake.

Part 3, How to share security telemetry per Organizational Unit using Amazon Security Lake and AWS Lake Formation, dives deeper into how you can query security posture using AWS Security Hub findings integrated with Security Lake. You will also use the capabilities of Athena and QuickSight to visualize security posture in a distributed environment.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Refine permissions for externally accessible roles using IAM Access Analyzer and IAM action last accessed

2023-11-01 Nini Ren

Post Syndicated from Nini Ren original https://aws.amazon.com/blogs/security/refine-permissions-for-externally-accessible-roles-using-iam-access-analyzer-and-iam-action-last-accessed/

When you build on Amazon Web Services (AWS) across accounts, you might use an AWS Identity and Access Management (IAM) role to allow an authenticated identity from outside your account—such as an IAM entity or a user from an external identity provider—to access the resources in your account. IAM roles have two types of policies attached to them: a trust policy that allows access to an external entity, and a permissions policy that defines what actions the role can take. This blog post focuses on how to use AWS Identity and Access Management Access Analyzer cross-account access findings and IAM action last accessed information to refine the permissions policies of your IAM roles that have a trust policy.

IAM Access Analyzer helps you set, verify, and refine permissions. To learn more about how IAM Access Analyzer guides you toward least-privilege permissions, visit Using AWS IAM Access Analyzer. Action last accessed information helps you identify unused permissions and refine the access of your IAM roles to only the actions they use. IAM now provides action last accessed information for more than 140 services such as Amazon Kinesis Data Streams and Data Firehose, Amazon DynamoDB, and Amazon Simple Queue Service (Amazon SQS).

This blog post walks you through how to use IAM Access Analyzer and action last accessed to refine the required permissions for your IAM roles that have a trust policy, which allows entities outside of your account to assume a role and access your resources.

Use IAM roles to grant access to an external entity

You can create an IAM role that grants permissions for an entity outside your account to access the resources in your account. For example, if you’re an application developer, you might grant cross-account access to your AWS resources by using a role and attaching a trust policy to the role.

To allow an external entity access to your resources by using a role, you first create a role with a role trust policy to grant access to entities outside your account, and then grant permissions that specify which actions the role can take. The external entities can then assume the role in your account and access your resources based on the permissions you granted to the role. See Cross-account access using roles for more information.

You should restrict the access of roles that grant access outside of your account to just the permissions required to perform a specific task.

Use IAM Access Analyzer cross-account access findings to identify roles that grant access to external entities

When you use role trust policies to grant account access to entities outside your account, those entities can access and take the allowed actions on your resources. IAM Access Analyzer continuously monitors your account to identify the resources in your account that can be accessed from outside your account and helps you verify whether the access permissions meet your intent. For the example in this post, if you were to add a new trust policy to your
ApplicationRole to grant permissions to an external account to access an application in your account, IAM Access Analyzer would let you know that ApplicationRole is accessible by entities from outside your account.

Use IAM action last accessed information to identify and remove unused permissions

After you’ve identified the IAM roles that grant access to entities outside your account, review what those roles can do and remove unused permissions. You can use action last accessed to show you the latest timestamp when your IAM role used an action, analyze its access permissions, and remove unused permissions.

Refine permissions for externally accessible roles by using IAM Access Analyzer cross-account access findings and action last accessed information

This example demonstrates how you can combine the information from IAM Access Analyzer cross-account access findings and IAM action last accessed information to identify roles that can be assumed from outside your account, review unused and unnecessary actions, and reduce the permissions available to external roles.

To view action last accessed information in the IAM console

Open the AWS Management Console and go to the IAM console, and then select Access analyzer in the navigation pane.
If you’ve already created an analyzer, go to Step 3. Otherwise, follow Identify Unintended Resource Access with IAM Access Analyzer to create an analyzer.
Review your findings on the IAM Access Analyzer tab.
Under Active findings, for Filter active findings, enter AWS::IAM::Role. The list of Active findings shows you the roles that can be accessed by entities outside your account.

Figure 1: Findings filtered by resource types

Under the Finding ID column, select a finding for a role (for example, ApplicationRole) that you want to review.
A new page for the Finding ID will appear. Choose the resource ARN link in the Resource field under the Details section.

Figure 2: Findings page

A new page for the role will appear. Select the Access Advisor tab to review the last accessed information of your services for this role. This tab displays the AWS services to which the role has permissions. Action last accessed reports the actions listed in the IAM action last accessed information services and actions. The tracking period for services is the last 400 days—fewer if your AWS Region began tracking within the last 400 days. Learn more about Where AWS tracks last accessed information.

Figure 3: Last accessed information of allowed services

In this exercise, we will use DynamoDB as an example. Under Allowed services, for Search, enter Amazon DynamoDB and under the Service column, choose Amazon DynamoDB. This will take you to a new section titled Allowed management actions for Amazon DynamoDB, which displays the action last accessed information of your role for DynamoDB. The Action column displays the action, the Last Accessed column displays the timestamp of when access was last attempted, and the Region accessed column displays in which region access was last attempted.
The Action column on the resulting Allowed management actions for Amazon DynamoDB section includes the actions to which the role has permissions, when the role last accessed each action, and the Region accessed. You can sort the actions by choosing the arrow next to Last accessed.

Figure 4: Action last accessed information for Amazon DynamoDB

Because you want to remove unused permissions, filter for all unused actions for the role by selecting Services not accessed from the Last accessed dropdown list. This will show you the actions that haven’t been accessed during the tracking period.

Figure 5: Action last accessed information ordered by not accessed

To return to the service view, choose Back to Allowed services and then select the Permissions tab. Select the plus sign to the left of DynamoDBAccess to see the JSON of the customer managed policy.

Figure 6: The JSON code of the customer managed policy

Choose Edit and remove dynamodb:* and replace it with just the actions that have been used recently such as: DescribeTable and DescribeKinesisStreamingDestination. Not all actions are reported by action last accessed. Review the list of actions that action last accessed information reports and when action last accessed started tracking the action for the service in an AWS Region.
Choose Next and then Save changes. Return to the Access Advisor tab to confirm that all the retained permissions have been used recently.

Conclusion

In this post, you learned how to use IAM Access Analyzer and action last accessed information to identify and refine permissions for externally accessible roles in your journey toward least privilege. You first used IAM Access Analyzer cross-account access findings to identify IAM roles that can be accessed from outside your account. You then used IAM action last accessed information to review the permissions those roles are using and to remove unused permissions.

For more information about IAM Access Analyzer cross-account findings, see Findings for public and cross-account access. For more information about action last accessed information, see Things to know about last accessed information and the IAM action last accessed information services and actions.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS re:Post or contact AWS Support.

Security considerations for running containers on Amazon ECS

2023-11-01 Mutaz Hajeer

Post Syndicated from Mutaz Hajeer original https://aws.amazon.com/blogs/security/security-considerations-for-running-containers-on-amazon-ecs/

If you’re looking to enhance the security of your containers on Amazon Elastic Container Service (Amazon ECS), you can begin with the six tips that we’ll cover in this blog post. These curated best practices are recommended by Amazon Web Services (AWS) container and security subject matter experts in order to help raise your container security posture.

Before we jump into best practices, let’s look at the how the shared responsibility model works for Amazon ECS hosted on Amazon Elastic Compute Cloud (Amazon EC2) infrastructure compared to AWS Fargate. The security and compliance of a managed service like Amazon ECS is a shared responsibility between you and AWS. Generally speaking, AWS is responsible for security of the cloud whereas you, the customer, are responsible for security in the cloud. AWS is responsible for the management of the Amazon ECS control plane, including the infrastructure that’s needed to deliver a secure and reliable service. In this post, we’re going to focus on the areas of ECS security that you will be responsible for and provide guidance on what you need to do to adhere to these ECS security best practices.

Figure 1 shows the shared responsibility model for Amazon ECS hosted on an EC2 instance, in which the customer has more security responsibility to cover than when using ECS on Fargate. For example, the ECS agent and the worker node configuration is the customer’s responsibility to govern, because the customer is managing the EC2 instance. Therefore, the customer will have to manage the ECS agent and worker node as part of their configuration and management operations.

Figure 1: Responsibility model for Amazon ECS hosted on an Amazon EC2 instance

AWS assumes greater responsibility for infrastructure security for Fargate, as shown in Figure 2.

Figure 2: Responsibility model for Amazon ECS hosted on Fargate

In Fargate, each task runs in its own virtual machine (VM). No two tasks share the operating system or kernel resources. With Fargate, AWS manages the security of the underlying instance in the cloud and the runtime that’s used to run your tasks. It also automatically scales your infrastructure on your behalf, which is something you should take into consideration if you’re starting your container journey and deciding on your infrastructure options.

With that, let’s go through these six Amazon ECS security best practices.

1 – Manage ECS access with IAM policies and roles

AWS Identity and Access Management (IAM) policies can help you control access to Amazon ECS. For this we recommend that you do the following:

Enforce least privilege when setting up policies for Amazon ECS resources – Use resource-level permissions to specify upon which resources you want to allow particular actions. For example, only allow a specific IAM user to stop a task that uses a specific task definition family on a specific ECS cluster.
Specify your task’s role – Make sure to define the right task role in your ECS task definition. The task role is used by your application in the task to make API calls to AWS services like Amazon Simple Storage Service (Amazon S3). This allows you to run your tasks by using an IAM role that has only the necessary permissions, without complete access to all services and resources within your account.
Create automated pipelines – Use Amazon CodePipeline or one of your other preferred continuous integration and continuous delivery (CI/CD) solutions to create pipelines that package and deploy your applications into ECS clusters. This way, you limit the users’ actions and delegate them to the automated pipeline. For an example of how to create pipelines, see Automatically build CI/CD pipelines and Amazon ECS clusters for microservices using AWS CDK.
Audit Amazon ECS API access – Track and monitor your AWS CloudTrail logs to identify who has access to your Amazon ECS APIs and whether that access is still warranted. You can then delete the IAM users, roles, and groups that aren’t in use and review the policies that are in place. For more information, see the AWS security audit guidelines.

2 – Secure your ECS network

Network security is an important item to work on as part of applying best practices to secure your Amazon ECS environment. This area includes several sub-areas such as firewalling, traffic routing, and network observability. Here’s what we recommend:

Network segmentation and isolation – Amazon ECS tasks are configured to operate in different network modes. AWS recommends the use of awsvpc as the preferred network mode. This is because it’s the only mode that you can use to assign security groups to tasks. After you configure your task to use this mode, the ECS agent automatically provisions and attaches an elastic network interface (ENI) to the task. When the ENI is provisioned, the task is enrolled in an AWS security group. The security group acts as a virtual firewall that you can use to control inbound and outbound traffic. It’s also the only mode that’s available for Fargate tasks on ECS if you choose to go that route.
Use network encryption where applicable – Encrypting network traffic helps prevent unauthorized users from intercepting and reading data when that data is transmitted across a network. With Amazon ECS, you can implement network encryption in different ways, such as with a service mesh (TLS), using AWS Nitro system instances, using server name indication (SNI) with an application load balancer, or end-to-end encryption with TLS certificates. If your service is fronted by a public-facing load balancer, use TLS/SSL to encrypt the traffic from the client’s browser to the load balancer and re-encrypt traffic to the backend if warranted. For more information, see Amazon ECS encryption in transit.
Create clusters in separate VPCs when network traffic needs to be strictly isolated – You should create clusters in separate virtual private clouds (VPCs) when network traffic needs to be strictly isolated. Avoid running workloads that have strict security requirements on clusters with workloads that don’t have to adhere to those requirements. When strict network isolation is mandatory, create clusters in separate VPCs and selectively expose services to other VPCs by using VPC endpoints. For more information, see VPC endpoints.
Configure AWS PrivateLink endpoints when possible – AWS PrivateLink is a networking technology that allows you to create private endpoints for different AWS services, including Amazon ECS. You should configure AWS PrivateLink endpoints when possible. If your security policy prevents you from attaching an internet gateway to your Amazon VPCs, then configure PrivateLink endpoints for ECS and other services such as Amazon Elastic Container Registry (Amazon ECR), AWS Secrets Manager, and Amazon CloudWatch. For more details, see the Amazon ECS Best Practices Guide.

3 – ECS secrets management

Secrets, such as API keys and database credentials, are frequently used by applications to gain access to other systems. They often consist of a username and password, a certificate, or an API key. Access to these secrets should be restricted to specific IAM principals that are using IAM and injected into containers at runtime. Here’s what we recommend:

Use Secrets Manager or Amazon EC2 Systems Manager Parameter Store for storing secret materials – Securely storing API keys, database credentials, and other secret materials is crucial to help prevent accidental exposure and unauthorized access. AWS recommends that you store these secrets in Secrets Manager or as an encrypted parameter in Amazon EC2 Systems Manager Parameter Store. These services are similar because they’re both managed key-value stores that use AWS Key Management Service (AWS KMS) to encrypt sensitive data. Secrets Manager, however, also includes the ability to automatically rotate secrets, generate random secrets, and share secrets across AWS accounts. Additionally, Amazon ECS does not support versioned parameters in Parameter Store. If you need to implement any of these features, use Secrets Manager; otherwise, use encrypted parameters. Also, you can use tools like Chamber to manage secrets. For more information, see this Knowledge Center article.
Mount the secret to a volume by using a sidecar container – Considering the elevated risk of data leakage with environmental variables, it’s recommended that you run a sidecar container that reads your secrets from Secrets Manager and writes them to a shared volume. This container can run and exit before the application container by using Amazon ECS container ordering. When you do this, the application container subsequently mounts the volume where the secret was written. This will help isolate secret management concerns and facilitates dynamic secret handling. For example, your application should be written to read the secret from the shared volume. Then, because the volume is scoped to the task, the volume is automatically deleted after the task stops. For more details about sidecar containers, see the aws-secret-sidecar-injector project in GitHub.

4 – Secure the ECS task and runtime

You should consider the container image as your first line of defense. An insecure, poorly constructed image can allow users to escape the bounds of the container and gain access to the host. You should do the following to mitigate the chances of this happening:

Secure your container’s images – Escape to host is a well-known container threat technique where bad actors use unsecured container images to escape the bounds of a container and gain access to the underlying host. We recommend that you scan your container’s images before deployment. For images stored on Amazon ECR, you can use Amazon Inspector to scan your images, along with Amazon EventBridge to be notified to take actions to either delete or rebuild insecure images. This process is shown in the architecture in Figure 3. You can find more details on how to create custom responses to Amazon Inspector findings with Amazon EventBridge in the Amazon Inspector User Guide.

Figure 3: Sample architecture showing how to get notified of Amazon Inspector findings on a container’s image
Enable the ECR tag immutability feature – Threat actors could also try to push a compromised version of a container image into your Amazon ECR repository with an identical tag. A solution for this is to force a new tag for each new version of your image. You can do this by enabling the image tag mutability feature for your ECR repositories. You can find the Tag immutability setting on the Create repository page in the Amazon ECR console, under General settings, as shown in Figure 4.

Figure 4: Enabling the tag immutability feature for your Amazon ECR repository
Secure your containers and tasks
- Define the USER parameter to use inside your container – Containers run by default as the root user, which doesn’t adhere to the principle of least privilege and can be misused. One recommendation is to make sure to run your containers as a non-root user by specifying the USER directive in your Dockerfile. You can enforce this when using a CI/CD pipeline by configuring the pipeline to fail the build if the USER directive is missing.
- Don’t run your containers in privileged mode – Make sure to not run your containers in privileged mode, which can be a potential gap that allows unauthorized users to run commands within a container. You can use AWS Security Hub to detect containers that are running in privileged mode. Alternatively, you can use AWS Lambda to scan your task definitions for the use of the privileged parameter. Security Hub has a built-in control (ECS.4) to check whether the privileged parameter in the container definition of Amazon ECS task definitions is set to true.
Disable ECS Exec – Customers should disable the ECS Execute condition key for production environments. Disabling the key provides access control that can help prevent SSH access into running containers. You can do this by disabling the ECS:Enable-Execute-Command condition key.
Secure runtime – For Linux containers, make sure to add or drop Linux kernel capabilities in the task definition. You can do this either by using linuxParameters and applying SELinux labels or by using the AppArmor profile, which is a Linux security module that restricts a container’s capabilities, such as accessing parts of the file system. When you’re using the Fargate launch type, each Fargate task has its own isolation boundary and does not share the underlying kernel, CPU resources, memory resources, or elastic network interface with another task.

5 – ECS logging and monitoring

Logging and monitoring your container’s activity can help you quickly identify and investigate security incidents in your AWS environments. For example, threat actors might have escalated permissions and have access to your root user. Here’s what we recommend:

Monitor your root-user activities – Configure an Amazon EventBridge rule that detects root-user activities based on Amazon CloudTrail logs. For more details, see this blog post.
Monitor changes to your tasks and containers – Put appropriate events rules in place in Amazon EventBridge for the creation of and changes to your tasks and containers.
Monitor Amazon ECS scheduled tasks – If threat actors have enough privileges, they can abuse the ECS task scheduling feature to deploy containers that would run malicious code. Monitor this type of activity by using Amazon CloudTrail logs and get notifications. For more information about scheduling ECS tasks, see the Amazon ECS Developer Guide.
Monitor your container’s activity metrics – Another recommendation is to enable logging for your container and use Amazon CloudWatch to track activity metrics on your containers, such as CPU and memory utilization. This can help you detect if your resources are accessed and being used for malicious activities, such as launching DoS attacks. See Amazon ECS CloudWatch Container Insights for more information.
Use Amazon VPC Flow Logs to analyze the traffic to and from long-running tasks – You should use Amazon VPC Flow Logs to analyze the traffic to and from long-running tasks. Tasks that use awsvpc network mode get their own ENI. By setting tasks to use this mode, you can use VPC flow logs to monitor traffic that goes to and from individual tasks. A recent update to Amazon VPC Flow Logs (v3) enriches the logs with traffic metadata, including the VPC ID, subnet ID, and the instance ID. You can use this metadata to help narrow an investigation. For more information, see Amazon VPC Flow Logs. AWS cloud-native tools like Amazon GuardDuty inspect VPC flow logs and generate alerts and findings if unusual activity is detected.

6 – ECS security compliance

When using Amazon ECS, your compliance responsibility is determined by the sensitivity of your data, the compliance objectives of your company, and applicable laws and regulations. For example, with regards to the Payment Card Industry Data Security Standard (PCI-DSS), it’s important that you understand the complete flow of cardholder data (CHD) within your environment.

The temporary nature of containerized applications provides additional complexities when auditing configurations. As a result, customers need to maintain an awareness of all container configuration parameters, to make sure that compliance requirements are addressed throughout the phases of a container lifecycle. For additional information on adhering to PCI DSS compliance on Amazon ECS, see the Architecting on Amazon ECS for PCI DSS Compliance whitepaper.

One service that can help with monitoring Amazon ECS compliance is AWS Security Hub. You can use this service to monitor your usage of ECS as it relates to security best practices. Security Hub uses controls to evaluate resource configurations and security standards to help you comply with various compliance frameworks. For more information about using Security Hub to evaluate ECS resources, see Amazon ECS controls in the AWS Security Hub User Guide.

Conclusion

In this blog post, we presented a curated list of best practices for securing your Amazon ECS implementation. You can use these best practices as a starting point to increase the security posture of your ECS environment. You can always add, remove, or prioritize the best practice items based on your business needs and requirements. If you’re looking for more detailed guidance on securing ECS in your environment, we suggest that you take a look at Amazon ECS Security Best Practices.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on AWS re:Post or contact AWS Support.

Transforming transactions: Streamlining PCI compliance using AWS serverless architecture

2023-10-31 Abdul Javid

Post Syndicated from Abdul Javid original https://aws.amazon.com/blogs/security/transforming-transactions-streamlining-pci-compliance-using-aws-serverless-architecture/

Compliance with the Payment Card Industry Data Security Standard (PCI DSS) is critical for organizations that handle cardholder data. Achieving and maintaining PCI DSS compliance can be a complex and challenging endeavor. Serverless technology has transformed application development, offering agility, performance, cost, and security.

In this blog post, we examine the benefits of using AWS serverless services and highlight how you can use them to help align with your PCI DSS compliance responsibilities. You can remove additional undifferentiated compliance heavy lifting by building modern applications with abstracted AWS services. We review an example payment application and workflow that uses AWS serverless services and showcases the potential reduction in effort and responsibility that a serverless architecture could provide to help align with your compliance requirements. We present the review through the lens of a merchant that has an ecommerce website and include key topics such as access control, data encryption, monitoring, and auditing—all within the context of the example payment application. We don’t discuss additional service provider requirements from the PCI DSS in this post.

This example will help you navigate the intricate landscape of PCI DSS compliance. This can help you focus on building robust and secure payment solutions without getting lost in the complexities of compliance. This can also help reduce your compliance burden and empower you to develop your own secure, scalable applications. Join us in this journey as we explore how AWS serverless services can help you meet your PCI DSS compliance objectives.

Disclaimer

PCI DSS v4.0 and serverless

In April 2022, the Payment Card Industry Security Standards Council (PCI SSC) updated the security payment standard to “address emerging threats and technologies and enable innovative methods to combat new threats.” Two of the high-level goals of these updates are enhancing validation methods and procedures and promoting security as a continuous process. Adopting serverless architectures can help meet some of the new and updated requirements in version 4.0, such as enhanced software and encryption inventories. If a customer has access to change a configuration, it’s the customer’s responsibility to verify that the configuration meets PCI DSS requirements. There are more than 20 PCI DSS requirements applicable to Amazon Elastic Compute Cloud (Amazon EC2). To fulfill these requirements, customer organizations must implement controls such as file integrity monitoring, operating system level access management, system logging, and asset inventories. Using AWS abstracted services in this scenario can remove undifferentiated heavy lifting from your environment. With abstracted AWS services, because there is no operating system to manage, AWS becomes responsible for maintaining consistent time settings for an abstracted service to meet Requirement 10.6. This will also shift your compliance focus more towards your application code and data.

This makes more of your PCI DSS responsibility addressable through the AWS PCI DSS Attestation of Compliance (AOC) and Responsibility Summary. This attestation package is available to AWS customers through AWS Artifact.

Reduction in compliance burden

You can use three common architectural patterns within AWS to design payment applications and meet PCI DSS requirements: infrastructure, containerized, and abstracted. We look into EC2 instance-based architecture (infrastructure or containerized patterns) and modernized architectures using serverless services (abstracted patterns). While both approaches can help align with PCI DSS requirements, there are notable differences in how they handle certain elements. EC2 instances provide more control and flexibility over the underlying infrastructure and operating system, assisting you in customizing security measures based on your organization’s operational and security requirements. However, this also means that you bear more responsibility for configuring and maintaining security controls applicable to the operating systems, such as network security controls, patching, file integrity monitoring, and vulnerability scanning.

On the other hand, serverless architectures similar to the preceding example can reduce much of the infrastructure management requirements. This can relieve you, the application owner or cloud service consumer, of the burden of configuring and securing those underlying virtual servers. This can streamline meeting certain PCI requirements, such as file integrity monitoring, patch management, and vulnerability management, because AWS handles these responsibilities.

Using serverless architecture on AWS can significantly reduce the PCI compliance burden. Approximately 43 percent of the overall PCI compliance requirements, encompassing both technical and non-technical tests, are addressed by the AWS PCI DSS Attestation of Compliance.

Customer responsible
52%

AWS responsible
43%

N/A
5%

The following table provides an analysis of each PCI DSS requirement against the serverless architecture in Figure 1, which shows a sample payment application workflow. You must evaluate your own use and secure configuration of AWS workload and architectures for a successful audit.

PCI DSS 4.0 requirements	Test cases	Customer responsible	AWS responsible	N/A
Requirement 1: Install and maintain network security controls	35	13	22	0
Requirement 2: Apply secure configurations to all system components	27	16	11	0
Requirement 3: Protect stored account data	55	24	29	2
Requirement 4: Protect cardholder data with strong cryptography during transmission over open, public networks	12	7	5	0
Requirement 5: Protect all systems and networks from malicious software	25	4	21	0
Requirement 6: Develop and maintain secure systems and software	35	31	4	0
Requirement 7: Restrict access to system components and cardholder data by business need-to-know	22	19	3	0
Requirement 8: Identify users and authenticate access to system components	52	43	6	3
Requirement 9: Restrict physical access to cardholder data	56	3	53	0
Requirement 10: Log and monitor all access to system components and cardholder data	38	17	19	2
Requirement 11: Test security of systems and networks regularly	51	22	23	6
Requirement 12: Support information security with organizational policies	56	44	2	10
Total	464	243	198	23
Percentage		52%	43%	5%

Note: The preceding table is based on the example reference architecture that follows. The actual extent of PCI DSS requirements reduction can vary significantly depending on your cardholder data environment (CDE) scope, implementation, and configurations.

Sample payment application and workflow

This example serverless payment application and workflow in Figure 1 consists of several interconnected steps, each using different AWS services. The steps are listed in the following text and include brief descriptions. They cover two use cases within this example application — consumers making a payment and a business analyst generating a report.

The example outlines a basic serverless payment application workflow using AWS serverless services. However, it’s important to note that the actual implementation and behavior of the workflow may vary based on specific configurations, dependencies, and external factors. The example serves as a general guide and may require adjustments to suit the unique requirements of your application or infrastructure.

Several factors, including but not limited to, AWS service configurations, network settings, security policies, and third-party integrations, can influence the behavior of the system. Before deploying a similar solution in a production environment, we recommend thoroughly reviewing and adapting the example to align with your specific use case and requirements.

Keep in mind that AWS services and features may evolve over time, and new updates or changes may impact the behavior of the components described in this example. Regularly consult the AWS documentation and ensure that your configurations adhere to best practices and compliance standards.

This example is intended to provide a starting point and should be considered as a reference rather than an exhaustive solution. Always conduct thorough testing and validation in your specific environment to ensure the desired functionality and security.

Figure 1: Serverless payment architecture and workflow

Use case 1: Consumers make a payment
1. Consumers visit the e-commerce payment page to make a payment.
2. The request is routed to the payment application’s domain using Amazon Route 53, which acts as a DNS service.
3. The payment page is protected by AWS WAF to inspect the initial incoming request for any malicious patterns, web-based attacks (such as cross-site scripting (XSS) attacks), and unwanted bots.
4. An HTTPS GET request (over TLS) is sent to the public target IP. Amazon CloudFront, a content delivery network (CDN), acts as a front-end proxy and caches and fetches static content from an Amazon Simple Storage Service (Amazon S3) bucket.
5. AWS WAF inspects the incoming request for any malicious patterns, if the request is blocked, the request doesn’t return static content from the S3 bucket.
6. User authentication and authorization are handled by Amazon Cognito, providing a secure login and scalable customer identity and access management system (CIAM)
7. AWS WAF processes the request to protect against web exploits, then Amazon API Gateway forwards it to the payment application API endpoint.
8. API Gateway launches AWS Lambda functions to handle payment requests. AWS Step Functions state machine oversees the entire process, directing the running of multiple Lambda functions to communicate with the payment processor, initiate the payment transaction, and process the response.
9. The cardholder data (CHD) is temporarily cached in Amazon DynamoDB for troubleshooting and retry attempts in the event of transaction failures.
10. A Lambda function validates the transaction details and performs necessary checks against the data stored in DynamoDB. A web notification is sent to the consumer for any invalid data.
11. A Lambda function calculates the transaction fees.
12. A Lambda function authenticates the transaction and initiates the payment transaction with the third-party payment provider.
13. A Lambda function is initiated when a payment transaction with the third-party payment provider is completed. It receives the transaction status from the provider and performs multiple actions.
14. Consumers receive real-time notifications through a web browser and email. The notifications are initiated by a step function, such as order confirmations or payment receipts, and can be integrated with external payment processors through an Amazon Simple Notification Service (Amazon SNS) Amazon Simple Email Service (Amazon SES) web hook.
15. A separate Lambda function clears the DynamoDB cache.
16. The Lambda function makes entries into the Amazon Simple Queue Service (Amazon SQS) dead-letter queue for failed transactions to retry at a later time.
Use case 2: An admin or analyst generates the report for non-PCI data
1. An admin accesses the web-based reporting dashboard using their browser to generate a report.
2. The request is routed to AWS WAF to verify the source that initiated the request.
3. An HTTPS GET request (over TLS) is sent to the public target IP. CloudFront fetches static content from an S3 bucket.
4. AWS WAF inspects incoming requests for any malicious patterns, if the request is blocked, the request doesn’t return static content from the S3 bucket. The validated traffic is sent to Amazon S3 to retrieve the reporting page.
5. The backend requests of the reporting page pass through AWS WAF again to provide protection against common web exploits before being forwarded to the reporting API endpoint through API Gateway.
6. API Gateway launches a Lambda function for report generation. The Lambda function retrieves data from DynamoDB storage for the reporting mechanism.
7. The AWS Security Token Service (AWS STS) issues temporary credentials to the Lambda service in the non-PCI serverless account, allowing it to launch the Lambda function in the PCI serverless account. The Lambda function retrieves non-PCI data and writes it into DynamoDB.
8. The Lambda function fetches the non-PCI data based on the report criteria from the DynamoDB table from the same account.

Additional AWS security and governance services that would be implemented throughout the architecture are shown in Figure 1, Label-25. For example, Amazon CloudWatch monitors and alerts on all the Lambda functions within the environment.

Label-26 demonstrates frameworks that can be used to build the serverless applications.

Scoping and requirements

Now that we’ve established the reference architecture and workflow, lets delve into how it aligns with PCI DSS scope and requirements.

PCI scoping

Serverless services are inherently segmented by AWS, but they can be used within the context of an AWS account hierarchy to provide various levels of isolation as described in the reference architecture example.

Segregating PCI data and non-PCI data into separate AWS accounts can help in de-scoping non-PCI environments and reducing the complexity and audit requirements for components that don’t handle cardholder data.

PCI serverless production account

This AWS account is dedicated to handling PCI data and applications that directly process, transmit, or store cardholder data.
Services such as Amazon Cognito, DynamoDB, API Gateway, CloudFront, Amazon SNS, Amazon SES, Amazon SQS, and Step Functions are provisioned in this account to support the PCI data workflow.
Security controls, logging, monitoring, and access controls in this account are specifically designed to meet PCI DSS requirements.

Non-PCI serverless production account

This separate AWS account is used to host applications that don’t handle PCI data.
Since this account doesn’t handle cardholder data, the scope of PCI DSS compliance is reduced, simplifying the compliance process.

Note: You can use AWS Organizations to centrally manage multiple AWS accounts.

AWS IAM Identity Center (successor to AWS Single Sign-On) is used to manage user access to each account and is integrated with your existing identify provider. This helps to ensure you’re meeting PCI requirements on identity, access control of card holder data, and environment.

Now, let’s look at the PCI DSS requirements that this architectural pattern can help address.

Requirement 1: Install and maintain network security controls

Network security controls are limited to AWS Identity and Access Management (IAM) and application permissions because there is no customer controlled or defined network. VPC-centric requirements aren’t applicable because there is no VPC. The configuration settings for serverless services can be covered under Requirement 6 to for secure configuration standards. This supports compliance with Requirements 1.2 and 1.3.

Requirement 2: Apply secure configurations to all system components

AWS services are single function by default and exist with only the necessary functionality enabled for the functioning of that service. This supports compliance with much of Requirement 2.2.
Access to AWS services is considered non-console and only accessible through HTTPS through the service API. This supports compliance with Requirement 2.2.7.
The wireless requirements under Requirement 2.3 are not applicable, because wireless environments don’t exist in AWS environments.

Requirement 3: Protect stored account data

AWS is responsible for destruction of account data configured for deletion based on DynamoDB Time to Live (TTL) values. This supports compliance with Requirement 3.2.
DynamoDB and Amazon S3 offer secure storage of account data, encryption by default in transit and at rest, and integration with AWS Key Management Service (AWS KMS). This supports compliance with Requirements 3.5 and 4.2.
AWS is responsible for the generation, distribution, storage, rotation, destruction, and overall protection of encryption keys within AWS KMS. This supports compliance with Requirements 3.6 and 3.7.
Manual cleartext cryptographic keys aren’t available in this solution, Requirement 3.7.6 is not applicable.

Requirement 4: Protect cardholder data with strong cryptography during transmission over open, public networks

AWS Certificate Manager (ACM) integrates with API Gateway and enables the use of trusted certificates and HTTPS (TLS) for secure communication between clients and the API. This supports compliance with Requirement 4.2.
Requirement 4.2.1.2 is not applicable because there are no wireless technologies in use in this solution. Customers are responsible for ensuring strong cryptography exists for authentication and transmission over other wireless networks they manage outside of AWS.
Requirement 4.2.2 is not applicable because no end-user technologies exist in this solution. Customers are responsible for ensuring the use of strong cryptography if primary account numbers (PAN) are sent through end-user messaging technologies in other environments.

Requirement 5: Protect a ll systems and networks from malicious software

There are no customer-managed compute resources in this example payment environment, Requirements 5.2 and 5.3 are the responsibility of AWS.

Requirement 6: Develop and maintain secure systems and software

Amazon Inspector now supports Lambda functions, adding continual, automated vulnerability assessments for serverless compute. This supports compliance with Requirement 6.2.
Amazon Inspector helps identify vulnerabilities and security weaknesses in the payment application’s code, dependencies, and configuration. This supports compliance with Requirement 6.3.
AWS WAF is designed to protect applications from common attacks, such as SQL injections, cross-site scripting, and other web exploits. AWS WAF can filter and block malicious traffic before it reaches the application. This supports compliance with Requirement 6.4.2.

Requirement 7: Restrict access to system components and cardholder data by business need to know

IAM and Amazon Cognito allow for fine-grained role- and job-based permissions and access control. Customers can use these capabilities to configure access following the principles of least privilege and need-to-know. IAM and Cognito support the use of strong identification, authentication, authorization, and multi-factor authentication (MFA). This supports compliance with much of Requirement 7.

Requirement 8: Identify users and authenticate access to system components

IAM and Amazon Cognito also support compliance with much of Requirement 8.
Some of the controls in this requirement are usually met by the identity provider for internal access to the cardholder data environment (CDE).

Requirement 9: Restrict physical access to cardholder data

AWS is responsible for the destruction of data in DynamoDB based on the customer configuration of content TTL values for Requirement 9.4.7. Customers are responsible for ensuring their database instance is configured for appropriate removal of data by enabling TTL on DDB attributes.
Requirement 9 is otherwise not applicable for this serverless example environment because there are no physical media, electronic media not already addressed under Requirement 3.2, or hard-copy materials with cardholder data. AWS is responsible for the physical infrastructure under the Shared Responsibility Model.

Requirement 10: Log and monitor all access to system components and cardholder data

AWS CloudTrail provides detailed logs of API activity for auditing and monitoring purposes. This supports compliance with Requirement 10.2 and contains all of the events and data elements listed.
CloudWatch can be used for monitoring and alerting on system events and performance metrics. This supports compliance with Requirement 10.4.
AWS Security Hub provides a comprehensive view of security alerts and compliance status, consolidating findings from various security services, which helps in ongoing security monitoring and testing. Customers must enable PCI DSS security standard, which supports compliance with Requirement 10.4.2.
AWS is responsible for maintaining accurate system time for AWS services. In this example, there are no compute resources for which customers can configure time. Requirement 10.6 is addressable through the AWS Attestation of Compliance and Responsibility Summary available in AWS Artifact.

Requirement 11: Regularly test security systems and processes

Testing for rogue wireless activity within the AWS-based CDE is the responsibility of AWS. AWS is responsible for the management of the physical infrastructure under Requirement 11.2. Customers are still responsible for wireless testing for their environments outside of AWS, such as where administrative workstations exist.
AWS is responsible for internal vulnerability testing of AWS services, and supports compliance with Requirement 11.3.1.
Amazon GuardDuty, a threat detection service that continuously monitors for malicious activity and unauthorized access, providing continuous security monitoring. This supports the IDS requirements under Requirement 11.5.1, and covers the entire AWS-based CDE.
AWS Config allows customers to catalog, monitor and manage configuration changes for their AWS resources. This supports compliance with Requirement 11.5.2.
Customers can use AWS Config to monitor the configuration of the S3 bucket hosting the static website. This supports compliance with Requirement 11.6.1.

Requirement 12: Support information security with organizational policies and programs

Customers can download the AWS AOC and Responsibility Summary package from Artifact to support Requirement 12.8.5 and the identification of which PCI DSS requirements are managed by the third-party service provider (TSPS) and which by the customer.

Conclusion

Using AWS serverless services when developing your payment application can significantly help reduce the number of PCI DSS requirements you need to meet by yourself. By offloading infrastructure management to AWS and using serverless services such as Lambda, API Gateway, DynamoDB, Amazon S3, and others, you can benefit from built-in security features and help align with your PCI DSS compliance requirements.

Contact us to help design an architecture that works for your organization. AWS Security Assurance Services is a Payment Card Industry-Qualified Security Assessor company (PCI-QSAC) and HITRUST External Assessor firm. We are a team of industry-certified assessors who help you to achieve, maintain, and automate compliance in the cloud by tying together applicable audit standards to AWS service-specific features and functionality. We help you build on frameworks such as PCI DSS, HITRUST CSF, NIST, SOC 2, HIPAA, ISO 27001, GDPR, and CCPA.

More information on how to build applications using AWS serverless technologies can be found at Serverless on AWS.

Want more AWS Security news? Follow us on Twitter.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Serverless re:Post, Security, Identity, & Compliance re:Post or contact AWS Support.

Prepare your AWS workloads for the “Operational risks and resilience – banks” FINMA Circular

2023-10-31 Margo Cronin

Post Syndicated from Margo Cronin original https://aws.amazon.com/blogs/security/prepare-your-aws-workloads-for-the-operational-risks-and-resilience-banks-finma-circular/

In December 2022, FINMA, the Swiss Financial Market Supervisory Authority, announced a fully revised circular called Operational risks and resilience – banks that will take effect on January 1, 2024. The circular will replace the Swiss Bankers Association’s Recommendations for Business Continuity Management (BCM), which is currently recognized as a minimum standard. The new circular also adopts the revised principles for managing operational risks, and the new principles on operational resilience, that the Basel Committee on Banking Supervision published in March 2021.

In this blog post, we share key considerations for AWS customers and regulated financial institutions to help them prepare for, and align to, the new circular.

AWS previously announced the publication of the AWS User Guide to Financial Services Regulations and Guidelines in Switzerland. The guide refers to certain rules applicable to financial institutions in Switzerland, including banks, insurance companies, stock exchanges, securities dealers, portfolio managers, trustees, and other financial entities that FINMA oversees (directly or indirectly).

FINMA has previously issued the following circulars to help regulated financial institutions understand approaches to due diligence, third party management, and key technical and organizational controls to be implemented in cloud outsourcing arrangements, particularly for material workloads:

2018/03 FINMA Circular Outsourcing – banks and insurers (31.10.2019)
2008/21 FINMA Circular Operational Risks – Banks (31.10.2019) – Principal 4 Technology Infrastructure
2008/21 FINMA Circular Operational Risks – Banks (31.10.2019) – Appendix 3 Handling of electronic Client Identifying Data
2013/03 Auditing (04.11.2020) – Information Technology (21.04.2020)
BCM minimum standards proposed by the Swiss Insurance Association (01.06.2015) and Swiss Bankers Association (29.08.2013)

Operational risk management: Critical data

The circular defines critical data as follows:

“Critical data are data that, in view of the institution’s size, complexity, structure, risk profile and business model, are of such crucial significance that they require increased security measures. These are data that are crucial for the successful and sustainable provision of the institution’s services or for regulatory purposes. When assessing and determining the criticality of data, the confidentiality as well as the integrity and availability must be taken into account. Each of these three aspects can determine whether data is classified as critical.”

This definition is consistent with the AWS approach to privacy and security. We believe that for AWS to realize its full potential, customers must have control over their data. This includes the following commitments:

Control over the location of your data
Verifiable control over data access
Ability to encrypt everything everywhere
Resilience of AWS

These commitments further demonstrate our dedication to securing your data: it’s our highest priority. We implement rigorous contractual, technical, and organizational measures to help protect the confidentiality, integrity, and availability of your content regardless of which AWS Region you select. You have complete control over your content through powerful AWS services and tools that you can use to determine where to store your data, how to secure it, and who can access it.

You also have control over the location of your content on AWS. For example, in Europe, at the time of publication of this blog post, customers can deploy their data into any of eight Regions (for an up-to-date list of Regions, see AWS Global Infrastructure). One of these Regions is the Europe (Zurich) Region, also known by its API name ‘eu-central-2’, which customers can use to store data in Switzerland. Additionally, Swiss customers can rely on the terms of the AWS Swiss Addendum to the AWS Data Processing Addendum (DPA), which applies automatically when Swiss customers use AWS services to process personal data under the new Federal Act on Data Protection (nFADP).

AWS continually monitors the evolving privacy, regulatory, and legislative landscape to help identify changes and determine what tools our customers might need to meet their compliance requirements. Maintaining customer trust is an ongoing commitment. We strive to inform you of the privacy and security policies, practices, and technologies that we’ve put in place. Our commitments, as described in the Data Privacy FAQ, include the following:

Access – As a customer, you maintain full control of your content that you upload to the AWS services under your AWS account, and responsibility for configuring access to AWS services and resources. We provide an advanced set of access, encryption, and logging features to help you do this effectively (for example, AWS Identity and Access Management, AWS Organizations, and AWS CloudTrail). We provide APIs that you can use to configure access control permissions for the services that you develop or deploy in an AWS environment. We never use your content or derive information from it for marketing or advertising purposes.
Storage – You choose the AWS Regions in which your content is stored. You can replicate and back up your content in more than one Region. We will not move or replicate your content outside of your chosen AWS Regions except as agreed with you.
Security – You choose how your content is secured. We offer you industry-leading encryption features to protect your content in transit and at rest, and we provide you with the option to manage your own encryption keys. These data protection features include:
- Data encryption capabilities available in over 100 AWS services.
- Flexible key management options using AWS Key Management Service (AWS KMS), so you can choose whether to have AWS manage your encryption keys or to keep complete control over your keys.
Disclosure of customer content – We will not disclose customer content unless we’re required to do so to comply with the law or a binding order of a government body. If a governmental body sends AWS a demand for your customer content, we will attempt to redirect the governmental body to request that data directly from you. If compelled to disclose your customer content to a governmental body, we will give you reasonable notice of the demand to allow the customer to seek a protective order or other appropriate remedy, unless AWS is legally prohibited from doing so.
Security assurance – We have developed a security assurance program that uses current recommendations for global privacy and data protection to help you operate securely on AWS, and to make the best use of our security control environment. These security protections and control processes are independently validated by multiple third-party independent assessments, including the FINMA International Standard on Assurance Engagements (ISAE) 3000 Type II attestation report.

Additionally, FINMA guidelines lay out requirements for the written agreement between a Swiss financial institution and its service provider, including access and audit rights. For Swiss financial institutions that run regulated workloads on AWS, we offer the Swiss Financial Services Addendum to address the contractual and audit requirements of the FINMA guidelines. We also provide these institutions the ability to comply with the audit requirements in the FINMA guidelines through the AWS Security & Audit Series, including participation in an Audit Symposium, to facilitate customer audits. To help align with regulatory requirements and expectations, our FINMA addendum and audit program incorporate feedback that we’ve received from a variety of financial supervisory authorities across EU member states. To learn more about the Swiss Financial Services addendum or about the audit engagements offered by AWS, reach out to your AWS account team.

Resilience

Customers need control over their workloads and high availability to help prepare for events such as supply chain disruptions, network interruptions, and natural disasters. Each AWS Region is composed of multiple Availability Zones (AZs). An Availability Zone is one or more discrete data centers with redundant power, networking, and connectivity in an AWS Region. To better isolate issues and achieve high availability, you can partition applications across multiple AZs in the same Region. If you are running workloads on premises or in intermittently connected or remote use cases, you can use our services that provide specific capabilities for offline data and remote compute and storage. We will continue to enhance our range of sovereign and resilient options, to help you sustain operations through disruption or disconnection.

FINMA incorporates the principles of operational resilience in the newest circular 2023/01. In line with the efforts of the European Commission’s proposal for the Digital Operational Resilience Act (DORA), FINMA outlines requirements for regulated institutions to identify critical functions and their tolerance for disruption. Continuity of service, especially for critical economic functions, is a key prerequisite for financial stability. AWS recognizes that financial institutions need to comply with sector-specific regulatory obligations and requirements regarding operational resilience. AWS has published the whitepaper Amazon Web Services’ Approach to Operational Resilience in the Financial Sector and Beyond, in which we discuss how AWS and customers build for resiliency on the AWS Cloud. AWS provides resilient infrastructure and services, which financial institution customers can rely on as they design their applications to align with FINMA regulatory and compliance obligations.

AWS previously announced the third issuance of the FINMA ISAE 3000 Type II attestation report. Customers can access the entire report in AWS Artifact. To learn more about the list of certified services and Regions, see the FINMA ISAE 3000 Type 2 Report and AWS Services in Scope for FINMA.

AWS is committed to adding new services into our future FINMA program scope based on your architectural and regulatory needs. If you have questions about the FINMA report, or how your workloads on AWS align to the FINMA obligations, contact your AWS account team. We will also help support customers as they look for new ways to experiment, remain competitive, meet consumer expectations, and develop new products and services on AWS that align with the new regulatory framework.

To learn more about our compliance, security programs and common privacy and data protection considerations, see AWS Compliance Programs and the dedicated AWS Compliance Center for Switzerland. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Security, Identity, & Compliance re:Post or contact AWS Support.