How to issue use-case bound certificates with AWS Private CA

Post Syndicated from Chris Morris original https://aws.amazon.com/blogs/security/how-to-issue-use-case-bound-certificates-with-aws-private-ca/

In this post, we’ll show how you can use AWS Private Certificate Authority (AWS Private CA) to issue a wide range of X.509 certificates that are tailored for specific use cases. These use-case bound certificates have their intended purpose defined within the certificate components, such as the Key Usage and Extended Key Usage extensions. We will guide you through defining your usage by applying your required Key Usage and Extended Key Usage values with the IssueCertificate API operation.

Background

With the AWS Private CA service, you can build your own public key infrastructure (PKI) in the AWS Cloud and issue certificates to use within your organization. Certificates issued by AWS Private CA support both the Key Usage and Extended Key Usage extensions. By using these extensions with specific values, you can bind the usage of a given certificate to a particular use case during creation. Binding certificates to their intended use case, such as SSL/TLS server authentication or code signing, provides distinct security benefits such as accountability and least privilege.

When you define certificate usage with specific Key Usage and Extended Key Usage values, this helps your organization understand what purpose a given certificate serves and the use case for which it is bound. During audits, organizations can inspect their certificate’s Key Usage and Extended Key Usage values to determine the certificate’s purpose and scope. This not only provides accountability regarding a certificate’s usage, but also a level of transparency for auditors and stakeholders. Furthermore, by using these extensions with specific values, you will follow the principle of least privilege. You can grant least privilege by defining only the required Key Usage and Extended Key Usage values for your use case. For example, if a given certificate is going to be used only for email protection (S/MIME), you can assign only that extended key usage value to the certificate.

Certificate templates and use cases

In AWS Private CA, the Key Usage and Extended Key Usage extensions and values are specified by using a configuration template, which is passed with the IssueCertificate API operation. The base templates provided by AWS handle the most common certificate use cases, such as SSL/TLS server authentication or code signing. However, there are additional use cases for certificates that are not defined in base templates. To issue certificates for these use cases, you can pass blank certificate templates in your IssueCertificate requests, along with your required Key Usage and Extended Key Usage values.

Such use cases include, but are not limited to, the following:

  • Certificates for SSL/TLS
    • Issue certificates with an Extended Key Usage value of Server Authentication, Client Authentication, or both.
  • Certificates for email protection (S/MIME)
    • Issue certificates with an Extended Key Usage value of E-mail Protection
  • Certificates for smart card authentication (Microsoft Smart Card Login)
    • Issue certificates with an Extended Key Usage value of Smart Card Logon
  • Certificates for document signing
    • Issue certificates with an Extended Key Usage value of Document Signing
  • Certificates for code signing
    • Issue certificates with an Extended Key Usage value of Code Signing
  • Certificates that conform to the Matter connectivity standard

If your certificates require less-common extended key usage values not defined in the AWS documentation, you can also pass object identifiers (OIDs) to define values in Extended Key Usage. OIDs are dotted-string identifiers that are mapped to objects and attributes. OIDs can be defined and passed with custom extensions using API passthrough. You can also define OIDs in a CSR (certificate signing request) with a CSR passthrough template. Such uses include:

  • Certificates that require IPSec or virtual private network (VPN) related extensions
    • Issue certificates with Extended Key Usage values:
      • OID: 1.3.6.1.5.5.7.3.5 (IPSEC_END_SYSTEM)
      • OID: 1.3.6.1.5.5.7.3.6 (IPSEC_TUNNEL)
      • OID: 1.3.6.1.5.5.7.3.7 (IPSEC_USER)
  • Certificates that conform to the ISO/IEC standard for mobile driving license (mDL)
    • Pass the ISO/IEC 18013-5 OID reserved for mDL DS: 1.0.18013.5.1.2 by using custom extensions.
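The distinction above between named Extended Key Usage values and raw OIDs can be sketched in a few lines of Python. This is a minimal illustration, not the service's implementation: the helper name eku_entries and the two lookup tables are hypothetical, and the field names ExtendedKeyUsageType and ExtendedKeyUsageObjectIdentifier reflect our understanding of the ApiPassthrough structure and should be checked against the AWS Private CA API reference before use.

```python
# Named values assumed to be accepted as ExtendedKeyUsageType
# (an illustrative subset, not an exhaustive list).
KNOWN_EKU_TYPES = {
    "SERVER_AUTH",
    "CLIENT_AUTH",
    "CODE_SIGNING",
    "EMAIL_PROTECTION",
    "SMART_CARD_LOGIN",
    "DOCUMENT_SIGNING",
}

# IPsec-related OIDs exactly as listed in this post.
IPSEC_EKU_OIDS = {
    "IPSEC_END_SYSTEM": "1.3.6.1.5.5.7.3.5",
    "IPSEC_TUNNEL": "1.3.6.1.5.5.7.3.6",
    "IPSEC_USER": "1.3.6.1.5.5.7.3.7",
}


def is_dotted_oid(value):
    """Return True for strings such as '1.3.6.1.5.5.7.3.5'."""
    parts = value.split(".")
    return len(parts) >= 2 and all(part.isdigit() for part in parts)


def eku_entries(values):
    """Turn names or dotted OIDs into a list of Extended Key Usage entries."""
    entries = []
    for value in values:
        if value in KNOWN_EKU_TYPES:
            entries.append({"ExtendedKeyUsageType": value})
        elif value in IPSEC_EKU_OIDS:
            entries.append(
                {"ExtendedKeyUsageObjectIdentifier": IPSEC_EKU_OIDS[value]}
            )
        elif is_dotted_oid(value):
            entries.append({"ExtendedKeyUsageObjectIdentifier": value})
        else:
            raise ValueError(f"Unrecognized Extended Key Usage value: {value!r}")
    return entries


if __name__ == "__main__":
    print(eku_entries(["CLIENT_AUTH", "IPSEC_TUNNEL"]))
```

The resulting list is meant to sit under Extensions.ExtendedKeyUsage in an API passthrough structure. The mDL OID is deliberately left out, because this post passes it through custom extensions rather than Extended Key Usage.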

It’s important to note that blank certificate templates aren’t limited to just end-entity certificates. For example, the BlankSubordinateCACertificate_PathLen0_APICSRPassthrough template sets the Basic constraints parameter to CA:TRUE, allowing you to issue a subordinate CA certificate with your own Key Usage and Extended Key Usage values.

Using blank certificate templates

When you browse through the AWS Private CA certificate templates, you may see that base templates don’t allow you to define your own Key Usage or Extended Key Usage extensions and values. They are preset to the extensions and values used for the most common certificate types in order to simplify issuing those types of certificates. For example, when using EndEntityCertificate/V1, you will always get a Key Usage value of Critical, digital signature, key encipherment and an Extended Key Usage value of TLS web server authentication, TLS web client authentication. The following table shows all of the values for this base template.

EndEntityCertificate/V1
X509v3 parameter Value
Subject alternative name [Passthrough from certificate signing request (CSR)]
Subject [Passthrough from CSR]
Basic constraints CA:FALSE
Authority key identifier [Subject key identifier from CA certificate]
Subject key identifier [Derived from CSR]
Key usage Critical, digital signature, key encipherment
Extended key usage TLS web server authentication, TLS web client authentication
CRL distribution points [Passthrough from CA configuration]

When you look at blank certificate templates, you will see that there is more flexibility. For one example of a blank certificate template, BlankEndEntityCertificate_APICSRPassthrough/V1, you can see that there are fewer predefined values compared to EndEntityCertificate/V1. You can pass your own values for Extended Key Usage and Key Usage.

BlankEndEntityCertificate_APICSRPassthrough/V1
X509v3 parameter Value
Subject alternative name [Passthrough from API or CSR]
Subject [Passthrough from API or CSR]
Basic constraints CA:FALSE
Authority key identifier [Subject key identifier from CA certificate]
Subject key identifier [Derived from CSR]
CRL distribution points [Passthrough from CA configuration or CSR]

Note: CRL distribution points are included in the template only if the CA is configured with CRL generation enabled.

To specify your desired extensions and values, you must pass them in the IssueCertificate API call. There are two ways of doing so: the API Passthrough and CSR Passthrough templates.

  • API Passthrough – Extensions and their values defined in the IssueCertificate parameter APIPassthrough are copied over to the issued certificate.
  • CSR Passthrough – Extensions and their values defined in the CSR are copied over to the issued certificate.

To accommodate the different ways of passing these values, there are three varieties of blank certificate templates. If you would like to pass extensions defined only in your CSR file to the issued certificate, you can use the BlankEndEntityCertificate_CSRPassthrough/V1 template. Similarly, if you would like to pass extensions defined only in the APIPassthrough parameter, you can use the BlankEndEntityCertificate_APIPassthrough/V1 template. Finally, if you would like to use a combination of extensions defined in both the CSR and APIPassthrough, you can use the BlankEndEntityCertificate_APICSRPassthrough/V1 template. It’s important to remember these points when choosing your template:

  • The template definition will always have higher priority than the values specified in the CSR, regardless of which template variety you use. For example, if the template contains a Key Usage value of digital signature and your CSR file contains key encipherment, the issued certificate will use the template definition, digital signature.
  • API passthrough values are only respected when you use an API passthrough or APICSR passthrough template. CSR passthrough is only respected when you use a CSR passthrough or APICSR passthrough template. When these sources of information are in conflict (the CSR contains the same extension or value as what’s passed in API passthrough), a general rule usually applies: For each extension value, the template definition has highest priority, followed by API passthrough values, followed by CSR passthrough extensions. Read more about the template order of operations in the AWS documentation.
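The priority order described above can be modeled as a small lookup. This is a simplified sketch of the general rule only: the function resolve_extension and the variety labels are hypothetical names chosen for illustration, and the real service has exceptions that are covered in the template order of operations documentation.

```python
# Which passthrough sources each blank template variety honors,
# following the description above. The keys are labels for this sketch.
HONORED_SOURCES = {
    "APIPassthrough": {"api"},
    "CSRPassthrough": {"csr"},
    "APICSRPassthrough": {"api", "csr"},
}


def resolve_extension(variety, template_value=None, api_value=None, csr_value=None):
    """Pick the value for one extension: template, then API, then CSR."""
    if template_value is not None:
        # The template definition always wins.
        return template_value
    sources = HONORED_SOURCES[variety]
    if "api" in sources and api_value is not None:
        return api_value
    if "csr" in sources and csr_value is not None:
        return csr_value
    # Nothing defines this extension for the chosen template variety.
    return None


if __name__ == "__main__":
    print(resolve_extension("APICSRPassthrough", api_value="api", csr_value="csr"))
```

Under this model, a value supplied through a source that the chosen template variety does not honor is ignored, which is why selecting the right variety matters as much as supplying the values.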

How to issue use-case bound certificates in the AWS CLI

To get started issuing certificates, you must have appropriate AWS Identity and Access Management (IAM) permissions as well as an AWS Private CA in an “Active” status. You can verify if your private CA is active by running the aws acm-pca list-certificate-authorities command from the AWS Command Line Interface (CLI). You should see the following:

"Status": "ACTIVE"

After verifying the status, make note of your private CA Amazon Resource Name (ARN).
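If you have several private CAs, the status check can be scripted. The following sketch filters saved JSON output from list-certificate-authorities for CAs in the ACTIVE status. The sample document, account ID, and CA identifiers are made up for illustration, and the field names CertificateAuthorities, Arn, and Status are assumed to match the command's JSON output.

```python
import json

# Hypothetical, trimmed output of `aws acm-pca list-certificate-authorities`.
SAMPLE_OUTPUT = """
{
  "CertificateAuthorities": [
    {
      "Arn": "arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/11111111-1111-1111-1111-111111111111",
      "Status": "ACTIVE"
    },
    {
      "Arn": "arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/22222222-2222-2222-2222-222222222222",
      "Status": "DISABLED"
    }
  ]
}
"""


def active_ca_arns(cli_output):
    """Return the ARNs of all private CAs whose Status is ACTIVE."""
    data = json.loads(cli_output)
    return [
        ca["Arn"]
        for ca in data.get("CertificateAuthorities", [])
        if ca.get("Status") == "ACTIVE"
    ]


if __name__ == "__main__":
    for arn in active_ca_arns(SAMPLE_OUTPUT):
        print(arn)
```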

To issue use-case bound certificates, you must use the Private CA API operation IssueCertificate.

In the AWS CLI, you can call this API by using the command issue-certificate. There are several parameters you must pass with this command:

  • (--certificate-authority-arn) – The ARN of your private CA.
  • (--csr) – The CSR in PEM format. It must be passed as a blob, for example by using the fileb:// prefix.
  • (--validity) – Sets the “Not After” date (expiration date) for the certificate.
  • (--signing-algorithm) – The signing algorithm to be used to sign the certificate. The value you choose must match the algorithm family of the private CA’s algorithm (RSA or ECDSA). For example, if the private CA uses RSA_2048, the signing algorithm must be an RSA variant, like SHA256WITHRSA.

    You can check your private CA’s algorithm family by referring to its key algorithm. The command aws acm-pca describe-certificate-authority will show the corresponding KeyAlgorithm value.

  • (--template-arn) – This is where the blank certificate template is defined. The template should be an AWS Private CA template ARN. The full list of AWS Private CA template ARNs is shown in the AWS documentation.
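As a rough illustration of how these parameters fit together, the sketch below assembles them into a request dictionary and rejects a signing algorithm whose family does not match the private CA's key algorithm. The function names are hypothetical. The prefix and suffix checks are assumptions based on the example values in this post (RSA_2048, SHA256WITHRSA), and the dictionary keys are assumed to mirror the IssueCertificate request fields.

```python
def key_family(key_algorithm):
    """Map a KeyAlgorithm value such as RSA_2048 to its family."""
    if key_algorithm.startswith("RSA_"):
        return "RSA"
    if key_algorithm.startswith("EC_"):
        return "ECDSA"
    raise ValueError(f"Unrecognized key algorithm: {key_algorithm}")


def signing_family(signing_algorithm):
    """Map a signing algorithm such as SHA256WITHRSA to its family."""
    if signing_algorithm.endswith("WITHRSA"):
        return "RSA"
    if signing_algorithm.endswith("WITHECDSA"):
        return "ECDSA"
    raise ValueError(f"Unrecognized signing algorithm: {signing_algorithm}")


def build_issue_request(ca_arn, csr_pem, key_algorithm, signing_algorithm,
                        template_name, validity_days):
    """Assemble IssueCertificate-style parameters after a family check."""
    if key_family(key_algorithm) != signing_family(signing_algorithm):
        raise ValueError(
            f"{signing_algorithm} does not match the CA key algorithm {key_algorithm}"
        )
    return {
        "CertificateAuthorityArn": ca_arn,
        "Csr": csr_pem,  # PEM-encoded CSR as bytes
        "SigningAlgorithm": signing_algorithm,
        "TemplateArn": f"arn:aws:acm-pca:::template/{template_name}/V1",
        "Validity": {"Value": validity_days, "Type": "DAYS"},
    }


if __name__ == "__main__":
    request = build_issue_request(
        "arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/11111111-1111-1111-1111-111111111111",
        b"-----BEGIN CERTIFICATE REQUEST-----\n...",  # placeholder, not a real CSR
        "RSA_2048",
        "SHA256WITHRSA",
        "BlankEndEntityCertificate_APIPassthrough",
        365,
    )
    print(request["TemplateArn"])
```

Validating the algorithm family before calling the service surfaces a mismatch locally instead of as an API error.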

We’ll now demonstrate how to issue use-case bound end-entity certificates by using blank end-entity certificate templates. We will issue two different certificates. One will be bound for email protection, and one will be bound for smart card authentication. Email protection and smart card authentication certificates have specific Extended Key Usage values which are not defined by any base template. We’ll use CSR passthrough to issue the smart card authentication certificate and use API passthrough to issue the email protection certificate.

The certificate templates that we will use are:

  • For CSR passthrough: BlankEndEntityCertificate_CSRPassthrough/V1
  • For API Passthrough: BlankEndEntityCertificate_APIPassthrough/V1

Important notes about this demo:

  • These commands are for demo purposes only. Depending on your specific use case, email protection certificates and smart card authentication certificates may require different extensions than what’s shown in this demo.
  • You will be generating RSA 2048 private keys. Private keys need to be protected and stored properly and securely. For example, encrypting private keys or storing private keys in a hardware security module (HSM) are some methods of protection that you can use.
  • We will be using the OpenSSL command line tool, which is installed by default on many operating systems such as Amazon Linux 2023. If you don’t have this tool installed, you can obtain it by using the software distribution facilities of your organization or your operating system, as appropriate.

Use API passthrough

We will now demonstrate how to issue a certificate that is bound for email protection. We’ll specify Key Usage and Extended Key Usage values, and also a subject alternative name through API passthrough. The goal is to have these extensions and values in the email protection certificate.

Extensions:

	X509v3 Key Usage: critical
	Digital Signature, Key Encipherment
	X509v3 Extended Key Usage:
	E-mail Protection
	X509v3 Subject Alternative Name:
	email:<your_email_address>

To issue a certificate bound for email protection

  1. Use the following command to create your keypair and CSR with OpenSSL. Define your distinguished name in the OpenSSL prompt.
    openssl req -out csr-demo-1.csr -new -newkey rsa:2048 -nodes -keyout private-key-demo-1.pem

  2. Use the following command to issue an end-entity certificate specifying the EMAIL_PROTECTION extended key usage value, the Digital Signature and Key Encipherment Key Usage values, and the subject alternative name <your_email_address>. We will use the Rfc822Name subject alternative name type, because the value is an email address.

    Make sure to replace the data in arn:aws:acm-pca:<region>:<accountID>:certificate-authority/11111111-1111-1111-1111-111111111111 with your private CA ARN, replace <your_email_address> with the email address that the certificate is bound to, and adjust the signing algorithm according to your private CA’s algorithm. This example assumes an RSA private CA and therefore uses SHA256WITHRSA.

    aws acm-pca issue-certificate --certificate-authority-arn arn:aws:acm-pca:<region>:<accountID>:certificate-authority/11111111-1111-1111-1111-111111111111 --csr fileb://csr-demo-1.csr --template-arn arn:aws:acm-pca:::template/BlankEndEntityCertificate_APIPassthrough/V1 --signing-algorithm "SHA256WITHRSA" --validity Value=365,Type="DAYS" --api-passthrough "Extensions={ExtendedKeyUsage=[{ExtendedKeyUsageType="EMAIL_PROTECTION"}],KeyUsage={"DigitalSignature"=true,"KeyEncipherment"=true},SubjectAlternativeNames=[{Rfc822Name="<your_email_address>"}]}"

     If the command is successful, then the ARN of the issued certificate is shown as the result:

    {
        "CertificateArn": "arn:aws:acm-pca:us-east-1:<accountID>:certificate-authority/11111111-1111-1111-1111-111111111111/certificate/123465789123456789"
    }

  3. Proceed to the Retrieve the Certificate section of this post to retrieve the certificate and certificate chain PEM from the CertificateArn.
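The shorthand value passed to --api-passthrough in step 2 can be hard to read and quote correctly. As an alternative sketch, the same structure can be built as a dictionary and serialized to JSON, which the AWS CLI generally accepts for structure parameters. The helper name email_protection_passthrough and the address user@example.com are hypothetical.

```python
import json


def email_protection_passthrough(email):
    """Build the API passthrough structure used in the email protection demo."""
    return {
        "Extensions": {
            "ExtendedKeyUsage": [{"ExtendedKeyUsageType": "EMAIL_PROTECTION"}],
            "KeyUsage": {"DigitalSignature": True, "KeyEncipherment": True},
            "SubjectAlternativeNames": [{"Rfc822Name": email}],
        }
    }


if __name__ == "__main__":
    # user@example.com is a placeholder address for illustration only.
    print(json.dumps(email_protection_passthrough("user@example.com"), indent=2))
```

Saving the printed JSON to a file and referencing it with the file:// prefix is one way to avoid shell quoting problems in the command line.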

Use CSR passthrough

We’ll now demonstrate how to issue a certificate that is bound for smart card authentication. We will specify Key Usage, Extended Key Usage, and subject alternative name extensions and values through CSR passthrough. The goal is to have these values in the smart card authentication certificate.

Extensions:

	X509v3 Key Usage: critical
	Digital Signature
	X509v3 Extended Key Usage:
	TLS Web Client Authentication, Microsoft Smartcard Login
	X509v3 Subject Alternative Name:
	othername: UPN::<your_user_here>

We’ll generate our CSR by requesting these specific extensions and values with OpenSSL. When we call IssueCertificate, the CSR passthrough template will acknowledge the requested extensions and copy them over to the issued certificate.

To issue a certificate bound for smart card authentication

  1. Use the following command to create the private key.
    openssl genpkey -algorithm RSA -pkeyopt rsa_keygen_bits:2048 -out private-key-demo-2.pem

  2. Create a file called openssl_csr.conf to define the distinguished name and the requested CSR extensions.

    Following is an example of OpenSSL configuration file content. You can copy this configuration to the openssl_csr.conf file and adjust the values to your requirements. You can find further reference on the configuration in the OpenSSL documentation.

    [ req ]
    default_bits = 2048
    prompt = no
    default_md = sha256
    req_extensions = my_req_ext
    distinguished_name = dn
    
    #Specify the Distinguished Name
    [ dn ]
    countryName                     = US
    stateOrProvinceName             = VA 
    localityName                    = Test City
    organizationName                = Test Organization Inc
    organizationalUnitName          = Test Organization Unit
    commonName                      = john_doe
    
    
    #Specify the Extensions
    [ my_req_ext ]
    keyUsage = critical, digitalSignature
    extendedKeyUsage = clientAuth, msSmartcardLogin 
    
    #UPN OtherName OID: "1.3.6.1.4.1.311.20.2.3". Value is ASN1-encoded UTF8 string
    subjectAltName = otherName:msUPN;UTF8:<your_user_here>

    In this example, you can specify your Key Usage and Extended Key Usage values in the [ my_req_ext ] section of the configuration. In the extendedKeyUsage line, you may also define extended key usage OIDs, like 1.3.6.1.4.1.311.20.2.2. Possible values are defined in the OpenSSL documentation.

  3. Create the CSR, defining the configuration file.
    openssl req -new -key private-key-demo-2.pem -out csr-demo-2.csr -config openssl_csr.conf

  4. (Optional) You can use the following command to decode the CSR to make sure it contains the information you require.
    openssl req -in csr-demo-2.csr -noout  -text

    The output should show the requested extensions and their values, as follows.

    	X509v3 Key Usage: critical
    	Digital Signature
    	X509v3 Extended Key Usage:
    	TLS Web Client Authentication, Microsoft Smartcard Login
    	X509v3 Subject Alternative Name:
    	othername: UPN:: <your_user_here>

  5. Issue the certificate by using the issue-certificate command. We will use a CSR passthrough template so that the requested extensions and values in the CSR file are copied over to the issued certificate.

    Make sure to replace the data in arn:aws:acm-pca:us-east-1:<accountID>:certificate-authority/11111111-1111-1111-1111-111111111111 with your private CA ARN and adjust the signing algorithm and validity for your use case. This example assumes an RSA private CA and therefore uses SHA256WITHRSA.

    aws acm-pca issue-certificate --certificate-authority-arn arn:aws:acm-pca:us-east-1:<accountID>:certificate-authority/11111111-1111-1111-1111-111111111111 --csr fileb://csr-demo-2.csr --template-arn arn:aws:acm-pca:::template/BlankEndEntityCertificate_CSRPassthrough/V1 --signing-algorithm "SHA256WITHRSA" --validity Value=365,Type="DAYS"

    If the command is successful, then the ARN of the issued certificate is shown as the result:

    {
        "CertificateArn": "arn:aws:acm-pca:us-east-1:<accountID>:certificate-authority/11111111-1111-1111-1111-111111111111/certificate/123465789123456789"
    }
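If you issue many certificates of this kind, the extensions section of the OpenSSL configuration file from step 2 can be generated rather than edited by hand. The sketch below renders only that section. The function name render_req_extensions and the user principal name user@example.com are hypothetical, and the output follows the layout of the sample configuration in this post.

```python
def render_req_extensions(key_usages, extended_key_usages, upn,
                          section="my_req_ext"):
    """Render the request-extensions section of an OpenSSL config file."""
    lines = [
        f"[ {section} ]",
        "keyUsage = critical, " + ", ".join(key_usages),
        "extendedKeyUsage = " + ", ".join(extended_key_usages),
        # UPN OtherName, encoded as a UTF8 string as in the sample config.
        f"subjectAltName = otherName:msUPN;UTF8:{upn}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # user@example.com is a placeholder UPN for illustration only.
    print(render_req_extensions(
        ["digitalSignature"],
        ["clientAuth", "msSmartcardLogin"],
        "user@example.com",
    ))
```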

Retrieve the certificate

After using issue-certificate with API passthrough or CSR passthrough, you can retrieve the certificate material in PEM format. Use the get-certificate command and specify the ARN of the private CA that issued the certificate, as well as the ARN of the certificate that was issued:

aws acm-pca get-certificate --certificate-arn arn:aws:acm-pca:us-east-1:<accountID>:certificate-authority/11111111-1111-1111-1111-111111111111/certificate/123465789123456789 --certificate-authority-arn arn:aws:acm-pca:us-east-1:<accountID>:certificate-authority/11111111-1111-1111-1111-111111111111 --output text
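In the ARNs shown in this post, the certificate ARN starts with the ARN of the issuing private CA, followed by /certificate/ and the certificate identifier. Assuming that shape holds for your certificates, a small helper can derive both get-certificate arguments from the CertificateArn alone. The function name and the account ID in the sample are hypothetical.

```python
def split_certificate_arn(certificate_arn):
    """Split a certificate ARN into (CA ARN, certificate ID)."""
    ca_arn, separator, certificate_id = certificate_arn.partition("/certificate/")
    if not separator or not certificate_id:
        raise ValueError(f"Not a certificate ARN: {certificate_arn}")
    if ":certificate-authority/" not in ca_arn:
        raise ValueError(f"No private CA ARN found in: {certificate_arn}")
    return ca_arn, certificate_id


if __name__ == "__main__":
    sample = (
        "arn:aws:acm-pca:us-east-1:111122223333:certificate-authority/"
        "11111111-1111-1111-1111-111111111111/certificate/123465789123456789"
    )
    ca_arn, certificate_id = split_certificate_arn(sample)
    print(ca_arn)
    print(certificate_id)
```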

You can use the --query option with the AWS CLI to get the certificate and certificate chain in separate files.

Certificate

aws acm-pca get-certificate --certificate-authority-arn  arn:aws:acm-pca:us-east-1:<accountID>:certificate-authority/11111111-1111-1111-1111-111111111111 --certificate-arn arn:aws:acm-pca:us-east-1:<accountID>:certificate-authority/11111111-1111-1111-1111-111111111111/certificate/123465789123456789 --output text --query Certificate > certfile.pem

Certificate chain

aws acm-pca get-certificate --certificate-authority-arn  arn:aws:acm-pca:us-east-1:<accountID>:certificate-authority/11111111-1111-1111-1111-111111111111 --certificate-arn arn:aws:acm-pca:us-east-1:<accountID>:certificate-authority/11111111-1111-1111-1111-111111111111/certificate/123465789123456789 --output text --query CertificateChain > certchain.pem

After you retrieve the certificate, you can decode it with the openssl x509 command. This will allow you to view the details of the certificate, including the extensions and values that you defined.

openssl x509 -in certfile.pem -noout -text
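To check the decoded output against the extensions you intended, you can parse the text that openssl x509 prints. The following sketch reads the X509v3 headers and the value line that follows each one. The sample text is modeled on the extension listing earlier in this post, the address in it is a placeholder, and real OpenSSL output varies between versions, so treat the parser as a starting point rather than a robust tool.

```python
SAMPLE_TEXT = """
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                E-mail Protection
            X509v3 Subject Alternative Name:
                email:user@example.com
"""


def parse_extensions(text):
    """Map extension names to their criticality flag and list of values."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    extensions = {}
    # Pair every line with the line that follows it.
    for header, values in zip(lines, lines[1:]):
        is_header = header.startswith("X509v3 ") and ":" in header
        if is_header and not values.startswith("X509v3 "):
            name, _, flag = header.partition(":")
            extensions[name[len("X509v3 "):]] = {
                "critical": flag.strip() == "critical",
                "values": [value.strip() for value in values.split(",")],
            }
    return extensions


if __name__ == "__main__":
    for name, details in parse_extensions(SAMPLE_TEXT).items():
        print(name, details)
```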

Conclusion

In AWS Private CA, you can implement the security benefits of accountability and least privilege by defining the usage of your certificates. The Key Usage and Extended Key Usage values define the usage of your certificates. Many certificate use cases require a combination of Key Usage and Extended Key Usage values, which cannot be defined with base certificate templates. Some examples include document signing, smart card authentication, and mobile driving license (mDL) certificates. To issue certificates for these specific use cases, you can use blank certificate templates with the IssueCertificate API call. In addition to the blank certificate template, you must also define the specific combination of Key Usage and Extended Key Usage values through CSR passthrough, API passthrough, or both.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Chris Morris

Chris is a Cloud Support Engineer at AWS. He specializes in a variety of security topics, including cryptography and data protection. He focuses on helping AWS customers effectively use AWS security services to strengthen their security posture in the cloud. Public key infrastructure and key management are some of his favorite security topics.

Vishal Jakharia

Vishal is a Cloud Support Engineer based in New Jersey, USA. He has expertise in security services and loves to work with customers to troubleshoot complex issues. He helps customers migrate and build secure, scalable architectures on the AWS Cloud.

Establishing a data perimeter on AWS: Analyze your account activity to evaluate impact and refine controls

Post Syndicated from Achraf Moussadek-Kabdani original https://aws.amazon.com/blogs/security/establishing-a-data-perimeter-on-aws-analyze-your-account-activity-to-evaluate-impact-and-refine-controls/

A data perimeter on Amazon Web Services (AWS) is a set of preventive controls you can use to help establish a boundary around your data in AWS Organizations. This boundary helps ensure that your data can be accessed only by trusted identities from within networks you expect and that the data cannot be transferred outside of your organization to untrusted resources. Review the previous posts in the Establishing a data perimeter on AWS series for information about the security objectives and foundational elements needed to define and enforce each perimeter type.

In this blog post, we discuss how to use AWS logging and analytics capabilities to help accelerate the implementation and effectively operate data perimeter controls at scale.

You start your data perimeter journey by identifying access patterns that you want to prevent and defining what trusted identities, trusted resources, and expected networks mean to your organization. After you define your trust boundaries based on your business and security requirements, you can use policy examples provided in the data perimeter GitHub repository to design the authorization controls for enforcing these boundaries. Before you enforce the controls in your organization, we recommend that you assess the potential impacts on your existing workloads. Performing the assessment helps you to identify unknown data access patterns missed during the initial design phase, investigate, and refine your controls to help ensure business continuity as you deploy them.

Finally, you should continuously monitor your controls to verify that they operate as expected and consistently align with your requirements as your business grows and relationships with your trusted partners change.

Figure 1 illustrates common phases of a data perimeter journey.

Figure 1: Data perimeter implementation journey

The usual phases of the data perimeter journey are:

  1. Examine your security objectives
  2. Set your boundaries
  3. Design data perimeter controls
  4. Anticipate potential impacts
  5. Implement data perimeter controls
  6. Monitor data perimeter controls
  7. Continuous improvement

In this post, we focus on phase 4: Anticipate potential impacts. We demonstrate how to analyze activity observed in your AWS environment to evaluate impact and refine your data perimeter controls. We also demonstrate how you can automate the analysis by using the data perimeter helper open source tool.

You can use the same techniques to support phase 6: Monitor data perimeter controls, where you will continuously monitor data access patterns in your organization and potentially troubleshoot permissions issues caused by overly restrictive or overly permissive policies as new data access paths are introduced.

Setting prerequisites for impact analysis

In this section, we describe AWS logging and analytics capabilities that you can use to analyze impact of data perimeter controls on your environment. We also provide instructions for configuring them.

While you might have some capabilities covered by other AWS tools (for example, Amazon Security Lake) or external tools, the proposed approach remains applicable. For instance, if your logs are stored in an external security data lake or your configuration state recording is performed by an external cloud security posture management (CSPM) tool, you can extract and adapt the logic from this post to suit your context. The flexibility of this approach allows you to use the existing tools and processes in your environment while benefiting from the insights provided.

Pricing

Some of the required capabilities can generate additional costs in your environment.

AWS CloudTrail charges based on the number of events delivered to Amazon Simple Storage Service (Amazon S3). Note that the first copy of management events is free. To help control costs, you can use advanced event selectors to select only events that matter to your context. For more details, see CloudTrail pricing.

AWS Config charges based on the number of configuration items delivered. The AWS Config aggregator and advanced queries are included in AWS Config pricing. To help control costs, you can select which resource types AWS Config records or change the recording frequency. For more details, see AWS Config pricing.

Amazon Athena charges based on the number of terabytes of data scanned in Amazon S3. To help control costs, use the proposed optimized tables with partitioning and reduce the time frame of your queries. For more details, see Athena pricing.

AWS Identity and Access Management Access Analyzer doesn’t charge additional costs for external access findings. For more details, see IAM Access Analyzer pricing.

Create a CloudTrail trail to record access activity

The primary capability that you will use is a CloudTrail trail. CloudTrail records AWS API calls and events from your AWS accounts that contain the following information relevant to data perimeter objectives:

  • API calls performed by your identities or on your resources (record fields: eventSource, eventName)
  • Identity that performed API calls (record field: userIdentity)
  • Network origin of API calls (record fields: sourceIPAddress, vpcEndpointId)
  • Resources API calls are performed on (record fields: resources, requestParameters)

See the CloudTrail record contents page for the description of all available fields.
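To make the mapping between record fields and data perimeter questions concrete, the sketch below pulls the fields listed above out of one CloudTrail record. The sample record is invented for illustration (account ID, role name, IP address, and bucket name are placeholders), and perimeter_view is a hypothetical helper, not part of any AWS tool.

```python
import json

# Hypothetical, heavily trimmed CloudTrail record.
SAMPLE_RECORD = json.loads("""
{
  "eventSource": "s3.amazonaws.com",
  "eventName": "GetObject",
  "userIdentity": {
    "type": "AssumedRole",
    "arn": "arn:aws:sts::111122223333:assumed-role/ExampleRole/session",
    "accountId": "111122223333"
  },
  "sourceIPAddress": "203.0.113.10",
  "vpcEndpointId": null,
  "resources": [
    {"ARN": "arn:aws:s3:::amzn-s3-demo-bucket/key", "accountId": "111122223333", "type": "AWS::S3::Object"}
  ],
  "requestParameters": {"bucketName": "amzn-s3-demo-bucket", "key": "key"}
}
""")


def perimeter_view(record):
    """Summarize who called what, from where, and on whose resources."""
    identity = record.get("userIdentity") or {}
    resources = record.get("resources") or []
    return {
        "api_call": f'{record.get("eventSource")}:{record.get("eventName")}',
        "principal_arn": identity.get("arn"),
        "principal_account": identity.get("accountId"),
        "source_ip": record.get("sourceIPAddress"),
        "vpc_endpoint_id": record.get("vpcEndpointId"),
        "resource_accounts": sorted(
            {r["accountId"] for r in resources if r.get("accountId")}
        ),
    }


if __name__ == "__main__":
    print(json.dumps(perimeter_view(SAMPLE_RECORD), indent=2))
```

Comparing principal_account with resource_accounts, and checking whether vpc_endpoint_id is set, is the kind of question the later analysis queries ask at scale.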

Data perimeter controls are meant to be applied across a broad set of accounts and resources; therefore, we recommend using a CloudTrail organization trail that collects logs across the AWS accounts in your organization. If you don’t have an organization trail configured, follow these steps or use one of the data perimeter helper templates for deploying prerequisites. If you use AWS services that support CloudTrail data events and want to analyze the associated API calls, enable the relevant data events.

Though CloudTrail provides you information about parameters of an API request, it doesn’t reflect values of AWS Identity and Access Management (IAM) condition keys present in the request. Thus, you still need to analyze the logs to help refine your data perimeter controls.

Create an Athena table to analyze CloudTrail logs

To ease and accelerate logs analysis, use Athena to query and extract relevant data from the log files stored by CloudTrail in an S3 bucket.

To create an Athena table

  1. Open the Athena console. If this is your first time visiting the Athena console in your current AWS Region, choose Edit settings to set up a query result location in Amazon S3.
  2. Next, navigate to the Query editor and create a SQL table by entering the following DDL statement into the Athena console query editor. Replace s3://<BUCKET_NAME_WITH_PREFIX>/AWSLogs/<ORGANIZATION_ID>/ with the S3 bucket location that contains your CloudTrail log data, and replace <REGIONS> with the list of AWS Regions where you want to analyze API calls. For example, to analyze API calls made in the AWS Regions Paris (eu-west-3) and N. Virginia (us-east-1), use eu-west-3,us-east-1. We recommend that you include us-east-1 to retrieve API calls performed on global resources such as IAM roles.
    CREATE EXTERNAL TABLE IF NOT EXISTS cloudtrail_logs (
        eventVersion STRING,
        userIdentity STRUCT<
            type: STRING,
            principalId: STRING,
            arn: STRING,
            accountId: STRING,
            invokedBy: STRING,
            accessKeyId: STRING,
            userName: STRING,
            sessionContext: STRUCT<
                attributes: STRUCT<
                    mfaAuthenticated: STRING,
                    creationDate: STRING>,
                sessionIssuer: STRUCT<
                    type: STRING,
                    principalId: STRING,
                    arn: STRING,
                    accountId: STRING,
                    userName: STRING>,
                ec2RoleDelivery: STRING,
                webIdFederationData: MAP<STRING,STRING>>>,
        eventTime STRING,
        eventSource STRING,
        eventName STRING,
        awsRegion STRING,
        sourceIpAddress STRING,
        userAgent STRING,
        errorCode STRING,
        errorMessage STRING,
        requestParameters STRING,
        responseElements STRING,
        additionalEventData STRING,
        requestId STRING,
        eventId STRING,
        readOnly STRING,
        resources ARRAY<STRUCT<
            arn: STRING,
            accountId: STRING,
            type: STRING>>,
        eventType STRING,
        apiVersion STRING,
        recipientAccountId STRING,
        serviceEventDetails STRING,
        sharedEventID STRING,
        vpcEndpointId STRING,
        tlsDetails STRUCT<
            tlsVersion:string,
            cipherSuite:string,
            clientProvidedHostHeader:string
        >
    )
    PARTITIONED BY (
    `p_account` string,
    `p_region` string,
    `p_date` string
    )
    ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
    STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
    LOCATION 's3://<BUCKET_NAME_WITH_PREFIX>/AWSLogs/<ORGANIZATION_ID>/'
    TBLPROPERTIES (
        'projection.enabled'='true',
        'projection.p_date.type'='date',
        'projection.p_date.format'='yyyy/MM/dd', 
        'projection.p_date.interval'='1', 
        'projection.p_date.interval.unit'='DAYS', 
        'projection.p_date.range'='2022/01/01,NOW', 
        'projection.p_region.type'='enum',
        'projection.p_region.values'='<REGIONS>',
        'projection.p_account.type'='injected',
        'storage.location.template'='s3://<BUCKET_NAME_WITH_PREFIX>/AWSLogs/<ORGANIZATION_ID>/${p_account}/CloudTrail/${p_region}/${p_date}'
    )

  3. Finally, run the Athena query and confirm that the cloudtrail_logs table is created and appears under the list of Tables.

Create an AWS Config aggregator to enrich query results

To further reduce manual steps for retrieval of relevant data about your environment, use the AWS Config aggregator and advanced queries to enrich CloudTrail logs with the configuration state of your resources.

To have a view into the resource configurations across the accounts in your organization, we recommend using the AWS Config organization aggregator. You can use an existing aggregator or create a new one. You can also use one of the data perimeter helper templates for deploying prerequisites.

Create an IAM Access Analyzer external access analyzer

To identify resources in your organization that are shared with an external entity, use the IAM Access Analyzer external access analyzer with your organization as the zone of trust.

You can use an existing external access analyzer or create a new one.

Install the data perimeter helper tool

Finally, you will use the data perimeter helper, an open-source tool with a set of purpose-built data perimeter queries, to automate the logs analysis process.

Clone the data perimeter helper repository and follow instructions in the Getting Started section.

Analyze account activity and refine your data perimeter controls

In this section, we provide step-by-step instructions for using the AWS services and tools you configured to effectively implement common data perimeter controls. We first demonstrate how to use the configured CloudTrail trail, Athena table, and AWS Config aggregator directly. We then show you how to accelerate the analysis with the data perimeter helper.

Example 1: Review API calls to untrusted S3 buckets and refine your resource perimeter policy

One of the security objectives targeted by companies is ensuring that their identities can only put or get data to and from S3 buckets belonging to their organization to manage the risk of unintended data disclosure or access to unapproved data. You can help achieve this security objective by implementing a resource perimeter on your identities using a service control policy (SCP). Start crafting your policy by referring to the resource_perimeter_policy template provided in the data perimeter policy examples repository:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EnforceResourcePerimeterAWSResourcesS3",
      "Effect": "Deny",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:PutObjectAcl"
      ],
      "Resource": "*",
      "Condition": {
        "StringNotEqualsIfExists": {
          "aws:ResourceOrgID": "<my-org-id>",
          "aws:PrincipalTag/resource-perimeter-exception": "true"
        },
        "ForAllValues:StringNotEquals": {
          "aws:CalledVia": [
            "dataexchange.amazonaws.com",
            "servicecatalog.amazonaws.com"
          ]
        }
      }
    }
  ]
}

Replace the value of the aws:ResourceOrgID condition key with your organization identifier. See the GitHub repository README file for information on other elements of the policy.

As a security engineer, you can anticipate potential impacts by reviewing account activity and CloudTrail logs. You can perform this analysis manually or use the data perimeter helper tool to streamline the process.

First, let’s explore the manual approach to understand each step in detail.

Perform impact analysis without tooling

To assess the effects of the preceding policy before deployment, review your CloudTrail logs to understand on which S3 buckets API calls are performed. The targeted Amazon S3 API calls are recorded as CloudTrail data events, so make sure you enable the S3 data event for this example. CloudTrail logs provide request parameters from which you can extract the bucket names.

Below is an example Athena query to list the targeted S3 API calls made by principals in the selected account within the last 7 days. The 7-day time frame keeps the query fast while you confirm that it works; you can adjust the time frame later to suit your requirements and obtain more representative results. Replace <ACCOUNT_ID> with the ID of the AWS account whose activity you want to analyze.

SELECT
  useridentity.sessioncontext.sessionissuer.arn AS principal_arn,
  useridentity.type AS principal_type,
  eventname,
  JSON_EXTRACT_SCALAR(requestparameters, '$.bucketName') AS bucketname,
  resources,
  COUNT(*) AS nb_reqs
FROM "cloudtrail_logs"
WHERE
  p_account = '<ACCOUNT_ID>'
  AND p_date BETWEEN DATE_FORMAT(CURRENT_DATE - INTERVAL '7' day, '%Y/%m/%d') AND DATE_FORMAT(CURRENT_DATE, '%Y/%m/%d')
  AND eventsource = 's3.amazonaws.com'
  -- Get only requests performed by principals in the selected account
  AND useridentity.accountid = '<ACCOUNT_ID>'
  -- Keep only the targeted API calls
  AND eventname IN ('GetObject', 'PutObject', 'PutObjectAcl')
  -- Remove API calls made by AWS service principals - `useridentity.principalid` field in CloudTrail log equals `AWSService`.
  AND useridentity.principalid != 'AWSService'
  -- Remove API calls made by service-linked roles in the selected account
  AND COALESCE(NOT regexp_like(useridentity.sessioncontext.sessionissuer.arn, '(:role/aws-service-role/)'), True)
  -- Remove calls with errors
  AND errorcode IS NULL
GROUP BY
  useridentity.sessioncontext.sessionissuer.arn,
  useridentity.type,
  eventname,
  JSON_EXTRACT_SCALAR(requestparameters, '$.bucketName'),
  resources

As shown in Figure 2, this query provides you with a list of the S3 bucket names that are being accessed by principals in the selected account, while removing calls made by service-linked roles (SLRs) because they aren’t governed by SCPs. In this example, the IAM roles AppMigrator and PartnerSync performed API calls on S3 buckets named app-logs-111111111111, raw-data-111111111111, expected-partner-999999999999, and app-migration-888888888888.

Figure 2: Sample of the Athena query results

The CloudTrail record field resources provides information on the list of resources accessed in an event. The field is optional and can notably contain the resource Amazon Resource Names (ARNs) and the account ID of the resource owner. You can use this record field to detect resources owned by accounts not in your organization. However, because this record field is optional, to scale your approach you can also use the AWS Config aggregator data to list resources currently deployed in your organization.

To know if the S3 buckets belong to your organization or not, you can run the following AWS Config advanced query. This query lists the S3 buckets inventoried in your AWS Config organization aggregator.

SELECT
  accountId,
  awsRegion,
  resourceId
WHERE
  resourceType = 'AWS::S3::Bucket'

As shown in Figure 3, buckets expected-partner-999999999999 and app-migration-888888888888 aren’t inventoried and therefore don’t belong to this organization.

Figure 3: Sample of the AWS Config advanced query results

By combining the results of the Athena query and the AWS Config advanced query, you can now pinpoint S3 API calls made by principals in the selected account on S3 buckets that are not part of your AWS organization.
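The combination step can be sketched in Python as follows. This is a minimal illustration, not the data perimeter helper's implementation. It assumes that the Athena rows have already been fetched as dictionaries keyed by the column aliases of the query above, and that each AWS Config result is a JSON string describing one resource. The function names and the pairing of roles and buckets in the sample rows are hypothetical.

```python
import json


def org_bucket_names(config_results):
    """Collect bucket names from AWS Config advanced query results.

    Each result is assumed to be a JSON string with a resourceId field
    that holds the bucket name.
    """
    return {json.loads(item)["resourceId"] for item in config_results}


def calls_outside_org(athena_rows, config_results):
    """Keep only the API calls that target buckets not inventoried in the aggregator."""
    known_buckets = org_bucket_names(config_results)
    return [row for row in athena_rows if row["bucketname"] not in known_buckets]


# Illustrative rows only; which role touched which bucket is made up.
athena_rows = [
    {"principal_arn": "arn:aws:iam::111111111111:role/AppMigrator",
     "eventname": "PutObject", "bucketname": "app-logs-111111111111"},
    {"principal_arn": "arn:aws:iam::111111111111:role/AppMigrator",
     "eventname": "PutObject", "bucketname": "app-migration-888888888888"},
    {"principal_arn": "arn:aws:iam::111111111111:role/PartnerSync",
     "eventname": "GetObject", "bucketname": "raw-data-111111111111"},
    {"principal_arn": "arn:aws:iam::111111111111:role/PartnerSync",
     "eventname": "PutObject", "bucketname": "expected-partner-999999999999"},
]
config_results = [
    '{"accountId": "111111111111", "awsRegion": "us-east-1", "resourceId": "app-logs-111111111111"}',
    '{"accountId": "111111111111", "awsRegion": "us-east-1", "resourceId": "raw-data-111111111111"}',
]

for row in calls_outside_org(athena_rows, config_results):
    print(row["principal_arn"], row["eventname"], row["bucketname"])
```

A set lookup keeps the comparison cheap even when the aggregator inventories many buckets. Rows whose bucket name could not be extracted are kept in the output so that they are reviewed rather than silently dropped.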

If deployed as is, your starting resource perimeter policy would block access to these buckets. Therefore, you should investigate with your development teams why your principals performed those API calls and refine your policy if there is a legitimate reason, such as integration with a trusted third party. If you determine, for example, that your principals have a legitimate reason to access the bucket expected-partner-999999999999, you can discover the ID of the account that owns the bucket (<third-party-account-a>) by reviewing the record field resources in your CloudTrail logs or by investigating with your developers. You can then edit the policy as follows:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EnforceResourcePerimeterAWSResourcesS3",
      "Effect": "Deny",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:PutObjectAcl"
      ],
      "Resource": "*",
      "Condition": {
        "StringNotEqualsIfExists": {
          "aws:ResourceOrgID": "<my-org-id>",
          "aws:ResourceAccount": "<third-party-account-a>",
          "aws:PrincipalTag/resource-perimeter-exception": "true"
        },
        "ForAllValues:StringNotEquals": {
          "aws:CalledVia": [
            "dataexchange.amazonaws.com",
            "servicecatalog.amazonaws.com"
          ]
        }
      }
    }
  ]
}

Now your resource perimeter policy helps ensure that access to resources that belong to your trusted third-party partner isn’t blocked by default.
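To reason about when the Deny statement above takes effect, the following Python sketch models its condition block. It is a simplified illustration, not the IAM policy evaluation engine: it assumes the request context is a plain dictionary of condition keys, and the organization ID and account ID constants are hypothetical stand-ins for <my-org-id> and <third-party-account-a>.

```python
MY_ORG_ID = "o-exampleorgid"          # hypothetical stand-in for <my-org-id>
THIRD_PARTY_ACCOUNT = "999999999999"  # hypothetical stand-in for <third-party-account-a>
EXEMPT_SERVICES = {"dataexchange.amazonaws.com", "servicecatalog.amazonaws.com"}


def not_equals_if_exists(context, key, value):
    """Model StringNotEqualsIfExists: true when the key is absent or differs from value."""
    return key not in context or context[key] != value


def deny_applies(context):
    """Return True when every condition of the Deny statement matches the request."""
    called_via = context.get("aws:CalledVia", [])
    return (
        not_equals_if_exists(context, "aws:ResourceOrgID", MY_ORG_ID)
        and not_equals_if_exists(context, "aws:ResourceAccount", THIRD_PARTY_ACCOUNT)
        and not_equals_if_exists(
            context, "aws:PrincipalTag/resource-perimeter-exception", "true"
        )
        # Model ForAllValues:StringNotEquals: no value in the call chain may be
        # one of the listed services; an absent or empty key also matches.
        and all(service not in EXEMPT_SERVICES for service in called_via)
    )


own_bucket = {"aws:ResourceOrgID": MY_ORG_ID, "aws:ResourceAccount": "111111111111"}
partner_bucket = {"aws:ResourceOrgID": "o-partnerorg", "aws:ResourceAccount": THIRD_PARTY_ACCOUNT}
unknown_bucket = {"aws:ResourceOrgID": "o-unknownorg", "aws:ResourceAccount": "888888888888"}

print(deny_applies(own_bucket), deny_applies(partner_bucket), deny_applies(unknown_bucket))
```

Because all condition keys within a statement must match for the Deny to apply, adding aws:ResourceAccount gives requests to the partner account a way out of the Deny without loosening the control for other external accounts.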

Automate impact analysis with the data perimeter helper

Data perimeter helper provides queries that perform and combine the results of Athena and AWS Config aggregator queries on your behalf to accelerate policy impact analysis.

For this example, we use the s3_scp_resource_perimeter query to analyze S3 API calls made by principals in a selected account on S3 buckets not owned by your organization or inventoried in your AWS Config aggregator.

You can first add the bucket names of trusted third-party partners that you already know to the data perimeter helper configuration file (resource_perimeter_trusted_bucket parameter). You then run the data perimeter helper query using the following command. Replace <ACCOUNT_ID> with the ID of the AWS account whose activity you want to analyze.

data_perimeter_helper --list-account <ACCOUNT_ID> --list-query s3_scp_resource_perimeter

Data perimeter helper performs these actions:

  • Runs an Athena query to list S3 API calls made by principals in the selected account, filtering out:
    • S3 API calls made at the account level (for example, s3:ListAllMyBuckets)
    • S3 API calls made on buckets that your organization owns
    • S3 API calls made on buckets listed as trusted in the data perimeter helper configuration file (resource_perimeter_trusted_bucket parameter)
    • API calls made by service principals and SLRs because SCPs don’t apply to them
    • API calls with errors
  • Gets the list of S3 buckets in your organization using an AWS Config advanced query.
  • Removes from the Athena query’s results API calls performed on S3 buckets inventoried in your AWS Config aggregator. This is done as a second clean-up layer in case the optional CloudTrail record field resources isn’t populated.

Data perimeter helper exports the results in the selected format (HTML, Excel, or JSON) so that you can investigate API calls that don’t align with your initial resource perimeter policy. Figure 4 shows a sample of results in HTML:

Figure 4: Sample of the s3_scp_resource_perimeter query results

The preceding data perimeter helper query results indicate that the IAM role PartnerSync performed API calls on S3 buckets that aren’t part of the organization, giving you a head start in your investigation efforts. Following the investigation, you can document a trusted partner bucket in the data perimeter helper configuration file to filter out the associated API calls from subsequent queries:

111111111111:
  network_perimeter_expected_public_cidr: [
  ]
  network_perimeter_trusted_principal: [
  ]
  network_perimeter_expected_vpc: [
  ]
  network_perimeter_expected_vpc_endpoint: [
  ]
  identity_perimeter_trusted_account: [
  ]
  identity_perimeter_trusted_principal: [
  ]
  resource_perimeter_trusted_bucket: [
    expected-partner-999999999999
  ]

With a single command line, you have identified for your selected account the S3 API calls crossing your resource perimeter boundaries. You can now refine and implement your controls while lowering the risk of potential impacts. If you want to scale your approach to other accounts, you just need to run the same query against them.

Example 2: Review granted access and API calls by untrusted identities on your S3 buckets and refine your identity perimeter policy

Another security objective pursued by companies is ensuring that their S3 buckets can be accessed only by principals belonging to their organization to manage the risk of unintended access to company data. You can help achieve this security objective by implementing an identity perimeter on your buckets. You can start by crafting your identity perimeter policy using policy samples provided in the data perimeter policy examples repository.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EnforceIdentityPerimeter",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::<my-data-bucket>",
        "arn:aws:s3:::<my-data-bucket>/*"
      ],
      "Condition": {
        "StringNotEqualsIfExists": {
          "aws:PrincipalOrgID": "<my-org-id>",
          "aws:PrincipalAccount": [
            "<load-balancing-account-id>",
            "<third-party-account-a>",
            "<third-party-account-b>"
          ]
        },
        "BoolIfExists": {
          "aws:PrincipalIsAWSService": "false"
        }
      }
    }
  ]
}

Replace values of the aws:PrincipalOrgID and aws:PrincipalAccount condition keys based on what trusted identities mean for your organization and on your knowledge of the intended access patterns you need to support. See the GitHub repository README file for information on elements of the policy.

To assess the effects of the preceding policy before deployment, review your IAM Access Analyzer external access findings to discover the external entities that are allowed in your S3 bucket policies. Then to accelerate your analysis, review your CloudTrail logs to learn who is performing API calls on your S3 buckets. This can help you accelerate the removal of granted but unused external accesses.

Data perimeter helper provides two queries that streamline these processes for you: s3_external_access_org_boundary and s3_bucket_policy_identity_perimeter_org_boundary.

Run these queries by using the following command, replacing <ACCOUNT_ID> with the ID of the AWS account that owns the buckets whose access activity you want to analyze:

data_perimeter_helper --list-account <ACCOUNT_ID> --list-query s3_external_access_org_boundary s3_bucket_policy_identity_perimeter_org_boundary

The query s3_external_access_org_boundary performs this action:

  • Extracts IAM Access Analyzer external access findings from either:
    • IAM Access Analyzer if the variable external_access_findings in the data perimeter variable file is set to IAM_ACCESS_ANALYZER
    • AWS Security Hub if the same variable is set to SECURITY_HUB. Security Hub provides cross-Region aggregation, enabling you to retrieve external access findings across your organization

The query s3_bucket_policy_identity_perimeter_org_boundary performs this action:

  • Runs an Athena query to list S3 API calls made on S3 buckets in the selected account, filtering out:
    • API calls made by principals in the same organization
    • API calls made by principals belonging to trusted accounts listed in the data perimeter configuration file (identity_perimeter_trusted_account parameter)
    • API calls made by trusted identities listed in the data perimeter configuration file (identity_perimeter_trusted_principal parameter)
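The three filters above can be pictured with a short Python sketch. The tool applies its filters through its own queries. This standalone version only illustrates the logic and assumes each CloudTrail row has already been reduced to a dictionary with hypothetical principal_account and principal_arn fields.

```python
def is_untrusted_call(row, org_account_ids, trusted_accounts, trusted_principals):
    """Return True when the caller falls outside the identity perimeter."""
    if row["principal_account"] in org_account_ids:
        return False  # principal belongs to the same organization
    if row["principal_account"] in trusted_accounts:
        return False  # account listed in identity_perimeter_trusted_account
    if row["principal_arn"] in trusted_principals:
        return False  # principal listed in identity_perimeter_trusted_principal
    return True


# Illustrative values only.
org_account_ids = {"111111111111", "222222222222"}
trusted_accounts = {"999999999999"}
trusted_principals = {"arn:aws:iam::777777777777:role/Auditor"}

rows = [
    {"principal_account": "222222222222", "principal_arn": "arn:aws:iam::222222222222:role/App"},
    {"principal_account": "999999999999", "principal_arn": "arn:aws:iam::999999999999:role/Sync"},
    {"principal_account": "777777777777", "principal_arn": "arn:aws:iam::777777777777:role/Auditor"},
    {"principal_account": "888888888888", "principal_arn": "arn:aws:iam::888888888888:role/Unknown"},
]

untrusted = [
    row for row in rows
    if is_untrusted_call(row, org_account_ids, trusted_accounts, trusted_principals)
]
print([row["principal_arn"] for row in untrusted])
```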

Figure 5 shows a sample of results for this query in HTML:

Figure 5: Sample of the s3_bucket_policy_identity_perimeter_org_boundary and s3_external_access_org_boundary queries results

The result shows that only the bucket my-bucket-open-to-partner grants access (PutObject) to principals not in your organization. Plus, in the configured time frame, your CloudTrail trail hasn’t recorded S3 API calls made by principals not in your organization on buckets that the account 111111111111 owns.

This means that your proposed identity perimeter policy accounts for the access patterns observed in your environment. If, after reviewing with your developers, you find that the granted action on the bucket my-bucket-open-to-partner is not needed, you could deploy the policy on the analyzed account with a reduced risk of impacting business applications.

Example 3: Investigate resource configurations to support network perimeter controls implementation

The blog post Require services to be created only within expected networks provides an example of an SCP you can use to make sure that AWS Lambda functions can only be created or updated if associated with an Amazon Virtual Private Cloud (Amazon VPC).

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "EnforceVPCFunction",
      "Action": [
          "lambda:CreateFunction",
          "lambda:UpdateFunctionConfiguration"
       ],
      "Effect": "Deny",
      "Resource": "*",
      "Condition": {
        "Null": {
           "lambda:VpcIds": "true"
        }
      }
    }
  ]
}

Before implementing the preceding policy or to continuously review the configuration of your Lambda functions, you can use your AWS Config aggregator to understand whether there are functions in your environment that aren’t attached to a VPC.

Data perimeter helper provides the referential_lambda_function query that helps you automate the analysis. Run the query by using the following command:

data_perimeter_helper --list-query referential_lambda_function

Figure 6 shows a sample of results for this query in HTML:

Figure 6: Sample of the referential_lambda_function query results

By reviewing the inVpc column, you can quickly identify functions that aren’t currently associated with a VPC and investigate with your development teams before enforcing your network perimeter controls.
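A check equivalent to the inVpc column can be sketched in Python. This is an assumption-laden illustration rather than the tool's implementation: it assumes each configuration item is a dictionary with a resourceId and a configuration that may contain vpcConfig.subnetIds. Verify these field names against the output of your own aggregator before relying on them.

```python
def in_vpc(configuration):
    """Return True when the function configuration lists at least one subnet."""
    vpc_config = configuration.get("vpcConfig") or {}
    return bool(vpc_config.get("subnetIds"))


def functions_outside_vpc(items):
    """List the resource IDs of functions that have no VPC attachment."""
    return [
        item["resourceId"]
        for item in items
        if not in_vpc(item.get("configuration") or {})
    ]


# Illustrative configuration items; names and IDs are made up.
items = [
    {"resourceId": "orders-api", "configuration": {"vpcConfig": {"subnetIds": ["subnet-0abc"]}}},
    {"resourceId": "thumbnailer", "configuration": {"vpcConfig": {"subnetIds": []}}},
    {"resourceId": "legacy-job", "configuration": {}},
]

print(functions_outside_vpc(items))
```

Treating a missing or empty subnet list as "not in a VPC" errs on the side of flagging a function for review.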

Example 4: Investigate access denied errors to help troubleshoot your controls

While you refine your data perimeter controls or after deploying them, you might encounter API calls that fail with an access denied error message. You can use CloudTrail logs to review those API calls and use the record to investigate the root cause.

Data perimeter helper provides the common_only_denied query, which lists the API calls with access denied errors in the configured time frame. Run the query by using the following command, replacing <ACCOUNT_ID> with your account ID:

data_perimeter_helper --list-account <ACCOUNT_ID> --list-query common_only_denied

Figure 7 shows a sample of results for this query in HTML:

Figure 7: Sample of the common_only_denied query results

Let’s say you want to review S3 API calls with access denied error messages for one of your developers who uses a role called DevOps. You can update, in the HTML report, the input fields below the principal_arn and eventsource columns to match your lookup.

Then by reviewing the columns principal_arn, eventname, isAssumableBy, and nb_reqs, you learn that the role DevOps is assumable through a SAML provider and performed two GetObject API calls that failed with an access denied error message. By reviewing the sourceipaddress field, you discover that the requests were performed from an IP address outside of your network perimeter boundary. You can then advise your developer to perform the API calls again from an expected network.
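The source IP check described above can also be scripted. The following Python sketch uses only the standard library. The function name and the sample CIDR ranges are hypothetical, and the expected ranges would typically be the ones you list under network_perimeter_expected_public_cidr.

```python
import ipaddress


def is_expected_source(source_ip, expected_cidrs):
    """Return True when source_ip falls inside one of the expected CIDR ranges."""
    try:
        address = ipaddress.ip_address(source_ip)
    except ValueError:
        # CloudTrail can record a service DNS name instead of an IP address.
        return False
    return any(address in ipaddress.ip_network(cidr) for cidr in expected_cidrs)


# Documentation-only address ranges used as placeholders.
expected_cidrs = ["203.0.113.0/24"]

for source in ("203.0.113.10", "198.51.100.7", "cloudtrail.amazonaws.com"):
    print(source, is_expected_source(source, expected_cidrs))
```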

Data perimeter helper provides several ready-to-use queries and a framework to add new queries based on your data perimeter objectives and needs. See Guidelines to build a new query for detailed instructions.

Clean up

If you followed the configuration steps in this blog post only to test the solution, you can clean up your account to avoid recurring charges.

If you used the data perimeter helper deployment templates, use the respective infrastructure as code commands to delete the provisioned resources (for example, for Terraform, terraform destroy).

To delete configured resources manually, follow these steps:

  • If you created a CloudTrail organization trail:
    • Navigate to the CloudTrail console, select the trail you created, and choose Delete.
    • Navigate to the Amazon S3 console and delete the S3 bucket you created to store CloudTrail logs from all accounts.
  • If you created an Athena table:
    • Navigate to the Athena console and select Query editor in the left navigation panel.
    • Run the following SQL query by replacing <TABLE_NAME> with the name of the created table:
      DROP TABLE <TABLE_NAME>

  • If you created an AWS Config aggregator:
    • Navigate to the AWS Config console and select Aggregators in the left navigation panel.
    • Select the created aggregator and select Delete from the Actions drop-down list.
  • If you installed data perimeter helper:
    • Follow the uninstallation steps in the data perimeter helper README file.

Conclusion

In this blog post, we reviewed how you can analyze access activity in your organization by using CloudTrail logs to evaluate the impact of your data perimeter controls and perform troubleshooting. We discussed how the log event data can be enriched using resource configuration information from AWS Config to streamline your analysis. Finally, we introduced data perimeter helper, an open-source tool that provides a set of queries tailored to data perimeters to speed up your review process, along with a framework to create new queries.

For additional learning opportunities, see the Data perimeters on AWS page, which provides additional material such as a data perimeter workshop, blog posts, whitepapers, and webinar sessions.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Security, Identity, and Compliance re:Post or contact AWS Support.

Achraf Moussadek-Kabdani

Achraf is a Senior Security Specialist at AWS. He works with global financial services customers to assess and improve their security posture. He is both a builder and advisor, supporting his customers to meet their security objectives while making security a business enabler.

Tatyana Yatskevich

Tatyana is a Principal Solutions Architect in AWS Identity. She works with customers to help them build and operate in AWS in the most secure and efficient manner.

Simplify data lake access control for your enterprise users with trusted identity propagation in AWS IAM Identity Center, AWS Lake Formation, and Amazon S3 Access Grants

Post Syndicated from Shoukat Ghouse original https://aws.amazon.com/blogs/big-data/simplify-data-lake-access-control-for-your-enterprise-users-with-trusted-identity-propagation-in-aws-iam-identity-center-aws-lake-formation-and-amazon-s3-access-grants/

Many organizations use external identity providers (IdPs) such as Okta or Microsoft Azure Active Directory to manage their enterprise user identities. These users interact with and run analytical queries across AWS analytics services. To enable them to use the AWS services, their identities from the external IdP are mapped to AWS Identity and Access Management (IAM) roles within AWS, and access policies are applied to these IAM roles by data administrators.

Given the diverse range of services involved, different IAM roles may be required for accessing the data. Consequently, administrators need to manage permissions across multiple roles, a task that can become cumbersome at scale.

To address this challenge, you need a unified solution to simplify data access management using your corporate user identities instead of relying solely on IAM roles. AWS IAM Identity Center offers a solution through its trusted identity propagation feature, which is built upon the OAuth 2.0 authorization framework.

With trusted identity propagation, data access management is anchored to a user’s identity, which can be synchronized to IAM Identity Center from external IdPs using the System for Cross-domain Identity Management (SCIM) protocol. Integrated applications exchange OAuth tokens, and these tokens are propagated across services. This approach empowers administrators to grant access directly based on existing user and group memberships federated from external IdPs, rather than relying on IAM users or roles.

In this post, we showcase the seamless integration of AWS analytics services with trusted identity propagation by presenting an end-to-end architecture for data access flows.

Solution overview

Let’s consider a fictional company, OkTank. OkTank has multiple user personas that use a variety of AWS Analytics services. The user identities are managed externally in an external IdP: Okta. User1 is a Data Analyst and uses the Amazon Athena query editor to query AWS Glue Data Catalog tables with data stored in Amazon Simple Storage Service (Amazon S3). User2 is a Data Engineer and uses Amazon EMR Studio notebooks to query Data Catalog tables and also query raw data stored in Amazon S3 that is not yet cataloged to the Data Catalog. User3 is a Business Analyst who needs to query data stored in Amazon Redshift tables using the Amazon Redshift Query Editor v2. Additionally, this user builds Amazon QuickSight visualizations for the data in Redshift tables.

OkTank wants to simplify governance by centralizing data access control for their variety of data sources, user identities, and tools. They also want to define permissions directly on their corporate user or group identities from Okta instead of creating IAM roles for each user and group and managing access on the IAM role. In addition, for their audit requirements, they need the capability to map data access to the corporate identity of users within Okta for enhanced tracking and accountability.

To achieve these goals, we use trusted identity propagation with the aforementioned services and use AWS Lake Formation and Amazon S3 Access Grants for access controls. We use Lake Formation to centrally manage permissions to the Data Catalog tables and Redshift tables shared with Redshift datashares. In our scenario, we use S3 Access Grants for granting permission for the Athena query result location. Additionally, we show how to access a raw data bucket governed by S3 Access Grants with an EMR notebook.

Data access is audited with AWS CloudTrail and can be queried with AWS CloudTrail Lake. This architecture showcases the versatility and effectiveness of AWS analytics services in enabling efficient and secure data analysis workflows across different use cases and user personas.

We use Okta as the external IdP, but you can also use other IdPs like Microsoft Azure Active Directory. Users and groups from Okta are synced to IAM Identity Center. In this post, we have three groups, as shown in the following diagram.

User1 needs to query a Data Catalog table with data stored in Amazon S3. The S3 location is secured and managed by Lake Formation. The user connects to an IAM Identity Center enabled Athena workgroup using the Athena query editor with EMR Studio. The IAM Identity Center enabled Athena workgroups need to be secured with S3 Access Grants permissions for the Athena query results location. With this feature, you can also enable the creation of identity-based query result locations that are governed by S3 Access Grants. These user identity-based S3 prefixes let users in an Athena workgroup keep their query results isolated from other users in the same workgroup. The following diagram illustrates this architecture.

User2 needs to query the same Data Catalog table as User1. This table is governed using Lake Formation permissions. Additionally, the user needs to access raw data in another S3 bucket that isn’t cataloged to the Data Catalog and is controlled using S3 Access Grants; in the following diagram, this is shown as S3 Data Location-2.

The user uses an EMR Studio notebook to run Spark queries on an EMR cluster. The EMR cluster uses a security configuration that integrates with IAM Identity Center for authentication and uses Lake Formation for authorization. The EMR cluster is also enabled for S3 Access Grants. With this kind of hybrid access management, you can use Lake Formation to centrally manage permissions for your datasets cataloged to the Data Catalog and use S3 Access Grants to centrally manage access to your raw data that is not yet cataloged to the Data Catalog. This gives you flexibility to access data managed by either of the access control mechanisms from the same notebook.

User3 uses the Redshift Query Editor v2 to query a Redshift table. The user also accesses the same table with QuickSight. For our demo, we use a single user persona for simplicity, but in reality, these could be completely different user personas. To enable access control with Lake Formation for Redshift tables, we use data sharing in Lake Formation.

Data access requests by the specific users are logged to CloudTrail. Later in this post, we also briefly touch upon using CloudTrail Lake to query the data access events.

In the following sections, we demonstrate how to build this architecture. We use AWS CloudFormation to provision the resources. AWS CloudFormation lets you model, provision, and manage AWS and third-party resources by treating infrastructure as code. We also use the AWS Command Line Interface (AWS CLI) and AWS Management Console to complete some steps.

The following diagram shows the end-to-end architecture.

Prerequisites

Complete the following prerequisite steps:

  1. Have an AWS account. If you don’t have an account, you can create one.
  2. Have IAM Identity Center set up in a specific AWS Region.
  3. Make sure you use the same Region where you have IAM Identity Center set up throughout the setup and verification steps. In this post, we use the us-east-1 Region.
  4. Have Okta set up with three different groups and users, and enable sync to IAM Identity Center. Refer to Configure SAML and SCIM with Okta and IAM Identity Center for instructions.

After the Okta groups are pushed to IAM Identity Center, you can see the users and groups on the IAM Identity Center console, as shown in the following screenshot. You need the group IDs of the three groups to be passed in the CloudFormation template.

  5. To enable User2 access using the EMR cluster, you need to have an SSL certificate .zip file available in your S3 bucket. You can download the following sample certificate to use in this post. In production use cases, you should create and use your own certificates. You need to reference the bucket name and the certificate bundle .zip file in AWS CloudFormation. The CloudFormation template lets you choose the components you want to provision. If you do not intend to deploy the EMR cluster, you can ignore this step.
  6. Have an administrator user or role to run the CloudFormation stack. The user or role should also be a Lake Formation administrator to grant permissions.

Deploy the CloudFormation stack

The CloudFormation template provided in the post lets you choose the components you want to provision from the solution architecture. In this post, we enable all components, as shown in the following screenshot.

Run the provided CloudFormation stack to create the solution resources. Refer to the following table for a list of important parameters.

Parameter Group | Description | Parameter Name | Expected Value
Choose components to provision | Choose the components you want to be provisioned. | DeployAthenaFlow | Yes/No. If you choose No, you can ignore the parameters in the “Athena Configuration” group.
Choose components to provision | Choose the components you want to be provisioned. | DeployEMRFlow | Yes/No. If you choose No, you can ignore the parameters in the “EMR Configuration” group.
Choose components to provision | Choose the components you want to be provisioned. | DeployRedshiftQEV2Flow | Yes/No. If you choose No, you can ignore the parameters in the “Redshift Configuration” group.
Choose components to provision | Choose the components you want to be provisioned. | CreateS3AGInstance | Yes/No. If you already have an S3 Access Grants instance, choose No. Otherwise, choose Yes to allow the stack to create a new S3 Access Grants instance. The S3 Access Grants instance is needed for User1 and User2.
Identity Center Configuration | IAM Identity Center parameters. | IDCGroup1Id | Group ID corresponding to Group1 from IAM Identity Center.
Identity Center Configuration | IAM Identity Center parameters. | IDCGroup2Id | Group ID corresponding to Group2 from IAM Identity Center.
Identity Center Configuration | IAM Identity Center parameters. | IDCGroup3Id | Group ID corresponding to Group3 from IAM Identity Center.
Identity Center Configuration | IAM Identity Center parameters. | IAMIDCInstanceArn | IAM Identity Center instance ARN. You can get this from the Settings section of IAM Identity Center.
Redshift Configuration | Redshift parameters. Ignore if you chose DeployRedshiftQEV2Flow as No. | RedshiftServerlessAdminUserName | Redshift admin user name.
Redshift Configuration | Redshift parameters. Ignore if you chose DeployRedshiftQEV2Flow as No. | RedshiftServerlessAdminPassword | Redshift admin password.
Redshift Configuration | Redshift parameters. Ignore if you chose DeployRedshiftQEV2Flow as No. | RedshiftServerlessDatabase | Redshift database to create the tables.
EMR Configuration | EMR parameters. Ignore if you chose parameter DeployEMRFlow as No. | SSlCertsS3BucketName | Bucket name where you copied the SSL certificates.
EMR Configuration | EMR parameters. Ignore if you chose parameter DeployEMRFlow as No. | SSlCertsZip | Name of SSL certificates file (my-certs.zip) to use the sample certificate provided in the post.
Athena Configuration | Athena parameters. Ignore if you chose parameter DeployAthenaFlow as No. | IDCUser1Id | User ID corresponding to User1 from IAM Identity Center.
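
Because each Yes/No switch decides which parameter groups matter, it can help to check your inputs before you launch the stack. The following Python sketch is illustrative only: the parameter names come from the table above, but the helpers `missing_parameters` and `to_cloudformation_parameters` are hypothetical, and we assume the Identity Center parameters are always needed.

```python
# Parameter names are taken from the table above.
# Assumption: the Identity Center group is always required.
ALWAYS_REQUIRED = [
    "IDCGroup1Id",
    "IDCGroup2Id",
    "IDCGroup3Id",
    "IAMIDCInstanceArn",
]

# Each Deploy* switch gates the parameter group it refers to.
GATED = {
    "DeployRedshiftQEV2Flow": [
        "RedshiftServerlessAdminUserName",
        "RedshiftServerlessAdminPassword",
        "RedshiftServerlessDatabase",
    ],
    "DeployEMRFlow": ["SSlCertsS3BucketName", "SSlCertsZip"],
    "DeployAthenaFlow": ["IDCUser1Id"],
}


def missing_parameters(params: dict) -> list:
    """Return the names of needed parameters that are absent or empty."""
    required = list(ALWAYS_REQUIRED)
    for flag, names in GATED.items():
        if params.get(flag, "No") == "Yes":
            required.extend(names)
    return [name for name in required if not params.get(name)]


def to_cloudformation_parameters(params: dict) -> list:
    """Convert a plain dict into ParameterKey/ParameterValue pairs."""
    return [
        {"ParameterKey": key, "ParameterValue": value}
        for key, value in params.items()
    ]
```

The list returned by `to_cloudformation_parameters` follows the ParameterKey/ParameterValue shape that CloudFormation uses for stack parameters.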

The CloudFormation stack provisions the following resources:

  • A VPC with a public and private subnet.
  • If you chose the Redshift components, it also creates three additional subnets.
  • S3 buckets for data and Athena query results location storage. It also copies some sample data to the buckets.
  • EMR Studio with IAM Identity Center integration.
  • Amazon EMR security configuration with IAM Identity Center integration.
  • An EMR cluster that uses the EMR security group.
  • Registers the source S3 bucket with Lake Formation.
  • An AWS Glue database named oktank_tipblog_temp and a table named customer under the database. The table points to the Amazon S3 location governed by Lake Formation.
  • Allows external engines to access data in Amazon S3 locations with full table access. This is required for Amazon EMR integration with Lake Formation for trusted identity propagation. As of this writing, Amazon EMR supports table-level access with IAM Identity Center enabled clusters.
  • An S3 Access Grants instance.
  • S3 Access Grants for Group1 to the User1 prefix under the Athena query results location bucket.
  • S3 Access Grants for Group2 to the S3 bucket input and output prefixes. The user has read access to the input prefix and write access to the output prefix under the bucket.
  • An Amazon Redshift Serverless namespace and workgroup. This workgroup is not integrated with IAM Identity Center; we complete subsequent steps to enable IAM Identity Center for the workgroup.
  • An AWS Cloud9 integrated development environment (IDE), which we use to run AWS CLI commands during the setup.

Note the stack outputs on the AWS CloudFormation console. You use these values in later steps.

Choose the link for Cloud9URL in the stack output to open the AWS Cloud9 IDE. In AWS Cloud9, go to the Window tab and choose New Terminal to start a new bash terminal.

Set up Lake Formation

You need to enable Lake Formation with IAM Identity Center and enable an EMR application with Lake Formation integration. Complete the following steps:

  1. In the AWS Cloud9 bash terminal, enter the following command to get the Amazon EMR security configuration created by the stack:
aws emr describe-security-configuration --name TIP-EMRSecurityConfig | jq -r '.SecurityConfiguration | fromjson | .AuthenticationConfiguration.IdentityCenterConfiguration.IdCApplicationARN'
  1. Note the value for IdcApplicationARN from the output.
  2. Enter the following command in AWS Cloud9 to enable the Lake Formation integration with IAM Identity Center and add the Amazon EMR security configuration application as a trusted application in Lake Formation. If you already have the IAM Identity Center integration with Lake Formation, skip the following command. Instead, sign in to the Lake Formation console, add the preceding value to the list of applications, and proceed to the next step.
aws lakeformation create-lake-formation-identity-center-configuration --catalog-id <Replace with CatalogId value from CloudFormation output> --instance-arn <Replace with IDCInstanceARN value from CloudFormation stack output> --external-filtering Status=ENABLED,AuthorizedTargets=<Replace with IdcApplicationARN value copied in previous step>

After this step, you should see the application on the Lake Formation console.
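
If you prefer to script this lookup, the following Python sketch mirrors the jq filter used earlier. It assumes the response has the same shape the jq filter relies on (a `SecurityConfiguration` string that itself contains JSON); the function names are hypothetical.

```python
import json


def extract_idc_application_arn(response: dict) -> str:
    """Mirror the jq filter: decode the embedded JSON, then walk to the ARN."""
    # SecurityConfiguration is a JSON document stored as a string,
    # which is why the jq filter applies `fromjson` before indexing.
    config = json.loads(response["SecurityConfiguration"])
    identity_center = config["AuthenticationConfiguration"][
        "IdentityCenterConfiguration"
    ]
    return identity_center["IdCApplicationARN"]


def external_filtering_argument(application_arn: str) -> str:
    """Build the value passed to --external-filtering in the command above."""
    return f"Status=ENABLED,AuthorizedTargets={application_arn}"
```

You could pass the parsed output of `aws emr describe-security-configuration` to `extract_idc_application_arn` and feed the result into `external_filtering_argument`.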

This completes the initial setup. In subsequent steps, we apply some additional configurations for specific user personas.

Validate user personas

To review the S3 Access Grants created by AWS CloudFormation, open the Amazon S3 console and choose Access Grants in the navigation pane. Choose an access grant created by the stack to view its details.

The CloudFormation stack created the S3 Access Grants for Group1 for the User1 prefix under the Athena query results location bucket. This allows User1 to access the prefix in the query results bucket. The stack also created the grants for Group2 for User2 to access the raw data bucket input and output prefixes.

Set up User1 access

Complete the steps in this section to set up User1 access.

Create an IAM Identity Center enabled Athena workgroup

Let’s create the Athena workgroup that will be used by User1.

Enter the following command in the AWS Cloud9 terminal. The command creates an IAM Identity Center integrated Athena workgroup and enables S3 Access Grants for the user-level prefix. These user identity-based S3 prefixes let users in an Athena workgroup keep their query results isolated from other users in the same workgroup. The prefix is automatically created by Athena when the CreateUserLevelPrefix option is enabled. Access to the prefix was granted by the CloudFormation stack.

aws athena create-work-group --cli-input-json '{
"Name": "AthenaIDCWG",
"Configuration": {
"ResultConfiguration": {
"OutputLocation": "<Replace with AthenaResultLocation from CloudFormation stack>"
},
"ExecutionRole": "<Replace with TIPStudioRoleArn from CloudFormation stack>",
"IdentityCenterConfiguration": {
"EnableIdentityCenter": true,
"IdentityCenterInstanceArn": "<Replace with IDCInstanceARN from CloudFormation stack>"
},
"QueryResultsS3AccessGrantsConfiguration": {
"EnableS3AccessGrants": true,
"CreateUserLevelPrefix": true,
"AuthenticationType": "DIRECTORY_IDENTITY"
},
"EnforceWorkGroupConfiguration":true
},
"Description": "Athena Workgroup with IDC integration"
}'
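
Hand-editing a JSON payload inside a shell string is easy to get wrong. As an alternative, the following Python sketch builds the same payload as a dictionary and quotes it for the shell. The keys and values are copied from the command above; the function names are hypothetical, and the three arguments stand in for the CloudFormation stack outputs.

```python
import json
import shlex


def athena_workgroup_input(
    output_location: str, execution_role_arn: str, idc_instance_arn: str
) -> dict:
    """Build the --cli-input-json payload shown in the command above."""
    return {
        "Name": "AthenaIDCWG",
        "Configuration": {
            "ResultConfiguration": {"OutputLocation": output_location},
            "ExecutionRole": execution_role_arn,
            "IdentityCenterConfiguration": {
                "EnableIdentityCenter": True,
                "IdentityCenterInstanceArn": idc_instance_arn,
            },
            "QueryResultsS3AccessGrantsConfiguration": {
                "EnableS3AccessGrants": True,
                "CreateUserLevelPrefix": True,
                "AuthenticationType": "DIRECTORY_IDENTITY",
            },
            "EnforceWorkGroupConfiguration": True,
        },
        "Description": "Athena Workgroup with IDC integration",
    }


def create_work_group_command(payload: dict) -> str:
    """Return a shell-safe command line that carries the JSON payload."""
    return "aws athena create-work-group --cli-input-json " + shlex.quote(
        json.dumps(payload)
    )
```

Printing the result of `create_work_group_command` gives a command you can paste into the AWS Cloud9 terminal.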

Grant access to User1 on the Athena workgroup

Sign in to the Athena console and grant access to Group1 to the workgroup as shown in the following screenshot. You can grant access to the user (User1) or to the group (Group1). In this post, we grant access to Group1.

Grant access to User1 in Lake Formation

Sign in to the Lake Formation console, choose Data lake permissions in the navigation pane, and grant access to the user group on the database oktank_tipblog_temp and table customer.

With Athena, you can grant access to specific columns and for specific rows with row-level filtering. For this post, we grant column-level access and restrict access to only selected columns for the table.

This completes the access permission setup for User1.

Verify access

Let’s see how User1 uses Athena to analyze the data.

  1. Copy the URL for EMRStudioURL from the CloudFormation stack output.
  2. Open a new browser window and connect to the URL.

You will be redirected to the Okta login page.

  1. Log in with User1.
  2. In the EMR Studio query editor, change the workgroup to AthenaIDCWG and choose Acknowledge.
  3. Run the following query in the query editor:
SELECT * FROM "oktank_tipblog_temp"."customer" limit 10;


You can see that the user is only able to access the columns for which permissions were previously granted in Lake Formation. This completes the access flow verification for User1.

Set up User2 access

User2 accesses the table using an EMR Studio notebook. Note the current considerations for EMR with IAM Identity Center integrations.

Complete the steps in this section to set up User2 access.

Grant Lake Formation permissions to User2

Sign in to the Lake Formation console and grant access to Group2 on the table, similar to the steps you followed earlier for User1. Also grant Describe permission on the default database to Group2, as shown in the following screenshot.

Create an EMR Studio Workspace

Next, User2 creates an EMR Studio Workspace.

  1. Copy the URL for EMR Studio from the EMRStudioURL value from the CloudFormation stack output.
  2. Log in to EMR Studio as User2 on the Okta login page.
  3. Create a Workspace, giving it a name and leaving all other options as default.

This will open a JupyterLab notebook in a new window.

Connect to the EMR Studio notebook

In the Compute pane of the notebook, select the EMR cluster (named EMRWithTIP) created by the CloudFormation stack to attach to it. After the notebook is attached to the cluster, choose the PySpark kernel to run Spark queries.

Verify access

Enter the following query in the notebook to read from the customer table:

spark.sql("select * from oktank_tipblog_temp.customer").show()


The user access works as expected based on the Lake Formation grants you provided earlier.

Run the following Spark query in the notebook to read data from the raw bucket. Access to this bucket is controlled by S3 Access Grants.

spark.read.option("header",True).csv("s3://tip-blog-s3-s3ag/input/*").show()

Let’s write this data to the same bucket and input prefix. This should fail because you only granted read access to the input prefix with S3 Access Grants.

spark.read.option("header",True).csv("s3://tip-blog-s3-s3ag/input/*").write.mode("overwrite").parquet("s3://tip-blog-s3-s3ag/input/")

The user has access to the output prefix under the bucket. Change the query to write to the output prefix:

spark.read.option("header",True).csv("s3://tip-blog-s3-s3ag/input/*").write.mode("overwrite").parquet("s3://tip-blog-s3-s3ag/output/test.part")

The write should now be successful.
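
To make the outcome of these three Spark statements easier to reason about, here is a deliberately simplified Python model of prefix-based grants. It is not the S3 Access Grants API: the `Grant` class and `is_allowed` function are hypothetical, and the model only checks whether a URI starts with a granted prefix and whether the permission covers the action.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    prefix: str      # S3 URI prefix that the grant covers
    permission: str  # "READ", "WRITE", or "READWRITE"


def is_allowed(grants: list, s3_uri: str, action: str) -> bool:
    """Return True if any grant covers the URI for the action (READ or WRITE)."""
    for grant in grants:
        covers_uri = s3_uri.startswith(grant.prefix)
        covers_action = grant.permission in (action, "READWRITE")
        if covers_uri and covers_action:
            return True
    return False


# Grants for Group2 as described in this post:
# read access to the input prefix, write access to the output prefix.
GROUP2_GRANTS = [
    Grant("s3://tip-blog-s3-s3ag/input/", "READ"),
    Grant("s3://tip-blog-s3-s3ag/output/", "WRITE"),
]
```

Under this model, reading from the input prefix and writing to the output prefix are allowed, while writing to the input prefix is denied, which matches the behavior you observed in the notebook.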

We have now seen the data access controls and access flows for User1 and User2.

Set up User3 access

Following the target architecture in our post, Group3 users use the Redshift Query Editor v2 to query the Redshift tables.

Complete the steps in this section to set up access for User3.

Enable Redshift Query Editor v2 console access for User3

Complete the following steps:

  1. On the IAM Identity Center console, create a custom permission set and attach the following policies:
    1. AWS managed policy AmazonRedshiftQueryEditorV2ReadSharing.
    2. Customer managed policy redshift-idc-policy-tip. This policy is already created by the CloudFormation stack, so you don’t have to create it.
  2. Provide a name (tip-blog-qe-v2-permission-set) to the permission set.
  3. Set the relay state as https://<region-id>.console.aws.amazon.com/sqlworkbench/home (for example, https://us-east-1.console.aws.amazon.com/sqlworkbench/home).
  4. Choose Create.
  5. Assign Group3 to the account in IAM Identity Center, select the permission set you created, and choose Submit.

Create the Redshift IAM Identity Center application

Enter the following in the AWS Cloud9 terminal:

aws redshift create-redshift-idc-application \
--idc-instance-arn '<Replace with IDCInstanceARN value from CloudFormation Output>' \
--redshift-idc-application-name 'redshift-iad-<Replace with CatalogId value from CloudFormation output>-tip-blog-1' \
--identity-namespace 'tipblogawsidc' \
--idc-display-name 'TIPBlog_AWSIDC' \
--iam-role-arn '<Replace with TIPRedshiftRoleArn value from CloudFormation output>' \
--service-integrations '[
  {
    "LakeFormation": [
    {
     "LakeFormationQuery": {
     "Authorization": "Enabled"
    }
   }
  ]
 }
]'

Enter the following command to get the application details:

aws redshift describe-redshift-idc-applications --output json

Keep a note of the IdcManagedApplicationArn, IdcDisplayName, and IdentityNamespace values in the output for the application with IdcDisplayName TIPBlog_AWSIDC. You need these values in the next step.
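
If the account has several Redshift IAM Identity Center applications, picking the right entry by eye is error prone. The following Python sketch selects the entry by display name. It assumes the JSON output lists applications under a top-level `RedshiftIdcApplications` key; the function name is hypothetical.

```python
WANTED_KEYS = ("IdcManagedApplicationArn", "IdcDisplayName", "IdentityNamespace")


def find_idc_application(response: dict, display_name: str) -> dict:
    """Return the three values to note for the application with this display name."""
    # Assumption: applications are listed under "RedshiftIdcApplications".
    for application in response.get("RedshiftIdcApplications", []):
        if application.get("IdcDisplayName") == display_name:
            return {key: application[key] for key in WANTED_KEYS}
    raise LookupError(f"No application with display name {display_name!r}")
```

For this post, you would call `find_idc_application(parsed_output, "TIPBlog_AWSIDC")` on the parsed JSON from the describe command.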

Enable the Redshift Query Editor v2 for the Redshift IAM Identity Center application

Complete the following steps:

  1. On the Amazon Redshift console, choose IAM Identity Center connections in the navigation pane.
  2. Choose the application you created.
  3. Choose Edit.
  4. Select Enable Query Editor v2 application and choose Save changes.
  5. On the Groups tab, choose Add or assign groups.
  6. Assign Group3 to the application.

The Redshift IAM Identity Center connection is now set up.

Enable the Redshift Serverless namespace and workgroup with IAM Identity Center

The CloudFormation stack you deployed created a serverless namespace and workgroup. However, they’re not enabled with IAM Identity Center. To enable them with IAM Identity Center, complete the following steps. You can get the namespace name from the RedshiftNamespace value of the CloudFormation stack output.

  1. On the Amazon Redshift Serverless dashboard console, navigate to the namespace you created.
  2. Choose Query Data to open Query Editor v2.
  3. Choose the options menu (three dots) and choose Create connections for the workgroup redshift-idc-wg-tipblog.
  4. Choose Other ways to connect and then Database user name and password.
  5. Use the credentials you provided for the Redshift admin user name and password parameters when deploying the CloudFormation stack and create the connection.

Create resources using the Redshift Query Editor v2

You now enter a series of commands in the query editor with the database admin user.

  1. Create an IdP for the Redshift IAM Identity Center application:
CREATE IDENTITY PROVIDER "TIPBlog_AWSIDC" TYPE AWSIDC
NAMESPACE 'tipblogawsidc'
APPLICATION_ARN '<Replace with IdcManagedApplicationArn value you copied earlier in Cloud9>'
IAM_ROLE '<Replace with TIPRedshiftRoleArn value from CloudFormation output>';
  1. Enter the following command to check the IdP you added previously:
SELECT * FROM svv_identity_providers;

Next, you grant permissions to the IAM Identity Center user.

  1. Create a role in Redshift. This role should correspond to the group in IAM Identity Center to which you intend to provide the permissions (Group3 in this post). The role should follow the format <namespace>:<GroupNameinIDC>.
Create role "tipblogawsidc:Group3";
  1. Run the following command to see the role you created. The external_id corresponds to the group ID value for Group3 in IAM Identity Center.
Select * from svv_roles where role_name = 'tipblogawsidc:Group3';

  1. Create a sample table to use to verify access for the Group3 user:
CREATE TABLE IF NOT EXISTS revenue
(
account INTEGER ENCODE az64
,customer VARCHAR(20) ENCODE lzo
,salesamt NUMERIC(18,0) ENCODE az64
)
DISTSTYLE AUTO
;

insert into revenue values (10001, 'ABC Company', 12000);
insert into revenue values (10002, 'Tech Logistics', 175400);
  1. Grant access to the user on the schema:
-- Grant usage on schema
grant usage on schema public to role "tipblogawsidc:Group3";
  1. To create a datashare and add the preceding table to the datashare, enter the following statements:
CREATE DATASHARE demo_datashare;
ALTER DATASHARE demo_datashare ADD SCHEMA public;
ALTER DATASHARE demo_datashare ADD TABLE revenue;
  1. Grant usage on the datashare to the account using the Data Catalog:
GRANT USAGE ON DATASHARE demo_datashare TO ACCOUNT '<Replace with CatalogId from CloudFormation output>' via DATA CATALOG;
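
The role name must follow the <namespace>:<GroupNameinIDC> format and, because it contains a colon, must be double-quoted in SQL. If you manage several groups, a small helper can generate the statements consistently. This Python sketch is an illustration under those assumptions; the function names are hypothetical, and it only produces the role and schema grant statements used in the steps above.

```python
def redshift_role_name(namespace: str, group_name: str) -> str:
    """Combine the identity namespace and the IAM Identity Center group name."""
    return f"{namespace}:{group_name}"


def quote_identifier(name: str) -> str:
    """Double-quote an identifier, escaping any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def role_setup_statements(
    namespace: str, group_name: str, schema: str = "public"
) -> list:
    """Return the CREATE ROLE and GRANT USAGE statements for one group."""
    role = quote_identifier(redshift_role_name(namespace, group_name))
    return [
        f"CREATE ROLE {role};",
        f"GRANT USAGE ON SCHEMA {schema} TO ROLE {role};",
    ]
```

Calling `role_setup_statements("tipblogawsidc", "Group3")` yields statements equivalent to the ones you ran in the query editor.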

Authorize the datashare

For this post, we use the AWS CLI to authorize the datashare. You can also do it from the Amazon Redshift console.

Enter the following command in the AWS Cloud9 IDE to describe the datashare you created and note the value of DataShareArn and ConsumerIdentifier to use in subsequent steps:

aws redshift describe-data-shares

Enter the following command in the AWS Cloud9 IDE to authorize the datashare:

aws redshift authorize-data-share --data-share-arn <Replace with DataShareArn value copied from earlier command’s output> --consumer-identifier <Replace with ConsumerIdentifier value copied from earlier command’s output>
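
Instead of copying the two values by hand, you could extract them from the describe output. The following Python sketch assumes the output lists datashares under `DataShares`, each with a `DataShareArn` and a list of `DataShareAssociations` that carry a `ConsumerIdentifier`; the function names are hypothetical.

```python
def datashare_authorizations(response: dict) -> list:
    """Return (DataShareArn, ConsumerIdentifier) pairs found in the output."""
    pairs = []
    for share in response.get("DataShares", []):
        for association in share.get("DataShareAssociations", []):
            pairs.append((share["DataShareArn"], association["ConsumerIdentifier"]))
    return pairs


def authorize_command(data_share_arn: str, consumer_identifier: str) -> list:
    """Build the authorize-data-share command as an argument list."""
    return [
        "aws", "redshift", "authorize-data-share",
        "--data-share-arn", data_share_arn,
        "--consumer-identifier", consumer_identifier,
    ]
```

An argument list like the one returned by `authorize_command` can be passed to `subprocess.run` without any shell quoting.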

Accept the datashare in Lake Formation

Next, accept the datashare in Lake Formation.

  1. On the Lake Formation console, choose Data sharing in the navigation pane.
  2. In the Invitations section, select the datashare invitation that is pending acceptance.
  3. Choose Review invitation and accept the datashare.
  4. Provide a database name (tip-blog-redshift-ds-db), which will be created in the Data Catalog by Lake Formation.
  5. Choose Skip to Review and Create and create the database.

Grant permissions in Lake Formation

Complete the following steps:

  1. On the Lake Formation console, choose Data lake permissions in the navigation pane.
  2. Choose Grant and in the Principals section, choose User3 to grant permissions with the IAM Identity Center-new option. Refer to the Lake Formation access grants steps performed for User1 and User2 if needed.
  3. Choose the database (tip-blog-redshift-ds-db) you created earlier and the table public.revenue, which you created in the Redshift Query Editor v2.
  4. For Table permissions, select Select.
  5. For Data permissions, select Column-based access and select the account and salesamt columns.
  6. Choose Grant.

Mount the AWS Glue database to Amazon Redshift

As the last step in the setup, mount the AWS Glue database to Amazon Redshift. In the Query Editor v2, enter the following statements:

create external schema if not exists tipblog_datashare_idc_schema from DATA CATALOG DATABASE 'tip-blog-redshift-ds-db' catalog_id '<Replace with CatalogId from CloudFormation output>';

grant usage on schema tipblog_datashare_idc_schema to role "tipblogawsidc:Group3";

grant select on all tables in schema tipblog_datashare_idc_schema to role "tipblogawsidc:Group3";

You are now done with the required setup and permissions for User3 on the Redshift table.

Verify access

To verify access, complete the following steps:

  1. Get the AWS access portal URL from the IAM Identity Center Settings section.
  2. Open a different browser and enter the access portal URL.

This will redirect you to your Okta login page.

  1. Sign in, select the account, and choose the tip-blog-qe-v2-permission-set link to open the Query Editor v2.

If you’re using private or incognito mode for testing this, you may need to enable third-party cookies.

  1. Choose the options menu (three dots) and choose Edit connection for the redshift-idc-wg-tipblog workgroup.
  2. Use IAM Identity Center in the pop-up window and choose Continue.

If you get an error with the message “Redshift serverless cluster is auto paused,” switch to the other browser with admin credentials and run any sample query to un-pause the cluster. Then switch back to this browser and continue with the next steps.

  1. Run the following query to access the table:
SELECT * FROM "dev"."tipblog_datashare_idc_schema"."public.revenue";

You can only see the two columns due to the access grants you provided in Lake Formation earlier.

This completes configuring User3 access to the Redshift table.

Set up QuickSight for User3

Let’s now set up QuickSight and verify access for User3. We already granted access to User3 to the Redshift table in earlier steps.

  1. Create a new IAM Identity Center enabled QuickSight account. Refer to Simplify business intelligence identity management with Amazon QuickSight and AWS IAM Identity Center for guidance.
  2. Choose Group3 for the author and reader for this post.
  3. For IAM Role, choose the IAM role matching the RoleQuickSight value from the CloudFormation stack output.

Next, you add a VPC connection to QuickSight to access the Redshift Serverless namespace you created earlier.

  1. On the QuickSight console, manage your VPC connections.
  2. Choose Add VPC connection.
  3. For VPC connection name, enter a name.
  4. For VPC ID, enter the value for VPCId from the CloudFormation stack output.
  5. For Execution role, choose the value for RoleQuickSight from the CloudFormation stack output.
  6. For Security Group IDs, choose the security group for QSSecurityGroup from the CloudFormation stack output.

  1. Wait for the VPC connection to be AVAILABLE.
  2. Enter the following command in AWS Cloud9 to enable QuickSight with Amazon Redshift for trusted identity propagation:
aws quicksight update-identity-propagation-config --aws-account-id "<Replace with CatalogId from CloudFormation output>" --service "REDSHIFT" --authorized-targets "<Replace with IdcManagedApplicationArn value from output of aws redshift describe-redshift-idc-applications --output json which you copied earlier>"

Verify User3 access with QuickSight

Complete the following steps:

  1. Sign in to the QuickSight console as User3 in a different browser.
  2. On the Okta sign-in page, sign in as User3.
  3. Create a new dataset with Amazon Redshift as the data source.
  4. Choose the VPC connection you created above for Connection Type.
  5. Provide the Redshift server (the RedshiftSrverlessWorkgroup value from the CloudFormation stack output), port (5439 in this post), and database name (dev in this post).
  6. Under Authentication method, select Single sign-on.
  7. Choose Validate, then choose Create data source.

If you encounter an issue with validating using single sign-on, switch to Database username and password for Authentication method, validate with any dummy user and password, and then switch back to validate using single sign-on and proceed to the next step. Also check that the Redshift serverless cluster is not auto-paused as mentioned earlier in Redshift access verification.

  1. Choose the schema you created earlier (tipblog_datashare_idc_schema) and the table public.revenue.
  2. Choose Select to create your dataset.

You should now be able to visualize the data in QuickSight. You can see only the account and salesamt columns from the table because of the access permissions you granted earlier with Lake Formation.

This finishes all the steps for setting up trusted identity propagation.

Audit data access

Let’s see how we can audit the data access with the different users.

Access requests are logged to CloudTrail. The IAM Identity Center user ID is logged under the onBehalfOf tag in the CloudTrail event. The following screenshot shows the GetDataAccess event generated by Lake Formation. You can view the CloudTrail event history and filter by event name GetDataAccess to view similar events in your account.

You can see the userId corresponds to User2.

You can run the following commands in AWS Cloud9 to confirm this.

Get the identity store ID:

aws sso-admin describe-instance --instance-arn <Replace with your instance arn value> | jq -r '.IdentityStoreId'

Describe the user in the identity store:

aws identitystore describe-user --identity-store-id <Replace with output of above command> --user-id <User Id from above screenshot>
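
If you export CloudTrail events as JSON, you can also pull the IAM Identity Center user ID out programmatically. The following Python sketch assumes each event record carries the ID at `userIdentity.onBehalfOf.userId`, as described above; the function names are hypothetical.

```python
from collections import Counter


def on_behalf_of_user_id(event: dict):
    """Return the IAM Identity Center user ID on the event, or None if absent."""
    # Assumption: the ID sits at userIdentity.onBehalfOf.userId in the record.
    return event.get("userIdentity", {}).get("onBehalfOf", {}).get("userId")


def count_events_by_user(events: list) -> Counter:
    """Count events per IAM Identity Center user, skipping events without one."""
    return Counter(
        user_id
        for event in events
        if (user_id := on_behalf_of_user_id(event)) is not None
    )
```

Events issued directly by an IAM role, without identity propagation, have no onBehalfOf element and are skipped by `count_events_by_user`.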

One way to query the CloudTrail log events is by using CloudTrail Lake. Set up the event data store (refer to the following instructions) and rerun the queries for User1, User2, and User3. You can query the access events using CloudTrail Lake with the following sample query:

SELECT eventTime,userIdentity.onBehalfOf.userid AS idcUserId,requestParameters as accessInfo, serviceEventDetails
FROM 04d81d04-753f-42e0-a31f-2810659d9c27
WHERE userIdentity.arn IS NOT NULL AND (eventName='BatchGetTable' OR eventName='GetDataAccess' OR eventName='CreateDataSet')
order by eventTime DESC

The following screenshot shows an example of the detailed results with audit explanations.
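
If you generate such queries in code, listing the event names in an IN clause keeps the `userIdentity.arn IS NOT NULL` condition applied to every event name, because in SQL AND binds more tightly than OR. The following Python sketch builds the query text under that assumption; `build_access_query` is a hypothetical helper, and the event data store ID is whatever your own CloudTrail Lake setup uses.

```python
def build_access_query(event_data_store_id: str, event_names: list) -> str:
    """Build a CloudTrail Lake query for the given event names."""
    if not event_names:
        raise ValueError("event_names must not be empty")
    # Escape single quotes so each name is a valid SQL string literal.
    quoted = ", ".join("'" + name.replace("'", "''") + "'" for name in event_names)
    return (
        "SELECT eventTime, userIdentity.onBehalfOf.userid AS idcUserId, "
        "requestParameters AS accessInfo, serviceEventDetails\n"
        f"FROM {event_data_store_id}\n"
        "WHERE userIdentity.arn IS NOT NULL "
        f"AND eventName IN ({quoted})\n"
        "ORDER BY eventTime DESC"
    )
```

For the events in this post, you would pass `["BatchGetTable", "GetDataAccess", "CreateDataSet"]` as the event names.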

Clean up

To avoid incurring further charges, delete the CloudFormation stack. Before you delete the CloudFormation stack, delete all the resources you created using the console or AWS CLI:

  1. Manually delete any EMR Studio Workspaces you created with User2.
  2. Delete the Athena workgroup created as part of the User1 setup.
  3. Delete the QuickSight VPC connection you created.
  4. Delete the Redshift IAM Identity Center connection.
  5. Deregister IAM Identity Center from S3 Access Grants.
  6. Delete the CloudFormation stack.
  7. Manually delete the VPC created by AWS CloudFormation.

Conclusion

In this post, we delved into the trusted identity propagation feature of AWS IAM Identity Center alongside various AWS Analytics services, demonstrating its utility in managing permissions using corporate user or group identities rather than IAM roles. We examined diverse user personas utilizing interactive tools like Athena, EMR Studio notebooks, Redshift Query Editor V2, and QuickSight, all centralized under Lake Formation for streamlined permission management. Additionally, we explored S3 Access Grants for S3 bucket access management, and concluded with insights into auditing through CloudTrail events and CloudTrail Lake for a comprehensive overview of user data access.


About the Author

Shoukat Ghouse is a Senior Big Data Specialist Solutions Architect at AWS. He helps customers around the world build robust, efficient and scalable data platforms on AWS leveraging AWS analytics services like AWS Glue, AWS Lake Formation, Amazon Athena and Amazon EMR.

Build a decentralized semantic search engine on heterogeneous data stores using autonomous agents

Post Syndicated from Dhaval Shah original https://aws.amazon.com/blogs/big-data/build-a-decentralized-semantic-search-engine-on-heterogeneous-data-stores-using-autonomous-agents/

Large language models (LLMs) such as Anthropic Claude and Amazon Titan have the potential to drive automation across various business processes by processing both structured and unstructured data. For example, financial analysts currently have to manually read and summarize lengthy regulatory filings and earnings transcripts in order to respond to Q&A on investment strategies. LLMs could automate the extraction and summarization of key information from these documents, enabling analysts to query the LLM and receive reliable summaries. This would allow analysts to process the documents to develop investment recommendations faster and more efficiently. Anthropic Claude and other LLMs on Amazon Bedrock can bring new levels of automation and insight across many business functions that involve both human expertise and access to knowledge spread across an organization’s databases and content repositories.

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon via a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI.

In this post, we show how to build a Q&A bot with RAG (Retrieval Augmented Generation). RAG uses data sources like Amazon Redshift and Amazon OpenSearch Service to retrieve documents that augment the LLM prompt. To get data from Amazon Redshift, we use Anthropic Claude 2.0 on Amazon Bedrock, summarizing the final response based on predefined prompt template libraries from LangChain. To get data from Amazon OpenSearch Service, we chunk the source data and convert the chunks to vectors using the Amazon Titan Text Embeddings model.

For client interaction we use Agent Tools based on ReAct. A ReAct prompt consists of few-shot task-solving trajectories, with human-written text reasoning traces and actions, as well as environment observations in response to actions. In this example, we use ReAct for zero-shot training to generate responses to fit in a pre-defined template. The additional information is concatenated as context with the original input prompt and fed to the text generator which produces the final output. This makes RAG adaptive for situations where facts could evolve over time.
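
The concatenation step described above can be sketched in a few lines of Python. This is a minimal illustration, not the template used in the notebook: the function name and the prompt wording are assumptions.

```python
def build_augmented_prompt(question: str, context_passages: list) -> str:
    """Concatenate retrieved passages as context ahead of the original question."""
    # Drop empty passages so the context block stays compact.
    cleaned = [passage.strip() for passage in context_passages if passage.strip()]
    context = "\n\n".join(cleaned)
    # The wording of this template is hypothetical.
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Because the context is rebuilt for every question, the generated answer can reflect documents that were added or updated after the model was trained.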

Solution overview

Our solution demonstrates how financial analysts can use generative artificial intelligence (AI) with RAG to adapt their investment recommendations based on financial reports and earnings transcripts, using LLMs to generate factual content.

The hybrid architecture uses multiple databases and LLMs, with foundation models from Amazon Bedrock for data source identification, SQL generation, and text generation with results. In the following architecture, Steps 1 and 2 represent data ingestion to be done by data engineering in batch mode. Steps 3, 4, and 5 are the queries and response formation.

The following diagram shows a more detailed view of the Q&A processing chain. The user asks a question, and LangChain queries the Redshift and OpenSearch Service data stores for relevant information to build the prompt. It sends the prompt to the Anthropic Claude model on Amazon Bedrock and returns the response.

The details of each step are as follows:

  1. Populate the Amazon Redshift Serverless data warehouse with company stock information stored in Amazon Simple Storage Service (Amazon S3). Redshift Serverless is a fully functional data warehouse holding data tables maintained in real time.
  2. Load the unstructured data from your S3 data lake to OpenSearch Service to create an index to store and perform semantic search. The LangChain library loads knowledge base documents, splits the documents into smaller chunks, and uses Amazon Titan to generate embeddings for chunks.
  3. The client submits a question via an interface like a chatbot or website.
  4. The user query passed from the Amazon SageMaker notebook goes through multiple steps that make API calls to LLMs on Amazon Bedrock. LLM-based agents generate SQL from text and then validate whether the query is relevant to the data warehouse tables. If it is, the query is run to extract information. The LangChain library also calls Amazon Titan embeddings to generate a vector for the user’s question, and then calls OpenSearch vector search to get similar documents.
  5. LangChain calls the Anthropic Claude model on Amazon Bedrock with the additional retrieved knowledge as context to generate an answer to the question, and returns the generated content to the client.
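
The chunking and similarity search in steps 2 and 4 can be illustrated with a toy example in plain Python. This sketch is a stand-in, not the notebook's implementation: a word-count vector replaces Amazon Titan embeddings, an in-memory list replaces the OpenSearch Service index, and all function names and default sizes are assumptions.

```python
import math
from collections import Counter


def split_into_chunks(text: str, chunk_size: int = 40, overlap: int = 10) -> list:
    """Split text into overlapping chunks of at most chunk_size words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    # Stop early enough that the last chunk is not just leftover overlap.
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, last_start, step)]


def toy_embedding(text: str) -> Counter:
    """Stand-in for a real embedding model: a bag-of-words count vector."""
    return Counter(word.strip(".,?!").lower() for word in text.split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_chunks(question: str, chunks: list, k: int = 2) -> list:
    """Return the k chunks most similar to the question."""
    query_vector = toy_embedding(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vector, toy_embedding(chunk)),
        reverse=True,
    )
    return ranked[:k]
```

A real embedding model captures meaning rather than exact word overlap, so it can match a question to a passage even when they share few words. The overall flow is the same: embed the question, rank the stored chunks by similarity, and pass the best matches to the LLM as context.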

In this deployment, you use Amazon Redshift Serverless, the Anthropic Claude 2.0 model on Amazon Bedrock, and the Amazon Titan Text Embeddings model. Overall spend for the deployment is directly proportional to the number of input/output tokens for the Amazon Bedrock models, knowledge base volume, usage hours, and so on.

To deploy the solution, you need two datasets: SEC Edgar Annual Financial Filings and Stock pricing data. To join these datasets for analysis, you need to choose Stock Symbol as the join key. The provided AWS CloudFormation template deploys the datasets required for this post, along with the SageMaker notebook.

Prerequisites

To follow along with this post, you should have an AWS account with AWS Identity and Access Management (IAM) user credentials to deploy AWS services.

Deploy the chat application using AWS CloudFormation

To deploy the resources, complete the following steps:

  1. Deploy the following CloudFormation template to create your stack in the us-east-1 AWS Region. The stack will deploy an OpenSearch Service domain, Redshift Serverless endpoint, SageMaker notebook, and other services like VPC and IAM roles that you will use in this post. The template sets a default user name and password for the OpenSearch Service domain, and sets up a Redshift Serverless admin. You can choose to modify them or use the default values.
  2. On the AWS CloudFormation console, navigate to the stack you created.
  3. On the Outputs tab, choose the URL for SageMakerNotebookURL to open the notebook.
  4. In Jupyter, choose semantic-search-with-amazon-opensearch, then blog, then the LLM-Based-Agent folder.
  5. Open the notebook Generative AI with LLM based autonomous agents augmented with structured and unstructured data.ipynb.
  6. Follow the instructions in the notebook and run the code sequentially.

Run the notebook

There are six major sections in the notebook:

  • Prepare the unstructured data in OpenSearch Service – Download the SEC Edgar Annual Financial Filings dataset, convert the company financial filing documents into vectors with the Amazon Titan Text Embeddings model, and store the vectors in an Amazon OpenSearch Service vector database.
  • Prepare the structured data in a Redshift database – Ingest the structured data into your Amazon Redshift Serverless table.
  • Query the unstructured data in OpenSearch Service with a vector search – Create a function to implement semantic search with OpenSearch Service. In OpenSearch Service, match the relevant company financial information to be used as context information for the LLM. This is unstructured data augmentation for the LLM.
  • Query the structured data in Amazon Redshift with SQLDatabaseChain – Use the LangChain library LLM text to SQL to query company stock information stored in Amazon Redshift. The search result will be used as context information to the LLM.
  • Create an LLM-based ReAct agent augmented with data in OpenSearch Service and Amazon Redshift – Use the LangChain library to define a ReAct agent to judge whether the user query is stock- or investment-related. If the query is stock related, the agent will query the structured data in Amazon Redshift to get the stock symbol and stock price to augment context to the LLM. The agent also uses semantic search to retrieve relevant financial information from OpenSearch Service to augment context to the LLM.
  • Use the LLM-based agent to generate a final response based on the template used for zero-shot training – The following is a sample user flow for a stock price recommendation for the query, “Is ABC a good investment choice right now.”

Example questions and responses

In this section, we show three example questions and responses to test our chatbot.

Example 1: Historical data is available

In our first test, we explore how the bot responds to a question when historical data is available. We use the question, “Is [Company Name] a good investment choice right now?” Replace [Company Name] with a company you want to query.

This is a stock-related question. The company stock information is in Amazon Redshift and the financial statement information is in OpenSearch Service. The agent will run the following process:

  1. Determine if this is a stock-related question.
  2. Get the company name.
  3. Get the stock symbol from Amazon Redshift.
  4. Get the stock price from Amazon Redshift.
  5. Use semantic search to get related information from the 10-K financial filing data in OpenSearch Service.

response = zero_shot_agent("\n\nHuman: Is {company name} a good investment choice right now? \n\nAssistant:")

The output may look like the following:

Final Answer: Yes, {company name} appears to be a good investment choice right now based on the stable stock price, continued revenue and earnings growth, and dividend payments. I would recommend investing in {company name} stock at current levels.

You can view the final response from the complete chain in your notebook.

Example 2: Historical data is not available

In this next test, we see how the bot responds to a question when historical data is not available. We ask the question, “Is Amazon a good investment choice right now?”

This is a stock-related question. However, there is no Amazon stock price information in the Redshift table. Therefore, the bot will answer “I cannot provide stock analysis without stock price information.” The agent will run the following process:

  1. Determine if this is a stock-related question.
  2. Get the company name.
  3. Get the stock symbol from Amazon Redshift.
  4. Get the stock price from Amazon Redshift.

response = zero_shot_agent("\n\nHuman: Is Amazon a good investment choice right now? \n\nAssistant:")

The output looks like the following:

Final Answer: I cannot provide stock analysis without stock price information.

Example 3: Unrelated question and historical data is not available

For our third test, we see how the bot responds to an irrelevant question when historical data is not available. This is testing for hallucination. We use the question, “What is SageMaker?”

This is not a stock-related query. The agent will run the following process:

  1. Determine if this is a stock-related question.

response = zero_shot_agent("\n\nHuman: What is SageMaker? \n\nAssistant:")

The output looks like the following:

Final Answer: What is SageMaker? is not a stock related query.
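
The three example flows can be summarized as a small routing function. The sketch below is a simplification under stated assumptions: in the notebook the LLM-based ReAct agent decides whether a query is stock related, whereas here a keyword check stands in for that judgment, and the SYMBOLS and PRICES dictionaries are hypothetical stand-ins for the Amazon Redshift tables.

```python
from typing import Optional

# Hypothetical stand-ins for the Amazon Redshift tables used in the post.
SYMBOLS = {"abc corp": "ABC"}
PRICES = {"ABC": [101.2, 102.8, 103.1]}

# Keyword check standing in for the agent's LLM-based relevance judgment.
STOCK_KEYWORDS = ("stock", "invest", "share price")


def is_stock_related(question: str) -> bool:
    """Step 1: decide whether the question is about stocks or investing."""
    q = question.lower()
    return any(keyword in q for keyword in STOCK_KEYWORDS)


def route(question: str, company: Optional[str] = None) -> str:
    """Follow the agent's decision path and return the outcome as text."""
    if not is_stock_related(question):
        return f"{question} is not a stock related query."
    # Steps 2 to 4: resolve the company to a symbol, then look up prices.
    symbol = SYMBOLS.get((company or "").lower())
    prices = PRICES.get(symbol, [])
    if not prices:
        return "I cannot provide stock analysis without stock price information."
    # Step 5 would run semantic search over filings before calling the LLM.
    return (
        f"Run semantic search on filings for {symbol}, "
        f"then ask the LLM with latest price {prices[-1]}."
    )


print(route("Is ABC Corp a good investment choice right now?", "ABC Corp"))
print(route("Is Amazon a good investment choice right now?", "Amazon"))
print(route("What is SageMaker?"))
```

Returning early when data is missing, rather than letting the model improvise, is what keeps the second and third examples from producing a hallucinated recommendation.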

This was a simple RAG-based ReAct chat agent analyzing the corpus from different data stores. In a realistic scenario, you might choose to further enhance the response with restrictions or guardrails for input and output, such as filtering harsh words for robust input sanitization, output filtering, conversational flow control, and more. You may also want to explore programmable guardrails for LLM-based conversational systems.

Clean up

To clean up your resources, delete the CloudFormation stack llm-based-agent.

Conclusion

In this post, you explored how LLMs play a part in answering user questions. You looked at a scenario for helping financial analysts. You could employ this methodology for other Q&A scenarios, like supporting insurance use cases, by quickly contextualizing claims data or customer interactions. You used a knowledge base of structured and unstructured data in a RAG approach, merging the data to create intelligent chatbots. You also learned how to use autonomous agents to help provide responses that are contextual and relevant to the customer data and limit irrelevant and inaccurate responses.

Leave your feedback and questions in the comments section.


About the Authors

Dhaval Shah is a Principal Solutions Architect with Amazon Web Services based out of New York, where he guides global financial services customers to build highly secure, scalable, reliable, and cost-efficient applications on the cloud. He brings over 20 years of technology experience on Software Development and Architecture, Data Engineering, and IT Management.

Soujanya Konka is a Senior Solutions Architect and Analytics specialist at AWS, focused on helping customers build their ideas on the cloud. Soujanya has expertise in the design and implementation of data platforms. Before joining AWS, Soujanya had stints with companies such as HSBC and Cognizant.

Jon Handler is a Senior Principal Solutions Architect at Amazon Web Services based in Palo Alto, CA. Jon works closely with OpenSearch and Amazon OpenSearch Service, providing help and guidance to a broad range of customers who have search and log analytics workloads that they want to move to the AWS Cloud. Prior to joining AWS, Jon’s career as a software developer included 4 years of coding a large-scale, ecommerce search engine. Jon holds a Bachelor of the Arts from the University of Pennsylvania, and a Master of Science and a PhD in Computer Science and Artificial Intelligence from Northwestern University.

Jianwei Li is a Principal Analytics Specialist TAM at Amazon Web Services. Jianwei provides consulting services to help customers design and build modern data platforms. Jianwei has worked in the big data domain as a software developer, consultant, and tech leader.

Hrishikesh Karambelkar is a Principal Architect for Data and AIML with AWS Professional Services for Asia Pacific and Japan. He proactively engages with customers in the APJ region to enable enterprises in their digital transformation journey on the AWS Cloud in the areas of generative AI, machine learning, data, and analytics. Previously, Hrishikesh authored books on enterprise search and big data, and co-authored research publications in the areas of enterprise search and AI-ML.

Accelerate incident response with Amazon Security Lake

Post Syndicated from Jerry Chen original https://aws.amazon.com/blogs/security/accelerate-incident-response-with-amazon-security-lake/

This blog post is the first of a two-part series that will demonstrate the value of Amazon Security Lake and how you can use it and other resources to accelerate your incident response (IR) capabilities. Security Lake is a purpose-built data lake that centrally stores your security logs in a common, industry-standard format. In part one, we will first demonstrate the value Security Lake can bring at each stage of the National Institute of Standards and Technology (NIST) SP 800-61 Computer Security Incident Handling Guide. We will then demonstrate how you can configure Security Lake in a multi-account deployment by using the AWS Security Reference Architecture (AWS SRA).

In part two of this series, we’ll walk through an example to show you how to use Security Lake and other AWS services and tools to drive an incident to resolution.

At Amazon Web Services (AWS), security is our top priority. When security incidents occur, customers need the right capabilities to quickly investigate and resolve them. Security Lake enhances your capabilities, especially during the detection and analysis stages, which can reduce time to resolution and business impact. We also cover incident response specifically in the security pillar of the AWS Well-Architected Framework, provide prescriptive guidance on preparing for and handling incidents, and publish incident response playbooks.

Incident response life cycle

NIST SP 800-61 describes a set of steps you use to resolve an incident. These include preparation (Stage 1), detection and analysis (Stage 2), containment, eradication and recovery (Stage 3), and finally post-incident activities (Stage 4).

Figure 1 shows the workflow of incident response defined by NIST SP 800-61. The response flows from Stage 1 through Stage 4, with Stages 2 and 3 often being an iterative process. We will discuss the value of Security Lake at each stage of the NIST incident response handling process, with a focus on preparation, detection, and analysis.

Figure 1: NIST 800-61 incident response life cycle. Source: NIST 800-61

Stage 1: Preparation

Preparation helps you ensure that tools, processes, and people are prepared for incident response. In some cases, preparation can also help you identify systems, networks, and applications that might not be sufficiently secure. For example, you might determine you need certain system logs for incident response, but discover during preparation that those logs are not enabled.

Figure 2 shows how Security Lake can accelerate the preparation stage during the incident response process. Through native integration with various security data sources from both AWS services and third-party tools, Security Lake simplifies the integration and concentration of security data, which also facilitates training and rehearsal for incident response.

Figure 2: Amazon Security Lake data consolidation for IR preparation

Some challenges in the preparation stage include the following:

  • Insufficient incident response planning, training, and rehearsal – Time constraints or insufficient resources can slow down preparation.
  • Complexity of system integration and data sources – An increasing number of security data sources and integration points require additional integration effort, or increase risk that some log sources are not integrated.
  • Centralized log repository for mixed environments – Customers with both on-premises and cloud infrastructure told us that consolidating logs for those mixed environments was a challenge.

Security Lake can help you deal with these challenges in the following ways:

  • Simplify system integration with security data normalization
  • Streamline data consolidation across mixed environments
    • Security Lake supports multiple log sources, including AWS native services and custom sources, which include third-party partner solutions, other cloud platforms and your on-premises log sources. For example, see this blog post to learn how to ingest Microsoft Azure activity logs into Security Lake.
  • Facilitate IR planning and testing
    • Security Lake reduces the undifferentiated heavy lifting needed to get security data into tooling so teams spend less time on configuration and data extract, transform, and load (ETL) work and more time on preparedness.
    • With a purpose-built security data lake and data retention policies that you define, security teams can integrate data-driven decision making into their planning and testing, answering questions such as “which incident handling capabilities do we prioritize?” and running Well-Architected game days.

Stages 2 and 3: Detection and Analysis, Containment, Eradication and Recovery

The Detection and Analysis stage (Stage 2) should lead you to understand the immediate cause of the incident and what steps need to be taken to contain it. Once contained, it’s critical to fully eradicate the issue. These steps form Stage 3 of the incident response cycle. You want to ensure that those malicious artifacts or exploits are removed from systems and verify that the impacted service has recovered from the incident.

Figure 3 shows how Security Lake can enable effective detection and analysis. Doing so enables teams to quickly contain, eradicate, and recover from the incident. Security Lake natively integrates with other AWS analytics services, such as Amazon Athena, Amazon QuickSight, and Amazon OpenSearch Service, which makes it easier for your security team to generate insights on the nature of the incident and to take relevant remediation steps.

Figure 3: Amazon Security Lake accelerates IR Detection and Analysis, Containment, Eradication, and Recovery

Common challenges present in stages 2 and 3 include the following:

  • Challenges generating insights from disparate data sources
    • Inability to generate insights from security data means teams are less likely to discover an incident, as opposed to having the breach revealed to them by a third party (such as a threat actor).
    • Breaches disclosed by a threat actor might involve higher costs than incidents discovered by the impacted organizations themselves, because typically the unintended access has progressed for longer and impacted more resources and data than if the impacted organization discovered it sooner.
  • Inconsistency of data visibility and data siloing
    • Security log data silos may slow IR data analysis because it’s challenging to gather and correlate the necessary information to understand the full scope and impact of an incident. This can lead to delays in identifying the root cause, assessing the damage, and taking remediation steps.
    • Data silos might also mean additional permissions management overhead for administrators.
  • Disparate data sources add barriers to adopting new technology, such as AI-driven security analytics tools
    • AI-driven security analysis requires a large amount of security data from various data sources, which might be in disparate formats. Without a centralized security data repository, you might need to make additional effort to ingest and normalize data for model training.

Security Lake offers native support for log ingestion for a range of AWS security services, including AWS CloudTrail, AWS Security Hub, and VPC Flow Logs. Additionally, you can configure Security Lake to ingest external sources. This helps enrich findings and alerts.

Security Lake addresses the preceding challenges as follows:

  • Unleash security detection capability by centralizing detection data
    • With a purpose-built security data lake with a standard object schema, organizations can centrally access their security data—AWS and third-party—using the same set of IR tools. This can help you investigate incidents that involve multiple resources and complex timelines, which could require access logs, network logs, and other security findings. For example, use Amazon Athena to query all your security data. You can also build a centralized security finding dashboard with Amazon QuickSight.
  • Reduce management burden
    • With Security Lake, permissions complexity is reduced. You use the same access controls in AWS Identity and Access Management (IAM) to make sure that only the right people and systems have access to sensitive security data.
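
As a hedged illustration of querying Security Lake data with Amazon Athena, the sketch below builds the request for the boto3 Athena start_query_execution call. The table name and the S3 output location are hypothetical placeholders, and the query deliberately selects all columns because the available column names depend on the source version you enabled.

```python
def build_athena_request(
    table: str,
    output_location: str,
    database: str = "default",
    limit: int = 10,
) -> dict:
    """Build keyword arguments for Athena's start_query_execution call."""
    # Allow only letters, digits, and underscores in the table name.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected characters in table name: {table!r}")
    query = f'SELECT * FROM "{database}"."{table}" LIMIT {int(limit)}'
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request: dict) -> str:
    """Submit the query and return its execution ID (needs AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3 installed

    athena = boto3.client("athena")
    return athena.start_query_execution(**request)["QueryExecutionId"]


# Hypothetical table name and results bucket, for illustration only.
request = build_athena_request(
    table="amazon_security_lake_table_example",
    output_location="s3://amzn-s3-demo-bucket/athena-results/",
)
print(request["QueryString"])
```

Keeping the request builder separate from the call that submits it makes the query easy to review before anything runs against your data.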

See this blog post for more details on generating machine learning insights for Security Lake data by using Amazon SageMaker.

Stage 4: Post-Incident Activity

Continuous improvement helps customers to further develop their IR capabilities. Teams should integrate lessons learned into their tools, policies, and processes. You decide on lifecycle policies for your security data. You can then retroactively review event data for insight and to support lessons learned. You can also share security telemetry at levels of granularity you define. Your organization can then establish distributed data views for forensic and other purposes, while enforcing least privilege for data governance.

Figure 4 shows how Security Lake can accelerate the post-incident activity stage during the incident response process. Security Lake natively integrates with AWS Organizations to enable data sharing across various OUs within the organization, which further unleashes the power of machine learning to automatically create insights for incident response.

Figure 4: Security Lake accelerates post-incident activity

Having covered some advantages of working with your data in Security Lake, we will now demonstrate best practices for getting Security Lake set up.

Setting up for success with Security Lake

Most of the customers we work with run multiple AWS accounts, usually with AWS Organizations. With that in mind, we’re going to show you how to set up Security Lake and related tooling in line with guidance in the AWS Security Reference Architecture (AWS SRA). The AWS SRA provides guidance on how to deploy AWS security services in a multi-account environment. You will have one AWS account for security tooling and a different account to centralize log storage. You’ll run Security Lake in this log storage account.

If you just want to use Security Lake in a standalone account, follow these instructions.

Set up Security Lake in your logging account

Most of the instructions we link to in this section describe the process using either the console or AWS CLI tools. Where necessary, we’ve described the console experience for illustrative purposes.

The AmazonSecurityLakeAdministrator AWS managed IAM policy grants the permissions needed to set up Security Lake and related services. Note that you may want to further refine permissions, or remove that managed policy after Security Lake and the related services are set up and running.

To set up Security Lake in your logging account

  1. Note down the AWS account number that will be your delegated administrator account. This will be your centralized archive logs account. In the AWS Management Console, sign in to your Organizations management account and set up delegated administration for Security Lake.
  2. Sign in to the delegated administrator account, go to the Security Lake console, and choose Get started. Then follow these instructions from the Security Lake User Guide. While you’re setting this up, note the following specific guidance (this will make it easier to follow the second blog post in this series):

    Define source objective: For Sources to ingest, we recommend that you select Ingest the AWS default sources. However, if you want to include S3 data events, you’ll need to select Ingest specific AWS sources and then select CloudTrail – S3 data events. Note that we use these events for responding to the incident in blog post part 2, when we really drill down into user activity.

    Figure 5 shows the configuration of sources to ingest in Security Lake.

    Figure 5: Sources to ingest in Security Lake

    We recommend leaving the other settings on this page as they are.

    Define target objective: We recommend that you choose Add rollup Region and add multiple AWS Regions to a designated rollup Region. The rollup Region is the one to which you will consolidate logs. The contributing Region is the one that will contribute logs to the rollup Region.

    Figure 6 shows how to select the rollup Regions.

    Figure 6: Select rollup Regions

You now have Security Lake enabled, and in the background, additional services such as AWS Lake Formation and AWS Glue have been configured to organize your Security Lake data.

Now you need to configure a subscriber with query access so that you can query your Security Lake data. Here are a few recommendations:

  1. Subscribers are specific to a Region, so you want to make sure that you set up your subscriber in the same Region as your rollup Region.
  2. You will also set up an External ID. This is a value you define, and it’s used by the IAM role to prevent the confused deputy problem. Note that the subscriber will be your security tooling account.
  3. You will select Lake Formation for Data access, which will create shares in AWS Resource Access Manager (AWS RAM) that will be shared with the account that you specified in Subscriber credentials.
  4. If you’ve already set up Security Lake at some time in the past, you should select Specific log and event sources and confirm the source and version you want the subscriber to access. If it’s a new implementation, we recommend using version 2.0 or greater.
  5. There’s a note in the console that says the subscribing account will need to accept the RAM resource shares. However, if you’re using AWS Organizations, you don’t need to do that; the resource share will already list a status of Active when you select the Shared with me >> Resource shares in the subscriber (security tooling) account RAM console.
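
To show how an External ID guards against the confused deputy problem, here is a sketch of the general shape of an IAM role trust policy that requires one. It illustrates the pattern only and is not the exact role that Security Lake creates for a subscriber. The account ID and External ID values are placeholders.

```python
import json


def build_trust_policy(trusted_account_id: str, external_id: str) -> dict:
    """Trust policy that lets one account assume a role only with the External ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The caller must present this value when assuming the role.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


# Placeholder account ID and External ID, for illustration only.
policy = build_trust_policy("111122223333", "example-external-id")
print(json.dumps(policy, indent=2))
```

A caller that assumes the role without supplying the matching External ID fails the condition, so a third party cannot be tricked into using the role on someone else's behalf.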

Note: If you prefer a visual guide, you can refer to this video to set up Security Lake in AWS Organizations.

Set up Amazon Athena and AWS Lake Formation in the security tooling account

If you go to Athena in your security tooling account, you won’t see your Security Lake tables yet because the tables are shared from the Security Lake account. Although services such as Amazon Athena can’t directly access databases or tables across accounts, the use of resource links overcomes this challenge.

To set up Athena and Lake Formation

  1. Go to the Lake Formation console in the security tooling account and follow the instructions to create resource links for the shared Security Lake tables. You’ll most likely use the Default database and will see your tables there. The table names in that database start with amazon_security_lake_table. You should expect to see about eight tables there.

    Figure 7 shows the shared tables in the Lake Formation service console.

    Figure 7: Shared tables in Lake Formation

    You will need to create resource links for each table, as described in the instructions from the Lake Formation Developer Guide.

    Figure 8 shows the resource link creation process.

    Figure 8: Creating resource links

  2. Next, go to Amazon Athena in the same Region. If Athena is not set up, follow the instructions to get it set up for SQL queries. Note that you won’t need to create a database—you’re going to use the “default” database that already exists. Select it from the Database drop-down menu in the Query editor view.
  3. In the Tables section, you should see all your Security Lake tables (represented by whatever names you gave them when you created the resource links in step 1, earlier).
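
If you prefer to script the resource link creation from step 1 instead of using the console, one option is the AWS Glue create_table call with a TargetTable that points at the shared table. The following is a sketch under assumptions: the link name, the owning account ID, and the shared database and table names are hypothetical placeholders that you would replace with the values shown in your Lake Formation console.

```python
def build_resource_link(
    link_name: str,
    owner_account_id: str,
    shared_database: str,
    shared_table: str,
    local_database: str = "default",
) -> dict:
    """Build keyword arguments for glue.create_table that define a resource link."""
    return {
        "DatabaseName": local_database,
        "TableInput": {
            "Name": link_name,
            # TargetTable points the local link at the table shared from
            # the Security Lake (log archive) account.
            "TargetTable": {
                "CatalogId": owner_account_id,
                "DatabaseName": shared_database,
                "Name": shared_table,
            },
        },
    }


def create_resource_link(request: dict) -> None:
    """Create the link in the current account (needs AWS credentials)."""
    import boto3  # imported lazily so the builder works without boto3 installed

    boto3.client("glue").create_table(**request)


# Hypothetical names, for illustration only.
link = build_resource_link(
    link_name="security_lake_example_link",
    owner_account_id="111122223333",
    shared_database="shared_security_lake_db",
    shared_table="amazon_security_lake_table_example",
)
print(link["TableInput"]["Name"])
```

You would call the builder once per shared table, which matches the one-link-per-table requirement described in step 1.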

Get your incident response playbooks ready

Incident response playbooks are an important tool that enables responders to work more effectively and consistently, and enables the organization to get incidents resolved more quickly. We’ve created some ready-to-go templates to get you started. You can further customize these templates to meet your needs. In part two of this post, you’ll be using the Unintended Data Access to an Amazon Simple Storage Service (Amazon S3) bucket playbook to resolve an incident. You can download that playbook so that you’re ready to follow it to get that incident resolved.

Conclusion

This is the first post in a two-part series about accelerating security incident response with Security Lake. We highlighted common challenges that slow down customers’ incident response across the stages outlined by NIST SP 800-61 and how Security Lake can help you address those challenges. We also showed you how to set up Security Lake and related services for incident response.

In the second part of this series, we’ll walk through a specific security incident—unintended data access—and share prescriptive guidance on using Security Lake to accelerate your incident response process.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Jerry Chen

Jerry is currently a Senior Cloud Optimization Success Solutions Architect at AWS. He focuses on cloud security and operational architecture design for AWS customers and partners. You can follow Jerry on LinkedIn.

Frank Phillis

Frank is a Senior Solutions Architect (Security) at AWS. He enables customers to get their security architecture right. Frank specializes in cryptography, identity, and incident response. He’s the creator of the popular AWS Incident Response playbooks and regularly speaks at security events. When not thinking about tech, Frank can be found with his family, riding bikes, or making music.

Build Spark Structured Streaming applications with the open source connector for Amazon Kinesis Data Streams

Post Syndicated from Idan Maizlits original https://aws.amazon.com/blogs/big-data/build-spark-structured-streaming-applications-with-the-open-source-connector-for-amazon-kinesis-data-streams/

Apache Spark is a powerful big data engine used for large-scale data analytics. Its in-memory computing makes it great for iterative algorithms and interactive queries. You can use Apache Spark to process streaming data from a variety of streaming sources, including Amazon Kinesis Data Streams for use cases like clickstream analysis, fraud detection, and more. Kinesis Data Streams is a serverless streaming data service that makes it straightforward to capture, process, and store data streams at any scale.

With the new open source Amazon Kinesis Data Streams Connector for Spark Structured Streaming, you can use the newer Spark Data Sources API. It also supports enhanced fan-out for dedicated read throughput and faster stream processing. In this post, we deep dive into the internal details of the connector and show you how to use it to consume and produce records from and to Kinesis Data Streams using Amazon EMR.

Introducing the Kinesis Data Streams connector for Spark Structured Streaming

The Kinesis Data Streams connector for Spark Structured Streaming is an open source connector that supports both provisioned and On-Demand capacity modes offered by Kinesis Data Streams. The connector is built using the latest Spark Data Sources API V2, which uses Spark optimizations. Starting with Amazon EMR 7.1, the connector comes pre-packaged on Amazon EMR on Amazon EKS, Amazon EMR on Amazon EC2, and Amazon EMR Serverless, so you don’t need to build or download any packages. For using it with other Apache Spark platforms, the connector is available as a public JAR file that can be directly referred to while submitting a Spark Structured Streaming job. Additionally, you can download and build the connector from the GitHub repo.

Kinesis Data Streams supports two types of consumers: shared throughput and dedicated throughput. With shared throughput, 2 MB/s of read throughput per shard is shared across consumers. With dedicated throughput, also known as enhanced fan-out, 2 MB/s of read throughput per shard is dedicated to each consumer. This new connector supports both consumer types out of the box without any additional coding, providing you the flexibility to consume records from your streams based on your requirements. By default, this connector uses a shared throughput consumer, but you can configure it to use enhanced fan-out in the configuration properties.
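
The difference between the two consumer types is easiest to see with a little arithmetic. The sketch below uses the per-shard read figure quoted above and assumes, for simplicity, that shared-throughput consumers split the capacity evenly. The function name is hypothetical.

```python
# Per-shard read throughput figure quoted in the text above.
READ_THROUGHPUT_PER_SHARD = 2.0


def per_consumer_throughput(
    shards: int, consumers: int, enhanced_fan_out: bool
) -> float:
    """Read throughput available to each consumer across the whole stream."""
    if shards < 1 or consumers < 1:
        raise ValueError("shards and consumers must both be at least 1")
    stream_total = shards * READ_THROUGHPUT_PER_SHARD
    # Enhanced fan-out gives every consumer the full figure; shared
    # throughput divides it among the consumers (assumed evenly here).
    return stream_total if enhanced_fan_out else stream_total / consumers


shared = per_consumer_throughput(shards=4, consumers=4, enhanced_fan_out=False)
dedicated = per_consumer_throughput(shards=4, consumers=4, enhanced_fan_out=True)
print(shared, dedicated)
```

With four shards and four consumers, each shared-throughput consumer gets a quarter of the stream's read capacity, while each enhanced fan-out consumer gets all of it.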

You can also use the connector as a sink connector to produce records to a Kinesis data stream. The configuration parameters for using the connector as a source and sink differ—for more information, see Kinesis Source Configuration. The connector also supports multiple storage options, including Amazon DynamoDB, Amazon Simple Storage Service (Amazon S3), and HDFS, to store checkpoints and provide continuity.

For scenarios where a Kinesis data stream is deployed in an AWS producer account and the Spark Structured Streaming application is in a different AWS consumer account, you can use the connector to do cross-account processing. This requires additional AWS Identity and Access Management (IAM) trust policies to allow the Spark Structured Streaming application in the consumer account to assume the role in the producer account.

You should also consider reviewing the security configuration with your security teams based on your data security requirements.

How the connector works

Consuming records from Kinesis Data Streams using the connector involves multiple steps. The following architecture diagram shows the internal details of how the connector works. A Spark Structured Streaming application consumes records from a Kinesis data stream source and produces records to another Kinesis data stream.

A Kinesis data stream is composed of a set of shards. A shard is a uniquely identified sequence of data records in a stream and provides a fixed unit of capacity. The total capacity of the stream is the sum of the capacity of all of its shards.

A Spark application consists of a driver and a set of executor processes. The Spark driver acts as a coordinator, and the tasks running in executors are responsible for producing and consuming records to and from shards.

The solution workflow includes the following steps:

  1. Internally, by default, Structured Streaming queries are processed using a micro-batch processing engine, which processes data streams as a series of small batch jobs. At the beginning of a micro-batch run, the driver uses the Kinesis Data Streams ListShards API to determine the latest description of all available shards. The connector exposes a parameter (kinesis.describeShardInterval) to configure the interval between two successive ListShards API calls.
  2. The driver then determines the starting position in each shard. If the application is a new job, the starting position of each shard is determined by kinesis.startingPosition. If it’s a restart of an existing job, the starting position is read from the last record’s metadata checkpoint in storage (for this post, DynamoDB), and kinesis.startingPosition is ignored.
  3. Each shard is mapped to one task in an executor, which is responsible for reading data. The Spark application automatically creates an equal number of tasks based on the number of shards and distributes them across the executors.
  4. The tasks in an executor use either polling mode (shared) or push mode (enhanced fan-out) to get data records from the starting position for a shard.
  5. Spark tasks running in the executors write the processed data to the data sink. In this architecture, we use the Kinesis Data Streams sink to illustrate how the connector writes back to the stream. Executors can write to more than one Kinesis Data Streams output shard.
  6. At the end of each task, the corresponding executor process saves the metadata (checkpoint) about the last record read for each shard in the offset storage (for this post, DynamoDB). This information is used by the driver in the construction of the next micro-batch.
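Steps 2 and 3 can be sketched in plain Python. This is not the connector's code: the function name, the shape of the checkpoint dictionary, and the choice to resume with AFTER_SEQUENCE_NUMBER are assumptions made only to illustrate that a checkpoint takes precedence over the configured starting position and that each shard gets exactly one read plan (one task).

```python
def resolve_starting_positions(shard_ids, checkpoints, starting_position):
    """Pick where the task for each shard should start reading.

    checkpoints maps shard ID -> sequence number of the last record read
    (hypothetical shape); it is empty for a brand-new job.
    """
    plan = {}
    for shard_id in shard_ids:
        if shard_id in checkpoints:
            # Restarted job: resume after the checkpointed record and
            # ignore the configured starting position.
            plan[shard_id] = ("AFTER_SEQUENCE_NUMBER", checkpoints[shard_id])
        else:
            # No checkpoint for this shard: use the configured position.
            plan[shard_id] = (starting_position, None)
    return plan


print(resolve_starting_positions(
    ["shardId-000", "shardId-001"], {"shardId-000": "4959"}, "LATEST"
))
```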

Solution overview

The following diagram shows an example architecture of how to use the connector to read from one Kinesis data stream and write to another.

In this architecture, we use the Amazon Kinesis Data Generator (KDG) to generate sample streaming data (random events per country) to a Kinesis Data Streams source. We start an interactive Spark Structured Streaming session and consume data from the Kinesis data stream, and then write to another Kinesis data stream.

We use Spark Structured Streaming to count events per micro-batch window. These events for each country are being consumed from Kinesis Data Streams. After the count, we can see the results.
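To make the per-micro-batch count concrete, here is a minimal plain-Python sketch of the same idea, outside Spark. It assumes each record is a JSON string whose data field holds the country, as in the KDG template used in this post; the function name and sample records are hypothetical.

```python
import json
from collections import Counter


def count_events_per_country(micro_batch):
    """Count events per country for one micro-batch of JSON-encoded records."""
    countries = (json.loads(record)["data"] for record in micro_batch)
    return Counter(countries)


batch = [
    '{"id": 1, "data": "Spain", "date": "2024-06-01 10:00:00"}',
    '{"id": 2, "data": "France", "date": "2024-06-01 10:00:01"}',
    '{"id": 3, "data": "Spain", "date": "2024-06-01 10:00:02"}',
]
print(count_events_per_country(batch))
```

In PySpark, one way to express a comparable grouping is to call groupBy("data").count() on each batch DataFrame inside a function passed to foreachBatch; treat that as a suggestion rather than the exact code used for this post.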

Prerequisites

To get started, follow the instructions in the GitHub repo. You need the following prerequisites:

After you deploy the solution using the AWS CDK, you will have the following resources:

  • An EMR cluster with the Kinesis Spark connector installed
  • A Kinesis Data Streams source
  • A Kinesis Data Streams sink

Create your Spark Structured Streaming application

After the deployment is complete, you can access the EMR primary node to start a Spark application and write your Spark Structured Streaming logic.

As we mentioned earlier, you use the new open source Kinesis Spark connector to consume data from Amazon EMR. You can find the connector code on the GitHub repo along with examples on how to build and set up the connector in Spark.

In this post, we use Amazon EMR 7.1, where the connector is natively available. If you’re using a release earlier than Amazon EMR 7.1, you can install the connector by running the following commands:

cd /usr/lib/spark/jars 
sudo wget https://awslabs-code-us-east-1.s3.amazonaws.com/spark-sql-kinesis-connector/spark-streaming-sql-kinesis-connector_2.12-1.2.1.jar
sudo chmod 755 spark-streaming-sql-kinesis-connector_2.12-1.2.1.jar

Complete the following steps:

  1. On the Amazon EMR console, navigate to the emr-spark-kinesis cluster.
  2. On the Instances tab, select the primary instance and choose the Amazon Elastic Compute Cloud (Amazon EC2) instance ID.

You’re redirected to the Amazon EC2 console.

  1. On the Amazon EC2 console, select the primary instance and choose Connect.
  2. Use Session Manager, a capability of AWS Systems Manager, to connect to the instance.
  3. Because the user that is used to connect is the ssm-user, we need to switch to the Hadoop user:
    sudo su hadoop

  4. Start a Spark shell either using Scala or Python to interactively build a Spark Structured Streaming application to consume data from a Kinesis data stream.

For this post, we use Python for writing to a stream using a PySpark shell in Amazon EMR.

  1. Start the PySpark shell by entering the command pyspark.

Because you already have the connector installed in the EMR cluster, you can now create the Kinesis source.

  1. Create the Kinesis source with the following code:
    kinesis = spark.readStream.format("aws-kinesis") \
        .option("kinesis.region", "<aws-region>") \
        .option("kinesis.streamName", "kinesis-source") \
        .option("kinesis.consumerType", "GetRecords") \
        .option("kinesis.endpointUrl", "https://kinesis.<aws-region>.amazonaws.com") \
        .option("kinesis.startingposition", "LATEST") \
        .load()

For creating the Kinesis source, the following parameters are required:

  • Name of the connector – We use the connector name aws-kinesis
  • kinesis.region – The AWS Region of the Kinesis data stream you are consuming
  • kinesis.streamName – The name of the Kinesis data stream to read from (kinesis-source in this example)
  • kinesis.consumerType – Use GetRecords (standard consumer) or SubscribeToShard (enhanced fan-out consumer)
  • kinesis.endpointUrl – The Regional Kinesis endpoint (for more details, see Service endpoints)
  • kinesis.startingposition – Choose LATEST, TRIM_HORIZON, or AT_TIMESTAMP (refer to ShardIteratorType)

For using an enhanced fan-out consumer, additional parameters are needed, such as the consumer name. The additional configuration can be found in the connector’s GitHub repo.

kinesis_efo = spark \
.readStream \
.format("aws-kinesis") \
.option("kinesis.region", "<aws-region>") \
.option("kinesis.streamName", "kinesis-source") \
.option("kinesis.consumerType", "SubscribeToShard") \
.option("kinesis.consumerName", "efo-consumer") \
.option("kinesis.endpointUrl", "https://kinesis.<aws-region>.amazonaws.com") \
.option("kinesis.startingposition", "LATEST") \
.load()

Deploy the Kinesis Data Generator

Complete the following steps to deploy the KDG and start generating data:

  1. Choose Launch Stack:

You might need to change your Region when deploying. Make sure that the KDG is launched in the same Region as where you deployed the solution.

  1. For the parameters Username and Password, enter the values of your choice. Note these values to use later when you log in to the KDG.
  2. When the template has finished deploying, go to the Outputs tab of the stack and locate the KDG URL.
  3. Log in to the KDG, using the credentials you set when launching the CloudFormation template.
  4. Specify your Region and data stream name, and use the following template to generate test data:
    {
        "id": {{random.number(100)}},
        "data": "{{random.arrayElement(
            ["Spain","Portugal","Finland","France"]
        )}}",
        "date": "{{date.now("YYYY-MM-DD hh:mm:ss")}}"
    }

  5. Return to Systems Manager to continue working with the Spark application.
  6. To be able to apply transformations based on the fields of the events, you first need to define the schema for the events:
    from pyspark.sql.types import *
    
    pythonSchema = StructType() \
     .add("id", LongType()) \
     .add("data", StringType()) \
     .add("date", TimestampType())

  7. Run the following command to consume data from Kinesis Data Streams:
    from pyspark.sql.functions import *
    
    events = kinesis \
      .selectExpr("cast (data as STRING) jsonData") \
      .select(from_json("jsonData", pythonSchema).alias("events")) \
      .select("events.*")

  8. Use the following code for the Kinesis Spark connector sink:
    events \
        .selectExpr("CAST(id AS STRING) as partitionKey","data","date") \
        .writeStream \
        .format("aws-kinesis") \
        .option("kinesis.region", "<aws-region>") \
        .outputMode("append") \
        .option("kinesis.streamName", "kinesis-sink") \
        .option("kinesis.endpointUrl", "https://kinesis.<aws-region>.amazonaws.com") \
        .option("checkpointLocation", "/kinesisCheckpoint") \
        .start() \
        .awaitTermination()

You can view the data in the Kinesis Data Streams console.

  1. On the Kinesis Data Streams console, navigate to kinesis-sink.
  2. On the Data viewer tab, choose a shard and a starting position (for this post, we use Latest) and choose Get records.

You can see the data sent, as shown in the following screenshot. Kinesis Data Streams uses base64 encoding by default, so you might see text with unreadable characters.
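If you retrieve records programmatically and the payload arrives base64-encoded, you can decode it back to readable text. The following is a minimal sketch using only the Python standard library; the function name and the sample payload are hypothetical, and it assumes the record data is UTF-8 text.

```python
import base64


def decode_record(data_b64):
    """Decode a base64-encoded Kinesis record payload into text."""
    return base64.b64decode(data_b64).decode("utf-8")


# Hypothetical payload, encoded here only so the example is self-contained.
sample = base64.b64encode(b'{"id": 42, "data": "Spain"}').decode("ascii")
print(decode_record(sample))
```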

Clean up

Delete the following CloudFormation stacks created during this deployment to delete all the provisioned resources:

  • EmrSparkKinesisStack
  • Kinesis-Data-Generator-Cognito-User-SparkEFO-Blog

If you created any additional resources during this deployment, delete them manually.

Conclusion

In this post, we discussed the open source Kinesis Data Streams connector for Spark Structured Streaming. It supports the newer Data Sources API V2 and Spark Structured Streaming for building streaming applications. The connector also enables high-throughput consumption from Kinesis Data Streams with enhanced fan-out by providing dedicated throughput of up to 2 MB/s per shard per consumer. With this connector, you can build high-throughput streaming applications with Spark Structured Streaming.

The Kinesis Spark connector is open source under the Apache 2.0 license on GitHub. To get started, visit the GitHub repo.


About the Authors


Idan Maizlits is a Senior Product Manager on the Amazon Kinesis Data Streams team at Amazon Web Services. Idan loves engaging with customers to learn about their challenges with real-time data and to help them achieve their business goals. Outside of work, he enjoys spending time with his family exploring the outdoors and cooking.


Subham Rakshit is a Streaming Specialist Solutions Architect for Analytics at AWS based in the UK. He works with customers to design and build search and streaming data platforms that help them achieve their business objectives. Outside of work, he enjoys spending time solving jigsaw puzzles with his daughter.

Francisco Morillo is a Streaming Solutions Architect at AWS. Francisco works with AWS customers helping them design real-time analytics architectures using AWS services, supporting Amazon MSK and AWS’s managed offering for Apache Flink.

Umesh Chaudhari is a Streaming Solutions Architect at AWS. He works with customers to design and build real-time data processing systems. He has extensive working experience in software engineering, including architecting, designing, and developing data analytics systems. Outside of work, he enjoys traveling, reading, and watching movies.

How to set up SAML federation in Amazon Cognito using IdP-initiated single sign-on, request signing, and encrypted assertions

Post Syndicated from Vishal Jakharia original https://aws.amazon.com/blogs/security/how-to-set-up-saml-federation-in-amazon-cognito-using-idp-initiated-single-sign-on-request-signing-and-encrypted-assertions/

When an identity provider (IdP) serves multiple service providers (SPs), IdP-initiated single sign-on provides a consistent sign-in experience that allows users to start the authentication process from one centralized portal or dashboard. It helps administrators have more control over the authentication process and simplifies the management.

However, when you support IdP-initiated authentication, the SP (Amazon Cognito in this case) can’t verify that it has solicited the SAML response that it receives from IdP because there is no SAML request initiated from the SP. To accept unsolicited SAML assertions in your user pool, you must consider its effect on your app security. Although your user pool can’t verify an IdP-initiated sign-in session, Amazon Cognito validates your request parameters and SAML assertions.

Amazon Cognito has recently enhanced its support for the SAML 2.0 protocol by adding support for IdP-initiated single sign-on (SSO), SAML request signing, and encrypted SAML responses.

Amazon Cognito acts as the SP representing your application and generates a token after federation that can be used by the application to access protected backends. The SAML provider acts as an IdP, where the user identities and credentials are stored, and is responsible for authenticating the user.

This post describes the steps to integrate a SAML IdP, Microsoft Entra ID, with an Amazon Cognito user pool and use SAML IdP-initiated SSO flow. It also describes steps to enable signing authentication requests and accepting encrypted SAML responses.

IdP-initiated authentication flow using SAML federation

Figure 1: High-level diagram for SAML IdP-initiated authentication flow in a web or mobile app

As shown in Figure 1, the high-level flow diagram of an application with federated authentication typically involves the following steps:

  1. An enterprise user opens their SSO portal and signs in. This usually opens a portal with several applications that the user has access to. When the user selects an Amazon Cognito protected application from their SSO portal, an IdP-initiated SSO flow is initiated.
  2. When the user launches an application from the SSO portal, Entra ID sends a SAML assertion to the Cognito endpoint to federate the user.
  3. Amazon Cognito validates the SAML assertion and creates the user in Cognito if this is the first time the user has federated, or updates the user’s record if the user has signed in before from this IdP. Cognito then generates an authorization code and redirects the user to the application URL with this authorization code. The application exchanges the authorization code for tokens from the Cognito token endpoint.
  4. After the application has tokens, it uses them to authorize access within the application stack as needed.
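The code-for-token exchange in step 3 is a form-encoded POST to the /oauth2/token endpoint of the user pool domain. The following sketch builds that request without sending it; it assumes a public app client (no client secret, so no Authorization header), and the function name and sample values are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(domain, client_id, code, redirect_uri):
    """Build (but do not send) the POST that exchanges an authorization code for tokens."""
    body = urlencode({
        "grant_type": "authorization_code",
        "client_id": client_id,
        "code": code,
        # Must match the callback URL the authorization code was issued for.
        "redirect_uri": redirect_uri,
    }).encode("ascii")
    return Request(
        url=f"https://{domain}/oauth2/token",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


request = build_token_request(
    "example-corp-prd.auth.us-east-1.amazoncognito.com",
    "abcd1234567",
    "sample-authorization-code",
    "https://example.com",
)
print(request.full_url)
```

Sending the request (for example, with urllib.request.urlopen) should return a JSON document containing the tokens; in a real application, perform this exchange on a trusted backend or with an OAuth library rather than hand-rolled code.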

The SAML response contains claims or assertions that contain user-specific data. The SAML response is transferred over HTTPS to protect the confidentiality of the data in transit, but you can also enable encryption to further protect the confidentiality of transferred user information. With encryption enabled, only trusted parties who have the decryption key can decrypt the data, which protects its confidentiality even after it’s received by the SP.

Setting up SAML federation between Amazon Cognito and Entra ID

To set up SAML federation and use IdP-initiated SSO, you will complete the following steps:

  1. Create an Amazon Cognito user pool.
  2. Create an app client in the Cognito user pool.
  3. Add Cognito as an enterprise application in Entra ID.
  4. Add Entra ID as the SAML IdP and enable IdP-initiated SSO in Cognito.
  5. Add the newly created SAML IdP to your user pool app client.
  6. Enable encrypting the SAML response.
  7. Add RelayState in Entra ID SAML SSO.

Prerequisites

To implement the solution, you must have the necessary permissions to perform these tasks in the Azure portal and in your AWS account.

Step 1: Create an Amazon Cognito user pool

Create a new user pool in Amazon Cognito with the default settings. Make a note of the user pool ID, for example, us-east-1_abcd1234. You will need this value for the next steps.

Add a domain name to the user pool

The Cognito user pool’s hosted UI can be used as the OAuth 2.0 authorization server with a customizable web interface for sign-up and sign-in. Cognito OAuth 2.0 endpoints are accessible from a domain name that must be added to the user pool. There are two options for adding a domain name to a user pool. You can either use a Cognito domain or a domain name that you own. This solution uses a Cognito domain, which will look like the following:

https://<yourDomainPrefix>.auth.<aws-region>.amazoncognito.com

To add a domain name to a user pool:

  1. In the AWS Management Console for Amazon Cognito, navigate to the App integration tab for your user pool.
  2. On the right side of the pane, choose Actions and select Create Cognito domain.

    Figure 2: Create a Cognito domain

  3. Enter an available domain prefix (for example, example-corp-prd) to use with the Cognito domain.

    Figure 3: Add a domain prefix

  4. Choose Create Cognito domain.

Step 2: Create an app client in the Cognito user pool

Before you can use Amazon Cognito in your web application, you must register your app with Amazon Cognito as an app client. You can’t enable the IdP-initiated SAML flow on an app client that also uses SP-initiated SAML IdPs or social IdPs. IdP-initiated SAML introduces additional risks that other SSO providers aren’t subject to. For example, it’s not possible to add a state parameter, which is usually used for cross-site request forgery (CSRF) mitigation. Because of this, you can’t add IdPs that aren’t SAML, including the user pool itself, to an app client that uses a SAML provider with IdP-initiated SSO.

To create an app client:

  1. In the Amazon Cognito console, navigate to the App integration tab for the same user pool and locate App clients. Choose Create an app client.
  2. Select an Application type. For this example, create a public client.
  3. Enter an App client name.
  4. Choose Don’t generate client secret.
  5. Keep the rest of the settings as default.
  6. Under Hosted UI settings, add Allowed callback URLs for your app client. This is where you will be directed after authentication.
  7. Choose Authorization code grant for OAuth 2.0 grant types.
  8. You can keep the remaining configuration as default and choose Create app client.

After the app client is successfully created, capture the app client ID from the App integration tab of the user pool.

Prepare information for the Entra ID setup

Prepare the Identifier (Entity ID) and Reply URL, which are required to add Amazon Cognito as an enterprise application in Entra ID (Step 3).

Create values for Identifier (Entity ID) and Reply URL according to the following formats:

For Identifier (Entity ID), the format is:
urn:amazon:cognito:sp:<yourUserPoolID>

For example: urn:amazon:cognito:sp:us-east-1_abcd1234

For Reply URL, the format is:
https://<yourDomainPrefix>.auth.<aws-region>.amazoncognito.com/saml2/idpresponse

For example: https://example-corp-prd.auth.us-east-1.amazoncognito.com/saml2/idpresponse

The reply URL is the endpoint where Entra ID will send the SAML assertion to Amazon Cognito during user authentication.
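If you script your setup, both values can be derived from the user pool ID, the domain prefix, and the Region. The helper below is a small sketch that follows the two formats shown above; the function name is hypothetical, and it applies only to a Cognito domain, not a custom domain.

```python
def entra_id_saml_values(user_pool_id, domain_prefix, region):
    """Return the Identifier (Entity ID) and Reply URL for a Cognito domain."""
    entity_id = f"urn:amazon:cognito:sp:{user_pool_id}"
    reply_url = (
        f"https://{domain_prefix}.auth.{region}.amazoncognito.com/saml2/idpresponse"
    )
    return entity_id, reply_url


entity_id, reply_url = entra_id_saml_values(
    "us-east-1_abcd1234", "example-corp-prd", "us-east-1"
)
print(entity_id)
print(reply_url)
```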

For more information, see Adding SAML identity providers to a user pool.

Step 3: Add Amazon Cognito as an enterprise application in Entra ID

With the user pool and app client created and the information for Entra ID prepared, you can add Amazon Cognito as an application in Entra ID. To complete this step, you will add Cognito as an enterprise application and set up SSO.

To add Cognito as an enterprise application

  1. Sign in to the Azure portal.
  2. In the search box, search for the service Microsoft Entra ID.
  3. In the left sidebar, select Enterprise applications.
  4. Choose New application.
  5. On the Browse Microsoft Entra Gallery page, choose Create your own application.

    Figure 4: Create an application in Entra ID

  6. Under What’s the name of your app?, enter a name for your application and select Integrate any other application you don’t find in the gallery (Non-gallery), as shown in Figure 4. Choose Create.
  7. It will take a few seconds for the application to be created in Entra ID, and then you should be redirected to the Overview page for the newly added application.

To set up SSO using SAML:

  1. On the Getting started page, in the Set up single sign on tile, choose Get started, as shown in Figure 5.

    Figure 5: Choose Set up single sign-on in Getting Started

  2. On the next screen, select SAML.
  3. In the middle pane under Set up Single Sign-On with SAML, in the Basic SAML Configuration section, choose the edit icon.
  4. In the right pane under Basic SAML Configuration, replace the default Identifier ID (Entity ID) with the identifier (entity ID) you created in Step 2. Replace Reply URL (Assertion Consumer Service URL) with the reply URL you created in Step 2.

    Figure 6: Add the identifier (entity ID) and reply URL

  5. Now go to Attributes & Claims and note the claims, as shown in Figure 7. You’ll need these when creating attribute mapping in Amazon Cognito.

    Figure 7: Entra ID Attributes & Claims

  6. Scroll down to the SAML Certificates section and copy the App Federation Metadata Url by choosing the copy into clipboard icon. Make a note of this URL to use in the next step.

    Figure 8: Copy SAML metadata URL from Entra ID

Step 4: Add Entra ID as SAML IdP in Amazon Cognito

In this step, you’ll add Entra ID as a SAML IdP to your user pool and download the signing and encryption certificates.

To add the SAML IdP:

  1. In the Amazon Cognito console, navigate to the Sign-in experience tab of the same user pool. Locate Federated identity provider sign-in and choose Add an Identity provider.
  2. Choose a SAML IdP.
  3. Enter a Provider name, for example, EntraID.
  4. Under IdP-initiated SAML sign-in, choose Accept SP-initiated and IdP-initiated SAML assertions.
  5. Under Metadata document source, enter the metadata document endpoint URL you captured in Step 3.
  6. (Optional) Under SAML signing and encryption, select Require encrypted SAML assertion from this provider.

    Enable Require encrypted SAML assertion from this provider only if you can turn on token encryption in the Entra ID application. See Step 6.

  7. Under Map attributes between your SAML provider and your user pool, map SAML provider attributes to the user profile in your user pool. Include your user pool’s required attributes in your attribute map.

    For example, when you choose User pool attribute email, enter the SAML attribute name as it appears in the SAML assertion from your IdP. In our case it will be http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress.

    Figure 9: Enter the SAML attribute name

  8. Choose Add identity provider.
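The console steps above can also be expressed as parameters to the Amazon Cognito CreateIdentityProvider API. The sketch below only builds the parameter dictionary; the ProviderDetails keys (MetadataURL, IDPInit, EncryptedResponses) reflect our understanding of the API and should be verified against the API reference before use. The function name and sample values are hypothetical.

```python
EMAIL_CLAIM = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress"


def saml_idp_request(user_pool_id, provider_name, metadata_url, require_encryption=False):
    """Build keyword arguments for a CreateIdentityProvider call (SAML IdP)."""
    details = {
        "MetadataURL": metadata_url,
        # Assumed key: accept IdP-initiated as well as SP-initiated assertions.
        "IDPInit": "true",
    }
    if require_encryption:
        # Assumed key: require encrypted SAML assertions (see Step 6).
        details["EncryptedResponses"] = "true"
    return {
        "UserPoolId": user_pool_id,
        "ProviderName": provider_name,
        "ProviderType": "SAML",
        "ProviderDetails": details,
        # Map the user pool attribute to the SAML claim name from Entra ID.
        "AttributeMapping": {"email": EMAIL_CLAIM},
    }


params = saml_idp_request(
    "us-east-1_abcd1234", "EntraID", "https://example.com/federationmetadata.xml"
)
print(params["ProviderDetails"])
```

With the AWS SDK for Python installed and credentials configured, a dictionary like this could be passed as keyword arguments to the create_identity_provider method of a cognito-idp client.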

After the IdP has been created, navigate to the newly added EntraID IdP in the user pool to download the SAML signing and encryption certificates. These certificates must be imported into the Entra ID enterprise application.

To download the certificates

  1. To download the SAML signing certificate, choose View signing certificate and then Download as .crt.
  2. To download the SAML encryption certificate, choose View encryption certificate and then Download as .crt.

Step 5: Add the newly created SAML IdP to your user pool app client

Before you can use Amazon Cognito in your web application, you must add the SAML IdP created in Step 4 to your app client.

To add the SAML IdP:

  1. In the Amazon Cognito console, navigate to the App integration tab for the same user pool and locate App clients.
  2. Choose the app client you created in Step 2.
  3. Locate the Hosted UI section and choose Edit.
  4. Under Identity providers, select the identity provider you created in Step 4 and choose Save changes.

    Figure 10: Enabling the Entra ID SAML identity provider in the Cognito app client

At this stage, the Amazon Cognito OAuth 2.0 server is up and running and the web interface is accessible and ready to use. You can access the Cognito hosted UI from your app client using the Cognito console to test it further.

Step 6: Enable encrypting the SAML response in Entra ID

For additional security and privacy of user data, enable encrypting the SAML response. Amazon Cognito and your IdP can establish confidentiality in SAML responses when users sign in and sign out. Cognito assigns a public-private RSA key pair and a certificate to each external SAML provider that you configure in your user pool. You will use the SAML encryption certificate that you downloaded in Step 4.

To enable encrypting the SAML response:

  1. Navigate to your Enterprise application in Entra ID and in the left menu, under Security, select Token encryption.
  2. Import the SAML encryption certificate that you downloaded in Step 4.

    Figure 11: Import the Cognito encryption certificate to Entra ID

  3. After the certificate is imported, it’s inactive by default. To activate it, right-click on the certificate and select Activate token encryption certificate. This enables the encrypted SAML response.

    Figure 12: Activate the token encryption certificate in Entra ID

Step 7: Add RelayState in Entra ID SAML SSO

A RelayState parameter is required when using SAML IdP-initiated authentication flow. Set this up in Entra ID for the Amazon Cognito user pool and the enabled app client ID.

To add RelayState in Entra ID SAML SSO:

  1. Sign in to the Azure portal and open the enterprise application created in Step 3.
  2. In the left sidebar, choose Single sign-on.
  3. In the middle pane under Set up Single Sign-On with SAML, in the Basic SAML Configuration section, choose the edit icon.
  4. In the right pane under Basic SAML Configuration, enter a value in the following format in the Relay State (Optional) field:
    identity_provider=<IDProviderName>&client_id=<ClientId>&redirect_uri=<callbackURL>&response_type=code&scope=openid+email+phone

    1. Replace <IDProviderName> with the name you previously used for ID provider.
    2. Replace <ClientId> with the app client’s ClientID created in Step 2.
    3. Replace <callbackURL> with the URL of your web application that will receive the authorization code. It must be an HTTPS endpoint, except in a local development environment, where you can use http://localhost:PORT_NUMBER.

    For example:

    identity_provider=EntraID&client_id=abcd1234567&redirect_uri=https://example.com&response_type=code&scope=openid+email+phone

    Figure 13: Set RelayState in Entra ID single sign-on
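If you manage several app clients, it can help to generate the RelayState value rather than typing it by hand. The helper below is a small sketch that reproduces the format shown in this step exactly as written (it does not percent-encode the callback URL, mirroring the example); the function name is hypothetical.

```python
def build_relay_state(provider_name, client_id, callback_url,
                      scopes=("openid", "email", "phone")):
    """Assemble the RelayState value in the format used in this step."""
    return (
        f"identity_provider={provider_name}"
        f"&client_id={client_id}"
        f"&redirect_uri={callback_url}"
        "&response_type=code"
        f"&scope={'+'.join(scopes)}"
    )


print(build_relay_state("EntraID", "abcd1234567", "https://example.com"))
```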

Test the IdP-initiated flow

Next, do a quick test to check if everything is configured properly.

  1. Sign in to the Azure portal and open the Enterprise application created in Step 3.
  2. In the left sidebar, choose Users and groups.
  3. On the right side, choose Add user/group. This will show the Add Assignment page.
  4. From the left side of the page, choose None Selected.
  5. Select a user from the right of the screen and follow the prompt to assign the user for this application.
  6. Once the user is assigned successfully, open https://www.microsoft365.com/apps and sign in as the assigned user.
  7. After you are signed in, choose the application icon registered as the IdP-initiated SSO.

    Figure 14: Testing IdP-initiated SSO from an Office 365 application

  8. The application will start the IdP-initiated authentication flow and the user will be redirected to the application as a signed-in user.

Signing an authentication request in case of SP-initiated flow

The preceding authentication flow that you tested uses IdP-initiated SSO. If you’re using an SP-initiated flow, you can enable signing of the SAML request that is sent from the SP (Amazon Cognito) to the IdP (Entra ID) for additional security and integrity of communication between them.

You can enable the authentication request signing in Cognito while creating the IdP or by updating your existing IdP.

To enable signing of the SAML request:

  1. In the Amazon Cognito console, when you create or edit your SAML identity provider, under SAML signing and encryption, select the box Sign SAML requests to this provider and choose Save changes.

    Figure 15: Enabling signing SAML request

  2. Sign in to the Azure portal and access your Entra ID enterprise application. Go to Set up single sign on and edit Verification certificates (optional).
  3. Select the checkbox Require verification certificates and upload the Cognito user pool SAML signing certificate that you downloaded in Step 4, saved with a .cer file extension. You must convert the .crt file to a .cer file because Entra ID requires the verification certificate to have a .cer extension.

To convert the .crt certificate extension to .cer:

  1. Right-click the .crt file and choose Open.
  2. Navigate to the Details tab.
  3. Select Copy to File… and choose Next.
  4. Select Base-64 encoded X.509 (.CER) and choose Next.
  5. Give your export file a name (for example, Entra ID.cer) and choose Save.
  6. Choose Next.
  7. Confirm the details and choose Finish.
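If you aren’t on Windows, or you prefer to script the conversion, the same result (a Base-64 encoded X.509 file with a .cer extension) can be produced with the Python standard library. This is a sketch under the assumption that the downloaded .crt file is either already Base-64 (PEM) encoded or binary DER; the function name and file names are hypothetical.

```python
import ssl
from pathlib import Path

PEM_HEADER = b"-----BEGIN CERTIFICATE-----"


def convert_crt_to_cer(crt_path, cer_path):
    """Write a Base-64 (PEM) encoded copy of a certificate with a .cer extension."""
    raw = Path(crt_path).read_bytes()
    if raw.lstrip().startswith(PEM_HEADER):
        # Already Base-64 encoded; only the file extension changes.
        pem_text = raw.decode("ascii")
    else:
        # Binary DER input: wrap it as Base-64 PEM text.
        pem_text = ssl.DER_cert_to_PEM_cert(raw)
    Path(cer_path).write_text(pem_text)
    return cer_path
```

For example, convert_crt_to_cer("cognito-signing.crt", "cognito-signing.cer") writes the converted file next to the original.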

Test the SP-initiated flow

Next, do a quick test to check if everything is configured properly.

  1. In the Amazon Cognito console, navigate to the App integration tab for the same user pool and locate App clients.
  2. Choose the app client you created in Step 2.
  3. Locate the Hosted UI section and choose View Hosted UI.
  4. From the hosted UI, authenticate yourself using Entra ID as the identity provider.
  5. After authentication is completed successfully, you will be redirected to the callback URL you configured in your app client with the authorization code.

If you capture the SAML request, you will see that Amazon Cognito is sending a cryptographic signature with the signing certificate in the SAML request to the IdP, and the IdP will match the cryptographic signature with the uploaded certificate to ensure the integrity of the request.

Conclusion

In this post, you learned the benefits of using IdP-initiated single sign-on: it helps centralize administration and lowers dependency on service provider applications. You also learned how to integrate an Amazon Cognito user pool with Microsoft Entra ID as an external SAML IdP using IdP-initiated SSO so your users can use their corporate ID to sign in to web or mobile applications. Finally, you learned how to enable signed authentication requests when using an SP-initiated flow and how to encrypt SAML responses for additional security between Cognito and the SAML IdP.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Vishal Jakharia

Vishal is a cloud support engineer based in New Jersey, USA. He is an Amazon Cognito subject matter expert who loves to work with customers and provide them solutions for implementing authentication and authorization. He helps customers migrate and build secure scalable architecture on the AWS Cloud.

Yungang Wu

Yungang is a senior cloud support engineer who specializes in the Amazon Cognito service. He helps AWS customers troubleshoot issues and suggests well-designed application authentication and authorization implementations.

Governing and securing AWS PrivateLink service access at scale in multi-account environments

Post Syndicated from Anandprasanna Gaitonde original https://aws.amazon.com/blogs/security/governing-and-securing-aws-privatelink-service-access-at-scale-in-multi-account-environments/

Amazon Web Services (AWS) customers have been adopting AWS PrivateLink to communicate securely with AWS services, their own internal services, and third-party services in the AWS Cloud. As these environments scale, the number of PrivateLink connections, both outbound to external services and inbound to internal services, increases, and the connections are spread across virtual private clouds (VPCs) in multiple accounts. While AWS Identity and Access Management (IAM) policies allow you to control access to individual PrivateLink services, customers want centralized governance for the use of PrivateLink in adherence with organizational standards and security needs.

This post provides an approach for centralized governance of PrivateLink-based services across your multi-account environment. It provides a way to create preventative controls through the use of service control policies (SCPs) and detective controls through event-driven automation. This allows your application teams to consume internal and external services while adhering to organization policies and provides a mechanism for centralized control as your AWS environment grows.

Scenarios faced by customers

Figure 1 shows an example customer environment comprising a multi-account structure created through AWS Organizations or using AWS Control Tower. There are separate organizational units (OUs) pertaining to different business units (BUs) with respective accounts. The business services’ account hosts several backend services that are utilized by consuming applications for their functionality. Since these services provide functionality to more than one internal application and will require access across VPC and account boundaries, these are exposed through AWS PrivateLink. One such service is shown in the business services account.

The customer has partners that provide services for integration with the customer’s application stack. The approved partner account provides a service that is approved for use by the cloud administration team. The NotApproved partner account provides services that are not approved within the customer’s organization. The customer has another OU dedicated to application teams. The application 1 account has an application that consumes the business service of the approved partner account. It is also planning to use the service from the NotApproved partner, which should be blocked. The application in the application 2 account is planning on using AWS services through interface endpoints as well as the approved partner account through PrivateLink integration.

Note: Throughout this post, “organization” is used to refer to an organization that you create and manage through AWS Organizations.

Figure 1: A multi-account customer environment

Current challenges

Access to individual PrivateLink connections can be controlled through IAM policies. At scale, however, different teams adopt PrivateLink for incoming and outgoing connections, and the number of VPC endpoint policies to create and manage increases. Customers therefore want centralized guardrails for managing PrivateLink resources. For our example, the customer would like to put the following controls in place:

Preventative controls:

Use case 1:

  • Allow creation of VPC endpoints and allow access only to PrivateLink enabled AWS services.
  • Allow creation of VPC endpoints and initiating connection only to approved PrivateLink enabled third-party services.
  • Allow creation of VPC endpoints and initiating connection only to internal business services owned by accounts in the same organization.

Use case 2:

  • Allow only a cloud admin role to add permissions to connect to an endpoint service to prevent connections from external clients to internal VPC endpoint services.

Detective controls:

Use case 3:

  • Detect if connections are made by external AWS accounts (not belonging to the customer’s organization) to PrivateLink services exposed for internal use by the customer’s AWS accounts.

Use case 4:

  • Detect if connections are made to PrivateLink services exposed by AWS accounts not belonging to the customer’s organization.

This post presents a solution that uses SCPs, AWS CloudTrail, and AWS Config to achieve governance. When the solution is deployed in your account, the following components are created as part of the architecture, as shown in Figure 2.

Figure 2: Resources deployed in the customer environment by the solution

The following architecture is now in place:

  • SCPs to provide preventative controls for the PrivateLink connections.
  • Amazon EventBridge rules that are configured to trigger based on events from API calls captured by CloudTrail in specified accounts within specified OUs.
  • EventBridge rules in member accounts to send events to the event bus in the Audit account, and a central EventBridge rule in that account to trigger an AWS Lambda function based on PrivateLink related API calls.
  • A Lambda function that receives the events and validates if the VPC endpoint API call is allowed for the PrivateLink service and notifies a cloud administrator if a policy is violated.
  • An AWS Config rule that checks if PrivateLink enabled VPC endpoint services created within your AWS accounts have enabled auto accept of client connections and disabled notifications.

Use cases and solution approach

This section walks through each use case and how the solution components are used to address each use case.

Preventative control

Use case 1: Allowing the creation of a VPC endpoint connection to only AWS services and approved internal and third-party PrivateLink services

This solution allows creating a VPC endpoint only for approved partner PrivateLink services, PrivateLink services internal to the organization, and AWS services. This is implemented using an SCP and can be enforced at the individual account or OU level. The approved partner services, as well as the internal accounts that can host allowed PrivateLink services, can be specified during the solution deployment. Application teams operating in AWS accounts within the customer environment can then create VPC endpoints to PrivateLink services of approved partners or AWS services. However, they will not be able to create a VPC endpoint to an unapproved PrivateLink service. This is shown in Figure 3.

Figure 3: Allowed and disallowed paths in PrivateLink connections by SCP

The SCP that implements this preventative control is shown in the following code snippet. In this example SCP, AllowedPrivateLinkPartnerService-ServiceName refers to the service name of the allowed partner PrivateLink service. The SCP also allows the creation of VPC endpoints to internal PrivateLink services that are hosted in AllowedPrivateLinkAccount. Make sure that this SCP does not interfere with the other policies you created within your organization. The solution currently uses the ec2:VpceServiceName and ec2:VpceServiceOwner condition keys to identify the PrivateLink service of AWS services or a third-party partner. These condition keys can be used in an SCP to control the creation of VPC endpoints:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Condition": {
        "StringNotEquals": {
          "ec2:VpceServiceName": [
            "AllowedPrivateLinkPartnerService-ServiceName"
          ],
          "ec2:VpceServiceOwner": [
            "AllowedPrivateLinkAccount",
            "amazon"
          ]
        }
      },
      "Action": [
        "ec2:CreateVpcEndpoint"
      ],
      "Resource": "arn:aws:ec2:*:*:vpc-endpoint/*",
      "Effect": "Deny",
      "Sid": "SCPDenyPrivateLink"
    }
  ]
}
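
To make the evaluation logic of this SCP easier to reason about, the following Python sketch models the Deny condition. It is only an illustration and not how AWS evaluates policies internally. It relies on two documented behaviors: multiple condition keys under one operator are combined with AND, and a negated operator such as StringNotEquals is true only when the request value matches none of the listed values. The function name is hypothetical, and the placeholder values are the ones used in the snippet above.

```python
# Placeholder values copied from the example SCP.
ALLOWED_SERVICE_NAMES = {"AllowedPrivateLinkPartnerService-ServiceName"}
ALLOWED_SERVICE_OWNERS = {"AllowedPrivateLinkAccount", "amazon"}


def is_create_endpoint_denied(service_name: str, service_owner: str) -> bool:
    """Return True when the Deny statement would apply to ec2:CreateVpcEndpoint."""
    # StringNotEquals is true only if the value matches none of the listed values.
    name_not_allowed = service_name not in ALLOWED_SERVICE_NAMES
    owner_not_allowed = service_owner not in ALLOWED_SERVICE_OWNERS
    # Both condition keys must evaluate to true for the Deny to take effect.
    return name_not_allowed and owner_not_allowed
```

Under this model, a request is denied only when the service is neither an approved partner service by name nor owned by an allowed account or by AWS.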

Use case 2: Allow only a cloud admin role to add permissions to connect to an endpoint service

This solution makes sure that PrivateLink services that are owned and created in the customer’s AWS accounts cannot be opened to consumers unless the cloud administrator role allows it. The cloud administrator can then make sure that only legitimate internal AWS accounts are allowed access to a service, and can restrict access from accounts outside of the customer’s organization. This is achieved through a service control policy that restricts modification of the permissions of the PrivateLink endpoint service. Individual teams are therefore not able to use the Allow principals configuration to open access to other entities directly, and only a cloud administrator role with the right permissions can make that change.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement1",
      "Effect": "Deny",
      "Action": [
        "ec2:ModifyVpcEndpointServicePermissions"
      ],
      "Resource": [
        "*"
      ],
      "Condition": {
        "ArnNotLike": {
          "aws:PrincipalArn": [
            "arn:aws:iam::*:role/CloudNetworkAdmin"
          ]
        }
      }
    }
  ]
}
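
The intent of this policy can be sketched in a few lines of Python: every principal is denied unless its ARN matches the CloudNetworkAdmin role pattern in any account. The sketch approximates wildcard matching with fnmatch.fnmatchcase, which is an assumption made for illustration and not IAM's actual matching engine. Note that IAM itself applies the * wildcard only with "Like" operators such as StringNotLike or ArnNotLike. The function name is hypothetical.

```python
from fnmatch import fnmatchcase

# Pattern taken from the example SCP; * stands for any account ID.
ADMIN_ROLE_PATTERN = "arn:aws:iam::*:role/CloudNetworkAdmin"


def is_modify_permissions_denied(principal_arn: str) -> bool:
    """Return True when ec2:ModifyVpcEndpointServicePermissions would be denied."""
    # The Deny applies to every principal that does not match the admin role pattern.
    return not fnmatchcase(principal_arn, ADMIN_ROLE_PATTERN)
```

For example, a call made with the CloudNetworkAdmin role in any member account is not denied, while a call made with an application team's role is.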

This policy helps achieve the access control shown in Figure 4. The cloud administrator uses the Allow principals configuration of the business services PrivateLink service to provide access only to the application 1 account. The SCP allows only the cloud administrator to make the modification and prevents other team members from bypassing that process and giving a nonapproved client application account access to the internal PrivateLink service.

Figure 4: Centralized control on access to the internal PrivateLink service to the customer’s own accounts

Detective controls

For detective controls, we discuss two use cases that are deployed as part of the solution and can be enabled and disabled based on the test that you want to perform.

Use case 3: Detecting if connections are made by external AWS accounts (not belonging to the customer’s organization) to PrivateLink services exposed by the customer’s AWS accounts

In this use case, the customer would like to detect if connections are made to their business services from accounts outside of its organization. The solution uses individual member account trails for capturing API calls across the multi-account structure, together with cross-account EventBridge integration. When a PrivateLink service connection is accepted, CloudTrail in the member account records the AcceptVpcEndpointConnections API call, and the event is sent to the event bus in the audit account. This triggers a Lambda function that captures the information of the entity requesting the connection and the details of the PrivateLink service, and then sends a notification to the cloud administrator. This is shown in Figure 5.

Figure 5: Detecting the creation of a VPC endpoint or accepting a PrivateLink service connection using CloudTrail events in EventBridge
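
As an illustration of the event flow in Figure 5, the following Python sketch shows an EventBridge event pattern that would match the relevant CloudTrail records, along with a deliberately simplified matcher for experimenting with it. The pattern uses the EC2 API operation names CreateVpcEndpoint and AcceptVpcEndpointConnections as event names. The pattern deployed by the solution may differ, and the matches helper is hypothetical: it supports only exact-value lists and nested objects, which is a small subset of what EventBridge supports.

```python
# Event pattern for PrivateLink-related API calls recorded by CloudTrail.
EVENT_PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["ec2.amazonaws.com"],
        "eventName": ["CreateVpcEndpoint", "AcceptVpcEndpointConnections"],
    },
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: every pattern key must match the event."""
    for key, expected in pattern.items():
        actual = event.get(key)
        if isinstance(expected, dict):
            # Nested pattern: recurse into the corresponding event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: the event value must equal one of the listed values.
            return False
    return True
```

A pattern like this can be used in the member account rules that forward events, as well as in the central rule that triggers the Lambda function.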

Custom AWS Config rule for detective control

This detective control mechanism works in cases where PrivateLink services are configured to manually accept client connections. If the endpoint service is configured to automatically accept connections, CloudTrail will not generate an event when a connection is accepted. AWS PrivateLink allows customers to configure connection notifications that send connection events to an Amazon Simple Notification Service (Amazon SNS) topic. Cloud administrators can get the notifications if they are subscribed to the SNS topic. However, if the notification configuration is removed by the member account, the cloud administrator has no visibility into new connections and cannot effectively apply governance requirements.

This solution employs an AWS Config rule to detect if PrivateLink services are created with the Auto Accept Connections setting enabled or without a connection notification configuration and flag it as noncompliant.

This is depicted in Figure 6.

Figure 6: Custom AWS Config rule and SNS notification deployed as part of the solution

When a PrivateLink service is created by one of the business services teams, an AWS Config organization rule in the audit account will detect the event, and the custom Lambda function will check if the connection notification configuration is present. If not, then the AWS Config rule will flag the resource as noncompliant. Cloud administrators can view these in the AWS Config dashboard or receive notifications configured through AWS Config.
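
The compliance decision described above can be sketched as a small Python function. This is a minimal sketch and not the solution's actual Lambda code. It assumes the caller has already fetched the service configuration (for example, one entry of ServiceConfigurations from the EC2 DescribeVpcEndpointServiceConfigurations operation) and its connection notifications (for example, the ConnectionNotificationSet from DescribeVpcEndpointConnectionNotifications). The function name is hypothetical.

```python
def evaluate_endpoint_service(service_config: dict, notifications: list) -> str:
    """Return the AWS Config compliance type for one VPC endpoint service."""
    # Auto accept is enabled when AcceptanceRequired is false or missing.
    acceptance_required = service_config.get("AcceptanceRequired", False)
    # At least one connection notification must be configured.
    has_notifications = len(notifications) > 0
    if acceptance_required and has_notifications:
        return "COMPLIANT"
    return "NON_COMPLIANT"
```

A service is therefore flagged as noncompliant if it automatically accepts connections, if it has no connection notification configuration, or both.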

Use case 4: Detecting if connections are made to PrivateLink services exposed by AWS accounts not belonging to the customer’s organization.

Using the same approach as presented in use case 3, connections made to PrivateLink services exposed by AWS accounts outside of the customer’s organization can be detected through the CreateVpcEndpoint API call event from CloudTrail. This event is sent to the centralized event bus and then to the Lambda function, which checks it against the criteria and notifies the cloud administrator.
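
The check that the Lambda function performs for this use case might look like the following Python sketch. It is an assumption-laden illustration, not the deployed code. The function name and the message text are hypothetical, and the service name and service owner are passed in explicitly because the post does not show how they are extracted from the CloudTrail record. The eventName, errorCode, and userIdentity.accountId fields are standard CloudTrail record fields.

```python
from typing import Optional


def review_create_endpoint_event(
    event: dict,
    service_name: str,
    service_owner: str,
    org_account_ids: set,
    approved_partner_services: set,
) -> Optional[str]:
    """Return a notification message for a policy violation, or None."""
    detail = event["detail"]
    # Ignore other API calls and calls that failed (for example, denied by an SCP).
    if detail.get("eventName") != "CreateVpcEndpoint" or "errorCode" in detail:
        return None
    # AWS services, internal services, and approved partner services are allowed.
    if service_owner == "amazon" or service_owner in org_account_ids:
        return None
    if service_name in approved_partner_services:
        return None
    consumer_account = detail["userIdentity"]["accountId"]
    return (
        f"Account {consumer_account} created a VPC endpoint to {service_name}, "
        f"which is owned by external account {service_owner}."
    )
```

In a real handler, the returned message could be published to the SNS topic that the cloud administrator subscribes to.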

Deploy and test the solution

This section walks through how to deploy and test our recommended solution.

Prerequisites

To deploy the solution, first follow these steps.

  1. In your AWS Organizations multi-account environment, go to the management account and enable trusted access for AWS CloudFormation, enable trusted access for AWS Config, and enable trusted access for CloudTrail.
  2. Identify an account in your organization to serve as the audit account and set it up as a delegated administrator for CloudFormation, AWS Config, and CloudTrail. To do this:
    1. Register a delegated administrator for CloudFormation.
    2. Perform the steps mentioned in step 1 of this post to register a delegated administrator for AWS Config.
    3. Register a delegated admin for CloudTrail.
  3. The solution uses the deployment of CloudFormation StackSets with self-managed permissions to set up the resources in the audit account. In order to enable this, create AWSCloudFormationStackSetAdministrationRole in the management account and AWSCloudFormationStackSetExecutionRole in the audit account by using the steps in the topic Grant self-managed permissions.
  4. In a separate AWS account that is outside your multi-account environment, create two PrivateLink VPC endpoint services as explained in the documentation. You can use this template to create a test PrivateLink VPC endpoint service. These will serve as two partner services: one that is allowed, and one that is untrusted and not allowed. Make note of their service names.

Figure 7: Simulated partner services (approved and not approved) in a separate test account

Deploying the solution

  1. Go to the management account of your AWS Organizations multi-account environment and use this CloudFormation template to deploy the solution, or choose the following Launch Stack button:

    Launch stack

    CloudFormation stacks can be deployed using the AWS CloudFormation console or using the AWS CLI.

  2. This initially displays the Create stack page. Leave the details entered by default, and then choose Next.
  3. On the Specify stack details page, enter the details for the input parameters for this solution. The following table shows the details that you will provide when setting up the CloudFormation template on the Specify stack details page on the CloudFormation console.

    Input parameter | Description
    AWSOrganizationsId | Identifier for your organization. This can be obtained from your management account as described in the AWS Organizations User Guide.
    AdminRoleArn | Role of the persona who is allowed to modify PrivateLink endpoint permissions.
    AllowedPrivateLinkAccounts | AWS account IDs of accounts in your OU that host PrivateLink services.
    AllowedPrivateLinkPartnerServices | Specify the service name of the approved PrivateLink services from partners. If you want to test with a simulated partner PrivateLink, take the service name of PrivateLink services created in Step 4 of the prerequisites as the partner services to which connections should be allowed. The unique service name of the partner’s PrivateLink service is provided by the partner to the customer so that they can connect to it.
    AuditAccountId | AWS account ID of the audit account in your multi-account environment.
    PLOrganizationUnit | OU identifier for the organizational unit where the solution will perform preventative and detective control.

    Figure 8: CloudFormation template input parameters for the solution as it appears on the console

  4. Choose Next and keep the defaults for the rest of the fields. Then, on the Review and create page, choose Submit to finish deploying the solution.
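
Before submitting the stack, it can help to sanity-check the identifier formats of the input parameters. The following Python sketch is an optional aid and not part of the solution. The helper name is hypothetical, and the regular expressions are assumptions based on the identifier patterns documented in the AWS Organizations API reference. Only the three parameters with a single fixed format are checked.

```python
import re

# Assumed identifier formats (see the AWS Organizations API reference).
PATTERNS = {
    "AWSOrganizationsId": re.compile(r"o-[a-z0-9]{10,32}"),
    "AuditAccountId": re.compile(r"\d{12}"),
    "PLOrganizationUnit": re.compile(r"ou-[0-9a-z]{4,32}-[a-z0-9]{8,32}"),
}


def invalid_parameters(params: dict) -> list:
    """Return the names of parameters that are missing or malformed."""
    return [
        name
        for name, pattern in PATTERNS.items()
        if not pattern.fullmatch(params.get(name, ""))
    ]
```

An empty list means that the checked values at least have the expected shape. It does not confirm that the organization, account, or OU actually exists.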

Testing the solution

Once the solution is deployed successfully, follow these steps to test the solution:

  1. For an account specified in the AllowedPrivateLinkAccounts parameter, create a VPC endpoint service as explained in the topic Create a service powered by AWS PrivateLink. Instead of creating this manually, use this CloudFormation template to create a test VPC endpoint service.
  2. Sign in to a member account within the OU that you specified in the CloudFormation template.
  3. From the member account, create a VPC endpoint connection to the internal PrivateLink service created in the account from Step 1. This connection will set up successfully because it is internal to the organization and therefore allowed by the SCP policy, and is not flagged to the cloud administrator as violating organization policy.
  4. From the member account, create a VPC endpoint connection to an AWS service that supports PrivateLink, such as AWS Key Management Service (AWS KMS). This connection will set up successfully because the service is owned by AWS and therefore allowed by the SCP, and is not flagged to the cloud administrator as violating organization policy.
  5. From the member account, create a VPC endpoint connection to the approved partner PrivateLink service created in Step 4 of the prerequisites. This connection will set up successfully because the service is an approved partner service and therefore allowed by the SCP, and is not flagged to the cloud administrator as violating organization policy.
  6. From the member account, create a VPC endpoint connection to the PrivateLink service created in Step 4 of the prerequisites and that is not an allowed partner service. This connection will fail because it is not allowed by the SCP policy.
  7. From an account outside of your organization, create a VPC endpoint connection to the internal PrivateLink service created in Step 1. The connection setup is successful, but the cloud administrator will see the internal PrivateLink service as NOT COMPLIANT because the connection from external clients is considered to be not compliant with organization requirements in this solution. This information allows the cloud admin to quickly find the noncompliant resource and work with the PrivateLink service owner team to remediate the issue.
  8. From the member account, create another VPC endpoint service without configuring the notification configuration, and leave the Acceptance required field unchecked. Navigate to the AWS Config console in the audit account and go to Aggregator->Rules. Check the evaluation of the rule starting with “OrgConfigRule-pl-governance-rule….” Once the evaluation is complete, it will indicate that this VPC endpoint service is NOT COMPLIANT, whereas the service created in Step 1 will show as COMPLIANT.

Considerations

  • The solution described here takes the approach of allowing all VPC endpoint connections from within a customer’s organization to the PrivateLink services in specified accounts and detecting and notifying all external ones. This can be modified based on your specific use cases and requirements.
  • The solution uses AWS Config rules that are applied to specific accounts of your organization, even though the solution is applied at an OU level. The AWS Config rules created in this solution are scoped to evaluate VPC endpoint services and incur charges accordingly. Refer to the AWS Config pricing page to understand usage-based pricing for the service.
  • Other services, such as AWS Lambda and Amazon EventBridge, also incur usage-based charges. After testing, verify that these resources are deleted to prevent unnecessary charges.
  • SCPs only affect member accounts. They do not apply to the management account, so actions that an SCP denies in member accounts are still allowed in the management account.

Cleanup

You can delete the solution by following these steps to avoid unnecessary charges:

  • Delete the CloudFormation stack created as part of Step 4 of the prerequisites.
  • Delete the CloudFormation stack of the main solution deployed in the management account as part of the Deploying the solution section.
  • Delete the CloudFormation stack created as part of Step 1 of Testing the solution.

Summary

As customers adopt AWS PrivateLink throughout their environment, the mechanisms discussed in this post provide a way for administrators to govern and secure their PrivateLink services at scale. This approach can help you create a scalable solution where interconnections are aligned to the organization’s guidelines and security requirements. While this solution presents an approach to governance, customers can tailor this solution to their unique organizational requirements.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Anandprasanna Gaitonde

Anand is a Principal Solutions Architect at AWS, responsible for helping customers design and operate Well-Architected solutions to help them adopt the AWS Cloud successfully. He focuses on AWS networking and serverless technologies to design and develop solutions in the cloud across industry verticals. He holds a master of engineering in computer science and a postgraduate degree in software enterprise management.

Siva Devabakthini

Siva is a Senior Solutions Architect at AWS who covers hyperscale customers in the AWS Digital Native Business segment. He focuses on AWS security, data analytics, and artificial intelligence and machine learning (AI/ML) technologies to design and develop solutions in the cloud. Outside of work, Siva loves traveling, trying different cuisines, and being outdoors with his family.

Emmanuel Isimah

Emmanuel is a Senior Solutions Architect at AWS who covers hyperscale customers in the enterprise retail space. He has a background in networking, security, and containers. Emmanuel helps customers build and secure innovative cloud solutions, solving their business problems by using data-driven approaches. Emmanuel’s areas of depth include security and compliance, containers, and networking.

How to use WhatsApp to send Amazon Cognito notification messages

Post Syndicated from Nideesh K T original https://aws.amazon.com/blogs/security/how-to-use-whatsapp-to-send-amazon-cognito-notification-messages/

While traditional channels like email and SMS remain important, businesses are increasingly exploring alternative messaging services to reach their customers more effectively. In recent years, WhatsApp has emerged as a simple and effective way to engage with users. According to Statista, as of 2024, WhatsApp is the most popular mobile messenger app worldwide and reached over two billion monthly active users in January 2024.

Amazon Cognito lets you add user sign-up and authentication to your mobile and web applications. Among many other features, Cognito provides a custom SMS sender AWS Lambda trigger for using third-party providers to send notifications. In this post, we’ll be using WhatsApp as the third-party provider to send verification codes or multi-factor authentication (MFA) codes instead of SMS during Cognito user pool sign up.

Note: WhatsApp is a third-party service subject to additional terms and charges. Amazon Web Services (AWS) isn’t responsible for third-party services that you use to send messages with a custom SMS sender in Amazon Cognito.

Overview

By default, Amazon Cognito uses Amazon Simple Notification Service (Amazon SNS) for delivery of SMS text messages. Cognito also supports custom triggers that will allow you to invoke an AWS Lambda function to support additional providers such as WhatsApp.

The architecture shown in Figure 1 depicts how to use a custom SMS sender trigger and WhatsApp to send notifications. The steps are as follows:

  1. A user signs up to an Amazon Cognito user pool.
  2. Cognito invokes the custom SMS sender Lambda function and sends the user’s attributes, including the phone number and a one-time code to the Lambda function. This one-time code is encrypted using a custom symmetric encryption AWS Key Management Service (AWS KMS) key that you create.
  3. The Lambda function decrypts the one-time code using a Decrypt API call to your AWS KMS key.
  4. The Lambda function then obtains the WhatsApp access token from AWS Secrets Manager. The WhatsApp access token needs to be generated through Meta Business Settings (which are covered in the next section) and added to Secrets Manager. Lambda also parses the phone number, user attributes, and encrypted secrets.
  5. Lambda sends a POST API call to the WhatsApp API and WhatsApp delivers the verification code to the user as a message. The user can then use the verification code to verify their contact information and confirm the sign-up.

Figure 1: Custom SMS sender trigger flow
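
Step 5 of this flow can be sketched in Python as follows. This is a minimal sketch under several assumptions and not the code from the solution's repository. It assumes the one-time code has already been decrypted, that the Graph API version is v19.0, that the template language code is en, and that the authentication template expects the code both in the body and in a copy-code button component. The function name is hypothetical. Check Meta's WhatsApp Cloud API documentation for the exact payload your template needs.

```python
import json
import urllib.request

GRAPH_API_VERSION = "v19.0"  # Assumption: use the version your Meta app targets.


def build_otp_request(phone_number_id: str, authorization_header: str,
                      recipient: str, code: str,
                      template_name: str = "otp_message",
                      language_code: str = "en") -> urllib.request.Request:
    """Build (but do not send) the POST request that delivers the code."""
    code_parameter = [{"type": "text", "text": code}]
    payload = {
        "messaging_product": "whatsapp",
        "to": recipient,
        "type": "template",
        "template": {
            "name": template_name,
            "language": {"code": language_code},
            "components": [
                # The code shown in the message body.
                {"type": "body", "parameters": code_parameter},
                # The code attached to the copy-code button (assumed layout).
                {"type": "button", "sub_type": "url", "index": "0",
                 "parameters": code_parameter},
            ],
        },
    }
    url = f"https://graph.facebook.com/{GRAPH_API_VERSION}/{phone_number_id}/messages"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": authorization_header,
                 "Content-Type": "application/json"},
        method="POST",
    )
```

The returned request can be sent with urllib.request.urlopen. Keeping request construction separate from sending makes the payload straightforward to inspect and test without calling the WhatsApp API.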

Prerequisites

Implementation

In the next steps, we look at how to create a Meta app, create a new system user, get the WhatsApp access token, and create the message template used to send the verification code.

Create and configure an app for WhatsApp communication

To get started, create a Meta app with WhatsApp added to it, along with the customer phone number that will be used to test.

To create and configure an app

  1. Open the Meta for Developers console, choose My Apps and then choose Create App (or choose an existing Business type app and skip to step 4).
  2. Select Other, choose Next, and then select Business as the app type and choose Next.
  3. Enter an App name, App contact email, choose whether or not to attach a Business portfolio and choose Create app.
  4. Open the app Dashboard and in the Add product to your app section, under WhatsApp, choose Set up.
  5. Create or select an existing Meta business portfolio and choose Continue.
  6. In the left navigation pane, under WhatsApp, choose API Setup.
  7. Under Send and receive messages, take a note of the Phone number ID, which will be needed in the AWS CDK template later.
  8. Under To, add the customer phone number you want to use for testing. Follow the instructions to add and verify the phone number.

Note: You must have WhatsApp registered with the number and the WhatsApp client installed on your mobile device.

Create a user for accessing WhatsApp

Create a system user in Meta’s Business Manager and assign it to the app created in the previous step. The access tokens generated for this user will be used to make the WhatsApp API calls.

To create a user

  1. Open Meta’s Business Manager and select the business you created or associated your application with earlier from the dropdown menu under Business settings.
  2. Under Users, select System users and then choose Add to create a new system user.
  3. Enter a name for the System Username and set their role as Admin and choose Create system user.
  4. Choose Assign assets.
  5. From the Select asset type list, select Apps. Under Select assets, select your WhatsApp application’s name. Under Partial access, turn on the Test app option for the user. Choose Save Changes and then choose Done.
  6. Choose Generate New Token, select the WhatsApp application created earlier, and leave the default 60 days as the token expiration. Under Permissions, select whatsapp_business_messaging and whatsapp_business_management, and then choose Generate Token at the bottom.
  7. Copy and save your access token. You will need this for the AWS CDK template later. Choose OK. For more details on creating the access token, see WhatsApp’s Business Management API Get Started guide.

Create a template in WhatsApp

Create a template for the verification messages that will be sent by WhatsApp.

To create a template

  1. Open Meta’s WhatsApp Manager.
  2. On the left icon pane, under Account tools, choose Message template and then choose Create Template.
  3. Select Authentication as the category.
  4. For the Name, enter otp_message.
  5. For Languages, enter English.
  6. Choose Continue.
  7. In the next screen, select Copy code and choose Submit.

Note: It’s possible that Meta might change the process or the UI. See the Meta documentation for specific details.

For more information on WhatsApp templates, see Create and Manage Templates.

Create a Secrets Manager secret

Use the Secrets Manager console to create a Secrets Manager secret and set the secret to the WhatsApp access token.

To create a secret

  1. Open the AWS Management Console and go to Secrets Manager.

    Figure 2: Open the Secrets Manager console

  2. Choose Store a new secret.

    Figure 3: Store a new secret

  3. Under Choose a secret type, choose Other type of secret and under Key/value pairs, select the Plaintext tab and enter Bearer followed by the WhatsApp access token (Bearer <WhatsApp access token>).

    Figure 4: Add the secret

  4. For the encryption key, you can use either the AWS KMS key that Secrets Manager creates or a customer managed AWS KMS key that you create and then choose Next.
  5. Provide the secret name as the WhatsAppAccessToken, choose Next, and then choose Store to create the secret.
  6. Note the secret Amazon Resource Name (ARN) to use in later steps.
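
To show how the Lambda function can later consume this secret, here is a minimal Python sketch. It assumes the secret was stored as plaintext in the form Bearer <WhatsApp access token>, as described in step 3. The function name is hypothetical, and the client is passed in as a parameter so that the logic can be exercised without AWS access. In a Lambda function, the client would typically be created with boto3.client("secretsmanager").

```python
def get_authorization_header(secrets_client, secret_id: str = "WhatsAppAccessToken") -> str:
    """Read the stored secret and return it as an Authorization header value."""
    # get_secret_value returns plaintext secrets in the SecretString field.
    response = secrets_client.get_secret_value(SecretId=secret_id)
    value = response["SecretString"].strip()
    # Fail early if the secret was stored without the Bearer prefix.
    if not value.startswith("Bearer "):
        raise ValueError("Secret must have the form 'Bearer <WhatsApp access token>'")
    return value
```

Validating the prefix when the secret is read surfaces a misconfigured secret as a clear error instead of an authentication failure from the WhatsApp API.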

Deploy the solution

In this section, you clone the GitHub repository and deploy the stack to create the resources in your account.

To clone the repository

  1. Create a new directory, navigate to that directory in a terminal and use the following command to clone the GitHub repository that has the Lambda and AWS CDK code:
  2. Change directory to the pattern directory:
    cd amazon-cognito-whatsapp-otp

To deploy the stack

  1. Configure the phone number ID obtained from WhatsApp, the secret name, secret ARN, and the Amazon Cognito user pool self-service sign-up option in the constants.ts file.

    Open the lib/constants.ts file and edit the fields. The SELF_SIGNUP value must be set to true for the purpose of this proof of concept. The SELF_SIGNUP value represents the Boolean value for the Amazon Cognito user pool sign-up option, which when set to true allows public users to sign up.

    export const PHONE_NUMBER_ID = '<phone number ID>';
    export const SECRET_NAME = '<WhatsAppAccessToken>';
    export const SECRET_ARN = 'arn:aws:secretsmanager:<AWSRegion>:<AWS account ID>:secret:<WhatsAppAccessToken>';
    export const SELF_SIGNUP = <true>;

    Warning: If you activate user sign-up (enable self-registration) in your user pool, anyone on the internet can sign up for an account and sign in to your applications.

  2. Install the AWS CDK required dependencies by running the following command:
    npm install

  3. This project uses TypeScript as the client language for AWS CDK. Run the following command to compile TypeScript to JavaScript:
    npm run build

  4. From the command line, configure AWS CDK (if you have not already done so):
    cdk bootstrap <account number>/<AWS Region>

  5. Install and run Docker. We’re using the aws-lambda-python-alpha package in the AWS CDK code to build the Lambda deployment package. The deployment package installs the required modules in a Lambda compatible Docker container.
  6. Deploy the stack:
    cdk synth
    cdk deploy --all

Test the solution

Now that you’ve completed implementation, it’s time to test the solution by signing up a user on Amazon Cognito and confirming that the Lambda function is invoked and sends the verification code.

To test the solution

  1. Open AWS CloudFormation console.
  2. Select the WhatsappOtpStack that was deployed through AWS CDK.
  3. On the Outputs tab, copy the value of cognitocustomotpsenderclientappid.
  4. Run the following AWS Command Line Interface (AWS CLI) command to sign up a new Amazon Cognito user. Replace the client ID with the value of cognitocustomotpsenderclientappid, and replace the username, password, email address, name, phone number, and AWS Region with your own values.
    aws cognito-idp sign-up --client-id <cognitocustomotpsenderclientappid> --username <TestUserPhoneNumber> --password <Password> --user-attributes Name="email",Value="<TestUserEmail>" Name="name",Value="<TestUserName>" Name="phone_number",Value="<TestPhoneNumber>" --region <AWS Region>

    Example:

    aws cognito-idp sign-up --client-id xxxxxxxxxxxxxx --username +12065550100 --password Test@654321 --user-attributes Name="email",Value="[email protected]" Name="name",Value="Jane" Name="phone_number",Value="+12065550100" --region us-east-1

    Note: Password requirements are a minimum length of eight characters with at least one number, one lowercase letter, and one special character.

The new user should receive a message on WhatsApp with a verification code that they can use to complete their sign-up.
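
If the sign-up call is rejected, the password or the phone number format is a likely cause. The sketch below mirrors the password note above and checks that the phone number is in E.164 format (a plus sign followed by up to 15 digits), which Amazon Cognito expects for the phone_number attribute. The function names are hypothetical, and treating any non-alphanumeric character as a special character is a simplifying assumption; the user pool's own policy remains the authority.

```typescript
// Illustrative client-side checks; the user pool enforces the real policy.
function meetsPasswordNote(password: string): boolean {
  return (
    password.length >= 8 &&
    /[0-9]/.test(password) && // at least one number
    /[a-z]/.test(password) && // at least one lowercase letter
    /[^A-Za-z0-9]/.test(password) // assumption: any non-alphanumeric counts as special
  );
}

// E.164: "+" followed by a nonzero digit and up to 14 more digits.
function isE164(phoneNumber: string): boolean {
  return /^\+[1-9]\d{1,14}$/.test(phoneNumber);
}
```

With the values from the example command, meetsPasswordNote("Test@654321") and isE164("+12065550100") both return true.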

Cleanup

  1. Run the following command to delete the resources that were created. It might take a few minutes for the CloudFormation stack to be deleted.
    cdk destroy --all

  2. Delete the secret WhatsAppAccessToken that was created from the Secrets Manager console.

Conclusion

In this post, we showed you how to use an alternative messaging platform such as WhatsApp to send notification messages from Amazon Cognito. This functionality is enabled through the Amazon Cognito custom SMS sender trigger, which invokes a Lambda function that has the custom code to send messages through the WhatsApp API. You can use the same method to use other third-party providers to send messages.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Amazon Cognito re:Post or contact AWS Support.

Want more AWS Security news? Follow us on X.

Nideesh K T

Nideesh is an experienced IT professional with expertise in cloud computing and technical support. Nideesh has been working in the technology industry for 8 years. In his current role as a Sr. Cloud Support Engineer, Nideesh provides technical assistance and troubleshooting for cloud infrastructure issues. Outside of work, Nideesh enjoys staying active by going to the gym, playing sports, and spending time outdoors.

Reethi Joseph

Reethi is a Sr. Cloud Support Engineer at AWS with 7 years of experience specializing in serverless technologies. In her role, she helps customers architect and build solutions using AWS services. When not delving into the world of servers and generative AI, she spends her time trying to perfect her swimming strokes, traveling, trying new baking recipes, gardening, and watching movies.

Achieve peak performance and boost scalability using multiple Amazon Redshift serverless workgroups and Network Load Balancer

Post Syndicated from Ricardo Serafim original https://aws.amazon.com/blogs/big-data/achieve-peak-performance-and-boost-scalability-using-multiple-amazon-redshift-serverless-workgroups-and-network-load-balancer/

As data analytics use cases grow, factors of scalability and concurrency become crucial for businesses. Your analytic solution architecture should be able to handle large data volumes at high concurrency and without compromising speed, thereby delivering a scalable high-performance analytics environment.

Amazon Redshift Serverless provides a fully managed, petabyte-scale, auto scaling cloud data warehouse to support high-concurrency analytics. It offers data analysts, developers, and scientists a fast, flexible analytic environment to gain insights from their data with optimal price-performance. Redshift Serverless auto scales during usage spikes, enabling enterprises to cost-effectively help meet changing business demands. You can benefit from this simplicity without changing your existing analytics and business intelligence (BI) applications.

To help meet demanding performance needs like high concurrency, usage spikes, and fast query response times while optimizing costs, this post proposes using Redshift Serverless. The proposed solution aims to address three key performance requirements:

  • Support thousands of concurrent connections with high availability by using multiple Redshift Serverless endpoints behind a Network Load Balancer
  • Accommodate hundreds of concurrent queries with low-latency service level agreements through scalable and distributed workgroups
  • Enable subsecond response times for short queries against large datasets using the fast query processing of Amazon Redshift

The suggested architecture uses multiple Redshift Serverless endpoints accessed through a single Network Load Balancer client endpoint. The Network Load Balancer evenly distributes incoming requests across workgroups. This improves performance and reduces latency by scaling out resources to meet high throughput and low latency demands.

Solution overview

The following diagram outlines a Redshift Serverless architecture with multiple Amazon Redshift managed VPC endpoints behind a Network Load Balancer.

The following are the main components of this architecture:

  • Amazon Redshift data sharing – This allows you to securely share live data across Redshift clusters, workgroups, AWS accounts, and AWS Regions without manually moving or copying the data. Users can see up-to-date and consistent information in Amazon Redshift as soon as it’s updated. With Amazon Redshift data sharing, the ingestion can be done at the producer or consumer endpoint, allowing the other consumer endpoints to read and write the same data and thereby enabling horizontal scaling.
  • Network Load Balancer – This serves as the single point of contact for clients. The load balancer distributes incoming traffic across multiple targets, such as Redshift Serverless managed VPC endpoints. This increases the availability, scalability, and performance of your application. You can add one or more listeners to your load balancer. A listener checks for connection requests from clients, using the protocol and port that you configure, and forwards requests to a target group. A target group routes requests to one or more registered targets, such as Redshift Serverless managed VPC endpoints, using the protocol and the port number that you specify.
  • VPC – Redshift Serverless is provisioned in a VPC. By creating a Redshift managed VPC endpoint, you enable private access to Redshift Serverless from applications in another VPC. This design allows you to scale by having multiple VPCs as needed. The VPC endpoint provides a dedicated private IP address for each Redshift Serverless workgroup, which you register as a target in the Network Load Balancer target group.

Create an Amazon Redshift managed VPC endpoint

Complete the following steps to create the Amazon Redshift managed VPC endpoint:

  1. On the Redshift Serverless console, choose Workgroup configuration in the navigation pane.
  2. Choose a workgroup from the list.
  3. On the Data access tab, in the Redshift managed VPC endpoints section, choose Create endpoint.
  4. Enter the endpoint name. Create a name that is meaningful for your organization.
  5. The AWS account ID will be populated. This is your 12-digit account ID.
  6. Choose a VPC where the endpoint will be created.
  7. Choose a subnet ID. In the most common use case, this is a subnet where you have a client that you want to connect to your Redshift Serverless instance.
  8. Choose which VPC security groups to add. Each security group acts as a virtual firewall to control inbound and outbound traffic to resources protected by the security group, such as specific virtual desktop instances.

The following screenshot shows an example of this workgroup. Note down the IP address to use during the creation of the target group.

Repeat these steps to create all your Redshift Serverless workgroups.

Add VPC endpoints for the target group for the Network Load Balancer

To add these VPC endpoints to the target group for the Network Load Balancer using Amazon Elastic Compute Cloud (Amazon EC2), complete the following steps:

  1. On the Amazon EC2 console, choose Target groups under Load Balancing in the navigation pane.
  2. Choose Create target group.
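  3. For Choose a target type, select IP addresses to register targets by IP address. Use the private IP addresses that you noted when you created the Redshift managed VPC endpoints. (The Instances option registers targets by instance ID and doesn't apply to Redshift managed VPC endpoints.)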
  4. For Target group name, enter a name for the target group.
  5. For Protocol, choose TCP or TCP_UDP.
  6. For Port, use 5439 (Amazon Redshift port).
  7. For IP address type, choose IPv4 or IPv6. This option is available only if the target type is Instances or IP addresses and the protocol is TCP or TLS.
  8. You must associate an IPv6 target group with a dual-stack load balancer. All targets in the target group must have the same IP address type. You can’t change the IP address type of a target group after you create it.
  9. For VPC, choose the VPC with the targets to register.
  10. Leave the default selections for the Health checks section, Attributes section, and Tags section.
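
If you prefer to script these steps, the console choices map onto the parameters of the Elastic Load Balancing CreateTargetGroup and RegisterTargets operations. The following is a minimal sketch that only builds the request objects and does not call AWS. It assumes targets are registered by IP address over TCP on port 5439; the helper names, the target group name, and the IDs used with them are hypothetical.

```typescript
// Request shapes that mirror a subset of the CreateTargetGroup and
// RegisterTargets parameters. This sketch builds plain objects only.
interface TargetGroupRequest {
  Name: string;
  Protocol: "TCP";
  Port: number;
  VpcId: string;
  TargetType: "ip";
  IpAddressType: "ipv4" | "ipv6";
}

interface RegisterTargetsRequest {
  TargetGroupArn: string;
  Targets: { Id: string; Port: number }[];
}

const REDSHIFT_PORT = 5439;

function buildTargetGroupRequest(name: string, vpcId: string): TargetGroupRequest {
  return {
    Name: name,
    Protocol: "TCP",
    Port: REDSHIFT_PORT,
    VpcId: vpcId,
    TargetType: "ip",
    IpAddressType: "ipv4",
  };
}

// endpointIps: the private IP addresses of the Redshift managed VPC endpoints.
function buildRegisterTargetsRequest(
  targetGroupArn: string,
  endpointIps: string[]
): RegisterTargetsRequest {
  return {
    TargetGroupArn: targetGroupArn,
    Targets: endpointIps.map((ip) => ({ Id: ip, Port: REDSHIFT_PORT })),
  };
}
```

Each workgroup's endpoint IP address becomes one entry in Targets, so adding a workgroup later means registering one more target rather than changing the client-facing endpoint.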

Create a load balancer

After you create the target group, you can create your load balancer. We recommend using port 5439 (Amazon Redshift default port) for it.

The Network Load Balancer serves as a single-access endpoint and will be used on connections to reach Amazon Redshift. This allows you to add more Redshift Serverless workgroups and increase the concurrency transparently.

Testing the solution

We tested this architecture to run three BI reports with the TPC-DS dataset (cloud benchmark dataset) as our data. Amazon Redshift includes this dataset for free when you choose to load sample data (sample_data_dev database). The installation also provides the queries to test the setup.

Among all the queries from TPC-DS benchmark, we chose the following three to use as our report queries. We changed the first two report queries to use a CREATE TABLE AS SELECT (CTAS) query on temporary tables instead of the WITH clause to emulate options you can see on a typical BI tool. For our testing, we also disabled the result cache to make sure that Amazon Redshift would run the queries every time.

The set of queries contains the creation of temporary tables, a join between those tables, and the cleanup. The cleanup step drops tables. This isn’t needed because they’re deleted at the end of the session, but this aims to simulate all that the BI tool does.

We used Apache JMeter to simulate clients invoking the requests. To learn more about how to use and configure Apache JMeter with Amazon Redshift, refer to Building high-quality benchmark tests for Amazon Redshift using Apache JMeter.

For the tests, we used the following configurations:

  • Test 1 – A single 96 RPU Redshift Serverless vs. three workgroups at 32 RPU each
  • Test 2 – A single 48 RPU Redshift Serverless vs. three workgroups at 16 RPU each

We tested three reports by spawning 100 sessions per report (300 total). There were 14 statements across the three reports (4,200 total). All sessions were triggered simultaneously.

The following table summarizes the tables used in the test.

Table Name | Row Count
Catalog_page | 93,744
Catalog_sales | 23,064,768
Customer_address | 50,000
Customer | 100,000
Date_dim | 73,049
Item | 144,000
Promotion | 2,400
Store_returns | 4,600,224
Store_sales | 46,086,464
Store | 96
Web_returns | 1,148,208
Web_sales | 11,510,144
Web_site | 240

Some tables were modified by ingesting more data than what the TPC-DS schema offers on Amazon Redshift. Data was reinserted on the table to increase the size.

Test results

The following table summarizes our test results.

TEST 1 | Time Consumed | Number of Queries | Cost | Max Scaled RPU | Performance
Single: 96 RPUs | 0:02:06 | 2,100 | $6 | 279 | Base
Parallel: 3x 32 RPUs | 0:01:06 | 2,100 | $1.20 | 96 | 48.03%
Parallel 1 (32 RPU) | 0:01:03 | 688 | $0.40 | 32 | 50.10%
Parallel 2 (32 RPU) | 0:01:03 | 703 | $0.40 | 32 | 50.13%
Parallel 3 (32 RPU) | 0:01:06 | 709 | $0.40 | 32 | 48.03%

TEST 2 | Time Consumed | Number of Queries | Cost | Max Scaled RPU | Performance
Single: 48 RPUs | 0:01:55 | 2,100 | $3.30 | 168 | Base
Parallel: 3x 16 RPUs | 0:01:47 | 2,100 | $1.90 | 96 | 6.77%
Parallel 1 (16 RPU) | 0:01:47 | 712 | $0.70 | 36 | 6.77%
Parallel 2 (16 RPU) | 0:01:44 | 696 | $0.50 | 25 | 9.13%
Parallel 3 (16 RPU) | 0:01:46 | 692 | $0.70 | 35 | 7.79%

The preceding table shows that in both tests the parallel setup was faster than the single workgroup, at a lower cost. Also, in our tests, the parallel setup in Test 1 (3x 32 RPUs) had double the base capacity of the parallel setup in Test 2 (3x 16 RPUs), yet it cost 36% less and ran 39% faster. Based on these results, we can conclude that for workloads that have high throughput (I/O), low latency, and high concurrency requirements, this architecture is cost-efficient and performant. Refer to the AWS Pricing Calculator for Network Load Balancer and VPC endpoints pricing.
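
To see where these percentages come from, you can recompute them from the elapsed times and costs in the table. The following sketch uses two hypothetical helpers, toSeconds and percentReduction. Because the table reports whole seconds, the recomputed values differ slightly from the published ones, which presumably were calculated from more precise timings.

```typescript
// Convert an "h:mm:ss" string from the results table into seconds.
function toSeconds(hms: string): number {
  const [h = 0, m = 0, s = 0] = hms.split(":").map(Number);
  return h * 3600 + m * 60 + s;
}

// Percentage by which `candidate` is lower than `base` (time or cost).
function percentReduction(base: number, candidate: number): number {
  return ((base - candidate) / base) * 100;
}

// Test 1: single 96 RPUs vs. parallel 3x 32 RPUs (about 47.6% less time).
const test1Time = percentReduction(toSeconds("0:02:06"), toSeconds("0:01:06"));

// Test 2: single 48 RPUs vs. parallel 3x 16 RPUs (about 7.0% less time).
const test2Time = percentReduction(toSeconds("0:01:55"), toSeconds("0:01:47"));

// Parallel Test 1 vs. parallel Test 2: about 36.8% lower cost
// and about 38.3% less time.
const crossCost = percentReduction(1.9, 1.2);
const crossTime = percentReduction(toSeconds("0:01:47"), toSeconds("0:01:06"));
```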

Redshift Serverless automatically scales the capacity to deliver optimal performance during periods of peak workloads including spikes in concurrency of the workload. This is evident from the maximum scaled RPU results in the preceding table.

Recently released features of Redshift Serverless such as MaxRPU and AI-driven scaling were not used for this test. These new features can increase the price-performance of the workload even further.

We recommend enabling cross-zone load balancing on the Network Load Balancer because it distributes requests from clients to registered targets. Enabling cross-zone load balancing will help balance the requests among the Redshift Serverless managed VPC endpoints irrespective of the Availability Zone they are configured in. Also, if the Network Load Balancer receives traffic from only one server (same IP), you should always use an odd number of Redshift Serverless managed VPC endpoints behind the Network Load Balancer.

Conclusion

In this post, we discussed a scalable architecture that increases the throughput of Redshift Serverless in low latency, high concurrency scenarios. Having multiple Redshift Serverless workgroups behind a Network Load Balancer can deliver a horizontally scalable solution at the best price-performance.

Additionally, Redshift Serverless uses AI techniques (currently in preview) to scale automatically with workload changes across all key dimensions—such as data volume changes, concurrent users, and query complexity—to meet and maintain your price-performance targets.

We hope this post provides you with valuable guidance. We welcome any thoughts or questions in the comments section.


About the Authors

Ricardo Serafim is a Senior Analytics Specialist Solutions Architect at AWS.

Harshida Patel is an Analytics Specialist Principal Solutions Architect with AWS.

Urvish Shah is a Senior Database Engineer at Amazon Redshift. He has more than a decade of experience working on databases, data warehousing, and analytics. Outside of work, he enjoys cooking, travelling, and spending time with his daughter.

Amol Gaikaiwari is a Sr. Redshift Specialist focused on helping customers realize their business outcomes with optimal Redshift price-performance. He loves to simplify data pipelines and enhance capabilities through adoption of the latest Redshift features.

How to enforce a security baseline for an AWS WAF ACL across your organization using AWS Firewall Manager

Post Syndicated from Omner Barajas original https://aws.amazon.com/blogs/security/how-to-enforce-a-security-baseline-for-an-aws-waf-acl-across-your-organization-using-aws-firewall-manager/

Most organizations prioritize protecting their web applications that are exposed to the internet. Using the AWS WAF service, you can create rules to control bot traffic, help prevent account takeover fraud, and block common threat patterns such as SQL injection or cross-site scripting (XSS). Further, for those customers managing multi-account environments, it is possible to enforce security baselines for AWS WAF access control lists (ACLs) across the whole organization by using AWS Firewall Manager.

In a previous AWS Security Blog post, there is a good explanation about how to create Firewall Manager policies to deploy AWS WAF ACLs across multiple accounts. In addition, this AWS Architecture Blog post goes deeper, describing operating models for web applications security governance in Amazon Web Services (AWS). This post will show, in a central or hybrid operating model, how to create a policy to enforce a security baseline in your AWS WAF ACLs while still allowing application administrators or developers to apply specific ACL rules for their particular use case.

Centrally manage firewall policies

It’s a common scenario that a security team in an organization wants to implement a security baseline, consisting of a set of rules, across multiple applications that are distributed in multiple accounts. Those rules are not always applicable for all workloads because different applications might have different needs for protection or exposure to the public. Furthermore, sometimes local teams responsible for managing applications have permissions to create their own rules and decide not to follow policies mandated by the organization.

AWS Firewall Manager solves this problem by allowing you to centrally configure and manage firewall policies, deploy preconfigured AWS WAF rules across your organization, and automatically enforce them in existing and newly created resources.

The following architecture diagram describes how you can design a Firewall Manager policy from a central security account, establishing a security baseline that will be enforced within other member accounts in your organization. To do so, you create a managed AWS WAF ACL in which the first and last rule groups are not editable, while a custom rule group in between can be modified by administrators of member accounts.

Figure 1: AWS Firewall Manager enforcing security baseline for AWS WAF

Firewall Manager delegated administrators

At the time of writing this post, Firewall Manager supports up to 10 administrators who can manage firewall resources in your organization by applying scope conditions. For example, you can define an administrator for specific accounts or even a complete organizational unit (OU), AWS Region, or policy type. Using this feature, you can enforce the principle of least privilege access, in addition to assigning administrators to enforce security baselines for your AWS WAF ACL rules across your organization in a more granular way. This delegation needs to be completed from the AWS Organizations management account, as shown in Figure 2.

Figure 2: AWS Firewall Manager administrator account delegation

Firewall Manager policies

A Firewall Manager policy contains the rule groups that will be applied to your protected resources. The service creates a web ACL in each account where the policy is enforced. Account administrators can add rules or rule groups to the resulting web ACL in addition to the rule groups defined by the Firewall Manager policy.

Rule groups

AWS WAF ACLs that are managed by Firewall Manager policies contain three sets of rules that provide a higher level of prioritization in the ACL. AWS WAF evaluates rule groups in the following order:

  1. Rule groups that are defined in the Firewall Manager policy with the highest priority
  2. Rules that are defined by the account administrator in the web ACL after the first rule group
  3. Rule groups that are defined in Firewall Manager to be evaluated at the end

Within each rule set, AWS WAF evaluates rules according to their priority settings, starting from the lowest number, until it either finds a match that terminates the evaluation or exhausts all of the rules.
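
The evaluation order described above can be sketched as a small model. This is an illustration only, not AWS WAF's implementation: the types and the evaluateWebAcl function are hypothetical, and the model covers only the Allow and Block terminating actions plus the non-terminating Count action. A request that matches no terminating rule falls through to the web ACL's default action.

```typescript
type TerminatingAction = "ALLOW" | "BLOCK";
type RuleAction = TerminatingAction | "COUNT";

interface WebRequest {
  sourceIp: string;
  path: string;
}

interface Rule {
  name: string;
  priority: number; // lower numbers are evaluated first
  action: RuleAction;
  matches: (request: WebRequest) => boolean;
}

interface Verdict {
  action: TerminatingAction;
  decidedBy: string;
  counted: string[]; // COUNT rules that matched along the way
}

function evaluateWebAcl(
  firstGroup: Rule[], // from the Firewall Manager policy
  accountRules: Rule[], // added by the account administrator
  lastGroup: Rule[], // from the Firewall Manager policy
  request: WebRequest,
  defaultAction: TerminatingAction
): Verdict {
  const counted: string[] = [];
  for (const ruleSet of [firstGroup, accountRules, lastGroup]) {
    // Within a rule set, evaluate in ascending priority order.
    const ordered = ruleSet.slice().sort((a, b) => a.priority - b.priority);
    for (const rule of ordered) {
      if (!rule.matches(request)) continue;
      if (rule.action === "COUNT") {
        counted.push(rule.name); // non-terminating: keep evaluating
        continue;
      }
      return { action: rule.action, decidedBy: rule.name, counted };
    }
  }
  return { action: defaultAction, decidedBy: "default action", counted };
}
```

In this model, a Block match in the first rule group ends evaluation before any account-level rule is considered, while an Allow or Block match at the account level prevents the last rule group from being evaluated for that request.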

Security baseline policy

Figure 3 shows an example of a Firewall Manager policy that will serve as the security web ACL baseline across your organization. This policy should be created in a delegated administrator account and enforced across all or specific accounts in your organization where the administrator has permissions. Refer to the service documentation for additional guidance on setting up this type of policy.

Figure 3: AWS Firewall Manager policy rules acting as the security baseline

First rule group

The first rule group in the policy will contain the following:

  • Organization-level blocked list – Known bad IP addresses by organization.
  • AWS IP reputation list – Recommended AWS managed rules for IP addresses with a bad reputation.
  • AWS Anonymous IP list – Recommended AWS managed rules for anonymous IP addresses.
  • Organization-level rate limit – A high-level rate limit defined by the organization.

Last rule group

The last rule group in the policy will contain the following:

  • Organization-level allowed list – Even if these are well-known IP addresses, they still need to be evaluated against the set of rules enforced by the organization and specific rules per application. If a “good” IP address is spoofed, it might hide the real source identity, bypassing AWS WAF rules.
  • AWS bot control – Recommended if you want to enforce bot control across your organization or a set of accounts managed by an administrator.

This configuration will allow individual account administrators to define and include their own rules to protect applications based on specific use cases and the expected number of requests.

When designing your own security baselines, take into consideration that some managed rules, such as bot control, might have additional cost, and enforcing them across your organization would increase the overall cost of the service.

Policy scope

The policy scope for your security baseline defines where the policy applies. It can apply to all accounts and resources in your organization or just a subset of accounts and resources. Based on the settings selected, Firewall Manager applies the policy to the accounts in scope by using one of the following options:

  1. All accounts in your organization
  2. Only a specific list of accounts and organizational units
  3. All accounts and OUs except a specific list of those to exclude

On the other hand, when selecting the scope for resources, you can use the following options:

  1. All resources
  2. Resources that have all of the specified tags
  3. All resources except those that have all the specified tags

For delegated administrators, scope definition will apply only for accounts, Regions, or OUs defined during the delegation process. Figure 4 shows an example of the scope definition for a policy.

Figure 4: Firewall Manager scope definition
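
The account and resource scope options can be read as two simple predicates. The sketch below is a hypothetical model, not the Firewall Manager API: the type and function names are illustrative, each account belongs to a single OU (nested OUs are ignored), and a tag matches only when both key and value are equal.

```typescript
interface Account {
  id: string;
  ouId: string;
}

type AccountScope =
  | { kind: "all" }
  | { kind: "include"; accountsOrOus: string[] }
  | { kind: "exclude"; accountsOrOus: string[] };

type Tags = { [key: string]: string };

type ResourceScope =
  | { kind: "all" }
  | { kind: "withAllTags"; tags: Tags }
  | { kind: "withoutAllTags"; tags: Tags };

function isListed(account: Account, list: string[]): boolean {
  return list.indexOf(account.id) !== -1 || list.indexOf(account.ouId) !== -1;
}

function accountInScope(scope: AccountScope, account: Account): boolean {
  if (scope.kind === "all") return true;
  const listed = isListed(account, scope.accountsOrOus);
  return scope.kind === "include" ? listed : !listed;
}

// True only when the resource carries every specified tag.
function hasAllTags(resourceTags: Tags, required: Tags): boolean {
  return Object.keys(required).every(
    (key) => resourceTags[key] === required[key]
  );
}

function resourceInScope(scope: ResourceScope, resourceTags: Tags): boolean {
  if (scope.kind === "all") return true;
  const tagged = hasAllTags(resourceTags, scope.tags);
  return scope.kind === "withAllTags" ? tagged : !tagged;
}
```

A resource is covered by the policy only when its account passes accountInScope and the resource itself passes resourceInScope. Note that a resource with only some of the specified tags is out of scope under the second option and in scope under the third.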

Use case–specific rule groups

Figure 5 is an example of a specific use case, where AWS WAF administrators in a member account within the Firewall Manager policy scope want to protect their web application by using the following rules.

Figure 5: Web ACL managed by Firewall Manager containing rules in a member account

Middle rule group

The middle rule group is configured in each account within the ACL deployed by Firewall Manager. The examples from Figure 5 are rules oriented to apply protection that is specific for the application where the ACL is assigned:

  • App-level blocked list – Known IP addresses blocked by the administrator.
  • App-level rate limit – The rate limit supported by the application.
  • Core rule set – The recommended rule set, focused on OWASP Top Ten vulnerabilities.
  • Technology-specific protection – An example for PHP applications.
  • App-level allowed list – Well-known IP addresses that still need to be evaluated against some rules but bypass others, such as fraud prevention.
  • Account takeover prevention – This managed rule needs specific configuration per application to work as expected. However, it is recommended that you use it after the bot control managed rule to optimize cost. Take that into consideration when building your own security baseline.

This rule group is evaluated second, between the first and last rule groups that come from the Firewall Manager policy. This configuration gives account administrators the ability to design rules that cover the specific use case for their application, and to take precedence over rules that are evaluated later (the last rule group). For example, setting a higher rate limit in the app-level rule than in the org-level rule has no impact on the traffic being filtered, because the org-level rule in the first group of the policy is evaluated first. However, more granular bot control rules at the app level will supersede the org-level rules contained in the last group of the policy. Take that logic into consideration when you decide which rules need to be in the first and last groups of your Firewall Manager policies.
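
The rate limit example can be made concrete with a short sketch. It assumes, for simplicity, that both rate-based rules block, aggregate requests on the same key (such as source IP), and use the same time window; the function names and limits are hypothetical.

```typescript
// Which rate-based rule, if any, blocks a client that has sent
// `requestsInWindow` requests? The org-level rule sits in the first
// rule group, so it is checked before the app-level rule.
function blockingRateRule(
  requestsInWindow: number,
  orgLimit: number,
  appLimit: number
): "org" | "app" | null {
  if (requestsInWindow > orgLimit) return "org";
  if (requestsInWindow > appLimit) return "app";
  return null;
}

// The limit a client actually experiences is the stricter of the two.
function effectiveLimit(orgLimit: number, appLimit: number): number {
  return Math.min(orgLimit, appLimit);
}
```

With an org-level limit of 1,000 and an app-level limit of 5,000, a client sending 1,500 requests is blocked by the org-level rule, and the app-level rule never comes into play. An app-level limit of 500 does take effect, because it is stricter than the baseline.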

Recommended approach for testing

Before you deploy your web ACL implementation for production, test and tune it in a staging or testing environment until you are comfortable with the potential impact on your traffic. Then, test and tune the rules in count mode with your production traffic before enabling them.

  1. Prepare the environment for testing:
    1. Enable logging and web request sampling for your ACL.
    2. Set the protection to count mode.
    3. Associate the ACL with a resource.
  2. Monitor and tune in the test environment:
    1. Monitor traffic and rules matching by using logs, metrics, the dashboard, or sampled requests.
    2. Tune the rules to mitigate false positives, for example by adjusting match conditions, adding scope-down statements, and using label match rules.
  3. Enable protection in production:
    1. Remove any additional rules that are no longer needed.
    2. Enable rules in production accounts.
    3. Closely monitor your application behavior to be sure requests are being handled as expected.

Cleanup

To avoid unexpected charges in your accounts, delete any unnecessary policies and resources. You can do that from the console by following these steps.

  1. On the Firewall Manager policies page, choose the radio button next to the policy name, and then choose Delete.
  2. In the Delete confirmation box, select Delete all policy resources, and then choose Delete again.

Firewall Manager removes the policy and any associated resources, like web ACLs, that it created in your account. The changes might take a few minutes to propagate to all accounts.

Conclusion

By using Firewall Manager, you can take advantage of native cloud features to enforce security baseline configurations for your AWS WAF rules in a multi-account environment across your organization. It is possible to centrally design policies with broad rule groups to protect workloads from a high-level perspective while allowing application administrators to design custom rules to protect, for instance, web applications from specific use cases such as OWASP Top Ten or technology-related vulnerabilities.

The examples provided in this post can be further customized and adapted to align with your organization’s needs. Design policies to comply with security requirements and specific use cases to protect your workloads.

If you want to learn more, visit the Automations for AWS Firewall Manager webpage, which provides a solution with preset rules to create a quick security baseline to protect against distributed denial of service (DDoS).

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on X.

Omner Barajas

Omner is a senior security specialist solutions architect based in Mexico, supporting customers in LATAM. He usually collaborates with account teams to help clients accelerate cloud adoption and improve security posture for their workloads, resolving complex technical challenges related to cybersecurity and compliance with international standards and regulations.

Governing data in relational databases using Amazon DataZone

Post Syndicated from Jose Romero original https://aws.amazon.com/blogs/big-data/governing-data-in-relational-databases-using-amazon-datazone/

Data governance is a key enabler for teams adopting a data-driven culture and operational model to drive innovation with data. Amazon DataZone is a fully managed data management service that makes it faster and easier for customers to catalog, discover, share, and govern data stored across Amazon Web Services (AWS), on premises, and on third-party sources. It also makes it easier for engineers, data scientists, product managers, analysts, and business users to access data throughout an organization to discover, use, and collaborate to derive data-driven insights.

Amazon DataZone allows you to simply and securely govern end-to-end data assets stored in your Amazon Redshift data warehouses or data lakes cataloged with the AWS Glue data catalog. As you experience the benefits of consolidating your data governance strategy on top of Amazon DataZone, you may want to extend its coverage to new, diverse data repositories (either self-managed or as managed services) including relational databases, third-party data warehouses, analytic platforms and more.

This post explains how you can extend the governance capabilities of Amazon DataZone to data assets hosted in relational databases based on MySQL, PostgreSQL, Oracle, or SQL Server engines. What’s covered in this post is already implemented and available in the Guidance for Connecting Data Products with Amazon DataZone solution, published in the AWS Solutions Library. This solution was built using the AWS Cloud Development Kit (AWS CDK) and was designed to be easy to set up in any AWS environment. It is based on a serverless stack for cost-effectiveness and simplicity and follows the best practices in the AWS Well-Architected Framework.

Self-service analytics experience in Amazon DataZone

In Amazon DataZone, data producers populate the business data catalog with data assets from data sources such as the AWS Glue data catalog and Amazon Redshift. They also enrich their assets with business context to make them accessible to the consumers.

After the data asset is available in the Amazon DataZone business catalog, data consumers such as analysts and data scientists can search and access this data by requesting subscriptions. When the request is approved, Amazon DataZone can automatically provision access to the managed data asset by managing permissions in AWS Lake Formation or Amazon Redshift so that the data consumer can start querying the data using tools such as Amazon Athena or Amazon Redshift. Note that a managed data asset is an asset for which Amazon DataZone can manage permissions. It includes those stored in Amazon Simple Storage Service (Amazon S3) data lakes (and cataloged in the AWS Glue data catalog) or Amazon Redshift.

As you’ll see next, when working with relational databases, most of the experience described above will remain the same because Amazon DataZone provides a set of features and integrations that data producers and consumers can use with a consistent experience, even when working with additional data sources. However, there are some additional tasks that need to be accounted for to achieve a frictionless experience, which will be addressed later in this post.

The following diagram illustrates a high-level overview of the flow of actions when a data producer and consumer collaborate around a data asset stored in a relational database using Amazon DataZone.

Figure 1: Flow of actions for self-service analytics around data assets stored in relational databases

First, the data producer needs to capture and catalog the technical metadata of the data asset.

The AWS Glue data catalog can be used to store metadata from a variety of data assets, like those stored in relational databases, including their schema, connection details, and more. It offers AWS Glue connections and AWS Glue crawlers as a means to capture the data asset’s metadata easily from their source database and keep it up to date. Later in this post, we’ll introduce how the “Guidance for Connecting Data Products with Amazon DataZone” solution can help data producers easily deploy and run AWS Glue connections and crawlers to capture technical metadata.

Second, the data producer needs to consolidate the data asset’s metadata in the business catalog and enrich it with business metadata. The producer also needs to manage and publish the data asset so it’s discoverable throughout the organization.

Amazon DataZone provides built-in data sources that allow you to easily fetch metadata (such as table name, column name, or data types) of assets in the AWS Glue data catalog into Amazon DataZone’s business catalog. You can also include data quality details thanks to the integration with AWS Glue Data Quality or external data quality solutions. Amazon DataZone also provides metadata forms and generative artificial intelligence (generative AI) driven suggestions to simplify the enrichment of data assets’ metadata with business context. Finally, the Amazon DataZone data portal helps you manage and publish your data assets.

Third, a data consumer needs to subscribe to the data asset published by the producer. To do so, the data consumer will submit a subscription request that, once approved by the producer, triggers a mechanism that automatically provisions read access to the consumer without moving or duplicating data.

In Amazon DataZone, data assets stored in relational databases are considered unmanaged data assets, which means that Amazon DataZone will not be able to manage permissions to them on the customer’s behalf. This is where the “Guidance for Connecting Data Products with Amazon DataZone” solution also comes in handy because it deploys the required mechanism to provision access automatically when subscriptions are approved. You’ll learn how the solution does this later in this post.

Finally, the data consumer needs to access the subscribed data once access has been provisioned. Depending on the use case, consumers would like to use SQL-based engines to run exploratory analysis, business intelligence (BI) tools to build dashboards for decision-making, or data science tools for machine learning (ML) development.

Amazon DataZone provides blueprints to give options for consuming data and provides default ones for Amazon Athena and Amazon Redshift, with more to come soon. Amazon Athena connectors are a good way to run one-time queries on top of relational databases. Later in this post we’ll introduce how the “Guidance for Connecting Data Products with Amazon DataZone” solution can help data consumers deploy Amazon Athena connectors and become a platform to deploy custom tools for data consumers.

Solution’s core components

Now that we have covered what the self-service analytics experience looks like when working with data assets stored in relational databases, let’s review at a high level the core components of the “Guidance for Connecting Data Products with Amazon DataZone” solution.

You’ll be able to identify where some of the core components fit in the flow of actions described in the last section because they were developed to bring simplicity and automation for a frictionless experience. Other components, even though they are not directly tied to the experience, are as relevant since they take care of the prerequisites for the solution to work properly.

Figure 2: Solution’s core components

  1. The toolkit component is a set of tools (in AWS Service Catalog) that producer and consumer teams can easily deploy and use, in a self-service fashion, to support some of the tasks described in the experience, such as the following.
    1. As a data producer, capture metadata from data assets stored in relational databases into the AWS Glue data catalog by leveraging AWS Glue connections and crawlers.
    2. As a data consumer, query a subscribed data asset directly from its source database with Amazon Athena by deploying and using an Amazon Athena connector.
  2. The workflows component is a set of automated workflows (orchestrated through AWS Step Functions) that will trigger automatically on certain Amazon DataZone events such as:
    1. When a new Amazon DataZone data lake environment is successfully deployed so that its default capabilities are extended to support this solution’s toolkit.
    2. When a subscription request is accepted by a data producer so that access is provisioned automatically for data assets stored in relational databases. This workflow is the mechanism that was referred to in the experience of the last section as the means to provision access to unmanaged data assets governed by Amazon DataZone.
    3. When a subscription is revoked or canceled so that access is revoked automatically for data assets in relational databases.
    4. When an existing Amazon DataZone environment deletion starts so that non-default Amazon DataZone capabilities are removed.
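
The routing idea behind the workflows component can be sketched as a lookup from an incoming event to the workflow that should start. This is a minimal illustration, not the solution's code: the `aws.datazone` source value, the detail-type strings, and the workflow names are all assumptions chosen for readability.

```python
from typing import Optional

# Hypothetical mapping from event detail-type to workflow name.
# Both the detail-type strings and the workflow names are placeholders.
WORKFLOWS = {
    "Environment Deployment Completed": "extend-environment",
    "Subscription Request Accepted": "grant-subscription",
    "Subscription Revoked": "revoke-subscription",
    "Subscription Cancelled": "revoke-subscription",
    "Environment Deletion Started": "clean-environment",
}


def select_workflow(event: dict) -> Optional[str]:
    """Return the workflow to start for an event, or None to ignore it."""
    if event.get("source") != "aws.datazone":  # assumed source value
        return None
    return WORKFLOWS.get(event.get("detail-type"))


accepted = {"source": "aws.datazone", "detail-type": "Subscription Request Accepted"}
unrelated = {"source": "aws.s3", "detail-type": "Object Created"}
print(select_workflow(accepted))
print(select_workflow(unrelated))
```

In a deployed setup this selection would more likely live in event rules than in application code; the function form is used here only to make the mapping explicit.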

The following table lists the multiple AWS services that the solution uses to provide an add-on for Amazon DataZone with the purpose of providing the core components described in this section.

AWS Service | Description
Amazon DataZone | Data governance service whose capabilities are extended when deploying this add-on solution.
Amazon EventBridge | Used as a mechanism to capture Amazon DataZone events and trigger the solution’s corresponding workflow.
AWS Step Functions | Used as the orchestration engine to execute solution workflows.
AWS Lambda | Provides logic for the workflow tasks, such as extending an environment’s capabilities or sharing secrets with environment credentials.
AWS Secrets Manager | Used to store database credentials as secrets. Each consumer environment with a granted subscription to one or many data assets in the same relational database will have its own individual credentials (secret).
Amazon DynamoDB | Used to store workflows’ output metadata. Governance teams can track subscription details for data assets stored in relational databases.
AWS Service Catalog | Used to provide a complementary toolkit for users (producers and consumers), so that they can provision products to execute tasks specific to their roles in a self-service manner.
AWS Glue | Multiple components are used, such as the AWS Glue data catalog as the direct publishing source for the Amazon DataZone business catalog, and connectors and crawlers to connect to and infer schemas from data assets stored in relational databases.
Amazon Athena | Used as one of the consumption mechanisms that allow users and teams to query data assets that they are subscribed to, on top of either Amazon S3 backed data lakes or relational databases.

Solution overview

Now let’s dive into the workflow that automatically provisions access to an approved subscription request (2b in the last section). Figure 3 outlines the AWS services involved in its execution. It also illustrates when the solution’s toolkit is used to simplify some of the tasks that producers and consumers need to perform before and after a subscription is requested and granted. If you’d like to learn more about other workflows in this solution, please refer to the implementation guide.

The architecture illustrates how the solution works in a multi-account environment, which is a common scenario. In a multi-account environment, the governance account will host the Amazon DataZone domain and the remaining accounts will be associated to it. The producer account hosts the subscription’s data asset and the consumer account hosts the environment subscribing to the data asset.

Figure 3: Architecture for subscription grant workflow

Solution walkthrough

1. Capture data asset’s metadata

A data producer captures metadata of a data asset to be published from its data source into the AWS Glue catalog. This can be done by using AWS Glue connections and crawlers. To speed up the process, the solution includes a Producer Toolkit using the AWS Service Catalog to simplify the deployment of such resources by just filling out a form.

Once the data asset’s technical metadata is captured, the data producer will run a data source job in Amazon DataZone to publish it into the business catalog. In the Amazon DataZone portal, a consumer will discover the data asset and subsequently subscribe to it when needed. Any subscription action will create a subscription request in Amazon DataZone.

2. Approve a subscription request

The data producer approves the incoming subscription request. An event is sent to Amazon EventBridge, where a rule deployed by the solution captures it and triggers an instance of the AWS Step Functions primary state machine in the governance account for each environment of the subscribing project.

3. Fulfill read-access in the relational database (producer account)

The primary state machine in the governance account triggers an instance of the AWS Step Functions secondary state machine in the producer account, which will run a set of AWS Lambda functions to:

  1. Retrieve the subscription data asset’s metadata from the AWS Glue catalog, including the details required for connecting to the data source hosting the subscription’s data asset.
  2. Connect to the data source hosting the subscription’s data asset, create credentials for the subscription’s target environment (if nonexistent) and grant read access to the subscription’s data asset.
  3. Store the new data source credentials in an AWS Secrets Manager producer secret (if nonexistent) with a resource policy allowing read cross-account access to the environment’s associated consumer account.
  4. Update tracking records in Amazon DynamoDB in the governance account.
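
Step 3 above relies on a resource policy that lets the consumer account read the producer secret. A minimal sketch of what such a policy document could look like follows. The function name and the sample account ID are hypothetical, and the solution's actual policy may be scoped differently (for example, to a specific role rather than the account root).

```python
import json
import re


def cross_account_read_policy(consumer_account_id: str) -> str:
    """Build a resource policy that lets one account read a secret's value."""
    if not re.fullmatch(r"[0-9]{12}", consumer_account_id):
        raise ValueError("AWS account IDs are 12-digit numbers")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{consumer_account_id}:root"},
                "Action": "secretsmanager:GetSecretValue",
                # In a policy attached to a secret, "*" refers to that secret.
                "Resource": "*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Hypothetical consumer account ID, for illustration only.
policy_json = cross_account_read_policy("111122223333")
print(policy_json)
```

Cross-account reads also depend on the KMS key that encrypts the secret being usable by the consumer account; that part is outside this sketch.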

4. Share access credentials to the subscribing environment (consumer account)

The primary state machine in the governance account triggers an instance of the AWS Step Functions secondary state machine in the consumer account, which will run a set of AWS Lambda functions to:

  1. Retrieve connection credentials from the producer secret in the producer account through cross-account access, then copy the credentials into a new consumer secret (if nonexistent) in AWS Secrets Manager local to the consumer account.
  2. Update tracking records in Amazon DynamoDB in the governance account.

5. Access the subscribed data

The data consumer uses the consumer secret to connect to that data source and query the subscribed data asset using any preferred means.

To speed up the process, the solution includes a consumer toolkit using the AWS Service Catalog to simplify the deployment of such resources by just filling out a form. The current scope for this toolkit includes a tool that deploys an Amazon Athena connector for a corresponding MySQL, PostgreSQL, Oracle, or SQL Server data source. However, it could be extended to support other tools on top of AWS Glue, Amazon EMR, Amazon SageMaker, Amazon QuickSight, or other AWS services, and keep the same simple-to-deploy experience.

Conclusion

In this post we went through how teams can extend the governance of Amazon DataZone to cover relational databases, including those with MySQL, PostgreSQL, Oracle, and SQL Server engines. Now, teams are one step further in unifying their data governance strategy in Amazon DataZone to deliver self-service analytics across their organizations for all of their data.

As a final thought, the solution explained in this post introduces a replicable pattern that can be extended to other relational databases. The pattern is based on access grants through environment-specific credentials that are shared as secrets in AWS Secrets Manager. For data sources with different authentication and authorization methods, the solution can be extended to provide the required means to grant access to them (such as through AWS Identity and Access Management (IAM) roles and policies). We encourage teams to experiment with this approach as well.

How to get started

With the “Guidance for Connecting Data Products with Amazon DataZone” solution, you have multiple resources to learn more, test it, and make it your own.

You can learn more on the AWS Solutions Library solutions page. You can download the source code from GitHub and follow the README file to learn more of its underlying components and how to set it up and deploy it in a single or multi-account environment. You can also use it to learn how to think of costs when using the solution. Finally, it explains how best practices from the AWS Well-Architected Framework were included in the solution.

You can follow the solution’s hands-on lab either with the help of the AWS Solutions Architect team or on your own. The lab will take you through the entire workflow described in this post for each of the supported database engines (MySQL, PostgreSQL, Oracle, and SQL Server). We encourage you to start here before trying the solution in your own testing environments and your own sample datasets. Once you have full clarity on how to set up and use the solution, you can test it with your workloads and even customize it to make it your own.

The implementation guide is an asset for customers eager to customize or extend the solution to their specific challenges and needs. It provides an in-depth description of the code repository structure and the solution’s underlying components, as well as all the details to understand the mechanisms used to track all subscriptions handled by the solution.


About the authors

Jose Romero is a Senior Solutions Architect for Startups at AWS, based in Austin, TX, US. He is passionate about helping customers architect modern platforms at scale for data, AI, and ML. As a former senior architect with AWS Professional Services, he enjoys building and sharing solutions for common complex problems so that customers can accelerate their cloud journey and adopt best practices. Connect with him on LinkedIn.

Leonardo Gómez is a Principal Big Data / ETL Solutions Architect at AWS, based in Florida, US. He has over a decade of experience in data management, helping customers around the globe address their business and technical needs. Connect with him on LinkedIn.

Simplify Amazon EKS Deployments with GitHub Actions and AWS CodeBuild

Post Syndicated from Deepak Kovvuri original https://aws.amazon.com/blogs/devops/simplify-amazon-eks-deployments-with-github-actions-and-aws-codebuild/

In this blog post, we will explore how to simplify Amazon EKS deployments with GitHub Actions and AWS CodeBuild. In today’s fast-paced digital landscape, organizations are turning to DevOps practices to drive innovation and streamline their software development and infrastructure management processes. One key practice within DevOps is Continuous Integration and Continuous Delivery (CI/CD), which automates deployment activities to reduce the time it takes to release new software updates. AWS offers a suite of native tools to support CI/CD, but also allows for flexibility and customization through integration with third-party tools.

Throughout this post, you will learn how to use GitHub Actions to create a CI/CD workflow with AWS CodeBuild and AWS CodePipeline. You’ll leverage the capabilities of GitHub Actions from a vast selection of pre-written actions in the GitHub Marketplace to build and deploy a Python application to an Amazon Elastic Kubernetes Service (EKS) cluster.

GitHub Actions is a powerful feature on GitHub’s development platform that enables you to automate your software development workflows directly within your repository. With Actions, you can write individual tasks to build, test, package, release, or deploy your code, and then combine them into custom workflows to streamline your development process.

Solution Overview

The solution proposed in this post uses several AWS developer tools to establish a CI/CD pipeline while ensuring a streamlined path from development to deployment:

  • AWS CodeBuild: A fully managed build service that compiles source code, runs tests, and produces software packages that are ready to deploy.
  • AWS CodePipeline: A continuous delivery service that orchestrates the build, test, and deploy phases of your release process.
  • Amazon Elastic Kubernetes Service (EKS): A managed service that makes it easy to run Kubernetes on AWS without needing to install and operate your own Kubernetes control plane.
  • AWS CloudFormation: AWS CloudFormation lets you model, provision, and manage AWS and third-party resources by treating infrastructure as code. You’ll use AWS CloudFormation to deploy certain baseline resources required to follow along.
  • Amazon Elastic Container Registry (ECR): A fully managed container registry that makes it easy for developers to store, manage, and deploy Docker container images.

Figure 1 Workflow architecture showing source, build, test, approval and deployment stages

The code’s journey from the developer’s workstation to the final user-facing application is a seamless relay across various AWS services, with key build and deploy operations performed via GitHub Actions:

  1. The developer commits the application’s code to the Source Code Repository. In this post we will leverage a repository created in AWS CodeCommit.
  2. The commit to the Source Control Management (SCM) system triggers the AWS CodePipeline, which is the orchestration service that manages the CI/CD pipeline.
  3. AWS CodePipeline proceeds to the Build stage, where AWS CodeBuild, integrated with GitHub Actions, builds the container image from the committed code.
  4. Once the container image is successfully built, AWS CodeBuild, with GitHub Actions, pushes the image to Amazon Elastic Container Registry (ECR) for storage and versioning.
  5. An Approval Stage is included in the pipeline, which allows the developer to manually review and approve the build artifacts before they are deployed.
  6. After receiving approval, AWS CodePipeline advances to the Deploy Stage, where GitHub Actions are used to run Helm deployment commands.
  7. Within this Deploy Stage, AWS CodeBuild uses GitHub Actions to install the Helm application on Amazon Elastic Kubernetes Service (EKS), leveraging Helm charts for deployment.
  8. The deployed application is now running on Amazon EKS and is accessible via the automatically provisioned Application Load Balancer.

Pre-requisites

If you choose to replicate the steps in this post, you will need the following items:

Utilities like awscli and eksctl require access to your AWS account. Please make sure you have the AWS CLI configured with credentials. For instructions on setting up the AWS CLI, refer to this documentation.

Walkthrough

Deploy Baseline Resources

To get started, you will first deploy an AWS CloudFormation stack that pre-creates some foundational developer resources such as a CodeCommit repository, CodeBuild projects, and a CodePipeline pipeline that orchestrates the release of the application across multiple stages. If you’re interested in learning more about the resources being deployed, you can download the template and review its contents.

Additionally, to make use of GitHub Actions in AWS CodeBuild, it is required to authenticate your AWS CodeBuild project with GitHub using an access token – authentication with GitHub is required to ensure consistent access and avoid being rate-limited by GitHub.

  1. First, let’s set up the environment variables required to configure the infrastructure:
    export CLUSTER_NAME=<cluster-name>
    export AWS_REGION=<cluster-region>
    export AWS_ACCOUNT_ID=<cluster-account>
    export GITHUB_TOKEN=<github-pat>

    In the commands above, replace cluster-name with your EKS cluster name, cluster-region with the AWS region of your EKS cluster, cluster-account with your AWS account ID (12-digit number), and github-pat with your GitHub Personal Access Token (PAT).

  2. Using the AWS CloudFormation template located here, deploy the stack using the AWS CLI:
    aws cloudformation create-stack \
      --stack-name github-actions-demo-base \
      --region $AWS_REGION \
      --template-body file://gha.yaml \
      --parameters ParameterKey=ClusterName,ParameterValue=$CLUSTER_NAME \
                   ParameterKey=RepositoryToken,ParameterValue=$GITHUB_TOKEN \
      --capabilities CAPABILITY_IAM && \
    echo "Waiting for stack to be created..." && \
    aws cloudformation wait stack-create-complete \
      --stack-name github-actions-demo-base \
      --region $AWS_REGION
  3. When you use AWS CodeBuild / GitHub Actions to deploy your application onto Amazon EKS, you’ll need to allow-list the service role associated with the build project(s), either by adding the IAM principal to your cluster’s aws-auth ConfigMap or by using EKS access entries (recommended). The CodeBuild service role was pre-created in the previous step, and you can retrieve the role ARN using the command below:
    aws cloudformation describe-stacks --stack-name github-actions-demo-base \
    --query "Stacks[0].Outputs[?OutputKey=='CodeBuildServiceRole'].OutputValue" \
    --region $AWS_REGION --output text
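
To make the `--query` expression in the last command easier to read, here is a rough Python equivalent of what it selects: the first stack, then the output whose `OutputKey` matches. The response below is a hypothetical, trimmed-down example of the shape returned by `describe-stacks`; the ARN and URL values are placeholders.

```python
def stack_output(response: dict, key: str) -> str:
    """Mimic the --query expression: first stack, outputs filtered by OutputKey."""
    outputs = response["Stacks"][0].get("Outputs", [])
    matches = [o["OutputValue"] for o in outputs if o["OutputKey"] == key]
    if not matches:
        raise KeyError(f"stack has no output named {key!r}")
    return matches[0]


# Hypothetical response, trimmed to the fields the query touches.
sample_response = {
    "Stacks": [
        {
            "StackName": "github-actions-demo-base",
            "Outputs": [
                {"OutputKey": "CodeCommitCloneUrl",
                 "OutputValue": "https://example.invalid/repo"},
                {"OutputKey": "CodeBuildServiceRole",
                 "OutputValue": "arn:aws:iam::111122223333:role/example-role"},
            ],
        }
    ]
}
print(stack_output(sample_response, "CodeBuildServiceRole"))
```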

Clone the CodeCommit Repository

Next, you will create a simple Python Flask application and the associated Helm charts required to deploy the application, and commit them to the source control repository in AWS CodeCommit. Begin by cloning the CodeCommit repository by following the steps below:

  1. Configure your git client to use the AWS CLI CodeCommit credential helper. For UNIX based systems follow instructions here, and for Windows based systems follow instructions here.
  2. Retrieve the repository HTTPS clone URL using the command below:
    export CODECOMMIT_CLONE_URL=$(aws cloudformation describe-stacks \
    --stack-name github-actions-demo-base \
    --query "Stacks[0].Outputs[?OutputKey=='CodeCommitCloneUrl'].OutputValue" \
    --region $AWS_REGION \
    --output text)
  3. Clone and navigate to your repository:
    git clone $CODECOMMIT_CLONE_URL github-actions-demo && cd github-actions-demo

Create the Application

Now that you’ve set up all the required resources, you can begin building your application and its necessary deployment manifests.

  1. Create the app.py file, which serves as the hello world application using the command below:
    cat << EOF >app.py
    from flask import Flask
    app = Flask(__name__)
    
    @app.route('/')
    def demoapp():
      return 'Hello from EKS! This application is built using Github Actions on AWS CodeBuild'
    
    if __name__ == '__main__':
      app.run(port=8080,host='0.0.0.0')
    EOF
  2. Create a Dockerfile in the same directory as the application using the command below:
    cat << EOF > Dockerfile
    FROM public.ecr.aws/docker/library/python:alpine3.18 
    WORKDIR /app 
    RUN pip install Flask 
    RUN apk update && apk upgrade --no-cache 
    COPY app.py . 
    CMD [ "python3", "app.py" ]
    EOF
  3. Initialize the Helm application:
    helm create demo-app
    rm -rf demo-app/templates/*
  4. Create the manifest files required for the deployment accordingly:
    • deployment.yaml – Contains the blueprint for deploying instances of the application. It includes the desired state and pod template which has the pod specifications like the container image to be used, ports etc.
      cat <<EOF > demo-app/templates/deployment.yaml
      ---
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        namespace: {{ default "default" .Values.namespace }}
        name: {{ .Release.Name }}-deployment
      spec:
        selector:
          matchLabels:
            app.kubernetes.io/name: {{ .Release.Name }}
        replicas: 2
        template:
          metadata:
            labels:
              app.kubernetes.io/name: {{ .Release.Name }}
          spec:
            containers:
            - image: {{ .Values.image.repository }}:{{ default "latest" .Values.image.tag }}
              imagePullPolicy: {{ .Values.image.pullPolicy}}
              name: demoapp
              ports:
              - containerPort: 8080
      EOF
    • service.yaml – Describes the service object in Kubernetes and specifies how to access the set of pods running the application. It acts as an internal load balancer to route traffic to pods based on the defined service type (like ClusterIP, NodePort, or LoadBalancer).
      cat <<EOF > demo-app/templates/service.yaml
      ---
      apiVersion: v1
      kind: Service
      metadata:
        namespace: {{ default "default" .Values.namespace }}
        name: {{ .Release.Name }}-service
      spec:
        ports:
          - port: {{ .Values.service.port }}
            targetPort: 8080
            protocol: TCP
        type: {{ .Values.service.type }}
        selector:
          app.kubernetes.io/name: {{ .Release.Name }}
      EOF
    • ingress.yaml – Defines the ingress rules for accessing the application from outside the Kubernetes cluster. This file maps HTTP and HTTPS routes to services within the cluster, allowing external traffic to reach the correct services.
      cat <<EOF > demo-app/templates/ingress.yaml
      ---
      apiVersion: networking.k8s.io/v1
      kind: Ingress
      metadata:
        namespace: {{ default "default" .Values.namespace }}
        name: {{ .Release.Name }}-ingress
        annotations:
          alb.ingress.kubernetes.io/scheme: internet-facing
          alb.ingress.kubernetes.io/target-type: ip
      spec:
        ingressClassName: alb
        rules:
          - http:
              paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: {{ .Release.Name }}-service
                    port:
                      number: 8080
      EOF
    • values.yaml – This file provides the default configuration values for the Helm chart. This file is crucial for customizing the chart to fit different environments or deployment scenarios. The manifest below assumes that the default namespace is configured as the namespace selector for your Fargate profile.
      cat <<EOF > demo-app/values.yaml
      ---
      namespace: default
      replicaCount: 1
      image:
        pullPolicy: IfNotPresent
      service:
        type: NodePort
        port: 8080
      EOF
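
The deployment template builds the container image reference from two values, falling back to `latest` when no tag is supplied. The following sketch mirrors that resolution in Python so the fallback is easy to see. It is an illustration of the template logic only, not how Helm is implemented, and the repository name is a hypothetical placeholder.

```python
def image_reference(values: dict) -> str:
    """Resolve "repository:tag", falling back to "latest" when the tag is empty."""
    image = values.get("image", {})
    repository = image["repository"]  # the template defines no fallback for this field
    tag = image.get("tag") or "latest"
    return f"{repository}:{tag}"


# Hypothetical repository name; values.yaml itself leaves both fields unset.
print(image_reference({"image": {"repository": "example/demo-app"}}))
print(image_reference({"image": {"repository": "example/demo-app", "tag": "abc1234"}}))
```

Because values.yaml sets neither field, the repository has to be supplied at install time, and the tag only resolves to a specific build when it is supplied as well.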

Overview of the CI/CD Pipeline

  • A typical CI/CD pipeline consists of source, build, test, approval, and deploy stages.
  • In this post, AWS CodeBuild is used in the build and deploy stages. AWS CodeBuild uses specification files called buildspecs.
  • A buildspec is a collection of build phases and relevant settings in YAML format that CodeBuild uses to execute a build.

Below you’ll learn how to define your buildspec(s) to build and deploy your application onto Amazon EKS by leveraging the AWS managed GitHub action runner on AWS CodeBuild.

Defining GitHub Actions in AWS CodeBuild

Each phase in a buildspec can contain multiple steps and each step can run commands or run a GitHub Action. Each step runs in its own process and has access to the build filesystem. A step references a GitHub action by specifying the uses directive and optionally the with directive is used to pass arguments required by the action. Alternatively, a step can specify a series of commands using the run directive. It’s worth noting that, because steps run in their own process, changes to environment variables are not preserved between steps.

To pass environment variables between different steps of a build phase, you need to assign the value to an existing or new environment variable and then write it to the GITHUB_ENV environment file. Additionally, these environment variables can be passed across multiple stages in CodePipeline by leveraging the exported variables directive.
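
The environment-file mechanism can be pictured with a small simulation: each step gets its own copy of the environment, and only what the step appends to the file named by GITHUB_ENV is carried forward. This is a simplified model written for illustration, not the behavior of the actual runner. The `run_step` and `set_tag` helpers are hypothetical, and only single-line `NAME=value` entries are handled.

```python
import os
import tempfile


def run_step(step, env: dict, env_file: str) -> dict:
    """Run one step with a copy of env, then merge what it wrote to env_file."""
    step_env = dict(env, GITHUB_ENV=env_file)
    step(step_env)  # changes the step makes to step_env are discarded
    merged = dict(env)
    with open(env_file) as handle:
        for line in handle:
            name, sep, value = line.rstrip("\n").partition("=")
            if sep:
                merged[name] = value
    open(env_file, "w").close()  # the next step starts with an empty file
    return merged


def set_tag(step_env: dict) -> None:
    """Example step: one in-process change (lost) and one file write (kept)."""
    step_env["LOST"] = "only visible inside this step"
    with open(step_env["GITHUB_ENV"], "a") as handle:
        handle.write("IMAGE_TAG=abc1234\n")


with tempfile.TemporaryDirectory() as workdir:
    env_path = os.path.join(workdir, "github_env")
    open(env_path, "w").close()
    env = run_step(set_tag, {"AWS_REGION": "us-east-1"}, env_path)

print(env)
```

The variable set directly inside the step does not survive, while the one written to the file does, which is the distinction the paragraph above describes.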

Build Specification (Build Stage)

Here, you will create a file called buildspec-build.yml at the root of the repository. The following buildspec leverages GitHub Actions in AWS CodeBuild to build the container image and push it to Amazon ECR. The actions used in this buildspec are:

  • aws-actions/configure-aws-credentials: Accessing AWS APIs requires the action to be authenticated using AWS credentials. By default, the permissions granted to the CodeBuild service role can be used to sign API actions executed during a build. However, when using a GitHub action in CodeBuild, the credentials from the CodeBuild service role need to be made available to subsequent actions (e.g., to log in to ECR, push the image). This action allows leveraging the CodeBuild service role credentials for subsequent actions.
  • aws-actions/amazon-ecr-login: Logs into the ECR registry using the credentials from the previous step.
version: 0.2
env:
  exported-variables:
    - IMAGE_REPO
    - IMAGE_TAG
phases:
  build:
    steps:
      - name: Get CodeBuild Region
        run: |
          echo "AWS_REGION=$AWS_REGION" >> $GITHUB_ENV
      - name: "Configure AWS credentials"
        id: creds
        uses: aws-actions/configure-aws-credentials@v3
        with:
          aws-region: ${{ env.AWS_REGION }}
          output-credentials: true
      - name: "Login to Amazon ECR"
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v1
      - name: "Build, tag, and push the image to Amazon ECR"
        run: |
          IMAGE_TAG=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
          docker build -t $IMAGE_REPO:latest .
          docker tag $IMAGE_REPO:latest $IMAGE_REPO:$IMAGE_TAG
          echo "$IMAGE_REPO:$IMAGE_TAG"
          echo "IMAGE_REPO=$IMAGE_REPO" >> $GITHUB_ENV
          echo "IMAGE_TAG=$IMAGE_TAG" >> $GITHUB_ENV
          echo "Pushing image to $IMAGE_REPO"
          docker push $IMAGE_REPO:latest
          docker push $IMAGE_REPO:$IMAGE_TAG

In the buildspec above, the variables IMAGE_REPO and IMAGE_TAG are declared as exported-variables so that they can be used in the subsequent deploy stage.
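
The tagging scheme in the build step takes the first seven characters of the resolved commit ID and pushes the image under both that short tag and `latest`. A minimal sketch of the same derivation follows; the repository URI and commit ID are hypothetical placeholders.

```python
def image_references(repo: str, commit_sha: str) -> list:
    """Return the two image references the build step pushes."""
    short_tag = commit_sha[:7]  # same characters as `cut -c 1-7`
    return [f"{repo}:latest", f"{repo}:{short_tag}"]


# Hypothetical repository URI and commit ID, for illustration only.
refs = image_references(
    "111122223333.dkr.ecr.us-east-1.amazonaws.com/demo-app",
    "0123456789abcdef0123456789abcdef01234567",
)
print(refs)
```

Tagging with the short commit ID keeps each build traceable to a commit, while `latest` always points at the most recent push.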

Build Specification (Deploy Stage)

During the deploy stage, you will use AWS CodeBuild to deploy the Helm chart to Amazon EKS by leveraging the community-provided bitovi/deploy-eks-helm action. Furthermore, the alexellis/arkade-get action is used to install kubectl, which is used later to query the ingress resource and retrieve the application URL.

Create a file called buildspec-deploy.yml at the root of the repository as follows:

version: 0.2
env:
  exported-variables:
   - APP_URL
phases:
  build:
    steps:
      - name: "Get Build Region"
        run: |
          echo "AWS_REGION=$AWS_REGION" >> $GITHUB_ENV        
      - name: "Configure AWS credentials"
        uses: aws-actions/configure-aws-credentials@v3
        with:
          aws-region: ${{ env.AWS_REGION }}
      - name: "Install Kubectl"
        uses: alexellis/arkade-get@23907b6f8cec5667c9a4ef724adea073d677e221
        with:
          kubectl: latest
      - name: "Configure Kubectl"
        run: aws eks update-kubeconfig --name $CLUSTER_NAME
      - name: Deploy Helm
        uses: bitovi/[email protected]
        with:
          aws-region: ${{ env.AWS_REGION }}
          cluster-name: ${{ env.CLUSTER_NAME }}
          config-files: demo-app/values.yaml
          chart-path: demo-app/
          values: image.repository=${{ env.IMAGE_REPO }},image.tag=${{ env.IMAGE_TAG }}
          namespace: default
          name: demo-app
      - name: "Fetch Application URL"
        run: |
          while :; do
            url=$(kubectl get ingress/demo-app-ingress -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' -n default)
            if [ -z "$url" ]; then
              echo "URL is empty, retrying in 5 seconds..."
              sleep 5
            else
              export APP_URL="$url"
              echo "APP_URL set to: $APP_URL"
              break
            fi
          done
          echo "APP_URL=$APP_URL" >> $GITHUB_ENV
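
The "Fetch Application URL" step retries until the ingress reports a hostname, with no upper limit. As a rough illustration of the same polling idea with a bound on the number of attempts, here is a minimal Python sketch. The function names (kubectl_hostname, wait_for_hostname) and the default of 60 attempts are assumptions made for this example and are not part of the original buildspec.

```python
import subprocess
import time
from typing import Callable, Optional


def kubectl_hostname() -> str:
    """Run the same kubectl query as the buildspec step and return its output."""
    result = subprocess.run(
        [
            "kubectl", "get", "ingress/demo-app-ingress", "-n", "default",
            "-o", "jsonpath={.status.loadBalancer.ingress[0].hostname}",
        ],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


def wait_for_hostname(
    fetch: Callable[[], str],
    attempts: int = 60,  # hypothetical limit, not taken from the buildspec
    delay: float = 5.0,  # same 5-second pause as the shell loop
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll fetch() until it returns a non-empty hostname or attempts run out."""
    for _ in range(attempts):
        url = fetch()
        if url:
            return url
        sleep(delay)
    return None


if __name__ == "__main__":
    # Requires kubectl and a configured kubeconfig, as in the deploy stage.
    print(wait_for_hostname(kubectl_hostname))
```

Passing the fetch function as a parameter keeps the retry logic independent of kubectl, so it can be exercised without a cluster.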

At this point, your application should have the following structure:

├── Dockerfile
├── app.py
├── buildspec-build.yml
├── buildspec-deploy.yml
└── demo-app
    ├── Chart.yaml
    ├── charts
    ├── templates
    │   ├── deployment.yaml
    │   ├── ingress.yaml
    │   └── service.yaml
    └── values.yaml

Now check these files in to the remote repository by running the following commands:

git add -A && git commit -m "Initial Commit"
git push --set-upstream origin main

Now, let’s verify the deployment of our application using the load balancer URL. Navigate to the CodePipeline console. The pipeline incorporates a manual approval stage and requires a pipeline operator to review and approve the release to deploy the application. Following this, the URL for the deployed application can be conveniently retrieved from the outputs of the pipeline execution.

Viewing the application

    1. Click the execution ID. This should take you to a detailed overview of the most recent execution.

      Figure 2 - CodePipeline Console showing the pipeline (release) execution ID.

    2. Under the Timeline tab, select the ‘Build’ action for the ‘Deploy’ stage.

      Figure 3 - Navigating to the timeline view and reviewing the details for the deploy stage

    3. Copy the application load balancer URL from the output variables.

      Figure 4 - Copy the APP_URL from the Output Variables for the Deploy action

    4. Paste the URL into a browser of your choice and you should see the message below.

      Figure 5 - Preview of the application deployed on Amazon EKS

You can also review the logs for your build and see the GitHub action at work from the AWS CodeBuild console.

Clean up

To avoid incurring future charges, you should clean up the resources that you created:

      • Delete the application by running helm uninstall. This also removes the ALB that was provisioned:
        helm uninstall demo-app
      • Delete the CloudFormation stack (github-actions-demo-base) by executing the below command
        aws cloudformation delete-stack \
                --stack-name github-actions-demo-base \
                --region $AWS_REGION

Conclusion

In this walkthrough, you have learned how to leverage the powerful combination of GitHub Actions and AWS CodeBuild to simplify and automate the deployment of a Python application on Amazon EKS. This approach not only streamlines your deployment process but also ensures that your application is built and deployed securely. You can extend this pipeline by incorporating additional stages such as testing and security scanning, depending on your project’s needs. Additionally, this solution can be used for other programming languages.

Authors

Deepak Kovvuri

Deepak Kovvuri is a Senior Solutions Architect at AWS supporting Enterprise Customers in the US East area. He has over 6 years of experience in helping customers architect a DevOps strategy for their cloud workloads. Some of the areas Deepak focuses on are CI/CD, Systems Administration, Infrastructure as Code, and Container Services. He holds a Master's in Computer Engineering from University of Illinois at Chicago.

Bharath Gajendran

Bharath Gajendran is a Technical Account Manager at AWS. He works with customers to build highly scalable, cost effective and fault tolerant workloads utilizing AWS. He is passionate about Containers, DevOps, Automation and open-source technologies.

Pawan Shrivastava

Pawan Shrivastava is a Partner Solution Architect at AWS in the WWPS team. He focuses on working with partners to provide technical guidance on AWS, collaborating with them to understand their technical requirements, and designing solutions to meet their specific needs. Pawan is passionate about DevOps, automation, and CI/CD pipelines. He enjoys watching MMA, playing cricket and working out in the gym.

Dive deep into security management: The Data on EKS Platform

Post Syndicated from Yuzhu Xiao original https://aws.amazon.com/blogs/big-data/dive-deep-into-security-management-the-data-on-eks-platform/

Building big data applications on open source software has become increasingly straightforward since the advent of projects like Data on EKS, an open source project from AWS that provides blueprints for building data and machine learning (ML) applications on Amazon Elastic Kubernetes Service (Amazon EKS). In the realm of big data, securing data on cloud applications is crucial. This post explores the deployment of Apache Ranger for permission management within the Hadoop ecosystem on Amazon EKS. We show how Ranger integrates with Hadoop components like Apache Hive, Spark, Trino, Yarn, and HDFS, providing secure and efficient data management in a cloud environment. Join us as we navigate these advanced security strategies in the context of Kubernetes and cloud computing.

Overview of solution

The Amber Group’s Data on EKS Platform (DEP) is a Kubernetes-based, cloud-centered big data platform that revolutionizes the way we handle data in EKS environments. Developed by Amber Group’s Data Team, DEP integrates with familiar components like Apache Hive, Spark, Flink, Trino, HDFS, and more, making it a versatile and comprehensive solution for data management and BI platforms.

The following diagram illustrates the solution architecture.

Effective permission management is crucial for several key reasons:

  • Enhanced security – With proper permission management, sensitive data is only accessible to authorized individuals, thereby safeguarding against unauthorized access and potential security breaches. This is especially important in industries handling large volumes of sensitive or personal data.
  • Operational efficiency – By defining clear user roles and permissions, organizations can streamline workflows and reduce administrative overhead. This system simplifies managing user access, saves time for data security administrators, and minimizes the risk of configuration errors.
  • Scalability and compliance – As businesses grow and evolve, a scalable permission management system helps with smoothly adjusting user roles and access rights. This adaptability is essential for maintaining compliance with various data privacy regulations like GDPR and HIPAA, making sure that the organization’s data practices are legally sound and up to date.
  • Addressing big data challenges – Big data comes with unique challenges, like managing large volumes of rapidly evolving data across multiple platforms. Effective permission management helps tackle these challenges by controlling how data is accessed and used, providing data integrity and minimizing the risk of data breaches.

Apache Ranger is a comprehensive framework designed for data governance and security in Hadoop ecosystems. It provides a centralized framework to define, administer, and manage security policies consistently across various Hadoop components. Ranger specializes in fine-grained access control, offering detailed management of user permissions and auditing capabilities.

Ranger’s architecture is designed to integrate smoothly with various big data tools such as Hadoop, Hive, HBase, and Spark. The key components of Ranger include:

  • Ranger Admin – This is the central component where all security policies are created and managed. It provides a web-based user interface for policy management and an API for programmatic configuration.
  • Ranger UserSync – This service is responsible for syncing user and group information from a directory service like LDAP or AD into Ranger.
  • Ranger plugins – These are installed on each component of the Hadoop ecosystem (like Hive and HBase). Plugins pull policies from the Ranger Admin service and enforce them locally.
  • Ranger Auditing – Ranger captures access audit logs and stores them for compliance and monitoring purposes. It can integrate with external tools for advanced analytics on these audit logs.
  • Ranger Key Management Store (KMS) – Ranger KMS provides encryption and key management, extending Hadoop’s HDFS Transparent Data Encryption (TDE).

The following flowchart illustrates the priority levels for matching policies.


The priority levels are as follows:

  • Deny list takes precedence over allow list
  • Deny list exclude has a higher priority than deny list
  • Allow list exclude has a higher priority than allow list
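
To make the ordering above concrete, here is a minimal Python sketch of how the four lists might interact for a single policy. It is a simplified model and not Ranger's actual implementation: the Policy class and is_allowed function are hypothetical names, only users (not groups, roles, or resources) are modeled, and the deny-by-default fallback at the end is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Policy:
    """Simplified stand-in for one Ranger policy (users only)."""
    allow: Set[str] = field(default_factory=set)
    allow_exclude: Set[str] = field(default_factory=set)
    deny: Set[str] = field(default_factory=set)
    deny_exclude: Set[str] = field(default_factory=set)


def is_allowed(user: str, policy: Policy) -> bool:
    # The deny list is checked first; a deny-exclude entry cancels the deny.
    if user in policy.deny and user not in policy.deny_exclude:
        return False
    # The allow list is checked next; an allow-exclude entry cancels the allow.
    if user in policy.allow and user not in policy.allow_exclude:
        return True
    # No rule matched: this sketch denies by default (an assumption).
    return False
```

For example, a user who appears in both the allow list and the deny list is denied, unless that user is also listed in the deny list exclude.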

Our Amazon EKS-based deployment includes the following components:

  • S3 buckets – We use Amazon Simple Storage Service (Amazon S3) for scalable and durable Hive data storage
  • MySQL database – The database stores Hive metadata, facilitating efficient metadata retrieval and management
  • EKS cluster – The cluster comprises three distinct node groups (platform, Hadoop, and Trino), each tailored for specific operational needs
  • Hadoop cluster applications – These applications include HDFS for distributed storage and YARN for managing cluster resources
  • Trino cluster application – This application enables us to run distributed SQL queries for analytics
  • Apache Ranger – Ranger serves as the central security management tool for access policy across the big data components
  • OpenLDAP – This is integrated as the LDAP service to provide a centralized user information repository, essential for user authentication and authorization
  • Other cloud services resources – Other resources include a dedicated VPC for network security and isolation

By the end of this deployment process, we will have realized the following benefits:

  • A high-performing, scalable big data platform that can handle complex data workflows with ease
  • Enhanced security through centralized management of authentication and authorization, provided by the integration of OpenLDAP and Apache Ranger
  • Cost-effective infrastructure management and operation, thanks to the containerized nature of services on Amazon EKS
  • Compliance with stringent data security and privacy regulations, due to Apache Ranger’s policy enforcement capabilities

Deploy a big data cluster on Amazon EKS and configure Ranger for access control

In this section, we outline the process of deploying a big data cluster on Amazon EKS and configuring Ranger for access control. We use AWS CloudFormation templates for quick deployment of a big data environment on Amazon EKS with Apache Ranger.

Complete the following steps:

  1. Upload the provided template to AWS CloudFormation, configure the stack options, and launch the stack to automate the deployment of the entire infrastructure, including the EKS cluster and Apache Ranger integration.


    After a few minutes, you’ll have a fully functional big data environment with robust security management ready for your analytical workloads, as shown in the following screenshot.

  2. On the AWS Management Console, find the name of your EKS cluster (in this case, dep-demo-eks-cluster-ap-northeast-1). Then update your kubeconfig with your cluster name and check the pod status. For example:
    aws eks update-kubeconfig --name dep-eks-cluster-ap-northeast-1 --region ap-northeast-1
    
    ## Check pod status.
    
    kubectl get pods --namespace hadoop
    
    kubectl get pods --namespace platform
    
    kubectl get pods --namespace trino

  3. After Ranger Admin is successfully forwarded to port 6080 of localhost, go to localhost:6080 in your browser.
  4. Log in with user name admin and the password you entered earlier.

By default, two policies (Hive and Trino) have already been created, granting all access to the LDAP user you created (depadmin in this case).

Also, the LDAP user sync service is set up and will automatically sync all users from the LDAP service created in this template.

Example permission configuration

In a practical application within a company, permissions for tables and fields in the data warehouse are divided based on business departments, isolating sensitive data for different business units. This provides data security and orderly conduct of daily business operations. The following screenshots show an example business configuration.

The following is an example of an Apache Ranger permission configuration.

The following screenshots show users associated with roles.

When performing data queries, using Hive and Spark as examples, we can demonstrate the comparison before and after permission configuration.

The following screenshot shows an example of Hive SQL (running on Superset) with privileges denied.

The following screenshot shows an example of Spark SQL (running in an IDE) with privileges denied.

The following screenshot shows an example of Spark SQL (running in an IDE) with permissions granted.

Based on this example, and taking your enterprise requirements into account, you can manage permissions in the data warehouse flexibly and effectively.

Conclusion

This post provided a comprehensive guide on permission management in big data, particularly within the Amazon EKS platform using Apache Ranger, that equips you with the essential knowledge and tools for robust data security and management. By implementing the strategies and understanding the components detailed in this post, you can effectively manage permissions, implementing data security and compliance in your big data environments.


About the Authors


Yuzhu Xiao is a Senior Data Development Engineer at Amber Group with extensive experience in cloud data platform architecture. He has many years of experience in AWS Cloud platform data architecture and development, primarily focusing on efficiency optimization and cost control of enterprise cloud architectures.


Xin Zhang is an AWS Solutions Architect, responsible for solution consulting and design based on the AWS Cloud platform. He has a rich experience in R&D and architecture practice in the fields of system architecture, data warehousing, and real-time computing.

Use your corporate identities for analytics with Amazon EMR and AWS IAM Identity Center

Post Syndicated from Pradeep Misra original https://aws.amazon.com/blogs/big-data/use-your-corporate-identities-for-analytics-with-amazon-emr-and-aws-iam-identity-center/

To enable your workforce users for analytics with fine-grained data access controls and audit data access, you might have to create multiple AWS Identity and Access Management (IAM) roles with different data permissions and map the workforce users to one of those roles. Multiple users are often mapped to the same role where they need similar privileges to enable data access controls at the corporate user or group level and audit data access.

AWS IAM Identity Center enables centralized management of workforce user access to AWS accounts and applications using a local identity store or by connecting corporate directories via identity providers (IdPs). IAM Identity Center now supports trusted identity propagation, a streamlined experience for users who require access to data with AWS analytics services.

Amazon EMR Studio is an integrated development environment (IDE) that makes it straightforward for data scientists and data engineers to build data engineering and data science applications. With trusted identity propagation, data access management can be based on a user’s corporate identity and can be propagated seamlessly as they access data with single sign-on to build analytics applications with Amazon EMR (EMR Studio and Amazon EMR on EC2).

AWS Lake Formation allows data administrators to centrally govern, secure, and share data for analytics and machine learning (ML). With trusted identity propagation, data administrators can directly provide granular access to corporate users using their identity attributes and simplify the traceability of end-to-end data access across AWS services. Because access is managed based on a user’s corporate identity, they don’t need to use database local user credentials or assume an IAM role to access data.

In this post, we show how to bring your workforce identity to EMR Studio for analytics use cases, directly manage fine-grained permissions for the corporate users and groups using Lake Formation, and audit their data access.

Solution overview

For our use case, we want to enable a data analyst user named analyst1 to use their own enterprise credentials to query data they have been granted permissions to and audit their data access. We use Okta as the IdP for this demonstration. The following diagram illustrates the solution architecture.

This architecture is based on the following components:

  • Okta is responsible for maintaining the corporate user identities, related groups, and user authentication.
  • IAM Identity Center connects Okta users and centrally manages their access across AWS accounts and applications.
  • Lake Formation provides fine-grained access controls on data directly to corporate users using trusted identity propagation.
  • EMR Studio is an IDE for users to build and run applications. It allows users to log in directly with their corporate credentials without signing in to the AWS Management Console.
  • AWS Service Catalog provides a product template to create EMR clusters.
  • EMR cluster is integrated with IAM Identity Center using a security configuration.
  • AWS CloudTrail captures user data access activities.

The following are the high-level steps to implement the solution:

  1. Integrate Okta with IAM Identity Center.
  2. Set up Amazon EMR Studio.
  3. Create an IAM Identity Center enabled security configuration for EMR clusters.
  4. Create a Service Catalog product template to create the EMR clusters.
  5. Use Lake Formation to grant permissions to users to access data.
  6. Test the solution by accessing data with a corporate identity.
  7. Audit user data access.

Prerequisites

You should have the following prerequisites:

Integrate Okta with IAM Identity Center

For more information about configuring Okta with IAM Identity Center, refer to Configure SAML and SCIM with Okta and IAM Identity Center.

For this setup, we have created two users, analyst1 and engineer1, and assigned them to the corresponding Okta application. You can validate the integration is working by navigating to the Users page on the IAM Identity Center console, as shown in the following screenshot. Both enterprise users from Okta are provisioned in IAM Identity Center.

These exact users will not be listed in your account. You can either create similar users or use an existing user.

Each provisioned user in IAM Identity Center has a unique user ID. This ID does not originate from Okta; it’s created in IAM Identity Center to uniquely identify this user. With trusted identity propagation, this user ID will be propagated across services and also used for traceability purposes in CloudTrail. The following screenshot shows the IAM Identity Center user matching the provisioned Okta user analyst1.

Choose the link under AWS access portal URL and log in with the analyst1 Okta user credentials that are already assigned to this application.

If you are able to log in and see the landing page, then all your configurations up to this step are set correctly. You will not see any applications on this page yet.

Set up EMR Studio

In this step, we demonstrate the actions needed from the data lake administrator to set up EMR Studio enabled for trusted identity propagation and with IAM Identity Center integration. This allows users to directly access EMR Studio with their enterprise credentials.

Note: All Amazon S3 buckets (created after January 5, 2023) have encryption configured by default (Amazon S3 managed keys (SSE-S3)), and all new objects that are uploaded to an S3 bucket are automatically encrypted at rest. To use a different type of encryption, to meet your security needs, please update the default encryption configuration for the bucket. See Protecting data for server-side encryption for further details.

  • On the Amazon EMR console, choose Studios in the navigation pane under EMR Studio.
  • Choose Create Studio.

  • For Setup options, select Custom.
  • For Studio name, enter a name (for this post, emr-studio-with-tip).
  • For S3 location for Workspace storage, select Select existing location and enter an existing S3 bucket (if you have one). Otherwise, select Create new bucket.

  • For Service role to let Studio access your AWS resources, choose View permissions details to get the trust and IAM policy information that is needed and create a role with those specific policies in IAM. In this case, we create a new role called emr_tip_role.

  • For Service role to let Studio access your AWS resources, choose the IAM role you created.
  • For Workspace name, enter a name (for this post, studio-workspace-with-tip).

  • For Authentication, select IAM Identity Center.
  • For User role, you can create a new role or choose an existing role. For this post, we choose the role we created (emr_tip_role).
  • To use the same role, add the following statement to the trust policy of the service role:
{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "elasticmapreduce.amazonaws.com",
        "AWS": "arn:aws:iam::xxxxxx:role/emr_tip_role"
      },
      "Action": [
        "sts:AssumeRole",
        "sts:SetContext"
      ]
    }
  ]
}
  • Select Enable trusted identity propagation to allow you to control and log user access across connected applications.

  • For Choose who can access your application, select All users and groups.

Later, we restrict access to resources using Lake Formation. However, there is an option here to restrict access to only assigned users and groups.

  • In the Networking and security section, you can provide optional details for your VPC, subnets, and security group settings.
  • Choose Create Studio.

  • On the Studios page of the Amazon EMR console, locate your Studio enabled with IAM Identity Center.
  • Copy the link for Studio Access URL.

  • Enter the URL into a web browser and log in using Okta credentials.

You should be able to successfully sign in to the EMR Studio console.

Create an IAM Identity Center enabled security configuration for EMR clusters

EMR security configurations allow you to configure data encryption, Kerberos authentication, and Amazon S3 authorization for the EMR File System (EMRFS) on the clusters. The security configuration is available to use and reuse when you create clusters.

To integrate Amazon EMR with IAM Identity Center, you need to first create an IAM role that authenticates with IAM Identity Center from the EMR cluster. Amazon EMR uses IAM credentials to relay the IAM Identity Center identity to downstream services such as Lake Formation. The IAM role should also have the respective permissions to invoke the downstream services.

  1. Create a role (for this post, called emr-idc-application) with the following trust and permission policy. The role referenced in the trust policy is the InstanceProfile role for EMR clusters. This allows the EC2 instance profile to assume this role and act as an identity broker on behalf of the federated users.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AssumeRole",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::xxxxxxxxxxn:role/service-role/AmazonEMR-InstanceProfile-20240127T102444"
            },
            "Action": [
                "sts:AssumeRole",
                "sts:SetContext"
            ]
        }
    ]
}
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "IdCPermissions",
            "Effect": "Allow",
            "Action": [
                "sso-oauth:*"
            ],
            "Resource": "*"
        },
        {
            "Sid": "GlueandLakePermissions",
            "Effect": "Allow",
            "Action": [
                "glue:*",
                "lakeformation:GetDataAccess"
            ],
            "Resource": "*"
        },
        {
            "Sid": "S3Permissions",
            "Effect": "Allow",
            "Action": [
                "s3:GetDataAccess",
                "s3:GetAccessGrantsInstanceForPrefix"
            ],
            "Resource": "*"
        }
    ]
}
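
If you prefer to generate the trust policy rather than edit the JSON by hand, the following Python sketch builds the same document for a given instance profile role and checks that both required actions are present. The function names and the example ARN are hypothetical and used for illustration only; the policy content mirrors the trust policy shown above.

```python
import json


def build_trust_policy(instance_profile_role_arn: str) -> dict:
    """Return a trust policy like the one above for the given role ARN."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AssumeRole",
                "Effect": "Allow",
                "Principal": {"AWS": instance_profile_role_arn},
                "Action": ["sts:AssumeRole", "sts:SetContext"],
            }
        ],
    }


def missing_actions(policy: dict) -> set:
    """Return the required STS actions that no Allow statement grants."""
    required = {"sts:AssumeRole", "sts:SetContext"}
    granted = set()
    for statement in policy["Statement"]:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        granted.update(actions)
    return required - granted


if __name__ == "__main__":
    # Placeholder ARN; substitute your own EMR instance profile role.
    policy = build_trust_policy(
        "arn:aws:iam::111122223333:role/ExampleInstanceProfileRole"
    )
    print(json.dumps(policy, indent=4))
```

A check like missing_actions can catch a trust policy that grants sts:AssumeRole but omits sts:SetContext, which both trust policies in this post include.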

Next, you create certificates for encrypting data in transit with Amazon EMR.

  • For this post, we use OpenSSL to generate a self-signed X.509 certificate with a 2048-bit RSA private key:

$ openssl req -x509 -newkey rsa:2048 -keyout privateKey.pem -out certificateChain.pem -days 365 -nodes -subj '/CN=*.us-west-2.compute.internal'
$ cp certificateChain.pem trustedCertificates.pem
$ zip -r -X my-certs.zip certificateChain.pem privateKey.pem trustedCertificates.pem

The key allows access to the issuer’s EMR cluster instances in the us-west-2 Region, as specified by the *.us-west-2.compute.internal domain name used as the common name. You can change this to the Region your cluster is in. For a complete guide on creating and providing a certificate, refer to Providing certificates for encrypting data in transit with Amazon EMR encryption.

  • Upload my-certs.zip to an S3 location that will be used to create the security configuration. The EMR service role should have access to the S3 location.
  • Create an EMR security configuration with IAM Identity Center enabled from the AWS Command Line Interface (AWS CLI) with the following code:
aws emr create-security-configuration --name "IdentityCenterConfiguration-with-lf-tip" --region "us-west-2" --endpoint-url https://elasticmapreduce.us-west-2.amazonaws.com --security-configuration '{
    "AuthenticationConfiguration":{
        "IdentityCenterConfiguration":{
            "EnableIdentityCenter":true,
            "IdentityCenterApplicationAssigmentRequired":false,
            "IdentityCenterInstanceARN": "arn:aws:sso:::instance/ssoins-7907b0d7d77e3e0d",
            "IAMRoleForEMRIdentityCenterApplicationARN": "arn:aws:iam::1xxxxxxxxx0:role/emr-idc-application"
        }
    },
    "AuthorizationConfiguration": {
        "LakeFormationConfiguration": {
            "EnableLakeFormation": true
        }
    },
    "EncryptionConfiguration": {
        "EnableInTransitEncryption": true,
        "EnableAtRestEncryption": false,
        "InTransitEncryptionConfiguration": {
            "TLSCertificateConfiguration": {
                "CertificateProviderType": "PEM",
                "S3Object": "s3://<<Bucket Name>>/emr-transit-encry-certs/my-certs.zip"
            }
        }
    }
}' 

You can view the security configuration on the Amazon EMR console.
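
Quoting a multi-line JSON document inside a shell command is easy to get wrong. As an alternative, the following Python sketch assembles the same security configuration as a dictionary and serializes it. The function name and its parameters are hypothetical; the keys and values are copied from the CLI example in this section, including the spelling of IdentityCenterApplicationAssigmentRequired.

```python
import json


def build_security_configuration(
    instance_arn: str,
    application_role_arn: str,
    certs_s3_object: str,
) -> dict:
    """Assemble the security configuration shown in the CLI example."""
    return {
        "AuthenticationConfiguration": {
            "IdentityCenterConfiguration": {
                "EnableIdentityCenter": True,
                # Key spelling copied verbatim from the CLI example.
                "IdentityCenterApplicationAssigmentRequired": False,
                "IdentityCenterInstanceARN": instance_arn,
                "IAMRoleForEMRIdentityCenterApplicationARN": application_role_arn,
            }
        },
        "AuthorizationConfiguration": {
            "LakeFormationConfiguration": {"EnableLakeFormation": True}
        },
        "EncryptionConfiguration": {
            "EnableInTransitEncryption": True,
            "EnableAtRestEncryption": False,
            "InTransitEncryptionConfiguration": {
                "TLSCertificateConfiguration": {
                    "CertificateProviderType": "PEM",
                    "S3Object": certs_s3_object,
                }
            },
        },
    }


if __name__ == "__main__":
    # All three arguments are placeholders; substitute your own values.
    config = build_security_configuration(
        "arn:aws:sso:::instance/ssoins-EXAMPLE",
        "arn:aws:iam::111122223333:role/emr-idc-application",
        "s3://example-bucket/emr-transit-encry-certs/my-certs.zip",
    )
    print(json.dumps(config, indent=4))
```

The printed JSON can be saved to a file and passed to the --security-configuration option, or supplied as the SecurityConfiguration string when calling create_security_configuration on a boto3 EMR client.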

Create a Service Catalog product template to create EMR clusters

EMR Studio with trusted identity propagation enabled can only work with clusters created from a template. Complete the following steps to create a product template in Service Catalog:

  • On the Service Catalog console, choose Portfolios under Administration in the navigation pane.
  • Choose Create portfolio.

  • Enter a name for your portfolio (for this post, EMR Clusters Template) and an optional description.
  • Choose Create.

  • On the Portfolios page, choose the portfolio you just created to view its details.

  • On the Products tab, choose Create product.

  • For Product type, select CloudFormation.
  • For Product name, enter a name (for this post, EMR-7.0.0).
  • Use the security configuration IdentityCenterConfiguration-with-lf-tip you created in previous steps with the appropriate Amazon EMR service roles.
  • Choose Create product.

The following is an example CloudFormation template. Update the account-specific values for SecurityConfiguration, JobFlowRole, ServiceRole, LogUri, Ec2KeyName, and Ec2SubnetId. We provide a sample Amazon EMR service role and trust policy in Appendix A at the end of this post.

'Parameters':
  'ClusterName':
    'Type': 'String'
    'Default': 'EMR_TIP_Cluster'
  'EmrRelease':
    'Type': 'String'
    'Default': 'emr-7.0.0'
    'AllowedValues':
    - 'emr-7.0.0'
  'ClusterInstanceType':
    'Type': 'String'
    'Default': 'm5.xlarge'
    'AllowedValues':
    - 'm5.xlarge'
    - 'm5.2xlarge'
'Resources':
  'EmrCluster':
    'Type': 'AWS::EMR::Cluster'
    'Properties':
      'Applications':
      - 'Name': 'Spark'
      - 'Name': 'Livy'
      - 'Name': 'Hadoop'
      - 'Name': 'JupyterEnterpriseGateway'       
      'SecurityConfiguration': 'IdentityCenterConfiguration-with-lf-tip'
      'EbsRootVolumeSize': '20'
      'Name':
        'Ref': 'ClusterName'
      'JobFlowRole': <Instance Profile Role>
      'ServiceRole': <EMR Service Role>
      'ReleaseLabel':
        'Ref': 'EmrRelease'
      'VisibleToAllUsers': !!bool 'true'
      'LogUri':
        'Fn::Sub': <S3 LOG Path>
      'Instances':
        "Ec2KeyName" : <Key Pair Name>
        'TerminationProtected': !!bool 'false'
        'Ec2SubnetId': <subnet-id>
        'MasterInstanceGroup':
          'InstanceCount': !!int '1'
          'InstanceType':
            'Ref': 'ClusterInstanceType'
        'CoreInstanceGroup':
          'InstanceCount': !!int '2'
          'InstanceType':
            'Ref': 'ClusterInstanceType'
          'Market': 'ON_DEMAND'
          'Name': 'Core'
'Outputs':
  'ClusterId':
    'Value':
      'Ref': 'EmrCluster'
    'Description': 'The ID of the  EMR cluster'
'Metadata':
  'AWS::CloudFormation::Designer': {}
'Rules': {}

Trusted identity propagation is supported from Amazon EMR 6.15 onwards. For Amazon EMR 6.15, add the following bootstrap action to the CloudFormation template:

'BootstrapActions':
- 'Name': 'spark-config'
  'ScriptBootstrapAction':
    'Path': 's3://emr-data-access-control-<aws-region>/customer-bootstrap-actions/idc-fix/replace-puppet.sh'

The portfolio now should have the EMR cluster creation product added.

  • Grant the EMR Studio role emr_tip_role access to the portfolio.

Grant Lake Formation permissions to users to access data

In this step, we enable Lake Formation integration with IAM Identity Center and grant permissions to the Identity Center user analyst1. If Lake Formation is not already enabled, refer to Getting started with Lake Formation.

To use Lake Formation with Amazon EMR, create a custom role to register S3 locations. You need to create a new custom role with Amazon S3 access and not use the default role AWSServiceRoleForLakeFormationDataAccess. Additionally, enable external data filtering in Lake Formation. For more details, refer to Enable Lake Formation with Amazon EMR.

Complete the following steps to manage access permissions in Lake Formation:

  • On the Lake Formation console, choose IAM Identity Center integration under Administration in the navigation pane.

Lake Formation will automatically specify the correct IAM Identity Center instance.

  • Choose Create.

You can now view the IAM Identity Center integration details.

For this post, we have a Marketing database and a customer table on which we grant access to our enterprise user analyst1. You can use an existing database and table in your account or create a new one. For more examples, refer to Tutorials.

The following screenshot shows the details of our customer table.

Complete the following steps to grant analyst1 permissions. For more information, refer to Granting table permissions using the named resource method.

  • On the Lake Formation console, choose Data lake permissions under Permissions in the navigation pane.
  • Choose Grant.

  • Select Named Data Catalog resources.
  • For Databases, choose your database (marketing).
  • For Tables, choose your table (customer).

  • For Table permissions, select Select and Describe.
  • For Data permissions, select All data access.
  • Choose Grant.
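The same grant can also be expressed programmatically. The following is a minimal sketch, assuming the AWS SDK for Python (Boto3) and its Lake Formation grant_permissions call. The helper names are hypothetical, and the principal ARN format for an IAM Identity Center user is an assumption that you should confirm against the Lake Formation documentation for your setup.

```python
def build_table_grant(identity_center_user_id, database, table):
    """Build keyword arguments for the Lake Formation GrantPermissions call.

    Assumption: Identity Center users are addressed with an ARN of the form
    arn:aws:identitystore:::user/<UserId>.
    """
    return {
        "Principal": {
            "DataLakePrincipalIdentifier": (
                f"arn:aws:identitystore:::user/{identity_center_user_id}"
            )
        },
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        # Mirrors the console choice of Select and Describe.
        "Permissions": ["SELECT", "DESCRIBE"],
    }


def grant_table_access(identity_center_user_id, database, table):
    """Send the grant. Requires boto3 and AWS credentials (hypothetical helper)."""
    import boto3  # imported lazily so the builder above works without the SDK

    client = boto3.client("lakeformation")
    request = build_table_grant(identity_center_user_id, database, table)
    return client.grant_permissions(**request)
```

Keeping the request builder separate from the call makes it possible to review the exact permissions before they are applied.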

The following screenshot shows a summary of permissions that user analyst1 has. They have Select access on the table and Describe permissions on the databases.

Test the solution

To test the solution, we log in to EMR Studio as enterprise user analyst1, create a new Workspace, create an EMR cluster using a template, and use that cluster to perform an analysis. You could also use the Workspace that was created during the Studio setup. In this demonstration, we create a new Workspace.

You need additional permissions in the EMR Studio role to create and list Workspaces, use a template, and create EMR clusters. For more details, refer to Configure EMR Studio user permissions for Amazon EC2 or Amazon EKS. Appendix B at the end of this post contains a sample policy.

When the cluster is available, we attach the cluster to the Workspace and run queries on the customer table, which the user has access to.

User analyst1 is now able to run queries for business use cases using their corporate identity. To open a PySpark notebook, we choose PySpark under Notebook.

When the notebook is open, we run a Spark SQL query to list the databases:

%%sql 
show databases

In this case, we query the customer table in the marketing database. We should be able to access the data.

%%sql
select * from marketing.customer

Audit data access

Lake Formation API actions are logged by CloudTrail. The GetDataAccess action is logged whenever a principal or integrated AWS service requests temporary credentials to access data in a data lake location that is registered with Lake Formation. With trusted identity propagation, CloudTrail also logs the IAM Identity Center user ID of the corporate identity who requested access to the data.

The following screenshot shows the details for the analyst1 user.

Choose View event to view the event logs.

The following is an example of the GetDataAccess event log. We can trace that user analyst1, Identity Center user ID c8c11390-00a1-706e-0c7a-bbcc5a1c9a7f, has accessed the customer table.

{
    "eventVersion": "1.09",
    
….
        "onBehalfOf": {
            "userId": "c8c11390-00a1-706e-0c7a-bbcc5a1c9a7f",
            "identityStoreArn": "arn:aws:identitystore::xxxxxxxxx:identitystore/d-XXXXXXXX"
        }
    },
    "eventTime": "2024-01-28T17:56:25Z",
    "eventSource": "lakeformation.amazonaws.com",
    "eventName": "GetDataAccess",
    "awsRegion": "us-west-2",
….
        "requestParameters": {
        "tableArn": "arn:aws:glue:us-west-2:xxxxxxxxxx:table/marketing/customer",
        "supportedPermissionTypes": [
            "TABLE_PERMISSION"
        ]
    },
    …..
    }
}
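To audit many events at once, you can extract the relevant fields with a short script. The following is a minimal sketch that assumes the event has already been loaded as a Python dictionary and that the onBehalfOf element sits under userIdentity, as in the elided log above. The function name summarize_data_access is hypothetical.

```python
def summarize_data_access(event):
    """Return who accessed which table, or None for other event types."""
    if event.get("eventName") != "GetDataAccess":
        return None
    # Assumption: onBehalfOf is nested inside userIdentity.
    on_behalf_of = event.get("userIdentity", {}).get("onBehalfOf", {})
    params = event.get("requestParameters") or {}
    return {
        "user_id": on_behalf_of.get("userId"),
        "table_arn": params.get("tableArn"),
        "event_time": event.get("eventTime"),
    }


# Trimmed-down event using the values shown in the log above.
sample_event = {
    "userIdentity": {
        "onBehalfOf": {
            "userId": "c8c11390-00a1-706e-0c7a-bbcc5a1c9a7f",
            "identityStoreArn": "arn:aws:identitystore::xxxxxxxxx:identitystore/d-XXXXXXXX",
        }
    },
    "eventTime": "2024-01-28T17:56:25Z",
    "eventName": "GetDataAccess",
    "requestParameters": {
        "tableArn": "arn:aws:glue:us-west-2:xxxxxxxxxx:table/marketing/customer",
        "supportedPermissionTypes": ["TABLE_PERMISSION"],
    },
}

summary = summarize_data_access(sample_event)
print(summary)
```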

The following video provides an end-to-end demonstration of the steps for enabling trusted identity propagation in your analytics flow on Amazon EMR.

Clean up

Clean up the following resources when you’re done using this solution:

Conclusion

In this post, we demonstrated how to set up and use trusted identity propagation using IAM Identity Center, EMR Studio, and Lake Formation for analytics. With trusted identity propagation, a user’s corporate identity is seamlessly propagated as they access data using single sign-on across AWS analytics services to build analytics applications. Data administrators can provide fine-grained data access directly to corporate users and groups and audit usage. To learn more, see Integrate Amazon EMR with AWS IAM Identity Center.


About the Authors

Pradeep Misra is a Principal Analytics Solutions Architect at AWS. He works across Amazon to architect and design modern distributed analytics and AI/ML platform solutions. He is passionate about solving customer challenges using data, analytics, and AI/ML. Outside of work, Pradeep likes exploring new places, trying new cuisines, and playing board games with his family. He also likes doing science experiments with his daughters.

Deepmala Agarwal works as an AWS Data Specialist Solutions Architect. She is passionate about helping customers build out scalable, distributed, and data-driven solutions on AWS. When not at work, Deepmala likes spending time with family, walking, listening to music, watching movies, and cooking!

Abhilash Nagilla is a Senior Specialist Solutions Architect at Amazon Web Services (AWS), helping public sector customers on their cloud journey with a focus on AWS analytics services. Outside of work, Abhilash enjoys learning new technologies, watching movies, and visiting new places.


Appendix A

Sample Amazon EMR service role and trust policy:

Note: This is a sample service role. Fine-grained access control is done using Lake Formation. Modify the permissions according to your enterprise guidance and your security team's requirements.

Trust policy:

{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
                "Service": "elasticmapreduce.amazonaws.com",
                "AWS": "arn:aws:iam::xxxxxx:role/emr_tip_role"
            },
            "Action": [
                "sts:AssumeRole",
                "sts:SetContext"
            ]
        }
    ]
}

Permission Policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ResourcesToLaunchEC2",
            "Effect": "Allow",
            "Action": [
                "ec2:RunInstances",
                "ec2:CreateFleet",
                "ec2:CreateLaunchTemplate",
                "ec2:CreateLaunchTemplateVersion"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:network-interface/*",
                "arn:aws:ec2:*::image/ami-*",
                "arn:aws:ec2:*:*:key-pair/*",
                "arn:aws:ec2:*:*:capacity-reservation/*",
                "arn:aws:ec2:*:*:placement-group/pg-*",
                "arn:aws:ec2:*:*:fleet/*",
                "arn:aws:ec2:*:*:dedicated-host/*",
                "arn:aws:resource-groups:*:*:group/*"
            ]
        },
        {
            "Sid": "TagOnCreateTaggedEMRResources",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateTags"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:network-interface/*",
                "arn:aws:ec2:*:*:instance/*",
                "arn:aws:ec2:*:*:volume/*",
                "arn:aws:ec2:*:*:launch-template/*"
            ],
            "Condition": {
                "StringEquals": {
                    "ec2:CreateAction": [
                        "RunInstances",
                        "CreateFleet",
                        "CreateLaunchTemplate",
                        "CreateNetworkInterface"
                    ]
                }
            }
        },
        {
            "Sid": "ListActionsForEC2Resources",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeAccountAttributes",
                "ec2:DescribeCapacityReservations",
                "ec2:DescribeDhcpOptions",
                "ec2:DescribeImages",
                "ec2:DescribeInstances",
                "ec2:DescribeLaunchTemplates",
                "ec2:DescribeNetworkAcls",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribePlacementGroups",
                "ec2:DescribeRouteTables",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSubnets",
                "ec2:DescribeVolumes",
                "ec2:DescribeVolumeStatus",
                "ec2:DescribeVpcAttribute",
                "ec2:DescribeVpcEndpoints",
                "ec2:DescribeVpcs"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AutoScaling",
            "Effect": "Allow",
            "Action": [
                "application-autoscaling:DeleteScalingPolicy",
                "application-autoscaling:DeregisterScalableTarget",
                "application-autoscaling:DescribeScalableTargets",
                "application-autoscaling:DescribeScalingPolicies",
                "application-autoscaling:PutScalingPolicy",
                "application-autoscaling:RegisterScalableTarget"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AutoScalingCloudWatch",
            "Effect": "Allow",
            "Action": [
                "cloudwatch:PutMetricAlarm",
                "cloudwatch:DeleteAlarms",
                "cloudwatch:DescribeAlarms"
            ],
            "Resource": "arn:aws:cloudwatch:*:*:alarm:*_EMR_Auto_Scaling"
        },
        {
            "Sid": "PassRoleForAutoScaling",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::*:role/EMR_AutoScaling_DefaultRole",
            "Condition": {
                "StringLike": {
                    "iam:PassedToService": "application-autoscaling.amazonaws.com*"
                }
            }
        },
        {
            "Sid": "PassRoleForEC2",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::xxxxxxxxxxx:role/service-role/<Instance-Profile-Role>",
            "Condition": {
                "StringLike": {
                    "iam:PassedToService": "ec2.amazonaws.com*"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:*",
                "s3-object-lambda:*"
            ],
            "Resource": [
                "arn:aws:s3:::<bucket>/*",
                "arn:aws:s3:::*logs*/*"
            ]
        },
        {
            "Effect": "Allow",
            "Resource": "*",
            "Action": [
                "ec2:AuthorizeSecurityGroupEgress",
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:CancelSpotInstanceRequests",
                "ec2:CreateFleet",
                "ec2:CreateLaunchTemplate",
                "ec2:CreateNetworkInterface",
                "ec2:CreateSecurityGroup",
                "ec2:CreateTags",
                "ec2:DeleteLaunchTemplate",
                "ec2:DeleteNetworkInterface",
                "ec2:DeleteSecurityGroup",
                "ec2:DeleteTags",
                "ec2:DescribeAvailabilityZones",
                "ec2:DescribeAccountAttributes",
                "ec2:DescribeDhcpOptions",
                "ec2:DescribeImages",
                "ec2:DescribeInstanceStatus",
                "ec2:DescribeInstances",
                "ec2:DescribeKeyPairs",
                "ec2:DescribeLaunchTemplates",
                "ec2:DescribeNetworkAcls",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribePrefixLists",
                "ec2:DescribeRouteTables",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSpotInstanceRequests",
                "ec2:DescribeSpotPriceHistory",
                "ec2:DescribeSubnets",
                "ec2:DescribeTags",
                "ec2:DescribeVpcAttribute",
                "ec2:DescribeVpcEndpoints",
                "ec2:DescribeVpcEndpointServices",
                "ec2:DescribeVpcs",
                "ec2:DetachNetworkInterface",
                "ec2:ModifyImageAttribute",
                "ec2:ModifyInstanceAttribute",
                "ec2:RequestSpotInstances",
                "ec2:RevokeSecurityGroupEgress",
                "ec2:RunInstances",
                "ec2:TerminateInstances",
                "ec2:DeleteVolume",
                "ec2:DescribeVolumeStatus",
                "ec2:DescribeVolumes",
                "ec2:DetachVolume",
                "iam:GetRole",
                "iam:GetRolePolicy",
                "iam:ListInstanceProfiles",
                "iam:ListRolePolicies",
                "cloudwatch:PutMetricAlarm",
                "cloudwatch:DescribeAlarms",
                "cloudwatch:DeleteAlarms",
                "application-autoscaling:RegisterScalableTarget",
                "application-autoscaling:DeregisterScalableTarget",
                "application-autoscaling:PutScalingPolicy",
                "application-autoscaling:DeleteScalingPolicy",
                "application-autoscaling:Describe*"
            ]
        }
    ]
}

Appendix B

Sample EMR Studio role policy:

Note: This is a sample role policy. Fine-grained access control is done using Lake Formation. Modify the permissions according to your enterprise guidance and your security team's requirements.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowEMRReadOnlyActions",
            "Effect": "Allow",
            "Action": [
                "elasticmapreduce:ListInstances",
                "elasticmapreduce:DescribeCluster",
                "elasticmapreduce:ListSteps"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AllowEC2ENIActionsWithEMRTags",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterfacePermission",
                "ec2:DeleteNetworkInterface"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:network-interface/*"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/for-use-with-amazon-emr-managed-policies": "true"
                }
            }
        },
        {
            "Sid": "AllowEC2ENIAttributeAction",
            "Effect": "Allow",
            "Action": [
                "ec2:ModifyNetworkInterfaceAttribute"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:instance/*",
                "arn:aws:ec2:*:*:network-interface/*",
                "arn:aws:ec2:*:*:security-group/*"
            ]
        },
        {
            "Sid": "AllowEC2SecurityGroupActionsWithEMRTags",
            "Effect": "Allow",
            "Action": [
                "ec2:AuthorizeSecurityGroupEgress",
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:RevokeSecurityGroupEgress",
                "ec2:RevokeSecurityGroupIngress",
                "ec2:DeleteNetworkInterfacePermission"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/for-use-with-amazon-emr-managed-policies": "true"
                }
            }
        },
        {
            "Sid": "AllowDefaultEC2SecurityGroupsCreationWithEMRTags",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateSecurityGroup"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:security-group/*"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/for-use-with-amazon-emr-managed-policies": "true"
                }
            }
        },
        {
            "Sid": "AllowDefaultEC2SecurityGroupsCreationInVPCWithEMRTags",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateSecurityGroup"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:vpc/*"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/for-use-with-amazon-emr-managed-policies": "true"
                }
            }
        },
        {
            "Sid": "AllowAddingEMRTagsDuringDefaultSecurityGroupCreation",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateTags"
            ],
            "Resource": "arn:aws:ec2:*:*:security-group/*",
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/for-use-with-amazon-emr-managed-policies": "true",
                    "ec2:CreateAction": "CreateSecurityGroup"
                }
            }
        },
        {
            "Sid": "AllowEC2ENICreationWithEMRTags",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterface"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:network-interface/*"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/for-use-with-amazon-emr-managed-policies": "true"
                }
            }
        },
        {
            "Sid": "AllowEC2ENICreationInSubnetAndSecurityGroupWithEMRTags",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterface"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:subnet/*",
                "arn:aws:ec2:*:*:security-group/*"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/for-use-with-amazon-emr-managed-policies": "true"
                }
            }
        },
        {
            "Sid": "AllowAddingTagsDuringEC2ENICreation",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateTags"
            ],
            "Resource": "arn:aws:ec2:*:*:network-interface/*",
            "Condition": {
                "StringEquals": {
                    "ec2:CreateAction": "CreateNetworkInterface"
                }
            }
        },
        {
            "Sid": "AllowEC2ReadOnlyActions",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribeTags",
                "ec2:DescribeInstances",
                "ec2:DescribeSubnets",
                "ec2:DescribeVpcs"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AllowSecretsManagerReadOnlyActionsWithEMRTags",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue"
            ],
            "Resource": "arn:aws:secretsmanager:*:*:secret:*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/for-use-with-amazon-emr-managed-policies": "true"
                }
            }
        },
        {
            "Sid": "AllowWorkspaceCollaboration",
            "Effect": "Allow",
            "Action": [
                "iam:GetUser",
                "iam:GetRole",
                "iam:ListUsers",
                "iam:ListRoles",
                "sso:GetManagedApplicationInstance",
                "sso-directory:SearchUsers"
            ],
            "Resource": "*"
        },
        {
            "Sid": "S3Access",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:GetEncryptionConfiguration",
                "s3:ListBucket",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::<bucket>",
                "arn:aws:s3:::<bucket>/*"
            ]
        },
        {
            "Sid": "EMRStudioWorkspaceAccess",
            "Effect": "Allow",
            "Action": [
                "elasticmapreduce:CreateEditor",
                "elasticmapreduce:DescribeEditor",
                "elasticmapreduce:ListEditors",
                "elasticmapreduce:DeleteEditor",
                "elasticmapreduce:UpdateEditor",
                "elasticmapreduce:PutWorkspaceAccess",
                "elasticmapreduce:DeleteWorkspaceAccess",
                "elasticmapreduce:ListWorkspaceAccessIdentities",
                "elasticmapreduce:StartEditor",
                "elasticmapreduce:StopEditor",
                "elasticmapreduce:OpenEditorInConsole",
                "elasticmapreduce:AttachEditor",
                "elasticmapreduce:DetachEditor",
                "elasticmapreduce:ListInstanceGroups",
                "elasticmapreduce:ListBootstrapActions",
                "servicecatalog:SearchProducts",
                "servicecatalog:DescribeProduct",
                "servicecatalog:DescribeProductView",
                "servicecatalog:DescribeProvisioningParameters",
                "servicecatalog:ProvisionProduct",
                "servicecatalog:UpdateProvisionedProduct",
                "servicecatalog:ListProvisioningArtifacts",
                "servicecatalog:DescribeRecord",
                "servicecatalog:ListLaunchPaths",
                "elasticmapreduce:RunJobFlow",
                "elasticmapreduce:ListClusters",
                "elasticmapreduce:DescribeCluster",
                "codewhisperer:GenerateRecommendations",
                "athena:StartQueryExecution",
                "athena:StopQueryExecution",
                "athena:GetQueryExecution",
                "athena:GetQueryRuntimeStatistics",
                "athena:GetQueryResults",
                "athena:ListQueryExecutions",
                "athena:BatchGetQueryExecution",
                "athena:GetNamedQuery",
                "athena:ListNamedQueries",
                "athena:BatchGetNamedQuery",
                "athena:UpdateNamedQuery",
                "athena:DeleteNamedQuery",
                "athena:ListDataCatalogs",
                "athena:GetDataCatalog",
                "athena:ListDatabases",
                "athena:GetDatabase",
                "athena:ListTableMetadata",
                "athena:GetTableMetadata",
                "athena:ListWorkGroups",
                "athena:GetWorkGroup",
                "athena:CreateNamedQuery",
                "athena:GetPreparedStatement",
                "glue:CreateDatabase",
                "glue:DeleteDatabase",
                "glue:GetDatabase",
                "glue:GetDatabases",
                "glue:UpdateDatabase",
                "glue:CreateTable",
                "glue:DeleteTable",
                "glue:BatchDeleteTable",
                "glue:UpdateTable",
                "glue:GetTable",
                "glue:GetTables",
                "glue:BatchCreatePartition",
                "glue:CreatePartition",
                "glue:DeletePartition",
                "glue:BatchDeletePartition",
                "glue:UpdatePartition",
                "glue:GetPartition",
                "glue:GetPartitions",
                "glue:BatchGetPartition",
                "kms:ListAliases",
                "kms:ListKeys",
                "kms:DescribeKey",
                "lakeformation:GetDataAccess",
                "s3:GetBucketLocation",
                "s3:GetObject",
                "s3:ListBucket",
                "s3:ListBucketMultipartUploads",
                "s3:ListMultipartUploadParts",
                "s3:AbortMultipartUpload",
                "s3:PutObject",
                "s3:PutBucketPublicAccessBlock",
                "s3:ListAllMyBuckets",
                "elasticmapreduce:ListStudios",
                "elasticmapreduce:DescribeStudio",
                "cloudformation:GetTemplate",
                "cloudformation:CreateStack",
                "cloudformation:CreateStackSet",
                "cloudformation:DeleteStack",
                "cloudformation:GetTemplateSummary",
                "cloudformation:ValidateTemplate",
                "cloudformation:ListStacks",
                "cloudformation:ListStackSets",
                "elasticmapreduce:AddTags",
                "ec2:CreateNetworkInterface",
                "elasticmapreduce:GetClusterSessionCredentials",
                "elasticmapreduce:GetOnClusterAppUIPresignedURL",
                "cloudformation:DescribeStackResources"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Sid": "AllowPassingServiceRoleForWorkspaceCreation",
            "Action": "iam:PassRole",
            "Resource": [
                "arn:aws:iam::*:role/<Studio Role>",
                "arn:aws:iam::*:role/<EMR Service Role>",
                "arn:aws:iam::*:role/<EMR Instance Profile Role>"
            ],
            "Effect": "Allow"
        },
        {
            "Sid": "Statement1",
            "Effect": "Allow",
            "Action": [
                "iam:PassRole"
            ],
            "Resource": [
                "arn:aws:iam::*:role/<EMR Instance Profile Role>"
            ]
        }
    ]
}
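Because both sample policies are intentionally broad, it can help to list the statements that allow actions on every resource before you scope them down. The following is a minimal sketch, assuming the policy has been loaded as a Python dictionary (for example, with json.load). The function name find_wildcard_statements is hypothetical, and the check is illustrative rather than a replacement for a policy review tool.

```python
def as_list(value):
    """IAM policy fields may be a single value or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def find_wildcard_statements(policy):
    """Return (label, action_count) for Allow statements on Resource "*"
    that have no Condition block to narrow them."""
    findings = []
    for index, statement in enumerate(as_list(policy.get("Statement", []))):
        if statement.get("Effect") != "Allow":
            continue
        if "Condition" in statement:
            continue
        if "*" in as_list(statement.get("Resource", [])):
            label = statement.get("Sid") or f"statement #{index}"
            findings.append((label, len(as_list(statement.get("Action", [])))))
    return findings


# Small illustrative policy; load your own JSON document in practice.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "AllowEC2ReadOnlyActions", "Effect": "Allow",
         "Action": ["ec2:DescribeSubnets", "ec2:DescribeVpcs"], "Resource": "*"},
        {"Sid": "S3Access", "Effect": "Allow",
         "Action": ["s3:GetObject"], "Resource": ["arn:aws:s3:::<bucket>/*"]},
    ],
}

for label, action_count in find_wildcard_statements(example_policy):
    print(f"{label}: {action_count} action(s) allowed on all resources")
```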

Using Amazon Verified Permissions to manage authorization for AWS IoT smart home applications

Post Syndicated from Rajat Mathur original https://aws.amazon.com/blogs/security/using-amazon-verified-permissions-to-manage-authorization-for-aws-iot-smart-thermostat-applications/

This blog post introduces how manufacturers and smart appliance consumers can use Amazon Verified Permissions to centrally manage permissions and fine-grained authorizations. Developers can offer more intuitive, user-friendly experiences by designing interfaces that align with user personas and multi-tenancy authorization strategies, which can lead to higher user satisfaction and adoption. Traditionally, implementing authorization logic using role-based access control (RBAC) or attribute-based access control (ABAC) within IoT applications can become complex as the number of connected devices and associated user roles grows. This often leads to an unmanageable increase in access rules that must be hard-coded into each application and that require excessive compute power to evaluate. By using Verified Permissions, you can externalize the authorization logic using the Cedar policy language, which lets you define fine-grained permissions that combine RBAC and ABAC models. This decouples permissions from your application's business logic, providing a centralized and scalable way to manage authorization while reducing development effort.

In this post, we walk you through a reference architecture that outlines an end-to-end smart thermostat application solution using AWS IoT Core, Verified Permissions, and other AWS services. We show you how to use Verified Permissions to build an authorization solution using Cedar policy language to define dynamic policy-based access controls for different user personas. The post includes a link to a GitHub repository that houses the code for the web dashboard and the Verified Permissions logic to control access to the solution APIs.

Solution overview

This solution consists of a smart thermostat IoT device and an AWS hosted web application using Verified Permissions for fine-grained access to various application APIs. For this use case, the AWS IoT Core device is being simulated by an AWS Cloud9 environment and communicates with the IoT service using AWS IoT Device SDK for Python. After being configured, the device connects to AWS IoT Core to receive commands and send messages to various MQTT topics.

As a general practice, when a user-facing IoT solution is implemented, the manufacturer performs administrative tasks such as:

  1. Embedding AWS Private Certificate Authority certificates into each IoT device (in this case a smart thermostat). Usually this is done on the assembly line and the certificates used to verify the IoT endpoints are burned into device memory along with the firmware.
  2. Creating an Amazon Cognito user pool that provides sign-up and sign-in options for web and mobile application users and hosts the authentication process.
  3. Creating policy stores and policy templates in Verified Permissions. Based on who signs up, the manufacturer creates policies with Verified Permissions to link each signed-up user to certain allowed resources or IoT devices.
  4. Storing the mapping of user to device in a datastore. For this solution, you'll use an Amazon DynamoDB table to record the relationship.

The user who purchases the device (the primary device owner) performs the following tasks:

  1. Signs up on the manufacturer’s web application or mobile app and registers the IoT device by entering a unique serial number. The mapping between user details and the device serial number is stored in the datastore through an automated process that is initiated after sign-up and device claim.
  2. Connects the new device to an existing wireless network, which initiates a registration process to securely connect to AWS IoT Core services within the manufacturer’s account.
  3. Invites other users (such as guests, family members, or the power company) through a referral, invitation link, or a designated OAuth process.
  4. Assigns roles, and therefore permissions, to the other users.
     
Figure 1: Sample smart home application architecture built using AWS services

Figure 1 depicts the solution as three logical components:

  1. The first component depicts device operations through AWS IoT Core. The smart thermostat is on site and it communicates with AWS IoT Core and its state is managed through the AWS IoT Device Shadow Service.
  2. The second component depicts the web application, which is the application interface that customers use. It’s a ReactJS-backed single page application deployed using AWS Amplify.
  3. The third component shows the backend application, which is built using Amazon API Gateway, AWS Lambda, and DynamoDB. A Cognito user pool is used to manage application users and their authentication. Authorization is handled by Verified Permissions, where you create and manage the policies that are evaluated when the web application calls backend APIs. Each authorization request is evaluated against these policies to produce an access decision that allows or denies the action.

The solution flow itself can be broken down into three steps after the device is onboarded and users have signed up:

  1. The smart thermostat device connects and communicates with AWS IoT Core using the MQTT protocol. A classic Device Shadow is created for the AWS IoT thing Thermostat1 when the UpdateThingShadow call is made the first time through the AWS SDK for a new device. AWS IoT Device Shadow service lets the web application query and update the device’s state in case of connectivity issues.
  2. Users sign up or sign in to the Amplify hosted smart home application and authenticate themselves against a Cognito user pool. They’re mapped to a device, which is stored in a DynamoDB table.
  3. After the users sign in, they’re allowed to perform certain tasks and view certain sections of the dashboard based on the different roles and policies managed by Verified Permissions. The underlying Lambda function that’s responsible for handling the API calls queries the DynamoDB table to provide user context to Verified Permissions.
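The authorization call in step 3 can be sketched as follows. This is a minimal illustration, assuming the AWS SDK for Python (Boto3) and the Verified Permissions is_authorized operation. The entity type names (SmartHome::User, SmartHome::Action, SmartHome::Device) and the helper names are hypothetical, because they depend on the schema you define for your policy store.

```python
# Hypothetical entity type names; use the ones from your own Cedar schema.
PRINCIPAL_TYPE = "SmartHome::User"
ACTION_TYPE = "SmartHome::Action"
RESOURCE_TYPE = "SmartHome::Device"


def to_attribute(value):
    """Wrap a Python value in the typed attribute form the API expects."""
    if isinstance(value, bool):  # check bool first; bool is a subclass of int
        return {"boolean": value}
    if isinstance(value, int):
        return {"long": value}
    return {"string": str(value)}


def build_authorization_request(policy_store_id, user_id, action_id,
                                device_id, context=None):
    """Assemble keyword arguments for the IsAuthorized operation."""
    request = {
        "policyStoreId": policy_store_id,
        "principal": {"entityType": PRINCIPAL_TYPE, "entityId": user_id},
        "action": {"actionType": ACTION_TYPE, "actionId": action_id},
        "resource": {"entityType": RESOURCE_TYPE, "entityId": device_id},
    }
    if context:
        request["context"] = {
            "contextMap": {key: to_attribute(val) for key, val in context.items()}
        }
    return request


def is_allowed(request):
    """Call Verified Permissions. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    client = boto3.client("verifiedpermissions")
    return client.is_authorized(**request)["decision"] == "ALLOW"
```

In a Lambda function, the context values (for example, a desired temperature) would come from the API request and the DynamoDB lookup before the request is passed to is_allowed.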

Prerequisites

  1. To deploy this solution, you need access to the AWS Management Console and AWS Command Line Interface (AWS CLI) on your local machine with sufficient permissions to access required services, including Amplify, Verified Permissions, and AWS IoT Core. For this solution, you’ll give the services full access to interact with different underlying services. But in production, we recommend following security best practices with AWS Identity and Access Management (IAM), which involves scoping down policies.
  2. Set up Amplify CLI by following these instructions. We recommend the latest NodeJS stable long-term support (LTS) version. At the time of publishing this post, the LTS version was v20.11.1. Users can manage multiple NodeJS versions on their machines by using a tool such as Node Version Manager (nvm).

Walkthrough

The following table describes the actions, resources, and authorization decisions that will be enforced through Verified Permissions policies to achieve fine-grained access control. In this example, John is the primary device owner and has purchased and provisioned a new smart thermostat device called Thermostat1. He has invited Jane to access his device and has given her restricted permissions. John has full control over the device whereas Jane is only allowed to read the temperature and set the temperature between 72°F and 78°F.

John has also decided to give his local energy provider (Power Company) access to the device so that they can set the optimum temperature during the day to manage grid load and offer him maximum savings on his energy bill. However, they can only do so between 2:00 PM and 5:00 PM.

For security purposes, the Verified Permissions default decision is DENY for unauthorized principals.

Name | Principal | Action | Resource | Authorization decision
Any | Default | Default | Default | Deny
John | john_doe | Any | Thermostat1 | Allow
Jane | jane_doe | GetTemperature | Thermostat1 | Allow
Jane | jane_doe | SetTemperature | Thermostat1 | Allow only if the desired temperature is between 72°F and 78°F
Power Company | powercompany | GetTemperature | Thermostat1 | Allow only if accessed between the hours of 2:00 PM and 5:00 PM
Power Company | powercompany | SetTemperature | Thermostat1 | Allow only if the temperature is set between the hours of 2:00 PM and 5:00 PM

Create a Verified Permissions policy store

Verified Permissions is a scalable permissions management and fine-grained authorization service for the applications that you build. The policies are created using Cedar, a dedicated language for defining access permissions in applications. Cedar seamlessly integrates with popular authorization models such as RBAC and ABAC.

A policy is a statement that either permits or forbids a principal to take one or more actions on a resource. A policy store is a logical container that stores your Cedar policies, schema, and principal sources. A schema helps you to validate your policy and identify errors based on the definitions you specify. See Cedar schema to learn about the structure and formal grammar of a Cedar schema.

To create the policy store

  1. Sign in to the Amazon Verified Permissions console and choose Create policy store.
  2. In the Configuration Method section, select Empty Policy Store and choose Create policy store.
     
Figure 2: Create an empty policy store


Note: Make a note of the policy store ID to use when you deploy the solution.

To create a schema for the application

  1. On the Verified Permissions page, select Schema.
  2. In the Schema section, choose Create schema.
     
    Figure 3: Create a schema


  3. In the Edit schema section, choose JSON mode, paste the following sample schema for your application, and choose Save changes.
    {
        "AwsIotAvpWebApp": {
            "entityTypes": {
                "Device": {
                    "shape": {
                        "attributes": {
                            "primaryOwner": {
                                "name": "User",
                                "required": true,
                                "type": "Entity"
                            }
                        },
                        "type": "Record"
                    },
                    "memberOfTypes": []
                },
                "User": {}
            },
            "actions": {
                "GetTemperature": {
                    "appliesTo": {
                        "context": {
                            "attributes": {
                                "desiredTemperature": {
                                    "type": "Long"
                                },
                                "time": {
                                    "type": "Long"
                                }
                            },
                            "type": "Record"
                        },
                        "resourceTypes": [
                            "Device"
                        ],
                        "principalTypes": [
                            "User"
                        ]
                    }
                },
                "SetTemperature": {
                    "appliesTo": {
                        "resourceTypes": [
                            "Device"
                        ],
                        "principalTypes": [
                            "User"
                        ],
                        "context": {
                            "attributes": {
                                "desiredTemperature": {
                                    "type": "Long"
                                },
                                "time": {
                                    "type": "Long"
                                }
                            },
                            "type": "Record"
                        }
                    }
                }
            }
        }
    }
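To see how this schema shapes an authorization request, the following is a minimal sketch of a request builder for the Verified Permissions IsAuthorized operation. The function name and its parameters are illustrative assumptions, not code from the sample repository; the entity types, the primaryOwner attribute, and the two context attributes mirror the schema above.

```python
NAMESPACE = "AwsIotAvpWebApp"  # namespace declared in the schema above


def build_is_authorized_request(policy_store_id, user_id, action_id, device_id,
                                owner_id, desired_temperature, minutes_since_midnight):
    """Build keyword arguments for an IsAuthorized call (hypothetical helper)."""
    device = {"entityType": f"{NAMESPACE}::Device", "entityId": device_id}
    return {
        "policyStoreId": policy_store_id,
        "principal": {"entityType": f"{NAMESPACE}::User", "entityId": user_id},
        "action": {"actionType": f"{NAMESPACE}::Action", "actionId": action_id},
        "resource": device,
        # Both context attributes are typed as Long in the schema.
        "context": {
            "contextMap": {
                "desiredTemperature": {"long": desired_temperature},
                "time": {"long": minutes_since_midnight},
            }
        },
        # The Device entity carries the required primaryOwner attribute.
        "entities": {
            "entityList": [
                {
                    "identifier": device,
                    "attributes": {
                        "primaryOwner": {
                            "entityIdentifier": {
                                "entityType": f"{NAMESPACE}::User",
                                "entityId": owner_id,
                            }
                        }
                    },
                }
            ]
        },
    }
```

With boto3 installed and credentials configured, the result could be passed as boto3.client("verifiedpermissions").is_authorized(**request), and the decision field of the response read as ALLOW or DENY.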

When creating policies in Cedar, you can define authorization rules using a static policy or a template-linked policy.

Static policies

In scenarios where a policy explicitly defines both the principal and the resource, the policy is categorized as a static policy. These policies are immediately applicable for authorization decisions, as they are fully defined and ready for implementation.

Template-linked policies

On the other hand, there are situations where a single set of authorization rules needs to be applied across a variety of principals and resources. Consider an IoT application where actions such as SetTemperature and GetTemperature must be permitted for specific devices. Using static policies for each unique combination of principal and resource can lead to an excessive number of almost identical policies, differing only in their principal and resource components. This redundancy can be efficiently addressed with policy templates. Policy templates allow for the creation of policies using placeholders for the principal, the resource, or both. After a policy template is established, individual policies can be generated by referencing this template and specifying the desired principal and resource. These template-linked policies function the same as static policies, offering a streamlined and scalable solution for policy management.

To create a policy that allows access to the primary owner of the device using a static policy

  1. In the Verified Permissions console, on the left pane, select Policies, then choose Create policy and select Create static policy from the drop-down menu.
     
    Figure 4: Create static policy


  2. Define the policy scope:
    1. Select Permit for the Policy effect.
       
      Figure 5: Define policy effect


    2. Select All Principals for Principals scope.
    3. Select All Resources for Resource scope.
    4. Select All Actions for Actions scope and choose Next.
       
      Figure 6: Define policy scope


  3. On the Details page, under Policy, paste the following full-access policy, which grants the primary owner permission to perform both SetTemperature and GetTemperature actions on the smart thermostat unconditionally. Choose Create policy.
    	permit (principal, action, resource)
    	when { resource.primaryOwner == principal };
    Figure 7: Write and review policy statement


To create a static policy to allow a guest user to read the temperature

In this example, the guest user is Jane (username: jane_doe).

  1. Create another static policy and specify the policy scope.
    1. Select Permit for the Policy effect.
       
      Figure 8: Define the policy effect


    2. Select Specific principal for the Principals scope.
    3. Select AwsIotAvpWebApp::User and enter jane_doe.
       
      Figure 9: Define the policy scope


    4. Select Specific resource for the Resources scope.
    5. Select AwsIotAvpWebApp::Device and enter Thermostat1.
    6. Select Specific set of actions for the Actions scope.
    7. Select GetTemperature and choose Next.
       
      Figure 10: Define resource and action scopes


    8. Enter the Policy description: Allow jane_doe to read thermostat1.
    9. Choose Create policy.

Next, you will create reusable policy templates to manage policies efficiently. The first is a policy template for a guest user that limits the temperature they can set to between 72°F and 78°F. In this case, the guest user is Jane (username: jane_doe).

To create a reusable policy template

  1. Select Policy template and enter Guest user template as the description.
  2. Paste the following sample policy in the Policy body and choose Create policy template.
    permit (
        principal == ?principal,
        action in [AwsIotAvpWebApp::Action::"SetTemperature"],
        resource == ?resource
    )
    when { context.desiredTemperature >= 72 && context.desiredTemperature <= 78 };
Figure 11: Create guest user policy template


As you can see, you don’t specify the principal and resource yet. You enter those when you create an actual policy from the policy template. The context object will be populated with the desiredTemperature property in the application and used to evaluate the decision.

You also need to create a policy template for the Power Company user with restricted time settings. Cedar policies don’t support date/time format, so you must represent 2:00 PM and 5:00 PM as elapsed minutes from midnight.

To create a policy template for the power company

  1. Select Policy template and enter Power company user template as the description.
  2. Paste the following sample policy in the Policy body and choose Create policy template.
    permit (
        principal == ?principal,
        action in [AwsIotAvpWebApp::Action::"SetTemperature", AwsIotAvpWebApp::Action::"GetTemperature"],
        resource == ?resource
    )
    when { context.time >= 840 && context.time < 1020 };
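The values 840 and 1020 in this template are 2:00 PM and 5:00 PM expressed as minutes elapsed since midnight. As a minimal sketch of how an application might compute the context.time value, the helper below converts a wall-clock time to that representation; the function and constant names are illustrative assumptions, not part of the sample application.

```python
from datetime import time


def minutes_since_midnight(t: time) -> int:
    """Convert a wall-clock time to minutes elapsed since midnight."""
    return t.hour * 60 + t.minute


# Window bounds used by the power company template above.
WINDOW_START = minutes_since_midnight(time(14, 0))  # 2:00 PM
WINDOW_END = minutes_since_midnight(time(17, 0))    # 5:00 PM


def in_power_company_window(t: time) -> bool:
    """Mirror the template condition: start inclusive, end exclusive."""
    return WINDOW_START <= minutes_since_midnight(t) < WINDOW_END
```

Because the template uses a strict less-than comparison for the upper bound, a request made at exactly 5:00 PM falls outside the window.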

The policy templates accept the principal and resource as parameters. The next step is to create a template-linked policy, based on the Guest user template that you created earlier, that allows Jane to set the thermostat temperature. Her permission to read the temperature comes from the static policy that you created previously. For simplicity, you will manually create this policy using the Verified Permissions console. In production, application policies can be dynamically created using the Verified Permissions API.
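As a rough sketch of the programmatic route, the following builds the request for the Verified Permissions CreatePolicy operation with a templateLinked definition. The helper names are illustrative assumptions, and the policy store ID and policy template ID are placeholders that you would replace with your own values; the client is passed in so the request shape can be inspected without calling AWS.

```python
def build_template_linked_policy_request(policy_store_id, policy_template_id,
                                         user_id, device_id):
    """Build keyword arguments for CreatePolicy (hypothetical helper)."""
    return {
        "policyStoreId": policy_store_id,
        "definition": {
            "templateLinked": {
                "policyTemplateId": policy_template_id,
                # These values fill the ?principal and ?resource placeholders.
                "principal": {
                    "entityType": "AwsIotAvpWebApp::User",
                    "entityId": user_id,
                },
                "resource": {
                    "entityType": "AwsIotAvpWebApp::Device",
                    "entityId": device_id,
                },
            }
        },
    }


def create_template_linked_policy(client, policy_store_id, policy_template_id,
                                  user_id, device_id):
    """Send the request with a Verified Permissions client supplied by the caller."""
    request = build_template_linked_policy_request(
        policy_store_id, policy_template_id, user_id, device_id
    )
    return client.create_policy(**request)
```

In an application, the client argument would typically be boto3.client("verifiedpermissions").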

To create a template-linked policy for a guest user

  1. In the Verified Permissions console, on the left pane, select Policies, then choose Create policy and select Create template-linked policy from the drop-down menu.
     
    Figure 12: Create new template-linked policy


  2. Select the Guest user template and choose Next.
     
    Figure 13: Select Guest user template


  3. Under parameter selection:
    1. For Principal enter AwsIotAvpWebApp::User::"jane_doe".
    2. For Resource enter AwsIotAvpWebApp::Device::"Thermostat1".
    3. Choose Create template-linked policy.
       
      Figure 14: Create guest user template-linked policy


Note that with this policy in place, jane_doe can only set the temperature of the device Thermostat1 to between 72°F and 78°F.

To create a template-linked policy for the power company user

Based on the template that was set up for power company, you now need an actual policy for it.

  1. In the Verified Permissions console, go to the left pane and select Policies, then choose Create policy and select Create template-linked policy from the drop-down menu.
  2. Select the Power company user template and choose Next.
  3. Under Parameter selection, for Principal enter AwsIotAvpWebApp::User::"powercompany", for Resource enter AwsIotAvpWebApp::Device::"Thermostat1", and choose Create template-linked policy.

Now that you have a set of policies in a policy store, you need to update the backend codebase to include this information and then deploy the web application using Amplify.
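Before deploying, it can help to reason about what these policies should return. The following is a plain-Python model of the expected decisions; it is not Cedar and does not call Verified Permissions. The function name, the DEVICE_OWNERS mapping, and the parameter names are assumptions made for illustration.

```python
# Hypothetical stand-in for the primaryOwner attribute on each device.
DEVICE_OWNERS = {"Thermostat1": "john_doe"}


def decide(principal, action, resource, desired_temperature=None, minutes=None):
    """Return the decision the policies in this post are expected to produce."""
    # Static policy: the primary owner can perform any action.
    if DEVICE_OWNERS.get(resource) == principal:
        return "ALLOW"
    # Every remaining policy is scoped to Thermostat1.
    if resource != "Thermostat1":
        return "DENY"
    if principal == "jane_doe":
        # Static policy: the guest can read the temperature.
        if action == "GetTemperature":
            return "ALLOW"
        # Guest user template: set only within 72°F to 78°F.
        if (action == "SetTemperature" and desired_temperature is not None
                and 72 <= desired_temperature <= 78):
            return "ALLOW"
    if principal == "powercompany" and action in ("GetTemperature", "SetTemperature"):
        # Power company template: 2:00 PM inclusive to 5:00 PM exclusive.
        if minutes is not None and 840 <= minutes < 1020:
            return "ALLOW"
    # No permit policy matched, so the default decision applies.
    return "DENY"
```

For example, decide("jane_doe", "SetTemperature", "Thermostat1", desired_temperature=80) falls through every permit rule and yields DENY, which matches the default-deny behavior described earlier.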

The policy statements in this post intentionally use human-readable values such as jane_doe and powercompany for the principal entity. This is useful when discussing general concepts but in production systems, customers should use unique and immutable values for entities. See Get the best out of Amazon Verified Permissions by using fine-grained authorization methods for more information.

Deploy the solution code from GitHub

Go to the GitHub repository to set up the Amplify web application. The repository Readme file provides detailed instructions on how to set up the web application. You will need your Verified Permissions policy store ID to deploy the application. For convenience, we’ve provided an onboarding script—deploy.sh—which you can use to deploy the application.

To deploy the application

  1. Clone the repository.
    git clone https://github.com/aws-samples/amazon-verified-permissions-iot-amplify-smart-home-application.git

  2. Deploy the application.
    ./deploy.sh <region> <Verified Permissions Policy Store ID>

After the web dashboard has been deployed, you’ll create an IoT device using AWS IoT Core.

Create an IoT device and connect it to AWS IoT Core

With the users, policies, and templates, and the Amplify smart home application in place, you can now create a device and connect it to AWS IoT Core to complete the solution.

To create the Thermostat1 device and connect it to AWS IoT Core

  1. From the left pane in the AWS IoT console, select Connect one device.
     
    Figure 15: Connect device using AWS IoT console


  2. Review how IoT Thing works and then choose Next.
     
    Figure 16: Review how IoT Thing works before proceeding


  3. Choose Create a new thing, enter Thermostat1 as the Thing name, and choose Next.
    Figure 17: Create the new IoT thing


  4. Select Linux/macOS as the Device platform operating system and Python as the AWS IoT Core Device SDK, and choose Next.
     
    Figure 18: Choose the platform and SDK for the device


  5. Choose Download connection kit and choose Next.
     
    Figure 19: Download the connection kit to use for creating the Thermostat1 device


  6. Review the three steps to display messages from your IoT device. You will use them to verify the Thermostat1 IoT device connectivity to the AWS IoT Core platform. They are:
    1. Step 1: Add execution permissions
    2. Step 2: Run the start script
    3. Step 3: Return to the AWS IoT Console to view the device’s message
       
      Figure 20: How to display messages from an IoT device


Solution validation

With all of the pieces in place, you can now test the solution.

Primary owner signs in to the web application to set Thermostat1 temperature to 82°F

Figure 21: Thermostat1 temperature update by John


  1. Sign in to the Amplify web application as John. You should be able to view the Thermostat1 controller on the dashboard.
  2. Set the temperature to 82°F.
  3. The Lambda function processes the request and performs an API call to Verified Permissions to determine whether to ALLOW or DENY the action based on the policies. Verified Permissions sends back an ALLOW, as the policy that was previously set up allows unrestricted access for primary owners.
  4. Upon receiving the response from Verified Permissions, the Lambda function sends the ALLOW decision back to the web application and makes an API call to the AWS IoT Device Shadow service to update the device (Thermostat1) temperature to 82°F.
     
Figure 22: Policy evaluation decision is ALLOW when a primary owner calls SetTemperature
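The Device Shadow update in step 4 can be sketched as follows. This is a minimal illustration, not the Lambda code from the repository: the helper names are hypothetical, and the temperature field name inside the shadow document is an assumption. The state and desired keys follow the standard Device Shadow document layout, and the client is passed in by the caller.

```python
import json


def build_desired_state_payload(temperature_f):
    """Build a shadow document that requests a new desired temperature."""
    # "temperature" is an assumed field name for illustration only.
    document = {"state": {"desired": {"temperature": temperature_f}}}
    return json.dumps(document).encode("utf-8")


def update_thermostat(iot_data_client, thing_name, temperature_f):
    """Call UpdateThingShadow only after Verified Permissions returned ALLOW."""
    return iot_data_client.update_thing_shadow(
        thingName=thing_name,
        payload=build_desired_state_payload(temperature_f),
    )
```

In an application, the iot_data_client argument would typically be boto3.client("iot-data"). The device then reads the desired state and reports the value it applied.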


Guest user signs in to the web application to set Thermostat1 temperature to 80°F

Figure 23: Thermostat1 temperature update by Jane


  1. If you sign in as Jane to the Amplify web application, you can view the Thermostat1 controller on the dashboard.
  2. Set the temperature to 80°F.
  3. The Lambda function validates the actions by sending an API call to Verified Permissions to determine whether to ALLOW or DENY the action based on the established policies. Verified Permissions sends back a DENY, as the policy only permits temperature adjustments between 72°F and 78°F.
  4. Upon receiving the response from Verified Permissions, the Lambda function passes the DENY decision back to the web application, and an unauthorized response is returned.
     
    Figure 24: Guest user jane_doe receives a DENY when calling SetTemperature for a desired temperature of 80°F


  5. If you repeat the process (still as Jane) but set Thermostat1 to 75°F, the policy will cause the request to be allowed.
     
    Figure 25: Guest user jane_doe receives an ALLOW when calling SetTemperature for a desired temperature of 75°F


  6. Similarly, jane_doe is allowed to run GetTemperature on the device Thermostat1. When the temperature is set to 74°F, the device shadow is updated. The IoT device being simulated by your AWS Cloud9 instance reads the desired temperature field and sets the reported value to 74.
  7. Now, when jane_doe runs GetTemperature, the value of the device is reported as 74, as shown in Figure 26. We encourage you to try different restrictions in the World Settings (outside temperature and time) by adding restrictions to the static policy that allows GetTemperature for the guest user.
     
    Figure 26: Guest user jane_doe receives an ALLOW when calling GetTemperature for the reported temperature


Power company signs in to the web application to set Thermostat1 to 78°F at 3:30 PM

Figure 27: Thermostat1 temperature set to 78°F by powercompany user at a specified time


  1. Sign in as the powercompany user to the Amplify web application using an API. You can view the Thermostat1 controller on the dashboard.
  2. To test this scenario, set the current time to 3:30 PM, and try to set the temperature to 78°F.
  3. The Lambda function validates the actions by sending an API call to Verified Permissions to determine whether to ALLOW or DENY the action based on pre-established policies. Verified Permissions returns ALLOW permission, because the policy for powercompany permits device temperature changes between 2:00 PM and 5:00 PM.
  4. Upon receiving the response from Verified Permissions, the Lambda function sends the ALLOW decision back to the web application and makes an API call to the AWS IoT Device Shadow service to update the Thermostat1 temperature to 78°F.
     
    Figure 28: powercompany receives an ALLOW when SetTemperature is called with the desired temperature of 78°F


Note: As an optional exercise, we also made jane_doe a device owner for device Thermostat2. This can be observed in the users.json file in the GitHub repository. We encourage you to create your own policies and restrict functions for Thermostat2 after going through this post. You will need to create separate Verified Permissions policies and update the Lambda functions to interact with these policies.

We encourage you to create policies for guests and the power company and restrict permissions based on the following criteria:

  1. Verify Jane Doe can perform GetTemperature and SetTemperature actions on Thermostat2.
  2. John Doe should not be able to set the temperature on device Thermostat2 outside of the time range of 4:00 PM to 6:00 PM or outside of the temperature range of 68°F to 72°F.
  3. Power Company can only perform the GetTemperature operation, but there are no restrictions on time and outside temperature.

To help you verify the solution, we’ve provided the correct policies under the challenge directory in the GitHub repository.

Clean up

Deploying the Thermostat application in your AWS account will incur costs. To avoid ongoing charges, when you’re done examining the solution, delete the resources that were created. This includes the Amplify hosted web application, API Gateway resource, AWS Cloud9 environment, the Lambda function, DynamoDB table, Cognito user pool, AWS IoT Core resources, and Verified Permissions policy store.

Amplify resources can be deleted by going to the AWS CloudFormation console and deleting the stacks that were used to provision various services.

Conclusion

In this post, you learned about creating and managing fine-grained permissions using Verified Permissions for different user personas for your smart thermostat IoT device. With Verified Permissions, you can strengthen your security posture and build smart applications aligned with Zero Trust principles for real-time authorization decisions. To learn more, we recommend:

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Author

Rajat Mathur

Rajat is a Principal Solutions Architect at Amazon Web Services. Rajat is a passionate technologist who enjoys building innovative solutions for AWS customers. His core areas of focus are IoT, Networking, and Serverless computing. In his spare time, Rajat enjoys long drives, traveling, and spending time with family.

Pronoy Chopra


Pronoy is a Senior Solutions Architect with the Startups Generative AI team at AWS. He specializes in architecting and developing IoT and Machine Learning solutions. He has co-founded two startups and enjoys being hands-on with projects in the IoT, AI/ML and Serverless domain. His work in Magnetoencephalography has been cited many times in the effort to build better brain-computer interfaces.

Syed Sanoor


Syed serves as a Solutions Architect, assisting customers in the enterprise sector. With a foundation in software engineering, he takes pleasure in crafting solutions tailored to client needs. His expertise predominantly lies in C# and IoT. During his leisure time, Syed enjoys piloting drones and playing cricket.

Dynamic DAG generation with YAML and DAG Factory in Amazon MWAA

Post Syndicated from Jayesh Shinde original https://aws.amazon.com/blogs/big-data/dynamic-dag-generation-with-yaml-and-dag-factory-in-amazon-mwaa/

Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed service that allows you to use a familiar Apache Airflow environment with improved scalability, availability, and security to enhance and scale your business workflows without the operational burden of managing the underlying infrastructure. In Airflow, Directed Acyclic Graphs (DAGs) are defined as Python code. Dynamic DAGs refer to the ability to generate DAGs on the fly during runtime, typically based on some external conditions, configurations, or parameters. Dynamic DAGs help you to create, schedule, and run tasks within a DAG based on data and configurations that may change over time.

There are various ways to introduce dynamism in Airflow DAGs (dynamic DAG generation) using environment variables and external files. One approach is to use the DAG Factory YAML-based configuration file method. This library aims to facilitate the creation and configuration of new DAGs by using declarative parameters in YAML. It allows default customizations and is open source, making it simple to create and customize new functionalities.

In this post, we explore the process of creating Dynamic DAGs with YAML files, using the DAG Factory library. Dynamic DAGs offer several benefits:

  1. Enhanced code reusability – By structuring DAGs through YAML files, we promote reusable components, reducing redundancy in your workflow definitions.
  2. Streamlined maintenance – YAML-based DAG generation simplifies the process of modifying and updating workflows, ensuring smoother maintenance procedures.
  3. Flexible parameterization – With YAML, you can parameterize DAG configurations, facilitating dynamic adjustments to workflows based on varying requirements.
  4. Improved scheduler efficiency – Dynamic DAGs enable more efficient scheduling, optimizing resource allocation and enhancing overall workflow runs.
  5. Enhanced scalability – YAML-driven DAGs allow for parallel runs, enabling scalable workflows capable of handling increased workloads efficiently.

By harnessing the power of YAML files and the DAG Factory library, we unleash a versatile approach to building and managing DAGs, empowering you to create robust, scalable, and maintainable data pipelines.

Overview of solution

In this post, we will use an example DAG file that is designed to process a COVID-19 data set. The workflow process involves processing an open source data set offered by WHO-COVID-19-Global. After we install the DAG-Factory Python package, we create a YAML file that has definitions of various tasks. We process the country-specific death count by passing Country as a variable, which creates individual country-based DAGs.

The following diagram illustrates the overall solution along with data flows within logical blocks.

Overview of the Solution

Prerequisites

For this walkthrough, you should have the following prerequisites:

Additionally, complete the following steps (run the setup in an AWS Region where Amazon MWAA is available):

  1. Create an Amazon MWAA environment (if you don’t have one already). If this is your first time using Amazon MWAA, refer to Introducing Amazon Managed Workflows for Apache Airflow (MWAA).

Make sure the AWS Identity and Access Management (IAM) user or role used for setting up the environment has IAM policies attached for the following permissions:

The access policies mentioned here are just for the example in this post. In a production environment, provide only the needed granular permissions by exercising least privilege principles.

  1. Create an Amazon S3 bucket with a unique name while creating your Amazon MWAA environment, and create folders called dags and requirements.
    Amazon S3 Bucket
  2. Create and upload a requirements.txt file with the following content to the requirements folder. Replace {Airflow-version} with your environment’s Apache Airflow version number, and {Python-version} with the version of Python that’s compatible with your environment:
    --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-{Airflow-version}/constraints-{Python-version}.txt"
    dag-factory==0.19.0
    pandas==2.1.4

Pandas is needed just for the example use case described in this post, and dag-factory is the only required plug-in. It is recommended to check the compatibility of the latest version of dag-factory with Amazon MWAA. The boto and psycopg2-binary libraries are included with the Apache Airflow v2 base install and don’t need to be specified in your requirements.txt file.

  1. Download the WHO-COVID-19-global data file to your local machine and upload it under the dags prefix of your S3 bucket.

Make sure that your environment points to the latest Amazon S3 object version of your requirements.txt file so that the additional packages are installed. This should typically take 15 to 20 minutes, depending on your environment configuration.
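To make the placeholder substitution in requirements.txt concrete, the following sketch renders the file content from an Airflow version and a Python version. The function name is an illustrative assumption, and the version numbers in the usage lines are examples only; use the versions that match your own environment.

```python
CONSTRAINT_URL = (
    "https://raw.githubusercontent.com/apache/airflow/"
    "constraints-{airflow_version}/constraints-{python_version}.txt"
)


def render_requirements(airflow_version, python_version, packages):
    """Render requirements.txt content with the constraints line first."""
    url = CONSTRAINT_URL.format(
        airflow_version=airflow_version, python_version=python_version
    )
    lines = [f'--constraint "{url}"'] + list(packages)
    return "\n".join(lines) + "\n"


# Example versions only; substitute the ones for your environment.
content = render_requirements("2.8.1", "3.11", ["dag-factory==0.19.0", "pandas==2.1.4"])
print(content)
```

The rendered text can be saved as requirements.txt and uploaded to the requirements folder of the bucket.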

Validate the DAGs

When your Amazon MWAA environment shows as Available on the Amazon MWAA console, navigate to the Airflow UI by choosing Open Airflow UI next to your environment.

Validate the DAG

Verify the existing DAGs by navigating to the DAGs tab.

Verify the DAG

Configure your DAGs

Complete the following steps:

  1. Create empty files named dynamic_dags.yml, example_dag_factory.py and process_s3_data.py on your local machine.
  2. Edit the process_s3_data.py file and save it with the following code content, then upload the file to the dags folder of the Amazon S3 bucket. We are doing some basic data processing in the code:
    1. Read the file from an Amazon S3 location
    2. Rename the Country_code column as appropriate to the country.
    3. Filter data by the given country.
    4. Write the processed final data into CSV format and upload back to S3 prefix.
import io

import boto3
import pandas as pd


def process_s3_data(COUNTRY):
    ### Top-level variables: replace S3_BUCKET with your bucket name ###
    s3 = boto3.client('s3')
    S3_BUCKET = "my-mwaa-assets-bucket-sfj33ddkm"
    INPUT_KEY = "dags/WHO-COVID-19-global-data.csv"
    OUTPUT_KEY = "dags/count_death"
    ### Get the CSV file ###
    response = s3.get_object(Bucket=S3_BUCKET, Key=INPUT_KEY)
    status = response['ResponseMetadata']['HTTPStatusCode']
    if status == 200:
        ### Read the CSV file and filter by country before writing back ###
        df = pd.read_csv(response.get("Body"))
        df.rename(columns={"Country_code": "country"}, inplace=True)
        filtered_df = df[df['country'] == COUNTRY]
        with io.StringIO() as csv_buffer:
            filtered_df.to_csv(csv_buffer, index=False)
            response = s3.put_object(
                Bucket=S3_BUCKET, Key=OUTPUT_KEY + '_' + COUNTRY + '.csv', Body=csv_buffer.getvalue()
            )
        status = response['ResponseMetadata']['HTTPStatusCode']
        if status == 200:
            print(f"Successful S3 put_object response. Status - {status}")
        else:
            print(f"Unsuccessful S3 put_object response. Status - {status}")
    else:
        print(f"Unsuccessful S3 get_object response. Status - {status}")
  1. Edit the dynamic_dags.yml file and save it with the following content, then upload the file to the dags folder. We are stitching together various DAGs based on the country as follows:
    1. Define the default arguments that are passed to all DAGs.
    2. Create a DAG definition for individual countries by passing op_args.
    3. Map the process_s3_data function with python_callable_name.
    4. Use PythonOperator to process the CSV file data stored in the Amazon S3 bucket.
    5. We have set schedule_interval to every 10 minutes, but feel free to adjust this value as needed.
default:
  default_args:
    owner: "airflow"
    start_date: "2024-03-01"
    retries: 1
    retry_delay_sec: 300
  concurrency: 1
  max_active_runs: 1
  dagrun_timeout_sec: 600
  default_view: "tree"
  orientation: "LR"
  schedule_interval: "*/10 * * * *"
 
module3_dynamic_dag_Australia:
  tasks:
    task_process_s3_data:
      task_id: process_s3_data
      operator: airflow.operators.python.PythonOperator
      python_callable_name: process_s3_data
      python_callable_file: /usr/local/airflow/dags/process_s3_data.py
      op_args:
        - "Australia"
 
module3_dynamic_dag_Brazil:
  tasks:
    task_process_s3_data:
      task_id: process_s3_data
      operator: airflow.operators.python.PythonOperator
      python_callable_name: process_s3_data
      python_callable_file: /usr/local/airflow/dags/process_s3_data.py
      op_args:
        - "Brazil"
 
module3_dynamic_dag_India:
  tasks:
    task_process_s3_data:
      task_id: process_s3_data
      operator: airflow.operators.python.PythonOperator
      python_callable_name: process_s3_data
      python_callable_file: /usr/local/airflow/dags/process_s3_data.py
      op_args:
        - "India"
 
module3_dynamic_dag_Japan:
  tasks:
    task_process_s3_data:
      task_id: process_s3_data
      operator: airflow.operators.python.PythonOperator
      python_callable_name: process_s3_data
      python_callable_file: /usr/local/airflow/dags/process_s3_data.py
      op_args:
        - "Japan"
 
module3_dynamic_dag_Mexico:
  tasks:
    task_process_s3_data:
      task_id: process_s3_data
      operator: airflow.operators.python.PythonOperator
      python_callable_name: process_s3_data
      python_callable_file: /usr/local/airflow/dags/process_s3_data.py
      op_args:
        - "Mexico"
 
module3_dynamic_dag_Russia:
  tasks:
    task_process_s3_data:
      task_id: process_s3_data
      operator: airflow.operators.python.PythonOperator
      python_callable_name: process_s3_data
      python_callable_file: /usr/local/airflow/dags/process_s3_data.py
      op_args:
        - "Russia"
 
module3_dynamic_dag_Spain:
  tasks:
    task_process_s3_data:
      task_id: process_s3_data
      operator: airflow.operators.python.PythonOperator
      python_callable_name: process_s3_data
      python_callable_file: /usr/local/airflow/dags/process_s3_data.py
      op_args:
        - "Spain"
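The seven country entries differ only in the DAG name and the op_args value, so the file can also be generated rather than edited by hand. The following is a minimal sketch that uses only the Python standard library; the names DEFAULT_BLOCK, DAG_TEMPLATE, and render_config are hypothetical and are not part of dag-factory.

```python
# Hypothetical generator for dynamic_dags.yml; not part of dag-factory.
COUNTRIES = ["Australia", "Brazil", "India", "Japan", "Mexico", "Russia", "Spain"]

DEFAULT_BLOCK = """\
default:
  default_args:
    owner: "airflow"
    start_date: "2024-03-01"
    retries: 1
    retry_delay_sec: 300
  concurrency: 1
  max_active_runs: 1
  dagrun_timeout_sec: 600
  default_view: "tree"
  orientation: "LR"
  schedule_interval: "*/10 * * * *"
"""

DAG_TEMPLATE = """\
module3_dynamic_dag_{country}:
  tasks:
    task_process_s3_data:
      task_id: process_s3_data
      operator: airflow.operators.python.PythonOperator
      python_callable_name: process_s3_data
      python_callable_file: /usr/local/airflow/dags/process_s3_data.py
      op_args:
        - "{country}"
"""


def render_config(countries):
    """Return the full dynamic_dags.yml text for the given countries."""
    blocks = [DEFAULT_BLOCK] + [DAG_TEMPLATE.format(country=c) for c in countries]
    return "\n".join(blocks)


if __name__ == "__main__":
    print(render_config(COUNTRIES))
```

Adding a country then becomes a one-line change to the COUNTRIES list, after which the rendered text can be saved as dynamic_dags.yml and uploaded as before.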
  1. Edit the file example_dag_factory.py and save it with the following content, then upload the file back to the dags folder. The code removes existing DAGs by using the clean_dags() method and creates new DAGs by using the generate_dags() method of the DagFactory instance.
from airflow import DAG
import dagfactory
  
config_file = "/usr/local/airflow/dags/dynamic_dags.yml"
example_dag_factory = dagfactory.DagFactory(config_file)
  
## to clean up or delete any existing DAGs ##
example_dag_factory.clean_dags(globals())
## generate and create new DAGs ##
example_dag_factory.generate_dags(globals())
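Airflow discovers DAGs by scanning the module-level namespace of each file in the dags folder, which is why globals() is passed to the factory. The following conceptual sketch imitates that registration step with a stand-in class; FakeDag and register_dags are hypothetical names, and this is not the dag-factory implementation.

```python
class FakeDag:
    """Stand-in for airflow.DAG, used only to illustrate registration."""

    def __init__(self, dag_id):
        self.dag_id = dag_id


def register_dags(config, namespace):
    """Create one object per non-default config key and bind it in namespace."""
    for dag_id in config:
        if dag_id == "default":  # shared settings, not a DAG of its own
            continue
        namespace[dag_id] = FakeDag(dag_id)


# A trimmed-down view of the YAML file after parsing (hypothetical).
config = {
    "default": {},
    "module3_dynamic_dag_India": {},
    "module3_dynamic_dag_Japan": {},
}
namespace = {}  # plays the role of globals() in the real DAG file
register_dags(config, namespace)
print(sorted(namespace))
```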
  1. After you upload the files, go back to the Airflow UI console and navigate to the DAGs tab, where you will find new DAGs.

You can enable DAGs by making them active and testing them individually. Upon activation, an additional CSV file named count_death_{COUNTRY_CODE}.csv is generated in the dags folder.

Cleaning up

There may be costs associated with using the various AWS services discussed in this post. To prevent incurring future charges, delete the Amazon MWAA environment after you have completed the tasks outlined in this post, and empty and delete the S3 bucket.

Conclusion

In this blog post we demonstrated how to use the dag-factory library to create dynamic DAGs. Dynamic DAGs are characterized by their ability to generate results with each parsing of the DAG file based on configurations. Consider using dynamic DAGs in the following scenarios:

  • Automating migration from a legacy system to Airflow, where flexibility in DAG generation is crucial
  • Situations where only a parameter changes between different DAGs, streamlining the workflow management process
  • Managing DAGs that are reliant on the evolving structure of a source system, providing adaptability to changes
  • Establishing standardized practices for DAGs across your team or organization by creating these blueprints, promoting consistency and efficiency
  • Embracing YAML-based declarations over complex Python coding, simplifying DAG configuration and maintenance processes
  • Creating data driven workflows that adapt and evolve based on the data inputs, enabling efficient automation

By incorporating dynamic DAGs into your workflow, you can enhance automation, adaptability, and standardization, ultimately improving the efficiency and effectiveness of your data pipeline management.

To learn more about Amazon MWAA DAG Factory, visit Amazon MWAA for Analytics Workshop: DAG Factory. For additional details and code examples on Amazon MWAA, visit the Amazon MWAA User Guide and the Amazon MWAA examples GitHub repository.


About the Authors

 Jayesh Shinde is Sr. Application Architect with AWS ProServe India. He specializes in creating various solutions that are cloud centered using modern software development practices like serverless, DevOps, and analytics.

Harshd Yeola is Sr. Cloud Architect with AWS ProServe India helping customers to migrate and modernize their infrastructure into AWS. He specializes in building DevSecOps and scalable infrastructure using containers, AIOPs, and AWS Developer Tools and services.

Quickly go from Idea to PR with CodeCatalyst using Amazon Q

Post Syndicated from Brendan Jenkins original https://aws.amazon.com/blogs/devops/quickly-go-from-idea-to-pr-with-codecatalyst-using-amazon-q/

Amazon Q feature development enables teams using Amazon CodeCatalyst to scale with AI to assist developers in completing everyday software development tasks. Developers can now go from an idea in an issue to fully tested, merge-ready, running application code in a Pull Request (PR) with natural language inputs in a few clicks. Developers can also provide feedback to Amazon Q directly on the published pull request and ask it to generate a new revision. If the code change falls short of expectations, a new development environment can be created directly from the pull request, necessary adjustments can be made manually, a new revision can be published, and the merge can proceed upon approval.

In this blog, we will walk through a use case leveraging the Modern three-tier web application blueprint, and adding a feature to the web application. We’ll leverage Amazon Q feature development to quickly go from Idea to PR. We also suggest following the steps outlined below in this blog in your own application so you can gain a better understanding of how you can use this feature in your daily work.

Solution Overview

Amazon Q feature development is integrated into CodeCatalyst. Figure 1 details how users can assign Amazon Q an issue. When assigning the issue, users answer a few preliminary questions and Amazon Q outputs the proposed approach, where users can either approve or provide additional feedback to Amazon Q. Once approved, Amazon Q will generate a PR where users can review, revise, and merge the PR into the repository.

Figure 1: Amazon Q feature development workflow


Prerequisites

Although this blog walks through a sample use case using a CodeCatalyst blueprint, we encourage you to try it afterward with your own application so you can gain hands-on experience with this feature. If you are using CodeCatalyst for the first time, you’ll need:

Walkthrough

Step 1: Creating the blueprint

In this blog, we’ll leverage the Modern three-tier web application blueprint to walk through a sample use case. This blueprint creates a Mythical Mysfits three-tier web application with modular presentation, application, and data layers.

Figure 2: Creating a new Modern three-tier application blueprint


First, within your space click “Create Project” and select the Modern three-tier web application CodeCatalyst Blueprint as shown above in Figure 2.

Enter a Project name, then select Lambda for the Compute Platform and Amplify Hosting for Frontend Hosting Options. Additionally, make sure your AWS account is selected and create a new IAM Role.

Once the project is created, the application will deploy via a CodeCatalyst workflow, assuming the AWS account and IAM role were set up correctly. The deployed application will be similar to the Mythical Mysfits website.

Step 2: Create a new issue

The Product Manager (PM) has asked us to add a feature to the newly created application, which entails creating the ability to add new mythical creatures. The PM has provided a detailed description to get started.

In the Issues section of our new project, click Create Issue.

For the Issue title, enter “Ability to add a new mythical creature” and for the Description enter “Users should be able to add a new mythical creature to the website. There should be a new Add button on the UI, when prompted should allow the user to fill in Name, Age, Description, Good/Evil, Lawful/Chaotic, Species, Profile Image URI and thumbnail Image URI for the new creature. When the user clicks save, the application should leverage the existing API in app.py to save the new creature to the DynamoDB table.”

Next, click Assign to Amazon Q as shown below in Figure 3.

Figure 3: Assigning a new issue to Amazon Q


Lastly, enable the option Require Amazon Q to stop after each step and await review of its work. In this use case, we do not anticipate any changes to our workflow files to support this new feature, so we will leave the option Allow Amazon Q to modify workflow files disabled as shown below in Figure 4. Click Create Issue and Amazon Q will get started.

Figure 4: Configurations for assigning Amazon Q


Step 3: Review Amazon Q’s Approach

After a few minutes, Amazon Q will generate its understanding of the project in the Background section, as well as an Approach for making the changes for the issue you created, as shown in Figure 5 below.

(Note: The Background and Approach generated for you may be different from what is shown in Figure 5 below.)

We have the option to proceed as is or can reply to the Approach via a Comment to provide feedback so Amazon Q can refine it to align better with the use case.

Figure 5: Reviewing Amazon Q’s Background and Approach

In the approach, we notice Amazon Q is suggesting it will create a new method to create and save the new item to the table, but we already have an existing method. We decide to leave feedback as shown in Figure 6, letting Amazon Q know the existing method should be leveraged.

Figure 6: Provide feedback to Approach


Amazon Q will now refine the approach based on the feedback provided. The refined approach generated by Amazon Q meets our requirements, including unit tests, so we decide to click Proceed as shown in Figure 7 below.

Figure 7: Confirm approach and click Proceed


Now, Amazon Q will generate the code for the implementation and create a PR with code changes that can be reviewed.

Step 4: Review the PR

Within our project, under Code on the left panel click on Pull requests. You should see the new PR created by Amazon Q.

The PR description contains the approach that Amazon Q took to generate the code. This is helpful to reviewers who want to gain a high-level understanding of the changes included in the PR before diving into the details. You will also be able to review all changes made to the code as shown below in Figure 8.

Figure 8: Changes within PR


Step 5 (Optional): Provide feedback on PR

After reviewing the changes in the PR, I leave comments on a few items that can be improved. Notably, all fields on the new input form for creating a new creature should be required. After I complete leaving comments, I hit the Create Revision button. Amazon Q will take my comments, update the code accordingly and create a new revision of the PR as shown in Figure 9 below.
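The review comment above asks that every field on the new form be required. As a rough illustration of that rule on the API side, the sketch below checks a creature payload for missing values. The field keys and the missing_fields helper are hypothetical; they are not taken from the blueprint’s app.py or from the code Amazon Q generated.

```python
# Hypothetical keys for the fields listed in the issue description.
REQUIRED_FIELDS = (
    "name",
    "age",
    "description",
    "good_evil",
    "lawful_chaotic",
    "species",
    "profile_image_uri",
    "thumbnail_image_uri",
)


def missing_fields(payload):
    """Return the required keys that are absent or blank in payload."""
    missing = []
    for key in REQUIRED_FIELDS:
        value = payload.get(key)
        if value is None or str(value).strip() == "":
            missing.append(key)
    return missing


complete = {key: "example" for key in REQUIRED_FIELDS}
print(missing_fields(complete))            # nothing missing
print(missing_fields({"name": "Fluffy"}))  # every other field is reported
```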

Figure 9: PR Revision created


After reviewing the latest revision created by Amazon Q, I am happy with the changes and proceed with testing the changes directly from CodeCatalyst by utilizing Dev Environments. Once I have completed testing of the new feature and everything works as expected, we will let our peers review the PR to provide feedback and approve the pull request.

As part of following the steps in this blog post, if you upgraded your Space to Standard or Enterprise tier, please ensure you downgrade to the Free tier to avoid any unwanted additional charges. Additionally, delete the project and any associated resources deployed in the walkthrough.

Unassign Amazon Q from any issues no longer being worked on. If Amazon Q has finished its work on an issue or could not find a solution, make sure to unassign Amazon Q to avoid reaching the maximum quota for generative AI features. For more information, see Managing generative AI features and Pricing.

Best Practices for using Amazon Q Feature Development

You can follow a few best practices to ensure you experience the best results when using Amazon Q feature development:

  1. When describing your feature or issue, provide as much context as possible to get the best result from Amazon Q. Being too vague or unclear may not produce ideal results for your use case.
  2. Changes and new features should be as focused as possible. You will likely not experience the best results when making large and complex changes in a single issue. Instead, break the changes or feature up into smaller, more manageable issues where you will see better results.
  3. Leverage the feedback feature to practice giving input on approaches Amazon Q takes to ensure it gets to a similar outcome as highlighted in the blog.

Conclusion

In this post, you’ve seen how you can quickly go from Idea to PR using the Amazon Q Feature development capability in CodeCatalyst. You can leverage this new feature to start building new features in your applications. Check out Amazon CodeCatalyst feature development today.

About the authors

Brent Everman

Brent is a Senior Technical Account Manager with AWS, based out of Pittsburgh. He has over 17 years of experience working with enterprise and startup customers. He is passionate about improving the software development experience and specializes in AWS’ Next Generation Developer Experience services.

Brendan Jenkins

Brendan Jenkins is a Solutions Architect at Amazon Web Services (AWS) working with Enterprise AWS customers providing them with technical guidance and helping achieve their business goals. He has an area of specialization in DevOps and Machine Learning technology.

Fahim Sajjad

Fahim is a Solutions Architect at Amazon Web Services. He helps customers transform their business by helping them design their cloud solutions and offering technical guidance. Fahim graduated from the University of Maryland, College Park with a degree in Computer Science. He has a deep interest in AI and machine learning. Fahim enjoys reading about new advancements in technology and hiking.

Abdullah Khan

Abdullah is a Solutions Architect at AWS. He attended the University of Maryland, Baltimore County where he earned a degree in Information Systems. Abdullah currently helps customers design and implement solutions on the AWS Cloud. He has a strong interest in artificial intelligence and machine learning. In his spare time, Abdullah enjoys hiking and listening to podcasts.

Amazon OpenSearch Service Under the Hood : OpenSearch Optimized Instances(OR1)

Post Syndicated from Bukhtawar Khan original https://aws.amazon.com/blogs/big-data/amazon-opensearch-service-under-the-hood-opensearch-optimized-instancesor1/

Amazon OpenSearch Service recently introduced the OpenSearch Optimized Instance family (OR1), which delivers up to 30% price-performance improvement over existing memory optimized instances in internal benchmarks, and uses Amazon Simple Storage Service (Amazon S3) to provide 11 9s of durability. With this new instance family, OpenSearch Service uses OpenSearch innovation and AWS technologies to reimagine how data is indexed and stored in the cloud.

Today, customers widely use OpenSearch Service for operational analytics because of its ability to ingest high volumes of data while also providing rich and interactive analytics. In order to provide these benefits, OpenSearch is designed as a high-scale distributed system with multiple independent instances indexing data and processing requests. As your operational analytics data velocity and volume of data grows, bottlenecks may emerge. To sustainably support high indexing volume and provide durability, we built the OR1 instance family.

In this post, we discuss how the reimagined data flow works with OR1 instances and how it can provide high indexing throughput and durability using a new physical replication protocol. We also dive deep into some of the challenges we solved to maintain correctness and data integrity.

Designing for high throughput with 11 9s of durability

OpenSearch Service manages tens of thousands of OpenSearch clusters. We’ve gained insights into typical cluster configurations that customers use to meet high throughput and durability goals. To achieve higher throughput, customers often choose to drop replica copies to save on the replication latency; however, this configuration results in sacrificing availability and durability. Other customers require high durability and as a result need to maintain multiple replica copies, resulting in higher operating costs for them.

The OpenSearch Optimized Instance family provides additional durability while also keeping costs lower by storing a copy of the data on Amazon S3. With OR1 instances, you can configure multiple replica copies for high read availability while maintaining indexing throughput.
The following diagram illustrates an indexing flow involving a metadata update in OR1.

Indexing Request Flow in OR1

During indexing operations, individual documents are indexed into Lucene and also appended to a write-ahead log also known as a translog. Before sending back an acknowledgement to the client, all translog operations are persisted to the remote data store backed by Amazon S3. If any replica copies are configured, the primary copy performs checks to detect the possibility of multiple writers (control flow) on all replica copies for correctness reasons.
The following diagram illustrates the segment generation and replication flow in OR1 instances.

Replication Flow in OR1

Periodically, as new segment files are created, the OR1 instances copy those segments to Amazon S3. When the transfer is complete, the primary publishes new checkpoints to all replica copies, notifying them of a new segment being available for download. The replica copies subsequently download newer segments and make them searchable. This model decouples the data flow that happens using Amazon S3 and the control flow (checkpoint publication and term validation) that happens over inter-node transport communication.

The following diagram illustrates the recovery flow in OR1 instances.
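
The separation of data flow and control flow can be pictured with a small in-memory model. This is an illustrative sketch only: RemoteStore, Primary, and Replica are hypothetical stand-ins for Amazon S3 and OpenSearch shard copies, and the checkpoint is reduced to a list of segment names.

```python
class RemoteStore:
    """Stand-in for the segment store on Amazon S3."""

    def __init__(self):
        self.segments = {}


class Replica:
    def __init__(self, store):
        self.store = store
        self.searchable = {}

    def on_checkpoint(self, segment_names):
        # Control flow: the checkpoint only names the available segments.
        # Data flow: any missing segment is fetched from the remote store.
        for name in segment_names:
            if name not in self.searchable:
                self.searchable[name] = self.store.segments[name]


class Primary:
    def __init__(self, store, replicas):
        self.store = store
        self.replicas = replicas
        self.local = {}

    def refresh(self, name, data):
        self.local[name] = data
        self.store.segments[name] = data       # upload the segment first
        for replica in self.replicas:          # then publish the checkpoint
            replica.on_checkpoint(sorted(self.local))


store = RemoteStore()
replicas = [Replica(store), Replica(store)]
primary = Primary(store, replicas)
primary.refresh("_0", b"docs 1-100")
primary.refresh("_1", b"docs 101-200")
print([sorted(r.searchable) for r in replicas])
```

Note that the primary never sends segment bytes to a replica in this model; it only tells replicas what to fetch.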

Recovery Flow in OR1

OR1 instances persist not only the data, but the cluster metadata like index mappings, templates, and settings in Amazon S3. This makes sure that in the event of a cluster-manager quorum loss, which is a common failure mode in non-dedicated cluster-manager setups, OpenSearch can reliably recover the last acknowledged metadata.

In the event of an infrastructure failure, an OpenSearch domain can end up losing one or more nodes. In such an event, the new instance family guarantees recovery of both the cluster metadata and the index data up to the latest acknowledged operation. As new replacement nodes join the cluster, the internal cluster recovery mechanism bootstraps the new set of nodes and then recovers the latest cluster metadata from the remote cluster metadata store. After the cluster metadata is recovered, the recovery mechanism starts to hydrate the missing segment data and translog from Amazon S3. Then all uncommitted translog operations, up to the last acknowledged operation, are replayed to reinstate the lost copy.
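
The recovery order described above (metadata first, then segments, then translog replay) can be sketched as follows. The data structures are hypothetical simplifications: committed documents stand in for segment files, and each translog entry is a (sequence number, document) pair.

```python
def recover_shard(remote):
    """Rebuild a lost shard copy from a dict that stands in for Amazon S3."""
    metadata = remote["cluster_metadata"]      # 1. recover cluster metadata
    docs = list(remote["segments"])            # 2. hydrate committed segment data
    committed_seq = remote["last_committed_seq"]
    acked_seq = remote["last_acked_seq"]
    for seq, doc in remote["translog"]:        # 3. replay uncommitted operations
        if committed_seq < seq <= acked_seq:
            docs.append(doc)
    return metadata, docs


remote = {
    "cluster_metadata": {"index": "logs", "mappings": {"msg": "text"}},
    "segments": ["doc-1", "doc-2"],
    "last_committed_seq": 2,
    "last_acked_seq": 4,
    "translog": [(1, "doc-1"), (2, "doc-2"), (3, "doc-3"), (4, "doc-4")],
}
metadata, docs = recover_shard(remote)
print(metadata["index"], docs)
```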

The new design doesn’t modify the way searches work. Queries are processed normally by either the primary or replica shard for each shard in the index. You may see longer delays (in the 10-second range) before all copies are consistent to a particular point in time because the data replication is using Amazon S3.

A key advantage of this architecture is that it serves as a foundational building block for future innovations, like separation of readers and writers, and helps segregate compute and storage layers.

How redefining the replication strategy boosts the indexing throughput

OpenSearch supports two replication strategies: logical (document) and physical (segment) replication. In the case of logical replication, the data is indexed on all the copies independently, leading to redundant computation on the cluster. The OR1 instances use the new physical replication model, where data is indexed only on the primary copy and additional copies are created by copying data from the primary. With a high number of replica copies, the node hosting the primary copy requires significant network bandwidth, replicating the segment to all the copies. The new OR1 instances solve this problem by durably persisting the segment to Amazon S3, which is configured as a remote storage option. They also help with scaling replicas without bottlenecking on primary.

After the segments are uploaded to Amazon S3, the primary sends out a checkpoint request, notifying all replicas to download the new segments. The replica copies then need to download only the incremental segments. Because replicas no longer spend compute redundantly indexing data, and primaries no longer incur the network overhead of replicating data to every copy, the cluster can sustain higher throughput. If replicas can’t process the newly created segments because of overload or slow network paths, replicas that fall too far behind are marked as failed to prevent them from returning stale results.

Why high durability is a good idea, but hard to do well

Although all committed segments are durably persisted to Amazon S3 whenever they get created, one of the key challenges in achieving high durability is synchronously writing all uncommitted operations to a write-ahead log on Amazon S3 before acknowledging the request to the client, without sacrificing throughput. The new semantics introduce additional network latency for individual requests, but we’ve made sure there is no impact to throughput by batching and draining requests on a single thread for up to a specified interval, while making sure other threads continue to index requests. As a result, you can drive higher throughput with more concurrent client connections by optimally batching your bulk payloads.
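
The batching idea can be illustrated with a toy buffer that collects operations from many requests and pays the remote-write cost once per flush. This is a simplified, single-threaded sketch; TranslogBatcher and its upload callback are hypothetical, and the actual implementation coordinates multiple threads and a time interval.

```python
class TranslogBatcher:
    def __init__(self, upload):
        self.upload = upload   # callable that persists a list of operations
        self.pending = []      # (operation, ack callback) pairs

    def submit(self, operation, on_ack):
        """Buffer an operation; the client is not acknowledged yet."""
        self.pending.append((operation, on_ack))

    def flush(self):
        """Persist all buffered operations in one remote write, then ack each."""
        if not self.pending:
            return
        batch, self.pending = self.pending, []
        self.upload([op for op, _ in batch])
        for _, on_ack in batch:
            on_ack()


uploads = []  # each entry represents one remote write
acked = []
batcher = TranslogBatcher(uploads.append)
for i in range(3):
    batcher.submit(f"index doc-{i}", lambda i=i: acked.append(i))
batcher.flush()
print(len(uploads), acked)
```

Three requests are acknowledged after a single remote write, which is how per-request latency can rise while aggregate throughput holds steady.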

Other challenges in designing a highly durable system include enforcing data integrity and correctness at all times. Although some events like network partitions are rare, they can break the correctness of the system and therefore the system needs to be prepared to deal with these failure modes. Therefore, while switching to the new segment replication protocol, we also introduced a few other protocol changes, like detecting multiple writers on each replica. The protocol makes sure that an isolated writer can’t acknowledge a write request, while another newly promoted primary, based on the cluster-manager quorum, is concurrently accepting newer writes.
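
One common way to fence off an isolated writer is to compare primary terms, in line with the term validation mentioned earlier in this post. The sketch below is an illustration under that assumption, with hypothetical class and function names; it is not the OpenSearch protocol itself.

```python
class ReplicaCopy:
    def __init__(self):
        self.highest_term = 0

    def validate_writer(self, primary_term):
        """Accept the writer only if its term is not older than any term seen."""
        if primary_term < self.highest_term:
            return False
        self.highest_term = primary_term
        return True


def can_acknowledge(primary_term, replicas):
    """A primary acknowledges a write only if every replica accepts its term."""
    return all(replica.validate_writer(primary_term) for replica in replicas)


replicas = [ReplicaCopy(), ReplicaCopy()]
old_primary_ok = can_acknowledge(1, replicas)   # original primary, term 1
new_primary_ok = can_acknowledge(2, replicas)   # newly promoted primary, term 2
stale_ok = can_acknowledge(1, replicas)         # isolated old primary retries
print(old_primary_ok, new_primary_ok, stale_ok)
```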

The new instance family automatically detects the loss of a primary shard while recovering data, and performs extensive checks on network reachability before the data can be re-hydrated from Amazon S3 and the cluster is brought back to a healthy state.

For data integrity, all files are extensively checksummed to make sure we are able to detect and prevent network or file system corruption that may result in data being unreadable. Furthermore, all files, including metadata, are designed to be immutable, which provides additional safety against corruption, and are versioned to prevent accidental mutating changes.
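
As a small illustration of checksum-based corruption detection, the sketch below stores a digest alongside each file and verifies it on read. SHA-256 from the Python standard library is used purely for illustration; this post does not state which checksum algorithm OR1 instances use, and the function names are hypothetical.

```python
import hashlib


def write_with_checksum(data: bytes) -> dict:
    """Return a record holding the data and its SHA-256 digest."""
    return {"data": data, "sha256": hashlib.sha256(data).hexdigest()}


def verify(record: dict) -> bool:
    """Recompute the digest and compare it with the stored one."""
    return hashlib.sha256(record["data"]).hexdigest() == record["sha256"]


record = write_with_checksum(b"segment bytes")
intact = verify(record)

record["data"] = b"segment bytez"  # simulate corruption in transit or on disk
corruption_detected = not verify(record)
print(intact, corruption_detected)
```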

Reimagining how data flows

The OR1 instances hydrate copies directly from Amazon S3 in order to perform recovery of lost shards during an infrastructure failure. By using Amazon S3, we are able to free up the primary node’s network bandwidth, disk throughput, and compute, and therefore provide a more seamless in-place scaling and blue/green deployment experience by orchestrating the entire process with minimal primary node coordination.

OpenSearch Service provides automatic data backups called snapshots at hourly intervals, which means in case of accidental modifications to data, you have the option to go back to a previous point in time state. However, with the new OpenSearch instance family, we’ve discussed that the data is already durably persisted on Amazon S3. So how do snapshots work when we already have the data present on Amazon S3?

With the new instance family, snapshots serve as checkpoints, referencing the already present segment data as it exists at a point in time. This makes snapshots more lightweight and faster because they don’t need to re-upload any additional data. Instead, they upload metadata files that capture the view of the segments at that point in time, which we call shallow snapshots. The benefit of shallow snapshots extends to all operations, namely creation, deletion, and cloning of snapshots. You still have the option to snapshot an independent copy with manual snapshots for other administrative operations.
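
Conceptually, a shallow snapshot records which segment files made up the index at a point in time instead of copying them. The following sketch models that with plain dictionaries; the function and key names are hypothetical.

```python
def take_shallow_snapshot(remote_segments, name):
    """Capture references to the current segment files, not their contents."""
    return {"name": name, "segment_refs": sorted(remote_segments)}


def restore(snapshot, remote_segments):
    """Resolve the references back to segment data held in the remote store."""
    return {ref: remote_segments[ref] for ref in snapshot["segment_refs"]}


remote_segments = {"_0": b"a" * 1000, "_1": b"b" * 1000}
snap = take_shallow_snapshot(remote_segments, "hourly-01")

remote_segments["_2"] = b"c" * 1000  # later indexing adds another segment
restored = restore(snap, remote_segments)
print(sorted(restored))
```

The snapshot itself holds only two short names, which is why creating, deleting, or cloning it is cheap compared with copying the segment data again.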

Summary

OpenSearch is an open source, community-driven software. Most of the foundational changes including the replication model, remote-backed storage, and remote cluster metadata have been contributed to open source; in fact, we follow an open source first development model.

Efforts to improve throughput and reliability are a never-ending cycle as we continue to learn and improve. The new OpenSearch optimized instances serve as a foundational building block, paving the way for future innovations. We are excited to continue our efforts in improving reliability and performance and to see what new and existing solutions builders can create using OpenSearch Service. We hope this leads to a deeper understanding of the new OpenSearch instance family, how this offering achieves high durability and better throughput, and how it can help you configure clusters based on the needs of your business.

If you’re excited to contribute to OpenSearch, open up a GitHub issue and let us know your thoughts. We would also love to hear about your success stories achieving high throughput and durability on OpenSearch Service. If you have other questions, please leave a comment.


About the Authors

Bukhtawar Khan is a Principal Engineer working on Amazon OpenSearch Service. He is interested in building distributed and autonomous systems. He is a maintainer and an active contributor to OpenSearch.

Gaurav Bafna is a Senior Software Engineer working on OpenSearch at Amazon Web Services. He is fascinated by solving problems in distributed systems. He is a maintainer and an active contributor to OpenSearch.

Sachin Kale is a senior software development engineer at AWS working on OpenSearch.

Rohin Bhargava is a Sr. Product Manager with the Amazon OpenSearch Service team. His passion at AWS is to help customers find the correct mix of AWS services to achieve success for their business goals.

Ranjith Ramachandra is a Senior Engineering Manager working on Amazon OpenSearch Service. He is passionate about highly scalable distributed systems, high performance and resilient systems.

Accelerate security automation using Amazon CodeWhisperer

Post Syndicated from Brendan Jenkins original https://aws.amazon.com/blogs/security/accelerate-security-automation-using-amazon-codewhisperer/

In an ever-changing security landscape, teams must be able to quickly remediate security risks. Many organizations look for ways to automate the remediation of security findings that are currently handled manually. Amazon CodeWhisperer is an artificial intelligence (AI) coding companion that generates real-time, single-line or full-function code suggestions in your integrated development environment (IDE) to help you quickly build software. By using CodeWhisperer, security teams can expedite the process of writing security automation scripts for various types of findings that are aggregated in AWS Security Hub, a cloud security posture management (CSPM) service.

In this post, we present some of the current challenges with security automation and walk you through how to use CodeWhisperer, together with Amazon EventBridge and AWS Lambda, to automate the remediation of Security Hub findings. Before reading further, please read the AWS Responsible AI Policy.

Current challenges with security automation

Many approaches to security automation, including Lambda and AWS Systems Manager Automation, require software development skills. Furthermore, the process of manually writing code for remediation can be a time-consuming process for security professionals. To help overcome these challenges, CodeWhisperer serves as a force multiplier for qualified security professionals with development experience to quickly and effectively generate code to help remediate security findings.

Security professionals should still cultivate software development skills to implement robust solutions. Engineers should thoroughly review and validate any generated code, as manual oversight remains critical for security.

Solution overview

Figure 1 shows how the findings that Security Hub produces are ingested by EventBridge, which then invokes Lambda functions for processing. The Lambda code is generated with the help of CodeWhisperer.

Figure 1: Diagram of the solution

Security Hub integrates with EventBridge so you can automatically process findings with other services such as Lambda. To begin remediating the findings automatically, you can configure rules to determine where to send findings. This solution will do the following:

  1. Ingest an Amazon Security Hub finding into EventBridge.
  2. Use an EventBridge rule to invoke a Lambda function for processing.
  3. Use CodeWhisperer to generate the Lambda function code.

It is important to note that there are two types of automation for Security Hub finding remediation:

  • Partial automation, which is initiated when a human worker selects the Security Hub findings manually and applies the automated remediation workflow to the selected findings.
  • End-to-end automation, which means that when a finding is generated within Security Hub, this initiates an automated workflow to immediately remediate without human intervention.

Important: When you use end-to-end automation, we highly recommend that you thoroughly test the efficiency and impact of the workflow in a non-production environment first before moving forward with implementation in a production environment.
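
The two automation styles correspond to different EventBridge event patterns. The sketch below builds both as Python dictionaries. The custom action ARN is a placeholder, and the compliance filter in the end-to-end pattern is an assumption about how you might scope such a rule; adjust both to your own account and findings.

```python
import json

# Placeholder ARN; replace with the ARN of your own Security Hub custom action.
CUSTOM_ACTION_ARN = (
    "arn:aws:securityhub:us-east-1:111122223333:action/custom/S3Versioning"
)

# Partial automation: fires when a person applies the custom action to findings.
partial_automation_pattern = {
    "source": ["aws.securityhub"],
    "detail-type": ["Security Hub Findings - Custom Action"],
    "resources": [CUSTOM_ACTION_ARN],
}

# End-to-end automation: fires when findings are created or updated.
end_to_end_pattern = {
    "source": ["aws.securityhub"],
    "detail-type": ["Security Hub Findings - Imported"],
    "detail": {"findings": {"Compliance": {"Status": ["FAILED"]}}},
}

print(json.dumps(partial_automation_pattern, indent=2))
print(json.dumps(end_to_end_pattern, indent=2))
```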

Prerequisites

To follow along with this walkthrough, make sure that you have the following prerequisites in place:

Implement security automation

In this scenario, you have been tasked with making sure that versioning is enabled across all Amazon Simple Storage Service (Amazon S3) buckets in your AWS account. Additionally, you want to do this in a way that is programmatic and automated so that it can be reused in different AWS accounts in the future.

To do this, you will perform the following steps:

  1. Generate the remediation script with CodeWhisperer
  2. Create the Lambda function
  3. Integrate the Lambda function with Security Hub by using EventBridge
  4. Create a custom action in Security Hub
  5. Create an EventBridge rule to target the Lambda function
  6. Run the remediation

Generate a remediation script with CodeWhisperer

The first step is to use VS Code to create a script so that CodeWhisperer generates the code for your Lambda function in Python. You will use this Lambda function to remediate the Security Hub findings generated by the [S3.14] S3 buckets should use versioning control.

Note: The underlying model of CodeWhisperer is powered by generative AI, and the output of CodeWhisperer is nondeterministic. As such, the code recommended by the service can vary by user. By modifying the initial code comment to prompt CodeWhisperer for a response, customers can change the corresponding output to help meet their needs. Customers should subject all code generated by CodeWhisperer to typical testing and review protocols to verify that it is free of errors and is in line with applicable organizational security policies. To learn about best practices on prompt engineering with CodeWhisperer, see this AWS blog post.

To generate the remediation script

  1. Open a new VS Code window, and then open or create a new folder for your file to reside in.
  2. Create a Python file called cw-blog-remediation.py as shown in Figure 2.
     
    Figure 2: New VS Code file created called cw-blog-remediation.py

  3. Add the following imports to the Python file.
    import json
    import boto3

  4. Because you have the context added to your file, you can now prompt CodeWhisperer by using a natural language comment. In your file, below the import statements, enter the following comment and then press Enter.
    # Create lambda function that turns on versioning for an S3 bucket after the function is triggered from Amazon EventBridge

  5. Accept the first recommendation that CodeWhisperer provides by pressing Tab to use the Lambda function handler, as shown in Figure 3.
    Figure 3: Generation of Lambda handler

  6. To get the recommendation for the function from CodeWhisperer, press Enter. Make sure that the recommendation you receive looks similar to the following. CodeWhisperer is nondeterministic, so its recommendations can vary.
    import json
    import boto3
    
    # Create lambda function that turns on versioning for an S3 bucket after function is triggered from Amazon EventBridge
    def lambda_handler(event, context):
        s3 = boto3.client('s3')
        bucket = event['detail']['requestParameters']['bucketName']
        response = s3.put_bucket_versioning(
            Bucket=bucket,
            VersioningConfiguration={
                'Status': 'Enabled'
            }
        )
        print(response)
        return {
            'statusCode': 200,
            'body': json.dumps('Versioning enabled for bucket ' + bucket)
        }
    

  7. Take a moment to review the user actions and keyboard shortcut keys. Press Tab to accept the recommendation.
  8. You can change the function body to fit your use case. To get the Amazon Resource Name (ARN) of the S3 bucket from the EventBridge event, replace the bucket variable with the following line:
    bucket = event['detail']['findings'][0]['Resources'][0]['Id']

  9. To prompt CodeWhisperer to extract the bucket name from the bucket ARN, use the following comment:
    # Take the S3 bucket name from the ARN of the S3 bucket

    Your function code should look similar to the following:

    import json
    import boto3
    
    # Create lambda function that turns on versioning for an S3 bucket after function is triggered from Amazon EventBridge
    def lambda_handler(event, context):
        s3 = boto3.client('s3')
        bucket = event['detail']['findings'][0]['Resources'][0]['Id']
        # Take the S3 bucket name from the ARN of the S3 bucket
        bucket = bucket.split(':')[5]
    
        response = s3.put_bucket_versioning(
            Bucket=bucket,
            VersioningConfiguration={
                'Status': 'Enabled'
            }
        )
        print(response)
        return {
            'statusCode': 200,
            'body': json.dumps('Versioning enabled for bucket ' + bucket)
        }
    

  10. Create a .zip file for cw-blog-remediation.py. Find the file in your local file manager, right-click the file, and select compress/zip. You will use this .zip file in the next section of the post.
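The generated handler reads only the first resource of the first finding and assumes that the resource ID is an S3 bucket ARN. As a minimal sketch of a more defensive variant, the helper functions below (hypothetical names, not part of the CodeWhisperer output) collect the bucket name from every S3 bucket resource in the event. The sketch assumes that the event follows the Security Hub custom action shape used above and that S3 bucket resources carry the type AwsS3Bucket.

```python
def bucket_name_from_arn(arn: str) -> str:
    """Return the bucket name from an S3 bucket ARN such as arn:aws:s3:::my-bucket."""
    prefix, sep, name = arn.partition(":::")
    if not sep or not prefix.endswith(":s3") or not name:
        raise ValueError(f"Not an S3 bucket ARN: {arn!r}")
    return name


def buckets_from_event(event: dict) -> list:
    """Collect bucket names from every AwsS3Bucket resource in every finding."""
    names = []
    for finding in event.get("detail", {}).get("findings", []):
        for resource in finding.get("Resources", []):
            if resource.get("Type") == "AwsS3Bucket":
                names.append(bucket_name_from_arn(resource["Id"]))
    return names


# Trimmed-down sample event with a placeholder bucket name (assumption).
sample_event = {
    "detail": {
        "findings": [
            {"Resources": [{"Type": "AwsS3Bucket", "Id": "arn:aws:s3:::amzn-s3-demo-bucket"}]}
        ]
    }
}
print(buckets_from_event(sample_event))  # ['amzn-s3-demo-bucket']
```

Inside lambda_handler, you could loop over the returned names and call put_bucket_versioning once per bucket, which also covers the case where several findings are selected at the same time.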

Create the Lambda function

The next step is to use the automation script that you generated to create the Lambda function that will enable versioning on applicable S3 buckets.

To create the Lambda function

  1. Open the AWS Lambda console.
  2. In the left navigation pane, choose Functions, and then choose Create function.
  3. Select Author from scratch and provide the following configurations for the function:
    1. For Function name, enter sec_remediation_function.
    2. For Runtime, select Python 3.12.
    3. For Architecture, select x86_64.
    4. For Permissions, select Create a new role with basic Lambda permissions.
  4. Choose Create function.
  5. To upload your local code to Lambda, select Upload from and then .zip file, and then upload the file that you zipped.
  6. Verify that you created the Lambda function successfully. In the Code source section of Lambda, you should see the code from the automation script displayed in a new tab, as shown in Figure 4.
     
    Figure 4: Source code that was successfully uploaded

  7. Choose the Code tab.
  8. Scroll down to the Runtime settings pane and choose Edit.
  9. For Handler, enter cw-blog-remediation.lambda_handler for your function handler, and then choose Save, as shown in Figure 5.
     
    Figure 5: Updated Lambda handler

  10. For security purposes, and to follow the principle of least privilege, you should also add an inline policy to the Lambda function’s role to perform the tasks necessary to enable versioning on S3 buckets.
    1. In the Lambda console, navigate to the Configuration tab and then, in the left navigation pane, choose Permissions. Choose the Role name, as shown in Figure 6.
       
      Figure 6: Lambda role in the AWS console

    2. In the Add permissions dropdown, select Create inline policy.
       
      Figure 7: Create inline policy

    3. Choose JSON, add the following policy to the policy editor, and then choose Next.
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Sid": "VisualEditor0",
                  "Effect": "Allow",
                  "Action": "s3:PutBucketVersioning",
                  "Resource": "*"
              }
          ]
      }

    4. Name the policy PutBucketVersioning and choose Create policy.
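The policy above allows s3:PutBucketVersioning on every bucket, so the automation can act on any finding. If you know in advance which buckets the function should be allowed to change, one optional variation is to narrow the Resource element to those buckets. The following is a minimal sketch; the function name, the Sid value, and the bucket name are placeholders, and the ARN assumes the standard aws partition.

```python
import json


def versioning_policy(bucket_names):
    """Build an inline policy that allows s3:PutBucketVersioning only on the named buckets."""
    if not bucket_names:
        raise ValueError("Provide at least one bucket name")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowPutBucketVersioning",
                "Effect": "Allow",
                "Action": "s3:PutBucketVersioning",
                # One ARN per bucket; assumes the standard "aws" partition.
                "Resource": [f"arn:aws:s3:::{name}" for name in bucket_names],
            }
        ],
    }


# Print the JSON so that it can be pasted into the policy editor.
print(json.dumps(versioning_policy(["amzn-s3-demo-bucket"]), indent=4))
```

The trade-off is that a scoped policy must be updated whenever a new bucket needs remediation, whereas the wildcard policy works for any bucket in the account.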

Create a custom action in Security Hub

In this step, you will create a custom action in Security Hub.

To create the custom action

  1. Open the Security Hub console.
  2. In the left navigation pane, choose Settings, and then choose Custom actions.
  3. Choose Create custom action.
  4. Provide the following information, as shown in Figure 8:
    • For Name, enter TurnOnS3Versioning.
    • For Description, enter Action that will turn on versioning for a specific S3 bucket.
    • For Custom action ID, enter TurnOnS3Versioning.
       
      Figure 8: Create a custom action in Security Hub

  5. Choose Create custom action.
  6. Make a note of the Custom action ARN. You will need this ARN when you create a rule to associate with the custom action in EventBridge.
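If you prefer to script this step so that it can be repeated in other accounts, the same custom action can be created with the Security Hub CreateActionTarget API operation. The following is a minimal sketch that uses the values from the console steps above; the wrapper function name is hypothetical, and the client is passed in so that the function is easy to test.

```python
def create_versioning_action(securityhub_client):
    """Create the TurnOnS3Versioning custom action and return its ARN."""
    response = securityhub_client.create_action_target(
        Name="TurnOnS3Versioning",
        Description="Action that will turn on versioning for a specific S3 bucket",
        Id="TurnOnS3Versioning",
    )
    # The returned ARN is the value that the EventBridge rule matches on.
    return response["ActionTargetArn"]


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   action_arn = create_versioning_action(boto3.client("securityhub"))
#   print(action_arn)
```

The call fails if a custom action with the same ID already exists, so a reusable script would need to handle that case.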

Create an EventBridge rule to target the Lambda function

The next step is to create an EventBridge rule to capture the custom action. You will define an EventBridge rule that matches events (in this case, findings) from Security Hub that were forwarded by the custom action that you defined previously.

To create the EventBridge rule

  1. Navigate to the EventBridge console.
  2. On the right side, choose Create rule.
  3. On the Define rule detail page, give your rule a name and description that represents the rule’s purpose—for example, you could use the same name and description that you used for the custom action. Then choose Next.
  4. Scroll down to Event pattern, and then do the following:
    1. For Event source, make sure that AWS services is selected.
    2. For AWS service, select Security Hub.
    3. For Event type, select Security Hub Findings – Custom Action.
    4. Select Specific custom action ARN(s) and enter the ARN for the custom action that you created earlier.
       
    Figure 9: Specify the EventBridge event pattern for the Security Hub custom action workflow

    As you provide this information, the Event pattern updates.

  5. Choose Next.
  6. On the Select target(s) step, in the Select a target dropdown, select Lambda function. Then from the Function dropdown, select sec_remediation_function.
  7. Choose Next.
  8. On the Configure tags step, choose Next.
  9. On the Review and create step, choose Create rule.
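The event pattern that the console builds from these selections can also be expressed in code, which is useful if you later recreate the rule programmatically. The sketch below builds the pattern as a Python dictionary. The matches function is a simplified stand-in for EventBridge matching that handles only top-level fields and is included only to show which events the rule would select. The function names and the ARN are placeholders.

```python
import json


def custom_action_event_pattern(action_arn):
    """Event pattern for Security Hub findings forwarded by one custom action."""
    return {
        "source": ["aws.securityhub"],
        "detail-type": ["Security Hub Findings - Custom Action"],
        "resources": [action_arn],
    }


def matches(pattern, event):
    """Simplified illustration of pattern matching: top-level fields only."""
    for key, allowed in pattern.items():
        value = event.get(key)
        values = value if isinstance(value, list) else [value]
        if not any(v in allowed for v in values):
            return False
    return True


# Placeholder ARN; use the custom action ARN that you noted earlier.
ACTION_ARN = "arn:aws:securityhub:us-east-1:111122223333:action/custom/TurnOnS3Versioning"
pattern = custom_action_event_pattern(ACTION_ARN)
print(json.dumps(pattern, indent=2))
```

Matching on the resources field is what ties the rule to a single custom action, so other custom actions in the same account do not invoke the Lambda function.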

Run the automation

Your automation is now set up, and you can test it. This test covers a partially automated workflow, because you manually select the finding and apply the remediation workflow to one or more selected findings.

Important: As we mentioned earlier, if you decide to make the automation end-to-end, you should assess the impact of the workflow in a non-production environment. Additionally, you may want to consider creating preventative controls if you want to minimize the risk of event occurrence across an entire environment.

To run the automation

  1. In the Security Hub console, on the Findings tab, add a filter by entering Title in the search box and selecting that filter. Select IS and enter S3 general purpose buckets should have versioning enabled (case sensitive). Choose Apply.
  2. In the filtered list, choose the Title of an active finding.
  3. Before you start the automation, check the current configuration of the S3 bucket to confirm that your automation works. Expand the Resources section of the finding.
  4. Under Resource ID, choose the link for the S3 bucket. This opens a new tab on the S3 console that shows only this S3 bucket.
  5. In your browser, go back to the Security Hub tab (don’t close the S3 tab—you will need to return to it), and on the left side, select this same finding, as shown in Figure 10.
     
    Figure 10: Filter out Security Hub findings to list only S3 bucket-related findings

  6. In the Actions dropdown list, choose the name of your custom action.
     
    Figure 11: Choose the custom action that you created to start the remediation workflow

  7. When you see a banner that displays Successfully started action…, go back to the S3 browser tab and refresh it. Verify that the S3 versioning configuration on the bucket has been enabled, as shown in Figure 12.
     
    Figure 12: Versioning successfully enabled
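As an alternative to refreshing the S3 console, you can confirm the result with the S3 GetBucketVersioning API operation. The following is a minimal sketch; the function name is hypothetical, and the client is passed in as a parameter so that the check can be exercised without AWS access.

```python
def is_versioning_enabled(s3_client, bucket):
    """Return True if versioning is currently enabled on the bucket."""
    response = s3_client.get_bucket_versioning(Bucket=bucket)
    # The Status key is absent when versioning has never been configured,
    # and is "Suspended" when versioning was turned off after being enabled.
    return response.get("Status") == "Enabled"


# Usage, assuming boto3 is installed and AWS credentials are configured:
#   import boto3
#   print(is_versioning_enabled(boto3.client("s3"), "amzn-s3-demo-bucket"))
```

The role or user that runs this check needs the s3:GetBucketVersioning permission, which is not part of the inline policy that you attached to the Lambda function's role.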

Conclusion

In this post, you learned how to use CodeWhisperer to produce AI-generated code for custom remediations for a security use case. We encourage you to experiment with CodeWhisperer to create Lambda functions that remediate other Security Hub findings that might exist in your account, such as the enforcement of lifecycle policies on S3 buckets with versioning enabled, or using automation to remove multiple unused Amazon EC2 Elastic IP addresses. The ability to automatically enable versioning on S3 buckets is just one of many use cases where CodeWhisperer can generate code to help you remediate Security Hub findings.

To sum up, CodeWhisperer can help boost the productivity of security experts who have coding abilities by helping them quickly write code to address security issues. However, security specialists should continue building their software development capabilities to implement robust solutions. Engineers should carefully review and test any generated code, because human oversight is still vital for security.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Brendan Jenkins

Brendan is a Solutions Architect at AWS who works with enterprise customers, providing them with technical guidance and helping them achieve their business goals. He specializes in DevOps and machine learning (ML) technology.

Chris Shea

Chris is an AWS Solutions Architect serving enterprise customers in the PropTech and AdTech industry verticals, providing guidance and the tools that customers need for success. His areas of interest include AI for DevOps and AI/ML technology.

Tim Manik

Tim is a Solutions Architect at AWS working with enterprise customers on migrations and modernizations. He specializes in cybersecurity and AI/ML and is passionate about bridging the gap between the two fields.

Angel Tolson

Angel is a Solutions Architect at AWS working with small to medium size businesses, providing them with technical guidance and helping them achieve their business goals. She is particularly interested in cloud operations and networking.