Tag Archives: IdP

Build an end-to-end attribute-based access control strategy with AWS SSO and Okta

Post Syndicated from Louay Shaat original https://aws.amazon.com/blogs/security/build-an-end-to-end-attribute-based-access-control-strategy-with-aws-sso-and-okta/

This blog post discusses the benefits of using an attribute-based access control (ABAC) strategy and also describes how to use ABAC with AWS Single Sign-On (AWS SSO) when you’re using Okta as an identity provider (IdP).

Over the past two years, Amazon Web Services (AWS) has invested heavily in making ABAC available across the majority of our services. With ABAC, you can simplify your access control strategy by granting access to groups of resources, which are specified by tags, instead of managing long lists of individual resources. Each tag is a label that consists of a user-defined key and value, and you can use these to assign metadata to your AWS resources. Tags can help you manage, identify, organize, search for, and filter resources. You can create tags to categorize resources by purpose, owner, environment, or other criteria. To learn more about tags and AWS best practices for tagging, see Tagging AWS resources.

The ability to include tags in sessions—combined with the ability to tag AWS Identity and Access Management (IAM) users and roles—means that you can now incorporate user attributes from your identity provider as part of your tagging and authorization strategy. Additionally, user attributes help organizations to make permissions more intuitive, because the attributes are easier to relate to teams and functions. A tag that represents a team or a job function is easier to audit and understand.

For more information on ABAC in AWS, see our ABAC documentation.

Why use ABAC?

ABAC is a strategy that that can help organizations to innovate faster. Implementing a purely role-based access control (RBAC) strategy requires identity and security teams to define a large number of RBAC policies, which can lead to complexity and time delays. With ABAC, you can make use of attributes to build more dynamic policies that provide access based on matching the attribute conditions. AWS supports both RBAC and ABAC as co-existing strategies, so you can use ABAC alongside your existing RBAC strategy.

A good example that uses ABAC is the scenario where you have two teams that require similar access to their secrets in AWS Secrets Manager. By using ABAC, you can build a single role or policy with a condition based on the Department attribute from your IdP. When the user is authenticated, you can pass the Department attribute value and use a condition to provide access to resources that have the identical tag, as shown in the following code snippet. In this post, I show how to use ABAC for this example scenario.

"Condition": {
                "StringEquals": {
                    "secretsmanager:ResourceTag/Department": "${aws:PrincipalTag/Department}"

ABAC provides organizations with a more dynamic way of working with permissions. There are four main benefits for organizations that use ABAC:

  • Scale your permissions as you innovate: As developers create new project resources, administrators can require specific attributes to be applied when resources are created. This can include applying tags with attributes that give developers immediate access to the new resources they create, without requiring an update to their own permissions.
  • Help your teams to change and grow quickly: Because permissions are based on user attributes from a corporate identity source such as an IdP, changing user attributes in the IdP that you use for access control in AWS automatically updates your permissions in AWS.
  • Create fewer AWS SSO permission sets and IAM roles: With ABAC, multiple users who are using the same AWS SSO permission set and IAM role can still get unique permissions, because permissions are now based on user attributes. Administrators can author IAM policies that grant users access only to AWS resources that have matching attributes. This helps to reduce the number of IAM roles you need to create for various use cases in a single AWS account.
  • Efficiently audit who performed an action: By using attributes that are logged in AWS CloudTrail next to every action that is performed in AWS by using an IAM role, you can make it easier for security administrators to determine the identity that takes actions in a role session.

Prerequisites

In this section, I describe some higher-level prerequisites for using ABAC effectively. ABAC in AWS relies on the use of tags for access-control decisions, so it’s important to have in place a tagging strategy for your resources. To help you develop an effective strategy, see the AWS Tagging Strategies whitepaper.

Organizations that implement ABAC can enhance the use of tags across their resources for the purpose of identity access. Making sure that tagging is enforced and secure is essential to an enterprise-wide strategy. For more information about enforcing a tagging policy, see the blog post Enforce Centralized Tag Compliance Using AWS Service Catalog, DynamoDB, Lambda, and CloudWatch Events.

You can use the service AWS Resource Groups to identify untagged resources and to find resources to tag. You can also use Resource Groups to remediate untagged resources.

Use AWS SSO with Okta as an IdP

AWS SSO gives you an efficient way to centrally manage access to multiple AWS accounts and business applications, and to provide users with single sign-on access to all their assigned accounts and applications from one place. With AWS SSO, you can manage access and user permissions to all of your accounts in AWS Organizations centrally. AWS SSO configures and maintains all the necessary permissions for your accounts automatically, without requiring any additional setup in the individual accounts.

AWS SSO supports access control attributes from any IdP. This blog post focuses on how you can use ABAC attributes with AWS SSO when you’re using Okta as an external IdP.

Use other single sign-on services with ABAC

This post describes how to turn on ABAC in AWS SSO. To turn on ABAC with other federation services, see these links:

Implement the solution

Follow these steps to set up Okta as an IdP in AWS SSO and turn on ABAC.

To set up Okta and turn on ABAC

  1. Set up Okta as an IdP for AWS SSO. To do so, follow the instructions in the blog post Single Sign-On Between Okta Universal Directory and AWS. For more information on the supported actions in AWS SSO with Okta, see our documentation.
  2. Enable attributes for access control (in other words, turn on ABAC) in AWS SSO by using these steps:
    1. In the AWS Management Console, navigate to AWS SSO in the AWS Region you selected for your implementation.
    2. On the Dashboard tab, select Choose your identity source.
    3. Next to Attributes for access control, choose Enable.

      Figure 1: Turn on ABAC in AWS SSO

      Figure 1: Turn on ABAC in AWS SSO

    You should see the message “Attributes for access control has been successfully enabled.”

  3. Enable updates for user attributes in Okta provisioning. Now that you’ve turned on ABAC in AWS SSO, you need to verify that automatic provisioning for Okta has attribute updates enabled.Log in to Okta as an administrator and locate the application you created for AWS SSO. Navigate to the Provisioning tab, choose Edit, and verify that Update User Attributes is enabled.

    Figure 2: Enable automatic provisioning for ABAC updates

    Figure 2: Enable automatic provisioning for ABAC updates

  4. Configure user attributes in Okta for use in AWS SSO by following these steps:
    1. From the same application that you created earlier, navigate to the Sign On tab.
    2. Choose Edit, and then expand the Attributes (optional) section.
    3. In the Attribute Statements (optional) section, for each attribute that you will use for access control in AWS SSO, do the following:
      1. For Name, enter https://aws.amazon.com/SAML/Attributes/AccessControl:<AttributeName>. Replace <AttributeName> with the name of the attribute you’re expecting in AWS SSO, for example https://aws.amazon.com/SAML/Attributes/AccessControl:Department.
      2. For Name Format, choose URI reference.
      3. For Value, enter user.<AttributeName>. Replace <AttributeName> with the Okta default user profile variable name, for example user.department. To view the Okta default user profile, see these instructions.

     

    Figure 3: Configure two attributes for users in Okta

    Figure 3: Configure two attributes for users in Okta

    In the example shown here, I added two attributes, Department and Division. The result should be similar to the configuration shown in Figure 3.

  5. Add attributes to your users by using these steps:
    1. In your Okta portal, log in as administrator. Navigate to Directory, and then choose People.
    2. Locate a user, navigate to the Profile tab, and then choose Edit.
    3. Add values to the attributes you selected.
    Figure 4: Addition of user attributes in Okta

    Figure 4: Addition of user attributes in Okta

  6. Confirm that attributes are mapped. Because you’ve enabled automatic provisioning updates from Okta, you should be able to see the attributes for your user immediately in AWS SSO. To confirm this:
    1. In the console, navigate to AWS SSO in the Region you selected for your implementation.
    2. On the Users tab, select a user that has attributes from Okta, and select the user. You should be able to see the attributes that you mapped from Okta.
    Figure 5: User attributes in Okta

    Figure 5: User attributes in Okta

Now that you have ABAC attributes for your users in AWS SSO, you can now create permission sets based on those attributes.

Note: Step 4 ensures that users will not be successfully authenticated unless the attributes configured are present. If you don’t want this enforcement, do not perform step 4.

Build an ABAC permission set in AWS SSO

For demonstration purposes, I’ll show how you can build a permission set that is based on ABAC attributes for AWS Secrets Manager. The permission set will match resource tags to user tags, in order to control which resources can be managed by Secrets Manager administrators. You can apply this single permission set to multiple teams.

To build the ABAC permission set

  1. In the console, navigate to AWS SSO, and choose AWS Accounts.
  2. Choose the Permission sets tab.
  3. Choose Create permission set, and then choose Create a custom permission set.
  4. Fill in the fields as follows.
    1. For Name, enter a name for your permission set that will be visible to your users, for example, SecretsManager-Profile.
    2. For Description, enter ABAC SecretsManager Profile.
    3. Select the appropriate session duration.
    4. For Relay State, for my example I will enter the URL for Secrets Manager: https://console.aws.amazon.com/secretsmanager/home. This will give a better user experience when the user signs in to AWS SSO, with an automatic redirect to the Secrets Manager console.
    5. For the field What policies do you want to include in your permission set?, choose Create a custom permissions policy.
    6. Under Create a custom permissions policy, paste the following policy.
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Sid": "SecretsManagerABAC",
                  "Effect": "Allow",
                  "Action": [
                      "secretsmanager:DescribeSecret",
                      "secretsmanager:PutSecretValue",
                      "secretsmanager:CreateSecret",
                      "secretsmanager:ListSecretVersionIds",
                      "secretsmanager:UpdateSecret",
                      "secretsmanager:GetResourcePolicy",
                      "secretsmanager:GetSecretValue",
                      "secretsmanager:ListSecrets",
                      "secretsmanager:TagResource"
                  ],
                  "Resource": "*",
                  "Condition": {
                      "StringEquals": {
                          "secretsmanager:ResourceTag/Department": "${aws:PrincipalTag/Department}"
                      }
                  }
              },
              {
                  "Sid": "NeededPermissions",
                  "Effect": "Allow",
                  "Action": [
             "kms:ListKeys",
             "kms:ListAliases",
                      "rds:DescribeDBInstances",
                      "redshift:DescribeClusters",
                      "rds:DescribeDBClusters",
                      "secretsmanager:ListSecrets",
                      "tag:GetResources",
                      "lambda:ListFunctions"
                  ],
                  "Resource": "*"
              }
          ]
      }
      

    This policy grants users the ability to create and list secrets that belong to their department. The policy is configured to allow Secrets Manager users to manage only the resources that belong to their department. You can modify this policy to perform matching on more attributes, in order to have more granular permissions.

    Note: The RDS permissions in the policy enable users to select an RDS instance for the secret and the Lambda Permissions are to enable custom key rotation.

    If you look closely at the condition

    “secretsmanager:ResourceTag/Department”: “${aws:PrincipalTag/Department}”

    …the condition states that the user can only access Secrets Manager resources that have a Department tag, where the value of that tag is identical to the value of the Department tag from the user.

  5. Choose Next: Tags.
  6. Tag your permission set. For my example, I’ll add Key: Service and Value: SecretsManager.
  7. Choose Next: Review and create.
  8. Assign the permission set to a user or group and to the appropriate accounts that you have in AWS Organizations.

Test an ABAC permission set

Now you can test the ABAC permission set that you just created for Secrets Manager.

To test the ABAC permission set

  1. In the AWS SSO console, on the Dashboard page, navigate to the User Portal URL.
  2. Sign in as a user who has the attributes that you configured earlier in AWS SSO. You will assume the permission set that you just created.
  3. Choose Management console. This will take you to the console that you specified in the Relay State setting for the permission set, which in my example is the Secrets Manager console.

    Figure 6: AWS SSO ABAC profile access

    Figure 6: AWS SSO ABAC profile access

  4. Try to create a secret with no tags:
    1. Choose Store a new secret.
    2. Choose Other type of secrets.
    3. You can add any values you like for the other options, and then choose Next.
    4. Give your secret a name, but don’t add any tags. Choose Next.
    5. On the Configure automatic rotation page, choose Next, and then choose Store.

    You should receive an error stating that the user failed to create the secret, because the user is not authorized to perform the secretsmanager:CreateSecret action.

    Figure 7: Failure to create a secret (no attributes)

    Figure 7: Failure to create a secret (no attributes)

  5. Choose Previous twice, and then add the appropriate tag. For my example, I’ll add a tag with the key Department and the value Serverless.

    Figure 8: Adding tags for a secret

    Figure 8: Adding tags for a secret

  6. Choose Next twice, and then choose Store. You should see a message that your secret creation was successful.

    Figure 9: Successful secret creation

    Figure 9: Successful secret creation

Now administrators who assume this permission set can view, create, and manage only the secrets that belong to their team or department, based on the tags that you defined. You can reuse this permission set across a large number of teams, which can reduce the number of permission sets you need to create and manage.

Summary

In this post, I’ve talked about the benefits organizations can gain from embracing an ABAC strategy, and walked through how to turn on ABAC attributes in Okta and AWS SSO. I’ve also shown how you can create ABAC-driven permission sets to simplify your permission set management. For more information on AWS services that support ABAC—in other words, authorization based on tags—see our updated AWS services that work with IAM page.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Single Sign-On forum.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Louay Shaat

Louay is a Security Solutions Architect with AWS. He spends his days working with customers, from startups to the largest of enterprises helping them build cool new capabilities and accelerating their cloud journey. He has a strong focus on security and automation helping customers improve their security, risk, and compliance in the cloud.

How to delegate management of identity in AWS Single Sign-On

Post Syndicated from Louay Shaat original https://aws.amazon.com/blogs/security/how-to-delegate-management-of-identity-in-aws-single-sign-on/

In this blog post, I show how you can use AWS Single Sign-On (AWS SSO) to delegate administration of user identities. Delegation is the process of providing your teams permissions to manage accounts and identities associated with their teams. You can achieve this by using the existing integration that AWS SSO has with AWS Organizations, and by using tags and conditions in AWS Identity and Access Management (IAM).

AWS SSO makes it easy to centrally manage access to multiple Amazon Web Services (AWS) accounts and business applications, and to provide users with single sign-on access to all their assigned accounts and applications from one place.

AWS SSO uses permission sets—a collection of administrator-defined policies—to determine a user’s effective permissions to access a given AWS account. Permission sets can contain either AWS managed policies or custom policies that are stored in AWS SSO. Policies are documents that act as containers for one or more permission statements. These statements represent individual access controls (allow or deny) for various tasks, which determine what tasks users can or cannot perform within the AWS account. Permission sets are provisioned as IAM roles in your organizational accounts, and are managed centrally using AWS SSO.

AWS SSO is tightly integrated with AWS Organizations, and runs in your AWS Organizations management account. This integration enables AWS SSO to retrieve and manage permission sets across your AWS Organizations configuration.

As you continue to build more of your workloads on AWS, managing access to AWS accounts and services becomes more time consuming for team members that manage identities. With a centralized identity approach that uses AWS SSO, there’s an increased need to delegate control of permission sets and accounts to domain and application owners. Although this is a valid use case, access to the management account in Organizations should be tightly guarded as a security best practice. As an administrator in the management account of an organization, you can control how teams and users access your AWS accounts and applications.

This post shows how you can build comprehensive delegation models in AWS SSO to securely and effectively delegate control of identities to various teams.

Solution overview

Suppose you’ve implemented AWS SSO in Organizations to manage identity across your entire AWS environment. Your organization is growing and the number of accounts and teams that need access to your AWS environment is also growing. You have a small Identity team that is constantly adding, updating, or deleting users or groups and permission sets to enable your teams to gain access to their required services and accounts.

Note: You can learn how to enable AWS SSO from the Introducing AWS Single Sign-On blog post.

As the number of teams grows, you want to start using a delegation model to enable account and application owners to manage access to their resources, in order to reduce the heavy lifting that is done by teams that manage identities.

Figure 1 shows a simple organizational structure that your organization implemented.
 

Figure 1: AWS SSO with AWS Organizations

Figure 1: AWS SSO with AWS Organizations

In this scenario, you’ve already built a collection of organizational-approved permission sets that are used across your organization. You have a tagging strategy for permission sets, and you’ve implemented two tags across all your permission sets:

  • Environment: The values for this tag are Production or Development. You only apply Production permission sets to Production accounts.
  • OU: This tag identifies the organizational unit (OU) that the permission set belongs to.

A value of All can be assigned to either tag to identify organization-wide use of the permission set.

You identified three models of delegation that you want to enable based on the setup just described, and your Identity team has identified three use cases that they want to implement:

  • A simple delegation model for a team to manage all permission sets for a set of accounts.
  • A delegation model for support teams to apply read-only permission sets to all accounts.
  • A delegation model based on AWS Organizations, where a team can manage only the permission sets intended for a specific OU.

The AWS SSO delegation model enables three key conditions for restricting user access:

  • Permission sets.
  • Accounts
  • Tags that use the condition aws:ResourceTag, to ensure that tags are present on your permission sets as part of your delegation model.

In the rest of this blog post, I show you how AWS SSO administrators can use these conditions to implement the use cases highlighted here to build a delegation model.

See Delegating permission set administration and Actions, resources, and condition keys for AWS SSO for more information.

Important: The use cases that follow are examples that can be adopted by your organization. The permission sets in these use cases show only what is needed to delegate the components discussed. You need to add additional policies to give users and groups access to AWS SSO.

Some examples:

Identify your permission set and AWS SSO instance IDs

You can use either the AWS Command Line Interface (AWS CLI) v2 or the AWS Management Console to identify your permission set and AWS SSO instance IDs.

Use the AWS CLI

To use the AWS CLI to identify the Amazon resource names (ARNs) of the AWS SSO instance and permission set, make sure you have AWS CLI v2 installed.

To list the AWS SSO instance ID ARN

Run the following command:

aws sso-admin list-instances

To list the permission set ARN

Run the following command:

aws sso-admin list-permission-sets --instance-arn <instance arn from above>

Use the console

You can also use the console to identify your permission sets and AWS SSO instance IDs.

To list the AWS SSO Instance ID ARN

  1. Navigate to the AWS SSO in your Region. Choose the Dashboard and then choose Choose your identity source.
  2. Copy the AWS SSO ARN ID.
Figure 2: AWS SSO ID ARN

Figure 2: AWS SSO ID ARN

To list the permission set ARN

  1. Navigate to the AWS SSO Service in your Region. Choose AWS Accounts and then Permission Sets.
  2. Select the permission set you want to use.
  3. Copy the ARN of the permission set.
Figure 3: Permission set ARN

Figure 3: Permission set ARN

Use case 1: Accounts-based delegation model

In this use case, you create a single policy to allow administrators to assign any permission set to a specific set of accounts.

First, you need to create a custom permission set to use with the following example policy.

The example policy is as follows.

            "Sid": "DelegatedAdminsAccounts",
            "Effect": "Allow",
            "Action": [
                "sso:ProvisionPermissionSet",
                "sso:CreateAccountAssignment",
                "sso:DeleteInlinePolicyFromPermissionSet",
                "sso:UpdateInstanceAccessControlAttributeConfiguration",
                "sso:PutInlinePolicyToPermissionSet",
                "sso:DeleteAccountAssignment",
                "sso:DetachManagedPolicyFromPermissionSet",
                "sso:DeletePermissionSet",
                "sso:AttachManagedPolicyToPermissionSet",
                "sso:CreatePermissionSet",
                "sso:UpdatePermissionSet",
                "sso:CreateInstanceAccessControlAttributeConfiguration",
                "sso:DeleteInstanceAccessControlAttributeConfiguration"
            ],
            "Resource": [
                "arn:aws:sso:::account/112233445566",
                "arn:aws:sso:::account/223344556677",
                "arn:aws:sso:::account/334455667788"
            ]
        }

This policy specifies that delegated admins are allowed to provision any permission set to the three accounts listed in the policy.

Note: To apply this permission set to your environment, replace the account numbers following Resource with your account numbers.

Use case 2: Permission-based delegation model

In this use case, you create a single policy to allow administrators to assign a specific permission set to any account. The policy is as follows.

{
                    "Sid": "DelegatedPermissionsAdmin",
                    "Effect": "Allow",
                    "Action": [
                        "sso:ProvisionPermissionSet",
                        "sso:CreateAccountAssignment",
                        "sso:DeleteInlinePolicyFromPermissionSet",
                        "sso:UpdateInstanceAccessControlAttributeConfiguration",
                        "sso:PutInlinePolicyToPermissionSet",
                        "sso:DeleteAccountAssignment",
                        "sso:DetachManagedPolicyFromPermissionSet",
                        "sso:DeletePermissionSet",
                        "sso:AttachManagedPolicyToPermissionSet",
                        "sso:CreatePermissionSet",
                        "sso:UpdatePermissionSet",
                        "sso:CreateInstanceAccessControlAttributeConfiguration",
                        "sso:DeleteInstanceAccessControlAttributeConfiguration",
                        "sso:ProvisionApplicationInstanceForAWSAccount"
                    ],
                    "Resource": [
                        "arn:aws:sso:::instance/ssoins-1111111111",
                        "arn:aws:sso:::account/*",
                        "arn:aws:sso:::permissionSet/ssoins-1111111111/ps-112233abcdef123"

            ]


        },          

This policy specifies that delegated admins are allowed to provision only the specific permission set listed in the policy to any account.

Note:

Use case 3: OU-based delegation model

In this use case, the Identity team wants to delegate the management of the Development permission sets (identified by the tag key Environment) to the Test OU (identified by the tag key OU). You use the Environment and OU tags on permission sets to restrict access to only the permission sets that contain both tags.

To build this permission set for delegation, you need to create two policies in the same permission set:

  • A policy that filters the permission sets based on both tags—Environment and OU.
  • A policy that filters the accounts belonging to the Development OU.

The policies are as follows.

{
                    "Sid": "DelegatedOUAdmin",
                    "Effect": "Allow",
                    "Action": [
                        "sso:ProvisionPermissionSet",
                        "sso:CreateAccountAssignment",
                        "sso:DeleteInlinePolicyFromPermissionSet",
                        "sso:UpdateInstanceAccessControlAttributeConfiguration",
                        "sso:PutInlinePolicyToPermissionSet",
                        "sso:DeleteAccountAssignment",
                        "sso:DetachManagedPolicyFromPermissionSet",
                        "sso:DeletePermissionSet",
                        "sso:AttachManagedPolicyToPermissionSet",
                        "sso:CreatePermissionSet",
                        "sso:UpdatePermissionSet",
                        "sso:CreateInstanceAccessControlAttributeConfiguration",
                        "sso:DeleteInstanceAccessControlAttributeConfiguration",
                        "sso:ProvisionApplicationInstanceForAWSAccount"
                    ],
                    "Resource": "arn:aws:sso:::permissionSet/*/*",
                    "Condition": {
                        "StringEquals": {
                            "aws:ResourceTag/Environment": "Development",
                            "aws:ResourceTag/OU": "Test"
                        }
                    }
        },
        {
            "Sid": "Instance",
            "Effect": "Allow",
            "Action": [
                "sso:ProvisionPermissionSet",
                "sso:CreateAccountAssignment",
                "sso:DeleteInlinePolicyFromPermissionSet",
                "sso:UpdateInstanceAccessControlAttributeConfiguration",
                "sso:PutInlinePolicyToPermissionSet",
                "sso:DeleteAccountAssignment",
                "sso:DetachManagedPolicyFromPermissionSet",
                "sso:DeletePermissionSet",
                "sso:AttachManagedPolicyToPermissionSet",
                "sso:CreatePermissionSet",
                "sso:UpdatePermissionSet",
                "sso:CreateInstanceAccessControlAttributeConfiguration",
                "sso:DeleteInstanceAccessControlAttributeConfiguration",
                "sso:ProvisionApplicationInstanceForAWSAccount"
            ],
            "Resource": [
                "arn:aws:sso:::instance/ssoins-82593a6ed92c8920",
                "arn:aws:sso:::account/112233445566",
                "arn:aws:sso:::account/223344556677",
                "arn:aws:sso:::account/334455667788"

            ]
        }

In the delegated policy, the user or group is only allowed to provision permission sets that have both tags, OU and Environment, set to “Development” and only to accounts in the Development OU.

Note: In the example above arn:aws:sso:::instance/ssoins-11112222233333 is the ARN for the AWS SSO Instance ID. To get your AWS SSO Instance ID, refer to Identify your permission set and AWS SSO Instance IDs.

Create a delegated admin profile in AWS SSO

Now that you know what’s required to delegate permissions, you can create a delegated profile and deploy that to your users and groups.

To create a delegated AWS SSO profile

  1. In the AWS SSO console, sign in to your management account and browse to the Region where AWS SSO is provisioned.
  2. Navigate to AWS Accounts and choose Permission sets, and then choose Create permission set.
     
    Figure 4: AWS SSO permission sets menu

    Figure 4: AWS SSO permission sets menu

  3. Choose Create a custom permission set.
     
    Figure 5: Create a new permission set

    Figure 5: Create a new permission set

  4. Give a name to your permission set based on your naming standards and select a session duration from your organizational policies.
  5. For Relay state, enter the following URL:
    https://<region>.console.aws.amazon.com/singlesignon/home?region=<region>#/accounts/organization 
    

    where <region> is the AWS Region in which you deployed AWS SSO.

    The relay state will automatically redirect the user to the Accounts section in the AWS SSO console, for simplicity.
     

    Figure 6: Custom permission set

    Figure 6: Custom permission set

  6. Choose Create new permission set. Here is where you can decide the level of delegation required for your application or domain administrators.
     
    Figure 7: Assign users

    Figure 7: Assign users

    See some of the examples in the earlier sections of this post for the permission set.

  7. If you’re using AWS SSO with AWS Directory Service for Microsoft Active Directory, you’ll need to provide access to AWS Directory Service in order for your administrator to assign permission sets to users and groups.

    To provide this access, navigate to the AWS Accounts screen in the AWS SSO console, and select your management account. Assign the required users or groups, and select the permission set that you created earlier. Then choose Finish.

  8. To test this delegation, sign in to AWS SSO. You’ll see the newly created permission set.
     
    Figure 8: AWS SSO sign-on page

    Figure 8: AWS SSO sign-on page

  9. Next to developer-delegated-admin, choose Management console. This should automatically redirect you to AWS SSO in the AWS Accounts submenu.

If you try to provision access by assigning or creating new permission sets to accounts or permission sets you are not explicitly allowed, according to the policies you specified earlier, you will receive the following error.
 

Figure 9: Error based on lack of permissions

Figure 9: Error based on lack of permissions

Otherwise, the provisioning will be successful.

Summary

You’ve seen that by using conditions and tags on permission sets, application and account owners can use delegation models to manage the deployment of permission sets across the accounts they manage, providing them and their teams with secure access to AWS accounts and services.

Additionally, because AWS SSO supports attribute-based access control (ABAC), you can create a more dynamic delegation model based on attributes from your identity provider, to match the tags on the permission set.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Single Sign-On forum.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Louay Shaat

Louay is a Security Solutions Architect with AWS. He spends his days working with customers, from startups to the largest of enterprises, helping them build cool new capabilities and accelerating their cloud journey. He has a strong focus on security and automation to help customers improve their security, risk, and compliance in the cloud.

Rely on employee attributes from your corporate directory to create fine-grained permissions in AWS

Post Syndicated from Sulay Shah original https://aws.amazon.com/blogs/security/rely-employee-attributes-from-corporate-directory-create-fine-grained-permissions-aws/

In my earlier post Simplify granting access to your AWS resources by using tags on AWS IAM users and roles, I explained how to implement attribute-based access control (ABAC) in AWS to simplify permissions management at scale. In that scenario, I talked about relying on attributes on your IAM users and roles for access control in AWS. But more often, customers manage workforce user identities with an identity provider (IdP) and want to use identity attributes from their IdP for fine-grained permissions in AWS. In this post I introduce a new capability that enables you to do just that.

In AWS, you can configure your IdP to allow workforce users federated access to AWS resources using credentials from your corporate directory. Along with user credentials, your directory also stores user attributes such as cost center, department, and email address. Now you can configure your IdP to pass in user attributes as tags in federated AWS sessions. These are called session tags. You can then control access to AWS resources based on these session tags. Moreover, when user attributes change or new users are added to your directory, permissions automatically apply based on these attributes. For example, developers can federate into AWS using an IAM role, but can only access resources specific to their project. This is because you define permissions that require the project attribute from their IdP to match the project tag on AWS resources. Additionally, AWS logs these attributes in AWS CloudTrail and enable security administrators to track the user identity for a given role session.

In this post, I introduce session tags and walk you through an example of how to use session tags for ABAC and tracking user activity.

What are session tags?

Session tags are attributes passed in the AWS session. You can use session tags for access control in IAM policies and for monitoring. These tags are not stored in AWS and are valid only for the duration of the session. You define session tags just like tags in AWS—consisting of a customer-defined key and an optional value.

How to pass session tags in the AWS session?

One of the most widely used mechanisms for requesting a session in AWS is by assuming an IAM role. For user identities stored in an external directory, you can configure your SAML IdP in IAM to allow your users federated access to AWS using IAM roles. To understand how to set up SAML federation using an IdP, read AWS Federated Authentication with Active Directory Federation Services (ADFS). If you’re using IAM users, you can also request a session in AWS using AssumeRole and GetFederationToken APIs or using AssumeRoleWithWebIdentity API for applications that require access to AWS resources.

For session tags, you can use all of the above-mentioned APIs to pass tags into your AWS session based on your use case. For details on how to use these APIs to pass session tags, please visit Tags in AWS Sessions.

What permissions do I need to use session tags?

To perform any action in AWS, developers need permissions. For example, to assume a role, your developers need sts:AssumeRole permission. Similarly with session tags, we’re introducing a new action, sts:TagSession, that is required to pass session tags in the session. Additionally, you can require and control session tags using existing AWS conditions:

Action Use Case Where to add
sts:TagSession Required to pass attributes as session tags when using AssumeRole, AssumRoleWithSAML, AssumeRoleWithWebIdentity, or GetFederatioToken API Role’s trust policy or IAM user’s permissions policy based on the API you are using to pass session tags.
Condition Key Use Case Actions that supports the condition key
aws:RequestTag Use this condition to require specific tags in the session. sts:TagSession
aws:TagKeys Use this condition key to control the tag keys that are allowed in the session. sts:TagSession
aws:PrincipalTag* Use this condition in IAM policies to compare tags on AWS resources. AWS Global Condition Keys (all actions across all services support this condition key)

Note: The table above explains only the additional use cases that the keys now support. Support for existing use cases, such as IAM users and roles remains unchanged. For details please visit AWS Global Condition Keys.

Now, I’ll show you how to create fine-grained permissions based on user attributes from your directory and how permissions automatically apply based on attributes when employees switch projects within your organization.

Example: Grant employees access to their project resources in AWS based on their job function

Consider a scenario where your organization deployed AWS EC2 and RDS instances in your AWS account for your company’s production web applications. Your systems engineers manage the EC2 instances and database engineers manage the RDS instances. They both access AWS by federating into your AWS account from a SAML IdP. Your organization’s security policy requires employees to have access to manage only the resources related to their job function and project they work on.

To meet these requirements, your cloud administrator, Michelle, implements attribute-based access control (ABAC) using the jobfunction and project attributes as session tags by following three steps:

  1. Michelle tags all existing EC2 and RDS instances with the corresponding project attribute.
  2. She creates a MyProjectResources IAM role and an IAM permission policy for this role such that employees can access resources with their jobfunction and project tags.
  3. She then configures your SAML IdP to pass the jobfunction and project attributes in the federated session when employees federate into AWS using the MyProjectResources role.

Let’s have a look at these steps in detail.

Step 1: Tag all the project resources

Michelle tags all the project resources with the appropriate project tag. This is important since she wants to create permission rules based on this tag to implement ABAC. To learn how to tag resources in EC2 and RDS, read tagging your Amazon EC2 resources and tagging Amazon RDS resources.

Step 2: Create an IAM role with permissions based on attributes

Next, Michelle creates an IAM role called MyProjectResources using the AWS Management Console or CLI. This is the role that your systems engineers and database engineers will assume when they federate into AWS to access and manage the EC2 and RDS instances respectively. To grant this role permissions, Michelle creates the following IAM policy and attaches it to the MyProjectResources role.

IAM Permissions Policy


{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Action": "rds:DescribeDBInstances",
			"Resource": "*"
		},
		{
			"Effect": "Allow",
			"Action": [
				"rds:RebootDBInstance",
				"rds:StartDBInstance",
				"rds:StopDBInstance"
			],
			"Resource": "*",
			"Condition": {
				"StringEquals": {
					"aws:PrincipalTag/jobfunction": "DatabaseEngineer",
					"rds:db-tag/project": "${aws:PrincipalTag/project}"
				}
			}
		},
		{
			"Effect": "Allow",
			"Action": "ec2:DescribeInstances",
			"Resource": "*"
		},
		{
			"Effect": "Allow",
			"Action": [
				"ec2:StartInstances",
				"ec2:StopInstances",
				"ec2:RebootInstances",
				"ec2:TerminateInstances"
			],
			"Resource": "*",
			"Condition": {
				"StringEquals": {
					"aws:PrincipalTag/jobfunction": "SystemsEngineer",
					"ec2:ResourceTag/project": "${aws:PrincipalTag/project}"
				}
			}
		}
	]
}

In the policy above, Michelle allows specific actions related to EC2 and RDS that the systems engineers and database engineers need to manage their project instances. In the condition element of the policy statements, Michelle adds a condition based on the jobfunction and project attributes to ensure engineers can access only the instances which belong to their jobfunction and have a matching project tag.

To ensure your systems engineers and database engineers can assume this role when they federate into AWS from your IdP, Michelle modifies the role’s trust policy to trust your SAML IdP as shown in the policy statement below. Since we also want to include session tags when engineers federate in, Michelle adds the new action sts:TagSession in the policy statement as shown below. She also adds a condition that requires the jobfunction and project attributes to be included as session tags when engineers assume this role.

Role Trust Policy


{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Principal": {
				"Federated": "arn:aws:iam::999999999999:saml-provider/ExampleCorpProvider"
			},
			"Action": [
				"sts:AssumeRoleWithSAML",
				"sts:TagSession"
			],
			"Condition": {
				"StringEquals": {
					"SAML:aud": "https://signin.aws.amazon.com/saml"
				},
				"StringLike": {
					"aws:RequestTag/project": "*",
					"aws:RequestTag/jobfunction": [
						"SystemsEngineer",
						"DatabaseEngineer"
					]
				}
			}
		}
	]
}

Step 3: Configuring your SAML IdP to pass the jobfunction and project attributes as session tags

Once Michelle creates the role and permissions policy in AWS, she configures her SAML IdP to include the jobfunction and project attributes as session tags in the SAML assertion when engineers federate into AWS using this role.

To pass attributes as session tags in the federated session, the SAML assertion must contain the attributes with the following prefix:

https://aws.amazon.com/SAML/Attributes/PrincipalTag

The example given below shows a part of the SAML assertion generated from my IdP with two attributes (project:Automation and jobfunction:SystemsEngineer) that we want to pass as session tags.


<Attribute Name="https://aws.amazon.com/SAML/Attributes/PrincipalTag:project">
		< AttributeValue >Automation<AttributeValue>
</ Attribute>
<Attribute Name="https://aws.amazon.com/SAML/Attributes/PrincipalTag:jobfunction">
		< AttributeValue >SystemsEngineer<AttributeValue>
</ Attribute>

Note: This sample only contains the new properties in the SAML assertion. There are additional required fields in the SAML assertion that must be present to successfully federate into AWS. To learn more about creating SAML assertions with session tags, visit configuring SAML assertions for the authentication response.

AWS identity partners such as Ping Identity, OneLogin, Auth0, ForgeRock, IBM, Okta, and RSA have validated the end-to-end experience for this new capability with their identity solutions, and we look forward to additional partners validating this capability. To learn more about how to use these identity providers for configuring session tags, please visit integrating third-party SAML solution providers with AWS. If you are using Active Directory Federation Services (ADFS) for SAML federation with AWS, then please visit Configuring ADFS to start using session tags for attribute-based access control.

Now, when your systems engineers and database engineers federate into AWS using the MyProjectResources role, they only get access to their project resources based on the project and jobfunction attributes passed in their federated session. Session tags enabled Michelle to define unique permissions based on user attributes without having to create and manage multiple roles and policies. This helps simplify permissions management in her company.

Permissions automatically apply when employees change projects

Consider the same example with a scenario where your systems engineer, Bob, switches from the automation project to the integration project. Due to this switch, Michelle sets Bob’s project attribute in the IdP to integration. Now, the next time Bob federates into AWS he automatically has access to resources in integration project. Using session tags, permissions automatically apply when you update attributes or create new AWS resources with appropriate attributes without requiring any permissions updates in AWS.

Track user identity using session tags

When developers federate into AWS with session tags, AWS CloudTrail logs these tags to make it easier for security administrators to track the user identity of the session. To view session tags in CloudTrail, your administrator Michelle looks for the AssumeRoleWithSAML event in the eventName filter of CloudTrail. In the example below, Michelle has configured the SAML IdP to pass three session tags: project, jobfunction, and userID. When developers federate into your account, Michelle views the AssumeRoleWithSAML event in CloudTrail to track the user identity of the session using the session tags project, jobfunction, and userID as shown below:
 

Figure 1: Search for the logged events

Figure 1: Search for the logged events

Note: You can use session tags in conjunction with the instructions to track account activity to its origin using AWS CloudTrail to trace the identity of the session.

Summary

You can use session tags to rely on your employee attributes from your corporate directory to create fine-grained permissions at scale in AWS to simplify your permissions management workflows. To learn more about session tags, please visit tags in AWS session.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the Amazon IAM forum.

Want more AWS Security news? Follow us on Twitter.

Sulay Shah

Sulay is a Senior Product Manager for Identity and Access Management service at AWS. He strongly believes in the customer first approach and is always looking for new opportunities to assist customers. Outside of work, Sulay enjoys playing soccer and watching movies. Sulay holds a master’s degree in computer science from the North Carolina State University.

Rotate Amazon RDS database credentials automatically with AWS Secrets Manager

Post Syndicated from Apurv Awasthi original https://aws.amazon.com/blogs/security/rotate-amazon-rds-database-credentials-automatically-with-aws-secrets-manager/

Recently, we launched AWS Secrets Manager, a service that makes it easier to rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle. You can configure Secrets Manager to rotate secrets automatically, which can help you meet your security and compliance needs. Secrets Manager offers built-in integrations for MySQL, PostgreSQL, and Amazon Aurora on Amazon RDS, and can rotate credentials for these databases natively. You can control access to your secrets by using fine-grained AWS Identity and Access Management (IAM) policies. To retrieve secrets, employees replace plaintext secrets with a call to Secrets Manager APIs, eliminating the need to hard-code secrets in source code or update configuration files and redeploy code when secrets are rotated.

In this post, I introduce the key features of Secrets Manager. I then show you how to store a database credential for a MySQL database hosted on Amazon RDS and how your applications can access this secret. Finally, I show you how to configure Secrets Manager to rotate this secret automatically.

Key features of Secrets Manager

These features include the ability to:

  • Rotate secrets safely. You can configure Secrets Manager to rotate secrets automatically without disrupting your applications. Secrets Manager offers built-in integrations for rotating credentials for Amazon RDS databases for MySQL, PostgreSQL, and Amazon Aurora. You can extend Secrets Manager to meet your custom rotation requirements by creating an AWS Lambda function to rotate other types of secrets. For example, you can create an AWS Lambda function to rotate OAuth tokens used in a mobile application. Users and applications retrieve the secret from Secrets Manager, eliminating the need to email secrets to developers or update and redeploy applications after AWS Secrets Manager rotates a secret.
  • Secure and manage secrets centrally. You can store, view, and manage all your secrets. By default, Secrets Manager encrypts these secrets with encryption keys that you own and control. Using fine-grained IAM policies, you can control access to secrets. For example, you can require developers to provide a second factor of authentication when they attempt to retrieve a production database credential. You can also tag secrets to help you discover, organize, and control access to secrets used throughout your organization.
  • Monitor and audit easily. Secrets Manager integrates with AWS logging and monitoring services to enable you to meet your security and compliance requirements. For example, you can audit AWS CloudTrail logs to see when Secrets Manager rotated a secret or configure AWS CloudWatch Events to alert you when an administrator deletes a secret.
  • Pay as you go. Pay for the secrets you store in Secrets Manager and for the use of these secrets; there are no long-term contracts or licensing fees.

Get started with Secrets Manager

Now that you’re familiar with the key features, I’ll show you how to store the credential for a MySQL database hosted on Amazon RDS. To demonstrate how to retrieve and use the secret, I use a python application running on Amazon EC2 that requires this database credential to access the MySQL instance. Finally, I show how to configure Secrets Manager to rotate this database credential automatically. Let’s get started.

Phase 1: Store a secret in Secrets Manager

  1. Open the Secrets Manager console and select Store a new secret.
     
    Secrets Manager console interface
     
  2. I select Credentials for RDS database because I’m storing credentials for a MySQL database hosted on Amazon RDS. For this example, I store the credentials for the database superuser. I start by securing the superuser because it’s the most powerful database credential and has full access over the database.
     
    Store a new secret interface with Credentials for RDS database selected
     

    Note: For this example, you need permissions to store secrets in Secrets Manager. To grant these permissions, you can use the AWSSecretsManagerReadWriteAccess managed policy. Read the AWS Secrets Manager Documentation for more information about the minimum IAM permissions required to store a secret.

  3. Next, I review the encryption setting and choose to use the default encryption settings. Secrets Manager will encrypt this secret using the Secrets Manager DefaultEncryptionKeyDefaultEncryptionKey in this account. Alternatively, I can choose to encrypt using a customer master key (CMK) that I have stored in AWS KMS.
     
    Select the encryption key interface
     
  4. Next, I view the list of Amazon RDS instances in my account and select the database this credential accesses. For this example, I select the DB instance mysql-rds-database, and then I select Next.
     
    Select the RDS database interface
     
  5. In this step, I specify values for Secret Name and Description. For this example, I use Applications/MyApp/MySQL-RDS-Database as the name and enter a description of this secret, and then select Next.
     
    Secret Name and description interface
     
  6. For the next step, I keep the default setting Disable automatic rotation because my secret is used by my application running on Amazon EC2. I’ll enable rotation after I’ve updated my application (see Phase 2 below) to use Secrets Manager APIs to retrieve secrets. I then select Next.

    Note: If you’re storing a secret that you’re not using in your application, select Enable automatic rotation. See our AWS Secrets Manager getting started guide on rotation for details.

     
    Configure automatic rotation interface
     

  7. Review the information on the next screen and, if everything looks correct, select Store. We’ve now successfully stored a secret in Secrets Manager.
  8. Next, I select See sample code.
     
    The See sample code button
     
  9. Take note of the code samples provided. I will use this code to update my application to retrieve the secret using Secrets Manager APIs.
     
    Python sample code
     

Phase 2: Update an application to retrieve secret from Secrets Manager

Now that I have stored the secret in Secrets Manager, I update my application to retrieve the database credential from Secrets Manager instead of hard coding this information in a configuration file or source code. For this example, I show how to configure a python application to retrieve this secret from Secrets Manager.

  1. I connect to my Amazon EC2 instance via Secure Shell (SSH).
  2. Previously, I configured my application to retrieve the database user name and password from the configuration file. Below is the source code for my application.
    import MySQLdb
    import config

    def no_secrets_manager_sample()

    # Get the user name, password, and database connection information from a config file.
    database = config.database
    user_name = config.user_name
    password = config.password

    # Use the user name, password, and database connection information to connect to the database
    db = MySQLdb.connect(database.endpoint, user_name, password, database.db_name, database.port)

  3. I use the sample code from Phase 1 above and update my application to retrieve the user name and password from Secrets Manager. This code sets up the client and retrieves and decrypts the secret Applications/MyApp/MySQL-RDS-Database. I’ve added comments to the code to make the code easier to understand.
    # Use the code snippet provided by Secrets Manager.
    import boto3
    from botocore.exceptions import ClientError

    def get_secret():
    #Define the secret you want to retrieve
    secret_name = "Applications/MyApp/MySQL-RDS-Database"
    #Define the Secrets mManager end-point your code should use.
    endpoint_url = "https://secretsmanager.us-east-1.amazonaws.com"
    region_name = "us-east-1"

    #Setup the client
    session = boto3.session.Session()
    client = session.client(
    service_name='secretsmanager',
    region_name=region_name,
    endpoint_url=endpoint_url
    )

    #Use the client to retrieve the secret
    try:
    get_secret_value_response = client.get_secret_value(
    SecretId=secret_name
    )
    #Error handling to make it easier for your code to tolerate faults
    except ClientError as e:
    if e.response['Error']['Code'] == 'ResourceNotFoundException':
    print("The requested secret " + secret_name + " was not found")
    elif e.response['Error']['Code'] == 'InvalidRequestException':
    print("The request was invalid due to:", e)
    elif e.response['Error']['Code'] == 'InvalidParameterException':
    print("The request had invalid params:", e)
    else:
    # Decrypted secret using the associated KMS CMK
    # Depending on whether the secret was a string or binary, one of these fields will be populated
    if 'SecretString' in get_secret_value_response:
    secret = get_secret_value_response['SecretString']
    else:
    binary_secret_data = get_secret_value_response['SecretBinary']

    # Your code goes here.

  4. Applications require permissions to access Secrets Manager. My application runs on Amazon EC2 and uses an IAM role to obtain access to AWS services. I will attach the following policy to my IAM role. This policy uses the GetSecretValue action to grant my application permissions to read secret from Secrets Manager. This policy also uses the resource element to limit my application to read only the Applications/MyApp/MySQL-RDS-Database secret from Secrets Manager. You can visit the AWS Secrets Manager Documentation to understand the minimum IAM permissions required to retrieve a secret.
    {
    "Version": "2012-10-17",
    "Statement": {
    "Sid": "RetrieveDbCredentialFromSecretsManager",
    "Effect": "Allow",
    "Action": "secretsmanager:GetSecretValue",
    "Resource": "arn:aws:secretsmanager:::secret:Applications/MyApp/MySQL-RDS-Database"
    }
    }

Phase 3: Enable Rotation for Your Secret

Rotating secrets periodically is a security best practice because it reduces the risk of misuse of secrets. Secrets Manager makes it easy to follow this security best practice and offers built-in integrations for rotating credentials for MySQL, PostgreSQL, and Amazon Aurora databases hosted on Amazon RDS. When you enable rotation, Secrets Manager creates a Lambda function and attaches an IAM role to this function to execute rotations on a schedule you define.

Note: Configuring rotation is a privileged action that requires several IAM permissions and you should only grant this access to trusted individuals. To grant these permissions, you can use the AWS IAMFullAccess managed policy.

Next, I show you how to configure Secrets Manager to rotate the secret Applications/MyApp/MySQL-RDS-Database automatically.

  1. From the Secrets Manager console, I go to the list of secrets and choose the secret I created in the first step Applications/MyApp/MySQL-RDS-Database.
     
    List of secrets in the Secrets Manager console
     
  2. I scroll to Rotation configuration, and then select Edit rotation.
     
    Rotation configuration interface
     
  3. To enable rotation, I select Enable automatic rotation. I then choose how frequently I want Secrets Manager to rotate this secret. For this example, I set the rotation interval to 60 days.
     
    Edit rotation configuration interface
     
  4. Next, Secrets Manager requires permissions to rotate this secret on your behalf. Because I’m storing the superuser database credential, Secrets Manager can use this credential to perform rotations. Therefore, I select Use the secret that I provided in step 1, and then select Next.
     
    Select which secret to use in the Edit rotation configuration interface
     
  5. The banner on the next screen confirms that I have successfully configured rotation and the first rotation is in progress, which enables you to verify that rotation is functioning as expected. Secrets Manager will rotate this credential automatically every 60 days.
     
    Confirmation banner message
     

Summary

I introduced AWS Secrets Manager, explained the key benefits, and showed you how to help meet your compliance requirements by configuring AWS Secrets Manager to rotate database credentials automatically on your behalf. Secrets Manager helps you protect access to your applications, services, and IT resources without the upfront investment and on-going maintenance costs of operating your own secrets management infrastructure. To get started, visit the Secrets Manager console. To learn more, visit Secrets Manager documentation.

If you have comments about this post, submit them in the Comments section below. If you have questions about anything in this post, start a new thread on the Secrets Manager forum.

Want more AWS Security news? Follow us on Twitter.

Federate Database User Authentication Easily with IAM and Amazon Redshift

Post Syndicated from Thiyagarajan Arumugam original https://aws.amazon.com/blogs/big-data/federate-database-user-authentication-easily-with-iam-and-amazon-redshift/

Managing database users though federation allows you to manage authentication and authorization procedures centrally. Amazon Redshift now supports database authentication with IAM, enabling user authentication though enterprise federation. No need to manage separate database users and passwords to further ease the database administration. You can now manage users outside of AWS and authenticate them for access to an Amazon Redshift data warehouse. Do this by integrating IAM authentication and a third-party SAML-2.0 identity provider (IdP), such as AD FS, PingFederate, or Okta. In addition, database users can also be automatically created at their first login based on corporate permissions.

In this post, I demonstrate how you can extend the federation to enable single sign-on (SSO) to the Amazon Redshift data warehouse.

SAML and Amazon Redshift

AWS supports Security Assertion Markup Language (SAML) 2.0, which is an open standard for identity federation used by many IdPs. SAML enables federated SSO, which enables your users to sign in to the AWS Management Console. Users can also make programmatic calls to AWS API actions by using assertions from a SAML-compliant IdP. For example, if you use Microsoft Active Directory for corporate directories, you may be familiar with how Active Directory and AD FS work together to enable federation. For more information, see the Enabling Federation to AWS Using Windows Active Directory, AD FS, and SAML 2.0 AWS Security Blog post.

Amazon Redshift now provides the GetClusterCredentials API operation that allows you to generate temporary database user credentials for authentication. You can set up an IAM permissions policy that generates these credentials for connecting to Amazon Redshift. Extending the IAM authentication, you can configure the federation of AWS access though a SAML 2.0–compliant IdP. An IAM role can be configured to permit the federated users call the GetClusterCredentials action and generate temporary credentials to log in to Amazon Redshift databases. You can also set up policies to restrict access to Amazon Redshift clusters, databases, database user names, and user group.

Amazon Redshift federation workflow

In this post, I demonstrate how you can use a JDBC– or ODBC-based SQL client to log in to the Amazon Redshift cluster using this feature. The SQL clients used with Amazon Redshift JDBC or ODBC drivers automatically manage the process of calling the GetClusterCredentials action, retrieving the database user credentials, and establishing a connection to your Amazon Redshift database. You can also use your database application to programmatically call the GetClusterCredentials action, retrieve database user credentials, and connect to the database. I demonstrate these features using an example company to show how different database users accounts can be managed easily using federation.

The following diagram shows how the SSO process works:

  1. JDBC/ODBC
  2. Authenticate using Corp Username/Password
  3. IdP sends SAML assertion
  4. Call STS to assume role with SAML
  5. STS Returns Temp Credentials
  6. Use Temp Credentials to get Temp cluster credentials
  7. Connect to Amazon Redshift using temp credentials

Walkthrough

Example Corp. is using Active Directory (idp host:demo.examplecorp.com) to manage federated access for users in its organization. It has an AWS account: 123456789012 and currently manages an Amazon Redshift cluster with the cluster ID “examplecorp-dw”, database “analytics” in us-west-2 region for its Sales and Data Science teams. It wants the following access:

  • Sales users can access the examplecorp-dw cluster using the sales_grp database group
  • Sales users access examplecorp-dw through a JDBC-based SQL client
  • Sales users access examplecorp-dw through an ODBC connection, for their reporting tools
  • Data Science users access the examplecorp-dw cluster using the data_science_grp database group.
  • Partners access the examplecorp-dw cluster and query using the partner_grp database group.
  • Partners are not federated through Active Directory and are provided with separate IAM user credentials (with IAM user name examplecorpsalespartner).
  • Partners can connect to the examplecorp-dw cluster programmatically, using language such as Python.
  • All users are automatically created in Amazon Redshift when they log in for the first time.
  • (Optional) Internal users do not specify database user or group information in their connection string. It is automatically assigned.
  • Data warehouse users can use SSO for the Amazon Redshift data warehouse using the preceding permissions.

Step 1:  Set up IdPs and federation

The Enabling Federation to AWS Using Windows Active Directory post demonstrated how to prepare Active Directory and enable federation to AWS. Using those instructions, you can establish trust between your AWS account and the IdP and enable user access to AWS using SSO.  For more information, see Identity Providers and Federation.

For this walkthrough, assume that this company has already configured SSO to their AWS account: 123456789012 for their Active Directory domain demo.examplecorp.com. The Sales and Data Science teams are not required to specify database user and group information in the connection string. The connection string can be configured by adding SAML Attribute elements to your IdP. Configuring these optional attributes enables internal users to conveniently avoid providing the DbUser and DbGroup parameters when they log in to Amazon Redshift.

The user-name attribute can be set up as follows, with a user ID (for example, nancy) or an email address (for example. [email protected]):

<Attribute Name="https://redshift.amazon.com/SAML/Attributes/DbUser">  
  <AttributeValue>user-name</AttributeValue>
</Attribute>

The AutoCreate attribute can be defined as follows:

<Attribute Name="https://redshift.amazon.com/SAML/Attributes/AutoCreate">
    <AttributeValue>true</AttributeValue>
</Attribute>

The sales_grp database group can be included as follows:

<Attribute Name="https://redshift.amazon.com/SAML/Attributes/DbGroups">
    <AttributeValue>sales_grp</AttributeValue>
</Attribute>

For more information about attribute element configuration, see Configure SAML Assertions for Your IdP.

Step 2: Create IAM roles for access to the Amazon Redshift cluster

The next step is to create IAM policies with permissions to call GetClusterCredentials and provide authorization for Amazon Redshift resources. To grant a SQL client the ability to retrieve the cluster endpoint, region, and port automatically, include the redshift:DescribeClusters action with the Amazon Redshift cluster resource in the IAM role.  For example, users can connect to the Amazon Redshift cluster using a JDBC URL without the need to hardcode the Amazon Redshift endpoint:

Previous:  jdbc:redshift://endpoint:port/database

Current:  jdbc:redshift:iam://clustername:region/dbname

Use IAM to create the following policies. You can also use an existing user or role and assign these policies. For example, if you already created an IAM role for IdP access, you can attach the necessary policies to that role. Here is the policy created for sales users for this example:

Sales_DW_IAM_Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "redshift:DescribeClusters"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:GetClusterCredentials"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw",
                "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:userid": "AIDIODR4TAW7CSEXAMPLE:${redshift:DbUser}@examplecorp.com"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:CreateClusterUser"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:JoinGroup"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:dbgroup:examplecorp-dw/sales_grp"
            ]
        }
    ]
}

The policy uses the following parameter values:

  • Region: us-west-2
  • AWS Account: 123456789012
  • Cluster name: examplecorp-dw
  • Database group: sales_grp
  • IAM role: AIDIODR4TAW7CSEXAMPLE
Policy Statement Description
{
"Effect":"Allow",
"Action":[
"redshift:DescribeClusters"
],
"Resource":[
"arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw"
]
}

Allow users to retrieve the cluster endpoint, region, and port automatically for the Amazon Redshift cluster examplecorp-dw. This specification uses the resource format arn:aws:redshift:region:account-id:cluster:clustername. For example, the SQL client JDBC can be specified in the format jdbc:redshift:iam://clustername:region/dbname.

For more information, see Amazon Resource Names.

{
"Effect":"Allow",
"Action":[
"redshift:GetClusterCredentials"
],
"Resource":[
"arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw",
"arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
],
"Condition":{
"StringEquals":{
"aws:userid":"AIDIODR4TAW7CSEXAMPLE:${redshift:DbUser}@examplecorp.com"
}
}
}

Generates a temporary token to authenticate into the examplecorp-dw cluster. “arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}” restricts the corporate user name to the database user name for that user. This resource is specified using the format: arn:aws:redshift:region:account-id:dbuser:clustername/dbusername.

The Condition block enforces that the AWS user ID should match “AIDIODR4TAW7CSEXAMPLE:${redshift:DbUser}@examplecorp.com”, so that individual users can authenticate only as themselves. The AIDIODR4TAW7CSEXAMPLE role has the Sales_DW_IAM_Policy policy attached.

{
"Effect":"Allow",
"Action":[
"redshift:CreateClusterUser"
],
"Resource":[
"arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
]
}
Automatically creates database users in examplecorp-dw, when they log in for the first time. Subsequent logins reuse the existing database user.
{
"Effect":"Allow",
"Action":[
"redshift:JoinGroup"
],
"Resource":[
"arn:aws:redshift:us-west-2:123456789012:dbgroup:examplecorp-dw/sales_grp"
]
}
Allows sales users to join the sales_grp database group through the resource “arn:aws:redshift:us-west-2:123456789012:dbgroup:examplecorp-dw/sales_grp” that is specified in the format arn:aws:redshift:region:account-id:dbgroup:clustername/dbgroupname.

Similar policies can be created for Data Science users with access to join the data_science_grp group in examplecorp-dw. You can now attach the Sales_DW_IAM_Policy policy to the role that is mapped to IdP application for SSO.
 For more information about how to define the claim rules, see Configuring SAML Assertions for the Authentication Response.

Because partners are not authorized using Active Directory, they are provided with IAM credentials and added to the partner_grp database group. The Partner_DW_IAM_Policy is attached to the IAM users for partners. The following policy allows partners to log in using the IAM user name as the database user name.

Partner_DW_IAM_Policy

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "redshift:DescribeClusters"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:GetClusterCredentials"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:cluster:examplecorp-dw",
                "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
            ],
            "Condition": {
                "StringEquals": {
                    "redshift:DbUser": "${aws:username}"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:CreateClusterUser"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:dbuser:examplecorp-dw/${redshift:DbUser}"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "redshift:JoinGroup"
            ],
            "Resource": [
                "arn:aws:redshift:us-west-2:123456789012:dbgroup:examplecorp-dw/partner_grp"
            ]
        }
    ]
}

redshift:DbUser“: “${aws:username}” forces an IAM user to use the IAM user name as the database user name.

With the previous steps configured, you can now establish the connection to Amazon Redshift through JDBC– or ODBC-supported clients.

Step 3: Set up database user access

Before you start connecting to Amazon Redshift using the SQL client, set up the database groups for appropriate data access. Log in to your Amazon Redshift database as superuser to create a database group, using CREATE GROUP.

Log in to examplecorp-dw/analytics as superuser and create the following groups and users:

CREATE GROUP sales_grp;
CREATE GROUP datascience_grp;
CREATE GROUP partner_grp;

Use the GRANT command to define access permissions to database objects (tables/views) for the preceding groups.

Step 4: Connect to Amazon Redshift using the JDBC SQL client

Assume that sales user “nancy” is using the SQL Workbench client and JDBC driver to log in to the Amazon Redshift data warehouse. The following steps help set up the client and establish the connection:

  1. Download the latest Amazon Redshift JDBC driver from the Configure a JDBC Connection page
  2. Build the JDBC URL with the IAM option in the following format:
    jdbc:redshift:iam://examplecorp-dw:us-west-2/sales_db

Because the redshift:DescribeClusters action is assigned to the preceding IAM roles, it automatically resolves the cluster endpoints and the port. Otherwise, you can specify the endpoint and port information in the JDBC URL, as described in Configure a JDBC Connection.

Identify the following JDBC options for providing the IAM credentials (see the “Prepare your environment” section) and configure in the SQL Workbench Connection Profile:

plugin_name=com.amazon.redshift.plugin.AdfsCredentialsProvider 
idp_host=demo.examplecorp.com (The name of the corporate identity provider host)
idp_port=443  (The port of the corporate identity provider host)
user=examplecorp\nancy(corporate user name)
password=***(corporate user password)

The SQL workbench configuration looks similar to the following screenshot:

Now, “nancy” can connect to examplecorp-dw by authenticating using the corporate Active Directory. Because the SAML attributes elements are already configured for nancy, she logs in as database user nancy and is assigned the sales_grp. Similarly, other Sales and Data Science users can connect to the examplecorp-dw cluster. A custom Amazon Redshift ODBC driver can also be used to connect using a SQL client. For more information, see Configure an ODBC Connection.

Step 5: Connecting to Amazon Redshift using JDBC SQL Client and IAM Credentials

This optional step is necessary only when you want to enable users that are not authenticated with Active Directory. Partners are provided with IAM credentials that they can use to connect to the examplecorp-dw Amazon Redshift clusters. These IAM users are attached to Partner_DW_IAM_Policy that assigns them to be assigned to the public database group in Amazon Redshift. The following JDBC URLs enable them to connect to the Amazon Redshift cluster:

jdbc:redshift:iam//examplecorp-dw/analytics?AccessKeyID=XXX&SecretAccessKey=YYY&DbUser=examplecorpsalespartner&DbGroup= partner_grp&AutoCreate=true

The AutoCreate option automatically creates a new database user the first time the partner logs in. There are several other options available to conveniently specify the IAM user credentials. For more information, see Options for providing IAM credentials.

Step 6: Connecting to Amazon Redshift using an ODBC client for Microsoft Windows

Assume that another sales user “uma” is using an ODBC-based client to log in to the Amazon Redshift data warehouse using Example Corp Active Directory. The following steps help set up the ODBC client and establish the Amazon Redshift connection in a Microsoft Windows operating system connected to your corporate network:

  1. Download and install the latest Amazon Redshift ODBC driver.
  2. Create a system DSN entry.
    1. In the Start menu, locate the driver folder or folders:
      • Amazon Redshift ODBC Driver (32-bit)
      • Amazon Redshift ODBC Driver (64-bit)
      • If you installed both drivers, you have a folder for each driver.
    2. Choose ODBC Administrator, and then type your administrator credentials.
    3. To configure the driver for all users on the computer, choose System DSN. To configure the driver for your user account only, choose User DSN.
    4. Choose Add.
  3. Select the Amazon Redshift ODBC driver, and choose Finish. Configure the following attributes:
    Data Source Name =any friendly name to identify the ODBC connection 
    Database=analytics
    user=uma(corporate user name)
    Auth Type-Identity Provider: AD FS
    password=leave blank (Windows automatically authenticates)
    Cluster ID: examplecorp-dw
    idp_host=demo.examplecorp.com (The name of the corporate IdP host)

This configuration looks like the following:

  1. Choose OK to save the ODBC connection.
  2. Verify that uma is set up with the SAML attributes, as described in the “Set up IdPs and federation” section.

The user uma can now use this ODBC connection to establish the connection to the Amazon Redshift cluster using any ODBC-based tools or reporting tools such as Tableau. Internally, uma authenticates using the Sales_DW_IAM_Policy  IAM role and is assigned the sales_grp database group.

Step 7: Connecting to Amazon Redshift using Python and IAM credentials

To enable partners, connect to the examplecorp-dw cluster programmatically, using Python on a computer such as Amazon EC2 instance. Reuse the IAM users that are attached to the Partner_DW_IAM_Policy policy defined in Step 2.

The following steps show this set up on an EC2 instance:

  1. Launch a new EC2 instance with the Partner_DW_IAM_Policy role, as described in Using an IAM Role to Grant Permissions to Applications Running on Amazon EC2 Instances. Alternatively, you can attach an existing IAM role to an EC2 instance.
  2. This example uses Python PostgreSQL Driver (PyGreSQL) to connect to your Amazon Redshift clusters. To install PyGreSQL on Amazon Linux, use the following command as the ec2-user:
    sudo easy_install pip
    sudo yum install postgresql postgresql-devel gcc python-devel
    sudo pip install PyGreSQL

  1. The following code snippet demonstrates programmatic access to Amazon Redshift for partner users:
    #!/usr/bin/env python
    """
    Usage:
    python redshift-unload-copy.py <config file> <region>
    
    * Copyright 2014, Amazon.com, Inc. or its affiliates. All Rights Reserved.
    *
    * Licensed under the Amazon Software License (the "License").
    * You may not use this file except in compliance with the License.
    * A copy of the License is located at
    *
    * http://aws.amazon.com/asl/
    *
    * or in the "license" file accompanying this file. This file is distributed
    * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either
    * express or implied. See the License for the specific language governing
    * permissions and limitations under the License.
    """
    
    import sys
    import pg
    import boto3
    
    REGION = 'us-west-2'
    CLUSTER_IDENTIFIER = 'examplecorp-dw'
    DB_NAME = 'sales_db'
    DB_USER = 'examplecorpsalespartner'
    
    options = """keepalives=1 keepalives_idle=200 keepalives_interval=200
                 keepalives_count=6"""
    
    set_timeout_stmt = "set statement_timeout = 1200000"
    
    def conn_to_rs(host, port, db, usr, pwd, opt=options, timeout=set_timeout_stmt):
        rs_conn_string = """host=%s port=%s dbname=%s user=%s password=%s
                             %s""" % (host, port, db, usr, pwd, opt)
        print "Connecting to %s:%s:%s as %s" % (host, port, db, usr)
        rs_conn = pg.connect(dbname=rs_conn_string)
        rs_conn.query(timeout)
        return rs_conn
    
    def main():
        # describe the cluster and fetch the IAM temporary credentials
        global redshift_client
        redshift_client = boto3.client('redshift', region_name=REGION)
        response_cluster_details = redshift_client.describe_clusters(ClusterIdentifier=CLUSTER_IDENTIFIER)
        response_credentials = redshift_client.get_cluster_credentials(DbUser=DB_USER,DbName=DB_NAME,ClusterIdentifier=CLUSTER_IDENTIFIER,DurationSeconds=3600)
        rs_host = response_cluster_details['Clusters'][0]['Endpoint']['Address']
        rs_port = response_cluster_details['Clusters'][0]['Endpoint']['Port']
        rs_db = DB_NAME
        rs_iam_user = response_credentials['DbUser']
        rs_iam_pwd = response_credentials['DbPassword']
        # connect to the Amazon Redshift cluster
        conn = conn_to_rs(rs_host, rs_port, rs_db, rs_iam_user,rs_iam_pwd)
        # execute a query
        result = conn.query("SELECT sysdate as dt")
        # fetch results from the query
        for dt_val in result.getresult() :
            print dt_val
        # close the Amazon Redshift connection
        conn.close()
    
    if __name__ == "__main__":
        main()

You can save this Python program in a file (redshiftscript.py) and execute it at the command line as ec2-user:

python redshiftscript.py

Now partners can connect to the Amazon Redshift cluster using the Python script, and authentication is federated through the IAM user.

Summary

In this post, I demonstrated how to use federated access using Active Directory and IAM roles to enable single sign-on to an Amazon Redshift cluster. I also showed how partners outside an organization can be managed easily using IAM credentials.  Using the GetClusterCredentials API action, now supported by Amazon Redshift, lets you manage a large number of database users and have them use corporate credentials to log in. You don’t have to maintain separate database user accounts.

Although this post demonstrated the integration of IAM with AD FS and Active Directory, you can replicate this solution across with your choice of SAML 2.0 third-party identity providers (IdP), such as PingFederate or Okta. For the different supported federation options, see Configure SAML Assertions for Your IdP.

If you have questions or suggestions, please comment below.


Additional Reading

Learn how to establish federated access to your AWS resources by using Active Directory user attributes.


About the Author

Thiyagarajan Arumugam is a Big Data Solutions Architect at Amazon Web Services and designs customer architectures to process data at scale. Prior to AWS, he built data warehouse solutions at Amazon.com. In his free time, he enjoys all outdoor sports and practices the Indian classical drum mridangam.

 

Secure API Access with Amazon Cognito Federated Identities, Amazon Cognito User Pools, and Amazon API Gateway

Post Syndicated from Ed Lima original https://aws.amazon.com/blogs/compute/secure-api-access-with-amazon-cognito-federated-identities-amazon-cognito-user-pools-and-amazon-api-gateway/

Ed Lima, Solutions Architect

 

Our identities are what define us as human beings. Philosophical discussions aside, it also applies to our day-to-day lives. For instance, I need my work badge to get access to my office building or my passport to travel overseas. My identity in this case is attached to my work badge or passport. As part of the system that checks my access, these documents or objects help define whether I have access to get into the office building or travel internationally.

This exact same concept can also be applied to cloud applications and APIs. To provide secure access to your application users, you define who can access the application resources and what kind of access can be granted. Access is based on identity controls that can confirm authentication (AuthN) and authorization (AuthZ), which are different concepts. According to Wikipedia:

 

The process of authorization is distinct from that of authentication. Whereas authentication is the process of verifying that “you are who you say you are,” authorization is the process of verifying that “you are permitted to do what you are trying to do.” This does not mean authorization presupposes authentication; an anonymous agent could be authorized to a limited action set.

Amazon Cognito allows building, securing, and scaling a solution to handle user management and authentication, and to sync across platforms and devices. In this post, I discuss the different ways that you can use Amazon Cognito to authenticate API calls to Amazon API Gateway and secure access to your own API resources.

 

Amazon Cognito Concepts

 

It’s important to understand that Amazon Cognito provides three different services:

Today, I discuss the use of the first two. One service doesn’t need the other to work; however, they can be configured to work together.
 

Amazon Cognito Federated Identities

 
To use Amazon Cognito Federated Identities in your application, create an identity pool. An identity pool is a store of user data specific to your account. It can be configured to require an identity provider (IdP) for user authentication, after you enter details such as app IDs or keys related to that specific provider.

After the user is validated, the provider sends an identity token to Amazon Cognito Federated Identities. In turn, Amazon Cognito Federated Identities contacts the AWS Security Token Service (AWS STS) to retrieve temporary AWS credentials based on a configured, authenticated IAM role linked to the identity pool. The role has appropriate IAM policies attached to it and uses these policies to provide access to other AWS services.

Amazon Cognito Federated Identities currently supports the IdPs listed in the following graphic.

 



Continue reading Secure API Access with Amazon Cognito Federated Identities, Amazon Cognito User Pools, and Amazon API Gateway

Building High-Throughput Genomic Batch Workflows on AWS: Batch Layer (Part 3 of 4)

Post Syndicated from Andy Katz original https://aws.amazon.com/blogs/compute/building-high-throughput-genomic-batch-workflows-on-aws-batch-layer-part-3-of-4/

Aaron Friedman is a Healthcare and Life Sciences Partner Solutions Architect at AWS

Angel Pizarro is a Scientific Computing Technical Business Development Manager at AWS

This post is the third in a series on how to build a genomics workflow on AWS. In Part 1, we introduced a general architecture, shown below, and highlighted the three common layers in a batch workflow:

  • Job
  • Batch
  • Workflow

In Part 2, you built a Docker container for each job that needed to run as part of your workflow, and stored them in Amazon ECR.

In Part 3, you tackle the batch layer and build a scalable, elastic, and easily maintainable batch engine using AWS Batch.

AWS Batch enables developers, scientists, and engineers to easily and efficiently run hundreds of thousands of batch computing jobs on AWS. It dynamically provisions the optimal quantity and type of compute resources (for example, CPU or memory optimized instances) based on the volume and specific resource requirements of the batch jobs that you submit. With AWS Batch, you do not need to install and manage your own batch computing software or server clusters, which allows you to focus on analyzing results, such as those of your genomic analysis.

Integrating applications into AWS Batch

If you are new to AWS Batch, we recommend reading Setting Up AWS Batch to ensure that you have the proper permissions and AWS environment.

After you have a working environment, you define several types of resources:

  • IAM roles that provide service permissions
  • A compute environment that launches and terminates compute resources for jobs
  • A custom Amazon Machine Image (AMI)
  • A job queue to submit the units of work and to schedule the appropriate resources within the compute environment to execute those jobs
  • Job definitions that define how to execute an application

After the resources are created, you’ll test the environment and create an AWS Lambda function to send generic jobs to the queue.

This genomics workflow covers the basic steps. For more information, see Getting Started with AWS Batch.

Creating the necessary IAM roles

AWS Batch simplifies batch processing by managing a number of underlying AWS services so that you can focus on your applications. As a result, you create IAM roles that give the service permissions to act on your behalf. In this section, deploy the AWS CloudFormation template included in the GitHub repository and extract the ARNs for later use.

To deploy the stack, go to the top level in the repo with the following command:

aws cloudformation create-stack --template-body file://batch/setup/iam.template.yaml --stack-name iam --capabilities CAPABILITY_NAMED_IAM

You can capture the output from this stack in the Outputs tab in the CloudFormation console:

Creating the compute environment

In AWS Batch, you will set up a managed compute environments. Managed compute environments automatically launch and terminate compute resources on your behalf based on the aggregate resources needed by your jobs, such as vCPU and memory, and simple boundaries that you define.

When defining your compute environment, specify the following:

  • Desired instance types in your environment
  • Min and max vCPUs in the environment
  • The Amazon Machine Image (AMI) to use
  • Percentage value for bids on the Spot Market and VPC subnets that can be used.

AWS Batch then provisions an elastic and heterogeneous pool of Amazon EC2 instances based on the aggregate resource requirements of jobs sitting in the RUNNABLE state. If a mix of CPU and memory-intensive jobs are ready to run, AWS Batch provisions the appropriate ratio and size of CPU and memory-optimized instances within your environment. For this post, you will use the simplest configuration, in which instance types are set to "optimal" allowing AWS Batch to choose from the latest C, M, and R EC2 instance families.

While you could create this compute environment in the console, we provide the following CLI commands. Replace the subnet IDs and key name with your own private subnets and key, and the image-id with the image you will build in the next section.

ACCOUNTID=<your account id>
SERVICEROLE=<from output in CloudFormation template>
IAMFLEETROLE=<from output in CloudFormation template>
JOBROLEARN=<from output in CloudFormation template>
SUBNETS=<comma delimited list of subnets>
SECGROUPS=<your security groups>
SPOTPER=50 # percentage of on demand
IMAGEID=<ami-id corresponding to the one you created>
INSTANCEROLE=<from output in CloudFormation template>
REGISTRY=${ACCOUNTID}.dkr.ecr.us-east-1.amazonaws.com
KEYNAME=<your key name>
MAXCPU=1024 # max vCPUs in compute environment
ENV=myenv

# Creates the compute environment
aws batch create-compute-environment --compute-environment-name genomicsEnv-$ENV --type MANAGED --state ENABLED --service-role ${SERVICEROLE} --compute-resources type=SPOT,minvCpus=0,maxvCpus=$MAXCPU,desiredvCpus=0,instanceTypes=optimal,imageId=$IMAGEID,subnets=$SUBNETS,securityGroupIds=$SECGROUPS,ec2KeyPair=$KEYNAME,instanceRole=$INSTANCEROLE,bidPercentage=$SPOTPER,spotIamFleetRole=$IAMFLEETROLE

Creating the custom AMI for AWS Batch

While you can use default Amazon ECS-optimized AMIs with AWS Batch, you can also provide your own image in managed compute environments. We will use this feature to provision additional scratch EBS storage on each of the instances that AWS Batch launches and also to encrypt both the Docker and scratch EBS volumes.

AWS Batch has the same requirements for your AMI as Amazon ECS. To build the custom image, modify the default Amazon ECS-Optimized Amazon Linux AMI in the following ways:

  • Attach a 1 TB scratch volume to /dev/sdb
  • Encrypt the Docker and new scratch volumes
  • Mount the scratch volume to /docker_scratch by modifying /etcfstab

The first two tasks can be addressed when you create the custom AMI in the console. Spin up a small t2.micro instance, and proceed through the standard EC2 instance launch.

After your instance has launched, record the IP address and then SSH into the instance. Copy and paste the following code:

sudo yum -y update
sudo parted /dev/xvdb mklabel gpt
sudo parted /dev/xvdb mkpart primary 0% 100%
sudo mkfs -t ext4 /dev/xvdb1
sudo mkdir /docker_scratch
sudo echo -e '/dev/xvdb1\t/docker_scratch\text4\tdefaults\t0\t0' | sudo tee -a /etc/fstab
sudo mount -a

This auto-mounts your scratch volume to /docker_scratch, which is your scratch directory for batch processing. Next, create your new AMI and record the image ID.

Creating the job queues

AWS Batch job queues are used to coordinate the submission of batch jobs. Your jobs are submitted to job queues, which can be mapped to one or more compute environments. Job queues have priority relative to each other. You can also specify the order in which they consume resources from your compute environments.

In this solution, use two job queues. The first is for high priority jobs, such as alignment or variant calling. Set this with a high priority (1000) and map back to the previously created compute environment. Next, set a second job queue for low priority jobs, such as quality statistics generation. To create these compute environments, enter the following CLI commands:

aws batch create-job-queue --job-queue-name highPriority-${ENV} --compute-environment-order order=0,computeEnvironment=genomicsEnv-${ENV}  --priority 1000 --state ENABLED
aws batch create-job-queue --job-queue-name lowPriority-${ENV} --compute-environment-order order=0,computeEnvironment=genomicsEnv-${ENV}  --priority 1 --state ENABLED

Creating the job definitions

To run the Isaac aligner container image locally, supply the Amazon S3 locations for the FASTQ input sequences, the reference genome to align to, and the output BAM file. For more information, see tools/isaac/README.md.

The Docker container itself also requires some information on a suitable mountable volume so that it can read and write files temporary files without running out of space.

Note: In the following example, the FASTQ files as well as the reference files to run are in a publicly available bucket.

FASTQ1=s3://aws-batch-genomics-resources/fastq/SRR1919605_1.fastq.gz
FASTQ2=s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz
REF=s3://aws-batch-genomics-resources/reference/isaac/
BAM=s3://mybucket/genomic-workflow/test_results/bam/

mkdir ~/scratch

docker run --rm -ti -v $(HOME)/scratch:/scratch $REPO_URI --bam_s3_folder_path $BAM \
--fastq1_s3_path $FASTQ1 \
--fastq2_s3_path $FASTQ2 \
--reference_s3_path $REF \
--working_dir /scratch 

Locally running containers can typically expand their CPU and memory resource headroom. In AWS Batch, the CPU and memory requirements are hard limits and are allocated to the container image at runtime.

Isaac is a fairly resource-intensive algorithm, as it creates an uncompressed index of the reference genome in memory to match the query DNA sequences. The large memory space is shared across multiple CPU threads, and Isaac can scale almost linearly with the number of CPU threads given to it as a parameter.

To fit these characteristics, choose an optimal instance size to maximize the number of CPU threads based on a given large memory footprint, and deploy a Docker container that uses all of the instance resources. In this case, we chose a host instance with 80+ GB of memory and 32+ vCPUs. The following code is example JSON that you can pass to the AWS CLI to create a job definition for Isaac.

aws batch register-job-definition --job-definition-name isaac-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/isaac",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":80000,
"vcpus":32,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

You can copy and paste the following code for the other three job definitions:

aws batch register-job-definition --job-definition-name strelka-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/strelka",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":32000,
"vcpus":32,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

aws batch register-job-definition --job-definition-name snpeff-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/snpeff",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":10000,
"vcpus":4,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

aws batch register-job-definition --job-definition-name samtoolsStats-${ENV} --type container --retry-strategy attempts=3 --container-properties '
{"image": "'${REGISTRY}'/samtools_stats",
"jobRoleArn":"'${JOBROLEARN}'",
"memory":10000,
"vcpus":4,
"mountPoints": [{"containerPath": "/scratch", "readOnly": false, "sourceVolume": "docker_scratch"}],
"volumes": [{"name": "docker_scratch", "host": {"sourcePath": "/docker_scratch"}}]
}'

The value for "image" comes from the previous post on creating a Docker image and publishing to ECR. The value for jobRoleArn you can find from the output of the CloudFormation template that you deployed earlier. In addition to providing the number of CPU cores and memory required by Isaac, you also give it a storage volume for scratch and staging. The volume comes from the previously defined custom AMI.

Testing the environment

After you have created the Isaac job definition, you can submit the job using the AWS Batch submitJob API action. While the base mappings for Docker run are taken care of in the job definition that you just built, the specific job parameters should be specified in the container overrides section of the API call. Here’s what this would look like in the CLI, using the same parameters as in the bash commands shown earlier:

aws batch submit-job --job-name testisaac --job-queue highPriority-${ENV} --job-definition isaac-${ENV}:1 --container-overrides '{
"command": [
			"--bam_s3_folder_path", "s3://mybucket/genomic-workflow/test_batch/bam/",
            "--fastq1_s3_path", "s3://aws-batch-genomics-resources/fastq/ SRR1919605_1.fastq.gz",
            "--fastq2_s3_path", "s3://aws-batch-genomics-resources/fastq/SRR1919605_2.fastq.gz",
            "--reference_s3_path", "s3://aws-batch-genomics-resources/reference/isaac/",
            "--working_dir", "/scratch",
			"—cmd_args", " --exome ",]
}'

When you execute a submitJob call, jobId is returned. You can then track the progress of your job using the describeJobs API action:

aws batch describe-jobs –jobs <jobId returned from submitJob>

You can also track the progress of all of your jobs in the AWS Batch console dashboard.

To see exactly where a RUNNING job is at, use the link in the AWS Batch console to direct you to the appropriate location in CloudWatch logs.

Completing the batch environment setup

To finish, create a Lambda function to submit a generic AWS Batch job.

In the Lambda console, create a Python 2.7 Lambda function named batchSubmitJob. Copy and paste the following code. This is similar to the batch-submit-job-python27 Lambda blueprint. Use the LambdaBatchExecutionRole that you created earlier. For more information about creating functions, see Step 2.1: Create a Hello World Lambda Function.

from __future__ import print_function

import json
import boto3

batch_client = boto3.client('batch')

def lambda_handler(event, context):
    # Log the received event
    print("Received event: " + json.dumps(event, indent=2))
    # Get parameters for the SubmitJob call
    # http://docs.aws.amazon.com/batch/latest/APIReference/API_SubmitJob.html
    job_name = event['jobName']
    job_queue = event['jobQueue']
    job_definition = event['jobDefinition']
    
    # containerOverrides, dependsOn, and parameters are optional
    container_overrides = event['containerOverrides'] if event.get('containerOverrides') else {}
    parameters = event['parameters'] if event.get('parameters') else {}
    depends_on = event['dependsOn'] if event.get('dependsOn') else []
    
    try:
        response = batch_client.submit_job(
            dependsOn=depends_on,
            containerOverrides=container_overrides,
            jobDefinition=job_definition,
            jobName=job_name,
            jobQueue=job_queue,
            parameters=parameters
        )
        
        # Log response from AWS Batch
        print("Response: " + json.dumps(response, indent=2))
        
        # Return the jobId
        event['jobId'] = response['jobId']
        return event
    
    except Exception as e:
        print(e)
        message = 'Error getting Batch Job status'
        print(message)
        raise Exception(message)

Conclusion

In part 3 of this series, you successfully set up your data processing, or batch, environment in AWS Batch. We also provided a Python script in the corresponding GitHub repo that takes care of all of the above CLI arguments for you, as well as building out the job definitions for all of the jobs in the workflow: Isaac, Strelka, SAMtools, and snpEff. You can check the script’s README for additional documentation.

In Part 4, you’ll cover the workflow layer using AWS Step Functions and AWS Lambda.

Please leave any questions and comments below.

New – Amazon EMR Instance Fleets

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/new-amazon-emr-instance-fleets/

Today we’re excited to introduce a new feature for Amazon EMR clusters called instance fleets. Instance fleets gives you a wider variety of options and intelligence around instance provisioning. You can now provide a list of up to 5 instance types with corresponding weighted capacities and spot bid prices (including spot blocks)! EMR will automatically provision On-Demand and Spot capacity across these instance types when creating your cluster. This can make it easier and more cost effective to quickly obtain and maintain your desired capacity for your clusters.

You can also specify a list of Availability Zones and EMR will optimally optimally launch your cluster in one of the AZs. EMR will also continue to rebalance your cluster in the case of Spot instance interruptions by replacing instances with any of the available types in your fleet. This will make it easier to maintain your cluster’s overall capacity. Instance fleets can be used instead of instance groups. Just like groups your cluster will have master, core, and task fleets.

Let’s take a look at the console updates to get an idea of how these fleets work.

We’ll start by navigating to the EMR console and clicking the Create Cluster button. That should bring us to our familiar EMR provisioning console where we can navigate to the advanced options near the top left.

We’ll select the latest EMR version (instance fleets are available for EMR versions 4.8.0 and greater, with the exception of 5.0.x) and click next.

Now we get to the good stuff! We’ll select the new instance fleet option in the hardware options.
Screenshot 2017-03-09 00.30.51

Now what I want to do is modify our core group to have a couple of instance types that will satisfy the needs of my cluster.

CoreFleetScreenshot

EMR will provision capacity in each instance fleet and availability zone to meet my requirements in the most cost effective way possible. The EMR console provides an easy mapping of vCPU to weighted capacity for each instance type, making it easy to use vCPU as the capacity unit (I want 16 total vCPUs in my core fleet). If the vCPU units don’t match my criteria for weighting instance types I can change the “Target capacity” selector to include arbitrary units and define my own weights (this is how the API/CLI consume capacity units as well).

When the cluster is being provisioned if it’s unable to obtain the desired spot capacity within a user defined timeout you can have it terminate or fall back onto On-Demand instances to provision the rest of the capacity.

All this functionality for instance fleets is also available from the AWS SDKs and the CLI. Let’s take a look at how we would provision our own instance fleet.

First we’ll create our configuration json in my-fleet-config.json:

[
  {
    "Name": "MasterFleet",
    "InstanceFleetType": "MASTER",
    "TargetOnDemandCapacity": 1,
    "InstanceTypeConfigs": [{"InstanceType": "m3.xlarge"}]
  },
  {
    "Name": "CoreFleet",
    "InstanceFleetType": "CORE",
    "TargetSpotCapacity": 11,
    "TargetOnDemandCapacity": 11,
    "LaunchSpecifications": {
      "SpotSpecification": {
        "TimeoutDurationMinutes": 20,
        "TimeoutAction": "SWITCH_TO_ON_DEMAND"
      }
    },
    "InstanceTypeConfigs": [
      {
        "InstanceType": "r4.xlarge",
        "BidPriceAsPercentageOfOnDemandPrice": 50,
        "WeightedCapacity": 1
      },
      {
        "InstanceType": "r4.2xlarge",
        "BidPriceAsPercentageOfOnDemandPrice": 50,
        "WeightedCapacity": 2
      },
      {
        "InstanceType": "r4.4xlarge",
        "BidPriceAsPercentageOfOnDemandPrice": 50,
        "WeightedCapacity": 4
      }
    ]
  }
]

Now that we have our configuration we can use the AWS CLI’s ’emr’ subcommand to create a new cluster with that configuration:

aws emr create-cluster --release-label emr-5.4.0 \
--applications Name=Spark,Name=Hive,Name=Zeppelin \
--service-role EMR_DefaultRole \
--ec2-attributes InstanceProfile="EMR_EC2_DefaultRole,SubnetIds=[subnet-1143da3c,subnet-2e27c012]" \
--instance-fleets file://my-fleet-config.json

If you’re eager to get started the feature is available now at no additional cost and you can find detailed documentation to help you get started here.

Thanks to the EMR service team for their help writing this post!