Tag Archives: AWS IAM Identity Center

Set up cross-account AWS Glue Data Catalog access using AWS Lake Formation and AWS IAM Identity Center with Amazon Redshift and Amazon QuickSight

2024-08-05 Poulomi Dasgupta

Post Syndicated from Poulomi Dasgupta original https://aws.amazon.com/blogs/big-data/set-up-cross-account-aws-glue-data-catalog-access-using-aws-lake-formation-and-aws-iam-identity-center-with-amazon-redshift-and-amazon-quicksight/

Most organizations manage their workforce identity centrally in external identity providers (IdPs) and are comprised of multiple business units that produce their own datasets and manage the lifecycle spread across multiple AWS accounts. These business units have varying landscapes, where a data lake is managed by Amazon Simple Storage Service (Amazon S3) and analytics workloads are run on Amazon Redshift, a fast, scalable, and fully managed cloud data warehouse that allows you to process and run your complex SQL analytics workloads on structured and semi-structured data.

Business units that create data products like to share them with others, without copying the data, to promote analysis to derive insights. Also, they want tighter control on user access and the ability to audit access to their data products. To address this, enterprises usually catalog the datasets in the AWS Glue Data Catalog for data discovery and use AWS Lake Formation for fine-grained access control to adhere to the compliance and operating security model for their business units. Given the diverse range of services, fine-grained data sharing, and personas involved, these enterprises often want a streamlined experience for enterprise user identities when accessing their data using AWS Analytics services.

AWS IAM Identity Center enables centralized management of workforce user access to AWS accounts and applications using a local identity store or by connecting corporate directories using IdPs. Amazon Redshift and AWS Lake Formation are integrated with the new trusted identity propagation capability in IAM Identity Center, allowing you to use third-party IdPs such as Microsoft Entra ID (Azure AD), Okta, Ping, and OneLogin.

With trusted identity propagation, Lake Formation enables data administrators to directly provide fine-grained access to corporate users and groups, and simplifies the traceability of end-to-end data access across supported AWS services. Because access is managed based on a user’s corporate identity, end-users don’t need to use database local user credentials or assume an AWS Identity and Access Management (IAM) role to access data. Furthermore, this enables effective user permissions based on collective group membership and supports group hierarchy.

In this post, we cover how to enable trusted identity propagation with AWS IAM Identity Center, Amazon Redshift, and AWS Lake Formation residing on separate AWS accounts and set up cross-account sharing of an S3 data lake for enterprise identities using AWS Lake Formation to enable analytics using Amazon Redshift. Then we use Amazon QuickSight to build insights using Redshift tables as our data source.

Solution overview

This post covers a use case where an organization centrally manages corporate users within their IdP and where the users belong to multiple business units. Their goal is to enable centralized user authentication through IAM Identity Center in the management account, while keeping the business unit that analyzes data using a Redshift cluster and the business unit that produces data cataloged using the Data Catalog in separate member accounts. This allows them to maintain a single authentication mechanism through IAM Identity Center within an organization while retaining access control, resource, and cost separation through the use of separate AWS accounts per business units and enabling cross-account data sharing using Lake Formation.

For this solution, AWS Organizations is enabled in the central management account and IAM Identity Center is configured for managing workforce identities. The organization has two member accounts: one account that manages the S3 data lake using the Data Catalog, and another account that runs analytical workloads on Amazon Redshift and QuickSight, with all the services enabled with trusted identity propagation. Amazon Redshift will access cross-account AWS Glue resources using IAM Identity Center users and groups set up in the central management account using QuickSight in member account 1. In member account 2, permissions on the AWS Glue resources are managed using Lake Formation and are shared with member account 1 using Lake Formation data sharing.

The following diagram illustrates the solution architecture.

The solution consists of the following:

In the centralized management account, we create a permission set and create account assignments for Redshift_Member_Account. We integrate users and groups from the IdP with IAM Identity Center.
Member account 1 (Redshift_Member_Account) is where the Redshift cluster and application exist.
Member account 2 (Glue_Member_Account) is where metadata is cataloged in the Data Catalog and Lake Formation is enabled with IAM Identity Center integration.
We assign permissions to two IAM Identity Center groups to access the Data Catalog resources:
- awssso-sales – We apply column-level filtering for this group so that users belonging to this group will be able to select two columns and read all rows.
- awssso-finance – We apply row-level filtering using data filters for this group so that users belonging to this group will be able to select all columns and see rows after row-level filtering is applied.
We apply different permissions for three IAM Identity Center users:
- User Ethan, part of awssso-sales – Ethan will be able to select two columns and read all rows.
- User Frank, part of awssso-finance – Frank will be able to select all columns and see rows after row-level filtering is applied.
- User Brian, part of awssso-sales and awssso-finance – Brian inherits permissions defined for both groups.
We set up QuickSight in the same account where Amazon Redshift exists, enabling authentication using IAM Identity Center.

Prerequisites

You should have the following prerequisites alreday set up:

A centralized management account where IAM Identity Center is enabled with member accounts added. For more information, see Enabling AWS IAM Identity Center.
Optionally, connect IAM Identity Center with your preferred IdP and sync users and groups. For instructions, refer to Getting started tutorials.
Member account 1 (Redshift_Member_Account) where the Redshift cluster and application exist. To set up a Redshift cluster with IAM Identity Center integration enabled, refer to Integrate Identity Provider (IdP) with Amazon Redshift Query Editor V2 using AWS IAM Identity Center for seamless Single Sign-On.
Member account 2 (Glue_Member_Account) where metadata is cataloged in the Data Catalog.

Member account 2 configuration

Sign in to the Lake Formation console as the data lake administrator. To learn more about setting up permissions for a data lake administrator, see Create a data lake administrator.

In this section, we walk through the steps to set up Lake Formation, enable Lake Formation permissions, and grant database and table permissions to IAM Identity Center groups.

Set up Lake Formation

Complete the steps in this section to set up Lake Formation.

Create AWS Glue resources

You can use an existing AWS Glue database that has a few tables. For this post, we use a database called customerdb and a table called reviews whose data is stored in the S3 bucket lf-datalake-<account-id>-<region>.

Register the S3 bucket location

Complete the following steps to register the S3 bucket location:

On the Lake Formation console, in the navigation pane, under Administration, choose Data lake locations.
Choose Register location.
For Amazon S3 location, enter the S3 bucket location that contains table data.
For IAM role, provide a user-defined IAM role. For instructions to create a user-defined IAM role, refer to Requirements for roles used to register locations.
For Permission mode, select Lake Formation.
Choose Register location.

Set cross-account version

Complete the following steps to set your cross-account version:

Sign in to the Lake Formation console as the data lake admin.
In the navigation pane, under Administration, choose Data Catalog settings.
Under Cross-account version settings, keep the latest version (Version 4) as the current cross-account version.
Choose Save.

Add permissions required for cross-account access

If the AWS Glue Data Catalog resource policy is already enabled in the account, then you can either remove the policy or add the following permissions to the policy that are required for cross-account grants. The provided policy enables AWS Resource Access Manager (AWS RAM) to share a resource policy while cross-account grants are made using Lake Formation. For more information, refer to Prerequisites. Please skip to the following step if your policy is blank under Catalog Settings.

Sign in to the AWS Glue console as an IAM admin.
In the navigation pane, under Data Catalog, choose Catalog settings.
Under Permissions, add the following policy, and provide the account ID where your AWS Glue resources exist:

{ "Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": "ram.amazonaws.com"
},
"Action": "glue:ShareResource",
"Resource": [
"arn:aws:glue:us-east-1:<account-id>:table/*/*",
"arn:aws:glue:us-east-1:<account-id>:database/*",
"arn:aws:glue:us-east-1:<account-id>:catalog"
]
}
]
}

Choose Save.

For more information, see Granting cross-account access.

Enable IAM Identity Center integration for Lake Formation

To integrate IAM Identity Center with your Lake Formation organization instance of IAM Identity Center, refer to Connecting Lake Formation with IAM Identity Center.

To enable cross-account sharing for IAM Identity Center users and groups, add the target recipient accounts to your Lake Formation IAM Identity Center integration under the AWS account and organization IDs.

Sign in to the Lake Formation console as a data lake admin.
In the navigation pane, under Administration, choose IAM Identity Center integration.
Under AWS account and organization IDs, choose Add.
Enter your target accounts.
Choose Add.

Enable Lake Formation permissions for databases

For Data Catalog databases that contain tables that you might share, you can stop new tables from having the default grant of Super to IAMAllowedPrincipals. Complete the following steps:

Sign in to the Lake Formation console as a data lake admin.
In the navigation pane, under Data Catalog, choose Databases.
Select the database customerdb.
Choose Actions, then choose Edit.
Under Default permissions for newly created tables, deselect Use only IAM access control for new tables in this database.
Choose Save.

For Data Catalog databases, remove IAMAllowedPrincipals.

Under Data Catalog in the navigation pane, choose Databases.
Select the database customerdb.
Choose Actions, then choose View.
Select IAMAllowedPrincipals and choose Revoke.

Repeat the same steps for tables under the customerdb database.

Grant database permissions to IAM Identity Center groups

Complete the following steps to grant database permissions to your IAM Identity Center groups:

On the Lake Formation console, under Data Catalog, choose Databases.
Select the database customerdb.
Choose Actions, then choose Grant.
Select IAM Identity Center.
Choose Add and select Get Started.
Search for and select your IAM Identity Center group names and choose Assign.

Select Named Data Catalog resources.
Under Databases, choose customerdb.
Under Database permissions, select Describe for Database permissions.
Choose Grant.

Grant table permissions to IAM Identity Center groups

In the following section, we will grant different permissions to our two IAM Identity Center groups.

Column filter

We first add permissions to the group awssso-sales. This group will have access to the customerdb database and be able to select only two columns and read all rows.

On the Lake Formation console, under Data Catalog in the navigation pane, choose Databases.
Select the database customerdb.
Choose Actions, then choose Grant.
Select IAM Identity Center.
Choose Add and select Get Started.
Search for and select awssso-sales and choose Assign.

Select Named Data Catalog resources.
Under Databases, choose customerdb.
Under Tables, choose reviews.
Under Table permissions, select Select for Table permissions.
Select Column-based access.
Select Include columns and choose product_title and star_rating.
Choose Grant.

Row filter

Next, we grant permissions to awssso-finance. This group will have access to customerdb and be able to select all columns and apply filters on rows.

We need to first create a data filter by performing the following steps:

On the Lake Formation console, choose Data filters under Data Catalog.
Choose Create data filter.
For Data filter name, provide a name.
For Target database, choose customerdb.
For Target table, choose reviews.
For Column-level access, select Access to all columns.
For Row-level access, choose Filter rows and apply your filter. In this example, we are filtering reviews with star_rating as 5.
Choose Create data filter.

Under Data Catalog in the navigation pane, choose Databases.
Select the database customerdb.
Choose Actions, then choose Grant.
Select IAM Identity Center.
Choose Add and select Get Started.
Search for and select awssso-finance and choose Assign.
Select Named Data Catalog resources.
Under Databases, choose customerdb.
Under Tables, choose reviews.
Under Data Filters, choose the High_Rating
Under Data Filter permissions, select Select.
Choose Grant.

Member account 1 configuration

In this section, we walk through the steps to add Amazon Redshift Spectrum table access in member account 1, where the Redshift cluster and application exist.

Accept Invite from RAM

You should have received a Resource Access Manager (RAM) invite from member account 2 when you added member account 1 under IAM Identity Center integration in Lake Formation at the member account 1.

Navigate to Resource Access Manager(RAM) from admin console.
Under Shared with me, click on resource shares.
Select the resource name and click on Accept resource share.

Please make sure that you have followed this entire blog to establish the Redshift Integration with IAM Identity Center before following the next steps.

Set up Redshift Spectrum table access for the IAM Identity Center group

Complete the following steps to set up Redshift Spectrum table access:

Sign in to the Amazon Redshift console using the admin role.
Navigate to Query Editor v2.
Choose the options menu (three dots) next to the cluster and choose Create connection.
Connect as the admin user and run the following commands to make the shared resource link data in the S3 data lake available to the sales group (use the account ID where the Data Catalog exists):

create external schema if not exists <schema_name> from DATA CATALOG database '<glue_catalog_name>' catalog_id '<accountid>';
grant usage on schema <schema_name> to role "<role_name>";

For example:

create external schema if not exists cross_account_glue_schema from DATA CATALOG database 'customerdb' catalog_id '932880906720';
grant usage on schema cross_account_glue_schema to role "awsidc:awssso-sales";
grant usage on schema cross_account_glue_schema to role "awsidc:awssso-finance";

Validate Redshift Spectrum access as an IAM Identity Center user

Complete the following steps to validate access:

On the Amazon Redshift console, navigate to Query Editor v2.
Choose the options menu (three dots) next to the cluster and choose Create connection.
Select IAM Identity Center.
Enter your Okta user name and password in the browser pop-up.

When you’re connected as a federated user, run the following SQL commands to query the cross_account_glue_schema data lake table.

select * from "dev"."cross_account_glue_schema"."reviews";

The following screenshot shows that user Ethan, who is part of the awssso-sales group, has access to two columns and all rows from the Data Catalog.

The following screenshot shows that user Frank, who is part of the awssso-finance group, has access to all columns for records that have star_rating as 5.

The following screenshot shows that user Brian, who is part of awssso-sales and awssso-finance, has access to all columns for records that have star_rating as 5 and access to only two columns (other columns are returned NULL) for records with star_rating other than 5.

Subscribe to QuickSight with IAM Identity Center

In this post, we set up QuickSight in the same account where the Redshift cluster exists. You can use the same or a different member account for QuickSight setup. To subscribe to QuickSight, complete the following steps:

Sign in to your AWS account and open the QuickSight console.
Choose Sign up for QuickSight.

Enter a notification email address for the QuickSight account owner or group. This email address will receive service and usage notifications.
Select the identity option that you want to subscribe with. For this post, we select Use AWS IAM Identity Center.
Enter a QuickSight account name.
Choose Configure.

Next, assign groups in IAM Identity Center to roles in QuickSight (admin, author, and reader.) This step enables your users to access the QuickSight application. In this post, we choose awssso-sales and awssso-finance for Admin group.
Specify an IAM role to control QuickSight access to your AWS resources. In this post, we select Use QuickSight managed role (default).
For this post, we deselect Add Paginated Reports.
Review the choices that you made, then choose Finish.

Enable trusted identity propagation in QuickSight

Trusted identity propagation authenticates the end-user in Amazon Redshift when they access QuickSight assets that use a trusted identity propagation enabled data source. When an author creates a data source with trusted identity propagation, the identity of the data source consumers in QuickSight is propagated and logged in AWS CloudTrail. This allows database administrators to centrally manage data security in Amazon Redshift and automatically apply data security rules to data consumers in QuickSight.

To configure QuickSight to connect to Amazon Redshift data sources with trusted identity propagation, configure Amazon Redshift OAuth scopes to your QuickSight account:

aws quicksight update-identity-propagation-config --aws-account-id "AWSACCOUNTID" --service "REDSHIFT" --authorized-targets "IAM Identity Center managed application ARN"

For example:

aws quicksight update-identity-propagation-config --aws-account-id "1234123123" --service "REDSHIFT" --authorized-targets "arn:aws:sso::XXXXXXXXXXXX:application/ssoins-XXXXXXXXXXXX/apl-XXXXXXXXXXXX"

After you have added the scope, the following command lists all OAuth scopes that are currently on a QuickSight account:

aws quicksight list-identity-propagation-configs --aws-account-id "AWSACCOUNTID"

The following code is the example with output:

aws quicksight list-identity-propagation-configs --aws-account-id "1234123123"
{
"Status": 200,
"Services": [
{
"Service": "REDSHIFT",
"AuthorizedTargets": [
"arn:aws:sso::1004123000:application/ssoins-1234f1234bb1f123/apl-12a1234e2e391234"
]
}
],
"RequestId": "116ec1b0-1533-4ed2-b5a6-d7577e073b35"
}

For more information, refer to Authorizing connections from Amazon QuickSight to Amazon Redshift clusters.

For QuickSight to connect to a Redshift instance, you must add an appropriate IP address range in the Redshift security group for the specific AWS Region. For more information, see AWS Regions, websites, IP address ranges, and endpoints.

Test your IAM Identity Center and Amazon Redshift integration with QuickSight

Now you’re ready to connect to Amazon Redshift using QuickSight.

In the management account, open the IAM Identity Center console and copy the AWS access portal URL from the dashboard.
Sign out from the management account and enter the AWS access portal URL in a new browser window.
In the pop-up window, enter your IdP credentials.

After successful authentication, you’ll be logged in to the AWS Management Console as a federated user.

On the Applications tab, select the QuickSight app.
After you federate to QuickSight, choose Datasets.
Select New Dataset and then choose Redshift (Auto Discovered).
Enter your data source details. Make sure to select Single sign-on for Authentication method.
Choose Create data source.

Congratulations! You’re signed in using IAM Identity Center integration with Amazon Redshift and are ready to explore and analyze your data using QuickSight.

The following screenshot from QuickSight shows that user Ethan, who is part of the awssso-sales group, has access to two columns and all rows from the Data Catalog.

The following screenshot from QuickSight shows that user Frank, who is part of the awssso-finance group, has access to all columns for records that have star_rating as 5.

The following screenshot from QuickSight shows that user Brian, who is part of awssso-sales and awssso-finance, has access to all columns for records that have star_rating as 5 and access to only two columns (other columns are returned NULL) for records with star_rating other than 5.

Clean up

Complete the following steps to clean up your resources:

Delete the data from the S3 bucket.
Delete the Data Catalog objects that you created as part of this post.
Delete the Lake Formation resources and QuickSight account.
If you created new Redshift cluster for testing this solution, delete the cluster.

Conclusion

In this post, we established cross-account access to enable centralized user authentication through IAM Identity Center in the management account, while keeping the Amazon Redshift and AWS Glue resources isolated by business unit in separate member accounts. We used Query Editor V2 for querying the data from Amazon Redshift. Then we showed how to build user-facing dashboards by integrating with QuickSight. Refer to Integrate Tableau and Okta with Amazon Redshift using AWS IAM Identity Center to learn about integrating Tableau and Okta with Amazon Redshift using IAM Identity Center.

Learn more about IAM Identity Center with Amazon Redshift, QuickSight, and Lake Formation. Leave your questions and feedback in the comments section.

About the Authors

Srividya Parthasarathy is a Senior Big Data Architect on the AWS Lake Formation team. She enjoys building data mesh solutions and sharing them with the community.

Maneesh Sharma is a Senior Database Engineer at AWS with more than a decade of experience designing and implementing large-scale data warehouse and analytics solutions. He collaborates with various Amazon Redshift Partners and customers to drive better integration.

Poulomi Dasgupta is a Senior Analytics Solutions Architect with AWS. She is passionate about helping customers build cloud-based analytics solutions to solve their business problems. Outside of work, she likes travelling and spending time with her family.

Federated access to Amazon Athena using AWS IAM Identity Center

2024-08-01 Ajay Rawat

Post Syndicated from Ajay Rawat original https://aws.amazon.com/blogs/security/federated-access-to-amazon-athena-using-aws-iam-identity-center/

Managing Amazon Athena through identity federation allows you to manage authentication and authorization procedures centrally. Athena is a serverless, interactive analytics service that provides a simplified and flexible way to analyze petabytes of data.

In this blog post, we show you how you can use the Athena JDBC driver (which includes a browser Security Assertion Markup Language (SAML) plugin) to connect to Athena from third-party SQL client tools, which helps you quickly implement identity federation capabilities and multi-factor authentication (MFA). This enables automation and enforcement of data access policies across your organization.

You can use AWS IAM Identity Center to federate access to users to AWS accounts. IAM Identity Center integrates with AWS Organizations to manage access to the AWS accounts under your organization. In this post, you will learn how to configure the Athena driver to use the AWS configuration profile credentials. This will allow you to resolve credentials from IAM Identity Center and use the MFA capability of your federation identity provider (IdP).In this post, you will learn how you can integrate the Athena browser-based SAML plugin to add single sign-on (SSO) and MFA capability with your federation identity provider (IdP).

Prerequisites

To implement this solution, you must have the follow prerequisites:

An AWS account.
Install or update to the latest version of the AWS Command Line Interface (AWS CLI).
Configure IAM Identity Center authentication using the AWS CLI.
IAM Identity Center enabled. See Enabling AWS IAM Identity Center.
Access to SQL client tools (such as SQL Workbench/J, Pycharm, and so on) that support JDBC connections.
An Amazon Simple Storage Service (Amazon S3) bucket to store Athena query results.
Knowledge of using AWS Lake Formation and enabling Lake Formation to manage permissions to a set of tables.
A Lake Formation administrator role. See Lake Formation personas and IAM permissions reference for information on creating a data lake administrator.
Tables and databases are populated in your AWS Glue Data Catalog.
Create two Athena workgroups (for example, sensitive and non-sensitive).

Note: Lake Formation only supports a single role in the SAML assertion. Multiple roles cannot be used.

Solution overview

Figure 1: Solution architecture

To implement the solution, complete the steps below as shown in Figure 1:

An IAM Identity Center delegated administrator creates two custom permission sets within Identity Center.
An IAM Identity Center delegated administrator assign permission sets to AWS accounts and users and groups. The user has permissions to single sign-on roles that are provisioned in the data lake account. The role created by Identity Center has a name that begins with AWSReservedSSO.
A Lake Formation administrator grants single sign-on roles permissions to the corresponding database and tables.

The solution workflow consists of the following high-level steps as shown in Figure 1:

The user configures IAM Identity Center authentication using the AWS CLI.
The AWS CLI redirects the user to the AWS access portal URL. The user enters workforce identity credentials (username and password). Then chooses Sign in.
The AWS access portal verifies the user’s identity. IAM Identity Center redirects the request to the Identity Center authentication service to validate the user’s credentials.
If MFA is enabled for the user, then they are prompted to authenticate their MFA device.
The user enters or approves the MFA details. The user’s MFA is successfully completed.
The user selects the AWS account to use from the displayed list. Then select the IAM single sign-on role to use from the displayed list.
The user tests the SQL client connection and then uses the client to run a SQL query.
The client makes a call to Athena to retrieve the table and associated metadata from the Data Catalog.
Athena requests access to the data from Lake Formation. Lake Formation invokes the AWS Security Token Service (AWS STS).
Lake Formation invokes AWS STS.
1. Lake Formation obtains temporary AWS credentials with the permissions of the defined IAM role (sensitive or non-sensitive) associated with the data lake location.
2. Lake Formation returns temporary credentials to Athena.
Athena uses the temporary credentials to retrieve data objects from Amazon S3.
The Athena engine successfully runs the query and returns the results to the client.

Solution walkthrough

The walkthrough includes five sections that will guide you through the process of creating permission sets, assigning permission sets to AWS Accounts, managing permission sets access using Lake Formation, and setting up third-party SQL clients such as SQL Workbench to connect to your data store and query your data through Athena.

Step 1: Federate onboarding

Federating onboarding is done within the IAM Identity Center account. As part of federated onboarding, you need to create IAM Identity Center users and groups. Groups are a collection of people who have the same security rights and permissions. You can create groups and add users to the groups. Create one IAM Identity Center group for sensitive data and another for non-sensitive data to provide distinct access to different classes of data sets. You can assign access to IAM Identity Center permission sets to a user or group.

To federate onboarding:

Open the AWS Management Console using the IAM Identity Center account and go to IAM Identity Center.
Choose Groups.
Choose Create group.
Enter a Group name and Description .
Choose Create group.

To add a user as a member of a group:

Open the IAM Identity Center console.
Choose Groups.
Select the group name that you want to update.
On the group details page, under Users in this group, choose Add users to group.
On the Add users to group page, under Other users, locate the users you want to add as members and select the check box next to each of them.
Choose Add users to group.

Figure 2: Assigning users to a group

Step 2: Create permission sets

For this step, create two permission sets (sensitive-iam-role and non-sensitive-iam-role). These permission sets can be assigned to users or groups in IAM Identity Center, granting them specific access to AWS account resources.

To create custom permission sets:

In the IAM Identity Center administrator account, under Multi-Account permissions, choose Permission sets.
Choose Create permission set.
On the Select permission set type page, under Permission set type, choose Custom permission set.

Figure 3: Selecting a permission set
Choose Next.
On the Specify policies and permission boundary page, expand Inline policy to add custom JSON-formatted policy text.

Insert the following policy and update the S3 bucket name (<s3-bucket-name>), AWS Region (<region>) account ID (<account-id>), CloudWatch alarm name (<AlarmName>), Athena workgroup name (sensitive or non-sensitive) (<WorkGroupName>), KMS key alias name (<KMS-key-alias-name>), and organization ID (<aws-PrincipalOrgID>).

{
  "Statement": [
    {
      "Action": [
        "lakeformation:SearchTablesByLFTags",
        "lakeformation:SearchDatabasesByLFTags",
        "lakeformation:ListLFTags",
        "lakeformation:GetResourceLFTags",
        "lakeformation:GetLFTag",
        "lakeformation:GetDataAccess",
        "glue:SearchTables",
        "glue:GetTables",
        "glue:GetTable",
        "glue:GetPartitions",
        "glue:GetDatabases",
        "glue:GetDatabase"
      ],
      "Effect": "Allow",
      "Resource": "*",
      "Sid": "LakeformationAccess"
    },
    {
      "Action": [
        "s3:PutObject",
        "s3:ListMultipartUploadParts",
        "s3:ListBucketMultipartUploads",
        "s3:ListBucket",
        "s3:GetObject",
        "s3:GetBucketLocation",
        "s3:CreateBucket",
        "s3:AbortMultipartUpload"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::<s3-bucket-name>/*",
        "arn:aws:s3:::<s3-bucket-name>"
      ],
      "Sid": "S3Access"
    },
    {
      "Action": "s3:ListAllMyBuckets",
      "Effect": "Allow",
      "Resource": "*",
      "Sid": "AthenaS3ListAllBucket"
    },
    {
      "Action": [
        "cloudwatch:PutMetricAlarm",
        "cloudwatch:DescribeAlarms"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:cloudwatch:<region>:<account-id>:alarm:<AlarmName>"
      ],
      "Sid": "CloudWatchLogs"
    },
    {
      "Action": [
        "athena:UpdatePreparedStatement",
        "athena:StopQueryExecution",
        "athena:StartQueryExecution",
        "athena:ListWorkGroups",
        "athena:ListTableMetadata",
        "athena:ListQueryExecutions",
        "athena:ListPreparedStatements",
        "athena:ListNamedQueries",
        "athena:ListEngineVersions",
        "athena:ListDatabases",
        "athena:ListDataCatalogs",
        "athena:GetWorkGroup",
        "athena:GetTableMetadata",
        "athena:GetQueryResultsStream",
        "athena:GetQueryResults",
        "athena:GetQueryExecution",
        "athena:GetPreparedStatement",
        "athena:GetNamedQuery",
        "athena:GetDatabase",
        "athena:GetDataCatalog",
        "athena:DeletePreparedStatement",
        "athena:DeleteNamedQuery",
        "athena:CreatePreparedStatement",
        "athena:CreateNamedQuery",
        "athena:BatchGetQueryExecution",
        "athena:BatchGetNamedQuery"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:athena:<region>:<account-id>:workgroup/<WorkGroupName>",
        "arn:aws:athena:{Region}:{Account}:datacatalog/{DataCatalogName}"
      ],
      "Sid": "AthenaAllow"
    },
    {
      "Action": [
        "kms:GenerateDataKey",
        "kms:DescribeKey",
        "kms:Decrypt"
      ],
      "Condition": {
        "ForAnyValue:StringLike": {
          "kms:ResourceAliases": "<KMS-key-alias-name>"
        }
      },
      "Effect": "Allow",
      "Resource": "*",
      "Sid": "kms"
    },
    {
      "Action": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:PrincipalOrgID": "<aws-PrincipalOrgID>"
        }
      },
      "Effect": "Deny",
      "Resource": "*",
      "Sid": "denyRule"
    }
  ],
  "Version": "2012-10-17"
}

Update the custom policy to add the corresponding Athena workgroup ARN for the sensitive and non-sensitive IAM roles.

Note: See the documentation for information about AWS global condition context keys.
Choose Next.
On the Specify permission set details page, enter a name to identify this permission set in IAM Identity Center. The name that you specify for this permission set appears in the AWS access portal as an available role. Users sign in to the AWS access portal, choose an AWS account, and then choose the role.
Choose Next.
On the Review and create page, review the selections that you made, and then choose Create.

Step 3: Assign permission sets to AWS accounts

You can add and remove permissions sets for an IAM user or group by attaching and detaching permission sets. Permission sets define what actions an identity can perform on which AWS resources.

To assign permission sets to AWS accounts:

In the IAM Identity Center administrator account, under Multi-account permissions, choose AWS accounts.
On the AWS accounts page, select one or more AWS accounts that you want to assign single sign-on access to.
Choose Assign users or groups.

Figure 4: Selecting users and groups
On the Assign users and groups to “<AWS account name>”, for Selected users and groups, choose the users that you want to create the permission set for. Choose Next.
Select permission sets: On the Assign permission sets to “AWS-account-name” page, select one or more permission sets.
On the Review and submit assignments to AWS-account-name page, for Review and submit, choose Submit.

Step 4. Grant permissions to IAM (single sign-on) roles

A data lake administrator has the broad ability to grant a principal (including themselves) permissions on Data Catalog resources. This includes the ability to manage access controls and permissions for the data lake. When you grant Lake Formation permissions on a specific Data Catalog table, you can also include data filtering specifications. This allows you to further restrict access to certain data within the table, limiting what users can see in their query results based on those filtering rules.

To grant permissions to IAM roles:

In the Lake Formation console, under Permissions in the navigation pane, select Data Lake permissions, and then choose Grant.

To grant Database permissions to IAM roles:

Under Principals, select the IAM role name (for example, Sensitive-IAM-Role).
Under Named Data Catalog resources, go to Databases and select a database (for example, demo).

Figure 5: Select an IAM role and database
Under Database permissions, select Describe and then choose Grant.

Figure 6: Grant database permissions to an IAM role

To grant tables permissions to IAM roles:

Repeat steps 1 and 2 of the preceding procedure.
Under Tables – optional, select a table name (for example, demo2).

Figure 7: Select tables within a database to grant access
Select the desired Table Permissions (for example, select and describe), and then choose Grant.

Figure 8: Grant access to tables within the database
Repeat steps 1 through 4 to grant access for the respective database and tables for the non-sensitive IAM role.

Step 5: Client-side setup using JDBC

You can use a JDBC connection to connect Athena and SQL client applications (for example, PyCharm or SQL Workbench) to enable analytics and reporting on the data that Athena returns from Amazon S3 databases. To use the Athena JDBC driver, you must specify the driver class from the JAR file. Additionally, you must pass in some parameters to change the authentication mechanism so the athena-sts-auth libraries are used:

S3 output location – Where in S3 the Athena service can write its output. For example, s3://path/to/query/bucket/.
The IAM Identity Center administrator can configure the session duration for the AWS access portal. The session duration can be set from a minimum of 15 minutes to a maximum of 90 days.

To set up PyCharm

Install Athena JDBC 3.x driver from Athena JDBC 3.x driver.
1. In the left navigation pane, select JDBC 3.x and then Getting started. Select Uber jar to download a .jar file, which contains the driver and its dependencies.
  
  Figure 9: Download Athena JDBC jar
Open PyCharm and create a new project.
1. Enter a Name for your project
2. Select the desired project Location
3. Choose Create
Figure 10: Create a new project in PyCharm
Configure Data Source and drivers. Select Data Source, and then choose the plus sign or New to configure new data sources and drivers.

Figure 11: Add database source properties
Configure the Athena driver by selecting the Drivers tab, and then choose the plus sign to add a new driver.

Figure 12: Add database drivers
Under Driver Files, upload the custom JAR file that you downloaded in the Step 1. Select the Athena class dropdown. Enter the driver’s name (for example Athena JDBC Driver). Then choose Apply.

Figure 13: Add database driver files
Configure a new data source. Choose the plus sign and select your driver’s name from the driver dropdown.
Enter the data source name (for example, Athena Demo). For the authentication method, select User & Password. Then choose Apply.

Figure 14: Create a project data source profile
Select the SSH/SSL tab and select Use SSL. Verify that the Use truststore options for IDE, JAVA, and system are all selected. Then choose Apply.

Figure 15: Enable data source profile SSL
Select the Options tab and then select Single Session Mode. Then choose Apply.

Figure 16: Configure single session mode in PyCharm
Select the General tab and enter the JDBC and single sign-on URL. The following is a sample JDBC URL based on the SAML application:
```
jdbc:athena://;CredentialsProvider= ProfileCredentials; ProfileName=<name-of-the-profile>;WorkGroup=<name-of-the-WorkGroup>; 
```
1. Choose Apply.
2. Choose Test Connection. If the profile has expired, refresh the single sign-on session by running aws sso login --profile <profile-name> with the corresponding profile.
Figure 17: Test the data source connection
After the connection is successful, select the Schemas tab and select All databases and All schemas.

Figure 18: Select data source databases and schemas
Run a sample test query: SELECT <table-names> FROM <database-name> limit 10;
Verify that the credentials and permissions are working as expected.

To set up SQL Workbench

Open SQL Workbench.
Configure an Athena driver by selecting File and then Manage Drivers.
Enter the Athena JDBC Driver as the name and set the library to browse the path for the location where you downloaded the driver. Enter amazonaws.athena.jdbc.AthenaDriver as the Classname.

Enter the following URL, replacing <name-of-the-WorkGroup> with your workgroup name.

jdbc:athena://;CredentialsProvider=ProfileCredentials;ProfileName=<name-of-the-profile>;WorkGroup=<name-of-the-WorkGroup>;

Choose OK.
Run a test query, replacing <table-names> and <database-name> with your table and database names:
```
SELECT <table-names> FROM <database-name> limit 10;
```
Verify that the credentials and permissions are working as expected.

Conclusion

In this post, we covered how to use JDBC drivers to connect to Athena from third-party SQL client tools. You were able to set this up without creating IAM users or any type of long-lived credentials that would need to be stored on your developers’ workstations. You learned how to configure IAM Identity Center users and groups, create permission sets, and assign permission sets to AWS Accounts. You also learned how to grant permissions to single sign-on roles using Lake Formation to create distinct access to different classes of data sets and connect to Athena through an SQL client tool (such as PyCharm). This setup can also work with other supported identity sources such as IAM Identity Center, self-managed or on-premises Active Directory, or an external IdP.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Federating access to Amazon DataZone with AWS IAM Identity Center and Okta

2024-07-30 Carlos Gallegos

Post Syndicated from Carlos Gallegos original https://aws.amazon.com/blogs/big-data/federating-access-to-amazon-datazone-with-aws-iam-identity-center-and-okta/

Many customers rely today on Okta or other identity providers (IdPs) to federate access to their technology stack and tools. With federation, security teams can centralize user management in a single place, which helps simplify and brings agility to their day-to-day operations while keeping highest security standards.

To help develop a data-driven culture, everyone inside an organization can use Amazon DataZone. To realize the benefits of using Amazon DataZone for governing data and making it discoverable and available across different teams for collaboration, customers integrate it with their current technology stack. Handling access through their identity provider and preserving a familiar single sign-on (SSO) experience enables customers to extend the use of Amazon DataZone to users across teams in the organization without any friction while keeping centralized control.

Amazon DataZone is a fully managed data management service that makes it faster and simpler for customers to catalog, discover, share, and govern data stored across Amazon Web Services (AWS), on premises, and third-party sources. It also makes it simpler for data producers, analysts, and business users to access data throughout an organization so that they can discover, use, and collaborate to derive data-driven insights.

You can use AWS IAM Identity Center to securely create and manage identities for your organization’s workforce, or sync and use identities that are already set up and available in Okta or other identity provider, to keep centralized control of them. With IAM Identity Center you can also manage the SSO experience of your organization centrally, across your AWS accounts and applications.

This post guides you through the process of setting up Okta as an identity provider for signing in users to Amazon DataZone. The process uses IAM Identity Center and its native integration with Amazon DataZone to integrate with external identity providers. Note that, even though this post focuses on Okta, the presented pattern relies on the SAML 2.0 standard and so can be replicated with other identity providers.

Prerequisites

To build the solution presented in this post, you must have:

A developer or licensed Okta account along with administrative access to manage users and permissions.
IAM Identity Center administrator access for your single or multi-account environment. For more information, see Enabling AWS IAM Identity Center.
A provisioned Amazon DataZone domain. For more information, see Create domains.
Have IAM Identity Center for Amazon DataZone enabled. For more information, see Setting up AWS IAM Identity Center for Amazon DataZone.

Process overview

Throughout this post you’ll follow these high-level steps:

Establish a SAML connection between Okta and IAM Identity Center
Set up automatic provisioning of users and groups in IAM Identity Center so that users and groups in the Okta domain are created in Identity Center.
Assign users and groups to your AWS accounts in IAM Identity Center by assuming an AWS Identity and Access Management (IAM) role.
Access the AWS Management Console and Amazon DataZone portal through Okta SSO.
Manage Amazon DataZone specific permissions in the Amazon DataZone portal.

Setting up user federation with Okta and IAM Identity Center

This guide follows the steps in Configure SAML and SCIM with Okta and IAM Identity Center.

Before you get started, review the following items in your Okta setup:

Every Okta user must have a First name, Last name, Username and Display name value specified.
Each Okta user has only a single value per data attribute, such as email address or phone number. Users that have multiple values will fail to synchronize. If there are users that have multiple values in their attributes, remove the duplicate attributes before attempting to provision the user in IAM Identity Center. For example, only one phone number attribute can be synchronized. Because the default phone number attribute is work phone, use the work phone attribute to store the user’s phone number, even if the phone number for the user is a home phone or a mobile phone.
If you update a user’s address you must have streetAddress, city, state, zipCode and the countryCode value specified. If any of these values aren’t specified for the Okta user at the time of synchronization, the user (or changes to the user) won’t be provisioned.

Okta account

1) Establish a SAML connection between Okta and AWS IAM Identity Center

Now, let’s establish a SAML connection between Okta and AWS IAM Identity Center. First, you’ll create an application in Okta to establish the connection:

Sign in to the Okta admin dashboard, expand Applications, then select Applications.
On the Applications page, choose Browse App Catalog.
In the search box, enter AWS IAM Identity Center, then select the app to add the IAM Identity Center app.

IAM identity center app in Okta

Choose the Sign On tab.

IAM identity center app in Okta - sign on

Under SAML Signing Certificates, select Actions, and then select View IdP Metadata. A new browser tab opens showing the document tree of an XML file. Select all of the XML from <md:EntityDescriptor> to </md:EntityDescriptor> and copy it to a text file.
Save the text file as metadata.xml.

Identity provider metadata in Okta

Leave the Okta admin dashboard open, you will continue using it in the later steps.

Second, you’re going to set up Okta as an external identity provider in IAM Identity Center:

Open the IAM Identity Center console as a user with administrative privileges.
Choose Settings in the navigation pane.
On the Settings page, choose Actions, and then select Change identity source.

Identity provider source in IAM identity center

Under Choose identity source, select External identity provider, and then choose Next.

Identity provider source in IAM identity center

Under Configure external identity provider, do the following:
1. Under Service provider metadata, choose Download metadata file to download the IAM Identity Center metadata file and save it on your system. You will provide the Identity Center SAML metadata file to Okta later in this tutorial.
  1. Copy the following items to a text file for easy access (you’ll need these values later):
    - IAM Identity Center Assertion Consumer Service (ACS) URL
    - IAM Identity Center issuer URL
2. Under Identity provider metadata, under IdP SAML metadata, choose Choose file and then select the metadata.xml file you created in the previous step.
3. Choose Next.
After you read the disclaimer and are ready to proceed, enter accept.
Choose Change identity source.

Identity provider source in IAM identity center

Leave the AWS console open, because you will use it in the next procedure.

Return to the Okta admin dashboard and choose the Sign On tab of the IAM Identity Center app, then choose Edit.
Under Advanced Sign-on Settings enter the following:
1. For ACS URL, enter the value you copied for IAM Identity Center Assertion Consumer Service (ACS) URL.
2. For Issuer URL, enter the value you copied for IAM Identity Center issuer URL.
3. For Application username format, select one of the options from the drop-down menu.
  Make sure the value you select is unique for each user. For this tutorial, select Okta username.
Choose Save.

IAM identity center app in Okta - sign on

2) Set up automatic provisioning of users and groups in AWS IAM Identity Center

You are now able to set up automatic provisioning of users from Okta into IAM Identity Center. Leave the Okta admin dashboard open and return to the IAM Identity Center console for the next step.

In the IAM Identity Center console, on the Settings page, locate the Automatic provisioning information box, and then choose Enable. This enables automatic provisioning in IAM Identity Center and displays the necessary System for Cross-domain Identity Management (SCIM) endpoint and access token information.

Automatic provisioning in IAM identity center

In the Inbound automatic provisioning dialog box, copy each of the values for the following options:
- SCIM endpoint
- Access token

You will use these values to configure provisioning in Okta later.

Choose Close.

Automatic provisioning in IAM identity center

Return to the Okta admin dashboard and navigate to the IAM Identity Center app.
On the AWS IAM Identity Center app page, choose the Provisioning tab, and then in the navigation pane, under Settings, choose Integration.
Choose Edit, and then select the check box next to Enable API integration to enable provisioning.
Configure Okta with the SCIM provisioning values from IAM Identity Center that you copied earlier:
1. In the Base URL field, enter the SCIM endpoint Make sure that you remove the trailing forward slash at the end of the URL.
2. In the API Token field, enter the Access token value.
Choose Test API Credentials to verify the credentials entered are valid. The message AWS IAM Identity Center was verified successfully! displays.
Choose Save. You are taken to the Settings area, with Integration selected.

API Integration in Okta

Review the following setup before moving forward. In the Provisioning tab, in the navigation pane under Settings, choose To App. Check that all options are enabled. They should be enabled by default, but if not, enable them.

Application provision in Okta

3) Assign users and groups to your AWS accounts in AWS IAM Identity Center by assuming an AWS IAM role

By default, no groups nor users are assigned to your Okta IAM Identity Center app. Complete the following steps to synchronize users with IAM Identity Center.

In the Okta IAM Identity Center app page, choose the Assignments tab. You can assign both people and groups to the IAM Identity Center app.
1. To assign people:
  1. In the Assignments page, choose Assign, and then choose Assign to people.
  2. Select the Okta users that you want to have access to the IAM Identity Center app. Choose Assign, choose Save and Go Back, and then choose Done.
    This starts the process of provisioning the individual users into IAM Identity Center.
1. To assign groups:
  1. Choose the Push Groups tab. You can create rules to automatically provision Okta groups into IAM Identity Center.
  1. Choose the Push Groups drop-down list and select Find groups by rule.
  2. In the By rule section, set a rule name and a condition. For this post we’re using AWS SSO Rule as rule name and starts with awssso as a group name condition. This condition can be different depending on the name of the group you want to sync.
  3. Choose Create Rule
  1. (Optional) To create a new group choose Directory in the navigation pane, and then choose Groups.
  1. Choose Add group and enter a name, and then choose Save.
  1. After you have created the group, you can assign people to it. Select the group name to manage the group’s users.
  1. Choose Assign people and select the users that you want to assign to the group.
  1. You will see the users that are assigned to the group.
  1. Going back to Applications in the navigation pane, select the AWS IAM Identity Center app and choose the Push Groups tab. You should have the groups that match the rule synchronized between Okta and IAM Identity Center. The group status should be set to Active after the group and its members are updated in Identity Center.

Return to the IAM Identity Center console. In the navigation pane, choose Users. You should see the user list that was updated by Okta.

Active users in IAM identity center

In the left navigation, select Groups, you should see the group list that was updated by Okta.

Active groups in IAM identity center

Congratulations! You have successfully set up a SAML connection between Okta and AWS and have verified that automatic provisioning is working.

OPTIONAL: If you need to provide Amazon DataZone console access to the Okta users and groups, you can manage these permissions through the IAM Identity Center console.

In the IAM Identity Center navigation pane, under Multi-account permissions, choose AWS accounts.
On the AWS accounts page, the Organizational structure displays your organizational root with your accounts underneath it in the hierarchy. Select the checkbox for your management account, then choose Assign users or groups.

IAM Roles in IAM identity center

The Assign users and groups workflow displays. It consists of three steps:
1. For Step 1: Select users and groups choose the user that will be performing the administrator job function. Then choose Next.
2. For Step 2: Select permission sets choose Create permission set to open a new tab that steps you through the three sub-steps involved in creating a permission set.
  1. For Step 1: Select permission set type complete the following:
    - In Permission set type, choose Predefined permission set.
    - In Policy for predefined permission set, choose AdministratorAccess.
  2. Choose Next.
  3. For Step 2: Specify permission set details, keep the default settings, and choose Next.
    The default settings create a permission set named AdministratorAccess with session duration set to one hour. You can also specify reduced permissions with a custom policy just to allow Amazon DataZone console access.
  4. For Step 3: Review and create, verify that the Permission set type uses the AWS managed policy AdministratorAccess or your custom policy. Choose Create. On the Permission sets page, a notification appears informing you that the permission set was created. You can close this tab in your web browser now.
On the Assign users and groups browser tab, you are still on Step 2: Select permission sets from which you started the create permission set workflow.
In the Permissions sets area, Refresh. The AdministratorAccess permission or your custom policy set you created appears in the list. Select the checkbox for that permission set, and then choose Next.

IAM Roles in IAM identity center

1. For Step 3: Review and submit review the selected user and permission set, then choose Submit.
  The page updates with a message that your AWS account is being configured. Wait until the process completes.
2. You are returned to the AWS accounts page. A notification message informs you that your AWS account has been re-provisioned, and the updated permission set is applied. When a user signs in, they will have the option of choosing the AdministratorAccess role or a custom policy role.

4) Access the AWS console and Amazon DataZone portal through Okta SSO

Now, you can test your user access into the console and Amazon DataZone portal using the Okta external identity application.

Sign in to the Okta dashboard using a test user account.
Under My Apps, select the AWS IAM Identity Center icon.

IAM identity center access in Okta

Complete the authentication process using your Okta credentials.

IAM identity center access in Okta

4.1) For administrative users

You’re signed in to the portal and can see the AWS account icon. Expand that icon to see the list of AWS accounts that the user can access. In this tutorial, you worked with a single account, so expanding the icon only shows one account.
Select the account to display the permission sets available to the user. In this tutorial you created the AdministratorAccess permission set.
Next to the permission set are links for the type of access available for that permission set. When you created the permission set, you specified both management console and programmatic access be enabled, so those two options are present. Select Management console to open the console.

AWS Management console

The user is signed in to the console. Using the search bar, look for Amazon DataZone service and open it.
Open the Amazon DataZone console and make sure you have enabled SSO users through IAM Identity Center. In case you haven’t, you can follow the steps in Enable IAM Identity Center for Amazon DataZone.

Note: In this post, we followed the default IAM Identity Center for Amazon DataZone configuration, which has implicit user assignment mode enabled. With this option, any user added to your Identity Center directory can access your Amazon DataZone domain automatically. If you opt for using explicit user assignment instead, remember that you need to manually add users to your Amazon DataZone domain in the Amazon DataZone console for them to have access.
To learn more about how to manage user access to an Amazon DataZone domain, see Manage users in the Amazon DataZone console.

Choose the Open data portal to access the Amazon DataZone Portal.

DataZone console

4.2) For all other users

Choose the Applications tab in the AWS access portal window and choose the Amazon DataZone data portal application link.

DataZone application

In the Amazon DataZone data portal, choose SIGN IN WITH SSO to continue

DataZone portal

Congratulations! Now you’re signed in to the Amazon DataZone data portal using your user that’s managed by Okta.

DataZone portal

5) Manage Amazon DataZone specific permissions in the Amazon DataZone portal

After you have access to the Amazon DataZone portal, you can work with projects, the data assets within, environments, and other constructs that are specific to Amazon DataZone. A project is the overarching construct that brings together people, data, and analytics tools. A project has two roles: owner and contributor. Next, you’ll learn how a user can be made an owner or contributor of existing projects.

These steps must be completed by the existing project owner in the Amazon DataZone portal:

Open the Amazon DataZone portal, select the project in the drop-down list on the left top of the portal and choose the project you own

DataZone project

In the project window, choose the Members tab to see the current users in the project and add a new one.

DataZone project members

Choose Add Members to add a new user. Make sure the User type is SSO User to add an Okta user. Look for the Okta user in the name drop-down list, select it, and select a project role for it. Finally, choose Add Members to add the user.

DataZone project members

The Okta user has been granted the selected project role and can interact with the project, assets, and tools.

DataZone project members

You can also grant permissions to SSO Groups. Choose Add members, then select SSO group in the drop-down list, next select the Group name, set the assigned project role, and choose Add Members.

DataZone project members

The Okta group has been granted the project role and can interact with the project, assets, and tools.

DataZone project members

You can also manage SSO user and group access to the Amazon DataZone data portal from the console. See Manage users in the Amazon DataZone console for additional details.

Clean up

To ensure a seamless experience and avoid any future charges, we kindly request that you follow these steps:

Delete the Amazon DataZone project if you created it as part of this blog post.
Delete the Amazon DataZone domain if you created it as part of this blog post.
If you changed your identity source from IAM Identity Center to an external identity provider (IdP), please revert it back to IAM Identity Center.
If you added an AWS IAM Identity Center app integration to your Okta account as part of this post, please delete the app integration.
Delete the permission set that you created specifically for this blog post.

By following these steps, you can effectively clean up the resources utilized in this blog post and prevent any unnecessary charges from accruing.

Summary

In this post, you followed a step-by-step guide to set up and use Okta to federate access to Amazon DataZone with AWS IAM Identity Center. You also learned how to group users and manage their permission in Amazon DataZone. As a final thought, now that you’re familiar with the elements involved in the integration of an external identity provider such as Okta to federate access to Amazon DataZone, you’re ready to try it with other identity providers.

To learn more about, see Managing Amazon DataZone domains and user access.

About the Authors

Carlos Gallegos is a Senior Analytics Specialist Solutions Architect at AWS. Based in Austin, TX, US. He’s an experienced and motivated professional with a proven track record of delivering results worldwide. He specializes in architecture, design, migrations, and modernization strategies for complex data and analytics solutions, both on-premises and on the AWS Cloud. Carlos helps customers accelerate their data journey by providing expertise in these areas. Connect with him on LinkedIn.

Jose Romero is a Senior Solutions Architect for Startups at AWS. Based in Austin, TX, US. He’s passionate about helping customers architect modern platforms at scale for data, AI, and ML. As a former senior architect in AWS Professional Services, he enjoys building and sharing solutions for common complex problems so that customers can accelerate their cloud journey and adopt best practices. Connect with him on LinkedIn.

Arun Pradeep Selvaraj is a Senior Solutions Architect at AWS. Arun is passionate about working with his customers and stakeholders on digital transformations and innovation in the cloud while continuing to learn, build, and reinvent. He is creative, fast-paced, deeply customer-obsessed and uses the working backwards process to build modern architectures to help customers solve their unique challenges. Connect with him on LinkedIn.

Access AWS services programmatically using trusted identity propagation

2024-06-27 Roberto Migli

Post Syndicated from Roberto Migli original https://aws.amazon.com/blogs/security/access-aws-services-programmatically-using-trusted-identity-propagation/

With the introduction of trusted identity propagation, applications can now propagate a user’s workforce identity from their identity provider (IdP) to applications running in Amazon Web Services (AWS) and to storage services backing those applications, such as Amazon Simple Storage Service (Amazon S3) or AWS Glue. Since access to applications and data can now be granted directly to a workforce identity, a seamless single sign-on experience can be achieved, eliminating the need for users to be aware of different AWS Identity and Access Management (IAM) roles to assume to access data or use of local database credentials.

While AWS managed applications such as Amazon QuickSight, AWS Lake Formation, or Amazon EMR Studio offer a native setup experience with trusted identity propagation, there are use cases where custom integrations must be built. You might want to integrate your workforce identities into custom applications storing data in Amazon S3 or build an interface on top of existing applications using Java Database Connectivity (JDBC) drivers, allowing these applications to propagate those identities into AWS to access resources on behalf of their users. AWS resource owners can manage authorization directly in their AWS applications such as Lake Formation or Amazon S3 Access Grants.

This blog post introduces a sample command-line interface (CLI) application that enables users to access AWS services using their workforce identity from IdPs such as Okta or Microsoft Entra ID.

The solution relies on users authenticating with their chosen IdP using standard OAuth 2.0 authentication flows to receive an identity token. This token can then be exchanged against AWS Security Token Service (AWS STS) and AWS IAM Identity Center to access data on behalf of the workforce identity that was used to sign in to the IdP.

Finally, an integration with the AWS Command Line Interface (AWS CLI) is provided, enabling native access to AWS services on behalf of the signed-in identity.

In this post, you will learn how to build and use the CLI application to access data on S3 Access Grants, query Amazon Athena tables, and programmatically interact with other AWS services that support trusted identity propagation.

To set up the solution in this blog post, you should already be familiar with trusted identity propagation and S3 Access Grants concepts and features. If you are not, see the two-part blog post How to develop a user-facing data application with IAM Identity Center and S3 Access Grants (Part 1) and Part 2. In this post we guide you through a more hands-on setup of your own sample CLI application.

Architecture and token exchange flow

Before moving into the actual deployment, let’s talk about the architecture of the CLI and how it facilitates token exchange between different parties.

Users, such as developers and data scientists, run the CLI on their machine without AWS security credentials pre-configured. To perform a token exchange of the OAuth 2.0 credentials vended by a source IdP towards IAM Identity Center by using the CreateTokenWithIAM API, AWS security credentials are required. To fulfill this requirement, the solution uses IAM OpenID Connect (OIDC) federation to first create an IAM role session using the AssumeRoleWithWebIdentity API with the initial IdP token—because this API doesn’t require credentials—and then uses the resulting IAM role to request the necessary single sign-on OIDC token.

The process flow is shown in Figure 1:

Figure 1: Architecture diagram of the application

At a high level, the interaction flow shown in Figure 1 is the following:

The user interacts with the AWS CLI, which uses the CLI as a source credential provider.
The user is asked to sign in through their browser with the source IdP they configured in the CLI, for example Okta.
If the authorization is successful, the CLI receives a JSON Web Token (JWT) and uses it to assume an IAM role through OIDC federation using AssumeRoleWithWebIdentity.
With this temporary IAM role session, the CLI exchanges the IdP token token with the IAM Identity Center customer managed application on behalf of the user for another token using the CreateTokenWithIAM API.
If successful, the CLI uses the token returned from Identity Center to create an identity-enhanced IAM role session and returns the corresponding IAM credentials to the AWS CLI.
The AWS CLI uses the credentials to call AWS services that support the trusted identity propagation that has been configured in the Identity Center customer managed application. For example, to query Athena. See Specify trusted applications in the Identity Center documentation to learn more.
If you need access to S3 Access Grants, the CLI also provides automatic credentials requests to S3 GetDataAccess with the identity-enhanced IAM role session previously created.

Because both the Okta OAuth token and the identity-enhanced IAM role session credentials are short lived, the CLI provides functionality to refresh authentication automatically.

Figure 2 is a swimlane diagram of the requests described in the preceding flow and depicted in Figure 1.

Figure 2: Swimlane diagram of the application token exchange. Dashed lines show SigV4 signed requests

Account prerequisites

In the following procedure, you will use two AWS accounts: one is the application account, where required IAM roles and OIDC federation will be deployed and where users will be granted access to Amazon S3 objects. This account already has S3 Access Grants and an Athena workgroup configured with IAM Identity Center. If you don’t have these already configured, see Getting started with S3 Access Grants and Using IAM Identity Center enabled Athena workgroups.

The other account is the Identity Center admin account which has IAM Identity Center set up with users and groups synchronized through SCIM with your IdP of choice, in this case Okta directory. The application also supports Entra ID and Amazon Cognito.

You don’t need to have permission sets configured with IAM Identity Center, because you will grant access through a customer managed application Identity Center. Users authenticate with the source IdP and don’t interact with an AWS account or IAM policy directly.

To configure this application, you’ll complete the following steps:

Create an OIDC application in Okta
Create a customer managed application in IAM Identity Center
Install and configure the application with the AWS CLI

Create an OIDC application in Okta

Start by creating a custom application in Okta, which will act as the source IdP, and configuring it as a trusted token issuer in IAM Identity Center.

To create an OIDC application

Sign in to your Okta administrator panel, go to the Applications section in the navigation pane, and choose Create App Integration. Because you’re using a CLI, select OIDC as the sign-in method and Native Application as the application type. Choose Next.

Figure 3: Okta screen to create a new application
In the Grant Type section, the authorization code is selected by default. Make sure to also select Refresh Token. The CLI application uses the refresh tokens if available to generate new access tokens without requiring the user to re-authenticate. Otherwise, the users will have to authenticate again when the token expires.
Change the sign-in redirect URIs to http://localhost:8090/callback. The application uses OAuth 2.0 Authorization Code with PKCE Flow and waits for confirmation of authentication by listening locally on port 8090. The IdP will redirect your browser to this URL to send authentication information after successful sign-in.
Select the directory groups that you want to have access to this application. If you don’t want to restrict access to the application, choose Allow Everyone. Leave the remaining settings as-is.

Note: You must also assign and authorize users in AWS to allow access to the downstream AWS services.

Figure 4: Configure the general settings of the Okta application
When done, choose Save and your Okta custom application will be created and you will see a detail page of your application. Note the Okta Client ID value to use in later steps.

Create a customer managed application in IAM Identity Center

After setting everything up in Okta, move on to creating the custom OAuth 2.0 application in IAM Identity Center. The CLI will use this application to exchange the tokens issued by Okta.

To create a customer managed application

Sign in to the AWS Management Console of the account where you already have configured IAM Identity Center and go to the Identity Center console.
In the navigation pane, select Applications under Application assignments.
Choose Add application in the upper
Select I have an application I want to set up and select the OAuth 2.0 type.
Enter a display name and description.
For User and group assignment method, select Do not require assignments (because you already configured your user and group assignments to this application in Okta). You can leave the Application URL field empty and select Not Visible under Application visibility in AWS access portal, because you won’t access the application from the Identity Center AWS access portal URL.
On the next screen, select the Trusted Token Issuer you set up in the prerequisites, and enter your Okta client ID from the Okta application you created in the previous section as the Aud claim. This will make sure your Identity Center custom application only accepts tokens issued by Okta for your Okta custom application.

Note: Automatically refresh user authentication for active application sessions isn’t needed for this application because the CLI application already refreshes tokens with Okta.

Figure 5: Trusted token issuer configuration of the custom application
Select Edit the application policy and edit the policy to specify the application account AWS Account ID as the allowed principal to perform the token exchange. You will update the application policy with the Amazon Resource Name (ARN) of the correct role after deploying the rest of the solution.
Continue to the next page to review the application configuration and finalize the creation process.

Figure 6: Configuration of the application policy
Grant the application permission to call other AWS services on the users’ behalf. This step is necessary because the application will use a custom application to exchange tokens that will then be used with specific AWS services. Select the Customer managed tab, browse to the application you just created, and choose Specify trusted applications at the bottom of the page.
For this example, select All applications for service with same access, and then select Athena, S3 Access Grants, and Lake Formation and AWS Glue data catalog as trusted services because they’re supported by the sample application. If you don’t plan to use your CLI with S3 or Athena and Lake Formation, you can skip this step.

Figure 7: Configure application user propagation

You have completed the setup within IAM Identity Center. Switch to your application account to complete the next steps, which will deploy the backend application.

Application installation and configuration with the AWS CLI

The sample code for the application can be found on GitHub. Follow the instructions in the README file to install the command-line application. You can use the CLI to generate an AWS CloudFormation template to create the required IAM roles for the token exchange process.

To install and configure the application

You will need your Okta OAuth issuer URI and your Okta application client ID. The issuer URI can be found by going to your Okta Admin page, and on the left navigation pane, go to the API page under the Security section. Here you will see your Authorization servers and their Issuer URI. The application client ID is the same as you used earlier for the Aud claim in IAM Identity Center.

In your CLI, use the following commands to generate and deploy the CloudFormation template:

tip-cli configure generate-cfn-template https://dev-12345678.okta.com/oauth2/default aBc12345dBe6789FgH0 > ~/tip-blog-iam-roles-cfn.yaml
aws cloudformation deploy --template-file ~/tip-blog-iam-roles-cfn.yaml --stack-name tip-blog-iam-stack --capabilities CAPABILITY_IAM

After the template is created, update the customer managed application policy you configured in step 2 to allow only the IAM role that CloudFormation just created to use the application. You can find this in your CloudFormation console, or by using aws cloudformation describe-stacks --stack-name tip-blog-iam-stack in your terminal.
Configure the CLI by running tip-cli configure idp and entering the IAM roles’ ARN and Okta information.
Test your configuration by running the tip-cli auth command to start the authorization with the configured IdP by opening your default browser and requesting to sign in. In the background, the application will wait for Okta to callback the browser on port 8090 to retrieve the authentication information, and if successful request an Okta OAuth 2.0 token.
You can then run tip-cli auth get-iam-credentials to exchange the token through your trusted IAM role and your Identity Center application for a set of identity-enhanced IAM role session credentials. These expire in 1 hour by default, but you can configure the expiration period by configuring the IAM role used to create the identity-enhanced IAM session accordingly. After you test the setup, you won’t need the CLI anymore because authentication, token refresh, and credentials refresh will be managed automatically through the AWS CLI.

Note: Follow the README instructions on configuring your AWS CLI to use the tip-cli application to source credentials. The advantage of the solution is that the AWS CLI will automatically handle the refresh of tokens and credentials using this application.

Now that the AWS CLI is configured, you can use it to call AWS services representing your Okta identity. In the following examples, we configured the AWS CLI to use the application to source credentials for the AWS CLI profile my-tti-profile.

For example, you can query Athena tables through workgroups configured with trusted identity propagation and authorized through Lake Formation:

QUERY_EXECUTION_ID=$(aws athena start-query-execution \
    --query-string "SELECT * FROM db_tip.table_example LIMIT 3" \
    --work-group "tip-workgroup" \
    --profile my-tti-profile \
    --output text \
    --query 'QueryExecutionId')

aws athena get-query-results \
    --query-execution-id $QUERY_EXECUTION_ID \
    --profile my-tti-profile \
    --output text

The IAM role used by the backend application to create the identity-enhanced IAM session is allowed by default to use only Athena and Amazon S3. You can extend it to allow access to other services, such as Amazon Q Business. After you create an Amazon Q Business application and assign users to it, you can update the custom application you created earlier in IAM Identity Center and enable trusted identity propagation to your Amazon Q Business application.

You can then, for example, retrieve conversations of the user for an Amazon Q business application with ID a1b2c3d4-5678-90ab-cdef-EXAMPLE11111 as follows:

aws qbusiness list-conversations --application-id a1b2c3d4-5678-90ab-cdef-EXAMPLE11111 --profile my-tti-profile

The application also provides simplified access to S3 Access Grants. After you have configured the AWS CLI as documented in the README file, you can access S3 objects as follows:

aws s3 ls s3://<my-test-bucket-with-path> --profile my-tti-profile-s3ag

In this example, the AWS CLI uses the custom credential source to request data access to a specific S3 URI to the account instance of S3 Access Grants you want to target. If there’s a matching grant for the requested URI and your user identity or group, S3 Access Grants will return a new set of short-lived IAM credentials. The tip-cli application will cache these credentials for you and prompt you to refresh them whenever needed. You can review the list of credentials generated by the application with the command tip-cli s3ag list-credentials, and clear them with the command tip-cli s3ag clear-credentials.

Conclusion

In this post we showed you how a sample CLI application using trusted identity propagation works, enabling business users to bring their workforce identities for delegated access into AWS without needing IAM credentials or cloud backends.

You set up custom applications within Okta (as the directory service) and IAM Identity Center and deployed the required application IAM roles and federation into your AWS account. You learned the architecture of this solution and how to use it in an application.

Through this setup, you can use the AWS CLI to interact with AWS services such as S3 Access Grants or Athena on behalf of the directory user without the having to configure an IAM user or credentials for them.

The code used in this post is also published as an AWS Sample, which can be found on GitHub and is meant to serve as inspiration for integrating custom or third-party applications with trusted identity propagation.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Integrate Tableau and Okta with Amazon Redshift using AWS IAM Identity Center

2024-06-03 Debu Panda

Post Syndicated from Debu Panda original https://aws.amazon.com/blogs/big-data/integrate-tableau-and-okta-with-amazon-redshift-using-aws-iam-identity-center/

This blog post is co-written with Sid Wray and Jake Koskela from Salesforce, and Adiascar Cisneros from Tableau.

Amazon Redshift is a fast, scalable cloud data warehouse built to serve workloads at any scale. With Amazon Redshift as your data warehouse, you can run complex queries using sophisticated query optimization to quickly deliver results to Tableau, which offers a comprehensive set of capabilities and connectivity options for analysts to efficiently prepare, discover, and share insights across the enterprise. For customers who want to integrate Amazon Redshift with Tableau using single sign-on capabilities, we introduced AWS IAM Identity Center integration to seamlessly implement authentication and authorization.

IAM Identity Center provides capabilities to manage single sign-on access to AWS accounts and applications from a single location. Redshift now integrates with IAM Identity Center, and supports trusted identity propagation, making it possible to integrate with third-party identity providers (IdP) such as Microsoft Entra ID (Azure AD), Okta, Ping, and OneLogin. This integration positions Amazon Redshift as an IAM Identity Center-managed application, enabling you to use database role-based access control on your data warehouse for enhanced security. Role-based access control allows you to apply fine grained access control using row level, column level, and dynamic data masking in your data warehouse.

AWS and Tableau have collaborated to enable single sign-on support for accessing Amazon Redshift from Tableau. Tableau now supports single sign-on capabilities with Amazon Redshift connector to simplify the authentication and authorization. The Tableau Desktop 2024.1 and Tableau Server 2023.3.4 releases support trusted identity propagation with IAM Identity Center. This allows users to seamlessly access Amazon Redshift data within Tableau using their external IdP credentials without needing to specify AWS Identity and Access Management (IAM) roles in Tableau. This single sign-on integration is available for Tableau Desktop, Tableau Server, and Tableau Prep.

In this post, we outline a comprehensive guide for setting up single sign-on to Amazon Redshift using integration with IAM Identity Center and Okta as the IdP. By following this guide, you’ll learn how to enable seamless single sign-on authentication to Amazon Redshift data sources directly from within Tableau Desktop, streamlining your analytics workflows and enhancing security.

Solution overview

The following diagram illustrates the architecture of the Tableau SSO integration with Amazon RedShift, IAM Identity Center, and Okta.

Figure 1: Solution overview for Tableau integration with Amazon Redshift using IAM Identity Center and Okta

The solution depicted in Figure 1 includes the following steps:

The user configures Tableau to access Redshift using IAM Identity Center authentication
On a user sign-in attempt, Tableau initiates a browser-based OAuth flow and redirects the user to the Okta login page to enter the login credentials.
On successful authentication, Okta issues an authentication token (id and access token) to Tableau
Redshift driver then makes a call to Redshift-enabled IAM Identity Center application and forwards the access token.
Redshift passes the token to Identity Center and requests an access token.
Identity Center verifies/validates the token using the OIDC discovery connection to the trusted token issuer and returns an Identity Center generated access token for the same user. In Figure 1, Trusted Token Issuer (TTI) is the Okta server that Identity Center trusts to provide tokens that third-party applications like Tableau uses to call AWS services.
Redshift then uses the token to obtain the user and group membership information from IAM Identity Center.
Tableau user will be able to connect with Amazon Redshift and access data based on the user and group membership returned from IAM Identity Center.

Prerequisites

Before you begin implementing the solution, make sure that you have the following in place:

Setup IAM Identity Center and Amazon Redshift integration by following the steps in Integrate Identity Provider (IdP) with Amazon Redshift Query Editor V2 using AWS IAM Identity Center for seamless Single Sign-On
Download and install the latest ODBC 2.X Driver.
Have installed Tableau Desktop 2024.1 or later.
Tableau Server 2023.3.4 and above version. For Tableau Server installation, please refer to Install and Configure Tableau Server.
An Okta account that has an active subscription. You need an admin role to set up the application on Okta. If you’re new to Okta, you can sign up for a free trial or for a developer account.

Walkthrough

In this walkthrough, you build the solution with following steps:

Set up the Okta OIDC application
Set up the Okta authorization server
Set up the Okta claims
Setup the Okta access policies and rules
Setup trusted token issuer in AWS IAM Identity Center
Setup client connections and trusted token issuers
Setup the Tableau OAuth config files for Okta
Install the Tableau OAuth config file for Tableau Desktop
Setup the Tableau OAuth config file for Tableau Server or Tableau Cloud
Federate to Amazon Redshift from Tableau Desktop
Federate to Amazon Redshift from Tableau Server

Set up the Okta OIDC application

To create an OIDC web app in Okta, you can follow the instructions in this video, or use the following steps to create the wep app in Okta admin console:

Note: The Tableau Desktop redirect URLs should always use localhost. The examples below also use localhost for the Tableau Server hostname for ease of testing in a test environment. For this setup, you should also access the server at localhost in the browser. If you decide to use localhost for early testing, you will also need to configure the gateway to accept localhost using this tsm command:

 tsm configuration set -k gateway.public.host -v localhost

In a production environment, or Tableau Cloud, you should use the full hostname that your users will access Tableau on the web, along with https. If you already have an environment with https configured, you may skip the localhost configuration and use the full hostname from the start.

Sign in to your Okta organization as a user with administrative privileges.
On the admin console, under Applications in the navigation pane, choose Applications.
Choose Create App Integration.
Select OIDC – OpenID Connect as the Sign-in method and Web Application as the Application type.
Choose Next.
In General Settings:
1. App integration name: Enter a name for your app integration. For example, Tableau_Redshift_App.
2. Grant type: Select Authorization Code and Refresh Token.
3. Sign-in redirect URIs: The sign-in redirect URI is where Okta sends the authentication response and ID token for the sign-in request. The URIs must be absolute URIs. Choose Add URl and along with the default URl, add the following URIs.
  - http://localhost:55556/Callback
  - http://localhost:55557/Callback
  - http://localhost:55558/Callback
  - http://localhost/auth/add_oauth_token
4. Sign-out redirect URIs: keep the default value as http://localhost:8080.
5. Skip the Trusted Origins section and for Assignments, select Skip group assignment for now.
6. Choose Save.

Figure 2: OIDC application

In the General Settings section, choose Edit and select Require PKCE as additional verification under Proof Key for Code Exchange (PKCE). This option indicates if a PKCE code challenge is required to verify client requests.
Choose Save.

Figure 3: OIDC App Overview

Select the Assignments tab and then choose Assign to Groups. In this example, we’re assigning awssso-finance and awssso-sales.
Choose Done.

Figure 4: OIDC application group assignments

For more information on creating an OIDC app, see Create OIDC app integrations.

Set up the Okta authorization server

Okta allows you to create multiple custom authorization servers that you can use to protect your own resource servers. Within each authorization server you can define your own OAuth 2.0 scopes, claims, and access policies. If you have an Okta Developer Edition account, you already have a custom authorization server created for you called default.

For this blog post, we use the default custom authorization server. If your application has requirements such as requiring more scopes, customizing rules for when to grant scopes, or you need more authorization servers with different scopes and claims, then you can follow this guide.

Figure 5: Authorization server

Set up the Okta claims

Tokens contain claims that are statements about the subject (for example: name, role, or email address). For this example, we use the default custom claim sub. Follow this guide to create claims.

Figure 6: Create claims

Setup the Okta access policies and rules

Access policies are containers for rules. Each access policy applies to a particular OpenID Connect application. The rules that the policy contains define different access and refresh token lifetimes depending on the nature of the token request. In this example, you create a simple policy for all clients as shown in Figure 7 that follows. Follow this guide to create access policies and rules.

Figure 7: Create access policies

Rules for access policies define token lifetimes for a given combination of grant type, user, and scope. They’re evaluated in priority order and after a matching rule is found, no other rules are evaluated. If no matching rule is found, then the authorization request fails. This example uses the role depicted in Figure 8 that follows. Follow this guide to create rules for your use case.

Figure 8: Access policy rules

Setup trusted token issuer in AWS IAM Identity Center

At this point, you switch to setting up the AWS configuration, starting by adding a trusted token issuer (TTI), which makes it possible to exchange tokens. This involves connecting IAM Identity Center to the Open ID Connect (OIDC) discovery URL of the external OAuth authorization server and defining an attribute-based mapping between the user from the external OAuth authorization server and a corresponding user in Identity Center. In this step, you create a TTI in the centralized management account. To create a TTI:

Open the AWS Management Console and navigate to IAM Identity Center, and then to the Settings page.
Select the Authentication tab and under Trusted token issuers, choose Create trusted token issuer.
On the Set up an external IdP to issue trusted tokens page, under Trusted token issuer details, do the following:
- For Issuer URL, enter the OIDC discovery URL of the external IdP that will issue tokens for trusted identity propagation. The administrator of the external IdP can provide this URL (for example, https://prod-1234567.okta.com/oauth2/default).

To get the issuer URL from Okta, sign in as an admin to Okta and navigate to Security and then to API and choose default under the Authorization Servers tab and copy the Issuer URL

Figure 9: Authorization server issuer

For Trusted token issuer name, enter a name to identify this trusted token issuer in IAM Identity Center and in the application console.
Under Map attributes, do the following:
- For Identity provider attribute, select an attribute from the list to map to an attribute in the IAM Identity Center identity store.
- For IAM Identity Center attribute, select the corresponding attribute for the attribute mapping.
Under Tags (optional), choose Add new tag, enter a value for Key and optionally for Value. Choose Create trusted token issuer. For information about tags, see Tagging AWS IAM Identity Center resources.

This example uses Subject (sub) as the Identity provider attribute to map with Email from the IAM identity Center attribute. Figure 10 that follows shows the set up for TTI.

Figure 10: Create Trusted Token Issuer

Setup client connections and trusted token issuers

In this step, the Amazon Redshift applications that exchange externally generated tokens must be configured to use the TTI you created in the previous step. Also, the audience claim (or aud claim) from Okta must be specified. In this example, you are configuring the Amazon Redshift application in the member account where the Amazon Redshift cluster or serverless instance exists.

Select IAM Identity Center connection from Amazon Redshift console menu.

Figure 11: Amazon Redshift IAM Identity Center connection

Select the Amazon Redshift application that you created as part of the prerequisites.
Select the Client connections tab and choose Edit.
Choose Yes under Configure client connections that use third-party IdPs.
Select the checkbox for Trusted token issuer which you have created in the previous section.
Enter the aud claim value under section Configure selected trusted token issuers. For example, okta_tableau_audience.

To get the audience value from Okta, sign in as an admin to Okta and navigate to Security and then to API and choose default under the Authorization Servers tab and copy the Audience value.

Figure 12: Authorization server audience

Note: The audience claim value must exactly match with IdP audience value otherwise your OIDC connection with third part application like Tableau will fail.

Choose Save.

Figure 13: Adding Audience Claim for Trusted Token Issuer

Setup the Tableau OAuth config files for Okta

At this point, your IAM Identity Center, Amazon Redshift, and Okta configuration are complete. Next, you need to configure Tableau.

To integrate Tableau with Amazon Redshift using IAM Identity Center, you need to use a custom XML. In this step, you use the following XML and replace the values starting with the $ sign and highlighted in bold. The rest of the values can be kept as they are, or you can modify them based on your use case. For detailed information on each of the elements in the XML file, see the Tableau documentation on GitHub.

Note: The XML file will be used for all the Tableau products including Tableau Desktop, Server, and Cloud.

<?xml version="1.0" encoding="utf-8"?>
<pluginOAuthConfig>
<dbclass>redshift</dbclass>
<oauthConfigId>custom_redshift_okta</oauthConfigId>
<clientIdDesktop>$copy_client_id_from_okta_oidc_app</clientIdDesktop>
<clientSecretDesktop>$copy_client_secret_from_okta_oidc_app</clientSecretDesktop>
<redirectUrisDesktop>http://localhost:55556/Callback</redirectUrisDesktop>
<redirectUrisDesktop>http://localhost:55557/Callback</redirectUrisDesktop>
<redirectUrisDesktop>http://localhost:55558/Callback</redirectUrisDesktop>
<authUri>https://$copy_okta_host_value.okta.com/oauth2/default/v1/authorize</authUri>
<tokenUri>https://$copy_okta_host_value.okta.com/oauth2/default/v1/token</tokenUri>
<scopes>openid</scopes>
<scopes>email</scopes>
<scopes>profile</scopes>
<scopes>offline_access</scopes>
<capabilities>
<entry>
<key>OAUTH_CAP_FIXED_PORT_IN_CALLBACK_URL</key>
<value>true</value>
</entry>
<entry>
<key>OAUTH_CAP_PKCE_REQUIRES_CODE_CHALLENGE_METHOD</key>
<value>true</value>
</entry>
<entry>
<key>OAUTH_CAP_REQUIRE_PKCE</key>
<value>true</value>
</entry>
<entry>
<key>OAUTH_CAP_SUPPORTS_STATE</key>
<value>true</value>
</entry>
<entry>
<key>OAUTH_CAP_CLIENT_SECRET_IN_URL_QUERY_PARAM</key>
<value>true</value>
</entry>
<entry>
<key>OAUTH_CAP_SUPPORTS_GET_USERINFO_FROM_ID_TOKEN</key>
<value>true</value>
</entry>
</capabilities>
<accessTokenResponseMaps>
<entry>
<key>ACCESSTOKEN</key>
<value>access_token</value>
</entry>
<entry>
<key>REFRESHTOKEN</key>
<value>refresh_token</value>
</entry>
<entry>
<key>id-token</key>
<value>id_token</value>
</entry>
<entry>
<key>access-token-issue-time</key>
<value>issued_at</value>
</entry>
<entry>
<key>access-token-expires-in</key>
<value>expires_in</value>
</entry>
<entry>
<key>username</key>
<value>preferred_username</value>
</entry>
</accessTokenResponseMaps>
</pluginOAuthConfig>

The following is an example XML file:

<?xml version="1.0" encoding="utf-8"?>
<pluginOAuthConfig>
<dbclass>redshift</dbclass>
<oauthConfigId>custom_redshift_okta</oauthConfigId>
<clientIdDesktop>ab12345z-a5nvb-123b-123b-1c434ghi1234</clientIdDesktop>
<clientSecretDesktop>3243jkbkjb~~ewf.112121.3432423432.asd834k</clientSecretDesktop>
<redirectUrisDesktop>http://localhost:55556/Callback</redirectUrisDesktop>
<redirectUrisDesktop>http://localhost:55557/Callback</redirectUrisDesktop>
<redirectUrisDesktop>http://localhost:55558/Callback</redirectUrisDesktop>
<authUri>https://prod-1234567.okta.com/oauth2/default/v1/authorize</authUri>
<tokenUri>https://prod-1234567.okta.com/oauth2/default/v1/token</tokenUri>
<scopes>openid</scopes>
<scopes>email</scopes>
<scopes>profile</scopes>
<scopes>offline_access</scopes>
<capabilities>
<entry>
<key>OAUTH_CAP_FIXED_PORT_IN_CALLBACK_URL</key>
<value>true</value>
</entry>
<entry>
<key>OAUTH_CAP_PKCE_REQUIRES_CODE_CHALLENGE_METHOD</key>
<value>true</value>
</entry>
<entry>
<key>OAUTH_CAP_REQUIRE_PKCE</key>
<value>true</value>
</entry>
<entry>
<key>OAUTH_CAP_SUPPORTS_STATE</key>
<value>true</value>
</entry>
<entry>
<key>OAUTH_CAP_CLIENT_SECRET_IN_URL_QUERY_PARAM</key>
<value>true</value>
</entry>
<entry>
<key>OAUTH_CAP_SUPPORTS_GET_USERINFO_FROM_ID_TOKEN</key>
<value>true</value>
</entry>
</capabilities>
<accessTokenResponseMaps>
<entry>
<key>ACCESSTOKEN</key>
<value>access_token</value>
</entry>
<entry>
<key>REFRESHTOKEN</key>
<value>refresh_token</value>
</entry>
<entry>
<key>id-token</key>
<value>id_token</value>
</entry>
<entry>
<key>access-token-issue-time</key>
<value>issued_at</value>
</entry>
<entry>
<key>access-token-expires-in</key>
<value>expires_in</value>
</entry>
<entry>
<key>username</key>
<value>preferred_username</value>
</entry>
</accessTokenResponseMaps>
</pluginOAuthConfig>

Install the Tableau OAuth config file for Tableau Desktop

After the configuration XML file is created, it must be copied to a location to be used by Amazon Redshift Connector from Tableau Desktop. Save the file from the previous step as .xml and save it under Documents\My Tableau Repository\OAuthConfigs.

Note: Currently this integration isn’t supported in macOS because the Redshift ODBC 2.X driver isn’t supported yet for MAC. It will be supported soon.

Setup the Tableau OAuth config file for Tableau Server or Tableau Cloud

To integrate with Amazon Redshift using IAM Identity Center authentication, you must install the Tableau OAuth config file in Tableau Server or Tableau Cloud

Sign in to the Tableau Server or Tableau Cloud using admin credentials.
Navigate to Settings.
Go to OAuth Clients Registry and select Add OAuth Client
Choose following settings:
- Connection Type: Amazon Redshift
- OAuth Provider: Custom_IdP
- Client ID: Enter your IdP client ID value
- Client Secret: Enter your client secret value
- Redirect URL: Enter http://localhost/auth/add_oauth_token. This example uses localhost for testing in a local environment. You should use the full hostname with https.
- Choose OAuth Config File. Select the XML file that you configured in the previous section.
- Select Add OAuth Client and choose Save.

Figure 14: Create an OAuth connection in Tableau Server or Tableau Cloud

Federate to Amazon Redshift from Tableau Desktop

Now you’re ready to connect to Amazon Redshift from Tableau through federated sign-in using IAM Identity Center authentication. In this step, you create a Tableau Desktop report and publish it to Tableau Server.

Open Tableau Desktop.
Select Amazon Redshift Connector and enter the following values:
1. Server: Enter the name of the server that hosts the database and the name of the database you want to connect to.
2. Port: Enter 5439.
3. Database: Enter your database name. This example uses dev.
4. Authentication: Select OAuth.
5. Federation Type: Select Identity Center.
6. Identity Center Namespace: You can leave this value blank.
7. OAuth Provider: This value should automatically be pulled from your configured XML. It will be the value from the element oauthConfigId.
8. Select Require SSL.
9. Choose Sign in.

Figure 15: Tableau Desktop OAuth connection

Enter your IdP credentials in the browser pop-up window.

Figure 16: Okta Login Page

When authentication is successful, you will see the message shown in Figure 17 that follows.

Figure 17: Successful authentication using Tableau

Congratulations! You’re signed in using IAM Identity Center integration with Amazon Redshift and are ready to explore and analyze your data using Tableau Desktop.

Figure 18: Successfully connected using Tableau Desktop

Figure 19 is a screenshot from the Amazon Redshift system table (sys_query_history) showing that user Ethan from Okta is accessing the sales report.

Figure 19: User audit in sys_query_history

After signing in, you can create your own Tableau Report on the desktop version and publish it to your Tableau Server. For this example, we created and published a report named SalesReport.

Federate to Amazon Redshift from Tableau Server

After you have published the report from Tableau Desktop to Tableau Server, sign in as a non-admin user and view the published report (SalesReport in this example) using IAM Identity Center authentication.

Sign in to the Tableau Server site as a non-admin user.
Navigate to Explore and go to the folder where your published report is stored.
Select the report and choose Sign In.

Figure 20: Tableau Server Sign In

To authenticate, enter your non-admin Okta credentials in the browser pop-up.

Figure 21: Okta Login Page

After your authentication is successful, you can access the report.

Figure 22: Tableau report

Clean up

Complete the following steps to clean up your resources:

Delete the IdP applications that you have created to integrate with IAM Identity Center.
Delete the IAM Identity Center configuration.
Delete the Amazon Redshift application and the Amazon Redshift provisioned cluster or serverless instance that you created for testing.
Delete the IAM role and IAM policy that you created for IAM Identity Center and Amazon Redshift integration.
Delete the permission set from IAM Identity Center that you created for Amazon Redshift Query Editor V2 in the management account.

Conclusion

This post covered streamlining access management for data analytics by using Tableau’s capability to support single sign-on based on the OAuth 2.0 OpenID Connect (OIDC) protocol. The solution enables federated user authentication, where user identities from an external IdP are trusted and propagated to Amazon Redshift. You walked through the steps to configure Tableau Desktop and Tableau Server to integrate seamlessly with Amazon Redshift using IAM Identity Center for single sign-on. By harnessing this integration of a third party IdP with IAM Identity Center, users can securely access Amazon Redshift data sources within Tableau without managing separate database credentials.

Listed below are key resources to learn more about Amazon Redshift integration with IAM Identity Center

About the Authors

Debu Panda is a Senior Manager, Product Management at AWS. He is an industry leader in analytics, application platform, and database technologies, and has more than 25 years of experience in the IT world.

Sid Wray is a Senior Product Manager at Salesforce based in the Pacific Northwest with nearly 20 years of experience in Digital Advertising, Data Analytics, Connectivity Integration and Identity and Access Management. He currently focuses on supporting ISV partners for Salesforce Data Cloud.

Adiascar Cisneros is a Tableau Senior Product Manager based in Atlanta, GA. He focuses on the integration of the Tableau Platform with AWS services to amplify the value users get from our products and accelerate their journey to valuable, actionable insights. His background includes analytics, infrastructure, network security, and migrations.

Jade Koskela is a Principal Software Engineer at Salesforce. He has over a decade of experience building Tableau with a focus on areas including data connectivity, authentication, and identity federation.

Harshida Patel is a Principal Solutions Architect, Analytics with AWS.

Ravi Bhattiprolu is a Senior Partner Solutions Architect at Amazon Web Services (AWS). He collaborates with strategic independent software vendor (ISV) partners like Salesforce and Tableau to design and deliver innovative, well-architected cloud products, integrations, and solutions to help joint AWS customers achieve their business goals.

AWS analytics services streamline user access to data, permissions setting, and auditing

2024-05-29 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-analytics-services-streamline-user-access-to-data-permissions-setting-and-auditing/

I am pleased to announce a new use case based on trusted identity propagation, a recently introduced capability of AWS IAM Identity Center.

Tableau, a commonly used business intelligence (BI) application, can now propagate end-user identity down to Amazon Redshift. This has a triple benefit. It simplifies the sign-in experience for end users. It allows data owners to define access based on real end-user identity. It allows auditors to verify data access by users.

Trusted identity propagation allows applications that consume data (such as Tableau, Amazon QuickSight, Amazon Redshift Query Editor, Amazon EMR Studio, and others) to propagate the user’s identity and group memberships to the services that store and manage access to the data, such as Amazon Redshift, Amazon Athena, Amazon Simple Storage Service (Amazon S3), Amazon EMR, and others. Trusted identity propagation is a capability of IAM Identity Center that improves the sign-in experience across multiple analytics applications, simplifies data access management, and simplifies audit. End users benefit from single sign-on and do not have to specify the IAM roles they want to assume to connect to the system.

Before diving into more details, let’s agree on terminology.

I use the term “identity providers” to refer to the systems that hold user identities and group memberships. These are the systems that prompt the user for credentials and perform the authentication. For example, Azure Directory, Okta, Ping Identity, and more. Check the full list of identity providers we support.

I use the term “user-facing applications” to designate the applications that consume data, such as Tableau, Microsoft PowerBI, QuickSight, Amazon Redshift Query Editor, and others.

And finally, when I write “downstream services”, I refer to the analytics engines and storage services that process, store, or manage access to your data: Amazon Redshift, Athena, S3, EMR, and others.

To understand the benefit of trusted identity propagation, let’s briefly talk about how data access was granted until today. When a user-facing application accesses data from a downstream service, either the upstream service uses generic credentials (such as “tableau_user“) or assumes an IAM role to authenticate against the downstream service. This is the source of two challenges.

First, it makes it difficult for the downstream service administrator to define access policies that are fine-tuned for the actual user making the request. As seen from the downstream service, all requests originate from that common user or IAM role. If Jeff and Jane are both mapped to the BusinessAnalytics IAM role, then it is not possible to give them different levels of access, for example, readonly and read-write. Furthermore, if Jeff is also in the Finance group, he needs to choose a role in which to operate; he cannot access data from both groups in the same session.

Secondly, the task of associating a data-access event to an end user involves some undifferentiated heavy lifting. If the request originates from an IAM role called BusinessAnalytics, then additional work is required to figure out which user was behind that action.

Well, this particular example might look very simple, but in real life, organizations have hundreds of users and thousands of groups to match to hundreds of datasets. There was an opportunity for us to Invent and Simplify.

Once configured, the new trusted identity propagation provides a technical mechanism for user-facing applications to access data on behalf of the actual user behind the keyboard. Knowing the actual user identity offers three main advantages.

First, it allows downstream service administrators to create and manage access policies based on actual user identities, the groups they belong to, or a combination of the two. Downstream service administrators can now assign access in terms of users, groups, and datasets. This is the way most of our customers naturally think about access to data—intermediate mappings to IAM roles are no longer necessary to achieve these patterns.

Second, auditors now have access to the original user identity in system logs and can verify that policies are implemented correctly and follow all requirements of the company or industry-level policies.

Third, users of BI applications can benefit from single sign-on between applications. Your end-users no longer need to understand your company’s AWS accounts and IAM roles. Instead, they can sign in to EMR Studio (for example) using their corporate single sign-on that they’re used to for so many other things they do at work.

How does trusted identity propagation work?
Trusted identity propagation relies on standard mechanisms from our industry: OAuth2 and JWT. OAuth2 is an open standard for access delegation that allows users to grant third-party user-facing applications access to data on other services (downstream services) without exposing their credentials. JWT (JSON Web Token) is a compact, URL-safe means of representing identities and claims to be transferred between two parties. JWTs are signed, which means their integrity and authenticity can be verified.

How to configure trusted identity propagation
Configuring trusted identity propagation requires setup in IAM Identity Center, at the user-facing application, and at the downstream service because each of these needs to be told to work with end-user identities. Although the particulars will be different for each application, they will all follow this pattern:

Configure an identity source in AWS IAM Identity Center. AWS recommends enabling automated provisioning if your identity provider supports it, as most do. Automated provisioning works through the SCIM synchronization standard to synchronize your directory users and groups into IAM Identity Center. You probably have configured this already if you currently use IAM Identity Center to federate your workforce into the AWS Management Console. This is a one-time configuration, and you don’t have to repeat this step for each user-facing application.
Configure your user-facing application to authenticate its users with your identity provider. For example, configure Tableau to use Okta.
Configure the connection between the user-facing application and the downstream service. For example, configure Tableau to access Amazon Redshift. In some cases, it requires using the ODBC or JDBC driver for Redshift.

Then comes the configuration specific to trusted identity propagation. For example, imagine your organization has developed a user-facing web application that authenticates the users with your identity provider, and that you want to access data in AWS on behalf of the current authenticated user. For this use case, you would create a trusted token issuer in IAM Identity Center. This powerful new construct gives you a way to map your application’s authenticated users to the users in your IAM Identity Center directory so that it can make use of trusted identity propagation. My colleague Becky wrote a blog post to show you how to develop such an application. This additional configuration is required only when using third-party applications, such as Tableau, or a customer-developed application, that authenticate outside of AWS. When using user-facing applications managed by AWS, such as Amazon QuickSight, no further setup is required.

Finally, downstream service administrators must configure the access policies based on the user identity and group memberships. The exact configuration varies from one downstream service to the other. If the application reads or writes data in Amazon S3, the data owner may use S3 Access Grants in the Amazon S3 console to grant access for users and groups to prefixes in Amazon S3. If the application makes queries to an Amazon Redshift data warehouse, the data owner must configure IAM Identity Center trusted connection in the Amazon Redshift console and match the audience claim (aud) from the identity provider.

Now that you have a high-level overview of the configuration, let’s dive into the most important part: the user experience.

The end-user experience
Although the precise experience of the end user will obviously be different for different applications, in all cases, it will be simpler and more familiar to workforce users than before. The user interaction will begin with a redirect-based authentication single sign-on flow that takes the user to their identity provider, where they can sign in with credentials, multi-factor authentication, and so on.

Let’s look at the details of how an end user might interact with Okta and Tableau when trusted identity propagation has been configured.

Here is an illustration of the flow and the main interactions between systems and services.

Here’s how it goes.

1. As a user, I attempt to sign in to Tableau.

2. Tableau initiates a browser-based flow and redirects to the Okta sign-in page where I can enter my sign-in credentials. On successful authentication, Okta issues an authentication token (ID and access token) to Tableau.

3. Tableau initiates a JDBC connection with Amazon Redshift and includes the access token in the connection request. The Amazon Redshift JDBC driver makes a call to Amazon Redshift. Because your Amazon Redshift administrator enabled IAM Identity Center, Amazon Redshift forwards the access token to IAM Identity Center.

4. IAM Identity Center verifies and validates the access token and exchange the access token for an Identity Center issued token.

5. Amazon Redshift will resolve the Identity Center token to determine the corresponding Identity Center user and authorize access to the resource. Upon successful authorization, I can connect from Tableau to Amazon Redshift.

Once authenticated, I can start to use Tableau as usual.

And when I connect to Amazon Redshift Query Editor, I can observe the sys_query_history table to check who was the user who made the query. It correctly reports awsidc:<email address>, the Okta email address I used when I connected from Tableau.

You can read Tableau’s documentation for more details about this configuration.

Pricing and availability
Trusted identity propagation is provided at no additional cost in the 26 AWS Regions where AWS IAM Identity Center is available today.

Here are more details about trusted identity propagation and downstream service configurations.

Happy reading!

With trusted identity propagation, you can now configure analytics systems to propagate the actual user identity, group membership, and attributes to AWS services such as Amazon Redshift, Amazon Athena, or Amazon S3. It simplifies the management of access policies on these services. It also allows auditors to verify your organization’s compliance posture to know the real identity of users accessing data.

Get started now and configure your Tableau integration with Amazon Redshift.

— seb

PS: Writing a blog post at AWS is always a team effort, even when you see only one name under the post title. In this case, I want to thank Eva Mineva, Laura Reith, and Roberto Migli for their much-appreciated help in understanding the many subtleties and technical details of trusted identity propagation.

Simplify data lake access control for your enterprise users with trusted identity propagation in AWS IAM Identity Center, AWS Lake Formation, and Amazon S3 Access Grants

2024-05-29 Shoukat Ghouse

Post Syndicated from Shoukat Ghouse original https://aws.amazon.com/blogs/big-data/simplify-data-lake-access-control-for-your-enterprise-users-with-trusted-identity-propagation-in-aws-iam-identity-center-aws-lake-formation-and-amazon-s3-access-grants/

Many organizations use external identity providers (IdPs) such as Okta or Microsoft Azure Active Directory to manage their enterprise user identities. These users interact with and run analytical queries across AWS analytics services. To enable them to use the AWS services, their identities from the external IdP are mapped to AWS Identity and Access Management (IAM) roles within AWS, and access policies are applied to these IAM roles by data administrators.

Given the diverse range of services involved, different IAM roles may be required for accessing the data. Consequently, administrators need to manage permissions across multiple roles, a task that can become cumbersome at scale.

To address this challenge, you need a unified solution to simplify data access management using your corporate user identities instead of relying solely on IAM roles. AWS IAM Identity Center offers a solution through its trusted identity propagation feature, which is built upon the OAuth 2.0 authorization framework.

With trusted identity propagation, data access management is anchored to a user’s identity, which can be synchronized to IAM Identity Center from external IdPs using the System for Cross-domain Identity Management (SCIM) protocol. Integrated applications exchange OAuth tokens, and these tokens are propagated across services. This approach empowers administrators to grant access directly based on existing user and group memberships federated from external IdPs, rather than relying on IAM users or roles.

In this post, we showcase the seamless integration of AWS analytics services with trusted identity propagation by presenting an end-to-end architecture for data access flows.

Solution overview

Let’s consider a fictional company, OkTank. OkTank has multiple user personas that use a variety of AWS Analytics services. The user identities are managed externally in an external IdP: Okta. User1 is a Data Analyst and uses the Amazon Athena query editor to query AWS Glue Data Catalog tables with data stored in Amazon Simple Storage Service (Amazon S3). User2 is a Data Engineer and uses Amazon EMR Studio notebooks to query Data Catalog tables and also query raw data stored in Amazon S3 that is not yet cataloged to the Data Catalog. User3 is a Business Analyst who needs to query data stored in Amazon Redshift tables using the Amazon Redshift Query Editor v2. Additionally, this user builds Amazon QuickSight visualizations for the data in Redshift tables.

OkTank wants to simplify governance by centralizing data access control for their variety of data sources, user identities, and tools. They also want to define permissions directly on their corporate user or group identities from Okta instead of creating IAM roles for each user and group and managing access on the IAM role. In addition, for their audit requirements, they need the capability to map data access to the corporate identity of users within Okta for enhanced tracking and accountability.

To achieve these goals, we use trusted identity propagation with the aforementioned services and use AWS Lake Formation and Amazon S3 Access Grants for access controls. We use Lake Formation to centrally manage permissions to the Data Catalog tables and Redshift tables shared with Redshift datashares. In our scenario, we use S3 Access Grants for granting permission for the Athena query result location. Additionally, we show how to access a raw data bucket governed by S3 Access Grants with an EMR notebook.

Data access is audited with AWS CloudTrail and can be queried with AWS CloudTrail Lake. This architecture showcases the versatility and effectiveness of AWS analytics services in enabling efficient and secure data analysis workflows across different use cases and user personas.

We use Okta as the external IdP, but you can also use other IdPs like Microsoft Azure Active Directory. Users and groups from Okta are synced to IAM Identity Center. In this post, we have three groups, as shown in the following diagram.

User1 needs to query a Data Catalog table with data stored in Amazon S3. The S3 location is secured and managed by Lake Formation. The user connects to an IAM Identity Center enabled Athena workgroup using the Athena query editor with EMR Studio. The IAM Identity Center enabled Athena workgroups need to be secured with S3 Access Grants permissions for the Athena query results location. With this feature, you can also enable the creation of identity-based query result locations that are governed by S3 Access Grants. These user identity-based S3 prefixes let users in an Athena workgroup keep their query results isolated from other users in the same workgroup. The following diagram illustrates this architecture.

User2 needs to query the same Data Catalog table as User1. This table is governed using Lake Formation permissions. Additionally, the user needs to access raw data in another S3 bucket that isn’t cataloged to the Data Catalog and is controlled using S3 Access Grants; in the following diagram, this is shown as S3 Data Location-2.

The user uses an EMR Studio notebook to run Spark queries on an EMR cluster. The EMR cluster uses a security configuration that integrates with IAM Identity Center for authentication and uses Lake Formation for authorization. The EMR cluster is also enabled for S3 Access Grants. With this kind of hybrid access management, you can use Lake Formation to centrally manage permissions for your datasets cataloged to the Data Catalog and use S3 Access Grants to centrally manage access to your raw data that is not yet cataloged to the Data Catalog. This gives you flexibility to access data managed by either of the access control mechanisms from the same notebook.

User3 uses the Redshift Query Editor V2 to query a Redshift table. The user also accesses the same table with QuickSight. For our demo, we use a single user persona for simplicity, but in reality, these could be completely different user personas. To enable access control with Lake Formation for Redshift tables, we use data sharing in Lake Formation.

Data access requests by the specific users are logged to CloudTrail. Later in this post, we also briefly touch upon using CloudTrail Lake to query the data access events.

In the following sections, we demonstrate how to build this architecture. We use AWS CloudFormation to provision the resources. AWS CloudFormation lets you model, provision, and manage AWS and third-party resources by treating infrastructure as code. We also use the AWS Command Line Interface (AWS CLI) and AWS Management Console to complete some steps.

The following diagram shows the end-to-end architecture.

Prerequisites

Complete the following prerequisite steps:

Have an AWS account. If you don’t have an account, you can create one.
Have IAM Identity Center set up in a specific AWS Region.
Make sure you use the same Region where you have IAM Identity Center set up throughout the setup and verification steps. In this post, we use the us-east-1 Region.
Have Okta set up with three different groups and users, and enable sync to IAM Identity Center. Refer to Configure SAML and SCIM with Okta and IAM Identity Center for instructions.

After the Okta groups are pushed to IAM Identity Center, you can see the users and groups on the IAM Identity Center console, as shown in the following screenshot. You need the group IDs of the three groups to be passed in the CloudFormation template.

For enabling User2 access using the EMR cluster, you need have an SSL certificate .zip file available in your S3 bucket. You can download the following sample certificate to use in this post. In production use cases, you should create and use your own certificates. You need to reference the bucket name and the certificate bundle .zip file in AWS CloudFormation. The CloudFormation template lets you choose the components you want to provision. If you do not intend to deploy the EMR cluster, you can ignore this step.
Have an administrator user or role to run the CloudFormation stack. The user or role should also be a Lake Formation administrator to grant permissions.

Deploy the CloudFormation stack

The CloudFormation template provided in the post lets you choose the components you want to provision from the solution architecture. In this post, we enable all components, as shown in the following screenshot.

Run the provided CloudFormation stack to create the solution resources. Refer to the following table for a list of important parameters.

Parameter Group	Description	Parameter Name	Expected Value
Choose components to provision.	Choose the components you want to be provisioned.	`DeployAthenaFlow`	Yes/No. If you choose No, you can ignore the parameters in the “Athena Configuration” group.
		`DeployEMRFlow`	Yes/No. If you choose No, you can ignore the parameters in the “EMR Configuration” group.
		`DeployRedshiftQEV2Flow`	Yes/No. If you choose No, you can ignore the parameters in the “Redshift Configuration” group.
		`CreateS3AGInstance`	Yes/No. If you already have an S3 Access Grants instance, choose No. Otherwise, choose Yes to allow the stack create a new S3 Access Grants instance. The S3 Access Grants instance is needed for User1 and User2.
Identity Center Configuration	IAM Identity Center parameters.	`IDCGroup1Id`	Group ID corresponding to Group1 from IAM Identity Center.
		`IDCGroup2Id`	Group ID corresponding to Group2 from IAM Identity Center.
		`IDCGroup3Id`	Group ID corresponding to Group3 from IAM Identity Center.
		`IAMIDCInstanceArn`	IAM Identity Center instance ARN. You can get this from the Settings section of IAM Identity Center.
Redshift Configuration	Redshift parameters. Ignore if you chose `DeployRedshiftQEV2Flow` as No.	`RedshiftServerlessAdminUserName`	Redshift admin user name.
		`RedshiftServerlessAdminPassword`	Redshift admin password.
		`RedshiftServerlessDatabase`	Redshift database to create the tables.
EMR Configuration	EMR parameters. Ignore if you chose parameter `DeployEMRFlow` as No.	`SSlCertsS3BucketName`	Bucket name where you copied the SSL certificates.
EMR Configuration		`SSlCertsZip`	Name of SSL certificates file (my-certs.zip) to use the sample certificate provided in the post.
Athena Configuration	Athena parameters. Ignore if you chose parameter `DeployAthenaFlow` as No.	`IDCUser1Id`	User ID corresponding to User1 from IAM Identity Center.

The CloudFormation stack provisions the following resources:

A VPC with a public and private subnet.
If you chose the Redshift components, it also creates three additional subnets.
S3 buckets for data and Athena query results location storage. It also copies some sample data to the buckets.
EMR Studio with IAM Identity Center integration.
Amazon EMR security configuration with IAM Identity Center integration.
An EMR cluster that uses the EMR security group.
Registers the source S3 bucket with Lake Formation.
An AWS Glue database named oktank_tipblog_temp and a table named customer under the database. The table points to the Amazon S3 location governed by Lake Formation.
Allows external engines to access data in Amazon S3 locations with full table access. This is required for Amazon EMR integration with Lake Formation for trusted identity propagation. As of this writing, Amazon EMR supports table-level access with IAM Identity Center enabled clusters.
An S3 Access Grants instance.
S3 Access Grants for Group1 to the User1 prefix under the Athena query results location bucket.
S3 Access Grants for Group2 to the S3 bucket input and output prefixes. The user has read access to the input prefix and write access to the output prefix under the bucket.
An Amazon Redshift Serverless namespace and workgroup. This workgroup is not integrated with IAM Identity Center; we complete subsequent steps to enable IAM Identity Center for the workgroup.
An AWS Cloud9 integrated development environment (IDE), which we use to run AWS CLI commands during the setup.

Note the stack outputs on the AWS CloudFormation console. You use these values in later steps.

Choose the link for Cloud9URL in the stack output to open the AWS Cloud9 IDE. In AWS Cloud9, go to the Window tab and choose New Terminal to start a new bash terminal.

Set up Lake Formation

You need to enable Lake Formation with IAM Identity Center and enable an EMR application with Lake Formation integration. Complete the following steps:

In the AWS Cloud9 bash terminal, enter the following command to get the Amazon EMR security configuration created by the stack:

aws emr describe-security-configuration --name TIP-EMRSecurityConfig | jq -r '.SecurityConfiguration | fromjson | .AuthenticationConfiguration.IdentityCenterConfiguration.IdCApplicationARN'

Note the value for IdcApplicationARN from the output.
Enter the following command in AWS Cloud9 to enable the Lake Formation integration with IAM Identity Center and add the Amazon EMR security configuration application as a trusted application in Lake Formation. If you already have the IAM Identity Center integration with Lake Formation, sign in to Lake Formation and add the preceding value to the list of applications instead of running the following command and proceed to next step.

aws lakeformation create-lake-formation-identity-center-configuration --catalog-id <Replace with CatalogId value from Cloudformation output> --instance-arn <Replace with IDCInstanceARN value from CloudFormation stack output> --external-filtering Status=ENABLED,AuthorizedTargets=<Replace with IdcApplicationARN value copied in previous step>

After this step, you should see the application on the Lake Formation console.

This completes the initial setup. In subsequent steps, we apply some additional configurations for specific user personas.

Validate user personas

To review the S3 Access Grants created by AWS CloudFormation, open the Amazon S3 console and Access Grants in the navigation pane. Choose the access grant you created to view its details.

The CloudFormation stack created the S3 Access Grants for Group1 for the User1 prefix under the Athena query results location bucket. This allows User1 to access the prefix under in the query results bucket. The stack also created the grants for Group2 for User2 to access the raw data bucket input and output prefixes.

Set up User1 access

Complete the steps in this section to set up User1 access.

Create an IAM Identity Center enabled Athena workgroup

Let’s create the Athena workgroup that will be used by User1.

Enter the following command in the AWS Cloud9 terminal. The command creates an IAM Identity Center integrated Athena workgroup and enables S3 Access Grants for the user-level prefix. These user identity-based S3 prefixes let users in an Athena workgroup keep their query results isolated from other users in the same workgroup. The prefix is automatically created by Athena when the CreateUserLevelPrefix option is enabled. Access to the prefix was granted by the CloudFormation stack.

aws athena create-work-group --cli-input-json '{
"Name": "AthenaIDCWG",
"Configuration": {
"ResultConfiguration": {
"OutputLocation": "<Replace with AthenaResultLocation from CloudFormation stack>"
},
"ExecutionRole": "<Replace with TIPStudioRoleArn from CloudFormation stack>",
"IdentityCenterConfiguration": {
"EnableIdentityCenter": true,
"IdentityCenterInstanceArn": "<Replace with IDCInstanceARN from CloudFormation stack>"
},
"QueryResultsS3AccessGrantsConfiguration": {
"EnableS3AccessGrants": true,
"CreateUserLevelPrefix": true,
"AuthenticationType": "DIRECTORY_IDENTITY"
},
"EnforceWorkGroupConfiguration":true
},
"Description": "Athena Workgroup with IDC integration"
}'

Grant access to User1 on the Athena workgroup

Sign in to the Athena console and grant access to Group1 to the workgroup as shown in the following screenshot. You can grant access to the user (User1) or to the group (Group1). In this post, we grant access to Group1.

Grant access to User1 in Lake Formation

Sign in to the Lake Formation console, choose Data lake permissions in the navigation pane, and grant access to the user group on the database oktank_tipblog_temp and table customer.

With Athena, you can grant access to specific columns and for specific rows with row-level filtering. For this post, we grant column-level access and restrict access to only selected columns for the table.

This completes the access permission setup for User1.

Verify access

Let’s see how User1 uses Athena to analyze the data.

Copy the URL for EMRStudioURL from the CloudFormation stack output.
Open a new browser window and connect to the URL.

You will be redirected to the Okta login page.

Log in with User1.
In the EMR Studio query editor, change the workgroup to AthenaIDCWG and choose Acknowledge.
Run the following query in the query editor:

SELECT * FROM "oktank_tipblog_temp"."customer" limit 10;

You can see that the user is only able to access the columns for which permissions were previously granted in Lake Formation. This completes the access flow verification for User1.

Set up User2 access

User2 accesses the table using an EMR Studio notebook. Note the current considerations for EMR with IAM Identity Center integrations.

Complete the steps in this section to set up User2 access.

Grant Lake Formation permissions to User2

Sign in to the Lake Formation console and grant access to Group2 on the table, similar to the steps you followed earlier for User1. Also grant Describe permission on the default database to Group2, as shown in the following screenshot.

Create an EMR Studio Workspace

Next, User2 creates an EMR Studio Workspace.

Copy the URL for EMR Studio from the EMRStudioURL value from the CloudFormation stack output.
Log in to EMR Studio as User2 on the Okta login page.
Create a Workspace, giving it a name and leaving all other options as default.

This will open a JupyterLab notebook in a new window.

Connect to the EMR Studio notebook

In the Compute pane of the notebook, select the EMR cluster (named EMRWithTIP) created by the CloudFormation stack to attach to it. After the notebook is attached to the cluster, choose the PySpark kernel to run Spark queries.

Verify access

Enter the following query in the notebook to read from the customer table:

spark.sql("select * from oktank_tipblog_temp.customer").show()

The user access works as expected based on the Lake Formation grants you provided earlier.

Run the following Spark query in the notebook to read data from the raw bucket. Access to this bucket is controlled by S3 Access Grants.

spark.read.option("header",True).csv("s3://tip-blog-s3-s3ag/input/*").show()

Let’s write this data to the same bucket and input prefix. This should fail because you only granted read access to the input prefix with S3 Access Grants.

spark.read.option("header",True).csv("s3://tip-blog-s3-s3ag/input/*").write.mode("overwrite").parquet("s3://tip-blog-s3-s3ag/input/")

The user has access to the output prefix under the bucket. Change the query to write to the output prefix:

spark.read.option("header",True).csv("s3://tip-blog-s3-s3ag/input/*").write.mode("overwrite").parquet("s3://tip-blog-s3-s3ag/output/test.part")

The write should now be successful.

We have now seen the data access controls and access flows for User1 and User2.

Set up User3 access

Following the target architecture in our post, Group3 users use the Redshift Query Editor v2 to query the Redshift tables.

Complete the steps in this section to set up access for User3.

Enable Redshift Query Editor v2 console access for User3

Complete the following steps:

On the IAM Identity Center console, create a custom permission set and attach the following policies:
1. AWS managed policy AmazonRedshiftQueryEditorV2ReadSharing.
2. Customer managed policy redshift-idc-policy-tip. This policy is already created by the CloudFormation stack, so you don’t have to create it.
Provide a name (tip-blog-qe-v2-permission-set) to the permission set.
Set the relay state as https://<region-id>.console.aws.amazon.com/sqlworkbench/home (for example, https://us-east-1.console.aws.amazon.com/sqlworkbench/home).
Choose Create.
Assign Group3 to the account in IAM Identity Center, select the permission set you created, and choose Submit.

Create the Redshift IAM Identity Center application

Enter the following in the AWS Cloud9 terminal:

aws redshift create-redshift-idc-application \
--idc-instance-arn '<Replace with IDCInstanceARN value from CloudFormation Output>' \
--redshift-idc-application-name 'redshift-iad-<Replace with CatalogId value from CloudFormation output>-tip-blog-1' \
--identity-namespace 'tipblogawsidc' \
--idc-display-name 'TIPBlog_AWSIDC' \
--iam-role-arn '<Replace with TIPRedshiftRoleArn value from CloudFormation output>' \
--service-integrations '[
  {
    "LakeFormation": [
    {
     "LakeFormationQuery": {
     "Authorization": "Enabled"
    }
   }
  ]
 }
]'

Enter the following command to get the application details:

aws redshift describe-redshift-idc-applications --output json

Keep a note of the IdcManagedApplicationArn, IdcDisplayName, and IdentityNamespace values in the output for the application with IdcDisplayName TIPBlog_AWSIDC. You need these values in the next step.

Enable the Redshift Query Editor v2 for the Redshift IAM Identity Center application

Complete the following steps:

On the Amazon Redshift console, choose IAM Identity Center connections in the navigation pane.
Choose the application you created.
Choose Edit.
Select Enable Query Editor v2 application and choose Save changes.
On the Groups tab, choose Add or assign groups.
Assign Group3 to the application.

The Redshift IAM Identity Center connection is now set up.

Enable the Redshift Serverless namespace and workgroup with IAM Identity Center

The CloudFormation stack you deployed created a serverless namespace and workgroup. However, they’re not enabled with IAM Identity Center. To enable with IAM Identity Center, complete the following steps. You can get the namespace name from the RedshiftNamespace value of the CloudFormation stack output.

On the Amazon Redshift Serverless dashboard console, navigate to the namespace you created.
Choose Query Data to open Query Editor v2.
Choose the options menu (three dots) and choose Create connections for the workgroup redshift-idc-wg-tipblog.
Choose Other ways to connect and then Database user name and password.
Use the credentials you provided for the Redshift admin user name and password parameters when deploying the CloudFormation stack and create the connection.

Create resources using the Redshift Query Editor v2

You now enter a series of commands in the query editor with the database admin user.

Create an IdP for the Redshift IAM Identity Center application:

CREATE IDENTITY PROVIDER "TIPBlog_AWSIDC" TYPE AWSIDC
NAMESPACE 'tipblogawsidc'
APPLICATION_ARN '<Replace with IdcManagedApplicationArn value you copied earlier in Cloud9>'
IAM_ROLE '<Replace with TIPRedshiftRoleArn value from CloudFormation output>';

Enter the following command to check the IdP you added previously:

SELECT * FROM svv_identity_providers;

Next, you grant permissions to the IAM Identity Center user.

Create a role in Redshift. This role should correspond to the group in IAM Identity Center to which you intend to provide the permissions (Group3 in this post). The role should follow the format <namespace>:<GroupNameinIDC>.

Create role "tipblogawsidc:Group3";

Run the following command to see role you created. The external_id corresponds to the group ID value for Group3 in IAM Identity Center.

Select * from svv_roles where role_name = 'tipblogawsidc:Group3';

Create a sample table to use to verify access for the Group3 user:

CREATE TABLE IF NOT EXISTS revenue
(
account INTEGER ENCODE az64
,customer VARCHAR(20) ENCODE lzo
,salesamt NUMERIC(18,0) ENCODE az64
)
DISTSTYLE AUTO
;

insert into revenue values (10001, 'ABC Company', 12000);
insert into revenue values (10002, 'Tech Logistics', 175400);

Grant access to the user on the schema:

-- Grant usage on schema
grant usage on schema public to role "tipblogawsidc:Group3";

To create a datashare and add the preceding table to the datashare, enter the following statements:

CREATE DATASHARE demo_datashare;
ALTER DATASHARE demo_datashare ADD SCHEMA public;
ALTER DATASHARE demo_datashare ADD TABLE revenue;

Grant usage on the datashare to the account using the Data Catalog:

GRANT USAGE ON DATASHARE demo_datashare TO ACCOUNT '<Replace with CatalogId from Cloud Formation Output>' via DATA CATALOG;

Authorize the datashare

For this post, we use the AWS CLI to authorize the datashare. You can also do it from the Amazon Redshift console.

Enter the following command in the AWS Cloud9 IDE to describe the datashare you created and note the value of DataShareArn and ConsumerIdentifier to use in subsequent steps:

aws redshift describe-data-shares

Enter the following command in the AWS Cloud9 IDE to the authorize the datashare:

aws redshift authorize-data-share --data-share-arn <Replace with DataShareArn value copied from earlier command’s output> --consumer-identifier <Replace with ConsumerIdentifier value copied from earlier command’s output >

Accept the datashare in Lake Formation

Next, accept the datashare in Lake Formation.

On the Lake Formation console, choose Data sharing in the navigation pane.
In the Invitations section, select the datashare invitation that is pending acceptance.
Choose Review invitation and accept the datashare.
Provide a database name (tip-blog-redshift-ds-db), which will be created in the Data Catalog by Lake Formation.
Choose Skip to Review and Create and create the database.

Grant permissions in Lake Formation

Complete the following steps:

On the Lake Formation console, choose Data lake permissions in the navigation pane.
Choose Grant and in the Principals section, choose User3 to grant permissions with the IAM Identity Center-new option. Refer to the Lake Formation access grants steps performed for User1 and User2 if needed.
Choose the database (tip-blog-redshift-ds-db) you created earlier and the table public.revenue, which you created in the Redshift Query Editor v2.
For Table permissions¸ select Select.
For Data permissions¸ select Column-based access and select the account and salesamt columns.
Choose Grant.

Mount the AWS Glue database to Amazon Redshift

As the last step in the setup, mount the AWS Glue database to Amazon Redshift. In the Query Editor v2, enter the following statements:

create external schema if not exists tipblog_datashare_idc_schema from DATA CATALOG DATABASE 'tip-blog-redshift-ds-db' catalog_id '<Replace with CatalogId from CloudFormation output>';

grant usage on schema tipblog_datashare_idc_schema to role "tipblogawsidc:Group3";

grant select on all tables in schema tipblog_datashare_idc_schema to role "tipblogawsidc:Group3";

You are now done with the required setup and permissions for User3 on the Redshift table.

Verify access

To verify access, complete the following steps:

Get the AWS access portal URL from the IAM Identity Center Settings section.
Open a different browser and enter the access portal URL.

This will redirect you to your Okta login page.

If you’re using private or incognito mode for testing this, you may need to enable third-party cookies.

Choose the options menu (three dots) and choose Edit connection for the redshift-idc-wg-tipblog workgroup.
Use IAM Identity Center in the pop-up window and choose Continue.

If you get an error with the message “Redshift serverless cluster is auto paused,” switch to the other browser with admin credentials and run any sample queries to un-pause the cluster. Then switch back to this browser and continue the next steps.

Run the following query to access the table:

SELECT * FROM "dev"."tipblog_datashare_idc_schema"."public.revenue";

You can only see the two columns due to the access grants you provided in Lake Formation earlier.

This completes configuring User3 access to the Redshift table.

Set up QuickSight for User3

Let’s now set up QuickSight and verify access for User3. We already granted access to User3 to the Redshift table in earlier steps.

Create a new IAM Identity Center enabled QuickSight account. Refer to Simplify business intelligence identity management with Amazon QuickSight and AWS IAM Identity Center for guidance.
Choose Group3 for the author and reader for this post.
For IAM Role, choose the IAM role matching the RoleQuickSight value from the CloudFormation stack output.

Next, you add a VPC connection to QuickSight to access the Redshift Serverless namespace you created earlier.

On the QuickSight console, manage your VPC connections.
Choose Add VPC connection.
For VPC connection name, enter a name.
For VPC ID, enter the value for VPCId from the CloudFormation stack output.
For Execution role, choose the value for RoleQuickSight from the CloudFormation stack output.
For Security Group IDs, choose the security group for QSSecurityGroup from the CloudFormation stack output.

Wait for the VPC connection to be AVAILABLE.
Enter the following command in AWS Cloud9 to enable QuickSight with Amazon Redshift for trusted identity propagation:

aws quicksight update-identity-propagation-config --aws-account-id "<Replace with CatalogId from CloudFormation output>" --service "REDSHIFT" --authorized-targets "< Replace with IdcManagedApplicationArn value from output of aws redshift describe-redshift-idc-applications --output json which you copied earlier>"

Verify User3 access with QuickSight

Complete the following steps:

Sign in to the QuickSight console as User3 in a different browser.
On the Okta sign-in page, sign in as User 3.
Create a new dataset with Amazon Redshift as the data source.
Choose the VPC connection you created above for Connection Type.
Provide the Redshift server (the RedshiftSrverlessWorkgroup value from the CloudFormation stack output), port (5439 in this post), and database name (dev in this post).
Under Authentication method, select Single sign-on.
Choose Validate, then choose Create data source.

If you encounter an issue with validating using single sign-on, switch to Database username and password for Authentication method, validate with any dummy user and password, and then switch back to validate using single sign-on and proceed to the next step. Also check that the Redshift serverless cluster is not auto-paused as mentioned earlier in Redshift access verification.

Choose the schema you created earlier (tipblog_datashare_idc_schema) and the table public.revenue
Choose Select to create your dataset.

You should now be able to visualize the data in QuickSight. You are only able to only see the account and salesamt columns from the table because of the access permissions you granted earlier with Lake Formation.

This finishes all the steps for setting up trusted identity propagation.

Audit data access

Let’s see how we can audit the data access with the different users.

Access requests are logged to CloudTrail. The IAM Identity Center user ID is logged under the onBehalfOf tag in the CloudTrail event. The following screenshot shows the GetDataAccess event generated by Lake Formation. You can view the CloudTrail event history and filter by event name GetDataAccess to view similar events in your account.

You can see the userId corresponds to User2.

You can run the following commands in AWS Cloud9 to confirm this.

Get the identity store ID:

aws sso-admin describe-instance --instance-arn <Replace with your instance arn value> | jq -r '.IdentityStoreId'

Describe the user in the identity store:

aws identitystore describe-user --identity-store-id <Replace with output of above command> --user-id <User Id from above screenshot>

One way to query the CloudTrail log events is by using CloudTrail Lake. Set up the event data store (refer to the following instructions) and rerun the queries for User1, User2, and User3. You can query the access events using CloudTrail Lake with the following sample query:

SELECT eventTime,userIdentity.onBehalfOf.userid AS idcUserId,requestParameters as accessInfo, serviceEventDetails
FROM 04d81d04-753f-42e0-a31f-2810659d9c27
WHERE userIdentity.arn IS NOT NULL AND eventName='BatchGetTable' or eventName='GetDataAccess' or eventName='CreateDataSet'
order by eventTime DESC

The following screenshot shows an example of the detailed results with audit explanations.

Clean up

To avoid incurring further charges, delete the CloudFormation stack. Before you delete the CloudFormation stack, delete all the resources you created using the console or AWS CLI:

Manually delete any EMR Studio Workspaces you created with User2.
Delete the Athena workgroup created as part of the User1 setup.
Delete the QuickSight VPC connection you created.
Delete the Redshift IAM Identity Center connection.
Deregister IAM Identity Center from S3 Access Grants.
Delete the CloudFormation stack.
Manually delete the VPC created by AWS CloudFormation.

Conclusion

In this post, we delved into the trusted identity propagation feature of AWS Identity Center alongside various AWS Analytics services, demonstrating its utility in managing permissions using corporate user or group identities rather than IAM roles. We examined diverse user personas utilizing interactive tools like Athena, EMR Studio notebooks, Redshift Query Editor V2, and QuickSight, all centralized under Lake Formation for streamlined permission management. Additionally, we explored S3 Access Grants for S3 bucket access management, and concluded with insights into auditing through CloudTrail events and CloudTrail Lake for a comprehensive overview of user data access.

For further reading, refer to the following resources:

About the Author

Shoukat Ghouse is a Senior Big Data Specialist Solutions Architect at AWS. He helps customers around the world build robust, efficient and scalable data platforms on AWS leveraging AWS analytics services like AWS Glue, AWS Lake Formation, Amazon Athena and Amazon EMR.

How to use AWS managed applications with IAM Identity Center

2024-05-13 Liam Wadman

Post Syndicated from Liam Wadman original https://aws.amazon.com/blogs/security/how-to-use-aws-managed-applications-with-iam-identity-center/

AWS IAM Identity Center is the preferred way to provide workforce access to Amazon Web Services (AWS) accounts, and enables you to provide workforce access to many AWS managed applications, such as Amazon Q Developer (Formerly known as Code Whisperer).

As we continue to release more AWS managed applications, customers have told us they want to onboard to IAM Identity Center to use AWS managed applications, but some aren’t ready to migrate their existing IAM federation for AWS account management to Identity Center.

In this blog post, I’ll show you how you can enable Identity Center and use AWS managed applications—such as Amazon Q Developer—without migrating existing IAM federation flows to Identity Center.

A recap on AWS managed applications and trusted identity propagation

Just before re:Invent 2023, AWS launched trusted identity propagation, a technology that allows you to use a user’s identity and groups when accessing AWS services. This allows you to assign permissions directly to users or groups, rather than model entitlements in AWS Identity and Access Management (IAM). This makes permissions management simpler for users. For example, with trusted identity propagation, you can grant users and groups access to specific Amazon Redshift clusters without modeling all possible unique combinations of permissions in IAM. Trusted identity propagation is available today for Redshift and Amazon Simple Storage Service (Amazon S3), with more services and features coming over time.

In 2023, we released Amazon Q Developer, which is integrated with IAM Identity Center, generally available as an AWS managed application. When you’re using Amazon Q Developer outside of AWS in integrated development environments (IDEs) such as Microsoft Visual Studio Code, Identity Center is used to sign in to Amazon Q Developer.

Amazon Q Developer is one of many AWS managed applications that are integrated with the OAuth 2.0 functionality of IAM Identity Center, and it doesn’t use IAM credentials to access the Q Developer service from within your IDEs. AWS managed applications and trusted identity propagation don’t require you to use the permission sets feature of Identity Center and instead use OpenID Connect to grant your workforce access to AWS applications and features.

IAM Identity Center for AWS application access only

In the following section, we use IAM Identity Center to sign in to Amazon Q Developer as an example of an AWS managed application.

Prerequisites

The steps in this post require that you have administrative level access to an organization in AWS Organizations.
Specific prerequisites and considerations for deploying IAM Identity Center can be found in the documentation.

Step 1: Enable an organization instance of IAM Identity Center

To begin, you must enable an organization instance of IAM Identity Center. While it’s possible to use IAM Identity Center without an AWS Organizations organization, we generally recommend that customers operate with such an organization.

The IAM Identity Center documentation provides the steps to enable an organizational instance of IAM Identity Center, as well as prerequisites and considerations. One consideration I would emphasize here is the identity source. We recommend, wherever possible, that you integrate with an external identity provider (IdP), because this provides the most flexibility and allows you to take advantage of the advanced security features of modern identity platforms.

IAM Identity Center is available at no additional cost.

Note: In late 2023, AWS launched account instances for IAM Identity Center. Account instances allow you to create additional Identity Center instances within member accounts of your organization. Wherever possible, we recommend that customers use an organization instance of IAM Identity Center to give them a centralized place to manage their identities and permissions. AWS recommends account instances when you want to perform a proof of concept using Identity Center, when there isn’t a central IdP or directory that contains all the identities you want to use on AWS and you want to use AWS managed applications with distinct directories, or when your AWS account is a member of an organization in AWS Organizations that is managed by another party and you don’t have access to set up an organization instance.

Step 2: Set up your IdP and synchronize identities and groups

After you’ve enabled your IAM Identity Center instance, you need to set up your instance to work with your chosen IdP and synchronize your identities and groups. The IAM Identity Center documentation includes examples of how to do this with many popular IdPs.

After your identity source is connected, IAM Identity Center can act as the single source of identity and authentication for AWS managed applications, bridging your external identity source and AWS managed applications. You don’t have to create a bespoke relationship between each AWS application and your IdP, and you have a single place to manage user permissions.

Step 3: Set up delegated administration for IAM Identity Center

As a best practice, we recommend that you only access the management account of your AWS Organizations organization when absolutely necessary. IAM Identity Center supports delegated administration, which allows you to manage Identity Center from a member account of your organization.

To set up delegated administration

Go to the AWS Management Console and navigate to IAM Identity Center.
In the left navigation pane, select Settings. Then select the Management tab and choose Register account.
From the menu that follows, select the AWS account that will be used for delegated administration for IAM Identity Center. Ideally, this member account is dedicated solely to the purpose of administrating IAM Identity Center and is only accessible to users who are responsible for maintaining IAM Identity Center.

Figure 1: Set up delegated administration

Step 4: Configure Amazon Q Developer

You now have IAM Identity Center set up with the users and groups from your directory, and you’re ready to configure AWS managed applications with IAM Identity Center. From a member account within your organization, you can now enable Amazon Q Developer. This can be any member account in your organization and should not be the one where you set up delegated administration of IAM Identity Center, or the management account.

Note: If you’re doing this step immediately after configuring IAM Identity Center with an external IdP with SCIM synchronization, be aware that the users and groups from your external IdP might not yet be synchronized to Identity Center by your external IdP. Identity Center updates user information and group membership as soon as the data is received from your external IdP. How long it takes to finish synchronizing after the data is received depends on the number of users and groups being synchronized to Identity Center.

To enable Amazon Q Developer

Open the Amazon Q Developer console. This will take you to the setup for Amazon Q Developer.

Figure 2: Open the Amazon Q Developer console
Choose Subscribe to Amazon Q.

Figure 3: The Amazon Q developer console
You’ll be taken to the Amazon Q console. Choose Subscribe to subscribe to Amazon Q Developer Pro.

Figure 4: Subscribe to Amazon Q Developer Pro
After choosing Subscribe, you will be prompted to select users and groups you want to enroll for Amazon Q Developer. Select the users and groups you want and then choose Assign.

Figure 5: Assign user and group access to Amazon Q Developer

After you perform these steps, the setup of Amazon Q Developer as an AWS managed application is complete, and you can now use Amazon Q Developer. No additional configuration is required within your external IdP or on-premises Microsoft Active Directory, and no additional user profiles have to be created or synchronized to Amazon Q Developer.

Note: There are charges associated with using the Amazon Q Developer service.

Step 5: Set up Amazon Q Developer in the IDE

Now that Amazon Q Developer is configured, users and groups that you have granted access to can use Amazon Q Developer from their supported IDE.

In their IDE, a user can sign in to Amazon Q Developer by entering the start URL and AWS Region and choosing Sign in. Figure 6 shows what this looks like in Visual Studio Code. The Amazon Q extension for Visual Studio Code is available to download within Visual Studio Code.

Figure 6: Signing in to the Amazon Q Developer extension in Visual Studio Code

After choosing Use with Pro license, and entering their Identity Center’s start URL and Region, the user will be directed to authenticate with IAM Identity Center and grant the Amazon Q Developer application access to use the Amazon Q Developer service.

When this is successful, the user will have the Amazon Q Developer functionality available in their IDE. This was achieved without migrating existing federation or AWS account access patterns to IAM Identity Center.

Clean up

If you don’t wish to continue using IAM Identity Center or Amazon Q Developer, you can delete the Amazon Q Developer Profile and Identity Center instance within their respective consoles, within the AWS account they are deployed into. Deleting your Identity Center instance won’t make changes to existing federation or AWS account access that is not done through IAM Identity Center.

Conclusion

In this post, we talked about some recent significant launches of AWS managed applications and features that integrate with IAM Identity Center and discussed how you can use these features without migrating your AWS account management to permission sets. We also showed how you can set up Amazon Q Developer with IAM Identity Center. While the example in this post uses Amazon Q Developer, the same approach and guidance applies to Amazon Q Business and other AWS managed applications integrated with Identity Center.

To learn more about the benefits and use cases of IAM Identity Center, visit the product page, and to learn more about Amazon Q Developer, visit the Amazon Q Developer product page.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on X.

Use your corporate identities for analytics with Amazon EMR and AWS IAM Identity Center

2024-04-26 Pradeep Misra

Post Syndicated from Pradeep Misra original https://aws.amazon.com/blogs/big-data/use-your-corporate-identities-for-analytics-with-amazon-emr-and-aws-iam-identity-center/

To enable your workforce users for analytics with fine-grained data access controls and audit data access, you might have to create multiple AWS Identity and Access Management (IAM) roles with different data permissions and map the workforce users to one of those roles. Multiple users are often mapped to the same role where they need similar privileges to enable data access controls at the corporate user or group level and audit data access.

AWS IAM Identity Center enables centralized management of workforce user access to AWS accounts and applications using a local identity store or by connecting corporate directories via identity providers (IdPs). IAM Identity Center now supports trusted identity propagation, a streamlined experience for users who require access to data with AWS analytics services.

Amazon EMR Studio is an integrated development environment (IDE) that makes it straightforward for data scientists and data engineers to build data engineering and data science applications. With trusted identity propagation, data access management can be based on a user’s corporate identity and can be propagated seamlessly as they access data with single sign-on to build analytics applications with Amazon EMR (EMR Studio and Amazon EMR on EC2).

AWS Lake Formation allows data administrators to centrally govern, secure, and share data for analytics and machine learning (ML). With trusted identity propagation, data administrators can directly provide granular access to corporate users using their identity attributes and simplify the traceability of end-to-end data access across AWS services. Because access is managed based on a user’s corporate identity, they don’t need to use database local user credentials or assume an IAM role to access data.

In this post, we show how to bring your workforce identity to EMR Studio for analytics use cases, directly manage fine-grained permissions for the corporate users and groups using Lake Formation, and audit their data access.

Solution overview

For our use case, we want to enable a data analyst user named analyst1 to use their own enterprise credentials to query data they have been granted permissions to and audit their data access. We use Okta as the IdP for this demonstration. The following diagram illustrates the solution architecture.

This architecture is based on the following components:

Okta is responsible for maintaining the corporate user identities, related groups, and user authentication.
IAM Identity Center connects Okta users and centrally manages their access across AWS accounts and applications.
Lake Formation provides fine-grained access controls on data directly to corporate users using trusted identity propagation.
EMR Studio is an IDE for users to build and run applications. It allows users to log in directly with their corporate credentials without signing in to the AWS Management Console.
AWS Service Catalog provides a product template to create EMR clusters.
EMR cluster is integrated with IAM Identity Center using a security configuration.
AWS CloudTrail captures user data access activities.

The following are the high-level steps to implement the solution:

Integrate Okta with IAM Identity Center.
Set up Amazon EMR Studio.
Create an IAM Identity Center enabled security configuration for EMR clusters.
Create a Service Catalog product template to create the EMR clusters.
Use Lake Formation to grant permissions to users to access data.
Test the solution by accessing data with a corporate identity.
Audit user data access.

Prerequisites

You should have the following prerequisites:

An AWS account with access to the following AWS services:
- AWS CloudFormation
- CloudTrail
- Amazon Elastic Compute Cloud (Amazon EC2)
- Amazon Simple Storage Service (Amazon S3)
- EMR Studio
- IAM
- IAM Identity Center
- Lake Formation
- Service Catalog
An Okta account (you can create a free developer account)

Integrate Okta with IAM Identity Center

For more information about configuring Okta with IAM Identity Center, refer to Configure SAML and SCIM with Okta and IAM Identity Center.

For this setup, we have created two users, analyst1 and engineer1, and assigned them to the corresponding Okta application. You can validate the integration is working by navigating to the Users page on the IAM Identity Center console, as shown in the following screenshot. Both enterprise users from Okta are provisioned in IAM Identity Center.

The following exact users will not be listed in your account. You can either create similar users or use an existing user.

Each provisioned user in IAM Identity Center has a unique user ID. This ID does not originate from Okta; it’s created in IAM Identity Center to uniquely identify this user. With trusted identity propagation, this user ID will be propagated across services and also used for traceability purposes in CloudTrail. The following screenshot shows the IAM Identity Center user matching the provisioned Okta user analyst1.

Choose the link under AWS access portal URL and log in with the analyst1 Okta user credentials that are already assigned to this application.

If you are able to log in and see the landing page, then all your configurations up to this step are set correctly. You will not see any applications on this page yet.

Set up EMR Studio

In this step, we demonstrate the actions needed from the data lake administrator to set up EMR Studio enabled for trusted identity propagation and with IAM Identity Center integration. This allows users to directly access EMR Studio with their enterprise credentials.

Note: All Amazon S3 buckets (created after January 5, 2023) have encryption configured by default (Amazon S3 managed keys (SSE-S3)), and all new objects that are uploaded to an S3 bucket are automatically encrypted at rest. To use a different type of encryption, to meet your security needs, please update the default encryption configuration for the bucket. See Protecting data for server-side encryption for further details.

On the Amazon EMR console, choose Studios in the navigation pane under EMR Studio.
Choose Create Studio.

For Setup options¸ select Custom.
For Studio name, enter a name (for this post, emr-studio-with-tip).
For S3 location for Workspace storage, select Select existing location and enter an existing S3 bucket (if you have one). Otherwise, select Create new bucket.

For Service role to let Studio access your AWS resources, choose View permissions details to get the trust and IAM policy information that is needed and create a role with those specific policies in IAM. In this case, we create a new role called emr_tip_role.

For Service role to let Studio access your AWS resources, choose the IAM role you created.
For Workspace name, enter a name (for this post, studio-workspace-with-tip).

For Authentication, select IAM Identity Center.
For User role¸ you can create a new role or choose an existing role. For this post, we choose the role we created (emr_tip_role).
To use the same role, add the following statement to the trust policy of the service role:

{
  "Version": "2008-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "elasticmapreduce.amazonaws.com",
 "AWS": "arn:aws:iam::xxxxxx:role/emr_tip_role"
      },
      "Action": [
              "sts:AssumeRole",
              "sts:SetContext"
              ]
    }
  ]
}

Select Enable trusted identity propagation to allow you to control and log user access across connected applications.

For Choose who can access your application, select All users and groups.

Later, we restrict access to resources using Lake Formation. However, there is an option here to restrict access to only assigned users and groups.

In the Networking and security section, you can provide optional details for your VPC, subnets, and security group settings.
Choose Create Studio.

On the Studios page of the Amazon EMR console, locate your Studio enabled with IAM Identity Center.
Copy the link for Studio Access URL.

Enter the URL into a web browser and log in using Okta credentials.

You should be able to successfully sign in to the EMR Studio console.

Create an AWS Identity Center enabled security configuration for EMR clusters

EMR security configurations allow you to configure data encryption, Kerberos authentication, and Amazon S3 authorization for the EMR File System (EMRFS) on the clusters. The security configuration is available to use and reuse when you create clusters.

To integrate Amazon EMR with IAM Identity Center, you need to first create an IAM role that authenticates with IAM Identity Center from the EMR cluster. Amazon EMR uses IAM credentials to relay the IAM Identity Center identity to downstream services such as Lake Formation. The IAM role should also have the respective permissions to invoke the downstream services.

Create a role (for this post, called emr-idc-application) with the following trust and permission policy. The role referenced in the trust policy is the InstanceProfile role for EMR clusters. This allows the EC2 instance profile to assume this role and act as an identity broker on behalf of the federated users.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AssumeRole",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::xxxxxxxxxxn:role/service-role/AmazonEMR-InstanceProfile-20240127T102444"
            },
            "Action": [
                "sts:AssumeRole",
                "sts:SetContext"
            ]
        }
    ]
}

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "IdCPermissions",
            "Effect": "Allow",
            "Action": [
                "sso-oauth:*"
            ],
            "Resource": "*"
        },
        {
            "Sid": "GlueandLakePermissions",
            "Effect": "Allow",
            "Action": [
                "glue:*",
                "lakeformation:GetDataAccess"
            ],
            "Resource": "*"
        },
        {
            "Sid": "S3Permissions",
            "Effect": "Allow",
            "Action": [
                "s3:GetDataAccess",
                "s3:GetAccessGrantsInstanceForPrefix"
            ],
            "Resource": "*"
        }
    ]
}

Next, you create certificates for encrypting data in transit with Amazon EMR.

For this post, we use OpenSSL to generate a self-signed X.509 certificate with a 2048-bit RSA private key.

The key allows access to the issuer’s EMR cluster instances in the AWS Region being used. For a complete guide on creating and providing a certificate, refer to Providing certificates for encrypting data in transit with Amazon EMR encryption.

Upload my-certs.zip to an S3 location that will be used to create the security configuration.

The EMR service role should have access to the S3 location. The key allows access to the issuer’s EMR cluster instances in the us-west-2 Region as specified by the *.us-west-2.compute.internal domain name as the common name. You can change this to the Region your cluster is in.

$ openssl req -x509 -newkey rsa:2048 -keyout privateKey.pem -out certificateChain.pem -days 365 -nodes -subj '/CN=*.us-west-2.compute.internal'
$ cp certificateChain.pem trustedCertificates.pem
$ zip -r -X my-certs.zip certificateChain.pem privateKey.pem trustedCertificates.pem

Create an EMR security configuration with IAM Identity Center enabled from the AWS Command Line Interface (AWS CLI) with the following code:

aws emr create-security-configuration --name "IdentityCenterConfiguration-with-lf-tip" --region "us-west-2" --endpoint-url https://elasticmapreduce.us-west-2.amazonaws.com --security-configuration '{
    "AuthenticationConfiguration":{
        "IdentityCenterConfiguration":{
            "EnableIdentityCenter":true,
            "IdentityCenterApplicationAssigmentRequired":false,
            "IdentityCenterInstanceARN": "arn:aws:sso:::instance/ssoins-7907b0d7d77e3e0d",
            "IAMRoleForEMRIdentityCenterApplicationARN": "arn:aws:iam::1xxxxxxxxx0:role/emr-idc-application"
        }
    },
    "AuthorizationConfiguration": {
        "LakeFormationConfiguration": {
            "EnableLakeFormation": true
        }
    },
    "EncryptionConfiguration": {
        "EnableInTransitEncryption": true,
        "EnableAtRestEncryption": false,
        "InTransitEncryptionConfiguration": {
            "TLSCertificateConfiguration": {
                "CertificateProviderType": "PEM",
                "S3Object": "s3://<<Bucket Name>>/emr-transit-encry-certs/my-certs.zip"
            }
        }
    }
}'

You can view the security configuration on the Amazon EMR console.

Create a Service Catalog product template to create EMR clusters

EMR Studio with trusted identity propagation enabled can only work with clusters created from a template. Complete the following steps to create a product template in Service Catalog:

On the Service Catalog console, choose Portfolios under Administration in the navigation pane.
Choose Create portfolio.

Enter a name for your portfolio (for this post, EMR Clusters Template) and an optional description.
Choose Create.

On the Portfolios page, choose the portfolio you just created to view its details.

On the Products tab, choose Create product.

For Product type, select CloudFormation.
For Product name, enter a name (for this post, EMR-7.0.0).
Use the security configuration IdentityCenterConfiguration-with-lf-tip you created in previous steps with the appropriate Amazon EMR service roles.
Choose Create product.

The following is an example CloudFormation template. Update the account-specific values for SecurityConfiguration, JobFlowRole, ServiceRole, LogUri, Ec2KeyName, and Ec2SubnetId. We provide a sample Amazon EMR service role and trust policy in Appendix A at the end of this post.

'Parameters':
  'ClusterName':
    'Type': 'String'
    'Default': 'EMR_TIP_Cluster'
  'EmrRelease':
    'Type': 'String'
    'Default': 'emr-7.0.0'
    'AllowedValues':
    - 'emr-7.0.0'
  'ClusterInstanceType':
    'Type': 'String'
    'Default': 'm5.xlarge'
    'AllowedValues':
    - 'm5.xlarge'
    - 'm5.2xlarge'
'Resources':
  'EmrCluster':
    'Type': 'AWS::EMR::Cluster'
    'Properties':
      'Applications':
      - 'Name': 'Spark'
      - 'Name': 'Livy'
      - 'Name': 'Hadoop'
      - 'Name': 'JupyterEnterpriseGateway'       
      'SecurityConfiguration': 'IdentityCenterConfiguration-with-lf-tip'
      'EbsRootVolumeSize': '20'
      'Name':
        'Ref': 'ClusterName'
      'JobFlowRole': <Instance Profile Role>
      'ServiceRole': <EMR Service Role>
      'ReleaseLabel':
        'Ref': 'EmrRelease'
      'VisibleToAllUsers': !!bool 'true'
      'LogUri':
        'Fn::Sub': <S3 LOG Path>
      'Instances':
        "Ec2KeyName" : <Key Pair Name>
        'TerminationProtected': !!bool 'false'
        'Ec2SubnetId': <subnet-id>
        'MasterInstanceGroup':
          'InstanceCount': !!int '1'
          'InstanceType':
            'Ref': 'ClusterInstanceType'
        'CoreInstanceGroup':
          'InstanceCount': !!int '2'
          'InstanceType':
            'Ref': 'ClusterInstanceType'
          'Market': 'ON_DEMAND'
          'Name': 'Core'
'Outputs':
  'ClusterId':
    'Value':
      'Ref': 'EmrCluster'
    'Description': 'The ID of the  EMR cluster'
'Metadata':
  'AWS::CloudFormation::Designer': {}
'Rules': {}

Trusted identity propagation is supported from Amazon EMR 6.15 onwards. For Amazon EMR 6.15, add the following bootstrap action to the CloudFormation script:

'BootstrapActions':
- 'Name': 'spark-config'
'ScriptBootstrapAction':
'Path': 's3://emr-data-access-control-<aws-region>/customer-bootstrap-actions/idc-fix/replace-puppet.sh'

The portfolio now should have the EMR cluster creation product added.

Grant the EMR Studio role emr_tip_role access to the portfolio.

Grant Lake Formation permissions to users to access data

In this step, we enable Lake Formation integration with IAM Identity Center and grant permissions to the Identity Center user analyst1. If Lake Formation is not already enabled, refer to Getting started with Lake Formation.

To use Lake Formation with Amazon EMR, create a custom role to register S3 locations. You need to create a new custom role with Amazon S3 access and not use the default role AWSServiceRoleForLakeFormationDataAccess. Additionally, enable external data filtering in Lake Formation. For more details, refer to Enable Lake Formation with Amazon EMR.

Complete the following steps to manage access permissions in Lake Formation:

On the Lake Formation console, choose IAM Identity Center integration under Administration in the navigation pane.

Lake Formation will automatically specify the correct IAM Identity Center instance.

Choose Create.

You can now view the IAM Identity Center integration details.

For this post, we have a Marketing database and a customer table on which we grant access to our enterprise user analyst1. You can use an existing database and table in your account or create a new one. For more examples, refer to Tutorials.

The following screenshot shows the details of our customer table.

Complete the following steps to grant analyst1 permissions. For more information, refer to Granting table permissions using the named resource method.

On the Lake Formation console, choose Data lake permissions under Permissions in the navigation pane.
Choose Grant.

Select Named Data Catalog resources.
For Databases, choose your database (marketing).
For Tables, choose your table (customer).

For Table permissions, select Select and Describe.
For Data permissions, select All data access.
Choose Grant.

The following screenshot shows a summary of permissions that user analyst1 has. They have Select access on the table and Describe permissions on the databases.

Test the solution

To test the solution, we log in to EMR Studio as enterprise user analyst1, create a new Workspace, create an EMR cluster using a template, and use that cluster to perform an analysis. You could also use the Workspace that was created during the Studio setup. In this demonstration, we create a new Workspace.

You need additional permissions in the EMR Studio role to create and list Workspaces, use a template, and create EMR clusters. For more details, refer to Configure EMR Studio user permissions for Amazon EC2 or Amazon EKS. Appendix B at the end of this post contains a sample policy.

When the cluster is available, we attach the cluster to the Workspace and run queries on the customer table, which the user has access to.

User analyst1 is now able to run queries for business use cases using their corporate identity. To open a PySpark notebook, we choose PySpark under Notebook.

When the notebook is open, we run a Spark SQL query to list the databases:

%%sql 
show databases

In this case, we query the customer table in the marketing database. We should be able to access the data.

%%sql
select * from marketing.customer

Audit data access

Lake Formation API actions are logged by CloudTrail. The GetDataAccess action is logged whenever a principal or integrated AWS service requests temporary credentials to access data in a data lake location that is registered with Lake Formation. With trusted identity propagation, CloudTrail also logs the IAM Identity Center user ID of the corporate identity who requested access to the data.

The following screenshot shows the details for the analyst1 user.

Choose View event to view the event logs.

The following is an example of the GetDataAccess event log. We can trace that user analyst1, Identity Center user ID c8c11390-00a1-706e-0c7a-bbcc5a1c9a7f, has accessed the customer table.

{
    "eventVersion": "1.09",
    
….
        "onBehalfOf": {
            "userId": "c8c11390-00a1-706e-0c7a-bbcc5a1c9a7f",
            "identityStoreArn": "arn:aws:identitystore::xxxxxxxxx:identitystore/d-XXXXXXXX"
        }
    },
    "eventTime": "2024-01-28T17:56:25Z",
    "eventSource": "lakeformation.amazonaws.com",
    "eventName": "GetDataAccess",
    "awsRegion": "us-west-2",
….
        "requestParameters": {
        "tableArn": "arn:aws:glue:us-west-2:xxxxxxxxxx:table/marketing/customer",
        "supportedPermissionTypes": [
            "TABLE_PERMISSION"
        ]
    },
    …..
    }
}

Here is an end to end demonstration video of steps to follow for enabling trusted identity propagation to your analytics flow in Amazon EMR

Clean up

Clean up the following resources when you’re done using this solution:

Delete the CloudFormation stacks created in each account to delete the EMR cluster.
Delete the EMR Studio Workspaces and environment.
Delete the Service Catalog product and portfolio.
Delete Okta users
Revoke Lake Formation access to the users.

Conclusion

In this post, we demonstrated how to set up and use trusted identity propagation using IAM Identity Center, EMR Studio, and Lake Formation for analytics. With trusted identity propagation, a user’s corporate identity is seamlessly propagated as they access data using single sign-on across AWS analytics services to build analytics applications. Data administrators can provide fine-grained data access directly to corporate users and groups and audit usage. To learn more, see Integrate Amazon EMR with AWS IAM Identity Center.

About the Authors

Pradeep Misra is a Principal Analytics Solutions Architect at AWS. He works across Amazon to architect and design modern distributed analytics and AI/ML platform solutions. He is passionate about solving customer challenges using data, analytics, and AI/ML. Outside of work, Pradeep likes exploring new places, trying new cuisines, and playing board games with his family. He also likes doing science experiments with his daughters.

Deepmala Agarwal works as an AWS Data Specialist Solutions Architect. She is passionate about helping customers build out scalable, distributed, and data-driven solutions on AWS. When not at work, Deepmala likes spending time with family, walking, listening to music, watching movies, and cooking!

Abhilash Nagilla is a Senior Specialist Solutions Architect at Amazon Web Services (AWS), helping public sector customers on their cloud journey with a focus on AWS analytics services. Outside of work, Abhilash enjoys learning new technologies, watching movies, and visiting new places.

Appendix A

Sample Amazon EMR service role and trust policy:

Note: This is a sample service role. Fine grained access control is done using Lake Formation. Modify the permissions as per your enterprise guidance and to comply with your security team.

Trust policy:

{
    "Version": "2008-10-17",
    "Statement": [
        {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
                "Service": "elasticmapreduce.amazonaws.com",
   "AWS": "arn:aws:iam::xxxxxx:role/emr_tip_role"

            },
            "Action": [
                "sts:AssumeRole",
                "sts:SetContext"
            ]
        }
    ]
}

Permission Policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ResourcesToLaunchEC2",
            "Effect": "Allow",
            "Action": [
                "ec2:RunInstances",
                "ec2:CreateFleet",
                "ec2:CreateLaunchTemplate",
                "ec2:CreateLaunchTemplateVersion"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:network-interface/*",
                "arn:aws:ec2:*::image/ami-*",
                "arn:aws:ec2:*:*:key-pair/*",
                "arn:aws:ec2:*:*:capacity-reservation/*",
                "arn:aws:ec2:*:*:placement-group/pg-*",
                "arn:aws:ec2:*:*:fleet/*",
                "arn:aws:ec2:*:*:dedicated-host/*",
                "arn:aws:resource-groups:*:*:group/*"
            ]
        },
        {
            "Sid": "TagOnCreateTaggedEMRResources",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateTags"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:network-interface/*",
                "arn:aws:ec2:*:*:instance/*",
                "arn:aws:ec2:*:*:volume/*",
                "arn:aws:ec2:*:*:launch-template/*"
            ],
            "Condition": {
                "StringEquals": {
                    "ec2:CreateAction": [
                        "RunInstances",
                        "CreateFleet",
                        "CreateLaunchTemplate",
                        "CreateNetworkInterface"
                    ]
                }
            }
        },
        {
            "Sid": "ListActionsForEC2Resources",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeAccountAttributes",
                "ec2:DescribeCapacityReservations",
                "ec2:DescribeDhcpOptions",
                "ec2:DescribeImages",
                "ec2:DescribeInstances",
                "ec2:DescribeLaunchTemplates",
                "ec2:DescribeNetworkAcls",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribePlacementGroups",
                "ec2:DescribeRouteTables",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSubnets",
                "ec2:DescribeVolumes",
                "ec2:DescribeVolumeStatus",
                "ec2:DescribeVpcAttribute",
                "ec2:DescribeVpcEndpoints",
                "ec2:DescribeVpcs"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AutoScaling",
            "Effect": "Allow",
            "Action": [
                "application-autoscaling:DeleteScalingPolicy",
                "application-autoscaling:DeregisterScalableTarget",
                "application-autoscaling:DescribeScalableTargets",
                "application-autoscaling:DescribeScalingPolicies",
                "application-autoscaling:PutScalingPolicy",
                "application-autoscaling:RegisterScalableTarget"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AutoScalingCloudWatch",
            "Effect": "Allow",
            "Action": [
                "cloudwatch:PutMetricAlarm",
                "cloudwatch:DeleteAlarms",
                "cloudwatch:DescribeAlarms"
            ],
            "Resource": "arn:aws:cloudwatch:*:*:alarm:*_EMR_Auto_Scaling"
        },
        {
            "Sid": "PassRoleForAutoScaling",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::*:role/EMR_AutoScaling_DefaultRole",
            "Condition": {
                "StringLike": {
                    "iam:PassedToService": "application-autoscaling.amazonaws.com*"
                }
            }
        },
        {
            "Sid": "PassRoleForEC2",
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:aws:iam::xxxxxxxxxxx:role/service-role/<Instance-Profile-Role>",
            "Condition": {
                "StringLike": {
                    "iam:PassedToService": "ec2.amazonaws.com*"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:*",
                "s3-object-lambda:*"
            ],
            "Resource": [
                "arn:aws:s3:::<bucket>/*",
                "arn:aws:s3:::*logs*/*"
            ]
        },
        {
            "Effect": "Allow",
            "Resource": "*",
            "Action": [
                "ec2:AuthorizeSecurityGroupEgress",
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:CancelSpotInstanceRequests",
                "ec2:CreateFleet",
                "ec2:CreateLaunchTemplate",
                "ec2:CreateNetworkInterface",
                "ec2:CreateSecurityGroup",
                "ec2:CreateTags",
                "ec2:DeleteLaunchTemplate",
                "ec2:DeleteNetworkInterface",
                "ec2:DeleteSecurityGroup",
                "ec2:DeleteTags",
                "ec2:DescribeAvailabilityZones",
                "ec2:DescribeAccountAttributes",
                "ec2:DescribeDhcpOptions",
                "ec2:DescribeImages",
                "ec2:DescribeInstanceStatus",
                "ec2:DescribeInstances",
                "ec2:DescribeKeyPairs",
                "ec2:DescribeLaunchTemplates",
                "ec2:DescribeNetworkAcls",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribePrefixLists",
                "ec2:DescribeRouteTables",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeSpotInstanceRequests",
                "ec2:DescribeSpotPriceHistory",
                "ec2:DescribeSubnets",
                "ec2:DescribeTags",
                "ec2:DescribeVpcAttribute",
                "ec2:DescribeVpcEndpoints",
                "ec2:DescribeVpcEndpointServices",
                "ec2:DescribeVpcs",
                "ec2:DetachNetworkInterface",
                "ec2:ModifyImageAttribute",
                "ec2:ModifyInstanceAttribute",
                "ec2:RequestSpotInstances",
                "ec2:RevokeSecurityGroupEgress",
                "ec2:RunInstances",
                "ec2:TerminateInstances",
                "ec2:DeleteVolume",
                "ec2:DescribeVolumeStatus",
                "ec2:DescribeVolumes",
                "ec2:DetachVolume",
                "iam:GetRole",
                "iam:GetRolePolicy",
                "iam:ListInstanceProfiles",
                "iam:ListRolePolicies",
                "cloudwatch:PutMetricAlarm",
                "cloudwatch:DescribeAlarms",
                "cloudwatch:DeleteAlarms",
                "application-autoscaling:RegisterScalableTarget",
                "application-autoscaling:DeregisterScalableTarget",
                "application-autoscaling:PutScalingPolicy",
                "application-autoscaling:DeleteScalingPolicy",
                "application-autoscaling:Describe*"
            ]
        }
    ]
}

Appendix B

Sample EMR Studio role policy:

Note: This is a sample service role. Fine grained access control is done using Lake Formation. Modify the permissions as per your enterprise guidance and to comply with your security team.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowEMRReadOnlyActions",
            "Effect": "Allow",
            "Action": [
                "elasticmapreduce:ListInstances",
                "elasticmapreduce:DescribeCluster",
                "elasticmapreduce:ListSteps"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AllowEC2ENIActionsWithEMRTags",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterfacePermission",
                "ec2:DeleteNetworkInterface"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:network-interface/*"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/for-use-with-amazon-emr-managed-policies": "true"
                }
            }
        },
        {
            "Sid": "AllowEC2ENIAttributeAction",
            "Effect": "Allow",
            "Action": [
                "ec2:ModifyNetworkInterfaceAttribute"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:instance/*",
                "arn:aws:ec2:*:*:network-interface/*",
                "arn:aws:ec2:*:*:security-group/*"
            ]
        },
        {
            "Sid": "AllowEC2SecurityGroupActionsWithEMRTags",
            "Effect": "Allow",
            "Action": [
                "ec2:AuthorizeSecurityGroupEgress",
                "ec2:AuthorizeSecurityGroupIngress",
                "ec2:RevokeSecurityGroupEgress",
                "ec2:RevokeSecurityGroupIngress",
                "ec2:DeleteNetworkInterfacePermission"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/for-use-with-amazon-emr-managed-policies": "true"
                }
            }
        },
        {
            "Sid": "AllowDefaultEC2SecurityGroupsCreationWithEMRTags",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateSecurityGroup"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:security-group/*"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/for-use-with-amazon-emr-managed-policies": "true"
                }
            }
        },
        {
            "Sid": "AllowDefaultEC2SecurityGroupsCreationInVPCWithEMRTags",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateSecurityGroup"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:vpc/*"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/for-use-with-amazon-emr-managed-policies": "true"
                }
            }
        },
        {
            "Sid": "AllowAddingEMRTagsDuringDefaultSecurityGroupCreation",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateTags"
            ],
            "Resource": "arn:aws:ec2:*:*:security-group/*",
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/for-use-with-amazon-emr-managed-policies": "true",
                    "ec2:CreateAction": "CreateSecurityGroup"
                }
            }
        },
        {
            "Sid": "AllowEC2ENICreationWithEMRTags",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterface"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:network-interface/*"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/for-use-with-amazon-emr-managed-policies": "true"
                }
            }
        },
        {
            "Sid": "AllowEC2ENICreationInSubnetAndSecurityGroupWithEMRTags",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterface"
            ],
            "Resource": [
                "arn:aws:ec2:*:*:subnet/*",
                "arn:aws:ec2:*:*:security-group/*"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/for-use-with-amazon-emr-managed-policies": "true"
                }
            }
        },
        {
            "Sid": "AllowAddingTagsDuringEC2ENICreation",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateTags"
            ],
            "Resource": "arn:aws:ec2:*:*:network-interface/*",
            "Condition": {
                "StringEquals": {
                    "ec2:CreateAction": "CreateNetworkInterface"
                }
            }
        },
        {
            "Sid": "AllowEC2ReadOnlyActions",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeNetworkInterfaces",
                "ec2:DescribeTags",
                "ec2:DescribeInstances",
                "ec2:DescribeSubnets",
                "ec2:DescribeVpcs"
            ],
            "Resource": "*"
        },
        {
            "Sid": "AllowSecretsManagerReadOnlyActionsWithEMRTags",
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue"
            ],
            "Resource": "arn:aws:secretsmanager:*:*:secret:*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/for-use-with-amazon-emr-managed-policies": "true"
                }
            }
        },
        {
            "Sid": "AllowWorkspaceCollaboration",
            "Effect": "Allow",
            "Action": [
                "iam:GetUser",
                "iam:GetRole",
                "iam:ListUsers",
                "iam:ListRoles",
                "sso:GetManagedApplicationInstance",
                "sso-directory:SearchUsers"
            ],
            "Resource": "*"
        },
        {
            "Sid": "S3Access",
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:GetEncryptionConfiguration",
                "s3:ListBucket",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::<bucket>",
                "arn:aws:s3:::<bucket>/*"
            ]
        },
        {
            "Sid": "EMRStudioWorkspaceAccess",
            "Effect": "Allow",
            "Action": [
                "elasticmapreduce:CreateEditor",
                "elasticmapreduce:DescribeEditor",
                "elasticmapreduce:ListEditors",
                "elasticmapreduce:DeleteEditor",
                "elasticmapreduce:UpdateEditor",
                "elasticmapreduce:PutWorkspaceAccess",
                "elasticmapreduce:DeleteWorkspaceAccess",
                "elasticmapreduce:ListWorkspaceAccessIdentities",
                "elasticmapreduce:StartEditor",
                "elasticmapreduce:StopEditor",
                "elasticmapreduce:OpenEditorInConsole",
                "elasticmapreduce:AttachEditor",
                "elasticmapreduce:DetachEditor",
                "elasticmapreduce:ListInstanceGroups",
                "elasticmapreduce:ListBootstrapActions",
                "servicecatalog:SearchProducts",
                "servicecatalog:DescribeProduct",
                "servicecatalog:DescribeProductView",
                "servicecatalog:DescribeProvisioningParameters",
                "servicecatalog:ProvisionProduct",
                "servicecatalog:UpdateProvisionedProduct",
                "servicecatalog:ListProvisioningArtifacts",
                "servicecatalog:DescribeRecord",
                "servicecatalog:ListLaunchPaths",
                "elasticmapreduce:RunJobFlow",      
                "elasticmapreduce:ListClusters",
                "elasticmapreduce:DescribeCluster",
                "codewhisperer:GenerateRecommendations",
                "athena:StartQueryExecution",
                "athena:StopQueryExecution",
                "athena:GetQueryExecution",
                "athena:GetQueryRuntimeStatistics",
                "athena:GetQueryResults",
                "athena:ListQueryExecutions",
                "athena:BatchGetQueryExecution",
                "athena:GetNamedQuery",
                "athena:ListNamedQueries",
                "athena:BatchGetNamedQuery",
                "athena:UpdateNamedQuery",
                "athena:DeleteNamedQuery",
                "athena:ListDataCatalogs",
                "athena:GetDataCatalog",
                "athena:ListDatabases",
                "athena:GetDatabase",
                "athena:ListTableMetadata",
                "athena:GetTableMetadata",
                "athena:ListWorkGroups",
                "athena:GetWorkGroup",
                "athena:CreateNamedQuery",
                "athena:GetPreparedStatement",
                "glue:CreateDatabase",
                "glue:DeleteDatabase",
                "glue:GetDatabase",
                "glue:GetDatabases",
                "glue:UpdateDatabase",
                "glue:CreateTable",
                "glue:DeleteTable",
                "glue:BatchDeleteTable",
                "glue:UpdateTable",
                "glue:GetTable",
                "glue:GetTables",
                "glue:BatchCreatePartition",
                "glue:CreatePartition",
                "glue:DeletePartition",
                "glue:BatchDeletePartition",
                "glue:UpdatePartition",
                "glue:GetPartition",
                "glue:GetPartitions",
                "glue:BatchGetPartition",
                "kms:ListAliases",
                "kms:ListKeys",
                "kms:DescribeKey",
                "lakeformation:GetDataAccess",
                "s3:GetBucketLocation",
                "s3:GetObject",
                "s3:ListBucket",
                "s3:ListBucketMultipartUploads",
                "s3:ListMultipartUploadParts",
                "s3:AbortMultipartUpload",
                "s3:PutObject",
                "s3:PutBucketPublicAccessBlock",
                "s3:ListAllMyBuckets",
                "elasticmapreduce:ListStudios",
                "elasticmapreduce:DescribeStudio",
                "cloudformation:GetTemplate",
                "cloudformation:CreateStack",
                "cloudformation:CreateStackSet",
                "cloudformation:DeleteStack",
                "cloudformation:GetTemplateSummary",
                "cloudformation:ValidateTemplate",
                "cloudformation:ListStacks",
                "cloudformation:ListStackSets",
                "elasticmapreduce:AddTags",
                "ec2:CreateNetworkInterface",
                "elasticmapreduce:GetClusterSessionCredentials",
                "elasticmapreduce:GetOnClusterAppUIPresignedURL",
                "cloudformation:DescribeStackResources"
            ],
            "Resource": [
                "*"
            ]
        },
        {
            "Sid": "AllowPassingServiceRoleForWorkspaceCreation",
            "Action": "iam:PassRole",
            "Resource": [
                "arn:aws:iam::*:role/<Studio Role>",
                "arn:aws:iam::*:role/<EMR Service Role>",
                "arn:aws:iam::*:role/<EMR Instance Profile Role>"
            ],
            "Effect": "Allow"
        },
{
			"Sid": "Statement1",
			"Effect": "Allow",
			"Action": [
				"iam:PassRole"
			],
			"Resource": [
				"arn:aws:iam::*:role/<EMR Instance Profile Role>"
			]
		}
    ]
}

Simplify access management with Amazon Redshift and AWS Lake Formation for users in an External Identity Provider

2024-02-15 Harshida Patel

Post Syndicated from Harshida Patel original https://aws.amazon.com/blogs/big-data/simplify-access-management-with-amazon-redshift-and-aws-lake-formation-for-users-in-an-external-identity-provider/

Many organizations use identity providers (IdPs) to authenticate users, manage their attributes, and group memberships for secure, efficient, and centralized identity management. You might be modernizing your data architecture using Amazon Redshift to enable access to your data lake and data in your data warehouse, and are looking for a centralized and scalable way to define and manage the data access based on IdP identities. AWS Lake Formation makes it straightforward to centrally govern, secure, and globally share data for analytics and machine learning (ML). Currently, you may have to map user identities and groups to AWS Identity and Access Management (IAM) roles, and data access permissions are defined at the IAM role level within Lake Formation. This setup is not efficient because setting up and maintaining IdP groups with IAM role mapping as new groups are created is time consuming and it makes it difficult to derive what data was accessed from which service at that time.

Amazon Redshift, Amazon QuickSight, and Lake Formation now integrate with the new trusted identity propagation capability in AWS IAM Identity Center to authenticate users seamlessly across services. In this post, we discuss two use cases to configure trusted identity propagation with Amazon Redshift and Lake Formation.

Solution overview

Trusted identity propagation provides a new authentication option for organizations that want to centralize data permissions management and authorize requests based on their IdP identity across service boundaries. With IAM Identity Center, you can configure an existing IdP to manage users and groups and use Lake Formation to define fine-grained access control permissions on catalog resources for these IdP identities. Amazon Redshift supports identity propagation when querying data with Amazon Redshift Spectrum and with Amazon Redshift Data Sharing, and you can use AWS CloudTrail to audit data access by IdP identities to help your organization meet their regulatory and compliance requirements.

With this new capability, users can connect to Amazon Redshift from QuickSight with a single sign-on experience and create direct query datasets. This is enabled by using IAM Identity Center as a shared identity source. With trusted identity propagation, when QuickSight assets like dashboards are shared with other users, the database permissions of each QuickSight user are applied by propagating their end-user identity from QuickSight to Amazon Redshift and enforcing their individual data permissions. Depending on the use case, the author can apply additional row-level and column-level security in QuickSight.

The following diagram illustrates an example of the solution architecture.

In this post, we walk through how to configure trusted identity propagation with Amazon Redshift and Lake Formation. We cover the following use cases:

Redshift Spectrum with Lake Formation
Redshift data sharing with Lake Formation

Prerequisites

This walkthrough assumes you have set up a Lake Formation administrator role or a similar role to follow along with the instructions in this post. To learn more about setting up permissions for a data lake administrator, see Create a data lake administrator.

Additionally, you must create the following resources as detailed in Integrate Okta with Amazon Redshift Query Editor V2 using AWS IAM Identity Center for seamless Single Sign-On:

An Okta account integrated with IAM Identity Center to sync users and groups
A Redshift managed application with IAM Identity Center
A Redshift source cluster with IAM Identity Center integration enabled
A Redshift target cluster with IAM Identity Center integration enabled (you can skip the section to set up Amazon Redshift role-based access)
Users and groups from IAM Identity Center assigned to the Redshift application
A permission set assigned to AWS accounts to enable Redshift Query Editor v2 access

Add the below permission to the IAM role used in Redshift managed application for integration with IAM Identity Center.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "lakeformation:GetDataAccess",
                "glue:GetTable",
                "glue:GetTables",
                "glue:SearchTables",
                "glue:GetDatabase",
                "glue:GetDatabases",
                "glue:GetPartitions",
                "lakeformation:GetResourceLFTags",
                "lakeformation:ListLFTags",
                "lakeformation:GetLFTag",
                "lakeformation:SearchTablesByLFTags",
                "lakeformation:SearchDatabasesByLFTags"
           ],
            "Resource": "*"
        }
    ]
}

Use case 1: Redshift Spectrum with Lake Formation

This use case assumes you have the following prerequisites:

- To store the data, you need an Amazon Simple Storage Service (Amazon S3) bucket.
- To run AWS Command Line Interface (AWS CLI) commands, you need to set up AWS CloudShell in your account or the AWS CLI on your workstation. For instructions, refer to Getting started with AWS CloudShell or Set up the AWS CLI, respectively.
- You can use an existing AWS Glue database and table in your account or complete the following steps logged in as an admin to set up these resources.

Log in to the AWS Management Console as an IAM administrator.
Go to CloudShell or your AWS CLI and run the following AWS CLI command, providing your bucket name to copy the data:

aws s3 sync s3://redshift-demos/data/NY-Pub/ s3://<bucketname>/data/NY-Pub/

In this post, we use an AWS Glue crawler to create the external table ny_pub stored in Apache Parquet format in the Amazon S3 location s3://<bucketname>/data/NY-Pub/. In the next step, we create the solution resources using AWS CloudFormation to create a stack named CrawlS3Source-NYTaxiData in us-east-1.

Download the .yml file or launch the CloudFormation stack.

The stack creates the following resources:

The crawler NYTaxiCrawler along with the new IAM role AWSGlueServiceRole-RedshiftAutoMount
The AWS Glue database automountdb

When the stack is complete, continue with the following steps to finish setting up your resources:

On the AWS Glue console, under Data Catalog in the navigation pane, choose Crawlers.
Open NYTaxiCrawler and choose Edit.

Under Choose data sources and classifiers, choose Edit.

For Data source, choose S3.
For S3 path, enter s3://<bucketname>/data/NY-Pub/.
Choose Update S3 data source.

Choose Next and choose Update.
Choose Run crawler.

After the crawler is complete, you can see a new table called ny_pub in the Data Catalog under the automountdb database.

After you create the resources, complete the steps in the next sections to set up Lake Formation permissions on the AWS Glue table ny_pub for the sales IdP group and access them via Redshift Spectrum.

Enable Lake Formation propagation for the Redshift managed application

Complete the following steps to enable Lake Formation propagation for the Redshift managed application created in Integrate Okta with Amazon Redshift Query Editor V2 using AWS IAM Identity Center for seamless Single Sign-On:

Log in to the console as admin.
On the Amazon Redshift console, choose IAM Identity Center connection in the navigation pane.
Select the managed application that starts with redshift-iad and choose Edit.

Select Enable AWS Lake Formation access grants under Trusted identity propagation and save your changes.

Set up Lake Formation as an IAM Identity Center application

Complete the following steps to set up Lake Formation as an IAM Identity Center application:

On the Lake Formation console, under Administration in the navigation pane, choose IAM Identity Center integration.

Review the options and choose Submit to enable Lake Formation integration.

The integration status will update to Success.
Alternatively, you can run the following command:

aws lakeformation create-lake-formation-identity-center-configuration 
--cli-input-json '{"CatalogId": "<catalog_id>","InstanceArn": "<identitycenter_arn>"}'

Register the data with Lake Formation

In this section, we register the data with Lake Formation. Complete the following steps:

On the Lake Formation console, under Administration in the navigation pane, choose Data lake locations.
Choose Register location.
For Amazon S3 path, enter the bucket where the table data resides (s3://<bucketname>/data/NY-Pub/).
For IAM role, choose a Lake Formation user-defined role. For more information, refer to Requirements for roles used to register locations.
For Permission mode, select Lake Formation.
Choose Register location.

Next, verify that the IAMAllowedPrincipal group doesn’t have permission on the database.

On the Lake Formation console, under Data catalog in the navigation pane, choose Databases.
Select automountdb and on the Actions menu, choose View permissions.
If IAMAllowedPrincipal is listed, select the principal and choose Revoke.
Repeat these steps to verify permissions for the table ny_pub.

Grant the IAM Identity Center group permissions on the AWS Glue database and table

Complete the following steps to grant database permissions to the IAM Identity Center group:

On the Lake Formation console, under Data catalog in the navigation pane, choose Databases.
Select the database automountdb and on the Actions menu, choose Grant.
Choose Grant database.
Under Principals, select IAM Identity Center and choose Add.
In the pop-up window, if this is the first time assigning users and groups, choose Get started.
Enter the IAM Identity Center group in the search bar and choose the group.
Choose Assign.
Under LF-Tags or catalog resources, automountdb is already selected for Databases.
Select Describe for Database permissions.
Choose Grant to apply the permissions.

Alternatively, you can run the following command:

aws lakeformation grant-permissions --cli-input-json '
{
    "Principal": {
        "DataLakePrincipalIdentifier": "arn:aws:identitystore:::group/<identitycenter_group_name>"
    },
    "Resource": {
        "Database": {
            "Name": "automountdb"
        }
    },
    "Permissions": [
        "DESCRIBE"
    ]
}'

Next, you grant table permissions to the IAM Identity Center group.

Under Data catalog in the navigation pane, choose Databases.
Select the database automountdb and on the Actions menu, choose Grant.
Under Principals, select IAM Identity Center and choose Add.
Enter the IAM Identity Center group in the search bar and choose the group.
Choose Assign.
Under LF-Tags or catalog resources, automountdb is already selected for Databases.
For Tables, choose ny_pub.
Select Describe and Select for Table permissions.
Choose Grant to apply the permissions.

Alternatively, you can run the following command:

aws lakeformation grant-permissions --cli-input-json '
{
    "Principal": {
        "DataLakePrincipalIdentifier": "arn:aws:identitystore:::group/<identitycenter_group_name>"
    },
    "Resource": {
        "Table": {
            "DatabaseName": "automountdb",
            "Name": "ny_pub "
        }
    },
    "Permissions": [
        "SELECT",
        "DESCRIBE"

    ]
}'

Set up Redshift Spectrum table access for the IAM Identity Center group

Complete the following steps to set up Redshift Spectrum table access:

Sign in to the Amazon Redshift console using the admin role.
Navigate to Query Editor v2.
Choose the options menu (three dots) next to the cluster and choose Create connection.

Connect as the admin user and run the following commands to make the ny_pub data in the S3 data lake available to the sales group:

create external schema if not exists nyc_external_schema from DATA CATALOG database 'automountdb' catalog_id '<accountid>'; 
grant usage on schema nyc_external_schema to role "awsidc:awssso-sales"; 
grant select on all tables in schema nyc_external_schema to role "awsidc:awssso- sales";

Validate Redshift Spectrum access as an IAM Identity Center user

Complete the following steps to validate access:

On the Amazon Redshift console, navigate to Query Editor v2.
Choose the options menu (three dots) next to the cluster and choose Create connection
Choose select IAM Identity Center option for Connect option. Provide Okta user name and password in the browser pop-up.
Once connected as a federated user, run the following SQL commands to query the ny_pub data lake table:

select * from nyc_external_schema.ny_pub;

Use Case 2: Redshift data sharing with Lake Formation

This use case assumes you have IAM Identity Center integration with Amazon Redshift set up, with Lake Formation propagation enabled as per the instructions provided in the previous section.

Create a data share with objects and share it with the Data Catalog

Complete the following steps to create a data share:

Sign in to the Amazon Redshift console using the admin role.
Navigate to Query Editor v2.
Choose the options menu (three dots) next to the Redshift source cluster and choose Create connection.

Connect as admin user using Temporarily credentials using a database user name option and run the following SQL commands to create a data share:

CREATE DATASHARE salesds; 
ALTER DATASHARE salesds ADD SCHEMA sales_schema; 
ALTER DATASHARE salesds ADD TABLE store_sales; 
GRANT USAGE ON DATASHARE salesds TO ACCOUNT ‘<accountid>’ via DATA CATALOG;

Authorize the data share by choosing Data shares in the navigation page and selecting the data share salesdb.
Select the data share and choose Authorize.

Now you can register the data share in Lake Formation as an AWS Glue database.

Sign in to the Lake Formation console as the data lake administrator IAM user or role.
Under Data catalog in the navigation pane, choose Data sharing and view the Redshift data share invitations on the Configuration tab.
Select the datashare salesds and choose Review Invitation.
Once you review the details choose Accept.
Provide a name for the AWS Glue database (for example, salesds) and choose Skip to Review and create.

After the AWS Glue database is created on the Redshift data share, you can view it under Shared databases.

Grant the IAM Identity Center user group permission on the AWS Glue database and table

Complete the following steps to grant database permissions to the IAM Identity Center group:

On the Lake Formation console, under Data catalog in the navigation pane, choose Databases.
Select the database salesds and on the Actions menu, choose Grant.
Choose Grant database.
Under Principals, select IAM Identity Center and choose Add.
In the pop-up window, enter the IAM Identity Center group awssso in the search bar and choose the awssso-sales group.
Choose Assign.
Under LF-Tags or catalog resources, salesds is already selected for Databases.
Select Describe for Database permissions.
Choose Grant to apply the permissions.

Next, grant table permissions to the IAM Identity Center group.

Under Data catalog in the navigation pane, choose Databases.
Select the database salesds and on the Actions menu, choose Grant.
Under Principals, select IAM Identity Center and choose Add.
In the pop-up window, enter the IAM Identity Center group awssso in the search bar and choose the awssso-sales group.
Choose Assign.
Under LF-Tags or catalog resources, salesds is already selected for Databases.
For Tables, choose sales_schema.store_sales.
Select Describe and Select for Table permissions.
Choose Grant to apply the permissions.

Mount the external schema in the target Redshift cluster and enable access for the IAM Identity Center user

Complete the following steps:

Sign in to the Amazon Redshift console using the admin role.
Navigate to Query Editor v2.
Connect as an admin user and run the following SQL commands to mount the AWS Glue database customerds as an external schema and enable access to the sales group:

create external schema if not exists sales_datashare_schema from DATA CATALOG database salesds catalog_id '<accountid>';
create role "awsidc:awssso-sales"; # If the role was not already created 
grant usage on schema sales_datashare_schema to role "awsidc:awssso-sales";
grant select on all tables in schema sales_datashare_schema to role "awsidc:awssso- sales";

Access Redshift data shares as an IAM Identity Center user

Complete the following steps to access the data shares:

On the Amazon Redshift console, navigate to Query Editor v2.
Choose the options menu (three dots) next to the cluster and choose Create connection.
Connect with IAM Identity Center and the provide IAM Identity Center user and password in the browser login.
Run the following SQL commands to query the data lake table:

SELECT * FROM "dev"."sales_datashare_schema"."sales_schema.store_sales";

With Transitive Identity Propagation we can now audit user access to dataset from Lake Formation dashboard and service used for accessing the dataset providing complete trackability. For federated user Ethan whose Identity Center User ID is ‘459e10f6-a3d0-47ae-bc8d-a66f8b054014’ you can see the below event log.

"eventSource": "lakeformation.amazonaws.com",
    "eventName": "GetDataAccess",
    "awsRegion": "us-east-1",
    "sourceIPAddress": "redshift.amazonaws.com",
    "userAgent": "redshift.amazonaws.com",
    "requestParameters": {
        "tableArn": "arn:aws:glue:us-east-1:xxxx:table/automountdb/ny_pub",
        "durationSeconds": 3600,
        "auditContext": {
            "additionalAuditContext": "{\"invokedBy\":\"arn:aws:redshift:us-east-1:xxxx:dbuser:redshift-consumer/awsidc:[email protected]\", \"transactionId\":\"961953\", \"queryId\":\"613842\", \"isConcurrencyScalingQuery\":\"false\"}"
        },
        "cellLevelSecurityEnforced": true
    },
    "responseElements": null,
    "additionalEventData": {
        "requesterService": "REDSHIFT",
        "LakeFormationTrustedCallerInvocation": "true",
        "lakeFormationPrincipal": "arn:aws:identitystore:::user/459e10f6-a3d0-47ae-bc8d-a66f8b054014",
        "lakeFormationRoleSessionName": "AWSLF-00-RE-726034267621-K7FUMxovuq"
    }

Clean up

Complete the following steps to clean up your resources:

Delete the data from the S3 bucket.
Delete the Lake Formation application and the Redshift provisioned cluster that you created for testing.
Sign in to the CloudFormation console as the IAM admin used for creating the CloudFormation stack, and delete the stack you created.

Conclusion

In this post, we covered how to simplify access management for analytics by propagating user identity across Amazon Redshift and Lake Formation using IAM Identity Center. We learned how to get started with trusted identity propagation by connecting to Amazon Redshift and Lake Formation. We also learned how to configure Redshift Spectrum and data sharing to support trusted identity propagation.

Learn more about IAM Identity Center with Amazon Redshift and AWS Lake Formation. Leave your questions and feedback in the comments section.

About the Authors

Harshida Patel is a Analytics Specialist Principal Solutions Architect, with AWS.

Srividya Parthasarathy is a Senior Big Data Architect on the AWS Lake Formation team. She enjoys building data mesh solutions and sharing them with the community.

Build SAML identity federation for Amazon OpenSearch Service domains within a VPC

2024-02-08 Mahdi Ebrahimi

Post Syndicated from Mahdi Ebrahimi original https://aws.amazon.com/blogs/big-data/build-saml-identity-federation-for-amazon-opensearch-service-domains-within-a-vpc/

Amazon OpenSearch Service is a fully managed search and analytics service powered by the Apache Lucene search library that can be operated within a virtual private cloud (VPC). A VPC is a virtual network that’s dedicated to your AWS account. It’s logically isolated from other virtual networks in the AWS Cloud. Placing an OpenSearch Service domain within a VPC enables a secure communication between OpenSearch Service and other services within the VPC without the need for an internet gateway, NAT device, or a VPN connection. All traffic remains securely within the AWS Cloud, providing a safe environment for your data. To connect to an OpenSearch Service domain running inside a private VPC, enterprise customers use one of two available options: either integrate their VPC with their enterprise network through VPN or AWS Direct Connect, or make the cluster endpoint publicly accessible through a reverse proxy. Refer to How can I access OpenSearch Dashboards from outside of a VPC using Amazon Cognito authentication for a detailed evaluation of the available options and the corresponding pros and cons.

For managing access to OpenSearch Dashboards in enterprise customers’ environments, OpenSearch Service supports Security Assertion Markup Language (SAML) integration with the customer’s existing identity providers (IdPs) to offer single sign-on (SSO). Although SAML integration for publicly accessible OpenSearch Dashboards works out of the box, enabling SAML for OpenSearch Dashboards within a VPC requires careful design with various configurations.

This post outlines an end-to-end solution for integrating SAML authentication for OpenSearch Service domains running in a VPC. It provides a step-by-step deployment guideline and is accompanied by AWS Cloud Development Kit (AWS CDK) applications, which automate all the necessary configurations.

Overview of solution

The following diagram describes the step-by-step authentication flow for accessing a private OpenSearch Service domain through SSO using SAML identity federation. The access is enabled over public internet through private NGINX reverse proxy servers running on Amazon Elastic Container Service (Amazon ECS) for high availability.

The workflow consists of the following steps:

The user navigates to the OpenSearch Dashboards URL in their browser.
The browser resolves the domain IP address and sends the request.
AWS WAF rules make sure that only allow listed IP address ranges are allowed.
Application Load Balancer forwards the request to NGINX reverse proxy.
NGINX adds the necessary headers and forwards the request to OpenSearch Dashboards.
OpenSearch Dashboards detects that the request is not authenticated. It replies with a redirect to the integrated SAML IdP for authentication.
The user is redirected to the SSO login page.
The IdP verifies the user’s identity and generates a SAML assertion token.
The user is redirected back to the OpenSearch Dashboards URL.
The request goes through the Steps 1–5 again until it reaches OpenSearch. This time, OpenSearch Dashboards detects the accompanying SAML assertion and allows the request.

In the following sections, we set up a NGINX reverse proxy in private subnets to provide access to OpenSearch Dashboards for a domain deployed inside VPC private subnets. We then enable SAML authentication for OpenSearch Dashboards using a SAML 2.0 application and use a custom domain endpoint to access OpenSearch Dashboards to see the SAML authentication in action.

Prerequisites

Before you get started, complete the prerequisite steps in this section.

Install required tools

First, install the AWS CDK. For more information, refer to the AWS CDK v2 Developer Guide.

Prepare required AWS resources

Complete the following steps to set up your AWS resources:

Create an AWS account.
Create an Amazon Route 53 public hosted zone such as mydomain.com to be used for routing internet traffic to your domain. For instructions, refer to Creating a public hosted zone.
Request an AWS Certificate Manager (ACM) public certificate for the hosted zone. For instructions, refer to Requesting a public certificate.
Create a VPC with public and private subnets.
Enable AWS IAM Identity Center. For instructions, refer to Enable IAM Identity Center.

Prepare your OpenSearch Service cluster

This post is accompanied with a standalone AWS CDK application (opensearch-domain) that deploys a sample OpenSearch Service domain in private VPC subnets. The deployed domain is for demonstration purposes only, and is optional.

If you have an existing OpenSearch Service domain in VPC that you want to use for SAML integration, apply the following configurations:

On the Cluster configuration tab, choose Edit and select Enable custom endpoint in the Custom endpoint section.
For Custom hostname, enter a fully qualified domain name (FQDN) such as opensearch.mydomain.com, which you want to use to access your cluster. Note that the domain name of the provided FQDN (for example, mydomain.com) must be the same as the public hosted zone you created earlier.
For AWS certificate, choose the SSL certificate you created earlier.
In the Summary section, optionally enable dry run analysis and select Dry run or deselect it and choose Save changes.

Otherwise, download the accompanied opensearch-domain AWS CDK application and unzip it. Then, edit the cdk.json file on the root of the unzipped folder and configure the required parameters:

vpc_cidr – The CIDR block in which to create the VPC. You may leave the default of 10.0.0.0/16.
opensearch_cluster_name – The name of the OpenSearch Service cluster. You may leave the default value of opensearch. It will also be used, together with the hosted_zone_name parameter, to build the FQDN of the custom domain URL.
hosted_zone_id – The Route 53 public hosted zone ID.
hosted_zone_name – The Route 53 public hosted zone name (for example, mydomain.com). The result FQDN with the default example values will then be opensearch.mydomain.com.

Finally, run the following commands to deploy the AWS CDK application:

cd opensearch-domain

# Create a Python environment and install the reuired dependencies
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements-dev.txt
pip install -r requirements.txt

# Deploy the CDK application
cdk deploy

With the prerequisites in place, refer to the following sections for a step-by-step guide to deploy this solution.

Create a SAML 2.0 application

We use IAM Identity Center as the source of identity for our SAML integration. The same configuration should apply to other SAML 2.0-compliant IdPs. Consult your IdP documentation.

On the IAM Identity Center console, choose Groups in the navigation pane.
Create a new group called Opensearch Admin, and add users to it.
This will be the SAML group that receives full permissions in OpenSearch Dashboards. Take note of the group ID.
Choose Applications in the navigation pane.
Create a new custom SAML 2.0 application.
Download the IAM Identity Center SAML metadata file to use in a later step.
For Application start URL, enter [Custom Domain URL]/_dashboards/.
The custom domain URL is composed of communication protocol (https://) followed by the FQDN, which you used for your OpenSearch Service cluster in the prerequisites (for example, https://opensearch.mydomain.com). Look under your OpenSearch Service cluster configurations, if in doubt.
For Application ACS URL, enter [Custom Domain URL]/_dashboards/_opendistro/_security/saml/acs.
For Application SAML audience, enter [Custom Domain URL] (without any trailing slash).
Choose Submit.
In the Assigned users section, select Opensearch Admin and choose Assign Users.
On the Actions menu, choose Edit attribute mappings.
Define attribute mappings as shown in the following screenshot and choose Save changes.

Deploy the AWS CDK application

Complete the following steps to deploy the AWS CDK application:

Download and unzip the opensearch-domain-saml-integration AWS CDK application.

Add your private SSL key and certificate to AWS Secrets Manager and create two secrets called Key and Crt. For example, see the following code:

KEY=$(cat private.key | base64) && aws secretsmanager create-secret --name Key --secret-string $KEY
CRT=$(cat certificate.crt | base64) && aws secretsmanager create-secret --name Crt --secret-string $CRT

You can use the following command to generate a self-signed certificate. This is for testing only; do not use this for production environments.

openssl req -new -newkey rsa:4096 -days 1095 -nodes -x509 -subj '/' -keyout private.key -out certificate.crt

Edit the cdk.json file and set the required parameters inside the nested config object:

aws_region – The target AWS Region for your deployment (for example, eu-central-1).
vpc_id – The ID of the VPC into which the OpenSearch Service domain has been deployed.
opensearch_cluster_security_group_id – The ID of the security group used by the OpenSearch Service domain or any other security group that allows inbound connections to that domain on port 80 and 443. This group ID will be used by the Application Load Balancer to forward traffic to your OpenSearch Service domain.
hosted_zone_id – The Route 53 public hosted zone ID.
hosted_zone – The Route 53 public hosted zone name (for example, mydomain.com).
opensearch_custom_domain_name – An FQDN such as opensearch.mydomain.com, which you want to use to access your cluster. Note that the domain name of the provided FQDN (mydomain.com) must be the same as the hosted_zone parameter.
opensearch_custom_domain_certificate_arn – The ARN of the certificate stored in ACM.
opensearch_domain_endpoint – The OpenSearch Service VPC domain endpoint (for example, vpc-opensearch-abc123.eu-central-1.es.amazonaws.com).
vpc_dns_resolver – This must be 10.0.0. if your VPC CIDR is 10.0.0.0/16. See Amazon DNS server for further details.
alb_waf_ip_whitelist_cidrs – This is an optional list of zero or more IP CIDR ranges that will be automatically allow listed in AWS WAF to permit access to the OpenSearch Service domain. If not specified, after the deployment you will need to manually add relevant IP CIDR ranges to the AWS WAF IP set to allow access. For example, ["1.2.3.4/32", "5.6.7.0/24"].

Deploy the OpenSearch Service domain SAML integration AWS CDK application:

cd opensearch-domain-saml-integration

# Create a Python environment and install the required dependencies
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements-dev.txt
pip install -r requirements.txt

# Deploy the CDK application
cdk deploy

Enable SAML authentication for your OpenSearch Service cluster

When the application deployment is complete, enable SAML authentication for your cluster:

On the OpenSearch Service console, navigate to your domain.
On the Security configuration tab, choose Edit.
Select Enable SAML authentication.
Choose Import from XML file and import the IAM Identity Center SAML metadata file that you downloaded in an earlier step.
For SAML master backend role, use the group ID you saved earlier.
Expand the Additional settings section and for Roles, enter the SAML 2.0 attribute name you mapped earlier when you created the SAML 2.0 application in AWS Identity Center.
Configure the domain access policy for SAML integration.
Submit changes and wait for OpenSearch Service to apply the configurations before proceeding to the next section.

Test the solution

Complete the following steps to see the solution in action:

On the IAM Identity Center console, choose Dashboard in the navigation pane.
In the Settings summary section, choose the link under AWS access portal URL.
Sign in with your user name and password (register your password if this is your first login).
If your account was successfully added to the admin group, a SAML application logo is visible.
Choose Custom SAML 2.0 application to be redirected to the OpenSearch Service dashboards through SSO without any additional login attempts.
Alternatively, you could skip logging in to the access portal and directly point your browser to the OpenSearch Dashboards URL. In that case, OpenSearch Dashboards would first redirect you to the access portal to log in, which would redirect you back to the OpenSearch Dashboards UI after a successful login, resulting in the same outcome as shown in the following screenshot.

Troubleshooting

Your public-facing IP must be allow listed by the AWS WAF rule, otherwise a 403 Forbidden error will be returned. Allow list your IP CIDR range via the AWS CDK alb_waf_ip_whitelist_cidrs property as described in the installation guide and redeploy the AWS CDK application for changes to take effect.

Clean up

When you’re finished with this configuration, clean up the resources to avoid future charges.

On the OpenSearch Service console, navigate to the Security configuration tab of your OpenSearch Service domain and choose Edit.
Deselect Enable SAML authentication and choose Save changes.
After the Amazon SAML integration is disabled, delete the opensearch-domain-saml-integration stack using cdk destroy.
Optionally, if you used the provided OpenSearch Service sample AWS CDK stack (opensearch-domain), delete it using cdk destroy.

Conclusion

OpenSearch Service allows enterprise customers to use their preferred federated IdPs such as SAML using IAM Identity Center for clusters running inside private VPC subnets following AWS best practices.

In this post, we showed you how to integrate an OpenSearch Service domain within a VPC with an existing SAML IdP for SSO access to OpenSearch Dashboards using IAM Identity Center. The provided solution securely manages network access to the resources using AWS WAF to restrict access only to authorized network segments or specific IP addresses.

To get started, refer to How can I access OpenSearch Dashboards from outside of a VPC using Amazon Cognito authentication for further comparison of OpenSearch Service domain in private VPC access patterns.

About the Authors

Mahdi Ebrahimi is a Senior Cloud Infrastructure Architect with Amazon Web Services. He excels in designing distributed, highly-available software systems. Mahdi is dedicated to delivering cutting-edge solutions that empower his customers to innovate in the rapidly evolving landscape in the automotive industry.

Dmytro Protsiv is a Cloud Applications Architect for with Amazon Web Services. He is passionate about helping customers to solve their business challenges around application modernization.

Luca Menichetti is a Big Data Architect with Amazon Web Services. He helps customers develop performant and reusable solutions to process data at scale. Luca is passioned about managing organisation’s data architecture, enabling data analytics and machine learning. Having worked around the Hadoop ecosystem for a decade, he really enjoys tackling problems in NoSQL environments.

Krithivasan Balasubramaniyan is a Principal Consultant with Amazon Web Services. He enables global enterprise customers in their digital transformation journey and helps architect cloud native solutions.

Muthu Pitchaimani is a Search Specialist with Amazon OpenSearch Service. He builds large-scale search applications and solutions. Muthu is interested in the topics of networking and security, and is based out of Austin, Texas.

Simplify workforce identity management using IAM Identity Center and trusted token issuers

2023-12-07 Roberto Migli

Post Syndicated from Roberto Migli original https://aws.amazon.com/blogs/security/simplify-workforce-identity-management-using-iam-identity-center-and-trusted-token-issuers/

AWS Identity and Access Management (IAM) roles are a powerful way to manage permissions to resources in the Amazon Web Services (AWS) Cloud. IAM roles are useful when granting permissions to users whose workloads are static. However, for users whose access patterns are more dynamic, relying on roles can add complexity for administrators who are faced with provisioning roles and making sure the right people have the right access to the right roles.

The typical solution to handle dynamic workforce access is the OAuth 2.0 framework, which you can use to propagate an authenticated user’s identity to resource services. Resource services can then manage permissions based on the user—their attributes or permissions—rather than building a complex role management system. AWS IAM Identity Center recently introduced trusted identity propagation based on OAuth 2.0 to support dynamic access patterns.

With trusted identity propagation, your requesting application obtains OAuth tokens from IAM Identity Center and passes them to an AWS resource service. The AWS resource service trusts tokens that Identity Center generates and grants permissions based on the Identity Center tokens.

What happens if the application you want to deploy uses an external OAuth authorization server, such as Okta Universal Directory or Microsoft Entra ID, but the AWS service uses IAM Identity Center? How can you use the tokens from those applications with your applications that AWS hosts?

In this blog post, we show you how you can use IAM Identity Center trusted token issuers to help address these challenges. You also review the basics of Identity Center and OAuth and how Identity Center enables the use of external OAuth authorization servers.

IAM Identity Center and OAuth

IAM Identity Center acts as a central identity service for your AWS Cloud environment. You can bring your workforce users to AWS and authenticate them from an identity provider (IdP) that’s external to AWS (such as Okta or Microsoft Entra), or you can create and authenticate the users on AWS.

Trusted identity propagation in IAM Identity Center lets AWS workforce identities use OAuth 2.0, helping applications that need to share who’s using them with AWS services. In OAuth, a client application and a resource service both trust the same authorization server. The client application gets an OAuth token for the user and sends it to the resource service. Because both services trust the OAuth server, the resource service can identify the user from the token and set permissions based on their identity.

AWS supports two OAuth patterns:

AWS applications authenticate directly with IAM Identity Center: Identity Center redirects authentication to your identity source, which generates OAuth tokens that the AWS managed application uses to access AWS services. This is the default pattern because the AWS services that support trusted identity propagation use Identity Center as their OAuth authorization server.
Third-party, non-AWS applications authenticate outside of AWS (typically to your IdP) and access AWS resources: During authentication, these third-party applications obtain an OAuth token from an OAuth authorization server outside of AWS. In this pattern, the AWS services aren’t connected to the same OAuth authorization server as the client application. To enable this pattern, AWS introduced a model called the trusted token issuer.

Trusted token issuer

When AWS services use IAM Identity Center as their authentication service, directory, and OAuth authorization server, the AWS services that use OAuth tokens require that Identity Center issues the tokens. However, most third-party applications federate with an external IdP and obtain OAuth tokens from an external authorization server. Although the identities in Identity Center and the external authorization server might be for the same person, the identities exist in separate domains, one in Identity Center, the other in the external authorization server. This is required to manage authorization of workforce identities with AWS services.

The trusted token issuer (TTI) feature provides a way to securely associate one identity from the external IdP with the other identity in IAM Identity Center.

When using third-party applications to access AWS services, there’s an external OAuth authorization server for the third-party application, and IAM Identity Center is the OAuth authorization server for AWS services; each has its own domain of users. The Identity Center TTI feature connects these two systems so that tokens from the external OAuth authorization server can be exchanged for tokens from Identity Center that AWS services can use to identify the user in the AWS domain of users. A TTI is the external OAuth authorization server that Identity Center trusts to provide tokens that third-party applications use to call AWS services, as shown in Figure 1.

Figure 1: Conceptual model using a trusted token issuer and token exchange

How the trust model and token exchange work

There are two levels of trust involved with TTIs. First, the IAM Identity Center administrator must add the TTI, which makes it possible to exchange tokens. This involves connecting Identity Center to the Open ID Connect (OIDC) discovery URL of the external OAuth authorization server and defining an attribute-based mapping between the user from the external OAuth authorization server and a corresponding user in Identity Center. Second, the applications that exchange externally generated tokens must be configured to use the TTI. There are two models for how tokens are exchanged:

Managed AWS service-driven token exchange: A third-party application uses an AWS driver or API to access a managed AWS service, such as accessing Amazon Redshift by using Amazon Redshift drivers. This works only if the managed AWS service has been designed to accept and exchange tokens. The application passes the external token to the AWS service through an API call. The AWS service then makes a call to IAM Identity Center to exchange the external token for an Identity Center token. The service uses the Identity Center token to determine who the corresponding Identity Center user is and authorizes resource access based on that identity.
Third-party application-driven token exchange: A third-party application not managed by AWS exchanges the external token for an IAM Identity Center token before calling AWS services. This is different from the first model, where the application that exchanges the token is the managed AWS service. An example is a third-party application that uses Amazon Simple Storage Service (Amazon S3) Access Grants to access S3. In this model, the third-party application obtains a token from the external OAuth authorization server and then calls Identity Center to exchange the external token for an Identity Center token. The application can then use the Identity Center token to call AWS services that use Identity Center as their OAuth authorization server. In this case, the Identity Center administrator must register the third-party application and authorize it to exchange tokens from the TTI.

TTI trust details

When using a TTI, IAM Identity Center trusts that the TTI authenticated the user and authorized them to use the AWS service. This is expressed in an identity token or access token from the external OAuth authorization server (the TTI).

These are the requirements for the external OAuth authorization server (the TTI) and the token it creates:

The token must be a signed JSON Web Token (JWT). The JWT must contain a subject (sub) claim, an audience (aud) claim, an issuer (iss), a user attribute claim, and a JWT ID (JTI) claim.
- The subject in the JWT is the authenticated user and the audience is a value that represents the AWS service that the application will use.
- The audience claim value must match the value that is configured in the application that exchanges the token.
- The issuer claim value must match the value configured in the issuer URL in the TTI.
- There must be a claim in the token that specifies a user attribute that IAM Identity Center can use to find the corresponding user in the Identity Center directory.
- The JWT token must contain the JWT ID claim. This claim is used to help prevent replay attacks. If a new token exchange is attempted after the initial exchange is complete, IAM Identity Center rejects the new exchange request.
The TTI must have an OIDC discovery URL that IAM Identity Center can use to obtain keys that it can use to verify the signature on JWTs created by your TTI. Identity Center appends the suffix /.well-known/openid-configuration to the provider URL that you configure to identify where to fetch the signature keys.

Note: Typically, the IdP that you use as your identity source for IAM Identity Center is your TTI. However, your TTI doesn’t have to be the IdP that Identity Center uses as an identity source. If the users from a TTI can be mapped to users in Identity Center, the tokens can be exchanged. You can have as many as 10 TTIs configured for a single Identity Center instance.

Details for applications that exchange tokens

Your OAuth authorization server service (the TTI) provides a way to authorize a user to access an AWS service. When a user signs in to the client application, the OAuth authorization server generates an ID token or an access token that contains the subject (the user) and an audience (the AWS services the user can access). When a third-party application accesses an AWS service, the audience must include an identifier of the AWS service. The third-party client application then passes this token to an AWS driver or an AWS service.

To use IAM Identity Center and exchange an external token from the TTI for an Identity Center token, you must configure the application that will exchange the token with Identity Center to use one or more of the TTIs. Additionally, as part of the configuration process, you specify the audience values that are expected to be used with the external OAuth token.

If the applications are managed AWS services, AWS performs most of the configuration process. For example, the Amazon Redshift administrator connects Amazon Redshift to IAM Identity Center, and then connects a specific Amazon Redshift cluster to Identity Center. The Amazon Redshift cluster exchanges the token and must be configured to do so, which is done through the Amazon Redshift administrative console or APIs and doesn’t require additional configuration.
If the applications are third-party and not managed by AWS, your IAM Identity Center administrator must register the application and configure it for token exchange. For example, suppose you create an application that obtains an OAuth token from Okta Universal Directory and calls S3 Access Grants. The Identity Center administrator must add this application as a customer managed application and must grant the application permissions to exchange tokens.

How to set up TTIs

To create new TTIs, open the IAM Identity Center console, choose Settings, and then choose Create trusted token issuer, as shown in Figure 2. In this section, I show an example of how to use the console to create a new TTI to exchange tokens with my Okta IdP, where I already created my OIDC application to use with my new IAM Identity Center application.

Figure 2: Configure the TTI in the IAM Identity Center console

TTI uses the issuer URL to discover the OpenID configuration. Because I use Okta, I can verify that my IdP discovery URL is accessible at https://{my-okta-domain}.okta.com/.well-known/openid-configuration. I can also verify that the OpenID configuration URL responds with a JSON that contains the jwks_uri attribute, which contains a URL that lists the keys that are used by my IdP to sign the JWT tokens. Trusted token issuer requires that both URLs are publicly accessible.

I then configure the attributes I want to use to map the identity of the Okta user with the user in IAM Identity Center in the Map attributes section. I can get the attributes from an OIDC identity token issued by Okta:

{
    "sub": "00u22603n2TgCxTgs5d7",
    "email": "<masked>",
    "ver": 1,
    "iss": "https://<masked>.okta.com",
    "aud": "123456nqqVBTdtk7890",
    "iat": 1699550469,
    "exp": 1699554069,
    "jti": "ID.MojsBne1SlND7tCMtZPbpiei9p-goJsOmCiHkyEhUj8",
    "amr": [
        "pwd"
    ],
    "idp": "<masked>",
    "auth_time": 1699527801,
    "at_hash": "ZFteB9l4MXc9virpYaul9A"
}

I’m requesting a token with an additional email scope, because I want to use this attribute to match against the email of my IAM Identity Center users. In most cases, your Identity Center users are synchronized with your central identity provider by using automatic provisioning with the SCIM protocol. In this case, you can use the Identity Center external ID attribute to match with oid or sub attributes. The only requirement for TTI is that those attributes create a one-to-one mapping between the two IdPs.

Now that I have created my TTI, I can associate it with my IAM Identity Center applications. As explained previously, there are two use cases. For the managed AWS service-driven token exchange use case, use the service-specific interface to do so. For example, I can use my TTI with Amazon Redshift, as shown in Figure 3:

Figure 3: Configure the TTI with Amazon Redshift

I selected Okta as the TTI to use for this integration, and I now need to configure the audclaim value that the application will use to accept the token. I can find it when creating the application from the IdP side–in this example, the value is 123456nqqVBTdtk7890, and I can obtain it by using the preceding example OIDC identity token.

I can also use the AWS Command Line Interface (AWS CLI) to configure the IAM Identity Center application with the appropriate application grants:

aws sso put-application-grant \
    --application-arn "<my-application-arn>" \
    --grant-type "urn:ietf:params:oauth:grant-type:jwt-bearer" \
    --grant '
    {
        "JwtBearer": { 
            "AuthorizedTokenIssuers": [
                {
                    "TrustedTokenIssuerArn": "<my-tti-arn>", 
                    "AuthorizedAudiences": [
                        "123456nqqVBTdtk7890"
                    ]
                 }
            ]
       }
    }'

Perform a token exchange

For AWS service-driven use cases, the token exchange between your IdP and IAM Identity Center is performed automatically by the service itself. For third-party application-driven token exchange, such as when building your own Identity Center application with S3 Access Grants, your application performs the token exchange by using the Identity Center OIDC API action CreateTokenWithIAM:

aws sso-oidc create-token-with-iam \  
    --client-id "<my-application-arn>" \ 
    --grant-type "urn:ietf:params:oauth:grant-type:jwt-bearer" \
    --assertion "<jwt-from-idp>"

This action is performed by an IAM principal, which then uses the result to interact with AWS services.

If successful, the result looks like the following:

{
    "accessToken": "<idc-access-token>",
    "tokenType": "Bearer",
    "expiresIn": 3600,
    "idToken": "<jwt-idc-identity-token>",
    "issuedTokenType": "urn:ietf:params:oauth:token-type:access_token",
    "scope": [
        "sts:identity_context",
        "openid",
        "aws"
    ]
}

The value of the scope attribute varies depending on the IAM Identity Center application that you’re interacting with, because it defines the permissions associated with the application.

You can also inspect the idToken attribute because it’s JWT-encoded:

{
    "aws:identity_store_id": "d-123456789",
    "sub": "93445892-f001-7078-8c38-7f2b978f686f",
    "aws:instance_account": "12345678912",
    "iss": "https://identitycenter.amazonaws.com/ssoins-69870e74abba8440",
    "sts:audit_context": "<sts-token>",
    "aws:identity_store_arn": "arn:aws:identitystore::12345678912:identitystore/d-996701d649",
    "aud": "20bSatbAF2kiR7lxX5Vdp2V1LWNlbnRyYWwtMQ",
    "aws:instance_arn": "arn:aws:sso:::instance/ssoins-69870e74abba8440",
    "aws:credential_id": "<masked>",
    "act": {
      "sub": "arn:aws:sso::12345678912:trustedTokenIssuer/ssoins-69870e74abba8440/c38448c2-e030-7092-0f0a-b594f83fcf82"
    },
    "aws:application_arn": "arn:aws:sso::12345678912:application/ssoins-69870e74abba8440/apl-0ed2bf0be396a325",
    "auth_time": "2023-11-10T08:00:08Z",
    "exp": 1699606808,
    "iat": 1699603208
  }

The token contains:

The AWS account and the IAM Identity Center instance and application that accepted the token exchange
The unique user ID of the user that was matched in IAM Identity Center (attribute sub)

AWS services can now use the STS Audit Context token (found in the attribute sts:audit_context) to create identity-aware sessions with the STS API. You can audit the API calls performed by the identity-aware sessions in AWS CloudTrail, by inspecting the attribute onBehalfOf within the field userIdentity. In this example, you can see an API call that was performed with an identity-aware session:

"userIdentity": {
    ...
    "onBehalfOf": {
        "userId": "93445892-f001-7078-8c38-7f2b978f686f",
        "identityStoreArn": "arn:aws:identitystore::425341151473:identitystore/d-996701d649"
    }
}

You can thus quickly filter actions that an AWS principal performs on behalf of your IAM Identity Center user.

Troubleshooting TTI

You can troubleshoot token exchange errors by verifying that:

The OpenID discovery URL is publicly accessible.
The OpenID discovery URL response conforms with the OpenID standard.
The OpenID keys URL referenced in the discovery response is publicly accessible.
The issuer URL that you configure in the TTI exactly matches the value of the iss scope that your IdP returns.
The user attribute that you configure in the TTI exists in the JWT that your IdP returns.
The user attribute value that you configure in the TTI matches exactly one existing IAM Identity Center user on the target attribute.
The aud scope exists in the token returned from your IdP and exactly matches what is configured in the requested IAM Identity Center application.
The jti claim exists in the token returned from your IdP.
If you use an IAM Identity Center application that requires user or group assignments, the matched Identity Center user is already assigned to the application or belongs to a group assigned to the application.

Note: When an IAM Identity Center application doesn’t require user or group assignments, the token exchange will succeed if the preceding conditions are met. This configuration implies that the connected AWS service requires additional security assignments. For example, Amazon Redshift administrators need to configure access to the data within Amazon Redshift. The token exchange doesn’t grant implicit access to the AWS services.

Conclusion

In this blog post, we introduced the trust token issuer feature of IAM Identity Center and what it offers, how it’s configured, and how you can use it to integrate your IdP with AWS services. You learned how to use TTI with AWS-managed applications and third-party applications by configuring the appropriate parameters. You also learned how to troubleshoot token-exchange issues and audit access through CloudTrail.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS IAM Identity Center re:Post or contact AWS Support.

Integrate Okta with Amazon Redshift Query Editor V2 using AWS IAM Identity Center for seamless Single Sign-On

2023-11-30 Maneesh Sharma

Post Syndicated from Maneesh Sharma original https://aws.amazon.com/blogs/big-data/integrate-okta-with-amazon-redshift-query-editor-v2-using-aws-iam-identity-center-for-seamless-single-sign-on/

AWS IAM Identity Center (IdC) allows you to manage single sign-on (SSO) access to all your AWS accounts and applications from a single location. We are pleased to announce that Amazon Redshift now integrates with AWS IAM Identity Center, and supports trusted identity propagation, allowing you to use third-party Identity Providers (IdP) such as Microsoft Entra ID (Azure AD), Okta, Ping, and OneLogin. This integration simplifies the authentication and authorization process for Amazon Redshift users using Query Editor V2 or Amazon Quicksight, making it easier for them to securely access your data warehouse. Additionally, this integration positions Amazon Redshift as an IdC-managed application, enabling you to use database role-based access control on your data warehouse for enhanced security.

AWS IAM Identity Center offers automatic user and group provisioning from Okta to itself by utilizing the System for Cross-domain Identity Management (SCIM) 2.0 protocol. This integration allows for seamless synchronization of information between two services, ensuring accurate and up-to-date information in AWS IAM Identity Center.

In this post, we’ll outline a comprehensive guide for setting up SSO to Amazon Redshift using integration with IdC and Okta as the Identity Provider. This guide shows how you can SSO onto Amazon Redshift for Amazon Redshift Query Editor V2 (QEV2).

Solution overview

Using IAM IdC with Amazon Redshift can benefit your organization in the following ways:

Users can connect to Amazon Redshift without requiring an administrator to set up AWS IAM roles with complex permissions.
IAM IdC integration allows mapping of IdC groups with Amazon Redshift database roles. Administrators can then assign different privileges to different roles and assigning these roles to different users, giving organizations granular control for user access.
IdC provides a central location for your users in AWS. You can create users and groups directly in IdC or connect your existing users and groups that you manage in a standards-based identity provider like Okta, Ping Identity, or Microsoft Entra ID (i.e., Azure Active Directory [AD]).
IdC directs authentication to your chosen source of truth for users and groups, and it maintains a directory of users and groups for access by Amazon Redshift.
You can share one IdC instance with multiple Amazon Redshift data warehouses with a simple auto-discovery and connect capability. This makes it fast to add clusters without the extra effort of configuring the IdC connection for each, and it ensures that all clusters and workgroups have a consistent view of users, their attributes, and groups. Note: Your organization’s IdC instance must be in the same region as the Amazon Redshift data warehouse you’re connecting to.
Because user identities are known and logged along with data access, it’s easier for you to meet compliance regulations through auditing user access in AWS CloudTrail authorizes access to data.

Amazon Redshift Query Editor V2 workflow:

End user initiates the flow using AWS access portal URL (this URL would be available on IdC dashboard console). A browser pop-up triggers and takes you to the Okta Login page where you enter Okta credentials. After successful authentication, you’ll be logged into the AWS Console as a federated user. Click on your AWS Account and choose the Amazon Redshift Query Editor V2 application. Once you federate to Query Editor V2, select the IdC authentication method.
QEv2 invokes browser flow where you re-authenticate, this time with their AWS IdC credentials. Since Okta is the IdP, you enter Okta credentials, which are already cached in browser. At this step, federation flow with IdC initiates and at the end of this flow, the Session token and Access token is available to the QEv2 console in browser as cookies.
Amazon Redshift retrieves your authorization details based on session token retrieved and fetches user’s group membership.
Upon a successful authentication, you’ll be redirected back to QEV2, but logged in as an IdC authenticated user.

This solution covers following steps:

Integrate Okta with AWS IdC to sync user and groups.
Setting up IdC integration with Amazon Redshift
Assign Users or Groups from IdC to Amazon Redshift Application.
Enable IdC integration for a new Amazon Redshift provisioned or Amazon Redshift Serverless endpoint.
Associate an IdC application with an existing provisioned or serverless data warehouse.
Configure Amazon Redshift role-based access.
Create a permission set.
Assign permission set to AWS accounts.
Federate to Redshift Query Editor V2 using IdC.
Troubleshooting

Prerequisites

An AWS account. If you don’t have one, you can sign up for one.
An Amazon Redshift Serverless data warehouse. For setup instructions, see Getting started with Amazon Redshift Serverless.
The latest Redshift Serverless JDBC SDK driver-dependent libraries (download the libraries and unzip the JDBC JAR zipped folder).
An Okta account that has an active subscription. You need an administrator role to set up the application on Okta. If you’re new to Okta, then you can sign up for a free trial or sign up for a developer account.

Walkthrough

Integrate Okta with AWS IdC to sync user and groups

Enable group and user provisioning from Okta with AWS IdC by following this documentation here.

If you see issues while syncing users and groups, then refer to this section these considerations for using automatic provisioning.

Setting up IAM IdC integration with Amazon Redshift

Amazon Redshift cluster administrator or Amazon Redshift Serverless administrator must perform steps to configure Redshift as an IdC-enabled application. This enables Amazon Redshift to discover and connect to the IdC automatically to receive sign-in and user directory services.

After this, when the Amazon Redshift administrator creates a cluster or workgroup, they can enable the new data warehouse to use IdC and its identity-management capabilities. The point of enabling Amazon Redshift as an IdC-managed application is so you can control user and group permissions from within the IdC, or from a source third-party identity provider that’s integrated with it.

When your database users sign in to an Amazon Redshift database, for example an analyst or a data scientist, it checks their groups in IdC and these are mapped to roles in Amazon Redshift. In this manner, a group can map to an Amazon Redshift database role that allows read access to a set of tables.

The following steps show how to make Amazon Redshift an AWS-managed application with IdC:

Select IAM Identity Center connection from Amazon Redshift console menu.
Choose Create application
The IAM Identity Center connection opens. Choose Next.
In IAM Identity Center integration setup section, for:
1. IAM Identity Center display name – Enter a unique name for Amazon Redshift’s IdC-managed application.
2. Managed application name – You can enter the managed Amazon Redshift application name or use the assigned value as it is.
In Connection with third-party identity providers section, for:
1. Identity Provider Namespace – Specify the unique namespace for your organization. This is typically an abbreviated version of your organization’s name. It’s added as a prefix for your IdC-managed users and roles in the Amazon Redshift database.
In IAM role for IAM Identity Center access – Select an IAM role to use. You can create a new IAM role if you don’t have an existing one. The specific policy permissions required are the following:

- sso:DescribeApplication – Required to create an identity provider (IdP) entry in the catalog.
- sso:DescribeInstance – Used to manually create IdP federated roles or users.
- redshift:DescribeQev2IdcApplications – Used to detect capability for IDC authentication from Redshift Query Editor V2.

The following screenshot is from the IAM role:

We won’t enable Trusted identity propagation because we are not integrating with AWS Lake Formation in this post.
Choose Next.
In Configure client connections that use third-party IdPs section, choose Yes if you want to connect Amazon Redshift with a third-party application. Otherwise, choose No. For this post we chose No because we’ll be integrating only with Amazon Redshift Query Editor V2.
Choose Next.
In the Review and create application section, review all the details you have entered before and choose Create application.

After the Amazon Redshift administrator finishes the steps and saves the configuration, the IdC properties appear in the Redshift Console. Completing these tasks makes Redshift an IdC-enabled application.

After you select the managed application name, the properties in the console includes the integration status. It says Success when it’s completed. This status indicates if IdC configuration is completed.

Assigning users or groups from IdC to Amazon Redshift application

In this step, Users or groups synced to your IdC directory are available to assign to your application where the Amazon Redshift administrator can decide which users or groups from IDC need to be included as part of Amazon Redshift application.

For example, if you have total 20 groups from your IdC and you don’t want all the groups to include as part of Amazon Redshift application, then you have options to choose which IdC groups to include as part of Amazon Redshift-enabled IdC application. Later, you can create two Redshift database roles as part of IDC integration in Amazon Redshift.

The following steps assign groups to Amazon Redshift-enabled IdC application:

On IAM Identity Center properties in the Amazon Redshift Console, select Assign under Groups tab.
If this is the first time you’re assigning groups, then you’ll see a notification. Select Get started.
Enter which groups you want to synchronize in the application. In this example, we chose the groups wssso-sales and awssso-finance.
Choose Done.

Enabling IdC integration for a new Amazon Redshift provisioned cluster or Amazon Redshift Serverless

After completing steps under section (Setting up IAM Identity Center integration with Amazon Redshift) — Amazon Redshift database administrator needs to configure new Redshift resources to work in alignment with IdC to make sign-in and data access easier. This is performed as part of the steps to create a provisioned cluster or a Serverless workgroup. Anyone with permissions to create Amazon Redshift resources can perform these IdC integration tasks. When you create a provisioned cluster, you start by choosing Create Cluster in the Amazon Redshift Console.

Choose Enable for the cluster (recommended) in the section for IAM Identity Center connection in the create-cluster steps.
From the drop down, choose the redshift application which you created in above steps.

Note that when a new data warehouse is created, the IAM role specified for IdC integration is automatically attached to the provisioned cluster or Serverless Namespace. After you finish entering the required cluster metadata and create the resource, you can check the status for IdC integration in the properties.

Associating an IdC application with an existing provisioned cluster or Serverless endpoint

If you have an existing provisioned cluster or serverless workgroup that you would like to enable for IdC integration, then you can do that by running a SQL command. You run the following command to enable integration. It’s required that a database administrator run the query.

CREATE IDENTITY PROVIDER "<idc_application_name>" TYPE AWSIDC
NAMESPACE '<idc_namespace_name>'
APPLICATION_ARN '<idc_application_arn>'
IAM_ROLE '<IAM role for IAM Identity Center access>';

Example:

CREATE IDENTITY PROVIDER "redshift-idc-app" TYPE AWSIDC
NAMESPACE 'awsidc'
APPLICATION_ARN 'arn:aws:sso::123456789012:application/ssoins-12345f67fe123d4/apl-a0b0a12dc123b1a4'
IAM_ROLE 'arn:aws:iam::123456789012:role/EspressoRedshiftRole';

To alter the IdP, use the following command (this new set of parameter values completely replaces the current values):

ALTER IDENTITY PROVIDER "<idp_name>"
NAMESPACE '<idc_namespace_name>'|
IAM_ROLE default | 'arn:aws:iam::<AWS account-id-1>:role/<role-name>';

Few of the examples are:

ALTER IDENTITY PROVIDER "redshift-idc-app"
NAMESPACE 'awsidctest';

ALTER IDENTITY PROVIDER "redshift-idc-app"
IAM_ROLE 'arn:aws:iam::123456789012:role/administratoraccess';

Note: If you update the idc-namespace value, then all the new cluster created afterwards will be using the updated namespace.

For existing clusters or serverless workgroups, you need to update the namespace manually on each Amazon Redshift cluster using the previous command. Also, all the database roles associated with identity provider will be updated with new namespace value.

You can disable or enable the identity provider using the following command:

ALTER IDENTITY PROVIDER <provider_name> disable|enable;

Example:

ALTER IDENTITY PROVIDER <provider_name> disable;

You can drop an existing identity provider. The following example shows how CASCADE deletes users and roles attached to the identity provider.

DROP IDENTITY PROVIDER <provider_name> [ CASCADE ]

Configure Amazon Redshift role-based access

In this step, we pre-create the database roles in Amazon Redshift based on the groups that you synced in IdC. Make sure the role name matches with the IdC Group name.

Amazon Redshift roles simplify managing privileges required for your end-users. In this post, we create two database roles, sales and finance, and grant them access to query tables with sales and finance data, respectively. You can download this sample SQL Notebook and import into Redshift Query Editor v2 to run all cells in the notebook used in this example. Alternatively, you can copy and enter the SQL into your SQL client.

Below is the syntax to create role in Amazon Redshift:

create role "<IDC_Namespace>:<IdP groupname>;

For example:

create role "awsidc:awssso-sales";
create role "awsidc:awssso-finance";

Create the sales and finance database schema:

create schema sales_schema;
create schema finance_schema;

Creating the tables:

CREATE TABLE IF NOT EXISTS finance_schema.revenue
(
account INTEGER   ENCODE az64
,customer VARCHAR(20)   ENCODE lzo
,salesamt NUMERIC(18,0)   ENCODE az64
)
DISTSTYLE AUTO
;

insert into finance_schema.revenue values (10001, 'ABC Company', 12000);
insert into finance_schema.revenue values (10002, 'Tech Logistics', 175400);
insert into finance_schema.revenue values (10003, 'XYZ Industry', 24355);
insert into finance_schema.revenue values (10004, 'The tax experts', 186577);

CREATE TABLE IF NOT EXISTS sales_schema.store_sales
(
ID INTEGER   ENCODE az64,
Product varchar(20),
Sales_Amount INTEGER   ENCODE az64
)
DISTSTYLE AUTO
;

Insert into sales_schema.store_sales values (1,'product1',1000);
Insert into sales_schema.store_sales values (2,'product2',2000);
Insert into sales_schema.store_sales values (3,'product3',3000);
Insert into sales_schema.store_sales values (4,'product4',4000);

Below is the syntax to grant permission to the Amazon Redshift Serverless role:

GRANT { { SELECT | INSERT | UPDATE | DELETE | DROP | REFERENCES } [,...]| ALL [ PRIVILEGES ] } ON { [ TABLE ] table_name [, ...] | ALL TABLES IN SCHEMA schema_name [, ...] } TO role <IdP groupname>;

Grant relevant permission to the role as per your requirement. In following example, we grant full permission to role sales on sales_schema and only select permission on finance_schema to role finance.

For example:

grant usage on schema sales_schema to role "awsidc:awssso-sales";
grant select on all tables in schema sales_schema to role "awsidc:awssso-sales";

grant usage on schema finance_schema to role "awsidc:awssso-finance";
grant select on all tables in schema finance_schema to role "awsidc:awssso-finance";

Create a permission set

A permission set is a template that you create and maintain that defines a collection of one or more IAM policies. Permission sets simplify the assignment of AWS account access for users and groups in your organization. We’ll create a permission set to allow federated user to access Query Editor V

The following steps to create permission set:

Open the IAM Identity Center Console.
In the navigation pane, under Multi-Account permissions, choose Permission sets.
Choose Create permission set.
Choose Custom permission set and then choose Next.
Under AWS managed policies, choose AmazonRedshiftQueryEditorV2ReadSharing.
Under Customer managed policies, provide the policy name which you created in step 4 under section – Setting up IAM Identity Center integration with Amazon Redshift.
Choose Next.
Enter permission set name. For example, Amazon Redshift-Query-Editor-V2.
Under Relay state – optional – set default relay state to the Query Editor V2 URL, using the format : https://<region>.console.aws.amazon.com/sqlworkbench/home.
For this post, we use: https://us-east-1.console.aws.amazon.com/sqlworkbench/home.
Choose Next.
On the Review and create screen, choose Create. The console displays the following message: The permission set Redshift-Query-Editor-V2 was successfully created.

Assign permission set to AWS accounts

Open the IAM Identity Center Console.
In the navigation pane, under Multi-account permissions, choose AWS accounts.
On the AWS accounts page, select one or more AWS accounts that you want to assign single sign-on access to.
Choose Assign users or groups.
On the Assign users and groups to AWS-account-name, choose the groups that you want to create the permission set for. Then, choose Next.
On the Assign permission sets to AWS-account-name, choose the permission set you created in the section – Create a permission set. Then, choose Next.
On the Review and submit assignments to AWS-account-name page, for Review and submit, choose Submit. The console displays the following message: We reprovisioned your AWS account successfully and applied the updated permission set to the account.

Federate to Amazon Redshift using Query Editor V2 using IdC

Now you’re ready to connect to Amazon Redshift Query Editor V2 and federated login using IdC authentication:

Open the IAM Identity Center Console.
Go to dashboard and select the AWS access portal URL.
A browser pop-up triggers and takes you to the Okta Login page where you enter your Okta credentials.
After successful authentication, you’ll be logged into the AWS console as a federated user.
Select your AWS Account and choose the Amazon Redshift Query Editor V2 application.
Once you federate to Query Editor V2, choose your Redshift instance (i.e., right-click) and choose Create connection.
To authenticate using IdC, choose the authentication method IAM Identity Center.
It will show a pop-up and since your Okta credentials is already cached, it utilizes the same credentials and connects to Amazon Redshift Query Editor V2 using IdC authentication.

The following demonstration shows a federated user (Ethan) used the AWS access portal URL to access Amazon Redshift using IdC authentication. User Ethan accesses the sales_schema tables. If User Ethan tries to access the tables in finance_schema, then the user gets a permission denied error.

Troubleshooting

If you get the following error:

ERROR: registered identity provider does not exist for "<idc-namespace:redshift_database_role_name"

This means that you are trying to create a role with a wrong namespace. Please check current namespace using the command select * from identity_providers;

If you get below error:

ERROR: (arn:aws:iam::123456789012:role/<iam_role_name>) Insufficient privileges to access AWS IdC. [ErrorId: 1-1234cf25-1f23ea1234c447fb12aa123e]

This means that an IAM role doesn’t have sufficient privileges to access to the IdC. Your IAM role should contain a policy with following permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "redshift:DescribeQev2IdcApplications",
            "Resource": "*"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "sso:DescribeApplication",
                "sso:DescribeInstance"
            ],
            "Resource": "arn:aws:sso::726034267621:application/ssoins-722324fe82a3f0b8/*"
        }
    ]
}

If you get below error:

FATAL: Invalid scope. User's credentials are not authorized to connect to Redshift. [SQL State=28000]

Please make sure that the user and group are added to the Amazon Redshift IdC application.

Clean up

Complete the following steps to clean up your resources:

Delete the Okta Applications which you have created to integrate with IdC.
Delete IAM Identity Center configuration.
Delete the Redshift application and the Redshift provisioned cluster which you have created for testing.
Delete the IAM role which you have created for IdC and Redshift integration.

Conclusion

In this post, we showed you a detailed walkthrough of how you can integrate Okta with the IdC and Amazon Redshift Query Editor version 2 to simplify your SSO setup. This integration allows you to use role-based access control with Amazon Redshift. We encourage you to try out this integration.

To learn more about IdC with Amazon Redshift, visit the documentation.

About the Authors

Harshida Patel is a Principal Solutions Architect, Analytics with AWS.

Praveen Kumar Ramakrishnan is a Senior Software Engineer at AWS. He has nearly 20 years of experience spanning various domains including filesystems, storage virtualization and network security. At AWS, he focuses on enhancing the Redshift data security.

Karthik Ramanathan is a Sr. Software Engineer with AWS Redshift and is based in San Francisco. He brings close to two decades of development experience across the networking, data storage and IoT verticals prior to Redshift. When not at work, he is also a writer and loves to be in the water.

Use IAM Identity Center APIs to audit and manage application assignments

2023-11-27 Laura Reith

Post Syndicated from Laura Reith original https://aws.amazon.com/blogs/security/use-iam-identity-center-apis-to-audit-and-manage-application-assignments/

You can now use AWS IAM Identity Center application assignment APIs to programmatically manage and audit user and group access to AWS managed applications. Previously, you had to use the IAM Identity Center console to manually assign users and groups to an application. Now, you can automate this task so that you scale more effectively as your organization grows.

In this post, we will show you how to use IAM Identity Center APIs to programmatically manage and audit user and group access to applications. The procedures that we share apply to both organization instances and account instances of IAM Identity Center.

Automate management of user and group assignment to applications

IAM Identity Center is where you create, or connect, your workforce users one time and centrally manage their access to multiple AWS accounts and applications. You configure AWS managed applications to work with IAM Identity Center directly from within the relevant application console, and then manage which users or groups need permissions to the application.

You can already use the account assignment APIs to automate multi-account access and audit access assigned to your users using IAM Identity Center permission sets. Today, we expanded this capability with the new application assignment APIs. You can use these new APIs to programmatically control application assignments and develop automated workflows for auditing them.

AWS managed applications access user and group information directly from IAM Identity Center. One example of an AWS managed application is Amazon Redshift. When you configure Amazon Redshift as an AWS managed application with IAM Identity Center, and a user from your organization accesses the database, their group memberships defined in IAM Identity Center can map to Amazon Redshift database roles that grant them specific permissions. This makes it simpler for you to manage users because you don’t have to set database-object permissions for each individual. For more information, see The benefits of Redshift integration with AWS IAM Identity Center.

After you configure the integration between IAM Identity Center and Amazon Redshift, you can automate the assignment or removal of users and groups by using the DeleteApplicationAssignment and CreateApplicationAssignment APIs, as shown in Figure 1.

Figure 1: Use the CreateApplicationAssignment API to assign users and groups to Amazon Redshift

In this section, you will learn how to use Identity Center APIs to assign a group to your Amazon Redshift application. You will also learn how to delete the group assignment.

Prerequisites

To follow along with this walkthrough, make sure that you’ve completed the following prerequisites:

Enable IAM Identity Center, and use the Identity Store to manage your identity data. If you use an external identity provider, then you should handle the user creation and deletion processes in those systems.
Configure Amazon Redshift to use IAM Identity Center as its identity source. When you configure Amazon Redshift to use IAM Identity Center as its identity source, the application requires explicit assignment by default. This means that you must explicitly assign users to the application in the Identity Center console or APIs.
Install and configure AWS Command Line Interface (AWS CLI) version 2. For this example, you will use AWS CLI v2 to call the IAM Identity Center application assignment APIs. For more information, see Installing the AWS CLI and Configuring the AWS CLI.

Step 1: Get your Identity Center instance information

The first step is to run the following command to get the Amazon Resource Name (ARN) and Identity Store ID for the instance that you’re working with:

aws sso-admin list-instances

The output should look similar to the following:

{
  "Instances": [
      {
          "InstanceArn": "arn:aws:sso:::instance/ssoins-****************",
          "IdentityStoreId": "d-**********",
          "OwnerAccountId": "************",
          "Name": "MyInstanceName",
          "CreatedDate": "2023-10-08T16:45:19.839000-04:00",
          "State": {
              "Name": "ACTIVE"
          },
          "Status": "ACTIVE"
      }
  ],
  "NextToken": <<TOKEN>>
}

Take note of the IdentityStoreId and the InstanceArn — you will use both in the following steps.

Step 2: Create user and group in your Identity Store

The next step is to create a user and group in your Identity Store.

Note: If you already have a group in your Identity Center instance, get its GroupId and then proceed to Step 3. To get your GroupId, run the following command:
aws identitystore get-group-id --identity-store-id “d-********” –alternate-identifier “GroupName” ,

Create a new user by using the IdentityStoreId that you noted in the previous step.

aws identitystore create-user --identity-store-id "d-**********" --user-name "MyUser" --emails Value="[email protected]",Type="Work",Primary=true —display-name "My User" —name FamilyName="User",GivenName="My"

The output should look similar to the following:

{
    "UserId": "********-****-****-****-************",
    "IdentityStoreId": "d--********** "
}

Create a group in your Identity Store:

aws identitystore create-group --identity-store-id d-********** --display-name engineering

In the output, make note of the GroupId — you will need it later when you create the application assignment in Step 4:

{
    "GroupId": "********-****-****-****-************",
    "IdentityStoreId": "d-**********"
}

Run the following command to add the user to the group:

aws identitystore create-group-membership --identity-store-id d-********** --group-id ********-****-****-****-************ --member-id UserId=********-****-****-****-************

The result will look similar to the following:

{
    "MembershipId": "********-****-****-****-************",
    "IdentityStoreId": "d-**********"
}

Step 3: Get your Amazon Redshift application ARN instance

The next step is to determine the application ARN. To get the ARN, run the following command.

aws sso-admin list-applications --instance-arn "arn:aws:sso:::instance/ssoins-****************"

If you have more than one application in your environment, use the filter flag to specify the application account or the application provider. To learn more about the filter option, see the ListApplications API documentation.

In this case, we have only one application: Amazon Redshift. The response should look similar to the following. Take note of the ApplicationArn — you will need it in the next step.

{

    "ApplicationArn": "arn:aws:sso:::instance/ssoins-****************/apl-***************",
    "ApplicationProviderArn": "arn:aws:sso::aws:applicationProvider/Redshift",
    "Name": "Amazon Redshift",
    "InstanceArn": "arn:aws:sso:::instance/ssoins-****************",
    "Status": "DISABLED",
    "PortalOptions": {
        "Visible": true,
        "Visibility": "ENABLED",
        "SignInOptions": {
            "Origin": "IDENTITY_CENTER"
        }
    },
    "AssignmentConfig": {
        "AssignmentRequired": true
    },
    "Description": "Amazon Redshift",
    "CreatedDate": "2023-10-09T10:48:44.496000-07:00"
}

Step 4: Add your group to the Amazon Redshift application

Now you can add your new group to the Amazon Redshift application managed by IAM Identity Center. The principal-id is the GroupId that you created in Step 2.

aws sso-admin create-application-assignment --application-arn "arn:aws:sso:::instance/ssoins-****************/apl-***************" --principal-id "********-****-****-****-************" --principal-type "GROUP"

The group now has access to Amazon Redshift, but with the default permissions in Amazon Redshift. To grant access to databases, you can create roles that control the permissions available on a set of tables or views.

To create these roles in Amazon Redshift, you need to connect to your cluster and run SQL commands. To connect to your cluster, use one of the following options:

Figure 2 shows a connection to Amazon Redshift through the query editor v2.

Figure 2: Query editor v2

By default, all users have CREATE and USAGE permissions on the PUBLIC schema of a database. To disallow users from creating objects in the PUBLIC schema of a database, use the REVOKE command to remove that permission. For more information, see Default database user permissions.

As the Amazon Redshift database administrator, you can create roles where the role name contains the identity provider namespace prefix and the group or user name. To do this, use the following syntax:

CREATE ROLE <identitycenternamespace:rolename>;

The rolename needs to match the group name in IAM Identity Center. Amazon Redshift automatically maps the IAM Identity Center group or user to the role created previously. To expand the permissions of a user, use the GRANT command.

The identityprovidernamespace is assigned when you create the integration between Amazon Redshift and IAM Identity Center. It represents your organization’s name and is added as a prefix to your IAM Identity Center managed users and roles in the Redshift database.

Your syntax should look like the following:

CREATE ROLE <AWSIdentityCenter:MyGroup>;

Step 5: Remove application assignment

If you decide that the new group no longer needs access to the Amazon Redshift application but should remain within the IAM Identity Center instance, run the following command:

aws sso-admin delete-application-assignment --application-arn "arn:aws:sso:::instance/ssoins-****************/apl-***************" --principal-id "********-****-****-****-************" --principal-type "GROUP"

Note: Removing an application assignment for a group doesn’t remove the group from your Identity Center instance.

When you remove or add user assignments, we recommend that you review the application’s documentation because you might need to take additional steps to completely onboard or offboard a given user or group. For example, when you remove a user or group assignment, you must also remove the corresponding roles in Amazon Redshift. You can do this by using the DROP ROLE command. For more information, see Managing database security.

Audit user and group access to applications

Let’s consider how you can use the new APIs to help you audit application assignments. In the preceding example, you used the AWS CLI to create and delete assignments to Amazon Redshift. Now, we will show you how to use the new ListApplicationAssignments API to list the groups that are currently assigned to your Amazon Redshift application.

aws sso-admin list-application-assignments --application-arn arn:aws:sso::****************:application/ssoins-****************/apl-****************

The output should look similar to the following — in this case, you have a single group assigned to the application.

{
    "ApplicationAssignments": [
        {
        "ApplicationArn": "arn:aws:sso::****************:application/ssoins-****************/apl-****************",
        "PrincipalId": "********-****-****-****-************",
        "PrincipalType": "GROUP"
        }
    ]
}

To see the group membership, use the PrincipalId information to query Identity Store and get information on the users assigned to the group with a combination of the ListGroupMemberships and DescribeGroupMembership APIs.

If you have several applications that IAM Identity Center manages, you can also create a script to automatically audit those applications. You can run this script periodically in an AWS Lambda function in your environment to maintain oversight of the members that are added to each application.

To get the script for this use case, see the multiple-instance-management-iam-identity-center GitHub repository. The repository includes instructions to deploy the script using Lambda within the AWS Organizations delegated administrator account. After deployment, you can invoke the Lambda function to get .csv files of every IAM Identity Center instance in your organization, the applications assigned to each instance, and the users that have access to those applications.

Conclusion

In this post, you learned how to use the IAM Identity Center application assignment APIs to assign users to Amazon Redshift and remove them from the application when they are no longer part of the organization. You also learned to list which applications are deployed in each account, and which users are assigned to each of those applications.

To learn more about IAM Identity Center, see the AWS IAM Identity Center user guide. To test the application assignment APIs, see the SSO-admin API reference guide.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on AWS IAM Identity Center re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

How to use multiple instances of AWS IAM Identity Center

2023-11-20 Laura Reith

Post Syndicated from Laura Reith original https://aws.amazon.com/blogs/security/how-to-use-multiple-instances-of-aws-iam-identity-center/

Recently, AWS launched a new feature that allows deployment of account instances of AWS IAM Identity Center . With this launch, you can now have two types of IAM Identity Center instances: organization instances and account instances. An organization instance is the IAM Identity Center instance that’s enabled in the management account of your organization created with AWS Organizations. This instance is used to manage access to AWS accounts and applications across your entire organization. Organization instances are the best practice when deploying IAM Identity Center. Many customers have requested a way to enable AWS applications using test or sandbox identities. The new account instances are intended to support sand-boxed deployments of AWS managed applications such as Amazon CodeCatalyst and are only usable from within the account and AWS Region in which they were created. They can exist in a standalone account or in a member account within AWS Organizations.

In this blog post, we show you when to use each instance type, how to control the deployment of account instances, and how you can monitor, manage, and audit these instances at scale using the enhanced IAM Identity Center APIs.

IAM Identity Center instance types

IAM Identity Center now offers two deployment types, the traditional organization instance and an account instance, shown in Figure 1. In this section, we show you the differences between the two.

Figure 1: IAM Identity Center instance types

Organization instance of IAM Identity Center

An organization instance of IAM Identity Center is the fully featured version that’s available with AWS Organizations. This type of instance helps you securely create or connect your workforce identities and manage their access centrally across AWS accounts and applications in your organization. The recommended use of an organization instance of Identity Center is for workforce authentication and authorization on AWS for organizations of any size and type.

Using the organization instance of IAM Identity Center, your identity center administrator can create and manage user identities in the Identity Center directory, or connect your existing identity source, including Microsoft Active Directory, Okta, Ping Identity, JumpCloud, Google Workspace, and Azure Active Directory (Entra ID). There is only one organization instance of IAM Identity Center at the organization level. If you have enabled IAM Identity Center before November 15, 2023, you have an organization instance.

Account instances of IAM Identity Center

Account instances of IAM Identity Center provide a subset of the features of the organization instance. Specifically, account instances support user and group assignments initially only to Amazon CodeCatalyst. They are bound to a single AWS account, and you can deploy them in either member accounts of an organization or in standalone AWS accounts. You can only deploy one account instance per AWS account regardless of Region.

You can use account instances of IAM Identity Center to provide access to supported Identity Center enabled application if the application is in the same account and Region.

Account instances of Identity Center don’t support permission sets or assignments to customer managed applications. If you’ve enabled Identity Center before November 15, 2023 then you must enable account instance creation from your management account. To learn more see Enable account instances in the AWS Management Console documentation. If you haven’t yet enabled Identity Center, then account instances are now available to you.

When should I use account instances of IAM Identity Center?

Account instances are intended for use in specific situations where organization instances are unavailable or impractical, including:

You want to run a temporary trial of a supported AWS managed application to determine if it suits your business needs. See Additional Considerations.
You are unable to deploy IAM Identity Center across your organization, but still want to experiment with one or more AWS managed applications. See Additional Considerations.
You have an organization instance of IAM Identity Center, but you want to deploy a supported AWS managed application to an isolated set of users that are distinct from those in your organization instance.

Additional considerations

When working with multiple instances of IAM Identity Center, you want to keep a number of things in mind:

Each instance of IAM Identity Center is separate and distinct from other Identity Center instances. That is, users and assignments are managed separately in each instance without a means to keep them in sync.
Migration between instances isn’t possible. This means that migrating an application between instances requires setting up that application from scratch in the new instance.
Account instances have the same considerations when changing your identity source as an organization instance. In general, you want to set up with the right identity source before adding assignments.
Automating assigning users to applications through the IAM Identity Center public APIs also requires using the applications APIs to ensure that those users and groups have the right permissions within the application. For example, if you assign groups to CodeCatalyst using Identity Center, you still have to assign the groups to the CodeCatalyst space from the Amazon CodeCatalyst page in the AWS Management Console. See the Setting up a space that supports identity federation documentation.
By default, account instances require newly added users to register a multi-factor authentication (MFA) device when they first sign in. This can be altered in the AWS Management Console for Identity Center for a specific instance.

Controlling IAM Identity Center instance deployments

If you’ve enabled IAM Identity Center prior to November 15, 2023 then account instance creation is off by default. If you want to allow account instance creation, you must enable this feature from the Identity Center console in your organization’s management account. This includes scenarios where you’re using IAM Identity Center centrally and want to allow deployment and management of account instances. See Enable account instances in the AWS Management Console documentation.

If you enable IAM Identity Center after November 15, 2023 or if you haven’t enabled Identity Center at all, you can control the creation of account instances of Identity Center through a service control policy (SCP). We recommend applying the following sample policy to restrict the use of account instances to all but a select set of AWS accounts. The sample SCP that follows will help you deny creation of account instances of Identity Center to accounts in the organization unless the account ID matches the one you specified in the policy. Replace <ALLOWED-ACCOUNT_ID> with the ID of the account that is allowed to create account instances of Identity Center:

{
    "Version": "2012-10-17",
    "Statement" : [
        {
            "Sid": "DenyCreateAccountInstances",
            "Effect": "Deny",
            "Action": [
                "sso:CreateInstance"
            ],
            "Resource": "*",
            "Condition": {
                "StringNotEquals": [
                    "aws:PrincipalAccount": ["<ALLOWED-ACCOUNT-ID>"]
                ]
            }
        }
    ]
}

To learn more about SCPs, see the AWS Organizations User Guide on service control policies.

Monitoring instance activity with AWS CloudTrail

If your organization has an existing log ingestion pipeline solution to collect logs and generate reports through AWS CloudTrail, then IAM Identity Center supported CloudTrail operations will automatically be present in your pipeline, including additional account instances of IAM Identity Center actions such as sso:CreateInstance.

To create a monitoring solution for IAM Identity Center events in your organization, you should set up monitoring through AWS CloudTrail. CloudTrail is a service that records events from AWS services to facilitate monitoring activity from those services in your accounts. You can create a CloudTrail trail that captures events across all accounts and all Regions in your organization and persists them to Amazon Simple Storage Service (Amazon S3).

After creating a trail for your organization, you can use it in several ways. You can send events to Amazon CloudWatch Logs and set up monitoring and alarms for Identity Center events, which enables immediate notification of supported IAM Identity Center CloudTrail operations. With multiple instances of Identity Center deployed within your organization, you can also enable notification of instance activity, including new instance creation, deletion, application registration, user authentication, or other supported actions.

If you want to take action on IAM Identity Center events, you can create a solution to process events using additional service such as Amazon Simple Notification Service, Amazon Simple Queue Service, and the CloudTrail Processing Library. With this solution, you can set your own business logic and rules as appropriate.

Additionally, you might want to consider AWS CloudTrail Lake, which provides a powerful data store that allows you to query CloudTrail events without needing to manage a complex data loading pipeline. You can quickly create a data store for new events, which will immediately start gathering data that can be queried within minutes. To analyze historical data, you can copy your organization trail to CloudTrail Lake.

The following is an example of a simple query that shows you a list of the Identity Center instances created and deleted, the account where they were created, and the user that created them. Replace <Event_data_store_ID> with your store ID.

SELECT 
    userIdentity.arn AS userARN, eventName, userIdentity.accountId 
FROM 
    <Event_data_store_ID> 
WHERE 
    userIdentity.arn IS NOT NULL 
    AND eventName = 'DeleteInstance' 
    OR eventName = 'CreateInstance'

You can save your query result to an S3 bucket and download a copy of the results in CSV format. To learn more, follow the steps in Download your CloudTrail Lake saved query results. Figure 2 shows the CloudTrail Lake query results.

Figure 2: AWS CloudTrail Lake query results

If you want to automate the sourcing, aggregation, normalization, and data management of security data across your organization using the Open Cyber Security Framework (OCSF) standard, you will benefit from using Amazon Security Lake. This service helps make your organization’s security data broadly accessible to your preferred security analytics solutions to power use cases such like threat detection, investigation, and incident response. Learn more in What is Amazon Security Lake?

Instance management and discovery within an organization

You can create account instances of IAM Identity Center in a standalone account or in an account that belongs to your organization. Creation can happen from an API call (CreateInstance) from the Identity Center console in a member account or from the setup experience of a supported AWS managed application. Learn more about Supported AWS managed applications.

If you decide to apply the DenyCreateAccountInstances SCP shown earlier to accounts in your organization, you will no longer be able to create account instances of IAM Identity Center in those accounts. However, you should also consider that when you invite a standalone AWS account to join your organization, the account might have an existing account instance of Identity Center.

To identify existing instances, who’s using them, and what they’re using them for, you can audit your organization to search for new instances. The following script shows how to discover all IAM Identity Center instances in your organization and export a .csv summary to an S3 bucket. This script is designed to run on the account where Identity Center was enabled. Click here to see instructions on how to use this script.

. . .
. . .
accounts_and_instances_dict={}
duplicated_users ={}

main_session = boto3.session.Session()
sso_admin_client = main_session.client('sso-admin')
identity_store_client = main_session.client('identitystore')
organizations_client = main_session.client('organizations')
s3_client = boto3.client('s3')
logger = logging.getLogger()
logger.setLevel(logging.INFO)

#create function to list all Identity Center instances in your organization
def lambda_handler(event, context):
    application_assignment = []
    user_dict={}
    
    current_account = os.environ['CurrentAccountId']
 
    logger.info("Current account %s", current_account)
    
    paginator = organizations_client.get_paginator('list_accounts')
    page_iterator = paginator.paginate()
    for page in page_iterator:
        for account in page['Accounts']:
            get_credentials(account['Id'],current_account)
            #get all instances per account - returns dictionary of instance id and instances ARN per account
            accounts_and_instances_dict = get_accounts_and_instances(account['Id'], current_account)
                    
def get_accounts_and_instances(account_id, current_account):
    global accounts_and_instances_dict
    
    instance_paginator = sso_admin_client.get_paginator('list_instances')
    instance_page_iterator = instance_paginator.paginate()
    for page in instance_page_iterator:
        for instance in page['Instances']:
            #send back all instances and identity centers
            if account_id == current_account:
                accounts_and_instances_dict = {current_account:[instance['IdentityStoreId'],instance['InstanceArn']]}
            elif instance['OwnerAccountId'] != current_account: 
                accounts_and_instances_dict[account_id]= ([instance['IdentityStoreId'],instance['InstanceArn']])
    return accounts_and_instances_dict
  . . .  
  . . .
  . . .

The following table shows the resulting IAM Identity Center instance summary report with all of the accounts in your organization and their corresponding Identity Center instances.

AccountId	IdentityCenterInstance
111122223333	d-111122223333
111122224444	d-111122223333
111122221111	d-111111111111

Duplicate user detection across multiple instances

A consideration of having multiple IAM Identity Center instances is the possibility of having the same person existing in two or more instances. In this situation, each instance creates a unique identifier for the same person and the identifier associates application-related data to the user. Create a user management process for incoming and outgoing users that is similar to the process you use at the organization level. For example, if a user leaves your organization, you need to revoke access in all Identity Center instances where that user exists.

The code that follows can be added to the previous script to help detect where duplicates might exist so you can take appropriate action. If you find a lot of duplication across account instances, you should consider adopting an organization instance to reduce your management overhead.

...
#determine if the member in IdentityStore have duplicate
def get_users(identityStoreId, user_dict): 
    global duplicated_users
    paginator = identity_store_client.get_paginator('list_users')
    page_iterator = paginator.paginate(IdentityStoreId=identityStoreId)
    for page in page_iterator:
        for user in page['Users']:
            if ( 'Emails' not in user ):
                print("user has no email")
            else:
                for email in user['Emails']:
                    if email['Value'] not in user_dict:
                        user_dict[email['Value']] = identityStoreId
                    else:
                        print("Duplicate user found " + user['UserName'])
                        user_dict[email['Value']] = user_dict[email['Value']] + "," + identityStoreId
                        duplicated_users[email['Value']] = user_dict[email['Value']]
    return user_dict 
...

The following table shows the resulting report with duplicated users in your organization and their corresponding IAM identity Center instances.

User_email	IdentityStoreId
[email protected]	d-111122223333, d-111111111111
[email protected]	d-111122223333, d-111111111111, d-222222222222
[email protected]	d-111111111111, d-222222222222

The full script for all of the above use cases is available in the multiple-instance-management-iam-identity-center GitHub repository. The repository includes instructions to deploy the script using AWS Lambda within the management account. After deployment, you can invoke the Lambda function to get .csv files of every IAM Identity center instance in your organization, the applications assigned to each instance, and the users that have access to those applications. With this function, you also get a report of users that exist in more than one local instance.

Conclusion

In this post, you learned the differences between an IAM Identity Center organization instance and an account instance, considerations for when to use an account instance, and how to use Identity Center APIs to automate discovery of Identity Center account instances in your organization.

To learn more about IAM Identity Center, see the AWS IAM Identity Center user guide.

Want more AWS Security news? Follow us on Twitter.

Writing IAM Policies: Grant Access to User-Specific Folders in an Amazon S3 Bucket

2023-11-14 Dylan Souvage

Post Syndicated from Dylan Souvage original https://aws.amazon.com/blogs/security/writing-iam-policies-grant-access-to-user-specific-folders-in-an-amazon-s3-bucket/

November 14, 2023: We’ve updated this post to use IAM Identity Center and follow updated IAM best practices.

In this post, we discuss the concept of folders in Amazon Simple Storage Service (Amazon S3) and how to use policies to restrict access to these folders. The idea is that by properly managing permissions, you can allow federated users to have full access to their respective folders and no access to the rest of the folders.

Overview

Imagine you have a team of developers named Adele, Bob, and David. Each of them has a dedicated folder in a shared S3 bucket, and they should only have access to their respective folders. These users are authenticated through AWS IAM Identity Center (successor to AWS Single Sign-On).

In this post, you’ll focus on David. You’ll walk through the process of setting up these permissions for David using IAM Identity Center and Amazon S3. Before you get started, let’s first discuss what is meant by folders in Amazon S3, because it’s not as straightforward as it might seem. To learn how to create a policy with folder-level permissions, you’ll walk through a scenario similar to what many people have done on existing files shares, where every IAM Identity Center user has access to only their own home folder. With folder-level permissions, you can granularly control who has access to which objects in a specific bucket.

You’ll be shown a policy that grants IAM Identity Center users access to the same Amazon S3 bucket so that they can use the AWS Management Console to store their information. The policy allows users in the company to upload or download files from their department’s folder, but not to access any other department’s folder in the bucket.

After the policy is explained, you’ll see how to create an individual policy for each IAM Identity Center user.

Throughout the rest of this post, you will use a policy, which will be associated with an IAM Identity Center user named David. Also, you must have already created an S3 bucket.

Note: S3 buckets have a global namespace and you must change the bucket name to a unique name.

For this blog post, you will need an S3 bucket with the following structure (the example bucket name for the rest of the blog is “my-new-company-123456789”):

/home/Adele/ /home/Bob/ /home/David/ /confidential/ /root-file.txt

Figure 1: Screenshot of the root of the my-new-company-123456789 bucket

Your S3 bucket structure should have two folders, home and confidential, with a file root-file.txt in the main bucket directory. Inside confidential you will have no items or folders. Inside home there should be three sub-folders: Adele, Bob, and David.

Figure 2: Screenshot of the home/ directory of the my-new-company-123456789 bucket

A brief lesson about Amazon S3 objects

Before explaining the policy, it’s important to review how Amazon S3 objects are named. This brief description isn’t comprehensive, but will help you understand how the policy works. If you already know about Amazon S3 objects and prefixes, skip ahead to Creating David in Identity Center.

Amazon S3 stores data in a flat structure; you create a bucket, and the bucket stores objects. S3 doesn’t have a hierarchy of sub-buckets or folders; however, tools like the console can emulate a folder hierarchy to present folders in a bucket by using the names of objects (also known as keys). When you create a folder in S3, S3 creates a 0-byte object with a key that references the folder name that you provided. For example, if you create a folder named photos in your bucket, the S3 console creates a 0-byte object with the key photos/. The console creates this object to support the idea of folders. The S3 console treats all objects that have a forward slash (/) character as the last (trailing) character in the key name as a folder (for example, examplekeyname/)

To give you an example, for an object that’s named home/common/shared.txt, the console will show the shared.txt file in the common folder in the home folder. The names of these folders (such as home/ or home/common/) are called prefixes, and prefixes like these are what you use to specify David’s department folder in his policy. By the way, the slash (/) in a prefix like home/ isn’t a reserved character — you could name an object (using the Amazon S3 API) with prefixes such as home:common:shared.txt or home-common-shared.txt. However, the convention is to use a slash as the delimiter, and the Amazon S3 console (but not S3 itself) treats the slash as a special character for showing objects in folders. For more information on organizing objects in the S3 console using folders, see Organizing objects in the Amazon S3 console by using folders.

Creating David in Identity Center

IAM Identity Center helps you securely create or connect your workforce identities and manage their access centrally across AWS accounts and applications. Identity Center is the recommended approach for workforce authentication and authorization on AWS for organizations of any size and type. Using Identity Center, you can create and manage user identities in AWS, or connect your existing identity source, including Microsoft Active Directory, Okta, Ping Identity, JumpCloud, Google Workspace, and Azure Active Directory (Azure AD). For further reading on IAM Identity Center, see the Identity Center getting started page.

Begin by setting up David as an IAM Identity Center user. To start, open the AWS Management Console and go to IAM Identity Center and create a user.

Note: The following steps are for Identity Center without System for Cross-domain Identity Management (SCIM) turned on, the add user option won’t be available if SCIM is turned on.

From the left pane of the Identity Center console, select Users, and then choose Add user.

Figure 3: Screenshot of IAM Identity Center Users page.
Enter David as the Username, enter an email address that you have access to as you will need this later to confirm your user, and then enter a First name, Last name, and Display name.
Leave the rest as default and choose Add user.
Select Users from the left navigation pane and verify you’ve created the user David.

Figure 4: Screenshot of adding users to group in Identity Center.
Now that you’re verified the user David has been created, use the left pane to navigate to Permission sets, then choose Create permission set.

Figure 5: Screenshot of permission sets in Identity Center.
Select Custom permission set as your Permission set type, then choose Next.

Figure 6: Screenshot of permission set types in Identity Center.

David’s policy

This is David’s complete policy, which will be associated with an IAM Identity Center federated user named David by using the console. This policy grants David full console access to only his folder (/home/David) and no one else’s. While you could grant each user access to their own bucket, keep in mind that an AWS account can have up to 100 buckets by default. By creating home folders and granting the appropriate permissions, you can instead allow thousands of users to share a single bucket.

{
 “Version”:”2012-10-17”,
 “Statement”: [
   {
     “Sid”: “AllowUserToSeeBucketListInTheConsole”,
     “Action”: [“s3:ListAllMyBuckets”, “s3:GetBucketLocation”],
     “Effect”: “Allow”,
     “Resource”: [“arn:aws:s3:::*”]
   },
  {
     “Sid”: “AllowRootAndHomeListingOfCompanyBucket”,
     “Action”: [“s3:ListBucket”],
     “Effect”: “Allow”,
     “Resource”: [“arn:aws:s3::: my-new-company-123456789”],
     “Condition”:{“StringEquals”:{“s3:prefix”:[“”,”home/”, “home/David”],”s3:delimiter”:[“/”]}}
    },
   {
     “Sid”: “AllowListingOfUserFolder”,
     “Action”: [“s3:ListBucket”],
     “Effect”: “Allow”,
     “Resource”: [“arn:aws:s3:::my-new-company-123456789”],
     “Condition”:{“StringLike”:{“s3:prefix”:[“home/David/*”]}}
   },
   {
     “Sid”: “AllowAllS3ActionsInUserFolder”,
     “Effect”: “Allow”,
     “Action”: [“s3:*”],
     “Resource”: [“arn:aws:s3:::my-new-company-123456789/home/David/*”]
   }
 ]
}

Now, copy and paste the preceding IAM Policy into the inline policy editor. In this case, you use the JSON editor. For information on creating policies, see Creating IAM policies.

Figure 7: Screenshot of the inline policy inside the permissions set in Identity Center.
Give your permission set a name and a description, then leave the rest at the default settings and choose Next.
Verify that you modify the policies to have the bucket name you created earlier.
After your permission set has been created, navigate to AWS accounts on the left navigation pane, then select Assign users or groups.

Figure 8: Screenshot of the AWS accounts in Identity Center.
Select the user David and choose Next.

Figure 9: Screenshot of the AWS accounts in Identity Center.
Select the permission set you created earlier, choose Next, leave the rest at the default settings and choose Submit.

Figure 10: Screenshot of the permission sets in Identity Center.

You’ve now created and attached the permissions required for David to view his S3 bucket folder, but not to view the objects in other users’ folders. You can verify this by signing in as David through the AWS access portal.

Figure 11: Screenshot of the settings summary in Identity Center.
Navigate to the dashboard in IAM Identity Center and go to the Settings summary, then choose the AWS access portal URL.

Figure 12: Screenshot of David signing into the console via the Identity Center dashboard URL.
Sign in as the user David with the one-time password you received earlier when creating David.

Figure 13: Second screenshot of David signing into the console through the Identity Center dashboard URL.
Open the Amazon S3 console.
Search for the bucket you created earlier.

Figure 14: Screenshot of my-new-company-123456789 bucket in the AWS console.
Navigate to David’s folder and verify that you have read and write access to the folder. If you navigate to other users’ folders, you’ll find that you don’t have access to the objects inside their folders.

David’s policy consists of four blocks; let’s look at each individually.

Block 1: Allow required Amazon S3 console permissions

Before you begin identifying the specific folders David can have access to, you must give him two permissions that are required for Amazon S3 console access: ListAllMyBuckets and GetBucketLocation.

   {
      "Sid": "AllowUserToSeeBucketListInTheConsole",
      "Action": ["s3:GetBucketLocation", "s3:ListAllMyBuckets"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::*"]
   }

The ListAllMyBuckets action grants David permission to list all the buckets in the AWS account, which is required for navigating to buckets in the Amazon S3 console (and as an aside, you currently can’t selectively filter out certain buckets, so users must have permission to list all buckets for console access). The console also does a GetBucketLocation call when users initially navigate to the Amazon S3 console, which is why David also requires permission for that action. Without these two actions, David will get an access denied error in the console.

Block 2: Allow listing objects in root and home folders

Although David should have access to only his home folder, he requires additional permissions so that he can navigate to his folder in the Amazon S3 console. David needs permission to list objects at the root level of the my-new-company-123456789 bucket and to the home/ folder. The following policy grants these permissions to David:

   {
      "Sid": "AllowRootAndHomeListingOfCompanyBucket",
      "Action": ["s3:ListBucket"],
      "Effect": "Allow",
      "Resource": ["arn:aws:s3:::my-new-company-123456789"],
      "Condition":{"StringEquals":{"s3:prefix":["","home/", "home/David"],"s3:delimiter":["/"]}}
   }

Without the ListBucket permission, David can’t navigate to his folder because he won’t have permissions to view the contents of the root and home folders. When David tries to use the console to view the contents of the my-new-company-123456789 bucket, the console will return an access denied error. Although this policy grants David permission to list all objects in the root and home folders, he won’t be able to view the contents of any files or folders except his own (you specify these permissions in the next block).

This block includes conditions, which let you limit under what conditions a request to AWS is valid. In this case, David can list objects in the my-new-company-123456789 bucket only when he requests objects without a prefix (objects at the root level) and objects with the home/ prefix (objects in the home folder). If David tries to navigate to other folders, such as confidential/, David is denied access. Additionally, David needs permissions to list prefix home/David to be able to use the search functionality of the console instead of scrolling down the list of users’ folders.

To set these root and home folder permissions, I used two conditions: s3:prefix and s3:delimiter. The s3:prefix condition specifies the folders that David has ListBucket permissions for. For example, David can list the following files and folders in the my-new-company-123456789 bucket:

/root-file.txt /confidential/ /home/Adele/ /home/Bob/ /home/David/

But David cannot list files or subfolders in the confidential/, home/Adele, or home/Bob folders.

Although the s3:delimiter condition isn’t required for console access, it’s still a good practice to include it in case David makes requests by using the API. As previously noted, the delimiter is a character—such as a slash (/)—that identifies the folder that an object is in. The delimiter is useful when you want to list objects as if they were in a file system. For example, let’s assume the my-new-company-123456789 bucket stored thousands of objects. If David includes the delimiter in his requests, he can limit the number of returned objects to just the names of files and subfolders in the folder he specified. Without the delimiter, in addition to every file in the folder he specified, David would get a list of all files in any subfolders.

Block 3: Allow listing objects in David’s folder

In addition to the root and home folders, David requires access to all objects in the home/David/ folder and any subfolders that he might create. Here’s a policy that allows this:

{
      “Sid”: “AllowListingOfUserFolder”,
      “Action”: [“s3:ListBucket”],
      “Effect”: “Allow”,
      “Resource”: [“arn:aws:s3:::my-new-company-123456789”],
      "Condition":{"StringLike":{"s3:prefix":["home/David/*"]}}
    }

In the condition above, you use a StringLike expression in combination with the asterisk (*) to represent an object in David’s folder, where the asterisk acts as a wildcard. That way, David can list files and folders in his folder (home/David/). You couldn’t include this condition in the previous block (AllowRootAndHomeListingOfCompanyBucket) because it used the StringEquals expression, which would interpret the asterisk (*) as an asterisk, not as a wildcard.

In the next section, the AllowAllS3ActionsInUserFolder block, you’ll see that the Resource element specifies my-new-company/home/David/*, which looks like the condition that I specified in this section. You might think that you can similarly use the Resource element to specify David’s folder in this block. However, the ListBucket action is a bucket-level operation, meaning the Resource element for the ListBucket action applies only to bucket names and doesn’t take folder names into account. So, to limit actions at the object level (files and folders), you must use conditions.

Block 4: Allow all Amazon S3 actions in David’s folder

Finally, you specify David’s actions (such as read, write, and delete permissions) and limit them to just his home folder, as shown in the following policy:

    {
      "Sid": "AllowAllS3ActionsInUserFolder",
      "Effect": "Allow",
      "Action": ["s3:*"],
      "Resource": ["arn:aws:s3:::my-new-company-123456789/home/David/*"]
    }

For the Action element, you specified s3:*, which means David has permission to do all Amazon S3 actions. In the Resource element, you specified David’s folder with an asterisk (*) (a wildcard) so that David can perform actions on the folder and inside the folder. For example, David has permission to change his folder’s storage class. David also has permission to upload files, delete files, and create subfolders in his folder (perform actions in the folder).

An easier way to manage policies with policy variables

In David’s folder-level policy you specified David’s home folder. If you wanted a similar policy for users like Bob and Adele, you’d have to create separate policies that specify their home folders. Instead of creating individual policies for each IAM Identity Center user, you can use policy variables and create a single policy that applies to multiple users (a group policy). Policy variables act as placeholders. When you make a request to a service in AWS, the placeholder is replaced by a value from the request when the policy is evaluated.

For example, you can use the previous policy and replace David’s user name with a variable that uses the requester’s user name through attributes and PrincipalTag as shown in the following policy (copy this policy to use in the procedure that follows):

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Sid": "AllowUserToSeeBucketListInTheConsole",
			"Action": [
				"s3:ListAllMyBuckets",
				"s3:GetBucketLocation"
			],
			"Effect": "Allow",
			"Resource": [
				"arn:aws:s3:::*"
			]
		},
		{
			"Sid": "AllowRootAndHomeListingOfCompanyBucket",
			"Action": [
				"s3:ListBucket"
			],
			"Effect": "Allow",
			"Resource": [
				"arn:aws:s3:::my-new-company-123456789"
			],
			"Condition": {
				"StringEquals": {
					"s3:prefix": [
						"",
						"home/",
						"home/${aws:PrincipalTag/userName}"
					],
					"s3:delimiter": [
						"/"
					]
				}
			}
		},
		{
			"Sid": "AllowListingOfUserFolder",
			"Action": [
				"s3:ListBucket"
			],
			"Effect": "Allow",
			"Resource": [
				"arn:aws:s3:::my-new-company-123456789"
			],
			"Condition": {
				"StringLike": {
					"s3:prefix": [
						"home/${aws:PrincipalTag/userName}/*"
					]
				}
			}
		},
		{
			"Sid": "AllowAllS3ActionsInUserFolder",
			"Effect": "Allow",
			"Action": [
				"s3:*"
			],
			"Resource": [
				"arn:aws:s3:::my-new-company-123456789/home/${aws:PrincipalTag/userName}/*"
			]
		}
	]
}

To implement this policy with variables, begin by opening the IAM Identity Center console using the main AWS admin account (ensuring you’re not signed in as David).
Select Settings on the left-hand side, then select the Attributes for access control tab.

Figure 15: Screenshot of Settings inside Identity Center.
Create a new attribute for access control, entering userName as the Key and ${path:userName} as the Value, then choose Save changes. This will add a session tag to your Identity Center user and allow you to use that tag in an IAM policy.

Figure 16: Screenshot of managing attributes inside Identity Center settings.
To edit David’s permissions, go back to the IAM Identity Center console and select Permission sets.

Figure 17: Screenshot of permission sets inside Identity Center with Davids-Permissions selected.
Select David’s permission set that you created previously.
Select Inline policy and then choose Edit to update David’s policy by replacing it with the modified policy that you copied at the beginning of this section, which will resolve to David’s username.

Figure 18: Screenshot of David’s policy inside his permission set inside Identity Center.

You can validate that this is set up correctly by signing in to David’s user through the Identity Center dashboard as you did before and verifying you have access to the David folder and not the Bob or Adele folder.

Figure 19: Screenshot of David’s S3 folder with access to a .jpg file inside.

Whenever a user makes a request to AWS, the variable is replaced by the user name of whoever made the request. For example, when David makes a request, ${aws:PrincipalTag/userName} resolves to David; when Adele makes the request, ${aws:PrincipalTag/userName} resolves to Adele.

It’s important to note that, if this is the route you use to grant access, you must control and limit who can set this username tag on an IAM principal. Anyone who can set this tag can effectively read/write to any of these bucket prefixes. It’s important that you limit access and protect the bucket prefixes and who can set the tags. For more information, see What is ABAC for AWS, and the Attribute-based access control User Guide.

Conclusion

By using Amazon S3 folders, you can follow the principle of least privilege and verify that the right users have access to what they need, and only to what they need.

See the following example policy that only allows API access to the buckets, and only allows for adding, deleting, restoring, and listing objects inside the folders:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAllS3ActionsInUserFolder",
            "Effect": "Allow",
            "Action": [
                "s3:DeleteObject",
                "s3:DeleteObjectTagging",
                "s3:DeleteObjectVersion",
                "s3:DeleteObjectVersionTagging",
                "s3:GetObject",
                "s3:GetObjectTagging",
                "s3:GetObjectVersion",
                "s3:GetObjectVersionTagging",
                "s3:ListBucket",
                "s3:PutObject",
                "s3:PutObjectTagging",
                "s3:PutObjectVersionTagging",
                "s3:RestoreObject"
            ],
            "Resource": [
		   "arn:aws:s3:::my-new-company-123456789",
                "arn:aws:s3:::my-new-company-123456789/home/${aws:PrincipalTag/userName}/*"
            ],
            "Condition": {
                "StringLike": {
                    "s3:prefix": [
                        "home/${aws:PrincipalTag/userName}/*"
                    ]
                }
            }
        }
    ]
}

We encourage you to think about what policies your users might need and restrict the access by only explicitly allowing what is needed.

Here are some additional resources for learning about Amazon S3 folders and about IAM policies, and be sure to get involved at the community forums:

For a detailed walkthrough of Amazon S3 policies, see Controlling access to a bucket with user policies.
See Listing objects using prefixes and delimiters in Organizing objects using prefixes.
See a sample Amazon S3 API request in the Amazon S3 API Reference.
Blocking public access to your Amazon S3 storage.
Bucket policies and examples.
Our previous blog post about how to grant access to an Amazon S3 bucket.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

How to use Amazon CodeWhisperer using Okta as an external IdP

2023-10-27 Chetan Makvana

Post Syndicated from Chetan Makvana original https://aws.amazon.com/blogs/devops/how-to-use-amazon-codewhisperer-using-okta-as-an-external-idp/

Customers using Amazon CodeWhisperer often want to enable their developers to sign in using existing identity providers (IdP), such as Okta. CodeWhisperer provides support for authentication either through AWS Builder Id or AWS IAM Identity Center. AWS Builder ID is a personal profile for builders. It is designed for individual developers, particularly when working on personal projects or in cases when organization does not authenticate to AWS using the IAM Identity Center. IAM Identity Center is better suited for enterprise developers who use CodeWhisperer as employees of organizations that have an AWS account. The IAM Identity Center authentication method expands the capabilities of IAM by centralizing user administration and access control. Many customers prefer using Okta as their external IdP for Single Sign-On (SSO). They aim to leverage their existing Okta credentials to seamlessly access CodeWhisperer. To achieve this, customers utilize the IAM Identity Center authentication method.

Trained on billions of lines of Amazon and open-source code, CodeWhisperer is an AI coding companion that helps developers write code by generating real-time whole-line and full-function code suggestions in their Integrated Development Environments (IDEs). CodeWhisperer comes in two tiers—the Individual Tier is free for individual use, and the Professional Tier offers administrative features, like SSO and IAM Identity Center integration, policy control for referenced code suggestions, and higher limits on security scanning. Customers enable the professional tier of CodeWhisperer for their developers for a business use. When using CodeWhisperer with the professional tier, developers should authenticate with the IAM Identity Center. We will also soon introduce the Enterprise Tier, offering the additional capability to customize CodeWhisperer to generate more relevant recommendations by making it aware of your internal libraries, APIs, best practices, and architectural patterns.

In this blog, we will show you how to set up Okta as an external IdP for IAM Identity Center and enable access to CodeWhisperer using existing Okta credentials.

How it works

The flow for accessing CodeWhisperer through the IAM Identity Center involves the authentication of Okta users using Security Assertion Markup Language (SAML) 2.0 authentication.

Figure 1: Architecture diagram for sign-in process

The sign-in process follows these steps:

IAM Identity Center synchronizes users and groups information from Okta into IAM Identity Center using the System for Cross-domain Identity Management (SCIM) v2.0 protocol.
Developer with an Okta account connects to CodeWhisperer through IAM Identity Center.
If the developer isn’t already authenticated, they will be redirected to the Okta account login. The developer will sign in using their Okta credentials.
If the sign-in is successful, Okta processes the request and sends a SAML response containing the developer’s identity and authentication status to IAM Identity Center.
If the SAML response is valid and the developer is authenticated, IAM Identity Center grants access to CodeWhisperer.
The developer can now securely access and use CodeWhisperer.

Prerequisites

For this walkthrough, you should have the following prerequisites:

AWS account
Okta account with users and group already setup
An IAM Identity Center-enabled account (free). For more information, see Enable IAM Identity Center.
Professional tier of Amazon CodeWhisperer enabled. The professional tier comes with associated costs. Please check the pricing.

Configure Okta as an external IdP with IAM Identity Center

Step 1: Enable IAM Identity Center in Okta

In the Okta Admin Console, Navigate to the Applications Tab > Applications.
Browse App Catalog for the AWS IAM Identity Center app and Select Add Integration.

In Okta’s App Integration Catalog, the option to select the AWS IAM Identity Center App

Figure 2: Okta’s App Integration Catalog

Provide a name for the App Integration, using the Application label. Example: “Demo_AWS IAM Identity Center”.
Navigate to the Sign On tab for the “Demo_AWS IAM Identity Center” app .
Under SAML Signing Certificates, select View IdP Metadata from the Actions drop-down menu and Save the contents as metadata.xml

Under SAML Signing Certificates, the option to select View IdP Metadata from the Actions drop-down menu and Save the contents as metadata.xml.

Figure 3: SAML Signing Certificates Page

Step 2: Configure Okta as an external IdP in IAM Identity Center

In IAM Identity Center Console, navigate to Dashboard.
Select Choose your identity source.

In IAM Identity Center Console, Dashboard tab displays the option to select Recommended setup steps with Choose your Identity source as Step 1

Figure 4: IAM Identity Center Console

Under Identity source, select Change identity source from the Actions drop-down menu
On the next page, select External identity provider, then click Next.
Configure the external identity provider:

- IdP SAML metadata: Click Choose file to upload Okta’s IdP SAML metadata you saved in the previous section Step 6.
- Make a copy of the AWS access portal sign-in URL, IAM Identity Center ACS URL, and IAM Identity Center issuer URL values. You’ll need these values later on.
- Click Next.

Review the list of changes. Once you are ready to proceed, type ACCEPT, then click Change identity source.

Step 3: Configure Okta with IAM Identity Center Sign On details

In Okta, select the Sign On tab IAM Identity Center SAML app, then click Edit:

- Enter AWS IAM Identity Center SSO ACS URL and AWS IAM Identity Center SSO issuer URL values that you copied in previous section step 5b into the corresponding fields.
- Application username format: Select one of the options from the drop-down menu.

Click Save.

Step 4: Enable provisioning in IAM Identity Center

In IAM Identity Center Console, choose Settings in the left navigation pane.
On the Settings page, locate the Automatic provisioning information box, and then choose Enable. This immediately enables automatic provisioning in IAM Identity Center and displays the necessary SCIM endpoint and access token information.

Under IAM Identity Center settings, the option to enable automatic provisioning.

Figure 5: IAM Identity Center Settings for Automatic provisioning

In the Inbound automatic provisioning dialog box, copy each of the values for the following options. You will need to paste these in later when you configure provisioning in Okta.
- SCIM endpoint
- Access token
Choose Close.

Step 5: Configure provisioning in Okta

In a separate browser window, log in to the Okta admin portal and navigate to the IAM Identity Center app.
On the IAM Identity Center app page, choose the Provisioning tab, and then choose Integration.
Choose Configure API Integration and then select the check box next to Enable API integration to enable provisioning.
Paste SCIM endpoint copied in previous step into the Base URL field. Make sure that you remove the trailing forward slash at the end of the URL.
Paste Access token copied in previous step into the API Token field.
Choose Test API Credentials to verify the credentials entered are valid.
Choose Save.
Under Settings, choose To App, choose Edit, and then select the Enable check box for each of the Provisioning Features you want to enable.
Choose Save.

Step 6: Assign access for groups in Okta

In the Okta admin console, navigate to the IAM Identity Center app page, choose the Assignments tab.
In the Assignments page, choose Assign, and then choose Assign to Groups.
Choose the Okta group or groups that you want to assign access to the IAM Identity Center app. Choose Assign, choose Save and Go Back, and then choose Done. This starts provisioning the users in the group into the IAM Identity Center.
Repeat step 3 for all groups that you want to provide access.
Choose the Push Groups tab. Choose the Okta group or groups that you chose in the previous step.
Then choose Save. The group status changes to Active after the group and its members have successfully been pushed to IAM Identity Center.

Figure 7: Amazon CodeWhisperer Settings Page

Step 7: Provide access to CodeWhisperer

In the CodeWhisperer Console, under Settings add the groups which require access to CodeWhisperer.

Figure 7: Amazon CodeWhisperer Settings Page

Set up AWS Toolkit with IAM Identity Center

To use CodeWhisperer, you will now set up the AWS Toolkit within integrated development environments (IDE) to establish authentication with the IAM Identity Center.

Figure 8: Set up AWS Toolkit with IAM Identity Center

In the IDE, open the AWS extension panel and click Start button under Developer Tools > CodeWhisperer.
In the resulting pane, expand the section titled Have a Professional Tier subscription? Sign in with IAM Identity Center.
Enter the IAM Identity Center URL you previously copied into the Start URL field.
Set the region to us-east-1 and click Sign in button.
Click Copy Code and Proceed to copy the code from the resulting pop-up.
When prompted by the Do you want Code to open the external website? pop-up, click Open.
Paste the code copied in Step 5 and click Next.
Enter your Okta credentials and click Sign in.
Click Allow to grant AWS Toolkit to access your data.
When the connection is complete, a notification indicates that it is safe to close your browser. Close the browser tab and return to IDE.
Depending on your preference, select Yes if you wish to continue using IAM Identity Center with CodeWhisperer while using current AWS profile, else select No.
You are now all set to use CodeWhisperer from within IDE, authenticated with your Okta credentials.

Test Configuration

If you have successfully completed the previous step, you will see the code suggested by CodeWhisperer.

In IDE, steps to test CodeWhisperer configuration

Figure 9: Test Step Configurations

Conclusion

In this post, you learned how to leverage existing Okta credential to access Amazon CodeWhisperer via the IAM Identity Center integration. The post walked through the steps detailing the process of setting up Okta as an external IdP with the IAM Identity Center. You then configured the AWS Toolkit to establish an authenticated connection to AWS using Okta credentials, enabling you to access the CodeWhisperer Professional Tier.

About the authors:

Delegating permission set management and account assignment in AWS IAM Identity Center

2023-10-09 Jake Barker

Post Syndicated from Jake Barker original https://aws.amazon.com/blogs/security/delegating-permission-set-management-and-account-assignment-in-aws-iam-identity-center/

In this blog post, we look at how you can use AWS IAM Identity Center (successor to AWS Single Sign-On) to delegate the management of permission sets and account assignments. Delegating the day-to-day administration of user identities and entitlements allows teams to move faster and reduces the burden on your central identity administrators.

IAM Identity Center helps you securely create or connect your workforce identities and manage their access centrally across AWS accounts and applications. Identity Center requires accounts to be managed by AWS Organizations. Administration of Identity Center can be delegated to a member account (an account other than the management account). We recommend that you delegate Identity Center administration to limit who has access to the management account and use the management account only for tasks that require the management account.

Delegated administration is different from the delegation of permission sets and account assignments, which this blog covers. For more information on delegated administration, see Getting started with AWS IAM Identity Center delegated administration. The patterns in this blog post work whether Identity Center is delegated to a member account or remains in the management account.

Permission sets are used to define the level of access that users or groups have to an AWS account. Permission sets can contain AWS managed policies, customer managed policies, inline policies, and permissions boundaries.

Solution overview

As your organization grows, you might want to start delegating permissions management and account assignment to give your teams more autonomy and reduce the burden on your identity team. Alternatively, you might have different business units or customers, operating out of their own organizational units (OUs), that want more control over their own identity management.

In this scenario, an example organization has three developer teams: Red, Blue, and Yellow. Each of the teams operate out of its own OU. IAM Identity Center has been delegated from the management account to the Identity Account. Figure 1 shows the structure of the example organization.

Figure 1: The structure of the organization in the example scenario

The organization in this scenario has an existing collection of permission sets. They want to delegate the management of permission sets and account assignments away from their central identity management team.

The Red team wants to be able to assign the existing permission sets to accounts in its OU. This is an accounts-based model.
The Blue team wants to edit and use a single permission set and then assign that set to the team’s single account. This is a permission-based model.
The Yellow team wants to create, edit, and use a permission set tagged with Team: Yellow and then assign that set to all of the accounts in its OU. This is a tag-based model.

We’ll look at the permission sets needed for these three use cases.

Note: If you’re using the AWS Management Console, additional permissions are required.

Use case 1: Accounts-based model

In this use case, the Red team is given permission to assign existing permission sets to the three accounts in its OU. This will also include permissions to remove account assignments.

Using this model, an organization can create generic permission sets that can be assigned to its AWS accounts. It helps reduce complexity for the delegated administrators and verifies that they are using permission sets that follow the organization’s best practices. These permission sets restrict access based on services and features within those services, rather than specific resources.

{
  "Version": "2012-10-17",
  "Statement": [
    {
        "Effect": "Allow",
        "Action": [
            "sso:CreateAccountAssignment",
            "sso:DeleteAccountAssignment",
            "sso:ProvisionPermissionSet"
        ],
        "Resource": [
            "arn:aws:sso:::instance/ssoins-<sso-ins-id>",
            "arn:aws:sso:::permissionSet/ssoins-<sso-ins-id>/*",
            "arn:aws:sso:::account/112233445566",
            "arn:aws:sso:::account/223344556677",
            "arn:aws:sso:::account/334455667788"

        ]
    }
  ]
}

In the preceding policy, the principal can assign existing permission sets to the three AWS accounts with the IDs 112233445566, 223344556677 and 334455667788. This includes administration permission sets, so carefully consider which accounts you allow the permission sets to be assigned to.

The arn:aws:sso:::instance/ssoins-<sso-ins-id> is the IAM Identity Center instance ID ARN. It can be found using either the AWS Command Line Interface (AWS CLI) v2 with the list-instances API or the AWS Management Console.

Use the AWS CLI

Use the AWS Command Line Interface (AWS CLI) to run the following command:

aws sso-admin list-instances

You can also use AWS CloudShell to run the command.

Use the AWS Management Console

Use the Management Console to navigate to the IAM Identity Center in your AWS Region and then select Choose your identity source on the dashboard.

Figure 2: The IAM Identity Center instance ID ARN in the console

Use case 2: Permission-based model

For this example, the Blue team is given permission to edit one or more specific permission sets and then assign those permission sets to a single account. The following permissions allow the team to use managed and inline policies.

This model allows the delegated administrator to use fine-grained permissions on a specific AWS account. It’s useful when the team wants total control over the permissions in its AWS account, including the ability to create additional roles with administrative permissions. In these cases, the permissions are often better managed by the team that operates the account because it has a better understanding of the services and workloads.

Granting complete control over permissions can lead to unintended or undesired outcomes. Permission sets are still subject to IAM evaluation and authorization, which means that service control policies (SCPs) can be used to deny specific actions.

{
  "Version": "2012-10-17",
  "Statement": [
    {
        "Effect": "Allow",
        "Action": [
            "sso:AttachCustomerManagedPolicyReferenceToPermissionSet",
            "sso:AttachManagedPolicyToPermissionSet",
            "sso:CreateAccountAssignment",
            "sso:DeleteAccountAssignment",
            "sso:DeleteInlinePolicyFromPermissionSet",
            "sso:DetachCustomerManagedPolicyReferenceFromPermissionSet",
            "sso:DetachManagedPolicyFromPermissionSet",
            "sso:ProvisionPermissionSet",
            "sso:PutInlinePolicyToPermissionSet",
            "sso:UpdatePermissionSet"
        ],
        "Resource": [
            "arn:aws:sso:::instance/ssoins-<sso-ins-id>",
            "arn:aws:sso:::permissionSet/ssoins-<sso-ins-id>/ps-1122334455667788",
            "arn:aws:sso:::account/445566778899"
        ]
    }
  ]
}

Here, the principal can edit the permission set arn:aws:sso:::permissionSet/ssoins-<sso-ins-id>/ps-1122334455667788 and assign it to the AWS account 445566778899. The editing rights include customer managed policies, AWS managed policies, and inline policies.

If you want to use the preceding policy, replace the missing and example resource values with your own IAM Identity Center instance ID and account numbers.

In the preceding policy, the arn:aws:sso:::permissionSet/ssoins-<sso-ins-id>/ps-1122334455667788 is the permission set ARN. You can find this ARN through the console, or by using the AWS CLI command to list all of the permission sets:

aws sso-admin list-permission-sets --instance-arn <instance arn from above>

This permission set can also be applied to multiple accounts—similar to the first use case—by adding additional account IDs to the list of resources. Likewise, additional permission sets can be added so that the user can edit multiple permission sets and assign them to a set of accounts.

Use case 3: Tag-based model

For this example, the Yellow team is given permission to create, edit, and use permission sets tagged with Team: Yellow. Then they can assign those tagged permission sets to all of their accounts.

This example can be used by an organization to allow a team to freely create and edit permission sets and then assign them to the team’s accounts. It uses tagging as a mechanism to control which permission sets can be created and edited. Permission sets without the correct tag cannot be altered.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "sso:CreatePermissionSet",
                "sso:DescribePermissionSet",
                "sso:UpdatePermissionSet",
                "sso:DeletePermissionSet",
                "sso:DescribePermissionSetProvisioningStatus",
                "sso:DescribeAccountAssignmentCreationStatus",
                "sso:DescribeAccountAssignmentDeletionStatus",
                "sso:TagResource"
            ],
            "Resource": [
                "arn:aws:sso:::instance/ssoins-<sso-ins-id>"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "sso:DescribePermissionSet",
                "sso:UpdatePermissionSet",
                "sso:DeletePermissionSet"
                	
            ],
            "Resource": "arn:aws:sso:::permissionSet/ssoins-<sso-ins-id>/*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/Team": "Yellow"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "sso:CreatePermissionSet"
            ],
            "Resource": "arn:aws:sso:::permissionSet/ssoins-<sso-ins-id>/*",
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/Team": "Yellow"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": "sso:TagResource",
            "Resource": "arn:aws:sso:::permissionSet/ssoins-<sso-ins-id>/*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/Team": "Yellow",
                    "aws:RequestTag/Team": "Yellow"
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "sso:CreateAccountAssignment",
                "sso:DeleteAccountAssignment",
                "sso:ProvisionPermissionSet"
            ],
            "Resource": [
                "arn:aws:sso:::instance/ssoins-<sso-ins-id>",
                "arn:aws:sso:::account/556677889900",
                "arn:aws:sso:::account/667788990011",
                "arn:aws:sso:::account/778899001122"
            ]
        },
        {
            "Sid": "InlinePolicy",
            "Effect": "Allow",
            "Action": [
                "sso:GetInlinePolicyForPermissionSet",
                "sso:PutInlinePolicyToPermissionSet",
                "sso:DeleteInlinePolicyFromPermissionSet"
            ],
            "Resource": [
                "arn:aws:sso:::instance/ssoins-<sso-ins-id>"
            ]
        },
        {
            "Sid": "InlinePolicyABAC",
            "Effect": "Allow",
            "Action": [
                "sso:GetInlinePolicyForPermissionSet",
                "sso:PutInlinePolicyToPermissionSet",
                "sso:DeleteInlinePolicyFromPermissionSet"
            ],
            "Resource": "arn:aws:sso:::permissionSet/ssoins-<sso-ins-id>/*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/Team": "Yellow"
                }
            }
        }
    ]
}

In the preceding policy, the principal is allowed to create new permission sets only with the tag Team: Yellow, and assign only permission sets tagged with Team: Yellow to the AWS accounts with ID 556677889900, 667788990011, and 778899001122.

The principal can only edit the inline policies of the permission sets tagged with Team: Yellow and cannot change the tags of the permission sets that are already tagged for another team.

If you want to use this policy, replace the missing and example resource values with your own IAM Identity Center instance ID, tags, and account numbers.

Note: The policy above assumes that there are no additional statements applying to the principal. If you require additional allow statements, verify that the resulting policy doesn’t create a risk of privilege escalation. You can review Controlling access to AWS resources using tags for additional information.

This policy only allows the delegation of permission sets using inline policies. Customer managed policies are IAM policies that are deployed to and are unique to each AWS account. When you create a permission set with a customer managed policy, you must create an IAM policy with the same name and path in each AWS account where IAM Identity Center assigns the permission set. If the IAM policy doesn’t exist, Identity Center won’t make the account assignment. For more information on how to use customer managed policies with Identity Center, see How to use customer managed policies in AWS IAM Identity Center for advanced use cases.

You can extend the policy to allow the delegation of customer managed policies with these two statements:

{
    "Sid": "CustomerManagedPolicy",
    "Effect": "Allow",
    "Action": [
        "sso:ListCustomerManagedPolicyReferencesInPermissionSet",
        "sso:AttachCustomerManagedPolicyReferenceToPermissionSet",
        "sso:DetachCustomerManagedPolicyReferenceFromPermissionSet"
    ],
    "Resource": [
        "arn:aws:sso:::instance/ssoins-<sso-ins-id>"
    ]
},
{
    "Sid": "CustomerManagedABAC",
    "Effect": "Allow",
    "Action": [
        "sso:ListCustomerManagedPolicyReferencesInPermissionSet",
        "sso:AttachCustomerManagedPolicyReferenceToPermissionSet",
        "sso:DetachCustomerManagedPolicyReferenceFromPermissionSet"
    ],
    "Resource": "arn:aws:sso:::permissionSet/ssoins-<sso-ins-id>/*",
    "Condition": {
        "StringEquals": {
            "aws:ResourceTag/Team": "Yellow"
        }
    }
}

Note: Both statements are required, as only the resource type PermissionSet supports the condition key aws:ResourceTag/${TagKey}, and the actions listed require access to both the Instance and PermissionSet resource type. See Actions, resources, and condition keys for AWS IAM Identity Center for more information.

Best practices

Here are some best practices to consider when delegating management of permission sets and account assignments:

Assign permissions to edit specific permission sets. Allowing roles to edit every permission set could allow that role to edit their own permission set.
Only allow administrators to manage groups. Users with rights to edit group membership could add themselves to any group, including a group reserved for organization administrators.

If you’re using IAM Identity Center in a delegated account, you should also be aware of the best practices for delegated administration.

Summary

Organizations can empower teams by delegating the management of permission sets and account assignments in IAM Identity Center. Delegating these actions can allow teams to move faster and reduce the burden on the central identity management team.

The scenario and examples share delegation concepts that can be combined and scaled up within your organization. If you have feedback about this blog post, submit comments in the Comments section. If you have questions, start a new thread on AWS Re:Post with the Identity Center tag.

Want more AWS Security news? Follow us on Twitter.

How to automate the review and validation of permissions for users and groups in AWS IAM Identity Center

2023-08-16 Yee Fei Ooi

Post Syndicated from Yee Fei Ooi original https://aws.amazon.com/blogs/security/how-to-automate-the-review-and-validation-of-permissions-for-users-and-groups-in-aws-iam-identity-center/

AWS IAM Identity Center (successor to AWS Single Sign-On) is widely used by organizations to centrally manage federated access to their Amazon Web Services (AWS) environment. As organizations grow, it’s crucial that they maintain control of access to their environment and conduct regular reviews of existing granted permissions to maintain a good security posture. With continuous movement of users among projects and teams within an organization, there are constant updates in groups and permission sets. Given the frequency of updates, it’s important for organizations to maintain the integrity of the identity entities and promote visibility into their associated permissions within IAM Identity Center.

Performing an audit of permissions assignment through the IAM Identity Center Management Console can be an arduous and time-consuming task, especially for customers managing a significant number of AWS accounts. This blog post addresses the following concerns faced by security administrators:

How to maintain control over permissions and efficiently conduct thorough audits.
How to regularly review granted permissions to uphold the principle of least privilege.

In this blog post, we show you how to automate your IAM Identity Center users and groups permission review process with AWS SDK and AWS serverless services. The solution also includes how to schedule the review process based on preferred frequency and generating a business-specific access and permission review report.

By using AWS serverless services and AWS SDK, you can create an automated workflow to retrieve the latest permission sets of your identities in IAM Identity Center and extract them as a report. Amazon EventBridge scheduling allows you to set customized schedules to launch the automation process. AWS Lambda functions are used in data retrieval, data transformation, and report generation, and Amazon DynamoDB tables are used for storing raw unstructured data.

We show you how to build an automated solution using AWS SDK, AWS Step Functions, Lambda, DynamoDB, EventBridge, Amazon Simple Storage Service (Amazon S3), and Amazon Simple Notification Service (Amazon SNS) to review the IAM Identity Center instance that you specify. The review includes retrieving attached permission policies (inline, AWS managed, and customer managed) based on the assigned identity.

Note: This solution will incur costs based on the AWS services used.

Prerequisites

In your own AWS environment, make sure that you have the following:

An IAM Identity Center instance set up in the account
IAM Identity Center instance metadata that you want to perform the analysis on:
- The IAM Identity Center instance identityStoreId – example: d-xxxxxxxxxx
- The IAM Identity Center instance instanceArn – example: arn:aws:sso:::instance/ssoins-xxxxxxxxxx
Access and permission to deploy the related AWS services mentioned previously in AWS CloudFormation.

Note: This solution is expected to deploy in the account where your IAM Identity Center instance is being set up. If you want to deploy in other accounts, you need to establish cross-account access for the IAM roles of the relevant services mentioned previously.
AWS SAM CLI installed. You will deploy the solution using AWS Serverless Application Model (AWS SAM). To learn more about how AWS SAM works, see the AWS Serverless Application Model (AWS SAM) specification.

Solution overview

In this section, we discuss the steps to set up solution. We provide a CloudFormation template that you can use to set up the required services and Lambda functions. Figure 1 illustrates the architecture of the solution.

Figure 1: Architecture of the solution

The solution is deployed using AWS SAM, which is an open-sourced framework for building serverless applications. AWS SAM helps to organize related components and operate on a single stack. When used together with the SAM CLI, it’s a useful tool for developing, testing, and building serverless applications on AWS.

To generate the report, the solution uses the following steps:

The EventBridge Scheduler is configured to launch the Step Functions based on the frequency of the cron job stated. The user can also manually launch the review as needed.
After the Step Functions are launched, the dataExtractionFunction Lambda function retrieves data from IAM Identity Center and stores it in two separate DynamoDB tables, fullPermissionSetsWithGroupTable and userWithGroupTable.
Step Functions will then launch the dataTransformLoadFunction Lambda function, which retrieves the data from both DynamoDB tables to perform data transformation for report generation.
The permission review report is stored in an S3 bucket and notification of completion is sent to the stakeholders.

Deploy the solution

Make sure that you have AWS SAM CLI installed.
Clone the GitHub repository. Open a CLI window and run
git clone https://github.com/aws-samples/aws-iam-identity-center-permission-policies-analyzer.git
Navigate to root directory of the GitHub repository by running cd aws-iam-identity-center-permission-policies-analyzer
Run sam deploy ‐‐guided and follow the step-by-step instructions to indicate the deployment details such as the desired CloudFormation stack name, AWS Region and other details as shown in Figure 2.

Figure 2: Configure SAM deploy
As shown in Figure 2, you receive confirmation that the required resources have been created. AWS SAM creates a default S3 bucket to store the necessary resources and then proceeds to the deployment prompt. Enter y to deploy and wait for deployment to complete.
After deployment is complete, you should see the following output: Successfully created/updated stack – {StackName} in {AWSRegion}. You can review the resources and stack in your CloudFormation console as shown in Figure 3.

Figure 3: CloudFormation console view of deployed stack

The CloudFormation template specifies the cron schedule on the first day of each month at 0800 UTC +8 by default. You can update the schedule based on your preference by following steps 7 and 8.
Open the EventBridge console. In the navigation pane, under Scheduler, choose Schedules. Check the box next to {StackName}-monthlySchedule-{RandomID} and choose Edit.

Figure 4: EventBridge schedule console
At Step 1, under the Schedule pattern segment, enter your preferred scheduling. To learn about the different types of EventBridge scheduling, see Schedule types on EventBridge Scheduler. For this example, you use a recurring type of schedule using cron expression. Update to your preferred schedule and time zone and choose Next.

Figure 5: EventBridge Schedule edit console Step 1 – Specify schedule detail
Check the email address you entered during the deployment stage of this solution for an email sent by [email protected], similar to what you see in Figure 6. Follow the steps in the email to confirm the Amazon SNS topic subscription.

Figure 6: Example email from Amazon SNS for subscription confirmation

Manually launch the review

After you’ve updated the schedule, the review process runs on the specified timing and frequency. You can manually launch the review immediately after you’ve deployed the solution, or at a time outside of the schedule on an as-needed basis.

To manually launch the review, open the Step Functions console,
Select the state machine monthlyUserPermissionAssessment-{randomID} and choose Start execution.

Figure 7: Start execution for monthlyUserPermissionAssessment state machine

Enter the following event pattern and choose Start execution.

{
  "identityStoreId": "d-xxxxxxxxxx",
  "instanceArn": "arn:aws:sso:::instance/ssoins-xxxxxxxxxx",
  "ssoDeployedRegion": "YOUR_SSO_DEPLOYED_REGION" 
}

Note: The format and keyword format are important to run the Step Functions successfully.

Figure 8: Example input to start state machine execution

When the process starts, the execution page opens and you can follow the process. The flow turns green when each step has been completed successfully. You can also review Events and check the Lambda functions or logs if you need to troubleshoot or refer to the details.

Figure 9: State machine successful execution example

Notification from each successful review

After each successful execution, you should receive an email notification at the email you specified in the Amazon SNS topic. You can then retrieve the report from the S3 bucket with the bucket name {StackName}-monthlyre-{AccountID}. Your report is stored according to the object key name specified in the email. An example of the email notification is shown in Figure 10.

Figure 10: Example email notification

You can download the report in CSV format from the S3 bucket. The headers of the report are:

User:	Username
PrincipalId:	An identifier for an object in IAM Identity Center, such as a user or group
PrincipalType:	USER or GROUP
GroupName:	Group’s display name value (if PrincipalType is GROUP)
AccountIdAssignment:	Identifier of the AWS account assigned with the specified permission set
PermissionSetARN:	ARN of the permission set
PermissionSetName:	Name of the permission set
Inline Policy:	Inline policy that is attached to the permission set
Customer Managed Policy:	Specifies the names and paths of the customer managed policies that you have attached to your permission set
AWS Managed Policy:	Details of the AWS managed policy
Permission Boundary:	Permission boundary details (Customer Managed Policy Reference and/or AWS managed policy ARN)

From the report, you can determine whether a user is assigned to an account individually or as part of a group, along with the corresponding permission sets. The report also includes details on inline policy, AWS managed policy, customer managed policy, and the permission boundaries attached to the permission set. Inline policies and AWS managed policies are presented in JSON format. However, for customer managed policies and permission boundaries, to keep the solution simple, the generated report provides only basic information on the policies that you’ve attached to the permission set. You can log in to the respective accounts to view the policies in full JSON format through the AWS IAM console.

[Optional] Customize the user notification email

If you want to customize the email notification subject and content, you can do so by editing the Lambda function {StackName}-dataTransformLoadFunction-{RandomID}. Scroll down to the bottom of the source code and edit the sns_message and Subject accordingly.

Figure 11: Customizing the notification email in dataTransformLoadFunction source code

Clean up the resources

To clean up the resources that you created for this example:

Empty your S3 bucket. Open the Amazon S3 console, search for the bucket name and choose Empty. Follow the instructions on screen to empty it.
Delete the CloudFormation stack by either:
1. Using the CloudFormation console to delete the stack, or
2. Using the AWS SAM CLI to run sam delete in your terminal. Follow the instructions and enter y when prompted to delete the stack.

Conclusion

In this post, you learned how to deploy a solution that simplifies the review and analysis of IAM permissions granted to IAM Identity Center with an automated flow. You also learned about customization that you can set up to fit your team’s needs and preferences.

If you have feedback about this post, submit comments in the Comments section. If you have questions about this post, start a new thread on the AWS Security, Identity, & Compliance re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

AWS Week in Review – Amazon EC2 Instance Connect Endpoint, Detective, Amazon S3 Dual Layer Encryption, Amazon Verified Permission – June 19, 2023

2023-06-19 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-week-in-review-amazon-ec2-instance-connect-endpoint-detective-amazon-s3-dual-layer-encryption-amazon-verified-permission-june-19-2023/

This week, I’ll meet you at AWS partner’s Jamf Nation Live in Amsterdam where we’re showing how to use Amazon EC2 Mac to deploy your remote developer workstations or configure your iOS CI/CD pipelines in the cloud.

Last Week’s Launches
While I was traveling last week, I kept an eye on the AWS News. Here are some launches that got my attention.

Amazon EC2 Instance Connect Endpoint. Endpoint for EC2 Instance Connect allows you to securely access Amazon EC2 instances using their private IP addresses, making the use of bastion hosts obsolete. Endpoint for EC2 Instance Connect is by far my favorite launch from last week. With EC2 Instance Connect, you use AWS Identity and Access Management (IAM) policies and principals to control SSH access to your instances. This removes the need to share and manage SSH keys. We also updated the AWS Command Line Interface (AWS CLI) to allow you to easily connect or open a secured tunnel to an instance using only its instance ID. I read and contributed to a couple of threads on social media where you pointed out that AWS Systems Manager Session Manager already offered similar capabilities. You’re right. But the extra advantage of EC2 Instance Connect Endpoint is that it allows you to use your existing SSH-based tools and libraries, such as the scp command.

Amazon Inspector now supports code scanning of AWS Lambda functions. This expands the existing capability to scan Lambda functions and associated layers for software vulnerabilities in application package dependencies. Amazon Detective also extends finding groups to Amazon Inspector. Detective automatically collects findings from Amazon Inspector, GuardDuty, and other AWS security services, such as AWS Security Hub, to help increase situational awareness of related security events.

Amazon Verified Permissions is generally available. If you’re designing or developing business applications that need to enforce user-based permissions, you have a new option to centrally manage application permissions. Verified Permissions is a fine-grained permissions management and authorization service for your applications that can be used at any scale. Verified Permissions centralizes permissions in a policy store and helps developers use those permissions to authorize user actions within their applications. Similarly to the way an identity provider simplifies authentication, a policy store lets you manage authorization in a consistent and scalable way. Read Danilo’s post to discover the details.

Amazon S3 Dual-Layer Server-Side Encryption with keys stored in AWS Key Management Service (DSSE-KMS). Some heavily regulated industries require double encryption to store some type of data at rest. Amazon Simple Storage Service (Amazon S3) offers DSSE-KMS, a new free encryption option that provides two layers of data encryption, using different keys and different implementation of the 256-bit Advanced Encryption Standard with Galois Counter Mode (AES-GCM) algorithm. My colleague Irshad’s post has all the details.

AWS CloudTrail Lake Dashboards provide out-of-the-box visibility and top insights from your audit and security data directly within the CloudTrail Lake console. CloudTrail Lake features a number of AWS curated dashboards so you can get started right away – with no required detailed dashboard setup or SQL experience.

AWS IAM Identity Center now supports automated user provisioning from Google Workspace. You can now connect your Google Workspace to AWS IAM Identity Center (successor to AWS Single Sign-On) once and manage access to AWS accounts and applications centrally in IAM Identity Center.

AWS CloudShell is now available in 12 additional regions. AWS CloudShell is a browser-based shell that makes it easier to securely manage, explore, and interact with your AWS resources. The list of the 12 new Regions is detailed in the launch announcement.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Here are some other updates and news that you might have missed:

AWS Extension for Stable Diffusion WebUI. WebUI is a popular open-source web interface that allows you to easily interact with Stable Diffusion generative AI. We built this extension to help you to migrate existing workloads (such as inference, train, and ckpt merge) from your local or standalone servers to the AWS Cloud.
GoDaddy developed a multi-Region, event-driven system. Their system handles 400 millions events per day. They plan to scale it to process 2 billion messages per day in a near future. My colleague Marcia explains the detail of their architecture in her post.
The Official AWS Podcast – Listen each week for updates on the latest AWS news and deep dives into exciting use cases. There are also official AWS podcasts in several languages. Check out the podcasts in French, German, Italian, and Spanish.
AWS Open Source News and Updates – This is a newsletter curated by my colleague Ricardo to bring you the latest open source projects, posts, events, and more.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

AWS Silicon Innovation Day (June 21) – A one-day virtual event that will allow you to better understand AWS Silicon and how you can use the Amazon EC2 chip offerings to your benefit. My colleague Irshad shared the details in this post. Register today.
AWS Global Summits – There are many AWS Summits going on right now around the world: Milano (June 22), Hong Kong (July 20), New York (July 26), Taiwan (Aug 2 & 3), and Sao Paulo (Aug 3).
AWS Community Day – Join a community-led conference run by AWS user group leaders in your region: Manila (June 29–30), Chile (July 1), and Munich (September 14).
AWS User Group Perú Conf 2023 (September 2023). Some of the AWS News blog writer team will be present: Marcia, Jeff, myself, and our colleague Startup Developer Advocate Mark. Save the date and register today.
CDK Day – CDK Day is happening again this year on September 29. The call for papers for this event is open, and this year we’re also accepting talks in Spanish. Submit your talk here.

That’s all for this week. Check back next Monday for another Week in Review!

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!
— seb