All posts by Songzhi Liu

Build a Multi-Tenant Amazon EMR Cluster with Kerberos, Microsoft Active Directory Integration and EMRFS Authorization

Post Syndicated from Songzhi Liu original https://aws.amazon.com/blogs/big-data/build-a-multi-tenant-amazon-emr-cluster-with-kerberos-microsoft-active-directory-integration-and-emrfs-authorization/

One of the challenges faced by our customers—especially those in highly regulated industries—is balancing the need for security with flexibility. In this post, we cover how to enable multi-tenancy and increase security by using EMRFS (EMR File System) authorization, the Amazon S3 storage-level authorization on Amazon EMR.

Amazon EMR is an easy, fast, and scalable analytics platform enabling large-scale data processing. EMRFS authorization provides Amazon S3 storage-level authorization by configuring EMRFS with multiple IAM roles. With this functionality enabled, different users and groups can share the same cluster and assume their own IAM roles respectively.

Simply put, on Amazon EMR, we can now have an Amazon EC2 role per user assumed at run time instead of one general EC2 role at the cluster level. When a user tries to access Amazon S3 resources, Amazon EMR evaluates the request against a predefined mapping list in the EMRFS configuration and picks up the right role for that user.

In this post, we will discuss what EMRFS authorization is (Amazon S3 storage-level access control) and show how to configure the role mappings with detailed examples. You will then have the desired permissions in a multi-tenant environment. We also demo Amazon S3 access from the HDFS command line, Apache Hive on Hue, and Apache Spark.

EMRFS authorization for Amazon S3

There are two prerequisites for using this feature:

  1. Users must be authenticated, because EMRFS needs to map the current user/group/prefix to a predefined user/group/prefix. There are several authentication options. In this post, we launch a Kerberos-enabled cluster that manages the Key Distribution Center (KDC) on the master node, and enable a one-way trust from the KDC to a Microsoft Active Directory domain.
  2. The application must support accessing Amazon S3 via EMRFS. Applications that have their own S3FileSystem APIs (for example, Presto) are not supported at this time.

EMRFS supports three types of mapping entries: user, group, and Amazon S3 prefix. Let’s use an example to show how this works.

Assume that you have the following three identities in your organization, defined in Active Directory: an admin user (admin1), a data engineering group (grp_data_engineering) containing users such as dataeng1, and a data science group (grp_data_science) containing users such as datascientist1.

To enable all these groups and users to share the EMR cluster, you need to define the following IAM roles: emrfs_auth_user_role_admin_user for the admin user, emrfs_auth_group_role_data_eng for the data engineering group, emrfs_auth_group_role_data_sci for the data science group, and emrfs_auth_prefix_role_default_s3_prefix for the default Amazon S3 prefix.

In this case, you create a separate Amazon EC2 role that doesn’t give any permission to Amazon S3. Let’s call it the base role (the EC2 role attached to the EMR cluster), which in this example is named EMR_EC2_RestrictedRole. Then, you define all the Amazon S3 permissions for each specific user or group in their own roles. The restricted role serves as the fallback when the user doesn’t match any user or group mapping and isn’t trying to access any of the Amazon S3 prefixes defined in the list.

Important: For all other roles, like emrfs_auth_group_role_data_eng, you need to add the base role (EMR_EC2_RestrictedRole) as a trusted entity so that the base role can assume them. See the following example:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::511586466501:role/EMR_EC2_RestrictedRole"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

The following is an example policy for the admin user role (emrfs_auth_user_role_admin_user):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": "*"
        }
    ]
}

We are assuming the admin user has access to all buckets in this example.

The following is an example policy for the data science group role (emrfs_auth_group_role_data_sci):

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::emrfs-auth-data-science-bucket-demo/*",
                "arn:aws:s3:::emrfs-auth-data-science-bucket-demo"
            ],
            "Action": [
                "s3:*"
            ]
        }
    ]
}

This role grants all Amazon S3 permissions to the emrfs-auth-data-science-bucket-demo bucket and all the objects in it. Similarly, the policy for the role emrfs_auth_group_role_data_eng is shown below:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::emrfs-auth-data-engineering-bucket-demo/*",
                "arn:aws:s3:::emrfs-auth-data-engineering-bucket-demo"
            ],
            "Action": [
                "s3:*"
            ]
        }
    ]
}

Example role mappings configuration

To configure EMRFS authorization, you use an EMR security configuration. Here is the role mapping configuration that we use in this post.
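For example, the role mapping section of the security configuration might look like the following JSON sketch, using the roles and buckets from this example and the same account ID as in the trust policy shown earlier (adjust the account ID and identifiers for your environment):

{
  "AuthorizationConfiguration": {
    "EmrFsConfiguration": {
      "RoleMappings": [
        {
          "Role": "arn:aws:iam::511586466501:role/emrfs_auth_user_role_admin_user",
          "IdentifierType": "User",
          "Identifiers": [ "admin1" ]
        },
        {
          "Role": "arn:aws:iam::511586466501:role/emrfs_auth_group_role_data_sci",
          "IdentifierType": "Group",
          "Identifiers": [ "grp_data_science" ]
        },
        {
          "Role": "arn:aws:iam::511586466501:role/emrfs_auth_group_role_data_eng",
          "IdentifierType": "Group",
          "Identifiers": [ "grp_data_engineering" ]
        },
        {
          "Role": "arn:aws:iam::511586466501:role/emrfs_auth_prefix_role_default_s3_prefix",
          "IdentifierType": "Prefix",
          "Identifiers": [ "s3://emrfs-auth-default-bucket-demo/" ]
        }
      ]
    }
  }
}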

Consider the following scenario.

First, the admin user admin1 tries to log in and run a command to access Amazon S3 data through EMRFS. The first role emrfs_auth_user_role_admin_user on the mapping list, which is a user role, is mapped and picked up. Then admin1 has access to the Amazon S3 locations that are defined in this role.

Then a user from the data engineer group (grp_data_engineering) tries to access a data bucket to run some jobs. When EMRFS sees that the user is a member of the grp_data_engineering group, the group role emrfs_auth_group_role_data_eng is assumed, and the user has proper access to Amazon S3 that is defined in the emrfs_auth_group_role_data_eng role.

Next comes a third user, who is not an admin and doesn’t belong to any of the groups. After failing evaluation against the top three entries, EMRFS evaluates whether the user is trying to access a certain Amazon S3 prefix defined in the last mapping entry. This type of mapping entry is called the prefix type. If the user is trying to access s3://emrfs-auth-default-bucket-demo/, then the prefix mapping is in effect, and the prefix role emrfs_auth_prefix_role_default_s3_prefix is assumed.
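The policy document for the prefix role isn’t shown above; a minimal sketch, assuming it only needs to grant access to the default bucket, might look like this:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::emrfs-auth-default-bucket-demo/*",
                "arn:aws:s3:::emrfs-auth-default-bucket-demo"
            ],
            "Action": [
                "s3:*"
            ]
        }
    ]
}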

If the user is not trying to access any of the Amazon S3 paths defined in the list (that is, the user fails the evaluation of all the entries), the user only has the permissions defined in EMR_EC2_RestrictedRole. This role is assumed by the EC2 instances in the cluster.

In this process, the mappings are evaluated in the order in which they are defined. The first role that matches is assumed, and the rest of the list is skipped.

Setting up an EMR cluster and mapping Active Directory users and groups

Now that we know how EMRFS authorization role mapping works, the next thing we need to think about is how we can use this feature in an easy and manageable way.

Active Directory setup

Many customers manage their users and groups using Microsoft Active Directory or other tools like OpenLDAP. In this post, we create the Active Directory on an Amazon EC2 instance running Windows Server and create the users and groups we will be using in the example below. After setting up Active Directory, we use the Amazon EMR Kerberos auto-join capability to establish a one-way trust from the KDC running on the EMR master node to the Active Directory domain on the EC2 instance. You can use your own directory service as long as it supports LDAP (Lightweight Directory Access Protocol).

To create and join Active Directory to Amazon EMR, follow the steps in the blog post Use Kerberos Authentication to Integrate Amazon EMR with Microsoft Active Directory.

After configuring Active Directory, you can create all the users and groups using the Active Directory tools and add users to the appropriate groups. In this example, we created the users admin1, dataeng1, and datascientist1 and the groups grp_data_engineering and grp_data_science, and then added the users to the right groups.

Join the EMR cluster to an Active Directory domain

For clusters with Kerberos, Amazon EMR now supports automated Active Directory domain joins. You can use the security configuration to configure the one-way trust from the KDC to the Active Directory domain. You also configure the EMRFS role mappings in the same security configuration.

The EMR security configuration contains both the Kerberos settings, with the trusted Active Directory domain EMRKRB.TEST.COM, and the EMRFS role mappings that we discussed earlier. The following is an example AWS CLI command that you can run to create it.
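A sketch of creating the security configuration from the AWS CLI, assuming the combined JSON (the Kerberos settings plus the AuthorizationConfiguration block sketched earlier) is saved in a hypothetical file named security-config.json:

aws emr create-security-configuration --name MyKerberosConfig \
--security-configuration file://security-config.json

The Kerberos portion of that file might look like the following, assuming the domain controller for emrkrb.test.com also serves as the admin server and KDC (adjust these values for your environment):

{
  "AuthenticationConfiguration": {
    "KerberosConfiguration": {
      "Provider": "ClusterDedicatedKdc",
      "ClusterDedicatedKdcConfiguration": {
        "TicketLifetimeInHours": 24,
        "CrossRealmTrustConfiguration": {
          "Realm": "EMRKRB.TEST.COM",
          "Domain": "emrkrb.test.com",
          "AdminServer": "emrkrb.test.com",
          "KdcServer": "emrkrb.test.com"
        }
      }
    }
  }
}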

Launching the EMR cluster and running the tests

Now you have configured Kerberos and EMRFS authorization for Amazon S3.

Additionally, you need to configure Hue with Active Directory using the Amazon EMR configuration API in order to log in using the AD users created before. The following is an example of Hue AD configuration.

[
  {
    "Classification":"hue-ini",
    "Properties":{

    },
    "Configurations":[
      {
        "Classification":"desktop",
        "Properties":{

        },
        "Configurations":[
          {
            "Classification":"ldap",
            "Properties":{

            },
            "Configurations":[
              {
                "Classification":"ldap_servers",
                "Properties":{

                },
                "Configurations":[
                  {
                    "Classification":"AWS",
                    "Properties":{
                      "base_dn":"DC=emrkrb,DC=test,DC=com",
                      "ldap_url":"ldap://emrkrb.test.com",
                      "search_bind_authentication":"false",
                      "bind_dn":"CN=adjoiner,CN=users,DC=emrkrb,DC=test,DC=com",
                      "bind_password":"Abc123456",
                      "create_users_on_login":"true",
                      "nt_domain":"emrkrb.test.com"
                    },
                    "Configurations":[

                    ]
                  }
                ]
              }
            ]
          },
          {
            "Classification":"auth",
            "Properties":{
              "backend":"desktop.auth.backend.LdapBackend"
            },
            "Configurations":[

            ]
          }
        ]
      }
    ]
  }
]

Note: In the preceding configuration JSON, change the values as required before pasting it into the software settings section in the Amazon EMR console.

Now let’s use this configuration and the security configuration you created before to launch the cluster.

In the Amazon EMR console, choose Create cluster. Then choose Go to advanced options. On the Step 1: Software and Steps page, under Edit software settings (optional), paste the configuration in the box.

The rest of the setup is the same as an ordinary cluster setup, except for the security options. In Step 4: Security, under Permissions, choose Custom, and then choose the EMR_EC2_RestrictedRole that you created before.

Choose the appropriate subnets (these should meet the base requirements for a successful Active Directory join; see the Amazon EMR Management Guide for more details), and choose the appropriate security groups to make sure that the cluster can talk to Active Directory. Choose an EC2 key pair so that you can log in and configure the cluster.

Most importantly, choose the security configuration that you created earlier to enable Kerberos and EMRFS authorization for Amazon S3.

You can use the following AWS CLI command to create a cluster.

aws emr create-cluster --name "TestEMRFSAuthorization" \
--release-label emr-5.10.0 \
--instance-type m3.xlarge \
--instance-count 3 \
--ec2-attributes InstanceProfile=EMR_EC2_DefaultRole,KeyName=MyEC2KeyPair \
--service-role EMR_DefaultRole \
--security-configuration MyKerberosConfig \
--configurations file://hue-config.json \
--applications Name=Hadoop Name=Hive Name=Hue Name=Spark \
--kerberos-attributes Realm=EC2.INTERNAL,KdcAdminPassword=<YourClusterKDCAdminPassword>,ADDomainJoinUser=<YourADUserLogonName>,ADDomainJoinPassword=<YourADUserPassword>,CrossRealmTrustPrincipalPassword=<MatchADTrustPwd>

Note: If you create the cluster using CLI, you need to save the JSON configuration for Hue into a file named hue-config.json and place it on the server where you run the CLI command.

After the cluster gets into the Waiting state, try to connect to the cluster over SSH using an Active Directory user name and password.

ssh -l admin1@emrkrb.test.com <EMR IP or DNS name>

Quickly run two commands to show that the Active Directory join is successful:

  1. id [user name] shows the mapped AD users and groups in Linux.
  2. hdfs groups [user name] shows the mapped groups in Hadoop.

Both should return the current Active Directory user and group information if the setup is correct.

Now you can test the user mapping. Log in with the admin1 user, and run a Hadoop list directory command:

hadoop fs -ls s3://emrfs-auth-data-science-bucket-demo/

Now switch to a user from the data engineer group.

Retry the previous command as the data engineer user. Because this user doesn’t have access to the data science bucket, it should throw an Amazon S3 Access Denied exception.

When you list the Amazon S3 bucket that the data engineering group does have access to, the group mapping is triggered.

hadoop fs -ls s3://emrfs-auth-data-engineering-bucket-demo/

It successfully returns the listing results. Next we will test Apache Hive and then Apache Spark.

 

To run jobs successfully, you need to create a home directory under /user/<username> in HDFS for every user, for staging data. You can configure a step to create a home directory at cluster launch time for every user who has access to the cluster (a manual sketch of this follows below). In this example, we rely on Hue, because Hue creates the home directory in HDFS for a user at first login. For this to work, Hue needs to be integrated with the same Active Directory, as shown in the example configuration described earlier.
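A sketch of creating those home directories manually (for example, as an EMR step or from the master node), assuming the example users from this post:

# Run on the master node as a user with HDFS superuser rights (for example, via sudo -u hdfs).
# Creates a staging home directory for each example user and hands over ownership.
for u in admin1 dataeng1 datascientist1; do
  sudo -u hdfs hdfs dfs -mkdir -p /user/$u
  sudo -u hdfs hdfs dfs -chown $u /user/$u
  sudo -u hdfs hdfs dfs -chmod 700 /user/$u
done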

First, log in to Hue as a data engineer user, and open a Hive Notebook in Hue. Then run a query to create a new table pointing to the data engineer bucket, s3://emrfs-auth-data-engineering-bucket-demo/table1_data_eng/.
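A minimal sketch of such a query, with a hypothetical two-column schema (the point is only that LOCATION targets the data engineering bucket):

CREATE EXTERNAL TABLE table1_data_eng (
col1 STRING,
col2 INT
)
LOCATION 's3://emrfs-auth-data-engineering-bucket-demo/table1_data_eng/';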

You can see that the table was created successfully. Now try to create another table pointing to the data science group’s bucket, where the data engineer group doesn’t have access.

It failed and threw an Amazon S3 Access Denied error.

Now insert one line of data into the successfully created table.
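Continuing with the hypothetical schema above, the insert might look like this (Hive on EMR 5.10 supports INSERT ... VALUES; the values are placeholders):

INSERT INTO table1_data_eng VALUES ('sample_row', 1);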

Next, log out, switch to a data science group user, and create another table, test2_datasci_tb.

The creation is successful.

The last task is to test Spark (it requires the user’s home directory, which Hue created in the previous step).

Now let’s come back to the command line and run some Spark commands.

Log in to the master node as the datascientist1 user.

Start the SparkSQL interactive shell by typing spark-sql, and run the show tables command. It should list the tables that you created using Hive.

As a data science group user, try a SELECT on both tables. You will find that you can only select from the table defined in a location that your group has access to.
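A sketch of that session, using the table names from this walkthrough (table1_data_eng uses the hypothetical schema above):

spark-sql> SHOW TABLES;
spark-sql> SELECT * FROM test2_datasci_tb LIMIT 5;   -- succeeds for a data science group user
spark-sql> SELECT * FROM table1_data_eng LIMIT 5;    -- fails with an Amazon S3 Access Denied error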

Conclusion

EMRFS authorization for Amazon S3 enables you to have multiple roles on the same cluster, providing flexibility to configure a shared cluster for different teams to achieve better efficiency. The Active Directory integration and group mapping make it much easier for you to manage your users and groups, and provides better auditability in a multi-tenant environment.


Additional Reading

If you found this post useful, be sure to check out Use Kerberos Authentication to Integrate Amazon EMR with Microsoft Active Directory and Launching and Running an Amazon EMR Cluster inside a VPC.


About the Authors

Songzhi Liu is a Big Data Consultant with AWS Professional Services. He works closely with AWS customers to provide them with Big Data & Machine Learning solutions and best practices on the Amazon cloud.

Analyze Data with Presto and Airpal on Amazon EMR

Post Syndicated from Songzhi Liu original https://blogs.aws.amazon.com/bigdata/post/Tx1BF2DN6KRFI27/Analyze-Data-with-Presto-and-Airpal-on-Amazon-EMR

Songzhi Liu is a Professional Services Consultant with AWS

You can now launch Presto version 0.119 on Amazon EMR, allowing you to easily spin up a managed EMR cluster with the Presto query engine and run interactive analysis on data stored in Amazon S3. You can integrate with Spot instances, publish logs to an S3 bucket, and use EMR’s configure API to configure Presto. In this post, I’ll show you how to set up a Presto cluster and use Airpal to process data stored in S3.

What is Presto?

Presto is a distributed SQL query engine optimized for ad hoc analysis. It supports the ANSI SQL standard, including complex queries, aggregations, joins, and window functions. Presto can run on multiple data sources, including Amazon S3. Presto’s execution framework is fundamentally different from that of Hive/MapReduce: Presto has a custom query and execution engine where the stages of execution are pipelined, similar to a directed acyclic graph (DAG), and all processing occurs in memory to reduce disk I/O. This pipelined execution model can run multiple stages in parallel and streams data from one stage to another as the data becomes available. This reduces end-to-end latency and makes Presto a great tool for ad hoc data exploration over large datasets.

What is Airpal?

Airpal is a web-based query execution tool open-sourced by Airbnb that leverages Presto to facilitate data analysis. Airpal has many helpful features. For example, you can highlight syntax, export results to CSV for download, view query history, save queries, use a Table Finder to search for appropriate tables, and use Table Explorer to visualize the schema of a table. We have created an AWS CloudFormation script that makes it easy to set up Airpal on an Amazon EC2 instance on AWS.

For this blog post, we use Wikimedia’s page count data, which is publicly available at s3://support.elasticmapreduce/training/dataset/wikistats/. This data is in text file format. We will also convert the table to Parquet and ORC.

Spin up an EMR cluster with Hive and Presto installed

First, log in to the AWS console and navigate to the EMR console. Choose EMR-4.1.0 and Presto-Sandbox. Make sure you provide SSH keys so that you can log into the cluster.

Note: Write down the DNS name after creation is complete. You’ll need this for the next step.

 

Use AWS CloudFormation to deploy the Airpal server

Make sure you have a valid Key Pair for the region in which you want to deploy Airpal.

Navigate to AWS CloudFormation, click Create New Stack, name your stack, and choose Specify an Amazon S3 template URL.

Use the template in https://s3-external-1.amazonaws.com/emr.presto.airpal/scripts/deploy_airpal_env.json

Click Next and configure the parameters.

Important parameters you should configure:

PrestoCoordinatorURL: Use the DNS name you noted earlier, in the format http://<DNS name of the cluster>:<Presto port>. The default port for the Presto installation is 8889. Example: http://ec2-xx-xx-xx-xx.compute-1.amazonaws.com:8889

AirpalPort: The port on which the Airpal server should run. The default is 8193. Adjust this according to your firewall settings and make sure the port isn’t blocked.

S3BootstrapBuckets: The S3 bucket that holds the bootstrap scripts. There is no need to change the default value of emr.presto.airpal.

InstallAirpal: The path to the installation script for the Airpal server. There is no need to change the default value of scripts/install_airpal.sh.

StartAirpal: The path to the startup script for the Airpal server. There is no need to change the default value of scripts/start_airpal.sh.

MyKeyPairName: Select a valid key pair that you have in this region. You’ll use it to log in to the master node.

Click Next and add a tag to the stack if needed. Select the check box for IAM policy and click Create.

Wait 5-10 minutes after the stack status changes to CREATE_COMPLETE. (The server configuration takes longer than the stack creation.)

Navigate to the EC2 console, select the Airpal server instance, and note its public IP address.

Open a browser and go to <PublicIP>:<AirpalPort> to reach Airpal. Make sure that port 8889 is allowed on the master security group for your EMR cluster.

Log in to the master node and run Hive scripts

Presto ships with several connectors. To query data from Amazon S3, you need to use the Hive connector. Presto uses Hive only to create the metadata; Presto’s execution engine is different from that of Hive. By default, when you install Presto on your cluster, EMR installs Hive as well. The metadata is stored in a database such as MySQL and is accessed through the Hive metastore service, which is also installed.

The dataset, around 7 GB, contains hit data for Wikipedia pages. The schema is as follows:

Language of the page

Title of the page

Number of hits

Retrieved page size

Define the schema

To define the schema:

Log in to the master node using the following command in the terminal:

ssh -i YourKeyPair.pem hadoop@ec2-xx-xx-xx-xx.compute-1.amazonaws.com

Replace YourKeyPair.pem with the location and name of your .pem file. Replace ec2-xx-xx-xx-xx.compute-1.amazonaws.com with the public DNS name of your EMR cluster.

Type “hive” in the command line to enter Hive interactive mode and run the following commands:

CREATE EXTERNAL TABLE wikistats (
language STRING,
page_title STRING,
hits BIGINT,
retrived_size BIGINT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ' '
LINES TERMINATED BY '\n'
LOCATION 's3://support.elasticmapreduce/training/datasets/wikistats/';

Now you have created a “wikistats” table stored as delimited text. You can also store this table in the Parquet format using the following command:

CREATE EXTERNAL TABLE wikistats_parq (
language STRING,
page_title STRING,
hits BIGINT,
retrived_size BIGINT
)
STORED AS PARQUET
LOCATION 's3://emr.presto.airpal/wikistats/parquet';

You can store it in the compressed ORC format using the following command:

CREATE EXTERNAL TABLE wikistats_orc (
language STRING,
page_title STRING,
hits BIGINT,
retrived_size BIGINT
)
STORED AS ORC
LOCATION 's3://emr.presto.airpal/wikistats/orc';

Now we have three tables holding the same data in three different formats.
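If you want to populate the Parquet and ORC tables yourself, point their LOCATION at a bucket you own rather than the public example bucket, and then convert from the text table; a sketch of that conversion in Hive:

-- Assumes wikistats_parq and wikistats_orc were created with LOCATION set to a bucket you own
INSERT OVERWRITE TABLE wikistats_parq SELECT * FROM wikistats;
INSERT OVERWRITE TABLE wikistats_orc SELECT * FROM wikistats;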

Try Presto in Airpal

Open a browser and go to http://<IP address of the EC2 instance>:8193.

You will use Presto queries to answer the questions below. Paste the following queries into the Airpal query editor.

What is the most frequently viewed page with page_title that contains “Amazon”?

SELECT language,
page_title,
SUM(hits) AS hits
FROM default.wikistats
WHERE language = 'en'
AND page_title LIKE '%Amazon%'
GROUP BY language,
page_title
ORDER BY hits DESC
LIMIT 10;

 

On average, which page is hit the most in English?

SELECT language, page_title, AVG(hits) AS avg_hits
FROM default.wikistats
WHERE language = 'en'
GROUP BY language, page_title
ORDER BY avg_hits DESC
LIMIT 10;

Try wikistats_orc and wikistats_parq with the same query. Do you see any difference in performance?
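For example, the same aggregation against the ORC-backed table (swap in wikistats_parq to compare Parquet):

SELECT language, page_title, AVG(hits) AS avg_hits
FROM default.wikistats_orc
WHERE language = 'en'
GROUP BY language, page_title
ORDER BY avg_hits DESC
LIMIT 10;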

Go back to Airpal and view the results. The top records are Main_Page, Special:, 404_error, and so on, which we don’t really care about. These entries are noise here, so you should filter them out in your query:

SELECT language, page_title, AVG(hits) AS avg_hits
FROM default.wikistats
WHERE language = 'en'
AND page_title NOT IN ('Main_Page', '404_error/')
AND page_title NOT LIKE '%Special%'
AND page_title NOT LIKE '%index%'
AND page_title NOT LIKE '%Search%'
AND NOT regexp_like(page_title, '%20')
GROUP BY language, page_title
ORDER BY avg_hits DESC
LIMIT 10;

Using the Presto CLI

You can also use the Presto CLI directly on the EMR cluster to query the data.

Log in to the master node using the following command in the terminal:

ssh -i YourKeyPair.pem hadoop@ec2-xx-xx-xx-xx.compute-1.amazonaws.com

Replace YourKeyPair.pem with the location and name of your pem file. Replace ec2-xx-xx-xx-xx.compute-1.amazonaws.com with the public DNS name of your EMR cluster.

Assuming you already defined the schema using Hive, start the Presto-CLI.

Run the following command:

$ presto-cli --catalog hive --schema default

Check to see if the table is still there.
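For example:

presto:default> SHOW TABLES;
presto:default> DESCRIBE wikistats;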

Try the same query you tried earlier.

SELECT language, page_title, AVG(hits) AS avg_hits
FROM default.wikistats
WHERE language = 'en'
AND page_title NOT IN ('Main_Page', '404_error/')
AND page_title NOT LIKE '%Special%'
AND page_title NOT LIKE '%index%'
AND page_title NOT LIKE '%Search%'
AND NOT regexp_like(page_title, '%20')
GROUP BY language, page_title
ORDER BY avg_hits DESC
LIMIT 10;

As you can see, you can also execute the query from the Presto CLI.

Summary

Presto is a distributed SQL query engine optimized for ad hoc analysis and data exploration. It supports the ANSI SQL standard, including complex queries, aggregations, joins, and window functions. In this post, I’ve shown you how easy it is to set up an EMR cluster with Presto 0.119, create metadata using Hive, and use either the Presto CLI or Airpal to run interactive queries.

If you have questions or suggestions, please leave a comment below.

——————————————-

Related

Large-Scale Machine Learning with Spark on Amazon EMR

——————————————–

Love to work on open source? Check out EMR’s careers page.

 
