All posts by Vetri Natarajan

Automate replication of row-level security from AWS Lake Formation to Amazon QuickSight

Post Syndicated from Vetri Natarajan original https://aws.amazon.com/blogs/big-data/automate-replication-of-row-level-security-from-aws-lake-formation-to-amazon-quicksight/

Amazon QuickSight is cloud-powered, serverless, and embeddable business intelligence (BI) service that makes it straightforward to deliver insights to your organization. As a fully managed service, Amazon QuickSight lets you create and publish interactive dashboards that can then be accessed from different devices and embedded into your applications, portals, and websites.

When authors create datasets, build dashboards, and share with end-users, the users will see the same data as the author, unless row-level security (RLS) is enabled in the Amazon QuickSight dataset. Amazon QuickSight also provides options to pass a reader’s identity to a data source using trusted identity propagation and apply RLS at the source. To learn more, see Centrally manage permissions for tables and views accessed from Amazon QuickSight with trusted identity propagation and Simplify access management with Amazon Redshift and AWS Lake Formation for users in an External Identity Provider.

However, there are a few requirements when using trusted identity propagation with Amazon QuickSight:

  • The authentication method for Amazon QuickSight must be using AWS IAM Identity Center.
  • The dataset created using trusted identity propagation will be a direct query dataset in Amazon QuickSight. QuickSight SPICE can’t be used with trusted identity propagation. This is because when using SPICE, data is imported (replicated) and therefore the entitlements at the source can’t be used when readers access the dashboard.

This post outlines a solution to automatically replicate the entitlements for readers from the source (AWS Lake Formation) to Amazon QuickSight. This solution can be used even when the authentication method in Amazon QuickSight is not using IAM Identity Center and can work with both direct query and SPICE datasets in Amazon QuickSight. This lets you take advantage of auto scaling that comes with SPICE. Although we focus on using a Lake Formation table that exists in the same account, you can extend the solution for cross-account tables as well. When extracting data filter rules for the table in another account, the execution role must have necessary access to the tables in the other account.

Use case overview

For this post, let’s consider a large financial institution that has implemented Lake Formation as its central data lake and entitlement management system. The institution aims to streamline access control and maintain a single source of truth for data permissions across its entire data ecosystem. By using Lake Formation for entitlement management, the financial institution can maintain a robust, scalable, and compliant data access control system that serves as the foundation for its data-driven operations and analytics initiatives. This approach is particularly crucial for maintaining compliance with financial regulations and maintaining data security. The analytics team wants to build an Amazon QuickSight dashboard for data and business teams.

Solution overview

This solution uses APIs of AWS Lake Formation and Amazon QuickSight to extract, transform, and store AWS Lake Formation data filters in a format that can be used in QuickSight.

The solution has four key steps:

  1. Extract and transform the row-level security (data filters) and permissions to data filters for tables of interest from AWS Lake Formation.
  2. Create a rules dataset in Amazon QuickSight.

We use the following key services:

The following diagram illustrates the solution architecture.

Prerequisites

To implement this solution, you should have following services enabled in the same account

  1. AWS Lake Formation and
  2. Amazon QuickSight
  3. AWS Identity and Access Management (IAM) permissions: Make sure you have necessary IAM permissions to perform operation across all the services mentioned in the solution overview above
  4. AWS Lake Formation table with data filters with right permissions
  5. Amazon QuickSight principals (Users or Groups)

The below section shows how you can create Amazon QuickSight groups and AWS Lake formation tables and data filters

Create groups in QuickSight

Create two groups in Amazon QuickSight: QuickSight_Readers and QuickSight_Authors. For instructions, see Create a group with the QuickSight console.

You can then form the Amazon Resource Names (ARNs) of the groups as follows. These will be used when granting permission in AWS Lake Formation for data filters.

  • arn:aws:quicksight:<<identity-region>>:<<AWSAcocuntId>>:group/<<namespace>>/QuickSight_Readers
  • arn:aws:quicksight:<<identity-region>>:<<AWSAcocuntId>>:group/<<namespace>>/QuickSight_Authors

You can also get the ARN of the groups by executing the Amazon QuickSight CLI command list-groups. The following screenshot shows the output.

Create a table in AWS Lake Formation

The following section is for example purposes and not necessary for production use of this solution. Complete the following steps to create a table in AWS Lake Formation using sample data. In this post, the table is called saas_sales.

  1. Download the file Saas Sales.csv.
  2. Upload the file to an Amazon S3 location.
  3. Create a table in AWS Lake Formation.

Create row-level security (data filter) in AWS Lake Formation

In AWS Lake Formation, data filters are used to filter the data in a table for an individual or group. Complete the following steps to create a data filter:

  1. Create a data filter called QuickSightReaderFilter in the table saas_sales. For Row-level access, enter the expression segment = 'Enterprise'.
  2. Grant the Amazon QuickSight group access to this data filter. Use the reader group ARN from the first step for SAML Users and groups.
  3. Grant the QuickSight_Authors group full access to the table. Use the reader group ARN from the first step for SAML Users and groups.
  4. (Optional) You can create another table called second_table and create another data filter called SecondFilter and grant permission to the QuickSight_Readers group.

Now that you have set up the table, permissions, and data filters, you can extract the row-level access details for the QuickSight_Readers and QuickSight_Authors groups and the saas_sales table in AWS Lake Formation, and create the rules dataset in Amazon QuickSight for the saas_sales table.

Extract and transform data filters and permissions from AWS Lake Formation using a Lambda function

In AWS Lake Formation, data filters are created for each table. There can be many tables in AWS Lake Formation. However, for a team or a project, there are only a specific set of tables that the BI developer is interested in. Therefore, choose a list of tables to track and update the data filters for. In a batch process, for each table in AWS Lake Formation, extract the data filter definitions and write them into Amazon S3 using AWS Lake Formation and Amazon S3 APIs.

We use the following AWS Lake Formation APIs to extract the data filter details and permissions:

  • ListDataCellFilters – This API is used to list all the data filters in each table that is required for the project
  • ListPermissions – This API is used to retrieve the permissions for each of the data filters extracted using the ListDataCellFilters API

The Lambda function covers three parts of the solution:

  • Extract the data filters and permissions to data filters for tables of interest from AWS Lake Formation
  • Transform the data filters and permission into a format usable in Amazon QuickSight
  • Persist the transformed data

Complete the following steps to create an AWS Lambda function:

  1. On the Lambda console, create a function called Lake_Formation_QuickSight_RLS. Use Python 3.12 as the runtime and create a new role for execution.
  2. Configure Lambda function timeout to 2 minutes. This can vary depending on the number of tables to be parsed and the number of data filters to be transformed.
  3. Attach the following permissions to the Lambda execution role:
    {
    "Version": "2012-10-17",
    "Statement": [
    {
    "Sid": "VisualEditor0",
    "Effect": "Allow",
    "Action": [
    "lakeformation:ListDataCellsFilter",
    "lakeformation:ListPermissions"
    ],
    "Resource": "*"
    },
    {
    "Sid": "VisualEditor1",
    "Effect": "Allow",
    "Action": "s3:PutObject",
    "Resource": "arn:aws:s3:::<bucket_used_for_storage>/*"
    }
    ]
    }

  4. Set the following environment variables for the Lambda function:

    Name Value
    S3Bucket Value of the S3 bucket where the output files will be stored
    tablesToTrack List of tables to track as JSON converted to string
    Tmp /tmp

The Lambda function gets the list of tables and S3 bucket details from the environment variables. The list of tables is given as a JSON array converted to string. The JSON format is shown in the following code. The values for catalogId, DatabaseName, and Name can be fetched from the AWS Lake Formation console.

[
{
"CatalogId": "String",
"DatabaseName": "String",
"Name": "String"
}
]
  1. Add a folder named tmp.
  2. Download the zip file Lake_Formation_QuickSight_RLS.zip.
    Note: This is sample code for non-production usage. You should work with your security and legal teams to meet your organizational security, regulatory, and compliance requirements before deployment.
  3. For the Lambda function code, upload the downloaded .zip file to the Lambda function, on the Code tab.
  4. Provide necessary access to the execution role in AWS Lake Formation. Although the AWS Identity and Access Management (IAM) permissions are given to the Lambda execution role, explicit permission has to be given to the role in AWS Lake Formation for the Lambda function to get the details about the data filters. Therefore, you have to explicitly grant access to the execution role to limit the Lambda role to read-only admin. For more details, see Viewing data filters.

In the following sections, we explain what the Lambda function code does in more detail.

Extract data filters and permissions for data filters and tables in AWS Lake Formation

The main flow of the code takes the list of tables as input and extracts table and data filter permissions and data filter rules. The approach here is to get the permissions for the entire table and also for the data filters applied to the table. This way, both full access (table level) and partial access (data filter) can be extracted.

...
....
tablesToTrack= json.loads(os.environ["tablesToTrack"])
lf_client = boto3.client('lakeformation')
# For each table in the list get the data filter rules attached to the table.
for table in tablesToTrack:
df_response= lf_client.list_data_cells_filter(
Table= table
)
d_filters += df_response["DataCellsFilters"]

# Also, for each table in the list get the list of permissions at table level.
# This determines who has access to all rows in the table.
tresponse=lf_client.list_permissions(
Resource= {
"Table": table
}
)

d_permissions += tresponse["PrincipalResourcePermissions"]
transformDataFilterRules(d_filters)
# For each data filters fetched above, get the permissions.
# This determines the row level security for the tables.
for filter in d_filters:
p_response=lf_client.list_permissions(
Resource= {

"DataCellsFilter": {
"DatabaseName": filter ["DatabaseName"],
"Name": filter["Name"],
"TableCatalogId": filter["TableCatalogId"],
"TableName": filter["TableName"]
}

}
)
d_permissions += p_response["PrincipalResourcePermissions"]

transformFilterandTablePermissions(d_permissions)

Transform data filter definitions in to a format usable in Amazon QuickSight

The extracted permissions and filters are transformed to create a rules dataset in Amazon QuickSight. There are different ways to define data filters. The following figure illustrates some of the example transformations.

The function transformDataFilterRules in the following code can transform some of the OR and AND conditions into Amazon QuickSight acceptable format. The following are the details available in the transformed format:

  • Lake Formation catalog ID
  • Lake Formation database name
  • Lake Formation table name
  • Lake Formation data filter name
  • List of columns from all the tables provided in the input for which the data filter rules are defined

See the following code:

def transformDataFilterRules(rules):
global complete_transformed_filter_rules
transformed_filter_rules = []
filter_to_extract=[]
complete_transformed_filter_rules = []
col_headers=[]
col_headers.append("catalog")
col_headers.append("database")
col_headers.append("table")
col_headers.append("filter")

for rule in rules:
print(rule)
catalog=rule["TableCatalogId"]
database = rule["DatabaseName"]
table = rule["TableName"]
filter = rule["Name"]
row=[]
row.append(catalog)
row.append(database)
row.append(table)
row.append(filter)
logger.info(f"row==={row}")

f_conditions = re.split(' OR | or | and | AND ' , rule["RowFilter"]["FilterExpression"])

for f_condition in f_conditions:
logger.info(f"f_condition={f_condition}")
f_condition = f_condition.replace("(","")
f_condition = f_condition.replace(")","")
filter_rule_column= f_condition.split("=")
if len(filter_rule_column)>1:
filter_rule_column[0] = filter_rule_column[0].strip()
if not filter_rule_column[0].strip() in col_headers:
col_headers.append(filter_rule_column[0].strip())
i= col_headers.index(filter_rule_column[0].strip())
j= i- (len(row)-1)
if j>0:
for x in range(1, j):
row.append("")
logger.info(f"i={i} j={j} {filter_rule_column[1]}")
row.insert(i, filter_rule_column[1].replace("'",""))
print(row)
transformed_filter_rules.append(','.join(row))

row=[]
row.append(catalog)
row.append(database)
row.append(table)
row.append(filter)
max_columns = len(col_headers)
complete_transformed_filter_rules=[]
for rule in transformed_filter_rules:
r = rule.split(",")
to_fill = max_columns - len(r)
if to_fill>0:
for x in range(1, to_fill+1):
r.append("")
complete_transformed_filter_rules.append(','.join(r))

complete_transformed_filter_rules.insert(0,','.join(col_headers))

The following figure is an example of the transformed file. The file contains the columns for both tables. When creating a rules dataset for a specific table, the records are filtered for that table pulled into Amazon QuickSight.

The function transformFilterandTablePermissions in the following code snippet combines and transforms the table and data filter permissions into a flat structure that contains the following columns:

  • Amazon QuickSight group ARN
  • Lake Formation catalog ID
  • Lake Formation database name
  • Lake Formation table name
  • Lake Formation data filter name

See the following code:

def transformFilterandTablePermissions(permissions):
    global transformed_table_permissions,transformed_filter_permissions
    # Read and set table level access
    transformed_table_permissions = []
    transformed_filter_permissions = []
    transformed_filter_permissions.insert(0,"group,catalog,database,table,filter")
    transformed_table_permissions.insert(0,"group,catalog,database,table")
    
    for permission in permissions:
    group=""
    database=""
    table =""
    catalog=""
    
    p= permission["Permissions"]
    
    if "DESCRIBE" in p or "SELECT" in p:
    
    group = permission["Principal"]["DataLakePrincipalIdentifier"]
    if "Database" in permission["Resource"]:
    catalog=permission["Resource"]["Database"]["CatalogId"]
    database=permission["Resource"]["Database"]["Name"]
    table = "*"
    transformed_table_permissions.append(group + "," + catalog+ "," + database + "," + table)
    transformed_filter_permissions.append(group+"," +catalog + ","+ database + ","+ table)
    elif "TableWithColumns" in  permission["Resource"]  or "Table" in permission["Resource"]:
    if "TableWithColumns" in  permission["Resource"]:
    catalog=permission["Resource"]["TableWithColumns"]["CatalogId"]
    database = permission["Resource"]["TableWithColumns"]["DatabaseName"]
    table = permission["Resource"]["TableWithColumns"]["Name"]
    elif "Table" in  permission["Resource"]:
    catalog=permission["Resource"]["Table"]["CatalogId"]
    database = permission["Resource"]["Table"]["DatabaseName"]
    table = permission["Resource"]["Table"]["Name"]
    transformed_table_permissions.append( group + "," + catalog + "," + database + "," + table)
    transformed_filter_permissions.append(group+"," +catalog + ","+ database + ","+ table)
    elif "DataCellsFilter" in permission["Resource"]:
    catalog=permission["Resource"]["DataCellsFilter"]["TableCatalogId"]
    database = permission["Resource"]["DataCellsFilter"]["DatabaseName"]
    table = permission["Resource"]["DataCellsFilter"]["TableName"]
    filter = permission["Resource"]["DataCellsFilter"]["Name"]
    transformed_filter_permissions.append(group+"," +catalog + ","+ database + ","+ table+ ","+ filter)

The following figure is an example of the extracted data filter and table permissions. AWS Lake Formation can have data filters applied to any principal. However, we focus on the Amazon QuickSight principals:

  • The QuickSight_Authors ARN has full access to two tables. This is determined by transforming the table-level permissions in addition to the data filter permissions.
  • The QuickSight_Readers ARN has limited access based on filter conditions.

Store the transformed rules and permissions in two separate files in Amazon S3

The transformed rules and permissions are then persisted in a data store. In this solution, the transformed rules are written to an Amazon S3 location in CSV format. The name of the files created by the Lambda function are:

  • transformed_filter_permissions.csv
  • transformed_filter_rules.csv

See the following code:

with open("/tmp/transformed_table_permissions.csv", "w") as txt_file:
for line in transformed_table_permissions:
txt_file.write(line + "\n") # works with any number of elements in a line
txt_file.close()
s3 = boto3.resource('s3')
s3.meta.client.upload_file(Filename = "/tmp/transformed_table_permissions.csv", Bucket= os.environ['S3Bucket'], Key = "table-permissions/transformed_table_permissions.csv")

with open("/tmp/transformed_filter_permissions.csv", "w") as txt_file:
for line in transformed_filter_permissions:
txt_file.write(line + "\n") # works with any number of elements in a line
txt_file.close()

s3.meta.client.upload_file(Filename = "/tmp/transformed_filter_permissions.csv", Bucket= os.environ['S3Bucket'], Key = "filter-permissions/transformed_filter_permissions.csv")

with open("/tmp/transformed_filter_rules.csv", "w") as txt_file:
for line in complete_transformed_filter_rules:
txt_file.write(line + "\n") # works with any number of elements in a line
txt_file.close()

s3.meta.client.upload_file(Filename = "/tmp/transformed_filter_rules.csv", Bucket= os.environ['S3Bucket'], Key = "filter-rules/transformed_filter_rules.csv")

Create a rules dataset in Amazon QuickSight

In this section, we walk through the steps to create a rules dataset in Amazon QuickSight.

Create a table in Lake formation for the files

The first step is to create a table in AWS Lake Formation for the two files, transformed_filter_permissions.csv and transformed_filter_rules.csv.

Although you can directly use an Amazon S3 connector in Amazon QuickSight, creating a table and making the rules dataset using an Athena connector gives flexibility in writing custom SQL and using direct query. For the steps to bring an Amazon S3 location into AWS Lake Formation, see Creating tables.

For this post, the tables for the files are created in a separate database called quicksight_lf_transformation.

Grant permission for the tables to the QuickSight_Authors group

Grant permission in AWS Lake Formation for the two tables to the QuickSight_Authors group. This is essential for Amazon QuickSight authors to create a rules dataset in Amazon QuickSight. The following screenshot shows the permission details.

Create a rules dataset in Amazon QuickSight

Amazon QuickSight supports both user-level and group-level RLS. In this post, we use groups to enable RLS. To create the rules dataset, you first join the filter permissions table with the filter rules table on the columns catalog, database, table, and filter. Then you can filter the permissions to include the Amazon QuickSight principals, and include only the columns required for the dataset. The objective in this solution is to build a rules dataset for the saas_sales table.

Complete the following steps:

  1. On the Amazon QuickSight console, create a new Athena dataset.
  2. Specify the following:
    1. For Catalog, choose AWSDataCatalog.
    2. For Database, choose quicksight_lf_transformation.
    3. For Table, choose filter_permissions.
  3. Choose Edit/Preview data.
  4. Choose Add data.
  5. Choose Add source.
  6. Select Athena.
  7. Specify the following:
    1. For Catalog, choose AWSDataCatalog.
    2. For Database, choose quicksight_lf_transformation.
    3. For Table, choose filter_rules.

  8. Join the permissions table with the data filter rules table on the catalog, database, table and filter columns.
  9. Rename the column group as GroupArn. This needs to be done before filter is applied.
  10. Filter the data where column table equals saas_sales.
  11. Filter the data where column group is also filtered for values starting with arn:aws:quicksight (Amazon QuickSight principals).
  12. Exclude fields that are not part of the saas_sales table.
  13. Change Query mode to SPICE.
  14. Publish the dataset.

If your organization has a mapping of other principals to a Amazon QuickSight group or user, you can apply that mapping before joining the tables.

You can also write the following custom SQL to achieve the same result:

SELECT a."group" as GroupArn, segment FROM "QuickSight_lf_transformation"."filter_permissions" as a
left join
"QuickSight_lf_transformation"."filter_rules" as b
on
a.catalog = b.catalog and
a.database = b.database and
a."table" = b."table" and
a.filter = b.filter
where a."table" = 'saas_sales'
and a."group" like 'arn:aws:quicksight%'

  1. Name the dataset LakeFormationRLSDataSet and publish the dataset.

Test the row-level security

Now you’re ready to test the row-level security by publishing a dashboard as a user in the QuickSight_Authors group and then viewing the dashboard as a user in the QuickSight_Readers group.

Publish a dashboard as a QuickSight_Authors group user

As an author who belongs to the QuickSight_Authors group, the user will be able to see the saas_sales table in the Athena connector and all the data in the table. As shown in this section, all three segments are visible for the author when creating an analysis and viewing the published dashboard.

  1. Create a dataset by pulling data from the saas_sales table using the Athena connector.
  2. Attach LakeFormationRLSDataSet as the RLS dataset for the saas_sales dataset. For instructions, see Using row-level security with user-based rules to restrict access to a dataset.
  3. Create an analysis using the saas_sales dataset as an author who belongs to the QuickSight_Authors group.
  4. Publish the dashboard.
  5. Share the dashboard with the group QuickSight_Readers.

View the dashboard as a QuickSight_Readers group user

Complete the following steps to view the dashboard as a QuickSight_Readers group user:

  1. Log into Amazon QuickSight as a reader who belongs to the QuickSight_Readers group.

The user will be able to see only the segment Enterprise.

  1. Now, change the RLS in AWS Lake Formation, and set the segment to be SMB for the QuickSightReaderFilter.
  2. Run the Lambda function to export and transform the new data filter rules.
  3. Refresh the SPICE dataset LakeFormationRLSDataSet in Amazon QuickSight.
  4. When the refresh is complete, refresh the dashboard in the reader login.

Now the reader user will see SMB data.

Cleanup

Amazon QuickSight resources

  1. Delete the Amazon QuickSight dashboard and analysis created
  2. Delete the datasets saas_sales and LakeFormationRulesDataSet
  3. Delete the Athena data source
  4. Delete the QuickSight groups using the DeleteGroup API

AWS Lake Formation resources

  1. Delete the database quicksight_lf transformation created in AWS Lake Formation
  2. Revoke permission given to the Lambda execution role
  3. Delete the saas_sales table and data filters created
  4. If you have used Glue crawler to create the tables in AWS Lake Formation, remove the Glue crawler as well

Compute resources

  1. Delete the AWS Lambda function created
  2. Delete the AWS Lambda execution role associated with the lambda

Storage resources

  1. Empty the content of the Amazon S3 bucket created for this solution
  2. Delete the Amazon S3 bucket

Conclusion

This post explained how to replicate row-level security in AWS Lake Formation automatically in Amazon QuickSight. This makes sure that the SPICE dataset in QuickSight can use row-level access defined in Lake Formation.

This solution can also be extended for other data sources. The logic to programmatically extract the entitlements from the source and transform them into Amazon QuickSight format will vary by source. After the extract and transform are in place, it can scale to multiple teams in the organization. Although this post laid out a basic approach, the automation has to be either scheduled to run periodically or triggered based on events like data filters change or grant or revoke of AWS Lake Formation permissions to make sure that the entitlements remain in sync between AWS Lake Formation and Amazon QuickSight.

Try out this solution for your own use case, and share your feedback in the comments.


About the Authors

Vetri Natarajan is a Specialist Solutions Architect for Amazon QuickSight. Vetri has 15 years of experience implementing enterprise business intelligence (BI) solutions and greenfield data products. Vetri specializes in integration of BI solutions with business applications and enable data-driven decisions.

Ismael Murillo is a Solutions Architect for Amazon QuickSight. Before joining AWS, Ismael worked in Amazon Logistics (AMZL) with delivery station management, delivery service providers, and our customer actively in the field. Ismael focused on last mile delivery and delivery success. He designed and implemented many innovative solutions to help reduce cost, influence delivery success. He is also a United States Army Veteran, where he served for eleven years.

Automate and accelerate your Amazon QuickSight asset deployments using the new APIs

Post Syndicated from Vetri Natarajan original https://aws.amazon.com/blogs/big-data/automate-and-accelerate-your-amazon-quicksight-asset-deployments-using-the-new-apis/

Business intelligence (BI) and IT operations (BIOps) teams often need to automate and accelerate the deployment of BI assets to ensure business continuity. We heard that you wanted an automated and scalable way to deploy, back up, and replicate Amazon QuickSight assets at scale so that BIOps teams within your organization can work in an agile manner.

Today, we are releasing six new QuickSight APIs to allow programmatic access to export and import QuickSight assets—dashboards, analyses, datasets including ingestion schedules, data sources, themes, and VPC configurations—across accounts and environments. These new APIs allow you to interact with a collection of assets in a lift-and-shift manner for deployment across QuickSight accounts, enable backup and restore, and support replication so you can automate workflows and achieve your desired infrastructure setup. These new capabilities bring greater agility to your BIOps teams, allowing you to automate and seamlessly integrate QuickSight assets into existing infrastructure.

Prior to this launch, you needed to have an in-depth understanding of QuickSight asset relationships and couldn’t deploy, back up, or replicate at scale in an automated manner. In this post, we cover the capabilities of the new APIs in detail and go over common use cases.

Export APIs

You can use the following APIs to initiate, track, and describe the export jobs that produce the bundle files from the source account. A bundle file is a zip file (with the .qs extension) that contains assets specified by the caller, and optionally all dependencies of the assets. The APIs are as follows:

  • StartAssetBundleExportJob – Use this asynchronous API to export an asset bundle file.
  • DescribeAssetBundleExportJob – Use this synchronous API to get the status of your export job. When successful, this API call response will have a presigned URL to fetch the asset bundle.
  • ListAssetBundleExportJobs – Use this synchronous API to list past export jobs. The list will contain both finished and running jobs from the past 15 days.

Import APIs

These APIs initiate, track, and describe the import jobs that take the bundle file as input and create or update assets in the destination account:

Supported assets

You can export and import the following assets with these APIs:

  • Analyses
  • Dashboards
  • Data sources
  • Datasets including scheduled and incremental refreshes
  • Themes
  • VPC connections

Asset bundle output format

The output of the export job is a single zip file with the .qs extension. This zip file contains a separate folder for each asset type. Each folder contains a single JSON file for each asset with the resourceId as the file name. This folder structure makes it easy to commit the contents into a version control system like Git, so you can get the benefits of a complete version history.

The Asset-bundle API can also export QuickSight assets as AWS CloudFormation templates, one of the most popular infrastructure as code (IaC) frameworks. It makes it easy to manage your QuickSight assets at scale and automate your deployments. AWS CloudFormation also has built-in transaction and rollback capabilities, ensuring that all your environments are consistent and your assets are deployed correctly every time. Finally, you can use the CloudFormation templates to recreate your QuickSight resources in case of a disaster.

Permissions required

These APIs are available to users with AWS Identity and Access Management (IAM) permissions to run these APIs. The following IAM policy allows an IAM user to get access to these APIs:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [          
                "quicksight:StartAssetBundleImportJob",
                "quicksight:DescribeAssetBundleImportJob",
                "quicksight:ListAssetBundleImportJobs",
                "quicksight:StartAssetBundleExportJob",
                "quicksight:DescribeAssetBundleExportJob",
                "quicksight:ListAssetBundleExportJobs"
            ],
            "Resource": "*"
        }
    ]
}

Use case overview

Let’s consider a fictional company, AnyCompany, which owns healthcare facilities across the globe. They have set up a development QuickSight account for authors to create and update QuickSight assets and a separate production account. In some cases due, to data residency regulation, they have to maintain the same assets across multiple Regions. AnyCompany is scaling their business and they want to automate deployment within and across multiple QuickSight accounts and back up QuickSight assets on a schedule.

AnyCompany has the following key deployment and backup requirements:

  • Deployment – Deploy QuickSight assets across Regions and multiple accounts:
    • Deployment to the production account – AnyCompany wants to automate the deployment of QuickSight assets from their development to their production account.
    • Deployment to different Regions in the same account – AnyCompany’s central IT team needs to deploy dashboards and datasets across various Regions to meet data residency requirements.
    • Deployment to multiple accounts in different Regions – To meet their end customer requirements of separate QuickSight accounts, AnyCompany needs to deploy the dashboards and datasets across multiple accounts.
    • Deployment in the same account and same Region – AnyCompany consolidates all non-production environment into one QuickSight account. However, there has to be different dashboards and datasets for each non-production environment, such as development and testing.
  • Backup and restore – As AnyCompany rolls out critical dashboards for business, it needs to ensure high availability of the dashboards. As part of their strategy, AnyCompany wants to maintain a backup of assets to restore in case of disasters.
  • Deployment history – As part of the governed process, AnyCompany’s central IT team needs to have a history of deployments in each environment.

In the following sections, we discuss how to meet these requirements.

Deploy to a production account

The following figure shows the sequence of steps in the deployment process through the new asset deployment APIs.

For deployments, the import job API provides the capability to pass data source configurations to point to the respective test or production instances of data sources. In the preceding sequence flow, we use the AWS Command Line Interface (AWS CLI) to showcase the capability, but you can invoke the APIs through your automation pipeline using AWS SDKs.

For this use case, AnyCompany used Amazon Simple Storage Service (Amazon S3) to store their asset bundles..

On the development QuickSight account, complete the following steps:

  1. Use the StartAssetBundleExportJob API to export the dashboard and its dependencies.
  2. Place the output asset bundle in an S3 bucket in the production account.

On the production QuickSight account, complete the following steps:

  1. Use the StartAssetBundleImportJob API with the asset bundle in Amazon S3 as the source, overriding the data source details.
  2. Run the import job.

The following code shows their StartAssetBundleExportJob API call to export the dashboard and its dependencies:

aws quicksight start-asset-bundle-export-job 
--aws-account-id $AAI 
--asset-bundle-export-job-id job-1 
--resource-arns arn:aws:quicksight:$IR:$AAI:dashboard/<<dashboard-id>> 
--include-all-dependencies 
--export-format QUICKSIGHT_JSON

The following code is for the DescribeAssetBundleExportJob API:

aws quicksight describe-asset-bundle-export-job 
--aws-account-id $AAI 
--asset-bundle-export-job-id job-1

The output of the DescribeAssetBundleExportJob API call contains the presigned URL, which you use to download your respective assets and subsequently upload them to a dedicated S3 bucket in the target account.

The import job (StartAssetBundleImportJob) is initiated in the target account using S3Uri as one of the input parameters. You can also change the data source configuration while initiating the job. In the following example, the S3 manifest file location for the S3 data source is overridden:

aws quicksight start-asset-bundle-import-job
 --aws-account-id $AAI 
--asset-bundle-import-job-id job-1 
--asset-bundle-import-source "{\"S3Uri\": \"<<qsfile location\"}"
 --override-parameters '{"DataSources": 
 [{"DataSourceId": "<<DataSourceID>>", 
 "DataSourceParameters": 
 { "S3Parameters": {"ManifestFileLocation": 
 {"Bucket": "<<bucket name>>", "Key": "<<key for manifest file>>"}}}}]}' 
--region us-west-2

DescribeAssetBundleImportJob lets you to monitor the status of the import job:

aws quicksight describe-asset-bundle-import-job 
--aws-account-id $AAI 
--asset-bundle-import-job-id job-1 
--region us-west-2

The following screenshot shows the response.

Deploy to different Regions in the different accounts

To comply with data residency regulations, data can’t be moved outside a Region in certain countries. Therefore, the dashboards have to be deployed in each of these Regions. QuickSight provides the option to pass an asset bundle extracted from the source environment as a base64 encoded string for the import job (StartAssetBundleImportJob):

aws quicksight start-asset-bundle-import-job \
 --aws-account-id $AAI  \
--asset-bundle-import-job-id <<job-id>> \ 
--asset-bundle-import-source Body="$(base64 assetbundle-extract.qs)"

Deploy within a single account in the same Region

When using a centralized account approach for all the lower environments, AnyCompany wanted to have the same dashboards within a single Region to be able to connect with different data sources. To achieve this, they used the optional parameter resource-id-prefix in the import job (StartAssetBundleImportJob) to create a unique ID for each environment:

aws quicksight start-asset-bundle-import-job
--aws-account-id $AAI 
--asset-bundle-import-job-id job-1 
--asset-bundle-import-source "{\"S3Uri\": \"s3://qs file s3 location\"}"
--override-parameters '{"DataSources": 
[{"DataSourceId": "<<DataSource ID>>", 
"DataSourceParameters":{ "S3Parameters": {"ManifestFileLocation": 
{"Bucket": "ee-assets-prod-us-west-2", 
"Key": "modules/337d5d05acc64a6fa37bcba6b921071c/v1/SalesDataManifest.json"}}}}]}' 
--region us-west-2
--resource-id-prefix "test-"

Backup and version control

AnyCompany deploys business-critical dashboards, and it’s important for them to have proper backup and version control processes. They run scheduled export jobs at regular intervals along with asset deployments. Additionally, they use asset bundle APIs to meet their version control requirements.

The following screenshot shows the content of a sample bundle.

Deployment history

AnyCompany needs to track the deployment history of all the assets in all environments. They achieved this goal by using the ListAssetBundleExportJobs and ListAssetBundleImportJobs APIs to fetch the deployment history in a given account.

The following code is for ListAssetBundleExportJobs:

aws quicksight list-asset-bundle-export-jobs \
 --aws-account-id $AAI  \

The following code is for ListAssetBundleImportJobs:

aws quicksight list-asset-bundle-export-jobs \
 --aws-account-id $AAI  \

Conclusion

Asset bundle APIs provide methods for automation and acceleration in the deployment process across multiple environments. This post illustrated various use cases where you can apply these APIs for automation and scale. For more information, refer to Amazon QuickSight and What’s New in the Amazon QuickSight User Guide.

If you have any questions or feedback, please leave a comment. For additional discussions and help getting answers to your questions, check out the QuickSight Community.


About the authors

Vetri Natarajan is a Specialist Solutions Architect for Amazon QuickSight. Vetri has 15 years of experience implementing enterprise business intelligence (BI) solutions and greenfield data products. Vetri specializes in integration of BI solutions with business applications and enable data-driven decisions.

Zhao Pan is a software development manager for Amazon QuickSight. He is working to provide a delightful developer experience to our customers to automate and streamline their BI operations. He has 20 years of software development experience in various tech stacks. Prior to QuickSight he was a people and technical leader at ADP building a next-gen platform for human capital management. When he is not at his desk, he can usually be found in his garage building one contraption or another.

Mayank Agarwal is a product manager for Amazon QuickSight, AWS’ cloud-native, fully managed BI service. He focuses on embedded analytics and developer experience. He started his career as an embedded software engineer developing handheld devices. Prior to QuickSight he was leading engineering teams at Credence ID, developing custom mobile embedded device and web solutions using AWS services that make biometric enrollment and identification fast, intuitive, and cost-effective for Government sector, healthcare and transaction security applications.

Scale Amazon QuickSight embedded analytics with new API-based domain allow listing

Post Syndicated from Vetri Natarajan original https://aws.amazon.com/blogs/big-data/scale-amazon-quicksight-embedded-analytics-with-new-api-based-domain-allow-listing/

Amazon QuickSight is a fully-managed, cloud-native business intelligence (BI) service that makes it easy to connect to your data, create interactive dashboards, and share these with tens of thousands of users, either within QuickSight itself or embedded in apps and portals.

QuickSight Enterprise Edition recently introduced the ability to dynamically allow list the domains where QuickSight content can be embedded. This allows developers to quickly embed dashboards across multiple apps, portals, or websites, without needing to make this change on the QuickSight administrative console every time. Together with QuickSight’s existing dashboard theming and templating capabilities, this new feature allows developers to rapidly develop and deploy QuickSight dashboards and visualizations for a variety of use cases across various applications with ease. Let’s take a look at how this works.

Solution overview

To embed a QuickSight dashboard using APIs, you can use one of the following embedding APIs:

In these APIs, you can now pass the domain where you want to embed your dashboard using the new parameter AllowedDomains:

POST /accounts/AwsAccountId/embed-url/registered-user HTTP/1.1
Content-type: application/json
 
{
   "AllowedDomains": [ "string" ],
   "ExperienceConfiguration": { 
      "Dashboard": { 
         "InitialDashboardId": "string"
      },
      "QSearchBar": { 
         "InitialTopicId": "string"
      },
      "QuickSightConsole": { 
         "InitialPath": "string"
      }
   },
   "SessionLifetimeInMinutes": number,
   "UserArn": "string"
}

You can add up to three domains in a single API call as an array list. All the domains need to be SSL enabled (using HTTPS protocol). If you want to test out the embedded dashboard on your local machine, you can allow list http://localhost via the AllowedDomains parameter. For example, if you want to embed a dashboard in your SaaS application called https://myorders.simplelogistics.com, you set AllowedDomains to be https://myorders.simplelogistics.com in the API call. You can also enable sub domains by passing *, for example, https://*.myorders.simplelogistics.com.

AllowedDomains is an optional parameter. If you don’t specify any domains via this parameter, you can still use the domains allow listed via the QuickSight console. But if you specify domains via this parameter, then the embedding URL returned as part of the API call is only embeddable in these domains (even if you have a list of static domains entered on the QuickSight console).

Prior to this capability, the Content-Service-Policy in the request header listed all the domains allow listed in QuickSight console. Now when allow listing the domains using the API, the Content-Service-Policy only shows the domains that are allow listed in the API call.

With this new capability, ISVs that have different applications for different customers can allow list specific domains at runtime, enabling them to scale easily for different customers and to hundreds of thousands of end-users.

As an added security, the AWS Identity and Access Management (IAM) admin of your QuickSight account can restrict the domains that can be allow listed. This can be done when your IAM admin sets up permissions for your application or server. As part of this step, you can specify the list of domains that can be allow listed via the embedding APIs. For example, let’s assume you want your developers to only allow list the following domains:

You can set these domains in the quicksight:AllowedEmbeddingDomain of the permissions setup. The following code is a sample for the GenerateEmbedURLForAnonymousUser API:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
            "quicksight:GenerateEmbedUrlForAnonymousUser"
            ],
            "Resource": "arn:partition:quicksight:region:accountId:user/namespace/userName",
            "Condition": {
                "ForAllValues:StringEquals": {
                    "quicksight:AllowedEmbeddingDomains": [
                        "https://myorders.simplelogistics.com",
                        "https://cheapelectornics.simplelogistics.com",
                        "https://myreturns.simplelogistics.com"
                    ]
                }
            }
        }
    ]
}

Sample use case

In this example use case, Travel Analytics is a software as a service (SaaS) provider with travel-related solutions for various travel agencies. They have a SaaS application for these agencies to track different metrics on how their business is performing. Because Travel Analytics is scaling their business, they have different sites for different travel agencies. With the newly launched domain allow listing with APIs, they’re able to scale with ease. They allow list the specific domains, depending on the customer, via the API when generating the embedding URL.

The following code shows their sample GenerateEmbedURLForAnonymousUser API call with the domain added to the request:

The returned URL can only be embedded in the domain that was allow listed as part of the preceding request. The following is a screenshot of the embedded dashboard in this domain.

The CSP header has only the specific allow listed domain via the API when the dashboard is embedded.

Conclusion

Runtime domain allow listing using embedding APIs enables developers to scale their embedded offerings with QuickSight dashboards, visuals, QuickSight Q (natural language querying), or authoring experience across different domains for their different customers easily. All of this is done without any infrastructure setup or management, while scaling to millions of users. For more information, refer to Amazon QuickSight Embedded Analytics and What’s New in the Amazon QuickSight User Guide.


About the authors

Vetri Natarajan is a Specialist Solutions Architect for Amazon QuickSight. Vetri has 15 years of experience implementing enterprise Business Intelligence (BI) solutions and greenfield data products. Vetri specializes in integration of BI solutions with business applications and enable data-driven decisions.

Kareem Syed-Mohammed is a Product Manager at Amazon QuickSight. He focuses on embedded analytics, APIs, and developer experience. Prior to QuickSight he has been with AWS Marketplace and Amazon retail as a PM. Kareem started his career as a developer and then PM for call center technologies, Local Expert and Ads for Expedia. He worked as a consultant with McKinsey and Company for a short while.