Perform secure database write-backs with Amazon QuickSight

Post Syndicated from Srikanth Baheti original https://aws.amazon.com/blogs/big-data/perform-secure-database-write-backs-with-amazon-quicksight/

Amazon QuickSight is a scalable, serverless, machine learning (ML)-powered business intelligence (BI) solution that makes it easy to connect to your data, create interactive dashboards, get access to ML-enabled insights, and share visuals and dashboards with tens of thousands of internal and external users, either within QuickSight itself or embedded into any application.

A write-back is the ability to update a data mart, data warehouse, or any other database backend from within BI dashboards and analyze the updated data in near-real time within the dashboard itself. In this post, we show how to perform secure database write-backs with QuickSight.

Use case overview

To demonstrate how to enable a write-back capability with QuickSight, let’s consider a fictional company, AnyCompany Inc. AnyCompany is a professional services firm that specializes in providing workforce solutions to their customers. AnyCompany determined that running workloads in the cloud to support its growing global business needs is a competitive advantage and uses the cloud to host all its workloads. AnyCompany decided to enhance the way its branches provide quotes to its customers. Currently, the branches generate customer quotes manually, and as a first step in this innovation journey, AnyCompany is looking to develop an enterprise solution for customer quote generation with the capability to dynamically apply local pricing data at the time of quote generation.

AnyCompany currently uses Amazon Redshift as their enterprise data warehouse platform and QuickSight as their BI solution.

Building a new solution comes with the following challenges:

  • AnyCompany wants a solution that is easy to build and maintain, and they don’t want to invest in building a separate user interface.
  • AnyCompany wants to extend the capabilities of their existing QuickSight BI dashboard to also enable quote generation and quote acceptance. This will simplify feature rollouts because their employees already use QuickSight dashboards and enjoy the easy-to-use interface that QuickSight provides.
  • AnyCompany wants to store the quote negotiation history that includes generated, reviewed, and accepted quotes.
  • AnyCompany wants to build a new dashboard with quote history data for analysis and business insights.

This post goes through the steps to enable write-back functionality to Amazon Redshift from QuickSight. Note that traditional BI tools are typically read-only, with few or no options to update source data.

Solution overview

This solution uses the following AWS services:

  • Amazon API Gateway – Hosts and secures the write-back REST API that will be invoked by QuickSight
  • AWS Lambda – Runs the compute function required to generate the hash and a second function to securely perform the write-back
  • Amazon QuickSight – Offers BI dashboards and quote generation capabilities
  • Amazon Redshift – Stores quotes, prices, and other relevant datasets
  • AWS Secrets Manager – Stores and manages keys to sign hashes (message digest)

Although this solution uses Amazon Redshift as the data store, a similar approach can be implemented with any database that supports creating user-defined functions (UDFs) that can invoke Lambda.

The following figure shows the workflow to perform write-backs from QuickSight.

The first step in the solution is to generate a hash or a message digest of the set of attributes in Amazon Redshift by invoking a Lambda function. This step prevents request tampering. To generate a hash, Amazon Redshift invokes a scalar Lambda UDF. The hashing mechanism used here is the popular BLAKE2 function (available in the Python library hashlib). To further secure the hash, keyed hashing is used, which is a faster and simpler alternative to hash-based message authentication code (HMAC). This key is generated and stored by Secrets Manager and should be accessible only to allowed applications. After the secure hash is generated, it’s returned to Amazon Redshift and combined in an Amazon Redshift view.

Writing the generated quote back to Amazon Redshift is performed by the write-back Lambda function, and an API Gateway REST API endpoint is created to secure and pass requests to the write-back function. The write-back function performs the following actions:

  1. Generate the hash based on the API input parameters received from QuickSight.
  2. Sign the hash by applying the key from Secrets Manager.
  3. Compare the generated hash with the hash received in the input parameters using the compare_digest function available in the Python hmac module.
  4. Upon successful validation, write the record to the quote submission table in Amazon Redshift.
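
The validation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the post's actual write-back function: the key is hard-coded here (the real function would read it from Secrets Manager), and the sample message string is made up. The only requirement is that the write-back side hashes exactly the same bytes that were hashed when the digest was first generated.

```python
from hashlib import blake2b
from hmac import compare_digest

# Assumption: stands in for the key stored in Secrets Manager.
DEMO_KEY = b"demo-key-not-for-production"
AUTH_SIZE = 16  # digest size in bytes


def sign(message: str) -> str:
    """Return the keyed BLAKE2b digest of the message as a hex string."""
    h = blake2b(digest_size=AUTH_SIZE, key=DEMO_KEY)
    h.update(message.encode("utf-8"))
    return h.hexdigest()


def is_valid(message: str, received_hash: str) -> bool:
    """Recompute the digest and compare it in constant time."""
    return compare_digest(sign(message), received_hash)


# Hypothetical concatenated attributes for one quote row
original = "AnyCustomerYesWelderDayTX25.5"
digest = sign(original)

print(is_valid(original, digest))                        # True
print(is_valid("AnyCustomerYesWelderDayTX1.0", digest))  # False: cost was altered
```

compare_digest is used instead of == so that the comparison time does not reveal how many leading characters of the digest matched.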

The following sections provide detailed steps with sample payloads and code snippets.

Generate the hash

The hash is generated using a Lambda UDF in Amazon Redshift. Additionally, a Secrets Manager key is used to sign the hash. To create the hash, complete the following steps:

  1. Create the Secrets Manager key from the AWS Command Line Interface (AWS CLI):
aws secretsmanager create-secret --name "name_of_secret" --description "Secret key to sign hash" --secret-string '{"name_of_key":"value"}' --region us-east-1
  2. Create a Lambda function for the UDF that generates the hash:
import json
from hashlib import blake2b

import boto3
from botocore.exceptions import ClientError


def get_secret():
    # Fetch the key that the Lambda function uses to further secure (sign) the hash.
    secret_name = "<name_of_secret>"
    region_name = "<aws_region_name>"

    # Create a Secrets Manager client
    session = boto3.session.Session()
    client = session.client(service_name='secretsmanager', region_name=region_name)

    # This sample does not handle the specific exceptions of the 'GetSecretValue' API.
    # See https://docs.aws.amazon.com/secretsmanager/latest/apireference/API_GetSecretValue.html
    # The exception is rethrown by default.
    try:
        get_secret_value_response = client.get_secret_value(SecretId=secret_name)
    except ClientError as e:
        raise e

    if "SecretString" in get_secret_value_response:
        access_token = get_secret_value_response["SecretString"]
    else:
        access_token = get_secret_value_response["SecretBinary"]

    # blake2b expects the key as bytes (at most 64 bytes long)
    return json.loads(access_token)["<token_key_name>"].encode('utf-8')


SECRET_KEY = get_secret()
AUTH_SIZE = 16


def sign(payload):
    # Keyed BLAKE2b digest of the payload bytes, returned as hex-encoded bytes
    h = blake2b(digest_size=AUTH_SIZE, key=SECRET_KEY)
    h.update(payload)
    return h.hexdigest().encode('utf-8')


def lambda_handler(event, context):
    ret = dict()
    try:
        res = []
        # Amazon Redshift sends rows in batches: one entry in event['arguments'] per row
        for argument in event['arguments']:
            try:
                msg = json.dumps(argument)
                signed_key = sign(str.encode(msg))
                res.append(signed_key.decode('utf-8'))
            except Exception:
                # Return NULL for a row that cannot be hashed
                res.append(None)
        ret['success'] = True
        ret['results'] = res
    except Exception as e:
        ret['success'] = False
        ret['error_msg'] = str(e)

    return json.dumps(ret)
  3. Define an Amazon Redshift UDF to call the Lambda function to create a hash:
CREATE OR REPLACE EXTERNAL FUNCTION udf_get_digest (par1 varchar)
RETURNS varchar STABLE
LAMBDA 'redshift_get_digest'
IAM_ROLE 'arn:aws:iam::<AWSACCOUNTID>:role/service-role/<role_name>';

The AWS Identity and Access Management (IAM) role in the preceding step should have the following policy attached to be able to invoke the Lambda function:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "lambda:InvokeFunction",
            "Resource": "arn:aws:lambda:us-east-1:<AWSACCOUNTID>:function:redshift_get_digest"
        }
    ]
}
  4. Fetch the key from Secrets Manager.

This key is used by the Lambda function to further secure the hash. This is indicated in the get_secret function in Step 2.
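
To check the hashing function before wiring it to Amazon Redshift, you can call a handler locally with a hand-built event. The sketch below is an illustration under stated assumptions: the key is hard-coded in place of get_secret(), the handler name and sample rows are made up, and the event shows only the arguments field (Amazon Redshift sends additional metadata fields). It relies on the same contract as the function in Step 2: one inner list per row in arguments, and one result per row in the same order.

```python
import json
from hashlib import blake2b

# Assumption: replaces get_secret() so the sketch runs without AWS access.
DEMO_KEY = b"demo-key-not-for-production"


def sign(payload: bytes) -> str:
    """Keyed BLAKE2b digest (16 bytes) as a hex string."""
    return blake2b(payload, digest_size=16, key=DEMO_KEY).hexdigest()


def handler(event, context=None):
    # Each row arrives as a list of UDF arguments; hash its JSON serialization.
    results = [sign(json.dumps(row).encode("utf-8")) for row in event["arguments"]]
    return json.dumps({"success": True, "results": results})


# Hand-built batch that imitates two rows passed to udf_get_digest(par1)
sample_event = {
    "arguments": [["AcmeYesWelderDayTX25.5"], ["AcmeNoDriverNightCA31.0"]],
}

response = json.loads(handler(sample_event))
print(response["success"], len(response["results"]))
```

Because the digest is computed over the JSON serialization of each row's argument list, the write-back function has to serialize its input the same way before hashing, or the two digests will never match.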

Set up Amazon Redshift datasets in QuickSight

The quote generation dashboard uses the following Amazon Redshift view.

Create an Amazon Redshift view that uses all the preceding columns along with the hash column:

create view quote_gen_vw as
select *, udf_get_digest(customername || BGCheckRequired || Skill || Shift || State || Cost) as hashvalue
from billing_input_tbl;

The records will look like the following screenshot.

The preceding view will be used as the QuickSight dataset to generate quotes. A QuickSight analysis will be created using the dataset. For near-real-time analysis, you can use QuickSight direct query mode.

Create API Gateway resources

The write-back operation is initiated by QuickSight invoking an API Gateway resource, which invokes the Lambda write-back function. As a prerequisite for creating the calculated field in QuickSight to call the write-back API, you must first create these resources.

API Gateway secures and invokes the write-back Lambda function, passing the parameters as URL query string parameters through mapping templates. You can avoid the mapping templates by using the Lambda proxy integration.

Create a REST API resource of method type GET that uses Lambda functions (created in the next step) as the integration type. For instructions, refer to Creating a REST API in Amazon API Gateway and Set up Lambda integrations in API Gateway.

The following screenshot shows the details for creating a query string parameter for each parameter passed to API Gateway.

The following screenshot shows the details for creating a mapping template parameter for each parameter passed to API Gateway.
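
The difference between the two integration styles shows up in the event that the Lambda function receives. The following sketch is a hypothetical helper, not part of the original solution: with the Lambda proxy integration, API Gateway places the query string under queryStringParameters, whereas with a mapping template the shape is whatever the template emits (assumed here to be one top-level key per parameter).

```python
def extract_params(event: dict) -> dict:
    """Return the request parameters from either integration style."""
    # Lambda proxy integration: the query string arrives under
    # "queryStringParameters", which is None when no parameters were sent.
    if "queryStringParameters" in event:
        return dict(event["queryStringParameters"] or {})
    # Non-proxy integration: assume the mapping template emits one
    # top-level key per parameter.
    return dict(event)


# Hypothetical events for the same request under each integration style
proxy_event = {"queryStringParameters": {"cust": "Acme", "bgc": "Yes", "hash": "abc123"}}
mapped_event = {"cust": "Acme", "bgc": "Yes", "hash": "abc123"}

print(extract_params(proxy_event) == extract_params(mapped_event))  # True
```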

Create the Lambda function

Create a new Lambda function for the API Gateway to invoke. The Lambda function performs the following steps:

  1. Receive parameters from QuickSight through API Gateway and hash the concatenated parameters.

The following code example retrieves parameters from the API Gateway call using the event object of the Lambda function:

    customer = event['customer']
    bgc = event['bgc']

The function performs the hashing logic as shown in the create hash step earlier using the concatenated parameters passed by QuickSight.

  2. Compare the hashed output with the hash parameter.

If these don’t match, the write-back won’t happen.

  3. If the hashes match, perform a write-back. Check for the presence of a record in the quote generation table by generating a query from the table using the parameters passed from QuickSight:
# The values are placed into the SQL text only after the hash check has passed;
# prefer bound parameters if your database driver supports them.
query_str = "select * from tbquote where cust = '" + cust + "' and bgc = '" + bgc + "' and skilledtrades = '" + skilledtrades + "' and shift = '" + shift + "' and jobdutydescription = '" + jobdutydescription + "'"
  4. Complete the following action based on the results of the query:
    1. If no record exists for the preceding combination, generate and run an insert query using all parameters with the status as generated.
    2. If a record exists for the preceding combination, generate and run an insert query with the status as in review. The quote_Id for the existing combination will be reused.
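
The insert logic above can be sketched as follows. This is an illustration only: it uses an in-memory SQLite database as a stand-in for Amazon Redshift so that it runs anywhere, the tbquote column layout (including quote_id, billrate, and status) is a guess based on the parameters in this post, and the record_quote helper is hypothetical. It also uses bound parameters rather than string concatenation.

```python
import sqlite3
import uuid

# Assumption: SQLite stands in for Amazon Redshift; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbquote (quote_id TEXT, cust TEXT, bgc TEXT, skilledtrades TEXT, "
    "shift TEXT, jobdutydescription TEXT, billrate REAL, status TEXT)"
)

KEY_COLUMNS = ("cust", "bgc", "skilledtrades", "shift", "jobdutydescription")


def record_quote(conn, params: dict) -> tuple:
    """Insert a quote history row and return (quote_id, status)."""
    key_values = [params[c] for c in KEY_COLUMNS]
    where = " AND ".join(f"{c} = ?" for c in KEY_COLUMNS)
    row = conn.execute(
        f"SELECT quote_id FROM tbquote WHERE {where} LIMIT 1", key_values
    ).fetchone()
    if row is None:
        # First time this combination is seen: new quote
        quote_id, status = str(uuid.uuid4()), "generated"
    else:
        # Combination already exists: reuse its quote ID
        quote_id, status = row[0], "in review"
    conn.execute(
        "INSERT INTO tbquote VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        [quote_id, *key_values, params["billrate"], status],
    )
    return quote_id, status


params = {
    "cust": "Acme", "bgc": "Yes", "skilledtrades": "Welder",
    "shift": "Day", "jobdutydescription": "Site welding", "billrate": 25.5,
}
first = record_quote(conn, params)
second = record_quote(conn, params)
print(first[1], "->", second[1])  # generated -> in review
```

Every call inserts a new row rather than updating an existing one, which is what preserves the quote negotiation history.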

Create a QuickSight visual

This step involves creating a table visual that uses a calculated field to pass parameters to API Gateway and invoke the preceding Lambda function.

  1. Add a QuickSight calculated field named Generate Quote to hold the API Gateway hosted URL that will be triggered to write back the quote history into Amazon Redshift:
concat("https://xxxxx.execute-api.us-east-1.amazonaws.com/stage_name/apiresourcename/?cust=",customername,"&bgc=",bgcheckrequired,"&billrate=",toString(billrate),"&skilledtrades=",skilledtrades,"&shift=",shift,"&jobdutydescription=",jobdutydescription,"&hash=",hashvalue)
  2. Create a QuickSight table visual.
  3. Add required fields such as Customer, Skill, and Cost.
  4. Add the Generate Quote calculated field and style this as a hyperlink.

Choosing this link writes the record into Amazon Redshift, provided that the Lambda function produces the same hash value when it hashes the parameters.

The following screenshot shows a sample table visual.

Write to the Amazon Redshift database

The Secrets Manager key is fetched and used by the Lambda function to generate the hash for comparison. The write-back will be performed only if the hash matches with the hash passed in the parameter.

The following Amazon Redshift table will capture the quote history as populated by the Lambda function. Records in green represent the most recent records for the quote.

Considerations and next steps

Using secure hashes prevents the tampering of payload parameters that are visible in the browser window when the write-back URL is invoked. To further secure the write-back URL, you can employ the following techniques:

  • Deploy the REST API in a private VPC that is accessible only to QuickSight users.
  • To prevent replay attacks, a timestamp can be generated alongside the hashing function and passed as an additional parameter in the write-back URL. The backend Lambda function can then be modified to only allow write-backs within a certain time-based threshold.
  • Follow the API Gateway access control and security best practices.
  • Mitigate potential Denial of Service for public-facing APIs.
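
The timestamp technique for preventing replay attacks could look like the following sketch. The details are assumptions rather than part of the original solution: the key is hard-coded, the five-minute threshold and the "|" separator are arbitrary choices, and is_fresh_and_valid is a hypothetical helper. The important point is that the timestamp is included in the signed message, so it cannot be changed without invalidating the hash.

```python
import time
from hashlib import blake2b
from hmac import compare_digest

DEMO_KEY = b"demo-key-not-for-production"  # assumption: stands in for the Secrets Manager key
MAX_AGE_SECONDS = 300                      # hypothetical threshold


def sign(message: str) -> str:
    return blake2b(message.encode("utf-8"), digest_size=16, key=DEMO_KEY).hexdigest()


def is_fresh_and_valid(fields: str, issued_at: int, received_hash: str, now=None) -> bool:
    """Accept the request only if the hash matches and the timestamp is recent."""
    now = time.time() if now is None else now
    # The timestamp is part of the signed message, so it is tamper-evident.
    expected = sign(f"{fields}|{issued_at}")
    if not compare_digest(expected, received_hash):
        return False
    return 0 <= now - issued_at <= MAX_AGE_SECONDS


issued = 1_700_000_000  # epoch seconds when the hash was generated
digest = sign(f"AcmeYesWelderDay|{issued}")

print(is_fresh_and_valid("AcmeYesWelderDay", issued, digest, now=issued + 60))    # True
print(is_fresh_and_valid("AcmeYesWelderDay", issued, digest, now=issued + 3600))  # False: too old
```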

You can further enhance this solution to render a web-based form when the write-back URL is opened. This could be implemented by dynamically generating an HTML form in the backend Lambda function to support the input of additional information. If your workload requires a high number of write-backs that require higher throughput or concurrency, a purpose-built data store like Amazon Aurora PostgreSQL-Compatible Edition might be a better choice. For more information, refer to Invoking an AWS Lambda function from an Aurora PostgreSQL DB cluster. These updates can then be synchronized into Amazon Redshift tables using federated queries.

Conclusion

This post showed how to use QuickSight along with Lambda, API Gateway, Secrets Manager, and Amazon Redshift to capture user input data and securely update your Amazon Redshift data warehouse without leaving your QuickSight BI environment. This solution eliminates the need to create an external application or user interface for database update or insert operations, and reduces related development and maintenance overhead. The API Gateway call can also be secured using a key or token to ensure only calls originating from QuickSight are accepted by the API Gateway. This will be covered in subsequent posts.


About the Authors

Srikanth Baheti is a Specialized World Wide Principal Solutions Architect for Amazon QuickSight. He started his career as a consultant and worked for multiple private and government organizations. Later he worked for PerkinElmer Health and Sciences & eResearch Technology Inc, where he was responsible for designing and developing high traffic web applications, highly scalable and maintainable data pipelines for reporting platforms using AWS services and Serverless computing.

Raji Sivasubramaniam is a Sr. Solutions Architect at AWS, focusing on Analytics. Raji is specialized in architecting end-to-end Enterprise Data Management, Business Intelligence and Analytics solutions for Fortune 500 and Fortune 100 companies across the globe. She has in-depth experience in integrated healthcare data and analytics with wide variety of healthcare datasets including managed market, physician targeting and patient analytics.