Tag Archives: IAM policies

Easier way to control access to AWS regions using IAM policies

Post Syndicated from Sulay Shah original https://aws.amazon.com/blogs/security/easier-way-to-control-access-to-aws-regions-using-iam-policies/

We made it easier for you to comply with regulatory standards by controlling access to AWS Regions using IAM policies. For example, if your company requires users to create resources in a specific AWS region, you can now add a new condition to the IAM policies you attach to your IAM principal (user or role) to enforce this for all AWS services. In this post, I review conditions in policies, introduce the new condition, and review a policy example to demonstrate how you can control access across multiple AWS services to a specific region.

Condition concepts

Before I introduce the new condition, let’s review the condition element of an IAM policy. A condition is an optional IAM policy element that lets you specify special circumstances under which the policy grants or denies permission. A condition includes a condition key, operator, and value for the condition. There are two types of conditions: service-specific conditions and global conditions. Service-specific conditions are specific to certain actions in an AWS service. For example, the condition key ec2:InstanceType supports specific EC2 actions. Global conditions support all actions across all AWS services.

Now that I’ve reviewed the condition element in an IAM policy, let me introduce the new condition.

AWS:RequestedRegion condition key

The new global condition key, , supports all actions across all AWS services. You can use any string operator and specify any AWS region for its value.

Condition key Description Operator(s) Value
aws:RequestedRegion Allows you to specify the region to which the IAM principal (user or role) can make API calls All string operators (for example, StringEquals Any AWS region (for example, us-east-1)

I’ll now demonstrate the use of the new global condition key.

Example: Policy with region-level control

Let’s say a group of software developers in my organization is working on a project using Amazon EC2 and Amazon RDS. The project requires a web server running on an EC2 instance using Amazon Linux and a MySQL database instance in RDS. The developers also want to test Amazon Lambda, an event-driven platform, to retrieve data from the MySQL DB instance in RDS for future use.

My organization requires all the AWS resources to remain in the Frankfurt, eu-central-1, region. To make sure this project follows these guidelines, I create a single IAM policy for all the AWS services that this group is going to use and apply the new global condition key aws:RequestedRegion for all the services. This way I can ensure that any new EC2 instances launched or any database instances created using RDS are in Frankfurt. This policy also ensures that any Lambda functions this group creates for testing are also in the Frankfurt region.

    "Version": "2012-10-17",
    "Statement": [
            "Effect": "Allow",
            "Action": [
            "Resource": "*"
            "Effect": "Allow",
            "Action": [
            "Resource": "*",
      "Condition": {"StringEquals": {"aws:RequestedRegion": "eu-central-1"}}

            "Effect": "Allow",
            "Action": [
            "Resource": "arn:aws:iam::account-id:role/*"

The first statement in the above example contains all the read-only actions that let my developers use the console for EC2, RDS, and Lambda. The permissions for IAM-related actions are required to launch EC2 instances with a role, enable enhanced monitoring in RDS, and for AWS Lambda to assume the IAM execution role to execute the Lambda function. I’ve combined all the read-only actions into a single statement for simplicity. The second statement is where I give write access to my developers for the three services and restrict the write access to the Frankfurt region using the aws:RequestedRegion condition key. You can also list multiple AWS regions with the new condition key if your developers are allowed to create resources in multiple regions. The third statement grants permissions for the IAM action iam:PassRole required by AWS Lambda. For more information on allowing users to create a Lambda function, see Using Identity-Based Policies for AWS Lambda.


You can now use the aws:RequestedRegion global condition key in your IAM policies to specify the region to which the IAM principal (user or role) can invoke an API call. This capability makes it easier for you to restrict the AWS regions your IAM principals can use to comply with regulatory standards and improve account security. For more information about this global condition key and policy examples using aws:RequestedRegion, see the IAM documentation.

If you have comments about this post, submit them in the Comments section below. If you have questions about or suggestions for this solution, start a new thread on the IAM forum.

Want more AWS Security news? Follow us on Twitter.

Rotate Amazon RDS database credentials automatically with AWS Secrets Manager

Post Syndicated from Apurv Awasthi original https://aws.amazon.com/blogs/security/rotate-amazon-rds-database-credentials-automatically-with-aws-secrets-manager/

Recently, we launched AWS Secrets Manager, a service that makes it easier to rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle. You can configure Secrets Manager to rotate secrets automatically, which can help you meet your security and compliance needs. Secrets Manager offers built-in integrations for MySQL, PostgreSQL, and Amazon Aurora on Amazon RDS, and can rotate credentials for these databases natively. You can control access to your secrets by using fine-grained AWS Identity and Access Management (IAM) policies. To retrieve secrets, employees replace plaintext secrets with a call to Secrets Manager APIs, eliminating the need to hard-code secrets in source code or update configuration files and redeploy code when secrets are rotated.

In this post, I introduce the key features of Secrets Manager. I then show you how to store a database credential for a MySQL database hosted on Amazon RDS and how your applications can access this secret. Finally, I show you how to configure Secrets Manager to rotate this secret automatically.

Key features of Secrets Manager

These features include the ability to:

  • Rotate secrets safely. You can configure Secrets Manager to rotate secrets automatically without disrupting your applications. Secrets Manager offers built-in integrations for rotating credentials for Amazon RDS databases for MySQL, PostgreSQL, and Amazon Aurora. You can extend Secrets Manager to meet your custom rotation requirements by creating an AWS Lambda function to rotate other types of secrets. For example, you can create an AWS Lambda function to rotate OAuth tokens used in a mobile application. Users and applications retrieve the secret from Secrets Manager, eliminating the need to email secrets to developers or update and redeploy applications after AWS Secrets Manager rotates a secret.
  • Secure and manage secrets centrally. You can store, view, and manage all your secrets. By default, Secrets Manager encrypts these secrets with encryption keys that you own and control. Using fine-grained IAM policies, you can control access to secrets. For example, you can require developers to provide a second factor of authentication when they attempt to retrieve a production database credential. You can also tag secrets to help you discover, organize, and control access to secrets used throughout your organization.
  • Monitor and audit easily. Secrets Manager integrates with AWS logging and monitoring services to enable you to meet your security and compliance requirements. For example, you can audit AWS CloudTrail logs to see when Secrets Manager rotated a secret or configure AWS CloudWatch Events to alert you when an administrator deletes a secret.
  • Pay as you go. Pay for the secrets you store in Secrets Manager and for the use of these secrets; there are no long-term contracts or licensing fees.

Get started with Secrets Manager

Now that you’re familiar with the key features, I’ll show you how to store the credential for a MySQL database hosted on Amazon RDS. To demonstrate how to retrieve and use the secret, I use a python application running on Amazon EC2 that requires this database credential to access the MySQL instance. Finally, I show how to configure Secrets Manager to rotate this database credential automatically. Let’s get started.

Phase 1: Store a secret in Secrets Manager

  1. Open the Secrets Manager console and select Store a new secret.
    Secrets Manager console interface
  2. I select Credentials for RDS database because I’m storing credentials for a MySQL database hosted on Amazon RDS. For this example, I store the credentials for the database superuser. I start by securing the superuser because it’s the most powerful database credential and has full access over the database.
    Store a new secret interface with Credentials for RDS database selected

    Note: For this example, you need permissions to store secrets in Secrets Manager. To grant these permissions, you can use the AWSSecretsManagerReadWriteAccess managed policy. Read the AWS Secrets Manager Documentation for more information about the minimum IAM permissions required to store a secret.

  3. Next, I review the encryption setting and choose to use the default encryption settings. Secrets Manager will encrypt this secret using the Secrets Manager DefaultEncryptionKeyDefaultEncryptionKey in this account. Alternatively, I can choose to encrypt using a customer master key (CMK) that I have stored in AWS KMS.
    Select the encryption key interface
  4. Next, I view the list of Amazon RDS instances in my account and select the database this credential accesses. For this example, I select the DB instance mysql-rds-database, and then I select Next.
    Select the RDS database interface
  5. In this step, I specify values for Secret Name and Description. For this example, I use Applications/MyApp/MySQL-RDS-Database as the name and enter a description of this secret, and then select Next.
    Secret Name and description interface
  6. For the next step, I keep the default setting Disable automatic rotation because my secret is used by my application running on Amazon EC2. I’ll enable rotation after I’ve updated my application (see Phase 2 below) to use Secrets Manager APIs to retrieve secrets. I then select Next.

    Note: If you’re storing a secret that you’re not using in your application, select Enable automatic rotation. See our AWS Secrets Manager getting started guide on rotation for details.

    Configure automatic rotation interface

  7. Review the information on the next screen and, if everything looks correct, select Store. We’ve now successfully stored a secret in Secrets Manager.
  8. Next, I select See sample code.
    The See sample code button
  9. Take note of the code samples provided. I will use this code to update my application to retrieve the secret using Secrets Manager APIs.
    Python sample code

Phase 2: Update an application to retrieve secret from Secrets Manager

Now that I have stored the secret in Secrets Manager, I update my application to retrieve the database credential from Secrets Manager instead of hard coding this information in a configuration file or source code. For this example, I show how to configure a python application to retrieve this secret from Secrets Manager.

  1. I connect to my Amazon EC2 instance via Secure Shell (SSH).
  2. Previously, I configured my application to retrieve the database user name and password from the configuration file. Below is the source code for my application.
    import MySQLdb
    import config

    def no_secrets_manager_sample()

    # Get the user name, password, and database connection information from a config file.
    database = config.database
    user_name = config.user_name
    password = config.password

    # Use the user name, password, and database connection information to connect to the database
    db = MySQLdb.connect(database.endpoint, user_name, password, database.db_name, database.port)

  3. I use the sample code from Phase 1 above and update my application to retrieve the user name and password from Secrets Manager. This code sets up the client and retrieves and decrypts the secret Applications/MyApp/MySQL-RDS-Database. I’ve added comments to the code to make the code easier to understand.
    # Use the code snippet provided by Secrets Manager.
    import boto3
    from botocore.exceptions import ClientError

    def get_secret():
    #Define the secret you want to retrieve
    secret_name = "Applications/MyApp/MySQL-RDS-Database"
    #Define the Secrets mManager end-point your code should use.
    endpoint_url = "https://secretsmanager.us-east-1.amazonaws.com"
    region_name = "us-east-1"

    #Setup the client
    session = boto3.session.Session()
    client = session.client(

    #Use the client to retrieve the secret
    get_secret_value_response = client.get_secret_value(
    #Error handling to make it easier for your code to tolerate faults
    except ClientError as e:
    if e.response['Error']['Code'] == 'ResourceNotFoundException':
    print("The requested secret " + secret_name + " was not found")
    elif e.response['Error']['Code'] == 'InvalidRequestException':
    print("The request was invalid due to:", e)
    elif e.response['Error']['Code'] == 'InvalidParameterException':
    print("The request had invalid params:", e)
    # Decrypted secret using the associated KMS CMK
    # Depending on whether the secret was a string or binary, one of these fields will be populated
    if 'SecretString' in get_secret_value_response:
    secret = get_secret_value_response['SecretString']
    binary_secret_data = get_secret_value_response['SecretBinary']

    # Your code goes here.

  4. Applications require permissions to access Secrets Manager. My application runs on Amazon EC2 and uses an IAM role to obtain access to AWS services. I will attach the following policy to my IAM role. This policy uses the GetSecretValue action to grant my application permissions to read secret from Secrets Manager. This policy also uses the resource element to limit my application to read only the Applications/MyApp/MySQL-RDS-Database secret from Secrets Manager. You can visit the AWS Secrets Manager Documentation to understand the minimum IAM permissions required to retrieve a secret.
    "Version": "2012-10-17",
    "Statement": {
    "Sid": "RetrieveDbCredentialFromSecretsManager",
    "Effect": "Allow",
    "Action": "secretsmanager:GetSecretValue",
    "Resource": "arn:aws:secretsmanager:::secret:Applications/MyApp/MySQL-RDS-Database"

Phase 3: Enable Rotation for Your Secret

Rotating secrets periodically is a security best practice because it reduces the risk of misuse of secrets. Secrets Manager makes it easy to follow this security best practice and offers built-in integrations for rotating credentials for MySQL, PostgreSQL, and Amazon Aurora databases hosted on Amazon RDS. When you enable rotation, Secrets Manager creates a Lambda function and attaches an IAM role to this function to execute rotations on a schedule you define.

Note: Configuring rotation is a privileged action that requires several IAM permissions and you should only grant this access to trusted individuals. To grant these permissions, you can use the AWS IAMFullAccess managed policy.

Next, I show you how to configure Secrets Manager to rotate the secret Applications/MyApp/MySQL-RDS-Database automatically.

  1. From the Secrets Manager console, I go to the list of secrets and choose the secret I created in the first step Applications/MyApp/MySQL-RDS-Database.
    List of secrets in the Secrets Manager console
  2. I scroll to Rotation configuration, and then select Edit rotation.
    Rotation configuration interface
  3. To enable rotation, I select Enable automatic rotation. I then choose how frequently I want Secrets Manager to rotate this secret. For this example, I set the rotation interval to 60 days.
    Edit rotation configuration interface
  4. Next, Secrets Manager requires permissions to rotate this secret on your behalf. Because I’m storing the superuser database credential, Secrets Manager can use this credential to perform rotations. Therefore, I select Use the secret that I provided in step 1, and then select Next.
    Select which secret to use in the Edit rotation configuration interface
  5. The banner on the next screen confirms that I have successfully configured rotation and the first rotation is in progress, which enables you to verify that rotation is functioning as expected. Secrets Manager will rotate this credential automatically every 60 days.
    Confirmation banner message


I introduced AWS Secrets Manager, explained the key benefits, and showed you how to help meet your compliance requirements by configuring AWS Secrets Manager to rotate database credentials automatically on your behalf. Secrets Manager helps you protect access to your applications, services, and IT resources without the upfront investment and on-going maintenance costs of operating your own secrets management infrastructure. To get started, visit the Secrets Manager console. To learn more, visit Secrets Manager documentation.

If you have comments about this post, submit them in the Comments section below. If you have questions about anything in this post, start a new thread on the Secrets Manager forum.

Want more AWS Security news? Follow us on Twitter.

AWS Certificate Manager Launches Private Certificate Authority

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/aws-certificate-manager-launches-private-certificate-authority/

Today we’re launching a new feature for AWS Certificate Manager (ACM), Private Certificate Authority (CA). This new service allows ACM to act as a private subordinate CA. Previously, if a customer wanted to use private certificates, they needed specialized infrastructure and security expertise that could be expensive to maintain and operate. ACM Private CA builds on ACM’s existing certificate capabilities to help you easily and securely manage the lifecycle of your private certificates with pay as you go pricing. This enables developers to provision certificates in just a few simple API calls while administrators have a central CA management console and fine grained access control through granular IAM policies. ACM Private CA keys are stored securely in AWS managed hardware security modules (HSMs) that adhere to FIPS 140-2 Level 3 security standards. ACM Private CA automatically maintains certificate revocation lists (CRLs) in Amazon Simple Storage Service (S3) and lets administrators generate audit reports of certificate creation with the API or console. This service is packed full of features so let’s jump in and provision a CA.

Provisioning a Private Certificate Authority (CA)

First, I’ll navigate to the ACM console in my region and select the new Private CAs section in the sidebar. From there I’ll click Get Started to start the CA wizard. For now, I only have the option to provision a subordinate CA so we’ll select that and use my super secure desktop as the root CA and click Next. This isn’t what I would do in a production setting but it will work for testing out our private CA.

Now, I’ll configure the CA with some common details. The most important thing here is the Common Name which I’ll set as secure.internal to represent my internal domain.

Now I need to choose my key algorithm. You should choose the best algorithm for your needs but know that ACM has a limitation today that it can only manage certificates that chain up to to RSA CAs. For now, I’ll go with RSA 2048 bit and click Next.

In this next screen, I’m able to configure my certificate revocation list (CRL). CRLs are essential for notifying clients in the case that a certificate has been compromised before certificate expiration. ACM will maintain the revocation list for me and I have the option of routing my S3 bucket to a custome domain. In this case I’ll create a new S3 bucket to store my CRL in and click Next.

Finally, I’ll review all the details to make sure I didn’t make any typos and click Confirm and create.

A few seconds later and I’m greeted with a fancy screen saying I successfully provisioned a certificate authority. Hooray! I’m not done yet though. I still need to activate my CA by creating a certificate signing request (CSR) and signing that with my root CA. I’ll click Get started to begin that process.

Now I’ll copy the CSR or download it to a server or desktop that has access to my root CA (or potentially another subordinate – so long as it chains to a trusted root for my clients).

Now I can use a tool like openssl to sign my cert and generate the certificate chain.

$openssl ca -config openssl_root.cnf -extensions v3_intermediate_ca -days 3650 -notext -md sha256 -in csr/CSR.pem -out certs/subordinate_cert.pem
Using configuration from openssl_root.cnf
Enter pass phrase for /Users/randhunt/dev/amzn/ca/private/root_private_key.pem:
Check that the request matches the signature
Signature ok
The Subject's Distinguished Name is as follows
stateOrProvinceName   :ASN.1 12:'Washington'
localityName          :ASN.1 12:'Seattle'
organizationName      :ASN.1 12:'Amazon'
organizationalUnitName:ASN.1 12:'Engineering'
commonName            :ASN.1 12:'secure.internal'
Certificate is to be certified until Mar 31 06:05:30 2028 GMT (3650 days)
Sign the certificate? [y/n]:y

1 out of 1 certificate requests certified, commit? [y/n]y
Write out database with 1 new entries
Data Base Updated

After that I’ll copy my subordinate_cert.pem and certificate chain back into the console. and click Next.

Finally, I’ll review all the information and click Confirm and import. I should see a screen like the one below that shows my CA has been activated successfully.

Now that I have a private CA we can provision private certificates by hopping back to the ACM console and creating a new certificate. After clicking create a new certificate I’ll select the radio button Request a private certificate then I’ll click Request a certificate.

From there it’s just similar to provisioning a normal certificate in ACM.

Now I have a private certificate that I can bind to my ELBs, CloudFront Distributions, API Gateways, and more. I can also export the certificate for use on embedded devices or outside of ACM managed environments.

Available Now
ACM Private CA is a service in and of itself and it is packed full of features that won’t fit into a blog post. I strongly encourage the interested readers to go through the developer guide and familiarize themselves with certificate based security. ACM Private CA is available in in US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt) and EU (Ireland). Private CAs cost $400 per month (prorated) for each private CA. You are not charged for certificates created and maintained in ACM but you are charged for certificates where you have access to the private key (exported or created outside of ACM). The pricing per certificate is tiered starting at $0.75 per certificate for the first 1000 certificates and going down to $0.001 per certificate after 10,000 certificates.

I’m excited to see administrators and developers take advantage of this new service. As always please let us know what you think of this service on Twitter or in the comments below.


AWS Secrets Manager: Store, Distribute, and Rotate Credentials Securely

Post Syndicated from Randall Hunt original https://aws.amazon.com/blogs/aws/aws-secrets-manager-store-distribute-and-rotate-credentials-securely/

Today we’re launching AWS Secrets Manager which makes it easy to store and retrieve your secrets via API or the AWS Command Line Interface (CLI) and rotate your credentials with built-in or custom AWS Lambda functions. Managing application secrets like database credentials, passwords, or API Keys is easy when you’re working locally with one machine and one application. As you grow and scale to many distributed microservices, it becomes a daunting task to securely store, distribute, rotate, and consume secrets. Previously, customers needed to provision and maintain additional infrastructure solely for secrets management which could incur costs and introduce unneeded complexity into systems.

AWS Secrets Manager

Imagine that I have an application that takes incoming tweets from Twitter and stores them in an Amazon Aurora database. Previously, I would have had to request a username and password from my database administrator and embed those credentials in environment variables or, in my race to production, even in the application itself. I would also need to have our social media manager create the Twitter API credentials and figure out how to store those. This is a fairly manual process, involving multiple people, that I have to restart every time I want to rotate these credentials. With Secrets Manager my database administrator can provide the credentials in secrets manager once and subsequently rely on a Secrets Manager provided Lambda function to automatically update and rotate those credentials. My social media manager can put the Twitter API keys in Secrets Manager which I can then access with a simple API call and I can even rotate these programmatically with a custom lambda function calling out to the Twitter API. My secrets are encrypted with the KMS key of my choice, and each of these administrators can explicitly grant access to these secrets with with granular IAM policies for individual roles or users.

Let’s take a look at how I would store a secret using the AWS Secrets Manager console. First, I’ll click Store a new secret to get to the new secrets wizard. For my RDS Aurora instance it’s straightforward to simply select the instance and provide the initial username and password to connect to the database.

Next, I’ll fill in a quick description and a name to access my secret by. You can use whatever naming scheme you want here.

Next, we’ll configure rotation to use the Secrets Manager-provided Lambda function to rotate our password every 10 days.

Finally, we’ll review all the details and check out our sample code for storing and retrieving our secret!

Finally I can review the secrets in the console.

Now, if I needed to access these secrets I’d simply call the API.

import json
import boto3
secrets = boto3.client("secretsmanager")
rds = json.dumps(secrets.get_secrets_value("prod/TwitterApp/Database")['SecretString'])

Which would give me the following values:

{'engine': 'mysql',
 'host': 'twitterapp2.abcdefg.us-east-1.rds.amazonaws.com',
 'password': '-)Kw>THISISAFAKEPASSWORD:lg{&sad+Canr',
 'port': 3306,
 'username': 'ranman'}

More than passwords

AWS Secrets Manager works for more than just passwords. I can store OAuth credentials, binary data, and more. Let’s look at storing my Twitter OAuth application keys.

Now, I can define the rotation for these third-party OAuth credentials with a custom AWS Lambda function that can call out to Twitter whenever we need to rotate our credentials.

Custom Rotation

One of the niftiest features of AWS Secrets Manager is custom AWS Lambda functions for credential rotation. This allows you to define completely custom workflows for credentials. Secrets Manager will call your lambda with a payload that includes a Step which specifies which step of the rotation you’re in, a SecretId which specifies which secret the rotation is for, and importantly a ClientRequestToken which is used to ensure idempotency in any changes to the underlying secret.

When you’re rotating secrets you go through a few different steps:

  1. createSecret
  2. setSecret
  3. testSecret
  4. finishSecret

The advantage of these steps is that you can add any kind of approval steps you want for each phase of the rotation. For more details on custom rotation check out the documentation.

Available Now
AWS Secrets Manager is available today in US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), EU (Frankfurt), EU (Ireland), EU (London), and South America (São Paulo). Secrets are priced at $0.40 per month per secret and $0.05 per 10,000 API calls. I’m looking forward to seeing more users adopt rotating credentials to secure their applications!


Tag Amazon EBS Snapshots on Creation and Implement Stronger Security Policies

Post Syndicated from Woo Kim original https://aws.amazon.com/blogs/compute/tag-amazon-ebs-snapshots-on-creation-and-implement-stronger-security-policies/

This blog was contributed by Rucha Nene, Sr. Product Manager for Amazon EBS

AWS customers use tags to track ownership of resources, implement compliance protocols, control access to resources via IAM policies, and drive their cost accounting processes. Last year, we made tagging for Amazon EC2 instances and Amazon EBS volumes easier by adding the ability to tag these resources upon creation. We are now extending this capability to EBS snapshots.

Earlier, you could tag your EBS snapshots only after the resource had been created and sometimes, ended up with EBS snapshots in an untagged state if tagging failed. You also could not control the actions that users and groups could take over specific snapshots, or enforce tighter security policies.

To address these issues, we are making tagging for EBS snapshots more flexible and giving customers more control over EBS snapshots by introducing two new capabilities:

  • Tag on creation for EBS snapshots – You can now specify tags for EBS snapshots as part of the API call that creates the resource or via the Amazon EC2 Console when creating an EBS snapshot.
  • Resource-level permission and enforced tag usage – The CreateSnapshot, DeleteSnapshot, and ModifySnapshotAttrribute API actions now support IAM resource-level permissions. You can now write IAM policies that mandate the use of specific tags when taking actions on EBS snapshots.

Tag on creation

You can now specify tags for EBS snapshots as part of the API call that creates the resources. The resource creation and the tagging are performed atomically; both must succeed in order for the operation CreateSnapshot to succeed. You no longer need to build tagging scripts that run after EBS snapshots have been created.

Here’s how you specify tags when you create an EBS snapshot, using the console:

  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.
  2. In the navigation pane, choose Snapshots, Create Snapshot.
  3. On the Create Snapshot page, select the volume for which to create a snapshot.
  4. (Optional) Choose Add tags to your snapshot. For each tag, provide a tag key and a tag value.
  5. Choose Create Snapshot.

Using the AWS CLI:

aws ec2 create-snapshot --volume-id vol-0c0e757e277111f3c --description 'Prod_Backup' --tag-specifications 

To learn more, see Using Tags.

Resource-level permissions and enforced tag usage

CreateSnapshot, DeleteSnapshot, and ModifySnapshotAttribute now support resource-level permissions, which allow you to exercise more control over EBS snapshots. You can write IAM policies that give you precise control over access to resources and let you specify which users are able to create snapshots for a given set of volumes. You can also enforce the use of specific tags to help track resources and achieve more accurate cost allocation reporting.

For example, here’s a statement that requires that the costcenter tag (with a value of “115”) be present on the volume from which snapshots are being created. It requires that this tag be applied to all newly created snapshots. In addition, it requires that the created snapshots are tagged with User:username for the customer.

	   "Condition": {

To implement stronger compliance and security policies, you could also restrict access to DeleteSnapshot, if the resource is not tagged with the user’s name. Here’s a statement that allows the deletion of a snapshot only if the snapshot is tagged with User:username for the customer.


To learn more and to see some sample policies, see IAM Policies for Amazon EC2 and Working with Snapshots.

Available Now

These new features are available now in all AWS Regions. You can start using it today from the Amazon EC2 Console, AWS Command Line Interface (CLI), or the AWS APIs.

Now Available – AWS Serverless Application Repository

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-available-aws-serverless-application-repository/

Last year I suggested that you Get Ready for the AWS Serverless Application Repository and gave you a sneak peek. The Repository is designed to make it as easy as possible for you to discover, configure, and deploy serverless applications and components on AWS. It is also an ideal venue for AWS partners, enterprise customers, and independent developers to share their serverless creations.

Now Available
After a well-received public preview, the AWS Serverless Application Repository is now generally available and you can start using it today!

As a consumer, you will be able to tap in to a thriving ecosystem of serverless applications and components that will be a perfect complement to your machine learning, image processing, IoT, and general-purpose work. You can configure and consume them as-is, or you can take them apart, add features, and submit pull requests to the author.

As a publisher, you can publish your contribution in the Serverless Application Repository with ease. You simply enter a name and a description, choose some labels to increase discoverability, select an appropriate open source license from a menu, and supply a README to help users get started. Then you enter a link to your existing source code repo, choose a SAM template, and designate a semantic version.

Let’s take a look at both operations…

Consuming a Serverless Application
The Serverless Application Repository is accessible from the Lambda Console. I can page through the existing applications or I can initiate a search:

A search for “todo” returns some interesting results:

I simply click on an application to learn more:

I can configure the application and deploy it right away if I am already familiar with the application:

I can expand each of the sections to learn more. The Permissions section tells me which IAM policies will be used:

And the Template section displays the SAM template that will be used to deploy the application:

I can inspect the template to learn more about the AWS resources that will be created when the template is deployed. I can also use the templates as a learning resource in preparation for creating and publishing my own application.

The License section displays the application’s license:

To deploy todo, I name the application and click Deploy:

Deployment starts immediately and is done within a minute (application deployment time will vary, depending on the number and type of resources to be created):

I can see all of my deployed applications in the Lambda Console:

There’s currently no way for a SAM template to indicate that an API Gateway function returns binary media types, so I set this up by hand and then re-deploy the API:

Following the directions in the Readme, I open the API Gateway Console and find the URL for the app in the API Gateway Dashboard:

I visit the URL and enter some items into my list:

Publishing a Serverless Application
Publishing applications is a breeze! I visit the Serverless App Repository page and click on Publish application to get started:

Then I assign a name to my application, enter my own name, and so forth:

I can choose from a long list of open-source friendly SPDX licenses:

I can create an initial version of my application at this point, or I can do it later. Either way, I simply provide a version number, a URL to a public repository containing my code, and a SAM template:

Available Now
The AWS Serverless Application Repository is available now and you can start using it today, paying only for the AWS resources consumed by the serverless applications that you deploy.

You can deploy applications in the US East (Ohio), US East (N. Virginia), US West (N. California), US West (Oregon), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Mumbai), Asia Pacific (Singapore), Asia Pacific (Sydney), Canada (Central), EU (Frankfurt), EU (Ireland), EU (London), and South America (São Paulo) Regions. You can publish from the US East (N. Virginia) or US East (Ohio) Regions for global availability.



Sharing Secrets with AWS Lambda Using AWS Systems Manager Parameter Store

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/sharing-secrets-with-aws-lambda-using-aws-systems-manager-parameter-store/

This post courtesy of Roberto Iturralde, Sr. Application Developer- AWS Professional Services

Application architects are faced with key decisions throughout the process of designing and implementing their systems. One decision common to nearly all solutions is how to manage the storage and access rights of application configuration. Shared configuration should be stored centrally and securely with each system component having access only to the properties that it needs for functioning.

With AWS Systems Manager Parameter Store, developers have access to central, secure, durable, and highly available storage for application configuration and secrets. Parameter Store also integrates with AWS Identity and Access Management (IAM), allowing fine-grained access control to individual parameters or branches of a hierarchical tree.

This post demonstrates how to create and access shared configurations in Parameter Store from AWS Lambda. Both encrypted and plaintext parameter values are stored with only the Lambda function having permissions to decrypt the secrets. You also use AWS X-Ray to profile the function.

Solution overview

This example is made up of the following components:

  • An AWS SAM template that defines:
    • A Lambda function and its permissions
    • An unencrypted Parameter Store parameter that the Lambda function loads
    • A KMS key that only the Lambda function can access. You use this key to create an encrypted parameter later.
  • Lambda function code in Python 3.6 that demonstrates how to load values from Parameter Store at function initialization for reuse across invocations.

Launch the AWS SAM template

To create the resources shown in this post, you can download the SAM template or choose the button to launch the stack. The template requires one parameter, an IAM user name, which is the name of the IAM user to be the admin of the KMS key that you create. In order to perform the steps listed in this post, this IAM user will need permissions to execute Lambda functions, create Parameter Store parameters, administer keys in KMS, and view the X-Ray console. If you have these privileges in your IAM user account you can use your own account to complete the walkthrough. You can not use the root user to administer the KMS keys.

SAM template resources

The following sections show the code for the resources defined in the template.
Lambda function

    Type: 'AWS::Serverless::Function'
      FunctionName: 'ParameterStoreBlogFunctionDev'
      Description: 'Integrating lambda with Parameter Store'
      Handler: 'lambda_function.lambda_handler'
      Role: !GetAtt ParameterStoreBlogFunctionRoleDev.Arn
      CodeUri: './code'
          ENV: 'dev'
          APP_CONFIG_PATH: 'parameterStoreBlog'
          AWS_XRAY_TRACING_NAME: 'ParameterStoreBlogFunctionDev'
      Runtime: 'python3.6'
      Timeout: 5
      Tracing: 'Active'

    Type: AWS::IAM::Role
        Version: '2012-10-17'
            Effect: Allow
                - 'lambda.amazonaws.com'
              - 'sts:AssumeRole'
        - 'arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole'
          PolicyName: 'ParameterStoreBlogDevParameterAccess'
            Version: '2012-10-17'
                Effect: Allow
                  - 'ssm:GetParameter*'
                Resource: !Sub 'arn:aws:ssm:${AWS::Region}:${AWS::AccountId}:parameter/dev/parameterStoreBlog*'
          PolicyName: 'ParameterStoreBlogDevXRayAccess'
            Version: '2012-10-17'
                Effect: Allow
                  - 'xray:PutTraceSegments'
                  - 'xray:PutTelemetryRecords'
                Resource: '*'

In this YAML code, you define a Lambda function named ParameterStoreBlogFunctionDev using the SAM AWS::Serverless::Function type. The environment variables for this function include the ENV (dev) and the APP_CONFIG_PATH where you find the configuration for this app in Parameter Store. X-Ray tracing is also enabled for profiling later.

The IAM role for this function extends the AWSLambdaBasicExecutionRole by adding IAM policies that grant the function permissions to write to X-Ray and get parameters from Parameter Store, limited to paths under /dev/parameterStoreBlog*.
Parameter Store parameter

    Type: AWS::SSM::Parameter
      Name: '/dev/parameterStoreBlog/appConfig'
      Description: 'Sample dev config values for my app'
      Type: String
      Value: '{"key1": "value1","key2": "value2","key3": "value3"}'

This YAML code creates a plaintext string parameter in Parameter Store in a path that your Lambda function can access.
KMS encryption key

    Type: AWS::KMS::Alias
      AliasName: 'alias/ParameterStoreBlogKeyDev'
      TargetKeyId: !Ref ParameterStoreBlogDevEncryptionKey

    Type: AWS::KMS::Key
      Description: 'Encryption key for secret config values for the Parameter Store blog post'
      Enabled: True
      EnableKeyRotation: False
        Version: '2012-10-17'
        Id: 'key-default-1'
            Sid: 'Allow administration of the key & encryption of new values'
            Effect: Allow
                - !Sub 'arn:aws:iam::${AWS::AccountId}:user/${IAMUsername}'
              - 'kms:Create*'
              - 'kms:Encrypt'
              - 'kms:Describe*'
              - 'kms:Enable*'
              - 'kms:List*'
              - 'kms:Put*'
              - 'kms:Update*'
              - 'kms:Revoke*'
              - 'kms:Disable*'
              - 'kms:Get*'
              - 'kms:Delete*'
              - 'kms:ScheduleKeyDeletion'
              - 'kms:CancelKeyDeletion'
            Resource: '*'
            Sid: 'Allow use of the key'
            Effect: Allow
              AWS: !GetAtt ParameterStoreBlogFunctionRoleDev.Arn
              - 'kms:Encrypt'
              - 'kms:Decrypt'
              - 'kms:ReEncrypt*'
              - 'kms:GenerateDataKey*'
              - 'kms:DescribeKey'
            Resource: '*'

This YAML code creates an encryption key with a key policy with two statements.

The first statement allows a given user (${IAMUsername}) to administer the key. Importantly, this includes the ability to encrypt values using this key and disable or delete this key, but does not allow the administrator to decrypt values that were encrypted with this key.

The second statement grants your Lambda function permission to encrypt and decrypt values using this key. The alias for this key in KMS is ParameterStoreBlogKeyDev, which is how you reference it later.

Lambda function

Here I walk you through the Lambda function code.

import os, traceback, json, configparser, boto3
from aws_xray_sdk.core import patch_all

# Initialize boto3 client at global scope for connection reuse
client = boto3.client('ssm')
env = os.environ['ENV']
app_config_path = os.environ['APP_CONFIG_PATH']
full_config_path = '/' + env + '/' + app_config_path
# Initialize app at global scope for reuse across invocations
app = None

class MyApp:
    def __init__(self, config):
        Construct new MyApp with configuration
        :param config: application configuration
        self.config = config

    def get_config(self):
        return self.config

def load_config(ssm_parameter_path):
    Load configparser from config stored in SSM Parameter Store
    :param ssm_parameter_path: Path to app config in SSM Parameter Store
    :return: ConfigParser holding loaded config
    configuration = configparser.ConfigParser()
        # Get all parameters for this app
        param_details = client.get_parameters_by_path(

        # Loop through the returned parameters and populate the ConfigParser
        if 'Parameters' in param_details and len(param_details.get('Parameters')) > 0:
            for param in param_details.get('Parameters'):
                param_path_array = param.get('Name').split("/")
                section_position = len(param_path_array) - 1
                section_name = param_path_array[section_position]
                config_values = json.loads(param.get('Value'))
                config_dict = {section_name: config_values}
                print("Found configuration: " + str(config_dict))

        print("Encountered an error loading config from SSM.")
        return configuration

def lambda_handler(event, context):
    global app
    # Initialize app if it doesn't yet exist
    if app is None:
        print("Loading config and creating new MyApp...")
        config = load_config(full_config_path)
        app = MyApp(config)

    return "MyApp config is " + str(app.get_config()._sections)

Beneath the import statements, you import the patch_all function from the AWS X-Ray library, which you use to patch boto3 to create X-Ray segments for all your boto3 operations.

Next, you create a boto3 SSM client at the global scope for reuse across function invocations, following Lambda best practices. Using the function environment variables, you assemble the path where you expect to find your configuration in Parameter Store. The class MyApp is meant to serve as an example of an application that would need its configuration injected at construction. In this example, you create an instance of ConfigParser, a class in Python’s standard library for handling basic configurations, to give to MyApp.

The load_config function loads the all the parameters from Parameter Store at the level immediately beneath the path provided in the Lambda function environment variables. Each parameter found is put into a new section in ConfigParser. The name of the section is the name of the parameter, less the base path. In this example, the full parameter name is /dev/parameterStoreBlog/appConfig, which is put in a section named appConfig.

Finally, the lambda_handler function initializes an instance of MyApp if it doesn’t already exist, constructing it with the loaded configuration from Parameter Store. Then it simply returns the currently loaded configuration in MyApp. The impact of this design is that the configuration is only loaded from Parameter Store the first time that the Lambda function execution environment is initialized. Subsequent invocations reuse the existing instance of MyApp, resulting in improved performance. You see this in the X-Ray traces later in this post. For more advanced use cases where configuration changes need to be received immediately, you could implement an expiry policy for your configuration entries or push notifications to your function.

To confirm that everything was created successfully, test the function in the Lambda console.

  1. Open the Lambda console.
  2. In the navigation pane, choose Functions.
  3. In the Functions pane, filter to ParameterStoreBlogFunctionDev to find the function created by the SAM template earlier. Open the function name to view its details.
  4. On the top right of the function detail page, choose Test. You may need to create a new test event. The input JSON doesn’t matter as this function ignores the input.

After running the test, you should see output similar to the following. This demonstrates that the function successfully fetched the unencrypted configuration from Parameter Store.

Create an encrypted parameter

You currently have a simple, unencrypted parameter and a Lambda function that can access it.

Next, you create an encrypted parameter that only your Lambda function has permission to use for decryption. This limits read access for this parameter to only this Lambda function.

To follow along with this section, deploy the SAM template for this post in your account and make your IAM user name the KMS key admin mentioned earlier.

  1. In the Systems Manager console, under Shared Resources, choose Parameter Store.
  2. Choose Create Parameter.
    • For Name, enter /dev/parameterStoreBlog/appSecrets.
    • For Type, select Secure String.
    • For KMS Key ID, choose alias/ParameterStoreBlogKeyDev, which is the key that your SAM template created.
    • For Value, enter {"secretKey": "secretValue"}.
    • Choose Create Parameter.
  3. If you now try to view the value of this parameter by choosing the name of the parameter in the parameters list and then choosing Show next to the Value field, you won’t see the value appear. This is because, even though you have permission to encrypt values using this KMS key, you do not have permissions to decrypt values.
  4. In the Lambda console, run another test of your function. You now also see the secret parameter that you created and its decrypted value.

If you do not see the new parameter in the Lambda output, this may be because the Lambda execution environment is still warm from the previous test. Because the parameters are loaded at Lambda startup, you need a fresh execution environment to refresh the values.

Adjust the function timeout to a different value in the Advanced Settings at the bottom of the Lambda Configuration tab. Choose Save and test to trigger the creation of a new Lambda execution environment.

Profiling the impact of querying Parameter Store using AWS X-Ray

By using the AWS X-Ray SDK to patch boto3 in your Lambda function code, each invocation of the function creates traces in X-Ray. In this example, you can use these traces to validate the performance impact of your design decision to only load configuration from Parameter Store on the first invocation of the function in a new execution environment.

From the Lambda function details page where you tested the function earlier, under the function name, choose Monitoring. Choose View traces in X-Ray.

This opens the X-Ray console in a new window filtered to your function. Be aware of the time range field next to the search bar if you don’t see any search results.
In this screenshot, I’ve invoked the Lambda function twice, one time 10.3 minutes ago with a response time of 1.1 seconds and again 9.8 minutes ago with a response time of 8 milliseconds.

Looking at the details of the longer running trace by clicking the trace ID, you can see that the Lambda function spent the first ~350 ms of the full 1.1 sec routing the request through Lambda and creating a new execution environment for this function, as this was the first invocation with this code. This is the portion of time before the initialization subsegment.

Next, it took 725 ms to initialize the function, which includes executing the code at the global scope (including creating the boto3 client). This is also a one-time cost for a fresh execution environment.

Finally, the function executed for 65 ms, of which 63.5 ms was the GetParametersByPath call to Parameter Store.

Looking at the trace for the second, much faster function invocation, you see that the majority of the 8 ms execution time was Lambda routing the request to the function and returning the response. Only 1 ms of the overall execution time was attributed to the execution of the function, which makes sense given that after the first invocation you’re simply returning the config stored in MyApp.

While the Traces screen allows you to view the details of individual traces, the X-Ray Service Map screen allows you to view aggregate performance data for all traced services over a period of time.

In the X-Ray console navigation pane, choose Service map. Selecting a service node shows the metrics for node-specific requests. Selecting an edge between two nodes shows the metrics for requests that traveled that connection. Again, be aware of the time range field next to the search bar if you don’t see any search results.

After invoking your Lambda function several more times by testing it from the Lambda console, you can view some aggregate performance metrics. Look at the following:

  • From the client perspective, requests to the Lambda service for the function are taking an average of 50 ms to respond. The function is generating ~1 trace per minute.
  • The function itself is responding in an average of 3 ms. In the following screenshot, I’ve clicked on this node, which reveals a latency histogram of the traced requests showing that over 95% of requests return in under 5 ms.
  • Parameter Store is responding to requests in an average of 64 ms, but note the much lower trace rate in the node. This is because you only fetch data from Parameter Store on the initialization of the Lambda execution environment.


Deduplication, encryption, and restricted access to shared configuration and secrets is a key component to any mature architecture. Serverless architectures designed using event-driven, on-demand, compute services like Lambda are no different.

In this post, I walked you through a sample application accessing unencrypted and encrypted values in Parameter Store. These values were created in a hierarchy by application environment and component name, with the permissions to decrypt secret values restricted to only the function needing access. The techniques used here can become the foundation of secure, robust configuration management in your enterprise serverless applications.

Combine Transactional and Analytical Data Using Amazon Aurora and Amazon Redshift

Post Syndicated from Re Alvarez-Parmar original https://aws.amazon.com/blogs/big-data/combine-transactional-and-analytical-data-using-amazon-aurora-and-amazon-redshift/

A few months ago, we published a blog post about capturing data changes in an Amazon Aurora database and sending it to Amazon Athena and Amazon QuickSight for fast analysis and visualization. In this post, I want to demonstrate how easy it can be to take the data in Aurora and combine it with data in Amazon Redshift using Amazon Redshift Spectrum.

With Amazon Redshift, you can build petabyte-scale data warehouses that unify data from a variety of internal and external sources. Because Amazon Redshift is optimized for complex queries (often involving multiple joins) across large tables, it can handle large volumes of retail, inventory, and financial data without breaking a sweat.

In this post, we describe how to combine data in Aurora in Amazon Redshift. Here’s an overview of the solution:

  • Use AWS Lambda functions with Amazon Aurora to capture data changes in a table.
  • Save data in an Amazon S3
  • Query data using Amazon Redshift Spectrum.

We use the following services:

Serverless architecture for capturing and analyzing Aurora data changes

Consider a scenario in which an e-commerce web application uses Amazon Aurora for a transactional database layer. The company has a sales table that captures every single sale, along with a few corresponding data items. This information is stored as immutable data in a table. Business users want to monitor the sales data and then analyze and visualize it.

In this example, you take the changes in data in an Aurora database table and save it in Amazon S3. After the data is captured in Amazon S3, you combine it with data in your existing Amazon Redshift cluster for analysis.

By the end of this post, you will understand how to capture data events in an Aurora table and push them out to other AWS services using AWS Lambda.

The following diagram shows the flow of data as it occurs in this tutorial:

The starting point in this architecture is a database insert operation in Amazon Aurora. When the insert statement is executed, a custom trigger calls a Lambda function and forwards the inserted data. Lambda writes the data that it received from Amazon Aurora to a Kinesis data delivery stream. Kinesis Data Firehose writes the data to an Amazon S3 bucket. Once the data is in an Amazon S3 bucket, it is queried in place using Amazon Redshift Spectrum.

Creating an Aurora database

First, create a database by following these steps in the Amazon RDS console:

  1. Sign in to the AWS Management Console, and open the Amazon RDS console.
  2. Choose Launch a DB instance, and choose Next.
  3. For Engine, choose Amazon Aurora.
  4. Choose a DB instance class. This example uses a small, since this is not a production database.
  5. In Multi-AZ deployment, choose No.
  6. Configure DB instance identifier, Master username, and Master password.
  7. Launch the DB instance.

After you create the database, use MySQL Workbench to connect to the database using the CNAME from the console. For information about connecting to an Aurora database, see Connecting to an Amazon Aurora DB Cluster.

The following screenshot shows the MySQL Workbench configuration:

Next, create a table in the database by running the following SQL statement:

Create Table
ItemID int NOT NULL,
Category varchar(255),
Price double(10,2), 
Quantity int not NULL,
OrderDate timestamp,
DestinationState varchar(2),
ShippingType varchar(255),
Referral varchar(255),

You can now populate the table with some sample data. To generate sample data in your table, copy and run the following script. Ensure that the highlighted (bold) variables are replaced with appropriate values.

import MySQLdb
import random
import datetime

db = MySQLdb.connect(host="AURORA_CNAME",

states = ("AL","AK","AZ","AR","CA","CO","CT","DE","FL","GA","HI","ID","IL","IN",

shipping_types = ("Free", "3-Day", "2-Day")

product_categories = ("Garden", "Kitchen", "Office", "Household")
referrals = ("Other", "Friend/Colleague", "Repeat Customer", "Online Ad")

for i in range(0,10):
    item_id = random.randint(1,100)
    state = states[random.randint(0,len(states)-1)]
    shipping_type = shipping_types[random.randint(0,len(shipping_types)-1)]
    product_category = product_categories[random.randint(0,len(product_categories)-1)]
    quantity = random.randint(1,4)
    referral = referrals[random.randint(0,len(referrals)-1)]
    price = random.randint(1,100)
    order_date = datetime.date(2016,random.randint(1,12),random.randint(1,30)).isoformat()

    data_order = (item_id, product_category, price, quantity, order_date, state,
    shipping_type, referral)

    add_order = ("INSERT INTO Sales "
                   "(ItemID, Category, Price, Quantity, OrderDate, DestinationState, \
                   ShippingType, Referral) "
                   "VALUES (%s, %s, %s, %s, %s, %s, %s, %s)")

    cursor = db.cursor()
    cursor.execute(add_order, data_order)



The following screenshot shows how the table appears with the sample data:

Sending data from Amazon Aurora to Amazon S3

There are two methods available to send data from Amazon Aurora to Amazon S3:

  • Using a Lambda function

To demonstrate the ease of setting up integration between multiple AWS services, we use a Lambda function to send data to Amazon S3 using Amazon Kinesis Data Firehose.

Alternatively, you can use a SELECT INTO OUTFILE S3 statement to query data from an Amazon Aurora DB cluster and save it directly in text files that are stored in an Amazon S3 bucket. However, with this method, there is a delay between the time that the database transaction occurs and the time that the data is exported to Amazon S3 because the default file size threshold is 6 GB.

Creating a Kinesis data delivery stream

The next step is to create a Kinesis data delivery stream, since it’s a dependency of the Lambda function.

To create a delivery stream:

  1. Open the Kinesis Data Firehose console
  2. Choose Create delivery stream.
  3. For Delivery stream name, type AuroraChangesToS3.
  4. For Source, choose Direct PUT.
  5. For Record transformation, choose Disabled.
  6. For Destination, choose Amazon S3.
  7. In the S3 bucket drop-down list, choose an existing bucket, or create a new one.
  8. Enter a prefix if needed, and choose Next.
  9. For Data compression, choose GZIP.
  10. In IAM role, choose either an existing role that has access to write to Amazon S3, or choose to generate one automatically. Choose Next.
  11. Review all the details on the screen, and choose Create delivery stream when you’re finished.


Creating a Lambda function

Now you can create a Lambda function that is called every time there is a change that needs to be tracked in the database table. This Lambda function passes the data to the Kinesis data delivery stream that you created earlier.

To create the Lambda function:

  1. Open the AWS Lambda console.
  2. Ensure that you are in the AWS Region where your Amazon Aurora database is located.
  3. If you have no Lambda functions yet, choose Get started now. Otherwise, choose Create function.
  4. Choose Author from scratch.
  5. Give your function a name and select Python 3.6 for Runtime
  6. Choose and existing or create a new Role, the role would need to have access to call firehose:PutRecord
  7. Choose Next on the trigger selection screen.
  8. Paste the following code in the code window. Change the stream_name variable to the Kinesis data delivery stream that you created in the previous step.
  9. Choose File -> Save in the code editor and then choose Save.
import boto3
import json

firehose = boto3.client('firehose')
stream_name = ‘AuroraChangesToS3’

def Kinesis_publish_message(event, context):
    firehose_data = (("%s,%s,%s,%s,%s,%s,%s,%s\n") %(event['ItemID'], 
    event['Category'], event['Price'], event['Quantity'],
    event['OrderDate'], event['DestinationState'], event['ShippingType'], 
    firehose_data = {'Data': str(firehose_data)}

Note the Amazon Resource Name (ARN) of this Lambda function.

Giving Aurora permissions to invoke a Lambda function

To give Amazon Aurora permissions to invoke a Lambda function, you must attach an IAM role with appropriate permissions to the cluster. For more information, see Invoking a Lambda Function from an Amazon Aurora DB Cluster.

Once you are finished, the Amazon Aurora database has access to invoke a Lambda function.

Creating a stored procedure and a trigger in Amazon Aurora

Now, go back to MySQL Workbench, and run the following command to create a new stored procedure. When this stored procedure is called, it invokes the Lambda function you created. Change the ARN in the following code to your Lambda function’s ARN.

									IN Category varchar(255), 
									IN Price double(10,2),
                                    IN Quantity int(11),
                                    IN OrderDate timestamp,
                                    IN DestinationState varchar(2),
                                    IN ShippingType varchar(255),
                                    IN Referral  varchar(255)) LANGUAGE SQL 
  CALL mysql.lambda_async('arn:aws:lambda:us-east-1:XXXXXXXXXXXXX:function:CDCFromAuroraToKinesis', 
     CONCAT('{ "ItemID" : "', ItemID, 
            '", "Category" : "', Category,
            '", "Price" : "', Price,
            '", "Quantity" : "', Quantity, 
            '", "OrderDate" : "', OrderDate, 
            '", "DestinationState" : "', DestinationState, 
            '", "ShippingType" : "', ShippingType, 
            '", "Referral" : "', Referral, '"}')

Create a trigger TR_Sales_CDC on the Sales table. When a new record is inserted, this trigger calls the CDC_TO_FIREHOSE stored procedure.

  SELECT  NEW.ItemID , NEW.Category, New.Price, New.Quantity, New.OrderDate
  , New.DestinationState, New.ShippingType, New.Referral
  INTO @ItemID , @Category, @Price, @Quantity, @OrderDate
  , @DestinationState, @ShippingType, @Referral;
  CALL  CDC_TO_FIREHOSE(@ItemID , @Category, @Price, @Quantity, @OrderDate
  , @DestinationState, @ShippingType, @Referral);

If a new row is inserted in the Sales table, the Lambda function that is mentioned in the stored procedure is invoked.

Verify that data is being sent from the Lambda function to Kinesis Data Firehose to Amazon S3 successfully. You might have to insert a few records, depending on the size of your data, before new records appear in Amazon S3. This is due to Kinesis Data Firehose buffering. To learn more about Kinesis Data Firehose buffering, see the “Amazon S3” section in Amazon Kinesis Data Firehose Data Delivery.

Every time a new record is inserted in the sales table, a stored procedure is called, and it updates data in Amazon S3.

Querying data in Amazon Redshift

In this section, you use the data you produced from Amazon Aurora and consume it as-is in Amazon Redshift. In order to allow you to process your data as-is, where it is, while taking advantage of the power and flexibility of Amazon Redshift, you use Amazon Redshift Spectrum. You can use Redshift Spectrum to run complex queries on data stored in Amazon S3, with no need for loading or other data prep.

Just create a data source and issue your queries to your Amazon Redshift cluster as usual. Behind the scenes, Redshift Spectrum scales to thousands of instances on a per-query basis, ensuring that you get fast, consistent performance even as your dataset grows to beyond an exabyte! Being able to query data that is stored in Amazon S3 means that you can scale your compute and your storage independently. You have the full power of the Amazon Redshift query model and all the reporting and business intelligence tools at your disposal. Your queries can reference any combination of data stored in Amazon Redshift tables and in Amazon S3.

Redshift Spectrum supports open, common data types, including CSV/TSV, Apache Parquet, SequenceFile, and RCFile. Files can be compressed using gzip or Snappy, with other data types and compression methods in the works.

First, create an Amazon Redshift cluster. Follow the steps in Launch a Sample Amazon Redshift Cluster.

Next, create an IAM role that has access to Amazon S3 and Athena. By default, Amazon Redshift Spectrum uses the Amazon Athena data catalog. Your cluster needs authorization to access your external data catalog in AWS Glue or Athena and your data files in Amazon S3.

In the demo setup, I attached AmazonS3FullAccess and AmazonAthenaFullAccess. In a production environment, the IAM roles should follow the standard security of granting least privilege. For more information, see IAM Policies for Amazon Redshift Spectrum.

Attach the newly created role to the Amazon Redshift cluster. For more information, see Associate the IAM Role with Your Cluster.

Next, connect to the Amazon Redshift cluster, and create an external schema and database:

create external schema if not exists spectrum_schema
from data catalog 
database 'spectrum_db' 
region 'us-east-1'
IAM_ROLE 'arn:aws:iam::XXXXXXXXXXXX:role/RedshiftSpectrumRole'
create external database if not exists;

Don’t forget to replace the IAM role in the statement.

Then create an external table within the database:

 CREATE EXTERNAL TABLE IF NOT EXISTS spectrum_schema.ecommerce_sales(
  ItemID int,
  Category varchar,
  Quantity int,
  OrderDate TIMESTAMP,
  DestinationState varchar,
  ShippingType varchar,
  Referral varchar)

Query the table, and it should contain data. This is a fact table.

select top 10 * from spectrum_schema.ecommerce_sales


Next, create a dimension table. For this example, we create a date/time dimension table. Create the table:

CREATE TABLE date_dimension (
  d_datekey           integer       not null sortkey,
  d_dayofmonth        integer       not null,
  d_monthnum          integer       not null,
  d_dayofweek                varchar(10)   not null,
  d_prettydate        date       not null,
  d_quarter           integer       not null,
  d_half              integer       not null,
  d_year              integer       not null,
  d_season            varchar(10)   not null,
  d_fiscalyear        integer       not null)
diststyle all;

Populate the table with data:

copy date_dimension from 's3://reparmar-lab/2016dates' 
iam_role 'arn:aws:iam::XXXXXXXXXXXX:role/redshiftspectrum'
dateformat 'auto';

The date dimension table should look like the following:

Querying data in local and external tables using Amazon Redshift

Now that you have the fact and dimension table populated with data, you can combine the two and run analysis. For example, if you want to query the total sales amount by weekday, you can run the following:

select sum(quantity*price) as total_sales, date_dimension.d_season
from spectrum_schema.ecommerce_sales 
join date_dimension on spectrum_schema.ecommerce_sales.orderdate = date_dimension.d_prettydate 
group by date_dimension.d_season

You get the following results:

Similarly, you can replace d_season with d_dayofweek to get sales figures by weekday:

With Amazon Redshift Spectrum, you pay only for the queries you run against the data that you actually scan. We encourage you to use file partitioning, columnar data formats, and data compression to significantly minimize the amount of data scanned in Amazon S3. This is important for data warehousing because it dramatically improves query performance and reduces cost.

Partitioning your data in Amazon S3 by date, time, or any other custom keys enables Amazon Redshift Spectrum to dynamically prune nonrelevant partitions to minimize the amount of data processed. If you store data in a columnar format, such as Parquet, Amazon Redshift Spectrum scans only the columns needed by your query, rather than processing entire rows. Similarly, if you compress your data using one of the supported compression algorithms in Amazon Redshift Spectrum, less data is scanned.

Analyzing and visualizing Amazon Redshift data in Amazon QuickSight

Modify the Amazon Redshift security group to allow an Amazon QuickSight connection. For more information, see Authorizing Connections from Amazon QuickSight to Amazon Redshift Clusters.

After modifying the Amazon Redshift security group, go to Amazon QuickSight. Create a new analysis, and choose Amazon Redshift as the data source.

Enter the database connection details, validate the connection, and create the data source.

Choose the schema to be analyzed. In this case, choose spectrum_schema, and then choose the ecommerce_sales table.

Next, we add a custom field for Total Sales = Price*Quantity. In the drop-down list for the ecommerce_sales table, choose Edit analysis data sets.

On the next screen, choose Edit.

In the data prep screen, choose New Field. Add a new calculated field Total Sales $, which is the product of the Price*Quantity fields. Then choose Create. Save and visualize it.

Next, to visualize total sales figures by month, create a graph with Total Sales on the x-axis and Order Data formatted as month on the y-axis.

After you’ve finished, you can use Amazon QuickSight to add different columns from your Amazon Redshift tables and perform different types of visualizations. You can build operational dashboards that continuously monitor your transactional and analytical data. You can publish these dashboards and share them with others.

Final notes

Amazon QuickSight can also read data in Amazon S3 directly. However, with the method demonstrated in this post, you have the option to manipulate, filter, and combine data from multiple sources or Amazon Redshift tables before visualizing it in Amazon QuickSight.

In this example, we dealt with data being inserted, but triggers can be activated in response to an INSERT, UPDATE, or DELETE trigger.

Keep the following in mind:

  • Be careful when invoking a Lambda function from triggers on tables that experience high write traffic. This would result in a large number of calls to your Lambda function. Although calls to the lambda_async procedure are asynchronous, triggers are synchronous.
  • A statement that results in a large number of trigger activations does not wait for the call to the AWS Lambda function to complete. But it does wait for the triggers to complete before returning control to the client.
  • Similarly, you must account for Amazon Kinesis Data Firehose limits. By default, Kinesis Data Firehose is limited to a maximum of 5,000 records/second. For more information, see Monitoring Amazon Kinesis Data Firehose.

In certain cases, it may be optimal to use AWS Database Migration Service (AWS DMS) to capture data changes in Aurora and use Amazon S3 as a target. For example, AWS DMS might be a good option if you don’t need to transform data from Amazon Aurora. The method used in this post gives you the flexibility to transform data from Aurora using Lambda before sending it to Amazon S3. Additionally, the architecture has the benefits of being serverless, whereas AWS DMS requires an Amazon EC2 instance for replication.

For design considerations while using Redshift Spectrum, see Using Amazon Redshift Spectrum to Query External Data.

If you have questions or suggestions, please comment below.

Additional Reading

If you found this post useful, be sure to check out Capturing Data Changes in Amazon Aurora Using AWS Lambda and 10 Best Practices for Amazon Redshift Spectrum

About the Authors

Re Alvarez-Parmar is a solutions architect for Amazon Web Services. He helps enterprises achieve success through technical guidance and thought leadership. In his spare time, he enjoys spending time with his two kids and exploring outdoors.