Tag Archives: database

Migration Complete – Amazon’s Consumer Business Just Turned off its Final Oracle Database

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/migration-complete-amazons-consumer-business-just-turned-off-its-final-oracle-database/

Over my 17 years at Amazon, I have seen that my colleagues on the engineering team are never content to leave good-enough alone. They routinely re-evaluate every internal system to make sure that it is as scalable, efficient, performant, and secure as possible. When they find an avenue for improvement, they will use what they have learned to thoroughly modernize our architectures and implementations, often going so far as to rip apart existing systems and rebuild them from the ground up if necessary.

Today I would like to tell you about an internal database migration effort of this type that just wrapped up after several years of work. Over the years we realized that we were spending too much time managing and scaling thousands of legacy Oracle databases. Instead of focusing on high-value differentiated work, our database administrators (DBAs) spent a lot of time simply keeping the lights on while transaction rates climbed and the overall amount of stored data mounted. This included time spent dealing with complex & inefficient hardware provisioning, license management, and many other issues that are now best handled by modern, managed database services.

More than 100 teams in Amazon’s Consumer business participated in the migration effort. This includes well-known customer-facing brands and sites such as Alexa, Amazon Prime, Amazon Prime Video, Amazon Fresh, Kindle, Amazon Music, Audible, Shopbop, Twitch, and Zappos, as well as internal teams such as AdTech, Amazon Fulfillment Technology, Consumer Payments, Customer Returns, Catalog Systems, Deliver Experience, Digital Devices, External Payments, Finance, InfoSec, Marketplace, Ordering, and Retail Systems.

Migration Complete
I am happy to report that this database migration effort is now complete. Amazon’s Consumer business just turned off its final Oracle database (some third-party applications are tightly bound to Oracle and were not migrated).

We migrated 75 petabytes of internal data stored in nearly 7,500 Oracle databases to multiple AWS database services including Amazon DynamoDB, Amazon Aurora, Amazon Relational Database Service (RDS), and Amazon Redshift. The migrations were accomplished with little or no downtime, and covered 100% of our proprietary systems. This includes complex purchasing, catalog management, order fulfillment, accounting, and video streaming workloads. We kept careful track of the costs and the performance, and realized the following results:

  • Cost Reduction – We reduced our database costs by over 60% on top of the heavily discounted rate we negotiated based on our scale. Customers regularly report cost savings of 90% by switching from Oracle to AWS.
  • Performance Improvements – Latency of our consumer-facing applications was reduced by 40%.
  • Administrative Overhead – The switch to managed services reduced database admin overhead by 70%.

The migration gave each internal team the freedom to choose the purpose-built AWS database service that best fit their needs, and also gave them better control over their budget and their cost model. Low-latency services were migrated to DynamoDB and other highly scalable non-relational databases such as Amazon ElastiCache. Transactional relational workloads with high data consistency requirements were moved to Aurora and RDS; analytics workloads were migrated to Redshift, our cloud data warehouse.

We captured the shutdown of the final Oracle database, and had a quick celebration:

DBA Career Path
As I explained earlier, our DBAs once spent a lot of time managing and scaling our legacy Oracle databases. The migration freed up time that our DBAs now use to do an even better job of performance monitoring and query optimization, all with the goal of letting them deliver a better customer experience.

As part of the migration, we also worked to create a fresh career path for our Oracle DBAs, training them to become database migration specialists and advisors. This training includes education on AWS database technologies, cloud-based architecture, cloud security, and OpEx-style cost management. They now work with both internal and external customers in an advisory role, where they have an opportunity to share their first-hand experience with large-scale migration of mission-critical databases.

Migration Examples
Here are examples drawn from a few of the migrations:

Advertising – After the migration, this team was able to double their database fleet size (and their throughput) in minutes to accommodate peak traffic, courtesy of RDS. Before the migration, this scale-up effort would have taken months.

Buyer Fraud – This team moved 40 TB of data with just one hour of downtime, and realized the same or better performance at half the cost, powered by Amazon Aurora.

Financial Ledger – This team moved 120 TB of data, reduced latency by 40%, cut costs by 70%, and cut overhead by the same 70%, all powered by DynamoDB.

Wallet – This team migrated more than 10 billion records to DynamoDB, reducing latency by 50% and operational costs by 90% in the process. To learn more about this migration, read Amazon Wallet Scales Using Amazon DynamoDB.

My recent Prime Day 2019 post contains more examples of the extreme scale and performance that are possible with AWS.

Migration Resources
If you are ready to migrate from Oracle (or another hand-managed legacy database) to one or more AWS database services, here are some resources to get you started:

AWS Migration Partners – Our slate of AWS Migration Partners have the experience, expertise, and tools to help you to understand, plan, and execute a database migration.

Migration Case Studies – Read How Amazon is Achieving Database Freedom Using AWS to learn more about this effort; read the Prime Video, Advertising, Items & Offers, Amazon Fulfillment, and Analytics case studies to learn more about the examples that I mentioned above.

AWS Professional Services – My colleagues at AWS Professional Services are ready to work alongside you to make your migration a success.

AWS Migration Tools & Services – Check out our Cloud Migration page, read more about Migration Hub, and don’t forget about the Database Migration Service.

AWS Database Freedom – The AWS Database Freedom program is designed to help qualified customers migrate from traditional databases to cloud-native AWS databases.

AWS re:Invent Sessions – We are finalizing an extensive lineup of chalk talks and breakout sessions for AWS re:Invent that will focus on this migration effort, all led by the team members that planned and executed the migrations.

Jeff;

 

 

Orchestrate Amazon Redshift-Based ETL workflows with AWS Step Functions and AWS Glue

Post Syndicated from Ben Romano original https://aws.amazon.com/blogs/big-data/orchestrate-amazon-redshift-based-etl-workflows-with-aws-step-functions-and-aws-glue/

Amazon Redshift is a fully managed, petabyte-scale data warehouse service in the cloud that offers fast query performance using the same SQL-based tools and business intelligence applications that you use today. Many customers also like to use Amazon Redshift as an extract, transform, and load (ETL) engine to use existing SQL developer skillsets, to quickly migrate pre-existing SQL-based ETL scripts, and—because Amazon Redshift is fully ACID-compliant—as an efficient mechanism to merge change data from source data systems.

In this post, I show how to use AWS Step Functions and AWS Glue Python Shell to orchestrate tasks for those Amazon Redshift-based ETL workflows in a completely serverless fashion. AWS Glue Python Shell is a Python runtime environment for running small to medium-sized ETL tasks, such as submitting SQL queries and waiting for a response. Step Functions lets you coordinate multiple AWS services into workflows so you can easily run and monitor a series of ETL tasks. Both AWS Glue Python Shell and Step Functions are serverless, allowing you to automatically run and scale them in response to events you define, rather than requiring you to provision, scale, and manage servers.

While many traditional SQL-based workflows use internal database constructs like triggers and stored procedures, separating workflow orchestration, task, and compute engine components into standalone services allows you to develop, optimize, and even reuse each component independently. So, while this post uses Amazon Redshift as an example, my aim is to more generally show you how to orchestrate any SQL-based ETL.

Prerequisites

If you want to follow along with the examples in this post using your own AWS account, you need a Virtual Private Cloud (VPC) with at least two private subnets that have routes to an S3 VPC endpoint.

If you don’t have a VPC, or are unsure if yours meets these requirements, I provide an AWS CloudFormation template stack you can launch by selecting the following button. Provide a stack name on the first page and leave the default settings for everything else. Wait for the stack to display Create Complete (this should only take a few minutes) before moving on to the other sections.

Scenario

For the examples in this post, I use the Amazon Customer Reviews Dataset to build an ETL workflow that completes the following two tasks, which together represent a simple ETL process:

  • Task 1: Move a copy of the dataset containing reviews from the year 2015 and later from S3 to an Amazon Redshift table.
  • Task 2: Generate a set of output files to another Amazon S3 location which identifies the “most helpful” reviews by market and product category, allowing an analytics team to glean information about high quality reviews.

This dataset is publicly available via an Amazon Simple Storage Service (Amazon S3) bucket. Complete the following tasks to get set up.

Solution overview

The following diagram highlights the solution architecture from end to end:

The steps in this process are as follows:

  1. The state machine launches a series of runs of an AWS Glue Python Shell job (more on how and why I use a single job later in this post!) with parameters for retrieving database connection information from AWS Secrets Manager and an .sql file from S3.
  2. Each run of the AWS Glue Python Shell job uses the database connection information to connect to the Amazon Redshift cluster and submit the queries contained in the .sql file.
    1. For Task 1: The cluster uses Amazon Redshift Spectrum to read data from S3 and load it into an Amazon Redshift table. Amazon Redshift Spectrum is commonly used as a means of loading data into Amazon Redshift. (See Step 7 of Twelve Best Practices for Amazon Redshift Spectrum for more information.)
    2. For Task 2: The cluster executes an aggregation query and exports the results to another Amazon S3 location via UNLOAD.
  3. The state machine may send a notification to an Amazon Simple Notification Service (SNS) topic in the case of pipeline failure.
  4. Users can query the data from the cluster and/or retrieve report output files directly from S3.

I include an AWS CloudFormation template to jumpstart the ETL environment so that I can focus this post on the steps dedicated to building the task and orchestration components. The template launches the following resources:

  • Amazon Redshift Cluster
  • Secrets Manager secret for storing Amazon Redshift cluster information and credentials
  • S3 Bucket preloaded with Python scripts and .sql files
  • Identity and Access Management (IAM) Role for AWS Glue Python Shell jobs

See the following resources for how to complete these steps manually:

Be sure to select at least two private subnets and the corresponding VPC, as shown in the following screenshot. If you are using the VPC template from above, the VPC appears as 10.71.0.0/16 and the subnet names are A private and B private.

The stack should take 10-15 minutes to launch. Once it displays Create Complete, you can move on to the next section. Be sure to take note of the Resources tab in the AWS CloudFormation console, shown in the following screenshot, as I refer to these resources throughout the post.

Building with AWS Glue Python Shell

Begin by navigating to AWS Glue in the AWS Management Console.

Making a connection

The Amazon Redshift cluster resides in a VPC, so you first need to create a connection using AWS Glue. Connections contain properties, including VPC networking information, needed to access your data stores. You eventually attach this connection to your Glue Python Shell Job so that it can reach your Amazon Redshift cluster.

Select Connections from the menu bar, and then select Add connection. Give your connection a name like blog_rs_connection, select Amazon Redshift as the Connection type, and then select Next, as shown in the following screenshot.

Under Cluster, enter the name of the cluster that the AWS CloudFormation template launched, for example blogstack-redshiftcluster-####. Because the Python code I provide for this blog already handles credential retrieval, the rest of the database values you enter here are largely placeholders. The key information you are associating with the connection is networking-related.

Please note that you cannot test the connection without the correct cluster information. If you want to test it, note that Database name and Username are auto-populated after selecting the correct cluster, as shown in the following screenshot. Follow the instructions here to retrieve the password information from Secrets Manager to copy into the Password field.

ETL code review

Take a look at the two main Python scripts used in this example:

pygresql_redshift_common.py is a set of functions that retrieve cluster connection information and credentials from Secrets Manager, make a connection to the cluster, and submit queries. By retrieving cluster information at runtime via a passed parameter, these functions allow the job to connect to any cluster to which it has access. You can package these functions into a library by following the instructions to create a Python .egg file (already completed as a part of the AWS CloudFormation template launch). Note that AWS Glue Python Shell supports several Python libraries natively.

import pg
import boto3
import base64
from botocore.exceptions import ClientError
import json

#uses the Secrets Manager secret name to return connection and credential information
def connection_info(db):

	session = boto3.session.Session()
	client = session.client(
		service_name='secretsmanager'
	)

	get_secret_value_response = client.get_secret_value(SecretId=db)

	if 'SecretString' in get_secret_value_response:
		secret = json.loads(get_secret_value_response['SecretString'])
	else:
		secret = json.loads(base64.b64decode(get_secret_value_response['SecretBinary']))
		
	return secret


#creates a connection to the cluster
def get_connection(db,db_creds):

	con_params = connection_info(db_creds)
	
	rs_conn_string = "host=%s port=%s dbname=%s user=%s password=%s" % (con_params['host'], con_params['port'], db, con_params['username'], con_params['password'])
	rs_conn = pg.connect(dbname=rs_conn_string)
	rs_conn.query("set statement_timeout = 1200000")
	
	return rs_conn


#submits a query to the cluster
def query(con,statement):
    res = con.query(statement)
    return res
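
To make the two branches in connection_info concrete, here is a minimal offline sketch of the same parsing logic, followed by the connection string that get_connection builds. The helper names parse_secret and build_conn_string and all sample values are illustrative assumptions. The dictionaries only mimic the shape of a get_secret_value response, and the binary payload is base64-encoded here so that it matches the decoding step in the script above.

```python
import base64
import json


def parse_secret(response):
    """Return the secret as a dict, whether it is stored as text or binary."""
    if 'SecretString' in response:
        return json.loads(response['SecretString'])
    # Same decoding step as connection_info in the script above.
    return json.loads(base64.b64decode(response['SecretBinary']))


def build_conn_string(db, con_params):
    """Format the keyword/value connection string passed to pg.connect."""
    return "host=%s port=%s dbname=%s user=%s password=%s" % (
        con_params['host'], con_params['port'], db,
        con_params['username'], con_params['password'])


# Hypothetical sample values; a real secret holds the cluster's own details.
sample = {'host': 'example-cluster.example.com', 'port': 5439,
          'username': 'etl_user', 'password': 'not-a-real-password'}

text_response = {'SecretString': json.dumps(sample)}
binary_response = {
    'SecretBinary': base64.b64encode(json.dumps(sample).encode('utf-8'))
}

print(build_conn_string('reviews', parse_secret(text_response)))
```

Keeping the parsing separate from the network call makes it possible to check this logic without touching Secrets Manager.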

The AWS Glue Python Shell job runs rs_query.py when called. It starts by parsing job arguments that are passed at invocation. It uses some of those arguments to retrieve a .sql file from S3, then connects and submits the statements within the file to the cluster using the functions from pygresql_redshift_common.py. So, in addition to connecting to any cluster using the Python library you just packaged, it can also retrieve and run any SQL statement. This means you can manage a single AWS Glue Python Shell job for all of your Amazon Redshift-based ETL by simply passing in parameters on where it should connect and what it should submit to complete each task in your pipeline.

from redshift_module import pygresql_redshift_common as rs_common
import sys
from awsglue.utils import getResolvedOptions
import boto3

#get job args
args = getResolvedOptions(sys.argv,['db','db_creds','bucket','file'])
db = args['db']
db_creds = args['db_creds']
bucket = args['bucket']
file = args['file']

#get sql statements
s3 = boto3.client('s3') 
sqls = s3.get_object(Bucket=bucket, Key=file)['Body'].read().decode('utf-8')
sqls = sqls.split(';')

#get database connection
print('connecting...')
con = rs_common.get_connection(db,db_creds)

#run each sql statement (the file ends with ';', so the last fragment after the split is empty and is skipped)
print("connected...running query...")
results = []
for sql in sqls[:-1]:
    sql = sql + ';'
    result = rs_common.query(con, sql)
    print(result)
    results.append(result)

print(results)
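
Splitting a .sql file on semicolons is simple, but it has edge cases: a file without a trailing semicolon, or one with blank lines at the end. The standalone sketch below shows one defensive way to handle them. The helper name split_statements is hypothetical, and the approach assumes that no statement contains a semicolon inside a string literal or comment.

```python
def split_statements(sql_text):
    """Split a script on ';' and drop fragments that contain only whitespace."""
    statements = []
    for fragment in sql_text.split(';'):
        fragment = fragment.strip()
        if fragment:
            # Re-attach the terminator that split() removed.
            statements.append(fragment + ';')
    return statements


# Illustrative script text; the real job reads this from S3.
script = "CREATE TABLE t (id int);\nINSERT INTO t VALUES (1);\n"
for statement in split_statements(script):
    print(statement)
```

With this variant, the last statement still runs when the file does not end with a semicolon.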

Creating the Glue Python Shell Job

Next, put that code into action:

  1. Navigate to Jobs on the left menu of the AWS Glue console page and from there, select Add job.
  2. Give the job a name like blog_rs_query.
  3. For the IAM role, select the same GlueExecutionRole you previously noted from the Resources section of the AWS CloudFormation console.
  4. For Type, select Python shell, leave Python version as the default of Python 3, and for This job runs select An existing script that you provide.
  5. For S3 path where the script is stored, navigate to the script bucket created by the AWS CloudFormation template (look for ScriptBucket in the Resources), then select the python/py file.
  6. Expand the Security configuration, script libraries, and job parameters section to add the Python .egg file with the Amazon Redshift connection library to the Python library path. It is also located in the script bucket under python/redshift_module-0.1-py3.6.egg.

When all is said and done everything should look as it does in the following screenshot:

Choose Next. Add the connection you created by choosing Select to move it under Required connections. (Recall from the Making a connection section that this gives the job the ability to interact with your VPC.) Choose Save job and edit script to finish, as shown in the following screenshot.
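
If you prefer to script the console steps above, the same settings can be sketched as arguments to the AWS Glue create_job API in boto3. This is a rough sketch rather than the template's actual configuration: the helper name build_python_shell_job is hypothetical, and the role ARN, bucket name, and script location in the example are placeholders that you would replace with the values from your own stack.

```python
def build_python_shell_job(name, role_arn, script_location, egg_location,
                           connection_name):
    """Assemble keyword arguments for the AWS Glue create_job API call."""
    return {
        'Name': name,
        'Role': role_arn,
        'Command': {
            'Name': 'pythonshell',          # job type: Python shell
            'ScriptLocation': script_location,
            'PythonVersion': '3',
        },
        # Extra Python library (.egg) made available to the job.
        'DefaultArguments': {'--extra-py-files': egg_location},
        # Glue connection that carries the VPC networking information.
        'Connections': {'Connections': [connection_name]},
    }


# Placeholder values for illustration only.
job = build_python_shell_job(
    name='blog_rs_query',
    role_arn='arn:aws:iam::123456789012:role/ExampleGlueExecutionRole',
    script_location='s3://example-script-bucket/python/example_script.py',
    egg_location='s3://example-script-bucket/python/redshift_module-0.1-py3.6.egg',
    connection_name='blog_rs_connection',
)

# With AWS credentials configured, the job could then be created with:
#   import boto3
#   boto3.client('glue').create_job(**job)
```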

Test driving the Python Shell job

After creating the job, you are taken to the AWS Glue Python Shell IDE. If everything went well, you should see the rs_query.py code. Right now, the Amazon Redshift cluster is sitting there empty, so use the Python code to run the following SQL statements and populate it with tables.

  1. Create an external database (amzreviews).
  2. Create an external table (reviews) through which Amazon Redshift Spectrum can read the source data in S3 (the public reviews dataset). The table is partitioned by product_category because the source files are organized by category, but in general you should partition on frequently filtered columns (see #4).
  3. Add partitions to the external table.
  4. Create an internal table (reviews) local to the Amazon Redshift cluster. product_id works well as a DISTKEY because it has high cardinality and even distribution, and it is most likely (although not explicitly part of this blog’s scenario) a column that will be used to join with other tables. I chose review_date as the SORTKEY to efficiently filter out review data that is not part of my target query (after 2015). Learn more about how to best choose DISTKEY/SORTKEY as well as additional table design parameters for optimizing performance by reading the Designing Tables documentation.
    CREATE EXTERNAL SCHEMA amzreviews 
    from data catalog
    database 'amzreviews'
    iam_role 'rolearn'
    CREATE EXTERNAL database IF NOT EXISTS;
    
    
    
    CREATE EXTERNAL TABLE amzreviews.reviews(
      marketplace varchar(10), 
      customer_id varchar(15), 
      review_id varchar(15), 
      product_id varchar(25), 
      product_parent varchar(15), 
      product_title varchar(50), 
      star_rating int, 
      helpful_votes int, 
      total_votes int, 
      vine varchar(5), 
      verified_purchase varchar(5), 
      review_headline varchar(25), 
      review_body varchar(1024), 
      review_date date, 
      year int)
    PARTITIONED BY ( 
      product_category varchar(25))
    ROW FORMAT SERDE 
      'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 
    STORED AS INPUTFORMAT 
      'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 
    OUTPUTFORMAT 
      'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
    LOCATION
      's3://amazon-reviews-pds/parquet/';
      
      
      
    ALTER TABLE amzreviews.reviews ADD
    partition(product_category='Apparel') 
    location 's3://amazon-reviews-pds/parquet/product_category=Apparel/'
    partition(product_category='Automotive') 
    location 's3://amazon-reviews-pds/parquet/product_category=Automotive'
    partition(product_category='Baby') 
    location 's3://amazon-reviews-pds/parquet/product_category=Baby'
    partition(product_category='Beauty') 
    location 's3://amazon-reviews-pds/parquet/product_category=Beauty'
    partition(product_category='Books') 
    location 's3://amazon-reviews-pds/parquet/product_category=Books'
    partition(product_category='Camera') 
    location 's3://amazon-reviews-pds/parquet/product_category=Camera'
    partition(product_category='Grocery') 
    location 's3://amazon-reviews-pds/parquet/product_category=Grocery'
    partition(product_category='Furniture') 
    location 's3://amazon-reviews-pds/parquet/product_category=Furniture'
    partition(product_category='Watches') 
    location 's3://amazon-reviews-pds/parquet/product_category=Watches'
    partition(product_category='Lawn_and_Garden') 
    location 's3://amazon-reviews-pds/parquet/product_category=Lawn_and_Garden';
    
    
    CREATE TABLE reviews(
      marketplace varchar(10),
      customer_id varchar(15), 
      review_id varchar(15), 
      product_id varchar(25) DISTKEY, 
      product_parent varchar(15), 
      product_title varchar(50), 
      star_rating int, 
      helpful_votes int, 
      total_votes int, 
      vine varchar(5), 
      verified_purchase varchar(5), 
      review_date date, 
      year int,
      product_category varchar(25))
      
      SORTKEY (
         review_date
        );

Do this first job run manually so you can see where all of the elements I’ve discussed come into play. Select Run Job at the top of the IDE screen. Expand the Security configuration, script libraries, and job parameters section. This is where you add in the parameters as key-value pairs, as shown in the following screenshot.

Key          Value
--db         reviews
--db_creds   reviewssecret
--bucket     <name of s3 script bucket>
--file       sql/reviewsschema.sql

Select Run job to start it. The job should take a few seconds to complete. You can look for log outputs below the code in the IDE to watch job progress.
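
The same run can also be started from code. The sketch below builds the key-value pairs from the table above and passes them to the AWS Glue start_job_run API. The helper names are hypothetical, and the client is passed in as a parameter so that the logic can be exercised without AWS access. Note that the keys carry the leading "--" here, while rs_query.py reads them without it (for example args['db']).

```python
def build_job_arguments(db, db_creds, bucket, sql_file):
    """Return the job parameters in the '--key': 'value' form Glue expects."""
    return {
        '--db': db,
        '--db_creds': db_creds,
        '--bucket': bucket,
        '--file': sql_file,
    }


def start_query_job(glue_client, job_name, arguments):
    """Start one run of the Python Shell job and return its run ID."""
    response = glue_client.start_job_run(JobName=job_name, Arguments=arguments)
    return response['JobRunId']


# The bucket name is a placeholder for the script bucket in your stack.
arguments = build_job_arguments(
    'reviews', 'reviewssecret', 'example-script-bucket', 'sql/reviewsschema.sql')

# With AWS credentials configured:
#   import boto3
#   run_id = start_query_job(boto3.client('glue'), 'blog_rs_query', arguments)
```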

Once the job completes, navigate to Databases in the AWS Glue console and look for the amzreviews database and reviews table, as shown in the following screenshot. If they are there, then everything worked as planned! You can also connect to your Amazon Redshift cluster using the Redshift Query Editor or with your own SQL client tool and look for the local reviews table.

Step Functions Orchestration

Now that you’ve had a chance to run a job manually, it’s time to move onto something more programmatic that is orchestrated by Step Functions.

Launch Template

I provide a third AWS CloudFormation template for kickstarting this process as well. It creates a Step Functions state machine that calls two instances of the AWS Glue Python Shell job you just created to complete the two tasks I outlined at the beginning of this post.

For BucketName, paste the name of the script bucket created in the second AWS CloudFormation stack. For GlueJobName, type in the name of the job you just created. Leave the other information as default, as shown in the following screenshot. Launch the stack and wait for it to display Create Complete—this should take only a couple of minutes—before moving on to the next section.

Working with the Step Functions State Machine

State Machines are made up of a series of steps, allowing you to stitch together services into robust ETL workflows. You can monitor each step of execution as it happens, which means you can identify and fix problems in your ETL workflow quickly, and even automatically.

Take a look at the state machine you just launched to get a better idea. Navigate to Step Functions in the AWS Console and look for a state machine with a name like GlueJobStateMachine-######. Choose Edit to view the state machine configuration, as shown in the following screenshot.

It should look as it does in the following screenshot:

As you can see, state machines are created using JSON templates made up of task definitions and workflow logic. You can run parallel tasks, catch errors, and even pause workflows and wait for manual callback to continue. The example I provide contains two tasks for running the SQL statements that complete the goals I outlined at the beginning of the post:

  1. Load data from S3 using Redshift Spectrum
  2. Transform and write data back to S3

Each task contains basic error handling which, if caught, routes the workflow to an error notification task. This example is a simple one to show you how to build a basic workflow, but you can refer to the Step Functions documentation for an example of more complex workflows to help build a robust ETL pipeline. Step Functions also supports reusing modular components with Nested Workflows.
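
To give a feel for the structure described above, here is a rough sketch of a comparable state machine definition, built as a Python dictionary and serialized to JSON. It is not the definition that the template deploys. The state names ReadFilterJob and NotifyFailure appear later in this post; the names ReportJob and PipelineFailed, the SQL file keys, the topic ARN, and the failure message are assumptions made for illustration.

```python
import json


def glue_task(job_name, bucket, sql_file, next_state=None):
    """Build a Task state that runs the Glue job and waits for it to finish."""
    state = {
        'Type': 'Task',
        'Resource': 'arn:aws:states:::glue:startJobRun.sync',
        'Parameters': {
            'JobName': job_name,
            'Arguments': {
                '--db': 'reviews',
                '--db_creds': 'reviewssecret',
                '--bucket': bucket,
                '--file': sql_file,
            },
        },
        # Basic error handling: route any error to the notification task.
        'Catch': [{'ErrorEquals': ['States.ALL'], 'Next': 'NotifyFailure'}],
    }
    if next_state:
        state['Next'] = next_state
    else:
        state['End'] = True
    return state


def build_definition(job_name, bucket, load_sql, report_sql, topic_arn):
    """Chain the two ETL tasks and add a failure notification path."""
    return {
        'StartAt': 'ReadFilterJob',
        'States': {
            'ReadFilterJob': glue_task(job_name, bucket, load_sql, 'ReportJob'),
            'ReportJob': glue_task(job_name, bucket, report_sql),
            'NotifyFailure': {
                'Type': 'Task',
                'Resource': 'arn:aws:states:::sns:publish',
                'Parameters': {'TopicArn': topic_arn,
                               'Message': 'ETL pipeline failed'},
                'Next': 'PipelineFailed',
            },
            'PipelineFailed': {'Type': 'Fail'},
        },
    }


# Placeholder inputs for illustration only.
definition = build_definition(
    'blog_rs_query', 'example-script-bucket',
    'sql/example_task1.sql', 'sql/example_task2.sql',
    'arn:aws:sns:us-east-1:123456789012:example-topic')
print(json.dumps(definition, indent=2))
```

Because both tasks reuse the same job with different arguments, adding a third SQL step only requires another call to glue_task.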

SQL Review

The state machine will retrieve and run the following SQL statements:

INSERT INTO reviews
SELECT marketplace, customer_id, review_id, product_id, product_parent, product_title, star_rating, helpful_votes, total_votes, vine, verified_purchase, review_date, year, product_category
FROM amzreviews.reviews
WHERE year > 2015;

As I mentioned previously, Amazon Redshift Spectrum is a great way to run ETL using an INSERT INTO statement. This example is a simple load of the data as it is in S3, but keep in mind you can add more complex SQL statements to transform your data prior to loading.

UNLOAD ('SELECT marketplace, product_category, product_title, review_id, helpful_votes, AVG(star_rating) as average_stars FROM reviews GROUP BY marketplace, product_category, product_title, review_id, helpful_votes ORDER BY helpful_votes DESC, average_stars DESC')
TO 's3://bucket/testunload/'
iam_role 'rolearn';

This statement groups reviews by product, ordered by number of helpful votes, and writes to Amazon S3 using UNLOAD.

State Machine execution

Now that everything is in order, start an execution. From the state machine main page select Start an Execution.

Leave the defaults as they are and select Start to begin execution. Once execution begins you are taken to a visual workflow interface where you can follow the execution progress, as shown in the following screenshot.
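
An execution can also be started and monitored from code instead of the console. The following sketch uses the Step Functions start_execution and describe_execution APIs and polls until the execution leaves the RUNNING state. The function name run_and_wait and the polling interval are assumptions; the client is passed in so that the loop can be tested with a stand-in object.

```python
import json
import time


def run_and_wait(sfn_client, state_machine_arn, payload=None, poll_seconds=15):
    """Start an execution and block until it reaches a terminal status."""
    started = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(payload or {}),
    )
    while True:
        status = sfn_client.describe_execution(
            executionArn=started['executionArn'])['status']
        # Terminal statuses include SUCCEEDED, FAILED, TIMED_OUT, and ABORTED.
        if status != 'RUNNING':
            return status
        time.sleep(poll_seconds)


# With AWS credentials configured (the ARN is a placeholder):
#   import boto3
#   print(run_and_wait(boto3.client('stepfunctions'),
#                      'arn:aws:states:us-east-1:123456789012:stateMachine:Example'))
```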

Each of the queries takes a few minutes to run. In the meantime, you can watch the Amazon Redshift query logs to track the query progress in real time. These can be found by navigating to Amazon Redshift in the AWS Console, selecting your Amazon Redshift cluster, and then selecting the Queries tab, as shown in the following screenshot.

Once you see COMPLETED for both queries, navigate back to the state machine execution. You should see success for each of the states, as shown in the following screenshot.

Next, navigate to the data bucket in the S3 AWS Console page (refer to the DataBucket in the CloudFormation Resources tab). If all went as planned, you should see a folder named testunload in the bucket with the unloaded data, as shown in the following screenshot.

Inject Failure into Step Functions State Machine

Next, test the error handling component of the state machine by intentionally causing an error. An easy way to do this is to edit the state machine and misspell the name of the Secrets Manager secret in the ReadFilterJob task, as shown in the following screenshot.

If you want the error output sent to you, optionally subscribe to the error notification SNS Topic. Start another state machine execution as you did previously. This time the workflow should take the path toward the NotifyFailure task, as shown in the following screenshot. If you subscribed to the SNS Topic associated with it, you should receive a message shortly thereafter.
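
Subscribing to the topic can be done in the console or with a short script. The sketch below uses the Amazon SNS subscribe API with the email protocol. The function name, topic ARN, and address are placeholders, and an email subscription only becomes active after the recipient confirms it.

```python
def subscribe_email(sns_client, topic_arn, email_address):
    """Request an email subscription to the error notification topic."""
    response = sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol='email',
        Endpoint=email_address,
    )
    return response['SubscriptionArn']


# With AWS credentials configured (placeholder values):
#   import boto3
#   subscribe_email(boto3.client('sns'),
#                   'arn:aws:sns:us-east-1:123456789012:example-topic',
#                   'you@example.com')
```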

The state machine logs will show the error in more detail, as shown in the following screenshot.

Conclusion

In this post I demonstrated how you can orchestrate Amazon Redshift-based ETL using serverless AWS Step Functions and AWS Glue Python Shell jobs. As I mentioned in the introduction, the concepts can also be more generally applied to other SQL-based ETL, so use them to start building your own SQL-based ETL pipelines today!

 


About the Author

Ben Romano is a Data Lab solution architect at AWS. Ben helps our customers architect and build data and analytics prototypes in just four days in the AWS Data Lab.

 

 

 

 

Protect and Audit PII data in Amazon Redshift with DataSunrise Security

Post Syndicated from Saunak Chandra original https://aws.amazon.com/blogs/big-data/protect-and-audit-pii-data-in-amazon-redshift-with-datasunrise-security/

DataSunrise, in their own words: DataSunrise is a database security software company that offers a breadth of security solutions, including data masking (dynamic and static masking), activity monitoring, database firewalls, and sensitive data discovery for various databases. The goal is to protect databases against external and internal threats and vulnerabilities. Customers often choose DataSunrise Database Security because it gives them unified control and a single-user experience when protecting different database engines that run on AWS, including Amazon Redshift, Amazon Aurora, all Amazon RDS database engines, Amazon DynamoDB, and Amazon Athena, among others. DataSunrise Security Suite is a set of tools that can protect and audit PII data in Amazon Redshift.

DataSunrise offers passive security with data auditing in addition to active data and database security. Active security is based on predefined security policies, such as preventing unauthorized access to sensitive data, blocking suspicious SQL queries, preventing SQL-injection attacks, or dynamically masking and obfuscating data in real time. DataSunrise comes with high availability, failover, and automatic scaling.

This post focuses on active security for Amazon Redshift, in particular DataSunrise’s capabilities for masking and access control of personally identifiable information (PII), which you can back with DataSunrise’s passive security offerings such as auditing access of sensitive information. This post discusses DataSunrise security for Amazon Redshift, how it works, and how to get started.

Why you need active security for Amazon Redshift

Amazon Redshift is a massively parallel processing (MPP), fully managed petabyte-scale data warehouse (DW) solution with over 15,000 deployments worldwide. Amazon Redshift provides a database encryption mechanism to protect sensitive data, such as payment information and health insurance. For more information, see Amazon Redshift Database Encryption.

Many organizations store sensitive data, commonly classified as personally identifiable information (PII) or sensitive personal information (SPI). You may need solutions to manage access control to such sensitive information, and want to manage it efficiently and flexibly, preferably using a management tool. DataSunrise is a centralized management tool that masks that data. It addresses the PII and SPI access control requirement by allowing you to enforce masking policies on all queries that run against your Amazon Redshift data warehouse.

What DataSunrise masking does

DataSunrise masks the results of queries against Amazon Redshift by acting as a proxy layer between your applications and the Amazon Redshift backend. Data flow, bindings, and so on stay transparent to the application, while your end users receive masked or obfuscated data that lets them do their job without unintentionally exposing PII.

DataSunrise can exempt users who are authorized to access this information by composing policies and choosing from predefined policy templates that would allow those users to bypass masking constraints when needed.

How it works

DataSunrise operates as a proxy between users or applications that connect to the database and the database server. DataSunrise intercepts the traffic for in-depth analysis and filtering. It applies data masking and access control policies to enforce active security policies against your PII data. When the database firewall is enabled and a security policy violation is detected, DataSunrise can block the malicious SQL query and notify administrators via SMTP or SNMP. With real-time alerts, you can maintain continuous database security and streamline compliance.

DataSunrise operates as a proxy

Getting started with DataSunrise

You can deploy DataSunrise on a Windows or Linux instance in Amazon EC2. You can download a fully prepared DataSunrise AMI from AWS Marketplace to protect your Amazon Redshift cluster. DataSunrise Database and Data Security are available for both Windows and Linux platforms.

After deploying DataSunrise, you can configure security policies for Amazon Redshift and create data masking and access control security rules. After you configure and activate the security policies, DataSunrise enacts those policies against the user and application traffic that would connect to the database through DataSunrise’s proxy.

DataSunrise customers need to configure an inbound rule on the Amazon Redshift cluster security group that allows the DataSunrise IP address. For more information, see Amazon Redshift Cluster Security Group. Additionally, you can include the DataSunrise security group in the cluster security group when DataSunrise runs in the same AWS VPC. Users can execute queries only by connecting to the DataSunrise endpoint, not to the Amazon Redshift cluster endpoint. All DB users and groups are imported from Amazon Redshift into DataSunrise for authentication and authorization to Amazon Redshift objects.

Creating a dynamic data masking rule

Masking obfuscates part or an entire column value. When a column is masked, the column values are replaced by fake values. It is effected either by replacing some original characters with fake ones or by using some masking functions. DataSunrise has many built-in masking functions for credit card numbers, e-mails, etc. Masking protects sensitive or personally identifiable data such as credit card numbers. This is not the same as encryption or hashing, which applies a sophisticated algorithm to a scalar value to convert it into another value.
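To make the idea of replacing some original characters with fake ones concrete, here is a minimal Python sketch of two masking functions. These are illustrative stand-ins, not DataSunrise's built-in functions: the names mask_credit_card and mask_email, and the choice of which characters to keep, are assumptions made for this example.

```python
def mask_credit_card(number: str, visible: int = 4, fill: str = "X") -> str:
    """Replace every digit except the last `visible` ones with `fill`."""
    total_digits = sum(ch.isdigit() for ch in number)
    to_mask = max(total_digits - visible, 0)
    masked = []
    for ch in number:
        if ch.isdigit() and to_mask > 0:
            masked.append(fill)
            to_mask -= 1
        else:
            masked.append(ch)  # keep separators and the trailing digits
    return "".join(masked)


def mask_email(address: str, fill: str = "*") -> str:
    """Keep the first character of the local part and the whole domain."""
    local, sep, domain = address.partition("@")
    if not sep or not local:
        return fill * len(address)  # not an e-mail address: mask everything
    return local[0] + fill * (len(local) - 1) + sep + domain


if __name__ == "__main__":
    print(mask_credit_card("4111-1111-1111-1234"))
    print(mask_email("jane.doe@example.com"))
```

Unlike hashing or encryption, the masked value keeps the shape of the original (length, separators, domain), which is what lets reports and applications keep working on masked data.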

You can create dynamic masking rules using object-based filters in DataSunrise’s console. DataSunrise identifies the protected objects during application calls and enforces those security rules against targeted operations, schemas, or objects in general within your Amazon Redshift cluster. Security administrators can enable those rules granularly based on the object level and caller identity. They can allow exemptions when needed by identifying authorized callers.

To perform dynamic masking in DataSunrise, you need to create data masking rules as part of defining such security policies.

Complete the following steps to create those masking policies:

  1. In the DataSunrise console, choose Masking > Dynamic Masking Rules.
  2. Choose Add Rule. Add required information.

    Create Dynamic Data Masking Rule

  3. In the Masking Settings section, choose Select, navigate to a table in a schema, and check the columns you want to mask. See the following screenshot of the Check Columns page:

    Redshift columns to enable dynamic masking

Choose Done after you decide which columns to protect, then choose the masking method and any relevant settings so that the masked values remain useful for your business needs.

In Add Rule, Filter Sessions, you can choose which users, applications, hosts, and more are affected by this rule.
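The interaction between a masking rule and its session filter can be sketched in a few lines of Python. This is a simplified, hypothetical model and not DataSunrise's implementation or API: the MaskingRule class, its fields, and the user names are assumptions chosen only to show how an exempt caller bypasses masking while everyone else receives masked values.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Set


@dataclass
class MaskingRule:
    """A hypothetical, simplified model of a dynamic masking rule."""
    table: str
    column_maskers: Dict[str, Callable[[str], str]]
    exempt_users: Set[str] = field(default_factory=set)

    def apply(self, user: str, table: str, row: Dict[str, str]) -> Dict[str, str]:
        # Exempt users and unrelated tables pass through unchanged.
        if table != self.table or user in self.exempt_users:
            return dict(row)
        return {
            col: self.column_maskers[col](val) if col in self.column_maskers else val
            for col, val in row.items()
        }


def last_four(value: str) -> str:
    """Mask all but the last four characters."""
    return "X" * max(len(value) - 4, 0) + value[-4:]


rule = MaskingRule(
    table="public.customers",
    column_maskers={"card_number": last_four},
    exempt_users={"security_admin"},
)

row = {"name": "Jane", "card_number": "4111111111111234"}
print(rule.apply("analyst", "public.customers", row))
print(rule.apply("security_admin", "public.customers", row))
```

Because the rule is evaluated per session, the same query returns masked data to the analyst and the original data to the exempt administrator, without any change to the stored table.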

Creating a static data masking rule

You can mask data permanently with static masking, as opposed to dynamic data masking. During static masking, DataSunrise copies each selected table into a separate schema or database and stores it there permanently, so static masking may require additional storage space. Some of these tables have columns with masked content stored on disk. The replicated schema is fully functional, and you can run user queries against it. The source tables remain untouched, so the unmasked data can still be viewed there. If the original data changes, you need to repeat the static masking procedure, and first truncate the tables that hold the previously masked data.
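The copy-and-mask behavior can be illustrated with a small, self-contained Python sketch. It uses an in-memory SQLite database as a stand-in for Amazon Redshift, two tables in place of two schemas, and DELETE in place of TRUNCATE (which SQLite lacks); the table names, columns, and the mask_card function are all assumptions for illustration, not DataSunrise's actual procedure.

```python
import sqlite3


def mask_card(value: str) -> str:
    """Keep the last four characters and replace the rest with X."""
    return "X" * max(len(value) - 4, 0) + value[-4:]


def static_mask(conn: sqlite3.Connection) -> None:
    """Rebuild the masked copy of customers from the current source rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS masked_customers "
        "(id INTEGER, name TEXT, card_number TEXT)"
    )
    # Clear previously masked rows before repeating the procedure.
    conn.execute("DELETE FROM masked_customers")
    rows = conn.execute("SELECT id, name, card_number FROM customers").fetchall()
    conn.executemany(
        "INSERT INTO masked_customers VALUES (?, ?, ?)",
        [(rid, name, mask_card(card)) for rid, name, card in rows],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, card_number TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Jane', '4111111111111234')")
static_mask(conn)

source = conn.execute("SELECT card_number FROM customers").fetchall()
masked = conn.execute("SELECT card_number FROM masked_customers").fetchall()
print(source, masked)
```

The source table is never modified, and rerunning static_mask after the source changes replaces the masked copy rather than appending to it.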

  1. From the menu, choose Masking > Static Masking.
  2. In New Static Masking Task, in Source and Target Instances, choose the source database, schema, and corresponding target destination. See the following screenshot of the New Static Masking Task page:
  3. In Select Source Tables to Transfer and Columns to Mask, choose the objects to which you wish to apply masking. The following screenshot shows the list of available tables:

DataSunrise also enables you to reschedule recurring static masking jobs so you can refresh your masked records based on your source or production data.

Static data masking under DataSunrise applies to Amazon Redshift local tables. In addition to local tables, Amazon Redshift allows querying external tables in Amazon S3; however, DataSunrise does not support static masking on data stored in Amazon S3 accessed in Amazon Redshift via external tables. For more information, see Using Amazon Redshift Spectrum to Query External Data.

Creating a security/access control rule

While data masking can help in many cases to allow your Amazon Redshift users the appropriate access, you may need further enforcement of access control to filter out any operations that might violate your security strategy. DataSunrise can import database users’ and groups’ metadata from Amazon Redshift, which the DataSunrise administrator can use to configure the security profile. If you already have a set of users defined in your existing Redshift DB you don’t need to additionally recreate the users for DataSunrise. DataSunrise will use this list of users only to configure rules. DataSunrise does not modify any traffic related to the authentication process of database users by default.

  1. In the DataSunrise console, from the menu, choose Security > Rules. The following screenshot shows the Security Rules page:
  2. Choose Add Rule. The following screenshot shows the details you can enter to compose this new rule:

DataSunrise also allows you to compose your rule to restrict or allow certain users, applications, hosts, and so on from performing activities that you consider prohibited against particular objects or areas within your Amazon Redshift cluster.

The following screenshot shows the Filter Sessions page:

DataSunrise enables you to create rules for specific objects, query groups, query types, and SQL injection activities, and trigger actions when authorization errors occur.

Static masking doesn’t have a direct impact on performance, but if a customer uses the DataSunrise custom function, it could impact performance because custom functions execute on the DataSunrise server.

Using DataSunrise audit and compliance policies

From the console, choose Compliance > Add Compliance.

In the Compliance orchestrator page, you can initiate a scan of your Amazon Redshift cluster to identify all PII data or sensitive information in general, per your compliance standards. DataSunrise comes bundled with internal scans for HIPAA, GDPR, and other compliance standards, but you can create or amend any of those libraries to accommodate any special requirements that your security strategy mandates. The following screenshot shows the Add Compliance page:

After completing the scan, DataSunrise guides you through the process of composing rules for sensitive information within your Amazon Redshift cluster.

You can also create your audit rules manually. The following screenshot shows the New Audit Rule page:

You can set audit rules for any restriction to make sure that transactional trails are only collected when necessary. You can target objects starting from the entire database down to a single column in your Amazon Redshift cluster. See the following screenshot:

Conclusion

DataSunrise’s masking feature allows for descriptive specifications of access control to sensitive columns, in addition to built-in encryption provided by the Amazon Redshift cluster. Its proxy enables more fine-grained access control, auditing, and masking capabilities to better monitor, protect, and comply with regulatory standards that address the ever-increasing needs of securing and protecting data. DataSunrise’s integration with Amazon Redshift addresses those concerns by simplifying and automating the security rules and its applications. Keep your data safe and protected at all times!

To get started with DataSunrise with Amazon Redshift, visit DataSunrise in AWS Marketplace.

The content and opinions in this post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.

 


About the Authors


Saunak Chandra is a senior partner solutions architect for Redshift at AWS. Saunak likes to experiment with new products in the technology space alongside his day-to-day work. He loves exploring nature in the Pacific Northwest. A short hike or bike ride on the trails is his favorite weekend morning routine. He also likes to do yoga when he gets time away from his kid.

Radik Chumaren is an engineering leader at DataSunrise. Radik specializes in heterogeneous database environments, with a focus on building database security software in the cloud. He enjoys reading and playing soccer.

 

 

 

 

NoSQL Workbench for Amazon DynamoDB – Available in Preview

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/nosql-workbench-for-amazon-dynamodb-available-in-preview/

I am always impressed by the flexibility of Amazon DynamoDB, providing our customers a fully-managed key-value and document database that can easily scale from a few requests per month to millions of requests per second.

The DynamoDB team released so many great features recently, from on-demand capacity, to support for native ACID transactions. Here’s a great recap of other recent DynamoDB announcements such as global tables, point-in-time recovery, and instant adaptive capacity. DynamoDB now encrypts all customer data at rest by default.

However, switching mindset from a relational database to NoSQL is not that easy. Last year we had two amazing talks at re:Invent that can help you understand how DynamoDB works, and how you can use it for your use cases:

To help you even further, we are introducing today in preview NoSQL Workbench for Amazon DynamoDB, a free, client-side application available for Windows and macOS to help you design and visualize your data model, run queries on your data, and generate the code for your application!

The three main capabilities provided by the NoSQL Workbench are:

  • Data modeler — to build new data models, adding tables and indexes, or to import, modify, and export existing data models.
  • Visualizer — to visualize data models based on their application's access patterns, with sample data that you can add manually or import via a SQL query.
  • Operation builder — to define and execute data-plane operations or generate ready-to-use sample code for them.

To see how this new tool can simplify working with DynamoDB, let’s build an application to retrieve information on customers and their orders.

Using the NoSQL Workbench
In the Data modeler, I start by creating a CustomerOrders data model, and I add a table, CustomerAndOrders, to hold my customer data and the information on their orders. You can use this tool to create a simple data model where customers and orders are in two distinct tables, each one with their own primary keys. There would be nothing wrong with that. Here I’d like to show how this tool can also help you use more advanced design patterns. By having the customer and order data in a single table, I can construct queries that return all the data I need with a single interaction with DynamoDB, speeding up the performance of my application.

As partition key, I use the customerId. This choice provides an even distribution of data across multiple partitions. The sort key in my data model will be an overloaded attribute, in the sense that it can hold different data depending on the item:

  • A fixed string, for example customer, for the items containing the customer data.
  • The order date, written using ISO 8601 strings such as 20190823, for the items containing orders.

By overloading the sort key with these two possible values, I am able to run a single query that returns the customer data and the most recent orders. For this reason, I use a generic name for the sort key. In this case, I use sk.

Apart from the partition key and the optional sort key, DynamoDB has a flexible schema, and the other attributes can be different for each item in a table. However, with this tool I have the option to describe in the data model all the possible attributes I am going to use for a table. In this way, I can check later that all the access patterns I need for my application work well with this data model.

For this table, I add the following attributes:

  • customerName and customerAddress, for the items in the table containing customer data.
  • orderId and deliveryAddress, for the items in the table containing order data.

I am not adding an orderDate attribute, because for this data model the value will be stored in the sk sort key. For a real production use case, you would probably have many more attributes to describe your customers and orders, but I am trying to keep things simple enough here to show what you can do, without getting lost in details.

Another access pattern for my application is to be able to get a specific order by ID. For that, I add a global secondary index to my table, with orderId as partition key and no sort key.

I add the table definition to the data model, and move on to the Visualizer. There, I update the table by adding some sample data. I add data manually, but I could import a few rows from a table in a MySQL database, for example to simplify a NoSQL migration from a relational database.

Now, I visualize my data model with the sample data to have a better understanding of what to expect from this table. For example, if I select a customerId, and I query for all the orders greater than a specific date, I also get the customer data at the end, because the string customer, stored in the sk sort key, is always greater than any date written in ISO 8601 syntax.
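The ordering that makes this work can be checked with plain Python, without calling DynamoDB. The sketch below emulates a query on one customerId with a "greater than" condition on sk; the sample items and the query_after helper are made up for illustration. For ASCII strings such as these, Python's string comparison orders values the same way as DynamoDB's byte-wise comparison of sort keys.

```python
items = [
    {"customerId": "123", "sk": "customer", "customerName": "Jane"},
    {"customerId": "123", "sk": "20190801", "orderId": "o-1"},
    {"customerId": "123", "sk": "20190823", "orderId": "o-2"},
    {"customerId": "123", "sk": "20190905", "orderId": "o-3"},
]


def query_after(items, customer_id, start_date):
    """Emulate a query on customerId with the condition sk > start_date."""
    matches = [
        item for item in items
        if item["customerId"] == customer_id and item["sk"] > start_date
    ]
    # Results come back ordered by sort key, ascending.
    return sorted(matches, key=lambda item: item["sk"])


result = query_after(items, "123", "20190815")
print([item["sk"] for item in result])
```

Digits sort before lowercase letters, so the customer item always lands after every date-based sort key and is returned last.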

In the Visualizer, I can also see how the global secondary index on the orderId works. Interestingly, items without an orderId are not part of this index, so I get only 4 of the 6 items that are part of my sample data. This happens because DynamoDB writes a corresponding index entry only if the index key value is present in the item (here, the orderId partition key of the index). If the index key doesn't appear in every table item, the index is said to be sparse. Sparse indexes are useful for queries over a subsection of a table.
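A sparse index is easy to emulate in a few lines of Python. The sketch below is only a model of the behavior, not how DynamoDB builds indexes internally; the six sample items (two customers and four orders) and the build_index helper are invented for this example.

```python
def build_index(items, index_key):
    """Group items by index_key, skipping items that lack the attribute."""
    index = {}
    for item in items:
        if index_key in item:
            index.setdefault(item[index_key], []).append(item)
    return index


table = [
    {"customerId": "123", "sk": "customer", "customerName": "Jane"},
    {"customerId": "123", "sk": "20190801", "orderId": "o-1"},
    {"customerId": "123", "sk": "20190823", "orderId": "o-2"},
    {"customerId": "456", "sk": "customer", "customerName": "Raj"},
    {"customerId": "456", "sk": "20190710", "orderId": "o-3"},
    {"customerId": "456", "sk": "20190905", "orderId": "o-4"},
]

order_index = build_index(table, "orderId")
indexed = sum(len(entries) for entries in order_index.values())
print(f"{indexed} of {len(table)} items are in the index")
print(order_index["o-3"])
```

The two customer items have no orderId, so they never appear in the index, and a lookup by orderId touches only order items.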

I now commit my data model to DynamoDB. This step creates server-side resources such as tables and global secondary indexes for the selected data model, and loads the sample data. To do so, I need AWS credentials for an AWS account. I have the AWS Command Line Interface (CLI) installed and configured in the environment where I am using this tool, so I can just select one of my named profiles.

I move to the Operation builder, where I see all the tables in the selected AWS Region. I select the newly created CustomerAndOrders table to browse the data and build the code for the operations I need in my application.

In this case, I want to run a query that, for a specific customer, selects all orders more recent than a date I provide. As we saw previously, the overloaded sort key would also return the customer data as the last item. The Operation builder can help you use the full syntax of DynamoDB operations, for example adding conditions and child expressions. In this case, I add the condition to only return orders where the deliveryAddress contains Seattle.
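For reference, a query like this one could be written by hand with boto3 roughly as follows. This is a sketch under assumptions, not the code that the Operation builder generates: it assumes customerId, sk, and deliveryAddress are stored as strings, uses made-up values, and ignores pagination. The build_query_params and run_query helpers are names invented for this example.

```python
def build_query_params(customer_id: str, start_date: str, city: str) -> dict:
    """Build the arguments for a low-level DynamoDB Query call."""
    return {
        "TableName": "CustomerAndOrders",
        # Key condition: one customer, sort key greater than the given date.
        "KeyConditionExpression": "customerId = :cid AND sk > :start",
        # Filter applied to the items selected by the key condition.
        "FilterExpression": "contains(deliveryAddress, :city)",
        "ExpressionAttributeValues": {
            ":cid": {"S": customer_id},
            ":start": {"S": start_date},
            ":city": {"S": city},
        },
    }


def run_query(params: dict) -> list:
    """Run the query with boto3 (needs AWS credentials and the table)."""
    import boto3  # imported here so the builder works without boto3 installed

    client = boto3.client("dynamodb")
    return client.query(**params)["Items"]


params = build_query_params("123", "20190815", "Seattle")
print(params["KeyConditionExpression"])
```

Note that with this filter in place, the customer item is dropped from the result, because it has no deliveryAddress attribute for contains to match.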

I have the option to execute the operation on the DynamoDB table, but this time I want to use the query in my application. To generate the code, I select between Python, JavaScript (Node.js), or Java.

You can use the Operation builder to generate the code for all the access patterns that you plan to use with your application, using all the advanced features that DynamoDB provides, including ACID transactions.

Available Now
You can find how to set up NoSQL Workbench for Amazon DynamoDB (Preview) for Windows and macOS here.

We welcome your suggestions in the DynamoDB discussion forum. Let us know what you build with this new tool and how we can help you more!

Learn about AWS Services & Solutions – September AWS Online Tech Talks

Post Syndicated from Jenny Hang original https://aws.amazon.com/blogs/aws/learn-about-aws-services-solutions-september-aws-online-tech-talks/


Join us this September to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register Now!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

 

Compute:

September 23, 2019 | 11:00 AM – 12:00 PM PT – Build Your Hybrid Cloud Architecture with AWS – Learn about the extensive range of services AWS offers to help you build a hybrid cloud architecture best suited for your use case.

September 26, 2019 | 1:00 PM – 2:00 PM PT – Self-Hosted WordPress: It’s Easier Than You Think – Learn how you can easily build a fault-tolerant WordPress site using Amazon Lightsail.

October 3, 2019 | 11:00 AM – 12:00 PM PT – Lower Costs by Right Sizing Your Instance with Amazon EC2 T3 General Purpose Burstable Instances – Get an overview of T3 instances, understand what workloads are ideal for them, and understand how the T3 credit system works so that you can lower your EC2 instance costs today.

 

Containers:

September 26, 2019 | 11:00 AM – 12:00 PM PT – Develop a Web App Using Amazon ECS and AWS Cloud Development Kit (CDK) – Learn how to build your first app using CDK and AWS container services.

 

Data Lakes & Analytics:

September 26, 2019 | 9:00 AM – 10:00 AM PT – Best Practices for Provisioning Amazon MSK Clusters and Using Popular Apache Kafka-Compatible Tooling – Learn best practices on running Apache Kafka production workloads at a lower cost on Amazon MSK.

 

Databases:

September 25, 2019 | 1:00 PM – 2:00 PM PT – What’s New in Amazon DocumentDB (with MongoDB compatibility) – Learn what’s new in Amazon DocumentDB, a fully managed MongoDB compatible database service designed from the ground up to be fast, scalable, and highly available.

October 3, 2019 | 9:00 AM – 10:00 AM PT – Best Practices for Enterprise-Class Security, High-Availability, and Scalability with Amazon ElastiCache – Learn about new enterprise-friendly Amazon ElastiCache enhancements like customer managed key and online scaling up or down to make your critical workloads more secure, scalable and available.

 

DevOps:

October 1, 2019 | 9:00 AM – 10:00 AM PT – CI/CD for Containers: A Way Forward for Your DevOps Pipeline – Learn how to build CI/CD pipelines using AWS services to get the most out of the agility afforded by containers.

 

Enterprise & Hybrid:

September 24, 2019 | 1:00 PM – 2:30 PM PT – Virtual Workshop: How to Monitor and Manage Your AWS Costs – Learn how to visualize and manage your AWS cost and usage in this virtual hands-on workshop.

October 2, 2019 | 1:00 PM – 2:00 PM PT – Accelerate Cloud Adoption and Reduce Operational Risk with AWS Managed Services – Learn how AMS accelerates your migration to AWS, reduces your operating costs, improves security and compliance, and enables you to focus on your differentiating business priorities.

 

IoT:

September 25, 2019 | 9:00 AM – 10:00 AM PT – Complex Monitoring for Industrial with AWS IoT Data Services – Learn how to solve your complex event monitoring challenges with AWS IoT Data Services.

 

Machine Learning:

September 23, 2019 | 9:00 AM – 10:00 AM PT – Training Machine Learning Models Faster – Learn how to train machine learning models quickly and with a single click using Amazon SageMaker.

September 30, 2019 | 11:00 AM – 12:00 PM PT – Using Containers for Deep Learning Workflows – Learn how containers can help address challenges in deploying deep learning environments.

October 3, 2019 | 1:00 PM – 2:30 PM PT – Virtual Workshop: Getting Hands-On with Machine Learning and Ready to Race in the AWS DeepRacer League – Join DeClercq Wentzel, Senior Product Manager for AWS DeepRacer, for a presentation on the basics of machine learning and how to build a reinforcement learning model that you can use to join the AWS DeepRacer League.

 

AWS Marketplace:

September 30, 2019 | 9:00 AM – 10:00 AM PT – Advancing Software Procurement in a Containerized World – Learn how to deploy applications faster with third-party container products.

 

Migration:

September 24, 2019 | 11:00 AM – 12:00 PM PT – Application Migrations Using AWS Server Migration Service (SMS) – Learn how to use AWS Server Migration Service (SMS) for automating application migration and scheduling continuous replication, from your on-premises data centers or Microsoft Azure to AWS.

 

Networking & Content Delivery:

September 25, 2019 | 11:00 AM – 12:00 PM PT – Building Highly Available and Performant Applications using AWS Global Accelerator – Learn how to build highly available and performant architectures for your applications with AWS Global Accelerator, now with source IP preservation.

September 30, 2019 | 1:00 PM – 2:00 PM PT – AWS Office Hours: Amazon CloudFront – Just getting started with Amazon CloudFront and Lambda@Edge? Get answers directly from our experts during AWS Office Hours.

 

Robotics:

October 1, 2019 | 11:00 AM – 12:00 PM PT – Robots and STEM: AWS RoboMaker and AWS Educate Unite! – Come join members of the AWS RoboMaker and AWS Educate teams as we provide an overview of our education initiatives and walk you through the newly launched RoboMaker Badge.

 

Security, Identity & Compliance:

October 1, 2019 | 1:00 PM – 2:00 PM PT – Deep Dive on Running Active Directory on AWS – Learn how to deploy Active Directory on AWS and start migrating your Windows workloads.

 

Serverless:

October 2, 2019 | 9:00 AM – 10:00 AM PT – Deep Dive on Amazon EventBridge – Learn how to optimize event-driven applications, and use rules and policies to route, transform, and control access to these events that react to data from SaaS apps.

 

Storage:

September 24, 2019 | 9:00 AM – 10:00 AM PT – Optimize Your Amazon S3 Data Lake with S3 Storage Classes and Management Tools – Learn how to use the Amazon S3 Storage Classes and management tools to better manage your data lake at scale and to optimize storage costs and resources.

October 2, 2019 | 11:00 AM – 12:00 PM PT – The Great Migration to Cloud Storage: Choosing the Right Storage Solution for Your Workload – Learn more about AWS storage services and identify which service is the right fit for your business.

 

 

Creating custom Pinpoint dashboards using Amazon QuickSight, part 3

Post Syndicated from Brent Meyer original https://aws.amazon.com/blogs/messaging-and-targeting/creating-custom-pinpoint-dashboards-using-amazon-quicksight-part-3/

Note: This post was written by Manan Nayar and Aprajita Arora, Software Development Engineers on the AWS Digital User Engagement team.


This is the third and final post in our series about creating custom visualizations of your Amazon Pinpoint metrics using Amazon QuickSight.

In our first post, we used the Metrics APIs to retrieve specific Key Performance Indicators (KPIs), and then created visualizations using QuickSight. In the second post, we used the event stream feature in Amazon Pinpoint to enable more in-depth analyses.

The examples in the first two posts used Amazon S3 to store the metrics that we retrieved from Amazon Pinpoint. This post takes a different approach, using Amazon Redshift to store the data. By using Redshift to store this data, you gain the ability to create visualizations on large data sets. This example is useful in situations where you have a large volume of event data, and situations where you need to store your data for long periods of time.

Step 1: Provision the storage

The first step in setting up this solution is to create the destinations where you’ll store the Amazon Pinpoint event data. Since we’ll be storing the data in Amazon Redshift, we need to create a new Redshift cluster. We’ll also create an S3 bucket, which will house the original event data that’s streamed from Amazon Pinpoint.

To create the Redshift cluster and the S3 bucket

  1. Create a new Redshift cluster. To learn more, see the Amazon Redshift Getting Started Guide.
  2. Create a new table in the Redshift cluster that contains the appropriate columns. Use the following query to create the table:
    create table if not exists pinpoint_events_table(
      rowid varchar(255) not null,
      project_key varchar(100) not null,
      event_type varchar(100) not null,
      event_timestamp timestamp not null,
      campaign_id varchar(100),
      campaign_activity_id varchar(100),
      treatment_id varchar(100),
      PRIMARY KEY (rowid)
    );
  3. Create a new Amazon S3 bucket. For complete procedures, see Create a Bucket in the Amazon S3 Getting Started Guide.

Step 2: Set up the event stream

This example uses the event stream feature of Amazon Pinpoint to send event data to S3. Later, we’ll create a Lambda function that sends the event data to your Redshift cluster when new event data is added to the S3 bucket. This method lets us store the original event data in S3, and transfer a subset of that data to Redshift for analysis.
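To give a sense of what "a subset of that data" means, the following sketch turns one raw event into the six values that end up in a Redshift row. The field names (application.app_id, event_type, event_timestamp, and the campaign attributes) are the ones the Lambda function in Step 3 reads; the sample record, its values, and the to_row helper are made up for illustration, and real events contain many more fields. The sketch converts the timestamp in UTC so that the result doesn't depend on the local time zone.

```python
import json
from datetime import datetime, timezone

# A trimmed-down, invented event record with only the fields used here.
sample = '''{"event_type": "_email.send", "event_timestamp": 1568000000000,
 "application": {"app_id": "a1b2c3"},
 "attributes": {"campaign_id": "c-42"}}'''


def to_row(raw_event: str) -> tuple:
    """Extract the columns stored in Redshift from one raw event."""
    event = json.loads(raw_event)
    attributes = event.get("attributes") or {}
    # event_timestamp is in milliseconds since the Unix epoch.
    timestamp = datetime.fromtimestamp(
        event["event_timestamp"] / 1000, tz=timezone.utc
    )
    return (
        event["application"]["app_id"],
        event["event_type"],
        timestamp.strftime("%Y-%m-%d %H:%M:%S"),
        attributes.get("campaign_id", ""),
        attributes.get("campaign_activity_id", ""),
        attributes.get("treatment_id", ""),
    )


print(to_row(sample))
```

Everything else in the event stays in the S3 bucket, where it remains available if you later decide to analyze additional fields.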

To set up the event stream

  1. Sign in to the Amazon Pinpoint console at http://console.aws.amazon.com/pinpoint. In the list of projects, choose the project that you want to enable event streaming for.
  2. Under Settings, choose Event stream.
  3. Choose Edit, and then configure the event stream to use Amazon Kinesis Data Firehose. If you don’t already have a Kinesis Data Firehose stream, follow the link to create one in the Kinesis console. Configure the stream to send data to an S3 bucket. For more information about creating streams, see Creating an Amazon Kinesis Data Firehose Delivery Stream.
  4. Under IAM role, choose Automatically create a role. Choose Save.

Step 3: Create the Lambda function

In this section, you create a Lambda function that processes the raw event stream data, and then writes it to a table in your Redshift cluster.
To create the Lambda function:

  1. Download the psycopg2 binary from https://github.com/jkehler/awslambda-psycopg2. This Python library lets you interact with PostgreSQL databases, such as Amazon Redshift. It contains certain libraries that aren’t included in Lambda.
    • Note: This GitHub repository is not an official AWS-managed repository.
  2. Within the awslambda-psycopg2-master folder, you’ll find a folder called psycopg2-37. Rename the folder to psycopg2 (you may need to delete the existing folder with that name), and then compress the entire folder to a .zip file.
  3. Create a new Lambda function from scratch, using the Python 3.7 runtime.
  4. Upload the psycopg2.zip file that you created in step 2 to Lambda.
  5. In Lambda, create a new function called lambda_function.py. Paste the following code into the function:
    import datetime
    import json
    import re
    import uuid
    import os
    import boto3
    import psycopg2
    from psycopg2 import Error
    
    cluster_redshift = "<clustername>"
    dbname_redshift = "<dbname>"
    user_redshift = "<username>"
    password_redshift = "<password>"
    endpoint_redshift = "<endpoint>"
    port_redshift = "5439"
    table_redshift = "pinpoint_events_table"
    
    # Get the file that contains the event data from the appropriate S3 bucket.
    def get_file_from_s3(bucket, key):
        s3 = boto3.client('s3')
        obj = s3.get_object(Bucket=bucket, Key=key)
        text = obj["Body"].read().decode()
    
        return text
    
    # If the object that we retrieve contains newline-delineated JSON, split it into
    # multiple objects.
    def clean_and_split(json_raw):
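        # Mark the boundary between adjacent JSON objects, then split on it.
        # Whitespace inside values is left untouched; json.loads ignores the
        # whitespace around each object.
        json_delimited = re.sub(r'\}\s*\{', '}---X-DELIMITER---{', json_raw.strip())
        data = json_delimited.split("---X-DELIMITER---")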
    
        return data
    
    # Set all of the variables that we'll use to create the new row in Redshift.
    def set_variables(in_json):
    
        for line in in_json:
            content = json.loads(line)
            app_id = content['application']['app_id']
            event_type = content['event_type']
            event_timestamp = datetime.datetime.fromtimestamp(content['event_timestamp'] / 1e3).strftime('%Y-%m-%d %H:%M:%S')
    
            # The campaign attributes are optional; default to an empty string.
            attributes = content.get('attributes') or {}
            campaign_id = attributes.get('campaign_id') or ""
            campaign_activity_id = attributes.get('campaign_activity_id') or ""
            treatment_id = attributes.get('treatment_id') or ""
    
            write_to_redshift(app_id, event_type, event_timestamp, campaign_id, campaign_activity_id, treatment_id)
                
    # Write the event stream data to the Redshift table.
    def write_to_redshift(app_id, event_type, event_timestamp, campaign_id, campaign_activity_id, treatment_id):
        row_id = str(uuid.uuid4())
    
        # Pass the values as query parameters instead of concatenating them, so
        # that data from the event stream can't alter the SQL statement.
        query = ("INSERT INTO " + table_redshift + " (rowid, project_key, event_type, "
                + "event_timestamp, campaign_id, campaign_activity_id, treatment_id) "
                + "VALUES (%s, %s, %s, %s, %s, %s, %s);")
        values = (row_id, app_id, event_type, event_timestamp,
                  campaign_id, campaign_activity_id, treatment_id)
    
        conn = None
        cur = None
        try:
            conn = psycopg2.connect(user = user_redshift,
                                    password = password_redshift,
                                    host = endpoint_redshift,
                                    port = port_redshift,
                                    database = dbname_redshift)
    
            cur = conn.cursor()
            cur.execute(query, values)
            conn.commit()
            print("Updated table.")
    
        except (Exception, psycopg2.DatabaseError) as error:
            print("Database error: ", error)
        finally:
            # conn and cur stay None if the connection attempt itself failed.
            if cur:
                cur.close()
            if conn:
                conn.close()
                print("Connection closed.")
    
    # Handle the event notification that we receive when a new item is sent to the 
    # S3 bucket.
    def lambda_handler(event,context):
        print("Received event: \n" + str(event))
    
        bucket = event['Records'][0]['s3']['bucket']['name']
        key = event['Records'][0]['s3']['object']['key']
        data = get_file_from_s3(bucket, key)
    
        in_json = clean_and_split(data)
    
        set_variables(in_json)

    In the preceding code, make the following changes:

    • Replace <clustername> with the name of the cluster.
    • Replace <dbname> with the name of the database.
    • Replace <username> with the user name that you specified when you created the Redshift cluster.
    • Replace <password> with the password that you specified when you created the Redshift cluster.
    • Replace <endpoint> with the endpoint address of the Redshift cluster.
  6. In IAM, update the execution role that’s associated with the Lambda function to include the GetObject permission for the S3 bucket that contains the event data. For more information, see Editing IAM Policies in the AWS IAM User Guide.

Step 4: Set up notifications on the S3 bucket

Now that we’ve created the Lambda function, we’ll set up a notification on the S3 bucket. In this case, the notification will refer to the Lambda function that we created in the previous section. Every time a new file is added to the bucket, the notification will cause the Lambda function to run.

To create the event notification

  1. In S3, create a new bucket notification. The notification should be triggered when PUT events occur, and should trigger the Lambda function that you created in the previous section. For more information about creating notifications, see Configuring Amazon S3 Event Notifications in the Amazon S3 Developer Guide.
  2. Test the event notification by sending a test campaign. If you send an email campaign, your Redshift database should contain events such as _campaign.send, _email.send, _email.delivered, and others. You can check the contents of the Redshift table by running the following query in the Query Editor in the Redshift console:
    select * from pinpoint_events_table;
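
If you prefer to script the notification from step 1 instead of using the console, it could look roughly like the following. This is a minimal sketch rather than part of the original walkthrough: the helper names are hypothetical, the bucket name and function ARN are placeholders that you supply, and it assumes that Amazon S3 already has permission to invoke the Lambda function.

```python
def build_notification_config(lambda_function_arn):
    """Build a notification configuration that runs a Lambda function on PUT events."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_function_arn,
                # Fire only when an object is created with a PUT request.
                "Events": ["s3:ObjectCreated:Put"],
            }
        ]
    }


def apply_notification(bucket_name, lambda_function_arn):
    """Attach the notification to the bucket (requires AWS credentials)."""
    import boto3  # imported here so that the builder above works without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_notification_configuration(
        Bucket=bucket_name,
        NotificationConfiguration=build_notification_config(lambda_function_arn),
    )
```

Note that this call replaces any notification configuration that already exists on the bucket, so merge in existing entries first if the bucket has other notifications.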

Step 5: Add the data set in Amazon QuickSight

If your Lambda function is sending event data to Redshift as expected, you can use your Redshift database to create a new data set in Amazon QuickSight. QuickSight includes an automatic database discovery feature that helps you add your Redshift database as a data set with only a few clicks. For more information, see Creating a Data Set from a Database in the Amazon QuickSight User Guide.

Step 6: Create your visualizations

Now that QuickSight is retrieving information from your Redshift database, you can use that data to create visualizations. To learn more about creating visualizations in QuickSight, see Creating an Analysis in the Amazon QuickSight User Guide.

This brings us to the end of our series. While these posts focused on using Amazon QuickSight to visualize your analytics data, you can also use these same techniques to create visualizations using third-party applications. We hope you enjoyed this series, and we can’t wait to see what you build using these examples!

Amazon Aurora PostgreSQL Serverless – Now Generally Available

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/amazon-aurora-postgresql-serverless-now-generally-available/

The database is usually the most critical part of a software architecture, and managing databases, especially relational ones, has never been easy. For this reason, we created Amazon Aurora Serverless, an auto-scaling version of Amazon Aurora that automatically starts up, shuts down, and scales up or down based on your application workload.

The MySQL-compatible edition of Aurora Serverless has been available for some time now. I am pleased to announce that the PostgreSQL-compatible edition of Aurora Serverless is generally available today.

Before moving on to the details, I’d like to take this opportunity to congratulate the Amazon Aurora development team, which has just won the 2019 Association for Computing Machinery’s (ACM) Special Interest Group on Management of Data (SIGMOD) Systems Award!

When you create a database with Aurora Serverless, you set the minimum and maximum capacity. Your client applications transparently connect to a proxy fleet that routes the workload to a pool of resources that are automatically scaled. Scaling is very fast because resources are “warm” and ready to be added to serve your requests.

 

Aurora Serverless does not change how Aurora manages storage. The storage layer is independent of the compute resources used by the database, so there is no need to provision storage in advance. The minimum storage is 10 GB and, based on database usage, Aurora storage automatically grows in 10 GB increments, up to 64 TB, with no impact on database performance.

Creating an Aurora Serverless PostgreSQL Database
Let’s start an Aurora Serverless PostgreSQL database and see the automatic scalability at work. From the Amazon RDS console, I choose to create a database using Amazon Aurora as the engine. Currently, Aurora Serverless is compatible with PostgreSQL version 10.5. When I select that version, the serverless option becomes available.

I give the new DB cluster an identifier, choose my master username, and let Amazon RDS generate a password for me. I will be able to retrieve my credentials during database creation.

I can now select the minimum and maximum capacity for my database, in terms of Aurora Capacity Units (ACUs), and in the additional scaling configuration I choose to pause compute capacity after 5 minutes of inactivity. Based on my settings, Aurora Serverless automatically creates scaling rules for thresholds for CPU utilization, connections, and available memory.
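
The same settings can also be expressed through the API. The following is a rough sketch using boto3, not the console flow described above: the helper names and the capacity values are illustrative assumptions, and unlike the console, the API expects you to supply the master password yourself.

```python
def build_serverless_cluster_params(cluster_id, username, password,
                                    min_acu=2, max_acu=8, pause_after_seconds=300):
    """Return keyword arguments for the RDS create_db_cluster call."""
    if min_acu > max_acu:
        raise ValueError("minimum capacity cannot exceed maximum capacity")
    return {
        "DBClusterIdentifier": cluster_id,
        "Engine": "aurora-postgresql",
        "EngineMode": "serverless",
        "MasterUsername": username,
        "MasterUserPassword": password,
        "ScalingConfiguration": {
            "MinCapacity": min_acu,   # in Aurora Capacity Units (ACUs)
            "MaxCapacity": max_acu,
            "AutoPause": True,
            # 300 seconds corresponds to the 5 minutes of inactivity chosen above.
            "SecondsUntilAutoPause": pause_after_seconds,
        },
    }


def create_serverless_cluster(cluster_id, username, password, **scaling):
    """Create the cluster (requires AWS credentials)."""
    import boto3  # imported here so that the builder above works without boto3

    rds = boto3.client("rds")
    return rds.create_db_cluster(
        **build_serverless_cluster_params(cluster_id, username, password, **scaling)
    )
```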

Testing Some Load on the Database
To generate some load on the database I am using sysbench on an EC2 instance. There are a couple of Lua scripts bundled with sysbench that can help generate an online transaction processing (OLTP) workload:

  • The first script, parallel_prepare.lua, generates 100,000 rows per table for 24 tables.
  • The second script, oltp.lua, generates a workload against that data using 64 worker threads.

By using those scripts, I start generating load on my database cluster. As you can see from this graph, taken from the RDS console monitoring tab, the serverless database capacity grows and shrinks to follow my requirements. The metric shown on this graph is the number of ACUs used by the database cluster. First it scales up to accommodate the sysbench workload. When I stop the load generator, it scales down and then pauses.

Available Now
Aurora Serverless PostgreSQL is available now in US East (N. Virginia), US East (Ohio), US West (Oregon), EU (Ireland), and Asia Pacific (Tokyo). With Aurora Serverless, you pay on a per-second basis for the database capacity you use when the database is active, plus the usual Aurora storage costs.

For more information on Amazon Aurora, I recommend this great post explaining why and how it was created:

Amazon Aurora ascendant: How we designed a cloud-native relational database

It’s never been so easy to use a relational database in production. I am so excited to see what you are going to use it for!

How to build databases using Python and text files | Hello World #9

Post Syndicated from Mac Bowley original https://www.raspberrypi.org/blog/how-to-build-databases-using-python-and-text-files-hello-world-9/

In Hello World issue 9, Raspberry Pi’s own Mac Bowley shares a lesson that introduces students to databases using Python and text files.

In this lesson, students create a library app for their books. This will store information about their book collection and allow them to display, manipulate, and search their collection. You will show students how to use text files in their programs that act as a database.

The project will give your students practical examples of database terminology and hands-on experience working with persistent data. It gives opportunities for students to define and gain concrete experience with key database concepts using a language they are familiar with. The script that accompanies this activity can be adapted to suit your students’ experience and competency.

This ready-to-go software project can be used alongside approaches such as PRIMM or pair programming, or as a worked example to engage your students in programming with persistent data.

What makes a database?

Start by asking the students why we need databases and what they are: do they ever feel unorganised? Life can get complicated, and with so much to keep track of, the raw data required can be overwhelming. How can we use computing to solve this problem? If only there were a way of organising and accessing data that would let us get it out of our heads. Databases are a way of organising the data we care about, so that we can easily access it and use it to make our lives easier.

Then explain that in this lesson the students will create a database, using Python and a text file. The example I show students is a personal library app that keeps track of which books I own and where I keep them. I have also run this lesson and allowed the students to pick their own items to keep track of; it just involves a little more planning time at the end. Split the class up into pairs; have each pair discuss and select five pieces of data about a book (or their own item) they would like to track in a database. They should also consider which type of data each piece is. Give them five minutes to discuss and select some data to track.

Databases are organised collections of data, and this allows them to be displayed, maintained, and searched easily. Our database will have one table, effectively just like a spreadsheet table. The headings on each of the columns are the fields: the individual pieces of data we want to store about the books in our collection. The pieces of information about a single book are called its attributes and are stored together in one record, which would be a single row in our database table. To make it easier to search and sort our database, we should also select a primary key: one field that will be unique for each book. Sometimes one of the fields we are already storing works for this purpose; if not, then the database will create an ID number that it uses to uniquely identify each record.

Create a library application

Pull the class back together and ask a few groups about the data they selected to track. Make sure they have chosen appropriate data types. Ask some if they can find any of the fields that would be a primary key; the answer will most likely be no. The ISBN could work, but for our simple application, having to type in a 10- or 13-digit number just to use for an ID would be overkill. In our database, we are going to generate our own IDs.

The requirements for our database are that it can do the following things: save data to a file, read data from that file, create new books, display our full database, allow the user to enter a search term, and display a list of relevant results based on that term. We can decompose the problem into the following steps:

  • Set up our structures
  • Create a record
  • Save the data to the database file
  • Read from the database file
  • Display the database to the user
  • Allow the user to search the database
  • Display the results

Have the class log in and power up Python. If they are doing this locally, have them create a new folder to hold this project. We will be interacting with external files and so having them in the same folder avoids confusion with file locations and paths. They should then load up a new Python file. To start, download the starter file from the link provided. Each student should make a copy of this file. At first, I have them examine the code, and then get them to run it. Using concepts from PRIMM, I get them to print certain messages when a menu option is selected. This can be a great exemplar for making a menu in any application they are developing. This will be the skeleton of our database app: giving them a starter file can help ease some cognitive load from students.

Have them examine the variables and make guesses about what they are used for.

  • current_ID – a variable to count up as we create records, this will be our primary key
  • new_additions – a list to hold any new records we make while our code is running, before we save them to the file
  • filename – the name of the database file we will be using
  • fields – a list of our fields, so that our dictionaries can be aligned with our text file
  • data – a list that will hold all of the data from the database, so that we can search and display it without having to read the file every time

Create the first record

We are going to use dictionaries to store our records. They reference their elements using keys instead of indices, which fits our database fields nicely. We are going to generate our own IDs. Each of these must be unique, so a variable is needed that we can add to as we make our records. This is a user-focused application, so let’s make it so our user can input the data for the first book. The strings, in quotes, on the left of the colon, are the keys (the names of our fields) and the data on the right is the stored value, in our case whatever the user inputs in response to our appropriate prompts. We finish this part by adding the record to the file, incrementing the current ID, and then displaying a useful feedback message to the user to say their record has been created successfully. Your students should now save their code and run it to make sure there aren’t any syntax errors.

You could make use of pair programming, with carefully selected pairs taking it in turns in the driver and navigator roles. You could also offer differing levels of scaffolding: providing some of the code and asking them to modify it based on given requirements.

How to use the code in your class

To complete the project, your students can add functionality to save their data to a CSV file, read from a database file, and allow users to search the database. The code for the whole project is available at helloworld.cc/database.
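
The remaining steps could be sketched along the following lines. This is one possible approach under stated assumptions, not the project code from the link above: the function names and field names are hypothetical, and the search simply checks whether the term appears in any field.

```python
import csv
import os

fields = ["ID", "title", "author", "genre", "location"]  # hypothetical field names


def save_records(filename, records):
    """Append records to the database file, adding a header row if the file is new."""
    is_new = not os.path.exists(filename) or os.path.getsize(filename) == 0
    with open(filename, "a", newline="") as db_file:
        writer = csv.DictWriter(db_file, fieldnames=fields)
        if is_new:
            writer.writeheader()
        writer.writerows(records)


def read_records(filename):
    """Read every record into a list of dictionaries (all values come back as strings)."""
    with open(filename, newline="") as db_file:
        return list(csv.DictReader(db_file))


def search_records(records, term):
    """Return the records in which any field contains the search term, ignoring case."""
    term = term.lower()
    return [record for record in records
            if any(term in str(value).lower() for value in record.values())]
```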

An example of the code

You may want to give your students the entire piece of code. They can investigate and modify it to their own purpose. You can also lead them through it, having them follow you as you demonstrate how an expert constructs a piece of software. I have done both to great effect. Let me know how your classes get on! Get in touch at [email protected]

Hello World issue 9

The brand-new issue of Hello World is out today, and available right now as a free PDF download from the Hello World website.



UK-based educators can also sign up to receive Hello World as a printed magazine for free, direct to their door. And those outside the UK, educator or not, can subscribe to receive new digital issues of Hello World in their inbox on the day of release.

The post How to build databases using Python and text files | Hello World #9 appeared first on Raspberry Pi.

How to securely provide database credentials to Lambda functions by using AWS Secrets Manager

Post Syndicated from Ramesh Adabala original https://aws.amazon.com/blogs/security/how-to-securely-provide-database-credentials-to-lambda-functions-by-using-aws-secrets-manager/

As a solutions architect at AWS, I often assist customers in architecting and deploying business applications using APIs and microservices that rely on serverless services such as AWS Lambda and database services such as Amazon Relational Database Service (Amazon RDS). Customers can take advantage of these fully managed AWS services to unburden their teams from infrastructure operations and other undifferentiated heavy lifting, such as patching, software maintenance, and capacity planning.

In this blog post, I’ll show you how to use AWS Secrets Manager to secure your database credentials and send them to Lambda functions that will use them to connect to and query the backend database service, Amazon RDS, without hardcoding the secrets in code or passing them through environment variables. This approach will help you secure last-mile secrets and protect your backend databases. Long-lived credentials need to be managed and regularly rotated to keep access to critical systems secure, so it’s a security best practice to periodically reset your passwords. Manually changing the passwords would be cumbersome, but AWS Secrets Manager helps by managing and rotating the RDS database passwords.

Solution overview

This is sample code: you’ll use an AWS CloudFormation template to deploy the following components to test the API endpoint from your browser:

  • An RDS MySQL database instance on a db.t2.micro instance
  • Two Lambda functions with necessary IAM roles and IAM policies, including access to AWS Secrets Manager:
    • LambdaRDSCFNInit: This Lambda function will execute immediately after the CloudFormation stack creation. It will create an “Employees” table in the database, where it will insert three sample records.
    • LambdaRDSTest: This function will query the Employees table and return the record count in an HTML string format
  • RESTful API with “GET” method on AWS API Gateway

Here’s the high level setup of the AWS services that will be created from the CloudFormation stack deployment:
 

Figure 1: Solution architecture


  1. Clients call the RESTful API hosted on AWS API Gateway
  2. The API Gateway executes the Lambda function
  3. The Lambda function retrieves the database secrets using the Secrets Manager API
  4. The Lambda function connects to the RDS database using database secrets from Secrets Manager and returns the query results
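
Steps 3 and 4 can be sketched in a few lines. This is a simplified illustration, not the sample's actual Lambda code: the function name is hypothetical, and it assumes that the secret is stored as a JSON string with "username" and "password" keys.

```python
import json


def get_db_credentials(secrets_client, secret_id):
    """Fetch a secret and return the user name and password stored in it.

    secrets_client is expected to behave like boto3.client('secretsmanager').
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    # The secret is a JSON document, so parse it before reading the fields.
    secret = json.loads(response["SecretString"])
    return secret["username"], secret["password"]
```

Inside a Lambda function, you might call it as get_db_credentials(boto3.client("secretsmanager"), os.environ["SECRET_NAME"]) and pass the result to your database driver, so that the password never appears in the code or in the environment variables.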

You can access the source code for the sample used in this post here: https://github.com/awslabs/automating-governance-sample/tree/master/AWS-SecretsManager-Lambda-RDS-blog.

Deploying the sample solution

Set up the sample deployment by selecting the Launch Stack button below. If you haven’t logged into your AWS account, follow the prompts to log in.

By default, the stack will be deployed in the us-east-1 region. If you want to deploy this stack in any other region, download the code from the above GitHub link, place the Lambda code zip file in a region-specific S3 bucket and make the necessary changes in the CloudFormation template to point to the right S3 bucket. (Please refer to the AWS CloudFormation User Guide for additional details on how to create stacks using the AWS CloudFormation console.)
 

Next, follow these steps to execute the stack:

  1. Leave the default location for the template and select Next.
     
    Figure 2: Keep the default location for the template


  2. On the Specify Details page, you’ll see the parameters pre-populated. These parameters include the name of the database and the database user name. Select Next on this screen.
     
    Figure 3: Parameters on the "Specify Details" page


  3. On the Options screen, select the Next button.
  4. On the Review screen, select both check boxes, then select the Create Change Set button:
     
    Figure 4: Select the check boxes and "Create Change Set"


  5. After the change set creation is completed, choose the Execute button to launch the stack.
  6. Stack creation will take between 10 and 15 minutes. After the stack is created successfully, select the Outputs tab of the stack, then select the link.
     

    Figure 5: Select the link on the “Outputs” tab

    This action will trigger the code in the Lambda function, which will query the “Employees” table in the MySQL database and will return the results count back to the API. You’ll see the following screen as output from the RESTful API endpoint:
     


    Figure 6: Output from the RESTful API endpoint

At this point, you’ve successfully deployed and tested the API endpoint with a backend Lambda function and RDS resources. The Lambda function is able to successfully query the MySQL RDS database and is able to return the results through the API endpoint.

What’s happening in the background?

The CloudFormation stack deployed a MySQL RDS database with a randomly generated password using a secret resource. Once the secret resource with the randomly generated password has been created, the CloudFormation stack uses a dynamic reference to resolve the value of the password from Secrets Manager in order to create the RDS instance resource. Dynamic references provide a compact, powerful way for you to specify external values that are stored and managed in other AWS services, such as Secrets Manager. The dynamic reference guarantees that CloudFormation will not log or persist the resolved value, keeping the database password safe. The CloudFormation template also creates a Lambda function to do automatic rotation of the password for the MySQL RDS database every 30 days. Native credential rotation can improve security posture, as it eliminates the need to manually handle database passwords through the lifecycle process.

Below is the CloudFormation code that covers these details:


#This is a Secret resource with a randomly generated password in its SecretString JSON.
MyRDSInstanceRotationSecret:
    Type: AWS::SecretsManager::Secret
    Properties:
        Description: 'This is my rds instance secret'
        GenerateSecretString:
            SecretStringTemplate: !Sub '{"username": "${RDSUserName}"}'
            GenerateStringKey: 'password'
            PasswordLength: 16
            ExcludeCharacters: '"@/\'
        Tags:
            -
                Key: AppName
                Value: MyApp

#This is a RDS instance resource. Its master username and password use dynamic references to resolve values from
#SecretsManager. The dynamic reference guarantees that CloudFormation will not log or persist the resolved value
#We use a ref to the Secret resource logical id in order to construct the dynamic reference, since the Secret name is being
#generated by CloudFormation
MyDBInstance2:
    Type: AWS::RDS::DBInstance
    Properties:
        AllocatedStorage: 20
        DBInstanceClass: db.t2.micro
        DBName: !Ref RDSDBName
        Engine: mysql
        MasterUsername: !Ref RDSUserName
        MasterUserPassword: !Join ['', ['{{resolve:secretsmanager:', !Ref MyRDSInstanceRotationSecret, ':SecretString:password}}' ]]
        MultiAZ: False
        PubliclyAccessible: False
        StorageType: gp2
        DBSubnetGroupName: !Ref myDBSubnetGroup
        VPCSecurityGroups:
            - !Ref RDSSecurityGroup
        BackupRetentionPeriod: 0
        DBInstanceIdentifier: 'rotation-instance'

#This is a SecretTargetAttachment resource which updates the referenced Secret resource with properties about
#the referenced RDS instance
SecretRDSInstanceAttachment:
    Type: AWS::SecretsManager::SecretTargetAttachment
    Properties:
        SecretId: !Ref MyRDSInstanceRotationSecret
        TargetId: !Ref MyDBInstance2
        TargetType: AWS::RDS::DBInstance
#This is a RotationSchedule resource. It configures rotation of password for the referenced secret using a rotation lambda
#The first rotation happens at resource creation time, with subsequent rotations scheduled according to the rotation rules
#We explicitly depend on the SecretTargetAttachment resource being created to ensure that the secret contains all the
#information necessary for rotation to succeed
MySecretRotationSchedule:
    Type: AWS::SecretsManager::RotationSchedule
    DependsOn: SecretRDSInstanceAttachment
    Properties:
        SecretId: !Ref MyRDSInstanceRotationSecret
        RotationLambdaARN: !GetAtt MyRotationLambda.Arn
        RotationRules:
            AutomaticallyAfterDays: 30

#This is a lambda Function resource. We will use this lambda to rotate secrets
#For details about rotation lambdas, see https://docs.aws.amazon.com/secretsmanager/latest/userguide/rotating-secrets.html
#The below example assumes that the lambda code has been uploaded to a S3 bucket, and that it will rotate a mysql database password
MyRotationLambda:
    Type: AWS::Serverless::Function
    Properties:
        Runtime: python2.7
        Role: !GetAtt MyLambdaExecutionRole.Arn
        Handler: mysql_secret_rotation.lambda_handler
        Description: 'This is a lambda to rotate MySql user passwd'
        FunctionName: 'cfn-rotation-lambda'
        CodeUri: 's3://devsecopsblog/code.zip'
        Environment:
            Variables:
                SECRETS_MANAGER_ENDPOINT: !Sub 'https://secretsmanager.${AWS::Region}.amazonaws.com'

Verifying the solution

To be certain that everything is set up properly, you can look at the Lambda code that’s querying the database table by following the below steps:

  1. Go to the AWS Lambda service page
  2. From the list of Lambda functions, click on the function with the name scm2-LambdaRDSTest-…
  3. You can see the environment variables at the bottom of the Lambda Configuration details screen. Notice that there should be no database password supplied as part of these environment variables:
     
    Figure 7: Environment variables


    
        import base64
        import sys
        import pymysql
        import boto3
        import botocore
        import json
        import random
        import time
        import os
        from botocore.exceptions import ClientError
        
        # rds settings
        rds_host = os.environ['RDS_HOST']
        name = os.environ['RDS_USERNAME']
        db_name = os.environ['RDS_DB_NAME']
        helperFunctionARN = os.environ['HELPER_FUNCTION_ARN']
        
        secret_name = os.environ['SECRET_NAME']
        my_session = boto3.session.Session()
        region_name = my_session.region_name
        conn = None
        
        # Get the service resource.
        lambdaClient = boto3.client('lambda')
        
        
        def invokeConnCountManager(incrementCounter):
            # return True
            response = lambdaClient.invoke(
                FunctionName=helperFunctionARN,
                InvocationType='RequestResponse',
                Payload='{"incrementCounter":' + str.lower(str(incrementCounter)) + ',"RDBMSName": "Prod_MySQL"}'
            )
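            # The payload is a stream of bytes; decode it so that callers can compare it with a string.
            retVal = response['Payload']
            retVal1 = retVal.read().decode('utf-8')
            return retVal1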
        
        
        def openConnection():
            print("In Open connection")
            global conn
            password = "None"
            # Create a Secrets Manager client
            session = boto3.session.Session()
            client = session.client(
                service_name='secretsmanager',
                region_name=region_name
            )
            
            # In this sample we only handle the specific exceptions for the 'GetSecretValue' API.
            # See https://docs.aws.amazon.com/secretsmanager/latest/apireference/API_GetSecretValue.html
            # We rethrow the exception by default.
            
            try:
                get_secret_value_response = client.get_secret_value(
                    SecretId=secret_name
                )
                # Avoid printing the response: it contains the secret in plaintext and
                # would be written to the function's CloudWatch logs.
            except ClientError as e:
                print(e)
                if e.response['Error']['Code'] == 'DecryptionFailureException':
                    # Secrets Manager can't decrypt the protected secret text using the provided KMS key.
                    # Deal with the exception here, and/or rethrow at your discretion.
                    raise e
                elif e.response['Error']['Code'] == 'InternalServiceErrorException':
                    # An error occurred on the server side.
                    # Deal with the exception here, and/or rethrow at your discretion.
                    raise e
                elif e.response['Error']['Code'] == 'InvalidParameterException':
                    # You provided an invalid value for a parameter.
                    # Deal with the exception here, and/or rethrow at your discretion.
                    raise e
                elif e.response['Error']['Code'] == 'InvalidRequestException':
                    # You provided a parameter value that is not valid for the current state of the resource.
                    # Deal with the exception here, and/or rethrow at your discretion.
                    raise e
                elif e.response['Error']['Code'] == 'ResourceNotFoundException':
                    # We can't find the resource that you asked for.
                    # Deal with the exception here, and/or rethrow at your discretion.
                    raise e
                else:
                    # Any other error: rethrow so that we never continue without a password.
                    raise e
            else:
                # Decrypts secret using the associated KMS CMK.
                # Depending on whether the secret is a string or binary, one of these fields will be populated.
                if 'SecretString' in get_secret_value_response:
                    secret = get_secret_value_response['SecretString']
                    j = json.loads(secret)
                    password = j['password']
                else:
                    # Binary secret: this sample assumes that it holds the same JSON document as SecretString.
                    decoded_binary_secret = base64.b64decode(get_secret_value_response['SecretBinary'])
                    password = json.loads(decoded_binary_secret)['password']
            
            try:
                # Open a new connection only if none exists or the cached one has been closed.
                if conn is None or not conn.open:
                    conn = pymysql.connect(
                        host=rds_host, user=name, passwd=password, db=db_name, connect_timeout=5)
        
            except Exception as e:
                print (e)
                print("ERROR: Unexpected error: Could not connect to MySql instance.")
                raise e
        
        
        def lambda_handler(event, context):
            if invokeConnCountManager(True) == "false":
                print ("Not enough Connections available.")
                return False
        
            item_count = 0
            try:
                openConnection()
                # Introducing artificial random delay to mimic actual DB query time. Remove this code for actual use.
                time.sleep(random.randint(1, 3))
                with conn.cursor() as cur:
                    cur.execute("select * from Employees")
                    for row in cur:
                        item_count += 1
                        print(row)
                        # print(row)
            except Exception as e:
                # Error while opening connection or processing
                print(e)
            finally:
                print("Closing Connection")
                if(conn is not None and conn.open):
                    conn.close()
                invokeConnCountManager(False)
        
            content =  "Selected %d items from RDS MySQL table" % (item_count)
            response = {
                "statusCode": 200,
                "body": content,
                "headers": {
                    'Content-Type': 'text/html',
                }
            }
            return response        
        

In the AWS Secrets Manager console, you can also look at the new secret that the CloudFormation stack created by following these steps:

  1. Go to the AWS Secrets Manager service page with appropriate IAM permissions
  2. From the list of secrets, click on the latest secret with the name MyRDSInstanceRotationSecret-…
  3. You will see the secret details and rotation information on the screen, as shown in the following screenshot:
     
    Figure 8: Secret details and rotation information
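
The same rotation details can also be read programmatically. The following is a minimal sketch, assuming a boto3 Secrets Manager client is passed in; the helper name `summarize_rotation` and the shape of the returned dictionary are illustrative and not part of the original walkthrough.

```python
def summarize_rotation(client, secret_id):
    """Return a small dict of rotation details for one secret.

    `client` is expected to behave like boto3.client("secretsmanager").
    """
    response = client.describe_secret(SecretId=secret_id)
    # RotationRules is absent when rotation has never been configured.
    rules = response.get("RotationRules", {})
    return {
        "name": response.get("Name"),
        "rotation_enabled": response.get("RotationEnabled", False),
        "rotate_every_days": rules.get("AutomaticallyAfterDays"),
        "last_rotated": response.get("LastRotatedDate"),
    }


# Usage (requires boto3 and AWS credentials; pass your own secret name):
#   import boto3
#   sm = boto3.client("secretsmanager")
#   print(summarize_rotation(sm, "<your-secret-name>"))
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before pointing it at a real account.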

Conclusion

In this post, I showed you how to manage database secrets using AWS Secrets Manager and how to leverage Secrets Manager’s API to retrieve the secrets into a Lambda execution environment to improve database security and protect sensitive data. Secrets Manager helps you protect access to your applications, services, and IT resources without the upfront investment and ongoing maintenance costs of operating your own secrets management infrastructure. To get started, visit the Secrets Manager console. To learn more, visit Secrets Manager documentation.

If you have feedback about this post, add it to the Comments section below. If you have questions about implementing the example used in this post, open a thread on the Secrets Manager Forum.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Ramesh Adabala

Ramesh is a Solution Architect on the Southeast Enterprise Solution Architecture team at AWS.

Learn about AWS Services & Solutions – April AWS Online Tech Talks

Post Syndicated from Robin Park original https://aws.amazon.com/blogs/aws/learn-about-aws-services-solutions-april-aws-online-tech-talks/

AWS Tech Talks

Join us this April to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register Now!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

Blockchain

May 2, 2019 | 11:00 AM – 12:00 PM PT – How to Build an Application with Amazon Managed Blockchain – Learn how to build an application on Amazon Managed Blockchain with the help of demo applications and sample code.

Compute

April 29, 2019 | 1:00 PM – 2:00 PM PT – How to Optimize Amazon Elastic Block Store (EBS) for Higher Performance – Learn how to optimize performance and spend on your Amazon Elastic Block Store (EBS) volumes.

May 1, 2019 | 11:00 AM – 12:00 PM PT – Introducing New Amazon EC2 Instances Featuring AMD EPYC and AWS Graviton Processors – See how new Amazon EC2 instance offerings that feature AMD EPYC processors and AWS Graviton processors enable you to optimize performance and cost for your workloads.

Containers

April 23, 2019 | 11:00 AM – 12:00 PM PT – Deep Dive on AWS App Mesh – Learn how AWS App Mesh makes it easy to monitor and control communications for services running on AWS.

March 22, 2019 | 9:00 AM – 10:00 AM PT – Deep Dive Into Container Networking – Dive deep into microservices networking and how you can build, secure, and manage the communications into, out of, and between the various microservices that make up your application.

Databases

April 23, 2019 | 1:00 PM – 2:00 PM PT – Selecting the Right Database for Your Application – Learn how to develop a purpose-built strategy for databases, where you choose the right tool for the job.

April 25, 2019 | 9:00 AM – 10:00 AM PT – Mastering Amazon DynamoDB ACID Transactions: When and How to Use the New Transactional APIs – Learn how Amazon DynamoDB’s new transactional APIs simplify the developer experience of making coordinated, all-or-nothing changes to multiple items both within and across tables.

DevOps

April 24, 2019 | 9:00 AM – 10:00 AM PT – Running .NET applications with AWS Elastic Beanstalk Windows Server Platform V2 – Learn about the easiest way to get your .NET applications up and running on AWS Elastic Beanstalk.

Enterprise & Hybrid

April 30, 2019 | 11:00 AM – 12:00 PM PT – Business Case Teardown: Identify Your Real-World On-Premises and Projected AWS Costs – Discover tools and strategies to help you as you build your value-based business case.

IoT

April 30, 2019 | 9:00 AM – 10:00 AM PT – Building the Edge of Connected Home – Learn how AWS IoT edge services are enabling smarter products for the connected home.

Machine Learning

April 24, 2019 | 11:00 AM – 12:00 PM PT – Start Your Engines and Get Ready to Race in the AWS DeepRacer League – Learn more about reinforcement learning, how to build a model, and compete in the AWS DeepRacer League.

April 30, 2019 | 1:00 PM – 2:00 PM PT – Deploying Machine Learning Models in Production – Learn best practices for training and deploying machine learning models.

May 2, 2019 | 9:00 AM – 10:00 AM PT – Accelerate Machine Learning Projects with Hundreds of Algorithms and Models in AWS Marketplace – Learn how to use third-party algorithms and model packages to accelerate machine learning projects and solve business problems.

Networking & Content Delivery

April 23, 2019 | 9:00 AM – 10:00 AM PT – Smart Tips on Application Load Balancers: Advanced Request Routing, Lambda as a Target, and User Authentication – Learn tips and tricks about important Application Load Balancer (ALB) features that were recently launched.

Productivity & Business Solutions

April 29, 2019 | 11:00 AM – 12:00 PM PT – Learn How to Set up Business Calling and Voice Connector in Minutes with Amazon Chime – Learn how Amazon Chime Business Calling and Voice Connector can help you with your business communication needs.

May 1, 2019 | 1:00 PM – 2:00 PM PT – Bring Voice to Your Workplace – Learn how you can bring voice to your workplace with Alexa for Business.

Serverless

April 25, 2019 | 11:00 AM – 12:00 PM PT – Modernizing .NET Applications Using the Latest Features on AWS Development Tools for .NET – Get a deep dive and demonstration of the latest updates to the AWS SDK and tools for .NET to make development even easier, more powerful, and more productive.

May 1, 2019 | 9:00 AM – 10:00 AM PT – Customer Showcase: Improving Data Processing Workloads with AWS Step Functions’ Service Integrations – Learn how innovative customers like SkyWatch are coordinating AWS services using AWS Step Functions to improve productivity.

Storage

April 24, 2019 | 1:00 PM – 2:00 PM PT – Amazon S3 Glacier Deep Archive: The Cheapest Storage in the Cloud – See how Amazon S3 Glacier Deep Archive offers the lowest cost storage in the cloud, at prices significantly lower than storing and maintaining data in on-premises magnetic tape libraries or archiving data offsite.

How to rotate Amazon DocumentDB and Amazon Redshift credentials in AWS Secrets Manager

Post Syndicated from Apurv Awasthi original https://aws.amazon.com/blogs/security/how-to-rotate-amazon-documentdb-and-amazon-redshift-credentials-in-aws-secrets-manager/

Using temporary credentials is an AWS Identity and Access Management (IAM) best practice. Even Dilbert is learning to set up temporary credentials. Today, AWS Secrets Manager made it easier to follow this best practice by launching support for rotating credentials for Amazon DocumentDB and Amazon Redshift automatically. Now, with a few clicks, you can configure Secrets Manager to rotate these credentials automatically, turning a typical, long-term credential into a temporary credential.

In this post, I summarize the key features of AWS Secrets Manager. Then, I show you how to store a database credential for an Amazon DocumentDB cluster and how your applications can access this secret. Finally, I show you how to configure AWS Secrets Manager to rotate this secret automatically.

Key features of Secrets Manager

Secrets Manager’s key features include the ability to:

  • Rotate secrets safely. You can configure Secrets Manager to rotate secrets automatically without disrupting your applications, turning long-term secrets into temporary secrets. Secrets Manager natively supports rotating secrets for all Amazon database services—Amazon RDS, Amazon DocumentDB, and Amazon Redshift—that require a user name and password. You can extend Secrets Manager to meet your custom rotation requirements by creating an AWS Lambda function to rotate other types of secrets.
  • Manage access with fine-grained policies. You can store all your secrets centrally and control access to these securely using fine-grained AWS Identity and Access Management (IAM) policies and resource-based policies. You can also tag secrets to help you discover, organize, and control access to secrets used throughout your organization.
  • Audit and monitor secrets centrally. Secrets Manager integrates with AWS logging and monitoring services to enable you to meet your security and compliance requirements. For example, you can audit AWS CloudTrail logs to see when Secrets Manager rotated a secret or configure Amazon CloudWatch Events to alert you when an administrator deletes a secret.
  • Pay as you go. Pay for the secrets you store in Secrets Manager and for the use of these secrets; there are no long-term contracts or licensing fees.
  • Compliance. You can use AWS Secrets Manager to manage secrets for workloads that are subject to U.S. Health Insurance Portability and Accountability Act (HIPAA), Payment Card Industry Data Security Standard (PCI-DSS), and ISO/IEC 27001, ISO/IEC 27017, ISO/IEC 27018, or ISO 9001.
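
To make the monitoring point more concrete, here is a rough sketch of a CloudWatch Events rule that matches DeleteSecret calls recorded by CloudTrail. It assumes CloudTrail is enabled in the account and that a boto3 `events` client is passed in; the function names and the default rule name are hypothetical, and a notification target (for example, an SNS topic) would still need to be attached separately.

```python
import json


def delete_secret_event_pattern():
    """Event pattern matching CloudTrail-recorded DeleteSecret API calls."""
    return {
        "source": ["aws.secretsmanager"],
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {
            "eventSource": ["secretsmanager.amazonaws.com"],
            "eventName": ["DeleteSecret"],
        },
    }


def create_delete_alert_rule(events_client, rule_name="alert-on-secret-deletion"):
    """Create (or update) a rule with the pattern above and return its ARN.

    `events_client` is expected to behave like boto3.client("events").
    The rule name is a hypothetical example.
    """
    response = events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(delete_secret_event_pattern()),
        State="ENABLED",
        Description="Notify when a Secrets Manager secret is deleted",
    )
    return response["RuleArn"]
```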

Phase 1: Store a secret in Secrets Manager

Now that you’re familiar with the key features, I’ll show you how to store the credential for a DocumentDB cluster. To demonstrate how to retrieve and use the secret, I use a Python application running on Amazon EC2 that requires this database credential to access the DocumentDB cluster. Finally, I show how to configure Secrets Manager to rotate this database credential automatically.

  1. In the Secrets Manager console, select Store a new secret.
     
    Figure 1: Select “Store a new secret”

  2. Next, select Credentials for DocumentDB database. For this example, I store the credentials for the database masteruser. I start by securing the masteruser because it’s the most powerful database credential and has full access over the database.
     
    Figure 2: Select “Credentials for DocumentDB database”

    Note: To follow along, you need the AWSSecretsManagerReadWriteAccess managed policy because this policy grants permissions to store secrets in Secrets Manager. Read the AWS Secrets Manager Documentation for more information about the minimum IAM permissions required to store a secret.

  3. By default, Secrets Manager creates a unique encryption key for each AWS region and AWS account where you use Secrets Manager. I chose to encrypt this secret with the default encryption key.
     
    Figure 3: Select the default or your CMK

  4. Next, view the list of DocumentDB clusters in your account and select the database this credential accesses. For this example, I select the DB instance documentdb-instance, and then select Next.
     
    Figure 4: Select the instance you created

  5. In this step, specify values for Secret Name and Description. Based on where you will use this secret, give it a hierarchical name, such as Applications/MyApp/Documentdb-instance, and then select Next.
     
    Figure 5: Provide a name and description

  6. For the next step, I chose to keep the Disable automatic rotation default setting because in my example my application that uses the secret is running on Amazon EC2. I’ll enable rotation after I’ve updated my application (see Phase 2 below) to use Secrets Manager APIs to retrieve secrets. Select Next.
     
    Figure 6: Choose to either enable or disable automatic rotation

    Note: If you’re storing a secret that you’re not using in your application, select Enable automatic rotation. See AWS Secrets Manager getting started guide on rotation for details.

  7. Review the information on the next screen and, if everything looks correct, select Store. You’ve now successfully stored a secret in Secrets Manager.
  8. Next, select See sample code in Python.
     
    Figure 7: Select the “See sample code” button

  9. Finally, take note of the code samples provided. You will use this code to update your application to retrieve the secret using Secrets Manager APIs.
     
    Figure 8: Copy the code sample for use in your application
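
The console steps above can also be approximated with the CreateSecret API. The following is a minimal sketch, assuming a boto3 Secrets Manager client is passed in; the helper names are illustrative, and the JSON key names (`engine`, `host`, `port`, `username`, `password`) are an assumption modeled on the secret that the console generates, so check them against your own stored secret.

```python
import json


def build_documentdb_secret(host, username, password, port=27017):
    """Return the JSON string to store as the secret value.

    The key names are an assumption modeled on the console-generated secret.
    """
    return json.dumps({
        "engine": "mongo",
        "host": host,
        "port": port,
        "username": username,
        "password": password,
    })


def store_secret(client, name, description, secret_string):
    """Store the secret and return its ARN.

    `client` is expected to behave like boto3.client("secretsmanager").
    No KMS key is passed, so the default encryption key is used.
    """
    response = client.create_secret(
        Name=name,
        Description=description,
        SecretString=secret_string,
    )
    return response["ARN"]


# Usage (requires boto3 and AWS credentials; values are placeholders):
#   import boto3
#   sm = boto3.client("secretsmanager")
#   value = build_documentdb_secret("<cluster-endpoint>", "masteruser", "<password>")
#   store_secret(sm, "Applications/MyApp/Documentdb-instance", "DocumentDB credential", value)
```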

Phase 2: Update an application to retrieve a secret from Secrets Manager

Now that you’ve stored the secret in Secrets Manager, you can update your application to retrieve the database credential from Secrets Manager instead of hard-coding this information in a configuration file or source code. For this example, I show how to configure a Python application to retrieve this secret from Secrets Manager.

  1. I connect to my Amazon EC2 instance via Secure Shell (SSH).

  2. Previously, I configured my application to retrieve the database user name and password from the configuration file. Below is the source code for my application.
    
        import DocumentDB  # placeholder name for your database driver module
        import config
        
        def no_secrets_manager_sample():
        
            # Get the user name, password, and database connection information from a config file.
            database = config.database
            user_name = config.user_name
            password = config.password
        
            # Use the user name, password, and database connection information to connect to the database
            db = DocumentDB.connect(database.endpoint, user_name, password, database.db_name, database.port)

  3. I use the sample code from Phase 1 above and update my application to retrieve the user name and password from Secrets Manager. This code sets up the client, then retrieves and decrypts the secret Applications/MyApp/Documentdb-instance. I’ve added comments to the code to make the code easier to understand.
    
        # Use this code snippet in your app.
        # If you need more information about configurations or implementing the sample code, visit the AWS docs:   
        # https://aws.amazon.com/developers/getting-started/python/
        
        import boto3
        import base64
        from botocore.exceptions import ClientError
        
        
        def get_secret():
        
            secret_name = "Applications/MyApp/Documentdb-instance"
            region_name = "us-west-2"
        
            # Create a Secrets Manager client
            session = boto3.session.Session()
            client = session.client(
                service_name='secretsmanager',
                region_name=region_name
            )
        
            # In this sample we only handle the specific exceptions for the 'GetSecretValue' API.
            # See https://docs.aws.amazon.com/secretsmanager/latest/apireference/API_GetSecretValue.html
            # We rethrow the exception by default.
        
            try:
                get_secret_value_response = client.get_secret_value(
                    SecretId=secret_name
                )
            except ClientError as e:
                if e.response['Error']['Code'] == 'DecryptionFailureException':
                    # Secrets Manager can't decrypt the protected secret text using the provided KMS key.
                    # Deal with the exception here, and/or rethrow at your discretion.
                    raise e
                elif e.response['Error']['Code'] == 'InternalServiceErrorException':
                    # An error occurred on the server side.
                    # Deal with the exception here, and/or rethrow at your discretion.
                    raise e
                elif e.response['Error']['Code'] == 'InvalidParameterException':
                    # You provided an invalid value for a parameter.
                    # Deal with the exception here, and/or rethrow at your discretion.
                    raise e
                elif e.response['Error']['Code'] == 'InvalidRequestException':
                    # You provided a parameter value that is not valid for the current state of the resource.
                    # Deal with the exception here, and/or rethrow at your discretion.
                    raise e
                elif e.response['Error']['Code'] == 'ResourceNotFoundException':
                    # We can't find the resource that you asked for.
                    # Deal with the exception here, and/or rethrow at your discretion.
                    raise e
                else:
                    # Rethrow any other error so that it is not silently swallowed.
                    raise e
            else:
                # Decrypts secret using the associated KMS CMK.
                # Depending on whether the secret is a string or binary, one of these fields will be populated.
                if 'SecretString' in get_secret_value_response:
                    secret = get_secret_value_response['SecretString']
                else:
                    decoded_binary_secret = base64.b64decode(get_secret_value_response['SecretBinary'])
                    
            # Your code goes here.                          
        

  4. Applications require permissions to access Secrets Manager. My application runs on Amazon EC2 and uses an IAM role to obtain access to AWS services. I will attach the following policy to my IAM role. This policy uses the GetSecretValue action to grant my application permissions to read a secret from Secrets Manager. This policy also uses the resource element to limit my application to read only the Applications/MyApp/Documentdb-instance secret from Secrets Manager. You can visit the AWS Secrets Manager documentation to understand the minimum IAM permissions required to retrieve a secret.
    
        {
            "Version": "2012-10-17",
            "Statement": {
                "Sid": "RetrieveDbCredentialFromSecretsManager",
                "Effect": "Allow",
                "Action": "secretsmanager:GetSecretValue",
                "Resource": "arn:aws:secretsmanager:::secret:Applications/MyApp/Documentdb-instance"
            }
        }                   
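
One way to fill in the “Your code goes here” step of the sample in step 3 is to parse the retrieved secret string and build a connection URI from it. This is only a sketch: the helper name is hypothetical, the key names (`username`, `password`, `host`, `port`) are an assumption about how the secret was stored, and any TLS options your cluster requires are left out.

```python
import json
from urllib.parse import quote_plus


def documentdb_uri_from_secret(secret_string):
    """Build a MongoDB-style connection URI from a JSON secret string.

    The user name and password are percent-encoded so that characters
    such as '@' or '/' in a generated password do not break the URI.
    """
    secret = json.loads(secret_string)
    user = quote_plus(secret["username"])
    password = quote_plus(secret["password"])
    host = secret["host"]
    port = secret.get("port", 27017)
    return f"mongodb://{user}:{password}@{host}:{port}/"
```

Because the application reads the credential each time it builds the URI, it picks up a rotated password on its next connection attempt instead of depending on a value baked into a configuration file.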
        

Phase 3: Enable rotation for your secret

Rotating secrets regularly is a security best practice. Secrets Manager makes it easier to follow this security best practice by offering built-in integrations and supporting extensibility with Lambda. When you enable rotation, Secrets Manager creates a Lambda function and attaches an IAM role to this function to execute rotations on a schedule you define.

Note: Configuring rotation is a privileged action that requires several IAM permissions, and you should only grant this access to trusted individuals. To grant these permissions, you can use the AWS IAMFullAccess managed policy.

Now, I show you how to configure Secrets Manager to rotate the secret Applications/MyApp/Documentdb-instance automatically.

  1. From the Secrets Manager console, I go to the list of secrets and choose the secret I created in Phase 1, Applications/MyApp/Documentdb-instance.
     
    Figure 9: Choose the secret from Phase 1

  2. Scroll to Rotation configuration, and then select Edit rotation.
     
    Figure 10: Select the Edit rotation configuration

  3. To enable rotation, select Enable automatic rotation, and then choose how frequently Secrets Manager rotates this secret. For this example, I set the rotation interval to 30 days. Then, choose Create a new Lambda function to perform rotation and give the function an easy-to-remember name. For this example, I choose the name RotationFunctionforDocumentDB.
     
    Figure 11: Choose to enable automatic rotation, select a rotation interval, create a new Lambda function, and give it a name

  4. Next, Secrets Manager requires permissions to rotate this secret on your behalf. Because I’m storing the masteruser database credential, Secrets Manager can use this credential to perform rotations. Therefore, I select Use this secret, and then select Save.
     
    Figure 12: Select credentials for Secrets Manager to use

  5. The banner on the next screen confirms that I successfully configured rotation and the first rotation is in progress, which enables you to verify that rotation is functioning as expected. Secrets Manager will rotate this credential automatically every 30 days.
     
    Figure 13: The banner at the top of the screen will show the status of the rotation
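
The rotation settings chosen above can also be applied through the RotateSecret API. The sketch below assumes a rotation Lambda function already exists (the console creates one for you, whereas this API call expects the ARN of an existing function) and that a boto3 Secrets Manager client is passed in; `enable_rotation` and its parameter names are illustrative.

```python
def enable_rotation(client, secret_id, rotation_lambda_arn, days=30):
    """Turn on automatic rotation for a secret and return the secret's ARN.

    `client` is expected to behave like boto3.client("secretsmanager").
    `days` mirrors the 30-day interval used in this walkthrough.
    """
    response = client.rotate_secret(
        SecretId=secret_id,
        RotationLambdaARN=rotation_lambda_arn,
        RotationRules={"AutomaticallyAfterDays": days},
    )
    return response["ARN"]


# Usage (requires boto3 and AWS credentials; the Lambda ARN is a placeholder):
#   import boto3
#   sm = boto3.client("secretsmanager")
#   enable_rotation(sm, "Applications/MyApp/Documentdb-instance", "<rotation-lambda-arn>")
```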

Summary

I explained the key benefits of AWS Secrets Manager and showed how you can use temporary credentials to access your Amazon DocumentDB clusters and Amazon Redshift instances securely. You can follow similar steps to rotate credentials for Amazon Redshift.

Secrets Manager helps you protect access to your applications, services, and IT resources without the upfront investment and ongoing maintenance costs of operating your own secrets management infrastructure. To get started, visit the Secrets Manager console. To learn more, read the Secrets Manager documentation. If you have comments about this post, submit them in the Comments section below. If you have questions about anything in this post, start a new thread on the Secrets Manager forum.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Apurv Awasthi

Apurv is the product manager for credentials management services at AWS, including AWS Secrets Manager and IAM Roles. He enjoys the “Day 1” culture at Amazon because it aligns with his experience building startups in the sports and recruiting industries. Outside of work, Apurv enjoys hiking. He holds an MBA from UCLA and an MS in computer science from University of Kentucky.

Learn about AWS Services & Solutions – February 2019 AWS Online Tech Talks

Post Syndicated from Robin Park original https://aws.amazon.com/blogs/aws/learn-about-aws-services-solutions-february-2019-aws-online-tech-talks/

AWS Tech Talks

Join us this February to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register Now!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

Application Integration

February 20, 2019 | 11:00 AM – 12:00 PM PT – Customer Showcase: Migration & Messaging for Mission Critical Apps with S&P Global Ratings – Learn how S&P Global Ratings meets the high availability and fault tolerance requirements of their mission critical applications using Amazon MQ.

AR/VR

February 28, 2019 | 1:00 PM – 2:00 PM PT – Build AR/VR Apps with AWS: Creating a Multiplayer Game with Amazon Sumerian – Learn how to build real-world augmented reality, virtual reality and 3D applications with Amazon Sumerian.

Blockchain

February 18, 2019 | 11:00 AM – 12:00 PM PT – Deep Dive on Amazon Managed Blockchain – Explore the components of blockchain technology, discuss use cases, and do a deep dive into capabilities, performance, and key innovations in Amazon Managed Blockchain.

Compute

February 25, 2019 | 9:00 AM – 10:00 AM PT – What’s New in Amazon EC2 – Learn about the latest innovations in Amazon EC2, including new instance types, related technologies, and consumption options that help you optimize running your workloads for performance and cost.

February 27, 2019 | 1:00 PM – 2:00 PM PT – Deploy and Scale Your First Cloud Application with Amazon Lightsail – Learn how to quickly deploy and scale your first multi-tier cloud application using Amazon Lightsail.

Containers

February 19, 2019 | 9:00 AM – 10:00 AM PT – Securing Container Workloads on AWS Fargate – Explore the security controls and best practices for securing containers running on AWS Fargate.

Data Lakes & Analytics

February 18, 2019 | 1:00 PM – 2:00 PM PT – Amazon Redshift Tips & Tricks: Scaling Storage and Compute Resources – Learn about the tools and best practices Amazon Redshift customers can use to scale storage and compute resources on-demand and automatically to handle growing data volume and analytical demand.

Databases

February 18, 2019 | 9:00 AM – 10:00 AM PT – Building Real-Time Applications with Redis – Learn about Amazon’s fully managed Redis service and how it makes it easier, simpler, and faster to build real-time applications.

February 21, 2019 | 1:00 PM – 2:00 PM PT – Introduction to Amazon DocumentDB (with MongoDB Compatibility) – Get an introduction to Amazon DocumentDB (with MongoDB compatibility), a fast, scalable, and highly available document database that makes it easy to run, manage & scale MongoDB-workloads.

DevOps

February 20, 2019 | 1:00 PM – 2:00 PM PT – Fireside Chat: DevOps at Amazon with Ken Exner, GM of AWS Developer Tools – Join our fireside chat with Ken Exner, GM of Developer Tools, to learn about Amazon’s DevOps transformation journey and latest practices and tools that support the current DevOps model.

End-User Computing

February 28, 2019 | 9:00 AM – 10:00 AM PT – Enable Your Remote and Mobile Workforce with Amazon WorkLink – Learn about Amazon WorkLink, a new, fully-managed service that provides your employees secure, one-click access to internal corporate websites and web apps using their mobile phones.

Enterprise & Hybrid

February 26, 2019 | 1:00 PM – 2:00 PM PT – The Amazon S3 Storage Classes – For cloud ops professionals, by cloud ops professionals. Wallace and Orion will tackle your toughest AWS hybrid cloud operations questions in this live Office Hours tech talk.

IoT

February 26, 2019 | 9:00 AM – 10:00 AM PT – Bring IoT and AI Together – Learn how to bring intelligence to your devices with the intersection of IoT and AI.

Machine Learning

February 19, 2019 | 1:00 PM – 2:00 PM PT – Getting Started with AWS DeepRacer – Learn the basics of reinforcement learning, what’s under the hood, how to get hands-on with AWS DeepRacer, and how to participate in the AWS DeepRacer League.

February 20, 2019 | 9:00 AM – 10:00 AM PT – Build and Train Reinforcement Models with Amazon SageMaker RL – Learn how to use Amazon SageMaker RL to apply reinforcement learning and build intelligent applications for your business.

February 21, 2019 | 11:00 AM – 12:00 PM PT – Train ML Models Once, Run Anywhere in the Cloud & at the Edge with Amazon SageMaker Neo – Learn about Amazon SageMaker Neo where you can train ML models once and run them anywhere in the cloud and at the edge.

February 28, 2019 | 11:00 AM – 12:00 PM PT – Build your Machine Learning Datasets with Amazon SageMaker Ground Truth – Learn how customers are using Amazon SageMaker Ground Truth to build highly accurate training datasets for machine learning quickly and reduce data labeling costs by up to 70%.

Migration

February 27, 2019 | 11:00 AM – 12:00 PM PT – Maximize the Benefits of Migrating to the Cloud – Learn how to group and rationalize applications and plan migration waves in order to realize the full set of benefits that cloud migration offers.

Networking

February 27, 2019 | 9:00 AM – 10:00 AM PT – Simplifying DNS for Hybrid Cloud with Route 53 Resolver – Learn how to enable DNS resolution in hybrid cloud environments using Amazon Route 53 Resolver.

Productivity & Business Solutions

February 26, 2019 | 11:00 AM – 12:00 PM PT – Transform the Modern Contact Center Using Machine Learning and Analytics – Learn how to integrate Amazon Connect and AWS machine learning services, such as Amazon Lex, Amazon Transcribe, and Amazon Comprehend, to quickly process and analyze thousands of customer conversations and gain valuable insights.

Serverless

February 19, 2019 | 11:00 AM – 12:00 PM PT – Best Practices for Serverless Queue Processing – Learn the best practices of serverless queue processing, using Amazon SQS as an event source for AWS Lambda.

Storage

February 25, 2019 | 11:00 AM – 12:00 PM PT – Introducing AWS Backup: Automate and Centralize Data Protection in the AWS Cloud – Learn about this new, fully managed backup service that makes it easy to centralize and automate the backup of data across AWS services in the cloud as well as on-premises.

Learn about New AWS re:Invent Launches – December AWS Online Tech Talks

Post Syndicated from Robin Park original https://aws.amazon.com/blogs/aws/learn-about-new-aws-reinvent-launches-december-aws-online-tech-talks/

AWS Tech Talks

Join us in the next couple weeks to learn about some of the new service and feature launches from re:Invent 2018. Learn about features and benefits, watch live demos and ask questions! We’ll have AWS experts online to answer any questions you may have. Register today!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

Compute

December 19, 2018 | 01:00 PM – 02:00 PM PT – Developing Deep Learning Models for Computer Vision with Amazon EC2 P3 Instances – Learn about the different steps required to build, train, and deploy a machine learning model for computer vision.

Containers

December 11, 2018 | 01:00 PM – 02:00 PM PT – Introduction to AWS App Mesh – Learn about using AWS App Mesh to monitor and control microservices on AWS.

Data Lakes & Analytics

December 10, 2018 | 11:00 AM – 12:00 PM PT – Introduction to AWS Lake Formation – Build a Secure Data Lake in Days – AWS Lake Formation (coming soon) will make it easy to set up a secure data lake in days. With AWS Lake Formation, you will be able to ingest, catalog, clean, transform, and secure your data, and make it available for analysis and machine learning.

December 12, 2018 | 11:00 AM – 12:00 PM PT – Introduction to Amazon Managed Streaming for Kafka (MSK) – Learn about features and benefits, use cases and how to get started with Amazon MSK.

Databases

December 10, 2018 | 01:00 PM – 02:00 PM PT – Introduction to Amazon RDS on VMware – Learn how Amazon RDS on VMware can be used to automate on-premises database administration, enable hybrid cloud backups and read scaling for on-premises databases, and simplify database migration to AWS.

December 13, 2018 | 09:00 AM – 10:00 AM PT – Serverless Databases with Amazon Aurora and Amazon DynamoDB – Learn about the new serverless features and benefits in Amazon Aurora and DynamoDB, use cases and how to get started.

Enterprise & Hybrid

December 19, 2018 | 11:00 AM – 12:00 PM PT – How to Use “Minimum Viable Refactoring” to Achieve Post-Migration Operational Excellence – Learn how to improve the security and compliance of your applications in two weeks with “minimum viable refactoring”.

IoT

December 17, 2018 | 11:00 AM – 12:00 PM PT – Introduction to New AWS IoT Services – Dive deep into the AWS IoT service announcements from re:Invent 2018, including AWS IoT Things Graph, AWS IoT Events, and AWS IoT SiteWise.

Machine Learning

December 10, 2018 | 09:00 AM – 10:00 AM PT – Introducing Amazon SageMaker Ground Truth – Learn how to build highly accurate training datasets with machine learning and reduce data labeling costs by up to 70%.

December 11, 2018 | 09:00 AM – 10:00 AM PT – Introduction to AWS DeepRacer – AWS DeepRacer is the fastest way to get rolling with machine learning, literally. Get hands-on with a fully autonomous 1/18th scale race car driven by reinforcement learning, 3D racing simulator, and a global racing league.

December 12, 2018 | 01:00 PM – 02:00 PM PT | Introduction to Amazon Forecast and Amazon Personalize – Learn about Amazon Forecast and Amazon Personalize – what are the key features and benefits of these managed ML services, common use cases and how you can get started.

December 13, 2018 | 01:00 PM – 02:00 PM PT | Introduction to Amazon Textract: Now in Preview – Learn how Amazon Textract, now in preview, enables companies to easily extract text and data from virtually any document.

Networking

December 17, 2018 | 01:00 PM – 02:00 PM PT | Introduction to AWS Transit Gateway – Learn how AWS Transit Gateway significantly simplifies management and reduces operational costs with a hub and spoke architecture.

Robotics

December 18, 2018 | 11:00 AM – 12:00 PM PT | Introduction to AWS RoboMaker, a New Cloud Robotics Service – Learn about AWS RoboMaker, a service that makes it easy to develop, test, and deploy intelligent robotics applications at scale.

Security, Identity & Compliance

December 17, 2018 | 09:00 AM – 10:00 AM PT | Introduction to AWS Security Hub – Learn about AWS Security Hub, and how it gives you a comprehensive view of high-priority security alerts and your compliance status across AWS accounts.

Serverless

December 11, 2018 | 11:00 AM – 12:00 PM PT | What’s New with Serverless at AWS – In this tech talk, we’ll catch you up on our ever-growing collection of natively supported languages, console updates, and re:Invent launches.

December 13, 2018 | 11:00 AM – 12:00 PM PT | Building Real Time Applications using WebSocket APIs Supported by Amazon API Gateway – Learn how to build, deploy and manage APIs with API Gateway.

Storage

December 12, 2018 | 09:00 AM – 10:00 AM PT | Introduction to Amazon FSx for Windows File Server – Learn about Amazon FSx for Windows File Server, a new fully managed native Windows file system that makes it easy to move Windows-based applications that require file storage to AWS.

December 14, 2018 | 01:00 PM – 02:00 PM PT | What’s New with AWS Storage – A Recap of re:Invent 2018 Announcements – Learn about the key AWS storage announcements that occurred prior to and at re:Invent 2018. With 15+ new service, feature, and device launches in object, file, block, and data transfer storage services, you will be able to start designing the foundation of your cloud IT environment for any application and easily migrate data to AWS.

December 18, 2018 | 09:00 AM – 10:00 AM PT | Introduction to Amazon FSx for Lustre – Learn about Amazon FSx for Lustre, a fully managed file system for compute-intensive workloads. Process files from S3 or data stores, with throughput up to hundreds of GBps and sub-millisecond latencies.

December 18, 2018 | 01:00 PM – 02:00 PM PT | Introduction to New AWS Services for Data Transfer – Learn about new AWS data transfer services, and which might best fit your requirements for data migration or ongoing hybrid workloads.

Amazon DynamoDB On-Demand – No Capacity Planning and Pay-Per-Request Pricing

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/amazon-dynamodb-on-demand-no-capacity-planning-and-pay-per-request-pricing/

Just a few years ago, creating a database that could support your business at any scale while providing consistent low latency was a daunting task. That changed for me in 2012 while reading Werner Vogels’ blog post announcing Amazon DynamoDB (it was a few months before I joined AWS). DynamoDB was built on the principles in the original Dynamo paper that Amazon published in 2007. Over the years, lots of new features have been introduced to further simplify how AWS customers use databases. You can now create fully managed, multi-region, multi-master database tables with features such as encryption at rest, point-in-time recovery, in-memory caching, and a 99.99% uptime service level agreement (SLA).

Amazon DynamoDB on-demand

Today we are introducing Amazon DynamoDB on-demand, a flexible new billing option for DynamoDB capable of serving thousands of requests per second without capacity planning. DynamoDB on-demand offers simple pay-per-request pricing for read and write requests so that you only pay for what you use, making it easy to balance costs and performance. For tables using on-demand mode, DynamoDB instantly accommodates customers’ workloads as they ramp up or down to any previously observed traffic level. If the level of traffic hits a new peak, DynamoDB adapts rapidly to accommodate the workload.

In the DynamoDB console, you can choose the on-demand read/write capacity mode when creating a new table, or change it later in the Capacity tab.

Tables using on-demand mode support all DynamoDB features (such as encryption at rest, point-in-time recovery, global tables, and so on) with the exception of auto scaling, which is not applicable with this mode.

Indexes created on a table using on-demand mode inherit the same scalability and billing model. You don’t need to specify throughput capacity settings for indexes, and you pay by their use. If you don’t have read/write traffic to a table using on-demand mode and its indexes, you only pay for the data storage.

DynamoDB on-demand is useful if your application traffic is difficult to predict and control, your workload has large spikes of short duration, or if your average table utilization is well below the peak. For example:

  • New applications, or applications whose database workload is complex to forecast
  • Developers working on serverless stacks with pay-per-use pricing
  • SaaS providers and independent software vendors (ISVs) who want the simplicity and resource isolation of deploying a table per subscriber

You can change a table from provisioned capacity to on-demand once per day. You can go from on-demand capacity to provisioned as often as you want.

A quick performance test

Let’s test some load on a newly created DynamoDB table using on-demand mode!

I created two serverless applications:

  • The first application creates a REST API on top of a DynamoDB table using an AWS Lambda function and Amazon API Gateway. Using this API, you can read, add, update, and delete items in the table using HTTP methods such as GET, POST, PUT, and DELETE.
  • The second application starts 1,000 Lambda functions in parallel to generate load on the API endpoint, using random HTTP methods and random data for the items.

Each load-generating function runs 100 concurrent requests and, when they have all completed, starts another 100, and so on, for one minute. There is no ramp-up period. Load generation starts immediately at full speed!

As you can see in the metrics tab for this table in the DynamoDB console, I reached a peak of almost 5,000 requests per second very quickly and without any throttling.

The scaling of the serverless stack, from API Gateway to the Lambda function and the DynamoDB table, was fully managed. I didn’t have to plan for the right throughput, and I could focus on the application logic I was building.

With DynamoDB on-demand you pay only for what you use. For example, in the US East (N. Virginia) region, you are charged $1.25 per million write request units and $0.25 per million read request units, plus the usual data storage costs.
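
As a rough illustration of the arithmetic, the following sketch estimates request charges from the US East (N. Virginia) prices quoted above. The function name and the unit counts are hypothetical, storage costs are left out, and prices vary by region and over time.

```javascript
// Prices quoted in this post for US East (N. Virginia), in USD per million units.
const PRICE_PER_MILLION_WRITE_UNITS = 1.25;
const PRICE_PER_MILLION_READ_UNITS = 0.25;

// Hypothetical helper: estimates on-demand request charges (storage not included).
function estimateOnDemandRequestCost(writeRequestUnits, readRequestUnits) {
    const writeCost = (writeRequestUnits / 1e6) * PRICE_PER_MILLION_WRITE_UNITS;
    const readCost = (readRequestUnits / 1e6) * PRICE_PER_MILLION_READ_UNITS;
    return writeCost + readCost;
}

// Example: 2 million write request units and 4 million read request units.
console.log(estimateOnDemandRequestCost(2e6, 4e6)); // 3.5
```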

You can use the AWS Command Line Interface (CLI), AWS SDKs, and AWS CloudFormation to create a table using on-demand mode or to change the read/write capacity mode of an existing table.
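
With the AWS SDK for JavaScript, selecting on-demand mode comes down to the BillingMode parameter of CreateTable and UpdateTable. The sketch below only builds the request parameters; the table name and key schema are hypothetical, and the commented-out calls assume a configured AWS.DynamoDB client named dynamoDb.

```javascript
// Builds CreateTable parameters for a table in on-demand mode.
// No ProvisionedThroughput is specified when BillingMode is PAY_PER_REQUEST.
function onDemandTableParams(tableName) {
    return {
        TableName: tableName,
        AttributeDefinitions: [{ AttributeName: 'id', AttributeType: 'S' }],
        KeySchema: [{ AttributeName: 'id', KeyType: 'HASH' }],
        BillingMode: 'PAY_PER_REQUEST'
    };
}

// Builds UpdateTable parameters that switch an existing table to on-demand mode.
function switchToOnDemandParams(tableName) {
    return { TableName: tableName, BillingMode: 'PAY_PER_REQUEST' };
}

const createParams = onDemandTableParams('demo-table'); // hypothetical table name
console.log(JSON.stringify(createParams, null, 2));

// With a configured client, the calls would look like this:
// await dynamoDb.createTable(createParams).promise();
// await dynamoDb.updateTable(switchToOnDemandParams('demo-table')).promise();
```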

Available now

DynamoDB on-demand is available globally in all commercial regions.

I am really excited by the new possibilities for developers, ISVs and SaaS providers, and I look forward to seeing what you build with pay-per-request billing.

New – Amazon DynamoDB Transactions

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-amazon-dynamodb-transactions/

Over the years, customers have used Amazon DynamoDB for lots of different use cases, from building microservices and mobile backends to implementing gaming and Internet of Things (IoT) solutions. For example, Capital One uses DynamoDB to reduce the latency of their mobile applications by moving their mainframe transactions to a serverless architecture. Tinder migrated user data to DynamoDB with zero downtime, to get the scalability they need to support their global user base.

Developers sometimes need to implement business logic that requires multiple, all-or-nothing operations across one or more tables. This requirement can add unnecessary complexity to their implementation. Today, we are making these use cases easier to build on DynamoDB with native support for transactions!

Introducing Amazon DynamoDB Transactions

DynamoDB transactions provide developers atomicity, consistency, isolation, and durability (ACID) across one or more tables within a single AWS account and region. You can use transactions when building applications that require coordinated inserts, deletes, or updates to multiple items as part of a single logical business operation. DynamoDB is the only non-relational database that supports transactions across multiple partitions and tables.

Transactions bring the scale, performance, and enterprise benefits of DynamoDB to a broader set of workloads. Many use cases are easier and faster to implement using transactions, for example:

  • Processing financial transactions
  • Fulfilling and managing orders
  • Building multiplayer game engines
  • Coordinating actions across distributed components and services

Two new DynamoDB operations have been introduced for handling transactions:

  • TransactWriteItems, a batch operation that contains a write set, with one or more PutItem, UpdateItem, and DeleteItem operations. TransactWriteItems can optionally check for prerequisite conditions that must be satisfied before making updates. These conditions may involve the same or different items than those in the write set. If any condition is not met, the transaction is rejected.
  • TransactGetItems, a batch operation that contains a read set, with one or more GetItem operations. If a TransactGetItems request is issued on an item that is part of an active write transaction, the read transaction is canceled. To get the previously committed value, you can use a standard read.

Each transaction can include up to 10 unique items or up to 4 MB of data, including conditions.
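
A read transaction is assembled in much the same way as a write transaction. The following sketch builds TransactGetItems parameters and enforces the 10-item limit described above before a request is sent. The helper name, table names, and keys are hypothetical, and the commented-out call assumes a configured AWS.DynamoDB client named dynamoDb.

```javascript
const MAX_ITEMS_PER_TRANSACTION = 10; // limit stated in this post

// Hypothetical helper: turns a list of { table, id } pairs into
// TransactGetItems parameters, one Get per item.
function buildTransactGetParams(reads) {
    if (reads.length === 0 || reads.length > MAX_ITEMS_PER_TRANSACTION) {
        throw new RangeError(
            'A transaction must contain between 1 and ' +
            MAX_ITEMS_PER_TRANSACTION + ' items');
    }
    return {
        TransactItems: reads.map(({ table, id }) => ({
            Get: { TableName: table, Key: { id: { S: id } } }
        }))
    };
}

const readParams = buildTransactGetParams([
    { table: 'orders', id: 'order-1' },
    { table: 'customers', id: 'customer-7' }
]);
console.log(readParams.TransactItems.length); // 2

// With a configured client:
// const result = await dynamoDb.transactGetItems(readParams).promise();
```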

With this new feature, DynamoDB offers multiple read and write options to meet different application requirements, providing huge flexibility to developers implementing complex, data-driven business logic:

  • Three options for reads—eventual consistency, strong consistency, and transactional.
  • Two for writes—standard and transactional.

For example, imagine you are building a game where players can buy items with virtual coins:

  • In the players table, each player has a number of coins and an inventory of purchased items.
  • In the items table, each item has a price and is marked as available (or not) with a Boolean value.

To purchase an item, you can now implement a single atomic transaction:

  1. First, check that the item is available and the player has the necessary coins.
  2. If those conditions are satisfied, the item is marked as not available and owned by the player.
  3. The purchased item is then added to the player inventory list.

In JavaScript, using the AWS SDK for JavaScript in Node.js, you would have code similar to this:

const AWS = require('aws-sdk');
const dynamoDb = new AWS.DynamoDB();

// Marks the item as sold and charges the player in one all-or-nothing transaction.
async function purchaseItem(playerId, itemId, itemPrice) {
    return dynamoDb.transactWriteItems({
        TransactItems: [
            {
                // Succeeds only if the item is still available.
                Update: {
                    TableName: 'items',
                    Key: { id: { S: itemId } },
                    ConditionExpression: 'available = :true',
                    UpdateExpression: 'set available = :false, ' +
                        'ownedBy = :player',
                    ExpressionAttributeValues: {
                        ':true': { BOOL: true },
                        ':false': { BOOL: false },
                        ':player': { S: playerId }
                    }
                }
            },
            {
                // Succeeds only if the player has enough coins.
                Update: {
                    TableName: 'players',
                    Key: { id: { S: playerId } },
                    ConditionExpression: 'coins >= :price',
                    UpdateExpression: 'set coins = coins - :price, ' +
                        'inventory = list_append(inventory, :items)',
                    ExpressionAttributeValues: {
                        ':items': { L: [{ S: itemId }] },
                        ':price': { N: itemPrice.toString() }
                    }
                }
            }
        ]
    }).promise();
}

Using Transactions

Transactions are enabled for all single-region DynamoDB tables and are disabled on global tables by default. You can choose to enable transactions on global tables by request, but replication across regions is asynchronous and eventually consistent. You may observe partially completed transactions during replication to other regions. Additionally, simultaneous writes to the same item in different regions are not guaranteed to be serially isolated.

Items are not locked during a transaction. DynamoDB transactions provide serializable isolation. If an item is modified outside of a transaction while the transaction is in progress, the transaction is canceled and an exception is thrown with details about which item or items caused the exception.

When creating an AWS Identity and Access Management (IAM) policy, there are no new permissions for TransactGetItems and TransactWriteItems. Existing DynamoDB UpdateItem, PutItem, DeleteItem, and GetItem actions also authorize the use of those operations within transactions. For example, if an IAM user has only PutItem permission, they can send a transaction with one or more puts, but if they add a delete to the write set, it will be rejected because they do not have DeleteItem permission.
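
To make the permission rule concrete, here is a small model of it: each operation in a transaction maps to the existing IAM action of the same kind, and the whole transaction is rejected if any action is missing. This is only an illustration of the rule described above, not how IAM evaluates policies, and the function names are hypothetical.

```javascript
// Maps each transactional operation to the existing DynamoDB action that authorizes it.
const ACTION_FOR_OPERATION = {
    Put: 'dynamodb:PutItem',
    Update: 'dynamodb:UpdateItem',
    Delete: 'dynamodb:DeleteItem',
    Get: 'dynamodb:GetItem'
};

// Returns the distinct IAM actions needed for a list of TransactItems entries.
function requiredActions(transactItems) {
    const actions = transactItems.map(
        (entry) => ACTION_FOR_OPERATION[Object.keys(entry)[0]]);
    return [...new Set(actions)];
}

// True only if every required action is in the caller's allowed list.
function isTransactionAuthorized(allowedActions, transactItems) {
    return requiredActions(transactItems)
        .every((action) => allowedActions.includes(action));
}

const putOnly = ['dynamodb:PutItem'];
const twoPuts = [{ Put: {} }, { Put: {} }];
const putAndDelete = [{ Put: {} }, { Delete: {} }];
console.log(isTransactionAuthorized(putOnly, twoPuts));      // true
console.log(isTransactionAuthorized(putOnly, putAndDelete)); // false
```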

For any committed operation that was part of a transaction, DynamoDB Streams adds a new field, transaction-id, as a universally unique identifier (UUID) for the transaction. The in-order and exactly once semantics of DynamoDB Streams guarantee that eventually all updates of a TransactWriteItems request will be propagated through streams in an order that is consistent with the transaction serialization order.

Pricing, Monitoring, and Availability

There is no additional cost to enable transactions for DynamoDB tables. You only pay for the reads or writes that are part of your transaction. DynamoDB performs two underlying reads or writes of every item in the transaction, one to prepare the transaction and one to commit the transaction. The two underlying read/write operations are visible in your CloudWatch metrics. You should plan your costs, capacity, and performance needs assuming each transactional read performs two reads and each transactional write performs two writes.
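
For planning purposes, the doubling rule can be written as a one-line calculation. The sketch assumes each item consumes one unit per underlying read or write (real consumption also depends on item size), and the function name is hypothetical.

```javascript
// Each item in a transaction incurs two underlying operations: prepare and commit.
const OPERATIONS_PER_TRANSACTIONAL_ITEM = 2;

// Hypothetical helper: units consumed by one transaction, assuming
// unitsPerItem units for each underlying operation on an item.
function transactionalUnits(itemCount, unitsPerItem = 1) {
    return itemCount * unitsPerItem * OPERATIONS_PER_TRANSACTIONAL_ITEM;
}

// The two-item purchase transaction from the game example:
console.log(transactionalUnits(2)); // 4
```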

DynamoDB transactions are available globally in all commercial regions.

I am really intrigued by these new capabilities. Please let me know what you are going to use them for!

Use AWS Secrets Manager client-side caching libraries to improve the availability and latency of using your secrets

Post Syndicated from Lanre Ogunmola original https://aws.amazon.com/blogs/security/use-aws-secrets-manager-client-side-caching-libraries-to-improve-the-availability-and-latency-of-using-your-secrets/

At AWS, we offer features that make it easier for you to follow the AWS Identity and Access Management (IAM) best practice of using short-term credentials. For example, you can use an IAM role that rotates and distributes short-term AWS credentials to your applications automatically. Similarly, you can configure AWS Secrets Manager to rotate a database credential daily, turning a typical, long-term credential into a short-term credential that is rotated automatically. Today, AWS Secrets Manager introduced a client-side caching library for Java and a client-side caching library for Java Database Connectivity (JDBC) drivers that make it easier to distribute these credentials to your applications. Client-side caching can help you improve the availability and latency of using your secrets. It can also help you reduce the cost associated with retrieving secrets. In this post, we’ll walk you through the following topics:

  • Benefits of the Secrets Manager client-side caching libraries
  • Overview of the Secrets Manager client-side caching library for JDBC
  • Using the client-side caching library for JDBC to connect your application to a database

Benefits of the Secrets Manager client-side caching libraries

The key benefits of the client-side caching libraries are:

  • Improved availability: You can cache secrets to reduce the impact of network availability issues, such as increased response times and temporary loss of network connectivity.
  • Improved latency: Retrieving secrets from the cache is faster than retrieving secrets by sending API requests to Secrets Manager within a Virtual Private Network (VPN) or over the Internet.
  • Reduced cost: Retrieving secrets from the cache can reduce the number of API requests made to and billed by Secrets Manager.
  • Automatic distribution of secrets: The library updates the cache periodically, ensuring your applications use the most up to date secret value, which you may have configured to rotate regularly.
  • Update your applications to use client-side caching in two steps: Add the library dependency to your application and then provide the identifier of the secret that you want the library to use.

Overview of the Secrets Manager client-side caching library for JDBC

Java applications use JDBC drivers to interact with databases and connection pooling tools, such as c3p0, to manage connections to databases. The client-side caching library for JDBC operates by retrieving secrets from Secrets Manager and providing these to the JDBC driver transparently, eliminating the need to hard-code the database user name and password in the connection pooling tool. To see how the client-side caching library works, review the diagram below.
 

Figure 1: Diagram showing how the client-side caching library works

When an application attempts to connect to a database (step 1), the client-side caching library calls the GetSecretValue command (step 2) to retrieve the secret (step 3) required to establish this connection. Next, the library provides the secret to the JDBC driver transparently to connect the application to the database (steps 4 and 5). The library also caches the secret. If the application attempts to connect to the database again (step 6), the library retrieves the secret from the cache and calls the JDBC driver to connect to the database (steps 7 and 8).

The library refreshes the cache every hour. The library also handles stale credentials in the cache automatically. For example, after a secret is rotated, an application’s attempt to create new connections using the cached credentials will result in authentication failure. When this happens, the library will catch these authentication failures, refresh the cache, and retry the database connection automatically.
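
The refresh-and-retry behavior can be pictured with a small, synchronous model. This is a conceptual sketch in JavaScript, not the library's Java implementation; fetchSecret, connect, the injected clock, and the AUTH_FAILED error code are hypothetical stand-ins for the GetSecretValue call, the JDBC driver, system time, and the driver's authentication error.

```javascript
const ONE_HOUR_MS = 60 * 60 * 1000;

class SecretCache {
    constructor(fetchSecret, now = Date.now, ttlMs = ONE_HOUR_MS) {
        this.fetchSecret = fetchSecret;
        this.now = now;
        this.ttlMs = ttlMs;
        this.secret = null;
        this.fetchedAt = 0;
    }

    // Returns the cached secret, fetching it first if missing or older than the TTL.
    get() {
        if (this.secret === null || this.now() - this.fetchedAt >= this.ttlMs) {
            this.refresh();
        }
        return this.secret;
    }

    refresh() {
        this.secret = this.fetchSecret();
        this.fetchedAt = this.now();
    }
}

// Tries the cached credentials; on an authentication failure, refreshes once and retries.
function connectWithRetry(cache, connect) {
    try {
        return connect(cache.get());
    } catch (err) {
        if (err.code !== 'AUTH_FAILED') throw err; // hypothetical error code
        cache.refresh();
        return connect(cache.get());
    }
}

// Simulate a rotation: the cache holds 'old-password' while the database expects 'new-password'.
let storedPassword = 'old-password';
let fetchCount = 0;
const cache = new SecretCache(() => { fetchCount += 1; return storedPassword; });
cache.get();                      // caches 'old-password'
storedPassword = 'new-password';  // the secret is rotated

const connect = (password) => {
    if (password !== 'new-password') {
        throw Object.assign(new Error('access denied'), { code: 'AUTH_FAILED' });
    }
    return 'connected';
};
console.log(connectWithRetry(cache, connect)); // connected
console.log(fetchCount);                       // 2
```

In this model the application never handles the stale password itself: the failed attempt triggers one refresh, and the retry succeeds with the rotated value.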

Use the client-side caching library for JDBC to connect your application to a database

Now that you’re familiar with the benefits and functions of client-side caching, we’ll show you how to use the client-side caching library for JDBC to connect your application to a database. These instructions assume your application is built in Java 8 or higher, uses the open-source c3p0 JDBC connection pooling library to manage connections between the application and the database, and uses the open-source tool Maven for building and managing the application. To get started, follow these steps.

  1. Navigate to the Secrets Manager console and store the user name and password for a MySQL database user. We’ll use the placeholder, CachingLibraryDemo, to denote this secret and the placeholder ARN-CachingLibraryDemo to denote the ARN of this secret. Remember to replace these with the name and ARN of your secret. Note: For step-by-step instructions on storing a secret, read the post on How to use AWS Secrets Manager to rotate credentials for all Amazon RDS database types.
  2. Next, update your application to consume the client-side caching library jar from the Sonatype Maven repository. To make this change, add the following profile to the ~/.m2/settings.xml file.
    
    <profiles>
      <profile>
        <id>allow-snapshots</id>
        <activation><activeByDefault>true</activeByDefault></activation>
        <repositories>
          <repository>
            <id>snapshots-repo</id>
            <url>https://oss.sonatype.org/content/repositories/snapshots</url>
            <releases><enabled>false</enabled></releases>
            <snapshots><enabled>true</enabled></snapshots>
          </repository>
        </repositories>
      </profile>
    </profiles>
    
    

  3. Update your Maven build file to include the Java cache and JDBC driver dependencies. This ensures your application will include the relevant libraries at run time. To make this change, add the following dependency to the pom.xml file.
    
    <dependency>
      <groupId>com.amazonaws.secretsmanager</groupId>
      <artifactId>aws-secretsmanager-caching-java</artifactId>
      <version>1.0.0</version>
    </dependency>
    <dependency>
      <groupId>com.amazonaws.secretsmanager</groupId>
      <artifactId>aws-secretsmanager-jdbc</artifactId>
      <version>1.0.0</version>
    </dependency>
    
    

  4. For this post, we assume your application uses c3p0 to manage connections to the database. Configuring c3p0 requires providing the database user name and password as parameters. Here’s what the typical c3p0 configuration looks like:
    
    # c3p0.properties
    c3p0.user=sampleusername
    c3p0.password=samplepassword
    c3p0.driverClass=com.mysql.jdbc.Driver
    c3p0.jdbcUrl=jdbc:mysql://my-sample-mysql-instance.rds.amazonaws.com:3306
    
    

    Now, update the c3p0 configuration to retrieve this information from the client-side cache: replace the user name with the ARN of the secret, remove the password, set the driver class to the Secrets Manager MySQL driver, and add the prefix jdbc-secretsmanager to the JDBC URL in place of jdbc. You can provide the name of the secret instead of the ARN.

    
    # c3p0.properties
    c3p0.user=ARN-CachingLibraryDemo
    c3p0.driverClass=com.amazonaws.secretsmanager.sql.AWSSecretsManagerMySQLDriver
    c3p0.jdbcUrl=jdbc-secretsmanager:mysql://my-sample-mysql-instance.rds.amazonaws.com:3306
    
    

Note: In our code snippet, the JDBC URL points to our database. Update the string my-sample-mysql-instance.rds.amazonaws.com:3306 to point to your database.

You’ve successfully updated your application to use the client-side caching library for JDBC.

Summary

In this post, we’ve shown how you can improve availability, reduce latency, and reduce the cost of using your secrets by using the Secrets Manager client-side caching library for JDBC. To get started managing secrets, open the Secrets Manager console. To learn more, read How to Store, Distribute, and Rotate Credentials Securely with Secrets Manager or refer to the Secrets Manager documentation.

If you have comments about this post, submit them in the Comments section below. If you have questions about anything in this post, start a new thread on the Secrets Manager forum or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Author

Lanre Ogunmola

Lanre is a Cloud Support Engineer at AWS. He enjoys the culture at Amazon because it aligns with his dedication to lifelong learning. Outside of work, he loves watching soccer. He holds an MS in Cyber Security from the University of Nebraska, and CISA, CISM, and AWS Security Specialist certifications.

Apurv Awasthi

Apurv is the product manager for credentials management services at AWS, including AWS Secrets Manager and IAM Roles. He enjoys the “Day 1” culture at Amazon because it aligns with his experience building startups in the sports and recruiting industries. Outside of work, Apurv enjoys hiking. He holds an MBA from UCLA and an MS in computer science from University of Kentucky.

How to create and retrieve secrets managed in AWS Secrets Manager using AWS CloudFormation template

Post Syndicated from Apurv Awasthi original https://aws.amazon.com/blogs/security/how-to-create-and-retrieve-secrets-managed-in-aws-secrets-manager-using-aws-cloudformation-template/

AWS Secrets Manager now integrates with AWS CloudFormation so you can create and retrieve secrets securely using CloudFormation. This integration makes it easier to automate provisioning your AWS infrastructure. For example, without any code changes, you can generate unique secrets for your resources with every execution of your CloudFormation template. This also improves the security of your infrastructure by storing secrets securely, encrypting automatically, and enabling rotation more easily.

Secrets Manager helps you protect the secrets needed to access your applications, services, and IT resources. In this post, I show how you can get the benefits of Secrets Manager for resources provisioned through CloudFormation. First, I describe the new Secrets Manager resource types supported in CloudFormation. Next, I show a sample CloudFormation template that launches a MySQL database on Amazon Relational Database Service (RDS). This template uses the new resource types to create, rotate, and retrieve the credentials (user name and password) of the database superuser required to launch the MySQL database.

Why use Secrets Manager with CloudFormation?

CloudFormation helps you model your AWS resources as templates and execute these templates to provision AWS resources at scale. Some AWS resources require secrets as part of the provisioning process. For example, to provision a MySQL database, you must provide the credentials for the database superuser. You can use Secrets Manager, the AWS dedicated secrets management service, to create and manage such secrets.

Secrets Manager makes it easier to rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle. You can now reference Secrets Manager in your CloudFormation templates to create unique secrets with every invocation of your template. By default, Secrets Manager encrypts these secrets with encryption keys that you own and control. Secrets Manager ensures the secret isn’t logged or persisted by CloudFormation by using a dynamic reference to the secret. You can configure Secrets Manager to rotate your secrets automatically without disrupting your applications. Secrets Manager offers built-in integrations for rotating credentials for all Amazon RDS databases and supports extensibility with AWS Lambda so you can meet your custom rotation requirements.
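
A dynamic reference to one key of a JSON secret has the form {{resolve:secretsmanager:secret-id:SecretString:json-key}}. As a small illustration, this JavaScript sketch assembles such a string; the helper name and the secret name are hypothetical.

```javascript
// Hypothetical helper: builds a CloudFormation dynamic reference to one
// JSON key inside the SecretString of a Secrets Manager secret.
function secretsManagerReference(secretId, jsonKey) {
    return '{{resolve:secretsmanager:' + secretId + ':SecretString:' + jsonKey + '}}';
}

console.log(secretsManagerReference('MyDatabaseSecret', 'password'));
// {{resolve:secretsmanager:MyDatabaseSecret:SecretString:password}}
```

CloudFormation resolves the reference when it provisions the resource, so the template itself never contains the secret value.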

New Secrets Manager resource types supported in CloudFormation

  1. AWS::SecretsManager::Secret — Create a secret and store it in Secrets Manager.
  2. AWS::SecretsManager::ResourcePolicy — Create a resource-based policy and attach it to a secret. Resource-based policies enable you to control access to secrets.
  3. AWS::SecretsManager::SecretTargetAttachment — Update a secret with the properties of the database it belongs to. This is required to enable rotation.
  4. AWS::SecretsManager::RotationSchedule — Configure Secrets Manager to rotate the secret automatically by defining the Lambda function and the schedule used for rotation.

How to use Secrets Manager in CloudFormation

Now that you’re familiar with the new Secrets Manager resource types supported in CloudFormation, I’ll show how you can use these in a CloudFormation template. I will use a sample template that creates a MySQL database in Amazon RDS and uses Secrets Manager to create the credentials for the superuser. The template also configures the secret to rotate every 30 days automatically.

  1. Create a stack on the AWS CloudFormation console by copying the following sample template.
    
    ---
    Description: "How to create and retrieve secrets securely using an AWS CloudFormation template"
    Resources:
    
    # Create a secret with the username admin and a randomly generated password in JSON.  
      MyRDSInstanceRotationSecret:
        Type: AWS::SecretsManager::Secret
        Properties:
          Description: 'This is the secret for my RDS instance'
          GenerateSecretString:
            SecretStringTemplate: '{"username": "admin"}'
            GenerateStringKey: 'password'
            PasswordLength: 16
            ExcludeCharacters: '"@/'
    
    
    
    # Create a MySQL database of size t2.micro.
    # The secret (username and password for the superuser) will be dynamically 
    # referenced. This ensures CloudFormation will not log or persist the resolved 
    # value. 
      MyDBInstance:
        Type: AWS::RDS::DBInstance
        Properties:
          AllocatedStorage: 20
          DBInstanceClass: db.t2.micro
          Engine: mysql
          MasterUsername: !Join ['', ['{{resolve:secretsmanager:', !Ref MyRDSInstanceRotationSecret, ':SecretString:username}}' ]]
          MasterUserPassword: !Join ['', ['{{resolve:secretsmanager:', !Ref MyRDSInstanceRotationSecret, ':SecretString:password}}' ]]
          BackupRetentionPeriod: 0
          DBInstanceIdentifier: 'rotation-instance'
    
    
    
    # Update the referenced secret with properties of the RDS database.
    # This is required to enable rotation. To learn more, visit our documentation
    # https://docs.aws.amazon.com/secretsmanager/latest/userguide/rotating-secrets.html
      SecretRDSInstanceAttachment:
        Type: AWS::SecretsManager::SecretTargetAttachment
        Properties:
          SecretId: !Ref MyRDSInstanceRotationSecret
          TargetId: !Ref MyDBInstance
          TargetType: AWS::RDS::DBInstance
    
    
    
    # Schedule rotating the secret every 30 days. 
    # Note, the first rotation is triggered immediately. 
    # This enables you to verify that rotation is configured appropriately.
    # Subsequent rotations are scheduled according to the configured rotation. 
      MySecretRotationSchedule:
        Type: AWS::SecretsManager::RotationSchedule
        DependsOn: SecretRDSInstanceAttachment
        Properties:
          SecretId: !Ref MyRDSInstanceRotationSecret
          RotationLambdaARN: <% replace-with-lambda-arn %>
          RotationRules:
            AutomaticallyAfterDays: 30
     
    

  2. Next, execute the stack.
     
    Figure 1: Execute the stack

  3. After you execute the stack, open the RDS console to verify the database, rotation-instance, has been successfully created.
     
    Figure 2: Verify the database has been created

  4. Open the Secrets Manager console and verify the stack successfully created the secret, MyRDSInstanceRotationSecret.
     
    Figure 3: Verify the stack successfully created the secret

Summary

I showed you how to create and retrieve secrets in CloudFormation. This improves the security of your infrastructure and makes it easier to automate infrastructure provisioning. To get started managing secrets, open the Secrets Manager console. To learn more, read How to Store, Distribute, and Rotate Credentials Securely with Secrets Manager or refer to the Secrets Manager documentation.

Apurv Awasthi

Apurv is the product manager for credentials management services at AWS, including AWS Secrets Manager and IAM Roles. He enjoys the “Day 1” culture at Amazon because it aligns with his experience building startups in the sports and recruiting industries. Outside of work, Apurv enjoys hiking. He holds an MBA from UCLA and an MS in computer science from University of Kentucky.

Learn about AWS – November AWS Online Tech Talks

Post Syndicated from Robin Park original https://aws.amazon.com/blogs/aws/learn-about-aws-november-aws-online-tech-talks/

AWS Tech Talks

AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. Join us this month to learn about AWS services and solutions. We’ll have experts online to help answer any questions you may have.

Featured this month! Check out the tech talks: Virtual Hands-On Workshop: Amazon Elasticsearch Service – Analyze Your CloudTrail Logs, AWS re:Invent: Know Before You Go and AWS Office Hours: Amazon GuardDuty Tips and Tricks.

Register today!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

AR/VR

November 13, 2018 | 11:00 AM – 12:00 PM PT | How to Create a Chatbot Using Amazon Sumerian and Sumerian Hosts – Learn how to quickly and easily create a chatbot using Amazon Sumerian and Sumerian Hosts.

Compute

November 19, 2018 | 11:00 AM – 12:00 PM PT | Using Amazon Lightsail to Create a Database – Learn how to set up a database on your Amazon Lightsail instance for your applications or stand-alone websites.

November 21, 2018 | 09:00 AM – 10:00 AM PT | Save up to 90% on CI/CD Workloads with Amazon EC2 Spot Instances – Learn how to automatically scale a fleet of Spot Instances with Jenkins and EC2 Spot Plug-In.

Containers

November 13, 2018 | 09:00 AM – 10:00 AM PT | Customer Showcase: How Portal Finance Scaled Their Containerized Application Seamlessly with AWS Fargate – Learn how to scale your containerized applications without managing servers and clusters, using AWS Fargate.

November 14, 2018 | 11:00 AM – 12:00 PM PT | Customer Showcase: How 99designs Used AWS Fargate and Datadog to Manage their Containerized Application – Learn how 99designs scales their containerized applications using AWS Fargate.

November 21, 2018 | 11:00 AM – 12:00 PM PT | Monitor the World: Meaningful Metrics for Containerized Apps and Clusters – Learn about the metrics and tools you need to monitor your Kubernetes applications on AWS.

Data Lakes & Analytics

November 12, 2018 | 01:00 PM – 01:45 PM PT | Search Your DynamoDB Data with Amazon Elasticsearch Service – Learn the joint power of Amazon Elasticsearch Service and DynamoDB and how to set up your DynamoDB tables and streams to replicate your data to Amazon Elasticsearch Service.

November 13, 2018 | 01:00 PM – 01:45 PM PT | Virtual Hands-On Workshop: Amazon Elasticsearch Service – Analyze Your CloudTrail Logs – Get hands-on experience and learn how to ingest and analyze CloudTrail logs using Amazon Elasticsearch Service.

November 14, 2018 | 01:00 PM – 01:45 PM PT | Best Practices for Migrating Big Data Workloads to AWS – Learn how to migrate analytics, data processing (ETL), and data science workloads running on Apache Hadoop, Spark, and data warehouse appliances from on-premises deployments to AWS.

November 15, 2018 | 11:00 AM – 11:45 AM PT | Best Practices for Scaling Amazon Redshift – Learn about the most common scalability pain points with analytics platforms and see how Amazon Redshift can quickly scale to fulfill growing analytical needs and data volume.

Databases

November 12, 2018 | 11:00 AM – 11:45 AM PT | Modernize your SQL Server 2008/R2 Databases with AWS Database Services – As the end of extended support for SQL Server 2008/R2 nears, learn how AWS’s portfolio of fully managed, cost-effective databases and easy-to-use migration tools can help.

DevOps

November 16, 2018 | 09:00 AM – 09:45 AM PT | Build and Orchestrate Serverless Applications on AWS with PowerShell – Learn how to build and orchestrate serverless applications on AWS with AWS Lambda and PowerShell.

End-User Computing

November 19, 2018 | 01:00 PM – 02:00 PM PT | Work Without Workstations with AppStream 2.0 – Learn how to work without workstations and accelerate your engineering workflows using AppStream 2.0.

Enterprise & Hybrid

November 19, 2018 | 09:00 AM – 10:00 AM PT | Enterprise DevOps: New Patterns of Efficiency – Learn how to implement “Enterprise DevOps” in your organization through building a culture of inclusion, common sense, and continuous improvement.

November 20, 2018 | 11:00 AM – 11:45 AM PT | Are Your Workloads Well-Architected? – Learn how to measure and improve your workloads with AWS Well-Architected best practices.

IoT

November 16, 2018 | 01:00 PM – 02:00 PM PT | Pushing Intelligence to the Edge in Industrial Applications – Learn how GE uses AWS IoT for industrial use cases, including 3D printing and aviation.

Machine Learning

November 12, 2018 | 09:00 AM – 09:45 AM PT | Automate for Efficiency with Amazon Transcribe and Amazon Translate – Learn how you can increase the efficiency and reach of your operations with Amazon Translate and Amazon Transcribe.

Mobile

November 20, 2018 | 01:00 PM – 02:00 PM PT | GraphQL Deep Dive – Designing Schemas and Automating Deployment – Get an overview of the basics of how GraphQL works and dive into different schema designs, best practices, and considerations for providing data to your applications in production.

re:Invent

November 9, 2018 | 08:00 AM – 08:30 AM PT | Episode 7: Getting Around the re:Invent Campus – Learn how to efficiently get around the re:Invent campus using our new mobile app technology. Make sure you arrive on time and never miss a session.

November 14, 2018 | 08:00 AM – 08:30 AM PT | Episode 8: Know Before You Go – Learn about all the final details you need to know before you arrive in Las Vegas for AWS re:Invent!

Security, Identity & Compliance

November 16, 2018 | 11:00 AM – 12:00 PM PT | AWS Office Hours: Amazon GuardDuty Tips and Tricks – Join us for office hours and get the latest tips and tricks for Amazon GuardDuty from AWS Security experts.

Serverless

November 14, 2018 | 09:00 AM – 10:00 AM PT | Serverless Workflows for the Enterprise – Learn how to seamlessly build and deploy serverless applications across multiple teams in large organizations.

Storage

November 15, 2018 | 01:00 PM – 01:45 PM PT | Move From Tape Backups to AWS in 30 Minutes – Learn how to switch to cloud backups easily with AWS Storage Gateway.

November 20, 2018 | 09:00 AM – 10:00 AM PT | Deep Dive on Amazon S3 Security and Management – Amazon S3 provides some of the most enhanced data security features available in the cloud today, including access controls, encryption, security monitoring, remediation, and security standards and compliance certifications.

Performance matters: Amazon Redshift is now up to 3.5x faster for real-world workloads

Post Syndicated from Ayush Jain original https://aws.amazon.com/blogs/big-data/performance-matters-amazon-redshift-is-now-up-to-3-5x-faster-for-real-world-workloads/

Since we launched Amazon Redshift, thousands of customers have trusted us to get uncompromising speed for their most complex analytical workloads. Over the course of 2017, our customers benefited from a 3x to 5x performance gain, resulting from short query acceleration, result caching, late materialization, and many other under-the-hood improvements. In this post, we highlight recent improvements to Amazon Redshift and how our continued focus on performance enhancements is benefiting customers. We also discuss performance testing derived from industry-standard benchmarks that help us measure the impact of these ongoing improvements.

Recent performance improvements

With the largest number of data warehousing deployments in the cloud, we have the ability to analyze usage patterns across a variety of analytical workloads and uncover opportunities to improve performance. We leverage these insights to deliver improvements that seamlessly benefit thousands of customers. Major improvements in performance over the past six months include the following:

  • Improved resource management for memory-intensive queries: Amazon Redshift improved how joins and aggregations consume and reserve memory. This improved cache efficiency for the majority of the hash tables and reduced spilling for memory-intensive joins and aggregations by up to 1.6x.
  • Improved performance for commits: As a central component of write transactions, commit has a direct impact on the performance of data update and data ingestion workloads, such as ETL (extract, transform, and load) jobs. Since November 2017, we’ve delivered a series of commit performance optimizations such as batching multiple commits in a single operation, improved usage of commit locks, and a locality-aware metadata defragmenter. These and other related optimizations have resulted in a 4x commit time reduction on average for HDD-based clusters. For heavy transactions (the top 5 percent of commit operations in Amazon Redshift), the delivered optimizations resulted in a 7.5x improvement.
  • Improved performance for repeated queries: With Amazon Redshift’s result caching, dashboards, visualization, and business intelligence (BI) tools that execute queries repeatedly now see a significant boost in performance. In addition, result caching frees up resources that can improve the performance of all other queries.
  • Query processing improvements: Amazon Redshift now performs 2x–6x faster for scenarios such as repeated subqueries, advanced analytics functions with predicates, and complex query plans by eliminating duplicate work and streamlining steps.
  • Faster string manipulation: Amazon Redshift yields 5x better performance for frequently used string functions because of more efficient code generation techniques.

We’ve also complemented these out-of-the-box improvements with tailored recommendations to help you get better performance at a lower cost with Amazon Redshift Advisor. Advisor has already provided close to 50,000 recommendations since it launched in July 2018.

All of these optimizations have transparently boosted customers’ ability to get faster insights from their AWS analytics platform and saved thousands of hours of execution time on a daily basis. This applies to even the largest deployments, where customers have multiple petabytes of data in Redshift clusters, and seamless access to even larger data volumes in their Amazon S3 data lakes with Amazon Redshift Spectrum. “Redshift’s query performance and scalability has been increasing, even though our data has grown,” said Minero Aoki, Senior Data Engineer, Cookpad Inc. “In the last 10 months, we have seen commit performance increase by 500% without any increase in cost.”

Using benchmarks to measure success

To measure the impact of these ongoing improvements, we measure performance on a nightly basis and run queries derived from industry-standard benchmarks such as TPC-DS. We also occasionally benchmark Amazon Redshift against other data warehouse services. We set up these measurements to reflect our customers’ real-world usage, as highlighted earlier. This enables us to accurately gauge whether Amazon Redshift is getting better with each release, which happens every two weeks.

Comparing Amazon Redshift releases over the past few months, we observed that Amazon Redshift is now 3.5x faster versus six months ago, running all 99 queries derived from the TPC-DS benchmark. This is shown in the following chart.

Note: We used a Cloud DW benchmark derived from TPC-DS for this study. As such, the Cloud DW benchmark is not comparable to published TPC-DS results. TPC Benchmark and TPC-DS are trademarks of the Transaction Processing Performance Council.

For this post, we also compared the latest Amazon Redshift release with Microsoft Azure SQL Data Warehouse using the Cloud DW benchmark derived from TPC-DS. Queries ran against a 3 TB dataset on a 4-node cluster on both services, using dc2.8xlarge for Amazon Redshift and DW2000c Gen2 for Azure SQL Data Warehouse. We could not run a larger dataset because Azure could not allocate the DW15000c cluster required for a 30 TB dataset owing to capacity constraints at the time of publishing.

We observed that Amazon Redshift is 15x faster than Azure SQL Data Warehouse running all 99 queries with one user, and 14x faster with four concurrent users. A couple of outlier queries took Azure SQL Data Warehouse several hours to complete. Excluding these two long-running queries, Amazon Redshift is 2x faster than Azure SQL Data Warehouse with one user and 1.6x faster with four concurrent users. The following charts compare the two services.

Note: We used queries derived from TPC-DS v2.9 for this study. Amazon Redshift and Azure SQL DW do not support rollup queries, so we used TPC-DS provided variants for queries 5, 14, 18, 27, 36, 67, 70, 77, 80, and 86. We used out-of-the-box Workload Management configuration for Amazon Redshift, which allows for 5 concurrent queries, and ‘largerc’ resource class for Azure SQL DW, which has a lower limit of 4 concurrent queries. Amazon Redshift took 25 minutes to run all 99 queries, whereas Azure SQL Data Warehouse took 6.4 hours. Ignoring two queries that each took Azure SQL Data Warehouse more than 1 hour to execute (Q38 and Q67), Amazon Redshift took 22 minutes, while Azure SQL Data Warehouse took 42 minutes.
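The ratios can be checked against the run times quoted in the note above. The short Python sketch below uses only those four figures; the variable names are illustrative and not part of the benchmark kit.

```python
# Run times quoted in the note, converted to minutes.
redshift_all = 25.0        # Amazon Redshift, all 99 queries
azure_all = 6.4 * 60       # Azure SQL Data Warehouse, all 99 queries (6.4 hours)
redshift_trimmed = 22.0    # Amazon Redshift, excluding Q38 and Q67
azure_trimmed = 42.0       # Azure SQL Data Warehouse, excluding Q38 and Q67

# Speedup is the ratio of the slower run time to the faster one.
speedup_all = azure_all / redshift_all
speedup_trimmed = azure_trimmed / redshift_trimmed

print(f"All 99 queries: {speedup_all:.1f}x")
print(f"Excluding Q38 and Q67: {speedup_trimmed:.1f}x")
```

The two results, roughly 15.4 and 1.9, are consistent with the rounded 15x and 2x figures reported for the one-user runs.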

 

Evaluating Amazon Redshift

Although benchmarks against other data warehouse services are interesting, they are of limited value. First, there’s no one-size-fits-all benchmark. Each service has its unique real-world usage patterns and ways to configure and tune for them. We make a best effort to configure the services based on publicly available guidance, but we can’t guarantee optimal performance for any given service. We see this commonly with third-party benchmarks, for instance, where Amazon Redshift’s powerful distribution and sort keys are not used—even though the large majority of our customers use them.

Similarly, each benchmark query can only be run once, in contrast to real-world scenarios where 99.5 percent of queries we observe have components that can be found in the compilation cache (Amazon Redshift generates and compiles code for each query execution plan. The compiled code segments are stored in a least recently used cache and shared across sessions in a cluster). In other words, they are similar to queries that were run previously. So, the query run times measured by benchmarking studies can end up over-indexing on compilation times, which might not indicate the actual performance you can expect to get.
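To make the caching argument concrete, here is a toy least-recently-used cache in Python. It is only an illustration of the general technique, not Amazon Redshift's implementation: the class name, the capacity, the segment signatures, and the `compile_segment` stand-in are all hypothetical.

```python
from collections import OrderedDict


class SegmentCache:
    """Toy least-recently-used cache keyed by a plan-segment signature."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_compile(self, signature, compile_segment):
        if signature in self._entries:
            # Cache hit: mark the entry as most recently used and skip compilation.
            self._entries.move_to_end(signature)
            self.hits += 1
            return self._entries[signature]
        # Cache miss: pay the compilation cost once, then remember the result.
        self.misses += 1
        compiled = compile_segment(signature)
        self._entries[signature] = compiled
        if len(self._entries) > self.capacity:
            # Evict the least recently used entry.
            self._entries.popitem(last=False)
        return compiled


cache = SegmentCache(capacity=2)
for signature in ["scan+filter", "hash-join", "scan+filter", "scan+filter"]:
    cache.get_or_compile(signature, lambda s: f"compiled<{s}>")

print(cache.hits, cache.misses)
```

In this sketch, a workload that repeats similar segments mostly hits the cache, while a benchmark that runs each query once sees only misses and so pays the compilation cost every time.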

Second, these studies are, by necessity, a point-in-time assessment. As cloud vendors update and evolve their services, benchmark numbers might already be obsolete by the time they’re published.

Therefore, we don’t recommend that you make product selection decisions based on these benchmarks because your data and your query workloads have their own unique characteristics. If you’re evaluating Amazon Redshift for your analytics platform, we have created a Proof of Concept guide to help. You can also request assistance from us, or work with one of our System Integration and Consulting Partners and make a data-driven decision.

Finally, we invite you to watch the recent Fireside chat webinar and join us at re:Invent 2018 in Las Vegas, where we have a ton of exciting news to share with you. Happy querying!

If you would like instructions to reproduce the benchmark, please contact us at [email protected]. If you have questions or suggestions, please comment below.


About the Authors

Ayush Jain is a Product Marketer at Amazon Web Services. He loves growing cloud services and helping customers get more value from their cloud deployments. He has several years of experience in Software Development, Product Management and Product Marketing in developer and data services.

 

 

 

Mostafa Mokhtar is an engineer working on Redshift performance. Previously, he held similar roles at Cloudera, Hortonworks and on the SQL Server team at Microsoft.

 

How to use AWS Secrets Manager to rotate credentials for all Amazon RDS database types, including Oracle

Post Syndicated from Apurv Awasthi original https://aws.amazon.com/blogs/security/how-to-use-aws-secrets-manager-rotate-credentials-amazon-rds-database-types-oracle/

You can now use AWS Secrets Manager to rotate credentials for Oracle, Microsoft SQL Server, or MariaDB databases hosted on Amazon Relational Database Service (Amazon RDS) automatically. Previously, I showed how to rotate credentials for a MySQL database hosted on Amazon RDS automatically with AWS Secrets Manager. With today’s launch, you can use Secrets Manager to automatically rotate credentials for all types of databases hosted on Amazon RDS.

In this post, I review the key features of Secrets Manager. You’ll then learn:

  1. How to store the database credential for the superuser of an Oracle database hosted on Amazon RDS
  2. How to store the Oracle database credential used by an application
  3. How to configure Secrets Manager to rotate both Oracle credentials automatically on a schedule that you define

Key features of Secrets Manager

AWS Secrets Manager makes it easier to rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle. The key features of this service include the ability to:

  1. Secure and manage secrets centrally. You can store, view, and manage all your secrets centrally. By default, Secrets Manager encrypts these secrets with encryption keys that you own and control. You can use fine-grained IAM policies or resource-based policies to control access to your secrets. You can also tag secrets to help you discover, organize, and control access to secrets used throughout your organization.
  2. Rotate secrets safely. You can configure Secrets Manager to rotate secrets automatically without disrupting your applications. Secrets Manager offers built-in integrations for rotating credentials for all Amazon RDS databases (MySQL, PostgreSQL, Oracle, Microsoft SQL Server, MariaDB, and Amazon Aurora). You can also extend Secrets Manager to meet your custom rotation requirements by creating an AWS Lambda function to rotate other types of secrets.
  3. Transmit securely. Secrets are transmitted securely over Transport Layer Security (TLS) protocol 1.2. You can also use Secrets Manager with Amazon Virtual Private Cloud (Amazon VPC) endpoints powered by AWS PrivateLink to keep this communication within the AWS network and help meet your compliance and regulatory requirements to limit public internet connectivity.
  4. Pay as you go. Pay for the secrets you store in Secrets Manager and for the use of these secrets; there are no long-term contracts, licensing fees, or infrastructure and personnel costs. For example, a typical production-scale web application will generate an estimated monthly bill of $6. If you follow the instructions in this blog post, your estimated monthly bill for Secrets Manager will be $1. Note: you may incur additional charges for using Amazon RDS and AWS Lambda if you’ve already consumed the free tier for these services.

Now that you’re familiar with Secrets Manager features, I’ll show you how to store and automatically rotate credentials for an Oracle database hosted on Amazon RDS. I divided these instructions into three phases:

  1. Phase 1: Store and configure rotation for the superuser credential
  2. Phase 2: Store and configure rotation for the application credential
  3. Phase 3: Retrieve the credential from Secrets Manager programmatically

Prerequisites

To follow along, your AWS Identity and Access Management (IAM) principal (user or role) requires the SecretsManagerReadWrite AWS managed policy to store the secrets. Your principal also requires the IAMFullAccess AWS managed policy to create and configure permissions for the IAM role used by Lambda for executing rotations. You can use IAM permissions boundaries to grant an employee the ability to configure rotation without also granting them full administrative access to your account.

Phase 1: Store and configure rotation for the superuser credential

From the Secrets Manager console, on the right side, select Store a new secret.

Since I’m storing credentials for a database hosted on Amazon RDS, I select Credentials for RDS database. Next, I input the user name and password for the superuser. I start by securing the superuser because it’s the most powerful database credential and has full access to the database.
 

Figure 1: For "Select secret type," choose "Credentials for RDS database"


For this example, I choose to use the default encryption settings. Secrets Manager will encrypt this secret using the Secrets Manager DefaultEncryptionKey in this account. Alternatively, I can choose to encrypt using a customer master key (CMK) that I have stored in AWS Key Management Service (AWS KMS). To learn more, read the Using Your AWS KMS CMK documentation.
 

Figure 2: Choose either DefaultEncryptionKey or use a CMK


Next, I view the list of Amazon RDS instances in my account and select the database this credential accesses. For this example, I select the DB instance oracle-rds-database from the list, and then I select Next.

I then specify values for Secret name and Description. For this example, I use Database/Development/Oracle-Superuser as the name and enter a description of this secret, and then select Next.
 

Figure 3: Provide values for "Secret name" and "Description"


Since this database is not yet being used, I choose to enable rotation. To do so, I select Enable automatic rotation, and then set the rotation interval to 60 days. Remember, if this database credential is currently being used, first update the application (see phase 3) to use Secrets Manager APIs to retrieve secrets before enabling rotation.
 

Figure 4: Select "Enable automatic rotation"


Next, Secrets Manager requires permissions to rotate this secret on my behalf. Because I’m storing the credentials for the superuser, Secrets Manager can use this credential to perform rotations. Therefore, on the same screen, I select Use a secret that I have previously stored in AWS Secrets Manager, and then select Next.

Finally, I review the information on the next screen. Everything looks correct, so I select Store. I have now successfully stored a secret in Secrets Manager.

Note: Secrets Manager will now create a Lambda function in the same VPC as my Oracle database and trigger this function periodically to change the password for the superuser. I can view the name of the Lambda function on the Rotation configuration section of the Secret Details page.

The banner on the next screen confirms that I’ve successfully configured rotation and the first rotation is in progress, which enables me to verify that rotation is functioning as expected. Secrets Manager will rotate this credential automatically every 60 days.
 

Figure 5: The confirmation notification
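The console steps above have an API counterpart. The sketch below is a minimal illustration of the Secrets Manager RotateSecret call through a boto3-style client, using the same 60-day interval. It assumes a rotation Lambda function already exists (the console creates one for you, the API call does not); the function ARN shown is a placeholder, and `RecordingClient` is a hypothetical stand-in so the example runs without AWS credentials.

```python
def enable_rotation(client, secret_id, rotation_lambda_arn, days=60):
    """Ask Secrets Manager to rotate `secret_id` every `days` days."""
    return client.rotate_secret(
        SecretId=secret_id,
        RotationLambdaARN=rotation_lambda_arn,
        RotationRules={"AutomaticallyAfterDays": days},
    )


class RecordingClient:
    """Offline stand-in for a Secrets Manager client; it only records the call."""

    def __init__(self):
        self.calls = []

    def rotate_secret(self, **kwargs):
        self.calls.append(kwargs)
        return {"Name": kwargs["SecretId"]}


client = RecordingClient()
enable_rotation(
    client,
    "Database/Development/Oracle-Superuser",
    # Placeholder ARN; substitute the ARN of your own rotation function.
    "arn:aws:lambda:us-east-1:123456789012:function:example-rotation-function",
)
print(client.calls[0]["RotationRules"])
```

In a real script you would pass `boto3.client("secretsmanager")` instead of the stand-in, with credentials that allow `secretsmanager:RotateSecret`.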


Phase 2: Store and configure rotation for the application credential

The superuser is a powerful credential that should be used only for administrative tasks. To enable your applications to access a database, create a unique database credential per application and grant these credentials limited permissions. You can use these database credentials to read or write to database tables required by the application. As a security best practice, deny the ability to perform management actions, such as creating new credentials.

In this phase, I will store the credential that my application will use to connect to the Oracle database. To get started, from the Secrets Manager console, on the right side, select Store a new secret.

Next, I select Credentials for RDS database, and input the user name and password for the application credential.

I continue to use the default encryption key. I select the DB instance oracle-rds-database, and then select Next.

I specify values for Secret Name and Description. For this example, I use Database/Development/Oracle-Application-User as the name and enter a description of this secret, and then select Next.

I now configure rotation. Once again, since my application is not using this database credential yet, I’ll configure rotation as part of storing this secret. I select Enable automatic rotation, and set the rotation interval to 60 days.

Next, Secrets Manager requires permissions to rotate this secret on behalf of my application. Earlier in the post, I mentioned that application credentials have limited permissions and are unable to change their password. Therefore, I will use the superuser credential, Database/Development/Oracle-Superuser, that I stored in Phase 1 to rotate the application credential. With this configuration, Secrets Manager creates a clone application user.
 

Figure 6: Select the superuser credential


Note: Creating a clone application user is the preferred mechanism of rotation because the old version of the secret continues to operate and handle service requests while the new version is prepared and tested. There’s no application downtime while changing between versions.
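The post does not spell out how the clone is named or how the two users alternate. One way to picture the idea, assuming a hypothetical `_clone` suffix and a hypothetical starting user named `app_user`, is the following Python sketch:

```python
CLONE_SUFFIX = "_clone"  # assumed naming convention, for illustration only


def next_username(current_username):
    """Return the user that the next rotation would switch the secret to."""
    if current_username.endswith(CLONE_SUFFIX):
        # The clone is active, so the next rotation goes back to the original user.
        return current_username[: -len(CLONE_SUFFIX)]
    # The original user is active, so the next rotation activates the clone.
    return current_username + CLONE_SUFFIX


user = "app_user"
history = [user]
for _ in range(3):
    user = next_username(user)
    history.append(user)

print(history)
```

Because each rotation changes the password of the user that is not currently in the secret, connections opened with the previous user name and password keep working while the new version is prepared and tested.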

I review the information on the next screen. Everything looks correct, so I select Store. I have now successfully stored the application credential in Secrets Manager.

As mentioned in Phase 1, AWS Secrets Manager creates a Lambda function in the same VPC as the database and then triggers this function periodically to rotate the secret. Since I chose to use the existing superuser secret to rotate the application secret, I will grant the rotation Lambda function permissions to retrieve the superuser secret. To grant this permission, I first select role from the confirmation banner.
 

Figure 7: Select the "role" link that's in the confirmation notification


Next, in the Permissions tab, I select SecretsManagerRDSMySQLRotationMultiUserRolePolicy0. Then I select Edit policy.
 

Figure 8: Edit the policy on the "Permissions" tab


In this step, I update the policy (see below) and select Review policy. When following along, remember to replace the placeholder ARN-OF-SUPERUSER-SECRET with the ARN of the secret you stored in Phase 1.


{
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateNetworkInterface",
        "ec2:DeleteNetworkInterface",
        "ec2:DescribeNetworkInterfaces",
        "ec2:DetachNetworkInterface"
      ],
      "Resource": "*"
    },
    {
      "Sid": "GrantPermissionToUse",
      "Effect": "Allow",
      "Action": [
        "secretsmanager:GetSecretValue"
      ],
      "Resource": "ARN-OF-SUPERUSER-SECRET"
    }
  ]
}

Here’s what it will look like:
 

Figure 9: Edit the policy


Next, I select Save changes. I have now completed all the steps required to configure rotation for the application credential, Database/Development/Oracle-Application-User.

Phase 3: Retrieve the credential from Secrets Manager programmatically

Now that I have stored the secret in Secrets Manager, I add code to my application to retrieve the database credential from Secrets Manager. I use the sample code from Phase 2 above. This code sets up the client and retrieves and decrypts the secret Database/Development/Oracle-Application-User.
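For readers who want to see the shape of that retrieval code, here is a minimal Python sketch. It assumes a boto3-style client and assumes the secret value is a JSON document with `username` and `password` keys, which is how the console stores RDS credentials. `StubClient` and its sample values are hypothetical and exist only so the example runs offline.

```python
import json


def get_db_credentials(client, secret_id):
    """Fetch a secret and return the (username, password) pair it holds."""
    response = client.get_secret_value(SecretId=secret_id)
    secret = json.loads(response["SecretString"])
    return secret["username"], secret["password"]


class StubClient:
    """Offline stand-in for a Secrets Manager client, returning made-up values."""

    def get_secret_value(self, SecretId):
        payload = {"username": "app_user", "password": "example-only"}
        return {"Name": SecretId, "SecretString": json.dumps(payload)}


username, password = get_db_credentials(
    StubClient(), "Database/Development/Oracle-Application-User"
)
print(username)
```

In the application itself you would pass `boto3.client("secretsmanager")` as the client. Fetching the secret when a connection is opened, rather than once at deployment time, lets the application pick up a rotated password without a code change.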

Remember, applications require permissions to retrieve the secret, Database/Development/Oracle-Application-User, from Secrets Manager. My application runs on Amazon EC2 and uses an IAM role to obtain access to AWS services. I attach the following policy to my IAM role. This policy uses the GetSecretValue action to grant my application permission to read the secret from Secrets Manager. This policy also uses the resource element to limit my application to reading only the Database/Development/Oracle-Application-User secret from Secrets Manager. You can refer to the Secrets Manager Documentation to understand the minimum IAM permissions required to retrieve a secret.


{
  "Version": "2012-10-17",
  "Statement": {
    "Sid": "RetrieveDbCredentialFromSecretsManager",
    "Effect": "Allow",
    "Action": "secretsmanager:GetSecretValue",
    "Resource": "arn:aws:secretsmanager:<AWS-REGION>:<ACCOUNT-NUMBER>:secret:Database/Development/Oracle-Application-User"
  }
}

In the above policy, remember to replace the placeholder <AWS-REGION> with the AWS region that you’re using and the placeholder <ACCOUNT-NUMBER> with the number of your AWS account.
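If you generate this policy from a script, the substitution can be done in code rather than by hand. The following Python sketch builds the same document from a Region, an account number, and a secret name; the function name and the sample Region and account number are placeholders chosen for illustration.

```python
import json


def read_secret_policy(region, account_number, secret_name):
    """Build a policy that allows GetSecretValue on a single secret."""
    resource = f"arn:aws:secretsmanager:{region}:{account_number}:secret:{secret_name}"
    return {
        "Version": "2012-10-17",
        "Statement": {
            "Sid": "RetrieveDbCredentialFromSecretsManager",
            "Effect": "Allow",
            "Action": "secretsmanager:GetSecretValue",
            "Resource": resource,
        },
    }


policy = read_secret_policy(
    "us-east-1",       # placeholder Region
    "123456789012",    # placeholder account number
    "Database/Development/Oracle-Application-User",
)
print(json.dumps(policy, indent=2))
```

Secrets Manager appends a hyphen and six random characters to the name when it forms a secret's ARN, so if the generated resource does not match, compare it with the full ARN shown on the secret's details page.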

Summary

I explained the key benefits of Secrets Manager as they relate to RDS and showed you how to help meet your compliance requirements by configuring Secrets Manager to rotate database credentials automatically on your behalf. Secrets Manager helps you protect access to your applications, services, and IT resources without the upfront investment and on-going maintenance costs of operating your own secrets management infrastructure. To get started, visit the Secrets Manager console. To learn more, visit Secrets Manager documentation.

If you have comments about this post, submit them in the Comments section below. If you have questions about anything in this post, start a new thread on the Secrets Manager forum.

Want more AWS Security news? Follow us on Twitter.

Apurv Awasthi

Apurv is the product manager for credentials management services at AWS, including AWS Secrets Manager and IAM Roles. He enjoys the “Day 1” culture at Amazon because it aligns with his experience building startups in the sports and recruiting industries. Outside of work, Apurv enjoys hiking. He holds an MBA from UCLA and an MS in computer science from University of Kentucky.