Tag Archives: ror

Implement continuous integration and delivery of serverless AWS Glue ETL applications using AWS Developer Tools

Post Syndicated from Prasad Alle original https://aws.amazon.com/blogs/big-data/implement-continuous-integration-and-delivery-of-serverless-aws-glue-etl-applications-using-aws-developer-tools/

AWS Glue is an increasingly popular way to develop serverless ETL (extract, transform, and load) applications for big data and data lake workloads. Organizations that transform their ETL applications to cloud-based, serverless ETL architectures need a seamless, end-to-end continuous integration and continuous delivery (CI/CD) pipeline: from source code, to build, to deployment, to product delivery. Having a good CI/CD pipeline can help your organization discover bugs before they reach production and deliver updates more frequently. It can also help developers write quality code and automate the ETL job release management process, mitigate risk, and more.

AWS Glue is a fully managed data catalog and ETL service. It simplifies and automates the difficult and time-consuming tasks of data discovery, conversion, and job scheduling. AWS Glue crawls your data sources and constructs a data catalog using pre-built classifiers for popular data formats and data types, including CSV, Apache Parquet, JSON, and more.

When you are developing ETL applications using AWS Glue, you might come across some of the following CI/CD challenges:

  • Iterative development with unit tests
  • Continuous integration and build
  • Pushing the ETL pipeline to a test environment
  • Pushing the ETL pipeline to a production environment
  • Testing ETL applications using real data (live test)
  • Exploring and validating data

In this post, I walk you through a solution that implements a CI/CD pipeline for serverless AWS Glue ETL applications supported by AWS Developer Tools (including AWS CodePipeline, AWS CodeCommit, and AWS CodeBuild) and AWS CloudFormation.

Solution overview

The following diagram shows the pipeline workflow:

This solution uses AWS CodePipeline, which lets you orchestrate and automate the test and deploy stages for ETL application source code. The solution consists of a pipeline that contains the following stages:

1.) Source Control: In this stage, the AWS Glue ETL job source code and the AWS CloudFormation template file for deploying the ETL jobs are both committed to version control. I chose to use AWS CodeCommit for version control.

To get the ETL job source code and AWS CloudFormation template, download the gluedemoetl.zip file. This solution is developed based on a previous post, Build a Data Lake Foundation with AWS Glue and Amazon S3.

2.) LiveTest: In this stage, all resources—including AWS Glue crawlers, jobs, S3 buckets, roles, and other resources that are required for the solution—are provisioned, deployed, live tested, and cleaned up.

The LiveTest stage includes the following actions:

  • Deploy: In this action, all the resources that are required for this solution (crawlers, jobs, buckets, roles, and so on) are provisioned and deployed using an AWS CloudFormation template.
  • AutomatedLiveTest: In this action, all the AWS Glue crawlers and jobs are executed, and data exploration and validation tests are performed. These validation tests include, but are not limited to, comparing record counts between the raw tables and the transformed tables in the data lake, plus any other business validations. I used AWS CodeBuild for this action (see the validation sketch after this list).
  • LiveTestApproval: This action is included for the cases in which a pipeline administrator approval is required to deploy/promote the ETL applications to the next stage. The pipeline pauses in this action until an administrator manually approves the release.
  • LiveTestCleanup: In this action, all the LiveTest stage resources, including test crawlers, jobs, roles, and so on, are deleted using the AWS CloudFormation template. This action helps minimize cost by ensuring that the test resources exist only for the duration of the AutomatedLiveTest and LiveTestApproval actions.
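
To make the AutomatedLiveTest action more concrete, the following is a minimal validation sketch rather than the actual code shipped in gluedemoetl.zip. It assumes a Node.js script run from CodeBuild, reuses the Athena table names shown later in this post, and points Athena query results at a placeholder S3 location.

'use strict';

// Minimal validation sketch (assumption: run as a Node.js script inside CodeBuild).
var AWS = require('aws-sdk');
var athena = new AWS.Athena();

// Run a count query in Athena and return the single-cell result.
function countRows(query, callback) {
  var params = {
    QueryString: query,
    ResultConfiguration: { OutputLocation: 's3://mybucket/athena-results/' } // placeholder bucket
  };
  athena.startQueryExecution(params, function (err, data) {
    if (err) return callback(err);
    var id = data.QueryExecutionId;
    (function poll() {
      athena.getQueryExecution({ QueryExecutionId: id }, function (err, data) {
        if (err) return callback(err);
        var state = data.QueryExecution.Status.State;
        if (state === 'SUCCEEDED') {
          athena.getQueryResults({ QueryExecutionId: id }, function (err, data) {
            if (err) return callback(err);
            // Row 0 is the header row; row 1 holds the count.
            callback(null, parseInt(data.ResultSet.Rows[1].Data[0].VarCharValue, 10));
          });
        } else if (state === 'FAILED' || state === 'CANCELLED') {
          callback(new Error('Query ' + id + ' ended in state ' + state));
        } else {
          setTimeout(poll, 2000);
        }
      });
    })();
  });
}

// Compare raw and transformed record counts; a non-zero exit fails the CodeBuild action.
countRows('SELECT count(*) FROM "nycitytaxi_gluedemocicdtest"."data"', function (err, rawCount) {
  if (err) throw err;
  countRows('SELECT count(*) FROM "nytaxiparquet_gluedemocicdtest"."datalake"', function (err, lakeCount) {
    if (err) throw err;
    console.log('Raw: ' + rawCount + ', transformed: ' + lakeCount);
    if (rawCount !== lakeCount) {
      process.exit(1);
    }
  });
});

In a real pipeline, a script like this would run as a buildspec command, so its exit code is what marks the AutomatedLiveTest action as passed or failed.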

3.) DeployToProduction: In this stage, all the resources are deployed using the AWS CloudFormation template to the production environment.

Try it out

This pipeline takes approximately 20 minutes to run through the LiveTest stage, up to the LiveTestApproval action, where manual approval is required.

To get started with this solution, choose Launch Stack:

This creates the CI/CD pipeline with all of its stages, as described earlier. It performs an initial commit of the sample AWS Glue ETL job source code to trigger the first release change.

In the AWS CloudFormation console, choose Create. After the template finishes creating resources, you see the pipeline name on the stack Outputs tab.

After that, open the CodePipeline console and select the newly created pipeline. Initially, your pipeline’s CodeCommit stage shows that the source action failed.

Allow a few minutes for your new pipeline to detect the initial commit applied by the CloudFormation stack creation. As soon as the commit is detected, your pipeline starts. You will see the successful stage completion status as soon as the CodeCommit source stage runs.

In the CodeCommit console, choose Code in the navigation pane to view the solution files.

Next, you can watch how the pipeline works through the Deploy and AutomatedLiveTest actions of the LiveTest stage, until it finally reaches the LiveTestApproval action.

At this point, if you check the AWS CloudFormation console, you can see that a new template has been deployed as part of the LiveTest deploy action.

At this point, make sure that the AWS Glue crawlers and the AWS Glue job ran successfully. Also check whether the corresponding databases and external tables have been created in the AWS Glue Data Catalog. Then validate the data using Amazon Athena, as shown following.

Open the AWS Glue console, and choose Databases in the navigation pane. You will see the following databases in the Data Catalog:

Open the Amazon Athena console, and run the following queries. Verify that the record counts are matching.

SELECT count(*) FROM "nycitytaxi_gluedemocicdtest"."data";
SELECT count(*) FROM "nytaxiparquet_gluedemocicdtest"."datalake";

The following shows the raw data:

The following shows the transformed data:

The pipeline pauses at this action until the release is approved. After validating the data, manually approve the revision on the LiveTestApproval action on the CodePipeline console.

Add comments as needed, and choose Approve.

The LiveTestApproval stage now appears as Approved on the console.

After the revision is approved, the pipeline proceeds to use the AWS CloudFormation template to destroy the resources that were deployed in the LiveTest deploy action. This helps reduce cost and ensures a clean test environment on every deployment.

Production deployment is the final stage. In this stage, all the resources—AWS Glue crawlers, AWS Glue jobs, Amazon S3 buckets, roles, and so on—are provisioned and deployed to the production environment using the AWS CloudFormation template.

After successfully running the whole pipeline, feel free to experiment with it by changing the source code stored in AWS CodeCommit. For example, if you modify the AWS Glue ETL job to generate an error, it should make the AutomatedLiveTest action fail. Or if you change the AWS CloudFormation template to make its creation fail, it should affect the Deploy action of the LiveTest stage. The objective of the pipeline is to ensure that every change deployed to production works as expected.

Conclusion

In this post, you learned how easy it is to implement CI/CD for serverless AWS Glue ETL solutions at scale with AWS Developer Tools like AWS CodePipeline and AWS CodeBuild. Implementing such solutions can help you accelerate ETL development and testing at your organization.

If you have questions or suggestions, please comment below.

 


Additional Reading

If you found this post useful, be sure to check out Implement Continuous Integration and Delivery of Apache Spark Applications using AWS and Build a Data Lake Foundation with AWS Glue and Amazon S3.

 


About the Authors

Prasad Alle is a Senior Big Data Consultant with AWS Professional Services. He spends his time leading and building scalable, reliable big data, machine learning, artificial intelligence, and IoT solutions for AWS Enterprise and Strategic customers. His interests extend to various technologies such as advanced edge computing and machine learning at the edge. In his spare time, he enjoys spending time with his family.

 
Luis Caro is a Big Data Consultant for AWS Professional Services. He works with our customers to provide guidance and technical assistance on big data projects, helping them improve the value of their solutions when using AWS.

 

 

 

[$] Finding Spectre vulnerabilities with smatch

Post Syndicated from corbet original https://lwn.net/Articles/752408/rss

The furor over the Meltdown and Spectre vulnerabilities has calmed a bit —
for now, at least — but that does not mean that developers have stopped
worrying about them. Spectre variant 1 (the bounds-check bypass
vulnerability) has been of particular concern because, while the kernel is
thought to contain numerous vulnerable spots, nobody really knows how to
find them all. As a result, the defenses that have been developed for
variant 1 have only been deployed in a few places. Recently, though,
Dan Carpenter has enhanced the smatch tool to enable it to find possibly
vulnerable code in the kernel.

Cloudflare Kicks Out Torrent Site For Abuse Reporting Interference

Post Syndicated from Ernesto original https://torrentfreak.com/cloudflare-kicks-out-torrent-site-for-abuse-reporting-interference-180420/

As one of the leading CDN and DDoS protection services, Cloudflare is used by millions of websites across the globe.

The company’s clients include billion dollar companies and national governments, but also personal blogs, and even pirate sites.

Copyright holders are not happy with the latter category and are pressuring Cloudflare to cut their ties with sites like The Pirate Bay, both in and out of court.

Cloudflare, however, maintains that it’s a neutral service provider. They forward copyright infringement notices to their customers, for example, but deny any liability for these sites.

Generally speaking, the company only disconnects a customer in response to a court order, as it did with Sci-Hub earlier this year. That’s why it came as a surprise when the anime torrent site NYAA.si was disconnected this week.

The site, which is a replacement for the original NYAA, has millions of users and is particularly popular in Japan. Without prior warning, it became unavailable for several hours this week, after Cloudflare removed it from its services. So what happened?

TorrentFreak spoke to the operator, who said that the exact reason for the termination remains a mystery to him. He reached out to Cloudflare looking for answers, but the company simply stated that it’s about “avoiding measures taken to avoid abuse complaints,” as can be seen below.

One of Cloudflare’s messages

The operator says he hasn’t done anything out of the ordinary and showed his willingness to resolve any possible issues. However, that hasn’t changed Cloudflare’s stance.

“We asked multiple times for clarification. We also expressed that we were willing to attempt to work with them on whatever the problem actually was, if they would explain what they even mean.

“Naturally, I have been stonewalled by them at every stage. I’ve contacted numerous persons at Cloudflare and nobody will talk about this,” NYAA’s operator adds.

TorrentFreak asked Cloudflare for more details and the company confirmed that the matter was related to interference with its abuse reporting systems, without providing further detail.

“We determined that the customer had taken steps specifically intended to interfere with and thwart the operation of our abuse reporting systems,” Cloudflare’s General Counsel Doug Kramer informed us.

Cloudflare’s statement suggests that the site took active steps to interfere with the abuse process. The company added that it can’t go into detail, but says that the reason for the termination was shared with the website owner.

The website owner, on the other hand, informs us that he has no clue what the exact problem is. NYAA.si occasionally swaps IP addresses and has recently set up some mirror domains, but these were all under the same account. So, he has no idea why that would interfere with any abuse reports.

“I’m honestly unsure of what we could have done that ‘circumvents’ their abuse system,” NYAA’s operator says, adding that the only abuse reports received were copyright related.

It’s unlikely, however, that copyright takedown notices alone would warrant account termination, as most of the largest torrent sites use Cloudflare.

NYAA’s operator says he can do little more than speculate at this point. Some have hinted at a secret court order, while Japan’s recent crackdown on manga and anime piracy also came to mind, all without a grain of evidence, of course.

Whatever the reason, NYAA.si now has to move on without Cloudflare, while the mystery remains.

“Frankly, this whole thing is a joke. I don’t understand why they would willingly host much bigger sites like ThePirateBay without any issue, or even ISIS, or the various hacking groups that have used them over time,” the operator says.

If more information about the abuse process interference becomes available, we’ll definitely follow it up.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Implementing safe AWS Lambda deployments with AWS CodeDeploy

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/implementing-safe-aws-lambda-deployments-with-aws-codedeploy/

This post courtesy of George Mao, AWS Senior Serverless Specialist – Solutions Architect

AWS Lambda and AWS CodeDeploy recently made it possible to automatically shift incoming traffic between two function versions based on a preconfigured rollout strategy. This new feature allows you to gradually shift traffic to the new function. If there are any issues with the new code, you can quickly roll back and limit the impact to your application.

Previously, you had to manually move 100% of traffic from the old version to the new version. Now, you can have CodeDeploy automatically execute pre- or post-deployment tests and automate a gradual rollout strategy. Traffic shifting is built right into the AWS Serverless Application Model (SAM), making it easy to define and deploy your traffic shifting capabilities. SAM is an extension of AWS CloudFormation that provides a simplified way of defining serverless applications.

In this post, I show you how to use SAM, CloudFormation, and CodeDeploy to accomplish an automated rollout strategy for safe Lambda deployments.

Scenario

For this walkthrough, you write a Lambda application that returns a count of the S3 buckets that you own. You deploy it and use it in production. Later on, you receive requirements that tell you that you need to change your Lambda application to count only buckets that begin with the letter “a”.

Before you make the change, you need to be sure that your new Lambda application works as expected. If it does have issues, you want to minimize the number of impacted users and roll back easily. To accomplish this, you create a deployment process that publishes the new Lambda function, but does not send any traffic to it. You use CodeDeploy to execute a PreTraffic test to ensure that your new function works as expected. After the test succeeds, CodeDeploy automatically shifts traffic gradually to the new version of the Lambda function.

Your Lambda function is exposed as a REST service via an Amazon API Gateway deployment. This makes it easy to test and integrate.

Prerequisites

To execute the SAM and CloudFormation deployment, you must have the following IAM permissions:

  • cloudformation:*
  • lambda:*
  • codedeploy:*
  • iam:create*

You may use the AWS SAM Local CLI or the AWS CLI to package and deploy your Lambda application. If you choose to use SAM Local, be sure to install it onto your system. For more information, see AWS SAM Local Installation.

All of the code used in this post can be found in this GitHub repository: https://github.com/aws-samples/aws-safe-lambda-deployments.

Walkthrough

For this post, use SAM to define your resources because it comes with built-in CodeDeploy support for safe Lambda deployments.  The deployment is handled and automated by CloudFormation.

SAM allows you to define your Serverless applications in a simple and concise fashion, because it automatically creates all necessary resources behind the scenes. For example, if you do not define an execution role for a Lambda function, SAM automatically creates one. SAM also creates the CodeDeploy application necessary to drive the traffic shifting, as well as the IAM service role that CodeDeploy uses to execute all actions.

Create a SAM template

To get started, write your SAM template and call it template.yaml.

AWSTemplateFormatVersion : '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: An example SAM template for Lambda Safe Deployments.

Resources:

  returnS3Buckets:
    Type: AWS::Serverless::Function
    Properties:
      Handler: returnS3Buckets.handler
      Runtime: nodejs6.10
      AutoPublishAlias: live
      Policies:
        - Version: "2012-10-17"
          Statement: 
          - Effect: "Allow"
            Action: 
              - "s3:ListAllMyBuckets"
            Resource: '*'
      DeploymentPreference:
          Type: Linear10PercentEvery1Minute
          Hooks:
            PreTraffic: !Ref preTrafficHook
      Events:
        Api:
          Type: Api
          Properties:
            Path: /test
            Method: get

  preTrafficHook:
    Type: AWS::Serverless::Function
    Properties:
      Handler: preTrafficHook.handler
      Policies:
        - Version: "2012-10-17"
          Statement: 
          - Effect: "Allow"
            Action: 
              - "codedeploy:PutLifecycleEventHookExecutionStatus"
            Resource:
              !Sub 'arn:aws:codedeploy:${AWS::Region}:${AWS::AccountId}:deploymentgroup:${ServerlessDeploymentApplication}/*'
        - Version: "2012-10-17"
          Statement: 
          - Effect: "Allow"
            Action: 
              - "lambda:InvokeFunction"
            Resource: !Ref returnS3Buckets.Version
      Runtime: nodejs6.10
      FunctionName: 'CodeDeployHook_preTrafficHook'
      DeploymentPreference:
        Enabled: false
      Timeout: 5
      Environment:
        Variables:
          NewVersion: !Ref returnS3Buckets.Version

This template creates two functions:

  • returnS3Buckets
  • preTrafficHook

The returnS3Buckets function is where your application logic lives. It’s a simple piece of code that uses the AWS SDK for JavaScript in Node.js to call the Amazon S3 listBuckets API action and return the number of buckets.

'use strict';

var AWS = require('aws-sdk');
var s3 = new AWS.S3();

exports.handler = (event, context, callback) => {
	console.log("I am here! " + context.functionName  +  ":"  +  context.functionVersion);

	s3.listBuckets(function (err, data){
		if(err){
			console.log(err, err.stack);
			callback(null, {
				statusCode: 500,
				body: "Failed!"
			});
		}
		else{
			var allBuckets = data.Buckets;

			console.log("Total buckets: " + allBuckets.length);
			callback(null, {
				statusCode: 200,
				body: allBuckets.length
			});
		}
	});	
}

Review the key parts of the SAM template that defines returnS3Buckets:

  • The AutoPublishAlias attribute instructs SAM to automatically publish a new version of the Lambda function for each new deployment and link it to the live alias.
  • The Policies attribute specifies additional policy statements that SAM adds onto the automatically generated IAM role for this function. The first statement provides the function with permission to call listBuckets.
  • The DeploymentPreference attribute configures the type of rollout pattern to use. In this case, you are shifting traffic in a linear fashion, moving 10% of traffic every minute to the new version. For more information about supported patterns, see Serverless Application Model: Traffic Shifting Configurations.
  • The Hooks attribute specifies that you want to execute the preTrafficHook Lambda function before CodeDeploy automatically begins shifting traffic. This function should perform validation testing on the newly deployed Lambda version. This function invokes the new Lambda function and checks the results. If you’re satisfied with the tests, instruct CodeDeploy to proceed with the rollout via an API call to: codedeploy.putLifecycleEventHookExecutionStatus.
  • The Events attribute defines an API-based event source that can trigger this function. It accepts requests on the /test path using an HTTP GET method.

The following code implements the preTrafficHook function:
'use strict';

const AWS = require('aws-sdk');
const codedeploy = new AWS.CodeDeploy({apiVersion: '2014-10-06'});
var lambda = new AWS.Lambda();

exports.handler = (event, context, callback) => {

	console.log("Entering PreTraffic Hook!");
	
	// Read the DeploymentId & LifecycleEventHookExecutionId from the event payload
    var deploymentId = event.DeploymentId;
	var lifecycleEventHookExecutionId = event.LifecycleEventHookExecutionId;

	var functionToTest = process.env.NewVersion;
	console.log("Testing new function version: " + functionToTest);

	// Perform validation of the newly deployed Lambda version
	var lambdaParams = {
		FunctionName: functionToTest,
		InvocationType: "RequestResponse"
	};

	var lambdaResult = "Failed";
	lambda.invoke(lambdaParams, function(err, data) {
		if (err){	// an error occurred
			console.log(err, err.stack);
			lambdaResult = "Failed";
		}
		else{	// successful response
			var result = JSON.parse(data.Payload);
			console.log("Result: " +  JSON.stringify(result));

			// Check the response for valid results
			// The response will be a JSON payload with statusCode and body properties. ie:
			// {
			//		"statusCode": 200,
			//		"body": 51
			// }
			if(result.body == 9){	
				lambdaResult = "Succeeded";
				console.log ("Validation testing succeeded!");
			}
			else{
				lambdaResult = "Failed";
				console.log ("Validation testing failed!");
			}

			// Complete the PreTraffic Hook by sending CodeDeploy the validation status
			var params = {
				deploymentId: deploymentId,
				lifecycleEventHookExecutionId: lifecycleEventHookExecutionId,
				status: lambdaResult // status can be 'Succeeded' or 'Failed'
			};
			
			// Pass AWS CodeDeploy the prepared validation test results.
			codedeploy.putLifecycleEventHookExecutionStatus(params, function(err, data) {
				if (err) {
					// Validation failed.
					console.log('CodeDeploy Status update failed');
					console.log(err, err.stack);
					callback("CodeDeploy Status update failed");
				} else {
					// Validation succeeded.
					console.log('Codedeploy status updated successfully');
					callback(null, 'Codedeploy status updated successfully');
				}
			});
		}  
	});
}

The hook is hardcoded to check that the number of S3 buckets returned is 9.

Review the key parts of the SAM template that defines preTrafficHook:

  • The Policies attribute specifies additional policy statements that SAM adds onto the automatically generated IAM role for this function. The first statement provides permissions to call the CodeDeploy PutLifecycleEventHookExecutionStatus API action. The second statement provides permissions to invoke the specific version of the returnS3Buckets function to test.
  • This function has traffic shifting features disabled by setting Enabled to false under its DeploymentPreference.
  • The FunctionName attribute explicitly tells CloudFormation what to name the function. Otherwise, CloudFormation creates the function with the default naming convention: [stackName]-[FunctionName]-[uniqueID].  Name the function with the “CodeDeployHook_” prefix because the CodeDeployServiceRole role only allows InvokeFunction on functions named with that prefix.
  • Set the Timeout attribute to allow enough time to complete your validation tests.
  • Use an environment variable to inject the ARN of the newest deployed version of the returnS3Buckets function. The ARN allows the function to know the specific version to invoke and perform validation testing on.

Deploy the function

Your SAM template is all set and the code is written—you’re ready to deploy the function for the first time. Here’s how to do it via the SAM CLI. To use the AWS CLI instead, replace “sam” with “aws cloudformation”.

First, package the function. This command returns a CloudFormation importable file, packaged.yaml.

sam package --template-file template.yaml --s3-bucket mybucket --output-template-file packaged.yaml

Now deploy everything:

sam deploy --template-file packaged.yaml --stack-name mySafeDeployStack --capabilities CAPABILITY_IAM

At this point, both Lambda functions have been deployed within the CloudFormation stack mySafeDeployStack. The returnS3Buckets function has been deployed as version 1:

SAM automatically created a few things, including the CodeDeploy application, with the deployment pattern that you specified (Linear10PercentEvery1Minute). There is currently one deployment group, with no action, because no deployments have occurred. SAM also created the IAM service role that this CodeDeploy application uses:

There is a single managed policy attached to this role, which allows CodeDeploy to invoke any Lambda function that begins with “CodeDeployHook_”.

An API has been set up called safeDeployStack. It targets your Lambda function with the /test resource using the GET method. When you test the endpoint, API Gateway executes the returnS3Buckets function and it returns the number of S3 buckets that you own. In this case, it’s 51.

Publish a new Lambda function version

Now implement the requirements change, which is to make returnS3Buckets count only buckets that begin with the letter “a”. The code now looks like the following (see returnS3BucketsNew.js in GitHub):

'use strict';

var AWS = require('aws-sdk');
var s3 = new AWS.S3();

exports.handler = (event, context, callback) => {
	console.log("I am here! " + context.functionName  +  ":"  +  context.functionVersion);

	s3.listBuckets(function (err, data){
		if(err){
			console.log(err, err.stack);
			callback(null, {
				statusCode: 500,
				body: "Failed!"
			});
		}
		else{
			var allBuckets = data.Buckets;

			console.log("Total buckets: " + allBuckets.length);
			//callback(null, allBuckets.length);

			//  New Code begins here
			var counter=0;
			for(var i  in allBuckets){
				if(allBuckets[i].Name[0] === "a")
					counter++;
			}
			console.log("Total buckets starting with a: " + counter);

			callback(null, {
				statusCode: 200,
				body: counter
			});
			
		}
	});	
}

Repackage and redeploy with the same two commands as earlier:

sam package --template-file template.yaml --s3-bucket mybucket --output-template-file packaged.yaml

sam deploy --template-file packaged.yaml --stack-name mySafeDeployStack --capabilities CAPABILITY_IAM

CloudFormation understands that this is a stack update instead of an entirely new stack. You can see that reflected in the CloudFormation console:

During the update, CloudFormation deploys the new Lambda function as version 2 and adds it to the “live” alias. There is no traffic routing there yet. CodeDeploy now takes over to begin the safe deployment process.

The first thing CodeDeploy does is invoke the preTrafficHook function. Verify that this happened by reviewing the Lambda logs and metrics:

The function should progress successfully, invoke Version 2 of returnS3Buckets, and finally invoke the CodeDeploy API with a success code. After this occurs, CodeDeploy begins the predefined rollout strategy. Open the CodeDeploy console to review the deployment progress (Linear10PercentEvery1Minute):

Verify the traffic shift

During the deployment, verify that the traffic shift has started to occur by running the test periodically. As the deployment shifts towards the new version, a larger percentage of the responses return 9 instead of 51. These numbers correspond to the count of buckets starting with “a” (9) and the total bucket count (51).
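
If you would rather script this check than refresh the endpoint by hand, a rough sketch like the following (Node.js, with a placeholder invoke URL) tallies the responses over time; substitute the URL from your own API deployment.

'use strict';

// Rough sketch: poll the API Gateway endpoint and tally the responses by returned value.
// The invoke URL below is a placeholder; use the one from your own deployment.
var https = require('https');

var url = 'https://example1234.execute-api.us-east-1.amazonaws.com/Prod/test';
var counts = {};
var remaining = 20;

var timer = setInterval(function () {
  https.get(url, function (res) {
    var body = '';
    res.on('data', function (chunk) { body += chunk; });
    res.on('end', function () {
      counts[body] = (counts[body] || 0) + 1;
      console.log(counts); // for example { '51': 14, '9': 6 } as traffic shifts to version 2
      if (--remaining === 0) clearInterval(timer);
    });
  }).on('error', function (err) { console.log(err); });
}, 5000);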

A minute later, you see 10% more traffic shifting to the new version. The whole process takes 10 minutes to complete. After completion, open the Lambda console and verify that the “live” alias now points to version 2:

After 10 minutes, the deployment is complete and CodeDeploy signals success to CloudFormation and completes the stack update.

Check the results

If you invoke the function alias manually, you see the results of the new implementation.

aws lambda invoke --function-name [lambda arn to live alias] out.txt

You can also execute the prod stage of your API and verify the results by issuing an HTTP GET to the invoke URL:

Summary

This post has shown you how you can safely automate your Lambda deployments using the Lambda traffic shifting feature. You used the Serverless Application Model (SAM) to define your Lambda functions and configured CodeDeploy to manage your deployment patterns. Finally, you used CloudFormation to automate the deployment and updates to your function and PreTraffic hook.

Now that you know all about this new feature, you’re ready to begin automating Lambda deployments with confidence that things will work as designed. I look forward to hearing about what you’ve built with the AWS Serverless Platform.

[$] PostgreSQL’s fsync() surprise

Post Syndicated from corbet original https://lwn.net/Articles/752063/rss

Developers of database management systems are, by necessity, concerned
about getting data safely to persistent storage. So when the PostgreSQL
community found out that the way the kernel handles I/O errors could result
in data being lost without any errors being reported to user space, a fair
amount of unhappiness resulted. The problem, which is exacerbated by the
way PostgreSQL performs buffered I/O, turns out not to be unique to Linux,
and will not be easy to solve even there.

Achieving Major Stability and Performance Improvements in Yahoo Mail with a Novel Redux Architecture

Post Syndicated from mikesefanov original https://yahooeng.tumblr.com/post/173062946866


By Mohit Goenka, Gnanavel Shanmugam, and Lance Welsh

At Yahoo Mail, we’re constantly striving to upgrade our product experience. We do this not only by adding new features based on our members’ feedback, but also by providing the best technical solutions to power the most engaging experiences. As such, we’ve recently introduced a number of novel and unique revisions to the way in which we use Redux that have resulted in significant stability and performance improvements. Developers may find our methods useful in achieving similar results in their apps.

Improvements to product metrics

Last year Yahoo Mail implemented a brand new architecture using Redux. Since then, we have transformed the overall architecture to reduce latencies in various operations, reduce JavaScript exceptions, and better synchronize state. As a result, the product is much faster and more stable.

Stability improvements:

  • when checking for new emails – 20%
  • when reading emails – 30%
  • when sending emails – 20%

Performance improvements:

  • 10% improvement in page load performance
  • 40% improvement in frame rendering time

We have also reduced API calls by approximately 20%.

How we use Redux in Yahoo Mail

Redux architecture is reliant on one large store that represents the application state. In a Redux cycle, action creators dispatch actions to change the state of the store. React Components then respond to those state changes. We’ve made some modifications on top of this architecture that are atypical in the React-Redux community.

For instance, when fetching data over the network, the traditional methodology is to use Thunk middleware. Yahoo Mail fetches data over the network from our API. Thunks would create an unnecessary and undesirable dependency between the action creators and our API. If and when the API changes, the action creators must then also change. To keep these concerns separate we dispatch the action payload from the action creator to store them in the Redux state for later processing by “action syncers”. Action syncers use the payload information from the store to make requests to the API and process responses. In other words, the action syncers form an API layer by interacting with the store. An additional benefit to keeping the concerns separate is that the API layer can change as the backend changes, thereby preventing such changes from bubbling back up into the action creators and components. This also allowed us to optimize the API calls by batching, deduping, and processing the requests only when the network is available. We applied similar strategies for handling other side effects like route handling and instrumentation. Overall, action syncers helped us to reduce our API calls by ~20% and bring down API errors by 20-30%.
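
As a rough illustration of this idea, and not Yahoo Mail’s actual implementation, an action syncer can be a store subscriber that drains queued request payloads from the state and hands them to the API layer. The state slice, action types, and api object below are invented names.

'use strict';

// Illustrative action syncer: watches the store for queued request payloads,
// batches them into one API call, and dispatches the responses back into the store.
// The pendingRequests slice, the action types, and api.batchFetch are assumed names.
function createActionSyncer(store, api) {
  var inFlight = false;
  store.subscribe(function () {
    var pending = store.getState().pendingRequests;
    if (inFlight || !pending || pending.length === 0) {
      return;
    }
    inFlight = true;
    api.batchFetch(pending)                 // the API layer can batch, dedupe, and retry here
      .then(function (responses) {
        store.dispatch({ type: 'REQUESTS_SYNCED', payload: responses });
      })
      .catch(function (error) {
        store.dispatch({ type: 'REQUESTS_FAILED', payload: error });
      })
      .then(function () {
        inFlight = false;
      });
  });
}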

Another change to the normal Redux architecture was made to avoid unnecessary props. The React-Redux community has learned to avoid passing unnecessary props from high-level components through multiple layers down to lower-level components (prop drilling) for rendering. We have introduced action enhancers middleware to avoid passing additional unnecessary props that are purely used when dispatching actions. Action enhancers add data to the action payload so that data does not have to come from the component when dispatching the action. This saves the component from having to receive that data through props and has improved frame rendering by ~40%. The use of action enhancers also avoids writing utility functions to add commonly-used data to each action from action creators.
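
Here is a minimal sketch of what an action enhancer middleware can look like. Again, this is illustrative rather than Yahoo Mail’s production code; the enhancer map and the state fields it reads are assumptions.

'use strict';

// Illustrative action enhancer middleware: merges commonly needed data into the
// action before reducers see it, so components do not have to pass it down via props.
function createActionEnhancers(enhancers) {
  return function (store) {
    return function (next) {
      return function (action) {
        var enhance = enhancers[action.type];
        if (enhance) {
          return next(Object.assign({}, action, enhance(store.getState(), action)));
        }
        return next(action);
      };
    };
  };
}

// Example wiring with Redux's applyMiddleware (action type and state fields are made up):
// var enhancers = {
//   MESSAGE_SEND: function (state, action) {
//     return { payload: Object.assign({}, action.payload, { accountId: state.session.accountId }) };
//   }
// };
// var store = Redux.createStore(rootReducer, Redux.applyMiddleware(createActionEnhancers(enhancers)));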

In our new architecture, the store reducers accept the dispatched action via action enhancers to update the state. The store then updates the UI, completing the action cycle. Action syncers then initiate the call to the backend APIs to synchronize local changes.

Conclusion

Our novel use of Redux in Yahoo Mail has led to significant user-facing benefits through a more performant application. It has also reduced development cycles for new features due to its simplified architecture. We’re excited to share our work with the community and would love to hear from anyone interested in learning more.

Telegram Founder Pledges Millions in Bitcoin For VPNs and “Digital Resistance”

Post Syndicated from Andy original https://torrentfreak.com/telegram-founder-pledges-millions-in-bitcoin-for-vpns-and-digital-resistance-180418/

Starting yesterday, Russia went to war with free cross-platform messaging app Telegram. Authorities including the FSB wanted access to Telegram’s encryption keys, but the service refused to hand them over.

As a result, the service – which serviced 200,000,000 people in March alone – came under massive attack. Supported by a court ruling obtained last Friday, authorities ordered ISPs to block huge numbers of IP addresses in an effort to shut Telegram down.

Amazon and Google, whose services Telegram uses, were both hit with censorship measures, with around 1.8 million IP addresses belonging to the Internet giants blocked in an initial wave of action. But the government was just getting warmed up.

In an update posted by Pavel Durov to Twitter from Switzerland late last night, the Telegram founder confirmed that Russia had massively stepped up the fight against his encrypted messaging platform.

Of course, 15 million IP addresses is a huge volume, particularly since ‘just’ 14 million of Telegram’s users are located in Russia – that’s more than one IP address for each of them. As a result, there are reports of completely unrelated services being affected by the ban, which is to be expected given its widespread nature. But Russia doesn’t want to stop there.

According to Reuters, local telecoms watchdog Rozcomnadzor asked both Google and Apple [Update: and APKMirror] to remove Telegram from their app stores, to prevent local citizens from gaining access to the software itself. It is unclear whether either company intends to comply but as yet, neither has responded publicly nor taken any noticeable action.

An announcement from Durov last night thanked the companies for not complying with the Russian government’s demands, noting that the efforts so far had proven mostly futile.

“Despite the ban, we haven’t seen a significant drop in user engagement so far, since Russians tend to bypass the ban with VPNs and proxies. We also have been relying on third-party cloud services to remain partly available for our users there,” Durov wrote on Telegram.

“Thank you for your support and loyalty, Russian users of Telegram. Thank you, Apple, Google, Amazon, Microsoft – for not taking part in political censorship.”

Durov noted that Russia accounts for around 7% of Telegram’s userbase, a figure that could be compensated for with organic growth in just a couple of months, even if Telegram lost access to the entire market. However, the action only appears to have lit a fire under the serial entrepreneur, who now has declared a war of his own against censorship.

“To support internet freedoms in Russia and elsewhere I started giving out bitcoin grants to individuals and companies who run socks5 proxies and VPN,” Durov said.

“I am happy to donate millions of dollars this year to this cause, and hope that other people will follow. I called this Digital Resistance – a decentralized movement standing for digital freedoms and progress globally.”

As founder of not only Telegram but also vKontakte, Russia’s answer to Facebook, Durov is a force to be reckoned with. As such, his promises are unlikely to be hollow ones. While Russia has drawn a line in the sand on encryption, it appears to have energized Durov to take a stand, one that could have a positive effect on anti-censorship measures both in Russia and further afield.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Backblaze at NAB 2018 in Las Vegas

Post Syndicated from Roderick Bauer original https://www.backblaze.com/blog/backblaze-at-nab-2018-in-las-vegas/

Backblaze B2 Cloud Storage NAB Booth

Backblaze just returned from exhibiting at NAB in Las Vegas, April 9-12, where the response to our recent announcements was tremendous. In case you missed the news, Backblaze B2 Cloud Storage continues to extend its lead as the most affordable, high performance cloud on the planet.

Backblaze’s News at NAB

Backblaze at NAB 2018 in Las Vegas

The Backblaze booth just before opening

What We Were Asked at NAB

Our booth was busy from start to finish with attendees interested in learning more about Backblaze and B2 Cloud Storage. Here are the questions we were asked most often in the booth.

Q. How long has Backblaze been in business?
A. The company was founded in 2007. Today, we have over 500 petabytes of data from customers in over 150 countries.

B2 Partners at NAB 2018

Q. Where is your data stored?
A. We have data centers in California and Arizona and expect to expand to Europe by the end of the year.

Q. How can your services be so inexpensive?
A. Backblaze’s goal from the beginning was to offer cloud backup and storage that was easy to use and affordable. All the existing options were simply too expensive to be viable, so we created our own infrastructure. Our purpose-built storage system — the Backblaze Storage Pod — is recognized as one of the most cost-efficient storage platforms available.

Q. Tell me about your hardware.
A. Backblaze’s Storage Pods hold 60 HDDs each, containing as much as 720TB of data per pod, stored using Reed-Solomon error correction. Twenty Storage Pods make up a Vault, and corresponding drives across those pods are grouped into Tomes.

Q. Where do you fit in the data workflow?
A. People typically use B2 for archiving completed projects. All data is readily available for download from B2, making it more convenient than offline storage. In addition, DAM and MAM systems such as CatDV, axle ai, Cantemo, and others have integrated with B2 to store raw images behind the proxies.

Q. Who uses B2 in the M&E business?
A. KLRU-TV, the PBS station in Austin, Texas, uses B2 to archive its entire 43-year catalog of Austin City Limits episodes and related materials. WunderVu, the production house for Pixvana, uses B2 to back up and archive the local storage systems on which they build virtual reality experiences for their customers.

Q. You’re the company that publishes the hard drive stats, right?
A. Yes, we are!

Backblaze Case Studies and Swag at NAB 2018 in Las Vegas

Were You at NAB?

If you were, we hope you stopped by the Backblaze booth to say hello. We’d like to hear what you saw at the show that was interesting or exciting. Please tell us in the comments.

The post Backblaze at NAB 2018 in Las Vegas appeared first on Backblaze Blog | Cloud Storage & Cloud Backup.

Russia’s Encryption War: 1.8m Google & Amazon IPs Blocked to Silence Telegram

Post Syndicated from Andy original https://torrentfreak.com/russias-encryption-war-1-8m-google-amazon-ips-blocked-to-silence-telegram-180417/

The rules in Russia are clear. Entities operating an encrypted messaging service need to register with the authorities. They also need to hand over their encryption keys so that if law enforcement sees fit, users can be spied on.

Free cross-platform messaging app Telegram isn’t playing ball. An impressive 200,000,000 people used the software in March (including a growing number for piracy purposes) and founder Pavel Durov says he will not compromise their security, despite losing a lawsuit against the Federal Security Service which compels him to do so.

“Telegram doesn’t have shareholders or advertisers to report to. We don’t do deals with marketers, data miners or government agencies. Since the day we launched in August 2013 we haven’t disclosed a single byte of our users’ private data to third parties,” Durov said.

“Above all, we at Telegram believe in people. We believe that humans are inherently intelligent and benevolent beings that deserve to be trusted; trusted with freedom to share their thoughts, freedom to communicate privately, freedom to create tools. This philosophy defines everything we do.”

But by not handing over its keys, Telegram is in trouble with Russia. The FSB says it needs access to Telegram messages to combat terrorism so, in response to its non-compliance, telecoms watchdog Rozcomnadzor filed a lawsuit to degrade Telegram via web-blocking. Last Friday, that process ended in the state’s favor.

After an 18-minute hearing, a Moscow court gave the go-ahead for Telegram to be banned in Russia. The hearing was scheduled just the day before, giving Telegram little time to prepare. In protest, its lawyers didn’t even turn up to argue the company’s position.

Instead, Durov took to his VKontakte account to announce that Telegram would take counter-measures.

“Telegram will use built-in methods to bypass blocks, which do not require actions from users, but 100% availability of the service without a VPN is not guaranteed,” Durov wrote.

Telegram can appeal the blocking decision but Russian authorities aren’t waiting around for a response. They are clearly prepared to match Durov’s efforts, no matter what the cost.

In instructions sent out yesterday nationwide, Rozcomnadzor ordered ISPs to block Telegram. The response was immediate and massive. Telegram was using both Amazon and Google to provide service to its users so, within hours, huge numbers of IP addresses belonging to both companies were targeted.

Initially, 655,352 Amazon IP addresses were placed on Russia’s nationwide blacklist. It was later reported that a further 131,000 IP addresses were added to that total. But the Russians were just getting started.

Servers.ru reports that a further 1,048,574 IP addresses belonging to Google were also targeted Monday. Rozcomnadzor said the court ruling against Telegram compelled it to take whatever action is needed to take Telegram down but with at least 1,834,996 addresses now confirmed blocked, it remains unclear what effect it’s had on the service.

Friday’s court ruling states that restrictions against Telegram can be lifted provided that the service hands over its encryption keys to the FSB. However, Durov responded by insisting that “confidentiality is not for sale, and human rights should not be compromised because of fear or greed.”

But of course, money is still part of the Telegram equation. While its business model in terms of privacy stands in stark contrast to that of Facebook, Telegram is also involved in the world’s biggest initial coin offering (ICO). According to media reports, it has raised $1.7 billion in pre-sales thus far.

This week’s action against Telegram is the latest in Russia’s war on ‘unauthorized’ encryption.

At the end of March, authorities suggested that around 15 million IP addresses (13.5 million belonging to Amazon) could be blocked to target chat software Zello. While those measures were averted, a further 500 domains belonging to Google were caught in the dragnet.

Source: TF, for the latest info on copyright, file-sharing, torrent sites and more. We also have VPN reviews, discounts, offers and coupons.

Notes on setting up Raspberry Pi 3 as WiFi hotspot

Post Syndicated from Robert Graham original https://blog.erratasec.com/2018/04/notes-on-setting-up-raspberry-pi-3-as.html

I want to sniff the packets for IoT devices. There are a number of ways of doing this, but one straightforward mechanism is configuring a “Raspberry Pi 3 B” as a WiFi hotspot, then running tcpdump on it to record all the packets that pass through it. Google gives lots of results on how to do this, but they all demand that you have the precise hardware, WiFi hardware, and software that the authors do, so that’s a pain.

I got it working using the instructions here. There are a few additional notes, which is why I’m writing this blogpost, so I remember them.
https://www.raspberrypi.org/documentation/configuration/wireless/access-point.md

I’m using the RPi-3-B and not the RPi-3-B+, and the latest version of Raspbian at the time of this writing, “Raspbian Stretch Lite 2018-3-13”.

Some things didn’t work as described. The first is that it couldn’t find the package “hostapd”. The solution was to run “apt-get update” a second time.

The second problem was an error message about NAT not working when trying to set the masquerade rule. That’s because the ‘upgrade’ updates the kernel, making the running system out-of-date with the files on the disk. The solution to that is to make sure you reboot after upgrading.

Thus, what you do at the start is:

apt-get update
apt-get upgrade
apt-get update
shutdown -r now

Then it’s just “apt-get install tcpdump” and start capturing on wlan0. This will get the non-monitor-mode Ethernet frames, which is what I want.

How to retain system tables’ data spanning multiple Amazon Redshift clusters and run cross-cluster diagnostic queries

Post Syndicated from Karthik Sonti original https://aws.amazon.com/blogs/big-data/how-to-retain-system-tables-data-spanning-multiple-amazon-redshift-clusters-and-run-cross-cluster-diagnostic-queries/

Amazon Redshift is a data warehouse service that logs the history of the system in STL log tables. The STL log tables manage disk space by retaining only two to five days of log history, depending on log usage and available disk space.

To retain STL tables’ data for an extended period, you usually have to create a replica table for every system table. Then, for each table, you load the data from the system table into the replica at regular intervals. By maintaining replica tables for STL tables, you can run diagnostic queries on historical data from the STL tables. You then can derive insights from query execution times, query plans, and disk-spill patterns, and make better cluster-sizing decisions. However, refreshing replica tables with live data from STL tables at regular intervals requires schedulers such as Cron or AWS Data Pipeline. Also, these tables are specific to one cluster and they are not accessible after the cluster is terminated. This is especially true for transient Amazon Redshift clusters that last for only a finite period of ad hoc query execution.

In this blog post, I present a solution that exports system tables from multiple Amazon Redshift clusters into an Amazon S3 bucket. This solution is serverless, and you can schedule it as frequently as every five minutes. The AWS CloudFormation deployment template that I provide automates the solution setup in your environment. The system tables’ data in the Amazon S3 bucket is partitioned by cluster name and query execution date to enable efficient joins in cross-cluster diagnostic queries.

I also provide another CloudFormation template later in this post. This second template helps to automate the creation of tables in the AWS Glue Data Catalog for the system tables’ data stored in Amazon S3. After the system tables are exported to Amazon S3, you can run cross-cluster diagnostic queries on the system tables’ data and derive insights about query executions in each Amazon Redshift cluster. You can do this using Amazon QuickSight, Amazon Athena, Amazon EMR, or Amazon Redshift Spectrum.

You can find all the code examples in this post, including the CloudFormation templates, AWS Glue extract, transform, and load (ETL) scripts, and the resolution steps for common errors you might encounter in this GitHub repository.

Solution overview

The solution in this post uses AWS Glue to export system tables’ log data from Amazon Redshift clusters into Amazon S3. The AWS Glue ETL jobs are invoked at a scheduled interval by AWS Lambda. AWS Systems Manager, which provides secure, hierarchical storage for configuration data management and secrets management, maintains the details of Amazon Redshift clusters for which the solution is enabled. The last-fetched time stamp values for the respective cluster-table combination are maintained in an Amazon DynamoDB table.

The following diagram covers the key steps involved in this solution.

The solution as illustrated in the preceding diagram flows like this:

  1. The Lambda function, invoke_rs_stl_export_etl, is triggered at regular intervals, as controlled by Amazon CloudWatch. It’s triggered to look up the AWS Systems Manager parameter store to get the details of the Amazon Redshift clusters for which the system table export is enabled.
  2. The same Lambda function, based on the Amazon Redshift cluster details obtained in step 1, invokes the AWS Glue ETL job designated for the Amazon Redshift cluster. If an ETL job for the cluster is not found, the Lambda function creates one (a simplified sketch of this function follows this list).
  3. The ETL job invoked for the Amazon Redshift cluster gets the cluster credentials from the parameter store. It gets from the DynamoDB table the last exported time stamp of when each of the system tables was exported from the respective Amazon Redshift cluster.
  4. The ETL job unloads the system tables’ data from the Amazon Redshift cluster into an Amazon S3 bucket.
  5. The ETL job updates the DynamoDB table with the last exported time stamp value for each system table exported from the Amazon Redshift cluster.
  6. The Amazon Redshift cluster system tables’ data is available in Amazon S3 and is partitioned by cluster name and date for running cross-cluster diagnostic queries.
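
To make steps 1 and 2 concrete, here is a simplified sketch of what the invoke_rs_stl_export_etl function could look like if written for the Node.js Lambda runtime. It omits the create-if-missing logic, and the parameter name and job-naming convention are assumptions for illustration; the actual implementation lives in the GitHub repository linked earlier.

'use strict';

// Simplified sketch of the invoke_rs_stl_export_etl Lambda function (Node.js runtime assumed).
// It reads the enabled cluster list from the Systems Manager parameter store and starts
// one AWS Glue ETL job per cluster. The parameter name and job names are assumptions.
var AWS = require('aws-sdk');
var ssm = new AWS.SSM();
var glue = new AWS.Glue();

exports.handler = function (event, context, callback) {
  ssm.getParameter({ Name: 'redshift_query_logs.global.enabled_cluster_list' }, function (err, data) {
    if (err) return callback(err);

    var clusters = data.Parameter.Value.split(',');
    var finished = 0;

    clusters.forEach(function (cluster) {
      // Assumed convention: one Glue ETL job per cluster, named after the cluster.
      glue.startJobRun({ JobName: 'extract_rs_query_logs_' + cluster.trim() }, function (err, data) {
        if (err) {
          console.log('Failed to start job for ' + cluster + ': ' + err);
        } else {
          console.log('Started job run ' + data.JobRunId + ' for ' + cluster);
        }
        if (++finished === clusters.length) {
          callback(null, 'Triggered export jobs for ' + clusters.length + ' clusters');
        }
      });
    });
  });
};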

Understanding the configuration data

This solution uses AWS Systems Manager parameter store to store the Amazon Redshift cluster credentials securely. The parameter store also securely stores other configuration information that the AWS Glue ETL job needs for extracting and storing system tables’ data in Amazon S3. Systems Manager comes with a default AWS Key Management Service (AWS KMS) key that it uses to encrypt the password component of the Amazon Redshift cluster credentials.

The following table explains the global parameters and cluster-specific parameters required in this solution. The global parameters are defined once and applicable at the overall solution level. The cluster-specific parameters are specific to an Amazon Redshift cluster and repeat for each cluster for which you enable this post’s solution. The CloudFormation template explained later in this post creates these parameters as part of the deployment process.

Parameter name Type Description
Global parameters (defined once and applied to all jobs)
redshift_query_logs.global.s3_prefix String The Amazon S3 path where the query logs are exported. Under this path, each exported table is partitioned by cluster name and date.
redshift_query_logs.global.tempdir String The Amazon S3 path that AWS Glue ETL jobs use for temporarily staging the data.
redshift_query_logs.global.role String The name of the role that the AWS Glue ETL jobs assume. Just the role name is sufficient. The complete Amazon Resource Name (ARN) is not required.
redshift_query_logs.global.enabled_cluster_list StringList A comma-separated list of cluster names for which system tables’ data export is enabled. This gives flexibility for a user to exclude certain clusters.
Cluster-specific parameters (for each cluster specified in the enabled_cluster_list parameter)
redshift_query_logs.<<cluster_name>>.connection String The name of the AWS Glue Data Catalog connection to the Amazon Redshift cluster. For example, if the cluster name is product_warehouse, the entry is redshift_query_logs.product_warehouse.connection.
redshift_query_logs.<<cluster_name>>.user String The user name that AWS Glue uses to connect to the Amazon Redshift cluster.
redshift_query_logs.<<cluster_name>>.password Secure String The password that AWS Glue uses to connect to the Amazon Redshift cluster, stored as a secure string encrypted by the key managed in AWS KMS.

For example, suppose that you have two Amazon Redshift clusters, product-warehouse and category-management, for which the solution described in this post is enabled. In this case, the parameters shown in the following screenshot are created by the solution deployment CloudFormation template in the AWS Systems Manager parameter store.

Solution deployment

To make it easier for you to get started, I created a CloudFormation template that automatically configures and deploys the solution—only one step is required after deployment.

Prerequisites

To deploy the solution, you must have one or more Amazon Redshift clusters in a private subnet. This subnet must have a network address translation (NAT) gateway or a NAT instance configured, and also a security group with a self-referencing inbound rule for all TCP ports. For more information about why AWS Glue ETL needs the configuration it does, described previously, see Connecting to a JDBC Data Store in a VPC in the AWS Glue documentation.

To start the deployment, launch the CloudFormation template:

CloudFormation stack parameters

The following table lists and describes the parameters for deploying the solution to export query logs from multiple Amazon Redshift clusters.

Property Default Description
S3Bucket mybucket The bucket this solution uses to store the exported query logs, stage code artifacts, and perform unloads from Amazon Redshift. For example, the mybucket/extract_rs_logs/data path is used for storing all the exported query logs for each system table, partitioned by cluster. The mybucket/extract_rs_logs/temp/ path is used for temporarily staging the unloaded data from Amazon Redshift. The mybucket/extract_rs_logs/code path is used for storing all the code artifacts required for Lambda and the AWS Glue ETL jobs.
ExportEnabledRedshiftClusters Requires Input A comma-separated list of cluster names from which the system table logs need to be exported.
DataStoreSecurityGroups Requires Input A list of security groups with an inbound rule to the Amazon Redshift clusters provided in the parameter, ExportEnabledRedshiftClusters. These security groups should also have a self-referencing inbound rule on all TCP ports, as explained in Connecting to a JDBC Data Store in a VPC.

After you launch the template and create the stack, you see that the following resources have been created:

  1. AWS Glue connections for each Amazon Redshift cluster you provided in the CloudFormation stack parameter, ExportEnabledRedshiftClusters.
  2. All parameters required for this solution created in the parameter store.
  3. The Lambda function that invokes the AWS Glue ETL jobs for each configured Amazon Redshift cluster at a regular interval of five minutes.
  4. The DynamoDB table that captures the last exported time stamps for each exported cluster-table combination.
  5. The AWS Glue ETL jobs to export query logs from each Amazon Redshift cluster provided in the CloudFormation stack parameter, ExportEnabledRedshiftClusters.
  6. The IAM roles and policies required for the Lambda function and AWS Glue ETL jobs.

After the deployment

For each Amazon Redshift cluster for which you enabled the solution through the CloudFormation stack parameter, ExportEnabledRedshiftClusters, the automated deployment creates placeholder credentials that you must update after the deployment:

  1. Go to the parameter store.
  2. Note the parameters redshift_query_logs.<<cluster_name>>.user and redshift_query_logs.<<cluster_name>>.password that correspond to each Amazon Redshift cluster for which you enabled this solution. Edit these parameters to replace the placeholder values with the right credentials.

For example, if product-warehouse is one of the clusters for which you enabled system table export, you edit these two parameters with the right user name and password and choose Save parameter.

Querying the exported system tables

Within a few minutes after the solution deployment, you should see Amazon Redshift query logs being exported to the Amazon S3 location, <<S3Bucket_you_provided>>/extract_redshift_query_logs/data/. In that bucket, you should see the eight system tables partitioned by cluster name and date: stl_alert_event_log, stl_ddltext, stl_explain, stl_query, stl_querytext, stl_scan, stl_utilitytext, and stl_wlm_query.

To run cross-cluster diagnostic queries on the exported system tables, create external tables in the AWS Glue Data Catalog. To make it easier for you to get started, I provide a CloudFormation template that creates an AWS Glue crawler, which crawls the exported system tables stored in Amazon S3 and builds the external tables in the AWS Glue Data Catalog.

Launch this CloudFormation template to create external tables that correspond to the Amazon Redshift system tables. S3Bucket is the only input parameter required for this stack deployment. Provide the same Amazon S3 bucket name where the system tables’ data is being exported. After you successfully create the stack, you can see the eight tables in the database, redshift_query_logs_db, as shown in the following screenshot.

Now, navigate to the Athena console to run cross-cluster diagnostic queries. The following screenshot shows a diagnostic query executed in Athena that retrieves query alerts logged across multiple Amazon Redshift clusters.
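
If you prefer to run such diagnostic queries programmatically, the sketch below shows one way to submit a query to Athena with boto3. The database name redshift_query_logs_db comes from the crawler stack above; the column names, the date value, and the S3 output location are illustrative assumptions rather than part of the original solution.

import boto3

athena = boto3.client('athena', region_name='us-east-1')

# Illustrative query: count alert events per cluster for one day.
# Column names are assumptions based on the partitioning described above (cluster name and date).
query = """
SELECT cluster, COUNT(*) AS alert_count
FROM stl_alert_event_log
WHERE log_date = '2018-04-01'
GROUP BY cluster
ORDER BY alert_count DESC
"""

response = athena.start_query_execution(
    QueryString=query,
    QueryExecutionContext={'Database': 'redshift_query_logs_db'},
    ResultConfiguration={'OutputLocation': 's3://mybucket/athena-query-results/'}  # assumed staging location
)
print(response['QueryExecutionId'])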

You can build the following example Amazon QuickSight dashboard by running cross-cluster diagnostic queries on Athena to identify the hourly query count and the key query alert events across multiple Amazon Redshift clusters.

How to extend the solution

You can extend this post’s solution in two ways:

  • Add any new Amazon Redshift clusters that you spin up after you deploy the solution.
  • Add other system tables or custom query results to the list of exports from an Amazon Redshift cluster.

Extend the solution to other Amazon Redshift clusters

To extend the solution to more Amazon Redshift clusters, add the three cluster-specific parameters in the AWS Systems Manager parameter store following the guidelines earlier in this post. Modify the redshift_query_logs.global.enabled_cluster_list parameter to append the new cluster to the comma-separated string.
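
For example, if you later spin up a cluster named sales-warehouse, the sketch below shows how its parameters might be added with boto3. The parameter names follow the convention described earlier; the connection name, user name, and placeholder password are assumptions that you would replace with your own values.

import boto3

ssm = boto3.client('ssm', region_name='us-east-1')

# Cluster-specific parameters for a hypothetical new cluster, sales-warehouse.
ssm.put_parameter(Name='redshift_query_logs.sales-warehouse.connection',
                  Value='sales-warehouse-glue-connection', Type='String')
ssm.put_parameter(Name='redshift_query_logs.sales-warehouse.user',
                  Value='glue_user', Type='String')
ssm.put_parameter(Name='redshift_query_logs.sales-warehouse.password',
                  Value='replace-with-the-real-password', Type='SecureString')

# Append the new cluster to the global enabled list.
current = ssm.get_parameter(Name='redshift_query_logs.global.enabled_cluster_list')['Parameter']['Value']
ssm.put_parameter(Name='redshift_query_logs.global.enabled_cluster_list',
                  Value=current + ',sales-warehouse', Type='StringList', Overwrite=True)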

Extend the solution to add other tables or custom queries to an Amazon Redshift cluster

The current solution ships with the export functionality for the following Amazon Redshift system tables:

  • stl_alert_event_log
  • stl_ddltext
  • stl_explain
  • stl_query
  • stl_querytext
  • stl_scan
  • stl_utilitytext
  • stl_wlm_query

You can easily add another system table or custom query by adding a few lines of code to the AWS Glue ETL job, <<cluster-name>>_extract_rs_query_logs. For example, suppose that from the product-warehouse Amazon Redshift cluster you want to export orders greater than $2,000. To do so, add the following five lines of code to the AWS Glue ETL job product-warehouse_extract_rs_query_logs, where product-warehouse is your cluster name:

  1. Get the last-processed time-stamp value. The function creates a value if it doesn’t already exist.

salesLastProcessTSValue = functions.getLastProcessedTSValue(trackingEntry="mydb.sales_2000", job_configs=job_configs)

  2. Run the custom query with the time stamp.

returnDF = functions.runQuery(query="select * from sales s join order o where o.order_amnt > 2000 and sale_timestamp > '{}'".format(salesLastProcessTSValue), tableName="mydb.sales_2000", job_configs=job_configs)

  3. Save the results to Amazon S3.

functions.saveToS3(dataframe=returnDF, s3Prefix=s3Prefix, tableName="mydb.sales_2000", partitionColumns=["sale_date"], job_configs=job_configs)

  4. Get the latest time-stamp value from the returned data frame in Step 2.

latestTimestampVal = functions.getMaxValue(returnDF, "sale_timestamp", job_configs)

  5. Update the last-processed time-stamp value in the DynamoDB table.

functions.updateLastProcessedTSValue("mydb.sales_2000", latestTimestampVal[0], job_configs)

Conclusion

In this post, I demonstrate a serverless solution to retain the system tables’ log data across multiple Amazon Redshift clusters. By using this solution, you can incrementally export the data from system tables into Amazon S3. By performing this export, you can build cross-cluster diagnostic queries, build audit dashboards, and derive insights into capacity planning by using services such as Athena. I also demonstrate how you can extend this solution to other ad hoc query use cases or tables other than system tables by adding a few lines of code.


Additional Reading

If you found this post useful, be sure to check out Using Amazon Redshift Spectrum, Amazon Athena, and AWS Glue with Node.js in Production and Amazon Redshift – 2017 Recap.


About the Author

Karthik Sonti is a senior big data architect at Amazon Web Services. He helps AWS customers build big data and analytical solutions and provides guidance on architecture and best practices.


Build a house in Minecraft using Python

Post Syndicated from Rob Zwetsloot original https://www.raspberrypi.org/blog/build-minecraft-house-using-python/

In this tutorial from The MagPi issue 68, Steve Martin takes us through the process of house-building in Minecraft Pi. Get your copy of The MagPi in stores now, or download it as a free PDF here.

Minecraft Pi is provided for free as part of the Raspbian operating system. To start your Minecraft: Pi Edition adventures, try our free tutorial Getting started with Minecraft.

Minecraft Raspberry Pi

Writing programs that create things in Minecraft is not only a great way to learn how to code, but it also means that you have a program that you can run again and again to make as many copies of your Minecraft design as you want. You never need to worry about your creation being destroyed by your brother or sister ever again — simply rerun your program and get it back! Whilst it might take a little longer to write the program than to build one house, once it’s finished you can build as many houses as you want.

Co-ordinates in Minecraft

Let’s start with a review of the coordinate system that Minecraft uses to know where to place blocks. If you are already familiar with this, you can skip to the next section. Otherwise, read on.

Minecraft Raspberry Pi Edition

Plan view of our house design

Minecraft shows us a three-dimensional (3D) view of the world. Imagine that the room you are in is the Minecraft world and you want to describe your location within that room. You can do so with three numbers, as follows:

  • How far across the room are you? As you move from side to side, you change this number. We can consider this value to be our X coordinate.
  • How high off the ground are you? If you are upstairs, or if you jump, this value increases. We can consider this value to be our Y coordinate.
  • How far into the room are you? As you walk forwards or backwards, you change this number. We can consider this value to be our Z coordinate.

You might have done graphs in school with X going across the page and Y going up the page. Coordinates in Minecraft are very similar, except that we have an extra value, Z, for our third dimension. Don’t worry if this still seems a little confusing: once we start to build our house, you will see how these three dimensions work in Minecraft.

Designing our house

It is a good idea to start with a rough design for our house. This will help us to work out the values for the coordinates when we are adding doors and windows to our house. You don’t have to plan every detail of your house right away. It is always fun to enhance it once you have got the basic design written. The image above shows the plan view of the house design that we will be creating in this tutorial. Note that because this is a plan view, it only shows the X and Z co-ordinates; we can’t see how high anything is. Hopefully, you can imagine the house extending up from the screen.

We will build our house close to where the Minecraft player is standing. This is a good idea when creating something in Minecraft with Python, as it saves us from having to walk around the Minecraft world to try to find our creation.

Starting our program

Type in the code as you work through this tutorial. You can use any editor you like; we would suggest either Python 3 (IDLE) or Thonny Python IDE, both of which you can find on the Raspberry Pi menu under Programming. Start by selecting the File menu and creating a new file. Save the file with a name of your choice; it must end with .py so that the Raspberry Pi knows that it is a Python program.

It is important to enter the code exactly as it is shown in the listing. Pay particular attention to both the spelling and capitalisation (upper- or lower-case letters) used. You may find that when you run your program the first time, it doesn’t work. This is very common and just means there’s a small error somewhere. The error message will give you a clue about where the error is.

It is good practice to start all of your Python programs with the first line shown in our listing. All other lines that start with a # are comments. These are ignored by Python, but they are a good way to remind us what the program is doing.

The two lines starting with from tell Python about the Minecraft API; this is a code library that our program will be using to talk to Minecraft. The line starting mc = creates a connection between our Python program and the game. Then we get the player’s location broken down into three variables: x, y, and z.
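
The listing itself is not reproduced here, but a minimal sketch of that opening section, assuming the standard mcpi library that ships with Raspbian, might look like this:

# Connect to Minecraft and find out where the player is standing
from mcpi.minecraft import Minecraft
from mcpi import block

mc = Minecraft.create()

# Break the player's position down into x, y and z
pos = mc.player.getTilePos()
x, y, z = pos.x, pos.y, pos.z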

Building the shell of our house

To help us build our house, we define three variables that specify its width, height, and depth. Defining these variables makes it easy for us to change the size of our house later; it also makes the code easier to understand when we are setting the co-ordinates of the Minecraft bricks. For now, we suggest that you use the same values that we have; you can go back and change them once the house is complete and you want to alter its design.

It’s now time to start placing some bricks. We create the shell of our house with just two lines of code! These lines of code each use the setBlocks command to create a complete block of bricks. This function takes the following arguments:

setBlocks(x1, y1, z1, x2, y2, z2, block-id, data)

x1, y1, and z1 are the coordinates of one corner of the block of bricks that we want to create; x2, y2, and z2 are the coordinates of the other corner. The block-id is the type of block that we want to use. Some blocks require another value called data; we will see this being used later, but you can ignore it for now.

We have to work out the values that we need to use in place of x1, y1, z1, x2, y2, z2 for our walls. Note that what we want is a larger outer block made of bricks that is filled with a slightly smaller block of air blocks. Yes, in Minecraft even air is actually just another type of block.
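
As a rough sketch of how those two lines might look (the sizes and offsets here are assumptions; follow the values in the printed listing):

# House dimensions
width = 10
height = 5
depth = 8

# Outer shell made of bricks...
mc.setBlocks(x, y, z, x + width, y + height, z + depth, block.BRICK_BLOCK.id)

# ...hollowed out by filling the inside with a slightly smaller block of air
mc.setBlocks(x + 1, y, z + 1, x + width - 1, y + height - 1, z + depth - 1, block.AIR.id)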

Once you have typed in the two lines that create the shell of your house, you are almost ready to run your program. Before doing so, you must have Minecraft running and displaying the contents of your world. Do not have a world loaded with things that you have created, as they may get destroyed by the house that we are building. Go to a clear area in the Minecraft world before running the program. When you run your program, check for any errors in the ‘console’ window and fix them, running the code again until you’ve corrected all the errors.

You should see a block of bricks now, as shown above. You may have to turn the player around in the Minecraft world before you can see your house.

Adding the floor and door

Now, let’s make our house a bit more interesting! Add the lines for the floor and door. Note that the floor extends beyond the boundary of the wall of the house; can you see how we achieve this?

Hint: look closely at how we calculate the x and z attributes as compared to when we created the house shell above. Also note that we use a value of y-1 to create the floor below our feet.

Minecraft doors are two blocks high, so we have to create them in two parts. This is where we have to use the data argument. A value of 0 is used for the lower half of the door, and a value of 8 is used for the upper half (the part with the windows in it). These values will create an open door. If we add 4 to each of these values, a closed door will be created.
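
A hedged sketch of those lines, assuming wood planks for the floor and a wooden door in the middle of the front wall (the exact positions in the printed listing may differ):

# Floor: one block below our feet, extending a little beyond the walls
mc.setBlocks(x - 1, y - 1, z - 1, x + width + 1, y - 1, z + depth + 1, block.WOOD_PLANKS.id)

# Door: two blocks high, built in two parts using the data argument
door_x = x + width // 2
mc.setBlock(door_x, y, z, block.DOOR_WOOD.id, 0)      # lower half, data 0 = open
mc.setBlock(door_x, y + 1, z, block.DOOR_WOOD.id, 8)  # upper half, data 8 = open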

Before you run your program again, move to a new location in Minecraft to build the house away from the previous one. Then run it to check that the floor and door are created; you will need to fix any errors again. Even if your program runs without errors, check that the floor and door are positioned correctly. If they aren’t, then you will need to check that the arguments to setBlock and setBlocks are exactly as shown in the listing.

Adding windows

Hopefully you will agree that your house is beginning to take shape! Now let’s add some windows. Looking at the plan for our house, we can see that there is a window on each side; see if you can follow along. Add the four lines of code, one for each window.
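
One possible sketch of those four lines, assuming a single glass block roughly in the middle of each wall (positions are assumptions, not the published listing):

# One glass window in the middle of each wall
mc.setBlock(x + width // 2, y + 2, z, block.GLASS.id)           # front
mc.setBlock(x + width // 2, y + 2, z + depth, block.GLASS.id)   # back
mc.setBlock(x, y + 2, z + depth // 2, block.GLASS.id)           # left
mc.setBlock(x + width, y + 2, z + depth // 2, block.GLASS.id)   # right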

Now you can move to yet another location and run the program again; you should have a window on each side of the house. Our house is starting to look pretty good!

Adding a roof

The final stage is to add a roof to the house. To do this we are going to use wooden stairs. We will do this inside a loop so that if you change the width of your house, more layers are added to the roof. Enter the rest of the code. Be careful with the indentation: I recommend using spaces and avoiding the use of tabs. After the if statement, you need to indent the code even further. Each indentation level needs four spaces, so below the line with if on it, you will need eight spaces.

Since some of these code lines are lengthy and indented a lot, you may well find that the text wraps around as you reach the right-hand side of your editor window — don’t worry about this. You will have to be careful to get those indents right, however.
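
As a hedged sketch of that loop, stepping rows of wooden stairs inwards from both sides until they meet (the stair data values for orientation, and the ridge handling, are assumptions to check against the printed listing):

# Roof: rows of wooden stairs that climb towards the middle of the house
for i in range(width // 2 + 1):
    # Left-hand slope, stairs facing one way
    mc.setBlocks(x + i, y + height + i, z, x + i, y + height + i, z + depth,
                 block.STAIRS_WOOD.id, 0)
    # Right-hand slope, stairs facing the other way
    mc.setBlocks(x + width - i, y + height + i, z, x + width - i, y + height + i, z + depth,
                 block.STAIRS_WOOD.id, 1)
    if i == width // 2:
        # Cap the ridge with planks where the two slopes meet
        mc.setBlocks(x + i, y + height + i, z, x + width - i, y + height + i, z + depth,
                     block.WOOD_PLANKS.id)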

Now move somewhere new in your world and run the complete program. Iron out any last bugs, then admire your house! Does it look how you expect? Can you make it better?

Customising your house

Now you can start to customise your house. It is a good idea to use Save As in the menu to save a new version of your program. Then you can keep different designs, or refer back to your previous program if you get to a point where you don’t understand why your new one doesn’t work.

Consider these changes:

  • Change the size of your house. Are you able also to move the door and windows so they stay in proportion?
  • Change the materials used for the house. An ice house placed in an area of snow would look really cool!
  • Add a back door to your house. Or make the front door a double-width door!

We hope that you have enjoyed writing this program to build a house. Now you can easily add a house to your Minecraft world whenever you want to by simply running this program.

Get the complete code for this project here.

Continue your Minecraft journey

Minecraft Pi’s programmable interface is an ideal platform for learning Python. If you’d like to try more of our free tutorials, check out:

You may also enjoy Martin O’Hanlon’s and David Whale’s Adventures in Minecraft, and the Hacking and Making in Minecraft MagPi Essentials guide, which you can download for free or buy in print here.

The post Build a house in Minecraft using Python appeared first on Raspberry Pi.

Cybersecurity Insurance

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/04/cybersecurity_i_1.html

Good article about how difficult it is to insure an organization against Internet attacks, and how expensive the insurance is.

Companies like retailers, banks, and healthcare providers began seeking out cyberinsurance in the early 2000s, when states first passed data breach notification laws. But even with 20 years’ worth of experience and claims data in cyberinsurance, underwriters still struggle with how to model and quantify a unique type of risk.

“Typically in insurance we use the past as prediction for the future, and in cyber that’s very difficult to do because no two incidents are alike,” said Lori Bailey, global head of cyberrisk for the Zurich Insurance Group. Twenty years ago, policies dealt primarily with data breaches and third-party liability coverage, like the costs associated with breach class-action lawsuits or settlements. But more recent policies tend to accommodate first-party liability coverage, including costs like online extortion payments, renting temporary facilities during an attack, and lost business due to systems failures, cloud or web hosting provider outages, or even IT configuration errors.

In my new book — out in September — I write:

There are challenges to creating these new insurance products. There are two basic models for insurance. There’s the fire model, where individual houses catch on fire at a fairly steady rate, and the insurance industry can calculate premiums based on that rate. And there’s the flood model, where an infrequent large-scale event affects large numbers of people — but again at a fairly steady rate. Internet+ insurance is complicated because it follows neither of those models but instead has aspects of both: individuals are hacked at a steady (albeit increasing) rate, while class breaks and massive data breaches affect lots of people at once. Also, the constantly changing technology landscape makes it difficult to gather and analyze the historical data necessary to calculate premiums.

BoingBoing article.

User Authentication Best Practices Checklist

Post Syndicated from Bozho original https://techblog.bozho.net/user-authentication-best-practices-checklist/

User authentication is the functionality that every web application shares. We should have perfected it a long time ago, having implemented it so many times. And yet there are so many mistakes made all the time.

Part of the reason for that is that the list of things that can go wrong is long. You can store passwords incorrectly, you can have a vulnerable password reset functionality, you can expose your session to a CSRF attack, your session can be hijacked, etc. So I’ll try to compile a list of best practices regarding user authentication. OWASP top 10 is always something you should read, every year. But that might not be enough.

So, let’s start. I’ll try to be concise, but I’ll include as many of the related pitfalls as I can – e.g. what could go wrong with the user session after login:

  • Store passwords with bcrypt/scrypt/PBKDF2. No MD5 or SHA, as they are not good for password storing. Long salt (per user) is mandatory (the aforementioned algorithms have it built in). If you don’t and someone gets hold of your database, they’ll be able to extract the passwords of all your users. And then try these passwords on other websites.
  • Use HTTPS. Period. (Otherwise user credentials can leak through unprotected networks). Force HTTPS if user opens a plain-text version.
  • Mark cookies as secure. Makes cookie theft harder.
  • Use CSRF protection (e.g. CSRF one-time tokens that are verified with each request). Frameworks have such functionality built-in.
  • Disallow framing (X-Frame-Options: DENY). Otherwise your website may be included in another website in a hidden iframe and “abused” through javascript.
  • Have a same-origin policy
  • Logout – let your users logout by deleting all cookies and invalidating the session. This makes usage of shared computers safer (yes, users should ideally use private browsing sessions, but not all of them are that savvy)
  • Session expiry – don’t have forever-lasting sessions. If the user closes your website, their session should expire after a while. “A while” may still be a big number depending on the service provided. For ajax-heavy website you can have regular ajax-polling that keeps the session alive while the page stays open.
  • Remember me – implementing “remember me” (on this machine) functionality is actually hard due to the risks of a stolen persistent cookie. Spring-security uses this approach, which I think should be followed if you wish to implement more persistent logins.
  • Forgotten password flow – the forgotten password flow should rely on sending a one-time (or expiring) link to the user and asking for a new password when it’s opened. Auth0 explains it in this post and Postmark gives some best practices. How the link is formed is a separate discussion and there are several approaches. Store a password-reset token in the user profile table and then send it as a parameter in the link. Or do not store anything in the database, but send a few params: userId:expiresTimestamp:hmac(userId+expiresTimestamp). That way you have expiring links (rather than one-time links); the HMAC relies on a secret key, so the links can’t be spoofed (see the sketch after this list). It seems there’s no consensus, as the OWASP guide takes a slightly different approach.
  • One-time login links – this is an option used by Slack, which sends one-time login links instead of asking users for passwords. It relies on the fact that your email is well guarded and you have access to it all the time. If your service is not accessed too often, you can use that approach instead of (rather than in addition to) passwords.
  • Limit login attempts – brute-force through a web UI should not be possible; therefore you should block login attempts if they become too many. One approach is to just block them based on IP. The other one is to block them based on account attempted. (Spring example here). Which one is better – I don’t know. Both can actually be combined. Instead of fully blocking the attempts, you may add a captcha after, say, the 5th attempt. But don’t add the captcha for the first attempt – it is bad user experience.
  • Don’t leak information through error messages – you shouldn’t allow attackers to figure out if an email is registered or not. If an email is not found, upon login report just “Incorrect credentials”. On passwords reset, it may be something like “If your email is registered, you should have received a password reset email”. This is often at odds with usability – people don’t often remember the email they used to register, and the ability to check a number of them before getting in might be important. So this rule is not absolute, though it’s desirable, especially for more critical systems.
  • Make sure you use JWT only if it’s really necessary and be careful of the pitfalls.
  • Consider using a 3rd party authentication – OpenID Connect, OAuth by Google/Facebook/Twitter (but be careful with OAuth flaws as well). There’s an associated risk with relying on a 3rd party identity provider, and you still have to manage cookies, logout, etc., but some of the authentication aspects are simplified.
  • For high-risk or sensitive applications use 2-factor authentication. There’s a caveat with Google Authenticator though – if you lose your phone, you lose your accounts (unless there’s a manual process to restore it). That’s why Authy seems like a good solution for storing 2FA keys.
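
To make the expiring-link idea above concrete, here is a minimal Python sketch of generating and verifying such a token with HMAC. The names and the colon-separated format are illustrative assumptions; any server-side framework will have its own idioms for this.

# Sketch of a stateless, expiring password-reset token: userId:expiresTimestamp:hmac(...)
import hashlib
import hmac
import time

SECRET_KEY = b"server-side-secret-key"  # assumption: kept out of source control in practice

def make_reset_token(user_id, valid_for_seconds=3600):
    expires = int(time.time()) + valid_for_seconds
    message = "{}:{}".format(user_id, expires).encode()
    signature = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return "{}:{}:{}".format(user_id, expires, signature)

def verify_reset_token(token):
    try:
        user_id, expires, signature = token.split(":")
    except ValueError:
        return None
    message = "{}:{}".format(user_id, expires).encode()
    expected = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    # Constant-time comparison, and reject tokens that have expired
    if hmac.compare_digest(expected, signature) and int(expires) > time.time():
        return user_id
    return None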

I’m sure I’m missing something. And you see it’s complicated. Sadly we’re still at the point where the most common functionality – authenticating users – is so tricky and cumbersome, that you almost always get at least some of it wrong.

The post User Authentication Best Practices Checklist appeared first on Bozho's tech blog.

Rotate Amazon RDS database credentials automatically with AWS Secrets Manager

Post Syndicated from Apurv Awasthi original https://aws.amazon.com/blogs/security/rotate-amazon-rds-database-credentials-automatically-with-aws-secrets-manager/

Recently, we launched AWS Secrets Manager, a service that makes it easier to rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle. You can configure Secrets Manager to rotate secrets automatically, which can help you meet your security and compliance needs. Secrets Manager offers built-in integrations for MySQL, PostgreSQL, and Amazon Aurora on Amazon RDS, and can rotate credentials for these databases natively. You can control access to your secrets by using fine-grained AWS Identity and Access Management (IAM) policies. To retrieve secrets, employees replace plaintext secrets with a call to Secrets Manager APIs, eliminating the need to hard-code secrets in source code or update configuration files and redeploy code when secrets are rotated.

In this post, I introduce the key features of Secrets Manager. I then show you how to store a database credential for a MySQL database hosted on Amazon RDS and how your applications can access this secret. Finally, I show you how to configure Secrets Manager to rotate this secret automatically.

Key features of Secrets Manager

These features include the ability to:

  • Rotate secrets safely. You can configure Secrets Manager to rotate secrets automatically without disrupting your applications. Secrets Manager offers built-in integrations for rotating credentials for Amazon RDS databases for MySQL, PostgreSQL, and Amazon Aurora. You can extend Secrets Manager to meet your custom rotation requirements by creating an AWS Lambda function to rotate other types of secrets. For example, you can create an AWS Lambda function to rotate OAuth tokens used in a mobile application. Users and applications retrieve the secret from Secrets Manager, eliminating the need to email secrets to developers or update and redeploy applications after AWS Secrets Manager rotates a secret.
  • Secure and manage secrets centrally. You can store, view, and manage all your secrets. By default, Secrets Manager encrypts these secrets with encryption keys that you own and control. Using fine-grained IAM policies, you can control access to secrets. For example, you can require developers to provide a second factor of authentication when they attempt to retrieve a production database credential. You can also tag secrets to help you discover, organize, and control access to secrets used throughout your organization.
  • Monitor and audit easily. Secrets Manager integrates with AWS logging and monitoring services to enable you to meet your security and compliance requirements. For example, you can audit AWS CloudTrail logs to see when Secrets Manager rotated a secret or configure AWS CloudWatch Events to alert you when an administrator deletes a secret.
  • Pay as you go. Pay for the secrets you store in Secrets Manager and for the use of these secrets; there are no long-term contracts or licensing fees.

Get started with Secrets Manager

Now that you’re familiar with the key features, I’ll show you how to store the credential for a MySQL database hosted on Amazon RDS. To demonstrate how to retrieve and use the secret, I use a Python application running on Amazon EC2 that requires this database credential to access the MySQL instance. Finally, I show how to configure Secrets Manager to rotate this database credential automatically. Let’s get started.

Phase 1: Store a secret in Secrets Manager

  1. Open the Secrets Manager console and select Store a new secret.
     
    Secrets Manager console interface
     
  2. I select Credentials for RDS database because I’m storing credentials for a MySQL database hosted on Amazon RDS. For this example, I store the credentials for the database superuser. I start by securing the superuser because it’s the most powerful database credential and has full access over the database.
     
    Store a new secret interface with Credentials for RDS database selected
     

    Note: For this example, you need permissions to store secrets in Secrets Manager. To grant these permissions, you can use the AWSSecretsManagerReadWriteAccess managed policy. Read the AWS Secrets Manager Documentation for more information about the minimum IAM permissions required to store a secret.

  3. Next, I review the encryption setting and choose to use the default encryption settings. Secrets Manager will encrypt this secret using the Secrets Manager DefaultEncryptionKey in this account. Alternatively, I can choose to encrypt using a customer master key (CMK) that I have stored in AWS KMS.
     
    Select the encryption key interface
     
  4. Next, I view the list of Amazon RDS instances in my account and select the database this credential accesses. For this example, I select the DB instance mysql-rds-database, and then I select Next.
     
    Select the RDS database interface
     
  5. In this step, I specify values for Secret Name and Description. For this example, I use Applications/MyApp/MySQL-RDS-Database as the name and enter a description of this secret, and then select Next.
     
    Secret Name and description interface
     
  6. For the next step, I keep the default setting Disable automatic rotation because my secret is used by my application running on Amazon EC2. I’ll enable rotation after I’ve updated my application (see Phase 2 below) to use Secrets Manager APIs to retrieve secrets. I then select Next.

    Note: If you’re storing a secret that you’re not using in your application, select Enable automatic rotation. See our AWS Secrets Manager getting started guide on rotation for details.

     
    Configure automatic rotation interface
     

  7. Review the information on the next screen and, if everything looks correct, select Store. We’ve now successfully stored a secret in Secrets Manager.
  8. Next, I select See sample code.
     
    The See sample code button
     
  9. Take note of the code samples provided. I will use this code to update my application to retrieve the secret using Secrets Manager APIs.
     
    Python sample code
     

Phase 2: Update an application to retrieve secret from Secrets Manager

Now that I have stored the secret in Secrets Manager, I update my application to retrieve the database credential from Secrets Manager instead of hard coding this information in a configuration file or source code. For this example, I show how to configure a Python application to retrieve this secret from Secrets Manager.

  1. I connect to my Amazon EC2 instance via Secure Shell (SSH).
  2. Previously, I configured my application to retrieve the database user name and password from the configuration file. Below is the source code for my application.
    import MySQLdb
    import config

    def no_secrets_manager_sample():

        # Get the user name, password, and database connection information from a config file.
        database = config.database
        user_name = config.user_name
        password = config.password

        # Use the user name, password, and database connection information to connect to the database
        db = MySQLdb.connect(database.endpoint, user_name, password, database.db_name, database.port)

  3. I use the sample code from Phase 1 above and update my application to retrieve the user name and password from Secrets Manager. This code sets up the client and retrieves and decrypts the secret Applications/MyApp/MySQL-RDS-Database. I’ve added comments to make the code easier to understand. A short sketch showing how to parse and use the returned secret follows this list.
    # Use the code snippet provided by Secrets Manager.
    import boto3
    from botocore.exceptions import ClientError

    def get_secret():
        # Define the secret you want to retrieve
        secret_name = "Applications/MyApp/MySQL-RDS-Database"
        # Define the Secrets Manager endpoint your code should use.
        endpoint_url = "https://secretsmanager.us-east-1.amazonaws.com"
        region_name = "us-east-1"

        # Set up the client
        session = boto3.session.Session()
        client = session.client(
            service_name='secretsmanager',
            region_name=region_name,
            endpoint_url=endpoint_url
        )

        # Use the client to retrieve the secret
        try:
            get_secret_value_response = client.get_secret_value(
                SecretId=secret_name
            )
        # Error handling to make it easier for your code to tolerate faults
        except ClientError as e:
            if e.response['Error']['Code'] == 'ResourceNotFoundException':
                print("The requested secret " + secret_name + " was not found")
            elif e.response['Error']['Code'] == 'InvalidRequestException':
                print("The request was invalid due to:", e)
            elif e.response['Error']['Code'] == 'InvalidParameterException':
                print("The request had invalid params:", e)
        else:
            # Decrypted secret using the associated KMS CMK
            # Depending on whether the secret was a string or binary, one of these fields will be populated
            if 'SecretString' in get_secret_value_response:
                secret = get_secret_value_response['SecretString']
            else:
                binary_secret_data = get_secret_value_response['SecretBinary']

            # Your code goes here.

  4. Applications require permissions to access Secrets Manager. My application runs on Amazon EC2 and uses an IAM role to obtain access to AWS services. I will attach the following policy to my IAM role. This policy uses the GetSecretValue action to grant my application permissions to read secret from Secrets Manager. This policy also uses the resource element to limit my application to read only the Applications/MyApp/MySQL-RDS-Database secret from Secrets Manager. You can visit the AWS Secrets Manager Documentation to understand the minimum IAM permissions required to retrieve a secret.
    {
        "Version": "2012-10-17",
        "Statement": {
            "Sid": "RetrieveDbCredentialFromSecretsManager",
            "Effect": "Allow",
            "Action": "secretsmanager:GetSecretValue",
            "Resource": "arn:aws:secretsmanager:::secret:Applications/MyApp/MySQL-RDS-Database"
        }
    }
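
To connect Phase 1 and Phase 2, the sketch below shows one way the retrieved secret could be parsed and used to open the database connection. It assumes the secret was created through the Credentials for RDS database flow above, in which case SecretString is a JSON document with keys such as username, password, host, port, and dbname; the variable names are illustrative.

    # Sketch: parse the RDS secret and connect to MySQL with it,
    # replacing the config-file approach shown in step 2.
    import json

    import boto3
    import MySQLdb

    client = boto3.session.Session().client(service_name='secretsmanager', region_name='us-east-1')
    response = client.get_secret_value(SecretId='Applications/MyApp/MySQL-RDS-Database')
    creds = json.loads(response['SecretString'])

    db = MySQLdb.connect(
        host=creds['host'],
        user=creds['username'],
        passwd=creds['password'],
        db=creds['dbname'],
        port=int(creds['port'])
    )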

Phase 3: Enable Rotation for Your Secret

Rotating secrets periodically is a security best practice because it reduces the risk of misuse of secrets. Secrets Manager makes it easy to follow this security best practice and offers built-in integrations for rotating credentials for MySQL, PostgreSQL, and Amazon Aurora databases hosted on Amazon RDS. When you enable rotation, Secrets Manager creates a Lambda function and attaches an IAM role to this function to execute rotations on a schedule you define.

Note: Configuring rotation is a privileged action that requires several IAM permissions and you should only grant this access to trusted individuals. To grant these permissions, you can use the IAMFullAccess managed policy.

Next, I show you how to configure Secrets Manager to rotate the secret Applications/MyApp/MySQL-RDS-Database automatically.

  1. From the Secrets Manager console, I go to the list of secrets and choose the secret I created in the first step Applications/MyApp/MySQL-RDS-Database.
     
    List of secrets in the Secrets Manager console
     
  2. I scroll to Rotation configuration, and then select Edit rotation.
     
    Rotation configuration interface
     
  3. To enable rotation, I select Enable automatic rotation. I then choose how frequently I want Secrets Manager to rotate this secret. For this example, I set the rotation interval to 60 days.
     
    Edit rotation configuration interface
     
  4. Next, Secrets Manager requires permissions to rotate this secret on your behalf. Because I’m storing the superuser database credential, Secrets Manager can use this credential to perform rotations. Therefore, I select Use the secret that I provided in step 1, and then select Next.
     
    Select which secret to use in the Edit rotation configuration interface
     
  5. The banner on the next screen confirms that I have successfully configured rotation and the first rotation is in progress, which enables you to verify that rotation is functioning as expected. Secrets Manager will rotate this credential automatically every 60 days.
     
    Confirmation banner message
     

Summary

I introduced AWS Secrets Manager, explained the key benefits, and showed you how to help meet your compliance requirements by configuring AWS Secrets Manager to rotate database credentials automatically on your behalf. Secrets Manager helps you protect access to your applications, services, and IT resources without the upfront investment and on-going maintenance costs of operating your own secrets management infrastructure. To get started, visit the Secrets Manager console. To learn more, visit Secrets Manager documentation.

If you have comments about this post, submit them in the Comments section below. If you have questions about anything in this post, start a new thread on the Secrets Manager forum.

Want more AWS Security news? Follow us on Twitter.