Tag Archives: Management Tools

New for AWS Backup – Protect and Restore Your CloudFormation Stacks

2022-11-28 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-aws-backup-protect-and-restore-your-cloudformation-stacks/

To define the data protection policy of an application, you have to look at its components and find which ones store data that needs to be protected. Those are the stateful components of your application, such as databases and file systems. Other components don’t store data but need to be restored as well in case of issues. These are stateless components, such as containers and their network configurations.

When you manage your application using infrastructure as code (IaC), you have a single repository where all these components are described. Can we use this information to help protect your applications? Yes! AWS Backup now supports attaching an AWS CloudFormation stack to your data protection policies.

When you use CloudFormation as a resource, all stateful components supported by AWS Backup are backed up around the same time. The backup also includes the stateless resources in the stack, such as AWS Identity and Access Management (IAM) roles and Amazon Virtual Private Cloud (Amazon VPC) security groups. This gives you a single recovery point that you can use to recover the application stack or the individual resources you need. In case of recovery, you don’t need to mix automated tools with custom scripts and manual activities to recover and put the whole application stack back together. As you modernize and update an application managed with CloudFormation, AWS Backup automatically keeps track of changes and updates the data protection policies for you.

CloudFormation support for AWS Backup also helps you prove compliance of your data protection policies. You can monitor your application resources in AWS Backup Audit Manager, a feature of AWS Backup that enables you to audit and report on the compliance of data protection policies. You can also use AWS Backup Vault Lock to manage the immutability of your backups as required by your compliance obligations.

Let’s see how this works in practice.

Using AWS Backup Support for CloudFormation Stacks
First, I need to turn on the CloudFormation resource type for AWS Backup. In the AWS Backup console, I choose Settings in the navigation pane and then, in the Service opt-in section, Configure resources. There, I toggle the CloudFormation resource type on and choose Confirm.

Now that CloudFormation support is enabled, I choose Dashboard in the navigation pane and then Create backup plan. I select the Start with a template option and then the Daily-35day-Retention template. As the name suggests, this template creates daily backups that are kept for 35 days before being automatically deleted. I enter a name for the backup plan and choose Create plan.

Now I can assign resources to my backup plan. I enter a resource assignment name and use the default IAM role that is automatically created with the correct permissions.

In the Resource selection, I can select Include all resource types to automatically protect all resource types that are enabled in my account. Because I’d like to show how CloudFormation support works, I select Include specific resource types and then CloudFormation in the Select resource types dropdown menu. In the Choose resources menu, I can use the All supported CloudFormation stacks option to have all my stacks protected. For simplicity, I choose to protect only one stack, the my-app stack.

I leave the other options at their default values and choose Assign resources. That’s all! Now the CloudFormation stack that I selected will be backed up daily with 35 days of retention. What does that mean? Let’s have a look at what happens when I create an on-demand backup of a CloudFormation stack.

Creating On-Demand Backups for CloudFormation Stacks
I choose Protected resources in the navigation pane and then Create on-demand backup. The next steps are similar to what I did before when assigning resources to a backup plan. I select the CloudFormation resource type and the my-app stack. I use the Create backup now option to start the backup within one hour. I choose 7 days of retention and the Default backup vault. Backup vaults are logical containers that store and organize your backups. I select the default IAM role and choose Create on-demand backup.

Within a few minutes, the backup job is running. I expand the Backup job ID in the Backup jobs list to see the resources being backed up. The stateful resources (such as Amazon DynamoDB tables and Amazon Relational Database Service (RDS) databases) are listed with the current state of the backup job. The stateless resources in my stack (such as IAM roles, AWS Lambda functions, and VPC configurations) are backed up by the job with the CloudFormation resource type.

When the backup job has completed, I go back to the Protected resources page to see the list of resources that I can now restore. In the list, I see the IDs of the stateful resources (in this case, two DynamoDB tables and an Aurora database) and of the CloudFormation stack. If I choose each of the stateful resources, I see the available recovery points corresponding to the different points in time when that resource has been backed up.

If I choose the CloudFormation stack, I get a list of composite recovery points. Each composite recovery point includes all stateless and stateful resources in the stack. More specifically, the stateless resources are included in the CloudFormation template recovery point (the last one in the following screenshot).

Restoring a CloudFormation Backup
Inside the composite recovery point, I select the recovery point of the CloudFormation stack and choose Restore. Restoring a CloudFormation stack backup creates a new stack with a change set that represents the backup. I enter the new stack and change set names and choose Restore backup. After a few minutes, the restore job is completed.

In the CloudFormation console, the new stack is under review. I need to apply the change set.

I choose the new stack and select the change set created by the restore job to apply the change set.

After some time, the resources in my original stack have been recreated in the new stack. The stateful resources have been recreated empty. To recover the stateful resources, I can go back to the list of recovery points, select the recovery point I need, and initiate a restore.

Availability and Pricing
AWS Backup support for CloudFormation stacks is available today using the console, AWS Command Line Interface (CLI), and AWS SDKs in all AWS Regions where AWS Backup is offered. There is no additional cost for the stateless resources backed up and restored by AWS Backup. You only pay for the stateful resources such as databases, storage volumes, or file systems. For more information, see AWS Backup pricing.

You now have an automated solution to create and restore your applications with a simplified experience, eliminating the need to manage custom scripts.

— Danilo

Introducing AWS Resource Explorer – Quickly Find Resources in Your AWS Account

2022-11-08 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/introducing-aws-resource-explorer-quickly-find-resources-in-your-aws-account/

Looking for a specific Amazon Elastic Compute Cloud (Amazon EC2) instance, Amazon Elastic Container Service (Amazon ECS) task, or Amazon CloudWatch log group can take some time, especially if you have many resources and use multiple AWS Regions.

Today, we’re making that easier. Using the new AWS Resource Explorer, you can search through the AWS resources in your account across Regions using metadata such as names, tags, and IDs. When you find a resource in the AWS Management Console, you can quickly go from the search results to the corresponding service console and Region to start working on that resource. In a similar way, you can use the AWS Command Line Interface (CLI) or any of the AWS SDKs to find resources in your automation tools.

Let’s see how this works in practice.

Using AWS Resource Explorer
To start using Resource Explorer, I need to turn it on so that it creates and maintains the indexes that will provide fast responses to my search queries. Usually, the administrator of the account is the one taking these steps so that authorized users in that account can start searching.

To run a query, I need a view that gives access to an index. If the view is using an aggregator index, then the query can search across all indexed Regions.

If the view is using a local index, then the query has access only to the resources in that Region.

I can control the visibility of resources in my account by creating views that define what resource information is available for search and discovery. These controls are not based only on resources but also on the information that resources bring. For example, I can give access to the Amazon Resource Names (ARNs) of all resources but not to their tags which might contain information that I want to keep confidential.

In the Resource Explorer console, I choose Enable Resource Explorer. Then, I select the Quick setup option to have visibility for all supported resources within my account. This option creates local indexes in all Regions and an aggregator index in the selected Region. A default view with a filter that includes all supported resources in the account is also created in the same Region as the aggregator index.

With the Advanced setup option, I have access to more granular controls that are useful when there are specific governance requirements. For example, I can select in which Regions to create indexes. I can choose not to replicate resource information to any other Region so that resources from each AWS Region are searchable only from within the same Region. I can also control what information is available in the default view or avoid the creation of the default view.

With the Quick setup option selected, I choose Go to Resource Explorer. A quick overview shows the progress of enabling Resource Explorer across Regions. After the indexes have been created, it can take up to 36 hours to index all supported resources, and search results might be incomplete until then. When resources are created or deleted, your indexes are automatically updated. These updates are asynchronous, so it can take some time (usually a few minutes) to see the changes.

Searching With AWS Resource Explorer
After resources have been indexed, I choose Proceed to resource search. In the Search criteria, I choose which View to use. Currently, I have the default view selected. Then, I start typing in the Query field to search through the resources in my AWS account across all Regions. For example, I have an application where I used the convention to start resource names with my-app. For the resources I created manually, I also added the Project tag with value MyApp.

To find the resource of this application, I start by searching for my-app.

The results include resources from multiple services and Regions and global resources from AWS Identity and Access Management (IAM). I have a service, tasks, and a task definition from Amazon ECS, roles and policies from AWS IAM, log groups from CloudWatch. Optionally, I can filter results by Region or resource type. If I choose any of the listed resources, the link will bring me to the corresponding service console and Region with the resource selected.

To look for something in a specific Region, such as Europe (Ireland), I can restrict the results by adding region:eu-west-1 to the query.

I can further restrict results to Amazon ECS resources by adding service:ecs to the query. Now I only see the ECS cluster, service, tasks, and task definition in Europe (Ireland). That’s the task definition I was looking for!

I can also search using tags. For example, I can see the resources where I added the MyApp tag by including tag.value:MyApp in a query. To specify the actual key-value pair of the tag, I can use tag:Project=MyApp.

Creating a Custom View
Sometimes you need to control the visibility of the resources in your account. For example, all the EC2 instances used for development in my account are in US West (Oregon). I create a view for the development team by choosing a specific Region (us-west-2) and filtering the results with service:ec2 in the query. Optionally, I could further filter results based on resource names or tags. For example, I could add tag:Environment=Dev to only see resources that have been tagged to be in a development environment.

Now I allow access to this view to users and roles used by the development team. To do so, I can attach an identity-based policy to the users and roles of the development team. In this way, they can only explore and search resources using this view.

Unified Search in the AWS Management Console
After I turn Resource Explorer on, I can also search through my AWS resources in the search bar at the top of the Management Console. We call this capability unified search as it gives results that include AWS services, features, blogs, documentation, tutorial, events, and more.

To focus my search on AWS resources, I add /Resources at the beginning of my search.

Note that unified search automatically inserts a wildcard character (*) at the end of the first keyword in the string. This means that unified search results include resources that match any string that starts with the specified keyword.

The search performed by the Query text box on the Resource search page in the Resource Explorer console does not automatically append a wildcard character but I can do it manually after any term in the search string to have similar results.

Unified search works when I have the default view in the same Region that contains the aggregator index. To check if unified search works for me, I look at the top of the Settings page.

Availability and Pricing
You can start using AWS Resource Explorer today with a global console and via the AWS Command Line Interface (CLI) and the AWS SDKs. AWS Resource Explorer is available at no additional charge. Using Resource Explorer makes it much faster to find the resources you need and use them in your automation processes and in their service console.

Discover and access your AWS resources across all the Regions you use with AWS Resource Explorer.

— Danilo

Using CloudFormation events to build custom workflows for post provisioning management

2022-09-29 Vivek Kumar

Post Syndicated from Vivek Kumar original https://aws.amazon.com/blogs/devops/using-cloudformation-events-to-build-custom-workflows-for-post-provisioning-management/

Over one million active customers manage application resources with AWS CloudFormation every week. CloudFormation is a service that helps you model, provision, and manage your cloud resources by treating Infrastructure as Code (IaC). It can simplify infrastructure management, quickly replicate your environment to multiple AWS regions with a single turn-key solution, and let you easily control and track changes in your infrastructure.

You can create various AWS resources using CloudFormation to setup an environment for your workloads. You continue to interact with and manage those resources throughout the workload lifecycle to make sure the resource configuration is aligned with business objectives such as adhering to security compliance standards, meeting required reliability targets, and aligning with budget requirements. The inability to perform a hand-off between resource provisioning actions in CloudFormation and resource management actions in other relevant AWS and non-AWS services poses a challenge. For example, after provisioning of resources, customers might need to perform additional tasks to manage these resources such as adding cost allocation tags, populating resource inventory database or trigger downstream processes.

While they are able to obtain the logical resource grouping that is tied to a workload or a workload component with a CloudFormation stack, that context does not extend beyond CloudFormation for the most part when they use various AWS and non-AWS services to conduct post-provisioning resource management. These AWS and non-AWS services typically offer a resource level view, or in some cases offer basic aggregated views such as supporting a tag group, or an account level abstraction to see all resources in a given account. For a CloudFormation customer, the inability to not have the context of a stack beyond resource provisioning provides a disjointed experience given there is no hand-off between resource provisioning actions in CloudFormation and resource management actions in other relevant AWS and non-AWS services. The various management actions customers take with their workload resources through out their lifecycle are

CloudFormation events provide a robust way to track the status of individual resources during the lifecycle of a stack. You can send CloudFormation events to Amazon EventBridge whenever a create, update, or drift detection action is performed on your stack. Then you can set up additional workflows based on those events from EventBridge. For example, by tagging the resources automatically, you can reference that tag group when using AWS Trusted Advisor, and continue your resource management experience post-provisioning. CloudFormation sends these events to EventBridge automatically so that you don’t need to do anything. One real-world use case is to use these events to create actionable tasks for your teams to troubleshoot issues. CloudFormation events published to EventBridge can be used to create OpsItems within AWS Systems Manager OpsCenter. OpsItems are the work items created in OpsCenter for engineers to view, investigate and remediate tasks/issues. This enables teams to respond and resolve any issues more efficiently.

Walkthrough

To set up the EventBridge rule, go to the AWS console and navigate to EventBridge. Select on Create Rule to get started. Enter Name, description and select Next:

Create Rule

On the next screen, select AWS events in the Event source section.

This sample event is for the CREATE_COMPLETE event. It contains the source, AWS account number, AWS region, event type, resources and details about the event.

On the same page in the Event pattern section:

Select Custom patterns (JSON editor) and enter the following event pattern. This will match any events when a resource fails to create, update, or delete. Learn more about EventBridge event patterns.

{
    "source": [
        "aws.cloudformation"
    ],
    "detail-type": [
        "CloudFormation Resource Status Change"
    ],
    "detail": {
        "status-details": {
            "status": [
                "CREATE_FAILED",
                "UPDATE_FAILED",
                "DELETE_FAILED"
            ]
        }
    }
}

Custom patterns - JSON editor

Select Next. On the Target screen, select AWS service, then select System Manager OpsItem as the target for this rule.

Target 1

Add a second target – an Amazon Simple Notification Service (SNS) Topic – to notify the Ops team whenever a failure occurs and an OpsItem has been created.

Target 2

Select Next and optionally add tags.

Select next to review the selections, and select Create rule.

Now your rule is created and whenever a stack failure occurs, an OpsItem gets created and a notification is sent out for the operators to troubleshoot and fix the issue. The OpsItem contains operational data, such as the resource that failed, the reason for failure, as well as the stack to which it belongs, which is useful for troubleshooting the issue. Operators can take manual actions or use runbooks codified as Systems Manager Documents to take corrective actions. From the AWS Console you can go to OpsCenter to see the events:

operational data

Once the issues have been addressed, operators can mark the OpsItem as resolved, and retry the stack operation that failed, resulting in a swift resolution of the issue, and preventing duplication of efforts.

This walkthrough is for the Console but you can use AWS Command Line Interface (AWS CLI), AWS SDK or even CloudFormation to accomplish all of this. Refer to AWS CLI documentation for more information on creating EventBridge rules through CLI. Furthermore, refer to AWS SDK documentation for creating EventBridge rules through AWS SDK. You can use following CloudFormation template to deploy the EventBridge rules example used as part of the walkthrough in this blog post:

{
	"Parameters": {
		"SNSTopicARN": {
			"Type": "String",
			"Description": "Enter the ARN of the SNS Topic where you want stack failure notifications to be sent."
		}
	},
	"Resources": {
		"CFNEventsRule": {
			"Type": "AWS::Events::Rule",
			"Properties": {
				"Description": "Event rule to capture CloudFormation failure events",
				"EventPattern": {
					"source": [
						"aws.cloudformation"
					],
					"detail-type": [
						"CloudFormation Resource Status Change"
					],
					"detail": {
						"status-details": {
							"status": [
								"CREATE_FAILED",
								"UPDATE_FAILED",
								"DELETE_FAILED"
							]
						}
					}
				},
				"Name": "cfn-stack-failure-test",
				"State": "ENABLED",
				"Targets": [
					{
						"Arn": {
							"Fn::Sub": "arn:aws:ssm:${AWS::Region}:${AWS::AccountId}:opsitem"
						},
						"Id": "opsitems",
						"RoleArn": {
							"Fn::GetAtt": [
								"TargetInvocationRole",
								"Arn"
							]
						}
					},
					{
						"Arn": {
							"Ref": "SNSTopicARN"
						},
						"Id": "sns"
					}
				]
			}
		},
		"TargetInvocationRole": {
			"Type": "AWS::IAM::Role",
			"Properties": {
				"AssumeRolePolicyDocument": {
					"Version": "2012-10-17",
					"Statement": [
						{
							"Effect": "Allow",
							"Principal": {
								"Service": [
									"events.amazonaws.com"
								]
							},
							"Action": [
								"sts:AssumeRole"
							]
						}
					]
				},
				"Path": "/",
				"Policies": [
					{
						"PolicyName": "createopsitem",
						"PolicyDocument": {
							"Version": "2012-10-17",
							"Statement": [
								{
									"Effect": "Allow",
									"Action": [
										"ssm:CreateOpsItem"
									],
									"Resource": "*"
								}
							]
						}
					}
				]
			}
		},
		"AllowSNSPublish": {
			"Type": "AWS::SNS::TopicPolicy",
			"Properties": {
				"PolicyDocument": {
					"Statement": [
						{
							"Sid": "grant-eventbridge-publish",
							"Effect": "Allow",
							"Principal": {
								"Service": "events.amazonaws.com"
							},
							"Action": [
								"sns:Publish"
							],
							"Resource": {
								"Ref": "SNSTopicARN"
							}
						}
					]
				},
				"Topics": [
					{
						"Ref": "SNSTopicARN"
					}
				]
			}
		}
	}
}

Summary

Responding to CloudFormation stack events becomes easy with the integration between CloudFormation and EventBridge. CloudFormation events can be used to perform post-provisioning actions on workload resources. With the variety of targets available to EventBridge rules, various actions such as adding tags and, troubleshooting issues can be performed. This example above uses Systems Manager and Amazon SNS but you can have numerous targets including, Amazon API gateway, AWS Lambda, Amazon Elastic Container Service (Amazon ECS) task, Amazon Kinesis services, Amazon Redshift, Amazon SageMaker pipeline, and many more. These events are available for free in EventBridge.

Learn more about Managing events with CloudFormation and EventBridge.

About the Author

Vivek is a Solutions Architect at AWS based out of New York. He works with customers providing technical assistance and architectural guidance on various AWS services. He brings more than 25 years of experience in software engineering and architecture roles for various large-scale enterprises.

Mahanth is a Solutions Architect at Amazon Web Services (AWS). As part of the AWS Well-Architected team, he works with customers and AWS Partner Network partners of all sizes to help them build secure, high-performing, resilient, and efficient infrastructure for their applications. He spends his free time playing with his pup Cosmo, learning more about astronomy, and is an avid gamer.

Sukhchander is a Solutions Architect at Amazon Web Services. He is passionate about helping startups and enterprises adopt the cloud in the most scalable, secure, and cost-effective way by providing technical guidance, best practices, and well architected solutions.

Deploy and manage OpenAPI/Swagger RESTful APIs with the AWS Cloud Development Kit

2022-08-23 Luke Popplewell

Post Syndicated from Luke Popplewell original https://aws.amazon.com/blogs/devops/deploy-and-manage-openapi-swagger-restful-apis-with-the-aws-cloud-development-kit/

This post demonstrates how AWS Cloud Development Kit (AWS CDK) Infrastructure as Code (IaC) constructs and AWS serverless technology can be used to build and deploy a RESTful Application Programming Interface (API) defined in the OpenAPI specification. This post uses an example API that describes Widget resources and demonstrates how to use an AWS CDK Pipeline to:

Deploy a RESTful API stage to Amazon API Gateway from an OpenAPI specification.
Build and deploy an AWS Lambda function that contains the API functionality.
Auto-generate API documentation and publish it to an Amazon Simple Storage Service (Amazon S3)-hosted website served by the Amazon CloudFront content delivery network (CDN) service. This provides technical and non-technical stakeholders with versioned, current, and accessible API documentation.
Auto-generate client libraries for invoking the API and deploy them to AWS CodeArtifact, which is a fully-managed artifact repository service. This allows API client development teams to integrate with different versions of the API in different environments.

The diagram shown in the following figure depicts the architecture of the AWS services and resources described in this post.

The architecture described in this post consists of an AWS CodePipeline pipeline, provisioned using the AWS CDK, that deploys the Widget API to AWS Lambda and API Gateway. The pipeline then auto-generates the API’s documentation as a website served by CloudFront and deployed to S3. Finally, the pipeline auto-generates a client library for the API and deploys this to CodeArtifact.

Figure 1 – Architecture

The code that accompanies this post, written in Java, is available here.

Background

APIs must be understood by all stakeholders and parties within an enterprise including business areas, management, enterprise architecture, and other teams wishing to consume the API. Unfortunately, API definitions are often hidden in code and lack up-to-date documentation. Therefore, they remain inaccessible for the majority of the API’s stakeholders. Furthermore, it’s often challenging to determine what version of an API is present in different environments at any one time.

This post describes some solutions to these issues by demonstrating how to continuously deliver up-to-date and accessible API documentation, API client libraries, and API deployments.

AWS CDK

The AWS CDK is a software development framework for defining cloud IaC and is available in multiple languages including TypeScript, JavaScript, Python, Java, C#/.Net, and Go. The AWS CDK Developer Guide provides best practices for using the CDK.

This post uses the CDK to define IaC in Java which is synthesized to a cloud assembly. The cloud assembly includes one to many templates and assets that are deployed via an AWS CodePipeline pipeline. A unit of deployment in the CDK is called a Stack.

OpenAPI specification (formerly Swagger specification)

OpenAPI specifications describe the capabilities of an API and are both human and machine-readable. They consist of definitions of API components which include resources, endpoints, operation parameters, authentication methods, and contact information.

Project composition

The API project that accompanies this post consists of three directories:

app
api
cdk

app directory

This directory contains the code for the Lambda function which is invoked when the Widget API is invoked via API Gateway. The code has been developed in Java as an Apache Maven project.

The Quarkus framework has been used to define a WidgetResource class (see src/main/java/aws/sample/blog/cdkopenapi/app/WidgetResources.java ) that contains the methods that align with HTTP Methods of the Widget API.
api directory

The api directory contains the OpenAPI specification file ( openapi.yaml ). This file is used as the source for:

Defining the REST API using API Gateway’s support for OpenApi.
Auto-generating the API documentation.
Auto-generating the API client artifact.

The api directory also contains the following files:

openapi-generator-config.yaml : This file contains configuration settings for the OpenAPI Generator framework, which is described in the section CI/CD Pipeline.
maven-settings.xml: This file is used support the deployment of the generated SDKs or libraries (Apache Maven artifacts) for the API and is described in the CI/CD Pipeline section of this post.

This directory contains a sub directory called docker . The docker directory contains a Dockerfile which defines the commands for building a Docker image:

FROM ruby:2.6.5-alpine
 
RUN apk update \
 && apk upgrade --no-cache \
 && apk add --no-cache --repository http://dl-cdn.alpinelinux.org/alpine/v3.14/main/ nodejs=14.20.0-r0 npm \
 && apk add git \
 && apk add --no-cache build-base
 
# Install Widdershins node packages and ruby gem bundler 
RUN npm install -g widdershins \
 && gem install bundler 
 
# working directory
WORKDIR /openapi
 
# Clone and install the Slate framework
RUN git clone https://github.com/slatedocs/slate
RUN cd slate \
 && bundle install

The Docker image incorporates two open source projects, the NodeJS Widdershins library and the Ruby Slate-framework. These are used together to auto-generate the documentation for the API from the OpenAPI specification. This Dockerfile is referenced and built by the ApiStack class, which is described in the CDK Stacks section of this post.

cdk directory

This directory contains an Apache Maven Project developed in Java for provisioning the CDK stacks for the Widget API.

Under the src/main/java folder, the package aws.sample.blog.cdkopenapi.cdk contains the files and classes that define the application’s CDK stacks and also the entry point (main method) for invoking the stacks from the CDK Toolkit CLI:

CdkApp.java: This file contains the CdkApp class which provides the main method that is invoked from the AWS CDK Toolkit to build and deploy the application stacks.
ApiStack.java: This file contains the ApiStack class which defines the OpenApiBlogAPI stack and is described in the CDK Stacks section of this post.
PipelineStack.java: This file contains the PipelineStack class which defines the OpenAPIBlogPipeline stack and is described in the CDK Stacks section of this post.
ApiStackStage.java: This file contains the ApiStackStage class which defines a CDK stage. As detailed in the CI/CD Pipeline section of this post, a DEV stage, containing the OpenApiBlogAPI stack resources for a DEV environment, is deployed from the OpenApiBlogPipeline pipeline.

CDK stacks

ApiStack

Note that the CDK bundling functionality is used at multiple points in the ApiStack class to produce CDK Assets. The post, Building, bundling, and deploying applications with the AWS CDK, provides more details regarding using CDK bundling mechanisms.

The ApiStack class defines multiple resources including:

Widget API Lambda function: This is bundled by the CDK in a Docker container using the Java 11 runtime image.
Widget REST API on API Gateway: The REST API is created from an Inline API Definition which is passed as an S3 CDK Asset. This asset includes a reference to the Widget API OpenAPI specification located under the api folder (see api/openapi.yaml ) and builds upon the SpecRestApi construct and API Gateway’s support for OpenApi.
API documentation Docker Image Asset: This is the Docker image that contains the open source frameworks (Widdershins and Slate) that are leveraged to generate the API documentation.
CDK Asset bundling functionality that leverages the API documentation Docker image to auto-generate documentation for the API.
An S3 Bucket for holding the API documentation website.
An origin access identity (OAI) which allows CloudFront to securely serve the S3 Bucket API documentation content.
A CloudFront distribution which provides CDN functionality for the S3 Bucket website.

Note that the ApiStack class features the following code which is executed on the Widget API Lambda construct:

CfnFunction apiCfnFunction = (CfnFunction)apiLambda.getNode().getDefaultChild();
apiCfnFunction.overrideLogicalId("APILambda");

The CDK, by default, auto-assigns an ID for each defined resource but in this case the generated ID is being overridden with “APILambda”. The reason for this is that inside of the Widget API OpenAPI specification (see api/openapi.yaml ), there is a reference to the Lambda function by name (“APILambda”) so that the function can be integrated as a proxy for each listed API path and method combination. The OpenAPI specification includes this name as a variable to derive the Amazon Resource Name (ARN) for the Lambda function:

uri:
	Fn::Sub: "arn:aws:apigateway:${AWS::Region}:lambda:path/2015-03-31/functions/${APILambda.Arn}/invocations"

PipelineStack

The PipelineStack class defines a CDK CodePipline construct which is a higher level construct and pattern. Therefore, the construct doesn’t just map directly to a single CloudFormation resource, but provisions multiple resources to fulfil the requirements of the pattern. The post, CDK Pipelines: Continuous delivery for AWS CDK applications, provides more detail on creating pipelines with the CDK.

final CodePipeline pipeline = CodePipeline.Builder.create(this, "OpenAPIBlogPipeline")
.pipelineName("OpenAPIBlogPipeline")
.selfMutation(true)
      .dockerEnabledForSynth(true)
      .synth(synthStep)
      .build();

CI/CD pipeline

The diagram in the following figure shows the multiple CodePipeline stages and actions created by the CDK CodePipeline construct that is defined in the PipelineStack class.

Figure 2 – CI/CD Pipeline

The stages defined include the following:

Source stage: The pipeline is passed the source code contents from this stage.
Synth stage: This stage consists of a Synth Action that synthesizes the CloudFormation templates for the application’s CDK stacks and compiles and builds the project Lambda API function.
Update Pipeline stage: This stage checks the OpenAPIBlogPipeline stack and reinitiates the pipeline when changes to its definition have been deployed.
Assets stage: The application’s CDK stacks produce multiple file assets (for example, zipped Lambda code) which are published to Amazon S3. Docker image assets are published to a managed CDK framework Amazon Elastic Container Registry (Amazon ECR) repository.
DEV stage: The API’s CDK stack ( OpenApiBlogAPI ) is deployed to a hypothetical development environment in this stage. A post stage deployment action is also defined in this stage. Through the use of a CDK ShellStep construct, a Bash script is executed that deploys a generated client Java Archive (JAR) for the Widget API to CodeArtifact. The script employs the OpenAPI Generator project for this purpose:

CodeBuildStep codeArtifactStep = CodeBuildStep.Builder.create("CodeArtifactDeploy")
    .input(pipelineSource)
    .commands(Arrays.asList(
           	"echo $REPOSITORY_DOMAIN",
           	"echo $REPOSITORY_NAME",
           	"export CODEARTIFACT_TOKEN=`aws codeartifact get-authorization-token --domain $REPOSITORY_DOMAIN --query authorizationToken --output text`",
           	"export REPOSITORY_ENDPOINT=$(aws codeartifact get-repository-endpoint --domain $REPOSITORY_DOMAIN --repository $REPOSITORY_NAME --format maven | jq .repositoryEndpoint | sed 's/\\\"//g')",
           	"echo $REPOSITORY_ENDPOINT",
           	"cd api",
           	"wget -q https://repo1.maven.org/maven2/org/openapitools/openapi-generator-cli/5.4.0/openapi-generator-cli-5.4.0.jar -O openapi-generator-cli.jar",
     	          "cp ./maven-settings.xml /root/.m2/settings.xml",
        	          "java -jar openapi-generator-cli.jar batch openapi-generator-config.yaml",
                    "cd client",
                    "mvn --no-transfer-progress deploy -DaltDeploymentRepository=openapi--prod::default::$REPOSITORY_ENDPOINT"
))
      .rolePolicyStatements(Arrays.asList(codeArtifactStatement, codeArtifactStsStatement))
.env(new HashMap<String, String>() {{
      		put("REPOSITORY_DOMAIN", codeArtifactDomainName);
            	put("REPOSITORY_NAME", codeArtifactRepositoryName);
       }})
      .build();

Running the project

To run this project, you must install the AWS CLI v2, the AWS CDK Toolkit CLI, a Java/JDK 11 runtime, Apache Maven, Docker, and a Git client. Furthermore, the AWS CLI must be configured for a user who has administrator access to an AWS Account. This is required to bootstrap the CDK in your AWS account (if not already completed) and provision the required AWS resources.

To build and run the project, perform the following steps:

Fork the OpenAPI blog project in GitHub.
Open the AWS Console and create a connection to GitHub. Note the connection’s ARN.
In the Console, navigate to AWS CodeArtifact and create a domain and repository. Note the names used.
From the command line, clone your forked project and change into the project’s directory:

git clone https://github.com/<your-repository-path>
cd <your-repository-path>

Edit the CDK JSON file at cdk/cdk.json and enter the details:

"RepositoryString": "<your-github-repository-path>",
"RepositoryBranch": "<your-github-repository-branch-name>",
"CodestarConnectionArn": "<connection-arn>",
"CodeArtifactDomain": "<code-artifact-domain-name>",
"CodeArtifactRepository": "<code-artifact-repository-name>"

Please note that for setting configuration values in CDK applications, it is recommend to use environment variables or AWS Systems Manager parameters.

Commit and push your changes back to your GitHub repository:

git push origin main

Change into the cdk directory and bootstrap the CDK in your AWS account if you haven’t already done so (enter “Y” when prompted):

cd cdk
cdk bootstrap

Deploy the CDK pipeline stack (enter “Y” when prompted):

cdk deploy OpenAPIBlogPipeline

Once the stack deployment completes successfully, the pipeline OpenAPIBlogPipeline will start running. This will build and deploy the API and its associated resources. If you open the Console and navigate to AWS CodePipeline, then you’ll see a pipeline in progress for the API.

Once the pipeline has completed executing, navigate to AWS CloudFormation to get the output values for the DEV-OpenAPIBlog stack deployment:

Select the DEV-OpenAPIBlog stack entry and then select the Outputs column. Record the REST_URL value for the key that begins with OpenAPIBlogRestAPIEndpoint .
Record the CLOUDFRONT_URL value for the key OpenAPIBlogCloudFrontURL .

The API ping method at https://<REST_URL>/ping can now be invoked using your browser or an API development tool like Postman. Other API other methods, as defined by the OpenApi specification, are also available for invocation (For example, GET https://<REST_URL>/widgets).

To view the generated API documentation, open a browser at https://< CLOUDFRONT_URL>.

The following figure shows the API documentation website that has been auto-generated from the API’s OpenAPI specification. The documentation includes code snippets for using the API from multiple programming languages.

The API’s auto-generated documentation website provides descriptions of the API’s methods and resources as well as code snippets in multiple languages including JavaScript, Python, and Java.

Figure 3 – Auto-generated API documentation

To view the generated API client code artifact, open the Console and navigate to AWS CodeArtifact. The following figure shows the generated API client artifact that has been published to CodeArtifact.

The CodeArtifact service user interface in the Console shows the different versions of the API’s auto-generated client libraries.

Figure 4 – API client artifact in CodeArtifact

Cleaning up

From the command change to the cdk directory and remove the API stack in the DEV stage (enter “Y” when prompted):

cd cdk
cdk destroy OpenAPIBlogPipeline/DEV/OpenAPIBlogAPI

Once this has completed, delete the pipeline stack:

cdk destroy OpenAPIBlogPipeline

Delete the S3 bucket created to support pipeline operations. Open the Console and navigate to Amazon S3. Delete buckets with the prefix openapiblogpipeline .

Conclusion

This post demonstrates the use of the AWS CDK to deploy a RESTful API defined by the OpenAPI/Swagger specification. Furthermore, this post describes how to use the AWS CDK to auto-generate API documentation, publish this documentation to a web site hosted on Amazon S3, auto-generate API client libraries or SDKs, and publish these artifacts to an Apache Maven repository hosted on CodeArtifact.

The solution described in this post can be improved by:

Building and pushing the API documentation Docker image to Amazon ECR, and then using this image in CodePipeline API pipelines.
Creating stages for different environments such as TEST, PREPROD, and PROD.
Adding integration testing actions to make sure that the API Deployment is working correctly.
Adding Manual approval actions for that are executed before deploying the API to PROD.
Using CodeBuild caching of artifacts including Docker images and libraries.

About the author:

Identification of replication bottlenecks when using AWS Application Migration Service

2022-06-10 Tobias Reekers

Post Syndicated from Tobias Reekers original https://aws.amazon.com/blogs/architecture/identification-of-replication-bottlenecks-when-using-aws-application-migration-service/

Enterprises frequently begin their journey by re-hosting (lift-and-shift) their on-premises workloads into AWS and running Amazon Elastic Compute Cloud (Amazon EC2) instances. A simpler way to re-host is by using AWS Application Migration Service (Application Migration Service), a cloud-native migration service.

To streamline and expedite migrations, automate reusable migration patterns that work for a wide range of applications. Application Migration Service is the recommended migration service to lift-and-shift your applications to AWS.

In this blog post, we explore key variables that contribute to server replication speed when using Application Migration Service. We will also look at tests you can run to identify these bottlenecks and, where appropriate, include remediation steps.

Overview of migration using Application Migration Service

Figure 1 depicts the end-to-end data replication flow from source servers to a target machine hosted on AWS. The diagram is designed to help visualize potential bottlenecks within the data flow, which are denoted by a black diamond.

Figure 1. Data flow when using AWS Application Migration Service (black diamonds denote potential points of contention)

Baseline testing

To determine a baseline replication speed, we recommend performing a control test between your target AWS Region and the nearest Region to your source workloads. For example, if your source workloads are in a data center in Rome and your target Region is Paris, run a test between eu-south-1 (Milan) and eu-west-3 (Paris). This will give a theoretical upper bandwidth limit, as replication will occur over the AWS backbone. If the target Region is already the closest Region to your source workloads, run the test from within the same Region.

Network connectivity

There are several ways to establish connectivity between your on-premises location and AWS Region:

Public internet
VPN
AWS Direct Connect

This section pertains to options 1 and 2. If facing replication speed issues, the first place to look is at network bandwidth. From a source machine within your internal network, run a speed test to calculate your bandwidth out to the internet; common test providers include Cloudflare, Ookla, and Google. This is your bandwidth to the internet, not to AWS.

Next, to confirm the data flow from within your data center, run a traceroute (Windows) or tracert (Linux). Identify any network hops that are unusual or potentially throttling bandwidth (due to hardware limitations or configuration).

To measure the maximum bandwidth between your data center and the AWS subnet that is being used for data replication, while accounting for Security Sockets Layer (SSL) encapsulation, use the CloudEndure SSL bandwidth tool (refer to Figure 1).

Source storage I/O

The next area to look for replication bottlenecks is source storage. The underlying storage for servers can be a point of contention. If the storage is maxing-out its read speeds, this will impact the data-replication rate. If your storage I/O is heavily utilized, it can impact block replication by Application Migration Service. In order to measure storage speeds, you can use the following tools:

Windows: WinSat (or other third-party tooling, like AS SSD Benchmark)
Linux: hdparm

We suggest reducing read/write operations on your source storage when starting your migration using Application Migration Service.

Application Migration Service EC2 replication instance size

The size of the EC2 replication server instance can also have an impact on the replication speed. Although it is recommended to keep the default instance size (t3.small), it can be increased if there are business requirements, like to speed up the initial data sync. Note: using a larger instance can lead to increased compute costs.

-508 (1)

Common replication instance changes include:

Servers with <26 disks: change the instance type to m5.large. Increase the instance type to m5.xlarge or higher, as needed.
Servers with <26 disks (or servers in AWS Regions that do not support m5 instance types): change the instance type to m4.large. Increase to m4.xlarge or higher, as needed.

Note: Changing the replication server instance type will not affect data replication. Data replication will automatically pick up where it left off, using the new instance type you selected.

Application Migration Service Elastic Block Store replication volume

You can customize the Amazon Elastic Block Store (Amazon EBS) volume type used by each disk within each source server in that source server’s settings (change staging disk type).

By default, disks <500GiB use Magnetic HDD volumes. AWS best practice suggests not change the default Amazon EBS volume type, unless there is a business need for doing so. However, as we aim to speed up the replication, we actively change the default EBS volume type.

There are two options to choose from:

The lower cost, Throughput Optimized HDD (st1) option utilizes slower, less expensive disks.

-508 (2)

- Consider this option if you(r):
  - Want to keep costs low
  - Large disks do not change frequently
  - Are not concerned with how long the initial sync process will take
The faster, General Purpose SSD (gp2) option utilizes faster, but more expensive disks.

-508 (3)

- Consider this option if you(r):
  - Source server has disks with a high write rate, or if you need faster performance in general
  - Want to speed up the initial sync process
  - Are willing to pay more for speed

Source server CPU

The Application Migration Service agent that is installed on the source machine for data replication uses a single core in most cases (agent threads can be scheduled to multiple cores). If core utilization reaches a maximum, this can be a limitation for replication speed. In order to check the core utilization:

Windows: Launch the Task Manger application within Windows, and click on the “CPU” tab. Right click on the CPU graph (this is currently showing an average of cores) > select “Change graph to” > “Logical processors”. This will show individual cores and their current utilization (Figure 2).

Figure 2. Logical processor CPU utilization

Linux: Install htop and run from the terminal. The htop command will display the Application Migration Service/CE process and indicate the CPU and memory utilization percentage (this is of the entire machine). You can check the CPU bars to determine if a CPU is being maxed-out (Figure 3).

Figure 3. AWS Application Migration Service/CE process to assess CPU utilization

Conclusion

In this post, we explored several key variables that contribute to server replication speed when using Application Migration Service. We encourage you to explore these key areas during your migration to determine if your replication speed can be optimized.

Related information

Extend your pre-commit hooks with AWS CloudFormation Guard

2022-04-26 Joaquin Manuel Rinaudo

Post Syndicated from Joaquin Manuel Rinaudo original https://aws.amazon.com/blogs/security/extend-your-pre-commit-hooks-with-aws-cloudformation-guard/

Git hooks are scripts that extend Git functionality when certain events and actions occur during code development. Developer teams often use Git hooks to perform quality checks before they commit their code changes. For example, see the blog post Use Git pre-commit hooks to avoid AWS CloudFormation errors for a description of how the AWS Integration and Automation team uses various pre-commit hooks to help reduce effort and errors when they build AWS Quick Starts.

This blog post shows you how to extend your Git hooks to validate your AWS CloudFormation templates against policy-as-code rules by using AWS CloudFormation Guard. This can help you verify that your code follows organizational best practices for security, compliance, and more by preventing you from commit changes that fail validation rules.

We will also provide patterns you can use to centrally maintain a list of rules that security teams can use to roll out new security best practices across an organization. You will learn how to configure a pre-commit framework by using an example repository while you store Guard rules in both a central Amazon Simple Storage Service (Amazon S3) bucket or in versioned code repositories (such as AWS CodeCommit, GitHub, Bitbucket, or GitLab).

Prerequisites

To complete the steps in this blog post, first perform the following installations.

Install AWS Command Line Interface (AWS CLI).
Install the Git CLI.
Install the pre-commit framework by running the following command.
pip install pre-commit
Install the Rust programming language by following these instructions.
(Windows only) Install the version of Microsoft Visual C++ Build Tools 2019 that provides just the Visual C++ build tools.

Solution walkthrough

In this section, we walk you through an exercise to extend a Java service on an Amazon EKS example repository with Git hooks by using AWS CloudFormation Guard. You can choose to upload your Guard rules in either a separate GitHub repository or your own S3 bucket.

First, download the sample repository that you will add the pre-commit framework to.

To clone the test repository

Clone the repo to a local directory by running the following command in your local terminal.
git clone https://github.com/aws-samples/amazon-eks-example-for-stateful-java-service.git

Next, create Guard rules that reflect the organization’s policy-as-code best practices and store them in an S3 bucket.

To set up an S3 bucket with your Guard rules

Create an S3 bucket by running the following command in the AWS CLI.
aws s3 mb s3://<account-id>-cfn-guard-rules --region <aws-region>

where <account-id> is the ID of the AWS account you’re using and <aws-region> is the AWS Region you want to use.
(Optional) Alternatively, you can follow the Getting started with Amazon S3 tutorial to create the bucket and upload the object (as described in step 4 that follows) by using the AWS Management Console.
When you store your Guard rules in an S3 bucket, you can make the rules accessible to other member accounts in your organization by using the aws:PrincipalOrgID condition and setting the value to your organization ID in the bucket policy.
Create a file that contains a Guard rule named rules.guard, with the following content.
```
let eks_cluster = Resources.* [ Type == 'AWS::EKS::Cluster' ]
rule eks_public_disallowed when %eks_cluster !empty {
      %eks_cluster.Properties.ResourcesVpcConfig.EndpointPublicAccess == false
}
```
This rule will verify that public endpoints are disabled by checking that resources that are created by using the AWS::EKS::Cluster resource type have the EndpointPublicAccess property set to false. For more information about authoring your own rules using Guard domain-specific language (DSL), see Introducing AWS CloudFormation Guard 2.0.
Upload the rule set to your S3 bucket by running the following command in the AWS CLI.
aws s3 cp rules.guard s3://<account-id>-cfn-guard-rules/rules/rules.guard

In the next step, you will set up the pre-commit framework in the repository to run CloudFormation Guard against code changes.

To configure your pre-commit hook to use Guard

Run the following command to create a new branch where you will test your changes.
git checkout -b feature/guard-hook
Navigate to the root directory of the project that you cloned earlier and create a .pre-commit-config.yaml file with the following configuration.
```
repos:
  - repo: local
    hooks:
      -   id: cfn-guard-rules
          name: Rules for AWS
          description: Download Organization rules
          entry: aws s3 cp --recursive s3://<account-id>-cfn-guard-rules/rules  guard-rules/org-rules/
          language: system
          pass_filenames: false
      -   id: cfn-guard
          name: AWS CloudFormation Guard
          description: Validate code against your Guard rules
          entry: bash -c 'for template in "$@"; do cfn-guard validate -r guard-rules -d "$template" || SCAN_RESULT="FAILED"; done; if [[ "$SCAN_RESULT" = "FAILED" ]]; then exit 1; fi'
          language: rust
          files: \.(json|yaml|yml|template\.json|template)$
          additional_dependencies:
            - cli:cfn-guard
```
You will need to replace the <account-id> placeholder value with the AWS account ID you entered in the To set up an S3 bucket with your Guard rules procedure.

This hook configuration uses local pre-commit hooks to download the latest version of Guard rules from the bucket you created previously. This allows you to set up a centralized set of Guard rules across your organization.

Alternatively, you can create and use a code repository such as GitHub, AWS CodeCommit, or Bitbucket to keep your rules in version control. To do so, replace the command in the Download Organization rules step of the .pre-commit-config.yaml file with:

bash -c ‘if [ -d guard-rules/org-rules ]; then cd guard-rules/org-rules && git pull; else git clone <guard-rules-repository-target> guard-rules/org-rules; fi’

Where <guard-rules-repository-target> is the HTTPS or SSH URL of your repository. This command will clone or pull the latest rules from your Git repo by using your Git credentials.

The hook will also install Guard as an additional dependency by using a Rust hook. Using Guard, it will run the code changes in the repository directory against the downloaded rule set. When misconfigurations are detected, the hook stops the commit.

You can further extend your organization rules with your own Guard rules by adding them to the cfn-guard-rules folder. You should commit these rules in your repository and add cfn-guard-rules/org-rules/* to your .gitignore file.
Run a pre-commit install command to install the hooks you just created.

Finally, test that the pre-commit’s Guard hook fails commits of code changes that do not follow organizational best practices.

To test pre-commit hooks

Add EndpointPublicAccess: true in cloudformation/eks.template.yaml, as shown following. This describes the test-only intent (meaning that you want to detect and flag errors in your rule) of adding public access to the Amazon Elastic Kubernetes Service (Amazon EKS) cluster.
```
  EKSCluster:
    Type: AWS::EKS::Cluster
    Properties:
      Name: java-app-demo-cluster
      ResourcesVpcConfig:
        EndpointPublicAccess: true
        SecurityGroupIds:
          - !Ref EKSControlPlaneSecurityGroup
```
Add your changes with the git add command.
git add .pre-commit-config.yaml

git add cloudformation/eks.template.yaml

Commit changes with the following command.

git commit -m “bad config”

You should see the following error that disallows the commit to the local repository and shows which one of your Guard rules failed.

amazon-eks-controlplane.template.yaml Status = FAIL
		
FAILED rules
		
rules.guard/eks_public_disallowed    FAIL
		
---
		
Evaluation of rules rules.guard against data amazon-eks-controlplane.template.yaml
		
---
		
Property
[/Resources/EKS/Properties/ResourcesVpcConfig/EndpointPublicAccess] in data
[eks.template.yaml] is not compliant with [rules.guard/eks_public_disallowed] 
because provided value [true] did not match expected value [false]. 
Error Message []

(Optional) You can also test hooks before committing by using the pre-commit run command to see similar output.

Cleanup

To avoid incurring ongoing charges, follow these cleanup steps to delete the resources and files you created as you followed along with this blog post.

To clean up resources and files

Remove your local repository.
rm -rf /path/to/repository
Delete the S3 bucket you created by running the following command.
aws s3 rb s3://<account-id>-cfn-guard-rules --force
(Optional) Remove the pre-commit hooks framework by running this command.
pip uninstall pre-commit

Conclusion

In this post, you learned how to use AWS CloudFormation Guard with the pre-commit framework locally to validate your infrastructure-as-code solutions before you push remote changes to your repositories.

You also learned how to extend the solution to use a centralized list of security rules that is stored in versioned code repositories (GitHub, Bitbucket, or GitLab) or an S3 bucket. And you learned how to further extend the solution with your own rules. You can find examples of rules to use in Guard’s Github repository or refer to write preventative compliance rules for AWS CloudFormation templates the cfn-guard way. You can then further configure other repositories to prevent misconfigurations by using the same Guard rules.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the KMS re:Post or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

New for AWS Control Tower – Region Deny and Guardrails to Help You Meet Data Residency Requirements

2021-11-30 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-aws-control-tower-region-deny-and-guardrails-to-help-you-meet-data-residency-requirements/

Many customers, such as those in highly regulated industries and the public sector, want to have control over where their data is stored and processed. AWS already offers many tools and features to comply with local laws and regulations, but we want to provide a simplified way to translate data residency requirements into controls that can be applied to single- and multi-account environments.

Starting today, you can use AWS Control Tower to deploy data residency preventive and detective controls, referred to as guardrails. These guardrails will prevent provisioning resources in unwanted AWS Regions by restricting access to AWS APIs through service control policies (SCPs) built and managed by AWS Control Tower. In this way, content cannot be created or transferred outside of your selected Regions at the infrastructure level. In this context, content can be software (including machine images), data, text, audio, video, or images hosted on AWS for processing or storage. For example, AWS customers in Germany can deny access to AWS services in Regions outside of Frankfurt with the exception of global services such as AWS Identity and Access Management (IAM) and AWS Organizations.

AWS Control Tower also offers guardrails to further control data residency in underlying AWS service options, for example, blocking Amazon Simple Storage Service (Amazon S3) cross-region replication or blocking the creation of internet gateways.

The AWS account used for managing AWS Control Tower is not restricted by the new Region deny settings. That account can be used for remediation if you have data in an unwanted Region before enabling Region deny.

Detective guardrails are implemented via AWS Config rules and can further detect unexpected configuration changes that should not be allowed.

You still retain a shared responsibility model for data residency at the application level, but these controls can help you restrict what infrastructure and application teams can do on AWS.

Using Data Residency Guardrails in AWS Control Tower
To use the new data residency guardrails, you need to have created a landing zone using AWS Control Tower. See Plan your AWS Control Tower landing zone for more information.

To see all the new controls that are available, I select Guardrails on the left pane of the AWS Control Tower console and then find those in the Data Residency category. I sort results by Behavior. Guardrails that have a Prevention behavior are implemented as SCPs. Those that have a Detection behavior are implemented as AWS Config rules.

The most interesting guardrail is probably the one denying access to AWS based on the requested AWS Region. I choose it from the list and find that it is different from the other guardrails because it affects all Organizational Units (OUs) and cannot be activated here but must be activated in the landing zone settings.

Below the Overview, in the Guardrail components, there is a link to the full SCP for this guardrail, and I can see the list of the AWS APIs that, when this setting is enabled, are still going to be allowed towards non-governed Regions. Depending on your requirements, some of those services, such as Amazon CloudFront or AWS Global Accelerator, can be further limited by a custom SCP.

In the Landing zone settings, the Region deny guardrail is currently not enabled. I choose Modify settings and then enable the Region deny settings.

Below the Region deny settings, there is the list of AWS Regions governed by the landing zone. Those will be the regions allowed when I enable Region deny.

In my case, I have four governed Regions, two in the US and two in Europe:

US East (N. Virginia), which is also the home Region for the landing zone
US West (Oregon)
Europe (Ireland)
Europe (Frankfurt)

I choose Update landing zone at the bottom. The update of the landing zone takes a few minutes to complete. Now, the vast majority of the AWS APIs are blocked if they are not directed to one of those governed Regions. Let’s do a few tests.

Testing Region Deny in a Sandbox Account
Using AWS Single Sign-On, I copy the AWS credentials to use the sandbox account with AWSAdministratorAccess permissions. In a terminal, I paste the commands setting the environment variables to use those credentials.

Now, I try to start a new Amazon Elastic Compute Cloud (Amazon EC2) instance in US East (Ohio), one of the non-governed Regions. In a landing zone, the default VPC is replaced by a VPC managed by AWS Control Tower. To start the instance, I need to specify a VPC subnet. Let’s find a subnet ID that I can use.

aws ec2 describe-subnets --query 'Subnets[0].SubnetId' --region us-east-2

An error occurred (UnauthorizedOperation) when calling the DescribeSubnets operation:
You are not authorized to perform this operation.

As expected, I am not authorized to perform this operation in US East (Ohio). Let’s try to start an EC2 instance without passing the subnet ID.

aws ec2 run-instances --image-id ami-0dd0ccab7e2801812 --region us-east-2 \
    --instance-type t3.small                                     

An error occurred (UnauthorizedOperation) when calling the RunInstances operation:
You are not authorized to perform this operation.
Encoded authorization failure message: <ENCODED MESSAGE>

Again, I am not authorized. More information is included in the encoded authorization failure message that I can decode as described in this article:

aws sts decode-authorization-message --encoded-message <ENCODED MESSAGE>

The decoded message (that I have omitted for brevity) tells me that there was an explicit deny to my request and includes the full SCP that caused the deny. This information is really useful for debugging these kind of errors.

Now, let’s try in US East (N. Virginia), one of the four governed regions.

aws ec2 describe-subnets --query 'Subnets[0].SubnetId' --region us-east-1
"subnet-0f3580c0c5e56c210"

This time, the command returns the subnet ID of the first subnet returned by the request. Let’s start an instance in US East (N. Virginia) using this subnet.

aws ec2 run-instances --image-id  ami-04ad2567c9e3d7893 --region us-east-1 \
    --instance-type t3.small --subnet-id subnet-0f3580c0c5e56c210

As expected, it works, and I can see the EC2 instance running in the console.

Similarly, APIs for other AWS services are limited by the Region deny settings. For example, I can’t create an S3 bucket in a non-governed Region.

When I try to create the bucket, I get an access denied error.

As expected, the creation of an S3 bucket works in a governed Region.

Even if someone gives this account access to a bucket in a non-governed Region, I would not be able to copy any data into that bucket.

Other preventive guardrails can enforce data residency, for example:

Disallow cross-region networking for Amazon EC2, Amazon CloudFront, and AWS Global Accelerator
Disallow internet access for an Amazon VPC instance managed by a customer
Disallow Amazon Virtual Private Network (VPN) connections

Now, let’s see how detective guardrails work.

Testing Detective Guardrails in a Sandbox Account
I enable the following guardrails for all accounts in the sandbox OU:

Detect whether Amazon EBS snapshots are restorable by all AWS accounts
Detect whether public routes exist in the route table for an internet gateway

Now, I want to see what happens if I go against these guardrails. In the EC2 console, I create an EBS snapshot for the volume of the EC2 instance I started before. Then, I modify permissions to share it with all AWS accounts.

Then, in the VPC console, I create an internet gateway, attach it to the AWS Control Tower managed VPC, and update the route table of one of the private subnets to use the internet gateway.

After a few minutes, the noncompliant resources in the sandbox account are found by the detective guardrails.

I look at the information provided by the guardrails and update my configuration to fix the issues. In a multi-account setup I’d contact the account owner and ask for remediation.

Availability and Pricing
You can use data-residency guardrails to control resources in any AWS Region. To create a landing zone, you should start from one of the Regions where AWS Control Tower is offered. For more information, see the AWS Regional Services List. There is no additional cost for this feature. You pay the costs of other services used, such as AWS Config.

This feature provides you with a framework of controls and guidance for setting up a multi-account environment that addresses data residency requirements. Depending on your use case, you may use any subset of the new data residency guardrails.

Set up guardrails based on your data residency requirements with AWS Control Tower.

— Danilo

New – AWS Control Tower Account Factory for Terraform

2021-11-30 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-aws-control-tower-account-factory-for-terraform/

AWS Control Tower makes it easier to set up and manage a secure, multi-account AWS environment. AWS Control Tower uses AWS Organizations to create what is called a landing zone, bringing ongoing account management and governance based on our experience working with thousands of customers.

If you use AWS CloudFormation to manage your infrastructure as code, you can customize your AWS Control Tower landing zone using Customizations for AWS Control Tower, a solution that helps you deploy custom templates and policies to individual accounts and organizational units (OUs) within your organization.

But what if you use Terraform to manage your AWS infrastructure?

Today, I am happy to share the availability of AWS Control Tower Account Factory for Terraform (AFT), a new Terraform module maintained by the AWS Control Tower team that allows you to provision and customize AWS accounts through Terraform using a deployment pipeline. The source code for the development pipeline can be stored in AWS CodeCommit, GitHub, GitHub Enterprise, or BitBucket. With AFT, you can automate the creation of fully functional accounts that have access to all the resources they need to be productive. The module works with Terraform open source, Terraform Enterprise, and Terraform Cloud.

Let’s see how this works in practice.

Using AWS Control Tower Account Factory for Terraform
First, I create a main.tf file that uses the AWS Control Tower Account Factory for Terraform (AFT) module:

module "aft" {
  source = "[email protected]:aws-ia/terraform-aws-control_tower_account_factory.git"

  # Required Parameters
  ct_management_account_id    = "123412341234"
  log_archive_account_id      = "234523452345"
  audit_account_id            = "345634563456"
  aft_management_account_id   = "456745674567"
  ct_home_region              = "us-east-1"
  tf_backend_secondary_region = "us-west-2"

  # Optional Parameters
  terraform_distribution = "oss"
  vcs_provider           = "codecommit"

  # Optional Feature Flags
  aft_feature_delete_default_vpcs_enabled = false
  aft_feature_cloudtrail_data_events      = false
  aft_feature_enterprise_support          = false
}

The first six parameters are required. As a prerequisite, I need to pass the ID of four AWS accounts in my AWS organization:

ct_management_account_id – AWS Control Tower management account
log_archive_account_id – Log Archive account
audit_account_id – Audit account
aft_management_account_id – AFT management account

Then, I have to pass two AWS Regions:

ct_home_region – The Region from which this module will be executed. This must be the same Region where AWS Control Tower is deployed.
tf_backend_secondary_region – The backend primary Region is the same as the AFT Region. This parameter defines the secondary Region to replicate to. AFT creates a backend for state tracking for its own state. It is also used for Terraform when using the open-source version.

The other parameters are optional and are set to their default value in the previous main.tf file:

terraform_distribution – To select between Terraform open source (default), Enterprise, or Cloud
vcs_provider – To choose the version control system to use between AWS CodeCommit (default), GitHub, GitHub Enterprise, or BitBucket.

These feature flags are disabled by default and can be omitted unless you want to enable them:

aft_feature_delete_default_vpcs_enabled – To automatically delete the default VPC for new accounts.
aft_feature_cloudtrail_data_events – To enable AWS CloudTrail data events for new accounts. Be aware that this option, usually required for compliance in highly regulated environments, can have an impact on your costs.
aft_feature_enterprise_support – To automatically enroll new accounts with Enterprise Support (if you have an Enterprise Support Plan).

First, I initialize the project and download the plugins:

terraform init

Then, I use AWS Single Sign-On to log in with the AWS Control Tower management account and start the deployment:

terraform apply

I confirm with a yes and, after some time, the deployment is complete.

Now, I use AWS SSO again to log in with the AFT management account. In the AWS CodeCommit console, I find four repositories that I can use to customize the accounts created with AFT.

These repositories are used by pipelines managed by AWS CodePipeline to automate the account creation:

xaft-account-request – This is where I place requests for accounts provisioned and managed by AFT.
aft-global-customizations – I can use this repository to customize all provisioned accounts with customer-defined resources. The resources can be created through Terraform or through Python.
aft-account-customizations – Here, I can customize provisioned accounts depending on the value of the account_customizations_name parameter in the aft-account-request repository. In this way, I can create different sets of customizations depending on the role the account will be used for.
aft-account-provisioning-customizations – This repository uses AWS Step Functions to customize the provisioning process for new accounts and simplify the integration with additional environments. State machines can use AWS Lambda functions, Amazon Elastic Container Service (Amazon ECS) or AWS Fargate tasks, custom activities hosted either on AWS or on-premises, or Amazon Simple Notification Service (SNS) and Amazon Simple Queue Service (SQS) to communicate with external applications.

Currently, these four repositories are all empty. To start, I use the code in the sources/aft-customizations-repos folder in the GitHub repo of the AFT Terraform module.

Using the example in the aft-account-request repository, I prepare a template to create a couple of AWS accounts. One of the two accounts is for a software developer.

To help software developers be productive quickly, I create a specific account customization. In the template, I set the parameter account_customizations_name equal to developer-customization.

Then, in the aft-account-customizations repository, I create a developer-customization folder where I put a Terraform template to automatically create an AWS Cloud9 EC2-based development environment for new accounts of that type. Optionally, I can extend that with my Python code, for example, to invoke internal or external APIs. Using this approach, all new accounts for software developers will have their development environment ready as they go through the delivery pipeline.

I push the changes to the main branch (first for the aft-account-customizations repository, then for the aft-account-request). This triggers the execution of the pipeline. After a few minutes, the two new accounts are ready to be used.

You can customize accounts created by AFT based on your unique requirements. For example, you can provide each account with its own specific security setup (such as IAM roles or security groups) and storage (for example, pre-configured Amazon Simple Storage Service (Amazon S3) buckets).

Availability and Pricing
AWS Control Tower Account Factory for Terraform (AFT) works in any Region where AWS Control Tower is available. There are no additional costs when using AFT. You pay for the services used by the solution. For example, when you set up AWS Control Tower, you will begin to incur costs for AWS services configured to set up your landing zone and mandatory guardrails.

When building this solution, we worked together with HashiCorp. Armon Dadgar, HashiCorp Co-Founder and CTO, told us: “Managing cloud environments with hundreds or thousands of users can be a complex and time-consuming process. Using a software delivery pipeline integrating Terraform and AWS Control Tower makes it easier to achieve consistent governance and compliance requirements across all accounts.”

The pipeline provides an account creation process that monitors when account provisioning is complete and then triggers additional Terraform modules to enhance the account with further customizations. You can configure the pipeline to use your own custom Terraform modules or pick from pre-published Terraform modules for common products and configurations.

Simplify and standardize AWS account creation using AWS Control Tower Account Factory for Terraform.

— Danilo

New – Amazon CloudWatch Evidently – Experiments and Feature Management

2021-11-29 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/cloudwatch-evidently/

As a developer, I am excited to announce the availability of Amazon CloudWatch Evidently. This is a new Amazon CloudWatch capability that makes it easy for developers to introduce experiments and feature management in their application code. CloudWatch Evidently may be used for two similar but distinct use-cases: implementing dark launches, also known as feature flags, and A/B testing.

Features flags is a software development technique that lets you enable or disable features without needing to deploy your code. It decouples the feature deployment from the release. Features in your code are deployed in advance of the actual release. They stay hidden behind if-then-else statements. At runtime, your application code queries a remote service. The service decides the percentage of users who are exposed to the new feature. You can also configure the application behavior for some specific customers, your beta testers for example.

When you use feature flags you can deploy new code in advance of your launch. Then, you can progressively introduce a new feature to a fraction of your customers. During the launch, you monitor your technical and business metrics. As long as all goes well, you may increase traffic to expose the new feature to additional users. In the case that something goes wrong, you may modify the server-side routing with just one click or API call to present only the old (and working) experience to your customers. This lets you revert back user experience without requiring rollback deployments.

A/B Testing shares many similarities with feature flags while still serving a different purpose. A/B tests consist of a randomized experiment with multiple variations. A/B testing lets you compare multiple versions of a single feature, typically by testing the response of a subject to variation A against variation B, and determining which of the two is more effective. For example, let’s imagine an e-commerce website (a scenario we know quite well at Amazon). You might want to experiment with different shapes, sizes, or colors for the checkout button, and then measure which variation has the most impact on revenue.

The infrastructure required to conduct A/B testing is similar to the one required by feature flags. You deploy multiple scenarios in your app, and you control how to route part of the customer traffic to one scenario or the other. Then, you perform deep dive statistical analysis to compare the impacts of variations. CloudWatch Evidently assists in interpreting and acting on experimental results without the need for advanced statistical knowledge. You can use the insights provided by Evidently’s statistical engine, such as anytime p-value and confidence intervals for decision-making while an experiment is in progress.

At Amazon, we use feature flags extensively to control our launches, and A/B testing to experiment with new ideas. We’ve acquired years of experience to build developers’ tools and libraries and maintain and operate experimentation services at scale. Now you can benefit from our experience.

CloudWatch Evidently uses the terms “launches” for feature flags and “experiments” for A/B testing, and so do I in the rest of this article.

Let’s see how it works from an application developer point of view.

Launches in Action
For this demo, I use a simple Guestbook web application. So far, the guest book page is read-only, and comments are entered from our back-end only. I developed a new feature to let customers enter their comments on the guestbook page. I want to launch this new feature progressively over a week and keep the ability to revert the change back if it impacts important technical or business metrics (such as p95 latency, customer engagement, page views, etc.). Users are authenticated, and I will segment users based on their user ID.

Before launch:

After launch:

Create a Project
Let’s start by configuring Evidently. I open the AWS Management Console and navigate to CloudWatch Evidently. Then, I select Create a project.

I enter a Project name and Description.

Evidently lets you optionally store events to CloudWatch logs or S3, so that you can move them to systems such as Amazon Redshift to perform analytical operations. For this demo, I choose not to store events. When done, I select Create project.

Add a Feature
Next, I create a feature for this project by selecting Add feature. I enter a Feature name and Feature description. Next, I define my Feature variations. In this example, there are two variations, and I use a Boolean type. true indicates the guestbook is editable and false indicates it is read only. Variations types might be boolean, double, long, or string.

I may define overrides. Overrides let me pre-define the variation for selected users. I want the user “seb”, my beta tester, to always receive the editable variation.

The console shares the JavaScript and Java code snippets to add into my application.

Talking about code snippets, let’s look at the changes at the code level.

Instrument my Application Code
I use a simple web application for this demo. I coded this application using JavaScript. I use the AWS SDK for JavaScript and Webpack to package my code. I also use JQuery to manipulate the DOM to hide or show elements. I designed this application to use standard JavaScript and a minimum number of frameworks to make this example inclusive to all. Feel free to use higher level tools and frameworks, such as React or Angular for real-life projects.

I first initialize the Evidently client. Just like other AWS Services, I have to provide an access key and secret access key for authentication. Let’s leave the authentication part out for the moment. I added a note at the end of this article to discuss the options that you have. In this example, I use Amazon Cognito Identity Pools to receive temporary credentials.

// Initialize the Amazon CloudWatch Evidently client
const evidently = new AWS.Evidently({
    endpoint: EVIDENTLY_ENDPOINT,
    region: 'us-east-1',
    credentials: fromCognitoIdentityPool({
        client: new CognitoIdentityClient({ region: 'us-west-2' }),
        identityPoolId: IDENTITY_POOL_ID
    }),
});

Armed with this client, my code may invoke the EvaluateFeature API to make decisions about the variation to display to customers. The entityId is any string-based attribute to segment my customers. It might be a session ID, a customer ID, or even better, a hash of these. The featureName parameter contains the name of the feature to evaluate. In this example, I pass the value EditableGuestBook.

const evaluateFeature = async (entityId, featureName) => {

    // API request structure
    const evaluateFeatureRequest = {
        // entityId for calling evaluate feature API
        entityId: entityId,
        // Name of my feature
        feature: featureName,
        // Name of my project
        project: "AWSNewsBlog",
    };

    // Evaluate feature
    const response = await evidently.evaluateFeature(evaluateFeatureRequest).promise();
    console.log(response);
    return response;
}

The response contains the assignment decision from Evidently, as based on traffic rules defined on the server-side.

{
 details: {
   launch: "EditableGuestBook", group: "V2"},
   reason: "LAUNCH_RULE_MATCH", 
   value: {boolValue: false},
   variation: "readonly"
}}

The last part consists of hiding or displaying part of the user interface based on the value received above. Using basic JQuery DOM manipulation, it would be something like the following:

window.aws.evaluateFeature(entityId, 'EditableGuestbook').then((response, error) => {
    if (response.value.boolValue) {
        console.log('Feature Flag is on, showing guest book');
        $('div#guestbook-add').show();
    } else {
        console.log('Feature Flag is off, hiding guest book');
        $('div#guestbook-add').hide();
    }
});

Create a Launch
Now that the feature is defined on the server-side, and the client code is instrumented, I deploy the code and expose it to my customers. At a later stage, I may decide to launch the feature. I navigate back to the console, select my project, and select Create Launch. I choose a Launch name and a Launch description for my launch. Then, I select the feature I want to launch.

In the Launch Configuration section, I configure how much traffic is sent to each variation. I may also schedule the launch with multiple steps. This lets me plan different steps of routing based on a schedule. For example, on the first day, I may choose to send 10% of the traffic to the new feature, and on the second day 20%, etc. In this example, I decide to split the traffic 50/50.

Finally, I may define up to three metrics to measure the performance of my variations. Metrics are defined by applying rules to data events.

Again, I have to instrument my code to send these metrics with PutProjectEvents API from Evidently. Once my launch is created, the EvaluateFeature API returns different values for different values of entityId (users in this demo).

At any moment, I may change the routing configuration. Moreover, I also have access to a monitoring dashboard to observe the distribution of my variations and the metrics for each variation.

I am confident that your real-life launch graph will get more data than mine did, as I just created it to write this post.

A/B Testing
Doing an A/B test is similar. I create a feature to test, and I create an Experiment. I configure the experiment to route part of the traffic to variation 1, and then the other part to variation 2. When I am ready to launch the experiment, I explicitly select Start experiment.

In this experiment, I am interested in sending custom metrics. For example:

// pageLoadTime custom metric
const timeSpendOnHomePageData = `{
   "details": {
      "timeSpendOnHomePage": ${timeSpendOnHomePageValue}
   },
   "userDetails": { "userId": "${randomizedID}", "sessionId": "${randomizedID}" }
}`;

const putProjectEventsRequest: PutProjectEventsRequest = {
   project: 'AWSNewsBlog',
   events: [
    {
        timestamp: new Date(),
        type: 'aws.evidently.custom',
        data: JSON.parse(timeSpendOnHomePageData)
    },
   ],
};

this.evidently.putProjectEvents(putProjectEventsRequest).promise().then(res =>{})

Switching to the Results page, I see raw values and graph data for Event Count, Total Value, Average, Improvement (with 95% confidence interval), and Statistical significance. The statistical significance describes how certain we are that the variation has an effect on the metric as compared to the baseline.

These results are generated throughout the experiment and the confidence intervals and the statistical significance are guaranteed to be valid anytime you want to view them. Additionally, at the end of the experiment, Evidently also generates a Bayesian perspective of the experiment that provides information about how likely it is that a difference between the variations exists.

The following two screenshots show graphs for the average value of two metrics over time, and the improvement for a metric within a 95% confidence interval.

Additional Thoughts
Before we wrap-up, I’d like to share some additional considerations.

First, it is important to understand that I choose to demo Evidently in the context of front-end application development. However, you may use Evidently with any application type: front-end web or mobile, back-end API, or even machine learning (ML). For example, you may use Evidently to deploy two different ML models and conduct experiments just like I showed above.

Second, just like with other AWS Services, Evidently API is available in all of our AWS SDK. This lets you use EvaluateFeature and other APIs from nine programing languages: C++, Go, Java, JavaScript (and Typescript), .Net, NodeJS, PHP, Python, and Ruby. AWS SDK for Rust and Swift are in the making.

Third, for a front-end application as I demoed here, it is important to consider how to authenticate calls to Evidently API. Hard coding access keys and secret access keys is not an option. For the front-end scenario, I suggest that you use Amazon Cognito Identity Pools to exchange user identity tokens for a temporary access and secret keys. User identity tokens may be obtained from Cognito User Pools, or third-party authentications systems, such as Active Directory, Login with Amazon, Login with Facebook, Login with Google, Signin with Apple, or any system compliant with OpenID Connect or SAML. Cognito Identity Pools also allows for anonymous access. No identity token is required. Cognito Identity Pools vends temporary tokens associated with IAM roles. You must Allow calls to the evidently:EvaluateFeature API in your policies.

Finally, when using feature flags, plan for code cleanup time during your sprints. Once a feature is launched, you might consider removing calls to EvaluateFeature API and the if-then-else logic used to initially hide the feature.

Pricing and Availability
Amazon Cloudwatch Evidently is generally available in nine AWS Regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Ireland), Europe (Frankfurt), and Europe (Stockholm). As usual, we will gradually extend to other Regions in the coming months.

Pricing is pay-as-you-go with no minimum or recurring fees. CloudWatch Evidently charges your account based on Evidently events and Evidently analysis units. Evidently analysis units are generated from Evidently events, based on rules you have created in Evidently. For example, a user checkout event may produce two Evidently analysis units: checkout value and the number of items in cart. For more information about pricing, see Amazon CloudWatch Pricing.

Start experimenting with CloudWatch Evidently today!

— seb

New for AWS Compute Optimizer – Resource Efficiency Metrics to Estimate Savings Opportunities and Performance Risks

2021-11-29 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-aws-compute-optimizer-resource-efficiency-metrics-to-estimate-savings-opportunities-and-performance-risks/

By applying the knowledge drawn from Amazon’s experience running diverse workloads in the cloud, AWS Compute Optimizer identifies workload patterns and recommends optimal AWS resources.

Today, I am happy to share that AWS Compute Optimizer now delivers resource efficiency metrics alongside its recommendations to help you assess how efficiently you are using AWS resources:

A dashboard shows you savings and performance improvement opportunities at the account level. You can dive into resource types and individual resources from the dashboard.
The Estimated monthly savings (On-Demand) and Savings opportunity (%) columns estimate the possible savings for over-provisioned resources. You can sort your recommendations using these two columns to quickly find the resources on which to focus your optimization efforts.
The Current performance risk column estimates the bottleneck risk with the current configuration for under-provisioned resources.

These efficiency metrics are available for Amazon Elastic Compute Cloud (Amazon EC2), AWS Lambda, and Amazon Elastic Block Store (EBS) at the resource and AWS account levels.

For multi-account environments, Compute Optimizer continuously calculates resource efficiency metrics at individual account level in an AWS organization to help identify teams with low cost-efficiency or possible performance risks. This lets you to create goals and track progress over time. You can quickly understand just how resource-efficient teams and applications are, easily prioritize recommendation evaluation and adoption by engineering team, and establish a mechanism that drives a cost-aware culture and accountability across engineering teams.

Using Resource Efficiency Metrics in AWS Compute Optimizer
You can opt in using the AWS Management Console or the AWS Command Line Interface (CLI) to start using Compute Optimizer. You can enroll the account that you’re currently signed in to or all of the accounts within your organization. Depending on your choice, Compute Optimizer analyzes resources that are in your individual account or for each account in your organization, and then generates optimization recommendations for those resources.

To see your savings opportunity in Compute Optimizer, you should also opt in to AWS Cost Explorer and enable the rightsizing recommendations in the AWS Cost Explorer preferences page. For more details, see Getting started with rightsizing recommendations.

I already enrolled some time ago, and in the Compute Optimizer console I see the overall savings opportunity for my account.

Below that, I have a recap of the performance improvement opportunity. This includes an overview of the under-provisioned resources, as well as the performance risks that they pose by resource type.

Let’s dive into some of those savings. In the EC2 instances section, Compute Optimizer found 37 over-provisioned instances.

I follow the 37 instances link to get recommendations for those resources, and then sort the table by Estimated monthly savings (On-Demand) descending.

On the right, in the same table, I see which is the current instance type, the recommended instance type based on Computer Optimizer estimates, the difference in pricing, and if there are platform differences between the current and recommended instance types.

I can select each instance to further drill down into the metrics collected, as well as the other possible instance types suggested by Computer Optimizer.

Back to the Compute Optimizer Dashboard, in the Lambda functions section, I see that eight functions have under-provisioned memory.

Again, I follow the 8 functions link to get recommendations for those resources, and then sort the table by Current performance risk. In my case, the risk is always low, but different values can help prioritize your activities.

Here, I see the current and recommended configured memory for those Lambda functions. I can select each function to get a view of the metrics collected. Choosing the memory allocated to Lambda functions is an optimization process that balances speed (duration) and cost. See Profiling functions with AWS Lambda Power Tuning in the documentation for more information.

Availability and Pricing
You can use resource efficiency metrics with AWS Compute Optimizer in any AWS Region where it is offered. For more information, see the AWS Regional Services List. There is no additional charge for this new capability. See the AWS Compute Optimizer pricing page for more information.

This new feature lets you implement a periodic workflow to optimize your costs:

You can start by reviewing savings opportunities for all of your accounts to identify which accounts have the highest savings opportunity.
Then, you can drill into those accounts with the highest savings opportunity. You can refer to the estimated monthly savings to see which recommendations can drive the largest absolute cost impact.
Finally, you can communicate optimization opportunities and priority order to the teams using those accounts.

Start using AWS Compute Optimizer today to find and prioritize savings opportunities in your AWS account or organization.

— Danilo

New for AWS Compute Optimizer – Enhanced Infrastructure Metrics to Extend the Look-Back Period to Three Months

2021-11-29 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-aws-compute-optimizer-enhanced-infrastructure-metrics-to-extend-the-look-back-period-to-three-months/

By using machine learning to analyze historical utilization metrics, AWS Compute Optimizer recommends optimal AWS resources for your workloads to reduce costs and improve performance. Over-provisioning resources can lead to unnecessary infrastructure costs, and under-provisioning resources can lead to poor application performance. Compute Optimizer helps you choose optimal configurations for three types of AWS resources: Amazon Elastic Compute Cloud (Amazon EC2) instances, Amazon Elastic Block Store (EBS) volumes, and AWS Lambda functions, based on your utilization data. Today, I am happy to share that AWS Compute Optimizer now supports recommendation preferences where you can opt in or out of features that enhance resource-specific recommendations.

For EC2 instances, AWS Compute Optimizer analyzes Amazon CloudWatch metrics from the past 14 days to generate recommendations. For this reason, recommendations weren’t relevant for a subset of workloads that had monthly or quarterly patterns. For those workloads, you had to look for unoptimized resources and determine the right resource configurations over a longer period of time. This can be time-consuming and requires deep cloud expertise, especially for large organizations.

With the launch of recommendation preferences, Compute Optimizer now offers enhanced infrastructure metrics, a new paid recommendation preference feature that enhances recommendation quality for EC2 instances and Auto Scaling groups. Activating it extends the metrics look-back period to three months. You can activate enhanced infrastructure metrics for individual resources or at the AWS account or AWS organization level.

Let’s see how that works in practice.

Using Enhanced Infrastructure Metrics with AWS Compute Optimizer
Here, I am using the management account of my AWS organization to see organization-level preferences. In the left pane of the Compute Optimizer console, I choose Accounts. Here, there is a new section to set up Organization level preferences for enhanced infrastructure metrics. The console warns me that this is a paid feature.

I want to activate enhanced infrastructure metrics for EC2 instances running in the US East (N. Virginia) Region for all accounts in my organization. I choose the Edit button. For Resource type, I select EC2 instances. For Region, I select US East (N. Virginia). I check that the flag is active and save.

If I select one of the AWS accounts on this page, I can choose View preferences and override the setting for that specific account. For example, I can disable accounts that I use for testing because EC2 instances there are created automatically by a CI/CD pipeline and are usually terminated within a few hours.

In the console Dashboard, I look at the overall recommendations for EC2 instances and Auto Scaling groups.

In the EC2 instances box, I choose View recommendations and then one of the instances. With the Edit button, I can activate or inactivate enhanced infrastructure metrics for this specific resource. Here, I can also see if, considering all settings at organization, account, and resource level, enhanced infrastructure metrics is actually active or not for this specific EC2 instance. I see Active (pending) here because I’ve just changed the setting and it may take a few hours for Compute Optimizer to consider my updated preferences in its recommendations.

Below, I see the recommended options for the instance. Considering the current workload, I should change instance type and size from c3.2xlarge to r5d.large and save some money.

In a few hours, Compute Optimizer updates its recommendations based on the latest three months of CloudWatch metrics. In this way, I get better suggestions for workloads that have monthly or quarterly activities.

Availability and Pricing
You can activate enhanced infrastructure metrics in the AWS Compute Optimizer account preferences page for all the accounts in your organization or for individual accounts. If you need more granular controls, you can activate (or deactivate) for an individual resource (Auto Scaling group or EC2 instance) in the resource detail page. You can also activate enhanced infrastructure metrics using the AWS Command Line Interface (CLI) or AWS SDKs.

Default preferences in Compute Optimizer (with 14-day look-back) are free. Enabling enhanced infrastructure metrics costs $0.0003360215 per resource per hour and is charged based on the number of hours per month the resource is running. For a resource running a full 31-day month, that’s $0.25. For more information, see the Compute Optimizer pricing page.

Use enhanced infrastructure metrics to generate recommendations with Compute Optimizer based on metrics from the past three months.

— Danilo

New for AWS Distro for OpenTelemetry – Tracing Support is Now Generally Available

2021-09-23 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-aws-distro-for-opentelemetry-tracing-support-is-now-generally-available/

Last year before re:Invent, we introduced the public preview of AWS Distro for OpenTelemetry, a secure distribution of the OpenTelemetry project supported by AWS. OpenTelemetry provides tools, APIs, and SDKs to instrument, generate, collect, and export telemetry data to better understand the behavior and the performance of your applications. Yesterday, upstream OpenTelemetry announced tracing stability milestone for its components. Today, I am happy to share that support for traces is now generally available in AWS Distro for OpenTelemetry.

Using OpenTelemetry, you can instrument your applications just once and then send traces to multiple monitoring solutions.

You can use AWS Distro for OpenTelemetry to instrument your applications running on Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (EKS), and AWS Lambda, as well as on premises. Containers running on AWS Fargate and orchestrated via either ECS or EKS are also supported.

You can send tracing data collected by AWS Distro for OpenTelemetry to AWS X-Ray, as well as partner destinations such as:

AppDynamics, Dynatrace, Grafana, Honeycomb, Lightstep, NewRelic, and SumoLogic – which support OpenTelemetry Protocol (OTLP) exporters natively.
Datadog, Logz.io, Splunk – which have their own exporters.

You can use auto-instrumentation agents to collect traces without changing your code. Auto-instrumentation is available today for Java and Python applications. Auto-instrumentation support for Python currently only covers the AWS SDK. You can instrument your applications using other programming languages (such as Go, Node.js, and .NET) with the OpenTelemetry SDKs.

Let’s see how this works in practice for a Java application.

Visualizing Traces for a Java Application Using Auto-Instrumentation
I create a simple Java application that shows the list of my Amazon Simple Storage Service (Amazon S3) buckets and my Amazon DynamoDB tables:

package com.example.myapp;

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.*;
import software.amazon.awssdk.services.dynamodb.model.DynamoDbException;
import software.amazon.awssdk.services.dynamodb.model.ListTablesResponse;
import software.amazon.awssdk.services.dynamodb.model.ListTablesRequest;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;

import java.util.List;

/**
 * Hello world!
 *
 */
public class App {

    public static void listAllTables(DynamoDbClient ddb) {

        System.out.println("DynamoDB Tables:");

        boolean moreTables = true;
        String lastName = null;

        while (moreTables) {
            try {
                ListTablesResponse response = null;
                if (lastName == null) {
                    ListTablesRequest request = ListTablesRequest.builder().build();
                    response = ddb.listTables(request);
                } else {
                    ListTablesRequest request = ListTablesRequest.builder().exclusiveStartTableName(lastName).build();
                    response = ddb.listTables(request);
                }

                List<String> tableNames = response.tableNames();

                if (tableNames.size() > 0) {
                    for (String curName : tableNames) {
                        System.out.format("* %s\n", curName);
                    }
                } else {
                    System.out.println("No tables found!");
                    System.exit(0);
                }

                lastName = response.lastEvaluatedTableName();
                if (lastName == null) {
                    moreTables = false;
                }
            } catch (DynamoDbException e) {
                System.err.println(e.getMessage());
                System.exit(1);
            }
        }

        System.out.println("Done!\n");
    }

    public static void listAllBuckets(S3Client s3) {

        System.out.println("S3 Buckets:");

        ListBucketsRequest listBucketsRequest = ListBucketsRequest.builder().build();
        ListBucketsResponse listBucketsResponse = s3.listBuckets(listBucketsRequest);
        listBucketsResponse.buckets().stream().forEach(x -> System.out.format("* %s\n", x.name()));

        System.out.println("Done!\n");
    }

    public static void listAllBucketsAndTables(S3Client s3, DynamoDbClient ddb) {
        listAllBuckets(s3);
        listAllTables(ddb);
    }

    public static void main(String[] args) {

        Region region = Region.EU_WEST_1;

        S3Client s3 = S3Client.builder().region(region).build();
        DynamoDbClient ddb = DynamoDbClient.builder().region(region).build();

        listAllBucketsAndTables(s3, ddb);

        s3.close();
        ddb.close();
    }
}

I package the application using Apache Maven. Here’s the Project Object Model (POM) file managing dependencies such as the AWS SDK for Java 2.x that I use to interact with S3 and DynamoDB:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
  <groupId>com.example.myapp</groupId>
  <artifactId>myapp</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>myapp</name>
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>bom</artifactId>
        <version>2.17.38</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
    </dependencies>
  </dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>s3</artifactId>
    </dependency>
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>dynamodb</artifactId>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.8.1</version>
        <configuration>
          <source>8</source>
          <target>8</target>
        </configuration>
      </plugin>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <archive>
            <manifest>
              <mainClass>com.example.myapp.App</mainClass>
            </manifest>
          </archive>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

I use Maven to create an executable Java Archive (JAR) file that includes all dependencies:

$ mvn clean compile assembly:single

To run the application and get tracing data, I need two components:

The AWS Distro for OpenTelemetry Auto-Instrumentation Agent for Java, a Java agent that can be attached to any Java 8+ application to capture telemetry from a number of popular libraries and frameworks, including the AWS SDK.
The AWS Distro for OpenTelemetry Collector, an executable that can receive, process, and export telemetry data to monitoring destinations.

In one terminal, I run the AWS Distro for OpenTelemetry Collector using Docker:

$ docker run --rm -p 4317:4317 -p 55680:55680 -p 8889:8888 \
         -e AWS_REGION=eu-west-1 \
         -e AWS_PROFILE=default \
         -v ~/.aws:/root/.aws \
         --name awscollector public.ecr.aws/aws-observability/aws-otel-collector:latest

The collector is now ready to receive traces and forward them to a monitoring platform. By default, the AWS Distro for OpenTelemetry Collector sends traces to AWS X-Ray. I can change the exporter or add more exporters by editing the collector configuration. For example, I can follow the documentation to configure OLTP exporters to send telemetry data using the OLTP protocol. In the documentation, I also find how to configure other partner destinations. [[ It would be great it we had a link for the partner section, I can find only links to a specific partner ]]

I download the latest version of the AWS Distro for OpenTelemetry Auto-Instrumentation Java Agent. Now, I run my application and use the agent to capture telemetry data without having to add any specific instrumentation the code. In the OTEL_RESOURCE_ATTRIBUTES environment variable I set a name and a namespace for the service: [[ Are service.name and service.namespace being used by X-Ray? I couldn’t find them in the service map ]]

$ OTEL_RESOURCE_ATTRIBUTES=service.name=MyApp,service.namespace=MyTeam \
  java -javaagent:otel/aws-opentelemetry-agent.jar \
       -jar myapp/target/myapp-1.0-SNAPSHOT-jar-with-dependencies.jar

As expected, I get the list of my S3 buckets globally and of the DynamoDB tables in the Region.

To generate more tracing data, I run the previous command a few times. Each time I run the application, telemetry data is collected by the agent and sent to the collector. The collector buffers the data and then sends it to the configured exporters. By default, it is sending traces to X-Ray.

Now, I look at the service map in the AWS X-Ray console to see my application’s interactions with other services:

And there they are! Without any change in the code, I see my application’s calls to the S3 and DynamoDB APIs. There were no errors, and all the circles are green. Inside the circles, I find the average latency of the invocations and the number of transactions per minute.

Adding Spans to a Java Application
The information automatically collected can be improved by providing more information with the traces. For example, I might have interactions with the same service in different parts of my application, and it would be useful to separate those interactions in the service map. In this way, if there is an error or high latency, I would know which part of my application is affected.

One way to do so is to use spans or segments. A span represents a group of logically related activities. For example, the listAllBucketsAndTables method is performing two operations, one with S3 and one with DynamoDB. I’d like to group them together in a span. The quickest way with OpenTelemetry is to add the @WithSpan annotation to the method. Because the result of a method usually depends on its arguments, I also use the @SpanAttribute annotation to describe which arguments in the method invocation should be automatically added as attributes to the span.

@WithSpan
    public static void listAllBucketsAndTables(@SpanAttribute("title") String title, S3Client s3, DynamoDbClient ddb) {

        System.out.println(title);

        listAllBuckets(s3);
        listAllTables(ddb);
    }

To be able to use the @WithSpan and @SpanAttribute annotations, I need to import them into the code and add the necessary OpenTelemetry dependencies to the POM. All these changes are based on the OpenTelemetry specifications and don’t depend on the actual implementation that I am using, or on the tool that I will use to visualize or analyze the telemetry data. I have only to make these changes once to instrument my application. Isn’t that great?

To better see how spans work, I create another method that is running the same operations in reverse order, first listing the DynamoDB tables, then the S3 buckets:

    @WithSpan
    public static void listTablesFirstAndThenBuckets(@SpanAttribute("title") String title, S3Client s3, DynamoDbClient ddb) {

        System.out.println(title);

        listAllTables(ddb);
        listAllBuckets(s3);
    }

The application is now running the two methods (listAllBucketsAndTables and listTablesFirstAndThenBuckets) one after the other. For simplicity, here’s the full code of the instrumented application:

package com.example.myapp;

import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.*;
import software.amazon.awssdk.services.dynamodb.model.DynamoDbException;
import software.amazon.awssdk.services.dynamodb.model.ListTablesResponse;
import software.amazon.awssdk.services.dynamodb.model.ListTablesRequest;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;

import java.util.List;

import io.opentelemetry.extension.annotations.SpanAttribute;
import io.opentelemetry.extension.annotations.WithSpan;

/**
 * Hello world!
 *
 */
public class App {

    public static void listAllTables(DynamoDbClient ddb) {

        System.out.println("DynamoDB Tables:");

        boolean moreTables = true;
        String lastName = null;

        while (moreTables) {
            try {
                ListTablesResponse response = null;
                if (lastName == null) {
                    ListTablesRequest request = ListTablesRequest.builder().build();
                    response = ddb.listTables(request);
                } else {
                    ListTablesRequest request = ListTablesRequest.builder().exclusiveStartTableName(lastName).build();
                    response = ddb.listTables(request);
                }

                List<String> tableNames = response.tableNames();

                if (tableNames.size() > 0) {
                    for (String curName : tableNames) {
                        System.out.format("* %s\n", curName);
                    }
                } else {
                    System.out.println("No tables found!");
                    System.exit(0);
                }

                lastName = response.lastEvaluatedTableName();
                if (lastName == null) {
                    moreTables = false;
                }
            } catch (DynamoDbException e) {
                System.err.println(e.getMessage());
                System.exit(1);
            }
        }

        System.out.println("Done!\n");
    }

    public static void listAllBuckets(S3Client s3) {

        System.out.println("S3 Buckets:");

        ListBucketsRequest listBucketsRequest = ListBucketsRequest.builder().build();
        ListBucketsResponse listBucketsResponse = s3.listBuckets(listBucketsRequest);
        listBucketsResponse.buckets().stream().forEach(x -> System.out.format("* %s\n", x.name()));

        System.out.println("Done!\n");
    }

    @WithSpan
    public static void listAllBucketsAndTables(@SpanAttribute("title") String title, S3Client s3, DynamoDbClient ddb) {

        System.out.println(title);

        listAllBuckets(s3);
        listAllTables(ddb);

    }

    @WithSpan
    public static void listTablesFirstAndThenBuckets(@SpanAttribute("title") String title, S3Client s3, DynamoDbClient ddb) {

        System.out.println(title);

        listAllTables(ddb);
        listAllBuckets(s3);

    }

    public static void main(String[] args) {

        Region region = Region.EU_WEST_1;

        S3Client s3 = S3Client.builder().region(region).build();
        DynamoDbClient ddb = DynamoDbClient.builder().region(region).build();

        listAllBucketsAndTables("My S3 buckets and DynamoDB tables", s3, ddb);
        listTablesFirstAndThenBuckets("My DynamoDB tables first and then S3 bucket", s3, ddb);

        s3.close();
        ddb.close();
    }
}

And here’s the updated POM that includes the additional OpenTelemetry dependencies:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>
  <groupId>com.example.myapp</groupId>
  <artifactId>myapp</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>myapp</name>
  <dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>bom</artifactId>
        <version>2.16.60</version>
        <type>pom</type>
        <scope>import</scope>
      </dependency>
    </dependencies>
  </dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>s3</artifactId>
    </dependency>
    <dependency>
      <groupId>software.amazon.awssdk</groupId>
      <artifactId>dynamodb</artifactId>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-extension-annotations</artifactId>
      <version>1.5.0</version>
    </dependency>
    <dependency>
      <groupId>io.opentelemetry</groupId>
      <artifactId>opentelemetry-api</artifactId>
      <version>1.5.0</version>
    </dependency>
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.8.1</version>
        <configuration>
          <source>8</source>
          <target>8</target>
        </configuration>
      </plugin>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <archive>
            <manifest>
              <mainClass>com.example.myapp.App</mainClass>
            </manifest>
          </archive>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

I compile my application with these changes and run it again a few times:

$ mvn clean compile assembly:single

$ OTEL_RESOURCE_ATTRIBUTES=service.name=MyApp,service.namespace=MyTeam \
  java -javaagent:otel/aws-opentelemetry-agent.jar \
       -jar myapp/target/myapp-1.0-SNAPSHOT-jar-with-dependencies.jar

Now, let’s look at the X-Ray service map, computed using the additional information provided by those annotations.

Now I see the two methods and the other services they invoke. If there are errors or high latency, I can easily understand how the two methods are affected.

In the Traces section of the X-Ray console, I look at the Raw data for some of the traces. Because the title argument was annotated with @SpanAttribute, each trace has the value of that argument in the metadata section.

Collecting Traces from Lambda Functions
The previous steps work on premises, on EC2, and with applications running in containers. To collect traces and use auto-instrumentation with Lambda functions, you can use the AWS managed OpenTelemetry Lambda Layers (a few examples are included in the repository).

After you add the Lambda layer to your function, you can use the environment variable OPENTELEMETRY_COLLECTOR_CONFIG_FILE to pass your own configuration to the collector. More information on using AWS Distro for OpenTelemetry with AWS Lambda is available in the documentation.

Availability and Pricing
You can use AWS Distro for OpenTelemetry to get telemetry data from your application running on premises and on AWS. There are no additional costs for using AWS Distro for OpenTelemetry. Depending on your configuration, you might pay for the AWS services that are destinations for OpenTelemetry data, such as AWS X-Ray, Amazon CloudWatch, and Amazon Managed Service for Prometheus (AMP).

To learn more, you are invited to this webinar on Thursday, October 7 at 10:00 am PT / 1:00 pm EDT / 7:00 pm CEST.

Simplify the instrumentation of your applications and improve their observability using AWS Distro for OpenTelemetry today.

— Danilo

Amazon Managed Grafana Is Now Generally Available with Many New Features

2021-09-01 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/amazon-managed-grafana-is-now-generally-available-with-many-new-features/

In December, we introduced the preview of Amazon Managed Grafana, a fully managed service developed in collaboration with Grafana Labs that makes it easy to use the open-source and the enterprise versions of Grafana to visualize and analyze your data from multiple sources. With Amazon Managed Grafana, you can analyze your metrics, logs, and traces without having to provision servers, or configure and update software.

During the preview, Amazon Managed Grafana was updated with new capabilities. Today, I am happy to announce that Amazon Managed Grafana is now generally available with additional new features:

Grafana has been upgraded to version 8 and offers new data sources, visualizations, and features, including library panels that you can build once and re-use on multiple dashboards, a Prometheus metrics browser to quickly find and query metrics, and new state timeline and status history visualizations.
To centralize the querying of additional data sources within an Amazon Managed Grafana workspace, you can now query data using the JSON data source plugin. You can now also query Redis, SAP HANA, Salesforce, ServiceNow, Atlassian Jira, and many more data sources.
You can use Grafana API keys to publish your own dashboards or give programmatic access to your Grafana workspace. For example, this is a Terraform recipe that you can use to add data sources and dashboards.
You can enable single sign-on to your Amazon Managed Grafana workspaces using Security Assertion Markup Language 2.0 (SAML 2.0). We have worked with these identity providers (IdP) to have them integrated at launch: CyberArk, Okta, OneLogin, Ping Identity, and Azure Active Directory.
All calls from the Amazon Managed Grafana console and code calls to Amazon Managed Grafana API operations are captured by AWS CloudTrail. In this way, you can have a record of actions taken in Amazon Managed Grafana by a user, role, or AWS service. Additionally, you can now audit mutating changes that occur in your Amazon Managed Grafana workspace, such as when a dashboard is deleted or data source permissions are changed.
The service is available in ten AWS Regions (full list at the end of the post).

Let’s do a quick walkthrough to see how this works in practice.

Using Amazon Managed Grafana
In the Amazon Managed Grafana console, I choose Create workspace. A workspace is a logically isolated, highly available Grafana server. I enter a name and a description for the workspace, and then choose Next.

I can use AWS Single Sign-On (AWS SSO) or an external identity provider via SAML to authenticate the users of my workspace. For simplicity, I select AWS SSO. Later in the post, I’ll show how SAML authentication works. If this is your first time using AWS SSO, you can see the prerequisites (such as having AWS Organizations set up) in the documentation.

Then, I choose the Service managed permission type. In this way, Amazon Managed Grafana will automatically provision the necessary IAM permissions to access the AWS Services that I select in the next step.

In Service managed permission settings, I choose to monitor resources in my current AWS account. If you use AWS Organizations to centrally manage your AWS environment, you can use Grafana to monitor resources in your organizational units (OUs).

I can optionally select the AWS data sources that I am planning to use. This configuration creates an AWS Identity and Access Management (IAM) role that enables Amazon Managed Grafana to access those resources in my account. Later, in the Grafana console, I can set up the selected services as data sources. For now, I select Amazon CloudWatch so that I can quickly visualize CloudWatch metrics in my Grafana dashboards.

Here I also configure permissions to use Amazon Managed Service for Prometheus (AMP) as a data source and have a fully managed monitoring solution for my applications. For example, I can collect Prometheus metrics from Amazon Elastic Kubernetes Service (EKS) and Amazon Elastic Container Service (Amazon ECS) environments, using AWS Distro for OpenTelemetry or Prometheus servers as collection agents.

In this step I also select Amazon Simple Notification Service (SNS) as a notification channel. Similar to the data sources before, this option gives Amazon Managed Grafana access to SNS but does not set up the notification channel. I can do that later in the Grafana console. Specifically, this setting adds SNS publish permissions to topics that start with grafana to the IAM role created by the Amazon Managed Grafana console. If you prefer to have tighter control on permissions for SNS or any data source, you can edit the role in the IAM console or use customer-managed permissions for your workspace.

Finally, I review all the options and create the workspace.

After a few minutes, the workspace is ready, and I find the workspace URL that I can use to access the Grafana console.

I need to assign at least one user or group to the Grafana workspace to be able to access the workspace URL. I choose Assign new user or group and then select one of my AWS SSO users.

By default, the user is assigned a Viewer user type and has view-only access to the workspace. To give this user permissions to create and manage dashboards and alerts, I select the user and then choose Make admin.

Back to the workspace summary, I follow the workspace URL and sign in using my AWS SSO user credentials. I am now using the open-source version of Grafana. If you are a Grafana user, everything is familiar. For my first configurations, I will focus on AWS data sources so I choose the AWS logo on the left vertical bar.

Here, I choose CloudWatch. Permissions are already set because I selected CloudWatch in the service-managed permission settings earlier. I select the default AWS Region and add the data source. I choose the CloudWatch data source and on the Dashboards tab, I find a few dashboards for AWS services such as Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Block Store (EBS), AWS Lambda, Amazon Relational Database Service (RDS), and CloudWatch Logs.

I import the AWS Lambda dashboard. I can now use Grafana to monitor invocations, errors, and throttles for Lambda functions in my account. I’ll save you the screenshot because I don’t have any interesting data in this Region.

Using SAML Authentication
If I don’t have AWS SSO enabled, I can authenticate users to the Amazon Managed Grafana workspace using an external identity provider (IdP) by selecting the SAML authentication option when I create the workspace. For existing workspaces, I can choose Setup SAML configuration in the workspace summary.

First, I have to provide the workspace ID and URL information to my IdP in order to generate IdP metadata for configuring this workspace.

After my IdP is configured, I import the IdP metadata by specifying a URL or copying and pasting to the editor.

Finally, I can map user permissions in my IdP to Grafana user permissions, such as specifying which users will have Administrator, Editor, and Viewer permissions in my Amazon Managed Grafana workspace.

Availability and Pricing
Amazon Managed Grafana is available today in ten AWS Regions: US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), Europe (Frankfurt), Europe (London), Asia Pacific (Singapore), Asia Pacific (Tokyo), Asia Pacific (Sydney), and Asia Pacific (Seoul). For more information, see the AWS Regional Services List.

With Amazon Managed Grafana, you pay for the active users per workspace each month. Grafana API keys used to publish dashboards are billed as an API user license per workspace each month. You can upgrade to Grafana Enterprise to have access to enterprise plugins, support, and on-demand training directly from Grafana Labs. For more information, see the Amazon Managed Grafana pricing page.

To learn more, you are invited to this webinar on Thursday, September 9 at 9:00 am PDT / 12:00 pm EDT / 6:00 pm CEST.

Start using Amazon Managed Grafana today to visualize and analyze your operational data at any scale.

— Danilo

New for AWS CloudFormation – Quickly Retry Stack Operations from the Point of Failure

2021-08-30 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-aws-cloudformation-quickly-retry-stack-operations-from-the-point-of-failure/

One of the great advantages of cloud computing is that you have access to programmable infrastructure. This allows you to manage your infrastructure as code and apply the same practices of application code development to infrastructure provisioning.

AWS CloudFormation gives you an easy way to model a collection of related AWS and third-party resources, provision them quickly and consistently, and manage them throughout their lifecycles. A CloudFormation template describes your desired resources and their dependencies so you can launch and configure them together as a stack. You can use a template to create, update, and delete an entire stack as a single unit instead of managing resources individually.

When you create or update a stack, your action might fail for different reasons. For example, there can be errors in the template, in the parameters of the template, or issues outside the template, such as AWS Identity and Access Management (IAM) permission errors. When such an error occurs, CloudFormation rolls back the stack to the previous stable condition. For a stack creation, that means deleting all resources created up to the point of the error. For a stack update, it means restoring the previous configuration.

This rollback to the previous state is great for production environments, but doesn’t make it easy to understand the reason for the error. Depending on the complexity of your template and the number of resources involved, you might spend lots of time waiting for all the resources to roll back before you can update the template with the right configuration and retry the operation.

Today, I am happy to share that now CloudFormation allows you to disable the automatic rollback, keep the resources successfully created or updated before the error occurs, and retry stack operations from the point of failure. In this way, you can quickly iterate to fix and remediate errors and greatly reduce the time required to test a CloudFormation template in a development environment. You can apply this new capability when you create a stack, when you update a stack, and when you execute a change set. Let’s see how this works in practice.

Quickly Iterate to Fix and Remediate a CloudFormation Stack
For one of my applications, I need to set up an Amazon Simple Storage Service (Amazon S3) bucket, an Amazon Simple Queue Service (SQS) queue, and an Amazon DynamoDB table that is streaming item-level changes to an Amazon Kinesis data stream. For this setup, I write down the first version of the CloudFormation template.

AWSTemplateFormatVersion: "2010-09-09"
Description: A sample template to fix & remediate
Parameters:
  ShardCountParameter:
    Type: Number
    Description: The number of shards for the Kinesis stream
Resources:
  MyBucket:
    Type: AWS::S3::Bucket
  MyQueue:
    Type: AWS::SQS::Queue
  MyStream:
    Type: AWS::Kinesis::Stream
    Properties:
      ShardCount: !Ref ShardCountParameter
  MyTable:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: "ArtistId"
          AttributeType: "S"
        - AttributeName: "Concert"
          AttributeType: "S"
        - AttributeName: "TicketSales"
          AttributeType: "S"
      KeySchema:
        - AttributeName: "ArtistId"
          KeyType: "HASH"
        - AttributeName: "Concert"
          KeyType: "RANGE"
      KinesisStreamSpecification:
        StreamArn: !GetAtt MyStream.Arn
Outputs:
  BucketName:
    Value: !Ref MyBucket
    Description: The name of my S3 bucket
  QueueName:
    Value: !GetAtt MyQueue.QueueName
    Description: The name of my SQS queue
  StreamName:
    Value: !Ref MyStream
    Description: The name of my Kinesis stream
  TableName:
    Value: !Ref MyTable
    Description: The name of my DynamoDB table

Now, I want to create a stack from this template. On the CloudFormation console, I choose Create stack. Then, I upload the template file and choose Next.

I enter a name for the stack. Then, I fill the stack parameters. My template file has one parameter (ShardCountParameter) used to configure the number of shards for the Kinesis data stream. I know that the number of shards should be greater or equal to one, but by mistake, I enter zero and choose Next.

To create, modify, or delete resources in the stack, I use an IAM role. In this way, I have a clear boundary for the permissions that CloudFormation can use for stack operations. Also, I can use the same role to automate the deployment of the stack later in a standardized and reproducible environment.

In Permissions, I select the IAM role to use for the stack operations.

Now it’s time to use the new feature! In the Stack failure options, I select Preserve successfully provisioned resources to keep, in case of errors, the resources that have already been created. Failed resources are always rolled back to the last known stable state.

I leave all other options at their defaults and choose Next. Then, I review my configurations and choose Create stack.

The creation of the stack is in progress for a few seconds, and then it fails because of an error. In the Events tab, I look at the timeline of the events. The start of the creation of the stack is at the bottom. The most recent event is at the top. Properties validation for the stream resource failed because the number of shards (ShardCount) is below the minimum. For this reason, the stack is now in the CREATE_FAILED status.

Because I chose to preserve the provisioned resources, all resources created before the error are still there. In the Resources tab, the S3 bucket and the SQS queue are in the CREATE_COMPLETE status, while the Kinesis data stream is in the CREATE_FAILED status. The creation of the DynamoDB table depends on the Kinesis data stream to be available because the table uses the data stream in one of its properties (KinesisStreamSpecification). As a consequence of that, the table creation has not started yet, and the table is not in the list.

The rollback is now paused, and I have a few new options:

Retry – To retry the stack operation without any change. This option is useful if a resource failed to provision due to an issue outside the template. I can fix the issue and then retry from the point of failure.

Update – To update the template or the parameters before retrying the stack creation. The stack update starts from where the last operation was interrupted by an error.

Rollback – To roll back to the last known stable state. This is similar to default CloudFormation behavior.

Fixing Issues in the Parameters
I quickly realize the mistake I made while entering the parameter for the number of shards, so I choose Update.

I don’t need to change the template to fix this error. In Parameters, I fix the previous error and enter the correct amount for the number of shards: one shard.

I leave all other options at their current values and choose Next.

In Change set preview, I see that the update will try to modify the Kinesis stream (currently in the CREATE_FAILED status) and add the DynamoDB table. I review the other configurations and choose Update stack.

Now the update is in progress. Did I solve all the issues? Not yet. After some time, the update fails.

Fixing Issues Outside the Template
The Kinesis stream has been created, but the IAM role assumed by CloudFormation doesn’t have permissions to create the DynamoDB table.

In the IAM console, I add additional permissions to the role used by the stack operations to be able to create the DynamoDB table.

Back to the CloudFormation console, I choose the Retry option. With the new permissions, the creation of the DynamoDB table starts, but after some time, there is another error.

Fixing Issues in the Template
This time there is an error in my template where I define the DynamoDB table. In the AttributeDefinitions section, there is an attribute (TicketSales) that is not used in the schema.

With DynamoDB, attributes defined in the template should be used either for the primary key or for an index. I update the template and remove the TicketSales attribute definition.

Because I am editing the template, I take the opportunity to also add MinValue and MaxValue properties to the number of shards parameter (ShardCountParameter). In this way, CloudFormation can check that the value is in the correct range before starting the deployment, and I can avoid further mistakes.

I select the Update option. I choose to update the current template, and I upload the new template file. I confirm the current values for the parameters. Then, I leave all other options to their current values and choose Update stack.

This time, the creation of the stack is successful, and the status is UPDATE_COMPLETE. I can see all resources in the Resources tab and their description (based on the Outputs section of the template) in the Outputs tab.

Here’s the final version of the template:

AWSTemplateFormatVersion: "2010-09-09"
Description: A sample template to fix & remediate
Parameters:
  ShardCountParameter:
    Type: Number
    MinValue: 1
    MaxValue: 10
    Description: The number of shards for the Kinesis stream
Resources:
  MyBucket:
    Type: AWS::S3::Bucket
  MyQueue:
    Type: AWS::SQS::Queue
  MyStream:
    Type: AWS::Kinesis::Stream
    Properties:
      ShardCount: !Ref ShardCountParameter
  MyTable:
    Type: AWS::DynamoDB::Table
    Properties:
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: "ArtistId"
          AttributeType: "S"
        - AttributeName: "Concert"
          AttributeType: "S"
      KeySchema:
        - AttributeName: "ArtistId"
          KeyType: "HASH"
        - AttributeName: "Concert"
          KeyType: "RANGE"
      KinesisStreamSpecification:
        StreamArn: !GetAtt MyStream.Arn
Outputs:
  BucketName:
    Value: !Ref MyBucket
    Description: The name of my S3 bucket
  QueueName:
    Value: !GetAtt MyQueue.QueueName
    Description: The name of my SQS queue
  StreamName:
    Value: !Ref MyStream
    Description: The name of my Kinesis stream
  TableName:
    Value: !Ref MyTable
    Description: The name of my DynamoDB table

This was a simple example, but the new capability to retry stack operations from the point of failure already saved me lots of time. It allowed me to fix and remediate issues quickly, reducing the feedback loop and increasing the number of iterations that I can do in the same amount of time. In addition to using this for debugging, it is also great for incremental interactive development of templates. With more sophisticated applications, the time saved will be huge!

Fix and Remediate a CloudFormation Stack Using the AWS CLI
I can preserve successfully provisioned resources with the AWS Command Line Interface (CLI) by specifying the --disable-rollback option when I create a stack, update a stack, or execute a change set. For example:

aws cloudformation create-stack --stack-name my-stack \
    --template-body file://my-template.yaml -–disable-rollback
aws cloudformation update-stack --stack-name my-stack \
    --template-body file://my-template.yaml --disable-rollback
aws cloudformation execute-change-set --stack-name my-stack --change-set-name my-change-set \
    --template-body file://my-template.yaml --disable-rollback

For an existing stack, I can see if the DisableRollback property is enabled with the describe stack command:

aws cloudformation describe-stacks --stack-name my-stack

I can now update stacks in the CREATE_FAILED or UPDATE_FAILED status. To manually roll back a stack that is in the CREATE_FAILED or UPDATE_FAILED status, I can use the new rollback stack command:

aws cloudformation rollback-stack --stack-name my-stack

Availability and Pricing
The capability for AWS CloudFormation to retry stack operations from the point of failure is available at no additional charge in the following AWS Regions: US East (N. Virginia, Ohio), US West (Oregon, N. California), AWS GovCloud (US-East, US-West), Canada (Central), Europe (Frankfurt, Ireland, London, Milan, Paris, Stockholm), Asia Pacific (Hong Kong, Mumbai, Osaka, Seoul, Singapore, Sydney, Tokyo), Middle East (Bahrain), Africa (Cape Town), and South America (São Paulo).

Do you prefer to define your cloud application resources using familiar programming languages such as JavaScript, TypeScript, Python, Java, C#, and Go? Good news! The AWS Cloud Development Kit (AWS CDK) team is planning to add support for the new capabilities described in this post in the next couple of weeks.

Spend less time to fix and remediate your CloudFormation stacks with the new capability to retry stack operations from the point of failure.

— Danilo

Run usage analytics on Amazon QuickSight using AWS CloudTrail

2021-02-24 Sunil Salunkhe

Post Syndicated from Sunil Salunkhe original https://aws.amazon.com/blogs/big-data/run-usage-analytics-on-amazon-quicksight-using-aws-cloudtrail/

Amazon QuickSight is a cloud-native BI service that allows end users to create and publish dashboards in minutes, without provisioning any servers or requiring complex licensing. You can view these dashboards on the QuickSight product console or embed them into applications and websites. After you deploy a dashboard, it’s important to assess how they and other assets are being adopted, accessed, and used across various departments or customers.

In this post, we use a QuickSight dashboard to present the following insights:

Most viewed and accessed dashboards
Most updated dashboards and analyses
Most popular datasets
Active users vs. idle users
Idle authors
Unused datasets (wasted SPICE capacity)

You can use these insights to reduce costs and create operational efficiencies in a deployment. The following diagram illustrates this architecture.

The following diagram illustrates this architecture.

Solution components

The following table summarizes the AWS services and resources that this solution uses.

Resource Type	Name	Purpose
AWS CloudTrail logs	CloudTrailMultiAccount	Capture all API calls for all AWS services across all AWS Regions for this account. You can use AWS Organizations to consolidate trails across multiple AWS accounts.
AWS Glue crawler	QSCloudTrailLogsCrawler QSProcessedDataCrawler	Ensures that all CloudTrail data is crawled periodically and that partitions are updated in the AWS Glue Data Catalog.
AWS Glue ETL job	QuickSightCloudTrailProcessing	Reads catalogued data from the crawler, processes, transforms, and stores it in an S3 output bucket.
AWS Lambda function	ExtractQSMetadata_func	Extracts event data using the AWS SDK for Python, Boto3. The event data is enriched with QuickSight metadata objects like user, analysis, datasets, and dashboards.
Amazon Simple Storage Service (s3)	CloudTrailLogsBucket QuickSight-BIonBI-processed	One bucket stores CloudTrail data. The other stores processed data.
Amazon QuickSight	Quicksight_BI_On_BO_Analysis	Visualizes the processed data.

Solution walkthrough

AWS CloudTrail is a service that enables governance, compliance, operational auditing, and risk auditing of your AWS account. You can use CloudTrail to log, continuously monitor, and retain account activity related to actions across your AWS infrastructure. You can define a trail to collect API actions across all AWS Regions. Although we have enabled a trail for all Regions in our solution, the dashboard shows the data for single Region only.

After you enable CloudTrail, it starts capturing all API actions and then, at 15-minute intervals, delivers logs in JSON format to a configured Amazon Simple Storage Service (Amazon S3) bucket. Before the logs are made available to our ad hoc query engine, Amazon Athena, they must be parsed, transformed, and processed by the AWS Glue crawler and ETL job.

Before the logs are made available to our ad hoc query engine

This will be handled by AWS Glue Crawler & AWS Glue ETL Job. The AWS Glue crawler crawls through the data every day and populates new partitions in the Data Catalog. The data is later made available as a table on the Athena console for processing by the AWS Glue ETL job. Glue ETL Job QuickSightCloudtrail_GlueJob.txt filters logs and processes only those events where the event source is QuickSight. (for example, eventSource = quicksight.amazonaws.com’).

The following screenshot shows the sample JSON for the QuickSight API calls.

The job processes those events and creates a Parquet file. The following table summarizes the file’s data points.

Quicksightlogs
Field Name	Data Type
`eventtime`	Datetime
`eventname`	String
`awsregion`	String
`accountid`	String
`username`	String
`analysisname`	String
`Date`	Date

The processed data is stored in an S3 folder at s3://<BucketName>/processedlogs/. For performance optimization during querying and connecting this data to QuickSight for visualization, these logs are partitioned by date field. For this reason, we recommend that you configure the AWS Glue crawler to detect the new data and partitions and update the Data Catalog for subsequent analysis. We have configured the crawler to run one time a day.

We need to enrich this log data with metadata from QuickSight, such as a list of analyses, users, and datasets. This metadata can be extracted using descibe_analysis, describe_user, describe_data_set in the AWS SDK for Python.

We provide an AWS Lambda function that is ideal for this extraction. We configured it to be triggered once a day through Amazon EventBridge. The extracted metadata is stored in the S3 folder at s3://<BucketName>/metadata/.

Now that we have processed logs and metadata for enrichment, we need to prepare the data visualization in QuickSight. Athena allows us to build views that can be imported into QuickSight as datasets.

We build the following views based on the tables populated by the Lambda function and the ETL job:

CREATE VIEW vw_quicksight_bionbi 
AS 
  SELECT Date_parse(eventtime, '%Y-%m-%dT%H:%i:%SZ') AS "Event Time", 
         eventname  AS "Event Name", 
         awsregion  AS "AWS Region", 
         accountid  AS "Account ID", 
         username   AS "User Name", 
         analysisname AS "Analysis Name", 
         dashboardname AS "Dashboard Name", 
         Date_parse(date, '%Y%m%d') AS "Event Date" 
  FROM   "quicksightbionbi"."quicksightoutput_aggregatedoutput" 

CREATE VIEW vw_users 
AS 
  SELECT usr.username "User Name", 
         usr.role     AS "Role", 
         usr.active   AS "Active" 
  FROM   (quicksightbionbi.users 
          CROSS JOIN Unnest("users") t (usr)) 

CREATE VIEW vw_analysis 
AS 
  SELECT aly.analysisname "Analysis Name", 
         aly.analysisid   AS "Analysis ID" 
  FROM   (quicksightbionbi.analysis 
          CROSS JOIN Unnest("analysis") t (aly)) 

CREATE VIEW vw_analysisdatasets 
AS 
  SELECT alyds.analysesname "Analysis Name", 
         alyds.analysisid   AS "Analysis ID", 
         alyds.datasetid    AS "Dataset ID", 
         alyds.datasetname  AS "Dataset Name" 
  FROM   (quicksightbionbi.analysisdatasets 
          CROSS JOIN Unnest("analysisdatasets") t (alyds)) 

CREATE VIEW vw_datasets 
AS 
  SELECT ds.datasetname AS "Dataset Name", 
         ds.importmode  AS "Import Mode" 
  FROM   (quicksightbionbi.datasets 
          CROSS JOIN Unnest("datasets") t (ds))

QuickSight visualization

Follow these steps to connect the prepared data with QuickSight and start building the BI visualization.

You can set up QuickSight access for end users through SSO providers such as AWS Single Sign-On (AWS SSO), Okta, Ping, and Azure AD so they don’t need to open the console.

You can set up QuickSight access for end users through SSO providers

On the QuickSight console, choose Datasets.
Choose New dataset to create a dataset for our analysis.

Choose New dataset to create a dataset for our analysis.

For Create a Data Set, choose Athena.

In the previous steps, we prepared all our data in the form of Athena views.

Configure permission for QuickSight to access AWS services, including Athena and its S3 buckets. For information, see Accessing Data Sources.

Configure permission for QuickSight to access AWS services,

For Data source name, enter QuickSightBIbBI.
Choose Create data source.

Choose Create data source.

On Choose your table, for Database, choose quicksightbionbi.
For Tables, select vw_quicksight_bionbi.
Choose Select.

Choose Select.

For Finish data set creation, there are two options to choose from:
1. Import to SPICE for quicker analytics – Built from the ground up for the cloud, SPICE uses a combination of columnar storage, in-memory technologies enabled through the latest hardware innovations, and machine code generation to run interactive queries on large datasets and get rapid responses. We use this option for this post.
2. Directly query your data – You can connect to the data source in real time, but if the data query is expected to bring bulky results, this option might slow down the dashboard refresh.
Choose Visualize to complete the data source creation process.

Choose Visualize to complete the data source creation process.

Now you can build your visualizations sheets. QuickSight refreshes the data source first. You can also schedule a periodic refresh of your data source.

Now you can build your visualizations sheets.

The following screenshot shows some examples of visualizations we built from the data source.

This dashboard presents us with two main areas for cost optimization:

Usage analysis – We can see how analyses and dashboards are being consumed by users. This area highlights the opportunity for cost saving by looking at datasets that have not been used for the last 90 days in any of the analysis but are still holding a major chunk of SPICE capacity.
Account governance – Because author subscriptions are charged on a fixed fee basis, it’s important to monitor if they are actively used. The dashboard helps us identify idle authors for the last 60 days.

Based on the information in the dashboard, we could do the following to save costs:

Use the idle authors data to delete these user accounts. For information, see Managing User Access Inside Amazon QuickSight
Use the SPICE capacity data to manage SPICE capacity by deleting unused datasets or purchasing additional capacity

Conclusion

In this post, we showed how you can use CloudTrail logs to review the use of QuickSight objects, including analysis, dashboards, datasets, and users. You can use the information available in dashboards to save money on storage, subscriptions, understand maturity of QuickSight Tool adoption and more.

About the Author

Sunil Salunkhe is a Senior Solution Architect working with Strategic Accounts on their vision to leverage the cloud to drive aggressive growth strategies. He practices customer obsession by solving their complex challenges in all the aspects of the cloud journey including scale, security and reliability. While not working, he enjoys playing cricket and go cycling with his wife and a son.

Developing enterprise application patterns with the AWS CDK

2021-01-15 Krishnakumar Rengarajan

Post Syndicated from Krishnakumar Rengarajan original https://aws.amazon.com/blogs/devops/developing-application-patterns-cdk/

Enterprises often need to standardize their infrastructure as code (IaC) for governance, compliance, and quality control reasons. You also need to manage and centrally publish updates to your IaC libraries. In this post, we demonstrate how to use the AWS Cloud Development Kit (AWS CDK) to define patterns for IaC and publish them for consumption in controlled releases using AWS CodeArtifact.

AWS CDK is an open-source software development framework to model and provision cloud application resources in programming languages such as TypeScript, JavaScript, Python, Java, and C#/.Net. The basic building blocks of AWS CDK are called constructs, which map to one or more AWS resources, and can be composed of other constructs. Constructs allow high-level abstractions to be defined as patterns. You can synthesize constructs into AWS CloudFormation templates and deploy them into an AWS account.

AWS CodeArtifact is a fully managed service for managing the lifecycle of software artifacts. You can use CodeArtifact to securely store, publish, and share software artifacts. Software artifacts are stored in repositories, which are aggregated into a domain. A CodeArtifact domain allows organizational policies to be applied across multiple repositories. You can use CodeArtifact with common build tools and package managers such as NuGet, Maven, Gradle, npm, yarn, pip, and twine.

Solution overview

In this solution, we complete the following steps:

Create two AWS CDK pattern constructs in Typescript: one for traditional three-tier web applications and a second for serverless web applications.
Publish the pattern constructs to CodeArtifact as npm packages. npm is the package manager for Node.js.
Consume the pattern construct npm packages from CodeArtifact and use them to provision the AWS infrastructure.

We provide more information about the pattern constructs in the following sections. The source code mentioned in this blog is available in GitHub.

Note: The code provided in this blog post is for demonstration purposes only. You must ensure that it meets your security and production readiness requirements.

Traditional three-tier web application construct

The first pattern construct is for a traditional three-tier web application running on Amazon Elastic Compute Cloud (Amazon EC2), with AWS resources consisting of Application Load Balancer, an Autoscaling group and EC2 launch configuration, an Amazon Relational Database Service (Amazon RDS) or Amazon Aurora database, and AWS Secrets Manager. The following diagram illustrates this architecture.

Traditional stack architecture

Serverless web application construct

The second pattern construct is for a serverless application with AWS resources in AWS Lambda, Amazon API Gateway, and Amazon DynamoDB.

Serverless application architecture

Publishing and consuming pattern constructs

Both constructs are written in Typescript and published to CodeArtifact as npm packages. A semantic versioning scheme is used to version the construct packages. After a package gets published to CodeArtifact, teams can consume them for deploying AWS resources. The following diagram illustrates this architecture.

Pattern constructs

Prerequisites

Before getting started, complete the following steps:

Clone the code from the GitHub repository for the traditional and serverless web application constructs:

git clone https://github.com/aws-samples/aws-cdk-developing-application-patterns-blog.git
cd aws-cdk-developing-application-patterns-blog

Configure AWS Identity and Access Management (IAM) permissions by attaching IAM policies to the user, group, or role implementing this solution. The following policy files are in the iam folder in the root of the cloned repo:
- BlogPublishArtifacts.json – The IAM policy to configure CodeArtifact and publish packages to it.
- BlogConsumeTraditional.json – The IAM policy to consume the traditional three-tier web application construct from CodeArtifact and deploy it to an AWS account.
- PublishArtifacts.json – The IAM policy to consume the serverless construct from CodeArtifact and deploy it to an AWS account.

Configuring CodeArtifact

In this step, we configure CodeArtifact for publishing the pattern constructs as npm packages. The following AWS resources are created:

A CodeArtifact domain named blog-domain
Two CodeArtifact repositories:
- blog-npm-store – For configuring the upstream NPM repository.
- blog-repository – For publishing custom packages.

Deploy the CodeArtifact resources with the following code:

cd prerequisites/
rm -rf package-lock.json node_modules
npm install
cdk deploy --require-approval never
cd ..

Log in to the blog-repository. This step is needed for publishing and consuming the npm packages. See the following code:

aws codeartifact login \
     --tool npm \
     --domain blog-domain \
     --domain-owner $(aws sts get-caller-identity --output text --query 'Account') \
     --repository blog-repository

Publishing the pattern constructs

Change the directory to the serverless construct:
```
cd serverless
```

Install the required npm packages:

rm package-lock.json && rm -rf node_modules
npm install

Build the npm project:
```
npm run build
```
Publish the construct npm package to the CodeArtifact repository:
```
npm publish
```
Follow the previously mentioned steps for building and publishing a traditional (classic Load Balancer plus Amazon EC2) web app by running these commands in the traditional directory.

If the publishing is successful, you see messages like the following screenshots. The following screenshot shows the traditional infrastructure.

The following screenshot shows the message for the serverless infrastructure.

We just published version 1.0.1 of both the traditional and serverless web app constructs. To release a new version, we can simply update the version attribute in the package.json file in the traditional or serverless folder and repeat the last two steps.

The following code snippet is for the traditional construct:
```
{
    "name": "traditional-infrastructure",
    "main": "lib/index.js",
    "files": [
        "lib/*.js",
        "src"
    ],
    "types": "lib/index.d.ts",
    "version": "1.0.1",
...
}
```
The following code snippet is for the serverless construct:
```
{
    "name": "serverless-infrastructure",
    "main": "lib/index.js",
    "files": [
        "lib/*.js",
        "src"
    ],
    "types": "lib/index.d.ts",
    "version": "1.0.1",
...
}
```

Consuming the pattern constructs from CodeArtifact

In this step, we demonstrate how the pattern constructs published in the previous steps can be consumed and used to provision AWS infrastructure.

From the root of the GitHub package, change the directory to the examples directory containing code for consuming traditional or serverless constructs.To consume the traditional construct, use the following code:
```
cd examples/traditional
```
To consume the serverless construct, use the following code:
```
cd examples/serverless
```
Open the package.json file in either directory and note that the packages and versions we consume are listed in the dependencies section, along with their version.
The following code shows the traditional web app construct dependencies:
```
"dependencies": {
    "@aws-cdk/core": "1.30.0",
    "traditional-infrastructure": "1.0.1",
    "aws-cdk": "1.47.0"
}
```
The following code shows the serverless web app construct dependencies:
```
"dependencies": {
    "@aws-cdk/core": "1.30.0",
    "serverless-infrastructure": "1.0.1",
    "aws-cdk": "1.47.0"
}
```
Install the pattern artifact npm package along with the dependencies:
```
rm package-lock.json && rm -rf node_modules
npm install
```
As an optional step, if you need to override the default Lambda function code, build the npm project. The following commands build the Lambda function source code:
```
cd ../override-serverless
npm run build
cd -
```
Bootstrap the project with the following code:
```
cdk bootstrap
```
This step is applicable for serverless applications only. It creates the Amazon Simple Storage Service (Amazon S3) staging bucket where the Lambda function code and artifacts are stored.
Deploy the construct:
```
cdk deploy --require-approval never
```
If the deployment is successful, you see messages similar to the following screenshots. The following screenshot shows the traditional stack output, with the URL of the Load Balancer endpoint.

The following screenshot shows the serverless stack output, with the URL of the API Gateway endpoint.

You can test the endpoint for both constructs using a web browser or the following curl command:

curl <endpoint output>

The traditional web app endpoint returns a response similar to the following:

[{"app": "traditional", "id": 1605186496, "purpose": "blog"}]

The serverless stack returns two outputs. Use the output named ServerlessStack-v1.Api. See the following code:

[{"purpose":"blog","app":"serverless","itemId":"1605190688947"}]
Optionally, upgrade to a new version of pattern construct.
Let’s assume that a new version of the serverless construct, version 1.0.2, has been published, and we want to upgrade our AWS infrastructure to this version. To do this, edit the package.json file and change the traditional-infrastructure or serverless-infrastructure package version in the dependencies section to 1.0.2. See the following code example:
```
"dependencies": {
    "@aws-cdk/core": "1.30.0",
    "serverless-infrastructure": "1.0.2",
    "aws-cdk": "1.47.0"
}
```
To update the serverless-infrastructure package to 1.0.2, run the following command:
```
npm update
```
Then redeploy the CloudFormation stack:
```
cdk deploy --require-approval never
```

Cleaning up

To avoid incurring future charges, clean up the resources you created.

Delete all AWS resources that were created using the pattern constructs. We can use the AWS CDK toolkit to clean up all the resources:
```
cdk destroy --force
```
For more information about the AWS CDK toolkit, see Toolkit reference. Alternatively, delete the stack on the AWS CloudFormation console.
Delete the CodeArtifact resources by deleting the CloudFormation stack that was deployed via AWS CDK:
```
cd prerequisites
cdk destroy –force
```

Conclusion

In this post, we demonstrated how to publish AWS CDK pattern constructs to CodeArtifact as npm packages. We also showed how teams can consume the published pattern constructs and use them to provision their AWS infrastructure.

This mechanism allows your infrastructure for AWS services to be provisioned from the configuration that has been vetted for quality control and security and governance checks. It also provides control over when new versions of the pattern constructs are released, and when the teams consuming the constructs can upgrade to the newly released versions.

About the Authors

Usman Umar is a Sr. Applications Architect at AWS Professional Services. He is passionate about developing innovative ways to solve hard technical problems for the customers. In his free time, he likes going on biking trails, doing car modifications, and spending time with his family.

Krishnakumar Rengarajan is a DevOps Consultant with AWS Professional Services. He enjoys working with customers and focuses on building and delivering automated solutions that enables customers on their AWS cloud journeys.

Optimizing AWS Lambda cost and performance using AWS Compute Optimizer

2020-12-23 Chad Schmutzer

Post Syndicated from Chad Schmutzer original https://aws.amazon.com/blogs/compute/optimizing-aws-lambda-cost-and-performance-using-aws-compute-optimizer/

This post is authored by Brooke Chen, Senior Product Manager for AWS Compute Optimizer, Letian Feng, Principal Product Manager for AWS Compute Optimizer, and Chad Schmutzer, Principal Developer Advocate for Amazon EC2

Optimizing compute resources is a critical component of any application architecture. Over-provisioning compute can lead to unnecessary infrastructure costs, while under-provisioning compute can lead to poor application performance.

Launched in December 2019, AWS Compute Optimizer is a recommendation service for optimizing the cost and performance of AWS compute resources. It generates actionable optimization recommendations tailored to your specific workloads. Over the last year, thousands of AWS customers reduced compute costs up to 25% by using Compute Optimizer to help choose the optimal Amazon EC2 instance types for their workloads.

One of the most frequent requests from customers is for AWS Lambda recommendations in Compute Optimizer. Today, we announce that Compute Optimizer now supports memory size recommendations for Lambda functions. This allows you to reduce costs and increase performance for your Lambda-based serverless workloads. To get started, opt in for Compute Optimizer to start finding recommendations.

Overview

With Lambda, there are no servers to manage, it scales automatically, and you only pay for what you use. However, choosing the right memory size settings for a Lambda function is still an important task. Computer Optimizer uses machine-learning based memory recommendations to help with this task.

These recommendations are available through the Compute Optimizer console, AWS CLI, AWS SDK, and the Lambda console. Compute Optimizer continuously monitors Lambda functions, using historical performance metrics to improve recommendations over time. In this blog post, we walk through an example to show how to use this feature.

Using Compute Optimizer for Lambda

This tutorial uses the AWS CLI v2 and the AWS Management Console.

In this tutorial, we setup two compute jobs that run every minute in AWS Region US East (N. Virginia). One job is more CPU intensive than the other. Initial tests show that the invocation times for both jobs typically last for less than 60 seconds. The goal is to either reduce cost without much increase in duration, or reduce the duration in a cost-efficient manner.

Based on these requirements, a serverless solution can help with this task. Amazon EventBridge can schedule the Lambda functions using rules. To ensure that the functions are optimized for cost and performance, you can use the memory recommendation support in Compute Optimizer.

In your AWS account, opt in to Compute Optimizer to start analyzing AWS resources. Ensure you have the appropriate IAM permissions configured – follow these steps for guidance. If you prefer to use the console to opt in, follow these steps. To opt in, enter the following command in a terminal window:

$ aws compute-optimizer update-enrollment-status --status Active

Once you enable Compute Optimizer, it starts to scan for functions that have been invoked for at least 50 times over the trailing 14 days. The next section shows two example scheduled Lambda functions for analysis.

Example Lambda functions

The code for the non-CPU intensive job is below. A Lambda function named lambda-recommendation-test-sleep is created with memory size configured as 1024 MB. An EventBridge rule is created to trigger the function on a recurring 1-minute schedule:

import json
import time

def lambda_handler(event, context):
  time.sleep(30)
  x=[0]*100000000
  return {
    'statusCode': 200,
    'body': json.dumps('Hello World!')
  }

The code for the CPU intensive job is below. A Lambda function named lambda-recommendation-test-busy is created with memory size configured as 128 MB. An EventBridge rule is created to trigger the function on a recurring 1-minute schedule:

import json
import random

def lambda_handler(event, context):
  random.seed(1)
  x=0
  for i in range(0, 20000000):
    x+=random.random()

  return {
    'statusCode': 200,
    'body': json.dumps('Sum:' + str(x))
  }

Understanding the Compute Optimizer recommendations

Compute Optimizer needs a history of at least 50 invocations of a Lambda function over the trailing 14 days to deliver recommendations. Recommendations are created by analyzing function metadata such as memory size, timeout, and runtime, in addition to CloudWatch metrics such as number of invocations, duration, error count, and success rate.

Compute Optimizer will gather the necessary information to provide memory recommendations for Lambda functions, and make them available within 48 hours. Afterwards, these recommendations will be refreshed daily.

These are recent invocations for the non-CPU intensive function:

Function duration is approximately 31.3 seconds with a memory setting of 1024 MB, resulting in a duration cost of about $0.00052 per invocation. Here are the recommendations for this function in the Compute Optimizer console:

The function is Not optimized with a reason of Memory over-provisioned. You can also fetch the same recommendation information via the CLI:

$ aws compute-optimizer \
  get-lambda-function-recommendations \
  --function-arns arn:aws:lambda:us-east-1:123456789012:function:lambda-recommendation-test-sleep

{
    "lambdaFunctionRecommendations": [
        {
            "utilizationMetrics": [
                {
                    "name": "Duration",
                    "value": 31333.63587049883,
                    "statistic": "Average"
                },
                {
                    "name": "Duration",
                    "value": 32522.04,
                    "statistic": "Maximum"
                },
                {
                    "name": "Memory",
                    "value": 817.67049838188,
                    "statistic": "Average"
                },
                {
                    "name": "Memory",
                    "value": 819.0,
                    "statistic": "Maximum"
                }
            ],
            "currentMemorySize": 1024,
            "lastRefreshTimestamp": 1608735952.385,
            "numberOfInvocations": 3090,
            "functionArn": "arn:aws:lambda:us-east-1:123456789012:function:lambda-recommendation-test-sleep:$LATEST",
            "memorySizeRecommendationOptions": [
                {
                    "projectedUtilizationMetrics": [
                        {
                            "name": "Duration",
                            "value": 30015.113193697029,
                            "statistic": "LowerBound"
                        },
                        {
                            "name": "Duration",
                            "value": 31515.86878891883,
                            "statistic": "Expected"
                        },
                        {
                            "name": "Duration",
                            "value": 33091.662123300975,
                            "statistic": "UpperBound"
                        }
                    ],
                    "memorySize": 900,
                    "rank": 1
                }
            ],
            "functionVersion": "$LATEST",
            "finding": "NotOptimized",
            "findingReasonCodes": [
                "MemoryOverprovisioned"
            ],
            "lookbackPeriodInDays": 14.0,
            "accountId": "123456789012"
        }
    ]
}

The Compute Optimizer recommendation contains useful information about the function. Most importantly, it has determined that the function is over-provisioned for memory. The attribute findingReasonCodes shows the value MemoryOverprovisioned. In memorySizeRecommendationOptions, Compute Optimizer has found that using a memory size of 900 MB results in an expected invocation duration of approximately 31.5 seconds.

For non-CPU intensive jobs, reducing the memory setting of the function often doesn’t have a negative impact on function duration. The recommendation confirms that you can reduce the memory size from 1024 MB to 900 MB, saving cost without significantly impacting duration. The new duration cost per invocation saves approximately 12%.

The Compute Optimizer console validates these calculations:

These are recent invocations for the second function which is CPU-intensive:

The function duration is about 37.5 seconds with a memory setting of 128 MB, resulting in a duration cost of about $0.000078 per invocation. The recommendations for this function appear in the Compute Optimizer console:

The function is also Not optimized with a reason of Memory under-provisioned. The same recommendation information is available via the CLI:

$ aws compute-optimizer \
  get-lambda-function-recommendations \
  --function-arns arn:aws:lambda:us-east-1:123456789012:function:lambda-recommendation-test-busy

{
    "lambdaFunctionRecommendations": [
        {
            "utilizationMetrics": [
                {
                    "name": "Duration",
                    "value": 36006.85851551957,
                    "statistic": "Average"
                },
                {
                    "name": "Duration",
                    "value": 38540.43,
                    "statistic": "Maximum"
                },
                {
                    "name": "Memory",
                    "value": 53.75978407557355,
                    "statistic": "Average"
                },
                {
                    "name": "Memory",
                    "value": 55.0,
                    "statistic": "Maximum"
                }
            ],
            "currentMemorySize": 128,
            "lastRefreshTimestamp": 1608725151.752,
            "numberOfInvocations": 741,
            "functionArn": "arn:aws:lambda:us-east-1:123456789012:function:lambda-recommendation-test-busy:$LATEST",
            "memorySizeRecommendationOptions": [
                {
                    "projectedUtilizationMetrics": [
                        {
                            "name": "Duration",
                            "value": 27340.37604781184,
                            "statistic": "LowerBound"
                        },
                        {
                            "name": "Duration",
                            "value": 28707.394850202432,
                            "statistic": "Expected"
                        },
                        {
                            "name": "Duration",
                            "value": 30142.764592712556,
                            "statistic": "UpperBound"
                        }
                    ],
                    "memorySize": 160,
                    "rank": 1
                }
            ],
            "functionVersion": "$LATEST",
            "finding": "NotOptimized",
            "findingReasonCodes": [
                "MemoryUnderprovisioned"
            ],
            "lookbackPeriodInDays": 14.0,
            "accountId": "123456789012"
        }
    ]
}

For this function, Compute Optimizer has determined that the function’s memory is under-provisioned. The value of findingReasonCodes is MemoryUnderprovisioned. The recommendation is to increase the memory from 128 MB to 160 MB.

This recommendation may seem counter-intuitive, since the function only uses 55 MB of memory per invocation. However, Lambda allocates CPU and other resources linearly in proportion to the amount of memory configured. This means that increasing the memory allocation to 160 MB also reduces the expected duration to around 28.7 seconds. This is because a CPU-intensive task also benefits from the increased CPU performance that comes with the additional memory.

After applying this recommendation, the new expected duration cost per invocation is approximately $0.000075. This means that for almost no change in duration cost, the job latency is reduced from 37.5 seconds to 28.7 seconds.

The Compute Optimizer console validates these calculations:

Applying the Compute Optimizer recommendations

To optimize the Lambda functions using Compute Optimizer recommendations, use the following CLI command:

$ aws lambda update-function-configuration \
  --function-name lambda-recommendation-test-sleep \
  --memory-size 900

After invoking the function multiple times, we can see metrics of these invocations in the console. This shows that the function duration has not changed significantly after reducing the memory size from 1024 MB to 900 MB. The Lambda function has been successfully cost-optimized without increasing job duration:

To apply the recommendation to the CPU-intensive function, use the following CLI command:

$ aws lambda update-function-configuration \
  --function-name lambda-recommendation-test-busy \
  --memory-size 160

After invoking the function multiple times, the console shows that the invocation duration is reduced to about 28 seconds. This matches the recommendation’s expected duration. This shows that the function is now performance-optimized without a significant cost increase:

Final notes

A couple of final notes:

Not every function will receive a recommendation. Compute optimizer only delivers recommendations when it has high confidence that these recommendations may help reduce cost or reduce execution duration.
As with any changes you make to an environment, we strongly advise that you test recommended memory size configurations before applying them into production.

Conclusion

You can now use Compute Optimizer for serverless workloads using Lambda functions. This can help identify the optimal Lambda function configuration options for your workloads. Compute Optimizer supports memory size recommendations for Lambda functions in all AWS Regions where Compute Optimizer is available. These recommendations are available to you at no additional cost. You can get started with Compute Optimizer from the console.

To learn more visit Getting started with AWS Compute Optimizer.

Announcing Amazon Managed Service for Grafana (in Preview)

2020-12-15 Marcia Villalba

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/announcing-amazon-managed-grafana-service-in-preview/

Today, in partnership with Grafana Labs, we are excited to announce in preview, Amazon Managed Service for Grafana (AMG), a fully managed service that makes it easy to create on-demand, scalable, and secure Grafana workspaces to visualize and analyze your data from multiple sources.

Grafana is one of the most popular open source technologies used to create observability dashboards for your applications. It has a pluggable data source model and support for different kinds of time series databases and cloud monitoring vendors. Grafana centralizes your application data from multiple open-source, cloud, and third-party data sources.

Many of our customers love Grafana, but don’t want the burden of self-hosting and managing it. AMG manages the provisioning, setup, scaling, version upgrades and security patching of Grafana, eliminating the need for customers to do it themselves. AMG automatically scales to support thousands of users with high availability.

With AMG, you will get a fully managed and secure data visualization service where you can query, correlate, and visualize operational metrics, logs and traces across multiple data sources including cloud services such as AWS, Google, and Microsoft. AMG is integrated with AWS data sources, such as Amazon CloudWatch, Amazon Elasticsearch Service, AWS X-Ray, AWS IoT SiteWise, Amazon Timestream, and others to collect operational data in a simple way. Additionally, AMG also provides plug-ins to connect to popular third-party data sources, such as Datadog, Splunk, ServiceNow, and New Relic by upgrading to Grafana Enterprise directly from the AWS Console.

AMG integrates directly into your AWS Organizations. You can define a AMG workspace in one AWS account that allows you to discover and access datasources in all your accounts and regions across your AWS organization. Creating dashboards in Grafana is easy as all these different datasources are discoverable in one place.

Customers really like Grafana for the ease of creating dashboards, it comes with many built-in dashboards to use when you add a new data source, or you can take advantage of its broad community of pre-built dashboards. For example, you can see in the following image a really nice dashboard that AMG created for me from one of my AWS Lambda function.

One of my favorite things from AMG is the built-in security features. You can easily enable single sign-on using AWS Single Sign-On, restrict access to data sources and dashboards to the right users, and access audit logs via AWS CloudTrail for your hosted Grafana workspace. With AWS Single Sign-On you can leverage your existing corporate directories to enforce authentication and authorization permissions.

Another powerful feature that AMG has is support for Alerts. AMG integrates with Amazon Simple Notification Service (SNS) so customers can send Grafana alerts to SNS as a notification destination. It also has support for four other alert destinations including PagerDuty, Slack, VictorOps and OpsGenie.

There are no up-front investments required to use AMG, and you only pay a monthly active user license fee. This means that you can provision many users to access to your Grafana workspace, but will only be billed for active users that log in and use the workspace that month. Users granted access but that do not log in, will not be billed that month. You can also upgrade to Grafana Enterprise using AWS Marketplace, to get access to enterprise plugins, support, and training content directly from Grafana Labs.

Availability

This service is available in US East (N. Virginia) and Europe (Ireland) regions. To learn more visit the AMG service page, and be sure to join our re:Invent session tomorrow 12/16 from 8:00am – 8:30am PST for a demo!

AMG is now available in preview; to get access to this service fill out the registration form here.

— Marcia

Join the Preview – Amazon Managed Service for Prometheus (AMP)

2020-12-15 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/join-the-preview-amazon-managed-service-for-prometheus-amp/

Observability is an essential aspect of running cloud infrastructure at scale. You need to know that your resources are healthy and performing as expected, and that your system is delivering the desired level of performance to your customers.

A lot of challenges arise when monitoring container-based applications. First, because container resources are transient and there are lots of metrics to watch, the monitoring data has strikingly high cardinality. In plain language this means that there are lots of unique values, which can make it harder to define a space-efficient storage model and to create queries that return meaningful results. Second, because a well-architected container-based system is composed using a large number of moving parts, ingesting, processing, and storing the monitoring data can become an infrastructure challenge of its own.

Prometheus is a leading open-source monitoring solution with an active developer and user community. It has a multi-dimensional data model that is a great fit for time series data collected from containers.

Introducing Amazon Managed Service for Prometheus (AMP)
Today we are launching a preview of Amazon Managed Service for Prometheus (AMP). This fully-managed service is 100% compatible with Prometheus. It supports the same metrics, the same PromQL queries, and can also make use of the 150+ Prometheus exporters. AMP runs across multiple Availability Zones for high availability, and is powered by CNCF Cortex for horizontal scalability. AMP will easily scale to ingest, store, and query millions of time series metrics.

The preview includes support for Amazon Elastic Kubernetes Service (EKS) and Amazon Elastic Container Service (ECS). It can also be used to monitor your self-managed Kubernetes clusters that are running in the cloud or on-premises.

Getting Started with Amazon Managed Service for Prometheus (AMP)
After joining the preview, I open the AMP Console, enter a name for my AMP workspace, and click Create to get started (API and CLI support is also available):

My workspace is active within a minute or so. The console provides me with the endpoints that I can use to write data to my workspace, and to issue queries:

It also provides guidance on how to configure an existing Prometheus server to send metrics to the AMP workspace:

I can also use AWS Distro for OpenTelemetry to scrape Prometheus metrics and send them to my AMP workspace.

Once I have stored some metrics in my workspace, I can run PromQL queries and I can use Grafana to create dashboards and other visualizations. Here’s a sample Grafana dashboard:

Join the Preview
As noted earlier, we’re launching Amazon Managed Service for Prometheus (AMP) in preview form and you are welcome to try it out today.

We’ll have more info (and a more detailed blog post) at launch time.

— Jeff;

Rapid and flexible Infrastructure as Code using the AWS CDK with AWS Solutions Constructs

2020-10-31 Biff Gaut

Post Syndicated from Biff Gaut original https://aws.amazon.com/blogs/devops/rapid-flexible-infrastructure-with-solutions-constructs-cdk/

Introduction

As workloads move to the cloud and all infrastructure becomes virtual, infrastructure as code (IaC) becomes essential to leverage the agility of this new world. JSON and YAML are the powerful, declarative modeling languages of AWS CloudFormation, allowing you to define complex architectures using IaC. Just as higher level languages like BASIC and C abstracted away the details of assembly language and made developers more productive, the AWS Cloud Development Kit (AWS CDK) provides a programming model above the native template languages, a model that makes developers more productive when creating IaC. When you instantiate CDK objects in your Typescript (or Python, Java, etc.) application, those objects “compile” into a YAML template that the CDK deploys as an AWS CloudFormation stack.

AWS Solutions Constructs take this simplification a step further by providing a library of common service patterns built on top of the CDK. These multi-service patterns allow you to deploy multiple resources with a single object, resources that follow best practices by default – both independently and throughout their interaction.

Comparison of an Application stack with Assembly Language, 4th generation language and Object libraries such as Hibernate with an IaC stack of CloudFormation, AWS CDK and AWS Solutions Constructs

Application Development Stack vs. IaC Development Stack

Solution overview

To demonstrate how using Solutions Constructs can accelerate the development of IaC, in this post you will create an architecture that ingests and stores sensor readings using Amazon Kinesis Data Streams, AWS Lambda, and Amazon DynamoDB.

An architecture diagram showing sensor readings being sent to a Kinesis data stream. A Lambda function will receive the Kinesis records and store them in a DynamoDB table.

Prerequisite – Setting up the CDK environment

Tip – If you want to try this example but are concerned about the impact of changing the tools or versions on your workstation, try running it on AWS Cloud9. An AWS Cloud9 environment is launched with an AWS Identity and Access Management (AWS IAM) role and doesn’t require configuring with an access key. It uses the current region as the default for all CDK infrastructure.

To prepare your workstation for CDK development, confirm the following:

Node.js 10.3.0 or later is installed on your workstation (regardless of the language used to write CDK apps).
You have configured credentials for your environment. If you’re running locally you can do this by configuring the AWS Command Line Interface (AWS CLI).
TypeScript 2.7 or later is installed globally (npm -g install typescript)

Before creating your CDK project, install the CDK toolkit using the following command:

npm install -g aws-cdk

Create the CDK project

First create a project folder called stream-ingestion with these two commands:

mkdir stream-ingestion
cd stream-ingestion

Now create your CDK application using this command:

npx [email protected] init app --language=typescript

Tip – This example will be written in TypeScript – you can also specify other languages for your projects.

At this time, you must use the same version of the CDK and Solutions Constructs. We’re using version 1.68.0 of both based upon what’s available at publication time, but you can update this with a later version for your projects in the future.

Let’s explore the files in the application this command created:

bin/stream-ingestion.ts – This is the module that launches the application. The key line of code is:

new StreamIngestionStack(app, 'StreamIngestionStack');

This creates the actual stack, and it’s in StreamIngestionStack that you will write the CDK code that defines the resources in your architecture.

lib/stream-ingestion-stack.ts – This is the important class. In the constructor of StreamIngestionStack you will add the constructs that will create your architecture.

During the deployment process, the CDK uploads your Lambda function to an Amazon S3 bucket so it can be incorporated into your stack.

To create that S3 bucket and any other infrastructure the CDK requires, run this command:

cdk bootstrap

The CDK uses the same supporting infrastructure for all projects within a region, so you only need to run the bootstrap command once in any region in which you create CDK stacks.

To install the required Solutions Constructs packages for our architecture, run the these two commands from the command line:

npm install @aws-solutions-constructs/[email protected]
npm install @aws-solutions-constructs/[email protected]

Write the code

First you will write the Lambda function that processes the Kinesis data stream messages.

Create a folder named lambda under stream-ingestion
Within the lambda folder save a file called lambdaFunction.js with the following contents:

var AWS = require("aws-sdk");

// Create the DynamoDB service object
var ddb = new AWS.DynamoDB({ apiVersion: "2012-08-10" });

AWS.config.update({ region: process.env.AWS_REGION });

// We will configure our construct to 
// look for the .handler function
exports.handler = async function (event) {
  try {
    // Kinesis will deliver records 
    // in batches, so we need to iterate through
    // each record in the batch
    for (let record of event.Records) {
      const reading = parsePayload(record.kinesis.data);
      await writeRecord(record.kinesis.partitionKey, reading);
    };
  } catch (err) {
    console.log(`Write failed, err:\n${JSON.stringify(err, null, 2)}`);
    throw err;
  }
  return;
};

// Write the provided sensor reading data to the DynamoDB table
async function writeRecord(partitionKey, reading) {

  var params = {
    // Notice that Constructs automatically sets up 
    // an environment variable with the table name.
    TableName: process.env.DDB_TABLE_NAME,
    Item: {
      partitionKey: { S: partitionKey },  // sensor Id
      timestamp: { S: reading.timestamp },
      value: { N: reading.value}
    },
  };

  // Call DynamoDB to add the item to the table
  await ddb.putItem(params).promise();
}

// Decode the payload and extract the sensor data from it
function parsePayload(payload) {

  const decodedPayload = Buffer.from(payload, "base64").toString(
    "ascii"
  );

  // Our CLI command will send the records to Kinesis
  // with the values delimited by '|'
  const payloadValues = decodedPayload.split("|", 2)
  return {
    value: payloadValues[0],
    timestamp: payloadValues[1]
  }
}

We won’t spend a lot of time explaining this function – it’s pretty straightforward and heavily commented. It receives an event with one or more sensor readings, and for each reading it extracts the pertinent data and saves it to the DynamoDB table.

You will use two Solutions Constructs to create your infrastructure:

The aws-kinesisstreams-lambda construct deploys an Amazon Kinesis data stream and a Lambda function.

aws-kinesisstreams-lambda creates the Kinesis data stream and Lambda function that subscribes to that stream. To support this, it also creates other resources, such as IAM roles and encryption keys.

The aws-lambda-dynamodb construct deploys a Lambda function and a DynamoDB table.

aws-lambda-dynamodb creates an Amazon DynamoDB table and a Lambda function with permission to access the table.

To deploy the first of these two constructs, replace the code in lib/stream-ingestion-stack.ts with the following code:

import * as cdk from "@aws-cdk/core";
import * as lambda from "@aws-cdk/aws-lambda";
import { KinesisStreamsToLambda } from "@aws-solutions-constructs/aws-kinesisstreams-lambda";

import * as ddb from "@aws-cdk/aws-dynamodb";
import { LambdaToDynamoDB } from "@aws-solutions-constructs/aws-lambda-dynamodb";

export class StreamIngestionStack extends cdk.Stack {
  constructor(scope: cdk.Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const kinesisLambda = new KinesisStreamsToLambda(
      this,
      "KinesisLambdaConstruct",
      {
        lambdaFunctionProps: {
          // Where the CDK can find the lambda function code
          runtime: lambda.Runtime.NODEJS_10_X,
          handler: "lambdaFunction.handler",
          code: lambda.Code.fromAsset("lambda"),
        },
      }
    );

    // Next Solutions Construct goes here
  }
}

Let’s explore this code:

It instantiates a new KinesisStreamsToLambda object. This Solutions Construct will launch a new Kinesis data stream and a new Lambda function, setting up the Lambda function to receive all the messages in the Kinesis data stream. It will also deploy all the additional resources and policies required for the architecture to follow best practices.
The third argument to the constructor is the properties object, where you specify overrides of default values or any other information the construct needs. In this case you provide properties for the encapsulated Lambda function that informs the CDK where to find the code for the Lambda function that you stored as lambda/lambdaFunction.js earlier.

Now you’ll add the second construct that connects the Lambda function to a new DynamoDB table. In the same lib/stream-ingestion-stack.ts file, replace the line // Next Solutions Construct goes here with the following code:

    // Define the primary key for the new DynamoDB table
    const primaryKeyAttribute: ddb.Attribute = {
      name: "partitionKey",
      type: ddb.AttributeType.STRING,
    };

    // Define the sort key for the new DynamoDB table
    const sortKeyAttribute: ddb.Attribute = {
      name: "timestamp",
      type: ddb.AttributeType.STRING,
    };

    const lambdaDynamoDB = new LambdaToDynamoDB(
      this,
      "LambdaDynamodbConstruct",
      {
        // Tell construct to use the Lambda function in
        // the first construct rather than deploy a new one
        existingLambdaObj: kinesisLambda.lambdaFunction,
        tablePermissions: "Write",
        dynamoTableProps: {
          partitionKey: primaryKeyAttribute,
          sortKey: sortKeyAttribute,
          billingMode: ddb.BillingMode.PROVISIONED,
          removalPolicy: cdk.RemovalPolicy.DESTROY
        },
      }
    );

    // Add autoscaling
    const readScaling = lambdaDynamoDB.dynamoTable.autoScaleReadCapacity({
      minCapacity: 1,
      maxCapacity: 50,
    });

    readScaling.scaleOnUtilization({
      targetUtilizationPercent: 50,
    });

Let’s explore this code:

The first two const objects define the names and types for the partition key and sort key of the DynamoDB table.
The LambdaToDynamoDB construct instantiated creates a new DynamoDB table and grants access to your Lambda function. The key to this call is the properties object you pass in the third argument.
- The first property sent to LambdaToDynamoDB is existingLambdaObj – by setting this value to the Lambda function created by KinesisStreamsToLambda, you’re telling the construct to not create a new Lambda function, but to grant the Lambda function in the other Solutions Construct access to the DynamoDB table. This illustrates how you can chain many Solutions Constructs together to create complex architectures.
- The second property sent to LambdaToDynamoDB tells the construct to limit the Lambda function’s access to the table to write only.
- The third property sent to LambdaToDynamoDB is actually a full properties object defining the DynamoDB table. It provides the two attribute definitions you created earlier as well as the billing mode. It also sets the RemovalPolicy to DESTROY. This policy setting ensures that the table is deleted when you delete this stack – in most cases you should accept the default setting to protect your data.
The last two lines of code show how you can use statements to modify a construct outside the constructor. In this case we set up auto scaling on the new DynamoDB table, which we can access with the dynamoTable property on the construct we just instantiated.

That’s all it takes to create the all resources to deploy your architecture.

Save all the files, then compile the Typescript into a CDK program using this command:

npm run build

Finally, launch the stack using this command:

cdk deploy

(Enter “y” in response to Do you wish to deploy all these changes (y/n)?)

You will see some warnings where you override CDK default values. Because you are doing this intentionally you may disregard these, but it’s always a good idea to review these warnings when they occur.

Tip – Many mysterious CDK project errors stem from mismatched versions. If you get stuck on an inexplicable error, check package.json and confirm that all CDK and Solutions Constructs libraries have the same version number (with no leading caret ^). If necessary, correct the version numbers, delete the package-lock.json file and node_modules tree and run npm install. Think of this as the “turn it off and on again” first response to CDK errors.

You have now deployed the entire architecture for the demo – open the CloudFormation stack in the AWS Management Console and take a few minutes to explore all 12 resources that the program deployed (and the 380 line template generated to created them).

Feed the Stream

Now use the CLI to send some data through the stack.

Go to the Kinesis Data Streams console and copy the name of the data stream. Replace the stream name in the following command and run it from the command line.

aws kinesis put-records \
--stream-name StreamIngestionStack-KinesisLambdaConstructKinesisStreamXXXXXXXX-XXXXXXXXXXXX \
--records \
PartitionKey=1301,'Data=15.4|2020-08-22T01:16:36+00:00' \
PartitionKey=1503,'Data=39.1|2020-08-22T01:08:15+00:00'

Tip – If you are using the AWS CLI v2, the previous command will result in an “Invalid base64…” error because v2 expects the inputs to be Base64 encoded by default. Adding the argument --cli-binary-format raw-in-base64-out will fix the issue.

To confirm that the messages made it through the service, open the DynamoDB console – you should see the two records in the table.

Now that you’ve got it working, pause to think about what you just did. You deployed a system that can ingest and store sensor readings and scale to handle heavy loads. You did that by instantiating two objects – well under 60 lines of code. Experiment with changing some property values and deploying the changes by running npm run build and cdk deploy again.

Cleanup

To clean up the resources in the stack, run this command:

cdk destroy

Conclusion

Just as languages like BASIC and C allowed developers to write programs at a higher level of abstraction than assembly language, the AWS CDK and AWS Solutions Constructs allow us to create CloudFormation stacks in Typescript, Java, or Python instead JSON or YAML. Just as there will always be a place for assembly language, there will always be situations where we want to write CloudFormation templates manually – but for most situations, we can now use the AWS CDK and AWS Solutions Constructs to create complex and complete architectures in a fraction of the time with very little code.

AWS Solutions Constructs can currently be used in CDK applications written in Typescript, Javascript, Java and Python and will be available in C# applications soon.

About the Author

Biff Gaut has been shipping software since 1983, from small startups to large IT shops. Along the way he has contributed to 2 books, spoken at several conferences and written many blog posts. He is now a Principal Solutions Architect at AWS working on the AWS Solutions Constructs team, helping customers deploy better architectures more quickly.

Walkthrough

Summary

About the Author

Background

AWS CDK

OpenAPI specification (formerly Swagger specification)

Project composition

app directory

cdk directory

CDK stacks

ApiStack

PipelineStack

CI/CD pipeline

Running the project

Cleaning up

Conclusion

Overview of migration using Application Migration Service

Baseline testing

Network connectivity

Source storage I/O

Application Migration Service EC2 replication instance size

Application Migration Service Elastic Block Store replication volume

Source server CPU

Conclusion

Related information

Prerequisites

Solution walkthrough

Cleanup

Conclusion

Solution components

Solution walkthrough

QuickSight visualization

Conclusion

About the Author

Solution overview

Traditional three-tier web application construct

Serverless web application construct

Publishing and consuming pattern constructs

Prerequisites

Configuring CodeArtifact

Publishing the pattern constructs

Consuming the pattern constructs from CodeArtifact

Cleaning up

Conclusion

About the Authors

Overview

Using Compute Optimizer for Lambda

Example Lambda functions

Understanding the Compute Optimizer recommendations

Applying the Compute Optimizer recommendations

Final notes

Conclusion

Introduction

Solution overview

Prerequisite – Setting up the CDK environment

Create the CDK project

Write the code

Feed the Stream

Cleanup

Conclusion

About the Author

The collective thoughts of the interwebz