Tag Archives: alarms

New – Application Load Balancing via IP Address to AWS & On-Premises Resources

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-application-load-balancing-via-ip-address-to-aws-on-premises-resources/

I told you about the new AWS Application Load Balancer last year and showed you how to use it to do implement Layer 7 (application) routing to EC2 instances and to microservices running in containers.

Some of our customers are building hybrid applications as part of a longer-term move to AWS. These customers have told us that they would like to use a single Application Load Balancer to spread traffic across a combination of existing on-premises resources and new resources running in the AWS Cloud. Other customers would like to spread traffic to web or database servers that are scattered across two or more Virtual Private Clouds (VPCs), host multiple services on the same instance with distinct IP addresses but a common port number, and to offer support for IP-based virtual hosting for clients that do not support Server Name Indication (SNI). Another group of customers would like to host multiple instances of a service on the same instance (perhaps within containers), while using multiple interfaces and security groups to implement fine-grained access control.

These situations arise within a broad set of hybrid, migration, disaster recovery, and on-premises use cases and scenarios.

Route to IP Addresses
In order to address these use cases, Application Load Balancers can now route traffic directly to IP addresses. These addresses can be in the same VPC as the ALB, a peer VPC in the same region, on an EC2 instance connected to a VPC by way of ClassicLink, or on on-premises resources at the other end of a VPN connection or AWS Direct Connect connection.

Application Load Balancers already group targets in to target groups. As part of today’s launch, each target group now has a target type attribute:

instance – Targets are registered by way of EC2 instance IDs, as before.

ip – Targets are registered as IP addresses. You can use any IPv4 address from the load balancer’s VPC CIDR for targets within load balancer’s VPC and any IPv4 address from the RFC 1918 ranges (10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16) or the RFC 6598 range (100.64.0.0/10) for targets located outside the load balancer’s VPC (this includes Peered VPC, EC2-Classic, and on-premises targets reachable over Direct Connect or VPN).

Each target group has a load balancer and health check configuration, and publishes metrics to CloudWatch, as has always been the case.

Let’s say that you are in the transition phase of an application migration to AWS or want to use AWS to augment on-premises resources with EC2 instances and you need to distribute application traffic across both your AWS and on-premises resources. You can achieve this by registering all the resources (AWS and on-premises) to the same target group and associate the target group with a load balancer. Alternatively, you can use DNS based weighted load balancing across AWS and on-premises resources using two load balancers i.e. one load balancer for AWS and other for on-premises resources. In the scenario where application-A back-ends are in VPC and application-B back-ends are in on-premises locations then you can put back-ends for each application in different target groups and use content based routing to route traffic to each target group.

Creating a Target Group
Here’s how I create a target group that sends traffic to some IP addresses as part of the process of creating an Application Load Balancer. I enter a name (ip-target-1) and select ip as the Target type:

Then I enter IP address targets. These can be from the VPC that hosts the load balancer:

Or they can be other private IP addresses within one of the private ranges listed above, for targets outside of the VPC that hosts the load balancer:

After I review the settings and create the load balancer, traffic will be sent to the designated IP addresses as soon as they pass the health checks. Each load balancer can accommodate up to 1000 targets.

I can examine my target group and edit the set of targets at any time:

As you can see, one of my targets was not healthy when I took this screen shot (this was by design). Metrics are published to CloudWatch for each target group; I can see them in the Console and I can create CloudWatch Alarms:

Available Now
This feature is available now and you can start using it today in all AWS Regions.

Jeff;

 

Announcing the Reputation Dashboard

Post Syndicated from Brent Meyer original https://aws.amazon.com/blogs/ses/announcing-the-reputation-dashboard/

The Amazon SES team is pleased to announce the addition of a reputation dashboard to the Amazon SES console. This new feature helps you track issues that could impact the sender reputation of your Amazon SES account.

What information does the reputation dashboard provide?

Amazon SES users must maintain bounce and complaint rates below a certain threshold. We put these rules in place to protect the sender reputations of all Amazon SES users, and to prevent Amazon SES from being used to deliver spam or malicious content. Users with very high rates of bounces or complaints may be put on probation. If the bounce or complaint rates are not within acceptable limits by the end of the probation period, these accounts may be shut down completely.

Previous versions of Amazon SES provided basic sending metrics, including information about bounces and complaints. However, the bounce and complaint metrics in this dashboard only included information for the past few days of email sent from your account, as opposed to an overall rate.

The new reputation dashboard provides overall bounce and complaint rates for your entire account. This enables you to more closely monitor the health of your account and adjust your email sending practices as needed.

Can’t I just calculate these values myself?

Because each Amazon SES account sends different volumes of email at different rates, we do not calculate bounce and complaint rates based on a fixed time period. Instead, we use a representative volume of email. This representative volume is the basis for the bounce and complaint rate calculations.

Why do we use representative volume in our calculations? Let’s imagine that you sent 1,000 emails one week, and 5 of them bounced. If we only considered a week of email sending, your metrics look good. Now imagine that the next week you only sent 5 emails, and one of them bounced. Suddenly, your bounce rate jumps from half a percent to 20%, and your account is automatically placed on probation. This example may be an extreme case, but it illustrates the reason that we don’t use fixed time periods when calculating bounce and complaint rates.

When you open the new reputation dashboard, you will see bounce and complaint rates calculated using the representative volume for your account. We automatically recalculate these rates every time you send email through Amazon SES.

What else can I do with these metrics?

The Bounce and Complaint Rate metrics in the reputation dashboard are automatically sent to Amazon CloudWatch. You can use CloudWatch to create dashboards that track your bounce and complaint rates over time, and to create alarms that send you notifications when these metrics cross certain thresholds. To learn more, see Creating Reputation Monitoring Alarms Using CloudWatch in the Amazon SES Developer Guide.

How can I see the reputation dashboard?

The reputation dashboard is now available to all Amazon SES users. To view the reputation dashboard, sign in to the Amazon SES console. On the left navigation menu, choose Reputation Dashboard. For more information, see Monitoring Your Sender Reputation in the Amazon SES Developer Guide.

We hope you find the information in the reputation dashboard to be useful in managing your email sending programs and campaigns. If you have any questions or comments, please leave a comment on this post, or let us know in the Amazon SES forum.

New – High-Resolution Custom Metrics and Alarms for Amazon CloudWatch

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-high-resolution-custom-metrics-and-alarms-for-amazon-cloudwatch/

Amazon CloudWatch has been an important part of AWS since early 2009! Launched as part of a three-pack that also included Auto Scaling and Elastic Load Balancing, CloudWatch has evolved into a very powerful monitoring service for AWS resources and the applications that you run on the AWS Cloud. CloudWatch custom metrics (launched way back in 2011) allow you to store business and application metrics in CloudWatch, view them in graphs, and initiate actions based on CloudWatch Alarms. Needless to say, we have made many enhancements to CloudWatch over the years! Some of the most recent include Extended Metrics Retention (and a User Interface Update), Dashboards, API/CloudFormation Support for Dashboards, and Alarms on Dashboards.

Originally, metrics were stored at five minute intervals; this was reduced to one minute (also known as Detailed Monitoring) in response to customer requests way back in 2010. This was a welcome change, but now it is time to do better. Our customers are streaming video, running flash sales, deploying code tens or hundreds of times per day, and running applications that scale in and out very quickly as conditions change. In all of these situations, a minute is simply too coarse of an interval. Important, transient spikes can be missed; disparate (yet related) events are difficult to correlate across time, and the MTTR (mean time to repair) when something breaks is too high.

New High-Resolution Metrics
Today we are adding support for high-resolution custom metrics, with plans to add support for AWS services over time. Your applications can now publish metrics to CloudWatch with 1-second resolution. You can watch the metrics scroll across your screen seconds after they are published and you can set up high-resolution CloudWatch Alarms that evaluate as frequently as every 10 seconds.

Imagine alarming when available memory gets low. This is often a transient condition that can be hard to catch with infrequent samples. With high-resolution metrics, you can see, detect (via an alarm), and act on it within seconds:

In this case the alarm on the right would not fire, and you would not know about the issue.

Publishing High-Resolution Metrics
You can publish high-resolution metrics in two different ways:

  • API – The PutMetricData function now accepts an optional StorageResolution parameter. Set this parameter to 1 to publish high-resolution metrics; omit it (or set it to 60) to publish at standard 1-minute resolution.
  • collectd plugin – The CloudWatch plugin for collectd has been updated to support collection and publication of high-resolution metrics. You will need to set the enable_high_definition_metrics parameter in the config file for the plugin.

CloudWatch metrics are rolled up over time; resolution effectively decreases as the metrics age. Here’s the schedule:

  • 1 second metrics are available for 3 hours.
  • 60 second metrics are available for 15 days.
  • 5 minute metrics are available for 63 days.
  • 1 hour metrics are available for 455 days (15 months).

When you call GetMetricStatistics you can specify a period of 1, 5, 10, 30 or any multiple of 60 seconds for high-resolution metrics. You can specify any multiple of 60 seconds for standard metrics.

A Quick Demo
I grabbed my nearest EC2 instance, installed the latest version of collectd and the Python plugin:

$ sudo yum install collectd collectd-python

Then I downloaded the setup script for the plugin, made it executable, and ran it:

$ wget https://raw.githubusercontent.com/awslabs/collectd-cloudwatch/master/src/setup.py
$ chmod a+x setup.py
$ sudo ./setup.py

I had already created a suitable IAM Role and added it to my instance; it was automatically detected during setup. I was asked to enable the high resolution metrics:

collectd started running and publishing metrics within seconds. I opened up the CloudWatch Console to take a look:

Then I zoomed in to see the metrics in detail:

I also created an alarm that will check the memory.percent.used metric at 10 second intervals. This will make it easier for me to detect situations where a lot of memory is being used for a short period of time:

Now Available
High-resolution custom metrics and alarms are available now in all Public AWS Regions, with support for AWS GovCloud (US) coming soon.

As was already the case, you can store 10 metrics at no charge every month; see the CloudWatch Pricing page for more information. Pricing for high-resolution metrics is identical to that for standard resolution metrics, with volume tiers that allow you to realize savings (on a per-metric) basis when you use more metrics. High-resolution alarms are priced at $0.30 per alarm per month.

New – Target Tracking Policies for EC2 Auto Scaling

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-target-tracking-policies-for-ec2-auto-scaling/

I recently told you about DynamoDB Auto Scaling and showed you how it uses multiple CloudWatch Alarms to automate capacity management for DynamoDB tables. Behind the scenes, this feature makes use of a more general Application Auto Scaling model that we plan to put to use across several different AWS services over time.

The new Auto Scaling model includes an important new feature that we call target tracking. When you create an Auto Scaling policy that makes use of target tracking, you choose a target value for a particular CloudWatch metric. Auto Scaling then turns the appropriate knob (so to speak) to drive the metric toward the target, while also adjusting the relevant CloudWatch Alarms. Specifying your desired target, in whatever metrics-driven units make sense for your application, is generally easier and more direct than setting up ranges and thresholds manually using the original step scaling policy type. However, you can use target tracking in conjunction with step scaling in order to implement an advanced scaling strategy. For example, you could use target tracking for scale-out operations and step scaling for scale-in.

Now for EC2
Today we are adding target tracking support to EC2 Auto Scaling. You can now create scaling policies that are driven by application load balancer request counts, CPU load, network traffic, or a custom metric (the Request Count per Target metric is new, and is also part of today’s launch):

These metrics share an important property: adding additional EC2 instances will (with no changes in overall load) drive the metric down, and vice versa.

To create an Auto Scaling Group that makes use of target tracking, you simply enter a name for the policy, choose a metric, and set the desired target value:

You have the option to disable the scale-in side of the policy. If you do this, you can scale-in manually or use a separate policy.

You can create target tracking policies using the AWS Management Console, AWS Command Line Interface (CLI), or the AWS SDKs.

Here are a couple of things to keep in mind as you look forward to using target tracking:

  • You can track more than one target in a single Auto Scaling Group as long as each one references a distinct metric. Scaling will always choose the policy that drives the highest capacity.
  • Scaling will not take place if the metric has insufficient data.
  • Auto Scaling compensates for rapid, transient fluctuations in the metrics, and strives to minimize corresponding fluctuations in capacity.
  • You can set up target tracking for a custom metric through the Auto Scaling API or the AWS Command Line Interface (CLI).
  • In most cases you should elect to scale on metrics that are published with 1-minute frequency (also known as detailed monitoring). Using 5-minute metrics as the basis for scaling will result in a slower response time.

Now Available
This new feature is available now and you can start using it today at no extra charge. To learn more, read about Target Tracking Scaling in the Auto Scaling User Guide.

Jeff;

New – API & CloudFormation Support for Amazon CloudWatch Dashboards

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-api-cloudformation-support-for-amazon-cloudwatch-dashboards/

We launched CloudWatch Dashboards a couple of years ago. In the post that I wrote for the launch, I showed you how to interactively create a dashboard that displayed chosen CloudWatch metrics in graphical form. After the launch, we added additional features including a full screen mode, a dark theme, control over the range of the Y axis, simplified renaming, persistent storage, and new visualization options.

New API & CLI
While console support is wonderful for interactive use, many customers have asked us to support programmatic creation and manipulation of dashboards and the widgets within. They would like to dynamically build and maintain dashboards, adding and removing widgets as the corresponding AWS resources are created and destroyed. Other customers are interested in setting up and maintaining a consistent set of dashboards across two or more AWS accounts.

I am happy to announce that API, CLI, and AWS CloudFormation support for CloudWatch Dashboards is available now and that you can start using it today!

There are four new API functions (and equivalent CLI commands):

ListDashboards / aws cloudwatch list-dashboards – Fetch a list of all dashboards within an account, or a subset that share a common prefix.

GetDashboard / aws cloudwatch get-dashboard – Fetch details for a single dashboard.

PutDashboard / aws cloudwatch put-dashboard – Create a new dashboard or update an existing one.

DeleteDashboards / aws cloudwatch delete-dashboards – Delete one or more dashboards.

Dashboard Concepts
I want to show you how to use these functions and commands. Before I dive in, I should review a couple of important dashboard concepts and attributes.

Global – Dashboards are part of an AWS account, and are not associated with a specific AWS Region. Each account can have up to 500 dashboards.

Named – Each dashboard has a name that is unique within the AWS account. Names can be up to 255 characters long.

Grid Model – Each dashboard is composed of a grid of cells. The grid is 24 cells across and as tall as necessary. Each widget on the dashboard is positioned at a particular set of grid coordinates, and has a size that spans an integral number of grid cells.

Widgets (Visualizations) – Each widget can display text or a set of CloudWatch metrics. Text is specified using Markdown; metrics can be displayed as single values, line charts, or stacked area charts. Each dashboard can have up to 100 widgets. Widgets that display metrics can also be associated with a CloudWatch Alarm.

Dashboards have a JSON representation that you can now see and edit from within the console. Simply click on the Action menu and choose View/edit source:

Here’s the source for my dashboard:

You can use this JSON as a starting point for your own applications. As you can see, there’s an entry in the widgets array for each widget on the dashboard; each entry describes one widget, starting with its type, position, and size.

Creating a Dashboard Using the API
Let’s say I want to create a dashboard that has a widget for each of my EC2 instances in a particular region. I’ll use Python and the AWS SDK for Python, and start as follows (excuse the amateur nature of my code):

import boto3
import json

cw  = boto3.client("cloudwatch")
ec2 = boto3.client("ec2")

x, y          = [0, 0]
width, height = [3, 3]
max_width     = 12
widgets       = []

Then I simply iterate over the instances, creating a widget dictionary for each one, and appending it to the widgets array:

instances = ec2.describe_instances()
for r in instances['Reservations']:
    for i in r['Instances']:

        widget = {'type'      : 'metric',
                  'x'         : x,
                  'y'         : y,
                  'height'    : height,
                  'width'     : width,
                  'properties': {'view'    : 'timeSeries',
                                 'stacked' : False,
                                 'metrics' : [['AWS/EC2', 'NetworkIn', 'InstanceId', i['InstanceId']],
                                              ['.',       'NetworkOut', '.',         '.']
                                             ],
                                 'period'  : 300,
                                 'stat'    : 'Average',
                                 'region'  : 'us-east-1',
                                 'title'   : i['InstanceId']
                                }
                 }

        widgets.append(widget)

I update the position (x and y) within the loop, and form a grid (if I don’t specify positions, the widgets will be laid out left to right, top to bottom):

        x += width
        if (x + width > max_width):
            x = 0
            y += height

After I have processed all of the instances, I create a JSON version of the widget array:

body   = {'widgets' : widgets}
body_j = json.dumps(body)

And I create or update my dashboard:

cw.put_dashboard(DashboardName = "EC2_Networking",
                 DashboardBody = body_j)

I run the code, and get the following dashboard:

The CloudWatch team recommends that dashboards created programmatically include a text widget indicating that the dashboard was generated automatically, along with a link to the source code or CloudFormation template that did the work. This will discourage users from making manual, out-of-band changers to the dashboards.

As I mentioned earlier, each metric widget can also be associated with a CloudWatch Alarm. You can create the alarms programmatically or by using a CloudFormation template such as the Sample CPU Utilization Alarm. If you decide to do this, the alarm threshold will be displayed in the widget. To learn more about this, read Tara Walker’s recent post, Amazon CloudWatch Launches Alarms on Dashboards.

Going one step further, I could use CloudWatch Events and a Lamba Function to track the creation and deletion of certain resources and update a dashboard in concert with the changes. To learn how to do this, read Keeping CloudWatch Dashboards up to Date Using AWS Lambda.

Accessing a Dashboard Using the CLI
I can also access and manipulate my dashboards from the command line. For example, I can generate a simple list:

$ aws cloudwatch list-dashboards --output table
----------------------------------------------
|               ListDashboards               |
+--------------------------------------------+
||             DashboardEntries             ||
|+-----------------+----------------+-------+|
||  DashboardName  | LastModified   | Size  ||
|+-----------------+----------------+-------+|
||  Disk-Metrics   |  1496405221.0  |  316  ||
||  EC2_Networking |  1498090434.0  |  2830 ||
||  Main-Metrics   |  1498085173.0  |  234  ||
|+-----------------+----------------+-------+|

And I can get rid of the Disk-Metrics dashboard:

$ aws cloudwatch delete-dashboards --dashboard-names Disk-Metrics

I can also retrieve the JSON that defines a dashboard:

Creating a Dashboard Using CloudFormation
Dashboards can also be specified in CloudFormation templates. Here’s a simple template in YAML (the DashboardBody is still specified in JSON):

Resources:
  MyDashboard:
    Type: "AWS::CloudWatch::Dashboard"
    Properties:
      DashboardName: SampleDashboard
      DashboardBody: '{"widgets":[{"type":"text","x":0,"y":0,"width":6,"height":6,"properties":{"markdown":"Hi there from CloudFormation"}}]}'

I place the template in a file and then create a stack using the console or the CLI:

$ aws cloudformation create-stack --stack-name MyDashboard --template-body file://dash.yaml
{
    "StackId": "arn:aws:cloudformation:us-east-1:xxxxxxxxxxxx:stack/MyDashboard/a2a3fb20-5708-11e7-8ffd-500c21311262"
}

Here’s the dashboard:

Available Now
This feature is available now and you can start using it today. You can create 3 dashboards with up to 50 metrics per dashboard at no charge; additional dashboards are priced at $3 per month, as listed on the CloudWatch Pricing page. You can make up to 1 million calls to the new API functions each month at no charge; beyond that you pay $.01 for every 1,000 calls.

Jeff;

AWS Bill Simplification – Consolidated CloudWatch Charges

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-bill-simplification-consolidated-cloudwatch-charges/

The bill that you receive for your use of AWS in July will include a change in the way that Amazon CloudWatch charges are presented. The CloudWatch team made this change in order to make your bill simpler and easier to understand.

Consolidating Charges
In the past, charges for your usage of CloudWatch were split between two sections of your bill. For historical reasons, the charges for CloudWatch Alarms, CloudWatch Metrics, and calls to the CloudWatch API were reported in the Elastic Compute Cloud (EC2) detail section, while charges for CloudWatch Logs and CloudWatch Dashboards were reported in the CloudWatch detail section, like this:

We have received feedback that splitting the charges across two sections of the bill made it difficult to locate and understand the entire set of monitoring charges. In order to address this issue, we are moving the charges that were formerly listed in the Elastic Compute Cloud (EC2) detail section to the CloudWatch detail section. We are making the same change to the detailed billing report, moving the affected charges from the AmazonEC2 product code to the AmazonCloudWatch product code and changing to the AmazonCloudWatch product name. This change does not affect your overall bill; it simply consolidates all of the charges for the use of CloudWatch in one section.

Billing Metric
The CloudWatch billing metric named Estimated Charges can be viewed as a Total Estimated Charge, or broken down By Service:

The total will not change. However, as noted above, the charges that formerly had AmazonEC2 as the ServiceName dimension will now have it set to AmazonCloudWatch:

You may need to adjust thresholds on your billing alarms as a result:

Once again, your total AWS bill will not change. You will begin to see the consolidated charges for CloudWatch in your AWS bill for July 2017.

Jeff;

 

Protect Web Sites & Services Using Rate-Based Rules for AWS WAF

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/protect-web-sites-services-using-rate-based-rules-for-aws-waf/

AWS WAF (Web Application Firewall) helps to protect your application from many different types of application-layer attacks that involve requests that are malicious or malformed. As I showed you when I first wrote about this service (New – AWS WAF), you can define rules that match cross-site scripting, IP address, SQL injection, size, or content constraints:

When incoming requests match rules, actions are invoked. Actions can either allow, block, or simply count matches.

The existing rule model is powerful and gives you the ability to detect and respond to many different types of attacks. It does not, however, allow you to respond to attacks that simply consist of a large number of otherwise valid requests from a particular IP address. These requests might be a web-layer DDoS attack, a brute-force login attempt, or even a partner integration gone awry.

New Rate-Based Rules
Today we are adding Rate-based Rules to WAF, giving you control of when IP addresses are added to and removed from a blacklist, along with the flexibility to handle exceptions and special cases:

Blacklisting IP Addresses – You can blacklist IP addresses that make requests at a rate that exceeds a configured threshold rate.

IP Address Tracking– You can see which IP addresses are currently blacklisted.

IP Address Removal – IP addresses that have been blacklisted are automatically removed when they no longer make requests at a rate above the configured threshold.

IP Address Exemption – You can exempt certain IP addresses from blacklisting by using an IP address whitelist inside of the a rate-based rule. For example, you might want to allow trusted partners to access your site at a higher rate.

Monitoring & Alarming – You can watch and alarm on CloudWatch metrics that are published for each rule.

You can combine new Rate-based Rules with WAF Conditions to implement sophisticated rate-limiting strategies. For example, you could use a Rate-based Rule and a WAF Condition that matches your login pages. This would allow you to impose a modest threshold on your login pages (to avoid brute-force password attacks) and allow a more generous one on your marketing or system status pages.

Thresholds are defined in terms of the number of incoming requests from a single IP address within a 5 minute period. Once this threshold is breached, additional requests from the IP address are blocked until the request rate falls below the threshold.

Using Rate-Based Rules
Here’s how you would define a Rate-based Rule that protects the /login portion of your site. Start by defining a WAF condition that matches the desired string in the URI of the page:

Then use this condition to define a Rate-based Rule (the rate limit is expressed in terms of requests within a 5 minute interval, but the blacklisting goes in to effect as soon as the limit is breached):

With the condition and the rule in place, create a Web ACL (ProtectLoginACL) to bring it all together and to attach it to the AWS resource (a CloudFront distribution in this case):

Then attach the rule (ProtectLogin) to the Web ACL:

The resource is now protected in accord with the rule and the web ACL. You can monitor the associated CloudWatch metrics (ProtectLogin and ProtectLoginACL in this case). You could even create CloudWatch Alarms and use them to fire Lambda functions when a protection threshold is breached. The code could examine the offending IP address and make a complex, business-driven decision, perhaps adding a whitelisting rule that gives an extra-generous allowance to a trusted partner or to a user with a special payment plan.

Available Now
The new, Rate-based Rules are available now and you can start using them today! Rate-based rules are priced the same as Regular rules; see the WAF Pricing page for more info.

Jeff;

Building Loosely Coupled, Scalable, C# Applications with Amazon SQS and Amazon SNS

Post Syndicated from Tara Van Unen original https://aws.amazon.com/blogs/compute/building-loosely-coupled-scalable-c-applications-with-amazon-sqs-and-amazon-sns/

 
Stephen Liedig, Solutions Architect

 

One of the many challenges professional software architects and developers face is how to make cloud-native applications scalable, fault-tolerant, and highly available.

Fundamental to your project success is understanding the importance of making systems highly cohesive and loosely coupled. That means considering the multi-dimensional facets of system coupling to support the distributed nature of the applications that you are building for the cloud.

By that, I mean addressing not only the application-level coupling (managing incoming and outgoing dependencies), but also considering the impacts of of platform, spatial, and temporal coupling of your systems. Platform coupling relates to the interoperability, or lack thereof, of heterogeneous systems components. Spatial coupling deals with managing components at a network topology level or protocol level. Temporal, or runtime coupling, refers to the ability of a component within your system to do any kind of meaningful work while it is performing a synchronous, blocking operation.

The AWS messaging services, Amazon SQS and Amazon SNS, help you deal with these forms of coupling by providing mechanisms for:

  • Reliable, durable, and fault-tolerant delivery of messages between application components
  • Logical decomposition of systems and increased autonomy of components
  • Creating unidirectional, non-blocking operations, temporarily decoupling system components at runtime
  • Decreasing the dependencies that components have on each other through standard communication and network channels

Following on the recent topic, Building Scalable Applications and Microservices: Adding Messaging to Your Toolbox, in this post, I look at some of the ways you can introduce SQS and SNS into your architectures to decouple your components, and show how you can implement them using C#.

Walkthrough

To illustrate some of these concepts, consider a web application that processes customer orders. As good architects and developers, you have followed best practices and made your application scalable and highly available. Your solution included implementing load balancing, dynamic scaling across multiple Availability Zones, and persisting orders in a Multi-AZ Amazon RDS database instance, as in the following diagram.


In this example, the application is responsible for handling and persisting the order data, as well as dealing with increases in traffic for popular items.

One potential point of vulnerability in the order processing workflow is in saving the order in the database. The business expects that every order has been persisted into the database. However, any potential deadlock, race condition, or network issue could cause the persistence of the order to fail. Then, the order is lost with no recourse to restore the order.

With good logging capability, you may be able to identify when an error occurred and which customer’s order failed. This wouldn’t allow you to “restore” the transaction, and by that stage, your customer is no longer your customer.

As illustrated in the following diagram, introducing an SQS queue helps improve your ordering application. Using the queue isolates the processing logic into its own component and runs it in a separate process from the web application. This, in turn, allows the system to be more resilient to spikes in traffic, while allowing work to be performed only as fast as necessary in order to manage costs.


In addition, you now have a mechanism for persisting orders as messages (with the queue acting as a temporary database), and have moved the scope of your transaction with your database further down the stack. In the event of an application exception or transaction failure, this ensures that the order processing can be retired or redirected to the Amazon SQS Dead Letter Queue (DLQ), for re-processing at a later stage. (See the recent post, Using Amazon SQS Dead-Letter Queues to Control Message Failure, for more information on dead-letter queues.)

Scaling the order processing nodes

This change allows you now to scale the web application frontend independently from the processing nodes. The frontend application can continue to scale based on metrics such as CPU usage, or the number of requests hitting the load balancer. Processing nodes can scale based on the number of orders in the queue. Here is an example of scale-in and scale-out alarms that you would associate with the scaling policy.

Scale-out Alarm

aws cloudwatch put-metric-alarm --alarm-name AddCapacityToCustomerOrderQueue --metric-name ApproximateNumberOfMessagesVisible --namespace "AWS/SQS" 
--statistic Average --period 300 --threshold 3 --comparison-operator GreaterThanOrEqualToThreshold --dimensions Name=QueueName,Value=customer-orders
--evaluation-periods 2 --alarm-actions <arn of the scale-out autoscaling policy>

Scale-in Alarm

aws cloudwatch put-metric-alarm --alarm-name RemoveCapacityFromCustomerOrderQueue --metric-name ApproximateNumberOfMessagesVisible --namespace "AWS/SQS" 
 --statistic Average --period 300 --threshold 1 --comparison-operator LessThanOrEqualToThreshold --dimensions Name=QueueName,Value=customer-orders
 --evaluation-periods 2 --alarm-actions <arn of the scale-in autoscaling policy>

In the above example, use the ApproximateNumberOfMessagesVisible metric to discover the queue length and drive the scaling policy of the Auto Scaling group. Another useful metric is ApproximateAgeOfOldestMessage, when applications have time-sensitive messages and developers need to ensure that messages are processed within a specific time period.

Scaling the order processing implementation

On top of scaling at an infrastructure level using Auto Scaling, make sure to take advantage of the processing power of your Amazon EC2 instances by using as many of the available threads as possible. There are several ways to implement this. In this post, we build a Windows service that uses the BackgroundWorker class to process the messages from the queue.

Here’s a closer look at the implementation. In the first section of the consuming application, use a loop to continually poll the queue for new messages, and construct a ReceiveMessageRequest variable.

public static void PollQueue()
{
    while (_running)
    {
        Task<ReceiveMessageResponse> receiveMessageResponse;

        // Pull messages off the queue
        using (var sqs = new AmazonSQSClient())
        {
            const int maxMessages = 10;  // 1-10

            //Receiving a message
            var receiveMessageRequest = new ReceiveMessageRequest
            {
                // Get URL from Configuration
                QueueUrl = _queueUrl, 
                // The maximum number of messages to return. 
                // Fewer messages might be returned. 
                MaxNumberOfMessages = maxMessages, 
                // A list of attributes that need to be returned with message.
                AttributeNames = new List<string> { "All" },
                // Enable long polling. 
                // Time to wait for message to arrive on queue.
                WaitTimeSeconds = 5 
            };

            receiveMessageResponse = sqs.ReceiveMessageAsync(receiveMessageRequest);
        }

The WaitTimeSeconds property of the ReceiveMessageRequest specifies the duration (in seconds) that the call waits for a message to arrive in the queue before returning a response to the calling application. There are a few benefits to using long polling:

  • It reduces the number of empty responses by allowing SQS to wait until a message is available in the queue before sending a response.
  • It eliminates false empty responses by querying all (rather than a limited number) of the servers.
  • It returns messages as soon any message becomes available.

For more information, see Amazon SQS Long Polling.

After you have returned messages from the queue, you can start to process them by looping through each message in the response and invoking a new BackgroundWorker thread.

// Process messages
if (receiveMessageResponse.Result.Messages != null)
{
    foreach (var message in receiveMessageResponse.Result.Messages)
    {
        Console.WriteLine("Received SQS message, starting worker thread");

        // Create background worker to process message
        BackgroundWorker worker = new BackgroundWorker();
        worker.DoWork += (obj, e) => ProcessMessage(message);
        worker.RunWorkerAsync();
    }
}
else
{
    Console.WriteLine("No messages on queue");
}

The event handler, ProcessMessage, is where you implement business logic for processing orders. It is important to have a good understanding of how long a typical transaction takes so you can set a message VisibilityTimeout that is long enough to complete your operation. If order processing takes longer than the specified timeout period, the message becomes visible on the queue. Other nodes may pick it and process the same order twice, leading to unintended consequences.

Handling Duplicate Messages

In order to manage duplicate messages, seek to make your processing application idempotent. In mathematics, idempotent describes a function that produces the same result if it is applied to itself:

f(x) = f(f(x))

No matter how many times you process the same message, the end result is the same (definition from Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions, Hohpe and Wolf, 2004).

There are several strategies you could apply to achieve this:

  • Create messages that have inherent idempotent characteristics. That is, they are non-transactional in nature and are unique at a specified point in time. Rather than saying “place new order for Customer A,” which adds a duplicate order to the customer, use “place order <orderid> on <timestamp> for Customer A,” which creates a single order no matter how often it is persisted.
  • Deliver your messages via an Amazon SQS FIFO queue, which provides the benefits of message sequencing, but also mechanisms for content-based deduplication. You can deduplicate using the MessageDeduplicationId property on the SendMessage request or by enabling content-based deduplication on the queue, which generates a hash for MessageDeduplicationId, based on the content of the message, not the attributes.
var sendMessageRequest = new SendMessageRequest
{
    QueueUrl = _queueUrl,
    MessageBody = JsonConvert.SerializeObject(order),
    MessageGroupId = Guid.NewGuid().ToString("N"),
    MessageDeduplicationId = Guid.NewGuid().ToString("N")
};
  • If using SQS FIFO queues is not an option, keep a message log of all messages attributes processed for a specified period of time, as an alternative to message deduplication on the receiving end. Verifying the existence of the message in the log before processing the message adds additional computational overhead to your processing. This can be minimized through low latency persistence solutions such as Amazon DynamoDB. Bear in mind that this solution is dependent on the successful, distributed transaction of the message and the message log.

Handling exceptions

Because of the distributed nature of SQS queues, it does not automatically delete the message. Therefore, you must explicitly delete the message from the queue after processing it, using the message ReceiptHandle property (see the following code example).

However, if at any stage you have an exception, avoid handling it as you normally would. The intention is to make sure that the message ends back on the queue, so that you can gracefully deal with intermittent failures. Instead, log the exception to capture diagnostic information, and swallow it.

By not explicitly deleting the message from the queue, you can take advantage of the VisibilityTimeout behavior described earlier. Gracefully handle the message processing failure and make the unprocessed message available to other nodes to process.

In the event that subsequent retries fail, SQS automatically moves the message to the configured DLQ after the configured number of receives has been reached. You can further investigate why the order process failed. Most importantly, the order has not been lost, and your customer is still your customer.

private static void ProcessMessage(Message message)
{
    using (var sqs = new AmazonSQSClient())
    {
        try
        {
            Console.WriteLine("Processing message id: {0}", message.MessageId);

            // Implement messaging processing here
            // Ensure no downstream resource contention (parallel processing)
            // <your order processing logic in here…>
            Console.WriteLine("{0} Thread {1}: {2}", DateTime.Now.ToString("s"), Thread.CurrentThread.ManagedThreadId, message.MessageId);
            
            // Delete the message off the queue. 
            // Receipt handle is the identifier you must provide 
            // when deleting the message.
            var deleteRequest = new DeleteMessageRequest(_queueName, message.ReceiptHandle);
            sqs.DeleteMessageAsync(deleteRequest);
            Console.WriteLine("Processed message id: {0}", message.MessageId);

        }
        catch (Exception ex)
        {
            // Do nothing.
            // Swallow exception, message will return to the queue when 
            // visibility timeout has been exceeded.
            Console.WriteLine("Could not process message due to error. Exception: {0}", ex.Message);
        }
    }
}

Using SQS to adapt to changing business requirements

One of the benefits of introducing a message queue is that you can accommodate new business requirements without dramatically affecting your application.

If, for example, the business decided that all orders placed over $5000 are to be handled as a priority, you could introduce a new “priority order” queue. The way the orders are processed does not change. The only significant change to the processing application is to ensure that messages from the “priority order” queue are processed before the “standard order” queue.

The following diagram shows how this logic could be isolated in an “order dispatcher,” whose only purpose is to route order messages to the appropriate queue based on whether the order exceeds $5000. Nothing on the web application or the processing nodes changes other than the target queue to which the order is sent. The rates at which orders are processed can be achieved by modifying the poll rates and scalability settings that I have already discussed.

Extending the design pattern with Amazon SNS

Amazon SNS supports reliable publish-subscribe (pub-sub) scenarios and push notifications to known endpoints across a wide variety of protocols. It eliminates the need to periodically check or poll for new information and updates. SNS supports:

  • Reliable storage of messages for immediate or delayed processing
  • Publish / subscribe – direct, broadcast, targeted “push” messaging
  • Multiple subscriber protocols
  • Amazon SQS, HTTP, HTTPS, email, SMS, mobile push, AWS Lambda

With these capabilities, you can provide parallel asynchronous processing of orders in the system and extend it to support any number of different business use cases without affecting the production environment. This is commonly referred to as a “fanout” scenario.

Rather than your web application pushing orders to a queue for processing, send a notification via SNS. The SNS messages are sent to a topic and then replicated and pushed to multiple SQS queues and Lambda functions for processing.

As the diagram above shows, you have the development team consuming “live” data as they work on the next version of the processing application, or potentially using the messages to troubleshoot issues in production.

Marketing is consuming all order information, via a Lambda function that has subscribed to the SNS topic, inserting the records into an Amazon Redshift warehouse for analysis.

All of this, of course, is happening without affecting your order processing application.

Summary

While I haven’t dived deep into the specifics of each service, I have discussed how these services can be applied at an architectural level to build loosely coupled systems that facilitate multiple business use cases. I’ve also shown you how to use infrastructure and application-level scaling techniques, so you can get the most out of your EC2 instances.

One of the many benefits of using these managed services is how quickly and easily you can implement powerful messaging capabilities in your systems, and lower the capital and operational costs of managing your own messaging middleware.

Using Amazon SQS and Amazon SNS together can provide you with a powerful mechanism for decoupling application components. This should be part of design considerations as you architect for the cloud.

For more information, see the Amazon SQS Developer Guide and Amazon SNS Developer Guide. You’ll find tutorials on all the concepts covered in this post, and more. To can get started using the AWS console or SDK of your choice visit:

Happy messaging!

New – Auto Scaling for Amazon DynamoDB

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-auto-scaling-for-amazon-dynamodb/

Amazon DynamoDB has more than one hundred thousand customers, spanning a wide range of industries and use cases. These customers depend on DynamoDB’s consistent performance at any scale and presence in 16 geographic regions around the world. A recent trend we’ve been observing is customers using DynamoDB to power their serverless applications. This is a good match: with DynamoDB, you don’t have to think about things like provisioning servers, performing OS and database software patching, or configuring replication across availability zones to ensure high availability – you can simply create tables and start adding data, and let DynamoDB handle the rest.

DynamoDB provides a provisioned capacity model that lets you set the amount of read and write capacity required by your applications. While this frees you from thinking about servers and enables you to change provisioning for your table with a simple API call or button click in the AWS Management Console, customers have asked us how we can make managing capacity for DynamoDB even easier.

Today we are introducing Auto Scaling for DynamoDB to help automate capacity management for your tables and global secondary indexes. You simply specify the desired target utilization and provide upper and lower bounds for read and write capacity. DynamoDB will then monitor throughput consumption using Amazon CloudWatch alarms and then will adjust provisioned capacity up or down as needed. Auto Scaling will be on by default for all new tables and indexes, and you can also configure it for existing ones.

Even if you’re not around, DynamoDB Auto Scaling will be monitoring your tables and indexes to automatically adjust throughput in response to changes in application traffic. This can make it easier to administer your DynamoDB data, help you maximize availability for your applications, and help you reduce your DynamoDB costs.

Let’s see how it works…

Using Auto Scaling
The DynamoDB Console now proposes a comfortable set of default parameters when you create a new table. You can accept them as-is or you can uncheck Use default settings and enter your own parameters:

Here’s how you enter your own parameters:

Target utilization is expressed in terms of the ratio of consumed capacity to provisioned capacity. The parameters above would allow for sufficient headroom to allow consumed capacity to double due to a burst in read or write requests (read Capacity Unit Calculations to learn more about the relationship between DynamoDB read and write operations and provisioned capacity). Changes in provisioned capacity take place in the background.

Auto Scaling in Action
In order to see this important new feature in action, I followed the directions in the Getting Started Guide. I launched a fresh EC2 instance, installed (sudo pip install boto3) and configured (aws configure) the AWS SDK for Python. Then I used the code in the Python and DynamoDB section to create and populate a table with some data, and manually configured the table for 5 units each of read and write capacity.

I took a quick break in order to have clean, straight lines for the CloudWatch metrics so that I could show the effect of Auto Scaling. Here’s what the metrics look like before I started to apply a load:

I modified the code in Step 3 to continually issue queries for random years in the range of 1920 to 2007, ran a single copy of the code, and checked the read metrics a minute or two later:

The consumed capacity is higher than the provisioned capacity, resulting in a large number of throttled reads. Time for Auto Scaling!

I returned to the console and clicked on the Capacity tab for my table. Then I clicked on Read capacity, accepted the default values, and clicked on Save:

DynamoDB created a new IAM role (DynamoDBAutoscaleRole) and a pair of CloudWatch alarms to manage the Auto Scaling of read capacity:

DynamoDB Auto Scaling will manage the thresholds for the alarms, moving them up and down as part of the scaling process. The first alarm was triggered and the table state changed to Updating while additional read capacity was provisioned:

The change was visible in the read metrics within minutes:

I started a couple of additional copies of my modified query script and watched as additional capacity was provisioned, as indicated by the red line:

I killed all of the scripts and turned my attention to other things while waiting for the scale-down alarm to trigger. Here’s what I saw when I came back:

The next morning I checked my Scaling activities and saw that the alarm had triggered several more times overnight:

This was also visible in the metrics:

Until now, you would prepare for this situation by setting your read capacity well about your expected usage, and pay for the excess capacity (the space between the blue line and the red line). Or, you might set it too low, forget to monitor it, and run out of capacity when traffic picked up. With Auto Scaling you can get the best of both worlds: an automatic response when an increase in demand suggests that more capacity is needed, and another automated response when the capacity is no longer needed.

Things to Know
DynamoDB Auto Scaling is designed to accommodate request rates that vary in a somewhat predictable, generally periodic fashion. If you need to accommodate unpredictable bursts of read activity, you should use Auto Scaling in combination with DAX (read Amazon DynamoDB Accelerator (DAX) – In-Memory Caching for Read-Intensive Workloads to learn more). Also, the AWS SDKs will detect throttled read and write requests and retry them after a suitable delay.

I mentioned the DynamoDBAutoscaleRole earlier. This role provides Auto Scaling with the privileges that it needs to have in order for it to be able to scale your tables and indexes up and down. To learn more about this role and the permissions that it uses, read Grant User Permissions for DynamoDB Auto Scaling.

Auto Scaling has complete CLI and API support, including the ability to enable and disable the Auto Scaling policies. If you have some predictable, time-bound spikes in traffic, you can programmatically disable an Auto Scaling policy, provision higher throughput for a set period of time, and then enable Auto Scaling again later.

As noted on the Limits in DynamoDB page, you can increase provisioned capacity as often as you would like and as high as you need (subject to per-account limits that we can increase on request). You can decrease capacity up to nine times per day for each table or global secondary index.

You pay for the capacity that you provision, at the regular DynamoDB prices. You can also purchase DynamoDB Reserved Capacity to further savings.

Available Now
This feature is available now in all regions and you can start using it today!

Jeff;

Build a Serverless Architecture to Analyze Amazon CloudFront Access Logs Using AWS Lambda, Amazon Athena, and Amazon Kinesis Analytics

Post Syndicated from Rajeev Srinivasan original https://aws.amazon.com/blogs/big-data/build-a-serverless-architecture-to-analyze-amazon-cloudfront-access-logs-using-aws-lambda-amazon-athena-and-amazon-kinesis-analytics/

Nowadays, it’s common for a web server to be fronted by a global content delivery service, like Amazon CloudFront. This type of front end accelerates delivery of websites, APIs, media content, and other web assets to provide a better experience to users across the globe.

The insights gained by analysis of Amazon CloudFront access logs helps improve website availability through bot detection and mitigation, optimizing web content based on the devices and browser used to view your webpages, reducing perceived latency by caching of popular object closer to its viewer, and so on. This results in a significant improvement in the overall perceived experience for the user.

This blog post provides a way to build a serverless architecture to generate some of these insights. To do so, we analyze Amazon CloudFront access logs both at rest and in transit through the stream. This serverless architecture uses Amazon Athena to analyze large volumes of CloudFront access logs (on the scale of terabytes per day), and Amazon Kinesis Analytics for streaming analysis.

The analytic queries in this blog post focus on three common use cases:

  1. Detection of common bots using the user agent string
  2. Calculation of current bandwidth usage per Amazon CloudFront distribution per edge location
  3. Determination of the current top 50 viewers

However, you can easily extend the architecture described to power dashboards for monitoring, reporting, and trigger alarms based on deeper insights gained by processing and analyzing the logs. Some examples are dashboards for cache performance, usage and viewer patterns, and so on.

Following we show a diagram of this architecture.

Prerequisites

Before you set up this architecture, install the AWS Command Line Interface (AWS CLI) tool on your local machine, if you don’t have it already.

Setup summary

The following steps are involved in setting up the serverless architecture on the AWS platform:

  1. Create an Amazon S3 bucket for your Amazon CloudFront access logs to be delivered to and stored in.
  2. Create a second Amazon S3 bucket to receive processed logs and store the partitioned data for interactive analysis.
  3. Create an Amazon Kinesis Firehose delivery stream to batch, compress, and deliver the preprocessed logs for analysis.
  4. Create an AWS Lambda function to preprocess the logs for analysis.
  5. Configure Amazon S3 event notification on the CloudFront access logs bucket, which contains the raw logs, to trigger the Lambda preprocessing function.
  6. Create an Amazon DynamoDB table to look up partition details, such as partition specification and partition location.
  7. Create an Amazon Athena table for interactive analysis.
  8. Create a second AWS Lambda function to add new partitions to the Athena table based on the log delivered to the processed logs bucket.
  9. Configure Amazon S3 event notification on the processed logs bucket to trigger the Lambda partitioning function.
  10. Configure Amazon Kinesis Analytics application for analysis of the logs directly from the stream.

ETL and preprocessing

In this section, we parse the CloudFront access logs as they are delivered, which occurs multiple times in an hour. We filter out commented records and use the user agent string to decipher the browser name, the name of the operating system, and whether the request has been made by a bot. For more details on how to decipher the preceding information based on the user agent string, see user-agents 1.1.0 in the Python documentation.

We use the Lambda preprocessing function to perform these tasks on individual rows of the access log. On successful completion, the rows are pushed to an Amazon Kinesis Firehose delivery stream to be persistently stored in an Amazon S3 bucket, the processed logs bucket.

To create a Firehose delivery stream with a new or existing S3 bucket as the destination, follow the steps described in Create a Firehose Delivery Stream to Amazon S3 in the S3 documentation. Keep most of the default settings, but select an AWS Identity and Access Management (IAM) role that has write access to your S3 bucket and specify GZIP compression. Name the delivery stream CloudFrontLogsToS3.

Another pre-requisite for this setup is to create an IAM role that provides the necessary permissions our AWS Lambda function to get the data from S3, process it, and deliver it to the CloudFrontLogsToS3 delivery stream.

Let’s use the AWS CLI to create the IAM role using the following the steps:

  1. Create the IAM policy (lambda-exec-policy) for the Lambda execution role to use.
  2. Create the Lambda execution role (lambda-cflogs-exec-role) and assign the service to use this role.
  3. Attach the policy created in step 1 to the Lambda execution role.

To download the policy document to your local machine, type the following command.

aws s3 cp s3://aws-bigdata-blog/artifacts/Serverless-CF-Analysis/preprocessiong-lambda/lambda-exec-policy.json  <path_on_your_local_machine>

To download the assume policy document to your local machine, type the following command.

aws s3 cp s3://aws-bigdata-blog/artifacts/Serverless-CF-Analysis/preprocessiong-lambda/assume-lambda-policy.json  <path_on_your_local_machine>

Following is the lambda-exec-policy.json file, which is the IAM policy used by the Lambda execution role.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CloudWatchAccess",
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "arn:aws:logs:*:*:*"
        },
        {
            "Sid": "S3Access",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::*"
            ]
        },
        {
            "Sid": "FirehoseAccess",
            "Effect": "Allow",
            "Action": [
                "firehose:ListDeliveryStreams",
                "firehose:PutRecord",
                "firehose:PutRecordBatch"
            ],
            "Resource": [
                "arn:aws:firehose:*:*:deliverystream/CloudFrontLogsToS3"
            ]
        }
    ]
}

To create the IAM policy used by Lambda execution role, type the following command.

aws iam create-policy --policy-name lambda-exec-policy --policy-document file://<path>/lambda-exec-policy.json

To create the AWS Lambda execution role and assign the service to use this role, type the following command.

aws iam create-role --role-name lambda-cflogs-exec-role --assume-role-policy-document file://<path>/assume-lambda-policy.json

Following is the assume-lambda-policy.json file, to grant Lambda permission to assume a role.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

To attach the policy (lambda-exec-policy) created to the AWS Lambda execution role (lambda-cflogs-exec-role), type the following command.

aws iam attach-role-policy --role-name lambda-cflogs-exec-role --policy-arn arn:aws:iam::<your-account-id>:policy/lambda-exec-policy

Now that we have created the CloudFrontLogsToS3 Firehose delivery stream and the lambda-cflogs-exec-role IAM role for Lambda, the next step is to create a Lambda preprocessing function.

This Lambda preprocessing function parses the CloudFront access logs delivered into the S3 bucket and performs a few transformation and mapping operations on the data. The Lambda function adds descriptive information, such as the browser and the operating system that were used to make this request based on the user agent string found in the logs. The Lambda function also adds information about the web distribution to support scenarios where CloudFront access logs are delivered to a centralized S3 bucket from multiple distributions. With the solution in this blog post, you can get insights across distributions and their edge locations.

Use the Lambda Management Console to create a new Lambda function with a Python 2.7 runtime and the s3-get-object-python blueprint. Open the console, and on the Configure triggers page, choose the name of the S3 bucket where the CloudFront access logs are delivered. Choose Put for Event type. For Prefix, type the name of the prefix, if any, for the folder where CloudFront access logs are delivered, for example cloudfront-logs/. To invoke Lambda to retrieve the logs from the S3 bucket as they are delivered, select Enable trigger.

Choose Next and provide a function name to identify this Lambda preprocessing function.

For Code entry type, choose Upload a file from Amazon S3. For S3 link URL, type https.amazonaws.com//preprocessing-lambda/pre-data.zip. In the section, also create an environment variable with the key KINESIS_FIREHOSE_STREAM and a value with the name of the Firehose delivery stream as CloudFrontLogsToS3.

Choose lambda-cflogs-exec-role as the IAM role for the Lambda function, and type prep-data.lambda_handler for the value for Handler.

Choose Next, and then choose Create Lambda.

Table creation in Amazon Athena

In this step, we will build the Athena table. Use the Athena console in the same region and create the table using the query editor.

CREATE EXTERNAL TABLE IF NOT EXISTS cf_logs (
  logdate date,
  logtime string,
  location string,
  bytes bigint,
  requestip string,
  method string,
  host string,
  uri string,
  status bigint,
  referrer string,
  useragent string,
  uriquery string,
  cookie string,
  resulttype string,
  requestid string,
  header string,
  csprotocol string,
  csbytes string,
  timetaken bigint,
  forwardedfor string,
  sslprotocol string,
  sslcipher string,
  responseresulttype string,
  protocolversion string,
  browserfamily string,
  osfamily string,
  isbot string,
  filename string,
  distribution string
)
PARTITIONED BY(year string, month string, day string, hour string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION 's3://<pre-processing-log-bucket>/prefix/';

Creation of the Athena partition

A popular website with millions of requests each day routed using Amazon CloudFront can generate a large volume of logs, on the order of a few terabytes a day. We strongly recommend that you partition your data to effectively restrict the amount of data scanned by each query. Partitioning significantly improves query performance and substantially reduces cost. The Lambda partitioning function adds the partition information to the Athena table for the data delivered to the preprocessed logs bucket.

Before delivering the preprocessed Amazon CloudFront logs file into the preprocessed logs bucket, Amazon Kinesis Firehose adds a UTC time prefix in the format YYYY/MM/DD/HH. This approach supports multilevel partitioning of the data by year, month, date, and hour. You can invoke the Lambda partitioning function every time a new processed Amazon CloudFront log is delivered to the preprocessed logs bucket. To do so, configure the Lambda partitioning function to be triggered by an S3 Put event.

For a website with millions of requests, a large number of preprocessed logs can be delivered multiple times in an hour—for example, at the interval of one each second. To avoid querying the Athena table for partition information every time a preprocessed log file is delivered, you can create an Amazon DynamoDB table for fast lookup.

Based on the year, month, data and hour in the prefix of the delivered log, the Lambda partitioning function checks if the partition specification exists in the Amazon DynamoDB table. If it doesn’t, it’s added to the table using an atomic operation, and then the Athena table is updated.

Type the following command to create the Amazon DynamoDB table.

aws dynamodb create-table --table-name athenapartitiondetails \
--attribute-definitions AttributeName=PartitionSpec,AttributeType=S \
--key-schema AttributeName=PartitionSpec,KeyType=HASH \
--provisioned-throughput ReadCapacityUnits=100,WriteCapacityUnits=100

Here the following is true:

  • PartitionSpec is the hash key and is a representation of the partition signature—for example, year=”2017”; month=”05”; day=”15”; hour=”10”.
  • Depending on the rate at which the processed log files are delivered to the processed log bucket, you might have to increase the ReadCapacityUnits and WriteCapacityUnits values, if these are throttled.

The other attributes besides PartitionSpec are the following:

  • PartitionPath – The S3 path associated with the partition.
  • PartitionType – The type of partition used (Hour, Month, Date, Year, or ALL). In this case, ALL is used.

Next step is to create the IAM role to provide permissions for the Lambda partitioning function. You require permissions to do the following:

  1. Look up and write partition information to DynamoDB.
  2. Alter the Athena table with new partition information.
  3. Perform Amazon CloudWatch logs operations.
  4. Perform Amazon S3 operations.

To download the policy document to your local machine, type following command.

aws s3 cp s3://aws-bigdata-blog/artifacts/Serverless-CF-Analysis/partitioning-lambda/lambda-partition-function-execution-policy.json  <path_on_your_local_machine>

To download the assume policy document to your local machine, type the following command.

aws s3 cp s3://aws-bigdata-blog/artifacts/Serverless-CF-Analysis/partitioning-lambda/assume-lambda-policy.json <path_on_your_local_machine>

To create the Lambda execution role and assign the service to use this role, type the following command.

aws iam create-role --role-name lambda-cflogs-exec-role --assume-role-policy-document file://<path>/assume-lambda-policy.json

Let’s use the AWS CLI to create the IAM role using the following three steps:

  1. Create the IAM policy(lambda-partition-exec-policy) used by the Lambda execution role.
  2. Create the Lambda execution role (lambda-partition-execution-role)and assign the service to use this role.
  3. Attach the policy created in step 1 to the Lambda execution role.

To create the IAM policy used by Lambda execution role, type the following command.

aws iam create-policy --policy-name lambda-partition-exec-policy --policy-document file://<path>/lambda-partition-function-execution-policy.json

To create the Lambda execution role and assign the service to use this role, type the following command.

aws iam create-role --role-name lambda-partition-execution-role --assume-role-policy-document file://<path>/assume-lambda-policy.json

To attach the policy (lambda-partition-exec-policy) created to the AWS Lambda execution role (lambda-partition-execution-role), type the following command.

aws iam attach-role-policy --role-name lambda-partition-execution-role --policy-arn arn:aws:iam::<your-account-id>:policy/lambda-partition-exec-policy

Following is the lambda-partition-function-execution-policy.json file, which is the IAM policy used by the Lambda execution role.

{
    "Version": "2012-10-17",
    "Statement": [
      	{
            	"Sid": "DDBTableAccess",
            	"Effect": "Allow",
            	"Action": "dynamodb:PutItem"
            	"Resource": "arn:aws:dynamodb*:*:table/athenapartitiondetails"
        	},
        	{
            	"Sid": "S3Access",
            	"Effect": "Allow",
            	"Action": [
                		"s3:GetBucketLocation",
                		"s3:GetObject",
                		"s3:ListBucket",
                		"s3:ListBucketMultipartUploads",
                		"s3:ListMultipartUploadParts",
                		"s3:AbortMultipartUpload",
                		"s3:PutObject"
            	],
          		"Resource":"arn:aws:s3:::*"
		},
	              {
		      "Sid": "AthenaAccess",
      		"Effect": "Allow",
      		"Action": [ "athena:*" ],
      		"Resource": [ "*" ]
	      },
        	{
            	"Sid": "CloudWatchLogsAccess",
            	"Effect": "Allow",
            	"Action": [
                		"logs:CreateLogGroup",
                		"logs:CreateLogStream",
             	   	"logs:PutLogEvents"
            	],
            	"Resource": "arn:aws:logs:*:*:*"
        	}
    ]
}

Download the .jar file containing the Java deployment package to your local machine.

aws s3 cp s3://aws-bigdata-blog/artifacts/Serverless-CF-Analysis/partitioning-lambda/aws-lambda-athena-1.0.0.jar <path_on_your_local_machine>

From the AWS Management Console, create a new Lambda function with Java8 as the runtime. Select the Blank Function blueprint.

On the Configure triggers page, choose the name of the S3 bucket where the preprocessed logs are delivered. Choose Put for the Event Type. For Prefix, type the name of the prefix folder, if any, where preprocessed logs are delivered by Firehose—for example, out/. For Suffix, type the name of the compression format that the Firehose stream (CloudFrontLogToS3) delivers the preprocessed logs —for example, gz. To invoke Lambda to retrieve the logs from the S3 bucket as they are delivered, select Enable Trigger.

Choose Next and provide a function name to identify this Lambda partitioning function.

Choose Java8 for Runtime for the AWS Lambda function. Choose Upload a .ZIP or .JAR file for the Code entry type, and choose Upload to upload the downloaded aws-lambda-athena-1.0.0.jar file.

Next, create the following environment variables for the Lambda function:

  • TABLE_NAME – The name of the Athena table (for example, cf_logs).
  • PARTITION_TYPE – The partition to be created based on the Athena table for the logs delivered to the sub folders in S3 bucket based on Year, Month, Date, Hour, or Set this to ALL to use Year, Month, Date, and Hour.
  • DDB_TABLE_NAME – The name of the DynamoDB table holding partition information (for example, athenapartitiondetails).
  • ATHENA_REGION – The current AWS Region for the Athena table to construct the JDBC connection string.
  • S3_STAGING_DIR – The Amazon S3 location where your query output is written. The JDBC driver asks Athena to read the results and provide rows of data back to the user (for example, s3://<bucketname>/<folder>/).

To configure the function handler and IAM, for Handler copy and paste the name of the handler: com.amazonaws.services.lambda.CreateAthenaPartitionsBasedOnS3EventWithDDB::handleRequest. Choose the existing IAM role, lambda-partition-execution-role.

Choose Next and then Create Lambda.

Interactive analysis using Amazon Athena

In this section, we analyze the historical data that’s been collected since we added the partitions to the Amazon Athena table for data delivered to the preprocessing logs bucket.

Scenario 1 is robot traffic by edge location.

SELECT COUNT(*) AS ct, requestip, location FROM cf_logs
WHERE isbot='True'
GROUP BY requestip, location
ORDER BY ct DESC;

Scenario 2 is total bytes transferred per distribution for each edge location for your website.

SELECT distribution, location, SUM(bytes) as totalBytes
FROM cf_logs
GROUP BY location, distribution;

Scenario 3 is the top 50 viewers of your website.

SELECT requestip, COUNT(*) AS ct  FROM cf_logs
GROUP BY requestip
ORDER BY ct DESC;

Streaming analysis using Amazon Kinesis Analytics

In this section, you deploy a stream processing application using Amazon Kinesis Analytics to analyze the preprocessed Amazon CloudFront log streams. This application analyzes directly from the Amazon Kinesis Stream as it is delivered to the preprocessing logs bucket. The stream queries in section are focused on gaining the following insights:

  • The IP address of the bot, identified by its Amazon CloudFront edge location, that is currently sending requests to your website. The query also includes the total bytes transferred as part of the response.
  • The total bytes served per distribution per population for your website.
  • The top 10 viewers of your website.

To download the firehose-access-policy.json file, type the following.

aws s3 cp s3://aws-bigdata-blog/artifacts/Serverless-CF-Analysis/kinesisanalytics/firehose-access-policy.json  <path_on_your_local_machine>

To download the kinesisanalytics-policy.json file, type the following.

aws s3 cp s3://aws-bigdata-blog/artifacts/Serverless-CF-Analysis/kinesisanalytics/assume-kinesisanalytics-policy.json <path_on_your_local_machine>

Before we create the Amazon Kinesis Analytics application, we need to create the IAM role to provide permission for the analytics application to access Amazon Kinesis Firehose stream.

Let’s use the AWS CLI to create the IAM role using the following three steps:

  1. Create the IAM policy(firehose-access-policy) for the Lambda execution role to use.
  2. Create the Lambda execution role (ka-execution-role) and assign the service to use this role.
  3. Attach the policy created in step 1 to the Lambda execution role.

Following is the firehose-access-policy.json file, which is the IAM policy used by Kinesis Analytics to read Firehose delivery stream.

{
    "Version": "2012-10-17",
    "Statement": [
      	{
    	"Sid": "AmazonFirehoseAccess",
    	"Effect": "Allow",
    	"Action": [
       	"firehose:DescribeDeliveryStream",
        	"firehose:Get*"
    	],
    	"Resource": [
              "arn:aws:firehose:*:*:deliverystream/CloudFrontLogsToS3”
       ]
     }
}

Following is the assume-kinesisanalytics-policy.json file, to grant Amazon Kinesis Analytics permissions to assume a role.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "kinesisanalytics.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

To create the IAM policy used by Analytics access role, type the following command.

aws iam create-policy --policy-name firehose-access-policy --policy-document file://<path>/firehose-access-policy.json

To create the Analytics execution role and assign the service to use this role, type the following command.

aws iam attach-role-policy --role-name ka-execution-role --policy-arn arn:aws:iam::<your-account-id>:policy/firehose-access-policy

To attach the policy (irehose-access-policy) created to the Analytics execution role (ka-execution-role), type the following command.

aws iam attach-role-policy --role-name ka-execution-role --policy-arn arn:aws:iam::<your-account-id>:policy/firehose-access-policy

To deploy the Analytics application, first download the configuration file and then modify ResourceARN and RoleARN for the Amazon Kinesis Firehose input configuration.

"KinesisFirehoseInput": { 
    "ResourceARN": "arn:aws:firehose:<region>:<account-id>:deliverystream/CloudFrontLogsToS3", 
    "RoleARN": "arn:aws:iam:<account-id>:role/ka-execution-role"
}

To download the Analytics application configuration file, type the following command.

aws s3 cp s3://aws-bigdata-blog/artifacts/Serverless-CF-Analysis//kinesisanalytics/kinesis-analytics-app-configuration.json <path_on_your_local_machine>

To deploy the application, type the following command.

aws kinesisanalytics create-application --application-name "cf-log-analysis" --cli-input-json file://<path>/kinesis-analytics-app-configuration.json

To start the application, type the following command.

aws kinesisanalytics start-application --application-name "cf-log-analysis" --input-configuration Id="1.1",InputStartingPositionConfiguration={InputStartingPosition="NOW"}

SQL queries using Amazon Kinesis Analytics

Scenario 1 is a query for detecting bots for sending request to your website detection for your website.

-- Create output stream, which can be used to send to a destination
CREATE OR REPLACE STREAM "BOT_DETECTION" (requesttime TIME, destribution VARCHAR(16), requestip VARCHAR(64), edgelocation VARCHAR(64), totalBytes BIGINT);
-- Create pump to insert into output 
CREATE OR REPLACE PUMP "BOT_DETECTION_PUMP" AS INSERT INTO "BOT_DETECTION"
--
SELECT STREAM 
    STEP("CF_LOG_STREAM_001"."request_time" BY INTERVAL '1' SECOND) as requesttime,
    "distribution_name" as distribution,
    "request_ip" as requestip, 
    "edge_location" as edgelocation, 
    SUM("bytes") as totalBytes
FROM "CF_LOG_STREAM_001"
WHERE "is_bot" = true
GROUP BY "request_ip", "edge_location", "distribution_name",
STEP("CF_LOG_STREAM_001"."request_time" BY INTERVAL '1' SECOND),
STEP("CF_LOG_STREAM_001".ROWTIME BY INTERVAL '1' SECOND);

Scenario 2 is a query for total bytes transferred per distribution for each edge location for your website.

-- Create output stream, which can be used to send to a destination
CREATE OR REPLACE STREAM "BYTES_TRANSFFERED" (requesttime TIME, destribution VARCHAR(16), edgelocation VARCHAR(64), totalBytes BIGINT);
-- Create pump to insert into output 
CREATE OR REPLACE PUMP "BYTES_TRANSFFERED_PUMP" AS INSERT INTO "BYTES_TRANSFFERED"
-- Bytes Transffered per second per web destribution by edge location
SELECT STREAM 
    STEP("CF_LOG_STREAM_001"."request_time" BY INTERVAL '1' SECOND) as requesttime,
    "distribution_name" as distribution,
    "edge_location" as edgelocation, 
    SUM("bytes") as totalBytes
FROM "CF_LOG_STREAM_001"
GROUP BY "distribution_name", "edge_location", "request_date",
STEP("CF_LOG_STREAM_001"."request_time" BY INTERVAL '1' SECOND),
STEP("CF_LOG_STREAM_001".ROWTIME BY INTERVAL '1' SECOND);

Scenario 3 is a query for the top 50 viewers for your website.

-- Create output stream, which can be used to send to a destination
CREATE OR REPLACE STREAM "TOP_TALKERS" (requestip VARCHAR(64), requestcount DOUBLE);
-- Create pump to insert into output 
CREATE OR REPLACE PUMP "TOP_TALKERS_PUMP" AS INSERT INTO "TOP_TALKERS"
-- Top Ten Talker
SELECT STREAM ITEM as requestip, ITEM_COUNT as requestcount FROM TABLE(TOP_K_ITEMS_TUMBLING(
  CURSOR(SELECT STREAM * FROM "CF_LOG_STREAM_001"),
  'request_ip', -- name of column in single quotes
  50, -- number of top items
  60 -- tumbling window size in seconds
  )
);

Conclusion

Following the steps in this blog post, you just built an end-to-end serverless architecture to analyze Amazon CloudFront access logs. You analyzed these both in interactive and streaming mode, using Amazon Athena and Amazon Kinesis Analytics respectively.

By creating a partition in Athena for the logs delivered to a centralized bucket, this architecture is optimized for performance and cost when analyzing large volumes of logs for popular websites that receive millions of requests. Here, we have focused on just three common use cases for analysis, sharing the analytic queries as part of the post. However, you can extend this architecture to gain deeper insights and generate usage reports to reduce latency and increase availability. This way, you can provide a better experience on your websites fronted with Amazon CloudFront.

In this blog post, we focused on building serverless architecture to analyze Amazon CloudFront access logs. Our plan is to extend the solution to provide rich visualization as part of our next blog post.


About the Authors

Rajeev Srinivasan is a Senior Solution Architect for AWS. He works very close with our customers to provide big data and NoSQL solution leveraging the AWS platform and enjoys coding . In his spare time he enjoys riding his motorcycle and reading books.

 

Sai Sriparasa is a consultant with AWS Professional Services. He works with our customers to provide strategic and tactical big data solutions with an emphasis on automation, operations & security on AWS. In his spare time, he follows sports and current affairs.

 

 


Related

Analyzing VPC Flow Logs with Amazon Kinesis Firehose, Amazon Athena, and Amazon QuickSight

Amazon DynamoDB Accelerator (DAX) – In-Memory Caching for Read-Intensive Workloads

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-dynamodb-accelerator-dax-in-memory-caching-for-read-intensive-workloads/

I’m fairly sure that you already know about Amazon DynamoDB. As you probably know, it is a managed NoSQL database that scales to accommodate as much table space, read capacity, and write capacity as you need. With response times measured in single-digit milliseconds, our customers are using DynamoDB for many types of applications including adtech, IoT, gaming, media, online learning, travel, e-commerce, and finance. Some of these customers store more than 100 terabytes in a single DynamoDB table and make millions of read or write requests per second. The Amazon retail site relies on DynamoDB and uses it to withstand the traffic surges associated with brief, high-intensity events such as Black Friday, Cyber Monday, and Prime Day.

While DynamoDB’s ability to deliver fast, consistent performance benefits just about any application and workload, there’s always room to do even better. The business value of some workloads (gaming and adtech come to mind, but there are many others) is driven by low-latency, high-performance database reads. The ability to pull data from DynamoDB as quickly as possible leads to faster & more responsive games or ads that drive the highest click-through rates.

Amazon DynamoDB Accelerator
In order to support demanding, read-heavy workloads, we are launching a public preview of the Amazon DynamoDB Accelerator, otherwise known as DAX.

DAX is a fully managed caching service that sits (logically) in front of your DynamoDB tables. It operates in write-through mode, and is API-compatible with DynamoDB. Responses are returned from the cache in microseconds, making DAX a great fit for eventually-consistent read-intensive workloads. DAX is seamless and easy to use. As a managed service, you simply create your DAX cluster and use it as the target for your existing reads and writes. You don’t have to worry about patching, cluster maintenance, replication, or fault management.

Each DAX cluster can contain 1 to 10 nodes; you can add nodes in order to increase overall read throughput. The cache size (also known as the working set) is based on the node size (dax.r3.large to dax.r3.8xlarge) that you choose when you create the cluster. Clusters run within a VPC, with nodes spread across Availability Zones.

You will need to use the DAX SDK for Java to communicate with DAX. This SDK communicates with your cluster using a low-level TCP interface that is fine-tuned for low latency and high throughput (we’ll support access to DAX through other languages as quickly as possible).

Creating a DAX Cluster
Let’s create a DAX cluster from the DynamoDB Console (API and CLI support is also available). I open up the console and click on Create cluster to get started:

I enter a name and description, choose a node type, and set the initial size of my cluster. Then I create an IAM role and policy that gives DAX permission to access my DynamoDB tables (I can also choose an existing role):

The console allows me to create a policy that grants access to a single table. I am add additional tables to the policy using the IAM Console.

Next, I create a subnet group that DAX uses to place cluster nodes. I name the group and choose the desired subnets:

I accept the default settings and then click on Launch cluster:

My cluster is ready to use within minutes:

The next step is to update my application to use the DAX SDK for Java and to configure it to use the endpoint of my cluster (dax1.seutl3.clustercfg.dax.use1.cache.amazonaws.com:8111 in this case).

Once my application is up and running, I can visit the Metrics tab to see how well the cache is performing. The Amazon CloudWatch metrics include cache hits and misses, request counts, error counts, and so forth:

I can use the Alarms tab to create a CloudWatch Alarm for any of the metrics. Perhaps I want to know if an excessive number of cache misses are taking place:

I can use the Nodes tab to see the nodes in my cluster. I can also add new nodes or delete existing ones:

In order to see how DAX works, I installed the DAX Sample Application and ran it twice. The first run accessed DynamoDB directly and demonstrated the non-cached, baseline performance:

As you can see from the middle group of results, the queries ran in 2.9 to 11.3 milliseconds. The second run used DAX and showed the effect of caching on performance:

The first iteration of each test results in a cache miss. The subsequent iterations retrieve the results from the cache, and are (as you can see) quite a bit faster.

Things to Know
Here are a few things to keep in mind as you think about how to put DAX to use in your environment:

Java API – As I mentioned earlier, we are launching this public preview with support for Java. with plans to add support for other languages. DAX is API-compatible with DynamoDB so there’s no need to write your own caching logic or make changes to your code.

Consistency – DAX offers the best opportunity for performance gains when you are using eventually consistent reads that can be served from the in-memory cache (DAX always refers back to the DynamoDB table when processing consistent reads).

Write-Throughs – DAX is a write-through cache. However, if there is a weak correlation between what you read and what you write, you may want to direct your writes to DynamoDB. This will allow DAX to be of greater assistance for your reads.

Deprovisioning – After you have put DAX to use in your environment, you should be able to reduce the amount of read capacity provisioned for the underlying tables. This will reduce your costs (dramatically in many cases), while allowing DAX to provide spare capacity for sudden surges in usage.

Available Now
The public preview of DAX is available today in the US East (Northern Virginia), US West (Oregon), and EU (Ireland) Regions and you can sign up today. You can use the public preview at no charge and you can also learn more by reading the DAX Developer Guide.

Jeff;

 

 

Introducing Amazon CloudWatch Logs Integration for AWS OpsWorks Stacks

Post Syndicated from Kai Rubarth original https://aws.amazon.com/blogs/devops/introducing-amazon-cloudwatch-logs-integration-for-aws-opsworks-stacks/

AWS OpsWorks Stacks now supports Amazon CloudWatch Logs. This benefits all users who want to stream their log files from OpsWorks instances to CloudWatch. This enables you to take advantage of CloudWatch Logs features such as centralized log archival, real-time monitoring of log data, or generating CloudWatch alarms. Until now, OpsWorks customers had to manually install and configure the CloudWatch Logs agent on every instance they wanted to ship log data from. Now, with the built-in CloudWatch Logs integration these features are made available with just a few clicks across all instances in a layer.

The OpsWorks CloudWatch logs integration is compatible with all Chef 11 and Chef 12 Linux stacks. To get started, simply click the new CloudWatch Logs tab in your layer settings screen. OpsWorks agent command logs, which include Chef recipe output, are enabled by default; to enable logging for custom log files, simply input an absolute path or file pattern for every log file you want shipped.

It’s that simple. OpsWorks creates log groups and log streams for you, and starts streaming your system and application logs within a few minutes.

For more information, see Using Amazon CloudWatch Logs with AWS OpsWorks Stacks in the AWS OpsWorks User Guide.

AWS Big Data Blog Month in Review: March 2017

Post Syndicated from Derek Young original https://aws.amazon.com/blogs/big-data/aws-big-data-blog-month-in-review-march-2017/

Another month of big data solutions on the Big Data Blog. Please take a look at our summaries below and learn, comment, and share. Thank you for reading!

Analyze Security, Compliance, and Operational Activity Using AWS CloudTrail and Amazon Athena
In this blog post, walk through how to set up and use the recently released Amazon Athena CloudTrail SerDe to query CloudTrail log files for EC2 security group modifications, console sign-in activity, and operational account activity.  

Big Updates to the Big Data on AWS Training Course!
AWS offers a range of training resources to help you advance your knowledge with practical skills so you can get more out of the cloud. We’ve updated Big Data on AWS, a three-day, instructor-led training course to keep pace with the latest AWS big data innovations. This course allows you to hear big data best practices from an expert, get answers to your questions in person, and get hands-on practice using AWS big data services. 

Analyzing VPC Flow Logs with Amazon Kinesis Firehose, Amazon Athena, and Amazon QuickSight
In this blog post, build a serverless architecture using Amazon Kinesis Firehose, AWS Lambda, Amazon S3, Amazon Athena, and Amazon QuickSight to collect, store, query, and visualize flow logs. In building this solution, you also learn how to implement Athena best practices with regard to compressing and partitioning data so as to reduce query latencies and drive down query costs. 

Amazon Redshift Monitoring Now Supports End User Queries and Canaries
The serverless Amazon Redshift Monitoring utility lets you gather important performance metrics from your Redshift cluster’s system tables and persists the results in Amazon CloudWatch. You can now create your own diagnostic queries and plug-in “canaries” that monitor the runtime of your most vital end user queries. These user-defined metrics can be used to create dashboards and trigger Alarms and should improve visibility into workloads running on a Cluster.  

Running R on Amazon Athena
In this blog post, connect R/RStudio running on an Amazon EC2 instance with Athena. You’ll learn to build a simple interactive application with Athena and R. Athena can be used to store and query the underlying data for your big data applications using standard SQL, while R can be used to interactively query Athena and generate analytical insights using the powerful set of libraries that R provides. This post has been translated into Japanese. 

Top 10 Performance Tuning Tips for Amazon Athena
In this blog post, we review the top 10 tips that can improve query performance. We focus on aspects related to storing data in Amazon S3 and tuning specific to queries. Amazon Athena uses Presto to run SQL queries and hence some of the advice will work if you are running Presto on Amazon EMR. This post has been translated into Japanese. 

Big Data Resources on the AWS Knowledge Center
The AWS Knowledge Center answers the questions we receive most frequently from AWS customers. It is a resource for you that is distinct from AWS Documentation, the AWS Discussion Forums, and the AWS Support Center. It covers questions from across every AWS service. This post is an introduction to Big Data resources on the AWS Knowledge Center. 

Encrypt and Decrypt Amazon Kinesis Records Using AWS KMS
In this bog post, learn to build encryption and decryption into sample Kinesis producer and consumer applications using the Amazon Kinesis Producer Library (KPL), the Amazon Kinesis Consumer Library (KCL), AWS KMS, and the aws-encryption-sdk. The methods and the techniques used in this post to encrypt and decrypt Kinesis records can be easily replicated into your architecture.

Want to learn more about Big Data or Streaming Data? Check out our Big Data and Streaming data educational pages.

Leave a comment below to let us know what big data topics you’d like to see next on the AWS Big Data Blog.

Build your own Crystal Maze at Home

Post Syndicated from Laura Sach original https://www.raspberrypi.org/blog/build-crystal-maze/

I recently discovered a TV channel which shows endless re-runs of the game show The Crystal Maze, and it got me thinking: what resources are available to help the younger generation experience the wonder of this iconic show? Well…

Enter the Crystal Maze

If you’re too young to remember The Crystal Maze, or if you come from a country lacking this nugget of TV gold, let me explain. A band of fairly useless contestants ran around a huge warehouse decked out to represent four zones: Industrial, Aztec, Futuristic, and Medieval. They were accompanied by a wisecracking host in a fancy coat, Richard O’Brien.

A GIF of Crystal Maze host Richard O'Brien having fun on set. Build your own Raspberry Pi Crystal Maze

Richard O’Brien also wrote The Rocky Horror Picture Show so, y’know, he was interesting to watch if nothing else.

The contestants would enter rooms to play themed challenges – the categories were mental, physical, mystery, and skill – with the aim of winning crystals. If they messed up, they were locked in the room forever (well, until the end of the episode). For every crystal they collected, they’d be given a bit more time in a giant crystal dome at the end of the programme. And what did they do in the dome? They tried to collect pieces of gold paper while being buffeted by a wind machine, of course!

A GIF of a boring prize being announced to the competing team. Build your own Raspberry Pi Crystal Maze

Collect enough gold paper and you win a mediocre prize. Fail to collect enough gold paper and you win a mediocre prize. Like I said: TV gold.

Sounds fun, doesn’t it? Here are some free resources that will help you recreate the experience of The Crystal Maze in your living room…without the fear of being locked in.

Marble maze

Image of Crystal Maze Board Game

Photo credit: Board Game Geek

Make the classic Crystal Maze game, but this time with a digital marble! Use your Sense HAT to detect pitch, roll, and yaw as you guide the marble to its destination.

Bonus fact: marble mazes featured in the Crystal Maze board game from the 1990s.

Buzz Wire

Crystal Maze Buzz Wire game screengrab

Photo credit: Board Game Geek

Guide the hook along the wire and win the crystal! Slip up and buzz three times, though, and it’s an automatic lock-in. The beauty of this make is that you can play any fail sound you like: burp wire, anyone? Follow the tutorial by community member David Pride, which he created for the Cotswold Jam.

Laser tripwire

Crystal Maze laser trip wire screengrab

Photo credit: Marc Gerrish

Why not recreate the most difficult game of all? Can you traverse a room without setting off the laser alarms, and grab the crystal? Try your skill with our laser tripwire resource!

Forget the crystal! Get out!

I would love to go to a school fête where kids build their own Crystal Maze-style challenges. I’m sure there are countless other events which you could jazz up with a fun digital making challenge, though the bald dude in a fur coat remains optional. So if you have made your own Crystal Maze challenge, or you try out one of ours, we’d love to hear about it!

Here at the Raspberry Pi Foundation, we take great pride in the wonderful free resources we produce for you to use in classes, at home, and in coding clubs. We publish them under a Creative Commons licence, and they’re an excellent way to develop your digital making skills. And massive thanks to David Pride and the Cotswold Jam for creating and sharing your great resources for free.

The post Build your own Crystal Maze at Home appeared first on Raspberry Pi.

Scaling Your Desktop Application Streams with Amazon AppStream 2.0

Post Syndicated from Bryan Liston original https://aws.amazon.com/blogs/compute/scaling-your-desktop-application-streams-with-amazon-appstream-2-0/

Want to stream desktop applications to a web browser, without rewriting them? Amazon AppStream 2.0 is a fully managed, secure, application streaming service. An easy way to learn what the service does is to try out the end-user experience, at no cost.

In this post, I describe how you can scale your AppStream 2.0 environment, and achieve some cost optimizations. I also add some setup and monitoring tips.

AppStream 2.0 workflow

You import your applications into AppStream 2.0 using an image builder. The image builder allows you to connect to a desktop experience from within the AWS Management Console, and then install and test your apps. Then, create an image that is a snapshot of the image builder.

After you have an image containing your applications, select an instance type and launch a fleet of streaming instances. Each instance in the fleet is used by only one user, and you match the instance type used in the fleet to match the needed application performance. Finally, attach the fleet to a stack to set up user access. The following diagram shows the role of each resource in the workflow.

Figure 1: Describing an AppStream 2.0 workflow

appstreamscaling_1.png

Setting up AppStream 2.0

To get started, set up an example AppStream 2.0 stack or use the Quick Links on the console. For this example, I named my stack ds-sample, selected a sample image, and chose the stream.standard.medium instance type. You can explore the resources that you set up in the AWS console, or use the describe-stacks and describe-fleets commands as follows:

Figure 2: Describing an AppStream 2.0 stack

appstreamscaling_1.png

Figure 3: Describing an AppStream 2.0 fleet

appstreamscaling_2.43%20AM

To set up user access to your streaming environment, you can use your existing SAML 2.0 compliant directory. Your users can then use their existing credentials to log in. Alternatively, to quickly test a streaming connection, or to start a streaming session from your own website, you can create a streaming URL. In the console, choose Stacks, Actions, Create URL, or call create-streaming-url as follows:

Figure 4: Creating a streaming URL

appstreamscaling_3.png

You can paste the streaming URL into a browser, and open any of the displayed applications.

appstreamscaling_4.30%20PM

Now that you have a sample environment set up, here are a few tips on scaling.

Scaling and cost optimization for AppStream 2.0

To provide an instant-on streaming connection, the instances in an AppStream 2.0 fleet are always running. You are charged for running instances, and each running instance can serve exactly one user at any time. To optimize your costs, match the number of running instances to the number of users who want to stream apps concurrently. This section walks through three options for doing this:

  • Fleet Auto Scaling
  • Fixed fleets based on a schedule
  • Fleet Auto Scaling with schedules

Fleet Auto Scaling

To dynamically update the number of running instances, you can use Fleet Auto Scaling. This feature allows you to scale the size of the fleet automatically between a minimum and maximum value based on demand. This is useful if you have user demand that changes constantly, and you want to scale your fleet automatically to match this demand. For examples about setting up and managing scaling policies, see Fleet Auto Scaling.

You can trigger changes to the fleet through the available Amazon CloudWatch metrics:

  • CapacityUtilization – the percentage of running instances already used.
  • AvailableCapacity – the number of instances that are unused and can receive connections from users.
  • InsufficientCapacityError – an error that is triggered when there is no available running instance to match a user’s request.

You can create and attach scaling policies using the AWS SDK or AWS Management Console. I find it convenient to set up the policies using the console. Use the following steps:

  1. In the AWS Management Console, open AppStream 2.0.
  2. Choose Fleets, select a fleet, and choose Scaling Policies.
  3. For Minimum capacity and Maximum capacity, enter values for the fleet.

Figure 5: Fleets tab for setting scaling policies

appstreamscaling_5.png

  1. Create scale out and scale in policies by choosing Add Policy in each section.

Figure 6: Adding a scale out policy

appstreamscaling_6.png

Figure 7: Adding a scale in policy

appstreamscaling_7.png

After you create the policies, they are displayed as part of your fleet details.

appstreamscaling_8.png

The scaling policies are triggered by CloudWatch alarms. These alarms are automatically created on your behalf when you create the scaling policies using the console. You can view and modify the alarms via the CloudWatch console.

Figure 8: CloudWatch alarms for triggering fleet scaling

appstreamscaling_9.png

Fixed fleets based on a schedule

An alternative option to optimize costs and respond to predictable demand is to fix the number of running instances based on the time of day or day of the week. This is useful if you have a fixed number of users signing in at different times of the day― scenarios such as a training classes, call center shifts, or school computer labs. You can easily set the number of instances that are running using the AppStream 2.0 update-fleet command. Update the Desired value for the compute capacity of your fleet. The number of Running instances changes to match the Desired value that you set, as follows:

Figure 9: Updating desired capacity for your fleet

appstreamscaling_10.png

Set up a Lambda function to update your fleet size automatically. Follow the example below to set up your own functions. If you haven’t used Lambda before, see Step 2: Create a HelloWorld Lambda Function and Explore the Console.

To create a function to change the fleet size

  1. In the Lambda console, choose Create a Lambda function.
  2. Choose the Blank Function blueprint. This gives you an empty blueprint to which you can add your code.
  3. Skip the trigger section for now. Later on, you can add a trigger based on time, or any other input.
  4. In the Configure function section:
    1. Provide a name and description.
    2. For Runtime, choose Node.js 4.3.
    3. Under Lambda function handler and role, choose Create a custom role.
    4. In the IAM wizard, enter a role name, for example Lambda-AppStream-Admin. Leave the defaults as is.
    5. After the IAM role is created, attach an AppStream 2.0 managed policy “AmazonAppStreamFullAccess” to the role. For more information, see Working with Managed Policies. This allows Lambda to call the AppStream 2.0 API on your behalf. You can edit and attach your own IAM policy, to limit access to only actions you would like to permit. To learn more, see Controlling Access to Amazon AppStream 2.0.
    6. Leave the default values for the rest of the fields, and choose Next, Create function.
  5. To change the AppStream 2.0 fleet size, choose Code and add some sample code, as follows:
    'use strict';
    
    /**
    This AppStream2 Update-Fleet blueprint sets up a schedule for a streaming fleet
    **/
    
    const AWS = require('aws-sdk');
    const appstream = new AWS.AppStream();
    const fleetParams = {
      Name: 'ds-sample-fleet', /* required */
      ComputeCapacity: {
        DesiredInstances: 1 /* required */
    
      }
    };
    
    exports.handler = (event, context, callback) => {
        console.log('Received event:', JSON.stringify(event, null, 2));
    
        var resource = event.resources[0];
        var increase = resource.includes('weekday-9am-increase-capacity')
    
        try {
            if (increase) {
                fleetParams.ComputeCapacity.DesiredInstances = 3
            } else {
                fleetParams.ComputeCapacity.DesiredInstances = 1
            }
            appstream.updateFleet(fleetParams, (error, data) => {
                if (error) {
                    console.log(error, error.stack);
                    return callback(error);
                }
                console.log(data);
                return callback(null, data);
            });
        } catch (error) {
            console.log('Caught Error: ', error);
            callback(error);
        }
    };

  6. Test the code. Choose Test and use the “Hello World” test template. The first time you do this, choose Save and Test. Create a test input like the following to trigger the scaling update.

    appstreamscaling_11.png

  7. You see output text showing the result of the update-fleet call. You can also use the CLI to check the effect of executing the Lambda function.

Next, to set up a time-based schedule, set a trigger for invoking the Lambda function.

To set a trigger for the Lambda function

  1. Choose Triggers, Add trigger.
  2. Choose CloudWatch Events – Schedule.
  3. Enter a rule name, such as “weekday-9am-increase-capacity”, and a description. For Schedule expression, choose cron. You can edit the value for the cron later.
  4. After the trigger is created, open the event weekday-9am-increase-capacity.
  5. In the CloudWatch console, edit the event details. To scale out the fleet at 9 am on a weekday, you can adjust the time to be: 00 17 ? * MON-FRI *. (If you’re not in Seattle (Pacific Time Zone), change this to another specific time zone).
  6. You can also add another event that triggers at the end of a weekday.

appstreamscaling_12.png

This setup now triggers scale-out and scale-in automatically, based on the time schedule that you set.

Fleet Auto Scaling with schedules

You can choose to combine both the fleet scaling and time-based schedule approaches to manage more complex scenarios. This is useful to manage the number of running instances based on business and non-business hours, and still respond to changes in demand. You could programmatically change the minimum and maximum sizes for your fleet based on time of day or day of week, and apply the default scale-out or scale-in policies. This allows you to respond to predictable minimum demand based on a schedule.

For example, at the start of a work day, you might expect a certain number of users to request streaming connections at one time. You wouldn’t want to wait for the fleet to scale out and meet this requirement. However, during the course of the day, you might expect the demand to scale in or out, and would want to match the fleet size to this demand.

To achieve this, set up the scaling polices via the console, and create a Lambda function to trigger changes to the minimum, maximum, and desired capacity for your fleet based on a schedule. Replace the code for the Lambda function that you created earlier with the following code:

'use strict';

/**
This AppStream2 Update-Fleet function sets up a schedule for a streaming fleet
**/

const AWS = require('aws-sdk');
const appstream = new AWS.AppStream();
const applicationAutoScaling = new AWS.ApplicationAutoScaling();

const fleetParams = {
  Name: 'ds-sample-fleet', /* required */
  ComputeCapacity: {
    DesiredInstances: 1 /* required */
  }
};

var scalingParams = {
  ResourceId: 'fleet/ds-sample-fleet', /* required - fleet name*/
  ScalableDimension: 'appstream:fleet:DesiredCapacity', /* required */
  ServiceNamespace: 'appstream', /* required */
  MaxCapacity: 1,
  MinCapacity: 6,
  RoleARN: 'arn:aws:iam::659382443255:role/service-role/ApplicationAutoScalingForAmazonAppStreamAccess'
};

exports.handler = (event, context, callback) => {
    
    console.log('Received this event now:', JSON.stringify(event, null, 2));
    
    var resource = event.resources[0];
    var increase = resource.includes('weekday-9am-increase-capacity')

    try {
        if (increase) {
            //usage during business hours - start at capacity of 10 and scale
            //if required. This implies at least 10 users can connect instantly. 
            //More users can connect as the scaling policy triggers addition of
            //more instances. Maximum cap is 20 instances - fleet will not scale
            //beyond 20. This is the cap for number of users.
            fleetParams.ComputeCapacity.DesiredInstances = 10
            scalingParams.MinCapacity = 10
            scalingParams.MaxCapacity = 20
        } else {
            //usage during non-business hours - start at capacity of 1 and scale
            //if required. This implies only 1 user can connect instantly. 
            //More users can connect as the scaling policy triggers addition of
            //more instances. 
            fleetParams.ComputeCapacity.DesiredInstances = 1
            scalingParams.MinCapacity = 1
            scalingParams.MaxCapacity = 10
        }
        
        //Update minimum and maximum capacity used by the scaling policies
        applicationAutoScaling.registerScalableTarget(scalingParams, (error, data) => {
             if (error) console.log(error, error.stack); 
             else console.log(data);                     
            });
            
        //Update the desired capacity for the fleet. This sets 
        //the number of running instances to desired number of instances
        appstream.updateFleet(fleetParams, (error, data) => {
            if (error) {
                console.log(error, error.stack);
                return callback(error);
            }

            console.log(data);
            return callback(null, data);
        });
            
    } catch (error) {
        console.log('Caught Error: ', error);
        callback(error);
    }
};

Note: To successfully execute this code, you need to add IAM policies to the role used by the Lambda function. The policies allow Lambda to call the Application Auto Scaling service on your behalf.

Figure 10: Inline policies for using Application Auto Scaling with Lambda

{
"Version": "2012-10-17",
"Statement": [
   {
      "Effect": "Allow", 
         "Action": [
            "iam:PassRole"
         ],
         "Resource": "*"
   }
]
}
{
"Version": "2012-10-17",
"Statement": [
   {
      "Effect": "Allow", 
         "Action": [
            "application-autoscaling:*"
         ],
         "Resource": "*"
   }
]
}

Monitoring usage

After you have set up scaling for your fleet, you can use CloudWatch metrics with AppStream 2.0, and create a dashboard for monitoring. This helps optimize your scaling policies over time based on the amount of usage that you see.

For example, if you were very conservative with your initial set up and over-provisioned resources, you might see long periods of low fleet utilization. On the other hand, if you set the fleet size too low, you would see high utilization or errors from insufficient capacity, which would block users’ connections. You can view CloudWatch metrics for up to 15 months, and drive adjustments to your fleet scaling policy.

Figure 11: Dashboard with custom Amazon CloudWatch metrics

appstreamscaling_13.53%20PM

Summary

These are just a few ideas for scaling AppStream 2.0 and optimizing your costs. Let us know if these are useful, and if you would like to see similar posts. If you have comments about the service, please post your feedback on the AWS forum for AppStream 2.0.

Amazon CloudWatch launches Alarms on Dashboards

Post Syndicated from Tara Walker original https://aws.amazon.com/blogs/aws/amazon-cloudwatch-launches-alarms-on-dashboards/

Amazon CloudWatch is a service that gives customers the ability to monitor their applications, systems, and solutions running on Amazon Web Services by providing and collecting metrics, logs, and events about AWS resources in real time. CloudWatch automatically provides key resource measurements such as; latency, error rates, and CPU usage, while also enabling monitoring of custom metrics via customer-supplied logs and system data.

Last November, Amazon CloudWatch added new Dashboard Widgets to provide additional data visualization options for all available metrics. In order to provide customers with even more insight into their solutions and resources running on AWS, CloudWatch has launched Alarms on Dashboards. With this alarms enhancement, customers can view alarms and metrics in the same dashboard widget enabling them to perform data-driven troubleshooting and analysis.

CloudWatch dashboards are designed with a goal of providing better visibility when monitoring AWS resources across regions in a consolidated view. Since CloudWatch dashboards are highly customizable, users can create their own custom dashboards to graphically represent data for varying metrics such as utilization, performance, estimated billing, and now alarm conditions. An alarm tracks a single metric over time based on the value of the metric in relation to a specified threshold. When the alarm state changes, an action such an Auto Scaling policy is executed or a notification is sent to Amazon SNS, among other options.

With the ability to add alarms to dashboards, CloudWatch users have another mechanism to proactively monitor and receive alerts about their AWS resources and applications across multiple regions. In addition, the metric data associated with an alarm, which has been added to a dashboard, can be charted and reviewed. Alarms have three possible states:

  • OK: The value of the alarm metric does not meet the threshold
  • INSUFFICIENT DATA: Initial triggering of alarm metric or alarm metric data does not have enough data to determine whether it’s in the OK state or the ALARM state
  • ALARM: The value of the alarm metric meets the threshold

When added to a dashboard, alarms are displayed in red when in the Alarm state, gray when in the Insufficient data state and shown with no color fill when the alarm is in the OK state. Alarms added to a dashboard are supported with the following widgets: Line, Number, and Stacked Graph widgets.

  • Number widget: provides a quick and efficient view of the latest value of any desired metric. Using the widget with alarms, the view of the state of the alarm is shown with different background colors for the latest metric data.
  • Line widget: allows the visualization of the actual value of any collection of chosen metrics. Provides a view on the dashboard of the state of the alarm, which displays the alarm threshold and condition as a horizontal line. The threshold line can act as a good indicator to view the degree of the alarm.
  • Stack graph widget: allows customers to visualize the net total effect of any collection of chosen metrics. The stacked graph widget loads one metric over another in order to illustrate the distribution and contribution of a metric and has the option to display the contribution of metrics in percentages. With alarms, it also provides a view of the state of the alarm, which displays the alarm threshold and condition as a horizontal line.

Currently, adding multiple metrics onto the same widget for an alarm is in the works and this feature is evolving based on customer feedback.

Adding Alarms on Dashboards

Let’s take a quick look at the utilizing the Alarms on a CloudWatch Dashboard. In the AWS Console, I will go to the CloudWatch service. When in the CloudWatch console, select Dashboards. I will click the Create dashboard button and create the CloudWatchBlog dashboard.

 

Upon creation of my CloudWatchBlog dashboard, a dialog box will open to allow me to add widgets to the dashboard. I will forego adding widgets for now since I want to focus on adding alarms on my dashboard. Therefore, I will hit the Cancel button here and go to the Alarms section of the CloudWatch console.

Once in the Alarms section of the CloudWatch console, you will see all of your alarms and the state of each of the alarms for the current region displayed.

As we mentioned earlier, there are three types of alarm states and as you can see in my console above that all of the different alarms states for various alarms are being displayed. If desired, you can adjust your filter on the console to display alarms filtered by the alarm state type.

As an example, I am only interested in viewing the alarms with an alarm state of ALARM. Therefore, I will adjust the filter to show only the alarms in the current region with an alarm state as ALARM.

Now only the two alarms that have a current alarm state of ALARM are displayed. One of these alarms is for monitoring the provisioned write capacity units of an Amazon DynamoDB table, and the other is to monitor the CPU utilization of my active Amazon Elasticsearch instance.

Let’s examine the scenario in which I leverage my CloudWatchBlog dashboard as my troubleshooting mechanism for identifying and diagnosing issues with my Elasticsearch solution and its instances. I will first add the Amazon Elasticsearch CPU utilization alarm, ES Alarm, to my CloudWatchBlog dashboard. To add the alarm, I simply select the checkbox by the desired alarm, which in this case is ES Alarm. Then with the alarm selected, I click the Add to Dashboard button.

The Add to dashboard dialog box will open, allowing me to select my CloudWatchBlog dashboard. Additionally, I can select the widget type I would like to use for the display of my alarm. For the ES Alarm, I will choose the Line widget and complete the process of adding this alarm to my dashboard by clicking the Add to dashboard button.

Upon successfully adding ES Alarm to the CloudWatchBlog dashboard, you will see a confirmation notice displayed in the CloudWatch console.

If I then go to the Dashboard section of the console and select my CloudWatchBlog dashboard, I will see the line widget for my alarm, ES Alarm, on the dashboard. To ensure that my ES Alarm widget is a permanent part of the dashboard, I will click the Save dashboard button to preserve the addition of this widget on the dashboard.

As we discussed, one of the benefits of utilizing a CloudWatch dashboard is the ability to add several alarms from various regions onto a dashboard. Since my scenario is leveraging my dashboard as a troubleshooting mechanism for my Elasticsearch solution, I would like to have several alarms and metrics related to my solution displayed on the CloudWatchBlog dashboard. Given this, I will create another alarm for my Elasticsearch instance and add it to my dashboard.

I will first return to the Alarms section of the console and click the Create Alarm button.

The Create Alarm dialog box is displayed showing all of the current metrics available in this region. From the summary, I can quickly see that there are 21 metrics being tracked for Elasticsearch. I will click on the ES Metrics link to view the individual metrics that can be used to create my alarm.

I can review the individual metrics shown for my Elasticsearch instance, and choose which metric I want to base my new alarm on. In this case, I choose the WriteLatency metric by selecting the checkbox for this metric and then click the Next button.

 

The next screen is where I fill in all the details about my alarm: name, description, alarm threshold, time period, and alarm action. I will name my new alarm, ES Latency Alarm, and complete the rest of the aforementioned data fields. To complete the creation of my new alarm, I click the Create Alarm button.

I will see a confirmation message box at the top of the Alarms console upon successful completion of adding the alarm, and the status of the newly created alarm will be displayed in the alarms list.

Now I will add my ES Latency Alarm to my CloudWatchBlog dashboard. Again, I click on the checkbox by the alarm and then click the Add to Dashboard button.

This time when the Add to Dashboard dialog comes up, I will choose the Stacked area widget to display the ES Latency Alarm on my CloudWatchBlog dashboard. Clicking the Add to Dashboard button will complete the addition of my ES Latency Alarm widget to the dashboard.

Once back in the console, again I will see the confirmation noting the successful addition of the widget. I go to the Dashboards and click on the CloudWatchBlog dashboard and I can now view the two widgets in my dashboard. To include this widget in the dashboard permanently, I click the Save dashboard button.

The final thing to note about the new CloudWatch feature, Alarms on Dashboards, is that alarms and metrics from other regions can be added to the dashboard for a complete view for troubleshooting. Let’s add a metric to the dashboard with the alarms widget.

Within the console, I will move from my current region, US East (Ohio), to the US East (N. Virginia) region.

Now I will go to the Metric section of the CloudWatch console. This section displays the metrics from services used in the US East (N. Virginia) region.

My Elasticsearch solution triggers Lambda functions to capture all of the EmployeeInfo DynamoDB database CRUD (Create, Read, Update, Delete) changes via DynamoDB streams and write those changes into my Elasticsearch domain, taratestdomain. Therefore, I will add metrics to my CloudWatchBlog dashboard to track table metrics from DynamoDB.

Therefore, I am going to add the EmployeeInfo database ProvisionedWriteCapacityUnits metric to my CloudWatchBlog dashboard.

Back again in the Add to Dashboard dialog, I will select my CloudWatchBlog dashboard and choose to display this metric using the Number widget.

Now, the ProvisionedWriteCapacityUnits metric from the US East (N. Virginia) is displayed in the CloudWatchBlog dashboard with the Number widget added to the dashboard to with the alarms from the US East (Ohio). To make this update permanent in the dashboard, I will (you guessed it!) click the Save dashboard button.

Summary

Getting started with alarms on dashboards is easy. You can use alarms on dashboards across regions for another means of proactively monitoring alarms, build troubleshooting playbooks, and view desired metrics. You can also choose the metric first in the Metric UI and then change the type of widget according to the visualization that fits the metric.

Alarms on Dashboards are supported on Line, Stacked Area, and Number widgets. In addition, you can use Text widgets next to alarms on a dashboard to add steps or observations on how to handle changes in the alarm state. To learn more about Amazon CloudWatch widgets and about the additional dashboard widgets, visit the Amazon CloudWatch documentation and the CloudWatch Getting Started guide.

 

Tara

Amazon Connect – Customer Contact Center in the Cloud

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-connect-customer-contact-center-in-the-cloud/

Customer service is essential to the success of any business. In order to provide voice-based customer service at scale, many organizations operate call centers. At the low end, call centers simply route incoming calls to any available agent. More sophisticated systems support more sophisticated routing and interaction, including the ability to create customized call trees and other Integrated Voice Response (IVR) systems. Traditionally, IVR systems have been difficult to install and expensive to license, with capacity-based pricing the norm.

The Amazon customer service organization aims to deliver an exceptional level of services to our customers. In order to do this at scale, the organization employs tens of thousands of agents working at more than 50 groups and subsidiaries centers scattered around the world, supporting customers that speak a wide variety of languages.

Welcome to Amazon Connect
Today I would like to introduce you to Amazon Connect. Building on the same technology used by many of our own customer service teams, Amazon Connect lets you set up a cloud-based contact center in minutes. You create your contact center, design your contact flows (similar to IVRs), and on-board your agents using a modern interface that is entirely web-based.

Instead of requiring help from IT teams and specialized consultants, Amazon Connect is simple enough to be configured and run directly by business decision makers. There’s no hardware to deploy and no per-agent licenses. Instead, you pay based on the number of customer-minutes and the amount of phone time that you consume. This scalable, pay-as-you-go model means that you can use Amazon Connect in situations where your call volume is unpredictable, spiky, or both.

Amazon Connect works with many different AWS services including:

Amazon S3Amazon Connect uses S3 to provide unlimited, encrypted storage of calls (audio) and reports.

AWS Lambda – This give you the ability to run code in serverless fashion as part of a customer contact. The code can pull data from your CRM or your database and use the data to provide a personalized customer experience.

Amazon Lex – Customer contacts can make use of natural language, conversational interfaces powered by the same technology behind Alexa. This aspect of Amazon Connect is being released in preview form and I’ll have more information about it some time soon.

AWS Directory ServiceAmazon Connect can reference an existing Active Directory or it can create a new one. The directory is used to store user (administrator, manager, or agent) identities and permissions.

Amazon KinesisAmazon Connect can stream contact trace records (CTRs) to Amazon Kinesis. From there they can be pushed to Amazon S3 or Amazon Redshift and analyzed using Amazon QuickSight or other business analytics tools.

Amazon CloudWatchAmazon Connect publishes real-time operational metrics to CloudWatch. These metrics will tell you how many calls are arriving per second, how many are being held in queues, and so forth. You can use these metrics to observe the performance of your contact center and to make sure that you have the right number of agents on hand.

Amazon Connect Benefits & Features
Let’s take a look at the major benefits and features of Amazon Connect.

Cloud-PoweredAmazon Connect is an integral part of AWS, and is designed to be robust, scalable, and highly available. Each contact center instance runs in multiple AWS Availability Zones.

Simple – As I already noted, Amazon Connect was designed to be set up and run by business decision makers. The graphical console makes the setup process (including the design of graphical call flows) easy and efficient.

Flexible – The Contact Flow Editor (CFE) that powers Amazon Connect includes blocks for interaction, integration, control flow, branching, and more. Call flows can include a mixture of prerecorded audio prompts, generated audio, Lex-powered interaction, integration with existing systems and databases, conversations with agents, call transfers, and more.

EconomicalAmazon Connect‘s pay-as-you-go model keeps operating costs in line with actual usage. On the agent side, Amazon Connect includes a softphone that supports high quality audio.

Amazon Connect Tour
Let’s set up a customer contact center, starting at the Amazon Connect console. I click on Get started:

I can choose to use an existing directory or I can create a new one, which I’ll do. I’ll set up the URL for my contact center to proceed. This URL will allow my contact center’s users to sign in:

Next, I set myself up as the administrator:

My contact center can accept incoming calls, make outgoing calls, or both. Outgoing calls can be made as part of a contact flow or by an agent. I simply specify what I need (I’ll choose my direct dial and/or toll-free numbers later):

Next, I specify the S3 locations (buckets and prefixes) for call recording and reports. I can use separate buckets and prefixes for each, along with distinct AWS Key Management Service (KMS) encryption keys:

Then I confirm my choices and click on Create instance:

A few seconds later my Amazon Connect instance is ready to go, and I can click on Get started:

Amazon Connect logs me in using the administrator credentials that I entered earlier, and then I click on Let’s go to proceed:

Now I can get set up to accept my first call, by choosing a toll free or direct dial phone number:

I am ready to accept my first call:

I call the number on my cell and walk through the options. I press “1” to speak to an agent, and the softphone offers to let me (playing the role of the agent) take the call:

Now that the Amazon Connect is set up and accepting calls, I can explore the Dashboard and configure my contact center:

I can see the phone numbers that are associated with the call center:

And I can claim additional numbers (up to 10 direct inward dialing and 10 toll free), choosing from numbers in a wide list of countries:

At the bottom of the screen I can choose the contact flow that is activated when the number is called:

Returning to the dashboard, the next step is to set the hours of operation. I can see the list of existing hours of operation, and I can create a new one:

I can have multiple hours of operation, each of which can be attached to a queue or referenced from a contact flow.

Now I can create queues to model the routing of contacts (incoming calls) to agents. Queues can represent priorities (different levels of regular and premium support), or agents with different business or language skills.

I also have the ability to create prompts. These are audio snippets that can be referenced from a contact flow. I can upload an existing WAV file or I can record one from within the browser:

Next, I can create a new contact flow or edit an existing one. A contact flow defines the customer experience after they make contact. Here is the sample flow that is set up by default:

I can find the desired block in the menu, drag it on to the canvas, and connect it in to the flow:

Then I click on the header of the block to set it up. I can use a prompt, or I can choose between static or dynamic text to speech (dynamic speech would reference a value attached to the call flow, perhaps resulting from a database lookup):

One of the most powerful and general blocks allows me to call a Lambda function during the contact flow. The function can pull data from a database, connect to a CRM, or implement any desired type of business logic. The function is provided with key-value pairs as input, and returns the same as output. The returned values are associated with the execution of the flow.

The next step is to create a routing profile, consisting of one or more queues, with associated priorities. The priorities allow agents to service multiple queues in the proper order (lower numbers are higher priority); you can combine priorities and delays to set up routing rules that are in accord with the needs of your users and your contact center:

The final step is to add and empower the users (administrators, managers, agents, and so forth). This can be done on a user-by-user basis or by importing a CSV file that contains definitions for multiple users. Here’s how I add a user named Hello World:

I can also create a hierarchy of agents (great for regional and/or divisional reporting).

Amazon Connect lets me use security profiles to control the permissions assigned to each type of user:

I also have access to operational metrics, both real-time and historical. Here are some of the real-time metrics (the time frame and displayed fields can be customized as needed):

The metrics are published to CloudWatch, where they can be used to drive alarms.

I hope that this brief tour has given you a reasonably thorough introduction to Amazon Connect, and would love to hear how you have used it to modernize your contact center and to improve the level of service that you can provide to your customers:

As you can see, it is a rich and powerful service, with plenty of room for customization and extension. There are plenty of advanced features and configuration options that you can explore on your own.

Available Now
Amazon Connect is available now in the US East (Northern Virginia) Region and you can create a contact center today!

Pricing, as I noted, is on a pay-as-you-go basis. You pay for each contact center and each active phone call (inbound or outbound) on a per-minute basis. As part of the AWS Free Tier, you get 90 minutes of contact center usage, two phone numbers (DID and toll-free), 30 minutes of inbound DID calls, 30 minutes of inbound toll-free calls, and 30 minutes of outbound calls, all per month, for an entire year.

Jeff;

PS – Visit the Amazon Connect Images album to see my entire LEGO build!

New Amazon AppStream 2.0 Features – Fleet Auto Scaling, Image Builder, SAML, Metrics, and Fleet Management

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-amazon-appstream-2-0-features-fleet-auto-scaling-image-builder-saml-metrics-and-fleet-management/

My colleague Gene Farrell introduced you to Amazon AppStream 2.0 late last year. In his guest post, Gene explained how AppStream 2.0 lets you run desktop applications securely on any device, from within the comfort of an HTML5 web browser (read the entire post to learn more). For example, I used the AppStream 2.0 Try it Now page to launch and then immediately start using Siemens Solid Edge. I simply chose the desired application from the Try it Now page:

I was running Solid Edge a few seconds later, no installation or setup needed:

By Popular Request – New Features for Enterprises, SMBs, and ISVs
Since that re:Invent launch, we have been fine-tuning AppStream 2.0, adding in some features that our customers have been asking for. These features will allow our customers to more easily deploy, access, manage and track the applications that they make available for use through AppStream. We’ve rolled most of these out without individual blog posts, and today I’d like to let you know what we’ve been up to. Here are the newest features:

Fleet Auto Scaling – This brand-new feature allows you to use the CloudWatch metrics to scale your fleet up and down in response to changes in demand. This allows you to deliver applications as economically as possible, while still providing instant access.

Image Builder – You can build your own AppStream 2.0 images that contain your choice of applications.

SAML 2.0 Authentication – You can use your existing SAML 2.0 compliant directory with AppStream 2.0. Your users can use their existing credentials to log in.

Fleet Management – You have additional management options for the instances that run your applications.

CloudWatch Metrics – You can observe and monitor seven Amazon CloudWatch metrics, including the size and overall utilization of your fleets.

Let’s take a look at each one!

Fleet Auto Scaling
This feature is brand new, and is powered by the new CloudWatch metrics! You can now associate scaling policies with each of your fleets and use them to meet varying levels of user demand and to control costs. If you are using AppStream 2.0 to deliver productivity applications to your users, you can use the scaling policies to ensure that capacity comes online as needed during office hours, and goes away in the evening when your users are done for the day. Here is a fleet with scale out (add capacity) and scale in (remove capacity) policies:

In order to take advantage of this feature, you set the minimum and maximum capacity when you create the fleet:

This will create the default policies, which you can later edit, add, or remove (you can have up to 50 policies per fleet). To learn more, read about AppStream Fleet Auto Scaling.

Image Builder
This feature allows you to create custom images that contain your choice of commercial or proprietary applications. In order to do this, you launch an instance called an image builder. Then you log in to the instance, install and configure the applications as desired, and capture the state of the instance as an image. The entire login and customization process takes place within your web browser; you don’t have to download any keys or remember any passwords. The application appears in the Image Registry and is available to your users.

I can launch an image builder from the AppStream 2.0 Console:

Next, I choose the starting point (an existing image):

Then I configure the builder by giving it a name, choosing an instance size, and setting up the VPC:

I click on Review, confirm my settings, and then wait for the builder to launch:

Then I can connect to the image builder, set up the apps, and create an image. I have my choice of two identities when I connect, Admin and Test:

I select ImageBuildAdmin and (when prompted for a password), click on Log me in in the Admin Commands menu:

After logging in, I launch the Image Assistant app and use it to install and test my apps:

To learn more, read about Image Builders and follow the Using an AppStream 2.0 Image Builder Tutorial.

SAML 2.0 Authentication
This feature allows you to use any external identity provider that supports SAML 2.0 including Active Directory Federation Services, PingFederate Server, Okta, or Shibboleth:

After you follow the directions in Setting Up SAML, your users can log in to AppStream 2.0 using their existing identity and credentials. You can manage users and groups, control access to applications based on the identity or location of the user, and use Multi-Factor Authentication (MFA). To learn more, read Enabling Single Sign-on Access to AppStream 2.0 Using SAML 2.0. If you have already set up federated access to the AWS Management Console, much of what you already know will apply.

Fleet Management
This feature gives me additional control over my fleets (groups of instances that are running applications for users). I can see all of my fleets on a single screen:

I can select a fleet and then act on it:

Some properties of a fleet can be edited at any time. Others, including the VPC properties, can only be edited after the fleet has been stopped. To learn more, read about Stacks and Fleets.

CloudWatch Metrics
AppStream publishes eight metrics to CloudWatch for each fleet:

  • RunningCapacity – Number of instances running.
  • InUseCapacity – Number of instances in use.
  • DesiredCapacity – Number of instances that are either running or pending.
  • AvailableCapacity – Number of idle instances available for use.
  • PendingCapacity – Number of instances being provisioned.
  • CapacityUtilization – Percentage of fleet being used.
  • InsufficientCapacityError – Number of sessions rejected due to lack of capacity.

You can see these metrics from within the AppStream 2.0 Console:

These metrics will help you to measure overall usage to to fine tune the size of your fleet. As is the case with every CloudWatch metric, you can generate alerts and raise alarms when a metric is outside of the desired range. You could also use AWS Lambda functions to make changes to your environment or to generate specialized notifications. To learn more, read about Monitoring Amazon AppStream 2.0 Resources.

Available Now
All of these features are available now and you can start using them today!

Jeff;

Implementing DevSecOps Using AWS CodePipeline

Post Syndicated from Ramesh Adabala original https://aws.amazon.com/blogs/devops/implementing-devsecops-using-aws-codepipeline/

DevOps is a combination of cultural philosophies, practices, and tools that emphasizes collaboration and communication between software developers and IT infrastructure teams while automating an organization’s ability to deliver applications and services rapidly, frequently, and more reliably.

CI/CD stands for continuous integration and continuous deployment. These concepts represent everything related to automation of application development and the deployment pipeline — from the moment a developer adds a change to a central repository until that code winds up in production.

DevSecOps covers security of and in the CI/CD pipeline, including automating security operations and auditing. The goals of DevSecOps are to:

  • Embed security knowledge into DevOps teams so that they can secure the pipelines they design and automate.
  • Embed application development knowledge and automated tools and processes into security teams so that they can provide security at scale in the cloud.

The Security Cloud Adoption Framework (CAF) whitepaper provides prescriptive controls to improve the security posture of your AWS accounts. These controls are in line with a DevOps blog post published last year about the control-monitor-fix governance model.

Security CAF controls are grouped into four categories:

  • Directive: controls establish the governance, risk, and compliance models on AWS.
  • Preventive: controls protect your workloads and mitigate threats and vulnerabilities.
  • Detective: controls provide full visibility and transparency over the operation of your deployments in AWS.
  • Responsive: controls drive remediation of potential deviations from your security baselines.

To embed the DevSecOps discipline in the enterprise, AWS customers are automating CAF controls using a combination of AWS and third-party solutions.

In this blog post, I will show you how to use a CI/CD pipeline to automate preventive and detective security controls. I’ll use an example that show how you can take the creation of a simple security group through the CI/CD pipeline stages and enforce security CAF controls at various stages of the deployment. I’ll use AWS CodePipeline to orchestrate the steps in a continuous delivery pipeline.

These resources are being used in this example:

  • An AWS CloudFormation template to create the demo pipeline.
  • A Lambda function to perform the static code analysis of the CloudFormation template.
  • A Lambda function to perform dynamic stack validation for the security groups in scope.
  • An S3 bucket as the sample code repository.
  • An AWS CloudFormation source template file to create the security groups.
  • Two VPCs to deploy the test and production security groups.

These are the high-level security checks enforced by the pipeline:

  • During the Source stage, static code analysis for any open security groups. The pipeline will fail if there are any violations.
  • During the Test stage, dynamic analysis to make sure port 22 (SSH) is open only to the approved IP CIDR range. The pipeline will fail if there are any violations.

demo_pipeline1

 

These are the pipeline stages:

1. Source stage: In this example, the pipeline gets the CloudFormation code that creates the security group from S3, the code repository service.

This stage passes the CloudFormation template and pipeline name to a Lambda function, CFNValidateLambda. This function performs the static code analysis. It uses the regular expression language to find patterns and identify security group policy violations. If it finds violations, then Lambda fails the pipeline and includes the violation details.

Here is the regular expression that Lambda function using for static code analysis of the open SSH port:

"^.*Ingress.*(([fF]rom[pP]ort|[tT]o[pP]ort).\s*:\s*u?.(22).*[cC]idr[iI]p.\s*:\s*u?.((0\.){3}0\/0)|[cC]idr[iI]p.\s*:\s*u?.((0\.){3}0\/0).*([fF]rom[pP]ort|[tT]o[pP]ort).\s*:\s*u?.(22))"

2. Test stage: After the static code analysis is completed successfully, the pipeline executes the following steps:

a. Create stack: This step creates the stack in the test VPC, as described in the test configuration.

b. Stack validation: This step triggers the StackValidationLambda Lambda function. It passes the stack name and pipeline name in the event parameters. Lambda validates the security group for the following security controls. If it finds violations, then Lambda deletes the stack, stops the pipeline, and returns an error message.

The following is the sample Python code used by AWS Lambda to check if the SSH port is open to the approved IP CIDR range (in this example, 72.21.196.67/32):

for n in regions:
    client = boto3.client('ec2', region_name=n)
    response = client.describe_security_groups(
        Filters=[{'Name': 'tag:aws:cloudformation:stack-name', 'Values': [stackName]}])
    for m in response['SecurityGroups']:
        if "72.21.196.67/32" not in str(m['IpPermissions']):
            for o in m['IpPermissions']:
                try:
                    if int(o['FromPort']) <= 22 <= int(o['ToPort']):
                        result = False
                        failReason = "Found Security Group with port 22 open to the wrong source IP range"
                        offenders.append(str(m['GroupId']))
                except:
                    if str(o['IpProtocol']) == "-1":
                        result = False
                        failReason = "Found Security Group with port 22 open to the wrong source IP range"
                        offenders.append(str(n) + " : " + str(m['GroupId']))

c. Approve test stack: This step creates a manual approval task for stack review. This step could be eliminated for automated deployments.

d. Delete test stack: After all the stack validations are successfully completed, this step deletes the stack in the test environment to avoid unnecessary costs.

3. Production stage: After the static and dynamic security checks are completed successfully, this stage creates the stack in the production VPC using the production configuration supplied in the template.

a. Create change set: This step creates the change set for the resources in the scope.

b. Execute change set: This step executes the change set and creates/updates the security group in the production VPC.

 

Source code and CloudFormation template

You’ll find the source code at https://github.com/awslabs/automating-governance-sample/tree/master/DevSecOps-Blog-Code

basic-sg-3-cfn.json creates the pipeline in AWS CodePipeline with all the stages previously described. It also creates the static code analysis and stack validation Lambda functions.

The CloudFormation template points to a shared S3 bucket. The codepipeline-lambda.zip file contains the Lambda functions. Before you run the template, upload the zip file to your S3 bucket and then update the CloudFormation template to point to your S3 bucket location.

The CloudFormation template uses the codepipe-single-sg.zip file, which contains the sample security group and test and production configurations. Update these configurations with your VPC details, and then upload the modified zip file to your S3 bucket.

Update these parts of the code to point to your S3 bucket:

 "S3Bucket": {
      "Default": "codepipeline-devsecops-demo",
      "Description": "The name of the S3 bucket that contains the source artifact, which must be in the same region as this stack",
      "Type": "String"
    },
    "SourceS3Key": {
      "Default": "codepipe-single-sg.zip",
      "Description": "The file name of the source artifact, such as myfolder/myartifact.zip",
      "Type": "String"
    },
    "LambdaS3Key": {
      "Default": "codepipeline-lambda.zip",
      "Description": "The file name of the source artifact of the Lambda code, such as myfolder/myartifact.zip",
      "Type": "String"
    },
	"OutputS3Bucket": {
      "Default": "codepipeline-devsecops-demo",
      "Description": "The name of the output S3 bucket that contains the processed artifact, which must be in the same region as this stack",
      "Type": "String"
    },

After the stack is created, AWS CodePipeline executes the pipeline and starts deploying the sample CloudFormation template. In the default template, security groups have wide-open ports (0.0.0.0/0), so the pipeline execution will fail. Update the CloudFormation template in codepipe-single-sg.zip with more restrictive ports and then upload the modified zip file to S3 bucket. Open the AWS CodePipeline console, and choose the Release Change button. This time the pipeline will successfully create the security groups.

demo_pipeline2

You could expand the security checks in the pipeline to include other AWS resources, not just security groups. The following table shows the sample controls you could enforce in the pipeline using the static and dynamic analysis Lambda functions.

demo_pipeline3
If you have feedback about this post, please add it to the Comments section below. If you have questions about implementing the example used in this post, please open a thread on the Developer Tools forum.