Tag Archives: Amazon Elastic Block Storage (EBS)

Learn about AWS Services & Solutions – April AWS Online Tech Talks

Post Syndicated from Robin Park original https://aws.amazon.com/blogs/aws/learn-about-aws-services-solutions-april-aws-online-tech-talks/

AWS Tech Talks

Join us this April to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register Now!

Note – All sessions are free and in Pacific Time.

Tech talks this month:

Blockchain

May 2, 2019 | 11:00 AM – 12:00 PM PTHow to Build an Application with Amazon Managed Blockchain – Learn how to build an application on Amazon Managed Blockchain with the help of demo applications and sample code.

Compute

April 29, 2019 | 1:00 PM – 2:00 PM PTHow to Optimize Amazon Elastic Block Store (EBS) for Higher Performance – Learn how to optimize performance and spend on your Amazon Elastic Block Store (EBS) volumes.

May 1, 2019 | 11:00 AM – 12:00 PM PTIntroducing New Amazon EC2 Instances Featuring AMD EPYC and AWS Graviton Processors – See how new Amazon EC2 instance offerings that feature AMD EPYC processors and AWS Graviton processors enable you to optimize performance and cost for your workloads.

Containers

April 23, 2019 | 11:00 AM – 12:00 PM PTDeep Dive on AWS App Mesh – Learn how AWS App Mesh makes it easy to monitor and control communications for services running on AWS.

March 22, 2019 | 9:00 AM – 10:00 AM PTDeep Dive Into Container Networking – Dive deep into microservices networking and how you can build, secure, and manage the communications into, out of, and between the various microservices that make up your application.

Databases

April 23, 2019 | 1:00 PM – 2:00 PM PTSelecting the Right Database for Your Application – Learn how to develop a purpose-built strategy for databases, where you choose the right tool for the job.

April 25, 2019 | 9:00 AM – 10:00 AM PTMastering Amazon DynamoDB ACID Transactions: When and How to Use the New Transactional APIs – Learn how the new Amazon DynamoDB’s transactional APIs simplify the developer experience of making coordinated, all-or-nothing changes to multiple items both within and across tables.

DevOps

April 24, 2019 | 9:00 AM – 10:00 AM PTRunning .NET applications with AWS Elastic Beanstalk Windows Server Platform V2 – Learn about the easiest way to get your .NET applications up and running on AWS Elastic Beanstalk.

Enterprise & Hybrid

April 30, 2019 | 11:00 AM – 12:00 PM PTBusiness Case Teardown: Identify Your Real-World On-Premises and Projected AWS Costs – Discover tools and strategies to help you as you build your value-based business case.

IoT

April 30, 2019 | 9:00 AM – 10:00 AM PTBuilding the Edge of Connected Home – Learn how AWS IoT edge services are enabling smarter products for the connected home.

Machine Learning

April 24, 2019 | 11:00 AM – 12:00 PM PTStart Your Engines and Get Ready to Race in the AWS DeepRacer League – Learn more about reinforcement learning, how to build a model, and compete in the AWS DeepRacer League.

April 30, 2019 | 1:00 PM – 2:00 PM PTDeploying Machine Learning Models in Production – Learn best practices for training and deploying machine learning models.

May 2, 2019 | 9:00 AM – 10:00 AM PTAccelerate Machine Learning Projects with Hundreds of Algorithms and Models in AWS Marketplace – Learn how to use third party algorithms and model packages to accelerate machine learning projects and solve business problems.

Networking & Content Delivery

April 23, 2019 | 9:00 AM – 10:00 AM PTSmart Tips on Application Load Balancers: Advanced Request Routing, Lambda as a Target, and User Authentication – Learn tips and tricks about important Application Load Balancers (ALBs) features that were recently launched.

Productivity & Business Solutions

April 29, 2019 | 11:00 AM – 12:00 PM PTLearn How to Set up Business Calling and Voice Connector in Minutes with Amazon Chime – Learn how Amazon Chime Business Calling and Voice Connector can help you with your business communication needs.

May 1, 2019 | 1:00 PM – 2:00 PM PTBring Voice to Your Workplace – Learn how you can bring voice to your workplace with Alexa for Business.

Serverless

April 25, 2019 | 11:00 AM – 12:00 PM PTModernizing .NET Applications Using the Latest Features on AWS Development Tools for .NET – Get a dive deep and demonstration of the latest updates to the AWS SDK and tools for .NET to make development even easier, more powerful, and more productive.

May 1, 2019 | 9:00 AM – 10:00 AM PTCustomer Showcase: Improving Data Processing Workloads with AWS Step Functions’ Service Integrations – Learn how innovative customers like SkyWatch are coordinating AWS services using AWS Step Functions to improve productivity.

Storage

April 24, 2019 | 1:00 PM – 2:00 PM PTAmazon S3 Glacier Deep Archive: The Cheapest Storage in the Cloud – See how Amazon S3 Glacier Deep Archive offers the lowest cost storage in the cloud, at prices significantly lower than storing and maintaining data in on-premises magnetic tape libraries or archiving data offsite.

Recovering files from an Amazon EBS volume backup

Post Syndicated from Josh Rad original https://aws.amazon.com/blogs/compute/recovering-files-from-an-amazon-ebs-volume-backup/

Contributed by Jeff Bartley, Storage Solutions Architect, AWS

Amazon Elastic Block Store (Amazon EBS) enables you to back up volumes at any time using EBS snapshots. Volume backups can be triggered manually or they can be scheduled using Amazon Data Lifecycle Manager (Amazon DLM) or AWS Backup.

Each backup creates a unique EBS snapshot. The snapshot has all of the data necessary to restore the volume to the exact state that it was in when the backup was made. You can then attach that volume to an Amazon EC2 instance.

A common use case for restoring volumes is to re-create production workloads in test and development environments. For example, you might take a snapshot of your production database and copy that snapshot to your test/dev account. Then you can restore a volume from the snapshot and attach it to one of your test EC2 instances.

You can also use EBS snapshots to recover files and folders from volume backups where the volume was formatted with a Windows or Linux-compatible file system. This commonly occurs when users accidentally modify or delete one or more files and must go back to a previous version.

This post guides you through the process of restoring an EBS volume for the purpose of recovering files or folders from either a Windows or Linux volume. You learn how to restore a volume from an EBS snapshot, attach the volume to a running EC2 instance, and copy the files or folders to be recovered.

NOTE: This process can only be used to recover files or folders from EBS data volumes. To recover an EBS root volume, see the Amazon EBS-backed Instances section in the Amazon EC2 Root Device Volume topic.

Restore a volume from an EBS snapshot
The first step to recovering your files is to identify the EBS snapshot that contains the needed data and then create a volume from it.

On the Create Volume page, you are prompted to choose the volume type, volume size, and the Availability Zone in which the volume should be created. The default volume type is either gp2 or standard, depending on the AWS Region. That is applicable to most use cases.

The default volume size is the size of the volume from which the EBS snapshot was created. For recovering files and folders, the size should not be modified. Create a new volume that is an exact copy of the original volume.

For the Availability Zone, select the same zone as the EC2 instance to be used for recovery. EBS volumes can only be attached to EC2 instances in the same zone in which they were created. Tag the new volume for identification.

Select the new volume to monitor the status until the state is set to available.

Attach the volume to an EC2 instance
To access the files or folders to be recovered, the volume must be attached to an EC2 instance. The instance should be running the same version of Windows or Linux that was running when the volume backup was made. The instance does not need to be stopped, as an EBS volume can be attached to a running EC2 instance.

Recover your files on Windows
If your files were originally created on Windows, connect to the Windows EC2 instance using a desktop viewer that supports RDP. Then, make the EBS volume available for use.

Open Windows Explorer, navigate to the files or folders to be recovered and copy them to the desired destination.

When the recovery effort is complete, you can unmount the volume and detach it from the EC2 instance.

Recover your files on Linux
If your files were originally created on Linux, begin by logging in to the Linux EC2 instance using SSH. Then, make the EBS volume available for use.

The new volume can be identified by the name that corresponds to the device ID specified when the volume was attached to the instance.

When the recovery effort is complete, you can unmount the volume and detach it from the EC2 instance.

Clean up
With the files or folders recovered and the volume detached, you are free to delete the volume. You can always re-create it from the EBS snapshot as needed.

Summary
Maintaining regular backups of your EBS volumes helps protect you against events like the accidental deletion or unintentional modification of files and folders. AWS provides you with a number of tools to schedule and manage your EBS volume backups. These tools include Amazon DLM, AWS Backup, and manual backups using the AWS CLI or SDKs.

In this post, you learned how to recover files or folders from a backup of a volume that was formatted with a Windows or Linux file system. Now you can quickly get access to your files and folders and expedite your recovery process.

For more information, see Restoring an Amazon EBS Volume from a Snapshot.

Optimizing a Lift-and-Shift for Security

Post Syndicated from Jonathan Shapiro-Ward original https://aws.amazon.com/blogs/architecture/optimizing-a-lift-and-shift-for-security/

This is the third and final blog within a three-part series that examines how to optimize lift-and-shift workloads. A lift-and-shift is a common approach for migrating to AWS, whereby you move a workload from on-prem with little or no modification. This third blog examines how lift-and-shift workloads can benefit from an improved security posture with no modification to the application codebase. (Read about optimizing a lift-and-shift for performance and for cost effectiveness.)

Moving to AWS can help to strengthen your security posture by eliminating many of the risks present in on-premise deployments. It is still essential to consider how to best use AWS security controls and mechanisms to ensure the security of your workload. Security can often be a significant concern in lift-and-shift workloads, especially for legacy workloads where modern encryption and security features may not present. By making use of AWS security features you can significantly improve the security posture of a lift-and-shift workload, even if it lacks native support for modern security best practices.

Adding TLS with Application Load Balancers

Legacy applications are often the subject of a lift-and-shift. Such migrations can help reduce risks by moving away from out of date hardware but security risks are often harder to manage. Many legacy applications leverage HTTP or other plaintext protocols that are vulnerable to all manner of attacks. Often, modifying a legacy application’s codebase to implement TLS is untenable, necessitating other options.

One comparatively simple approach is to leverage an Application Load Balancer or a Classic Load Balancer to provide SSL offloading. In this scenario, the load balancer would be exposed to users, while the application servers that only support plaintext protocols will reside within a subnet which is can only be accessed by the load balancer. The load balancer would perform the decryption of all traffic destined for the application instance, forwarding the plaintext traffic to the instances. This allows  you to use encryption on traffic between the client and the load balancer, leaving only internal communication between the load balancer and the application in plaintext. Often this approach is sufficient to meet security requirements, however, in more stringent scenarios it is never acceptable for traffic to be transmitted in plaintext, even if within a secured subnet. In this scenario, a sidecar can be used to eliminate plaintext traffic ever traversing the network.

Improving Security and Configuration Management with Sidecars

One approach to providing encryption to legacy applications is to leverage what’s often termed the “sidecar pattern.” The sidecar pattern entails a second process acting as a proxy to the legacy application. The legacy application only exposes its services via the local loopback adapter and is thus accessible only to the sidecar. In turn the sidecar acts as an encrypted proxy, exposing the legacy application’s API to external consumers via TLS. As unencrypted traffic between the sidecar and the legacy application traverses the loopback adapter, it never traverses the network. This approach can help add encryption (or stronger encryption) to legacy applications when it’s not feasible to modify the original codebase. A common approach to implanting sidecars is through container groups such as pod in EKS or a task in ECS.

Implementing the Sidecar Pattern With Containers

Figure 1: Implementing the Sidecar Pattern With Containers

Another use of the sidecar pattern is to help legacy applications leverage modern cloud services. A common example of this is using a sidecar to manage files pertaining to the legacy application. This could entail a number of options including:

  • Having the sidecar dynamically modify the configuration for a legacy application based upon some external factor, such as the output of Lambda function, SNS event or DynamoDB write.
  • Having the sidecar write application state to a cache or database. Often applications will write state to the local disk. This can be problematic for autoscaling or disaster recovery, whereby having the state easily accessible to other instances is advantages. To facilitate this, the sidecar can write state to Amazon S3, Amazon DynamoDB, Amazon Elasticache or Amazon RDS.

A sidecar requires customer development, but it doesn’t require any modification of the lift-and-shifted application. A sidecar treats the application as a blackbox and interacts with it via its API, configuration file, or other standard mechanism.

Automating Security

A lift-and-shift can achieve a significantly stronger security posture by incorporating elements of DevSecOps. DevSecOps is a philosophy that argues that everyone is responsible for security and advocates for automation all parts of the security process. AWS has a number of services which can help implement a DevSecOps strategy. These services include:

  • Amazon GuardDuty: a continuous monitoring system which analyzes AWS CloudTrail Events, Amazon VPC Flow Log and DNS Logs. GuardDuty can detect threats and trigger an automated response.
  • AWS Shield: a managed DDOS protection services
  • AWS WAF: a managed Web Application Firewall
  • AWS Config: a service for assessing, tracking, and auditing changes to AWS configuration

These services can help detect security problems and implement a response in real time, achieving a significantly strong posture than traditional security strategies. You can build a DevSecOps strategy around a lift-and-shift workload using these services, without having to modify the lift-and-shift application.

Conclusion

There are many opportunities for taking advantage of AWS services and features to improve a lift-and-shift workload. Without any alteration to the application you can strengthen your security posture by utilizing AWS security services and by making small environmental and architectural changes that can help alleviate the challenges of legacy workloads.

About the author

Dr. Jonathan Shapiro-Ward is an AWS Solutions Architect based in Toronto. He helps customers across Canada to transform their businesses and build industry leading cloud solutions. He has a background in distributed systems and big data and holds a PhD from the University of St Andrews.

Optimizing a Lift-and-Shift for Cost Effectiveness and Ease of Management

Post Syndicated from Jonathan Shapiro-Ward original https://aws.amazon.com/blogs/architecture/optimizing-a-lift-and-shift-for-cost/

Lift-and-shift is the process of migrating a workload from on premise to AWS with little or no modification. A lift-and-shift is a common route for enterprises to move to the cloud, and can be a transitionary state to a more cloud native approach. This is the second blog post in a three-part series which investigates how to optimize a lift-and-shift workload. The first post is about performance.

A key concern that many customers have with a lift-and-shift is cost. If you move an application as is  from on-prem to AWS, is there any possibility for meaningful cost savings? By employing AWS services, in lieu of self-managed EC2 instances, and by leveraging cloud capability such as auto scaling, there is potential for significant cost savings. In this blog post, we will discuss a number of AWS services and solutions that you can leverage with minimal or no change to your application codebase in order to significantly reduce management costs and overall Total Cost of Ownership (TCO).

Automate

Even if you can’t modify your application, you can change the way you deploy your application. The adopting-an-infrastructure-as-code approach can vastly improve the ease of management of your application, thereby reducing cost. By templating your application through Amazon CloudFormation, Amazon OpsWorks, or Open Source tools you can make deploying and managing your workloads a simple and repeatable process.

As part of the lift-and-shift process, rationalizing the workload into a set of templates enables less time to spent in the future deploying and modifying the workload. It enables the easy creation of dev/test environments, facilitates blue-green testing, opens up options for DR, and gives the option to roll back in the event of error. Automation is the single step which is most conductive to improving ease of management.

Reserved Instances and Spot Instances

A first initial consideration around cost should be the purchasing model for any EC2 instances. Reserved Instances (RIs) represent a 1-year or 3-year commitment to EC2 instances and can enable up to 75% cost reduction (over on demand) for steady state EC2 workloads. They are ideal for 24/7 workloads that must be continually in operation. An application requires no modification to make use of RIs.

An alternative purchasing model is EC2 spot. Spot instances offer unused capacity available at a significant discount – up to 90%. Spot instances receive a two-minute warning when the capacity is required back by EC2 and can be suspended and resumed. Workloads which are architected for batch runs – such as analytics and big data workloads – often require little or no modification to make use of spot instances. Other burstable workloads such as web apps may require some modification around how they are deployed.

A final alternative is on-demand. For workloads that are not running in perpetuity, on-demand is ideal. Workloads can be deployed, used for as long as required, and then terminated. By leveraging some simple automation (such as AWS Lambda and CloudWatch alarms), you can schedule workloads to start and stop at the open and close of business (or at other meaningful intervals). This typically requires no modification to the application itself. For workloads that are not 24/7 steady state, this can provide greater cost effectiveness compared to RIs and more certainty and ease of use when compared to spot.

Amazon FSx for Windows File Server

Amazon FSx for Windows File Server provides a fully managed Windows filesystem that has full compatibility with SMB and DFS and full AD integration. Amazon FSx is an ideal choice for lift-and-shift architectures as it requires no modification to the application codebase in order to enable compatibility. Windows based applications can continue to leverage standard, Windows-native protocols to access storage with Amazon FSx. It enables users to avoid having to deploy and manage their own fileservers – eliminating the need for patching, automating, and managing EC2 instances. Moreover, it’s easy to scale and minimize costs, since Amazon FSx offers a pay-as-you-go pricing model.

Amazon EFS

Amazon Elastic File System (EFS) provides high performance, highly available multi-attach storage via NFS. EFS offers a drop-in replacement for existing NFS deployments. This is ideal for a range of Linux and Unix usecases as well as cross-platform solutions such as Enterprise Java applications. EFS eliminates the need to manage NFS infrastructure and simplifies storage concerns. Moreover, EFS provides high availability out of the box, which helps to reduce single points of failure and avoids the need to manually configure storage replication. Much like Amazon FSx, EFS enables customers to realize cost improvements by moving to a pay-as-you-go pricing model and requires a modification of the application.

Amazon MQ

Amazon MQ is a managed message broker service that provides compatibility with JMS, AMQP, MQTT, OpenWire, and STOMP. These are amongst the most extensively used middleware and messaging protocols and are a key foundation of enterprise applications. Rather than having to manually maintain a message broker, Amazon MQ provides a performant, highly available managed message broker service that is compatible with existing applications.

To use Amazon MQ without any modification, you can adapt applications that leverage a standard messaging protocol. In most cases, all you need to do is update the application’s MQ endpoint in its configuration. Subsequently, the Amazon MQ service handles the heavy lifting of operating a message broker, configuring HA, fault detection, failure recovery, software updates, and so forth. This offers a simple option for reducing management overhead and improving the reliability of a lift-and-shift architecture. What’s more is that applications can migrate to Amazon MQ without the need for any downtime, making this an easy and effective way to improve a lift-and-shift.

You can also use Amazon MQ to integrate legacy applications with modern serverless applications. Lambda functions can subscribe to MQ topics and trigger serverless workflows, enabling compatibility between legacy and new workloads.

Integrating Lift-and-Shift Workloads with Lambda via Amazon MQ

Figure 1: Integrating Lift-and-Shift Workloads with Lambda via Amazon MQ

Amazon Managed Streaming Kafka

Lift-and-shift workloads which include a streaming data component are often built around Apache Kafka. There is a certain amount of complexity involved in operating a Kafka cluster which incurs management and operational expense. Amazon Kinesis is a managed alternative to Apache Kafka, but it is not a drop-in replacement. At re:Invent 2018, we announced the launch of Amazon Managed Streaming Kafka (MSK) in public preview. MSK provides a managed Kafka deployment with pay-as-you-go pricing and an acts as a drop-in replacement in existing Kafka workloads. MSK can help reduce management costs and improve cost efficiency and is ideal for lift-and-shift workloads.

Leveraging S3 for Static Web Hosting

A significant portion of any web application is static content. This includes videos, image, text, and other content that changes seldom, if ever. In many lift-and-shifted applications, web servers are migrated to EC2 instances and host all content – static and dynamic. Hosting static content from an EC2 instance incurs a number of costs including the instance, EBS volumes, and likely, a load balancer. By moving static content to S3, you can significantly reduce the amount of compute required to host your web applications. In many cases, this change is non-disruptive and can be done at the DNS or CDN layer, requiring no change to your application.

Reducing Web Hosting Costs with S3 Static Web Hosting

Figure 2: Reducing Web Hosting Costs with S3 Static Web Hosting

Conclusion

There are numerous opportunities for reducing the cost of a lift-and-shift. Without any modification to the application, lift-and-shift workloads can benefit from cloud-native features. By using AWS services and features, you can significantly reduce the undifferentiated heavy lifting inherent in on-prem workloads and reduce resources and management overheads.

About the author

Dr. Jonathan Shapiro-Ward is an AWS Solutions Architect based in Toronto. He helps customers across Canada to transform their businesses and build industry leading cloud solutions. He has a background in distributed systems and big data and holds a PhD from the University of St Andrews.

Optimizing a Lift-and-Shift for Performance

Post Syndicated from Jonathan Shapiro-Ward original https://aws.amazon.com/blogs/architecture/optimizing-a-lift-and-shift-for-performance/

Many organizations begin their cloud journey with a lift-and-shift of applications from on-premise to AWS. This approach involves migrating software deployments with little, or no, modification. A lift-and-shift avoids a potentially expensive application rewrite but can result in a less optimal workload that a cloud native solution. For many organizations, a lift-and-shift is a transitional stage to an eventual cloud native solution, but there are some applications that can’t feasibly be made cloud-native such as legacy systems or proprietary third-party solutions. There are still clear benefits of moving these workloads to AWS, but how can they be best optimized?

In this blog series post, we’ll look at different approaches for optimizing a black box lift-and-shift. We’ll consider how we can significantly improve a lift-and-shift application across three perspectives: performance, cost, and security. We’ll show that without modifying the application we can integrate services and features that will make a lift-and-shift workload cheaper, faster, more secure, and more reliable. In this first blog, we’ll investigate how a lift-and-shift workload can have improved performance through leveraging AWS features and services.

Performance gains are often a motivating factor behind a cloud migration. On-premise systems may suffer from performance bottlenecks owing to legacy infrastructure or through capacity issues. When performing a lift-and-shift, how can you improve performance? Cloud computing is famous for enabling horizontally scalable architectures but many legacy applications don’t support this mode of operation. Traditional business applications are often architected around a fixed number of servers and are unable to take advantage of horizontal scalability. Even if a lift-and-shift can’t make use of auto scaling groups and horizontal scalability, you can achieve significant performance gains by moving to AWS.

Scaling Up

The easiest alternative to scale up to compute is vertical scalability. AWS provides the widest selection of virtual machine types and the largest machine types. Instances range from small, burstable t3 instances series all the way to memory optimized x1 series. By leveraging the appropriate instance, lift-and-shifts can benefit from significant performance. Depending on your workload, you can also swap out the instances used to power your workload to better meet demand. For example, on days in which you anticipate high load you could move to more powerful instances. This could be easily automated via a Lambda function.

The x1 family of instances offers considerable CPU, memory, storage, and network performance and can be used to accelerate applications that are designed to maximize single machine performance. The x1e.32xlarge instance, for example, offers 128 vCPUs, 4TB RAM, and 14,000 Mbps EBS bandwidth. This instance is ideal for high performance in-memory workloads such as real time financial risk processing or SAP Hana.

Through selecting the appropriate instance types and scaling that instance up and down to meet demand, you can achieve superior performance and cost effectiveness compared to running a single static instance. This affords lift-and-shift workloads far greater efficiency that their on-prem counterparts.

Placement Groups and C5n Instances

EC2 Placement groups determine how you deploy instances to underlying hardware. One can either choose to cluster instances into a low latency group within a single AZ or spread instances across distinct underlying hardware. Both types of placement groups are useful for optimizing lift-and-shifts.

The spread placement group is valuable in applications that rely on a small number of critical instances. If you can’t modify your application  to leverage auto scaling, liveness probes, or failover, then spread placement groups can help reduce the risk of simultaneous failure while improving the overall reliability of the application.

Cluster placement groups help improve network QoS between instances. When used in conjunction with enhanced networking, cluster placement groups help to ensure low latency, high throughput, and high network packets per second. This is beneficial for chatty applications and any application that leveraged physical co-location for performance on-prem.

There is no additional charge for using placement groups.

You can extend this approach further with C5n instances. These instances offer 100Gbps networking and can be used in placement group for the most demanding networking intensive workloads. Using both placement groups and the C5n instances require no modification to your application, only to how it is deployed – making it a strong solution for providing network performance to lift-and-shift workloads.

Leverage Tiered Storage to Optimize for Price and Performance

AWS offers a range of storage options, each with its own performance characteristics and price point. Through leveraging a combination of storage types, lift-and-shifts can achieve the performance and availability requirements in a price effective manner. The range of storage options include:

Amazon EBS is the most common storage service involved with lift-and-shifts. EBS provides block storage that can be attached to EC2 instances and formatted with a typical file system such as NTFS or ext4. There are several different EBS types, ranging from inexpensive magnetic storage to highly performant provisioned IOPS SSDs. There are also storage-optimized instances that offer high performance EBS access and NVMe storage. By utilizing the appropriate type of EBS volume and instance, a compromise of performance and price can be achieved. RAID offers a further option to optimize EBS. EBS utilizes RAID 1 by default, providing replication at no additional cost, however an EC2 instance can apply other RAID levels. For instance, you can apply RAID 0 over a number of EBS volumes in order to improve storage performance.

In addition to EBS, EC2 instances can utilize the EC2 instance store. The instance store provides ephemeral direct attached storage to EC2 instances. The instance store is included with the EC2 instance and provides a facility to store non-persistent data. This makes it ideal for temporary files that an application produces, which require performant storage. Both EBS and the instance store are expose to the EC2 instance as block level devices, and the OS can use its native management tools to format and mount these volumes as per a traditional disk – requiring no significant departure from the on prem configuration. In several instance types including the C5d and P3d are equipped with local NVMe storage which can support extremely IO intensive workloads.

Not all workloads require high performance storage. In many cases finding a compromise between price and performance is top priority. Amazon S3 provides highly durable, object storage at a significantly lower price point than block storage. S3 is ideal for a large number of use cases including content distribution, data ingestion, analytics, and backup. S3, however, is accessible via a RESTful API and does not provide conventional file system semantics as per EBS. This may make S3 less viable for applications that you can’t easily modify, but there are still options for using S3 in such a scenario.

An option for leveraging S3 is AWS Storage Gateway. Storage Gateway is a virtual appliance than can be run on-prem or on EC2. The Storage Gateway appliance can operate in three configurations: file gateway, volume gateway and tape gateway. File gateway provides an NFS interface, Volume Gateway provides an iSCSI interface, and Tape Gateway provides an iSCSI virtual tape library interface. This allows files, volumes, and tapes to be exposed to an application host through conventional protocols with the Storage Gateway appliance persisting data to S3. This allows an application to be agnostic to S3 while leveraging typical enterprise storage protocols.

Using S3 Storage via Storage Gateway

Figure 1: Using S3 Storage via Storage Gateway

Conclusion

A lift-and-shift can achieve significant performance gains on AWS by making use of a range of instance types, storage services, and other features. Even without any modification to the application, lift-and-shift workloads can benefit from cutting edge compute, network, and IO which can help realize significant, meaningful performance gains.

About the author

Dr. Jonathan Shapiro-Ward is an AWS Solutions Architect based in Toronto. He helps customers across Canada to transform their businesses and build industry leading cloud solutions. He has a background in distributed systems and big data and holds a PhD from the University of St Andrews.

Check it Out – New AWS Pricing Calculator for EC2 and EBS

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/check-it-out-new-aws-pricing-calculator-for-ec2-and-ebs/

The blog post that we published over a decade ago to launch the Simple Monthly Calculator still shows up on our internal top-10 lists from time to time! Since that post was published, we have extended, redesigned, and even rebuilt the calculator a time or two.

New Calculator
Starting with a blank screen, an empty code repo, and plenty of customer feedback, we are building a brand-new AWS Pricing Calculator. The new calculator is designed to help you estimate and understand your eventual AWS costs. We did our best to avoid excessive jargon and to make the calculations obvious, transparent, and accessible. You can see the options that are available to you, explore the associated costs, and make high-quality data-driven decisions.

We’re starting out with support for EC2 instances, EBS volumes, and a very wide variety of purchasing models, with plans to add support for more services as quickly as possible.

A Quick Tour
The new calculator lives at https://calculator.aws . Each estimate consists of one or more groups and the first one is created automatically:

Each group has a name, and has pricing for services in a particular AWS region. I click Edit group to change the name and pick a region, and click Apply:

Back at the main page of the calculator, I click Add service and choose to configure some EC2 instances. The group can contain multiple types and configurations of instances; I click Configure to move ahead:

At this point I can make a Quick estimate (the default), or supply more details as part of an Advanced estimate. I’ll start with a Quick estimate:

Here are a couple of things to keep in mind when you make a quick estimate:

Instance Type – I have two options for choosing EC2 instance types; I can enter my resource requirements (vCPU count, memory size, and GPU count) and have the calculator choose the option with the lowest price, or I can pick an EC2 instance type by name.

Pricing Strategy – I can choose to use On-Demand Instances, Convertible Reserved Instances, or Standard Reserved Instances, and can choose payment terms and options for RI’s.

EBS Volumes – I can choose the type and size of an EBS volume for the instance. Right now, the calculator allows you to associate one volume with each EC2 instance. If you need more than one, specify the total amount of storage you need across all volumes.

Details – I can expand the Show calculation section to see the math:

After I have made my choices, I click Add to my estimate to move ahead. My selections, along with their costs (annual, upfront, and monthly), are displayed:

I can go back and add another service, or create another group. I’ll add another EC2 instance, using an Advanced estimate this time around. Here’s where I start:

I have very fine-grained control over each aspect of my estimate. For example, I can characterize my workload in great detail. I click on Workload, and have the ability to select the graph that best represents my monthly workload:

I can even model workloads that have two or more independent daily (in this case) spike patterns. As I refine my model, the calculator figures out the most economical combination of On-Demand and Reserved Instances, and shows me the results:

The calculations are driven by the selection in the Pricing strategy. The default value, and the one that I used for the previous screen shot, is Cost optimized. I have other choices as well:

I can also model my data transfer in, out, and to other AWS regions:

Once I am happy with the results I click Add to my estimate, and take a look at my selections and their prices:

I can click Export to capture my estimate in spreadsheet form:

Here’s the data (I hid a few columns for clarity):

As you can see, the new calculator will quickly become a useful part of your planning and decision-making process.

One important thing to keep in mind: your estimates are stored in state that is local to the browser tab, and will be lost if you close the tab. The team is already hard at work on features that will allow you to save and even share your estimates, for launch in early 2019.

Stay Tuned
We will be adding more services and more features to the calculator in the months to come, and I’ll share some updates with you from time to time, either in this blog or via Twitter. If you have ideas, complaints, or other feedback, don’t hesitate to click on the Feedback link at the top of the page.

Jeff;

 

How to import split disks into AWS

Post Syndicated from Josh Rad original https://aws.amazon.com/blogs/compute/how-to-import-split-disks-into-aws/

Contributed by Raphael Sack

In an amusing coincidence, I was recently asked by two separate customers a nearly identical question: how to import split disks in the form of VMDK to AWS. This post covers one way to do this.

There are quite a few names, official and unofficial, for split disks. These include: non-monolithic, split disks, multi-part disks, delta parts, and more. For consistency, I refer to them as split disks in this post. The Image Formats page of VM Import/Export’s documentation specifies the officially supported formats when importing disks and virtual machines.

How to import split disks:

  1. Set up a conversion instance.
  2. Convert and build out the new disk image.
  3. Import the image.

After the image is imported, you can validate the import and start using the image.

Setting up a conversion instance

First things first—you need an instance to be used for the conversion job. You can use any machine that is able to run the QEMU image utility.

For convenience, spin up an Amazon EC2 instance running Amazon Linux. The only modification that you apply to the instance is installing the qemu-img disk image utility. To install the utility on Amazon Linux, use YUM:

yum install qemu-img

The instance requires the ec2:ImportImage and ec2:DescribeImportImageTasks permissions to be able to run the commands detailed in the steps below.

Converting and building out the new disk image

Next, you need your source VMDK files to be available on the machine. In this case, store on Amazon S3 and copy them over to the EC2 instance.

If you examine your source machine’s main .vmdk file, you find the “Extent description” section that specifies the disk parts. Make sure that you’ve got all of them.

As mentioned previously, you use qemu-img to convert the disks. Use the following command format:

qemu-img convert -f [SOURCE FORMAT] [SOURCE DISK FILENAME] -O [OUTPUT FORMAT] [OUTPUT FILENAME]

For example:

qemu-img convert -f vmdk MyDisk.vmdk -O raw MyDisk.raw

Because there are multiple parts, you can use a quick Bash one-liner loop:

for i in *-*.vmdk; do qemu-img convert -f vmdk $i -O raw $i.raw; done

A quick note about this command, it creates files such as AppServer-s001.vmdk.raw. You could use some command line tinkering and improve the naming convention. Because this post is only a temporary conversion, you can also leave the names as they are.

Now, concatenate all these newly converted RAW files into one large file, using the following command:

for i in *raw; do cat $i > AppServer.raw; done

Again, you could optimize this step and merge it into the previous loop. For the sake of clarity, I separated them out.

If you use qemu-img, you can now see that you have a single, RAW disk image:

Importing the disk image

The final step is to import the image, either as a snapshot, or as an image. In this example, use the image option and you can launch several instances off these disks.

To import an image or snapshot, the source has to be in an S3 bucket. Create a bucket named “import-blog-post”. Using the AWS CLI, upload the source to the bucket.

The last step is to create an import job. Jobs can represent a physical import (where you ship a physical device containing the data), or an upload of such data.

The corresponding CLI command for importing images is import-image and supports specifying parameters using command line arguments, or using a JSON file. For this post, use a combination of both for readability:

aws ec2 import-image --description <DESCRIPTION FOR THE IMPORT TASK> --license-type <LICENSE TYPE> --disk-containers <INFORMATION ABOUT THE DISKS>

Your command:

aws ec2 import-image --description "Blog import job" --license-type BYOL --disk-containers file://import-task.json

The contents of the JSON file:

[
        {
                "Description": "AppServer Ubuntu Image",
                "Format": "raw",
                "UserBucket": {
                        "S3Bucket": "import-blog-post",
                        "S3Key": "AppServer.raw"
                }
        }
]

You are specifying the source image format, location (in S3), and a description to be used for the AMI created from it.

Important: VM Import requires a service role to perform certain operations (such as downloading from S3, creating an image). If you do not have a vmimport role, follow the steps from the VM Import Service Role section in the Import Your VM as an Image topic. Additionally, before running the commands, I recommend that you verify that the role has the required access to the S3 bucket used for storing the image.

After you issue the command, a JSON response is displayed indicating that a task has been created:

{
    "Status": "active", 
    "SnapshotDetails": [
        {
            "UserBucket": {
                "S3Bucket": "import-blog-post", 
                "S3Key": "AppServer.raw"
            }, 
            "DiskImageSize": 0.0, 
            "Format": "RAW"
        }
    ], 
    "Progress": "2", 
    "StatusMessage": "pending", 
    "ImportTaskId": "import-ami-ffffffff"
}

The ImportTaskId value can be used to track or cancel the task:

aws ec2 describe-import-image-tasks --import-task-id import-ami- ffffffff

Looking at the task’s status, you can clearly see that it is active, with a percentage progress to help you understand where you are in the process:

{
    "ImportImageTasks": [
        {
            "Status": "active", 
            "Progress": "7", 
            "SnapshotDetails": [
                {
                    "UserBucket": {
                        "S3Bucket": "import-blog-post", 
                        "S3Key": "AppServer.raw"
                    }, 
                    "DiskImageSize": 26843545600.0, 
                    "Format": "RAW"
                }
            ], 
            "StatusMessage": "validated", 
            "ImportTaskId": "import-ami-ffffffff"
        }
    ]
}

The task progresses through the various import stages:

  • Converting
  • Updating
  • Updated
  • Preparing to boot
  • Booting
  • Booted
  • Preparing AMI
  • Completed

After the status transitions to Completed, you can find a value for ImageId in the response JSON. Start using your newly imported image, via the console, API, or CLI.

Conclusion

In this post, I presented a process for converting VMDK files, single or multiple, and importing them into AWS. This same process could be used for other source formats, such as QCOW, QCOW2, VDI, and more.

For those of you interested in other options, I would strongly recommend looking at AWS Server Migration Service, which makes it easier and faster to migrate workloads. For information about both partners and software solutions to assist you in such migrations, see Migration Partner Solutions.

Amazon ECS and Docker volume drivers, part 1: Amazon EBS

Post Syndicated from tiffany jernigan (@tiffanyfayj) original https://aws.amazon.com/blogs/compute/amazon-ecs-and-docker-volume-drivers-amazon-ebs/

→ Part 2: Amazon EFS

 

Post by: Jeremy Cowan, Ronnie Eichler, and Tiffany Jernigan

Introduction

Containers are emerging as the default compute primitive for building cloud-native applications.  They facilitate the adoption of continuous delivery, and help increase infrastructure use.

However, deploying stateful application as containers has been challenging because containers have short life-spans, get re-deployed frequently, are scaled up and down dynamically, and often share the same host with other containers. All of these factors make it challenging for you to appropriately align the lifecycles of storage volumes and containers.

Before Docker volume driver support was added to Amazon ECS, you had to manage storage volumes manually using custom tooling such as bash scripts, Lambda functions, or manual configuration of Docker volumes. Now, you can now take full advantage of the Docker plugin ecosystem by using popular plugins such as REX-Ray or Portworx.

ECS support for Docker volumes means that you can now deploy stateful and storage-intensive use cases. These include:

  • Machine learning and data processing workloads
  • Applications such as GitLab or Jenkins that share a filesystem across multiple tasks
  • Databases such as Cassandra or RocksDB
  • Streaming tools such as Kafka
  • Additional scratch space added to containers that process large workloads and are storage-intensive

To support this broad array of use cases, ECS offers you the flexibility to configure the lifecycle of the Docker volume. For example, you can specify whether it is a scratch space volume specific to a single instantiation of a task, or a persistent volume that persists beyond the lifecycle of a unique instantiation of the task. You can also choose to use a Docker volume that you’ve created before launching your task.

In addition to managing the Docker volume configuration and lifecycle, the ECS scheduler is now plugin-aware. ECS takes the availability of the requested driver into account in its placement decisions, so that tasks that require a certain driver are only placed on container instances that have the driver installed.

Docker and Docker volumes

Docker volumes are a way to persist data outside of the lifecycle of a container. Containers themselves are made up of multiple immutable layers of storage with an ephemeral layer, which is read/write. If your application writes files to the ephemeral layer, these changes are lost when the container stops.

Volumes are managed outside of the container lifecycle—stopping or removing the container does not remove the volume. Docker also supports volume drivers that allow you to use volumes as an abstraction between containers and persistent storage such as Amazon EBS or Amazon EFS. By default, Docker provides a driver called ‘local’ that provides local storage volumes to containers. With Docker plugins, you can now add volume drivers to provision and manage EBS and EFS storage, such as REX-Ray, Portworx, and NetShare.

To deploy a stateful application such as Cassandra, MongoDB, Zookeeper, or Kafka, you likely need high-performance persistent storage like EBS. Docker volumes allow you to present an EBS volume to your application as a Docker volume.

There are other applications such as Jenkins and GitLab, where multiple copies of the application need access to the same data. With volume drivers and EFS, you can present EFS as a shared volume to multiple instances of your container so that you can scale your application yet still retain and persist shared data on EFS.

Another overlooked use case involves applications that need scratch space. When you define a task in ECS and your application writes to the filesystem inside of the container (not on a Docker volume), the task consumes space on the underlying EC2 instance that is shared by all other running tasks. This can lead to issues of ‘noisy neighbors’ if a task were to write a bunch of data to /tmp on its local filesystem.

Now with Docker volume support in ECS, you can map an EBS volume to /tmp (or whatever your scratch space directory you prefer). You can ensure good performance while limiting the size of the underlying EBS volume using arguments in your ECS task to the volume driver.

What is REX-Ray?

REX-Ray is just one example of a Docker volume driver plugin that provides an abstraction between Docker volumes and the underlying storage. Built on top of the libStorage framework, REX-Ray’s simplified architecture consists of a single binary. It runs as a stateless service on every host, using a configuration file to orchestrate multiple storage platforms. REX-Ray supports multiple storage backends. For this post, we focus on EBS as a storage backend. Part two of this series focuses on EFS.

Using a plugin such as REX-Ray, your Docker container is able to persist data outside of the lifespan of a running container. You don’t have to worry about the underlying storage. Instead, you simply reference a Docker volume in your task definition and let REX-Ray provide the abstraction. While this post is specific to REX-Ray, ECS is designed to be open and pass through the volume driver arguments from your task definition to Docker. You can use any volume driver (such as Portworx) that is supported by Docker.

Putting it all together

Before you can get started using Docker volumes with ECS, there are a few things you need to do.

First, you need a suitable volume driver plugin, such as REX-Ray, to provide an abstraction between the Docker volume and the underlying storage, for example, EBS or EFS. Docker designed volumes and the associated driver mechanism to be pluggable to support a variety of storage backends. Although we’ve chosen to highlight REX-Ray for this post, there are several others to choose from, including Portworx and NetShare.

Because the volume plugin interacts with the AWS storage services on your behalf, an IAM role has to be assigned to the ECS container instances. This allows REX-Ray to issue the appropriate AWS API calls and perform actions such as attaching and detaching EBS volumes, and so on.

Using REX-Ray with Amazon EBS

To help you get started, we’ve created an AWS CloudFormation template that builds a two-node ECS cluster.  The template bootstraps the rexray/ebs volume driver onto each node and assigns them an IAM role with an inline policy that allows them to call the API actions that REX-Ray needs.  The template also creates a Network Load Balancer, which is used to expose an ECS service to the internet.

Finally, you create a task definition for a stateful service—MySQL—that uses the the rexray/ebs driver. Observe how the volume where MySQL stores its data is moved when the MySQL task is scheduled on another instance in the cluster.

Set up the environment

Here’s how to set up the environment for this walkthrough.

Step 1: Instantiate the AWS CloudFormation template

aws cloudformation create-stack --stack-name rexray-demo \
--capabilities CAPABILITY_NAMED_IAM \
--template-url http://s3.amazonaws.com/ecs-refarch-volume-plugins/rexray-demo.json \
--parameters ParameterKey=KeyName,ParameterValue=<keypair-name>

The ECS container instances are bootstrapped using the following script, which is given as user data in rexyray-demo.json.

#open file descriptor for stderr
exec 2>>/var/log/ecs/ecs-agent-install.log
set -x
#verify that the agent is running
until curl -s http://localhost:51678/v1/metadata
do
	sleep 1
done
#install the Docker volume plugin
docker plugin install rexray/ebs REXRAY_PREEMPT=true EBS_REGION=<AWS_REGION> --grant-all-permissions
#restart the ECS agent
stop ecs 
start ecs

Step 2: Export output parameters as environment variables

This shell script exports the output parameters from the CloudFormation template and imports them as OS environment variables.  You use these variables later to create task and service definitions.

cat > get-outputs.sh << 'EOF'
#!/bin/bash
function usage {
  echo "usage: source <(./get-outputs.sh <stackname-or-stackid> <region>)"
  echo "stack name or ID must be provided or exported as the CloudFormationStack environment variable"
  echo "region must be provided or set with aws configure"
}

function main {
    #Get stack
    if [ -z "$1" ]; then
        if [ -z "$CloudFormationStack" ]; then
            echo "please provide stack name or ID"
            usage
            exit 1
        fi
    else
        CloudFormationStack="$1"
    fi
    #Get region
    if [ -z "$2" ]; then
        region=$(aws configure get region)
        if [ -z $region ]; then
            echo "please provide region"
            usage
            exit 1
        fi
    else
        region="$2"
    fi
    
    echo "#Region: $region"
    echo "#Stack: $CloudFormationStack"
    echo "#---"
    
    echo "#Checking if stack exists..."
    aws cloudformation wait stack-exists \
    --region $region \
    --stack-name $CloudFormationStack
    
    echo "#Checking if stack creation is complete..."
    aws cloudformation wait stack-create-complete \
    --region $region \
    --stack-name $CloudFormationStack
     
    echo "#Getting output keys and values..."
    echo "#---"
    aws cloudformation describe-stacks \
    --region $region \
    --stack-name $CloudFormationStack \
    --query 'Stacks[].Outputs[].[OutputKey, OutputValue]' \
    --output text | awk '{print "export", $1"="$2}'
}
main "[email protected]"
EOF

#Add executable permissions
chmod +x get-outputs.sh

Export the output parameters. The region parameter is only needed if your Region configuration is not us-west-2, as defined in the CloudFormation template.

./get-outputs.sh && source <(./get-outputs.sh)

Step 3: Create the task definition

In this step, you create a task definition for MySQL.  MySQL is considered stateful service because the data stored in the database has to persist beyond the life of the task.

When the MySQL task is restarted on another instance in the cluster, the scheduler and the rexray/ebs plugin ensure that the task is launched on an instance that can re-establish a connection to the EBS volume where the database is stored.

The placement constraint in the task definition informs the ECS service scheduler to launch the task in a specific Availability Zone; the available zone where the EBS volume was originally created.  Such a constraint is necessary because instances cannot connect to volumes in a different Availability Zone.

cat > mysql-taskdef.json << EOF 
{
    "containerDefinitions": [
        {
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "${CWLogGroupName}",
                    "awslogs-region": "${AWSRegion}",
                    "awslogs-stream-prefix": "ecs"
                }
            },
            "portMappings": [
                {
                    "containerPort": 3306,
                    "protocol": "tcp"
                }
            ],
            "environment": [
                {
                    "name": "MYSQL_ROOT_PASSWORD",
                    "value": "my-secret-pw"
                }
            ],
            "mountPoints": [
                {
                    "containerPath": "/var/lib/mysql",
                    "sourceVolume": "rexray-vol"
                }
            ],
            "image": "mysql",
            "essential": true,
            "name": "mysql"
        }
    ],
    "placementConstraints": [
        {
            "type": "memberOf",
            "expression": "attribute:ecs.availability-zone==${AvailabilityZone}"
        }
    ],
    "memory": "512",
    "family": "mysql",
    "networkMode": "awsvpc",
    "requiresCompatibilities": [
        "EC2"
    ],
    "cpu": "512",
    "volumes": [
        {
            "name": "rexray-vol",
            "dockerVolumeConfiguration": {
                "autoprovision": true,
                "scope": "shared",
                "driver": "rexray/ebs",
                "driverOpts": {
                    "volumetype": "gp2",
                    "size": "5"
                }
            }
        }
    ]
}
EOF

Docker volumes support adds several new the parameters to the ECS task definition. These include the volume type, scope, drivers, and Docker options and labels. A volume can either be scoped to a single, specific task or it can be shared among multiple tasks.

When a volume is scoped to a task, it is not meant to be shared across different running tasks.  In contrast, a shared volume is for use cases where the volume lifecycle is independent of the ECS task. The volume can be used by different tasks concurrently or at different times. It is primarily intended for use cases such as single-task applications where the volume persists after the task dies and is re-used when the task starts again. Another use case is when multiple tasks on the same EC2 container instance access the volume concurrently.

The autoprovision parameter is used to specify whether ECS manages the lifecycle of the volume.  When this is set to true, ECS automatically provisions the volume for you, which is what you are doing in the above example.  When it’s set to false, ECS assumes that the volume already exists.  For this example, you could instead set autoprovision to false and run the following command to create a volume:

aws create-volume --size 1 --volume-type gp2 \
--availability-zone $AvailabilityZone \
--tag-specifications 'ResourceType=volume,Tags=[{Key=Name,Value=rexray-vol}]'

The driver options are used to configure the type of EBS storage use, for example, gp2, standard, io1, and so on, the size of the volume to provision, IOPS, and encryption.  The specific options vary depending on the volume plugin that you are using.

Register the task definition and extract the task definition ARN from the result:

TaskDefinitionArn=$(aws ecs register-task-definition \
--cli-input-json 'file://mysql-taskdef.json' \
| jq -r .taskDefinition.taskDefinitionArn)

Step 4: Create a service definition

In this step, you create a service definition for MySQL.  An ECS service is a long running task that is monitored by the service scheduler.  If the task dies or becomes unhealthy, the scheduler automatically attempts to restart the task.

The MySQL service is fronted by a Network Load Balancer that is configured for forward traffic on port 3306 to the tasks registered with a specific target group.  The desired count is the desired number of task copies to run. The minimum and maximum healthy percent parameters inform the scheduler to only run exactly the number of desired copies of this task at a time. Unless a task has been stopped, it does not try starting a new one.

cat > mysql-svcdef.json << EOF 
{
    "cluster": "${ECSClusterName}",
    "serviceName": "mysql-svc",
    "taskDefinition": "${TaskDefinitionArn}",
    "loadBalancers": [
        {
            "targetGroupArn": "${MySQLTargetGroupArn}",
            "containerName": "mysql",
            "containerPort": 3306
        }
    ],
    "desiredCount": 1,
    "launchType": "EC2",
    "healthCheckGracePeriodSeconds": 60, 
    "deploymentConfiguration": {
        "maximumPercent": 100,
        "minimumHealthyPercent": 0
    },
    "networkConfiguration": {
        "awsvpcConfiguration": {
            "subnets": [
                "${SubnetId}"
            ],
            "securityGroups": [
                "${SecurityGroupId}"
            ],
            "assignPublicIp": "DISABLED"
        }
    }
}
EOF

Create the MySQL service:

SvcDefinitionArn=$(aws ecs create-service \
--cli-input-json file://mysql-svcdef.json \
| jq -r .service.serviceArn)

Step 5: Connect to the MySQL service

After the service is running, configure a MySQL client, such as MySQL Workbench, to connect to the service:

  1. For Connection Name, type “rexray-demo”.
  2. For Hostname, copy and paste the DNS name of the Network Load Balancer.
  3. For Password, type the default password found in the mysql-taskdef.json file.
  4. Choose Test Connection, Close.
  5. Under MySQL Connections, open the rexray-demo connection.

MySQL Workbench

In the Query window, paste the following:

CREATE DATABASE rexraydb;
USE rexraydb;
CREATE TABLE pets (name VARCHAR(20), breed VARCHAR(20));
SHOW TABLES;
DESCRIBE pets;
INSERT INTO pets VALUES ('Fluffy', 'Poodle');
SELECT * FROM pets;

You can execute each line separately by placing the cursor on a line and clicking the execute statement button.

Execute MySQL commands

Step 6: Drain the instance

Now that you have a running MySQL database server running under a container and persisting its data, make sure that it will survive a container replacement.

Docker containers by their nature are designed to be ephemeral. If you upgrade the underlying host operating system, you must drain the tasks off of the instance and let them be re-scheduled onto another ECS host. Below, I show the behavior of persisting the MySQL instance’s data to an EBS volume and allowing the task to be re-scheduled.

The following script identifies the instance that is currently running the task and puts it in a draining state.  This forces the task to be rescheduled onto the other EC2 container instance in the cluster.

cat > drain-instance.sh << 'EOF'

echo "Region [$AWSRegion]"
echo "Cluster [$ECSClusterName]"
echo "Task Definition [$TaskDefinitionArn]"

TaskArns=$(aws ecs list-tasks --region $AWSRegion \
--cluster $ECSClusterName --query taskArns --output text)
echo "Task ARNs [$TaskArns]"

ContainerInstanceArns=$(aws ecs describe-tasks \
--region $AWSRegion --cluster $ECSClusterName \
--tasks $TaskArns \
--query 'tasks[?taskDefinitionArn==`'$TaskDefinitionArn'`]' \
--query 'tasks[].containerInstanceArn' --output text)
echo "Container Instance ARNs [$ContainerInstanceArns]"

echo "DRAINING Instances"
aws ecs update-container-instances-state --region $AWSRegion \
--cluster $ECSClusterName --container-instances $ContainerInstanceArns \
--status "DRAINING"

EOF

In the ECS console, if you click on the cluster and then the tab for the cluster’s tasks, you see the container instance ID for the MySQL task:

Clicking the link of the container instance ID takes you to another page that shows the EC2 instance ID of the instance where the MySQL task is running:

Now run the script:

chmod +x drain-instance.sh
./drain-instance.sh

When you run the script, the tasks on the draining instance are stopped. Because you have an ECS service definition for MySQL, ECS launches new tasks on other ECS instances in the cluster that meet the placement constraints. In this example, you placed a constraint on the Availability Zone of the EBS volume as it’s not possible to detach and re-attach volumes across Availability Zones. Because the volume already exists, REX-Ray attaches the existing volume to the new task. When MySQL starts, it sees this as its data volume and you have access to the recently stored data.

Step 7: Re-connect to the MySQL service

After you see that a new task has been provisioned on the ECS cluster, you can return to MySQL Workbench and attempt to run the following query:

USE rexraydb;
SELECT * FROM pets;

You may get an error message stating “The MySQL server has gone away.” This usually means that the new ECS task has not completed starting or hasn’t been registered yet as a healthy target behind the Network Load Balancer. If you wait a little longer and try again, you should see the same results in the query grid as before.

This environment is meant as a demonstration on how to use Docker volume plugins with ECS for supporting persistent workloads. For an actual production implementation, I recommend scoping the VPC and security groups to only allow network access from trusted resources. This post creates a MySQL server that is accessible from the internet. In addition, you should implement your own strong MySQL root password, among other things.

To clean up this demo, take the following steps.

Delete the service.

aws ecs update-service --cluster $ECSClusterName \
--service $SvcDefinitionArn \
--desired-count 0
aws ecs delete-service --cluster $ECSClusterName \
--service $SvcDefinitionArn

Delete the volume.

Even though you deleted the task and the service, you still need to clean up the EBS volume that you created. You created this volume and referenced it in the ECS task definition. ECS passed this information along to Docker running on the host, which in turn handed it to REX-Ray (your volume driver), which knew how to attach the EBS volume and map it to the container.

The easiest way to delete this volume is from the EC2 console. In the list of volumes, you should see a volume named rexray-vol that is unattached (state=available). Delete this volume as it is no longer needed.

 

REX-Ray Volume

Otherwise, you can run the following command, which grabs the volume ID and deletes it:

rexrayVolumeID=$(aws ec2 describe-volumes --filter Name="tag:Name",Values=rexray-vol \
--query "Volumes[].VolumeId" --output text)
aws ec2 delete-volume --volume-id $rexrayVolumeID

Delete the CloudFormation template.

Lastly, delete the CloudFormation template. This removes the rest of the environment that was pre-created for this exercise.

aws cloudformation delete-stack --stack-name rexray-demo

Summary

While it was possible to use Docker volume plugins with ECS previously, doing so required you to create volumes out of band, that is, outside of ECS, and create placement constraints to restrict where tasks could be run. With native support for Docker volumes, volumes can now be provisioned simply by adding a handful of parameters to an ECS task definition.

Moreover, the ECS scheduler is now volume plugin aware.  Instances that have a volume driver installed on them automatically get annotated with attributes that inform the scheduler where to place tasks that use a particular driver.  Together, these features help you to run stateful, storage intensive applications such as databases, machine learning, and data processing applications, streaming applications like Kafka, as well as applications that need additional scratch space.  We look forward to hearing about the use cases that this new feature enables.

– Jeremy, Ronnie, and Tiffany

Improving application performance and reducing costs with Amazon EBS-Optimized Instance burst capability

Post Syndicated from Geoff Murase original https://aws.amazon.com/blogs/compute/improving-application-performance-and-reducing-costs-with-amazon-ebs-optimized-instance-burst-capability/

Contributed by Sooraj Prasannan, Senior Product Manager, Amazon Elastic Block Store

In November 2017, Amazon EC2 introduced C5 compute-intensive instances and M5 general-purpose instances. In the first half of 2018, we released EC2 C5d instances and M5d instances by adding high-speed, ultra-low latency local NVMe storage to the EC2 C5 and M5 instance families. EC2 C5/C5d and M5/M5d instances are built on the Nitro system. This collection of AWS-built hardware and software components enables high performance, high availability, high security, and bare metal capabilities to reduce virtualization overhead.

During the design of the Nitro system, we analyzed real-world workloads and recognized the need for smaller instance sizes to drive higher performance from their Amazon EBS volumes. We found that the majority of application storage needs are bursty, with short, intense periods of high I/O and plenty of idle time between bursts. To improve the experience for these workloads, we developed burst capability for smaller instance sizes. Available on EC2 C5/C5d and M5/M5d instances, this feature enables large, xlarge, and 2xlarge instance sizes to drive the same performance as the 4xlarge instance for 30 minutes each day.

For applications with spiky Amazon EBS demand, you can right-size your instances based on your CPU and memory requirements and still meet your EBS-optimized instance performance requirements. This higher performance also enables you to speed up sections of your workflow dependent on EBS-optimized instance performance. Faster workflows result in quicker job completions and improved resource utilization. The burst capability ultimately enables you to reduce costs by right-sizing your instance and improving total resource usage.

With this performance increase, you will be able to handle unplanned spikes in demand without any impact to your application performance. You can now size your instances based on historical average trends. This burst capability gives you more performance to absorb spikes without affecting your customer experience.

Using Amazon CloudWatch metrics to monitor burst usage

For better visibility into your performance, instances based on the Nitro system provide Amazon CloudWatch metrics to help profile your usage. Based on the usage profile, you can decide if smaller instances meet your requirements.

These instances give you the ability to monitor your usage via instance level CloudWatch metrics for operations (EBSReadOpsandEBSWriteOps) and bytes transferred (EBSReadBytesand EBSWriteBytes). For more information on these metrics, see List of available CloudWatch metrics for your instances. These metrics support basic monitoring (five-minute frequency) by default, but you can enable detailed monitoring (one-minute frequency) for an additional cost. For more information, see Amazon CloudWatch pricing.

For large, xlarge, and 2xlarge instances, we also provide burst balance metrics. EBSIOBalance% monitors the instance I/O burst bucket, and EBSByteBalance% monitors the instance byte burst bucket. These metrics give information about the percentage of I/O or bytes credits remaining in the respective burst buckets. The metrics are expressed as a percentage, where 100% means that the instance has accumulated the maximum number of credits. You can set up an alarm that triggers if the balance gets too low.

To demonstrate these metrics, we launched an m5.large instance. We then attached a 500GB io1 Amazon EBS volume with 32,000 provisioned IOPS to the instance. Amazon EBS volumes attached to instances based on the Nitro system are exposed as NVMe devices.

First, we ran a large block (128 KiB) test using fio to /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_vol02f2f9a66c2ebfd66 and monitored both EBSIOBalance% and EBSByteBalance%.

$ sudo fio --filename= /dev/disk/by-id/nvme-
Amazon_Elastic_Block_Store_vol02f2f9a66c2ebfd66 --rw=randread --
bs=128k --runtime=2400 --time_based=1 --iodepth=32 --
ioengine=libaio --direct=1 --name=large-block-test 

Because this is a large block workload, it’s not driving enough IOPS to deplete EBSIOBalance%. It depletes EBSByteBalance% instead, as shown in the following image.

Then we ran a small block test to understand how it affects EBSIOBalance% and EBSByteBalance%.

$ sudo fio --filename= /dev/disk/by-id/nvme-
Amazon_Elastic_Block_Store_vol02f2f9a66c2ebfd66 --rw=randread --
bs=16k --runtime=2400 --time_based=1 --iodepth=32 --
ioengine=libaio --direct=1 --name=small-block-test 

Because this is a small block test, it drives higher IOPS than bytes/second. Hence, EBSIOBalance% drops faster than EBSByteBalance%, as shown in the following image.

As long as EBSIOBalance% and EBSByteBalance% are above 0%, the instance can drive the burst performance. When the instance I/O activity is below the baseline rate, the burst buckets refill. After the tests finished, we paused all I/O from the instance. This period of inactivity allows the instance burst buckets to refill, as EBSIOBalance% and EBSByteBalance% show in the following image.

The refill rate for a burst bucket is the difference between the baseline rate and the instance I/O activity. For example, m5.large has a baseline throughput rate of 60 MB/s and a baseline IOPS rate of 3600 IOPS. Suppose the instance I/O activity is 10 MB/s and 1000 IOPS. The byte bucket fills at the rate of 50 MB/s (60 MB/s minus 10 MB/s). The IOPS bucket fills at the rate of 2600 IOPS (3600 IOPS minus 1000 IOPS). For the baseline rates for the different instances, see Amazon EBS–optimized instances. In addition, we top off the burst buckets every 24 hours, which means that the instance has burst performance available for 30 minutes each day.

Performance enhancements

We have continued to make enhancements to the Nitro system. With the latest set of enhancements, we have increased the maximum burst bandwidth on the large, xlarge, and 2xlarge EC2 C5/C5d and M5/M5d instances to 3.5 Gbps, up from 2.25 Gbps and 2.12 Gbps, respectively. We have also increased the maximum burst IOPS for EC2 C5/C5d to 20,000 IOPS and to 18,750 IOPS for M5/M5d, up from 16,000 IOPS for both. All new EC2 C5/C5d and M5/M5d smaller instances can take advantage of this performance increase at no additional cost.

For the latest list of instances based on the Nitro system that support this burst feature and their corresponding performance numbers, see Amazon EBS–optimized instances.

New – Lifecycle Management for Amazon EBS Snapshots

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-lifecycle-management-for-amazon-ebs-snapshots/

It is always interesting to zoom in on the history of a single AWS service or feature and watch how it has evolved over time in response to customer feedback. For example, Amazon Elastic Block Store (EBS) launched a decade ago and has been gaining more features and functionality every since. Here are a few of the most significant announcements:

Several of the items that I chose to highlight above make EBS snapshots more useful and more flexible. As you may already know, it is easy to create snapshots. Each snapshot is a point-in-time copy of the blocks that have changed since the previous snapshot, with automatic management to ensure that only the data unique to a snapshot is removed when it is deleted. This incremental model reduces your costs and minimizes the time needed to create a snapshot.

Because snapshots are so easy to create and use, our customers create a lot of them, and make great use of tags to categorize, organize, and manage them. Going back to my list, you can see that we have added multiple tagging features over the years.

Lifecycle Management – The Amazon Data Lifecycle Manager
We want to make it even easier for you to create, use, and benefit from EBS snapshots! Today we are launching Amazon Data Lifecycle Manager to automate the creation, retention, and deletion of Amazon EBS volume snapshots. Instead of creating snapshots manually and deleting them in the same way (or building a tool to do it for you), you simply create a policy, indicating (via tags) which volumes are to be snapshotted, set a retention model, fill in a few other details, and let Data Lifecycle Manager do the rest. Data Lifecycle Manager is powered by tags, so you should start by setting up a clear and comprehensive tagging model for your organization (refer to the links above to learn more).

It turns out that many of our customers have invested in tools to automate the creation of snapshots, but have skimped on the retention and deletion. Sooner or later they receive a surprisingly large AWS bill and find that their scripts are not working as expected. The Data Lifecycle Manager should help them to save money and to be able to rest assured that their snapshots are being managed as expected.

Creating and Using a Lifecycle Policy
Data Lifecycle Manager uses lifecycle policies to figure out when to run, which volumes to snapshot, and how long to keep the snapshots around. You can create the policies in the AWS Management Console, from the AWS Command Line Interface (CLI) or via the Data Lifecycle Manager APIs; I’ll use the Console today. Here are my EBS volumes, all suitably tagged with a department:

I access the Lifecycle Manager from the Elastic Block Store section of the menu:

Then I click Create Snapshot Lifecycle Policy to proceed:

Then I create my first policy:

I use tags to specify the volumes that the policy applies to. If I specify multiple tags, then the policy applies to volumes that have any of the tags:

I can create snapshots at 12 or 24 hour intervals, and I can specify the desired snapshot time. Snapshot creation will start no more than an hour after this time, with completion based on the size of the volume and the degree of change since the last snapshot.

I can use the built-in default IAM role or I can create one of my own. If I use my own role, I need to enable the EC2 snapshot operations and all of the DLM (Data Lifecycle Manager) operations; read the docs to learn more.

Newly created snapshots will be tagged with the aws:dlm:lifecycle-policy-id and  aws:dlm:lifecycle-schedule-name automatically; I can also specify up to 50 additional key/value pairs for each policy:

I can see all of my policies at a glance:

I took a short break and came back to find that the first set of snapshots had been created, as expected (I configured the console to show the two tags created on the snapshots):

Things to Know
Here are a couple of things to keep in mind when you start to use Data Lifecycle Manager to automate your snapshot management:

Data Consistency – Snapshots will contain the data from all completed I/O operations, also known as crash consistent.

Pricing – You can create and use Data Lifecyle Manager policies at no charge; you pay the usual storage charges for the EBS snapshots that it creates.

Availability – Data Lifecycle Manager is available in the US East (N. Virginia), US West (Oregon), and EU (Ireland) Regions.

Tags and Policies – If a volume has more than one tag and the tags match multiple policies, each policy will create a separate snapshot and both policies will govern the retention. No two policies can specify the same key/value pair for a tag.

Programmatic Access – You can create and manage policies programmatically! Take a look at the CreateLifecyclePolicy, GetLifecyclePolicies, and UpdateLifeCyclePolicy functions to get started. You can also write an AWS Lambda function in response to the createSnapshot event.

Error Handling – Data Lifecycle Manager generates a “DLM Policy State Change” event if a policy enters the error state.

In the Works – As you might have guessed from the name, we plan to add support for additional AWS data sources over time. We also plan to support policies that will let you do weekly and monthly snapshots, and also expect to give you additional scheduling flexibility.

Jeff;

Now You Can Create Encrypted Amazon EBS Volumes by Using Your Custom Encryption Keys When You Launch an Amazon EC2 Instance

Post Syndicated from Nishit Nagar original https://aws.amazon.com/blogs/security/create-encrypted-amazon-ebs-volumes-custom-encryption-keys-launch-amazon-ec2-instance-2/

Amazon Elastic Block Store (EBS) offers an encryption solution for your Amazon EBS volumes so you don’t have to build, maintain, and secure your own infrastructure for managing encryption keys for block storage. Amazon EBS encryption uses AWS Key Management Service (AWS KMS) customer master keys (CMKs) when creating encrypted Amazon EBS volumes, providing you all the benefits associated with using AWS KMS. You can specify either an AWS managed CMK or a customer-managed CMK to encrypt your Amazon EBS volume. If you use a customer-managed CMK, you retain granular control over your encryption keys, such as having AWS KMS rotate your CMK every year. To learn more about creating CMKs, see Creating Keys.

In this post, we demonstrate how to create an encrypted Amazon EBS volume using a customer-managed CMK when you launch an EC2 instance from the EC2 console, AWS CLI, and AWS SDK.

Creating an encrypted Amazon EBS volume from the EC2 console

Follow these steps to launch an EC2 instance from the EC2 console with Amazon EBS volumes that are encrypted by customer-managed CMKs:

  1. Sign in to the AWS Management Console and open the EC2 console.
  2. Select Launch instance, and then, in Step 1 of the wizard, select an Amazon Machine Image (AMI).
  3. In Step 2 of the wizard, select an instance type, and then provide additional configuration details in Step 3. For details about configuring your instances, see Launching an Instance.
  4. In Step 4 of the wizard, specify additional EBS volumes that you want to attach to your instances.
  5. To create an encrypted Amazon EBS volume, first add a new volume by selecting Add new volume. Leave the Snapshot column blank.
  6. In the Encrypted column, select your CMK from the drop-down menu. You can also paste the full Amazon Resource Name (ARN) of your custom CMK key ID in this box. To learn more about finding the ARN of a CMK, see Working with Keys.
  7. Select Review and Launch. Your instance will launch with an additional Amazon EBS volume with the key that you selected. To learn more about the launch wizard, see Launching an Instance with Launch Wizard.

Creating Amazon EBS encrypted volumes from the AWS CLI or SDK

You also can use RunInstances to launch an instance with additional encrypted Amazon EBS volumes by setting Encrypted to true and adding kmsKeyID along with the actual key ID in the BlockDeviceMapping object, as shown in the following command:

$> aws ec2 run-instances –image-id ami-b42209de –count 1 –instance-type m4.large –region us-east-1 –block-device-mappings file://mapping.json

In this example, mapping.json describes the properties of the EBS volume that you want to create:


{
"DeviceName": "/dev/sda1",
"Ebs": {
"DeleteOnTermination": true,
"VolumeSize": 100,
"VolumeType": "gp2",
"Encrypted": true,
"kmsKeyID": "arn:aws:kms:us-east-1:012345678910:key/abcd1234-a123-456a-a12b-a123b4cd56ef"
}
}

You can also launch instances with additional encrypted EBS data volumes via an Auto Scaling or Spot Fleet by creating a launch template with the above BlockDeviceMapping. For example:

$> aws ec2 create-launch-template –MyLTName –image-id ami-b42209de –count 1 –instance-type m4.large –region us-east-1 –block-device-mappings file://mapping.json

To learn more about launching an instance with the AWS CLI or SDK, see the AWS CLI Command Reference.

In this blog post, we’ve demonstrated a single-step, streamlined process for creating Amazon EBS volumes that are encrypted under your CMK when you launch your EC2 instance, thereby streamlining your instance launch workflow. To start using this functionality, navigate to the EC2 console.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread on the Amazon EC2 forum or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Tag Amazon EBS Snapshots on Creation and Implement Stronger Security Policies

Post Syndicated from Woo Kim original https://aws.amazon.com/blogs/compute/tag-amazon-ebs-snapshots-on-creation-and-implement-stronger-security-policies/

This blog was contributed by Rucha Nene, Sr. Product Manager for Amazon EBS

AWS customers use tags to track ownership of resources, implement compliance protocols, control access to resources via IAM policies, and drive their cost accounting processes. Last year, we made tagging for Amazon EC2 instances and Amazon EBS volumes easier by adding the ability to tag these resources upon creation. We are now extending this capability to EBS snapshots.

Earlier, you could tag your EBS snapshots only after the resource had been created and sometimes, ended up with EBS snapshots in an untagged state if tagging failed. You also could not control the actions that users and groups could take over specific snapshots, or enforce tighter security policies.

To address these issues, we are making tagging for EBS snapshots more flexible and giving customers more control over EBS snapshots by introducing two new capabilities:

  • Tag on creation for EBS snapshots – You can now specify tags for EBS snapshots as part of the API call that creates the resource or via the Amazon EC2 Console when creating an EBS snapshot.
  • Resource-level permission and enforced tag usage – The CreateSnapshot, DeleteSnapshot, and ModifySnapshotAttrribute API actions now support IAM resource-level permissions. You can now write IAM policies that mandate the use of specific tags when taking actions on EBS snapshots.

Tag on creation

You can now specify tags for EBS snapshots as part of the API call that creates the resources. The resource creation and the tagging are performed atomically; both must succeed in order for the operation CreateSnapshot to succeed. You no longer need to build tagging scripts that run after EBS snapshots have been created.

Here’s how you specify tags when you create an EBS snapshot, using the console:

  1. Open the Amazon EC2 console at https://console.aws.amazon.com/ec2/.
  2. In the navigation pane, choose Snapshots, Create Snapshot.
  3. On the Create Snapshot page, select the volume for which to create a snapshot.
  4. (Optional) Choose Add tags to your snapshot. For each tag, provide a tag key and a tag value.
  5. Choose Create Snapshot.

Using the AWS CLI:

aws ec2 create-snapshot --volume-id vol-0c0e757e277111f3c --description 'Prod_Backup' --tag-specifications 
'ResourceType=snapshot,Tags=[{Key=costcenter,Value=115},{Key=IsProd,Value=Yes}]'

To learn more, see Using Tags.

Resource-level permissions and enforced tag usage

CreateSnapshot, DeleteSnapshot, and ModifySnapshotAttribute now support resource-level permissions, which allow you to exercise more control over EBS snapshots. You can write IAM policies that give you precise control over access to resources and let you specify which users are able to create snapshots for a given set of volumes. You can also enforce the use of specific tags to help track resources and achieve more accurate cost allocation reporting.

For example, here’s a statement that requires that the costcenter tag (with a value of “115”) be present on the volume from which snapshots are being created. It requires that this tag be applied to all newly created snapshots. In addition, it requires that the created snapshots are tagged with User:username for the customer.

{
   "Version":"2012-10-17",
   "Statement":[
      {
         "Effect":"Allow",
         "Action":"ec2:CreateSnapshot",
         "Resource":"arn:aws:ec2:us-east-1:123456789012:volume/*",
	   "Condition": {
		"StringEquals":{
               "ec2:ResourceTag/costcenter":"115"
}
 }
	
      },
      {
         "Sid":"AllowCreateTaggedSnapshots",
         "Effect":"Allow",
         "Action":"ec2:CreateSnapshot",
         "Resource":"arn:aws:ec2:us-east-1::snapshot/*",
         "Condition":{
            "StringEquals":{
               "aws:RequestTag/costcenter":"115",
		   "aws:RequestTag/User":"${aws:username}"
            },
            "ForAllValues:StringEquals":{
               "aws:TagKeys":[
                  "costcenter",
			"User"
               ]
            }
         }
      },
      {
         "Effect":"Allow",
         "Action":"ec2:CreateTags",
         "Resource":"arn:aws:ec2:us-east-1::snapshot/*",
         "Condition":{
            "StringEquals":{
               "ec2:CreateAction":"CreateSnapshot"
            }
         }
      }
   ]
}

To implement stronger compliance and security policies, you could also restrict access to DeleteSnapshot, if the resource is not tagged with the user’s name. Here’s a statement that allows the deletion of a snapshot only if the snapshot is tagged with User:username for the customer.

{
   "Version":"2012-10-17",
   "Statement":[
      {
         "Effect":"Allow",
         "Action":"ec2:DeleteSnapshot",
         "Resource":"arn:aws:ec2:us-east-1::snapshot/*",
         "Condition":{
            "StringEquals":{
               "ec2:ResourceTag/User":"${aws:username}"
            }
         }
      }
   ]
}

To learn more and to see some sample policies, see IAM Policies for Amazon EC2 and Working with Snapshots.

Available Now

These new features are available now in all AWS Regions. You can start using it today from the Amazon EC2 Console, AWS Command Line Interface (CLI), or the AWS APIs.

Best Practices for Running Apache Cassandra on Amazon EC2

Post Syndicated from Prasad Alle original https://aws.amazon.com/blogs/big-data/best-practices-for-running-apache-cassandra-on-amazon-ec2/

Apache Cassandra is a commonly used, high performance NoSQL database. AWS customers that currently maintain Cassandra on-premises may want to take advantage of the scalability, reliability, security, and economic benefits of running Cassandra on Amazon EC2.

Amazon EC2 and Amazon Elastic Block Store (Amazon EBS) provide secure, resizable compute capacity and storage in the AWS Cloud. When combined, you can deploy Cassandra, allowing you to scale capacity according to your requirements. Given the number of possible deployment topologies, it’s not always trivial to select the most appropriate strategy suitable for your use case.

In this post, we outline three Cassandra deployment options, as well as provide guidance about determining the best practices for your use case in the following areas:

  • Cassandra resource overview
  • Deployment considerations
  • Storage options
  • Networking
  • High availability and resiliency
  • Maintenance
  • Security

Before we jump into best practices for running Cassandra on AWS, we should mention that we have many customers who decided to use DynamoDB instead of managing their own Cassandra cluster. DynamoDB is fully managed, serverless, and provides multi-master cross-region replication, encryption at rest, and managed backup and restore. Integration with AWS Identity and Access Management (IAM) enables DynamoDB customers to implement fine-grained access control for their data security needs.

Several customers who have been using large Cassandra clusters for many years have moved to DynamoDB to eliminate the complications of administering Cassandra clusters and maintaining high availability and durability themselves. Gumgum.com is one customer who migrated to DynamoDB and observed significant savings. For more information, see Moving to Amazon DynamoDB from Hosted Cassandra: A Leap Towards 60% Cost Saving per Year.

AWS provides options, so you’re covered whether you want to run your own NoSQL Cassandra database, or move to a fully managed, serverless DynamoDB database.

Cassandra resource overview

Here’s a short introduction to standard Cassandra resources and how they are implemented with AWS infrastructure. If you’re already familiar with Cassandra or AWS deployments, this can serve as a refresher.

ResourceCassandraAWS
Cluster

A single Cassandra deployment.

 

This typically consists of multiple physical locations, keyspaces, and physical servers.

A logical deployment construct in AWS that maps to an AWS CloudFormation StackSet, which consists of one or many CloudFormation stacks to deploy Cassandra.
DatacenterA group of nodes configured as a single replication group.

A logical deployment construct in AWS.

 

A datacenter is deployed with a single CloudFormation stack consisting of Amazon EC2 instances, networking, storage, and security resources.

Rack

A collection of servers.

 

A datacenter consists of at least one rack. Cassandra tries to place the replicas on different racks.

A single Availability Zone.
Server/nodeA physical virtual machine running Cassandra software.An EC2 instance.
TokenConceptually, the data managed by a cluster is represented as a ring. The ring is then divided into ranges equal to the number of nodes. Each node being responsible for one or more ranges of the data. Each node gets assigned with a token, which is essentially a random number from the range. The token value determines the node’s position in the ring and its range of data.Managed within Cassandra.
Virtual node (vnode)Responsible for storing a range of data. Each vnode receives one token in the ring. A cluster (by default) consists of 256 tokens, which are uniformly distributed across all servers in the Cassandra datacenter.Managed within Cassandra.
Replication factorThe total number of replicas across the cluster.Managed within Cassandra.

Deployment considerations

One of the many benefits of deploying Cassandra on Amazon EC2 is that you can automate many deployment tasks. In addition, AWS includes services, such as CloudFormation, that allow you to describe and provision all your infrastructure resources in your cloud environment.

We recommend orchestrating each Cassandra ring with one CloudFormation template. If you are deploying in multiple AWS Regions, you can use a CloudFormation StackSet to manage those stacks. All the maintenance actions (scaling, upgrading, and backing up) should be scripted with an AWS SDK. These may live as standalone AWS Lambda functions that can be invoked on demand during maintenance.

You can get started by following the Cassandra Quick Start deployment guide. Keep in mind that this guide does not address the requirements to operate a production deployment and should be used only for learning more about Cassandra.

Deployment patterns

In this section, we discuss various deployment options available for Cassandra in Amazon EC2. A successful deployment starts with thoughtful consideration of these options. Consider the amount of data, network environment, throughput, and availability.

  • Single AWS Region, 3 Availability Zones
  • Active-active, multi-Region
  • Active-standby, multi-Region

Single region, 3 Availability Zones

In this pattern, you deploy the Cassandra cluster in one AWS Region and three Availability Zones. There is only one ring in the cluster. By using EC2 instances in three zones, you ensure that the replicas are distributed uniformly in all zones.

To ensure the even distribution of data across all Availability Zones, we recommend that you distribute the EC2 instances evenly in all three Availability Zones. The number of EC2 instances in the cluster is a multiple of three (the replication factor).

This pattern is suitable in situations where the application is deployed in one Region or where deployments in different Regions should be constrained to the same Region because of data privacy or other legal requirements.

ProsCons

●     Highly available, can sustain failure of one Availability Zone.

●     Simple deployment

●     Does not protect in a situation when many of the resources in a Region are experiencing intermittent failure.

 

Active-active, multi-Region

In this pattern, you deploy two rings in two different Regions and link them. The VPCs in the two Regions are peered so that data can be replicated between two rings.

We recommend that the two rings in the two Regions be identical in nature, having the same number of nodes, instance types, and storage configuration.

This pattern is most suitable when the applications using the Cassandra cluster are deployed in more than one Region.

ProsCons

●     No data loss during failover.

●     Highly available, can sustain when many of the resources in a Region are experiencing intermittent failures.

●     Read/write traffic can be localized to the closest Region for the user for lower latency and higher performance.

●     High operational overhead

●     The second Region effectively doubles the cost

 

Active-standby, multi-region

In this pattern, you deploy two rings in two different Regions and link them. The VPCs in the two Regions are peered so that data can be replicated between two rings.

However, the second Region does not receive traffic from the applications. It only functions as a secondary location for disaster recovery reasons. If the primary Region is not available, the second Region receives traffic.

We recommend that the two rings in the two Regions be identical in nature, having the same number of nodes, instance types, and storage configuration.

This pattern is most suitable when the applications using the Cassandra cluster require low recovery point objective (RPO) and recovery time objective (RTO).

ProsCons

●     No data loss during failover.

●     Highly available, can sustain failure or partitioning of one whole Region.

●     High operational overhead.

●     High latency for writes for eventual consistency.

●     The second Region effectively doubles the cost.

Storage options

In on-premises deployments, Cassandra deployments use local disks to store data. There are two storage options for EC2 instances:

Your choice of storage is closely related to the type of workload supported by the Cassandra cluster. Instance store works best for most general purpose Cassandra deployments. However, in certain read-heavy clusters, Amazon EBS is a better choice.

The choice of instance type is generally driven by the type of storage:

  • If ephemeral storage is required for your application, a storage-optimized (I3) instance is the best option.
  • If your workload requires Amazon EBS, it is best to go with compute-optimized (C5) instances.
  • Burstable instance types (T2) don’t offer good performance for Cassandra deployments.

Instance store

Ephemeral storage is local to the EC2 instance. It may provide high input/output operations per second (IOPs) based on the instance type. An SSD-based instance store can support up to 3.3M IOPS in I3 instances. This high performance makes it an ideal choice for transactional or write-intensive applications such as Cassandra.

In general, instance storage is recommended for transactional, large, and medium-size Cassandra clusters. For a large cluster, read/write traffic is distributed across a higher number of nodes, so the loss of one node has less of an impact. However, for smaller clusters, a quick recovery for the failed node is important.

As an example, for a cluster with 100 nodes, the loss of 1 node is 3.33% loss (with a replication factor of 3). Similarly, for a cluster with 10 nodes, the loss of 1 node is 33% less capacity (with a replication factor of 3).

 Ephemeral storageAmazon EBSComments

IOPS

(translates to higher query performance)

Up to 3.3M on I3

80K/instance

10K/gp2/volume

32K/io1/volume

This results in a higher query performance on each host. However, Cassandra implicitly scales well in terms of horizontal scale. In general, we recommend scaling horizontally first. Then, scale vertically to mitigate specific issues.

 

Note: 3.3M IOPS is observed with 100% random read with a 4-KB block size on Amazon Linux.

AWS instance typesI3Compute optimized, C5Being able to choose between different instance types is an advantage in terms of CPU, memory, etc., for horizontal and vertical scaling.
Backup/ recoveryCustomBasic building blocks are available from AWS.

Amazon EBS offers distinct advantage here. It is small engineering effort to establish a backup/restore strategy.

a) In case of an instance failure, the EBS volumes from the failing instance are attached to a new instance.

b) In case of an EBS volume failure, the data is restored by creating a new EBS volume from last snapshot.

Amazon EBS

EBS volumes offer higher resiliency, and IOPs can be configured based on your storage needs. EBS volumes also offer some distinct advantages in terms of recovery time. EBS volumes can support up to 32K IOPS per volume and up to 80K IOPS per instance in RAID configuration. They have an annualized failure rate (AFR) of 0.1–0.2%, which makes EBS volumes 20 times more reliable than typical commodity disk drives.

The primary advantage of using Amazon EBS in a Cassandra deployment is that it reduces data-transfer traffic significantly when a node fails or must be replaced. The replacement node joins the cluster much faster. However, Amazon EBS could be more expensive, depending on your data storage needs.

Cassandra has built-in fault tolerance by replicating data to partitions across a configurable number of nodes. It can not only withstand node failures but if a node fails, it can also recover by copying data from other replicas into a new node. Depending on your application, this could mean copying tens of gigabytes of data. This adds additional delay to the recovery process, increases network traffic, and could possibly impact the performance of the Cassandra cluster during recovery.

Data stored on Amazon EBS is persisted in case of an instance failure or termination. The node’s data stored on an EBS volume remains intact and the EBS volume can be mounted to a new EC2 instance. Most of the replicated data for the replacement node is already available in the EBS volume and won’t need to be copied over the network from another node. Only the changes made after the original node failed need to be transferred across the network. That makes this process much faster.

EBS volumes are snapshotted periodically. So, if a volume fails, a new volume can be created from the last known good snapshot and be attached to a new instance. This is faster than creating a new volume and coping all the data to it.

Most Cassandra deployments use a replication factor of three. However, Amazon EBS does its own replication under the covers for fault tolerance. In practice, EBS volumes are about 20 times more reliable than typical disk drives. So, it is possible to go with a replication factor of two. This not only saves cost, but also enables deployments in a region that has two Availability Zones.

EBS volumes are recommended in case of read-heavy, small clusters (fewer nodes) that require storage of a large amount of data. Keep in mind that the Amazon EBS provisioned IOPS could get expensive. General purpose EBS volumes work best when sized for required performance.

Networking

If your cluster is expected to receive high read/write traffic, select an instance type that offers 10–Gb/s performance. As an example, i3.8xlarge and c5.9xlarge both offer 10–Gb/s networking performance. A smaller instance type in the same family leads to a relatively lower networking throughput.

Cassandra generates a universal unique identifier (UUID) for each node based on IP address for the instance. This UUID is used for distributing vnodes on the ring.

In the case of an AWS deployment, IP addresses are assigned automatically to the instance when an EC2 instance is created. With the new IP address, the data distribution changes and the whole ring has to be rebalanced. This is not desirable.

To preserve the assigned IP address, use a secondary elastic network interface with a fixed IP address. Before swapping an EC2 instance with a new one, detach the secondary network interface from the old instance and attach it to the new one. This way, the UUID remains same and there is no change in the way that data is distributed in the cluster.

If you are deploying in more than one region, you can connect the two VPCs in two regions using cross-region VPC peering.

High availability and resiliency

Cassandra is designed to be fault-tolerant and highly available during multiple node failures. In the patterns described earlier in this post, you deploy Cassandra to three Availability Zones with a replication factor of three. Even though it limits the AWS Region choices to the Regions with three or more Availability Zones, it offers protection for the cases of one-zone failure and network partitioning within a single Region. The multi-Region deployments described earlier in this post protect when many of the resources in a Region are experiencing intermittent failure.

Resiliency is ensured through infrastructure automation. The deployment patterns all require a quick replacement of the failing nodes. In the case of a regionwide failure, when you deploy with the multi-Region option, traffic can be directed to the other active Region while the infrastructure is recovering in the failing Region. In the case of unforeseen data corruption, the standby cluster can be restored with point-in-time backups stored in Amazon S3.

Maintenance

In this section, we look at ways to ensure that your Cassandra cluster is healthy:

  • Scaling
  • Upgrades
  • Backup and restore

Scaling

Cassandra is horizontally scaled by adding more instances to the ring. We recommend doubling the number of nodes in a cluster to scale up in one scale operation. This leaves the data homogeneously distributed across Availability Zones. Similarly, when scaling down, it’s best to halve the number of instances to keep the data homogeneously distributed.

Cassandra is vertically scaled by increasing the compute power of each node. Larger instance types have proportionally bigger memory. Use deployment automation to swap instances for bigger instances without downtime or data loss.

Upgrades

All three types of upgrades (Cassandra, operating system patching, and instance type changes) follow the same rolling upgrade pattern.

In this process, you start with a new EC2 instance and install software and patches on it. Thereafter, remove one node from the ring. For more information, see Cassandra cluster Rolling upgrade. Then, you detach the secondary network interface from one of the EC2 instances in the ring and attach it to the new EC2 instance. Restart the Cassandra service and wait for it to sync. Repeat this process for all nodes in the cluster.

Backup and restore

Your backup and restore strategy is dependent on the type of storage used in the deployment. Cassandra supports snapshots and incremental backups. When using instance store, a file-based backup tool works best. Customers use rsync or other third-party products to copy data backups from the instance to long-term storage. For more information, see Backing up and restoring data in the DataStax documentation. This process has to be repeated for all instances in the cluster for a complete backup. These backup files are copied back to new instances to restore. We recommend using S3 to durably store backup files for long-term storage.

For Amazon EBS based deployments, you can enable automated snapshots of EBS volumes to back up volumes. New EBS volumes can be easily created from these snapshots for restoration.

Security

We recommend that you think about security in all aspects of deployment. The first step is to ensure that the data is encrypted at rest and in transit. The second step is to restrict access to unauthorized users. For more information about security, see the Cassandra documentation.

Encryption at rest

Encryption at rest can be achieved by using EBS volumes with encryption enabled. Amazon EBS uses AWS KMS for encryption. For more information, see Amazon EBS Encryption.

Instance store–based deployments require using an encrypted file system or an AWS partner solution. If you are using DataStax Enterprise, it supports transparent data encryption.

Encryption in transit

Cassandra uses Transport Layer Security (TLS) for client and internode communications.

Authentication

The security mechanism is pluggable, which means that you can easily swap out one authentication method for another. You can also provide your own method of authenticating to Cassandra, such as a Kerberos ticket, or if you want to store passwords in a different location, such as an LDAP directory.

Authorization

The authorizer that’s plugged in by default is org.apache.cassandra.auth.Allow AllAuthorizer. Cassandra also provides a role-based access control (RBAC) capability, which allows you to create roles and assign permissions to these roles.

Conclusion

In this post, we discussed several patterns for running Cassandra in the AWS Cloud. This post describes how you can manage Cassandra databases running on Amazon EC2. AWS also provides managed offerings for a number of databases. To learn more, see Purpose-built databases for all your application needs.

If you have questions or suggestions, please comment below.


Additional Reading

If you found this post useful, be sure to check out Analyze Your Data on Amazon DynamoDB with Apache Spark and Analysis of Top-N DynamoDB Objects using Amazon Athena and Amazon QuickSight.


About the Authors

Prasad Alle is a Senior Big Data Consultant with AWS Professional Services. He spends his time leading and building scalable, reliable Big data, Machine learning, Artificial Intelligence and IoT solutions for AWS Enterprise and Strategic customers. His interests extend to various technologies such as Advanced Edge Computing, Machine learning at Edge. In his spare time, he enjoys spending time with his family.

 

 

 

Provanshu Dey is a Senior IoT Consultant with AWS Professional Services. He works on highly scalable and reliable IoT, data and machine learning solutions with our customers. In his spare time, he enjoys spending time with his family and tinkering with electronics & gadgets.

 

 

 

Longer Resource IDs in 2018 for Amazon EC2, Amazon EBS, and Amazon VPC

Post Syndicated from Nathan Taber original https://aws.amazon.com/blogs/compute/longer-resource-ids-in-2018-for-amazon-ec2-amazon-ebs-and-amazon-vpc/

This post contributed by Laura Thomson, Senior Product Manager for Amazon EC2.

As you start planning for the new year, I want to give you a heads up that Amazon EC2 is migrating to longer format, 17-character resource IDs. Instances and volumes currently already receive this ID format. Beginning in July 2018, all newly created EC2 resources receive longer IDs as well.

The switch-over will not impact most customers. However, I wanted to make you aware so that you can schedule time at the beginning of 2018 to test your systems with the longer format. If you have a system that parses or stores resource IDs, you may be affected.

From January 2018 through the end of June 2018, there will be a transition period, during which you can opt in to receive longer IDs. To make this easy, AWS will provide an option to opt in with one click for all regions, resources, and users. AWS will also provide more granular controls via API operations and console support. More information on the opt-in process will be sent out in January.

We need to do this given how fast AWS is continuing to grow. We will start to run low on IDs for certain resources within a year or so. In order to enable the long-term, uninterrupted creation of new resources, we need to move to the longer ID format.

The current format is a resource identifier followed by an eight-character string. The new format is the same resource identifier followed by a 17-character string. For example, your current VPCs have resource identifiers such as “vpc-1234abc0”. Starting July 2018, new VPCs will be assigned an identifier such as “vpc-1234567890abcdef0”. You can continue using the existing eight-character IDs for your existing resources, which won’t change and will continue to be supported. Only new resources will receive the 17-character IDs and only after you opt in to the new format.

For more information, see Longer EC2, EBS, and Storage Gateway Resource IDs.  If you have any questions, contact AWS Support on the community forums and via AWS Support.

How to Patch, Inspect, and Protect Microsoft Windows Workloads on AWS—Part 2

Post Syndicated from Koen van Blijderveen original https://aws.amazon.com/blogs/security/how-to-patch-inspect-and-protect-microsoft-windows-workloads-on-aws-part-2/

Yesterday in Part 1 of this blog post, I showed you how to:

  1. Launch an Amazon EC2 instance with an AWS Identity and Access Management (IAM) role, an Amazon Elastic Block Store (Amazon EBS) volume, and tags that Amazon EC2 Systems Manager (Systems Manager) and Amazon Inspector use.
  2. Configure Systems Manager to install the Amazon Inspector agent and patch your EC2 instances.

Today in Steps 3 and 4, I show you how to:

  1. Take Amazon EBS snapshots using Amazon EBS Snapshot Scheduler to automate snapshots based on instance tags.
  2. Use Amazon Inspector to check if your EC2 instances running Microsoft Windows contain any common vulnerabilities and exposures (CVEs).

To catch up on Steps 1 and 2, see yesterday’s blog post.

Step 3: Take EBS snapshots using EBS Snapshot Scheduler

In this section, I show you how to use EBS Snapshot Scheduler to take snapshots of your instances at specific intervals. To do this, I will show you how to:

  • Determine the schedule for EBS Snapshot Scheduler by providing you with best practices.
  • Deploy EBS Snapshot Scheduler by using AWS CloudFormation.
  • Tag your EC2 instances so that EBS Snapshot Scheduler backs up your instances when you want them backed up.

In addition to making sure your EC2 instances have all the available operating system patches applied on a regular schedule, you should take snapshots of the EBS storage volumes attached to your EC2 instances. Taking regular snapshots allows you to restore your data to a previous state quickly and cost effectively. With Amazon EBS snapshots, you pay only for the actual data you store, and snapshots save only the data that has changed since the previous snapshot, which minimizes your cost. You will use EBS Snapshot Scheduler to make regular snapshots of your EC2 instance. EBS Snapshot Scheduler takes advantage of other AWS services including CloudFormation, Amazon DynamoDB, and AWS Lambda to make backing up your EBS volumes simple.

Determine the schedule

As a best practice, you should back up your data frequently during the hours when your data changes the most. This reduces the amount of data you lose if you have to restore from a snapshot. For the purposes of this blog post, the data for my instances changes the most between the business hours of 9:00 A.M. to 5:00 P.M. Pacific Time. During these hours, I will make snapshots hourly to minimize data loss.

In addition to backing up frequently, another best practice is to establish a strategy for retention. This will vary based on how you need to use the snapshots. If you have compliance requirements to be able to restore for auditing, your needs may be different than if you are able to detect data corruption within three hours and simply need to restore to something that limits data loss to five hours. EBS Snapshot Scheduler enables you to specify the retention period for your snapshots. For this post, I only need to keep snapshots for recent business days. To account for weekends, I will set my retention period to three days, which is down from the default of 15 days when deploying EBS Snapshot Scheduler.

Deploy EBS Snapshot Scheduler

In Step 1 of Part 1 of this post, I showed how to configure an EC2 for Windows Server 2012 R2 instance with an EBS volume. You will use EBS Snapshot Scheduler to take eight snapshots each weekday of your EC2 instance’s EBS volumes:

  1. Navigate to the EBS Snapshot Scheduler deployment page and choose Launch Solution. This takes you to the CloudFormation console in your account. The Specify an Amazon S3 template URL option is already selected and prefilled. Choose Next on the Select Template page.
  2. On the Specify Details page, retain all default parameters except for AutoSnapshotDeletion. Set AutoSnapshotDeletion to Yes to ensure that old snapshots are periodically deleted. The default retention period is 15 days (you will specify a shorter value on your instance in the next subsection).
  3. Choose Next twice to move to the Review step, and start deployment by choosing the I acknowledge that AWS CloudFormation might create IAM resources check box and then choosing Create.

Tag your EC2 instances

EBS Snapshot Scheduler takes a few minutes to deploy. While waiting for its deployment, you can start to tag your instance to define its schedule. EBS Snapshot Scheduler reads tag values and looks for four possible custom parameters in the following order:

  • <snapshot time> – Time in 24-hour format with no colon.
  • <retention days> – The number of days (a positive integer) to retain the snapshot before deletion, if set to automatically delete snapshots.
  • <time zone> – The time zone of the times specified in <snapshot time>.
  • <active day(s)>all, weekdays, or mon, tue, wed, thu, fri, sat, and/or sun.

Because you want hourly backups on weekdays between 9:00 A.M. and 5:00 P.M. Pacific Time, you need to configure eight tags—one for each hour of the day. You will add the eight tags shown in the following table to your EC2 instance.

TagValue
scheduler:ebs-snapshot:09000900;3;utc;weekdays
scheduler:ebs-snapshot:10001000;3;utc;weekdays
scheduler:ebs-snapshot:11001100;3;utc;weekdays
scheduler:ebs-snapshot:12001200;3;utc;weekdays
scheduler:ebs-snapshot:13001300;3;utc;weekdays
scheduler:ebs-snapshot:14001400;3;utc;weekdays
scheduler:ebs-snapshot:15001500;3;utc;weekdays
scheduler:ebs-snapshot:16001600;3;utc;weekdays

Next, you will add these tags to your instance. If you want to tag multiple instances at once, you can use Tag Editor instead. To add the tags in the preceding table to your EC2 instance:

  1. Navigate to your EC2 instance in the EC2 console and choose Tags in the navigation pane.
  2. Choose Add/Edit Tags and then choose Create Tag to add all the tags specified in the preceding table.
  3. Confirm you have added the tags by choosing Save. After adding these tags, navigate to your EC2 instance in the EC2 console. Your EC2 instance should look similar to the following screenshot.
    Screenshot of how your EC2 instance should look in the console
  4. After waiting a couple of hours, you can see snapshots beginning to populate on the Snapshots page of the EC2 console.Screenshot of snapshots beginning to populate on the Snapshots page of the EC2 console
  5. To check if EBS Snapshot Scheduler is active, you can check the CloudWatch rule that runs the Lambda function. If the clock icon shown in the following screenshot is green, the scheduler is active. If the clock icon is gray, the rule is disabled and does not run. You can enable or disable the rule by selecting it, choosing Actions, and choosing Enable or Disable. This also allows you to temporarily disable EBS Snapshot Scheduler.Screenshot of checking to see if EBS Snapshot Scheduler is active
  1. You can also monitor when EBS Snapshot Scheduler has run by choosing the name of the CloudWatch rule as shown in the previous screenshot and choosing Show metrics for the rule.Screenshot of monitoring when EBS Snapshot Scheduler has run by choosing the name of the CloudWatch rule

If you want to restore and attach an EBS volume, see Restoring an Amazon EBS Volume from a Snapshot and Attaching an Amazon EBS Volume to an Instance.

Step 4: Use Amazon Inspector

In this section, I show you how to you use Amazon Inspector to scan your EC2 instance for common vulnerabilities and exposures (CVEs) and set up Amazon SNS notifications. To do this I will show you how to:

  • Install the Amazon Inspector agent by using EC2 Run Command.
  • Set up notifications using Amazon SNS to notify you of any findings.
  • Define an Amazon Inspector target and template to define what assessment to perform on your EC2 instance.
  • Schedule Amazon Inspector assessment runs to assess your EC2 instance on a regular interval.

Amazon Inspector can help you scan your EC2 instance using prebuilt rules packages, which are built and maintained by AWS. These prebuilt rules packages tell Amazon Inspector what to scan for on the EC2 instances you select. Amazon Inspector provides the following prebuilt packages for Microsoft Windows Server 2012 R2:

  • Common Vulnerabilities and Exposures
  • Center for Internet Security Benchmarks
  • Runtime Behavior Analysis

In this post, I’m focused on how to make sure you keep your EC2 instances patched, backed up, and inspected for common vulnerabilities and exposures (CVEs). As a result, I will focus on how to use the CVE rules package and use your instance tags to identify the instances on which to run the CVE rules. If your EC2 instance is fully patched using Systems Manager, as described earlier, you should not have any findings with the CVE rules package. Regardless, as a best practice I recommend that you use Amazon Inspector as an additional layer for identifying any unexpected failures. This involves using Amazon CloudWatch to set up weekly Amazon Inspector scans, and configuring Amazon Inspector to notify you of any findings through SNS topics. By acting on the notifications you receive, you can respond quickly to any CVEs on any of your EC2 instances to help ensure that malware using known CVEs does not affect your EC2 instances. In a previous blog post, Eric Fitzgerald showed how to remediate Amazon Inspector security findings automatically.

Install the Amazon Inspector agent

To install the Amazon Inspector agent, you will use EC2 Run Command, which allows you to run any command on any of your EC2 instances that have the Systems Manager agent with an attached IAM role that allows access to Systems Manager.

  1. Choose Run Command under Systems Manager Services in the navigation pane of the EC2 console. Then choose Run a command.
    Screenshot of choosing "Run a command"
  2. To install the Amazon Inspector agent, you will use an AWS managed and provided command document that downloads and installs the agent for you on the selected EC2 instance. Choose AmazonInspector-ManageAWSAgent. To choose the target EC2 instance where this command will be run, use the tag you previously assigned to your EC2 instance, Patch Group, with a value of Windows Servers. For this example, set the concurrent installations to 1 and tell Systems Manager to stop after 5 errors.
    Screenshot of installing the Amazon Inspector agent
  3. Retain the default values for all other settings on the Run a command page and choose Run. Back on the Run Command page, you can see if the command that installed the Amazon Inspector agent executed successfully on all selected EC2 instances.
    Screenshot showing that the command that installed the Amazon Inspector agent executed successfully on all selected EC2 instances

Set up notifications using Amazon SNS

Now that you have installed the Amazon Inspector agent, you will set up an SNS topic that will notify you of any findings after an Amazon Inspector run.

To set up an SNS topic:

  1. In the AWS Management Console, choose Simple Notification Service under Messaging in the Services menu.
  2. Choose Create topic, name your topic (only alphanumeric characters, hyphens, and underscores are allowed) and give it a display name to ensure you know what this topic does (I’ve named mine Inspector). Choose Create topic.
    "Create new topic" page
  3. To allow Amazon Inspector to publish messages to your new topic, choose Other topic actions and choose Edit topic policy.
  4. For Allow these users to publish messages to this topic and Allow these users to subscribe to this topic, choose Only these AWS users. Type the following ARN for the US East (N. Virginia) Region in which you are deploying the solution in this post: arn:aws:iam::316112463485:root. This is the ARN of Amazon Inspector itself. For the ARNs of Amazon Inspector in other AWS Regions, see Setting Up an SNS Topic for Amazon Inspector Notifications (Console). Amazon Resource Names (ARNs) uniquely identify AWS resources across all of AWS.
    Screenshot of editing the topic policy
  5. To receive notifications from Amazon Inspector, subscribe to your new topic by choosing Create subscription and adding your email address. After confirming your subscription by clicking the link in the email, the topic should display your email address as a subscriber. Later, you will configure the Amazon Inspector template to publish to this topic.
    Screenshot of subscribing to the new topic

Define an Amazon Inspector target and template

Now that you have set up the notification topic by which Amazon Inspector can notify you of findings, you can create an Amazon Inspector target and template. A target defines which EC2 instances are in scope for Amazon Inspector. A template defines which packages to run, for how long, and on which target.

To create an Amazon Inspector target:

  1. Navigate to the Amazon Inspector console and choose Get started. At the time of writing this blog post, Amazon Inspector is available in the US East (N. Virginia), US West (N. California), US West (Oregon), EU (Ireland), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Sydney), and Asia Pacific (Tokyo) Regions.
  2. For Amazon Inspector to be able to collect the necessary data from your EC2 instance, you must create an IAM service role for Amazon Inspector. Amazon Inspector can create this role for you if you choose Choose or create role and confirm the role creation by choosing Allow.
    Screenshot of creating an IAM service role for Amazon Inspector
  3. Amazon Inspector also asks you to tag your EC2 instance and install the Amazon Inspector agent. You already performed these steps in Part 1 of this post, so you can proceed by choosing Next. To define the Amazon Inspector target, choose the previously used Patch Group tag with a Value of Windows Servers. This is the same tag that you used to define the targets for patching. Then choose Next.
    Screenshot of defining the Amazon Inspector target
  4. Now, define your Amazon Inspector template, and choose a name and the package you want to run. For this post, use the Common Vulnerabilities and Exposures package and choose the default duration of 1 hour. As you can see, the package has a version number, so always select the latest version of the rules package if multiple versions are available.
    Screenshot of defining an assessment template
  5. Configure Amazon Inspector to publish to your SNS topic when findings are reported. You can also choose to receive a notification of a started run, a finished run, or changes in the state of a run. For this blog post, you want to receive notifications if there are any findings. To start, choose Assessment Templates from the Amazon Inspector console and choose your newly created Amazon Inspector assessment template. Choose the icon below SNS topics (see the following screenshot).
    Screenshot of choosing an assessment template
  6. A pop-up appears in which you can choose the previously created topic and the events about which you want SNS to notify you (choose Finding reported).
    Screenshot of choosing the previously created topic and the events about which you want SNS to notify you

Schedule Amazon Inspector assessment runs

The last step in using Amazon Inspector to assess for CVEs is to schedule the Amazon Inspector template to run using Amazon CloudWatch Events. This will make sure that Amazon Inspector assesses your EC2 instance on a regular basis. To do this, you need the Amazon Inspector template ARN, which you can find under Assessment templates in the Amazon Inspector console. CloudWatch Events can run your Amazon Inspector assessment at an interval you define using a Cron-based schedule. Cron is a well-known scheduling agent that is widely used on UNIX-like operating systems and uses the following syntax for CloudWatch Events.

Image of Cron schedule

All scheduled events use a UTC time zone, and the minimum precision for schedules is one minute. For more information about scheduling CloudWatch Events, see Schedule Expressions for Rules.

To create the CloudWatch Events rule:

  1. Navigate to the CloudWatch console, choose Events, and choose Create rule.
    Screenshot of starting to create a rule in the CloudWatch Events console
  2. On the next page, specify if you want to invoke your rule based on an event pattern or a schedule. For this blog post, you will select a schedule based on a Cron expression.
  3. You can schedule the Amazon Inspector assessment any time you want using the Cron expression, or you can use the Cron expression I used in the following screenshot, which will run the Amazon Inspector assessment every Sunday at 10:00 P.M. GMT.
    Screenshot of scheduling an Amazon Inspector assessment with a Cron expression
  4. Choose Add target and choose Inspector assessment template from the drop-down menu. Paste the ARN of the Amazon Inspector template you previously created in the Amazon Inspector console in the Assessment template box and choose Create a new role for this specific resource. This new role is necessary so that CloudWatch Events has the necessary permissions to start the Amazon Inspector assessment. CloudWatch Events will automatically create the new role and grant the minimum set of permissions needed to run the Amazon Inspector assessment. To proceed, choose Configure details.
    Screenshot of adding a target
  5. Next, give your rule a name and a description. I suggest using a name that describes what the rule does, as shown in the following screenshot.
  6. Finish the wizard by choosing Create rule. The rule should appear in the Events – Rules section of the CloudWatch console.
    Screenshot of completing the creation of the rule
  7. To confirm your CloudWatch Events rule works, wait for the next time your CloudWatch Events rule is scheduled to run. For testing purposes, you can choose your CloudWatch Events rule and choose Edit to change the schedule to run it sooner than scheduled.
    Screenshot of confirming the CloudWatch Events rule works
  8. Now navigate to the Amazon Inspector console to confirm the launch of your first assessment run. The Start time column shows you the time each assessment started and the Status column the status of your assessment. In the following screenshot, you can see Amazon Inspector is busy Collecting data from the selected assessment targets.
    Screenshot of confirming the launch of the first assessment run

You have concluded the last step of this blog post by setting up a regular scan of your EC2 instance with Amazon Inspector and a notification that will let you know if your EC2 instance is vulnerable to any known CVEs. In a previous Security Blog post, Eric Fitzgerald explained How to Remediate Amazon Inspector Security Findings Automatically. Although that blog post is for Linux-based EC2 instances, the post shows that you can learn about Amazon Inspector findings in other ways than email alerts.

Conclusion

In this two-part blog post, I showed how to make sure you keep your EC2 instances up to date with patching, how to back up your instances with snapshots, and how to monitor your instances for CVEs. Collectively these measures help to protect your instances against common attack vectors that attempt to exploit known vulnerabilities. In Part 1, I showed how to configure your EC2 instances to make it easy to use Systems Manager, EBS Snapshot Scheduler, and Amazon Inspector. I also showed how to use Systems Manager to schedule automatic patches to keep your instances current in a timely fashion. In Part 2, I showed you how to take regular snapshots of your data by using EBS Snapshot Scheduler and how to use Amazon Inspector to check if your EC2 instances running Microsoft Windows contain any common vulnerabilities and exposures (CVEs).

If you have comments about today’s or yesterday’s post, submit them in the “Comments” section below. If you have questions about or issues implementing any part of this solution, start a new thread on the Amazon EC2 forum or the Amazon Inspector forum, or contact AWS Support.

– Koen

How to Patch, Inspect, and Protect Microsoft Windows Workloads on AWS—Part 1

Post Syndicated from Koen van Blijderveen original https://aws.amazon.com/blogs/security/how-to-patch-inspect-and-protect-microsoft-windows-workloads-on-aws-part-1/

Most malware tries to compromise your systems by using a known vulnerability that the maker of the operating system has already patched. To help prevent malware from affecting your systems, two security best practices are to apply all operating system patches to your systems and actively monitor your systems for missing patches. In case you do need to recover from a malware attack, you should make regular backups of your data.

In today’s blog post (Part 1 of a two-part post), I show how to keep your Amazon EC2 instances that run Microsoft Windows up to date with the latest security patches by using Amazon EC2 Systems Manager. Tomorrow in Part 2, I show how to take regular snapshots of your data by using Amazon EBS Snapshot Scheduler and how to use Amazon Inspector to check if your EC2 instances running Microsoft Windows contain any common vulnerabilities and exposures (CVEs).

What you should know first

To follow along with the solution in this post, you need one or more EC2 instances. You may use existing instances or create new instances. For the blog post, I assume this is an EC2 for Microsoft Windows Server 2012 R2 instance installed from the Amazon Machine Images (AMIs). If you are not familiar with how to launch an EC2 instance, see Launching an Instance. I also assume you launched or will launch your instance in a private subnet. A private subnet is not directly accessible via the internet, and access to it requires either a VPN connection to your on-premises network or a jump host in a public subnet (a subnet with access to the internet). You must make sure that the EC2 instance can connect to the internet using a network address translation (NAT) instance or NAT gateway to communicate with Systems Manager and Amazon Inspector. The following diagram shows how you should structure your Amazon Virtual Private Cloud (VPC). You should also be familiar with Restoring an Amazon EBS Volume from a Snapshot and Attaching an Amazon EBS Volume to an Instance.

Later on, you will assign tasks to a maintenance window to patch your instances with Systems Manager. To do this, the AWS Identity and Access Management (IAM) user you are using for this post must have the iam:PassRole permission. This permission allows this IAM user to assign tasks to pass their own IAM permissions to the AWS service. In this example, when you assign a task to a maintenance window, IAM passes your credentials to Systems Manager. This safeguard ensures that the user cannot use the creation of tasks to elevate their IAM privileges because their own IAM privileges limit which tasks they can run against an EC2 instance. You should also authorize your IAM user to use EC2, Amazon Inspector, Amazon CloudWatch, and Systems Manager. You can achieve this by attaching the following AWS managed policies to the IAM user you are using for this example: AmazonInspectorFullAccess, AmazonEC2FullAccess, and AmazonSSMFullAccess.

Architectural overview

The following diagram illustrates the components of this solution’s architecture.

Diagram showing the components of this solution's architecture

For this blog post, Microsoft Windows EC2 is Amazon EC2 for Microsoft Windows Server 2012 R2 instances with attached Amazon Elastic Block Store (Amazon EBS) volumes, which are running in your VPC. These instances may be standalone Windows instances running your Windows workloads, or you may have joined them to an Active Directory domain controller. For instances joined to a domain, you can be using Active Directory running on an EC2 for Windows instance, or you can use AWS Directory Service for Microsoft Active Directory.

Amazon EC2 Systems Manager is a scalable tool for remote management of your EC2 instances. You will use the Systems Manager Run Command to install the Amazon Inspector agent. The agent enables EC2 instances to communicate with the Amazon Inspector service and run assessments, which I explain in detail later in this blog post. You also will create a Systems Manager association to keep your EC2 instances up to date with the latest security patches.

You can use the EBS Snapshot Scheduler to schedule automated snapshots at regular intervals. You will use it to set up regular snapshots of your Amazon EBS volumes. EBS Snapshot Scheduler is a prebuilt solution by AWS that you will deploy in your AWS account. With Amazon EBS snapshots, you pay only for the actual data you store. Snapshots save only the data that has changed since the previous snapshot, which minimizes your cost.

You will use Amazon Inspector to run security assessments on your EC2 for Windows Server instance. In this post, I show how to assess if your EC2 for Windows Server instance is vulnerable to any of the more than 50,000 CVEs registered with Amazon Inspector.

In today’s and tomorrow’s posts, I show you how to:

  1. Launch an EC2 instance with an IAM role, Amazon EBS volume, and tags that Systems Manager and Amazon Inspector will use.
  2. Configure Systems Manager to install the Amazon Inspector agent and patch your EC2 instances.
  3. Take EBS snapshots by using EBS Snapshot Scheduler to automate snapshots based on instance tags.
  4. Use Amazon Inspector to check if your EC2 instances running Microsoft Windows contain any common vulnerabilities and exposures (CVEs).

Step 1: Launch an EC2 instance

In this section, I show you how to launch your EC2 instances so that you can use Systems Manager with the instances and use instance tags with EBS Snapshot Scheduler to automate snapshots. This requires three things:

  • Create an IAM role for Systems Manager before launching your EC2 instance.
  • Launch your EC2 instance with Amazon EBS and the IAM role for Systems Manager.
  • Add tags to instances so that you can automate policies for which instances you take snapshots of and when.

Create an IAM role for Systems Manager

Before launching your EC2 instance, I recommend that you first create an IAM role for Systems Manager, which you will use to update the EC2 instance you will launch. AWS already provides a preconfigured policy that you can use for your new role, and it is called AmazonEC2RoleforSSM.

  1. Sign in to the IAM console and choose Roles in the navigation pane. Choose Create new role.
    Screenshot of choosing "Create role"
  2. In the role-creation workflow, choose AWS service > EC2 > EC2 to create a role for an EC2 instance.
    Screenshot of creating a role for an EC2 instance
  3. Choose the AmazonEC2RoleforSSM policy to attach it to the new role you are creating.
    Screenshot of attaching the AmazonEC2RoleforSSM policy to the new role you are creating
  4. Give the role a meaningful name (I chose EC2SSM) and description, and choose Create role.
    Screenshot of giving the role a name and description

Launch your EC2 instance

To follow along, you need an EC2 instance that is running Microsoft Windows Server 2012 R2 and that has an Amazon EBS volume attached. You can use any existing instance you may have or create a new instance.

When launching your new EC2 instance, be sure that:

  • The operating system is Microsoft Windows Server 2012 R2.
  • You attach at least one Amazon EBS volume to the EC2 instance.
  • You attach the newly created IAM role (EC2SSM).
  • The EC2 instance can connect to the internet through a network address translation (NAT) gateway or a NAT instance.
  • You create the tags shown in the following screenshot (you will use them later).

If you are using an already launched EC2 instance, you can attach the newly created role as described in Easily Replace or Attach an IAM Role to an Existing EC2 Instance by Using the EC2 Console.

Add tags

The final step of configuring your EC2 instances is to add tags. You will use these tags to configure Systems Manager in Step 2 of this blog post and to configure Amazon Inspector in Part 2. For this example, I add a tag key, Patch Group, and set the value to Windows Servers. I could have other groups of EC2 instances that I treat differently by having the same tag key but a different tag value. For example, I might have a collection of other servers with the Patch Group tag key with a value of IAS Servers.

Screenshot of adding tags

Note: You must wait a few minutes until the EC2 instance becomes available before you can proceed to the next section.

At this point, you now have at least one EC2 instance you can use to configure Systems Manager, use EBS Snapshot Scheduler, and use Amazon Inspector.

Note: If you have a large number of EC2 instances to tag, you may want to use the EC2 CreateTags API rather than manually apply tags to each instance.

Step 2: Configure Systems Manager

In this section, I show you how to use Systems Manager to apply operating system patches to your EC2 instances, and how to manage patch compliance.

To start, I will provide some background information about Systems Manager. Then, I will cover how to:

  • Create the Systems Manager IAM role so that Systems Manager is able to perform patch operations.
  • Associate a Systems Manager patch baseline with your instance to define which patches Systems Manager should apply.
  • Define a maintenance window to make sure Systems Manager patches your instance when you tell it to.
  • Monitor patch compliance to verify the patch state of your instances.

Systems Manager is a collection of capabilities that helps you automate management tasks for AWS-hosted instances on EC2 and your on-premises servers. In this post, I use Systems Manager for two purposes: to run remote commands and apply operating system patches. To learn about the full capabilities of Systems Manager, see What Is Amazon EC2 Systems Manager?

Patch management is an important measure to prevent malware from infecting your systems. Most malware attacks look for vulnerabilities that are publicly known and in most cases are already patched by the maker of the operating system. These publicly known vulnerabilities are well documented and therefore easier for an attacker to exploit than having to discover a new vulnerability.

Patches for these new vulnerabilities are available through Systems Manager within hours after Microsoft releases them. There are two prerequisites to use Systems Manager to apply operating system patches. First, you must attach the IAM role you created in the previous section, EC2SSM, to your EC2 instance. Second, you must install the Systems Manager agent on your EC2 instance. If you have used a recent Microsoft Windows Server 2012 R2 AMI published by AWS, Amazon has already installed the Systems Manager agent on your EC2 instance. You can confirm this by logging in to an EC2 instance and looking for Amazon SSM Agent under Programs and Features in Windows. To install the Systems Manager agent on an instance that does not have the agent preinstalled or if you want to use the Systems Manager agent on your on-premises servers, see the documentation about installing the Systems Manager agent. If you forgot to attach the newly created role when launching your EC2 instance or if you want to attach the role to already running EC2 instances, see Attach an AWS IAM Role to an Existing Amazon EC2 Instance by Using the AWS CLI or use the AWS Management Console.

To make sure your EC2 instance receives operating system patches from Systems Manager, you will use the default patch baseline provided and maintained by AWS, and you will define a maintenance window so that you control when your EC2 instances should receive patches. For the maintenance window to be able to run any tasks, you also must create a new role for Systems Manager. This role is a different kind of role than the one you created earlier: Systems Manager will use this role instead of EC2. Earlier we created the EC2SSM role with the AmazonEC2RoleforSSM policy, which allowed the Systems Manager agent on our instance to communicate with the Systems Manager service. Here we need a new role with the policy AmazonSSMMaintenanceWindowRole to make sure the Systems Manager service is able to execute commands on our instance.

Create the Systems Manager IAM role

To create the new IAM role for Systems Manager, follow the same procedure as in the previous section, but in Step 3, choose the AmazonSSMMaintenanceWindowRole policy instead of the previously selected AmazonEC2RoleforSSM policy.

Screenshot of creating the new IAM role for Systems Manager

Finish the wizard and give your new role a recognizable name. For example, I named my role MaintenanceWindowRole.

Screenshot of finishing the wizard and giving your new role a recognizable name

By default, only EC2 instances can assume this new role. You must update the trust policy to enable Systems Manager to assume this role.

To update the trust policy associated with this new role:

  1. Navigate to the IAM console and choose Roles in the navigation pane.
  2. Choose MaintenanceWindowRole and choose the Trust relationships tab. Then choose Edit trust relationship.
  3. Update the policy document by copying the following policy and pasting it in the Policy Document box. As you can see, I have added the ssm.amazonaws.com service to the list of allowed Principals that can assume this role. Choose Update Trust Policy.
    {
       "Version":"2012-10-17",
       "Statement":[
          {
             "Sid":"",
             "Effect":"Allow",
             "Principal":{
                "Service":[
                   "ec2.amazonaws.com",
                   "ssm.amazonaws.com"
               ]
             },
             "Action":"sts:AssumeRole"
          }
       ]
    }

Associate a Systems Manager patch baseline with your instance

Next, you are going to associate a Systems Manager patch baseline with your EC2 instance. A patch baseline defines which patches Systems Manager should apply. You will use the default patch baseline that AWS manages and maintains. Before you can associate the patch baseline with your instance, though, you must determine if Systems Manager recognizes your EC2 instance.

Navigate to the EC2 console, scroll down to Systems Manager Shared Resources in the navigation pane, and choose Managed Instances. Your new EC2 instance should be available there. If your instance is missing from the list, verify the following:

  1. Go to the EC2 console and verify your instance is running.
  2. Select your instance and confirm you attached the Systems Manager IAM role, EC2SSM.
  3. Make sure that you deployed a NAT gateway in your public subnet to ensure your VPC reflects the diagram at the start of this post so that the Systems Manager agent can connect to the Systems Manager internet endpoint.
  4. Check the Systems Manager Agent logs for any errors.

Now that you have confirmed that Systems Manager can manage your EC2 instance, it is time to associate the AWS maintained patch baseline with your EC2 instance:

  1. Choose Patch Baselines under Systems Manager Services in the navigation pane of the EC2 console.
  2. Choose the default patch baseline as highlighted in the following screenshot, and choose Modify Patch Groups in the Actions drop-down.
    Screenshot of choosing Modify Patch Groups in the Actions drop-down
  3. In the Patch group box, enter the same value you entered under the Patch Group tag of your EC2 instance in “Step 1: Configure your EC2 instance.” In this example, the value I enter is Windows Servers. Choose the check mark icon next to the patch group and choose Close.Screenshot of modifying the patch group

Define a maintenance window

Now that you have successfully set up a role and have associated a patch baseline with your EC2 instance, you will define a maintenance window so that you can control when your EC2 instances should receive patches. By creating multiple maintenance windows and assigning them to different patch groups, you can make sure your EC2 instances do not all reboot at the same time. The Patch Group resource tag you defined earlier will determine to which patch group an instance belongs.

To define a maintenance window:

  1. Navigate to the EC2 console, scroll down to Systems Manager Shared Resources in the navigation pane, and choose Maintenance Windows. Choose Create a Maintenance Window.
    Screenshot of starting to create a maintenance window in the Systems Manager console
  2. Select the Cron schedule builder to define the schedule for the maintenance window. In the example in the following screenshot, the maintenance window will start every Saturday at 10:00 P.M. UTC.
  3. To specify when your maintenance window will end, specify the duration. In this example, the four-hour maintenance window will end on the following Sunday morning at 2:00 A.M. UTC (in other words, four hours after it started).
  4. Systems manager completes all tasks that are in process, even if the maintenance window ends. In my example, I am choosing to prevent new tasks from starting within one hour of the end of my maintenance window because I estimated my patch operations might take longer than one hour to complete. Confirm the creation of the maintenance window by choosing Create maintenance window.
    Screenshot of completing all boxes in the maintenance window creation process
  5. After creating the maintenance window, you must register the EC2 instance to the maintenance window so that Systems Manager knows which EC2 instance it should patch in this maintenance window. To do so, choose Register new targets on the Targets tab of your newly created maintenance window. You can register your targets by using the same Patch Group tag you used before to associate the EC2 instance with the AWS-provided patch baseline.
    Screenshot of registering new targets
  6. Assign a task to the maintenance window that will install the operating system patches on your EC2 instance:
    1. Open Maintenance Windows in the EC2 console, select your previously created maintenance window, choose the Tasks tab, and choose Register run command task from the Register new task drop-down.
    2. Choose the AWS-RunPatchBaseline document from the list of available documents.
    3. For Parameters:
      1. For Role, choose the role you created previously (called MaintenanceWindowRole).
      2. For Execute on, specify how many EC2 instances Systems Manager should patch at the same time. If you have a large number of EC2 instances and want to patch all EC2 instances within the defined time, make sure this number is not too low. For example, if you have 1,000 EC2 instances, a maintenance window of 4 hours, and 2 hours’ time for patching, make this number at least 500.
      3. For Stop after, specify after how many errors Systems Manager should stop.
      4. For Operation, choose Install to make sure to install the patches.
        Screenshot of stipulating maintenance window parameters

Now, you must wait for the maintenance window to run at least once according to the schedule you defined earlier. Note that if you don’t want to wait, you can adjust the schedule to run sooner by choosing Edit maintenance window on the Maintenance Windows page of Systems Manager. If your maintenance window has expired, you can check the status of any maintenance tasks Systems Manager has performed on the Maintenance Windows page of Systems Manager and select your maintenance window.

Screenshot of the maintenance window successfully created

Monitor patch compliance

You also can see the overall patch compliance of all EC2 instances that are part of defined patch groups by choosing Patch Compliance under Systems Manager Services in the navigation pane of the EC2 console. You can filter by Patch Group to see how many EC2 instances within the selected patch group are up to date, how many EC2 instances are missing updates, and how many EC2 instances are in an error state.

Screenshot of monitoring patch compliance

In this section, you have set everything up for patch management on your instance. Now you know how to patch your EC2 instance in a controlled manner and how to check if your EC2 instance is compliant with the patch baseline you have defined. Of course, I recommend that you apply these steps to all EC2 instances you manage.

Summary

In Part 1 of this blog post, I have shown how to configure EC2 instances for use with Systems Manager, EBS Snapshot Scheduler, and Amazon Inspector. I also have shown how to use Systems Manager to keep your Microsoft Windows–based EC2 instances up to date. In Part 2 of this blog post tomorrow, I will show how to take regular snapshots of your data by using EBS Snapshot Scheduler and how to use Amazon Inspector to check if your EC2 instances running Microsoft Windows contain any CVEs.

If you have comments about this post, submit them in the “Comments” section below. If you have questions about or issues implementing this solution, start a new thread on the EC2 forum or the Amazon Inspector forum, or contact AWS Support.

– Koen

Automating Amazon EBS Snapshot Management with AWS Step Functions and Amazon CloudWatch Events

Post Syndicated from Andy Katz original https://aws.amazon.com/blogs/compute/automating-amazon-ebs-snapshot-management-with-aws-step-functions-and-amazon-cloudwatch-events/

Brittany Doncaster, Solutions Architect

Business continuity is important for building mission-critical workloads on AWS. As an AWS customer, you might define recovery point objectives (RPO) and recovery time objectives (RTO) for different tier applications in your business. After the RPO and RTO requirements are defined, it is up to your architects to determine how to meet those requirements.

You probably store persistent data in Amazon EBS volumes, which live within a single Availability Zone. And, following best practices, you take snapshots of your EBS volumes to back up the data on Amazon S3, which provides 11 9’s of durability. If you are following these best practices, then you’ve probably recognized the need to manage the number of snapshots you keep for a particular EBS volume and delete older, unneeded snapshots. Doing this cleanup helps save on storage costs.

Some customers also have policies stating that backups need to be stored a certain number of miles away as part of a disaster recovery (DR) plan. To meet these requirements, customers copy their EBS snapshots to the DR region. Then, the same snapshot management and cleanup has to also be done in the DR region.

All of this snapshot management logic consists of different components. You would first tag your snapshots so you could manage them. Then, determine how many snapshots you currently have for a particular EBS volume and assess that value against a retention rule. If the number of snapshots was greater than your retention value, then you would clean up old snapshots. And finally, you might copy the latest snapshot to your DR region. All these steps are just an example of a simple snapshot management workflow. But how do you automate something like this in AWS? How do you do it without servers?

One of the most powerful AWS services released in 2016 was Amazon CloudWatch Events. It enables you to build event-driven IT automation, based on events happening within your AWS infrastructure. CloudWatch Events integrates with AWS Lambda to let you execute your custom code when one of those events occurs. However, the actions to take based on those events aren’t always composed of a single Lambda function. Instead, your business logic may consist of multiple steps (like in the case of the example snapshot management flow described earlier). And you may want to run those steps in sequence or in parallel. You may also want to have retry logic or exception handling for each step.

AWS Step Functions serves just this purpose―to help you coordinate your functions and microservices. Step Functions enables you to simplify your effort and pull the error handling, retry logic, and workflow logic out of your Lambda code. Step Functions integrates with Lambda to provide a mechanism for building complex serverless applications. Now, you can kick off a Step Functions state machine based on a CloudWatch event.

In this post, I discuss how you can target Step Functions in a CloudWatch Events rule. This allows you to have event-driven snapshot management based on snapshot completion events firing in CloudWatch Event rules.

As an example of what you could do with Step Functions and CloudWatch Events, we’ve developed a reference architecture that performs management of your EBS snapshots.

Automating EBS Snapshot Management with Step Functions

This architecture assumes that you have already set up CloudWatch Events to create the snapshots on a schedule or that you are using some other means of creating snapshots according to your needs.

This architecture covers the pieces of the workflow that need to happen after a snapshot has been created.

  • It creates a CloudWatch Events rule to invoke a Step Functions state machine execution when an EBS snapshot is created.
  • The state machine then tags the snapshot, cleans up the oldest snapshots if the number of snapshots is greater than the defined number to retain, and copies the snapshot to a DR region.
  • When the DR region snapshot copy is completed, another state machine kicks off in the DR region. The new state machine has a similar flow and uses some of the same Lambda code to clean up the oldest snapshots that are greater than the defined number to retain.
  • Also, both state machines demonstrate how you can use Step Functions to handle errors within your workflow. Any errors that are caught during execution result in the execution of a Lambda function that writes a message to an SNS topic. Therefore, if any errors occur, you can subscribe to the SNS topic and get notified.

The following is an architecture diagram of the reference architecture:

Creating the Lambda functions and Step Functions state machines

First, pull the code from GitHub and use the AWS CLI to create S3 buckets for the Lambda code in the primary and DR regions. For this example, assume that the primary region is us-west-2 and the DR region is us-east-2. Run the following commands, replacing the italicized text in <> with your own unique bucket names.

git clone https://github.com/awslabs/aws-step-functions-ebs-snapshot-mgmt.git

cd aws-step-functions-ebs-snapshot-mgmt/

aws s3 mb s3://<primary region bucket name> --region us-west-2

aws s3 mb s3://<DR region bucket name> --region us-east-2

Next, use the Serverless Application Model (SAM), which uses AWS CloudFormation to deploy the Lambda functions and Step Functions state machines in the primary and DR regions. Replace the italicized text in <> with the S3 bucket names that you created earlier.

aws cloudformation package --template-file PrimaryRegionTemplate.yaml --s3-bucket <primary region bucket name>  --output-template-file tempPrimary.yaml --region us-west-2

aws cloudformation deploy --template-file tempPrimary.yaml --stack-name ebsSnapshotMgmtPrimary --capabilities CAPABILITY_IAM --region us-west-2

aws cloudformation package --template-file DR_RegionTemplate.yaml --s3-bucket <DR region bucket name> --output-template-file tempDR.yaml  --region us-east-2

aws cloudformation deploy --template-file tempDR.yaml --stack-name ebsSnapshotMgmtDR --capabilities CAPABILITY_IAM --region us-east-2

CloudWatch event rule verification

The CloudFormation templates deploy the following resources:

  • The Lambda functions that are coordinated by Step Functions
  • The Step Functions state machine
  • The SNS topic
  • The CloudWatch Events rules that trigger the state machine execution

So, all of the CloudWatch event rules have been created for you by performing the preceding commands. The next section demonstrates how you could create the CloudWatch event rule manually. To jump straight to testing the workflow, see the “Testing in your Account” section. Otherwise, you begin by setting up the CloudWatch event rule in the primary region for the createSnapshot event and also the CloudWatch event rule in the DR region for the copySnapshot command.

First, open the CloudWatch console in the primary region.

Choose Create Rule and create a rule for the createSnapshot command, with your newly created Step Function state machine as the target.

For Event Source, choose Event Pattern and specify the following values:

  • Service Name: EC2
  • Event Type: EBS Snapshot Notification
  • Specific Event: createSnapshot

For Target, choose Step Functions state machine, then choose the state machine created by the CloudFormation commands. Choose Create a new role for this specific resource. Your completed rule should look like the following:

Choose Configure Details and give the rule a name and description.

Choose Create Rule. You now have a CloudWatch Events rule that triggers a Step Functions state machine execution when the EBS snapshot creation is complete.

Now, set up the CloudWatch Events rule in the DR region as well. This looks almost same, but is based off the copySnapshot event instead of createSnapshot.

In the upper right corner in the console, switch to your DR region. Choose CloudWatch, Create Rule.

For Event Source, choose Event Pattern and specify the following values:

  • Service Name: EC2
  • Event Type: EBS Snapshot Notification
  • Specific Event: copySnapshot

For Target, choose Step Functions state machine, then select the state machine created by the CloudFormation commands. Choose Create a new role for this specific resource. Your completed rule should look like in the following:

As in the primary region, choose Configure Details and then give this rule a name and description. Complete the creation of the rule.

Testing in your account

To test this setup, open the EC2 console and choose Volumes. Select a volume to snapshot. Choose Actions, Create Snapshot, and then create a snapshot.

This results in a new execution of your state machine in the primary and DR regions. You can view these executions by going to the Step Functions console and selecting your state machine.

From there, you can see the execution of the state machine.

Primary region state machine:

DR region state machine:

I’ve also provided CloudFormation templates that perform all the earlier setup without using git clone and running the CloudFormation commands. Choose the Launch Stack buttons below to launch the primary and DR region stacks in Dublin and Ohio, respectively. From there, you can pick up at the Testing in Your Account section above to finish the example. All of the code for this example architecture is located in the aws-step-functions-ebs-snapshot-mgmt AWSLabs repo.

Launch EBS Snapshot Management into Ireland with CloudFormation
Primary Region eu-west-1 (Ireland)

Launch EBS Snapshot Management into Ohio with CloudFormation
DR Region us-east-2 (Ohio)

Summary

This reference architecture is just an example of how you can use Step Functions and CloudWatch Events to build event-driven IT automation. The possibilities are endless:

  • Use this pattern to perform other common cleanup type jobs such as managing Amazon RDS snapshots, old versions of Lambda functions, or old Amazon ECR images—all triggered by scheduled events.
  • Use Trusted Advisor events to identify unused EC2 instances or EBS volumes, then coordinate actions on them, such as alerting owners, stopping, or snapshotting.

Happy coding and please let me know what useful state machines you build!