All posts by Jeff Barr

AWS Backup – Automate and Centrally Manage Your Backups

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-backup-automate-and-centrally-manage-your-backups/

AWS gives you the power to easily and dynamically create file systems, block storage volumes, relational databases, NoSQL databases, and other resources that store precious data. You can create them on a moment’s notice as the need arises, giving you access to as much storage as you need and opening the door to large-scale cloud migration. When you bring your sensitive data to the cloud, you need to make sure that you continue to meet business and regulatory compliance requirements, and you definitely want to make sure that you are protected against application errors.

While you can build your own backup tools using the snapshot operations built into many of the services that I listed above, creating an enterprise-wide backup strategy and the tools to implement it still takes a lot of work. We are changing that.

New AWS Backup
AWS Backup is designed to help you automate and centrally manage your backups. You can create policy-driven backup plans, monitor the status of on-going backups, verify compliance, and find / restore backups, all using a central console. Using a combination of the existing AWS snapshot operations and new, purpose-built backup operations, Backup backs up EBS volumes, EFS file systems, RDS & Aurora databases, DynamoDB tables, and Storage Gateway volumes to Amazon Simple Storage Service (S3), with the ability to tier older backups to Amazon Glacier. Because Backup includes support for Storage Gateway volumes, you can include your existing, on-premises data in the backups that you create.

Each backup plan includes one or more backup rules. The rules express the backup schedule, frequency, and backup window. Resources to be backed up can be identified explicitly or in a policy-driven fashion using tags. Lifecycle rules control storage tiering and expiration of older backups. Backup gathers the set of snapshots and the metadata that goes along with the snapshots into collections that define a recovery point. You get lots of control: you can define your daily / weekly / monthly backup strategy, rest assured that your critical data is being backed up in accordance with your requirements, and restore that data on an as-needed basis. Backups are grouped into vaults, each encrypted by a KMS key.

Using AWS Backup
You can get started with AWS Backup in minutes. Open the AWS Backup Console and click Create backup plan:

I can build a plan from scratch, start from an existing plan, or define one using JSON. I’ll Build a new plan, and start by giving my plan a name:

Now I create the first rule for my backup plan. I call it MainBackup, indicate that I want it to run daily, define the lifecycle (transition to cold storage after 1 month, expire after 6 months), and select the Default vault:

I can tag the recovery points that are created as a result of this rule, and I can also tag the backup plan itself:

I’m all set, so I click Create plan to move forward:

At this point my plan exists and is ready to run, but it has just one rule and does not have any resource assignments (so there’s nothing to back up):

Now I need to indicate which of my resources are subject to this backup plan. I click Assign resources, and then create one or more resource assignments. Each assignment is named and references an IAM role that is used to create the recovery point. Resources can be denoted by tag or by resource ID, and I can use both in the same assignment. I enter all of the values and click Assign resources to wrap up:

The next step is to wait for the first backup job to run (I cheated by editing my backup window in order to get this post done as quickly as possible). I can peek at the Backup Dashboard to see the overall status:

Backups On Demand
I also have the ability to create a recovery point on demand for any of my resources. I choose the desired resource and designate a vault, then click Create an on-demand backup:

I indicated that I wanted to create the backup right away, so a job is created:

The job runs to completion within minutes:

Inside a Vault
I can also view my collection of vaults, each of which contains multiple recovery points:

I can see the list of recovery points in a vault:

I can inspect a recovery point, and then click Restore to restore my table (in this case):

I’ve shown you the highlights, and you can discover the rest for yourself!

Things to Know
Here are a few things to keep in mind when you are evaluating AWS Backup:

Services – We are launching with support for EBS volumes, RDS databases, DynamoDB tables, EFS file systems, and Storage Gateway volumes. We’ll add support for additional services over time, and welcome your suggestions. Backup uses the existing snapshot operations for all services except EFS file systems.

Programmatic Access – You can access all of the functions that I showed you above using the AWS Command Line Interface (CLI) and the AWS Backup APIs. The APIs are powerful integration points for your existing backup tools and scripts.
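For example, here’s a rough sketch of how a plan like the one above, and an on-demand backup, could be expressed with the CLI. The cron expression, ARNs, and role name are placeholders of my own choosing, not values taken from this walkthrough:

# Create a plan with a daily rule that tiers to cold storage after 30 days and expires after 180
$ aws backup create-backup-plan --backup-plan '{
    "BackupPlanName": "MainBackupPlan",
    "Rules": [{
      "RuleName": "MainBackup",
      "TargetBackupVaultName": "Default",
      "ScheduleExpression": "cron(0 5 ? * * *)",
      "Lifecycle": {"MoveToColdStorageAfterDays": 30, "DeleteAfterDays": 180}
    }]
  }'

# Start an on-demand backup of a single EBS volume into the Default vault
$ aws backup start-backup-job \
    --backup-vault-name Default \
    --resource-arn arn:aws:ec2:us-east-1:123456789012:volume/vol-0123456789abcdef0 \
    --iam-role-arn arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole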

Regions – Backups work within the scope of a particular AWS Region, with plans in the works to enable several different types of cross-region functionality in 2019.

Pricing – You pay the normal AWS charges for backups that are created using the built-in AWS snapshot facilities. For Amazon EFS, there’s a low, per-GB charge for warm storage and an even lower charge for cold storage.

Available Now
AWS Backup is available now and you can start using it today!

Jeff;

Behind the Scenes & Under the Carpet – The CenturyLink Network that Powered AWS re:Invent 2018

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/behind-the-scenes-under-the-carpet-the-centurylink-network-that-powered-aws-reinvent-2018/

If you are a long-time reader, you may have already figured out that I am fascinated by the behind-the-scenes and beneath-the-streets activities that enable and power so much of our modern world. For example, late last year I told you how The AWS Cloud Goes Underground at re:Invent and shared some information about the communication and network infrastructure that was used to provide top-notch connectivity to re:Invent attendees and to those watching the keynotes and live streams from afar.

Today, with re:Invent 2018 in the rear-view mirror (and planning for next year already underway), I would like to tell you how 5-time re:Invent Network Services Provider CenturyLink designed and built a redundant, resilient network that used AWS Direct Connect to provide 180 Gbps of bandwidth and supported over 81,000 devices connected across eight venues. Above the ground, we worked closely with ShowNets to connect their custom network and WiFi deployment in each venue to the infrastructure provided by CenturyLink.

The 2018 re:Invent Network
This year, the network included diverse routes to multiple AWS regions, with a brand-new multi-node metro fiber ring that encompassed the Sands Expo, Wynn Resort, Circus Circus, Mirage, Vdara, Bellagio, Aria, and MGM Grand facilities. Redundant 10 Gbps connections to each venue and to multiple AWS Direct Connect locations were used to ensure high availability. The network was provisioned using CenturyLink Cloud Connect Dynamic Connections.

Here’s a network diagram (courtesy of CenturyLink) that shows the metro fiber ring and the connectivity:

The network did its job, and supported keynotes, live streams, breakout sessions, hands-on labs, hackathons, workshops, and certification exams. Here are the final numbers, as measured on-site at re:Invent 2018:

  • Live Streams – Over 60K views from over 100 countries.
  • Peak Data Transfer – 9.5 Gbps across six 10 Gbps connections.
  • Total Data Transfer – 160 TB.

Thanks again to our Managed Service Partner for building and running the robust network that supported our customers, partners, and employees at re:Invent!

Jeff;

New – Amazon DocumentDB (with MongoDB Compatibility): Fast, Scalable, and Highly Available

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-amazon-documentdb-with-mongodb-compatibility-fast-scalable-and-highly-available/

A glance at the AWS Databases page will show you that we offer an incredibly wide variety of databases, each one purpose-built to address a particular need! In order to help you build the coolest and most powerful applications, you can mix and match relational, key-value, in-memory, graph, time series, and ledger databases.

Introducing Amazon DocumentDB (with MongoDB compatibility)
Today we are launching Amazon DocumentDB (with MongoDB compatibility), a fast, scalable, and highly available document database that is designed to be compatible with your existing MongoDB applications and tools. Amazon DocumentDB uses a purpose-built SSD-based storage layer, with 6x replication across 3 separate Availability Zones. The storage layer is distributed, fault-tolerant, and self-healing, giving you the performance, scalability, and availability needed to run production-scale MongoDB workloads.

Each MongoDB database contains a set of collections. Each collection (similar to a relational database table) contains a set of documents, each in the JSON-like BSON format. For example:

{
  name: "jeff",
  full_name: {first: "jeff", last: "barr"},
  title: "VP, AWS Evangelism",
  email: "[email protected]",
  city: "Seattle",
  foods: ["chocolate", "peanut butter"]
}

Each document can have a unique set of field-value pairs and data; there are no fixed or predefined schemas. The MongoDB API includes the usual CRUD (create, read, update, and delete) operations along with a very rich query model. This is just the tip of the iceberg (the MongoDB API is very powerful and flexible), so check out the list of supported MongoDB operations, data types, and functions to learn more.

All About Amazon DocumentDB
Here’s what you need to know about Amazon DocumentDB:

Compatibility – Amazon DocumentDB is compatible with version 3.6 of MongoDB.

Scalability – Storage can be scaled from 10 GB up to 64 TB in increments of 10 GB. You don’t need to preallocate storage or monitor free space; Amazon DocumentDB will take care of that for you. You can choose between six instance sizes (15.25 GiB to 488 GiB of memory), and you can create up to 15 read replicas. Storage and compute are decoupled and you can scale each one independently and as-needed.

Performance – Amazon DocumentDB stores database changes as a log stream, allowing you to process millions of reads per second with millisecond latency. The storage model provides a nice performance increase without compromising data durability, and greatly enhances overall scalability.

Reliability – The 6-way storage replication ensures high availability. Amazon DocumentDB can failover from a primary to a replica within 30 seconds, and supports MongoDB replica set emulation so applications can handle failover quickly.

Fully Managed – Like the other AWS database services, Amazon DocumentDB is fully managed, with built-in monitoring, fault detection, and failover. You can set up daily snapshot backups, take manual snapshots, and use either one to create a fresh cluster if necessary. You can also do point-in-time restores (with second-level resolution) to any point within the 1-35 day backup retention period.

Secure – You can choose to encrypt your active data, snapshots, and replicas with the KMS key of your choice when you create each of your Amazon DocumentDB clusters. Authentication is enabled by default, as is encryption of data in transit.

Compatible – As I said earlier, Amazon DocumentDB is designed to work with your existing MongoDB applications and tools. Just be sure to use drivers intended for MongoDB 3.4 or newer. Internally, Amazon DocumentDB implements the MongoDB 3.6 API by emulating the responses that a MongoDB client expects from a MongoDB server.

Creating An Amazon DocumentDB (with MongoDB compatibility) Cluster
You can create a cluster from the Console, Command Line, CloudFormation, or by making a call to the CreateDBCluster function. I’ll use the Amazon DocumentDB Console today. I open the console and click Launch Amazon DocumentDB to get started:

I name my cluster, choose the instance class, and specify the number of instances (one is the primary and the rest are replicas). Then I enter a master username and password:

I can use any of the following instance classes for my cluster:

At this point I can click Create cluster to use default settings, or I can click Show advanced settings for additional control. I can choose any desired VPC, subnets, and security group. I can also set the port and parameter group for the cluster:

I can control encryption (enabled by default), set the backup retention period, and establish the backup window for point-in-time restores:

I can also control the maintenance window for my new cluster. Once I am ready I click Create cluster to proceed:

My cluster starts out in creating status, and switches to available very quickly:

As do the instances in the cluster:
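If you prefer the command line, the same setup takes a couple of calls. Here’s a hedged sketch; the identifiers, instance class, and password are placeholders:

# Create the cluster itself
$ aws docdb create-db-cluster \
    --db-cluster-identifier jeff-docdb-cluster \
    --engine docdb \
    --master-username master \
    --master-user-password '<password>'

# Add an instance (the first one becomes the primary)
$ aws docdb create-db-instance \
    --db-cluster-identifier jeff-docdb-cluster \
    --db-instance-identifier jeff-docdb-instance-1 \
    --db-instance-class db.r4.large \
    --engine docdb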

Connecting to a Cluster
With the cluster up and running, I install the mongo shell on an EC2 instance (details depend on your distribution) and fetch a certificate so that I can make a secure connection:

$ wget https://s3.amazonaws.com/rds-downloads/rds-combined-ca-bundle.pem

The console shows me the command that I need to use to make the connection:

I simply customize the command with the password that I specified when I created the cluster:
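The general shape of that command looks like this (the cluster endpoint below is a made-up placeholder; use the one from your own console, along with your own username and password):

$ mongo --ssl \
    --host sample-cluster.cluster-xxxxxxxxxxxx.us-east-1.docdb.amazonaws.com:27017 \
    --sslCAFile rds-combined-ca-bundle.pem \
    --username master --password '<password>'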

From there I can use any of the mongo shell commands to insert, query, and examine data. I inserted some very simple documents and then ran an equally simple query (I’m sure you can do a lot better):

Now Available
Amazon DocumentDB (with MongoDB compatibility) is available now and you can start using it today in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Ireland) Regions. Pricing is based on the instance class, storage consumption for current documents and snapshots, I/O operations, and data transfer.

Jeff;

Western Digital HDD Simulation at Cloud Scale – 2.5 Million HPC Tasks, 40K EC2 Spot Instances

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/western-digital-hdd-simulation-at-cloud-scale-2-5-million-hpc-tasks-40k-ec2-spot-instances/

Earlier this month my colleague Bala Thekkedath published a story about Extreme Scale HPC and talked about how AWS customer Western Digital built a cloud-scale HPC cluster on AWS and used it to simulate crucial elements of upcoming head designs for their next-generation hard disk drives (HDD).

The simulation described in the story encompassed a little over 2.5 million tasks, and ran to completion in just 8 hours on a million-vCPU Amazon EC2 cluster. As Bala shared in his story, much of the simulation work at Western Digital revolves around the need to evaluate different combinations of technologies and solutions that comprise an HDD. The engineers focus on cramming ever-more data into the same space, improving storage capacity and increasing transfer speed in the process. Simulating millions of combinations of materials, energy levels, and rotational speeds allows them to pursue the highest density and the fastest read-write times. Getting the results more quickly allows them to make better decisions and lets them get new products to market more rapidly than before.

Here’s a visualization of Western Digital’s energy-assisted recording process in action. The top stripe represents the magnetism; the middle one represents the added energy (heat); and the bottom one represents the actual data written to the medium via the combination of magnetism and heat:

I recently spoke to my colleagues and to the teams at Western Digital and Univa who worked together to make this record-breaking run a reality. My goal was to find out more about how they prepared for this run, see what they learned, and to share it with you in case you are ready to run a large-scale job of your own.

Ramping Up
About two years ago, the Western Digital team was running clusters as big as 80K vCPUs, powered by EC2 Spot Instances in order to be as cost-effective as possible. They had grown to the 80K vCPU level after repeated, successful runs with 8K, 16K, and 32K vCPUs. After these early successes, they decided to shoot for the moon, push the boundaries, and work toward a one million vCPU run. They knew that this would stress and tax their existing tools, and settled on a find/fix/scale-some-more methodology.

Univa’s Grid Engine is a batch scheduler. It is responsible for keeping track of the available compute resources (EC2 instances) and dispatching work to the instances as quickly and efficiently as possible. The goal is to get the job done in the smallest amount of time and at the lowest cost. Univa’s Navops Launch supports container-based computing and also played a critical role in this run by allowing the same containers to be used for Grid Engine and AWS Batch.

One interesting scaling challenge arose when 50K hosts created concurrent connections to the Grid Engine scheduler. Once running, the scheduler can dispatch up to 3000 tasks per second, with an extra burst in the (relatively rare) case that an instance terminates unexpectedly and signals the need to reschedule 64 or more tasks as quickly as possible. The team also found that referencing worker instances by IP addresses allowed them to sidestep some internal (AWS) rate limits on the number of DNS lookups per Elastic Network Interface.

The entire simulation is packed in a Docker container for ease of use. When newly launched instances come online they register their specs (instance type, IP address, vCPU count, memory, and so forth) in an ElastiCache for Redis cluster. Grid Engine uses this data to find and manage instances; this is more efficient and scalable than calling DescribeInstances continually.
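Here’s an illustrative sketch of what that self-registration could look like when a worker boots; the Redis endpoint and key layout are my own assumptions, not Western Digital’s actual implementation:

# Gather instance facts from the EC2 instance metadata service
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
INSTANCE_TYPE=$(curl -s http://169.254.169.254/latest/meta-data/instance-type)
LOCAL_IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)

# Register the worker in ElastiCache for Redis (hypothetical endpoint; multi-field HSET needs Redis 4.0+)
redis-cli -h my-workers.xxxxxx.0001.use1.cache.amazonaws.com \
  HSET "worker:${INSTANCE_ID}" type "${INSTANCE_TYPE}" ip "${LOCAL_IP}" \
  vcpus "$(nproc)" memory_kb "$(awk '/MemTotal/ {print $2}' /proc/meminfo)"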

The simulation tasks read and write data from Amazon Simple Storage Service (S3), taking advantage of S3’s ability to store vast amounts of data and to handle any conceivable request rate.

Inside a Simulation Task
Each potential head design is described by a collection of parameters; the overall simulation run consists of an exploration of this parameter space. The results of the run help the designers to find designs that are buildable, reliable, and manufacturable. This particular run focused on modeling write operations.

Each simulation task ran for 2 to 3 hours, depending on the EC2 instance type. In order to avoid losing work if a Spot Instance is about to be terminated, the tasks checkpoint themselves to S3 every 15 minutes, with a bit of extra logic to cover the important case where the job finishes after the termination signal but before the actual shutdown.
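A minimal sketch of the interruption-handling side, assuming a hypothetical checkpoint file, S3 bucket, and TASK_ID variable, would poll the two-minute Spot interruption notice and push one last checkpoint as soon as it appears:

# /spot/instance-action returns 404 until an interruption is scheduled, so poll until it succeeds
while ! curl -sf http://169.254.169.254/latest/meta-data/spot/instance-action > /dev/null; do
  sleep 5
done

# Interruption notice received; write a final checkpoint before the instance goes away
aws s3 cp /scratch/checkpoint.dat "s3://my-simulation-bucket/checkpoints/${TASK_ID}.dat"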

Making the Run
After just 6 weeks of planning and prep (including multiple large-scale AWS Batch runs to generate the input files), the combined Western Digital / Univa / AWS team was ready to make the full-scale run. They used an AWS CloudFormation template to start Grid Engine and launch the cluster. Due to the Redis-based tracking that I described earlier, they were able to start dispatching tasks to instances as soon as they became available. The cluster grew to one million vCPUs in 1 hour and 32 minutes and ran full-bore for 6 hours:

When there were no more undispatched tasks available, Grid Engine began to shut the instances down, reaching the zero-instance point in about an hour. During the run, Grid Engine was able to keep the instances fully supplied with work over 99% of the time. The run used a combination of C3, C4, M4, R3, R4, and M5 instances. Here’s the overall breakdown over the course of the run:

The job spanned all six Availability Zones in the US East (N. Virginia) Region. Spot bids were placed at the On-Demand price. Over the course of the run, about 1.5% of the instances in the fleet were terminated and automatically replaced; the vast majority of the instances stayed running for the entire time.

And That’s That
This job ran 8 hours and cost $137,307 ($17,164 per hour). The folks I talked to estimated that this was about half the cost of making the run on an in-house cluster, if they had one of that size!

Evaluating the success of the run, Steve Phillpott (CIO of Western Digital) told us:

“Storage technology is amazingly complex and we’re constantly pushing the limits of physics and engineering to deliver next-generation capacities and technical innovation. This successful collaboration with AWS shows the extreme scale, power and agility of cloud-based HPC to help us run complex simulations for future storage architecture analysis and materials science explorations. Using AWS to easily shrink simulation time from 20 days to 8 hours allows Western Digital R&D teams to explore new designs and innovations at a pace un-imaginable just a short time ago.”

The Western Digital team behind this one is hiring an R&D Engineering Technologist; they also have many other open positions!

A Run for You
If you want to do a run on the order of 100K to 1M cores (or more), our HPC team is ready to help, as are our friends at Univa. To get started, Contact HPC Sales!

Jeff;

Now Open – AWS Europe (Stockholm) Region

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-open-aws-europe-stockholm-region/

The AWS Region in Sweden that I promised you last year is now open and you can start using it today! The official name is Europe (Stockholm) and the API name is eu-north-1. This is our fifth region in Europe, joining the existing regions in Europe (Ireland), Europe (London), Europe (Frankfurt), and Europe (Paris). Together, these regions provide you with a total of 15 Availability Zones and allow you to architect applications that are resilient and fault tolerant. You now have yet another option to help you to serve your customers in the Nordics while keeping their data close to home.
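For example, with an up-to-date AWS CLI you can point at the new region right away (a quick sanity check, nothing more):

# List the Availability Zones in the new region
$ aws ec2 describe-availability-zones --region eu-north-1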

Instances and Services
Applications running in this 3-AZ region can use C5, C5d, D2, I3, M5, M5d, R5, R5d, and T3 instances, and can make use of a long list of AWS services including Amazon API Gateway, Application Auto Scaling, AWS Artifact, AWS Certificate Manager (ACM), Amazon CloudFront, AWS CloudFormation, AWS CloudTrail, Amazon CloudWatch, CloudWatch Events, Amazon CloudWatch Logs, AWS CodeDeploy, AWS Config, AWS Config Rules, AWS Database Migration Service, AWS Direct Connect, Amazon DynamoDB, EC2 Auto Scaling, EC2 Dedicated Hosts, Amazon Elastic Container Service for Kubernetes, AWS Elastic Beanstalk, Amazon Elastic Block Store (EBS), Amazon Elastic Compute Cloud (EC2), Elastic Container Registry, Amazon ECS, Elastic Load Balancing (Classic, Network, and Application), Amazon EMR, Amazon ElastiCache, Amazon Elasticsearch Service, Amazon Glacier, AWS Identity and Access Management (IAM), Amazon Kinesis Data Streams, AWS Key Management Service (KMS), AWS Lambda, AWS Marketplace, AWS Organizations, AWS Personal Health Dashboard, AWS Resource Groups, Amazon RDS for Aurora, Amazon RDS for PostgreSQL, Amazon Route 53 (including Private DNS for VPCs), AWS Server Migration Service, AWS Shield Standard, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), Amazon Simple Storage Service (S3), Amazon Simple Workflow Service (SWF), AWS Step Functions, AWS Storage Gateway, AWS Support API, Amazon EC2 Systems Manager (SSM), AWS Trusted Advisor, Amazon Virtual Private Cloud, VM Import, and AWS X-Ray.

Edge Locations and Latency
CloudFront edge locations are already operational in four cities adjacent to the new region:

  • Stockholm, Sweden (3 locations)
  • Copenhagen, Denmark
  • Helsinki, Finland
  • Oslo, Norway

AWS Direct Connect is also available in all of these locations.

The region also offers low-latency connections to other cities and AWS regions in the area. Here are the latest numbers:

AWS Customers in the Nordics
Tens of thousands of our customers in Denmark, Finland, Iceland, Norway, and Sweden already use AWS! Here’s a sampling:

Volvo Connected Solutions Group – AWS is their preferred cloud solution provider, allowing them to connect over 800,000 Volvo trucks, buses, construction equipment, and Penta engines. They make heavy use of microservices and will use the new region to deliver services with lower latency than ever before.

Fortum – Their one-megawatt Virtual Battery runs on top of AWS. The battery aggregates and controls usage of energy assets and allows Fortum to better balance energy usage across their grid. This results in lower energy costs and power bills, along with a reduced environmental impact.

Den Norske Bank – This financial services customer is using AWS to provide a modern banking experience for their customers. They can innovate and scale more rapidly, and have devoted an entire floor of their headquarters to AWS projects.

Finnish Rail – They are moving their website and travel applications to AWS in order to allow their developers to quickly experiment, build, test, and deliver personalized services for each of their customers.

And That Makes 20
With today’s launch, the AWS Cloud spans 60 Availability Zones within 20 geographic regions around the world. We are currently working on 12 more Availability Zones and four more AWS Regions in Bahrain, Cape Town, Hong Kong SAR, and Milan.

AWS services are GDPR ready and also include capabilities that are designed to support your own GDPR readiness efforts. To learn more, read the AWS Service Capabilities for GDPR and check out the AWS General Data Protection Regulation (GDPR) Center.

The Europe (Stockholm) Region is now open and you can start creating your AWS resources in it today!

Jeff;

And Now a Word from Our AWS Heroes…

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/and-now-a-word-from-our-aws-heroes/

Whew! Now that AWS re:Invent 2018 has wrapped up, the AWS Blog Team is taking some time to relax, recharge, and to prepare for 2019.

In order to wrap up the year in style, we have asked several of the AWS Heroes to write guest blog posts on an AWS-related topic of their choice. You will get to hear from Machine Learning Hero Cyrus Wong (pictured at right), Community Hero Markus Ostertag, Container Hero Philipp Garbe, and several others.

Each of these Heroes brings a fresh and unique perspective to the AWS Blog and I know that you will enjoy hearing from them. We’ll have the first post up in a day or two, so stay tuned!

Jeff;

New – EC2 P3dn GPU Instances with 100 Gbps Networking & Local NVMe Storage for Faster Machine Learning + P3 Price Reduction

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-ec2-p3dn-gpu-instances-with-100-gbps-networking-local-nvme-storage-for-faster-machine-learning-p3-price-reduction/

Late last year I told you about Amazon EC2 P3 instances and also spent some time discussing the concept of the Tensor Core, a specialized compute unit that is designed to accelerate machine learning training and inferencing for large, deep neural networks. Our customers love P3 instances and are using them to run a wide variety of machine learning and HPC workloads. For example, fast.ai set a speed record for deep learning, training the ResNet-50 deep learning model on 1 million images for just $40.

Raise the Roof
Today we are expanding the P3 offering at the top end with the addition of p3dn.24xlarge instances, with 2x the GPU memory and 1.5x as many vCPUs as p3.16xlarge instances. The instances feature 100 Gbps network bandwidth (up to 4x the bandwidth of previous P3 instances), local NVMe storage, the latest NVIDIA V100 Tensor Core GPUs with 32 GB of GPU memory, NVIDIA NVLink for faster GPU-to-GPU communication, and AWS-custom Intel® Xeon® Scalable (Skylake) processors running at 3.1 GHz sustained all-core Turbo, all built atop the AWS Nitro System. Here are the specs:

  • Model – p3dn.24xlarge
  • NVIDIA V100 Tensor Core GPUs – 8
  • GPU Memory – 256 GB
  • NVIDIA NVLink – 300 GB/s
  • vCPUs – 96
  • Main Memory – 768 GiB
  • Local Storage – 2 x 900 GB NVMe SSD
  • Network Bandwidth – 100 Gbps
  • EBS-Optimized Bandwidth – 14 Gbps

If you are doing large-scale training runs using MXNet, TensorFlow, PyTorch, or Keras, be sure to check out the Horovod distributed training framework that is included in the Amazon Deep Learning AMIs. You should also take a look at the new NVIDIA AI Software containers in the AWS Marketplace; these containers are optimized for use on P3 instances with V100 GPUs.

With a total of 256 GB of GPU memory (twice as much as the largest of the current P3 instances), the p3dn.24xlarge allows you to explore bigger and more complex deep learning algorithms. You can rotate and scale your training images faster than ever before, while also taking advantage of the Intel AVX-512 instructions and other leading-edge Skylake features. Your GPU code can scale out across multiple GPUs and/or instances using NVLink and the NVLink Collective Communications Library (NCCL). Using NCCL will also allow you to fully exploit the 100 Gbps of network bandwidth that is available between instances when used within a Placement Group.
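As a rough sketch (host names, slot counts, and the training script are placeholders), a two-instance Horovod run across 16 V100 GPUs is typically launched with MPI along these lines:

# 8 GPU slots per p3dn.24xlarge instance, 16 worker processes in total
$ mpirun -np 16 \
    -H p3dn-host-1:8,p3dn-host-2:8 \
    -bind-to none -map-by slot \
    -x NCCL_DEBUG=INFO -x LD_LIBRARY_PATH -x PATH \
    python train_resnet50.py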

In addition to being a great fit for distributed machine learning training and image classification, these instances provide plenty of power for your HPC jobs. You can render 3D images, transcode video in real time, model financial risks, and much more.

You can use existing AMIs as long as they include the ENA, NVMe, and NVIDIA drivers. You will need to upgrade to the latest ENA driver to get 100 Gbps networking; if you are using the Deep Learning AMIs, be sure to use a recent version that is optimized for AVX-512.

Available Today
The p3dn.24xlarge instances are available now in the US East (N. Virginia) and US West (Oregon) Regions and you can start using them today in On-Demand, Spot, and Reserved Instance form.

Bonus – P3 Price Reduction
As part of today’s launch we are also reducing prices for the existing P3 instances. The following prices went into effect on December 6, 2018:

  • 20% reduction for all prices (On-Demand and RI) and all instance sizes in the Asia Pacific (Tokyo) Region.
  • 15% reduction for all prices (On-Demand and RI) and all instance sizes in the Asia Pacific (Sydney), Asia Pacific (Singapore), and Asia Pacific (Seoul) Regions.
  • 15% reduction for Standard RIs with a three-year term for all instance sizes in all regions except Asia Pacific (Tokyo), Asia Pacific (Sydney), Asia Pacific (Singapore), and Asia Pacific (Seoul).

The percentages apply to instances running Linux; slightly smaller percentages apply to instances that run Microsoft Windows and other operating systems.

These reductions will help to make your machine learning training and inferencing even more affordable, and are being brought to you as we pursue our goal of putting machine learning in the hands of every developer.

Jeff;

New – AWS Well-Architected Tool – Review Workloads Against Best Practices

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-aws-well-architected-tool-review-workloads-against-best-practices/

Back in 2015 we launched the AWS Well-Architected Framework and I asked Are You Well-Architected? The framework includes five pillars that encapsulate a set of core strategies and best practices for architecting systems in the cloud:

Operational Excellence – Running and managing systems to deliver business value.

Security – Protecting information and systems.

Reliability – Preventing and quickly recovering from failures.

Performance Efficiency – Using IT and compute resources efficiently.

Cost Optimization – Avoiding unneeded costs.

I think of it as a way to make sure that you are using the cloud right, and that you are using it well.

AWS Solutions Architects (SA) work with our customers to perform thousands of Well-Architected reviews every year! Even at that pace, the demand for reviews always seems to be a bit higher than our supply of SAs. Our customers tell us that the reviews are of great value and use the results to improve their use of AWS over time.

New AWS Well-Architected Tool
In order to make the Well-Architected reviews open to every AWS customer, we are introducing the AWS Well-Architected Tool. This is a self-service tool that is designed to help architects and their managers to review AWS workloads at any time, without the need for an AWS Solutions Architect.

The AWS Well-Architected Tool helps you to define your workload, answer questions designed to review the workload against the best practices specified by the five pillars, and to walk away with a plan that will help you to do even better over time. The review process includes educational content that focuses on the most current set of AWS best practices.

Let’s take a quick tour…

AWS Well-Architected Tool in Action
I open the AWS Well-Architected Tool Console and click Define workload to get started:

I begin by naming and defining my workload. I choose an industry type and an industry, list the regions where I operate, indicate if this is a pre-production or production workload, and optionally enter a list of AWS account IDs to define the span of the workload. Then I click Define workload to move ahead:

I am ready to get started, so I click Start review:

The first pillar is Operational Excellence. There are nine questions, each with multiple-choice answers. Helpful resources are displayed on the side:

I can go through the pillars and questions in order, save and exit, and so forth. After I complete my review, I can consult the improvement plan for my workload:

I can generate a detailed PDF report that summarizes my answers:

I can review my list of workloads:

And I can see the overall status in the dashboard:

Available Now
The AWS Well-Architected Tool is available now and you can start using it today for workloads in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Ireland) Regions at no charge.

Jeff;

New – Compute, Database, Messaging, Analytics, and Machine Learning Integration for AWS Step Functions

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-compute-database-messaging-analytics-and-machine-learning-integration-for-aws-step-functions/

AWS Step Functions is a fully managed workflow service for application developers. You can think & work at a high level, connecting and coordinating activities in a reliable and repeatable way, while keeping your business logic separate from your workflow logic. After you design and test your workflows (which we call state machines), you can deploy them at scale, with tens or even hundreds of thousands running independently and concurrently. Step Functions tracks the status of each workflow, takes care of retrying activities on transient failures, and also simplifies monitoring and logging. To learn more, step through the Create a Serverless Workflow with AWS Step Functions and AWS Lambda tutorial.

Since our launch at AWS re:Invent 2016, our customers have made great use of Step Functions (my post, Things go Better with Step Functions describes a real-world use case). Our customers love the fact that they can easily call AWS Lambda functions to implement their business logic, and have asked us for even more options.

More Integration, More Power
Today we are giving you the power to use eight more AWS services from your Step Function state machines. Here are the new actions:

DynamoDB – Get an existing item from an Amazon DynamoDB table; put a new item into a DynamoDB table.

AWS Batch – Submit an AWS Batch job and wait for it to complete.

Amazon ECS – Run an Amazon ECS or AWS Fargate task using a task definition.

Amazon SNS – Publish a message to an Amazon Simple Notification Service (SNS) topic.

Amazon SQS – Send a message to an Amazon Simple Queue Service (SQS) queue.

AWS Glue – Start an AWS Glue job run.

Amazon SageMaker – Create an Amazon SageMaker training job; create a SageMaker transform job (learn more by reading New Features for Amazon SageMaker: Workflows, Algorithms, and Accreditation).

You can use these actions individually or in combination with each other. To help you get started, we’ve built some cool samples that will show you how to manage a batch job, manage a container task, copy data from DynamoDB, retrieve the status of a Batch job, and more. For example, here’s a visual representation of the sample that copies data from DynamoDB to SQS:

The sample (available to you as an AWS CloudFormation template) creates all of the necessary moving parts including a Lambda function that will populate (seed) the table with some test data. After I create the stack I can locate the state machine in the Step Functions Console and execute it:

I can inspect each step in the console; the first one (Seed the DynamoDB Table) calls a Lambda function that creates some table entries and returns a list of keys (message ids):

The third step (Send Message to SQS) starts with the following input:

And delivers this output, including the SQS MessageId:

As you can see, the state machine took care of all of the heavy lifting — calling the Lambda function, iterating over the list of message IDs, and calling DynamoDB and SQS for each one. I can run many copies at the same time:

I’m sure you can take this example as a starting point and build something awesome with it; be sure to check out the other samples and templates for some ideas!

If you are already building and running your own state machines, you should know about Magic ARNs and Parameters:

Magic ARNs – Each of these new operations is represented by a special “magic” (that’s the technical term Tim used) ARN. There’s one for sending to SQS, another one for running a batch job, and so forth.

Parameters – You can use the Parameters field in a Task state to control the parameters that are passed to the service APIs that implement the new functions. Your state machine definitions can include static JSON or references (in JsonPath form) to specific elements in the state input.

Here’s how the Magic ARNs and Parameters are used to define a state:

   "Read Next Message from DynamoDB": {
      "Type": "Task",
      "Resource": "arn:aws:states:::dynamodb:getItem",
      "Parameters": {
        "TableName": "StepDemoStack-DDBTable-1DKVAVTZ1QTSH",
        "Key": {
          "MessageId": {"S.$": "$.List[0]"}
        }
      },
      "ResultPath": "$.DynamoDB",
      "Next": "Send Message to SQS"
    },
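For comparison, here’s a hedged sketch of the state that sends each message to SQS via the new integration. The queue URL is a placeholder, and where this sketch simply ends, the actual sample loops back over the remaining message IDs:

   "Send Message to SQS": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sqs:sendMessage",
      "Parameters": {
        "QueueUrl": "https://sqs.us-east-1.amazonaws.com/123456789012/StepDemoQueue",
        "MessageBody.$": "$.DynamoDB.Item.MessageId.S"
      },
      "ResultPath": "$.SQS",
      "End": true
    }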

Available Now
The new integrations are available now and you can start using them today in all AWS Regions where Step Functions are available. You pay the usual charge for each state transition and for the AWS services that you consume.

Jeff;

New – Hibernate Your EC2 Instances

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-hibernate-your-ec2-instances/

As you know, you can easily build highly scalable AWS applications that launch fresh EC2 instances on an as-needed basis. While the instances can be up and running in a matter of seconds, booting the operating system and the application can take considerable time. Also, caches and other memory-centric application components can take some time (sometimes tens of minutes) to preload or warm up. Both of these factors impose a delay that can force you to over-provision in case you need incremental capacity very quickly.

Hibernation for EC2 Instances
Today we are giving you the ability to launch EC2 instances, set them up as desired, hibernate them, and then bring them back to life when you need them. The hibernation process stores the in-memory state of the instance, along with its private and elastic IP addresses, allowing it to pick up exactly where it left off.

This feature is available today and you can use it on freshly launched M3, M4, M5, C3, C4, C5, R3, R4, and R5 instances running Amazon Linux 1 (support for Amazon Linux 2 is in the works and will be ready soon). It applies to On-Demand instances and instances running with Reserved Instance coverage.

When an instance is instructed to hibernate, it writes the in-memory state to a file in the root EBS volume and then (in effect) shuts itself down. The AMI used to launch the instance must be encrypted, as must the root EBS volume of the instance. The encryption ensures proper protection for sensitive data when it is copied from memory to the EBS volume.

While the instance is in hibernation, you pay only for the EBS volumes and Elastic IP Addresses attached to it; there are no other hourly charges (just like any other stopped instance).

Hibernation in Action
In order to check out this feature I launch a c4.large instance, and select hibernation as a stop behavior:

I also expand my instance’s root volume, adding 10 GB + the memory size of the instance to the desired size:

I also create and associate an Elastic IP address with my instance since the public IP address will change. My instance is up and running, and I can check the uptime:

Then I select the instance in the EC2 Console and choose Stop – Hibernate from the Instance State menu (API and CLI support is also available):
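For reference, here’s a hedged sketch of the CLI side (the AMI and instance IDs are placeholders; hibernation must be enabled at launch and requires an encrypted root volume):

# Launch with hibernation enabled
$ aws ec2 run-instances --image-id ami-0123456789abcdef0 --instance-type c4.large \
    --key-name my-key --hibernation-options Configured=true

# Later, stop the instance with hibernation instead of a normal stop
$ aws ec2 stop-instances --instance-ids i-0123456789abcdef0 --hibernate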

The instance state transitions from running to stopping, and then to stopped, in seconds:

The console provides additional information about the transition:

The SSH connection to the instance drops, since it is no longer running:

Later, when I am ready to proceed, I click Start:

This time the state goes from stopped to pending, and then to running, again in seconds, and I can reconnect. I can then use uptime to see that the instance has not been rebooted, but has continued from where it left off:

If I was using this instance interactively, I could use a session manager such as screen, tmux, or mosh to make this totally seamless. However, the most interesting use cases for hibernation revolve around long-running processes and services that take a lot of time to initialize before they are ready to accept traffic; for those, dropped connections are not a concern.

Things to Know
As you can see, hibernation is really easy to use, and I hope that you are already thinking of some ways to apply it to your application. Here are a couple of things to keep in mind:

Instance Type – You can enable and use hibernation on freshly launched instances of the types that I listed above.

Root Volume Size – The root volume must have free space equal to the amount of RAM on the instance in order for the hibernation to succeed.

Operating Systems – The newest Amazon Linux 1 AMIs are configured for hibernation, with many others in the works. You will need to create an encrypted AMI, using one of these AMIs as a base. You can also follow our directions to customize and use your own AMI.

Modifications – You cannot modify the instance size or type while it is in hibernation, but you can modify the user data and the EBS Optimization setting.

Pricing – While the instance is in hibernation, you pay only for the EBS storage and any Elastic IP addresses attached to the instance.

Performance – The time to hibernate or resume is dependent on the memory size of the instance, the amount of in-memory data to be saved, and the throughput of the root EBS volume.

Coming Soon – We are working on support for Amazon Linux 2, Ubuntu, Windows Server 2008 R2, Windows Server 2012, Windows Server 2012 R2, Windows Server 2016, along with the SQL Server variants of the Windows AMIs.

Available Now
This feature is available now in the US East (N. Virginia, Ohio), US West (N. California, Oregon), Canada (Central), South America (São Paulo), Asia Pacific (Mumbai, Seoul, Singapore, Sydney, Tokyo), and EU (Frankfurt, London, Ireland, Paris) Regions.

Jeff;

New AWS License Manager – Manage Software Licenses and Enforce Licensing Rules

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-aws-license-manager-manage-software-licenses-and-enforce-licensing-rules/

When you make use of commercial, licensed software in the AWS Cloud using a BYOL (Bring Your Own License) strategy, you need to make sure that you stay within the provisions of the license, while also avoiding expensive over-provisioning. This can be a challenge when it is so easy to launch instances on demand whenever you need them!

New AWS License Manager
Today we are launching AWS License Manager. You can define your licensing rules, taking into account any enterprise agreements and other terms that govern your use of the licensed software. Then you associate them with your deployment mechanism (golden AMIs or Launch Templates) so that EC2 instances launched via the mechanism will be automatically tracked. You can also discover existing usage across one or more AWS accounts, and track all usage through the AWS Management Console.

Let’s take a quick tour, assuming that I own a 100-vCPU license for an enterprise database server.

The first step is to define one or more License Configurations. I open the License Manager Console and click Create license configuration to get started:

I enter a name and description for my configuration, indicate that the license is based on vCPUs (and limited to 100), and that I want to enforce the license:

I can also create rules for the license. The rules control the applicability of the license with respect to this configuration. I can specify a minimum and/or maximum number of vCPUs, and any desired EC2 tenancy (shared, dedicated host, or dedicated instance). Here’s a rule that specifies 4-64 vCPUs, and shared tenancy:

I confirm that the rule is defined as desired, and click Submit to move ahead. My license configuration is ready, as are some others created by colleagues:

After I create my license configuration, I can associate it with an AMI by selecting the configuration and clicking Associate AMI in the Actions menu. I pick one or more AMIs and click Associate:

I can see my overall license usage at a glance (this is a central dashboard that works across multiple accounts and in conjunction with AWS Organizations):

I can click Settings to link to my AWS Organizations accounts, set up a cross-account inventory search and arrange to receive SNS alerts when the usage limit for a license has been breached:

Going Further
Here are a couple of other things to know about AWS License Manager:

Supported License Types – AWS License Manager supports any license based on vCPUs, physical cores, and physical sockets, and is not tied to any software vendor.

Cross-Account Usage – AWS License Manager works hand-in-glove with AWS Organizations. You can sign in to your Master account, link all of the accounts with a click, and share license configurations across your Organization. You will be able to use the dashboard to see an Organization-wide view of your license usage.

Multi-Account Software Discovery – AWS License Manager also works with AWS Systems Manager, and works across accounts within an Organization. The discovered data is stored in an S3 bucket and an Amazon Athena database (encrypted in both places), and is processed by an AWS Glue job.

Programmatic Access – You can create and manage license configurations from the Console, APIs, or the AWS Command Line Interface (CLI). Interesting functions include CreateLicenseConfiguration, GetLicenseConfiguration, ListResourceInventory, and ListUsageForLicenseConfiguration.
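For example, a configuration similar to the one in my walkthrough might be sketched from the CLI like this (the name is arbitrary, and treat the rule strings as an illustration of the documented #name=value syntax):

$ aws license-manager create-license-configuration \
    --name EnterpriseDBServer \
    --license-counting-type vCPU \
    --license-count 100 \
    --license-count-hard-limit \
    --license-rules "#minimumVcpus=4" "#maximumVcpus=64" "#allowedTenancy=EC2-Default"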

Pricing – You can use AWS License Manager at no charge. Behind the scenes, AWS License Manager stores inventory data in an S3 bucket and an Amazon Athena database, and processes it using an AWS Glue job. You’ll pay the usual AWS prices for these resources and services.

Available Now
AWS License Manager is available now and you can start using it today in the US East (N. Virginia), US West (Oregon), US East (Ohio), Europe (Ireland), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Sydney), Europe (Frankfurt), Asia Pacific (Seoul), Asia Pacific (Mumbai), and Europe (London) Regions.

AWS Launches, Previews, and Pre-Announcements at re:Invent 2018 – Andy Jassy Keynote

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-previews-and-pre-announcements-at-reinvent-2018-andy-jassy-keynote/

As promised in Welcome to AWS re:Invent 2018, here’s a summary of the launches, previews, and pre-announcements from Andy Jassy’s keynote. I have included links to allow you to sign up for previews, as appropriate.

(photo from AWS Community Hero Eric Hammond)

Launches
Here are the blog posts that we wrote for today’s launches:

S3 Glacier Deep Archive
This new storage class for Amazon Simple Storage Service (S3) is designed for long-term data archival and is the lowest cost storage from any cloud provider. Priced from just $0.00099/GB-mo (less than one-tenth of one cent, or $1.01 per TB-mo), the cost is comparable to tape archival services. Data can be retrieved in 12 hours or less, and there will also be a bulk retrieval option that will allow you to inexpensively retrieve even petabytes of data within 48 hours.

Control Tower
This service helps you automate the setup of a well-architected multi-account AWS environment using a set of blueprints that embody AWS best practices. Guardrails, both mandatory and recommended, are available for high-level, rule-based governance. You will have access to an integrated dashboard so that you can keep a watchful eye over the accounts provisioned, the guardrails that are enabled, and your overall compliance status. Learn more.

Amazon Textract
This Optical Character Recognition (OCR) service will help you to extract text and data from virtually any document. Powered by Machine Learning, it will identify bounding boxes, detect key-value pairs, and make sense of tables, while eliminating manual effort and lowering your document-processing costs. Sign up for the preview.

AWS Outposts
This service will bring AWS to your existing data center, providing a consistent, seamless experience across on-premises and the cloud, and giving you the ability to run on-premises applications with the exact same Application Programming Interfaces (APIs), consoles, features, hardware, and tools that you use on AWS. Sign up for the preview.

Amazon RDS on VMware
This is a fully managed service for on-premises databases. You can set up, run, and scale databases in VMware vSphere using the same tools already enjoyed by hundreds of thousands of Amazon Relational Database Service (RDS) customers. You can build low-cost high-availability hybrid environments, implement disaster recovery to AWS, and do long-term archival in Amazon Simple Storage Service (S3). Sign up for the preview!

Amazon Quantum Ledger Database
This fully managed ledger database will allow you to track and verify the complete history of changes to your application data. It uses an immutable journal that maintains a sequenced, cryptographically verifiable record of all changes that cannot be deleted or modified. It is scalable and easy to use, supports SQL queries, and runs 2-3x faster than common blockchain frameworks. Sign up for the preview.

AWS Managed Blockchain
This fully managed service will allow you to create and manage scalable blockchain networks using popular open source frameworks such as Hyperledger Fabric and Ethereum. Sign up for the preview.

Amazon Timestream
This is a fast, scalable, fully managed time-series database that you can use to store and analyze trillions of events per day at 1/10th the cost of a relational database. It is optimized for data that arrives in time order and for queries that include a time interval. It is a great fit for IoT, industrial telemetry, app monitoring, and DevOps data. Timestream automates rollups, retention, tiering, and compression so time-series data can be efficiently stored and processed. Timestream’s query engine adapts to the location and format of data, making it easier and faster to query time-series data. Learn more.

AWS Lake Formation
This fully managed service will help you to build, secure, and manage a data lake. You’ll be able to point it at your data sources, have it crawl the sources, and pull the data into Amazon Simple Storage Service (S3). Lake Formation uses Machine Learning to identify and de-duplicate data, and also performs format changes in order to accelerate analytical processing. You will also be able to define and centrally manage consistent security policies across your data lake and the services that you use to analyze and process the data. Sign up for the preview.

AWS Security Hub
This service will allow you to centrally view & manage security alerts and automate compliance checks within and across AWS accounts. It will aggregate security findings from AWS and partner services and present you with built-in and customizable insights that are unique to your environment. Try the preview!

Stay Tuned
I am looking forward to writing about each of these services when they are ready to launch, so stay tuned!

Jeff;

AWS DeepRacer – Go Hands-On with Reinforcement Learning at re:Invent

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-deepracer-go-hands-on-with-reinforcement-learning-at-reinvent/

Reinforcement Learning is a type of machine learning that works when an “agent” is allowed to act on a trial-and-error basis within an interactive environment, using feedback from those actions to learn over time in order to reach a predetermined goal or to maximize some type of score or reward. This stands in contrast to other forms of machine learning such as Supervised Learning, where a set of facts (ground truths) are used to train a model so that it can make inferences.

We want you to get some hands-on experience with Reinforcement Learning at AWS re:Invent and I would like to tell you all about it today. This combination of hardware and software will help you get things (literally) moving!

AWS DeepRacer
Let’s talk about the hardware and software first. AWS DeepRacer is a 1/18th scale radio-controlled, four-wheel drive car:

There’s an Intel Atom® processor onboard, a 4 megapixel camera with 1080p resolution, fast (802.11ac) WiFi, multiple USB ports, and enough battery power to last for about 2 hours. The Atom processor runs Ubuntu 16.04 LTS, ROS (Robot Operating System), and the Intel OpenVino™ computer vision toolkit.

AWS DeepRacer includes a fully-configured cloud environment that you can use to train your Reinforcement Learning models. It takes advantage of the new Reinforcement Learning feature in Amazon SageMaker and also includes a 3D simulation environment powered by AWS RoboMaker. You can train an autonomous driving model against a collection of predefined race tracks included with the simulator and then evaluate them virtually or download them to a AWS DeepRacer car and verify performance in the real world.

Reinforcement Learning is one of the technologies that are used to make self-driving cars a reality; the AWS DeepRacer is the perfect vehicle (so to speak) for you to go hands-on and learn all about it. We’re ramping up volume production and you will be able to buy one of your very own very soon.

You can pre-order your very own AWS DeepRacer today and sign up to be part of the preview at aws.amazon.com/deepracer.

AWS DeepRacer & Reinforcement Learning at re:Invent
My colleagues have created an incredible program that will get you started with AWS DeepRacer and Reinforcement Learning!

re:Invent attendees can attend a workshop that will teach you the fundamentals of Reinforcement Learning and then show you how to create, train, and tweak an autonomous driving model for an AWS DeepRacer. You’ll create, train, and refine your model on an online simulator and then load it into a genuine AWS DeepRacer for a spin around one of our test tracks. Your goal: Get your AWS DeepRacer around the track as quickly and accurately as possible. There will be a competition every hour, with the chance to win AWS DeepRacers and AWS credits.

Start Your Engines
If you’re here at re:Invent consider yourselves under starters’ orders, because the very first AWS DeepRacer League will take place over the next 24 hours in the AWS DeepRacer workshops and at the MGM Speedway. You will use Amazon SageMaker, AWS RoboMaker, and other AWS services while you learn about Reinforcement Learning. There are 6 main tracks (and a pit area for each), a hacker garage, 2 extra tracks that you can use for training and experimentation, and a DJ to keep you revved up.

From 11:30 AM to 10 PM today (November 28th) every lap time will be entered onto the Speedway Leaderboard. The top 3 developers with the fastest times over the course of the day’s racing will advance to the 2018 grand finale where they will compete to become the AWS DeepRacer 2018 Champion.

The final race will take place on the AWS re:Invent International Speedway at 8 AM on Thursday, just before Werner’s keynote. You will get to race, learn, win prizes, and collect some swag!

AWS DeepRacer League
We want to make sure that developers all over the world have the same opportunity to get involved with AWS DeepRacer as re:Invent attendees. To that end, I am excited to announce the AWS DeepRacer League – the world’s first global autonomous racing league, open to anyone. In 2019 there will be a series of live racing events at AWS Global Summits around the world, and we’ll also have virtual events and tournaments throughout the year. Winners and top scorers will advance to the AWS DeepRacer 2019 Championship Cup at re:Invent 2019. You can check the AWS DeepRacer site for the latest updates.

I’ll have more details soon, so stay tuned and happy racing!

 

 

Jeff;

New – Amazon FSx for Lustre

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-amazon-fsx-for-lustre/

A pebibyte (PiB – 1,125,899,906,842,624 bytes) is an impressive amount of data, slightly less than half of the estimated memory capacity of a human brain. Data lakes, High-Performance Computing (HPC), and Electronic Design Automation (EDA) applications traditionally work at this scale, as do more recent data-intensive applications such as Machine Learning and media processing.

Amazon FSx for Lustre
Today we are launching Amazon FSx for Lustre, designed to meet the needs of these applications and others that you will undoubtedly dream up. Based on the mature and popular Lustre open source project, Amazon FSx for Lustre is a highly parallel file system that supports sub-millisecond access to petabyte-scale file systems. Thousands of simultaneous clients (EC2 instances and on-premises servers) can drive millions of IOPS (Input/Output Operations per Second) and transfer hundreds of gibibytes of data per second.

You can create a file system in minutes, mount it on any number of clients, and start accessing it right away. This is a fully managed service so there’s nothing to maintain and nothing to administer. You can build standalone file systems for ephemeral use, or you can seamlessly join them to an S3 bucket and then access the contents of the bucket as if it were a Lustre file system. Each file system is backed by NVMe SSD storage, provisioned in increments of 3.6 TiB, and designed to deliver 200 MB/s of aggregate throughput at 10,000 IOPS for every 1 TiB of provisioned capacity.

Creating a Lustre File System
You can create a Lustre file system from the AWS Management Console, CLI, or by calling the CreateFileSystem function. I’ll use the CLI today; I simply specify the subnets for the Lustre endpoints and the desired storage capacity:

$ aws fsx create-file-system --file-system-type LUSTRE --storage-capacity 3600 --subnet-ids subnet-009a1149
----------------------------------------------------------------------------------------------
|                                      CreateFileSystem                                      |
+--------------------------------------------------------------------------------------------+
||                                        FileSystem                                        ||
|+-----------------+------------------------------------------------------------------------+|
||  CreationTime   |  1542666225.28                                                         ||
||  DNSName        |  fs-00a2e062546ff4fce.fsx.us-east-1.amazonaws.com                      ||
||  FileSystemId   |  fs-00a2e062546ff4fce                                                  ||
||  FileSystemType |  LUSTRE                                                                ||
||  Lifecycle      |  CREATING                                                              ||
||  OwnerId        |  012345678912                                                          ||
||  ResourceARN    |  arn:aws:fsx:us-east-1:012345678912:file-system/fs-00a2e062546ff4fce   ||
||  StorageCapacity|  3600                                                                  ||
||  VpcId          |  vpc-e68d9c81                                                          ||
|+-----------------+------------------------------------------------------------------------+|
|||                                   LustreConfiguration                                  |||
||+----------------------------------------------------------------+-----------------------+||
|||  WeeklyMaintenanceStartTime                                    |  5:09:00              |||
||+----------------------------------------------------------------+-----------------------+||
|||                                        SubnetIds                                       |||
||+----------------------------------------------------------------------------------------+||
|||  subnet-009a1149                                                                       |||
||+----------------------------------------------------------------------------------------+||

This takes about 5 minutes and then it becomes AVAILABLE:

$ aws fsx describe-file-systems --file-system-id fs-00a2e062546ff4fce | grep Lifecycle
||  Lifecycle      |  AVAILABLE                                                             ||

My EC2 instance already has the Lustre kernel modules and the Lustre client installed:

I create a mount point and mount my Lustre file system:

$ sudo mkdir /fsx
$ sudo mount -t lustre fs-00a2e062546ff4fce.fsx.us-east-1.amazonaws.com@tcp:/fsx /fsx

And my 3.4 TiB Lustre file system is ready to use:
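Once the file system is mounted, a quick sanity check from the client side never hurts. Here’s a minimal sketch using the standard Lustre client tools (the test file name is arbitrary):

$ lfs df -h /fsx                                        # capacity and usage, reported per storage target
$ dd if=/dev/zero of=/fsx/scratch.bin bs=1M count=1024  # write a 1 GiB test file
$ ls -lh /fsx/scratch.bin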

I can also create a file system that sits in front of an S3 bucket (or a prefixed section of an S3 bucket). This allows me to treat my bucket as a data lake, and to process it using tools and applications that are file-based. I simply include the bucket name as the ImportPath when I create the file system:

$ aws fsx create-file-system --file-system-type LUSTRE --storage-capacity 3600 \
  --subnet-ids subnet-009a1149 --lustre-configuration ImportPath=s3://jbarr-src

My bucket has about 1 million files inside, so the creation process takes about 30 minutes (the team told me that this proceeds at about 500 files per second). Here is my bucket:

And here is what it looks like from my EC2 instance:

At this point, the Lustre file system contains all of the metadata (names, dates, sizes, and so forth) for my objects but it does not have the actual file data. This data is copied from S3 on an as-needed basis. As a result, this command will not access S3:

$ find . -type f

And this one will, with a small latency penalty for each access because objects are copied from S3 to the file system on an as-needed basis:

$ find . -type f -exec grep -l -i main {} \;

If I understand my code’s access pattern, I can use the hsm_restore option of the lfs command to pre-load them. Perhaps I plan to analyze all of the C header files:

$ find . -type f -name '*.h' -print0 | \
  xargs -0 -n 50 -P 8 sudo lfs hsm_restore
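If I want to check whether a particular file has already been loaded, the hsm_state option of the lfs command shows its HSM flags (the file name below is just an example); in my experience, files whose data is still only in S3 carry the released flag:

$ lfs hsm_state src/query_parser.h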

Any changes that I make to the files remain within the file system. I can export changed files back to S3 using the hsm_archive option of the lfs command:

$ sudo lfs hsm_archive README.md
$ sudo lfs hsm_action README.md

The first command initiates the export operation and the second one indicates that it is complete by printing NOOP. The changed files are written to the same bucket, prefixed by the ExportPath of the file system:

I can discover the ExportPath from the command line:

$ aws fsx describe-file-systems --file-system-id fs-086f5160a68bc158b | grep Path
||||  ExportPath       |  s3://jbarr-src/FSxLustre20181120T005845Z                        ||||
||||  ImportPath       |  s3://jbarr-src                                                  ||||

Each file system publishes a rich set of metrics to CloudWatch:
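I can also pull these metrics from the command line. Here’s a quick sketch; the AWS/FSx namespace, the FileSystemId dimension, and the DataReadBytes metric name reflect my understanding of what the console displays:

$ aws cloudwatch list-metrics --namespace AWS/FSx \
    --dimensions Name=FileSystemId,Value=fs-00a2e062546ff4fce

$ aws cloudwatch get-metric-statistics --namespace AWS/FSx \
    --metric-name DataReadBytes --statistics Sum --period 300 \
    --dimensions Name=FileSystemId,Value=fs-00a2e062546ff4fce \
    --start-time 2018-11-20T00:00:00Z --end-time 2018-11-20T01:00:00Z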

There’s a lot more, but I’m just about out of space! For example, I didn’t show you the scale that you can achieve using Amazon FSx for Lustre. I used one client, but could just as easily have used thousands.

Things to Know
Here are a few interesting things to keep in mind regarding Amazon FSx for Lustre:

Console Access – I wrote this post using the CLI; a full console is also available.

Regions – You can create Lustre file systems in the US East (N. Virginia), US West (Oregon), US East (Ohio), and Europe (Ireland) Regions.

Pricing – Pricing is based on the amount of storage that you have provisioned, and starts at $0.14 per GiB per month in the US East (N. Virginia), US West (Oregon), and Europe (Ireland) Regions.

Access – You can access your file systems from EC2 instances. You can also use AWS Direct Connect to connect your existing data center or colo to AWS, and access your file systems from there.

Security – Access to each file system goes through a security group, with IAM policies for fine-grained access control. Data at rest is encrypted using a 256-bit block cipher and keys managed by Amazon FSx for Lustre.

Available Now
Amazon FSx for Lustre is available now and you can start using it today!

Jeff;

 

 

New – Amazon FSx for Windows File Server – Fast, Fully Managed, and Secure

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-amazon-fsx-for-windows-file-server-fast-fully-managed-and-secure/

Organizations that want to run Windows applications on the cloud are commonly looking for network file storage that’s fully compatible with their applications and their Windows environments. For example, enterprises use Active Directory for identification and Windows Access Control Lists for fine-grained control over access to folders and files, and their applications typically rely on storage that provides full Windows file system (NTFS file system) compatibility.

Amazon FSx for Windows File Server
Amazon FSx for Windows File Server fits all of these needs, and more. It was designed from the ground up to work with your existing Windows applications and environments, making lift-and-shift of your Windows workloads to the cloud super-easy. You get a native Windows file system backed by fully-managed Windows file servers, accessible via the widely adopted SMB (Server Message Block) protocol. Built on SSD storage, Amazon FSx for Windows File Server delivers the throughput, IOPS, and consistent sub-millisecond performance that you (and your Windows applications) expect.

Here are the most important things to know:

Accessibility & Protocol Support – You can access your shares from Amazon Elastic Compute Cloud (EC2) instances, Amazon WorkSpaces virtual desktops, Amazon AppStream 2.0 applications, and VMware Cloud on AWS. Versions 2.0 through 3.1.1 of SMB are supported, allowing you to use Windows versions starting from Windows 7 and Windows Server 2008, and current versions of Linux (via Samba). Active Directory integration is built in, allowing you to easily integrate with your existing enterprise environment.

Performance and Tunability – Amazon FSx for Windows File Server delivers consistent, sub-millisecond latency. You can set the file system size and throughput (in megabytes per second) independently, with plenty of latitude in each dimension. File systems can be as big as 64 TB, and can deliver up to 2,048 MB/second of throughput.

Management – Your file systems are fully managed and data is stored in redundant form within an AWS Availability Zone. You don’t have to worry about attaching and formatting additional storage devices, updating Windows Server, or recovering from hardware failures. Incremental file-system consistent backups are taken automatically every day, with the option to take additional backups when needed.

Security – You get multiple levels of access control and data protection. File system endpoints are created within Virtual Private Clouds (VPCs) and access is governed by Security Groups. Windows ACLs are used to control access to folders and files; IAM roles are used to control access to administrative functions, with administrative activities logged to AWS CloudTrail. Your data is encrypted in transit and (using a KMS key that you can control) at rest. The service is PCI-DSS compliant and can be used to build HIPAA-compliant applications.

Multi-AZ Deployment – You create file systems in distinct AWS Availability Zones, and can use Microsoft DFS to set up automatic replication and failover between them. You can also use Microsoft DFS Namespaces to create shared, common namespaces that span multiple file systems and provide up to 300 PB of storage.

Creating a File System
Amazon FSx for Windows File Server is easy to use. I start by confirming that I have an Active Directory with a Domain Controller in the VPC subnet (subnet-009a1149) where I plan to create my file system’s endpoints:

For testing purposes, I also have an EC2 instance running Windows in the same subnet:

I open the Amazon FSx Console, and click Create file system:

I choose my file system option:

I specify a name, size, optional throughput, and other parameters for my new file system, and click Review summary to proceed:

On another browser tab I verify that the security group for the file system is configured to allow connections from my EC2 instance on the desired ports (135, 445, and 55555):

On the next page I review the settings and the estimated monthly costs, and click Create file system. My file system starts out in the Creating status and transitions to Available in minutes:

I can see an overview at a glance:

And I can click the Network & Security tab to get the DNS name for my file system:

I copy the DNS name, hop over to my EC2 instance, open Explorer, and map my file system (a share named share is created automatically):
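If you prefer the command line to Explorer, mapping the share from a Command Prompt on the instance looks something like this (the drive letter is arbitrary, and the DNS name is a placeholder for the one shown on the Network & Security tab):

C:\> net use Z: \\fs-0123456789abcdef0.mydomain.example.com\share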

Then I can use it like any other share (I’m sure that your use case is better than mine, but perhaps not as historically significant):

Each file system includes one share (named share) automatically. I can connect to the file system and create additional shares using the standard Windows tools and wizards:

File-system consistent backups are made daily during the backup window for the file system, and are retained for up to 35 days, as specified when the file system was created. I can also make backups on an as-needed basis:

Available Now
Amazon FSx for Windows File Server is available now and you can start using it today in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Ireland) Regions, with expansion to other Regions planned for the coming months. Pricing is based on the amount of storage and throughput that you configure.

Jeff;

 

New – Amazon CloudWatch Logs Insights – Fast, Interactive Log Analytics

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-amazon-cloudwatch-logs-insights-fast-interactive-log-analytics/

Many AWS services create logs. Off the top of my head there are VPC Flow Logs, Route 53 Logs, Lambda Logs, CloudTrail Logs (for AWS API calls), RDS Logs, IoT Logs, ECS Logs, API Gateway Logs, S3 Server Access Logs, and EC2 Instance Logs (via the CloudWatch Agent), to name a few. The services that you run on your EC2 instances (Apache, Tomcat, NGINX, and the like) also produce logs, and your application code probably does the same.

Embedded within these logs are the data points, patterns, trends, and insights that you can use to understand how your applications and AWS resources are behaving, identify room for improvement, and address operational issues. But, as usual, there’s a catch. The breadth of formats and data elements and the sheer size of the raw logs can make analysis difficult. When individual AWS customers routinely generate 100 terabytes or more of log files each day, old-school tools such as find and grep no longer suffice!

CloudWatch Logs Insights
The new CloudWatch Logs Insights will help! This is a fully managed service that is designed to work at cloud scale, with no setup or maintenance required. It plows through massive logs in seconds, and gives you fast, interactive queries and visualizations. It can handle any log format, and auto-discovers fields from JSON logs. As you will see, it is very flexible, and will quickly become one of your favorite tools for diving in to your logs.

CloudWatch Logs Insights includes a sophisticated ad-hoc query language, with commands to fetch desired event fields, filter based on conditions, calculate aggregate statistics including percentiles and time series aggregations, sort on any desired field, and limit the number of events returned by a query. You can also use regular expressions to extract data from an event field, creating one or more ephemeral fields that can be further processed by the query. You can visualize query results using line and stacked area charts, and you can add queries to a CloudWatch Dashboard. There’s even a rich set of sample queries to get you started.

Insights in Action
To get started, I open the CloudWatch Console and click Insights:

Then I choose the desired Log Group using the menu:

I can enter a query, or I can choose one of the samples:

As you can see, sample queries are supplied for several different types of logs. I pick the first one and click Run query; the logs are scanned and the results are visible within seconds:

I can add a filter to my query and run it again. Perhaps I want to focus on EC2 API calls, so I use a pipe ( | ) and the filter command:

I can filter by an absolute or relative time range:

I can also generate visualizations. Here’s a simple one: Amazon RDS memory usage metrics for the last 30 minutes, grouped into 1-minute bins:

CloudWatch Logs Insights discovers all of the fields in the events and tells me how common they are in the selected log:

I can use this to build my queries interactively:

For queries that do not do any aggregation, I can expand an event and see all of the fields:

The query language supports six types of commands:

fields – Retrieves one or more log fields. It can also make use of functions such as abs, sqrt, strlen, trim, and more.

filter – Filters log events based on one or more conditions built from Boolean operators, comparison operators, and regular expressions.

stats – Calculates aggregate statistics such as sum, avg, count, min, max, and percentile for a log field, across a given time interval (specified using the optional by modifier).

sort – Sorts log events in ascending or descending order.

limit – Limits the number of log events returned by a query.

parse – Extracts data from a log field, creating one or more ephemeral fields that can be further processed by the query.

The language also supports a rich set of arithmetic & comparison operators, numeric functions, string functions, date/time functions, and aggregation functions.
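To give you a feel for how these commands compose, here is the kind of query that I might run against my VPC Flow Logs from the command line (the log group name is just an example, and the field names assume VPC Flow Logs):

$ QUERY_ID=$(aws logs start-query \
    --log-group-name "my-vpc-flow-logs" \
    --start-time $(date -d '1 hour ago' +%s) --end-time $(date +%s) \
    --query-string 'fields @timestamp, srcAddr, dstAddr, bytes | filter action = "REJECT" | sort @timestamp desc | limit 20' \
    --query 'queryId' --output text)

$ aws logs get-query-results --query-id $QUERY_ID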

As usual, I have shown you a fairly simple subset of the functionality and power that is available to you. Here are a couple of things that you can try on your own:

Add to Dashboard – After you have created an insightful query, click Add to Dashboard, then select an existing dashboard or create a new one:

Copy Query Results – After you have used CloudWatch Logs Insights to discover an issue, click the Action menu and choose Copy query results:

Then you can paste the results into your ticketing system for resolution.

API and CLI Access – In addition to console access, this feature is accessible via the AWS Command Line Interface (CLI) and the AWS SDKs.

CloudWatch Integration – You can write a bit of glue code to run queries and use the results to publish Custom Metrics (see the sketch below). Then you can visualize them, set alarms, and so forth, all with the goal of simplifying and accelerating your troubleshooting.
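Here’s a rough sketch of that glue, building on the query from the earlier sketch (the namespace and metric name are placeholders); it counts the returned rows and publishes the count as a Custom Metric:

$ COUNT=$(aws logs get-query-results --query-id $QUERY_ID \
    --query 'length(results)' --output text)
$ aws cloudwatch put-metric-data --namespace "MyApp/Logs" \
    --metric-name RejectedConnections --value $COUNT --unit Count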

Available Now
CloudWatch Logs Insights is available now in the US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Canada (Central), Europe (Ireland), Europe (Frankfurt), Europe (London), Europe (Paris), Asia Pacific (Tokyo), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Sydney), and South America (São Paulo) Regions and you can start using it today.

Pricing is based on the amount of ingested log data scanned for each query; you pay $0.005 per GB in US East (N. Virginia), with similar prices in the other regions.

AWS Ground Station – Ingest and Process Data from Orbiting Satellites

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-ground-station-ingest-and-process-data-from-orbiting-satellites/

Did you know that there are currently thousands of satellites orbiting the Earth? I certainly did not, and would have guessed a few hundred at most. Today, high school and college students design, fabricate, and launch nano-, pico-, and even femto-satellites such as CubeSats, PocketQubes, and SunCubes. On the commercial side, organizations of any size can now launch satellites for Earth observation, communication, media distribution, and so forth.

All of these satellites collect a lot of data, and that’s where things get even more interesting. While it is now relatively cheap to get a satellite into Low Earth Orbit (LEO) or Medium Earth Orbit (MEO) and only slightly more expensive to achieve a more distant Geostationary Orbit, getting that data back to Earth is still more difficult than it should be. Large-scale satellite operators often build and run their own ground stations at a cost of a million dollars or more each; smaller operators enter into inflexible long-term contracts to make use of existing ground stations.

Some of the challenges that I reviewed above may remind you of those early, pre-cloud days when you had to build and run your own data center. That changed when we launched Amazon EC2 back in 2006.

Introducing AWS Ground Station
Today I would like to tell you about AWS Ground Station. Amazon EC2 made compute power accessible on a cost-effective, pay-as-you-go basis. AWS Ground Station does the same for satellite ground stations. Instead of building your own ground station or entering into a long-term contract, you can make use of AWS Ground Station on an as-needed, pay-as-you-go basis. You can get access to a ground station on short notice in order to handle a special event: severe weather, a natural disaster, or something more positive such as a sporting event. If you need access to a ground station on a regular basis to capture Earth observations or distribute content world-wide, you can reserve capacity ahead of time and pay even less. AWS Ground Station is a fully managed service. You don’t need to build or maintain antennas, and can focus on your work or research.

We’re starting out with a pair of ground stations today, and will have 12 in operation by mid-2019. Each ground station is associated with a particular AWS Region; the raw analog data from the satellite is processed by our modem digitizer into a data stream (in what is formally known as VITA 49 baseband or VITA 49 RF over IP data streams) and routed to an EC2 instance that is responsible for doing the signal processing to turn it into a byte stream.

Once the data is in digital form, you have a host of streaming, processing, analytics, and storage options. Here’s a starter list:

Streaming – Amazon Kinesis Data Streams to capture, process, and store data streams (see the sketch below).

Processing – Amazon Rekognition for image analysis; Amazon SageMaker to build, train, and deploy ML models.

Analytics / Reporting – Amazon Redshift to store processed data in structured data warehouse form; Amazon Athena and Amazon QuickSight for queries.

Storage – Amazon Simple Storage Service (S3) to store data in object form, with Amazon Glacier for long-term archival storage.

Your entire workflow, from the ground stations all the way through to processing, storage, reporting, and delivery, can now be done on elastic, pay-as-you-go infrastructure!
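As a simple illustration of the streaming option (the stream name is just an example, and put-record is shown with a stand-in payload), feeding the digitized byte stream into Kinesis from the processing instance could start out like this:

$ aws kinesis create-stream --stream-name ground-station-downlink --shard-count 1
$ aws kinesis put-record --stream-name ground-station-downlink \
    --partition-key contact-1 --data "demo-payload"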

AWS Ground Station in Action
I did not have an actual satellite to test with, so the AWS Ground Station team created an imaginary one in my account! When you are ready to make use of AWS Ground Station, we’ll need your satellite’s NORAD ID, information about your FCC license, and your AWS account number so that we can associate it with your account.

I open the Ground Station Console and click Reserve contacts now to get started:

The first step is to reserve a contact (an upcoming time when my satellite will be in the optimal position to transmit to the ground station I choose). I choose a ground station from the menu:

I can filter based on status (Available, Scheduled, or Completed) and on a time range:

I can see the contacts, pick one that meets my requirements, select it, and click Reserve Contact:

I confirm my contact on the next page, and click Reserve:

Then I can filter the Contacts list to show all of my upcoming reservations:

After my contact has been reserved, I make sure that my EC2 instances will be running in the AWS Region associated with the ground station at least 15 minutes ahead of the start time. The instance responsible for the signal processing connects to an Elastic Network Interface (ENI), uses DataDefender to manage the data transfer, and routes the data to a software modem such as qRadio to convert it to digital form (we’ll provide customers with a CloudFormation template that will create the ENI and do all of the other setup work).

Things to Know
Here are a couple of things you should know about AWS Ground Station:

Access – Due to the nature of this service, access is not self-serve. You will need to communicate with our team in order to register your satellite(s).

Ground Stations – As I mentioned earlier, we are launching today with 2 ground stations, and will have a total of 12 in operation by mid-2019. We will monitor utilization and demand, and will build additional stations and antennas as needed.

Pricing – Pricing is per-minute of downlink time, with an option to pre-pay for blocks of minutes.

Jeff;

 

AWS Launches, Previews, and Pre-Announcements at re:Invent 2018 – Monday Night Live

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-previews-and-pre-announcements-at-reinvent-2018-monday-night-live/

As promised in Welcome to AWS re:Invent 2018, here’s a summary of the launches, previews, and pre-announcements from Monday Night Live!

Launches
Here are detailed blog posts & what’s new pages for tonight’s launches:

P3dn Instances
The upcoming p3dn.24xlarge instances will feature 100 Gbps network bandwidth, local NVMe storage, and eight of the latest NVIDIA Tesla V100 GPUs (a total of 256 GB of GPU memory). With 2x the GPU memory and 1.5x as many vCPUs as p3.16xlarge instances, these instances will allow you to explore bigger and more complex deep learning algorithms, render 3D images, transcode video in real time, model financial risks, and much more.

Elastic Fabric Adapter
This is a new network interface for EC2 instances. It is designed to support High Performance Computing (HPC) workloads on AWS that need lots of inter-node communication: computational fluid dynamics, weather modeling, reservoir simulation, and the like. EFA will support the industry-standard Message Passing Interface (MPI) so that you can bring your existing HPC applications to AWS without changing any code. Sign up for the preview.

AWS IoT Events
This service monitors IoT sensors at scale, looking for anomalies, trends, and patterns that can indicate a systemic failure, production slowdown, or a change in operation. It triggers pre-defined actions and generates alerts to on-site teams when something is amiss. Sign up for the preview.

AWS IoT SiteWise
This service helps our industrial customers to collect, structure, and search thousands of sensor data streams across multiple facilities. An on-premises gateway device collects data from OPC-UA servers and forwards it to AWS for further processing. The data can be used to build visual representations of production lines and processes, and is used in conjunction with AWS IoT Analytics to forecast trends. Sign up for the preview.

AWS IoT Things Graph
This service will make it even easier for you to rapidly build IoT applications for edge gateways that run AWS IoT Greengrass. You will be able to connect devices and web services, even if the devices are from a variety of vendors and speak different protocols. Sign up for the preview.

Stay Tuned
I am looking forward to writing about each of these services when they are ready to launch, so stay tuned!

Jeff;

Firecracker – Lightweight Virtualization for Serverless Computing

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/firecracker-lightweight-virtualization-for-serverless-computing/

One of my favorite Amazon Leadership Principles is Customer Obsession. When we launched AWS Lambda, we focused on giving developers a secure serverless experience so that they could avoid managing infrastructure. In order to attain the desired level of isolation we used dedicated EC2 instances for each customer. This approach allowed us to meet our security goals but forced us to make some tradeoffs with respect to the way that we managed Lambda behind the scenes. Also, as is the case with any new AWS service, we did not know how customers would put Lambda to use or even what they would think of the entire serverless model. Our plan was to focus on delivering a great customer experience while making the backend ever-more efficient over time.

Just four years later (Lambda was launched at re:Invent 2014) it is clear that the serverless model is here to stay. Today, Lambda processes trillions of executions for hundreds of thousands of active customers every month. Last year we extended the benefits of serverless to containers with the launch of AWS Fargate, which now runs tens of millions of containers for AWS customers every week.

As our customers increasingly adopted serverless, it was time to revisit the efficiency issue. Taking our Invent and Simplify principle to heart, we asked ourselves what a virtual machine would look like if it was designed for today’s world of containers and functions!

Introducing Firecracker
Today I would like to tell you about Firecracker, a new virtualization technology that makes use of KVM. You can launch lightweight micro-virtual machines (microVMs) in non-virtualized environments in a fraction of a second, taking advantage of the security and workload isolation provided by traditional VMs and the resource efficiency that comes along with containers.

Here’s what you need to know about Firecracker:

Secure – This is always our top priority! Firecracker uses multiple levels of isolation and protection, and exposes a minimal attack surface.

High Performance – You can launch a microVM in as little as 125 ms today (and even faster in 2019), making it ideal for many types of workloads, including those that are transient or short-lived.

Battle-Tested – Firecracker has been battle-tested and is already powering multiple high-volume AWS services including AWS Lambda and AWS Fargate.

Low Overhead – Firecracker consumes about 5 MiB of memory per microVM. You can run thousands of secure VMs with widely varying vCPU and memory configurations on the same instance.

Open Source – Firecracker is an active open source project. We are ready to review and accept pull requests, and look forward to collaborating with contributors from all over the world.

Firecracker was built in a minimalist fashion. We started with crosvm and set up a minimal device model in order to reduce overhead and to enable secure multi-tenancy. Firecracker is written in Rust, a modern programming language that guarantees thread safety and prevents many types of buffer overrun errors that can lead to security vulnerabilities.

Firecracker Security
As I mentioned earlier, Firecracker incorporates a host of security features! Here’s a partial list:

Simple Guest Model – Firecracker guests are presented with a very simple virtualized device model in order to minimize the attack surface: a network device, a block I/O device, a Programmable Interval Timer, the KVM clock, a serial console, and a partial keyboard (just enough to allow the VM to be reset).

Process Jail – The Firecracker process is jailed using cgroups and seccomp BPF, and has access to a small, tightly controlled list of system calls.

Static Linking – The Firecracker process is statically linked, and can be launched from a jailer to ensure that the host environment is as safe and clean as possible.

Firecracker in Action
To get some experience with Firecracker, I launch an i3.metal instance and download three files (the firecracker binary, a root file system image, and a Linux kernel):

I need to set up the proper permission to access /dev/kvm:

$  sudo setfacl -m u:${USER}:rw /dev/kvm

I start firecracker in one PuTTY session, and then issue commands in another (the process listens on a Unix-domain socket and implements a REST API). The first command sets the configuration for my first guest machine:

$ curl --unix-socket /tmp/firecracker.sock -i \
    -X PUT "http://localhost/machine-config" \
    -H "accept: application/json" \
    -H "Content-Type: application/json" \
    -d "{
        \"vcpu_count\": 1,
        \"mem_size_mib\": 512
    }"

And, the second sets the guest kernel:

$ curl --unix-socket /tmp/firecracker.sock -i \
    -X PUT "http://localhost/boot-source" \
    -H "accept: application/json" \
    -H "Content-Type: application/json" \
    -d "{
        \"kernel_image_path\": \"./hello-vmlinux.bin\",
        \"boot_args\": \"console=ttyS0 reboot=k panic=1 pci=off\"
    }"

And, the third one sets the root file system:

$ curl --unix-socket /tmp/firecracker.sock -i \
    -X PUT "http://localhost/drives/rootfs" \
    -H "accept: application/json" \
    -H "Content-Type: application/json" \
    -d "{
        \"drive_id\": \"rootfs\",
        \"path_on_host\": \"./hello-rootfs.ext4\",
        \"is_root_device\": true,
        \"is_read_only\": false
    }"

With everything set to go, I can launch a guest machine:

# curl --unix-socket /tmp/firecracker.sock -i \
    -X PUT "http://localhost/actions" \
    -H  "accept: application/json" \
    -H  "Content-Type: application/json" \
    -d "{
        \"action_type\": \"InstanceStart\"
     }"

And I am up and running with my first VM:

In a real-world scenario I would script or program all of my interactions with Firecracker, and I would probably spend more time setting up the networking and the other I/O. But re:Invent awaits and I have a lot more to do, so I will leave that part as an exercise for you.
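If you decide to tackle the networking piece, the Firecracker API also exposes a network-interfaces endpoint. Here’s a rough sketch (it assumes a tap device on the host; the device name and MAC address are arbitrary placeholders), issued before the InstanceStart action:

$ sudo ip tuntap add dev fc-tap0 mode tap
$ curl --unix-socket /tmp/firecracker.sock -i \
    -X PUT "http://localhost/network-interfaces/eth0" \
    -H "accept: application/json" \
    -H "Content-Type: application/json" \
    -d "{
        \"iface_id\": \"eth0\",
        \"guest_mac\": \"AA:FC:00:00:00:01\",
        \"host_dev_name\": \"fc-tap0\"
    }"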

Collaborate with Us
As you can see this is a giant leap forward, but it is just a first step. The team is looking forward to telling you more, and to working with you to move ahead. Star the repo, join the community, and send us some code!

Jeff;

New C5n Instances with 100 Gbps Networking

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-c5n-instances-with-100-gbps-networking/

We launched the powerful, compute-intensive C5 instances last year, and followed up with the C5d instances earlier this year with the addition of local NVMe storage. Both instances are built on the AWS Nitro system and are powered by AWS-custom 3.0 GHz Intel® Xeon® Platinum 8000 series processors. They are designed for compute-heavy applications such as batch processing, distributed analytics, high-performance computing (HPC), ad serving, highly scalable multiplayer gaming, and video encoding.

New 100 Gbps Networking
Today we are adding an even more powerful variant, the C5n instance. With up to 100 Gbps of network bandwidth, your simulations, in-memory caches, data lakes, and other communication-intensive applications will run better than ever. Here are the specs:

Instance Name    vCPUs   RAM        EBS Bandwidth    Network Bandwidth
c5n.large        2       5.25 GiB   Up to 3.5 Gbps   Up to 25 Gbps
c5n.xlarge       4       10.5 GiB   Up to 3.5 Gbps   Up to 25 Gbps
c5n.2xlarge      8       21 GiB     Up to 3.5 Gbps   Up to 25 Gbps
c5n.4xlarge      16      42 GiB     3.5 Gbps         Up to 25 Gbps
c5n.9xlarge      36      96 GiB     7 Gbps           50 Gbps
c5n.18xlarge     72      192 GiB    14 Gbps          100 Gbps

The Nitro Hypervisor allows the full range of C5n instances to deliver performance that is just about indistinguishable from bare metal. Other AWS Nitro System components, including the Nitro Security Chip, hardware EBS processing, and hardware support for the software defined network inside of each VPC also enhance performance.

Each vCPU is a hardware hyperthread on the Intel Xeon Platinum 8000 series processor. You get full control over the C-states on the two largest sizes, allowing you to run a single core at up to 3.5 GHz using Intel Turbo Boost Technology.

The new instances also feature a higher amount of memory per core, putting them in the current “sweet spot” for HPC applications that work most efficiently when there’s at least 4 GiB of memory for each core. The instances also benefit from some internal improvements that boost memory access speed by up to 19% in comparison to the C5 and C5d instances.

It’s All About the Network
Now let’s get to the big news!

The C5n instances incorporate the fourth generation of our custom Nitro hardware, allowing the high-end instances to provide up to 100 Gbps of network throughput, along with a higher ceiling on packets per second. The Elastic Network Interface (ENI) on the C5n uses up to 32 queues (in comparison to 8 on the C5 and C5d), allowing the packet processing workload to be better distributed across all available vCPUs. The ability to push more packets per second will make these instances a great fit for network appliances such as firewalls, routers, and 5G cellular infrastructure.

In order to make the most of the available network bandwidth, you need to be using the latest Elastic Network Adapter (ENA) drivers (available in the latest Amazon Linux, Red Hat 7.6, and Ubuntu AMIs, and in the upstream Linux kernel) and you need to make use of multiple traffic flows. Flows within a Placement Group can reach 10 Gbps; the rest can reach 5 Gbps. When using multiple flows on the high-end instances, you can transfer 100 Gbps between EC2 instances in the same region (within or across AZs), S3 buckets, and AWS services such as Amazon Relational Database Service (RDS), Amazon ElastiCache, and Amazon EMR.
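For example, launching a pair of c5n.18xlarge instances into a cluster placement group from the command line looks something like this (the AMI ID and key name are placeholders):

$ aws ec2 create-placement-group --group-name c5n-cluster --strategy cluster
$ aws ec2 run-instances --instance-type c5n.18xlarge --count 2 \
    --image-id ami-0123456789abcdef0 --key-name my-key \
    --placement GroupName=c5n-cluster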

Available Now
C5n instances are available now in the US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), and AWS GovCloud (US-West) Regions and you can launch one (or an entire cluster of them) today in On-Demand, Reserved Instance, Spot, Dedicated Host, or Dedicated Instance form.

Jeff;