Tag Archives: launch

New – Parallel Query for Amazon Aurora

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-parallel-query-for-amazon-aurora/

Amazon Aurora is a relational database that was designed to take full advantage of the abundance of networking, processing, and storage resources available in the cloud. While maintaining compatibility with MySQL and PostgreSQL on the user-visible side, Aurora makes use of a modern, purpose-built distributed storage system under the covers. Your data is striped across hundreds of storage nodes distributed over three distinct AWS Availability Zones, with two copies per zone, on fast SSD storage. Here’s what this looks like (extracted from Getting Started with Amazon Aurora):

New Parallel Query
When we launched Aurora we also hinted at our plans to apply the same scale-out design principle to other layers of the database stack. Today I would like to tell you about our next step along that path.

Each node in the storage layer pictured above also includes plenty of processing power. Aurora is now able to make great use of that processing power by taking your analytical queries (generally those that process all or a large part of a good-sized table) and running them in parallel across hundreds or thousands of storage nodes, with speed benefits approaching two orders of magnitude. Because this new model reduces network, CPU, and buffer pool contention, you can run a mix of analytical and transactional queries simultaneously on the same table while maintaining high throughput for both types of queries.

The instance class determines the number of parallel queries that can be active at a given time:

  • db.r*.large – 1 concurrent parallel query session
  • db.r*.xlarge – 2 concurrent parallel query sessions
  • db.r*.2xlarge – 4 concurrent parallel query sessions
  • db.r*.4xlarge – 8 concurrent parallel query sessions
  • db.r*.8xlarge – 16 concurrent parallel query sessions
  • db.r4.16xlarge – 16 concurrent parallel query sessions

You can use the aurora_pq parameter to enable and disable the use of parallel queries at the global and the session level.

Parallel queries enhance the performance of over 200 types of single-table predicates and hash joins. The Aurora query optimizer will automatically decide whether to use Parallel Query based on the size of the table and the amount of table data that is already in memory; you can also use the aurora_pq_force session variable to override the optimizer for testing purposes.

Parallel Query in Action
You will need to create a fresh cluster in order to make use of the Parallel Query feature. You can create one from scratch, or you can restore a snapshot.

To create a cluster that supports Parallel Query, I simply choose Provisioned with Aurora parallel query enabled as the Capacity type:

I used the CLI to restore a 100 GB snapshot for testing, and then explored one of the queries from the TPC-H benchmark. Here’s the basic query:

  SUM(l_extendedprice * (1-l_discount)) AS revenue,

FROM customer, orders, lineitem

  AND c_custkey = o_custkey
  AND l_orderkey = o_orderkey
  AND o_orderdate < date '1995-03-13'
  AND l_shipdate > date '1995-03-13'


  revenue DESC,
  o_orderdate LIMIT 15;

The EXPLAIN command shows the query plan, including the use of Parallel Query:

| id | select_type | table    | type | possible_keys                 | key  | key_len | ref  | rows      | Extra                                                                                                                          |
|  1 | SIMPLE      | customer | ALL  | PRIMARY                       | NULL | NULL    | NULL |  14354602 | Using where; Using temporary; Using filesort                                                                                   |
|  1 | SIMPLE      | orders   | ALL  | PRIMARY,o_custkey,o_orderdate | NULL | NULL    | NULL | 154545408 | Using where; Using join buffer (Hash Join Outer table orders); Using parallel query (4 columns, 1 filters, 1 exprs; 0 extra)   |
|  1 | SIMPLE      | lineitem | ALL  | PRIMARY,l_shipdate            | NULL | NULL    | NULL | 606119300 | Using where; Using join buffer (Hash Join Outer table lineitem); Using parallel query (4 columns, 1 filters, 1 exprs; 0 extra) |
3 rows in set (0.01 sec)

Here is the relevant part of the Extras column:

Using parallel query (4 columns, 1 filters, 1 exprs; 0 extra)

The query runs in less than 2 minutes when Parallel Query is used:

| l_orderkey | revenue     | o_orderdate | o_shippriority |
|   92511430 | 514726.4896 | 1995-03-06  |              0 |
|  593851010 | 475390.6058 | 1994-12-21  |              0 |
|  188390981 | 458617.4703 | 1995-03-11  |              0 |
|  241099140 | 457910.6038 | 1995-03-12  |              0 |
|  520521156 | 457157.6905 | 1995-03-07  |              0 |
|  160196293 | 456996.1155 | 1995-02-13  |              0 |
|  324814597 | 456802.9011 | 1995-03-12  |              0 |
|   81011334 | 455300.0146 | 1995-03-07  |              0 |
|   88281862 | 454961.1142 | 1995-03-03  |              0 |
|   28840519 | 454748.2485 | 1995-03-08  |              0 |
|  113920609 | 453897.2223 | 1995-02-06  |              0 |
|  377389669 | 453438.2989 | 1995-03-07  |              0 |
|  367200517 | 453067.7130 | 1995-02-26  |              0 |
|  232404000 | 452010.6506 | 1995-03-08  |              0 |
|   16384100 | 450935.1906 | 1995-03-02  |              0 |
15 rows in set (1 min 53.36 sec)

I can disable Parallel Query for the session (I can use an RDS custom cluster parameter group for a longer-lasting effect):

set SESSION aurora_pq=OFF;

The query runs considerably slower without it:

| l_orderkey | o_orderdate | revenue     | o_shippriority |
|   92511430 | 1995-03-06  | 514726.4896 |              0 |
|   16384100 | 1995-03-02  | 450935.1906 |              0 |
15 rows in set (1 hour 25 min 51.89 sec)

This was on a db.r4.2xlarge instance; other instance sizes, data sets, access patterns, and queries will perform differently. I can also override the query optimizer and insist on the use of Parallel Query for testing purposes:

set SESSION aurora_pq_force=ON;

Things to Know
Here are a couple of things to keep in mind when you start to explore Amazon Aurora Parallel Query:

Engine Support – We are launching with support for MySQL 5.6, and are working on support for MySQL 5.7 and PostgreSQL.

Table Formats – The table row format must be COMPACT; partitioned tables are not supported.

Data Types – The TEXT, BLOB, and GEOMETRY data types are not supported.

DDL – The table cannot have any pending fast online DDL operations.

Cost – You can make use of Parallel Query at no extra charge. However, because it makes direct access to storage, there is a possibility that your IO cost will increase.

Give it a Shot
This feature is available now and you can start using it today!



AWS Data Transfer Price Reductions – Up to 34% (Japan) and 28% (Australia)

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-data-transfer-price-reductions-up-to-34-japan-and-28-australia/

I’ve got good good news for AWS customers who make use of our Asia Pacific (Tokyo) and Asia Pacific (Sydney) Regions. Effective September 1, 2018 we are reducing prices for data transfer from Amazon Elastic Compute Cloud (EC2), Amazon Simple Storage Service (S3), and Amazon CloudFront by up to 34% in Japan and 28% in Australia.

EC2 and S3 Data Transfer
Here are the new prices for data transfer from EC2 and S3 to the Internet:

EC2 & S3 Data Transfer Out to Internet Japan Australia
Old Rate New Rate Change Old Rate New Rate Change
Up to 1 GB / Month $0.000 $0.000 0% $0.000 $0.000 0%
Next 9.999 TB / Month $0.140 $0.114 -19% $0.140 $0.114 -19%
Next 40 TB / Month $0.135 $0.089 -34% $0.135 $0.098 -27%
Next 100 TB / Month $0.130 $0.086 -34% $0.130 $0.094 -28%
Greater than 150 TB / Month $0.120 $0.084 -30% $0.120 $0.092 -23%

You can consult the EC2 Pricing and S3 Pricing pages for more information.

CloudFront Data Transfer
Here are the new prices for data transfer from CloudFront edge nodes to the Internet

CloudFront Data Transfer Out to Internet Japan Australia
Old Rate New Rate Change Old Rate New Rate Change
Up to 10 TB / Month $0.140 $0.114 -19% $0.140 $0.114 -19%
Next 40 TB / Month $0.135 $0.089 -34% $0.135 $0.098 -27%
Next 100 TB / Month $0.120 $0.086 -28% $0.120 $0.094 -22%
Next 350 TB / Month $0.100 $0.084 -16% $0.100 $0.092 -8%
Next 524 TB / Month $0.080 $0.080 0% $0.095 $0.090 -5%
Next 4 PB / Month $0.070 $0.070 0% $0.090 $0.085 -6%
Over 5 PB / Month $0.060 $0.060 0% $0.085 $0.080 -6%

Visit the CloudFront Pricing page for more information.

We have also reduced the price of data transfer from CloudFront to your Origin. The price for CloudFront Data Transfer to Origin from edge locations in Australia has been reduced 20% to $0.080 per GB. This represents content uploads via POST and PUT.

Things to Know
Here are a couple of interesting things that you should know about AWS and data transfer:

AWS Free Tier – You can use the AWS Free Tier to get started with, and to learn more about, EC2, S3, CloudFront, and many other AWS services. The AWS Getting Started page contains lots of resources to help you with your first project.

Data Transfer from AWS Origins to CloudFront – There is no charge for data transfers from an AWS origin (S3, EC2, Elastic Load Balancing, and so forth) to any CloudFront edge location.

CloudFront Reserved Capacity Pricing – If you routinely use CloudFront to deliver 10 TB or more content per month, you should investigate our Reserved Capacity pricing. You can receive a significant discount by committing to transfer 10 TB or more content from a single region, with additional discounts at higher levels of usage. To learn more or to sign up, simply Contact Us.



New – AWS Storage Gateway Hardware Appliance

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-aws-storage-gateway-hardware-appliance/

AWS Storage Gateway connects your on-premises applications to AWS storage services such as Amazon Simple Storage Service (S3), Amazon Elastic Block Store (EBS), and Amazon Glacier. It runs in your existing virtualized environment and is visible to your applications and your client operating systems as a file share, a local block volume, or a virtual tape library. The resulting hybrid storage model gives our customers the ability to use their AWS Storage Gateways for backup, archiving, disaster recovery, cloud data processing, storage tiering, and migration.

New Hardware Appliance
Today we are making Storage Gateway available as a hardware appliance, adding to the existing support for VMware ESXi, Microsoft Hyper-V, and Amazon EC2. This means that you can now make use of Storage Gateway in situations where you do not have a virtualized environment, server-class hardware or IT staff with the specialized skills that are needed to manage them. You can order appliances from Amazon.com for delivery to branch offices, warehouses, and “outpost” offices that lack dedicated IT resources. Setup (as you will see in a minute) is quick and easy, and gives you access to three storage solutions:

File Gateway – A file interface to Amazon S3, accessible via NFS or SMB. The files are stored as S3 objects, allowing you to make use of specialized S3 features such as lifecycle management and cross-region replication. You can trigger AWS Lambda functions, run Amazon Athena queries, and use Amazon Macie to discover and classify sensitive data.

Volume Gateway – Cloud-backed storage volumes, accessible as local iSCSI volumes. Gateways can be configured to cache frequently accessed data locally, or to store a full copy of all data locally. You can create EBS snapshots of the volumes and use them for disaster recovery or data migration.

Tape Gateway – A cloud-based virtual tape library (VTL), accessible via iSCSI, so you can replace your on-premises tape infrastructure, without changing your backup workflow.

To learn more about each of these solutions, read What is AWS Storage Gateway.

The AWS Storage Gateway Hardware Appliance is based on a specially configured Dell EMC PowerEdge R640 Rack Server that is pre-loaded with AWS Storage Gateway software. It has 2 Intel® Xeon® processors, 128 GB of memory, 6 TB of usable SSD storage for your locally cached data, redundant power supplies, and you can order one from Amazon.com:

If you have an Amazon Business account (they’re free) you can use a purchase order for the transaction. In addition to simplifying deployment, using this standardized configuration helps to assure consistent performance for your local applications.

Hardware Setup
As you know, I like to go hands-on with new AWS products. My colleagues shipped a pre-release appliance to me; I left it under the watchful guide of my CSO (Canine Security Officer) until I was ready to write this post:

I don’t have a server room or a rack, so I set it up on my hobby table for testing:

In addition to the appliance, I also scrounged up a VGA cable, a USB keyboard, a small monitor, and a power adapter (C13 to NEMA 5-15). The adapter is necessary because the cord included with the appliance is intended to plug in a power distribution jack commonly found in a data center. I connected it all up, turned it on and watched it boot up, then entered a new administrative password.

Following the directions in the documentation, I configured an IPV4 address, using DHCP for convenience:

I captured the IP address for use in the next step, selected Back (the UI is keyboard-driven) and then logged out. This is the only step that takes place on the local console.

Gateway Configuration
At this point I will switch from past to present, and walk you through the configuration process. As directed by the Getting Started Guide, I open the Storage Gateway Console on the same network as the appliance, select the region where I want to create my gateway, and click Get started:

I select File gateway and click Next to proceed:

I select Hardware Appliance as my host platform (I can click Buy on Amazon to purchase one if necessary), and click Next:

Then I enter the IP address of my appliance and click Connect:

I enter a name for my gateway (jbgw1), set the time zone, pick ZFS as my RAID Volume Manager, and click Activate to proceed:

My gateway is activated within a second or two and I can see it in the Hardware section of the console:

At this point I am free to use a console that is not on the same network, so I’ll switch back to my trusty WorkSpace!

Now that my hardware has been activated, I can launch the actual gateway service on it. I select the appliance, and choose Launch Gateway from the Actions menu:

I choose the desired gateway type, enter a name (fgw1) for it, and click Launch gateway:

The gateway will start off in the Offline status, and transition to Online within 3 to 5 minutes. The next step is to allocate local storage by clicking Edit local disks:

Since I am creating a file gateway, all of the local storage is used for caching:

Now I can create a file share on my appliance! I click Create file share, enter the name of an existing S3 bucket, and choose NFS or SMB, then click Next:

I configure a couple of S3 options, request creation of a new IAM role, and click Next:

I review all of my choices and click Create file share:

After I create the share I can see the commands that are used to mount it in each client environment:

I mount the share on my Ubuntu desktop (I had to install the nfs-client package first) and copy a bunch of files to it:

Then I visit the S3 bucket and see that the gateway has already uploaded the files:

Finally, I have the option to change the configuration of my appliance. After making sure that all network clients have unmounted the file share, I remove the existing gateway:

And launch a new one:

And there you have it. I installed and configured the appliance, created a file share that was accessible from my on-premises systems, and then copied files to it for replication to the cloud.

Now Available
The Storage Gateway Hardware Appliance is available now and you can purchase one today. Start in the AWS Storage Gateway Console and follow the steps above!




New – AWS Systems Manager Session Manager for Shell Access to EC2 Instances

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-session-manager/

It is a very interesting time to be a corporate IT administrator. On the one hand, developers are talking about (and implementing) an idyllic future where infrastructure as code, and treating servers and other resources as cattle. On the other hand, legacy systems still must be treated as pets, set up and maintained by hand or with the aid of limited automation. Many of the customers that I speak with are making the transition to the future at a rapid pace, but need to work in the world that exists today. For example, they still need shell-level access to their servers on occasion. They might need to kill runaway processes, consult server logs, fine-tune configurations, or install temporary patches, all while maintaining a strong security profile. They want to avoid the hassle that comes with running Bastion hosts and the risks that arise when opening up inbound SSH ports on the instances.

We’ve already addressed some of the need for shell-level access with the AWS Systems Manager Run Command. This AWS facility gives administrators secure access to EC2 instances. It allows them to create command documents and run them on any desired set of EC2 instances, with support for both Linux and Microsoft Windows. The commands are run asynchronously, with output captured for review.

New Session Manager
Today we are adding a new option for shell-level access. The new Session Manager makes the AWS Systems Manager even more powerful. You can now use a new browser-based interactive shell and a command-line interface (CLI) to manage your Windows and Linux instances. Here’s what you get:

Secure Access – You don’t have to manually set up user accounts, passwords, or SSH keys on the instances and you don’t have to open up any inbound ports. Session Manager communicates with the instances via the SSM Agent across an encrypted tunnel that originates on the instance, and does not require a bastion host.

Access Control – You use IAM policies and users to control access to your instances, and don’t need to distribute SSH keys. You can limit access to a desired time/maintenance window by using IAM’s Date Condition Operators.

Auditability – Commands and responses can be logged to Amazon CloudWatch and to an S3 bucket. You can arrange to receive an SNS notification when a new session is started.

Interactivity – Commands are executed synchronously in a full interactive bash (Linux) or PowerShell (Windows) environment

Programming and Scripting – In addition to the console access that I will show you in a moment, you can also initiate sessions from the command line (aws ssm ...) or via the Session Manager APIs.

The SSM Agent running on the EC2 instances must be able to connect to Session Manager’s public endpoint. You can also set up a PrivateLink connection to allow instances running in private VPCs (without Internet access or a public IP address) to connect to Session Manager.

Session Manager in Action
In order to use Session Manager to access my EC2 instances, the instances must be running the latest version (2.3.12 or above) of the SSM Agent. The instance role for the instances must reference a policy that allows access to the appropriate services; you can create your own or use AmazonEC2RoleForSSM. Here are my EC2 instances (sk1 and sk2 are running Amazon Linux; sk3-win and sk4-win are running Microsoft Windows):

Before I run my first command, I open AWS Systems Manager and click Preferences. Since I want to log my commands, I enter the name of my S3 bucket and my CloudWatch log group. If I enter either or both values, the instance policy must also grant access to them:

I’m ready to roll! I click Sessions, see that I have no active sessions, and click Start session to move ahead:

I select a Linux instance (sk1), and click Start session again:

The session opens up immediately:

I can do the same for one of my Windows instances:

The log streams are visible in CloudWatch:

Each stream contains the content of a single session:

In the Works
As usual, we have some additional features in the works for Session Manager. Here’s a sneak peek:

SSH Client – You will be able to create SSH sessions atop Session Manager without opening up any inbound ports.

On-Premises Access – We plan to give you the ability to access your on-premises instances (which must be running the SSM Agent) via Session Manager.

Available Now
Session Manager is available in all AWS regions (including AWS GovCloud) at no extra charge.


AWS – Ready for the Next Storm

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-ready-for-the-next-storm/

As I have shared in the past (AWS – Ready to Weather the Storm) we take extensive precautions to help ensure that AWS will remain operational in the face of hurricanes, storms, and other natural disasters. With Hurricane Florence heading for the east coast of the United States, I thought it would be a good time to review and update some of the most important points from that post. Here’s what I want you to know:

Availability Zones – We replicate critical components of AWS across multiple Availability Zones to ensure high availability. Common points of failure, such as generators, UPS units, and air conditioning, are not shared across Availability Zones. Electrical power systems are designed to be fully redundant and can be maintained without impacting operations. The AWS Well-Architected Framework provides guidance on the proper use of multiple Availability Zones to build applications that are reliable and resilient, as does the Building Fault-Tolerant Applications on AWS whitepaper.

Contingency Planning – We maintain contingency plans and regularly rehearse our responses. We maintain a series of incident response plans and update them regularly to incorporate lessons learned and to prepare for emerging threats. In the days leading up to a known event such as a hurricane, we increase fuel supplies, update staffing plans, and add provisions to ensure the health and safety of our support teams.

Data Transfer – With a storage capacity of 100 TB per device, AWS Snowball Edge appliances can be used to quickly move large amounts of data to the cloud.

Disaster Response – When call volumes spike before, during, or after a disaster, Amazon Connect can supplement your existing call center resources and allow you to provide a better response.

Support – You can contact AWS Support if you are in need of assistance with any of these issues.




Amazon AppStream 2.0 – New Application Settings Persistence and a Quick Launch Recap

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-appstream-2-0-new-application-settings-persistence-and-a-quick-launch-recap/

Amazon AppStream 2.0 gives you access to Windows desktop applications through a web browser. Thousands of AWS customers, including SOLIDWORKS, Siemens, and MathWorks are already using AppStream 2.0 to deliver applications to their customers.

Today I would like to bring you up to date on some recent additions to AppStream 2.0, wrapping up with a closer look at a brand new feature that will automatically save application customizations (preferences, bookmarks, toolbar settings, connection profiles, and the like) and Windows settings between your sessions.

The recent additions to AppStream 2.0 can be divided into four categories:

User Enhancements – Support for time zone, locale, and language input, better copy/paste, and the new application persistence feature.

Admin Improvements – The ability to configure default application settings, control access to some system resources, copy images across AWS regions, establish custom branding, and share images between AWS accounts.

Storage Integration – Support for Microsoft OneDrive for Business and Google Drive for G Suite.

Regional Expansion – AppStream 2.0 recently became available in three additional AWS regions in Europe and Asia.

Let’s take a look at each item and then at application settings persistence….

User Enhancements
In June we gave AppStream 2.0 users control over the time zone, locale, and input methods. Once set, the values apply to future sessions in the same AWS region. This feature (formally known as Regional Settings) must be enabled by the AppStream 2.0 administrator as detailed in Enable Regional Settings for Your AppStream 2.0 Users.

In July we added keyboard shortcuts for copy/paste between your local device and your AppStream 2.0 sessions when using Google Chrome.

Admin Improvements
In February we gave AppStream 2.0 administrators the ability to copy AppStream 2.0 images to other AWS regions, simplifying the process of creating and managing global application deployments (to learn more, visit Tag and Copy an Image):

In March we gave AppStream 2.0 administrators additional control over the user experience, including the ability to customize the logo, color, text, and help links in the application catalog page. Read Add Your Custom Branding to AppStream 2.0 to learn more.

In May we added administrative control over the data that moves to and from the AppStream 2.0 streaming sessions. AppStream 2.0 administrators can control access to file upload, file download, printing, and copy/paste to and from local applications. Read Create AppStream 2.0 Fleets and Stacks to learn more.

In June we gave AppStream 2.0 administrators the power to configure default application settings (connection profiles, browser settings, and plugins) on behalf of their users. Read Enabling Default OS and Application Settings for Your Users to learn more.

In July we gave AppStream 2.0 administrators the ability to share AppStream 2.0 images between AWS accounts for use in the same AWS Region. To learn more, take a look at the UpdateImagePermissions API and the update-image-permissions command.

Storage Integration
Both of these launches provide AppStream 2.0 users with additional storage options for the documents that they access, edit, and create:

Launched in June, the Google Drive for G Suite support allows users to access files on a Google Drive from inside of their applications. Read Google Drive for G Suite is now enabled on Amazon AppStream 2.0 to learn how to enable this feature for an AppStream application stack.

Similiarly, the Microsoft OneDrive for Business Support that was launched in July allows users to access files stored in OneDrive for Business accounts. Read Amazon AppStream 2.0 adds support for OneDrive for Business to learn how to set this up.


Regional Expansion
In January we made AppStream 2.0 available in the Asia Pacific (Singapore) and Asia Pacific (Sydney) Regions.

In March we made AppStream 2.0 available in the Europe (Frankfurt) Region.

See the AWS Region Table for the full list of regions where AppStream 2.0 is available.

Application Settings Persistence
With the past out of the way, let’s take a look at today’s new feature, Application Settings Persistence!

As you can see from the launch recap above, AppStream 2.0 already saves several important application and system settings between sessions. Today we are adding support for the elements that make up the Windows Roaming Profile. This includes:

Windows Profile – The contents of C:\users\user_name\appdata .

Windows Profile Folder – The contents of C:\users\user_name .

Windows Registry – The tree of registry entries rooted at HKEY_CURRENT_USER .

This feature must be enabled by the AppStream 2.0 administrator. The contents of the Windows Roaming Profile are stored in an S3 bucket in the administrator’s AWS account, with an initial storage allowance (easily increased) of up to 1 GB per user. The S3 bucket is configured for Server Side Encryption with keys managed by S3. Data moves between AppStream 2.0 and S3 across a connection that is protected by SSL. The administrator can choose to enable S3 versioning to allow recovery from a corrupted profile.

Application Settings Persistence can be enabled for an existing stack, as long as it is running the latest version of the AppStream 2.0 Agent. Here’s how it is enabled when creating a new stack:

Putting multiple stacks in the same settings group allows them to share a common set of user settings. The settings are applied when the user logs in, and then persisted back to S3 when they log out.

This feature is available now and AppStream 2.0 administrators can enable it today. The only cost is for the S3 storage consumed by the stored profiles, charged at the usual S3 prices.


PS – Follow the AWS Desktop and Application Streaming Blog to make sure that you know about new features as quickly as possible.


Chat with the Alexa Prize Finalists Today

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/chat-with-the-alexa-prize-finalists-today/

The Alexa Prize is an annual competition designed to spur academic research and development in the field of conversational artificial intelligence. This year, students are working to build socialbots that can engage in a fun, high-quality conversation on popular societal topics for up to 20 minutes. In order to succeed at this task, the teams must innovate in a broad range of areas including knowledge acquisition, natural language understanding, natural language generation, context modeling, common-sense reasoning, and dialog planning. They use the Alexa Skills Kit (ASK) to construct their bot and to receive real-time feedback on its performance.

Last month the socialbots from Heriot-Watt University (Alana), Czech Technical University (Alquist), and UC Davis (Gunrock) were chosen as the finalists (watch the Twitch stream to learn more). The competition was tough, with points assigned for the potential scientific contribution to the field, the technical merit of the approach, the overall novelty of the idea, and the team’s ability to deliver on their vision.

Time to Chat
We’re now ready for the final round.

Step up to your nearest Alexa-powered device and say “Alexa, let’s chat!” You will be connected to one of the three socialbots (chosen at random) and can converse with it for as long as you would like. When you are through, say “Alexa stop,” and rate the socialbot when prompted. You can also provide additional feedback for the team. We’ll announce the winner at AWS re:Invent 2018 in Las Vegas.


PS – If you are ready to build your very own Alexa Skill, check out the Alexa Skills Kit Tutorials and subscribe to the Alexa Blogs.


In the Works – Amazon RDS on VMware

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/in-the-works-amazon-rds-on-vmware/

Database administrators spend a lot of time provisioning hardware, installing and patching operating systems and databases, and managing backups. All of this undifferentiated heavy lifting keeps the lights on but often takes time away from higher-level efforts that have a higher return on investment. For many years, Amazon Relational Database Service (RDS) has taken care of this heavy lifting, and simplified the use of MariaDB, Microsoft SQL Server, MySQL, Oracle, and PostgreSQL in the cloud. AWS customers love the high availability, scalability, durability, and management simplicity of RDS.

Earlier this week we announced that we are working to bring the benefits of RDS to on-premises virtualized environments, to hybrid environments, and to VMware Cloud on AWS. You will be able to provision new on-premises database instances in minutes with a couple of clicks, make backups to on-premises or cloud-based storage, and to establish read replicas running on-premises or in the AWS cloud. Amazon RDS on vSphere will take care of OS and database patching, and will let you migrate your on-premises databases to AWS with a single click.

Inside Amazon RDS on VMware
I sat down with the development team to learn more about Amazon RDS on VMware. Here’s a quick summary of what I learned:

Architecture – Your vSphere environment is effectively a private, local AWS Availability Zone (AZ), connected to AWS across a VPN tunnel running over the Internet or a AWS Direct Connect connection. You will be able to create Multi-AZ instances of RDS that span vSphere clusters.

Backups – Backups can make use of local (on-premises storage) or AWS, and are subject to both local and AWS retention policies. Backups are portable, and can be used to create an in-cloud Amazon RDS instance. Point in Time Recovery (PITR) will be supported, as long as you restore to the same environment.

Management – You will be able to manage your Amazon RDS on vSphere instances from the Amazon RDS Console and from vCenter. You will also be able to use the Amazon RDS CLI and the Amazon RDS APIs.

Regions – We’ll be launching in the US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and Europe (Frankfurt) Regions, with more to come over time.

Register for the Preview
If you would like to be among the first to take Amazon RDS on VMware for a spin, you can Register for the Preview. I’ll have more information (and a hands-on blog post) in the near future, so stay tuned!



New – Over-the-Air (OTA) Updates for Amazon FreeRTOS

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-over-the-air-ota-updates-for-amazon-freertos/

Amazon FreeRTOS is an operating system for the microcontrollers that power connected devices such as appliances, fitness trackers, industrial sensors, smart utility meters, security systems, and the like. Designed for use in small, low-powered devices, Amazon FreeRTOS extends the FreeRTOS kernel with libraries for communication with cloud services such as AWS IoT Core and with more powerful edge devices that are running AWS Greengrass (to learn more, read Announcing Amazon FreeRTOS – Enabling Billions of Devices to Securely Benefit from the Cloud).

Unlike more powerful, general-purpose computers that include generous amounts of local memory and storage, and the ability to load and run code on demand, microcontrollers are often driven by firmware that is loaded at the factory and then updated with bug fixes and new features from time to time over the life of the device. While some devices are able to accept updates in the field and while they are running, others must be disconnected, removed from service, and updated manually. This can be disruptive, inconvenient, and expensive, not to mention time-consuming.

As usual, we want to provide a better solution for our customers!

Over-the-Air Updates
Today we are making Amazon FreeRTOS even more useful with the addition of an over-the-air update mechanism that can be used to deliver updates to devices in the field. Here are the most important properties of this new feature:

Security – Updates can be signed by an integrated code signer, streamed to the target device across a TLS-protected connection, and then verified on the target device in order to guard against corrupt, unauthorized, fraudulent updates.

Fault Tolerance – In order to guard against failed updates that can result in a useless, “bricked” device, the update process is resilient and able to handle partial updates from taking effect, leaving the device in an operable state.

Scalability – Device fleets often contain thousands or millions of devices, and can be divided into groups for updating purposes, powered by AWS IoT Device Management.

Frugality – Microcontrollers have limited amounts of RAM (often 128KB or so) and compute power. Amazon FreeRTOS makes the most of these scarce resources by using a single TLS connection for updates and other AWS IoT Core communication, and by using the lightweight MQTT protocol.

Each device must include the OTA Updates Library. This library contains an agent that listens for update jobs and supervises the update process.

OTA in Action
I don’t happen to have a fleet of devices deployed, so I’ll have to limit this post to the highlights and direct you to the OTA Tutorial for more info.

Each update takes the form of an AWS IoT job. A job specifies a list of target devices (things and/or thing groups) and references a job document that describes the operations to be performed on each target. The job document, in turn, points to the code or data to be deployed for the update, and specifies the desired code signing option. Code signing ensures that the deployed content is genuine; you can sign the content yourself ahead of time or request that it be done as part of the job.

Jobs can be run once (a snapshot job), or whenever a change is detected in a target (a continuous job). Continuous jobs can be used to onboard or upgrade new devices as they are added to a thing group.

After the job has been created, AWS IoT will publish an OTA job message via MQTT. The OTA Updates library will download the signed content in streaming fashion, supervise the update, and report status back to AWS IoT.

You can create and manage jobs from the AWS IoT Console, and can also build your own tools using the CLI and the API. I open the Console and click Create a job to get started:

Then I click Create OTA update job:

I select and sign my firmware image:

From there I would select my things or thing groups, initiate the job, and monitor the status:

Again, to learn more, check out the tutorial.

This new feature is available now and you can start using it today.


Amazon DynamoDB – Features to Power Your Enterprise

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/dynamodb-features-to-power-your-enterprise/

I first told you about Amazon DynamoDB in early 2012, and said:

We want you to think big, to dream big dreams, and to envision (and then build) data-intensive applications that can scale from zero users up to tens or hundreds of millions of users before you know it. We want you to succeed, and we don’t want your database to get in the way. Focus on your app and on building a user base, and leave the driving to us.

Six years later, DynamoDB handles trillions of requests per day, and is the NoSQL database of choice for more than 100,000 AWS customers.

Every so often I like to take a look back and summarize some of our most recent launches. I want to make sure that you don’t miss something of importance due to our ever-quickening pace of innovation, and I also like to put the individual releases into a larger context.

For the Enterprise
Many of our recent DynamoDB launches have been driven by the needs of our enterprise customers. For example:

Global Tables – Announced last November, global tables exist in two or more AWS Regions, with fast automated replication across Regions.

Encryption – Announced in February, tables can be encrypted at rest with no overhead.

Point-in-Time Recovery – Announced in March, continuous backups support the ability to restore a table to a prior state with a resolution of one second, going up to 35 days into the past.

DynamoDB Service Level Agreement – Announced in June, the SLA defines availability expectations for DynamoDB tables.

Adaptive Capacity – Though not a new feature, a popular recent blog post explained how DynamoDB automatically adapts to changing access patterns.

Let’s review each of these important features. Even though I have flagged them as being of particular value to enterprises, I am confident that all DynamoDB users will find them valuable.

Global Tables
Even though I try not to play favorites when it comes to services or features, I have to admit that I really like this one. It allows you to create tables that are automatically replicated across two or more AWS Regions, with full support for multi-master writes, all with a couple of clicks. You get an additional level of redundancy (tables are also replicated across three Availability Zones in each region) and fast read/write performance that can scale to meet the needs of the most demanding global apps.

Global tables can be used in nine AWS Regions (we recently added support for three more) and can be set up when you create the table:

To learn more, read Amazon DynamoDB Update – Global Tables and On-Demand Backup.

Our customers store sensitive data in DynamoDB and need to protect it in order to achieve their compliance objectives. The encryption at rest feature protects data stored in tables, local secondary indexes, and global secondary indexes using AES-256. The encryption adds no storage overhead, is completely transparent, and does not affect latency. It can be enabled with one click when you create a new table:

To learn more, read New – Encryption at Rest for DynamoDB.

Point-in-Time Recovery
Even when you take every possible operational precaution, you may still do something regrettable to your production database. When (not if) that happens. you can use the DynamoDB point-in-time recovery feature to turn back time, restoring the database to its state as of up to 35 days earlier. Assuming that you enabled continuous backups for the table, restoration is as simple as choosing the desired point in time:

To learn more, read New – Amazon DynamoDB Continuous Backups and Point-in-Time Recovery (PITR).

Service Level Agreement
If you are building your applications on DynamoDB and relying on it to store your mission-critical data, you need to know what kind of availability to expect. The DynamoDB Service Level Agreement (SLA) promises 99.99% availability for tables in a single region and 99.999% availability for global tables, within a monthly billing cycle. The SLA provides service credits if the availability promise is not met.

Adaptive Capacity
DynamoDB does a lot of work behind the scenes to adapt to varying workloads. For example, as your workload scales and evolves, DynamoDB automatically reshards and dynamically redistributes data between multiple storage partitions in response to changes in read throughput, write throughput, and storage.

Also, DynamoDB uses an adaptive capacity mechanism to address situations where the distribution of data across the storage partitions of a table has become somewhat uneven. This mechanism allows one partition to consume more than its fair share of the overall provisioned capacity for the table for as long as necessary, as long as the overall use of provisioned capacity remains within bounds. With change, the advice that we gave in the past regarding key distribution is not nearly as important.

To learn more about this feature and to see how it can help to compensate for surprising or unusual access patterns to your DynamoDB tables, read How Amazon DynamoDB adaptive capacity accommodates uneven data access patterns.

And There You Go
I hope that you have enjoyed this quick look at some of the most recent enterprise-style features for DynamoDB. We’ve got more on the way, so stay tuned for future updates.


PS – Last week we released a DynamoDB local Docker image that you can use in your containerized development environment and for CI testing.

Amazon Lightsail Update – More Instance Sizes and Price Reductions

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-lightsail-update-more-instance-sizes-and-price-reductions/

Amazon Lightsail gives you access to the power of AWS, with the simplicity of a VPS (Virtual Private Server). You choose a configuration from a menu and launch a virtual machine (an instance) preconfigured with SSD-based storage, DNS management, and a static IP address. You can use Linux or Windows, and can even choose between eleven Linux-powered blueprints that contain ready-to-run copies of popular web, e-commerce, and development tools:

On the Linux/Unix side, you now have six options, including CentOS:

The monthly fee for each instance includes a generous data transfer allocation, giving you the ability to host web sites, blogs, online stores and whatever else you can dream up!

Since the launch of Lightsail in late 2016, we’ve done our best to listen and respond to customer feedback. For example:

October 2017Microsoft Windows – This update let you launch Lightsail instances running Windows Server 2012 R2, Windows Server 2016, and Windows Server 2016 with SQL Server 2016 Express. This allowed you to build, test, and deploy .NET and Windows applications without having to set up or run any infrastructure.

November 2017Load Balancers & Certificate Management – This update gave you the ability to build highly scalable applications that use load balancers to distribute traffic to multiple Lightsail instances. It also gave you access to free SSL/TLS certificates and a simple, integrated tool to request and validate them, along with an automated renewal mechanism.

November 2017Additional Block Storage – This update let you extend your Lightsail instances with additional SSD-backed storage, with the ability to attach up to 15 disks (each holding up to 16 TB) to each instance. The additional storage is automatically replicated and encrypted.

May 2018Additional Regions – This update let you launch Lightsail instances in the Canada (Central), Europe (Paris), and Asia Pacific (Seoul) Regions, bringing the total region count to 13, and giving you lots of geographic flexibility.

So that’s where we started and how we got here! What’s next?

And Now for the Updates
Today we are adding two more instances sizes at the top end of the range and reducing the prices for the existing instances by up to 50%.

Here are the new instance sizes:

16 GB – 16 GB of memory, 4 vCPUs, 320 GB of storage, and 6 TB of data transfer.

32 GB – 32 GB of memory, 8 vCPUs, 640 GB of storage, and 7 TB of data transfer.

Here are the monthly prices (billed hourly) for Lightsail instances running Linux:

512 MB
1 GB 2 GB 4 GB 8 GB 16 GB 32 GB
Old $5.00 $10 $20 $40 $80
New $3.50 $5 $10 $20 $40 $80 $160

And for Lightsail instances running Windows:

512 MB 1 GB 2 GB 4 GB 8 GB 16 GB 32 GB
Old $10 $17 $30 $55 $100
New $8 $12 $20 $40 $70 $120 $240

These reductions are effective as of August 1, 2018 and take place automatically, with no action on your part.

From Our Customers
WordPress power users, developers, entrepreneurs, and people who need a place to host their personal web site are all making great use of Lightsail. The Lightsail team is always thrilled to see customer feedback on social media and shared a couple of recent tweets with me as evidence!

Emil Uzelac (@emiluzelac) is a well-respected member of the WordPress community, especially in the area of WordPress theme development and reviews. When he tried Lightsail he was super impressed with the speed of our instances calling them “by far the fastest I’ve tried”:

As an independent developer and SaaS cofounder, Mike Rogers (@mikerogers0) hasn’t spent a lot of time working with infrastructure. However, when he moved some of his Ruby on Rails projects over to Lightsail, he realized that it was easy (and actually fun) to make the move:

Stephanie Davis (@StephanieMDavis) is a business intelligence developer and honey bee researcher who wanted to find a new home for her writings. She settled on Lightsail, and after it was all up and running she had a “. . . a much, much better grasp of the AWS cloud infrastructure and an economical, slick web host”:

If you have your own Lightsail success story to share, could I ask you to tweet it and hashtag it with #PoweredByLightsail ? I can’t wait to read it!

Some new Lightsail Resources
While I have got your attention, I’d like to share some helpful videos with you!

Deploying a MEAN stack Application on Amazon Lightsail – AWS Developer Advocate Mike Coleman shows you how to deploy a MEAN stack (MongoDB, Express.js, Angular, Node.js) on Lightsail:

Deploying a WordPress Instance on Amazon Lightsail – Mike shows you how to deploy WordPress:

Deploying Docker Containers on Amazon Lightsail – Mike shows you how to use Docker containers:


New T3 Instances – Burstable, Cost-Effective Performance

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-t3-instances-burstable-cost-effective-performance/

We launched the t1.micro instance type in 2010, and followed up with the first of the T2 instances (micro, small, and medium) in 2014, more sizes in 2015 (nano) and 2016 (xlarge and 2xlarge), and unlimited bursting last year.

Today we are launching T3 instances in twelve regions. These general-purpose instances are even more cost-effective than the T2 instances, with On-Demand prices that start at $0.0052 per hour ($3.796 per month). If you have workloads that currently run on M4 or M5 instances but don’t need sustained compute power, consider moving them to T3 instances. You can host the workloads at a very low cost while still having access to sustainable high performance when needed (unlimited bursting is enabled by default, making these instances even easier to use than their predecessors).

As is the case with the T1 and the T2, you get a generous and assured baseline amount of processing power and the ability to transparently scale up to full core performance when you need more processing power, for as long as necessary. The instances are powered by 2.5 GHz Intel® Xeon® Scalable (Skylake) Processors featuring the new Intel® AVX-512 instructions and you can launch them today in seven sizes:

Name vCPUs Baseline Performance / vCPU
Price / Hour
Price / Hour
t3.nano 2 5% 0.5 GiB $0.0052 $0.0980
t3.micro 2 10% 1 GiB $0.0104 $0.0196
t3.small 2 20% 2 GiB $0.0209 $0.0393
t3.medium 2 20% 4 GiB $0.0418 $0.0602
t3.large 2 30% 8 GiB $0.0835 $0.1111
t3.xlarge 4 40% 16 GiB $0.1670 $0.2406
t3.2xlarge 8 40% 32 GiB $0.3341 $0.4813

The column labeled Baseline Performance indicates the percentage of a single hyperthread’s processing power that is allocated to the instance. Instances accumulate credits when idle and consume them when running, with credits stored for up to 7 days. If the average CPU utilization of a T3 instance is lower than the baseline over a 24 hour period, the hourly instance price covers all spikes in usage. If the instance runs at higher CPU utilization for a prolonged period, there will be an additional charge of $0.05 per vCPU-hour. This billing model means that you can choose the T3 instance that has enough memory and vCPUs for your needs, and then count on the bursting to deliver all necessary CPU cycles. This is a unique form of vertical scaling that could very well enable some new types of applications.

T3 instances are powered by the Nitro system. In addition to CPU bursting, they support network and EBS bursting, giving you access to additional throughput when you need it. Network traffic can burst to 5 Gbps for all instance sizes; EBS bursting ranges from 1.5 Gbps to 2.05 Gbps depending on the size of the instance, with corresponding bursts for EBS IOPS.

Like all of our most recent instance types, the T3 instances are HVM only, and must be launched within a Virtual Private Cloud (VPC) using an AMI that includes the Elastic Network Adapter (ENA) driver.

Now Available
The T3 instances are available in the US East (Ohio), US East (N. Virginia), US West (N. California), US West (Oregon), Canada (Central), Europe (Ireland), Europe (London), Europe (Frankfurt), South America (São Paulo), Asia Pacific (Tokyo), Asia Pacific (Singapore), and Asia Pacific (Sydney) Regions.


AWS IoT Device Defender Now Available – Keep Your Connected Devices Safe

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-iot-device-defender-now-available-keep-your-connected-devices-safe/

I was cleaning up my home office over the weekend and happened upon a network map that I created in 1997. Back then my fully wired network connected 5 PCs and two printers. Today, with all of my children grown up and out of the house, we are down to 2 PCs. However, our home mesh network is also host to 2 Raspberry Pis, some phones, a pair of tablets, another pair of TVs, a Nintendo 3DS (thanks, Eric and Ana), 4 or 5 Echo devices, several brands of security cameras, and random gadgets that I buy. I also have a guest network, temporary home to random phones and tablets, and to some of the devices that I don’t fully trust.

This is, of course, a fairly meager collection compared to the typical office or factory, but I want to use it to point out some of the challenges that we all face as IoT devices become increasingly commonplace. I’m not a full-time system administrator. I set strong passwords and apply updates as I become aware of them, but security is always a concern.

New AWS IoT Device Defender
Today I would like to tell you about AWS IoT Device Defender. This new, fully-managed service (first announced at re:Invent) will help to keep your connected devices safe. It audits your device fleet, detects anomalous behavior, and recommends mitigations for any issues that it finds. It allows you to work at scale and in an environment that contains multiple types of devices.

Device Defender audits the configuration of your IoT devices against recommended security best practices. The audits can be run on a schedule on or demand, and perform the following checks:

Imperfect Configurations – The audit looks for expiring and revoked certificates, certificates that are shared by multiple devices, and duplicate client identifiers.

AWS Issues – The audit looks for overly permission IoT policies, Cognito Ids with overly permissive access, and ensures that logging is enabled.

When issues are detected in the course of an audit, notifications can be delivered to the AWS IoT Console, as CloudWatch metrics, or as SNS notifications.

On the detection side, Device Defender looks at network connections, outbound packet and byte counts, destination IP addresses, inbound and outbound message rates, authentication failures, and more. You can set up security profiles, define acceptable behavior, and configure whitelists and blacklists of IP addresses and ports. An agent on each device is responsible for collecting device metrics and sending them to Device Defender. Devices can send metrics at 5 minute to 48 hour intervals.

Using AWS IoT Device Defender
You can access Device Defender’s features from the AWS IoT Console, CLI, or via a full set of APIs. I’ll use the Console, as I usually do, starting at the Defend menu:

The full set of available audit checks is available in Settings (any check that is enabled can be used as part of an audit):

I can see my scheduled audits by clicking Audit and Schedules. Then I can click Create to schedule a new one, or to run one immediately:

I create an audit by selecting the desired set of checks, and then save it for repeated use by clicking Create, or run it immediately:

I can choose the desired recurrence:

I can set desired day for a weekly audit, with similar options for the other recurrence frequencies. I also enter a name for my audit, and click Create (not shown in the screen shot):

I can click Results to see the outcome of past audits:

And I can click any audit to learn more:

Device Defender allows me to create security profiles to describe the expected behavior for devices within a thing group (or for all devices). I click Detect and Security profiles to get started, and can see my profiles. Then I can click Create to make a new one:

I enter a name and a description, and then model the expected behavior. In this case, I expect each device to send and receive less than 100K of network traffic per hour:

I can choose to deliver alerts to an SNS topic (I’ll need to set up an IAM role if I do this):

I can specify a behavior for all of my devices, or for those in specific thing groups:

After setting it all up, I click Save to create my security profile:

Next, I can click Violations to identify things that are in conflict with the behavior that is expected of them. The History tab lets me look back in time and examine past violations:

I can also view a device’s history of violations:

As you can see, Device Defender lets me know what is going on with my IoT devices, raises alarms when something suspicious occurs, and helps me to track down past issues, all from within the AWS Management Console.

Available Now
AWS IoT Device Defender is available today in the US East (N. Virginia), US West (Oregon), US East (Ohio), EU (Ireland), EU (Frankfurt), EU (London), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Sydney), and Asia Pacific (Seoul) Regions and you can start using it today. Pricing for audits is per-device, per-month; pricing for monitored datapoints is per datapoint, both with generous allocations in the AWS Free Tier (see the AWS IoT Device Defender page for more info).


New – Provisioned Throughput for Amazon Elastic File System (EFS)

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-provisioned-throughput-for-amazon-elastic-file-system-efs/

Amazon Elastic File System lets you create petabyte-scale file systems that can be accessed in massively parallel fashion from hundreds or thousands of Amazon Elastic Compute Cloud (EC2) servers and on-premises resources, scaling on demand without disrupting applications. Behind the scenes, storage is distributed across multiple Availability Zones and redundant storage servers in order to provide you with file systems that are scalable, durable, and highly available. Space is allocated and billed on as as-needed basis, allowing you to consume as much as you need while keeping costs proportional to actual usage. Applications can achieve high levels of aggregate throughput and IOPS, with consistent low latencies. Our customers are using EFS for a broad spectrum of use cases including media processing workflows, big data & analytics jobs, code repositories, build trees, and content management repositories, taking advantage of the ability to simply lift-and-shift their existing file-based applications and workflows to the cloud.

A Quick Review
As you may already know, EFS lets you choose one of two performance modes each time you create a file system:

General Purpose – This is the default mode, and the one that you should start with. It is perfect for use cases that are sensitive to latency, and supports up to 7,000 operations per second per file system.

Max I/O – This mode scales to higher levels of aggregate throughput and performance, with slightly higher latency. It does not have an intrinsic limit on operations per second.

With either mode, throughput scales with the size of the file system, and can also burst to higher levels as needed. The size of the file system determines the upper bound on how high and how long you can burst. For example:

1 TiB – A 1 TiB file system can deliver 50 MiB/second continuously, and burst to 100 MiB/second for up to 12 hours each day.

10 TiB – A 10 TiB file system can deliver 500 MiB/second continuously, and burst up to 1 GiB/second for up to 12 hours each day.

EFS uses a credit system that allows you to “save up” throughput during quiet times and then use it during peak times. You can read about Amazon EFS Performance to learn more about the credit system and the two performance modes.

New Provisioned Throughput
Web server content farms, build trees, and EDA simulations (to name a few) can benefit from high throughput to a set of files that do not occupy a whole lot of space. In order to address this usage pattern, you now have the option to provision any desired level of throughput (up to 1 GiB/second) for each of your EFS file systems. You can set an initial value when you create the file system, and then increase it as often as you’d like. You can dial it back down every 24 hours, and you can also switch between provisioned throughput and bursting throughput on the same cycle. You can for example, configure an EFS file system to provide 50 MiB/second of throughput for your web server, even if the volume contains a relatively small amount of content.

If your application has throughput requirements that exceed what is possible using the default (bursting) model, the provisioned model is for you! You can achieve the desired level of throughput as soon as you create the file system, regardless of how much or how little storage you consume.

Here’s how I set this up:

Using provisioned throughput means that I will be billed separately for storage (in GiB/month units) and for provisioned throughput (in MiB/second-month units).

I can monitor average throughput by using a CloudWatch metric math expression. The Amazon EFS Monitoring Tutorial contains all of the formulas, along with CloudFormation templates that I can use to set up a complete CloudWatch Dashboard in a matter of minutes:

I select the desired template, enter the Id of my EFS file system, and click through the remaining screens to create my dashboard:

The template creates an IAM role, a Lambda function, a CloudWatch Events rule, and the dashboard:

The dashboard is available within the CloudWatch Console:

Here’s the dashboard for my test file system:

To learn more about how to use the EFS performance mode that is the best fit for your application, read Amazon Elastic File System – Choosing Between the Different Throughput & Performance Modes.

Available Now
This feature is available now and you can start using in today in all AWS regions where EFS is available.



Now Available: R5, R5d, and z1d Instances

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-available-r5-r5d-and-z1d-instances/

Just last week I told you about our plans to launch EC2 Instances with Faster Processors and More Memory. Today I am happy to report that the R5, R5d, and z1d instances are available now and you can start using them today. Let’s take a look at each one!

R5 Instances
The memory-optimized R5 instances use custom Intel® Xeon® Platinum 8000 Series (Skylake-SP) processors running at up to 3.1 GHz, powered by sustained all-core Turbo Boost. They are perfect for distributed in-memory caches, in-memory analytics, and big data analytics, and are available in six sizes:

Instance Name vCPUs Memory EBS-Optimized Bandwidth Network Bandwidth
r5.large 2 16 GiB Up to 3.5 Gbps Up to 10 Gbps
r5.xlarge 4 32 GiB Up to 3.5 Gbps Up to 10 Gbps
r5.2xlarge 8 64 GiB Up to 3.5 Gbps Up to 10 Gbps
r5.4xlarge 16 128 GiB 3.5 Gbps Up to 10 Gbps
r5.12xlarge 48 384 GiB 7.0 Gbps 10 Gbps
r5.24xlarge 96 768 GiB 14.0 Gbps 25 Gbps

You can launch R5 instances today in the US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland) Regions. To learn more, read about R5 and R5d Instances.

R5d Instances
The R5d instances share their specs with the R5 instances, and also include up to to 3.6 TB of local NVMe storage. Here are the sizes:

Instance Name vCPUs Memory Local Storage EBS-Optimized Bandwidth Network Bandwidth
r5d.large 2 16 GiB 1 x 75 GB NVMe SSD Up to 3.5 Gbps Up to 10 Gbps
r5d.xlarge 4 32 GiB 1 x 150 GB NVMe SSD Up to 3.5 Gbps Up to 10 Gbps
r5d.2xlarge 8 64 GiB 1 x 300 GB NVMe SSD Up to 3.5 Gbps Up to 10 Gbps
r5d.4xlarge 16 128 GiB 2 x 300 GB NVMe SSD 3.5 Gbps Up to 10 Gbps
r5d.12xlarge 48 384 GiB 2 x 900 GB NVMe SSD 7.0 Gbps 10 Gbps
r5d.24xlarge 96 768 GiB 4 x 900 GB NVMe SSD 14.0 Gbps 25 Gbps

The R5d instances are available in the US East (N. Virginia), US East (Ohio), and US West (Oregon) Regions. To learn more, read about R5 and R5d Instances.

z1d Instances
The high frequency z1d instances use custom Intel® Xeon® Scalable Processors running at up to 4.0 GHz, powered by sustained all-core Turbo Boost, perfect for Electronic Design Automation (EDA), financial simulation, relational database, and gaming workloads that can benefit from extremely high per-core performance. They are available in six sizes:

Instance Name vCPUs Memory Local Storage EBS-Optimized Bandwidth Network Bandwidth
z1d.large 2 16 GiB 1 x 75 GB NVMe SSD Up to 2.333 Gbps Up to 10 Gbps
z1d.xlarge 4 32 GiB 1 x 150 GB NVMe SSD Up to 2.333 Gbps Up to 10 Gbps
z1d.2xlarge 8 64 GiB 1 x 300 GB NVMe SSD 2.333 Gbps Up to 10 Gbps
z1d.3xlarge 12 96 GiB 1 x 450 GB NVMe SSD 3.5 Gbps Up to 10 Gbps
z1d.6xlarge 24 192 GiB 1 x 900 GB NVMe SSD 7.0 Gbps 10 Gbps
z1d.12xlarge 48 384 GiB 2 x 900 GB NVMe SSD 14.0 Gbps 25 Gbps

The fast cores allow you to run your existing jobs to completion more quickly than ever before, giving you the ability to fine-tune your use of databases and EDA tools that are licensed on a per-core basis.

You can launch z1d instances today in the US East (N. Virginia), US West (Oregon), US West (N. California), EU (Ireland), Asia Pacific (Tokyo), and Asia Pacific (Singapore) Regions. To learn more, read about z1d Instances.

In the Works
You will be able to create Amazon Relational Database Service (RDS), Amazon ElastiCache, and Amazon Aurora instances that make use of the R5 in the next two or three months.

The r5.metal, r5d.metal, and z1d.metal instances are on the near-term roadmap and I’ll let you know when they are available.



Amazon Polly Update – Time-Driven Prosody and Asynchronous Synthesis

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-polly-update-time-driven-prosody-and-asynchronous-synthesis/

I hope that you are enjoying the Polly-powered audio that is available for the newest posts on this blog, including the DeepLens Challenge and the Storage Gateway Recap. As part of my blogging process, I now listen to the synthesized speech for my draft blog posts in order to get a better sense for how they flow.

Today we are launching two new features for Amazon Polly:

Time-Driven Prosody – You can now specify the desired duration for the synthesized speech that corresponds to part or all of the input text.

Asynchronous Synthesis – You can now process large blocks of text and store the synthesized speech in Amazon S3 with a single call.

Both of these features are available now and you can start using them today. Let’s take a closer look!

Time-Driven Prosody
Imagine that you are creating a multi-lingual version of a video or a self-running presentation. You write the script, record the video in one language, and then use Amazon Translate and Amazon Polly to create audio tracks in other languages. In order to keep each language in sync with the visual content, you need to exercise fine-grained control over the duration of each segment. That’s where this new feature comes in. You can now specify the maximum desired duration for any desired segments, counting on Polly to adjust the speech rate in order to limit the length of each segment.

The preceding paragraph generates 19 seconds of audio if I use Amazon Polly’s Joanna voice with no other options:

  In order to keep each language in sync with the visual content, 
  you need to exercise fine-grained control over the duration of
  each segment. That's where this new feature comes in. You can 
  now specify the maximum desired duration for any desired segments, 
  counting on Polly to adjust the speech rate in order to limit 
  the length of each segment.

I can use a <prosody> tag to limit the length to 15 seconds:

  <prosody amazon:max-duration="15s">
    In order to keep each language in sync with the visual content, 
    you need to exercise fine-grained control over the duration of
    each segment. That's where this new feature comes in. You can 
    now specify the maximum desired duration for any desired segments, 
    counting on Polly to adjust the speech rate in order to limit 
    the length of each segment.

I can control the duration at a more fine-grained level by using multiple <prosody> tags:

  <prosody amazon:max-duration="10s">
    In order to keep each language in sync with the visual content, 
    you need to exercise fine-grained control over the duration of
    each segment. 
  <prosody amazon:max-duration="7s">
    That's where this new feature comes in. You can now specify 
    the maximum desired duration for any desired segments, 
    counting on Polly to adjust the speech rate in order to limit 
    the length of each segment.

The Spanish equivalent (courtesy of Amazon Translate) of my English text is somewhat longer and the speed-up is apparent:

  <prosody amazon:max-duration="15s">
    Para mantener cada idioma sincronizado con el contenido
    visual, es necesario ejercer un control detallado sobre
    la duración de cada segmento. Ahí es donde entra esta 
    nueva característica. Ahora puede especificar la 
    duración máxima deseada para los segmentos deseados, 
    contando con que Polly ajuste la velocidad de voz para 
    limitar la longitud de cada segmento.

The text inside of each time-limited <prosody> tag is limited to 1500 characters and nesting is not allowed (the inner tag will be ignored). In order to ensure that the audio remains comprehensible, Polly will speed up the audio by a maximum of 5x.

Asynchronous Synthesis
This feature makes it easier for you to use Polly to generate speech for long-form content such as articles or book chapters by allowing you to process up to 100,000 characters of text at a time using asynchronous requests. The synthesized speech is delivered to the S3 bucket of your choice, with failure notifications optionally routed to the Amazon Simple Notification Service (SNS) topic of your choice. The generated audio can be up to 6 hours long, and is typically ready within minutes. In addition to 100,000 characters of text, each request can include an additional 100,000 characters of Speech Synthesis Markup Language (SSML) markup.

Each asynchronous request creates a new speech synthesis task. You can initiate and manage tasks from the Polly Console, CLI (start-speech-synthesis-task), or API (StartSpeechSynthesisTask).

To test this feature I created a plain-text version of my thoroughly obsolete AWS book and inserted some SSML tags, turning it in to valid XML along the way. Then I open the Polly Console, click Text-to-Speech, paste the XML, and click Synthesize to S3:

I enter the name of my S3 bucket (which must be in region where I plan to create the task), and click Synthesize to proceed:

My task is created:

And I can see it in the list of tasks:

I receive an email when the synthesis is complete:

And the file is in my bucket as expected:

I did not spend a lot of time on the markup, but the results are impressive:

Interestingly enough, most of that chapter is still relevant. The rest of the book has been overtaken by history, and is best left there! Perhaps I’ll write another one sometime.

Anyway, as you can see (and hear) the asynchronous speech synthesis is powerful and easy to use. Give it a shot, build something cool, and tell me about it.




Amazon EC2 Instance Update – Faster Processors and More Memory

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-ec2-instance-update-faster-processors-and-more-memory/

Last month I told you about the Nitro system and explained how it will allow us to broaden the selection of EC2 instances and to pick up the pace as we do so, with an ever-broadening selection of compute, storage, memory, and networking options. This will allow us to give you access to the latest technology very quickly, giving you the ability to choose the instance type that is the best match for your applications.

Today, I would like to tell you about three new instance types that are in the works and that will be available soon:

Z1d – Compute-intensive instances running at up to 4.0 GHz, powered by sustained all-core Turbo Boost. They are ideal for Electronic Design Automation (EDA) and relational database workloads, and are also a great fit for several kinds of HPC workloads.

R5 – Memory-optimized instances running at up to 3.1 GHz powered by sustained all-core Turbo Boost, with up to 50% more vCPUs and 60% more memory than R4 instances.

R5d – Memory-optimized instances equipped with local NVMe storage (up to 3.6 TB for the largest R5d instance), and will be available in the same sizes and with the same specs as the R5 instances.

We are also planning to launch R5 Bare Metal, R5d Bare Metal, and Z1d Bare Metal instances. As is the case with the existing i3.metal instances, you will be able to access low-level hardware features and to run applications that are not licensed or supported in virtualized environments.

Z1d Instances
The Z1d instances are designed for applications that can benefit from extremely high per-core performance. These include:

Electronic Design Automation – As chips become smaller and denser, the amount of compute power needed to design and verify the chips increases non-linearly. Semiconductor customers deploy jobs that span thousands of cores; having access to faster cores reduces turnaround time for each job and can also lead to a measurable reduction in software licensing costs.

HPC – In the financial services world, jobs that run analyses or compute risks also benefit from faster cores. Manufacturing organizations can run their Finite Element Analysis (FEA) and simulation jobs to completion more quickly.

Relational Database – CPU-bound workloads that run on a database that “features” high per-core license fees will enjoy both cost and performance benefits.

Z1d instances use custom Intel® Xeon® Scalable Processors running at up to 4.0 GHz, powered by sustained all-core Turbo Boost. They will be available in 6 sizes, with up to 48 vCPUs, 384 GiB of memory, and 1.8 TB of local NVMe storage. On the network side, they feature ENA networking that will deliver up to 25 Gbps of bandwidth, and are EBS-Optimized by default for up to 14 Gbps of bandwidth. As usual, you can launch them in a Cluster Placement Group to increase throughput and reduce latency. Here are the sizes and specs:

Instance Name vCPUs Memory Local Storage EBS-Optimized Bandwidth Network Bandwidth
z1d.large 2 16 GiB 1 x 75 GB NVMe SSD Up to 2.333 Gbps Up to 10 Gbps
z1d.xlarge 4 32 GiB 1 x 150 GB NVMe SSD Up to 2.333 Gbps Up to 10 Gbps
z1d.2xlarge 8 64 GiB 1 x 300 GB NVMe SSD 2.333 Gbps Up to 10 Gbps
z1d.3xlarge 12 96 GiB 1 x 450 GB NVMe SSD 3.5 Gbps Up to 10 Gbps
z1d.6xlarge 24 192 GiB 1 x 900 GB NVMe SSD 7.0 Gbps 10 Gbps
z1d.12xlarge 48 384 GiB 2 x 900 GB NVMe SSD 14.0 Gbps 25 Gbps

The instances are HVM and VPC-only, and you will need to use an AMI with the appropriate ENA and NVMe drivers. Any AMI that runs on C5 or M5 instances will also run on Z1d instances.

R5 Instances
Building on the earlier generations of memory-intensive instance types (M2, CR1, R3, and R4), the R5 instances are designed to support high-performance databases, distributed in-memory caches, in-memory analytics, and big data analytics. They use custom Intel® Xeon® Platinum 8000 Series (Skylake-SP) processors running at up to 3.1 GHz, again powered by sustained all-core Turbo Boost. The instances will be available in 6 sizes, with up to 96 vCPUs and 768 GiB of memory. Like the Z1d instances, they feature ENA networking and are EBS-Optimized by default, and can be launched in Placement Groups. Here are the sizes and specs:

Instance Name vCPUs Memory EBS-Optimized Bandwidth Network Bandwidth
r5.large 2 16 GiB Up to 3.5 Gbps Up to 10 Gbps
r5.xlarge 4 32 GiB Up to 3.5 Gbps Up to 10 Gbps
r5.2xlarge 8 64 GiB Up to 3.5 Gbps Up to 10 Gbps
r5.4xlarge 16 128 GiB 3.5 Gbps Up to 10 Gbps
r5.12xlarge 48 384 GiB 7.0 Gbps 10 Gbps
r5.24xlarge 96 768 GiB 14.0 Gbps 25 Gbps

Once again, the instances are HVM and VPC-only, and you will need to use an AMI with the appropriate ENA and NVMe drivers.

Learn More
The new EC2 instances announced today highlight our plan to continue innovating in order to better meet your needs! I’ll share additional information as soon as it is available.




New – EC2 Compute Instances for AWS Snowball Edge

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-ec2-compute-instances-for-aws-snowball-edge/

I love factories and never miss an opportunity to take a tour. Over the years, I have been lucky enough to watch as raw materials and sub-assemblies are turned into cars, locomotives, memory chips, articulated buses, and more. I’m always impressed by the speed, precision, repeat-ability, and the desire to automate every possible step. On one recent tour, the IT manager told me that he wanted to be able to set up and centrally manage the global collection of on-premises industrialized PCs that monitor their machinery as easily and as efficiently as he does their EC2 instances and other cloud resources.

Today we are making that manager’s dream a reality, with the introduction of EC2 instances that run on AWS Snowball Edge devices! These ruggedized devices, with 100 TB of local storage, can be used to collect and process data in hostile environments with limited or non-existent Internet connections before shipping the processed data back to AWS for storage, aggregation, and detailed analysis. Here are the instance specs:

Instance Name vCPUs Memory
sbe1.small 1 1 GiB
sbe1.medium 1 2 GiB
sbe1.large 2 4 GiB
sbe1.xlarge 4 8 GiB
sbe1.2xlarge 8 16 GiB
sbe1.4xlarge 16 32 GiB

Each Snowball Edge device is powered by an Intel® Xeon® D processor running at 1.8 GHz, and supports any combination instances that consume up to 24 vCPUs and 32 GiB of memory. You can build and test AMIs (Amazon Machine Images) in the cloud and then preload them onto the device as part of the ordering process (I’ll show you how in just a minute). You can use the EC2-compatible endpoint exposed by each device to programmatically start, stop, resume, and terminate instances. This allows you to use the existing CLI commands and to build tools and scripts to manage fleets of devices. It also allows you to take advantage of your existing EC2 skills and knowledge, and to put them to good use in a new environment.

There are three main setup steps:

  1. Creating a suitable AMI.
  2. Ordering a Snowball Edge Device.
  3. Connecting and Configuring the Device.

Let’s take an in-depth look at the first two steps. Time was tight and I didn’t have time to get hands-on experience with an actual device, so the third step will have to wait for another time.

Creating a Suitable AMI
I have the ability to choose up to 10 AMIs that will be preloaded onto the device. The AMIs must be owned by my AWS account, and must be based on one of the following Marketplace AMIs:

These AMIs have been tested for use on Snowball Edge devices and can be used as a starting point for customization. We will be adding additional options over time, so let us know what you need.

I decided to start with the newest Ubuntu AMI, and launch it on an M5 instance, taking care to specify the SSH keypair that I will eventually use to connect to the instance from my terminal client:

After my instance launches, I connect to it, customize it as desired for use on my device, and then return to the EC2 Console to create an AMI. I select the running instance, choose Create Image from the Actions menu, specify the details, and click Create Image:

The size of the root volume will determine how much of the device’s SSD storage is allocated to the instance when it launches. A total of one TB of space is available for all running instances, so keep your local file storage needs in mind as your analyze your use case and set up your AMIs. Also, Snowball Edge devices cannot make use of additional EBS volumes, so don’t bother including them in your AMI. My AMI is ready within minutes (To learn more about how to create AMIs, read Creating an Amazon EBS-Backed Linux AMI):

Now I am ready to order my first device!

Ordering a Snowball Edge Device
The ordering procedure lets me designate a shipping address and specify how I would like my Snowball device to be configured. I open the AWS Snowball Console and click Create job:

I specify the job type (they all support EC2 compute instances):

Then I select my shipping address, entering a new one if necessary (come and visit me):

Next, I define my job. I give it a name (SJ1), select the 100 TB device, and pick the S3 bucket that will receive data when the device is returned to AWS:

Now comes the fun part! I click Enable compute with EC2 and select the AMIs to be loaded on the Snowball Edge:

I click Add an AMI and find the one that I created earlier:

I can add up to ten AMIs to my job, but will stop at one for this post:

Next, I set up my IAM role and configure encryption:

Then I configure the optional SNS notifications. I can choose to receive notification for a wide variety of job status values:

My job is almost ready! I review the settings and click Create job to create it:

Connecting and Configuring the Device
After I create the job, I wait until my Snowball Edge device arrives. I connect it to my network, power it on, and then unlock it using my manifest and device code, as detailed in Unlock the Snowball Edge. Then I configure my EC2 CLI to use the EC2 endpoint on the device and launch an instance. Since I configured my AMI for SSH access, I can connect to it as if it were an EC2 instance in the cloud.

Things to Know
Here are a couple of things to keep in mind:

Long-Term Usage – You can keep the Snowball Edge devices on your premises and hard at work for as long as you would like. You’ll be billed for a one-time setup fee for each job; after 10 days you will pay an additional, per-day fee for each device. If you want to keep a device for an extended period of time, you can also pay upfront as part of a one or three year commitment.

Dev/Test – You should be able to do much of your development and testing on an EC2 instance running in the cloud; some of our early users are working in this way as part of a “Digital Twin” strategy.

S3 Access – Each Snowball Edge device includes an S3-compatible endpoint that you can access from your on-device code. You can also make use of existing S3 tools and applications.

Now Available
You can start ordering devices today and make use of this exciting new AWS feature right away.




New – Lifecycle Management for Amazon EBS Snapshots

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-lifecycle-management-for-amazon-ebs-snapshots/

It is always interesting to zoom in on the history of a single AWS service or feature and watch how it has evolved over time in response to customer feedback. For example, Amazon Elastic Block Store (EBS) launched a decade ago and has been gaining more features and functionality every since. Here are a few of the most significant announcements:

Several of the items that I chose to highlight above make EBS snapshots more useful and more flexible. As you may already know, it is easy to create snapshots. Each snapshot is a point-in-time copy of the blocks that have changed since the previous snapshot, with automatic management to ensure that only the data unique to a snapshot is removed when it is deleted. This incremental model reduces your costs and minimizes the time needed to create a snapshot.

Because snapshots are so easy to create and use, our customers create a lot of them, and make great use of tags to categorize, organize, and manage them. Going back to my list, you can see that we have added multiple tagging features over the years.

Lifecycle Management – The Amazon Data Lifecycle Manager
We want to make it even easier for you to create, use, and benefit from EBS snapshots! Today we are launching Amazon Data Lifecycle Manager to automate the creation, retention, and deletion of Amazon EBS volume snapshots. Instead of creating snapshots manually and deleting them in the same way (or building a tool to do it for you), you simply create a policy, indicating (via tags) which volumes are to be snapshotted, set a retention model, fill in a few other details, and let Data Lifecycle Manager do the rest. Data Lifecycle Manager is powered by tags, so you should start by setting up a clear and comprehensive tagging model for your organization (refer to the links above to learn more).

It turns out that many of our customers have invested in tools to automate the creation of snapshots, but have skimped on the retention and deletion. Sooner or later they receive a surprisingly large AWS bill and find that their scripts are not working as expected. The Data Lifecycle Manager should help them to save money and to be able to rest assured that their snapshots are being managed as expected.

Creating and Using a Lifecycle Policy
Data Lifecycle Manager uses lifecycle policies to figure out when to run, which volumes to snapshot, and how long to keep the snapshots around. You can create the policies in the AWS Management Console, from the AWS Command Line Interface (CLI) or via the Data Lifecycle Manager APIs; I’ll use the Console today. Here are my EBS volumes, all suitably tagged with a department:

I access the Lifecycle Manager from the Elastic Block Store section of the menu:

Then I click Create Snapshot Lifecycle Policy to proceed:

Then I create my first policy:

I use tags to specify the volumes that the policy applies to. If I specify multiple tags, then the policy applies to volumes that have any of the tags:

I can create snapshots at 12 or 24 hour intervals, and I can specify the desired snapshot time. Snapshot creation will start no more than an hour after this time, with completion based on the size of the volume and the degree of change since the last snapshot.

I can use the built-in default IAM role or I can create one of my own. If I use my own role, I need to enable the EC2 snapshot operations and all of the DLM (Data Lifecycle Manager) operations; read the docs to learn more.

Newly created snapshots will be tagged with the aws:dlm:lifecycle-policy-id and  aws:dlm:lifecycle-schedule-name automatically; I can also specify up to 50 additional key/value pairs for each policy:

I can see all of my policies at a glance:

I took a short break and came back to find that the first set of snapshots had been created, as expected (I configured the console to show the two tags created on the snapshots):

Things to Know
Here are a couple of things to keep in mind when you start to use Data Lifecycle Manager to automate your snapshot management:

Data Consistency – Snapshots will contain the data from all completed I/O operations, also known as crash consistent.

Pricing – You can create and use Data Lifecyle Manager policies at no charge; you pay the usual storage charges for the EBS snapshots that it creates.

Availability – Data Lifecycle Manager is available in the US East (N. Virginia), US West (Oregon), and EU (Ireland) Regions.

Tags and Policies – If a volume has more than one tag and the tags match multiple policies, each policy will create a separate snapshot and both policies will govern the retention. No two policies can specify the same key/value pair for a tag.

Programmatic Access – You can create and manage policies programmatically! Take a look at the CreateLifecyclePolicy, GetLifecyclePolicies, and UpdateLifeCyclePolicy functions to get started. You can also write an AWS Lambda function in response to the createSnapshot event.

Error Handling – Data Lifecycle Manager generates a “DLM Policy State Change” event if a policy enters the error state.

In the Works – As you might have guessed from the name, we plan to add support for additional AWS data sources over time. We also plan to support policies that will let you do weekly and monthly snapshots, and also expect to give you additional scheduling flexibility.


AWS Storage Gateway Recap – SMB Support, RefreshCache Event, and More

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-storage-gateway-recap-smb-support-refreshcache-event-and-more/

To borrow my own words, the AWS Storage Gateway is a service that includes a multi-protocol storage appliance that fits in between your existing application and the AWS Cloud. Your applications see the gateway as a file system, a local disk volume, or a Virtual Tape Library, depending on how it was configured.

Today I would like to share a few recent updates to the File Gateway configuration of the Storage Gateway, and also show you how they come together to enable some new processing models. First, the most recent updates:

SMB Support – The File Gateway already supports access from clients that speak NFS (versions 3 and 4.1 are supported). Last month we added support for the Server Message Block (SMB) protocol. This allows Windows applications that communicate using v2 or v3 of SMB to store files as objects in S3 through the gateway, enabling hybrid cloud use cases such as backup, content distribution, and processing of machine learning and big data workloads. You can control access to the gateway using your existing on-premises Active Directory (AD) domain or a cloud-based domain hosted in AWS Directory Service, or you can use authenticated guest access. To learn more about this update, read AWS Storage Gateway Adds SMB Support to Store and Access Objects in Amazon S3 Buckets.

Cross-Account Permissions – Some of our customers run their gateways in one AWS account and configure them to upload to an S3 bucket owned by another account. This allows them to track departmental storage and retrieval costs using chargeback and showback models. In order to simplify this important use case, you can configure the gateway to provide the bucket owner with full permissions. This avoids a pain point which could arise if the bucket owner was unable to see the objects. To learn how to set this up, read Using a File Share for Cross-Account Access.

Requester Pays – Bucket owners are responsible for storage costs. Owners pay for data transfer costs by default, but also have the option to have the requester pay. To support this use case, the File Gateway now supports S3’s Requester Pays Buckets. Data collectors and aggregators can use this feature to share data with research organizations such as universities and labs without incurring the costs of access themselves. File Gateway provides file based access to the S3 objects, caches recently accessed data locally, helping requesters reduce latency and costs. To learn more, read about Creating an NFS File Share and Creating an SMB File Share.

File Upload Notification – The gateway caches files locally, and uploads them to a designated S3 bucket in the background. Late last year we gave you the ability to request notification (in the form of a CloudWatch Event) when new files have been uploaded. You can use this to initiate cloud-based processing or to implement advanced logging. To learn more, read Getting File Upload Notification and study the NotifyWhenUploaded function.

Cache Refresh Event – You have long had the ability to use the RefreshCache function to make sure that the gateway is aware of objects that have been added, removed, or replaced in the bucket. The new Storage Gateway Cache Refresh Event lets you know that the cache is now in sync with S3, and can be used as a signal to initiate local processing. To learn more, read Getting Refresh Cache Notification.

Hybrid Processing Using File Gateway
You can use the File Upload Notification and Cache Refresh to automate some of your routine hybrid process tasks!

Let’s say that you run a geographically distributed office or retail business, with locations all over the world. Raw data (metrics, cash register receipts, or time sheets) is collected at each location, and then uploaded to S3 using a File Gateway hosted at each location. As the data arrives, you use the File Upload Notifications to process each S3 object, perhaps using an AWS Lambda function that invokes Amazon Athena to run a stock set of queries against each one. The data arrives over the course of a couple of hours, and results accumulate in another bucket. At the end of the reporting period, the intermediate results are processed, custom reports are generated for each branch location, and then stored in another bucket (this bucket, as it turns out, is also associated with a gateway, and each gateway will have cached copies of the prior versions of the reports). After you generate your reports, you can refresh each of the gateway caches, wait for the corresponding notifications, and then send an email to the branch managers to tell them that their new report is available.

Here’s a video (and presentation) with more information about this processing model:

Now Available
All of the features listed above are available now and you can start using them today in all regions where Storage Gateway is available.