All posts by Jeff Barr

And that’s a wrap!

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/and-thats-a-wrap/

After 20 years, and 3283 posts adding up to 1,577,106 words I am wrapping up my time as the lead blogger on the AWS News Blog.

It has been a privilege to be able to “live in the future” and to get to learn and write about so many of our innovations over the last two decades: message queuing, storage, on-demand computing, serverless, and quantum computing to name just a few and to leave many others out. It has also been a privilege to be able to meet and to hear from so many of you that have faithfully read and (hopefully) learned from my content over the years. I treasure those interactions and your kind words, and I keep both in mind when I write.

Next for Jeff
I began my career as a builder. Over the years I have written tens of thousands of lines of assembly code (6502, Z80, and 68000), Visual Basic, and PHP, along with hundreds of thousands of lines of C. However, over the years I’ve progressively spent less time building and more time talking about building. As each new service and feature whizzed past my eyes I would reminiscence about days and decades past, when I could actually use these goodies to create something cool. I went from being a developer who could market, to a marketer who used to be able to develop. There’s absolutely nothing wrong with that, but I like to build. The medium could be code, 3D printing, LEGO bricks, electronics components, or even cardboard –creating and innovating is what motivates and sustains me.

With that as my driving force, my goal for the next step of my career is to invest more time focused on learning and using fewer things, building cool stuff, and creating fresh, developer-focused content as I do so. I’m still working to figure out the form that this will take, so stay tuned. I am also going to continue to make my weekly appearances at AWS OnAir (our Friday Twitch show), and I will continue to speak at AWS community events around the globe.

Next for the Blog
As for the AWS News Blog, it has long been backed by an awesome team, both visible and invisible. Here we are at the recent AWS re:Invent celebration of the blog’s 20th anniversary (photo courtesy of Liz Fuentes with edits by Channy Yun to add those who were otherwise occupied):

During the celebration I told the team that I look forward to celebrating the 30 year anniversary with them at re:Invent 2034.

Going forward, the team will continue to grow and the goal remains the same: to provide our customers with carefully chosen, high-quality information about the latest and most meaningful AWS launches. The blog is in great hands and this team will continue to keep you informed even as the AWS pace of innovation continues to accelerate.

Thanks Again
Once again I need to thank all of you for the very kind words and gestures over the years. Once in your life, if you work hard and get really lucky, you get a unique opportunity to do something that really and truly matters to people. And I have been lucky.

Jeff;

Now Available – Second-Generation FPGA-Powered Amazon EC2 instances (F2)

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-available-second-generation-fpga-powered-amazon-ec2-instances-f2/

Equipped with up to eight AMD Field-Programmable Gate Arrays (FPGAs), AMD EPYC (Milan) processors with up to 192 cores, High Bandwidth Memory (HBM), up to 8 TiB of SSD-based instance storage, and up to 2 TiB of memory, the new F2 instances are available in two sizes, and are ready to accelerate your genomics, multimedia processing, big data, satellite communication, networking, silicon simulation, and live video workloads.

A Quick FPGA Recap
Here’s how I explained the FPGA model when we previewed the first generation of FPGA-powered Amazon Elastic Compute Cloud (Amazon EC2) instances

One of the more interesting routes to a custom, hardware-based solution is known as a Field Programmable Gate Array, or FPGA. In contrast to a purpose-built chip which is designed with a single function in mind and then hard-wired to implement it, an FPGA is more flexible. It can be programmed in the field, after it has been plugged in to a socket on a PC board. Each FPGA includes a fixed, finite number of simple logic gates. Programming an FPGA is “simply” a matter of connecting them up to create the desired logical functions (AND, OR, XOR, and so forth) or storage elements (flip-flops and shift registers). Unlike a CPU which is essentially serial (with a few parallel elements) and has fixed-size instructions and data paths (typically 32 or 64 bit), the FPGA can be programmed to perform many operations in parallel, and the operations themselves can be of almost any width, large or small.

Since that launch, AWS customers have used F1 instances to host many different types of applications and services. With a newer FPGA, more processing power, and more memory bandwidth, the new F2 instances are an even better host for highly parallelizable, compute-intensive workloads.

Each of the AMD Virtex UltraScale+ HBM VU47P FPGAs has 2.85 million system logic cells and 9,024 DSP slices (up to 28 TOPS of DSP compute performance when processing INT8 values). The FPGA Accelerator Card associated with each F2 instance provides 16 GiB of High Bandwidth Memory and 64 GiB of DDR4 memory per FPGA.

Inside the F2
F2 instances are powered by 3rd generation AMD EPYC (Milan) processors. In comparison to F1 instances, they offer up to 3x as many processor cores, up to twice as much system memory and NVMe storage, and up to 4x the network bandwidth. Each FPGA comes with 16 GiB High Bandwidth Memory (HBM) with up to 460 GiB/s bandwidth. Here are the instance sizes and specs:

Instance Name vCPUs
FPGAs
FPGA Memory
HBM / DDR4
Instance Memory
NVMe Storage
EBS Bandwidth
Network Bandwidth
f2.12xlarge 48 2 32 GiB /
128 GiB
512 GiB 1900 GiB
(2x 950 GiB)
15 Gbps 25 Gbps
f2.48xlarge 192 8 128 GiB /
512 GiB
2,048 GiB 7600 GiB
(8x 950 GiB)
60 Gbps 100 Gbps

The high-end f2.48xlarge instance supports the AWS Cloud Digital Interface (CDI) to reliably transport uncompressed live video between applications, with instance-to-instance latency as low as 8 milliseconds.

Building FPGA Applications
The AWS EC2 FPGA Development Kit contains the tools that you will use to develop, simulate, debug, compile, and run your hardware-accelerated FPGA applications. You can launch the kit’s FPGA Developer AMI on a memory-optimized or compute-optimized instance for development and simulation, then use an F2 instance for final debugging and testing.

The tools included in the developer kit support a variety of development paradigms, tools, accelerator languages, and debugging options. Regardless of your choice, you will ultimately create an Amazon FPGA Image (AFI) which contains your custom acceleration logic and the AWS Shell which implements access to the FPGA memory, PCIe bus, interrupts, and external peripherals. You can deploy AFIs to as many F2 instances as desired, share with other AWS accounts or publish on AWS Marketplace.

If you have already created an application that runs on F1 instances, you will need to update your development environment to use the latest AMD tools, then rebuild and validate before upgrading to F2 instances.

FPGA Instances in Action
Here are some cool examples of how F1 and F2 instances can support unique and highly demanding workloads:

Genomics – Multinational pharmaceutical and biotechnology company AstraZeneca used thousands of F1 instances to build the world’s fastest genomics pipeline, able to process over 400K whole genome samples in under two months. They will adopt Illumina DRAGEN for F2 to realize better performance at a lower cost, while accelerating disease discovery, diagnosis, and treatment.

Satellite Communication – Satellite operators are moving from inflexible and expensive physical infrastructure (modulators, demodulators, combiners, splitters, and so forth) toward agile, software-defined, FPGA-powered solutions. Using the digital signal processor (DSP) elements on the FPGA, these solutions can be reconfigured in the field to support new waveforms and to meet changing requirements. Key F2 features such as support for up to 8 FPGAs per instance, generous amounts of network bandwidth, and support for the Data Plan Development Kit (DPDK) using Virtual Ethernet can be used to support processing of multiple, complex waveforms in parallel.

AnalyticsNeuroBlade‘s SQL Processing Unit (SPU) integrates with Presto, Apache Spark, and other open source query engines, delivering faster query processing and market-leading query throughput efficiency when run on F2 instances.

Things to Know
Here are a couple of final things that you should know about the F2 instances:

Regions – F2 instances are available today in the US East (N. Virginia) and Europe (London) AWS Regions, with plans to extend availability to additional regions over time.

Operating Systems – F2 instances are Linux-only.

Purchasing Options – F2 instances are available in On-Demand, SpotSavings Plan, Dedicated Instance, and Dedicated Host form.

Jeff;

AWS Education Equity Initiative: Applying generative AI to educate the next wave of innovators

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-education-equity-initiative-applying-generative-ai-to-educate-the-next-wave-of-innovators/

Building on the work that we and our partners have been doing for many years, Amazon is committing up to $100 million in cloud technology and technical resources to help existing, dedicated learning organizations reach more learners by creating new and innovative digital learning solutions, all as part of the AWS Education Equity Initiative.

The Work So Far
AWS and Amazon have a long-standing commitment to learning and education. Here’s a sampling of what we have already done:

AWS AI & ML Scholarship Program – This program has awarded $28 million in scholarships to approximately 6000 students.

Machine Learning University – MLU offers a free program helping community colleges and Historically Black Colleges and Universities (HBCUs) teach data management, artificial intelligence, and machine learning concepts. The program is designed to address opportunity gaps by supporting students who are historically underserved and underrepresented in technology disciplines.

Amazon Future Engineer – Since 2021, up to $46 million in scholarships has been awarded to 1150 students through this program. In the past year, more than 2.1 million students received over 17 million hours of STEM education, literacy, and career exploration courses through this and other Amazon philanthropic education programs in the United States. I was able to speak to one such session last year and it was an amazing experience:

Free Cloud Training – In late 2020 we set a goal of helping 29 million people grow their tech skills with free cloud computing training by 2025. We worked hard and met that target a year ahead of time!

There’s More To Do
Despite all of this work and progress, there’s still more to be done. The future is definitely not evenly distributed: over half a billion students cannot be reached by digital learning today.

We believe that Generative AI can amplify the good work that socially-minded edtech organizations, non-profits, and governments are already doing. Our goal is to empower them to build new and innovative digital learning systems that can amplify their work and allow them to reach a bigger audience.

With the launch of the AWS Education Equity Initiative, we want to help pave the way for the next generation of technology pioneers as they build powerful tools, train foundation models at scale, and create AI-powered teaching assistants.

We are committing up to $100 million in cloud technology and comprehensive technical advising over the next five years. The awardees will have access to the portfolio of AWS services and technical expertise so that they can build and scale learning management systems, mobile apps, chatbots, and other digital learning tools. As part of the application process, applicants will be asked to demonstrate how their proposed solution will benefit students from underserved and underrepresented communities.

As I mentioned earlier, our partners are already doing a lot of great work in this area. For example:

Code.org has already used AWS to scale their free computer science curriculum to millions of students in more than 100 countries. With this initiative, they will expand their use of Amazon Bedrock to provide an automated assessment of student projects, freeing up educator time that can be use for individual instruction and tailored learning.

Rocket Learning focuses on early childhood education in India. They will use Amazon Q in QuickSight to enhance learning outcomes for more than three million children.

I’m super excited about this initiative and look forward to seeing how it will help to create and educate the next generation of technology pioneers!

Jeff;

Introducing queryable object metadata for Amazon S3 buckets (preview)

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/introducing-queryable-object-metadata-for-amazon-s3-buckets-preview/

AWS customers make use of Amazon Simple Storage Service (Amazon S3) at an incredible scale, regularly creating individual buckets that contain billions or trillions of objects! At that scale, finding the objects which meet particular criteria — objects with keys that match a pattern, objects of a particular size, or objects with a specific tag — becomes challenging. Our customers have had to build systems that capture, store, and query for this information. These systems can become complex and hard to scale, and can fall out of sync with the actual state of the bucket and the objects within.

Rich Metadata
Today we are enabling in preview automatic generation of metadata that is captured when S3 objects are added or modified, and stored in fully managed Apache Iceberg tables. This allows you to use Iceberg-compatible tools such as Amazon Athena, Amazon Redshift, Amazon QuickSight, and Apache Spark to easily and efficiently query the metadata (and find the objects of interest) at any scale. As a result, you can quickly find the data that you need for your analytics, data processing, and AI training workloads.

For video inference responses stored in S3, Amazon Bedrock will annotate the content it generates with metadata that will allow you to identify the content as AI-generated, and to know which model was used to generate it.

The metadata schema contains over 20 elements including the bucket name, object key, creation/modification time, storage class, encryption status, tags, and user metadata. You can also store additional, application-specific descriptive information in a separate table and then join it with the metadata table as part of your query.

How it Works
You can enable capture of rich metadata for any of your S3 buckets by specifying the location (an S3 table bucket and a table name) where you want the metadata to be stored. Capture of updates (object creations, object deletions, and changes to object metadata) begins right away and will be stored in the table within minutes. Each update generates a new row in the table, with a record type (CREATE, UPDATE_METADATA, or DELETE) and a sequence number. You can retrieve the historical record for a given object by running a query that orders the results by sequence number.

Enabling and Querying Metadata
I start by creating a table bucket for my metadata using the create-table-bucket command (this can also be done from the AWS Management Console or with an API call):

$ aws s3tables create-table-bucket --name jbarr-table-bucket-1 --region us-east-2
--------------------------------------------------------------------------------
|                               CreateTableBucket                              |
+-----+------------------------------------------------------------------------+
|  arn|  arn:aws:s3tables:us-east-2:123456789012:bucket/jbarr-table-bucket-1   |
+-----+------------------------------------------------------------------------+

Then I specify the table bucket (by ARN) and the desired table name by putting this JSON into a file (I’ll call it config.json):

{
  "S3TablesDestination": {
    "TableBucketArn": "arn:aws:s3tables:us-east-2:123456789012:bucket/jbarr-table-bucket-1",
    "TableName": "jbarr_data_bucket_1_table"
  }
}

And then I attach this configuration to my data bucket (the one that I want to capture metadata for):

$ aws s3tables create-bucket-metadata-table-configuration \
  --bucket jbarr-data-bucket-1 \
  --metadata-table-configuration file://./config.json \
  --region us-east-2

For testing purposes I installed Apache Spark on an EC2 instance and after a little bit of setup I was able to run queries by referencing the Amazon S3 Tables Catalog for Apache Iceberg package and adding the metadata table (as mytablebucket) to the command line:

$ bin/spark-shell \
--packages org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.6.0 \
--jars ~/S3TablesCatalog.jar \
--master yarn \
--conf "spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions" \
--conf "spark.sql.catalog.mytablebucket=org.apache.iceberg.spark.SparkCatalog" \
--conf "spark.sql.catalog.mytablebucket.catalog-impl=com.amazon.s3tables.iceberg.S3TablesCatalog" \
--conf "spark.sql.catalog.mytablebucket.warehouse=arn:aws:s3tables:us-east-2:123456789012:bucket/jbarr-table-bucket-1"

Here is the current schema for the Iceberg table:

scala> spark.sql("describe table mytablebucket.aws_s3_metadata.jbarr_data_bucket_1_table").show(100,35)

+---------------------+------------------+-----------------------------------+
|             col_name|         data_type|                            comment|
+---------------------+------------------+-----------------------------------+
|               bucket|            string|   The general purpose bucket name.|
|                  key|            string|The object key name (or key) tha...|
|      sequence_number|            string|The sequence number, which is an...|
|          record_type|            string|The type of this record, one of ...|
|     record_timestamp|     timestamp_ntz|The timestamp that's associated ...|
|           version_id|            string|The object's version ID. When yo...|
|     is_delete_marker|           boolean|The object's delete marker statu...|
|                 size|            bigint|The object size in bytes, not in...|
|   last_modified_date|     timestamp_ntz|The object creation date or the ...|
|                e_tag|            string|The entity tag (ETag), which is ...|
|        storage_class|            string|The storage class that's used fo...|
|         is_multipart|           boolean|The object's upload type. If the...|
|    encryption_status|            string|The object's server-side encrypt...|
|is_bucket_key_enabled|           boolean|The object's S3 Bucket Key enabl...|
|          kms_key_arn|            string|The Amazon Resource Name (ARN) f...|
|   checksum_algorithm|            string|The algorithm that's used to cre...|
|          object_tags|map<string,string>|The object tags that are associa...|
|        user_metadata|map<string,string>|The user metadata that's associa...|
|            requester|            string|The AWS account ID of the reques...|
|    source_ip_address|            string|The source IP address of the req...|
|           request_id|            string|The request ID. For records that...|
+---------------------+------------------+-----------------------------------+

Here’s a simple query that shows some of the metadata for the ten most recent updates:

scala> spark.sql("SELECT key,size, storage_class,encryption_status \
  FROM mytablebucket.aws_s3_metadata.jbarr_data_bucket_1_table \
  order by last_modified_date DESC LIMIT 10").show(false)
+--------------------+------+-------------+-----------------+                   
|key                 |size  |storage_class|encryption_status|
+--------------------+------+-------------+-----------------+
|wnt_itco_2.png      |36923 |STANDARD     |SSE-S3           |
|wnt_itco_1.png      |37274 |STANDARD     |SSE-S3           |
|wnt_imp_new_1.png   |15361 |STANDARD     |SSE-S3           |
|wnt_imp_change_3.png|67639 |STANDARD     |SSE-S3           |
|wnt_imp_change_2.png|67639 |STANDARD     |SSE-S3           |
|wnt_imp_change_1.png|71182 |STANDARD     |SSE-S3           |
|wnt_email_top_4.png |135164|STANDARD     |SSE-S3           |
|wnt_email_top_2.png |117171|STANDARD     |SSE-S3           |
|wnt_email_top_3.png |55913 |STANDARD     |SSE-S3           |
|wnt_email_top_1.png |140937|STANDARD     |SSE-S3           |
+--------------------+------+-------------+-----------------+

In a real-world situation I would query the table using one of the AWS or open source analytics tools that I mentioned earlier.

Console Access
I can also set up and manage the metadata configuration for my buckets using the Amazon S3 Console by clicking the Metadata tab:

Available Now
Amazon S3 Metadata is available in preview now and you can start using it today in the US East (Ohio, N. Virginia) and US West (Oregon) AWS Regions.

Integration with AWS Glue Data Catalog is in preview, allowing you to query and visualize data—including S3 Metadata tables—using AWS Analytics services such as Amazon Athena, Amazon Redshift, Amazon EMR, and Amazon QuickSight.

Pricing is based on the number updates (object creations, object deletions, and changes to object metadata) with an additional charge for storage of the metadata table. For more pricing information, visit the S3 Pricing page.

I’m confident that you will be able to make use of this metadata in many powerful ways, and am looking forward to hearing about your use cases. Let me know what you think!

Jeff;

New Amazon S3 Tables: Storage optimized for analytics workloads

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-amazon-s3-tables-storage-optimized-for-analytics-workloads/

Amazon S3 Tables give you storage that is optimized for tabular data such as daily purchase transactions, streaming sensor data, and ad impressions in Apache Iceberg format, for easy queries using popular query engines like Amazon Athena, Amazon EMR, and Apache Spark. When compared to self-managed table storage, you can expect up to 3x faster query performance and up to 10x more transactions per second, along with the operational efficiency that is part-and-parcel when you use a fully managed service.

Iceberg has become the most popular way to manage Parquet files, with thousands of AWS customers using Iceberg to query across often billions of files containing petabytes or even exabytes of data.

Table Buckets, Tables, and Namespaces
Table buckets are the third type of S3 bucket, taking their place alongside the existing general purpose and directory buckets. You can think of a table bucket as an analytics warehouse that can store Iceberg tables with various schemas. Additionally, S3 Tables deliver the same durability, availability, scalability, and performance characteristics as S3 itself, and automatically optimize your storage to maximize query performance and to minimize cost.

Each table bucket resides in a specific AWS Region and has a name that must be unique within the AWS account with respect to the region. Buckets are referenced by ARN and also have a resource policy. Finally, each bucket uses namespaces to logically group the tables in the bucket.

Tables are structured datasets stored in a table bucket. Like table buckets, they have ARNs and resource policies, and exist within one of the bucket’s namespaces. Tables are fully managed, with automatic, configurable continuous maintenance including compaction, management of aged snapshots, and removal of unreferenced files. Each table has an S3 API endpoint for storage operations.

Namespaces can be referenced from access policies in order to simplify access management.

Buckets and Tables from the Command Line
Ok, let’s dive right in, create a bucket, and put a table or two inside. I’ll use the AWS Command Line Interface (AWS CLI), but AWS Management Console and API support is also available. For conciseness, I will pipe the output of the more verbose commands through jq and show you only the most relevant values.

The first step is to create a table bucket:

$ aws s3tables create-table-bucket --name jbarr-table-bucket-2 | jq .arn
"arn:aws:s3tables:us-east-2:123456789012:bucket/jbarr-table-bucket-2"

For convenience, I create an environment variable with the ARN of the table bucket:

$ export ARN="arn:aws:s3tables:us-east-2:123456789012:bucket/jbarr-table-bucket-2"

And then I list my table buckets:

$ aws s3tables list-table-buckets | jq .tableBuckets[].arn
"arn:aws:s3tables:us-east-2:123456789012:bucket/jbarr-table-bucket-1"
"arn:aws:s3tables:us-east-2:123456789012:bucket/jbarr-table-bucket-2"

I can access and populate the table in many different ways. For testing purposes I installed Apache Spark, then invoked the Spark shell with command-line arguments to use the Amazon S3 Tables Catalog for Apache Iceberg package and to set mytablebucket to the ARN of my table.

I create a namespace (mydata) that I will use to group my tables:

scala> spark.sql("""CREATE NAMESPACE IF NOT EXISTS mytablebucket.mydata""")

Then I create a simple Iceberg table in the namespace:

spark.sql("""CREATE TABLE IF NOT EXISTS mytablebucket.mydata.table1
 (id INT,
  name STRING,
  value INT)
  USING iceberg
  """)

I use somes3tables commands to check my work:

$ aws s3tables list-namespaces --table-bucket-arn $ARN | jq .namespaces[].namespace[] 
"mydata"
$
$ aws s3tables list-tables --table-bucket-arn $ARN | jq .tables[].name
"table1"

Then I return to the Spark shell and add a few rows of data to my table:

spark.sql("""INSERT INTO mytablebucket.mydata.table1
  VALUES
  (1, 'Jeff', 100),
  (2, 'Carmen', 200),
  (3, 'Stephen', 300),
  (4, 'Andy', 400),
  (5, 'Tina', 500),
  (6, 'Bianca', 600),
  (7, 'Grace', 700)
  """)

Buckets and Tables from the Console
I can also create and work on table buckets using the S3 Console. I click Table buckets to get started:

Before creating my first bucket I click Enable integration so that I can access my table buckets from Amazon Athena, Amazon Redshift, Amazon EMR, and other AWS query engines (I can do this later if I don’t do it now):

I read the fine print and click Enable integration to create the specified IAM role and an entry in the AWS Glue Data Catalog:

After a few seconds the integration is enabled and I click Create table bucket to move ahead:

I enter a name (jbarr-table-bucket-3) and click Create table bucket:

From here I can create and use tables as I showed you earlier in the CLI section.

Table Maintenance
Table buckets take care of some important maintenance duties that would be your responsibility if you were creating and managing your own Iceberg tables. To relieve you of these duties so that you can spend more time on your table, the following maintenance operations are performed automatically:

Compaction – This process combines multiple small table objects into a larger object to improve query performance, in pursuit of a target file size that can be configured to be between 64 MiB and 512 MiB. The new object is rewritten as a new snapshot.

Snapshot Management – This process expires and ultimately removes table snapshots, with configuration options for the minimum number of snapshots to retain and the maximum age of a snapshot to retain. Expired snapshots are marked as non-current, then later deleted after a specified number of days.

Unreferenced File Removal – This process removes and deletes objects that are not referenced by any table snapshots.

Things to Know
Here are a couple of important things that you should know about table buckets and tables:

AWS Integration – S3 Tables integration with AWS Glue Data Catalog is in preview, allowing you to query and visualize data using AWS Analytics services such as Amazon Athena, Amazon Redshift, Amazon EMR, and Amazon QuickSight.

S3 API Support – Table buckets support relevant S3 API functions including GetObject, HeadObject, PutObject, and the multi-part upload operations.

Security – All objects stored in table buckets are automatically encrypted. Table buckets are configured to enforce Block Public Access.

Pricing – You pay for storage, requests, an object monitoring fee, and and fees for compaction. See the S3 Pricing page for more info.

Regions – You can use this new feature in the US East (Ohio, N. Virginia) and US West (Oregon) AWS Regions.

Jeff;

Amazon EC2 Trn2 Instances and Trn2 UltraServers for AI/ML training and inference are now available

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-ec2-trn2-instances-and-trn2-ultraservers-for-aiml-training-and-inference-is-now-available/

The new Amazon Elastic Compute Cloud (Amazon EC2) Trn2 instances and Trn2 UltraServers are the most powerful EC2 compute options for ML training and inference. Powered by the second generation of AWS Trainium chips (AWS Trainium2), the Trn2 instances are 4x faster, offer 4x more memory bandwidth, and 3x more memory capacity than the first-generation Trn1 instances. Trn2 instances offer 30-40% better price performance than the current generation of GPU-based EC2 P5e and P5en instances.

In addition to the 16 Trainium2 chips, each Trn2 instance features 192 vCPUs, 2 TiB of memory, and 3.2 Tbps of Elastic Fabric Adapter (EFA) v3 network bandwidth, which offers up to 50% lower latency than the previous generation.

The Trn2 UltraServers, which are a completely new compute offering, feature 64x Trainium2 chips connected with a high-bandwidth, low-latency NeuronLink interconnect, for peak inference and training performance on frontier foundation models.

Tens of thousands of Trainium chips are already powering Amazon and AWS services. For example, over 80,000 AWS Inferentia and Trainium1 chips supported the Rufus shopping assistant on the most recent Prime Day. Trainium2 chips are already powering the latency-optimized versions of Llama 3.1 405B and Claude 3.5 Haiku models on Amazon Bedrock.

Up and Out and Up
Sustained growth in the size and complexity of the frontier models is enabled by innovative forms of compute power, assembled into equally innovative architectural forms. In simpler times we could talk about architecting for scalability in two ways: scaling up (using a bigger computer) and scaling out (using more computers). Today, when I look at the Trainium2 chip, the Trn2 instance, and the even larger compute offerings that I will talk about in a minute, it seems like both models apply, but at different levels of the overall hierarchy. Let’s review the Trn2 building blocks, starting at the NeuronCore and scaling to an UltraCluster:

NeuronCores are at the heart of the Trainium2 chip. Each third-generation NeuronCore includes a scalar engine (1 input to 1 output), a vector engine (multiple inputs to multiple outputs), a tensor engine (systolic array multiplication, convolution, and transposition), and a GPSIMD (general purpose single instruction multiple data) core.

Each Trainium2 chip is home to eight NeuronCores and 96 GiB of High Bandwidth Memory (HBM), and supports 2.9 TB/second of HBM bandwidth. The cores can be addressed and used individually, or pairs of physical cores can be grouped into a single logical core. A single Trainium2 chip delivers up to 1.3 petaflops of dense FP8 compute and up to 5.2 petaflops of sparse FP8 compute, and can drive 95% utilization of memory bandwidth thanks to automated reordering of the HBM queue.

Each Trn2 instance is, in turn, home to 16 Trainum2 chips. That’s a total of 128 NeuronCores, 1.5 TiB of HBM, and 46 TB/second of HBM bandwidth. Altogether this multiplies out to up to 20.8 petaflops of dense FP8 compute and up to 83.2 petaflops of sparse FP8 compute. The Trainium2 chips are connected across NeuronLink in a 2D torus for high bandwidth, low latency chip-to-chip communication at 1 GB/second.

An UltraServer is home to four Trn2 instances connected with low-latency, high-bandwidth NeuronLink. That’s 512 NeuronCores, 64 Trainium2 chips, 6 TiB of HBM, and 185 TB/second of HBM bandwidth. Doing the math, this results in up to 83 petaflops of dense FP compute and up to 332 petaflops of sparse FP8 compute. In addition to the 2D torus that connects NeuronCores within an instance, Cores at corresponding XY positions in each of the four instances are connected in a ring. For inference, UltraServers help deliver industry-leading response time to create the best real-time experiences. For training, UltraServers boost model training speed and efficiency with faster collective communication for model parallelism when compared to standalone instances. UltraServers are designed to support training and inference at the trillion parameter level and beyond; they are available in preview form and you can contact us to join the preview.

Trn2 instances and UltraServers are being deployed in EC2 UltraClusters to enable scale-out distributed training across tens of thousands of Trainium chips on a single petabit scale, non-blocking network, with access to Amazon FSx for Lustre high performance storage.

Using Trn2 Instances
Trn2 instances are available today for production use in the US East (Ohio) AWS Region and can be reserved by using Amazon EC2 Capacity Blocks for ML. You can reserve up to 64 instances for up to six months, with reservations accepted up to eight weeks in advance, with instant start times and the ability to extend your reservations if needed. To learn more, read Announcing Amazon EC2 Capacity Blocks for ML to reserve GPU capacity for your machine learning workloads.

On the software side, you can start with the AWS Deep Learning AMIs. These images are preconfigured with the frameworks and tools that you probably already know and use: PyTorch, JAX, and a lot more.

If you used the AWS Neuron SDK to build your apps, you can bring them over and recompile them for use on Trn2 instances. This SDK integrates natively with JAX, PyTorch, and essential libraries like Hugging Face, PyTorch Lightning, and NeMo. Neuron includes out-of-the-box optimizations for distributed training and inference with the open source PyTorch libraries NxD Training and NxD Inference, while providing deep insights for profiling and debugging. Neuron also supports OpenXLA, including stable HLO and GSPMD, enabling PyTorch/XLA and JAX developers to utilize Neuron’s compiler optimizations for Trainium2.

Jeff;

Announcing Amazon FSx Intelligent-Tiering, a new storage class for FSx for OpenZFS

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/announcing-amazon-fsx-intelligent-tiering-a-new-storage-class-for-fsx-for-openzfs/

When I speak to customers who are planning to migrate massive amounts of on-premises data to AWS, they tell me that they want to simplify their storage management, reduce their costs, and to make the data more accessible so that it can be used for analytics, machine learning training, genomics, and other use cases. Customers are already using Network Attached Storage (NAS) on-premises, and are looking for a cloud-based upgrade that offers similar capabilities including point-in-time snapshots, data clones, and user management.

AWS customers such as Amdocs, Vela Games, and Astera Labs have been running their mission-critical and performance-intensive NAS workloads like databases, game development and streaming, and semiconductor chip design on Amazon FSx for OpenZFS. They’ve been using the existing SSD storage class on FSx to provide the predictable, high performance these workloads need. However, many other customers have large data sets that are stored on HDD-based or hybrid SSD/HDD-based NAS storage on prem that find it cost-prohibitive to move their data sets to all-SSD storage. Additionally, these customers are finding it increasingly challenging and expensive to manage provisioned storage on prem for unpredictable data sets and avoid running out of space. And they are keeping their NAS data around for longer because it could have future value for building their next model, investment strategy, or product, but that means they need to spend more time and effort monitoring access patterns and moving data around between hot and cold storage media to optimize costs.

FSx Intelligent-Tiering
Taking all of this into account, I am happy to be able to tell you about the new Amazon FSx Intelligent-Tiering storage class, available today for use with Amazon FSx for OpenZFS file systems. The new storage class is priced 85% lower than the existing SSD storage class and 20% lower than traditional HDD-based deployments on premises, and brings full elasticity and intelligent tiering to NAS data sets.

Your data moves between three storage tiers (Frequent Access, Infrequent Access, and Archive) with no effort on your part, so you get automatic cost savings with no upfront costs or commitments. Here’s how the tiers work:

Frequent Access – Data that has been accessed within the last 30 days is stored in this tier.

Infrequent Access – Data that has been not been accessed for 30 to 90 days is stored in tier, at a 44% cost reduction from Frequent Access.

Archive – Data that has not been accessed for 90 or more days is stored in this tier, at a 65% cost reduction from Infrequent Access.

Regardless of the storage tier, your data is stored across multiple AWS Availability Zones (AZs) for redundancy and availability, and can be retrieved instantly in milliseconds.

There’s no need to manage or pre-provision storage, making this storage class a great fit for uses case such as genomics, financial data analytics, seismic imagery analysis, and machine learning where storage requirements can change dramatically over the course of days or weeks.

Along with the potential for cost savings, you get high performance: up to 400K IOPS and 20 GB/second of throughput for each OpenZFS file system, with a time-to-first-byte of tens of milliseconds for all data, regardless of storage class. You can also configure an SSD-based read cache (64 GiB to 512 TiB) to reduce the time-to-first-byte by 10x to 100x for cached data.

Creating a File System
I can create a file system using the AWS Management Console, CLI, API, or a AWS CloudFormation. From the Console I click Create file system to get started:

I choose Amazon FSx for OpenZFS and click Next:

Then I enter a name (jeff_fsx_openzfs_1) for my file system and select the Intelligent-Tiering storage class. I choose the desired Throughput capacity, and I select one of the three sizing mode options for the read cache, click Next, and confirm my choices in order to create my file system:

It is ready within minutes, and I can NFS mount it to my EC2 instance:

$ sudo mkfs /fsx_zfs
$ sudo mount -t nfs -o noatime,nfsvers=4.2,sync,nconnect=16,rsize=1048576,wsize=1048576 \
  fs-00fc74f020d1e6f4e.fsx.us-east-2.aws.internal:/fsx/ /fsx_zfs/

After I run a representative workload for a while I can look at the metrics and review the performance of my file system:

It appears that I have plenty of throughput, but my read cache may be larger than needed. I created it in Automatically Provisioned mode, which allocated 3200 GiB of cache. I can change that (and save some money) with a couple of clicks:

I can also change the throughput capacity as needed:

Amazon FSx NAS Features and Attributes
Let’s take a quick look at some of the features which make FSx for OpenZFS and the FSx Intelligent-Tiering storage class a great for for your NAS-level storage needs:

Built-in Backups – Amazon FSx automatically makes a daily backup of each file system during a specified backup window and retains them for a specified retention period. The backups are file-system consistent, highly durable, and incremental. You can also create backups on your own and retain them for as long as needed.

Point-In-Time Snapshots -You can create a read-only image of an OpenZFS volume at any time. The snapshots are stored within the file system and consume storage; they can be used to restore a volume, restore individual files and folders, or to create a new volume as either a clone or a full-copy.

Replication – You can replicate a point-in-time view of an OpenZFS volume to another volume across file systems, AWS Regions, and AWS accounts. FSx uses ZFS send/receive technology behind the scenes to perform this replication and automatically establishes and maintains network connectivity between file systems to handle interruptions and resume data transfer as needed.

Data Compression – You can enable ZSTD or LZ4 compression on your OpenZFS volumes to reduce storage cost and speed up data transfer.

User and Volume Quotas – You can limit the amount of storage consumed by an individual volume or user.

Things to Know
Here are a couple of things to keep in mind before we wrap up:

Regions – This new storage class is available in the US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Mumbai, Singapore, Sydney, Tokyo), Canada (Central), and Europe (Frankfurt, Ireland) AWS Regions.

Pricing – Pricing is based on the amount of primary storage consumed (GB/Month) and read cache provisioned (GB/Month). See the Amazon FSx for OpenZFS Pricing page for more information.

Jeff;

Securely share AWS resources across VPC and account boundaries with PrivateLink, VPC Lattice, EventBridge, and Step Functions

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/securely-share-aws-resources-across-vpc-and-account-boundaries-with-privatelink-vpc-lattice-eventbridge-and-step-functions/

At some point, every AWS customer tells me that they have the desire to move into the future as quickly as possible. They want to simplify their modernization efforts, drive growth, and adapt to the cloud, while also reducing costs as they proceed. These customers typically have a large suite of legacy applications, possibly running on-premises, that are running on diverse technology stacks managed by disparate parts of the organization. To make things even more challenging, these organizations often have to meet stringent security and compliance requirements.

Prepare to Share
You can now share AWS resources such as Amazon Elastic Compute Cloud (Amazon EC2) instances, Amazon Elastic Container Service (Amazon ECS) and Amazon Elastic Kubernetes Service (Amazon EKS) container services, and your own HTTPS services across Amazon Virtual Private Cloud (Amazon VPC) and AWS account boundaries and use them to build event-driven apps via Amazon EventBridge and orchestrate workflows with AWS Step Functions. You can update your existing workloads, connect your modern cloud-native apps to on-premises legacy systems, with all communication routed across private endpoints and networks.

These new features build on Amazon VPC Lattice and AWS PrivateLink, and give you a lot of new options to design and control your network, along with some cool new ways to integrate and orchestrate across all of your technology stacks. For example, you can build hybrid event-driven architectures that make use of your existing on-premises applications.

Today, some customers use AWS Lambda functions or Amazon Simple Queue Service (Amazon SQS) queues to transfer data into VPCs. This undifferentiated heavy lifting can now be replaced with a simpler and more efficient solution.

Bringing all of this together, you get a set of services that will help you to accelerate your modernization efforts and simplify integration between your applications, regardless of where they are situated. EventBridge and Step Functions work hand-in-hand with PrivateLink and VPC Lattice to enable integration of public and private HTTPS-based applications into your event-driven architectures and workflows.

Here are the essential terms and concepts:

Resource Owner VPC – A VPC that has resources to be shared. The owner of this VPC creates a Resource Gateway with one or more associated Resource Configurations, then uses AWS Resource Access Manager (RAM) to share the Resource Configuration with the Resource Consumer, such as another AWS account, or a developer building event-driven architectures and workflows using EventBridge and Step Functions. Let’s define the Resource Owner as the person (maybe you) in your organization who is responsible for the care and feeding of this VPC.

Resource Gateway – Provides a point of ingress to a VPC so that clients can access resources in the Resource Owner VPC, as indicated by the Resource Configurations that are associated with the gateway. One Resource Gateway can make multiple resources available.

Resource – This can be a HTTPS endpoint, a database, a database cluster, an EC2 instance, an Application Load Balancer in front of multiple EC2 instances, an ECS service discoverable via AWS Cloud Map, an Amazon Elastic Kubernetes Service (Amazon EKS) service behind a Network Load Balancer, or a legacy service running in the Resource Owner VPC or running in on-premises across AWS Site-to-Site VPN or AWS Direct Connect.

Resource Configuration – Defines a set of resources that can be accessed through a particular Resource Gateway. The resources can be referenced by IP address, DNS name, or (for AWS resources) an ARN.

Resource Consumer – The person in your organization who is responsible for building applications that connect with and consume services provided by resources in a Resource Owner VPC.

Sharing Resources
You can put all of this power to use in a lot of different ways; I’ll focus on one for this post.

First, I will play the role of the Resource Owner. I click Resource gateways in the VPC Console, see that I don’t have a gateway, and click Create resource gateway to get started:

I assign a name (main-rg) and an IP address type, then pick the VPC and the private subnets where the gateway will have a presence (this is a one-shot selection that cannot be changed without creating a new Resource Gateway). I also choose up to five security groups to control inbound traffic:

I scroll down, assign any desired tags, and click Create resource gateway to proceed:

My new gateway is active within seconds; I nod in appreciation and click Create resource configuration to move ahead:

Now I need to create my first Resource Configuration. Let’s say that I have a HTTPS service running on an EC2 instance on a private subnet in my Resource Owner VPC. I assign a DNS name to the service and use a Amazon Route 53 Alias record which returns the IP address of the instance:

I am using a public hosted zone in this example. We already working on support for private hosted zones.

With DNS all set up, I click Create resource configuration to move ahead. I enter a name (rc-service1), choose Resource as the type, and select the Resource Gateway that I created earlier:

I scroll down and define my EC2 instance as a resource, entering the DNS name and setting up sharing for ports 80 and 443:

Now I take a small detour, and hop over to the RAM Console to create a Resource Share so that other AWS accounts can access the resources (this is optional, and only relevant for cross-account scenarios). I could create one Resource Share for each service, but in most cases I would create one share and use it to package up a collection of related services. I’ll do that, and call it shared-services:

Returning from my detour, I refresh the list of resource shares, pick the one that I created, and click Create resource configuration:

The resource configuration is ready within seconds.

Recap and Planning Time
Before moving ahead, let’s do a quick recap and make some plans. Here’s what I (in the role of Resource Provider) have so far:

  • MainVPC – My Resource Owner VPC.
  • main-rg – A Resource Gateway in MainVPC.
  • rc-service1 – The Resource Configuration for main-rg.
  • service1 – An HTTPS service hosted on an EC2 instance in a private subnet of MainVPC, at a fixed IP address.

Ok, so what’s next?

Share – This is the first and most obvious use use. I can use AWS Resource Access Manager (RAM) to share the Resource Configuration with another AWS account and access the service from another VPC. On the other side (as the Resource Consumer), I take a couple of quick steps to connect to the service that has been shared with me:

  • Service Network – I can create a service network, add the Resource Configuration to the Service Network, and create a VPC endpoint in a VPC to connect to the service network.
  • Endpoint – I can create a VPC endpoint in a VPC and access the shared resource via the endpoint.

Modernize – I can remove my legacy Lambda or SQS integration to get rid of some undifferentiated heavy lifting.

Build – I can use EventBridge and Step Functions to build event-driven architectures and orchestrate applications. I’ll take this option!

Accessing Private Resources with EventBridge and Step Functions
EventBridge and Step Functions already make it easy access to public HTTPS endpoints such as those from SaaS providers like Slack, Salesforce, and Adobe. With today’s launch, consuming private HTTPS services is just as easy.

As a Resource Consumer, I simply create an EventBridge connection, reference a Resource Configuration that was shared with me, and call the service from my event-driven application. Everything that I already know still applies, and I now have the new-found power to access private services.

To create the EventBridge connection, I open the EventBridge console and click Connections in the Integration  menu:

I review my existing connections (none so far), then click Create connection to move ahead:

I enter a name (MyService1) and a description for my connection, select Private as the API type, and choose the Resource Configuration that I created earlier:

Scrolling down, I need to configure the authorization for the service that I am connecting to. I select Custom configuration and Basic authorization, and enter the Username and Password for my service. I also add Action=Forecast to the query string (as you can see there are a lot of options for authorization), and click Create:

The connection is created and ready within minutes. Then I use it in my Step Functions workflows by using the HTTP Task, selecting the connection, entering the URL of my API endpoint, and choosing an HTTP method:

And that’s all there is to it: your Step Functions workflows can now make use of Private Resources!

I can also use this connection as an EventBridge API destination target in Event Buses and Pipes.

Things to Know
Here a couple of things to know about these cool new features:

Pricing – Existing pricing for Step Functions, EventBridge, PrivateLink, and VPC Lattice apply including the per-GB charge for data transfer into the VPC.

Regions – You can create and use Resource Gateways and Resource Configurations in 21 AWS Regions: US East (Ohio, N. Virginia), US West (N. California, Oregon), Africa (Cape Town), Asia Pacific (Hong Kong, Mumbai, Osaka, Seoul, Singapore, Sydney, Tokyo), Canada (Central), Europe (Frankfurt, Ireland, London, Milan, Paris, Stockholm), Middle East (Bahrain), and South America (São Paulo).

In the Works – As I noted earlier, we are already working on support for private hosted zones. We are also planning to support access to other types of AWS resources through EventBridge and Step Functions .

Jeff;

Announcing AWS Transfer Family web apps for fully managed Amazon S3 file transfers

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/announcing-aws-transfer-family-web-apps-for-fully-managed-amazon-s3-file-transfers/

Today I would like to introduce you to AWS Transfer Family web apps, the newest AWS Transfer Family resource. You can create a fully managed, no-code web app that allows authenticated users to list, upload, download, copy, and delete files in specific Amazon Simple Storage Service (Amazon S3) buckets. Non-developer, line-of-business users inside and outside of your organization can easily exchange file data without the need for desktop clients, scripts, faded instructions on sticky notes, or local IT help.

As the web apps administrator, you get full control over authentication, access, and permissions, and can customize the web app with a page title and a favicon. Here is the web app that I created while writing this blog post:

I can click files to download them, click folders to open them, and click columns to sort. The vertical ellipses menu provides additional options:

Each web app supports uploading and downloading of files up to 160 GiB in size, and uses multipart uploads for large files. Files are transferred across HTTPS connections protected by TLS, with automatic retries and a CRC32 end-to-end integrity check.

All about Transfer Family web apps
I will show you how to create your own web app in just a minute. But first, let’s take a look at some of the essential features and benefits…

Security – Transfer Family web apps use AWS IAM Identity Center, allowing you to use your existing SAML or OIDC identity provider or the built-in Identity Store. Either way, you can use S3 Access Grants to exercise full, fine-grained control over the users and groups that are allowed to see, download, delete, and upload files and to create directories. Your organization can also benefit from AWS Transfer Family’s compliance with SOC, PCI DSS, FedRAMP, HIPAA, and other programs.

Customization – You can customize each Transfer Family web app with a page title and a favicon. You can also put a Amazon CloudFront distribution in front of the web app and host it at a custom domain name, with HTTPS access and a public certificate.

AWS Ecosystem – Transfer Family web apps are hosted on AWS and as such are scalable and highly available. All files are stored in designated S3 buckets, with eleven nines (99.999999999%) of durability. You can take advantage of S3 features including S3 Versioning, S3 server access logging, S3 Event Notifications, and more. You can also use Amazon EventBridge to orchestrate complex post-upload workflows.

Creating a Transfer Family web app
Let’s go through the steps to create a Transfer Family web app. Each web app exists in a specific AWS Region, so I open the AWS Transfer Family console, choose the desired Region (us-east-2 for this post), and select Web apps on the left:

Then I click Create web app to proceed:

I connect to my IAM Identity Center if necessary, then create or choose an IAM service role (details) that allows the Transfer Family web app to access S3 and S3 Access Grants:

I add a Name tag and set the maximum number of concurrent web app users, then click Next:

Now I design my web app, setting the page title and the logo (both optional) before clicking Next:

On the next page I review my settings and click Create to move ahead:

And my web app is created and almost ready to use (I still need to set up permissions and users):

I will use the Access endpoint in the CORS policy that I will soon create for the bucket associated with the web app, so I copy and save it.

Setting Permissions and Users
I create an IAM custom trust policy that provides the necessary read and write permissions to the S3 bucket(s) that will be accessible through my web app (details). This policy will be referenced in an S3 Access Grant that I will create in a minute:

Moving right along, I create the initial set of users and groups in IAM Identity Center (I can add more later):

Next, I create an S3 bucket in the same region as the web app and create an S3 Access Grant. Each S3 Access Grant allows a particular IAM Identity Center identity (a user or a group) to access a specific scope (a bucket or a prefixed part of a bucket) for reading and/or writing:

I also need to attach a CORS policy (details) to the bucket so that the web app is allowed to access it from the browser:

The final step is to associate the users with the new web app. I return to the AWS Transfer Family Web apps page, find my app, and click Assign users and groups:

I can add new users to my directory or pick existing ones:

I’ll add myself to start:

Once assigned, I can share the Access endpoint (as seen above) with the user and they (me, in this case) can log in to the web app:

The Web app endpoint and the Access endpoint are the same by default. If you set up a CloudFront distribution for your web app, the Access endpoint will reflect the URL of the endpoint.

I have shown you the express path through the setup process. As you probably noticed, there are lots of options to control read and write access at the individual and group level. Be sure to explore and fully understand all of these options before you set up your production web app!

Things to Know
Here are a couple of things to know about S3 Transfer Family web apps:

Regions – Web apps can be created in nine AWS Regions; check out the web app documentation for a current list.

Pricing – Pricing is per web app/hour.

API and CLI – You can create and manage web apps programmatically by using create-web-app, describe-web-app, and other AWS Transfer Family actions.

Storage Browser for S3 – Transfer Family web apps are built using Storage Browser for Amazon S3 and offer the same end-user functionality in a fully managed offering.

Getting Started – You can get started with Transfer Family web apps in the Transfer Family console.

Jeff;

Now available: Storage optimized Amazon EC2 I7ie instances

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-available-storage-optimized-amazon-ec2-i7ie-instances/

The new storage optimized Amazon Elastic Compute Cloud (Amazon EC2) I7ie instances feature up to 120 TB of low latency NVMe storage and 5th generation Intel Xeon Scalable Processors with an all-core turbo frequency of 3.2 GHz. Fueled by 3rd Generation AWS Nitro SSDs, these instances deliver the highest storage density available in the cloud today. When compared to the previous generation of storage optimized instances, they provide:

  • Up to 65% better real-time storage performance per TB
  • Up to 50% lower I/O latency with up to 65% lower latency variability
  • Up to 40% better compute performance
  • Up to twice as many vCPUs and twice as much memory
  • 20% better price-performance

The instances are designed to support I/O intensive workloads that need a high degree of random IOPS: NoSQL databases, distributed file systems, search engines, data warehouses, and analytics.

I7ie instances are available in nine sizes with up to 192 vCPUs and 1.5 TiB of memory:

Instance Name vCPUs
Memory
NVMe Storage
(Nitro SSD)
EBS Bandwidth
Network Bandwidth
I7ie.large 2 16 GiB 1.25 TB
(1 x 1.25 TB)
Up to 10 Gbps Up to 25 Gbps
I7ie.xlarge 4 32 GiB 2.5 TB
(1 x 2.5 TB)
Up to 10 Gbps Up to 25 Gbps
I7ie.2xlarge 8 64 GiB 5 TB
(2 x 2.5 TB)
Up to 10 Gbps Up to 25 Gbps
I7ie.3xlarge 12 96 GiB 7.5 TB
(3 x 2.5 TB)
Up to 10 Gbps Up to 25 Gbps
I7ie.6xlarge 24 192 GiB 15 TB
(2 x 7.5 TB)
Up to 10 Gbps Up to 25 Gbps
I7ie.12xlarge 48 384 GiB 30 TB
(4 x 7.5 TB)
15 Gbps Up to 25 Gbps
I7ie.18xlarge 72 576 GiB 45 TB
(6 x 7.5 TB)
22.5 Gbps Up to 75 Gbps
I7ie.24xlarge 96 768 GiB 60 TB
(8 x 7.5 TB)
30 Gbps Up to 100 Gbps
I7ie.48xlarge 192 1,536 GiB 120 TB
(16 x 7.5 TB)
60 Gbps 100 Gbps

A larger L3 cache, increased memory bandwidth, and other improvements deliver increased processing power. The VP2INTERSECT instruction (part of AVX-512) accelerates Machine Learning and graph processing workloads; the Advanced Matrix Extensions (AMX) increase deep learning training and inferencing performance.

On the network side, the instances feature over 3x the EBS bandwidth of the previous generation of storage optimized instances. This accelerates just about every I/O-intensive use case, and is especially helpful when hydrating an in-memory database or caching server. All instances sizes support the Elastic Network Adapter (ENA) and can be launched in cluster placement groups; the 48xlarge instance size also supports the Elastic Fabric Adapter (EFA).

Things to Know
Here are a couple of things that you should know about these new instances:

Regions – We are launching in the US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and Europe (Frankfurt, London) AWS Regions today, with plans to expand to others in the future.

Purchase Options – I7ie instances are available in On-Demand, Spot, Savings Plan, Dedicated Instance, and Dedicated Host form.

Jeff;

New Amazon CloudWatch Database Insights: Comprehensive database observability from fleets to instances

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-amazon-cloudwatch-database-insights-comprehensive-database-observability-from-fleets-to-instances/

Observing your Amazon Aurora databases is now a whole lot easier. Instead of spending time setting up telemetry, building dashboards, and configuring alarms, you just open Amazon CloudWatch Database Insights and take a look. With no further setup, you can monitor the health of all of your Amazon Aurora MySQL and PostgreSQL instances in the selected Region:

Each of the sections contains a wealth of detail and I’ll get to that in a moment (this may be the ultimate “but wait, there’s more” post). From this view, I can open the filter control on the left and filter the set of instances in a couple of different ways. For example, I can filter for all of the instances running Amazon Aurora MySQL, and see that I have 66 such instances, with 3 raising alarms:

I can save the filter as a Fleet (note that Fleets are defined by specific properties and tags of the database instances and as such are inherently dynamic):

And then I can see the overall health of the fleet with a click. The entire page updates to reflect the fleet; I focus on the summary:

Behind the scenes, Database Insights looks for CloudWatch alarms that include a DBInstanceIdentifier dimension, and uses these alarms to establish a correlation between database instances and alarms. This, along with other built-in heuristics and correlation steps, allows Database Insights to deliver helpful, well-organized information that will help you to better understand the overall health of your fleet and to dive deep in order to find bottlenecks and other issues.

Clicking on an instance (represented by a hexagon) reveals details; I click on the instance name (demo-mysql-reader0) to learn more:

In the per-instance view I can also see a myriad of details:

Each of the tabs at the bottom provides additional insights into what’s happening inside the database instance. For example, selecting DB Load Analysis / Top SQL / SQL Metrics shows me which SQL statements are imposing the heaviest load, along with 29 additional metrics (not shown):

From past experience, I know that finding and understanding slow queries is a tedious yet important task. with Database Insights I can see patterns that are common to the slow queries, as well as the actual queries:

With help from AWS X-Ray, Application Signals, and the AWS Distro for OpenTelemetry SDK, I can see the services and operations that originate the queries to the database instance:

The red X indicates that this operation is failing the associated Service Level Objective (SLO), an application performance monitoring aspect of Application Signals. An SLO defines the reliability of a service against customer expectations, and can be set up by selecting the service and clicking Create SLO. There are a couple of steps and some very helpful options, but at the core a SLO is measured as a percentage of successful requests over an extended period of time:

If the database instance is configured to send logs to CloudWatch Logs, I can see and search the logs, filtered by the selected time period, and within a particular log group:

There’s still a lot more to explore at the fleet level. For example, I can see the ten calling services which drive the highest DB load (again, this is powered by AWS X-Ray, Application Signals, and the AWS Distro for OpenTelemetry SDK):

And I can see the top 10 instances with respect to any of eight different metrics:

I could go on all day, but I will leave the rest for you to explore. As I never tire of saying, this feature is available now and you can start using it today.

Things to Know
Here are a couple of things to know about Database Insights:

Supported Databases – You can use Database Insights with Amazon Aurora MySQL and Amazon Aurora PostgreSQL database instances.

Pricing – There is a per-hour, per-database instance charge based on the average number of vCPUs used (for provisioned instances) or Aurora Capacity Units (for Serverless v2 databases) monitored, with separate charges for ingestion and storage of database logs. See the CloudWatch Pricing page for more information.

Regions – This feature is available in all commercial AWS Regions.

Jeff;

Time-based snapshot copy for Amazon EBS

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/time-based-snapshot-copy-for-amazon-ebs/

You can now specify a desired completion duration (15 minutes to 48 hours) when you copy an Amazon Elastic Block Store (Amazon EBS) snapshot within or between AWS Regions and/or accounts. This will help you to meet time-based compliance and business requirements for critical workloads. For example:

Testing – Distribute fresh data on a timely basis as part of your Test Data Management (TDM) plan.

Development – Provide your developers with updated snapshot data on a regular and frequent basis.

Disaster Recovery – Ensure that critical snapshots are copied in order to meet a Recovery Point Objective (RPO).

Regardless of your use case, this new feature gives you consistent and predictable copies. This does not affect the performance or reliability of standard copies—you can choose the option and timing that works best for each situation.

Creating a Time-Based Snapshot Copy
I can create time-based snapshot copies from the AWS Management Console, CLI (copy-snapshot), or API (CopySnapshot). While working on this post I created two EBS volumes (100 GiB and 1 TiB), filled each one with files, and created snapshots:

To create a time-based snapshot, I select the source as usual and choose Copy snapshot from the Action menu. I enter a description for the copy, choose the us-east-1 AWS Region as the destination, select Enable time-based copy, and (because this is a time-critical snapshot), enter a 15 minute Completion duration:

When I click Copy snapshot, the request will be accepted (and the copy will become Pending) only if my account’s throughput quotas are not already exceeded due to the throughput consumed by other active copies that I am making to the destination region. If the account level throughput quota is already exceeded, the console will display an error.

I can click Launch copy duration calculator to get a better idea of the minimum achievable copy duration for the snapshot. I open the calculator, enter my account’s throughput limit, and choose an evaluation period:

The calculator then uses historical data collected over the course of previous snapshot copies to tell me the minimum achievable completion duration. In this example I copied 1,800,000 MiB in the last 24 hours; with time-based copy and my current account throughput quota of 2000 MiB/second I can copy this much data in 15 minutes.

While the copy is in progress, I can monitor progress using the console or by calling DescribeSnapshots and examining the progress field of the result. I can also use the following Amazon EventBridge events to take actions (if the copy operation crosses regions, the event is sent in the destination region):

copySnapshot – Sent after the copy operation completes.

copyMissedCompletionDuration – Sent if the copy is still pending when the deadline has passed.

Things to Know
And that’s just about all there is to it! Here’s what you need to know about time-based snapshot copies:

CloudWatch Metrics – The SnapshotCopyBytesTransferred metric is emitted in the destination region, and reflect the amount of data transferred between the source and destination region in bytes.

Duration – The duration can range from 15 minutes to 48 hours in 15 minute increments, and is specified on a per-copy basis.

Concurrency – If a snapshot is being copied and I initiate a second copy of the same snapshot to the same destination, the duration for the second one starts when the first one is completed.

Throughput – There is a default per-account limit of 2000 MiB/second between each source and destination pair. If you need additional throughput in order to meet your RPO you can request an increase via the AWS Support Center. Maximum per-snapshot throughput is 500 MiB/second and cannot be increased.

Pricing – Refer to the Amazon EBS Pricing page for complete pricing information.

Regions – Time-based snapshot copies are available in all AWS Regions.

Jeff;

AWS Lambda turns ten – looking back and looking ahead

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-lambda-turns-ten-the-first-decade-of-serverless-innovation/

I have a very vague memory of a 2013-era meeting with my then-colleague Tim Wagner. The term serverless did not exist, but we chatted about various ways to allow developers to focus on code instead of on infrastructure. At some I recall throwing my arms skyward and indicating that it would be cool to simply toss the code into the air and have the cloud grab, store, and run it. After many more such meetings, Tim wrote a PRFAQ proposing that we build a platform that did just that, and in 2014 I was able to announce AWS Lambda – Run Code in the Cloud.

From Startup to Enterprise
It is often the case that startups, with no installed base to worry about and the need to innovate, are the first to take a chance on something new such as Lambda. While that certainly did happen, I was pleasantly surprised to find that established companies—up to and including enterprises—were just as quick to jump in. After a bit of experimentation, they quickly found ways to build event-driven applications that supported critical internal use cases. I took this as an early indicator that Lambda would be a success. It was easy to see how quickly our customers felt a new sense of empowerment: they could move from idea to implementation, and from there to realizing business value, more quickly than ever, while still building their systems in a scalable and composable way.

Today, over 1.5 million Lambda users collectively make tens of trillion function invocations per month. These customers use Lambda for file processing, stream processing (in conjunction with Amazon Kinesis and/or Amazon MSK), web applications, IoT backends, mobile backends (often using Amazon API Gateway and AWS Amplify as well) and to support and power many other use cases.

The First Decade of Serverless Innovation
Let’s roll back the calendar and take a look at a few of the more significant Lambda launches of the past decade:

2014 – The preview launch of AWS Lambda ahead of AWS re:Invent 2014 with support for Node.js and the ability to respond to event triggers from S3 buckets, DynamoDB tables, and Kinesis streams.

2015General availability, use of Amazon Simple Notification Service (Amazon SNS) notifications as triggers, and support for Lambda functions written in Java.

2016 – Support for DynamoDB Streams, Support for Python, and an increase in the function duration to 5 minutes (this was later raised to 15 minutes). Access to resources in a VPC, the power to call Lambda functions from Amazon Aurora stored procedures, environment variables, and the Serverless Application Model. This year also saw the introduction of Step Functions, which gave you the power to compose multiple Lambda functions to build more complex applications.

2017Support for AWS X-Ray, launches of AWS SAM Local and the Serverless Application Repository.

2018Support for Amazon SQS as an event trigger, the power to Extend AWS CloudFormation with Lambda-powered macros, and the ability to write your Lambda functions in any programming language.

2019 – Support for provisioned concurrency to give you additional control over performance.

2020 – Access to Savings Plans to save up to 17%, the ability for Lambda functions to access a shared file system, support for AWS PrivateLink to access your functions over a private network, code signing, billing at 1 ms granularity, functions that can use up to 10 MB of memory and 6 vCPUs, and support for container images.

2021Amazon S3 Object Lambda to let you process data as it is being retrieved from S3, AWS Lambda Extensions, support for running Lambda functions on Graviton processors.

2022 – Support for up to 10 GB of ephemeral storage per function invocation, HTTPS endpoints for Lambda functions, and Lambda SnapStart to make function invocation faster and more predictable.

2023 – Amazon S3 Object Lambda support for CloudFront, response streaming, and 12x faster function scaling when handling an unpredictable volume of requests.

2024 -New controls to make it easier to capture and search your Lambda function logs, SnapStart support for Java functions that use the ARM64 architecture, recursive loop detection, a new console editor based on VS Code, and an enhanced local IDE experience. The last two launches were designed to improve the developer experience by making it easier to build, test, debug, and deploy Lambda functions.

Again, this is just a subset of what we have launched. If you want to find even more launches, check out the Lambda category tag and search the What’s New for Lambda.

The Next Decade of Serverless
From the start, the vision for severless has been about helping developers to move from idea to business value more quickly. With that in mind, here are a couple of trends that seem clear to me as I look at Lambda’s direction over the first decade:

Default Choice – The serverless model is definitely here to stay, and will likely become the default operating model over time.

Continued Shift Toward Composability – Over time I can see that serverless applications will continue to make increasing use of reusable, prebuilt components. Aided by AI-powered development tools, a lot of new code will focus on connecting exiting components together in new and powerful ways. This will also boost consistency and reliability across applications.

Automated, AI-Optimized Infrastructure Management– We have already seen that Lambda reduces the amount of time and effort needed for managing infrastructure. Going forward, I can see that Machine Learning and other forms of AI will help to optimize costs and performance by allocating resources optimally with minimal human intervention. Applications will run on infrastructure that is automated, self-healing, and fault tolerant.

Extensibility and Integration – As a consequence of the two previous items, applications should be able to grow and adapt to changing conditions more easily than ever.

Security – Automated infrastructure management, real-time monitoring and other forms of threat detection, and AI-assisted remediation will work together to make serverless applications even more secure.

Some Lambda Resources
If you are already using Lambda to build serverless apps, great! If you are ready to get started, here are some resources to help out:

Serverless Training – Enroll in the free Serverless Learning Plan to learn about serverless concepts, common patterns, and best practices. Read the Serverless Ramp-Up Guide, and look at our extensive (in both topic and language) selection of digital training courses and in-person classroom training:

Case Studies – Review the AWS Serverless Customer Success stories to learn how AWS customers are building and innovating with Lambda and other serverless technologies.

re:Invent 2024 Sessions -Browse the re:Invent 2024 Session Catalog to find nearly 200 sessions focused on Serverless Compute & Containers:

Podcast – Listen to Episode 137 (AWS Lambda: A Decade of Transformation) of the AWS Developers Podcast to hear Marc Brooker and Julian Wood discuss the origins, evolution, and impact of Lambda.

New Books – Take a peek at some of the newest books on serverless development and architecture:

I hope that you have enjoyed this not-so-brief look at the past, present, and future of AWS Lambda. Leave me a comment and let me know what you think!

Jeff;

AWS BuilderCards second edition at re:Invent 2024

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-buildercards-second-edition-available-at-reinvent-2024-and-online/

I have been following the progress of AWS BuilderCards for several years. Players of all skill levels use the cards to learn about AWS in a fun and engaging way, competing (in a friendly fashion) to combine their cards to form architectures, gaining knowledge and scoring points as they play:

To date, more than 15,000 sets of BuilderCards have been printed and put to use over the course of three re:Invents, fifteen AWS Summits, and multiple community events. For example, here is a group of AWS enthusiasts having a good time in Tokyo during JAWS Days 2024:

Feedback from players has been incredibly positive, with a 4.8 star customer satisfaction score (CSAT) across more than 1500 replies.

Second Edition Now Available
I am happy to announce that the second edition of AWS BuilderCards will be distributed at re:Invent 2024 and will soon be available for purchase online. In response to feedback on the first edition, we have made many changes to the second edition. Here’s a summary:

Design — The revised design focuses on the contents of the card, with additional gradients and patterns to make the cards more attractive and playful. The font size was increased, game effects are now represented as icons, and the QR codes now point to the BuilderCards site:

Game Mechanics — The mechanics have been revised to improve balance, simplify play, and remove some bias. There are also some starter cards that represent a common on-premises data center environment.

Generative AI Add-On Pack — This new deck includes 50 additional BuilderCards designed to show how Generative AI and AWS architecture align. This add-on pack also includes a set of Mission cards. These cards add context, with QR codes that link to published best practices or solutions. The cards are larger, with additional text and an architecture diagram. Use is optional, with players earning extra points for completing a mission. Each deck also includes five blank BuilderCards and two blank Mission cards to support customization:

Get Your BuilderCards
If you will be at re:Invent 2024, make plans to visit the BuilderCards play area on Level 1 of Caesar’s Forum next to the PeerTalk area:

Sunday – 10 AM to 6 PM
Monday – 9 AM to 5 PM
Tuesday – 9 AM to 5 PM
Wednesday – 9 AM to 5 PM
Thursday – 9 AM to 4 PM

You can play against other re:Invent attendees and also get your own game and add-on packs to take home (we’ll give away 1,000 per day). Thanks to hard work by Masanori Yamaguchi (AWS Hero) and Kosuke Enomoto (AWS Community Builder) we will also have BuilderCards in Japanese for on-site play. You can read Kosuke’s blog post (AWS BuilderCards Japanese Edition for JAWS DAYS) to learn more about the translation process.

If you won’t be able to attend re:Invent, you will soon be able to purchase a deck of your own. Check back here for more info!

Stay Tuned
The BuilderCards team is working on additional goodies including expansion packs and additional language packs.

Jeff;

AWS named as a Leader in the 2024 Gartner Magic Quadrant for Desktop as a Service (DaaS)

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-named-as-a-leader-in-the-2024-gartner-magic-quadrant-for-desktop-as-a-service-daas/

The 2024 Gartner Magic Quadrant for DaaS (Desktop as a Service) positions AWS as a Leader for the first time. Last year we were recognized as a Challenger. We believe this is a result of our commitment to meet a wide range of customer needs by delivering a diverse portfolio of virtual desktop services with license portability (including Microsoft 365 Apps for Enterprise), our geographic strategy, and operational capabilities focused on cost optimization and automation. Also, our focus on easy-to-use interfaces for managing each aspect of our virtual desktop services means that our customers rarely need to make use of third-party tools.

You can access the complete 2024 Gartner Magic Quadrant for Desktop as a Service (DaaS) to learn more.

2024-Gartner-MQ-for-DaaS-Graph

AWS DaaS Offerings
Let’s take a quick look at our lineup of DaaS offerings (part of our End User Computing portfolio):

Amazon WorkSpaces Family – Originally launched in early 2014 and enhanced frequently ever since, Amazon WorkSpaces gives you a desktop computing environment running Microsoft Windows, Ubuntu, Amazon Linux, or Red Hat Enterprise Linux in the cloud. Designed to support remote & hybrid workers, knowledge workers, developer workstations, and learning environments, WorkSpaces is available in sixteen AWS Regions, in your choice of six bundle sizes, including the GPU-equipped Graphics G4dn bundle. WorkSpaces Personal gives each user a persistent desktop — perfect for developers, knowledge workers, and others who need to install apps and save files or data. If your users do not need persistent desktops (often the case for contact centers, training, virtual learning, and back office access) you can use WorkSpaces Pools to simplify management and reduce costs. WorkSpaces Core provides managed virtual desktop infrastructure that is designed to work with third-party VDI solutions such as those from Citrix, Leostream, Omnissa, and Workspot.

Amazon WorkSpaces clients are available for desktops and tablets, with web access (Amazon WorkSpaces Secure Browser) and the Amazon WorkSpaces Thin Client providing even more choices. If you have the appropriate Windows 10 or 11 desktop license from Microsoft, you can bring your own license to the cloud (also known as BYOL), where it will run on hardware that is dedicated to you.

You can read about the Amazon WorkSpaces Family and review the WorkSpaces Features to learn more about what WorkSpaces has to offer.

Amazon AppStream 2.0 – Launched in late 2016, Amazon AppStream gives you instant, streamed access to SaaS applications and desktop applications without writing code or refactoring the application. You can easily scale applications and make them available to users across the globe without the need to manage any infrastructure. A wide range of compute, memory, storage, GPU, and operating system options let you empower remote workers, while also taking advantage of auto-scaling to avoid overprovisioning. Amazon AppStream offers three fleet types: Always on (instant connections), On-Demand (2 minutes to launch), and Elastic (for unpredictable demand). Pricing varies by type, with per second and per hour granularity for Windows and Linux; read Amazon AppStream 2.0 Pricing to learn more.

Jeff;

Gartner does not endorse any vendor, product or service depicted in its research publications and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

GARTNER is a registered trademark and service mark of Gartner and Magic Quadrant is a registered trademark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and are used herein with permission. All rights reserved.

This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from AWS.

Now available: Graviton4-powered memory-optimized Amazon EC2 X8g instances

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-available-graviton4-powered-memory-optimized-amazon-ec2-x8g-instances/

Graviton-4-powered, memory-optimized X8g instances are now available in ten virtual sizes and two bare metal sizes, with up to 3 TiB of DDR5 memory and up to 192 vCPUs. The X8g instances are our most energy efficient to date, with the best price performance and scale-up capability of any comparable EC2 Graviton instance to date. With a 16 to 1 ratio of memory to vCPU, these instances are designed for Electronic Design Automation, in-memory databases & caches, relational databases, real-time analytics, and memory-constrained microservices. The instances fully encrypt all high-speed physical hardware interfaces and also include additional AWS Nitro System and Graviton4 security features.

Over 50K AWS customers already make use of the existing roster of over 150 Graviton-powered instances. They run a wide variety of applications including Valkey, Redis, Apache Spark, Apache Hadoop, PostgreSQL, MariaDB, MySQL, and SAP HANA Cloud. Because they are available in twelve sizes, the new X8g instances are an even better host for these applications by allowing you to choose between scaling up (using a bigger instance) and scaling out (using more instances), while also providing additional flexibility for existing memory-bound workloads that are currently running on distinct instances.

The Instances
When compared to the previous generation (X2gd) instances, the X8g instances offer 3x more memory, 3x more vCPUs, more than twice as much EBS bandwidth (40 Gbps vs 19 Gbps), and twice as much network bandwidth (50 Gbps vs 25 Gbps).

The Graviton4 processors inside the X8g instances have twice as much L2 cache per core as the Graviton2 processors in the X2gd instances (2 MiB vs 1 MiB) along with 160% higher memory bandwidth, and can deliver up to 60% better compute performance.

The X8g instances are built using the 5th generation of AWS Nitro System and Graviton4 processors, which incorporates additional security features including Branch Target Identification (BTI) which provides protection against low-level attacks that attempt to disrupt control flow at the instruction level. To learn more about this and Graviton4’s other security features, read How Amazon’s New CPU Fights Cybersecurity Threats and watch the re:Invent 2023 AWS Graviton session.

Here are the specs:

Instance Name vCPUs
Memory (DDR5)
EBS Bandwidth
Network Bandwidth
x8g.medium 1 16 GiB Up to 10 Gbps Up to 12.5 Gbps
x8g.large 2 32 GiB Up to 10 Gbps Up to 12.5 Gbps
x8g.xlarge 4 64 GiB Up to 10 Gbps Up to 12.5 Gbps
x8g.2xlarge 8 128 GiB Up to 10 Gbps Up to 15 Gbps
x8g.4xlarge 16 256 GiB Up to 10 Gbps Up to 15 Gbps
x8g.8xlarge 32 512 GiB 10 Gbps 15 Gbps
x8g.12xlarge 48 768 GiB 15 Gbps 22.5 Gbps
x8g.16xlarge 64 1,024 GiB 20 Gbps 30 Gbps
x8g.24xlarge 96 1,536 GiB 30 Gbps 40 Gbps
x8g.48xlarge 192 3,072 GiB 40 Gbps 50 Gbps
x8g.metal-24xl 96 1,536 GiB 30 Gbps 40 Gbps
x8g.metal-48xl 192 3,072 GiB 40 Gbps 50 Gbps

The instances support ENA, ENA Express, and EFA Enhanced Networking. As you can see from the table above they provide a generous amount of EBS bandwidth, and support all EBS volume types including io2 Block Express, EBS General Purpose SSD, and EBS Provisioned IOPS SSD.

X8g Instances in Action
Let’s take a look at some applications and use cases that can make use of 16 GiB of memory per vCPU and/or up to 3 TiB per instance:

Databases – X8g instances allow SAP HANA and SAP Data Analytics Cloud to handle larger and more ambitious workloads than before. Running on Graviton4 powered instances, SAP has measured up to 25% better performance for analytical workloads and up to 40% better performance for transactional workloads in comparison to the same workloads running on Graviton3 instances. X8g instances allow SAP to expand their Graviton-based usage to even larger memory bound solutions.

Electronic Design Automation – EDA workloads are central to the process of designing, testing, verifying, and taping out new generations of chips, including Graviton, Trainium, Inferentia, and those that form the building blocks for the Nitro System. AWS and many other chip makers have adopted the AWS Cloud for these workloads, taking advantage of scale and elasticity to supply each phase of the design process with the appropriate amount of compute power. This allows engineers to innovate faster because they are not waiting for results. Here’s a long-term snapshot from one of the clusters that was used to support development of Graviton4 in late 2022 and early 2023. As you can see this cluster runs at massive scale, with peaks as high as 5x normal usage:

You can see bursts of daily and weekly activity, and then a jump in overall usage during the tape-out phase. The instances in the cluster are on the large end of the size spectrum so the peaks represent several hundred thousand cores running concurrently. This ability to spin up compute when we need it and down when we don’t gives us access to unprecedented scale without a dedicated investment in hardware.

The new X8g instances will allow us and our EDA customers to run even more workloads on Graviton processors, reducing costs and decreasing energy consumption, while also helping to get new products to market faster than ever.

Available Now
X8g instances are available today in the US East (N. Virginia), US West (Oregon), and Europe (Frankfurt) AWS Regions in On Demand, Spot, Reserved Instance, Savings Plan, Dedicated Instance, and Dedicated Host form. To learn more, visit the X8g page.

AWS and Multicloud: Existing capabilities & continued enhancements

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-and-multicloud-existing-capabilities-continued-enhancements/

When I speak to large-scale AWS customers about their challenges and concerns, the conversation often turns to the topic of multicloud. Whether by intent or by accident, these customers sometimes choose to make use of services from more than one cloud provider, sometimes in conjunction with applications or services that are still hosted on-premises. In some cases they made early, bottom-up choices at the team and division level, choosing cloud offerings from multiple vendors in the absence of a top-down mandate. In others, they acquired or merged with another organization and discovered a similar multi-vendor situation.

Regardless of the path, these customers tell me that they want to simplify and centralize their oversight and management of this diverse portfolio of cloud and on-premises resources. It is sometimes the case that the “multi” situation is time-bound, with a plan in place to ultimately consolidate operations in one place. It is also sometimes the case that the customer plans to retain their diverse portfolio.

AWS and multicloud
Our goal with AWS is to make you successful no matter what architectural choices you have made. In this post I want to outline our approach, share some capabilities that our customers have been using over the years, and provide you with an update on some of the more recent service announcements and content that we have created to give you guidance that will help you to succeed.

Our approach is to extend existing AWS operational and management capabilities to work in multicloud and hybrid environments. Because we extend existing capabilities, your investment in training, development, scripting, and runbooks is preserved, and actually becomes even more worthwhile since it applies to your other (non-AWS) resources. For example, you can use the same service (AWS Systems Manager) to patch and update Amazon Elastic Compute Cloud (Amazon EC2) instances, servers running on-premises, and servers provided by other cloud providers. Similarly, you can use Amazon CloudWatch to monitor applications, compute resources, and other cloud resources in all of those environments. These are two examples of how we are putting our approach into practice for you.

The AWS Solutions for Hybrid and Multicloud page contains additional examples of our extension-based approach to adding new capabilities, along some success stories from customers who have put the capabilities to use including Phillips 66 and Deutsche Börse.

Whether you choose to operate entirely on AWS or in multicloud and hybrid environments, one of the primary reasons to adopt AWS is the broad choice of services we offer, enabling you to innovate, build, deploy, and monitor your workloads. Just as we recently launched free data transfer out to the internet (DTO) when you want to move outside of AWS, we are committed to helping you be successful regardless of your approach.

Now that I have explained our approach and highlighted some of the principal multicloud service offerings, let’s take a look at a few of the newest multicloud and hybrid capabilities.

Multicloud launches
Since the beginning of 2023 we have launched eighteen new multicloud capabilities to existing AWS services, including 15 for data & analytics, 1 for security, and 2 for identity. Many of these launches add to the existing multicloud capabilities of the respective services:

AWS DataSync – This service transfers data between storage services. In addition to existing support for Google Cloud Storage, Azure Files, and Azure Blob Storage, we added support for five additional cloud service providers and storage services including Oracle Cloud Storage and DigitalOcean Spaces (full list). To learn more about this service, read What is AWS DataSync. To get started, I create a source location:

AWS Glue – This data integration service helps you to discover, prepare, and integrate all of your data at any scale. You can use it to connect to more than 80 different data sources, including cloud databases and analytics services. In October 2023, we introduced additional new connectors that allow you to move data bidirectionally between Amazon Simple Storage Service (Amazon S3), and either Azure Blob Storage or Azure Data Lake Storage (full list). We also launched six database connectors for AWS Glue for Apache Spark, including Teradata, SAP HANA, Azure SQL, Azure Cosmos DB, Vertica, and MongoDB (full list). To learn more about AWS Glue, read What is AWS Glue. I create a visual job flow to get started:

Amazon Athena – This serverless analytics service lets you use interactive SQL queries to analyze petabyte-scale data where it lives (more than 25 external data sources, including other cloud data stores), without copying or transforming it. Last year we added a new data source connector that allows you to query data in Google Cloud Storage. To learn more about Amazon Athena, read What is Amazon Athena.

Amazon AppFlow – You can take advantage of data and analytics in Google BigQuery using a connector available in Amazon AppFlow. To get started with Amazon AppFlow I create a flow and configure a data source:

Amazon Security Lake – This service helps you to achieve a more complete, organization-wide view of your security posture. It centralizes security data from your AWS environments, SaaS providers, on-premises environments, and cloud sources (Azure and GCP) into a purpose-built data lake. It became generally available last year, and now supports collection and analysis of security data from sources that support the Open Cybersecurity Schema Framework (OCSF) standard—more than 80 sources (full list).

AWS Secrets Manager – This service centrally manages secrets such as database credentials and API keys. Secrets are securely encrypted and can be centrally audited, with support for replication to support disaster recovery and multi-region applications. Last year we announced that you can Use AWS Secrets Manager to store and manage secrets in on-premises or multicloud workloads. To learn more, read What is AWS Secrets Manager.

AWS Identity and Access Management (IAM) – AWS IAM Identity Center now supports automated user provisioning from Google Workspace. The integration helps administrators simplify AWS access management across multiple accounts while maintaining familiar Google Workspace experiences for end users as they sign in.

Amazon CloudWatch – This service lets you query, visualize, and alarm on metrics of all sorts: application, AWS, on-premises, and multicloud. At re:Invent 2023 we added even more support for consolidation of hybrid, multicloud, and on-premises metrics. This new feature allows you to select and configure connectors that pull data from Amazon Managed Service for Prometheus, generic Prometheus, Amazon OpenSearch Service, Amazon RDS for MySQL, Amazon RDS for PostgreSQL, CSV files stored in Amazon Simple Storage Service (Amazon S3), and Microsoft Azure Monitor.

Multicloud content and guidance
Now that you know about some of our latest multicloud launches, let’s take a look at some of the blog posts and other content that my colleagues have created.

First, some blog posts:

Next, some of the most popular multicloud videos from AWS re:Invent 2023:

And finally, be sure to bookmark the AWS Solutions for Hybrid and Multicloud page.

We’re here to help
If you are running in a multicloud environment and are ready to simplify and centralize, be sure to reach out to your AWS Account Manager (AM) or Technical Account Manager (TAM). Both will be happy to help!

Jeff;

Amazon WorkSpaces Pools: Cost-effective, non-persistent virtual desktops

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-workspaces-pools-cost-effective-non-persistent-virtual-desktops/

You can now create a pool of non-persistent virtual desktops using Amazon WorkSpaces and share them across a group of users. As the desktop administrator you can manage your entire portfolio of persistent and non-persistent virtual desktops using one GUI, command line, or set of API-powered tools. Your users can log in to these desktops using a browser, a client application (Windows, Mac, or Linux), or a thin client device.

Amazon WorkSpaces Pools (non-persistent desktops)
WorkSpaces Pools ensures that each user gets the same applications and the same experience. When the user logs in, they always get access to a fresh WorkSpace that’s based on the latest configuration for the pool, centrally managed by their administrator. If the administrator enables application settings persistence for the pool, users can configure certain application settings, such as browser favorites, plugins, and UI customizations. Users can also access persistent file or object storage external to the desktop.

These desktops are a great fit for many types of users and use cases including remote workers, task workers (shared service centers, finance, procurement, HR, and so forth), contact center workers, and students.

As the administrator for the pool, you have full control over the compute resources (bundle type) and the initial configuration of the desktops in the pool, including the set of applications that are available to the users. You can use an existing custom WorkSpaces image, create a new one, or use one of the standard ones. You can also include Microsoft 365 Apps for Enterprise on the image. You can configure the pool to accommodate the size and working hours of your user base, and you can optionally join the pool to your organization’s domain and active directory.

Getting started
Let’s walk through the process of setting up a pool and inviting some users. I open the WorkSpaces console and choose Pools to get started:

I have no pools, so I choose Create WorkSpace on the Pools tab to begin the process of creating a pool:

The console can recommend workspace options for me, or I can choose what I want. I leave Recommend workspace options… selected, and choose No – non-persistent to create a pool of non-persistent desktops. Then I select my use cases from the menu and pick the operating system and choose Next to proceed:

The use case menu has lots of options:

On the next page I start by reviewing the WorkSpace options and assigning a name to my pool:

Next, I scroll down and choose a bundle. I can pick a public bundle or a custom one of my own. Bundles must use the WSP 2.0 protocol. I can create a custom bundle to provide my users with access to applications or to alter any desired system settings.

Moving right along, I can customize the settings for each user session. I can also enable application settings persistence to save application customizations and Windows settings on a per-user basis between sessions:

Next, I set the capacity of my pool, and optionally establish one or more schedules based on date or time. The schedules give me the power to match the size of my pool (and hence my costs) to the rhythms and needs of my users:

If the amount of concurrent usage is more dynamic and not aligned to a schedule, then I can use manual scale out and scale in policies to control the size of my pool:


I tag my pool, and then choose Next to proceed:

The final step is to select a WorkSpaces pool directory or create a new one following these steps. Then, I choose Create WorkSpace pool.

WorkSpaces Pools Directory

After the pool has been created and started, I can send registration codes to users, and they can log in to a WorkSpace:

WorkSpaces Pools Login with Registration Code

I can monitor the status of the pool from the console:

WorkSpaces Pool Status On Console

Things to know
Here are a couple of things that you should know about WorkSpaces Pools:

Programmatic access – You can automate the setup process that I showed above by using functions like CreateWorkSpacePool, DescribeWorkSpacePool, UpdateWorkSpacePool, or the equivalent AWS command line interface (CLI) commands.

Regions – WorkSpaces Pools is available in all commercial AWS Regions where WorkSpaces Personal is available, except Israel (Tel Aviv), Africa (Cape Town), and China (Ningxia). Check the full Region list for future updates.

Pricing – Refer to the Amazon WorkSpaces Pricing page for complete pricing information.

Visit Amazon WorkSpaces Pools to learn more.

Jeff;

Optimizing Amazon Simple Queue Service (SQS) for speed and scale

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/optimizing-amazon-simple-queue-service-sqs-for-speed-and-scale/

After several public betas, we launched Amazon Simple Queue Service (Amazon SQS) in 2006. Nearly two decades later, this fully managed service is still a fundamental building block for microservices, distributed systems, and serverless applications, processing over 100 million messages per second at peak times.

Because there’s always a better way, we continue to look for ways to improve performance, security, internal efficiency, and so forth. When we do find a potential way to do something better, we are careful to preserve existing behavior, and often run new and old systems in parallel to allow us to compare results.

Today I would like to tell you how we recently made improvements to Amazon SQS to reduce latency, increase fleet capacity, mitigate an approaching scalability cliff, and reduce power consumption.

Improving SQS
Like many AWS services, Amazon SQS is implemented using a collection of internal microservices. Let’s focus on two of them today:

Customer Front-End – The customer-facing front-end accepts, authenticates, and authorizes API calls such as CreateQueue and SendMessage. It then routes each request to the storage back-end.

Storage Back-End -This internal microservice is responsible for persisting messages sent to standard (non-FIFO) queues. Using a cell-based model, each cluster in the cell contains multiple hosts, each customer queue is assigned to one or more clusters, and each cluster is responsible for a multitude of queues:

Connections – Old and New
The original implementation used a connection per request between these two services. Each front-end had to connect to many hosts, which mandated the use of a connection pool, and also risked reaching an ultimate, hard-wired limit on the number of open connections. While it is often possible to simply throw hardware at problems like this and scale out, that’s not always the best way. It simply moves the moment of truth (the “scalability cliff”) into the future and does not make efficient use of resources.

After carefully considering several long-term solutions, the Amazon SQS team invented a new, proprietary binary framing protocol between the customer front-end and storage back-end. The protocol multiplexes multiple requests and responses across a single connection, using 128-bit IDs and checksumming to prevent crosstalk. Server-side encryption provides an additional layer of protection against unauthorized access to queue data.

It Works!
The new protocol was put into production earlier this year and has processed 744.9 trillion requests as I write this. The scalability cliff has been eliminated and we are already looking for ways to put this new protocol to work in other ways.

Performance-wise, the new protocol has reduced dataplane latency by 11% on average, and by 17.4% at the P90 mark. In addition to making SQS itself more performant, this change benefits services that build on SQS as well. For example, messages sent through Amazon Simple Notification Service (Amazon SNS) now spend 10% less time “inside” before being delivered. Finally, due to the protocol change, the existing fleet of SQS hosts (a mix of X86 and Graviton-powered instances) can now handle 17.8% more requests than before.

More to Come
I hope that you have enjoyed this little peek inside the implementation of Amazon SQS. Let me know in the comments, and I will see if I can find some more stories to share.

Jeff;

IAM Access Analyzer Update: Extending custom policy checks & guided revocation

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/iam-access-analyzer-update-extending-custom-policy-checks-guided-revocation/

We are making IAM Access Analyzer even more powerful, extending custom policy checks and adding easy access to guidance that will help you to fine-tune your IAM policies. Both of these new features build on the Custom Policy Checks and the Unused Access analysis that were launched at re:Invent 2023. Here’s what we are launching:

New Custom Policy Checks – Using the power of automated reasoning, the new checks help you to detect policies that grant access to specific, critical AWS resources, or that grant any type of public access. Both of the checks are designed to be used ahead of deployment, possibly as part of your CI/CD pipeline, and will help you proactively detect updates that do not conform to your organization’s security practices and policies.

Guided Revocation – IAM Access Analyzer now gives you guidance that you can share with your developers so that they can revoke permissions that grant access that is not actually needed. This includes unused roles, roles with unused permissions, unused access keys for IAM users, and unused passwords for IAM users. The guidance includes the steps needed to either remove the extra items or to replace them with more restrictive ones.

New Custom Policy Checks
The new policy checks can be invoked from the command line or by calling an API function. The checks examine a policy document that is supplied as part of the request and return a PASS or FAIL value. In both cases, PASS indicates that the policy document properly disallows the given access, and FAIL indicates that the policy might allow some or all of the permissions. Here are the new checks:

Check No Public Access – This check operates on a resource policy, and checks to see if the policy grants public access to a specified resource type. For example, you can check a policy to see if it allows public access to an S3 bucket by specifying the AWS::S3::Bucket resource type. Valid resource types include DynamoDB tables and streams, EFS file systems, OpenSearch domains, Kinesis streams and stream consumers, KMS keys, Lambda functions, S3 buckets and access points, S3 Express directory buckets, S3 Outposts buckets and access points, Glacier, Secrets Manager secrets, SNS topics and queues, and IAM policy documents that assume roles. The list of valid resource types will expand over time and can be found in the CheckNoPublicAccess documentation,

Let’s say that I have a policy which accidentally grants public access to an Amazon Simple Queue Service (Amazon SQS) queue. Here’s how I check it:

$ aws accessanalyzer check-no-public-access --policy-document file://resource.json \
  --resource-type AWS::SQS::Queue --output json

And here is the result:

{
    "result": "FAIL",
    "message": "The resource policy grants public access for the given resource type.",
    "reasons": [
        {
            "description": "Public access granted in the following statement with sid: SqsResourcePolicy.",
            "statementIndex": 0,
            "statementId": "SqsResourcePolicy"
        }
    ]
}

I edit the policy to remove the access grant and try again, and this time the check passes:

{
    "result": "PASS",
    "message": "The resource policy does not grant public access for the given resource type."
}

Check Access Not Granted – This check operates on a single resource policy or identity policy at a time. It also accepts an list of actions and resources, both in the form that are acceptable as part of an IAM policy. The check sees if the policy grants unintended access to any of the resources in the list by way of the listed actions. For example, this check could be used to make sure that a policy does not allow a critical CloudTrail trail to be deleted:

$ aws accessanalyzer check-access-not-granted --policy-document file://ct.json \
  --access resources="arn:aws:cloudtrail:us-east-1:123456789012:trail/MySensitiveTrail" \
  --policy-type IDENTITY_POLICY --output json

IAM Access Analyzer indicates that the check fails:

{
    "result": "FAIL",
    "message": "The policy document grants access to perform one or more of the listed actions or resources.",
    "reasons": [
        {
            "description": "One or more of the listed actions or resources in the statement with index: 0.",
            "statementIndex": 0
        }
    ]
}

I fix the policy and try again, and this time the check passes, indicating that the policy does not grant access to the listed resources:

{
    "result": "PASS",
    "message": "The policy document does not grant access to perform the listed actions or resources."
}

Guided Revocation
In my earlier post I showed you how IAM Access Analyzer discovers and lists IAM items that grant access which is not actually needed. With today’s launch, you now get guidance to help you (or your developer team) to resolve these findings. Here are the latest findings from my AWS account:

Some of these are leftovers from times when I was given early access to a service so that I could use and then blog about it; others are due to my general ineptness as a cloud admin! Either way, I need to clean these up. Let’s start with the second one, Unused access key. I click on the item and can see the new Recommendations section at the bottom:

I can follow the steps and delete the access key or I can click Archive to remove the finding from the list of active findings and add it to the list of archived ones. I can also create an archive rule that will do the same for similar findings in the future. Similar recommendations are provided for unused IAM users, IAM roles, and passwords.

Now let’s take a look at a finding of Unused permissions:

The recommendation is to replace the existing policy with a new one. I can preview the new policy side-by-side with the existing one:

As in the first example I can follow the steps or I can archive the finding.

The findings and the recommendations are also available from the command line. I generate the recommendation by specifying an analyzer and a finding from it:

$ aws accessanalyzer generate-finding-recommendation \
  --analyzer-arn arn:aws:access-analyzer-beta:us-west-2:123456789012:analyzer/MyAnalyzer \
  --id 67110f3e-05a1-4562-b6c2-4b009e67c38e

Then I retrieve the recommendation. In this example, I am filtering the output to only show the steps since the entire JSON output is fairly rich:

$ aws accessanalyzer get-finding-recommendation \
  --analyzer-arn arn:aws:access-analyzer-beta:us-west-2:123456789012:analyzer/MyAnalyzer \
  --id 67110f3e-05a1-4562-b6c2-4b009e67c38e --output json | \
  jq .recommendedSteps[].unusedPermissionsRecommendedStep.recommendedAction
"CREATE_POLICY"
"DETACH_POLICY"

You can use these commands (or the equivalent API calls) to integrate the recommendations into your own tools and systems.

Available Now
The new checks and the resolution steps are available now and you can start using them today in all public AWS regions!

Jeff;