Tag Archives: announcements

AWS Weekly Roundup – Amazon MWAA, EMR Studio, Generative AI, and More – August 14, 2023

2023-08-14 Antje Barth

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-mwaa-emr-studio-generative-ai-and-more-august-14-2023/

While I enjoyed a few days off in California to get a dose of vitamin sea, a lot has happened in the AWS universe. Let’s take a look together!

Last Week’s Launches
Here are some launches that got my attention:

Amazon MWAA now supports Apache Airflow version 2.6 – Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is a managed orchestration service for Apache Airflow that you can use to set up and operate end-to-end data pipelines in the cloud. Apache Airflow version 2.6 introduces important security updates and bug fixes that enhance the security and reliability of your workflows. If you’re currently running Apache Airflow version 2.x, you can now seamlessly upgrade to version 2.6.3. Check out this AWS Big Data Blog post to learn more.

Amazon EMR Studio adds support for AWS Lake Formation fine-grained access control – Amazon EMR Studio is a web-based integrated development environment (IDE) for fully managed Jupyter notebooks that run on Amazon EMR clusters. When you connect to EMR clusters from EMR Studio workspaces, you can now choose the AWS Identity and Access Management (IAM) role that you want to connect with. Apache Spark interactive notebooks will access only the data and resources permitted by policies attached to this runtime IAM role. When data is accessed from data lakes managed with AWS Lake Formation, you can enforce table and column-level access using policies attached to this runtime role. For more details, have a look at the Amazon EMR documentation.

AWS Security Hub launches 12 new security controls – AWS Security Hub is a cloud security posture management (CSPM) service that performs security best practice checks, aggregates alerts, and enables automated remediation. With the newly released controls, Security Hub now supports three additional AWS services: Amazon Athena, Amazon DocumentDB (with MongoDB compatibility), and Amazon Neptune. Security Hub has also added an additional control against Amazon Relational Database Service (Amazon RDS). AWS Security Hub now offers 276 controls. You can find more information in the AWS Security Hub documentation.

Additional AWS services available in the AWS Israel (Tel Aviv) Region – The AWS Israel (Tel Aviv) Region opened on August 1, 2023. This past week, AWS Service Catalog, Amazon SageMaker, Amazon EFS, and Amazon Kinesis Data Analytics were added to the list of available services in the Israel (Tel Aviv) Region. Check the AWS Regional Services List for the most up-to-date availability information.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Here are some additional blog posts and news items that you might find interesting:

AWS recognized as a Leader in 2023 Gartner Magic Quadrant for Contact Center as a Service with Amazon Connect – AWS was named a Leader for the first time since Amazon Connect, our flexible, AI-powered cloud contact center, was launched in 2017. Read the full story here.

Generate creative advertising using generative AI – This AWS Machine Learning Blog post shows how to generate captivating and innovative advertisements at scale using generative AI. It discusses the technique of inpainting and how to seamlessly create image backgrounds, visually stunning and engaging content, and reducing unwanted image artifacts.

AWS open-source news and updates – My colleague Ricardo writes this weekly open-source newsletter in which he highlights new open-source projects, tools, and demos from the AWS Community.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

Build On Generative AI – Your favorite weekly Twitch show about all things generative AI is back for season 2 today! Every Monday, 9:00 US PT, my colleagues Emily and Darko look at new technical and scientific patterns on AWS, inviting guest speakers to demo their work and show us how they built something new to improve the state of generative AI.

In today’s episode, Emily and Darko discussed the latest models LlaMa-2 and Falcon, and explored them in retrieval-augmented generation design patterns. You can watch the video here. Check out show notes and the full list of episodes on community.aws.

AWS NLP Conference 2023 – Join this in-person event on September 13–14 in London to hear about the latest trends, ground-breaking research, and innovative applications that leverage natural language processing (NLP) capabilities on AWS. This year, the conference will primarily focus on large language models (LLMs), as they form the backbone of many generative AI applications and use cases. Register here.

AWS Global Summits – The 2023 AWS Summits season is almost coming to an end with the last two in-person events in Mexico City (August 30) and Johannesburg (September 26).

AWS Community Days – Join a community-led conference run by AWS user group leaders in your region: West Africa (August 19), Taiwan (August 26), Aotearoa (September 6), Lebanon (September 9), and Munich (September 14).

AWS re:Invent (November 27 – December 1) – Join us to hear the latest from AWS, learn from experts, and connect with the global cloud community. Registration is now open.

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Antje

P.S. We’re focused on improving our content to provide a better customer experience, and we need your feedback to do so. Take this quick survey to share insights on your experience with the AWS Blog. Note that this survey is hosted by an external company, so the link doesn’t lead to our website. AWS handles your information as described in the AWS Privacy Notice.

New — File Release for Amazon FSx for Lustre

2023-08-09 Veliswa Boya

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/new-file-release-for-amazon-fsx-for-lustre/

Amazon FSx for Lustre provides fully managed shared storage with the scalability and high performance of the open-source Lustre file systems to support your Linux-based workloads. FSx for Lustre is for workloads where storage speed and throughput matter. This is because FSx for Lustre helps you avoid storage bottlenecks, increase utilization of compute resources, and decrease time to value for workloads that include artificial intelligence (AI) and machine learning (ML), high performance computing (HPC), financial modeling, and media processing. FSx for Lustre integrates natively with Amazon Simple Storage Service (Amazon S3), synchronizing changes in both directions with automatic import and export, so that you can access your Amazon S3 data lakes through a high-performance POSIX-compliant file system on demand.

Today, I’m excited to announce file release for FSx for Lustre. This feature helps you manage your data lifecycle by releasing file data that has been synchronized with Amazon S3. File release frees up storage space so that you can continue writing new data to the file system while retaining on-demand access to released files through the FSx for Lustre lazy loading from Amazon S3. You specify a directory to release from, and optionally a minimum amount of time since last access, so that only data from the specified directory, and the minimum amount of time since last access (if specified), is released. File release helps you with data lifecycle management by moving colder file data to S3 enabling you to take advantage of S3 tiering.

File release tasks are initiated using the AWS Management Console, or by making an API call using the AWS CLI, AWS SDK, or Amazon EventBridge Scheduler to schedule release tasks at regular intervals. You can choose to receive completion reports at the end of your release task if so desired.

Initiating a Release Task
As an example, let’s look at how to use the console to initiate a release task. To specify criteria for files to release (for example, directories or time since last access), we define release data repository tasks (DRTs). DRTs release all files that are synchronized with Amazon S3 and that meet the specified criteria. It’s worth noting that release DRTs are processed in sequence. This means that if you submit a release DRT while another DRT (for example, import or export) is in progress, the release DRT will be queued but not processed until after the import or export DRT has completed.

Note: For the data repository association to work, automatic backups for the file system must be disabled (use the Backups tab to do this). Secondly, ensure that the file system and the associated S3 bucket are in the same AWS Region.

I already have an FSx for Lustre file system my-fsx-test.

I create a data repository association, which is a link between a directory on the file system and an S3 bucket or prefix.

I specify the name of the S3 bucket or an S3 prefix to be associated with the file system.

After the data repository association has been created, I select Create release task.

The release task will release directories or files that you want to release based on your specific criteria (again, important to remember that these files or directories must be synchronized with an S3 bucket in order for the release to work). If you specified the minimum last access for release (in addition to the directory), files that have not been accessed more recently than that will be released.

In my example, I chose to Disable completion reports. However, if you choose to Enable completion reports, the release task will produce a report at the end of the release task.

Files that have been released can still be accessed using existing FSx for Lustre functionality to automatically retrieve data from Amazon S3 back to the file system on demand. This is because, although released, their metadata stays on the file system.

File release won’t automatically prevent your file system from becoming full. It remains important to ensure that you don’t write more data than the available storage capacity before you run the next release task.

Now Available
File release on FSx for Lustre is available today in all AWS Regions where FSx for Lustre is supported, on all new or existing S3-linked file systems running Lustre version 2.12 or later. With file release on FSx for Lustre, there is no additional cost. However, if you release files that you later access again from the file system, you will incur normal Amazon S3 request and data retrieval costs where applicable when those files are read back into the file system.

To learn more, visit the Amazon FSx for Lustre Page, and please send feedback to AWS re:Post for Amazon FSx for Lustre or through your usual AWS support contacts.

– Veliswa

AWS Weekly Roundup – AWS Storage Day, AWS Israel (Tel Aviv) Region, and More – Aug 8, 2023

2023-08-08 Veliswa Boya

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-aws-storage-day-aws-israel-tel-aviv-region-and-more-aug-8-2023/

(Editor’s note: Today, we are changing the title of this regular weekly post from AWS Week in Review to AWS Weekly Roundup to better reflect the mix of recent top news and announcements as well as upcoming events you won’t want to miss.)

It’s taken me some time to finally be comfortable with being in front of a camera, a strange thing for a Developer Advocate to say I know! Last week I joined a couple of my team-mates at the AWS London Studios to record a series of videos that will be published in our Build On AWS YouTube Channel. Build On AWS is for the hands-on, technical AWS cloud builder who wants to become more agile and innovate faster. In the channel, you’ll find dynamic, high-quality content that’s designed for developers, by developers!

This video tells you more about what you’ll find in the channel. Check it out and consider subscribing to not miss out when we publish new content.

Now on to the AWS updates. There was a lot of news related to AWS last week, and I’ve compiled a few announcements and upcoming events you need to know about. Let’s get started!

Last Week’s Launches
Here are a few launches from last week that you might have missed:

Microsoft 365 Apps for enterprise now available on Amazon WorkSpaces services – Amazon WorkSpaces is a fully managed, secure, and reliable virtual desktop in the AWS Cloud. With Amazon WorkSpaces, you improve IT agility and maximize user experience, while only paying for the infrastructure that you use. We announced the availability of Microsoft 365 Apps for enterprise on Amazon WorkSpaces. You can bring your own Microsoft 365 licenses (if they meet Microsoft’s licensing requirements) and activate the applications at no additional cost to run Microsoft 365 Apps for enterprise on Amazon WorkSpaces services.

AWS Israel (Tel Aviv) Region is Now Open – You can now securely store data in Israel while serving users in the vicinity with even lower latency. This is because last week we launched the Tel Aviv Region to give customers an additional option for running applications and serving users from data centers located in Israel.

Amazon Connect Launches – This is one of my favorite AWS services to write about because of how Amazon Connect is changing our customers’ engagement with their own customers. Last week, Amazon Connect announced automatic activity scheduling based on shift duration, custom flow block titles, and archiving and deleting flows from the UI, to name a few.

Other AWS News
A few more news items and blog posts you might have missed:

Customizable thresholds for health events supported on Amazon CloudWatch Internet Monitor – Until this announcement, the default threshold for overall availability and performance scores to invoke a health event was 95 percent. Now, you can customize the thresholds for when to invoke a health event for internet-facing traffic between your end users and your applications hosted on AWS.

Improved AWS Backup performance for Amazon S3 buckets – Now you can speed up your initial Amazon S3 backup workflow and back up buckets with more than 3 billion objects due to improvements to the speed of backups by up to 10x for buckets with more than 300 million objects. This performance improvement is automatically enabled at no additional cost in all Regions where AWS Backup support for Amazon S3 is available.

For AWS open-source news and updates, check out the latest newsletter curated by my colleague Ricardo Sueiras to bring you the most recent updates on open-source projects, posts, events, and more.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Upcoming AWS Events
We have the following upcoming events:

AWS Storage Day (August 9) – A one-day virtual event where you’ll learn how to prepare for AI/ML with the storage decisions you make now, how to do more with your budget by optimizing storage costs for on-premises and cloud data, and how to deliver holistic data protection for your organization, including recovery planning to help protect against ransomware. Learn more and register here.

AWS Summit Mexico City (August 30) – Sign up for the Summit to connect and collaborate with other like-minded folks while learning about AWS.

AWS Community Days (August 12, 19) – Join these community-led conferences where event logistics and content are planned, sourced, and delivered by community leaders: Colombia (August 12), and West Africa (August 19).

– Veliswa

New — Deliver Interactive Real-Time Live Streams with Amazon IVS

2023-08-07 Donnie Prakoso

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/new-deliver-interactive-real-time-live-streams-with-amazon-ivs/

Live streaming is becoming an increasingly popular way to connect customers with their favorite influencers and brands through interactive live video experiences. Our customers, DeNA and Rooter, rely on Amazon Interactive Video Service (Amazon IVS), a fully managed live streaming solution, to build engaging live stream and interactive video experiences for their audiences.

In March we introduced Amazon IVS support for multiple hosts in live streams to further provide flexibility in building interactive experiences, by using a resource called stage. A stage is a virtual space where participants can exchange audio and video in real time.

However, latency is still a critical component to engaging audiences and enriching the overall experience. The lower the latency, the better it is to connect with live audiences in a direct and personal way. Previously, Amazon IVS supported real-time live streaming for up to 12 hosts via stages with around 3–5 seconds latency for viewers via channels. This latency gap restricts the ability to build interactive experiences with direct engagement for wider audiences.

Introducing Amazon IVS Real-Time Streaming
Today, I’m excited to share that with Amazon IVS Real-Time Streaming, you now can deliver real-time live streams to 10,000 viewers with up to 12 hosts from a stage, with latency that can be under 300 milliseconds from host to viewer.

This feature unlocks the opportunity for you to build interactive video experiences for social media applications or for latency sensitive use cases like auctions.

Now you will no longer have to compromise to achieve real-time latency for viewers. You can avoid such workarounds as using multiple AWS services or external tools. Instead, you can simply use Amazon IVS as a centralized service to deliver real-time interactive live streams, and you don’t even need to enable anything on your account to start using this feature.

Deliver Real-time Streams with The Amazon IVS Broadcast SDK
To deliver real-time streams, you need to interact with a stage resource and use the Amazon IVS Broadcast SDK available on iOS, Android, and web. With a stage, you can create a virtual space for participants to join as either viewers or hosts with real-time latency that can be under 300 ms.

You can use a stage to build an experience where hosts and viewers can go live together. For example, inviting viewers to become hosts and join other hosts in a Q&A session, delivering a singing competition, or having multiple guests in a talk show.

We published an overview on how to get started with a stage resource on the Add multiple hosts to live streams with Amazon IVS page. Let me do a quick refresher for the overall flow and how to interact with a stage resource.

First, you need to create a stage. You can do this via the console or programmatically using the Amazon IVS API. The following command is an example of how to create a stage using the create-stage API and AWS CLI.

$ aws ivs-realtime create-stage \
    --region us-east-1 \
    --name demo-realtime \
{
    "stage": {
        "arn": "arn:aws:ivs:us-east-1:xyz:stage/mEvTj9PDyBwQ",
        "name": "demo-realtime",
        "tags": {}
    }
}

A key concept for a stage resource that enables participants to join as a host or a viewer is a participation token. A participant token is an authorization token that lets your participants publish or subscribe to a stage. When you’re using the create-stage API, you can also generate a participation token and add additional information by using attributes, including custom user IDs and their display names. The API responds with stage details and participant tokens.

$ aws ivs-realtime create-stage \
    --region us-east-1 \
    --name demo-realtime \
    --participant-token-configurations userId=test-1,capabilities=PUBLISH,SUBSCRIBE,attributes={demo-attribute=test-1}

{
    "participantTokens": [
        {
            "attributes": {
                "demo-attribute": "test-1"
            },
            "capabilities": [
                "PUBLISH",
                "SUBSCRIBE"
            ],
            "participantId": "p7HIfs3v9GIo",
            "token": "TOKEN",
            "userId": "test-1"
        }
    ],
    "stage": {
        "arn": "arn:aws:ivs:us-east-1:xyz:stage/mEvTj9PDyBwQ",
        "name": "demo-realtime",
        "tags": {}
    }
}

In addition to the create-stage API, you can also programmatically generate participant tokens using the API. Currently, there are two capability values for a participant token, PUBLISH and SUBSCRIBE. If you need to invite a participant to host, you need to add a PUBLISH capability while creating the participant token. With the PUBLISH attribute, you can include video and audio of your host into a stream.

Here is an example on how you can generate a participant token.

$ aws ivs-realtime create-participant-token \
    --region us-east-1 \
	--capabilities PUBLISH \
	--stage-arn ARN \
	--user-id test-2

{
    "participantToken": {
        "capabilities": [
            "PUBLISH"
        ],
        "expirationTime": "2023-07-23T23:48:57+00:00",
        "participantId": "86KGafGbrXpK",
        "token": "TOKEN",
        "userId": "test-2"
    }
}

Once you have generated a participant token, you need to distribute it to your respective clients using, for example, a WebSocket message. Then, within your client applications using Amazon IVS Broadcast SDK, you can use this participant token to let the your users join the stage as hosts or viewers. To learn more on how you can interact with a stage resource, you can see and review the sample demo for iOS or Android, and the supporting serverless applications for real-time demo.

At this point, you’re able to deliver real-time live streams using a stage to 10,000 viewers. If you need to extend the stream to a wider audience, you can use your stage as the input for a channel and use the Amazon IVS Low-Latency Streaming capability. With a channel, you can deliver high concurrency video from a single source with low latency that can be under 5 seconds to millions of viewers. You can learn more on how to publish a stage to a channel on the Amazon IVS Broadcast SDK documentation page, which includes information for iOS, Android, and web.

Layered Encoding Feature for Amazon IVS Real-Time Streaming Capability
End users prefer a live stream with good quality. However, the quality of the live stream depends on various factors, such as the health of their network connections and device performance.

The most common scenario is that viewers will receive a single version of video that is above their optimum viewing configuration. For example, if the host can produce high-quality video, the live stream can be enjoyed by viewers with good connections, but viewers with slower connections would experience loading delays or even an inability to watch the videos. However, if the host can only produce low-quality video, viewers with good connections will get less optimal video, while viewers with slower connections will have a better experience.

To address the issue, with this announcement we also released the layered encoding feature for Amazon IVS Real-Time Streaming capability. With layered encoding (also known as simulcast) when you publish to a stage, Amazon IVS will automatically send multiple variations of video and audio. This ensures your viewers can continue to enjoy the stream at the best quality they can receive based on their network conditions.

Customer Voices
During the private preview period, we heard lots of feedback from our customers about Amazon IVS Real-Time Streaming.

Whatnot is a live stream shopping platform and marketplace that allows collectors and enthusiasts to connect with their community to buy and sell products they’re passionate about. “Scaling live video auctions to our global community is one of our major engineering challenges. Ensuring real-time latency is fundamental to maintaining the integrity and excitement of our auction experience. By leveraging Amazon IVS Real-Time Streaming, we can confidently scale our operations worldwide, assuring a seamless and high-quality real-time video experience across our entire user base, whether on web or mobile platforms.”, Ludo Antonov, VP of Engineering.

Available Now
Amazon IVS Real-Time Streaming is available in all AWS Regions where Amazon IVS is available. To use Amazon IVS Real-Time Streaming, you pay an hourly rate for the duration that you have hosts or viewers connected to the stage resource as a participant.

Learn more about benefits, use cases, how to get started, and pricing details for Amazon IVS’s Real-Time Streaming and Low-Latency Streaming capabilities on the Amazon IVS page.

Happy streaming!
— Donnie

New Seventh-Generation General Purpose Amazon EC2 Instances (M7i-Flex and M7i)

2023-08-03 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-seventh-generation-general-purpose-amazon-ec2-instances-m7i-flex-and-m7i/

Today we are launching Amazon Elastic Compute Cloud (Amazon EC2) M7i-Flex and M7i instances powered by custom 4th generation Intel Xeon Scalable processors available only on AWS, that offer the best performance among comparable Intel processors in the cloud – up to 15% faster than Intel processors utilized by other cloud providers. M7i-Flex instances are available in the five most common sizes, and are designed to give you up to 19% better price/performance than M6i instances for many workloads. The M7i instances are available in nine sizes (with two size of bare metal instances in the works), and offer 15% better price/performance than the previous generation of Intel-powered instances.

M7i-Flex Instances
The M7i-Flex instances are a lower-cost variant of the M7i instances, with 5% better price/performance and 5% lower prices. They are great for applications that don’t fully utilize all compute resources. The M7i-Flex instances deliver a baseline of 40% CPU performance, and can scale up to full CPU performance 95% of the time. M7i-Flex instances are ideal for running general purpose workloads such as web and application servers, virtual desktops, batch processing, micro-services, databases and enterprise applications. If you are currently using earlier generations of general-purposes instances, you can adopt M7i-Flex instances without having to make changes to your application or your workload.

Here are the specs for the M7i-Flex instances:

Instance Name	vCPUs	Memory	Network Bandwidth	EBS Bandwidth
m7i-flex.large	2	8 GiB	up to 12.5 Gbps	up to 10 Gbps
m7i-flex.xlarge	4	16 GiB	up to 12.5 Gbps	up to 10 Gbps
m7i-flex.2xlarge	8	32 GiB	up to 12.5 Gbps	up to 10 Gbps
m7i-flex.4xlarge	16	64 GiB	up to 12.5 Gbps	up to 10 Gbps
m7i-flex.8xlarge	32	128 GiB	up to 12.5 Gbps	up to 10 Gbps

M7i Instances
For workloads such as large application servers and databases, gaming servers, CPU based machine learning, and video streaming that need the largest instance sizes or high CPU continuously, you can get price/performance benefits by using M7i instances.

Here are the specs for the M7i instances:

Instance Name	vCPUs	Memory	Network Bandwidth	EBS Bandwidth
m7i.large	2	8 GiB	up to 12.5 Gbps	up to 10 Gbps
m7i.xlarge	4	16 GiB	up to 12.5 Gbps	up to 10 Gbps
m7i.2xlarge	8	32 GiB	up to 12.5 Gbps	up to 10 Gbps
m7i.4xlarge	16	64 GiB	up to 12.5 Gbps	up to 10 Gbps
m7i.8xlarge	32	128 GiB	12.5 Gbps	10 Gbps
m7i.12xlarge	48	192 GiB	18.75 Gbps	15 Gbps
m7i.16xlarge	64	256 GiB	25.0 Gbps	20 Gbps
m7i.24xlarge	96	384 GiB	37.5 Gbps	30 Gbps
m7i.48xlarge	192	768 GiB	50 Gbps	40 Gbps

You can attach up to 128 EBS volumes to each M7i instance; by way of comparison, the M6i instances allow you to attach up to 28 volumes.

We are also getting ready to launch two sizes of bare metal M7i instances:

Instance Name	vCPUs	Memory	Network Bandwidth	EBS Bandwidth
m7i.metal-24xl	96	384 GiB	37.5 Gbps	30 Gbps
m7i.metal-48xl	192	768 GiB	50.0 Gbps	40 Gbps

Built-In Accelerators
The Sapphire Rapids processors include four built-in accelerators, each providing hardware acceleration for a specific workload:

Advanced Matrix Extensions (AMX) – This set of extensions to the x86 instruction set improve deep learning and inferencing, and support workloads such as natural language processing, recommendation systems, and image recognition. The extensions provide high-speed multiplication operations on 2-dimensional matrices of INT8 or BF16 values. To learn more, read Chapter 3 of the Intel AMX Instruction Set Reference.
Intel Data Streaming Accelerator (DSA) – This accelerator drives high performance for storage, networking, and data-intensive workloads by offloading common data movement tasks between CPU, memory, caches, network devices, and storage devices, improving streaming data movement and transformation operations. Read Introducing the Intel Data Streaming Accelerator (Intel DSA) to learn more.
Intel In-Memory Analytics Accelerator (IAA) – This accelerator runs database and analytic workloads faster, with the potential for greater power efficiency. In-memory compression, decompression, and encryption at very high throughput, and a suite of analytics primitives support in-memory databases, open source database, and data stores like RocksDB and ClickHouse. To learn more, read the Intel In-Memory Analytics Accelerator (Intel IAA) Architecture Specification.
Intel QuickAssist Technology (QAT) -This accelerator offloads encryption, decryption, and compression, freeing up processor cores and reducing power consumption. It also supports merged compression and encryption in a single data flow. To learn more start at the Intel QuickAssist Technology (Intel QAT) Overview.

Some of these accelerators require the use of specific kernel versions, drivers, and/or compilers.

The Advanced Matrix Extensions are available on all sizes of M7i and M7i-Flex instances. The Intel QAT, Intel IAA, and Intel DSA accelerators will be available on the m7i.metal-24xl and m7i.metal-48xl instances.

Details
Here are a couple of things to keep in mind about the M7i-Flex and M7i instances:

Regions – The new instances are available in the US East (Ohio, N. Virginia), US West (Oregon), and Europe (Ireland) AWS Regions, and we plan to expand to additional regions throughout the rest of 2023.

Purchasing Options – M7i-Flex amd M7i instances are available in On-Demand, Reserved Instance, Savings Plan, and Spot form. M7i instances are also available in Dedicated Host and Dedicated Instance form.

— Jeff;

Automatically delete schedules upon completion with Amazon EventBridge Scheduler

2023-08-02 Marcia Villalba

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/compute/automatically-delete-schedules-upon-completion-with-amazon-eventbridge-scheduler/

Amazon EventBridge Scheduler now supports configuring automatic deletion of schedules after completion. Now you can configure one-time and recurring schedules with an end date to be automatically deleted upon completion to avoid managing individual schedules.

Amazon EventBridge Scheduler allows you to create, run, and manage schedules on scale. Using EventBridge Scheduler, you can schedule millions of tasks to invoke over 270 AWS services and over 6,000 API operations, such as AWS Lambda, AWS Step Functions, and Amazon SNS.

By default, EventBridge Scheduler allows customers to have 1 million schedules per account, which can be increased as needed. However, completed schedules are counted towards the account quota limits. In addition, completed schedules are visible when listing schedules, and require customers to remove them. Some customers have created their own patterns to automatically remove completed schedules and since the EventBridge Scheduler announcement last November, this was one of the most requested features by customers.

Deleting after completion

When you configure automatic deletion for a schedule, EventBridge Scheduler deletes the schedule shortly after its last target invocation. You can set up automatic deletion when you create the schedule, or you can update the schedule settings at any point before its last invocation.

You can configure this setting in one-time and recurring schedules.

One-time schedules: your schedule is deleted after the schedule has invoked its target once.
Recurring schedules: set with rate or cron expressions, your schedule is deleted after the last invocation.

If all retries are exhausted because of failure for a schedule configured with automatic deletion, the schedule is deleted shortly after the last unsuccessful attempt.

With this new capability, you can save time, resources, and operational costs when managing your schedules.

Setting up schedules to delete after completion

You can create schedules that are automatically deleted after completion from the AWS Management Console, AWS SDK, or AWS CLI in all AWS Regions where EventBridge Scheduler is available.

For example, imagine that you are a developer on a platform that allows end users to receive notifications when a task is due. You are already using EventBridge Scheduler to implement this feature. For every task that your users create in your application, your code creates a new schedule in EventBridge Scheduler. You can now configure all these schedules to be deleted automatically after completion. And shortly after the schedules run, they are removed from your EventBridge Scheduler, allowing you to scale your system and keep on creating schedules, making it easier to manage your active schedules and quota limits.

Let see how you can implement this example with the new capability of EventBridge Scheduler. When a user creates a new task with a reminder, a function is triggered from your application. That function creates a one-time schedule in EventBridge Scheduler.

This example shows how you can create a new one-time schedule that is automatically deleted after completion using the AWS CLI and has SNS as a target. Make sure that you update the AWS CLI to the latest version. Then you can create a new schedule with the parameter action-after-completion ‘DELETE’.

$ aws scheduler create-schedule --name SendEmailOnce \
--schedule-expression ”at(2023-08-02T17:35:00)",\
--schedule-expression-timezone "Europe/Helsinki" \
--flexible-time-window "{\"Mode\": \"OFF\"}" \
--target "{\"Arn\": \"arn:aws:sns:us-east-1:xxx:test-send-email\", \"RoleArn\": \" arn:aws:iam::xxxx:role/sam_scheduler_role\" }" \
--action-after-completion 'DELETE'

This command creates a one-time schedule with the name SendEmailOnce, that runs at a specific date, defined in the schedule-expression, and in a specific time zone, defined in the schedule-expression-timezone. This schedule is not using the flexible time window feature. Next, you must define the target for this schedule. This one sends a message to an SNS topic.

You can validate that your schedule is created correctly from the AWS CLI with the get-schedule command.

$ aws scheduler get-schedule --name SendEmailOnce
{
    "ActionAfterCompletion": "DELETE",
    "Arn": "arn:aws:scheduler:us-east-1:905614108351:schedule/default/SendEmailOnce",
    "CreationDate": 1690874334.83,
    "FlexibleTimeWindow": {
        "Mode": "OFF"
    },
    "GroupName": "default",
    "LastModificationDate": 1690874334.83,
    "Name": "SendEmailOnce3",
    "ScheduleExpression": "at(2023-08-02T17:35:00)",
    "ScheduleExpressionTimezone": "Europe/Helsinki",
    "State": "ENABLED",
    "Target": {
        "Arn": "arn:aws:sns:us-east-1:xxxx:test-send-email",
        "RetryPolicy": {
            "MaximumEventAgeInSeconds": 86400,
            "MaximumRetryAttempts": 185
        },
        "RoleArn": "arn:aws:iam::XXXX:role/scheduler_role"
    }
}

In addition, you can see the details of the schedule from the AWS Management Console.

Now when the date of the notification arrives, EventBridge Scheduler invokes the target configured in the schedule, in this case SNS, and emails a notification to the customer.

Shortly after this schedule is completed, if you list the schedules, you see that the schedule was deleted and it is no longer listed.

$ aws scheduler list-schedules
{
"Schedules": [
]
}

Benefits of automation

Traditionally, many problems that EventBridge Scheduler solves were addressed using batch processes and pull-based models.

Some organizations are using EventBridge Scheduler to replace their pull-based models for a more dynamic push-based model. Before implementing Scheduler, they were relying on the customer to ask for the data when they need it. Now, with EventBridge Scheduler, they are creating schedules to report back to their customers at critical times of their journey.

For example, an airline can use EventBridge Scheduler to create one-time schedules 24 hours, 4 hours, and 2 hours before the flight, to keep their passengers up to date with the flight status. Customers receive a notification with the link for the online check-in, the check-in counter number, baggage pick up information, and any flight changes that occur. In this way, passengers are always up to date on their flight status and they can take immediate action. This dynamic model not only helps to improve the customer experience but also improves the operational efficiency for the airline.

Other organizations use EventBridge Scheduler to replace batch operations, as you can configure a schedule that starts a batch process at the time of the day you need. Also, you can take advantage of EventBridge Scheduler time zones and run the processes at the time that make sense for your end customer.

For example, consider an international financial institution that must send customers a statement of their account at the end of the day. You can use EventBridge Scheduler to set up a recurrent schedule for each of your customers that sends a report at the end of the day of your customers’ time zone. In this way, you can improve the customer experience as now the system is personalized for their settings, and also reduce operational overhead, as the processing operations are distributed throughout the day.

In addition, EventBridge Scheduler solves many new use cases for customers. For example, if you are a financial institution that handles payments, you can create a one-time schedule for every large transaction that needs a confirmation. If the transaction is not confirmed when the schedule runs, you can cancel the transaction. This decreases the risk of handling transactions, improves the customer experience, and also improves the automation of your processes by making them real time.

Another use case is to handle credit card expiration dates. You can create a one-time schedule that emails the customer to update their credit card information one month before the expiration date. This solution removes operational overhead compared to the traditional implementation of using servers and batch processes.

Conclusion

In the preceding use cases listed, automation and task scheduling improve the end user experience, remove undifferentiated heavy lifting, and benefit from using the new capability of removing schedules after their completion.

This blog post introduces the new capability from Amazon EventBridge Scheduler that automatically deletes the completed schedules. This feature simplifies the use of EventBridge Scheduler, reduces the operational overhead of managing schedules at scale, and allows you to scale even further.

To get started with EventBridge Scheduler, visit Serverless Land patterns where you can find over 20 patterns using this service.

Introducing the first AWS Security Heroes

2023-08-01 Taylor Jacobsen

Post Syndicated from Taylor Jacobsen original https://aws.amazon.com/blogs/aws/introducing-the-first-aws-security-heroes/

The AWS Heroes program recognizes individuals who combine their deeply technical expertise with a passion for helping others to learn more and build faster. Over the years, trends have evolved in how the community develops and deploys solutions built on AWS, which has influenced the creation of specialized Hero categories. Today, we’re thrilled to officially recognize and acknowledge leaders in the security area of focus.

Security is often looked at in terms of impact and not how it enables teams to safely innovate. Our inaugural AWS Security Heroes have shown time and time again that a pragmatic approach, executed with the intent to inform and educate, delivers positive security outcomes. The initial cohort of AWS Security Heroes are experts at the forefront of their field, and share a mission to help others better understand security.

Please join us in welcoming our first AWS Security Heroes!

Chris Farris – Atlanta, USA

Security Hero Chris Farris has worked in IT since 1994, primarily focused on Linux, networking, and security. For the past eight years, he has been deeply involved in public cloud and public cloud security in media and entertainment, leveraging his expertise to build and evolve cloud security programs at Turner Broadcasting, WarnerMedia, Discovery Communications, and PlayOn! Sports. His current focus is on educating and empowering builders to understand core cloud security concepts, and to enable small and medium sized organizations to better secure and govern in the cloud.

Gerardo Castro – Callao, Perú

Security Hero Gerardo Castro is a Security Solutions Architect at Caleidos. He likes to write technical posts and talk about cybersecurity on his Medium blog. He also builds and leads videos, podcasts, online classes, and workshops focused on AWS. In addition, Gerardo is a community leader of the AWS UG Security Community in Latin America, and has inspired many people to begin and grow their career in the cloud.

Keisuke Usuda – Chiba, Japan

Security Hero Keisuke Usuda is a Senior Solution Architect at Classmethod and holds the CISSP certification. He is also a core member of the Japan AWS User Group focused on security (Security-JAWS), and regularly organizes events. Keisuke has a deep affection for AWS security-related managed services, and advocates for the enablement of Amazon GuardDuty across all AWS accounts worldwide.

Ray Lin (Chia-Wei Lin) – Taipei, Taiwan

Security Hero Ray Lin is an AWS and Security Consultant at iFUS System Consultants Ltd., and excels in building teams and developing new products from zero to one. His primary expertise spans software project management, Agile development, business and system analysis, SaaS product development, architecture design, cybersecurity, DevSecOps, and AI. Ray has also made significant contributions to the AWS community, particularly in cybersecurity and secure architecture design. His commitment to sharing knowledge is evident in his active involvement in the AWS User Group Taiwan.

Shun Yoshie – Yokohama, Japan

Security Hero Shun Yoshie is a Security Consultant at Nomura Research Institute, Ltd(NRI), and has been a Hero since 2021. He consults on operational design of security in multi-cloud environments, and has been focusing on themes related to multi-cloud, Cloud Native, CNAPP, and Security Observability. Additionally, Shun joined the Japanese AWS User Group (JAWS-UG) in 2013, and he has been running the JAWS-UG Tokyo chapter since 2019.

Teri Radichel – Savannah, USA

Security Hero Teri Radichel is the CEO of 2nd Sight Lab, a cybersecurity company that offers three services: cloud security training to organizations, penetration tests, and security assessments. She also answers cybersecurity questions for clients on consulting calls scheduled through IANS Research. Teri is the author of the book, “Cybersecurity for Executives in the Age of Cloud,” has been a Hero since 2016, and received the SANS 2017 Difference Makers Award for security innovation. Teri has 13 cybersecurity and pentesting certifications, including the GSE, which required a two-day hands-on in person test to pass at the time she obtained it.

Learn More

If you’d like to learn more about the new Security Hero category or connect with a Hero near you, please visit the AWS Heroes website or browse the AWS Heroes Content Library.

— Taylor

Now Open – AWS Israel (Tel Aviv ) Region

2023-08-01 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/now-open-aws-israel-tel-aviv-region/

In June 2021, Jeff Barr announced the upcoming AWS Israel (Tel Aviv) Region. Today we’re announcing the general availability of the AWS Israel (Tel Aviv) Region, with three Availability Zones and the il-central-1 API name.

The new Tel Aviv Region gives customers an additional option for running their applications and serving users from data centers located in Israel. Customers can securely store data in Israel while serving users in the vicinity with even lower latency.

AWS Services in the AWS Israel (Tel Aviv) Region
In the new Tel Aviv Region, you can use C5, C5d, C6g, C6gn, C6i, C6id, D3, G5, I3, I3en, I4i, M5, M5d, M6g, M6gd, M6i, M6id, P4de (public preview only), R5, R5d, R6g, R6i, R6id, T3, T3a, T4g instances, and a long list of AWS services including: Amazon API Gateway, AWS AppConfig, AWS Application Auto Scaling, Amazon Aurora, Aurora PostgreSQL, AWS Budgets, AWS Certificate Manager, AWS CloudFormation, Amazon Cloudfront, AWS Cloud Map, AWS CloudTrail, Amazon CloudWatch, Amazon CloudWatch Events, Amazon CloudWatch Logs, AWS CodeBuild, AWS CodeDeploy, AWS Config, AWS Cost Explorer, AWS Database Migration Service, AWS Direct Connect, AWS Directory Service, Amazon DynamoDB, Amazon Elastic Block Store (Amazon EBS), Amazon Elastic Compute Cloud (Amazon EC2), Amazon EC2 Auto Scaling, EC2 Image Builder, Amazon Elastic Container Registry (Amazon ECR), Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service, Amazon ElastiCache, AWS Elastic Beanstalk, Elastic Load Balancing, Elastic Load Balancing – Network (NLB), Amazon EMR, Amazon EventBridge, AWS Fargate, Glacier, AWS Health Dashboard, AWS Identity and Access Management (IAM), Amazon Kinesis Data Streams, Amazon Kinesis Data Firehose, AWS Key Management Service (AWS KMS), AWS Lambda, AWS Marketplace, AWS Mobile SDK for iOS and Android, Amazon OpenSearch Service, AWS Organizations, Amazon Redshift, AWS Resource Access Manager, Amazon Relational Database Service (Amazon RDS), Resource Groups, Amazon Route 53, Amazon Virtual Private Cloud (Amazon VPC), AWS Secrets Manager, AWS Shield Standard, AWS Shield Advanced, Amazon Simple Notification Service (Amazon SNS), Amazon Simple Queue Service (Amazon SQS), Amazon Simple Storage Service (Amazon S3), Amazon Simple Workflow Service (Amazon SWF), AWS Step Functions, AWS Support API, AWS Systems Manager, AWS Trusted Advisor, VM Import/Export, AWS VPN, AWS WAF, and AWS X-Ray.

AWS in Israel
According to the Israel Ministry of Economic Industry, Israel is in the front line of the cloud computing era and “is known to be the ‘start-up nation’ of the number of global start-ups being produced. Over the past decade, Israel has produced over 2,000 start-ups, the majority of these start-ups are driven by software as a service (SaaS). Israeli cloud technology remains a strong promise in the market as new start-ups are continuously penetrating the market.”

AWS began supporting startups in Israel in 2013 through its AWS Activate program. In Israel, AWS works with accelerator organizations such as 8200 EISP, F2 Venture Capital – thejunction, and TechStars as well as venture capital firms like Entrée Capital, Bessemer Venture Partners, Pitango, Vertex Ventures Israel, and Viola Group to support the rapid growth of their portfolio companies.

Back in 2014, we opened an AWS office and a research and development (R&D) center in Israel. Since then, Amazon has expanded its R&D presence in the country, which now includes Prime Air and Alexa Shopping.

In 2015, AWS acquired Annapurna Labs, an Israeli microelectronics company, which has developed advanced compute, networking, security, and storage technologies for AWS—such as AWS-designed Graviton processors, AWS Inferentia, AWS Trainium chips, and the AWS Nitro System.

In 2018, we expanded to new offices in Tel Aviv, including AWS Experience Tel Aviv on Floor28 to support the growth of Israeli startups, enterprises, and government customers through technology-focused events and educational activities. Now, AWS Experience Tel Aviv on Floor28 is an education hub where anyone interested in AWS can attend industry events, workshops, and meetups, and receive free, in-person technical and business guidance from AWS experts.

In 2019, we launched the first AWS infrastructure in Israel, opening an Amazon CloudFront edge location. In 2020, we brought AWS Outposts and AWS Direct Connect to Israel, providing Israeli organizations with the ability to run AWS technology in their own data centers and establish dedicated connections back to the AWS Cloud.

In April 2021, the government of Israel announced that it had selected AWS as its primary cloud provider as part of the Nimbus contract. The Nimbus framework will enable government departments—including the ministries, education, healthcare, and municipalities—to accelerate their digital transformation by using AWS technologies.

AWS continues to invest in upskilling local developers, students, and the next generation of IT leaders in Israel through programs such as AWS Educate, AWS Academy, AWS re/Start, and other Training and Certification programs.

AWS Educate and Academy programs are providing free resources to accelerate cloud-related learning and preparing today’s students in Israel for the jobs of the future. Israel colleges already participating in the AWS Academy program include the Bar Ilan University, Ben-Gurion University of the Negev, Holon Institute of Technology, Jerusalem College of Technology, and University of Haifa. We also launched AWS re/Start to focus on helping unemployed or underemployed individuals to launch a new cloud career. You can now apply to AWS re/Start programs through Appleseeds, Sigma Labs Jerusalem, and Analiza Cyber Intelligence in Israel.

AWS Customers in Israel
We have many amazing customers in Israel who are doing incredible things with AWS, for example:

AI21 Labs – AI21 Labs offers access to its state-of-the-art proprietary language models through AI21 Studio for businesses to build their own generative artificial intelligence applications, as well as its consumer product, Wordtune, the first AI-based writing assistant to understand context and meaning. AI21 Labs scaled to hundreds of GPUs efficiently and cost effectively to build the Jurassic-2 family of language models. These models were trained with distributed and parallelized infrastructure based on Amazon EC2 P4d instances 400 Gbps high-performance networking supported by Elastic Fabric Adaptor (EFA).

Bank Leumi – Leumi is one of the leading banks in Israel and has over 200 branches across the country and dedicated teams using AWS to build an advanced banking services marketplace. In just 5 months, Leumi migrated 16 on-premises applications from its former Kubernetes solution to Amazon EKS Anywhere with no service interruptions. The bank’s new environment facilitates a consistent, scalable approach to deployments, saving time and money and increasing innovation velocity.

CyberArk – CyberArk is an AWS partner in the identity security industry. Centered on privileged access management, CyberArk provides the most comprehensive security SaaS offering on AWS for any identity—human or machine—across business applications, distributed workforces, hybrid cloud workloads, and throughout the DevOps lifecycle. CyberArk Identity Security Intelligence has integrated with AWS CloudTrail Lake to increase visibility and responsiveness associated with targeted threats. CyberArk Audit also delivers security event information to Amazon Security Lake.

Ichilov Hospital – The I-Medata Innovation Center of Ichilov Hospital uses AWS Control Tower to facilitate the fast, consistent, and secure creation of AWS accounts while protecting sensitive medical data. The center also relies on Amazon SageMaker to enable its scientists to build, train, and deploy advanced machine learning models for early detection of deterioration in COVID-19 patients. They had full protection of sensitive medical data on AWS while continuing to enable the productivity of researchers.

You can find more customer stories from Israel.

Available Now
The new Tel Aviv Region is ready to support your business. You can find a detailed list of the services available in this Region on the AWS Regional Services List.

With this launch, AWS now spans 102 Availability Zones in 32 geographic Regions around the world. We have also announced plans for 12 more Availability Zones and four more Regions in Canada, Malaysia, New Zealand, and Thailand.

To learn more, see the Global Infrastructure page, give it a try, and send feedback through your usual AWS support contacts in Israel.

— Channy

P.S. We’re focused on improving our content to provide a better customer experience, and we need your feedback to do so. Please take this quick survey to share insights on your experience with the AWS Blog. Note that this survey is hosted by an external company, so the link does not lead to our website. AWS handles your information as described in the AWS Privacy Notice.

AWS Week in Review – Agents for Amazon Bedrock, Amazon SageMaker Canvas New Capabilities, and More – July 31, 2023

2023-08-01 Donnie Prakoso

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-week-in-review-agents-for-amazon-bedrock-amazon-sagemaker-canvas-new-capabilities-and-more-july-31-2023/

This July, AWS communities in ASEAN wrote a new history. First, the AWS User Group Malaysia recently held the first AWS Community Day in Malaysia.

Another significant milestone has been achieved by the AWS User Group Philippines. They just celebrated their tenth anniversary by running 2 days of AWS Community Day Philippines. Here are a few photos from the event, including Jeff Barr sharing his experiences attending AWS User Group meetup, in Manila, Philippines 10 years ago.

Big congratulations to AWS Community Heroes, AWS Community Builders, AWS User Group leaders and all volunteers who organized and delivered AWS Community Days! Also, thank you to everyone who attended and help support our AWS communities.

Last Week’s Launches
We had interesting launches last week, including from AWS Summit, New York. Here are some of my personal highlights:

(Preview) Agents for Amazon Bedrock – You can now create managed agents for Amazon Bedrock to handle tasks using API calls to company systems, understand user requests, break down complex tasks into steps, hold conversations to gather more information, and take actions to fulfill requests.

(Coming Soon) New LLM Capabilities in Amazon QuickSight Q – We are expanding the innovation in QuickSight Q by introducing new LLM capabilities through Amazon Bedrock. These Generative BI capabilities will allow organizations to easily explore data, uncover insights, and facilitate sharing of insights.

AWS Glue Studio support for Amazon CodeWhisperer – You can now write specific tasks in natural language (English) as comments in the Glue Studio notebook, and Amazon CodeWhisperer provides code recommendations for you.

(Preview) Vector Engine for Amazon OpenSearch Serverless – This capability empowers you to create modern ML-augmented search experiences and generative AI applications without the need to handle the complexities of managing the underlying vector database infrastructure.

Last week, Amazon SageMaker Canvas also released a set of new capabilities:

AWS Open-Source Updates
As always, my colleague Ricardo has curated the latest updates for open-source news at AWS. Here are some of the highlights.

cdk-aws-observability-accelerator is a set of opinionated modules to help you set up observability for your AWS environments with AWS native services and AWS-managed observability services such as Amazon Managed Service for Prometheus, Amazon Managed Grafana, AWS Distro for OpenTelemetry (ADOT) and Amazon CloudWatch.

iac-devtools-cli-for-cdk is a command line interface tool that automates many of the tedious tasks of building, adding to, documenting, and extending AWS CDK applications.

Upcoming AWS Events
There are upcoming events that you can join to learn. Let’s start with AWS events:

AWS Storage Day (August 9)
AWS Summit Taiwan (August 2-3)
AWS Summit São Paulo (August 3)
AWS Summit Mexico City (August 30)

And let’s learn from our fellow builders and join AWS Community Days:

AWS Community Day Colombia (August 12)
AWS Community Day West Africa (August 19)

Open for Registration for AWS re:Invent
We want to be sure you know that AWS re:Invent registration is now open!

This learning conference hosted by AWS for the global cloud computing community will be held from November 27 to December 1, 2023, in Las Vegas.

Pro-tip: You can use information on the Justify Your Trip page to prove the value of your trip to AWS re:Invent trip.

Give Us Your Feedback
We’re focused on improving our content to provide a better customer experience, and we need your feedback to do so. Please take this quick survey to share insights on your experience with the AWS Blog. Note that this survey is hosted by an external company, so the link does not lead to our website. AWS handles your information as described in the AWS Privacy Notice.

That’s all for this week. Check back next Monday for another Week in Review.

Happy building!

— Donnie

This post is part of our Week in Review series. Check back each week for a quick round-up of interesting news and announcements from AWS!

Improving medical imaging workflows with AWS HealthImaging and SageMaker

2023-07-31 Sukhomoy Basak

Post Syndicated from Sukhomoy Basak original https://aws.amazon.com/blogs/architecture/improving-medical-imaging-workflows-with-aws-healthimaging-and-sagemaker/

Medical imaging plays a critical role in patient diagnosis and treatment planning in healthcare. However, healthcare providers face several challenges when it comes to managing, storing, and analyzing medical images. The process can be time-consuming, error-prone, and costly.

There’s also a radiologist shortage across regions and healthcare systems, making the demand for this specialty increases due to an aging population, advances in imaging technology, and the growing importance of diagnostic imaging in healthcare.

As the demand for imaging studies continues to rise, the limited number of available radiologists results in delays in available appointments and timely diagnoses. And while technology enables healthcare delivery improvements for clinicians and patients, hospitals seek additional tools to solve their most pressing challenges, including:

Professional burnout due to an increasing demand for imaging and diagnostic services
Labor-intensive tasks, such as volume measurement or structural segmentation of images
Increasing expectations from patients expecting high-quality healthcare experiences that match retail and technology in terms of convenience, ease, and personalization

To improve clinician and patient experiences, run your picture archiving and communication system (PACS) with an artificial intelligence (AI)-enabled diagnostic imaging cloud solution to securely gain critical insights and improve access to care.

AI helps reduce the radiologist burndown rate through automation. For example, AI saves radiologist chest x-ray interpretation time. It is also a powerful tool to identify areas that need closer inspection, and helps capture secondary findings that weren’t initially identified. The advancement of interoperability and analytics gives radiologist a 360-degree, longitudinal view of patient health records to provide better healthcare at potentially lower costs.

AWS offers services to address these challenges. This blog post discusses AWS HealthImaging (AWS AHI) and Amazon SageMaker, and how they are used together to improve healthcare providers’ medical imaging workflows. This ultimately accelerates imaging diagnostics and increases radiology productivity. AWS AHI enables developers to deliver performance, security, and scale to cloud-native medical imaging applications. It allows ingestion of Digital Imaging and Communication in Medicine (DICOM) images. Amazon SageMaker provides end-to-end solution for AI and machine learning.

Let’s explore an example use case involving X-rays after an auto accident. In this diagnostic medical imaging workflow, a patient is in the emergency room. From there:

The patient undergoes an X-ray to check for fractures.
The scanned acquisition device images flow to the PACS system.
The radiologist reviews the information gathered from this procedure and authors the report.
The patient workflow continues as the reports are made available to the referring physician.

Next-generation imaging solutions and workflows

Healthcare providers can use AWS AHI and Amazon SageMaker together to enable next-generation imaging solutions and improve medical imaging workflows. The following architecture illustrates this example.

Figure 1: X-ray images are sent to AWS HealthImaging and an Amazon SageMaker endpoint extracts insights.

Let’s review the architecture and the key components:

1. Imaging Scanner: Captures the images from a patient’s body. Depending on the modality, this can be an X-ray detector; a series of detectors in a CT scanner; a magnetic field and radio frequency coils in an MRI scanner; or an ultrasound transducer. This example uses an X-ray device.

AWS IoT Greengrass: Edge runtime and cloud service configured with DICOM C-Store SCP that receives the images and sends it to Amazon Simple Storage Service (Amazon S3). The images along with the related metadata are sent to Amazon S3 and Amazon Simple Queue Service (Amazon SQS) respectively, that triggers the workflow.

2. Amazon SQS message queue: Consumes event from S3 bucket and triggers an AWS Step Functions workflow orchestration.

3. AWS Step Functions runs the transform and import jobs to further process and import the images into AWS AHI data store instance.

4. The final diagnostic image—along with any relevant patient information and metadata—is stored in the AWS AHI datastore. This allows for efficient imaging date retrieval and management. It also enables medical imaging data access with sub-second image retrieval latencies at scale, powered by cloud-native APIs and applications from AWS partners.

5. Radiologists responsible for ground truth for ML images perform medical image annotations using Amazon SageMaker Ground Truth. They visualize and label DICOM images using a custom data labeling workflow—a fully managed data labeling service that supports built-in or custom data labeling workflows. They also leverage tools like 3D Slicer for interactive medical image annotations.

6. Data scientists build or leverage built-in deep learning models using the annotated images on Amazon SageMaker. SageMaker offers a range of deployment options that vary from low latency and high throughput to long-running inference jobs. These options include considerations for batch, real-time, or near real-time inference.

7. Healthcare providers use AWS AHI and Amazon SageMaker to run AI-assisted detection and interpretation workflow. This workflow is used to identify hard-to-see fractures, dislocations, or soft tissue injuries to allow surgeons and radiologist to be more confident in their treatment choices.

8. Finally, the image stored in AWS AHI is displayed on a monitor or other visual output device where it can be analyzed and interpreted by a radiologist or other medical professional.

The Open Health Imaging Foundation (OHIF) Viewer is an open source, web-based, medical imaging platform. It provides a core framework for building complex imaging applications.
Radical Imaging or Arterys are AWS partners that provide OHIF-based medical imaging viewer.

Each of these components plays a critical role in the overall performance and accuracy of the medical imaging system as well as ongoing research and development focused on improving diagnostic outcomes and patient care. AWS AHI uses efficient metadata encoding, lossless compression, and progressive resolution data access to provide industry leading performance for loading images. Efficient metadata encoding enables image viewers and AI algorithms to understand the contents of a DICOM study without having to load the image data.

Security

The AWS shared responsibility model applies to data protection in AWS AHI and Amazon SageMaker.

Amazon SageMaker is HIPAA-eligible and can operate with data containing Protected Health Information (PHI). Encryption of data in transit is provided by SSL/TLS and is used when communicating both with the front-end interface of Amazon SageMaker (to the Notebook) and whenever Amazon SageMaker interacts with any other AWS services.

AWS AHI is also HIPAA-eligible service and provides access control at the metadata level, ensuring that each user and application can only see the images and metadata fields that are required based upon their role. This prevents the proliferation of Patient PHI. All access to AWS AHI APIs is logged in detail in AWS CloudTrail.

Both of these services leverage AWS Key Management service (AWS KMS) to satisfy the requirement that PHI data is encrypted at rest.

Conclusion

In this post, we reviewed a common use case for early detection and treatment of conditions, resulting in better patient outcomes. We also covered an architecture that can transform the radiology field by leveraging the power of technology to improve accuracy, efficiency, and accessibility of medical imaging.

New – AWS Public IPv4 Address Charge + Public IP Insights

2023-07-28 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-aws-public-ipv4-address-charge-public-ip-insights/

We are introducing a new charge for public IPv4 addresses. Effective February 1, 2024 there will be a charge of $0.005 per IP per hour for all public IPv4 addresses, whether attached to a service or not (there is already a charge for public IPv4 addresses you allocate in your account but don’t attach to an EC2 instance).

Public IPv4 Charge
As you may know, IPv4 addresses are an increasingly scarce resource and the cost to acquire a single public IPv4 address has risen more than 300% over the past 5 years. This change reflects our own costs and is also intended to encourage you to be a bit more frugal with your use of public IPv4 addresses and to think about accelerating your adoption of IPv6 as a modernization and conservation measure.

This change applies to all AWS services including Amazon Elastic Compute Cloud (Amazon EC2), Amazon Relational Database Service (RDS) database instances, Amazon Elastic Kubernetes Service (EKS) nodes, and other AWS services that can have a public IPv4 address allocated and attached, in all AWS regions (commercial, AWS China, and GovCloud). Here’s a summary in tabular form:

Public IP Address Type	Current Price/Hour (USD)	New Price/Hour (USD) (Effective February 1, 2024)
In-use Public IPv4 address (including Amazon provided public IPv4 and Elastic IP) assigned to resources in your VPC, Amazon Global Accelerator, and AWS Site-to-site VPN tunnel	No charge	$0.005
Additional (secondary) Elastic IP Address on a running EC2 instance	$0.005	$0.005
Idle Elastic IP Address in account	$0.005	$0.005

The AWS Free Tier for EC2 will include 750 hours of public IPv4 address usage per month for the first 12 months, effective February 1, 2024. You will not be charged for IP addresses that you own and bring to AWS using Amazon BYOIP.

Starting today, your AWS Cost and Usage Reports automatically include public IPv4 address usage. When this price change goes in to effect next year you will also be able to use AWS Cost Explorer to see and better understand your usage.

As I noted earlier in this post, I would like to encourage you to consider accelerating your adoption of IPv6. A new blog post shows you how to use Elastic Load Balancers and NAT Gateways for ingress and egress traffic, while avoiding the use of a public IPv4 address for each instance that you launch. Here are some resources to show you how you can use IPv6 with widely used services such as EC2, Amazon Virtual Private Cloud (Amazon VPC), Amazon Elastic Kubernetes Service (EKS), Elastic Load Balancing, and Amazon Relational Database Service (RDS):

Earlier this year we enhanced EC2 Instance Connect and gave it the ability to connect to your instances using private IPv4 addresses. As a result, you no longer need to use public IPv4 addresses for administrative purposes (generally using SSH or RDP).

Public IP Insights
In order to make it easier for you to monitor, analyze, and audit your use of public IPv4 addresses, today we are launching Public IP Insights, a new feature of Amazon VPC IP Address Manager that is available to you at no cost. In addition to helping you to make efficient use of public IPv4 addresses, Public IP Insights will give you a better understanding of your security profile. You can see the breakdown of public IP types and EIP usage, with multiple filtering options:

You can also see, sort, filter, and learn more about each of the public IPv4 addresses that you are using:

Using IPv4 Addresses Efficiently
By using the new IP Insights tool and following the guidance that I shared above, you should be ready to update your application to minimize the effect of the new charge. You may also want to consider using AWS Direct Connect to set up a dedicated network connection to AWS.

Finally, be sure to read our new blog post, Identify and Optimize Public IPv4 Address Usage on AWS, for more information on how to make the best use of public IPv4 addresses.

— Jeff;

New Amazon EC2 Instances (C7gd, M7gd, and R7gd) Powered by AWS Graviton3 Processor with Local NVMe-based SSD Storage

2023-07-28 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-amazon-ec2-instances-c7gd-m7gd-and-r7gd-powered-by-aws-graviton3-processor-with-local-nvme-based-ssd-storage/

We launched Amazon EC2 C7g instances in May 2022 and M7g and R7g instances in February 2023. Powered by the latest AWS Graviton3 processors, the new instances deliver up to 25 percent higher performance, up to two times higher floating-point performance, and up to 2 times faster cryptographic workload performance compared to AWS Graviton2 processors.

Graviton3 processors deliver up to 3 times better performance compared to AWS Graviton2 processors for machine learning (ML) workloads, including support for bfloat16. They also support DDR5 memory that provides 50 percent more memory bandwidth compared to DDR4. Graviton3 also uses up to 60 percent less energy for the same performance as comparable EC2 instances, which helps you reduce your carbon footprint.

The C7g instances are well suited for compute-intensive workloads, such as high performance computing (HPC), batch processing, ad serving, video encoding, gaming, scientific modeling, distributed analytics, and CPU-based machine learning inference. The M7g instances are for general purpose workloads such as application servers, microservices, gaming servers, mid-sized data stores, and caching fleets. The R7g instances are a great fit for memory-intensive workloads such as open-source databases, in-memory caches, and real-time big data analytics.

Today, we’re adding a d variant to all three instance families. The new Amazon EC2 C7gd, M7gd, and R7gd instance types have NVM Express (NVMe) locally attached up to 2 x 1.9 TB SSD drives that are physically connected to the host server and provide block-level storage that is coupled to the lifetime of the instance. These instances have up to 45 percent better real-time NVMe storage performance than comparable Graviton2-based instances.

These are a great fit for applications that need access to high-speed, low-latency local storage, including those that need temporary storage of data for scratch space, temporary files, and caches. The data on an instance store volume persists only during the life of the associated EC2 instance.

Here are the specs for these instances:

Instance Size	vCPU	Memory (GiB)	Local NVMe Storage (GB)	Network Bandwidth (Gbps)	EBS Bandwidth (Gbps)
C7gd/M7gd/R7gd		C7gd/M7gd/R7gd	C7gd/M7gd/R7gd
medium	1	2/ 4 / 8	1 x 59	Up to 12.5	Up to 10
large	2	4 / 8 / 16	1 x 118	Up to 12.5	Up to 10
xlarge	4	8 / 16 / 32	1 x 237	Up to 12.5	Up to 10
2xlarge	8	16 / 32 / 64	1 x 474	Up to 15	Up to 10
4xlarge	16	32 / 64 / 128	1 x 950	Up to 15	Up to 10
8xlarge	32	64 / 128 / 256	1 x 1900	15	10
12xlarge	48	96 / 192/ 384	2 x 1425	22.5	15
16xlarge	64	128 / 256 / 512	2 x 1900	30	20

These instances are built on the AWS Nitro System, a combination of AWS-designed dedicated hardware and a lightweight hypervisor that allows the delivery of isolated multitenancy, private networking, and fast local storage. They provide up to 20 Gbps Amazon Elastic Block Store (Amazon EBS) bandwidth and up to 30 Gbps network bandwidth. The 16xlarge instances also support Elastic Fabric Adapter (EFA) for applications that need a high level of inter-node communication.

Now Available
Amazon EC2 C7gd, M7gd, and R7gd instances are now available in the following AWS Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), and Europe (Ireland). As usual with Amazon EC2, you only pay for what you use. For more information, see the Amazon EC2 pricing page.

If you’re optimizing applications for Arm architecture, be sure to have a look at our Getting Started collection of resources or learn more about AWS Graviton3-based EC2 instances.

To learn more, visit our Amazon EC2 C7g instances, M7g instances or R7g instances page, and please send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

— Channy

New: AWS Local Zone in Phoenix, Arizona – More Instance Types, More EBS Storage Classes, and More Services

2023-07-28 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-aws-local-zone-in-phoenix-arizona-more-instance-types-more-ebs-storage-classes-and-more-services/

I am happy to announce that a new AWS Local Zone in Phoenix, Arizona is now open and ready for you to use, with more instance types, storage classes, and services than ever before.

We launched the first AWS Local Zone in 2019 (AWS Now Available from a Local Zone in Los Angeles) with the goal of making a select set of EC2 instance types, EBS volume types, and other AWS services available with single-digit millisecond when accessed from Los Angeles and other locations in Southern California. Since then, we have launched a second Local Zone in Los Angeles, along with 15 more in other parts of the United States and another 17 around the world, 34 in all. We are also planning to build 19 more Local Zones outside of the US (see the Local Zones Locations page for a complete list).

Local Zones In Action
Our customers make use of Local Zones in many different ways. Popular use cases include real-time gaming, hybrid migrations, content creation for media & entertainment, live video streaming, engineering simulations, and AR/VR at the edge. Here are a couple of great examples that will give you a taste of what is possible:

Arizona State University (ASU) – Known for its innovation and research, ASU is among the largest universities in the U.S. with 173,000 students and 20,000 faculty and staff. Local Zones help them to accelerate the delivery of online services and storage, giving them a level of performance that is helping them to transform the educational experience for students and staff.

DISH Wireless -Two years ago they began to build a cloud-native, fully virtualized 5G network on AWS, making use of Local Zones to support latency-sensitive real-time 5G applications and workloads at the network edge (read Telco Meets AWS Cloud to learn more). The new Local Zone in Phoenix will allow them to further enhance the strength and reliability of their network by extending their 5G core to the edge.

We work closely with these and many other customers to make sure that the Local Zone(s) that they use are a great fit for their use cases. In addition to the already-strong set of instance types, storage classes, and services that are part-and-parcel of every Local Zone, we add others on an as-needed basis.

For example, Local Zones in Los Angeles, Miami, and other locations have additional instance types; several Local Zones have additional Amazon Elastic Block Store (Amazon EBS) storage classes, and others have extra services such as Application Load Balancer, Amazon FSx, Amazon EMR, Amazon ElastiCache, Amazon Relational Database Service (RDS), Amazon GameLift, and AWS Application Migration Service (AWS MGN). You can see this first-hand on the Local Zones Features page.

And Now, Phoenix
As I mentioned earlier, this Local Zone has more instance types, storage classes, and services than earlier Local Zones. Here’s what’s inside:

Instance Types – Compared to all other Local Zones with the T3, C5(d), R5(d), and G4dn instance types, the Phoenix Local Zone includes C6i, M6i, R6i, and Cg6n instances.

EBS Volume Types – In addition to the gp2 volumes that are available in all Local Zones, the Phoenix Local Zone includes gp3 (General Purpose SSD) , io1 (Provisioned IOPS SSD) , st1 (Throughput Optimized HDD), and sc1 (Cold HDD) storage.

Services – In addition to Amazon Elastic Compute Cloud (Amazon EC2), Amazon Elastic Block Store (Amazon EBS), AWS Shield, Amazon Virtual Private Cloud (Amazon VPC), Amazon Elastic Container Service (Amazon ECS). Amazon Elastic Kubernetes Service (EKS), Application Load Balancer, and AWS Direct Connect, the Phoenix LZ includes NAT Gateway.

Pricing Models – In addition to On-Demand and Savings Plans, the Phoenix Local Zone includes Spot.

Going forward, we plan to launch more Local Zones that are similarly equipped.

Opting-In to the Phoenix Local Zone
The original Phoenix Local Zone was launched in 2022 and remains available to customers who have already enabled it. The Zone that we are announcing today can be enabled by new and existing customers.

To get started with this or any other Local Zone, I must first enable it. To do this, I open the EC2 Console, select the parent region (US West (Oregon)) from the menu, and then click EC2 Dashboard in the left-side navigation:

Then I click on Zones in the Account attributes box:

Next, I scroll down to the new Phoenix Local Zone (us-west-2-phx-2), and click Manage:

I click Enabled, and then Update zone group:

I confirm that I want to enable the Zone Group, and click Ok:

And I am all set. I can create EBS volumes, launch EC2 instances, and make use of the other services in this Local Zone.

— Jeff;

Improved scalability and resiliency for Amazon EMR on EC2 clusters

2023-07-27 Ravi Kumar Singh

Post Syndicated from Ravi Kumar Singh original https://aws.amazon.com/blogs/big-data/improved-scalability-and-resiliency-for-amazon-emr-on-ec2-clusters/

Amazon EMR is the cloud big data solution for petabyte-scale data processing, interactive analytics, and machine learning using open-source frameworks such as Apache Spark, Apache Hive, and Presto. Customers asked us for features that would further improve the resiliency and scalability of their Amazon EMR on EC2 clusters, including their large, long-running clusters. We have been hard at work to meet those needs. Over the past 12 months, we have worked backward from customer requirements and launched over 30 new features that improve the resiliency and scalability of your Amazon EMR on EC2 clusters. This post covers some of these key enhancements across three main areas:

Improved cluster utilization with optimized scaling experience
Minimized interruptions with enhanced resiliency and availability
Improved cluster resiliency with upgraded logging and debugging capabilities

Let’s dive into each of these areas.

Improved cluster utilization with optimized scaling experience

Customers use Amazon EMR to run diverse analytics workloads with varying SLAs, ranging from near-real-time streaming jobs to exploratory interactive workloads and everything in between. To cater to these dynamic workloads, you can resize your clusters either manually or by enabling automatic scaling. You can also use the Amazon EMR managed scaling feature to automatically resize your clusters for optimal performance at the lowest possible cost. To ensure swift cluster resizes, we implemented multiple improvements that are available in the latest Amazon EMR releases:

Enhanced resiliency of cluster scaling workflow to EC2 Spot Instance interruptions – Many Amazon EMR customers use EC2 Spot Instances for their Amazon EMR on EC2 clusters to reduce costs. Spot Instances are spare Amazon Elastic Compute Cloud (Amazon EC2) compute capacity offered at discounts of up to 90% compared to On-Demand pricing. However, Amazon EC2 can reclaim Spot capacity with a two-minute warning, which can lead to interruptions in workload. We identified an issue where the cluster’s scaling operation gets stuck when over a hundred core nodes launched on Spot Instances are reclaimed by Amazon EC2 throughout the life of the cluster. Starting with Amazon EMR version 6.8.0, we mitigated this issue by fixing a gap in the process HDFS uses to decommission nodes that caused the scaling operations to get stuck. We contributed this improvement back to the open-source community, enabling seamless recovery and efficient scaling in the event of Spot interruptions.
Improve cluster utilization by recommissioning recently decommissioned nodes for Spark workloads within seconds – Amazon EMR allows you to scale down your cluster without affecting your workload by gracefully decommissioning core and task nodes. Furthermore, to prevent task failures, Apache Spark ensures that decommissioning nodes are not assigned any new tasks. However, if a new job is submitted immediately before these nodes are fully decommissioned, Amazon EMR will trigger a scale-up operation for the cluster. This results in these decommissioning nodes to be immediately recommissioned and added back into the cluster. Due to a gap in Apache Spark’s recommissioning logic, these recommissioned nodes would not accept new Spark tasks for up to 60 minutes. We enhanced the recommissioning logic, which ensures recommissioned nodes would start accepting new tasks within seconds, thereby improving cluster utilization. This improvement is available in Amazon EMR release 6.11 and higher.
Minimized cluster scaling interruptions due to disk over-utilization – The YARN ResourceManager exclude file is a key component of Apache Hadoop that Amazon EMR uses to centrally manage cluster resources for multiple data-processing frameworks. This exclude file contains a list of nodes to be removed from the cluster to facilitate a cluster scale-down operation. With Amazon EMR release 6.11.0, we improved the cluster scaling workflow to reduce scale-down failures. This improvement minimizes failures due to partial updates or corruption in the exclude file caused by low disk space. Additionally, we built a robust file recovery mechanism to restore the exclude file in case of corruption, ensuring uninterrupted cluster scaling operations.

Minimized interruptions with enhanced resiliency and availability

Amazon EMR offers high availability and fault tolerance for your big data workloads. Let’s look at a few key improvements we launched in this area:

Improved fault tolerance to hardware reconfiguration – Amazon EMR offers the flexibility to decouple storage and compute. We observed that customers often increase the size of or add incremental block-level storage to their EC2 instances as their data processing volume and concurrency grow. Starting with Amazon EMR release 6.11.0, we made the EMR cluster’s local storage file system more resilient to unpredictable instance reconfigurations such as instance restarts. By addressing scenarios where an instant restart could result in the block storage device name to change, we eliminated the risk of the cluster becoming inoperable or losing data.
Reduce cluster startup time for Kerberos-enabled EMR clusters with long-running bootstrap actions – Multiple customers use Kerberos for authentication and run long-running bootstrap actions on their EMR clusters. In Amazon EMR 6.9.0 and higher releases, we fixed a timing sequence mismatch issue that occurs between Apache BigTop and the Amazon EMR on EC2 cluster startup sequence. This timing sequence mismatch occurs when a system attempts to perform two or more operations at the same time instead of doing them in the proper sequence. This issue caused certain cluster configurations to experience instance startup timeouts. We contributed a fix to the open-source community and made additional improvements to the Amazon EMR startup sequence to prevent this condition, resulting in cluster start time improvements of up to 200% for such clusters.

Improved cluster resiliency with upgraded logging and debugging capabilities

Effective log management is essential to ensure log availability and maintain the health of EMR clusters. This becomes especially critical when you’re running multiple custom client tools and third-party applications on your Amazon EMR on EC2 clusters. Customers depend on EMR logs, in addition to EMR events, to monitor cluster and workload health, troubleshoot urgent issues, simplify security audit, and enhance compliance. Let’s look at a few key enhancements we made in this area:

Upgraded on-cluster log management daemon – Amazon EMR now automatically restarts the log management daemon if it’s interrupted. The Amazon EMR on-cluster log management daemon archives logs to Amazon Simple Storage Service (Amazon S3) and deletes them from instance storage. This minimizes cluster failures due to disk over-utilization, while allowing the log files to remain accessible even after the cluster or node stops. This upgrade is available in Amazon EMR release 6.10.0 and higher. For more information, see Configure cluster logging and debugging.
Enhanced cluster stability with improved log rotation and monitoring – Many of our customers have long-running clusters that have been operating for years. Some open-source application logs such as Hive and Kerberos logs that are never rotated can continue to grow on these long-running clusters. This could lead to disk over-utilization and eventually result in cluster failures. We enabled log rotation for such log files to minimize disk, memory, and CPU over-utilization scenarios. Furthermore, we expanded our log monitoring to include additional log folders. These changes, available starting with Amazon EMR version 6.10.0, minimize situations where EMR cluster resources are over-utilized, while ensuring log files are archived to Amazon S3 for a wider variety of use cases.

Conclusion

In this post, we highlighted the improvements that we made in Amazon EMR on EC2 with the goal to make your EMR clusters more resilient and stable. We focused on improving cluster utilization with the improved and optimized scaling experience for EMR workloads, minimized interruptions with enhanced resiliency and availability for Amazon EMR on EC2 clusters, and improved cluster resiliency with upgraded logging and debugging capabilities. We will continue to deliver further enhancements with new Amazon EMR releases. We invite you to try new features and capabilities in the latest Amazon EMR releases and get in touch with us through your AWS account team to share your valuable feedback and comments. To learn more and get started with Amazon EMR, check out the tutorial Getting started with Amazon EMR.

About the Authors

Ravi Kumar is a Senior Product Manager for Amazon EMR at Amazon Web Services.

Kevin Wikant is a Software Development Engineer for Amazon EMR at Amazon Web Services.

New – Amazon EC2 P5 Instances Powered by NVIDIA H100 Tensor Core GPUs for Accelerating Generative AI and HPC Applications

2023-07-26 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-amazon-ec2-p5-instances-powered-by-nvidia-h100-tensor-core-gpus-for-accelerating-generative-ai-and-hpc-applications/

In March 2023, AWS and NVIDIA announced a multipart collaboration focused on building the most scalable, on-demand artificial intelligence (AI) infrastructure optimized for training increasingly complex large language models (LLMs) and developing generative AI applications.

We preannounced Amazon Elastic Compute Cloud (Amazon EC2) P5 instances powered by NVIDIA H100 Tensor Core GPUs and AWS’s latest networking and scalability that will deliver up to 20 exaflops of compute performance for building and training the largest machine learning (ML) models. This announcement is the product of more than a decade of collaboration between AWS and NVIDIA, delivering the visual computing, AI, and high performance computing (HPC) clusters across the Cluster GPU (cg1) instances (2010), G2 (2013), P2 (2016), P3 (2017), G3 (2017), P3dn (2018), G4 (2019), P4 (2020), G5 (2021), and P4de instances (2022).

Most notably, ML model sizes are now reaching trillions of parameters. But this complexity has increased customers’ time to train, where the latest LLMs are now trained over the course of multiple months. HPC customers also exhibit similar trends. With the fidelity of HPC customer data collection increasing and data sets reaching exabyte scale, customers are looking for ways to enable faster time to solution across increasingly complex applications.

Introducing EC2 P5 Instances
Today, we are announcing the general availability of Amazon EC2 P5 instances, the next-generation GPU instances to address those customer needs for high performance and scalability in AI/ML and HPC workloads. P5 instances are powered by the latest NVIDIA H100 Tensor Core GPUs and will provide a reduction of up to 6 times in training time (from days to hours) compared to previous generation GPU-based instances. This performance increase will enable customers to see up to 40 percent lower training costs.

P5 instances provide 8 x NVIDIA H100 Tensor Core GPUs with 640 GB of high bandwidth GPU memory, 3rd Gen AMD EPYC processors, 2 TB of system memory, and 30 TB of local NVMe storage. P5 instances also provide 3200 Gbps of aggregate network bandwidth with support for GPUDirect RDMA, enabling lower latency and efficient scale-out performance by bypassing the CPU on internode communication.

Here are the specs for these instances:

Instance Size	vCPUs	Memory (GiB)	GPUs (H100)	Network Bandwidth (Gbps)	EBS Bandwidth (Gbps)	Local Storage (TB)
P5.48xlarge	192	2048	8	3200	80	8 x 3.84

Here’s a quick infographic that shows you how the P5 instances and NVIDIA H100 Tensor Core GPUs compare to previous instances and processors:

P5 instances are ideal for training and running inference for increasingly complex LLMs and computer vision models behind the most demanding and compute-intensive generative AI applications, including question answering, code generation, video and image generation, speech recognition, and more. P5 will provide up to 6 times lower time to train compared with previous generation GPU-based instances across those applications. Customers who can use lower precision FP8 data types in their workloads, common in many language models that use a transformer model backbone, will see further benefit at up to 6 times performance increase through support for the NVIDIA transformer engine.

HPC customers using P5 instances can deploy demanding applications at greater scale in pharmaceutical discovery, seismic analysis, weather forecasting, and financial modeling. Customers using dynamic programming (DP) algorithms for applications like genome sequencing or accelerated data analytics will also see further benefit from P5 through support for a new DPX instruction set.

This enables customers to explore problem spaces that previously seemed unreachable, iterate on their solutions at a faster clip, and get to market more quickly.

You can see the detail of instance specifications along with comparisons of instance types between p4d.24xlarge and new p5.48xlarge below:

Feature	p4d.24xlarge	p5.48xlarge	Comparision
Number & Type of Accelerators	8 x NVIDIA A100	8 x NVIDIA H100	–
FP8 TFLOPS per Server	–	16,000	640% vs.A100 FP16
FP16 TFLOPS per Server	2,496	8,000	640% vs.A100 FP16
GPU Memory	40 GB	80 GB	200%
GPU Memory Bandwidth	12.8 TB/s	26.8 TB/s	200%
CPU Family	Intel Cascade Lake	AMD Milan	–
vCPUs	96	192	200%
Total System Memory	1152 GB	2048 GB	200%
Networking Throughput	400 Gbps	3200 Gbps	800%
EBS Throughput	19 Gbps	80 Gbps	400%
Local Instance Storage	8 TBs NVMe	30 TBs NVMe	375%
GPU to GPU Interconnect	600 GB/s	900 GB/s	150%

Second-generation Amazon EC2 UltraClusters and Elastic Fabric Adaptor
P5 instances provide market-leading scale-out capability for multi-node distributed training and tightly coupled HPC workloads. They offer up to 3,200 Gbps of networking using the second-generation Elastic Fabric Adaptor (EFA) technology, 8 times compared with P4d instances.

To address customer needs for large-scale and low latency, P5 instances are deployed in the second-generation EC2 UltraClusters, which now provide customers with lower latency across up to 20,000+ NVIDIA H100 Tensor Core GPUs. Providing the largest scale of ML infrastructure in the cloud, P5 instances in EC2 UltraClusters deliver up to 20 exaflops of aggregate compute capability.

EC2 UltraClusters use Amazon FSx for Lustre, fully managed shared storage built on the most popular high-performance parallel file system. With FSx for Lustre, you can quickly process massive datasets on demand and at scale and deliver sub-millisecond latencies. The low-latency and high-throughput characteristics of FSx for Lustre are optimized for deep learning, generative AI, and HPC workloads on EC2 UltraClusters.

FSx for Lustre keeps the GPUs and ML accelerators in EC2 UltraClusters fed with data, accelerating the most demanding workloads. These workloads include LLM training, generative AI inferencing, and HPC workloads, such as genomics and financial risk modeling.

Getting Started with EC2 P5 Instances
To get started, you can use P5 instances in the US East (N. Virginia) and US West (Oregon) Region.

When launching P5 instances, you will choose AWS Deep Learning AMIs (DLAMIs) to support P5 instances. DLAMI provides ML practitioners and researchers with the infrastructure and tools to quickly build scalable, secure distributed ML applications in preconfigured environments.

You will be able to run containerized applications on P5 instances with AWS Deep Learning Containers using libraries for Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). For a more managed experience, you can also use P5 instances via Amazon SageMaker, which helps developers and data scientists easily scale to tens, hundreds, or thousands of GPUs to train a model quickly at any scale without worrying about setting up clusters and data pipelines. HPC customers can leverage AWS Batch and ParallelCluster with P5 to help orchestrate jobs and clusters efficiently.

Existing P4 customers will need to update their AMIs to use P5 instances. Specifically, you will need to update your AMIs to include the latest NVIDIA driver with support for NVIDIA H100 Tensor Core GPUs. They will also need to install the latest CUDA version (CUDA 12), CuDNN version, framework versions (e.g., PyTorch, Tensorflow), and EFA driver with updated topology files. To make this process easy for you, we will provide new DLAMIs and Deep Learning Containers that come prepackaged with all the needed software and frameworks to use P5 instances out of the box.

Now Available
Amazon EC2 P5 instances are available today in AWS Regions: US East (N. Virginia) and US West (Oregon). For more information, see the Amazon EC2 pricing page. To learn more, visit our P5 instance page and explore AWS re:Post for EC2 or through your usual AWS Support contacts.

You can choose a broad range of AWS services that have generative AI built in, all running on the most cost-effective cloud infrastructure for generative AI. To learn more, visit Generative AI on AWS to innovate faster and reinvent your applications.

— Channy

AWS Entity Resolution: Match and Link Related Records from Multiple Applications and Data Stores

2023-07-26 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/aws-entity-resolution-match-and-link-related-records-from-multiple-applications-and-data-stores/

As organizations grow, the records that contain information about customers, businesses, or products tend to be increasingly fragmented and siloed across applications, channels, and data stores. Because information can be gathered in different ways, there is also the issue of different but equivalent data, such as for street addresses (“5th Avenue” and “5th Ave”). As a consequence, it’s not easy to link related records together to create a unified view and gain better insights.

For example, companies want to run advertising campaigns to reach consumers across multiple applications and channels with personalized messaging. Companies often have to deal with disparate data records that contain incomplete or conflicting information, creating a difficult matching process.

In the retail industry, companies have to reconcile, across their supply chain and stores, products that use multiple and different product codes, such as stock keeping units (SKUs), universal product codes (UPCs), or proprietary codes. This prevents them from analyzing information quickly and holistically.

One way to address this problem is to build bespoke data resolution solutions such as complex SQL queries interacting with multiple databases, or train machine learning (ML) models for record matching. But these solutions take months to build, require development resources, and are costly to maintain.

To help you with that, today we’re introducing AWS Entity Resolution, an ML-powered service that helps you match and link related records stored across multiple applications, channels, and data stores. You can get started in minutes configuring entity resolution workflows that are flexible, scalable, and can seamlessly connect to your existing applications.

AWS Entity Resolution offers advanced matching techniques, such as rule-based matching and machine learning models, to help you accurately link related sets of customer information, product codes, or business data codes. For example, you can use AWS Entity Resolution to create a unified view of your customer interactions by linking recent events (such as ad clicks, cart abandonment, and purchases) into a unique entity ID, or better track products that use different codes (like SKUs or UPCs) across your stores.

With AWS Entity Resolution, you can improve matching accuracy and protect data security while minimizing data movement because it reads records where they already live. Let’s see how that works in practice.

Using AWS Entity Resolution
As part of my analytics platform, I have a comma-separated values (CSV) file containing one million fictitious customers in an Amazon Simple Storage Service (Amazon S3) bucket. These customers come from a loyalty program but can have applied through different channels (online, in store, by post), so it’s possible that multiple records relate to the same customer.

This is the format of the data in the CSV file:

loyalty_id, rewards_id, name_id, first_name, middle_initial, last_name, program_id, emp_property_nbr, reward_parent_id, loyalty_program_id, loyalty_program_desc, enrollment_dt, zip_code,country, country_code, address1, address2, address3, address4, city, state_code, state_name, email_address, phone_nbr, phone_type

I use an AWS Glue crawler to automatically determine the content of the file and keep the metadata table updated in the data catalog so that it’s available for my analytics jobs. Now, I can use the same setup with AWS Entity Resolution.

In the AWS Entity Resolution console, I choose Get started to see how to set up a matching workflow.

To create a matching workflow, I first need to define my data with a schema mapping.

I choose Create schema mapping, enter a name and description, and select the option to import the schema from AWS Glue. I could also define a custom schema using a step-by-step flow or a JSON editor.

I select the AWS Glue database and table from the two dropdowns to import columns and pre-populate the input fields.

I select the Unique ID from the dropdown. The unique ID is the column that can distinctly reference each row of my data. In this case, it’s the loyalty_id in the CSV file.

I select the input fields that are going to be used for matching. In this case, I choose the columns from the dropdown that can be used to recognize if multiple records are related to the same customer. If some columns aren’t required for matching but are required in the output file, I can optionally add them as pass-through fields. I choose Next.

I map the input fields to their input type and match key. In this way, AWS Entity Resolution knows how to use these fields to match similar records. To continue, I choose Next.

Now, I use grouping to better organize the data I need to compare. For example, the First name, Middle name, and Last name input fields can be grouped together and compared as a Full name.

I also create a group for the Address fields.

I choose Next and review all configurations. Then, I choose Create schema mapping.

Now that I’ve created the schema mapping, I choose Matching workflows from the navigation pane and then Create matching workflow.

I enter a name and a description. Then, to configure the input data, I select the AWS Glue database and table and the schema mapping.

To give the service access to the data, I select a service role that I configured previously. The service role gives access to the input and output S3 buckets and the AWS Glue database and table. If the input or output buckets are encrypted, the service role can also give access to the AWS Key Management Service (AWS KMS) keys needed to encrypt and decrypt the data. I choose Next.

I have the option to use a rule-based or ML-powered matching method. Depending on the method, I can use a manual or automatic processing cadence to run the matching workflow job. For now, I select Machine learning matching and Manual for the Processing cadence, and then choose Next.

I configure an S3 bucket as the output destination. Under Data format, I select Normalized data so that special characters and extra spaces are removed, and data is formatted to lowercase.

I use the default Encryption settings. For Data output, I use the default so that all input fields are included. For security, I can hide fields to exclude them from output or hash fields I want to mask. I choose Next.

I review all settings and choose Create and run to complete the creation of the matching workflow and run the job for the first time.

After a few minutes, the job completes. According to this analysis, of the 1 million records, only 835 thousand are unique customers. I choose View output in Amazon S3 to download the output files.

In the output files, each record has the original unique ID (loyalty_id in this case) and a newly assigned MatchID. Matching records, related to the same customers, have the same MatchID. The ConfidenceLevel field describes the confidence that machine learning matching has that the corresponding records are actually a match.

I can now use this information to have a better understanding of customers who are subscribed to the loyalty program.

Availability and Pricing
AWS Entity Resolution is generally available today in the following AWS Regions: US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Seoul, Singapore, Sydney, Tokyo), and Europe (Frankfurt, Ireland, London).

With AWS Entity Resolution, you pay only for what you use based on the number of source records processed by your workflows. Pricing doesn’t depend on the matching method, whether it’s machine learning or rule-based record matching. For more information, see AWS Entity Resolution pricing.

Using AWS Entity Resolution, you gain a deeper understanding of how data is linked. That helps you deliver new insights, enhance decision making, and improve customer experiences based on a unified view of their records.

Simplify the way you match and link related records across applications, channels, and data stores with AWS Entity Resolution.

— Danilo

Introducing the vector engine for Amazon OpenSearch Serverless, now in preview

2023-07-26 Pavani Baddepudi

Post Syndicated from Pavani Baddepudi original https://aws.amazon.com/blogs/big-data/introducing-the-vector-engine-for-amazon-opensearch-serverless-now-in-preview/

We are pleased to announce the preview release of the vector engine for Amazon OpenSearch Serverless. The vector engine provides a simple, scalable, and high-performing similarity search capability in Amazon OpenSearch Serverless that makes it easy for you to build modern machine learning (ML) augmented search experiences and generative artificial intelligence (AI) applications without having to manage the underlying vector database infrastructure. This post summarizes the features and functionalities of our vector engine.

Using augmented ML search and generative AI with vector embeddings

Organizations across all verticals are rapidly adopting generative AI for its ability to handle vast datasets, generate automated content, and provide interactive, human-like responses. Customers are exploring ways to transform the end-user experience and interaction with their digital platform by integrating advanced conversational generative AI applications such as chatbots, question and answer systems, and personalized recommendations. These conversational applications enable you to search and query in natural language and generate responses that closely resemble human-like responses by accounting for the semantic meaning, user intent, and query context.

ML-augmented search applications and generative AI applications use vector embeddings, which are numerical representations of text, image, audio, and video data to generate dynamic and relevant content. The vector embeddings are trained on your private data and represent the semantic and contextual attributes of the information. Ideally, these embeddings can be stored and managed close to your domain-specific datasets, such as within your existing search engine or database. This enables you to process a user’s query to find the closest vectors and combine them with additional metadata without relying on external data sources or additional application code to integrate the results. Customers want a vector database option that is simple to build on and enables them to move quickly from prototyping to production so they can focus on creating differentiated applications. The vector engine for OpenSearch Serverless extends OpenSearch’s search capabilities by enabling you to store, search, and retrieve billions of vector embeddings in real time and perform accurate similarity matching and semantic searches without having to think about the underlying infrastructure.

Exploring the vector engine’s capabilities

Built on OpenSearch Serverless, the vector engine inherits and benefits from its robust architecture. With the vector engine, you don’t have to worry about sizing, tuning, and scaling the backend infrastructure. The vector engine automatically adjusts resources by adapting to changing workload patterns and demand to provide consistently fast performance and scale. As the number of vectors grows from a few thousand during prototyping to hundreds of millions and beyond in production, the vector engine will scale seamlessly, without the need for reindexing or reloading your data to scale your infrastructure. Additionally, the vector engine has separate compute for indexing and search workloads, so you can seamlessly ingest, update, and delete vectors in real time while ensuring that the query performance your users experience remains unaffected. All the data is persisted in Amazon Simple Storage Service (Amazon S3), so you get the same data durability guarantees as Amazon S3 (eleven nines). Even though we are still in preview, the vector engine is designed for production workloads with redundancy for Availability Zone outages and infrastructure failures.

The vector engine for OpenSearch Serverless is powered by the k-nearest neighbor (kNN) search feature in the open-source OpenSearch Project, proven to deliver reliable and precise results. Many customers today are using OpenSearch kNN search in managed clusters for offering semantic search and personalization in their applications. With the vector engine, you can get the same functionality with the simplicity of a serverless environment. The vector engine supports the popular distance metrics such as Euclidean, cosine similarity, and dot product, and can accommodate 16,000 dimensions, making it well-suited to support a wide range of foundational and other AI/ML models. You can also store diverse fields with various data types such as numeric, boolean, date, keyword, geopoint for metadata, and text for descriptive information to add more context to the stored vectors. Colocating the data types reduces the complexity and maintainability and avoids data duplication, version compatibility challenges, and licensing issues, effectively simplifying your application stack. Because the vector engine supports the same OpenSearch open-source suite APIs, you can take advantage of its rich query capabilities, such as full text search, advanced filtering, aggregations, geo-spatial query, nested queries for faster retrieval of data, and enhanced search results. For example, if your use case requires you to find the results within 15 miles of the requestor, the vector engine can do this in a single query, eliminating the need for maintaining two different systems and then combining the results through application logic. With support for integration with LangChain, Amazon Bedrock, and Amazon SageMaker, you can easily integrate your preferred ML and AI system with the vector engine.

The vector engine supports a wide range of use cases across various domains, including image search, document search, music retrieval, product recommendation, video search, location-based search, fraud detection, and anomaly detection. We also anticipate a growing trend for hybrid searches that combine lexical search methods with advanced ML and generative AI capabilities. For example, when a user searches for a “red shirt” on your e-commerce website, semantic search helps expand the scope by retrieving all shades of red, while preserving the tuning and boosting logic implemented on the lexical (BM25) search. With OpenSearch filtering, you can further enhance the relevance of your search results by providing users with options to refine their search based on size, brand, price range, and availability in nearby stores, allowing for a more personalized and precise experience. The hybrid search support in the vector engine enables you to query vector embeddings, metadata, and descriptive information within a single query call, making it easy to provide more accurate and contextually relevant search results without building complex application code.

You can get started in minutes with the vector engine by creating a specialized vector search collection under OpenSearch Serverless using the AWS Management Console, AWS Command Line Interface (AWS CLI), or the AWS software development kit (AWS SDK). Collections are a logical grouping of indexed data that works together to support a workload, while the physical resources are automatically managed in the backend. You don’t have to declare how much compute or storage is needed or monitor the system to make sure it’s running well. OpenSearch Serverless applies different sharding and indexing strategies for the three available collection types: time series, search, and vector search. The vector engine’s compute capacity used for data ingestion, and search and query are measured in OpenSearch Compute Units (OCUs). One OCU can handle 4 million vectors for 128 dimensions or 500K for 768 dimensions at 99% recall rate. The vector engine is built on OpenSearch Serverless, which is a highly available service and requires a minimum of 4 OCUs (two OCUs for the ingest including primary and standby, and two OCUs for the search with two active replicas across Availability Zones) for that first collection in an account. All subsequent collections using the same AWS Key Management Service (AWS KMS) key can share those OCUs.

Get started with vector embeddings

To get started using vector embeddings using the console, complete the following steps:

Create a new collection on the OpenSearch Serverless console.
Provide a name and optional description.
Currently, vector embeddings are supported exclusively by vector search collections; therefore, for Collection type, select Vector search.
Next, you must configure the security policies, which includes encryption, network, and data access policies.

We are introducing the new Easy create option, which streamlines the security configuration for faster onboarding. All the data in the vector engine is encrypted in transit and at rest by default. You can choose to bring your own encryption key or use the one provided by the service that is dedicated for your collection or account. You can choose to host your collection on a public endpoint or within a VPC. The vector engine supports fine-grained AWS Identity and Access Management (IAM) permissions so that you can define who can create, update, and delete encryption, network, collections, and indexes, thereby enabling organizational alignment.

With the security settings in place, you can finish creating the collection.

After the collection is successfully created, you can create the vector index. At this point, you can use the API or the console to create an index. An index is a collection of documents with a common data schema and provides a way for you to store, search, and retrieve your vector embeddings and other fields. The vector index supports up to 1,000 fields.

To create the vector index, you must define the vector field name, dimensions, and the distance metric.

The vector index supports up to 16,000 dimensions and three types of distance metrics: Euclidean, cosine, and dot product.

Once you have successfully created the index, you can use OpenSearch’s powerful query capabilities to get comprehensive search results.

The following example shows how easily you can create a simple property listing index with the title, description, price, and location details as fields using the OpenSearch API. By using the query APIs, this index can efficiently provide accurate results to match your search requests, such as “Find me a two-bedroom apartment in Seattle that is under $3000.”

From preview to GA and beyond

Today, we are excited to announce the preview of the vector engine, making it available for you to begin testing it out immediately. As we noted earlier, OpenSearch Serverless was designed to provide a highly available service to power your enterprise applications, with independent compute resources for index and search and built-in redundancy.

We recognize that many of you are in the experimentation phase and would like a more economical option for dev-test. Prior to GA, we plan to offer two features that will enable us to reduce the cost of your first collection. The first is a new dev-test option that enables you to launch a collection with no active standby or replica, reducing the entry cost by 50%. The vector engine still provides durability guarantees because it persists all the data in Amazon S3. The second is to initially provision a 0.5 OCU footprint, which will scale up as needed to support your workload, further lowering costs if your initial workload is in the tens of thousands to low-hundreds of thousands of vectors (depending on the number of dimensions). Between these two features, we will reduce the minimum OCUs needed to power your first collection from 4 OCUs down to 1 OCU per hour.

We are also working on features that will allow us to achieve workload pause and resume capabilities in the coming months, which is particularly useful for the vector engine because many of these use cases don’t require continuous indexing of the data.

Lastly, we are diligently focused on optimizing the performance and memory usage of the vector graphs, including improving caching, merging and more.

While we work on these cost reductions, we will be offering the first 1400 OCU-hours per month free on vector collections until the dev-test option is made available. This will enable you to test the vector engine preview for up to two weeks every month at no cost, based on your workload.

Summary

The vector engine for OpenSearch Serverless introduces a simple, scalable, and high-performing vector storage and search capability that makes it straightforward for you to quickly store and query billions of vector embeddings generated from a variety of ML models, such as those provided by Amazon Bedrock, with response times in milliseconds.

The preview release of vector engine for OpenSearch Serverless is now available in eight Regions globally: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), and Europe (Ireland).

We are the excited about the future ahead and your feedback will play a vital role in guiding the progress of this product. We encourage you to try out the vector engine for OpenSearch Serverless and share your use cases, questions, and feedback in the comments section.

In the coming weeks, we will be publishing a series of posts to provide you with detailed guidance on how to integrate the vector engine with LangChain, Amazon Bedrock, and SageMaker. To learn more about the vector engine’s capabilities, refer to our Getting Started with Amazon OpenSearch Serverless documentation

About the authors

Pavani Baddepudi is a Principal Product Manager for Search Services at AWS and the lead PM for OpenSearch Serverless. Her interests include distributed systems, networking, and security. When not working, she enjoys hiking and exploring new cuisines.

Carl Meadows is Director of Product Management at AWS and is responsible for Amazon Elasticsearch Service, OpenSearch, Open Distro for Elasticsearch, and Amazon CloudSearch. Carl has been with Amazon Elasticsearch Service since before it was launched in 2015. He has a long history of working in the enterprise software and cloud services spaces. When not working, Carl enjoys making and recording music.

Preview – Enable Foundation Models to Complete Tasks With Agents for Amazon Bedrock

2023-07-26 Antje Barth

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/preview-enable-foundation-models-to-complete-tasks-with-agents-for-amazon-bedrock/

This April, Swami Sivasubramanian, Vice President of Data and Machine Learning at AWS, announced Amazon Bedrock and Amazon Titan models as part of new tools for building with generative AI on AWS. Amazon Bedrock, currently available in preview, is a fully managed service that makes foundation models (FMs) from Amazon and leading AI startups—such as AI21 Labs, Anthropic, Cohere, and Stability AI—available through an API.

Today, I’m excited to announce the preview of agents for Amazon Bedrock, a new capability for developers to create fully managed agents in a few clicks. Agents for Amazon Bedrock accelerate the delivery of generative AI applications that can manage and perform tasks by making API calls to your company systems. Agents extend FMs to understand user requests, break down complex tasks into multiple steps, carry on a conversation to collect additional information, and take actions to fulfill the request.

Using agents for Amazon Bedrock, you can automate tasks for your internal or external customers, such as managing retail orders or processing insurance claims. For example, an agent-powered generative AI e-commerce application can not only respond to the question, “Do you have this jacket in blue?” with a simple answer but can also help you with the task of updating your order or managing an exchange.

For this to work, you first need to give the agent access to external data sources and connect it to existing APIs of other applications. This allows the FM that powers the agent to interact with the broader world and extend its utility beyond just language processing tasks. Second, the FM needs to figure out what actions to take, what information to use, and in which sequence to perform these actions. This is possible thanks to an exciting emerging behavior of FMs—their ability to reason. You can show FMs how to handle such interactions and how to reason through tasks by building prompts that include definitions and instructions. The process of designing prompts to guide the model towards desired outputs is known as prompt engineering.

Introducing Agents for Amazon Bedrock
Agents for Amazon Bedrock automate the prompt engineering and orchestration of user-requested tasks. Once configured, an agent automatically builds the prompt and securely augments it with your company-specific information to provide responses back to the user in natural language. The agent is able to figure out the actions required to automatically process user-requested tasks. It breaks the task into multiple steps, orchestrates a sequence of API calls and data lookups, and maintains memory to complete the action for the user.

With fully managed agents, you don’t have to worry about provisioning or managing infrastructure. You’ll have seamless support for monitoring, encryption, user permissions, and API invocation management without writing custom code. As a developer, you can use the Bedrock console or SDK to upload the API schema. The agent then orchestrates the tasks with the help of FMs and performs API calls using AWS Lambda functions.

Primer on Advanced Reasoning and ReAct
You can help FMs to reason and figure out how to solve user-requested tasks with a reasoning technique called ReAct (synergizing reasoning and acting). Using ReAct, you can structure prompts to show an FM how to reason through a task and decide on actions that help find a solution. The structured prompts include a sequence of question-thought-action-observation examples.

The question is the user-requested task or problem to solve. The thought is a reasoning step that helps demonstrate to the FM how to tackle the problem and identify an action to take. The action is an API that the model can invoke from an allowed set of APIs. The observation is the result of carrying out the action. The actions that the FM is able to choose from are defined by a set of instructions that are prepended to the example prompt text. Here is an illustration of how you would build up a ReAct prompt:

The good news is that Bedrock performs the heavy lifting for you! Behind the scenes, agents for Amazon Bedrock build the prompts based on the information and actions you provide.

Now, let me show you how to get started with agents for Amazon Bedrock.

Create an Agent for Amazon Bedrock
Let’s assume you’re a developer at an insurance company and want to provide a generative AI application that helps the insurance agency owners automate repetitive tasks. You create an agent in Bedrock and integrate it into your application.

To get started with the agent, open the Bedrock console, select Agents in the left navigation panel, then choose Create Agent.

This starts the agent creation workflow.

Provide agent details including agent name, description (optional), whether the agent is allowed to request additional user inputs, and the AWS Identity and Access Management (IAM) service role that gives your agent access to other required services, such as Amazon Simple Storage Service (Amazon S3) and AWS Lambda.
Select a foundation model from Bedrock that fits your use case. Here, you provide an instruction to your agent in natural language. The instruction tells the agent what task it’s supposed to perform and the persona it’s supposed to assume. For example, “You are an agent designed to help with processing insurance claims and managing pending paperwork.”

Add action groups. An action is a task that the agent can perform automatically by making API calls to your company systems. A set of actions is defined in an action group. Here, you provide an API schema that defines the APIs for all the actions in the group. You also must provide a Lambda function that represents the business logic for each API. For example, let’s define an action group called ClaimManagementActionGroup that manages insurance claims by pulling a list of open claims, identifying outstanding paperwork for each claim, and sending reminders to policy holders. Make sure to capture this information in the action group description.

The business logic for my action group is captured in the Lambda function InsuranceClaimsLambda. This AWS Lambda function implements methods for the following API calls: open-claims, identify-missing-documents, and send-reminders.Here’s a short extract from my OrderManagementLambda:

import json
import time
 
def open_claims():
    ...

def identify_missing_documents(parameters):
    ...
 
def send_reminders():
    ...
 
def lambda_handler(event, context):
    responses = []
 
    for prediction in event['actionGroups']:
        response_code = ...
        action = prediction['actionGroup']
        api_path = prediction['apiPath']
        
        if api_path == '/claims':
            body = open_claims() 
        elif api_path == '/claims/{claimId}/identify-missing-documents':
			parameters = prediction['parameters']
            body = identify_missing_documents(parameters)
        elif api_path == '/send-reminders':
            body =  send_reminders()
        else:
            body = {"{}::{} is not a valid api, try another one.".format(action, api_path)}
 
        response_body = {
            'application/json': {
                'body': str(body)
            }
        }
        
        action_response = {
            'actionGroup': prediction['actionGroup'],
            'apiPath': prediction['apiPath'],
            'httpMethod': prediction['httpMethod'],
            'httpStatusCode': response_code,
            'responseBody': response_body
        }
        
        responses.append(action_response)
 
    api_response = {'response': responses}
 
    return api_response

Note that you also must provide an API schema in the OpenAPI schema JSON format. Here’s what my API schema file insurance_claim_schema.json looks like:

{"openapi": "3.0.0",
    "info": {
        "title": "Insurance Claims Automation API",
        "version": "1.0.0",
        "description": "APIs for managing insurance claims by pulling a list of open claims, identifying outstanding paperwork for each claim, and sending reminders to policy holders."
    },
    "paths": {
        "/claims": {
            "get": {
                "summary": "Get a list of all open claims",
                "description": "Get the list of all open insurance claims. Return all the open claimIds.",
                "operationId": "getAllOpenClaims",
                "responses": {
                    "200": {
                        "description": "Gets the list of all open insurance claims for policy holders",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "array",
                                    "items": {
                                        "type": "object",
                                        "properties": {
                                            "claimId": {
                                                "type": "string",
                                                "description": "Unique ID of the claim."
                                            },
                                            "policyHolderId": {
                                                "type": "string",
                                                "description": "Unique ID of the policy holder who has filed the claim."
                                            },
                                            "claimStatus": {
                                                "type": "string",
                                                "description": "The status of the claim. Claim can be in Open or Closed state"
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        },
        "/claims/{claimId}/identify-missing-documents": {
            "get": {
                "summary": "Identify missing documents for a specific claim",
                "description": "Get the list of pending documents that need to be uploaded by policy holder before the claim can be processed. The API takes in only one claim id and returns the list of documents that are pending to be uploaded by policy holder for that claim. This API should be called for each claim id",
                "operationId": "identifyMissingDocuments",
                "parameters": [{
                    "name": "claimId",
                    "in": "path",
                    "description": "Unique ID of the open insurance claim",
                    "required": true,
                    "schema": {
                        "type": "string"
                    }
                }],
                "responses": {
                    "200": {
                        "description": "List of documents that are pending to be uploaded by policy holder for insurance claim",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "properties": {
                                        "pendingDocuments": {
                                            "type": "string",
                                            "description": "The list of pending documents for the claim."
                                        }
                                    }
                                }
                            }
                        }

                    }
                }
            }
        },
        "/send-reminders": {
            "post": {
                "summary": "API to send reminder to the customer about pending documents for open claim",
                "description": "Send reminder to the customer about pending documents for open claim. The API takes in only one claim id and its pending documents at a time, sends the reminder and returns the tracking details for the reminder. This API should be called for each claim id you want to send reminders for.",
                "operationId": "sendReminders",
                "requestBody": {
                    "required": true,
                    "content": {
                        "application/json": {
                            "schema": {
                                "type": "object",
                                "properties": {
                                    "claimId": {
                                        "type": "string",
                                        "description": "Unique ID of open claims to send reminders for."
                                    },
                                    "pendingDocuments": {
                                        "type": "string",
                                        "description": "The list of pending documents for the claim."
                                    }
                                },
                                "required": [
                                    "claimId",
                                    "pendingDocuments"
                                ]
                            }
                        }
                    }
                },
                "responses": {
                    "200": {
                        "description": "Reminders sent successfully",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "properties": {
                                        "sendReminderTrackingId": {
                                            "type": "string",
                                            "description": "Unique Id to track the status of the send reminder Call"
                                        },
                                        "sendReminderStatus": {
                                            "type": "string",
                                            "description": "Status of send reminder notifications"
                                        }
                                    }
                                }
                            }
                        }
                    },
                    "400": {
                        "description": "Bad request. One or more required fields are missing or invalid."
                    }
                }
            }
        }
    }
}

When a user asks your agent to complete a task, Bedrock will use the FM you configured for the agent to identify the sequence of actions and invoke the corresponding Lambda functions in the right order to solve the user-requested task.

In the final step, review your agent configuration and choose Create Agent.
Congratulations, you’ve just created your first agent in Amazon Bedrock!

Deploy an Agent for Amazon Bedrock
To deploy an agent in your application, you must create an alias. Bedrock then automatically creates a version for that alias.

In the Bedrock console, select your agent, then select Deploy, and choose Create to create an alias.
Provide an alias name and description and choose whether to create a new version or use an existing version of your agent to associate with this alias.
This saves a snapshot of the agent code and configuration and associates an alias with this snapshot or version. You can use the alias to integrate the agent into your applications.

Now, let’s test the insurance agent! You can do this right in the Bedrock console.

Let’s ask the agent to “Send reminder to all policy holders with open claims and pending paper work.” You can see how the FM-powered agent is able to understand the user request, break down the task into steps (collect the open insurance claims, lookup the claim IDs, send reminders), and perform the corresponding actions.

Agents for Amazon Bedrock can help you increase productivity, improve your customer service experience, or automate DevOps tasks. I’m excited to see what use cases you will implement!

Learn the Fundamentals of Generative AI
If you’re interested in the fundamentals of generative AI and how to work with FMs, including advanced prompting techniques and agents, check out this this new hands-on course that I developed with AWS colleagues and industry experts in collaboration with DeepLearning.AI:

Generative AI with large language models (LLMs) is an on-demand, three-week course for data scientists and engineers who want to learn how to build generative AI applications with LLMs. It’s the perfect foundation to start building with Amazon Bedrock. Enroll for generative AI with LLMs today.

Sign up to Learn More about Amazon Bedrock (Preview)
Amazon Bedrock is currently available in preview. Reach out to us if you’d like access to agents for Amazon Bedrock as part of the preview. We’re regularly providing access to new customers. Visit the Amazon Bedrock Features page and sign up to learn more about Amazon Bedrock.

— Antje

Top Announcements of the AWS Summit in New York, 2023

2023-07-26 AWS News Blog Team

Post Syndicated from AWS News Blog Team original https://aws.amazon.com/blogs/aws/top-announcements-of-the-aws-summit-in-new-york-2023/

It’ll be a full house as the AWS Summit gets underway in New York City on Wednesday, July 26, 2023. The cloud event has something for everyone including a keynote, breakout sessions, opportunities to network, and of course, to learn about the latest exciting AWS product announcements.

Today, we’re sharing a selection of announcements to get the fun started. We’ll also share major updates from Wednesday’s keynote, so check back for more exciting news to come soon.

If you want to attend the event virtually, you can still register for the keynote livestream.

(This post was last updated: 5:35 p.m. PST, July 25, 2023.)

AWS product announcements from July 25, 2023

Introducing AWS HealthImaging — purpose-built for medical imaging at scale
This new HIPAA-eligible service empowers healthcare providers and their software partners to store, analyze, and share medical imaging data at petabyte scale.

Amazon Redshift now supports querying Apache Iceberg tables (preview)
Apache Iceberg, one of the most recent open table formats, has been used by many customers to simplify data processing on rapidly expanding and evolving tables stored in data lakes.

AWS Glue Studio now supports Amazon Redshift Serverless
Before this launch, developers using Glue Studio only had access to Redshift tables in Redshift clusters. Now, those same developers can connect to Redshift Serverless tables directly without manual configuration.

Snowflake connectivity for AWS Glue for Apache Spark is now generally available
AWS Glue for Apache Spark now supports native connectivity to Snowflake, which enables users to read and write data without the need to install or manage Snowflake connector libraries.

AWS Glue jobs can now include AWS Glue DataBrew Recipes
The new integration makes it simpler to deploy and scale DataBrew jobs and gives DataBrew users access to AWS Glue features not available in DataBrew.

AWS re:Inforce 2023: Key announcements and session highlights

2023-07-21 Nisha Amthul

Post Syndicated from Nisha Amthul original https://aws.amazon.com/blogs/security/aws-reinforce-2023-key-announcements-and-session-highlights/

AWS re:Inforce

Thank you to everyone who participated in AWS re:Inforce 2023, both virtually and in-person. The conference featured a lineup of over 250 engaging sessions and hands-on labs, in collaboration with more than 80 AWS partner sponsors, over two days of immersive cloud security learning. The keynote was delivered by CJ Moses, AWS Chief Information Security Officer, Becky Weiss, AWS Senior Principal Engineer, and Debbie Wheeler, Delta Air Lines Chief Information Security Officer. They shared the latest innovations in cloud security from AWS and provided insights on how to foster a culture of security in your organization.

If you couldn’t join us or would like to revisit the insightful themes discussed, we’ve put together this blog post for you. It provides a comprehensive summary of all the key announcements made and includes information on where you can watch the keynote and sessions at your convenience.

Key announcements

Here are some of the top announcements that we made at AWS re:Inforce 2023:

Amazon Verified Permissions — Verified Permissions is a scalable permissions management and fine-grained authorization service for the applications you build. The service helps your developers build secure applications faster by externalizing authorization and centralizing policy management and administration. Developers can align their application access with Zero Trust principles by implementing least privilege and continual verification within applications. Security and audit teams can better analyze and audit who has access to what within applications. Amazon Verified Permissions uses Cedar, an open-source policy language for access control that empowers developers and admins to define policy-based access controls using roles and attributes for context-aware access control.
Amazon Inspector code scanning of Lambda functions — Amazon Inspector now supports code scanning of AWS Lambda functions, expanding the existing capability to scan Lambda functions and associated layers for software vulnerabilities in application package dependencies. Amazon Inspector code scanning of Lambda functions scans custom proprietary application code you write within Lambda functions for security vulnerabilities such as injection flaws, data leaks, weak cryptography, or missing encryption. Upon detecting code vulnerabilities within the Lambda function or layer, Amazon Inspector generates actionable security findings that provide several details, such as security detector name, impacted code snippets, and remediation suggestions to address vulnerabilities. The findings are aggregated in the Amazon Inspector console and integrated with AWS Security Hub and Amazon EventBridge for streamlined workflow automation.
Amazon Inspector SBOM export — Amazon Inspector now offers the ability to export a consolidated Software Bill of Materials (SBOMs) for resources that it monitors across your organization in multiple industry-standard formats, including CycloneDx and Software Package Data Exchange (SPDX). With this new capability, you can use automated and centrally managed SBOMs to gain visibility into key information about your software supply chain. This includes details about software packages used in the resource, along with associated vulnerabilities. SBOMs can be exported to an Amazon Simple Storage Service (Amazon S3) bucket and downloaded for analyzing with Amazon Athena or Amazon QuickSight to visualize software supply chain trends. This functionality is available with a few clicks in the Amazon Inspector console or using Amazon Inspector APIs.
Amazon CodeGuru Security — Amazon CodeGuru Security offers a comprehensive set of APIs that are designed to seamlessly integrate with your existing pipelines and tooling. CodeGuru Security serves as a static application security testing (SAST) tool that uses machine learning to help you identify code vulnerabilities and provide guidance you can use as part of remediation. CodeGuru Security also provides in-context code patches for certain classes of vulnerabilities, helping you reduce the effort required to fix code.
Amazon EC2 Instance Connect Endpoint — Amazon Elastic Compute Cloud (Amazon EC2) announced support for connectivity to instances using SSH or RDP in private subnets over the Amazon EC2 Instance Connect Endpoint (EIC Endpoint). With this capability, you can connect to your instances by using SSH or RDP from the internet without requiring a public IPv4 address.
AWS built-in partner solutions — AWS built-in partner solutions are co-built with AWS experts, helping to ensure that AWS Well-Architected security reference architecture guidelines and best security practices were rigorously followed. AWS built-in partner solutions can save you valuable time and resources by getting the building blocks of cloud development right when you begin a migration or modernization initiative. AWS built-in solutions also automate deployments and can reduce installation time from months or weeks to a single day. Customers often look to our partners for innovation and help with “getting cloud right.” Now, partners with AWS built-in solutions can help you be more efficient and drive business value for both partner software and AWS native services.
AWS Cyber Insurance Partners — AWS has worked with leading cyber insurance partners to help simplify the process of obtaining cyber insurance. You can now reduce business risk by finding and procuring cyber insurance directly from validated AWS cyber insurance partners. To reduce the amount of paperwork and save time, download and share your AWS Foundational Security Best Practices Standard detailed report from AWS Security Hub and share the report with the AWS Cyber Insurance Partner of your choice. With AWS vetted cyber insurance partners, you can have confidence that these insurers understand AWS security posture and are evaluating your environment according to the latest AWS Security Best Practices. Now you can get a full cyber insurance quote in just two business days.
AWS Global Partner Security Initiative — With the AWS Global Partner Security Initiative, AWS will jointly develop end-to-end security solutions and managed services, leveraging the capabilities, scale, and deep security knowledge of our Global System Integrators (GSI) partners.
Amazon Detective finding groups — Amazon Detective expands its finding groups capability to include Amazon Inspector findings, in addition to Amazon GuardDuty findings. Using machine learning, this extension of the finding groups feature significantly streamlines the investigation process, reducing the time spent and helping to improve identification of the root cause of security incidents. By grouping findings from Amazon Inspector and GuardDuty, you can use Detective to answer difficult questions such as “was this EC2 instance compromised because of a vulnerability?” or “did this GuardDuty finding occur because of unintended network exposure?” Furthermore, Detective maps the identified findings and their corresponding tactics, techniques, and procedures to the MITRE ATT&CK framework, enhancing the overall effectiveness and alignment of security measures.
[Pre-announce] AWS Private Certificate Authority Connector for Active Directory –— AWS Private CA will soon launch a Connector for Active Directory (AD). The Connector for AD will help to reduce upfront public key infrastructure (PKI) investment and ongoing maintenance costs with a fully managed serverless solution. This new feature will help reduce PKI complexity by replacing on-premises certificate authorities with a highly secure hardware security module (HSM)-backed AWS Private CA. You will be able to automatically deploy certificates using auto-enrollment to on-premises AD and AWS Directory Service for Microsoft Active Directory.
AWS Payment Cryptography — The day before re:Inforce, AWS Payment Cryptography launched with general availability. This service simplifies cryptography operations in cloud-hosted payment applications. AWS Payment Cryptography simplifies your implementation of the cryptographic functions and key management used to secure data and operations in payment processing in accordance with various PCI standards.
AWS WAF Fraud Control launches account creation fraud prevention — AWS WAF Fraud Control announces Account Creation Fraud Prevention, a managed protection for AWS WAF that’s designed to prevent creation of fake or fraudulent accounts. Fraudsters use fake accounts to initiate activities, such as abusing promotional and sign-up bonuses, impersonating legitimate users, and carrying out phishing tactics. Account Creation Fraud Prevention helps protect your account sign-up or registration pages by allowing you to continuously monitor requests for anomalous digital activity and automatically block suspicious requests based on request identifiers and behavioral analysis.
AWS Security Hub automation rules — AWS Security Hub, a cloud security posture management service that performs security best practice checks, aggregates alerts, and facilitates automated remediation, now features a capability to automatically update or suppress findings in near real time. You can now use automation rules to automatically update various fields in findings, suppress findings, update finding severity and workflow status, add notes, and more.
Amazon S3 announces dual-layer server-side encryption — Amazon S3 is the only cloud object storage service where you can apply two layers of encryption at the object level and control the data keys used for both layers. Dual-layer server-side encryption with keys stored in AWS Key Management Service (DSSE-KMS) is designed to adhere to National Security Agency Committee on National Security Systems Policy (CNSSP) 15 for FIPS compliance and Data-at-Rest Capability Package (DAR CP) Version 5.0 guidance for two layers of MFS U/00/814670-15 Commercial National Security Algorithm (CNSA) encryption.
AWS CloudTrail Lake dashboards — AWS CloudTrail Lake, a managed data lake that lets organizations aggregate, immutably store, visualize, and query their audit and security logs, announces the general availability of CloudTrail Lake dashboards. CloudTrail Lake dashboards provide out-of-the-box visualizations and graphs of key trends from your audit and security data directly within the CloudTrail console. It also offers the flexibility to drill down on additional details, such as specific user activity, for further analysis and investigation using CloudTrail Lake SQL queries.
AWS Well-Architected Profiles — AWS Well-Architected introduces Profiles, which allows you to tailor your Well-Architected reviews based on your business goals. This feature creates a mechanism for continuous improvement by encouraging you to review your workloads with certain goals in mind first, and then complete the remaining Well-Architected review questions.

Watch on demand

Leadership sessions — You can watch the leadership sessions to learn from AWS security experts as they talk about essential topics, including open source software (OSS) security, Zero Trust, compliance, and proactive security.

Breakout sessions, lightning talks, and more — Explore our content across these six tracks:

Application Security— Discover how AWS, customers, and AWS Partners move fast while understanding the security of the software they build.
Data Protection — Learn how AWS, customers, and AWS Partners work together to protect data. Get insights into trends in data management, cryptography, data security, data privacy, encryption, and key rotation and storage.
Governance, Risk, and Compliance — Dive into the latest hot topics in governance and compliance for security practitioners, and discover how to automate compliance tools and services for operational use.
Identity and Access Management — Learn how AWS, customers, and AWS Partners use AWS Identity Services to manage identities, resources, and permissions securely and at scale. Discover how to configure fine-grained access controls for your employees, applications, and devices and deploy permission guardrails across your organization.
Network and Infrastructure Security — Gain practical expertise on the services, tools, and products that AWS, customers, and partners use to protect the usability and integrity of their networks and data.
Threat Detection and Incident Response — Discover how AWS, customers, and AWS Partners get the visibility they need to improve their security posture, reduce the risk profile of their environments, identify issues before they impact business, and implement incident response best practices.
You can also watch our Lightning Talks and the AWS On Air day 1 and day 2 livestream on demand.

Session presentation downloads are also available on the AWS Events Content page. If you’re interested in further in-person security learning opportunities, consider registering for AWS re:Invent 2023, which will be held from November 27 to December 1 in Las Vegas, NV. We look forward to seeing you there!

If you would like to discuss how these new announcements can help your organization improve its security posture, AWS is here to help. Contact your AWS account team today.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Deleting after completion

Setting up schedules to delete after completion

Benefits of automation

Conclusion

Chris Farris – Atlanta, USA

Gerardo Castro – Callao, Perú

Keisuke Usuda – Chiba, Japan

Ray Lin (Chia-Wei Lin) – Taipei, Taiwan

Shun Yoshie – Yokohama, Japan

Teri Radichel – Savannah, USA

Learn More

Next-generation imaging solutions and workflows

Security

Conclusion

Further reading

Improved cluster utilization with optimized scaling experience

Minimized interruptions with enhanced resiliency and availability

Improved cluster resiliency with upgraded logging and debugging capabilities

Conclusion

About the Authors

Using augmented ML search and generative AI with vector embeddings

Exploring the vector engine’s capabilities

Get started with vector embeddings

From preview to GA and beyond

Summary

About the authors

AWS product announcements from July 25, 2023

Key announcements

Watch on demand

The collective thoughts of the interwebz