Tag Archives: Amazon EC2

AWS Weekly Roundup: Advanced capabilities in Amazon Bedrock and Amazon Q, and more (July 15, 2024)

Post Syndicated from Abhishek Gupta original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-advanced-capabilities-in-amazon-bedrock-and-amazon-q-and-more-july-15-2024/

As expected, there were lots of exciting launches and updates announced during the AWS Summit New York. You can quickly scan the highlights in Top Announcements of the AWS Summit in New York, 2024.

My colleagues and fellow AWS News Blog writers Veliswa Boya and Sébastien Stormacq were at the AWS Community Day Cameroon last week. They were energized to meet amazing professionals, mentors, and students – all willing to learn and exchange thoughts about cloud technologies. You can access the video replay to feel the vibes or just watch some of the talks!

AWS Community Day Cameroon 2024

Last week’s launches
In addition to the launches at the New York Summit, here are a few others that got my attention.

Advanced RAG capabilities for Knowledge Bases for Amazon Bedrock – These include custom chunking options that let customers write their own chunking code as a Lambda function; smart parsing to extract information from complex data such as tables; and query reformulation to break down queries into simpler sub-queries, retrieve relevant information for each, and combine the results into a final comprehensive answer.

Amazon Bedrock Prompt Management and Prompt Flows – This is a preview launch of Prompt Management, which helps developers and prompt engineers get the best responses from foundation models for their use cases, and Prompt Flows, which accelerates the creation, testing, and deployment of workflows through an intuitive visual builder.

Fine-tuning for Anthropic’s Claude 3 Haiku in Amazon Bedrock (preview) – By providing your own task-specific training dataset, you can fine tune and customize Claude 3 Haiku to boost model accuracy, quality, and consistency to further tailor generative AI for your business.

IDE workspace context awareness in Amazon Q Developer chat – Users can now add @workspace to their chat message in Q Developer to ask questions about the code in the project they currently have open in the IDE. Q Developer automatically ingests and indexes all code files, configurations, and project structure, giving the chat comprehensive context across your entire application within the IDE.

New features in Amazon Q Business – The new personalization capabilities in Amazon Q Business are automatically enabled and use your enterprise’s employee profile data to improve each user’s experience. You can now get answers from text content in scanned PDFs, and from images embedded in PDF documents, without having to use OCR for preprocessing and text extraction.

Amazon EC2 R8g instances powered by AWS Graviton4 are now generally available – Amazon EC2 R8g instances are ideal for memory-intensive workloads such as databases, in-memory caches, and real-time big data analytics. These are powered by AWS Graviton4 processors and deliver up to 30% better performance compared to AWS Graviton3-based instances.

Vector search for Amazon MemoryDB is now generally available – Vector search for MemoryDB enables real-time machine learning (ML) and generative AI applications. It can store millions of vectors with single-digit millisecond query and update latencies at the highest levels of throughput with >99% recall.

Introducing Valkey GLIDE, an open source client library for Valkey and Redis open source – Valkey is an open source key-value data store that supports a variety of workloads such as caching and message queues. Valkey GLIDE is one of the official client libraries for Valkey, and it supports all Valkey commands. GLIDE supports Valkey 7.2 and above, and Redis open source 6.2, 7.0, and 7.2.

Amazon OpenSearch Service enhancements – Amazon OpenSearch Serverless now supports workloads up to 30 TB of data for time-series collections, enabling more data-intensive use cases, and an innovative caching mechanism that automatically fetches and intelligently manages data, leading to faster data retrieval, efficient storage usage, and cost savings. Amazon OpenSearch Service has also added support for AI-powered natural language query generation in OpenSearch Dashboards Log Explorer, so you can get started quickly with log analysis without first having to be proficient in PPL.

Open source release of Secrets Manager Agent for AWS Secrets Manager – Secrets Manager Agent is a language agnostic local HTTP service that you can install and use in your compute environments to read secrets from Secrets Manager and cache them in memory, instead of making a network call to Secrets Manager.

Amazon S3 Express One Zone now supports logging of all events in AWS CloudTrail – This capability lets you get details on who made API calls to S3 Express One Zone and when API calls were made, thereby enhancing data visibility for governance, compliance, and operational auditing.

Amazon CloudFront announces managed cache policies for web applications – Previously, Amazon CloudFront customers had two options for managed cache policies, and had to create custom cache policies for all other cases. With the new managed cache policies, CloudFront caches content based on the Cache-Control headers returned by the origin, and defaults to not caching when the header is not returned.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

We launched existing services in additional Regions:

Other AWS news
Here are some additional projects, blog posts, and news items that you might find interesting:

Context window overflow: Breaking the barrier – This blog post dives into the intricate workings of generative artificial intelligence (AI) models and why it is crucial to understand and mitigate the limitations of context window overflow (CWO).

Using Agents for Amazon Bedrock to interactively generate infrastructure as code – This blog post explores how Agents for Amazon Bedrock can be used to generate customized, organization standards-compliant IaC scripts directly from uploaded architecture diagrams.

Automating model customization in Amazon Bedrock with AWS Step Functions workflow – This blog post covers orchestrating repeatable and automated workflows for customizing Amazon Bedrock models and how AWS Step Functions can help overcome key pain points in model customization.

AWS open source news and updates – My colleague Ricardo Sueiras writes about open source projects, tools, and events from the AWS Community; check out Ricardo’s page for the latest updates.

Upcoming AWS events
Check your calendars and sign up for upcoming AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. To learn more about future AWS Summit events, visit the AWS Summit page. Register in your nearest city: Bogotá (July 18), Taipei (July 23–24), Mexico City (Aug. 7), and Sao Paulo (Aug. 15).

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world. Upcoming AWS Community Days are in Aotearoa (Aug. 15), Nigeria (Aug. 24), New York (Aug. 28), and Belfast (Sept. 6).

Browse all upcoming AWS led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Abhishek

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

AWS Graviton4-based Amazon EC2 R8g instances: best price performance in Amazon EC2

Post Syndicated from Esra Kayabali original https://aws.amazon.com/blogs/aws/aws-graviton4-based-amazon-ec2-r8g-instances-best-price-performance-in-amazon-ec2/

Today, I am very excited to announce that the new AWS Graviton4-based Amazon Elastic Compute Cloud (Amazon EC2) R8g instances, which have been available in preview since re:Invent 2023, are now generally available to all. AWS offers more than 150 different AWS Graviton-powered Amazon EC2 instance types globally at scale, has built more than 2 million Graviton processors, and has more than 50,000 customers using AWS Graviton-based instances to achieve the best price performance for their applications.

AWS Graviton4 is the most powerful and energy efficient processor we have ever designed for a broad range of workloads running on Amazon EC2. Like all the other AWS Graviton processors, AWS Graviton4 uses a 64-bit Arm instruction set architecture. AWS Graviton4-based Amazon EC2 R8g instances deliver up to 30% better performance than AWS Graviton3-based Amazon EC2 R7g instances. This helps you to improve performance of your most demanding workloads such as high-performance databases, in-memory caches, and real time big data analytics.

Since the preview announcement at re:Invent 2023, over 100 customers, including Epic Games, SmugMug, Honeycomb, SAP, and ClickHouse have tested their workloads on AWS Graviton4-based R8g instances and observed significant performance improvement over comparable instances. SmugMug achieved 20-40% performance improvements using AWS Graviton4-based instances compared to AWS Graviton3-based instances for their image and data compression operations. Epic Games found AWS Graviton4 instances to be the fastest EC2 instances they have ever tested and Honeycomb.io achieved more than double the throughput per vCPU compared to the non-Graviton based instances that they used four years ago.

Let’s look at some of the improvements that we have made available in our new instances. R8g instances offer larger instance sizes with up to 3x more vCPUs (up to 48xl), 3x the memory (up to 1.5 TB), 75% more memory bandwidth, and 2x more L2 cache over R7g instances. This helps you to process larger amounts of data, scale up your workloads, improve time to results, and lower your TCO. R8g instances also offer up to 50 Gbps network bandwidth and up to 40 Gbps EBS bandwidth compared to up to 30 Gbps network bandwidth and up to 20 Gbps EBS bandwidth on Graviton3-based instances.

R8g instances are the first Graviton instances to offer two bare metal sizes (metal-24xl and metal-48xl). You can right size your instances and deploy workloads that benefit from direct access to physical resources. Here are the specs for R8g instances:

Instance Size     vCPUs   Memory      Network Bandwidth   EBS Bandwidth
r8g.medium        1       8 GiB       up to 12.5 Gbps     up to 10 Gbps
r8g.large         2       16 GiB      up to 12.5 Gbps     up to 10 Gbps
r8g.xlarge        4       32 GiB      up to 12.5 Gbps     up to 10 Gbps
r8g.2xlarge       8       64 GiB      up to 15 Gbps       up to 10 Gbps
r8g.4xlarge       16      128 GiB     up to 15 Gbps       up to 10 Gbps
r8g.8xlarge       32      256 GiB     15 Gbps             10 Gbps
r8g.12xlarge      48      384 GiB     22.5 Gbps           15 Gbps
r8g.16xlarge      64      512 GiB     30 Gbps             20 Gbps
r8g.24xlarge      96      768 GiB     40 Gbps             30 Gbps
r8g.48xlarge      192     1,536 GiB   50 Gbps             40 Gbps
r8g.metal-24xl    96      768 GiB     40 Gbps             30 Gbps
r8g.metal-48xl    192     1,536 GiB   50 Gbps             40 Gbps

If you are looking for more energy-efficient compute options to help you reduce your carbon footprint and achieve your sustainability goals, R8g instances provide the best energy efficiency for memory-intensive workloads in EC2. Additionally, these instances are built on the AWS Nitro System, which offloads CPU virtualization, storage, and networking functions to dedicated hardware and software to enhance the performance and security of your workloads. The Graviton4 processors offer you enhanced security by fully encrypting all high-speed physical hardware interfaces.

R8g instances are ideal for all Linux-based workloads, including containerized and microservices-based applications built using Amazon Elastic Kubernetes Service (Amazon EKS), Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Container Registry (Amazon ECR), Kubernetes, and Docker, as well as applications written in popular programming languages such as C/C++, Rust, Go, Java, Python, .NET Core, Node.js, Ruby, and PHP. AWS Graviton4 processors are up to 30% faster for web applications, 40% faster for databases, and 45% faster for large Java applications than AWS Graviton3 processors. To learn more, visit the AWS Graviton Technical Guide.

Check out the collection of Graviton resources to help you start migrating your applications to Graviton instance types. You can also visit the AWS Graviton Fast Start program to begin your Graviton adoption journey.

Now available
R8g instances are available today in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Frankfurt) AWS Regions.

You can purchase R8g instances as Reserved Instances, On-Demand, Spot Instances, and via Savings Plans. For further information, visit Amazon EC2 pricing.

To learn more about Graviton-based instances, visit AWS Graviton Processors or the Amazon EC2 R8g instances page.

— Esra

Best practices working with self-hosted GitHub Action runners at scale on AWS

Post Syndicated from Shilpa Sharma original https://aws.amazon.com/blogs/devops/best-practices-working-with-self-hosted-github-action-runners-at-scale-on-aws/

Overview

GitHub Actions is a continuous integration and continuous deployment platform that enables the automation of build, test, and deployment activities for your workload. GitHub Self-Hosted Runners provide a flexible and customizable option to run your GitHub Action pipelines. These runners allow you to run your builds on your own infrastructure, giving you control over the environment in which your code is built, tested, and deployed. This reduces your security risk and costs, and gives you the ability to use specific tools and technologies that may not be available in GitHub hosted runners. In this blog, I explore security, performance, and cost best practices to take into consideration when deploying GitHub Self-Hosted Runners to your AWS environment.

Best Practices

Understand your security responsibilities

GitHub Self-hosted runners, by design, execute code defined within a GitHub repository, either through the workflow scripts or through the repository build process. You must understand that the security of your AWS runner execution environments is dependent upon the security of your GitHub implementation. Whilst a complete overview of GitHub security is outside the scope of this blog, I recommend that before you begin integrating your GitHub environment with your AWS environment, you review and understand at least the following GitHub security configurations.

  • Federate your GitHub users, and manage the lifecycle of identities through a directory.
  • Limit administrative privileges of GitHub repositories, and restrict who is able to administer permissions, write to repositories, modify repository configurations or install GitHub Apps.
  • Limit control over GitHub Actions runner registration and group settings
  • Limit control over GitHub workflows, and follow GitHub’s recommendations on using third-party actions
  • Do not allow public repositories access to self-hosted runners

Reduce security risk with short-lived AWS credentials

Make use of short-lived credentials wherever you can. They expire by default within 1 hour, and you do not need to rotate or explicitly revoke them. Short lived credentials are created by the AWS Security Token Service (STS). If you use federation to access your AWS account, assume roles, or use Amazon EC2 instance profiles and Amazon ECS task roles, you are using STS already!

In almost all cases, you do not need long-lived AWS Identity and Access Management (IAM) credentials (access keys) even for services that do not “run” on AWS – you can extend IAM roles to workloads outside of AWS without requiring you to manage long-term credentials. With GitHub Actions, we suggest you use OpenID Connect (OIDC). OIDC is a decentralized authentication protocol that is natively supported by STS using sts:AssumeRoleWithWebIdentity, GitHub and many other providers. With OIDC, you can create least-privilege IAM roles tied to individual GitHub repositories and their respective actions. GitHub Actions exposes an OIDC provider to each action run that you can utilize for this purpose.
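
As a minimal sketch of this pattern, assuming a hypothetical AWS account ID (111122223333), GitHub organization (my-org), and repository (my-repo), you could register the GitHub OIDC provider and create a repository-scoped role with the AWS CLI along these lines:

# One-time setup: register GitHub's OIDC provider in the account (thumbprint shown is a placeholder)
aws iam create-open-id-connect-provider \
  --url https://token.actions.githubusercontent.com \
  --client-id-list sts.amazonaws.com \
  --thumbprint-list 0000000000000000000000000000000000000000

# Trust policy that only lets workflows from my-org/my-repo assume the role
cat > trust-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {
      "Federated": "arn:aws:iam::111122223333:oidc-provider/token.actions.githubusercontent.com"
    },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": { "token.actions.githubusercontent.com:aud": "sts.amazonaws.com" },
      "StringLike": { "token.actions.githubusercontent.com:sub": "repo:my-org/my-repo:*" }
    }
  }]
}
EOF

# Create the role; attach only the least-privilege permissions the repository's workflows need
aws iam create-role \
  --role-name my-repo-github-actions-role \
  --assume-role-policy-document file://trust-policy.json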

Short lived AWS credentials with GitHub self-hosted runners

If you have many repositories that you wish to grant an individual role to, you may run into a hard limit of the number of IAM roles in a single account. While I advocate solving this problem with a multi-account strategy, you can alternatively scale this approach by:

  • using attribute-based access control (ABAC) to match claims in the GitHub token (such as repository name, branch, or team) to AWS resource tags.
  • using role-based access control (RBAC) by logically grouping the repositories in GitHub into teams or applications to create a smaller set of roles.
  • using an identity broker pattern to vend credentials dynamically based on the identity provided to the GitHub workflow.

Use Ephemeral Runners

Configure your GitHub Action runners to run in “ephemeral” mode, which creates (and destroys) individual short-lived compute environments per job on demand. The short environment lifespan and per-build isolation reduce the risk of data leakage, even in multi-tenanted continuous integration environments, as each build job remains isolated from others on the underlying host.

As each job runs on a new environment created on demand, there is no need for a job to wait for an idle runner, simplifying auto-scaling. With the ability to scale runners on demand, you do not need to worry about turning build infrastructure off when it is not needed (for example out of office hours), giving you a cost-efficient setup. To optimize the setup further, consider allowing developers to tag workflows with instance type tags and launch specific instance types that are optimal for respective workflows.
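
As a rough sketch, assuming the GitHub Actions runner software is already installed on the instance and you have generated a registration token for the hypothetical repository my-org/my-repo, an ephemeral runner can be registered and started like this:

# Register the runner for a single job; GitHub removes it automatically after the job completes
./config.sh --url https://github.com/my-org/my-repo \
  --token <registration-token> \
  --ephemeral \
  --unattended \
  --labels ephemeral,c5-large

# Start listening for the next queued job
./run.sh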

There are a few considerations to take into account when using ephemeral runners:

  • A job will remain queued until the runner EC2 instance has launched and is ready. This can take up to 2 minutes to complete. To speed up this process, consider using an optimized AMI with all prerequisites installed.
  • Since each job is launched on a fresh runner, utilizing caching on the runner is not possible. For example, Docker images and libraries will always be pulled from source.

Use Runner Groups to isolate your runners based on security requirements

By using ephemeral runners in a single GitHub runner group, you are creating a pool of resources in the same AWS account that are used by all repositories sharing this runner group. Your organizational security requirements may dictate that your execution environments must be isolated further, such as by repository or by environment (such as dev, test, prod).

Runner groups allow you to define the runners that will execute your workflows on a repository-by-repository basis. Creating multiple runner groups not only allows you to provide different types of compute environments, but also allows you to place your workflow executions in locations within AWS that are isolated from each other. For example, you may choose to locate your development workflows in one runner group and test workflows in another, with each ephemeral runner group being deployed to a different AWS account.

Runners by definition execute code on behalf of your GitHub users. At a minimum, I recommend that your ephemeral runner groups are contained within their own AWS account and that this AWS account has minimal access to other organizational resources. When access to organizational resources is required, this can be given on a repository-by-repository basis through IAM role assumption with OIDC, and these roles can be given least-privilege access to the resources they require.

Optimize runner start up time using Amazon EC2 warm-pools

Ephemeral runners provide strong build isolation, simplicity and security. Since the runners are launched on demand, the job will be required to wait for the runner to launch and register itself with GitHub. While this usually happens in under 2 minutes, this wait time might not be acceptable in some scenarios.

We can use a warm pool of pre-registered ephemeral runners to reduce the wait time. These runners will listen to the incoming GitHub workflow events actively and as soon as an incoming workflow event is queued, it is picked up readily by the warm pool of registered EC2 runners.

While there can be multiple strategies to manage the warm pool, I recommend the following strategy which uses AWS Lambda for scaling up and scaling down the ephemeral runners:

GitHub self-hosted runners warm pool flow

A GitHub workflow event is created on a trigger such as a push of code to a repository or a merge of a pull request. This event triggers a Lambda function via a webhook and an Amazon API Gateway endpoint. The Lambda function validates the GitHub workflow event payload and logs events for observability and metrics, and it can optionally be used to replenish the warm pool. Separate backend Lambda functions launch, scale up, and scale down the warm pool of EC2 instances. The EC2 instances, or runners, are registered with GitHub at launch time. The registered runners listen for incoming GitHub workflow events using GitHub’s internal job queue, and as soon as a workflow event is triggered, it is assigned by GitHub to one of the runners in the warm pool for job execution. The runner is automatically de-registered once the job completes. A job can be a build or deploy request as defined in your GitHub workflow.

With a warm pool in place, you can expect job wait times to drop by 70–80%.
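
As a minimal sketch of the scale-up step, assuming the warm pool is managed by an Auto Scaling group with the hypothetical name github-runner-warm-pool, the scale-up Lambda could simply raise the group’s desired capacity (shown here as the equivalent AWS CLI call):

# Increase the number of pre-registered runners waiting for jobs
aws autoscaling set-desired-capacity \
  --auto-scaling-group-name github-runner-warm-pool \
  --desired-capacity 5 \
  --honor-cooldown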

Considerations

  • Increased complexity, as there is a possibility of over-provisioning runners. This will depend on how long a runner EC2 instance takes to launch and reach a ready state, and how frequently the scale-up Lambda is configured to run. For example, if the scale-up Lambda runs every minute and the EC2 runner requires 2 minutes to launch, then the scale-up Lambda will launch 2 instances. The mitigation is to use Auto Scaling groups to manage the EC2 warm pool and desired capacity, with predictive scaling policies tied back to incoming GitHub workflow events (that is, build job requests).
  • This strategy may have to be revised when supporting Windows or Mac based runners given the spin up times can vary.

Use an optimized AMI to speed up the launch of GitHub self-hosted runners

Amazon Machine Images (AMI) provide a pre-configured, optimized image that can be used to launch the runner EC2 instance. By using AMIs, you will be able to reduce the launch time of a new runner since dependencies and tools are already installed. Consistency across builds is guaranteed due to all instances running the same version of dependencies and tools. Runners will benefit from increased stability and security compliance as images are tested and approved before being distributed for use as runner instances.

When building an AMI for use as a GitHub self-hosted runner the following considerations need to be made:

  • Choosing the right OS base image for the builds. This will depend on your tech stack and toolset.
  • Install the GitHub runner app as part of the image. Ensure automatic runner updates are enabled to reduce the overhead of managing runner versions. In case a specific runner version must be used, you can disable automatic runner updates to avoid untested changes. Keep in mind that, if disabled, a runner will need to be updated manually within 30 days of a new version becoming available. (A minimal installation sketch follows this list.)
  • Install build tools and dependencies from trusted sources.
  • Ensure runner logs are captured and forwarded to your security information and event management (SIEM) of choice.
  • The runner requires internet connectivity to access GitHub. This may require configuring proxy settings on the instance depending on your networking setup.
  • Configure any artifact repositories the runner requires. This includes sources and authentication.
  • Automate the creation of the AMI using tools such as EC2 Image Builder to achieve consistency.
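
For the runner application installation called out in the list above, here is a minimal sketch of a build step you might bake into the AMI; the runner version and installation path are illustrative placeholders, and in a real pipeline you would pin a version and verify the published checksum:

# Download and unpack the GitHub Actions runner application
RUNNER_VERSION=2.317.0
mkdir -p /opt/actions-runner && cd /opt/actions-runner
curl -L -o actions-runner.tar.gz \
  https://github.com/actions/runner/releases/download/v${RUNNER_VERSION}/actions-runner-linux-x64-${RUNNER_VERSION}.tar.gz
tar xzf actions-runner.tar.gz && rm actions-runner.tar.gz

# Install the OS-level dependencies the runner needs
./bin/installdependencies.sh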

Use Spot instances to save costs

The cost associated with scaling up the runners, as well as maintaining a warm pool, can be minimized using Spot Instances, which can result in savings of up to 90% compared to On-Demand prices. However, there could be requirements for longer-running builds or batch jobs that cannot tolerate a Spot Instance terminating with two minutes’ notice. In that case, a mixed pool of instances is a good option: route such jobs to On-Demand EC2 instances and the rest to Spot Instances to cater for diverse build needs. This can be done by assigning labels to the runner during launch/registration. For the On-Demand instances, you can put a Savings Plan in place to get cost benefits.
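
As an illustrative sketch (the AMI and subnet IDs are placeholders), a runner instance can be launched onto Spot capacity with the AWS CLI, tagging it so that the registration step can apply a matching runner label:

# Launch a one-time Spot Instance for a build runner
aws ec2 run-instances \
  --image-id ami-0123456789abcdef0 \
  --instance-type c5.large \
  --subnet-id subnet-0123456789abcdef0 \
  --instance-market-options 'MarketType=spot,SpotOptions={SpotInstanceType=one-time}' \
  --tag-specifications 'ResourceType=instance,Tags=[{Key=runner-label,Value=spot}]'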

Record runner metrics using Amazon CloudWatch for Observability

It is vital for the observability of the overall platform to generate metrics for the EC2-based GitHub self-hosted runners. Examples of GitHub runner metrics include the number of GitHub workflow events queued or completed in a minute, or the number of EC2 runners up and available in the warm pool.

We can log the triggered workflow events and runner logs in Amazon CloudWatch and then use CloudWatch embedded metrics to collect metrics such as the number of workflow events queued, in progress, and completed. Using elements such as the “started_at” and “completed_at” timings, which are part of the workflow event payload, we can calculate build wait time.

As an example, below is a sample incoming GitHub workflow event logged in Amazon CloudWatch Logs:

{
  "hostname": "xxx.xxx.xxx.xxx",
  "requestId": "aafddsd55-fgcf555",
  "date": "2022-10-11T05:50:35.816Z",
  "logLevel": "info",
  "logLevelId": 3,
  "filePath": "index.js",
  "fullFilePath": "/var/task/index.js",
  "fileName": "index.js",
  "lineNumber": 83889,
  "columnNumber": 12,
  "isConstructor": false,
  "functionName": "handle",
  "argumentsArray": [
    "Processing Github event",
    "{\"event\":\"workflow_job\",\"repository\":\"testorg-poc/github-actions-test-repo\",\"action\":\"queued\",\"name\":\"jobname-buildanddeploy\",\"status\":\"queued\",\"started_at\":\"2022-10-11T05:50:33Z\",\"completed_at\":null,\"conclusion\":null}"
  ]
}

To turn the elements of the preceding log entry into metrics – for example, by capturing "status":"queued", "repository":"testorg-poc/github-actions-test-repo", "name":"jobname-buildanddeploy", and the workflow "event" – you can build embedded metrics in the application code, or in a metrics Lambda function, using any of the CloudWatch metrics client libraries (see Creating logs in embedded metric format using the client libraries – Amazon CloudWatch) in the language of your choice.

Essentially, what one of those libraries does under the hood is map elements from the log event into dimension fields so that CloudWatch can read them and generate a metric.

console.log(
  JSON.stringify({
    message: '[Embedded Metric]', // Identifier for metric logs in CW logs
    build_event_metric: 1, // Metric name and value
    status: `${status}`, // Dimension name and value
    eventName: `${eventName}`,
    repository: `${repository}`,
    name: `${name}`,
    _aws: {
      Timestamp: Date.now(),
      CloudWatchMetrics: [
        {
          Namespace: `demo_2`,
          Dimensions: [['status', 'eventName', 'repository', 'name']],
          Metrics: [
            {
              Name: 'build_event_metric',
              Unit: 'Count',
            },
          ],
        },
      ],
    },
  })
);

A sample architecture:

Consumption of GitHub webhook events

The CloudWatch metrics can be published to your dashboards or forwarded to any external tool based on requirements. Once you have metrics, CloudWatch alarms and notifications can be configured to manage pool exhaustion.

Conclusion

In this blog post, we outlined several best practices covering security, scalability, and cost efficiency when using GitHub Actions with EC2 self-hosted runners on AWS. We covered how using short-lived credentials combined with ephemeral runners reduces security and build-contamination risks. We also showed how runners can be optimized for faster startup and job execution using optimized AMIs and warm EC2 pools. Last but not least, cost efficiencies can be maximized by using Spot Instances for runners in the right scenarios.

Resources:

AWS Weekly Roundup: Amazon EC2 U7i Instances, Bedrock Converse API, AWS World IPv6 Day and more (June 3, 2024)

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-ec2-u7i-instances-bedrock-converse-api-aws-world-ipv6-day-and-more-june-3-2024/

Life is not always happy; there are difficult times. However, we can share our joys and sufferings with those we work with. The AWS Community is no exception.

Jeff Barr introduced two members of the AWS community who are dealing with health issues. Farouq Mousa is an AWS Community Builder who is fighting brain cancer. Allen Helton is an AWS Serverless Hero whose young daughter is fighting leukemia.

Please donate to support Farouq and Olivia, Allen’s daughter, as they fight their diseases.

Last week’s launches
Here are some launches that got my attention:

Amazon EC2 high memory U7i Instances – These instances with up to 32 TiB of DDR5 memory and 896 vCPUs are powered by custom fourth generation Intel Xeon Scalable Processors (Sapphire Rapids). These high memory instances are designed to support large, in-memory databases including SAP HANA, Oracle, and SQL Server. To learn more, visit Jeff’s blog post.

New Amazon Connect analytics data lake – You can use a single source for contact center data including contact records, agent performance, Contact Lens insights, and more — eliminating the need to build and maintain complex data pipelines. Your organization can create your own custom reports using Amazon Connect data or combine data queried from third-party sources. To learn more, visit Donnie’s blog post.

Amazon Bedrock Converse API – This API provides developers a consistent way to invoke Amazon Bedrock models, removing the complexity of adjusting for model-specific differences such as inference parameters. With this API, you can write code once and use it seamlessly with different models in Amazon Bedrock. To learn more, visit Dennis’s blog post to get started.

New Document widget for PartyRock – With PartyRock, you can build, use, and share generative AI-powered apps for fun and for boosting personal productivity. Its widgets display content, accept input, connect with other widgets, and generate outputs like text, images, and chats using foundation models. You can now use the new document widget to integrate text content from files and documents directly into a PartyRock app.

30 days of alarm history in Amazon CloudWatch – You can view the history of your alarm state changes for up to 30 days prior. Previously, CloudWatch provided 2 weeks of alarm history. This extended history makes it easier to observe past behavior and review incidents over a longer period of time. To learn more, visit the CloudWatch alarms documentation section.

10x faster startup time in Amazon SageMaker Canvas – You can launch SageMaker Canvas in less than a minute and get started with your visual, no-code interface for machine learning 10x faster than before. Now, all new user profiles created in existing or new SageMaker domains can experience this accelerated startup time.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional news items and a Twitch show that you might find interesting:

Let us manage your relational database! – Jeff Barr ran a poll to better understand why some AWS customers still choose to host their own databases in the cloud. Working backwards, he highlights four issues that AWS managed database services address. Consider these before hosting your own database.

Amazon Bedrock Serverless Prompt Chaining – This repository provides examples of using AWS Step Functions and Amazon Bedrock to build complex, serverless, and highly scalable generative AI applications with prompt chaining.

AWS Merch Store Spring Sale – Do you want to buy AWS branded t-shirts, hats, bags, and so on? Get 15% off on all items now through June 7th.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS World IPv6 Day — Join us for a free in-person celebration event on June 6 for technical presentations from AWS experts, plus a workshop and whiteboarding session. You will learn how to get started with IPv6 and hear from customers who have started on the journey of IPv6 adoption. Check out your nearest city: San Francisco, Seattle, New York, London, Mumbai, Bangkok, Singapore, Kuala Lumpur, Beijing, Manila, and Sydney.

AWS Summits — Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Stockholm (June 4), Madrid (June 5), and Washington, DC (June 26–27).

AWS re:Inforce — Join us for AWS re:Inforce (June 10–12) in Philadelphia, PA. AWS re:Inforce is a learning conference focused on AWS security solutions, cloud security, compliance, and identity. Connect with the AWS teams that build the security tools and meet AWS customers to learn about their security journeys.

AWS Community Days — Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Midwest | Columbus (June 13), Sri Lanka (June 27), Cameroon (July 13), New Zealand (August 15), Nigeria (August 24), and New York (August 28).

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Channy

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Implementing network traffic inspection on AWS Outposts rack

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/implementing-network-traffic-inspection-on-aws-outposts-rack/

This blog post is written by Brian Daugherty, Principal Solutions Architect; Enrico Liguori, Solutions Architect, Networking; and Sedji Gaouaou, Senior Solutions Architect, Hybrid Cloud.

Network traffic inspection on AWS Outposts rack is a crucial aspect of ensuring security and compliance within your on-premises environment. With network traffic inspection, you can gain visibility into the data flowing in and out of your Outposts rack environment, enabling you to detect and mitigate potential threats proactively.

By deploying AWS partner solutions on Outposts rack, you can take advantage of their expertise and specialized capabilities to gain insights into network traffic patterns, identify and mitigate threats, and help ensure compliance with industry-specific regulations and standards. This includes advanced network traffic inspection capabilities, such as deep packet inspection, intrusion detection and prevention, application-level firewalling, and advanced threat detection.

This post presents an example architecture of deploying a firewall appliance on an Outposts rack to perform on-premises to Virtual Private Cloud (VPC) and VPC-to-VPC inline traffic inspection.

Architecture

The example traffic inspection architecture illustrated in the following diagram is built using a common Outposts rack deployment pattern.

In this example, an Outpost rack is deployed on premises to support:

  • Manufacturing/operational technologies (OT) applications that need low latency between OT servers and devices
  • Information technology (IT) applications that are subject to strict data residency and data protection policies

Separate VPCs, which can be owned by different AWS accounts, and subnets are created for the IT and OT departments’ instances (see 1 and 2 in the diagram).

Organizational security policies require that traffic flowing to and from the Outpost and the site, and between VPCs on the Outpost, be inspected, controlled, and logged using a centralized firewall.

In an AWS Region, it is possible to implement a centralized traffic inspection architecture using routing services such as AWS Transit Gateway (TGW) or Gateway Load Balancer (GWLB) to route traffic to a central firewall, but these services are not available on Outposts.

On Outposts, some use the Local Gateway (LGW) to implement a distributed traffic inspection architecture with firewalls deployed in each VPC, but this can be operationally complex and cost prohibitive.

In this post, you will learn how to use a recently introduced feature – Multi-VPC Elastic Network Interface (ENI) Attachments – to create a centralized traffic inspection architecture on Outposts. Using Multi-VPC Attached ENIs, you can attach ENIs created in subnets that are owned and managed by other VPCs (even VPCs in different accounts) to an Amazon Elastic Compute Cloud (Amazon EC2) instance.

Specifically, you can create ENIs in the IT and OT subnets that can be shared with a centralized firewall (see 3 and 4).

Because it’s a best practice to minimize the attack surface of a centralized firewall through isolation, the example includes a VPC and subnet created solely for the firewall instance (see 5).

To protect traffic flowing to and from the IT, OT, and firewall VPCs and on-site networks, another ‘Exposed’ VPC, subnet (see 6), and ENI (see 7) are created. These are the only resources associated with the Outposts Local Gateway (LGW) and ‘exposed’ to on-site networks.

In the example, traffic is routed from the IT and OT VPCs using a default route that points to the ENI used by the firewall (see 8 and 9). The firewall can route traffic back to the IT and OT VPCs, as allowed by policy, through its directly connected interfaces.

The firewall uses a route for the on-site network (192.168.30.0/24) – or a default route – pointing to the gateway associated with the exposed ENI (eni11, 172.16.2.1 – see 10).

To complete the routing between the IT, OT, and firewall VPCs and the on-site networks, static routes are added to the LGW route table pointing to the firewall’s exposed ENI as the next hop (see 11).

Once these static routes are inserted, the Outposts Ingress Routing feature will trigger the routes to be advertised toward the on-site layer-3 switch using BGP.

Likewise, the on-site layer-3 switch will advertise a route (see 12) for 192.168.30.0/24 (or a default route) over BGP to the LGW, completing end-to-end routing between on-site networks and the IT and OT VPCs through the centralized firewall.

The following diagram shows an example of packet flow between an on-site OT device and the OT server, inspected by the firewall:

Implementation on AWS Outposts rack

The following implementation details are essential for our example traffic inspection on the Outposts rack architecture.

Prerequisites

The following prerequisites are required:

  • Deployment of an Outpost on premises;
  • Creation of four VPCs – Exposed, firewall, IT, and OT;
  • Creation of private subnets in each of the four VPCs where ENIs and instances can be created;
  • Creation of ENIs in each of the four private subnets for attachment to the firewall instance (keep track of the ENI IDs);
  • If needed, sharing the subnets and ENIs with the firewall account, using AWS Resource Access Manager (AWS RAM);
  • Association of the Exposed VPC to the LGW.

Firewall selection and sizing

Although in this post a basic Linux instance is deployed and configured as the firewall, in the Network Security section of the AWS Marketplace, you can find several sophisticated, powerful, and manageable AWS Partner solutions that perform deep packet inspection.

Most network security marketplace offerings provide guidance on capabilities and expected performance and pricing for specific appliance instance sizes.

Firewall instance selection

Currently, an Outpost rack can be configured with EC2 instances in the M5, C5, R5, and G4dn families. As a user, you can select the size and number of instances available on an Outpost to match your requirements.

When selecting an EC2 instance for use as a centralized firewall it is important to consider the following:

  • Performance recommendations for instance types and sizes made by the firewall appliance partner;
  • The number of VPCs that are inspected by the firewall appliance;
  • The availability of instances on the Outpost.

For example, after evaluating the partner recommendations you may determine that an instance size of c5.large, r5.large, or larger provides the required performance.

Next, you can use the following AWS Command Line Interface (AWS CLI) command to identify the EC2 instances configured on an Outpost:

aws outposts get-outpost-instance-types \
--outpost-id op-abcdefgh123456789

The output of this command lists the instance types and sizes configured on your Outpost:

InstanceTypes:
- InstanceType: c5.xlarge
- InstanceType: c5.4xlarge
- InstanceType: r5.2xlarge
- InstanceType: r5.4xlarge

With knowledge of the instance types and sizes installed on your Outpost, you can now determine if any of these are available. The following AWS CLI command – one for each of the preceding instance types – lists the number of each instance type and size available for use. For example:

aws cloudwatch get-metric-statistics \
--namespace AWS/Outposts \
--metric-name AvailableInstanceType_Count \
--statistics Average --period 3600 \
--start-time $(date -u -Iminutes -d '-1hour') \
--end-time $(date -u -Iminutes) \
--dimensions \
Name=OutpostId,Value=op-abcdefgh123456789 \
Name=InstanceType,Value=c5.xlarge

This command returns:

Datapoints:
- Average: 2.0
  Timestamp: '2024-04-10T10:39:00+00:00'
  Unit: Count
Label: AvailableInstanceType_Count

The output indicates that there are (on average) two c5.xlarge instances available on this Outpost in the specified time period (1 hour). The same steps for the other instance types suggest that there are also two c5.4xlarge, two r5.2xlarge, and no r5.4xlarge available.

Next, consider the number of VPCs to be connected to the firewall and determine if the instances available support the required number of ENIs.

The firewall requires an ENI in its own VPC, one in the Exposed VPC, and one for each additional VPC. In this post, because there is a VPC for IT and one for OT, you need an EC2 instance that supports four interfaces in total.

To determine the number of supported interfaces for each available instance type and size, let’s use the AWS CLI:

aws ec2 describe-instance-types \
--instance-types c5.xlarge c5.4xlarge r5.2xlarge \
--query 'InstanceTypes[].[InstanceType,NetworkInfo.NetworkCards]'

This returns:

- - r5.2xlarge
  - - BaselineBandwidthInGbps: 2.5
      MaximumNetworkInterfaces: 4
      NetworkCardIndex: 0
      NetworkPerformance: Up to 10 Gigabit
      PeakBandwidthInGbps: 10.0
- - c5.xlarge
  - - BaselineBandwidthInGbps: 1.25
      MaximumNetworkInterfaces: 4
      NetworkCardIndex: 0
      NetworkPerformance: Up to 10 Gigabit
      PeakBandwidthInGbps: 10.0
- - c5.4xlarge
  - - BaselineBandwidthInGbps: 5.0
      MaximumNetworkInterfaces: 8
      NetworkCardIndex: 0
      NetworkPerformance: Up to 10 Gigabit
      PeakBandwidthInGbps: 10.0

The output suggests that the three available EC2 instances (r5.2xlarge, c5.xlarge and c5.4xlarge) can support four network interfaces. The output also suggests that the c5.4xlarge instance, for example, supports up to 8 network interfaces and a maximum bandwidth of 10Gb/s. This helps you plan for the potential growth in network requirements.

Attaching remote ENIs to the firewall instance

With the firewall instance deployed in the firewall VPC, the next step is to attach the remote ENIs created previously in the Exposed, OT, and IT subnets. Using the firewall instance ID and the Network Interface IDs for each of the remote ENIs, you can create the Multi-VPC Attached ENIs to connect the firewall to the other VPCs. Each attached interface needs a unique device-index greater than ‘0’, since index ‘0’ is the primary instance interface.

For example, to connect the Exposed VPC ENI:

aws ec2 attach-network-interface --device-index 1 \
--instance-id i-0e47e6eb9873d1234 \
--network-interface-id eni-012a3b4cd5efghijk \
--region us-west-2

Attach the OT and IT ENIs while incrementing the device-index and using the respective unique ENI IDs:

aws ec2 attach-network-interface --device-index 2 \
--instance-id i-0e47e6eb9873d1234 \
--network-interface-id eni-0bbe1543fb0bdabff \
--region us-west-2
aws ec2 attach-network-interface --device-index 3 \
--instance-id i-0e47e6eb9873d1234 \
--network-interface-id eni-0bbe1a123b0bdabde \
--region us-west-2

After attaching each remote ENI, the firewall instance now has an interface and IP address in each VPC used in this example architecture:

ubuntu@firewall:~$ ip address

ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    inet 10.240.4.10/24 metric 100 brd 10.240.4.255 scope global dynamic ens5

ens6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    inet 10.242.0.50/24 metric 100 brd 10.242.0.255 scope global dynamic ens6

ens7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    inet 10.244.76.51/16 metric 100 brd 10.244.255.255 scope global dynamic ens7

ens11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    inet 172.16.2.7/24 metric 100 brd 172.16.2.255 scope global dynamic ens11

Updating the VPC/subnet route tables

You can now add the routes needed to allow traffic to be inspected to flow through the firewall.

For example, the OT subnet (10.242.0.0/24) uses a route table with the ID rtb-abcdefgh123456789. To send the traffic through the firewall, you need to add a default route with the target being the ENI (eni-07957a9f294fdbf5d) that is now attached to the firewall:

aws ec2 create-route --route-table-id rtb-abcdefgh123456789 \
--destination-cidr-block 0.0.0.0/0 \
--network-interface-id eni-07957a9f294fdbf5d

You can follow the same process to add a default route to the IT VPC/subnet.

With routing established from the IT and OT VPCs to the firewall, you need to make sure that the firewall uses the Exposed VPC to route traffic toward the on-premises network 192.168.30.0/24. This is done by adding a route within the firewall OS using the VPC gateway as a next hop.

The ENI attached to the firewall from the Exposed VPC is in subnet 172.16.2.0/28, and the gateway used by this subnet is, by Amazon Virtual Private Cloud (VPC) convention, the first address in the subnet – 172.16.2.1. This is used when updating the firewall OS route table:

sudo ip route add 192.168.30.0/24 via 172.16.2.1

You can now confirm that the firewall OS has routes to each attached subnet and to the on-premises subnet:

ubuntu@firewall:~$ ip route
default via 10.240.4.1 dev ens5 proto dhcp src 10.240.4.10 metric 100
10.240.0.2 via 10.240.4.1 dev ens5 proto dhcp src 10.240.4.10 metric 100
10.240.4.0/24 dev ens5 proto kernel scope link src 10.240.4.10 metric 100
10.240.4.1 dev ens5 proto dhcp scope link src 10.240.4.10 metric 100
10.242.0.0/24 dev ens6 proto kernel scope link src 10.242.0.50 metric 100
10.242.0.2 dev ens6 proto dhcp scope link src 10.242.0.50 metric 100
10.244.0.0/16 dev ens7 proto kernel scope link src 10.244.76.51 metric 100
10.244.0.2 dev ens7 proto dhcp scope link src 10.244.76.51 metric 100
172.16.2.0/24 dev ens11 proto kernel scope link src 172.16.2.7 metric 100
172.16.2.2 dev ens11 proto dhcp scope link src 172.16.2.7 metric 100
192.168.30.0/24 via 172.16.2.1 dev ens11

The final step in establishing end-to-end routing is to make sure that the LGW route table contains static routes for the firewall, IT, and OT VPCs. These routes target the ENIs used by the firewall in the Exposed VPC.

After gathering the LGW route table ID and the firewall’s Exposed ENI ID, you can now add routes toward the firewall VPC:

aws ec2 create-local-gateway-route \
    --local-gateway-route-table-id lgw-rtb-abcdefgh123456789 \
    --network-interface-id eni-0a2e4f68f323022c3 \
    --destination-cidr-block 10.240.0.0/16

Repeat this command for the OT and IT VPC CIDRs – 10.242.0.0/16 and 10.244.0.0/16, respectively.

You can query the LGW route table to make sure that each of the static routes was inserted:

aws ec2 search-local-gateway-routes \
    --local-gateway-route-table-id lgw-rtb-abcdefgh123456789 \
    --filters "Name=type,Values=static"

This returns:

Routes:

- DestinationCidrBlock: 10.240.0.0/16
  LocalGatewayRouteTableId: lgw-rtb-abcdefgh123456789
  NetworkInterfaceId: eni-0a2e4f68f323022c3
  State: active
  Type: static

- DestinationCidrBlock: 10.242.0.0/16
  LocalGatewayRouteTableId: lgw-rtb-abcdefgh123456789
  NetworkInterfaceId: eni-0a2e4f68f323022c3
  State: active
  Type: static

- DestinationCidrBlock: 10.244.0.0/16
  LocalGatewayRouteTableId: lgw-rtb-abcdefgh123456789
  NetworkInterfaceId: eni-0a2e4f68f323022c3
  State: active
  Type: static

With the addition of these static routes the LGW begins to advertise reachability to the firewall, OT, and IT Classless Inter-Domain Routing (CIDR) blocks over the BGP neighborship. The CIDR for the Exposed VPC is already advertised because it is associated directly to the LGW.

The firewall now has full visibility of the traffic and can apply the monitoring, inspection, and security profiles defined by your organization.

Other considerations

  • It is important to follow the best practices specified by the Firewall Appliance Partner to fully secure the appliance. In the example architecture, access to the firewall console is restricted to AWS Session Manager.
  • The commands used previously to create/update the Outpost/LGW route tables need an account with full privileges to administer the Outpost.

Fault tolerance

As a crucial component of the infrastructure, the firewall instance needs a mechanism for automatic recovery from failures. One effective approach is to deploy the firewall instances within an Auto Scaling group, which can automatically replace unhealthy instances with new, healthy ones. In addition, using a host- or rack-level spread placement group makes sure that your instances are deployed on distinct underlying hardware. This enables high availability and minimizes downtime. Furthermore, this approach based on Auto Scaling can be implemented regardless of the specific third-party product used.

To ensure a seamless transition when Auto Scaling replaces an unhealthy firewall instance, it is essential that the multi-VPC ENIs responsible for receiving and forwarding traffic are automatically attached to the new instance. Re-using the same multi-VPC ENIs ensures that no changes are required in the subnet and LGW route tables.

To re-attach the same multi-VPC ENIs to the new instance, you can use Auto Scaling lifecycle hooks, which let you pause the instance replacement process and perform custom actions.
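
As a minimal sketch (resource names and IDs are placeholders), a launch lifecycle hook pauses the replacement instance so that automation can re-attach the multi-VPC ENIs before the instance goes into service:

# Pause newly launched instances until the ENI re-attachment completes
aws autoscaling put-lifecycle-hook \
  --lifecycle-hook-name reattach-firewall-enis \
  --auto-scaling-group-name firewall-asg \
  --lifecycle-transition autoscaling:EC2_INSTANCE_LAUNCHING \
  --heartbeat-timeout 300 \
  --default-result ABANDON

# Automation (for example, a Lambda function) re-attaches the multi-VPC ENIs to the new instance ...
aws ec2 attach-network-interface --device-index 1 \
  --instance-id i-0new0instance000000 \
  --network-interface-id eni-012a3b4cd5efghijk

# ... and then resumes the Auto Scaling action
aws autoscaling complete-lifecycle-action \
  --lifecycle-hook-name reattach-firewall-enis \
  --auto-scaling-group-name firewall-asg \
  --lifecycle-action-result CONTINUE \
  --instance-id i-0new0instance000000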

After re-attaching the multi-VPC ENIs to the instance, the last step is to restore the configuration of the firewall from a backup.

Conclusion

In this post, you have learned how to implement on-premises to VPC and VPC-to-VPC inline traffic inspection on Outposts rack with a centralized firewall deployment. This architecture requires a VPC for the firewall instance itself, an Exposed VPC connecting to your on-premises network, and one or more VPCs for your workloads running on the Outpost. You can either use a basic Linux instance as a router, or choose from the advanced AWS Partner solutions in the Network Security section of the AWS Marketplace and follow the respective guidance on firewall instance selection. With multi-VPC ENI attachments, you can create network traffic routing between VPCs and forward traffic to the centralized firewall for inspection. In addition, you can use Auto Scaling groups, spread placement groups, and Auto Scaling lifecycle hooks to enable high availability and fault tolerance for your firewall instance.

If you want to learn more about network security on AWS, visit: Network Security on AWS.

Amazon EC2 high memory U7i Instances for large in-memory databases

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-ec2-high-memory-u7i-instances-for-large-in-memory-databases/

Announced in preview form at re:Invent 2023, Amazon Elastic Compute Cloud (Amazon EC2) U7i instances with up to 32 TiB of DDR5 memory and 896 vCPUs are now available. Powered by custom fourth generation Intel Xeon Scalable Processors (Sapphire Rapids), these high memory instances are designed to support large, in-memory databases including SAP HANA, Oracle, and SQL Server. Here are the specs:

Instance Name         vCPUs   Memory (DDR5)   EBS Bandwidth   Network Bandwidth
u7i-12tb.224xlarge    896     12,288 GiB      60 Gbps         100 Gbps
u7in-16tb.224xlarge   896     16,384 GiB      100 Gbps        200 Gbps
u7in-24tb.224xlarge   896     24,576 GiB      100 Gbps        200 Gbps
u7in-32tb.224xlarge   896     32,768 GiB      100 Gbps        200 Gbps

The new instances deliver the best compute price performance for large in-memory workloads, and offer the highest memory and compute power of any SAP-certified virtual instance from a leading cloud provider.

Thanks to the AWS Nitro System, all of the memory on the instance is available for use. For example, here’s the 32 TiB instance:

In comparison to the previous generation of EC2 High Memory instances, the U7i instances offer more than 135% of the compute performance, up to 115% more memory performance, and 2.5x the EBS bandwidth. This increased bandwidth allows you to transfer 30 TiB of data from EBS into memory in an hour or less, making data loads and cache refreshes faster than ever before. The instances also support ENA Express with 25 Gbps of bandwidth per flow, and provide an 85% improvement in P99.9 latency between instances.

Each U7i instance supports attachment of up to 128 General Purpose (gp2 and gp3) or Provisioned IOPS (io1 and io2 Block Express) EBS volumes. Each io2 Block Express volume can be as big as 64 TiB and can deliver up to 256K IOPS at up to 32 Gbps, making them a great match for U7i instances.
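
As a quick illustration (the volume and instance IDs and the Availability Zone are placeholders), creating and attaching a maximum-size io2 Block Express volume with the AWS CLI would look something like this:

# Create a 64 TiB io2 volume provisioned at 256,000 IOPS
aws ec2 create-volume \
  --volume-type io2 \
  --size 65536 \
  --iops 256000 \
  --availability-zone us-east-1a

# Attach it to the U7i instance
aws ec2 attach-volume \
  --volume-id vol-0123456789abcdef0 \
  --instance-id i-0123456789abcdef0 \
  --device /dev/sdf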

The instances are SAP certified to run Business Suite on HANA, Business Suite S/4HANA, Business Warehouse on HANA (BW), and SAP BW/4HANA in production environments. To learn more, consult the Certified and Supported SAP HANA Hardware and the SAP HANA to AWS Migration Guide. Also, be sure to take a look at the AWS Launch Wizard for SAP.

Things to Know
Here are a couple of things that you should know about these new instances:

Regions – U7i instances are available in the US East (N. Virginia), US West (Oregon), and Asia Pacific (Seoul, Sydney) AWS Regions.

Operating Systems – Supported operating systems include Amazon Linux, Red Hat Enterprise Linux, SUSE Linux Enterprise Server, Ubuntu, and Windows Server.

Larger Instances – We are also working on offering even larger instances later this year with increased compute to meet our customer needs.

Jeff;

AWS Weekly Roundup – Application Load Balancer IPv6, Amazon S3 pricing update, Amazon EC2 Flex instances, and more (May 20, 2024)

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-application-load-balancer-ipv6-amazon-s3-pricing-update-amazon-ec2-flex-instances-and-more-may-20-2024/

AWS Summit season is in full swing around the world, with last week’s events in Bengaluru, Berlin, and Seoul, where my blog colleague Channy delivered one of the keynotes.

AWS Summit Seoul Keynote

Last week’s launches
Here are some launches that got my attention:

Amazon S3 will no longer charge for several HTTP error codes – A customer reported being charged for Amazon S3 API requests they didn’t initiate and which resulted in AccessDenied errors. The Amazon Simple Storage Service (Amazon S3) service team updated the service so that such API requests are no longer charged. As always when talking about pricing, the exact wording is important, so please read the What’s New post for the details.

Introducing Amazon EC2 C7i-flex instances – These instances deliver up to 19 percent better price performance compared to C6i instances. Using C7i-flex instances is the easiest way to get price performance benefits for a majority of compute-intensive workloads. The new instances are powered by the 4th generation Intel Xeon Scalable custom processors (Sapphire Rapids) that are available only on AWS, and offer 5 percent lower prices compared to C7i.

Application Load Balancer launches IPv6-only support for internet clients – Application Load Balancer now allows customers to provision load balancers without IPv4 addresses for clients that can connect using just IPv6. To connect, clients resolve the AAAA DNS records assigned to the Application Load Balancer. The load balancer remains dual stack for communication between the load balancer and its targets. With this new capability, you have the flexibility to use IPv4 or IPv6 for your application targets while avoiding IPv4 charges for clients that don’t require it.
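
As a quick way to see this from the client side, you can check that the load balancer’s DNS name resolves to IPv6 addresses. The DNS name below is a hypothetical placeholder.

# Resolve the AAAA records of an IPv6-only Application Load Balancer (hypothetical name)
dig my-ipv6-alb-1234567890.eu-west-1.elb.amazonaws.com AAAA +short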

Amazon VPC Lattice now supports TLS Passthrough – We announced the general availability of TLS passthrough for Amazon VPC Lattice, which allows customers to enable end-to-end authentication and encryption using their existing TLS or mTLS implementations. Prior to this launch, VPC Lattice supported HTTP and HTTPS listener protocols only, which terminates TLS and performs request-level routing and load balancing based on information in HTTP headers.

Amazon DocumentDB zero-ETL integration with Amazon OpenSearch Service – This new integration provides you with advanced search capabilities, such as fuzzy search, cross-collection search and multilingual search, on your Amazon DocumentDB (with MongoDB compatibility) documents using the OpenSearch API. With a few clicks in the AWS Management Console, you can now synchronize your data from Amazon DocumentDB to Amazon OpenSearch Service, eliminating the need to write any custom code to extract, transform, and load the data.

Amazon EventBridge now supports customer managed keys (CMK) for event buses – This capability allows you to encrypt your events using your own keys instead of an AWS owned key (which is used by default). With support for CMK, you now have more fine-grained security control over your events, satisfying your company’s security requirements and governance policies.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional news items, open source projects, and Twitch shows that you might find interesting:

The Four Pillars of Managing Email Reputation – Dustin Taylor is the manager of anti-abuse and email deliverability for Amazon Simple Email Service (SES). He wrote a remarkable post exploring the Amazon SES approach to managing domain and IP reputation; maintaining a high reputation ensures optimal recipient inboxing. His post outlines how Amazon SES protects its network reputation to help you deliver high-quality email consistently. A worthy read, even if you’re not sending email at scale. I learned a lot.

Build On Generative AI – Season 3 of your favorite weekly Twitch show about all things generative artificial intelligence (AI) is in full swing! Streaming every Monday, 9:00 AM US PT, my colleagues Tiffany and Darko discuss different aspects of generative AI and invite guest speakers to demo their work.

AWS open source news and updates – My colleague Ricardo writes this weekly open source newsletter, in which he highlights new open source projects, tools, and demos from the AWS Community.

Upcoming AWS events

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Hong Kong (May 22), Milan (May 23), Stockholm (June 4), and Madrid (June 5).

AWS re:Inforce – Explore 2.5 days of immersive cloud security learning in the age of generative AI at AWS re:Inforce, June 10–12 in Pennsylvania.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Midwest | Columbus (June 13), Sri Lanka (June 27), Cameroon (July 13), Nigeria (August 24), and New York (August 28).

Browse all upcoming AWS led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— seb

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

New compute-optimized (C7i-flex) Amazon EC2 Flex instances

Post Syndicated from Matheus Guimaraes original https://aws.amazon.com/blogs/aws/new-compute-optimized-c7i-flex-amazon-ec2-flex-instances/

The vast majority of applications don’t run the CPU flat-out at 100% utilization continuously. Take a web application, for instance. It typically fluctuates between periods of high and low demand, but hardly ever uses a server’s compute capacity in full.

CPU utilization for many common workloads that customers run in the AWS Cloud today: low to moderate most of the time, with occasional peaks. (source: AWS Documentation)

One easy and cost-effective way to run such workloads is to use the Amazon EC2 M7i-flex instances, which we introduced last August. These are lower-priced variants of the Amazon EC2 M7i instances that offer the same next-generation specs for general purpose compute in the most popular sizes, with the added benefit of better price/performance if you don’t need full compute power 100 percent of the time. This makes them a great first choice if you are looking to reduce your running cost while meeting the same performance benchmarks.

This flexibility resonated really well with customers, so today we are expanding our Flex portfolio by launching Amazon EC2 C7i-flex instances, which offer similar price/performance benefits and lower costs for compute-intensive workloads. These are lower-priced variants of the Amazon EC2 C7i instances that offer a baseline level of CPU performance with the ability to scale up to the full compute performance 95% of the time.

C7i-flex instances
C7i-flex offers five of the most common sizes from large to 8xlarge, delivering 19 percent better price performance than Amazon EC2 C6i instances.

Instance name      vCPU   Memory (GiB)   Instance storage   Network bandwidth (Gbps)   EBS bandwidth (Gbps)
c7i-flex.large     2      4              EBS-only           up to 12.5                 up to 10
c7i-flex.xlarge    4      8              EBS-only           up to 12.5                 up to 10
c7i-flex.2xlarge   8      16             EBS-only           up to 12.5                 up to 10
c7i-flex.4xlarge   16     32             EBS-only           up to 12.5                 up to 10
c7i-flex.8xlarge   32     64             EBS-only           up to 12.5                 up to 10
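
If you want to try one out, launching a C7i-flex instance works exactly like launching any other instance type. The following AWS CLI sketch uses hypothetical AMI and subnet IDs.

# Launch a single c7i-flex.large instance (hypothetical AMI and subnet IDs)
aws ec2 run-instances \
  --instance-type c7i-flex.large \
  --image-id ami-0123456789abcdef0 \
  --subnet-id subnet-0123456789abcdef0 \
  --count 1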

Should I use C7i-flex or C7i?
Both C7i-flex and C7i are compute-optimized instances powered by custom 4th Generation Intel Xeon Scalable processors, which are only available at Amazon Web Services (AWS). They offer up to 15 percent better performance over comparable x86-based Intel processors used by other cloud providers.

They both also use DDR5 memory, feature a 2:1 ratio of memory to vCPU, and are ideal for running applications such as web and application servers, databases, caches, Apache Kafka, and Elasticsearch.

So why would you use one over the other? Here are three things to consider when deciding which one is right for you.

Usage pattern
EC2 flex instances are a great fit for when you don’t need to fully utilize all compute resources.

You can achieve 5 percent better price performance and 5 percent lower prices due to efficient use of compute resources. This is a great fit for most applications, so C7i-flex instances should be the first choice for compute-intensive workloads.

However, if your application requires continuous high CPU usage, then you should use C7i instances instead. They are likely more suitable for workloads such as batch processing, distributed analytics, high performance computing (HPC), ad serving, highly scalable multiplayer gaming, and video encoding.

Instance sizes
C7i-flex instances offer the most common sizes used by a majority of workloads going up to a maximum of 8xlarge in size.

If you need higher specs, then you should look into the larger C7i instances, which include the 12xlarge, 16xlarge, 24xlarge, and 48xlarge sizes, as well as two bare metal options: metal-24xl and metal-48xl.

Network bandwidth
Larger sizes also offer higher network and Amazon Elastic Block Store (Amazon EBS) bandwidths, so you may need one of the larger C7i instances depending on your requirements. C7i-flex instances offer up to 12.5 Gbps of network bandwidth and up to 10 Gbps of EBS bandwidth, which should be suitable for most applications.
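
One quick way to compare the published specifications of the two families side by side is the AWS CLI. This is a minimal sketch; the query fields are standard describe-instance-types output attributes.

# Compare vCPU, memory, and network performance for the largest flex size and its C7i counterpart
aws ec2 describe-instance-types \
  --instance-types c7i-flex.8xlarge c7i.8xlarge \
  --query 'InstanceTypes[].[InstanceType,VCpuInfo.DefaultVCpus,MemoryInfo.SizeInMiB,NetworkInfo.NetworkPerformance]' \
  --output table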

Things to know
Regions – Visit AWS Services by Region to check whether C7i-flex instances are available in your preferred regions.

Purchasing options – C7i-flex and C7i instances are available in On-Demand, Savings Plans, Reserved Instance, and Spot form. C7i instances are also available in Dedicated Host and Dedicated Instance form.

To learn more, visit Amazon EC2 C7i and C7i-flex instances.

Matheus Guimaraes

AWS Ops Automator v2 features vertical scaling (Preview)

Post Syndicated from AWS Editorial Team original https://aws.amazon.com/blogs/architecture/aws-ops-automator-v2-features-vertical-scaling-preview/

Editor’s note (April 30, 2024): The information in this post is outdated and the solution has been retired. For more solutions using AWS services, see the AWS Solutions Library.

The new version of the AWS Ops Automator, a solution that enables you to automatically manage your AWS resources, features vertical scaling for Amazon EC2 instances. With vertical scaling, the solution automatically adjusts capacity to maintain steady, predictable performance at the lowest possible cost. The solution can resize your instances by restarting your existing instance with a new size. Or, the solution can resize your instances by replacing your existing instance with a new, resized instance.

With this update, the AWS Ops Automator makes setting up vertical scaling easier. All you have to do is define the time-based or event-based trigger that determines when the solution scales your instances, and choose whether you want to change the size of your existing instances or replace them with new, resized instances. The trigger invokes an AWS Lambda function that scales your instances.

Ops Automator Vertical Scaling

Restarting with a new size

When you choose to resize your instances by restarting the instance with a new size, the solution increases or decreases the size of your existing instances in response to changes in demand or at a specified point in time. The solution automatically changes the instance size to the next defined size up or down.

Replacing with a new, resized instance

Alternatively, you can choose to have the Ops Automator replace your instance with a new, resized instance instead of restarting your existing instance. When the solution determines that your instances need to be scaled, it launches new instances with the next defined instance size up or down. The solution is also integrated with Elastic Load Balancing to automatically register the new instance with your load balancers.

AWS Weekly Roundup: Amazon Bedrock, AWS CodeBuild, Amazon CodeCatalyst, and more (April 29, 2024)

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-bedrock-aws-codebuild-amazon-codecatalyst-and-more-april-29-2024/

This was a busy week for Amazon Bedrock with many new features! Using GitHub Actions with AWS CodeBuild is much easier. Also, Amazon Q in Amazon CodeCatalyst can now manage more complex issues.

I was amazed to meet so many new and old friends at the AWS Summit London. To give you a quick glimpse, here’s AWS Hero Yan Cui starting his presentation at the AWS Community stage.

AWS Community at the AWS Summit London 2024

Last week’s launches
With so many interesting new features, I start with generative artificial intelligence (generative AI) and then move to the other topics. Here’s what got my attention:

Amazon Bedrock – For supported architectures such as Llama, Mistral, or Flan T5, you can now import custom models and access them on demand. Model evaluation is now generally available to help you evaluate, compare, and select the best foundation models (FMs) for your specific use case. You can now access Meta’s Llama 3 models.

Agents for Amazon Bedrock – Simplified agent creation and return of control let you define an action schema and get control back to perform those actions without needing to create a specific AWS Lambda function. Agents also added support for Anthropic’s Claude 3 Haiku and Sonnet to help you build faster and more intelligent agents.

Knowledge Bases for Amazon Bedrock – You can now ingest data from up to five data sources and provide more complete answers. In the console, you can now chat with one of your documents without needing to set up a vector database (read more in this Machine Learning blog post).

Guardrails for Amazon Bedrock – The capability to implement safeguards based on your use cases and responsible AI policies is now available with new safety filters and privacy controls.

Amazon Titan – The new watermark detection feature is now generally available in Amazon Bedrock. With it, you can identify images generated by Amazon Titan Image Generator by detecting the invisible watermark present in all images it generates.

Amazon CodeCatalyst – Amazon Q can now split complex issues into separate, simpler tasks that can then be assigned to a user or back to Amazon Q. CodeCatalyst now also supports approval gates within a workflow. Approval gates pause a workflow that is building, testing, and deploying code so that a user can validate whether it should be allowed to proceed.

Amazon EC2 – You can now remove an automatically assigned public IPv4 address from an EC2 instance. If you no longer need the automatically assigned public IPv4 (for example, because you are migrating to using a private IPv4 address for SSH with EC2 instance connect), you can use this option to quickly remove the automatically assigned public IPv4 address and reduce your public IPv4 costs.

Network Load Balancer – Now supports Resource Map in the AWS Management Console, a tool that displays all your NLB resources and their relationships in a visual format on a single page. Note that Application Load Balancer already supports Resource Map in the console.

AWS CodeBuild – Now supports managed GitHub Actions self-hosted runners. You can configure CodeBuild projects to receive GitHub Actions workflow job events and run them on CodeBuild ephemeral hosts.

Amazon Route 53 – You can now define a standard DNS configuration in the form of a Profile, apply this configuration to multiple VPCs, and share it across AWS accounts.

AWS Direct Connect – Hosted connections now support capacities up to 25 Gbps. Before, the maximum was 10 Gbps. Higher bandwidths simplify deployments of applications such as advanced driver assistance systems (ADAS), media and entertainment (M&E), artificial intelligence (AI), and machine learning (ML).

NoSQL Workbench for Amazon DynamoDB – A revamped operation builder user interface to help you better navigate, run operations, and browse your DynamoDB tables.

Amazon GameLift – Now supports, in preview, end-to-end development of containerized workloads, including deployment and scaling on premises, in the cloud, or in hybrid configurations. You can use containers for building, deploying, and running game server packages.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional projects, blog posts, and news items that you might find interesting:

GQL, the new ISO standard for graphs, has arrived – GQL, which stands for Graph Query Language, is the first new ISO database language since the introduction of SQL in 1987.

Authorize API Gateway APIs using Amazon Verified Permissions and Amazon Cognito – Externalizing authorization logic for application APIs can yield multiple benefits. Here’s an example of how to use Cedar policies to secure a REST API.

Build and deploy a 1 TB/s file system in under an hour – A very nice walkthrough of something that, until recently, was not so easy to do.

Let’s Architect! Discovering Generative AI on AWS – A new episode in this amazing series of posts that provides a broad introduction to the domain and then shares a mix of videos, blog posts, and hands-on workshops.

Building scalable, secure, and reliable RAG applications using Knowledge Bases for Amazon Bedrock – This post explores the new features (including AWS CloudFormation support) and how they align with the AWS Well-Architected Framework.

Using the unified CloudWatch Agent to send traces to AWS X-Ray – With added support for the collection of AWS X-Ray and OpenTelemetry traces, you can now provision a single agent to capture metrics, logs, and traces.

The executive’s guide to generative AI for sustainability – A guide for implementing a generative AI roadmap within sustainability strategies.

AWS open source news and updates – My colleague Ricardo writes about open source projects, tools, and events from the AWS Community. Check out Ricardo’s page for the latest updates.

Upcoming AWS events
Check your calendars and sign up for upcoming AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Singapore (May 7), Seoul (May 16–17), Hong Kong (May 22), Milan (May 23), Stockholm (June 4), and Madrid (June 5).

AWS re:Inforce – Explore 2.5 days of immersive cloud security learning in the age of generative AI at AWS re:Inforce, June 10–12 in Pennsylvania.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Turkey (May 18), Midwest | Columbus (June 13), Sri Lanka (June 27), Cameroon (July 13), Nigeria (August 24), and New York (August 28).

GOTO EDA Day London – Join us in London on May 14 to learn about event-driven architectures (EDA) for building highly scalable, fault-tolerant, and extensible applications. This conference is organized by GOTO, AWS, and partners.

Browse all upcoming AWS led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Danilo

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Architecting for Disaster Recovery on AWS Outposts Racks with AWS Elastic Disaster Recovery

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/architecting-for-disaster-recovery-on-aws-outposts-racks-with-aws-elastic-disaster-recovery/

This blog post is written by Brianna Rosentrater, Hybrid Edge Specialist SA.

AWS Elastic Disaster Recovery (AWS DRS) now supports disaster recovery (DR) architectures that include on-premises Windows and Linux workloads running on AWS Outposts. AWS DRS minimizes downtime and data loss with fast, reliable recovery of on-premises and cloud-based applications using affordable storage, minimal compute, and point-in-time recovery. Both services are billed and managed from your AWS Management Console.

Like workloads running in AWS Regions, it’s critical to plan for failures. Outposts are designed with resiliency in mind: they provide redundant power and networking, and are available to order with N+M active compute instance capacity. In other words, for every N physical compute servers, you have the option of including M redundant hosts capable of handling the workload during a failure. When using AWS DRS with Outposts, you can plan for larger-scale failure modes, such as data center outages, by replicating mission-critical workloads to other remote data center locations or to the AWS Region.

In this post, you’ll learn how AWS DRS can be used with Outpost rack to architect for high availability in the event of a site failure. The post will examine several different architectures enabled by AWS DRS that provide DR for Outpost, and the benefits of each method described.

Prerequisites

Each of the architectures described below needs the following:

Public internet access isn’t needed; AWS PrivateLink and AWS Direct Connect are supported for replication and failback, which is a significant security benefit.

Planning for failure

Disasters come in many forms and are often unplanned and unexpected events. Regardless of whether your workload resides on premises, in a colocation facility, or in an AWS Region, it’s critical to define the Recovery Time Objective (RTO) and Recovery Point Objective (RPO), which are often workload-specific. These two metrics capture how long a service can be down during recovery and quantify the acceptable amount of data loss. RTO and RPO guide you in choosing the appropriate strategy, such as backup and recovery, pilot light, warm standby, or a multi-site (active-active) approach.

With AWS DRS, replication of the source server continues while you fail back to a test machine (not the original source server). This allows failback drills without impacting RPO, and non-disruptive failback drills are an important part of disaster planning to validate that your recovery plan meets your expected RPO/RTO according to your business requirements.

How AWS DRS integrates with Outpost

AWS DRS uses an AWS Replication Agent at the source to capture the workload and transfer it to a lightweight staging area, which resides on an Outpost equipped with Amazon S3 on Outposts. This method also provides the ability to perform low-effort, non-disruptive DR drills before making the final cutover. The AWS Replication Agent doesn’t need a reboot nor does it impact your applications during installation.

When an Outpost’s subnet is selected as the target for replication or launch, all associated AWS DRS components remain within the Outpost, including the conversion servers (which convert the source disks of servers being migrated so that they can boot and run in the target infrastructure), Amazon EBS volumes, snapshots, and replication servers. The replication servers replicate the disks to the target infrastructure. With AWS DRS, you can control the data replication path using private connectivity options such as a virtual private network (VPN), AWS Direct Connect, VPC peering, or another private connection. Learn more about using a private IP for data replication.

AWS DRS provides nearly continuous replication for mission-critical workloads and supports deployment patterns including on-premises to Outpost, Outpost to Region, Region to Outpost, and between two logical Outposts through local networks. To leverage Outpost with AWS DRS, simply select the Outpost subnet as your target or source for replication when configuring AWS DRS for your workload. If you are currently using CloudEndure DR for disaster recovery with Outpost, see these detailed instructions for migrating to AWS DRS from CloudEndure DR.

DR from on-premises to Outpost

An Outpost can be used as a DR target for on-premises workloads. By deploying an Outpost in a remote data center or colocation facility a significant distance from the source, but within the same geo-political boundary, you can replicate workloads across great distances and increase the resiliency of the data while ensuring adherence to data residency policies or legislation.


Figure 1 – DR from on-premises to Outposts

In Figure 1, on-premises sources replicate data from the LAN to a staging area residing in an Outpost subnet via the local gateway. This allows workloads to fail over from their on-premises environment to an Outpost in a different physical location during a disaster.

The staging areas and replication servers run on Amazon Elastic Compute Cloud (Amazon EC2) with Amazon EBS volumes and require Amazon S3 on Outposts where the Amazon EBS snapshots reside.

The replication agent provides nearly continuous, block-level replication from your LAN over TCP port 1500, with traffic routed to Amazon EC2 instances through the Outposts local gateway.
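
In practice, this means the replication servers on the Outpost must be reachable from the source network on that port. One way to express this, assuming you manage the replication servers’ security group yourself, is a rule like the following sketch; the group ID and source CIDR are hypothetical placeholders.

# Allow AWS DRS replication traffic (TCP 1500) from the on-premises network range
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 1500 \
  --cidr 10.0.0.0/16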

DR from Outpost to Region

Since its initial release, Outpost has supported Amazon EBS snapshots written to Amazon S3 located in the AWS Region. Backup to an AWS Region is one of the most cost-effective and easiest-to-configure DR approaches, enabling data redundancy outside of your Outpost and data center.

This method also offers flexibility for restoration within an AWS Region if the original deployment is irrecoverable. However, depending on the frequency of the snapshots, the timing of the failure, and the throughput of the service link, backup and recovery to the Region can result in an RPO/RTO spanning hours.

For critical workloads, AWS DRS can reduce RTO to minutes and RPO in the sub-second range. After creating an initial replication of workloads that reside on the Outpost, AWS DRS provides nearly continuous, block-level replication in the Region. Just like replication from non-AWS virtual machines or bare metal servers, AWS DRS resources, including Replication Servers, Conversion Servers, Amazon EBS Volumes, and Snapshots reside in the Region.


Figure 2 – DR from Outpost to Region

In Figure 2, data replication is performed over the service link from Amazon EC2 instances running locally on an Outpost to an AWS Region. The service link traverses either public Region connectivity or AWS Direct Connect.

AWS Direct Connect is the recommended option because it provides low latency and consistent bandwidth for the service link back to a Region, which also improves the reliability of transmission for AWS DRS replication traffic.

The service link is composed of redundant, encrypted VPN tunnels. Replication traffic can also be sent privately, without traversing the public internet, by using Private Virtual Interfaces with Direct Connect for the service link.

With this architecture in place, you can mitigate disasters and reduce downtime by failing over to the AWS Region using AWS DRS.

DR from Region to Outpost

AWS provides multiple Availability Zones (AZs) within a Region and isolated AWS Regions globally for the greatest possible fault tolerance and stability. The reliability pillar of the AWS Well-Architected Framework encourages distributing workloads across AZs and replicating data between Regions when the required distances exceed those of AZs.

AWS DRS supports nearly continuous replication of workloads from a Region to an Outpost within your data center or colocation facility for DR. This deployment model provides increased durability from a source AWS Region to an Outpost anchored to a different Region.

In this model, AWS DRS components remain on-premises within the Outpost, but data charges are applicable as data egresses from the Region back to the data center and Amazon S3 on Outposts is required on the destination Outpost.


Figure 3 – DR from Region to Outpost

Implementing the preceding architecture enables seamless failover of critical workloads from the Region to on-premises Outposts. Keep in mind that the AWS Region provides the management and control plane for an Outpost, making it critical to consider the probability and frequency of service link interruptions as part of your DR planning. Scenarios such as warm standby with pre-allocated Amazon EC2 and Amazon EBS resources may prove more resilient during service link disruptions.

DR between two Outposts

Each logical Outpost consists of one or more physical racks. Logical Outposts are independent of one another and support deployments in disparate data centers or colocation facilities. You can elect to have multiple logical Outposts anchored to different Availability Zones or Regions. AWS DRS unlocks options for replication between two logical Outposts, increasing resiliency and reducing the impact of your data center as a single point of failure. In the following architecture, nearly continuous replication captured from a single Outpost source is applied at a second logical Outpost.


Figure 4 – DR between two Outposts

Supporting both unidirectional and bidirectional replication between Outposts minimizes the disruption caused by events that take down a data center, an Availability Zone, or even an entire Region. In the following architecture diagram, bidirectional data replication occurs between the Outposts by routing traffic via the local gateways, minimizing outbound data charges from the Region and allowing for more direct routing between deployment sites that could potentially span significant distances. Note that AWS DRS cannot communicate directly with resources that use a customer-owned IP address pool (CoIP pool).

Figure 5 – DR between two Outposts – bidirectional 

Architecture Considerations

When planning an Outpost deployment that uses AWS DRS, it’s critical to consider the impact on storage. As a general best practice, AWS recommends planning for a 2:1 ratio of storage, covering the EBS volumes used for nearly continuous replication and the Amazon EBS snapshots used for point-in-time recovery. While it’s unlikely that all servers would need recovery simultaneously, it’s also important to allocate a reserve of EBS volume capacity that will launch at the time of recovery. Amazon S3 on Outposts is needed for each Outpost used as a replication destination, and the recommendation is to plan for a 1:1 ratio of S3 on Outposts storage plus the rate of data change. For example, if your data change rate is 10%, you’d want to plan for 110% S3 on Outposts usage with AWS DRS.
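
As a rough back-of-the-envelope sketch of those ratios (the protected capacity and change rate below are hypothetical values, not recommendations):

# Illustrative capacity estimate based on the 2:1 EBS and 1:1-plus-change-rate guidance above
protected_tib=10      # total source capacity being replicated (hypothetical)
change_rate=0.10      # daily data change rate (hypothetical)
echo "EBS volumes and snapshots: $(echo "$protected_tib * 2" | bc) TiB"
echo "S3 on Outposts:            $(echo "$protected_tib * (1 + $change_rate)" | bc) TiB"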

Amazon CloudWatch has integrated metrics for Amazon EC2, Amazon EBS, and Amazon S3 capacity on Outposts, making it easy to create tailored dashboards and integrate with Amazon Simple Notification Service (Amazon SNS) for alerts at defined thresholds. Monitoring these metrics is critical to making sure that sufficient free space is available for data replication to proceed unimpeded. CloudWatch has metrics available for AWS DRS as well. You can also use the AWS DRS service page in the AWS Management Console to monitor the status of your recovery instances.

Consider taking advantage of Recovery Plans within AWS DRS to make sure that related services are recovered in a particular order. For example, during a disaster, it might be critical to first bring up a database before recovering application tiers. Recovery plans provide the ability to group related services and apply wait times to individual targets.

Conclusion

AWS Outposts supports low-latency, data residency, and data gravity-constrained workloads by bringing managed cloud compute and storage services into your data center or colocation facility. When coupled with AWS DRS, you can decrease RPO and RTO through a variety of flexible deployment models, with sources and destinations ranging from on premises to the Region or another AWS Outpost.

AWS Weekly Roundup: Amazon EC2 G6 instances, Mistral Large on Amazon Bedrock, AWS Deadline Cloud, and more (April 8, 2024)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-mistral-large-aws-clean-rooms-ml-aws-deadline-cloud-and-more-april-8-2024/

We’re just two days away from AWS Summit Sydney (April 10–11) and a month away from the AWS Summit season in Southeast Asia, starting with the AWS Summit Singapore (May 7) and the AWS Summit Bangkok (May 30). If you happen to be in Sydney, Singapore, or Bangkok around those dates, please join us.

Last Week’s Launches
If you haven’t read last week’s Weekly Roundup yet, Channy wrote about the AWS Chips Taste Test, a new initiative from Jeff Barr as part of April Fools’ Day.

Here are some launches that caught my attention last week:

New Amazon EC2 G6 instances — We announced the general availability of Amazon EC2 G6 instances powered by NVIDIA L4 Tensor Core GPUs. G6 instances can be used for a wide range of graphics-intensive and machine learning use cases. G6 instances deliver up to 2x higher performance for deep learning inference and graphics workloads compared to Amazon EC2 G4dn instances. To learn more, visit the Amazon EC2 G6 instance page.

Mistral Large is now available in Amazon Bedrock — Veliswa wrote about the availability of the Mistral Large foundation model, as part of the Amazon Bedrock service. You can use Mistral Large to handle complex tasks that require substantial reasoning capabilities. In addition, Amazon Bedrock is now available in the Paris AWS Region.

Amazon Aurora zero-ETL integration with Amazon Redshift now in additional Regions — Zero-ETL integration announcements were my favourite launches last year. This zero-ETL integration simplifies the process of transferring data between the two services, allowing customers to move data between Amazon Aurora and Amazon Redshift without the need for manual extract, transform, and load (ETL) processes. With this announcement, zero-ETL integrations between Amazon Aurora and Amazon Redshift are now supported in 11 additional Regions.

Announcing AWS Deadline Cloud — If you’re working in films, TV shows, commercials, games, and industrial design and handling complex rendering management for teams creating 2D and 3D visual assets, then you’ll be excited about AWS Deadline Cloud. This new managed service simplifies the deployment and management of render farms for media and entertainment workloads.

AWS Clean Rooms ML is now generally available — Last year, I wrote about the preview of AWS Clean Rooms ML. In that post, I elaborated on a new capability of AWS Clean Rooms that helps you and your partners apply machine learning (ML) models on your collective data without copying or sharing raw data with each other. Now, AWS Clean Rooms ML is available for you to use.

Knowledge Bases for Amazon Bedrock now supports private network policies for OpenSearch Serverless — Here’s exciting news for you who are building with Amazon Bedrock. Now, you can implement Retrieval-Augmented Generation (RAG) with Knowledge Bases for Amazon Bedrock using Amazon OpenSearch Serverless (OSS) collections that have a private network policy.

Amazon EKS extended support for Kubernetes versions now generally available — If you’re running Kubernetes version 1.21 or higher, extended support for Kubernetes versions lets you stay up to date with the latest Kubernetes features and security improvements on Amazon EKS.

AWS Lambda Adds Support for Ruby 3.3 — Coding in Ruby? Now, AWS Lambda supports Ruby 3.3 as its runtime. This update allows you to take advantage of the latest features and improvements in the Ruby language.

Amazon EventBridge Console Enhancements — The Amazon EventBridge console has been updated with new features and improvements, making it easier for you to manage your event-driven applications with a better user experience.

Private Access to the AWS Management Console in Commercial Regions — If you need to restrict access to personal AWS accounts from the company network, you can use AWS Management Console Private Access. With this launch, you can use AWS Management Console Private Access in all commercial AWS Regions.

From community.aws 
community.aws is a home for us builders to share our learnings from building on AWS. Here are my top three posts from last week:

Other AWS News 
Here are some additional news items, open-source projects, and Twitch shows that you might find interesting:

Build On Generative AI – Join Tiffany and Darko to learn more about generative AI, see their demos and discuss different aspects of generative AI with the guest speakers. Streaming every Monday on Twitch, 9:00 AM US PT.

AWS open source news and updates – If you’re looking for various open-source projects and tools from the AWS community, please read the AWS open-source newsletter maintained by my colleague, Ricardo.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Amsterdam (April 9), Sydney (April 10–11), London (April 24), Singapore (May 7), Berlin (May 15–16), Seoul (May 16–17), Hong Kong (May 22), Milan (May 23), Dubai (May 29), Thailand (May 30), Stockholm (June 4), and Madrid (June 5).

AWS re:Inforce – Explore cloud security in the age of generative AI at AWS re:Inforce, June 10–12 in Pennsylvania for two-and-a-half days of immersive cloud security learning designed to help drive your business initiatives.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Poland (April 11), Bay Area (April 12), Kenya (April 20), and Turkey (May 18).

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Donnie

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Applying Spot-to-Spot consolidation best practices with Karpenter

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/applying-spot-to-spot-consolidation-best-practices-with-karpenter/

This post is written by Robert Northard – AWS Container Specialist Solutions Architect, and Carlos Manzanedo Rueda – AWS WW SA Leader for Efficient Compute

Karpenter is an open source node lifecycle management project built for Kubernetes. In this post, you will learn how to use the new Spot-to-Spot consolidation functionality released in Karpenter v0.34.0, which helps further optimize your cluster. Amazon Elastic Compute Cloud (Amazon EC2) Spot Instances are spare Amazon EC2 capacity available for up to 90% off compared to On-Demand prices. One difference between On-Demand and Spot is that Spot Instances can be interrupted by Amazon EC2 when the capacity is needed back. Karpenter’s built-in support for Spot Instances allows users to seamlessly implement Spot best practices and helps users optimize the cost of stateless, fault tolerant workloads. For example, when Karpenter observes a Spot interruption, it automatically starts a new node in response.

Karpenter provisions nodes in response to unschedulable pods based on aggregated CPU, memory, volume requests, and other scheduling constraints. Over time, Karpenter has added functionality to simplify instance lifecycle configuration, providing a termination controller, instance expiration, and drift detection. Karpenter also helps optimize Kubernetes clusters by selecting the optimal instances while still respecting Kubernetes pod-to-node placement nuances, such as nodeSelector, affinity and anti-affinity, taints and tolerations, and topology spread constraints.

The Kubernetes scheduler assigns pods to nodes based on their scheduling constraints. Over time, as workloads are scaled out and scaled in or as new instances join and leave, the cluster placement and instance load might end up not being optimal. In many cases, it results in unnecessary extra costs. Karpenter has a consolidation feature that improves cluster placement by identifying and taking action in situations such as:

  1. when a node is empty
  2. when a node can be removed as the pods that are running on it can be rescheduled into other existing nodes
  3. when the number of pods in a node has gone down and the node can now be replaced with a lower-priced and rightsized variant (which is shown in the following figure)

Karpenter consolidation, replacing one 2xlarge Amazon EC2 Instance with an xlarge Amazon EC2 Instance.

Karpenter versions prior to v0.34.0 only supported consolidation for Amazon EC2 On-Demand Instances. On-Demand consolidation allowed consolidating from On-Demand into Spot Instances and to lower-priced On-Demand Instances. However, once a pod was placed on a Spot Instance, Spot nodes were only removed when the nodes were empty. In v0.34.0, you can enable the feature gate to use Spot-to-Spot consolidation.

Solution overview

When launching Spot Instances, Karpenter uses the price-capacity-optimized allocation strategy when calling the Amazon EC2 Fleet API in instant mode (shown in the following figure) and passes in a selection of compute instance types based on the Karpenter NodePool configuration. The Amazon EC2 Fleet API in instant mode is a synchronous API call that immediately returns a list of instances that launched and any instances that could not be launched. For any instances that could not be launched, Karpenter might request alternative capacity or remove any soft Kubernetes scheduling constraints for the workload.


Karpenter instance orchestration

Spot-to-Spot consolidation needed an approach that was different from On-Demand consolidation. For On-Demand consolidation, rightsizing and lowest price are the main metrics used. For Spot-to-Spot consolidation to take place, Karpenter requires a diversified instance configuration (see the example NodePool defined in the walkthrough) with at least 15 instance types. Without this constraint, there would be a risk of Karpenter selecting an instance that has lower availability and, therefore, a higher frequency of interruption.

Prerequisites

The following prerequisites are required to complete the walkthrough:

  • Install an Amazon Elastic Kubernetes Service (Amazon EKS) cluster (version 1.29 or higher) with Karpenter (v0.34.0 or higher). The Karpenter Getting Started Guide provides steps for setting up an Amazon EKS cluster and adding Karpenter.
  • Enable replacement with Spot consolidation through the SpotToSpotConsolidation feature gate. This can be enabled during a Helm install of the Karpenter chart by adding the --set settings.featureGates.spotToSpotConsolidation=true argument (see the sketch after this list).
  • Install kubectl, the Kubernetes command line tool for communicating with the Kubernetes control plane API, and kubectl context configured with Cluster Operator and Cluster Developer permissions.
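
The following is a minimal sketch of enabling the feature gate on an existing installation, assuming Karpenter was installed with the Helm chart from the public Amazon ECR registry. The chart version shown is an assumption; your cluster name, IAM role, and other required values are omitted and should match your existing installation.

helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --namespace karpenter \
  --version "v0.34.0" \
  --reuse-values \
  --set settings.featureGates.spotToSpotConsolidation=true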

Walkthrough

The following walkthrough guides you through the steps for simulating Spot-to-Spot consolidation.

1. Create a Karpenter NodePool and EC2NodeClass

Create a Karpenter NodePool and EC2NodeClass. Replace the following placeholders with your own values. If you used the Karpenter Getting Started Guide to create your installation, the discovery tag value is your cluster name.

  • Replace <karpenter-discovery-tag-value> with your subnet tag for Karpenter subnet and security group auto-discovery.
  • Replace <role-name> with the name of the AWS Identity and Access Management (IAM) role for node identity.
cat <<EOF > nodepool.yaml
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    metadata:
      labels:
        intent: apps
    spec:
      nodeClassRef:
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c","m","r"]
        - key: karpenter.k8s.aws/instance-size
          operator: NotIn
          values: ["nano","micro","small","medium"]
        - key: karpenter.k8s.aws/instance-hypervisor
          operator: In
          values: ["nitro"]
  limits:
    cpu: 100
    memory: 100Gi
  disruption:
    consolidationPolicy: WhenUnderutilized
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: Bottlerocket
  subnetSelectorTerms:          
    - tags:
        karpenter.sh/discovery: "<karpenter-discovery-tag-value>"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "<karpenter-discovery-tag-value>"
  role: "<role-name>"
  tags:
    Name: karpenter.sh/nodepool/default
    IntentLabel: "apps"
EOF

kubectl apply -f nodepool.yaml

The NodePool definition demonstrates a flexible configuration with instances from the C, M, or R EC2 instance families. The configuration is restricted to use smaller instance sizes but is still diversified as much as possible. For example, this might be needed in scenarios where you deploy observability DaemonSets. If your workload has specific requirements, then see the supported well-known labels in the Karpenter documentation.

2. Deploy a sample workload

Deploy a sample workload by running the following command. This command creates a Deployment with five pod replicas using the pause container image:

cat <<EOF > inflate.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 5
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      nodeSelector:
        intent: apps
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.2
          resources:
            requests:
              cpu: 1
              memory: 1.5Gi
EOF
kubectl apply -f inflate.yaml

Next, check the Kubernetes nodes by running a kubectl get nodes CLI command. The capacity pool (instance type and Availability Zone) selected depends on any Kubernetes scheduling constraints and spare capacity size; therefore, it might differ from this example in the walkthrough. You can see Karpenter launched a new node of instance type c6g.2xlarge, an AWS Graviton2-based instance, in the eu-west-1c Availability Zone:

$ kubectl get nodes -L karpenter.sh/nodepool -L node.kubernetes.io/instance-type -L topology.kubernetes.io/zone -L karpenter.sh/capacity-type

NAME                                     STATUS   ROLES    AGE   VERSION               NODEPOOL   INSTANCE-TYPE   ZONE         CAPACITY-TYPE
ip-10-0-12-17.eu-west-1.compute.internal Ready    <none>   80s   v1.29.0-eks-a5ec690   default    c6g.2xlarge     eu-west-1c   spot

3. Scale in a sample workload to observe consolidation

To invoke a Karpenter consolidation event, scale the inflate deployment down to one replica. Run the following command:

kubectl scale --replicas=1 deployment/inflate 

Tail the Karpenter logs by running the following command. If you installed Karpenter in a different Kubernetes namespace, then replace the name for the -n argument in the command:

kubectl -n karpenter logs -l app.kubernetes.io/name=karpenter --all-containers=true -f --tail=20

After a few seconds, you should see the following disruption via consolidation message in the Karpenter logs. The message indicates the c6g.2xlarge Spot node has been targeted for replacement and Karpenter has passed the following 15 instance types—m6gd.xlarge, m5dn.large, c7a.xlarge, r6g.large, r6a.xlarge and 10 other(s)—to the Amazon EC2 Fleet API:

{"level":"INFO","time":"2024-02-19T12:09:50.299Z","logger":"controller.disruption","message":"disrupting via consolidation replace, terminating 1 candidates ip-10-0-12-181.eu-west-1.compute.internal/c6g.2xlarge/spot and replacing with spot node from types m6gd.xlarge, m5dn.large, c7a.xlarge, r6g.large, r6a.xlarge and 10 other(s)","commit":"17d6c05","command-id":"60f27cb5-98fa-40fb-8231-05b31fd41892"}

Check the Kubernetes nodes by running the following kubectl get nodes CLI command. You can see that Karpenter launched a new node of instance type c6g.large:

$ kubectl get nodes -L karpenter.sh/nodepool -L node.kubernetes.io/instance-type -L topology.kubernetes.io/zone -L karpenter.sh/capacity-type

NAME                                      STATUS   ROLES    AGE   VERSION               NODEPOOL   INSTANCE-TYPE ZONE       CAPACITY-TYPE
ip-10-0-12-156.eu-west-1.compute.internal           Ready    <none>   2m1s   v1.29.0-eks-a5ec690   default    c6g.large       eu-west-1c   spot

Use kubectl get nodeclaims to list all objects of type NodeClaim and then describe the NodeClaim Kubernetes resource using kubectl get nodeclaim/<claim-name> -o yaml. In the NodeClaim .spec.requirements, you can also see the 15 instance types passed to the Amazon EC2 Fleet API:

apiVersion: karpenter.sh/v1beta1
kind: NodeClaim
...
spec:
  nodeClassRef:
    name: default
  requirements:
  ...
  - key: node.kubernetes.io/instance-type
    operator: In
    values:
    - c5.large
    - c5ad.large
    - c6g.large
    - c6gn.large
    - c6i.large
    - c6id.large
    - c7a.large
    - c7g.large
    - c7gd.large
    - m6a.large
    - m6g.large
    - m6gd.large
    - m7g.large
    - m7i-flex.large
    - r6g.large
...

What would happen if a Spot node could not be consolidated?

If a Spot node cannot be consolidated because there are not 15 instance types in the compute selection, then the following message will appear in the events for the NodeClaim object. You might get this event if you overly constrained your instance type selection:

Normal  Unconsolidatable   31s   karpenter  SpotToSpotConsolidation requires 15 cheaper instance type options than the current candidate to consolidate, got 1

Spot best practices with Karpenter

The following are some best practices to consider when using Spot Instances with Karpenter.

  • Avoid overly constraining instance type selection: Karpenter selects Spot Instances using the price-capacity-optimized allocation strategy, which balances the price and availability of AWS spare capacity. Although a minimum of 15 instance types is needed, you should avoid constraining instance types as much as possible. By not constraining instance types, there is a higher chance of acquiring Spot capacity at large scale, with a lower frequency of Spot Instance interruptions, at a lower cost.
  • Gracefully handle Spot interruptions and consolidation actions: Karpenter natively handles Spot interruption notifications by consuming events from an Amazon Simple Queue Service (Amazon SQS) queue, which is populated with Spot interruption notifications through Amazon EventBridge. As soon as Karpenter receives a Spot interruption notification, it gracefully drains the interrupted node of any running pods while also provisioning a new node for which those pods can schedule. With Spot Instances, this process needs to complete within 2 minutes. For a pod with a termination period longer than 2 minutes, the old node will be interrupted prior to those pods being rescheduled. To test a replacement node, AWS Fault Injection Service (FIS) can be used to simulate Spot interruptions.
  • Carefully configure resource requests and limits for workloads: Rightsizing and optimizing your cluster is a shared responsibility. Karpenter effectively optimizes and scales infrastructure, but the end result depends on how well you have rightsized your pod requests and any other Kubernetes scheduling constraints. Karpenter does not consider limits or resource utilization. For most workloads with non-compressible resources, such as memory, it is generally recommended to set requests==limits, because if a workload tries to burst beyond the available memory of the host, an out-of-memory (OOM) error occurs (see the command sketch after this list). Karpenter consolidation can increase the probability of this, as it proactively tries to reduce total allocatable resources for a Kubernetes cluster. For help with rightsizing your Kubernetes pods, consider exploring Kubecost, Vertical Pod Autoscaler configured in recommendation mode, or an open source tool such as Goldilocks.
  • Configure metrics for Karpenter: Karpenter emits metrics in the Prometheus format, so consider using Amazon Managed Service for Prometheus to track interruptions caused by Karpenter Drift, consolidation, Spot interruptions, or other Amazon EC2 maintenance events. These metrics can be used to confirm that interruptions are not having a significant impact on your service’s availability and monitor NodePool usage and pod lifecycles. The Karpenter Getting Started Guide contains an example Grafana dashboard configuration.
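
As a minimal illustration of the requests==limits recommendation for memory, you could patch the sample deployment from the walkthrough like this (the values are only examples):

# Set the memory limit equal to the memory request on the inflate deployment
kubectl set resources deployment inflate \
  --requests=cpu=1,memory=1.5Gi \
  --limits=memory=1.5Gi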

You can learn more about other application best practices in the Reliability section of the Amazon EKS Best Practices Guide.

Cleanup

To avoid incurring future charges, delete any resources you created as part of this walkthrough. If you followed the Karpenter Getting Started Guide to set up a cluster and add Karpenter, follow the clean-up instructions in the Karpenter documentation to delete the cluster. Alternatively, if you already had a cluster with Karpenter, delete the resources created as part of this walkthrough:

kubectl delete -f inflate.yaml
kubectl delete -f nodepool.yaml

Conclusion

In this post, you learned how Karpenter can actively replace a Spot node with another, more cost-efficient Spot node. Karpenter can consolidate Spot nodes with the right balance between lower price and low-frequency interruptions when there are at least 15 selectable instance types to balance price and availability.

To get started, check out the Karpenter documentation as well as Karpenter Blueprints, which is a repository including common workload scenarios following the best practices.

You can share your feedback on this feature by raising a GitHub issue.

AWS Weekly Roundup — Claude 3 Sonnet support in Bedrock, new instances, and more — March 11, 2024

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-claude-3-sonnet-support-in-bedrock-new-instances-and-more-march-11-2024/

Last Friday was International Women’s Day (IWD), and I want to take a moment to appreciate the amazing ladies in the cloud computing space who are breaking the glass ceiling by reaching technical leadership positions and inspiring others to go and build, as our CTO Werner Vogels says: “Now, go build!”

Last week’s launches
Here are some launches that got my attention during the previous week.

Amazon Bedrock – Now supports Anthropic’s Claude 3 Sonnet foundation model. Claude 3 Sonnet is two times faster than, and has the same level of intelligence as, Anthropic’s highest-performing models, Claude 2 and Claude 2.1. My favorite characteristic is that Sonnet is better at producing JSON outputs, making it simpler for developers to build applications. It also offers vision capabilities. You can learn more about this foundation model (FM) in the post that Channy wrote early last week.

AWS re:Post Live – Launched last week! AWS re:Post Live is a weekly Twitch livestream show that provides a way for the community to reach out to experts, ask questions, and improve their skills. The show livestreams every Monday at 11 AM PT.

Amazon CloudWatch – Now streams daily metrics on CloudWatch metric streams. You can use metric streams to send a stream of near real-time metrics to a destination of your choice.

Amazon Elastic Compute Cloud (Amazon EC2) – Announced the general availability of new metal instances, C7gd, M7gd, and R7gd. These instances have up to 3.8 TB of local NVMe-based SSD block-level storage and are built on the AWS Nitro System.

AWS WAF – Now supports configurable evaluation time windows for request aggregation with rate-based rules. Previously, AWS WAF was fixed to a 5-minute window when aggregating and evaluating the rules. Now you can select windows of 1, 2, 5, or 10 minutes, depending on your application use case.

AWS Partners – Last week, we announced the AWS Generative AI Competency Partners. This new specialization features AWS Partners that have shown technical proficiency and a track record of successful projects with generative artificial intelligence (AI) powered by AWS.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Some other updates and news that you may have missed:

One of the articles that caught my attention recently compares different design approaches for building serverless microservices. This article, written by Luca Mezzalira and Matt Diamond, compares the three most common designs for serverless workloads and explains the benefits and challenges of using one over the other.

And if you are interested in the serverless space, you shouldn’t miss the Serverless Office Hours, which airs live every Tuesday at 10 AM PT. Join the AWS Serverless Developer Advocates for a weekly chat on the latest from the serverless space.

The Official AWS Podcast – Listen each week for updates on the latest AWS news and deep dives into exciting use cases. There are also official AWS podcasts in several languages. Check out the ones in French, German, Italian, and Spanish.

AWS Open Source News and Updates – This is a newsletter curated by my colleague Ricardo to bring you the latest open source projects, posts, events, and more.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS Summit season is about to start. The first ones are Paris (April 3), Amsterdam (April 9), and London (April 24). AWS Summits are free events that you can attend in person and learn about the latest in AWS technology.

GOTO x AWS EDA Day London 2024 – On May 14, AWS partners with GOTO to bring you the event-driven architecture (EDA) day conference. At this conference, you will get to meet experts in the EDA space and listen to very interesting talks from customers, experts, and AWS.

You can browse all upcoming in-person and virtual events here.

That’s all for this week. Check back next Monday for another Week in Review!

— Marcia

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

AWS Weekly Roundup — Amazon API Gateway, AWS Step Functions, Amazon ECS, Amazon EKS, Amazon LightSail, Amazon VPC, and more — January 29, 2024

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-api-gateway-aws-step-functions-amazon-ecs-amazon-eks-amazon-lightsail-amazon-vpc-and-more-january-29-2024/

This past week, our service teams continued to innovate on your behalf, and a lot has happened in the Amazon Web Services (AWS) universe. I’ll also share the AWS Community events and initiatives that are happening around the world.

Let’s dive in!

Last week’s launches
Here are some launches that got my attention:

AWS Step Functions adds integration for 33 services including Amazon Q – AWS Step Functions is a visual workflow service capable of orchestrating more than 11,000 API actions from over 220 AWS services to help customers build distributed applications at scale. This week, AWS Step Functions expands its AWS SDK integrations with support for 33 additional AWS services, including Amazon Q, AWS B2B Data Interchange, and Amazon CloudFront KeyValueStore.

Amazon Elastic Container Service (Amazon ECS) Service Connect introduces support for automatic traffic encryption with TLS Certificates – Amazon ECS launches support for automatic traffic encryption with Transport Layer Security (TLS) certificates for its networking capability called ECS Service Connect. With this support, ECS Service Connect allows your applications to establish a secure connection by encrypting your network traffic.

Amazon Elastic Kubernetes Service (Amazon EKS) and Amazon EKS Distro support Kubernetes version 1.29 – Kubernetes version 1.29 introduced several new features and bug fixes. You can create new EKS clusters using v1.29 and upgrade your existing clusters to v1.29 using the Amazon EKS console, the eksctl command line interface, or through an infrastructure-as-code (IaC) tool.

IPv6 instance bundles on Amazon Lightsail – With these new instance bundles, you can get up and running quickly on IPv6-only instances, without the need for a public IPv4 address, with the ease of use and simplicity of Amazon Lightsail. If you have existing Lightsail instances with a public IPv4 address, you can migrate your instances to IPv6-only in a few simple steps.

Amazon Virtual Private Cloud (Amazon VPC) supports idempotency for route table and network ACL creation – Idempotent creation of route tables and network ACLs is intended for customers that use network orchestration systems or automation scripts that create route tables and network ACLs as part of a workflow. It allows you to safely retry creation without additional side effects.

Amazon Interactive Video Service (Amazon IVS) announces audio-only pricing for Low-Latency Streaming – Amazon IVS is a managed live streaming solution that is designed to make low-latency or real-time video available to viewers around the world. It now offers audio-only pricing for its Low-Latency Streaming capability at 1/10th of the existing HD video rate.

Sellers can resell third-party professional services in AWS Marketplace – AWS Marketplace sellers, including independent software vendors (ISVs), consulting partners, and channel partners, can now resell third-party professional services in AWS Marketplace. Services can include implementation, assessments, managed services, training, or premium support.

Introducing the AWS Small and Medium Business (SMB) Competency – This is the first go-to-market AWS Specialization designed for partners who deliver to small and medium-sized customers. The SMB Competency provides enhanced benefits for AWS Partners to invest and focus on SMB customer business, such as becoming the go-to standard for participation in new pilots and sales initiatives and receiving unique access to scale demand generation engines.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

X in Y – We launched existing services and instance types in additional Regions:

Other AWS news
Here are some additional projects, programs, and news items that you might find interesting:

Export a Software Bill of Materials using Amazon Inspector – Generating an SBOM gives you critical security information that offers visibility into your software supply chain, including the packages you use most frequently and the related vulnerabilities that might affect your whole company. My colleague Varun Sharma in South Africa shows how to export a consolidated SBOM for the resources monitored by Amazon Inspector across your organization in industry-standard formats, including CycloneDX and SPDX. The post also shares insights and approaches for analyzing SBOM artifacts using Amazon Athena.

AWS open source news and updates – My colleague Ricardo writes this weekly open source newsletter in which he highlights new open source projects, tools, and demos from the AWS Community.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS Innovate: AI/ML and Data Edition – Register now for the Asia Pacific & Japan AWS Innovate online conference on February 22, 2024, to explore, discover, and learn how to innovate with artificial intelligence (AI) and machine learning (ML). Choose from over 50 sessions in three languages and get hands-on with technical demos aimed at generative AI builders.

AWS Summit Paris – The AWS Summit Paris is an annual event held in Paris, France. It is a great opportunity for cloud computing professionals from all over the world to learn about the latest AWS technologies, network with other professionals, and collaborate on projects. The Summit is free to attend and features keynote presentations, breakout sessions, and hands-on labs. Registration is open!

AWS Community re:Invent re:Caps – Join a Community re:Cap event organized by volunteers from AWS User Groups and AWS Cloud Clubs around the world to learn about the latest announcements from AWS re:Invent.

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— seb

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Behavior Driven Chaos with AWS Fault Injection Simulator

Post Syndicated from Richard Whitworth original https://aws.amazon.com/blogs/architecture/behavior-driven-chaos-with-aws-fault-injection-simulator/

A common challenge organizations face is how to gain confidence in and provide evidence for the continuous resilience of their workloads. Using modern chaos engineering principles can help in meeting this challenge, but the practice of chaos engineering can become complex. As a result, both the definition of the inputs and comprehension of the outputs of the process can become inaccessible to non-technical stakeholders.

In this post, we will explore a working example of how you can build chaos experiments using human readable language, AWS Fault Injection Simulator (FIS), and a framework familiar to Developers and Test Engineers. In turn, this will help you to produce auditable evidence of your workload’s continuous resilience in a way that is more engaging and understandable to a wider community of stakeholders.

If you are new to chaos engineering, including the process and benefits, a great place to start is with the Architecture Blog post on workload resiliency.

Chaos experiment attributes

For a chaos experiment to be considered complete, the experiment should exhibit the following attributes:

  • Defined steady state
  • Hypothesis
  • Defined variables and experiment actions to take
  • Verification of the hypothesis

Combining FIS and Behave

FIS enables you to create the experiment actions outlined in the list of chaos experiment attributes. You can use the actions in FIS to simulate the effect of disruptions on your workloads so that you can observe the resulting outcome and gain valuable insights into the workload’s resilience. However, there are additional attributes that should be defined when writing a fully featured chaos experiment.

This is what combining Behave, a Python-based behavior-driven development framework, with FIS enables you to do (similar frameworks exist for other languages). By approaching chaos experiments in this way, you get the benefit of codifying all of your chaos experiment attributes, such as the hypothesis, steady state, and verification of the hypothesis, using human-readable Gherkin syntax, and then automating the whole experiment in code.

Using Gherkin syntax enables non-technical stakeholders to review, validate, and contribute to chaos experiments, plus it helps to ensure the experiments can be driven by business outcomes and personas. If you have defined everything as code, then the whole process can be wrapped into the appropriate stage of your CI/CD pipelines to ensure existing experiments are always run to avoid regression. You can also iteratively add new chaos experiments as new business features are enabled in your workloads or you become aware of new potential disruptions. In addition, using a behavior-driven development (BDD) framework, like Behave, also enables developers and test engineers to deliver the capability quickly since they are likely already familiar with BDD and Behave.

The remainder of this blog post provides an example of this approach using an experiment that can be built on to create a set of experiments for your own workloads. The code and resources used throughout this blog are available in the AWS Samples aws-fis-behaviour-driven-chaos repository, which provides a CloudFormation template that builds the target workload for our chaos experiment.

The workload comprises an Amazon Virtual Private Cloud with a public subnet, an EC2 Auto Scaling group, and EC2 instances running NGINX. The CloudFormation template also creates an FIS experiment template comprising a standard FIS Amazon Elastic Compute Cloud (Amazon EC2) action. For your own implementation, we recommend that you keep the CloudFormation for FIS separate from the CloudFormation that builds the workload, so that each can be maintained independently. Please note that, for simplicity, they are together in the same repo for this blog.

Note: The Behave code in the repo is structured in a way we suggest you adopt for your own repo. It keeps the scenario definition separate from the Python-specific implementation of the steps, and in turn the outline of the steps is separate from the step helper methods. This allows you to build a set of reusable step helper methods that can be dropped into, or called from, any Behave step, which helps keep your test codebase as DRY and efficient as possible as it grows; this can otherwise be very challenging for large test frameworks.

Figure 1 shows the AWS services and components we’re interacting with in this post.

Figure 1. Infrastructure for the chaos experiment

Defining and launching the chaos experiment

We start by defining our chaos experiment in Gherkin syntax with the Gherkin’s Scenario being used to articulate the hypothesis for our chaos experiment as follows:

Scenario: My website is resilient to infrastructure failure

Given My website is up and can serve 10 transactions per second
And I have an EC2 Auto-Scaling Group with at least 3 running EC2 instances
And I have an EC2 Auto-Scaling Group with instances distributed across at least 3 Availability Zones
When an EC2 instance is lost
Then I can continue to serve 10 transactions per second
And 90 percent of transactions to my website succeed

Our initial Given, And steps validate that the conditions and environment in which we are launching the Scenario are sufficient for the experiment to be successful (the steady state). If the environment is already out of bounds (read: the website isn’t running) before we begin, then the test fails anyway, and we avoid a false positive result. Because the steps are articulated as code using Behave, the test report will show what caused the experiment to fail and whether it was an environmental issue (a false positive) rather than a true failure (the workload didn’t respond as we anticipated) during our chaos experiment.

The Given, And steps are launched using step functions like the following example, which in turn call the relevant step_helper functions. Note how the phrases from the scenario are represented in the decorator for the step_impl function; this is how you link the human-readable language in the scenario to the Python code that initiates the test logic.

@step("My website is up and can serve {number} transactions per second")
def step_impl(context, number):

    target = f"http://{context.config.userdata['website_hostname']}"

    logger.info(f'Sending traffic to target website: {target} for the next 60 seconds, please wait....')
    send_traffic_to_website(target, 60, "before_chaos", int(number))

    assert verify_locust_run(int(number), "before_chaos") is True

Once the Given, And steps have initiated successfully, we are satisfied that the conditions for the experiment are appropriate. Next, we launch the chaos actions using the When step. Here, we interact with FIS using boto3 to start the experiment template that was created earlier using CloudFormation. The following code snippet shows the code that begins this step:

@step("an EC2 instance is lost")
def step_impl(context):

    if "fis" not in context.clients:
        create_client("fis", context)

    state = start_experiment(
        context.clients["fis"], context.config.userdata["fis_experiment_id"]
    )["experiment"]["state"]["status"]
    logger.info(f"FIS experiment state is: {state}")

    assert state in ["running", "initiating"]

The experiment template used here is intentionally a very simple, single-step experiment as an example for this blog. FIS enables you to create elaborate multi-step experiments in a straightforward manner; for more information, refer to the AWS FIS actions reference.

The experiment is now in flight! We then launch the Then, And steps to validate the hypothesis expressed in the Scenario, querying the website endpoint to see whether we get any failed requests:

@step("I can continue to serve {number} transactions per second")
def step_impl(context, number):

    target = f"http://{context.config.userdata['website_hostname']}"

    logger.info(f'Sending traffic to target website {target} for the next 60 seconds, please wait....')
    send_traffic_to_website(target, 60, "after_chaos", int(number))

    assert verify_locust_run(int(number), "after_chaos") is True


@step("{percentage} percent of transactions to my website succeed")
def step_impl(context, percentage):

    assert success_percent(int(percentage), "after_chaos") is True

You can add as many Given, When, Then steps to validate your Scenario (the experiment’s hypothesis) as you need; for example, you can use additional FIS actions to validate what happens if a network failure prevents traffic to a subnet. You can also code your own actions using AWS Systems Manager or boto3 calls of your choice.

In our experiment, the results have validated our hypothesis, as seen in Figure 2.

Figure 2. Hypothesis validation example

There are a few different ways to format your results when using Behave so that they are easier to pull into a report; Allure is a nice example.
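One possible way to generate an Allure report from a Behave run might look like the following sketch; it assumes you have installed the allure-behave Python package and the Allure CLI, neither of which is part of the blog’s sample repository:

# Install the Allure formatter for Behave (assumption: not included in requirements.txt)
pip install allure-behave

# Run the experiment and write Allure-compatible results to a local folder
behave -f allure_behave.formatter:AllureFormatter -o allure-results

# Render and serve the report locally (requires the Allure CLI to be installed)
allure serve allure-results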

To follow along, the steps in the Implementation details section will help you launch the chaos experiment from your CLI. As previously stated, if you use this approach in your development lifecycle, you would hoist this into your CI/CD pipeline and tooling rather than launching it locally.

Implementation details

Prerequisites

To deploy the chaos experiment and test application, you will need:

Note: Website availability tests are initiated from your CLI in the sample code used in this blog. If you are traversing a busy corporate proxy or a network connection that is not stable, then it may cause the experiment to fail.

Further, to keep the prerequisites as minimal and self-contained as possible for this blog, we are using Locust as a Python library, which is not a robust Locust deployment. Using a Behave step implementation, we instantiate a local Locust runner to send traffic to the website we want to test before and after the step that takes the chaos action. For a robust implementation in your own test suite, you could build a Locust implementation behind a REST API or use a load-testing suite with an existing REST API, like BlazeMeter, which can be called from a Behave step and run for the full length of the experiment.

The CloudFormation template that you will launch with this post creates some public-facing EC2 instances. You should restrict access to just your public IP address using the instructions below. You can find your IP at https://checkip.amazonaws.com/. Use the IP address shown with a trailing /32, for example 1.2.3.4/32.
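For convenience, you could capture that CIDR in a shell variable and reuse it as the AllowedCidr parameter in the deployment steps below; the variable name here is just an example:

# Capture your public IP as a /32 CIDR (MY_CIDR is an illustrative name)
MY_CIDR="$(curl -s https://checkip.amazonaws.com)/32"
echo "${MY_CIDR}"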

Environment preparation

Clone the git repository aws-fis-behaviour-driven-chaos that contains the blog resources using the following command:

git clone https://github.com/aws-samples/aws-fis-behaviour-driven-chaos.git

We recommend creating a new, clean Python virtual environment and activating it:

python3 -m venv behavefisvenv
source behavefisvenv/bin/activate

Deployment steps

To be carried out from the root of the blog repo:

  1. Install the Python dependencies into your Python environment:
    pip install -r requirements.txt
  2. Create the test stack and wait for completion (ensure you replace the parameter value for AllowedCidr with your public IP address):
    aws cloudformation create-stack --stack-name my-chaos-stack --template-body file://cloudformation/infrastructure.yaml --region=eu-west-1 --parameters ParameterKey=AllowedCidr,ParameterValue=1.2.3.4/32 --capabilities CAPABILITY_IAM
    aws cloudformation wait stack-create-complete --stack-name my-chaos-stack --region=eu-west-1
  3. Once the deployment reaches a create-complete state, retrieve the stack outputs:
    aws cloudformation describe-stacks --stack-name my-chaos-stack --region=eu-west-1
  4. Copy the OutputValue of the stack Outputs for AlbHostname and FisExperimentId into the behave/userconfig.json file, replacing the placeholder values for website_hostname and fis_experiment_id, respectively.
  5. Replace the region value in the behave/userconfig.json file with the region you built the stack in (if altered in Step 2).
  6. Change directory into behave/.
    cd behave/
  7. Launch behave:
    behave
    Once completed, Locust results will appear inside the behave folder (Figure 3 is an example).

    Figure 3. Example CLI output

Cleanup

If you used the CloudFormation templates that we provided to create AWS resources to follow along with this blog post, delete them now to avoid future recurring charges.

To delete the stack, run:

aws cloudformation delete-stack --stack-name my-chaos-stack --region=eu-west-1 &&
aws cloudformation wait stack-delete-complete --stack-name my-chaos-stack --region=eu-west-1

Conclusion

This blog post has given usable and actionable insights into how you can wrap FIS actions and experiment templates in a way that fully defines and automates a chaos experiment, using language that is accessible to stakeholders outside of the test engineering team. You can extend what is presented here to test your own workloads with your own methods and metrics through a powerful suite of chaos experiments, which will build confidence in your workload’s continuous resilience and enable you to provide evidence of this to the wider organization.

Optimizing video encoding with FFmpeg using NVIDIA GPU-based Amazon EC2 instances

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/optimizing-video-encoding-with-ffmpeg-using-nvidia-gpu-based-amazon-ec2-instances/

This post is written by Alejandro Gil, Solutions Architect, and Joseba Echevarría, Solutions Architect.

Introduction

The purpose of this blog post is to compare video encoding performance between CPUs and Nvidia GPUs to determine the price/performance ratio in different scenarios while highlighting where it would be best to use a GPU.

Video encoding plays a critical role in modern media delivery, enabling efficient storage, delivery, and playback of high-quality video content across a wide range of devices and platforms.

Video encoding is frequently performed solely by the CPU because of its widespread availability and flexibility. Still, modern hardware includes specialized components designed specifically for very high-performance video encoding and decoding.

Nvidia GPUs, such as those found in the P and G Amazon EC2 instances, include this kind of built-in hardware in their NVENC (encoding) and NVDEC (decoding) accelerator engines, which can be used for real-time video encoding/decoding with minimal impact on the performance of the CPU or GPU.

Figure 1: NVIDIA NVDEC/NVENC architecture. Source https://developer.nvidia.com/video-codec-sdk

Scenario

Two main transcoding job types should be considered, depending on the video delivery use case: 1) batch jobs for on-demand video files, and 2) streaming jobs for real-time, low-latency use cases. To achieve optimal throughput and cost efficiency, it is a best practice to encode the videos in parallel on the same instance.
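To illustrate what encoding several renditions in parallel on one instance can look like, here is a minimal sketch of a single FFmpeg invocation with multiple outputs; the file names, preset, and CRF value are placeholders rather than the exact benchmark configuration:

# Illustrative only: one FFmpeg invocation producing three renditions in parallel
# from a single 4K input (names and quality settings are placeholders)
ffmpeg -i input_4k60.mp4 \
  -map 0:v -c:v libx264 -preset medium -crf 23 -vf scale=-2:1080 out_1080p.mp4 \
  -map 0:v -c:v libx264 -preset medium -crf 23 -vf scale=-2:720  out_720p.mp4 \
  -map 0:v -c:v libx264 -preset medium -crf 23 -vf scale=-2:480  out_480p.mp4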

The instance types used in this benchmark (g4dn and p3) are listed in the table in Figure 2. For hardware comparison purposes, the p4d instance has also been included, showing the GPU specs and the total number of NVDEC and NVENC cores in these EC2 instances. Depending on your requirements, multiple GPU instance types are available in EC2.

Instance size | GPUs | GPU model | NVDEC generation | NVENC generation | NVDEC cores/GPU | NVENC cores/GPU
g4dn.xlarge   | 1    | T4        | 4th              | 7th              | 2               | 1
p3.2xlarge    | 1    | V100      | 3rd              | 6th              | 1               | 3
p4d.24xlarge  | 8    | A100      | 4th              | N/A              | 5               | 0

Figure 2: GPU instances specifications

Benchmark

In order to determine which encoding strategy is the most convenient for each scenario, a benchmark will be conducted comparing CPU and GPU instances across different video settings. The results will be further presented using graphical representations of the performance indicators obtained.

The benchmark uses 3 input videos with different motion and detail levels (still, medium motion and high dynamic scene) in 4k resolution at 60 frames per second. The tests will show the average performance for encoding with FFmpeg 6.0 in batch (using Constant Rate Factor (CRF) mode) and streaming (using Constant Bit Rate (CBR)) with x264 and x265 codecs to five output resolutions (1080p, 720p, 480p, 360p and 160p).

The benchmark tests encoding the target videos into H.264 and H.265 using the x264 and x265 open-source libraries in FFmpeg 6.0 on the CPU, and the NVENC accelerator when using the Nvidia GPU. The H.264 standard enjoys broad compatibility, with most consumer devices supporting accelerated decoding. The H.265 standard offers superior compression at a given level of quality compared to H.264, but hardware-accelerated decoding is not as widely deployed. As a result, most media delivery scenarios require more than one video format in order to provide the best possible user experience.

Offline (batch) encoding

This test consists of a batch encoding with two different standard presets (ultrafast and medium for CPU-based encoding, and the p1 and medium presets for GPU-accelerated encoding) as defined in the FFmpeg guide.
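As a rough sketch of the kind of commands being compared (not the exact benchmark invocations), a CPU encode with libx264 and a comparable GPU encode with NVENC might look like the following; the presets, quality values, and file names are illustrative, and the NVENC command assumes an FFmpeg build with NVENC support plus an NVIDIA GPU and drivers:

# CPU: software H.264 with libx264 (the benchmark also uses libx265 for H.265)
ffmpeg -i input_4k60.mp4 -c:v libx264 -preset medium -crf 23 -vf scale=-2:1080 out_cpu_1080p.mp4

# GPU: hardware H.264 with NVENC; -cq is NVENC's constant-quality control
ffmpeg -i input_4k60.mp4 -c:v h264_nvenc -preset p1 -cq 23 -vf scale=-2:1080 out_gpu_1080p.mp4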

The following chart shows the relative cost of transcoding 1 million frames to the 5 different output resolutions in parallel for CPU-encoding EC2 instance (c6i.4xlarge) and two types of GPU-powered instances (g4dn.xlarge and p3.2xlarge). The results are normalized so that the cost of x264 ultrafast preset on c6i.4xlarge is equal to one.

Figure 3: Batch encoding performance for CPU and GPU instances.

The performance of batch encoding in the best GPU instance (g4dn.xlarge) shows around 73% better price/performance in x264 compared to the c6i.4xlarge and around 82% improvement in x265.

A relevant aspect to take into account is that the presets used are not exactly equivalent for each hardware target, because FFmpeg uses different encoder implementations depending on where the process runs (that is, CPU or GPU). As a consequence, the video outputs in each case differ noticeably. Generally, NVENC-encoded videos (GPU) tend to have higher quality in H.264, whereas CPU outputs present more encoding artifacts. The difference is more noticeable in the lower-quality cases (ultrafast/p1 presets or streaming use cases).

The following images compare the output quality for the medium motion video in the ultrafast/p1 and medium presets.

As the following example clearly shows, the h264_nvenc (GPU) encoder outperforms the libx264 (CPU) encoder in terms of quality, showing less pixelation, especially with the ultrafast preset. For the medium preset, although the quality difference is less pronounced, the GPU output file is noticeably larger (refer to the table in Figure 6).

Figure 4: Result comparison between GPU and CPU for h264, ultrafast

Figure 5: Result comparison between GPU and CPU for h264, medium

The output file sizes mainly depend on the preset, codec and input video. The different configurations can be found in the following table.

Figure 6: Sizes for output batch encoded videos. Streaming not represented because the size is the same (fixed bitrate)

Live stream encoding

For live streaming use cases, it is useful to measure how many streams a single instance can sustain while transcoding to five output resolutions (1080p, 720p, 480p, 360p, and 160p). The following results show the relative cost of each instance, calculated as the number of streams the instance was able to sustain divided by its cost per hour.

Figure 7: Streaming encoding performance for CPU and GPU instances.

The previous results show that a GPU-based instance family like g4dn is ideal for streaming use cases, where a single instance can sustain up to 4 parallel encodings from 4K to 1080p, 720p, 480p, 360p, and 160p simultaneously. Notice that the performance of the GPU-based p3 family does not compensate for its higher cost.

On the other hand, the CPU-based instances can sustain at most 1 parallel stream. If you want to sustain the same number of parallel streams on Intel-based instances, you’d have to opt for a much larger instance (c6i.12xlarge can almost sustain 3 simultaneous streams, but it struggles to keep up with the more dynamic scenes when encoding with x265) at a much higher cost ($2.1888 per hour for c6i.12xlarge versus $0.587 for g4dn.xlarge).

The price/performance difference is around 68% better in GPU for x264 and 79% for x265.
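For reference, a CBR-style command of the kind used for a single live-stream rendition might look like the following sketch; the bitrate, preset, and streaming endpoint are placeholders rather than the benchmark’s exact settings:

# Illustrative only: a CBR-style H.264 live encode of one ladder rung
# (bitrates, preset, and the RTMP endpoint are placeholders)
ffmpeg -re -i input_4k60.mp4 \
  -c:v libx264 -preset ultrafast -tune zerolatency \
  -b:v 6M -maxrate 6M -bufsize 12M -vf scale=-2:1080 \
  -f flv rtmp://your-streaming-endpoint/live/stream_1080p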

Conclusion

The results show that for the tested scenarios there can be a price/performance gain when transcoding with GPU compared to CPU. Also, GPU-encoded videos tend to have an equal or higher perceived quality level than their CPU-encoded counterparts, and there is no significant performance penalty for encoding to the more advanced H.265 format, which can make GPU-based encoding pipelines an attractive option.

Still, CPU encoders do a particularly good job of containing output file sizes for most of the cases we tested, producing smaller output files even when the perceived quality is similar. This is an important aspect to take into account since it can have a big impact on cost. Depending on the number of media files distributed to and consumed by end users, the data transfer and storage costs will noticeably increase if GPUs are used. With this in mind, it is important to weigh the compute costs against the data transfer and storage costs for your use case when choosing between CPU- and GPU-based video encoding.

One additional point to consider is pipeline flexibility. Whereas the GPU encoding pipeline is relatively rigid, CPU-based pipelines can be modified to the customer’s needs, including additional FFmpeg filters to accommodate future requirements.

The test did not include any specific quality measurements of the transcoded videos, but it would be interesting to perform an analysis based on quantitative VMAF (or similar) metrics. We always recommend running your own tests to validate whether the results obtained meet your requirements.

Benchmarking method

This blog post builds on the original work described in Optimized Video Encoding with FFmpeg on AWS Graviton Processors, and the benchmarking process has been kept the same in order to preserve consistency of the results. The original article analyzes in detail the price/performance advantages of AWS Graviton3 compared to other processors.

Figure 8: Batch encoding workflow

Amazon Q brings generative AI-powered assistance to IT pros and developers (preview)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/amazon-q-brings-generative-ai-powered-assistance-to-it-pros-and-developers-preview/

Today, we are announcing the preview of Amazon Q, a new type of generative artificial intelligence (AI) powered assistant that is specifically for work and can be tailored to a customer’s business.

Amazon Q brings a set of capabilities to support developers and IT professionals. Now you can use Amazon Q to get started building applications on AWS, research best practices, resolve errors, and get assistance in coding new features for your applications. For example, Amazon Q Code Transformation can now upgrade Java applications from versions 8 and 11 to version 17.

Amazon Q is available in multiple areas of AWS to provide quick access to answers and ideas wherever you work. Here’s a quick look at Amazon Q, including in the integrated development environment (IDE):

Building applications together with Amazon Q
Application development is a journey. It involves a continuous cycle of researching, developing, deploying, optimizing, and maintaining. At each stage, there are many questions—from figuring out the right AWS services to use, to troubleshooting issues in the application code.

Trained on 17 years of AWS knowledge and best practices, Amazon Q is designed to help you at each stage of development with a new experience for building applications on AWS. With Amazon Q, you minimize the time and effort you need to gain the knowledge required to answer AWS questions, explore new AWS capabilities, learn unfamiliar technologies, and architect solutions that fuel innovation.

Let us show you some capabilities of Amazon Q.

1. Conversational Q&A capability
You can interact with the Amazon Q conversational Q&A capability to get started, learn new things, research best practices, and iterate on how to build applications on AWS without needing to shift focus away from the AWS console.

To start using this feature, you can select the Amazon Q icon on the right-hand side of the AWS Management Console.

For example, you can ask, “What are AWS serverless services to build serverless APIs?” Amazon Q provides concise explanations along with references you can use to follow up on your questions and validate the guidance. You can also use Amazon Q to follow up on and iterate on your questions, and it will show deeper, more detailed answers with references.

There are times when we have questions for a use case with fairly specific requirements. With Amazon Q, you can elaborate on your use cases in more detail to provide context.

For example, you can ask Amazon Q, “I’m planning to create serverless APIs with 100k requests/day. Each request needs to lookup into the database. What are the best services for this workload?” Amazon Q responds with a list of AWS services you can use and tries to limit the answer results to those that are accurately referenceable and verified with best practices.

Here is some additional information that you might want to note:

2. Optimize Amazon EC2 instance selection
Choosing the right Amazon Elastic Compute Cloud (Amazon EC2) instance type for your workload can be challenging with all the options available. Amazon Q aims to make this easier by providing personalized recommendations.

To use this feature, you can ask Amazon Q, “Which instance families should I use to deploy a Web App Server for hosting an application?” This feature is also available when you choose to launch an instance in the Amazon EC2 console. In Instance type, you can select Get advice on instance type selection. This will show a dialog to define your requirements.

Your requirements are automatically translated into a prompt on the Amazon Q chat panel. Amazon Q returns with a list of suggestions of EC2 instances that are suitable for your use cases. This capability helps you pick the right instance type and settings so your workloads will run smoothly and more cost-efficiently.

This capability to provide EC2 instance type recommendations based on your use case is available in preview in all commercial AWS Regions.

3. Troubleshoot and solve errors directly in the console
Amazon Q can also help you to solve errors for various AWS services directly in the console. With Amazon Q proposed solutions, you can avoid slow manual log checks or research.

Let’s say that you have an AWS Lambda function that tries to interact with an Amazon DynamoDB table. But, for an as-yet-unknown reason, it fails to run. Now, with Amazon Q, you can troubleshoot and resolve this issue faster by selecting Troubleshoot with Amazon Q.

Amazon Q provides concise analysis of the error which helps you to understand the root cause of the problem and the proposed resolution. With this information, you can follow the steps described by Amazon Q to fix the issue.

In just a few minutes, you will have the solution to solve your issues, saving significant time without disrupting your development workflow. The Amazon Q capability to help you troubleshoot errors in the console is available in preview in the US West (Oregon) for Amazon Elastic Compute Cloud (Amazon EC2), Amazon Simple Storage Service (Amazon S3), Amazon ECS, and AWS Lambda.

4. Network troubleshooting assistance
You can also ask Amazon Q to assist you in troubleshooting network connectivity issues caused by network misconfiguration in your current AWS account. For this capability, Amazon Q works with Amazon VPC Reachability Analyzer to check your connections and inspect your network configuration to identify potential issues.

This makes it easy to diagnose and resolve AWS networking problems, such as “Why can’t I SSH to my EC2 instance?” or “Why can’t I reach my web server from the Internet?” which you can ask Amazon Q.

Then, on the response text, you can select preview experience here, which will provide explanations to help you to troubleshoot network connectivity-related issues.

Here are a few things you need to know:

5. Integration and conversational capabilities within your IDEs
As we mentioned, Amazon Q is also available in supported IDEs. This allows you to ask questions and get help within your IDE by chatting with Amazon Q or invoking actions by typing / in the chat box.

To get started, you need to install or update the latest AWS Toolkit and sign in to Amazon CodeWhisperer. Once you’re signed in to Amazon CodeWhisperer, it will automatically activate the Amazon Q conversational capability in the IDE. With Amazon Q enabled, you can now start chatting to get coding assistance.

You can ask Amazon Q to describe your source code file.

From here, you can improve your application, for example, by integrating it with Amazon DynamoDB. You can ask Amazon Q, “Generate code to save data into DynamoDB table called save_data() accepting data parameter and return boolean status if the operation successfully runs.”

Once you’ve reviewed the generated code, you can manually copy and paste it into the editor. You can also select Insert at cursor to place the generated code into the source code directly.

This feature makes it easy to stay focused on building applications because you don’t have to leave your IDE to get answers and context-specific coding guidance. You can try the preview of this feature in Visual Studio Code and JetBrains IDEs.

6. Feature development capability
Another exciting feature that Amazon Q provides is guiding you interactively from idea to building new features within your IDE and Amazon CodeCatalyst. You can go from a natural language prompt to application features in minutes, with interactive step-by-step instructions and best practices, right from your IDE. With a prompt, Amazon Q will attempt to understand your application structure and break down your prompt into logical, atomic implementation steps.

To use this capability, you can start by invoking an action command /dev in Amazon Q and describe the task you need Amazon Q to process.

Then, from here, you can review, collaborate and guide Amazon Q in the chat for specific areas that need to be implemented.

Additional capabilities to help you ship features faster with complete pull requests are available if you’re using Amazon CodeCatalyst. In Amazon CodeCatalyst, you can assign a new or an existing issue to Amazon Q, and it will process an end-to-end development workflow for you. Amazon Q will review the existing code, propose a solution approach, seek feedback from you on the approach, generate merge-ready code, and publish a pull request for review. All you need to do after is to review the proposed solutions from Amazon Q.

The following screenshots show a pull request created by Amazon Q in Amazon CodeCatalyst.

Here are a couple of things that you should know:

  • Amazon Q feature development capability is currently in preview in Visual Studio Code and Amazon CodeCatalyst
  • To use this capability in IDE, you need to have the Amazon CodeWhisperer Professional tier. Learn more on the Amazon CodeWhisperer pricing page.

7. Upgrade applications with Amazon Q Code Transformation
With Amazon Q, you can now upgrade an entire application within a few hours by starting a guided code transformation. This capability, called Amazon Q Code Transformation, simplifies maintaining, migrating, and upgrading your existing applications.

To start, navigate to the CodeWhisperer section and then select Transform. Amazon Q Code Transformation automatically analyzes your existing codebase, generates a transformation plan, and completes the key transformation tasks suggested by the plan.

Some additional information about this feature:

  • Amazon Q Code Transformation is available in preview today in the AWS Toolkit for IntelliJ IDEA and the AWS Toolkit for Visual Studio Code.
  • To use this capability, you need to have the Amazon CodeWhisperer Professional tier during the preview.
  • During the preview, you can upgrade Java 8 and 11 applications to version 17, a Java Long-Term Support (LTS) release.

Get started with Amazon Q today
With Amazon Q, you have an AI expert by your side to answer questions, write code faster, troubleshoot issues, optimize workloads, and even help you code new features. These capabilities simplify every phase of building applications on AWS.

Amazon Q lets you engage with AWS Support agents directly from the Q interface if additional assistance is required, eliminating any dead ends in the customer’s self-service experience. The integration with AWS Support is available in the console and will honor the entitlements of your AWS Support plan.

Learn more

— Donnie & Channy

Introducing Amazon EC2 high memory U7i Instances for large in-memory databases (preview)

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/introducing-amazon-ec2-high-memory-u7i-instances-for-large-in-memory-databases-preview/

The new U7i instances are designed to support large, in-memory databases including SAP HANA, Oracle, and SQL Server. Powered by custom fourth-generation Intel Xeon Scalable processors (Sapphire Rapids), the instances are now available in preview form in the US West (Oregon), Asia Pacific (Seoul), and Europe (Frankfurt) AWS Regions, as follows:

Instance Name       | vCPUs | Memory (DDR5) | EBS Bandwidth | Network Bandwidth
u7in-16tb.224xlarge | 896   | 16,384 GiB    | 100 Gbps      | 100 Gbps
u7in-24tb.224xlarge | 896   | 24,576 GiB    | 100 Gbps      | 100 Gbps
u7in-32tb.224xlarge | 896   | 32,768 GiB    | 100 Gbps      | 100 Gbps

We are also working on a smaller instance:

Instance Name      | vCPUs | Memory (DDR5) | EBS Bandwidth | Network Bandwidth
u7i-12tb.224xlarge | 896   | 12,288 GiB    | 60 Gbps       | 100 Gbps

Here’s what 32 TiB of memory looks like:

And here are the 896 vCPUs (and lots of other info):

When compared to the first generation of High Memory instances, the U7i instances offer up to 125% more compute performance and up to 120% more memory performance. They also provide 2.5x as much EBS bandwidth, giving you the ability to hydrate in-memory databases at a rate of up to 44 terabytes per hour.

Each U7i instance supports attachment of up to 128 General Purpose (gp2 and gp3) or Provisioned IOPS (io1 and io2 Block Express) EBS volumes. Each io2 Block Express volume can be as big as 64 TiB and can deliver up to 256K IOPS at up to 32 Gbps, making them a great match for the U7i instance.
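As a sketch of how you might provision one such volume with the AWS CLI, the following commands create a 64 TiB io2 Block Express volume with 256,000 provisioned IOPS and attach it to an instance; the volume ID, instance ID, Availability Zone, and device name are placeholders, not values from this post:

# Illustrative only: create a 64 TiB io2 Block Express volume with 256,000 IOPS
aws ec2 create-volume --volume-type io2 --size 65536 --iops 256000 \
  --availability-zone us-west-2a

# Attach it to a U7i instance (IDs and device name are placeholders)
aws ec2 attach-volume --volume-id vol-0123456789abcdef0 \
  --instance-id i-0123456789abcdef0 --device /dev/sdf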

On the network side, the instances support ENA Express and deliver up to 25 Gbps of bandwidth per network flow.

Supported operating systems include Red Hat Enterprise Linux and SUSE Enterprise Linux Server.

Join the Preview
If you are ready to put the U7i instances to the test in your environment, join the preview.

Jeff;

The attendee’s guide to the AWS re:Invent 2023 Compute track

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/the-attendees-guide-to-the-aws-reinvent-2023-compute-track/

This post is written by Art Baudo, Principal Product Marketing Manager, Amazon EC2, and Pranaya Anshu, Product Marketing Manager, Amazon EC2.

We are just a few weeks away from AWS re:Invent 2023, AWS’s biggest cloud computing event of the year. This event will be a great opportunity for you to meet other cloud enthusiasts, find productive solutions that can transform your company, and learn new skills through 2000+ learning sessions.

Even if you are not able to join in person, you can catch up with many of the sessions on demand and even watch the keynote and innovation sessions live.

If you’re able to join us, just a reminder that we offer several types of sessions that can help maximize your learning across a variety of AWS topics. Breakout sessions are lecture-style, 60-minute informative sessions presented by AWS experts, customers, or partners. These sessions are recorded and uploaded to the AWS Events YouTube channel a few days later.

re:Invent attendees can also choose to attend chalk talks, builder sessions, workshops, or code talk sessions. Each of these is a live, non-recorded, interactive session.

  • Chalk-talk sessions: Attendees will interact with presenters, asking questions and using a whiteboard in session.
  • Builder Sessions: Attendees participate in a one-hour session and build something.
  • Workshop sessions: Attendees join a two-hour interactive session where they work in a small team to solve a real problem using AWS services.
  • Code talk sessions: Attendees participate in engaging code-focused sessions where an expert leads a live coding session.

To start planning your re:Invent week, check out some of the Compute track sessions below. If you find a session you’re interested in, be sure to reserve your seat for it through the AWS attendee portal.

Explore the latest compute innovations

This year, AWS compute services have launched numerous innovations: from over 100 new Amazon EC2 instances, to the general availability of Amazon EC2 Trn1n instances powered by AWS Trainium and Amazon EC2 Inf2 instances powered by AWS Inferentia2, to a new way to reserve GPU capacity with Amazon EC2 Capacity Blocks for ML. There are a lot of exciting launches to take in.

Explore some of these latest and greatest innovations in the following sessions:

  • CMP102 | What’s new with Amazon EC2
    Provides an overview of the latest Amazon EC2 innovations. Hear about recent Amazon EC2 launches, learn about the differences between Amazon EC2 instance families, and find out how you can use a mix of instances to deliver on your cost, performance, and sustainability goals.
  • CMP217 | Select and launch the right instance for your workload and budget
    Learn how to select the right instance for your workload and budget. This session will focus on innovations including Amazon EC2 Flex instances and the new generation of Intel, AMD, and AWS Graviton instances.
  • CMP219-INT | Compute innovation for any application, anywhere
    Provides you with an understanding of the breadth and depth of AWS compute offerings and innovation. Discover how you can run any application, including enterprise applications, HPC, generative artificial intelligence (AI), containers, databases, and games, on AWS.

Customer experiences and applications with machine learning

Machine learning (ML) has been evolving for decades and has reached an inflection point, with generative AI applications capturing widespread attention and imagination. More customers, across a diverse set of industries, choose AWS over any other major cloud provider to build, train, and deploy their ML applications. Learn about the generative AI infrastructure at Amazon or get hands-on experience building ML applications through our ML-focused sessions, such as the following:

Discover what powers AWS compute

AWS has invested years designing custom silicon optimized for the cloud to deliver the best price performance for a wide range of applications and workloads using AWS services. Learn more about the AWS Nitro System, processors at AWS, and ML chips.

Optimize your compute costs

At AWS, we focus on delivering the best possible cost structure for our customers. Frugality is one of our founding leadership principles. Cost-effective design continues to shape everything we do, from how we develop products to how we run our operations. Come learn about new ways to optimize your compute costs through AWS services, tools, and optimization strategies in the following sessions:

Check out workload-specific sessions

Amazon EC2 offers the broadest and deepest compute platform to help you best match the needs of your workload. More SAP, high performance computing (HPC), ML, and Windows workloads run on AWS than on any other cloud. Join sessions focused on your specific workload to learn how you can leverage AWS solutions to accelerate your innovations.

Hear from AWS customers

AWS serves millions of customers of all sizes across thousands of use cases, every industry, and around the world. Hear customers dive into how AWS compute solutions have helped them transform their businesses.

Ready to unlock new possibilities?

The AWS Compute team looks forward to seeing you in Las Vegas. Come meet us at the Compute Booth in the Expo. And if you’re looking for more session recommendations, check out additional re:Invent attendee guides curated by experts.