Tag Archives: Amazon EC2

AWS Weekly Roundup: Jamba 1.5 family, Llama 3.2, Amazon EC2 C8g and M8g instances and more (Sep 30, 2024)

Post Syndicated from Elizabeth Fuentes original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-jamba-1-5-family-llama-3-2-amazon-ec2-c8g-and-m8g-instances-and-more-sep-30-2024/

Every week, there’s a new Amazon Web Services (AWS) community event where you can network, learn something new, and immerse yourself in the community. When you’re in a community, everyone grows together, and no one is left behind. Last week was no exception. I can highlight the Dutch AWS Community Day where Viktoria Semaan closed with a talk titled How to Create Impactful Content and Build a Strong Personal Brand, and the Peru User Group, who organized two days of talks and learning opportunities: UGCONF & SERVERLESSDAY 2024, featuring Jeff Barr, who spoke about how to Create Your Own Luck. The community events continue, so check them out at Upcoming AWS Community Days.

Last week’s launches
Here are the launches that got my attention.

Jamba 1.5 family of models by AI21 Labs is now available in Amazon Bedrock – The Jamba 1.5 Large and 1.5 Mini models feature a 256k context window, one of the longest on the market, enabling complex tasks like lengthy document analysis. With native support for structured JSON output, function calling, and document processing, they integrate into enterprise workflows for specialized AI solutions. To learn more, read Jamba 1.5 family of models by AI21 Labs is now available in Amazon Bedrock, visit the AI21 Labs in Amazon Bedrock page, and read the documentation.

AWS Lambda now supports Amazon Linux 2023 runtimes in AWS GovCloud (US) Regions – These runtimes offer the latest language features, including Python 3.12, Node.js 20, Java 21, .NET 8, and Ruby 3.3, as well as an OS-only Amazon Linux 2023 runtime. They have smaller deployment footprints, updated libraries, and a new package manager. You can also use the container base images to build and deploy functions as a container image.

Amazon SageMaker Studio now supports automatic shutdown of idle applications – You can now enable automatic shutdown of inactive JupyterLab and CodeEditor applications using Amazon SageMaker Distribution image v2.0 or newer. Administrators can set idle shutdown times at domain or user profile levels, with optional user customization. This cost control mechanism helps avoid charges for unused instances and is available across all AWS Regions where SageMaker Studio is offered.

Amazon S3 is implementing a default 128 KB minimum object size for S3 Lifecycle transition rules to any S3 storage class – This default reduces transition costs for datasets with many small objects by decreasing the number of transition requests. You can override the default and customize the minimum object size. Existing rules remain unchanged, but the new default applies to new or modified configurations.
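
If you want to keep transitioning objects smaller than the new 128 KB default, a lifecycle rule can set its own threshold. Here's a minimal boto3 sketch; the bucket name, transition window, and storage class are placeholder assumptions:

import boto3

s3 = boto3.client("s3")

# ObjectSizeGreaterThan is in bytes; 1024 here overrides the 128 KB default
# so objects larger than 1 KB still transition after 30 days.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-bucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "transition-small-objects",
            "Status": "Enabled",
            "Filter": {"ObjectSizeGreaterThan": 1024},
            "Transitions": [{"Days": 30, "StorageClass": "GLACIER_IR"}],
        }]
    },
)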

AWS Lake Formation centralized access control for Amazon Redshift data sharing is now available in 11 additional Regions – This enables granular permissions management, including table-, column-, and row-level access to shared Amazon Redshift data. It also supports tag-based access control and trusted identity propagation with AWS IAM Identity Center for improved security and simplified management.

Llama 3.2 generative AI models now available in Amazon Bedrock – The collection includes 90B and 11B parameter multimodal models for sophisticated reasoning tasks, and 3B and 1B text-only models for edge devices. These models support vision tasks, offer improved performance, and are designed for responsible AI innovation across various applications. These models support a 128K context length and multilingual capabilities in eight languages. Learn more about it in Introducing Llama 3.2 models from Meta in Amazon Bedrock.
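
To try the new models, you can call them through the Bedrock Converse API. A hedged sketch follows; the model ID below is the cross-Region inference profile listed in the launch materials, so treat it and the Region as assumptions to verify for your account:

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-west-2")

# Llama 3.2 11B via its inference profile ID (assumption: verify in your Region).
response = bedrock.converse(
    modelId="us.meta.llama3-2-11b-instruct-v1:0",
    messages=[{
        "role": "user",
        "content": [{"text": "Summarize AWS Graviton4 in two sentences."}],
    }],
    inferenceConfig={"maxTokens": 256, "temperature": 0.5},
)
print(response["output"]["message"]["content"][0]["text"])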

Share AWS End User Messaging SMS resources across multiple AWS accounts – You can use AWS Resource Access Manager (AWS RAM) to share phone numbers, sender IDs, phone pools, and opt-out lists. Additionally, Amazon SNS now delivers SMS text messages through AWS End User Messaging, offering enhanced features like two-way messaging and granular permissions. These updates provide greater flexibility and control for SMS messaging across AWS services.

AWS Serverless Application Repository now supports AWS PrivateLink – This enables direct connections from your Amazon Virtual Private Cloud (Amazon VPC) without internet exposure, enhancing security by keeping communication within the AWS network. Available in all Regions where AWS Serverless Application Repository is offered, it can be set up using the AWS Management Console or the AWS Command Line Interface (AWS CLI).

Amazon SageMaker with MLflow now supports AWS PrivateLink for secure traffic routing – This enables secure data transfer from your Amazon Virtual Private Cloud (Amazon VPC) to MLflow Tracking Servers within the AWS network, protecting sensitive information by avoiding public internet exposure. Available in most AWS Regions, it improves security for machine learning (ML) and generative AI experimentation using MLflow.
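
Both of these launches work through interface VPC endpoints. As a rough sketch of the setup with boto3: the service name below follows the usual com.amazonaws.<region>.<service> convention but should be verified, and the VPC, subnet, and security group IDs are placeholders:

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create an interface endpoint so traffic stays on the AWS network.
response = ec2.create_vpc_endpoint(
    VpcEndpointType="Interface",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.serverlessrepo",  # assumption: verify
    SubnetIds=["subnet-0123456789abcdef0"],
    SecurityGroupIds=["sg-0123456789abcdef0"],
    PrivateDnsEnabled=True,
)
print(response["VpcEndpoint"]["VpcEndpointId"])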

Introducing Amazon EC2 C8g and M8g Instances – Enhanced performance for compute-intensive and general-purpose workloads. With up to three times more vCPUs, three times more memory, 75 percent more memory bandwidth, and two times more L2 cache, these instances improve data processing, scalability, and cost-efficiency for various applications including high performance computing (HPC), batch processing, and microservices. Read more in Run your compute-intensive and general purpose workloads sustainably with the new Amazon EC2 C8g, M8g instances.

Llama 3.2 models are now available in Amazon SageMaker JumpStart – These models offer various sizes from 1B to 90B parameters, support multimodal tasks, including image reasoning, and are more efficient for AI workloads. The 1B and 3B models can be fine-tuned, while Llama Guard 3 11B Vision supports responsible innovation and system-level safety. Learn more in Llama 3.2 models from Meta are now available in Amazon SageMaker JumpStart.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional projects, blog posts, and news items that you might find interesting:

Deploy generative AI agents in your contact center for voice and chat using Amazon Connect, Amazon Lex, and Amazon Bedrock Knowledge Bases – This solution enables low-latency customer interactions, answering queries from a knowledge base. Features include conversation analytics, automated testing, and hallucination detection in a serverless architecture.

How AWS WAF threat intelligence features help protect the player experience for betting and gaming customers – AWS WAF enhances bot protection for betting and gaming. New features include browser fingerprinting, automation detection, and ML models to identify coordinated bots. These tools combat scraping, fraud, distributed denial of service (DDoS) attacks, and cheating, safeguarding player experiences.

How to migrate 3DES keys from a FIPS to a non-FIPS AWS CloudHSM cluster – Learn how to securely transfer Triple Data Encryption Algorithm (3DES) keys from Federal Information Processing Standard (FIPS) hsm1 to non-FIPS hsm2 clusters using RSA-AES wrapping, without backups. This enables using new hsm2.medium instances with FIPS 140-3 Level 3 support, non-FIPS mode, increased key capacity, and mutual TLS (mTLS).

Upcoming AWS events
Check your calendars and sign up for upcoming AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. These events offer technical sessions, demonstrations, and workshops delivered by experts. There is only one event left that you can still register for: Ottawa (October 9).

AWS Community Days – Join community-led conferences featuring technical discussions, workshops, and hands-on labs driven by expert AWS users and industry leaders from around the world. Upcoming AWS Community Days are scheduled for October 3 in the Netherlands and Romania, and on October 5 in Jaipur, Mexico, Bolivia, Ecuador, and Panama. I’m happy to share with you that I will be joining the Panama community on October 5.

AWS GenAI Lofts – Collaborative spaces and immersive experiences that showcase AWS’s expertise with the cloud and AI, while providing startups and developers with hands-on access to AI products and services, exclusive sessions with industry leaders, and valuable networking opportunities with investors and peers. Find a GenAI Loft location near you and don’t forget to register. I’ll be in the San Francisco lounge with some demos on October 15 at the Gen AI Developer Day. If you’re attending, feel free to stop by and say hello!

Browse all upcoming AWS led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Thanks to Dmytro Hlotenko and Diana Alfaro for the photos of their community events.

Eli

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Leverage IAM Roles for email sending via SES from EC2 and eliminate a common credential risk

Post Syndicated from Zip Zieper original https://aws.amazon.com/blogs/messaging-and-targeting/leverage-iam-roles-for-email-sending-via-ses-from-ec2-and-eliminate-a-common-credential-risk/

Sending automated transactional emails, such as account verifications and password resets, is a common requirement for web applications hosted on Amazon EC2 instances. Amazon SES provides multiple interfaces for sending emails, including SMTP, API, and the SES console itself. The type of SES credential you use with Amazon SES depends on the method through which you are sending the emails.

In this blog post, we describe how to leverage IAM roles for EC2 instances to securely send emails via the Amazon SES API, without the need to embed IAM credentials directly in the application code, link to a shared credentials file, or manage IAM credentials within the EC2 instance. By adopting the approach outlined in this blog, you can enhance security by eliminating the risk of credential exposure and simplify credential management for your web applications.

Solution Overview

Below we provide step-by-step instructions to configure an IAM role with SES permissions to use on your EC2 instance. This allows the EC2 hosted web application to securely send emails via Amazon SES without storing or managing IAM credentials within the EC2 instance. We present an option for running EC2 and SES in the same AWS account, as well as an option to accommodate running EC2 and SES in different AWS accounts. Both options offer a way to enhance security and simplify credential management.

Either option begins with creating an IAM role with SES permissions. Next, the IAM role is attached to your EC2 instance, providing it with the necessary permissions for SES without needing to embed IAM credentials in your application code or on a file in the EC2 instance. In option 2, we’ll add cross-account permissions that allow the code on the EC2 instance in account “A” to send email via the SES API in account “B”. We also provide a sample Python script that demonstrates how to send an email from your EC2 instance using the attached IAM role.

Option 1 – SES and EC2 are in a single AWS account

In a typical scenario where an EC2 instance is operating in the same AWS account as SES, the process of using an IAM role to send emails via SES is straightforward. In the steps below, you’ll configure and attach an IAM role to the EC2 instance. You’ll then update a sample Python script to use the permissions provided by the attached IAM role to send emails via SES. This direct access simplifies the SES sending process, as no explicit credential management is required in the code, nor do you need to include a shared credentials file on the EC2 instance.

EC2 & SES in the same AWS Account (architecture diagram)

Prerequisites – single AWS account for EC2 and SES

  • A single AWS account in a Region that supports SES.
  • A verified domain or email identity in Amazon SES.
    • Make note of a verified sending email address here: ___________
  • An EC2 instance (Linux) in the running state.
    • If you don’t have an EC2 instance, create one (Linux).
  • Administrative access to the Amazon SES, IAM, and EC2 consoles.
  • Access to a recipient email address to receive test emails from the Python script.
    • Make note of an SES-verified recipient email address to send test emails here: ___________

Step 1 – Create IAM Role for EC2 instance with SES Permissions

To start, create an IAM role that grants the necessary permissions to send emails using Amazon SES by following these steps:

  • Sign in to the AWS Management Console and open the IAM console.
  • In the navigation pane, choose “Roles,” and then choose “Create role.”
  • Choose “AWS service” as the trusted entity type and select “EC2” as the service that will use this role, then click ‘Next’.
  • Search for and select the “AmazonSESFullAccess” policy from the list (or create a custom policy with the necessary SES permissions), then click ‘Next’.
  • Provide a name for your role (e.g., EC2_SES_SendEmail_Role).
  • Click “Create role“.
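
If you prefer to script this step, here's a minimal boto3 sketch under the same assumptions (role name EC2_SES_SendEmail_Role, AmazonSESFullAccess managed policy):

import json
import boto3

iam = boto3.client("iam")
ROLE_NAME = "EC2_SES_SendEmail_Role"

# Trust policy that lets EC2 assume the role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "ec2.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

iam.create_role(RoleName=ROLE_NAME,
                AssumeRolePolicyDocument=json.dumps(trust_policy))
iam.attach_role_policy(RoleName=ROLE_NAME,
                       PolicyArn="arn:aws:iam::aws:policy/AmazonSESFullAccess")

# The console creates an instance profile automatically; in code you create
# one yourself so the role can be attached to an EC2 instance.
iam.create_instance_profile(InstanceProfileName=ROLE_NAME)
iam.add_role_to_instance_profile(InstanceProfileName=ROLE_NAME,
                                 RoleName=ROLE_NAME)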

Step 2 – Attach the IAM Role to the EC2 instance.

Next, attach the IAM role to your EC2 instance:

  • Open the EC2 Management Console.
  • In the navigation pane, choose “Instances,” and select the running EC2 instance to which you want to attach the IAM role.
  • With the instance selected, choose “Actions,” then “Security,” and “Modify IAM role.”
  • Choose the IAM role you created (EC2_SES_SendEmail_Role) from the drop-down menu and click “Update IAM role.”
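
The same attachment can be scripted; a small sketch, where the instance ID is a placeholder:

import boto3

ec2 = boto3.client("ec2")

# The instance profile name matches the role created in Step 1
# (the console uses the same name for both).
ec2.associate_iam_instance_profile(
    IamInstanceProfile={"Name": "EC2_SES_SendEmail_Role"},
    InstanceId="i-0123456789abcdef0",
)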

Step 3 – Create a sample Python script that sends emails from the EC2 instance with the attached role.

  • Now that your EC2 instance is configured with the necessary permissions, you can set up an example Python script to send emails via Amazon SES using the IAM Role. Here, we’re using the AWS SDK for Python (Boto3), a powerful and versatile library to interact with the SES API endpoint. Before running the example script, ensure that Python, pip (the package installer for Python), and the Boto3 library are installed on your EC2 instance:
    • Run the ‘python3 --version’ command to check if Python is installed on your EC2 instance. If Python is installed, the version will be displayed; otherwise you’ll receive a ‘command not found’ or similar error message.
      • If Python is not installed, run the command ‘sudo yum install python3 -y’.
    • Run the ‘pip3 --version’ command to check if pip is installed on your EC2 instance. If pip3 is installed, the version will be displayed; otherwise you’ll receive a ‘command not found’ or similar error message.
      • If pip3 is not installed, run the command ‘sudo yum install python3-pip’.
    • Install the Boto3 library, which allows Python scripts to interact with AWS services, including SES. Run the command ‘pip3 install boto3’ to install (or update) Boto3 using pip.
  • Save the code below as a Python file named ‘sesemail.py’ on your EC2 instance.
  • Edit ‘sesemail.py’ and replace the placeholder values of SENDER, RECIPIENT, and AWS_REGION with your values (see prerequisites). Do not modify any quotation marks.


import boto3
from botocore.exceptions import ClientError

SENDER = "[email protected]"
RECIPIENT = "[email protected]"
#CONFIGURATION_SET = "ConfigSet"
AWS_REGION = "us-west-2"
SUBJECT = "Amazon SES Test Email (SDK for Python) using IAM Role"
BODY_TEXT = ("Amazon SES Test (Python)\r\n"
             "This email was sent with Amazon SES using the "
             "AWS SDK for Python (Boto)."
            )
            
BODY_HTML = """<html>
<head></head>
<body>
  <h1>Amazon SES Test (SDK for Python) using IAM Role</h1>
  <p>This email was sent with
    <a href='https://aws.amazon.com/ses/'>Amazon SES</a> using the
    <a href='https://aws.amazon.com/sdk-for-python/'>
      AWS SDK for Python (Boto)</a>.</p>
</body>
</html>
            """            

CHARSET = "UTF-8"

client = boto3.client('ses', region_name=AWS_REGION)

try:
    response = client.send_email(
        Destination={
            'ToAddresses': [
                RECIPIENT,
            ],
        },
        Message={
            'Body': {
                'Html': {
                    'Charset': CHARSET,
                    'Data': BODY_HTML,
                },
                'Text': {
                    'Charset': CHARSET,
                    'Data': BODY_TEXT,
                },
            },
            'Subject': {
                'Charset': CHARSET,
                'Data': SUBJECT,
            },
        },
        Source=SENDER,
    )   
except ClientError as e:
    print(e.response['Error']['Message'])
else:
    print("Email sent! Message ID:"),
    print(response['MessageId'])
  • Run ‘python3 sesemail.py’ to execute the Python script.
  • When ‘python3 sesemail.py’ runs successfully, an email is sent to the RECIPIENT (check the inbox), and the command line will display the sent message ID.


Option 2 – SES and EC2 are in different AWS accounts

In some scenarios, your EC2 instance might operate in a different AWS account than SES. Let’s call the EC2 AWS account “A” and the SES AWS account “B”. Because the AWS resources in account A don’t automatically have permission to access AWS resources in account B, we need a way for the code on the EC2 instance to assume a role in the SES account using the AWS Security Token Service (AWS STS). Assuming the role generates temporary credentials that include an access key, secret access key, and session token, which are only valid for a limited time.

EC2 & SES in different AWS Accounts (architecture diagram)

In the steps below, you’ll configure and attach an IAM role to the EC2 instance in account “A” so that it can run an example Python script. The script uses the permissions provided by the attached IAM role to send emails via SES in account “B”. This approach leverages cross-account access and simplifies sending email from the EC2 instance in account A via SES in account B. As with Option 1, no explicit credential management is required in the code running on EC2, nor do you need to include a shared credentials file on the EC2 instance.

Prerequisites – different AWS accounts for EC2 and SES (use cross-account access)

  • An AWS account “A” with:
    • An EC2 instance (Linux) in the running state. (If you don’t have an EC2 instance, create one using Amazon Linux.)
    • Administrative Access to Amazon IAM and EC2 consoles.
    • Make note of your “A” AWS account ID here: ________________
  • An AWS account “B” with:
    • Verified domain (or email identity for testing only) in Amazon SES
      • Make note of a verified sending email address here: ___________
    • Administrative Access to Amazon SES and IAM consoles.
      • Make note of your “B” AWS account ID here: ________________
    • In the steps below, you will create a “SES_Role_for_account_A” role.
      • Make note of the ARN of the “SES_Role_for_account_A” role here: ___________
    • Access to a recipient email address to receive test emails from the Python script.
      • Make note of an SES-verified recipient email address to send test emails here: ___________

Step 1 – Create IAM Role in the SES “B” account

  • Sign in to the SES “B” account via the AWS Management Console and open the IAM console.
  • In the navigation pane, choose “Roles,” and then choose “Create role“.
  • Choose the trusted entity type as ‘AWS account’ and select ‘Another AWS account’.
  • Add the AWS account ID where your EC2 instance resides (AWS account “A” in the prerequisites) and click ‘Next’.
  • Search for and select the “AmazonSESFullAccess” policy or create a custom policy with the necessary SES permissions, then click ‘Next’.
  • Provide a name for your role (e.g., SES_Role_for_account_A).
  • Click “Create role”.
  • Copy the ARN of the new SES_Role_for_account_A role (you’ll need it in the next steps).
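
For reference, here's what this step looks like scripted with boto3, run with credentials for account “B”; the account “A” ID below is a placeholder:

import json
import boto3

iam = boto3.client("iam")  # credentials for SES account "B"

# Trust policy allowing EC2 account "A" (placeholder ID) to assume this role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::111111111111:root"},
        "Action": "sts:AssumeRole",
    }],
}

role = iam.create_role(RoleName="SES_Role_for_account_A",
                       AssumeRolePolicyDocument=json.dumps(trust_policy))
iam.attach_role_policy(RoleName="SES_Role_for_account_A",
                       PolicyArn="arn:aws:iam::aws:policy/AmazonSESFullAccess")
print(role["Role"]["Arn"])  # note this ARN for the next steps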

Step 2 – Create an IAM policy in the EC2 “A” account that allows assuming the SES_Role_for_account_A role you just created in the SES “B” account.

  • Sign in to the EC2 “A” account via the AWS Management Console and open the IAM console.
  • In the navigation pane, choose “Policies,” and then choose “Create Policy”.
  • In the policy editor, select JSON.
  • Copy the policy below and, in the policy editor, replace the Resource value with the ARN of the SES_Role_for_account_A role in SES account “B” (you created this in Step 1).


{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::<SES_Account_ID>:role/<Role_Name>"
        }
    ]
}

  • Click ‘Next’ and provide a name for your policy (e.g., EC2_Policy_for_account_B).
  • Click ‘Create policy’.

Step 3 – Create an IAM role in the EC2 “A” account, and attach the previously created IAM policy (EC2_Policy_for_account_B) to it.

  • In the EC2 “A” account IAM console navigation pane, choose “Roles,” and then choose “Create role.”
  • Choose the trusted entity type as “AWS service” and select “EC2” as the service, then click ‘Next’.

  • Filter by type “Customer managed”, search for and select the EC2_Policy_for_account_B policy, then click ‘Next’. (Note – if you are using AWS Session Manager to remotely connect to your EC2 instance, you may need to add the “AmazonSSMManagedInstanceCore” policy to the role.)

  • Provide a name for your role (e.g., EC2_SES_in_account_B_role).
  • Click “Create role“.

Step 4 – Attach the IAM Role (EC2_SES_in_account_B_role) to the EC2 instance in AWS account “A”.

  • Open the EC2 Management Console in AWS account “A”.
  • In the navigation pane, choose “Instances,” and select the instance to which you want to attach the EC2_SES_in_account_B_role IAM role.
  • With the instance selected, choose “Actions,” then “Security,” and “Modify IAM role.”

  • Choose the IAM role you created (EC2_SES_in_account_B_role) from the drop-down menu.
  • Click “Update IAM role.”

Step 5 – Create a sample Python script that sends emails via SES in AWS account “B” from the EC2 instance in AWS account “A” using the EC2-attached role.

  1. Now that your EC2 instance is configured with the necessary permissions, you can set up an example Python script to send emails via Amazon SES in AWS Account “B” using the IAM Role on EC2 in AWS Account “A”. We’ll use the AWS SDK for Python (Boto3), a powerful and versatile library to interact with the SES API endpoint. Before running the example script, ensure that Python, pip (the package installer for Python), and the Boto3 library are installed on your EC2 instance:
    • Run the ‘python3 --version’ command to check if Python is installed on your EC2 instance. If Python is installed, the version will be displayed; otherwise you’ll receive a ‘command not found’ or similar error message.
      • If Python is not installed, run the command ‘sudo yum install python3 -y’.
    • Run the ‘pip3 --version’ command to check if pip is installed on your EC2 instance. If pip3 is installed, the version will be displayed; otherwise you’ll receive a ‘command not found’ or similar error message.
      • If pip3 is not installed, run the command ‘sudo yum install python3-pip’.
    • Install the Boto3 library, which allows Python scripts to interact with AWS services, including SES. Run the command ‘pip3 install boto3’ to install (or update) Boto3 using pip.
  2. Save the code below as a Python file named cross_sesemail.py on your EC2 instance.
  3. Edit cross_sesemail.py and replace the placeholder value of ROLE_ARN with the ARN of the SES_Role_for_account_A role you created in SES account “B”, and replace SENDER, RECIPIENT, and the Region (region_name in the script) with your values (see prerequisites). Do not modify any quotation marks.

[copy, edit & replace the ROLE_ARN]

import boto3
from botocore.exceptions import ClientError

# Replace with your role ARN in SES Account
ROLE_ARN = "arn:aws:iam::<Account_ID>:role/<Role_Name>"

# Create an STS client
sts_client = boto3.client('sts')

# Assume the role
assumed_role = sts_client.assume_role(
    RoleArn=ROLE_ARN,
    RoleSessionName="SESSession"
)

# Extract the temporary credentials
credentials = assumed_role['Credentials']

# Create an SES client using the assumed role credentials
ses_client = boto3.client(
    'ses',
    region_name='us-west-2',
    aws_access_key_id=credentials['AccessKeyId'],
    aws_secret_access_key=credentials['SecretAccessKey'],
    aws_session_token=credentials['SessionToken']
)

# Email parameters
SENDER = "[email protected]"
RECIPIENT = "[email protected]"
SUBJECT = "Amazon SES Test (SDK for Python) using cross-account IAM Role"
BODY_TEXT = ("Amazon SES Test (Python)\r\n"
             "This email was sent with Amazon SES using the "
             "AWS SDK for Python (Boto) using IAM Role."
            )
BODY_HTML = """<html>
<head></head>
<body>
  <h1>Amazon SES Test (SDK for Python) using IAM Role</h1>
  <p>This email was sent with
    <a href='https://aws.amazon.com/ses/'>Amazon SES</a> using the
    <a href='https://aws.amazon.com/sdk-for-python/'>
      AWS SDK for Python (Boto)</a> using IAM Role.</p>
</body>
</html>
            """
CHARSET = "UTF-8"

# Send the email
try:
    response = ses_client.send_email(
        Destination={
            'ToAddresses': [RECIPIENT],
        },
        Message={
            'Body': {
                'Html': {
                    'Charset': CHARSET,
                    'Data': BODY_HTML,
                },
                'Text': {
                    'Charset': CHARSET,
                    'Data': BODY_TEXT,
                },
            },
            'Subject': {
                'Charset': CHARSET,
                'Data': SUBJECT,
            },
        },
        Source=SENDER,
    )
except ClientError as e:
    print(e.response['Error']['Message'])
else:
    print("Email sent! Message ID:"),
    print(response['MessageId'])
  • Run the Python script with ‘python3 cross_sesemail.py’. When the email is sent successfully, the command line output will display the message ID of the sent email, and the recipient will receive the message.


Conclusion:

By implementing IAM roles for EC2 instances with SES permissions, you can securely send emails via the SES APIs from your web applications without the need to store or manage IAM credentials within the EC2 instance or application code. This approach not only enhances security by eliminating the risk of credential exposure, but also simplifies the management of credentials. With the step-by-step guide provided in this blog post, you can easily configure IAM roles for your EC2 instances and start sending emails via the Amazon SES API in a secure and efficient manner, regardless of whether your EC2 and SES resources reside in the same or different AWS accounts.

Next Steps:

  1. Sign up for the AWS Free Tier and try out Amazon SES with IAM roles for EC2 instances as demonstrated in this blog post.
  2. Consult the AWS documentation on IAM Roles for Amazon EC2 and Amazon SES for more detailed instructions and best practices.
  3. Join the AWS Community Forums to ask questions, share experiences, and learn from other AWS users who have implemented similar solutions for secure email sending from their web applications.

About the Authors

Manas Murali M

Manas Murali M is a Cloud Support Engineer II at AWS and subject matter expert in Amazon Simple Email Service (SES) and Amazon CloudFront. With over 5 years of experience in the IT industry, he is passionate about resolving technical issues for customers. In his free time, he enjoys spending time with friends, traveling, and exploring emerging technologies.

Zip

Zip is an Amazon Pinpoint and Amazon Simple Email Service Sr. Specialist Solutions Architect at AWS. Outside of work he enjoys time with his family, cooking, mountain biking and plogging.

Run your compute-intensive and general purpose workloads sustainably with the new Amazon EC2 C8g, M8g instances

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/run-your-compute-intensive-and-general-purpose-workloads-sustainably-with-the-new-amazon-ec2-c8g-m8g-instances/

Today we’re announcing general availability of the Amazon Elastic Compute Cloud (Amazon EC2) C8g and M8g instances.

C8g instances are AWS Graviton4 based and are ideal for compute-intensive workloads such as high performance computing (HPC), batch processing, gaming, video encoding, scientific modeling, distributed analytics, CPU-based machine learning (ML) inference, and ad serving.

Also Graviton4 based, M8g instances provide the best price performance for general purpose workloads. M8g instances are ideal for applications such as application servers, microservices, gaming servers, mid-size data stores, and caching fleets.

Let’s look at some of the improvements available in both of these instances. C8g and M8g instances offer larger instance sizes with up to three times more vCPUs (up to 48xlarge), three times the memory (up to 384 GiB for C8g and up to 768 GiB for M8g), 75 percent more memory bandwidth, and two times more L2 cache than equivalent Graviton3-based (7g) instances. This helps you process larger amounts of data, scale up your workloads, improve time to results, and lower your total cost of ownership (TCO). These instances also offer up to 50 Gbps network bandwidth and up to 40 Gbps Amazon Elastic Block Store (Amazon EBS) bandwidth, compared to up to 30 Gbps network bandwidth and up to 20 Gbps Amazon EBS bandwidth on Graviton3-based instances. Like R8g instances, C8g and M8g instances offer two bare metal sizes (metal-24xl and metal-48xl), so you can right-size your instances and deploy workloads that benefit from direct access to physical resources.

The specs for the C8g instances are as follows.

Instance size    vCPUs   Memory (GiB)   Network bandwidth (Gbps)   EBS bandwidth (Gbps)
c8g.medium       1       2              Up to 12.5                 Up to 10
c8g.large        2       4              Up to 12.5                 Up to 10
c8g.xlarge       4       8              Up to 12.5                 Up to 10
c8g.2xlarge      8       16             Up to 15                   Up to 10
c8g.4xlarge      16      32             Up to 15                   Up to 10
c8g.8xlarge      32      64             15                         10
c8g.12xlarge     48      96             22.5                       15
c8g.16xlarge     64      128            30                         20
c8g.24xlarge     96      192            40                         30
c8g.48xlarge     192     384            50                         40
c8g.metal-24xl   96      192            40                         30
c8g.metal-48xl   192     384            50                         40

The specs for the M8g instances are as follows.

Instance size    vCPUs   Memory (GiB)   Network bandwidth (Gbps)   EBS bandwidth (Gbps)
m8g.medium       1       4              Up to 12.5                 Up to 10
m8g.large        2       8              Up to 12.5                 Up to 10
m8g.xlarge       4       16             Up to 12.5                 Up to 10
m8g.2xlarge      8       32             Up to 15                   Up to 10
m8g.4xlarge      16      64             Up to 15                   Up to 10
m8g.8xlarge      32      128            15                         10
m8g.12xlarge     48      192            22.5                       15
m8g.16xlarge     64      256            30                         20
m8g.24xlarge     96      384            40                         30
m8g.48xlarge     192     768            50                         40
m8g.metal-24xl   96      384            40                         30
m8g.metal-48xl   192     768            50                         40

Good to know

  • AWS Graviton4 processors offer enhanced security with always-on memory encryption, dedicated caches for every vCPU, and support for pointer authentication.
  • These instances are built on the AWS Nitro System, a rich collection of building blocks that offloads many traditional virtualization functions to dedicated hardware and software. It delivers high performance, high availability, and high security while reducing virtualization overhead.
  • The C8g and M8g instances are ideal for Linux-based workloads including containerized and microservices-based applications such as those running on Amazon Elastic Kubernetes Service (Amazon EKS) and Amazon Elastic Container Service (Amazon ECS), as well as applications written in popular programming languages such as C/C++, Rust, Go, Java, Python, .NET Core, Node.js, Ruby, and PHP.
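
If you want to compare sizes programmatically before choosing one, DescribeInstanceTypes returns the vCPU and memory figures from the tables above. A quick boto3 sketch (the Region is an assumption; check availability in yours):

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Pull vCPU and memory specs for a few C8g sizes.
resp = ec2.describe_instance_types(
    InstanceTypes=["c8g.large", "c8g.4xlarge", "c8g.metal-48xl"]
)
for it in sorted(resp["InstanceTypes"],
                 key=lambda i: i["VCpuInfo"]["DefaultVCpus"]):
    print(f'{it["InstanceType"]}: '
          f'{it["VCpuInfo"]["DefaultVCpus"]} vCPUs, '
          f'{it["MemoryInfo"]["SizeInMiB"] // 1024} GiB')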

Available now
C8g and M8g instances are available today in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Frankfurt) AWS Regions. As usual with Amazon EC2, you pay only for what you use. For more information, see Amazon EC2 Pricing. Check out the collection of AWS Graviton resources to help you start migrating your applications to Graviton instance types. You can also visit the AWS Graviton Fast Start program to begin your Graviton adoption journey.

To learn more, visit our Amazon EC2 instances page, and please send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

– Veliswa

AWS Weekly Roundup: Amazon EC2 X8g Instances, Amazon Q generative SQL for Amazon Redshift, AWS SDK for Swift, and more (Sep 23, 2024)

Post Syndicated from Abhishek Gupta original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-ec2-x8g-instances-amazon-q-generative-sql-for-amazon-redshift-aws-sdk-for-swift-and-more-sep-23-2024/

AWS Community Days have been in full swing around the world. I am going to put the spotlight on AWS Community Day Argentina, where Jeff Barr delivered the keynote and shared his nuggets of wisdom with the community, including a fun story of how he once followed Bill Gates to a McDonald’s!

I encourage you to read about his experience.

Last week’s launches
Here are the launches that got my attention, starting off with the GA releases.

Amazon EC2 X8g Instances are now generally available – X8g instances are powered by AWS Graviton4 processors and deliver up to 60% better performance than AWS Graviton2-based Amazon EC2 X2gd instances. These instances offer larger sizes with up to 3x more vCPUs (up to 48xlarge) and memory (up to 3 TiB) than Graviton2-based X2gd instances.

Amazon Q generative SQL for Amazon Redshift is now generally available – Amazon Q generative SQL in Amazon Redshift Query Editor is an out-of-the-box web-based SQL editor for Amazon Redshift. It uses generative AI to analyze user intent, query patterns, and schema metadata to identify common SQL query patterns directly within Amazon Redshift, accelerating the query authoring process for users and reducing the time required to derive actionable data insights.

AWS SDK for Swift is now generally available – AWS SDK for Swift provides a modern, user-friendly, and native Swift interface for accessing Amazon Web Services from Apple platforms, AWS Lambda, and Linux-based Swift on Server applications. Now that it’s GA, customers can use AWS SDK for Swift for production workloads. Learn more in the AWS SDK for Swift Developer Guide.

AWS Amplify now supports long-running tasks with asynchronous server-side function calls – Developers can use AWS Amplify to invoke Lambda functions asynchronously for operations like generative AI model inference, batch processing jobs, or message queuing without blocking the GraphQL API response. This improves responsiveness and scalability, especially for scenarios where immediate responses are not required or where long-running tasks need to be offloaded.

Amazon Keyspaces (for Apache Cassandra) now supports add-column for multi-Region tables – With this launch, you can modify the schema of your existing multi-Region tables in Amazon Keyspaces (for Apache Cassandra) to add new columns. You only have to modify the schema in one of its replica Regions and Keyspaces will replicate the new schema to the other Regions where the table exists.

Amazon Corretto 23 is now generally available – Amazon Corretto is a no-cost, multi-platform, production-ready distribution of OpenJDK. Corretto 23 is an OpenJDK 23 Feature Release that includes an updated Vector API, expanded pattern matching and switch expressions, and more. It will be supported through April 2025.

Use OR1 instances for existing Amazon OpenSearch Service domains – With OpenSearch 2.15, you can leverage OR1 instances for your existing Amazon OpenSearch Service domains by simply updating your existing domain configuration, and choosing OR1 instances for data nodes. This will seamlessly move domains running OpenSearch 2.15 to OR1 instances using a blue/green deployment.

Amazon S3 Express One Zone now supports AWS KMS with customer managed keys – By default, S3 Express One Zone encrypts all objects with server-side encryption using S3 managed keys (SSE-S3). With S3 Express One Zone support for customer managed keys, you have more options to encrypt and manage the security of your data. S3 Bucket Keys are always enabled when you use SSE-KMS with S3 Express One Zone, at no additional cost.
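
As a sketch of how you might set a customer managed key as the default encryption on a directory bucket: the bucket name and KMS key ARN below are placeholders, and the --use1-az4--x-s3 suffix follows the S3 Express One Zone bucket naming format:

import boto3

s3 = boto3.client("s3", region_name="us-east-1")

# SSE-KMS as the bucket default; S3 Bucket Keys are always on for
# S3 Express One Zone, so BucketKeyEnabled is set accordingly.
s3.put_bucket_encryption(
    Bucket="my-data--use1-az4--x-s3",
    ServerSideEncryptionConfiguration={
        "Rules": [{
            "ApplyServerSideEncryptionByDefault": {
                "SSEAlgorithm": "aws:kms",
                "KMSMasterKeyID": "arn:aws:kms:us-east-1:111111111111:key/1234abcd-12ab-34cd-56ef-1234567890ab",
            },
            "BucketKeyEnabled": True,
        }]
    },
)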

Use AWS Chatbot to interact with Amazon Bedrock agents from Microsoft Teams and Slack – Before, customers had to develop custom chat applications in Microsoft Teams or Slack and integrate it with Amazon Bedrock agents. Now they can invoke their Amazon Bedrock agents from chat channels by connecting the agent alias with an AWS Chatbot channel configuration.

AWS CodeBuild support for managed GitLab runners – Customers can configure their AWS CodeBuild projects to receive GitLab CI/CD job events and run them on ephemeral hosts. This feature allows GitLab jobs to integrate natively with AWS, providing security and convenience through features such as IAM, AWS Secrets Manager, AWS CloudTrail, and Amazon VPC.


Other AWS news
Here are some additional projects, blog posts, and news items that you might find interesting:

Secure Cross-Cluster Communication in EKS – It demonstrates how you can use Amazon VPC Lattice and Pod Identity to secure cross-EKS-cluster application communication, along with an example that you can use as a reference to adapt to your own microservices applications.

Improve RAG performance using Cohere Rerank – This post focuses on improving search efficiency and accuracy in RAG systems using Cohere Rerank.

AWS open source news and updates – My colleague Ricardo Sueiras writes about open source projects, tools, and events from the AWS Community; check out Ricardo’s page for the latest updates.

Upcoming AWS events
Check your calendars and sign up for upcoming AWS events:

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world. Upcoming AWS Community Days are in Italy (Sep. 27), Taiwan (Sep. 28), Saudi Arabia (Sep. 28), Netherlands (Oct. 3), and Romania (Oct. 5).

Browse all upcoming AWS led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Abhishek

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Now available: Graviton4-powered memory-optimized Amazon EC2 X8g instances

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-available-graviton4-powered-memory-optimized-amazon-ec2-x8g-instances/

Graviton4-powered, memory-optimized X8g instances are now available in ten virtual sizes and two bare metal sizes, with up to 3 TiB of DDR5 memory and up to 192 vCPUs. The X8g instances are our most energy-efficient Graviton instances to date, with the best price performance and scale-up capability of any comparable EC2 Graviton instance. With a 16-to-1 ratio of memory to vCPU, these instances are designed for electronic design automation (EDA), in-memory databases and caches, relational databases, real-time analytics, and memory-constrained microservices. The instances fully encrypt all high-speed physical hardware interfaces and also include additional AWS Nitro System and Graviton4 security features.

More than 50,000 AWS customers already make use of the existing roster of over 150 Graviton-powered instance types. They run a wide variety of applications including Valkey, Redis, Apache Spark, Apache Hadoop, PostgreSQL, MariaDB, MySQL, and SAP HANA Cloud. Because they are available in twelve sizes, the new X8g instances are an even better host for these applications, allowing you to choose between scaling up (using a bigger instance) and scaling out (using more instances), while also providing additional flexibility for existing memory-bound workloads that are currently running on distinct instances.

The Instances
When compared to the previous generation (X2gd) instances, the X8g instances offer 3x more memory, 3x more vCPUs, more than twice as much EBS bandwidth (40 Gbps vs 19 Gbps), and twice as much network bandwidth (50 Gbps vs 25 Gbps).

The Graviton4 processors inside the X8g instances have twice as much L2 cache per core as the Graviton2 processors in the X2gd instances (2 MiB vs 1 MiB) along with 160% higher memory bandwidth, and can deliver up to 60% better compute performance.

The X8g instances are built using the 5th generation of the AWS Nitro System and Graviton4 processors, which incorporate additional security features, including Branch Target Identification (BTI) for protection against low-level attacks that attempt to disrupt control flow at the instruction level. To learn more about this and Graviton4’s other security features, read How Amazon’s New CPU Fights Cybersecurity Threats and watch the re:Invent 2023 AWS Graviton session.

Here are the specs:

Instance name    vCPUs   Memory (DDR5)   EBS bandwidth   Network bandwidth
x8g.medium       1       16 GiB          Up to 10 Gbps   Up to 12.5 Gbps
x8g.large        2       32 GiB          Up to 10 Gbps   Up to 12.5 Gbps
x8g.xlarge       4       64 GiB          Up to 10 Gbps   Up to 12.5 Gbps
x8g.2xlarge      8       128 GiB         Up to 10 Gbps   Up to 15 Gbps
x8g.4xlarge      16      256 GiB         Up to 10 Gbps   Up to 15 Gbps
x8g.8xlarge      32      512 GiB         10 Gbps         15 Gbps
x8g.12xlarge     48      768 GiB         15 Gbps         22.5 Gbps
x8g.16xlarge     64      1,024 GiB       20 Gbps         30 Gbps
x8g.24xlarge     96      1,536 GiB       30 Gbps         40 Gbps
x8g.48xlarge     192     3,072 GiB       40 Gbps         50 Gbps
x8g.metal-24xl   96      1,536 GiB       30 Gbps         40 Gbps
x8g.metal-48xl   192     3,072 GiB       40 Gbps         50 Gbps

The instances support ENA, ENA Express, and EFA enhanced networking. As you can see from the table above, they provide a generous amount of EBS bandwidth and support all EBS volume types, including io2 Block Express, EBS General Purpose SSD, and EBS Provisioned IOPS SSD.

X8g Instances in Action
Let’s take a look at some applications and use cases that can make use of 16 GiB of memory per vCPU and/or up to 3 TiB per instance:

Databases – X8g instances allow SAP HANA and SAP Data Analytics Cloud to handle larger and more ambitious workloads than before. Running on Graviton4-powered instances, SAP has measured up to 25% better performance for analytical workloads and up to 40% better performance for transactional workloads in comparison to the same workloads running on Graviton3 instances. X8g instances allow SAP to expand their Graviton-based usage to even larger memory-bound solutions.

Electronic Design Automation – EDA workloads are central to the process of designing, testing, verifying, and taping out new generations of chips, including Graviton, Trainium, Inferentia, and those that form the building blocks for the Nitro System. AWS and many other chip makers have adopted the AWS Cloud for these workloads, taking advantage of scale and elasticity to supply each phase of the design process with the appropriate amount of compute power. This allows engineers to innovate faster because they are not waiting for results. Here’s a long-term snapshot from one of the clusters that was used to support development of Graviton4 in late 2022 and early 2023. As you can see, this cluster runs at massive scale, with peaks as high as 5x normal usage:

You can see bursts of daily and weekly activity, and then a jump in overall usage during the tape-out phase. The instances in the cluster are on the large end of the size spectrum so the peaks represent several hundred thousand cores running concurrently. This ability to spin up compute when we need it and down when we don’t gives us access to unprecedented scale without a dedicated investment in hardware.

The new X8g instances will allow us and our EDA customers to run even more workloads on Graviton processors, reducing costs and decreasing energy consumption, while also helping to get new products to market faster than ever.

Available Now
X8g instances are available today in the US East (N. Virginia), US West (Oregon), and Europe (Frankfurt) AWS Regions in On Demand, Spot, Reserved Instance, Savings Plan, Dedicated Instance, and Dedicated Host form. To learn more, visit the X8g page.

New: Zone Groups for Availability Zones in AWS Regions

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/new-zone-groups-for-availability-zones-in-aws-regions/

This blog post is written by Pranav Chachra, Principal Product Manager, AWS.

In 2019, AWS introduced Zone Groups for AWS Local Zones. Today, we’re announcing that we are working on extending the Zone Group construct to Availability Zones (AZs).

Zone Groups were launched to help users of AWS Local Zones identify related groups of Local Zones that reside in the same geography. For example, the two interconnected Local Zones in Los Angeles (us-west-2-lax-1a and us-west-2-lax-1b) make up the us-west-2-lax-1 Zone Group. These Zone Groups are used for opting in to the Local Zones, as shown in the following image.

In October 2024, we will extend the Zone Group construct to AZs with a consistent naming format across all AWS Regions. This update will help you differentiate groups of Local Zones and AZs based on their unique GroupName, improving manageability and clarity. For example, the Zone Group of AZs in the US West (Oregon) Region will now be referred to as us-west-2-zg-1, where us-west-2 indicates the Region and zg-1 indicates it is the group of AZs in the Region. This new identifier (such as us-west-2-zg-1) will replace the current naming (such as us-west-2) for the GroupName available through the DescribeAvailabilityZones API. The names of the Local Zone Groups (such as us-west-2-lax-1) will remain the same as before.

For example, when you list your AZs, you see the current naming (such as us-west-2) for the GroupName of AZs in the US West (Oregon) Region:

The new identifier (such as us-west-2-zg-1) will replace the current naming for the GroupName of AZs, as shown in the following image.
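
You can check the value yourself with the DescribeAvailabilityZones API; a minimal boto3 sketch:

import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

# GroupName currently returns the Region name (us-west-2) and will return
# the new zone-group identifier (us-west-2-zg-1) after the change.
for az in ec2.describe_availability_zones()["AvailabilityZones"]:
    print(az["ZoneName"], az["ZoneId"], az["GroupName"])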

At AWS we remain committed to responding to customer feedback, and we continuously improve our services based on that feedback. If you have questions or need further assistance, reach out on the community forums or through AWS Support.

AWS Weekly Roundup: AWS Parallel Computing Service, Amazon EC2 status checks, and more (September 2, 2024)

Post Syndicated from Esra Kayabali original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-aws-parallel-computing-service-amazon-ec2-status-checks-and-more-september-2-2024/

With the arrival of September, AWS re:Invent 2024 is now 3 months away and I am very excited for the new upcoming services and announcements at the conference. I remember attending re:Invent 2019, just before the COVID-19 pandemic. It was the biggest in-person re:Invent with 60,000+ attendees and it was my second one. It was amazing to be in that atmosphere! Registration is now open for AWS re:Invent 2024. Come join us in Las Vegas for five exciting days of keynotes, breakout sessions, chalk talks, interactive learning opportunities, and career-changing connections!

Now let’s look at last week’s new announcements.

Last week’s launches
Here are the launches that got my attention.

Announcing AWS Parallel Computing Service – AWS Parallel Computing Service (AWS PCS) is a new managed service that lets you run and scale high performance computing (HPC) workloads on AWS. You can build scientific and engineering models and run simulations using a fully managed Slurm scheduler with built-in technical support and a rich set of customization options. Tailor your HPC environment to your specific needs and integrate it with your preferred software stack. Build complete HPC clusters that integrate compute, storage, networking, and visualization resources, and seamlessly scale from zero to thousands of instances. To learn more, visit AWS Parallel Computing Service and read Channy’s blog post.

Amazon EC2 status checks now support reachability health of attached EBS volumes – You can now use Amazon EC2 status checks to directly monitor if the Amazon EBS volumes attached to your instances are reachable and able to complete I/O operations. With this new status check, you can quickly detect attachment issues or volume impairments that may impact the performance of your applications running on Amazon EC2 instances. You can further integrate these status checks within Auto Scaling groups to monitor the health of EC2 instances and replace impacted instances to ensure high availability and reliability of your applications. Attached EBS status checks can be used along with the instance status and system status checks to monitor the health of your instances. To learn more, refer to the Status checks for Amazon EC2 instances documentation.
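
One way to act on the new check is a CloudWatch alarm. Here's a hedged sketch, where the StatusCheckFailed_AttachedEBS metric name is taken from the launch announcement (treat it as an assumption to verify), and the instance ID and SNS topic are placeholders:

import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

# Alarm when the attached-EBS status check fails for 3 consecutive minutes.
cloudwatch.put_metric_alarm(
    AlarmName="attached-ebs-impaired",
    Namespace="AWS/EC2",
    MetricName="StatusCheckFailed_AttachedEBS",  # assumption: verify name
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    Statistic="Maximum",
    Period=60,
    EvaluationPeriods=3,
    Threshold=1,
    ComparisonOperator="GreaterThanOrEqualToThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:111111111111:ops-alerts"],
)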

Amazon QuickSight now supports sharing views of embedded dashboards – You can now share views of embedded dashboards in Amazon QuickSight. This feature allows you to enable more collaborative capabilities in your application with embedded QuickSight dashboards. Additionally, you can enable personalization capabilities such as bookmarks for anonymous users. You can share a unique link that displays only your changes while staying within the application, and use dashboard or console embedding to generate a shareable link to your application page with QuickSight’s reference encapsulated using the QuickSight Embedding SDK. QuickSight Readers can then send this shareable link to their peers. When their peer accesses the shared link, they are taken to the page on the application that contains the embedded QuickSight dashboard. For more information, refer to Embedded view documentation.

Amazon Q Business launches IAM federation for user identity authentication – Amazon Q Business is a fully managed service that deploys a generative AI business expert for your enterprise data. You can use the Amazon Q Business IAM federation feature to connect your applications directly to your identity provider to source user identity and user attributes for these applications. Previously, you had to sync your user identity information from your identity provider into AWS IAM Identity Center, and then connect your Amazon Q Business applications to IAM Identity Center for user authentication. At launch, Amazon Q Business IAM federation will support the OpenID Connect (OIDC) and SAML 2.0 protocols for identity provider connectivity. To learn more, visit Amazon Q Business documentation.

Amazon Bedrock now supports cross-Region inference – Amazon Bedrock announces support for cross-Region inference, an optional feature that enables you to seamlessly manage traffic bursts by utilizing compute across different AWS Regions. If you are using on-demand mode, you’ll be able to get higher throughput limits (up to 2x your allocated in-Region quotas) and enhanced resilience during periods of peak demand by using cross-Region inference. By opting in, you no longer have to spend time and effort predicting demand fluctuations. Instead, cross-Region inference dynamically routes traffic across multiple Regions, ensuring optimal availability for each request and smoother performance during high-usage periods. You can control where your inference data flows by selecting from a pre-defined set of Regions, helping you comply with applicable data residency requirements and sovereignty laws. Find the list at Supported Regions and models for cross-Region inference. To get started, refer to the Amazon Bedrock documentation or this Machine Learning blog.
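
Opting in is a matter of calling the model through an inference profile ID instead of a plain model ID. A sketch, assuming the "us." geography prefix described in the launch docs and a model available in your account:

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

# The "us." prefix selects a cross-Region inference profile; requests can be
# served from any Region in that set.
response = bedrock.converse(
    modelId="us.anthropic.claude-3-5-sonnet-20240620-v1:0",  # assumption
    messages=[{"role": "user",
               "content": [{"text": "Hello from cross-Region inference"}]}],
)
print(response["output"]["message"]["content"][0]["text"])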

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.


Other AWS events
AWS GenAI Lofts are collaborative spaces and immersive experiences that showcase AWS’s cloud and AI expertise, while providing startups and developers with hands-on access to AI products and services, exclusive sessions with industry leaders, and valuable networking opportunities with investors and peers. Find a GenAI Loft location near you and don’t forget to register.

Gen AI loft workshop

credit: Antje Barth

Upcoming AWS events
Check your calendar and sign up for upcoming AWS events:

AWS Summits are free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. AWS Summits for this year are coming to an end. There are 3 more left that you can still register for: Jakarta (September 5), Toronto (September 11), and Ottawa (October 9).

AWS Community Days feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world. While AWS Summits 2024 are almost over, AWS Community Days are in full swing. Upcoming AWS Community Days are in Belfast (September 6), SF Bay Area (September 13), where our own Antje Barth is a keynote speaker, Argentina (September 14), and Armenia (September 14).

Browse all upcoming AWS led in-person and virtual events here.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Esra

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

AWS Weekly Roundup: G6e instances, Karpenter, Amazon Prime Day metrics, AWS Certifications update and more (August 19, 2024)

Post Syndicated from Prasad Rao original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-g6e-instances-karpenter-amazon-prime-day-metrics-aws-certifications-update-and-more-august-19-2024/

You know what I find more exciting than the Amazon Prime Day sale? Finding out how Amazon Web Services (AWS) makes it all happen. Every year, I wait eagerly for Jeff Barr’s annual post to read the chart-topping metrics. The scale never ceases to amaze me.

This year, Channy Yun and Jeff Barr bring us behind the scenes of how AWS powered Prime Day 2024 for record-breaking sales. I will let you read the post for full details, but one metric that blows my mind every year is that of Amazon Aurora. On Prime Day, 6,311 Amazon Aurora database instances processed more than 376 billion transactions, stored 2,978 terabytes of data, and transferred 913 terabytes of data.

Amazon Box with checkbox showing a record breaking prime day event powered by AWS

Other news I’m excited to share is that registration is open for two new AWS Certification exams. You can now register for the beta version of the AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer – Associate. These certifications are for everyone—from line-of-business professionals to experienced machine learning (ML) engineers—and will help individuals prepare for in-demand artificial intelligence and machine learning (AI/ML) careers. You can prepare for your exam by following a four-step exam prep plan for AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer – Associate.

Last week’s launches
Here are some launches that got my attention:

General availability of Amazon Elastic Compute Cloud (Amazon EC2) G6e instances – Powered by NVIDIA L40S Tensor Core GPUs, G6e instances can be used for a wide range of ML and spatial computing use cases. You can use G6e instances to deploy large language models (LLMs) with up to 13B parameters and diffusion models for generating images, video, and audio.

Release of Karpenter 1.0 – Karpenter is a flexible, efficient, and high-performance Kubernetes compute management solution. You can use Karpenter with Amazon Elastic Kubernetes Service (Amazon EKS) or any conformant Kubernetes cluster. To learn more, visit the Karpenter 1.0 launch post.

Drag-and-drop UI for Amazon SageMaker Pipelines – With this launch, you can now quickly create, execute, and monitor an end-to-end AI/ML workflow to train, fine-tune, evaluate, and deploy models without writing code. You can drag and drop various steps of the workflow and connect them together in the UI to compose an AI/ML workflow.

Split, move and modify Amazon EC2 On-Demand Capacity Reservations – With the new capabilities for managing Amazon EC2 On-Demand Capacity Reservations, you can split your Capacity Reservations, move capacity between Capacity Reservations, and modify your Capacity Reservation’s instance eligibility attribute. To learn more about these features, refer to Split off available capacity from an existing Capacity Reservation.

Document-level sync reports in Amazon Q Business – This new feature of Amazon Q Business provides you with a comprehensive document-level report including granular indexing status, metadata, and access control list (ACL) details for every document processed during a data source sync job. You have visibility into the status of the documents Amazon Q Business attempted to crawl and index, as well as the ability to troubleshoot why certain documents were not returned with the expected answers.

Landing zone version selection in AWS Control Tower – Starting with landing zone version 3.1 and above, you can update or reset your landing zone in place on the current version, or upgrade to a version of your choice. To learn more, visit Select a landing zone version in the AWS Control Tower user guide.

Launch of AWS Support Official channel on AWS re:Post – You now have access to curated content for operating at scale on AWS, authored by AWS Support and AWS Managed Services (AMS) experts. In this new channel, you can find technical solutions for complex problems, operational best practices, and insights into AWS Support and AMS offerings. To learn more, visit the AWS Support Official channel on re:Post.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Regional expansion of AWS Services
Here are some of the expansions of AWS services into new AWS Regions that happened this week:

Amazon VPC Lattice is now available in 7 additional Regions – Amazon VPC Lattice is now available in US West (N. California), Africa (Cape Town), Europe (Milan), Europe (Paris), Asia Pacific (Mumbai), Asia Pacific (Seoul), and South America (São Paulo). With this launch, Amazon VPC Lattice is now generally available in 18 AWS Regions.

Amazon Q in QuickSight is now available in 5 additional Regions – Amazon Q in QuickSight is now generally available in Asia Pacific (Mumbai), Canada (Central), Europe (Ireland), Europe (London), and South America (São Paulo), in addition to the existing US East (N. Virginia), US West (Oregon), and Europe (Frankfurt) Regions.

AWS Wickr is now available in the Europe (Zurich) Region – AWS Wickr adds Europe (Zurich) to the US East (N. Virginia), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (London), Europe (Frankfurt), and Europe (Stockholm) Regions in which it’s available.

You can browse the full list of AWS Services available by Region.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS re:Invent 2024 – Dive into the first-round session catalog. Explore all the different learning opportunities at AWS re:Invent this year and start building your agenda today. You’ll find sessions for all interests and learning styles.

AWS Summits – The 2024 AWS Summit season is starting to wrap up! Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Jakarta (September 5), and Toronto (September 11).

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Colombia (August 24), New York (August 28), Belfast (September 6), and Bay Area (September 13).

AWS GenAI Lofts – Meet AWS AI experts and attend talks, workshops, fireside chats, and Q&As with industry leaders. All lofts are free and are carefully curated to offer something for everyone to help you accelerate your journey with AI. There are lofts scheduled in San Francisco (August 14–September 27), São Paulo (September 2–November 20), London (September 30–October 25), Paris (October 8–November 25), and Seoul (November).

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Prasad

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

How AWS powered Prime Day 2024 for record-breaking sales

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/how-aws-powered-prime-day-2024-for-record-breaking-sales/

Amazon Prime Day 2024 (July 17-18) was Amazon’s biggest Prime Day shopping event ever, with record sales and more items sold during the two-day event than in any previous Prime Day. Prime members shopped for millions of deals and saved billions across more than 35 categories globally.

I live in South Korea, but luckily I was staying in Seattle to attend the AWS Heroes Summit during Prime Day 2024. I signed up for a Prime membership and used Rufus, Amazon’s new AI-powered conversational shopping assistant, to search for items quickly and easily. Prime members in the U.S. like me chose to consolidate their deliveries on millions of orders during Prime Day, saving an estimated 10 million trips. This consolidation results in lower carbon emissions on average.

We know from Jeff’s annual blog post that AWS runs the Amazon website and mobile app that make these short-term, large-scale global events feasible (check out his 2016, 2017, 2019, 2020, 2021, 2022, and 2023 posts for a look back). Today I want to share the top numbers from AWS that made my amazing shopping experience possible.

Prime Day 2024 – all the numbers
Here are some of the most interesting and/or mind-blowing metrics:

Amazon EC2 – Since many of Amazon.com services such as Rufus and Search use AWS artificial intelligence (AI) chips under the hood, Amazon deployed a cluster of over 80,000 Inferentia and Trainium chips for Prime Day. During Prime Day 2024, Amazon used over 250K AWS Graviton chips to power more than 5,800 distinct Amazon.com services (double that of 2023).

Amazon EBS – In support of Prime Day, Amazon provisioned 264 PiB of Amazon EBS storage in 2024, a 62 percent increase compared to 2023. When compared to the day before Prime Day 2024, Amazon.com performance on Amazon EBS jumped by 5.6 trillion read/write I/O operations during the event, or an increase of 64 percent compared to Prime Day 2023. Also, when compared to the day before Prime Day 2024, Amazon.com transferred an incremental 444 petabytes of data during the event, or an increase of 81 percent compared to Prime Day 2023.

Amazon Aurora – On Prime Day, 6,311 database instances running the PostgreSQL-compatible and MySQL-compatible editions of Amazon Aurora processed more than 376 billion transactions, stored 2,978 terabytes of data, and transferred 913 terabytes of data.

Amazon DynamoDB – DynamoDB powers multiple high-traffic Amazon properties and systems including Alexa, the Amazon.com sites, and all Amazon fulfillment centers. Over the course of Prime Day, these sources made tens of trillions of calls to the DynamoDB API. DynamoDB maintained high availability while delivering single-digit millisecond responses and peaking at 146 million requests per second.

Amazon ElastiCache – ElastiCache served more than a quadrillion requests on a single day, with a peak of over 1 trillion requests per minute.

Amazon QuickSight – Over the course of Prime Day 2024, one Amazon QuickSight dashboard used by Prime Day teams saw 107K unique hits and 1,300+ unique visitors, and delivered over 1.6M queries.

Amazon SageMaker – SageMaker processed more than 145B inference requests during Prime Day.

Amazon Simple Email Service (Amazon SES) – SES sent 30 percent more emails for Amazon.com during Prime Day 2024 vs 2023, delivering 99.23 percent of those emails to customers.

Amazon GuardDuty – During Prime Day 2024, Amazon GuardDuty monitored nearly 6 trillion log events per hour, a 31.9% increase from the previous year’s Prime Day.

AWS CloudTrail – CloudTrail processed over 976 billion events in support of Prime Day 2024.

Amazon CloudFront – CloudFront handled a peak load of over 500 million HTTP requests per minute, for a total of over 1.3 trillion HTTP requests during Prime Day 2024, a 30 percent increase in total requests compared to Prime Day 2023.

Prepare to Scale
As Jeff has noted every year, rigorous preparation is key to the success of Prime Day and our other large-scale events. For example, 733 AWS Fault Injection Service experiments were run to test resilience and ensure that Amazon.com remains highly available on Prime Day.

If you are preparing for similar business-critical events, product launches, or migrations, I strongly recommend that you take advantage of the newly branded AWS Countdown, a support program designed for your project lifecycle to assess operational readiness, identify and mitigate risks, and plan capacity, using proven playbooks developed by AWS experts. For example, with additional help from AWS Countdown, LegalZoom successfully migrated 450 servers with minimal issues and continues to use AWS Countdown Premium to streamline and expedite the launch of SaaS applications.

We look forward to seeing what other records will be broken next year!

Channy & Jeff;

Enabling high availability of Amazon EC2 instances on AWS Outposts servers (Part 2)

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/enabling-high-availability-of-amazon-ec2-instances-on-aws-outposts-servers-part-2/

This blog post was written by Brianna Rosentrater – Hybrid Edge Specialist SA and Jessica Win – Software Development Engineer

This post is Part 2 of the two-part series ‘Enabling high availability of Amazon EC2 instances on AWS Outposts servers’, providing you with code samples and considerations for implementing custom logic to automate Amazon Elastic Compute Cloud (Amazon EC2) relaunch on AWS Outposts servers. This post focuses on stateful applications where the Amazon EC2 instance store state needs to be maintained at relaunch.

Overview

AWS Outposts servers provide compute and networking services that are ideal for low-latency, local data processing needs for on-premises locations such as retail stores, branch offices, healthcare provider locations, or environments that are space-constrained. Outposts servers use EC2 instance store storage to provide non-durable block-level storage to the instances, and many applications use the instance store to save stateful information that must be retained in a Disaster Recovery (DR) type event. In this post, you will learn how to implement custom logic to provide High Availability (HA) for your applications running on an Outposts server using two or more servers for N+1 fault tolerance. The code provided is meant to help you get started with creating your own custom relaunch logic for workloads that require HA, and can be modified further for your unique workload needs.

Architecture

This solution is scoped to work for two Outposts servers set up as a resilient pair. For three or more servers running in the same data center, each server would need to be mapped to a secondary server for HA. One server can be the relaunch destination for multiple other servers, as long as Amazon EC2 capacity requirements are met. If both the source and destination Outposts servers are unavailable or experience a failure at the same time, then additional user action is required to resolve. In this case, a notification email is sent to the address specified in the notification email parameter that you supplied when executing the init.py script from Part 1 of this series. This lets you know that the attempted relaunch of your EC2 instances failed.

Figure 1: Amazon EC2 auto-relaunch custom logic on Outposts server architecture.

Refer to Part 1 of this series for a detailed breakdown of Steps 1-6, which discusses how the Amazon EC2 relaunch automation works, as shown in the preceding figure. For stateful applications, this logic has been extended to capture the EC2 instance store state. To save the state of the instance store, AWS Systems Manager Automation is used to create an Amazon Elastic Block Store (Amazon EBS)-backed Amazon Machine Image (AMI) in the Region of the EC2 instance running on the source Outposts server. This AMI can then be relaunched on another Outposts server in the event of a source server hardware or service link failure. The EBS volume associated with the AMI is automatically converted to the instance store root volume when relaunched on another Outposts server.

Prerequisites

The following prerequisites are required to complete the walkthrough:

This post builds on the Amazon EC2 auto-relaunch logic implemented in Part 1 of this series. In Part 1, we covered the implementation for achieving HA for stateless applications. In Part 2, we extend the Part 1 implementation to achieve HA for stateful applications, which must retain EC2 instance store data when instances are relaunched.

Deploying Outposts Servers Linux Instance Backup Solution

For the purpose of this post, a virtual private cloud (VPC) named “Production-Application-A” has been created, along with subnets named “source-outpost-a” and “destination-outpost-b” on each of the two Outposts servers being used. The destination-outpost-b subnet is supplied in the launch template being used for this walkthrough. The Amazon EC2 auto-relaunch logic discussed in Part 1 of this series has already been implemented, and the focus here is on the next steps required to extend that auto-relaunch capability to stateful applications.

Following the installation instructions available in the GitHub repository README file, you first open an AWS CloudShell terminal from within the account that has access to your Outposts servers. Next, clone the GitHub repository and cd into the “backup-outposts-servers-linux-instance” directory:
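
Assuming the repository is hosted under the aws-samples GitHub organization (use the exact URL from the README if it differs), these first commands look like the following:

#clone the repository and change into its directory
git clone https://github.com/aws-samples/backup-outposts-servers-linux-instance.git
cd backup-outposts-servers-linux-instance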

From here you can build the Systems Manager Automation document with its attachments using the make documents command. Your output should look similar to the following after successful execution:

Finally, upload the Systems Manager Automation document you just created to the S3 bucket you created in your Outposts server’s parent region for this purpose. For the purpose of this post, an S3 bucket named “ssm-bucket07082024” was created. Following Step 4 in the GitHub installation instructions, the command looks like the following:

BUCKET_NAME="ssm-bucket07082024"
DOC_NAME="BackupOutpostsServerLinuxInstanceToEBS"
OUTPOST_REGION="us-east-1"
aws s3 cp Output/Attachments/attachment.zip s3://${BUCKET_NAME}
aws ssm create-document --content file://Output/BackupOutpostsServerLinuxInstanceToEBS.json --name ${DOC_NAME} --document-type "Automation" --document-format JSON --attachments Key=S3FileUrl,Values=s3://${BUCKET_NAME}/attachment.zip,Name=attachment.zip --region ${OUTPOST_REGION}

After you have successfully created the Systems Manager Automation document, the output of the command shows the content of your newly created file. After reviewing it, you can exit the terminal and confirm that a new file named “attachment.zip” is in the S3 bucket that you specified.

Now you’re ready to put this automation logic in place. Following the GitHub instructions for usage, navigate to Systems Manager in the account that has access to your Outposts servers, and execute the automation. For the purpose of this post, the default document name, “BackupOutpostsServerLinuxInstanceToEBS”, is used, so that is the document selected. You may have other documents available to you for quick setup, and those can be disregarded for now.

Select the chosen document to execute this automation using the button in the top right-hand corner of the document details page.

After executing the automation, you are asked to configure the runbook for this automation. Leave the default Simple execution option selected:

For the Input parameters section, review the parameter definitions given in the GitHub repository README file. For the purpose of this post, the following is used:

Note that you may need to create a service role for Systems Manager to perform this automation on your behalf. For the purposes of this post, I have done so using the Required IAM Permissions to run this runbook section of the GitHub repository README file. The other settings can be left as default. Finish your setup by selecting Execute at the bottom of the page. It can take up to 30 minutes for all necessary steps to execute. Note that the automation document shows 32 steps, but the number of steps that are executed varies based on the type of Linux AMI that you started with. As long as your automation’s overall status shows as successful, you have completed the implementation. Here is a sample output:

You can find the AMI that was produced from this automation in your Amazon EC2 console under the Images section:

The final implementation step is creating a Systems Manager parameter for the AMI you just created. This prevents you from having to manually update the launch template for your application each time a new AMI is created and the AMI ID changes. Because this AMI is essentially a backup of your application and its current instance store state, you should expect the AMI ID to change with each new backup or new AMI that you create for your application, and determine a cadence for creating these AMIs that aligns with your application’s Recovery Point Objective (RPO).

To create a Systems Manager parameter for your AMI, first navigate to your Systems Manager console. Under Application Management, select Parameter Store and Create parameter. You can select either the Standard or Advanced tier depending on your needs. The AMI ID I have is ami-038c878d31d9d0bfb and the following is an example of how the parameter details are filled in for this walkthrough:
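
The console form itself is not reproduced here, but the equivalent AWS CLI call would look like the following sketch. The parameter name /outposts/app-a/ami-id is a hypothetical choice, and the aws:ec2:image data type tells Parameter Store to validate that the value is an AMI ID:

#create a Parameter Store entry for the backup AMI
aws ssm put-parameter \
    --name "/outposts/app-a/ami-id" \
    --type String \
    --data-type aws:ec2:image \
    --value ami-038c878d31d9d0bfb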

Now you can modify your application’s launch template that you created in Part 1 of this series, and specify the Systems Manager parameter you just created. To do this, navigate to your Amazon EC2 console, and under Instances select the Launch Templates option. Create a new version of your launch template, select the Browse more AMIs option, and choose the arrow button to the right of the search bar. Select Specify custom value/Systems Manager parameter.

Now enter the name of your parameter in one of the listed formats, and select Save.

You should see your parameter listed in the launch template summary under Software Image (AMI):
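
The same launch template update can be sketched with the AWS CLI. The template name my-app-template is a placeholder, the parameter name matches the hypothetical one created above, and launch templates accept an AMI reference in the resolve:ssm: format:

#create a new template version that resolves the AMI from Parameter Store
aws ec2 create-launch-template-version \
    --launch-template-name my-app-template \
    --source-version 1 \
    --launch-template-data '{"ImageId":"resolve:ssm:/outposts/app-a/ami-id"}'
#make the new version the default
aws ec2 modify-launch-template \
    --launch-template-name my-app-template \
    --default-version 2

The modify-launch-template call makes the new version the default, which is what the next step checks.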

Make sure that your launch template is set to the latest version. Your installation is now complete, and in the event of a source Outposts server failure, your application will be automatically relaunched on a new EC2 instance on your destination Outposts server. You will also receive a notification email sent to the address specified in the notification email parameter of the init.py script from Part 1 of this series. This means you can start triaging why your source Outposts server experienced a failure immediately without worrying about getting your application(s) back up and running. This helps make sure that your application(s) are highly available and reduces your Recovery Time Objective (RTO).

Cleaning up

The custom Amazon EC2 relaunch logic is implemented through AWS CloudFormation, so the only clean up required is to delete the CloudFormation stack from your AWS account. Doing so deletes the resources that were deployed through the CloudFormation stack. To remove the Systems Manager automation, un-enroll your EC2 instance from Host Management and delete the Amazon EBS-backed AMI in the Region.

Conclusion

The use of custom logic through AWS tools such as CloudFormation, CloudWatch, Systems Manager, and AWS Lambda enables you to architect for HA for stateful workloads on Outposts servers. By implementing the custom logic we walked through in this post, you can automatically relaunch EC2 instances running on a source Outposts server on a secondary destination Outposts server while maintaining your application’s state data. This also reduces the downtime of your application(s) in the event of a hardware or service link failure. The code provided in this post can also be further expanded upon to meet the unique needs of your workload.

Note that while the use of Infrastructure-as-Code (IaC) can improve your application’s availability and be used to standardize deployments across multiple Outposts servers, it is crucial to do regular failure drills to test the custom logic in place to make sure that you understand your application’s expected behavior on relaunch in the event of a failure. To learn more about Outposts servers, please visit the Outposts servers user guide.

Enabling high availability of Amazon EC2 instances on AWS Outposts servers (Part 1)

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/enabling-high-availability-of-amazon-ec2-instances-on-aws-outposts-servers-part-1/

This blog post is written by Brianna Rosentrater – Hybrid Edge Specialist SA and Jessica Win – Software Development Engineer.

This post is Part 1 of the two-part series ‘Enabling high availability of Amazon EC2 instances on AWS Outposts servers’, providing you with code samples and considerations for implementing custom logic to automate Amazon Elastic Compute Cloud (Amazon EC2) relaunch on Outposts servers. This post focuses on guidance for stateless applications, whereas Part 2 focuses on stateful applications where the Amazon EC2 instance store state needs to be maintained at relaunch.

Outposts servers provide compute and networking services that are ideal for low-latency, local data processing needs for on-premises locations such as retail stores, branch offices, healthcare provider locations, or environments that are space-constrained. Outposts servers use EC2 instance store storage to provide non-durable block-level storage to the instances running stateless workloads, and while stateless workloads don’t require resilient storage, many application owners still have uptime requirements for these types of workloads. In this post, you will learn how to implement custom logic to provide high availability (HA) for your applications running on Outposts servers using two or more servers for N+1 fault tolerance. The code provided is meant to help you get started, and can be modified further for your unique workload needs.

Overview

In this post, we have provided an init.py script. This script takes your input parameters and creates a custom AWS CloudFormation template that is deployed in the specified account. Users can run “./init.py --help” or “./init.py -h” to view parameter descriptions. The following input parameters are needed:

  • Launch template ID(s) – Used to relaunch your EC2 instances on the destination Outposts server in the event of a source server hardware or service link failure. You can specify multiple launch template IDs for multiple applications.
  • Source Outpost ID – The Outpost ID of the server actively running your EC2 workload.
  • Template file – The base CloudFormation template. The init.py script customizes the AutoRestartTemplate.yaml template based on your inputs. Make sure to execute init.py in the directory that contains the AutoRestartTemplate.yaml file.
  • Stack name – The name you’d like to give your CloudFormation stack.
  • Region – The same AWS Region to which your Outposts servers are anchored.
  • Notification email – The email Amazon Simple Notification Service (Amazon SNS) uses to alert you if Amazon CloudWatch detects that your source Outposts server has failed.
  • Launch template description – The description of the launch template(s) used to relaunch your EC2 instances on the destination Outposts server in the event of a source server failure.

After collecting the preceding parameters, the init.py script generates a CloudFormation template. You are asked to review the template and confirm that it meets your expectations. Once you select yes, the CloudFormation template is deployed in your account, and you can view the stack from your AWS Management Console. You also receive a confirmation email sent to the address specified in the notification email parameter, confirming your subscription to the SNS topic. This SNS topic was created by the CloudFormation stack to alert you if your source Outposts server experiences a hardware or service link failure.

The init.py script and AutoRestartTemplate.yaml CloudFormation template provided in this post is intended to be used to implement custom logic that relaunches EC2 instances running on the source Outposts server to a specified destination Outposts server for improved application availability. This logic works by essentially creating a mapping between the source and destination Outpost, and only works between two Outposts servers. This code can be further customized to meet your application requirements, and is meant to help you get started with implementing custom logic for your Outposts server environment. Now that we have covered the init.py parameters, the intended use case, scope, and limitations of the code provided, read on for more information on the architecture for this solution.

Architecture diagram

This solution is scoped to work for two Outposts servers set up as a resilient pair. For more than two servers running in the same data center, each server would need to be mapped to a secondary server for HA. One server can be the relaunch destination for multiple other servers, as long as Amazon EC2 capacity requirements are met. If both the source and destination Outposts servers are unavailable or experience a failure at the same time, then additional user action is required to resolve. In this case, a notification email is sent to the address specified in the notification email parameter letting you know that the attempted relaunch of your EC2 instances failed.

Figure 1: Amazon EC2 auto-relaunch custom logic on AWS Outposts server architecture.

  1. Input environment parameters required for the CloudFormation template AutoRestartTemplate.yaml. After confirming that the customized template looks correct, agree to allow the init.py script to deploy the CloudFormation stack in your desired AWS account.
  2. The CloudFormation stack is created and deployed in your AWS account with two or more Outposts servers. The CloudFormation stack creates the following resources:
    • A CloudWatch alarm to monitor the source Outposts server ConnectedStatus metric;
    • An SNS topic that alerts you if your source Outposts server ConnectedStatus shows as down;
    • An AWS Lambda function that relaunches the source Outposts server EC2 instances on the destination Outposts server according to the launch template you provided.
  3. A CloudWatch alarm monitors the ConnectedStatus metric of the source Outposts server to detect hardware or service link failure (a minimal CLI sketch of such an alarm follows this list).
  4. If the ConnectedStatus metric shows the source Outposts server service link as down, then a Lambda function coordinates relaunching the EC2 instances on the destination Outposts server according to the launch template that you provided.
  5. In the event of a source Outposts server hardware or service link failure and Amazon EC2 relaunch, Amazon SNS sends a notification to the notification email provided in the init.py script as an environment parameter. You will be notified when the CloudWatch alarm is triggered, and when the automation finishes executing, with an execution status included.
  6. The EC2 instances described in your launch template are launched on the destination Outposts server automatically, with no manual action needed.
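
The CloudFormation stack provisions this alarm for you; purely to illustrate what it watches, a hand-rolled AWS CLI equivalent might look like the following sketch. The alarm name, Outpost ID, and SNS topic ARN are all placeholders:

#alarm when the source Outpost's service link is down (ConnectedStatus < 1)
aws cloudwatch put-metric-alarm \
    --alarm-name source-outpost-connected-status \
    --namespace AWS/Outposts \
    --metric-name ConnectedStatus \
    --dimensions Name=OutpostId,Value=<source-outpost-id> \
    --statistic Average \
    --period 60 \
    --evaluation-periods 3 \
    --threshold 1 \
    --comparison-operator LessThanThreshold \
    --alarm-actions arn:aws:sns:us-east-1:111122223333:outpost-failover-topic

A healthy Outpost reports a ConnectedStatus of 1, so three consecutive one-minute periods below 1 indicate a service link failure.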

Now that we’ve covered the architecture and workflow for this solution, read on for step-by-step instructions on how to implement this code in your AWS account.

Prerequisites

The following prerequisites are required to complete the walkthrough:

  • Python is used to run the init.py script that dynamically creates a CloudFormation stack in the account specified as an input parameter.
  • Two Outposts servers that can be set up as an active/active or active/passive resilient pair depending on the size of the workload.
  • Create Launch Templates for the applications you want to protect—make sure that an instance type is selected that is available on your destination Outposts server.
  • Make sure that you have the credentials needed to programmatically deploy the CloudFormation stack in your AWS account.
  • If you are setting this up from an Outposts consumer account, you will need to configure CloudWatch cross-account observability between the consumer account and the Outposts owning account to view Outposts metrics.
  • Download the repository ec2-outposts-autorestart.

Deploying the AutoRestart CloudFormation stack

For the purpose of this post, a virtual private cloud (VPC) named “Production-Application-A” has been created, along with subnets named “source-outpost-a” and “destination-outpost-b” on each of the two Outposts servers being used. The destination-outpost-b subnet is supplied in the launch template being used for this walkthrough.

  1. Make sure that you are in the directory that contains the init.py and AutoRestartTemplate.yaml files. Next, run the following command to execute the init.py file. Note that you may need to change the file permissions to do this. If so, then run “chmod a+x init.py” to give all users execute permissions for this file:
    ./init.py --launch-template-id <value> --source-outpost-id <value> --template-file AutoRestartTemplate.yaml --stack-name <value> --region <value> --notification-email <value>
  2. After executing the preceding command, the init.py script asks you for a launch template description. Provide a brief description for the launch template that describes to which application it correlates. After that, the init.py script customizes the AutoRestartTemplate.yaml file using the parameter values you entered, and the content of the file is displayed in the terminal for you to verify before confirming everything looks correct.
  3. After verifying the AutoRestartTemplate.yaml file looks correct, enter ‘y’ to confirm. Then, the script deploys a CloudFormation stack in your AWS account using the AutoRestartTemplate.yaml file as its template. It takes a few moments for the stack to deploy, after which it is visible in your AWS account under your CloudFormation console.
  4. Verify the CloudFormation stack is visible in your AWS account.
  5. You receive an email asking you to confirm your subscription to the SNS topic that was created for your CloudWatch alarm. This alarm monitors your Outposts server ConnectedStatus metric. This is a crucial step: without confirming your SNS topic subscription for this alarm, you won’t be notified in the event that your source Outposts server experiences a hardware or service link failure and this relaunch logic is used. Once you have confirmed your email address, the implementation of this Amazon EC2 auto-relaunch logic is complete. In the event of a service link or source Outposts server failure, your EC2 instances now automatically relaunch on the destination Outposts server subnet you supplied as a parameter in your launch template, and you receive an email notifying you that your source Outpost went down and a relaunch event occurred.

A service link failure is simulated on the source-outpost-a server for the purpose of this post. Within a minute or so of the CloudWatch alarm being triggered, you receive an email alert from the SNS topic to which you subscribed earlier in the post. The email alert looks like the following image:

After receiving this alert, you can navigate to your EC2 Dashboard and view your running instances. There you should see a new instance being launched. It takes a minute or two to finish initializing before showing that both status checks passed:

Now that your EC2 instance(s) has been relaunched on your healthy destination Outposts server, you can start triaging why your source Outposts server experienced a failure without worrying about getting your application(s) back up and running.

Cleaning up

Because this custom logic is implemented through CloudFormation, the only cleanup required is to delete the CloudFormation stack from your AWS account. Doing so deletes all resources that were deployed through the CloudFormation stack.
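
For example, using the stack name that you supplied to the init.py script, the stack can be deleted from the AWS CLI:

aws cloudformation delete-stack --stack-name <value>
aws cloudformation wait stack-delete-complete --stack-name <value>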

Conclusion

The use of custom logic through AWS tools such as CloudFormation, CloudWatch, and Lambda enables you to architect for HA for stateless workloads on an Outposts server. By implementing the custom logic we walked through in this post, you can automatically relaunch EC2 instances running on a source Outposts server to a secondary destination Outposts server, reducing the downtime of your applications in the event of a hardware or service link failure. The code provided in this post can also be further expanded upon to meet the unique needs of your workload.

Note that while the use of Infrastructure-as-Code (IaC) can improve your application’s availability and be used to standardize deployments across multiple Outposts servers, it is crucial to do regular failure drills to test the custom logic in place. This helps make sure that you understand your application’s expected behavior on relaunch in the event of a hardware failure. Check out Part 2 of this series to learn more about enabling HA on Outposts servers for stateful workloads.

Stream data to Amazon S3 for real-time analytics using the Oracle GoldenGate S3 handler

Post Syndicated from Prasad Matkar original https://aws.amazon.com/blogs/big-data/stream-data-to-amazon-s3-for-real-time-analytics-using-the-oracle-goldengate-s3-handler/

Modern business applications rely on timely and accurate data, with increasing demand for real-time analytics and a growing need for efficient and scalable data storage solutions. Data is often spread across different datasets and needs to be consolidated before meaningful and complete insights can be drawn. This is where replication tools help move the data from its source to the target systems in real time and transform it as necessary, helping businesses with consolidation.

In this post, we provide a step-by-step guide for installing and configuring Oracle GoldenGate for streaming data from relational databases to Amazon Simple Storage Service (Amazon S3) for real-time analytics using the Oracle GoldenGate S3 handler.

Oracle GoldenGate for Oracle Database and Big Data adapters

Oracle GoldenGate is a real-time data integration and replication tool used for disaster recovery, data migrations, and high availability. It captures and applies transactional changes in real time, minimizing latency and keeping target systems synchronized with source databases. It supports data transformation, allowing modifications during replication, and works with various database systems, including SQL Server, MySQL, and PostgreSQL. GoldenGate supports flexible replication topologies such as unidirectional, bidirectional, and multi-master configurations. Before using GoldenGate, make sure you have reviewed and adhere to the license agreement.

Oracle GoldenGate for Big Data provides adapters that facilitate real-time data integration from different sources to big data services like Hadoop, Apache Kafka, and Amazon S3. You can configure the adapters to control the data capture, transformation, and delivery process based on your specific requirements to support both batch-oriented and real-time streaming data integration patterns.

GoldenGate provides special tools called S3 event handlers to integrate with Amazon S3 for data replication. These handlers allow GoldenGate to read from and write data to S3 buckets. This option allows you to use Amazon S3 for GoldenGate deployments across on-premises, cloud, and hybrid environments.

Solution overview

The following diagram illustrates our solution architecture.

In this post, we walk you through the following high-level steps:

  1. Install GoldenGate software on Amazon Elastic Compute Cloud (Amazon EC2).
  2. Configure GoldenGate for Oracle Database and extract data from the Oracle database to trail files.
  3. Replicate the data to Amazon S3 using the GoldenGate for Big Data S3 handler.

Prerequisites

You must have the following prerequisites in place:

Install GoldenGate software on Amazon EC2

You need to run GoldenGate on EC2 instances. The instances must have adequate CPU, memory, and storage to handle the anticipated replication volume. For more details, refer to Operating System Requirements. After you determine the CPU and memory requirements, select a current generation EC2 instance type for GoldenGate.

Use the following formula to estimate the required trail space:

trail disk space = transaction log volume in 1 hour × number of hours down × 0.4
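
For example, assuming the source database generates 100 GB of transaction log per hour and you want to tolerate a 24-hour outage of the target, you would provision roughly 100 GB × 24 × 0.4 = 960 GB of trail disk space. This is only an illustrative calculation; measure your own hourly log volume before sizing the volume.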

When the EC2 instance is up and running, download the following GoldenGate software from the Oracle GoldenGate Downloads page:

  • GoldenGate 21.3.0.0
  • GoldenGate for Big Data 21c

Use the following steps to upload and install the file from your local machine to the EC2 instance. Make sure that your IP address is allowed in the inbound rules of the security group of your EC2 instance before starting a session. For this use case, we install GoldenGate for Classic Architecture and Big Data. See the following code:

scp -i pem-key.pem 213000_fbo_ggs_Linux_x64_Oracle_shiphome.zip ec2-user@hostname:~/.
ssh -i pem-key.pem ec2-user@hostname
unzip 213000_fbo_ggs_Linux_x64_Oracle_shiphome.zip

Install GoldenGate 21.3.0.0

Complete the following steps to install GoldenGate 21.3 on an EC2 instance:

  1. Create a home directory to install the GoldenGate software, then change into the unzipped installer directory and run the installer:
    mkdir /u01/app/oracle/product/OGG_DB_ORACLE
    cd fbo_ggs_Linux_x64_Oracle_shiphome/Disk1
    
    ls -lrt
    total 8
    drwxr-xr-x. 4 oracle oinstall 187 Jul 29 2021 install
    drwxr-xr-x. 12 oracle oinstall 4096 Jul 29 2021 stage
    -rwxr-xr-x. 1 oracle oinstall 918 Jul 29 2021 runInstaller
    drwxrwxr-x. 2 oracle oinstall 25 Jul 29 2021 response

  2. Run runInstaller:
    [oracle@hostname Disk1]$ ./runInstaller
    Starting Oracle Universal Installer.
    Checking Temp space: must be greater than 120 MB.   Actual 193260 MB   Passed
    Checking swap space: must be greater than 150 MB.   Actual 15624 MB    Passed

A GUI window will pop up to install the software.

  3. Follow the instructions in the GUI to complete the installation process. Provide the directory path you created as the home directory for GoldenGate.

After the GoldenGate software installation is complete, you can create the GoldenGate processes that read the data from the source. First, you configure OGG EXTRACT.

  1. Create an extract parameter file for the source Oracle database. The following code is the sample file content:
    [oracle@hostname Disk1]$vi eabc.prm
    
    -- Extract group name
    EXTRACT EABC
    SETENV (TNS_ADMIN = "/u01/app/oracle/product/19.3.0/network/admin")
    
    -- Extract database user login
    
    USERID ggs_admin@mydb, PASSWORD "********"
    
    -- Local trail on the remote host
    EXTTRAIL /u01/app/oracle/product/OGG_DB_ORACLE/dirdat/ea
    IGNOREREPLICATES
    GETAPPLOPS
    TRANLOGOPTIONS EXCLUDEUSER ggs_admin
    TABLE scott.emp;

  2. Add the EXTRACT on the GoldenGate prompt by running the following command:
    GGSCI> ADD EXTRACT EABC, TRANLOG, BEGIN NOW

  3. After you add the EXTRACT, check the status of the running programs with the info all command.

You will see the EXTRACT status is in the STOPPED state, as shown in the following screenshot; this is expected.

  4. Start the EXTRACT process as shown in the following figure.

The status changes to RUNNING. The following are the different statuses:

  • STARTING – The process is starting.
  • RUNNING – The process has started and is running normally.
  • STOPPED – The process has stopped either normally (controlled manner) or due to an error.
  • ABENDED – The process has been stopped in an uncontrolled manner. An abnormal end is known as ABEND.

This will start the extract process and a trail file will be created in the location mentioned in the extract parameter file.

  5. You can verify this by using the stats <<group_name>> command.
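
With the extract group EABC added earlier, that verification looks like the following (the output itself varies with your workload and is not shown here):

GGSCI> STATS EABC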

Install GoldenGate for Big Data 21c

In this step, we install GoldenGate for Big Data in the same EC2 instance where we installed the GoldenGate Classic Architecture.

  1. Create a directory for the GoldenGate for Big Data software, copy the .zip file into it, and follow these steps:
    mkdir /u01/app/oracle/product/OGG_BIG_DATA
    
    unzip 214000_ggs_Linux_x64_BigData_64bit.zip
    tar -xvf ggs_Linux_x64_BigData_64bit.tar
    
    GGSCI> CREATE SUBDIRS
    GGSCI> EDIT PARAM MGR
    PORT 7801
    
    GGSCI> START MGR

This will start the MANAGER program. Now you can install the dependencies required for the REPLICAT to run.

  2. Go to /u01/app/oracle/product/OGG_BIG_DATA/DependencyDownloader and run the aws.sh script, passing the latest version of the AWS SDK for Java as its argument. This script downloads the AWS SDK, which provides client libraries for connectivity to the AWS Cloud.
    [oracle@hostname DependencyDownloader]$ ./aws.sh 1.12.748

Configure the S3 handler

To configure a GoldenGate Replicat to send data to an S3 bucket, you need to set up a Replicat parameter file and a properties file that defines how data is handled and sent to Amazon S3.

AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the access key and secret access key of your IAM user, respectively. Do not hardcode credentials or security keys in the parameter and properties files. There are several methods available to avoid this, such as the following:

#!/bin/bash

# Use environment variables that are already set in the OS
export AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY
export AWS_REGION="your_aws_region"

You can set these environment variables in your shell configuration file (e.g., .bashrc, .bash_profile, .zshrc) or use a secure method to set them temporarily:

export AWS_ACCESS_KEY_ID="your_access_key_id"
export AWS_SECRET_ACCESS_KEY="your_secret_access_key"

Configure the properties file

Create a properties file for the S3 handler. This file defines how GoldenGate will interact with your S3 bucket. Make sure that you have added the correct parameters as shown in the properties file.

The following code is an example of an S3 handler properties file (dirprm/reps3.properties):

[oracle@hostname dirprm]$ cat reps3.properties
gg.handlerlist=filewriter

gg.handler.filewriter.type=filewriter
gg.handler.filewriter.fileRollInterval=60s
gg.handler.filewriter.fileNameMappingTemplate=${tableName}${currentTimestamp}.json
gg.handler.filewriter.pathMappingTemplate=./dirout
gg.handler.filewriter.stateFileDirectory=./dirsta
gg.handler.filewriter.format=json
gg.handler.filewriter.finalizeAction=rename
gg.handler.filewriter.fileRenameMappingTemplate=${tableName}${currentTimestamp}.json
gg.handler.filewriter.eventHandler=s3

goldengate.userexit.writers=javawriter
#TODO Set S3 Event Handler- please update as needed
gg.eventhandler.s3.type=s3
gg.eventhandler.s3.region=eu-west-1
gg.eventhandler.s3.bucketMappingTemplate=s3bucketname
gg.eventhandler.s3.pathMappingTemplate=${tableName}_${currentTimestamp}
gg.eventhandler.s3.accessKeyId=$AWS_ACCESS_KEY_ID
gg.eventhandler.s3.secretKey=$AWS_SECRET_ACCESS_KEY

gg.classpath=/u01/app/oracle/product/OGG_BIG_DATA/dirprm/:/u01/app/oracle/product/OGG_BIG_DATA/DependencyDownloader/dependencies/aws_sdk_1.12.748/
gg.log=log4j
gg.log.level=DEBUG

#javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=.:ggjava/ggjava.jar -Daws.accessKeyId=my_access_key_id -Daws.secretKey=my_secret_key
javawriter.bootoptions=-Xmx512m -Xms32m -Djava.class.path=.:ggjava/ggjava.jar

Configure GoldenGate REPLICAT

Create the parameter file in /dirprm in the GoldenGate for Big Data home:

[oracle@hostname dirprm]$ vi rps3.prm
REPLICAT rps3
-- Command to add REPLICAT
-- add replicat fw, exttrail AdapterExamples/trail/tr
SETENV(GGS_JAVAUSEREXIT_CONF = 'dirprm/reps3.properties')
TARGETDB LIBFILE libggjava.so SET property=dirprm/reps3.properties
REPORTCOUNT EVERY 1 MINUTES, RATE
MAP SCOTT.EMP, TARGET gg.handler.s3handler;

[oracle@hostname OGG_BIG_DATA]$ ./ggsci
GGSCI > add replicat rps3, exttrail ./dirdat/tr/ea
Replicat added.

GGSCI > info all
Program     Status     Group    Lag at Chkpt    Time Since Chkpt
MANAGER     RUNNING
REPLICAT    STOPPED    RPS3     00:00:00        00:00:39

GGSCI > start *
Sending START request to Manager ...
Replicat group RPS3 starting.

Now you have successfully started the Replicat. You can verify this by running info and stats commands followed by the Replicat name, as shown in the following screenshot.

To confirm that the file has been replicated to an S3 bucket, open the Amazon S3 console and open the bucket you created. You can see that the table data has been replicated to Amazon S3 in JSON file format.
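
You can also confirm this from the AWS CLI by listing the bucket configured in gg.eventhandler.s3.bucketMappingTemplate; object names follow the ${tableName}_${currentTimestamp} path template set in the properties file:

aws s3 ls s3://s3bucketname/ --recursive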

Best practices

Make sure that you are following the best practices on performance, compression, and security.

The following are best practices for compression:

  • Enable compression for trail files to reduce storage requirements and improve network transfer performance.
  • Use GoldenGate’s built-in compression capabilities or use file system-level compression tools.
  • Strike a balance between compression level and CPU overhead, because higher compression levels may impact performance.

Lastly, when implementing Oracle GoldenGate for streaming data to Amazon S3 for real-time analytics, it’s crucial to address various security considerations to protect your data and infrastructure. Follow the security best practices for Amazon S3 and security options available for GoldenGate Classic Architecture.

Clean up

To avoid ongoing charges, delete the resources that you created as part of this post:

  1. Remove the S3 bucket and trail files if no longer needed and stop the GoldenGate processes on Amazon EC2.
  2. Revert the changes that you made in the database (such as grants, supplemental logging, and archive log retention).
  3. To delete the entire setup, stop your EC2 instance.

Conclusion

In this post, we provided a step-by-step guide for installing and configuring GoldenGate for Oracle Classic Architecture and Big Data for streaming data from relational databases to Amazon S3. With these instructions, you can successfully set up an environment and take advantage of the real-time analytics using a GoldenGate handler for Amazon S3, which we will explore further in an upcoming post.

If you have any comments or questions, leave them in the comments section.


About the Authors

Prasad Matkar is Database Specialist Solutions Architect at AWS based in the EMEA region. With a focus on relational database engines, he provides technical assistance to customers migrating and modernizing their database workloads to AWS.

Arun Sankaranarayanan is a Database Specialist Solution Architect based in London, UK. With a focus on purpose-built database engines, he assists customers in migrating and modernizing their database workloads to AWS.

Giorgio Bonzi is a Sr. Database Specialist Solutions Architect at AWS based in the EMEA region. With a focus on relational database engines, he provides technical assistance to customers migrating and modernizing their database workloads to AWS.

Using Amazon APerf to go from 50% below to 36% above performance target

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/using-amazon-aperf-to-go-from-50-below-to-36-above-performance-target/

This post is written by Tyler Jones, Senior Solutions Architect – Graviton, AWS.

Performance tuning the Renaissance Finagle-http benchmark

Sometimes software doesn’t perform the way it’s expected to across different systems. This can be due to a configuration error, code bug, or differences in hardware performance. Amazon APerf is a powerful tool designed to help identify and address performance issues on AWS instances and other computers. APerf captures comprehensive system metrics simultaneously and then visualizes them in an interactive report. The report allows users to analyze metrics such as CPU usage, interrupts, memory usage, and CPU core performance counters (PMU) together. APerf is particularly useful for performance tuning workloads across different instance types, as it can generate side-by-side reports for easy comparison. APerf is valuable for developers, system administrators, and performance engineers who need to optimize application performance on AWS. From here on we use the Renaissance benchmark as an example to demonstrate how APerf is used to debug and find performance bottlenecks.

The example

The Renaissance finagle-http benchmark was found to run 50% slower on a c7g.16xl Graviton3 instance than on a reference instance, both initially using the Linux-5.15.60 kernel. This is unexpected: Graviton3 should perform as well as or better than our reference instance, as it does for other Java-based workloads. It’s likely there is a configuration problem somewhere. The Renaissance finagle-http benchmark is written in Scala but produces Java byte code, so our investigation focuses on the Java JVM as well as system-level configurations.

Overview

System performance tuning is an iterative process that is conducted in two main phases, the first focuses on overall system issues, and the second focuses on CPU core bottlenecks. APerf is used to assist in both phases.

APerf can render several instances’ data in one report, side by side, typically a reference system and the system to be tuned. The reference system provides values to compare against. In isolation, metrics are harder to evaluate. A metric may be acceptable in general but the comparison to the reference system makes it easier to spot room for improvement.

APerf helps to identify unusual system behavior, such as high interrupt load, excessive I/O wait, unusual network layer patterns, and other such issues in the first phase. After adjustments are made to address these issues, for example by modifying JVM flags, the second phase starts. Using the system tuning of the first phase, fresh APerf data is collected and evaluated with a focus on CPU core performance metrics.

Any inferior metric of the system under test (SUT) CPU core, as compared to the reference system, holds potential for improvement. In the following sections we discuss the two phases in detail.
For more background on system tuning, refer to the Performance Runbook in the AWS Graviton Getting Started guide.

Initial data collection

Here is an example of how 240 seconds of system data is collected with APerf:

#enable PMU access
echo 0 | sudo tee /proc/sys/kernel/perf_event_paranoid
#APerf has to open more than the default limit of files.
ulimit -n 65536
#usually APerf would be run in another terminal.
#For illustration purposes it is sent to the background here.
./aperf record --run-name finagle_1 --period 240 &
#With 64 CPUs it takes APerf a little less than 15s to report readiness 
#for data collection.
sleep 15
java -jar renaissance-gpl-0.14.2.jar -r 8 finagle-http

The APerf report is generated as follows:

./aperf report --run finagle_1

Then, the report can be viewed with a web browser:

firefox aperf_report_finagle_1/index.html

The APerf report can render several data sets in the same report, side-by-side.

./aperf report --run finagle_1_c7g --run finagle_1_reference --run ...

Note that it is crucial to examine the CPU usage over time shown in APerf. Some data may be gathered while the system is idle. The metrics during idle times have no significant value.

First phase: system level

In this phase we look for differences on the system level. To do this APerf collects data during runs of finagle-http on c7g.16xl and the reference. The reference system provides the target numbers to compare against. Any large difference warrants closer inspection.

The first differences can be seen in the following figure.

The APerf CPU usage plot shows larger drops on Graviton3 (highlighted in red) at the beginning of each run than on the reference instance.

Figure 1: CPU usage. c7g.16xl on the left, reference on the right.

The log messages about the GC runtime hint at a possible reason as they coincide with the dips in CPU usage.

====== finagle-http (web) [default], iteration 0 started ======
GC before operation: completed in 32.692 ms, heap usage 87.588 MB → 31.411 MB.
====== finagle-http (web) [default], iteration 0 completed (6534.533 ms) ======

The JVM tends to spend a significant amount of time in garbage collection, during which time it has to suspend all threads, and choosing a different GC may have a positive impact.

The default GC on OpenJDK 17 is G1GC. Using parallelGC is an alternative, given that the instances have 64 CPUs and GC can therefore run with high parallelism. The Graviton Getting Started guide also recommends checking the GC log when working on Java performance issues. A cross check using the JVM’s -Xlog:gc option confirms the reduced GC time with parallelGC.
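
As a minimal sketch, assuming OpenJDK 17 and the benchmark invocation shown earlier, the cross check can be run like this:

#rerun the benchmark with parallelGC and GC logging enabled
java -XX:+UseParallelGC -Xlog:gc -jar renaissance-gpl-0.14.2.jar -r 8 finagle-http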

The second difference is evident in the following figure, the CPU-to-CPU interrupts (IPI). There is more than 10x higher activity on Graviton3, which means additional IRQ work on c7g.16xl that the reference system does not have to spend CPU cycles on.

Figure 2: IPI0/RES Interrupts. c7g.16xl on the left, reference on the right.

Grepping through kernel commit messages can help find patches that address a particular issue, such as IPI inefficiencies.

This scheduling patch improves performance by 19%. Another IPI patch provides an additional 1% improvement. Switching to Linux-5.15.61 allows us to use these IPI improvements. The following figure shows the effect in APerf.

Figure 3: IPI0 Interrupts (c7g.16xl)

Second phase: CPU core level

Now that the system level issues are mitigated, the focus is on the CPU cores. The data collected by APerf shows PMU data where the reference instance and Graviton3 differ significantly.

PMU metric                                      Graviton3    Reference system
Branch prediction misses/1000 instructions      16           4
Instruction TLB misses/1000 instructions        8.3          4
Instructions per clock cycle (IPC)              0.6          0.6*

*Note that the reference system has a 20% higher CPU clock than c7g.16xl. As a rule of thumb, instructions per clock (IPC) multiplied by clock rate equals work done by a CPU.

Addressing CPU Core bottlenecks

The first improvement in PMU metrics stems from the parallelGC option. Although the intention was to increase CPU usage, the following figure shows lowered branch miss counts as well. Limiting the JIT tiered compilation to only use C2 mode helps branch prediction by reducing branch indirection and increasing the locality of executed code. Finally, adding transparent huge pages helps the branch prediction logic and avoids lengthy address translation look-ups in DDR memory. The following graphs show the effects of the chosen JVM options.

Branch misses/1000 instructions (c7g.16xl)

Figure 4: Branch misses per 1k instructions

JVM Options from left to right:

  • -XX:+UseParallelGC
  • -XX:+UseParallelGC -XX:-TieredCompilation
  • -XX:+UseParallelGC -XX:-TieredCompilation -XX:+UseTransparentHugePages

With the options listed under the preceding figure, APerf shows the branch prediction miss rate decreasing from the initial 16 to 11. Branch mis-predictions incur significant performance penalties as they result in wasted cycles spent computing results that ultimately need to be discarded. Furthermore, these mis-predictions cause the prefetching and cache subsystems to fail to load the necessary subsequent instructions into cache. Consequently, costly pipeline stalls and frontend stalls occur, preventing the CPU from executing instructions.

Code sparsity

Figure 5: Code sparsity

JVM Options from left to right:

  • -XX:+UseParallelGC
  • -XX:+UseParallelGC -XX:-TieredCompilation
  • -XX:+UseParallelGC -XX:-TieredCompilation -XX:+UseTransparentHugePages

Code sparsity is a measure of how compact the instruction code is packed and how closely related code is placed. This is where turning off tiered compilation shows its effect. Lower sparsity helps branch prediction and the cache subsystem.

Instruction TLB misses/1000 instructions (c7g.16xl)

Figure 6: Instruction TLB misses per 1k instructions

JVM Options from left to right:

  • -XX:+UseParallelGC
  • -XX:+UseParallelGC -XX:-TieredCompilation
  • -XX:+UseParallelGC -XX:-TieredCompilation -XX:+UseTransparentHugePages

The big decrease in TLB misses is caused by the use of transparent huge pages, which increase the likelihood that a virtual address translation is present in the TLB, since fewer entries are needed. This avoids translation table walks that would otherwise traverse entries in DDR memory, costing hundreds of CPU cycles to read.

Instructions per clock cycle (IPC)

Figure 7: Instructions per clock cycle


JVM Options from left to right:

  • -XX:+UseParallelGC
  • -XX:+UseParallelGC -XX:-TieredCompilation
  • -XX:+UseParallelGC -XX:-TieredCompilation -XX:+UseTransparentHugePages

The preceding figure shows IPC increasing from 0.58 to 0.71 as JVM flags are added.

Results

This table summarizes the measures taken for performance improvement and their results.

JVM option set | baseline | 1 | 2 | 3
IPC | 0.6 | 0.6 | 0.63 | 0.71
Branch misses/1000 instructions | 16 | 14 | 12 | 11
ITLB misses/1000 instructions | 8.3 | 7.7 | 8.8 | 1.1
Benchmark runtime [ms] | 6000 | 3922 | 3843 | 3512
Execution time improvement vs. baseline | 0% | 35% | 36% | 41%

JVM option sets:
  1. +parallelGC
  2. +parallelGC -tieredCompilation
  3. +parallelGC -tieredCompilation +UseTransparentHugePages

Re-examining the setup of our testing environment: changing where the load generator lives

With the preceding efforts, c7g.16xl reaches 91% of the reference system's performance. The c7g.16xl branch prediction miss rate, at 11, is still higher than the reference's rate of 4. As shown in the preceding figures, reduced branch prediction misses have a strong positive effect on performance. What follows is an experiment to achieve parity with, or exceed, the reference system by further reducing branch prediction misses.

Finagle-http serves HTTP requests generated by wrk2, a load generator implemented in C. The expectation is that the c7g.16xl branch predictor works better with the native wrk2 binary than with the Renaissance load generator, which runs on the JVM. The wrk2 load generator and finagle-http are assigned through taskset to separate sets of CPUs: 16 CPUs for wrk2 and 48 CPUs for finagle-http. The idea is to let the branch predictors on each CPU set focus on a limited code set. The following diagram illustrates the difference between the Renaissance setup and the experimental setup.
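
A sketch of this CPU pinning is shown below; the 16/48 CPU split matches the description above, while the wrk2 flags and the service launch command are illustrative assumptions:

# Pin the load generator to CPUs 0-15 (wrk2 is commonly built as the 'wrk' binary;
# thread, connection, duration, and rate flags are placeholders)
taskset -c 0-15 ./wrk -t16 -c256 -d300s -R100000 http://localhost:8080/

# Pin the finagle-http service to CPUs 16-63 with the tuned JVM options
taskset -c 16-63 java -XX:+UseParallelGC -XX:-TieredCompilation \
    -XX:+UseTransparentHugePages -jar finagle-http.jar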

With this CPU performance tuned setup, c7g.16xl can now handle a 36% higher request load than the reference using the same configuration, at an average latency limit of 1.5ms. This illustrates the impact that system tuning with APerf can have. The same system that scored 50% lower than the comparison system now exceeds it by 36%.
The following APerf data shows the improvement in key PMU metrics that led to the performance jump.

Branch prediction misses/1000 instructions

Figure 8: Branch misses per 1k instructions

Left chart: Optimized Java-only setup. Right chart: Finagle-http with wrk2 load generator

The branch prediction miss rate is reduced to 1.5 from 11 with the Java-only setup.

IPC

Figure 9: Instructions Per Clock Cycle


Left chart: Optimized Java-only setup. Right chart: Finagle-http with wrk2 load generator

The IPC steps up from 0.7 to 1.5 due to the improvement in branch prediction.

Code sparsity

Figure 10: Code Sparsity

Left chart: Optimized Java-only setup. Right chart: Finagle-http with wrk2 load generator


The code sparsity decreases to 0.014 from 0.21, a factor of 15.

Conclusion

AWS created the APerf tool to aid in root cause analysis and help address performance issues for any workload. APerf is a standalone binary that captures relevant data simultaneously as a time series, such as CPU usage, interrupt frequency, memory usage, and CPU core metrics (PMU counters). APerf can generate reports for multiple data captures, making it easy to spot differences between instance types. We were able to use this data to analyze why Graviton3 was underperforming and to see the impact of our changes on performance. Using APerf, we successfully adjusted configuration parameters and went from 50% below our performance target to 36% above our reference system and its associated performance target. Without APerf, collecting and visualizing these metrics is a non-trivial task. With APerf, you can capture and visualize them with two short commands, saving you time and effort so you can focus on what matters most: getting the most performance from your application.

AWS Weekly Roundup: Advanced capabilities in Amazon Bedrock and Amazon Q, and more (July 15, 2024).

Post Syndicated from Abhishek Gupta original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-advanced-capabilities-in-amazon-bedrock-and-amazon-q-and-more-july-15-2024/

As expected, there were lots of exciting launches and updates announced during the AWS Summit New York. You can quickly scan the highlights in Top Announcements of the AWS Summit in New York, 2024.


My colleagues and fellow AWS News Blog writers Veliswa Boya and Sébastien Stormacq were at the AWS Community Day Cameroon last week. They were energized to meet amazing professionals, mentors, and students – all willing to learn and exchange thoughts about cloud technologies. You can access the video replay to feel the vibes or just watch some of the talks!

AWS Community Day Cameroon 2024

Last week’s launches
In addition to the launches at the New York Summit, here are a few others that got my attention.

Advanced RAG capabilities for Knowledge Bases for Amazon Bedrock – These include custom chunking options that enable customers to write their own chunking code as a Lambda function; smart parsing to extract information from complex data such as tables; and query reformulation to break down queries into simpler sub-queries, retrieve relevant information for each, and combine the results into a final comprehensive answer.

Amazon Bedrock Prompt Management and Prompt Flows – This is a preview launch of Prompt Management, which helps developers and prompt engineers get the best responses from foundation models for their use cases, and Prompt Flows, which accelerates the creation, testing, and deployment of workflows through an intuitive visual builder.

Fine-tuning for Anthropic’s Claude 3 Haiku in Amazon Bedrock (preview) – By providing your own task-specific training dataset, you can fine-tune and customize Claude 3 Haiku to boost model accuracy, quality, and consistency, further tailoring generative AI for your business.

IDE workspace context awareness in Amazon Q Developer chat – Users can now add @workspace to their chat message in Q Developer to ask questions about the code in the project they currently have open in the IDE. Q Developer automatically ingests and indexes all code files, configurations, and project structure, giving the chat comprehensive context across your entire application within the IDE.

New features in Amazon Q Business –  The new personalization capabilities in Amazon Q Business are automatically enabled and will use your enterprise’s employee profile data to improve their user experience. You can now get answers from text content in scanned PDFs, and images embedded in PDF documents, without having to use OCR for preprocessing and text extraction.

Amazon EC2 R8g instances powered by AWS Graviton4 are now generally available – Amazon EC2 R8g instances are ideal for memory-intensive workloads such as databases, in-memory caches, and real-time big data analytics. These are powered by AWS Graviton4 processors and deliver up to 30% better performance compared to AWS Graviton3-based instances.

Vector search for Amazon MemoryDB is now generally available – Vector search for MemoryDB enables real-time machine learning (ML) and generative AI applications. It can store millions of vectors with single-digit millisecond query and update latencies at the highest levels of throughput with >99% recall.

Introducing Valkey GLIDE, an open source client library for Valkey and Redis open source – Valkey is an open source key-value data store that supports a variety of workloads, such as caching and message queues. Valkey GLIDE is one of the official client libraries for Valkey, and it supports all Valkey commands. GLIDE supports Valkey 7.2 and above, and Redis open source 6.2, 7.0, and 7.2.

Amazon OpenSearch Service enhancements – Amazon OpenSearch Serverless now supports workloads up to 30 TB of data for time-series collections, enabling more data-intensive use cases, and an innovative caching mechanism that automatically fetches and intelligently manages data, leading to faster data retrieval, efficient storage usage, and cost savings. Amazon OpenSearch Service has also added support for AI-powered natural language query generation in OpenSearch Dashboards Log Explorer, so you can get started quickly with log analysis without first having to be proficient in PPL.

Open source release of Secrets Manager Agent for AWS Secrets Manager – Secrets Manager Agent is a language agnostic local HTTP service that you can install and use in your compute environments to read secrets from Secrets Manager and cache them in memory, instead of making a network call to Secrets Manager.

Amazon S3 Express One Zone now supports logging of all events in AWS CloudTrail – This capability lets you get details on who made API calls to S3 Express One Zone and when API calls were made, thereby enhancing data visibility for governance, compliance, and operational auditing.

Amazon CloudFront announces managed cache policies for web applications – Previously, Amazon CloudFront customers had two options for managed cache policies, and had to create custom cache policies for all other cases. With the new managed cache policies, CloudFront caches content based on the Cache-Control headers returned by the origin, and defaults to not caching when the header is not returned.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

We launched existing services in additional Regions:

Other AWS news
Here are some additional projects, blog posts, and news items that you might find interesting:

Context window overflow: Breaking the barrier – This blog post dives into the intricate workings of generative artificial intelligence (AI) models and why it is crucial to understand and mitigate the limitations of context window overflow (CWO).

Using Agents for Amazon Bedrock to interactively generate infrastructure as code – This blog post explores how Agents for Amazon Bedrock can be used to generate customized, organization standards-compliant IaC scripts directly from uploaded architecture diagrams.

Automating model customization in Amazon Bedrock with AWS Step Functions workflow – This blog post covers orchestrating repeatable and automated workflows for customizing Amazon Bedrock models and how AWS Step Functions can help overcome key pain points in model customization.

AWS open source news and updates – My colleague Ricardo Sueiras writes about open source projects, tools, and events from the AWS Community; check out Ricardo’s page for the latest updates.

Upcoming AWS events
Check your calendars and sign up for upcoming AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. To learn more about future AWS Summit events, visit the AWS Summit page. Register in your nearest city: Bogotá (July 18), Taipei (July 23–24), AWS Summit Mexico City (Aug. 7), and AWS Summit Sao Paulo (Aug. 15).

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world. Upcoming AWS Community Days are in Aotearoa (Aug. 15), Nigeria (Aug. 24), New York (Aug. 28), and Belfast (Sept. 6).

Browse all upcoming AWS led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Abhishek

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

AWS Graviton4-based Amazon EC2 R8g instances: best price performance in Amazon EC2

Post Syndicated from Esra Kayabali original https://aws.amazon.com/blogs/aws/aws-graviton4-based-amazon-ec2-r8g-instances-best-price-performance-in-amazon-ec2/

Today, I am very excited to announce that the new AWS Graviton4-based Amazon Elastic Compute Cloud (Amazon EC2) R8g instances, which have been available in preview since re:Invent 2023, are now generally available to all. AWS offers more than 150 different AWS Graviton-powered Amazon EC2 instance types globally at scale, has built more than 2 million Graviton processors, and has more than 50,000 customers using AWS Graviton-based instances to achieve the best price performance for their applications.

AWS Graviton4 is the most powerful and energy efficient processor we have ever designed for a broad range of workloads running on Amazon EC2. Like all the other AWS Graviton processors, AWS Graviton4 uses a 64-bit Arm instruction set architecture. AWS Graviton4-based Amazon EC2 R8g instances deliver up to 30% better performance than AWS Graviton3-based Amazon EC2 R7g instances. This helps you to improve performance of your most demanding workloads such as high-performance databases, in-memory caches, and real time big data analytics.

Since the preview announcement at re:Invent 2023, over 100 customers, including Epic Games, SmugMug, Honeycomb, SAP, and ClickHouse have tested their workloads on AWS Graviton4-based R8g instances and observed significant performance improvement over comparable instances. SmugMug achieved 20-40% performance improvements using AWS Graviton4-based instances compared to AWS Graviton3-based instances for their image and data compression operations. Epic Games found AWS Graviton4 instances to be the fastest EC2 instances they have ever tested and Honeycomb.io achieved more than double the throughput per vCPU compared to the non-Graviton based instances that they used four years ago.

Let’s look at some of the improvements that we have made available in our new instances. R8g instances offer larger instance sizes with up to 3x more vCPUs (up to 48xl), 3x the memory (up to 1.5TB), 75% more memory bandwidth, and 2x more L2 cache over R7g instances. This helps you to process larger amounts of data, scale up your workloads, improve time to results, and lower your TCO. R8g instances also offer up to 50 Gbps network bandwidth and up to 40 Gbps EBS bandwidth compared to up to 30 Gbps network bandwidth and up to 20 Gbps EBS bandwidth on Graviton3-based instances.

R8g instances are the first Graviton instances to offer two bare metal sizes (metal-24xl and metal-48xl). You can right size your instances and deploy workloads that benefit from direct access to physical resources. Here are the specs for R8g instances:

Instance Size | vCPUs | Memory | Network Bandwidth | EBS Bandwidth
r8g.medium | 1 | 8 GiB | up to 12.5 Gbps | up to 10 Gbps
r8g.large | 2 | 16 GiB | up to 12.5 Gbps | up to 10 Gbps
r8g.xlarge | 4 | 32 GiB | up to 12.5 Gbps | up to 10 Gbps
r8g.2xlarge | 8 | 64 GiB | up to 15 Gbps | up to 10 Gbps
r8g.4xlarge | 16 | 128 GiB | up to 15 Gbps | up to 10 Gbps
r8g.8xlarge | 32 | 256 GiB | 15 Gbps | 10 Gbps
r8g.12xlarge | 48 | 384 GiB | 22.5 Gbps | 15 Gbps
r8g.16xlarge | 64 | 512 GiB | 30 Gbps | 20 Gbps
r8g.24xlarge | 96 | 768 GiB | 40 Gbps | 30 Gbps
r8g.48xlarge | 192 | 1,536 GiB | 50 Gbps | 40 Gbps
r8g.metal-24xl | 96 | 768 GiB | 40 Gbps | 30 Gbps
r8g.metal-48xl | 192 | 1,536 GiB | 50 Gbps | 40 Gbps
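
As a quick illustration, launching one of these sizes with the AWS CLI could look like the following sketch, where the AMI ID (which must be an arm64 image), key pair, and subnet are placeholders to replace with your own:

aws ec2 run-instances \
    --image-id ami-0123456789abcdef0 \
    --instance-type r8g.xlarge \
    --count 1 \
    --key-name my-key-pair \
    --subnet-id subnet-0123456789abcdef0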

If you are looking for more energy-efficient compute options to help you reduce your carbon footprint and achieve your sustainability goals, R8g instances provide the best energy efficiency for memory-intensive workloads in EC2. Additionally, these instances are built on the AWS Nitro System, which offloads CPU virtualization, storage, and networking functions to dedicated hardware and software to enhance the performance and security of your workloads. The Graviton4 processors offer you enhanced security by fully encrypting all high-speed physical hardware interfaces.

R8g instances are ideal for all Linux-based workloads, including containerized and microservices-based applications built using Amazon Elastic Kubernetes Service (Amazon EKS), Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Container Registry (Amazon ECR), Kubernetes, and Docker, as well as applications written in popular programming languages such as C/C++, Rust, Go, Java, Python, .NET Core, Node.js, Ruby, and PHP. AWS Graviton4 processors are up to 30% faster for web applications, 40% faster for databases, and 45% faster for large Java applications than AWS Graviton3 processors. To learn more, visit the AWS Graviton Technical Guide.

Check out the collection of Graviton resources to help you start migrating your applications to Graviton instance types. You can also visit the AWS Graviton Fast Start program to begin your Graviton adoption journey.

Now available
R8g instances are available today in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Frankfurt) AWS Regions.

You can purchase R8g instances as Reserved Instances, On-Demand, Spot Instances, and via Savings Plans. For further information, visit Amazon EC2 pricing.

To learn more about Graviton-based instances, visit AWS Graviton Processors or the Amazon EC2 R8g Instances.

— Esra

Best practices working with self-hosted GitHub Action runners at scale on AWS

Post Syndicated from Shilpa Sharma original https://aws.amazon.com/blogs/devops/best-practices-working-with-self-hosted-github-action-runners-at-scale-on-aws/

Overview

GitHub Actions is a continuous integration and continuous deployment platform that enables the automation of build, test and deployment activities for your workload. GitHub Self-Hosted Runners provide a flexible and customizable option to run your GitHub Action pipelines. These runners allow you to run your builds on your own infrastructure, giving you control over the environment in which your code is built, tested, and deployed. This reduces your security risk and costs, and gives you the ability to use specific tools and technologies that may not be available in GitHub hosted runners. In this blog, I explore security, performance and cost best practices to take into consideration when deploying GitHub Self-Hosted Runners to your AWS environment.

Best Practices

Understand your security responsibilities

GitHub self-hosted runners, by design, execute code defined within a GitHub repository, either through the workflow scripts or through the repository build process. You must understand that the security of your AWS runner execution environments is dependent upon the security of your GitHub implementation. While a complete overview of GitHub security is outside the scope of this blog, I recommend that before you begin integrating your GitHub environment with your AWS environment, you review and understand at least the following GitHub security configurations.

  • Federate your GitHub users, and manage the lifecycle of identities through a directory.
  • Limit administrative privileges of GitHub repositories, and restrict who is able to administer permissions, write to repositories, modify repository configurations or install GitHub Apps.
  • Limit control over GitHub Actions runner registration and group settings
  • Limit control over GitHub workflows, and follow GitHub’s recommendations on using third-party actions
  • Do not allow public repositories access to self-hosted runners

Reduce security risk with short-lived AWS credentials

Make use of short-lived credentials wherever you can. They expire by default within 1 hour, and you do not need to rotate or explicitly revoke them. Short lived credentials are created by the AWS Security Token Service (STS). If you use federation to access your AWS account, assume roles, or use Amazon EC2 instance profiles and Amazon ECS task roles, you are using STS already!

In almost all cases, you do not need long-lived AWS Identity and Access Management (IAM) credentials (access keys) even for services that do not “run” on AWS – you can extend IAM roles to workloads outside of AWS without requiring you to manage long-term credentials. With GitHub Actions, we suggest you use OpenID Connect (OIDC). OIDC is a decentralized authentication protocol that is natively supported by STS using sts:AssumeRoleWithWebIdentity, GitHub and many other providers. With OIDC, you can create least-privilege IAM roles tied to individual GitHub repositories and their respective actions. GitHub Actions exposes an OIDC provider to each action run that you can utilize for this purpose.

Short lived AWS credentials with GitHub self-hosted runners
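
To illustrate the least-privilege pattern, the trust policy of such a role can restrict sts:AssumeRoleWithWebIdentity to a single repository. The following is a minimal sketch, assuming the GitHub OIDC provider is already registered in the account; the account ID, role name, and organization/repository are placeholders:

aws iam create-role \
    --role-name github-actions-example \
    --assume-role-policy-document '{
      "Version": "2012-10-17",
      "Statement": [{
        "Effect": "Allow",
        "Principal": {
          "Federated": "arn:aws:iam::111122223333:oidc-provider/token.actions.githubusercontent.com"
        },
        "Action": "sts:AssumeRoleWithWebIdentity",
        "Condition": {
          "StringEquals": {"token.actions.githubusercontent.com:aud": "sts.amazonaws.com"},
          "StringLike": {"token.actions.githubusercontent.com:sub": "repo:my-org/my-repo:*"}
        }
      }]
    }'

Scoping the sub claim to a repository (or even a specific branch or environment) keeps each role usable only by the workflows it was created for.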

If you have many repositories that you wish to grant an individual role to, you may run into a hard limit of the number of IAM roles in a single account. While I advocate solving this problem with a multi-account strategy, you can alternatively scale this approach by:

  • using attribute based access control (ABAC) to match claims in the GitHub token (such as repository name, branch, or team) to the AWS resource tags.
  • using role based access control (RBAC) by logically grouping the repositories in GitHub into Teams or applications to create fewer subset of roles.
  • using an identity broker pattern to vend credentials dynamically based on the identity provided to the GitHub workflow.

Use Ephemeral Runners

Configure your GitHub Actions runners to run in “ephemeral” mode, which creates (and destroys) individual short-lived compute environments per job on demand. The short environment lifespan and per-build isolation reduce the risk of data leakage, even in multi-tenanted continuous integration environments, as each build job remains isolated from others on the underlying host.

As each job runs on a new environment created on demand, there is no need for a job to wait for an idle runner, simplifying auto-scaling. With the ability to scale runners on demand, you do not need to worry about turning build infrastructure off when it is not needed (for example out of office hours), giving you a cost-efficient setup. To optimize the setup further, consider allowing developers to tag workflows with instance type tags and launch specific instance types that are optimal for respective workflows.

There are a few considerations to take into account when using ephemeral runners:

  • A job will remain queued until the runner EC2 instance has launched and is ready. This can take up to 2 minutes to complete. To speed up this process, consider using an optimized AMI with all prerequisites installed.
  • Since each job is launched on a fresh runner, utilizing caching on the runner is not possible. For example, Docker images and libraries will always be pulled from source.

Use Runner Groups to isolate your runners based on security requirements

By using ephemeral runners in a single GitHub runner group, you are creating a pool of resources in the same AWS account that are used by all repositories sharing this runner group. Your organizational security requirements may dictate that your execution environments must be isolated further, such as by repository or by environment (such as dev, test, prod).

Runner groups allow you to define the runners that will execute your workflows on a repository-by-repository basis. Creating multiple runner groups not only allow you to provide different types of compute environments, but allow you to place your workflow executions in locations within AWS that are isolated from each other. For example, you may choose to locate your development workflows in one runner group and test workflows in another, with each ephemeral runner group being deployed to a different AWS account.

Runners by definition execute code on behalf of your GitHub users. At a minimum, I recommend that your ephemeral runner groups are contained within their own AWS account and that this AWS account has minimal access to other organizational resources. When access to organizational resources is required, this can be given on a repository-by-repository basis through IAM role assumption with OIDC, and these roles can be given least-privilege access to the resources they require.

Optimize runner start up time using Amazon EC2 warm-pools

Ephemeral runners provide strong build isolation, simplicity and security. Since the runners are launched on demand, the job will be required to wait for the runner to launch and register itself with GitHub. While this usually happens in under 2 minutes, this wait time might not be acceptable in some scenarios.

We can use a warm pool of pre-registered ephemeral runners to reduce the wait time. These runners actively listen for incoming GitHub workflow events, and as soon as a workflow event is queued, it is picked up by one of the registered EC2 runners in the warm pool.

While there can be multiple strategies to manage the warm pool, I recommend the following strategy which uses AWS Lambda for scaling up and scaling down the ephemeral runners:

GitHub self-hosted runners warm pool flow


A GitHub workflow event is created on a trigger, such as a push of code to a repository or the merge of a pull request. This event triggers a Lambda function via a webhook and an Amazon API Gateway endpoint. The Lambda function validates the GitHub workflow event payload and logs events for observability and metrics; it can optionally be used to replenish the warm pool. Separate backend Lambda functions launch, scale up, and scale down the warm pool of EC2 instances. The EC2 instances, or runners, are registered with GitHub at launch time. Registered runners listen for incoming GitHub workflow events through GitHub’s internal job queue, and as soon as a workflow event is triggered, GitHub assigns it to one of the runners in the warm pool for execution. The runner is automatically de-registered once the job completes. A job can be a build or deploy request as defined in your GitHub workflow.

With a warm pool in place, wait times can be expected to drop by 70-80%.

Considerations

  • Increased complexity, as there is a possibility of over-provisioning runners. This depends on how long a runner EC2 instance takes to launch and reach a ready state, and on how frequently the scale-up Lambda function is configured to run. For example, if the scale-up Lambda function runs every minute and the EC2 runner requires 2 minutes to launch, the scale-up Lambda function will launch two instances. The mitigation is to use Auto Scaling groups to manage the EC2 warm pool and its desired capacity, with predictive scaling policies tied back to incoming GitHub workflow events, that is, build job requests.
  • This strategy may have to be revised when supporting Windows- or Mac-based runners, given that spin-up times can vary.

Use an optimized AMI to speed up the launch of GitHub self-hosted runners

Amazon Machine Images (AMI) provide a pre-configured, optimized image that can be used to launch the runner EC2 instance. By using AMIs, you will be able to reduce the launch time of a new runner since dependencies and tools are already installed. Consistency across builds is guaranteed due to all instances running the same version of dependencies and tools. Runners will benefit from increased stability and security compliance as images are tested and approved before being distributed for use as runner instances.

When building an AMI for use as a GitHub self-hosted runner the following considerations need to be made:

  • Choosing the right OS base image for the builds. This will depend on your tech stack and toolset.
  • Install the GitHub runner app as part of the image (a registration sketch follows this list). Ensure automatic runner updates are enabled to reduce the overhead of managing runner versions. In case a specific runner version must be used, you can disable automatic runner updates to avoid untested changes. Keep in mind that, if disabled, a runner will need to be updated manually within 30 days of a new version becoming available.
  • Install build tools and dependencies from trusted sources.
  • Ensure runner logs are captured and forwarded to your security information and event management (SIEM) of choice.
  • The runner requires internet connectivity to access GitHub. This may require configuring proxy settings on the instance depending on your networking setup.
  • Configure any artifact repositories the runner requires. This includes sources and authentication.
  • Automate the creation of the AMI using tools such as EC2 Image Builder to achieve consistency.
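
Putting the installation step together, a user-data sketch for registering an ephemeral runner at boot might look like the following. The runner directory, repository URL, and the way RUNNER_TOKEN is obtained (for example, from AWS Secrets Manager) are all assumptions:

#!/bin/bash
# Register this instance as an ephemeral runner and start listening for a single job.
cd /opt/actions-runner
./config.sh --url https://github.com/my-org/my-repo \
    --token "${RUNNER_TOKEN}" \
    --ephemeral --unattended
./run.sh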

Use Spot instances to save costs

The cost associated with scaling up the runners, as well as maintaining a warm pool, can be minimized using Spot Instances, which can result in savings of up to 90% compared to On-Demand prices. However, there may be longer-running builds or batch jobs that cannot tolerate a Spot Instance terminating on a 2-minute notice. A mixed pool of instances is a good option here: such jobs are routed to On-Demand EC2 instances, and the rest run on Spot Instances to cater for diverse build needs. This can be done by assigning labels to the runner during launch/registration. For the On-Demand instances, a Savings Plan can be put in place to get cost benefits.

Record runner metrics using Amazon CloudWatch for Observability

It is vital for the observability of the overall platform to generate metrics for the EC2-based GitHub self-hosted runners. Examples of GitHub runner metrics include the number of GitHub workflow events queued or completed per minute, or the number of EC2 runners up and available in the warm pool.

We can log the triggered workflow events and runner logs in Amazon CloudWatch, and then use CloudWatch embedded metrics to collect metrics such as the number of workflow events queued, in progress, and completed. Using elements such as the “started_at” and “completed_at” timestamps that are part of the workflow event payload, we can calculate build wait time.

As an example, below is a sample incoming GitHub workflow event logged in Amazon CloudWatch Logs:

{
  "hostname": "xxx.xxx.xxx.xxx",
  "requestId": "aafddsd55-fgcf555",
  "date": "2022-10-11T05:50:35.816Z",
  "logLevel": "info",
  "logLevelId": 3,
  "filePath": "index.js",
  "fullFilePath": "/var/task/index.js",
  "fileName": "index.js",
  "lineNumber": 83889,
  "columnNumber": 12,
  "isConstructor": false,
  "functionName": "handle",
  "argumentsArray": [
    "Processing Github event",
    "{\"event\":\"workflow_job\",\"repository\":\"testorg-poc/github-actions-test-repo\",\"action\":\"queued\",\"name\":\"jobname-buildanddeploy\",\"status\":\"queued\",\"started_at\":\"2022-10-11T05:50:33Z\",\"completed_at\":null,\"conclusion\":null}"
  ]
}

To turn elements of this log, such as "status":"queued", the repository ("testorg-poc/github-actions-test-repo"), the job name ("jobname-buildanddeploy"), and the workflow event, into metrics, you can build embedded metrics in the application code, or in a metrics Lambda function, using one of the CloudWatch embedded metric format client libraries (see Creating logs in embedded metric format using the client libraries in the Amazon CloudWatch documentation) for the language of your choice.

Essentially, what one of those libraries does under the hood is map elements from the log event into dimension fields so that CloudWatch can read them and generate a metric.

console.log(
  JSON.stringify({
    message: '[Embedded Metric]', // Identifier for metric logs in CW logs
    build_event_metric: 1, // Metric name and value
    status: `${status}`, // Dimension name and value
    eventName: `${eventName}`,
    repository: `${repository}`,
    name: `${name}`,

    _aws: {
      Timestamp: Date.now(),
      CloudWatchMetrics: [
        {
          Namespace: `demo_2`,
          Dimensions: [['status', 'eventName', 'repository', 'name']],
          Metrics: [
            {
              Name: 'build_event_metric',
              Unit: 'Count',
            },
          ],
        },
      ],
    },
  })
);

A sample architecture:

Consumption of GitHub webhook events


The CloudWatch metrics can be published to your dashboards or forwarded to any external tool based on your requirements. Once the metrics are in place, CloudWatch alarms and notifications can be configured to manage pool exhaustion.
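
As an illustration, an alarm on the queued-job metric from the earlier embedded-metric example might be created as follows; the threshold, period, and SNS topic ARN are placeholder assumptions:

aws cloudwatch put-metric-alarm \
    --alarm-name gh-runner-queue-depth \
    --namespace demo_2 \
    --metric-name build_event_metric \
    --dimensions Name=status,Value=queued Name=eventName,Value=workflow_job \
        Name=repository,Value=testorg-poc/github-actions-test-repo \
        Name=name,Value=jobname-buildanddeploy \
    --statistic Sum \
    --period 60 \
    --evaluation-periods 3 \
    --threshold 10 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions arn:aws:sns:us-east-1:111122223333:runner-pool-alerts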

Conclusion

In this blog post, we outlined several best practices covering security, scalability, and cost efficiency when using GitHub Actions with EC2 self-hosted runners on AWS. We covered how using short-lived credentials combined with ephemeral runners reduces security and build-contamination risks. We also showed how runners can be optimized for faster startup and job execution using optimized AMIs and warm EC2 pools. Last but not least, cost efficiency can be maximized by using Spot Instances for runners in the right scenarios.


AWS Weekly Roundup: Amazon EC2 U7i Instances, Bedrock Converse API, AWS World IPv6 Day and more (June 3, 2024)

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-ec2-u7i-instances-bedrock-converse-api-aws-world-ipv6-day-and-more-june-3-2024/

Life is not always happy, there are difficult times. However, we can share our joys and sufferings with those we work with. The AWS Community is no exception.

Jeff Barr introduced two members of the AWS community who are dealing with health issues. Farouq Mousa is an AWS Community Builder and fighting brain cancer. Allen Helton is an AWS Serverless Hero and his young daughter is fighting leukemia.

Please donate to support Farouq, and Olivia, Allen’s daughter, as they fight to overcome their diseases.

Last week’s launches
Here are some launches that got my attention:

Amazon EC2 high memory U7i Instances – These instances with up to 32 TiB of DDR5 memory and 896 vCPUs are powered by custom fourth generation Intel Xeon Scalable Processors (Sapphire Rapids). These high memory instances are designed to support large, in-memory databases including SAP HANA, Oracle, and SQL Server. To learn more, visit Jeff’s blog post.

New Amazon Connect analytics data lake – You can use a single source for contact center data including contact records, agent performance, Contact Lens insights, and more — eliminating the need to build and maintain complex data pipelines. Your organization can create your own custom reports using Amazon Connect data or combine data queried from third-party sources. To learn more, visit Donnie’s blog post.

Amazon Bedrock Converse API – This API provides developers a consistent way to invoke Amazon Bedrock models removing the complexity to adjust for model-specific differences such as inference parameters. With this API, you can write a code once and use it seamlessly with different models in Amazon Bedrock. To learn more, visit Dennis’s blog post to get started.

New Document widget for PartyRock – You can build, use, and share generative AI-powered apps for fun and for boosting personal productivity, using PartyRock. Its widgets display content, accept input, connect with other widgets, and generate outputs like text, images, and chats using foundation models. You can now use new document widget to integrate text content from files and documents directly into a PartyRock app.

30 days of alarm history in Amazon CloudWatch – You can view the history of your alarm state changes for up to 30 days prior. Previously, CloudWatch provided 2 weeks of alarm history. This extended history makes it easier to observe past behavior and review incidents over a longer period of time. To learn more, visit the CloudWatch alarms documentation section.

10x faster startup time in Amazon SageMaker Canvas – You can launch SageMaker Canvas in less than a minute and get started with your visual, no-code interface for machine learning 10x faster than before. Now, all new user profiles created in existing or new SageMaker domains can experience this accelerated startup time.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional news items and a Twitch show that you might find interesting:

Let us manage your relational database! – Jeff Barr ran a poll to better understand why some AWS customers still choose to host their own databases in the cloud. Working backwards, he highlights four issues that AWS managed database services address. Consider these before hosting your own database.

Amazon Bedrock Serverless Prompt Chaining – This repository provides examples of using AWS Step Functions and Amazon Bedrock to build complex, serverless, and highly scalable generative AI applications with prompt chaining.

AWS Merch Store Spring Sale – Do you want to buy AWS branded t-shirts, hats, bags, and so on? Get 15% off on all items now through June 7th.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS World IPv6 Day — Join us for a free in-person celebration event on June 6, featuring technical presentations from AWS experts plus a workshop and whiteboarding session. You will learn how to get started with IPv6 and hear from customers who have started on the journey of IPv6 adoption. Check out your nearest city: San Francisco, Seattle, New York, London, Mumbai, Bangkok, Singapore, Kuala Lumpur, Beijing, Manila, and Sydney.

AWS Summits — Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Stockholm (June 4), Madrid (June 5), and Washington, DC (June 26–27).

AWS re:Inforce — Join us for AWS re:Inforce (June 10–12) in Philadelphia, PA. AWS re:Inforce is a learning conference focused on AWS security solutions, cloud security, compliance, and identity. Connect with the AWS teams that build the security tools and meet AWS customers to learn about their security journeys.

AWS Community Days — Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Midwest | Columbus (June 13), Sri Lanka (June 27), Cameroon (July 13), New Zealand (August 15), Nigeria (August 24), and New York (August 28).

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Channy

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Implementing network traffic inspection on AWS Outposts rack

Post Syndicated from Macey Neff original https://aws.amazon.com/blogs/compute/implementing-network-traffic-inspection-on-aws-outposts-rack/

This blog post is written by Brian Daugherty, Principal Solutions Architect; Enrico Liguori, Solutions Architect, Networking; and Sedji Gaouaou, Senior Solutions Architect, Hybrid Cloud.

Network traffic inspection on AWS Outposts rack is a crucial aspect of ensuring security and compliance within your on-premises environment. With network traffic inspection, you can gain visibility into the data flowing in and out of your Outposts rack environment, enabling you to detect and mitigate potential threats proactively.

By deploying AWS partner solutions on Outposts rack, you can take advantage of their expertise and specialized capabilities to gain insights into network traffic patterns, identify and mitigate threats, and help ensure compliance with industry-specific regulations and standards. This includes advanced network traffic inspection capabilities, such as deep packet inspection, intrusion detection and prevention, application-level firewalling, and advanced threat detection.

This post presents an example architecture of deploying a firewall appliance on an Outposts rack to perform on-premises to Virtual Private Cloud (VPC) and VPC-to-VPC inline traffic inspection.

Architecture

The example traffic inspection architecture illustrated in the following diagram is built using a common Outposts rack deployment pattern.

In this example, an Outpost rack is deployed on premises to support:

  • Manufacturing/operational technologies (OT) applications that need low latency between OT servers and devices
  • Information technology (IT) applications that are subject to strict data residency and data protection policies

Separate VPCs, that can be owned by different AWS accounts, and subnets are created for the IT and OT departments’ instances (see 1 and 2 in the diagram).

Organizational security policies require that traffic flowing to and from the Outpost and the site, and between VPCs on the Outpost, be inspected, controlled, and logged using a centralized firewall.

In an AWS Region it is possible to implement a centralized traffic inspection architecture using routing services such as AWS Transit Gateways (TGW) or Gateway Load Balancers (GWLB) to route traffic to a central firewall, but these services are not available on Outposts.

On Outposts, some use the Local Gateway (LGW) to implement a distributed traffic inspection architecture with firewalls deployed in each VPC, but this can be operationally complex and cost prohibitive.

In this post, you will learn how to use a recently introduced feature – Multi-VPC Elastic Network Interface (ENI) Attachments – to create a centralized traffic inspection architecture on Outposts. Using Multi-VPC Attached ENIs you can attach ENIs created in subnets that are owned and managed by other VPCs (even VPCs in different accounts) to an Amazon Elastic Compute Cloud (EC2) instance.

Specifically, you can create ENIs in the IT and OT subnets that can be shared with a centralized firewall (see 3 and 4).

Because it’s a best practice to minimize the attack surface of a centralized firewall through isolation, the example includes a VPC and subnet created solely for the firewall instance (see 5).

To protect traffic flowing to and from the IT, OT, and firewall VPCs and on-site networks, another ‘Exposed’ VPC, subnet (see 6), and ENI (see 7) are created. These are the only resources associated with the Outposts Local Gateway (LGW) and ‘exposed’ to on-site networks.

In the example, traffic is routed from the IT and OT VPCs using a default route that points to the ENI used by the firewall (see 8 and 9). The firewall can route traffic back to the IT and OT VPCs, as allowed by policy, through its directly connected interfaces.

The firewall uses a route for the on-site network (192.168.30.0/24) – or a default route – pointing to the gateway associated with the exposed ENI (eni11, 172.16.2.1 – see 10).

To complete the routing between the IT, OT, and firewall VPCs and the on-site networks, static routes are added to the LGW route table pointing to the firewall’s exposed ENI as the next hop (see 11).

Once these static routes are inserted, the Outposts Ingress Routing feature will trigger the routes to be advertised toward the on-site layer-3 switch using BGP.

Likewise, the on-site layer-3 switch will advertise a route (see 12) for 192.168.30.0/24 (or a default route) over BGP to the LGW, completing end-to-end routing between on-site networks and the IT and OT VPCs through the centralized firewall.

The following diagram shows an example of packet flow between an on-site OT device and the OT server, inspected by the firewall:

Implementation on AWS Outposts rack

The following implementation details are essential for our example traffic inspection on the Outposts rack architecture.

Prerequisites

The following prerequisites are required:

  • Deployment of an Outpost on premises;
  • Creation of four VPCs – Exposed, firewall, IT, and OT;
  • Creation of private subnets in each of the four VPCs where ENIs and instances can be created;
  • Creation of ENIs in each of the four private subnets for attachment to the firewall instance (keep track of the ENI IDs);
  • If needed, sharing the subnets and ENIs with the firewall account, using AWS Resource Access Manager (AWS RAM);
  • Association of the Exposed VPC to the LGW.

Firewall selection and sizing

Although in this post a basic Linux instance is deployed and configured as the firewall, in the Network Security section of the AWS Marketplace, you can find several sophisticated, powerful, and manageable AWS Partner solutions that perform deep packet inspection.

Most network security marketplace offerings provide guidance on capabilities and expected performance and pricing for specific appliance instance sizes.

Firewall instance selection

Currently, an Outpost rack can be configured with EC2 instances in the M5, C5, R5, and G4dn families. As a user, you can select the size and number of instances available on an Outpost to match your requirements.

When selecting an EC2 instance for use as a centralized firewall it is important to consider the following:

  • Performance recommendations for instance types and sizes made by the firewall appliance partner;
  • The number of VPCs that are inspected by the firewall appliance;
  • The availability of instances on the Outpost.

For example, after evaluating the partner recommendations you may determine that an instance size of c5.large, r5.large, or larger provide the required performance.

Next, you can use the following AWS Command Line Interface (AWS CLI) command to identify the EC2 instances configured on an Outpost:

aws outposts get-outpost-instance-types \
--outpost-id op-abcdefgh123456789

The output of this command lists the instance types and sizes configured on your Outpost:

InstanceTypes:
- InstanceType: c5.xlarge
- InstanceType: c5.4xlarge
- InstanceType: r5.2xlarge
- InstanceType: r5.4xlarge

With knowledge of the instance types and sizes installed on your Outpost, you can now determine if any of these are available. The following AWS CLI command – one for each of the preceding instance types – lists the number of each instance type and size available for use. For example:

aws cloudwatch get-metric-statistics \
--namespace AWS/Outposts \
--metric-name AvailableInstanceType_Count \
--statistics Average --period 3600 \
--start-time $(date -u -Iminutes -d '-1hour') \
--end-time $(date -u -Iminutes) \
--dimensions \
Name=OutpostId,Value=op-abcdefgh123456789 \
Name=InstanceType,Value=c5.xlarge

This command returns:

Datapoints:
- Average: 2.0
  Timestamp: '2024-04-10T10:39:00+00:00'
  Unit: Count
Label: AvailableInstanceType_Count

The output indicates that there are (on average) two c5.xlarge instances available on this Outpost in the specified time period (1 hour). The same steps for the other instance types suggest that there are also two c5.4xlarge and two r5.2xlarge instances available, and no r5.4xlarge.

Next, consider the number of VPCs to be connected to the firewall and determine if the instances available support the required number of ENIs.

The firewall requires an ENI in its own VPC, one in the Exposed VPC, and one for each additional VPC. In this post, because there is a VPC for IT and one for OT, you need an EC2 instance that supports four interfaces in total.

To determine the number of supported interfaces for each available instance type and size, let’s use the AWS CLI:

aws ec2 describe-instance-types \
--instance-types c5.xlarge c5.4xlarge r5.2xlarge \
--query 'InstanceTypes[].[InstanceType,NetworkInfo.NetworkCards]'

This returns:

- - r5.2xlarge
  - - BaselineBandwidthInGbps: 2.5
      MaximumNetworkInterfaces: 4
      NetworkCardIndex: 0
      NetworkPerformance: Up to 10 Gigabit
      PeakBandwidthInGbps: 10.0
- - c5.xlarge
  - - BaselineBandwidthInGbps: 1.25
      MaximumNetworkInterfaces: 4
      NetworkCardIndex: 0
      NetworkPerformance: Up to 10 Gigabit
      PeakBandwidthInGbps: 10.0
- - c5.4xlarge
  - - BaselineBandwidthInGbps: 5.0
      MaximumNetworkInterfaces: 8
      NetworkCardIndex: 0
      NetworkPerformance: Up to 10 Gigabit
      PeakBandwidthInGbps: 10.0

The output suggests that the three available EC2 instances (r5.2xlarge, c5.xlarge and c5.4xlarge) can support four network interfaces. The output also suggests that the c5.4xlarge instance, for example, supports up to 8 network interfaces and a maximum bandwidth of 10Gb/s. This helps you plan for the potential growth in network requirements.

Attaching remote ENIs to the firewall instance

With the firewall instance deployed in the firewall VPC, the next step is to attach the remote ENIs created previously in the Exposed, OT, and IT subnets. Using the firewall instance ID and the network interface IDs of the remote ENIs, you can create the multi-VPC attached ENIs that connect the firewall to the other VPCs. Each attached interface needs a unique device index greater than 0, since device index 0 is the primary instance interface.

For example, to connect the Exposed VPC ENI:

aws ec2 attach-network-interface --device-index 1 \
--instance-id i-0e47e6eb9873d1234 \
--network-interface-id eni-012a3b4cd5efghijk \
--region us-west-2

Attach the OT and IT ENIs while incrementing the device-index and using the respective unique ENI IDs:

aws ec2 attach-network-interface --device-index 2 \
--instance-id i-0e47e6eb9873d1234 \
--network-interface-id eni-0bbe1543fb0bdabff \
--region us-west-2
aws ec2 attach-network-interface --device-index 3 \
--instance-id i-0e47e6eb9873d1234 \
--network-interface-id eni-0bbe1a123b0bdabde \
--region us-west-2

After attaching each remote ENI, the firewall instance now has an interface and IP address in each VPC used in this example architecture:

ubuntu@firewall:~$ ip address

ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    inet 10.240.4.10/24 metric 100 brd 10.240.4.255 scope global dynamic ens5

ens6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    inet 10.242.0.50/24 metric 100 brd 10.242.0.255 scope global dynamic ens6

ens7: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    inet 10.244.76.51/16 metric 100 brd 10.244.255.255 scope global dynamic ens7

ens11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc mq state UP group default qlen 1000
    inet 172.16.2.7/24 metric 100 brd 172.16.2.255 scope global dynamic ens11
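
For the basic Linux instance acting as the firewall in this example, two more settings are required for it to forward packets at all: IP forwarding in the OS, and disabling source/destination checking on the attached ENIs. A sketch follows; the ENI ID is a placeholder, and the second command must be repeated for each attached interface:

# On the firewall instance: allow the kernel to forward packets between interfaces
sudo sysctl -w net.ipv4.ip_forward=1

# From a host with AWS credentials: disable source/destination checking on an ENI
aws ec2 modify-network-interface-attribute \
    --network-interface-id eni-012a3b4cd5efghijk \
    --no-source-dest-check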

Updating the VPC/subnet route tables

You can now add the routes needed to allow traffic to be inspected to flow through the firewall.

For example, the OT subnet (10.242.0.0/24) uses a route table with the ID rtb-abcdefgh123456789. To send traffic through the firewall, you need to add a default route with the target being the ENI (eni-07957a9f294fdbf5d) that is now attached to the firewall:

aws ec2 create-route --route-table-id rtb-abcdefgh123456789 \
--destination-cidr-block 0.0.0.0/0 \
--network-interface-id eni-07957a9f294fdbf5d

You can follow the same process to add a default route to the IT VPC/subnet.

With routing established from the IT and OT VPCs to the firewall, you need to make sure that the firewall uses the Exposed VPC to route traffic toward the on-premises network 192.168.30.0/24. This is done by adding a route within the firewall OS using the VPC gateway as a next hop.

The ENI attached to the firewall from the Exposed VPC is in subnet 172.16.2.0/24, and the gateway used by this subnet is, by Amazon Virtual Private Cloud (VPC) convention, the first address in the subnet, 172.16.2.1. This is used when updating the firewall OS route table:

sudo ip route add 192.168.30.0/24 via 172.16.2.1

You can now confirm that the firewall OS has routes to each attached subnet and to the on-premises subnet:

ubuntu@firewall:~$ ip route
default via 10.240.4.1 dev ens5 proto dhcp src 10.240.4.10 metric 100
10.240.0.2 via 10.240.4.1 dev ens5 proto dhcp src 10.240.4.10 metric 100
10.240.4.0/24 dev ens5 proto kernel scope link src 10.240.4.10 metric 100
10.240.4.1 dev ens5 proto dhcp scope link src 10.240.4.10 metric 100
10.242.0.0/24 dev ens6 proto kernel scope link src 10.242.0.50 metric 100
10.242.0.2 dev ens6 proto dhcp scope link src 10.242.0.50 metric 100
10.244.0.0/16 dev ens7 proto kernel scope link src 10.244.76.51 metric 100
10.244.0.2 dev ens7 proto dhcp scope link src 10.244.76.51 metric 100
172.16.2.0/24 dev ens11 proto kernel scope link src 172.16.2.7 metric 100
172.16.2.2 dev ens11 proto dhcp scope link src 172.16.2.7 metric 100
192.168.30.0/24 via 172.16.2.1 dev ens11

The final step in establishing end-to-end routing is to make sure that the LGW route table contains static routes for the firewall, IT, and OT VPCs. These routes target the ENIs used by the firewall in the Exposed VPC.

After gathering the LGW route table ID and the firewall’s Exposed ENI ID, you can now add routes toward the firewall VPC:

aws ec2 create-local-gateway-route \
    --local-gateway-route-table-id lgw-rtb-abcdefgh123456789 \
    --network-interface-id eni-0a2e4f68f323022c3 \
    --destination-cidr-block 10.240.0.0/16

Repeat this command for the OT and IT VPC CIDRs – 10.242.0.0/16 and 10.244.0.0/16, respectively.

You can query the LGW route table to make sure that each of the static routes was inserted:

aws ec2 search-local-gateway-routes \
    --local-gateway-route-table-id lgw-rtb-abcdefgh123456789 \
    --filters "Name=type,Values=static"

This returns:

Routes:

- DestinationCidrBlock: 10.240.0.0/16
  LocalGatewayRouteTableId: lgw-rtb-abcdefgh123456789
  NetworkInterfaceId: eni-0a2e4f68f323022c3
  State: active
  Type: static

- DestinationCidrBlock: 10.242.0.0/16
  LocalGatewayRouteTableId: lgw-rtb-abcdefgh123456789
  NetworkInterfaceId: eni-0a2e4f68f323022c3
  State: active
  Type: static

- DestinationCidrBlock: 10.244.0.0/16
  LocalGatewayRouteTableId: lgw-rtb-abcdefgh123456789
  NetworkInterfaceId: eni-0a2e4f68f323022c3
  State: active
  Type: static

With the addition of these static routes the LGW begins to advertise reachability to the firewall, OT, and IT Classless Inter-Domain Routing (CIDR) blocks over the BGP neighborship. The CIDR for the Exposed VPC is already advertised because it is associated directly to the LGW.

The firewall now has full visibility of the traffic and can apply the monitoring, inspection, and security profiles defined by your organization.

Other considerations

  • It is important to follow the best practices specified by the Firewall Appliance Partner to fully secure the appliance. In the example architecture, access to the firewall console is restricted to AWS Session Manager.
  • The commands used previously to create/update the Outpost/LGW route tables need an account with full privileges to administer the Outpost.

Fault tolerance

As a crucial component of the infrastructure, the firewall instance needs a mechanism for automatic recovery from failures. One effective approach is to deploy the firewall instances within an Auto Scaling group, which can automatically replace unhealthy instances with new, healthy ones. In addition, using a host- or rack-level spread placement group ensures that your instances are deployed on distinct underlying hardware. This enables high availability and minimizes downtime. Furthermore, this Auto Scaling-based approach can be implemented regardless of the specific third-party product used.

To ensure a seamless transition when Auto Scaling replaces an unhealthy firewall instance, it is essential that the multi-VPC ENIs responsible for receiving and forwarding traffic are automatically attached to the new instance. Re-using the same multi-VPC ENIs ensures that no changes are required in the subnet and LGW route tables.

You can re-attach the same multi-VPC ENIs to the new instance using Auto Scaling lifecycle hooks, which let you pause the instance replacement process and perform custom actions.
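
A sketch of how such a hook could be wired up with the AWS CLI follows; the Auto Scaling group and hook names are hypothetical, and the ENI re-attachment itself would run in the hook handler (for example, a Lambda function):

aws autoscaling put-lifecycle-hook \
    --lifecycle-hook-name attach-firewall-enis \
    --auto-scaling-group-name firewall-asg \
    --lifecycle-transition autoscaling:EC2_INSTANCE_LAUNCHING \
    --heartbeat-timeout 300

# After the handler has re-attached the multi-VPC ENIs to the new instance:
aws autoscaling complete-lifecycle-action \
    --lifecycle-hook-name attach-firewall-enis \
    --auto-scaling-group-name firewall-asg \
    --instance-id i-0123456789abcdef0 \
    --lifecycle-action-result CONTINUE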

After re-attaching the multi-VPC ENIs to the instance, the last step is to restore the configuration of the firewall from a backup.

Conclusion

In this post, you have learned how to implement on-premises to VPC and VPC-to-VPC inline traffic inspection on Outposts rack with a centralized firewall deployment. This architecture requires a VPC for the firewall instance itself, an Exposed VPC connecting to your on-premises network, and one or more VPCs for your workloads running on the Outpost. You can either use a basic Linux instance as a router, or choose from the advanced AWS Partner solutions in the Network Security section of the AWS Marketplace and follow the respective guidance on firewall instance selection. With multi-VPC ENI attachments, you can create network traffic routing between VPCs and forward traffic to the centralized firewall for inspection. In addition, you can use Auto Scaling groups, spread placement groups, and Auto Scaling lifecycle hooks to enable high availability and fault tolerance for your firewall instance.

If you want to learn more about network security on AWS, visit: Network Security on AWS.

Amazon EC2 high memory U7i Instances for large in-memory databases

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-ec2-high-memory-u7i-instances-for-large-in-memory-databases/

Announced in preview form at re:Invent 2023, Amazon Elastic Compute Cloud (Amazon EC2) U7i instances with up to 32 TiB of DDR5 memory and 896 vCPUs are now available. Powered by custom fourth-generation Intel Xeon Scalable processors (Sapphire Rapids), these high memory instances are designed to support large in-memory databases, including SAP HANA, Oracle, and SQL Server. Here are the specs:

Instance Name         vCPUs  Memory (DDR5)  EBS Bandwidth  Network Bandwidth
u7i-12tb.224xlarge    896    12,288 GiB     60 Gbps        100 Gbps
u7in-16tb.224xlarge   896    16,384 GiB     100 Gbps       200 Gbps
u7in-24tb.224xlarge   896    24,576 GiB     100 Gbps       200 Gbps
u7in-32tb.224xlarge   896    32,768 GiB     100 Gbps       200 Gbps

The new instances deliver the best compute price performance for large in-memory workloads, and offer the highest memory and compute power of any SAP-certified virtual instance from a leading cloud provider.

Thanks to the AWS Nitro System, all of the memory on the instance is available for use; none of it is reserved for a hypervisor. For example, the operating system on the u7in-32tb.224xlarge instance sees the full 32 TiB.

In comparison to the previous generation of EC2 High Memory instances, the U7i instances offer more than 135% of the compute performance, up to 115% more memory performance, and 2.5x the EBS bandwidth. This increased bandwidth allows you to transfer 30 TiB of data from EBS into memory in an hour or less, making data loads and cache refreshes faster than ever before. The instances also support ENA Express with 25 Gbps of bandwidth per flow, and provide an 85% improvement in P99.9 latency between instances.
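
As a rough check on that figure, assuming the full 100 Gbps of EBS bandwidth is sustained: 100 Gbps is 12.5 GB/s, and 30 TiB is about 33,000 GB, so the transfer takes roughly 33,000 / 12.5 ≈ 2,640 seconds, or about 44 minutes.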

Each U7i instance supports attachment of up to 128 General Purpose (gp2 and gp3) or Provisioned IOPS (io1 and io2 Block Express) EBS volumes. Each io2 Block Express volume can be as big as 64 TiB and can deliver up to 256K IOPS at up to 32 Gbps, making them a great match for U7i instances.
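
As an illustrative sketch (the Availability Zone and the volume and instance IDs are hypothetical examples, while the size and IOPS values are the maximums mentioned above), such a volume could be created and attached with the AWS CLI:

  aws ec2 create-volume \
      --availability-zone us-east-1a \
      --volume-type io2 \
      --size 65536 \
      --iops 256000

  aws ec2 attach-volume \
      --volume-id vol-0123456789abcdef0 \
      --instance-id i-0123456789abcdef0 \
      --device /dev/sdf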

The instances are SAP certified to run Business Suite on HANA, Business Suite S/4HANA, Business Warehouse on HANA (BW), and SAP BW/4HANA in production environments. To learn more, consult the Certified and Supported SAP HANA Hardware and the SAP HANA to AWS Migration Guide. Also, be sure to take a look at the AWS Launch Wizard for SAP.

Things to Know
Here are a couple of things that you should know about these new instances:

Regions – U7i instances are available in the US East (N. Virginia), US West (Oregon), and Asia Pacific (Seoul, Sydney) AWS Regions.

Operating Systems – Supported operating systems include Amazon Linux, Red Hat Enterprise Linux, SUSE Linux Enterprise Server, Ubuntu, and Windows Server.

Larger Instances – We are also working on offering even larger instances later this year, with increased compute, to meet our customers' needs.

Jeff;

AWS Weekly Roundup – Application Load Balancer IPv6, Amazon S3 pricing update, Amazon EC2 Flex instances, and more (May 20, 2024)

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-application-load-balancer-ipv6-amazon-s3-pricing-update-amazon-ec2-flex-instances-and-more-may-20-2024/

AWS Summit season is in full swing around the world, with last week's events in Bengaluru, Berlin, and Seoul, where my blog colleague Channy delivered one of the keynotes.

AWS Summit Seoul Keynote

Last week’s launches
Here are some launches that got my attention:

Amazon S3 will no longer charge for several HTTP error codes – A customer reported that he was charged for Amazon S3 API requests he didn't initiate, which resulted in AccessDenied errors. The Amazon Simple Storage Service (Amazon S3) service team updated the service so that such API requests are no longer charged. As always when talking about pricing, the exact wording is important, so please read the What's New post for the details.

Introducing Amazon EC2 C7i-flex instances – These instances deliver up to 19 percent better price performance compared to C6i instances. Using C7i-flex instances is the easiest way to get price performance benefits for a majority of compute-intensive workloads. The new instances are powered by the 4th generation Intel Xeon Scalable custom processors (Sapphire Rapids) that are available only on AWS, and they offer 5 percent lower prices compared to C7i.

Application Load Balancer launches IPv6-only support for internet clients – Application Load Balancer now allows customers to provision load balancers without IPv4 addresses for clients that can connect using just IPv6. To connect, clients resolve the AAAA DNS records assigned to the Application Load Balancer. The load balancer remains dual stack for communication between the load balancer and targets. With this new capability, you have the flexibility to use IPv4 or IPv6 for your application targets while avoiding IPv4 charges for clients that don't require it.
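
As a quick illustration (the hostname below is a hypothetical example), a client can confirm that IPv6 addresses are published by querying the load balancer's AAAA record:

  dig AAAA my-alb-1234567890.us-east-1.elb.amazonaws.com +short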

Amazon VPC Lattice now supports TLS Passthrough – We announced the general availability of TLS passthrough for Amazon VPC Lattice, which allows customers to enable end-to-end authentication and encryption using their existing TLS or mTLS implementations. Prior to this launch, VPC Lattice supported only HTTP and HTTPS listener protocols, which terminate TLS and perform request-level routing and load balancing based on information in HTTP headers.

Amazon DocumentDB zero-ETL integration with Amazon OpenSearch Service – This new integration provides you with advanced search capabilities, such as fuzzy search, cross-collection search, and multilingual search, on your Amazon DocumentDB (with MongoDB compatibility) documents using the OpenSearch API. With a few clicks in the AWS Management Console, you can now synchronize your data from Amazon DocumentDB to Amazon OpenSearch Service, eliminating the need to write any custom code to extract, transform, and load the data.

Amazon EventBridge now supports customer managed keys (CMK) for event buses – This capability allows you to encrypt your events using your own keys instead of an AWS owned key (which is used by default). With support for CMK, you now have more fine-grained security control over your events, satisfying your company’s security requirements and governance policies.
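
As a minimal sketch, a new event bus could be configured with a customer managed key at creation time. The bus name and key ARN here are hypothetical, and this assumes an AWS CLI version recent enough to include the kms-key-identifier parameter:

  aws events create-event-bus \
      --name my-encrypted-bus \
      --kms-key-identifier arn:aws:kms:us-east-1:123456789012:key/1234abcd-12ab-34cd-56ef-1234567890ab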

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional news items, open source projects, and Twitch shows that you might find interesting:

The Four Pillars of Managing Email Reputation – Dustin Taylor is the manager of anti-abuse and email deliverability for Amazon Simple Email Service (SES). He wrote a remarkable post exploring the Amazon SES approach to managing domain and IP reputation. Maintaining a high reputation helps ensure your messages reach recipients' inboxes. His post outlines how Amazon SES protects its network reputation to help you deliver high-quality email consistently. A worthy read, even if you're not sending email at scale. I learned a lot.

Build On Generative AI – Season 3 of your favorite weekly Twitch show about all things generative artificial intelligence (AI) is in full swing! Streaming every Monday at 9:00 AM US PT, my colleagues Tiffany and Darko discuss different aspects of generative AI and invite guest speakers to demo their work.

AWS open source news and updates – My colleague Ricardo writes this weekly open source newsletter, in which he highlights new open source projects, tools, and demos from the AWS Community.

Upcoming AWS events

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Hong Kong (May 22), Milan (May 23), Stockholm (June 4), and Madrid (June 5).

AWS re:Inforce – Explore 2.5 days of immersive cloud security learning in the age of generative AI at AWS re:Inforce, June 10–12 in Pennsylvania.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Midwest | Columbus (June 13), Sri Lanka (June 27), Cameroon (July 13), Nigeria (August 24), and New York (August 28).

Browse all upcoming AWS-led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— seb

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!