Tag Archives: Amazon EC2

New Amazon EC2 Graviton4-based instances with NVMe SSD storage

Post Syndicated from Micah Walter original https://aws.amazon.com/blogs/aws/new-amazon-ec2-graviton4-based-instances-with-nvme-ssd-storage/

Since the launch of AWS Graviton processors in 2018, we have continued to innovate and deliver improved performance for our customers’ cloud workloads. Following the success of our Graviton3-based instances, we are excited to announce three new Amazon Elastic Compute Cloud (Amazon EC2) instance families powered by AWS Graviton4 processors with NVMe-based SSD local storage: compute optimized (C8gd), general purpose (M8gd), and memory optimized (R8gd) instances. These instances deliver up to 30% better compute performance, 40% higher performance for I/O intensive database workloads, and up to 20% faster query results for I/O intensive real-time data analytics than comparable AWS Graviton3-based instances.

Let’s look at some of the improvements that are now available in our new instances. These instances offer larger instance sizes with up to 3x more vCPUs (up to 192 vCPUs), 3x the memory (up to 1.5 TiB), 3x the local storage (up to 11.4 TB of NVMe SSD storage), 75% higher memory bandwidth, and 2x more L2 cache compared to their Graviton3-based predecessors. These features help you process larger amounts of data, scale up your workloads, improve time to results, and lower your total cost of ownership (TCO). These instances also offer up to 50 Gbps network bandwidth and up to 40 Gbps Amazon Elastic Block Store (Amazon EBS) bandwidth, a significant improvement over Graviton3-based instances. Additionally, you can now adjust the network and Amazon EBS bandwidth on these instances by up to 25% using EC2 instance bandwidth weighting configuration, giving you greater flexibility in allocating bandwidth resources to better optimize your workloads.

Built on AWS Graviton4, these instances are great for storage intensive Linux-based workloads including containerized and micro-services-based applications built using Amazon Elastic Kubernetes Service (Amazon EKS), Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Container Registry (Amazon ECR), Kubernetes, and Docker, as well as applications written in popular programming languages such as C/C++, Rust, Go, Java, Python, .NET Core, Node.js, Ruby, and PHP. AWS Graviton4 processors are up to 30% faster for web applications, 40% faster for databases, and 45% faster for large Java applications than AWS Graviton3 processors.

Instance specifications

These instances also offer two bare metal sizes (metal-24xl and metal-48xl), allowing you to right size your instances and deploy workloads that benefit from direct access to physical resources. Additionally, these instances are built on the AWS Nitro System, which offloads CPU virtualization, storage, and networking functions to dedicated hardware and software to enhance the performance and security of your workloads. In addition, Graviton4 processors offer you enhanced security by fully encrypting all high-speed physical hardware interfaces.

The instances are available in 10 sizes per family, as well as two bare metal configurations each:

Instance Name | vCPUs | Memory (GiB) (C/M/R) | Storage (GB) | Network Bandwidth (Gbps) | EBS Bandwidth (Gbps)
medium | 1 | 2/4/8* | 1 x 59 | Up to 12.5 | Up to 10
large | 2 | 4/8/16* | 1 x 118 | Up to 12.5 | Up to 10
xlarge | 4 | 8/16/32* | 1 x 237 | Up to 12.5 | Up to 10
2xlarge | 8 | 16/32/64* | 1 x 474 | Up to 15 | Up to 10
4xlarge | 16 | 32/64/128* | 1 x 950 | Up to 15 | Up to 10
8xlarge | 32 | 64/128/256* | 1 x 1900 | 15 | 10
12xlarge | 48 | 96/192/384* | 3 x 950 | 22.5 | 15
16xlarge | 64 | 128/256/512* | 2 x 1900 | 30 | 20
24xlarge | 96 | 192/384/768* | 3 x 1900 | 40 | 30
48xlarge | 192 | 384/768/1536* | 6 x 1900 | 50 | 40
metal-24xl | 96 | 192/384/768* | 3 x 1900 | 40 | 30
metal-48xl | 192 | 384/768/1536* | 6 x 1900 | 50 | 40

*Memory values are for C8gd/M8gd/R8gd respectively
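
Reading the memory column row by row, each family keeps a fixed ratio of memory to vCPUs: 2 GiB per vCPU for C8gd, 4 for M8gd, and 8 for R8gd. The following Python sketch captures that arithmetic. The dictionary and function names are illustrative only and are not part of any AWS API.

```python
# GiB of memory per vCPU, read off the instance table (C8gd/M8gd/R8gd).
GIB_PER_VCPU = {"c8gd": 2, "m8gd": 4, "r8gd": 8}


def memory_gib(family: str, vcpus: int) -> int:
    """Return the memory in GiB for a family and vCPU count, per the table."""
    return GIB_PER_VCPU[family.lower()] * vcpus
```

For example, memory_gib("r8gd", 192) gives 1536, matching the 48xlarge row.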

Availability and pricing

M8gd, C8gd, and R8gd instances are available today in US East (N. Virginia, Ohio) and US West (Oregon) Regions. These instances can be purchased as On-Demand instances, Savings Plans, Spot instances, or as Dedicated instances or Dedicated hosts.

Get started today

You can launch M8gd, C8gd and R8gd instances today in the supported Regions through the AWS Management Console, AWS Command Line Interface (AWS CLI), or AWS SDKs. To learn more, check out the collection of Graviton resources to help you start migrating your applications to Graviton instance types. You can also visit the Graviton Getting Started Guide to begin your Graviton adoption journey.

— Micah;


Powering generative AI/ML solutions with AWS Outposts Servers at Edge locations

Post Syndicated from Art Baudo original https://aws.amazon.com/blogs/compute/powering-generative-ai-ml-solutions-with-aws-outposts-servers-at-edge-locations/

This post is written by Brian Daugherty, Principal Solutions Architect, Leonardo Queirolo, Senior Cloud Support Engineer, and Reet Kundu, Senior Cloud Support Engineer

Many organizations are vigorously pursuing generative AI initiatives in the Amazon Web Services (AWS) cloud today because generative AI drives advances in productivity, efficiency, and innovation.

However, for some organizations, industries, and use-cases, there is a compelling need to deploy generative AI not only in the cloud, but also at the edge due to factors such as application latency and proximity to critical data.

AWS Outposts can help these organizations address this need by extending AWS services, such as generative AI services, to the edge while maintaining the same tooling and orchestration capabilities found in AWS Regions.

Industrial and manufacturing use-cases are a primary focus of AWS Outposts Servers, which can be deployed on-premises to minimize latency and maintain stable connectivity between orchestration and control applications, such as Manufacturing Execution Systems (MES) or Supervisory Control and Data Acquisition (SCADA) systems, and the industrial processes they control.

This post explores how to use Outposts Servers to power generative AI solutions at the edge. The example use-case demonstrates real-time anomaly detection for industrial processes and an edge-based human machine interface including a small language model (SLM) with Retrieval-Augmented Generation (RAG) to guide operators on best practices for problem resolution. Although the use case is specific, the tools and methods can be applied to many other edge generative AI use cases.

For a hands-on experience to implement this solution using Outposts Servers, fill out this form with your contact information and we will get back to you with lab access. A detailed step-by-step guide to develop the hands-on example is available in this link.

Architecture overview

As depicted in the following diagram, the solution is distributed in three modules. The first module (1) guides you to establish low-latency, local connectivity to an MQTT broker within the same on-premises network as your lab Amazon Elastic Compute Cloud (Amazon EC2) instance. You configure essential AWS infrastructure (Amazon S3, AWS Secrets Manager, AWS Identity and Access Management (IAM)) to manage the deployment, authentication, and permissions of AWS IoT Greengrass components. You deploy a component to the existing Greengrass core device on your lab EC2 instance to retrieve synthetic Arduino sensor data from the broker using its Local Network Interface (LNI).

Figure 1 – Architectural diagram of the solution to perform low-latency, local inference through generative AI and ML models running on Outposts Servers

In the second module (2), you deploy a component that detects anomalies in sensor data in real-time. This component runs on the Outposts Server EC2 instance hosting the AWS IoT Greengrass core device, performing inference directly at the edge. You use synthetic Arduino sensor data to generate anomalies and observe them being detected by the model. You configure an IoT rule to send the anomaly count to the Amazon CloudWatch Dashboard in the Region. This provides centralized monitoring while making sure that the raw data and any sensitive data are processed locally at the edge, where latency and connectivity are assured.

In the third module (3), you deploy a comprehensive edge computing solution to enhance operational visibility and decision-making capabilities at the local level. The solution includes a local dashboard that provides real-time telemetry, displaying raw sensor data and detected anomalies. A virtual assistant is integrated with an SLM to provide context-aware responses based on the factory data, along with a forecasting capability to predict future anomaly trends.

Outposts Server

Outposts Servers provide fully managed AWS infrastructure, services, APIs, and tools for edge use-cases. Two form factors are available: 1U servers are AWS Graviton based, and 2U servers are third-generation Intel Xeon Scalable processor based.

Enabling anomaly detection at the edge

Outposts Servers allow local sensor data processing for low-latency anomaly detection and resilience against external connectivity issues, as shown in the following figure. The example uses synthetic Arduino devices with gyroscope sensor data, simulating industrial sensors that send data to an MQTT broker on an EC2 instance in the Outposts Server. Gyroscope data is used in various monitoring systems, such as motion control, orientation detection, and stability and balance mechanisms. The Lab EC2 instance fetches sensor data through the MQTT client and processes it using a machine learning (ML) model for anomaly detection.

Figure 2 – Architectural diagram showing data flow from Arduino sensors through MQTT broker and EC2 on Outposts Server to perform local inference

Outposts server LNI

Local communication between the synthetic Arduino sensors, the MQTT broker, and the Lab EC2 instance uses the LNI, which provides Layer 2 presence on the local network. The setup requires creating an Elastic Network Interface (ENI) on an Outposts subnet with the LNI enabled, attaching it to the Lab EC2 instance, and verifying connectivity to the MQTT broker’s LNI IP using the command ping -c 5 <MQTT_BROKER_LNI_IP>. This enables direct, low-latency communication between components, which is crucial for this edge computing scenario.
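
A ping only confirms ICMP reachability. As a complementary check, a TCP connection attempt against the MQTT port can confirm that the broker is listening. The following is a minimal Python sketch using only the standard library. It assumes the broker listens on port 1883, the port used in the subscriber configuration in this post, and the function name is illustrative.

```python
import socket


def broker_reachable(host: str, port: int = 1883, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Calling broker_reachable("<MQTT_BROKER_LNI_IP>") from the Lab EC2 instance returns True only when the TCP handshake completes, which a firewall could block even when ping succeeds.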

AWS IoT Greengrass

AWS IoT Greengrass is an open source edge runtime and cloud service for device software management and deployment supported on Outposts Server. This hybrid approach combines the benefits of edge computing with centralized management, such as:

  • Centralized artifact management: store and version component artifacts in Amazon S3, enabling consistent deployment across multiple edge locations.
  • Secure configuration: use Secrets Manager to handle sensitive information and credentials unique to each edge location.
  • Fleet monitoring: use CloudWatch for centralized monitoring and logging across your distributed edge deployment.
  • Automated updates: deploy software updates and model improvements across your edge fleet through AWS IoT Greengrass component management.

AWS IoT Greengrass components, such as the one used for the anomaly detection, can be deployed to EC2 instances running on Outposts Servers. After configuring the Lab EC2 instance with Greengrass, you can download components from an S3 bucket. The first component deploys a subscriber that receives synthetic Arduino sensor data through the MQTT broker, as shown in the following configuration.

{
    "broker": "<MQTT_BROKER_LNI_IP>",
    "port": 1883,
    "client_id": "OutpostsServerMLEdge_<workshop-id>",
    "sensor_name": "ArduinoSensor_<arduino-id>",
    "topic": "arduino/ArduinoSensor_<arduino-id>/3-axis-rotation",
    "thing_name": "OutpostsServerMLEdge_Sub",
    "mqttauth_creds": "<ARN_SECRET_MQTT_CREDENTIALS>"
}
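
As a rough illustration of how a subscriber might consume this configuration, the following Python sketch parses the JSON, checks that the expected keys are present, and confirms that the topic embeds the sensor name. The key names and topic pattern come from the configuration above. The function name and the validation rules are assumptions for illustration, not the workshop's actual code.

```python
import json

REQUIRED_KEYS = {
    "broker", "port", "client_id", "sensor_name",
    "topic", "thing_name", "mqttauth_creds",
}


def load_subscriber_config(raw: str) -> dict:
    """Parse and sanity-check a subscriber configuration document."""
    config = json.loads(raw)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"missing configuration keys: {sorted(missing)}")
    if not isinstance(config["port"], int) or not 0 < config["port"] < 65536:
        raise ValueError("port must be an integer between 1 and 65535")
    # In the example, the topic is arduino/<sensor_name>/3-axis-rotation.
    expected_topic = f"arduino/{config['sensor_name']}/3-axis-rotation"
    if config["topic"] != expected_topic:
        raise ValueError(f"topic should be {expected_topic}")
    return config
```

Failing fast on a mismatched topic avoids a subscriber that connects successfully but never receives a message.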

The second component is the Anomaly Detector artifact that processes sensor data in real-time, detects anomalies using a pre-trained model, and sends anomaly counts to AWS IoT Core. Key components include:

  • edge_application.py: script for processing sensor data, performing local inference using a pre-trained model in ONNX format, and publishing anomaly counts to AWS IoT Core. Because inference runs locally, the raw data is not exposed outside the edge location.
  • model: directory storing “arduino.onnx”, a pre-trained Autoencoder model for anomaly detection.
  • statistics: directory storing the values of different statistical functions (for example, mean and standard deviation) from the training phase and used by edge_application.py for inference.
  • functions: directory storing the code of the functions, such as the code to publish to the AWS IoT Core.
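
Autoencoder-based detectors of this kind typically standardize each reading with the training-time mean and standard deviation, reconstruct it with the model, and flag readings whose reconstruction error exceeds a threshold. The following plain-Python sketch illustrates that flow. The reconstruct callable stands in for the ONNX model, and the threshold, the function names, and the choice of mean squared error are assumptions for illustration, not the contents of edge_application.py.

```python
from typing import Callable, Iterable, Sequence

Vector = Sequence[float]


def standardize(reading: Vector, mean: Vector, std: Vector) -> list[float]:
    """Scale a raw reading using statistics saved from the training phase."""
    return [(x - m) / s for x, m, s in zip(reading, mean, std)]


def reconstruction_error(original: Vector, rebuilt: Vector) -> float:
    """Mean squared error between the model input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(original, rebuilt)) / len(original)


def count_anomalies(
    readings: Iterable[Vector],
    mean: Vector,
    std: Vector,
    reconstruct: Callable[[list[float]], Vector],
    threshold: float,
) -> int:
    """Count readings whose reconstruction error exceeds the threshold."""
    count = 0
    for reading in readings:
        scaled = standardize(reading, mean, std)
        if reconstruction_error(scaled, reconstruct(scaled)) > threshold:
            count += 1
    return count
```

Only the resulting count needs to leave the edge location, which is consistent with publishing anomaly counts rather than raw readings.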

After deployment of subscriber and detector components, the Lab EC2 instance processes synthetic gyroscope data from Arduino sensors, detecting anomalies during X, Y, or Z axis movement:

Real-time Dashboard showing sensor data and anomaly count

Real-time anomaly detection results from gyroscope sensor data across X, Y, and Z axes.

Building upon the foundation of Outposts Server, Local Network Interface (LNI), and AWS IoT Greengrass, this solution extends beyond anomaly detection to deliver comprehensive edge AI capabilities. These core components work together to enable advanced generative AI applications at the edge, as demonstrated in the following sections.

Edge generative AI applications with Outposts Server

The solution demonstrates the implementation of key edge generative AI capabilities:

  • Contextual virtual assistance: providing on-site personnel with AI-powered guidance and troubleshooting using local operational data, SOPs, and technical documentation.
  • Predictive insights: using foundation models (FMs) to forecast future trends based on historical data, enabling proactive planning and optimization.
  • Real-time operational dashboard: integrating sensor data visualization with AI-powered insights and forecasts in a unified local interface that maintains operations during connectivity interruptions.

1. Contextual virtual assistance at the edge

The solution implements the virtual assistant through an AWS IoT Greengrass component. The following is a snippet from the component recipe showing the key configuration parameters:

{
    "ComponentConfiguration": {
        "DefaultConfiguration": {
            // Workshop defaults, SLM runs locally on same EC2 instance
            "SLM_endpoint": "http://localhost:8080",  
            "embedding_model": "all-MiniLM-L6-v2",    
            "knowledge_base_directory": "Factory_Data" 
        }
    }
    // Additional component recipe configurations...
}

Although the solution demonstrates a streamlined setup with the SLM running on the same EC2 instance as the AWS IoT Greengrass component, the architecture enables flexible deployment options through the SLM_endpoint configuration. Organizations can:

  • Deploy the SLM on a dedicated resource in their on-premises network (for example "http://<LNI-IP-DEDICATED-RESOURCE>:8080")
  • Use existing hardware infrastructure accessible through LNI
  • Scale SLM compute resources independently from the AWS IoT Greengrass component
  • Maintain low-latency communication through local network interfaces

The implementation showcases a streamlined approach to RAG at the edge through three main components:

Knowledge base management: the solution uses Amazon S3 for document storage (PDFs, Markdown, text) with automatic edge deployment through AWS IoT Greengrass. Alternatively, you can store the documents in local storage. A vector database, such as ChromaDB, handles local vector storage and similarity search, enabling efficient knowledge base updates with centralized control.
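
The similarity search at the heart of RAG can be sketched without a vector database: embed the query, score it against the stored document embeddings with cosine similarity, and keep the best matches as context. The following Python sketch uses hand-written vectors in place of a real embedding model. All names and the prompt wording are illustrative assumptions, and in practice a store such as ChromaDB performs the search for you.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, documents, k=2):
    """documents is a list of (text, embedding); return the k closest texts."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vec, doc[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def augment_prompt(question, context_chunks):
    """Prepend retrieved context to the user's question."""
    context = "\n".join(context_chunks)
    return f"Use the following context to answer.\n{context}\n\nQuestion: {question}"
```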

Flexible query processing: the implementation provides a streamlined interface for RAG management, allowing users to load site-specific knowledge bases and switch between basic SLM and RAG-enhanced responses with local context:

if prompt := st.chat_input("Question"):
    # Add local context only when a knowledge base has been loaded
    if "db" in st.session_state:
        prompt = augmentPrompt(prompt, st.session_state["db"])
    response = getStreamingAnswer(prompt, SLM_MODEL_ENDPOINT)

Modular SLM integration: The solution uses a standardized chat completion API, which allows for integration with different SLM deployments while maintaining a consistent interface across the edge fleet:

def getStreamingAnswer(question: str, endpoint: str):    
    chat_template = '<|user|>\n{input} <|end|>\n<|assistant|>'
    payload = {
        'messages': [{'content': f'{chat_template.format(input=question)}'}],
        'stream': True
    }
    SLM_URL = endpoint + '/v1/chat/completions'
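
The snippet stops before the request is sent. Servers that expose an OpenAI-compatible /v1/chat/completions endpoint usually stream the reply as server-sent events: one "data: {...}" line per chunk, ending with "data: [DONE]". Assuming the local SLM server follows that convention (the post does not say which server is used), the streamed lines could be decoded as in the following Python sketch. The function name is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style streaming chat completion lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other event fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        text = chunk["choices"][0].get("delta", {}).get("content")
        if text:
            yield text
```

With the requests library, the output of response.iter_lines(decode_unicode=True) on a streamed POST can be passed straight to this generator.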

This flexible architecture can be adapted for many industrial use-cases where latency and proximity to local data-sources and processes are critical.

2. Predictive insights using local models

The solution demonstrates forecasting capabilities using Chronos, a small and efficient time series forecasting model that can run entirely at the edge. The following solution implementation shows how to process historical data and generate predictions using Chronos on the AWS IoT Greengrass component deployed on Outposts Server:

import numpy as np
import torch
from chronos import ChronosPipeline

# Load Chronos model locally on the Outposts Server
pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small",
    device_map="cpu",
    torch_dtype=torch.bfloat16,
)

# Generate forecasts with confidence intervals
# (df, pred_length, and n_samples are assumed to be defined elsewhere in the component)
def predict_anomaly_count_data():
    forecast = pipeline.predict(
        context=torch.tensor(df["total_anomalies"]),
        prediction_length=pred_length,
        num_samples=n_samples,
        top_k=50,
        top_p=1.0,
    )

    # Calculate confidence bounds: 10th, 50th, and 90th percentiles across samples
    low, median, high = np.quantile(forecast[0].numpy(), [0.1, 0.5, 0.9], axis=0)
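
The quantile step turns a set of sampled forecast paths into a band: at each future time step, the 10th, 50th, and 90th percentiles across samples give the lower bound, the median, and the upper bound. The following pure-Python sketch reproduces that calculation on made-up sample paths, using linear interpolation between order statistics, which is the default method of np.quantile. The function names are illustrative.

```python
def quantile(values, q):
    """Quantile with linear interpolation between neighboring sorted values."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


def forecast_band(samples, qs=(0.1, 0.5, 0.9)):
    """samples holds one list per sampled path, with one value per time step."""
    steps = list(zip(*samples))  # transpose: sampled values grouped by time step
    return [[quantile(step, q) for step in steps] for q in qs]
```

A wide gap between the lower and upper bounds at a given step signals that the sampled paths disagree there, so the median should be read with more caution.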

Although the solution uses sample data for the demonstration, this architecture allows organizations to process complex, real-time data at each edge location. Companies can choose to upload only aggregated metrics to CloudWatch or Amazon QuickSight for fleet monitoring and BI analysis, making sure that sensitive raw data remains secure at the edge.

3. Real-time operational dashboard

The solution showcases a resilient monitoring solution where all inter-component communication occurs within the local network and processing happens on the Outposts server, which preserves full functionality during external network interruptions. The dashboard is accessible through the LNI of the Outposts server, allowing local clients to maintain access through the LNI IP address even when connectivity to the Region is lost.

Through a unified interface, the dashboard provides:

  • Real-time visualization of sensor readings
  • Anomaly detection results from the local ML component
  • AI-powered insights from the local SLM
  • Trend forecasting from the Chronos model

Real-time Dashboard showing sensor data and anomaly count

Virtual Assistant leveraging Factory Data to provide contextualized answers

Chronos forecasting anomaly count based on historical data

Conclusion

The implementation demonstrates how AWS Outposts Server enables organizations to use both traditional ML and advanced generative AI capabilities at the edge for a variety of industrial and manufacturing use-cases where low-latency and proximity to sensitive or real-time data are business- and process-critical.

To get started with AWS Outposts and explore use cases like this edge AI solution, fill out this form and our team will contact you with lab access and additional guidance. For a detailed walkthrough of this specific edge AI example, refer to this step-by-step guide. For more information about AWS Outposts Server, see the AWS Outposts Server User Guide.

Anchoring AWS Outposts servers with AWS Direct Connect

Post Syndicated from Art Baudo original https://aws.amazon.com/blogs/compute/anchoring-aws-outposts-servers-with-aws-direct-connect/

This post is written by Perry Wald, Principal GTM SA, Hybrid Edge; Eric Vasquez, Senior SA, Hybrid Edge; and Fernando Galves, Gen AI Solutions Architect, Outposts.

AWS Outposts is a fully managed service that extends AWS infrastructure, services, APIs, and tools to customer premises. Outposts servers, launched in 2022, are 1U or 2U rack-mountable hosts that can run Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic Container Service (Amazon ECS), as well as other appropriate smaller-scale edge services such as AWS IoT Greengrass. This version of Outposts is primarily focused on bringing lower-latency AWS compute capabilities to the edge at many user locations.

During Outposts provisioning, you or AWS creates a service link connection that connects your Outposts server to your chosen AWS Region or home Region. Outposts depends on regional connectivity “to reach out to home,” needing very little in terms of networking. Looking at the network requirements, it needs:

  • DHCP, to assign an IP address and a default gateway
  • Public DNS, to resolve the name of the initial regional endpoint, to allow automated setup, and
  • Internet access, so that when the regional endpoint has been resolved, the Outpost can reach that endpoint, with a minimum of 500 Mbps and a maximum of 175 ms round-trip latency

User challenges with internet connectivity at the edge

When you order an Outposts server, you are responsible for installing the server. Outposts servers are self-provisioning and need a service link connection between your Outposts and the AWS Region (or home Region). This connection allows for the management of Outposts and the exchange of traffic to and from the AWS Region. Server deployment can be broken down into the following steps: installing the Outposts servers, powering them on, and providing authentication details through a command line. Then, the Outpost servers reach out to the regional endpoint and provision themselves. Your Outpost status shows as Active when the process has completed, which could take a few hours depending on service link bandwidth.

Although this has been suitable for the vast majority of use cases, there are some locations that can’t provide internet connectivity in their environments. This has mostly been in use cases where there is a strong security reason for not having an internet connection (such as financial services kiosks, small manufacturing facilities, and defense), so as to avoid risks such as DDoS attacks and potential hack attempts, or to meet requirements for receiving an authority to operate (ATO).

These locations either have some form of direct connect, or more commonly have a centralized direct connect link to AWS, and an MPLS network linking all their remote sites to a central one. In both of these scenarios, the requirement is to allow the Outpost servers to resolve and reach the public endpoint for setup, and subsequently the public anchor endpoint for management. This is done without needing to leave the AWS ecosystem, without needing to expose themselves unnecessarily to potential internet threats, and without adding more systems to manage themselves, but rather making use of AWS services.

To meet this requirement, we identified several key things that need to be provided if the user does not have internet connectivity at the remote location, as follows:

  1. DHCP, to provide the Outposts servers with an IP address, default gateway, and DNS servers.
  2. Public DNS access to resolve both the setup endpoint, and when live, the anchor endpoint.
  3. Public internet access, without exposing the user location to potentially harmful traffic from the internet.

Direct Connect VIF options

There are three different types of Virtual Interfaces (VIF) possible to configure on an AWS Direct Connect link:

  • Public VIF: A public VIF can access all AWS public services using public IP addresses.
  • Private VIF: A private VIF should be used to access an Amazon Virtual Private Cloud (Amazon VPC) using private IP addresses.
  • Transit VIF: A transit VIF should be used to access one or more Amazon VPC Transit Gateways associated with Direct Connect gateways.

Transit VIF option

A transit VIF can be used to solve both of these issues. First, a transit VIF deploys an ENI within a VPC (known as an attachment), so that traffic coming from the transit VIF into a VPC can be routed. This is because it follows the rule that, for non-transitive VPC routing, the traffic has to either be sourced or targeted for an ENI in the VPC.

If the traffic is forwarded to a regional VPC through the transit gateway, then it can be forwarded to the internet through a NAT gateway. This is an enhancement of the architecture to use a transit gateway to provide a single egress point for multiple VPCs to the internet. For more information, see Creating a single internet exit point from multiple VPCs Using AWS Transit Gateway. In this case, instead of the transit gateway routing multiple VPCs to the internet, it’s routing to an on-premises connection.

Using a transit gateway to forward traffic to a NAT gateway allows you to provide internet connectivity for the Outposts servers without managing virtual appliances, because NAT gateway provides this as a service. NAT gateways also only allow outbound access, so they provide security against any attempted external access by a bad actor from the internet. This works for Outposts servers because they only need outbound access. Outposts always initiate communication to an anchor or service endpoint, and they never receive communication except as a response.

Figure 1. Architectural diagram showing the use of a Transit VIF and NAT gateway in a Region reaching regional endpoints

DNS provisioning

Although the preceding architecture solves the challenge of how we provide a path for IP packets to transit between the Outposts servers and the public endpoints needed, it doesn’t solve the issue of resolving DNS names. If the remote site is isolated from the internet, then it has no clear way to resolve DNS.

Amazon Route 53 Resolver endpoints allow you to deploy an IP address within a VPC subnet, which provides DNS resolution. There are two types of resolver endpoints: outbound and inbound.

Outbound resolver endpoints are used by AWS to send DNS queries to your on-premises DNS servers. Inbound resolver endpoints are used by your DNS servers (and hosts) to resolve addresses within Route 53.

Route 53 can resolve public DNS names, so the Outposts service endpoint outposts.<region-name>.amazonaws.com becomes resolvable by an inbound resolver endpoint.

Configuring the Outposts egress VPC

  1. Set up service link egress VPC, build subnets, deploy a NAT gateway, and transit gateway.
  2. Create Route 53 resolver inbound endpoint.
  3. Configure DHCP on the switch, and make sure that the DNS value matches resolver endpoint.
  4. Configure Transit VIF on the switch, build a BGP peer, and attach to your transit gateway.
  5. Confirm propagation settings on transit gateway and default routes.
  6. Confirm routes on subnets to allow traffic out to the internet, and back to your Outpost servers.
  7. Test name resolution (dig) and HTTPS connectivity (curl) to the service endpoint.
  8. If needed, install your Outpost servers.
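
The checks in step 7 can also be scripted. The following Python sketch builds the service endpoint name from the pattern given earlier (outposts.<region-name>.amazonaws.com), resolves it through whichever DNS server the host is configured to use, and attempts a TLS handshake on port 443. It uses only the standard library, and the function names are illustrative. A successful handshake shows only that name resolution and the network path work, not that provisioning will succeed.

```python
import socket
import ssl


def service_endpoint(region: str) -> str:
    """Build the Outposts service endpoint name for a Region."""
    return f"outposts.{region}.amazonaws.com"


def resolve(hostname: str) -> list[str]:
    """Return the sorted, de-duplicated addresses that hostname resolves to."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def tls_reachable(hostname: str, timeout: float = 5.0) -> bool:
    """Return True if a TLS handshake with hostname:443 completes."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((hostname, 443), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=hostname):
                return True
    except OSError:
        # Covers DNS failures, timeouts, refused connections, and TLS errors.
        return False
```

Run from a host on the same subnet as the Outposts servers, resolve(service_endpoint("<region-name>")) exercises the inbound resolver endpoint, and tls_reachable exercises the transit gateway and NAT gateway path.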

Public VIF option

Using a public VIF allows you to provide an internet connection directly to the on-premises site. In turn, this means you need to implement firewalls and security functions on this connection, adding more layers of operational overhead. A public VIF also means that the on-premises end of the VIF can be accessed by any public IP on the AWS public network, regardless of the instance to which that IP is mapped. A public VIF is a public IP endpoint on the AWS public network, so you should treat public VIF traffic as internet-based traffic. This can become cumbersome for firewall teams if they have to allow-list known AWS IP ranges and manage a stateful firewall for a long list of AWS IPs.

Furthermore, even if the user is happy to implement and manage a firewall on the end of that public VIF, there is still the question of how the Outpost would resolve DNS names in this setup, both for the setup endpoint and for the subsequent anchor endpoints. Unless the private network already has DNS resolution to a public DNS, there are no DNS servers that DHCP can point to in order to give the Outposts servers name resolution. This is because there is no public DNS endpoint within the AWS public network. Traffic from a user’s public VIF can access the AWS public network, but it can’t exit it to other public networks. For example, if you had configured DHCP to point to one of the well-known DNS servers (such as 8.8.8.8), then, because this DNS server lives outside of the AWS public network, requests originating from the on-premises side of a public VIF would be dropped when they hit the border of the AWS autonomous system.

The only way for a DNS request to be resolved would be to build a BIND forwarding service within a VPC, provide it with a public IP address, and point the DHCP DNS values at this IP address.

This network configuration introduces complexity, and won’t be possible for those with highly regulated workloads. You would need to manage a firewall on-premises, allow a public network to reach the on-premises location, and manage a BIND server setup within a VPC. For these reasons, a public VIF is generally not an option unless the user is already running one and is familiar with the steps to secure it.

Figure 2. Architectural diagram showing traffic flow using a public VIF and AWS Outposts

Private VIF option

A private VIF, whether connected directly to a virtual private gateway (VGW) or through a Direct Connect gateway, does not solve the problem, because VPCs do not support transitive routing. To explain this another way, any traffic following a routing rule in a subnet route table has to either originate from, or be destined for, an IP address (or to be more explicit, an Elastic Network Interface (ENI)) inside that VPC.

Virtual private gateways do not have an ENI associated with them, but are pointed to as a next hop within a subnet routing table. If we take this example and look at what the Outposts servers would be trying to pass as traffic, then it would send a packet with a source address of the Outposts servers, and a destination address of the Outposts service public endpoint (assuming that it could resolve it). When this packet reaches the VPC, then neither the source nor destination address would belong to an ENI within the VPC. Therefore, VPC routing would drop the packet.
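
The non-transitive routing rule can be expressed as a simple predicate: a packet is forwarded only if its source or destination address belongs to an ENI inside the VPC. The following Python sketch is a toy model of that rule for illustration, not a description of how VPC routing is implemented. All addresses are made up.

```python
from ipaddress import ip_address


def vpc_forwards(src: str, dst: str, eni_addresses: set[str]) -> bool:
    """Toy model: forward only if src or dst is an ENI address in the VPC."""
    enis = {ip_address(addr) for addr in eni_addresses}
    return ip_address(src) in enis or ip_address(dst) in enis


# Hypothetical addresses for illustration only.
ENIS = {"10.0.1.10", "10.0.1.20"}   # ENIs that exist inside the VPC
OUTPOST = "192.168.50.5"            # Outposts server on premises
PUBLIC_ENDPOINT = "203.0.113.25"    # stand-in for a public service endpoint

# Arrives through a VGW: neither address belongs to an ENI, so it is dropped.
dropped = not vpc_forwards(OUTPOST, PUBLIC_ENDPOINT, ENIS)
```

In this model, the same packet is forwarded only once one of its endpoints is an address in ENIS, which is the condition the private VIF path cannot meet for traffic bound for a public endpoint.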

Even if there were a routing rule on the subnet pointing the next hop for all traffic to a NAT gateway (ideal for internet egress), the routing still wouldn’t work. This is because the packet from the Outposts servers doesn’t have a destination of the NAT gateway, but instead a destination of the setup endpoint on the internet.

It’s possible to use a combination of ingress routing and transparent proxies to ingest the traffic and pass it to an instance running a proxy service to forward to the internet. However, this adds the complexity of managing and maintaining proxy servers. For these reasons, a private VIF is generally not recommended.

Figure 3. Architectural diagram showing VGW and packet drops because of transitive routing not being supported

Conclusion

In this post, we discussed architecture patterns you can use to provision your Outposts when public internet connectivity is unavailable. To get started with Outposts servers, visit our Server User Guide. For more information, contact us.

Migrating your on-premises workloads to AWS Outposts Rack

Post Syndicated from Art Baudo original https://aws.amazon.com/blogs/compute/migrating-your-on-premises-workloads-to-aws-outposts-rack-2/

This post is written by Craig Warburton, Senior Solutions Architect, Hybrid; Sedji Gaouaou, Senior Solutions Architect, Hybrid; and Brian Daugherty, Principal Solutions Architect, Hybrid.

Migrating workloads to AWS Outposts Rack offers you the opportunity to gain the benefits of cloud computing while keeping your data and applications on premises.

If your organization has strict data residency requirements, deploying AWS infrastructure and services on premises lets you keep sensitive data and mission-critical applications within your own data centers or facilities, helping ensure compliance with data sovereignty laws and regulatory frameworks.

On the other hand, if your organization does not have stringent data residency requirements, you may opt for a hybrid approach, using both Outposts Rack and the AWS Regions. With this flexibility, you can process and store data in the most appropriate location based on factors such as latency, cost optimization, and application requirements.

In this post, we cover options to migrate your workloads to an Outposts Rack, taking into account your specific data residency requirements. We explore strategies, tools, and best practices to enable a successful migration tailored to your organization’s needs.

Overview

AWS has several services to help you migrate and rehost workloads, including AWS Migration Hub, AWS Application Migration Service, and AWS Elastic Disaster Recovery. Alternatively, you can use backup and recovery solutions provided by AWS Partners.

At AWS, we use the 7 Rs framework to help organizations evaluate and choose the appropriate migration strategy for moving applications and workloads to the AWS Cloud. The 7 Rs represent:

  1. Rehosting (rehost or lift and shift)
  2. Replatforming (lift, tinker, and shift)
  3. Repurchasing (republish or re-vendor)
  4. Refactoring (re-architecting)
  5. Retiring
  6. Retaining (revisit)
  7. Relocating (remigrate)

This post focuses on rehosting and the services available to help rehost on-premises applications to Outposts Rack.

Before getting started with any migration, AWS recommends a three-phase approach to migrating workloads to the cloud (AWS Region or Outposts Rack). The three phases are assess, mobilize, and migrate and modernize.

Figure 1: Diagram showing the three migration phases of assess, mobilize, and migrate and modernize

This post describes the steps that you can take in the migrate and modernize phase. However, the assess and mobilize phases are also critical to allow you to understand what applications are migrated, the dependencies between them, and the planning associated with how and when migration occurs.

Workload migration to Outposts Rack: With staging environment in a Region

After deploying an Outposts Rack to your desired on-premises location, you can perform migrations of on-premises systems and virtual machines using either Application Migration Service and AMI creation or third-party backup and recovery services. Both scenarios are described in the following sections.

Scenario 1: Using Application Migration Service with AMI creation

Application Migration Service is able to lift and shift a large number of physical or virtual servers without compatibility issues, performance disruption, or long cutover windows.

In this scenario, at least one Outposts Rack is deployed on premises with the following prerequisites:

  • An AWS Replication Agent installed on each source server
  • At least one Outposts Rack installed and activated
  • VPC in an AWS Region
  • Staging subnet for staging migrated instances
  • Cutover subnet for validating migrated instances
  • Extended VPC spanning Region to the Outposts Rack
  • Migrated resources subnet where instances will be deployed from AMIs

The following diagram shows the solution architecture including the prerequisites and the on-premises servers that will be migrated to the Outposts Rack.

Figure 2: Architecture diagram showing migration with Application Migration Service

Step 1: Outposts Rack configuration

You can work with AWS specialists to size your Outposts for your workload and application requirements. In this scenario, you don’t need additional Outposts Rack capacity for migration because the staging area will be deployed in the Region (see 1 in Figure 2).

Step 2: Prepare Application Migration Service

Set up Application Migration Service from the console in the Region to which your Outposts Rack is anchored. If this is your first setup, then choose Get started on the Application Migration Service console. When creating the replication settings template, ensure that your staging area is using subnets in the anchor Region (see 2 in Figure 2).

Step 3: Install the AWS Replication Agent to the source servers or machines

For large migrations, source servers may have a wide variety of operating system versions and may be distributed across multiple data centers. Application Migration Service offers the MGN connector, a feature that allows you to automate running commands on your source environment. Finally, ensure that communication is possible between the agent and Application Migration Service (see 3 in Figure 2).

In the following image, there is an example of deploying the AWS Replication Agent providing the necessary parameters (AWS Region, AWS access key and AWS secret access key).

Figure 2: Architecture diagram showing migration with Application Migration Service

When the AWS Replication Agent is installed, the server is added to the Application Migration Service console. Next, it undergoes the initial synchronization process, which is complete when the server shows the Ready for testing lifecycle state in the Application Migration Service console.

Step 4: Configure launch settings

Prior to testing or cutting over an instance, you must configure the launch settings by creating Amazon Elastic Compute Cloud (Amazon EC2) launch templates, ensuring that your cutover subnet is selected and that you choose an available instance type (see 4 in Figure 2). The instance type right-sizing feature allows AWS Application Migration Service to launch a test or cutover instance type that best matches the hardware configuration of the source server. When you select the Basic option, AWS Application Migration Service launches a test or cutover instance type that best matches the OS, CPU, and RAM of your source server.

Step 5: Install AWS Systems Manager Agent on your cutover instances

When the launch settings are defined, you must activate the post-launch actions for either a specific server or all the servers. You must leave the Install the Systems Manager agent and allow executing actions on launched servers option toggled on in order for post-launch actions to work. Turning the option off would prevent Application Migration Service from installing the AWS Systems Manager Agent on your servers, and post-launch actions would no longer be executed (see 5 in Figure 2).

Figure 3: Post-launch actions on the Application Migration Service console

Step 6: Testing and cutover in Region

When you have configured the launch settings for each source server, you are ready to launch the servers as test instances. Best practice is to test instances before cutover.

Figure 4: Application Migration Service console ready to launch test instances

Finally, after completing the testing of all the source servers, you are ready for cutover (see 6 in Figure 2). Prior to launching cutover instances, check that the source servers are listed as Ready for cutover under Migration lifecycle and Healthy under Data replication status.
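The pre-cutover check can be expressed as a simple filter. The sketch below is illustrative only: `SourceServer` and its field names are hypothetical and are not the Application Migration Service API; only the two status strings come from the console columns named above.

```python
from dataclasses import dataclass


@dataclass
class SourceServer:
    # Hypothetical record mirroring two console columns.
    name: str
    lifecycle: str            # "Migration lifecycle" column
    replication_status: str   # "Data replication status" column


def ready_for_cutover(servers):
    """Return the names of servers that pass both pre-cutover checks."""
    return [
        s.name
        for s in servers
        if s.lifecycle == "Ready for cutover"
        and s.replication_status == "Healthy"
    ]


# Hypothetical inventory for illustration.
servers = [
    SourceServer("web-01", "Ready for cutover", "Healthy"),
    SourceServer("db-01", "Ready for testing", "Healthy"),
    SourceServer("app-01", "Ready for cutover", "Stalled"),
]
print(ready_for_cutover(servers))  # ['web-01']
```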

Figure 5: Application Migration Console ready for cutover

To launch the cutover instances, choose the instances you want to cut over and then choose Launch cutover instances under Cutover (see Figure 5). The Application Migration Service console indicates Cutover finalized when the cutover has completed successfully. The chosen source servers’ Migration lifecycle column shows the Cutover complete status, the Data replication status column shows Disconnected, and the Next step column shows Mark as archived. The source servers have now been successfully migrated into AWS. You can now archive your source servers that have launched cutover instances.

Step 7: Create a Migration AMI

After migrating all your workloads into the Region to which your Outpost is anchored, create Amazon Machine Images (AMIs) from the migrated instances. When you create an AMI from an instance, Amazon EC2 powers down the instance before creating the AMI to make sure that everything on the instance is stopped and in a consistent state during the creation process. If you are confident that your instance is in a consistent state appropriate for AMI creation, you can tell Amazon EC2 not to power down and reboot the instance.

This step can be automated using an existing Post Launch Action.

Step 8: Launch instances on AWS Outposts

The final part is to launch your created AMIs to your Outposts. To identify the EC2 instance types configured on your Outpost, you can use the following AWS Command Line Interface (AWS CLI) command:

aws outposts get-outpost-instance-types \
  --outpost-id op-abcdefgh123456789

The output of this command lists the instance types and sizes configured on your Outpost:

InstanceTypes:
- InstanceType: c5.xlarge
- InstanceType: c5.4xlarge
- InstanceType: r5.2xlarge
- InstanceType: r5.4xlarge

With knowledge of the instance types configured, you can now determine how many of each are available. For example, the following AWS CLI command, which is run on the account that owns the Outpost, lists the number of c5.xlarge instances available for use:

aws cloudwatch get-metric-statistics \
  --namespace AWS/Outposts \
  --metric-name AvailableInstanceType_Count \
  --statistics Average --period 3600 \
  --start-time $(date -u -Iminutes -d '-1hour') \
  --end-time $(date -u -Iminutes) \
  --dimensions \
  Name=OutpostId,Value=op-abcdefgh123456789 \
  Name=InstanceType,Value=c5.xlarge

This command returns:

Datapoints:
- Average: 10.0
  Timestamp: '2024-04-10T10:39:00+00:00'
  Unit: Count
Label: AvailableInstanceType_Count

The output indicates that there were (on average) 10 c5.xlarge instances available in the specified time period (one hour). Using the same command for the other instance types, you discover that there are also 20 c5.4xlarge, 10 r5.2xlarge, and 6 r5.4xlarge available for use in completing the necessary EC2 launch templates.
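Once the available counts are known, it can help to compare them against the instances you plan to launch before writing the launch templates. The following Python sketch does that comparison. The available counts are the ones quoted above; the function name `capacity_shortfall` and the `required` plan are hypothetical and only for illustration.

```python
def capacity_shortfall(available, required):
    """Return {instance_type: missing_count} for types that do not fit."""
    return {
        itype: needed - available.get(itype, 0)
        for itype, needed in required.items()
        if needed > available.get(itype, 0)
    }


# Counts reported by the CloudWatch metric in the example above.
available = {
    "c5.xlarge": 10,
    "c5.4xlarge": 20,
    "r5.2xlarge": 10,
    "r5.4xlarge": 6,
}

# Hypothetical migration plan.
required = {"c5.xlarge": 8, "r5.4xlarge": 7}

print(capacity_shortfall(available, required))  # {'r5.4xlarge': 1}
```

An empty result means every planned instance type fits within the capacity currently available on the Outpost.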

Scenario 2: Using partner backup and replication solutions

You may already be using a third-party or AWS Partner solution to create on-premises backups of bare-metal or virtualized systems. These solutions often use local disk-arrays or object stores to create tiered backups of systems covering restore-points going back years, days, or just a few hours or minutes.

These solutions may also have inherent capabilities to restore from these backups directly to AWS. This enables migration of on-premises systems to EC2 instances deployed to Outposts Rack.

In the scenario illustrated in Figure 6, the partner backup and replication service (BR) creates backups (see 1 in Figure 6) of virtual machines to on-premises disk or object storage repositories. Using the service’s AWS integration, virtual machines can be restored (see 2 in Figure 6) to an EC2 instance deployed on Outposts Rack, which is also on-premises. The restoration may follow a process that uses helper instances and volumes (see 3 in Figure 6) during intermediate steps to create Amazon Elastic Block Store (Amazon EBS) snapshots (see 4 in Figure 6) and then AMIs of the systems being migrated (see 5 in Figure 6), which are ultimately deployed (see 6 in Figure 6) to Outposts Rack.

Figure 6: Architecture diagram of the partner backup and replication scenario

When deploying an AMI created from a restored instance you must specify the target VPC and subnet. These should be the VPC being extended to the Outpost and a subnet that has been created in that VPC on the Outpost. You also need to specify an EC2 instance type that is available on the Outpost, which can be discovered using the process described in the previous section.

Workload migration to Outposts Rack using AWS Elastic Disaster Recovery (DRS)

Data residency can be a critical consideration for organizations that collect and store sensitive information, such as personally identifiable information (PII), financial data, or medical records. AWS Elastic Disaster Recovery, supported on Outposts Rack, helps enable seamless replication of on-premises data to Outposts Rack and addresses data residency concerns by keeping data within your on-premises environment, using Amazon EBS and Amazon S3 on Outposts.

In this scenario, an Outposts Rack is deployed on-premises with the following prerequisites:

  • At least one Outposts Rack installed and activated
  • The Outposts Rack must be in Direct VPC Routing (DVR) mode
  • VPC extended to the Outposts Rack containing subnets for staging and target resources
  • Amazon S3 on Outposts (necessary for all Elastic Disaster Recovery replication destinations)
  • An AWS Replication Agent installed on each source server

The following diagram shows the solution architecture and includes the on-premises servers that are migrated from the local network to the Outposts Rack. It also includes the staging VPC used to deploy the replication servers on Outposts Rack, Amazon S3 on Outposts to store the local Amazon EBS snapshots, and the target VPC extended to Outposts Rack.

Figure 7: Architecture diagram for workflow migration to Outposts Rack

Step 1: Outposts Rack configuration

To use Elastic Disaster Recovery on Outposts Rack, you need to configure both Amazon EBS and Amazon S3 on Outposts to support continuous replication and point-in-time recovery for your workload needs (see 1 in Figure 7). Specifically, you need to size the Amazon EBS and Amazon S3 on Outposts capacity according to your workload capacity requirements and application interdependencies. To do this, you can define dependency groups: each dependency group is a collection of applications and their underlying infrastructure with technical or non-technical dependencies. A 2:1 ratio is recommended for the EBS volumes used for near-continuous replication, and a 1:1 ratio is recommended for the Amazon S3 on Outposts capacity used for EBS snapshots. For example, to migrate 40 TB of workloads, you need to plan for 80 TB of EBS volumes and 40 TB of Amazon S3 on Outposts capacity.
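The sizing arithmetic is simple enough to capture in a helper. The sketch below applies the 2:1 and 1:1 ratios from the paragraph above; the function name `outposts_storage_plan` and its parameter names are hypothetical.

```python
def outposts_storage_plan(workload_tb, ebs_ratio=2.0, s3_ratio=1.0):
    """Estimate EBS and S3 on Outposts capacity (in TB) for a workload.

    Defaults follow the recommended 2:1 (EBS) and 1:1 (S3) ratios.
    """
    return {
        "ebs_tb": workload_tb * ebs_ratio,
        "s3_tb": workload_tb * s3_ratio,
    }


print(outposts_storage_plan(40))  # {'ebs_tb': 80.0, 's3_tb': 40.0}
```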

Step 2: Extend VPC to your Outposts Rack

When your Outpost has been provisioned and is available, extend the necessary Amazon Virtual Private Cloud (Amazon VPC) connection to the Outpost from the Region by creating the desired staging and target subnets (see 2 in Figure 7).

Step 3: Prepare Elastic Disaster Recovery service

Prepare the Elastic Disaster Recovery service from the Console to set the default replication and launch settings. When defining these settings, make sure that the Outposts resources available are chosen for staging and target subnets and instance and storage type (see 3 in Figure 7).

Step 4: Install the AWS Replication Agent to the source servers or machines

The next phase is to install the AWS Replication Agent to the source servers and to make sure that communication is possible between the AWS Replication Agent and your Outposts replication subnet through the Outposts local gateway, which makes sure that replication traffic uses the local network (see 4 in Figure 7).

Step 5: Continuous block-level replication

Staging area resources are automatically created and managed by Elastic Disaster Recovery. When the AWS Replication Agent has been deployed, continuous block-level replication (compressed and encrypted in transit) occurs (see 5 in Figure 7) over the local network.

Step 6: Launch Outposts Rack resources

Finally, migrated instances can now be launched using Outposts Rack resources based on the launch settings defined previously (see 6 in Figure 7).

Conclusion

In this post, you have learned how to migrate your workloads from your on-premises environment to AWS Outposts Rack based on your specific data residency requirements. When you have the flexibility of using AWS Regional services, AWS migration services or partner solutions can be used with infrastructure already in place. If your data must stay on-premises, then using AWS Elastic Disaster Recovery allows you to migrate your data without using Regional services, allowing you to migrate to Outposts Rack without your data leaving the boundary of a certain geographic location.

To learn more about an end-to-end migration and modernization journey, visit the AWS Migration Hub.

Implementing a serverless architecture to detect absence of Guardrails in Amazon Bedrock inference API calls

Post Syndicated from Art Baudo original https://aws.amazon.com/blogs/compute/implementing-a-serverless-architecture-to-detect-absence-of-guardrails-in-amazon-bedrock-inference-api-calls/

This post is written by Sayan Chakraborty, Senior Solutions Architect, AWS

In today’s rapidly evolving artificial intelligence (AI) landscape, organizations are increasingly harnessing the power of foundation models through Amazon Bedrock to build sophisticated generative AI applications. Although this technology opens up exciting possibilities, it also brings forth important considerations around responsible AI implementation and content safety.

Amazon Bedrock Guardrails serve as a crucial safeguard, helping organizations filter out harmful content, prevent prompt injection attacks (LLM01:2025 from OWASP Top 10 for generative AI), and maintain ethical AI practices. These configurable safeguards are essential for enterprises committed to responsible AI development, especially when scaling their applications across various use cases.

However, there’s a critical consideration: although Guardrails are powerful, they’re optional by default in Amazon Bedrock inference API calls. For organizations that mandate the use of Guardrails as part of their responsible AI strategy, a solution is needed to make sure of consistent implementation across all API requests.

In this post, we explore how to build a serverless architecture that automatically detects when Guardrails are absent in Amazon Bedrock inference API calls. We demonstrate how enterprises can implement automated monitoring and alerting systems to maintain compliance with their AI safety standards, making sure that Guardrails are properly implemented wherever needed. This solution is particularly valuable for organizations prioritizing secure and responsible AI deployment at scale.

Prerequisites

Before proceeding with the implementation, make sure that you do the following:

1. Create an AWS account if you do not already have one, and log in. The AWS Identity and Access Management (IAM) user that you use must have sufficient permissions to make necessary AWS service calls and manage AWS resources.

2. Have AWS Command Line Interface (AWS CLI) installed and configured.

3. Have Git installed.

Architecture

The following diagram shows an event-driven architecture of this solution.

Figure 1: Solution architecture diagram

Amazon Bedrock supports model invocation logging. When enabled, it collects the full request data, response data, and metadata associated with all model invocation calls performed in your AWS account. Logging can be configured to send the logs to supported destinations such as Amazon CloudWatch Logs and Amazon S3. This solution uses an S3 bucket to collect these logs. Note that this solution supports the below Amazon Bedrock inference APIs:

As logs get stored in the S3 bucket, an Amazon S3 event notification is generated to an Amazon EventBridge event bus. A rule that matches “Object Created” events from Amazon S3 routes these events to an AWS Step Functions state machine, which defines the orchestration logic to inspect the model invocation logs for missing Guardrails, and sends out an alert to a monitored email address when applicable.
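As a rough illustration of the event flow, the Python sketch below builds an EventBridge event pattern that matches Amazon S3 "Object Created" events for one bucket, and extracts the bucket name and object key from such an event. The function names and the bucket and key values are hypothetical; the exact rule shipped in the solution's template may differ.

```python
import json


def object_created_pattern(bucket_name):
    """Build an event pattern for S3 "Object Created" events on one bucket."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {"bucket": {"name": [bucket_name]}},
    }


def log_location(event):
    """Return (bucket, key) of the log object referenced by the event."""
    detail = event["detail"]
    return detail["bucket"]["name"], detail["object"]["key"]


# Hypothetical, trimmed-down event for illustration.
sample_event = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {
        "bucket": {"name": "example-bedrock-logs"},
        "object": {"key": "logs/2025/01/01/invocation.json.gz"},
    },
}

print(json.dumps(object_created_pattern("example-bedrock-logs")))
print(log_location(sample_event))
```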

Walkthrough of the orchestration

As mentioned previously, the Step Functions state machine is the orchestration engine that performs the business logic for this solution, as events are received from new logs created in the S3 bucket. When opened in the Workflow studio in the Step Functions console, you should observe the following diagram.

Figure 2: Step Functions state machine diagram as seen in workflow studio

1. The first step in the state machine is to call an AWS Lambda function to get the logs from the S3 bucket using the bucket name and object key supplied in the event object received from EventBridge.

2. If the log shows that the Amazon Bedrock API invocation was successful, then the state machine collects the output object of the API response from the log that is needed for further evaluation.

3. The next step is to check if Amazon Bedrock Guardrails was used. This is done by looking for specific objects in the Amazon Bedrock API output that was captured from the logs.

4. If a Guardrail was detected, then the flow completes successfully, and no further action is needed.

5. If a Guardrail was not detected, then the next step in the state machine collects a few pieces of information from the log file that are necessary to record the transaction and adds the transaction date. Then, the transaction is logged to the transactions table in Amazon DynamoDB.

6. A user or a role may be making a lot of API calls to Amazon Bedrock each day. Therefore, the solution implements a mechanism to prevent the monitored email address from being swamped by emails reporting the same user or role more than once each day. This is done in parallel to Step 5, where the flow checks if the principal’s identity (user ID/IAM role) is recorded as notified on the current date, by querying the notifications table in DynamoDB. If no results were found, meaning that a notification hasn’t been sent yet, then an email is sent out to a monitored email address through an Amazon Simple Notification Service (Amazon SNS) topic. Furthermore, an item is inserted into the notifications table in DynamoDB to prevent sending more notifications on the same day for the same principal.
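The once-per-day notification check in Step 6 can be sketched as follows. This is a minimal in-memory model, not the solution's code: a Python set stands in for the DynamoDB notifications table, and the class name `NotificationLog`, the method name, and the principal strings are all hypothetical.

```python
from datetime import date


class NotificationLog:
    """In-memory stand-in for the notifications table."""

    def __init__(self):
        self._sent = set()

    def should_notify(self, principal, day):
        """Return True the first time a principal is seen on a given day.

        A True result also records the (principal, day) pair, so later
        calls for the same pair return False.
        """
        key = (principal, day.isoformat())
        if key in self._sent:
            return False
        self._sent.add(key)
        return True


log = NotificationLog()
today = date(2025, 1, 15)
print(log.should_notify("role/analytics", today))  # True: send the email
print(log.should_notify("role/analytics", today))  # False: already notified
print(log.should_notify("role/ml", today))         # True: different principal
```

In a real deployment the check and the insert would need to tolerate concurrent executions, for example with a conditional write, which this sketch does not model.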

Solution deployment

For deployment instructions, follow along in the GitHub repo or use this post. An AWS CloudFormation template is provided to deploy the solution.

1. Create an S3 bucket to store the model invocation logs from Amazon Bedrock. Under bucket Properties, turn the EventBridge notifications to On. This enables Amazon S3 to send an event notification to the EventBridge default event bus whenever a log file is created in the bucket by Amazon Bedrock.

2. Go to the Amazon Bedrock console and enable Model invocation logging under Bedrock Configuration > Settings, from the left navigation pane. Specify the bucket created in Step 1 under S3 location.

Figure 3: Amazon Bedrock settings for Model invocation logging

3. Create two more S3 buckets: one that is used by the Step Functions state machine to store Bedrock model invocation errors detected from the log, and the other that stores the Lambda function code for this solution. Inside the latter bucket, create a Folder called code (or any other preferred name) and upload the ZIP archive under the lambda-code folder of this repository, into that Amazon S3 folder. Note the names for these two S3 buckets and the Amazon S3 object key for the Lambda ZIP file. These must be specified as input parameters to the CloudFormation template.

4. From the CloudFormation console or using CLI, create a stack using the template provided in this repository called bedrock-guardrails-detection-template.yaml. For inputs, specify the BedrockLogsBucket (from Step 1), BedrockLogsErrorBucket (from Step 3), LambdaFunctionCodeBucket (from Step 3), LambdaFunctionCodeBucketKey (S3 object key for the ZIP file uploaded in Step 3, for example code/get-bedrock-logs-from-s3.py.zip), and NotificationEmailAddress (email address to subscribe to the SNS topic). It may take a few minutes to complete deployment of the CloudFormation stack.

5. When deployment is complete, access the email inbox for the email address specified during the CloudFormation stack deployment, and confirm the subscription using the email sent from the Amazon SNS topic. The email should be titled: AWS Notification – Subscription Confirmation. Choose the Confirm subscription link inside the email to complete the subscription process. The email account is now ready to receive notifications from this solution.

Scaling to multiple AWS accounts

The architecture discussed previously shows how Guardrails can be detected from within the same AWS account where Amazon Bedrock APIs are invoked. However, in most production environments, there are multiple AWS accounts where independent teams may be deploying their own generative AI workloads using Amazon Bedrock in their own accounts. To collect model invocation logs from all those accounts, EventBridge can be configured to send events from event buses in separate source workload accounts to a central event bus deployed in a central destination governance account. This central event bus can have a rule to route events to the Step Functions state machine deployed in that central governance account. The deployment model looks like the following diagram.

To learn more about sending and receiving events between AWS accounts in EventBridge, refer to the documentation.

Figure 4: Cross-account guardrail detection solution

Further considerations and clean up

Amazon Bedrock model invocation logging captures requests and responses from model invocations and stores the logs in the destination of your choosing. In this sample it is in an S3 bucket that you create. The following are some more security considerations.

1. To protect information, you may choose to encrypt the contents using server-side encryption with AWS KMS keys (SSE-KMS) on the S3 bucket, and specify a customer managed encryption key. More details are in this Amazon Bedrock user guide.

2. Perform regular cleanup of the model invocation logs bucket using an Amazon S3 lifecycle configuration rule as mentioned in this post.

To avoid ongoing charges, clean up your environment by following these steps to delete the resources you created by following this post, if they are no longer needed:

1. Delete the stack:
aws cloudformation delete-stack --stack-name STACK_NAME

2. Confirm the stack has been deleted:
aws cloudformation list-stacks --query "StackSummaries[?contains(StackName,'STACK_NAME')].StackStatus"

3. Empty the contents of the S3 buckets created manually as a prerequisite to deploying the CloudFormation stack, and delete the buckets.

4. Turn off model invocation logging under Settings in the Amazon Bedrock console if it’s no longer needed.

Conclusion

This post discussed implementing a serverless event-driven architecture to detect the absence of Guardrails in Amazon Bedrock inference API calls. As organizations increasingly use foundation models through Amazon Bedrock for generative AI applications, making sure of responsible AI implementation becomes crucial.

The solution presents an event-driven architecture that automatically detects when Guardrails are missing in API calls. It uses the Amazon Bedrock model invocation logging, storing logs in an Amazon S3 bucket. When new logs are created, an Amazon S3 event notification triggers an Amazon EventBridge event bus, which routes events to an AWS Step Functions state machine. Then, the state machine inspects the logs for missing Guardrails and sends alerts through Amazon SNS to a monitored email address.

The architecture includes features to prevent notification flooding and can scale across multiple AWS accounts. The post provides detailed deployment instructions using AWS CloudFormation and includes security considerations and cleanup procedures. With this solution you can help your organization maintain compliance with AI safety standards while scaling generative AI applications.

Efficiently manage Amazon EC2 On-Demand Capacity Reservations (ODCRs) with split, move, and modify

Post Syndicated from Art Baudo original https://aws.amazon.com/blogs/compute/efficiently-manage-amazon-ec2-on-demand-capacity-reservations-odcrs-with-split-move-and-modify/

This post is written by Ninad Joshi, Senior Solutions Architect; Ballu Singh, Principal Solutions Architect; and Ankush Goyal, Enterprise Support Lead, AWS.

Introduction

In today’s cloud-first world, managing compute capacity efficiently while making sure of application availability is crucial for your business. Amazon EC2 On-Demand Capacity Reservations (ODCR) is a valuable tool for organizations looking to manage their reservations, but managing reservations across multiple teams and accounts is challenging. Recently, AWS introduced new capabilities – split, move, and modify – that improve how organizations can manage their Capacity Reservations. In this post, we explore how these features can transform your operations.

Common ODCR management challenges

As a consumer of ODCR, you might face several challenges managing your Capacity Reservations. These challenges include but are not limited to the following:

  • Underused reserved capacity in some accounts
  • Inability to redistribute excess capacity efficiently
  • Difficulty in managing existing capacity across multiple AWS accounts
  • Difficulty in modifying reservation attributes post-creation

With multiple development teams and various projects running simultaneously, you might struggle with efficient capacity allocation. You might also find yourself dealing with situations where one team has excess capacity while another desperately needs it.

Use case 1: Redistributing capacity across teams

The unused capacity dilemma

Consider a scenario where your machine learning (ML) team has an ODCR for ten c5.2xlarge instances, but they’re only using five. Meanwhile, your Analytics team urgently needs three Amazon Elastic Compute Cloud (Amazon EC2) instances of the same type for a new project. Previously, your Analytics team would have had to create a new reservation, adding the overhead of managing their own Capacity Reservation. Meanwhile, the five unused capacity slots of the ODCR owned by your ML team result in unnecessary costs.

Split capability to the rescue

Using the new split capability, you can now divide the existing ODCR (see ODCR-1 in the following figure), which has a total capacity of ten EC2 instances, and create a new ODCR from three of the unused capacity slots.

Figure 1: Before split, ODCR-1 with original total and unused capacity

This results in the creation of two ODCRs:

  1. Original ODCR: total capacity of seven instances for the ML team
  2. New ODCR: three instances for the Analytics team

The following figure illustrates the split result:

Figure 2: After split, ODCR-1 with updated total and unused capacity, and newly created ODCR-2
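The arithmetic of this split can be sketched in a few lines of Python. This is only a toy model of the bookkeeping, not a call to the EC2 API: the `Reservation` class and `split_unused` function are hypothetical names introduced for illustration, and the numbers come from the ML and Analytics example above.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    """Toy model of an ODCR: total reserved slots and slots currently in use."""
    name: str
    total: int
    used: int

    @property
    def unused(self) -> int:
        return self.total - self.used


def split_unused(source: Reservation, count: int, new_name: str) -> Reservation:
    """Carve `count` unused slots out of `source` into a new reservation."""
    if count < 1 or count > source.unused:
        raise ValueError("count must be between 1 and the number of unused slots")
    source.total -= count
    return Reservation(name=new_name, total=count, used=0)


# ML team: ten reserved, five in use. Analytics needs three.
odcr1 = Reservation("ODCR-1", total=10, used=5)
odcr2 = split_unused(odcr1, 3, "ODCR-2")

print(odcr1, "unused:", odcr1.unused)
print(odcr2, "unused:", odcr2.unused)
```

In this model ODCR-1 ends with a total of seven (five used, two unused) and ODCR-2 starts with three unused slots, matching Figure 2.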

Sharing across accounts

The split operation creates the new ODCR in the same AWS account. If your teams operate under the same AWS account, then the split operation is direct without any further steps. However, if your teams use different AWS accounts, then you would need to use AWS Resource Access Manager (AWS RAM) to share the newly created ODCR after the split operation. This enables cross-account capacity management while maintaining centralized control.
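As a rough sketch of the cross-account step, the following Python builds the request you might pass to the AWS RAM `CreateResourceShare` operation (for example through `boto3.client("ram").create_resource_share(**request)`). The account IDs, Region, reservation ID, and share name are placeholders, and the helper functions are hypothetical. Check the parameter names against the current AWS RAM API reference before relying on them.

```python
def capacity_reservation_arn(region: str, account_id: str, reservation_id: str) -> str:
    """Build the ARN of a Capacity Reservation (all inputs here are placeholders)."""
    return f"arn:aws:ec2:{region}:{account_id}:capacity-reservation/{reservation_id}"


def build_share_request(name: str, reservation_arn: str, principals: list) -> dict:
    """Assemble keyword arguments for a RAM resource share of one reservation."""
    return {
        "name": name,
        "resourceArns": [reservation_arn],
        "principals": list(principals),  # account IDs of the consuming teams
    }


# Placeholder values: owner account 111111111111 shares with 222222222222.
arn = capacity_reservation_arn("us-east-1", "111111111111", "cr-0123456789abcdef0")
request = build_share_request("analytics-odcr-share", arn, ["222222222222"])
print(request)
```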

Refer to the AWS Documentation for more information on pre-requisites and considerations when splitting off capacity from one reservation to a new one.

Refer to the API and CLI documentation for further information on the split capability such as parameters, exceptions, and limits.

Use case 2: moving capacity between reservations

Scaling for growth

After a few days, your Analytics team needs capacity for one more instance for their expanding project, so you need to add more capacity to ODCR-2.

Move capability to the rescue

Instead of creating a new ODCR for this purpose, you can move one of the unused slots from ODCR-1 to ODCR-2. This flexibility saves you the multiple steps involved in reserving new capacity, avoids disruption to running workloads, and simplifies ODCR management. This rebalancing improves resource usage without further procurement.

Figure 3: Before move, ODCR-1 with unused capacity and ODCR-2 with current capacity

Figure 4: After move, ODCR-1 with reduced capacity and ODCR-2 with additional capacity
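The move in this use case is simple bookkeeping, sketched below in Python. This is an illustrative model rather than an API call: `move_capacity` is a hypothetical helper, and the assumption that the Analytics team is using all three of its slots in ODCR-2 is ours, not stated in the scenario.

```python
def move_capacity(source: dict, destination: dict, count: int) -> None:
    """Move `count` unused slots from `source` to `destination` in place."""
    unused = source["total"] - source["used"]
    if count < 1 or count > unused:
        raise ValueError("count must be between 1 and the source's unused slots")
    source["total"] -= count
    destination["total"] += count


# State after the earlier split: ODCR-1 has seven slots (five used),
# ODCR-2 has three slots (assumed fully used by the Analytics team).
odcr1 = {"total": 7, "used": 5}
odcr2 = {"total": 3, "used": 3}

move_capacity(odcr1, odcr2, 1)
print("ODCR-1:", odcr1)
print("ODCR-2:", odcr2)
```

After the move, ODCR-1 holds six slots and ODCR-2 holds four, which is the starting point for the modify example later in this post.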

Refer to the AWS Documentation for more information on pre-requisites and considerations when moving capacity from one reservation to another one.

Refer to the API and CLI documentation for further information on the move capability such as parameters, exceptions, and limits.

Use case 3: adjusting reservation attributes for changing workload patterns

Dynamic workload requirements

When your data processing workload patterns change significantly, you must adapt. Initially, you might have set up your ODCR with specific instance matching criteria, making it a targeted reservation for predictable workloads. However, as you introduce more dynamic, impromptu analysis projects, you need more flexibility in how instances can be launched against your reservation.

Modify feature to the rescue

Using the modify capability, you can now change the reservation’s attributes without creating a new reservation or disrupting running workloads. You can modify your ODCR by:

  • Changing instance quantity
  • Changing instance eligibility from Targeted to Open
  • Adjusting the reservation’s end date to align with your project timeline

This modification allows you to:

  • Launch new instances more flexibly without strict instance eligibility
  • Improve the usage of reserved capacity across different projects
  • Maintain cost optimization while adapting to changing business needs

The modify feature provides this flexibility while making sure that your existing workloads continue running uninterrupted, making it an invaluable tool for dynamic environments. See the following figures for an example where the instance quantity of ODCR-2 is modified from four to six:

Figure 5: Before modify, ODCR-2 with total capacity of four and instance eligibility of targeted

Figure 6: After modify, ODCR-2 with new total capacity of six and instance eligibility of open
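The change shown in Figures 5 and 6 could be expressed as a request to the EC2 `ModifyCapacityReservation` operation. The following Python sketch only assembles the keyword arguments (for example for `boto3.client("ec2").modify_capacity_reservation(**request)`). The helper name, reservation ID, and end date are hypothetical, and the parameter names reflect our understanding of the API, so verify them against the API reference.

```python
from datetime import datetime, timezone


def build_modify_request(reservation_id, instance_count=None,
                         match_criteria=None, end_date=None) -> dict:
    """Assemble keyword arguments for modifying one Capacity Reservation."""
    request = {"CapacityReservationId": reservation_id}
    if instance_count is not None:
        request["InstanceCount"] = instance_count
    if match_criteria is not None:
        if match_criteria not in ("open", "targeted"):
            raise ValueError("match_criteria must be 'open' or 'targeted'")
        request["InstanceMatchCriteria"] = match_criteria
    if end_date is not None:
        # A fixed end date implies a limited reservation.
        request["EndDateType"] = "limited"
        request["EndDate"] = end_date
    return request


# ODCR-2: grow from four to six instances and switch from targeted to open.
request = build_modify_request(
    "cr-0123456789abcdef0",           # placeholder reservation ID
    instance_count=6,
    match_criteria="open",
    end_date=datetime(2025, 12, 31, tzinfo=timezone.utc),  # placeholder date
)
print(request)
```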

Increasing the size of an ODCR or creating a new one is subject to Amazon EC2 On-Demand capacity availability. Therefore, if unused capacity is available in an existing ODCR, then moving or splitting that capacity could be a better option than modifying an ODCR.

Refer to the AWS Documentation for more information on pre-requisites and considerations when modifying Capacity Reservations.

Refer to the API and CLI documentation for further information on the modify capability such as parameters, exceptions, and limits.

Special considerations for split capacity

In the preceding sections, we saw how you can use the split capability to detach excess unused capacity to create an ODCR for another team. However, you can also use this capability to split used capacity to create new ODCRs. This capability is particularly helpful when you want to split partially used ODCRs to create a new one for easier tracking and management. Along with the considerations for splitting unused/excess capacity, the following considerations apply for splitting used capacity:

  1. The used capacity can only be split for an ODCR with open instance eligibility that isn’t shared with any account.
  2. The instances running inside the reservation are of open eligibility (in other words they are not targeting the reservation).
  3. When you split the used capacity, the eligible instances are randomly selected. You cannot specify which running instances are split. If a sufficient number of eligible instances aren’t found to fulfill the split quantity, then the split operation fails. When you specify the quantity of instances to be split, by default any unused capacity is moved first, followed by any eligible running instances (the used capacity in your reservation).
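The selection order described in these considerations (unused capacity first, then eligible running instances) can be sketched as a small planning function. This is an illustrative model only. `plan_split` and its parameters are hypothetical, and the "total capacity minus one" upper bound is taken from the internal-ODCR scenario later in this post.

```python
def plan_split(total: int, unused: int, eligible_running: int,
               quantity: int, shared: bool = False) -> tuple:
    """Return (unused_taken, running_taken) for a split of `quantity` slots.

    eligible_running counts running instances with open eligibility.
    """
    if quantity < 1 or quantity > total - 1:
        raise ValueError("quantity must leave at least one slot in the source")
    unused_taken = min(quantity, unused)        # unused capacity goes first
    running_needed = quantity - unused_taken    # remainder comes from used capacity
    if running_needed and shared:
        raise ValueError("a shared reservation can only split unused capacity")
    if running_needed > eligible_running:
        raise ValueError("not enough eligible running instances; the split fails")
    return unused_taken, running_needed


# Unshared ODCR with ten slots, eight in use (all open eligibility), split nine.
print(plan_split(total=10, unused=2, eligible_running=8, quantity=9))

# Same ODCR but shared with another account: only the two unused slots qualify.
print(plan_split(total=10, unused=2, eligible_running=8, quantity=2, shared=True))
```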

In the next sections, we review different scenarios where you can or can’t use the split capability.

Scenario 1: managing internal ODCRs (Capacity Reservation not shared with any other AWS account)

For your internal projects, when managing ODCRs that aren’t shared with external partners (other AWS accounts) and all have open instance eligibility, consider this example with ODCR-1:

  • Total capacity of ten c5.2xlarge instances, all with open instance eligibility
  • Eight instances currently in use by your ML team
  • Two unused instances

Figure 7: Before split, ODCR-1 with total capacity of 10 and 2 unused instances

This ODCR isn’t shared with any external AWS accounts, so you have maximum flexibility in splitting the reservation. You can split up to nine instances into a new reservation (total capacity minus one), regardless of how many instances are currently in use. In this scenario, you can split used as well as unused capacity. This gives you significant freedom in restructuring the capacity allocation for your internal teams.

Figure 8: After split, ODCR-1 remains with total capacity of one, and ODCR-2 with total capacity of nine with two unused capacities

Scenario 2: managing shared ODCRs with partners (Capacity Reservation shared with other AWS account)

When you need to share your ODCR with a partner’s AWS account, consider this scenario where ODCR-1 has:

  • Total capacity of ten c5.2xlarge instances
  • Eight instances in use by both your team and your partner’s team
  • Two unused instances

Figure 9: Before split, ODCR-1 shared with another AWS account

In this case, your options are more limited. ODCR-1 is shared with your partner’s AWS account, so you can only split the unused capacity (maximum of two instances). After the split, the newly created ODCR (ODCR-2) remains in your AWS account and isn’t shared with any other AWS account. This restriction helps prevent any disruption to your partner’s running workloads while still allowing for some flexibility in capacity management.

Figure 10: After split, ODCR-1 remains shared with another AWS account, and newly created ODCR-2 isn’t shared

These scenarios demonstrate important factors about capacity management in both internal and partner-shared environments. You should carefully consider the sharing status of ODCRs before planning any splits or modifications to ensure smooth operations for both your teams and your partners.

Special considerations for move capability

The move capability enables you to redistribute available (or excess) capacity between ODCRs. However, in certain cases, you can also use this capability to move used instances between ODCRs. This capability is particularly helpful if you want to merge partially used ODCRs into one for easier tracking and management. Along with the considerations for moving unused capacity, the following considerations apply for moving used capacity:

  1. Both source and destination ODCR are of open instance eligibility and in active state.
  2. The instances running inside the reservation are of open eligibility (in other words they are not targeting the reservation).
  3. Both source and destination ODCRs are owned by the same account.
  4. The source and destination ODCRs can be shared, but only with the same list of accounts when moving the used portion. This same-accounts condition doesn’t apply to the unused portion of the ODCR.

When you specify the quantity of instances to be moved, by default any unused capacity is moved first, followed by any eligible running instances (the used capacity in your reservation).
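The considerations above can be condensed into a rule of thumb for how many slots are movable between two reservations. The Python below is a simplified sketch under stated assumptions: the `Odcr` class and `max_movable` function are hypothetical, both reservations are assumed to be active, and every running instance is assumed to have open eligibility.

```python
from dataclasses import dataclass


@dataclass
class Odcr:
    """Toy model of an ODCR for reasoning about the move rules."""
    owner: str                       # owning AWS account
    total: int
    used: int
    eligibility: str = "open"        # "open" or "targeted"
    shared_with: frozenset = frozenset()  # accounts the ODCR is shared with


def max_movable(source: Odcr, destination: Odcr) -> int:
    """Upper bound on slots movable from source to destination."""
    unused = source.total - source.used
    can_move_used = (
        source.owner == destination.owner
        and source.eligibility == "open"
        and destination.eligibility == "open"
        and source.shared_with == destination.shared_with
    )
    return unused + (source.used if can_move_used else 0)


# Both unshared in Account-A: used and unused capacity can move.
internal = max_movable(Odcr("A", 10, 8), Odcr("A", 5, 5))

# Source shared with Account-B, destination unshared: only unused capacity.
one_shared = max_movable(Odcr("A", 10, 8, shared_with=frozenset({"B"})),
                         Odcr("A", 5, 5))

# Shared with different accounts (B versus C): only unused capacity.
different = max_movable(Odcr("A", 10, 8, shared_with=frozenset({"B"})),
                        Odcr("A", 5, 5, shared_with=frozenset({"C"})))

print(internal, one_shared, different)
```

These three calls correspond to the three scenarios reviewed in the following sections.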

In the next sections, we review where you can or can’t use this capability.

Scenario 1: source and destination ODCRs not shared with other account(s) (Team Transfers)

When managing capacity between your internal teams using the same AWS account (Account-A), the process is straightforward. For example, when consolidating the ML team’s resources:

  • ODCR-1 (ML Team A): ten instances total (all with open eligibility), with eight in use and two unused.
  • ODCR-2 (ML Team B): five instances (all with open eligibility), all in use.

Figure 11: Before move, ODCR-1 and ODCR-2 both in the same AWS account, unshared

Both ODCRs belong to the same account and aren’t shared externally, and both have open instance eligibility. Therefore, you can freely move all ten instances from ODCR-1 to ODCR-2, creating a unified pool of 15 instances for the consolidated ML team.

Figure 12: After moving capacity from ODCR-1, ODCR-2 has combined total capacity of 15 with 2 unused

Scenario 2: source and destination ODCRs shared with the same account(s) (External Partner Collaboration)

If your ML team (ODCR-1) collaborates with an external AI research partner (Account-B), your setup might look like the following:

  • ODCR-1: ten instances (eight used, two unused), all with open instance eligibility, shared with the research partner through AWS RAM.
  • ODCR-2: Five instances (all used), all with open instance eligibility, for internal Analytics team.

Figure 13: Before move, ODCR-1 and ODCR-2 both in the same AWS account, with ODCR-1 shared with another AWS account

When your Analytics team needs more capacity, you can only move the two unused instances from ODCR-1 to ODCR-2, as the other eight are actively used in the partner collaboration.

Figure 14: Because ODCR-1 is shared with another AWS account, only unused capacity is moved to ODCR-2

Scenario 3: source and destination ODCRs shared with different account(s) (Multi-Partner Projects)

Consider this scenario, which involves managing capacity across different partner engagements:

  • ODCR-1: Ten instances (eight used, two unused), shared with a database partner (Account-B).
  • ODCR-2: Five instances (all used), shared with a security partner (Account-C).

Figure 15: ODCR-1 and ODCR-2 are shared with different AWS accounts

Because the two ODCRs are shared with different accounts, you can only move the two unused instances from ODCR-1 to ODCR-2. This makes sure that there is no disruption to the database partner’s workloads.

Figure 16: Only unused capacity moved to ODCR-2 due to shared capacity reservations

These scenarios teach valuable lessons about capacity management in multi-account environments. You can develop a comprehensive sharing strategy that balances flexibility with partner commitments, enabling you to optimize your resource usage while maintaining strong partner relationships.

Conclusion

The new ODCR features from AWS (split, move, and modify) represent a significant advancement in cloud capacity management. For your organization, these features transform how you handle compute resources, enabling more efficient operations and cost management. The ability to dynamically adjust and share Capacity Reservations provides the flexibility you need while maintaining the stability necessary for your critical workloads.

As cloud infrastructure continues to evolve, these features demonstrate the AWS commitment to addressing real-world challenges that you face when managing complex cloud environments. If you’re looking to optimize your AWS infrastructure, then these new ODCR capabilities offer powerful tools for better capacity management and resource usage.

To enhance your understanding of these capabilities, we’ve created a GitHub repository containing APIs for implementation purposes. For more details, refer to the updated Capacity Reservations documentation. If you have any questions or feedback, feel free to share them in the comments section or contact AWS Support.

Streamlining AMI creation with EC2 Image Builder components in AWS Marketplace

Post Syndicated from Art Baudo original https://aws.amazon.com/blogs/compute/streamliningamicreationwith-ec2imagebuilder/

This post is written by Smriti Ohri, Senior Product Manager, EC2 and Omar Chehab, Senior Product Manager, AWS Marketplace.

At re:Invent 2024, Amazon Web Services (AWS) announced the availability of third-party EC2 Image Builder components in AWS Marketplace. EC2 Image Builder is a fully managed service that streamlines the customization, testing, distribution, and lifecycle management of images. You can use this new feature to procure third-party components from AWS Marketplace directly on the EC2 Image Builder console and in the AWS Marketplace website. You can add multiple of these components to create your golden images.

A golden image is a customized and pre-configured Amazon Machine Image (AMI) needed for launching Amazon Elastic Compute Cloud (Amazon EC2) instances. It includes a standardized set of software, configurations, and security settings that meet an organization’s specific requirements, promoting consistency and efficiency across all EC2 instances.

EC2 Image Builder provides Amazon managed components, and you can build your own components that help when building custom images. However, you may need third-party software to build your golden images. Procuring this software can be time-consuming and necessitates custom setup. This integration aims to address these challenges by providing the ability to add third-party software from AWS Marketplace directly while creating golden images using EC2 Image Builder. While creating the image, you can customize your image recipe to use the latest version of components published in AWS Marketplace and make sure that you always remain up to date.

This post shows you how to find, subscribe to, and incorporate components from AWS Marketplace using the EC2 Image Builder console.

Prerequisites

You must have access to subscribe to a product in AWS Marketplace. Check AWS Marketplace subscription permissions.

Solution overview

Three high-level steps are involved in using the third-party component from AWS Marketplace in EC2 Image Builder:

  1. Discover and subscribe to the third-party component on the EC2 Image Builder console.
  2. Build the golden image with the third-party component.
  3. Launch the EC2 instance using the golden image.

Solution walkthrough: Streamlining AMI creation with EC2 Image Builder components in AWS Marketplace

To implement the solution, complete the steps in the following sections.

Discover and subscribe to a component by Cribl

To discover and subscribe to the component, follow these steps:

  1. On the EC2 Image Builder console, in the navigation pane, choose Discover products. On the Components tab, you can view the list of available AWS Marketplace image products and the associated components. As shown in the following screenshot, choose View subscription options, which shows the different pricing offered.

Figure 1: Discover components on EC2 Image Builder console

  2. To subscribe to the product, from the dropdown menu choose the available offers and choose Subscribe, as shown in the following screenshot. You can now start using the associated component in your image recipe.

Figure 2: Subscribe to the product that has the component

Build the golden image with the third-party component

To use the component, you can either subscribe to it first, or you can create the pipeline and subscribe to the component later based on your preference. For this walkthrough, I already subscribed to the component. The following section shows how to create a pipeline to build a custom AMI using the component to which I subscribed. You can follow a similar process to install other components to create your golden AMIs. The high-level steps are:

  1. Create the recipe.
  2. Create the pipeline.

To create the recipe, follow these steps:

  1. On the EC2 Image Builder console, choose Image recipes and Create image recipe. A recipe has a base image and the components that you want to install on it.

For this example, Amazon Linux was chosen as the base image operating system and “Amazon Linux 2023 x86” as the image name.

  2. In the Build components section, choose Add build components and, from the dropdown, choose AWS Marketplace. Search for the component to which you subscribed and choose Add to recipe, as shown in the following screenshot.

You can choose to use the latest version or a specific version of the component. For this walkthrough, the latest available version was selected.

Figure 3: Create recipe and add components from AWS Marketplace
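The same recipe (a base image plus the components to install on it) can also be described programmatically. The Python below is a minimal sketch that assembles keyword arguments for the Image Builder `CreateImageRecipe` operation (for example for `boto3.client("imagebuilder").create_image_recipe(**request)`). The recipe name, Region, and both ARNs are placeholders, the Marketplace component ARN in particular is hypothetical, and the use of `x.x.x` to select the latest version is our assumption to verify in the Image Builder documentation.

```python
def build_recipe_request(name: str, version: str, parent_image_arn: str,
                         component_arns: list) -> dict:
    """Assemble keyword arguments for an image recipe: base image plus components."""
    return {
        "name": name,
        "semanticVersion": version,
        "parentImage": parent_image_arn,
        # Components are applied in the order listed.
        "components": [{"componentArn": arn} for arn in component_arns],
    }


# Placeholder ARNs; replace with the values shown in your console.
BASE_IMAGE = "arn:aws:imagebuilder:us-east-1:aws:image/amazon-linux-2023-x86/x.x.x"
MARKETPLACE_COMPONENT = (
    "arn:aws:imagebuilder:us-east-1:aws-marketplace:component/example-component/x.x.x"
)

request = build_recipe_request(
    "golden-image-recipe", "1.0.0", BASE_IMAGE, [MARKETPLACE_COMPONENT]
)
print(request)
```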

To create the pipeline, an automation configuration in which you define the infrastructure configuration, image workflows, and distribution configuration, follow these steps:

  1. On the EC2 Image Builder console, choose Image pipelines and Create image pipeline. Provide the name of the pipeline and choose a Build schedule. You can also enable scanning, which scans your AMIs for Common Vulnerabilities and Exposures (CVEs) using Amazon Inspector.

For more information, refer to Amazon Inspector integration in Image Builder in the EC2 Image Builder User Guide. For this example, image scanning is enabled and the option to manually trigger the pipeline was selected.

Figure 4: Create the pipeline with the recipe and other configurations

  2. Choose the recipe you created with third-party components from AWS Marketplace.
  3. Choose the image workflows for the image creation process and define infrastructure configurations for creating the image.

You can choose Dedicated Host, Dedicated Instance, or Shared Tenancy. By default, it uses Shared Tenancy. For this example, the default configuration was selected. I chose the c5.large instance type since that is the supported instance type for this component.

Figure 5: Select the supported instance type in the infrastructure configurations

  4. Provide the distribution configuration details to share or copy the output image to other accounts and in other AWS Regions.

To allow these accounts to use any component from AWS Marketplace, you must share license entitlements with these accounts using AWS License Manager. Instructions for sharing license entitlements are outside the scope of this post. To learn more, refer to Associating licenses with AMI based products using AWS License Manager.

  5. Choose the pipeline that you created and choose Run pipeline. After a while, the image is created and ready to use.

Run the EC2 instance using the golden image

Create an EC2 instance with the output golden image. You can also view the product code stamped on the AMIs, as shown in the following figure.

Figure 6: View the output image to check the product code

Conclusion

This feature helps you save time and automate the process of using the latest versions of the software. With this integration, you get a diverse set of software components from verified sellers in AWS Marketplace to address the monitoring, security, governance, and compliance needs of your organization. You can learn more about these components in the documentation. Visit AWS Marketplace to view all supported EC2 Image Builder components.

If you’re an AWS Partner, then you can publish your software as components in AWS Marketplace to cater to your customers. To learn more about onboarding your software to AWS Marketplace, visit this blog post. You can reach out to [email protected] if you have questions about this new feature or the publishing process.

Start building your custom AMIs using components from Marketplace today.

AWS Weekly Roundup: Cloud Club Captain Applications, Formula 1®, Amazon Nova Prompt Engineering, and more (Feb 24, 2025)

Post Syndicated from Elizabeth Fuentes original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-cloud-club-captain-applications-formula-1-amazon-nova-prompt-engineering-and-more-feb-24-2025/

AWS Developer Day 2025, held on February 20th, showcased how to integrate responsible generative AI into development workflows. The event featured keynotes from AWS leaders including Srini Iragavarapu, Director Generative AI Applications and Developer Experiences, Jeff Barr, Vice President of AWS Evangelism, David Nalley, Director Open Source Marketing of AWS, along with AWS Heroes and technical community members. Watch the full event recording on Developer Day 2025.

Cloud Club

Applications are now open through March 6th for the 2025 AWS Cloud Clubs Captains program. AWS Cloud Clubs are student-led groups for post-secondary and independent students, 18 years old and over. Find a club near you on our Meetup page.

Last week’s launches
Here are some launches that got my attention:

Amplify Hosting announces support for IAM roles for server-side rendered (SSR) applications – AWS Amplify Hosting now supports AWS Identity and Access Management (IAM) roles for SSR applications, enabling secure access to AWS services without managing credentials manually. Learn more in the IAM Compute Roles for Server-Side Rendering with AWS Amplify Hosting blog.

AWS WAF enhances Data Protection and logging experience – AWS WAF expands its Data Protection capabilities, allowing sensitive data in logs to be replaced with cryptographic hashes (e.g. ‘ade099751d2ea9f3393f0f’) or a predefined static string (‘REDACTED’) before logs are sent to WAF Sample Logs, Amazon Security Lake, Amazon CloudWatch, or other logging destinations.

Announcing AWS DMS Serverless comprehensive premigration assessments – AWS Database Migration Service Serverless (AWS DMS Serverless) now supports premigration assessments for replications to identify potential issues before database migrations begin. The tool analyzes source and target databases, providing recommendations for optimal DMS settings and best practices.

Amazon ECS increases the CPU limit for ECS tasks to 192 vCPUs – Amazon Elastic Container Service (Amazon ECS) now supports CPU limits of up to 192 vCPU for ECS tasks deployed on Amazon Elastic Compute Cloud (Amazon EC2) instances, an increase from the previous 10 vCPU limit. This enhancement allows customers to more effectively manage resource allocation on larger Amazon EC2 instances.

AWS Network Firewall introduces automated domain lists and insights – AWS Network Firewall now provides automated domain lists and insights by analyzing 30 days of HTTP/S traffic. This helps create and maintain allow-list policies more efficiently, at no extra cost.

AWS announces Backup Payment Methods for invoices – AWS now enables you to set up backup payment methods that automatically activate if primary payment fails. This helps prevent service interruptions and reduces manual intervention for invoice payments.

Stay up to date with all AWS announcements on the What’s New with AWS? page.

Other AWS news
Here are additional noteworthy items:

AWS Partner Network: Essential training resources for ISV partners – To help scale solutions effectively, AWS provides essential training resources for Independent Software Vendor (ISV) partners in four key areas: AWS Marketplace fundamentals, Foundational Technical Review (FTR), APN Customer Engagement (ACE) program and co-selling, and Partner funding opportunities.

How Formula 1® uses generative AI to accelerate race-day issue resolution – Formula 1® (F1) uses Amazon Bedrock to speed up race-day issue resolution, reducing troubleshooting time from weeks to minutes through a chatbot that analyzes root causes and suggests fixes.

Reducing hallucinations in LLM agents with a verified semantic cache using Amazon Bedrock Knowledge Bases – This blog introduces a solution using Amazon Bedrock Knowledge Bases and Amazon Bedrock Agents to reduce large language model (LLM) hallucinations by implementing a verified semantic cache that checks queries against curated answers before generating new responses, improving accuracy and response times.

Orchestrate an intelligent document processing workflow using tools in Amazon Bedrock – This blog demonstrates an intelligent document processing workflow using Amazon Bedrock tools that combines Anthropic’s Claude 3 Haiku for orchestration and Anthropic’s Claude 3.5 Sonnet (v2) for analysis to handle structured, semi-structured, and unstructured healthcare documents efficiently.

From community.aws
Here are my personal favorite posts from community.aws:

Tracing Amazon Bedrock Agents – Learn how to track and analyze Amazon Bedrock Agents workflows using AWS X-Ray for better observability, by Randy D.

Testing Amazon ECS Network Resilience with AWS FIS – This article demonstrates how to test network resilience in Amazon ECS using AWS FIS with guidance from Amazon Q Developer, by Sunil Govindankutty.

Stop Using Default Arguments in AWS Lambda Functions – Discover why your AWS Lambda costs might be spiralling out of control due to a common Python programming practice, by Stuart Clark.

Amazon Nova Prompt Engineering on AWS: A Field Guide by Brooke – A field guide for using Amazon Nova models, covering prompt engineering patterns and best practices on AWS, by Brooke Jamieson.

Creating Deployment Configurations for EKS with Amazon Q – Amazon Q Developer helps create EKS deployments by providing templates and best practices for Kubernetes configs, by Ricardo Tasso.

Processing WhatsApp Multimedia with Amazon Bedrock Agents: Images, Video, and Documents – I invite you to read my latest blog, which explains how to create a WhatsApp AI assistant using Amazon Bedrock and Amazon Nova models to process multimedia content such as images, videos, documents, and audio.

Upcoming AWS events
Check your calendars and sign up for these upcoming AWS events:

AWS GenAI Lofts – GenAI Lofts offer collaborative spaces and immersive experiences for startups and developers. You can join in-person GenAI Loft San Francisco events such as Hands-on with Agentic Graph RAG Workshop (February 25), Unstructured Data Meetup SF (February 26 – 27) and AI Tinkerers – San Francisco – February 2025 Demos + Science Fair (February 27 – 28). GenAI Loft Berlin has events and workshops on February 24 to March 7 that you can’t miss!

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Milan, Italy (April 2), Bay Area – Security Edition (April 4), Timișoara, Romania (April 10), and Prague, Czech Republic (April 29).

AWS Innovate: Generative AI + Data – Join a free online conference focusing on generative AI and data innovations. Available in multiple geographic regions: APJC and EMEA (March 6), North America (March 13), Greater China Region (March 14), and Latin America (April 8).

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Paris (April 9), Amsterdam (April 16), London (April 30), and Poland (May 5).

AWS re:Inforce – AWS re:Inforce (June 16–18) in Philadelphia, PA is our annual learning event devoted to all things AWS cloud security. Registration opens in March, so be ready to join more than 5,000 security builders and leaders.

Create your AWS Builder ID and reserve your alias. Builder ID is a universal login credential that gives you access–beyond the AWS Management Console–to AWS tools and resources, including over 600 free training courses, community features, and developer tools such as Amazon Q Developer.

You can browse all upcoming in-person and virtual events.

That’s all for this week. Stay tuned for next week’s Weekly Roundup!

Eli

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Automating Notifications for Future-Dated Amazon EC2 Capacity Reservation State Changes

Post Syndicated from aostan original https://aws.amazon.com/blogs/compute/automating-notifications-for-future-dated-amazon-ec2-capacity-reservation-state-changes/

This post is written by Ballu Singh, Principal Solutions Architect at AWS, Sandeep Rohilla, Senior Solutions Architect at AWS and Pranjal Gururani, Senior Solutions Architect at AWS.

AWS customers can proactively reserve future-dated Amazon EC2 On-Demand Capacity Reservations (known as future-dated CRs) to get capacity assurance for workloads and events. Because reservations can be created weeks in advance, customers need a way to monitor their real-time status. Future-dated CRs can transition through various Capacity Reservation states, based on capacity availability. Over the course of several weeks, the status of a reservation might change only once or twice. Even with this low frequency of changes, most customers don’t want to manually poll the state of their reservations. Instead, they want to be proactively notified when something changes.

Amazon EventBridge is a serverless event bus service that facilitates the routing of events. EventBridge rules establish how events are processed, allowing you to filter events and route them to one or more targets for action. Amazon Simple Notification Service (SNS) is a fully managed messaging service that enables you to send messages or notifications to a variety of endpoints, facilitating scalable and decoupled communication between different components of your application or with end users.

In this blog post, we guide you through the process of automating notifications for future-dated CR state changes using AWS services such as Amazon EventBridge and Amazon Simple Notification Service (SNS).

Pre-requisites

  • Existing future-dated Capacity Reservation. To learn how to create future-dated CRs, visit our blog post here.
  • IAM access to create Amazon EventBridge rules
  • IAM access to create Amazon SNS topic and subscription

Architecture

Amazon EC2 continuously monitors the state of your Capacity Reservations and sends events to Amazon EventBridge when Capacity Reservation states change. Using Amazon EventBridge, you can create rules that trigger notifications through Amazon SNS in response to these events. Amazon SNS then pushes these notifications to a variety of supported endpoint types such as Amazon Data Firehose, Amazon Simple Queue Service (SQS), AWS Lambda, HTTP, email, mobile push notifications, and mobile text messages (SMS). In our example, we will send a notification to an email address.

Send notification to email address

Walkthrough

To automate the notification process for future-dated CR state changes, we will use the following AWS services:

  1. Amazon Simple Notification Service (SNS): We will use SNS to send email notifications to the designated recipients.
  2. Amazon EventBridge: We will create an EventBridge rule to capture the state change events of future-dated CRs.

AWS Console Walkthrough

Step 1: Set Up Amazon SNS Topics and Subscriptions

  • Create a new topic for future-dated CR state change notifications.
    • Navigate to the Amazon SNS console. On the Topics page, choose Create topic.
    • In the Details section, for Type, choose Standard and enter a name for the topic (for example, Subscriber).

Set up amazon sns topics and subscriptions

    • Keep the rest of the details as is. Choose Create topic.
  • Create a subscription for each desired recipient, such as an email address, to receive the notifications.
    • Navigate to the Amazon SNS console. In the left navigation pane, choose Subscriptions. Choose Create subscription.
    • On the Create subscription page, in the Details section, for Topic ARN, choose the Amazon Resource Name (ARN) of the topic you created above. For Protocol, choose Email or Email-JSON.
    • For Endpoint, enter your email address. Keep the rest of the details as is.

create subscription for the desired recipients

    • Choose Create subscription
    • Navigate to your email and choose Confirm subscription in the email from Amazon SNS.
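
The same Step 1 setup can be scripted. The following is a minimal sketch, assuming the AWS SDK for Python (boto3) and configured credentials; the helper name `create_topic_with_email` is hypothetical and not part of the original walkthrough. It takes the SNS client as an argument so it can be exercised without touching a real account.

```python
def create_topic_with_email(sns, topic_name, email):
    """Create a Standard SNS topic and subscribe one email address to it.

    `sns` is expected to behave like boto3.client("sns").
    """
    # create_topic returns the ARN of the topic; Standard is the default type.
    topic_arn = sns.create_topic(Name=topic_name)["TopicArn"]

    # The subscription stays pending until the recipient chooses
    # "Confirm subscription" in the email that Amazon SNS sends.
    subscription = sns.subscribe(
        TopicArn=topic_arn,
        Protocol="email",
        Endpoint=email,
        ReturnSubscriptionArn=True,
    )
    return topic_arn, subscription["SubscriptionArn"]
```

With boto3 installed, a call such as `create_topic_with_email(boto3.client("sns"), "Subscriber", "you@example.com")` would mirror the console steps above; the email confirmation step is still manual.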

Step 2: Create an Amazon EventBridge Rule

  • Next, open the Amazon EventBridge console and navigate to Rules.
  • Choose Create rule and provide a name (for example, EC2CapacityReservationStateChange) and a description for the rule. Keep all other settings as they are. Select
  • For Event source, choose AWS events or EventBridge partner events.
  • In the Creation method section, for Method, choose Use pattern form.
  • For Event source, choose AWS services. Under AWS Service, choose EC2 and then choose EC2 Capacity Reservation.
  • Optionally, you can narrow down the state and reservation ID you want to be alerted for by selecting Specific Capacity Reservation State and Specific Capacity reservation ID. Select Next.

Create an amazon eventbridge rule

  • In the Target section, under Target Type, select AWS Service.
  • For Select Target, choose SNS topic. Under Topic, select the SNS topic you created in Step 1. Select
  • On the Configure Tags page, select Next.
  • On the Review and create page, select Create Rule.
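
Step 2 can also be sketched with boto3. This is an illustration under assumptions, not the exact rule the console creates: the helper names are hypothetical, and the `detail-type` string and the field names inside `detail` are deliberately left to the caller, who should copy them from the event pattern the console previews. Unlike the console, the API does not add a resource policy to the topic for you, so the sketch includes one that lets EventBridge publish.

```python
import json


def build_event_pattern(detail_type, detail_filters=None):
    """Build an EventBridge event pattern for an Amazon EC2 event.

    detail_type and the keys of detail_filters are supplied by the caller;
    copy them from the pattern shown in the EventBridge console.
    """
    pattern = {"source": ["aws.ec2"], "detail-type": [detail_type]}
    if detail_filters:
        # Each filter maps a field name to the list of values to match.
        pattern["detail"] = {k: list(v) for k, v in detail_filters.items()}
    return pattern


def topic_policy_for_eventbridge(topic_arn):
    """Resource policy that allows EventBridge to publish to the topic."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "events.amazonaws.com"},
                "Action": "sns:Publish",
                "Resource": topic_arn,
            }
        ],
    }


def create_rule_with_sns_target(events, rule_name, pattern, topic_arn):
    """Create an enabled rule and attach the SNS topic as its only target.

    `events` is expected to behave like boto3.client("events").
    """
    rule_arn = events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(pattern),
        State="ENABLED",
    )["RuleArn"]
    response = events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "sns-email", "Arn": topic_arn}],
    )
    if response.get("FailedEntryCount", 0):
        raise RuntimeError(f"put_targets failed: {response['FailedEntries']}")
    return rule_arn
```

The policy could be applied with `sns.set_topic_attributes(TopicArn=topic_arn, AttributeName="Policy", AttributeValue=json.dumps(policy))`. Note that this call replaces the topic's existing policy, so merge statements first if the topic already has custom permissions.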

CloudFormation Walkthrough

To simplify the setup process and make it easier for you to implement the automated notification solution, we have provided a CloudFormation template. This template automates the creation of the necessary Amazon EventBridge rule and Amazon SNS topic, along with the required configurations and permissions.

  1. Download the sample YAML template.
  2. Navigate to the AWS CloudFormation console and choose Create stack.
  3. Choose Upload a template file and select the downloaded template. Choose Next.
  4. Provide a name for the stack under Stack Name. For CapacityReservationId, enter the ID of the EC2 Capacity Reservation to monitor (for example, cr-1234567890abcdef0). For EmailAddress, enter the email address you want to subscribe to the SNS topic. For MonitoredStates, enter a comma-separated list of Capacity Reservation states to monitor (for example, failed, expired, cancelled, pending). Choose Next.

CloudFormation walkthrough

5. On Configure stack options, keep defaults. Choose

6. On Review and create page, choose
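
If you prefer to deploy the template from a script, the stack creation can be sketched with boto3 as follows. The parameter keys (CapacityReservationId, EmailAddress, MonitoredStates) come from the steps above; the helper names are hypothetical, and whether the template needs extra options such as `Capabilities` depends on its contents, which this sketch does not assume.

```python
def stack_parameters(reservation_id, email, states):
    """Build the Parameters list for the sample template."""
    return [
        {"ParameterKey": "CapacityReservationId", "ParameterValue": reservation_id},
        {"ParameterKey": "EmailAddress", "ParameterValue": email},
        # The template expects the monitored states as one comma-separated string.
        {"ParameterKey": "MonitoredStates", "ParameterValue": ",".join(states)},
    ]


def deploy_stack(cfn, stack_name, template_body, parameters):
    """Create the stack and return its ID.

    `cfn` is expected to behave like boto3.client("cloudformation").
    """
    response = cfn.create_stack(
        StackName=stack_name,
        TemplateBody=template_body,
        Parameters=parameters,
    )
    return response["StackId"]
```

After the call returns, `cfn.get_waiter("stack_create_complete").wait(StackName=stack_name)` can block until the stack has finished creating.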

Clean up

To avoid ongoing charges, delete the resources you created while following this post if you no longer need them.

If you followed the AWS Console walkthrough, follow these steps:

  1. Delete the Amazon EventBridge Rule
    1. Navigate to the Amazon EventBridge console and choose Rules from the left pane.
    2. Choose the rule you created earlier and click on Delete.
    3. Confirm the deletion by clicking Delete in the confirmation dialog.
  2. Delete the Amazon SNS Topic and Subscriptions
    1. Navigate to the Amazon SNS console and choose Topics from the left pane.
    2. Choose the topic you created earlier and click on Delete.
    3. Confirm the deletion by clicking Delete in the confirmation dialog.

If you created any subscriptions for the topic (e.g., email or SMS), they will be automatically deleted along with the topic.
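
The console clean-up steps above can be sketched with boto3 as well. The helper name is hypothetical. One detail differs from the console flow: through the API, a rule's targets have to be removed before the rule itself can be deleted.

```python
def delete_rule_and_topic(events, sns, rule_name, topic_arn):
    """Remove the EventBridge rule (and its targets), then the SNS topic.

    `events` and `sns` are expected to behave like boto3.client("events")
    and boto3.client("sns").
    """
    # Look up the rule's targets and detach them first.
    targets = events.list_targets_by_rule(Rule=rule_name)["Targets"]
    if targets:
        events.remove_targets(Rule=rule_name, Ids=[t["Id"] for t in targets])
    events.delete_rule(Name=rule_name)

    # Deleting the topic also removes its subscriptions.
    sns.delete_topic(TopicArn=topic_arn)
```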

If you deployed the CloudFormation template, follow these steps:

Conclusion

In this blog post, we walked you through setting up an EventBridge rule to capture the state change events of your future-dated CRs and configuring SNS to send notifications to the designated recipients. This automated approach eliminates the need for manual monitoring and ensures that you stay informed about the status of your capacity reservations.

By proactively managing your future-dated CRs through automated notifications, you can make informed decisions, adjust your reservation plans if needed, and take corrective actions to ensure you have the necessary resources available for your critical events.

This solution enhances your operational efficiency, reduces the risk of capacity shortages, and allows you to focus on other important aspects of your business.

We encourage you to implement this automated notification system for your future-dated Amazon EC2 On-Demand Capacity Reservations and experience the benefits of streamlined monitoring and proactive capacity management.

Enhance the resilience of critical workloads by architecting with multiple AWS Regions

Post Syndicated from John Formento original https://aws.amazon.com/blogs/architecture/enhance-the-resilience-of-critical-workloads-by-architecting-with-multiple-aws-regions/

In this post, we will share how you can use multi-Region as an architectural approach to achieve higher resilience on Amazon Web Services (AWS). This approach relies on first operating a workload across multiple Availability Zones within an AWS Region, before expanding to achieve even higher resilience by using multiple Regions. This is because within a Region there are multiple Availability Zones, which are physically separated by many miles but still close enough together (60 miles or less) to allow for single-digit millisecond latency. Each Availability Zone features one or more data centers, each housed in its own facility with its own redundant networking, connectivity, and power. Availability Zones provide fundamental building blocks that can help you achieve your resilience goals for your applications. First, you can benefit from the separation between Availability Zones by using Zonal services to specify which Availability Zone a resource is in, such as an Amazon Elastic Compute Cloud (Amazon EC2) instance. This means that if you build your application with redundant replicas of your application resources in each Availability Zone, you can gain excellent resilience to infrastructure events impacting any one Availability Zone.

A multi-Region approach is a reliable way to achieve a bounded recovery time for critical applications in the rare event of a service failure in a Region that is impacting your application. Each Region has strict logical and physical separation from other Regions. This purposeful design helps avoid service and infrastructure disruptions in one Region affecting another Region. This unique property of Regions can be used to build multi-Region applications with predictable fault domains.

While a multi-Region approach can improve your application’s resilience to failures, it can be challenging to build and operate such an application. It requires careful work to take advantage of the isolation between Regions, with care taken to not remove this isolation benefit at the application level. For example, if you fail over an application between Regions, you need to maintain strict separation between your application stacks in each Region, be aware of all the application dependencies, and fail over all parts of the application together. This kind of system requires planning and coordination amongst many engineering and business teams, especially with a complex, microservices-based architecture that could have several dependencies between applications.

If you’re replicating data between Regions using an asynchronous approach, you should be aware of the risk that not all your data has been replicated to the standby Region when you fail over. Because there’s a finite time needed to copy data over between Regions, data might be out of sync between the primary and standby Regions. If you use a synchronously replicated database across Regions to support your applications running from more than one Region concurrently, you avoid issues with data being out of sync when starting your application in the new Region. However, this introduces higher latency characteristics into your application’s resources. This is because writes need to commit to more than one Region, and the Regions can span hundreds or thousands of miles from one another. This latency characteristic needs to be accounted for in your application design. In addition, synchronous replication can increase the chance for correlated failures because writes need to be committed to more than one Region to be successful. If there is an impairment within one Region, you’ll need to form a quorum for writes to be successful, which typically involves having your database in three Regions and having a quorum of two out of three.
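
The two-out-of-three quorum described above, and its effect on write latency, can be illustrated with a small sketch. The function names and the latency figures in the usage note are made up for illustration and do not describe any specific AWS database.

```python
def quorum_size(regions):
    """Smallest majority of `regions` replicas (2 of 3, 3 of 5, ...)."""
    return regions // 2 + 1


def write_commits(acks, regions=3):
    """A synchronous write succeeds only if a majority of Regions acknowledge it."""
    return acks >= quorum_size(regions)


def commit_latency_ms(ack_latencies_ms):
    """Latency at which a quorum write commits.

    The write commits when the quorum-th fastest acknowledgement arrives,
    so the slowest minority of Regions does not delay it.
    """
    needed = quorum_size(len(ack_latencies_ms))
    return sorted(ack_latencies_ms)[needed - 1]
```

For example, with hypothetical acknowledgement latencies of 2 ms (local Region), 70 ms, and 140 ms, `commit_latency_ms([2, 70, 140])` is 70: every write pays at least one cross-Region round trip, and with one Region impaired, `write_commits(2)` still holds while `write_commits(1)` does not.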

Finally, you need to practice the failover and simulate Region impairments to know that it works when you need it. It’s a substantial time and resource investment to regularly rotate your application between Regions to practice failover, but it’s a recommended practice if you plan to build a multi-Region application.

Given these additional considerations when implementing a multi-Region approach, for most AWS customers, multi-AZ is the right approach for building and operating resiliently in the cloud. This approach helps mitigate most infrastructure failures, which are usually contained within an Availability Zone. A multi-Region approach is most common in the following scenarios.

Meet regulatory and compliance requirements and enhance disaster recovery capabilities

Regulated industries like financial services and healthcare and life sciences can require that applications be multi-Region. Healthcare providers and pharmaceutical companies, for example, often deploy electronic health records (EHR), clinical trial management systems, and other applications across multiple Regions for enhanced data redundancy, disaster recovery, and compliance with regional data privacy regulations (like HIPAA in the US or GDPR in the EU). Epic on AWS, for example, is typically deployed across multiple Availability Zones and multiple Regions to increase the resilience of customers’ EHR and integrated application environment, making full use of the resources and geographic diversity of the AWS Cloud.

Banks and financial institutions, including Fidelity and Vanguard, also deploy many of their core trading and investment platforms and customer-facing applications across multiple Regions for enhanced business continuity and compliance with local data protection regulations.

Achieve a bounded recovery time to support highly available business-critical workloads

With growing demand for always-on applications and services, companies are increasingly reliant on cloud-based services and infrastructure for day-to-day operations and business continuity. While a single Region supports highly available and resilient applications, distributing workloads across multiple Regions enables a bounded recovery time in the rare event of a disruption to the application. The physical and logical separation of Regions provides a well-defined fault isolation boundary that you can use to create predictable fault boundaries for your applications. If your application experiences issues in one Region, the workloads can continue operating in another Region, which minimizes downtime for customers and users.

Streaming platforms like Netflix, NBCUniversal, and Disney, for example, deploy their content delivery networks (CDNs) and video streaming infrastructure across multiple Regions to provide a seamless media experience for their customers. In many cases, video streaming and video gaming companies deploy their infrastructure across multiple Regions to offer lower-latency gaming experiences for players worldwide.

Automotive companies such as Honda deploy their connected vehicle platforms across multiple Regions to scale globally. They use geo-location routing that identifies the closest broker the vehicle should communicate with based on customer-configured rules that govern how vehicles connect to the cloud infrastructure. This allows them to reliably connect millions of vehicles to the cloud while supporting high availability.

Conclusion

No matter the industry or scenario, AWS is the definitive choice for organizations that want to build and run highly available, resilient applications in the cloud, with resilience built into its infrastructure, operational models, and comprehensive capabilities across Regions. To learn how to choose between the different options for building resilience into your application, see the Well-Architected reliability pillar, and for a detailed framework for choosing multi-Region, see AWS Multi-Region Fundamentals.


AWS Weekly Roundup: New AWS Mexico (Central) Region, simultaneous sign-in for multiple AWS accounts, and more (January 20, 2025)

Post Syndicated from Esra Kayabali original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-new-aws-mexico-central-region-simultaneous-sign-in-for-multiple-aws-accounts-and-more-january-20-2025/

As winter maintains its hold over where I live in the Netherlands, rare moments of sunlight become precious gifts. This weekend offered one such treasure: while cycling along a quiet canal, golden rays broke through the typically gray Dutch sky, creating a perfect moment of serenity. These glimpses of brightness feel particularly special during January, when daylight can be scarce in our corner of Europe. As we move deeper into 2025, the third week of the new year brings both reflection and forward momentum. While global conversations swirl around technological advancements, it’s these small, personal moments that remind us to pause and appreciate the simple pleasures amid our rapidly evolving world.

Let’s look at last week’s new announcements.

Last week’s launches
Here are the launches that got my attention.

AWS Mexico (Central) Region – In February 2024, we announced plans to expand infrastructure in Mexico, and we’ve now launched the AWS Mexico (Central) Region with three Availability Zones and API code mx-central-1. This marks the first AWS infrastructure Region in Mexico and adds to our growing presence in Latin America. The new Region provides you with local workload management, data storage capabilities, enhanced performance with lower latency, and robust security standards. It features advanced cloud technologies, including cutting-edge artificial intelligence and machine learning (AI/ML) capabilities with purpose-built processors, and comprehensive security capabilities with support for 143 security standards and compliance certifications. With this launch, AWS now spans 114 Availability Zones within 36 geographic Regions.

AWS Management Console now supports simultaneous sign-in for multiple AWS accounts – Using multi-session capability in the AWS Management Console, you can now sign in to multiple AWS accounts and manage your resources in a single browser. You can sign in to up to 5 sessions, and this can be any combination of root, AWS Identity and Access Management (IAM), or federated roles in different accounts or in the same account. You can scale your applications using multiple accounts following AWS best-practice guidelines. You can use accounts for different environments, such as development, testing, and production, and compare resource configurations and status across multiple accounts for troubleshooting application issues and other application-related jobs.

Introducing new larger sizes on Amazon EC2 Flex instances – We’re announcing the general availability of two new larger sizes (12xlarge and 16xlarge) on Amazon Elastic Compute Cloud (Amazon EC2) Flex (C7i-flex and M7i-flex) instances. The new sizes expand the EC2 Flex portfolio, providing additional compute options to scale up existing workloads or run larger-sized applications that need additional memory. These instances are powered by custom 4th Gen Intel Xeon Scalable processors, which are available only on AWS, and offer up to 15% better performance over comparable x86-based Intel processors used by other cloud providers. Flex instances are the easiest way to get price performance benefits and lower prices for a majority of compute-intensive and general-purpose workloads. They deliver up to 19% better price performance than comparable previous generation instances and are a great first choice for applications that don’t fully utilize the compute resources. Flex instances are ideal for web and application servers, batch processing, enterprise applications, databases, and more. For compute-intensive and general-purpose workloads that need even larger instance sizes (up to 192 vCPUs and 768 GiB memory) or continuous high CPU usage, you can use Amazon EC2 C7i and M7i instances.

Announcing AWS User Notifications general availability on AWS CloudFormation – You can use AWS User Notifications to configure notifications to be sent using the AWS Management Console Notifications Center, email, AWS Chatbot, or mobile push notifications to the AWS Console Mobile App to keep you informed about important events such as Amazon CloudWatch alarms. With this capability, you can define notification configurations as part of your infrastructure-as-code (IaC) practices and specify notification configurations for specific resource types within your AWS CloudFormation templates. For example, you can set up notifications to trigger when an Amazon EC2 Auto Scaling group scales out, an Elastic Load Balancing (ELB) load balancer is provisioned, or an Amazon Relational Database Service (Amazon RDS) database is modified. You have granular control over which events will trigger notifications and who should receive them.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

We launched existing services and instance types in additional Regions:

Other AWS events
Check your calendar and sign up for upcoming AWS events.

AWS Summits are free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Stay updated by visiting the official AWS Summit website and sign up for notifications to learn when registration opens for events in your area.

AWS GenAI Lofts are collaborative spaces and immersive experiences that showcase AWS expertise in cloud computing and AI. They provide startups and developers with hands-on access to AI products and services, exclusive sessions with industry leaders, and valuable networking opportunities with investors and peers. Find a GenAI Loft location near you and don’t forget to register.

Browse all upcoming AWS led in-person and virtual events here.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Esra

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

New Amazon EC2 High Memory U7inh instance on HPE Server for large in-memory databases

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/new-amazon-ec2-high-memory-u7inh-instance-on-hpe-server-for-large-in-memory-databases/

Today we’re announcing the general availability of the Amazon Elastic Compute Cloud (Amazon EC2) U7inh instance, a new addition to the EC2 High Memory family, built in collaboration with Hewlett Packard Enterprise (HPE). The Amazon EC2 U7inh instance runs on the 16-socket HPE Compute Scale-up Server 3200 and is built on the AWS Nitro System to deliver a fully integrated and managed experience consistent with other EC2 instances.

Powered by the fourth generation Intel® Xeon® Scalable processors (Sapphire Rapids), U7inh instance supports 32 TB of memory and 1920 vCPUs. This instance offers the highest compute performance, largest compute and memory size in the Amazon Web Services (AWS) Cloud for running large, mission-critical database workloads, like SAP HANA.

In May 2024, we launched U7i instances to support up to 896 vCPUs and up to 32 TB of memory, which our enterprise customers could use to successfully migrate their large mission-critical in-memory databases to AWS and benefit from the flexibility, scalability, reliability, and cost advantages that AWS offers.

As customers continue to scale their business applications, they wanted the performance combined with the additional CPUs and memory along with SAP certification to generate real-time business insights. Other customers that currently run on-premises with HPE servers have also asked how we can help them migrate to AWS to take advantage of cloud benefits while continuing to use HPE hardware.

Here are the detailed specs of the new U7inh instance:

Instance name | vCPUs | Memory (DDR5) | EBS bandwidth | Network bandwidth
U7inh-32tb.480xlarge | 1920 | 32,768 GiB | 160 Gbps | 200 Gbps

The U7inh instance offers up to twice the vCPUs and 1.6 times the EBS bandwidth in a single instance, compared with the largest U7i instance. You can run your largest in-memory database workloads like SAP HANA or seamlessly migrate workloads running on HPE hardware to AWS.

The U7inh instance supports Amazon Linux, Red Hat Enterprise Linux, and SUSE Linux Enterprise Server. Operating system support for SAP HANA workloads on High Memory instances includes SUSE Linux Enterprise Server 15 SP3 for SAP and above, and Red Hat Enterprise Linux 8.6/9.0 for SAP and above.

The U7inh instance is SAP certified to run Business Suite on HANA (SoH), Business Suite S/4HANA, Business Warehouse on HANA (BW), and SAP BW/4HANA in production environments. The U7inh instance is also certified for scale-out SAP HANA OLTP workloads such as S/4HANA, and customers can deploy up to four U7inh instances (128 TB) in a cluster for even larger SAP HANA workloads.

To learn more about how to migrate, visit Migrating SAP HANA on AWS to an EC2 High Memory Instance in the SAP HANA on AWS Guides and AWS Launch Wizard for SAP in the AWS Launch Wizard User Guide.

Now available
Amazon EC2 U7inh instance is available in the US East (N. Virginia) and US West (Oregon) AWS Regions.

To learn more, visit the U7i instance product page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy

AWS Weekly Roundup: Amazon EC2 F2 instances, Amazon Bedrock Guardrails price reduction, Amazon SES update, and more (December 16, 2024)

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-ec2-f2-instances-amazon-bedrock-guardrails-price-reduction-amazon-ses-update-and-more-december-16-2024/

The week after AWS re:Invent builds on the excitement and energy of the event and is a good time to learn more and understand how the recent announcements can help you solve your challenges. As usual, we have you covered with our top announcements of AWS re:Invent 2024 post.

You can now watch keynotes and sessions on the AWS Event YouTube channel. This year Andy Jassy, now President and CEO at Amazon, returned to re:Invent and shared some thoughts in these videos.

Drawing on experiences Amazon has had building distributed systems at massive scale, Werner Vogels, VP and CTO at Amazon, shared critical lessons and strategies he has learned for managing complex systems in his keynote.

Last week’s launches
Here are the launches that got my attention.

Amazon Elastic Compute Cloud (Amazon EC2) – A new generation of FPGA-powered instances (F2) is now available. In contrast to a purpose-built chip designed with a single function in mind and then hard-wired to implement it, a field programmable gate array (FPGA) can be programmed in the field, after it has been plugged in to a socket on a PC board. We’re also introducing Amazon EC2 High Memory U7i instances with 6TiB and 8TiB of memory. U7i instances are ideal to run large in-memory databases such as SAP HANA, Oracle, and SQL Server. Graviton-based 8th generation instances now support bandwidth configurations for Amazon VPC and Amazon EBS.

Amazon Bedrock Guardrails – We are reducing pricing by up to 85% to help you implement safeguards for your generative AI applications. Also, we’re adding multilingual capabilities with support for Spanish and French languages.

Amazon Simple Email Service (Amazon SES) – Now offers Global Endpoints for multi-Region sending resilience and announces the availability of Deterministic Easy DKIM (DEED), a new form of global identity that simplifies DomainKeys Identified Mail (DKIM) management.

AWS CloudFormation – An enhanced version of the AWS Secrets Manager transform introducing automatic AWS Lambda upgrades.

Amazon Lex – Launches new multilingual streaming speech recognition models that enhance recognition accuracy through two specialized groupings: a European-based model (for Portuguese, Catalan, French, Italian, German, and Spanish) and an Asia Pacific-based model (for Chinese, Korean, and Japanese).

Amazon Connect – Now supports push notifications for mobile chat on iOS and Android devices. In this way, you can be proactively notified as soon as there is a new message from an agent or chatbot, even when not actively chatting. You can now also configure holidays and other variances to your contact center hours of operation.

AWS Security Hub – Now supports automated security checks aligned to the Payment Card Industry Data Security Standard (PCI DSS) v4.0.1, a compliance framework that provides a set of rules and guidelines for safely handling credit and debit card information.

AWS Resource Explorer – Supports 59 new resource types including Amazon Elastic Kubernetes Service (Amazon EKS), Amazon Kendra, AWS Identity and Access Management (IAM) Access Analyzer, and Amazon SageMaker.

Amazon SageMaker AI – Inference optimized Amazon EC2 G6e instances (powered by NVIDIA L40S Tensor Core GPUs) and P5e (powered by NVIDIA H200 Tensor Core GPUs) are now available on Amazon SageMaker.

Amazon Redshift – Now supports automatically and incrementally refreshable materialized views on tables in a zero-ETL integration. Previously, in this case, you had to run a full refresh.

AWS Toolkit for Visual Studio Code – Now includes Amazon CloudWatch Logs Live Tail, an interactive log streaming and analytics capability that provides real-time visibility into your logs and makes it easier to develop and troubleshoot applications.

Other AWS news
Here are some additional projects, blog posts, and news items that you might find interesting:

Build a managed transactional data lake with Amazon S3 Tables – Just introduced at re:Invent 2024, Amazon S3 Tables is the first cloud object store with built-in Apache Iceberg support and the easiest way to store tabular data at scale. This post on the AWS Storage Blog provides an overview of S3 Tables and an example of how to build a transactional data lake with S3 Tables using Apache Spark on Amazon EMR.

Introducing Cross-Region Connectivity for AWS PrivateLink – More information on this recent launch that can be used to share and access Amazon Virtual Private Cloud (Amazon VPC) endpoint services across different AWS Regions.

Marc Brooker, VP/Distinguished Engineer at AWS, shared on his personal blog a few posts about what Amazon Aurora DSQL is, how it works, and how to make the best use of it:

That’s all for this week. Check back next Monday for another Weekly Roundup!

Danilo

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Now Available – Second-Generation FPGA-Powered Amazon EC2 instances (F2)

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-available-second-generation-fpga-powered-amazon-ec2-instances-f2/

Equipped with up to eight AMD Field-Programmable Gate Arrays (FPGAs), AMD EPYC (Milan) processors with up to 192 cores, High Bandwidth Memory (HBM), up to 8 TiB of SSD-based instance storage, and up to 2 TiB of memory, the new F2 instances are available in two sizes, and are ready to accelerate your genomics, multimedia processing, big data, satellite communication, networking, silicon simulation, and live video workloads.

A Quick FPGA Recap
Here’s how I explained the FPGA model when we previewed the first generation of FPGA-powered Amazon Elastic Compute Cloud (Amazon EC2) instances:

One of the more interesting routes to a custom, hardware-based solution is known as a Field Programmable Gate Array, or FPGA. In contrast to a purpose-built chip which is designed with a single function in mind and then hard-wired to implement it, an FPGA is more flexible. It can be programmed in the field, after it has been plugged in to a socket on a PC board. Each FPGA includes a fixed, finite number of simple logic gates. Programming an FPGA is “simply” a matter of connecting them up to create the desired logical functions (AND, OR, XOR, and so forth) or storage elements (flip-flops and shift registers). Unlike a CPU which is essentially serial (with a few parallel elements) and has fixed-size instructions and data paths (typically 32 or 64 bit), the FPGA can be programmed to perform many operations in parallel, and the operations themselves can be of almost any width, large or small.

Since that launch, AWS customers have used F1 instances to host many different types of applications and services. With a newer FPGA, more processing power, and more memory bandwidth, the new F2 instances are an even better host for highly parallelizable, compute-intensive workloads.

Each of the AMD Virtex UltraScale+ HBM VU47P FPGAs has 2.85 million system logic cells and 9,024 DSP slices (up to 28 TOPS of DSP compute performance when processing INT8 values). The FPGA Accelerator Card associated with each F2 instance provides 16 GiB of High Bandwidth Memory and 64 GiB of DDR4 memory per FPGA.

Inside the F2
F2 instances are powered by 3rd generation AMD EPYC (Milan) processors. In comparison to F1 instances, they offer up to 3x as many processor cores, up to twice as much system memory and NVMe storage, and up to 4x the network bandwidth. Each FPGA comes with 16 GiB High Bandwidth Memory (HBM) with up to 460 GiB/s bandwidth. Here are the instance sizes and specs:

Instance Name | vCPUs | FPGAs | FPGA Memory (HBM / DDR4) | Instance Memory | NVMe Storage | EBS Bandwidth | Network Bandwidth
f2.12xlarge | 48 | 2 | 32 GiB / 128 GiB | 512 GiB | 1,900 GiB (2x 950 GiB) | 15 Gbps | 25 Gbps
f2.48xlarge | 192 | 8 | 128 GiB / 512 GiB | 2,048 GiB | 7,600 GiB (8x 950 GiB) | 60 Gbps | 100 Gbps
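The aggregate FPGA memory and NVMe figures in the table follow from the per-FPGA numbers quoted earlier (16 GiB of HBM and 64 GiB of DDR4 per FPGA) and the 950 GiB drive size. The short sketch below simply redoes that arithmetic; the function name `f2_totals` is illustrative, and the one-drive-per-FPGA ratio is an assumption read off the two table rows, not a documented rule.

```python
# Per-FPGA and per-drive figures quoted in the post.
HBM_PER_FPGA_GIB = 16
DDR4_PER_FPGA_GIB = 64
NVME_DRIVE_GIB = 950


def f2_totals(fpga_count: int) -> dict:
    """Return aggregate FPGA memory and NVMe storage for a given FPGA count.

    Assumes one 950 GiB NVMe drive per FPGA, which matches both table rows.
    """
    return {
        "hbm_gib": fpga_count * HBM_PER_FPGA_GIB,
        "ddr4_gib": fpga_count * DDR4_PER_FPGA_GIB,
        "nvme_gib": fpga_count * NVME_DRIVE_GIB,
    }


for name, fpgas in (("f2.12xlarge", 2), ("f2.48xlarge", 8)):
    print(name, f2_totals(fpgas))
```

Both rows of the table are reproduced by this calculation, which is a quick way to sanity-check the specs when sizing a design against available HBM and DDR4.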

The high-end f2.48xlarge instance supports the AWS Cloud Digital Interface (CDI) to reliably transport uncompressed live video between applications, with instance-to-instance latency as low as 8 milliseconds.

Building FPGA Applications
The AWS EC2 FPGA Development Kit contains the tools that you will use to develop, simulate, debug, compile, and run your hardware-accelerated FPGA applications. You can launch the kit’s FPGA Developer AMI on a memory-optimized or compute-optimized instance for development and simulation, then use an F2 instance for final debugging and testing.

The tools included in the developer kit support a variety of development paradigms, tools, accelerator languages, and debugging options. Regardless of your choice, you will ultimately create an Amazon FPGA Image (AFI) which contains your custom acceleration logic and the AWS Shell which implements access to the FPGA memory, PCIe bus, interrupts, and external peripherals. You can deploy AFIs to as many F2 instances as desired, share them with other AWS accounts, or publish them on AWS Marketplace.

If you have already created an application that runs on F1 instances, you will need to update your development environment to use the latest AMD tools, then rebuild and validate before upgrading to F2 instances.

FPGA Instances in Action
Here are some cool examples of how F1 and F2 instances can support unique and highly demanding workloads:

Genomics – Multinational pharmaceutical and biotechnology company AstraZeneca used thousands of F1 instances to build the world’s fastest genomics pipeline, able to process over 400K whole genome samples in under two months. They will adopt Illumina DRAGEN for F2 to realize better performance at a lower cost, while accelerating disease discovery, diagnosis, and treatment.

Satellite Communication – Satellite operators are moving from inflexible and expensive physical infrastructure (modulators, demodulators, combiners, splitters, and so forth) toward agile, software-defined, FPGA-powered solutions. Using the digital signal processor (DSP) elements on the FPGA, these solutions can be reconfigured in the field to support new waveforms and to meet changing requirements. Key F2 features such as support for up to 8 FPGAs per instance, generous amounts of network bandwidth, and support for the Data Plane Development Kit (DPDK) using Virtual Ethernet can be used to support processing of multiple, complex waveforms in parallel.

Analytics – NeuroBlade’s SQL Processing Unit (SPU) integrates with Presto, Apache Spark, and other open source query engines, delivering faster query processing and market-leading query throughput efficiency when run on F2 instances.

Things to Know
Here are a couple of final things that you should know about the F2 instances:

Regions – F2 instances are available today in the US East (N. Virginia) and Europe (London) AWS Regions, with plans to extend availability to additional regions over time.

Operating Systems – F2 instances are Linux-only.

Purchasing Options – F2 instances are available in On-Demand, Spot, Savings Plan, Dedicated Instance, and Dedicated Host form.

Jeff;

Amazon EC2 Trn2 Instances and Trn2 UltraServers for AI/ML training and inference are now available

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-ec2-trn2-instances-and-trn2-ultraservers-for-aiml-training-and-inference-is-now-available/

The new Amazon Elastic Compute Cloud (Amazon EC2) Trn2 instances and Trn2 UltraServers are the most powerful EC2 compute options for ML training and inference. Powered by the second generation of AWS Trainium chips (AWS Trainium2), the Trn2 instances are 4x faster and offer 4x more memory bandwidth and 3x more memory capacity than the first-generation Trn1 instances. Trn2 instances offer 30-40% better price performance than the current generation of GPU-based EC2 P5e and P5en instances.

In addition to the 16 Trainium2 chips, each Trn2 instance features 192 vCPUs, 2 TiB of memory, and 3.2 Tbps of Elastic Fabric Adapter (EFA) v3 network bandwidth, which offers up to 50% lower latency than the previous generation.

The Trn2 UltraServers, which are a completely new compute offering, feature 64x Trainium2 chips connected with a high-bandwidth, low-latency NeuronLink interconnect, for peak inference and training performance on frontier foundation models.

Tens of thousands of Trainium chips are already powering Amazon and AWS services. For example, over 80,000 AWS Inferentia and Trainium1 chips supported the Rufus shopping assistant on the most recent Prime Day. Trainium2 chips are already powering the latency-optimized versions of Llama 3.1 405B and Claude 3.5 Haiku models on Amazon Bedrock.

Up and Out and Up
Sustained growth in the size and complexity of the frontier models is enabled by innovative forms of compute power, assembled into equally innovative architectural forms. In simpler times we could talk about architecting for scalability in two ways: scaling up (using a bigger computer) and scaling out (using more computers). Today, when I look at the Trainium2 chip, the Trn2 instance, and the even larger compute offerings that I will talk about in a minute, it seems like both models apply, but at different levels of the overall hierarchy. Let’s review the Trn2 building blocks, starting at the NeuronCore and scaling to an UltraCluster:

NeuronCores are at the heart of the Trainium2 chip. Each third-generation NeuronCore includes a scalar engine (1 input to 1 output), a vector engine (multiple inputs to multiple outputs), a tensor engine (systolic array multiplication, convolution, and transposition), and a GPSIMD (general purpose single instruction multiple data) core.

Each Trainium2 chip is home to eight NeuronCores and 96 GiB of High Bandwidth Memory (HBM), and supports 2.9 TB/second of HBM bandwidth. The cores can be addressed and used individually, or pairs of physical cores can be grouped into a single logical core. A single Trainium2 chip delivers up to 1.3 petaflops of dense FP8 compute and up to 5.2 petaflops of sparse FP8 compute, and can drive 95% utilization of memory bandwidth thanks to automated reordering of the HBM queue.

Each Trn2 instance is, in turn, home to 16 Trainium2 chips. That’s a total of 128 NeuronCores, 1.5 TiB of HBM, and 46 TB/second of HBM bandwidth. Altogether this multiplies out to up to 20.8 petaflops of dense FP8 compute and up to 83.2 petaflops of sparse FP8 compute. The Trainium2 chips are connected across NeuronLink in a 2D torus for high bandwidth, low latency chip-to-chip communication at 1 GB/second.

An UltraServer is home to four Trn2 instances connected with low-latency, high-bandwidth NeuronLink. That’s 512 NeuronCores, 64 Trainium2 chips, 6 TiB of HBM, and 185 TB/second of HBM bandwidth. Doing the math, this results in up to 83 petaflops of dense FP8 compute and up to 332 petaflops of sparse FP8 compute. In addition to the 2D torus that connects NeuronCores within an instance, cores at corresponding XY positions in each of the four instances are connected in a ring. For inference, UltraServers help deliver industry-leading response time to create the best real-time experiences. For training, UltraServers boost model training speed and efficiency with faster collective communication for model parallelism when compared to standalone instances. UltraServers are designed to support training and inference at the trillion parameter level and beyond; they are available in preview form and you can contact us to join the preview.
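The "multiplies out" and "doing the math" steps above can be reproduced with a few lines of arithmetic. The sketch below takes the per-chip figures quoted in the post and scales them to 16 chips (one Trn2 instance) and 64 chips (one UltraServer). The class and function names are illustrative only, and the post rounds some of the results down (for example, 46.4 is quoted as 46 TB/second).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trainium2Chip:
    """Per-chip figures as quoted in the post."""
    neuron_cores: int = 8
    hbm_gib: int = 96
    hbm_tb_per_s: float = 2.9
    dense_fp8_pflops: float = 1.3
    sparse_fp8_pflops: float = 5.2


def scale_up(chip_count: int, chip: Trainium2Chip = Trainium2Chip()) -> dict:
    """Multiply per-chip figures by the number of chips in a building block."""
    return {
        "neuron_cores": chip.neuron_cores * chip_count,
        "hbm_tib": chip.hbm_gib * chip_count / 1024,
        "hbm_tb_per_s": round(chip.hbm_tb_per_s * chip_count, 1),
        "dense_fp8_pflops": round(chip.dense_fp8_pflops * chip_count, 1),
        "sparse_fp8_pflops": round(chip.sparse_fp8_pflops * chip_count, 1),
    }


trn2_instance = scale_up(16)   # one Trn2 instance
ultraserver = scale_up(64)     # four Trn2 instances

print("Trn2 instance:", trn2_instance)
print("UltraServer:  ", ultraserver)
```

The same helper could be pointed at a different chip count to estimate aggregate capacity for a partial allocation, with the caveat that these are peak figures from the announcement rather than measured throughput.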

Trn2 instances and UltraServers are being deployed in EC2 UltraClusters to enable scale-out distributed training across tens of thousands of Trainium chips on a single petabit scale, non-blocking network, with access to Amazon FSx for Lustre high performance storage.

Using Trn2 Instances
Trn2 instances are available today for production use in the US East (Ohio) AWS Region and can be reserved by using Amazon EC2 Capacity Blocks for ML. You can reserve up to 64 instances for up to six months, with reservations accepted up to eight weeks in advance, with instant start times and the ability to extend your reservations if needed. To learn more, read Announcing Amazon EC2 Capacity Blocks for ML to reserve GPU capacity for your machine learning workloads.

On the software side, you can start with the AWS Deep Learning AMIs. These images are preconfigured with the frameworks and tools that you probably already know and use: PyTorch, JAX, and a lot more.

If you used the AWS Neuron SDK to build your apps, you can bring them over and recompile them for use on Trn2 instances. This SDK integrates natively with JAX, PyTorch, and essential libraries like Hugging Face, PyTorch Lightning, and NeMo. Neuron includes out-of-the-box optimizations for distributed training and inference with the open source PyTorch libraries NxD Training and NxD Inference, while providing deep insights for profiling and debugging. Neuron also supports OpenXLA, including stable HLO and GSPMD, enabling PyTorch/XLA and JAX developers to utilize Neuron’s compiler optimizations for Trainium2.

Jeff;

New Amazon EC2 P5en instances with NVIDIA H200 Tensor Core GPUs and EFAv3 networking

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/new-amazon-ec2-p5en-instances-with-nvidia-h200-tensor-core-gpus-and-efav3-networking/

Today, we’re announcing the general availability of Amazon Elastic Compute Cloud (Amazon EC2) P5en instances, powered by NVIDIA H200 Tensor Core GPUs and custom 4th generation Intel Xeon Scalable processors with an all-core turbo frequency of 3.2 GHz (max core turbo frequency of 3.8 GHz) available only on AWS. These processors offer 50 percent higher memory bandwidth and up to four times the throughput between CPU and GPU with PCIe Gen5, which helps boost performance for machine learning (ML) training and inference workloads.

P5en, with up to 3,200 Gbps of third-generation Elastic Fabric Adapter (EFAv3) networking using Nitro v5, shows up to a 35% improvement in latency compared to P5, which uses the previous generation of EFA and Nitro. This helps improve collective communications performance for distributed training workloads such as deep learning, generative AI, real-time data processing, and high-performance computing (HPC) applications.

Here are the specs for P5en instances:

Instance size | vCPUs | Memory (GiB) | GPUs (H200) | Network bandwidth (Gbps) | GPU peer to peer (GB/s) | Instance storage (TB) | EBS bandwidth (Gbps)
p5en.48xlarge | 192 | 2048 | 8 | 3200 | 900 | 8 x 3.84 | 100

On September 9, we introduced Amazon EC2 P5e instances, powered by 8 NVIDIA H200 GPUs with 1128 GB of high bandwidth GPU memory, 3rd Gen AMD EPYC processors, 2 TiB of system memory, and 30 TB of local NVMe storage. These instances provide up to 3,200 Gbps of aggregate network bandwidth with EFAv2 and support GPUDirect RDMA, enabling lower latency and efficient scale-out performance by bypassing the CPU for internode communication.

With P5en instances, you can increase the overall efficiency in a wide range of GPU-accelerated applications by further reducing the inference and network latency. P5en instances increase local storage performance by up to two times and Amazon Elastic Block Store (Amazon EBS) bandwidth by up to 25 percent compared with P5 instances, which will further improve inference latency performance for those of you who are using local storage for caching model weights.

The transfer of data between CPUs and GPUs can be time-consuming, especially for large datasets or workloads that require frequent data exchanges. With PCIe Gen 5 providing up to four times the bandwidth between CPU and GPU compared with P5 and P5e instances, you can further improve latency for model training, fine-tuning, and running inference for complex large language models (LLMs) and multimodal foundation models (FMs), and memory-intensive HPC applications such as simulations, pharmaceutical discovery, weather forecasting, and financial modeling.

Getting started with Amazon EC2 P5en instances
You can use EC2 P5en instances available in the US East (Ohio), US West (Oregon), and Asia Pacific (Tokyo) AWS Regions through EC2 Capacity Blocks for ML, On Demand, and Savings Plan purchase options.

Let me show you how to use P5en instances with a Capacity Reservation. To reserve your EC2 Capacity Blocks, choose Capacity Reservations on the Amazon EC2 console in the US East (Ohio) AWS Region.

Select Purchase Capacity Blocks for ML and then choose your total capacity and specify how long you need the EC2 Capacity Block for p5en.48xlarge instances. The total number of days that you can reserve EC2 Capacity Blocks is 1–14, 21, or 28 days. EC2 Capacity Blocks can be purchased up to 8 weeks in advance.
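The duration and lead-time limits above are easy to check before you open the console. The following is a minimal client-side sketch of those two rules only (1–14, 21, or 28 days; start date no more than 8 weeks out). The function name and its parameters are hypothetical, and the EC2 service remains the authority on what can actually be purchased.

```python
from datetime import date, timedelta

# Limits as stated in the text above.
ALLOWED_DURATION_DAYS = set(range(1, 15)) | {21, 28}
MAX_LEAD_TIME = timedelta(weeks=8)


def is_valid_capacity_block_request(start: date, duration_days: int, today: date) -> bool:
    """Return True if the requested block satisfies the stated limits.

    This is an illustrative pre-check, not a call to any AWS API.
    """
    if duration_days not in ALLOWED_DURATION_DAYS:
        return False
    return today <= start <= today + MAX_LEAD_TIME


today = date.today()
print(is_valid_capacity_block_request(today + timedelta(days=7), 14, today))
print(is_valid_capacity_block_request(today + timedelta(days=7), 15, today))
```

A 15-day request fails the check because only 1 through 14, 21, and 28 days are listed as valid durations.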

When you select Find Capacity Blocks, AWS returns the lowest-priced offering available that meets your specifications in the date range you have specified. After reviewing EC2 Capacity Blocks details, tags, and total price information, choose Purchase.

Now, your EC2 Capacity Block will be scheduled successfully. The total price of an EC2 Capacity Block is charged up front, and the price does not change after purchase. The payment will be billed to your account within 12 hours after you purchase the EC2 Capacity Blocks. To learn more, visit Capacity Blocks for ML in the Amazon EC2 User Guide.

To run instances within your purchased Capacity Block, you can use AWS Management Console, AWS Command Line Interface (AWS CLI) or AWS SDKs.

Here is a sample AWS CLI command to run 16 P5en instances to maximize EFAv3 benefits. This configuration provides up to 3200 Gbps of EFA networking bandwidth and up to 800 Gbps of IP networking bandwidth with eight private IP addresses:

$ aws ec2 run-instances --image-id ami-abc12345 \
  --instance-type p5en.48xlarge \
  --count 16 \
  --key-name MyKeyPair \
  --instance-market-options MarketType='capacity-block' \
  --capacity-reservation-specification CapacityReservationTarget={CapacityReservationId=cr-a1234567} \
  --network-interfaces "NetworkCardIndex=0,DeviceIndex=0,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa" \
"NetworkCardIndex=1,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=2,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=3,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=4,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa" \
"NetworkCardIndex=5,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=6,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=7,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=8,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa" \
"NetworkCardIndex=9,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=10,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=11,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=12,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa" \
"NetworkCardIndex=13,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=14,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=15,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=16,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa" \
"NetworkCardIndex=17,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=18,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=19,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=20,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa" \
"NetworkCardIndex=21,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=22,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=23,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=24,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa" \
"NetworkCardIndex=25,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=26,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=27,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=28,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa" \
"NetworkCardIndex=29,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=30,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only" \
"NetworkCardIndex=31,DeviceIndex=1,Groups=security_group_id,SubnetId=subnet_id,InterfaceType=efa-only"
...
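The 32 `--network-interfaces` entries in the command follow a regular pattern: network card 0 uses device index 0 and every other card uses device index 1, and every fourth card is an `efa` interface while the rest are `efa-only`. As a sketch, assuming that pattern is all you need, the list can be generated rather than typed by hand. The function name is hypothetical, and `security_group_id` and `subnet_id` are the same placeholders used in the command above.

```python
def p5en_network_interfaces(security_group_id: str, subnet_id: str, cards: int = 32) -> list[str]:
    """Build the --network-interfaces values following the pattern in the sample command."""
    specs = []
    for card in range(cards):
        # Only the first network card carries device index 0.
        device_index = 0 if card == 0 else 1
        # Every fourth card is a full EFA interface (with an IP address);
        # the remaining cards are EFA-only.
        interface_type = "efa" if card % 4 == 0 else "efa-only"
        specs.append(
            f"NetworkCardIndex={card},DeviceIndex={device_index},"
            f"Groups={security_group_id},SubnetId={subnet_id},"
            f"InterfaceType={interface_type}"
        )
    return specs


specs = p5en_network_interfaces("security_group_id", "subnet_id")
for spec in specs[:5]:
    print(f'"{spec}" \\')
```

With the default of 32 cards, eight of the generated entries are `efa` interfaces, which lines up with the eight private IP addresses mentioned above.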

When launching P5en instances, you can use AWS Deep Learning AMIs (DLAMI) to support EC2 P5en instances. DLAMI provides ML practitioners and researchers with the infrastructure and tools to quickly build scalable, secure, distributed ML applications in preconfigured environments.

You can run containerized ML applications on P5en instances with AWS Deep Learning Containers using libraries for Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS).

For fast access to large datasets, you can use up to 30 TB of local NVMe SSD storage or virtually unlimited cost-effective storage with Amazon Simple Storage Service (Amazon S3). You can also use Amazon FSx for Lustre file systems in P5en instances so you can access data at the hundreds of GB/s of throughput and millions of input/output operations per second (IOPS) required for large-scale deep learning and HPC workloads.

Now available
Amazon EC2 P5en instances are available today in the US East (Ohio), US West (Oregon), and Asia Pacific (Tokyo) AWS Regions and US East (Atlanta) Local Zone us-east-1-atl-2a through EC2 Capacity Blocks for ML, On Demand, and Savings Plan purchase options. For more information, visit the Amazon EC2 pricing page.

Give Amazon EC2 P5en instances a try in the Amazon EC2 console. To learn more, see the Amazon EC2 P5 instance page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy

NEW: Simplifying the use of third-party block storage with AWS Outposts

Post Syndicated from Rachel Zheng original https://aws.amazon.com/blogs/compute/new-simplifying-the-use-of-third-party-block-storage-with-aws-outposts/

This post is written by Kate Sposato, Senior Solutions Architect, EC2 Edge Compute

AWS is excited to announce deeper collaboration with industry-leading storage solutions to streamline the use of third-party storage with AWS Outposts. You can now attach and use external block data volumes from NetApp® on-premises enterprise storage arrays and Pure Storage® FlashArray™ directly from the AWS Management Console.

Outposts is a fully managed service that extends AWS infrastructure, AWS services, APIs, and tools to customer premises. By providing local access to AWS managed infrastructure, Outposts allows you to build and run applications on premises using the same application programming interfaces (APIs) as in AWS Regions. Moreover, this is done while using local compute and storage resources to meet lower latency and local data processing needs. Outposts is available in various rack and server form factors.

Many of you have block storage systems running in your on-premises environments that provide advanced data storage and management features (such as snapshots, replication, and encryption) to protect data integrity and security. Various use cases require an application running on Amazon Elastic Compute Cloud (Amazon EC2) instances on Outposts to access data on external volumes backed by these storage systems. These include: regulatory auditing requirements, government and local regulation compliance, high data durability and resiliency requirements, low-latency data access, and migration of on-premises applications that are tightly coupled with existing external storage systems. To make it easier for you to use external volumes with Outposts, AWS has validated a broad range of third-party storage solutions through the AWS Outposts Ready Program. With this program, you can easily identify storage solutions that are tested to run with Outposts.

Today, we are taking our integration with storage solutions from NetApp and Pure Storage to the next level. Outposts now has a simplified and automated way to launch EC2 instances with attached block storage from external infrastructure through the AWS Management Console. The new integration includes automated user script generation and attachment of data volumes to EC2 instances running on 42U Outposts racks and 2U Outposts servers. This integration reduces the friction associated with using the advanced data management and security features of external storage infrastructure in combination with Outposts, allowing you to create a resilient, compliant, and optimized storage and compute infrastructure.

Outposts rack storage and networking overview

Outposts racks support Amazon Elastic Block Store (Amazon EBS) volumes for EC2 instances, which provide persistent local block storage.

EC2 instances running on Outposts racks can access data stored on external block storage arrays over the Outposts local gateway (LGW). An LGW enables connectivity between the Outpost subnets, where EC2 instances run, and the on-premises network. It carries storage traffic between the EC2 instances running on the Outposts rack and the local network. The LGW is created by AWS as part of the Outposts rack installation process. Each Outposts rack supports a single LGW.

The following diagram shows an EC2 instance running on an Outposts rack with an elastic network interface (ENI) and LGW configured for instance connectivity. An external storage array communicates with the EC2 instance running on the Outposts rack through the Outpost network devices (ONDs). Customer Network Devices (CNDs) that connect to EC2 instances running on Outposts racks need to support the following:

  • Link aggregation: connections to the Outposts rack network devices are added to a link aggregation group (LAG).
  • VLANs: Virtual LANs (VLANs) are configured between each Outposts rack TOR device and any customer devices, including data stores.
  • Dynamic routing: Border Gateway Protocol (BGP) is configured between the CND and the OND for each VLAN. Two total BGP sessions are shown in the following diagram between devices.

Figure 1. Outposts rack and Amazon EC2 networking architecture


Outposts server storage and networking overview

Outposts servers come with internal NVMe SSD-based high-performance instance storage. Similar to AWS Regions, instance storage is allocated directly to the EC2 instance and follows the lifecycle of the instance. For example, if an EC2 instance is terminated, then the instance storage associated with the instance is also deleted. If you want data to persist after the instance is terminated, you can use external storage solutions to complement the instance storage included with Outposts servers.

Outposts servers have a local network interface (LNI). This logical networking component connects the EC2 instances running on the Outposts servers subnet to the on-premises network and allows communication to other on-premises storage, compute, and networking appliances.

To support the Amazon EC2 on Outposts to external storage array integration, an LNI must be created and then added to the EC2 instance during instance launch. An LNI can only be created through the AWS Command Line Interface (AWS CLI) or an AWS SDK; the AWS CLI form is the following command. The subnet ID is that of the Outposts server subnet, and the device index should be unique to the subnet.

aws ec2 modify-subnet-attribute --subnet-id <subnet id> --enable-lni-at-device-index <device index>

In the on-premises network, you must have a Network Interface Card (NIC) at the same device index that you specified when running the preceding CLI command.
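If you script this step, it can help to build the command from variables rather than editing the placeholders by hand. The sketch below only assembles the same `aws ec2 modify-subnet-attribute` invocation shown above and prints it; it does not call AWS. The helper name and the example subnet ID are hypothetical.

```python
import shlex


def enable_lni_command(subnet_id: str, device_index: int) -> list[str]:
    """Assemble the AWS CLI arguments that enable an LNI on an Outposts server subnet."""
    return [
        "aws", "ec2", "modify-subnet-attribute",
        "--subnet-id", subnet_id,
        "--enable-lni-at-device-index", str(device_index),
    ]


# Hypothetical subnet ID and device index, for illustration only.
command = enable_lni_command("subnet-0123456789abcdef0", 1)
print(shlex.join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` on a machine where the AWS CLI is installed and configured.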

Further detailed steps for this workflow are listed in the Outposts server user guide.

When the local network interfaces are enabled on an Outpost subnet, the EC2 instances in the Outpost subnet can be configured to include this LNI in addition to the ENI. The LNI connects to the on-premises network while the ENI connects to the VPC.

The following diagram shows an EC2 instance running on an Outposts server with both an ENI and LNI configured for instance connectivity. There is an external storage array connected to the Outposts server using a CND through NVMe-over-TCP or iSCSI protocol.

Figure 2. Outposts server and Amazon EC2 networking architecture

Supported operating systems and AWS Support

The rest of this post covers the steps for how to launch an EC2 instance running on an Outposts 2U server or Outposts rack with a connected external block storage volume for local data access from within the EC2 instance. The current release of this feature supports EC2 instances running Microsoft Windows Server 2022 and Red Hat Enterprise Linux 9 (RHEL9) based operating systems.

Support for Outposts and all Outposts integration features, including this one, needs an active AWS Enterprise Support Plan or AWS Enterprise On-Ramp Support Plan. Support for external storage arrays and configurations can be obtained from the respective storage vendor and may need an additional support plan depending on the vendor and the storage solution implemented.

This post assumes you’re familiar with the basic functionality of Outposts servers and Outposts rack. If you would like to learn more about the Outposts family in general, then the user guide, What is AWS Outposts?, is a great place to start.

Solution deployment

The following sections outline the solution deployment.

Prerequisites:

  1. An Outposts 2U server or Outposts rack is provisioned, activated, and connected to the customer network.
  2. A block storage array is connected on the same network and accessible to Outposts subnets.
  3. A block data volume is configured and running on the storage array. The unique identifier for this volume is necessary for launching the EC2 instance on the Outpost. The volume must remain provisioned after initial provisioning on the storage array.
  4. The IP address and port number (optional for iSCSI connections) of the block storage volume, which is necessary for launching the EC2 instance on the Outpost.

Deployment architecture overview

The following deployment architecture shows the workflow attaching an external storage array to an Outpost, launching an EC2 instance through the AWS Management Console, and accessing the data on the external storage array from within the EC2 instance running on the Outpost.

Figure 3. Third-party block storage on Outposts architecture overview

Deployment steps for NVMe-over-TCP connections

1. (Prerequisite) If no block data volume is already configured and running on the compatible storage array, create one in the storage solution’s interface before moving to Step 2.

a. Create an NVMe device, subsystem, and namespace for the block data volume.

b. Optionally, generate a host NQN that is used for the EC2 instance connection, and add it to the allow list for the appropriate subsystems.

c. The following pieces of information are used in later steps:

i. Host NQN: Unique identifier of the EC2 instance for attachment;

ii. Target IP: Address of the connected block volume host;

iii. Target Port Number: Port number of the connected block volume host.

You can learn more about launching and configuring external storage arrays in the Outposts family documentation or in the respective storage array vendor documentation.

2. In the Console, navigate to EC2 Launch Instance Wizard by choosing EC2, Instances, Launch instances.

a. Name the instance and add any desired tags to be applied at launch.

b. Choose the desired, compatible RHEL9 based Amazon Machine Image (AMI) from the list, or choose one from the AWS Marketplace.

c. Choose the desired EC2 Instance type.

d. Expand the Network settings section and select Edit. Choose the VPC and subnet of the target Outpost.

i. Outposts servers only: You must create an LNI in the Advanced Network settings before launching the instance.

e. Expand Advanced network configuration and select Add network device. Continue to add network devices until the Device index is equal to the volume index.

Figure 4. Advanced network configuration

f. Expand Configure storage, select Edit next to the External storage volumes settings section, and choose NVMe/TCP for Storage network protocol.

Figure 5. External storage volumes configuration

g. Enter the HostNQN in the format provided for the NVMe/TCP data volume. Make sure that the HostNQN used has been added to the storage array subsystem allow list.

h. Select Add NVMe/TCP Discovery Controller and enter the IP address and port of the controller from the storage array. If the target port is unknown, enter 4420 as the Target Port.

i. (Optional) You can add more data volumes that use a different target discovery controller at this time by choosing the Add NVMe/TCP Data Volume button under the Target IP address. Repeat Step 2.h for each data volume to be attached to the EC2 instance.

j. Expand the Advanced details and provide any additional Amazon EC2 behavior settings as appropriate.

k. At the bottom of the Advanced details section is the automatically generated User data. If you need to manually edit this data, you can do so by selecting Edit at the bottom.

Figure 6. Automatically generated user data file

l. When the configurations are set, choose the Launch instance button in the right-side column.

3. The EC2 Launch Instance Wizard now launches an EC2 instance configured as described on the Outpost and attaches the desired external data volume(s) to the EC2 instance.

4. Applications and users can access the data on the attached external volumes from within the EC2 instance. To verify this:

a. From within the launched EC2 instance, run sudo nvme list

b. The volumes are displayed as /dev/nvme1n1 with the number increasing for each attached volume. Local instance store volumes on Outposts servers and EBS boot volumes on Outposts racks are listed first. External volumes are listed after those with sequentially increasing node numbers.

5. External storage volume and array management, configuration, and backups continue to be managed through the storage vendor-provided toolkit. You can find more information on external storage management in the respective storage array vendor documentation.
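The check in Step 4 can also be scripted. The following is a minimal Python sketch, not part of the original walkthrough: the helper name nvme_namespaces is hypothetical, and it only assumes the standard Linux naming of NVMe namespace devices (nvme<controller>n<namespace>). It filters a list of device names and orders them numerically, so that nvme10n1 sorts after nvme2n1.

```python
import re

# Matches NVMe namespace block devices such as nvme1n1, but not
# controller nodes (nvme1) or partitions (nvme1n1p1).
_NVME_NAMESPACE = re.compile(r"^nvme(\d+)n(\d+)$")


def nvme_namespaces(dev_names):
    """Return NVMe namespace device paths sorted by controller, then namespace.

    dev_names is any iterable of device node names, for example the
    result of os.listdir("/dev"). (Hypothetical helper for illustration.)
    """
    found = []
    for name in dev_names:
        match = _NVME_NAMESPACE.match(name)
        if match:
            controller = int(match.group(1))
            namespace = int(match.group(2))
            found.append((controller, namespace, "/dev/" + name))
    # Sorting the tuples orders by controller number, then namespace number.
    return [path for _, _, path in sorted(found)]
```

On the instance, calling nvme_namespaces(os.listdir("/dev")) after importing os lists the namespace devices in the same order that Step 4.b describes, with local or boot volumes first and external volumes after them.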

Deployment steps for iSCSI connections

1. (Prerequisite) If there is no block data volume already running and configured on the compatible storage array, this must be completed in the storage solution’s interface before moving to Step 2.

a. Create an Initiator group (igroup) and add the Initiator IQN to the igroup. Then map the logical unit number (LUN) to the igroup.

b. Optionally, generate an initiator IQN that is used for the EC2 instance connection, and add it to the allow list for the appropriate subsystems.

c. The following pieces of information are used in later steps:

i. Initiator IQN: Unique identifier of the EC2 instance for attachment;

ii. Target IQNs: Unique identifier of the storage virtual machine (SVM);

iii. Target IP: Address of the connected block volume host;

iv. (Optional) Target Port Number: Port number of the connected block volume host.

You can learn more about launching and configuring external storage arrays in the Outposts family documentation or in the respective storage array vendor documentation.
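Because a mistyped IQN is an easy way to end up with a volume that never attaches, it can help to sanity-check the values from Step 1.c before entering them in the console. The sketch below is an illustration under stated assumptions: the names is_valid_iqn and IscsiTarget are hypothetical, the sample values are made up, and the pattern follows the general iqn.yyyy-mm.reversed-domain[:identifier] naming convention from the iSCSI specification (lowercase only, at most 223 characters). Your storage array may accept or require something stricter.

```python
import ipaddress
import re
from dataclasses import dataclass
from typing import Optional

# iqn.<yyyy-mm>.<reversed domain name>[:<optional identifier>]
# Lowercase only; this is a conservative check, not a full validator.
_IQN = re.compile(
    r"^iqn\.\d{4}-(0[1-9]|1[0-2])\.[a-z0-9]([a-z0-9.-]*[a-z0-9])?(:\S+)?$"
)


def is_valid_iqn(name: str) -> bool:
    """Return True if name looks like a well-formed iSCSI qualified name."""
    return len(name) <= 223 and _IQN.match(name) is not None


@dataclass(frozen=True)
class IscsiTarget:
    """Connection details collected in Step 1.c (hypothetical container)."""

    target_iqn: str
    target_ip: str
    target_port: Optional[int] = None  # optional, as in Step 1.c.iv

    def __post_init__(self):
        if not is_valid_iqn(self.target_iqn):
            raise ValueError(f"not a valid IQN: {self.target_iqn!r}")
        # ip_address raises ValueError for malformed IPv4/IPv6 addresses.
        ipaddress.ip_address(self.target_ip)
        if self.target_port is not None and not 1 <= self.target_port <= 65535:
            raise ValueError(f"port out of range: {self.target_port}")
```

For example, IscsiTarget("iqn.1992-08.com.example:svm1", "10.0.0.5") is accepted, while a name that starts with nqn. (the NVMe naming scheme) is rejected with a ValueError.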

2. In the Console, navigate to EC2 Launch Instance Wizard by choosing EC2, Instances, Launch instances.

a. Name the instance and add any desired tags to be applied at launch.

b. Choose the desired, compatible RHEL9 or Windows Server 2022 based AMI from the list, or purchase one from the AWS Marketplace.

c. Choose the desired EC2 Instance type.

d. Expand the Network settings section and choose the VPC and subnet of the target Outpost.

i. Outposts servers only: You must create an LNI in the Advanced Network settings before launching the instance.

e. Expand Advanced network configuration and select Add network device. Continue to add network devices until the Device index is equal to the volume index.

Figure 7. Advanced network configuration

f. Expand Configure storage, select Edit next to the External storage volumes settings section, and choose iSCSI in Storage network protocol.

Figure 8. External storage volumes configuration

g. Enter the Initiator IQN for the iSCSI data volume in the format provided. Make sure that the Initiator IQN used has been added to the allow list for the volume.

h. Select Add iSCSI Target and enter the Target IP, Target Port, and Target IQN of the storage array. Enter 3260 (the standard iSCSI port) for the Target Port if the target port is unknown.

i. (Optional) You can add more data volumes with a different Target IQN at this time by selecting the Add iSCSI Target button under the Target IP address. Repeat Step 2.h for each data volume to be attached to the EC2 instance.

j. Expand the Advanced details and provide any additional Amazon EC2 behavior settings as appropriate.

k. At the bottom of the Advanced details section is the automatically generated User data. If you need to manually edit this data, you can do so by selecting Edit at the bottom.

Figure 9. Automatically generated user data file

l. When the configurations are set, choose the Launch instance button in the right-side column.

3. The EC2 Launch Instance Wizard now launches an EC2 instance configured as described on the Outpost and attaches the desired external data volume(s) to the EC2 instance.

4. Applications and users can access the data on the attached external volumes from within the EC2 instance. To verify this:

a. From within the launched EC2 instance, run iscsiadm -m session -P3

b. The volumes are displayed as /dev/sdX (for example, /dev/sdb), with the device letter advancing for each attached volume.

5. External storage volume and array management, configuration, and backups continue to be managed through the storage vendor-provided toolkit. You can find more information on external storage management in the respective storage array vendor documentation.
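The verification in Step 4 can be automated by scanning the iscsiadm output for attached disks. The following Python sketch is illustrative only: the function name attached_disks is hypothetical, and the SAMPLE text is a heavily abbreviated, made-up excerpt that assumes the "Target:" and "Attached scsi disk" lines printed by open-iscsi at print level 3. The exact layout can vary between versions, so treat the patterns as a starting point.

```python
import re

_TARGET = re.compile(r"^Target:\s+(\S+)")
_DISK = re.compile(r"Attached scsi disk\s+(\S+)")

# Abbreviated, made-up excerpt of `iscsiadm -m session -P3` output.
SAMPLE = """\
Target: iqn.2024-12.com.example:svm1 (non-flash)
    Attached SCSI devices:
        Host Number: 2  State: running
        scsi2 Channel 00 Id 0 Lun: 0
            Attached scsi disk sdb      State: running
        scsi2 Channel 00 Id 0 Lun: 1
            Attached scsi disk sdc      State: running
"""


def attached_disks(output):
    """Map each target IQN to the device paths of its attached disks."""
    disks = {}
    current = None
    for line in output.splitlines():
        target = _TARGET.match(line.strip())
        if target:
            current = target.group(1)
            disks.setdefault(current, [])
            continue
        disk = _DISK.search(line)
        if disk and current is not None:
            disks[current].append("/dev/" + disk.group(1))
    return disks
```

To run it against a live instance, pass the captured command output (for example, from subprocess.run with capture_output=True and text=True) instead of SAMPLE. An empty list for a target suggests that the session exists but no LUN is mapped to the initiator.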

Conclusion

This integration offers a streamlined workflow to attach and utilize external block data volumes on Outposts directly through the AWS Management Console, eliminating manual processes. It provides the full benefits of advanced data infrastructure from trusted storage providers in conjunction with the security, reliability, and scalability of AWS managed infrastructure. This helps you accelerate cloud migration with dependencies on third-party storage and realize the full potential of your on-premises data.

To learn more about this integration, visit the NetApp on-premises enterprise storage arrays for AWS Outposts solution page and the Pure Storage FlashArray for AWS Outposts blog post. To discuss your external storage needs with an Outposts expert, submit this form. If you are attending AWS re:Invent 2024, make sure to check out the NetApp booth (booth #1748) and Pure Storage booth (booth #454) to connect with our partner specialists.

Introducing storage optimized Amazon EC2 I8g instances powered by AWS Graviton4 processors and 3rd gen AWS Nitro SSDs

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/introducing-storage-optimized-amazon-ec2-i8g-instances-powered-by-aws-graviton4-processors-and-3rd-gen-aws-nitro-ssds/

Today, we’re announcing the general availability of Amazon EC2 I8g instances, a new storage optimized instance type to provide the highest real-time storage performance among storage-optimized EC2 instances with the third generation of AWS Nitro SSDs and AWS Graviton4 processors.

AWS Graviton4 is the most powerful and energy efficient processor we have ever designed for a broad range of workloads running on EC2 instances using a 64-bit ARM instruction set architecture. AWS Nitro System SSDs are custom built by AWS and offer high I/O performance, low latency, minimal latency variability, and security with always-on encryption.

EC2 I8g instances are the first instance type to use third-generation AWS Nitro SSDs. These instances offer up to 22.5 TB of local NVMe SSD storage with up to 65 percent better real-time storage performance per TB and 60 percent lower latency variability compared to the previous generation I4g instances. Based on the AWS Graviton4 processors, I8g instances deliver up to 60 percent better compute performance and two times larger caches compared to I4g.

I8g instances offer up to 96 vCPUs, 768 GiB of memory, and 22.5 TB of storage to deliver more compute and storage choices compared with I4g instances.

Instance name | vCPUs | Memory (GiB) | Storage (GB) | Network bandwidth (Gbps) | EBS bandwidth (Gbps)
I8g.large | 2 | 16 | 468 | up to 10 | up to 10
I8g.xlarge | 4 | 32 | 937 | up to 10 | up to 10
I8g.2xlarge | 8 | 64 | 1,875 | up to 12 | up to 10
I8g.4xlarge | 16 | 128 | 3,750 | up to 25 | up to 10
I8g.8xlarge | 32 | 256 | 7,500 (2 x 3,750) | up to 25 | 10
I8g.12xlarge | 48 | 384 | 11,250 (3 x 3,750) | up to 28.125 | 15
I8g.16xlarge | 64 | 512 | 15,000 (4 x 3,750) | up to 37.5 | 20
I8g.24xlarge | 96 | 768 | 22,500 (6 x 3,750) | up to 56.25 | 20
I8g.metal-24xl | 96 | 768 | 22,500 (6 x 3,750) | up to 56.25 | 30

You can use I8g instances for I/O intensive workloads that require low latency access to data, such as transactional databases (MySQL and PostgreSQL), real-time databases, NoSQL databases (Aerospike, Apache Druid, MongoDB), and real-time analytics such as Apache Spark.

Additionally, I8g instances are built on the AWS Nitro System, which offloads CPU virtualization, storage, and networking functions to dedicated hardware and software to enhance the performance and security of your workloads. The Graviton4 processors offer you enhanced security by fully encrypting all high-speed physical hardware interfaces.

Things to know
Here are some things that you should know about EC2 I8g instances:

  • Operating system – EC2 I8g instances support Amazon Linux 2023, Amazon Linux 2, CentOS Stream 8 or newer, Ubuntu 18.04 or newer, SUSE 15 SP2 or newer, Debian 11 or newer, Red Hat Enterprise Linux 8.2 or newer, CentOS 8.2 or newer, FreeBSD 13 or newer, Rocky Linux 8.4 or newer, Alma Linux 8.4 or newer, and Alpine Linux 3.12.7 or newer.
  • Networking – You can use I8g instances in storage intensive workloads that typically have burst network usage patterns. All I8g instance sizes have burst network bandwidth and can burst for more than 60 minutes, depending on the instance size, to support the majority of workloads that require instance storage data hydration, backup, and snapshots over the network.
  • Migration – If you’re using I4g instances now, you will have a straightforward experience migrating storage intensive workloads to I8g instances because these instances offer similar memory and storage ratios to existing I4g instances.
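The memory and storage ratios mentioned in the migration point can be checked with simple arithmetic. The snippet below is a small Python sketch that uses three rows copied from the I8g table above; the names I8G_ROWS and per_vcpu are illustrative only, and no I4g figures are assumed.

```python
# (vCPUs, memory in GiB, local storage in GB), copied from the I8g table.
I8G_ROWS = {
    "I8g.large": (2, 16, 468),
    "I8g.4xlarge": (16, 128, 3750),
    "I8g.24xlarge": (96, 768, 22500),
}


def per_vcpu(vcpus, memory_gib, storage_gb):
    """Return (GiB of memory per vCPU, GB of local storage per vCPU)."""
    return memory_gib / vcpus, storage_gb / vcpus


for name, row in I8G_ROWS.items():
    memory, storage = per_vcpu(*row)
    print(f"{name}: {memory:.0f} GiB and {storage:.1f} GB per vCPU")
```

Each of these sizes works out to 8 GiB of memory and roughly 234 GB of local storage per vCPU, so capacity planning for a move between sizes is mostly a matter of scaling the vCPU count.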

Now available
Amazon EC2 I8g instances are now available in the US East (N. Virginia) and US West (Oregon) AWS Regions through On-Demand instances, Savings Plans, Spot Instances, Dedicated Instances, or Dedicated Hosts.

Give EC2 I8g instances a try in the Amazon EC2 console. To learn more, visit the EC2 I8g instances page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy

Now available: Storage optimized Amazon EC2 I7ie instances

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-available-storage-optimized-amazon-ec2-i7ie-instances/

The new storage optimized Amazon Elastic Compute Cloud (Amazon EC2) I7ie instances feature up to 120 TB of low latency NVMe storage and 5th generation Intel Xeon Scalable Processors with an all-core turbo frequency of 3.2 GHz. Fueled by 3rd Generation AWS Nitro SSDs, these instances deliver the highest storage density available in the cloud today. When compared to the previous generation of storage optimized instances, they provide:

  • Up to 65% better real-time storage performance per TB
  • Up to 50% lower I/O latency with up to 65% lower latency variability
  • Up to 40% better compute performance
  • Up to twice as many vCPUs and twice as much memory
  • 20% better price-performance

The instances are designed to support I/O intensive workloads that need a high degree of random IOPS: NoSQL databases, distributed file systems, search engines, data warehouses, and analytics.

I7ie instances are available in nine sizes with up to 192 vCPUs and 1.5 TiB of memory:

Instance Name | vCPUs | Memory | NVMe Storage (Nitro SSD) | EBS Bandwidth | Network Bandwidth
I7ie.large | 2 | 16 GiB | 1.25 TB (1 x 1.25 TB) | Up to 10 Gbps | Up to 25 Gbps
I7ie.xlarge | 4 | 32 GiB | 2.5 TB (1 x 2.5 TB) | Up to 10 Gbps | Up to 25 Gbps
I7ie.2xlarge | 8 | 64 GiB | 5 TB (2 x 2.5 TB) | Up to 10 Gbps | Up to 25 Gbps
I7ie.3xlarge | 12 | 96 GiB | 7.5 TB (3 x 2.5 TB) | Up to 10 Gbps | Up to 25 Gbps
I7ie.6xlarge | 24 | 192 GiB | 15 TB (2 x 7.5 TB) | Up to 10 Gbps | Up to 25 Gbps
I7ie.12xlarge | 48 | 384 GiB | 30 TB (4 x 7.5 TB) | 15 Gbps | Up to 25 Gbps
I7ie.18xlarge | 72 | 576 GiB | 45 TB (6 x 7.5 TB) | 22.5 Gbps | Up to 75 Gbps
I7ie.24xlarge | 96 | 768 GiB | 60 TB (8 x 7.5 TB) | 30 Gbps | Up to 100 Gbps
I7ie.48xlarge | 192 | 1,536 GiB | 120 TB (16 x 7.5 TB) | 60 Gbps | 100 Gbps

A larger L3 cache, increased memory bandwidth, and other improvements deliver increased processing power. The VP2INTERSECT instruction (part of AVX-512) accelerates Machine Learning and graph processing workloads; the Advanced Matrix Extensions (AMX) increase deep learning training and inferencing performance.

On the network side, the instances feature over 3x the EBS bandwidth of the previous generation of storage optimized instances. This accelerates just about every I/O-intensive use case, and is especially helpful when hydrating an in-memory database or caching server. All instance sizes support the Elastic Network Adapter (ENA) and can be launched in cluster placement groups; the 48xlarge instance size also supports the Elastic Fabric Adapter (EFA).

Things to Know
Here are a couple of things that you should know about these new instances:

Regions – We are launching in the US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and Europe (Frankfurt, London) AWS Regions today, with plans to expand to others in the future.

Purchase Options – I7ie instances are available in On-Demand, Spot, Savings Plan, Dedicated Instance, and Dedicated Host form.

Jeff;

Announcing future-dated Amazon EC2 On-Demand Capacity Reservations

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/announcing-future-dated-amazon-ec2-on-demand-capacity-reservations/

Customers use Amazon Elastic Compute Cloud (Amazon EC2) to run every type of workload imaginable, including web hosting, big data processing, high-performance computing (HPC), virtual desktops, live event streaming, and databases. Some of these workloads are so critical that customers asked for the ability to reserve capacity for them.

To help customers flexibly reserve capacity, we launched EC2 On-Demand Capacity Reservations (ODCRs) in 2018. Since then, customers have used capacity reservations (CRs) to run critical applications like hosting consumer websites, streaming live sporting events, and processing financial transactions.

Today, we’re announcing the ability to get capacity for future workloads using CRs. Many customers have future events such as product launches, large migrations, or end-of-year sales events like Cyber Monday or Diwali. These events are critical, and customers want to ensure they have the capacity when and where they need it.

While CRs helped customers reserve capacity for these events, they were only available just-in-time. So customers either needed to provision the capacity ahead of time and pay for it or plan with precision to provision CRs just-in-time at the start of the event.

Now you can plan and schedule your CRs up to 120 days in advance. To get started you specify the capacity you need, the start date, delivery preference, and the minimum duration you commit to use the capacity reservation. There are no upfront charges to schedule a capacity reservation. After Amazon EC2 evaluates and approves the request, it will activate the reservation on the start date, and customers can use it to immediately launch instances.

Getting started with future-dated capacity reservations
To reserve your future-dated capacity, choose Capacity Reservations in the Amazon EC2 console, select Create On-Demand Capacity Reservation, and then choose Get started.

To create a capacity reservation, specify the instance type, platform, Availability Zone, tenancy, and number of instances you are requesting.

In the Capacity Reservation details section, choose At a future date in the Capacity Reservation starts option and choose your start date and commitment duration.

You can also choose to end the capacity reservation at a specific time or manually. If you select Manually, the reservation has no end date. It will remain active in your account and continue to be billed until you manually cancel it. To reserve this capacity, choose Create.

After you create your capacity request, it appears in the dashboard with an Assessing status. In this state, AWS systems work to determine whether your request is supportable, which is usually done within 5 days. Once the systems determine that the request is supportable, the status changes to Scheduled. In rare cases, your request may be unsupported.

On your scheduled date, the capacity reservation will change to an Active state, the total instance count will be increased to the amount requested, and you can immediately launch instances.

After activation, you must hold the reservation for at least the commitment duration. After the commitment duration elapses, you can continue to hold and use the reservation if you’d like or cancel it if no longer needed.
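The lifecycle described above can be modeled as a small state machine, which is handy when writing automation that reacts to status changes. The following Python sketch is an illustration, not the service's API: the Assessing, Scheduled, and Active names come from the text, while the Unsupported and Cancelled state names, the ALLOWED table, and the can_transition helper are assumptions made for the example. It also does not model cancelling a request before activation, which the text does not cover.

```python
from enum import Enum


class State(Enum):
    ASSESSING = "Assessing"
    SCHEDULED = "Scheduled"
    UNSUPPORTED = "Unsupported"  # assumed name for a rejected request
    ACTIVE = "Active"
    CANCELLED = "Cancelled"      # assumed name for a cancelled reservation


# Transitions described in the text; terminal states allow none.
ALLOWED = {
    State.ASSESSING: {State.SCHEDULED, State.UNSUPPORTED},
    State.SCHEDULED: {State.ACTIVE},
    State.ACTIVE: {State.CANCELLED},
    State.UNSUPPORTED: set(),
    State.CANCELLED: set(),
}


def can_transition(current, target, days_active=0, commitment_days=0):
    """Return True if moving from current to target matches the lifecycle.

    An Active reservation can only be cancelled once it has been held
    for at least the commitment duration.
    """
    if target not in ALLOWED[current]:
        return False
    if current is State.ACTIVE and target is State.CANCELLED:
        return days_active >= commitment_days
    return True
```

For example, can_transition(State.ACTIVE, State.CANCELLED, days_active=10, commitment_days=14) returns False, because the commitment duration has not yet elapsed.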

Things to know
Here are some things that you should know about the future-dated CRs:

  • Evaluation – Amazon EC2 considers multiple factors when evaluating your request. Along with forecasted supply, Amazon EC2 considers how long you plan to hold the capacity, how early you create the Capacity Reservation relative to your start date, and the size of your request. To improve the ability of Amazon EC2 to support your request, create your reservation at least 56 days (8 weeks) before the start date. Requests must be for at least 100 vCPUs and are supported only for C, M, R, T, and I instance types. The recommended minimum commitment for most requests will be 14 days.
  • Notification – We recommend monitoring the status of your request through the console or Amazon EventBridge. You can use these notifications to trigger automation or send an email or text update. To learn more, visit Send an email when events happen using Amazon EventBridge in the Amazon EventBridge User Guide.
  • Pricing – Future-dated capacity reservations are billed just like regular CRs: they are charged at the equivalent On-Demand rate whether or not you run instances in the reserved capacity. For example, if you create a future-dated CR for 20 instances and run 15 instances, you will be charged for 15 active instances and for 5 unused instances in the reservation, including during the minimum duration. Savings Plans apply to both unused reservations and instances running on the reservation. To learn more, visit Capacity Reservation pricing and billing in the Amazon EC2 User Guide.
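The guidance in this list lends itself to a quick pre-flight check before you submit a request. The Python sketch below is a rough illustration under stated assumptions: the function names review_request and split_charges are hypothetical, the thresholds are taken from the text (120-day scheduling window, 56-day recommended lead time, 100 vCPU minimum, 14-day recommended commitment), and the instance family is guessed from the letters before the first digit of the instance type name, which may not match how Amazon EC2 classifies every type. It does not predict whether a request will be approved.

```python
import re
from datetime import date

MAX_LEAD_DAYS = 120                # "up to 120 days in advance"
RECOMMENDED_LEAD_DAYS = 56         # "at least 56 days (8 weeks)"
MIN_VCPUS = 100
RECOMMENDED_COMMITMENT_DAYS = 14
SUPPORTED_FAMILIES = {"c", "m", "r", "t", "i"}


def review_request(instance_type, vcpus_per_instance, count,
                   today, start, commitment_days):
    """Return (problems, advice) for a planned future-dated reservation."""
    problems, advice = [], []

    lead_days = (start - today).days
    if lead_days <= 0 or lead_days > MAX_LEAD_DAYS:
        problems.append(f"start date must be 1-{MAX_LEAD_DAYS} days ahead")
    elif lead_days < RECOMMENDED_LEAD_DAYS:
        advice.append(f"only {lead_days} days of lead time; 56 is recommended")

    if vcpus_per_instance * count < MIN_VCPUS:
        problems.append(f"request is below {MIN_VCPUS} vCPUs")

    # Assumption: family = letters before the first digit (m7g -> "m").
    family = re.match(r"[a-z]+", instance_type.lower())
    if family is None or family.group(0) not in SUPPORTED_FAMILIES:
        problems.append(f"{instance_type} is not a C, M, R, T, or I type")

    if commitment_days < RECOMMENDED_COMMITMENT_DAYS:
        advice.append("commitment is shorter than the recommended 14 days")

    return problems, advice


def split_charges(reserved, running):
    """Return (instances billed as running, instances billed as unused)."""
    used = min(reserved, running)
    return used, reserved - used
```

With the pricing example from the list, split_charges(20, 15) returns (15, 5): all 20 reserved instances are billed at the On-Demand rate, 15 as running instances and 5 as unused reserved capacity.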

Now available
Future-dated EC2 Capacity Reservations are available today in all AWS Regions where Amazon EC2 Capacity Reservations are available.

Give Amazon EC2 Capacity Reservations a try in the Amazon EC2 console. To learn more, visit On-Demand Capacity Reservations in the Amazon EC2 User Guide and send feedback to AWS re:Post for Amazon EC2 or through your usual AWS Support contacts.

Channy