All posts by Art Baudo

Powering generative AI/ML solutions with AWS Outposts Servers at Edge locations

Post Syndicated from Art Baudo original https://aws.amazon.com/blogs/compute/powering-generative-ai-ml-solutions-with-aws-outposts-servers-at-edge-locations/

This post is written by Brian Daugherty, Principal Solutions Architect, Leonardo Queirolo, Senior Cloud Support Engineer, and Reet Kundu, Senior Cloud Support Engineer

Many organizations are vigorously pursuing generative AI initiatives in the Amazon Web Services (AWS) cloud today because generative AI drives advances in productivity, efficiency, and innovation.

However, for some organizations, industries, and use-cases, there is a compelling need to deploy generative AI not only in the cloud, but also at the edge due to factors such as application latency and proximity to critical data.

AWS Outposts can help these organizations address this need by extending AWS services to the edge, such as generative AI services, while maintaining the same tooling and orchestration capabilities found in AWS Regions.

Industrial and manufacturing use-cases are a primary focus of AWS Outposts Servers, which can be deployed on-premises to minimize latency and maintain stable connectivity between orchestration and control applications, such as Manufacturing Execution Systems (MES) or Supervisory Control and Data Acquisition (SCADA) systems, and the industrial processes they control.

This post explores how to use Outposts Servers to power generative AI solutions at the edge. The example use-case demonstrates real-time anomaly detection for industrial processes and an edge-based human machine interface including a small language model (SLM) with Retrieval-Augmented Generation (RAG) to guide operators on best practices for problem resolution. Although the use case is specific, the tools and methods can be applied to many other edge generative AI use cases.

For hands-on experience implementing this solution using Outposts Servers, fill out this form with your contact information and we will get back to you with lab access. A detailed step-by-step guide to develop the hands-on example is available at this link.

Architecture overview

As depicted in the following diagram, the solution is distributed in three modules. The first module (1) guides you to establish low-latency, local connectivity to an MQTT broker within the same on-premises network as your lab Amazon Elastic Compute Cloud (Amazon EC2) instance. You configure essential AWS infrastructure (Amazon S3, AWS Secrets Manager, AWS Identity and Access Management (IAM)) to manage the deployment, authentication, and permissions of AWS IoT Greengrass components. You deploy a component to the existing Greengrass core device on your lab EC2 instance to retrieve synthetic Arduino sensor data from the broker using its Local Network Interface (LNI).

Figure 1 – Architectural diagram of the solution to perform low-latency, local inference through generative AI and ML models running on Outposts Servers

In the second module (2), you deploy a component that detects anomalies in sensor data in real-time. This component runs on the Outposts Server EC2 instance hosting the AWS IoT Greengrass core device, performing inference directly at the edge. You use synthetic Arduino sensor data to generate anomalies and observe them being detected by the model. You configure an IoT rule to send the anomaly count to the Amazon CloudWatch Dashboard in the Region. This provides centralized monitoring, while making sure that the raw data and any sensitive data are processed locally at the edge, where latency and connectivity are assured.

In the third module (3), you deploy a comprehensive edge computing solution to enhance operational visibility and decision-making capabilities at the local level. The solution includes a local dashboard that provides real-time telemetry, displaying raw sensor data and detected anomalies. A Virtual Assistant is integrated with an SLM to provide context-aware responses based on the factory data, along with a forecasting capability to predict future anomaly trends.

Outposts Server

Outposts Servers provide fully managed AWS infrastructure, services, APIs, and tools for edge use-cases. Two form factors are available: 1U servers are AWS Graviton based, and 2U servers are third-generation Intel Xeon Scalable processor based.

Enabling anomaly detection at the edge

Outposts Servers allow local sensor data processing for low-latency anomaly detection and resilience against external connectivity issues, as shown in the following figure. The example uses synthetic Arduino devices with gyroscope sensor data, simulating industrial sensors sending data to an MQTT Broker on an EC2 instance in the Outposts Server. Gyroscope data is used in various monitoring systems, such as motion control systems, orientation detection, and stability and balance mechanisms. The Lab EC2 instance fetches sensor data through the MQTT client and processes it using a machine learning (ML) model for anomaly detection.

Figure 2 – Architectural diagram showing data flow from Arduino sensors through MQTT broker and EC2 on Outposts Server to perform local inference

Outposts server LNI

Local communication between the synthetic Arduino sensors, the MQTT broker, and the Lab EC2 instance uses LNI, which provides a Layer 2 presence on the local network. The setup requires creating an Elastic Network Interface (ENI) on an Outposts subnet with the LNI enabled, attaching it to the Lab EC2 Instance, and verifying connectivity to the MQTT Broker’s LNI IP using the command ping -c 5 <MQTT_BROKER_LNI_IP>. This enables direct, low-latency communication between the components, which is crucial for this edge computing scenario.
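A ping only confirms that the broker host answers on the network. As a complementary check, the following minimal sketch tests whether a TCP connection to the broker port can be opened. It is not part of the lab: broker_reachable is a hypothetical helper name, and the default port of 1883 is an assumption taken from the subscriber configuration used in this post.

```python
import socket


def broker_reachable(host: str, port: int = 1883, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks
        return False
```

For example, broker_reachable("<MQTT_BROKER_LNI_IP>") would return False if the LNI-enabled ENI were not attached or the broker were not listening.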

AWS IoT Greengrass

AWS IoT Greengrass is an open source edge runtime and cloud service for device software management and deployment supported on Outposts Server. This hybrid approach combines the benefits of edge computing with centralized management, such as:

  • Centralized artifact management: store and version component artifacts in Amazon S3, enabling consistent deployment across multiple edge locations.
  • Secure configuration: use Secrets Manager to handle sensitive information and credentials unique to each edge location.
  • Fleet monitoring: use CloudWatch for centralized monitoring and logging across your distributed edge deployment.
  • Automated updates: deploy software updates and model improvements across your edge fleet through AWS IoT Greengrass component management.

AWS IoT Greengrass components, such as the one used for the anomaly detection, can be deployed to EC2 instances running on Outposts Servers. After configuring the Lab EC2 instance with Greengrass, you can download components from an S3 bucket. The first component deploys a subscriber that receives synthetic Arduino sensor data from the MQTT broker, using the following configuration.

{
    "broker": "<MQTT_BROKER_LNI_IP>",
    "port": 1883,
    "client_id": "OutpostsServerMLEdge_<workshop-id>",
    "sensor_name": "ArduinoSensor_<arduino-id>",
    "topic": "arduino/ArduinoSensor_<arduino-id>/3-axis-rotation",
    "thing_name": "OutpostsServerMLEdge_Sub",
    "mqttauth_creds": "<ARN_SECRET_MQTT_CREDENTIALS>"
}
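Because the sensor name appears both in sensor_name and inside the topic, it can help to derive the configuration from a few inputs rather than editing each field by hand. The following is a small sketch under that assumption; build_subscriber_config is a hypothetical helper, not part of the workshop code, and the sample values are placeholders.

```python
import json


def build_subscriber_config(broker_ip: str, workshop_id: str,
                            arduino_id: str, secret_arn: str) -> dict:
    """Fill in the subscriber configuration for one synthetic sensor."""
    sensor_name = f"ArduinoSensor_{arduino_id}"
    return {
        "broker": broker_ip,
        "port": 1883,
        "client_id": f"OutpostsServerMLEdge_{workshop_id}",
        "sensor_name": sensor_name,
        # The topic embeds the sensor name, so derive it instead of typing it twice
        "topic": f"arduino/{sensor_name}/3-axis-rotation",
        "thing_name": "OutpostsServerMLEdge_Sub",
        "mqttauth_creds": secret_arn,
    }


if __name__ == "__main__":
    # Placeholder values for illustration only
    config = build_subscriber_config(
        "<MQTT_BROKER_LNI_IP>", "ws01", "7", "<ARN_SECRET_MQTT_CREDENTIALS>"
    )
    print(json.dumps(config, indent=4))
```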

The second component is the Anomaly Detector artifact that processes sensor data in real-time, detects anomalies using a pre-trained model, and sends anomaly counts to AWS IoT Core. Key components include:

  • edge_application.py: script for processing sensor data, performing local inference using pre-trained model in ONNX format, and publishing anomaly counts to AWS IoT Core. It is used for local inference, so that the raw data is not exposed outside the Edge location.
  • model: directory storing “arduino.onnx”, a pre-trained Autoencoder model for anomaly detection.
  • statistics: directory storing the values of different statistical functions (for example, mean and standard deviation) from the training phase and used by edge_application.py for inference.
  • functions: directory storing the code of the functions, such as the code to publish to the AWS IoT Core.
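The post does not show the scoring logic inside edge_application.py. As a rough illustration of how an autoencoder is commonly used for anomaly detection, the sketch below standardizes a reading with the training mean and standard deviation, passes it through a model, and flags the reading when the reconstruction error exceeds a threshold. Everything here is an assumption: the function names, the mean-squared-error metric, the threshold value, and clip_reconstruct, which is only a stand-in for the ONNX model so that the example runs without extra dependencies.

```python
from typing import Callable, List, Sequence


def standardize(reading: Sequence[float], mean: Sequence[float],
                std: Sequence[float]) -> List[float]:
    """Scale each axis using statistics saved from the training phase."""
    return [(x - m) / s for x, m, s in zip(reading, mean, std)]


def reconstruction_error(original: Sequence[float],
                         reconstructed: Sequence[float]) -> float:
    """Mean squared error between the model input and its reconstruction."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def is_anomaly(reading: Sequence[float], mean: Sequence[float], std: Sequence[float],
               reconstruct: Callable[[List[float]], List[float]],
               threshold: float) -> bool:
    """Flag a reading whose reconstruction error exceeds the threshold."""
    scaled = standardize(reading, mean, std)
    return reconstruction_error(scaled, reconstruct(scaled)) > threshold


def clip_reconstruct(scaled: List[float]) -> List[float]:
    """Stand-in for the autoencoder: reproduces values only within +/-2 std."""
    return [max(-2.0, min(2.0, v)) for v in scaled]


if __name__ == "__main__":
    mean, std = [0.0, 0.0, 0.0], [1.0, 1.0, 1.0]  # hypothetical training statistics
    for xyz in ([0.5, -0.3, 1.0], [6.0, 0.0, 0.0]):
        print(xyz, is_anomaly(xyz, mean, std, clip_reconstruct, threshold=0.5))
```

An autoencoder trained on normal motion reconstructs familiar readings closely and unfamiliar ones poorly, which is why the error can serve as an anomaly score.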

After deployment of subscriber and detector components, the Lab EC2 instance processes synthetic gyroscope data from Arduino sensors, detecting anomalies during X, Y, or Z axis movement:

Real-time Dashboard showing sensor data and anomaly count

Real-time anomaly detection results from gyroscope sensor data across X, Y, and Z axes.

Building upon the foundation of Outposts Server, Local Network Interface (LNI), and AWS IoT Greengrass, this solution extends beyond anomaly detection to deliver comprehensive edge AI capabilities. These core components work together to enable advanced generative AI applications at the edge, as demonstrated in the following sections.

Edge generative AI applications with Outposts Server

The solution demonstrates the implementation of key edge generative AI capabilities:

  • Contextual virtual assistance: providing on-site personnel with AI-powered guidance and troubleshooting using local operational data, SOPs, and technical documentation.
  • Predictive insights: using foundation models (FMs) to forecast future trends based on historical data, enabling proactive planning and optimization.
  • Real-time operational dashboard: integrating sensor data visualization with AI-powered insights and forecasts in a unified local interface that maintains operations during connectivity interruptions.

1. Contextual virtual assistance at the edge

The solution implements the virtual assistant through an AWS IoT Greengrass component. The following is a snippet from the component recipe showing the key configuration parameters:

{
    "ComponentConfiguration": {
        "DefaultConfiguration": {
            // Workshop defaults, SLM runs locally on same EC2 instance
            "SLM_endpoint": "http://localhost:8080",  
            "embedding_model": "all-MiniLM-L6-v2",    
            "knowledge_base_directory": "Factory_Data" 
        }
    }
    // Additional component recipe configurations...
}

Although the solution demonstrates a streamlined setup with the SLM running on the same EC2 instance as the AWS IoT Greengrass component, the architecture enables flexible deployment options through the SLM_endpoint configuration. Organizations can:

  • Deploy the SLM on a dedicated resource in their on-premises network (for example "http://<LNI-IP-DEDICATED-RESOURCE>:8080")
  • Use existing hardware infrastructure accessible through LNI
  • Scale SLM compute resources independently from the AWS IoT Greengrass component
  • Maintain low-latency communication through local network interfaces

The implementation showcases a streamlined approach to RAG at the edge through three main components:

Knowledge base management: the solution uses Amazon S3 for document storage (PDFs, Markdown, text) with automatic edge deployment through AWS IoT Greengrass. Alternatively, you can store the documents in local storage. A vector database, such as ChromaDB, handles local vector storage and similarity search, enabling efficient knowledge base updates with centralized control.

Flexible query processing: the implementation provides a streamlined interface for RAG management, allowing users to load site-specific knowledge bases and switch between basic SLM and RAG-enhanced responses with local context:

if prompt := st.chat_input("Question"):
    # Add retrieved context only when a knowledge base has been loaded
    if "db" in st.session_state:
        prompt = augmentPrompt(prompt, st.session_state["db"])
    response = getStreamingAnswer(prompt, SLM_MODEL_ENDPOINT)
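The body of augmentPrompt is not shown in the post. The following toy sketch illustrates the general retrieve-then-augment idea only: it ranks documents by cosine similarity to the question and prepends the best match to the prompt. It is an assumption-laden stand-in, using word counts instead of the all-MiniLM-L6-v2 embedding model and a plain list instead of a vector database; augment_prompt, embed, and cosine are hypothetical names.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a sentence-embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def augment_prompt(question: str, documents: List[str], top_k: int = 1) -> str:
    """Prepend the top_k most similar documents to the question."""
    query = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(query, embed(d)), reverse=True)
    context = "\n".join(ranked[:top_k])
    return (f"Use the following context to answer.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}")


if __name__ == "__main__":
    docs = [
        "Replace the conveyor belt every 12 months.",
        "If the gyroscope drifts, recalibrate the sensor mount.",
    ]
    print(augment_prompt("What should I do when the gyroscope drifts?", docs))
```

A real deployment would replace embed with the configured embedding model and the sorted call with a similarity query against the vector database.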

Modular SLM integration: The solution uses a standardized chat completion API, which allows for integration with different SLM deployments while maintaining a consistent interface across the edge fleet:

def getStreamingAnswer(question: str, endpoint: str):    
    chat_template = '<|user|>\n{input} <|end|>\n<|assistant|>'
    payload = {
        'messages': [{'content': f'{chat_template.format(input=question)}'}],
        'stream': True
    }
    SLM_URL = endpoint + '/v1/chat/completions'
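The snippet stops before the request is sent, and the post does not show how the streamed reply is consumed. Assuming the SLM server follows the widely used OpenAI-style streaming convention (server-sent event lines of the form "data: {json}", ending with "data: [DONE]"), the text could be extracted as in the sketch below. The function name iter_stream_text and the response format are assumptions; check the format of the server you deploy.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style streaming chat completion lines."""
    for raw in lines:
        line = raw.decode("utf-8").strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        text = delta.get("content")
        if text:
            yield text


if __name__ == "__main__":
    # Hand-written sample lines, not captured from a real server
    sample = [
        b'data: {"choices":[{"delta":{"role":"assistant"}}]}',
        b"",
        b'data: {"choices":[{"delta":{"content":"Check "}}]}',
        b'data: {"choices":[{"delta":{"content":"the mount."}}]}',
        b"data: [DONE]",
    ]
    print("".join(iter_stream_text(sample)))
```

With the requests library, the lines argument could be the iterator returned by response.iter_lines() on a request made with stream=True.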

This flexible architecture can be adapted for many industrial use-cases where latency and proximity to local data-sources and processes are critical.

2. Predictive insights using local models

The solution demonstrates forecasting capabilities using Chronos, a small and efficient time series forecasting model that can run entirely at the edge. The following solution implementation shows how to process historical data and generate predictions using Chronos on the AWS IoT Greengrass component deployed on Outposts Server:

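import numpy as np
import torch
from chronos import ChronosPipeline

# Load Chronos model locally on the Outposts Server
pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small",
    device_map="cpu",
    torch_dtype=torch.bfloat16,
)

# Generate forecasts with confidence intervals
# (df, pred_length, and n_samples are defined elsewhere in the component)
def predict_anomaly_count_data():
    forecast = pipeline.predict(
        context = torch.tensor(df["total_anomalies"]),
        prediction_length = pred_length,
        num_samples = n_samples,
        top_k = 50,
        top_p = 1.0,
    )

    # Calculate confidence bounds across the sampled forecast paths
    low, median, high = np.quantile(forecast[0].numpy(), [0.1, 0.5, 0.9], axis=0)
    # Assumption: the bounds are returned to the caller for plotting
    return low, median, high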
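To make the confidence-bound step concrete: the model returns several sampled forecast paths, and the 10th, 50th, and 90th percentiles are taken across those samples at each future time step. The dependency-free sketch below reproduces that calculation with linear interpolation, which is the default method of numpy.quantile. The names quantile and forecast_bands and the sample values are made up for illustration.

```python
from typing import List, Sequence


def quantile(values: Sequence[float], q: float) -> float:
    """Quantile with linear interpolation between the two nearest order statistics."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def forecast_bands(samples: Sequence[Sequence[float]],
                   quantiles: Sequence[float] = (0.1, 0.5, 0.9)) -> List[List[float]]:
    """samples[i][t] is sample path i at future step t; return one band per quantile."""
    steps = list(zip(*samples))  # regroup the values by time step
    return [[quantile(step, q) for step in steps] for q in quantiles]


if __name__ == "__main__":
    # Five hypothetical sample paths over two future time steps
    paths = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]]
    low, median, high = forecast_bands(paths)
    print(low, median, high)
```

A wide gap between the low and high bands at a given step signals that the sampled paths disagree, so the forecast for that step is less certain.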

Although the solution uses sample data for the demonstration, this architecture allows organizations to process complex, real-time data at each edge location. Companies can choose to upload only aggregated metrics to CloudWatch or Amazon QuickSight for fleet monitoring and BI analysis, making sure that sensitive raw data remains secure at the edge.

3. Real-time operational dashboard

The solution showcases a resilient monitoring solution where all inter-component communication occurs within the local network and processing happens on the Outposts server, so the solution remains fully functional during external network interruptions. The dashboard is accessible through the LNI of the Outposts server, allowing local clients to maintain access through the LNI IP address even when connectivity to the Region is lost.

Through a unified interface, the dashboard provides:

  • Real-time visualization of sensor readings
  • Anomaly detection results from the local ML component
  • AI-powered insights from the local SLM
  • Trend forecasting from the Chronos model

Real-time Dashboard showing sensor data and anomaly count

Virtual Assistant leveraging Factory Data to provide contextualized answers

Chronos forecasting anomaly count based on historical data

Conclusion

The implementation demonstrates how AWS Outposts Server enables organizations to use both traditional ML and advanced generative AI capabilities at the edge for a variety of industrial and manufacturing use-cases where low latency and proximity to sensitive or real-time data are business- and process-critical.

To get started with AWS Outposts and explore use cases like this edge AI solution, fill out this form and our team will contact you with lab access and additional guidance. For a detailed walkthrough of this specific edge AI example, refer to this step-by-step guide. For more information about AWS Outposts Server, see the AWS Outposts Server User Guide.

Anchoring AWS Outposts servers with AWS Direct Connect

Post Syndicated from Art Baudo original https://aws.amazon.com/blogs/compute/anchoring-aws-outposts-servers-with-aws-direct-connect/

This post is written by Perry Wald, Principal GTM SA, Hybrid Edge; Eric Vasquez, Senior SA, Hybrid Edge; and Fernando Galves, Gen AI Solutions Architect, Outposts

AWS Outposts is a fully managed service that extends AWS infrastructure, services, APIs, and tools to customer premises. Outposts servers, launched in 2022, are 1U or 2U rack-mountable hosts that can run Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic Container Service (Amazon ECS), as well as other smaller-scale edge services such as AWS IoT Greengrass. This version of Outposts is primarily focused on bringing lower-latency AWS compute capabilities to the edge at many user locations.

During Outposts provisioning, you or AWS creates a service link connection that connects your Outposts server to your chosen AWS Region or home Region. Outposts depends on regional connectivity “to reach out to home,” needing very little in terms of networking. Looking at the network requirements, it needs:

  • DHCP, to assign an IP address and a default gateway
  • Public DNS, to resolve the name of the initial regional endpoint, to allow automated setup, and
  • Internet access, so that when the regional endpoint has been resolved, the Outpost can reach that endpoint, with a minimum of 500 Mbps of bandwidth and a maximum of 175 ms round-trip latency

User challenges with internet connectivity at the edge

When you order an Outposts server, you are responsible for installing the server. Outposts servers are self-provisioning and need a service link connection between your Outposts and the AWS Region (or home Region). This connection allows for the management of Outposts and the exchange of traffic to and from the AWS Region. Server deployment can be broken down into the following steps: installing the Outposts servers, powering them on, and providing authentication details through a command line. Then, the Outposts servers reach out to the regional endpoint and provision themselves. Your Outpost status shows as Active when the process has completed, which could take a few hours depending on service link bandwidth.

Although this has been suitable for the vast majority of use cases, there are some locations that can’t provide internet connectivity in their environments. This has mostly been in use cases where there is a strong security reason for not having an internet connection (such as financial services kiosks, small manufacturing facilities, and defense), so as to avoid risks such as DDoS attacks and potential hack attempts, or to meet requirements for receiving an authority to operate (ATO).

These locations either have some form of Direct Connect link or, more commonly, have a centralized Direct Connect link to AWS and an MPLS network linking all their remote sites to a central one. In both of these scenarios, the requirement is to allow the Outposts servers to resolve and reach the public endpoint for setup, and subsequently the public anchor endpoint for management. This must be done without leaving the AWS ecosystem, without unnecessarily exposing the site to potential internet threats, and without adding more self-managed systems, by making use of AWS services instead.

To meet this requirement, we identified several key things that need to be provided if the user does not have internet connectivity at the remote location, as follows:

  1. DHCP, to provide the Outposts servers with an IP address, default gateway, and DNS servers.
  2. Public DNS access to resolve both the setup endpoint, and when live, the anchor endpoint.
  3. Public internet access, without exposing the user location to potentially harmful traffic from the internet.

Direct Connect VIF options

There are three different types of Virtual Interfaces (VIF) possible to configure on an AWS Direct Connect link:

  • Public VIF: A public VIF can access all AWS public services using public IP addresses.
  • Private VIF: A private VIF should be used to access an Amazon Virtual Private Cloud (Amazon VPC) using private IP addresses.
  • Transit VIF: A transit VIF should be used to access one or more Amazon VPC Transit Gateways associated with Direct Connect gateways.

Transit VIF option

A transit VIF can be used to solve both of these issues. First, a transit VIF deploys an ENI within a VPC (known as an attachment), so that traffic coming from the transit VIF into a VPC can be routed. This satisfies the rule that, because VPC routing is non-transitive, traffic has to be either sourced from or destined for an ENI in the VPC.

If the traffic is forwarded to a regional VPC through the transit gateway, then it can be forwarded to the internet through a NAT gateway. This is an enhancement of the architecture to use a transit gateway to provide a single egress point for multiple VPCs to the internet. For more information, see Creating a single internet exit point from multiple VPCs Using AWS Transit Gateway. In this case, instead of the transit gateway routing multiple VPCs to the internet, it’s routing to an on-premises connection.

Using a transit gateway to forward traffic to a NAT gateway allows you to provide internet connectivity for the Outposts servers without managing virtual appliances, because NAT gateway provides this as a service. NAT gateways also only allow outbound access, so they provide security against any attempted external access by a bad actor from the internet. This works for Outposts servers because they only need outbound access. Outposts always initiate communication to an anchor or service endpoint, and they never receive communication except as a response.

Figure 1. Architectural diagram showing the use of a Transit VIF and NAT gateway in a Region reaching regional endpoints

DNS provisioning

Although the preceding architecture solves the challenge of how we provide a path for IP packets to transit between the Outposts servers and the public endpoints needed, it doesn’t solve the issue of resolving DNS names. If the remote site is isolated from the internet, then it has no clear way to resolve DNS.

Amazon Route 53 Resolver endpoints allow you to deploy an IP address within a VPC subnet, which provides DNS resolution. There are two types of resolver endpoints: outbound and inbound.

Outbound resolver endpoints are used by AWS to send DNS queries to your on-premises DNS servers. Inbound resolver endpoints are used by your DNS servers (and hosts) to resolve addresses within Route 53.

Route 53 can resolve public DNS names, so the Outposts service endpoint outposts.<region-name>.amazonaws.com becomes resolvable by an inbound resolver endpoint.

Configuring the Outposts egress VPC

  1. Set up the service link egress VPC, build subnets, and deploy a NAT gateway and a transit gateway.
  2. Create a Route 53 Resolver inbound endpoint.
  3. Configure DHCP on the switch, and make sure that the DNS value matches the resolver endpoint.
  4. Configure the transit VIF on the switch, build a BGP peer, and attach it to your transit gateway.
  5. Confirm propagation settings on the transit gateway and default routes.
  6. Confirm routes on subnets to allow traffic out to the internet, and back to your Outposts servers.
  7. Test name resolution (dig) and HTTPS connectivity (curl) to the service endpoint.
  8. If needed, install your Outposts servers.
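The name resolution and HTTPS tests in the list above are normally run with dig and curl. Where those tools are unavailable, a rough equivalent can be scripted with the Python standard library, as in the following sketch. The helper names are hypothetical, the endpoint name follows the outposts.<region-name>.amazonaws.com pattern mentioned earlier, and the Region in the example is a placeholder.

```python
import socket
import ssl


def service_endpoint(region: str) -> str:
    """Build the Outposts service endpoint name for a Region."""
    return f"outposts.{region}.amazonaws.com"


def resolves(hostname: str, resolver=socket.getaddrinfo) -> bool:
    """Return True if the configured DNS server can resolve the hostname."""
    try:
        return len(resolver(hostname, 443)) > 0
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def https_reachable(hostname: str, timeout: float = 5.0) -> bool:
    """Return True if a TLS handshake with hostname:443 completes."""
    context = ssl.create_default_context()
    try:
        with socket.create_connection((hostname, 443), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=hostname):
                return True
    except OSError:  # includes timeouts, refused connections, and ssl.SSLError
        return False


if __name__ == "__main__":
    host = service_endpoint("us-west-2")  # placeholder Region
    print(host, "resolves:", resolves(host))
    print(host, "https reachable:", https_reachable(host))
```

Run from a host on the same subnet and with the same DHCP-provided DNS settings as the Outposts servers, so that the result reflects the path the servers will use.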

Public VIF option

Using a public VIF allows you to provide an internet connection directly to the on-premises site. In turn, this means you need to implement firewalls and security functions on this connection, adding more layers of operational overhead. A public VIF also means that the on-premises end of the VIF can be accessed by any public IP on the AWS public network, regardless of the instance to which that IP is mapped. A public VIF is a public IP endpoint on the AWS public network. You should treat public VIF traffic as internet-based traffic. This can become cumbersome for firewall teams if they have to allow-list known AWS IP ranges and manage a stateful firewall for a large range of AWS IPs.

Furthermore, even if the user is happy to implement and manage a firewall on the end of that public VIF, there is still the question of how the Outpost would resolve DNS names in this setup, first for the setup endpoint and subsequently for the anchor endpoints. Unless the private network already has DNS resolution to a public DNS, there are no DNS servers that DHCP can point to in order to give the Outposts servers name resolution. This is because there is no public DNS endpoint within the AWS public network. Traffic from a user’s public VIF can access the AWS public network, but it can’t exit it to other public networks. For example, if you had configured DHCP to point to one of the well-known DNS servers (such as 8.8.8.8), then, because this DNS server lives outside of the AWS public network, requests originating from the on-premises side of a public VIF would be dropped when they hit the border of the AWS autonomous system.

The only way for a DNS request to be resolved would be to build a BIND forwarding service within a VPC, provide it with a public IP address, and point the DHCP DNS values at this IP address.

This network configuration introduces complexity, and won’t be possible for those with highly regulated workloads. You would need to manage a firewall on-premises, allow a public network to reach the on-premises location, and manage a BIND server setup within a VPC. For these reasons, a public VIF is generally not an option unless the user is already running one, and is familiar with the steps to secure it.

Figure 2. Architectural diagram showing traffic flow using a public VIF and AWS Outposts

Private VIF option

A private VIF, whether connected directly to a virtual private gateway (VGW) or through a Direct Connect gateway, does not meet this requirement, because VPCs do not support transitive routing. To explain this another way, any traffic following a routing rule in a subnet route table has to either originate from, or be destined for, an IP address (or to be more explicit, an Elastic Network Interface (ENI)) inside that VPC.

Virtual private gateways do not have an ENI associated with them, but are pointed to as a next hop within a subnet routing table. If we take this example and look at what the Outposts servers would be trying to pass as traffic, then it would send a packet with a source address of the Outposts servers, and a destination address of the Outposts service public endpoint (assuming that it could resolve it). When this packet reaches the VPC, then neither the source nor destination address would belong to an ENI within the VPC. Therefore, VPC routing would drop the packet.

Even if there were a routing rule on the subnet pointing the next hop for all traffic to a NAT gateway (ideal for internet egress), the routing still wouldn’t work. This is because the packet from the Outposts servers doesn’t have a destination of the NAT gateway, but instead a destination of the setup endpoint on the internet.
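The rule described above can be expressed as a toy model. The sketch below is only a simplification of the rule as this post states it, not a description of how VPC networking is implemented; the function name vpc_forwards and all addresses are made up for illustration.

```python
from ipaddress import ip_address
from typing import Iterable


def vpc_forwards(src: str, dst: str, eni_addresses: Iterable[str]) -> bool:
    """Toy rule: forward only if the source or destination is an ENI in the VPC."""
    enis = {ip_address(addr) for addr in eni_addresses}
    return ip_address(src) in enis or ip_address(dst) in enis


if __name__ == "__main__":
    vpc_enis = ["10.0.1.10", "10.0.2.20"]  # hypothetical NAT gateway and instance ENIs
    outposts_server = "192.168.50.10"      # hypothetical on-premises address
    public_endpoint = "203.0.113.25"       # documentation-range address

    # Arriving over a private VIF: neither address is an ENI, so the packet is dropped
    print(vpc_forwards(outposts_server, public_endpoint, vpc_enis))
    # A packet addressed to an ENI inside the VPC is accepted
    print(vpc_forwards(outposts_server, "10.0.2.20", vpc_enis))
```

The first call shows why a route pointing at the NAT gateway does not help: the packet is still addressed to the public endpoint, not to the NAT gateway ENI.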

It’s possible to use a combination of ingress routing and transparent proxies to ingest the traffic and pass it to an instance running a proxy service to forward to the internet. However, this adds the complexity of managing and maintaining proxy servers. For these reasons, a private VIF is generally not recommended.

Figure 3. Architectural diagram showing VGW and packet drops because of transitive routing not being supported

Conclusion

In this post, we discussed architecture patterns you can use to provision your Outposts when public internet connectivity is unavailable. To get started with Outposts servers, please visit our Server User Guide. For more information, contact us.

Implementing network traffic inspection on AWS Outposts rack

Post Syndicated from Art Baudo original https://aws.amazon.com/blogs/compute/implementing-network-traffic-inspection-on-aws-outposts-rack-2/

This post is written by Arun Kumar N C, Technical Account Manager; Debapriyo Jogi, Technical Account Manager; and Ashish Nagaraj, Cloud Support Engineer 2

Organizations are increasingly adopting hybrid cloud architectures that combine the scalability of cloud computing with the control and compliance benefits of on-premises infrastructure. AWS Outposts extends AWS infrastructure, AWS services, APIs, and tools to on-premises locations for workloads that require low latency, local data processing, or data residency. Outposts comes in a variety of form factors, from 42U Outposts racks to 1U and 2U Outposts servers. This post will focus on implementing network traffic inspection on Outposts rack.

Comprehensive security is critical for organizations deploying production workloads on Outposts. Network traffic inspection serves as a crucial security control, protecting against threats while enabling secure communication between different network segments. This post provides guidance on how to implement effective network traffic inspection across your hybrid cloud infrastructure using Outposts rack.

Overview

In the coming sections we will cover strategies for network traffic inspection on Outposts rack, focusing on outbound internet access and communication with on-premises networks. We explore AWS native services and third-party tools, offering a comprehensive overview of your options. We will cover architectural patterns, implementation guides, and best practices to help build a strong security posture for your hybrid cloud environment.

Securing internet-facing applications

Securing internet-facing applications on Outposts requires a robust, multi-layered approach for high availability and comprehensive security. The following sections explore two key architectural patterns that provide enterprise-grade security for your workloads.

Amazon CloudFront with AWS WAF integration

This architecture uses multiple AWS services including AWS Shield and AWS WAF for multi-layered security, Amazon CloudFront for global content delivery, and an Application Load Balancer (ALB) on Outposts for on-premises traffic management. Applications are deployed on Outposts, with CloudFront as the content delivery network. AWS WAF rules on CloudFront protect against web exploits, while the ALB distributes requests to application instances within Outposts.

This diagram illustrates AWS CloudFront with WAF integration connecting AWS cloud services to a customer data center through the internet. The setup includes CloudFront protected by WAF and Shield, EC2 instances in a VPC, and an Outpost deployment in the customer data center with ALB and EC2 instances.

Figure 1 – Amazon CloudFront with AWS WAF integration

  1. User sends a request via web browser or mobile app to access the application.
  2. The request is received by CloudFront at an AWS edge location, which performs content-based routing.
  3. CloudFront integrates with AWS WAF to filter web traffic and block common attack patterns.
  4. The ALB routes the request to the appropriate targets.
  5. The application on Outposts processes the request and generates a response.

This flow ensures secure and efficient handling of user requests using both cloud and on-premises resources.

ALB with AWS WAF


This architecture offers more control over traffic routing while using AWS WAF for security. Applications are deployed on Outposts, but the ALB is in the parent Region, as AWS WAF cannot be associated with Outposts ALBs. The regional ALB handles incoming traffic, with AWS WAF providing firewall capabilities. After passing through AWS WAF, traffic is routed to Outposts applications. This configuration allows advanced WAF features but may introduce latency, as traffic must first reach the regional ALB. This trade-off between security and latency should be considered based on application needs.

Note: A critical dependency exists on the service link connection, as application traffic routing relies on the regional ALB. Service link failures will disrupt workload operations, making connection resilience essential for this architecture.

Figure 2 – ALB with AWS WAF

  1. User sends a request via web browser or mobile app for a webpage, API call, or service.
  2. The ALB in the AWS Region receives the request and performs Layer 7 content-based routing.
  3. ALB integrates with AWS WAF for security inspection.
  4. If the request passes, ALB routes it to the appropriate target in Outposts, selecting a specific instance or service.
  5. The application on Outposts processes the request, generates a response, and returns it.
  6. The response travels back from Outposts to the regional ALB, which forwards it to the user’s browser or app.

Inspection between the Outpost subnet and regional subnet

Network traffic inspection between the Outpost and regional subnets is vital for security in hybrid cloud deployments. It makes sure traffic between Outposts and the parent Region complies with security policies and requirements. Two main architectural approaches exist for implementing this inspection:

  1. Using a third-party firewall in the Outpost subnet.
  2. Using AWS Network Firewall in an AWS Region.

Both approaches support various connectivity (service link) options between Outposts and the Region, including AWS Direct Connect.

Using third-party firewall in the Outpost subnet

This architecture uses a third-party firewall in the Outposts subnet, routing all traffic between the Outposts and regional subnets through it. This setup enables local traffic inspection, reducing latency while enforcing security policies before traffic leaves the Outposts.

This diagram illustrates AWS Region and customer data center connected via service-linked VPN. Outpost deployment includes third-party firewall EC2 instance and target EC2 instances in Outpost subnet.

Figure 3 – Third-party firewall in the Outpost subnet

Traffic can originate from either the Outposts subnet or the AWS regional subnet.

  1. Traffic originating from the Outpost to the AWS Region:

a. Traffic is sent to the third-party firewall in the Outpost.
b. The firewall inspects the traffic and applies security policies.
c. If allowed, the firewall forwards traffic to the Region.
d. Traffic travels via service link connectivity (Direct Connect or public internet) to the regional subnet.

  2. Traffic originating from the AWS Region to the Outpost:

a. Traffic originates in the regional subnet.
b. Traffic travels via service link connectivity (Direct Connect or public internet).
c. Upon reaching the Outpost, the traffic is sent to the third-party firewall.
d. The firewall inspects packets and applies security policies.
e. If allowed, the firewall forwards traffic to the Outpost subnet destination.

Using AWS Network Firewall in an AWS Region

In this architecture, a Network Firewall is deployed in the regional VPC, routing all traffic between the Outpost and regional subnets through it. This centralized approach ensures consistent policy enforcement with AWS native tools. The firewall inspects all traffic between Outposts and the AWS infrastructure in the Region.

This diagram illustrates AWS Network Firewall in a Region connected to customer data center via service-linked VPN. Includes VPC with Network Firewall, and EC2 in Outpost routing the traffic through the AWS Network Firewall endpoint.

Figure 4 – AWS Network Firewall in an AWS Region

Traffic can originate from either the Outposts subnet or AWS regional subnet.

All traffic is routed to the Network Firewall in the AWS Region.

  1. The firewall applies configured rules, including:
    • Custom rules for specific security needs.
    • Managed AWS rule groups for common threats.
    • Third-party rule groups for specialized protection.
  2. If traffic passes all rules, it is forwarded to its destination (Outpost or Region).
  3. Return traffic follows the same path, so all traffic is inspected by the Network Firewall.

Inspection between on-premises and Outposts through Local Gateway

Network traffic inspection between on-premises networks and Outposts via the Local Gateway (LGW) is essential for securing hybrid environments. It helps keep communication between Outposts workloads and on-premises infrastructure secure.
Two primary architectural approaches are available, as explained below. The choice depends on infrastructure, security needs, and operational preferences.

Using third-party firewall on Outposts

For more details on implementing network traffic inspection between on-premises networks and Outposts via LGW, refer to Implementing network traffic inspection on AWS Outposts rack.

This post expands on the preceding blog by offering detailed guidance on architectural options and traffic flows for inspecting network traffic between on-premises environments and Outposts via LGW.

Using your on-premises router/firewall

This approach uses the existing firewall capabilities of your on-premises router/firewall. The network is configured to route all traffic between the on-premises environment and Outposts through this router/firewall. The LGW on your Outpost connects directly to your router/firewall, which handles the firewall functions. This setup uses the on-premises security infrastructure and policies, ensuring continuity in security management while integrating Outposts into the broader network security strategy.

Traffic flow:

  1. Traffic originates from on-premises network
  2. Passes through your router with the firewall
  3. Router inspects the traffic
  4. If allowed, traffic is sent to Outposts through the LGW

Outbound inspection to the internet from Outposts instances

Outbound internet traffic inspection for Outposts instances is useful for security and controlling access to external resources. Three architectural approaches are available for implementing this inspection, which are discussed in the following sections.

Using Customer-Owned IP (CoIP) with on-premises firewall

In this architecture, Outposts instances are assigned Customer-Owned IP (CoIP) addresses, with all outbound internet traffic routed through the on-premises network and firewall. The LGW connects the Outposts environment to the on-premises network. This setup enables organizations to leverage existing on-premises security and internet connectivity while ensuring consistent IP addressing across their hybrid environment.

This diagram illustrates Customer-Owned IP (CoIP) implementation in a customer data center, where Outpost EC2 instances use CoIP addresses, routing through LGW to an on-premises firewall for inspection.

Figure 5 – Customer-Owned IP (CoIP) with on-premises firewall

  1. An Outposts instance with a CoIP address initiates outbound internet traffic.
  2. The traffic is routed to the LGW on the Outpost.
  3. The LGW forwards the traffic to the on-premises network.
  4. The traffic reaches the on-premises firewall, which inspects it and applies security policies and rules.
  5. If allowed, the firewall forwards the traffic to the internet through the on-premises connection.
  6. Return traffic follows the reverse path, being inspected by the firewall before reaching the Outposts instance.

Using CoIP with third-party firewalls on Outposts

In this configuration, you assign CoIP addresses to your Outposts instances and deploy a third-party firewall appliance directly on the Outposts rack. Outbound internet traffic from these instances is routed through the local firewall running on EC2 before reaching the internet via the LGW. This approach ensures local traffic inspection while preserving the advantages of CoIP addressing, enabling seamless integration with existing IP management systems.

This diagram illustrates third-party firewalls on Outposts as EC2 instances. The customer data center contains an Outpost subnet with EC2 instances using CoIP, connected to the LGW. Traffic routes through the customer edge router/firewall before reaching the internet.

Figure 6 – CoIP with third-party firewalls on Outposts

  • An Outposts instance with a CoIP address initiates outbound internet traffic.
  • The traffic is routed to the third-party firewall deployed on the Outpost.
  • The firewall performs deep packet inspection, applying security policies and rules.
  • If allowed, the firewall forwards the traffic to the LGW.
  • The LGW sends the traffic to the internet through the on-premises connection.
  • Return traffic follows the reverse path, being inspected by the firewall before reaching the Outposts instance.

Using Internet Gateway (IGW) with Network Firewall in the Region

This architecture provides secure outbound internet access for Outposts workloads by using services in the parent Region. The VPC extends to include the Outposts rack, with internet-bound traffic routed via the service link to the AWS Region. In the Region, the Network Firewall inspects the traffic before forwarding it to the Internet Gateway (IGW) for internet access.

Traffic flow:

  1. Traffic is sent to the parent Region via the service link.
  2. In the Region, traffic is routed to the Network Firewall.
  3. The Network Firewall inspects the traffic and applies rules.
  4. If allowed, traffic is forwarded to the IGW via the NAT Gateway.
  5. The IGW sends the traffic to the internet.
  6. Return traffic follows the reverse path, inspected before reaching Outposts.
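
The traffic flow above can be pictured as a chain of next hops. The following Python sketch is only an illustrative model under stated assumptions: the hop names are hypothetical labels rather than AWS resource identifiers, and the code does not call any AWS API or reflect real route table syntax.

```python
# Illustrative model of the outbound path described in the steps above.
# Hop names are hypothetical labels, not real AWS resource IDs.
NEXT_HOP = {
    "outposts-instance": "service-link",
    "service-link": "network-firewall",
    "network-firewall": "nat-gateway",
    "nat-gateway": "internet-gateway",
    "internet-gateway": "internet",
}


def outbound_path(start: str, firewall_allows: bool = True) -> list[str]:
    """Follow next hops from start; stop at the firewall if traffic is denied."""
    path = [start]
    current = start
    while current in NEXT_HOP:
        if current == "network-firewall" and not firewall_allows:
            break  # traffic is dropped at the firewall and goes no further
        current = NEXT_HOP[current]
        path.append(current)
    return path


def return_path(path: list[str]) -> list[str]:
    """Return traffic retraces the outbound path in reverse."""
    return list(reversed(path))


if __name__ == "__main__":
    print(" -> ".join(outbound_path("outposts-instance")))
```

Because every hop sits behind the firewall entry in the chain, denied traffic never reaches the NAT Gateway or the IGW in this model.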

Conclusion

Implementing effective network traffic inspection for AWS Outposts requires a strategic approach balancing security, efficiency, and architectural complexity. We’ve explored multiple architectural patterns for implementing network traffic inspection with Outposts rack.

Reach out to your AWS account team or AWS Support to learn more about traffic inspection on Outposts.

Migrating your on-premises workloads to AWS Outposts Rack

Post Syndicated from Art Baudo original https://aws.amazon.com/blogs/compute/migrating-your-on-premises-workloads-to-aws-outposts-rack-2/

This post is written by Craig Warburton, Senior Solutions Architect, Hybrid; Sedji Gaouaou, Senior Solutions Architect, Hybrid; and Brian Daugherty, Principal Solutions Architect, Hybrid.

Migrating workloads to AWS Outposts Rack offers you the opportunity to gain the benefits of cloud computing while keeping your data and applications on premises.

If your organization has strict data residency requirements, deploying AWS infrastructure and services on premises lets you keep sensitive data and mission-critical applications within your own data centers or facilities, helping ensure compliance with data sovereignty laws and regulatory frameworks.

On the other hand, if your organization does not have stringent data residency requirements, you may opt for a hybrid approach, using both Outposts Rack and the AWS Regions. With this flexibility, you can process and store data in the most appropriate location based on factors such as latency, cost optimization, and application requirements.

In this post, we cover options to migrate your workloads to an Outposts Rack, taking into account your specific data residency requirements. We explore strategies, tools, and best practices to enable a successful migration tailored to your organization’s needs.

Overview

AWS has several services to help you migrate and rehost workloads, including AWS Migration Hub, AWS Application Migration Service, and AWS Elastic Disaster Recovery. Alternatively, you can use backup and recovery solutions provided by AWS partners.

At AWS, we use the 7 Rs framework to help organizations evaluate and choose the appropriate migration strategy for moving applications and workloads to the AWS Cloud. The 7 Rs represent:

  1. Rehosting (rehost or lift and shift)
  2. Replatforming (lift, tinker, and shift)
  3. Repurchasing (republish or re-vendor)
  4. Refactoring (re-architecting)
  5. Retiring
  6. Retaining (revisit)
  7. Relocating (remigrate).

This post focuses on rehosting and the services available to help rehost on-premises applications to Outposts Rack.

Before getting started with any migration, AWS recommends a three-phase approach to migrating workloads to the cloud (AWS Region or Outposts Rack). The three phases are assess, mobilize, and migrate and modernize.

Figure 1: Diagram showing the three migration phases of assess, mobilize, and migrate and modernize


This post describes the steps that you can take in the migrate and modernize phase. However, the assess and mobilize phases are also critical to allow you to understand what applications are migrated, the dependencies between them, and the planning associated with how and when migration occurs.

Workload migration to Outposts Rack: With staging environment in a Region

After deploying an Outposts Rack to your desired on-premises location, you can perform migrations of on-premises systems and virtual machines using either Application Migration Service and AMI creation or third-party backup and recovery services. Both scenarios are described in the following sections.

Scenario 1: Using Application Migration Service with AMI creation

Application Migration Service is able to lift and shift a large number of physical or virtual servers without compatibility issues, performance disruption, or long cutover windows.

In this scenario, at least one Outposts Rack is deployed on premises with the following prerequisites:

  • An AWS Replication Agent installed on each source server
  • At least one Outposts Rack installed and activated
  • VPC in an AWS Region
  • Staging subnet for staging migrated instances
  • Cutover subnet for validating migrated instances
  • Extended VPC spanning Region to the Outposts Rack
  • Migrated resources subnet where instances will be deployed from AMIs

The following diagram shows the solution architecture including the prerequisites and the on-premises servers that will be migrated to the Outposts Rack.

Figure 2: Architecture diagram showing migration with Application Migration Service


Step 1: Outposts Rack configuration

You can work with AWS specialists to size your Outposts for your workload and application requirements. In this scenario, you don’t need additional Outposts Rack capacity for migration because the staging area will be deployed in the Region (see 1 in Figure 2).

Step 2: Prepare Application Migration Service

Set up Application Migration Service from the console in the Region to which your Outposts Rack is anchored. If this is your first setup, then choose Get started on the Application Migration Service console. When creating the replication settings template, ensure that your staging area is using subnets in the anchor Region (see 2 in Figure 2).

Step 3: Install the AWS Replication Agent to the source servers or machines

For large migrations, source servers may have a wide variety of operating system versions and may be distributed across multiple data centers. Application Migration Service offers the MGN connector, a feature that allows you to automate running commands on your source environment. Finally, ensure that communication is possible between the agent and Application Migration Service (see 3 in Figure 2).

In the following image, there is an example of deploying the AWS Replication Agent providing the necessary parameters (AWS Region, AWS access key and AWS secret access key).

Figure 2: Architecture diagram showing migration with Application Migration Service

When the AWS Replication Agent is installed, the server is added to the Application Migration Service console. Next, it undergoes the initial synchronization process, which is complete when the server shows the Ready for testing lifecycle state in the Application Migration Service console.

Step 4: Configure launch settings

Prior to testing or cutting over an instance, you must configure the launch settings by creating Amazon Elastic Compute Cloud (Amazon EC2) launch templates, ensuring that your cutover subnet is selected and that you choose an available instance type (see 4 in Figure 2). The instance type right-sizing feature allows Application Migration Service to launch a test or cutover instance type that best matches the hardware configuration of the source server. When you select the Basic option, Application Migration Service launches a test or cutover instance type that best matches the OS, CPU, and RAM of your source server.
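
To make the idea of matching CPU and RAM concrete, here is a minimal Python sketch. It is not the algorithm that Application Migration Service uses, which the post does not describe. The function name, the candidate list, and the tie-break rule are assumptions chosen for illustration; the vCPU and memory figures are the published sizes for these instance types.

```python
from typing import NamedTuple, Optional


class InstanceSpec(NamedTuple):
    name: str
    vcpus: int
    memory_gib: int


# Published vCPU and memory sizes for a few EC2 instance types.
CANDIDATES = [
    InstanceSpec("c5.xlarge", 4, 8),
    InstanceSpec("c5.4xlarge", 16, 32),
    InstanceSpec("r5.2xlarge", 8, 64),
    InstanceSpec("r5.4xlarge", 16, 128),
]


def smallest_fit(src_vcpus: int, src_memory_gib: int,
                 candidates=CANDIDATES) -> Optional[InstanceSpec]:
    """Pick the smallest candidate that covers the source CPU and RAM."""
    fits = [c for c in candidates
            if c.vcpus >= src_vcpus and c.memory_gib >= src_memory_gib]
    # Smallest by memory first, then vCPUs (an arbitrary tie-break for this sketch).
    return min(fits, key=lambda c: (c.memory_gib, c.vcpus), default=None)


if __name__ == "__main__":
    # Hypothetical source server with 8 vCPUs and 48 GiB of RAM.
    print(smallest_fit(8, 48))
```

On an Outpost, the candidate list would be limited to the instance types configured on the rack, so a source server can end up on a larger type than it would in a Region.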

Step 5: Install AWS Systems Manager Agent on your cutover instances

When the launch settings are defined, you must activate the post-launch actions for either a specific server or all the servers. You must leave the Install the Systems Manager agent and allow executing actions on launched servers option toggled on in order for post-launch actions to work. Turning the option off prevents Application Migration Service from installing the AWS Systems Manager Agent on your servers, and post-launch actions are no longer executed (see 5 in Figure 2).

Figure 3: Post-launch actions on the Application Migration Service console


Step 6: Testing and cutover in Region

When you have configured the launch settings for each source server, you are ready to launch the servers as test instances. Best practice is to test instances before cutover.

Figure 4: Application Migration Service console ready to launch test instances


Finally, after completing the testing of all the source servers, you are ready for cutover (see 6 in Figure 2). Prior to launching cutover instances, check that the source servers are listed as Ready for cutover under Migration lifecycle and Healthy under Data replication status.

Figure 5: Application Migration Console ready for cutover


To launch the cutover instances, choose the instances you want to cut over and then choose Launch cutover instances under Cutover (see Figure 5). The Application Migration Service console indicates Cutover finalized when the cutover has completed successfully. The chosen source servers’ Migration lifecycle column shows the Cutover complete status, the Data replication status column shows Disconnected, and the Next step column shows Mark as archived. The source servers have now been successfully migrated into AWS. You can now archive your source servers that have launched cutover instances.

Step 7: Create a Migration AMI

After migrating all your workloads to the Region to which your Outposts Rack is anchored, create Amazon Machine Images (AMIs) from the migrated instances. When you create an AMI from an instance, Amazon EC2 powers down the instance before creating the AMI to make sure that everything on the instance is stopped and in a consistent state during the creation process. If you are confident that your instance is in a consistent state appropriate for AMI creation, you can tell Amazon EC2 not to power down and reboot the instance.
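
As a rough illustration of the no-reboot choice, the sketch below builds the arguments for the EC2 CreateImage API in Python. The helper name, instance ID, and image name are hypothetical placeholders. The block only assembles the request so that it runs without credentials; the boto3 call is shown as a comment.

```python
def build_create_image_request(instance_id: str, image_name: str,
                               no_reboot: bool = False) -> dict:
    """Build keyword arguments for the EC2 CreateImage API."""
    return {
        "InstanceId": instance_id,
        "Name": image_name,
        # NoReboot=True skips the shutdown; file-system consistency is then
        # the caller's responsibility.
        "NoReboot": no_reboot,
    }


if __name__ == "__main__":
    # Placeholder instance ID and image name, for illustration only.
    request = build_create_image_request("i-0123456789abcdef0", "migrated-app-server")
    # With boto3 installed and credentials configured, the call would be:
    #   import boto3
    #   boto3.client("ec2").create_image(**request)
    print(request)
```

Leaving no_reboot at its default of False keeps the safer behavior described above, where the instance is stopped before the image is taken.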

This step can be automated using an existing Post Launch Action.

Step 8: Launch instances on AWS Outposts

The final step is to launch instances on your Outpost from the AMIs you created. To identify the EC2 instance types configured on your Outpost, you can use the following AWS Command Line Interface (AWS CLI) command:

aws outposts get-outpost-instance-types \
  --outpost-id op-abcdefgh123456789

The output of this command lists the instance types and sizes configured on your Outpost:

InstanceTypes:
- InstanceType: c5.xlarge
- InstanceType: c5.4xlarge
- InstanceType: r5.2xlarge
- InstanceType: r5.4xlarge

With knowledge of the instance types configured, you can now determine how many of each are available. For example, the following AWS CLI command, which is run on the account that owns the Outpost, lists the number of c5.xlarge instances available for use:

aws cloudwatch get-metric-statistics \
  --namespace AWS/Outposts \
  --metric-name AvailableInstanceType_Count \
  --statistics Average --period 3600 \
  --start-time $(date -u -Iminutes -d '-1hour') \
  --end-time $(date -u -Iminutes) \
  --dimensions \
  Name=OutpostId,Value=op-abcdefgh123456789 \
  Name=InstanceType,Value=c5.xlarge

This command returns:

Datapoints:
- Average: 10.0
  Timestamp: '2024-04-10T10:39:00+00:00'
  Unit: Count
Label: AvailableInstanceType_Count

The output indicates that there were (on average) 10 c5.xlarge instances available in the specified time period (one hour). Using the same command for the other instance types, you discover that there are also 20 c5.4xlarge, 10 r5.2xlarge, and 6 r5.4xlarge available for use in completing the necessary EC2 launch templates.
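
Once you have the available counts, a small script can compare them against what a migration needs. The following Python sketch assumes datapoints shaped like the output above. The function names and the migration plan are hypothetical and not part of any AWS tool.

```python
from datetime import datetime


def latest_average(datapoints: list[dict]) -> float:
    """Return the Average of the most recent datapoint (0.0 if there are none)."""
    if not datapoints:
        return 0.0
    newest = max(datapoints, key=lambda d: datetime.fromisoformat(d["Timestamp"]))
    return float(newest["Average"])


def capacity_shortfall(available: dict[str, float],
                       required: dict[str, int]) -> dict[str, int]:
    """Map each instance type to the number of instances still missing."""
    shortfall = {}
    for instance_type, needed in required.items():
        missing = needed - int(available.get(instance_type, 0))
        if missing > 0:
            shortfall[instance_type] = missing
    return shortfall


if __name__ == "__main__":
    sample = [{"Average": 10.0, "Timestamp": "2024-04-10T10:39:00+00:00", "Unit": "Count"}]
    available = {
        "c5.xlarge": latest_average(sample),
        "c5.4xlarge": 20, "r5.2xlarge": 10, "r5.4xlarge": 6,
    }
    # Hypothetical migration plan, not taken from the post.
    plan = {"c5.xlarge": 8, "r5.4xlarge": 7}
    print(capacity_shortfall(available, plan))
```

An empty result means the plan fits within the capacity reported for the period; any entry names an instance type to resize or reschedule before creating launch templates.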

Scenario 2: Using partner backup and replication solutions

You may already be using a third-party or AWS Partner solution to create on-premises backups of bare-metal or virtualized systems. These solutions often use local disk-arrays or object stores to create tiered backups of systems covering restore-points going back years, days, or just a few hours or minutes.

These solutions may also have inherent capabilities to restore from these backups directly to AWS. This enables migration of on-premises systems to EC2 instances deployed to Outposts Rack.

In the scenario illustrated in Figure 6, the partner backup and replication service (BR) creates backups (see 1 in Figure 6) of virtual machines to on-premises disk or object storage repositories. Using the service’s AWS integration, virtual machines can be restored (see 2 in Figure 6) to an EC2 instance deployed on Outposts Rack, which is also on-premises. The restoration may follow a process that uses helper instances and volumes (see 3 in Figure 6) during intermediate steps to create Amazon Elastic Block Store (Amazon EBS) snapshots (see 4 in Figure 6) and then AMIs of the systems being migrated (see 5 in Figure 6), which are ultimately deployed (see 6 in Figure 6) to Outposts Rack.

Figure 6: Architecture diagram of the partner backup and replication scenario


When deploying an AMI created from a restored instance, you must specify the target VPC and subnet. These should be the VPC being extended to the Outpost and a subnet that has been created in that VPC on the Outpost. You also need to specify an EC2 instance type that is available on the Outpost, which can be discovered using the process described in the previous section.
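
The launch parameters described above can be sketched in Python as arguments for the EC2 RunInstances API. The helper name and all IDs are hypothetical placeholders, and the instance type check against the Outpost's configured types is an illustrative safeguard rather than something the API performs in this form.

```python
def build_run_instances_request(ami_id: str, instance_type: str,
                                outpost_subnet_id: str,
                                outpost_instance_types: set[str]) -> dict:
    """Build keyword arguments for the EC2 RunInstances API.

    Raises ValueError if the instance type is not configured on the Outpost.
    """
    if instance_type not in outpost_instance_types:
        raise ValueError(f"{instance_type} is not configured on this Outpost")
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        # A subnet created on the Outpost places the instance on the Outpost.
        "SubnetId": outpost_subnet_id,
        "MinCount": 1,
        "MaxCount": 1,
    }


if __name__ == "__main__":
    configured = {"c5.xlarge", "c5.4xlarge", "r5.2xlarge", "r5.4xlarge"}
    # Placeholder AMI and subnet IDs, for illustration only.
    request = build_run_instances_request(
        "ami-0123456789abcdef0", "c5.xlarge", "subnet-0123456789abcdef0", configured)
    # With boto3 installed and credentials configured, the call would be:
    #   import boto3
    #   boto3.client("ec2").run_instances(**request)
    print(request)
```

The VPC is implied by the subnet, so only the subnet ID appears in the request.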

Workload migration to Outposts Rack using AWS Elastic Disaster Recovery (DRS)

Data residency can be a critical consideration for organizations that collect and store sensitive information, such as personally identifiable information (PII), financial data, or medical records. AWS Elastic Disaster Recovery, supported on Outposts Rack, helps enable seamless replication of on-premises data to Outposts Rack and addresses data residency concerns by keeping data within your on-premises environment, using Amazon EBS and Amazon S3 on Outposts.

In this scenario, an Outposts Rack is deployed on-premises with the following prerequisites:

  • At least one Outposts Rack installed and activated
  • The Outposts Rack must be in Direct VPC Routing (DVR) mode
  • VPC extended to the Outposts Rack containing subnets for staging and target resources
  • Amazon S3 on Outposts (necessary for all Elastic Disaster Recovery replication destinations)
  • An AWS Replication Agent installed on each source server

The following diagram shows the solution architecture and includes the on-premises servers that are migrated from the local network to the Outposts Rack. It also includes the staging VPC used to deploy the replication servers on Outposts Rack, Amazon S3 on Outposts to store the local Amazon EBS snapshots, and the target VPC extended to Outposts Rack.

Figure 7: Architecture diagram for workflow migration to Outposts Rack


Step 1: Outposts Rack configuration

To use Elastic Disaster Recovery on Outposts Rack, you need to configure both Amazon EBS and Amazon S3 on Outposts to support continuous replication and point-in-time recovery for your workload needs (see 1 in Figure 7). Specifically, you need to size the Amazon EBS and Amazon S3 on Outposts capacity according to your workload capacity requirements and application interdependencies. To do this, you can define dependency groups: each dependency group is a collection of applications and their underlying infrastructure with technical or non-technical dependencies. A 2:1 ratio is recommended for the EBS volumes used for near-continuous replication, and a 1:1 ratio is recommended for the Amazon S3 on Outposts capacity used for EBS snapshots. For example, to migrate 40 TB of workloads, you need to plan for 80 TB of EBS volumes and 40 TB of Amazon S3 on Outposts capacity.
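
The sizing arithmetic can be written as a short Python sketch. It applies only the 2:1 and 1:1 ratios stated above; the function name and the dependency group names and sizes are hypothetical.

```python
def drs_outposts_capacity_tb(workload_tb: float) -> dict[str, float]:
    """Apply the 2:1 EBS and 1:1 Amazon S3 on Outposts ratios from the text."""
    if workload_tb < 0:
        raise ValueError("workload size must be non-negative")
    return {
        "ebs_tb": 2 * workload_tb,             # volumes for near-continuous replication
        "s3_on_outposts_tb": 1 * workload_tb,  # capacity for EBS snapshots
    }


if __name__ == "__main__":
    # Hypothetical dependency groups that together make up 40 TB of workloads.
    dependency_groups_tb = {"erp": 25, "reporting": 10, "file-shares": 5}
    total_tb = sum(dependency_groups_tb.values())
    print(total_tb, drs_outposts_capacity_tb(total_tb))
```

Summing by dependency group first makes it easier to migrate in waves, because each group's share of the EBS and S3 on Outposts capacity can be planned separately.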

Step 2: Extend VPC to your Outposts Rack

When your Outpost has been provisioned and is available, extend the necessary Amazon Virtual Private Cloud (Amazon VPC) connection to the Outpost from the Region by creating the desired staging and target subnets (see 2 in Figure 7).

Step 3: Prepare Elastic Disaster Recovery service

Prepare the Elastic Disaster Recovery service from the Console to set the default replication and launch settings. When defining these settings, make sure that the Outposts resources available are chosen for staging and target subnets and instance and storage type (see 3 in Figure 7).

Step 4: Install the AWS Replication Agent to the source servers or machines

The next phase is to install the AWS Replication Agent on the source servers and verify that the agent can communicate with your Outposts replication subnet through the Outposts local gateway, so that replication traffic stays on the local network (see 4 in Figure 7).

Step 5: Continuous block-level replication

Staging area resources are automatically created and managed by Elastic Disaster Recovery. When the AWS Replication Agent has been deployed, continuous block-level replication (compressed and encrypted in transit) occurs (see 5 in Figure 7) over the local network.

Step 6: Launch Outposts Rack resources

Finally, migrated instances can now be launched using Outposts Rack resources based on the launch settings defined previously (see 6 in Figure 7).

Conclusion

In this post, you have learned how to migrate your workloads from your on-premises environment to AWS Outposts Rack based on your specific data residency requirements. When you have the flexibility of using AWS Regional services, AWS migration services or partner solutions can be used with infrastructure already in place. If your data must stay on-premises, AWS Elastic Disaster Recovery lets you migrate to Outposts Rack without using Regional services, so your data does not leave the boundary of a given geographic location.

To learn more about an end-to-end migration and modernization journey, visit the AWS Migration Hub.

Implementing a serverless architecture to detect absence of Guardrails in Amazon Bedrock inference API calls

Post Syndicated from Art Baudo original https://aws.amazon.com/blogs/compute/implementing-a-serverless-architecture-to-detect-absence-of-guardrails-in-amazon-bedrock-inference-api-calls/

This post is written by Sayan Chakraborty, Senior Solutions Architect, AWS


In today’s rapidly evolving artificial intelligence (AI) landscape, organizations are increasingly harnessing the power of foundation models through Amazon Bedrock to build sophisticated generative AI applications. Although this technology opens up exciting possibilities, it also brings forth important considerations around responsible AI implementation and content safety.

Amazon Bedrock Guardrails serve as a crucial safeguard, helping organizations filter out harmful content, prevent prompt injection attacks (LLM01:2025 from OWASP Top 10 for generative AI), and maintain ethical AI practices. These configurable safeguards are essential for enterprises committed to responsible AI development, especially when scaling their applications across various use cases.

However, there’s a critical consideration: although Guardrails are powerful, they’re optional by default in Amazon Bedrock inference API calls. For organizations that mandate the use of Guardrails as part of their responsible AI strategy, a solution is needed to make sure of consistent implementation across all API requests.

In this post, we explore how to build a serverless architecture that automatically detects when Guardrails are absent in Amazon Bedrock inference API calls. We demonstrate how enterprises can implement automated monitoring and alerting systems to maintain compliance with their AI safety standards, making sure that Guardrails are properly implemented wherever needed. This solution is particularly valuable for organizations prioritizing secure and responsible AI deployment at scale.

Prerequisites

Before proceeding with the implementation, make sure that you do the following:

1. Create an AWS account if you do not already have one, and log in. The AWS Identity and Access Management (IAM) user that you use must have sufficient permissions to make necessary AWS service calls and manage AWS resources.

2. Have the AWS Command Line Interface (AWS CLI) installed and configured.

3. Have Git installed.

Architecture

The following diagram shows an event-driven architecture of this solution.

Figure 1: Solution architecture diagram


Amazon Bedrock supports model invocation logging. When enabled, it collects the full request data, response data, and metadata associated with all model invocation calls performed in your AWS account. Logging can be configured to send the logs to supported destinations such as Amazon CloudWatch Logs and Amazon S3. This solution uses an S3 bucket to collect these logs. Note that this solution supports the below Amazon Bedrock inference APIs:

As logs get stored in the S3 bucket, an Amazon S3 event notification is generated to an Amazon EventBridge event bus. A rule that matches “Object Created” events from Amazon S3 routes these events to an AWS Step Functions state machine, which defines the orchestration logic to inspect the model invocation logs for missing Guardrails, and sends out an alert to a monitored email address when applicable.
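
To illustrate the event routing, the Python sketch below builds an EventBridge event pattern for Amazon S3 "Object Created" events and extracts the bucket name and object key from such an event. The function names, bucket name, and object key are hypothetical, and filtering on a single bucket is an assumption; the pattern used by the actual solution may differ.

```python
import json


def object_created_pattern(bucket_name: str) -> dict:
    """EventBridge event pattern for S3 "Object Created" events in one bucket."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {"bucket": {"name": [bucket_name]}},
    }


def log_location(event: dict) -> tuple[str, str]:
    """Extract (bucket name, object key) from an S3 event delivered by EventBridge."""
    detail = event["detail"]
    return detail["bucket"]["name"], detail["object"]["key"]


if __name__ == "__main__":
    # Hypothetical bucket name and object key, for illustration only.
    print(json.dumps(object_created_pattern("example-bedrock-logs"), indent=2))
    sample_event = {
        "source": "aws.s3",
        "detail-type": "Object Created",
        "detail": {
            "bucket": {"name": "example-bedrock-logs"},
            "object": {"key": "logs/example.json.gz"},
        },
    }
    print(log_location(sample_event))
```

The bucket name and object key returned by log_location are the two values a downstream step needs to fetch the log file from Amazon S3.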

Walkthrough of the orchestration

As mentioned previously, the Step Functions state machine is the orchestration engine that performs the business logic for this solution, as events are received from new logs created in the S3 bucket. When opened in Workflow Studio in the Step Functions console, you should observe the following diagram.


Figure 2: Step Functions state machine diagram as seen in workflow studio

1. The first step in the state machine calls an AWS Lambda function to get the log from the S3 bucket, using the bucket name and object key supplied in the event object received from EventBridge.

2. If the log shows that the Amazon Bedrock API invocation was successful, then the state machine collects the output object of the API response from the log, which is needed for further evaluation.

3. The next step checks whether Amazon Bedrock Guardrails was used. This is done by looking for specific objects in the Amazon Bedrock API output that was captured from the log.

4. If a Guardrail was detected, then the flow completes successfully, and no further action is needed.

5. If a Guardrail was not detected, then the next step collects the pieces of information from the log file that are necessary to record the transaction and adds the transaction date. Then, the transaction is logged to the transactions table in Amazon DynamoDB.

6. A user or role may make many API calls to Amazon Bedrock each day. Therefore, the solution implements a mechanism to prevent the monitored email address from being swamped by emails reporting the same user or role more than once each day. This is done in parallel with Step 5: the flow queries the notifications table in DynamoDB to check whether the principal’s identity (user ID/IAM role) is recorded as notified on the current date. If no results are found, meaning that a notification hasn’t been sent yet, then an email is sent to the monitored email address through an Amazon Simple Notification Service (Amazon SNS) topic. An item is also inserted into the notifications table in DynamoDB to prevent more notifications on the same day for the same principal.
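The decision logic in the steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the state machine definition from the repository: the record fields (`status`, `output`, `principal`, `request_id`) and the guardrail marker key are hypothetical, an in-memory list and set stand in for the two DynamoDB tables, and Steps 5 and 6 run one after the other rather than in parallel.

```python
from datetime import date

# Assumption: a response produced with a guardrail carries this top-level key.
# The objects that the real solution looks for may differ.
GUARDRAIL_MARKER_KEYS = ("amazon-bedrock-guardrailAction",)


def guardrail_detected(output_body):
    """Step 3: look for guardrail markers in the captured API output."""
    return any(key in output_body for key in GUARDRAIL_MARKER_KEYS)


def process_log(record, transactions, notified, today):
    """Mirror Steps 2-6 for one parsed log record and return the action taken."""
    if record.get("status") != "success":       # hypothetical field name
        return "error"                          # handled separately by the solution
    if guardrail_detected(record["output"]):
        return "ok"                             # Step 4: nothing more to do
    principal = record["principal"]
    day = today.isoformat()
    # Step 5: record the transaction (stand-in for the transactions table).
    transactions.append(
        {"principal": principal, "date": day, "request_id": record["request_id"]}
    )
    # Step 6: notify at most once per principal per day.
    if (principal, day) in notified:
        return "logged"
    notified.add((principal, day))              # stand-in for the notifications table
    return "logged_and_notified"


transactions, notified = [], set()
unguarded = {
    "status": "success",
    "output": {"completion": "example text"},
    "principal": "arn:aws:iam::111122223333:role/example-role",
    "request_id": "req-1",
}
print(process_log(unguarded, transactions, notified, date(2025, 1, 1)))
print(process_log(unguarded, transactions, notified, date(2025, 1, 1)))
```

The second call records the transaction again but does not notify, which is the once-per-day behavior that Step 6 describes.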

Solution deployment

For deployment instructions, follow along in the GitHub repo or use this post. An AWS CloudFormation template is provided to deploy the solution.

1. Create an S3 bucket to store the model invocation logs from Amazon Bedrock. Under bucket Properties, turn the EventBridge notifications to On. This enables Amazon S3 to send an event notification to the EventBridge default event bus whenever a log file is created in the bucket by Amazon Bedrock.

2. Go to the Amazon Bedrock console and enable Model invocation logging under Bedrock Configuration > Settings, from the left navigation pane. Specify the bucket created in Step 1 under S3 location.

Figure 3: Amazon Bedrock settings for Model invocation logging

3. Create two more S3 buckets: one that the Step Functions state machine uses to store Amazon Bedrock model invocation errors detected in the logs, and another that stores the Lambda function code for this solution. Inside the latter bucket, create a folder called code (or any other preferred name) and upload the ZIP archive from the lambda-code folder of this repository into it. Note the names of these two S3 buckets and the Amazon S3 object key for the Lambda ZIP file. These must be specified as input parameters to the CloudFormation template.

4. From the CloudFormation console or using CLI, create a stack using the template provided in this repository called bedrock-guardrails-detection-template.yaml. For inputs, specify the BedrockLogsBucket (from Step 1), BedrockLogsErrorBucket (from Step 3), LambdaFunctionCodeBucket (from Step 3), LambdaFunctionCodeBucketKey (S3 object key for the ZIP file uploaded in Step 3, for example code/get-bedrock-logs-from-s3.py.zip), and NotificationEmailAddress (email address to subscribe to the SNS topic). It may take a few minutes to complete deployment of the CloudFormation stack.

5. When deployment is complete, access the email inbox for the email address specified during the CloudFormation stack deployment, and confirm the subscription using the email sent from the Amazon SNS topic. The email should be titled: AWS Notification – Subscription Confirmation. Choose the Confirm subscription link inside the email to complete the subscription process. The email account is now ready to receive notifications from this solution.
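If you create the stack programmatically instead of from the console, the five inputs named in Step 4 can be assembled as a parameter list. The following sketch only builds that list. The parameter keys come from Step 4, while the bucket names and email address are hypothetical placeholders. The resulting list has the shape that the boto3 CloudFormation `create_stack` call accepts for its `Parameters` argument.

```python
def build_stack_parameters(logs_bucket, error_bucket, code_bucket, code_key, email):
    """Return the CloudFormation parameters for the template described in Step 4."""
    values = {
        "BedrockLogsBucket": logs_bucket,          # bucket from Step 1
        "BedrockLogsErrorBucket": error_bucket,    # first bucket from Step 3
        "LambdaFunctionCodeBucket": code_bucket,   # second bucket from Step 3
        "LambdaFunctionCodeBucketKey": code_key,   # S3 key of the uploaded ZIP file
        "NotificationEmailAddress": email,         # address subscribed to the SNS topic
    }
    return [{"ParameterKey": k, "ParameterValue": v} for k, v in values.items()]


# Hypothetical values, for illustration only.
parameters = build_stack_parameters(
    logs_bucket="example-bedrock-logs",
    error_bucket="example-bedrock-logs-errors",
    code_bucket="example-lambda-code",
    code_key="code/get-bedrock-logs-from-s3.py.zip",
    email="alerts@example.com",
)
for parameter in parameters:
    print(parameter["ParameterKey"], "=", parameter["ParameterValue"])
```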

Scaling to multiple AWS accounts

The architecture discussed previously shows how Guardrails can be detected from within the same AWS account where Amazon Bedrock APIs are invoked. However, in most production environments, there are multiple AWS accounts where independent teams may be deploying their own generative AI workloads using Amazon Bedrock in their own accounts. To collect model invocation logs from all those accounts, EventBridge can be configured to send events from event buses in separate source workload accounts to a central event bus deployed in a central destination governance account. This central event bus can have a rule to route events to the Step Functions state machine deployed in that central governance account. The deployment model looks like the following diagram.

To learn more about sending and receiving events between AWS accounts in EventBridge, refer to the documentation.

Figure 4: Cross-account guardrail detection solution
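For the central event bus to accept events from the workload accounts, it needs a resource-based policy that allows them to call `events:PutEvents`. The sketch below builds such a policy document. The account IDs, Region, and bus name are hypothetical, and granting access per account rather than per organization is an assumption made for illustration.

```python
import json


def central_bus_policy(bus_arn, source_account_ids):
    """Resource policy letting each source account put events on the central bus."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": f"AllowPutEventsFrom{account_id}",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{account_id}:root"},
                "Action": "events:PutEvents",
                "Resource": bus_arn,
            }
            for account_id in source_account_ids
        ],
    }


# Hypothetical governance-account bus and workload account IDs.
policy = central_bus_policy(
    "arn:aws:events:us-east-1:999999999999:event-bus/central-guardrail-bus",
    ["111111111111", "222222222222"],
)
print(json.dumps(policy, indent=2))
```

Each source account also needs a rule on its own event bus that targets the central bus, as described in the EventBridge cross-account documentation.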

Further considerations and clean up

Amazon Bedrock model invocation logging captures requests and responses from model invocations and stores the logs in the destination of your choosing. In this sample, the destination is an S3 bucket that you create. Because the logs contain full request and response data, consider the following:

1. To protect this information, you can encrypt the contents using server-side encryption with AWS KMS keys (SSE-KMS) on the S3 bucket, and specify a customer managed encryption key. More details are in this Amazon Bedrock user guide.

2. Perform regular cleanup of the model invocation logs bucket using an Amazon S3 lifecycle configuration rule as mentioned in this post.
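As an illustration of the second consideration, a lifecycle configuration that expires log objects after a fixed number of days can be expressed as follows. This is a sketch: the 30-day retention period and the rule ID are assumptions, and the dictionary has the shape that the boto3 S3 `put_bucket_lifecycle_configuration` call takes for its `LifecycleConfiguration` argument.

```python
def expiration_lifecycle(days, prefix=""):
    """Lifecycle configuration that deletes objects `days` after creation."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": f"expire-invocation-logs-after-{days}-days",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},   # empty prefix applies to the whole bucket
                "Expiration": {"Days": days},
            }
        ]
    }


# Assumed retention period of 30 days; choose a value that fits your audit needs.
lifecycle = expiration_lifecycle(30)
print(lifecycle["Rules"][0]["ID"])
```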

To avoid ongoing charges, delete the resources you created for this post when you no longer need them:

1. Delete the stack:
aws cloudformation delete-stack --stack-name STACK_NAME

2. Confirm the stack has been deleted:
aws cloudformation list-stacks --query "StackSummaries[?contains(StackName,'STACK_NAME')].StackStatus"

3. Empty the S3 buckets that you created manually as prerequisites for the CloudFormation stack, then delete the buckets.

4. Turn off model invocation logging under Settings in the Amazon Bedrock console if you no longer need it.

Conclusion

This post discussed implementing a serverless event-driven architecture to detect the absence of Guardrails in Amazon Bedrock inference API calls. As organizations increasingly use foundation models through Amazon Bedrock for generative AI applications, responsible AI implementation becomes crucial.

The solution presents an event-driven architecture that automatically detects when Guardrails are missing in API calls. It uses Amazon Bedrock model invocation logging to store logs in an Amazon S3 bucket. When new logs are created, Amazon S3 sends an event notification to an Amazon EventBridge event bus, which routes the event to an AWS Step Functions state machine. The state machine then inspects the logs for missing Guardrails and sends alerts through Amazon SNS to a monitored email address.

The architecture includes features to prevent notification flooding and can scale across multiple AWS accounts. The post provides detailed deployment instructions using AWS CloudFormation and includes security considerations and cleanup procedures. With this solution you can help your organization maintain compliance with AI safety standards while scaling generative AI applications.

Efficiently manage Amazon EC2 On-Demand Capacity Reservations (ODCRs) with split, move, and modify

Post Syndicated from Art Baudo original https://aws.amazon.com/blogs/compute/efficiently-manage-amazon-ec2-on-demand-capacity-reservations-odcrs-with-split-move-and-modify/

This post is written by Ninad Joshi, Senior Solutions Architect, Ballu Singh, Principal Solutions Architect, and Ankush Goyal, Enterprise Support Lead AWS.

Introduction

In today’s cloud-first world, managing compute capacity efficiently while maintaining application availability is crucial for your business. Amazon EC2 On-Demand Capacity Reservations (ODCRs) let you reserve compute capacity, but managing reservations across multiple teams and accounts is challenging. Recently, AWS introduced new capabilities (split, move, and modify) that improve how organizations can manage their Capacity Reservations. In this post, we explore how these features can transform your operations.

Common ODCR management challenges

As a consumer of ODCR, you might face several challenges managing your Capacity Reservations. These challenges include but are not limited to the following:

  • Underused reserved capacity in some accounts
  • Inability to redistribute excess capacity efficiently
  • Difficulty in managing existing capacity across multiple AWS accounts
  • Difficulty in modifying reservation attributes post-creation

With multiple development teams and various projects running simultaneously, you might struggle with efficient capacity allocation. You might also find yourself dealing with situations where one team has excess capacity while another desperately needs it.

Use case 1: Redistributing capacity across teams

The unused capacity dilemma

Consider a scenario where your machine learning (ML) team has an ODCR for ten c5.2xlarge instances, but they’re only using five. Meanwhile, your Analytics team urgently needs three Amazon Elastic Compute Cloud (Amazon EC2) instances of the same type for a new project. Previously, your Analytics team would have had to create a new reservation and take on the overhead of managing their own Capacity Reservation. Meanwhile, the five unused capacity slots in the ODCR owned by your ML team result in unnecessary costs.

Split capability to the rescue

Using the new split capability, you can now divide the existing ODCR (see ODCR-1 in the following figure), which has a total capacity of ten EC2 instances, and create a new ODCR from three of the unused capacity slots.

Figure 1: Before split, ODCR-1 with original total and unused capacity

This results in the creation of two ODCRs:

  1. Original ODCR: total capacity of seven instances for the ML team
  2. New ODCR: three instances for the Analytics team

The following figure illustrates the split result:

Figure 2: After split, ODCR-1 with updated total and unused capacity, and newly created ODCR-2
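The arithmetic of this split can be checked with a small model. The `Reservation` class and `split_unused` function below are hypothetical helpers for illustration only. They don't call the Amazon EC2 API, and they cover only the unused-capacity case from this use case.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    """Toy model of an ODCR: total reserved slots and how many are in use."""
    name: str
    total: int
    used: int

    @property
    def unused(self) -> int:
        return self.total - self.used


def split_unused(source: Reservation, quantity: int, new_name: str) -> Reservation:
    """Carve `quantity` unused slots out of `source` into a new reservation."""
    if quantity < 1 or quantity > source.unused:
        raise ValueError(f"{source.name} has only {source.unused} unused slots")
    source.total -= quantity
    return Reservation(name=new_name, total=quantity, used=0)


odcr1 = Reservation("ODCR-1", total=10, used=5)  # ML team: ten reserved, five running
odcr2 = split_unused(odcr1, 3, "ODCR-2")         # Analytics team needs three

print(odcr1.total, odcr1.unused)  # 7 2
print(odcr2.total, odcr2.unused)  # 3 3
```

The combined capacity stays at ten instances. The split only changes which reservation holds each slot.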

Sharing across accounts

The split operation creates the new ODCR in the same AWS account. If your teams operate under the same AWS account, then no further steps are needed after the split. However, if your teams use different AWS accounts, then you need to use AWS Resource Access Manager (AWS RAM) to share the newly created ODCR after the split operation. This enables cross-account capacity management while maintaining centralized control.

Refer to the AWS Documentation for more information on pre-requisites and considerations when splitting off capacity from one reservation to a new one.

Refer to the API and CLI documentation for further information on the split capability such as parameters, exceptions, and limits.

Use case 2: Moving capacity between reservations

Scaling for growth

Suppose that a few days later, your Analytics team needs capacity for one more instance for their expanding project, so you need to add capacity to ODCR-2.

Move capability to the rescue

Instead of creating a new ODCR for this purpose, you can move one of the unused slots from ODCR-1 to ODCR-2. This saves you the multiple steps involved in reserving new capacity, avoids disrupting running workloads, and simplifies ODCR management. The rebalancing improves resource usage without further procurement.

Figure 3: Before move, ODCR-1 with unused capacity and ODCR-2 with current capacity

Figure 4: After move, ODCR-1 with reduced capacity and ODCR-2 with additional capacity
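The move in this use case amounts to shifting one unused slot between two existing reservations. The following sketch models that with plain dictionaries. The `move_capacity` helper is hypothetical, and the starting numbers assume the state after the earlier split, with the Analytics team running all three of its instances.

```python
def move_capacity(source: dict, destination: dict, quantity: int) -> None:
    """Move `quantity` unused slots from `source` to `destination` in place."""
    unused = source["total"] - source["used"]
    if quantity < 1 or quantity > unused:
        raise ValueError(f"source has only {unused} unused slots")
    source["total"] -= quantity
    destination["total"] += quantity


# Assumed state after the split in use case 1.
odcr1 = {"total": 7, "used": 5}   # ML team
odcr2 = {"total": 3, "used": 3}   # Analytics team, fully used

move_capacity(odcr1, odcr2, 1)

print(odcr1)  # {'total': 6, 'used': 5}
print(odcr2)  # {'total': 4, 'used': 3}
```

No new reservation is created, and the running instances in both reservations are untouched.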

Refer to the AWS Documentation for more information on pre-requisites and considerations when moving capacity from one reservation to another one.

Refer to the API and CLI documentation for further information on the move capability such as parameters, exceptions, and limits.

Use case 3: Adjusting reservation attributes for changing workload patterns

Dynamic workload requirements

When your data processing workload patterns change significantly, you must adapt. Initially, you might have set up your ODCR with specific instance matching criteria, making it a targeted reservation for predictable workloads. However, as you introduce more dynamic, impromptu analysis projects, you need more flexibility in how instances can be launched against your reservation.

Modify feature to the rescue

Using the modify capability, you can now change the reservation’s attributes without creating a new reservation or disrupting running workloads. You can modify your ODCR by:

  • Changing instance quantity
  • Changing instance eligibility from Targeted to Open
  • Adjusting the reservation’s end date to align with your project timeline

This modification allows you to:

  • Launch new instances more flexibly without strict instance eligibility
  • Improve the usage of reserved capacity across different projects
  • Maintain cost optimization while adapting to changing business needs

The modify feature provides this flexibility while making sure that your existing workloads continue running uninterrupted, making it an invaluable tool for dynamic environments. See the following figures for an example where the instance quantity of ODCR-2 is modified from four to six:

Figure 5: Before modify, ODCR-2 with total capacity of four and instance eligibility of targeted

Figure 6: After modify, ODCR-2 with new total capacity of six and instance eligibility of open

Increasing the size of an ODCR or creating a new one is subject to Amazon EC2 On-Demand capacity availability. Therefore, if unused capacity is available in an existing ODCR, then moving or splitting that capacity could be a better option than increasing the instance count of another ODCR.

Refer to the AWS Documentation for more information on pre-requisites and considerations when modifying Capacity Reservations.

Refer to the API and CLI documentation for further information on the modify capability such as parameters, exceptions, and limits.

Special considerations for split capacity

In the preceding sections, we saw how you can use the split capability to detach excess unused capacity to create an ODCR for another team. However, you can also use this capability to split used capacity to create new ODCRs. This capability is particularly helpful when you want to split partially used ODCRs to create a new one for easier tracking and management. Along with the considerations for splitting unused/excess capacity, the following considerations apply for splitting used capacity:

  1. The used capacity can only be split for an ODCR with open instance eligibility that isn’t shared with any account.
  2. The instances running inside the reservation are of open eligibility (in other words they are not targeting the reservation).
  3. When you split the used capacity, the eligible instances are randomly selected. You cannot specify which running instances are split. If a sufficient number of eligible instances aren’t found to fulfill the split quantity, then the split operation fails. When you specify the quantity of instances to be split, by default any unused capacity is moved first, followed by any eligible running instances (the used capacity in your reservation).

In the next sections, we review different scenarios where you can or can’t use the split capability.

Scenario 1: managing internal ODCRs (Capacity Reservation not shared with any other AWS account)

For your internal projects, when managing ODCRs that aren’t shared with external partners (other AWS accounts) and all have open instance eligibility, consider this example with ODCR-1:

  • Total capacity of ten c5.2xlarge instances, all with open instance eligibility
  • Eight instances currently in use by your ML team
  • Two unused instances

Figure 7: Before split, ODCR-1 with total capacity of 10 and 2 unused instances

This ODCR isn’t shared with any external AWS accounts, so you have maximum flexibility in splitting the reservation. You can split up to nine instances into a new reservation (total capacity minus one), regardless of how many instances are currently in use. In this scenario, you can split used as well as unused capacity. This gives you significant freedom in restructuring the capacity allocation for your internal teams.

Figure 8: After split, ODCR-1 remains with total capacity of one, and ODCR-2 with total capacity of nine with two unused capacities

Scenario 2: managing shared ODCRs with partners (Capacity Reservation shared with other AWS account)

When you need to share your ODCR with a partner’s AWS account, consider this scenario where ODCR-1 has:

  • Total capacity of ten c5.2xlarge instances
  • Eight instances in use by both your team and your partner’s team
  • Two unused instances

Figure 9: Before split, ODCR-1 shared with another AWS account

In this case, your options are more limited. ODCR-1 is shared with your partner’s AWS account, so you can only split the unused capacity (maximum of two instances). After the split, the newly created ODCR (ODCR-2) remains in your AWS account and isn’t shared with any other AWS account. This restriction helps prevent any disruption to your partner’s running workloads while still allowing for some flexibility in capacity management.

Figure 10: After split, ODCR-1 remains shared with another AWS account, and newly created ODCR-2 isn’t shared

These scenarios demonstrate important factors about capacity management in both internal and partner-shared environments. You should carefully consider the sharing status of ODCRs before planning any splits or modifications, making sure of smooth operations for both your teams and your partners.
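The two split scenarios can be condensed into a short model that encodes the rules stated in this section: unused capacity is moved first, used capacity can be split only from an unshared reservation with open eligibility, and at least one slot stays behind. The class and function names are hypothetical. The sketch assumes that every running instance is eligible, and it applies the "total capacity minus one" cap to shared reservations as well, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    total: int
    used: int
    open_eligibility: bool = True
    shared_with: frozenset = frozenset()


def max_splittable(r: Reservation) -> int:
    """Largest quantity that can be split off under the rules in this section."""
    unused = r.total - r.used
    if r.shared_with or not r.open_eligibility:
        return min(unused, r.total - 1)   # only unused capacity can leave
    return r.total - 1                    # used capacity can be split too


def split(r: Reservation, quantity: int):
    """Return (remaining, new) reservations; unused slots are moved first."""
    if quantity < 1 or quantity > max_splittable(r):
        raise ValueError(f"can split at most {max_splittable(r)} instances")
    moved_unused = min(quantity, r.total - r.used)
    moved_used = quantity - moved_unused
    remaining = Reservation(r.total - quantity, r.used - moved_used,
                            r.open_eligibility, r.shared_with)
    new = Reservation(quantity, moved_used)   # the new reservation starts unshared
    return remaining, new


# Scenario 1: unshared, ten total, eight in use, split nine.
remaining, new = split(Reservation(total=10, used=8), 9)
print(remaining.total, new.total, new.total - new.used)  # 1 9 2

# Scenario 2: shared with a partner account, so only the two unused slots qualify.
shared = Reservation(total=10, used=8, shared_with=frozenset({"Account-B"}))
print(max_splittable(shared))  # 2
```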

Special considerations for move capability

The move capability enables you to redistribute available (or excess) capacity between ODCRs. However, in certain cases, you can also use this capability to move used instances between ODCRs. This capability is particularly helpful if you want to merge partially used ODCRs into one for easier tracking and management. Along with the considerations for moving unused capacity, the following considerations apply for moving used capacity:

  1. Both source and destination ODCR are of open instance eligibility and in active state.
  2. The instances running inside the reservation are of open eligibility (in other words they are not targeting the reservation).
  3. Both source and destination ODCRs are owned by the same account.
  4. The source and destination ODCRs can be shared, but with the same list of accounts when moving used portion. This sharing to same accounts condition doesn’t apply to the unused portion of the ODCR.

When you specify the quantity of instances to be moved, by default any unused capacity is moved first, followed by any eligible running instances (the used capacity in your reservation).

In the next sections, we review where you can or can’t use this capability.

Scenario 1: source and destination ODCRs not shared with other account(s) (Team Transfers)

When you manage capacity between internal teams that use the same AWS account (Account-A), the process is straightforward. For example, consider consolidating the ML team’s resources:

  • ODCR-1 (ML Team A): ten instances total (all with open eligibility), with eight in use and two unused.
  • ODCR-2 (ML Team B): five instances total (all with open eligibility), all in use.

Figure 11: Before move, ODCR-1 and ODCR-2 both in the same AWS account, unshared

Both ODCRs belong to the same account, neither is shared externally, and both have open instance eligibility. Therefore, you can freely move all ten instances from ODCR-1 to ODCR-2, creating a unified pool of 15 instances for the consolidated ML team.

Figure 12: After moving capacity from ODCR-1, ODCR-2 has combined total capacity of 15 with 2 unused

Scenario 2: source and destination ODCRs shared with the same account(s) (External Partner Collaboration)

If your ML team (ODCR-1) collaborates with an external AI research partner (Account-B), your setup might look like the following:

  • ODCR-1: Ten instances (eight used, two unused), all with open instance eligibility, shared with the research partner through AWS RAM.
  • ODCR-2: Five instances (all used), all with open instance eligibility, for the internal Analytics team.

Figure 13: Before move, ODCR-1 and ODCR-2 both in the same AWS account, with ODCR-1 shared with another AWS account

When your Analytics team needs more capacity, you can only move the two unused instances from ODCR-1 to ODCR-2, as the other eight are actively used in the partner collaboration.

Figure 14: Because ODCR-1 is shared with another AWS account, only unused capacity is moved to ODCR-2

Scenario 3: source and destination ODCRs shared with different account(s) (Multi-Partner Projects)

In this scenario involving managing capacity across different partner engagements:

  • ODCR-1: Ten instances (eight used, two unused), shared with a database partner (Account-B).
  • ODCR-2: Five instances (all used), shared with a security partner (Account-C).

Figure 15: ODCR-1 and ODCR-2 are shared with different AWS accounts

Because the two ODCRs are shared with different accounts, you can only move the two unused capacity slots from ODCR-1 to ODCR-2. This avoids any disruption to the database partner’s workloads.

Figure 16: Only unused capacity moved to ODCR-2 due to shared capacity reservations
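The three move scenarios follow from a single rule: used capacity can move only when both reservations have open eligibility and are shared with the same list of accounts, while unused capacity can always move. The sketch below encodes that rule. The names are hypothetical, and the model assumes that both reservations are active, owned by the same account, and running only eligible instances.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    name: str
    total: int
    used: int
    shared_with: frozenset = frozenset()
    open_eligibility: bool = True


def max_movable(source: Reservation, destination: Reservation) -> int:
    """How many instances can move from `source` to `destination`."""
    unused = source.total - source.used
    can_move_used = (
        source.open_eligibility
        and destination.open_eligibility
        and source.shared_with == destination.shared_with
    )
    return source.total if can_move_used else unused


partner_b = frozenset({"Account-B"})
partner_c = frozenset({"Account-C"})

# Scenario 1: neither reservation is shared.
print(max_movable(Reservation("ODCR-1", 10, 8), Reservation("ODCR-2", 5, 5)))  # 10

# Scenario 2: only ODCR-1 is shared, so the sharing lists differ.
print(max_movable(Reservation("ODCR-1", 10, 8, partner_b),
                  Reservation("ODCR-2", 5, 5)))  # 2

# Scenario 3: the reservations are shared with different partners.
print(max_movable(Reservation("ODCR-1", 10, 8, partner_b),
                  Reservation("ODCR-2", 5, 5, partner_c)))  # 2
```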

These scenarios teach valuable lessons about capacity management in multi-account environments. You can develop a comprehensive sharing strategy that balances flexibility with partner commitments, enabling you to optimize your resource usage while maintaining strong partner relationships.

Conclusion

The new ODCR capabilities from AWS (split, move, and modify) represent a significant advancement in cloud capacity management. For your organization, these features transform how you handle compute resources, enabling more efficient operations and cost management. The ability to dynamically adjust and share Capacity Reservations provides the flexibility you need while maintaining the stability necessary for your critical workloads.

As cloud infrastructure continues to evolve, these features demonstrate the AWS commitment to addressing real-world challenges that you face when managing complex cloud environments. If you’re looking to optimize your AWS infrastructure, then these new ODCR capabilities offer powerful tools for better capacity management and resource usage.

To enhance your understanding of these capabilities, we’ve created a GitHub repository containing APIs for implementation purposes. For more details, refer to the updated Capacity Reservations documentation. If you have any questions or feedback, feel free to share them in the comments section or contact AWS Support.

Streamlining AMI creation with EC2 Image Builder components in AWS Marketplace

Post Syndicated from Art Baudo original https://aws.amazon.com/blogs/compute/streamliningamicreationwith-ec2imagebuilder/

This post is written by Smriti Ohri, Senior Product Manager, EC2 and Omar Chehab, Senior Product Manager, AWS Marketplace.

At re:Invent 2024, Amazon Web Services (AWS) announced the availability of third-party EC2 Image Builder components in AWS Marketplace. EC2 Image Builder is a fully managed service that streamlines the customization, testing, distribution, and lifecycle management of images. You can use this new feature to procure third-party components from AWS Marketplace directly on the EC2 Image Builder console and in the AWS Marketplace website. You can add multiple of these components to create your golden images.

A golden image is a customized and pre-configured Amazon Machine Image (AMI) needed for launching Amazon Elastic Compute Cloud (Amazon EC2) instances. It includes a standardized set of software, configurations, and security settings that meet an organization’s specific requirements, promoting consistency and efficiency across all EC2 instances.

EC2 Image Builder provides Amazon managed components, and you can build your own components that help when building custom images. However, you may need third-party software to build your golden images. Procuring this software can be time-consuming and necessitates custom setup. This integration aims to address these challenges by providing the ability to add third-party software from AWS Marketplace directly while creating golden images using EC2 Image Builder. While creating the image, you can customize your image recipe to use the latest version of components published in AWS Marketplace and make sure that you always remain up to date.

This post shows you how to find, subscribe to, and incorporate components from AWS Marketplace using the EC2 Image Builder console.

Prerequisites

You must have access to subscribe to a product in AWS Marketplace. Check AWS Marketplace subscription permissions.

Solution overview

Three high-level steps are involved in using the third-party component from AWS Marketplace in EC2 Image Builder:

  1. Discover and subscribe to the third-party component on the EC2 Image Builder console.
  2. Build the golden image with the third-party component.
  3. Launch the EC2 instance using the golden image.

Solution walkthrough: Streamlining AMI creation with EC2 Image Builder components in AWS Marketplace

To perform the solution, go through the steps in the following sections.

Discover and subscribe to a component by Cribl

To discover and subscribe to the component, follow these steps:

  1. On the EC2 Image Builder console, in the navigation pane, choose Discover products. On the Components tab, you can view the list of available AWS Marketplace image products and the associated components. As shown in the following screenshot, choose View subscription options, which shows the different pricing offered.

Figure 1: Discover components on EC2 Image Builder console

  2. To subscribe to the product, choose one of the available offers from the dropdown menu and choose Subscribe, as shown in the following screenshot. You can now start using the associated component in your image recipe.

Figure 2: Subscribe to the product that has the component

Build the golden image with the third-party component

To use the component, you can either subscribe to it first, or you can create the pipeline and subscribe to the component later based on your preference. For this walkthrough, I already subscribed to the component. The following section shows how to create a pipeline to build a custom AMI using the component to which I subscribed. You can follow a similar process to install other components to create your golden AMIs. The high-level steps are:

  1. Create the recipe.
  2. Create the pipeline.

To create the recipe, follow these steps:

  1. On the EC2 Image Builder console, choose Image recipes and Create image recipe. A recipe has a base image and the components that you want to install on it.

For this example, Amazon Linux was chosen as the base image operating system and “Amazon Linux 2023 x86” as the image name.

  2. In the Build components section, choose Add build components and, from the dropdown, choose AWS Marketplace. Search for the component to which you subscribed and choose Add to recipe, as shown in the following screenshot.

You can choose to use the latest version or a specific version of the component. For this walkthrough, the latest available version was selected.

Figure 3: Create recipe and add components from AWS Marketplace

A pipeline is an automation configuration in which you define the infrastructure configuration, image workflows, and distribution configuration. To create the pipeline, follow these steps:

  1. On the EC2 Image Builder console, choose Image pipelines and Create image pipeline. Provide the name of the pipeline and choose a Build schedule. You can also enable scanning, which scans your AMIs for Common Vulnerabilities and Exposures (CVEs) using Amazon Inspector.

For more information, refer to Amazon Inspector integration in Image Builder in the EC2 Image Builder User Guide. For this example, image scanning is enabled and the option to manually trigger the pipeline was selected.

Figure 4: Create the pipeline with the recipe and other configurations

  2. Choose the recipe you created with third-party components from AWS Marketplace.
  3. Choose the image workflows for the image creation process and define infrastructure configurations for creating the image.

You can choose Dedicated Host, Dedicated Instance, or Shared Tenancy. By default, it uses Shared Tenancy. For this example, the default configuration was selected. I chose the c5.large instance type since that is the supported instance type for this component.

Figure 5: Select the supported instance type in the infrastructure configurations

  4. Provide the distribution configuration details to share or copy the output image to other accounts and in other AWS Regions.

To allow these accounts to use any component from AWS Marketplace, you must share license entitlements with these accounts using AWS License Manager. Instructions for sharing license entitlements are outside the scope of this post. To learn more, refer to Associating licenses with AMI based products using AWS License Manager.

  5. Choose the pipeline that you created and choose Run pipeline. After a while, the image is created and ready to use.

Run the EC2 instance using the golden image

Create an EC2 instance with the output golden image. You can also view the product code stamped on the AMIs, as shown in the following figure.

 

Figure 6: View the output image to check the product code

Conclusion

This feature helps you save time and automate the process of using the latest versions of the software. With this integration, you get a diverse set of software components from verified sellers in AWS Marketplace to address the monitoring, security, governance, and compliance needs of your organization. You can learn more about these components in the documentation. Visit AWS Marketplace to view all supported EC2 Image Builder components.

If you’re an AWS Partner, then you can publish your software as components in AWS Marketplace to cater to your customers. To learn more about onboarding your software to AWS Marketplace, visit this blog post. You can reach out to [email protected] if you have questions about this new feature or the publishing process.

Start building your custom AMIs using components from Marketplace today.