Tag Archives: launch

Optimizing vector search using Amazon S3 Vectors and Amazon OpenSearch Service

Post Syndicated from Sohaib Katariwala original https://aws.amazon.com/blogs/big-data/optimizing-vector-search-using-amazon-s3-vectors-and-amazon-opensearch-service/

NOTE: As of July 15, the Amazon S3 Vectors Integration with Amazon OpenSearch Service is in preview release and is subject to change.

The way we store and search through data is evolving rapidly with the advancement of vector embeddings and similarity search capabilities. Vector search has become essential for modern applications such as generative AI and agentic AI, but managing vector data at scale presents significant challenges. Organizations often struggle with the trade-offs between latency, cost, and accuracy when storing and searching through millions or billions of vector embeddings. Traditional solutions either require substantial infrastructure management or come with prohibitive costs as data volumes grow.

We now have a public preview of two integrations between Amazon Simple Storage Service (Amazon S3) Vectors and Amazon OpenSearch Service that give you more flexibility in how you store and search vector embeddings:

  1. Cost-optimized vector storage: OpenSearch Service managed clusters using service-managed S3 Vectors for cost-optimized vector storage. This integration will support OpenSearch workloads that are willing to trade off higher latency for ultra-low cost and still want to use advanced OpenSearch capabilities (such as hybrid search, advanced filtering, geo filtering, and so on).
  2. One-click export from S3 Vectors: One-click export from an S3 vector index to OpenSearch Serverless collections for high-performance vector search. Customers who build natively on S3 Vectors will benefit from being able to use OpenSearch for faster query performance.

By using these integrations, you can optimize cost, latency, and accuracy by intelligently distributing your vector workloads by keeping infrequent queried vectors in S3 Vectors and using OpenSearch for your most time-sensitive operations that require advanced search capabilities such as hybrid search and aggregations. Further, OpenSearch performance tuning capabilities (that is, quantization, k-nearest neighbor (knn) algorithms, and method-specific parameters) help to improve the performance with little compromise of cost or accuracy.

In this post, we walk through this seamless integration, providing you with flexible options for vector search implementation. You’ll learn how to use the new S3 Vectors engine type in OpenSearch Service managed clusters for cost-optimized vector storage and how to use one-click export from S3 Vectors to OpenSearch Serverless collections for high-performance scenarios requiring sustained queries with latency as low as 10ms. By the end of this post, you’ll understand how to choose and implement the right integration pattern based on your specific requirements for performance, cost, and scale.

Service overview

Amazon S3 Vectors is the first cloud object store with native support to store and query vectors with sub-second search capabilities, requiring no infrastructure management. It combines the simplicity, durability, availability, and cost-effectiveness of Amazon S3 with native vector search functionality, so you can store and query vector embeddings directly in S3. Amazon OpenSearch Service provides two complementary deployment options for vector workloads: Managed Clusters and Serverless Collections. Both harness Amazon OpenSearch’s powerful vector search and retrieval capabilities, though each excels in different scenarios. For OpenSearch users, the integration between S3 Vectors and Amazon OpenSearch Service offers unprecedented flexibility in optimizing your vector search architecture. Whether you need ultra-fast query performance for real-time applications or cost-effective storage for large-scale vector datasets, this integration lets you choose the approach that best fits your specific use case.

Understanding Vector Storage Options

OpenSearch Service provides multiple options for storing and searching vector embeddings, each optimized for different use cases. The Lucene engine, which is OpenSearch’s native search library, implements the Hierarchical Navigable Small World (HNSW) method, offering efficient filtering capabilities and strong integration with OpenSearch’s core functionality. For workloads requiring additional optimization options, the Faiss engine (Facebook AI Similarity Search) provides implementations of both HNSW and IVF (Inverted File Index) methods, along with vector compression capabilities. HNSW creates a hierarchical graph structure of connections between vectors, enabling efficient navigation during search, while IVF organizes vectors into clusters and searches only relevant subsets during query time. With the introduction of the S3 engine type, you now have a cost-effective option that uses Amazon S3’s durability and scalability while maintaining sub-second query performance. With this variety of options, you can choose the most suitable approach based on your specific requirements for performance, cost, and accuracy. For instance, if your application requires sub-50 ms query responses with efficient filtering, Faiss’s HNSW implementation is the best choice. Alternatively, if you need to optimize storage costs while maintaining reasonable performance, the new S3 engine type would be more appropriate.

Solution overview

In this post, we explore two primary integration patterns:

OpenSearch Service managed clusters using service-managed S3 Vectors for cost-optimized vector storage.

For customers already using OpenSearch Service domains who want to optimize costs while maintaining sub-second query performance, the new Amazon S3 engine type offers a compelling solution. OpenSearch Service automatically manages vector storage in Amazon S3, data retrieval, and cache optimization, eliminating operational overhead.

One-click export from an S3 vector index to OpenSearch Serverless collections for high-performance vector search.

For use cases requiring faster query performance, you can migrate your vector data from an S3 vector index to an OpenSearch Serverless collection. This approach is ideal for applications that require real-time response times and gives you the benefits that come with Amazon OpenSearch Serverless, including advanced query capabilities and filters, automatic scaling and high availability, and no administration. The export process automatically handles schema mapping, vector data transfer, index optimization, and connection configuration.

The following illustration shows the two integration patterns between Amazon OpenSearch Service and S3 Vectors.

Prerequisites

Before you begin, make sure you have:

  • An AWS account
  • Access to Amazon S3 and Amazon OpenSearch Service
  • An OpenSearch Service domain (for the first integration pattern)
  • Vector data stored in S3 Vectors (for the second integration pattern)

Integration pattern 1: OpenSearch Service managed cluster using S3 Vectors

To implement this pattern:

  1. Create an OpenSearch Service Domain using OR1 instances on OpenSearch version 2.19.
    1. While creating the OpenSearch Service domain, choose the Enable S3 Vectors as an engine option in the Advanced features section.
  2. Sign in to OpenSearch Dashboards and open Dev tools. Then create your knn index and specify s3vector as the engine.
PUT my-first-s3vector-index
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
        "my_vector1": {
          "type": "knn_vector",
          "dimension": 2,
          "space_type": "l2",
          "method": {
            "engine": "s3vector"
          }
        },
        "price": {
          "type": "float"
        }
    }
  }
} 
  1. Index your vectors using the Bulk API:
POST _bulk
{ "index": { "_index": "my-first-s3vector-index", "_id": "1" } }
{ "my_vector1": [2.5, 3.5], "price": 7.1 }
{ "index": { "_index": "my-first-s3vector-index", "_id": "3" } }
{ "my_vector1": [3.5, 4.5], "price": 12.9 }
{ "index": { "_index": "my-first-s3vector-index", "_id": "4" } }
{ "my_vector1": [5.5, 6.5], "price": 1.2 }
{ "index": { "_index": "my-first-s3vector-index", "_id": "5" } }
{ "my_vector1": [4.5, 5.5], "price": 3.7 }
{ "index": { "_index": "my-first-s3vector-index", "_id": "6" } }
{ "my_vector1": [1.5, 2.5], "price": 12.2 }
  1. Run a knn query as usual:
GET my-first-s3vector-index/_search
{
  "size": 2,
  "query": {
    "knn": {
      "my_vector1": {
        "vector": [2.5, 3.5],
        "k": 2
      }
    }
  }
}

The following animation demonstrates steps 2-4 above.

Integration pattern 2: Export S3 vector indexes to OpenSearch Serverless

To implement this pattern:

  1. Navigate to the AWS Management Console for Amazon S3 and select your S3 vector bucket.

  1. Select a vector index that you want to export. Under Advanced search export, select Export to OpenSearch.

Alternatively, you can:

  1. Navigate to the OpenSearch Service console.
  2. Select Integrations from the navigation pane.
  3. Here you will see a new Integration Template to Import S3 vectors to OpenSearch vector engine – preview. Select Import S3 vector index.

  1. You will now be brought to the Amazon OpenSearch Service integration console with the Export S3 vector index to OpenSearch vector engine template pre-selected and pre-populated with your S3 vector index Amazon Resource Name (ARN). Select an existing role that has the necessary permissions or create a new service role.

  1. Scroll down and choose Export to start the steps to create a new OpenSearch Serverless collection and copy data from your S3 vector index into an OpenSearch knn index.

  1. You will now be taken to the Import history page in the OpenSearch Service console. Here you will see the new job that was created to migrate your S3 vector index into the OpenSearch serverless knn index. After the status changes from In Progress to Complete, you can connect to the new OpenSearch serverless collection and query your new OpenSearch knn index.

The following animation demonstrates how to connect to the new OpenSearch serverless collection and query your new OpenSearch knn index using Dev tools.

Cleanup

To avoid ongoing charges:

  1. For Pattern 1:
  1. For Pattern 2:
    • Delete the import task from the Import history section of the OpenSearch Service console. Deleting this task will remove both the OpenSearch vector collection and the OpenSearch Ingestion pipeline that was automatically created by the import task.

Conclusion

The innovative integration between Amazon S3 Vectors and Amazon OpenSearch Service marks a transformative milestone in vector search technology, offering unprecedented flexibility and cost-effectiveness for enterprises. This powerful combination delivers the best of both worlds: The renowned durability and cost efficiency of Amazon S3 merged seamlessly with the advanced AI search capabilities of OpenSearch. Organizations can now confidently scale their vector search solutions to billions of vectors while maintaining control over their latency, cost, and accuracy. Whether your priority is ultra-fast query performance with latency as low as 10ms through OpenSearch Service, or cost-optimized storage with impressive sub-second performance using S3 Vectors or implementing advanced search capabilities in OpenSearch, this integration provides the perfect solution for your specific needs. We encourage you to get started today by trying S3 Vectors engine in your OpenSearch managed clusters and testing the one-click export from S3 vector indexes to OpenSearch Serverless.

For more information, visit:


About the Authors

Sohaib Katariwala is a Senior Specialist Solutions Architect at AWS focused on Amazon OpenSearch Service based out of Chicago, IL. His interests are in all things data and analytics. More specifically he loves to help customers use AI in their data strategy to solve modern day challenges.

Mark Twomey is a Senior Solutions Architect at AWS focused on storage and data management. He enjoys working with customers to put their data in the right place, at the right time, for the right cost. Living in Ireland, Mark enjoys walking in the countryside, watching movies, and reading books.

Sorabh Hamirwasia is a senior software engineer at AWS working on the OpenSearch Project. His primary interest include building cost optimized and performant distributed systems.

Pallavi Priyadarshini is a Senior Engineering Manager at Amazon OpenSearch Service leading the development of high-performing and scalable technologies for search, security, releases, and dashboards.

Bobby Mohammed is a Principal Product Manager at AWS leading the Search, GenAI, and Agentic AI product initiatives. Previously, he worked on products across the full lifecycle of machine learning, including data, analytics, and ML features on SageMaker platform, deep learning training and inference products at Intel.

AWS Weekly Roundup: Kiro, AWS Lambda remote debugging, Amazon ECS blue/green deployments, Amazon Bedrock AgentCore, and more (July 21, 2025)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-kiro-aws-lambda-remote-debugging-amazon-ecs-blue-green-deployments-amazon-bedrock-agentcore-and-more-july-21-2025/

I’m writing this as I depart from Ho Chi Minh City back to Singapore. Just realized what a week it’s been, so let me rewind a bit. This week, I tried my first Corne keyboard, wrapped up rehearsals for AWS Summit Jakarta with speakers who are absolutely raising the bar, and visited Vietnam to participate as a technical keynote speaker in AWS Community Day Vietnam, an energetic gathering of hundreds of cloud practitioners and AWS enthusiasts who shared knowledge through multiple technical tracks and networking sessions.

What I presented was a keynote titled “Reinvent perspective as modern developers”, featuring serverless, containers, and how we can cut the learning curves and be more productive with Amazon Q Developer and Kiro. I got a chance to discuss with a couple of AWS Community Builders and community developers, who shared how Amazon Q Developer actually addressed their challenges on building applications, with several highlighting significant productivity improvements and smoother learning curves in their cloud development journeys.

As I head back to Singapore, I’m carrying with me not just memories of delicious cà phê sữa đá (iced milk coffee), but also fresh perspectives and inspirations from this vibrant community of cloud innovators.

Introducing Kiro
One of the highlights from last week was definitely Kiro, an AI IDE that helps you deliver from concept to production through a simplified developer experience for working with AI agents. Kiro goes beyond “vibe coding” with features like specs and hooks that help get prototypes into production systems with proper planning and clarity.

Join the waitlist to get notified when it becomes available.

Last week’s AWS Launches
In other news, last week we had AWS Summit in New York, where we released several services. Here are some launches that caught my attention:

Console to IDE Integration

ECS Blue-Green Deployments

AWS Free Tier Enhanced Benefits

  • Monitor and debug event-driven applications with new Amazon EventBridge logging — Amazon EventBridge now provides enhanced logging capabilities that offer comprehensive event lifecycle tracking with detailed information about successes, failures, and status codes. This new observability feature addresses microservices and event-driven architecture monitoring challenges by providing visibility into the complete event journey.

EventBridge Enhanced Logging

S3 Vectors Overview

  • Amazon EKS enables ultra-scale AI/ML workloads with support for 100k nodes per cluster — Amazon EKS now supports up to 100,000 worker nodes in a single cluster, enabling customers to scale up to 1.6 million AWS Trainium accelerators or 800K NVIDIA GPUs. This industry-leading scale empowers customers to train trillion-parameter models and advance AGI development while maintaining Kubernetes conformance and familiar developer experience.

EKS Ultra-Scale Performance Improvements

From AWS Builder Center
In case you missed it, we just launched AWS Builder Center and integrated community.aws. Here are my top picks from the posts:

Upcoming AWS events
Check your calendars and sign up for upcoming AWS and AWS Community events:

  • AWS re:Invent – Register now to get a head start on choosing your best learning path, booking travel and accommodations, and bringing your team to learn, connect, and have fun. If you’re an early-career professional, you can apply to the All Builders Welcome Grant program, which is designed to remove financial barriers and create diverse pathways into cloud technology.
  • AWS Builders Online Series – If you’re based in one of the Asia Pacific time zones, join and learn fundamental AWS concepts, architectural best practices, and hands-on demonstrations to help you build, migrate, and deploy your workloads on AWS.
  • AWS Summits — Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Taipei (July 29), Mexico City (August 6), and Jakarta (June 26–27).
  • AWS Community Days — Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Singapore (August 2), Australia (August 15), Adria (September 5), Baltic (September 10), and Aotearoa (September 18).

You can browse all upcoming AWS led in-person and virtual developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Donnie

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!


Join Builder ID: Get started with your AWS Builder journey at builder.aws.com

Unifying data insights with Amazon QuickSight and Amazon SageMaker

Post Syndicated from Ramon Lopez original https://aws.amazon.com/blogs/big-data/unifying-data-insights-with-amazon-quicksight-and-amazon-sagemaker/

Amazon SageMaker has announced an integration with Amazon QuickSight, bringing together data in SageMaker seamlessly with QuickSight capabilities like interactive dashboards, pixel perfect reports and generative business intelligence (BI)—all in a governed and automated manner. With this integration users can go from exploring data in SageMaker to visualizing it in QuickSight with a single click.

“The integration between Amazon SageMaker and Amazon QuickSight will help us streamline how our teams move from data exploration to insights. Our analysts can go from data discovery to building and sharing dashboards through a unified, governed experience. Dashboards are no longer siloed, one-off reports. They’re cataloged, discoverable assets that others can find and access. This has made insight delivery faster, more consistent, and far easier to scale across the business.”

– Lingam Chockalingam, Chief Data Architect, Maryland Department of Human Services – MD THINK

About QuickSight

QuickSight is a cloud-powered BI service that revolutionizes data analysis and visualization. It seamlessly integrates data from various sources, including AWS services, third-party applications, and software as a service (SaaS) platforms into a single, intuitive dashboard. As a fully managed service, QuickSight offers enterprise-grade security, global accessibility, and scalability without the hassle of infrastructure management. Amazon Q in QuickSight transforms access to data insights for the entire organization using generative AI. Using Amazon Q, business analysts can generate dashboards and reports using natural language prompts. With Amazon Q, business users can ask and answer questions of data using data Q&A, get natural language executive summaries of data to see trends and insights, and use the powerful new agentic data analysis experience of scenarios to discover patterns and outliers in data and perform what-if analysis.

About SageMaker

Amazon SageMaker Unified Studio provides a unified, end-to-end experience consisting of data, analytics, and AI capabilities. You can use familiar AWS services for model development, generative AI, data processing, and analytics—all within a single, governed environment. Users can now build, deploy, and execute end-to-end workflows from a single interface. SageMaker is built on the foundations of Amazon DataZone, where it uses domains to categorize and structure the data assets, while offering project-based collaboration features that teams can use to securely share artifacts and work together across various compute services. This experience allows multiple personas to seamlessly collaborate, while operating under appropriate access controls and governance policies.

Dashboard and insight workflows simplified

Today administrators can configure SageMaker projects with QuickSight to streamline the flow of building insights from your data lake. After being set up, the integration automatically creates a restricted folders that provides a governed context to share assets and data sources, pre-configured with secure connections to data lake tables. This serves as the foundation for any project member securely building and sharing insights. When exploring data in your project the integration allows for one-click access to building a dashboard from any table. Behind the scenes, SageMaker creates a QuickSight dataset in the project’s restricted folder that’s accessible only to members within the project. Not only do dashboards you build in QuickSight stay within this folder, they’re also automatically added as assets to your SageMaker project. There, you can add custom metadata, publish to the SageMaker Catalog and share with users or groups in your corporate directory for broader access—all within SageMaker Unified Studio. This keeps your dashboards organized, discoverable, shareable, and governed, making cross-team collaboration and asset reuse straightforward.

Configure SageMaker and QuickSight

To get started with SageMaker and QuickSight integration, you enable the QuickSight blueprint and create project profiles in the AWS Management Console.

Note that both your SageMaker Unified Studio domain and QuickSight account must be integrated with AWS IAM Identity Center using the same Identity Center instance. Additionally, your QuickSight account must exist in the same AWS account.

  1. Go to the SageMaker console and choose Domain in the navigation pane.
  2. Select the Blueprints tab.
  3. To enable the QuickSight Blueprint, select it from the list, then choose Enable.
  4. On the Enable QuickSight page:
        1. For Provisioning role, select your provisioning role.
        2. For QuickSight VPC manager role, select the AmazonSageMakerQuickSightVPC role.
  5. Choose Enable blueprint.
  6. A confirmation message will appear after the blueprint is successfully enabled.
  7. Go back to the Domains page and select the Project profiles tab and then select the SQL analytics project profile.
  8. Choose Add blueprint deployment settings.
  9. Configure the blueprint deployment settings as follows:
    • Blueprint deployment settings name: Enter a name for your settings. For this post, we used QuickSight-BDS.
    • Blueprint: Select the QuickSight blueprint from the list.
    • Other parameters: Adjust these based on your use case. For this post, we kept the default values.
  10. Scroll down and choose Add blueprint deployment settings to save your configuration.
  11. You’ll receive a confirmation message, and you’ll see that the QuickSight Blueprint deployment setting (QuickSight-BDS) has been added to the list.

Create a SageMaker project with QuickSight enabled:

After the QuickSight integration has been set up by the administrator, data consumers such as analysts and data scientists can begin using it in the SageMaker portal by creating a new project.

  1. Go to the SageMaker portal.
  2. Choose Select a project, then, choose Create project.
  3. On the Create project page:
    1. Project name: Enter the name of your project. For this post, we’re using KPI-Analysis.
    2. Project profile: Select the SQL Analytics project profile.
    3. Choose Continue.
  4. Leave the remaining parameters set to their default values and choose Continue.
  5. Review the information displayed, then choose Create project.
  6. You’ll be redirected to the Creating new project page. Wait for the process to complete.
  7. After the project creation process is complete, you’ll be taken to the Project overview page.

Create a data asset to build the analysis

  1. For this post, you’ll use the transactions.csv file, which contains financial transaction data from various departments.
  2. Choose Build in the top-right menu.
  3. Then select Query Editor from the dropdown.
  4. Choose the plus (+) icon
  5. Select Create table, then choose Next.
  6. On the Set table properties page:
    1. Upload file: Upload the transactions.csv file.
    2. Table type: Select S3/external table.
    3. Leave the remaining parameters at the default values.
    4. Choose Next.
  7. On the Preview schema page, verify that the schema matches the expected structure, then choose Create table.
  8. The Transactions table has now been successfully created.

Create a dashboard using QuickSight

  1. Choose the KPI-Analysis project, then choose Data.
  2. On the Data page: Select the Transactions table, choose Actions, then select Open in QuickSight.
  3. This step redirects you to the QuickSight UI, specifically to the transactions dataset page.
  4. Choose USE IN ANALYSIS to begin exploring the data.
  5. Choose a folder to save your new analysis—for this post, we selected the Assets folder.
  6. Choose Add to save the analysis.
  7. On the New sheet page, leave all parameters at the default values, then choose CREATE.
  8. You’ll now be taken to the Analysis page. In this example, you analyze credit card spending at gas stations, focusing on identifying the most popular fuel type among your cardholders. The goal is to use this insight to design targeted promotions.
  9. Under Visuals, select Pie chart.
  10. Under GROUP/COLOR, select fuel_type.
  11. Under Value, select amount[Sum].
  12. You will see that credit card holders of AWSome-Bank prefer the Premium fuel type.
  13. Publish this new dashboard to the enterprise data catalog. To do that, choose PUBLISH located in the top right corner.
  14. On the Publish Dashboard page:
    1. Enter a name for the dashboard. For this post, we’re using gas_consumption_analysis.
    2. Leave the remaining parameters set to their default values.
    3. Choose PUBLISH DASHBOARD.

Documenting and publishing a QuickSight asset

After the dashboard is created, it’s automatically added to the SageMaker project. From there, analysts or BI engineers can enrich it with business metadata, make it discoverable across the organization, and share it with other users or groups in their corporate directory.

  1. Go back to the Amazon SageMaker portal
  2. Select the Assets tab.
  3. On the Inventory tab, select the gas_consumption_analysis asset.
  4. This will take you to the main asset page, where you can add business metadata, view the lineage diagram, and review the asset history.
  5. For this post, you will only add a README section.
  6. Choose CREATE README to get started.
  7. Add a description for the asset. For this POST, we used the following:
Overview
This Amazon QuickSight dashboard provides insights into the fuel type preferences of a bank’s credit card holders. It helps business stakeholders and analysts understand customer behavior at fuel stations, supporting data-driven marketing strategies and product personalization.
Purpose
The goal of this dashboard is to:
Analyze which fuel types (for example, Regular, Premium, Diesel, Electric) are most frequently purchased using the bank’s credit cards.
Identify customer segments (for example, age groups, locations, income brackets) that prefer specific fuel types.
Understand transaction patterns such as frequency, average spend per fuel type, and purchase timeframes.
  1. Choose SAVE README to save the description.
  2. On this page, you can also add glossary terms and metadata forms to provide additional business context to the asset. For this post, leave these fields empty.
  3. Now you’re ready to publish the QuickSight asset to the enterprise data catalog. To do this, choose PUBLISH ASSET.
  4. A confirmation prompt will appear. Choose PUBLISH ASSET again to complete the publishing process.

Search for a QuickSight asset

  1. For this post, we created a second project called Marketing, but you can use any other project within your domain or even reuse the one created in the earlier steps.
  2. Navigate to the SageMaker home page.
  3. In the catalog search field, enter gas to find the published asset.
  4. Select the relevant result for the published asset from the search results.
  5. This will take you to the asset’s main page, where you can view the metadata added by the producer.

Sharing a QuickSight asset

You can share the QuickSight dashboard with users and groups in your organization directly from within SageMaker.

  1. Go back to the KPI-Analysis project.
  2. Choose the Data tab.
  3. Then, select Assets from the Project catalog.
  4. Go to the PUBLISHED tab, then select the gas_consumption_analysis asset.
  5. Choose Actions, then select Share.
  6. You can share the asset with individual SSO users or with groups. For this post, we selected an SSO group named quicksight-users, but you can choose any user or group you have previously created.
  7. Choose Share.
  8. A confirmation message will appear after the asset has been successfully shared.

Clean up

When you’re done with these exercises, complete the following steps to delete your resources to avoid incurring costs:

  1. Delete the QuickSight assets that you created.
    1. If QuickSight is enabled solely for testing, make sure to cancel the QuickSight account.
  2. Delete the project created in SageMaker.
    1. If SageMaker is enabled solely for testing, make sure to cancel the SageMaker account.

Conclusion

This post walked through the complete process of integrating Amazon QuickSight with Amazon SageMaker Unified Studio, demonstrating how teams can move from raw data to published dashboards in a secure and governed environment. By combining the advanced analytics capabilities of QuickSight with the collaborative project-based structure of SageMaker, organizations can accelerate insight delivery while maintaining clear control over data access and governance.

The integration simplifies creating datasets directly from Amazon Athena or Amazon Redshift tables, enrich them with business metadata, and publish dashboards to the SageMaker Catalog. When published, these dashboards can be shared with users or groups across the organization, making insights both discoverable and actionable.

With the added power of Amazon Q in QuickSight and generative BI, users can ask questions in plain English and receive real-time visualizations and insights. This makes data exploration intuitive and inclusive, empowering more users to make informed decisions. Combined with the unified analytics and AI environment of SageMaker Unified Studio, this solution supports secure, scalable, and collaborative data-driven innovation.


About the authors

Ramon Lopez is a Principal Solutions Architect for Amazon QuickSight. With many years of experience building BI solutions and a background in accounting, he loves working with customers, creating solutions, and making world-class services. When not working, he prefers to be outdoors in the ocean or up on a mountain.

Leonardo Gomez is a Principal Analytics Specialist Solutions Architect at AWS. He has over a decade of experience in data management, helping customers around the globe address their business and technical needs. Connect with him on LinkedIn.

Simplify serverless development with console to IDE and remote debugging for AWS Lambda

Post Syndicated from Micah Walter original https://aws.amazon.com/blogs/aws/simplify-serverless-development-with-console-to-ide-and-remote-debugging-for-aws-lambda/

Today, we’re announcing two significant enhancements to AWS Lambda that make it easier than ever for developers to build and debug serverless applications in their local development environments: console to IDE integration and remote debugging. These new capabilities build upon our recent improvements to the Lambda development experience, including the enhanced in-console editing experience and the improved local integrated development environment (IDE) experience launched in late 2024.

When building serverless applications, developers typically focus on two areas to streamline their workflow: local development environment setup and cloud debugging capabilities. While developers can bring functions from the console to their IDE, they’re looking for ways to make this process more efficient. Additionally, as functions interact with various AWS services in the cloud, developers want enhanced debugging capabilities to identify and resolve issues earlier in the development cycle, reducing their reliance on local emulation and helping them optimize their development workflow.

Console to IDE integration

To address the first challenge, we’re introducing console to IDE integration, which streamlines the workflow from the AWS Management Console to Visual Studio Code (VS Code). This new capability adds an Open in Visual Studio Code button to the Lambda console, enabling developers to quickly move from viewing their function in the browser to editing it in their IDE, eliminating the time-consuming setup process for local development environments.

The console to IDE integration automatically handles the setup process, checking for VS Code installation and the AWS Toolkit for VS Code. For developers that have everything already configured, choosing the button immediately opens their function code in VS Code, so they can continue editing and deploy changes back to Lambda in seconds. If VS Code isn’t installed, it directs developers to the download page, and if the AWS Toolkit is missing, it prompts for installation.

To use console to IDE, look for the Open in VS Code button in either the Getting Started popup after creating a new function or the Code tab of existing Lambda functions. After selecting, VS Code opens automatically (installing AWS Toolkit if needed). Unlike the console environment, you now have access to a full development environment with integrated terminal – a significant improvement for developers who need to manage packages (npm install, pip install), run tests, or use development tools like linters and formatters. You can edit code, add new files/folders, and any changes you make will trigger an automatic deploy prompt. When you choose to deploy, the AWS Toolkit automatically deploys your function to your AWS account.

Screenshot showing Console to IDE

Remote debugging

Once developers have their functions in their IDE, they can use remote debugging to debug Lambda functions deployed in their AWS account directly from VS Code. The key benefit of remote debugging is that it allows developers to debug functions running in the cloud while integrated with other AWS services, enabling faster and more reliable development.

With remote debugging, developers can debug their functions with complete access to Amazon Virtual Private Cloud (VPC) resources and AWS Identity and Access Management (AWS IAM) roles, eliminating the gap between local development and cloud execution. For example, when debugging a Lambda function that interacts with an Amazon Relational Database Service (Amazon RDS) database in a VPC, developers can now debug the execution environment of the function running in the cloud within seconds, rather than spending time setting up a local environment that might not match production.

Getting started with remote debugging is straightforward. Developers can select a Lambda function in VS Code and enable debugging in seconds. AWS Toolkit for VS Code automatically downloads the function code, establishes a secure debugging connection, and enables breakpoint setting. When debugging is complete, AWS Toolkit for VS Code automatically cleans up the debugging configuration to prevent any impact on production traffic.

Let’s try it out

To take remote debugging for a spin, I chose to start with a basic “hello world” example function, written in Python. I had previously created the function using the AWS Management Console for AWS Lambda. Using the AWS Toolkit for VS Code, I can navigate to my function in the Explorer pane. Hovering over my function, I can right-click (ctrl-click in Windows) to download the code to my local machine to edit the code in my IDE. Saving the file will ask me to decide if I want to deploy the latest changes to Lambda.

Screenshot view of the Lambda Debugger in VS Code

From here, I can select the play icon to open the Remote invoke configuration page for my function. This dialog will now display a Remote debugging option, which I configure to point at my local copy of my function handler code. Before choosing Remote invoke, I can set breakpoints on the left anywhere I want my code to pause for inspection.

My code will be running in the cloud after it’s invoked, and I can monitor its status in real time in VS Code. In the following screenshot, you can see I’ve set a breakpoint at the print statement. My function will pause execution at this point in my code, and I can inspect things like local variable values before either continuing to the next breakpoint or stepping into the code line by line.

Here, you can see that I’ve chosen to step into the code, and as I go through it line by line, I can see the context and local and global variables displayed on the left side of the IDE. Additionally, I can follow the logs in the Output tab at the bottom of the IDE. As I step through, I’ll see any log messages or output messages from the execution of my function in real time.

Enhanced development workflow

These new capabilities work together to create a more streamlined development experience. Developers can start in the console, quickly transition to VS Code using the console to IDE integration, and then use remote debugging to debug their functions running in the cloud. This workflow eliminates the need to switch between multiple tools and environments, helping developers identify and fix issues faster.

Now available

You can start using these new features through the AWS Management Console and VS Code with the AWS Toolkit for VS Code (v3.69.0 or later) installed. Console to IDE integration is available in all commercial AWS Regions where Lambda is available, except AWS GovCloud (US) Regions. Learn more about it in Lambda and AWS Toolkit for VS Code documentation. To learn more about remote debugging capability, including AWS Regions it is available in, visit the AWS Toolkit for VS Code and Lambda documentation.

Console to IDE and remote debugging are available to you at no additional cost. With remote debugging, you pay only for the standard Lambda execution costs during debugging sessions. Remote debugging will support Python, Node.js, and Java runtimes at launch, with plans to expand support to additional runtimes in the future.

These enhancements represent a significant step forward in simplifying the serverless development experience, which means developers can build and debug Lambda functions more efficiently than ever before.

AWS AI League: Learn, innovate, and compete in our new ultimate AI showdown

Post Syndicated from Elizabeth Fuentes original https://aws.amazon.com/blogs/aws/aws-ai-league-learn-innovate-and-compete-in-our-new-ultimate-ai-showdown/

Since 2018, AWS DeepRacer has engaged over 560,000 builders worldwide, demonstrating that developers learn and grow through competitive experiences. Today, we’re excited to expand into the generative AI era with AWS Artificial Intelligence (AI) League.

This is a unique competitive experience – your chance to dive deep into generative AI regardless of your skill level, compete with peers, and build solutions that solve actual business problems through an engaging, competitive experience.

With AWS AI League, your organization hosts private tournaments where teams collaborate and compete to solve real-world business use cases using practical AI skills. Participants craft effective prompts and fine-tune models while building powerful generative AI solutions relevant for their business. Throughout the competition, participants’ solutions are evaluated against reference standards on a real-time leaderboard that tracks performance based on accuracy and latency.

The AWS AI League experience starts with a 2-hour hands-on workshop led by AWS experts. This is followed by self-paced experimentation, culminating in a gameshow-style grand finale where participants showcase their generative AI creations addressing business challenges. Organizations can set up their own AWS AI League within half a day. The scalable design supports 500 to 5,000 employees while maintaining the same efficient timeline.

Supported by up to $2 million in AWS credits and a $25,000 championship prize pool at AWS re:Invent 2025, the program provides a unique opportunity to solve real business challenges.

AWS AI League transforms how organizations develop generative AI capabilities
AWS AI League transforms how organizations develop generative AI capabilities by combining hands-on skills development, domain expertise, and gamification. This approach makes AI learning accessible and engaging for all skill levels. Teams collaborate through industry-specific challenges that mirror real organizational needs, with each challenge providing reference datasets and evaluation standards that reflect actual business requirements.

  • Customizable industry-specific challenges – Tailor competitions to your specific business context. Healthcare teams work on patient discharge summaries, financial services focus on fraud detection, and media companies develop content creation solutions.
  • Integrated AWS AI stack experience – Participants gain hands-on experience with AWS AI and ML tools, including Amazon SageMaker AI, Amazon Bedrock, and Amazon Nova, accessible from Amazon SageMaker Unified Studio. Teams work through a secure, cost-controlled environment within their organization’s AWS account.
  • Real-time performance tracking – The leaderboard evaluates submissions against established benchmarks and reference standards throughout the competition, providing immediate feedback on accuracy and speed so teams can iterate and improve their solutions. During the final round, this scoring includes expert evaluation where domain experts and a live audience participate in real-time voting to determine which AI solutions best solve real business challenges.

  • AWS AI League offers two foundational competition tracks:
    • Prompt Sage – The Ultimate Prompt Battle – Race to craft the perfect AI prompts that unlock breakthrough solutions. whether you detect financial fraud or streamlining healthcare workflows, every word counts as they climb the leaderboard using zero-shot learning and chain-of-thought reasoning.
    • Tune Whiz – The Model Mastery Showdown – Generic AI models meet their match as you sculpt them into industry-specific powerhouses. Armed with your domain expertise and specialized questions, competitors fine-tune models that speak your business language fluently. Victory goes to who achieve the perfect balance of blazing performance, lightning efficiency, and cost optimization.

As Generative AI continues to evolve, AWS AI League will regularly introduce new challenges and formats in addition to these tracks.

Get started today
Ready to get started? Organizations can host private competitions by applying through the AWS AI League page. Individual developers can join public competitions at AWS Summits and AWS re:Invent.

PS: Writing a blog post at AWS is always a team effort, even when you see only one name under the post title. In this case, I want to thank Natasya Idries, for her generous help with technical guidance, and expertise, which made this overview possible and comprehensive.

— Eli

Accelerate safe software releases with new built-in blue/green deployments in Amazon ECS

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/accelerate-safe-software-releases-with-new-built-in-blue-green-deployments-in-amazon-ecs/

While containers have revolutionized how development teams package and deploy applications, these teams have had to carefully monitor releases and build custom tooling to mitigate deployment risks, which slows down shipping velocity. At scale, development teams spend valuable cycles building and maintaining undifferentiated deployment tools instead of innovating for their business.

Starting today, you can use the built-in blue/green deployment capability in Amazon Elastic Container Service (Amazon ECS) to make your application deployments safer and more consistent. This new capability eliminates the need to build custom deployment tooling while giving you the confidence to ship software updates more frequently with rollback capability.

Here’s how you can enable the built-in blue/green deployment capability in the Amazon ECS console.

You create a new “green” application environment while your existing “blue” environment continues to serve live traffic. After monitoring and testing the green environment thoroughly, you route the live traffic from blue to green. With this capability, Amazon ECS now provides built-in functionality that makes containerized application deployments safer and more reliable.

Below is a diagram illustrating how blue/green deployment works by shifting application traffic from the blue environment to the green environment. You can learn more at the Amazon ECS blue/green service deployments workflow page.

Amazon ECS orchestrates this entire workflow while providing event hooks to validate new versions using synthetic traffic before routing production traffic. You can validate new software versions in production environments before exposing them to end users and roll back near-instantaneously if issues arise. Because this functionality is built directly into Amazon ECS, you can add these safeguards by simply updating your configuration without building any custom tooling.

Getting started
Let me walk you through a demonstration that showcases how to configure and use blue/green deployments for an ECS service. Before that, there are a few setup steps that I need to complete, including configuring AWS Identity and Access Management (IAM) roles, which you can find on the Required resources for Amazon ECS blue/green deployments Documentation page.

For this demonstration, I want to deploy a new version of my application using the blue/green strategy to minimize risk. First, I need to configure my ECS service to use blue/green deployments. I can do this through the ECS console, AWS Command Line Interface (AWS CLI), or using infrastructure as code.

Using the Amazon ECS console, I create a new service and configure it as usual:

In the Deployment Options section, I choose ECS as the Deployment controller type, then Blue/green as the Deployment strategy. Bake time is the time after the production traffic has shifted to green, when instant rollback to blue is available. When the bake time expires, blue tasks are removed.

We’re also introducing deployment lifecycle hooks. These are event-driven mechanisms you can use to augment the deployment workflow. I can select which AWS Lambda function I’d like to use as a deployment lifecycle hook. The Lambda function can perform the required business logic, but it must return a hook status.

Amazon ECS supports the following lifecycle hooks during blue/green deployments. You can learn more about each stage on the Deployment lifecycle stages page.

  • Pre scale up
  • Post scale up
  • Production traffic shift
  • Test traffic shift
  • Post production traffic shift
  • Post test traffic shift

For my application, I want to test when the test traffic shift is complete and the green service handles all of the test traffic. Since there’s no end-user traffic, a rollback at this stage will have no impact on users. This makes Post test traffic shift suitable for my use case as I can test it first with my Lambda function.

Switching context for a moment, let’s focus on the Lambda function that I use to validate the deployment before allowing it to proceed. In my Lambda function as a deployment lifecycle hook, I can perform any business logic, such as synthetic testing, calling another API, or querying metrics.

Within the Lambda function, I must return a hookStatus. A hookStatus can be SUCCESSFUL, which will move the process to the next step. If the status is FAILED, it rolls back to the blue deployment. If it’s IN_PROGRESS, then Amazon ECS retries the Lambda function in 30 seconds.

In the following example, I set up my validation with a Lambda function that performs file upload as part of a test suite for my application.

import json
import urllib3
import logging
import base64
import os

# Configure logging
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)

# Initialize HTTP client
http = urllib3.PoolManager()

def lambda_handler(event, context):
    """
    Validation hook that tests the green environment with file upload
    """
    logger.info(f"Event: {json.dumps(event)}")
    logger.info(f"Context: {context}")
    
    try:
        # In a real scenario, you would construct the test endpoint URL
        test_endpoint = os.getenv("APP_URL")
        
        # Create a test file for upload
        test_file_content = "This is a test file for deployment validation"
        test_file_data = test_file_content.encode('utf-8')
        
        # Prepare multipart form data for file upload
        fields = {
            'file': ('test.txt', test_file_data, 'text/plain'),
            'description': 'Deployment validation test file'
        }
        
        # Send POST request with file upload to /process endpoint
        response = http.request(
            'POST', 
            test_endpoint,
            fields=fields,
            timeout=30
        )
        
        logger.info(f"POST /process response status: {response.status}")
        
        # Check if response has OK status code (200-299 range)
        if 200 <= response.status < 300:
            logger.info("File upload test passed - received OK status code")
            return {
                "hookStatus": "SUCCEEDED"
            }
        else:
            logger.error(f"File upload test failed - status code: {response.status}")
            return {
                "hookStatus": "FAILED"
            }
            
    except Exception as error:
        logger.error(f"File upload test failed: {str(error)}")
        return {
            "hookStatus": "FAILED"
        }

When the deployment reaches the lifecycle stage that is associated with the hook, Amazon ECS automatically invokes my Lambda function with deployment context. My validation function can run comprehensive tests against the green revision—checking application health, running integration tests, or validating performance metrics. The function then signals back to ECS whether to proceed or abort the deployment.

As I chose the blue/green deployment strategy, I also need to configure the load balancers and/or Amazon ECS Service Connect. In the Load balancing section, I select my Application Load Balancer.

In the Listener section, I use an existing listener on port 80 and select two Target groups.

Happy with this configuration, I create the service and wait for ECS to provision my new service.

Testing blue/green deployments
Now, it’s time to test my blue/green deployments. For this test, Amazon ECS will trigger my Lambda function after the test traffic shift is completed. My Lambda function will return FAILED in this case as it performs file upload to my application, but my application doesn’t have this capability.

I update my service and check Force new deployment, knowing the blue/green deployment capability will roll back if it detects a failure. I select this option because I haven’t modified the task definition but still need to trigger a new deployment.

At this stage, I have both blue and green environments running, with the green revision handling all the test traffic. Meanwhile, based on Amazon CloudWatch Logs of my Lambda function, I also see that the deployment lifecycle hooks work as expected and emit the following payload:

[INFO]	2025-07-10T13:15:39.018Z	67d9b03e-12da-4fab-920d-9887d264308e	Event: 
{
    "executionDetails": {
        "testTrafficWeights": {},
        "productionTrafficWeights": {},
        "serviceArn": "arn:aws:ecs:us-west-2:123:service/EcsBlueGreenCluster/nginxBGservice",
        "targetServiceRevisionArn": "arn:aws:ecs:us-west-2:123:service-revision/EcsBlueGreenCluster/nginxBGservice/9386398427419951854"
    },
    "executionId": "a635edb5-a66b-4f44-bf3f-fcee4b3641a5",
    "lifecycleStage": "POST_TEST_TRAFFIC_SHIFT",
    "resourceArn": "arn:aws:ecs:us-west-2:123:service-deployment/EcsBlueGreenCluster/nginxBGservice/TFX5sH9q9XDboDTOv0rIt"
}

As expected, my AWS Lambda function returns FAILED as hookStatus because it failed to perform the test.

[ERROR]	2025-07-10T13:18:43.392Z	67d9b03e-12da-4fab-920d-9887d264308e	File upload test failed: HTTPConnectionPool(host='xyz.us-west-2.elb.amazonaws.com', port=80): Max retries exceeded with url: / (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7f8036273a80>, 'Connection to xyz.us-west-2.elb.amazonaws.com timed out. (connect timeout=30)'))

Because the validation wasn’t completed successfully, Amazon ECS tries to roll back to the blue version, which is the previous working deployment version. I can monitor this process through ECS events in the Events section, which provides detailed visibility into the deployment progress.

Amazon ECS successfully rolls back the deployment to the previous working version. The rollback happens near-instantaneously because the blue revision remains running and ready to receive production traffic. There is no end-user impact during this process, as production traffic never shifted to the new application version—ECS simply rolled back test traffic to the original stable version. This eliminates the typical deployment downtime associated with traditional rolling deployments.

I can also see the rollback status in the Last deployment section.

Throughout my testing, I observed that the blue/green deployment strategy provides consistent and predictable behavior. Furthermore, the deployment lifecycle hooks provide more flexibility to control the behavior of the deployment. Each service revision maintains immutable configuration including task definition, load balancer settings, and Service Connect configuration. This means that rollbacks restore exactly the same environment that was previously running.

Additional things to know
Here are a couple of things to note:

  • Pricing – The blue/green deployment capability is included with Amazon ECS at no additional charge. You pay only for the compute resources used during the deployment process.
  • Availability – This capability is available in all commercial AWS Regions.

Get started with blue/green deployments by updating your Amazon ECS service configuration in the Amazon ECS console.

Happy deploying!
Donnie

Top announcements of the AWS Summit in New York, 2025

Post Syndicated from AWS News Blog Team original https://aws.amazon.com/blogs/aws/top-announcements-of-the-aws-summit-in-new-york-2025/

Today at the AWS Summit in New York City, Swami Sivasubramanian, AWS VP of Agentic AI, provided the day’s keynote on how we’re enabling customers to deliver production-ready AI agents at scale. See below for a roundup of the biggest announcements from the event.

Introducing Amazon Bedrock AgentCore: Securely deploy and operate AI agents at any scale (preview)
Amazon Bedrock AgentCore enables rapid deployment and scaling of AI agents with enterprise-grade security. It provides memory management, identity controls, and tool integration—streamlining development while working with any open-source framework and foundation model.

Announcing Amazon Nova customization in Amazon SageMaker AI
AWS now enables extensive customization of Amazon Nova foundation models through SageMaker AI across all stages of model training. Available as ready-to-use SageMaker recipes, these capabilities allow customers to adapt Nova understanding models across pre-training and post-training, including fine-tuning and alignment recipes to better address business-specific requirements across industries.

AWS Free Tier update: New customers can get started and explore AWS with up to $200 in credits
AWS is enhancing its Free Tier program with up to $200 in credits for new users: $100 upon sign-up and an additional $100 earned by completing activities with services like Amazon EC2, Amazon Bedrock, and AWS Budgets.

TwelveLabs video understanding models are now available in Amazon Bedrock
TwelveLabs video understanding models are now available on Amazon Bedrock and enable customers to search through videos, classify scenes, summarize content, and extract insights with precision and reliability.

Amazon S3 Metadata now supports metadata for all your S3 objects
Amazon S3 Metadata now provides comprehensive visibility into all objects in S3 buckets through live inventory and journal tables, enabling SQL-based analysis of both existing and new objects with automatic updates within an hour of changes.

Introducing Amazon S3 Vectors: First cloud storage with native vector support at scale (preview)
Amazon S3 Vectors is a new cloud object store that provides native support for storing and querying vectors at massive scale, offering up to 90% cost reduction compared to conventional approaches while seamlessly integrating with Amazon Bedrock Knowledge Bases, SageMaker, and OpenSearch for AI applications.

Streamline the path from data to insights with new Amazon SageMaker capabilities
Amazon SageMaker has introduced three new capabilities—Amazon QuickSight integration for dashboard creation, governance, and sharing, Amazon S3 Unstructured Data Integration for cataloging documents and media files, and automatic data onboarding from Lakehouse—that eliminate data silos by unifying structured and unstructured data management, visualization, and governance in a single experience.

Monitor and debug event-driven applications with new Amazon EventBridge logging
Amazon EventBridge now offers enhanced logging capabilities that provide comprehensive event lifecycle tracking, helping users monitor and troubleshoot their event-driven applications with detailed logs that show when events are published, matched against rules, delivered to subscribers, or encounter failures.

Amazon EKS enables ultra scale AI/ML workloads with support for 100K nodes per cluster
Amazon EKS now scales to 100,000 nodes per cluster, enabling massive AI/ML workloads with up to 1.6M AWS Trainium accelerators or 800K NVIDIA GPUs. This allows organizations to efficiently train and run large AI models while maintaining Kubernetes compatibility and existing tooling integration.

Announcing Amazon Nova customization in Amazon SageMaker AI

Post Syndicated from Betty Zheng (郑予彬) original https://aws.amazon.com/blogs/aws/announcing-amazon-nova-customization-in-amazon-sagemaker-ai/

Today, we’re announcing a suite of customization capabilities for Amazon Nova in Amazon SageMaker AI. Customers can now customize Nova Micro, Nova Lite, and Nova Pro across the model training lifecycle, including pre-training, supervised fine-tuning, and alignment. These techniques are available as ready-to-use Amazon SageMaker recipes with seamless deployment to Amazon Bedrock, supporting both on-demand and provisioned throughput inference.

Amazon Nova foundation models power diverse generative AI use cases across industries. As customers scale deployments, they need models that reflect proprietary knowledge, workflows, and brand requirements. Prompt optimization and retrieval-augmented generation (RAG) work well for integrating general-purpose foundation models into applications, however business-critical workflows require model customization to meet specific accuracy, cost, and latency requirements.

Choosing the right customization technique
Amazon Nova models support a range of customization techniques including: 1) supervised fine-tuning, 2) alignment, 3) continued pre-training, and 4) knowledge distillation. The optimal choice depends on goals, use case complexity, and the availability of data and compute resources. You can also combine multiple techniques to achieve your desired outcomes with the preferred mix of performance, cost, and flexibility.

Supervised fine-tuning (SFT) customizes model parameters using a training dataset of input-output pairs specific to your target tasks and domains. Choose from the following two implementation approaches based on data volume and cost considerations:

  • Parameter-efficient fine-tuning (PEFT) — updates only a subset of model parameters through lightweight adapter layers such as LoRA (Low-Rank Adaptation). It offers faster training and lower compute costs compared to full fine-tuning. PEFT-adapted Nova models are imported to Amazon Bedrock and invoked using on-demand inference.
  • Full fine-tuning (FFT) — updates all the parameters of the model and is ideal for scenarios when you have extensive training datasets (tens of thousands of records). Nova models customized through FFT can also be imported to Amazon Bedrock and invoked for inference with provisioned throughput.

Alignment steers the model output towards desired preferences for product-specific needs and behavior, such as company brand and customer experience requirements. These preferences may be encoded in multiple ways, including empirical examples and policies. Nova models support two preference alignment techniques:

  • Direct preference optimization (DPO) — offers a straightforward way to tune model outputs using preferred/not preferred response pairs. DPO learns from comparative preferences to optimize outputs for subjective requirements such as tone and style. DPO offers both a parameter-efficient version and a full-model update version. The parameter-efficient version supports on-demand inference.
  • Proximal policy optimization (PPO) — uses reinforcement learning to enhance model behavior by optimizing for desired rewards such as helpfulness, safety, or engagement. A reward model guides optimization by scoring outputs, helping the model learn effective behaviors while maintaining previously learned capabilities.

Continued pre-training (CPT) expands foundational model knowledge through self-supervised learning on large quantities of unlabeled proprietary data, including internal documents, transcripts, and business-specific content. CPT followed by SFT and alignment through DPO or PPO provides a comprehensive way to customize Nova models for your applications.

Knowledge distillation transfers knowledge from a larger “teacher” model to a smaller, faster, and more cost-efficient “student” model. Distillation is useful in scenarios where customers do not have adequate reference input-output samples and can leverage a more powerful model to augment the training data. This process creates a customized model of teacher-level accuracy for specific use cases and student-level cost-effectiveness and speed.

Here is a table summarizing the available customization techniques across different modalities and deployment options. Each technique offers specific training and inference capabilities depending on your implementation requirements.

Recipe Modality Training Inference
Amazon Bedrock Amazon SageMaker Amazon Bedrock On-demand Amazon Bedrock Provisioned Throughput
Supervised fine tuning Text, image, video
Parameter-efficient fine-tuning (PEFT) ✅ ✅ ✅ ✅
Full fine-tuning ✅ ✅
Direct preference optimization (DPO)  Text, image, video
Parameter-efficient DPO ✅ ✅ ✅
Full model DPO ✅ ✅
Proximal policy optimization (PPO)  Text-only ✅ ✅
Continuous pre-training  Text-only ✅ ✅
Distillation Text-only ✅ ✅ ✅ ✅

Early access customers, including Cosine AI, Massachusetts Institute of Technology (MIT) Computer Science and Artificial Intelligence Laboratory (CSAIL), Volkswagen, Amazon Customer Service, and Amazon Catalog Systems Service, are already successfully using Amazon Nova customization capabilities.

Customizing Nova models in action
The following walks you through an example of customizing the Nova Micro model using direct preference optimization on an existing preference dataset. To do this, you can use Amazon SageMaker Studio.

Launch your SageMaker Studio in the Amazon SageMaker AI console and choose JumpStart, a machine learning (ML) hub with foundation models, built-in algorithms, and pre-built ML solutions that you can deploy with a few clicks.

Then, choose Nova Micro, a text-only model that delivers the lowest latency responses at the lowest cost per inference among the Nova model family, and then choose Train.

Next, you can choose a fine-tuning recipe to train the model with labeled data to enhance performance on specific tasks and align with desired behaviors. Choosing the Direct Preference Optimization offers a straightforward way to tune model outputs with your preferences.

When you choose Open sample notebook, you have two environment options to run the recipe: either on the SageMaker training jobs or SageMaker Hyperpod:

Choose Run recipe on SageMaker training jobs when you don’t need to create a cluster and train the model with the sample notebook by selecting your JupyterLab space.

Alternately, if you want to have a persistent cluster environment optimized for iterative training processes, choose Run recipe on SageMaker HyperPod. You can choose a HyperPod EKS cluster with at least one restricted instance group (RIG) to provide a specialized isolated environment, which is required for such Nova model training. Then, choose your JupyterLabSpace and Open sample notebook.

This notebook provides an end-to-end walkthrough for creating a SageMaker HyperPod job using a SageMaker Nova model with a recipe and deploying it for inference. With the help of a SageMaker HyperPod recipe, you can streamline complex configurations and seamlessly integrate datasets for optimized training jobs.

In SageMaker Studio, you can see that your SageMaker HyperPod job has been successfully created and you can monitor it for further progress.

After your job completes, you can use a benchmark recipe to evaluate if the customized model performs better on agentic tasks.

For comprehensive documentation and additional example implementations, visit the SageMaker HyperPod recipes repository on GitHub. We continue to expand the recipes based on customer feedback and emerging ML trends, ensuring you have the tools needed for successful AI model customization.

Availability and getting started
Recipes for Amazon Nova on Amazon SageMaker AI are available in US East (N. Virginia). Learn more about this feature by visiting the Amazon Nova customization webpage and Amazon Nova user guide and get started in the Amazon SageMaker AI console.

Betty

Introducing Amazon Bedrock AgentCore: Securely deploy and operate AI agents at any scale (preview)

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/introducing-amazon-bedrock-agentcore-securely-deploy-and-operate-ai-agents-at-any-scale/

In just a few years, foundation models (FMs) have evolved from being used directly to create content in response to a user’s prompt, to now powering AI agents, a new class of software applications that use FMs to reason, plan, act, learn, and adapt in pursuit of user-defined goals with limited human oversight. This new wave of agentic AI is enabled by the emergence of standardized protocols such as Model Context Protocol (MCP) and Agent2Agent (A2A) that simplify how agents connect with other tools and systems.

In fact, building AI agents that can reliably perform complex tasks has become increasingly accessible thanks to open source frameworks like CrewAILangGraph, and Strands Agents. However, moving from a promising proof-of-concept to a production-ready agent that can scale to thousands of users presents significant challenges.

Instead of being able to focus on the core features of the agent, developers and AI engineers have to spend months building foundational infrastructure for session management, identity controls, memory systems, and observability—at the same time supporting security and compliance.

Today, we’re excited to announce the preview of Amazon Bedrock AgentCore, a comprehensive set of enterprise-grade services that help developers quickly and securely deploy and operate AI agents at scale using any framework and model, hosted on Amazon Bedrock or elsewhere.

More specifically, we are introducing today:

AgentCore Runtime – Provides sandboxed low-latency serverless environments with session isolation, supporting any agent framework including popular open source frameworks, tools, and models, and handling multimodal workloads and long-running agents.

AgentCore Memory – Manages session and long-term memory, providing relevant context to models while helping agents learn from past interactions.

AgentCore Observability – Offers step-by-step visualization of agent execution with metadata tagging, custom scoring, trajectory inspection, and troubleshooting/debugging filters.

AgentCore Identity – Enables AI agents to securely access AWS services and third-party tools and services such as GitHub, Salesforce, and Slack, either on behalf of users or by themselves with pre-authorized user consent.

AgentCore Gateway – Transforms existing APIs and AWS Lambda functions into agent-ready tools, offering unified access across protocols, including MCP, and runtime discovery.

AgentCore Browser – Provides managed web browser instances to scale your agents’ web automation workflows.

AgentCore Code Interpreter – Offers an isolated environment to run the code your agents generate.

These services can be used individually and are optimized to work together so developers don’t need to spend time piecing together components. AgentCore can work with open source or custom AI agent frameworks, giving teams the flexibility to maintain their preferred tools while gaining enterprise capabilities. To integrate these services into their existing code, developers can use the AgentCore SDK.

You can now discover, buy, and run pre-built agents and agent tools from AWS Marketplace with AgentCore Runtime. With just a few lines of code, your agents can securely connect to API-based agents and tools from AWS Marketplace with AgentCore Gateway to help you run complex workflows while maintaining compliance and control.

AgentCore eliminates tedious infrastructure work and operational complexity so development teams can bring groundbreaking agentic solutions to market faster.

Let’s see how this works in practice. I’ll share more info on the services as we use them.

Deploying a production-ready customer support assistant with Amazon Bedrock AgentCore (Preview)
When customers reach out with an email, it takes time to provide a reply. Customer support needs to check the validity of the email, find who the actual customer is in the customer relationship management (CRM) system, check their orders, and use product-specific knowledge bases to find the information required to prepare an answer.

An AI agent can simplify that by connecting to the internal systems, retrieve contextual information using a semantic data source, and draft a reply for the support team. For this use case, I built a simple prototype using Strands Agents. For simplicity and to validate the scenario, the internal tools are simulated using Python functions.

When I talk to developers, they tell me that similar prototypes, covering different use cases, are being built in many companies. When these prototypes are demonstrated to the company leadership and receive confirmation to proceed, the development team has to define how to go in production and satisfy the usual requirements for security, performance, availability, and scalability. This is where AgentCore can help.

Step 1 – Deploying to the cloud with AgentCore Runtime

AgentCore Runtime is a new service to securely deploy, run, and scale AI agents, providing isolation so that each user session runs in its own protected environment to help prevent data leakage—a critical requirement for applications handling sensitive data.

To match different security postures, agents can use different network configurations:

Sandbox – To only communicate with allowlisted AWS services.

Public – To run with managed internet access.

VPC-only (coming soon) – This option will allow to access resources hosted in a customer’s VPC or connected via AWS PrivateLink endpoints.

To deploy the agent to the cloud and get a secure, serverless endpoint with AgentCore Runtime, I add to the prototype a few lines of code using the AgentCore SDK to:

  • Import the AgentCore SDK.
  • Create the AgentCore app.
  • Specify which function is the entry point to invoke the agent.

Using a different or custom agent framework is a matter of replacing the agent invocation inside the entry point function.

Here’s the code of the prototype. The three lines I added to use AgentCore Runtime are the ones preceded by a comment.

from strands import Agent, tool
from strands_tools import calculator, current_time

# Import the AgentCore SDK
from bedrock_agentcore.runtime import BedrockAgentCoreApp

WELCOME_MESSAGE = """
Welcome to the Customer Support Assistant! How can I help you today?
"""

SYSTEM_PROMPT = """
You are an helpful customer support assistant.
When provided with a customer email, gather all necessary info and prepare the response email.
When asked about an order, look for it and tell the full description and date of the order to the customer.
Don't mention the customer ID in your reply.
"""

@tool
def get_customer_id(email_address: str):
    if email_address == "[email protected]":
        return { "customer_id": 123 }
    else:
        return { "message": "customer not found" }

@tool
def get_orders(customer_id: int):
    if customer_id == 123:
        return [{
            "order_id": 1234,
            "items": [ "smartphone", "smartphone USB-C charger", "smartphone black cover"],
            "date": "20250607"
        }]
    else:
        return { "message": "no order found" }

@tool
def get_knowledge_base_info(topic: str):
    kb_info = []
    if "smartphone" in topic:
        if "cover" in topic:
            kb_info.append("To put on the cover, insert the bottom first, then push from the back up to the top.")
            kb_info.append("To remove the cover, push the top and bottom of the cover at the same time.")
        if "charger" in topic:
            kb_info.append("Input: 100-240V AC, 50/60Hz")
            kb_info.append("Includes US/UK/EU plug adapters")
    if len(kb_info) > 0:
        return kb_info
    else:
        return { "message": "no info found" }

# Create an AgentCore app
app = BedrockAgentCoreApp()

agent = Agent(
    system_prompt=SYSTEM_PROMPT,
    tools=[calculator, current_time, get_customer_id, get_orders, get_knowledge_base_info]
)

# Specify the entrypoint function invoking the agent
@app.entrypoint
def invoke(payload, context: RequestContext):
    """Handler for agent invocation"""
    user_message = payload.get(
        "prompt", "No prompt found in input, please guide customer to create a json payload with prompt key"
    )
    result = agent(user_message)
    return {"result": result.message}

if __name__ == "__main__":
    app.run()

I install the AgentCore SDK and the starter toolkit in the Python virtual environment:

pip install bedrock-agentcore bedrock-agentcore-starter-toolkit

After I activate the virtual environment, I have access to the AgentCore command line interface (CLI) provided by the starter toolkit.

First, I use agentcore configure --entrypoint my_agent.py -er <IAM_ROLE_ARN> to configure the agent, passing the AWS Identity and Access Management (IAM) role that the agent will assume. In this case, the agent needs access to Amazon Bedrock to invoke the model. The role can give access to other AWS resources used by an agent, such as an Amazon Simple Storage Service (Amazon S3) bucket or a Amazon DynamoDB table.

I launch the agent locally with agentcore launch --local. When running locally, I can interact with the agent using agentcore invoke --local <PAYLOAD>. The payload is passed to the entry point function. Note that the JSON syntax of the invocations is defined in the entry point function. In this case, I look for prompt in the JSON payload, but can use a different syntax depending on your use case.

When I am satisfied by local testing, I use agentcore launch to deploy to the cloud.

After the deployment is succesful and an endpoint has been created, I check the status of the endpoint with agentcore status and invoke the endpoint with agentcore invoke <PAYLOAD>. For example, I pass a customer support request in the invocation:

agentcore invoke '{"prompt": "From: [email protected] – Hi, I bought a smartphone from your store. I am traveling to Europe next week, will I be able to use the charger? Also, I struggle to remove the cover. Thanks, Danilo"}'

Step 2 – Enabling memory for context

After an agent has been deployed in the AgentCore Runtime, the context needs to be persisted to be available for a new invocation. I add AgentCore Memory to maintain session context using its short-term memory capabilities.

First, I create a memory client and the memory store for the conversations:

from bedrock_agentcore.memory import MemoryClient

memory_client = MemoryClient(region_name="us-east-1")

memory = memory_client.create_memory_and_wait(
    name="CustomerSupport", 
    description="Customer support conversations"
)

I can now use create_event to stores agent interactions into short-term memory:

memory_client.create_event(
    memory_id=memory.get("id"), # Identifies the memory store
    actor_id="user-123",        # Identifies the user
    session_id="session-456",   # Identifies the session
    messages=[
        ("Hi, ...", "USER"),
        ("I'm sorry to hear that...", "ASSISTANT"),
        ("get_orders(customer_id='123')", "TOOL"),
        . . .
    ]
)

I can load the most recent turns of a conversations from short-term memory using list_events:

conversations = memory_client.list_events(
    memory_id=memory.get("id"), # Identifies the memory store
    actor_id="user-123",        # Identifies the user 
    session_id="session-456",   # Identifies the session
    max_results=5               # Number of most recent turns to retrieve
)

With this capability, the agent can maintain context during long sessions. But when a users come back with a new session, the conversation starts blank. Using long-term memory, the agent can personalize user experiences by retaining insights across multiple interactions.

To extract memories from a conversation, I can use built-in AgentCore Memory policies for user preferences, summarization, and semantic memory (to capture facts) or create custom policies for specialized needs. Data is stored encrypted using a namespace-based storage for data segmentation.

I change the previous code creating the memory store to include long-term capabilities by passing a semantic memory strategy. Note that an existing memory store can be updated to add strategies. In that case, the new strategies are applied to newer events.

memory = memory_client.create_memory_and_wait(
    name="CustomerSupport", 
    description="Customer support conversations",
    strategies=[{
        "semanticMemoryStrategy": {
            "name": "semanticFacts",
            "namespaces": ["/facts/{actorId}"]
        }
    }]
)

After long-term memory has been configured for a memory store, calling create_event will automatically apply those strategies to extract information from the conversations. I can then retrieve memories extracted from the conversation using a semantic query:

memories = memory_client.retrieve_memories(
    memory_id=memory.get("id"),
    namespace="/facts/user-123",
    query="smartphone model"
)

In this way, I can quickly improve the user experience so that the agent remembers customer preferences and facts that are outside of the scope of the CRM and use this information to improve the replies.

Step 3 – Adding identity and access controls

Without proper identity controls, access from the agent to internal tools always uses the same access level. To follow security requirements, I integrate AgentCore Identity so that the agent can use access controls scoped to the user’s or agent’s identity context.

I set up an identity client and create a workload identity, a unique identifier that represents the agent within the AgentCore Identity system:

from bedrock_agentcore.services.identity import IdentityClient

identity_client = IdentityClient("us-east-1")
workload_identity = identity_client.create_workload_identity(name="my-agent")

Then, I configure the credential providers, for example:

google_provider = identity_client.create_oauth2_credential_provider(
    {
        "name": "google-workspace",
        "credentialProviderVendor": "GoogleOauth2",
        "oauth2ProviderConfigInput": {
            "googleOauth2ProviderConfig": {
                "clientId": "your-google-client-id",
                "clientSecret": "your-google-client-secret",
            }
        },
    }
)

perplexity_provider = identity_client.create_api_key_credential_provider(
    {
        "name": "perplexity-ai",
        "apiKey": "perplexity-api-key"
    }
)

I can then add the @requires_access_token Python decorator (passing the provider name, the scope, and so on) to the functions that need an access token to perform their activities.

Using this approach, the agent can verify the identity through the company’s existing identity infrastructure, operate as a distinct, authenticated identity, act with scoped permissions and integrate across multiple identity providers (such as Amazon Cognito, Okta, or Microsoft Entra ID) and service boundaries including AWS and third-party tools and services (such as Slack, GitHub, and Salesforce).

To offer robust and secure access controls while streamlining end-user and agent builder experiences, AgentCore Identity implements a secure token vault that stores users’ tokens and allows agents to retrieve them securely.

For OAuth 2.0 compatible tools and services, when a user first grants consent for an agent to act on their behalf, AgentCore Identity collects and stores the user’s tokens issued by the tool in its vault, along with securely storing the agent’s OAuth client credentials. Agents, operating with their own distinct identity and when invoked by the user, can then access these tokens as needed, reducing the need for frequent user consent.

When the user token expires, AgentCore Identity triggers a new authorization prompt to the user for the agent to obtain updated user tokens. For tools that use API keys, AgentCore Identity also stores these keys securely and gives agents controlled access to retrieve them when needed. This secure storage streamlines the user experience while maintaining robust access controls, enabling agents to operate effectively across various tools and services.

Step 4 – Expanding agent capabilities with AgentCore Gateway

Until now, all internal tools are simulated in the code. Many agent frameworks, including Strands Agents, natively support MCP to connect to remote tools. To have access to internal systems (such as CRM and order management) via an MCP interface, I use AgentCore Gateway.

With AgentCore Gateway, the agent can access AWS services using Smithy models, Lambda functions, and internal APIs and third-party providers using OpenAPI specifications. It employs a dual authentication model to have secure access control for both incoming requests and outbound connections to target resources. Lambda functions can be used to integrate external systems, particularly applications that lack standard APIs or require multiple steps to retrieve information.

AgentCore Gateway facilitates cross-cutting features that most customers would otherwise need to build themselves, including authentication, authorization, throttling, custom request/response transformation (to match underlying API formats), multitenancy, and tool selection.

The tool selection feature helps find the most relevant tools for a specific agent’s task. AgentCore Gateway brings a uniform MCP interface across all these tools, using AgentCore Identity to provide an OAuth interface for tools that do not support OAuth out of the box like AWS services.

Step 5 – Adding capabilities with AgentCore Code Interpreter and Browser tools

To answer to customer requests, the customer support agent needs to perform calculations. To simplify that, I use the AgentCode SDK to add access to the AgentCore Code Interpreter.

Similarly, some of the integrations required by the agent don’t implement a programmatic API but need to be accessed through a web interface. I give access to the AgentCore Browser to let the agent navigate those web sites autonomously.

Step 6 – Gaining visibility with observability

Now that the agent is in production, I need visibility into its activities and performance. AgentCore provides enhanced observability to help developers effectively debug, audit, and monitor their agent performance in production. It comes with built-in dashboards to track essential operational metrics such as session count, latency, duration, token usage, error rates, and component-level latency and error breakdowns. AgentCore also gives visibility into an agent’s behavior by capturing and visualizing both the end-to-end traces, as well as “spans” that capture each step of the agent workflow including tool invocations, memory

The built-in dashboards offered by this service help reveal performance bottlenecks and identify why certain interactions might fail, enabling continuous improvement and reducing the mean time to detect (MTTD) and mean time to repair (MTTR) in case of issues.

AgentCore supports OpenTelemetry to help integrate agent telemetry data with existing observability platforms, including Amazon CloudWatch, Datadog, LangSmith, and Langfuse.

Step 7 – Conclusion

Through this journey, we transformed a local prototype into a production-ready system. Using AgentCore modular approach, we implemented enterprise requirements incrementally—from basic deployment to sophisticated memory, identity management, and tool integration—all while maintaining the existing agent code.

Things to know
Amazon Bedrock AgentCore is available in preview in US East (N. Virginia), US West (Oregon), Asia Pacific (Sydney), and Europe (Frankfurt). You can start using AgentCore services through the AWS Management Console , the AWS Command Line Interface (AWS CLI), the AWS SDKs, or via the AgentCore SDK.

You can try AgentCore services at no charge until September 16, 2025. Standard AWS pricing applies to any additional AWS Services used as part of using AgentCore (for example, CloudWatch pricing will apply for AgentCore Observability). Starting September 17, 2025, AWS will bill you for AgentCore service usage based on this page.

Whether you’re building customer support agents, workflow automation, or innovative AI-powered experiences, AgentCore provides the foundation you need to move from prototype to production with confidence.

To learn more and start deploying production-ready agents, visit the AgentCore documentation. For code examples and integration guides, check out the AgentCore samples GitHub repo.

Join the AgentCore Preview Discord server to provide feedback and discuss use cases. We’d like to hear from you!

Danilo

Streamline the path from data to insights with new Amazon SageMaker Catalog capabilities

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/streamline-the-path-from-data-to-insights-with-new-amazon-sagemaker-capabilities/

Modern organizations manage data across multiple disconnected systems—structured databases, unstructured files, and separate visualization tools—creating barriers that slow analytics workflows and limit insight generation. Separate visualization platforms often create barriers that prevent teams from extracting comprehensive business insights.

These disconnected workflows prevent your organizations from maximizing your data investments, creating delays in decision making and missed opportunities for comprehensive analysis that combines multiple data types.

Starting today, you can use three new capabilities in Amazon SageMaker to accelerate your path from raw data to actionable insights:

  • Amazon QuickSight integration – Launch Amazon QuickSight directly from Amazon SageMaker Unified Studio to build dashboards using your project data, then publish them to the Amazon SageMaker Catalog for broader discovery and sharing across your organization.
  • Amazon SageMaker adds support for Amazon S3 general purpose buckets and Amazon S3 Access Grants in SageMaker Catalog– Make data stored in Amazon S3 general purpose buckets easier for teams to find, access, and collaborate on all types of data including unstructured data, while maintaining fine-grained access control using Amazon S3 Access Grants.
  • Automatic data onboarding from your lakehouse – Automatic onboarding of existing AWS Glue Data Catalog (GDC) datasets from the lakehouse architecture into SageMaker Catalog, without manual setup.

These new SageMaker capabilities address the complete data lifecycle within a unified and governed experience. You get automatic onboarding of existing structured data from your lakehouse, seamless cataloging of unstructured data content in Amazon S3, and streamlined visualization through QuickSight—all with consistent governance and access controls.

Let’s take a closer look at each capability.

Amazon SageMaker and Amazon QuickSight Integration
With this integration, you can build dashboards in Amazon QuickSight using data from your Amazon SageMaker projects. When you launch QuickSight from Amazon SageMaker Unified Studio, Amazon SageMaker automatically creates the QuickSight dataset and organizes it in a secured folder accessible only to project members.

Furthermore, the dashboards you build stay within this folder and automatically appear as assets in your SageMaker project, where you can publish them to the SageMaker Catalog and share them with users or groups in your corporate directory. This keeps your dashboards organized, discoverable, and governed within SageMaker Unified Studio.

To use this integration, both your Amazon SageMaker Unified Studio domain and QuickSight account must be integrated with AWS IAM Identity Center using the same IAM Identity Center instance. Additionally, your QuickSight account must exist in the same AWS account where you want to enable the QuickSight blueprint. You can learn more about the prerequisites on Documentation page

After these prerequisites are met, you can enable the blueprint for Amazon QuickSight by navigating to the Amazon SageMaker console and choosing the Blueprints tab. Then find Amazon QuickSight and follow the instructions.

You also need to configure your SQL analytics project profile to include Amazon QuickSight in Add blueprint deployment settings.

To learn more on onboarding setup, refer to the Documentation page.

Then, when you create a new project, you need to use the SQL analytics profile.

With your project created, you can start building visualizations with QuickSight. You can navigate to the Data tab, select the table or view to visualize, and choose Open in QuickSight under Actions.

This will redirect you to the Amazon QuickSight transactions dataset page and you can choose USE IN ANALYSIS to begin exploring the data.

When you create a project with the QuickSight blueprint, SageMaker Unified Studio automatically provisions a restricted QuickSight folder per project where SageMaker scopes all new assets—analyses, datasets, and dashboards. The integration maintains real-time folder permission sync, keeping QuickSight folder access permissions aligned with project membership.

Amazon Simple Storage Service (S3) general purpose buckets integration
Starting today, SageMaker adds support for S3 general purpose buckets in SageMaker Catalog to increase discoverability and allows granular permissions through S3 Access Grants, enabling users to govern data, including sharing and managing permissions. Data consumers, such as data scientists, engineers, and business analysts, can now discover and access S3 assets through SageMaker Catalog. This expansion also enables data producers to govern security controls on any S3 data asset through a single interface.

To use this integration, you need appropriate S3 general purpose bucket permissions, and your SageMaker Unified Studio projects must have access to the S3 buckets containing your data. Learn more about prerequisites on Amazon S3 data in Amazon SageMaker Unified Studio Documentation page.

You can add a connection to an existing S3 bucket.

When it’s connected, you can browse accessible folders and create discoverable assets by choosing on the bucket or a folder and selecting Publish to Catalog.

This action creates a SageMaker Catalog asset of type “S3 Object Collection” and opens an asset details page where users can augment business context to improve search and discoverability. Once published, data consumers can discover and subscribe to these cataloged assets. When data consumers subscribe to “S3 Object Collection” assets, SageMaker Catalog automatically grants access using S3 Access Grants upon approval, enabling cross-team collaboration while ensuring only the right users have the right access.

When you have access, now you can process your unstructured data in Amazon SageMaker Jupyter notebook. Following screenshot is an example to process image in medical use case.

If you have structured data, you can query your data using Amazon Athena or process using Spark in notebooks.

With this access granted through S3 Access Grants, you can seamlessly incorporate S3 data into my workflows—analyzing it in notebooks, combining it with structured data in the lakehouse and Amazon Redshift for comprehensive analytics. You can access unstructured data such as documents, images in JupyterLab notebooks to train ML models, or generate queryable insights.

Automatic data onboarding from your lakehouse
This integration automatically onboards all your lakehouse datasets into SageMaker Catalog. The key benefit for you is to bring AWS Glue Data Catalog (GDC) datasets into SageMaker Catalog, eliminating manual setup for cataloging, sharing, and governing them centrally.

This integration requires an existing lakehouse setup with Data Catalog containing your structured datasets.

When you set up a SageMaker domain, SageMaker Catalog automatically ingests metadata from all lakehouse databases and tables. This means you can immediately explore and use these datasets from within SageMaker Unified Studio without any configuration.

The integration helps you to start managing, governing, and consuming these assets from within SageMaker Unified Studio, applying the same governance policies and access controls you can use for other data types while unifying technical and business metadata.

Additional things to know
Here are a couple of things to note:

  • Availability – These integrations are available in all commercial AWS Regions where Amazon SageMaker is supported.
  • Pricing – Standard SageMaker Unified Studio, QuickSight, and Amazon S3 pricing applies. No additional charges for the integrations themselves.
  • Documentation – You can find complete setup guides in the SageMaker Unified Studio Documentation.

Get started with these new integrations through the Amazon SageMaker Unified Studio console.

Happy building!
Donnie

AWS Free Tier update: New customers can get started and explore AWS with up to $200 in credits

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/aws-free-tier-update-new-customers-can-get-started-and-explore-aws-with-up-to-200-in-credits/

When you’re new to Amazon Web Services (AWS), you can get started with AWS Free Tier to learn about AWS services, gain hands-on experience, and build applications. You can explore the portfolio of services without incurring costs, making it even easier to get started with AWS.

Today, we’re announcing some enhancements to the AWS Free Tier program, offering up to $200 in AWS credits that can be used across AWS services. You’ll receive $100 in AWS credits upon sign-up and can earn an additional $100 in credits by using services such as Amazon Elastic Compute Cloud (Amazon EC2), Amazon Relational Database Service (Amazon RDS), AWS Lambda, Amazon Bedrock, and AWS Budgets.

The enhanced AWS Free Tier program offers two options during sign-up: a free account plan and a paid account plan. The free account plan ensures you won’t incur any charges until you upgrade to a paid plan. The free account plan expires after 6 months or when you exhaust your credits, whichever comes first.

While on the free account plan, you won’t be able to use some services typically used by large enterprises. You can upgrade to a paid plan at any time to continue building on AWS. When you upgrade, you can still use any unused credits for any eligible service usage for up to 12 months from your initial sign-up date.

When you choose the paid plan, AWS will automatically apply your Free Tier credits to the use of eligible services in your AWS bills. For usage that exceeds the credits, you’re charged with the on-demand pricing.

Get up to $200 credits in action
When you sign up for either a free plan or a paid plan, you’ll receive $100 credit. You can also earn an additional $20 credits for each of these five AWS service activities you complete:

  • Amazon EC2 – You’ll learn how to launch an EC2 instance and terminate it.
  • Amazon RDS – You’ll learn the basic configuration options for launching an RDS database.
  • AWS Lambda – You’ll learn to build a straightforward web application consisting of a Lambda function with a function URL.
  • Amazon Bedrock – You’ll learn how to submit a prompt to generate a response in the Amazon Bedrock text playground.
  • AWS Budgets – You’ll learn how to set a budget that alerts you when you exceed your budgeted cost amount.

You can see the credit details in the Explore AWS widget in the AWS Management Console.

These activities are designed to expose customers to important building blocks of AWS, including cost and usage that show up in the AWS Billing Console. These charges are deducted from your Free Tier credits and help teach new AWS users about selecting the appropriate instance sizes to minimize your costs.

Choose Set up a cost budget using AWS Budgets to earn your first $20 credits. It redirects to the AWS Billing and Cost Management console.

To create your first budget, choose Use a template (simplified) and Monthly cost budget to notify you if you exceed, or are forecasted to exceed, the budget amount.

When you choose the Customize (advanced) setup option, you can customize a budget to set parameters specific to your use case, scope of AWS services or AWS Regions, the time period, the start month, and specific accounts.

After you successfully create your budget, your begin receiving alerts when your spend exceeds your budgeted amount.

You can go to the Credits page in the left navigation pane in the AWS Billing and Cost Management Console to confirm your $20 in credits. Please note, it can take up to 10 minutes for your credits to appear.

You can receive an additional $80 by completing the remaining four activities. Now you can use up to $200 in credits to learn AWS services and build your first application.

Things to know
Here are some of things to know about the enhanced AWS Free Tier program:

  • Notifications – We’ll send an email alert when 50 percent, 25 percent, or 10 percent of your AWS credits remain. We’ll also send notifications to the AWS console and your email inbox when you have 15 days, 7 days, and 2 days left in your 6-month free period. After your free period ends, we’ll send you an email with instructions on how to upgrade to a paid plan. You’ll have 90 days to reopen your account by upgrading to a paid plan.
  • AWS services – The free account can access parts of AWS services including over 30 services that offer always-free tier. The paid account can access all AWS services. For more information, visit AWS Free Tier page.
  • Legacy Free Tier – If your AWS account was created before July 15, 2025, you’ll continue to be in the legacy Free Tier program, where you can access short-term trials, 12-month trials, and always free tier services. The always-free tier is available under both the new Free Tier program and the legacy Free Tier program.

Now available
The new AWS Free Tier features are generally available in all AWS Regions, except the AWS GovCloud (US) Regions and the China Regions. To learn more, visit the AWS Free Tier page and AWS Free Tier Documentation.

Give the new AWS Free Tier a try by signing up today, and send feedback to AWS re:Post for AWS Free Tier or through your usual AWS Support contacts.

Channy

Monitor and debug event-driven applications with new Amazon EventBridge logging

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/monitor-and-debug-event-driven-applications-with-new-amazon-eventbridge-logging/

Starting today, you can use enhanced logging capability in Amazon EventBridge to monitor and debug your event-driven applications with comprehensive logs. These new enhancements help improve how you monitor and troubleshoot event flows.

Here’s how you can find this new capability on the Amazon EventBridge console:

The new observability capabilities address microservices and event-driven architecture monitoring challenges by providing comprehensive event lifecycle tracking. EventBridge now generates detailed log entries every time a matched event against rules is published, delivered to subscribers, or encounters failures and retries.

You gain visibility into the complete event journey with detailed information about successes, failures, and status codes that make identifying and diagnosing issues straightforward. What used to take hours of trial-and-error debugging now takes minutes with detailed event lifecycle tracking and built-in query tools.

Using Amazon EventBridge enhanced observability
Let me walk you through a demonstration that showcases the logging capability in Amazon EventBridge.

I can enable logging for an existing event bus or when creating a new custom event bus. First, I navigate to the EventBridge console and choose Event buses in the left navigation pane. In Custom event bus, I choose Create event bus.

I can see this new capability in the Logs section. I have three options to configure the Log destination: Amazon CloudWatch Logs, Amazon Data Firehose Stream, and Amazon Simple Storage Service (Amazon S3). If I want to stream my logs into a data lake, I can select Amazon Kinesis Data Firehose Stream. Logs are encrypted in transit with TLS and at rest if a customer-managed key (CMK) is provided for the event bus. CloudWatch Logs supports customer-managed keys, and Data Firehose offers server-side encryption for downstream destinations.

For this demo, I select CloudWatch logs and S3 logs.

I can also choose Log level, from Error, Info, or Trace. I choose Trace and select Include execution data because I need to review the payloads. You need to be mindful as logging payload data may contain sensitive information, and this setting applies to all log destinations you select. Then, I configure two destinations, one each for CloudWatch log group and S3 logs. Then I choose Create.

After logging is enabled, I can start publishing test events to observe the logging behavior.

For the first scenario, I’ve built an AWS Lambda function and configured this Lambda function as a target.

I navigate to my event bus to send a sample event by choosing Send events.

Here’s the payload that I use:

{
  "Source": "ecommerce.orders",
  "DetailType": "Order Placed",
  "Detail": {
    "orderId": "12345",
    "customerId": "cust-789",
    "amount": 99.99,
    "items": [
      {
        "productId": "prod-456",
        "quantity": 2,
        "price": 49.99
      }
    ]
  }
}

After I sent the sample event, I can see the logs are available in my S3 bucket.

I can also see the log entries appearing in the Amazon CloudWatch logs. The logs show the event lifecycle, from EVENT_RECEIPT to SUCCESS. Learn more about the complete event lifecycle on TBD:DOC_PAGE.

Now, let’s evaluate these logs. For brevity, I only include a few logs and have redacted them for readability. Here’s the log from when I triggered the event:

{
    "resource_arn": "arn:aws:events:us-east-1:123:event-bus/demo-logging",
    "message_timestamp_ms": 1751608776896,
    "event_bus_name": "demo-logging",
// REDACTED FOR BREVITY //
    "message_type": "EVENT_RECEIPT",
    "log_level": "TRACE",
    "details": {
        "caller_account_id": "123",
        "source_time_ms": 1751608775000,
        "source": "ecommerce.orders",
        "detail_type": "Order Placed",
        "resources": [],
        "event_detail": "REDACTED FOR BREVITY"
    }
}

Here’s the log when the event was successfully invoked:

{
    "resource_arn": "arn:aws:events:us-east-1:123:event-bus/demo-logging",
    "message_timestamp_ms": 1751608777091,
    "event_bus_name": "demo-logging",
// REDACTED FOR BREVITY //
    "message_type": "INVOCATION_SUCCESS",
    "log_level": "INFO",
    "details": {
// REDACTED FOR BREVITY //
        "total_attempts": 1,
        "final_invocation_status": "SUCCESS",
        "ingestion_to_start_latency_ms": 105,
        "ingestion_to_complete_latency_ms": 183,
        "ingestion_to_success_latency_ms": 183,
        "target_duration_ms": 53,
        "target_response_body": "<REDACTED FOR BREVITY>",
        "http_status_code": 202
    }
}

The additional log entries include rich metadata that makes troubleshooting straightforward. For example, on a successful event, I can see the latency timing from starting to completing the event, duration for the target to finish processing, and HTTP status code.

Debugging failures with complete event lifecycle tracking
The benefit of EventBridge logging becomes apparent when things go wrong. To test failure scenarios, I intentionally misconfigure a Lambda function’s permissions and change the rule to point to a different Lambda function without proper permissions.

The attempt failed with a permanent failure due to missing permissions. The log shows it’s a FIRST attempt that resulted in NO_PERMISSIONS status.

{
    "message_type": "INVOCATION_ATTEMPT_PERMANENT_FAILURE",
    "log_level": "ERROR",
    "details": {
        "rule_arn": "arn:aws:events:us-east-1:123:rule/demo-logging/demo-order-placed",
        "role_arn": "arn:aws:iam::123:role/service-role/Amazon_EventBridge_Invoke_Lambda_123",
        "target_arn": "arn:aws:lambda:us-east-1:123:function:demo-evb-fail",
        "attempt_type": "FIRST",
        "attempt_count": 1,
        "invocation_status": "NO_PERMISSIONS",
        "target_duration_ms": 25,
        "target_response_body": "{\"requestId\":\"a4bdfdc9-4806-4f3e-9961-31559cb2db62\",\"errorCode\":\"AccessDeniedException\",\"errorType\":\"Client\",\"errorMessage\":\"User: arn:aws:sts::123:assumed-role/Amazon_EventBridge_Invoke_Lambda_123/db4bff0a7e8539c4b12579ae111a3b0b is not authorized to perform: lambda:InvokeFunction on resource: arn:aws:lambda:us-east-1:123:function:demo-evb-fail because no identity-based policy allows the lambda:InvokeFunction action\",\"statusCode\":403}",
        "http_status_code": 403
    }
}

The final log entry summarizes the complete failure with timing metrics and the exact error message.

{
    "message_type": "INVOCATION_FAILURE",
    "log_level": "ERROR",
    "details": {
        "rule_arn": "arn:aws:events:us-east-1:123:rule/demo-logging/demo-order-placed",
        "role_arn": "arn:aws:iam::123:role/service-role/Amazon_EventBridge_Invoke_Lambda_123",
        "target_arn": "arn:aws:lambda:us-east-1:123:function:demo-evb-fail",
        "total_attempts": 1,
        "final_invocation_status": "NO_PERMISSIONS",
        "ingestion_to_start_latency_ms": 62,
        "ingestion_to_complete_latency_ms": 114,
        "target_duration_ms": 25,
        "http_status_code": 403
    },
    "error": {
        "http_status_code": 403,
        "error_message": "User: arn:aws:sts::123:assumed-role/Amazon_EventBridge_Invoke_Lambda_123/db4bff0a7e8539c4b12579ae111a3b0b is not authorized to perform: lambda:InvokeFunction on resource: arn:aws:lambda:us-east-1:123:function:demo-evb-fail because no identity-based policy allows the lambda:InvokeFunction action",
        "aws_service": "AWSLambda",
        "request_id": "a4bdfdc9-4806-4f3e-9961-31559cb2db62"
    }
}

The logs provide detailed performance metrics that help identify bottlenecks. The ingestion_to_start_latency_ms: 62 shows the time from event ingestion to starting invocation, while ingestion_to_complete_latency_ms: 114 represents the total time from ingestion to completion. Additionally, target_duration_ms: 25 indicates how long the target service took to respond, helping distinguish between EventBridge processing time and target service performance.

The error message clearly states what failed, lambda:InvokeFunction action, why it failed, (no identity-based policy allows the action), which role was involved (Amazon_EventBridge_Invoke_Lambda_1428392416), and which specific resource was affected, which was indicated by the Lambda function Amazon Resource Name (ARN).

Debugging API Destinations with EventBridge Logging
One particular use case that I think EventBridge logging capability will be helpful is to debug issues with API destinations. EventBridge API destinations are HTTPS endpoints that you can invoke as the target of an event bus rule or pipe. HTTPS endpoints help you to route events from your event bus to external systems, software-as-a-service (SaaS) applications, or third-party APIs using HTTPS calls. They use connections to handle authentication and credentials, making it easy to integrate your event-driven architecture with any HTTPS-based service. 

API destinations are commonly used to send events to external HTTPS endpoints and debugging failures from the external endpoint can be a challenge. These problems typically stem from changes to the endpoint authentication requirements or modified credentials.

To demonstrate this debugging capability, I intentionally configured an API destination with incorrect credentials in the connection resource.

When I send an event to this misconfigured endpoint, the enhanced logging shows the root cause of this failure.

{
    "resource_arn": "arn:aws:events:us-east-1:123:event-bus/demo-logging",
    "message_timestamp_ms": 1750344097251,
    "event_bus_name": "demo-logging",
    //REDACTED FOR BREVITY//,
    "message_type": "INVOCATION_FAILURE",
    "log_level": "ERROR",
    "details": {
        //REDACTED FOR BREVITY//,
        "total_attempts": 1,
        "final_invocation_status": "SDK_CLIENT_ERROR",
        "ingestion_to_start_latency_ms": 135,
        "ingestion_to_complete_latency_ms": 549,
        "target_duration_ms": 327,
        "target_response_body": "",
        "http_status_code": 400
    },
    "error": {
        "http_status_code": 400,
        "error_message": "Unable to invoke ApiDestination endpoint: The request failed because the credentials included for the connection are not authorized for the API destination."
    }
}

The log provides immediate clarity about the failure. The target_arn shows this involves an API destination, the final_invocation_status indicates SDK_CLIENT_ERROR, and the http_status_code of 400 , which points to a client-side issue. Most importantly, the error_message explicitly states that: Unable to invoke ApiDestination endpoint: The request failed because the credentials included for the connection are not authorized for the API destination.

This complete log sequence provides useful debugging insights because I can see exactly how the event moved through EventBridge — from event receipt, to ingestion, to rule matching, to invocation attempts. This level of detail eliminates guesswork and points directly to the root cause of the issue.

Additional things to know
Here are a couple of things to note:

  • Architecture support – Logging works with all EventBridge features including custom event buses, partner event sources, and API destinations for HTTPS endpoints.
  • Performance impact – Logging operates asynchronously with no measurable impact on event processing latency or throughput.
  • Pricing – You pay standard Amazon S3, Amazon CloudWatch Logs or Amazon Data Firehose pricing for log storage and delivery. EventBridge logging itself incurs no additional charges. For details, visit the Amazon EventBridge pricing page .
  • Availability – Amazon EventBridge logging capability is available in all AWS Regions where EventBridge is supported.
  • Documentation — For more details, refer to the Amazon EventBridge monitoring and debugging Documentation.

Get started with Amazon EventBridge logging capability by visiting the EventBridge console and enabling logging on your event buses.

Happy building!
— Donnie 

Introducing Amazon S3 Vectors: First cloud storage with native vector support at scale (preview)

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/introducing-amazon-s3-vectors-first-cloud-storage-with-native-vector-support-at-scale/

Today, we’re announcing the preview of Amazon S3 Vectors, a purpose-built durable vector storage solution that can reduce the total cost of uploading, storing, and querying vectors by up to 90 percent. Amazon S3 Vectors is the first cloud object store with native support to store large vector datasets and provide subsecond query performance that makes it affordable for businesses to store AI-ready data at massive scale.

Vector search is an emerging technique used in generative AI applications to find similar data points to given data by comparing their vector representations using distance or similarity metrics. Vectors are numerical representation of unstructured data created from embedding models. You generate vectors using embedding models for fields inside your document and store vectors into S3 Vectors to search semantically.

S3 Vectors introduces vector buckets, a new bucket type with a dedicated set of APIs to store, access, and query vector data without provisioning any infrastructure. When you create an S3 vector bucket, you organize your vector data within vector indexes, making it simple for running similarity search queries against your dataset. Each vector bucket can have up to 10,000 vector indexes, and each vector index can hold tens of millions of vectors.

After creating a vector index, when adding vector data to the index, you can also attach metadata as key-value pairs to each vector to filter future queries based on a set of conditions, for example, dates, categories, or user preferences. As you write, update, and delete vectors over time, S3 Vectors automatically optimizes the vector data to achieve the best possible price-performance for vector storage, even as the datasets scale and evolve.

S3 Vectors is also natively integrated with Amazon Bedrock Knowledge Bases, including within Amazon SageMaker Unified Studio, for building cost-effective Retrieval-Augmented Generation (RAG) applications. Through its integration with Amazon OpenSearch Service, you can lower storage costs by keeping infrequent queried vectors in S3 Vectors and then quickly move them to OpenSearch as demands increase or to support real-time, low-latency search operations.

With S3 Vectors, you can now economically store the vector embeddings that represent massive amounts of unstructured data such as images, videos, documents, and audio files, enabling scalable generative AI applications including semantic and similarity search, RAG, and build agent memory. You can also build applications to support a wide range of industry use cases including personalized recommendations, automated content analysis, and intelligent document processing without the complexity and cost of managing vector databases.

S3 Vectors in action
To create a vector bucket, choose Vector buckets in the left navigation pane in the Amazon S3 console and then choose Create vector bucket.

Enter a vector bucket name and choose the encryption type. If you don’t specify an encryption type, Amazon S3 applies server-side encryption with Amazon S3 managed keys (SSE-S3) as the base level of encryption for new vectors. You can also choose server-side encryption with AWS Key Management Service (AWS KMS) keys (SSE-KMS). To learn more about managing your vector bucket, visit S3 Vector buckets in the Amazon S3 User Guide.

Now, you can create a vector index to store and query your vector data within your created vector bucket.

Enter a vector index name and the dimensionality of the vectors to be inserted in the index. All vectors added to this index must have exactly the same number of values.

For Distance metric, you can choose either Cosine or Euclidean. When creating vector embeddings, select your embedding model’s recommended distance metric for more accurate results.

Choose Create vector index and then you can insert, list, and query vectors.

To insert your vector embeddings to a vector index, you can use the AWS Command Line Interface (AWS CLI), AWS SDKs, or Amazon S3 REST API. To generate vector embeddings for your unstructured data, you can use embedding models offered by Amazon Bedrock.

If you’re using the latest AWS Python SDKs, you can generate vector embeddings for your text using Amazon Bedrock using following code example:

# Generate and print an embedding with Amazon Titan Text Embeddings V2.
import boto3 
import json 

# Create a Bedrock Runtime client in the AWS Region of your choice. 
bedrock= boto3.client("bedrock-runtime", region_name="us-west-2") 

The text strings to convert to embeddings.
texts = [
"Star Wars: A farm boy joins rebels to fight an evil empire in space", 
"Jurassic Park: Scientists create dinosaurs in a theme park that goes wrong",
"Finding Nemo: A father fish searches the ocean to find his lost son"]

embeddings=[]
#Generate vector embeddings for the input texts
for text in texts:
        body = json.dumps({
            "inputText": text
        })    
        # Call Bedrock's embedding API
        response = bedrock.invoke_model(
        modelId='amazon.titan-embed-text-v2:0',  # Titan embedding model 
        body=body)   
        # Parse response
        response_body = json.loads(response['body'].read())
        embedding = response_body['embedding']
        embeddings.append(embedding)

Now, you can insert vector embeddings into the vector index and query vectors in your vector index using the query embedding:

# Create S3Vectors client
s3vectors_client = boto3.client('s3vectors', region_name='us-west-2')

# Insert vector embedding
s3vectors.put_vectors( vectorBucketName="channy-vector-bucket",
  indexName="channy-vector-index", 
  vectors=[
{"key": "v1", "data": {"float32": embeddings[0]}, "metadata": {"id": "key1", "source_text": texts[0], "genre":"scifi"}},
{"key": "v2", "data": {"float32": embeddings[1]}, "metadata": {"id": "key2", "source_text": texts[1], "genre":"scifi"}},
{"key": "v3", "data": {"float32": embeddings[2]}, "metadata": {"id": "key3", "source_text":  texts[2], "genre":"family"}}
],
)

#Create an embedding for your query input text
# The text to convert to an embedding.
input_text = "List the movies about adventures in space"

# Create the JSON request for the model.
request = json.dumps({"inputText": input_text})

# Invoke the model with the request and the model ID, e.g., Titan Text Embeddings V2. 
response = bedrock.invoke_model(modelId="amazon.titan-embed-text-v2:0", body=request)

# Decode the model's native response body.
model_response = json.loads(response["body"].read())

# Extract and print the generated embedding and the input text token count.
embedding = model_response["embedding"]

# Performa a similarity query. You can also optionally use a filter in your query
query = s3vectors.query_vectors( vectorBucketName="channy-vector-bucket",
  indexName="channy-vector-index",
  queryVector={"float32":embedding},
  topK=3, 
  filter={"genre":"scifi"},
  returnDistance=True,
  returnMetadata=True
  )
results = query["vectors"]
print(results)

To learn more about inserting vectors into a vector index, or listing, querying, and deleting vectors, visit S3 vector buckets and S3 vector indexes in the Amazon S3 User Guide. Additionally, with the S3 Vectors embed command line interface (CLI), you can create vector embeddings for your data using Amazon Bedrock and store and query them in an S3 vector index using single commands. For more information, see the S3 Vectors Embed CLI GitHub repository.

Integrate S3 Vectors with other AWS services
S3 Vectors integrates with other AWS services such as Amazon Bedrock, Amazon SageMaker, and Amazon OpenSearch Service to enhance your vector processing capabilities and provide comprehensive solutions for AI workloads.

Create Amazon Bedrock Knowledge Bases with S3 Vectors
You can use S3 Vectors in Amazon Bedrock Knowledge Bases to simplify and reduce the cost of vector storage for RAG applications. When creating a knowledge base in the Amazon Bedrock console, you can choose the S3 vector bucket as your vector store option.

In Step 3, you can choose the Vector store creation method either to create an S3 vector bucket and vector index or choose the existing S3 vector bucket and vector index that you’ve previously created.

For detailed step-by-step instructions, visit Create a knowledge base by connecting to a data source in Amazon Bedrock Knowledge Bases in the Amazon Bedrock User Guide.

Using Amazon SageMaker Unified Studio
You can create and manage knowledge bases with S3 Vectors in Amazon SageMaker Unified Studio when you build your generative AI applications through Amazon Bedrock. SageMaker Unified Studio is available in the next generation of Amazon SageMaker and provides a unified development environment for data and AI, including building and texting generative AI applications that use Amazon Bedrock knowledge bases.

You can choose your knowledge bases using the S3 Vectors created through Amazon Bedrock when you build generative AI applications. To learn more, visit Add a data source to your Amazon Bedrock app in the Amazon SageMaker Unified Studio User Guide.

Export S3 vector data to Amazon OpenSearch Service
You can balance cost and performance by adopting a tiered strategy that stores long-term vector data cost-effectively in Amazon S3 while exporting high priority vectors to OpenSearch for real-time query performance.

This flexibility means your organizations can access OpenSearch’s high performance (high QPS, low latency) for critical, real-time applications, such as product recommendations or fraud detection, while keeping less time-sensitive data in S3 Vectors.

To export your vector index, choose Advanced search export, then choose Export to OpenSearch in the Amazon S3 console.

Then, you will be brought to the Amazon OpenSearch Service Integration console with a template for S3 vector index export to OpenSearch vector engine. Choose Export with pre-selected S3 vector source and a service access role.

It will start the steps to create a new OpenSearch Serverless collection and migrate data from your S3 vector index into an OpenSearch knn index.

Choose the Import history in the left navigation pane. You can see the new import job that was created to make a copy of vector data from your S3 vector index into the OpenSearch Serverless collection.

Once the status changes to Complete, you can connect to the new OpenSearch serverless collection and query your new OpenSearch knn index.

To learn more, visit Creating and managing Amazon OpenSearch Serverless collections in the Amazon OpenSearch Service Developer Guide.

Now available
Amazon S3 Vectors, and its integrations with Amazon Bedrock, Amazon OpenSearch Service, and Amazon SageMaker are now in preview in the US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Frankfurt), and Asia Pacific (Sydney) Regions.

Give S3 Vectors a try in the Amazon S3 console today and send feedback to AWS re:Post for Amazon S3 or through your usual AWS Support contacts.

Channy

Amazon S3 Metadata now supports metadata for all your S3 objects

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/amazon-s3-metadata-now-supports-metadata-for-all-your-s3-objects/

Amazon S3 Metadata now provides complete visibility into all your existing objects in your Amazon Simple Storage Service (Amazon S3) buckets, expanding beyond new objects and changes. With this expanded coverage, you can analyze and query metadata for your entire S3 storage footprint.

Today, many customers rely on Amazon S3 to store unstructured data at scale. To understand what’s in a bucket, you often need to build and maintain custom systems that scan for objects, track changes, and manage metadata over time. These systems are expensive to maintain and hard to keep up to date as data grows.

Since the launch of S3 Metadata at re:Invent 2024, you’ve been able to query new and updated object metadata using metadata tables instead of relying on Amazon S3 Inventory or object-level APIs such as ListObjects, HeadObject, and GetObject—which can introduce latency and impact downstream workflows.

To make it easier for you to work with this expanded metadata, S3 Metadata introduces live inventory tables that work with familiar SQL-based tools. After your existing objects are backfilled into the system, any updates like uploads or deletions typically appear within an hour in your live inventory tables.

With S3 Metadata live inventory tables, you get a fully managed Apache Iceberg table that provides a complete and current snapshot of the objects and their metadata in your bucket, including existing objects, thanks to backfill support. These tables are refreshed automatically within an hour of changes such as uploads or deletions, so you stay up to date. You can use them to identify objects with specific properties—like unencrypted data, missing tags, or particular storage classes—and to support analytics, cost optimization, auditing, and governance.

S3 Metadata journal tables, previously known as S3 Metadata tables, are automatically enabled when you configure live inventory tables, provide a near real-time view of object-level changes in your bucket—including uploads, deletions, and metadata updates. These tables are ideal for auditing activity, tracking the lifecycle of objects, and generating event-driven insights. For example, you can use them to find out which objects were deleted in the past 24 hours, identify the requester making the most PUT operations, or monitor updates to object metadata over time.

S3 Metadata tables are created in a namespace name that is similar to your bucket name for easier discovery. The tables are stored in AWS system table buckets, grouped by account and Region. After you enable S3 Metadata for a general purpose S3 bucket, the system creates and maintains these tables for you. You don’t need to manage compaction or garbage collection processes—S3 Tables takes care of table maintenance tasks in the background.

These new tables help avoid waiting for metadata discovery before processing can begin, making them ideal for large-scale analytics and machine learning (ML) workloads. By querying metadata ahead of time, you can schedule GPU jobs more efficiently and reduce idle time in compute-intensive environments.

Let’s see how it works
To see how this works in practice, I configure S3 Metadata for a general purpose bucket using the AWS Management Console.

S3 Metadata, start from general purpose bucket

After choosing a general purpose bucket, I choose the Metadata tab, then I choose Create metadata configuration.

S3 Metadata, configure journal and inventory tableFor Journal table, I can choose the Server-side encryption option and the Record expiration period. For Live Inventory table, I choose Enabled and I can select the Server-side encryption options.

I configure Record expiration on the journal table. Journal table records expire after the specified number of days, 365 days (one year) in my example.

Then, I choose Create metadata configuration.

S3 Metadata, backfilling

S3 Metadata creates the live inventory table and journal table. In the Live Inventory table section, I can observe the Table status: the system immediately starts to backfill the table with existing object metadata. It can take between minutes to hours. The exact time depends on the quantity of objects you have in your S3 bucket.

While waiting, I also upload and delete objects to generate data in the journal table.

Then, I navigate to Amazon Athena to start querying the new tables.

S3 Metadata, query with Athena

I choose Query table with Athena to start querying the table. I can choose between a couple of default queries on the console.

S3 MetaData table structure

In Athena, I observe the structure of the tables in the AWSDataCatalog Data source and I start with a short query to check how many records are available in the journal table. I already have 6,488 entries:

SELECT count(*) FROM "b_aws_news_blog_metadata_inventory_ns"."journal";

# _col0
1 6488

Here are a couple of example queries I tried on the journal table:

# Query deleted objects in last 24 hours
# Use is_delete_marker=true for versioned buckets and record_type='DELETE' otherwise
SELECT bucket, key, version_id, last_modified_date
FROM "s3tablescatalog/aws-managed-s3"."b_aws_news_blog_metadata_inventory_ns"."journal"
WHERE last_modified_date >= (current_date - interval '1' day) AND is_delete_marker = true;

# bucket key version_id last_modified_date is_delete_marker
1 aws-news-blog-metadata-inventory .build/index-build/arm64-apple-macosx/debug/index/store/v5/records/G0/NSURLSession.h-JET61D329FG0 
2 aws-news-blog-metadata-inventory .build/index-build/arm64-apple-macosx/debug/index/store/v5/records/G5/cdefs.h-PJ21EUWKMWG5 
3 aws-news-blog-metadata-inventory .build/index-build/arm64-apple-macosx/debug/index/store/v5/records/FX/buf.h-25EDY57V6ZXFX 
4 aws-news-blog-metadata-inventory .build/index-build/arm64-apple-macosx/debug/index/store/v5/records/G6/NSMeasurementFormatter.h-3FN8J9CLVMYG6 
5 aws-news-blog-metadata-inventory .build/index-build/arm64-apple-macosx/debug/index/store/v5/records/G8/NSXMLDocument.h-1UO2NUJK0OAG8 

# Query recent PUT requests IP addresses
SELECT source_ip_address, count(source_ip_address)
FROM "s3tablescatalog/aws-managed-s3"."b_aws_news_blog_metadata_inventory_ns"."journal"
GROUP BY source_ip_address;

#	source_ip_address	_col1
1	my_laptop_IP_address	12488

# Query S3 Lifecycle expired objects in last 7 days
SELECT bucket, key, version_id, last_modified_date, record_timestamp
FROM "s3tablescatalog/aws-managed-s3"."b_aws_news_blog_metadata_inventory_ns"."journal"
WHERE requester = 's3.amazonaws.com' AND record_type = 'DELETE' AND record_timestamp > (current_date - interval '7' day);

(not applicable to my demo bucket)

The results helped me track the specific objects that were removed, including their timestamps.

Now, I look at the live inventory table:

# Distribution of object tags
SELECT object_tags, count(object_tags)
FROM "s3tablescatalog/aws-managed-s3"."b_aws_news_blog_metadata_inventory_ns"."inventory"
GROUP BY object_tags;

# object_tags    _col1
1 {Source=Swift} 1
2 {Source=swift} 1
3 {}             12486

# Query storage class and size for specific tags
SELECT storage_class, count(*) as count, sum(size) / 1024 / 1024 as usage
FROM "s3tablescatalog/aws-managed-s3"."b_aws_news_blog_metadata_inventory_ns"."inventory"
GROUP BY object_tags['pii=true'], storage_class;

# storage_class count   usage
1 STANDARD      124884  165

# Find objects with specific user defined metadata
SELECT key, last_modified_date, user_metadata
FROM "s3tablescatalog/aws-managed-s3"."b_aws_news_blog_metadata_inventory_ns"."inventory"
WHERE cardinality(user_metadata) > 0 ORDER BY last_modified_date DESC;

(not applicable to my demo bucket)

These are just a few examples of what is possible with S3 Metadata. Your preferred queries will depend on your use cases. Refer to Analyzing Amazon S3 Metadata with Amazon Athena and Amazon QuickSight in the AWS Storage Blog for more examples.

Pricing and availability
S3 Metadata live inventory and journal tables are available today in US East (Ohio, N. Virginia) and US West (N. California).

The journal tables are charged $0.30 per million updates. This is a 33 percent drop from our previous price.

For inventory tables, there’s a one-time backfill cost of $0.30 for a million objects to set up the table and generate metadata for existing objects. There are no additional costs if your bucket has less than one billion objects. For buckets with more than a billion objects, there is a monthly fee of $0.10 per million objects per month.

As usual, the Amazon S3 pricing page has all the details.

With S3 Metadata live inventory and journal tables, you can reduce the time and effort required to explore and manage large datasets. You get an up-to-date view of your storage and a record of changes, and both are available as Iceberg tables you can query on demand. You can discover data faster, power compliance workflows, and optimize your ML pipelines.

You can get started by enabling metadata inventory on your S3 bucket through the AWS console, AWS Command Line Interface (AWS CLI), or AWS SDKs. When they’re enabled, the journal and live inventory tables are automatically created and updated. To learn more, visit the S3 Metadata Documentation page.

— seb

TwelveLabs video understanding models are now available in Amazon Bedrock

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/twelvelabs-video-understanding-models-are-now-available-in-amazon-bedrock/

Earlier this year, we preannounced that TwelveLabs video understanding models were coming to Amazon Bedrock. Today, we’re announcing the models are now available for searching through videos, classifying scenes, summarizing, and extracting insights with precision and reliability.

TwelveLabs has introduced Marengo, a video embedding model proficient at performing tasks such as search and classification, and Pegasus, a video language model that can generate text based on video data. These models are trained on Amazon SageMaker HyperPod to deliver groundbreaking video analysis that provides text summaries, metadata generation, and creative optimization.

With the TwelveLabs models in Amazon Bedrock, you can find specific moments using natural language video search capabilities like “show me the first touchdown of the game” or “find the scene where the main characters first meet” and instantly jump to those exact moments. You can also build applications to understand video content by generating descriptive text such as titles, topics, hashtags, summaries, chapters, or highlights for discovering insights and connections without requiring predefined labels or categories.

For example, you can find recurring themes in customer feedback or spot product usage patterns that weren’t obvious before. Whether you have hundreds or thousands of hours of video content, you can now transform that entire library into a searchable knowledge resource while maintaining enterprise-grade security and performance.

Let’s take a look at the Marengo and Pegasus videos that TwelveLabs has published.

You can transform video workflows with these models across industries. Media producers and editors can instantly locate specific scenes or dialogue, which means you can focus on storytelling rather than sifting through hours of footage. Marketing teams are streamlining their advertising workflows by quickly personalizing content to resonate with various audiences, while security teams are using the technology to proactively identify potential risks by spotting patterns across multiple video feeds.

Getting started with TwelveLabs models in Amazon Bedrock
Before getting started, if you’re new to using TwelveLabs models, go to the Amazon Bedrock console and choose Model access in the bottom left navigation pane. To access the latest TwelveLabs models, request access for Marengo Embed 2.7 and Pegasus 1.2 in TwelveLabs.

To use TwelveLabs models in Amazon Bedrock, choose Chat/Text Playground under Test in the left navigation pane. Choose Select model, select TwelveLabs as the category and Pegasus as the model, and then choose Apply.

For searching or generating text from your video, you should upload your video to an Amazon Simple Storage Service (Amazon S3) bucket or input Base64-formatted video string.

I will use a sample video generated with Amazon Nova Reel with a prompt: “A cute raccoon playing guitar underwater.”

Enter the S3 URI of your uploaded video and the S3 Bucket Owner, and then run your prompt: “Tell me about the video by timeline”.

You can get following model response:

The video opens with an underwater scene featuring a sandy seabed and swaying seaweed, creating a serene atmosphere. At around 5 seconds, a raccoon appears, swimming towards the camera with a curious expression. By 10 seconds, the raccoon is seen swimming near the seabed. The focus then shifts to a group of koi fish swimming gracefully near the seabed from 11 to 13 seconds.

At 14 seconds, the scene transitions to an ukulele resting on the seabed, with the camera zooming in to highlight its details. The raccoon reappears at 20 seconds, swimming among the koi fish, and by 23 seconds, it begins to play the ukulele. The raccoon continues to play the ukulele from 29 to 41 seconds, with the koi fish swimming around it, adding a lively and colorful touch to the underwater setting.

From 42 to 49 seconds, the raccoon is shown playing the ukulele with the koi fish swimming around it. The video concludes with a screen displaying "Video generated with Amazon Novo Reel 1.1" and "Available in Amazon Bedrock" from 50 to 54 seconds, followed by the AWS logo from 55 to 57 seconds.

The TwelveLabs models can be easily integrated into your applications using the Amazon Bedrock Converse API, which provides a unified interface for conversational AI interactions.

Here’s an example of how to use the AWS SDK for Python (Boto3) with the TwelveLabs Pegasus model:

import boto3
import json
import os

AWS_REGION = "us-east-1"
MODEL_ID = "twelvelabs.pegasus-1-2-v1:0"
VIDEO_PATH = "sample.mp4"

def read_file(file_path: str) -> bytes:
    """Read a file in binary mode."""
    try:
        with open(file_path, 'rb') as file:
            return file.read()
    except Exception as e:
        raise Exception(f"Error reading file {file_path}: {str(e)}")

bedrock_runtime = boto3.client(
    service_name="bedrock-runtime",
    region_name=AWS_REGION
)

request_body = {
    "messages": [
        {
            "role": "user",
            "content": [
                {
                    "inputPrompt": "tell me about the video",
                    "mediaSource: {
                        "base64String": read_file(VIDEO_PATH)
                    }
                },
            ],
        }
    ]
}

response = bedrock_runtime.converse(
    modelId=MODEL_ID,
    messages=request_body["messages"]
)

print(response["output"]["message"]["content"][-1]["text"])

The TwelveLabs Marengo Embed 2.7 model generates vector embeddings from video, text, audio, or image inputs. These embeddings can be used for similarity search, clustering, and other machine learning (ML) tasks. The model supports asynchronous inference through the Bedrock AsyncInvokeModel API.

For video source, you can request JSON format for the TwelveLabs Marengo Embed 2.7 model using the AsyncInvokeModel API.

{
    "modelId": "twelvelabs.marengo-embed-2.7",
    "modelInput": {
        "inputType": "video",
        "mediaSource": {
            "s3Location": {
                "uri": "s3://your-video-object-s3-path",
                "bucketOwner": "your-video-object-s3-bucket-owner-account"
            }
        }
    },
    "outputDataConfig": {
        "s3OutputDataConfig": {
            "s3Uri": "s3://your-bucket-name"
        }
    }
}

You can get a response delivered to the specified S3 location.

{
    "embedding": [0.345, -0.678, 0.901, ...],
    "embeddingOption": "visual-text",
    "startSec": 0.0,
    "endSec": 5.0
}

To help you get started, check out a broad range of code examples for multiple use cases and a variety of programming languages. To learn more, visit TwelveLabs Pegasus 1.2 and TwelveLabs Marengo Embed 2.7 in the AWS Documentation.

Now available
TwelveLabs models are generally available today in Amazon Bedrock: the Marengo model in the US East (N. Virginia), Europe (Ireland), and Asia Pacific (Seoul) Region, and the Pegasus model in US West (Oregon), and Europe (Ireland) Region accessible with cross-Region inference from US and Europe Regions. Check the full Region list for future updates. To learn more, visit the TwelveLabs in Amazon Bedrock product page and the Amazon Bedrock pricing page.

Give TwelveLabs models a try on the Amazon Bedrock console today, and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

Channy

AWS Weekly Roundup: AWS Builder Center, Amazon Q, Oracle Database@AWS, and more (July 14, 2025)

Post Syndicated from Matheus Guimaraes original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-aws-builder-center-amazon-q-oracle-databaseaws-and-more-july-14-2025/

Summer is well and truly here in the UK! I’m a bit of a summer grinch though so, unlike most people, I’m not crazy about “the glorious sun” scorching me when I’m out and about. On the upside, this provides the perfect excuse to retreat to the comfort of a well-ventilated room where I can focus on coding and curating the latest AWS releases to bring you the highlights.

I also managed to escape the heat for most of yesterday while recording an episode for the AWS Developers Podcast where the wonderful Sebastien Stormaq and Tiffany Souterre interviewed me about games development. If you haven’t discovered it yet, I highly recommend you give it a go as the episodes are full of interesting lessons and insights from not just AWS, but customers and community members who share their stories and expertise in a relaxed conversation.

Alright, ready to discover some of the new things we released last week? Here are the highlights.

AWS Builder Center
There is a new home for AWS builders and community members! AWS Builder Center is a new place where cloud builders can connect, share knowledge, and access resources to enhance their AWS journey. The platform enables users to join community programs, discover trending topics, access AWS Skill Builder courses, participate in technical challenges, and more, using a single Builder ID sign-in.

One the features that I’m personally most excited about is the Wishlist. You can now create wishes and tell AWS directly about ways to improve our products and services or share original ideas that you think could help you and your teams. You can also browse and upvote existing wishes to support any suggestions that you think should be prioritized. The AWS teams will keep an eye on this and if a wish has enough traction it may just be considered!

Read the news blog post for a quick tour through some of the most exciting features or head over to AWS Builder Center and start exploring!

AI
The world of AI keeps moving fast and changing our world, by providing new and exciting ways to do things and become more productive. Here are two releases from last week that caught my attention.

  • Amazon Q chat in the AWS Management Console can now query AWS service data – Amazon Q Developer expands its capabilities by enabling natural language queries of data stored across AWS services like S3, DynamoDB, and CloudWatch, directly from the AWS Console, Slack, Microsoft Teams, and AWS Console Mobile Application. This enhancement streamlines cloud management and troubleshooting by allowing users to access and analyze service data through conversational interfaces, with access controls managed through IAM permissions.
  • Amazon CloudWatch and Application Signals MCP servers for AI-assisted troubleshooting – AWS has released two new Model Context Protocol (MCP) servers – CloudWatch MCP and Application Signals MCP – that enable AI agents to leverage observability data for automated troubleshooting through conversational interfaces. These open-source servers allow AI assistants to analyze metrics, alarms, logs, traces, and service health data across AWS environments, streamlining incident response and root cause analysis without requiring developers to manually navigate multiple AWS consoles.

Oracle Database@AWS
It seems like yesterday when Andy Jassy announced our partnership with Oracle to create Oracle Database@AWS, a jointly offered service that runs Oracle databases on Exadata infrastructure directly within AWS data centers, providing a unified AWS-Oracle experience. Fast forward to last week and Oracle Database@AWS has reached a significant milestone with its general availability release. It is now available in US East (N. Virginia) and US West (Oregon) regions, with plans to expand to 20 additional regions globally.

In addition, VPC Lattice has added support for Oracle Database@AWS enabling seamless connectivity between applications in VPCs and on-premises environments to Oracle database networks. The integration simplifies network management and provides secure access from Oracle Database@AWS to AWS services like Amazon S3 and Amazon Redshift, without requiring complex networking setup.

So if you’re looking to migrate your Oracle database workloads, now is a great time to explore Oracle Database@AWS as it offers a compelling path forward with minimal modifications required.

Additional highlights
Here are some other releases that I think many people will be happy about.

  • AWS Config now supports 12 new resource types – AWS Config has expanded its monitoring capabilities with support for 12 new resource types across services including BackupGateway, CloudFront, EntityResolution, Bedrock, and more. These additions are automatically tracked if you have enabled recording for all resource types, enhancing your ability to discover, assess, and audit AWS resources.
  • Amazon SageMaker Studio now supports remote connections from Visual Studio Code – Amazon SageMaker Studio now supports remote connections from Visual Studio Code, allowing developers to use their familiar VS Code setup while leveraging SageMaker’s scalable compute resources for AI development.
  • AWS Network Firewall: Native AWS Transit Gateway support in all regions – AWS Network Firewall now offers native integration with AWS Transit Gateway across all supported regions, enabling direct attachment and simplified traffic inspection between VPCs and on-premises networks. This integration eliminates the need for managing dedicated VPC subnets and route tables while providing multi-AZ redundancy for improved security and reliability.

Upcoming AWS Events
AWS Summit New York – this is definitely one to watch…literally! Registrations are closed due to capacity but you can tune in to watch live all the announcements and launches! No spoilers, but, trust me, there are a quite a few exciting things in store, so make sure to check it out.

AWS Gen AI LoftsAWS Gen AI Lofts are multi-day events offering hands-on workshops, expert guidance, and networking opportunities for developers and business leaders looking to explore or advance their generative AI journey. These events are hosted across multiple global locations including San Francisco, Berlin, Dubai, Dublin, Bengaluru, Manchester, Paris, and Tel Aviv, providing accessible opportunities to accelerate your generative AI adoption.

And that’s it for this week! Come back next Monday for more highlights and keep your AWS knowledge up to date as we cover the latest releases.

Matheus Guimaraes | @codingmatheus

New Amazon EC2 P6e-GB200 UltraServers accelerated by NVIDIA Grace Blackwell GPUs for the highest AI performance

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/new-amazon-ec2-p6e-gb200-ultraservers-powered-by-nvidia-grace-blackwell-gpus-for-the-highest-ai-performance/

Today, we’re announcing the general availability of Amazon Elastic Compute Cloud (Amazon EC2) P6e-GB200 UltraServers, accelerated by NVIDIA GB200 NVL72 to offer the highest GPU performance for AI training and inference. Amazon EC2 UltraServers connect multiple EC2 instances using a dedicated, high-bandwidth, and low-latency accelerator interconnect across these instances.

The NVIDIA Grace Blackwell Superchips connect two high-performance NVIDIA Blackwell tensor core GPUs and an NVIDIA Grace CPU based on Arm architecture using the NVIDIA NVLink-C2C interconnect. Each Grace Blackwell Superchip delivers 10 petaflops of FP8 compute (without sparsity) and up to 372 GB HBM3e memory. With the superchip architecture, GPU and CPU are colocated within one compute module, increasing bandwidth between GPU and CPU significantly compared to current generation EC2 P5en instances.

With EC2 P6e-GB200 UltraServers, you can access up to 72 NVIDIA Blackwell GPUs within one NVLink domain to use 360 petaflops of FP8 compute (without sparsity) and 13.4 TB of total high bandwidth memory (HBM3e). Powered by the AWS Nitro System, P6e-GB200 UltraServers are deployed in EC2 UltraClusters to securely and reliably scale to tens of thousands of GPUs.

EC2 P6e-GB200 UltraServers deliver up to 28.8 Tbps of total Elastic Fabric Adapter (EFAv4) networking. EFA is also coupled with NVIDIA GPUDirect RDMA to enable low-latency GPU-to-GPU communication between servers with operating system bypass.

EC2 P6e-GB200 UltraServers specifications
EC2 P6e-GB200 UltraServers are available in sizes ranging from 36 to 72 GPUs under NVLink. Here are the specs for EC2 P6e-GB200 UltraServers:

UltraServer type GPUs
GPU
memory (GB)
vCPUs Instance memory
(GiB)
Instance storage (TB) Aggregate EFA Network Bandwidth (Gbps) EBS bandwidth (Gbps)
u-p6e-gb200x36 36 6660 1296 8640 202.5 14400 540
u-p6e-gb200x72 72 13320 2592 17280 405 28800 1080

P6e-GB200 UltraServers are ideal for the most compute and memory intensive AI workloads, such as training and inference of frontier models, including mixture of experts models and reasoning models, at the trillion-parameter scale.

You can build agentic and generative AI applications, including question answering, code generation, video and image generation, speech recognition, and more.

P6e-GB200 UltraServers in action
You can use EC2 P6e-GB200 UltraServers in the Dallas Local Zone through EC2 Capacity Blocks for ML. The Dallas Local Zone (us-east-1-dfw-2a) is an extension of the US East (N. Virginia) Region.

To reserve your EC2 Capacity Blocks, choose Capacity Reservations on the Amazon EC2 console. You can select Purchase Capacity Blocks for ML and then choose your total capacity and specify how long you need the EC2 Capacity Block for u-p6e-gb200x36 or u-p6e-gb200x72 UltraServers.

Once Capacity Block is successfully scheduled, it is charged up front and its price doesn’t change after purchase. The payment will be billed to your account within 12 hours after you purchase the EC2 Capacity Blocks. To learn more, visit Capacity Blocks for ML in the Amazon EC2 User Guide.

To run instances within your purchased Capacity Block, you can use AWS Management Console, AWS Command Line Interface (AWS CLI) or AWS SDKs. On the software side, you can start with the AWS Deep Learning AMIs. These images are preconfigured with the frameworks and tools that you probably already know and use: PyTorch, JAX, and a lot more.

You can also integrate EC2 P6e-GB200 UltraServers seamlessly with various AWS managed services. For example:

  • Amazon SageMaker Hyperpod provides managed, resilient infrastructure that automatically handles the provisioning and management of P6e-GB200 UltraServers, replacing faulty instances with preconfigured spare capacity within the same NVLink domain to maintain performance.
  • Amazon Elastic Kubernetes Services (Amazon EKS) allows one managed node group to span across multiple P6e-GB200 UltraServers as nodes, automating their provisioning and lifecycle management within Kubernetes clusters. You can use EKS topology-aware routing for P6e-GB200 UltraServers, enabling optimal placement of tightly coupled components of distributed workloads within a single UltraServer’s NVLink-connected instances.
  • Amazon FSx for Lustre file systems provide data access for P6e-GB200 UltraServers at the hundreds of GB/s of throughput and millions of input/output operations per second (IOPS) required for large-scale HPC and AI workloads. For fast access to large datasets, you can use up to 405 TB of local NVMe SSD storage or virtually unlimited cost-effective storage with Amazon Simple Storage Service (Amazon S3).

Now available
Amazon EC2 P6e-GB200 UltraServers are available today in the Dallas Local Zone (us-east-1-dfw-2a) through EC2 Capacity Blocks for ML. For more information, visit the Amazon EC2 pricing page.

Give Amazon EC2 P6e-GB200 UltraServers a try in the Amazon EC2 console. To learn more, visit the Amazon EC2 P6e instances page and send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

Channy

Introducing AWS Builder Center: A new home for the AWS builder community

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/introducing-aws-builder-center-a-new-home-for-the-aws-builder-community/

We really love builders at AWS. We’re constantly thinking of new ways to help technical communities thrive and create spaces like AWS Developer Center and community.aws where people can connect and share their knowledge and experiences.

Today, we’re announcing AWS Builder Center, a new home for builders to access all builder resources, engage with the AWS community, and provide feedback or product suggestions to AWS product teams. This new experience also integrates the previous AWS Developer Center and community.aws.

There are a variety of exciting features so let us discover some of them.

Your voice matters: Introducing Wishlist
One of the most exciting new features, in my opinion, is Wishlist. You can now submit your wishes for new features or improvements you’d like to see in AWS services. Others can discover and vote on these wishes while also creating their own.

You can influence product roadmap collectively as a community and help us shape the future of AWS services. You can share ideas, suggestions, feature proposals, or challenges while operating AWS services, with the ability for the AWS community to upvote ideas and highlight the most sought-after improvements. Our internal teams will keep an eye on these and bring the most popular wishes to the attention of our service teams, making your voice an integral part of our product development process.

Connect people in the AWS community
On the Connect page, you’ll find many opportunities to connect directly with AWS Heroes and AWS Community Builders. You can explore and join AWS User Groups and AWS Cloud Clubs near your cities around the world.

On top of that, you can bookmark this page as your centralized hub for finding upcoming community events, making it easy to find opportunities to learn and network in your local area and meet like-minded builders who share your interests.

Speaking of following people, AWS Builder Center makes it really straightforward to connect and engage with others, serving as the central hub for the AWS technical community. It brings together all the different ways that you can connect with fellow builders. For example, the Who to Follow section introduces you to AWS Heroes, Community Builders, and active community members who are sharing their knowledge and expertise in your areas of interest.

Explore our AWS hands-on resources
On the Build page, you’ll discover ways to get familiar with AWS with hands-on experience such as interactive learning resources designed for every skill level such as AWS Tutorials and AWS Workshops. You can explore generative AI and agentic AI services playground and find the AWS Free Tier to try out AWS services free of charge up to specified limits for each service.

Choose the Toolbox page and discover the latest tools, programming language resources, and Open Source projects for AWS. The Toolbox has everything you need to get your project scaffolded and up and running.

To improve the build experience for builders, we plan to expand Builder Center’s built-in offerings such as creating dedicated groups and forums for collaborating on a particular topic, run workshops for hands-on labs, and various service playgrounds where builders can freely experiment with AWS services.

Supporting your builder journey
The new Learn section serves as your gateway to skill development, bringing together everything you need to expand your AWS expertise. Here, you can explore learning and training resources, workshops, gamified experiences, and more to make your journey of building on AWS both educational and engaging.

Choose the Topics page, where you can explore and discover more content. You can explore content by topics and tags. There is a featured and trending topics section that helps you to stay connected with what’s capturing the community’s attention right now.

Built-in localization for your spoken language
AWS Builder Center breaks down language barriers with comprehensive localization support. All content published in the Builder Center is automatically available in 16 languages, and user-generated content, such as posts, comments, or wishes, can be machine-translated on demand using Translate. So, you can collaborate with builders worldwide, sharing knowledge and experiences across language boundaries.

By default, all content will be displayed in based on the language that your browser is set to. But, you can override this by visiting the settings page and choosing the language that you want AWS Builder Center to use by default.

Sign up and build your profile now
AWS Builder Center gives you a more personalized and comprehensive way to showcase your AWS journey. Your unique profile comes with a custom URL and shareable QR code, making it straightforward to connect with others and share your presence in the AWS community.

All your posts, wishes, and meaningful interactions are organized within a centralized view so you can easily check them. In the Manage profile page, you can customize your profile, add specific interests and areas of expertise, helping you connect with builders who share your passions. Profile management is seamless: it synchronizes across all AWS services using AWS Builder ID, ensuring your identity remains consistent wherever you engage with AWS offerings.

Visit builder.aws.com, sign up with AWS Builder ID, and claim your unique alias to access all features, including content creation, Wishlist, and community engagement tools.

AWS Builder Center was designed to help you connect, learn, and build with fellow AWS builders, so enjoy your journey together!

ChannyMatheus Guimaraes | @codingmatheus

Introducing Oracle Database@AWS for simplified Oracle Exadata migrations to the AWS Cloud

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/introducing-oracle-databaseaws-for-simplified-oracle-exadata-migrations-to-the-aws-cloud/

Today, we’re announcing the general availability of Oracle Database@AWS, a new offering for Oracle Exadata workloads, including Oracle Real Application Clusters (RAC) within AWS.

In the past 14 years, customers had the choice of self-managing Oracle database workloads in the cloud using Amazon Elastic Compute Cloud (Amazon EC2) or using fully managed Amazon Relational Database Service (Amazon RDS) for Oracle. Now, you have an additional option for your workloads that require Oracle RAC or Oracle Exadata for quicker and simpler migrations to the cloud. You also get a single invoice through AWS Marketplace, which counts towards AWS commitments and Oracle license benefits, including Bring Your Own License (BYOL) and discount programs such as Oracle Support Rewards.

With Oracle Database@AWS, you can migrate your Oracle Exadata workloads to Oracle Exadata Database Service on Dedicated Infrastructure or Oracle Autonomous Database on Dedicated Exadata Infrastructure within AWS with minimal changes. You can purchase, provision, and manage your Oracle Database@AWS deployments through familiar AWS tools and interfaces such as AWS Management Console, AWS Command Line Interface (AWS CLI), or AWS APIs for applications running on AWS. The AWS APIs call the corresponding Oracle Cloud Infrastructure (OCI) APIs necessary to provision and manage the resources.

Since its preview last December, we’ve improved or added features to help run production workloads at general availability:

  • Regional expansion – You can now use Oracle Database@AWS in the U.S. East (N. Virginia) and U.S. West (Oregon) Regions today. We are also announcing plans to expand to 20 AWS Regions globally. This broader availability supports the diverse needs of our customers across various geographical areas so more enterprises can benefit from this option. You can choose from different Exadata system sizes to match your workload requirements in your AWS Region.
  • Zero-ETL and S3 backups – You can now benefit from zero-ETL integration with Amazon Redshift for analytics to remove the need to build and manage data pipelines for extract, transform, and load operations. With zero-ETL, you can unify your data on AWS without incurring cross network data transfer costs. We’re providing Amazon Simple Storage Service (Amazon S3) backups with up to eleven nines of data durability.
  • Autonomous VM cluster – You can now provision an Autonomous VM Cluster in addition to an Exadata VM cluster on the Exadata Dedicated Infrastructure. You can run Oracle Autonomous Database on Dedicated Exadata Infrastructure, a fully managed database environment using committed hardware and software resources.

Oracle Database@AWS also integrates with other AWS services such as Amazon Virtual Private Cloud (Amazon VPC) Lattice for configuring network paths to AWS services such as S3 and Redshift directly, AWS Identity and Access Management (IAM) for authentication and authorization, Amazon EventBridge for monitoring database lifecycle events, AWS CloudFormation for infrastructure automation, Amazon CloudWatch for collecting and monitoring metrics, and AWS CloudTrail for logging API operations.

Getting started with Oracle Database@AWS
Oracle Database@AWS supports two key services: Oracle Exadata Database Service on Dedicated Infrastructure and Oracle Autonomous Database on Dedicated Exadata Infrastructure within AWS data centers.

These services physically reside within an Availability Zone in an AWS Region and logically reside in an OCI region, enabling seamless integration with AWS services through high-speed, low-latency connections.

You create an ODB network, a private, isolated network that hosts Oracle Exadata VM Clusters within an Availability Zone. Then, you use ODB peering accessible to EC2 application servers running in a VPC. To learn more, visit How Oracle Database@AWS works in the AWS documentation.

Request a private offer in AWS Marketplace

To begin your journey with Oracle Database@AWS, visit the AWS console or request the AWS Marketplace private offer. Your AWS and Oracle sales team will receive your request, then contact you to find the best option for your workloads, and activate your account.

When you activate and get access to Oracle Database@AWS, you can use the Dashboard to create an ODB network, Exadata infrastructure, and Exadata VM cluster or Autonomous VM cluster, and ODB peering connection.

To learn more, visit the Onboarding to Oracle Database@AWS and AWS Marketplace buyer private offers in the AWS documentation.

Create an ODB network

An ODB network is a private isolated network that hosts OCI infrastructure on AWS. The ODB network maps directly to the network that exists within the OCI child site, thus serving as the means of communication between AWS and OCI.

In the Dashboard, choose Create ODB network, enter a network name, choose the Availability Zone, and specify a CIDR ranges for client connections established by applications and backup connections used for taking automated backups. You can also enter a name to use as a prefix to your domain fixed as oraclevcn.com. For example, if you enter myhost, the fully qualified domain name is myhost.oraclevcn.com.

Optionally, you can configure ODB network access to perform automated backups to Amazon S3 and zero-ETL for near real-time analytics and ML on your Oracle data using Amazon Redshift.

After you create your ODB network, update your VPC route tables of your EC2 application servers with the client connection CIDR in the ODB network. To learn more, visit ODB network, ODB peering, and Configuring VPC route tables for ODB peering in the AWS documentation.

Create Exadata infrastructure

The Oracle Exadata infrastructure is the underlying architecture of your database servers, storage servers, and networking that run your Oracle Exadata databases.

Choose Create Exadata infrastructure, enter a name, and use the default Availability Zone. In the next step, you can choose Exadata.X11M for the Exadata system model. You can also set a default of 2 or up to 32 database servers and 3 or up to 64 storage servers with 80 TB storage capacity per server.

Finally, you can configure system maintenance preferences, such as scheduling, patching mode, and OCI maintenance notification contacts. You can’t modify an infrastructure after you create it from the AWS console. But, you can navigate to the OCI console and modify it.

To delete an Exadata infrastructure, visit Deleting an Oracle Exadata infrastructure in Oracle Database@AWS in the AWS documentation.

Create an Exadata VM cluster or Autonomous VM cluster

You can create VM clusters on Exadata infrastructure and deploy multiple VM clusters with different Oracle Exadata infrastructures in the same ODB network.

Here are two types of VM clusters:

  • An Exadata VM cluster is a set of virtual machines that has a complete Oracle database installation that includes all features of Oracle Enterprise Edition.
  • An Autonomous VM cluster is a set of fully managed databases that automate key management tasks using AI/ML with no human intervention required.

Choose Create Exadata VM cluster, enter a VM cluster name and a time zone, choose Bring Your Own License (BYOL) or license included for license options. In the next step, you can choose your Exadata infrastructure, grid infrastructure version, and Exadata image version. For database servers, you can choose the CPU core count, memory, and local storage for each VM or accept the defaults.

In the next step, you can configure the connectivity setting by choosing your ODB network and entering a prefix for the VM cluster. You can enter a port number for TCP access to the single client access name (SCAN) listener. The default port is 1521 or you can enter a custom SCAN port in the range 1024–8999. For SSH key pairs, enter the public key portion of one or more key pairs used for SSH access to the VM cluster.

Then, you can choose diagnostics and tags, review your settings, and create a VM cluster. The creation process can take up to 6 hours, depending on the size of the VM cluster.

Create and manage an Oracle database

When the VM cluster is ready, you can create and manage your Oracle Exadata databases in the OCI console. Choose Manage in OCI in the details page of the Exadata VM cluster. You will be redirected to the OCI console.

When you create an Oracle Database in the OCI console, you can select Oracle Database 19c or 23ai. When enabling automatic backups for your provisioned databases, you can use an S3 bucket or OCI Object Storage in the OCI region. To learn more, visit Provision Oracle Exadata Database Service in Oracle Database@AWS in the OCI documentation.

Things to know
Here are a couple of things to know about Oracle Database@AWS:

  • Monitoring – You can monitor Oracle Database@AWS using Amazon CloudWatch metrics in the AWS/ODB namespaces for VM clusters, container databases, and pluggable databases. AWS CloudTrail captures all AWS API calls for Oracle Database@AWS as events. Using CloudTrail logs, you can determine the request that was made to Oracle Database@AWS, the IP address from which the request was made, when it was made, and additional details. To learn more, visit Monitoring Oracle Database@AWS.
  • Security – You can use IAM to assign permissions that determine who is allowed to manage Oracle Database@AWS resources and SSL/TLS encrypted connections to secure data. You can also use Amazon EventBridge for seamless event-driven database operations—all working together to maintain security standards while enabling efficient cloud operations. To learn more, visit Security in Oracle Database@AWS.
  • Compliance – Your compliance responsibility when using Oracle Database@AWS is determined by the sensitivity of your data, your company’s compliance objectives, and applicable laws and regulations. We provides the following compliances with Oracle Database@AWS: SOC 1, SOC 2, SOC 3, HIPAA, C5, CSA STAR Attest, CSA STAR Cert, HDS (France), ISO Series (ISO/IEC 9001, 20000-1, 27001, 27017, 27018, 27701, 22301), PCI DSS, and HITRUST. To learn more, visit Compliance validation for Oracle Database@AWS.
  • Support – Your AWS or Oracle sales account team can help you evaluate your current database infrastructure, determine how Oracle Database@AWS can best serve your organization’s requirements, and develop a tailored migration strategy and timeline. You can also get help from AWS Oracle Competency Partners specialized to architect, deploy, and manage Oracle-based workloads running in the AWS Cloud.

Now available and coming soon
Oracle Database@AWS is now available in the U.S. East (N. Virginia) and U.S. West (Oregon) Regions through the AWS Marketplace. Oracle Database@AWS pricing and any AWS Marketplace private offers are set by Oracle. You can see specific details around pricing on Oracle’s pricing page for the offering.

Oracle Database@AWS will expand to 20 more AWS Regions across the Americas, Europe, and Asia-Pacific including: US East (Ohio), US West (N. California), Asia Pacific (Hyderabad), Asia Pacific (Melbourne), Asia Pacific (Mumbai), Asia Pacific (Osaka), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Milan), Europe (Paris), Europe (Spain), Europe (Stockholm), Europe (Zurich), and South America (São Paulo).

You can get started with Oracle Database@AWS with using AWS console. To learn more, visit the Oracle Database@AWS User Guide and OCI documentation and send feedback through your usual AWS Support contacts or OCI support.

Channy

AWS Weekly Roundup: EC2 C8gn instances, Amazon Nova Canvas virtual try-on, and more (July 7, 2025)

Post Syndicated from Elizabeth Fuentes original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-bedrock-api-keys-amazon-nova-canvas-virtual-try-on-and-more-july-7-2025/

Every Monday we tell you about the best releases and blogs that caught our attention last week.

Before continuing with this AWS Weekly Roundup, I’d like to share that last month I moved with my family to San Francisco, California, to start a new role as Developer Advocate/SDE, GenAI.

This excites me because I’ll have the opportunity to connect with new communities in the Bay Area while tackling exciting new challenges. If you’re part of a community focused on building generative AI and agentics applications, or know of one, I’d love to connect. Let’s connect!

Last week’s launches
Here are the launches from last week:

  • New Amazon EC2 C8gn instances powered by AWS Graviton4 offering up to 600Gbps network bandwidth – Amazon Elastic Compute Cloud (Amazon EC2) C8gn instances are now generally available, powered by AWS Graviton4 processors and 6th generation AWS Nitro Cards. These network-optimized instances deliver up to 600 Gbps network bandwidth. This represents the highest bandwidth among EC2 network-optimized instances, with up to 192 vCPUs and 384 GiB memory. They provide 30% higher compute performance than C7gn instances and are ideal for network-intensive workloads like virtual appliances, data analytics, and cluster computing jobs.
  • Build the highest resilience apps with multi-Region strong consistency in Amazon DynamoDB global tables – Amazon DynamoDB global tables now supports multi-Region strong consistency (MRSC) for applications requiring zero Recovery Point Objective (RPO). This capability ensures applications can read the latest data from any Region during outages, addressing critical needs in payment processing and financial services. MRSC requires three AWS Regions configured as either three full replicas or two replicas plus a witness, providing the highest level of application resilience for mission-critical workloads.
  • Amazon Nova Canvas update: Virtual try-on and style options now available – Amazon Nova Canvas introduces virtual try-on capabilities that help you visualize how clothing looks on a person by combining two images, plus eight new pre-trained style options (3D animation, design sketch, vector illustration, graphic novel, etc.) for generating images with improved artistic consistency. Available in three AWS Regions, these features enhance AI-powered image generation capabilities for retailers and content creators seeking realistic product visualizations.
  • Amazon Q in Connect now supports 7 languages for proactive recommendations – Amazon Q in Connect, a generative AI-powered assistant for customer service, now provides proactive recommendations in seven languages: English, Spanish, French, Portuguese, Mandarin, Japanese, and Korean. The AI-powered customer service assistant detects customer intent during voice and chat interactions to help agents resolve issues quickly and accurately.
  • Amazon Aurora MySQL and Amazon RDS for MySQL integration with Amazon SageMaker is now available – This integration provides near real-time data availability for analytics. It automatically extracts MySQL data into lakehouses with Apache Iceberg compatibility. You can then access this data seamlessly through various analytics engines and machine learning tools.
  • Amazon Aurora DSQL is now available in additional AWS RegionsAmazon Aurora DSQL expands to Asia Pacific (Seoul) and now supports multi-Region clusters across Asia Pacific and European regions. This serverless, distributed SQL database offers unlimited scalability, highest availability, and zero infrastructure management with AWS Free Tier access.

Other AWS blog posts

  • Optimize RAG in production environments using Amazon SageMaker JumpStart and Amazon OpenSearch Service – Learn how to optimize Retrieval Augmented Generation (RAG) in production environments using Amazon SageMaker JumpStart and Amazon OpenSearch Service. This comprehensive guide demonstrates implementing RAG workflows with LangChain, covers OpenSearch optimization strategies, provides setup instructions, and explains benefits of combining these AWS services for scalable, cost-effective generative AI applications.v
  • Agentic GenAI App Using Bedrock, MCP servers on EKS – This post shows how to build a scalable AI chat application using Amazon Bedrock, Strands Agent, and Model Context Protocol (MCP) servers deployed on Amazon Elastic Kubernetes Service (Amazon EKS). The architecture combines agentic workflows with containerized microservices for intelligent, auto-scaling conversations with multiple foundation models.
  • Enforce table level access control on data lake tables using AWS Glue 5.0 with AWS Lake Formation – AWS Glue 5.0 introduces Full-Table Access (FTA) control for Apache Spark with AWS Lake Formation, providing table-level security without fine-grained access overhead. This feature supports native Spark SQL/DataFrames for Lake Formation tables. It enables read/write operations on Iceberg and Hive tables with improved performance and lower costs.

Upcoming AWS events
Check your calendars and sign up for these upcoming AWS events:

  • AWS re:Invent – Register now to get a head start on choosing your best learning path, booking travel and accommodations, and bringing your team to learn, connect, and have fun. Early-career professionals can apply for the All Builders Welcome Grant program, designed to remove financial barriers and create diverse pathways into cloud technology. Applications are now open and close on July 15, 2025.
  • AWS NY Summit – You can gain insights from Swami’s keynote featuring the latest cutting-edge AWS technologies in compute, storage, and generative AI. My News Blog team is also preparing some exciting news for you. If you’re unable to attend in person, you can still participate by registering for the global live stream. Also, save the date for these upcoming Summits in July and August near your city.
  • AWS Builders Online Series – If you’re based in one of the Asia Pacific time zones, join and learn fundamental AWS concepts, architectural best practices, and hands-on demonstrations to help you build, migrate, and deploy your workloads on AWS.
  • Join AWS Gen AI Lofts – Experience AWS Gen AI Lofts across San Francisco, Berlin, Dubai, Dublin, Bengaluru, Manchester, Paris, Tel Aviv, and additional locations – hands-on workshops, expert guidance, investor networking, and collaborative spaces designed to accelerate your generative AI startup journey.

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Eli