Tag Archives: Amazon Bedrock

AWS Weekly Roundup: AWS Control Tower, Amazon Bedrock, Amazon OpenSearch Service, and More (October 9, 2023)

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-aws-control-tower-amazon-bedrock-amazon-opensearch-service-and-more-october-9-2023/

Pumpkins

As the Northern Hemisphere enjoys early fall and pumpkins take over the local farmers markets and coffee flavors here in the United States, we’re also just 50 days away from re:Invent 2023! But before we officially enter pre:Invent season, let’s have a look at some of last week’s exciting news and announcements.

Last Week’s Launches
Here are some launches that got my attention:

AWS Control Tower – AWS Control Tower released 22 proactive controls and 10 AWS Security Hub detective controls to help you meet regulatory requirements and control objectives such as encrypting data in transit, encrypting data at rest, or using strong authentication. For more details and a list of controls, check out the AWS Control Tower user guide.

Amazon Bedrock – Just a week after Amazon Bedrock became available in AWS Regions US East (N. Virginia) and US West (Oregon), Amazon Bedrock is now also available in the Asia Pacific (Tokyo) AWS Region. To get started building and scaling generative AI applications with foundation models, check out the Amazon Bedrock documentation, explore the generative AI space at community.aws, and get hands-on with the Amazon Bedrock workshop.

Amazon OpenSearch Service – You can now run OpenSearch version 2.9 in Amazon OpenSearch Service with improvements to search, observability, security analytics, and machine learning (ML) capabilities. Version 2.9 expands geospatial aggregation support so you can gather high-level insights into trends and patterns and establish correlations within the data. OpenSearch Service 2.9 also introduces OpenSearch Service Integrations to take advantage of new schema standards such as OpenTelemetry, and it supports managing and overlaying alerts and anomalies onto dashboard visualization line charts.

Amazon SageMaker – SageMaker Feature Store now supports a fully managed, in-memory online store to help you retrieve features for model serving in real time for high throughput ML applications. The new online store is powered by ElastiCache for Redis, an in-memory data store built on open-source Redis. The SageMaker developer guide has all the details.

Also, SageMaker Model Registry added support for private model repositories. You can now register models that are stored in private Docker repositories and track all your models across multiple private AWS and non-AWS model repositories in one central service, simplifying ML operations (MLOps) and ML governance at scale. The SageMaker Developer Guide shows you how to get started.

Amazon SageMaker Canvas – SageMaker Canvas expanded its support for ready-to-use models to include foundation models (FMs). You can now access FMs such as Claude 2, Amazon Titan, and Jurassic-2 (powered by Amazon Bedrock) as well as publicly available models such as Falcon and MPT (powered by SageMaker JumpStart) through a no-code chat interface. Check out the SageMaker Developer Guide for more details.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Here are some additional blog posts and news items that you might find interesting:

Behind the scenes on AWS contributions to open-source databases – This post shares some of the more substantial open-source contributions AWS has made in the past two years to upstream databases, introduces some key contributors, and shares how AWS approaches upstream work in our database services.

Fast and cost-effective Llama 2 fine-tuning with AWS Trainium – This post shows you how to fine-tune the Llama 2 model from Meta on AWS Trainium, a purpose-built accelerator for LLM training, to reduce training times and costs.

Code Llama code generation models from Meta are now available via Amazon SageMaker JumpStart – You can now deploy Code Llama FMs, developed by Meta, with one click in SageMaker JumpStart. This post walks you through the details.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

Build On Generative AI – Season 2 of this weekly Twitch show about all things generative AI is in full swing! Every Monday, 9:00 US PT, my colleagues Emily and Darko look at new technical and scientific patterns on AWS, invite guest speakers to demo their work, and show us how they built something new to improve the state of generative AI. In today’s episode, Emily and Darko discussed how to translate unstructured documents into structured data. Check out show notes and the full list of episodes on community.aws.

AWS Community Days – Join a community-led conference run by AWS user group leaders in your region: DMV (DC, Maryland, Virginia) (October 13), Italy (October 18), UAE (October 21), Jaipur (November 4), Vadodara (November 4), and Brasil (November 4).

AWS Innovate: Every Application Edition – Join our free online conference to explore cutting-edge ways to enhance security and reliability, optimize performance on a budget, speed up application development, and revolutionize your applications with generative AI. Register for AWS Innovate Online Americas and EMEA on October 19 and AWS Innovate Online Asia Pacific & Japan on October 26.

AWS re:Invent (November 27 – December 1) – Join us to hear the latest from AWS, learn from experts, and connect with the global cloud community. Browse the session catalog and attendee guides and check out the re:Invent highlights for generative AI.

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Antje

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Building a serverless document chat with AWS Lambda and Amazon Bedrock

Post Syndicated from Pascal Vogel original https://aws.amazon.com/blogs/compute/building-a-serverless-document-chat-with-aws-lambda-and-amazon-bedrock/

This post is written by Pascal Vogel, Solutions Architect, and Martin Sakowski, Senior Solutions Architect.

Large language models (LLMs) are proving to be highly effective at solving general-purpose tasks such as text generation, analysis and summarization, translation, and much more. Because they are trained on large datasets, they can use a broad generalist knowledge base. However, as training takes place offline and uses publicly available data, their ability to access specialized, private, and up-to-date knowledge is limited.

One way to improve LLM knowledge in a specific domain is to fine-tune them on domain-specific datasets. However, this is time- and resource-intensive, requires specialized knowledge, and may not be appropriate for some tasks. For example, fine-tuning can’t give an LLM access to information that changes daily.

To address these shortcomings, Retrieval Augmented Generation (RAG) is proving to be an effective approach. With RAG, data external to the LLM is used to augment prompts by adding relevant retrieved data to the context. This allows you to integrate disparate data sources while keeping them entirely separate from the machine learning model.

Tools such as LangChain or LlamaIndex are gaining popularity because of their ability to flexibly integrate with a variety of data sources such as (vector) databases, search engines, and current public data sources.

In the context of LLMs, semantic search is an effective search approach, as it considers the context and intent of user-provided prompts as opposed to a traditional literal search. Semantic search relies on word embeddings, which represent words, sentences, or documents as vectors. Consequently, documents must be transformed into embeddings using an embedding model as the basis for semantic search. Because this embedding process only needs to happen when a document is first ingested or updated, it’s a great fit for event-driven compute with AWS Lambda.

This blog post presents a solution that allows you to ask natural language questions of any PDF document you upload. It combines the text generation and analysis capabilities of an LLM with a vector search on the document content. The solution uses serverless services such as AWS Lambda to run LangChain and Amazon DynamoDB for conversational memory.

Amazon Bedrock is used to provide serverless access to foundation models such as Amazon Titan and models developed by leading AI startups, such as AI21 Labs, Anthropic, and Cohere. See the GitHub repository for a full list of available LLMs and deployment instructions.

You learn how the solution works, what design choices were made, and how you can use it as a blueprint to build your own custom serverless solutions based on LangChain that go beyond prompting individual documents. The solution code and deployment instructions are available on GitHub.

Solution overview

Let’s look at how the solution works at a high level before diving deeper into specific elements and the AWS services used in the following sections. The following diagram provides a simplified view of the solution architecture and highlights key elements:

The process of interacting with the web application looks like this:

  1. A user uploads a PDF document into an Amazon Simple Storage Service (Amazon S3) bucket through a static web application frontend.
  2. This upload triggers a metadata extraction and document embedding process. The process converts the text in the document into vectors. The vectors are loaded into a vector index and stored in S3 for later use.
  3. When a user chats with a PDF document and sends a prompt to the backend, a Lambda function retrieves the index from S3 and searches for information related to the prompt.
  4. An LLM then uses the results of this vector search, previous messages in the conversation, and its general-purpose capabilities to formulate a response to the user.

As can be seen in the following screenshot, the web application deployed as part of the solution allows you to upload documents and lists the uploaded documents with their associated metadata, such as number of pages, file size, and upload date. The document status indicates whether a document is successfully uploaded, is being processed, or is ready for a conversation.

Web application document list view

By clicking on one of the processed documents, you can access a chat interface, which allows you to send prompts to the backend. It is possible to have multiple independent conversations with each document, each with its own message history.

Web application chat view

Embedding documents

Solution architecture diagram excerpt: embedding documents

When a new document is uploaded to the S3 bucket, an S3 event notification triggers a Lambda function that extracts metadata, such as file size and number of pages, from the PDF file and stores it in a DynamoDB table. Once the extraction is complete, a message containing the document location is placed on an Amazon Simple Queue Service (Amazon SQS) queue. Another Lambda function polls this queue using Lambda event source mapping. Applying the decoupled messaging pattern to the metadata extraction and document embedding functions ensures loose coupling and protects the more compute-intensive downstream embedding function.

The embedding function loads the PDF file from S3 and uses a text embedding model to generate a vector representation of the contained text. LangChain integrates with text embedding models for a variety of LLM providers. The resulting vector representation of the text is loaded into a FAISS index. FAISS is an open-source vector store that can run inside the Lambda function memory using the faiss-cpu Python package. Finally, a dump of this FAISS index is stored in the S3 bucket alongside the original PDF document.
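
Put together, a minimal sketch of this embedding step might look like the following, assuming the LangChain PDF loader, the LangChain Bedrock embeddings integration, and the faiss-cpu package are available in the function environment (class names reflect the LangChain version available at the time of writing and may differ in yours; file paths are placeholders):

from langchain.document_loaders import PyPDFLoader
from langchain.embeddings import BedrockEmbeddings
from langchain.vectorstores import FAISS

# The PDF has previously been downloaded from S3 to the Lambda /tmp directory.
loader = PyPDFLoader("/tmp/document.pdf")
documents = loader.load_and_split()  # split the PDF into smaller text chunks

# Generate vector embeddings for each chunk and build an in-memory FAISS index.
embeddings = BedrockEmbeddings()
vectorstore = FAISS.from_documents(documents, embeddings)

# Dump the index locally; the dump is then uploaded to S3 next to the original PDF.
vectorstore.save_local("/tmp/faiss_index")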

Generating responses

Solution architecture diagram excerpt: generating responses

When a prompt for a specific document is submitted via the Amazon API Gateway REST API endpoint, it is proxied to a Lambda function that:

  1. Loads the FAISS index dump of the corresponding PDF file from S3 into function memory.
  2. Performs a similarity search of the FAISS vector store based on the prompt.
  3. If available, retrieves a record of previous messages in the same conversation via the DynamoDBChatMessageHistory integration. This integration can store message history in DynamoDB. Each conversation is identified by a unique ID.
  4. Finally, a LangChain ConversationalRetrievalChain passes the combination of the prompt submitted by the user, the result of the vector search, and the message history to an LLM to generate a response.
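
A rough sketch of this response-generation flow, assuming LangChain’s Bedrock LLM and DynamoDB chat history integrations (the table name, session ID, and model ID below are placeholders, and class names may differ in your LangChain version), could look like this:

from langchain.llms import Bedrock
from langchain.embeddings import BedrockEmbeddings
from langchain.vectorstores import FAISS
from langchain.memory import ConversationBufferMemory
from langchain.memory.chat_message_histories import DynamoDBChatMessageHistory
from langchain.chains import ConversationalRetrievalChain

# Load the FAISS index dump previously downloaded from S3 into function memory.
vectorstore = FAISS.load_local("/tmp/faiss_index", BedrockEmbeddings())

# Retrieve previous messages for this conversation from DynamoDB.
history = DynamoDBChatMessageHistory(table_name="MemoryTable", session_id="conversation-123")
memory = ConversationBufferMemory(
    memory_key="chat_history", chat_memory=history, return_messages=True
)

# Combine the user prompt, the vector search results, and the message history.
chain = ConversationalRetrievalChain.from_llm(
    llm=Bedrock(model_id="anthropic.claude-v2"),
    retriever=vectorstore.as_retriever(),
    memory=memory,
)

response = chain({"question": "What is this document about?"})
print(response["answer"])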

Web application and file uploads

Solution architecture diagram excerpt: web application

A static web application serves as the frontend for this solution. It’s built with React, TypeScript, Vite, and Tailwind CSS and deployed via AWS Amplify Hosting, a fully managed CI/CD and hosting service for fast, secure, and reliable static and server-side rendered applications. To protect the application from unauthorized access, it integrates with an Amazon Cognito user pool. The API Gateway uses an Amazon Cognito authorizer to authenticate requests.

Users upload PDF files directly to the S3 bucket using S3 presigned URLs obtained via the REST API. Several Lambda functions implement API endpoints used to create, read, and update document metadata in a DynamoDB table.
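
For example, a Lambda function behind the REST API might generate an upload URL along these lines (the bucket name and object key are placeholders for this sketch):

import boto3

s3 = boto3.client('s3')

presigned_url = s3.generate_presigned_url(
    ClientMethod='put_object',
    Params={
        'Bucket': 'document-chat-uploads',   # hypothetical bucket name
        'Key': 'uploads/my-document.pdf',
        'ContentType': 'application/pdf'
    },
    ExpiresIn=300  # the URL is valid for five minutes
)

# The frontend then uploads the file with an HTTP PUT request to this URL.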

Extending and adapting the solution

The solution provided serves as a blueprint that can be enhanced and extended to develop your own use cases based on LLMs. For example, you can extend the solution so that users can ask questions across multiple PDF documents or other types of data sources. LangChain makes it easy to load different types of data into vector stores, which you can then use for semantic search.

Once your use case involves searching across multiple documents, consider moving from loading vectors into memory with FAISS to a dedicated vector database. There are several options for vector databases on AWS. One serverless option is Amazon Aurora Serverless v2 with the pgvector extension for PostgreSQL. Alternatively, vector databases developed by AWS Partners such as Pinecone or MongoDB Atlas Vector Search can be integrated with LangChain. Besides vector search, LangChain also integrates with traditional external data sources, such as Amazon Kendra (an enterprise search service), Amazon OpenSearch Service, and many other data sources.

The solution presented in this blog post uses similarity search to find information in the vector database that closely matches the user-supplied prompt. While this works well in the presented use case, you can also use other approaches, such as maximal marginal relevance, to find the most relevant information to provide to the LLM. When searching across many documents and receiving many results, techniques such as MapReduce can improve the quality of the LLM responses.

Depending on your use case, you may also want to select a different LLM to achieve an ideal balance between quality of results and cost. Amazon Bedrock is a fully managed service that makes foundation models (FMs) from leading AI startups and Amazon available via an API, so you can choose from a wide range of FMs to find the model that’s best suited for your use case. You can use models such as Amazon Titan, Jurassic-2 from AI21 Labs, or Anthropic Claude.

To further optimize the user experience of your generative AI application, consider streaming LLM responses to your frontend in real-time using Lambda response streaming and implementing real-time data updates using AWS AppSync subscriptions or Amazon API Gateway WebSocket APIs.

Conclusion

AWS serverless services make it easier to focus on building generative AI applications by providing automatic scaling, built-in high availability, and a pay-for-use billing model. Event-driven compute with AWS Lambda is a good fit for compute-intensive, on-demand tasks such as document embedding and flexible LLM orchestration.

The solution in this blog post combines the capabilities of LLMs and semantic search to answer natural language questions directed at PDF documents. It serves as a blueprint that can be extended and adapted to fit further generative AI use cases.

Deploy the solution by following the instructions in the associated GitHub repository.

For more serverless learning resources, visit Serverless Land.

AWS Weekly Roundup – Amazon Bedrock Is Now Generally Available, Attend AWS Innovate Online, and More – Oct 2, 2023

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-bedrock-is-now-generally-available-attend-aws-innovate-online-and-more-oct-2-2023/

Last week I attended the AWS Summit Johannesburg. This was the first summit to be hosted in my own country and my own city since 2019, so it was very special to have the opportunity to attend. It was great to meet with so many of our customers and hear how they are building on AWS.

Now on to the AWS updates. I’ve compiled a few announcements and upcoming events you need to know about. Let’s get started!

Last Week’s Launches
Amazon Bedrock Is Now Generally Available – Amazon Bedrock was announced in preview in April of this year as part of a set of new tools for building with generative AI on AWS. Last week’s announcement of this service being generally available was received with a lot of excitement and customers have already been sharing what they are building with Amazon Bedrock. I quite enjoyed this lighthearted post from AWS Serverless Hero Jones Zachariah Noel about the “Bengaluru with traffic-filled roads” image he produced using Stability AI’s Stable Diffusion XL image generation model on Amazon Bedrock.

Amazon MSK Introduces Managed Data Delivery from Apache Kafka to Your Data Lake – Amazon MSK was released in 2019 to help our customers reduce the work needed to set up, scale, and manage Apache Kafka in production. Now you can continuously load data from an Apache Kafka cluster to Amazon Simple Storage Service (Amazon S3).

Other AWS News
A few more news items and blog posts you might have missed:

The Community.AWS Blog is where builders share and learn with the community of cloud enthusiasts. Contributors to this blog include AWS employees, AWS Heroes, AWS Community Builders, and other members of the AWS Community. Last week, AWS Hero Johannes Koch published this awesome post on how to build a simple website using Flutter that interacts with a serverless backend powered by AWS AppSync Merged APIs.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Upcoming AWS Events
We have the following upcoming events:

AWS Cloud Days (October 10 and 24) – Connect and collaborate with other like-minded folks while learning about AWS at the AWS Cloud Days in Athens and Prague.

AWS Innovate Online (October 19) – Register for AWS Innovate Online to learn how you can build, run, and scale next-generation applications on the most extensive cloud platform. There will be 80+ sessions delivered in five languages, and you’ll receive a certificate of attendance to showcase all you’ve learned.

We’re focused on improving our content to provide a better customer experience, and we need your feedback to do so. Take this quick survey to share insights on your experience with the AWS Blog. Note that this survey is hosted by an external company, so the link doesn’t lead to our website. AWS handles your information as described in the AWS Privacy Notice.

Veliswa

Amazon Bedrock Is Now Generally Available – Build and Scale Generative AI Applications with Foundation Models

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/amazon-bedrock-is-now-generally-available-build-and-scale-generative-ai-applications-with-foundation-models/

This April, we announced Amazon Bedrock as part of a set of new tools for building with generative AI on AWS. Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies, including AI21 Labs, Anthropic, Cohere, Stability AI, and Amazon, along with a broad set of capabilities to build generative AI applications, simplifying the development while maintaining privacy and security.

Today, I’m happy to announce that Amazon Bedrock is now generally available! I’m also excited to share that Meta’s Llama 2 13B and 70B parameter models will soon be available on Amazon Bedrock.

Amazon Bedrock

Amazon Bedrock’s comprehensive capabilities help you experiment with a variety of top FMs, customize them privately with your data using techniques such as fine-tuning and retrieval-augmented generation (RAG), and create managed agents that perform complex business tasks—all without writing any code. Check out my previous posts to learn more about agents for Amazon Bedrock and how to connect FMs to your company’s data sources.

Note that some capabilities, such as agents for Amazon Bedrock, including knowledge bases, continue to be available in preview. I’ll share more details on what capabilities continue to be available in preview towards the end of this blog post.

Since Amazon Bedrock is serverless, you don’t have to manage any infrastructure, and you can securely integrate and deploy generative AI capabilities into your applications using the AWS services you are already familiar with.

Amazon Bedrock is integrated with Amazon CloudWatch and AWS CloudTrail to support your monitoring and governance needs. You can use CloudWatch to track usage metrics and build customized dashboards for audit purposes. With CloudTrail, you can monitor API activity and troubleshoot issues as you integrate other systems into your generative AI applications. Amazon Bedrock also allows you to build applications that are in compliance with the GDPR and you can use Amazon Bedrock to run sensitive workloads regulated under the U.S. Health Insurance Portability and Accountability Act (HIPAA).

Get Started with Amazon Bedrock
You can access available FMs in Amazon Bedrock through the AWS Management Console, AWS SDKs, and open-source frameworks such as LangChain.

In the Amazon Bedrock console, you can browse FMs and explore and load example use cases and prompts for each model. First, you need to enable access to the models. In the console, select Model access in the left navigation pane and enable the models you would like to access. Once model access is enabled, you can try out different models and inference configuration settings to find a model that fits your use case.

For example, here’s a contract entity extraction use case example using Cohere’s Command model:

Amazon Bedrock

The example shows a prompt with a sample response, the inference configuration parameter settings for the example, and the API request that runs the example. If you select Open in Playground, you can explore the model and use case further in an interactive console experience.

Amazon Bedrock offers chat, text, and image model playgrounds. In the chat playground, you can experiment with various FMs using a conversational chat interface. The following example uses Anthropic’s Claude model:

Amazon Bedrock

As you evaluate different models, you should try various prompt engineering techniques and inference configuration parameters. Prompt engineering is a new and exciting skill focused on how to better understand and apply FMs to your tasks and use cases. Effective prompt engineering is about crafting the perfect query to get the most out of FMs and obtain proper and precise responses. In general, prompts should be simple, straightforward, and avoid ambiguity. You can also provide examples in the prompt or encourage the model to reason through more complex tasks.

Inference configuration parameters influence the response generated by the model. Parameters such as Temperature, Top P, and Top K give you control over the randomness and diversity, and Maximum Length or Max Tokens control the length of model responses. Note that each model exposes a different but often overlapping set of inference parameters. These parameters are either named the same between models or similar enough to reason through when you try out different models.
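
For illustration, here is how similar settings might be expressed for two different providers. The exact parameter names and request formats are defined by each model provider, so treat the following as an assumption and check the Amazon Bedrock documentation before relying on them:

# Anthropic Claude (illustrative request body)
claude_body = {
    "prompt": "\n\nHuman: Explain prompt engineering in one sentence.\n\nAssistant:",
    "max_tokens_to_sample": 200,  # maximum response length
    "temperature": 0.5,           # randomness of the output
    "top_k": 250,                 # sample only from the top K tokens
    "top_p": 1,                   # nucleus sampling threshold
}

# AI21 Jurassic-2 (illustrative request body; see the full example later in this post)
jurassic_body = {
    "prompt": "Explain prompt engineering in one sentence.",
    "maxTokens": 200,
    "temperature": 0.5,
    "topP": 1,
}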

We discuss effective prompt engineering techniques and inference configuration parameters in more detail in week 1 of the Generative AI with Large Language Models on-demand course, developed by AWS in collaboration with DeepLearning.AI. You can also check the Amazon Bedrock documentation and the model provider’s respective documentation for additional tips.

Next, let’s see how you can interact with Amazon Bedrock via APIs.

Using the Amazon Bedrock API
Working with Amazon Bedrock is as simple as selecting an FM for your use case and then making a few API calls. In the following code examples, I’ll use the AWS SDK for Python (Boto3) to interact with Amazon Bedrock.

List Available Foundation Models
First, let’s set up the boto3 client and then use list_foundation_models() to see the most up-to-date list of available FMs:

import boto3
import json

bedrock = boto3.client(
    service_name='bedrock',
    region_name='us-east-1'
)

bedrock.list_foundation_models()

Run Inference Using Amazon Bedrock’s InvokeModel API
Next, let’s perform an inference request using Amazon Bedrock’s InvokeModel API and boto3 runtime client. The runtime client manages the data plane APIs, including the InvokeModel API.

Amazon Bedrock

The InvokeModel API expects the following parameters:

{
    "modelId": <MODEL_ID>,
    "contentType": "application/json",
    "accept": "application/json",
    "body": <BODY>
}

The modelId parameter identifies the FM you want to use. The request body is a JSON string containing the prompt for your task, together with any inference configuration parameters. Note that the prompt format will vary based on the selected model provider and FM. The contentType and accept parameters define the MIME type of the data in the request body and response and default to application/json. For more information on the latest models, InvokeModel API parameters, and prompt formats, see the Amazon Bedrock documentation.

Example: Text Generation Using AI21 Labs’ Jurassic-2 Model
Here is a text generation example using AI21 Labs’ Jurassic-2 Ultra model. I’ll ask the model to tell me a knock-knock joke—my version of a Hello World.

bedrock_runtime = boto3.client(
    service_name='bedrock-runtime',
    region_name='us-east-1'
)

modelId = 'ai21.j2-ultra-v1' 
accept = 'application/json'
contentType = 'application/json'

body = json.dumps(
    {"prompt": "Knock, knock!", 
     "maxTokens": 200,
     "temperature": 0.7,
     "topP": 1,
    }
)

response = bedrock_runtime.invoke_model(
    body=body,
    modelId=modelId,
    accept=accept,
    contentType=contentType
)

response_body = json.loads(response.get('body').read())

Here’s the response:

outputText = response_body.get('completions')[0].get('data').get('text')
print(outputText)
Who's there? 
Boo! 
Boo who? 
Don't cry, it's just a joke!

You can also use the InvokeModel API to interact with embedding models.

Example: Create Text Embeddings Using Amazon’s Titan Embeddings Model
Text embedding models translate text inputs, such as words, phrases, or possibly large units of text, into numerical representations, known as embedding vectors. Embedding vectors capture the semantic meaning of the text in a high-dimension vector space and are useful for applications such as personalization or search. In the following example, I’m using the Amazon Titan Embeddings model to create an embedding vector.

prompt = "Knock-knock jokes are hilarious."

body = json.dumps({
    "inputText": prompt,
})

model_id = 'amazon.titan-embed-g1-text-02'
accept = 'application/json' 
content_type = 'application/json'

response = bedrock_runtime.invoke_model(
    body=body, 
    modelId=model_id, 
    accept=accept, 
    contentType=content_type
)

response_body = json.loads(response['body'].read())
embedding = response_body.get('embedding')

The embedding vector (shortened) will look similar to this:

[0.82421875, -0.6953125, -0.115722656, 0.87890625, 0.05883789, -0.020385742, 0.32421875, -0.00078201294, -0.40234375, 0.44140625, ...]

Note that Amazon Titan Embeddings is available today. The Amazon Titan Text family of models for text generation continues to be available in limited preview.

Run Inference Using Amazon Bedrock’s InvokeModelWithResponseStream API
The InvokeModel API request is synchronous and waits for the entire output to be generated by the model. For models that support streaming responses, Bedrock also offers an InvokeModelWithResponseStream API that lets you invoke the specified model to run inference using the provided input but streams the response as the model generates the output.

Amazon Bedrock

Streaming responses are particularly useful for responsive chat interfaces to keep the user engaged in an interactive application. Here is a Python code example using Amazon Bedrock’s InvokeModelWithResponseStream API:

response = bedrock_runtime.invoke_model_with_response_stream(
    modelId=modelId, 
    body=body)

stream = response.get('body')
if stream:
    for event in stream:
        chunk=event.get('chunk')
        if chunk:
            print(json.loads(chunk.get('bytes').decode()))

Data Privacy and Network Security
With Amazon Bedrock, you are in control of your data, and all your inputs and customizations remain private to your AWS account. Your data, such as prompts, completions, and fine-tuned models, is not used for service improvement. Also, the data is never shared with third-party model providers.

Your data remains in the Region where the API call is processed. All data is encrypted in transit with a minimum of TLS 1.2 encryption. Data at rest is encrypted with AES-256 using AWS KMS managed data encryption keys. You can also use your own keys (customer managed keys) to encrypt the data.

You can configure your AWS account and virtual private cloud (VPC) to use Amazon VPC endpoints (built on AWS PrivateLink) to securely connect to Amazon Bedrock over the AWS network. This allows for secure and private connectivity between your applications running in a VPC and Amazon Bedrock.

Governance and Monitoring
Amazon Bedrock integrates with IAM to help you manage permissions for Amazon Bedrock. Such permissions include access to specific models, playgrounds, or features within Amazon Bedrock. All AWS-managed service API activity, including Amazon Bedrock activity, is logged to CloudTrail within your account.

Amazon Bedrock emits data points to CloudWatch using the AWS/Bedrock namespace to track common metrics such as InputTokenCount, OutputTokenCount, InvocationLatency, and (number of) Invocations. You can filter results and get statistics for a specific model by specifying the model ID dimension when you search for metrics. This near real-time insight helps you track usage and cost (input and output token count) and troubleshoot performance issues (invocation latency and number of invocations) as you start building generative AI applications with Amazon Bedrock.
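
For example, you could retrieve the hourly invocation count for a specific model with a few lines of Python. The dimension name used below (ModelId) and the example model ID are assumptions for this sketch, so verify the exact dimension names in the CloudWatch console or the Bedrock documentation:

import boto3
from datetime import datetime, timedelta

cloudwatch = boto3.client('cloudwatch', region_name='us-east-1')

response = cloudwatch.get_metric_statistics(
    Namespace='AWS/Bedrock',
    MetricName='Invocations',
    Dimensions=[{'Name': 'ModelId', 'Value': 'anthropic.claude-v2'}],  # assumed dimension name
    StartTime=datetime.utcnow() - timedelta(hours=24),
    EndTime=datetime.utcnow(),
    Period=3600,            # one data point per hour
    Statistics=['Sum']
)

for datapoint in sorted(response['Datapoints'], key=lambda d: d['Timestamp']):
    print(datapoint['Timestamp'], datapoint['Sum'])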

Billing and Pricing Models
Here are a couple of things around billing and pricing models to keep in mind when using Amazon Bedrock:

Billing – Text generation models are billed per processed input tokens and per generated output tokens. Text embedding models are billed per processed input tokens. Image generation models are billed per generated image.

Pricing Models – Amazon Bedrock offers two pricing models, on-demand and provisioned throughput. On-demand pricing allows you to use FMs on a pay-as-you-go basis without having to make any time-based term commitments. Provisioned throughput is primarily designed for large, consistent inference workloads that need guaranteed throughput in exchange for a term commitment. Here, you specify the number of model units of a particular FM to meet your application’s performance requirements as defined by the maximum number of input and output tokens processed per minute. For detailed pricing information, see Amazon Bedrock Pricing.

Now Available
Amazon Bedrock is available today in AWS Regions US East (N. Virginia) and US West (Oregon). To learn more, visit Amazon Bedrock, check the Amazon Bedrock documentation, explore the generative AI space at community.aws, and get hands-on with the Amazon Bedrock workshop. You can send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS contacts.

(Available in Preview) The Amazon Titan Text family of text generation models, Stability AI’s Stable Diffusion XL image generation model, and agents for Amazon Bedrock, including knowledge bases, continue to be available in preview. Reach out through your usual AWS contacts if you’d like access.

(Coming Soon) The Llama 2 13B and 70B parameter models by Meta will soon be available via Amazon Bedrock’s fully managed API for inference and fine-tuning.

Start building generative AI applications with Amazon Bedrock, today!

— Antje

Preview – Connect Foundation Models to Your Company Data Sources with Agents for Amazon Bedrock

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/preview-connect-foundation-models-to-your-company-data-sources-with-agents-for-amazon-bedrock/

In July, we announced the preview of agents for Amazon Bedrock, a new capability for developers to create generative AI applications that complete tasks. Today, I’m happy to introduce a new capability to securely connect foundation models (FMs) to your company data sources using agents.

With a knowledge base, you can use agents to give FMs in Bedrock access to additional data that helps the model generate more relevant, context-specific, and accurate responses without continuously retraining the FM. Based on user input, agents identify the appropriate knowledge base, retrieve the relevant information, and add the information to the input prompt, giving the model more context information to generate a completion.

Knowledge Base for Amazon Bedrock

Agents for Amazon Bedrock use a concept known as retrieval augmented generation (RAG) to achieve this. To create a knowledge base, specify the Amazon Simple Storage Service (Amazon S3) location of your data, select an embedding model, and provide the details of your vector database. Bedrock converts your data into embeddings and stores your embeddings in the vector database. Then, you can add the knowledge base to agents to enable RAG workflows.

For the vector database, you can choose between vector engine for Amazon OpenSearch Serverless, Pinecone, and Redis Enterprise Cloud. I’ll share more details on how to set up your vector database later in this post.

Primer on Retrieval Augmented Generation, Embeddings, and Vector Databases
RAG isn’t a specific set of technologies but a concept for providing FMs access to data they didn’t see during training. Using RAG, you can augment FMs with additional information, including company-specific data, without continuously retraining your model.

Continuously retraining your model is not only compute-intensive and expensive, but as soon as you’ve retrained the model, your company might have already generated new data, leaving your model with stale information. RAG addresses this issue by providing your model access to additional external data at runtime. Relevant data is then added to the prompt to help improve both the relevance and the accuracy of completions.

This data can come from a number of data sources, such as document stores or databases. A common implementation for document search is converting your documents, or chunks of the documents, into vector embeddings using an embedding model and then storing the vector embeddings in a vector database, as shown in the following figure.

Knowledge Base for Amazon Bedrock

The vector embedding includes the numeric representations of text data within your documents. Each embedding aims to capture the semantic or contextual meaning of the data. Each vector embedding is put into a vector database, often with additional metadata such as a reference to the original content the embedding was created from. The vector database then indexes the vectors, which can be done using a variety of approaches. This indexing enables quick retrieval of relevant data.

Compared to traditional keyword search, vector search can find relevant results without requiring an exact keyword match. For example, if you search for “What is the cost of product X?” and your documents say “The price of product X is […]”, then keyword search might not work because “price” and “cost” are two different words. With vector search, you will get an accurate result because “price” and “cost” are semantically similar; they have the same meaning. Vector similarity is calculated using distance metrics such as Euclidean distance, cosine similarity, or dot product similarity.
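
As a toy illustration of cosine similarity (real embeddings have hundreds or thousands of dimensions, and the vectors below are made up for this example):

import numpy as np

def cosine_similarity(a, b):
    # 1.0 means the vectors point in the same direction; values near 0 mean unrelated.
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

price   = np.array([0.90, 0.10, 0.30])  # embedding of a sentence about "price"
cost    = np.array([0.85, 0.15, 0.25])  # embedding of a sentence about "cost"
weather = np.array([0.10, 0.90, 0.20])  # embedding of an unrelated sentence

print(cosine_similarity(price, cost))     # high similarity, roughly 0.99
print(cosine_similarity(price, weather))  # much lower similarity, roughly 0.27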

The vector database is then used within the prompt workflow to efficiently retrieve external information based on an input query, as shown in the figure below.

Knowledge Base for Amazon Bedrock

The workflow starts with a user input prompt. Using the same embedding model, you create a vector embedding representation of the input prompt. This embedding is then used to query the database for similar vector embeddings to return the most relevant text as the query result.

The query result is then added to the prompt, and the augmented prompt is passed to the FM. The model uses the additional context in the prompt to generate the completion, as shown in the following figure.

Knowledge Stores for Amazon Bedrock
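
Conceptually, the augmentation step boils down to something like the following sketch. The managed knowledge base and agent handle this for you behind the scenes, so the snippet is purely illustrative and the prompt template and sample data are made up:

def augment_prompt(user_question, retrieved_passages):
    # Fold the query results from the vector database into the prompt as context.
    context = "\n".join(retrieved_passages)
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {user_question}\nAnswer:"
    )

passages = ["The price of product X is 49.99 USD."]  # hypothetical query result
prompt = augment_prompt("What is the cost of product X?", passages)
# The augmented prompt is then passed to the FM, for example via the InvokeModel API.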

Similar to the fully managed agents experience I described in the blog post on agents for Amazon Bedrock, the knowledge base for Amazon Bedrock manages the data ingestion workflow, and agents manage the RAG workflow for you.

Get Started with Knowledge Bases for Amazon Bedrock
You can add a knowledge base by specifying a data source, such as Amazon S3, selecting an embedding model, such as Amazon Titan Embeddings, to convert the data into vector embeddings, and choosing a destination vector database to store the vector data. Bedrock takes care of creating, storing, managing, and updating your embeddings in the vector database.

If you add knowledge bases to an agent, the agent will identify the appropriate knowledge base based on user input, retrieve the relevant information, and add the information to the input prompt, providing the model with more context information to generate a response, as shown in the figure below. All information retrieved from knowledge bases comes with source attribution to improve transparency and minimize hallucinations.

Knowledge Base for Amazon Bedrock

Let me walk you through those steps in more detail.

Create a Knowledge Base for Amazon Bedrock
Let’s assume you’re a developer at a tax consulting company and want to provide users with a generative AI application—a TaxBot—that can answer US tax filing questions. You first create a knowledge base that holds the relevant tax documents. Then, you configure an agent in Bedrock with access to this knowledge base and integrate the agent into your TaxBot application.

To get started, open the Bedrock console, select Knowledge base in the left navigation pane, then choose Create knowledge base.

Knowledge Base for Amazon Bedrock

Step 1 – Provide knowledge base details. Enter a name for the knowledge base and a description (optional). You also must select an AWS Identity and Access Management (IAM) runtime role with a trust policy for Amazon Bedrock, permissions to access the S3 bucket you want the knowledge base to use, and read/write permissions to your vector database. You can also assign tags as needed.

Knowledge Base for Amazon Bedrock

Step 2 – Set up data source. Enter a data source name and specify the Amazon S3 location for your data. Supported data formats include .txt, .md, .html, .doc and .docx, .csv, .xls and .xlsx, and .pdf files. You can also provide an AWS Key Management Service (AWS KMS) key to allow Bedrock to decrypt and encrypt your data and another AWS KMS key for transient data storage while Bedrock is converting your data into embeddings.

Choose the embedding model, such as Amazon Titan Embeddings – Text, and your vector database. For the vector database, as mentioned earlier, you can choose between vector engine for Amazon OpenSearch Serverless, Pinecone, or Redis Enterprise Cloud.

Knowledge Base for Amazon Bedrock

Important note on the vector database: Amazon Bedrock is not creating a vector database on your behalf. You must create a new, empty vector database from the list of supported options and provide the vector database index name as well as index field and metadata field mappings. This vector database will need to be for exclusive use with Amazon Bedrock.

Let me show you what the setup looks like for vector engine for Amazon OpenSearch Serverless. Assuming you’ve set up an OpenSearch Serverless collection as described in the Developer Guide and this AWS Big Data Blog post, provide the ARN of the OpenSearch Serverless collection and specify the vector index name, the vector field, and the metadata field mapping.

Knowledge Base for Amazon Bedrock

The configuration for Pinecone and Redis Enterprise Cloud is similar. Check out this Pinecone blog post and this Redis Inc. blog post for more details on how to set up and prepare their vector database for Bedrock.

Step 3 – Review and create. Review your knowledge base configuration and choose Create knowledge base.

Knowledge Base for Amazon Bedrock

Back on the knowledge base details page, choose Sync for the newly created data source (and whenever you add new data to the data source) to start the ingestion workflow of converting your Amazon S3 data into vector embeddings and upserting the embeddings into the vector database. Depending on the amount of data, this whole workflow can take some time.

Knowledge Base for Amazon Bedrock

Next, I’ll show you how to add the knowledge base to an agent configuration.

Add a Knowledge Base to Agents for Amazon Bedrock
You can add a knowledge base when creating or updating an agent for Amazon Bedrock. Create an agent as described in this AWS News Blog post on agents for Amazon Bedrock.

For my tax bot example, I’ve created an agent called “TaxBot,” selected a foundation model, and provided these instructions for the agent in step 2: “You are a helpful and friendly agent that answers US tax filing questions for users.” In step 4, you can now select a previously created knowledge base and provide instructions for the agent describing when to use this knowledge base.

Knowledge Base for Amazon Bedrock

These instructions are very important as they help the agent decide whether or not a particular knowledge base should be used for retrieval. The agent will identify the appropriate knowledge base based on user input and available knowledge base instructions.

For my tax bot example, I added the knowledge base “TaxBot-Knowledge-Base” together with these instructions: “Use this knowledge base to answer tax filing questions.”

Once you’ve finished the agent configuration, you can test your agent and how it’s using the added knowledge base. Note how the agent provides a source attribution for information pulled from knowledge bases.

Knowledge Base for Amazon Bedrock

Learn the Fundamentals of Generative AI
Generative AI with large language models (LLMs) is an on-demand, three-week course for data scientists and engineers who want to learn how to build generative AI applications with LLMs, including RAG. It’s the perfect foundation to start building with Amazon Bedrock. Enroll in Generative AI with LLMs today.

Sign up to Learn More about Amazon Bedrock (Preview)
Amazon Bedrock is currently available in preview. Reach out through your usual AWS support contacts if you’d like access to knowledge bases for Amazon Bedrock as part of the preview. We’re regularly providing access to new customers. To learn more, visit the Amazon Bedrock Features page and sign up to learn more about Amazon Bedrock.

— Antje

AWS Week in Review – Agents for Amazon Bedrock, Amazon SageMaker Canvas New Capabilities, and More – July 31, 2023

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-week-in-review-agents-for-amazon-bedrock-amazon-sagemaker-canvas-new-capabilities-and-more-july-31-2023/

This July, AWS communities in ASEAN made history. First, the AWS User Group Malaysia recently held the first-ever AWS Community Day in Malaysia.

Another significant milestone has been achieved by the AWS User Group Philippines. They just celebrated their tenth anniversary by running two days of AWS Community Day Philippines. Here are a few photos from the event, including Jeff Barr sharing his experiences from attending an AWS User Group meetup in Manila, Philippines, 10 years ago.

Big congratulations to the AWS Community Heroes, AWS Community Builders, AWS User Group leaders, and all volunteers who organized and delivered AWS Community Days! Also, thank you to everyone who attended and helped support our AWS communities.

Last Week’s Launches
We had interesting launches last week, including from AWS Summit, New York. Here are some of my personal highlights:

(Preview) Agents for Amazon Bedrock – You can now create managed agents for Amazon Bedrock to handle tasks using API calls to company systems, understand user requests, break down complex tasks into steps, hold conversations to gather more information, and take actions to fulfill requests.

(Coming Soon) New LLM Capabilities in Amazon QuickSight Q – We are expanding the innovation in QuickSight Q by introducing new LLM capabilities through Amazon Bedrock. These Generative BI capabilities will allow organizations to easily explore data, uncover insights, and facilitate sharing of insights.

AWS Glue Studio support for Amazon CodeWhisperer – You can now write specific tasks in natural language (English) as comments in the Glue Studio notebook, and Amazon CodeWhisperer provides code recommendations for you.

(Preview) Vector Engine for Amazon OpenSearch Serverless – This capability empowers you to create modern ML-augmented search experiences and generative AI applications without the need to handle the complexities of managing the underlying vector database infrastructure.

Last week, Amazon SageMaker Canvas also released a set of new capabilities.

AWS Open-Source Updates
As always, my colleague Ricardo has curated the latest updates for open-source news at AWS. Here are some of the highlights.

cdk-aws-observability-accelerator is a set of opinionated modules to help you set up observability for your AWS environments with AWS native services and AWS-managed observability services such as Amazon Managed Service for Prometheus, Amazon Managed Grafana, AWS Distro for OpenTelemetry (ADOT) and Amazon CloudWatch.

iac-devtools-cli-for-cdk is a command line interface tool that automates many of the tedious tasks of building, adding to, documenting, and extending AWS CDK applications.

Upcoming AWS Events
There are upcoming events that you can join to learn. Let’s start with AWS events:

And let’s learn from our fellow builders and join AWS Community Days:

Open for Registration for AWS re:Invent
We want to be sure you know that AWS re:Invent registration is now open!


This learning conference hosted by AWS for the global cloud computing community will be held from November 27 to December 1, 2023, in Las Vegas.

Pro-tip: You can use information on the Justify Your Trip page to prove the value of your trip to AWS re:Invent.

Give Us Your Feedback
We’re focused on improving our content to provide a better customer experience, and we need your feedback to do so. Please take this quick survey to share insights on your experience with the AWS Blog. Note that this survey is hosted by an external company, so the link does not lead to our website. AWS handles your information as described in the AWS Privacy Notice.

That’s all for this week. Check back next Monday for another Week in Review.

Happy building!

Donnie

This post is part of our Week in Review series. Check back each week for a quick round-up of interesting news and announcements from AWS!



Preview – Enable Foundation Models to Complete Tasks With Agents for Amazon Bedrock

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/preview-enable-foundation-models-to-complete-tasks-with-agents-for-amazon-bedrock/

This April, Swami Sivasubramanian, Vice President of Data and Machine Learning at AWS, announced Amazon Bedrock and Amazon Titan models as part of new tools for building with generative AI on AWS. Amazon Bedrock, currently available in preview, is a fully managed service that makes foundation models (FMs) from Amazon and leading AI startups—such as AI21 Labs, Anthropic, Cohere, and Stability AI—available through an API.

Today, I’m excited to announce the preview of agents for Amazon Bedrock, a new capability for developers to create fully managed agents in a few clicks. Agents for Amazon Bedrock accelerate the delivery of generative AI applications that can manage and perform tasks by making API calls to your company systems. Agents extend FMs to understand user requests, break down complex tasks into multiple steps, carry on a conversation to collect additional information, and take actions to fulfill the request.

Agents for Amazon Bedrock

Using agents for Amazon Bedrock, you can automate tasks for your internal or external customers, such as managing retail orders or processing insurance claims. For example, an agent-powered generative AI e-commerce application can not only respond to the question, “Do you have this jacket in blue?” with a simple answer but can also help you with the task of updating your order or managing an exchange.

For this to work, you first need to give the agent access to external data sources and connect it to existing APIs of other applications. This allows the FM that powers the agent to interact with the broader world and extend its utility beyond just language processing tasks. Second, the FM needs to figure out what actions to take, what information to use, and in which sequence to perform these actions. This is possible thanks to an exciting emerging behavior of FMs—their ability to reason. You can show FMs how to handle such interactions and how to reason through tasks by building prompts that include definitions and instructions. The process of designing prompts to guide the model towards desired outputs is known as prompt engineering.

Introducing Agents for Amazon Bedrock
Agents for Amazon Bedrock automate the prompt engineering and orchestration of user-requested tasks. Once configured, an agent automatically builds the prompt and securely augments it with your company-specific information to provide responses back to the user in natural language. The agent is able to figure out the actions required to automatically process user-requested tasks. It breaks the task into multiple steps, orchestrates a sequence of API calls and data lookups, and maintains memory to complete the action for the user.

With fully managed agents, you don’t have to worry about provisioning or managing infrastructure. You’ll have seamless support for monitoring, encryption, user permissions, and API invocation management without writing custom code. As a developer, you can use the Bedrock console or SDK to upload the API schema. The agent then orchestrates the tasks with the help of FMs and performs API calls using AWS Lambda functions.

Primer on Advanced Reasoning and ReAct
You can help FMs to reason and figure out how to solve user-requested tasks with a reasoning technique called ReAct (synergizing reasoning and acting). Using ReAct, you can structure prompts to show an FM how to reason through a task and decide on actions that help find a solution. The structured prompts include a sequence of question-thought-action-observation examples.

The question is the user-requested task or problem to solve. The thought is a reasoning step that helps demonstrate to the FM how to tackle the problem and identify an action to take. The action is an API that the model can invoke from an allowed set of APIs. The observation is the result of carrying out the action. The actions that the FM is able to choose from are defined by a set of instructions that are prepended to the example prompt text. Here is an illustration of how you would build up a ReAct prompt:

Building up a ReAct prompt
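
To make the structure concrete, a hypothetical ReAct-style prompt excerpt (using the insurance claim APIs from the example later in this post) might read:

Question: Which open claims are missing documents?
Thought: I should first retrieve the list of open claims.
Action: GET /claims
Observation: [claim-857, claim-006]
Thought: Now I need to check each open claim for missing documents.
Action: GET /claims/claim-857/identify-missing-documents
Observation: ["driver's license copy"]
...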

The good news is that Bedrock performs the heavy lifting for you! Behind the scenes, agents for Amazon Bedrock build the prompts based on the information and actions you provide.

Now, let me show you how to get started with agents for Amazon Bedrock.

Create an Agent for Amazon Bedrock
Let’s assume you’re a developer at an insurance company and want to provide a generative AI application that helps the insurance agency owners automate repetitive tasks. You create an agent in Bedrock and integrate it into your application.

To get started with the agent, open the Bedrock console, select Agents in the left navigation panel, then choose Create Agent.

Agents for Amazon Bedrock

This starts the agent creation workflow.

  1. Provide agent details including agent name, description (optional), whether the agent is allowed to request additional user inputs, and the AWS Identity and Access Management (IAM) service role that gives your agent access to other required services, such as Amazon Simple Storage Service (Amazon S3) and AWS Lambda.
  2. Select a foundation model from Bedrock that fits your use case. Here, you provide an instruction to your agent in natural language. The instruction tells the agent what task it’s supposed to perform and the persona it’s supposed to assume. For example, “You are an agent designed to help with processing insurance claims and managing pending paperwork.”
  3. Add action groups. An action is a task that the agent can perform automatically by making API calls to your company systems. A set of actions is defined in an action group. Here, you provide an API schema that defines the APIs for all the actions in the group. You also must provide a Lambda function that represents the business logic for each API. For example, let’s define an action group called ClaimManagementActionGroup that manages insurance claims by pulling a list of open claims, identifying outstanding paperwork for each claim, and sending reminders to policy holders. Make sure to capture this information in the action group description. The business logic for my action group is captured in the Lambda function InsuranceClaimsLambda. This AWS Lambda function implements methods for the following API calls: open-claims, identify-missing-documents, and send-reminders. Here’s a short extract from my InsuranceClaimsLambda:
    import json
    import time
     
    def open_claims():
        ...
    
    def identify_missing_documents(parameters):
        ...
     
    def send_reminders():
        ...
     
    def lambda_handler(event, context):
        responses = []
     
        for prediction in event['actionGroups']:
            response_code = ...
            action = prediction['actionGroup']
            api_path = prediction['apiPath']
            
            if api_path == '/claims':
                body = open_claims() 
            elif api_path == '/claims/{claimId}/identify-missing-documents':
                parameters = prediction['parameters']
                body = identify_missing_documents(parameters)
            elif api_path == '/send-reminders':
                body =  send_reminders()
            else:
                body = {"{}::{} is not a valid api, try another one.".format(action, api_path)}
     
            response_body = {
                'application/json': {
                    'body': str(body)
                }
            }
            
            action_response = {
                'actionGroup': prediction['actionGroup'],
                'apiPath': prediction['apiPath'],
                'httpMethod': prediction['httpMethod'],
                'httpStatusCode': response_code,
                'responseBody': response_body
            }
            
            responses.append(action_response)
     
        api_response = {'response': responses}
     
        return api_response
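
    To sanity-check this handler locally, you might invoke it with a hand-built event that mirrors the fields the handler reads. The actual payload that agents for Amazon Bedrock send to your Lambda function is constructed for you and may include additional fields, so treat this only as an illustrative sketch:

    # Illustrative test event; the real event is built by agents for Amazon Bedrock.
    test_event = {
        'actionGroups': [{
            'actionGroup': 'ClaimManagementActionGroup',
            'apiPath': '/claims',
            'httpMethod': 'GET'
        }]
    }

    print(lambda_handler(test_event, None))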

    Note that you also must provide an API schema in OpenAPI JSON format. Here’s what my API schema file insurance_claim_schema.json looks like:

    {"openapi": "3.0.0",
        "info": {
            "title": "Insurance Claims Automation API",
            "version": "1.0.0",
            "description": "APIs for managing insurance claims by pulling a list of open claims, identifying outstanding paperwork for each claim, and sending reminders to policy holders."
        },
        "paths": {
            "/claims": {
                "get": {
                    "summary": "Get a list of all open claims",
                    "description": "Get the list of all open insurance claims. Return all the open claimIds.",
                    "operationId": "getAllOpenClaims",
                    "responses": {
                        "200": {
                            "description": "Gets the list of all open insurance claims for policy holders",
                            "content": {
                                "application/json": {
                                    "schema": {
                                        "type": "array",
                                        "items": {
                                            "type": "object",
                                            "properties": {
                                                "claimId": {
                                                    "type": "string",
                                                    "description": "Unique ID of the claim."
                                                },
                                                "policyHolderId": {
                                                    "type": "string",
                                                    "description": "Unique ID of the policy holder who has filed the claim."
                                                },
                                                "claimStatus": {
                                                    "type": "string",
                                                    "description": "The status of the claim. Claim can be in Open or Closed state"
                                                }
                                            }
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            },
            "/claims/{claimId}/identify-missing-documents": {
                "get": {
                    "summary": "Identify missing documents for a specific claim",
                    "description": "Get the list of pending documents that need to be uploaded by policy holder before the claim can be processed. The API takes in only one claim id and returns the list of documents that are pending to be uploaded by policy holder for that claim. This API should be called for each claim id",
                    "operationId": "identifyMissingDocuments",
                    "parameters": [{
                        "name": "claimId",
                        "in": "path",
                        "description": "Unique ID of the open insurance claim",
                        "required": true,
                        "schema": {
                            "type": "string"
                        }
                    }],
                    "responses": {
                        "200": {
                            "description": "List of documents that are pending to be uploaded by policy holder for insurance claim",
                            "content": {
                                "application/json": {
                                    "schema": {
                                        "type": "object",
                                        "properties": {
                                            "pendingDocuments": {
                                                "type": "string",
                                                "description": "The list of pending documents for the claim."
                                            }
                                        }
                                    }
                                }
                            }
    
                        }
                    }
                }
            },
            "/send-reminders": {
                "post": {
                    "summary": "API to send reminder to the customer about pending documents for open claim",
                    "description": "Send reminder to the customer about pending documents for open claim. The API takes in only one claim id and its pending documents at a time, sends the reminder and returns the tracking details for the reminder. This API should be called for each claim id you want to send reminders for.",
                    "operationId": "sendReminders",
                    "requestBody": {
                        "required": true,
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "properties": {
                                        "claimId": {
                                            "type": "string",
                                            "description": "Unique ID of open claims to send reminders for."
                                        },
                                        "pendingDocuments": {
                                            "type": "string",
                                            "description": "The list of pending documents for the claim."
                                        }
                                    },
                                    "required": [
                                        "claimId",
                                        "pendingDocuments"
                                    ]
                                }
                            }
                        }
                    },
                    "responses": {
                        "200": {
                            "description": "Reminders sent successfully",
                            "content": {
                                "application/json": {
                                    "schema": {
                                        "type": "object",
                                        "properties": {
                                            "sendReminderTrackingId": {
                                                "type": "string",
                                                "description": "Unique Id to track the status of the send reminder Call"
                                            },
                                            "sendReminderStatus": {
                                                "type": "string",
                                                "description": "Status of send reminder notifications"
                                            }
                                        }
                                    }
                                }
                            }
                        },
                        "400": {
                            "description": "Bad request. One or more required fields are missing or invalid."
                        }
                    }
                }
            }
        }
    }
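
    Before uploading the schema, a quick local check that the file is valid JSON and lists the expected paths can save a round trip. This is only a convenience sketch using the Python standard library, not something Bedrock requires:

    import json

    # Load the schema file and print each path with its operations;
    # a json.JSONDecodeError here means the file is not valid JSON.
    with open('insurance_claim_schema.json') as f:
        schema = json.load(f)

    for path, operations in schema['paths'].items():
        print(path, '->', ', '.join(operations.keys()))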

    When a user asks your agent to complete a task, Bedrock will use the FM you configured for the agent to identify the sequence of actions and invoke the corresponding Lambda functions in the right order to solve the user-requested task.

  4. In the final step, review your agent configuration and choose Create Agent.
    Agents for Amazon Bedrock
  5. Congratulations, you’ve just created your first agent in Amazon Bedrock!
    Agents for Amazon Bedrock

Deploy an Agent for Amazon Bedrock
To deploy an agent in your application, you must create an alias. Bedrock then automatically creates a version for that alias.

  1. In the Bedrock console, select your agent, then select Deploy, and choose Create to create an alias.
    Agents for Amazon Bedrock
  2. Provide an alias name and description and choose whether to create a new version or use an existing version of your agent to associate with this alias.
    Agents for Amazon Bedrock
  3. This saves a snapshot of the agent code and configuration and associates an alias with this snapshot or version. You can use the alias to integrate the agent into your applications, as shown in the code sketch after this list.
    Agents for Amazon Bedrock
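
Once the alias is in place, you can call the agent from your application code. Here is a minimal sketch using the AWS SDK for Python (Boto3), assuming your SDK version already includes the bedrock-agent-runtime client; the agent ID and alias ID are placeholders you would copy from the Bedrock console:

    import uuid
    import boto3

    # Minimal sketch of invoking a deployed agent; the IDs below are hypothetical placeholders.
    agent_runtime = boto3.client('bedrock-agent-runtime')

    response = agent_runtime.invoke_agent(
        agentId='YOUR_AGENT_ID',
        agentAliasId='YOUR_AGENT_ALIAS_ID',
        sessionId=str(uuid.uuid4()),  # reuse the same session ID to keep conversation context
        inputText='Send reminder to all policy holders with open claims and pending paperwork.'
    )

    # The response streams the agent's answer back in chunks.
    completion = ''
    for event in response['completion']:
        if 'chunk' in event:
            completion += event['chunk']['bytes'].decode('utf-8')

    print(completion)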

Now, let’s test the insurance agent! You can do this right in the Bedrock console.

Let’s ask the agent to “Send reminder to all policy holders with open claims and pending paperwork.” You can see how the FM-powered agent is able to understand the user request, break the task down into steps (collect the open insurance claims, look up the claim IDs, send reminders), and perform the corresponding actions.

Agents for Amazon Bedrock

Agents for Amazon Bedrock can help you increase productivity, improve your customer service experience, or automate DevOps tasks. I’m excited to see what use cases you will implement!

Generative AI with large language models
Learn the Fundamentals of Generative AI
If you’re interested in the fundamentals of generative AI and how to work with FMs, including advanced prompting techniques and agents, check out this new hands-on course that I developed with AWS colleagues and industry experts in collaboration with DeepLearning.AI:

Generative AI with large language models (LLMs) is an on-demand, three-week course for data scientists and engineers who want to learn how to build generative AI applications with LLMs. It’s the perfect foundation to start building with Amazon Bedrock. Enroll in Generative AI with LLMs today.

Sign up to Learn More about Amazon Bedrock (Preview)
Amazon Bedrock is currently available in preview. Reach out to us if you’d like access to agents for Amazon Bedrock as part of the preview. We’re regularly providing access to new customers. Visit the Amazon Bedrock Features page and sign up to learn more about Amazon Bedrock.

— Antje


P.S. We’re focused on improving our content to provide a better customer experience, and we need your feedback to do so. Please take this quick survey to share insights on your experience with the AWS Blog. Note that this survey is hosted by an external company, so the link does not lead to our website. AWS handles your information as described in the AWS Privacy Notice.