Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/agents-for-amazon-bedrock-now-support-memory-retention-and-code-interpretation-preview/
With Agents for Amazon Bedrock, generative artificial intelligence (AI) applications can run multistep tasks across different systems and data sources. A couple of months back, we simplified the creation and configuration of agents. Today, we are introducing in preview two new fully managed capabilities:
Retain memory across multiple interactions – Agents can now retain a summary of their conversations with each user and provide a smooth, adaptive experience, especially for complex, multistep tasks, such as user-facing interactions and enterprise automation solutions like booking flights or processing insurance claims.
Support for code interpretation – Agents can now dynamically generate and run code snippets within a secure, sandboxed environment to address complex use cases such as data analysis, data visualization, text processing, solving equations, and optimization problems. To make it easier to use this feature, we also added the ability to upload documents directly to an agent.
Let’s see how these new capabilities work in more detail.
Memory retention across multiple interactions
With memory retention, you can build agents that learn and adapt to each user’s unique needs and preferences over time. By maintaining a persistent memory, agents can pick up right where the users left off, providing a smooth flow of conversations and workflows, especially for complex, multistep tasks.
Imagine a user booking a flight. Thanks to the ability to retain memory, the agent can learn their travel preferences and use that knowledge to streamline subsequent booking requests, creating a personalized and efficient experience. For example, it can automatically propose the right seat to a user or a meal similar to their previous choices.
Using memory retention to be more context-aware also simplifies business process automation. For example, an agent used by an enterprise to process customer feedback can now be aware of previous and ongoing interactions with the same customer without having to handle custom integrations.
Each user’s conversation history and context are securely stored under a unique memory identifier (ID), ensuring complete separation between users. With memory retention, it’s easier to build agents that provide seamless, adaptive, and personalized experiences that continuously improve over time. Let’s see how this works in practice.
Using memory retention in Agents for Amazon Bedrock
In the Amazon Bedrock console, I choose Agents from the Builder Tools section of the navigation pane and start creating an agent.
For the agent, I use agent-book-flight as the name with this as description:
Help book a flight.
Then, in the agent builder, I select Anthropic’s Claude 3 Sonnet model and enter these instructions:
To book a flight, you should know the origin and destination airports and the day and time the flight takes off.
In Additional settings, I enable User input to allow the agent to ask clarifying questions to capture necessary inputs. This will help when a request to book a flight misses some necessary information such as the origin and destination or the date and time of the flight.
In the new Memory section, I enable memory to generate and store a session summary at the end of each session and use the default 30 days for memory duration.
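For readers who configure agents through an SDK rather than the console, the memory settings above can be expressed as a small dictionary. This is a minimal sketch, not the console's actual request: it assumes the `memoryConfiguration` field names (`enabledMemoryTypes`, `storageDays`) of the `bedrock-agent` CreateAgent and UpdateAgent operations, and the helper name `build_memory_configuration` is hypothetical. Check the current SDK reference before relying on it.

```python
def build_memory_configuration(storage_days=30):
    """Return the memory settings used in this walkthrough as a plain dict.

    Assumption: the keys mirror the `memoryConfiguration` parameter of the
    bedrock-agent CreateAgent/UpdateAgent operations.
    """
    return {
        # Store a summary of each session when it ends
        "enabledMemoryTypes": ["SESSION_SUMMARY"],
        # How long summaries are kept (30 days is the default in the console)
        "storageDays": storage_days,
    }


memory_configuration = build_memory_configuration()
print(memory_configuration)
```

The resulting dictionary would then be passed as `memoryConfiguration=...` when creating or updating the agent, alongside the other required agent parameters.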
Then, I add an action group to search and book flights. I use search-and-book-flights as name and this description:
Search for flights between two destinations on a given day and book a specific flight.
Then, I choose to define the action group with function details and then to create a new Lambda function. The Lambda function will implement the business logic for all the functions in this action group.
I add two functions to this action group: one to search for flights and another to book flights.
The first function is search-for-flights and has this description:
Search for flights on a given date between two destinations.
All parameters of this function are required and of type string. Here are the parameters’ names and descriptions:
origin_airport – Origin IATA airport code
destination_airport – Destination IATA airport code
date – Date of the flight in YYYYMMDD format
The second function is book-flight and uses this description:
Book a flight at a given date and time between two destinations.
Again, all parameters are required and of type string. These are the names and descriptions for the parameters:
origin_airport – Origin IATA airport code
destination_airport – Destination IATA airport code
date – Date of the flight in YYYYMMDD format
time – Time of the flight in HHMM format
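The two functions described above can also be written down as data, which is handy for keeping them under version control. The sketch below assumes the shape of the `functionSchema` parameter of the `bedrock-agent` `create_agent_action_group` operation (a list of functions, each with a map of parameters); the helper `string_param` is hypothetical and only exists to avoid repetition.

```python
def string_param(description):
    """Every parameter in this walkthrough is a required string."""
    return {"type": "string", "description": description, "required": True}


# Assumed shape of the functionSchema for the search-and-book-flights action group
FUNCTION_SCHEMA = {
    "functions": [
        {
            "name": "search-for-flights",
            "description": "Search for flights on a given date between two destinations.",
            "parameters": {
                "origin_airport": string_param("Origin IATA airport code"),
                "destination_airport": string_param("Destination IATA airport code"),
                "date": string_param("Date of the flight in YYYYMMDD format"),
            },
        },
        {
            "name": "book-flight",
            "description": "Book a flight at a given date and time between two destinations.",
            "parameters": {
                "origin_airport": string_param("Origin IATA airport code"),
                "destination_airport": string_param("Destination IATA airport code"),
                "date": string_param("Date of the flight in YYYYMMDD format"),
                "time": string_param("Time of the flight in HHMM format"),
            },
        },
    ]
}

for function in FUNCTION_SCHEMA["functions"]:
    print(function["name"], sorted(function["parameters"]))
```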
To complete the creation of the agent, I choose Create.
To access the source code of the Lambda function, I choose the search-and-book-flights action group and then View (near the Select Lambda function settings). Normally, I’d use this Lambda function to integrate with an existing system such as a travel booking platform. In this case, I use this code to simulate a booking platform for the agent.
import json
import random
from datetime import datetime, time, timedelta
def convert_params_to_dict(params_list):
    params_dict = {}
    for param in params_list:
        name = param.get("name")
        value = param.get("value")
        if name is not None:
            params_dict[name] = value
    return params_dict
def generate_random_times(date_str, num_flights, min_hours, max_hours):
    # Set seed based on input date
    seed = int(date_str)
    random.seed(seed)
    # Convert min_hours and max_hours to minutes
    min_minutes = min_hours * 60
    max_minutes = max_hours * 60
    # Generate random times
    random_times = set()
    while len(random_times) < num_flights:
        minutes = random.randint(min_minutes, max_minutes)
        hours, mins = divmod(minutes, 60)
        time_str = f"{hours:02d}{mins:02d}"
        random_times.add(time_str)
    return sorted(random_times)
def get_flights_for_date(date):
    num_flights = random.randint(1, 6) # Between 1 and 6 flights per day
    min_hours = 6 # 6am
    max_hours = 22 # 10pm
    flight_times = generate_random_times(date, num_flights, min_hours, max_hours)
    return flight_times
def get_days_between(start_date, end_date):
    # Convert string dates to datetime objects
    start = datetime.strptime(start_date, "%Y%m%d")
    end = datetime.strptime(end_date, "%Y%m%d")
    # Calculate the number of days between the dates
    delta = end - start
    # Generate a list of all dates between start and end (inclusive)
    date_list = [start + timedelta(days=i) for i in range(delta.days + 1)]
    # Convert datetime objects back to "YYYYMMDD" string format
    return [date.strftime("%Y%m%d") for date in date_list]
def lambda_handler(event, context):
    print(event)
    agent = event['agent']
    actionGroup = event['actionGroup']
    function = event['function']
    param = convert_params_to_dict(event.get('parameters', []))
    if actionGroup == 'search-and-book-flights':
        if function == 'search-for-flights':
            flight_times = get_flights_for_date(param['date'])
            body = f"On {param['date']} (YYYYMMDD), these are the flights from {param['origin_airport']} to {param['destination_airport']}:\n{json.dumps(flight_times)}"
        elif function == 'book-flight':
            body = f"Flight from {param['origin_airport']} to {param['destination_airport']} on {param['date']} (YYYYMMDD) at {param['time']} (HHMM) booked and confirmed."
        elif function == 'get-flights-in-date-range':
            days = get_days_between(param['start_date'], param['end_date'])
            flights = {}
            for day in days:
                flights[day] = get_flights_for_date(day)
            body = f"These are the times (HHMM) for all the flights from {param['origin_airport']} to {param['destination_airport']} between {param['start_date']} (YYYYMMDD) and {param['end_date']} (YYYYMMDD) in JSON format:\n{json.dumps(flights)}"
        else:
            body = f"Unknown function {function} for action group {actionGroup}."
    else:
        body = f"Unknown action group {actionGroup}."
    # Format the output as expected by the agent
    responseBody = {
        "TEXT": {
            "body": body
        }
    }
    action_response = {
        'actionGroup': actionGroup,
        'function': function,
        'functionResponse': {
            'responseBody': responseBody
        }
    }
    function_response = {'response': action_response, 'messageVersion': event['messageVersion']}
    print(f"Response: {function_response}")
    return function_response
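Before deploying, it can help to exercise a handler like this one locally. The sketch below builds an event with the fields the handler reads (`agent`, `actionGroup`, `function`, `parameters`, `messageVersion`) and unpacks the response body. The helper names, the placeholder agent details, and the `"1.0"` message version are assumptions for illustration, and `stub_handler` is only a stand-in so that the snippet runs on its own; in practice you would call `lambda_handler` instead.

```python
def make_event(function, **params):
    """Build a test event in the shape the handler reads (values are placeholders)."""
    return {
        "messageVersion": "1.0",
        "agent": {"name": "agent-book-flight"},
        "actionGroup": "search-and-book-flights",
        "function": function,
        "parameters": [
            {"name": name, "type": "string", "value": value}
            for name, value in params.items()
        ],
    }


def extract_body(response):
    """Pull the text body out of a function-details response."""
    return response["response"]["functionResponse"]["responseBody"]["TEXT"]["body"]


def stub_handler(event, context):
    """Stand-in for lambda_handler that echoes what it received."""
    params = {p["name"]: p["value"] for p in event["parameters"]}
    body = f"Received {event['function']} with {sorted(params)}"
    return {
        "messageVersion": event["messageVersion"],
        "response": {
            "actionGroup": event["actionGroup"],
            "function": event["function"],
            "functionResponse": {"responseBody": {"TEXT": {"body": body}}},
        },
    }


event = make_event(
    "search-for-flights",
    origin_airport="LHR",
    destination_airport="FCO",
    date="20240720",
)
print(extract_body(stub_handler(event, None)))
```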
I prepare the agent to test it in the console and ask this question:
Which flights are available from London Heathrow to Rome Fiumicino on July 20th, 2024?
The agent replies with a list of times. I choose Show trace to get more information about how the agent processed my instructions.
In the Trace tab, I explore the trace steps to understand the chain of thought used by the agent’s orchestration. For example, here I see that the agent handled the conversion of the airport names to codes (LHR for London Heathrow, FCO for Rome Fiumicino) before calling the Lambda function.
In the new Memory tab, I see the content of the memory. The console uses a specific test memory ID. In an application, to keep memory separated for each user, I can use a different memory ID for every user.
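One way to get a different memory ID for every user is to derive it deterministically from the application's own user identifier, so the same user always maps to the same memory. This is a minimal sketch under that assumption; the namespace, the `user-` prefix, and the function name `memory_id_for_user` are all hypothetical, and the resulting ID uses only letters, digits, and hyphens.

```python
import uuid

# Hypothetical namespace for this application; any fixed UUID would work
APP_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "flights.example.com")


def memory_id_for_user(user_id):
    """Map an application user ID to a stable, per-user memory ID."""
    return f"user-{uuid.uuid5(APP_NAMESPACE, user_id)}"


print(memory_id_for_user("alice@example.com"))
print(memory_id_for_user("bob@example.com"))
```

Deriving the ID instead of storing a random one avoids a lookup table, at the cost of not being able to rotate a user's memory ID without changing the namespace or prefix.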
I look at the list of flights and ask to book one:
Book the one at 6:02pm.
The agent replies confirming the booking.
A few minutes later, after the session has expired, I see a summary of my conversation in the Memory tab.
I choose the broom icon to start with a new conversation and ask a question that, by itself, doesn’t provide a full context to the agent:
Which other flights are available on the day of my flight?
The agent recalls the flight that I booked from our previous conversation. To provide me with an answer, the agent asks me to confirm the flight details. Note that the Lambda function is just a simulation and didn’t store the booking information in any database. The flight details were retrieved from the agent’s memory.
I confirm those values and get the list of the other flights with the same origin and destination on that day.
Yes, please.
To better demonstrate the benefits of memory retention, let’s call the agent using the AWS SDK for Python (Boto3). To do so, I first need to create an agent alias and version. I write down the agent ID and the alias ID because they are required when invoking the agent.
In the agent invocation, I add the new memoryId option to use memory. By including this option, I get two benefits:
- The memory retained for that memoryId (if any) is used by the agent to improve its response.
- A summary of the conversation for the current session is retained for that memoryId so that it can be used in another session.
Using an AWS SDK, I can also get the content or delete the content of the memory for a specific memoryId.
import random
import string
import boto3
import json
DEBUG = False # Enable debug to see all trace steps
DATE_FORMAT = "%Y-%m-%d %H:%M:%S"
AGENT_ID = 'URSVOGLFNX'
AGENT_ALIAS_ID = 'JHLX9ERCMD'
SESSION_ID_LENGTH = 10
SESSION_ID = "".join(
    random.choices(string.ascii_uppercase + string.digits, k=SESSION_ID_LENGTH)
)
# A unique identifier for each user
MEMORY_ID = 'danilop-92f79781-a3f3-4192-8de6-890b67c63d8b'
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')
def invoke_agent(prompt, end_session=False):
    response = bedrock_agent_runtime.invoke_agent(
        agentId=AGENT_ID,
        agentAliasId=AGENT_ALIAS_ID,
        sessionId=SESSION_ID,
        inputText=prompt,
        memoryId=MEMORY_ID,
        enableTrace=DEBUG,
        endSession=end_session,
    )
    completion = ""
    for event in response.get('completion'):
        if DEBUG:
            print(event)
        if 'chunk' in event:
            chunk = event['chunk']
            completion += chunk['bytes'].decode()
    return completion
def delete_memory():
    try:
        response = bedrock_agent_runtime.delete_agent_memory(
            agentId=AGENT_ID,
            agentAliasId=AGENT_ALIAS_ID,
            memoryId=MEMORY_ID,
        )
    except Exception as e:
        print(e)
        return None
    if DEBUG:
        print(response)
def get_memory():
    response = bedrock_agent_runtime.get_agent_memory(
        agentId=AGENT_ID,
        agentAliasId=AGENT_ALIAS_ID,
        memoryId=MEMORY_ID,
        memoryType='SESSION_SUMMARY',
    )
    memory = ""
    for content in response['memoryContents']:
        if 'sessionSummary' in content:
            s = content['sessionSummary']
            memory += f"Session ID {s['sessionId']} from {s['sessionStartTime'].strftime(DATE_FORMAT)} to {s['sessionExpiryTime'].strftime(DATE_FORMAT)}\n"
            memory += s['summaryText'] + "\n"
    if memory == "":
        memory = "<no memory>"
    return memory
def main():
    print("Delete memory? (y/n)")
    if input() == 'y':
        delete_memory()
    print("Memory content:")
    print(get_memory())
    prompt = input('> ')
    if len(prompt) > 0:
        print(invoke_agent(prompt, end_session=False)) # Start a new session
        invoke_agent('end', end_session=True) # End the session
if __name__ == "__main__":
    main()
I run the Python script from my laptop. I choose to delete the current memory (even if it should be empty for now) and then ask to book a morning flight on a specific date.
Delete memory? (y/n)
y
Memory content:
<no memory>
> Book me on a morning flight on July 20th, 2024 from LHR to FCO.
I have booked you on the morning flight from London Heathrow (LHR) to Rome Fiumicino (FCO) on July 20th, 2024 at 06:44.
I wait a couple of minutes and run the script again. The script creates a new session every time it’s run. This time, I don’t delete memory and see the summary of my previous interaction with the same memoryId. Then, I ask on which date my flight is scheduled. Even though this is a new session, the agent finds the previous booking in the content of the memory.
Delete memory? (y/n)
n
Memory content:
Session ID MM4YYW0DL2 from 2024-07-09 15:35:47 to 2024-07-09 15:35:58
The user's goal was to book a morning flight from LHR to FCO on July 20th, 2024. The assistant booked a 0644 morning flight from LHR to FCO on the requested date of July 20th, 2024. The assistant successfully booked the requested morning flight for the user. The user requested a morning flight booking on July 20th, 2024 from London Heathrow (LHR) to Rome Fiumicino (FCO). The assistant booked a 0644 flight for the specified route and date.
> Which date is my flight on?
I recall from our previous conversation that you booked a morning flight from London Heathrow (LHR) to Rome Fiumicino (FCO) on July 20th, 2024. Please confirm if this date of July 20th, 2024 is correct for the flight you are asking about.
Yes, that’s my flight!
Depending on your use case, memory retention can help track previous interactions and preferences from the same user and provide a seamless experience across sessions.
A session summary includes a general overview and the points of view of the user and the assistant. For a short session such as this one, this can cause some repetition.
Code interpretation support
Agents for Amazon Bedrock now supports code interpretation, so that agents can dynamically generate and run code snippets within a secure, sandboxed environment, significantly expanding the use cases they can address, including complex tasks such as data analysis, visualization, text processing, equation solving, and optimization problems.
Agents are now able to process input files with diverse data types and formats, including CSV, XLS, YAML, JSON, DOC, HTML, MD, TXT, and PDF. Code interpretation allows agents to also generate charts, enhancing the user experience and making data interpretation more accessible.
Code interpretation is used by an agent when the large language model (LLM) determines it can help solve a specific problem more accurately. By design, it does not support scenarios where users request arbitrary code generation. For security, each user session is provided with an isolated, sandboxed code runtime environment.
Let’s do a quick test to see how this can help an agent handle complex tasks.
Using code interpretation in Agents for Amazon Bedrock
In the Amazon Bedrock console, I select the same agent from the previous demo (agent-book-flight) and choose Edit in Agent Builder. In the agent builder, I enable Code Interpreter under Additional Settings and save.
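Outside the console, code interpretation is, to my understanding, enabled by adding a built-in action group to the agent. The sketch below only builds the request parameters; it assumes the `parentActionGroupSignature` value `AMAZON.CodeInterpreter` accepted by the `bedrock-agent` `create_agent_action_group` operation. The action group name and the helper name are hypothetical, and the agent ID in the example is the one used by the script earlier in this post.

```python
def code_interpreter_action_group(agent_id, agent_version="DRAFT"):
    """Build the parameters for a built-in code interpreter action group.

    Assumption: these keys match the create_agent_action_group operation.
    """
    return {
        "agentId": agent_id,
        "agentVersion": agent_version,
        # Any descriptive name works here; this one is made up
        "actionGroupName": "CodeInterpreterAction",
        "parentActionGroupSignature": "AMAZON.CodeInterpreter",
        "actionGroupState": "ENABLED",
    }


params = code_interpreter_action_group("URSVOGLFNX")
print(params)
```

With Boto3, these parameters would be passed as `boto3.client('bedrock-agent').create_agent_action_group(**params)`, followed by preparing the agent again so the change takes effect.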
I prepare the agent and test it straight in the console. First, I ask a mathematical question.
Compute the sum of the first 10 prime numbers.
After a few seconds, I get the answer from the agent:
The sum of the first 10 prime numbers is 129.
That’s accurate. Looking at the traces, the agent built and ran this Python program to compute what I asked:
import math
def is_prime(n):
    if n < 2:
        return False
    for i in range(2, int(math.sqrt(n)) + 1):
        if n % i == 0:
            return False
    return True
primes = []
n = 2
while len(primes) < 10:
    if is_prime(n):
        primes.append(n)
    n += 1
print(f"The first 10 prime numbers are: {primes}")
print(f"The sum of the first 10 prime numbers is: {sum(primes)}")
Now, let’s go back to the agent-book-flight agent. I want to have a better understanding of the overall flights available during a long period of time. To do so, I start by adding a new function to the same action group to get all the flights available in a date range.
I name the new function get-flights-in-date-range and use this description:
Get all the flights between two destinations for each day in a date range.
All the parameters are required and of type string. These are the parameters’ names and descriptions:
origin_airport – Origin IATA airport code
destination_airport – Destination IATA airport code
start_date – Start date of the flight in YYYYMMDD format
end_date – End date of the flight in YYYYMMDD format
If you look at the Lambda function code I shared earlier, you’ll find that it already supports this agent function.
Now that the agent has a way to extract more information with a single function call, I ask the agent to visualize flight information data in a chart:
Draw a chart with the number of flights each day from JFK to SEA for the first ten days of August, 2024.
The agent reply includes a chart:
I choose the link to download the image on my computer:
That’s correct. In fact, the simulator in the Lambda function generates between one and six flights per day as shown in the chart.
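The counting step behind such a chart is small. The sketch below is not the code the agent generated; it is an illustration of how the JSON returned by get-flights-in-date-range (a map from date to a list of HHMM times) can be reduced to flights per day and checked against the simulator's one-to-six range. The sample data is made up, and a text bar stands in for the chart.

```python
import json


def flights_per_day(flights_json):
    """Count the flights for each day in a {date: [times]} JSON document."""
    flights = json.loads(flights_json)
    return {day: len(times) for day, times in sorted(flights.items())}


# Made-up sample in the same shape as the Lambda function's output
sample = json.dumps({
    "20240802": ["0915"],
    "20240801": ["0644", "1802"],
    "20240803": ["0700", "1130", "2145"],
})

counts = flights_per_day(sample)
for day, count in counts.items():
    print(day, "#" * count)
```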
Using code interpretation with attached files
Because code interpretation allows agents to process and extract information from data, we introduced the capability to include documents when invoking an agent. For example, I have an Excel file with the number of flights booked on different routes:
Origin | Destination | Number of flights
LHR | FCO | 636
FCO | LHR | 456
JFK | SEA | 921
SEA | JFK | 544
Using the clip icon in the test interface, I attach the file and ask (the agent replies in bold):
What is the most popular route? And the least one?
Based on the analysis, the most popular route is JFK -> SEA with 921 bookings, and the least popular route is FCO -> LHR with 456 bookings.
How many flights in total have been booked?
The total number of booked flights across all routes is 2557.
Draw a chart comparing the % of flights booked for these routes compared to the total number.
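The agent's answers can be cross-checked by hand. This short sketch is not the code the agent ran; it recomputes the most and least popular routes, the total, and each route's share of the total from the four rows of the file.

```python
# The rows of the Excel file: (origin, destination, number of flights)
ROUTES = [
    ("LHR", "FCO", 636),
    ("FCO", "LHR", 456),
    ("JFK", "SEA", 921),
    ("SEA", "JFK", 544),
]

total = sum(count for _, _, count in ROUTES)
most_popular = max(ROUTES, key=lambda route: route[2])
least_popular = min(ROUTES, key=lambda route: route[2])

# Share of the total for each route, as a percentage rounded to one decimal
shares = {
    f"{origin} -> {destination}": round(100 * count / total, 1)
    for origin, destination, count in ROUTES
}

print(f"Total: {total}")
print(f"Most popular: {most_popular}")
print(f"Least popular: {least_popular}")
for route, share in shares.items():
    print(f"{route}: {share}%")
```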
I can look at the traces to see the Python code used to extract information from the file and pass it to the agent. I can attach more than one file and use different file formats. These options are available in AWS SDKs to let agents use files in your applications.
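To give an idea of what attaching a file looks like from code, the sketch below builds the session state for an agent invocation. It assumes the `sessionState` structure of the `bedrock-agent-runtime` `invoke_agent` operation, where each file has a `name`, a `source` with inline bytes, and a `useCase` of `CODE_INTERPRETER`. The helper names and the CSV sample are made up for illustration.

```python
def attach_file(name, data, media_type):
    """Describe one file, passed inline as bytes, for the code interpreter."""
    return {
        "name": name,
        "source": {
            "sourceType": "BYTE_CONTENT",
            "byteContent": {"data": data, "mediaType": media_type},
        },
        "useCase": "CODE_INTERPRETER",
    }


def build_session_state(files):
    """Wrap the file descriptions in the assumed sessionState structure."""
    return {"files": list(files)}


# Made-up CSV version of the bookings file
csv_bytes = b"Origin,Destination,Number of flights\nLHR,FCO,636\nFCO,LHR,456\n"
session_state = build_session_state([attach_file("bookings.csv", csv_bytes, "text/csv")])
print(session_state["files"][0]["name"])
```

The resulting dictionary would be passed as `sessionState=session_state` in the same `invoke_agent` call used by the script earlier in this post.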
Things to Know
Memory retention is available in preview in all AWS Regions where Agents for Amazon Bedrock and Anthropic’s Claude 3 Sonnet or Haiku (the models supported during the preview) are available. Code interpretation is available in preview in the US East (N. Virginia), US West (Oregon), and Europe (Frankfurt) Regions.
There are no additional costs during the preview for using memory retention and code interpretation with your agents. When using agents with these features, normal model use charges apply. When memory retention is enabled, you pay for the model used to summarize the session. For more information, see the Amazon Bedrock Pricing page.
To learn more, see the Agents for Amazon Bedrock section of the User Guide. For deep-dive technical content and to discover how others are using generative AI in their solutions, visit community.aws.
— Danilo