Tag Archives: Amazon MemoryDB

Amazon MemoryDB Multi-Region is now generally available

2024-12-01 Betty Zheng (郑予彬)

Post Syndicated from Betty Zheng (郑予彬) original https://aws.amazon.com/blogs/aws/amazon-memorydb-multi-region-is-now-generally-available/

Providing highly available applications while maintaining low latency reads and writes across AWS Regions is a common challenge faced by many customers. Accessing data from different Regions can cause a delay of hundreds of milliseconds compared to microseconds within the same Region. The necessity for developers to create complex custom solutions for data replication and conflict resolution can lead to increased operational workload and potential errors. Beyond multi-Region replication, these customers have to implement manual database failover procedures and provide data consistency and recovery to deliver highly available applications and data durability.

Today, Amazon Web Services (AWS) announced the general availability of Amazon MemoryDB Multi-Region, a fully managed, active-active, multi-Region database that you can use to build applications with up to 99.999 percent availability, microsecond read, and single-digit millisecond write latencies across multiple AWS Regions. MemoryDB Multi-Region is available for Valkey, which is a Redis Open Source Software (OSS) drop-in replacement stewarded by Linux Foundation. This new feature builds upon the existing benefits of Amazon MemoryDB, such as multi-AZ durability and high throughput across multiple AWS Regions, and addresses these common challenges faced by many customers.

In this post, we discuss the benefits of MemoryDB Multi-Region and demonstrate how to get started with it using the AWS Management Console and the AWS Command Line Interface (AWS CLI).

Benefits of MemoryDB Multi-Region

MemoryDB Multi-Region provides the following benefits to customers:

High availability and disaster recovery – With MemoryDB Multi-Region, you can build applications with up to 99.999 percent availability. It also makes sure that if an application is unable to connect to MemoryDB in a local Region, the application can connect to MemoryDB from another AWS Regional endpoint with full read and write access to the data. When the application reconnects to the original MemoryDB Regional endpoint, MemoryDB Multi-Region will automatically synchronize data across all AWS Regions.
Microsecond read and single-digit millisecond write latency for multi-Region distributed applications – MemoryDB Multi-Region offers active-active replication, so you can serve both reads and writes locally from the Regions closest to your customers with microsecond read and single-digit millisecond write latency at any scale. It automatically replicates data asynchronously between AWS Regions with data typically propagated in less than one second.
Adhere to compliance and regulatory requirements where data needs to reside in a specific geography – There are compliance and regulatory requirements under which data needs to be within a geographic location. MemoryDB Multi-Region can help you meet these requirements as it allows customers to choose which region they want their data to reside.

Getting started with Amazon MemoryDB Multi-Region

Setting up MemoryDB Multi-Region is straightforward and can be accomplished through the AWS Management Console, AWS SDK, or AWS CLI.

Getting started with MemoryDB Multi-Region using the console

To set up your MemoryDB Multi-Region cluster using the console, complete the following steps:

On the MemoryDB console, choose Clusters in the navigation pane, choose Create cluster, select Multi-Region cluster for Cluster type, and Create new cluster for the Cluster creation method.

You can select the Node type and number of shards based on your workload requirement when you set up your Multi-Region cluster.

Create the Regional cluster within your Multi-Region cluster with the appropriate cluster settings.

You can add a second Regional cluster to your Multi-Region cluster by choosing Add AWS region after the Multi-Region cluster and the first Regional cluster are set up.

When the cluster creation workflow finishes successfully, you can observe that there are two Regional clusters within the Multi-Region cluster.

Here are the steps to get started using the AWS CLI

To begin, create a new MemoryDB Multi-Region cluster:

aws memorydb create-multi-region-cluster \
--multi-region-cluster-name-suffix testmrrlp \
--endpoint-url https://elasticache-qa.us-east-1.amazonaws.com \
--description "testdescription" \
--node-type db.r7g.xlarge \
--region us-east-1 \
--no-verify-ssl

Next, create a Regional cluster in the Multi-Region cluster:

aws memorydb create-cluster \
--cluster-name testmrrlp-member1 \
--multi-region-cluster-name ldgnf-testmrrlp \
--node-type db.r7g.xlarge \
--num-replicas-per-shard 1 \
--snapshot-retention-limit 10 \
--endpoint-url <value> \
--acl-name open-access \
--region us-east-1 \
--no-verify-ssl

After verifying the successful creation of the first cluster, create the second cluster in a different Region:

aws memorydb create-cluster \
--cluster-name testmrrlp-member2 \
--multi-region-cluster-name ldgnf-testmrrlp \
--node-type db.r7g.xlarge \
--num-replicas-per-shard 1 \
--snapshot-retention-limit 10 \
--endpoint-url https://elmo-qa.fra.aws-border.com \
--acl-name open-access \
--region eu-central-1 \
--no-verify-ssl

Check the status of the Multi-Region cluster:

aws memorydb describe-multi-region-clusters \
--multi-region-cluster-name ldgnf-testmrrlp \
--region us-east-1 \
--show-member-cluster-details \
--endpoint-url https://elasticache-qa.us-east-1.amazonaws.com \
--no-verify-ssl

Now available

Amazon MemoryDB Multi-Region is available for Valkey and in the following AWS Regions: US East (N. Virginia, Ohio), US West (N. California, Oregon), Asia Pacific (Mumbai, Seoul, Singapore, Sydney, Tokyo), and Europe (Frankfurt, Ireland, London).

To learn more, visit the MemoryDB features page and documentation. For pricing, refer to Amazon MemoryDB pricing.

–Betty

AWS Weekly Roundup: What’s App, AWS Lambda, Load Balancers, AWS Console, and more (Oct 14, 2024).

2024-10-14 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-whats-app-aws-lambda-load-balancers-aws-console-and-more-oct-14-2024/

Last week, AWS hosted free half-day conferences in London and Paris. My colleagues and I demonstrated how developers can use generative AI tools to speed up their design, analysis, code writing, debugging, and deployment workflows. These events were held at the GenAI Lofts. These lofts are open until October 25 (London) and November 5 (Paris). They will be packed with events, conferences, workshops, and meetups. If you’re around, be sure to check the agenda (London, Paris).

Veliswa live coding on stage at NGDE Day London

Our well-known AWS News blog co-author Veliswa did an amazing demo. She live-coded a Duolingo-like app from scratch, just using suggestions and reviews from Amazon Q Developer.

Now, let’s turn to other exciting news in the AWS universe from last week.

Last week’s launches
Here are some launches that got my attention:

Bring your conversations to WhatsApp — AWS has added support for What’sApp to AWS End User Messaging, so developers can reach users on WhatsApp with multimedia and interactive messaging options. This feature integrates with SMS and push notifications already available. Developers can get started quickly using AWS Management Console.

Amazon Redshift data sharing with data lake tables — This offers a secure and convenient way to share live data lake tables across different Amazon Redshift warehouses. Data sharing of data lake tables in AWS Glue Data Catalog provides live access to the data, so you always see the most up-to-date and consistent information as it’s updated in the data lake.

Zonal shift and zonal autoshift for cross zoned Network Load Balancer — Network Load Balancer (NLB) now supports the Amazon Application Recovery Controller zonal shift and zonal autoshift features on load balancers that are enabled across zones. With Zonal shift, you can quickly shift traffic away from an impaired Availability Zone and recover from events such as bad application deployment and gray failures. Zonal autoshift safely and automatically shifts your traffic away from an Availability Zone when AWS identifies a potential impact to it.

Console to Code to generate infrastructure as a service code — This is by far my favorite launch of the week. Console to Code makes it simple, fast, and cost-effective to move from prototyping in the AWS Management Console to building code for production deployments. You can generate code for their console actions in their preferred format with a single click. The generated code helps you get started and bootstrap your automation pipelines for tasks. Console to Code is powered by Amazon Q Developer.

A new getting started experience for AWS CodePipeline — AWS Data Pipeline introduces a simplified and new getting started experience so you can quickly create new pipelines. When you create a new pipeline using the CodePipeline console, you can now select from a list of pipeline templates across build, automation, and deployment use cases. After selecting a pipeline template, you will be prompted to enter values for the action configuration fields in the pipeline definition, and completing the process will render a fully configured pipeline that’s ready to run.

AWS Lambda detects and stops recursive loops between Lambda and Amazon S3 — Lambda recursive loop detection can now automatically detect and stop recursive loops between AWS Lambda and Amazon Simple Storage Service (Amazon S3). Lambda recursive loop detection, which is enabled by default, is a preventative guardrail that automatically detects and stops recursive invocations between Lambda and other supported services, preventing unintended usage and billing from runaway workloads.

Amazon MemoryDB for Valkey — Amazon MemoryDB for Redis is a fully managed, Valkey– and Redis OSS-compatible database service, which provides multi-AZ durability, microsecond read and single-digit millisecond write latency, and high throughput. It is ideal for use cases such as caching, leaderboards, and session stores. With MemoryDB for Valkey, you can benefit from a fully managed experience built on open-source technology while using the security, operational excellence, and reliability that AWS provides. MemoryDB for Valkey also delivers the fastest vector search performance at the highest recall rates among popular vector databases on AWS.

Amazon Polly adds four wew English voices for the generative engine and expands to three Regions — Polly is a managed service that turns text into lifelike speech, so you can create applications that talk and to build speech-enabled products depending on your business needs. The generative engine is the most advanced Amazon Polly text-to-speech (TTS) model. With this launch, we add a variety of new synthetic generative English voices to the Amazon Polly portfolio: an Australian English voice Olivia and three US English voices Joanna, Danielle, and Stephen. These voices have more natural pronunciation and prosody. You can use this high-tier product in various industries and for different purposes such as education, publishing, or marketing.

For a full list of AWS announcements, be sure to keep an eye on the AWS What’s New Feed page.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS Cloud Day Prague — Join us for a free technical conferences in Prague on October 23. I will be there and share with attendees “The Art of Transforming a Foundation Model into a Domain Expert”. Be sure to register today!

Innovate Migrate, Modernize, and Build — Whether you are new to the cloud or an experienced user, you will learn something new at AWS Innovate. This is a free online conference. Register for a time and region convenient to North America (October 15), or Europe, Middle East & Africa (October 24).

AWS Community Days — Join community-led conferences featuring technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world. Don’t miss out on the AWS Community Days happening on October 19 in Vadodara, Spain, and Guatemala.

AWS re:Invent 2024 — Registration is now open for the annual tech extravaganza, taking place December 2 – 6 in Las Vegas. Beside recording podcast episodes, I will present three sessions:

CMP410 | Accelerate testing cycles of CI/CD pipelines with EC2 Mac instances (with Vishal)
DEV301 | The art of transforming foundation models into domain experts (with Gregory)
DEV334 | Swift, server-side, serverless

There are just a few seats left for these three sessions, so be sure to book your seat today!

Browse more upcoming AWS led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— seb

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

AWS Weekly Roundup: Advanced capabilities in Amazon Bedrock and Amazon Q, and more (July 15, 2024).

2024-07-15 Abhishek Gupta

Post Syndicated from Abhishek Gupta original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-advanced-capabilities-in-amazon-bedrock-and-amazon-q-and-more-july-15-2024/

As expected, there were lots of exciting launches and updates announced during the AWS Summit New York. You can quickly scan the highlights in Top Announcements of the AWS Summit in New York, 2024.

My colleagues and fellow AWS News Blog writers Veliswa Boya and Sébastien Stormacq were at the AWS Community Day Cameroon last week. They were energized to meet amazing professionals, mentors, and students – all willing to learn and exchange thoughts about cloud technologies. You can access the video replay to feel the vibes or just watch some of the talks!

Last week’s launches
In addition to the launches at the New York Summit, here are a few others that got my attention.

Advanced RAG capabilities Knowledge Bases for Amazon Bedrock – These include custom chunking options to enable customers to write their own chunking code as a Lambda function; smart parsing to extract information from complex data such as tables; and query reformulation to break down queries into simpler sub-queries, retrieve relevant information for each, and combine the results into a final comprehensive answer.

Amazon Bedrock Prompt Management and Prompt Flows – This is a preview launch of Prompt Management that help developers and prompt engineers get the best responses from foundation models for their use cases; and Prompt Flows accelerates the creation, testing, and deployment of workflows through an intuitive visual builder.

Fine-tuning for Anthropic’s Claude 3 Haiku in Amazon Bedrock (preview) – By providing your own task-specific training dataset, you can fine tune and customize Claude 3 Haiku to boost model accuracy, quality, and consistency to further tailor generative AI for your business.

IDE workspace context awareness in Amazon Q Developer chat – Users can now add @workspace to their chat message in Q Developer to ask questions about the code in the project they currently have open in the IDE. Q Developer automatically ingests and indexes all code files, configurations, and project structure, giving the chat comprehensive context across your entire application within the IDE.

New features in Amazon Q Business – The new personalization capabilities in Amazon Q Business are automatically enabled and will use your enterprise’s employee profile data to improve their user experience. You can now get answers from text content in scanned PDFs, and images embedded in PDF documents, without having to use OCR for preprocessing and text extraction.

Amazon EC2 R8g instances powered by AWS Graviton4 are now generally available – Amazon EC2 R8g instances are ideal for memory-intensive workloads such as databases, in-memory caches, and real-time big data analytics. These are powered by AWS Graviton4 processors and deliver up to 30% better performance compared to AWS Graviton3-based instances.

Vector search for Amazon MemoryDB is now generally available – Vector search for MemoryDB enables real-time machine learning (ML) and generative AI applications. It can store millions of vectors with single-digit millisecond query and update latencies at the highest levels of throughput with >99% recall.

Introducing Valkey GLIDE, an open source client library for Valkey and Redis open source – Valkey is an open source key-value data store that supports a variety of workloads such as caching, and message queues. Valkey GLIDE is one of the official client libraries for Valkey and it supports all Valkey commands. GLIDE supports Valkey 7.2 and above, and Redis open source 6.2, 7.0, and 7.2.

Amazon OpenSearch Service enhancements – Amazon OpenSearch Serverless now supports workloads up to 30TB of data for time-series collections enabling more data-intensive use cases, and an innovative caching mechanism that automatically fetches and intelligently manages data, leading to faster data retrieval, efficient storage usage, and cost savings. Amazon OpenSearch Service has now added support for AI powered Natural Language Query Generation in OpenSearch Dashboards Log Explorer so you can get started quickly with log analysis without first having to be proficient in PPL.

Open source release of Secrets Manager Agent for AWS Secrets Manager – Secrets Manager Agent is a language agnostic local HTTP service that you can install and use in your compute environments to read secrets from Secrets Manager and cache them in memory, instead of making a network call to Secrets Manager.

Amazon S3 Express One Zone now supports logging of all events in AWS CloudTrail – This capability lets you get details on who made API calls to S3 Express One Zone and when API calls were made, thereby enhancing data visibility for governance, compliance, and operational auditing.

Amazon CloudFront announces managed cache policies for web applications – Previously, Amazon CloudFront customers had two options for managed cache policies, and had to create custom cache policies for all other cases. With the new managed cache policies, CloudFront caches content based on the Cache-Control headers returned by the origin, and defaults to not caching when the header is not returned.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

We launched existing services in additional Regions:

Amazon Relational Database Service (RDS) Data API for Aurora PostgreSQL is now available in 10 additional AWS regions.
Amazon Managed Workflows for Apache Airflow (MWAA) is now available in nine new AWS Regions.
Amazon Simple Notification Service (Amazon SNS) customers can now host their applications in Canada West (Calgary) region, and send text messages (SMS) to consumers in more than 200 countries and territories.
Amazon EMR support for backup and restore for Apache HBase Tables is available in Asia Pacific (Seoul) region.
Amazon Cognito is now available in Canada West (Calgary) and Asia Pacific (Hong Kong) regions.

Other AWS news
Here are some additional projects, blog posts, and news items that you might find interesting:

Context window overflow: Breaking the barrier – This blog post dives into intricate workings of generative artificial intelligence (AI) models, and why is it crucial to understand and mitigate the limitations of CWO (context window overflow).

Using Agents for Amazon Bedrock to interactively generate infrastructure as code – This blog post explores how Agents for Amazon Bedrock can be used to generate customized, organization standards-compliant IaC scripts directly from uploaded architecture diagrams.

Automating model customization in Amazon Bedrock with AWS Step Functions workflow – This blog post covers orchestrating repeatable and automated workflows for customizing Amazon Bedrock models and how AWS Step Functions can help overcome key pain points in model customization.

AWS open source news and updates – My colleague Ricardo Sueiras writes about open source projects, tools, and events from the AWS Community; check out Ricardo’s page for the latest updates.

Upcoming AWS events
Check your calendars and sign up for upcoming AWS events:

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. To learn more about future AWS Summit events, visit the AWS Summit page. Register in your nearest city: Bogotá (July 18), Taipei (July 23–24), AWS Summit Mexico City (Aug. 7), and AWS Summit Sao Paulo (Aug. 15).

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world. Upcoming AWS Community Days are in Aotearoa (Aug. 15), Nigeria (Aug. 24), New York (Aug. 28), and Belfast (Sept. 6).

Browse all upcoming AWS led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Abhishek

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Vector search for Amazon MemoryDB is now generally available

2024-07-10 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/vector-search-for-amazon-memorydb-is-now-generally-available/

Today, we are announcing the general availability of vector search for Amazon MemoryDB, a new capability that you can use to store, index, retrieve, and search vectors to develop real-time machine learning (ML) and generative artificial intelligence (generative AI) applications with in-memory performance and multi-AZ durability.

With this launch, Amazon MemoryDB delivers the fastest vector search performance at the highest recall rates among popular vector databases on Amazon Web Services (AWS). You no longer have to make trade-offs around throughput, recall, and latency, which are traditionally in tension with one another.

You can now use one MemoryDB database to store your application data and millions of vectors with single-digit millisecond query and update response times at the highest levels of recall. This simplifies your generative AI application architecture while delivering peak performance and reducing licensing cost, operational burden, and time to deliver insights on your data.

With vector search for Amazon MemoryDB, you can use the existing MemoryDB API to implement generative AI use cases such as Retrieval Augmented Generation (RAG), anomaly (fraud) detection, document retrieval, and real-time recommendation engines. You can also generate vector embeddings using artificial intelligence and machine learning (AI/ML) services like Amazon Bedrock and Amazon SageMaker and store them within MemoryDB.

Which use cases would benefit most from vector search for MemoryDB?
You can use vector search for MemoryDB for the following specific use cases:

1. Real-time semantic search for retrieval-augmented generation (RAG)
You can use vector search to retrieve relevant passages from a large corpus of data to augment a large language model (LLM). This is done by taking your document corpus, chunking them into discrete buckets of texts, and generating vector embeddings for each chunk with embedding models such as the Amazon Titan Multimodal Embeddings G1 model, then loading these vector embeddings into Amazon MemoryDB.

With RAG and MemoryDB, you can build real-time generative AI applications to find similar products or content by representing items as vectors, or you can search documents by representing text documents as dense vectors that capture semantic meaning.

2. Low latency durable semantic caching
Semantic caching is a process to reduce computational costs by storing previous results from the foundation model (FM) in-memory. You can store prior inferenced answers alongside the vector representation of the question in MemoryDB and reuse them instead of inferencing another answer from the LLM.

If a user’s query is semantically similar based on a defined similarity score to a prior question, MemoryDB will return the answer to the prior question. This use case will allow your generative AI application to respond faster with lower costs from making a new request to the FM and provide a faster user experience for your customers.

3. Real-time anomaly (fraud) detection
You can use vector search for anomaly (fraud) detection to supplement your rule-based and batch ML processes by storing transactional data represented by vectors, alongside metadata representing whether those transactions were identified as fraudulent or valid.

The machine learning processes can detect users’ fraudulent transactions when the net new transactions have a high similarity to vectors representing fraudulent transactions. With vector search for MemoryDB, you can detect fraud by modeling fraudulent transactions based on your batch ML models, then loading normal and fraudulent transactions into MemoryDB to generate their vector representations through statistical decomposition techniques such as principal component analysis (PCA).

As inbound transactions flow through your front-end application, you can run a vector search against MemoryDB by generating the transaction’s vector representation through PCA, and if the transaction is highly similar to a past detected fraudulent transaction, you can reject the transaction within single-digit milliseconds to minimize the risk of fraud.

Getting started with vector search for Amazon MemoryDB
Look at how to implement a simple semantic search application using vector search for MemoryDB.

Step 1. Create a cluster to support vector search
You can create a MemoryDB cluster to enable vector search within the MemoryDB console. Choose Enable vector search in the Cluster settings when you create or update a cluster. Vector search is available for MemoryDB version 7.1 and a single shard configuration.

Step 2. Create vector embeddings using the Amazon Titan Embeddings model
You can use Amazon Titan Text Embeddings or other embedding models to create vector embeddings, which is available in Amazon Bedrock. You can load your PDF file, split the text into chunks, and get vector data using a single API with LangChain libraries integrated with AWS services.

import redis
import numpy as np
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import BedrockEmbeddings

# Load a PDF file and split document
loader = PyPDFLoader(file_path=pdf_path)
        pages = loader.load_and_split()
        text_splitter = RecursiveCharacterTextSplitter(
            separators=["\n\n", "\n", ".", " "],
            chunk_size=1000,
            chunk_overlap=200,
        )
        chunks = loader.load_and_split(text_splitter)

# Create MemoryDB vector store the chunks and embedding details
client = RedisCluster(
        host=' mycluster.memorydb.us-east-1.amazonaws.com',
        port=6379,
        ssl=True,
        ssl_cert_reqs="none",
        decode_responses=True,
    )

embedding =  BedrockEmbeddings (
           region_name="us-east-1",
 endpoint_url=" https://bedrock-runtime.us-east-1.amazonaws.com",
    )

#Save embedding and metadata using hset into your MemoryDB cluster
for id, dd in enumerate(chucks*):
     y = embeddings.embed_documents([dd])
     j = np.array(y, dtype=np.float32).tobytes()
     client.hset(f'oakDoc:{id}', mapping={'embed': j, 'text': chunks[id] } )

Once you generate the vector embeddings using the Amazon Titan Text Embeddings model, you can connect to your MemoryDB cluster and save these embeddings using the MemoryDB HSET command.

Step 3. Create a vector index
To query your vector data, create a vector index using theFT.CREATE command. Vector indexes are also constructed and maintained over a subset of the MemoryDB keyspace. Vectors can be saved in JSON or HASH data types, and any modifications to the vector data are automatically updated in a keyspace of the vector index.

from redis.commands.search.field import TextField, VectorField

index = client.ft(idx:testIndex).create_index([
        VectorField(
            "embed",
            "FLAT",
            {
                "TYPE": "FLOAT32",
                "DIM": 1536,
                "DISTANCE_METRIC": "COSINE",
            }
        ),
        TextField("text")
        ]
    )

In MemoryDB, you can use four types of fields: numbers fields, tag fields, text fields, and vector fields. Vector fields support K-nearest neighbor searching (KNN) of fixed-sized vectors using the flat search (FLAT) and hierarchical navigable small worlds (HNSW) algorithm. The feature supports various distance metrics, such as euclidean, cosine, and inner product. We will use the euclidean distance, a measure of the angle distance between two points in vector space. The smaller the euclidean distance, the closer the vectors are to each other.

Step 4. Search the vector space
You can use FT.SEARCH and FT.AGGREGATE commands to query your vector data. Each operator uses one field in the index to identify a subset of the keys in the index. You can query and find filtered results by the distance between a vector field in MemoryDB and a query vector based on some predefined threshold (RADIUS).

from redis.commands.search.query import Query

# Query vector data
query = (
    Query("@vector:[VECTOR_RANGE $radius $vec]=>{$YIELD_DISTANCE_AS: score}")
     .paging(0, 3)
     .sort_by("vector score")
     .return_fields("id", "score")     
     .dialect(2)
)

# Find all vectors within 0.8 of the query vector
query_params = {
    "radius": 0.8,
    "vec": np.random.rand(VECTOR_DIMENSIONS).astype(np.float32).tobytes()
}

results = client.ft(index).search(query, query_params).docs

For example, when using cosine similarity, the RADIUS value ranges from 0 to 1, where a value closer to 1 means finding vectors more similar to the search center.

Here is an example result to find all vectors within 0.8 of the query vector.

[Document {'id': 'doc:a', 'payload': None, 'score': '0.243115246296'},
 Document {'id': 'doc:c', 'payload': None, 'score': '0.24981123209'},
 Document {'id': 'doc:b', 'payload': None, 'score': '0.251443207264'}]

To learn more, you can look at a sample generative AI application using RAG with MemoryDB as a vector store.

What’s new at GA
At re:Invent 2023, we released vector search for MemoryDB in preview. Based on customers’ feedback, here are the new features and improvements now available:

VECTOR_RANGE to allow MemoryDB to operate as a low latency durable semantic cache, enabling cost optimization and performance improvements for your generative AI applications.
SCORE to better filter on similarity when conducting vector search.
Shared memory to not duplicate vectors in memory. Vectors are stored within the MemoryDB keyspace and pointers to the vectors are stored in the vector index.
Performance improvements at high filtering rates to power the most performance-intensive generative AI applications.

Now available
Vector search is available in all Regions that MemoryDB is currently available. Learn more about vector search for Amazon MemoryDB in the AWS documentation.

Give it a try in the MemoryDB console and send feedback to the AWS re:Post for Amazon MemoryDB or through your usual AWS Support contacts.

— Channy

Noise