Tag Archives: news

Use natural language to explore and prepare data with a new capability of Amazon SageMaker Canvas

Post Syndicated from Irshad Buchh original https://aws.amazon.com/blogs/aws/use-natural-language-to-explore-and-prepare-data-with-a-new-capability-of-amazon-sagemaker-canvas/

Today, I’m happy to introduce the ability to use natural language instructions in Amazon SageMaker Canvas to explore, visualize, and transform data for machine learning (ML).

SageMaker Canvas now supports foundation model (FM)-powered natural language instructions that complement its comprehensive data preparation capabilities for data exploration, analysis, visualization, and transformation. Using natural language instructions, you can now explore and transform your data to build highly accurate ML models. This new capability is powered by Amazon Bedrock.

Data is the foundation for effective machine learning, and transforming raw data to make it suitable for ML model building and generating predictions is key to better insights. Analyzing, transforming, and preparing data to build ML models is often the most time-consuming part of the ML workflow. With SageMaker Canvas, you can prepare data for ML quickly and without writing any code, using 300+ built-in transforms, analyses, and an in-depth data quality insights report. Starting today, the process of data exploration and preparation is faster and simpler in SageMaker Canvas using natural language instructions for exploring, visualizing, and transforming data.

Data preparation tasks are now accelerated through a natural language experience using queries and responses. You can quickly get started with contextual, guided prompts to understand and explore your data.

Say I want to build an ML model to predict house prices using SageMaker Canvas. First, I need to prepare my housing dataset to build an accurate model. To get started with the new natural language instructions, I open the SageMaker Canvas application, and in the left navigation pane, I choose Data Wrangler. Under the Data tab and from the list of available datasets, I select the canvas-housing-sample.csv as the dataset, then select Create a data flow and choose Create. I see the tabular view of my dataset and an introduction to the new Chat for data prep capability.

data-flow

I select Chat for data prep, and it displays the chat interface with a set of guided prompts relevant to my dataset. I can use any of these prompts or query the data for something else.

chat-interface

First, I want to understand the quality of my dataset to identify any outliers or anomalies. I ask SageMaker Canvas to generate a data quality report to accomplish this task.

data-quality

I see there are no major issues with my data. I would now like to visualize the distribution of a couple of features in the data. I ask SageMaker Canvas to plot a chart.

query

I now want to filter certain rows to transform my data. I ask SageMaker Canvas to remove rows where the population is less than 1,000. Canvas removes those rows, shows me a preview of the transformed data, and also gives me the option to view and update the code that generated the transform.
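The code that Canvas generates for this step is not reproduced here. As a rough illustration only, a row filter of this kind could look like the following plain-Python sketch. The column names and the sample rows are assumptions made up for the example, and the code Canvas actually produces may look quite different.

```python
# Illustrative only: the rows and the "population" column name are assumed,
# and the transform code Canvas generates may differ.
rows = [
    {"population": 322, "median_house_value": 452600},
    {"population": 2401, "median_house_value": 358500},
    {"population": 1000, "median_house_value": 226700},
]


def drop_small_populations(records, minimum=1000):
    """Remove records whose population is less than `minimum`."""
    return [record for record in records if record["population"] >= minimum]


filtered = drop_small_populations(rows)
print(len(filtered))
```

Note that "less than 1,000" keeps a row whose population is exactly 1,000, which is why the comparison is `>=`.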

code-view

I am happy with the preview and add the transformed data to my list of data transform steps on the right. SageMaker Canvas adds the step along with the code.

transform

Now that my data is transformed, I can go on to build my ML model to predict house prices and even deploy the model into production using the same visual interface of SageMaker Canvas, without writing a single line of code.

Data preparation has never been easier for ML!

Availability
The new capability in Amazon SageMaker Canvas to explore and transform data using natural language queries is available in all AWS Regions where Amazon SageMaker Canvas and Amazon Bedrock are supported.

Learn more
Amazon SageMaker Canvas product page

Go build!

— Irshad

Amazon SageMaker adds new inference capabilities to help reduce foundation model deployment costs and latency

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/amazon-sagemaker-adds-new-inference-capabilities-to-help-reduce-foundation-model-deployment-costs-and-latency/

Today, we are announcing new Amazon SageMaker inference capabilities that can help you optimize deployment costs and reduce latency. With the new inference capabilities, you can deploy one or more foundation models (FMs) on the same SageMaker endpoint and control how many accelerators and how much memory is reserved for each FM. This helps to improve resource utilization, reduce model deployment costs on average by 50 percent, and lets you scale endpoints together with your use cases.

For each FM, you can define separate scaling policies to adapt to model usage patterns while further optimizing infrastructure costs. In addition, SageMaker actively monitors the instances that are processing inference requests and intelligently routes requests based on which instances are available, helping to achieve on average 20 percent lower inference latency.
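To make the routing idea concrete, here is a toy sketch of a least-outstanding-requests strategy, the policy named by the LEAST_OUTSTANDING_REQUESTS setting used in the endpoint configuration later in this post. It is not SageMaker's implementation; the class, method names, and instance IDs are hypothetical and only illustrate the general technique of sending each request to the instance with the fewest requests in flight.

```python
class LeastOutstandingRouter:
    """Toy router: pick the instance with the fewest in-flight requests."""

    def __init__(self, instance_ids):
        # Track how many requests each instance is currently processing.
        self.in_flight = {instance_id: 0 for instance_id in instance_ids}

    def acquire(self):
        # min() returns the first instance with the lowest in-flight count.
        instance_id = min(self.in_flight, key=self.in_flight.get)
        self.in_flight[instance_id] += 1
        return instance_id

    def release(self, instance_id):
        # Call this when the instance has finished a request.
        self.in_flight[instance_id] -= 1


router = LeastOutstandingRouter(["instance-a", "instance-b"])
first = router.acquire()   # both idle, so the first instance is chosen
second = router.acquire()  # instance-a is busy, so instance-b is chosen
router.release(first)
third = router.acquire()   # instance-a is idle again
```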

Key components
The new inference capabilities build upon SageMaker real-time inference endpoints. As before, you create the SageMaker endpoint with an endpoint configuration that defines the instance type and initial instance count for the endpoint. The model is configured in a new construct, an inference component. Here, you specify the number of accelerators and amount of memory you want to allocate to each copy of a model, together with the model artifacts, container image, and number of model copies to deploy.

Amazon SageMaker - MME

Let me show you how this works.

New inference capabilities in action
You can start using the new inference capabilities from SageMaker Studio, the SageMaker Python SDK, and the AWS SDKs and AWS Command Line Interface (AWS CLI). They are also supported by AWS CloudFormation.

For this demo, I use the AWS SDK for Python (Boto3) to deploy a copy of the Dolly v2 7B model and a copy of the FLAN-T5 XXL model from the Hugging Face model hub on a SageMaker real-time endpoint using the new inference capabilities.

Create a SageMaker endpoint configuration

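import boto3
import sagemaker

role = sagemaker.get_execution_role()
sm_client = boto3.client(service_name="sagemaker")

# Placeholder names (not from the original post); choose your own
endpoint_config_name = "my-endpoint-config"
endpoint_name = "my-endpoint"
variant_name = "AllTraffic"  # reused when creating the inference components

sm_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ExecutionRoleArn=role,
    ProductionVariants=[{
        "VariantName": variant_name,
        "InstanceType": "ml.g5.12xlarge",
        "InitialInstanceCount": 1,
        "RoutingConfig": {
            "RoutingStrategy": "LEAST_OUTSTANDING_REQUESTS"
        }
    }]
)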

Create the SageMaker endpoint

sm_client.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name,
)

Before you can create the inference component, you need to create a SageMaker-compatible model and specify a container image to use. For both models, I use the Hugging Face LLM Inference Container for Amazon SageMaker. These deep learning containers (DLCs) include the necessary components, libraries, and drivers to host large models on SageMaker.

Prepare the Dolly v2 model

from sagemaker.huggingface import get_huggingface_llm_image_uri

# Retrieve the container image URI
hf_inference_dlc = get_huggingface_llm_image_uri(
  "huggingface",
  version="0.9.3"
)

# Configure model container
dolly7b = {
    'Image': hf_inference_dlc,
    'Environment': {
        'HF_MODEL_ID':'databricks/dolly-v2-7b',
        'HF_TASK':'text-generation',
    }
}

# Create SageMaker Model
sm_client.create_model(
    ModelName        = "dolly-v2-7b",
    ExecutionRoleArn = role,
    Containers       = [dolly7b]
)

Prepare the FLAN-T5 XXL model

# Configure model container
flant5xxlmodel = {
    'Image': hf_inference_dlc,
    'Environment': {
        'HF_MODEL_ID':'google/flan-t5-xxl',
        'HF_TASK':'text-generation',
    }
}

# Create SageMaker Model
sm_client.create_model(
    ModelName        = "flan-t5-xxl",
    ExecutionRoleArn = role,
    Containers       = [flant5xxlmodel]
)

Now, you’re ready to create the inference component.

Create an inference component for each model
Specify an inference component for each model you want to deploy on the endpoint. Inference components let you specify the SageMaker-compatible model and the compute and memory resources you want to allocate. For CPU workloads, define the number of cores to allocate. For accelerator workloads, define the number of accelerators. RuntimeConfig defines the number of model copies you want to deploy.

# Inference component for Dolly v2 7B
sm_client.create_inference_component(
    InferenceComponentName="IC-dolly-v2-7b",
    EndpointName=endpoint_name,
    VariantName=variant_name,
    Specification={
        "ModelName": "dolly-v2-7b",
        "ComputeResourceRequirements": {
            "NumberOfAcceleratorDevicesRequired": 2,
            "NumberOfCpuCoresRequired": 2,
            "MinMemoryRequiredInMb": 1024
        }
    },
    RuntimeConfig={"CopyCount": 1},
)

# Inference component for FLAN-T5 XXL
sm_client.create_inference_component(
    InferenceComponentName="IC-flan-t5-xxl",
    EndpointName=endpoint_name,
    VariantName=variant_name,
    Specification={
        "ModelName": "flan-t5-xxl",
        "ComputeResourceRequirements": {
            "NumberOfAcceleratorDevicesRequired": 2,
            "NumberOfCpuCoresRequired": 1,
            "MinMemoryRequiredInMb": 1024
        }
    },
    RuntimeConfig={"CopyCount": 1},
)

Once the inference components have successfully deployed, you can invoke the models.
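Deployment can take a while, so you may want to wait for each inference component before invoking it. The helper below is a minimal sketch, not part of the original demo: the function name, polling interval, and retry limit are my own choices. It assumes `describe_inference_component` reports the state in the `InferenceComponentStatus` field, with `InService` and `Failed` among the possible values.

```python
import time


def wait_for_inference_component(sm_client, name, poll_seconds=30, max_polls=60):
    """Poll until the inference component is InService; raise on failure or timeout."""
    for _ in range(max_polls):
        status = sm_client.describe_inference_component(
            InferenceComponentName=name
        )["InferenceComponentStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Inference component {name} failed to deploy")
        time.sleep(poll_seconds)
    raise TimeoutError(
        f"Inference component {name} was not InService after {max_polls} polls"
    )
```

With the SageMaker client from the earlier steps, you would call it as `wait_for_inference_component(sm_client, "IC-dolly-v2-7b")`.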

Run inference
To invoke a model on the endpoint, specify the corresponding inference component.

import json
sm_runtime_client = boto3.client(service_name="sagemaker-runtime")
payload = {"inputs": "Why is California a great place to live?"}

response_dolly = sm_runtime_client.invoke_endpoint(
    EndpointName=endpoint_name,
    InferenceComponentName = "IC-dolly-v2-7b",
    ContentType="application/json",
    Accept="application/json",
    Body=json.dumps(payload),
)

response_flant5 = sm_runtime_client.invoke_endpoint(
    EndpointName=endpoint_name,
    InferenceComponentName = "IC-flan-t5-xxl",
    ContentType="application/json",
    Accept="application/json",
    Body=json.dumps(payload),
)

result_dolly = json.loads(response_dolly['Body'].read().decode())
result_flant5 = json.loads(response_flant5['Body'].read().decode())

Next, you can define separate scaling policies for each model by registering the scaling target and applying the scaling policy to the inference component. Check out the SageMaker Developer Guide for detailed instructions.
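As a hedged sketch of those two steps, the functions below build the Application Auto Scaling requests for one inference component and send them through a client you pass in. The resource ID format, scalable dimension, and predefined metric name reflect my reading of the Developer Guide and should be verified there; the capacity limits, target value, policy name, and function names are placeholder assumptions.

```python
def scaling_requests(inference_component_name, min_copies=1, max_copies=4,
                     target_invocations_per_copy=5.0):
    """Build the register-target and scaling-policy requests for one component."""
    common = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"inference-component/{inference_component_name}",
        "ScalableDimension": "sagemaker:inference-component:DesiredCopyCount",
    }
    target = {**common, "MinCapacity": min_copies, "MaxCapacity": max_copies}
    policy = {
        **common,
        "PolicyName": f"{inference_component_name}-target-tracking",  # placeholder
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerInferenceComponentInvocationsPerCopy"
            },
            "TargetValue": target_invocations_per_copy,
        },
    }
    return target, policy


def apply_scaling(autoscaling_client, inference_component_name, **limits):
    """Register the scalable target, then attach the target-tracking policy."""
    target, policy = scaling_requests(inference_component_name, **limits)
    autoscaling_client.register_scalable_target(**target)
    autoscaling_client.put_scaling_policy(**policy)
```

To run it against AWS, pass `boto3.client("application-autoscaling")` as the client, for example `apply_scaling(client, "IC-dolly-v2-7b", max_copies=2)`. Keeping the request-building separate from the API calls makes it easy to inspect the payloads before sending them.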

The new inference capabilities provide per-model CloudWatch metrics and CloudWatch Logs and can be used with any SageMaker-compatible container image across SageMaker CPU- and GPU-based compute instances. If the container image supports it, you can also use response streaming.

Now available
The new Amazon SageMaker inference capabilities are available today in AWS Regions US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Jakarta, Mumbai, Seoul, Singapore, Sydney, Tokyo), Canada (Central), Europe (Frankfurt, Ireland, London, Stockholm), Middle East (UAE), and South America (São Paulo). For pricing details, visit Amazon SageMaker Pricing. To learn more, visit Amazon SageMaker.

Get started
Log in to the AWS Management Console and deploy your FMs using the new SageMaker inference capabilities today!

— Antje

Introducing highly durable Amazon OpenSearch Service clusters with 30% price/performance improvement

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/introducing-highly-durable-amazon-opensearch-service-clusters-with-30-price-performance-improvement/

You can use the new OR1 instances to create Amazon OpenSearch Service clusters that use Amazon Simple Storage Service (Amazon S3) for primary storage. You can ingest, store, index, and access just about any imaginable amount of data, while also enjoying a 30% price/performance improvement over existing instance types, eleven nines of data durability, and a zero-time Recovery Point Objective (RPO). You can use this to perform interactive log analytics, monitor applications in real time, and more.

New OR1 Instances
These benefits are all made possible by the new OR1 instances, which are available in eight sizes and used for the data nodes of the cluster:

Instance Name          vCPUs   Memory     EBS Storage Max (gp3)
or1.medium.search      1       8 GiB      400 GiB
or1.large.search       2       16 GiB     800 GiB
or1.xlarge.search      4       32 GiB     1.5 TiB
or1.2xlarge.search     8       64 GiB     3 TiB
or1.4xlarge.search     16      128 GiB    6 TiB
or1.8xlarge.search     32      256 GiB    12 TiB
or1.12xlarge.search    48      384 GiB    18 TiB
or1.16xlarge.search    64      512 GiB    24 TiB

To choose a suitable instance size, read Sizing Amazon OpenSearch Service domains.

The Amazon Elastic Block Store (Amazon EBS) volumes are used for primary storage, with data copied synchronously to S3 as it arrives. The data in S3 is used to create replicas and to rehydrate EBS after shards are moved between instances as a result of a node failure or a routine rebalancing operation. This is made possible by the remote-backed storage and segment replication features that were recently released for OpenSearch.

Creating a Domain
To create a domain I open the Amazon OpenSearch Service Console, select Managed clusters, and click Create domain:

I enter a name for my domain (my-domain), select Standard create, and use the Production template:

Then I choose the Domain with standby deployment option. This option will create active data nodes in two Availability Zones and a standby one in a third. I also choose the latest engine version:

Then I select the OR1 instance family and (for my use case) configure 500 GiB of EBS storage per data node:

I set the other settings as needed, and click Create to proceed:

I take a quick lunch break, and when I come back my domain is ready:
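If you prefer to script the console steps above, the sketch below builds a request for the OpenSearch Service `create_domain` API. Treat it as an incomplete starting point under stated assumptions: the instance size, node counts, and dedicated master type are my own placeholder choices, the engine version is the minimum named later in this post rather than the latest, and a real domain will likely need further settings such as access policies. The storage limits are taken from the OR1 table above.

```python
# Maximum gp3 storage per data node in GiB, from the OR1 table (1 TiB = 1024 GiB).
OR1_MAX_GP3_GIB = {
    "or1.medium.search": 400,
    "or1.large.search": 800,
    "or1.xlarge.search": 1536,
    "or1.2xlarge.search": 3072,
    "or1.4xlarge.search": 6144,
    "or1.8xlarge.search": 12288,
    "or1.12xlarge.search": 18432,
    "or1.16xlarge.search": 24576,
}


def or1_domain_request(domain_name, instance_type="or1.large.search", volume_gib=500):
    """Build create_domain arguments for an OR1 domain with standby."""
    limit = OR1_MAX_GP3_GIB[instance_type]
    if volume_gib > limit:
        raise ValueError(f"{instance_type} supports at most {limit} GiB of gp3 storage")
    return {
        "DomainName": domain_name,
        "EngineVersion": "OpenSearch_2.11",
        "ClusterConfig": {
            "InstanceType": instance_type,
            "InstanceCount": 3,  # assumed: one data node per Availability Zone
            "ZoneAwarenessEnabled": True,
            "ZoneAwarenessConfig": {"AvailabilityZoneCount": 3},
            "MultiAZWithStandbyEnabled": True,
            "DedicatedMasterEnabled": True,
            "DedicatedMasterType": "m6g.large.search",  # assumed instance type
            "DedicatedMasterCount": 3,
        },
        "EBSOptions": {"EBSEnabled": True, "VolumeType": "gp3", "VolumeSize": volume_gib},
        # Assumption: OR1 domains need encryption at rest enabled.
        "EncryptionAtRestOptions": {"Enabled": True},
    }


request = or1_domain_request("my-domain")
```

You would then pass the result to the API with `boto3.client("opensearch").create_domain(**request)`.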

Things to Know
Here are a couple of things to know about this new storage option:

Engine Versions – Amazon OpenSearch Service engines version 2.11 and above support OR1 instances.

Regions – The OR1 instance family is available for use with OpenSearch in the US East (Ohio, N. Virginia), US West (N. California, Oregon), Asia Pacific (Mumbai, Singapore, Sydney, Tokyo), and Europe (Frankfurt, Ireland, Spain, Stockholm) AWS Regions.

Pricing – You pay On-Demand or Reserved prices for data nodes, and you also pay for EBS storage. See the Amazon OpenSearch Service Pricing page for more information.

Jeff;

Evaluate, compare, and select the best foundation models for your use case in Amazon Bedrock (preview)

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/evaluate-compare-and-select-the-best-foundation-models-for-your-use-case-in-amazon-bedrock-preview/

I’m happy to share that you can now evaluate, compare, and select the best foundation models (FMs) for your use case in Amazon Bedrock. Model Evaluation on Amazon Bedrock is available today in preview.

Amazon Bedrock offers a choice of automatic evaluation and human evaluation. You can use automatic evaluation with predefined metrics such as accuracy, robustness, and toxicity. For subjective or custom metrics, such as friendliness, style, and alignment to brand voice, you can set up human evaluation workflows with just a few clicks.

Model evaluations are critical at all stages of development. As a developer, you now have evaluation tools available for building generative artificial intelligence (AI) applications. You can start by experimenting with different models in the playground environment. To iterate faster, add automatic evaluations of the models. Then, when you prepare for an initial launch or limited release, you can incorporate human reviews to help ensure quality.

Let me give you a quick tour of Model Evaluation on Amazon Bedrock.

Automatic model evaluation
With automatic model evaluation, you can bring your own data or use built-in, curated datasets and predefined metrics for specific tasks such as content summarization, question answering, text classification, and text generation. This takes away the heavy lifting of designing and running your own model evaluation benchmarks.

To get started, navigate to the Amazon Bedrock console, then select Model evaluation under Assessment & deployment in the left menu. Create a new model evaluation and choose Automatic.

Amazon Bedrock Model Evaluation

Next, follow the setup dialog to choose the FM you want to evaluate and the type of task, for example, text summarization. Select the evaluation metrics and specify a dataset—either built-in or your own.

If you bring your own dataset, make sure it’s in JSON Lines format and that each line contains all of the key-value pairs needed for the task you want to evaluate the model on. For example, if you want to evaluate the model on a question-answer task, you would format your data as follows (with category being optional):

{"referenceResponse":"Cantal","category":"Capitals","prompt":"Aurillac is the capital of"}
{"referenceResponse":"Bamiyan Province","category":"Capitals","prompt":"Bamiyan city is the capital of"}
{"referenceResponse":"Abkhazia","category":"Capitals","prompt":"Sokhumi is the capital of"}
...

Then, create and run the evaluation job to understand the model’s task-specific performance. Once the evaluation job is complete, you can review the results in the model evaluation report.

Amazon Bedrock Model Evaluations

Human model evaluation
For human evaluation, you can have Amazon Bedrock set up human review workflows with a few clicks. You can bring your own datasets and define custom evaluation metrics, such as relevance, style, or alignment to brand voice. You also have the choice to either leverage your own internal teams as reviewers or engage an AWS managed team. This takes away the tedious effort of building and operating human evaluation workflows.

To get started, create a new model evaluation and select Human: Bring your own team or Human: AWS managed team.

If you choose an AWS managed team for human evaluation, describe your model evaluation needs, including task type, expertise of the work team, and the approximate number of prompts, along with your contact information. In the next step, an AWS expert will reach out to discuss your model evaluation project requirements in more detail. Upon review, the team will share a custom quote and project timeline.

If you choose to bring your own team, follow the setup dialog to choose the FMs you want to evaluate and the type of task, for example, text summarization. Then, select the evaluation metrics, upload your test dataset, and set up the work team.

For human evaluation, you would format the example data shown before again in JSON Lines format like this (with category and referenceResponse being optional):

{"prompt":"Aurillac is the capital of","referenceResponse":"Cantal","category":"Capitals"}
{"prompt":"Bamiyan city is the capital of","referenceResponse":"Bamiyan Province","category":"Capitals"}
{"prompt":"Senftenberg is the capital of","referenceResponse":"Oberspreewald-Lausitz","category":"Capitals"}
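Before uploading a dataset, it can help to check it locally. The validator below is a small sketch based only on the two examples in this post: for the automatic question-answer example, `prompt` and `referenceResponse` are treated as required, and for human evaluation only `prompt`. The function name and the mode labels are my own, other task types may need different keys, and Amazon Bedrock may apply additional checks.

```python
import json

# Required keys per evaluation mode, as described for the examples in this post.
REQUIRED_KEYS = {
    "automatic": {"prompt", "referenceResponse"},  # category is optional
    "human": {"prompt"},  # category and referenceResponse are optional
}


def validate_json_lines(text, mode):
    """Return (line_number, problem) tuples; an empty list means no problems found."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((number, f"invalid JSON: {error.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((number, "not a JSON object"))
            continue
        missing = REQUIRED_KEYS[mode] - set(record)
        if missing:
            problems.append((number, f"missing keys: {sorted(missing)}"))
    return problems


sample = (
    '{"prompt":"Aurillac is the capital of","referenceResponse":"Cantal","category":"Capitals"}\n'
    '{"prompt":"Bamiyan city is the capital of"}\n'
)
human_problems = validate_json_lines(sample, "human")
automatic_problems = validate_json_lines(sample, "automatic")
```

The second sample line passes in `human` mode but is flagged in `automatic` mode because it has no `referenceResponse`.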

Once the human evaluation is completed, Amazon Bedrock generates an evaluation report with the model’s performance against your selected metrics.

Amazon Bedrock Model Evaluation

Things to know
Here are a couple of important things to know:

Model support – During preview, you can evaluate and compare text-based large language models (LLMs) available on Amazon Bedrock. During preview, you can select one model for each automatic evaluation job and up to two models for each human evaluation job using your own team. For human evaluation using an AWS managed team, you can specify custom project requirements.

Pricing – During preview, AWS only charges for the model inference needed to perform the evaluation (processed input and output tokens for on-demand pricing). There will be no separate charges for human evaluation or automatic evaluation. Amazon Bedrock Pricing has all the details.

Join the preview
Automatic evaluation and human evaluation using your own work team are available today in public preview in AWS Regions US East (N. Virginia) and US West (Oregon). Human evaluation using an AWS managed team is available in public preview in AWS Region US East (N. Virginia). To learn more, visit the Amazon Bedrock Developer Experience web page and check out the User Guide.

Get started
Log in to the AWS Management Console and start exploring model evaluation in Amazon Bedrock today!

— Antje

Amazon Redshift adds new AI capabilities, including Amazon Q, to boost efficiency and productivity

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/amazon-redshift-adds-new-ai-capabilities-to-boost-efficiency-and-productivity/

Amazon Redshift puts artificial intelligence (AI) at your service to optimize efficiencies and make you more productive with two new capabilities that we are launching in preview today.

First, Amazon Redshift Serverless becomes smarter. It scales capacity proactively and automatically along dimensions such as the complexity of your queries, their frequency, the size of the dataset, and so on to deliver tailored performance optimizations. This allows you to spend less time tuning your data warehouse instances and more time getting value from your data.

Second, Amazon Q generative SQL in Amazon Redshift Query Editor generates SQL recommendations from natural language prompts. This helps you to be more productive in extracting insights from your data.

Let’s start with Amazon Redshift Serverless
When you use Amazon Redshift Serverless, you can now opt in for a preview of AI-driven scaling and optimizations. When enabled, the system observes and learns from your usage patterns, such as the concurrent number of queries, their complexity, and the time it takes to run them. Then, it automatically optimizes your serverless endpoint to meet your price performance target. Based on AWS internal testing, this new capability may give you up to ten times better price performance for variable workloads without any manual intervention.

AI-driven scaling and optimizations eliminate the time and effort to manually resize your workgroup and plan background optimizations based on workload needs. It continually runs automatic optimizations when they are most valuable for better performance, avoiding performance cliffs and time-outs.

This new capability goes beyond the existing self-tuning capabilities of Amazon Redshift Serverless, such as machine learning (ML)-enhanced techniques to adjust your compute, modify the physical schema of the database, create or drop materialized views as needed (the ones we manage automatically, not yours), and vacuum tables. This new capability brings more intelligence to decide how to adjust the compute, what background optimizations are required, and when to apply them, and it makes its decisions based on more dimensions. We also orchestrate ML-based optimizations for materialized views, table optimizations, and workload management when your queries need it.

During the preview, you must opt in to enable these AI-driven scaling and optimizations on your workgroups. You configure the system to balance the optimization for price or performance. There is only one slider to adjust in the console.

Redshift Serverless - AI-driven workgroups

As usual, you can track resource usage and associated changes through the console, Amazon CloudWatch metrics, and the system table SYS_SERVERLESS_USAGE.
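For example, you could query SYS_SERVERLESS_USAGE through the Redshift Data API. The following is a sketch under assumptions: the column names are the ones I recall from the system view documentation and should be checked there, and the function name, row limit, workgroup, and database are placeholders.

```python
# Column names are assumed from the SYS_SERVERLESS_USAGE documentation; verify them.
USAGE_SQL = """
SELECT start_time, end_time, compute_seconds, charged_seconds
FROM sys_serverless_usage
ORDER BY start_time DESC
LIMIT 24;
"""


def submit_usage_query(data_client, workgroup_name, database):
    """Submit the usage query and return the statement Id for later retrieval."""
    response = data_client.execute_statement(
        WorkgroupName=workgroup_name,
        Database=database,
        Sql=USAGE_SQL,
    )
    return response["Id"]
```

With `data_client = boto3.client("redshift-data")`, the call returns immediately because the Data API runs statements asynchronously. You can fetch the rows with `get_statement_result(Id=...)` once the statement has finished.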

Now, let’s look at Amazon Q generative SQL in Amazon Redshift Query Editor
What if you could use generative AI to help analysts write effective SQL queries more rapidly? This is the new experience we introduce today in Amazon Redshift Query Editor, our web-based SQL editor.

You can now describe the information you want to extract from your data in natural language, and we generate the SQL query recommendations for you. Behind the scenes, Amazon Q generative SQL uses a large language model (LLM) and Amazon Bedrock to generate the SQL query. We use different techniques, such as prompt engineering and Retrieval Augmented Generation (RAG), to query the model based on your context: the database you’re connected to, the schema you’re working on, your query history, and optionally the query history of other users connected to the same endpoint. The system also remembers previous questions. You can ask it to refine a previously generated query.

The SQL generation model uses metadata specific to your data schema to generate relevant queries. For example, it uses the table and column names and the relationship between the tables in your database. In addition, your database administrator can authorize the model to use the query history of all users in your AWS account to generate even more relevant SQL statements. We don’t share your query history with other AWS accounts and we don’t train our generation models with any data coming from your AWS account. We maintain the high level of privacy and security that you expect from us.

Using generated SQL queries helps you to get started when discovering new schemas. It does the heavy lifting of discovering the column names and relationships between tables for you. Senior analysts also benefit from asking what they want in natural language and having the SQL statement automatically generated. They can review the queries and run them directly from their notebook.

Let’s explore a schema and extract information
For this demo, let’s pretend I am a data analyst at a company that sells concert tickets. The database schema and data are available for you to download. My manager asks me to analyze the ticket sales data to send a thank you note with discount coupons to the highest-spending customers in Seattle.

I connect to Amazon Redshift Query Editor and connect to the analytic endpoint. I create a new tab for a Notebook (SQL generation is available from notebooks only).

Instead of writing a SQL statement, I open the chat panel and type, “Find the top five users from Seattle who bought the most number of tickets in 2022.” I take the time to verify the generated SQL statement. It seems correct, so I decide to run it. I select Add to notebook and then Run. The query returns the list of the top five buyers in Seattle.
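The generated statement itself is only visible in the screenshot, so here is a hand-written approximation of what a query for this prompt could look like, not the output of Amazon Q generative SQL. To keep it runnable anywhere, the sketch uses an in-memory SQLite database. The table and column names are assumptions about the demo schema, and the rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified schema with made-up rows for illustration.
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, city TEXT);
CREATE TABLE sales (salesid INTEGER PRIMARY KEY, buyerid INTEGER, qtysold INTEGER, saletime TEXT);
INSERT INTO users VALUES
  (1, 'Ana', 'Lee', 'Seattle'),
  (2, 'Bo', 'Kim', 'Seattle'),
  (3, 'Cy', 'Roy', 'Portland');
INSERT INTO sales VALUES
  (1, 1, 2, '2022-03-01 10:00:00'),
  (2, 2, 5, '2022-07-15 18:30:00'),
  (3, 1, 1, '2021-12-31 09:00:00'),
  (4, 3, 9, '2022-05-05 12:00:00');
""")

# Join buyers to their sales, keep Seattle users and 2022 sales, rank by tickets.
TOP_BUYERS_SQL = """
SELECT u.userid, u.firstname, u.lastname, SUM(s.qtysold) AS tickets
FROM users AS u
JOIN sales AS s ON s.buyerid = u.userid
WHERE u.city = 'Seattle'
  AND s.saletime >= '2022-01-01' AND s.saletime < '2023-01-01'
GROUP BY u.userid, u.firstname, u.lastname
ORDER BY tickets DESC
LIMIT 5;
"""

top_buyers = conn.execute(TOP_BUYERS_SQL).fetchall()
print(top_buyers)
```

Whatever the generated SQL looks like, these are the points worth checking when you verify it: the join between buyers and sales, the city filter, the date range, and the ordering with a limit of five.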

sql generation - top 5 users

I had no previous knowledge of the data schema, and I did not type a single line of SQL to find the information I needed.

But generative SQL is not limited to a single interaction. I can chat with it to dynamically refine the queries. Here is another example.

I ask “Which state has the most venues?” Generative SQL proposes the following query. The answer is New York, with 49 venues, if you’re curious.

generative sql chat 01

I changed my mind, and I want to know the top three cities with the most venues. I simply rephrase my question: “What about the top three venues?”

generative sql chat 02

I add the query to the notebook and run it. It returns the expected result.

generative sql chat 03

Best practices for prompting
Here are a couple of tips and tricks to get the best results out of your prompts.

Be specific – When asking questions in natural language, be as specific as possible to help the system understand exactly what you need. For example, instead of writing “find the top venues that sold the most tickets,” provide more details like “find the names of the top three venues that sold the most tickets in 2022.” Use consistent entity names like venue, ticket, and location instead of referring to the same entity in different ways, which can confuse the system.

Iterate – Break your complex requests into multiple simple statements that are easier for the system to interpret. Iteratively ask follow-up questions to get more detailed analysis from the system. For example, start by asking, “Which state has the most venues?” Then, based on the response, ask a follow-up question like “Which is the most popular venue from this state?”

Verify – Review the generated SQL before running it to ensure accuracy. If the generated SQL query has errors or does not match your intent, provide instructions to the system on how to correct it instead of rephrasing the entire request. For example, if the query is missing a filter clause on year, write “provide venues from year 2022.”

Availability and pricing
AI-driven scaling and optimizations are in preview in six AWS Regions: US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and Europe (Ireland, Stockholm). They come at no additional cost. You pay only for the compute capacity your data warehouse consumes when it is active. Pricing is per Redshift Processing Unit (RPU) per hour. The billing is per second of used capacity. The pricing page for Amazon Redshift has the details.

Amazon Q generative SQL for Amazon Redshift Query Editor is in preview in two AWS Regions today: US East (N. Virginia) and US West (Oregon). There is no charge during the preview period.

These are two examples of how AI helps to optimize performance and increase your productivity, either by automatically adjusting the price-performance ratio of your Amazon Redshift Serverless endpoints or by generating correct SQL statements from natural language prompts.

Previews are essential for us to capture your feedback before we make these capabilities available for all. Experiment with these today and let us know what you think on the re:Post forums or using the feedback button on the bottom left side of the console.

— seb

AWS Clean Rooms Differential Privacy enhances privacy protection of your users’ data (preview)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-clean-rooms-differential-privacy-enhances-privacy-protection-of-your-users-data-preview/

Starting today, you can use AWS Clean Rooms Differential Privacy (preview) to help protect the privacy of your users with mathematically backed and intuitive controls in a few steps. Because it is a fully managed capability of AWS Clean Rooms, you need no prior differential privacy experience to help prevent the reidentification of your users.

AWS Clean Rooms Differential Privacy obfuscates the contribution of any individual’s data in generating aggregate insights in collaborations so that you can run a broad range of SQL queries to generate insights about advertising campaigns, investment decisions, clinical research, and more.

Quick overview on differential privacy
Differential privacy is not new. It is a strong, mathematical definition of privacy compatible with statistical and machine learning based analysis, and has been used by the United States Census Bureau as well as companies with vast amounts of data.

Differential privacy helps with a wide variety of use cases involving large datasets, where adding or removing a few individuals has a small impact on the overall result, such as population analyses using count queries, histograms, benchmarking, A/B testing, and machine learning.

The following illustration shows how differential privacy works when it is applied to SQL queries.

When an analyst runs a query, differential privacy adds a carefully calibrated amount of error (also referred to as noise) to query results at run-time, masking the contribution of individuals while still keeping the query results accurate enough to provide meaningful insights. The noise is carefully fine-tuned to mask the presence or absence of any possible individual in the dataset.

Differential privacy also has another component called privacy budget. The privacy budget is a finite resource consumed each time a query is run and thus controls the number of queries that can be run on your datasets, helping ensure that the noise cannot be averaged out to reveal any private information about an individual. When the privacy budget is fully exhausted, no more queries can be run on your tables until it is increased or refreshed.
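To make these two ideas concrete, the sketch below applies the textbook Laplace mechanism to a count query and tracks a finite privacy budget. It is a generic illustration, not a description of how AWS Clean Rooms implements differential privacy. The class and parameter names are hypothetical, and it assumes each individual contributes at most one row to the count.

```python
import random


class PrivateCounter:
    """Answer count queries with Laplace noise while spending a finite budget."""

    def __init__(self, total_epsilon, seed=None):
        self.remaining = total_epsilon
        self._rng = random.Random(seed)

    def noisy_count(self, true_count, epsilon):
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            raise RuntimeError("privacy budget exhausted")
        self.remaining -= epsilon
        # Adding or removing one individual changes a count by at most 1
        # (sensitivity 1), so the Laplace scale is 1 / epsilon.
        scale = 1.0 / epsilon
        # The difference of two exponential draws is Laplace-distributed.
        noise = self._rng.expovariate(1 / scale) - self._rng.expovariate(1 / scale)
        return true_count + noise


counter = PrivateCounter(total_epsilon=1.0, seed=7)
first = counter.noisy_count(true_count=1200, epsilon=0.5)
second = counter.noisy_count(true_count=1200, epsilon=0.5)
# A third query would raise RuntimeError because the budget of 1.0 is spent.
```

A smaller epsilon per query means more noise but lets you run more queries before the budget runs out, which is the trade-off between privacy and utility described above.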

However, differential privacy is not easy to implement because this technique requires an in-depth understanding of mathematically rigorous formulas and theories to apply it effectively. Configuring differential privacy is also a complex task because customers need to calculate the right level of noise in order to preserve the privacy of their users without negatively impacting the utility of query results.

Customers also want to enable their partners to conduct a wide variety of analyses including highly complex and customized queries on their data. This requirement is hard to support with differential privacy because of the intricate nature of the calculations involved in calibrating the noise while processing various query components such as aggregations, joins, and transformations.

We created AWS Clean Rooms Differential Privacy to help you protect the privacy of your users with mathematically backed controls in a few clicks.

How differential privacy works in AWS Clean Rooms
While differential privacy is quite a sophisticated technique, AWS Clean Rooms Differential Privacy makes it easy for you to apply it and protect the privacy of your users with mathematically backed, flexible, and intuitive controls. You can begin using it with just a few steps after starting or joining an AWS Clean Rooms collaboration as a member with abilities to contribute data.

You create a configured table, which is a reference to your table in the AWS Glue Data Catalog, and choose to turn on differential privacy while adding a custom analysis rule to the configured table.

Next, you associate the configured table to your AWS Clean Rooms collaboration and configure a differential privacy policy in the collaboration to make your table available for querying. You can use a default policy to quickly complete the setup or customize it to meet your specific requirements. As part of this step, you will configure the following:

Privacy budget
Quantified as a value that we call epsilon, the privacy budget controls the level of privacy protection. It is a common, finite resource that is applied for all of your tables protected with differential privacy in the collaboration because the goal is to preserve the privacy of your users whose information can be present in multiple tables. The privacy budget is consumed every time a query is run on your tables. You have the flexibility to increase the privacy budget value any time during the collaboration and automatically refresh it each calendar month.

Noise added per query
Measured in terms of the number of users whose contributions you want to obscure, this input parameter governs the rate at which the privacy budget is depleted.

In general, you need to balance your privacy needs against the number of queries you want to permit and the accuracy of those queries. AWS Clean Rooms makes it easy for you to complete this step by helping you understand the resulting utility you are providing to your collaboration partner. You can also use the interactive examples to understand how your chosen settings would impact the results for different types of SQL queries.

Now that you have successfully enabled differential privacy protection for your data, let’s see AWS Clean Rooms Differential Privacy in action. For this demo, let’s assume I am your partner in the AWS Clean Rooms collaboration.

Here, I’m running a query to count the number of overlapping customers and the result shows there are 3,227,643 values for tv.customer_id.

Now, if I run the same query again after removing records about an individual from the coffee_customers table, it shows a different result: 3,227,604 for tv.customer_id. This variability in the query results prevents me from identifying individuals by observing the difference between query results.

I can also see the impact of differential privacy, including the remaining queries I can run.

Available for preview
Join this preview and start protecting the privacy of your users with AWS Clean Rooms Differential Privacy. During this preview period, you can use AWS Clean Rooms Differential Privacy wherever AWS Clean Rooms is available. To learn more on how to get started, visit the AWS Clean Rooms Differential Privacy page.

Happy collaborating!
Donnie

AWS Clean Rooms ML helps customers and partners apply ML models without sharing raw data (preview)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-clean-rooms-ml-helps-customers-and-partners-apply-ml-models-without-sharing-raw-data-preview/

Today, we’re introducing AWS Clean Rooms ML (preview), a new capability of AWS Clean Rooms that helps you and your partners apply machine learning (ML) models on your collective data without copying or sharing raw data with each other. With this new capability, you can generate predictive insights using ML models while continuing to protect your sensitive data.

During this preview, AWS Clean Rooms ML introduces its first model specialized to help companies create lookalike segments for marketing use cases. With AWS Clean Rooms ML lookalike, you can train your own custom model, and you can invite partners to bring a small sample of their records to collaborate and generate an expanded set of similar records while protecting everyone’s underlying data.

In the coming months, AWS Clean Rooms ML will release a healthcare model. This will be the first of many models that AWS Clean Rooms ML will support next year.

AWS Clean Rooms ML unlocks various opportunities for you to generate insights. For example:

  • Airlines can take signals about loyal customers, collaborate with online booking services, and offer promotions to users with similar characteristics.
  • Auto lenders and car insurers can identify prospective auto insurance customers who share characteristics with a set of existing lease owners.
  • Brands and publishers can model lookalike segments of in-market customers and deliver highly relevant advertising experiences.
  • Research institutions and hospital networks can find candidates similar to existing clinical trial participants to accelerate clinical studies (coming soon).

AWS Clean Rooms ML lookalike modeling helps you apply an AWS managed, ready-to-use model that is trained in each collaboration to generate lookalike datasets in a few clicks, saving months of development work to build, train, tune, and deploy your own model.

How to use AWS Clean Rooms ML to generate predictive insights
Today I will show you how to use lookalike modeling in AWS Clean Rooms ML and assume you have already set up a data collaboration with your partner. If you want to learn how to do that, check out the AWS Clean Rooms Now Generally Available — Collaborate with Your Partners without Sharing Raw Data post.

With your collective data in the AWS Clean Rooms collaboration, you can work with your partners to apply ML lookalike modeling to generate a lookalike segment. It works by taking a small sample of representative records from your data, creating a machine learning (ML) model, then applying the particular model to identify an expanded set of similar records from your business partner’s data.

The following screenshot shows the overall workflow for using AWS Clean Rooms ML.

By using AWS Clean Rooms ML, you don’t need to build complex and time-consuming ML models on your own. AWS Clean Rooms ML trains a custom, private ML model, which saves months of your time while still protecting your data.

Eliminating the need to share data
Because ML models are built natively within the service, you don’t need to share your data to build your ML model, which helps you protect your dataset and your customers’ information.

You can specify the training dataset using the AWS Glue Data Catalog table, which contains user-item interactions.

Under Additional columns to train, you can define numerical and categorical data. This is useful if you need to add more features to your dataset, such as the number of seconds spent watching a video, the topic of an article, or the product category of an e-commerce item.

Applying custom-trained AWS-built models
Once you have defined your training dataset, you can now create a lookalike model. A lookalike model is a machine learning model used to find similar profiles in your partner’s dataset without either party having to share their underlying data with each other.

When creating a lookalike model, you need to specify the training dataset. From a single training dataset, you can create many lookalike models. You also have the flexibility to define the date window in your training dataset using Relative range or Absolute range. This is useful when you have data that is constantly updated within AWS Glue, such as articles read by users.

Easy-to-tune ML models
After you create a lookalike model, you need to configure it for use in an AWS Clean Rooms collaboration. AWS Clean Rooms ML provides flexible controls that enable you and your partners to tune the results of the applied ML model to garner predictive insights.

On the Configure lookalike model page, you can choose which Lookalike model you want to use and define the Minimum matching seed size you need. This seed size defines the minimum number of profiles in your seed data that overlap with profiles in the training data.

You also have the flexibility to choose whether the partner in your collaboration receives metrics in Metrics to share with other members.

With your lookalike models properly configured, you can now make the ML models available for your partners by associating the configured lookalike model with a collaboration.

Creating lookalike segments
Once the lookalike models have been associated, your partners can now start generating insights by selecting Create lookalike segment and choosing the associated lookalike model for your collaboration.

Here on the Create lookalike segment page, your partners need to provide the Seed profiles. Examples of seed profiles include your top customers or all customers who purchased a specific product. The resulting lookalike segment will contain profiles from the training data that are most similar to the profiles from the seed.
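The post does not describe the internals of the AWS managed model, so the following Python sketch swaps in a deliberately simple stand-in: it ranks training profiles by cosine similarity to the average of the matched seed profiles. The function name lookalike_segment, the feature vectors, and the choice to exclude seed profiles from the result are all assumptions for illustration. The sketch also shows the role of the minimum matching seed size described earlier.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lookalike_segment(training, seed_ids, min_matching_seed_size, segment_size):
    """Rank training profiles by similarity to the average matched seed profile."""
    # Only seed profiles that also appear in the training data can be used.
    matched = [pid for pid in seed_ids if pid in training]
    if len(matched) < max(1, min_matching_seed_size):
        raise ValueError("too few seed profiles overlap with the training data")
    dims = len(training[matched[0]])
    centroid = [sum(training[p][i] for p in matched) / len(matched) for i in range(dims)]
    scored = [(cosine(vec, centroid), pid)
              for pid, vec in training.items() if pid not in matched]
    scored.sort(reverse=True)  # most similar first
    return [pid for _, pid in scored[:segment_size]]


# Hypothetical two-feature profiles keyed by profile ID.
training = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0],
            "d": [0.8, 0.2], "e": [0.1, 0.9]}
segment = lookalike_segment(training, ["a", "zzz"], 1, 2)
```

Here only seed profile "a" overlaps with the training data, so a minimum matching seed size of 2 would reject the request, while a minimum of 1 yields the two profiles closest to "a".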

Lastly, your partner will get the Relevance metrics as the result of the lookalike segment using the ML models. At this stage, you can use the Score to make a decision.

Export data and use programmatic API
You also have the option to export the lookalike segment data. Once it’s exported, the data is available in JSON format and you can process this output by integrating with AWS Clean Rooms API and your applications.

Join the preview
AWS Clean Rooms ML is now in preview and available via AWS Clean Rooms in US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Seoul, Singapore, Sydney, Tokyo), and Europe (Frankfurt, Ireland, London). Support for additional models is in the works.

Learn how to apply machine learning with your partners without sharing underlying data on the AWS Clean Rooms ML page.

Happy collaborating!
— Donnie

Announcing Amazon OpenSearch Service zero-ETL integration with Amazon S3 (preview)

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/amazon-opensearch-service-zero-etl-integration-with-amazon-s3-preview/

Today we are announcing a preview of Amazon OpenSearch Service zero-ETL integration with Amazon S3, a new way to query operational logs in Amazon S3 and S3-based data lakes without needing to switch between services. You can now analyze infrequently queried data in cloud object stores and simultaneously use the operational analytics and visualization capabilities of OpenSearch Service.

Amazon OpenSearch Service direct queries with Amazon S3 provides a zero-ETL integration to reduce the operational complexity of duplicating data or managing multiple analytics tools by enabling customers to directly query their operational data, reducing costs and time to action. This zero-ETL integration will be configurable within OpenSearch Service, where you can take advantage of various log type templates, including predefined dashboards, and configure data accelerations tailored to that log type. Templates include VPC Flow Logs, Elastic Load Balancing logs, and NGINX logs, and accelerations include skipping indexes, materialized views, and covering indexes.

With direct queries with Amazon S3, you can perform complex queries critical to security forensics and threat analysis that correlate data across multiple data sources, which aids teams in investigating service downtime and security events. After creating an integration, you can start querying your data directly from OpenSearch Dashboards or the OpenSearch API. You can easily audit connections to ensure that they are set up in a scalable, cost-efficient, and secure way.

Getting started with direct queries with Amazon S3
You can easily get started by creating a new Amazon S3 direct query data source for OpenSearch Service through the AWS Management Console or the API. Each new data source uses AWS Glue Data Catalog to manage tables that represent S3 buckets. Once you create a data source, you can configure Amazon S3 tables and data indexing and query data in OpenSearch Dashboards.

1. Create a data source in OpenSearch Service
Before you create a data source, you should have an OpenSearch Service domain with version 2.11 or later and a target Amazon S3 table in AWS Glue Data Catalog with the appropriate IAM permissions. IAM will need access to the desired S3 bucket(s) and read and write access to AWS Glue Data Catalog. To learn more about IAM prerequisites, see Creating a data source in the AWS documentation.

Go to the OpenSearch Service console and choose the domain you want to set up a new data source for. In the domain details page, choose the Connections tab below the general information and see the Direct Query section.

To create a new data source, choose Create, input the name of your new data source, select the data source type as Amazon S3 with AWS Glue Data Catalog, and choose the IAM role for your data source.

Once you create a data source, you can go to the OpenSearch Dashboards of the domain, which you use to configure access control, define tables, set up log type–based dashboards for popular log types, and query your data.

2. Configuring your data source in OpenSearch Dashboards
To configure the data source in OpenSearch Dashboards, choose Configure in the console and go to OpenSearch Dashboards. In the left-hand navigation of OpenSearch Dashboards, under Management, choose Data sources. Under Manage data sources, choose the name of the data source you created in the console.

Direct queries from OpenSearch Service to Amazon S3 use Spark tables within AWS Glue Data Catalog. To create a new table that you want to query directly, go to the Query Workbench in the OpenSearch Plugins menu.

Now run the following SQL statement to create the http_logs table, then run the MSCK REPAIR TABLE mys3.default.http_logs command to update the metadata in the catalog:

CREATE EXTERNAL TABLE IF NOT EXISTS mys3.default.http_logs (
    `@timestamp` TIMESTAMP,
    clientip STRING,
    request STRING,
    status INT,
    size INT,
    year INT,
    month INT,
    day INT)
USING json
PARTITIONED BY (year, month, day)
OPTIONS (path 's3://mys3/data/http_log/http_logs_partitioned_json_bz2/', compression 'bzip2')

To ensure a fast experience with your data in Amazon S3, you can set up any of three types of accelerations to index data into OpenSearch Service: skipping indexes, materialized views, and covering indexes. To create OpenSearch indexes from external data connections for better performance, choose Accelerate Table.

  • Skipping indexes allow you to index only the metadata of the data stored in Amazon S3. Skipping indexes help you quickly locate data by narrowing down where it is stored.
  • Materialized views enable you to use complex queries such as aggregations, which can be used for querying or powering dashboard visualizations. Materialized views ingest data into OpenSearch Service for anomaly detection or geospatial capabilities.
  • Covering indexes will ingest all the data from the specified table column. Covering indexes are the most performant of the three indexing types.
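To illustrate the idea behind a skipping index, here is a conceptual Python sketch that keeps only per-file minimum and maximum values of one column and uses them to decide which files a query must read. It is not the OpenSearch Service implementation; the FileStats type, the function names, and the example-bucket paths are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Metadata kept per data file: the range of HTTP status codes it holds."""
    path: str
    min_status: int
    max_status: int


def build_skipping_index(files: dict[str, list[int]]) -> list[FileStats]:
    """Record only min/max metadata for each file, not the rows themselves."""
    return [FileStats(path, min(values), max(values)) for path, values in files.items()]


def files_to_scan(index: list[FileStats], status: int) -> list[str]:
    """Return the files whose value range could contain the requested status."""
    return [f.path for f in index if f.min_status <= status <= f.max_status]


# Hypothetical files and the status codes stored in each.
files = {
    "s3://example-bucket/part-0.json": [200, 200, 304],
    "s3://example-bucket/part-1.json": [200, 404, 500],
    "s3://example-bucket/part-2.json": [301, 302],
}
index = build_skipping_index(files)
candidates = files_to_scan(index, 404)
```

A query for status 404 only needs to read part-1.json. A range check can still return a file that does not contain the value (a query for 301 also selects part-0.json), so the files it returns are candidates to scan, not confirmed matches.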

3. Query your data source in OpenSearch Dashboards
After you set up your tables, you can query your data using Discover. You can run a sample SQL query against the http_logs table you created in AWS Glue Data Catalog.

To learn more, see Working with Amazon OpenSearch Service direct queries with Amazon S3 in the AWS documentation.

Join the preview
Amazon OpenSearch Service zero-ETL integration with Amazon S3 is now available in preview in the AWS US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), Europe (Frankfurt), and Europe (Ireland) Regions.

OpenSearch Service charges separately only for the compute, measured in OpenSearch Compute Units, needed to query your external data and to maintain indexes in OpenSearch Service. For more information, see Amazon OpenSearch Service Pricing.

Give it a try and send feedback to the AWS re:Post for Amazon OpenSearch Service or through your usual AWS Support contacts.

Channy

Analyze large amounts of graph data to get insights and find trends with Amazon Neptune Analytics

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/introducing-amazon-neptune-analytics-a-high-performance-graph-analytics/

I am happy to announce the general availability of Amazon Neptune Analytics, a new analytics database engine that makes it faster for data scientists and application developers to analyze large amounts of graph data. With Neptune Analytics, you can now quickly load your dataset from Amazon Neptune or your data lake on Amazon Simple Storage Service (Amazon S3), run your analysis tasks in near real time, and optionally terminate your graph afterward.

Graph data enables the representation and analysis of intricate relationships and connections within diverse data domains. Common applications include social networks, where it aids in identifying communities, recommending connections, and analyzing information diffusion. In supply chain management, graphs facilitate efficient route optimization and bottleneck identification. In cybersecurity, they reveal network vulnerabilities and identify patterns of malicious activity. Graph data finds application in knowledge management, financial services, digital advertising, and network security, performing tasks such as identifying money laundering networks in banking transactions and predicting network vulnerabilities.

Since the launch of Neptune in May 2018, thousands of customers have embraced the service for storing their graph data and performing updates and deletions on specific subsets of the graph. However, analyzing data for insights often involves loading the entire graph into memory. For instance, a financial services company aiming to detect fraud may need to load and correlate all historical account transactions.

Performing analyses on extensive graph datasets, such as running common graph algorithms, requires specialized tools. Utilizing separate analytics solutions demands the creation of intricate pipelines to transfer data for processing, which is challenging to operate, time-consuming, and prone to errors. Furthermore, loading large datasets from existing databases or data lakes to a graph analytic solution can take hours or even days.

Neptune Analytics offers a fully managed graph analytics experience. It takes care of the infrastructure heavy lifting, enabling you to concentrate on problem-solving through queries and workflows. Neptune Analytics automatically allocates compute resources according to the graph’s size and quickly loads all the data in memory to run your queries in seconds. Our initial benchmarking shows that Neptune Analytics loads data from Amazon S3 up to 80x faster than existing AWS solutions.

Neptune Analytics supports 5 families of algorithms covering 15 different algorithms, each with multiple variants. For example, we provide algorithms for path-finding, detecting communities (clustering), identifying important data (centrality), and quantifying similarity. Path-finding algorithms support use cases such as route planning for supply chain optimization. Centrality algorithms like PageRank identify the most influential sellers in a graph. Connected components, clustering, and similarity algorithms can be used for fraud-detection use cases to determine whether a connected network is a group of friends or a fraud ring formed by a set of coordinated fraudsters.
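As an illustration of one of these algorithm families, here is a small, self-contained Python sketch of connected components using a union-find structure. It is a generic textbook version, not the Neptune Analytics implementation, and the account names are made up for the example.

```python
def connected_components(edges):
    """Group nodes into connected components using union-find."""
    parent = {}

    def find(x):
        # Register unseen nodes as their own root, then walk up to the root,
        # shortening the path along the way (path halving).
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two components

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    # Largest components first, ties broken alphabetically for stable output.
    return sorted(groups.values(), key=lambda g: (-len(g), sorted(g)))


# Hypothetical "shares a device or address with" links between accounts.
edges = [("acct1", "acct2"), ("acct2", "acct3"), ("acct4", "acct5")]
components = connected_components(edges)
```

Each resulting set is a group of accounts linked directly or indirectly. In a fraud-detection setting, an analyst would then inspect unusually large or dense groups.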

Neptune Analytics facilitates the creation of graph applications using openCypher, currently one of the most widely adopted graph query languages. Developers, business analysts, and data scientists appreciate openCypher’s SQL-inspired syntax, finding it familiar and structured for composing graph queries.

Let’s see it at work
As we usually do on the AWS News blog, let’s show how it works. For this demo, I first navigate to Neptune in the AWS Management Console. There is a new Analytics section on the left navigation pane. I select Graphs and then Create graph.

Neptune Analytics - create graph 1

On the Create graph page, I enter the details of my graph analytics database engine. I won’t detail each parameter here; their names are self-explanatory.

Neptune Analytics - Create graph 1

Pay attention to Allow from public because, the vast majority of the time, you want to keep your graph only available from the boundaries of your VPC. I also create a Private endpoint to allow private access from machines and services inside my account VPC network.

Neptune Analytics - Create graph 2

In addition to network access control, users will need proper IAM permissions to access the graph.

Finally, I enable Vector search to perform similarity search using embeddings in the dataset. The dimension of the vector depends on the large language model (LLM) that you use to generate the embedding.

Neptune Analytics - Create graph 3

When I am ready, I select Create graph (not shown here).

After a few minutes, my graph is available. Under Connectivity & security, I take note of the Endpoint. This is the DNS name I will use later to access my graph from my applications.

I can also create Replicas. A replica is a warm standby copy of the graph in another Availability Zone. You might decide to create one or more replicas for high availability. By default, we create one replica, and depending on your availability requirements, you can choose not to create replicas.

Neptune Analytics - create graph 3

Business queries on graph data
Now that the Neptune Analytics graph is available, let’s load and analyze data. For the rest of this demo, imagine I’m working in the finance industry.

I have a dataset obtained from the US Securities and Exchange Commission (SEC). This dataset contains the list of positions held by investors that have more than $100 million in assets. Here is a diagram to illustrate the structure of the dataset I use in this demo.

Neptune graph analytics - dataset structure

I want to get a better understanding of the positions held by one investment firm (let’s name it “Seb’s Investments LLC”). I wonder what its top five holdings are and who else holds more than $1 billion in the same companies. I am also curious to know which other investment companies have a portfolio similar to that of Seb’s Investments LLC.

To start my analysis, I create a Jupyter notebook in the Neptune section of the AWS Management Console. In the notebook, I first define my analytics endpoint and load the data set from an S3 bucket. It takes only 18 seconds to load 17 million records.

Neptune Analytics - load data

Then, I start to explore the dataset using openCypher queries. I start by defining my parameters:

params = {'name': "Seb's Investments LLC", 'quarter': '2023Q4'}

First, I want to know what the top five holdings are for Seb’s Investments LLC in this quarter and who else holds more than $1 billion in the same companies. In openCypher, this translates to the following query. The $name parameter’s value is “Seb’s Investments LLC” and the $quarter parameter’s value is 2023Q4.

MATCH p=(h:Holder)-->(hq1)-[o:owns]->(holding)
WHERE h.name = $name AND hq1.name = $quarter
WITH DISTINCT holding as holding, o ORDER BY o.value DESC LIMIT 5
MATCH (holding)<-[o2:owns]-(hq2)<--(coholder:Holder)
WHERE hq2.name = '2023Q4'
WITH sum(o2.value) AS totalValue, coholder, holding
WHERE totalValue > 1000000000
RETURN coholder.name, collect(holding.name)

Neptune Analytics - query 1

Then, I want to know which five other companies have holdings most similar to those of “Seb’s Investments LLC.” I use the topKByNode() function to perform a vector search.

MATCH (n:Holder)
WHERE n.name = $name
CALL neptune.algo.vectors.topKByNode(n)
YIELD node, score
WHERE score > 0
RETURN node.name LIMIT 5

This query identifies a specific Holder node with the name “Seb’s Investments LLC.” Then, it utilizes the Neptune Analytics custom vector similarity search algorithm on the embedding property of the Holder node to find other nodes in the graph that are similar. The results are filtered to include only those with a positive similarity score, and the query finally returns the names of up to five related nodes.

Neptune Analytics - query 2

Pricing and availability
Neptune Analytics is available today in seven AWS Regions: US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Singapore, Tokyo), and Europe (Frankfurt, Ireland).

AWS charges for the usage on a pay-as-you-go basis, with no recurring subscriptions or one-time setup fees.

Pricing is based on configurations of memory-optimized Neptune capacity units (m-NCU). Each m-NCU corresponds to one hour of compute and networking capacity and 1 GiB of memory. You can choose configurations starting with 128 m-NCUs and up to 4096 m-NCUs. In addition to m-NCU, storage charges apply for graph snapshots.

I invite you to read the Neptune pricing page for more details.

Neptune Analytics is a new analytics database engine to analyze large graph datasets. It helps you discover insights faster for use cases such as fraud detection and prevention, digital advertising, cybersecurity, transportation logistics, and bioinformatics.

Get started
Log in to the AWS Management Console to give Neptune Analytics a try.

— seb

Vector search for Amazon DocumentDB (with MongoDB compatibility) is now generally available

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/vector-search-for-amazon-documentdb-with-mongodb-compatibility-is-now-generally-available/

Today, we are announcing the general availability of vector search for Amazon DocumentDB (with MongoDB compatibility), a new built-in capability that lets you store, index, and search millions of vectors with millisecond response times within your document database.

Vector search is an emerging technique used in machine learning (ML) to find data points similar to given data by comparing their vector representations using distance or similarity metrics. Vectors are numerical representations of unstructured data created from large language models (LLMs) hosted in Amazon Bedrock, Amazon SageMaker, and other open source or proprietary ML services. This approach is useful in creating generative artificial intelligence (AI) applications, such as intuitive search, product recommendation, personalization, and chatbots using the Retrieval Augmented Generation (RAG) approach. For example, if your dataset contained individual documents for movies, you could semantically search for movies similar to Titanic based on shared context such as “boats”, “tragedy”, or “movies based on true stories” instead of simply matching keywords.

With vector search for Amazon DocumentDB, you can effectively search the database based on nuanced meaning and context without spending time and cost to manage a separate vector database infrastructure. You also benefit from the fully managed, scalable, secure, and highly available JSON-based document database that Amazon DocumentDB provides.

Getting started with vector search on Amazon DocumentDB
The vector search feature is available on your Amazon DocumentDB 5.0 instance-based clusters. To implement a vector search application, you generate vectors using embedding models for fields inside your document and store the vectors alongside your source data inside Amazon DocumentDB.

Next, you create a vector index on a vector field, which helps retrieve similar vectors and lets you search the Amazon DocumentDB database semantically. Finally, user-submitted queries are converted to vectors using the same embedding model, and semantically similar documents are retrieved and returned to the client.

Let’s look at how to implement a simple semantic search application using vector search on Amazon DocumentDB.

Step 1. Create vector embeddings using the Amazon Titan Embeddings model
Let’s use the Amazon Titan Embeddings model to create an embedding vector. Amazon Titan Embeddings model is available in Amazon Bedrock, a serverless generative AI service. You can easily access it using a single API and without managing any infrastructure.

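import json
import boto3

# Assumes AWS credentials and a Region with Amazon Bedrock access are configured.
bedrock_runtime = boto3.client('bedrock-runtime')

prompt = "I love a dog and cat."
response = bedrock_runtime.invoke_model(
    body=json.dumps({"inputText": prompt}),
    modelId='amazon.titan-embed-text-v1',
    accept='application/json',
    contentType='application/json'
)
# The response body is a stream; read and parse it to get the embedding.
response_body = json.loads(response['body'].read())
embedding = response_body.get('embedding')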

The returned vector embedding will look similar to this:

[0.82421875, -0.6953125, -0.115722656, 0.87890625, 0.05883789, -0.020385742, 0.32421875, -0.00078201294, -0.40234375, 0.44140625, ...]

Step 2. Insert vector embeddings and create a vector index
You can add generated vector embeddings using the insertMany( [{},...,{}] ) operation with a list of the documents that you want added to your collection in Amazon DocumentDB.

db.collection.insertMany([
    {sentence: "I love a dog and cat.", vectorField: [0.82421875, -0.6953125,...]},
    {sentence: "My dog is very cute.", vectorField: [0.05883789, -0.020385742,...]},
    {sentence: "I write with a pen.", vectorField: [-0.020385742, 0.32421875,...]},
  ...
]);

You can create a vector index using the createIndex command. Amazon DocumentDB performs an approximate nearest neighbor (ANN) search using the inverted file with flat compression (IVFFLAT) vector index. The feature supports three distance metrics: Euclidean, cosine, and inner product. We will use the Euclidean distance, a measure of the straight-line distance between two points in space. The smaller the Euclidean distance, the closer the vectors are to each other.

db.collection.createIndex(
   { vectorField: "vector" },
   { "name": "index name",
     "vectorOptions": {
        "dimensions": 100, // the number of vector data dimensions; must match your embedding length
        "similarity": "euclidean", // or "cosine" or "dotProduct"
        "lists": 100 // the number of clusters (inverted lists) in the index
      }
   }
);
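To get a feel for how the three distance metrics mentioned above differ, here is a short Python sketch that computes each one for a pair of small vectors. The function names are my own, and the two-dimensional vectors are toy values, not real embeddings.

```python
import math


def euclidean(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dot_product(a, b):
    """Inner product; larger means more similar and depends on vector length."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between the vectors; ignores vector length."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return dot_product(a, b) / norm if norm else 0.0


short, long_same_direction = [3.0, 4.0], [6.0, 8.0]
print(euclidean(short, long_same_direction))          # distance grows with length
print(dot_product(short, long_same_direction))        # grows with length too
print(cosine_similarity(short, long_same_direction))  # direction only
```

The two vectors point in the same direction but have different lengths, so cosine similarity treats them as identical while Euclidean distance does not. Which metric fits best depends on how your embedding model was trained.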

Step 3. Search vector embeddings from Amazon DocumentDB
You can now search for similar vectors within your documents using a new aggregation pipeline operator within $search. The example code to search “I like pets” is as follows:

db.collection.aggregate([
  {
    $search: {
      "vectorSearch": {
        "vector": [0.82421875, -0.6953125,...], // embedding of 'I like pets'
        "path": "vectorField", // the field that holds the vectors
        "k": 5,
        "similarity": "euclidean", // or "cosine" or "dotProduct"
        "probes": 1 // the number of clusters to inspect for each search
      }
    }
  }
]);

This returns search results such as “I love a dog and cat.” which is semantically similar.
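The lists and probes settings are easier to reason about with a toy model of an inverted file index. The Python sketch below is a simplified illustration, not the Amazon DocumentDB implementation: the centroids are supplied by hand (a real index would learn them, typically with k-means), and the function names and sample vectors are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_ivf_index(vectors, centroids):
    """Put each vector ID in the inverted list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for vid, vec in vectors.items():
        nearest = min(range(len(centroids)), key=lambda i: euclidean(vec, centroids[i]))
        lists[nearest].append(vid)
    return lists


def ivf_search(query, vectors, centroids, lists, k, probes):
    """Scan only the `probes` lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: euclidean(query, centroids[i]))
    candidates = [vid for i in order[:probes] for vid in lists[i]]
    candidates.sort(key=lambda vid: euclidean(query, vectors[vid]))
    return candidates[:k]


vectors = {"a": [0, 0], "b": [1, 0], "c": [10, 10], "d": [5, 4]}
centroids = [[0, 0], [10, 10]]  # two lists, chosen by hand for this example
lists = build_ivf_index(vectors, centroids)

query = [6, 6]
fast = ivf_search(query, vectors, centroids, lists, k=1, probes=1)
thorough = ivf_search(query, vectors, centroids, lists, k=1, probes=2)
```

With probes set to 1, the search inspects only the list around [10, 10] and returns "c", missing the true nearest neighbor "d", which sits in the other list. Raising probes to 2 finds "d" at the cost of scanning more vectors, which is the recall versus latency trade-off of an approximate search.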

To learn more, see the Amazon DocumentDB documentation. To see a more practical example, a semantic movie search with Amazon DocumentDB, find the Python source code and datasets in the GitHub repository.

Now available
Vector search for Amazon DocumentDB is now available at no additional cost to all customers using Amazon DocumentDB 5.0 instance-based clusters in all AWS Regions where Amazon DocumentDB is available. Standard compute, I/O, storage, and backup charges will apply as you store, index, and search vector embeddings on Amazon DocumentDB.

To learn more, see the Amazon DocumentDB documentation and send feedback to AWS re:Post for Amazon DocumentDB or through your usual AWS Support contacts.

Channy

Vector engine for Amazon OpenSearch Serverless is now available

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/vector-engine-for-amazon-opensearch-serverless-is-now-generally-available/

Today we are announcing the general availability of the vector engine for Amazon OpenSearch Serverless with new features. In July 2023, we introduced the preview release of the vector engine for Amazon OpenSearch Serverless, a simple, scalable, and high-performing similarity search capability. The vector engine makes it easy for you to build modern machine learning (ML) augmented search experiences and generative artificial intelligence (generative AI) applications without needing to manage the underlying vector database infrastructure.

You can now store, update, and search billions of vector embeddings with thousands of dimensions in milliseconds. The highly performant similarity search capability of the vector engine enables generative AI-powered applications to deliver accurate and reliable results with consistent millisecond-scale response times.

The vector engine also enables you to optimize and tune results with hybrid search by combining vector search and full-text search in the same query, removing the need to manage and maintain separate data stores or a complex application stack. The vector engine provides a secure, reliable, scalable, and enterprise-ready platform to cost-effectively build a prototype application and then seamlessly scale to production.

You can now get started in minutes with the vector engine by creating a specialized vector engine–based collection, which is a logical grouping of embeddings that works together to support a workload.

The vector engine uses OpenSearch Compute Units (OCUs), a unit of compute capacity, to ingest data and run similarity search queries. One OCU can handle up to 2 million vectors at 128 dimensions or 500,000 vectors at 768 dimensions at a 99 percent recall rate.
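As a back-of-the-envelope illustration of those capacity figures, the Python sketch below divides a vector count by the per-OCU capacity quoted above. It is a rough estimate under the assumption that capacity scales linearly. It only covers the two quoted dimension sizes, and it says nothing about minimum OCU counts or query load.

```python
import math

# Capacity figures quoted above: vectors one OCU can handle at 99 percent recall.
VECTORS_PER_OCU = {128: 2_000_000, 768: 500_000}

def ocus_for_vectors(num_vectors, dimensions):
    """Rough OCU count needed to hold num_vectors at a quoted dimension size."""
    if dimensions not in VECTORS_PER_OCU:
        raise ValueError("no quoted capacity for %d dimensions" % dimensions)
    return math.ceil(num_vectors / VECTORS_PER_OCU[dimensions])

print(ocus_for_vectors(10_000_000, 128))  # 5
print(ocus_for_vectors(10_000_000, 768))  # 20
```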

The vector engine built on OpenSearch Serverless is a highly available service by default. It requires a minimum of four OCUs (2 OCUs for the ingest, including primary and standby, and 2 OCUs for the search with two active replicas across Availability Zones) for the first collection in an account. All subsequent collections using the same AWS Key Management Service (AWS KMS) key can share those OCUs.

What’s new at GA?
Since the preview, the vector engine for Amazon OpenSearch Serverless has become one of the vector database options in the knowledge base of Amazon Bedrock for building generative AI applications that use Retrieval Augmented Generation (RAG).

Here are some new or improved features for this GA release:

Disable redundant replica (development and test focused) option
As we announced in our preview blog post, this feature eliminates the need to have redundant OCUs in another Availability Zone solely for availability purposes. A collection can be deployed with two OCUs – one for indexing and one for search. This cuts the costs in half compared to default deployment with redundant replicas. The reduced cost makes this configuration suitable and economical for development and testing workloads.

With this option, we will still provide durability guarantees since the vector engine persists all the data in Amazon S3, but single-AZ failures would impact your availability.

If you want to disable a redundant replica, uncheck Enable redundancy when creating a new vector search collection.

Fractional OCU for the development and test focused option
Support for fractional OCU billing for development and test focused workloads (that is, the no redundant replica option) reduces the floor price for a vector search collection. The vector engine will initially deploy at 0.5 OCU, providing the same capabilities at a lower scale, and will scale up to a full OCU and beyond to meet your workload demand. This option further reduces the monthly costs when you experiment with the vector engine.

Automatic scaling for a billion scale
With the vector engine’s seamless auto-scaling, you no longer have to reindex for scaling purposes. At preview, we supported about 20 million vector embeddings. With the general availability of the vector engine, we have raised the limits to support a billion vector scale.

Now available
The vector engine for Amazon OpenSearch Serverless is now available in all AWS Regions where Amazon OpenSearch Serverless is available.

To get started, you can refer to the following resources:

Give it a try and send feedback to AWS re:Post for Amazon OpenSearch Service or through your usual AWS support contacts.

Channy

Introducing Amazon SageMaker HyperPod, a purpose-built infrastructure for distributed training at scale

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/introducing-amazon-sagemaker-hyperpod-a-purpose-built-infrastructure-for-distributed-training-at-scale/

Today, we are introducing Amazon SageMaker HyperPod, which helps reduce the time to train foundation models (FMs) by providing a purpose-built infrastructure for distributed training at scale. You can now use SageMaker HyperPod to train FMs for weeks or even months while SageMaker actively monitors the cluster health and provides automated node and job resiliency by replacing faulty nodes and resuming model training from a checkpoint.

The clusters come preconfigured with SageMaker’s distributed training libraries that help you split your training data and model across all the nodes to process them in parallel and fully utilize the cluster’s compute and network infrastructure. You can further customize your training environment by installing additional frameworks, debugging tools, and optimization libraries.

Let me show you how to get started with SageMaker HyperPod. In the following demo, I create a SageMaker HyperPod cluster and show you how to train a Llama 2 7B model using the example shared in the AWS ML Training Reference Architectures GitHub repository.

Create and manage clusters
As the SageMaker HyperPod admin, you can create and manage clusters using the AWS Management Console or AWS Command Line Interface (AWS CLI). In the console, navigate to Amazon SageMaker, select Cluster management under HyperPod Clusters in the left menu, then choose Create a cluster.

Amazon SageMaker HyperPod Clusters

In the setup that follows, provide a cluster name and configure instance groups with your instance types of choice and the number of instances to allocate to each instance group.

Amazon SageMaker HyperPod

You also need to prepare and upload one or more lifecycle scripts to your Amazon Simple Storage Service (Amazon S3) bucket to run in each instance group during cluster creation. With lifecycle scripts, you can customize your cluster environment and install required libraries and packages. You can find example lifecycle scripts for SageMaker HyperPod in the GitHub repo.

Using the AWS CLI
You can also use the AWS CLI to create and manage clusters. For my demo, I specify my cluster configuration in a JSON file. I choose to create two instance groups, one for the cluster controller node(s) that I call “controller-group,” and one for the cluster worker nodes that I call “worker-group.” For the worker nodes that will perform model training, I specify Amazon EC2 Trn1 instances powered by AWS Trainium chips.

// demo-cluster.json
{
   "InstanceGroups": [
        {
            "InstanceGroupName": "controller-group",
            "InstanceType": "ml.m5.xlarge",
            "InstanceCount": 1,
            "LifeCycleConfig": {
                "SourceS3Uri": "s3://<your-s3-bucket>/<lifecycle-script-directory>/",
                "OnCreate": "on_create.sh"
            },
            "ExecutionRole": "arn:aws:iam::111122223333:role/my-role-for-cluster",
            "ThreadsPerCore": 1
        },
        {
            "InstanceGroupName": "worker-group",
            "InstanceType": "ml.trn1.32xlarge",
            "InstanceCount": 4,
            "LifeCycleConfig": {
                "SourceS3Uri": "s3://<your-s3-bucket>/<lifecycle-script-directory>/",
                "OnCreate": "on_create.sh"
            },
            "ExecutionRole": "arn:aws:iam::111122223333:role/my-role-for-cluster",
            "ThreadsPerCore": 1
        }
    ]
}

To create the cluster, I run the following AWS CLI command:

aws sagemaker create-cluster \
--cluster-name antje-demo-cluster \
--instance-groups file://demo-cluster.json

Upon creation, you can use aws sagemaker describe-cluster and aws sagemaker list-cluster-nodes to view your cluster and node details. Note down the cluster ID and instance ID of your controller node. You need that information to connect to your cluster.

You also have the option to attach a shared file system, such as Amazon FSx for Lustre. To use FSx for Lustre, you need to set up your cluster with an Amazon Virtual Private Cloud (Amazon VPC) configuration. Here’s an AWS CloudFormation template that shows how to create a SageMaker VPC and how to deploy FSx for Lustre.

Connect to your cluster
As a cluster user, you need to have access to the cluster provisioned by your cluster admin. With access permissions in place, you can connect to the cluster using SSH to schedule and run jobs. You can use the preinstalled AWS CLI plugin for AWS Systems Manager to connect to the controller node of your cluster.

For my demo, I run the following command, specifying my cluster ID and the instance ID of the controller node as the target.

aws ssm start-session \
--target sagemaker-cluster:ntg44z9os8pn_i-05a854e0d4358b59c \
--region us-west-2
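If you connect often, a small helper can assemble the target string for you. The sketch below assumes the target format is exactly what the command above shows (the cluster ID and instance ID joined by an underscore after the sagemaker-cluster: prefix). The function names are my own, so check the SageMaker HyperPod documentation for the authoritative format.

```python
def ssm_target(cluster_id, instance_id):
    """Build the Systems Manager target string in the format shown above."""
    return f"sagemaker-cluster:{cluster_id}_{instance_id}"

def start_session_command(cluster_id, instance_id, region):
    """Return the AWS CLI command as an argument list, ready for subprocess.run."""
    return [
        "aws", "ssm", "start-session",
        "--target", ssm_target(cluster_id, instance_id),
        "--region", region,
    ]

cmd = start_session_command("ntg44z9os8pn", "i-05a854e0d4358b59c", "us-west-2")
print(" ".join(cmd))
```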

Schedule and run jobs on the cluster using Slurm
At launch, SageMaker HyperPod supports Slurm for workload orchestration. Slurm is a popular open source cluster management and job scheduling system. You can install and set up Slurm through lifecycle scripts as part of the cluster creation. The example lifecycle scripts show how. Then, you can use the standard Slurm commands to schedule and launch jobs. Check out the Slurm Quick Start User Guide for architecture details and helpful commands.

For this demo, I’m using this example from the AWS ML Training Reference Architectures GitHub repo that shows how to train Llama 2 7B on Slurm with Trn1 instances. My cluster is already set up with Slurm, and I have an FSx for Lustre file system mounted.

Note
The Llama 2 model is governed by Meta. You can request access through the Meta request access page.

Set up the cluster environment
SageMaker HyperPod supports training in a range of environments, including Conda, venv, Docker, and enroot. Following the instructions in the README, I build my virtual environment aws_neuron_venv_pytorch and set up the torch_neuronx and neuronx-nemo-megatron libraries for training models on Trn1 instances.

Prepare model, tokenizer, and dataset
I follow the instructions to download the Llama 2 model and tokenizer and convert the model into the Hugging Face format. Then, I download and tokenize the RedPajama dataset. As a final preparation step, I pre-compile the Llama 2 model using ahead-of-time (AOT) compilation to speed up model training.

Launch jobs on the cluster
Now, I’m ready to start my model training job using the sbatch command.

sbatch --nodes 4 --auto-resume=1 run.slurm ./llama_7b.sh

You can use the squeue command to view the job queue. Once the training job is running, the SageMaker HyperPod resiliency features are automatically enabled. SageMaker HyperPod will automatically detect hardware failures, replace nodes as needed, and resume training from checkpoints if the auto-resume parameter is set, as shown in the preceding command.
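To illustrate why checkpoints matter for auto-resume, here is a toy Python sketch of a training loop that saves its progress periodically and picks up from the last checkpoint after a simulated failure. It is not the SageMaker HyperPod mechanism or the Llama 2 training script. The file layout and function names are hypothetical, and a real job would checkpoint model and optimizer state rather than a step counter.

```python
import json
import os
import tempfile

def load_checkpoint(path):
    """Return the last saved step, or 0 when no checkpoint exists yet."""
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["step"]

def save_checkpoint(path, step):
    """Write to a temporary file first so a crash never leaves a partial checkpoint."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step}, f)
    os.replace(tmp, path)

def train(path, total_steps, checkpoint_every, fail_at=None):
    """Toy training loop that resumes from the last checkpoint."""
    step = load_checkpoint(path)
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated hardware failure")
        step += 1  # stand-in for one real training step
        if step % checkpoint_every == 0:
            save_checkpoint(path, step)
    return step

ckpt = os.path.join(tempfile.mkdtemp(), "checkpoint.json")
try:
    train(ckpt, total_steps=100, checkpoint_every=10, fail_at=47)
except RuntimeError:
    pass  # a scheduler would replace the node and restart the job here

resumed_from = load_checkpoint(ckpt)
print(resumed_from)          # 40
print(train(ckpt, 100, 10))  # 100
```

The run fails at step 47, so only the seven steps since the checkpoint at step 40 are repeated. The checkpoint interval is the knob that trades storage and I/O overhead against lost work.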

You can view the output of the model training job in the following file:

tail -f slurm-run.slurm-<JOB_ID>.out

A sample output indicating that model training has started will look like this:

Epoch 0:  22%|██▏       | 4499/20101 [22:26:14<77:48:37, 17.95s/it, loss=2.43, v_num=5563, reduced_train_loss=2.470, gradient_norm=0.121, parameter_norm=1864.0, global_step=4512.0, consumed_samples=1.16e+6, iteration_time=16.40]
Epoch 0:  22%|██▏       | 4500/20101 [22:26:32<77:48:18, 17.95s/it, loss=2.43, v_num=5563, reduced_train_loss=2.470, gradient_norm=0.121, parameter_norm=1864.0, global_step=4512.0, consumed_samples=1.16e+6, iteration_time=16.40]
Epoch 0:  22%|██▏       | 4500/20101 [22:26:32<77:48:18, 17.95s/it, loss=2.44, v_num=5563, reduced_train_loss=2.450, gradient_norm=0.120, parameter_norm=1864.0, global_step=4512.0, consumed_samples=1.16e+6, iteration_time=16.50]
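If you want to track these numbers programmatically, a few lines of Python can pull the step counter and metrics out of a progress line. This is a minimal sketch that assumes the log format stays exactly as in the sample output above. The regular expressions are my own and may need adjusting for other training scripts.

```python
import re

# First line of the sample output above.
LINE = ("Epoch 0:  22%|██▏       | 4499/20101 [22:26:14<77:48:37, 17.95s/it, "
        "loss=2.43, v_num=5563, reduced_train_loss=2.470, gradient_norm=0.121, "
        "parameter_norm=1864.0, global_step=4512.0, consumed_samples=1.16e+6, "
        "iteration_time=16.40]")

def parse_progress(line):
    """Extract the step counter and the key=value metrics from one log line."""
    done, total = map(int, re.search(r"(\d+)/(\d+) \[", line).groups())
    metrics = {k: float(v) for k, v in re.findall(r"(\w+)=([-+.\deE]+)", line)}
    return done, total, metrics

done, total, metrics = parse_progress(LINE)
print(f"{done / total:.1%} complete, loss={metrics['loss']}")
# 22.4% complete, loss=2.43
```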

To further monitor and profile your model training jobs, you can use SageMaker hosted TensorBoard or any other tool of your choice.

Now available
SageMaker HyperPod is available today in AWS Regions US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm).

Learn more:

— Antje

PS: Writing a blog post at AWS is always a team effort, even when you see only one name under the post title. In this case, I want to thank Brad Doran, Justin Pirtle, Ben Snyder, Pierre-Yves Aquilanti, Keita Watanabe, and Verdi March for their generous help with example code and sharing their expertise in managing large-scale model training infrastructures, Slurm, and SageMaker HyperPod.

Amazon Titan Image Generator, Multimodal Embeddings, and Text models are now available in Amazon Bedrock

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/amazon-titan-image-generator-multimodal-embeddings-and-text-models-are-now-available-in-amazon-bedrock/

Today, we’re introducing two new Amazon Titan multimodal foundation models (FMs): Amazon Titan Image Generator (preview) and Amazon Titan Multimodal Embeddings. I’m also happy to share that Amazon Titan Text Lite and Amazon Titan Text Express are now generally available in Amazon Bedrock. You can now choose from three available Amazon Titan Text FMs, including Amazon Titan Text Embeddings.

Amazon Titan models incorporate 25 years of artificial intelligence (AI) and machine learning (ML) innovation at Amazon and offer a range of high-performing image, multimodal, and text model options through a fully managed API. AWS pre-trained these models on large datasets, making them powerful, general-purpose models built to support a variety of use cases while also supporting the responsible use of AI.

You can use the base models as is, or you can privately customize them with your own data. To enable access to Amazon Titan FMs, navigate to the Amazon Bedrock console and select Model access on the bottom left menu. On the model access overview page, choose Manage model access and enable access to the Amazon Titan FMs.

Amazon Titan Models

Let me give you a quick tour of the new models.

Amazon Titan Image Generator (preview)
As a content creator, you can now use Amazon Titan Image Generator to quickly create and refine images using English natural language prompts. This helps companies in advertising, e-commerce, and media and entertainment to create studio-quality, realistic images in large volumes and at low cost. The model makes it easy to iterate on image concepts by generating multiple image options based on the text descriptions. The model can understand complex prompts with multiple objects and generates relevant images. It is trained on high-quality, diverse data to create more accurate outputs, such as realistic images with inclusive attributes and limited distortions.

Titan Image Generator’s image editing features include the ability to automatically edit an image with a text prompt using a built-in segmentation model. The model supports inpainting with an image mask and outpainting to extend or change the background of an image. You can also configure image dimensions and specify the number of image variations you want the model to generate.

In addition, you can customize the model with proprietary data to generate images consistent with your brand guidelines or to generate images in a specific style, for example, by fine-tuning the model with images from a previous marketing campaign. Titan Image Generator also mitigates harmful content generation to support the responsible use of AI. All images generated by Amazon Titan contain an invisible watermark, by default, designed to help reduce the spread of misinformation by providing a discreet mechanism to identify AI-generated images.

Amazon Titan Image Generator in action
You can start using the model in the Amazon Bedrock console by submitting either an English natural language prompt to generate images or by uploading an image for editing. In the following example, I show you how to generate an image with Amazon Titan Image Generator using the AWS SDK for Python (Boto3).

First, let’s have a look at the configuration options for image generation that you can specify in the body of the inference request. For task type, I choose TEXT_IMAGE to create an image from a natural language prompt.

import boto3
import json

bedrock = boto3.client(service_name="bedrock")
bedrock_runtime = boto3.client(service_name="bedrock-runtime")

# ImageGenerationConfig Options:
#   numberOfImages: Number of images to be generated
#   quality: Quality of generated images, can be standard or premium
#   height: Height of output image(s)
#   width: Width of output image(s)
#   cfgScale: Scale for classifier-free guidance
#   seed: The seed to use for reproducibility  

body = json.dumps(
    {
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {
            "text": "green iguana",   # Required
#           "negativeText": "<text>"  # Optional
        },
        "imageGenerationConfig": {
            "numberOfImages": 1,   # Range: 1 to 5 
            "quality": "premium",  # Options: standard or premium
            "height": 768,         # Supported height list in the docs 
            "width": 1280,         # Supported width list in the docs
            "cfgScale": 7.5,       # Range: 1.0 (exclusive) to 10.0
            "seed": 42             # Range: 0 to 214783647
        }
    }
)
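If you build these request bodies in several places, you might wrap them in a small helper that checks the values before you call the API. The following sketch uses only the ranges listed in the comments above. The function name and defaults are hypothetical, and it does not validate the supported height and width combinations, which are listed in the documentation.

```python
import json

def build_text_image_body(text, negative_text=None, number_of_images=1,
                          quality="standard", height=768, width=1280,
                          cfg_scale=7.5, seed=42):
    """Build a TEXT_IMAGE request body, rejecting values outside the ranges noted above."""
    if not text:
        raise ValueError("text is required")
    if not 1 <= number_of_images <= 5:
        raise ValueError("numberOfImages must be between 1 and 5")
    if quality not in ("standard", "premium"):
        raise ValueError("quality must be 'standard' or 'premium'")
    if not 1.0 < cfg_scale <= 10.0:
        raise ValueError("cfgScale must be greater than 1.0 and at most 10.0")

    params = {"text": text}
    if negative_text:
        params["negativeText"] = negative_text

    return json.dumps({
        "taskType": "TEXT_IMAGE",
        "textToImageParams": params,
        "imageGenerationConfig": {
            "numberOfImages": number_of_images,
            "quality": quality,
            "height": height,
            "width": width,
            "cfgScale": cfg_scale,
            "seed": seed,
        },
    })

example_body = build_text_image_body("green iguana", quality="premium")
print(example_body)
```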

Next, specify the model ID for Amazon Titan Image Generator and use the InvokeModel API to send the inference request.

response = bedrock_runtime.invoke_model(
    body=body, 
    modelId="amazon.titan-image-generator-v1",
    accept="application/json", 
    contentType="application/json"
)

Then, parse the response and decode the base64-encoded image.

import base64
from PIL import Image
from io import BytesIO

response_body = json.loads(response.get("body").read())
images = [Image.open(BytesIO(base64.b64decode(base64_image))) for base64_image in response_body.get("images")]

for img in images:
    display(img)

Et voilà, here’s the green iguana (one of my favorite animals, actually):

Green iguana generated by Amazon Titan Image Generator
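The display function is only available in notebook environments. If you run the code as a plain script, one option is to write the decoded bytes to disk instead. The sketch below assumes the response body has the same images list of base64 strings as above and that the images are PNG files. The helper name and file naming are my own, and the demo payload is a stand-in rather than a real image.

```python
import base64
import os
import tempfile

def save_images(response_body, out_dir, prefix="image"):
    """Decode each base64 string in response_body["images"] and write it to disk."""
    paths = []
    for i, encoded in enumerate(response_body.get("images", [])):
        path = os.path.join(out_dir, f"{prefix}_{i}.png")  # assumes PNG output
        with open(path, "wb") as f:
            f.write(base64.b64decode(encoded))
        paths.append(path)
    return paths

# Stand-in for a parsed response; the payload here is not a real image.
fake_body = {"images": [base64.b64encode(b"not-a-real-png").decode("utf8")]}
paths = save_images(fake_body, tempfile.mkdtemp())
print(paths)
```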

To learn more about all the Amazon Titan Image Generator features, visit the Amazon Titan product page. (You’ll see more of the iguana over there.)

Next, let’s use this image with the new Amazon Titan Multimodal Embeddings model.

Amazon Titan Multimodal Embeddings
Amazon Titan Multimodal Embeddings helps you build more accurate and contextually relevant multimodal search and recommendation experiences for end users. Multimodal refers to a system’s ability to process and generate information using distinct types of data (modalities). With Titan Multimodal Embeddings, you can submit text, image, or a combination of the two as input.

The model converts images and short English text up to 128 tokens into embeddings, which capture semantic meaning and relationships between your data. You can also fine-tune the model on image-caption pairs. For example, you can combine text and images to describe company-specific manufacturing parts to understand and identify parts more effectively.

By default, Titan Multimodal Embeddings generates vectors of 1,024 dimensions, which you can use to build search experiences that offer a high degree of accuracy and speed. You can also configure smaller vector dimensions to optimize for speed and price performance. The model provides an asynchronous batch API, and the Amazon OpenSearch Service will soon offer a connector that adds Titan Multimodal Embeddings support for neural search.

Amazon Titan Multimodal Embeddings in action
For this demo, I create a combined image and text embedding. First, I base64-encode my image, and then I specify either inputText, inputImage, or both in the body of the inference request.

# Maximum image size supported is 2048 x 2048 pixels
with open("iguana.png", "rb") as image_file:
    input_image = base64.b64encode(image_file.read()).decode('utf8')

# You can specify either text or image or both
body = json.dumps(
    {
        "inputText": "Green iguana on tree branch",
        "inputImage": input_image
    }
)

Next, specify the model ID for Amazon Titan Multimodal Embeddings and use the InvokeModel API to send the inference request.

response = bedrock_runtime.invoke_model(
    body=body,
    modelId="amazon.titan-embed-image-v1",
    accept="application/json",
    contentType="application/json"
)

Let’s see the response.

response_body = json.loads(response.get("body").read())
print(response_body.get("embedding"))

[0.005087942, -0.004392853, -0.04764151, -0.024312444, 0.049922388, 0.0132532045, 0.014374298, 0.005523709, -0.015199458, 0.02182385, ...]

I truncated the output for brevity. The distance between multimodal embedding vectors, measured with metrics like cosine similarity or euclidean distance, shows how similar or different the represented information is across modalities. Smaller distances mean more similarity, while larger distances mean more dissimilarity.

As a next step, you could build an image database by storing and indexing the multimodal embeddings in a vector store or vector database. To implement text-to-image search, query the database with inputText. For image-to-image search, query the database with inputImage. For image+text-to-image search, query the database with both inputImage and inputText.
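Here is a minimal sketch of that idea, using a plain Python list as a stand-in for a vector database and ranking stored images by cosine similarity to the query embedding. The three-dimensional embeddings and file names are made up. In practice, each embedding would come from the model response shown above, and you would use a real vector store.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

class ImageIndex:
    """Tiny in-memory stand-in for a vector database."""

    def __init__(self):
        self._items = []  # list of (image_name, embedding) pairs

    def add(self, image_name, embedding):
        self._items.append((image_name, embedding))

    def search(self, query_embedding, k=1):
        """Return the names of the k images most similar to the query."""
        ranked = sorted(
            self._items,
            key=lambda item: cosine_similarity(query_embedding, item[1]),
            reverse=True,
        )
        return [name for name, _ in ranked[:k]]

index = ImageIndex()
index.add("iguana.png", [0.9, 0.1, 0.0])
index.add("parrot.png", [0.1, 0.9, 0.0])
index.add("truck.png", [0.0, 0.1, 0.9])

# Hypothetical embedding of a text query such as "green lizard".
query = [0.8, 0.2, 0.1]
print(index.search(query, k=2))  # ['iguana.png', 'parrot.png']
```

Because text and images share one embedding space, the same search method serves text-to-image, image-to-image, and combined queries. Only the way you produce the query embedding changes.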

Amazon Titan Text
Amazon Titan Text Lite and Amazon Titan Text Express are large language models (LLMs) that support a wide range of text-related tasks, including summarization, translation, and conversational chatbot systems. They can also generate code and are optimized to support popular programming languages and text formats like JSON and CSV.

Titan Text Express – Titan Text Express has a maximum context length of 8,192 tokens and is ideal for a wide range of tasks, such as open-ended text generation and conversational chat, and support within Retrieval Augmented Generation (RAG) workflows.

Titan Text Lite – Titan Text Lite has a maximum context length of 4,096 tokens and is a price-performant version that is ideal for English-language tasks. The model is highly customizable and can be fine-tuned for tasks such as article summarization and copywriting.

Amazon Titan Text in action
For this demo, I ask Titan Text to write an email to my team members suggesting they organize a live stream: “Compose a short email from Antje, Principal Developer Advocate, encouraging colleagues in the developer relations team to organize a live stream to demo our new Amazon Titan V1 models.”

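prompt = (
    "Compose a short email from Antje, Principal Developer Advocate, "
    "encouraging colleagues in the developer relations team to organize "
    "a live stream to demo our new Amazon Titan V1 models."
)

body = json.dumps({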
    "inputText": prompt, 
    "textGenerationConfig":{  
        "maxTokenCount":512,
        "stopSequences":[],
        "temperature":0,
        "topP":0.9
    }
})

Titan Text FMs support temperature and topP inference parameters to control the randomness and diversity of the response, as well as maxTokenCount and stopSequences to control the length of the response.
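To build some intuition for temperature and topP, the following Python sketch applies both to a toy next-token distribution. It shows the general technique (temperature-scaled softmax followed by nucleus filtering), not the internals of the Titan models. Treating a temperature of 0 as greedy selection is an assumption on my part, and the scores are made up.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    if temperature == 0:
        # Assumption: temperature 0 means greedy decoding (all mass on the top token).
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
print(softmax_with_temperature(logits, 1.0))
print(softmax_with_temperature(logits, 0.5))  # more weight on the first token
print(softmax_with_temperature(logits, 0))    # [1.0, 0.0, 0.0]
print(top_p_filter([0.5, 0.3, 0.15, 0.05], 0.9))  # keeps tokens 0, 1, and 2
```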

Next, choose the model ID for one of the Titan Text models and use the InvokeModel API to send the inference request.

response = bedrock_runtime.invoke_model(
    body=body,
    # Choose modelId
    # Titan Text Express: "amazon.titan-text-express-v1"
    # Titan Text Lite: "amazon.titan-text-lite-v1"
    modelId="amazon.titan-text-express-v1",
    accept="application/json",
    contentType="application/json"
)

Let’s have a look at the response.

response_body = json.loads(response.get('body').read())
outputText = response_body.get('results')[0].get('outputText')

text = outputText[outputText.index('\n')+1:]
email = text.strip()
print(email)

Subject: Demo our new Amazon Titan V1 models live!

Dear colleagues,

I hope this email finds you well. I am excited to announce that we have recently launched our new Amazon Titan V1 models, and I believe it would be a great opportunity for us to showcase their capabilities to the wider developer community.

I suggest that we organize a live stream to demo these models and discuss their features, benefits, and how they can help developers build innovative applications. This live stream could be hosted on our YouTube channel, Twitch, or any other platform that is suitable for our audience.

I believe that showcasing our new models will not only increase our visibility but also help us build stronger relationships with developers. It will also provide an opportunity for us to receive feedback and improve our products based on the developer’s needs.

If you are interested in organizing this live stream, please let me know. I am happy to provide any support or guidance you may need. Together, let’s make this live stream a success and showcase the power of Amazon Titan V1 models to the world!

Best regards,
Antje
Principal Developer Advocate

Nice. I could send this email right away!

Availability and pricing
Amazon Titan Text FMs are available today in AWS Regions US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore, Tokyo), and Europe (Frankfurt). Amazon Titan Multimodal Embeddings is available today in the AWS Regions US East (N. Virginia) and US West (Oregon). Amazon Titan Image Generator is available in public preview in the AWS Regions US East (N. Virginia) and US West (Oregon). For pricing details, see the Amazon Bedrock Pricing page.

Learn more

Go to the AWS Management Console to start building generative AI applications with Amazon Titan FMs on Amazon Bedrock today!

— Antje

Amazon Bedrock now provides access to Anthropic’s latest model, Claude 2.1

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/amazon-bedrock-now-provides-access-to-anthropics-latest-model-claude-2-1/

Today, we’re announcing the availability of Anthropic’s Claude 2.1 foundation model (FM) in Amazon Bedrock. Last week, Anthropic introduced its latest model, Claude 2.1, delivering key capabilities for enterprises such as an industry-leading 200,000 token context window (2x the context of Claude 2.0), reduced rates of hallucination, improved accuracy over long documents, system prompts, and a beta tool use feature for function calling and workflow orchestration.

With Claude 2.1’s availability in Amazon Bedrock, you can build enterprise-ready generative artificial intelligence (AI) applications using more honest and reliable AI systems from Anthropic. You can now use the Claude 2.1 model provided by Anthropic in the Amazon Bedrock console.

Here are some key highlights about the new Claude 2.1 model in Amazon Bedrock:

200,000 token context window – Enterprise applications demand larger context windows and more accurate outputs when working with long documents such as product guides, technical documentation, or financial or legal statements. Claude 2.1 supports 200,000 tokens, the equivalent of roughly 150,000 words or over 500 pages of documents. When uploading extensive information to Claude, you can summarize, perform Q&A, forecast trends, and compare and contrast multiple documents for drafting business plans and analyzing complex contracts.

Strong accuracy upgrades – Claude 2.1 has also made significant gains in honesty, with a 2x decrease in hallucination rates, 50 percent fewer hallucinations in open-ended conversation and document Q&A, a 30 percent reduction in incorrect answers, and a 3–4 times lower rate of mistakenly concluding that a document supports a particular claim compared to Claude 2.0. Claude increasingly knows what it doesn’t know and will more likely demur rather than hallucinate. With this improved accuracy, you can build more reliable, mission-critical applications for your customers and employees.

System prompts – Claude 2.1 now supports system prompts, a new feature that can improve Claude’s performance in a variety of ways, including greater character depth and role adherence in role-playing scenarios, particularly over longer conversations, as well as stricter adherence to guidelines, rules, and instructions. This represents a structural change, but not a content change from former ways of prompting Claude.

Tool use for function calling and workflow orchestration – Available as a beta feature, Claude 2.1 can now integrate with your existing internal processes, products, and APIs to build generative AI applications. Claude 2.1 accurately retrieves and processes data from additional knowledge sources as well as invokes functions for a given task.  Claude 2.1 can answer questions by searching databases using private APIs and a web search API, translate natural language requests into structured API calls, or connect to product datasets to make recommendations and help customers complete purchases. Access to this feature is currently limited to select early access partners, with plans for open access in the near future. If you are interested in gaining early access, please contact your AWS account team.
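To put the context window figures from the first highlight into practice, here is a rough Python sketch that estimates whether a document fits. It uses only the ratio implied above (200,000 tokens for roughly 150,000 words). Actual token counts depend on the tokenizer and the text, so treat the result as an approximation.

```python
import math

# Ratio implied by the figures above: 200,000 tokens is roughly 150,000 words.
CONTEXT_TOKENS = 200_000
WORDS_PER_TOKEN = 150_000 / 200_000  # 0.75

def estimated_tokens(word_count):
    """Approximate token count for a document of word_count words."""
    return math.ceil(word_count / WORDS_PER_TOKEN)

def fits_in_context(word_count, reserved_for_output=0):
    """Check the estimate against the context window, leaving room for the answer."""
    return estimated_tokens(word_count) + reserved_for_output <= CONTEXT_TOKENS

print(estimated_tokens(60_000))                             # 80000
print(fits_in_context(150_000))                             # True
print(fits_in_context(150_000, reserved_for_output=1_000))  # False
```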

To learn more about Claude 2.1’s features and capabilities, visit Anthropic Claude on Amazon Bedrock and the Amazon Bedrock documentation.

Claude 2.1 in action
To get started with Claude 2.1 in Amazon Bedrock, go to the Amazon Bedrock console. Choose Model access on the bottom left pane, then choose Manage model access on the top right side, submit your use case, and request model access to the Anthropic Claude model. It may take several minutes to get access to models. If you already have access to the Claude model, you don’t need to request access separately for Claude 2.1.

To test Claude 2.1 in chat mode, choose Text or Chat under Playgrounds in the left menu pane. Then select Anthropic and then Claude v2.1.

By choosing View API request, you can also access the model via code examples in the AWS Command Line Interface (AWS CLI) and AWS SDKs. Here is a sample of the AWS CLI command:

$ aws bedrock-runtime invoke-model \
      --model-id anthropic.claude-v2:1 \
      --body '{"prompt": "\n\nHuman: Tell me funny joke about outer space!\n\nAssistant:", "max_tokens_to_sample": 50}' \
      --cli-binary-format raw-in-base64-out \
      invoke-model-output.txt
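The same request in Python might look like the following sketch. The claude_body helper is my own and wraps the message in the Human/Assistant turn format. The invoke function uses the AWS SDK for Python (Boto3), which is imported lazily, so building the body works without the SDK. Calling invoke requires AWS credentials and model access.

```python
import json

def claude_body(user_message, max_tokens_to_sample=50):
    """Wrap a user message in the Human/Assistant turn format and serialize it."""
    prompt = f"\n\nHuman: {user_message}\n\nAssistant:"
    return json.dumps({"prompt": prompt, "max_tokens_to_sample": max_tokens_to_sample})

def invoke(body, model_id="anthropic.claude-v2:1"):
    """Send the request to Amazon Bedrock and return the completion text."""
    import boto3  # imported here so the helper above works without the SDK installed
    client = boto3.client("bedrock-runtime")
    response = client.invoke_model(
        body=body,
        modelId=model_id,
        accept="application/json",
        contentType="application/json",
    )
    return json.loads(response["body"].read())["completion"]

body = claude_body("Tell me funny joke about outer space!")
print(body)
# To send the request: print(invoke(body))
```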

You can use system prompt engineering techniques provided by the Claude 2.1 model, where you place your inputs and documents before any questions that reference or utilize that content. Inputs can be natural language text, structured documents, or code snippets using <document>, <papers>, <books>, or <code> tags, and so on. You can also use conversational text, such as chat history, and Retrieval Augmented Generation (RAG) results, such as chunked documents.

Here is a system prompt example for support agents to respond to customer questions based on corporate documents.

Here are some documents for you to reference for your task:
<documents>
 <document index="1">
  <document_content>
  (the text content of the document - could be a passage, web page, article, etc)
  </document_content>
 </document>
 <document index="2">
  <source>https://mycompany.repository/userguide/what-is-it.html</source>
 </document>
 <document index="3">
  <source>https://mycompany.repository/docs/techspec.pdf</source>
 </document>
...
</documents>

You are Larry, and you are a customer advisor with deep knowledge of your company's products. Larry has a great deal of patience with his customers, even when they say nonsense or are sarcastic. Larry's answers are polite but sometimes funny. However, he only answers questions about the company's products and doesn't know much about other questions. Use the provided documentation to answer user questions.

Human: Your product is making a weird stuttering sound when I operate it. What might be the problem?
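
If you assemble prompts like this one in code, a small helper keeps the tags balanced and the ordering consistent. The following Python sketch is one possible way to do it. The function names and the dictionary keys ("content" and "source") are assumptions made for illustration, and the text is inserted as-is without XML escaping.

```python
def build_documents_block(documents):
    """Render a list of dicts with optional "source" and "content" keys."""
    lines = ["<documents>"]
    for index, doc in enumerate(documents, start=1):
        lines.append(f' <document index="{index}">')
        if doc.get("source"):
            lines.append(f"  <source>{doc['source']}</source>")
        if doc.get("content"):
            lines.append("  <document_content>")
            lines.append(f"  {doc['content']}")
            lines.append("  </document_content>")
        lines.append(" </document>")
    lines.append("</documents>")
    return "\n".join(lines)


def build_prompt(documents, persona, question):
    """Put the documents first, then the persona, then the user's question."""
    return (
        "Here are some documents for you to reference for your task:\n"
        f"{build_documents_block(documents)}\n\n"
        f"{persona}\n\n"
        f"Human: {question}\n\nAssistant:"
    )
```

The resulting string keeps the reference material ahead of the question, following the ordering described above.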

To learn more about prompt engineering on Amazon Bedrock, see the Prompt engineering guidelines included in the Amazon Bedrock documentation. You can learn general prompt techniques, templates, and examples for Amazon Bedrock text models, including Claude.

Now available
Claude 2.1 is available today in the US East (N. Virginia) and US West (Oregon) Regions.

You only pay for what you use, with no time-based term commitments for on-demand mode. For text generation models, you are charged for every input token processed and every output token generated. Or you can choose the provisioned throughput mode to meet your application’s performance requirements in exchange for a time-based term commitment. To learn more, see Amazon Bedrock Pricing.
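
To make the on-demand arithmetic concrete, here is a short Python sketch. It assumes prices are quoted per 1,000 tokens. The two rates are placeholders chosen for illustration only, not actual Amazon Bedrock prices, so check the pricing page for real figures.

```python
def estimate_on_demand_cost(input_tokens, output_tokens, price_in_per_1k, price_out_per_1k):
    """Return the charge for one request, in the currency of the given prices."""
    input_cost = input_tokens / 1000 * price_in_per_1k
    output_cost = output_tokens / 1000 * price_out_per_1k
    return input_cost + output_cost


# Placeholder rates for illustration only; these are NOT real prices.
HYPOTHETICAL_PRICE_IN = 0.01   # per 1,000 input tokens
HYPOTHETICAL_PRICE_OUT = 0.03  # per 1,000 output tokens

cost = estimate_on_demand_cost(2000, 500, HYPOTHETICAL_PRICE_IN, HYPOTHETICAL_PRICE_OUT)
print(f"Estimated cost for one request: {cost:.4f}")
```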

Give Anthropic Claude 2.1 a try in the Amazon Bedrock console today and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

Channy

New generative AI capabilities for Amazon DataZone to further simplify data cataloging and discovery (preview)

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-generative-ai-capabilities-for-amazon-datazone-to-further-simplify-data-cataloging-and-discovery-preview/

Today, we are announcing a preview of an automation feature backed by generative artificial intelligence (AI) for Amazon DataZone that will dramatically decrease the amount of time needed to provide context for organizational data. The new feature can automate the traditionally labor-intensive process of data cataloging. Powered by the large language models (LLMs) of Amazon Bedrock, it generates detailed descriptions of data assets and their schemas, and suggests analytical use cases. You can generate a comprehensive business context with a single click.

We heard from customers that data consumers such as data analysts, scientists, and engineers in organizations struggle to understand the data’s relevance with little metadata. As a result, they either spend more time interpreting the data, or they return to data producers with continued questions. So, data producers such as data owners, engineers, and analysts who own the data and make it available for consumers need to manually enter detailed context for higher-priority data to make data shareable and discoverable. This is time-consuming and the number one problem customers have when trying to collate their data in a system for self-service by consumers.

When we launched the general availability of Amazon DataZone in October 2023, we introduced the first feature that brings generative AI capabilities to automate the generation of the table name and column names of a business catalog asset. In the data portal of Amazon DataZone, the green brain icon indicates automatically generated metadata suggestions. You could accept, edit, or reject each suggestion recommended by Amazon DataZone.

What’s new with today’s preview announcement?
Now, in addition to column and table names, you can automatically generate more detailed descriptions of the table and schema, as well as suggested uses.

In the Business Metadata tab in the data portal, when you choose Generate summary, new content will be generated to explain the table and its metadata.

You can also accept, edit, and reject this recommendation.

When you choose the Schema tab, you can also see new Description recommendations as well as the Name. You can review generated metadata and choose to accept, edit, or reject the recommendation.

This new feature will enhance data discoverability and reduce back-and-forth communication between data consumers and producers. You will have a richer search experience based on extensive data insights in the future.

Join the preview
The new metadata generation ability is now available in preview in the AWS US East (N. Virginia) and US West (Oregon) Regions. With this new generative AI capability, you can reduce time-to-insight by accelerating data cataloging and boosting data discovery. To learn more, visit Amazon DataZone: Automate Data Discovery.

Give it a try and send feedback to AWS re:Post for Amazon DataZone or through your usual AWS Support contacts.

Channy

Amazon DynamoDB zero-ETL integration with Amazon OpenSearch Service is now available

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/amazon-dynamodb-zero-etl-integration-with-amazon-opensearch-service-is-now-generally-available/

Today, we are announcing the general availability of Amazon DynamoDB zero-ETL integration with Amazon OpenSearch Service, which lets you perform a search on your DynamoDB data by automatically replicating and transforming it without custom code or infrastructure. This zero-ETL integration reduces the operational burden and cost involved in writing code for a data pipeline architecture, keeping the data in sync, and updating code with frequent application changes, enabling you to focus on your application.

With this zero-ETL integration, Amazon DynamoDB customers can now use the powerful search features of Amazon OpenSearch Service, such as full-text search, fuzzy search, auto-complete, and vector search for machine learning (ML) capabilities to offer new experiences that boost user engagement and improve satisfaction with their applications.

This zero-ETL integration uses Amazon OpenSearch Ingestion to synchronize the data between Amazon DynamoDB and Amazon OpenSearch Service. You choose the DynamoDB table whose data needs to be synchronized and Amazon OpenSearch Ingestion synchronizes the data to an Amazon OpenSearch managed cluster or serverless collection within seconds of it being available.

You can also specify index mapping templates to ensure that your Amazon DynamoDB fields are mapped to the correct fields in your Amazon OpenSearch Service indexes. Also, you can synchronize data from multiple DynamoDB tables into one Amazon OpenSearch Service managed cluster or serverless collection to offer holistic insights across several applications.

Getting started with this zero-ETL integration
With a few clicks, you can synchronize data from DynamoDB to OpenSearch Service. To create an integration between DynamoDB and OpenSearch Service, choose the Integrations menu in the left pane of the DynamoDB console and the DynamoDB table whose data you want to synchronize.

You must turn on point-in-time recovery (PITR) and the DynamoDB Streams feature. This feature allows you to capture item-level changes in your table and push the changes to a stream. Choose Turn on for PITR and enable DynamoDB Streams in the Exports and streams tab.
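
The same two prerequisites can also be turned on from code. Here is a minimal boto3 sketch, assuming the SDK is installed and your credentials may modify the table. The helper names are illustrative, and the NEW_AND_OLD_IMAGES stream view type is an assumption that you should check against the integration requirements.

```python
def pitr_request(table_name):
    """Arguments for UpdateContinuousBackups that turn on PITR."""
    return {
        "TableName": table_name,
        "PointInTimeRecoverySpecification": {"PointInTimeRecoveryEnabled": True},
    }


def stream_request(table_name, view_type="NEW_AND_OLD_IMAGES"):
    """Arguments for UpdateTable that enable a DynamoDB stream."""
    return {
        "TableName": table_name,
        "StreamSpecification": {"StreamEnabled": True, "StreamViewType": view_type},
    }


def enable_prerequisites(table_name, region="us-east-1"):
    """Turn on PITR and DynamoDB Streams for a single table."""
    # Imported here so that the request builders work without the SDK installed.
    import boto3

    dynamodb = boto3.client("dynamodb", region_name=region)
    dynamodb.update_continuous_backups(**pitr_request(table_name))
    # This call fails if the table already has a stream enabled.
    dynamodb.update_table(**stream_request(table_name))
```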

After turning on PITR and DynamoDB Streams, choose Create to set up an OpenSearch Ingestion pipeline in your account that replicates the data to an OpenSearch Service managed domain.

In the first step, enter a unique pipeline name and set up pipeline capacity and compute resources to automatically scale your pipeline based on the current ingestion workload.

Now you can configure the pre-defined pipeline configuration in YAML file format. You can browse resources to look up and paste information to build the pipeline configuration. This pipeline is a combination of a source part from DynamoDB settings and a sink part for OpenSearch Service.

You must set multiple IAM roles (sts_role_arn) with the necessary permissions to read data from the DynamoDB table and write to an OpenSearch domain. This role is then assumed by OpenSearch Ingestion pipelines to ensure that the right security posture is always maintained when moving the data from source to destination. To learn more, see Setting up roles and users in Amazon OpenSearch Ingestion in the AWS documentation.

After entering all required values, you can validate the pipeline configuration to ensure that your configuration is valid. To learn more, see Creating Amazon OpenSearch Ingestion pipelines in the AWS documentation.

Take a few minutes to set up the OpenSearch Ingestion pipeline, and you can see your integration is completed in the DynamoDB table.

Now you can search synchronized items in the OpenSearch Dashboards.
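
As an example of the kind of search this enables, the following Python sketch builds a fuzzy full-text query in the OpenSearch query DSL. The field name "title" is a hypothetical attribute of the replicated items, and the helper name is illustrative.

```python
import json


def fuzzy_match_query(field, text, size=10):
    """Build a full-text match query that tolerates small typos."""
    return {
        "size": size,
        "query": {
            "match": {
                field: {"query": text, "fuzziness": "AUTO"}
            }
        },
    }


# "title" is a hypothetical attribute name; replace it with one from your table.
print(json.dumps(fuzzy_match_query("title", "wireles headphnes"), indent=2))
```

You can paste the printed JSON into the Dev Tools console in OpenSearch Dashboards as the body of a _search request against your index.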

Things to know
Here are a couple of things that you should know about this feature:

  • Custom schema – You can specify your custom data schema along with the index mappings used by OpenSearch Ingestion when writing data from Amazon DynamoDB to OpenSearch Service. This experience is added to the console within Amazon DynamoDB so that you have full control over the format of indices that are created on OpenSearch Service.
  • Pricing – There will be no additional cost to use this feature apart from the cost of the existing underlying components. Note that Amazon OpenSearch Ingestion charges OpenSearch Compute Units (OCUs) which will be used to replicate data between Amazon DynamoDB and Amazon OpenSearch Service. Furthermore, this feature uses Amazon DynamoDB streams for the change data capture (CDC) and you will incur the standard costs for Amazon DynamoDB Streams.
  • Monitoring – You can monitor the state of the pipelines by checking the status of the integration on the DynamoDB console or using the OpenSearch Ingestion dashboard. Additionally, you can use Amazon CloudWatch to provide real-time metrics and logs, which lets you set up alerts in case of a breach of user-defined thresholds.

Now available
Amazon DynamoDB zero-ETL integration with Amazon OpenSearch Service is now generally available in all AWS Regions where OpenSearch Ingestion is available today.

Channy

New Amazon Q in QuickSight uses generative AI assistance for quicker, easier data insights (preview)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/new-amazon-q-in-quicksight-uses-generative-ai-assistance-for-quicker-easier-data-insights-preview/

Today, I’m happy to share that Amazon Q in QuickSight is available for preview. Now you can experience the Generative BI capabilities in Amazon QuickSight announced on July 26, as well as two additional capabilities for business users.

Turning insights into impact faster with Amazon Q in QuickSight
With this announcement, business users can now generate compelling sharable stories examining their data, see executive summaries of dashboards surfacing key insights from data in seconds, and confidently answer questions about data not answered by dashboards and reports with a reimagined Q&A experience.

Before we go deeper into each capability, here’s a quick summary:

  • Stories – This is a new and visually compelling way to present and share insights. Stories can be automatically generated in minutes using natural language prompts, customized using point-and-click options, and shared securely with others.
  • Executive summaries – With this new capability, Amazon Q helps you to understand key highlights in your dashboard.
  • Data Q&A – This capability provides a new and easy-to-use natural-language Q&A experience to help you get answers for questions beyond what is available in existing dashboards and reports.

To get started, you need to enable Preview Q Generative Capabilities in Preview manager.

Once enabled, you’re ready to experience what Amazon Q in QuickSight brings for business users and business analysts building dashboards.

Stories automatically builds formatted narratives
Business users often need to share their findings with others to inform team decisions; this has historically involved taking data out of the business intelligence (BI) system. Stories are a new feature enabling business users to create beautifully formatted narratives that describe data and include visuals, images, and text in document or slide format. Stories can easily be shared with others directly within QuickSight.

Now, business users can use natural language to ask Amazon Q to build a story about their data by starting from the Amazon Q Build menu on an Amazon QuickSight dashboard. Amazon Q extracts data insights and statistics from selected visuals, then uses large language models (LLMs) to build a story in multiple parts, examining what the data may mean to the business and suggesting ideas to achieve specific goals.

For example, a sales manager can ask, “Build me a story about overall sales performance trends. Break down data by product and region. Suggest some strategies for improving sales.” Or, “Write a marketing strategy that uses regional sales trends to uncover opportunities that increase revenue.” Amazon Q will build a story exploring specific data insights, including strategies to grow sales.

Once a story is built, business users get point-and-click tools, augmented with artificial intelligence (AI)-driven rewriting capabilities, to customize it in a rich text editor by refining the message, adding ideas, and highlighting important details.

Stories can also be easily and securely shared with other QuickSight users by email.

Executive summaries deliver a quick snapshot of important information
Executive summaries are now available with a single click using the Amazon Q Build menu in Amazon QuickSight. Amazon QuickSight automatically determines interesting facts and statistics, then uses LLMs to write about interesting trends.

This new capability saves time in examining detailed dashboards by providing an at-a-glance view of key insights described using natural language.

The executive summaries feature provides two advantages. First, it helps business users generate all the key insights without the need to browse through tens of visuals on the dashboard and understand changes from each. Secondly, it enables readers to find key insights based on information in the context of dashboards and reports with minimum effort.

New data Q&A experience
Once an interesting insight is discovered, business users frequently need to dig in to understand data more deeply than they can from existing dashboards and reports. Natural language query (NLQ) solutions designed to solve this problem frequently expect that users already know what fields may exist or how they should be combined to answer business questions. However, business users aren’t always experts in underlying data schemas, and their questions frequently come in more general terms, like “How were sales last week in NY?” Or, “What’s our top campaign?”

The new Q&A experience, accessed within dashboards and reports, helps business users confidently answer questions about data. It includes AI-suggested questions, a profile of what data can be asked about, and automatically generated multi-visual answers with narrative summaries explaining data context.

Furthermore, Amazon Q brings the ability to answer vague questions and offer alternatives for specific data. For example, customers can ask a vague question, such as “Top products,” and Amazon Q will provide an answer that breaks down products by sales and offers alternatives for products by customer count and products by profit. Amazon Q explains answer context in a narrative summarizing total sales, number of products, and picking out the sales for the top product.

Customers can search for specific data values, even with a single word such as the product name “contactmatcher.” Amazon Q returns a complete set of data related to that product and provides a natural language breakdown explaining important insights like total units sold. Specific visuals from the answers can also be added to a pinboard for easy future access.

Watch the demo
To see these new capabilities in action, have a look at the demo.

Things to Know
Here are a few additional things that you need to know:

Join the preview
Amazon Q in QuickSight product page

Happy building!
— Donnie

Introducing Amazon Q, a new generative AI-powered assistant (preview)

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/introducing-amazon-q-a-new-generative-ai-powered-assistant-preview/

Today, we are announcing Amazon Q, a new generative artificial intelligence (AI)-powered assistant designed for work that can be tailored to your business. You can use Amazon Q to have conversations, solve problems, generate content, gain insights, and take action by connecting to your company’s information repositories, code, data, and enterprise systems. Amazon Q provides immediate, relevant information and advice to employees to streamline tasks, accelerate decision-making and problem-solving, and help spark creativity and innovation at work.

Amazon Q offers user-based plans, so you get features, pricing, and options tailored to how you use the product. Amazon Q can adapt its interactions to each individual user based on the existing identities, roles, and permissions of your business. AWS never uses customers’ content from Amazon Q to train the underlying models. In other words, your company information remains secure and private.

In this post, I’ll give you a quick tour of how you can use Amazon Q for general business use. 

Amazon Q is your business expert
Let’s look at a few examples of how Amazon Q can help business users complete tasks using simple natural language prompts. As a marketing manager, you could ask Amazon Q to transform a press release into a blog post, create a summary of the press release, or create an email draft based on the provided release. Amazon Q searches through your company content, which can include internal style guides, for example, to provide a response appropriate to your company’s brand standards. Then, you could ask Amazon Q to generate tailored social media prompts to promote your story through each of your social media channels. Later, you can ask Amazon Q to analyze the results of your campaign and summarize them for leadership reviews.

In the following example, I deployed Amazon Q with access to my AWS News Blog posts from 2023 and called the assistant “AWS Blog Expert.”

Coming back to my previous example, let’s assume I’m a marketing manager and want Amazon Q to help me create social media posts for recent company blog posts.

I enter the following prompt: “Summarize the key insights from Antje’s recent AWS Weekly Roundup posts and craft a compelling social media post that not only highlights the most important points but also encourages engagement. Consider our target audience and aim for a tone that aligns with our brand identity. The social media post should be concise, informative, and enticing to encourage readers to click through and read the full articles. Please ensure the content is shareable and includes relevant hashtags for maximum visibility.”

Behind the scenes, Amazon Q searches the documents in connected data sources and creates a relevant and detailed suggestion for a social media post based on my blog posts. Amazon Q also tells me which document was used to generate the answer. In this case, it is a PDF file of the blog posts in question.

As an administrator, you can define the context for responses, restrict irrelevant topics, and configure whether to respond only using trusted company information or complement responses with knowledge from the underlying model. Restricting responses to trusted company information helps mitigate hallucinations, a common phenomenon where the underlying model generates responses that sound plausible but are based on misinterpreted or nonexistent data.

Amazon Q provides fine-grained access controls that restrict responses to only using data or acting based on the employee’s level of access and provides citations and references to the original sources for fact-checking and traceability. You can choose among 40+ built-in connectors for popular data sources and enterprise systems, including Amazon S3, Google Drive, Microsoft SharePoint, Salesforce, ServiceNow, and Slack.

How to tailor Amazon Q to your business
To tailor Amazon Q to your business, navigate to Amazon Q in the console, select Applications in the left menu, and choose Create application.

This starts the following workflow.

Step 1. Create application. Provide an application name and create a new or select an existing AWS Identity and Access Management (IAM) service role that Amazon Q is allowed to assume. I call my application AWS-Blog-Expert. Then, choose Create.

Step 2. Select retriever. A retriever pulls data from the index in real time during a conversation. You can choose between two options: use the Amazon Q native retriever or use an existing Amazon Kendra retriever. The native retriever can connect to the Amazon Q supported data sources. If you already use Amazon Kendra, you can select the existing Amazon Kendra retriever to connect the associated data sources to your Amazon Q application. I select the native retriever option. Then, choose Next.

Step 3. Connect data sources. Amazon Q comes with built-in connectors for popular data sources and enterprise systems. For this demo, I choose Amazon S3 and configure the data source by pointing to my S3 bucket with the PDFs of my blog posts.

Once the data source sync is successfully complete and the retriever shows the accurate document count, you can preview the web experience and start a conversation. Note that the data source sync can take from a few minutes to a few hours, depending on the amount and size of data to index.

You can also connect plugins that manage access to enterprise systems, including ServiceNow, Jira, Salesforce, and Zendesk. Plugins enable Amazon Q to perform user-requested tasks, such as creating support tickets or analyzing sales forecasts.

Preview and deploy web experience
In the application overview, choose Preview web experience. This opens the web experience with the conversational interface to chat with the tailored Amazon Q AWS Blog Expert. In the final step, you deploy the Amazon Q web experience. You can integrate your SAML 2.0–compliant external identity provider (IdP) using IAM. Amazon Q can work with any IdP that’s compliant with SAML 2.0. Amazon Q uses service-initiated single sign-on (SSO) to authenticate users.

Join the preview
Amazon Q is available today in preview in AWS Regions US East (N. Virginia) and US West (Oregon). Visit the product page to learn how Amazon Q can become your expert in your business.

Also, check out the Amazon Q Slack Gateway GitHub repository that shows how to make Amazon Q available to users as a Slack Bot application.

Learn more

— Antje

Upgrade your Java applications with Amazon Q Code Transformation (preview)

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/upgrade-your-java-applications-with-amazon-q-code-transformation-preview/

As our applications age, it takes more and more effort just to keep them secure and running smoothly. Developers managing the upgrades must spend time relearning the intricacies and nuances of breaking changes and performance optimizations others have already discovered in past upgrades. As a result, it’s difficult to balance the focus between new features and essential maintenance work.

Today, we are introducing in preview Amazon Q Code Transformation. This new capability simplifies upgrading and modernizing existing application code using Amazon Q, a new type of assistant powered by generative artificial intelligence (AI). Amazon Q is specifically designed for work and can be tailored to your business.

Amazon Q Code Transformation can now upgrade Java applications from versions 8 and 11 to version 17, a Java Long-Term Support (LTS) release, and it will soon be able to transform Windows-based .NET Framework applications to cross-platform .NET.

Previously, developers could spend two to three days upgrading each application. Our internal testing shows that the transformation capability can upgrade an application in minutes compared to the days or weeks typically required for manual upgrades, freeing up time to focus on new business requirements. For example, an internal Amazon team of five people successfully upgraded one thousand production applications from Java 8 to 17 in 2 days. It took, on average, 10 minutes to upgrade applications, and the longest one took less than an hour.

Amazon Q Code Transformation automatically analyzes the existing code, generates a transformation plan, and completes the transformation tasks suggested by the plan. While doing so, it identifies and updates package dependencies and refactors deprecated and inefficient code components, switching to new language frameworks and incorporating security best practices. Once complete, you can review the transformed code, complete with build and test results, before accepting the changes.

In this way, you can keep applications updated and supported in just a few steps, gain performance benefits, and remove vulnerabilities from using unsupported versions, freeing up time to focus on new business requirements. Let’s see how this works in practice.

Upgrading a Java application from version 8 to 17
I am using IntelliJ IDEA in this walkthrough (the same is available for Visual Studio Code). To have Amazon Q Code Transformation in my IDE, I install the latest version of the AWS Toolkit for IntelliJ IDEA and sign in using the AWS IAM Identity Center credentials provided by my organization. Note that to access Amazon Q Code Transformation, the CodeWhisperer administrator needs to explicitly give access to Amazon Q features in the profile used by the organization.

I open an old project that I never had the time to update to a more recent version of Java. The project is using Apache Maven to manage the build. The project object model (POM) file (pom.xml), an XML representation of the project, is in the root directory.

First, in the project settings, I check that the project is configured to use the correct SDK version (1.8 in this case). I choose AWS Toolkit on the left pane and then the Amazon Q + CodeWhisperer tab. In the Amazon Q (Preview) section, I choose Transform.

IDE screenshot.

This opens a dialog where I check that the correct Maven module is selected for the upgrade before proceeding with the transformation.

IDE screenshot.

I follow the progress in the Transformation Hub window. The upgrade completes in a few minutes for my small application, while larger ones might take more than an hour to complete.

The end-to-end application upgrade consists of three steps:

  1. Identifying and analyzing the application – The code is copied to a managed environment in the cloud where the build process is set up based on the instructions in the repository. At this stage, the components to be upgraded are identified.
  2. Creating a transformation plan – The code is analyzed to create a transformation plan that lists the steps that Amazon Q Code Transformation will take to upgrade the code, including updating dependencies, building the upgraded code, and then iteratively fixing any build errors encountered during the upgrade.
  3. Code generation, build testing, and finalization – The transformation plan is followed iteratively to update existing code and configuration files, generate new files where needed, perform build validation using the tests provided with the code, and fix issues identified in failed builds.

IDE screenshot.

After a few minutes, the transformation terminates successfully. From here, I can open the plan and a summary of the transformation. I choose View diff to see the proposed changes. In the Apply Patch dialog, I see a recap of the files that have been added, modified, or deleted.

IDE screenshot.

First, I select the pom.xml file and then choose Show Difference (the icon with the left/right arrows) to have a side-by-side view of the current code in the project and the proposed changes. For example, I see that the version of one of the dependencies (Project Lombok) has been increased for compatibility with the target Java version.

IDE screenshot.

In the Java file, the annotations used by the upgraded dependency have been updated. With the new version, @With has been promoted, and @Wither (which was experimental) deprecated. These changes are reflected in the import statements.

IDE screenshot.

There is also a summary file that I keep in the code repo to quickly look up the changes made to complete the upgrade.

I spend some time reviewing the files. Then, I choose OK to accept all changes.

Now the patch has been successfully applied, and the proposed changes merged with the code. I commit changes to my repo and move on to focus on business-critical changes that have been waiting for the migration to be completed.

Things to know
The preview of Amazon Q Code Transformation is available today for customers on the Amazon CodeWhisperer Professional Tier in the AWS Toolkit for IntelliJ IDEA and the AWS Toolkit for Visual Studio Code. To use Amazon Q Code Transformation, the CodeWhisperer administrator needs to give access to the profile used by the organization.

There is no additional cost for using Amazon Q Code Transformation during the preview. You can upgrade Java 8 and 11 applications that are built using Apache Maven to Java version 17. The project must have the POM file (pom.xml) in the root directory. We’ll soon add the option to transform Windows-based .NET Framework applications to cross-platform .NET and help accelerate migrations to Linux.
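
Before starting a transformation, you may want to confirm that a project meets these requirements. Here is a rough Python sketch of such a check. The function names and the list of property names are assumptions for illustration. The check only looks at the properties section of the root pom.xml, so it will miss versions set elsewhere, such as in a plugin configuration or a parent POM.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Property names commonly used to declare the Java version (an assumption).
PROPERTY_NAMES = ("maven.compiler.release", "maven.compiler.source", "java.version")
SUPPORTED_SOURCE_VERSIONS = {"1.8", "8", "11"}


def _local_name(tag):
    """Strip the XML namespace, if any, from an element tag."""
    return tag.split("}", 1)[-1]


def declared_java_version(project_root):
    """Return the Java version declared in the root pom.xml, or None."""
    pom_path = Path(project_root) / "pom.xml"
    if not pom_path.is_file():
        raise FileNotFoundError(f"No pom.xml found in {project_root}")
    root = ET.parse(pom_path).getroot()
    for child in root:
        if _local_name(child.tag) != "properties":
            continue
        properties = {_local_name(p.tag): (p.text or "").strip() for p in child}
        for name in PROPERTY_NAMES:
            if properties.get(name):
                return properties[name]
    return None


def looks_upgradable(project_root):
    """True when the declared version is Java 8 or Java 11."""
    return declared_java_version(project_root) in SUPPORTED_SOURCE_VERSIONS
```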

Once a transformation job is complete, you can use a diff view to verify and accept the proposed changes. The final transformation summary provides details of the dependencies updated and code files changed by Amazon Q Code Transformation. It also provides details of any build failures encountered in the final build of the upgraded code that you can use to fix the issues and complete the upgrade.

Combining Amazon’s long-term investments in automated reasoning and static code analysis with the power of generative AI, Amazon Q Code Transformation incorporates foundation models that we found to be essential for context-specific code transformations that often require updating a long tail of Java libraries with backward-incompatible changes.

In addition to generative AI-powered code transformations built by AWS, Amazon Q Code Transformation uses parts of OpenRewrite to further accelerate Java upgrades for customers. At AWS, many of our services are built with open source components, and promoting the long-term sustainability of these communities is critical to us and our customers. That is why it’s important for us to contribute back to communities like OpenRewrite, helping ensure the whole industry can continue to benefit from their innovations. AWS plans to contribute to OpenRewrite the recipes and improvements developed as part of Amazon Q Code Transformation.

“The ability for software to adapt at a much faster pace is one of the most fundamental advantages any business can have. That’s why we’re excited to see AWS using OpenRewrite, the open source automated code refactoring technology, as a component of their service,” said Jonathan Schneider, CEO and Co-founder of Moderne (the sponsor of OpenRewrite). “We’re happy to have AWS join the OpenRewrite community and look forward to their contributions to make it even easier to migrate frameworks, patch vulnerabilities, and update APIs.”

Upgrade your Java applications now
Amazon Q Code Transformation product page

Danilo

Improve developer productivity with generative-AI powered Amazon Q in Amazon CodeCatalyst (preview)

Post Syndicated from Irshad Buchh original https://aws.amazon.com/blogs/aws/improve-developer-productivity-with-generative-ai-powered-amazon-q-in-amazon-codecatalyst-preview/

Today, I’m excited to introduce the preview of new generative artificial intelligence (AI) capabilities within Amazon CodeCatalyst that accelerate software delivery using Amazon Q.

Accelerate feature development – The feature development capability in Amazon Q can help you accelerate the implementation of software development tasks such as adding comments and READMEs, refining issue descriptions, generating small classes and unit tests, and updating CodeCatalyst workflows. These are tedious and undifferentiated tasks that take up developers’ time. Developers can go from an idea in an issue to fully tested, merge-ready, running code with only natural language inputs, in just a few clicks. AI does the heavy lifting of converting the human prompt to an actionable plan, summarizing source code repositories, generating code, unit tests, and workflows, and summarizing any changes in a pull request that is assigned back to the developer. You can also provide feedback to Amazon Q directly on the published pull request and ask it to generate a new revision. If the code change falls short of expectations, you can create a development environment directly from the pull request, make any necessary adjustments manually, publish a new revision, and proceed with the merge upon approval.

Example: make an API change in an existing application
In the navigation pane, I choose Issues and then I choose Create issue. I give the issue the title “Change the get_all_mysfits() API to return mysfits sorted by the Age attribute.” I then assign this issue to Amazon Q and choose Create issue.
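
To make the request concrete, here is a minimal Python sketch of the kind of change the issue asks for. The in-memory `MYSFITS` list, its records and field names other than `Age`, the JSON response shape, and the ascending sort order are all assumptions for illustration. The actual `get_all_mysfits()` in the repository, and the code Amazon Q generates for it, will differ.

```python
import json

# Hypothetical sample records; a real application would load these
# from its data store rather than a module-level list.
MYSFITS = [
    {"Name": "Mysfit A", "Species": "Chimera", "Age": 43},
    {"Name": "Mysfit B", "Species": "Yeti", "Age": 2},
    {"Name": "Mysfit C", "Species": "Phoenix", "Age": 14},
]


def get_all_mysfits() -> str:
    """Return all mysfits as JSON, sorted by the Age attribute (ascending)."""
    # sorted() returns a new list, so the underlying data keeps its order.
    ordered = sorted(MYSFITS, key=lambda mysfit: mysfit["Age"])
    return json.dumps({"mysfits": ordered})
```

Because the issue title does not say whether the order should be ascending or descending, stating that in the issue description gives Amazon Q less to guess at.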

Amazon Q will automatically move the issue into the In progress state while it analyzes the issue title and description to formulate a potential solution approach. If there is already some discussion on the issue, summarize it in the description to help Amazon Q understand what needs to be done. As it works, Amazon Q will report on its progress by leaving comments on the issue at every stage. It will attempt to create a solution based on its understanding of the code already present in the repository and the approach it formulated. If Amazon Q is able to successfully generate a potential solution, it will create a branch and commit code to that branch. It will then create a pull request that will merge the changes into the default branch once approved. Once the pull request is published, Amazon Q will change the issue status to In Review so that you and your team know that the code is now ready for you to review.

Summarize a change – Pull request authors can save time by asking Amazon Q to summarize the change they are publishing for review. Today, pull request authors have to write the description manually, or they may choose not to write one at all. Without a description, reviewers find it harder to understand what changes are being made and why, which delays the review process and slows down software delivery.

Pull request authors and reviewers can also save time by asking Amazon Q to summarize the comments left on the pull request. The summary is useful for the author because they can easily see common feedback themes. It is useful for reviewers because they can quickly catch up on the conversation and on feedback from themselves and other team members. The overall benefits are streamlined collaboration, an accelerated review process, and faster software delivery.

Join the preview
Amazon Q is available in Amazon CodeCatalyst today for spaces in the US West (Oregon) AWS Region.

Learn more

Irshad