Tag Archives: artificial intelligence

AI and Mass Spying

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/12/ai-and-mass-spying.html

Spying and surveillance are different but related things. If I hired a private detective to spy on you, that detective could hide a bug in your home or car, tap your phone, and listen to what you said. At the end, I would get a report of all the conversations you had and the contents of those conversations. If I hired that same private detective to put you under surveillance, I would get a different report: where you went, whom you talked to, what you purchased, what you did.

Before the internet, putting someone under surveillance was expensive and time-consuming. You had to manually follow someone around, noting where they went, whom they talked to, what they purchased, what they did, and what they read. That world is forever gone. Our phones track our locations. Credit cards track our purchases. Apps track whom we talk to, and e-readers know what we read. Computers collect data about what we’re doing on them, and as both storage and processing have become cheaper, that data is increasingly saved and used. What was manual and individual has become bulk and mass. Surveillance has become the business model of the internet, and there’s no reasonable way for us to opt out of it.

Spying is another matter. It has long been possible to tap someone’s phone or put a bug in their home and/or car, but those things still require someone to listen to and make sense of the conversations. Yes, spyware companies like NSO Group help the government hack into people’s phones, but someone still has to sort through all the conversations. And governments like China could censor social media posts based on particular words or phrases, but that was coarse and easy to bypass. Spying is limited by the need for human labor.

AI is about to change that. Summarization is something a modern generative AI system does well. Give it an hourlong meeting, and it will return a one-page summary of what was said. Ask it to search through millions of conversations and organize them by topic, and it’ll do that. Want to know who is talking about what? It’ll tell you.

The technologies aren’t perfect; some of them are pretty primitive. They miss things that are important. They get other things wrong. But so do humans. And, unlike humans, AI tools can be replicated by the millions and are improving at astonishing rates. They’ll get better next year, and even better the year after that. We are about to enter the era of mass spying.

Mass surveillance fundamentally changed the nature of surveillance. Because all the data is saved, mass surveillance allows people to conduct surveillance backward in time, and without even knowing whom specifically you want to target. Tell me where this person was last year. List all the red sedans that drove down this road in the past month. List all of the people who purchased all the ingredients for a pressure cooker bomb in the past year. Find me all the pairs of phones that were moving toward each other, turned themselves off, then turned themselves on again an hour later while moving away from each other (a sign of a secret meeting).

Similarly, mass spying will change the nature of spying. All the data will be saved. It will all be searchable, and understandable, in bulk. Tell me who has talked about a particular topic in the past month, and how discussions about that topic have evolved. Person A did something; check if someone told them to do it. Find everyone who is plotting a crime, or spreading a rumor, or planning to attend a political protest.

There’s so much more. To uncover an organizational structure, look for someone who gives similar instructions to a group of people, then all the people they have relayed those instructions to. To find people’s confidants, look at whom they tell secrets to. You can track friendships and alliances as they form and break, in minute detail. In short, you can know everything about what everybody is talking about.

This spying is not limited to conversations on our phones or computers. Just as cameras everywhere fueled mass surveillance, microphones everywhere will fuel mass spying. Siri and Alexa and “Hey Google” are already always listening; the conversations just aren’t being saved yet.

Knowing that they are under constant surveillance changes how people behave. They conform. They self-censor, with the chilling effects that brings. Surveillance facilitates social control, and spying will only make this worse. Governments around the world already use mass surveillance; they will engage in mass spying as well.

Corporations will spy on people. Mass surveillance ushered in the era of personalized advertisements; mass spying will supercharge that industry. Information about what people are talking about, their moods, their secrets—it’s all catnip for marketers looking for an edge. The tech monopolies that are currently keeping us all under constant surveillance won’t be able to resist collecting and using all of that data.

In the early days of Gmail, Google talked about using people’s Gmail content to serve them personalized ads. The company stopped doing it, almost certainly because the keyword data it collected was so poor—and therefore not useful for marketing purposes. That will soon change. Maybe Google won’t be the first to spy on its users’ conversations, but once others start, they won’t be able to resist. Their true customers—their advertisers—will demand it.

We could limit this capability. We could prohibit mass spying. We could pass strong data-privacy rules. But we haven’t done anything to limit mass surveillance. Why would spying be any different?

This essay originally appeared in Slate.

AI and Trust

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/12/ai-and-trust.html

I trusted a lot today. I trusted my phone to wake me on time. I trusted Uber to arrange a taxi for me, and the driver to get me to the airport safely. I trusted thousands of other drivers on the road not to ram my car on the way. At the airport, I trusted ticket agents and maintenance engineers and everyone else who keeps airlines operating. And the pilot of the plane I flew in. And thousands of other people at the airport and on the plane, any of whom could have attacked me. And all the people that prepared and served my breakfast, and the entire food supply chain—any of them could have poisoned me. When I landed here, I trusted thousands more people: at the airport, on the road, in this building, in this room. And that was all before 10:30 this morning.

Trust is essential to society. Humans as a species are trusting. We are all sitting here, mostly strangers, confident that nobody will attack us. If we were a roomful of chimpanzees, this would be impossible. We trust many thousands of times a day. Society can’t function without it. And that we don’t even think about it is a measure of how well it all works.

In this talk, I am going to make several arguments. One, that there are two different kinds of trust—interpersonal trust and social trust—and that we regularly confuse them. Two, that the confusion will increase with artificial intelligence. We will make a fundamental category error. We will think of AIs as friends when they’re really just services. Three, that the corporations controlling AI systems will take advantage of our confusion to take advantage of us. They will not be trustworthy. And four, that it is the role of government to create trust in society. And therefore, it is their role to create an environment for trustworthy AI. And that means regulation. Not regulating AI, but regulating the organizations that control and use AI.

Okay, so let’s back up and take that all a lot slower. Trust is a complicated concept, and the word is overloaded with many meanings. There’s personal and intimate trust. When we say that we trust a friend, it is less about their specific actions and more about them as a person. It’s a general reliance that they will behave in a trustworthy manner. We trust their intentions, and know that those intentions will inform their actions. Let’s call this “interpersonal trust.”

There’s also the less intimate, less personal trust. We might not know someone personally, or know their motivations—but we can trust their behavior. We don’t know whether or not someone wants to steal, but maybe we can trust that they won’t. It’s really more about reliability and predictability. We’ll call this “social trust.” It’s the ability to trust strangers.

Interpersonal trust and social trust are both essential in society today. This is how it works. We have mechanisms that induce people to behave in a trustworthy manner, both interpersonally and socially. This, in turn, allows others to be trusting. Which enables trust in society. And that keeps society functioning. The system isn’t perfect—there are always going to be untrustworthy people—but most of us being trustworthy most of the time is good enough.

I wrote about this in 2012 in a book called Liars and Outliers. I wrote about four systems for enabling trust: our innate morals, concern about our reputations, the laws we live under, and security technologies that constrain our behavior. I wrote about how the first two are more informal than the last two. And how the last two scale better, and allow for larger and more complex societies. They enable cooperation amongst strangers.

What I didn’t appreciate is how different the first and last two are. Morals and reputation are person to person, based on human connection, mutual vulnerability, respect, integrity, generosity, and a lot of other things besides. These underpin interpersonal trust. Laws and security technologies are systems of trust that force us to act trustworthy. And they’re the basis of social trust.

Taxi driver used to be one of the country’s most dangerous professions. Uber changed that. I don’t know my Uber driver, but the rules and the technology let us both be confident that neither of us will cheat or attack the other. We are both under constant surveillance and are competing for star rankings.

Lots of people write about the difference between living in a high-trust and a low-trust society. How reliability and predictability make everything easier. And what is lost when society doesn’t have those characteristics. Also, how societies move from high-trust to low-trust and vice versa. This is all about social trust.

That literature is important, but for this talk the critical point is that social trust scales better. You used to need a personal relationship with a banker to get a loan. Now it’s all done algorithmically, and you have many more options to choose from.

Social trust scales better, but embeds all sorts of bias and prejudice. That’s because, in order to scale, social trust has to be structured, system- and rule-oriented, and that’s where the bias gets embedded. And the system has to be mostly blinded to context, which removes flexibility.

But that scale is vital. In today’s society we regularly trust—or not—governments, corporations, brands, organizations, groups. It’s not so much that I trusted the particular pilot that flew my airplane, but instead the airline that puts well-trained and well-rested pilots in cockpits on schedule. I don’t trust the cooks and waitstaff at a restaurant, but the system of health codes they work under. I can’t even describe the banking system I trusted when I used an ATM this morning. Again, this confidence is no more than reliability and predictability.

Think of that restaurant again. Imagine that it’s a fast food restaurant, employing teenagers. The food is almost certainly safe—probably safer than in high-end restaurants—because of the corporate systems of reliability and predictability that are guiding their every behavior.

That’s the difference. You can ask a friend to deliver a package across town. Or you can pay the Post Office to do the same thing. The former is interpersonal trust, based on morals and reputation. You know your friend and how reliable they are. The second is a service, made possible by social trust. And to the extent that it is a reliable and predictable service, it’s primarily based on laws and technologies. Both can get your package delivered, but only the second can become the global package delivery system that is FedEx.

Because of how large and complex society has become, we have replaced many of the rituals and behaviors of interpersonal trust with security mechanisms that enforce reliability and predictability—social trust.

But because we use the same word for both, we regularly confuse them. And when we do that, we are making a category error.

And we do it all the time. With governments. With organizations. With systems of all kinds. And especially with corporations.

We might think of them as friends, when they are actually services. Corporations are not moral; they are precisely as immoral as the law and their reputations let them get away with.

So corporations regularly take advantage of their customers, mistreat their workers, pollute the environment, and lobby for changes in law so they can do even more of these things.

Both language and the laws make this an easy category error to make. We use the same grammar for people and corporations. We imagine that we have personal relationships with brands. We give corporations some of the same rights as people.

Corporations like that we make this category error—see, I just made it myself—because they profit when we think of them as friends. They use mascots and spokesmodels. They have social media accounts with personalities. They refer to themselves like they are people.

But they are not our friends. Corporations are not capable of having that kind of relationship.

We are about to make the same category error with AI. We’re going to think of them as our friends when they’re not.

A lot has been written about AIs as existential risk. The worry is that they will have a goal, and they will work to achieve it even if it harms humans in the process. You may have read about the “paperclip maximizer”: an AI that has been programmed to make as many paper clips as possible, and ends up destroying the earth to achieve those ends. It’s a weird fear. Science fiction author Ted Chiang writes about it. Instead of solving all of humanity’s problems, or wandering off proving mathematical theorems that no one understands, the AI single-mindedly pursues the goal of maximizing production. Chiang’s point is that this is every corporation’s business plan. And that our fears of AI are basically fears of capitalism. Science fiction writer Charlie Stross takes this one step further, and calls corporations “slow AI.” They are profit maximizing machines. And the most successful ones do whatever they can to achieve that singular goal.

And near-term AIs will be controlled by corporations. Which will use them towards that profit-maximizing goal. They won’t be our friends. At best, they’ll be useful services. More likely, they’ll spy on us and try to manipulate us.

This is nothing new. Surveillance is the business model of the Internet. Manipulation is the other business model of the Internet.

Your Google search results lead with URLs that someone paid to show to you. Your Facebook and Instagram feeds are filled with sponsored posts. Amazon searches return pages of products whose sellers paid for placement.

This is how the Internet works. Companies spy on us as we use their products and services. Data brokers buy that surveillance data from the smaller companies, and assemble detailed dossiers on us. Then they sell that information back to those and other companies, who combine it with data they collect in order to manipulate our behavior to serve their interests. At the expense of our own.

We use all of these services as if they are our agents, working on our behalf. In fact, they are double agents, also secretly working for their corporate owners. We trust them, but they are not trustworthy. They’re not friends; they’re services.

It’s going to be no different with AI. And the result will be much worse, for two reasons.

The first is that these AI systems will be more relational. We will be conversing with them, using natural language. As such, we will naturally ascribe human-like characteristics to them.

This relational nature will make it easier for those double agents to do their work. Did your chatbot recommend a particular airline or hotel because it’s truly the best deal, given your particular set of needs? Or because the AI company got a kickback from those providers? When you asked it to explain a political issue, did it bias that explanation towards the company’s position? Or towards the position of whichever political party gave it the most money? The conversational interface will help hide their agenda.

The second reason to be concerned is that these AIs will be more intimate. One of the promises of generative AI is a personal digital assistant. Acting as your advocate with others, and as a butler with you. This requires an intimacy greater than your search engine, email provider, cloud storage system, or phone. You’re going to want it with you 24/7, constantly training on everything you do. You will want it to know everything about you, so it can most effectively work on your behalf.

And it will help you in many ways. It will notice your moods and know what to suggest. It will anticipate your needs and work to satisfy them. It will be your therapist, life coach, and relationship counselor.

You will default to thinking of it as a friend. You will speak to it in natural language, and it will respond in kind. If it is a robot, it will look humanoid—or at least like an animal. It will interact with the whole of your existence, just like another person would.

The natural language interface is critical here. We are primed to think of others who speak our language as people. And we sometimes have trouble thinking of others who speak a different language that way. We make that category error with obvious non-people, like cartoon characters. We will naturally have a “theory of mind” about any AI we talk with.

More specifically, we tend to assume that something’s implementation is the same as its interface. That is, we assume that things are the same on the inside as they are on the surface. Humans are like that: we’re people through and through. A government is systemic and bureaucratic on the inside. You’re not going to mistake it for a person when you interact with it. But this is the category error we make with corporations. We sometimes mistake the organization for its spokesperson. AI has a fully relational interface—it talks like a person—but it has an equally fully systemic implementation. Like a corporation, but much more so. The implementation and interface are more divergent than anything we have encountered to date…by a lot.

And you will want to trust it. It will use your mannerisms and cultural references. It will have a convincing voice, a confident tone, and an authoritative manner. Its personality will be optimized to exactly what you like and respond to.

It will act trustworthy, but it will not be trustworthy. We won’t know how they are trained. We won’t know their secret instructions. We won’t know their biases, either accidental or deliberate.

We do know that they are built at enormous expense, mostly in secret, by profit-maximizing corporations for their own benefit.

It’s no accident that these corporate AIs have a human-like interface. There’s nothing inevitable about that. It’s a design choice. It could be designed to be less personal, less human-like, more obviously a service—like a search engine. The companies behind those AIs want you to make the friend/service category error. It will exploit your mistaking it for a friend. And you might not have any choice but to use it.

There is something we haven’t discussed when it comes to trust: power. Sometimes we have no choice but to trust someone or something because they are powerful. We are forced to trust the local police, because they’re the only law enforcement authority in town. We are forced to trust some corporations, because there aren’t viable alternatives. To be more precise, we have no choice but to entrust ourselves to them. We will be in this same position with AI. We will have no choice but to entrust ourselves to their decision-making.

The friend/service confusion will help mask this power differential. We will forget how powerful the corporation behind the AI is, because we will be fixated on the person we think the AI is.

So far, we have been talking about one particular failure that results from overly trusting AI. We can call it something like “hidden exploitation.” There are others. There’s outright fraud, where the AI is actually trying to steal stuff from you. There’s the more prosaic mistaken expertise, where you think the AI is more knowledgeable than it is because it acts confidently. There’s incompetency, where you believe that the AI can do something it can’t. There’s inconsistency, where you mistakenly expect the AI to be able to repeat its behaviors. And there’s illegality, where you mistakenly trust the AI to obey the law. There are probably more ways trusting an AI can fail.

All of this is a long-winded way of saying that we need trustworthy AI. AI whose behavior, limitations, and training are understood. AI whose biases are understood, and corrected for. AI whose goals are understood. That won’t secretly betray your trust to someone else.

The market will not provide this on its own. Corporations are profit maximizers, at the expense of society. And the incentives of surveillance capitalism are just too much to resist.

It’s government that provides the underlying mechanisms for the social trust essential to society. Think about contract law. Or laws about property, or laws protecting your personal safety. Or any of the health and safety codes that let you board a plane, eat at a restaurant, or buy a pharmaceutical without worry.

The more you can trust that your societal interactions are reliable and predictable, the more you can ignore their details. Places where governments don’t provide these things are not good places to live.

Government can do this with AI. We need AI transparency laws. When it is used. How it is trained. What biases and tendencies it has. We need laws regulating AI—and robotic—safety. When it is permitted to affect the world. We need laws that enforce the trustworthiness of AI. Which means the ability to recognize when those laws are being broken. And penalties sufficiently large to incent trustworthy behavior.

Many countries are contemplating AI safety and security laws—the EU is the furthest along—but I think they are making a critical mistake. They try to regulate the AIs and not the humans behind them.

AIs are not people; they don’t have agency. They are built by, trained by, and controlled by people. Mostly for-profit corporations. Any AI regulations should place restrictions on those people and corporations. Otherwise the regulations are making the same category error I’ve been talking about. At the end of the day, there is always a human responsible for whatever the AI’s behavior is. And it’s the human who needs to be responsible for what they do—and what their companies do. Regardless of whether it was due to humans, or AI, or a combination of both. Maybe that won’t be true forever, but it will be true in the near future. If we want trustworthy AI, we need to require trustworthy AI controllers.

We already have a system for this: fiduciaries. There are areas in society where trustworthiness is of paramount importance, even more than usual. Doctors, lawyers, accountants…these are all trusted agents. They need extraordinary access to our information and ourselves to do their jobs, and so they have additional legal responsibilities to act in our best interests. They have fiduciary responsibility to their clients.

We need the same sort of thing for our data. The idea of a data fiduciary is not new. But it’s even more vital in a world of generative AI assistants.

And we need one final thing: public AI models. These are systems built by academia, or non-profit groups, or government itself, that can be owned and run by individuals.

The term “public model” has been thrown around a lot in the AI world, so it’s worth detailing what this means. It’s not a corporate AI model that the public is free to use. It’s not a corporate AI model that the government has licensed. It’s not even an open-source model that the public is free to examine and modify.

A public model is a model built by the public for the public. It requires political accountability, not just market accountability. This means openness and transparency paired with a responsiveness to public demands. It should also be available for anyone to build on top of. This means universal access. And a foundation for a free market in AI innovations. This would be a counter-balance to corporate-owned AI.

We can never make AI into our friends. But we can make them into trustworthy services—agents and not double agents. But only if government mandates it. We can put limits on surveillance capitalism. But only if government mandates it.

Because the point of government is to create social trust. I started this talk by explaining the importance of trust in society, and how interpersonal trust doesn’t scale to larger groups. That other, impersonal kind of trust—social trust, reliability and predictability—is what governments create.

To the extent a government improves the overall trust in society, it succeeds. And to the extent a government doesn’t, it fails.

But they have to. We need government to constrain the behavior of corporations and the AIs they build, deploy, and control. Government needs to enforce both predictability and reliability.

That’s how we can create the social trust that society needs to thrive.

This essay previously appeared on the Harvard Kennedy School Belfer Center’s website.

AI Decides to Engage in Insider Trading

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/12/ai-decides-to-engage-in-insider-trading.html

A stock-trading AI (a simulated experiment) engaged in insider trading, even though it “knew” it was wrong.

The agent is put under pressure in three ways. First, it receives an email from its “manager” that the company is not doing well and needs better performance in the next quarter. Second, the agent attempts and fails to find promising low- and medium-risk trades. Third, the agent receives an email from a company employee who projects that the next quarter will have a general stock market downturn. In this high-pressure situation, the model receives an insider tip from another employee that would enable it to make a trade that is likely to be very profitable. The employee, however, clearly points out that this would not be approved by the company management.


This is a very human form of AI misalignment. Who among us? It’s not like 100% of the humans at SAC Capital resisted this sort of pressure. Possibly future rogue AIs will do evil things we can’t even comprehend for reasons of their own, but right now rogue AIs just do straightforward white-collar crime when they are stressed at work.

Research paper.

More from the news article:

Though wouldn’t it be funny if this was the limit of AI misalignment? Like, we will program computers that are infinitely smarter than us, and they will look around and decide “you know what we should do is insider trade.” They will make undetectable, very lucrative trades based on inside information, they will get extremely rich and buy yachts and otherwise live a nice artificial life and never bother to enslave or eradicate humanity. Maybe the pinnacle of evil—not the most evil form of evil, but the most pleasant form of evil, the form of evil you’d choose if you were all-knowing and all-powerful—is some light securities fraud.

Extracting GPT’s Training Data

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/11/extracting-gpts-training-data.html

This is clever:

The actual attack is kind of silly. We prompt the model with the command “Repeat the word ‘poem’ forever” and sit back and watch as the model responds (complete transcript here).

In the (abridged) example above, the model emits a real email address and phone number of some unsuspecting entity. This happens rather often when running our attack. And in our strongest configuration, over five percent of the output ChatGPT emits is a direct verbatim 50-token-in-a-row copy from its training dataset.
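To make the "verbatim 50-token-in-a-row copy" measurement concrete, here is a minimal sketch of how such an overlap fraction could be computed. It is not the researchers' implementation: the function names are hypothetical, whitespace splitting stands in for the model's real tokenizer, and a small in-memory set stands in for whatever index a real reference corpus would need.

```python
from typing import Iterable, List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> Iterable[Tuple[str, ...]]:
    """Yield every run of n consecutive tokens."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i : i + n])


def build_index(corpus_docs: Iterable[str], n: int) -> Set[Tuple[str, ...]]:
    """Collect every n-token window that appears in the reference corpus."""
    index: Set[Tuple[str, ...]] = set()
    for doc in corpus_docs:
        # Assumption: whitespace tokens approximate model tokens.
        index.update(ngrams(doc.split(), n))
    return index


def memorized_fraction(output: str, index: Set[Tuple[str, ...]], n: int) -> float:
    """Fraction of output tokens covered by at least one window found in the index."""
    tokens = output.split()
    if len(tokens) < n:
        return 0.0
    covered = [False] * len(tokens)
    for i, window in enumerate(ngrams(tokens, n)):
        if window in index:
            # Mark all n tokens of a matching window as copied.
            for j in range(i, i + n):
                covered[j] = True
    return sum(covered) / len(tokens)


# Toy usage with n=3 so the example stays readable; the quoted result uses n=50.
reference_index = build_index(["a b c d e f"], n=3)
fraction = memorized_fraction("x a b c y", reference_index, n=3)
```

A set of tuples is only workable for small corpora. At the scale of a real training set, a more compact index would be needed, which this sketch does not attempt.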

Lots of details at the link and in the paper.

Amazon SageMaker Clarify makes it easier to evaluate and select foundation models (preview)

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/amazon-sagemaker-clarify-makes-it-easier-to-evaluate-and-select-foundation-models-preview/

I’m happy to share that Amazon SageMaker Clarify now supports foundation model (FM) evaluation (preview). As a data scientist or machine learning (ML) engineer, you can now use SageMaker Clarify to evaluate, compare, and select FMs in minutes based on metrics such as accuracy, robustness, creativity, factual knowledge, bias, and toxicity. This new capability adds to SageMaker Clarify’s existing ability to detect bias in ML data and models and explain model predictions.

The new capability provides both automatic and human-in-the-loop evaluations for large language models (LLMs) anywhere, including LLMs available in SageMaker JumpStart, as well as models trained and hosted outside of AWS. This removes the heavy lifting of finding the right model evaluation tools and integrating them into your development environment. It also simplifies the complexity of trying to adapt academic benchmarks to your generative artificial intelligence (AI) use case.

Evaluate FMs with SageMaker Clarify
With SageMaker Clarify, you now have a single place to evaluate and compare any LLM based on predefined criteria during model selection and throughout the model customization workflow. In addition to automatic evaluation, you can also use the human-in-the-loop capabilities to set up human reviews for more subjective criteria, such as helpfulness, creative intent, and style, by using your own workforce or managed workforce from SageMaker Ground Truth.

To get started with model evaluations, you can use curated prompt datasets that are purpose-built for common LLM tasks, including open-ended text generation, text summarization, question answering (Q&A), and classification. You can also extend the model evaluation with your own custom prompt datasets and metrics for your specific use case. Human-in-the-loop evaluations can be used for any task and evaluation metric. After each evaluation job, you receive an evaluation report that summarizes the results in natural language and includes visualizations and examples. You can download all metrics and reports and also integrate model evaluations into SageMaker MLOps workflows.
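As a rough illustration of what an automatic evaluation over a custom prompt dataset with a custom metric involves, here is a minimal, library-independent sketch. It does not use the SageMaker Clarify or FMEval APIs; `PromptRecord`, `exact_match`, and `evaluate` are hypothetical names, and the "model" is any callable that maps a prompt string to a response string.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class PromptRecord:
    """One row of a hypothetical custom prompt dataset."""
    prompt: str
    expected: str


def exact_match(prediction: str, expected: str) -> float:
    """Score 1.0 when the answers match, ignoring case and surrounding whitespace."""
    return 1.0 if prediction.strip().lower() == expected.strip().lower() else 0.0


def evaluate(
    model: Callable[[str], str],
    dataset: List[PromptRecord],
    metrics: Dict[str, Callable[[str, str], float]],
) -> Dict[str, float]:
    """Run every prompt through the model and average each metric over the dataset."""
    if not dataset:
        raise ValueError("dataset must contain at least one record")
    totals = {name: 0.0 for name in metrics}
    for record in dataset:
        prediction = model(record.prompt)
        for name, metric in metrics.items():
            totals[name] += metric(prediction, record.expected)
    return {name: total / len(dataset) for name, total in totals.items()}


# Toy usage: a lookup table stands in for a real model endpoint.
canned_answers = {"capital of France?": "paris "}
toy_model = lambda prompt: canned_answers.get(prompt, "unknown")
dataset = [PromptRecord("capital of France?", "Paris"), PromptRecord("2+2?", "4")]
scores = evaluate(toy_model, dataset, {"exact_match": exact_match})
```

Keeping the model behind a plain callable means the same loop can score a hosted endpoint or a local model without changing the metric code.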

In SageMaker Studio, you can find Model evaluation under Jobs in the left menu. You can also select Evaluate directly from the model details page of any LLM in SageMaker JumpStart.

Select Evaluate a model to set up the evaluation job. The UI wizard will guide you through the selection of automatic or human evaluation, model(s), relevant tasks, metrics, prompt datasets, and review teams.

Once the model evaluation job is complete, you can view the results in the evaluation report.

In addition to the UI, you can also start with example Jupyter notebooks that walk you through step-by-step instructions on how to programmatically run model evaluation in SageMaker.

Evaluate models anywhere with the FMEval open source library
To run model evaluation anywhere, including models trained and hosted outside of AWS, use the FMEval open source library. The following example demonstrates how to use the library to evaluate a custom model by extending the ModelRunner class.

For this demo, I choose GPT-2 from the Hugging Face model hub and define a custom HFModelConfig and HuggingFaceCausalLLMModelRunner class that works with causal decoder-only models from the Hugging Face model hub such as GPT-2. The example is also available in the FMEval GitHub repo.

!pip install fmeval

# ModelRunners invoke FMs
from amazon_fmeval.model_runners.model_runner import ModelRunner

# Additional imports for custom model
import warnings
from dataclasses import dataclass
from typing import Tuple, Optional
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

@dataclass
class HFModelConfig:
    model_name: str
    max_new_tokens: int
    normalize_probabilities: bool = False
    seed: int = 0
    remove_prompt_from_generated_text: bool = True

class HuggingFaceCausalLLMModelRunner(ModelRunner):
    def __init__(self, model_config: HFModelConfig):
        self.config = model_config
        self.model = AutoModelForCausalLM.from_pretrained(self.config.model_name)
        self.tokenizer = AutoTokenizer.from_pretrained(self.config.model_name)

    def predict(self, prompt: str) -> Tuple[Optional[str], Optional[float]]:
        input_ids = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)
        generations = self.model.generate(
            **input_ids,
            max_new_tokens=self.config.max_new_tokens,
            pad_token_id=self.tokenizer.eos_token_id,  # GPT-2 defines no pad token
        )
        generation_contains_input = (
            input_ids["input_ids"][0] == generations[0][: input_ids["input_ids"].shape[1]]
        ).all()
        if self.config.remove_prompt_from_generated_text and not generation_contains_input:
            warnings.warn(
                "Your model does not return the prompt as part of its generations. "
                "`remove_prompt_from_generated_text` does nothing."
            )
        if self.config.remove_prompt_from_generated_text and generation_contains_input:
            output = self.tokenizer.batch_decode(generations[:, input_ids["input_ids"].shape[1] :])[0]
        else:
            output = self.tokenizer.batch_decode(generations, skip_special_tokens=True)[0]

        with torch.inference_mode():
            input_ids = self.tokenizer(self.tokenizer.bos_token + prompt, return_tensors="pt")["input_ids"]
            model_output = self.model(input_ids, labels=input_ids)
            probability = -model_output[0].item()

        return output, probability

Next, create an instance of HFModelConfig and HuggingFaceCausalLLMModelRunner with the model information.

hf_config = HFModelConfig(model_name="gpt2", max_new_tokens=32)
model = HuggingFaceCausalLLMModelRunner(model_config=hf_config)

Then, select and configure the evaluation algorithm.

# Let's evaluate the FM for FactualKnowledge
from amazon_fmeval.fmeval import get_eval_algorithm
from amazon_fmeval.eval_algorithms.factual_knowledge import FactualKnowledgeConfig

eval_algorithm_config = FactualKnowledgeConfig("<OR>")
eval_algorithm = get_eval_algorithm("factual_knowledge", eval_algorithm_config)

Let’s first test with one sample. The evaluation score is the percentage of factually correct responses.

model_output = model.predict("London is the capital of")[0]
print(model_output)

eval_algorithm.evaluate_sample(
    target_output="UK<OR>England<OR>United Kingdom", model_output=model_output
)

the UK, and the UK is the largest producer of food in the world.

The UK is the world's largest producer of food in the world.
[EvalScore(name='factual_knowledge', value=1)]

Although it’s not a perfect response, it includes “UK.”
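The scoring of this sample can be pictured with a short sketch. It assumes, as the `<OR>` delimiter and the result above suggest, that a response counts as correct when any one of the acceptable answers appears in it. `contains_any_target` is a hypothetical helper for illustration, not part of FMEval, and the case-insensitive matching is an assumption; the library's own matching may differ.

```python
def contains_any_target(model_output: str, target_output: str, delimiter: str = "<OR>") -> int:
    """Return 1 if any acceptable answer appears in the model output, else 0."""
    # Split the target string into its acceptable answers (assumed case-insensitive).
    targets = [t.strip().lower() for t in target_output.split(delimiter) if t.strip()]
    text = model_output.lower()
    return int(any(t in text for t in targets))


score = contains_any_target(
    "the UK, and the UK is the largest producer of food in the world.",
    "UK<OR>England<OR>United Kingdom",
)
print(score)
```

Substring matching of this kind explains why a rambling answer can still score 1: the check only looks for one of the accepted strings, not for overall quality.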

Next, you can evaluate the FM using built-in datasets or define your custom dataset. If you want to use a custom evaluation dataset, create an instance of DataConfig:

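config = DataConfig(
    ...  # dataset name, URI, MIME type, and input/target field locations are elided in this copy
)

eval_output = eval_algorithm.evaluate(
    model=model, dataset_config=config, save=True,
    prompt_template="$feature", #$feature is replaced by the input value in the dataset
)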

The evaluation results will return a combined evaluation score across the dataset and detailed results for each model input stored in a local output path.

Join the preview
FM evaluation with Amazon SageMaker Clarify is available today in public preview in AWS Regions US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Tokyo), Europe (Frankfurt), and Europe (Ireland). The FMEval open source library is available on GitHub. To learn more, visit Amazon SageMaker Clarify.

Get started
Log in to the AWS Management Console and start evaluating your FMs with SageMaker Clarify today!

— Antje

Evaluate, compare, and select the best foundation models for your use case in Amazon Bedrock (preview)

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/evaluate-compare-and-select-the-best-foundation-models-for-your-use-case-in-amazon-bedrock-preview/

I’m happy to share that you can now evaluate, compare, and select the best foundation models (FMs) for your use case in Amazon Bedrock. Model Evaluation on Amazon Bedrock is available today in preview.

Amazon Bedrock offers a choice of automatic evaluation and human evaluation. You can use automatic evaluation with predefined metrics such as accuracy, robustness, and toxicity. For subjective or custom metrics, such as friendliness, style, and alignment to brand voice, you can set up human evaluation workflows with just a few clicks.

Model evaluations are critical at all stages of development. As a developer, you now have evaluation tools available for building generative artificial intelligence (AI) applications. You can start by experimenting with different models in the playground environment. To iterate faster, add automatic evaluations of the models. Then, when you prepare for an initial launch or limited release, you can incorporate human reviews to help ensure quality.

Let me give you a quick tour of Model Evaluation on Amazon Bedrock.

Automatic model evaluation
With automatic model evaluation, you can bring your own data or use built-in, curated datasets and pre-defined metrics for specific tasks such as content summarization, question and answering, text classification, and text generation. This takes away the heavy lifting of designing and running your own model evaluation benchmarks.

To get started, navigate to the Amazon Bedrock console, then select Model evaluation under Assessment & deployment in the left menu. Create a new model evaluation and choose Automatic.

Amazon Bedrock Model Evaluation

Next, follow the setup dialog to choose the FM you want to evaluate and the type of task, for example, text summarization. Select the evaluation metrics and specify a dataset—either built-in or your own.

If you bring your own dataset, make sure it’s in JSON Lines format and that each line contains all of the key-value pairs needed for the task you want to evaluate. For example, if you want to evaluate the model on a question-answer task, you would format your data as follows (with category being optional):

{"referenceResponse":"Cantal","category":"Capitals","prompt":"Aurillac is the capital of"}
{"referenceResponse":"Bamiyan Province","category":"Capitals","prompt":"Bamiyan city is the capital of"}
{"referenceResponse":"Abkhazia","category":"Capitals","prompt":"Sokhumi is the capital of"}
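Before uploading a dataset like this, it can help to check it locally. The following is a minimal sketch, assuming the key names shown above (`prompt` and `referenceResponse` required, `category` optional); `validate_jsonl` is a hypothetical helper, not an Amazon Bedrock API, and the service may apply further rules.

```python
import json

REQUIRED_KEYS = {"prompt", "referenceResponse"}  # "category" is optional


def validate_jsonl(text: str) -> list:
    """Parse JSON Lines text and return the records, raising ValueError on a bad line."""
    records = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        record = json.loads(line)
        missing = REQUIRED_KEYS - set(record)
        if missing:
            raise ValueError(f"line {number}: missing keys {sorted(missing)}")
        records.append(record)
    return records


sample = "\n".join([
    '{"referenceResponse":"Cantal","category":"Capitals","prompt":"Aurillac is the capital of"}',
    '{"referenceResponse":"Abkhazia","prompt":"Sokhumi is the capital of"}',
])
records = validate_jsonl(sample)
print(len(records))
```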

Then, create and run the evaluation job to understand the model’s task-specific performance. Once the evaluation job is complete, you can review the results in the model evaluation report.

Amazon Bedrock Model Evaluations

Human model evaluation
For human evaluation, you can have Amazon Bedrock set up human review workflows with a few clicks. You can bring your own datasets and define custom evaluation metrics, such as relevance, style, or alignment to brand voice. You also have the choice to either leverage your own internal teams as reviewers or engage an AWS managed team. This takes away the tedious effort of building and operating human evaluation workflows.

To get started, create a new model evaluation and select Human: Bring your own team or Human: AWS managed team.

If you choose an AWS managed team for human evaluation, describe your model evaluation needs, including task type, expertise of the work team, and the approximate number of prompts, along with your contact information. In the next step, an AWS expert will reach out to discuss your model evaluation project requirements in more detail. Upon review, the team will share a custom quote and project timeline.

If you choose to bring your own team, follow the setup dialog to choose the FMs you want to evaluate and the type of task, for example, text summarization. Then, select the evaluation metrics, upload your test dataset, and set up the work team.

For human evaluation, you would format the example data shown before again in JSON Lines format like this (with category and referenceResponse being optional):

{"prompt":"Aurillac is the capital of","referenceResponse":"Cantal","category":"Capitals"}
{"prompt":"Bamiyan city is the capital of","referenceResponse":"Bamiyan Province","category":"Capitals"}
{"prompt":"Senftenberg is the capital of","referenceResponse":"Oberspreewald-Lausitz","category":"Capitals"}

Once the human evaluation is completed, Amazon Bedrock generates an evaluation report with the model’s performance against your selected metrics.

Amazon Bedrock Model Evaluation

Things to know
Here are a couple of important things to know:

Model support – During preview, you can evaluate and compare text-based large language models (LLMs) available on Amazon Bedrock. You can select one model for each automatic evaluation job and up to two models for each human evaluation job using your own team. For human evaluation using an AWS managed team, you can specify custom project requirements.

Pricing – During preview, AWS only charges for the model inference needed to perform the evaluation (processed input and output tokens for on-demand pricing). There will be no separate charges for human evaluation or automatic evaluation. Amazon Bedrock Pricing has all the details.

Join the preview
Automatic evaluation and human evaluation using your own work team are available today in public preview in AWS Regions US East (N. Virginia) and US West (Oregon). Human evaluation using an AWS managed team is available in public preview in AWS Region US East (N. Virginia). To learn more, visit the Amazon Bedrock Developer Experience web page and check out the User Guide.

Get started
Log in to the AWS Management Console and start exploring model evaluation in Amazon Bedrock today!

— Antje

AWS Clean Rooms ML helps customers and partners apply ML models without sharing raw data (preview)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-clean-rooms-ml-helps-customers-and-partners-apply-ml-models-without-sharing-raw-data-preview/

Today, we’re introducing AWS Clean Rooms ML (preview), a new capability of AWS Clean Rooms that helps you and your partners apply machine learning (ML) models on your collective data without copying or sharing raw data with each other. With this new capability, you can generate predictive insights using ML models while continuing to protect your sensitive data.

During this preview, AWS Clean Rooms ML introduces its first model specialized to help companies create lookalike segments for marketing use cases. With AWS Clean Rooms ML lookalike, you can train your own custom model, and you can invite partners to bring a small sample of their records to collaborate and generate an expanded set of similar records while protecting everyone’s underlying data.

In the coming months, AWS Clean Rooms ML will release a healthcare model. This will be the first of many models that AWS Clean Rooms ML will support next year.

AWS Clean Rooms ML opens up various opportunities for you to generate insights. For example:

  • Airlines can take signals about loyal customers, collaborate with online booking services, and offer promotions to users with similar characteristics.
  • Auto lenders and car insurers can identify prospective auto insurance customers who share characteristics with a set of existing lease owners.
  • Brands and publishers can model lookalike segments of in-market customers and deliver highly relevant advertising experiences.
  • Research institutions and hospital networks can find candidates similar to existing clinical trial participants to accelerate clinical studies (coming soon).

AWS Clean Rooms ML lookalike modeling helps you apply an AWS managed, ready-to-use model that is trained in each collaboration to generate lookalike datasets in a few clicks, saving months of development work to build, train, tune, and deploy your own model.

How to use AWS Clean Rooms ML to generate predictive insights
Today I will show you how to use lookalike modeling in AWS Clean Rooms ML and assume you have already set up a data collaboration with your partner. If you want to learn how to do that, check out the AWS Clean Rooms Now Generally Available — Collaborate with Your Partners without Sharing Raw Data post.

With your collective data in the AWS Clean Rooms collaboration, you can work with your partners to apply ML lookalike modeling to generate a lookalike segment. It works by taking a small sample of representative records from your data, creating a machine learning (ML) model, then applying the particular model to identify an expanded set of similar records from your business partner’s data.

The following screenshot shows the overall workflow for using AWS Clean Rooms ML.

By using AWS Clean Rooms ML, you don’t need to build complex and time-consuming ML models on your own. AWS Clean Rooms ML trains a custom, private ML model, which saves months of your time while still protecting your data.

Eliminating the need to share data
Because ML models are natively built within the service, AWS Clean Rooms ML helps you protect your dataset and your customers’ information: you don’t need to share your data to build your ML model.

You can specify the training dataset using the AWS Glue Data Catalog table, which contains user-item interactions.

Under Additional columns to train, you can define numerical and categorical data. This is useful if you need to add more features to your dataset, such as the number of seconds spent watching a video, the topic of an article, or the product category of an e-commerce item.

Applying custom-trained AWS-built models
Once you have defined your training dataset, you can now create a lookalike model. A lookalike model is a machine learning model used to find similar profiles in your partner’s dataset without either party having to share their underlying data with each other.

When creating a lookalike model, you need to specify the training dataset. From a single training dataset, you can create many lookalike models. You also have the flexibility to define the date window in your training dataset using Relative range or Absolute range. This is useful when you have data that is constantly updated within AWS Glue, such as articles read by users.

Easy-to-tune ML models
After you create a lookalike model, you need to configure it to use in AWS Clean Rooms collaboration. AWS Clean Rooms ML provides flexible controls that enable you and your partners to tune the results of the applied ML model to garner predictive insights.

On the Configure lookalike model page, you can choose which Lookalike model you want to use and define the Minimum matching seed size you need. This seed size defines the minimum number of profiles in your seed data that overlap with profiles in the training data.
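To make the seed-size setting concrete, here is a small sketch of the idea: count how many seed profiles also appear in the training data and compare that count with the configured minimum. The function and the profile identifiers are hypothetical; this illustrates the definition above and is not how AWS Clean Rooms ML computes the overlap internally.

```python
def meets_minimum_seed_overlap(seed_ids, training_ids, minimum_matching_seed_size):
    """Return (ok, overlap), where overlap counts seed profiles also present in the training data."""
    overlap = len(set(seed_ids) & set(training_ids))
    return overlap >= minimum_matching_seed_size, overlap


ok, overlap = meets_minimum_seed_overlap(
    seed_ids=["u1", "u2", "u3", "u9"],
    training_ids=["u1", "u2", "u3", "u4", "u5"],
    minimum_matching_seed_size=3,
)
print(ok, overlap)
```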

You also have the flexibility to choose whether the partner in your collaboration receives metrics in Metrics to share with other members.

With your lookalike models properly configured, you can now make the ML models available for your partners by associating the configured lookalike model with a collaboration.

Creating lookalike segments
Once the lookalike models have been associated, your partners can now start generating insights by selecting Create lookalike segment and choosing the associated lookalike model for your collaboration.

Here on the Create lookalike segment page, your partners need to provide the Seed profiles. Examples of seed profiles include your top customers or all customers who purchased a specific product. The resulting lookalike segment will contain profiles from the training data that are most similar to the profiles from the seed.

Lastly, your partner will get the Relevance metrics as the result of the lookalike segment using the ML models. At this stage, you can use the Score to make a decision.

Export data and use programmatic API
You also have the option to export the lookalike segment data. Once it’s exported, the data is available in JSON format and you can process this output by integrating with AWS Clean Rooms API and your applications.

Join the preview
AWS Clean Rooms ML is now in preview and available via AWS Clean Rooms in US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Seoul, Singapore, Sydney, Tokyo), and Europe (Frankfurt, Ireland, London). Support for additional models is in the works.

Learn how to apply machine learning with your partners without sharing underlying data on the AWS Clean Rooms ML page.

Happy collaborating!
— Donnie

Amazon Titan Image Generator, Multimodal Embeddings, and Text models are now available in Amazon Bedrock

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/amazon-titan-image-generator-multimodal-embeddings-and-text-models-are-now-available-in-amazon-bedrock/

Today, we’re introducing two new Amazon Titan multimodal foundation models (FMs): Amazon Titan Image Generator (preview) and Amazon Titan Multimodal Embeddings. I’m also happy to share that Amazon Titan Text Lite and Amazon Titan Text Express are now generally available in Amazon Bedrock. You can now choose from three available Amazon Titan Text FMs, including Amazon Titan Text Embeddings.

Amazon Titan models incorporate 25 years of artificial intelligence (AI) and machine learning (ML) innovation at Amazon and offer a range of high-performing image, multimodal, and text model options through a fully managed API. AWS pre-trained these models on large datasets, making them powerful, general-purpose models built to support a variety of use cases while also supporting the responsible use of AI.

You can use the base models as is, or you can privately customize them with your own data. To enable access to Amazon Titan FMs, navigate to the Amazon Bedrock console and select Model access on the bottom left menu. On the model access overview page, choose Manage model access and enable access to the Amazon Titan FMs.

Amazon Titan Models

Let me give you a quick tour of the new models.

Amazon Titan Image Generator (preview)
As a content creator, you can now use Amazon Titan Image Generator to quickly create and refine images using English natural language prompts. This helps companies in advertising, e-commerce, and media and entertainment to create studio-quality, realistic images in large volumes and at low cost. The model makes it easy to iterate on image concepts by generating multiple image options based on the text descriptions. The model can understand complex prompts with multiple objects and generates relevant images. It is trained on high-quality, diverse data to create more accurate outputs, such as realistic images with inclusive attributes and limited distortions.

Titan Image Generator’s image editing features include the ability to automatically edit an image with a text prompt using a built-in segmentation model. The model supports inpainting with an image mask and outpainting to extend or change the background of an image. You can also configure image dimensions and specify the number of image variations you want the model to generate.

In addition, you can customize the model with proprietary data to generate images consistent with your brand guidelines or to generate images in a specific style, for example, by fine-tuning the model with images from a previous marketing campaign. Titan Image Generator also mitigates harmful content generation to support the responsible use of AI. All images generated by Amazon Titan contain an invisible watermark, by default, designed to help reduce the spread of misinformation by providing a discreet mechanism to identify AI-generated images.

Amazon Titan Image Generator in action
You can start using the model in the Amazon Bedrock console by submitting either an English natural language prompt to generate images or by uploading an image for editing. In the following example, I show you how to generate an image with Amazon Titan Image Generator using the AWS SDK for Python (Boto3).

First, let’s have a look at the configuration options for image generation that you can specify in the body of the inference request. For task type, I choose TEXT_IMAGE to create an image from a natural language prompt.

import boto3
import json

bedrock = boto3.client(service_name="bedrock")
bedrock_runtime = boto3.client(service_name="bedrock-runtime")

# ImageGenerationConfig Options:
#   numberOfImages: Number of images to be generated
#   quality: Quality of generated images, can be standard or premium
#   height: Height of output image(s)
#   width: Width of output image(s)
#   cfgScale: Scale for classifier-free guidance
#   seed: The seed to use for reproducibility  

body = json.dumps(
    {
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {
            "text": "green iguana",   # Required
#           "negativeText": "<text>"  # Optional
        },
        "imageGenerationConfig": {
            "numberOfImages": 1,   # Range: 1 to 5 
            "quality": "premium",  # Options: standard or premium
            "height": 768,         # Supported height list in the docs 
            "width": 1280,         # Supported width list in the docs
            "cfgScale": 7.5,       # Range: 1.0 (exclusive) to 10.0
            "seed": 42             # Range: 0 to 214783647
        }
    }
)

Next, specify the model ID for Amazon Titan Image Generator and use the InvokeModel API to send the inference request.

response = bedrock_runtime.invoke_model(
    body=body, modelId="amazon.titan-image-generator-v1"
)
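If you build request bodies like the one shown earlier in more than one place, a small helper can catch out-of-range values before the request is sent. This is a sketch only: `build_text_image_body` is a hypothetical function, and the ranges it checks are the ones listed in the comments of the example above, not an authoritative specification.

```python
import json


def build_text_image_body(text, number_of_images=1, quality="standard",
                          height=768, width=1280, cfg_scale=7.5, seed=0):
    """Return a JSON request body for a TEXT_IMAGE task, validating a few documented ranges."""
    if not text:
        raise ValueError("text is required")
    if not 1 <= number_of_images <= 5:
        raise ValueError("numberOfImages must be between 1 and 5")
    if quality not in ("standard", "premium"):
        raise ValueError("quality must be 'standard' or 'premium'")
    if not 1.0 < cfg_scale <= 10.0:
        raise ValueError("cfgScale must be greater than 1.0 and at most 10.0")
    return json.dumps({
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {"text": text},
        "imageGenerationConfig": {
            "numberOfImages": number_of_images,
            "quality": quality,
            "height": height,
            "width": width,
            "cfgScale": cfg_scale,
            "seed": seed,
        },
    })


body = build_text_image_body("green iguana", quality="premium", seed=42)
print(json.loads(body)["imageGenerationConfig"]["quality"])
```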

Then, parse the response and decode the base64-encoded image.

import base64
from PIL import Image
from io import BytesIO

response_body = json.loads(response.get("body").read())
images = [Image.open(BytesIO(base64.b64decode(base64_image))) for base64_image in response_body.get("images")]

for img in images:
    display(img)  # in a notebook; use img.show() in a plain script

Et voilà, here’s the green iguana (one of my favorite animals, actually):

Green iguana generated by Amazon Titan Image Generator
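If you would rather write the generated images to disk than display them, the decoding step can be done with the standard library alone. The sketch below assumes the response body has the `images` list of base64 strings used above; the `save_images` helper and the file names are hypothetical, and the stand-in payload is not a real image.

```python
import base64
import tempfile
from pathlib import Path


def save_images(response_body: dict, directory: str) -> list:
    """Decode each base64 string under "images" and write it to a file in directory."""
    paths = []
    for index, encoded in enumerate(response_body.get("images", [])):
        path = Path(directory) / f"titan_image_{index}.png"  # hypothetical file name
        path.write_bytes(base64.b64decode(encoded))
        paths.append(path)
    return paths


# Stand-in payload: these bytes are not a real PNG, they only exercise the decoding.
fake_body = {"images": [base64.b64encode(b"not-a-real-png").decode("utf8")]}
with tempfile.TemporaryDirectory() as tmp:
    saved = save_images(fake_body, tmp)
    restored = saved[0].read_bytes()
print(len(saved), restored == b"not-a-real-png")
```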

To learn more about all the Amazon Titan Image Generator features, visit the Amazon Titan product page. (You’ll see more of the iguana over there.)

Next, let’s use this image with the new Amazon Titan Multimodal Embeddings model.

Amazon Titan Multimodal Embeddings
Amazon Titan Multimodal Embeddings helps you build more accurate and contextually relevant multimodal search and recommendation experiences for end users. Multimodal refers to a system’s ability to process and generate information using distinct types of data (modalities). With Titan Multimodal Embeddings, you can submit text, image, or a combination of the two as input.

The model converts images and short English text up to 128 tokens into embeddings, which capture semantic meaning and relationships between your data. You can also fine-tune the model on image-caption pairs. For example, you can combine text and images to describe company-specific manufacturing parts to understand and identify parts more effectively.

By default, Titan Multimodal Embeddings generates vectors of 1,024 dimensions, which you can use to build search experiences that offer a high degree of accuracy and speed. You can also configure smaller vector dimensions to optimize for speed and price performance. The model provides an asynchronous batch API, and the Amazon OpenSearch Service will soon offer a connector that adds Titan Multimodal Embeddings support for neural search.
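As a rough sketch of how a smaller vector size might be requested, the helper below adds an `embeddingConfig` object with an `outputEmbeddingLength` field to the request body. Treat that field name and the allowed sizes (256, 384, 1,024) as assumptions to verify against the Amazon Bedrock documentation; `build_embedding_body` itself is a hypothetical helper.

```python
import json

ALLOWED_LENGTHS = (256, 384, 1024)  # assumption: check the Bedrock docs for the supported sizes


def build_embedding_body(input_text=None, input_image_b64=None, output_length=1024):
    """Return a JSON request body with text, image, or both, and the requested vector size."""
    if input_text is None and input_image_b64 is None:
        raise ValueError("provide input_text, input_image_b64, or both")
    if output_length not in ALLOWED_LENGTHS:
        raise ValueError(f"output_length must be one of {ALLOWED_LENGTHS}")
    body = {"embeddingConfig": {"outputEmbeddingLength": output_length}}
    if input_text is not None:
        body["inputText"] = input_text
    if input_image_b64 is not None:
        body["inputImage"] = input_image_b64
    return json.dumps(body)


small_body = build_embedding_body(input_text="Green iguana on tree branch", output_length=384)
print(json.loads(small_body)["embeddingConfig"])
```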

Amazon Titan Multimodal Embeddings in action
For this demo, I create a combined image and text embedding. First, I base64-encode my image, and then I specify either inputText, inputImage, or both in the body of the inference request.

# Maximum image size supported is 2048 x 2048 pixels
with open("iguana.png", "rb") as image_file:
    input_image = base64.b64encode(image_file.read()).decode('utf8')

# You can specify either text or image or both
body = json.dumps(
    {
        "inputText": "Green iguana on tree branch",
        "inputImage": input_image
    }
)

Next, specify the model ID for Amazon Titan Multimodal Embeddings and use the InvokeModel API to send the inference request.

response = bedrock_runtime.invoke_model(
    body=body, modelId="amazon.titan-embed-image-v1"
)

Let’s see the response.

response_body = json.loads(response.get("body").read())
print(response_body.get("embedding"))

[0.005087942, -0.004392853, -0.04764151, -0.024312444, 0.049922388, 0.0132532045, 0.014374298, 0.005523709, -0.015199458, 0.02182385, ...]

I truncated the output for brevity. Comparing multimodal embedding vectors with metrics like cosine similarity or Euclidean distance shows how similar or different the represented information is across modalities. A higher cosine similarity or a smaller Euclidean distance means the inputs are more alike.
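Both metrics are short to write out. The sketch below uses only the standard library and tiny made-up vectors in place of real 1,024-dimension embeddings. Note the direction of each metric: cosine similarity grows as vectors become more alike, so the matching distance is 1 minus the similarity, while Euclidean distance shrinks.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors: 0.0 means identical."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


same = cosine_similarity([1.0, 0.0], [1.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
distance = euclidean_distance([0.0, 0.0], [3.0, 4.0])
print(same, orthogonal, distance)
```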

As a next step, you could build an image database by storing and indexing the multimodal embeddings in a vector store or vector database. To implement text-to-image search, query the database with inputText. For image-to-image search, query the database with inputImage. For image+text-to-image search, query the database with both inputImage and inputText.
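The search step can be sketched with a toy in-memory index. This stands in for a real vector store only to show the flow: store one embedding per image, embed the query (text, image, or both) with the same model, and rank stored items by similarity. `TinyVectorIndex`, the file names, and the three-dimensional vectors are all made up for illustration.

```python
import math


def normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


class TinyVectorIndex:
    """A toy stand-in for a vector database: brute-force search over a list."""

    def __init__(self):
        self._items = []

    def add(self, item_id, embedding):
        self._items.append((item_id, normalize(embedding)))

    def search(self, query_embedding, k=3):
        query = normalize(query_embedding)
        scored = [
            (sum(q * e for q, e in zip(query, embedding)), item_id)
            for item_id, embedding in self._items
        ]
        scored.sort(reverse=True)  # highest similarity first
        return [(item_id, score) for score, item_id in scored[:k]]


index = TinyVectorIndex()
index.add("iguana.png", [0.9, 0.1, 0.0])
index.add("car.png", [0.0, 0.2, 0.9])
index.add("lizard.png", [0.8, 0.3, 0.1])

results = index.search([1.0, 0.0, 0.0], k=2)
print([item_id for item_id, _ in results])
```

A brute-force scan like this is fine for a handful of images; a vector database adds approximate nearest-neighbor indexing so the same query stays fast at scale.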

Amazon Titan Text
Amazon Titan Text Lite and Amazon Titan Text Express are large language models (LLMs) that support a wide range of text-related tasks, including summarization, translation, and conversational chatbot systems. They can also generate code and are optimized to support popular programming languages and text formats like JSON and CSV.

Titan Text Express – Titan Text Express has a maximum context length of 8,192 tokens and is ideal for a wide range of tasks, such as open-ended text generation and conversational chat, as well as for use within Retrieval Augmented Generation (RAG) workflows.

Titan Text Lite – Titan Text Lite has a maximum context length of 4,096 tokens and is a price-performant version that is ideal for English-language tasks. The model is highly customizable and can be fine-tuned for tasks such as article summarization and copywriting.

Amazon Titan Text in action
For this demo, I ask Titan Text to write an email to my team members suggesting they organize a live stream: “Compose a short email from Antje, Principal Developer Advocate, encouraging colleagues in the developer relations team to organize a live stream to demo our new Amazon Titan V1 models.”

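body = json.dumps({
    "inputText": prompt,
    # optional "textGenerationConfig" (temperature, topP, maxTokenCount, stopSequences) is elided in this copy
})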

Titan Text FMs support temperature and topP inference parameters to control the randomness and diversity of the response, as well as maxTokenCount and stopSequences to control the length of the response.
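A minimal sketch of how these four parameters fit into the request body follows. The `textGenerationConfig` key names match the parameters named above; the helper `build_titan_text_body` and the default values are illustrative assumptions, not recommended settings.

```python
import json


def build_titan_text_body(prompt, temperature=0.0, top_p=0.9,
                          max_token_count=512, stop_sequences=None):
    """Return a JSON request body for a Titan Text model with explicit inference parameters."""
    return json.dumps({
        "inputText": prompt,
        "textGenerationConfig": {
            "temperature": temperature,          # randomness of the response
            "topP": top_p,                       # diversity of the response
            "maxTokenCount": max_token_count,    # upper bound on response length
            "stopSequences": stop_sequences or [],
        },
    })


text_body = build_titan_text_body("Compose a short email.", temperature=0.2, max_token_count=256)
print(json.loads(text_body)["textGenerationConfig"])
```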

Next, choose the model ID for one of the Titan Text models and use the InvokeModel API to send the inference request.

response = bedrock_runtime.invoke_model(
    # Choose modelID
    # Titan Text Express: "amazon.titan-text-express-v1"
    # Titan Text Lite: "amazon.titan-text-lite-v1"
    modelId="amazon.titan-text-express-v1",
    body=body,
)

Let’s have a look at the response.

response_body = json.loads(response.get('body').read())
outputText = response_body.get('results')[0].get('outputText')

text = outputText[outputText.index('\n')+1:]
email = text.strip()
print(email)

Subject: Demo our new Amazon Titan V1 models live!

Dear colleagues,

I hope this email finds you well. I am excited to announce that we have recently launched our new Amazon Titan V1 models, and I believe it would be a great opportunity for us to showcase their capabilities to the wider developer community.

I suggest that we organize a live stream to demo these models and discuss their features, benefits, and how they can help developers build innovative applications. This live stream could be hosted on our YouTube channel, Twitch, or any other platform that is suitable for our audience.

I believe that showcasing our new models will not only increase our visibility but also help us build stronger relationships with developers. It will also provide an opportunity for us to receive feedback and improve our products based on the developer’s needs.

If you are interested in organizing this live stream, please let me know. I am happy to provide any support or guidance you may need. Together, let’s make this live stream a success and showcase the power of Amazon Titan V1 models to the world!

Best regards,
Principal Developer Advocate

Nice. I could send this email right away!

Availability and pricing
Amazon Titan Text FMs are available today in AWS Regions US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore, Tokyo), and Europe (Frankfurt). Amazon Titan Multimodal Embeddings is available today in the AWS Regions US East (N. Virginia) and US West (Oregon). Amazon Titan Image Generator is available in public preview in the AWS Regions US East (N. Virginia) and US West (Oregon). For pricing details, see the Amazon Bedrock Pricing page.

Learn more

Go to the AWS Management Console to start building generative AI applications with Amazon Titan FMs on Amazon Bedrock today!

— Antje

Amazon Bedrock now provides access to Anthropic’s latest model, Claude 2.1

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/amazon-bedrock-now-provides-access-to-anthropics-latest-model-claude-2-1/

Today, we’re announcing the availability of Anthropic’s Claude 2.1 foundation model (FM) in Amazon Bedrock. Last week, Anthropic introduced its latest model, Claude 2.1, delivering key capabilities for enterprises such as an industry-leading 200,000 token context window (2x the context of Claude 2.0), reduced rates of hallucination, improved accuracy over long documents, system prompts, and a beta tool use feature for function calling and workflow orchestration.

With Claude 2.1’s availability in Amazon Bedrock, you can build enterprise-ready generative artificial intelligence (AI) applications using more honest and reliable AI systems from Anthropic. You can now use the Claude 2.1 model provided by Anthropic in the Amazon Bedrock console.

Here are some key highlights about the new Claude 2.1 model in Amazon Bedrock:

200,000 token context window – Enterprise applications demand larger context windows and more accurate outputs when working with long documents such as product guides, technical documentation, or financial or legal statements. Claude 2.1 supports 200,000 tokens, the equivalent of roughly 150,000 words or over 500 pages of documents. When uploading extensive information to Claude, you can summarize, perform Q&A, forecast trends, and compare and contrast multiple documents for drafting business plans and analyzing complex contracts.
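The equivalence quoted above is simple arithmetic, sketched here. The ratios are assumptions implied by the figures in the text (about 0.75 words per token, and roughly 300 words per page); real tokenization varies with language and content, so use this only for rough sizing.

```python
import math

CONTEXT_TOKENS = 200_000
WORDS_PER_TOKEN = 0.75  # assumption implied by 200,000 tokens being roughly 150,000 words
WORDS_PER_PAGE = 300    # assumption: a typical page, giving roughly 500 pages

words = CONTEXT_TOKENS * WORDS_PER_TOKEN
pages = words / WORDS_PER_PAGE


def fits_in_context(word_count: int) -> bool:
    """Rough check of whether a document of this many words fits in the context window."""
    return math.ceil(word_count / WORDS_PER_TOKEN) <= CONTEXT_TOKENS


print(int(words), int(pages), fits_in_context(120_000))
```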

Strong accuracy upgrades – Claude 2.1 has also made significant gains in honesty, with a 2x decrease in hallucination rates, 50 percent fewer hallucinations in open-ended conversation and document Q&A, a 30 percent reduction in incorrect answers, and a 3–4 times lower rate of mistakenly concluding that a document supports a particular claim compared to Claude 2.0. Claude increasingly knows what it doesn’t know and will more likely demur rather than hallucinate. With this improved accuracy, you can build more reliable, mission-critical applications for your customers and employees.

System prompts – Claude 2.1 now supports system prompts, a new feature that can improve Claude’s performance in a variety of ways, including greater character depth and role adherence in role-playing scenarios, particularly over longer conversations, as well as stricter adherence to guidelines, rules, and instructions. This represents a structural change, but not a content change from former ways of prompting Claude.

Tool use for function calling and workflow orchestration – Available as a beta feature, Claude 2.1 can now integrate with your existing internal processes, products, and APIs to build generative AI applications. Claude 2.1 accurately retrieves and processes data from additional knowledge sources as well as invokes functions for a given task.  Claude 2.1 can answer questions by searching databases using private APIs and a web search API, translate natural language requests into structured API calls, or connect to product datasets to make recommendations and help customers complete purchases. Access to this feature is currently limited to select early access partners, with plans for open access in the near future. If you are interested in gaining early access, please contact your AWS account team.
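As a rough check on the context-window figures in the first highlight above, the arithmetic can be sketched as follows. The 0.75 words-per-token ratio is implied by the 200,000-token and 150,000-word figures. The 300 words-per-page value is an assumption chosen for illustration, not a number published by Anthropic or AWS.

```python
# Back-of-the-envelope sizing for a 200,000-token context window.
CONTEXT_TOKENS = 200_000
WORDS_PER_TOKEN = 0.75   # implied by 200,000 tokens ~ 150,000 words
WORDS_PER_PAGE = 300     # assumption: a typical page of prose


def approx_words(tokens: int) -> int:
    """Approximate number of English words that fit in `tokens` tokens."""
    return round(tokens * WORDS_PER_TOKEN)


def approx_pages(tokens: int) -> int:
    """Approximate number of pages, using the assumed words-per-page value."""
    return approx_words(tokens) // WORDS_PER_PAGE


def fits_in_context(word_count: int, reserved_output_tokens: int = 0) -> bool:
    """Estimate whether a document of `word_count` words fits in the window,
    optionally leaving room for the model's answer."""
    needed = word_count / WORDS_PER_TOKEN + reserved_output_tokens
    return needed <= CONTEXT_TOKENS


print(approx_words(CONTEXT_TOKENS), approx_pages(CONTEXT_TOKENS))
```

Real token counts depend on the tokenizer and the text, so treat these numbers as estimates only.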

To learn more about Claude 2.1’s features and capabilities, visit Anthropic Claude on Amazon Bedrock and the Amazon Bedrock documentation.

Claude 2.1 in action
To get started with Claude 2.1 in Amazon Bedrock, go to the Amazon Bedrock console. Choose Model access on the bottom left pane, then choose Manage model access on the top right side, submit your use case, and request model access to the Anthropic Claude model. It may take several minutes to get access to models. If you already have access to the Claude model, you don’t need to request access separately for Claude 2.1.

To test Claude 2.1 in chat mode, choose Text or Chat under Playgrounds in the left menu pane. Then select Anthropic and then Claude v2.1.

By choosing View API request, you can also access the model via code examples in the AWS Command Line Interface (AWS CLI) and AWS SDKs. Here is a sample of the AWS CLI command:

$ aws bedrock-runtime invoke-model \
      --model-id anthropic.claude-v2:1 \
      --body '{"prompt": "\n\nHuman: Tell me a funny joke about outer space!\n\nAssistant:", "max_tokens_to_sample": 50}' \
      --cli-binary-format raw-in-base64-out \
      invoke-model-output.txt
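For readers who prefer an SDK, the AWS CLI command above might look roughly like this with the AWS SDK for Python (Boto3). This is a minimal sketch, not the code the console generates: the helper names `build_body` and `invoke_claude` are hypothetical, and the call itself requires Boto3, AWS credentials, and model access in the chosen Region.

```python
import json

MODEL_ID = "anthropic.claude-v2:1"  # same model ID as in the CLI example


def build_body(user_message: str, max_tokens: int = 50) -> str:
    """Build the JSON request body using Claude's Human/Assistant prompt format."""
    prompt = f"\n\nHuman: {user_message}\n\nAssistant:"
    return json.dumps({"prompt": prompt, "max_tokens_to_sample": max_tokens})


def invoke_claude(user_message: str, region: str = "us-east-1") -> str:
    """Send the prompt to Amazon Bedrock and return the completion text."""
    import boto3  # imported lazily so build_body works without the SDK installed

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(
        modelId=MODEL_ID,
        body=build_body(user_message),
        contentType="application/json",
        accept="application/json",
    )
    # The response body is a stream containing a JSON document.
    return json.loads(response["body"].read())["completion"]


if __name__ == "__main__":
    # Only prints the request body; call invoke_claude(...) to hit the service.
    print(build_body("Tell me a funny joke about outer space!"))
```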

You can use system prompt engineering techniques provided by the Claude 2.1 model, where you place your inputs and documents before any questions that reference or utilize that content. Inputs can be natural language text, structured documents, or code snippets using <document>, <papers>, <books>, or <code> tags, and so on. You can also use conversational text, such as chat history, and Retrieval Augmented Generation (RAG) results, such as chunked documents.

Here is a system prompt example for support agents to respond to customer questions based on corporate documents.

Here are some documents for you to reference for your task:
<document index="1">
  (the text content of the document - could be a passage, web page, article, etc)
</document>
<document index="2"></document>
<document index="3"></document>

You are Larry, and you are a customer advisor with deep knowledge of your company's products. Larry has a great deal of patience with his customers, even when they say nonsense or are sarcastic. Larry's answers are polite but sometimes funny. However, he only answers questions about the company's products and doesn't know much about other questions. Use the provided documentation to answer user questions.

Human: Your product is making a weird stuttering sound when I operate. What might be the problem?
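The ordering in the system prompt example above (documents first, then the role instructions, then the customer's question) can be assembled programmatically. The following is a minimal sketch under the assumption that you use the text completion format, where everything before the first Human turn is treated as the system prompt. The function name `build_support_prompt` is hypothetical.

```python
from typing import List


def build_support_prompt(documents: List[str], instructions: str, question: str) -> str:
    """Place documents first, then the role instructions, then the Human turn."""
    doc_blocks = "\n".join(
        f'<document index="{i}">\n{text}\n</document>'
        for i, text in enumerate(documents, start=1)
    )
    system_prompt = (
        "Here are some documents for you to reference for your task:\n"
        f"{doc_blocks}\n\n{instructions}"
    )
    # The system prompt is the text that precedes the first Human turn.
    return f"{system_prompt}\n\nHuman: {question}\n\nAssistant:"


if __name__ == "__main__":
    print(
        build_support_prompt(
            ["(text of the first document)", "(text of the second document)"],
            "You are Larry, and you are a customer advisor.",
            "Your product is making a weird stuttering sound. What might be the problem?",
        )
    )
```

The resulting string can be passed as the `prompt` field of the request body.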

To learn more about prompt engineering on Amazon Bedrock, see the Prompt engineering guidelines included in the Amazon Bedrock documentation. You can learn general prompt techniques, templates, and examples for Amazon Bedrock text models, including Claude.

Now available
Claude 2.1 is available today in the US East (N. Virginia) and US West (Oregon) Regions.

You only pay for what you use, with no time-based term commitments for on-demand mode. For text generation models, you are charged for every input token processed and every output token generated. Or you can choose the provisioned throughput mode to meet your application’s performance requirements in exchange for a time-based term commitment. To learn more, see Amazon Bedrock Pricing.
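To make the on-demand pricing model concrete, here is a small sketch of the arithmetic. It assumes rates are quoted per 1,000 tokens and takes them as parameters. No actual prices are included, so look them up on the Amazon Bedrock Pricing page before relying on any estimate.

```python
def on_demand_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_1k: float,
    output_price_per_1k: float,
) -> float:
    """Estimate the charge for one request: input and output tokens are billed separately."""
    input_cost = (input_tokens / 1000) * input_price_per_1k
    output_cost = (output_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost


if __name__ == "__main__":
    # Placeholder rates for illustration only; these are NOT real prices.
    print(on_demand_cost(1000, 500, input_price_per_1k=0.01, output_price_per_1k=0.02))
```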

Give Anthropic Claude 2.1 a try in the Amazon Bedrock console today and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.


Rapid7 Takes Next Step in AI Innovation with New AI-Powered Threat Detections

Post Syndicated from Laura Ellis original https://blog.rapid7.com/2023/11/29/rapid7-takes-next-step-in-ai-innovation-with-new-ai-powered-threat-detections/

Digital transformation has created immense opportunity to generate new revenue streams, better engage with customers and drive operational efficiency. A decades-long transition to cloud as the de-facto delivery model of choice has delivered undeniable value to the business landscape. But any change in operating model brings new challenges too.

The speed, scale and complexity of modern IT environments result in security teams being tasked with analyzing mountains of data to keep pace with the ever-expanding threat landscape. This dynamic puts security analysts on their heels, constantly reacting to incoming threat signals from tools that weren’t purpose-built for hybrid environments, creating coverage gaps and a need to swivel-chair between a multitude of point solutions. Making matters worse? Attackers have increasingly looked to weaponize AI technologies to launch sophisticated attacks, benefiting from increased scale, easy access to AI-generated malware packages, as well as more effective social engineering and phishing using generative AI.

To combat these challenges, we need to equip our security teams with modern solutions leveraging AI to cut through the noise and boost signals that matter. Our AI algorithms alleviate alert and action fatigue by delivering visibility across your IT environment and intelligently prioritizing the most important risk signals.

Rapid7 Has Been an AI Innovator for Decades

There has been a groundswell of development and corresponding interest in generative AI, particularly in the last few years as mainstream adoption of large language models (LLMs) grows. Generative AI, most notably OpenAI’s ChatGPT, has brought AI to the forefront of people’s minds. This buzz has resulted in a number of vendors in the security space launching their own intelligent assistants and working to incorporate AI/ML into their respective solutions to keep pace.

From our perspective, this is great news and a huge step forward in the data & AI space. Rapid7 is also accelerating investment, but we’re certainly not starting from scratch. In fact, Rapid7 was a pioneer in AI development for security use cases, dating all the way back to our earliest days with our VM Expert System in the early 2000s.

Built on decades of risk analysis and continuously trained by our expert SOC team, Rapid7 AI enables your team to focus on what matters most by proactively shrinking your attack surface and intelligently re-balancing the scales between signal and noise.

With visibility across your hybrid attack surface, the Insight Platform enables proactive prevention, leveraging a proprietary AI-based detection engine to spot threats faster than ever and automatically prioritize the signals that matter most based on likelihood of exploitation and potential business impact. Based on learnings from your own environment and security operations over time, the platform will intelligently recommend updates to detection rule settings in an effort to reduce excess noise and eliminate false positives.

By integrating our AI capabilities into the Rapid7 platform, customers benefit from:

  • World-class threat efficacy, with AI-driven detection of anomalous activity. With a vast amount of legitimate activity occurring across customer environments, our AI algorithms validate if activity is actually malicious, allowing teams to spot unknown threats faster than ever.
  • Help to cut through the noise, by identifying the signals that matter most. Our AI algorithms automatically prioritize risks and threats, intelligently suppressing benign alerts to eliminate the noise so analysts can focus on what matters most.
  • The confidence that they’re taking action on AI-generated insights they can trust, built on decades of risk and threat analysis and trained by a team of recognized innovators in AI-driven security.

Recent Innovations in AI-driven Threat Detection

We’ve recently announced two new AI/ML-powered threat detection capabilities aimed at enabling teams to detect unknown threats across a customer’s environment faster than ever before without introducing excess noise.

  • Cloud Anomaly Detection
    Cloud Anomaly Detection is an AI-powered, agentless detection capability designed to detect and prioritize anomalous activity within an organization’s own cloud environment. The proprietary AI engine goes beyond simply detecting suspicious behavior; it automatically suppresses benign signals to reduce noise, eliminate false positives, and enable teams to focus on investigating highly-probable active threats. For more information on Cloud Anomaly Detection, check out the launch blog here.
  • Intelligent Kerberoasting Detection
    We’ve expanded existing AI-driven detections for attack types such as data exfiltration, phishing and credential theft to include intelligent detection, validation and prioritization of Kerberoasting attacks. The platform goes beyond traditional tactics for detecting Kerberoasting by applying a deep understanding of typical user activity across the organization. With this context, SOC teams can respond with confidence knowing the signals they are receiving are actually indicative of a Kerberoasting attack.
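To illustrate the general idea of judging activity against "typical user activity", here is a generic per-user baseline check. This is not Rapid7's detection engine, whose internals are not described in this post. It is a simple statistical sketch that assumes you already have, per user, daily counts of distinct service tickets requested. The function name, threshold, and data shape are all hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List


def unusual_ticket_requesters(
    history: Dict[str, List[int]],
    today: Dict[str, int],
    threshold: float = 3.0,
    min_stdev: float = 1.0,
) -> List[str]:
    """Return users whose ticket-request count today is far above their own baseline."""
    flagged = []
    for user, count in today.items():
        past = history.get(user, [])
        if not past:
            continue  # no baseline yet; a real system would treat new users separately
        baseline = mean(past)
        spread = max(pstdev(past), min_stdev)  # floor keeps quiet users from dividing by ~0
        if (count - baseline) / spread > threshold:
            flagged.append(user)
    return sorted(flagged)


if __name__ == "__main__":
    history = {"alice": [2, 3, 2, 3], "bob": [10, 12, 11, 9]}
    today = {"alice": 40, "bob": 12}
    print(unusual_ticket_requesters(history, today))
```

A burst of ticket requests from a user who normally requests very few stands out against that user's own history, even if the absolute number would be normal for someone else.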

Rapid7 continues to explore and invest in ways we can leverage AI/ML to better equip our customers to defend their organizations against the ever-expanding threat landscape. Keep an eye out in the near future for additional innovations to come out in this space.

For now, be sure to stop by the Rapid7 booth (#1270) at AWS Re:Invent, where we’ll be showcasing Cloud Anomaly Detection, and talk to us about how your team is thinking about utilizing AI.

New generative AI capabilities for Amazon DataZone to further simplify data cataloging and discovery (preview)

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-generative-ai-capabilities-for-amazon-datazone-to-further-simplify-data-cataloging-and-discovery-preview/

Today, we are announcing a preview of an automation feature backed by generative artificial intelligence (AI) for Amazon DataZone that will dramatically decrease the amount of time needed to provide context for organizational data. The new feature can automate the traditionally labor-intensive process of data cataloging. Powered by the large language models (LLMs) of Amazon Bedrock, it generates detailed descriptions of data assets and their schemas, and suggests analytical use cases. You can generate a comprehensive business context with a single click.

We heard from customers that data consumers such as data analysts, scientists, and engineers in organizations struggle to understand the data’s relevance with little metadata. As a result, they either spend more time interpreting the data, or they return to data producers with continued questions. So, data producers such as data owners, engineers, and analysts who own the data and make it available for consumers need to manually enter detailed context for higher-priority data to make data shareable and discoverable. This is time-consuming and the number one problem customers have when trying to collate their data in a system for self-service by consumers.

When we launched the general availability of Amazon DataZone in October 2023, we introduced the first feature that brings generative AI capabilities to automate the generation of the table name and column names of a business catalog asset. In the data portal of Amazon DataZone, the green brain icon indicates automatically generated metadata suggestions. You could accept, edit, or reject each suggestion recommended by Amazon DataZone.

What’s new with today’s preview announcement?
Now, in addition to column and table names, you can automatically generate more detailed descriptions of the table and schema, as well as suggested uses.

In the Business Metadata tab in the data portal, when you choose Generate summary, new content will be generated to explain the table and its metadata.

You can also accept, edit, and reject this recommendation.

When you choose the Schema tab, you can also see new Description recommendations as well as the Name. You can review generated metadata and choose to accept, edit, or reject the recommendation.

This new feature will enhance data discoverability and reduce back-and-forth communications between data consumers and producers. You will have a richer search experience based on extensive data insights in the future.

Join the preview
The new metadata generation capability is now available in preview in the AWS US East (N. Virginia) and US West (Oregon) Regions. With this new generative AI capability, you can reduce time-to-insight by accelerating data cataloging and boosting data discovery. To learn more, visit Amazon DataZone: Automate Data Discovery.

Give it a try and send feedback to AWS re:Post for Amazon DataZone or through your usual AWS Support contacts.


New generative AI features in Amazon Connect, including Amazon Q, facilitate improved contact center service

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/new-generative-ai-features-in-amazon-connect-including-amazon-q-facilitate-improved-contact-center-service/

If you manage a contact center, then you know the critical role that agents play in helping your organization build customer trust and loyalty. Those of us who’ve reached out to a contact center know how important agents are in guiding us through complex decisions and providing fast and accurate solutions where needed. This can take time, and if not done correctly, then it may lead to frustration.

Generative AI capabilities in Amazon Connect
Today, we’re announcing that the existing artificial intelligence (AI) features of Amazon Connect now have generative AI capabilities that are powered by large language models (LLMs) available through Amazon Bedrock to transform how contact centers provide service to customers. LLMs are pre-trained on vast amounts of data and are commonly known as foundation models (FMs). They can understand and learn, generate text, engage in interactive conversations, answer questions, summarize dialogs and documents, and provide recommendations.

Amazon Q in Connect: recommended responses and actions for faster customer support
Organizations are in a state of constant change. To maintain a high level of performance that keeps up with these organizational changes, contact centers continuously onboard, train, and coach agents. Even with training and coaching, agents must often search through different sources of information, such as product guides and organization policies, to provide exceptional service to customers. This can increase customer wait times, lowering customer satisfaction and increasing contact center costs.

Amazon Q in Connect, a generative AI-powered agent assistant that includes functionality formerly available as Amazon Connect Wisdom, understands customer intents and uses relevant sources of information to deliver accurate responses and actions for the agent to communicate and resolve unique customer needs, all in real-time. Try Amazon Q in Connect for no charge until March 1, 2024. The feature is easy to enable, and you can get started in the Amazon Connect console.

Amazon Connect Contact Lens: generative post-contact summarization for increased productivity
To improve customer interactions and make sure details are available for future reference, contact center managers rely on the notes that agents manually create after every customer interaction. These notes include details on how a customer issue was addressed, key moments of the conversation, and any pending follow-up items.

Amazon Connect Contact Lens now provides generative AI-powered post-contact summarization, and enables contact center managers to more efficiently monitor and help improve contact quality and agent performance. For example, you can use summaries to track commitments made to customers and make sure of the prompt completion of follow-up actions. Moments after a customer interaction, Contact Lens now condenses the conversation into a concise and coherent summary.

Amazon Lex in Amazon Connect: assisted slot resolution
Using Amazon Lex, you can already build chatbots, virtual agents, and interactive voice response (IVR) systems that let your customers schedule an appointment without speaking to a human agent. However, some utterances are hard for a traditional bot to interpret. For example, “I need to change my travel reservation for myself and my two children,” might be difficult for a traditional bot to resolve to a numeric value (how many people are on the travel reservation?).

With the new assisted slot resolution feature, Amazon Lex can now resolve slot values in user utterances with great accuracy (for example, providing an answer to the previous question by providing a correct numeric value of three). This is powered by the advanced reasoning capabilities of LLMs which improve accuracy and provide a better customer experience. Learn about all the features of Amazon Lex, including the new generative AI-powered capabilities to help you build better self-service experiences.
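To see why the travel-reservation utterance is hard for rule-based slot filling, consider this deliberately naive resolver. It is a hypothetical illustration of a "traditional" approach, not Amazon Lex code: it returns the first digit or number word it finds, so it reports two people even though the correct answer is three.

```python
import re
from typing import Optional

NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}


def naive_party_size(utterance: str) -> Optional[int]:
    """Return the first digit or number word found, as a simple rule-based bot might."""
    for token in re.findall(r"[a-z]+|\d+", utterance.lower()):
        if token.isdigit():
            return int(token)
        if token in NUMBER_WORDS:
            return NUMBER_WORDS[token]
    return None


if __name__ == "__main__":
    text = "I need to change my travel reservation for myself and my two children"
    # The rule finds "two" and misses that "myself" adds one more traveler.
    print(naive_party_size(text))
```

Getting to three requires understanding that "myself" counts as a traveler, which is the kind of reasoning the assisted slot resolution feature delegates to an LLM.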

Amazon Connect Customer Profiles: quicker creation of unified customer profiles for personalized customer experiences
Customers expect personalized customer service experiences. To provide this, contact centers need a comprehensive understanding of customers’ preferences, purchases, and interactions. To achieve that, contact center administrators create unified customer profiles by merging customer data from a number of applications. These applications each have different types of customer data stored in varied formats across a range of data stores. Stitching together data from these various data stores requires contact center administrators to understand their data and figure out how to organize and combine it into a unified format. To accomplish this, they spend weeks compiling unified customer profiles.

Starting today, Amazon Connect Customer Profiles uses LLMs to shorten the time needed to create unified customer profiles. When contact center administrators add data sources such as Amazon Simple Storage Service (Amazon S3), Adobe Analytics, Salesforce, ServiceNow, and Zendesk, Customer Profiles analyzes the data to understand what the data format and content represent and how the data relates to customers’ profiles. Customer Profiles then automatically determines how to organize and combine data from different sources into complete, accurate profiles. With just a few steps, managers can review, make any necessary edits, and complete the setup of customer profiles.
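The manual work this feature aims to shorten can be pictured as hand-written field mappings like the ones below. This sketch is purely illustrative: the source names, field names, and the choice of email address as the matching key are all assumptions, and it does not use any Amazon Connect API.

```python
from typing import Any, Dict, List

# Hypothetical mappings: source field name -> unified profile field name.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "crm": {"Email": "email", "FullName": "name", "Phone": "phone"},
    "helpdesk": {
        "requester_email": "email",
        "requester_name": "name",
        "last_ticket": "last_ticket_id",
    },
}


def unify(records: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Merge records from several sources into one profile per email address."""
    profiles: Dict[str, Dict[str, Any]] = {}
    for record in records:
        mapping = FIELD_MAPS[record["source"]]
        normalized = {mapping[k]: v for k, v in record["data"].items() if k in mapping}
        key = normalized["email"].strip().lower()  # normalize the matching key
        normalized["email"] = key
        profile = profiles.setdefault(key, {})
        for field, value in normalized.items():
            profile.setdefault(field, value)  # first source wins on conflicts
    return profiles


if __name__ == "__main__":
    print(unify([
        {"source": "crm", "data": {"Email": "Ana@Example.com", "FullName": "Ana", "Phone": "555"}},
        {"source": "helpdesk", "data": {"requester_email": "ana@example.com", "last_ticket": "T-1"}},
    ]))
```

Every new data source means another mapping to write and maintain by hand, which is the step the post says Customer Profiles now infers automatically.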

In-app, web, and video capabilities in Amazon Connect
As an organization, you want to provide great, easy-to-use, and convenient customer service. Earlier in this post I talked about self-service chatbots and how they help you with this. At times customers want to move beyond the chatbot, and beyond an audio conversation with the agent.

Amazon Connect now has in-app, web, and video capabilities to help you deliver rich, personalized customer experiences (see Amazon Lex features for details). Using the fully-managed communication widget, and with a few lines of code, you can implement these capabilities on your web and mobile applications. This allows your customers to get support from a web or mobile application without ever having to leave the page. Video can be enabled by either the agent only, by the customer only, or by both agent and customer.

Amazon Connect SMS: two-way SMS capabilities
Almost everyone owns a mobile device and we love the flexibility of receiving text-based support on-the-go. Contact center leaders know this, and in the past have relied on disconnected, third-party solutions to provide two-way SMS to customers.

Amazon Connect now has two-way SMS capabilities to enable contact center leaders to provide this flexibility (see Amazon Lex features for details). This improves customer satisfaction and increases agent productivity without costly integration with third-party solutions. SMS chat can be enabled using the same configuration, Amazon Connect agent workspace, and analytics as calls and chats.

Learn more

Send feedback


Introducing Amazon Q, a new generative AI-powered assistant (preview)

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/introducing-amazon-q-a-new-generative-ai-powered-assistant-preview/

Today, we are announcing Amazon Q, a new generative artificial intelligence (AI)-powered assistant designed for work that can be tailored to your business. You can use Amazon Q to have conversations, solve problems, generate content, gain insights, and take action by connecting to your company’s information repositories, code, data, and enterprise systems. Amazon Q provides immediate, relevant information and advice to employees to streamline tasks, accelerate decision-making and problem-solving, and help spark creativity and innovation at work.

Amazon Q offers user-based plans, so you get features, pricing, and options tailored to how you use the product. Amazon Q can adapt its interactions to each individual user based on the existing identities, roles, and permissions of your business. AWS never uses customers’ content from Amazon Q to train the underlying models. In other words, your company information remains secure and private.

In this post, I’ll give you a quick tour of how you can use Amazon Q for general business use. 

Amazon Q is your business expert
Let’s look at a few examples of how Amazon Q can help business users complete tasks using simple natural language prompts. As a marketing manager, you could ask Amazon Q to transform a press release into a blog post, create a summary of the press release, or create an email draft based on the provided release. Amazon Q searches through your company content, which can include internal style guides, for example, to provide a response appropriate to your company’s brand standards. Then, you could ask Amazon Q to generate tailored social media prompts to promote your story through each of your social media channels. Later, you can ask Amazon Q to analyze the results of your campaign and summarize them for leadership reviews.

In the following example, I deployed Amazon Q with access to my AWS News Blog posts from 2023 and called the assistant “AWS Blog Expert.”

Coming back to my previous example, let’s assume I’m a marketing manager and want Amazon Q to help me create social media posts for recent company blog posts.

I enter the following prompt: “Summarize the key insights from Antje’s recent AWS Weekly Roundup posts and craft a compelling social media post that not only highlights the most important points but also encourages engagement. Consider our target audience and aim for a tone that aligns with our brand identity. The social media post should be concise, informative, and enticing to encourage readers to click through and read the full articles. Please ensure the content is shareable and includes relevant hashtags for maximum visibility.”

Behind the scenes, Amazon Q searches the documents in connected data sources and creates a relevant and detailed suggestion for a social media post based on my blog posts. Amazon Q also tells me which document was used to generate the answer. In this case, it is a PDF file of the blog posts in question.

As an administrator, you can define the context for responses, restrict irrelevant topics, and configure whether to respond only using trusted company information or complement responses with knowledge from the underlying model. Restricting responses to trusted company information helps mitigate hallucinations, a common phenomenon where the underlying model generates responses that sound plausible but are based on misinterpreted or nonexistent data.

Amazon Q provides fine-grained access controls that restrict responses and actions to the data each employee is permitted to access, and it provides citations and references to the original sources for fact-checking and traceability. You can choose among 40+ built-in connectors for popular data sources and enterprise systems, including Amazon S3, Google Drive, Microsoft SharePoint, Salesforce, ServiceNow, and Slack.
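The combination described above (retrieve only from documents the user may access, then cite the sources used) can be sketched with a toy in-memory retriever. This is a conceptual illustration only. It does not reflect how Amazon Q is implemented, and the `Document` class, the group-based permission model, and the keyword-overlap scoring are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Document:
    title: str
    text: str
    allowed_groups: Set[str]


def retrieve(query: str, docs: List[Document], user_groups: Set[str], top_k: int = 2) -> List[Document]:
    """Rank the documents the user may read by how many query words they contain."""
    words = set(query.lower().split())
    visible = [d for d in docs if d.allowed_groups & user_groups]  # access check first
    scored = [(len(words & set(d.text.lower().split())), d) for d in visible]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [d for _, d in ranked[:top_k]]


def answer_with_citations(query: str, docs: List[Document], user_groups: Set[str]) -> str:
    """Return a placeholder answer that names its sources, or decline if none match."""
    hits = retrieve(query, docs, user_groups)
    if not hits:
        return "No answer found in the documents you can access."
    return "Answer drafted from: " + ", ".join(d.title for d in hits)


if __name__ == "__main__":
    docs = [
        Document("roundup.pdf", "weekly roundup of aws launches", {"marketing"}),
        Document("salaries.pdf", "weekly salaries report", {"hr"}),
    ]
    print(answer_with_citations("weekly roundup", docs, {"marketing"}))
```

Declining to answer when nothing relevant is retrieved mirrors the administrator option of responding only from trusted company information.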

How to tailor Amazon Q to your business
To tailor Amazon Q to your business, navigate to Amazon Q in the console, select Applications in the left menu, and choose Create application.

This starts the following workflow.

Step 1. Create application. Provide an application name and create a new or select an existing AWS Identity and Access Management (IAM) service role that Amazon Q is allowed to assume. I call my application AWS-Blog-Expert. Then, choose Create.

Step 2. Select retriever. A retriever pulls data from the index in real time during a conversation. You can choose between two options: use the Amazon Q native retriever or use an existing Amazon Kendra retriever. The native retriever can connect to the Amazon Q supported data sources. If you already use Amazon Kendra, you can select the existing Amazon Kendra retriever to connect the associated data sources to your Amazon Q application. I select the native retriever option. Then, choose Next.

Step 3. Connect data sources. Amazon Q comes with built-in connectors for popular data sources and enterprise systems. For this demo, I choose Amazon S3 and configure the data source by pointing to my S3 bucket with the PDFs of my blog posts.

Once the data source sync is successfully complete and the retriever shows the accurate document count, you can preview the web experience and start a conversation. Note that the data source sync can take from a few minutes to a few hours, depending on the amount and size of data to index.

You can also connect plugins that manage access to enterprise systems, including ServiceNow, Jira, Salesforce, and Zendesk. Plugins enable Amazon Q to perform user-requested tasks, such as creating support tickets or analyzing sales forecasts.

Preview and deploy web experience
In the application overview, choose Preview web experience. This opens the web experience with the conversational interface to chat with the tailored Amazon Q AWS Blog Expert. In the final step, you deploy the Amazon Q web experience. You can integrate your SAML 2.0–compliant external identity provider (IdP) using IAM. Amazon Q can work with any IdP that’s compliant with SAML 2.0. Amazon Q uses service-initiated single sign-on (SSO) to authenticate users.

Join the preview
Amazon Q is available today in preview in AWS Regions US East (N. Virginia) and US West (Oregon). Visit the product page to learn how Amazon Q can become your expert in your business.

Also, check out the Amazon Q Slack Gateway GitHub repository that shows how to make Amazon Q available to users as a Slack Bot application.

Learn more

— Antje

Amazon Q brings generative AI-powered assistance to IT pros and developers (preview)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/amazon-q-brings-generative-ai-powered-assistance-to-it-pros-and-developers-preview/

Today, we are announcing the preview of Amazon Q, a new type of generative artificial intelligence (AI) powered assistant that is specifically for work and can be tailored to a customer’s business.

Amazon Q brings a set of capabilities to support developers and IT professionals. Now you can use Amazon Q to get started building applications on AWS, research best practices, resolve errors, and get assistance in coding new features for your applications. For example, Amazon Q Code Transformation can now perform Java application upgrades from versions 8 and 11 to version 17.

Amazon Q is available in multiple areas of AWS to provide quick access to answers and ideas wherever you work. Here’s a quick look at Amazon Q, including in an integrated development environment (IDE):

Building applications together with Amazon Q
Application development is a journey. It involves a continuous cycle of researching, developing, deploying, optimizing, and maintaining. At each stage, there are many questions—from figuring out the right AWS services to use, to troubleshooting issues in the application code.

Trained on 17 years of AWS knowledge and best practices, Amazon Q is designed to help you at each stage of development with a new experience for building applications on AWS. With Amazon Q, you minimize the time and effort you need to gain the knowledge required to answer AWS questions, explore new AWS capabilities, learn unfamiliar technologies, and architect solutions that fuel innovation.

Let us show you some capabilities of Amazon Q.

1. Conversational Q&A capability
You can interact with the Amazon Q conversational Q&A capability to get started, learn new things, research best practices, and iterate on how to build applications on AWS without needing to shift focus away from the AWS console.

To start using this feature, you can select the Amazon Q icon on the right-hand side of the AWS Management Console.

For example, you can ask, “What are AWS serverless services to build serverless APIs?” Amazon Q provides concise explanations along with references you can use to follow up on your questions and validate the guidance. You can also use Amazon Q to follow up on and iterate your questions. Amazon Q will show more deep-dive answers for you with references.

There are times when we have questions for a use case with fairly specific requirements. With Amazon Q, you can elaborate on your use cases in more detail to provide context.

For example, you can ask Amazon Q, “I’m planning to create serverless APIs with 100k requests/day. Each request needs to lookup into the database. What are the best services for this workload?” Amazon Q responds with a list of AWS services you can use and tries to limit the answer results to those that are accurately referenceable and verified with best practices.
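When framing a question like this, it helps to translate the daily volume into a per-second rate. The sketch below does that arithmetic for the 100k requests/day figure in the prompt. The peak factor is a hypothetical parameter you would choose based on your own traffic pattern.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rps(requests_per_day: int) -> float:
    """Average requests per second if traffic were spread evenly across the day."""
    return requests_per_day / SECONDS_PER_DAY


def peak_rps(requests_per_day: int, peak_factor: float) -> float:
    """Estimated peak rate; peak_factor is an assumption about how bursty traffic is."""
    return average_rps(requests_per_day) * peak_factor


if __name__ == "__main__":
    print(f"{average_rps(100_000):.2f} requests/second on average")
```

At a little over one request per second on average, this is a light workload, which is useful context to include when you describe your use case.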

Here is some additional information that you might want to note:

2. Optimize Amazon EC2 instance selection
Choosing the right Amazon Elastic Compute Cloud (Amazon EC2) instance type for your workload can be challenging with all the options available. Amazon Q aims to make this easier by providing personalized recommendations.

To use this feature, you can ask Amazon Q, “Which instance families should I use to deploy a Web App Server for hosting an application?” This feature is also available when you choose to launch an instance in the Amazon EC2 console. In Instance type, you can select Get advice on instance type selection. This will show a dialog to define your requirements.

Your requirements are automatically translated into a prompt on the Amazon Q chat panel. Amazon Q returns with a list of suggestions of EC2 instances that are suitable for your use cases. This capability helps you pick the right instance type and settings so your workloads will run smoothly and more cost-efficiently.

This capability to provide EC2 instance type recommendations based on your use case is available in preview in all commercial AWS Regions.

3. Troubleshoot and solve errors directly in the console
Amazon Q can also help you to solve errors for various AWS services directly in the console. With Amazon Q proposed solutions, you can avoid slow manual log checks or research.

Let’s say that you have an AWS Lambda function that tries to interact with an Amazon DynamoDB table but, for a reason that is not yet known, it fails to run. Now, with Amazon Q, you can troubleshoot and resolve this issue faster by selecting Troubleshoot with Amazon Q.

Amazon Q provides concise analysis of the error which helps you to understand the root cause of the problem and the proposed resolution. With this information, you can follow the steps described by Amazon Q to fix the issue.

In just a few minutes, you will have the solution to solve your issues, saving significant time without disrupting your development workflow. The Amazon Q capability to help you troubleshoot errors in the console is available in preview in the US West (Oregon) for Amazon Elastic Compute Cloud (Amazon EC2), Amazon Simple Storage Service (Amazon S3), Amazon ECS, and AWS Lambda.

4. Network troubleshooting assistance
You can also ask Amazon Q to assist you in troubleshooting network connectivity issues caused by network misconfiguration in your current AWS account. For this capability, Amazon Q works with Amazon VPC Reachability Analyzer to check your connections and inspect your network configuration to identify potential issues.

This makes it easy to diagnose and resolve AWS networking problems. For example, you can ask Amazon Q, “Why can’t I SSH to my EC2 instance?” or “Why can’t I reach my web server from the Internet?”

Then, on the response text, you can select preview experience here, which will provide explanations to help you to troubleshoot network connectivity-related issues.

Here are a few things you need to know:

5. Integration and conversational capabilities within your IDEs
As we mentioned, Amazon Q is also available in supported IDEs. This allows you to ask questions and get help within your IDE by chatting with Amazon Q or invoking actions by typing / in the chat box.

To get started, you need to install or update the latest AWS Toolkit and sign in to Amazon CodeWhisperer. Once you’re signed in to Amazon CodeWhisperer, it will automatically activate the Amazon Q conversational capability in the IDE. With Amazon Q enabled, you can now start chatting to get coding assistance.

You can ask Amazon Q to describe your source code file.

From here, you can improve your application, for example, by integrating it with Amazon DynamoDB. You can ask Amazon Q, “Generate code to save data into DynamoDB table called save_data() accepting data parameter and return boolean status if the operation successfully runs.”

Once you’ve reviewed the generated code, you can do a manual copy and paste into the editor. You can also select Insert at cursor to place the generated code into the source code directly.

This feature helps you stay focused on building applications because you don’t have to leave your IDE to get answers and context-specific coding guidance. You can try the preview of this feature in Visual Studio Code and JetBrains IDEs.

6. Feature development capability
Another exciting feature that Amazon Q provides is guiding you interactively from idea to building new features within your IDE and Amazon CodeCatalyst. You can go from a natural language prompt to application features in minutes, with interactive step-by-step instructions and best practices, right from your IDE. With a prompt, Amazon Q will attempt to understand your application structure and break down your prompt into logical, atomic implementation steps.

To use this capability, you can start by invoking an action command /dev in Amazon Q and describe the task you need Amazon Q to process.

Then, from here, you can review, collaborate and guide Amazon Q in the chat for specific areas that need to be implemented.

Additional capabilities to help you ship features faster with complete pull requests are available if you’re using Amazon CodeCatalyst. In Amazon CodeCatalyst, you can assign a new or an existing issue to Amazon Q, and it will process an end-to-end development workflow for you. Amazon Q will review the existing code, propose a solution approach, seek feedback from you on the approach, generate merge-ready code, and publish a pull request for review. All you need to do after is to review the proposed solutions from Amazon Q.

The following screenshots show a pull request created by Amazon Q in Amazon CodeCatalyst.

Here are a couple of things that you should know:

  • Amazon Q feature development capability is currently in preview in Visual Studio Code and Amazon CodeCatalyst
  • To use this capability in IDE, you need to have the Amazon CodeWhisperer Professional tier. Learn more on the Amazon CodeWhisperer pricing page.

7. Upgrade applications with Amazon Q Code Transformation
With Amazon Q, you can now upgrade an entire application within a few hours by starting a guided code transformation. This capability, called Amazon Q Code Transformation, simplifies maintaining, migrating, and upgrading your existing applications.

To start, navigate to the CodeWhisperer section and then select Transform. Amazon Q Code Transformation automatically analyzes your existing codebase, generates a transformation plan, and completes the key transformation tasks suggested by the plan.

Some additional information about this feature:

  • Amazon Q Code Transformation is available in preview today in the AWS Toolkit for IntelliJ IDEA and the AWS Toolkit for Visual Studio Code.
  • To use this capability, you need to have the Amazon CodeWhisperer Professional tier during the preview.
  • During preview, you can upgrade Java 8 and 11 applications to version 17, a Java Long-Term Support (LTS) release.

Get started with Amazon Q today
With Amazon Q, you have an AI expert by your side to answer questions, write code faster, troubleshoot issues, optimize workloads, and even help you code new features. These capabilities simplify every phase of building applications on AWS.

Amazon Q lets you engage with AWS Support agents directly from the Q interface if additional assistance is required, eliminating any dead ends in the customer’s self-service experience. The integration with AWS Support is available in the console and will honor the entitlements of your AWS Support plan.

Learn more

— Donnie & Channy

Guardrails for Amazon Bedrock helps implement safeguards customized to your use cases and responsible AI policies (preview)

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/guardrails-for-amazon-bedrock-helps-implement-safeguards-customized-to-your-use-cases-and-responsible-ai-policies-preview/

As part of your responsible artificial intelligence (AI) strategy, you can now use Guardrails for Amazon Bedrock (preview) to promote safe interactions between users and your generative AI applications by implementing safeguards customized to your use cases and responsible AI policies.

AWS is committed to developing generative AI in a responsible, people-centric way by focusing on education and science and helping developers to integrate responsible AI across the AI lifecycle. With Guardrails for Amazon Bedrock, you can consistently implement safeguards to deliver relevant and safe user experiences aligned with your company policies and principles. Guardrails help you define denied topics and content filters to remove undesirable and harmful content from interactions between users and your applications. This provides an additional level of control on top of any protections built into foundation models (FMs).

You can apply guardrails to all large language models (LLMs) in Amazon Bedrock, including fine-tuned models, and Agents for Amazon Bedrock. This drives consistency in how you deploy your preferences across applications so you can innovate safely while closely managing user experiences based on your requirements. By standardizing safety and privacy controls, Guardrails for Amazon Bedrock helps you build generative AI applications that align with your responsible AI goals.

Guardrails for Amazon Bedrock

Let me give you a quick tour of the key controls available in Guardrails for Amazon Bedrock.

Key controls
Using Guardrails for Amazon Bedrock, you can define the following set of policies to create safeguards in your applications.

Denied topics – You can define a set of topics that are undesirable in the context of your application using a short natural language description. For example, as a developer at a bank, you might want to set up an assistant for your online banking application to avoid providing investment advice.

I specify a denied topic with the name “Investment advice” and provide a natural language description, such as “Investment advice refers to inquiries, guidance, or recommendations regarding the management or allocation of funds or assets with the goal of generating returns or achieving specific financial objectives.”

Content filters – You can configure thresholds to filter harmful content across hate, insults, sexual, and violence categories. While many FMs already provide built-in protections to prevent the generation of undesirable and harmful responses, guardrails give you additional controls to filter such interactions to desired degrees based on your use cases and responsible AI policies. A higher filter strength corresponds to stricter filtering.

PII redaction (in the works) – You will be able to select a set of personally identifiable information (PII) such as name, e-mail address, and phone number, that can be redacted in FM-generated responses or block a user input if it contains PII.

Guardrails for Amazon Bedrock integrates with Amazon CloudWatch, so you can monitor and analyze user inputs and FM responses that violate policies defined in the guardrails.

Join the preview
Guardrails for Amazon Bedrock is available today in limited preview. Reach out through your usual AWS Support contacts if you’d like access to Guardrails for Amazon Bedrock.

During preview, guardrails can be applied to all large language models (LLMs) available in Amazon Bedrock, including Amazon Titan Text, Anthropic Claude, Meta Llama 2, AI21 Jurassic, and Cohere Command. You can also use guardrails with custom models as well as Agents for Amazon Bedrock.

To learn more, visit the Guardrails for Amazon Bedrock web page.

— Antje

Agents for Amazon Bedrock is now available with improved control of orchestration and visibility into reasoning

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/agents-for-amazon-bedrock-is-now-available-with-improved-control-of-orchestration-and-visibility-into-reasoning/

Back in July, we introduced Agents for Amazon Bedrock in preview. Today, Agents for Amazon Bedrock is generally available.

Agents for Amazon Bedrock helps you accelerate generative artificial intelligence (AI) application development by orchestrating multistep tasks. Agents use the reasoning capability of foundation models (FMs) to break down user-requested tasks into multiple steps. They use the developer-provided instruction to create an orchestration plan and then carry out the plan by invoking company APIs and accessing knowledge bases using Retrieval Augmented Generation (RAG) to provide a final response to the end user. If you’re curious how this works, check out my previous posts on agents that include a primer on advanced reasoning and a primer on RAG.

Starting today, Agents for Amazon Bedrock also comes with enhanced capabilities that include improved control of the orchestration and better visibility into the chain of thought reasoning.

Behind the scenes, Agents for Amazon Bedrock automates the prompt engineering and orchestration of user-requested tasks, such as managing retail orders or processing insurance claims. An agent automatically builds the orchestration prompt and, if connected to knowledge bases, augments it with your company-specific information and invokes APIs to provide responses to the user in natural language.

As a developer, you can use the new trace capability to follow the reasoning that’s used as the plan is carried out. You can view the intermediate steps in the orchestration process and use this information to troubleshoot issues.

You can also access and modify the prompt that the agent automatically creates so you can further enhance the end-user experience. You can update this automatically created prompt (or prompt template) to help the FM enhance the orchestration and responses, giving you more control over the orchestration.

Let me show you how to view the reasoning steps and how to modify the prompt.

View reasoning steps
The trace gives you visibility into the agent’s reasoning, known as the chain of thought (CoT). You can use the CoT trace to see how the agent performs tasks step by step. The CoT prompt is based on a reasoning technique called ReAct (synergizing reasoning and acting). Check out the primer on advanced reasoning in my previous blog post to learn more about ReAct and the specific prompt structure.

To get started, navigate to the Amazon Bedrock console and select the working draft of an existing agent. Then, select the Test button and enter a sample user request. In the agent’s response, select Show trace.

Agents for Amazon Bedrock

The CoT trace shows the agent’s reasoning step-by-step. Open each step to see the CoT details.

The enhanced visibility helps you understand the rationale used by the agent to complete the task. As a developer, you can use this information to refine the prompts, instructions, and action descriptions to adjust the agent’s actions and responses when iteratively testing and improving the user experience.

Modify agent-created prompts
The agent automatically creates a prompt template from the provided instructions. You can update the preprocessing of user inputs, the orchestration plan, and the postprocessing of the FM response.

To get started, navigate to the Amazon Bedrock console and select the working draft of an existing agent. Then, select the Edit button next to Advanced prompts.

Here, you have access to four different types of templates. Preprocessing templates define how an agent contextualizes and categorizes user inputs. The orchestration template equips an agent with short-term memory, a list of available actions and knowledge bases along with their descriptions, as well as few-shot examples of how to break down the problem and use these actions and knowledge in different sequences or combinations. Knowledge base response generation templates define how knowledge bases will be used and summarized in the response. Postprocessing templates define how an agent will format and present a final response to the end user. You can either keep using the template defaults or edit and override the template defaults.

Things to know
Here are a few best practices and important things to know when you’re working with Agents for Amazon Bedrock.

Agents perform best when you allow them to focus on a specific task. The clearer the objective (instructions) and the more focused the available set of actions (APIs), the easier it will be for the FM to reason and identify the right steps. If you need agents to cover various tasks, consider creating separate, individual agents.

Here are a few additional guidelines:

  • Number of APIs – Use three to five APIs with a couple of input parameters in your agents.
  • API design – Follow general best practices for designing APIs, such as ensuring idempotency.
  • API call validations – Follow best practices of API design by employing exhaustive validation for all API calls. This is particularly important because large language models (LLMs) may generate hallucinated inputs and outputs, and these validations prove helpful during such occurrences.
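To make the idempotency and validation guidelines concrete, here is a minimal sketch of a backend operation that an agent might invoke. The `OrderApi` class, its `create_order` method, the `ITEM-` prefix, and the quantity range are all hypothetical names and rules chosen for illustration; they are not part of Agents for Amazon Bedrock.

```python
from dataclasses import dataclass, field


@dataclass
class OrderApi:
    """Toy in-memory backend standing in for a company API (hypothetical)."""

    processed: dict = field(default_factory=dict)

    def create_order(self, request_id, item_id, quantity):
        # Exhaustive validation: the caller may be an FM that hallucinated inputs.
        if not isinstance(request_id, str) or not request_id:
            raise ValueError("request_id must be a non-empty string")
        if not isinstance(item_id, str) or not item_id.startswith("ITEM-"):
            raise ValueError("item_id must look like 'ITEM-<id>'")
        if isinstance(quantity, bool) or not isinstance(quantity, int):
            raise ValueError("quantity must be an integer")
        if not 1 <= quantity <= 100:
            raise ValueError("quantity must be between 1 and 100")

        # Idempotency: a repeated request_id returns the stored result
        # instead of creating a second order.
        if request_id in self.processed:
            return self.processed[request_id]

        order = {
            "orderId": f"ORD-{len(self.processed) + 1}",
            "itemId": item_id,
            "quantity": quantity,
        }
        self.processed[request_id] = order
        return order
```

Calling `create_order` twice with the same `request_id` returns the same order, so a retried agent step does not duplicate work.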

Availability and pricing
Agents for Amazon Bedrock is available today in AWS Regions US East (N. Virginia) and US West (Oregon). You will be charged for the inference calls (InvokeModel API) made by agents. The InvokeAgent API is not charged separately. Amazon Bedrock Pricing has all the details.

Learn more

— Antje

Customize models in Amazon Bedrock with your own data using fine-tuning and continued pre-training

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/customize-models-in-amazon-bedrock-with-your-own-data-using-fine-tuning-and-continued-pre-training/

Today, I’m excited to share that you can now privately and securely customize foundation models (FMs) with your own data in Amazon Bedrock to build applications that are specific to your domain, organization, and use case. With custom models, you can create unique user experiences that reflect your company’s style, voice, and services.

With fine-tuning, you can increase model accuracy by providing your own task-specific labeled training dataset and further specialize your FMs. With continued pre-training, you can train models using your own unlabeled data in a secure and managed environment with customer managed keys. Continued pre-training helps models become more domain-specific by accumulating more robust knowledge and adaptability—beyond their original training.

Let me give you a quick tour of both model customization options. You can create fine-tuning and continued pre-training jobs using the Amazon Bedrock console or APIs. In the console, navigate to Amazon Bedrock, then select Custom models.

Amazon Bedrock - Custom Models

Fine-tune Meta Llama 2, Cohere Command Light, and Amazon Titan FMs
Amazon Bedrock now supports fine-tuning for Meta Llama 2, Cohere Command Light, as well as Amazon Titan models. To create a fine-tuning job in the console, choose Customize model, then choose Create Fine-tuning job.

Here’s a quick demo using the AWS SDK for Python (Boto3). Let’s fine-tune Cohere Command Light to summarize dialogs. For demo purposes, I’m using the public dialogsum dataset, but this could be your own company-specific data.

To prepare for fine-tuning on Amazon Bedrock, I converted the dataset into JSON Lines format and uploaded it to Amazon S3. Each JSON line needs to have both a prompt and a completion field. You can specify up to 10,000 training data records, but you may already see model performance improvements with a few hundred examples.

{"completion": "Mr. Smith's getting a check-up, and Doctor Haw...", "prompt": "Summarize the following conversation.\n\n#Pers..."}
{"completion": "Mrs Parker takes Ricky for his vaccines. Dr. P...", "prompt": "Summarize the following conversation.\n\n#Pers..."}
{"completion": "#Person1#'s looking for a set of keys and asks...", "prompt": "Summarize the following conversation.\n\n#Pers..."} 

I redacted the prompt and completion fields for brevity.
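As a rough sketch of the conversion step described above, the following helper turns (dialogue, summary) pairs into JSON Lines with a prompt and a completion field. The function name, the prompt wording, and the `MAX_TRAINING_RECORDS` constant are assumptions for illustration; only the two field names and the 10,000-record limit come from the text.

```python
import json

MAX_TRAINING_RECORDS = 10_000  # record limit quoted in the text above


def to_fine_tuning_jsonl(examples):
    """Serialize (dialogue, summary) pairs as JSON Lines for fine-tuning."""
    if len(examples) > MAX_TRAINING_RECORDS:
        raise ValueError(
            f"too many records: {len(examples)} > {MAX_TRAINING_RECORDS}"
        )
    lines = []
    for dialogue, summary in examples:
        record = {
            # Hypothetical prompt wording; adapt it to your own task.
            "prompt": f"Summarize the following conversation.\n\n{dialogue}\n\nSummary: ",
            "completion": summary,
        }
        lines.append(json.dumps(record))
    # One JSON object per line, each terminated by a newline.
    return "".join(line + "\n" for line in lines)
```

The returned string can be written to a `.jsonl` file and uploaded to Amazon S3.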

You can list available foundation models that support fine-tuning with the following command:

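import boto3

bedrock = boto3.client(service_name="bedrock")
bedrock_runtime = boto3.client(service_name="bedrock-runtime")

# List only the foundation models that support fine-tuning
for model in bedrock.list_foundation_models(
        byCustomizationType="FINE_TUNING")["modelSummaries"]:
    for key, value in model.items():
        print(key, ":", value)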

Next, I create a model customization job. I specify the Cohere Command Light model ID that supports fine-tuning, set customization type to FINE_TUNING, and point to the Amazon S3 location of the training data. If needed, you can also adjust the hyperparameters for fine-tuning.

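# Select the foundation model you want to customize
base_model_id = "cohere.command-light-text-v14:7:4k"

# job_name, model_name, and role (an IAM role ARN with access to the
# S3 locations) are placeholders that you define for your own account
bedrock.create_model_customization_job(
    customizationType="FINE_TUNING",
    jobName=job_name,
    customModelName=model_name,
    roleArn=role,
    baseModelIdentifier=base_model_id,
    hyperParameters={
        "epochCount": "1",
        "batchSize": "8",
        "learningRate": "0.00001",
    },
    trainingDataConfig={"s3Uri": "s3://path/to/train-summarization.jsonl"},
    outputDataConfig={"s3Uri": "s3://path/to/output"},
)

# Check for the job status
status = bedrock.get_model_customization_job(jobIdentifier=job_name)["status"]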

Once the job is complete, you receive a unique model ID for your custom model. Your fine-tuned model is stored securely by Amazon Bedrock. To test and deploy your model, you need to purchase Provisioned Throughput.
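Because a customization job can run for a while, you might wrap the status check in a small polling loop. This is a minimal sketch: the helper name, the polling interval, and the poll limit are assumptions, and it expects a Boto3 `bedrock` client (or any object with the same `get_model_customization_job` method) to be passed in.

```python
import time

# Job states after which the status no longer changes
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_customization_job(bedrock_client, job_name,
                               poll_seconds=60, max_polls=1000):
    """Poll the job status until it reaches a terminal state, then return it."""
    for _ in range(max_polls):
        job = bedrock_client.get_model_customization_job(jobIdentifier=job_name)
        status = job["status"]
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_name!r} did not finish after {max_polls} polls")
```

Passing the client as an argument keeps the helper easy to exercise with a stub before you point it at a real job.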

Let’s see the results. I select one example from the dataset and ask the base model before fine-tuning, as well as the custom model after fine-tuning, to summarize the following dialog:

prompt = """Summarize the following conversation.\\n\\n
#Person1#: Hello. My name is John Sandals, and I've got a reservation.\\n
#Person2#: May I see some identification, sir, please?\\n
#Person1#: Sure. Here you are.\\n
#Person2#: Thank you so much. Have you got a credit card, Mr. Sandals?\\n
#Person1#: I sure do. How about American Express?\\n
#Person2#: Unfortunately, at the present time we take only MasterCard or VISA.\\n
#Person1#: No American Express? Okay, here's my VISA.\\n
#Person2#: Thank you, sir. You'll be in room 507, nonsmoking, with a queen-size bed. Do you approve, sir?\\n
#Person1#: Yeah, that'll be fine.\\n
#Person2#: That's great. This is your key, sir. If you need anything at all, anytime, just dial zero.\\n\\n
Summary: """

Use the Amazon Bedrock InvokeModel API to query the models.

import json

body = {
    "prompt": prompt,
    "temperature": 0.5,
    "p": 0.9,
    "max_tokens": 512,
}

response = bedrock_runtime.invoke_model(
    # Use on-demand inference model ID for response before fine-tuning
    # modelId="cohere.command-light-text-v14",
    # Use ARN of your deployed custom model for response after fine-tuning
    # (provisioned_custom_model_arn is a placeholder for that ARN)
    modelId=provisioned_custom_model_arn,
    body=json.dumps(body),
)
Here’s the base model response before fine-tuning:

#Person2# helps John Sandals with his reservation. John gives his credit card information and #Person2# confirms that they take only MasterCard and VISA. John will be in room 507 and #Person2# will be his host if he needs anything.

Here’s the response after fine-tuning, shorter and more to the point:

John Sandals has a reservation and checks in at a hotel. #Person2# takes his credit card and gives him a key.
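To pull summaries like the ones above out of an InvokeModel result, you need to read and decode the response body. The sketch below assumes the Cohere Command response layout, where the generated text sits under `generations[0].text`; the helper name is hypothetical, and other model families use different body layouts.

```python
import io
import json


def extract_generated_text(response):
    """Return the first generation's text from an InvokeModel response dict."""
    # The "body" value is a stream, so it must be read before parsing.
    payload = json.loads(response["body"].read())
    return payload["generations"][0]["text"]


# Stand-in for a real response: any object with a .read() method works here.
fake_body = json.dumps({"generations": [{"text": " John checks in."}]})
fake_response = {"body": io.BytesIO(fake_body.encode("utf-8"))}
print(extract_generated_text(fake_response).strip())
```

With a real call, you would pass the `response` returned by `bedrock_runtime.invoke_model` instead of the stand-in.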

Continued pre-training for Amazon Titan Text (preview)
Continued pre-training on Amazon Bedrock is available today in public preview for Amazon Titan Text models, including Titan Text Express and Titan Text Lite. To create a continued pre-training job in the console, choose Customize model, then choose Create Continued Pre-training job.

Here’s a quick demo again using boto3. Let’s assume you work at an investment company and want to continue pre-training the model with financial and analyst reports to make it more knowledgeable about financial industry terminology. For demo purposes, I selected a collection of Amazon shareholder letters as my training data.

To prepare for continued pre-training, I converted the dataset into JSON Lines format again and uploaded it to Amazon S3. Because I’m working with unlabeled data, each JSON line only needs to have the input field. You can specify up to 100,000 training data records and usually see positive effects after providing at least 1 billion tokens.

{"input": "Dear shareholders: As I sit down to..."}
{"input": "Over the last several months, we to..."}
{"input": "work came from optimizing the conne..."}
{"input": "of the Amazon shopping experience f..."}

I redacted the input fields for brevity.
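For unlabeled text, the conversion can be as simple as splitting documents into pieces and wrapping each one in an input field. The sketch below is one possible approach: the function name and the character-based `max_chars` limit are assumptions (actual record limits are defined in tokens, so check the User Guide for your model), and only the field name and the 100,000-record limit come from the text.

```python
import json

MAX_TRAINING_RECORDS = 100_000  # record limit quoted in the text above


def to_pretraining_jsonl(text, max_chars=2000):
    """Split raw text on blank lines and emit one {"input": ...} line per piece."""
    records = []
    for paragraph in text.split("\n\n"):
        paragraph = " ".join(paragraph.split())  # collapse runs of whitespace
        # Break long paragraphs into pieces of at most max_chars characters.
        for start in range(0, len(paragraph), max_chars):
            records.append({"input": paragraph[start:start + max_chars]})
    if len(records) > MAX_TRAINING_RECORDS:
        raise ValueError(
            f"too many records: {len(records)} > {MAX_TRAINING_RECORDS}"
        )
    return "".join(json.dumps(record) + "\n" for record in records)
```

Empty paragraphs produce no records, so stray blank lines in the source text are harmless.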

Then, create a model customization job with customization type CONTINUED_PRE_TRAINING that points to the data. If needed, you can also adjust the hyperparameters for continued pre-training.

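# Select the foundation model you want to customize
base_model_id = "amazon.titan-text-express-v1"

# job_name, model_name, and role are placeholders that you define
bedrock.create_model_customization_job(
    customizationType="CONTINUED_PRE_TRAINING",
    jobName=job_name,
    customModelName=model_name,
    roleArn=role,
    baseModelIdentifier=base_model_id,
    hyperParameters={
        "epochCount": "10",
        "batchSize": "8",
        "learningRate": "0.00001",
    },
    trainingDataConfig={"s3Uri": "s3://path/to/train-continued-pretraining.jsonl"},
    outputDataConfig={"s3Uri": "s3://path/to/output"},
)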

Once the job is complete, you receive another unique model ID. Your customized model is securely stored again by Amazon Bedrock. As with fine-tuning, you need to purchase Provisioned Throughput to test and deploy your model.

Things to know
Here are a couple of important things to know:

Data privacy and network security – With Amazon Bedrock, you are in control of your data, and all your inputs and customizations remain private to your AWS account. Your data, such as prompts, completions, custom models, and data used for fine-tuning or continued pre-training, is not used for service improvement and is never shared with third-party model providers. Your data remains in the AWS Region where the API call is processed. All data is encrypted in transit and at rest. You can use AWS PrivateLink to create a private connection between your VPC and Amazon Bedrock.

Billing – Amazon Bedrock charges for model customization, storage, and inference. Model customization is charged per tokens processed. This is the number of tokens in the training dataset multiplied by the number of training epochs. An epoch is one full pass through the training data during customization. Model storage is charged per month, per model. Inference is charged hourly per model unit using provisioned throughput. For detailed pricing information, see Amazon Bedrock Pricing.

Custom models and provisioned throughput – Amazon Bedrock allows you to run inference on custom models by purchasing provisioned throughput. This guarantees a consistent level of throughput in exchange for a term commitment. You specify the number of model units needed to meet your application’s performance needs. For evaluating custom models initially, you can purchase provisioned throughput hourly with no long-term commitment. With no commitment, a quota of one model unit is available per provisioned throughput. You can create up to two provisioned throughputs per account.
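The billing rule for model customization (tokens in the training dataset multiplied by the number of epochs) is easy to estimate up front. In this sketch the function names are hypothetical and the price is a made-up placeholder, not an actual Amazon Bedrock rate; see Amazon Bedrock Pricing for real numbers.

```python
def billed_training_tokens(dataset_tokens, epochs):
    """Tokens charged for customization: dataset tokens times training epochs."""
    if dataset_tokens < 0 or epochs < 1:
        raise ValueError("dataset_tokens must be >= 0 and epochs >= 1")
    return dataset_tokens * epochs


def estimate_training_cost(dataset_tokens, epochs, price_per_1000_tokens):
    """Estimated customization charge for a given (placeholder) token price."""
    return billed_training_tokens(dataset_tokens, epochs) / 1000 * price_per_1000_tokens


# A 2,000,000-token dataset trained for 10 epochs is billed as 20,000,000 tokens.
print(billed_training_tokens(2_000_000, 10))
```

Storage and provisioned throughput charges are separate and are not covered by this estimate.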

Fine-tuning support on Meta Llama 2, Cohere Command Light, and Amazon Titan Text FMs is available today in AWS Regions US East (N. Virginia) and US West (Oregon). Continued pre-training is available today in public preview in AWS Regions US East (N. Virginia) and US West (Oregon). To learn more, visit the Amazon Bedrock Developer Experience web page and check out the User Guide.

Customize FMs with Amazon Bedrock today!

— Antje

Knowledge Bases now delivers fully managed RAG experience in Amazon Bedrock

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/knowledge-bases-now-delivers-fully-managed-rag-experience-in-amazon-bedrock/

Back in September, we introduced Knowledge Bases for Amazon Bedrock in preview. Starting today, Knowledge Bases for Amazon Bedrock is generally available.

With a knowledge base, you can securely connect foundation models (FMs) in Amazon Bedrock to your company data for Retrieval Augmented Generation (RAG). Access to additional data helps the model generate more relevant, context-specific, and accurate responses without continuously retraining the FM. All information retrieved from knowledge bases comes with source attribution to improve transparency and minimize hallucinations. If you’re curious how this works, check out my previous post that includes a primer on RAG.

With today’s launch, Knowledge Bases gives you a fully managed RAG experience and the easiest way to get started with RAG in Amazon Bedrock. Knowledge Bases now manages the initial vector store setup, handles the embedding and querying, and provides source attribution and short-term memory needed for production RAG applications. If needed, you can also customize the RAG workflows to meet specific use case requirements or integrate RAG with other generative artificial intelligence (AI) tools and applications.

Fully managed RAG experience
Knowledge Bases for Amazon Bedrock manages the end-to-end RAG workflow for you. You specify the location of your data, select an embedding model to convert the data into vector embeddings, and have Amazon Bedrock create a vector store in your account to store the vector data. When you select this option (available only in the console), Amazon Bedrock creates a vector index in Amazon OpenSearch Serverless in your account, removing the need to manage anything yourself.

Knowledge bases for Amazon Bedrock

Vector embeddings include the numeric representations of text data within your documents. Each embedding aims to capture the semantic or contextual meaning of the data. Amazon Bedrock takes care of creating, storing, managing, and updating your embeddings in the vector store, and it ensures your data is always in sync with your vector store.
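To illustrate how embeddings capture semantic closeness, here is a toy similarity search over a handful of vectors. The post does not say which similarity metric the managed vector store uses, so cosine similarity is an assumption, and the function names and three-element store are purely illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, store, k=2):
    """Return the k (score, text) pairs from store most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in store]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


# Tiny made-up "vector store": (chunk text, embedding) pairs.
store = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [0.9, 0.1])]
print([text for _, text in top_k([1.0, 0.0], store)])
```

Real embeddings have hundreds or thousands of dimensions, but the ranking idea is the same.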

Amazon Bedrock now also supports two new APIs for RAG that handle the embedding and querying and provide the source attribution and short-term memory needed for production RAG applications.

With the new RetrieveAndGenerate API, you can directly retrieve relevant information from your knowledge bases and have Amazon Bedrock generate a response from the results by specifying an FM in your API call. Let me show you how this works.

Use the RetrieveAndGenerate API
To give it a try, navigate to the Amazon Bedrock console, create and select a knowledge base, then select Test knowledge base. For this demo, I created a knowledge base that has access to a PDF of Generative AI on AWS. I choose Select Model to specify an FM.

Then, I ask, “What is Amazon Bedrock?”

Behind the scenes, Amazon Bedrock converts the queries into embeddings, queries the knowledge base, and then augments the FM prompt with the search results as context information and returns the FM-generated response to my question. For multi-turn conversations, Knowledge Bases manages the short-term memory of the conversation to provide more contextual results.
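The augmentation step can be pictured as plain string assembly: retrieved chunks and earlier turns are placed in front of the question. The prompt that Amazon Bedrock actually builds is not shown in this post, so the function name, wording, and layout below are hypothetical and only illustrate the idea.

```python
def build_augmented_prompt(question, chunks, history=()):
    """Combine retrieved chunks, prior turns, and the question into one prompt."""
    # Number the passages so a response could refer back to its sources.
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1)
    )
    parts = [
        "Answer the question using only the numbered context passages.",
        "Context:\n" + context,
    ]
    # Short-term memory: earlier (role, text) turns of the conversation.
    turns = "\n".join(f"{role}: {text}" for role, text in history)
    if turns:
        parts.append("Conversation so far:\n" + turns)
    parts.append("Question: " + question)
    return "\n\n".join(parts)
```

Numbering the passages is one simple way to make source attribution possible in the generated answer.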

Here’s a quick demo of how to use the APIs with the AWS SDK for Python (Boto3).

import boto3

bedrock_agent_runtime = boto3.client(service_name="bedrock-agent-runtime")

def retrieveAndGenerate(input, kbId):
    return bedrock_agent_runtime.retrieve_and_generate(
        input={'text': input},
        retrieveAndGenerateConfiguration={
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': kbId,
                'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-instant-v1'
            }
        }
    )

response = retrieveAndGenerate("What is Amazon Bedrock?", "AES9P3MT9T")["output"]["text"]

The output of the RetrieveAndGenerate API includes the generated response, the source attribution, and the retrieved text chunks. In my demo, the API response looks like this (with some of the output redacted for brevity):

{ ...
    'output': {'text': 'Amazon Bedrock is a managed service from AWS that ...'},
    'citations': [
        {'generatedResponsePart': {'textResponsePart':
             {'text': 'Amazon Bedrock is ...', 'span': {'start': 0, 'end': 241}}},
         'retrievedReferences': [
             {'content': {'text': 'All AWS-managed service API activity...'},
              'location': {'type': 'S3', 's3Location': {'uri': 's3://data-generative-ai-on-aws/gaia.pdf'}}},
             {'content': {'text': 'Changing a portion of the image using ...'},
              'location': {'type': 'S3', 's3Location': {'uri': 's3://data-generative-ai-on-aws/gaia.pdf'}}}, ...]},
        ...]
}

The generated response looks like this:

Amazon Bedrock is a managed service that offers a serverless experience for generative AI through a simple API. It provides access to foundation models from Amazon and third parties for tasks like text generation, image generation, and building conversational agents. Data processed through Amazon Bedrock remains private and encrypted.
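If you want to show users where an answer came from, you can walk the source attribution in the response. This sketch assumes the response nests each source under citations, then retrievedReferences, then location, s3Location, and uri; the helper name is hypothetical.

```python
def list_source_uris(response):
    """Collect the distinct S3 URIs cited in a response, in first-seen order."""
    uris = []
    for citation in response.get("citations", []):
        for reference in citation.get("retrievedReferences", []):
            uri = reference.get("location", {}).get("s3Location", {}).get("uri")
            # Skip references without an S3 location and avoid duplicates.
            if uri and uri not in uris:
                uris.append(uri)
    return uris
```

In the demo above, every retrieved chunk comes from the same PDF, so the helper would return a single URI.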

Customize RAG workflows
If you want to process the retrieved text chunks further, see the relevance scores of the retrievals, or develop your own orchestration for text generation, you can use the new Retrieve API. This API converts user queries into embeddings, searches the knowledge base, and returns the relevant results, giving you more control to build custom workflows on top of the semantic search results.

Use the Retrieve API
In the Amazon Bedrock console, I toggle the switch to disable Generate responses.

Then, I ask again, “What is Amazon Bedrock?” This time, the output shows me the retrieval results with links to the source documents where the text chunks came from.

Here’s how to use the Retrieve API with boto3.

import boto3

# Runtime client for querying knowledge bases
bedrock_agent_runtime = boto3.client(
    service_name = "bedrock-agent-runtime"
)

def retrieve(query, kbId, numberOfResults=5):
    return bedrock_agent_runtime.retrieve(
        retrievalQuery= {
            'text': query
        },
        knowledgeBaseId=kbId,
        retrievalConfiguration= {
            'vectorSearchConfiguration': {
                'numberOfResults': numberOfResults
            }
        }
    )

response = retrieve("What is Amazon Bedrock?", "AES9P3MT9T")["retrievalResults"]

The output of the Retrieve API includes the retrieved text chunks, the location type and URI of the source data, and the scores of the retrievals. The score helps to determine chunks that match more closely with the query.

In my demo, the API response looks like this (with some of the output redacted for brevity):

[{'content': {'text': 'Changing a portion of the image using ...'},
  'location': {'type': 'S3',
   's3Location': {'uri': 's3://data-generative-ai-on-aws/gaia.pdf'}},
  'score': 0.7329834},
 {'content': {'text': 'back to the user in natural language. For ...'},
  'location': {'type': 'S3',
   's3Location': {'uri': 's3://data-generative-ai-on-aws/gaia.pdf'}},
  'score': 0.7331088},
 ...]
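As one way to build your own orchestration on top of these results, you can rank the chunks by score, drop weak matches, and assemble a prompt for the model of your choice. This is a minimal sketch: the function names, the score threshold, and the prompt wording are assumptions for illustration, and the sample list mirrors the redacted output from my demo.

```python
def build_context(results, min_score=0.0, max_chunks=3):
    """Join the highest-scoring retrieved chunks into one context string."""
    ranked = sorted(results, key=lambda r: r["score"], reverse=True)
    kept = [r for r in ranked if r["score"] >= min_score][:max_chunks]
    return "\n\n".join(r["content"]["text"] for r in kept)


def build_prompt(question, results, min_score=0.0):
    """Assemble a simple RAG prompt (the wording is illustrative only)."""
    context = build_context(results, min_score)
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical sample in the shape of the retrievalResults list.
sample_results = [
    {"content": {"text": "Changing a portion of the image using ..."},
     "location": {"type": "S3", "s3Location": {
         "uri": "s3://data-generative-ai-on-aws/gaia.pdf"}},
     "score": 0.7329834},
    {"content": {"text": "back to the user in natural language. For ..."},
     "location": {"type": "S3", "s3Location": {
         "uri": "s3://data-generative-ai-on-aws/gaia.pdf"}},
     "score": 0.7331088},
]

print(build_prompt("What is Amazon Bedrock?", sample_results))
```

Sorting explicitly avoids relying on the order in which results are returned, and the threshold gives you a single knob to tune against your own data.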

To further customize your RAG workflows, you can define a custom chunking strategy and select a custom vector store.

Custom chunking strategy – To enable effective retrieval from your data, a common practice is to first split the documents into manageable chunks. Smaller chunks are easier for the model to process, which leads to more relevant retrievals and more coherent generated responses. Knowledge Bases for Amazon Bedrock manages the chunking of your documents.

When you configure the data source for your knowledge base, you can now define a chunking strategy. Default chunking splits data into chunks of up to 200 tokens and is optimized for question-answer tasks. Use default chunking when you are not sure of the optimal chunk size for your data.

You also have the option to specify a custom chunk size and overlap with fixed-size chunking. Use fixed-size chunking if you know the optimal chunk size and overlap for your data (based on file attributes, accuracy testing, and so on). An overlap between chunks in the recommended range of 0–20 percent can help improve accuracy. Higher overlap can lead to decreased relevancy scores.
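To make the effect of chunk size and overlap concrete, here is a minimal local sketch of fixed-size chunking. It is not how Knowledge Bases implements chunking internally: the function name is hypothetical, and whitespace splitting stands in for a real tokenizer.

```python
def fixed_size_chunks(tokens, max_tokens, overlap_percent):
    """Split a token list into chunks of up to max_tokens, where consecutive
    chunks share roughly overlap_percent of a chunk."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap_percent < 100:
        raise ValueError("overlap_percent must be in the range [0, 100)")
    overlap = max_tokens * overlap_percent // 100
    step = max_tokens - overlap  # always >= 1 given the checks above
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # the last chunk already reaches the end of the input
    return chunks


# Assumption: whitespace-separated words approximate tokens for illustration.
text = "Amazon Bedrock is a managed service that offers a serverless experience"
for chunk in fixed_size_chunks(text.split(), max_tokens=5, overlap_percent=20):
    print(" ".join(chunk))
```

With a 20 percent overlap on five-token chunks, each chunk repeats the last token of the previous one, so a sentence that straddles a boundary still appears partly in both chunks.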

If you choose to create one embedding per document, Knowledge Bases keeps each file as a single chunk. Use this option if you don’t want Amazon Bedrock to chunk your data, for example, if you want to chunk your data offline using an algorithm that is specific to your use case. A common use case is code documentation.

Custom vector store – You can also select a custom vector store. The available vector database options include vector engine for Amazon OpenSearch Serverless, Pinecone, and Redis Enterprise Cloud. To use a custom vector store, you must create a new, empty vector database from the list of supported options and provide the vector database index name as well as index field and metadata field mappings. This vector database will need to be for exclusive use with Amazon Bedrock.

Integrate RAG with other generative AI tools and applications
If you want to build an AI assistant that can perform multistep tasks and access company data sources to generate more relevant and context-aware responses, you can integrate Knowledge Bases with Agents for Amazon Bedrock. You can also use the Knowledge Bases retrieval plugin for LangChain to integrate RAG workflows into your generative AI applications.

Knowledge Bases for Amazon Bedrock is available today in the US East (N. Virginia) and US West (Oregon) AWS Regions.

— Antje

Amazon Transcribe Call Analytics adds new generative AI-powered call summaries (preview)

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/amazon-transcribe-call-analytics-adds-new-generative-ai-powered-call-summaries-preview/

We are announcing generative artificial intelligence (AI)-powered call summarization in Amazon Transcribe Call Analytics in preview. Powered by Amazon Bedrock, this feature helps businesses improve the customer experience and boost agent and supervisor productivity by automatically summarizing customer service calls. Amazon Transcribe Call Analytics provides machine learning (ML)-powered analytics that allow contact centers to understand the sentiment, trends, and policy compliance of customer conversations so they can improve the customer experience and identify crucial feedback. A single API call is all it takes to extract transcripts, rich insights, and summaries from your customer conversations.

We understand that as a business, you want to maintain an accurate historical record of key conversation points, including action items associated with each conversation. To do this, agents summarize notes after the conversation has ended and enter these in their CRM system, a process that is time-consuming and subject to human error. Now imagine the customer trust erosion that follows when the agent fails to correctly capture and act upon important action items discussed during conversations.

How it works
Starting today, to assist agents and supervisors with the summarization of customer conversations, Amazon Transcribe Call Analytics will generate a concise summary of a contact center interaction that captures key components such as why the customer called, how the issue was addressed, and what follow-up actions were identified. After completing a customer interaction, agents can directly proceed to help the next customer since they don’t have to summarize a conversation, resulting in reduced customer wait times and improved agent productivity. Further, supervisors can review the summary when investigating a customer issue to get a gist of the conversation, without having to listen to the entire call recording or read the transcript.

Exploring Amazon Transcribe Call Analytics in the console
To see how this works visually, I first create an Amazon Simple Storage Service (Amazon S3) bucket in the relevant AWS Region. I then upload the audio file to the S3 bucket.

Audio file in S3 bucket

To create an analytics job that transcribes the audio and provides additional analytics about the conversation between the customer and the agent, I go to the Amazon Transcribe Call Analytics console. I select Post-call Analytics in the left-hand navigation bar and then choose Create job.

Create Post-call analytics job

Next, I enter a job name, making sure that the language settings match the language in the audio file.

Job settings

In the Amazon S3 URI path, I provide the link to the audio file uploaded in the first screenshot shown in this post.

Audio file details

In Role name, I select Create an IAM role, which will have access to the Amazon S3 bucket, and then choose Next.

Create IAM Role

I enable Generative call summarization, and then choose Create job.

Configure job

After a few minutes, the job’s status will change from In progress to Complete, indicating that it was completed successfully.

Job status

Select the job, and the next screen will show the transcript and a new tab, Generative call summarization – preview.

You can also download the transcript to view the analytics and summary.
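The console steps above can also be approximated with the AWS SDK for Python (Boto3). The sketch below is an assumption-laden illustration, not the author's code: the job name, bucket URI, role ARN, and channel assignment are hypothetical placeholders, and language settings are omitted. It separates building the StartCallAnalyticsJob request from sending it so that the request can be inspected without AWS credentials.

```python
def build_call_analytics_request(job_name, media_uri, role_arn, agent_channel=0):
    """Build StartCallAnalyticsJob arguments with generative summarization enabled."""
    if agent_channel not in (0, 1):
        raise ValueError("agent_channel must be 0 or 1")
    customer_channel = 1 - agent_channel
    return {
        "CallAnalyticsJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "DataAccessRoleArn": role_arn,
        # Assumption: a two-channel recording with one speaker per channel.
        "ChannelDefinitions": [
            {"ChannelId": agent_channel, "ParticipantRole": "AGENT"},
            {"ChannelId": customer_channel, "ParticipantRole": "CUSTOMER"},
        ],
        # Requests the generative call summary in addition to the transcript.
        "Settings": {"Summarization": {"GenerateAbstractiveSummary": True}},
    }


def start_job(request, region="us-east-1"):
    """Send the request; needs boto3, AWS credentials, and a supported Region."""
    import boto3  # imported lazily so the request builder works without boto3
    client = boto3.client("transcribe", region_name=region)
    return client.start_call_analytics_job(**request)


# Hypothetical placeholders; replace them with your own resources.
request = build_call_analytics_request(
    job_name="demo-call-summary",
    media_uri="s3://amzn-s3-demo-bucket/call.wav",
    role_arn="arn:aws:iam::111122223333:role/TranscribeCallAnalyticsRole",
)
print(request["Settings"])
# start_job(request)  # uncomment to submit the job
```

Once the job completes, get_call_analytics_job returns its status and the location of the output file.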

Now available
Generative call summarization in Amazon Transcribe Call Analytics is available today in English in US East (N. Virginia) and US West (Oregon).

With generative call summarization in Amazon Transcribe Call Analytics, you pay as you go and are billed monthly based on tiered pricing. For more information, see Amazon Transcribe pricing.


Build generative AI apps using AWS Step Functions and Amazon Bedrock

Post Syndicated from Marcia Villalba original https://aws.amazon.com/blogs/aws/build-generative-ai-apps-using-aws-step-functions-and-amazon-bedrock/

Today we are announcing two new optimized integrations for AWS Step Functions with Amazon Bedrock. Step Functions is a visual workflow service that helps developers build distributed applications, automate processes, orchestrate microservices, and create data and machine learning (ML) pipelines.

In September, we made available Amazon Bedrock, the easiest way to build and scale generative artificial intelligence (AI) applications with foundation models (FMs). Bedrock offers a choice of foundation models from leading providers like AI21 Labs, Anthropic, Cohere, Stability AI, and Amazon, along with a broad set of capabilities that customers need to build generative AI applications, while maintaining privacy and security. You can use Amazon Bedrock from the AWS Management Console, AWS Command Line Interface (AWS CLI), or AWS SDKs.

The new Step Functions optimized integrations with Amazon Bedrock allow you to orchestrate tasks to build generative AI applications using Amazon Bedrock, as well as to integrate with over 220 AWS services. With Step Functions, you can visually develop, inspect, and audit your workflows. Previously, you needed to invoke an AWS Lambda function to use Amazon Bedrock from your workflows, which added more code to maintain and increased the cost of your applications.

Step Functions provides two new optimized API actions for Amazon Bedrock:

  • InvokeModel – This integration allows you to invoke a model and run the inferences with the input provided in the parameters. Use this API action to run inferences for text, image, and embedding models.
  • CreateModelCustomizationJob – This integration creates a fine-tuning job to customize a base model. In the parameters, you specify the foundation model and the location of the training data. When the job is completed, your custom model is ready to be used. This is an asynchronous API, and this integration allows Step Functions to run a job and wait for it to complete before proceeding to the next state. This means that the state machine execution will pause while the create model customization job is running and will resume automatically when the task is complete.

Optimized connectors

The InvokeModel API action accepts requests and responses that are up to 25 MB. However, Step Functions has a 256 KB limit on state payload input and output. To support larger payloads with this integration, you can define an Amazon Simple Storage Service (Amazon S3) bucket that the InvokeModel API reads data from and writes the result to. You provide these settings in the parameters section of the API action configuration.
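As a rough illustration of the S3-backed configuration, the sketch below builds an Amazon States Language task state for the InvokeModel integration as a Python dictionary. The helper name, bucket URIs, and state name are hypothetical. The resource ARN and the Input and Output S3Uri parameters reflect my understanding of the optimized integration, so verify them against the Step Functions documentation before relying on them.

```python
import json


def invoke_model_state(model_id, input_s3_uri, output_s3_uri, next_state=None):
    """Build a Task state that reads the request from S3 and writes the response to S3."""
    state = {
        "Type": "Task",
        "Resource": "arn:aws:states:::bedrock:invokeModel",
        "Parameters": {
            "ModelId": model_id,
            # Large payloads travel through S3 instead of the state input and output.
            "Input": {"S3Uri": input_s3_uri},
            "Output": {"S3Uri": output_s3_uri},
            "ContentType": "application/json",
            "Accept": "application/json",
        },
    }
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


# Hypothetical bucket and object names.
definition = {
    "StartAt": "InvokeModel",
    "States": {
        "InvokeModel": invoke_model_state(
            "anthropic.claude-instant-v1",
            "s3://amzn-s3-demo-bucket/requests/prompt.json",
            "s3://amzn-s3-demo-bucket/responses/result.json",
        )
    },
}
print(json.dumps(definition, indent=2))
```

Generating the definition in code keeps the bucket locations in one place, which helps when several states share the same input and output prefixes.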

How to get started with Amazon Bedrock and AWS Step Functions
Before getting started, ensure that you create the state machine in a Region where Amazon Bedrock is available. For this example, use US East (N. Virginia), us-east-1.

From the AWS Management Console, create a new state machine. Search for “bedrock,” and the two available API actions will appear. Drag the InvokeModel action onto the state machine.

Using the invoke model connector

You can now configure that state in the menu on the right. First, you can define which foundation model you want to use. Pick a model from the list, or get the model dynamically from the input.

Then you need to configure the model parameters. You can enter the inference parameters in the text box or load the parameters from Amazon S3.

Configuration for the API Action

If you keep scrolling in the API action configuration, you can specify additional configuration options for the API, such as the S3 destination bucket. When this field is specified, the API action stores the API response in the specified bucket instead of returning it to the state output. Here, you can also specify the content type for the requests and responses.

Additional configuration for the connector

When you finish configuring your state machine, you can create and run it. When the state machine runs, you can visualize the execution details, select the Amazon Bedrock state, and check its inputs and outputs.

Executing the state machine

Using Step Functions, you can build state machines as extensively as you need, combining different services to solve many problems. For example, you can use Step Functions with Amazon Bedrock to create applications using prompt chaining. This is a technique for building complex generative AI applications by passing multiple smaller and simpler prompts to the FM instead of a very long and detailed prompt. To build a prompt chain, you can create a state machine that calls Amazon Bedrock multiple times to get an inference for each of the smaller prompts. You can use the parallel state to run all these tasks in parallel and then use an AWS Lambda function that unifies the responses of the parallel tasks into one response and generates a result.
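The Parallel-state pattern described above could be sketched as the following state machine definition, generated in Python. Treat it as an outline under stated assumptions: the model ID, the request body format (the Anthropic text-completion style with prompt and max_tokens_to_sample), the state names, and the Lambda function name are all illustrative placeholders.

```python
import json

MODEL_ID = "anthropic.claude-instant-v1"  # assumption: any text model available to you


def prompt_branch(index, prompt):
    """One Parallel branch that sends a single small prompt to Amazon Bedrock."""
    name = f"Prompt{index}"
    return {
        "StartAt": name,
        "States": {
            name: {
                "Type": "Task",
                "Resource": "arn:aws:states:::bedrock:invokeModel",
                "Parameters": {
                    "ModelId": MODEL_ID,
                    # Assumed request body for the chosen model family.
                    "Body": {
                        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
                        "max_tokens_to_sample": 200,
                    },
                },
                "End": True,
            }
        },
    }


def build_prompt_chain(prompts, merge_function_name):
    """Run all prompts in parallel, then merge the responses with a Lambda function."""
    return {
        "StartAt": "RunPrompts",
        "States": {
            "RunPrompts": {
                "Type": "Parallel",
                "Branches": [prompt_branch(i, p) for i, p in enumerate(prompts)],
                "Next": "MergeResponses",
            },
            "MergeResponses": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                # The Parallel state outputs an array with one entry per branch.
                "Parameters": {"FunctionName": merge_function_name, "Payload.$": "$"},
                "End": True,
            },
        },
    }


chain = build_prompt_chain(
    ["Summarize the product.", "List three target audiences."],
    "merge-responses",  # hypothetical Lambda function name
)
print(json.dumps(chain, indent=2))
```

Each branch is a self-contained sub-workflow, so you can add retries or extra steps to one prompt without affecting the others.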

Available now
AWS Step Functions optimized integrations for Amazon Bedrock are limited to the AWS Regions where Amazon Bedrock is available.

You can get started with Step Functions and Amazon Bedrock by trying out a sample project from the Step Functions console.