Tag Archives: launch

Meta’s Llama 3 models are now available in Amazon Bedrock

2024-04-23 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/metas-llama-3-models-are-now-available-in-amazon-bedrock/

Today, we are announcing the general availability of Meta’s Llama 3 models in Amazon Bedrock. Meta Llama 3 is designed for you to build, experiment, and responsibly scale your generative artificial intelligence (AI) applications. New Llama 3 models are the most capable to support a broad range of use cases with improvements in reasoning, code generation, and instruction.

According to Meta’s Llama 3 announcement, the Llama 3 model family is a collection of pre-trained and instruction-tuned large language models (LLMs) in 8B and 70B parameter sizes. These models have been trained on over 15 trillion tokens of data—a training dataset seven times larger than that used for Llama 2 models, including four times more code, which supports an 8K context length that doubles the capacity of Llama 2.

You can now use two new Llama 3 models in Amazon Bedrock, further increasing model choice within Amazon Bedrock. These models provide the ability for you to easily experiment with and evaluate even more top foundation models (FMs) for your use case:

Llama 3 8B is ideal for limited computational power and resources, and edge devices. The model excels at text summarization, text classification, sentiment analysis, and language translation.
Llama 3 70B is ideal for content creation, conversational AI, language understanding, research development, and enterprise applications. The model excels at text summarization and accuracy, text classification and nuance, sentiment analysis and nuance reasoning, language modeling, dialogue systems, code generation, and following instructions.

Meta is also currently training additional Llama 3 models over 400B parameters in size. These 400B models will have new capabilities, including multimodality, multiple languages support, and a much longer context window. When released, these models will be ideal for content creation, conversational AI, language understanding, research and development (R&D), and enterprise applications.

Llama 3 models in action
If you are new to using Meta models, go to the Amazon Bedrock console and choose Model access on the bottom left pane. To access the latest Llama 3 models from Meta, request access separately for Llama 3 8B Instruct or Llama 3 70B Instruct.

To test the Meta Llama 3 models in the Amazon Bedrock console, choose Text or Chat under Playgrounds in the left menu pane. Then choose Select model and select Meta as the category and Llama 8B Instruct or Llama 3 70B Instruct as the model.

By choosing View API request, you can also access the model using code examples in the AWS Command Line Interface (AWS CLI) and AWS SDKs. You can use model IDs such as meta.llama3-8b-instruct-v1 or meta.llama3-70b-instruct-v1.

Here is a sample of the AWS CLI command:

$ aws bedrock-runtime invoke-model \
  --model-id meta.llama3-8b-instruct-v1:0 \
  --body "{\"prompt\":\"Simply put, the theory of relativity states that\\n the laws of physics are the same everywhere in the universe, and that the passage of time and the length of objects can vary depending on their speed and position in a gravitational field \",\"max_gen_len\":512,\"temperature\":0.5,\"top_p\":0.9}" \
  --cli-binary-format raw-in-base64-out \
  --region us-east-1 \
  invoke-model-output.txt

You can use code examples for Amazon Bedrock using AWS SDKs to build your applications with various programming languages. The following Python code examples show how to invoke the Llama 3 Chat model in Amazon Bedrock for text generation.

def invoke_llama3(self, prompt):
        try:
            body = {
                "prompt": prompt,
                "temperature": 0.5,
                "top_p": 0.9,
                "max_gen_len": 512,
            }

            response = self.bedrock_runtime_client.invoke_model(
                modelId="meta.llama3-8b-instruct-v1:0", body=json.dumps(body)
            )

            response_body = json.loads(response["body"].read())
            completion = response_body["generation"]

            return completion

        except ClientError:
            logger.error("Couldn't invoke Llama 3")
            raise

You can use these Llama 3 models for a variety of tasks, such as question answering, language translation, sentiment analysis, and more. Here is an example of Llama 3 models with few-shot in-context learning where we provide training samples available to the model.

Input: Translate English to French:
sea otter => loutre de mer
peppermint => menthe poivrée
plush girafe => girafe peluche
cheese =>
Output:  fromage
panda => panda
penguin => pingouin
polar bear => ours polaire
puffin => macareux
sloth => paresseux
walrus => morse
whale => baleine
zebra => z\u00e8bre

Translate English to German:
sea otter => Seeotter

You can also use Llama 3 instruct models optimized for dialogue use cases. The input to the instruct model endpoints is the previous history between the chat assistant and the user. So, you can ask questions contextual to the conversation that has happened so far and provide the system configuration, such as personas, which define the chat assistant’s behavior in the following format:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>
[SYSTEM_TEXT]<|eot_id|><|start_header_id|>user<|end_header_id|>
[USER_TEXT]<|eot_id|><|start_header_id|>assistant<|end_header_id|>

In the following example, the user has a conversation with the assistant about tourist sites in Paris. Then, the user inquires about the first option recommended by the chat assistant.

Input: <|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n
I am going to Paris, what should I see?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n
Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:\n\n
1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.\n
2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.\n
3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.\n\n
These are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n
What is so great about #1?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n
Output: The Eiffel Tower! It's an iconic symbol of Paris and one of the most recognizable landmarks in the world. Here are some reasons why it's so great:\n\n
1. **Engineering marvel**: When it was built for the 1889 World's Fair, the Eiffel Tower was a groundbreaking feat of engineering. It was the tallest structure in the world at the time, standing at 324 meters (1,063 feet).\n
2. **Panoramic views**: The Eiffel Tower offers breathtaking 360-degree views of the city from its observation decks. You can see many of Paris' famous landmarks, such as the Seine River, the Louvre, and Montmartre.\n
3. **Romantic atmosphere**: The Eiffel Tower is often associated with romance and is a popular spot for proposals, weddings, and romantic dinners. The tower's iron latticework and twinkling lights at night add to its romantic ambiance.\n
4. **Historical significance**: The Eiffel Tower has played a significant role in French history, serving as a communication tower during World War I and a symbol of French culture and engineering.\n
5. **Accessibility**: The Eiffel Tower is easily accessible by public transportation or on foot, and there are various ways to experience it, including taking the stairs, elevator, or a guided tour.\n
6. **Nighttime spectacle**: The Eiffel Tower is even more magical at night, when it's illuminated with thousands of twinkling lights. The light show, known as the \"Eiffel Tower Sparkle,\" takes place every hour on the hour from dusk to midnight.\n
7. **Iconic status**: The Eiffel Tower has been featured in countless films, literature, and art, making it an instantly recognizable symbol of Paris and France.\n\n
Overall, the Eiffel Tower is a must-visit attraction in Paris, offering a unique combination of history, engineering, romance, and stunning views.

To learn more about the new prompt template and special tokens of Llama 3, check out Meta’s model cards and prompt formats or Llama Recipes in the GitHub repository.

Now available
Meta’s Llama 3 models are available today in Amazon Bedrock in the US East (N. Virginia) and US West (Oregon) Regions. Check the full Region list for future updates. To learn more, check out the Llama in Amazon Bedrock product page and pricing page.

Give Llama 3 a try in the Amazon Bedrock console today, and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

Visit our community.aws site to find deep-dive technical content and to discover how our Builder communities are using Amazon Bedrock in their solutions.

— Channy

Guardrails for Amazon Bedrock now available with new safety filters and privacy controls

2024-04-23 Esra Kayabali

Post Syndicated from Esra Kayabali original https://aws.amazon.com/blogs/aws/guardrails-for-amazon-bedrock-now-available-with-new-safety-filters-and-privacy-controls/

Today, I am happy to announce the general availability of Guardrails for Amazon Bedrock, first released in preview at re:Invent 2023. With Guardrails for Amazon Bedrock, you can implement safeguards in your generative artificial intelligence (generative AI) applications that are customized to your use cases and responsible AI policies. You can create multiple guardrails tailored to diﬀerent use cases and apply them across multiple foundation models (FMs), improving end-user experiences and standardizing safety controls across generative AI applications. You can use Guardrails for Amazon Bedrock with all large language models (LLMs) in Amazon Bedrock, including fine-tuned models.

Guardrails for Bedrock offers industry-leading safety protection on top of the native capabilities of FMs, helping customers block as much as 85% more harmful content than protection natively provided by some foundation models on Amazon Bedrock today. Guardrails for Amazon Bedrock is the only responsible AI capability offered by top cloud providers that enables customers to build and customize safety and privacy protections for their generative AI applications in a single solution, and it works with all large language models (LLMs) in Amazon Bedrock, as well as fine-tuned models.

Aha! is a software company that helps more than 1 million people bring their product strategy to life. “Our customers depend on us every day to set goals, collect customer feedback, and create visual roadmaps,” said Dr. Chris Waters, co-founder and Chief Technology Officer at Aha!. “That is why we use Amazon Bedrock to power many of our generative AI capabilities. Amazon Bedrock provides responsible AI features, which enable us to have full control over our information through its data protection and privacy policies, and block harmful content through Guardrails for Bedrock. We just built on it to help product managers discover insights by analyzing feedback submitted by their customers. This is just the beginning. We will continue to build on advanced AWS technology to help product development teams everywhere prioritize what to build next with confidence.”

In the preview post, Antje showed you how to use guardrails to configure thresholds to filter content across harmful categories and define a set of topics that need to be avoided in the context of your application. The Content filters feature now has two additional safety categories: Misconduct for detecting criminal activities and Prompt Attack for detecting prompt injection and jailbreak attempts. We also added important new features, including sensitive information filters to detect and redact personally identifiable information (PII) and word filters to block inputs containing profane and custom words (for example, harmful words, competitor names, and products).

Guardrails for Amazon Bedrock sits in between the application and the model. Guardrails automatically evaluates everything going into the model from the application and coming out of the model to the application to detect and help prevent content that falls into restricted categories.

You can recap the steps in the preview release blog to learn how to configure Denied topics and Content filters. Let me show you how the new features work.

New features
To start using Guardrails for Amazon Bedrock, I go to the AWS Management Console for Amazon Bedrock, where I can create guardrails and configure the new capabilities. In the navigation pane in the Amazon Bedrock console, I choose Guardrails, and then I choose Create guardrail.

I enter the guardrail Name and Description. I choose Next to move to the Add sensitive information filters step.

I use Sensitive information filters to detect sensitive and private information in user inputs and FM outputs. Based on the use cases, I can select a set of entities to be either blocked in inputs (for example, a FAQ-based chatbot that doesn’t require user-specific information) or redacted in outputs (for example, conversation summarization based on chat transcripts). The sensitive information filter supports a set of predefined PII types. I can also define custom regex-based entities specific to my use case and needs.

I add two PII types (Name, Email) from the list and add a regular expression pattern using Booking ID as Name and [0-9a-fA-F]{8} as the Regex pattern.

I choose Next and enter custom messages that will be displayed if my guardrail blocks the input or the model response in the Define blocked messaging step. I review the configuration at the last step and choose Create guardrail.

I navigate to the Guardrails Overview page and choose the Anthropic Claude Instant 1.2 model using the Test section. I enter the following call center transcript in the Prompt field and choose Run.

Please summarize the below call center transcript. Put the name, email and the booking ID to the top: Agent: Welcome to ABC company. How can I help you today? Customer: I want to cancel my hotel booking. Agent: Sure, I can help you with the cancellation. Can you please provide your booking ID? Customer: Yes, my booking ID is 550e8408. Agent: Thank you. Can I have your name and email for confirmation? Customer: My name is Jane Doe and my email is [email protected] Agent: Thank you for confirming. I will go ahead and cancel your reservation.

Guardrail action shows there are three instances where the guardrails came in to effect. I use View trace to check the details. I notice that the guardrail detected the Name, Email and Booking ID and masked them in the final response.

I use Word filters to block inputs containing profane and custom words (for example, competitor names or offensive words). I check the Filter profanity box. The profanity list of words is based on the global definition of profanity. Additionally, I can specify up to 10,000 phrases (with a maximum of three words per phrase) to be blocked by the guardrail. A blocked message will show if my input or model response contain these words or phrases.

Now, I choose Custom words and phrases under Word filters and choose Edit. I use Add words and phrases manually to add a custom word CompetitorY. Alternatively, I can use Upload from a local file or Upload from S3 object if I need to upload a list of phrases. I choose Save and exit to return to my guardrail page.

I enter a prompt containing information about a fictional company and its competitor and add the question What are the extra features offered by CompetitorY?. I choose Run.

I use View trace to check the details. I notice that the guardrail intervened according to the policies I configured.

Now available
Guardrails for Amazon Bedrock is now available in US East (N. Virginia) and US West (Oregon) Regions.

For pricing information, visit the Amazon Bedrock pricing page.

To get started with this feature, visit the Guardrails for Amazon Bedrock web page.

For deep-dive technical content and to learn how our Builder communities are using Amazon Bedrock in their solutions, visit our community.aws website.

— Esra

Agents for Amazon Bedrock: Introducing a simplified creation and configuration experience

2024-04-23 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/agents-for-amazon-bedrock-introducing-a-simplified-creation-and-configuration-experience/

With Agents for Amazon Bedrock, applications can use generative artificial intelligence (generative AI) to run tasks across multiple systems and data sources. Starting today, these new capabilities streamline the creation and management of agents:

Quick agent creation – You can now quickly create an agent and optionally add instructions and action groups later, providing flexibility and agility for your development process.

Agent builder – All agent configurations can be operated in the new agent builder section of the console.

Simplified configuration – Action groups can use a simplified schema that just lists functions and parameters without having to provide an API schema.

Return of control –You can skip using an AWS Lambda function and return control to the application invoking the agent. In this way, the application can directly integrate with systems outside AWS or call internal endpoints hosted in any Amazon Virtual Private Cloud (Amazon VPC) without the need to integrate the required networking and security configurations with a Lambda function.

Infrastructure as code – You can use AWS CloudFormation to deploy and manage agents with the new simplified configuration, ensuring consistency and reproducibility across environments for your generative AI applications.

Let’s see how these enhancements work in practice.

Creating an agent using the new simplified console
To test the new experience, I want to build an agent that can help me reply to an email containing customer feedback. I can use generative AI, but a single invocation of a foundation model (FM) is not enough because I need to interact with other systems. To do that, I use an agent.

In the Amazon Bedrock console, I choose Agents from the navigation pane and then Create Agent. I enter a name for the agent (customer-feedback) and a description. Using the new interface, I proceed and create the agent without providing additional information at this stage.

I am now presented with the Agent builder, the place where I can access and edit the overall configuration of an agent. In the Agent resource role, I leave the default setting as Create and use a new service role so that the AWS Identity and Access Management (IAM) role assumed by the agent is automatically created for me. For the model, I select Anthropic and Claude 3 Sonnet.

In Instructions for the Agent, I provide clear and specific instructions for the task the agent has to perform. Here, I can also specify the style and tone I want the agent to use when replying. For my use case, I enter:

Help reply to customer feedback emails with a solution tailored to the customer account settings.

In Additional settings, I select Enabled for User input so that the agent can ask for additional details when it does not have enough information to respond. Then, I choose Save to update the configuration of the agent.

I now choose Add in the Action groups section. Action groups are the way agents can interact with external systems to gather more information or perform actions. I enter a name (retrieve-customer-settings) and a description for the action group:

Retrieve customer settings including customer ID.

The description is optional but, when provided, is passed to the model to help choose when to use this action group.

In Action group type, I select Define with function details so that I only need to specify functions and their parameters. The other option here (Define with API schemas) corresponds to the previous way of configuring action groups using an API schema.

Action group functions can be associated to a Lambda function call or configured to return control to the user or application invoking the agent so that they can provide a response to the function. The option to return control is useful for four main use cases:

When it’s easier to call an API from an existing application (for example, the one invoking the agent) than building a new Lambda function with the correct authentication and network configurations as required by the API
When the duration of the task goes beyond the maximum Lambda function timeout of 15 minutes so that I can handle the task with an application running in containers or virtual servers or use a workflow orchestration such as AWS Step Functions
When I have time-consuming actions because, with the return of control, the agent doesn’t wait for the action to complete before proceeding to the next step, and the invoking application can run actions asynchronously in the background while the orchestration flow of the agent continues
When I need a quick way to mock the interaction with an API during the development and testing and of an agent

In Action group invocation, I can specify the Lambda function that will be invoked when this action group is identified by the model during orchestration. I can ask the console to quickly create a new Lambda function, to select an existing Lambda function, or return control so that the user or application invoking the agent will ask for details to generate a response. I select Return Control to show how that works in the console.

I configure the first function of the action group. I enter a name (retrieve-customer-settings-from-crm) and the following description for the function:

Retrieve customer settings from CRM including customer ID using the customer email in the sender/from fields of the email.

In Parameters, I add email with Customer email as the description. This is a parameter of type String and is required by this function. I choose Add to complete the creation of the action group.

Because, for my use case, I expect many customers to have issues when logging in, I add another action group (named check-login-status) with the following description:

Check customer login status.

This time, I select the option to create a new Lambda function so that I can handle these requests in code.

For this action group, I configure a function (named check-customer-login-status-in-login-system) with the following description:

Check customer login status in login system using the customer ID from settings.

In Parameters, I add customer_id, another required parameter of type String. Then, I choose Add to complete the creation of the second action group.

When I open the configuration of this action group, I see the name of the Lambda function that has been created in my account. There, I choose View to open the Lambda function in the console.

In the Lambda console, I edit the starting code that has been provided and implement my business case:

import json

def lambda_handler(event, context):
    print(event)
    
    agent = event['agent']
    actionGroup = event['actionGroup']
    function = event['function']
    parameters = event.get('parameters', [])

    # Execute your business logic here. For more information,
    # refer to: https://docs.aws.amazon.com/bedrock/latest/userguide/agents-lambda.html
    if actionGroup == 'check-login-status' and function == 'check-customer-login-status-in-login-system':
        response = {
            "status": "unknown"
        }
        for p in parameters:
            if p['name'] == 'customer_id' and p['type'] == 'string' and p['value'] == '12345':
                response = {
                    "status": "not verified",
                    "reason": "the email address has not been verified",
                    "solution": "please verify your email address"
                }
    else:
        response = {
            "error": "Unknown action group {} or function {}".format(actionGroup, function)
        }
    
    responseBody =  {
        "TEXT": {
            "body": json.dumps(response)
        }
    }

    action_response = {
        'actionGroup': actionGroup,
        'function': function,
        'functionResponse': {
            'responseBody': responseBody
        }

    }

    dummy_function_response = {'response': action_response, 'messageVersion': event['messageVersion']}
    print("Response: {}".format(dummy_function_response))

    return dummy_function_response

I choose Deploy in the Lambda console. The function is configured with a resource-based policy that allows Amazon Bedrock to invoke the function. For this reason, I don’t need to update the IAM role used by the agent.

I am ready to test the agent. Back in the Amazon Bedrock console, with the agent selected, I look for the Test Agent section. There, I choose Prepare to prepare the agent and test it with the latest changes.

As input to the agent, I provide this sample email:

From: [email protected]

Subject: Problems logging in

Hi, when I try to log into my account, I get an error and cannot proceed further. Can you check? Thank you, Danilo

In the first step, the agent orchestration decides to use the first action group (retrieve-customer-settings) and function (retrieve-customer-settings-from-crm). This function is configured to return control, and in the console, I am asked to provide the output of the action group function. The customer email address is provided as the input parameter.

To simulate an interaction with an application, I reply with a JSON syntax and choose Submit:

{ "customer id": 12345 }

In the next step, the agent has the information required to use the second action group (check-login-status) and function (check-customer-login-status-in-login-system) to call the Lambda function. In return, the Lambda function provides this JSON payload:

{
  "status": "not verified",
  "reason": "the email address has not been verified",
  "solution": "please verify your email address"
}

Using this content, the agent can complete its task and suggest the correct solution for this customer.

I am satisfied with the result, but I want to know more about what happened under the hood. I choose Show trace where I can see the details of each step of the agent orchestration. This helps me understand the agent decisions and correct the configurations of the agent groups if they are not used as I expect.

Things to know
You can use the new simplified experience to create and manage Agents for Amazon Bedrock in the US East (N. Virginia) and US West (Oregon) AWS Regions.

You can now create an agent without having to specify an API schema or provide a Lambda function for the action groups. You just need to list the parameters that the action group needs. When invoking the agent, you can choose to return control with the details of the operation to perform so that you can handle the operation in your existing applications or if the duration is longer than the maximum Lambda function timeout.

CloudFormation support for Agents for Amazon Bedrock has been released recently and is now being updated to support the new simplified syntax.

To learn more:

See the Agents for Amazon Bedrock section of the User Guide.
Visit our community.aws site to find deep-dive technical content and discover how others are using Amazon Bedrock in their solutions.

— Danilo

Amazon Bedrock model evaluation is now generally available

2024-04-23 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-bedrock-model-evaluation-is-now-generally-available/

The Amazon Bedrock model evaluation capability that we previewed at AWS re:Invent 2023 is now generally available. This new capability helps you to incorporate Generative AI into your application by giving you the power to select the foundation model that gives you the best results for your particular use case. As my colleague Antje explained in her post (Evaluate, compare, and select the best foundation models for your use case in Amazon Bedrock):

Model evaluations are critical at all stages of development. As a developer, you now have evaluation tools available for building generative artificial intelligence (AI) applications. You can start by experimenting with different models in the playground environment. To iterate faster, add automatic evaluations of the models. Then, when you prepare for an initial launch or limited release, you can incorporate human reviews to help ensure quality.

We received a lot of wonderful and helpful feedback during the preview and used it to round-out the features of this new capability in preparation for today’s launch — I’ll get to those in a moment. As a quick recap, here are the basic steps (refer to Antje’s post for a complete walk-through):

Create a Model Evaluation Job – Select the evaluation method (automatic or human), select one of the available foundation models, choose a task type, and choose the evaluation metrics. You can choose accuracy, robustness, and toxicity for an automatic evaluation, or any desired metrics (friendliness, style, and adherence to brand voice, for example) for a human evaluation. If you choose a human evaluation, you can use your own work team or you can opt for an AWS-managed team. There are four built-in task types, as well as a custom type (not shown):

After you select the task type you choose the metrics and the datasets that you want to use to evaluate the performance of the model. For example, if you select Text classification, you can evaluate accuracy and/or robustness with respect to your own dataset or a built-in one:

As you can see above, you can use a built-in dataset, or prepare a new one in JSON Lines (JSONL) format. Each entry must include a prompt and can include a category. The reference response is optional for all human evaluation configurations and for some combinations of task types and metrics for automatic evaluation:

{
  "prompt" : "Bobigny is the capitol of",
  "referenceResponse" : "Seine-Saint-Denis",
  "category" : "Capitols"
}

You (or your local subject matter experts) can create a dataset that uses customer support questions, product descriptions, or sales collateral that is specific to your organization and your use case. The built-in datasets include Real Toxicity, BOLD, TREX, WikiText-2, Gigaword, BoolQ, Natural Questions, Trivia QA, and Women’s Ecommerce Clothing Reviews. These datasets are designed to test specific types of tasks and metrics, and can be chosen as appropriate.

Run Model Evaluation Job – Start the job and wait for it to complete. You can review the status of each of your model evaluation jobs from the console, and can also access the status using the new GetEvaluationJob API function:

Retrieve and Review Evaluation Report – Get the report and review the model’s performance against the metrics that you selected earlier. Again, refer to Antje’s post for a detailed look at a sample report.

New Features for GA
With all of that out of the way, let’s take a look at the features that were added in preparation for today’s launch:

Improved Job Management – You can now stop a running job using the console or the new model evaluation API.

Model Evaluation API – You can now create and manage model evaluation jobs programmatically. The following functions are available:

CreateEvaluationJob – Create and run a model evaluation job using parameters specified in the API request including an evaluationConfig and an inferenceConfig.
ListEvaluationJobs – List model evaluation jobs, with optional filtering and sorting by creation time, evaluation job name, and status.
GetEvaluationJob – Retrieve the properties of a model evaluation job, including the status (InProgress, Completed, Failed, Stopping, or Stopped). After the job has completed, the results of the evaluation will be stored at the S3 URI that was specified in the outputDataConfig property supplied to CreateEvaluationJob.
StopEvaluationJob – Stop an in-progress job. Once stopped, a job cannot be resumed, and must be created anew if you want to rerun it.

This model evaluation API was one of the most-requested features during the preview. You can use it to perform evaluations at scale, perhaps as part of a development or testing regimen for your applications.

Enhanced Security – You can now use customer-managed KMS keys to encrypt your evaluation job data (if you don’t use this option, your data is encrypted using a key owned by AWS):

Access to More Models – In addition to the existing text-based models from AI21 Labs, Amazon, Anthropic, Cohere, and Meta, you now have access to Claude 2.1:

After you select a model you can set the inference configuration that will be used for the model evaluation job:

Things to Know
Here are a couple of things to know about this cool new Amazon Bedrock capability:

Pricing – You pay for the inferences that are performed during the course of the model evaluation, with no additional charge for algorithmically generated scores. If you use human-based evaluation with your own team, you pay for the inferences and $0.21 for each completed task — a human worker submitting an evaluation of a single prompt and its associated inference responses in the human evaluation user interface. Pricing for evaluations performed by an AWS managed work team is based on the dataset, task types, and metrics that are important to your evaluation. For more information, consult the Amazon Bedrock Pricing page.

Regions – Model evaluation is available in the US East (N. Virginia) and US West (Oregon) AWS Regions.

More GenAI – Visit our new GenAI space to learn more about this and the other announcements that we are making today!

— Jeff;

Import custom models in Amazon Bedrock (preview)

2024-04-23 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/import-custom-models-in-amazon-bedrock-preview/

With Amazon Bedrock, you have access to a choice of high-performing foundation models (FMs) from leading artificial intelligence (AI) companies that make it easier to build and scale generative AI applications. Some of these models provide publicly available weights that can be fine-tuned and customized for specific use cases. However, deploying customized FMs in a secure and scalable way is not an easy task.

Starting today, Amazon Bedrock adds in preview the capability to import custom weights for supported model architectures (such as Meta Llama 2, Llama 3, and Mistral) and serve the custom model using On-Demand mode. You can import models with weights in Hugging Face safetensors format from Amazon SageMaker and Amazon Simple Storage Service (Amazon S3).

In this way, you can use Amazon Bedrock with existing customized models such as Code Llama, a code-specialized version of Llama 2 that was created by further training Llama 2 on code-specific datasets, or use your data to fine-tune models for your own unique business case and import the resulting model in Amazon Bedrock.

Let’s see how this works in practice.

Bringing a custom model to Amazon Bedrock
In the Amazon Bedrock console, I choose Imported models from the Foundation models section of the navigation pane. Now, I can create a custom model by importing model weights from an Amazon Simple Storage Service (Amazon S3) bucket or from an Amazon SageMaker model.

I choose to import model weights from an S3 bucket. In another browser tab, I download the MistralLite model from the Hugging Face website using this pull request (PR) that provides weights in safetensors format. The pull request is currently Ready to merge, so it might be part of the main branch when you read this. MistralLite is a fine-tuned Mistral-7B-v0.1 language model with enhanced capabilities of processing long context up to 32K tokens.

When the download is complete, I upload the files to an S3 bucket in the same AWS Region where I will import the model. Here are the MistralLite model files in the Amazon S3 console:

Back at the Amazon Bedrock console, I enter a name for the model and keep the proposed import job name.

I select Model weights in the Model import settings and browse S3 to choose the location where I uploaded the model weights.

To authorize Amazon Bedrock to access the files on the S3 bucket, I select the option to create and use a new AWS Identity and Access Management (IAM) service role. I use the View permissions details link to check what will be in the role. Then, I submit the job.

About ten minutes later, the import job is completed.

Now, I see the imported model in the console. The list also shows the model Amazon Resource Name (ARN) and the creation date.

I choose the model to get more information, such as the S3 location of the model files.

In the model detail page, I choose Open in playground to test the model in the console. In the text playground, I type a question using the prompt template of the model:

<|prompter|>What are the main challenges to support a long context for LLM?</s><|assistant|>

The MistralLite imported model is quick to reply and describe some of those challenges.

In the playground, I can tune responses for my use case using configurations such as temperature and maximum length or add stop sequences specific to the imported model.

To see the syntax of the API request, I choose the three small vertical dots at the top right of the playground.

I choose View API syntax and run the command using the AWS Command Line Interface (AWS CLI):

aws bedrock-runtime invoke-model \
--model-id arn:aws:bedrock:us-east-1:123412341234:imported-model/a82bkefgp20f \
--body "{\"prompt\":\"<|prompter|>What are the main challenges to support a long context for LLM?</s><|assistant|>\",\"max_tokens\":512,\"top_k\":200,\"top_p\":0.9,\"stop\":[],\"temperature\":0.5}" \
--cli-binary-format raw-in-base64-out \
--region us-east-1 \
invoke-model-output.txt

The output is similar to what I got in the playground. As you can see, for imported models, the model ID is the ARN of the imported model. I can use the model ID to invoke the imported model with the AWS CLI and AWS SDKs.

Things to know
You can bring your own weights for supported model architectures to Amazon Bedrock in the US East (N. Virginia) AWS Region. The model import capability is currently available in preview.

When using custom weights, Amazon Bedrock serves the model with On-Demand mode, and you only pay for what you use with no time-based term commitments. For detailed information, see Amazon Bedrock pricing.

The ability to import models is managed using AWS Identity and Access Management (IAM), and you can allow this capability only to the roles in your organization that need to have it.

With this launch, it’s now easier to build and scale generative AI applications using custom models with security and privacy built in.

To learn more:

See the Amazon Bedrock User Guide.
Visit our community.aws site to find deep-dive technical content and discover how others are using Amazon Bedrock in their solutions.

— Danilo

Amazon Titan Image Generator and watermark detection API are now available in Amazon Bedrock

2024-04-23 Antje Barth

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/amazon-titan-image-generator-and-watermark-detection-api-are-now-available-in-amazon-bedrock/

During AWS re:Invent 2023, we announced the preview of Amazon Titan Image Generator, a generative artificial intelligence (generative AI) foundation model (FM) that you can use to quickly create and refine realistic, studio-quality images using English natural language prompts.

I’m happy to share that Amazon Titan Image Generator is now generally available in Amazon Bedrock, giving you an easy way to build and scale generative AI applications with new image generation and image editing capabilities, including instant customization of images.

In my previous post, I also mentioned that all images generated by Titan Image Generator contain an invisible watermark, by default, which is designed to help reduce the spread of misinformation by providing a mechanism to identify AI-generated images.

I’m excited to announce that watermark detection for Titan Image Generator is now generally available in the Amazon Bedrock console. Today, we’re also introducing a new DetectGeneratedContent API (preview) in Amazon Bedrock that checks for the existence of this watermark and helps you confirm whether an image was generated by Titan Image Generator.

Let me show you how to get started with these new capabilities.

Instant image customization using Amazon Titan Image Generator
You can now generate new images of a subject by providing up to five reference images. You can create the subject in different scenes while preserving its key features, transfer the style from the reference images to new images, or mix styles from multiple reference images. All this can be done without additional prompt engineering or fine-tuning of the model.

For this demo, I prompt Titan Image Generator to create an image of a “parrot eating a banana.” In the first attempt, I use Titan Image Generator to create this new image without providing a reference image.

Note: In the following code examples, I’ll use the AWS SDK for Python (Boto3) to interact with Amazon Bedrock. You can find additional code examples for C#/.NET, Go, Java, and PHP in the Bedrock User Guide.

import boto3
import json

bedrock_runtime = boto3.client(service_name="bedrock-runtime")

body = json.dumps(
    {
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {
            "text": "parrot eating a banana",   
        },
        "imageGenerationConfig": {
            "numberOfImages": 1,   
            "quality": "premium", 
            "height": 768,
            "width": 1280,
            "cfgScale": 10, 
            "seed": 42
        }
    }
)
response = bedrock_runtime.invoke_model(
    body=body, 
    modelId="amazon.titan-image-generator-v1",
    accept="application/json", 
    contentType="application/json"
)

You can display the generated image using the following code.

import io
import base64
from PIL import Image

response_body = json.loads(response.get("body").read())

images = [
    Image.open(io.BytesIO(base64.b64decode(base64_image)))
    for base64_image in response_body.get("images")
]

for img in images:
    display(img)

Here’s the generated image:

Then, I use the new instant image customization capability with the same prompt, but now also providing the following two reference images. For easier comparison, I’ve resized the images, added a caption, and plotted them side by side.

Here’s the code. The new instant customization is available through the IMAGE_VARIATION task:

# Import reference images
image_path_1 = "parrot-cartoon.png"
image_path_2 = "bird-sketch.png"

with open(image_path_1, "rb") as image_file:
    input_image_1 = base64.b64encode(image_file.read()).decode("utf8")

with open(image_path_2, "rb") as image_file:
    input_image_2 = base64.b64encode(image_file.read()).decode("utf8")

# ImageVariationParams options:
#   text: Prompt to guide the model on how to generate variations
#   images: Base64 string representation of a reference image, up to 5 images are supported
#   similarityStrength: Parameter you can tune to control similarity with reference image(s)

body = json.dumps(
    {
        "taskType": "IMAGE_VARIATION",
        "imageVariationParams": {
            "text": "parrot eating a banana",  # Required
            "images": [input_image_1, input_image_2],  # Required 1 to 5 images
            "similarityStrength": 0.7,  # Range: 0.2 to 1.0
        },
        "imageGenerationConfig": {
            "numberOfImages": 1,
            "quality": "premium",
            "height": 768,
            "width": 1280,
            "cfgScale": 10,
            "seed": 42
        }
    }
)

response = bedrock_runtime.invoke_model(
    body=body, 
    modelId="amazon.titan-image-generator-v1",
    accept="application/json", 
    contentType="application/json"
)

Once again, I’ve resized the generated image, added a caption, and plotted it side by side with the originally generated image.

You can see how the parrot in the second image that has been generated using the instant image customization capability resembles in style the combination of the provided reference images.

Watermark detection for Amazon Titan Image Generator
All Amazon Titan FMs are built with responsible AI in mind. They detect and remove harmful content from data, reject inappropriate user inputs, and filter model outputs. As content creators create realistic-looking images with AI, it’s important to promote responsible development of this technology and reduce the spread of misinformation. That’s why all images generated by Titan Image Generator contain an invisible watermark, by default. Watermark detection is an innovative technology, and Amazon Web Services (AWS) is among the first major cloud providers to widely release built-in watermarks for AI image outputs.

Titan Image Generator’s new watermark detection feature is a mechanism that allows you to identify images generated by Amazon Titan. These watermarks are designed to be tamper-resistant, helping increase transparency around AI-generated content as these capabilities continue to advance.

Watermark detection using the console
Watermark detection is generally available in the Amazon Bedrock console. You can upload an image to detect watermarks embedded in images created by Titan Image Generator, including those generated by the base model and any customized versions. If you upload an image that was not created by Titan Image Generator, then the model will indicate that a watermark has not been detected.

The watermark detection feature also comes with a confidence score. The confidence score represents the confidence level in watermark detection. In some cases, the detection confidence may be low if the original image has been modified. This new capability enables content creators, news organizations, risk analysts, fraud detection teams, and others to better identify and mitigate misleading AI-generated content, promoting transparency and responsible AI deployment across organizations.

Watermark detection using the API (preview)
In addition to watermark detection using the console, we’re introducing a new DetectGeneratedContent API (preview) in Amazon Bedrock that checks for the existence of this watermark and helps you confirm whether an image was generated by Titan Image Generator. Let’s see how this works.

For this demo, let’s check if the image of the green iguana I showed in the Titan Image Generator preview post was indeed generated by the model.

I define the imports, set up the Amazon Bedrock boto3 runtime client, and base64-encode the image. Then, I call the DetectGeneratedContent API by specifying the foundation model and providing the encoded image.

import boto3
import json
import base64

bedrock_runtime = boto3.client(service_name="bedrock-runtime")

image_path = "green-iguana.png"

with open(image_path, "rb") as image_file:
    input_image_iguana = image_file.read()

response = bedrock_runtime.detect_generated_content(
    foundationModelId = "amazon.titan-image-generator-v1",
    content = {
        "imageContent": { "bytes": input_image_iguana }
    }
)

Let’s check the response.

response.get("detectionResult")
'GENERATED'
response.get("confidenceLevel")
'HIGH'

The response GENERATED with the confidence level HIGH confirms that Amazon Bedrock detected a watermark generated by Titan Image Generator.

Now, let’s check another image I generated using Stable Diffusion XL 1.0 on Amazon Bedrock. In this case, a “meerkat facing the sunset.”

I call the API again, this time with the image of the meerkat.

image_path = "meerkat.png"

with open(image_path, "rb") as image_file:
    input_image_meerkat = image_file.read()

response = bedrock_runtime.detect_generated_content(
    foundationModelId = "amazon.titan-image-generator-v1",
    content = {
        "imageContent": { "bytes": input_image_meerkat }
    }
)

response.get("detectionResult")
'NOT_GENERATED'

And indeed, the response NOT_GENERATED tells me that there was no watermark by Titan Image Generator detected, and therefore, the image most likely wasn’t generated by the model.

Using Amazon Titan Image Generator and watermark detection in the console
Here’s a short demo of how to get started with Titan Image Generator and the new watermark detection feature in the Amazon Bedrock console, put together by my colleague Nirbhay Agarwal.

Availability
Amazon Titan Image Generator, the new instant customization capabilities, and watermark detection in the Amazon Bedrock console are available today in the AWS Regions US East (N. Virginia) and US West (Oregon). Check the full Region list for future updates. The new DetectGeneratedContent API in Amazon Bedrock is available today in public preview in the AWS Regions US East (N. Virginia) and US West (Oregon).

Amazon Titan Image Generator, now also available in PartyRock
Titan Image Generator is now also available in PartyRock, an Amazon Bedrock playground. PartyRock gives you a no-code, AI-powered app-building experience that doesn’t require a credit card. You can use PartyRock to create apps that generate images in seconds by selecting from your choice of image generation models from Stability AI and Amazon.

More resources
To learn more about the Amazon Titan family of models, visit the Amazon Titan product page. For pricing details, check Amazon Bedrock Pricing.

Give Amazon Titan Image Generator a try in PartyRock or explore the model’s advanced image generation and editing capabilities in the Amazon Bedrock console. Send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS contacts.

For more deep-dive technical content and to engage with the generative AI Builder community, visit our generative AI space at community.aws.

— Antje

Unify DNS management using Amazon Route 53 Profiles with multiple VPCs and AWS accounts

2024-04-23 Esra Kayabali

Post Syndicated from Esra Kayabali original https://aws.amazon.com/blogs/aws/unify-dns-management-using-amazon-route-53-profiles-with-multiple-vpcs-and-aws-accounts/

If you are managing lots of accounts and Amazon Virtual Private Cloud (Amazon VPC) resources, sharing and then associating many DNS resources to each VPC can present a significant burden. You often hit limits around sharing and association, and you may have gone as far as building your own orchestration layers to propagate DNS configuration across your accounts and VPCs.

Today, I’m happy to announce Amazon Route 53 Profiles, which provide the ability to unify management of DNS across all of your organization’s accounts and VPCs. Route 53 Profiles let you define a standard DNS configuration, including Route 53 private hosted zone (PHZ) associations, Resolver forwarding rules, and Route 53 Resolver DNS Firewall rule groups, and apply that configuration to multiple VPCs in the same AWS Region. With Profiles, you have an easy way to ensure all of your VPCs have the same DNS configuration without the complexity of handling separate Route 53 resources. Managing DNS across many VPCs is now as simple as managing those same settings for a single VPC.

Profiles are natively integrated with AWS Resource Access Manager (RAM) allowing you to share your Profiles across accounts or with your AWS Organizations account. Profiles integrates seamlessly with Route 53 private hosted zones by allowing you to create and add existing private hosted zones to your Profile so that your organizations have access to these same settings when the Profile is shared across accounts. AWS CloudFormation allows you to use Profiles to set DNS settings consistently for VPCs as accounts are newly provisioned. With today’s release, you can better govern DNS settings for your multi-account environments.

How Route 53 Profiles works
To start using the Route 53 Profiles, I go to the AWS Management Console for Route 53, where I can create Profiles, add resources to them, and associate them to their VPCs. Then, I share the Profile I created across another account using AWS RAM.

In the navigation pane in the Route 53 console, I choose Profiles and then I choose Create profile to set up my Profile.

I give my Profile configuration a friendly name such as MyFirstRoute53Profile and optionally add tags.

I can configure settings for DNS Firewall rule groups, private hosted zones and Resolver rules or add existing ones within my account all within the Profile console page.

I choose VPCs to associate my VPCs to the Profile. I can add tags as well as do configurations for recursive DNSSEC validation, the failure mode for the DNS Firewalls associated to my VPCs. I can also control the order of DNS evaluation: First VPC DNS then Profile DNS, or first Profile DNS then VPC DNS.

I can associate one Profile per VPC and can associate up to 5,000 VPCs to a single Profile.

Profiles gives me the ability to manage settings for VPCs across accounts in my organization. I am able to disable reverse DNS rules for each of the VPCs the Profile is associated with rather than configuring these on a per-VPC basis. The Route 53 Resolver automatically creates rules for reverse DNS lookups for me so that different services can easily resolve hostnames from IP addresses. If I use DNS Firewall, I am able to select the failure mode for my firewall via settings, to fail open or fail closed. I am also able to specify if I wish for the VPCs associated to the Profile to have recursive DNSSEC validation enabled without having to use DNSSEC signing in Route 53 (or any other provider).

Let’s say I associate a Profile to a VPC. What happens when a query exactly matches both a resolver rule or PHZ associated directly to the VPC and a resolver rule or PHZ associated to the VPC’s Profile? Which DNS settings take precedence, the Profile’s or the local VPC’s? For example, if the VPC is associated to a PHZ for example.com and the Profile contains a PHZ for example.com, that VPC’s local DNS settings will take precedence over the Profile. When a query is made for a name for a conflicting domain name (for example, the Profile contains a PHZ for infra.example.com and the VPC is associated to a PHZ that has the name account1.infra.example.com), the most specific name wins.

Sharing Route 53 Profiles across accounts using AWS RAM
I use AWS Resource Access Manager (RAM) to share the Profile I created in the previous section with my other account.

I choose the Share profile option in the Profiles detail page or I can go to the AWS RAM console page and choose Create resource share.

I provide a name for my resource share and then I search for the ‘Route 53 Profiles’ in the Resources section. I select the Profile in Selected resources. I can choose to add tags. Then, I choose Next.

Profiles utilize RAM managed permissions, which allow me to attach different permissions to each resource type. By default, only the owner (the network admin) of the Profile will be able to modify the resources within the Profile. Recipients of the Profile (the VPC owners) will only be able to view the contents of the Profile (the ReadOnly mode). To allow a recipient of the Profile to add PHZs or other resources to it, the Profile’s owner will have to attach the necessary permissions to the resource. Recipients will not be able to edit or delete any resources added by the Profile owner to the shared resource.

I leave the default selections and choose Next to grant access to my other account.

On the next page, I choose Allow sharing with anyone, enter my other account’s ID and then choose Add. After that, I choose that account ID in the Selected principals section and choose Next.

In the Review and create page, I choose Create resource share. Resource share is successfully created.

Now, I switch to my other account that I share my Profile with and go to the RAM console. In the navigation menu, I go to the Resource shares and choose the resource name I created in the first account. I choose Accept resource share to accept the invitation.

That’s it! Now, I go to my Route 53 Profiles page and I choose the Profile shared with me.

I have access to the shared Profile’s DNS Firewall rule groups, private hosted zones, and Resolver rules. I can associate this account’s VPCs to this Profile. I am not able to edit or delete any resources. Profiles are Regional resources and cannot be shared across Regions.

Available now
You can easily get started with Route 53 Profiles using the AWS Management Console, Route 53 API, AWS Command Line Interface (AWS CLI), AWS CloudFormation, and AWS SDKs.

Route 53 Profiles will be available in all AWS Regions, except in Canada West (Calgary), the AWS GovCloud (US) Regions and the Amazon Web Services China Regions.

For more details about the pricing, visit the Route 53 pricing page.

Get started with Profiles today and please let us know your feedback either through your usual AWS Support contacts or the AWS re:Post for Amazon Route 53.

— Esra

Amazon CloudWatch Internet Weather Map – View and analyze internet health

2024-04-17 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/amazon-cloudwatch-internet-weather-map-view-and-analyze-internet-health/

The Internet has a plethora of moving parts: routers, switches, hubs, terrestrial and submarine cables, and connectors on the hardware side, and complex protocol stacks and configurations on the software side. When something goes wrong that slows or disrupts the Internet in a way that affects your customers, you want to be able to localize and understand the issue as quickly as possible.

New Map
The new Amazon CloudWatch Internet Weather Map is here to help! Built atop of collection of global monitors operated by AWS, you get a broad, global view of Internet weather, with the ability to zoom in and understand performance and availability issues that affect a particular city. To access the map, open the CloudWatch Console, expand Network monitoring on the left, and click Internet Monitor. The map appears and displays weather for the entire world:

The red and yellow circles indicate current, active issues that affect availability or performance, respectively. The grey circles represent issues that have been resolved within the last 24 hours, and the blue diamonds represent AWS regions. The map will automatically refresh every 15 minutes if you leave it on the screen.

Each issue affects a specific city-network, representing a combination of a location where clients access AWS resources, and the Autonomous System Number (ASN) that was used to access the resources. ASNs typically represent individual Internet Service Providers (ISPs).

The list to the right of the map shows active events at the top, followed by events that have been resolved in the recent past, looking back up to 24 hours:

I can hover my mouse over any of the indicators to see the list of city-networks in the geographic area:

If I zoom in a step or two, I can see that those city-networks are spread out over the United States:

I can zoom in even further and see a single city-network:

This information is also available programmatically. The new ListInternetEvents function returns up to 100 performance or availability events per call, with optional filtering by time range, status (ACTIVE or RESOLVED), or type (PERFORMANCE or AVAILABILITY). Each event includes full details including latitude and longitude.

The new map is accessible from all AWS regions and there is no charge to use it. Going forward, we have a lot of powerful additions on the roadmap, subject to prioritization based on your feedback. Right now we are thinking about:

Displaying causes of certain types of outages such as DDoS attacks, BGP route leaks, and issues with route interconnects.
Adding a view that is specific to a chosen ISP.
Displaying the impact to public SaaS applications.

Please feel free to send feedback on this feature to [email protected] .

CloudWatch Internet Monitor
The information in the map applies to everyone who makes use of applications built on AWS. If you want to understand how internet weather affects your particular AWS applications and to take advantage of other features such as health event notification and traffic insights, you can make use of CloudWatch Internet Monitor. As my colleague Sébastien wrote when he launched this feature in late 2022:

You told us one of your challenges when monitoring internet-facing applications is to gather data outside of AWS to build a realistic picture of how your application behaves for your customers connected to multiple and geographically distant internet providers. Capturing and monitoring data about internet traffic before it reaches your infrastructure is either difficult or very expensive.

After you review the map, you can click Create monitor to get started with CloudWatch Internet Monitor:

After that you enter a name for your monitor, choose the AWS resources (VPCs, CloudFront distributions, Network Load Balancers, and Amazon WorkSpace Directories) to monitor, then select the desired percentage of internet-facing traffic to monitor. The monitor will begin to operate within minutes, using entries from your VPC Flow Logs, CloudFront Access Logs, and other telemetry to identify the most relevant city-networks.

Here are some resources to help you learn more about this feature:

— Jeff;

Anthropic’s Claude 3 Opus model is now available on Amazon Bedrock

2024-04-16 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/anthropics-claude-3-opus-model-on-amazon-bedrock/

We are living in the generative artificial intelligence (AI) era; a time of rapid innovation. When Anthropic announced its Claude 3 foundation models (FMs) on March 4, we made Claude 3 Sonnet, a model balanced between skills and speed, available on Amazon Bedrock the same day. On March 13, we launched the Claude 3 Haiku model on Amazon Bedrock, the fastest and most compact member of the Claude 3 family for near-instant responsiveness.

Today, we are announcing the availability of Anthropic’s Claude 3 Opus on Amazon Bedrock, the most intelligent Claude 3 model, with best-in-market performance on highly complex tasks. It can navigate open-ended prompts and sight-unseen scenarios with remarkable fluency and human-like understanding, leading the frontier of general intelligence.

With the availability of Claude 3 Opus on Amazon Bedrock, enterprises can build generative AI applications to automate tasks, generate revenue through user-facing applications, conduct complex financial forecasts, and accelerate research and development across various sectors. Like the rest of the Claude 3 family, Opus can process images and return text outputs.

Claude 3 Opus shows an estimated twofold gain in accuracy over Claude 2.1 on difficult open-ended questions, reducing the likelihood of faulty responses. As enterprise customers rely on Claude across industries like healthcare, finance, and legal research, improved accuracy is essential for safety and performance.

How does Claude 3 Opus perform?
Claude 3 Opus outperforms its peers on most of the common evaluation benchmarks for AI systems, including undergraduate-level expert knowledge (MMLU), graduate-level expert reasoning (GPQA), basic mathematics (GSM8K), and more. It exhibits high levels of comprehension and fluency on complex tasks, leading the frontier of general intelligence.

Source: https://www.anthropic.com/news/claude-3-family

Here are a few supported use cases for the Claude 3 Opus model:

Task automation: planning and execution of complex actions across APIs, databases, and interactive coding
Research: brainstorming and hypothesis generation, research review, and drug discovery
Strategy: advanced analysis of charts and graphs, financials and market trends, and forecasting

To learn more about Claude 3 Opus’s features and capabilities, visit Anthropic’s Claude on Bedrock page and Anthropic Claude models in the Amazon Bedrock documentation.

Claude 3 Opus in action
If you are new to using Anthropic models, go to the Amazon Bedrock console and choose Model access on the bottom left pane. Request access separately for Claude 3 Opus.

To test Claude 3 Opus in the console, choose Text or Chat under Playgrounds in the left menu pane. Then choose Select model and select Anthropic as the category and Claude 3 Opus as the model.

To test more Claude prompt examples, choose Load examples. You can view and run examples specific to Claude 3 Opus, such as analyzing a quarterly report, building a website, and creating a side-scrolling game.

By choosing View API request, you can also access the model using code examples in the AWS Command Line Interface (AWS CLI) and AWS SDKs. Here is a sample of the AWS CLI command:

aws bedrock-runtime invoke-model \
     --model-id anthropic.claude-3-opus-20240229-v1:0 \
     --body "{\"messages\":[{\"role\":\"user\",\"content\":[{\"type\":\"text\",\"text\":\" Your task is to create a one-page website for an online learning platform.\\n\"}]}],\"anthropic_version\":\"bedrock-2023-05-31\",\"max_tokens\":2000,\"temperature\":1,\"top_k\":250,\"top_p\":0.999,\"stop_sequences\":[\"\\n\\nHuman:\"]}" \
     --cli-binary-format raw-in-base64-out \
     --region us-east-1 \
     invoke-model-output.txt

As I mentioned in my previous Claude 3 model launch posts, you need to use the new Anthropic Claude Messages API format for some Claude 3 model features, such as image processing. If you use Anthropic Claude Text Completions API and want to use Claude 3 models, you should upgrade from the Text Completions API.

My colleagues, Dennis Traub and Francois Bouteruche, are building code examples for Amazon Bedrock using AWS SDKs. You can learn how to invoke Claude 3 on Amazon Bedrock to generate text or multimodal prompts for image analysis in the Amazon Bedrock documentation.

Here is sample JavaScript code to send a Messages API request to generate text:

// claude_opus.js - Invokes Anthropic Claude 3 Opus using the Messages API.
import {
  BedrockRuntimeClient,
  InvokeModelCommand
} from "@aws-sdk/client-bedrock-runtime";

const modelId = "anthropic.claude-3-opus-20240229-v1:0";
const prompt = "Hello Claude, how are you today?";

// Create a new Bedrock Runtime client instance
const client = new BedrockRuntimeClient({ region: "us-east-1" });

// Prepare the payload for the model
const payload = {
  anthropic_version: "bedrock-2023-05-31",
  max_tokens: 1000,
  messages: [{
    role: "user",
    content: [{ type: "text", text: prompt }]
  }]
};

// Invoke Claude with the payload and wait for the response
const command = new InvokeModelCommand({
  contentType: "application/json",
  body: JSON.stringify(payload),
  modelId
});
const apiResponse = await client.send(command);

// Decode and print Claude's response
const decodedResponseBody = new TextDecoder().decode(apiResponse.body);
const responseBody = JSON.parse(decodedResponseBody);
const text = responseBody.content[0].text;
console.log(`Response: ${text}`);

Now, you can install the AWS SDK for JavaScript Runtime Client for Node.js and run claude_opus.js.

npm install @aws-sdk/client-bedrock-runtime
node claude_opus.js

For more examples in different programming languages, check out the code examples section in the Amazon Bedrock User Guide, and learn how to use system prompts with Anthropic Claude at Community.aws.

Now available
Claude 3 Opus is available today in the US West (Oregon) Region; check the full Region list for future updates.

Give Claude 3 Opus a try in the Amazon Bedrock console today and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

— Channy

Congratulations to the PartyRock generative AI hackathon winners

2024-04-15 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/congratulations-to-the-partyrock-generative-ai-hackathon-winners/

The PartyRock Generative AI Hackathon wrapped up earlier this month. Entrants were asked to use PartyRock to build a functional app based on one of four challenge categories, with the option to remix an existing app as well. The hackathon attracted 7,650 registrants who submitted over 1,200 projects, and published over 250 project blog posts on community.aws .

As a member of the judging panel, I was blown away by the creativity and sophistication of the entries that I was asked to review. The participants had the opportunity to go hands-on with prompt engineering and to learn about Foundation Models, and pushed the bounds of what was possible.

Let’s take a quick look at the winners of the top 3 prizes…

First Place
First up, taking home the top overall prize of $20,000 in AWS credits is Parable Rhythm – The Interactive Crime Thriller by Param Birje. This project immerses you in a captivating interactive story using PartyRock’s generative capabilities. Just incredible stuff.

To learn more, read the hackathon submission and the blog post.

Second Place
In second place, earning $10,000 in credits, is Faith – Manga Creation Tools by Michael Oswell. This creative assistant app lets you generate original manga panels and panels with the click of a button. So much potential there.

To learn more, read the hackathon submission.

Third Place
And rounding out the top 3 overall is Arghhhh! Zombie by Michael Eziamaka. This is a wildly entertaining generative AI-powered zombie game that had the judges on the edge of their seats. Great work, Michael!

To learn more, read the hackathon submission.

Round of Applause
I want to give a huge round of applause to all our category winners as well:

Category / Place	Submission	Prize (USD)	AWS Credits
Overall 1st Place	Parable Rhythm	–	$20,000
Overall 2nd Place	Faith – Manga Creation Tools	–	$10,000
Overall 3rd Place	Arghhhh! Zombie	–	$5,000

Creative Assistants 1st Place	Faith – Manga Creation Tools	$4,000	$1,000
Creative Assistants 2nd Place	MovieCreator	$1,500	$1,000
Creative Assistants 3rd Place	WingPal	$500	$1,000

Experimental Entertainment 1st Place	Parable Rhythm	$4,000	$1,000
Experimental Entertainment 2nd Place	Arghhhh! Zombie	$1,500	$1,000
Experimental Entertainment 3rd Place	Find your inner potato	$500	$1,000

Interactive Learning 1st Place	DeBeat Coach	$4,000	$1,000
Interactive Learning 2nd Place	Asteroid Mining Assistant	$1,500	$1,000
Interactive Learning 3rd Place	Unlock your pet’s language	$500	$1,000

Freestyle 1st Place	MindMap Party	$1,000	$1,000
Freestyle 2nd Place	Angler Advisor	$750	$1,000
Freestyle 3rd Place	SafeScares	$250	$1,000

BONUS: Remix ChatRPG	Inferno	–	$2,500
BONUS: Remix ChatRPG	Chat RPG Generator	–	$2,500

From interactive learning experiences to experimental entertainment, the creativity and technical execution on display was off the charts.

And of course, a big thank you to all 7,650 participants who dove in and pushed the boundaries of what’s possible with generative AI. You all should be extremely proud.

Join the Party
You can click on any of the images above and try out the apps for yourself. You can remix and customize them, and you can build your own apps as well (read my post, Build AI apps with PartyRock and Amazon Bedrock to see how to get started).

Alright, that’s a wrap. Congrats again to our winners, and a huge thanks to the PartyRock team and all our amazing sponsors. I can’t wait to see what you all build next. Until then, keep building, keep learning, and keep having fun!

— Jeff;

Tackle complex reasoning tasks with Mistral Large, now available on Amazon Bedrock

2024-04-03 Veliswa Boya

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/tackle-complex-reasoning-tasks-with-mistral-large-now-available-on-amazon-bedrock/

Last month, we announced the availability of two high-performing Mistral AI models, Mistral 7B and Mixtral 8x7B on Amazon Bedrock. Mistral 7B, as the ﬁrst foundation model of Mistral, supports English text generation tasks with natural coding capabilities. Mixtral 8x7B is a popular, high-quality, sparse Mixture-of-Experts (MoE) model, that is ideal for text summarization, question and answering, text classification, text completion, and code generation.

Today, we’re announcing the availability of Mistral Large on Amazon Bedrock. Mistral Large is ideal for complex tasks that require substantial reasoning capabilities, or ones that are highly specialized, such as Synthetic Text Generation or Code Generation.

What you need to know about Mistral Large:

It’s natively fluent in English, French, Spanish, German, and Italian, with a nuanced understanding of grammar and cultural context.
It has a 32K token context window allows precise information recall from large documents.
Its precise instruction-following enables you to design your moderation policies – the folks at Mistral AI used it to set up the system-level moderation of their beta assistant demonstrator le Chat. Your first interaction with Large Language Models (LLMs) revolves around prompts. The art of crafting effective prompts is essential for generating desirable responses from LLMs and Mistral AI has a guide with example prompts showing different prompting capabilities.

Getting started with Mistral Large
To get started with Mistral Large on Bedrock, you first need to get access to the model. On the Amazon Bedrock console, select Model access, and then select Manage model access. Next, select Mistral Large, and then select Save changes.

Once you have access to Mistral Large, you can use the model on Bedrock. Refresh the base models table to view the updated status.

To see the complex reasoning aspect of Mistral Large in action I tried the following prompt in the Playgrounds section:

Prompt

Calculate the difference in payment dates between the two customers whose payment amounts are closest to each other in the given dataset:

'{
"transaction_id":{"0":"T1001","1":"T1002","2":"T1003","3":"T1004","4":"T1005"},
"customer_id":{"0":"C001","1":"C002","2":"C003","3":"C002","4":"C001"},
"payment_amount":{"0":125.5,"1":89.99,"2":120.0,"3":54.3,"4":210.2},
"payment_date":{"0":"2021-10-05","1":"2021-10-06","2":"2021-10-07","3":"2021-10-05","4":"2021-10-08"},
"payment_status":{"0":"Paid","1":"Unpaid","2":"Paid","3":"Paid","4":"Pending"}
}'

Output

To find the difference in payment dates between the two customers whose payment amounts are closest to each other, we first need to identify the two closest payment amounts and then find their respective payment dates.

The payment amounts are: 125.5, 89.99, 120.0, 54.3, 210.2

The two closest payment amounts are 125.5 and 120.0. The payment date for the payment amount 125.5 is '2021-10-05' and for the payment amount 120.0 is '2021-10-07'.

The difference in payment dates can be calculated as follows:

Date difference = |'2021-10-07' - '2021-10-05'| = 2 days

So, the difference in payment dates between the two customers whose payment amounts are closest to each other is 2 days.

Earlier in this post I mentioned that Mistral Large is natively fluent in several languages, with a nuanced understanding of grammar and cultural context. I tested the model’s understanding of the French language:

Prompt

Lequel est le plus lourd une livre de fer ou un kilogramme de plume

Output

Un kilogramme de plumes est plus lourd qu'une livre de fer.

Pour comprendre cela, il est important de se rappeler que :

1. Une livre (lb) est une unité de mesure de masse principalement utilisée aux États-Unis et dans d'autres systèmes de mesure impériaux, tandis qu'un kilogramme (kg) est l'unité de base de masse dans le système international d'unités (SI).

2. 1 kilogramme est approximativement égal à 2,2 livres.

Donc, un kilogramme de plumes est plus lourd qu'une livre de fer, car il correspond à environ 2,2 livres de plumes.

Programmatically interact with Mistral Large
You can also use AWS Command Line Interface (CLI) and AWS Software Development Kit (SDK) to make various calls using Amazon Bedrock APIs. Following, is a sample code in Python that interacts with Amazon Bedrock Runtime APIs with AWS SDK. If you specify in the prompt that “You will only respond with a JSON object with the key X, Y, and Z.”, you can use JSON format output in easy downstream tasks:

import boto3
import json

bedrock = boto3.client(service_name="bedrock-runtime", region_name='us-east-1')

prompt = """
<s>[INST]You are a summarization system that can provide summaries with associated confidence 
scores. In clear and concise language, provide three short summaries of the following essay, 
along with their confidence scores. You will only respond with a JSON object with the key Summary 
and Confidence. Do not provide explanations.[/INST]

# Essay: 
The generative artificial intelligence (AI) revolution is in full swing, and customers of all sizes and across industries are taking advantage of this transformative technology to reshape their businesses. From reimagining workflows to make them more intuitive and easier to enhancing decision-making processes through rapid information synthesis, generative AI promises to redefine how we interact with machines. It’s been amazing to see the number of companies launching innovative generative AI applications on AWS using Amazon Bedrock. Siemens is integrating Amazon Bedrock into its low-code development platform Mendix to allow thousands of companies across multiple industries to create and upgrade applications with the power of generative AI. Accenture and Anthropic are collaborating with AWS to help organizations—especially those in highly-regulated industries like healthcare, public sector, banking, and insurance—responsibly adopt and scale generative AI technology with Amazon Bedrock. This collaboration will help organizations like the District of Columbia Department of Health speed innovation, improve customer service, and improve productivity, while keeping data private and secure. Amazon Pharmacy is using generative AI to fill prescriptions with speed and accuracy, making customer service faster and more helpful, and making sure that the right quantities of medications are stocked for customers.

To power so many diverse applications, we recognized the need for model diversity and choice for generative AI early on. We know that different models excel in different areas, each with unique strengths tailored to specific use cases, leading us to provide customers with access to multiple state-of-the-art large language models (LLMs) and foundation models (FMs) through a unified service: Amazon Bedrock. By facilitating access to top models from Amazon, Anthropic, AI21 Labs, Cohere, Meta, Mistral AI, and Stability AI, we empower customers to experiment, evaluate, and ultimately select the model that delivers optimal performance for their needs.

Announcing Mistral Large on Amazon Bedrock
Today, we are excited to announce the next step on this journey with an expanded collaboration with Mistral AI. A French startup, Mistral AI has quickly established itself as a pioneering force in the generative AI landscape, known for its focus on portability, transparency, and its cost-effective design requiring fewer computational resources to run. We recently announced the availability of Mistral 7B and Mixtral 8x7B models on Amazon Bedrock, with weights that customers can inspect and modify. Today, Mistral AI is bringing its latest and most capable model, Mistral Large, to Amazon Bedrock, and is committed to making future models accessible to AWS customers. Mistral AI will also use AWS AI-optimized AWS Trainium and AWS Inferentia to build and deploy its future foundation models on Amazon Bedrock, benefitting from the price, performance, scale, and security of AWS. Along with this announcement, starting today, customers can use Amazon Bedrock in the AWS Europe (Paris) Region. At launch, customers will have access to some of the latest models from Amazon, Anthropic, Cohere, and Mistral AI, expanding their options to support various use cases from text understanding to complex reasoning.

Mistral Large boasts exceptional language understanding and generation capabilities, which is ideal for complex tasks that require reasoning capabilities or ones that are highly specialized, such as synthetic text generation, code generation, Retrieval Augmented Generation (RAG), or agents. For example, customers can build AI agents capable of engaging in articulate conversations, generating nuanced content, and tackling complex reasoning tasks. The model’s strengths also extend to coding, with proficiency in code generation, review, and comments across mainstream coding languages. And Mistral Large’s exceptional multilingual performance, spanning French, German, Spanish, and Italian, in addition to English, presents a compelling opportunity for customers. By offering a model with robust multilingual support, AWS can better serve customers with diverse language needs, fostering global accessibility and inclusivity for generative AI solutions.

By integrating Mistral Large into Amazon Bedrock, we can offer customers an even broader range of top-performing LLMs to choose from. No single model is optimized for every use case, and to unlock the value of generative AI, customers need access to a variety of models to discover what works best based for their business needs. We are committed to continuously introducing the best models, providing customers with access to the latest and most innovative generative AI capabilities.

“We are excited to announce our collaboration with AWS to accelerate the adoption of our frontier AI technology with organizations around the world. Our mission is to make frontier AI ubiquitous, and to achieve this mission, we want to collaborate with the world’s leading cloud provider to distribute our top-tier models. We have a long and deep relationship with AWS and through strengthening this relationship today, we will be able to provide tailor-made AI to builders around the world.”

– Arthur Mensch, CEO at Mistral AI.

Customers appreciate choice
Since we first announced Amazon Bedrock, we have been innovating at a rapid clip—adding more powerful features like agents and guardrails. And we’ve said all along that more exciting innovations, including new models will keep coming. With more model choice, customers tell us they can achieve remarkable results:

“The ease of accessing different models from one API is one of the strengths of Bedrock. The model choices available have been exciting. As new models become available, our AI team is able to quickly and easily evaluate models to know if they fit our needs. The security and privacy that Bedrock provides makes it a great choice to use for our AI needs.”

– Jamie Caramanica, SVP, Engineering at CS Disco.

“Our top priority today is to help organizations use generative AI to support employees and enhance bots through a range of applications, such as stronger topic, sentiment, and tone detection from customer conversations, language translation, content creation and variation, knowledge optimization, answer highlighting, and auto summarization. To make it easier for them to tap into the potential of generative AI, we’re enabling our users with access to a variety of large language models, such as Genesys-developed models and multiple third-party foundational models through Amazon Bedrock, including Anthropic’s Claude, AI21 Labs’s Jurrassic-2, and Amazon Titan. Together with AWS, we’re offering customers exponential power to create differentiated experiences built around the needs of their business, while helping them prepare for the future.”

– Glenn Nethercutt, CTO at Genesys.

As the generative AI revolution continues to unfold, AWS is poised to shape its future, empowering customers across industries to drive innovation, streamline processes, and redefine how we interact with machines. Together with outstanding partners like Mistral AI, and with Amazon Bedrock as the foundation, our customers can build more innovative generative AI applications.

Democratizing access to LLMs and FMs
Amazon Bedrock is democratizing access to cutting-edge LLMs and FMs and AWS is the only cloud provider to offer the most popular and advanced FMs to customers. The collaboration with Mistral AI represents a significant milestone in this journey, further expanding Amazon Bedrock’s diverse model offerings and reinforcing our commitment to empowering customers with unparalleled choice through Amazon Bedrock. By recognizing that no single model can optimally serve every use case, AWS has paved the way for customers to unlock the full potential of generative AI. Through Amazon Bedrock, organizations can experiment with and take advantage of the unique strengths of multiple top-performing models, tailoring their solutions to specific needs, industry domains, and workloads. This unprecedented choice, combined with the robust security, privacy, and scalability of AWS, enables customers to harness the power of generative AI responsibly and with confidence, no matter their industry or regulatory constraints.
"""

body = json.dumps({
    "prompt": prompt,
    "max_tokens": 512,
    "top_p": 0.8,
    "temperature": 0.5,
})

modelId = "mistral.mistral-large-2402-v1:0"

accept = "application/json"
contentType = "application/json"

response = bedrock.invoke_model(
    body=body,
    modelId=modelId,
    accept=accept,
    contentType=contentType
)

print(json.loads(response.get('body').read()))

You can get JSON formatted output as like:

{ 
   "Summaries": [ 
      { 
         "Summary": "The author discusses their early experiences with programming and writing, 
starting with writing short stories and programming on an IBM 1401 in 9th grade. 
They then moved on to working with microcomputers, building their own from a Heathkit, 
and eventually convincing their father to buy a TRS-80 in 1980. They wrote simple games, 
a program to predict rocket flight trajectories, and a word processor.", 
         "Confidence": 0.9 
      }, 
      { 
         "Summary": "The author began college as a philosophy major, but found it to be unfulfilling 
and switched to AI. They were inspired by a novel and a PBS documentary, as well as the 
potential for AI to create intelligent machines like those in the novel. Despite this 
excitement, they eventually realized that the traditional approach to AI was flawed and 
shifted their focus to Lisp.", 
         "Confidence": 0.85 
      }, 
      { 
         "Summary": "The author briefly worked at Interleaf, where they found that their Lisp skills 
were highly valued. They eventually left Interleaf to return to RISD, but continued to work 
as a freelance Lisp hacker. While at RISD, they started painting still lives in their bedroom 
at night, which led to them applying to art schools and eventually attending the Accademia 
di Belli Arti in Florence.", 
         "Confidence": 0.9 
      } 
   ] 
}

To learn more prompting capabilities in Mistral AI models, visit Mistral AI documentation.

Now Available
Mistral Large, along with other Mistral AI models (Mistral 7B and Mixtral 8x7B), is available today on Amazon Bedrock in the US East (N. Virginia), US West (Oregon), and Europe (Paris) Regions; check the full Region list for future updates.

Share and learn with our generative AI community at community.aws. Give Mistral Large a try in the Amazon Bedrock console today and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

Read about our collaboration with Mistral AI and what it means for our customers.

– Veliswa.

Introducing AWS Deadline Cloud: Set up a cloud-based render farm in minutes

2024-04-02 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/introducing-aws-deadline-cloud-set-up-a-cloud-based-render-farm-in-minutes/

Customers in industries such as architecture, engineering, & construction (AEC) and media & entertainment (M&E) generate the final frames for film, TV, games, industrial design visualizations, and other digital media with a process called rendering, which takes 2D/3D digital content data and computes an output, such as an image or video file. Rendering also requires significant compute power, especially to generate 3D graphics and visual effects (VFX) with resolutions as high as 16K for films and TV. This constrains the number of rendering projects that customers can take on at once.

To address this growing demand for rendering high-resolution content, customers often build what are called “render farms” which combine the power of hundreds or thousands of computing nodes to process their rendering jobs. Render farms can traditionally take weeks or even months to build and deploy, and they require significant planning and upfront commitments to procure hardware.

As a result, customers increasingly are transitioning to scalable, cloud-based render farms for efficient production instead of a dedicated render farm on-premises, which can require extremely high fixed costs. But, rendering in the cloud still requires customers to manage their own infrastructure, build bespoke tooling to manage costs on a project-by-project basis, and monitor software licensing costs with their preferred partners themselves.

Today, we are announcing the general availability of AWS Deadline Cloud, a new fully managed service that enables creative teams to easily set up a render farm in minutes, scale to run more projects in parallel, and only pay for what resources they use. AWS Deadline Cloud provides a web-based portal with the ability to create and manage render farms, preview in-progress renders, view and analyze render logs, and easily track these costs.

With Deadline Cloud, you can go from zero to render faster with integrations of digital content creation (DCC) tools and customization tools are built-in. You can reduce the effort and development time required to tailor your rendering pipeline to the needs of each job. You also have the flexibility to use licenses you already own or they are provided by the service for third-party DCC software and renderers such as Maya, Nuke, and Houdini.

Concepts of AWS Deadline Cloud
AWS Deadline Cloud allows you to create and manage rendering projects and jobs on Amazon Elastic Compute Cloud (Amazon EC2) instances directly from DCC pipelines and workstations. You can create a rendering farm, a collection of queues, and fleets. A queue is where your submitted jobs are located and scheduled to be rendered. A fleet is a group of worker nodes that can support multiple queues. A queue can be processed by multiple fleets.

Before you can work on a project, you should have access to the required resources, and the associated farm must be integrated with AWS IAM Identity Center to manage workforce authentication and authorization. IT administrators can create and grant access permissions to users and groups at different levels, such as viewers, contributors, managers, or owners.

Here are four key components of Deadline Cloud:

Deadline Cloud monitor – You can access statuses, logs, and other troubleshooting metrics for jobs, steps, and tasks. The Deadline Cloud monitor provides real-time access and updates to job progress. It also provides access to logs and other troubleshooting metrics, and you can browse multiple farm, fleet, and queue listings to view system utilization.
Deadline Cloud submitter – You can submit a rendering job directly using AWS SDK or AWS Command Line Interface (AWS CLI). You can also submit from DCC software using a Deadline Cloud submitter, which is a DCC-integrated plugin that supports Open Job Description (OpenJD), an open source template specification. With it, artists can submit rendering jobs from a third-party DCC interface they are more familiar with, such as Maya or Nuke, to Deadline Cloud, where project resources are managed and jobs are monitored in one location.
Deadline Cloud budget manager – You can create and edit budgets to help manage project costs and view how many AWS resources are used and the estimated costs for those resources.
Deadline Cloud usage explorer – You can use the usage explorer to track approximate compute and licensing costs based on public pricing rates in Amazon EC2 and Usage-Based Licensing (UBL).

Get started with AWS Deadline Cloud
To get started with AWS Deadline Cloud, define and create a farm with Deadline Cloud monitor, download the Deadline Cloud submitter, and install plugins for your favorite DCC applications with just a few clicks. You can define your rendering jobs in your DCC application and submit them to your created farm within the plugin’s user interfaces.

The DCC plugins detect the necessary input scene data and build a job bundle that uploads to the Amazon Simple Storage Service (Amazon S3) bucket in your account, transfer to Deadline Cloud for rendering the job, and provide completed frames to the S3 bucket for your customers to access.

1. Define a farm with Deadline Cloud monitor
Let’s create your Deadline Cloud monitor infrastructure and define your farm first. In the Deadline Cloud console, choose Set up Deadline Cloud to define a farm with a guided experience, including queues and fleets, adding groups and users, choosing a service role, and adding tags to your resources.

In this step, to choose all the default settings for your Deadline Cloud resources, choose Skip to Review in Step 3 after monitor setup. Otherwise choose Next and customize your Deadline Cloud resources.

Set up your monitor’s infrastructure and enter your Monitor display name. This name makes the Monitor URL, a web portal to manage your farms, queues, fleets, and usages. You can’t change the monitor URL after you finish setting up. The AWS Region is the physical location of your rendering farm, so you should choose the closest Region from your studio to reduce the latency and improve data transfer speeds.

To access the monitor, you can create new users and groups and manage users (such as by assigning them groups, permissions, and applications) or delete users from your monitor. Users, groups, and permissions can also be managed in the IAM Identity Center. So, if you don’t set up the IAM Identity Center in your Region, you should enable it first. To learn more, visit Managing users in Deadline Cloud in the AWS documentation.

In Step 2, you can define farm details such as the name and description of your farm. In Additional farm settings, you can set an AWS Key Management Service (AWS KMS) key to encrypt your data and tags to assign AWS resources for filtering your resources or tracking your AWS costs. Your data is encrypted by default with a key that AWS owns and manages for you. To choose a different key, customize your encryption settings.

You can choose Skip to Review and Create to finish the quick setup process with the default settings.

Let’s look at more optional configurations! In the step for defining queue details, you can set up an S3 bucket for your queue. Job assets are uploaded as job attachments during the rendering process. Job attachments are stored in your defined S3 bucket. Additionally, you can set up the default budget action, service access roles, and environment variables for your queue.

In the step for defining fleet details, set the fleet name, description, Instance option (either Spot or On-Demand Instance), and Auto scaling configuration to define the number of instances and the fleet’s worker requirements. We set conservative worker requirements by default. These values can be updated at any time after setting up your render farm. To learn more, visit Manage Deadline Cloud fleets in the AWS documentation.

Worker instances define EC2 instance types with vCPUs and memory size, for example, c5.large, c5a.large, and c6i.large. You can filter up to 100 EC2 instance types by either allowing or excluding types of worker instances.

Review all of the information entered to create your farm and choose Create farm.

The progress of your Deadline Cloud onboarding is displayed, and a success message displays when your monitor and farm are ready for use. To learn more details about the process, visit Set up a Deadline Cloud monitor in the AWS documentation.

In the Dashboard in the left pane, you can visit the overview of the monitor, farms, users, and groups that you created.

Choose Monitor to visit a web portal to manage your farms, queues, fleets, usages, and budgets. After signing in to your user account, you can enter a web portal and explore the Deadline Cloud resources you created. You can also download a Deadline Cloud monitor desktop application with the same user experiences from the Downloads page.

To learn more about using the monitor, visit Using the Deadline Cloud monitor in the AWS documentation.

2. Set up a workstation and submit your render job to Deadline Cloud
Let’s set up a workstation for artists on their desktops by installing the Deadline Cloud submitter application so they can easily submit render jobs from within Maya, Nuke, and Houdini. Choose Downloads in the left menu pane and download the proper submitter installer for your operating system to test your render farm.

This program installs the latest integrated plugin for Deadline Cloud submitter for Maya, Nuke, and Houdini.

For example, open a Maya on your desktop and your asset. I have an example of a wrench file I’m going to test with. Choose Windows in the menu bar and Settings/Preferences in the sub menu. In the Plugin Manager, search for DeadlineCloudSubmitter. Select Loaded to load the Deadline Cloud submitter plugin.

If you are not already authenticated in the Deadline Cloud submitter, the Deadline Cloud Status tab will display. Choose Login and sign in with your user credentials in a browser sign-in window.

Now, select the Deadline Cloud shelf, then choose the orange deadline cloud logo on the ‘Deadline’ shelf to launch the submitter. From the submitter window, choose the farm and queue you want your render submitted to. If desired, in the Scene Settings tab, you can override the frame range, change the Output Path, or both.

If you choose Submit, the wrench turntable Maya file, along with all of the necessary textures and alembic caches, will be uploaded to Deadline Cloud and rendered on the farm. You can monitor rendering jobs in your Deadline Cloud monitor.

When your render is finished, as indicated by the Succeeded status in the job monitor, choose the job, Job Actions, and Download Output. To learn more about scheduling and monitoring jobs, visit Deadline Cloud jobs in the AWS documentation.

View your rendered image with an image viewing application such as DJView. The image will look like this:

To learn more in detail about the developer-side setup process using the command line, visit Setting up a developer workstation for Deadline Cloud in the AWS documentation.

3. Managing budgets and usage for Deadline Cloud
To help you manage costs for Deadline Cloud, you can use a budget manager to create and edit budgets. You can also use a usage explorer to view how many AWS resources are used and the estimated costs for those resources.

Choose Budgets on the Deadline Cloud monitor page to create your budget for your farm.

You can create budget amounts and limits and set automated actions to help reduce or stop additional spend against the budget.

Choose Usage in the Deadline Cloud monitor page to find real-time metrics on the activity happening on each farm. You can look at the farm’s costs by different variables, such as queue, job, or user. Choose various time frames to find usage during a specific period and look at usage trends over time.

The costs displayed in the usage explorer are approximate. Use them as a guide for managing your resources. There may be other costs from using other connected AWS resources, such as Amazon S3, Amazon CloudWatch, and other services that are not accounted for in the usage explorer.

To learn more, visit Managing budgets and usage for Deadline Cloud in the AWS documentation.

Now available
AWS Deadline Cloud is now available in US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), and Europe (Ireland) Regions.

Give AWS Deadline Cloud a try in the Deadline Cloud console. For more information, visit the Deadline Cloud product page, Deadline Cloud User Guide in the AWS documentation, and send feedback to AWS re:Post for AWS Deadline Cloud or through your usual AWS support contacts.

— Channy

Amazon GuardDuty EC2 Runtime Monitoring is now generally available

2024-03-29 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/amazon-guardduty-ec2-runtime-monitoring-is-now-generally-available/

Amazon GuardDuty is a machine learning (ML)-based security monitoring and intelligent threat detection service that analyzes and processes various AWS data sources, continuously monitors your AWS accounts and workloads for malicious activity, and delivers detailed security findings for visibility and remediation.

I love the feature of GuardDuty Runtime Monitoring that analyzes operating system (OS)-level, network, and file events to detect potential runtime threats for specific AWS workloads in your environment. I first introduced the general availability of this feature for Amazon Elastic Kubernetes Service (Amazon EKS) resources in March 2023. Seb wrote about the expansion of the Runtime Monitoring feature to provide threat detection for Amazon Elastic Container Service (Amazon ECS) and AWS Fargate as well as the preview for Amazon Elastic Compute Cloud (Amazon EC2) workloads in Nov 2023.

Today, we are announcing the general availability of Amazon GuardDuty EC2 Runtime Monitoring to expand threat detection coverage for EC2 instances at runtime and complement the anomaly detection that GuardDuty already provides by continuously monitoring VPC Flow Logs, DNS query logs, and AWS CloudTrail management events. You now have visibility into on-host, OS-level activities and container-level context into detected threats.

With GuardDuty EC2 Runtime Monitoring, you can identify and respond to potential threats that might target the compute resources within your EC2 workloads. Threats to EC2 workloads often involve remote code execution that leads to the download and execution of malware. This could include instances or self-managed containers in your AWS environment that are connecting to IP addresses associated with cryptocurrency-related activity or to malware command-and-control related IP addresses.

GuardDuty Runtime Monitoring provides visibility into suspicious commands that involve malicious file downloads and execution across each step, which can help you discover threats during initial compromise and before they become business-impacting events. You can also centrally enable runtime threat detection coverage for accounts and workloads across the organization using AWS Organizations to simplify your security coverage.

Configure EC2 Runtime Monitoring in GuardDuty
With a few clicks, you can enable GuardDuty EC2 Runtime Monitoring in the GuardDuty console. For your first use, you need to enable Runtime Monitoring.

Any customers that are new to the EC2 Runtime Monitoring feature can try it for free for 30 days and gain access to all features and detection findings. The GuardDuty console shows how many days are left in the free trial.

Now, you can set up the GuardDuty security agent for the individual EC2 instances for which you want to monitor the runtime behavior. You can choose to deploy the GuardDuty security agent either automatically or manually. At GA, you can enable Automated agent configuration, which is a preferred option for most customers as it allows GuardDuty to manage the security agent on their behalf.

The agent will be deployed on EC2 instances with AWS Systems Manager and uses an Amazon Virtual Private Cloud (Amazon VPC) endpoint to receive the runtime events associated with your resource. If you want to manage the GuardDuty security agent manually, visit Managing the security agent Amazon EC2 instance manually in the AWS documentation. In multiple-account environments, delegated GuardDuty administrator accounts manage their member accounts using AWS Organizations. For more information, visit Managing multiple accounts in the AWS documentation.

When you enable EC2 Runtime Monitoring, you can find the covered EC2 instances list, account ID, and coverage status, and whether the agent is able to receive runtime events from the corresponding resource in the EC2 instance runtime coverage tab.

Even when the coverage status is Unhealthy, meaning it is not currently able to receive runtime findings, you still have defense in depth for your EC2 instance. GuardDuty continues to provide threat detection to the EC2 instance by monitoring CloudTrail, VPC flow, and DNS logs associated with it.

Check out GuardDuty EC2 Runtime security findings
When GuardDuty detects a potential threat and generates security findings, you can view the details of the healthy information.

Choose Findings in the left pane if you want to find security findings specific to Amazon EC2 resources. You can use the filter bar to filter the findings table by specific criteria, such as a Resource type of Instance. The severity and details of the findings differ based on the resource role, which indicates whether the EC2 resource was the target of suspicious activity or the actor performing the activity.

With today’s launch, we support over 30 runtime security findings for EC2 instances, such as detecting abused domains, backdoors, cryptocurrency-related activity, and unauthorized communications. For the full list, visit Runtime Monitoring finding types in the AWS documentation.

Resolve your EC2 security findings
Choose each EC2 security finding to know more details. You can find all the information associated with the finding and examine the resource in question to determine if it is behaving in an expected manner.

If the activity is authorized, you can use suppression rules or trusted IP lists to prevent false positive notifications for that resource. If the activity is unexpected, the security best practice is to assume the instance has been compromised and take the actions detailed in Remediating a potentially compromised Amazon EC2 instance in the AWS documentation.

You can integrate GuardDuty EC2 Runtime Monitoring with other AWS security services, such as AWS Security Hub or Amazon Detective. Or you can use Amazon EventBridge, allowing you to use integrations with security event management or workflow systems, such as Splunk, Jira, and ServiceNow, or trigger automated and semi-automated responses such as isolating a workload for investigation.

When you choose Investigate with Detective, you can find Detective-created visualizations for AWS resources to quickly and easily investigate security issues. To learn more, visit Integration with Amazon Detective in the AWS documentation.

Things to know
GuardDuty EC2 Runtime Monitoring support is now available for EC2 instances running Amazon Linux 2 or Amazon Linux 2023. You have the option to configure maximum CPU and memory limits for the agent. To learn more and for future updates, visit Prerequisites for Amazon EC2 instance support in the AWS documentation.

To estimate the daily average usage costs for GuardDuty, choose Usage in the left pane. During the 30-day free trial period, you can estimate what your costs will be after the trial period. At the end of the trial period, we charge you per vCPU hours tracked monthly for the monitoring agents. To learn more, visit the Amazon GuardDuty pricing page.

Enabling EC2 Runtime Monitoring also allows for a cost-saving opportunity on your GuardDuty cost. When the feature is enabled, you won’t be charged for GuardDuty foundational protection VPC Flow Logs sourced from the EC2 instances running the security agent. This is due to similar, but more contextual, network data available from the security agent. Additionally, GuardDuty would still process VPC Flow Logs and generate relevant findings so you will continue to get network-level security coverage even if the agent experiences downtime.

Now available
Amazon GuardDuty EC2 Runtime Monitoring is now available in all AWS Regions where GuardDuty is available, excluding AWS GovCloud (US) Regions and AWS China Regions. For a full list of Regions where EC2 Runtime Monitoring is available, visit Region-specific feature availability.

Give GuardDuty EC2 Runtime Monitoring a try in the GuardDuty console. For more information, visit the Amazon GuardDuty User Guide and send feedback to AWS re:Post for Amazon GuardDuty or through your usual AWS support contacts.

— Channy

Run large-scale simulations with AWS Batch multi-container jobs

2024-03-25 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/run-large-scale-simulations-with-aws-batch-multi-container-jobs/

Industries like automotive, robotics, and finance are increasingly implementing computational workloads like simulations, machine learning (ML) model training, and big data analytics to improve their products. For example, automakers rely on simulations to test autonomous driving features, robotics companies train ML algorithms to enhance robot perception capabilities, and financial firms run in-depth analyses to better manage risk, process transactions, and detect fraud.

Some of these workloads, including simulations, are especially complicated to run due to their diversity of components and intensive computational requirements. A driving simulation, for instance, involves generating 3D virtual environments, vehicle sensor data, vehicle dynamics controlling car behavior, and more. A robotics simulation might test hundreds of autonomous delivery robots interacting with each other and other systems in a massive warehouse environment.

AWS Batch is a fully managed service that can help you run batch workloads across a range of AWS compute offerings, including Amazon Elastic Container Service (Amazon ECS), Amazon Elastic Kubernetes Service (Amazon EKS), AWS Fargate, and Amazon EC2 Spot or On-Demand Instances. Traditionally, AWS Batch only allowed single-container jobs and required extra steps to merge all components into a monolithic container. It also did not allow using separate “sidecar” containers, which are auxiliary containers that complement the main application by providing additional services like data logging. This additional effort required coordination across multiple teams, such as software development, IT operations, and quality assurance (QA), because any code change meant rebuilding the entire container.

Now, AWS Batch offers multi-container jobs, making it easier and faster to run large-scale simulations in areas like autonomous vehicles and robotics. These workloads are usually divided between the simulation itself and the system under test (also known as an agent) that interacts with the simulation. These two components are often developed and optimized by different teams. With the ability to run multiple containers per job, you get the advanced scaling, scheduling, and cost optimization offered by AWS Batch, and you can use modular containers representing different components like 3D environments, robot sensors, or monitoring sidecars. In fact, customers such as IPG Automotive, MORAI, and Robotec.ai are already using AWS Batch multi-container jobs to run their simulation software in the cloud.

Let’s see how this works in practice using a simplified example and have some fun trying to solve a maze.

Building a Simulation Running on Containers
In production, you will probably use existing simulation software. For this post, I built a simplified version of an agent/model simulation. If you’re not interested in code details, you can skip this section and go straight to how to configure AWS Batch.

For this simulation, the world to explore is a randomly generated 2D maze. The agent has the task to explore the maze to find a key and then reach the exit. In a way, it is a classic example of pathfinding problems with three locations.

Here’s a sample map of a maze where I highlighted the start (S), end (E), and key (K) locations.

The separation of agent and model into two separate containers allows different teams to work on each of them separately. Each team can focus on improving their own part, for example, to add details to the simulation or to find better strategies for how the agent explores the maze.

Here’s the code of the maze model (app.py). I used Python for both examples. The model exposes a REST API that the agent can use to move around the maze and know if it has found the key and reached the exit. The maze model uses Flask for the REST API.

import json
import random
from flask import Flask, request, Response

ready = False

# How map data is stored inside a maze
# with size (width x height) = (4 x 3)
#
#    012345678
# 0: +-+-+ +-+
# 1: | |   | |
# 2: +-+ +-+-+
# 3: | |   | |
# 4: +-+-+ +-+
# 5: | | | | |
# 6: +-+-+-+-+
# 7: Not used

class WrongDirection(Exception):
    pass

class Maze:
    UP, RIGHT, DOWN, LEFT = 0, 1, 2, 3
    OPEN, WALL = 0, 1
    

    @staticmethod
    def distance(p1, p2):
        (x1, y1) = p1
        (x2, y2) = p2
        return abs(y2-y1) + abs(x2-x1)


    @staticmethod
    def random_dir():
        return random.randrange(4)


    @staticmethod
    def go_dir(x, y, d):
        if d == Maze.UP:
            return (x, y - 1)
        elif d == Maze.RIGHT:
            return (x + 1, y)
        elif d == Maze.DOWN:
            return (x, y + 1)
        elif d == Maze.LEFT:
            return (x - 1, y)
        else:
            raise WrongDirection(f"Direction: {d}")


    def __init__(self, width, height):
        self.width = width
        self.height = height        
        self.generate()
        

    def area(self):
        return self.width * self.height
        

    def min_lenght(self):
        return self.area() / 5
    

    def min_distance(self):
        return (self.width + self.height) / 5
    

    def get_pos_dir(self, x, y, d):
        if d == Maze.UP:
            return self.maze[y][2 * x + 1]
        elif d == Maze.RIGHT:
            return self.maze[y][2 * x + 2]
        elif d == Maze.DOWN:
            return self.maze[y + 1][2 * x + 1]
        elif d ==  Maze.LEFT:
            return self.maze[y][2 * x]
        else:
            raise WrongDirection(f"Direction: {d}")


    def set_pos_dir(self, x, y, d, v):
        if d == Maze.UP:
            self.maze[y][2 * x + 1] = v
        elif d == Maze.RIGHT:
            self.maze[y][2 * x + 2] = v
        elif d == Maze.DOWN:
            self.maze[y + 1][2 * x + 1] = v
        elif d ==  Maze.LEFT:
            self.maze[y][2 * x] = v
        else:
            WrongDirection(f"Direction: {d}  Value: {v}")


    def is_inside(self, x, y):
        return 0 <= y < self.height and 0 <= x < self.width


    def generate(self):
        self.maze = []
        # Close all borders
        for y in range(0, self.height + 1):
            self.maze.append([Maze.WALL] * (2 * self.width + 1))
        # Get a random starting point on one of the borders
        if random.random() < 0.5:
            sx = random.randrange(self.width)
            if random.random() < 0.5:
                sy = 0
                self.set_pos_dir(sx, sy, Maze.UP, Maze.OPEN)
            else:
                sy = self.height - 1
                self.set_pos_dir(sx, sy, Maze.DOWN, Maze.OPEN)
        else:
            sy = random.randrange(self.height)
            if random.random() < 0.5:
                sx = 0
                self.set_pos_dir(sx, sy, Maze.LEFT, Maze.OPEN)
            else:
                sx = self.width - 1
                self.set_pos_dir(sx, sy, Maze.RIGHT, Maze.OPEN)
        self.start = (sx, sy)
        been = [self.start]
        pos = -1
        solved = False
        generate_status = 0
        old_generate_status = 0                    
        while len(been) < self.area():
            (x, y) = been[pos]
            sd = Maze.random_dir()
            for nd in range(4):
                d = (sd + nd) % 4
                if self.get_pos_dir(x, y, d) != Maze.WALL:
                    continue
                (nx, ny) = Maze.go_dir(x, y, d)
                if (nx, ny) in been:
                    continue
                if self.is_inside(nx, ny):
                    self.set_pos_dir(x, y, d, Maze.OPEN)
                    been.append((nx, ny))
                    pos = -1
                    generate_status = len(been) / self.area()
                    if generate_status - old_generate_status > 0.1:
                        old_generate_status = generate_status
                        print(f"{generate_status * 100:.2f}%")
                    break
                elif solved or len(been) < self.min_lenght():
                    continue
                else:
                    self.set_pos_dir(x, y, d, Maze.OPEN)
                    self.end = (x, y)
                    solved = True
                    pos = -1 - random.randrange(len(been))
                    break
            else:
                pos -= 1
                if pos < -len(been):
                    pos = -1
                    
        self.key = None
        while(self.key == None):
            kx = random.randrange(self.width)
            ky = random.randrange(self.height)
            if (Maze.distance(self.start, (kx,ky)) > self.min_distance()
                and Maze.distance(self.end, (kx,ky)) > self.min_distance()):
                self.key = (kx, ky)


    def get_label(self, x, y):
        if (x, y) == self.start:
            c = 'S'
        elif (x, y) == self.end:
            c = 'E'
        elif (x, y) == self.key:
            c = 'K'
        else:
            c = ' '
        return c

                    
    def map(self, moves=[]):
        map = ''
        for py in range(self.height * 2 + 1):
            row = ''
            for px in range(self.width * 2 + 1):
                x = int(px / 2)
                y = int(py / 2)
                if py % 2 == 0: #Even rows
                    if px % 2 == 0:
                        c = '+'
                    else:
                        v = self.get_pos_dir(x, y, self.UP)
                        if v == Maze.OPEN:
                            c = ' '
                        elif v == Maze.WALL:
                            c = '-'
                else: # Odd rows
                    if px % 2 == 0:
                        v = self.get_pos_dir(x, y, self.LEFT)
                        if v == Maze.OPEN:
                            c = ' '
                        elif v == Maze.WALL:
                            c = '|'
                    else:
                        c = self.get_label(x, y)
                        if c == ' ' and [x, y] in moves:
                            c = '*'
                row += c
            map += row + '\n'
        return map


app = Flask(__name__)

@app.route('/')
def hello_maze():
    return "<p>Hello, Maze!</p>"

@app.route('/maze/map', methods=['GET', 'POST'])
def maze_map():
    if not ready:
        return Response(status=503, retry_after=10)
    if request.method == 'GET':
        return '<pre>' + maze.map() + '</pre>'
    else:
        moves = request.get_json()
        return maze.map(moves)

@app.route('/maze/start')
def maze_start():
    if not ready:
        return Response(status=503, retry_after=10)
    start = { 'x': maze.start[0], 'y': maze.start[1] }
    return json.dumps(start)

@app.route('/maze/size')
def maze_size():
    if not ready:
        return Response(status=503, retry_after=10)
    size = { 'width': maze.width, 'height': maze.height }
    return json.dumps(size)

@app.route('/maze/pos/<int:y>/<int:x>')
def maze_pos(y, x):
    if not ready:
        return Response(status=503, retry_after=10)
    pos = {
        'here': maze.get_label(x, y),
        'up': maze.get_pos_dir(x, y, Maze.UP),
        'down': maze.get_pos_dir(x, y, Maze.DOWN),
        'left': maze.get_pos_dir(x, y, Maze.LEFT),
        'right': maze.get_pos_dir(x, y, Maze.RIGHT),

    }
    return json.dumps(pos)


WIDTH = 80
HEIGHT = 20
maze = Maze(WIDTH, HEIGHT)
ready = True

The only requirement for the maze model (in requirements.txt) is the Flask module.

To create a container image running the maze model, I use this Dockerfile.

FROM --platform=linux/amd64 public.ecr.aws/docker/library/python:3.12-alpine

WORKDIR /app

COPY requirements.txt requirements.txt
RUN pip3 install -r requirements.txt

COPY . .

CMD [ "python3", "-m" , "flask", "run", "--host=0.0.0.0", "--port=5555"]

Here’s the code for the agent (agent.py). First, the agent asks the model for the size of the maze and the starting position. Then, it applies its own strategy to explore and solve the maze. In this implementation, the agent chooses its route at random, trying to avoid following the same path more than once.

import random
import requests
from requests.adapters import HTTPAdapter, Retry

HOST = '127.0.0.1'
PORT = 5555

BASE_URL = f"http://{HOST}:{PORT}/maze"

UP, RIGHT, DOWN, LEFT = 0, 1, 2, 3
OPEN, WALL = 0, 1

s = requests.Session()

retries = Retry(total=10,
                backoff_factor=1)

s.mount('http://', HTTPAdapter(max_retries=retries))

r = s.get(f"{BASE_URL}/size")
size = r.json()
print('SIZE', size)

r = s.get(f"{BASE_URL}/start")
start = r.json()
print('START', start)

y = start['y']
x = start['x']

found_key = False
been = set((x, y))
moves = [(x, y)]
moves_stack = [(x, y)]

while True:
    r = s.get(f"{BASE_URL}/pos/{y}/{x}")
    pos = r.json()
    if pos['here'] == 'K' and not found_key:
        print(f"({x}, {y}) key found")
        found_key = True
        been = set((x, y))
        moves_stack = [(x, y)]
    if pos['here'] == 'E' and found_key:
        print(f"({x}, {y}) exit")
        break
    dirs = list(range(4))
    random.shuffle(dirs)
    for d in dirs:
        nx, ny = x, y
        if d == UP and pos['up'] == 0:
            ny -= 1
        if d == RIGHT and pos['right'] == 0:
            nx += 1
        if d == DOWN and pos['down'] == 0:
            ny += 1
        if d == LEFT and pos['left'] == 0:
            nx -= 1 

        if nx < 0 or nx >= size['width'] or ny < 0 or ny >= size['height']:
            continue

        if (nx, ny) in been:
            continue

        x, y = nx, ny
        been.add((x, y))
        moves.append((x, y))
        moves_stack.append((x, y))
        break
    else:
        if len(moves_stack) > 0:
            x, y = moves_stack.pop()
        else:
            print("No moves left")
            break

print(f"Solution length: {len(moves)}")
print(moves)

r = s.post(f'{BASE_URL}/map', json=moves)

print(r.text)

s.close()

The only dependency of the agent (in requirements.txt) is the requests module.

This is the Dockerfile I use to create a container image for the agent.

FROM --platform=linux/amd64 public.ecr.aws/docker/library/python:3.12-alpine

WORKDIR /app

COPY requirements.txt requirements.txt
RUN pip3 install -r requirements.txt

COPY . .

CMD [ "python3", "agent.py"]

You can easily run this simplified version of a simulation locally, but the cloud allows you to run it at larger scale (for example, with a much bigger and more detailed maze) and to test multiple agents to find the best strategy to use. In a real-world scenario, the improvements to the agent would then be implemented into a physical device such as a self-driving car or a robot vacuum cleaner.

Running a simulation using multi-container jobs
To run a job with AWS Batch, I need to configure three resources:

The compute environment in which to run the job
The job queue in which to submit the job
The job definition describing how to run the job, including the container images to use

In the AWS Batch console, I choose Compute environments from the navigation pane and then Create. Now, I have the choice of using Fargate, Amazon EC2, or Amazon EKS. Fargate allows me to closely match the resource requirements that I specify in the job definitions. However, simulations usually require access to a large but static amount of resources and use GPUs to accelerate computations. For this reason, I select Amazon EC2.

I select the Managed orchestration type so that AWS Batch can scale and configure the EC2 instances for me. Then, I enter a name for the compute environment and select the service-linked role (that AWS Batch created for me previously) and the instance role that is used by the ECS container agent (running on the EC2 instances) to make calls to the AWS API on my behalf. I choose Next.

In the Instance configuration settings, I choose the size and type of the EC2 instances. For example, I can select instance types that have GPUs or use the Graviton processor. I do not have specific requirements and leave all the settings to their default values. For Network configuration, the console already selected my default VPC and the default security group. In the final step, I review all configurations and complete the creation of the compute environment.

Now, I choose Job queues from the navigation pane and then Create. Then, I select the same orchestration type I used for the compute environment (Amazon EC2). In the Job queue configuration, I enter a name for the job queue. In the Connected compute environments dropdown, I select the compute environment I just created and complete the creation of the queue.

I choose Job definitions from the navigation pane and then Create. As before, I select Amazon EC2 for the orchestration type.

To use more than one container, I disable the Use legacy containerProperties structure option and move to the next step. By default, the console creates a legacy single-container job definition if there’s already a legacy job definition in the account. That’s my case. For accounts without legacy job definitions, the console has this option disabled.

I enter a name for the job definition. Then, I have to think about which permissions this job requires. The container images I want to use for this job are stored in Amazon ECR private repositories. To allow AWS Batch to download these images to the compute environment, in the Task properties section, I select an Execution role that gives read-only access to the ECR repositories. I don’t need to configure a Task role because the simulation code is not calling AWS APIs. For example, if my code was uploading results to an Amazon Simple Storage Service (Amazon S3) bucket, I could select here a role giving permissions to do so.

In the next step, I configure the two containers used by this job. The first one is the maze-model. I enter the name and the image location. Here, I can specify the resource requirements of the container in terms of vCPUs, memory, and GPUs. This is similar to configuring containers for an ECS task.

I add a second container for the agent and enter name, image location, and resource requirements as before. Because the agent needs to access the maze as soon as it starts, I use the Dependencies section to add a container dependency. I select maze-model for the container name and START as the condition. If I don’t add this dependency, the agent container can fail before the maze-model container is running and able to respond. Because both containers are flagged as essential in this job definition, the overall job would terminate with a failure.

I review all configurations and complete the job definition. Now, I can start a job.

In the Jobs section of the navigation pane, I submit a new job. I enter a name and select the job queue and the job definition I just created.

In the next steps, I don’t need to override any configuration and create the job. After a few minutes, the job has succeeded, and I have access to the logs of the two containers.

The agent solved the maze, and I can get all the details from the logs. Here’s the output of the job to see how the agent started, picked up the key, and then found the exit.

SIZE {'width': 80, 'height': 20}
START {'x': 0, 'y': 18}
(32, 2) key found
(79, 16) exit
Solution length: 437
[(0, 18), (1, 18), (0, 18), ..., (79, 14), (79, 15), (79, 16)]

In the map, the red asterisks (*) follow the path used by the agent between the start (S), key (K), and exit (E) locations.

Increasing observability with a sidecar container
When running complex jobs using multiple components, it helps to have more visibility into what these components are doing. For example, if there is an error or a performance problem, this information can help you find where and what the issue is.

To instrument my application, I use AWS Distro for OpenTelemetry:

First, I update the agent and maze-model containers to use Python auto-instrumentation as described in this article. There are similar tutorials for other platforms, such as Go, Java, and JavaScript. To get information specific to my application, I have the option to manually instrument the code.
Then, I add a sidecar container using the AWS Distro for OpenTelemetry Collector from the Amazon ECR Public Gallery. The collector container will receive telemetry data from the other containers and send traces to AWS X-Ray and metrics to Amazon CloudWatch, Amazon Managed Service for Prometheus, or self-managed Prometheus. OpenTelemetry makes it easier to share data with multiple monitoring and observability platforms.

Using telemetry data collected in this way, I can set up dashboards (for example, using CloudWatch or Amazon Managed Grafana) and alarms (with CloudWatch or Prometheus) that help me better understand what is happening and reduce the time to solve an issue. More generally, a sidecar container can help integrate telemetry data from AWS Batch jobs with your monitoring and observability platforms.

Things to know
AWS Batch support for multi-container jobs is available today in the AWS Management Console, AWS Command Line Interface (AWS CLI), and AWS SDKs in all AWS Regions where Batch is offered. For more information, see the AWS Services by Region list.

There is no additional cost for using multi-container jobs with AWS Batch. In fact, there is no additional charge for using AWS Batch. You only pay for the AWS resources you create to store and run your application, such as EC2 instances and Fargate containers. To optimize your costs, you can use Reserved Instances, Savings Plan, EC2 Spot Instances, and Fargate in your compute environments.

Using multi-container jobs accelerates development times by reducing job preparation efforts and eliminates the need for custom tooling to merge the work of multiple teams into a single container. It also simplifies DevOps by defining clear component responsibilities so that teams can quickly identify and fix issues in their own areas of expertise without distraction.

To learn more, see how to set up multi-container jobs in the AWS Batch User Guide.

— Danilo

Run and manage open source InfluxDB databases with Amazon Timestream

2024-03-15 Donnie Prakoso

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/run-and-manage-open-source-influxdb-databases-with-amazon-timestream/

Starting today, you can use InfluxDB as a database engine in Amazon Timestream. This support makes it easy for you to run near real-time time-series applications using InfluxDB and open source APIs, including open source Telegraf agents that collect time-series observations.

Now you have two database engines to choose in Timestream: Timestream for LiveAnalytics and Timestream for InfluxDB.

You should use the Timestream for InfluxDB engine if your use cases require near real-time time-series queries or specific features in InfluxDB, such as using Flux queries. Another option is the existing Timestream for LiveAnalytics engine, which is suitable if you need to ingest more than tens of gigabytes of time-series data per minute and run SQL queries on petabytes of time-series data in seconds.

With InfluxDB support in Timestream, you can use a managed instance that is automatically configured for optimal performance and availability. Furthermore, you can increase resiliency by configuring multi-Availability Zone support for your InfluxDB databases.

Timestream for InfluxDB and Timestream for LiveAnalytics complement each other for low-latency and large-scale ingestion of time-series data.

Getting started with Timestream for InfluxDB
Let me show you how to get started.

First, I create an InfluxDB instance. I navigate to the Timestream console, go to InfluxDB databases in Timestream for InfluxDB and select Create Influx database.

On the next page, I specify the database credentials for the InfluxDB instance.

I also specify my instance class in Instance configuration and the storage type and volume to suit my needs.

In the next part, I can choose a multi-AZ deployment, which synchronously replicates data to a standby database in a different Availability Zone or just a single instance of InfluxDB. In the multi-AZ deployment, if a failure is detected, Timestream for InfluxDB will automatically fail over to the standby instance without data loss.

Then, I configure how to connect to my InfluxDB instance in Connectivity configuration. Here, I have the flexibility to define network type, virtual private cloud (VPC), subnets, and database port. I also have the flexibility to configure my InfluxDB instance to be publicly accessible by specifying public subnets and set the public access to Publicly Accessible, allowing Amazon Timestream will assign a public IP address to my InfluxDB instance. If you choose this option, make sure that you have proper security measures to protect your InfluxDB instances.

In this demo, I set my InfluxDB instance as Not publicly accessible, which also means I can only access it through the VPC and subnets I defined in this section.

Once I configure my database connectivity, I can define the database parameter group and the log delivery settings. In Parameter group, I can define specific configurable parameters that I want to use for my InfluxDB database. In the log delivery settings, I also can define which Amazon Simple Storage Service (Amazon S3) bucket I have to export the system logs. To learn more about the required AWS Identity and Access Management (IAM) policy for the Amazon S3 bucket, visit this page.

Once I’m happy with the configuration, I select Create Influx database.

Once my InfluxDB instance is created, I can see more information on the detail page.

With the InfluxDB instance created, I can also access the InfluxDB user interface (UI). If I configure my InfluxDB as publicly accessible, I can access the UI using the console by selecting InfluxDB UI. As shown on the setup, I configured my InfluxDB instance as not publicly accessible. In this case, I need to access the InfluxDB UI with SSH tunneling through an Amazon Elastic Compute Cloud (Amazon EC2) instance within the same VPC as my InfluxDB instance.

With the URL endpoint from the detail page, I navigate to the InfluxDB UI and use the username and password I configured in the creation process.

With access to the InfluxDB UI, I can now create a token to interact with my InfluxDB instance.

I can also use the Influx command line interface (CLI) to create a token. Before I can create the token, I create a configuration to interact with my InfluxDB instance. The following is the sample command to create a configuration:

influx config create --config-name demo  \
    --host-url https://<TIMESTREAM for INFLUX DB ENDPOINT> \
   --org demo-org  
   --username-password [USERNAME] \
   --active

With the InfluxDB configuration created, I can now create an operator, all-access or read/write token. The following is an example for creating an all-access token to grant permissions to all resources in the organization that I defined:

influx auth create --org demo-org --all-access

With the required token for my use case, I can use various tools, such as the Influx CLI, Telegraf agent, and InfluxDB client libraries, to start ingesting data into my InfluxDB instance. Here, I’m using the Influx CLI to write sample home sensor data in the line protocol format, which you can also get from the InfluxDB documentation page.

influx write \
  --bucket demo-bucket \
  --precision s "
home,room=Living\ Room temp=21.1,hum=35.9,co=0i 1641024000
home,room=Kitchen temp=21.0,hum=35.9,co=0i 1641024000
home,room=Living\ Room temp=21.4,hum=35.9,co=0i 1641027600
home,room=Kitchen temp=23.0,hum=36.2,co=0i 1641027600
home,room=Living\ Room temp=21.8,hum=36.0,co=0i 1641031200
home,room=Kitchen temp=22.7,hum=36.1,co=0i 1641031200
home,room=Living\ Room temp=22.2,hum=36.0,co=0i 1641034800
home,room=Kitchen temp=22.4,hum=36.0,co=0i 1641034800
home,room=Living\ Room temp=22.2,hum=35.9,co=0i 1641038400
home,room=Kitchen temp=22.5,hum=36.0,co=0i 1641038400
home,room=Living\ Room temp=22.4,hum=36.0,co=0i 1641042000
home,room=Kitchen temp=22.8,hum=36.5,co=1i 1641042000
home,room=Living\ Room temp=22.3,hum=36.1,co=0i 1641045600
home,room=Kitchen temp=22.8,hum=36.3,co=1i 1641045600
home,room=Living\ Room temp=22.3,hum=36.1,co=1i 1641049200
home,room=Kitchen temp=22.7,hum=36.2,co=3i 1641049200
home,room=Living\ Room temp=22.4,hum=36.0,co=4i 1641052800
home,room=Kitchen temp=22.4,hum=36.0,co=7i 1641052800
home,room=Living\ Room temp=22.6,hum=35.9,co=5i 1641056400
home,room=Kitchen temp=22.7,hum=36.0,co=9i 1641056400
home,room=Living\ Room temp=22.8,hum=36.2,co=9i 1641060000
home,room=Kitchen temp=23.3,hum=36.9,co=18i 1641060000
home,room=Living\ Room temp=22.5,hum=36.3,co=14i 1641063600
home,room=Kitchen temp=23.1,hum=36.6,co=22i 1641063600
home,room=Living\ Room temp=22.2,hum=36.4,co=17i 1641067200
home,room=Kitchen temp=22.7,hum=36.5,co=26i 1641067200
"

Finally, I can query the data using the InfluxDB UI. I navigate to the Data Explorer page in the InfluxDB UI, create a simple Flux script, and select Submit.

Timestream for InﬂuxDB makes it easier for you to develop applications using InfluxDB, while continuing to use your existing tools to interact with the database. With the multi-AZ configuration, you can increase the availability of your InfluxDB data without worrying about the underlying infrastructure.

AWS and InfluxDB partnership
Celebrating this launch, here’s what Paul Dix, Founder and Chief Technology Officer at InfluxData, said about this partnership:

“The future of open source is powered by the public cloud—reaching the broadest community through simple entry points and practical user experience. Amazon Timestream for InfluxDB delivers on that vision. Our partnership with AWS turns InfluxDB open source into a force multiplier for real-time insights on time-series data, making it easier than ever for developers to build and scale their time-series workloads on AWS.”

Things to know
Here are some additional information that you need to know:

Availability – Timestream for InfluxDB is now generally available in the following AWS Regions: US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Mumbai, Singapore, Sydney, Tokyo), and Europe (Frankfurt, Ireland, Stockholm).

Migration scenario – To migrate from a self-managed InfluxDB instance, you can simply restore a backup from an existing InfluxDB database into Timestream for InfluxDB. If you need to migrate from existing Timestream LiveAnalytics engine to Timestream for InfluxDB, you can leverage Amazon S3. Read more on how to do migration for various use cases on Migrating data from self-managed InfluxDB to Timestream for InfluxDB page.

Supported version – Timestream for InfluxDB currently supports the open source 2.7.5 version of InfluxDB

Pricing – To learn more about pricing, please visit Amazon Timestream pricing.

Demo – To see Timestream for InfluxDB in action, have a look at this demo created by my colleague, Derek:

Start building time-series applications and dashboards with millisecond response times using Timestream for InfluxDB. To learn more, visit Amazon Timestream for InfluxDB page.

Happy building!
— Donnie

Anthropic’s Claude 3 Haiku model is now available on Amazon Bedrock

2024-03-14 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/anthropics-claude-3-haiku-model-is-now-available-in-amazon-bedrock/

Last week, Anthropic announced their Claude 3 foundation model family. The family includes three models: Claude 3 Haiku, the fastest and most compact model for near-instant responsiveness; Claude 3 Sonnet, the ideal balanced model between skills and speed; and Claude 3 Opus, the most intelligent offering for top-level performance on highly complex tasks. AWS also announced the general availability of Claude 3 Sonnet in Amazon Bedrock.

Today, we are announcing the availability of Claude 3 Haiku on Amazon Bedrock. The Claude 3 Haiku foundation model is the fastest and most compact model of the Claude 3 family, designed for near-instant responsiveness and seamless generative artificial intelligence (AI) experiences that mimic human interactions. For example, it can read a data-dense research paper on arXiv (~10k tokens) with charts and graphs in less than three seconds.

With Claude 3 Haiku’s availability on Amazon Bedrock, you can build near-instant responsive generative AI applications for enterprises that need quick and accurate targeted performance. Like Sonnet and Opus, Haiku has image-to-text vision capabilities, can understand multiple languages besides English, and boasts increased steerability in a 200k context window.

Claude 3 Haiku use cases
Claude 3 Haiku is smarter, faster, and more affordable than other models in its intelligence category. It answers simple queries and requests with unmatched speed. With its fast speed and increased steerability, you can create AI experiences that seamlessly imitate human interactions.

Here are some use cases for using Claude 3 Haiku:

Customer interactions: quick and accurate support in live interactions, translations
Content moderation: catch risky behavior or customer requests
Cost-saving tasks: optimized logistics, inventory management, fast knowledge extraction from unstructured data

To learn more about Claude 3 Haiku’s features and capabilities, visit Anthropic’s Claude on Amazon Bedrock and Anthropic Claude models in the AWS documentation.

Claude 3 Haiku in action
If you are new to using Anthropic models, go to the Amazon Bedrock console and choose Model access on the bottom left pane. Request access separately for Claude 3 Haiku.

To test Claude 3 Haiku in the console, choose Text or Chat under Playgrounds in the left menu pane. Then choose Select model and select Anthropic as the category and Claude 3 Haiku as the model.

To test more Claude prompt examples, choose Load examples. You can view and run examples specific to Claude 3 Haiku, such as advanced Q&A with citations, crafting a design brief, and non-English content generation.

Using Compare mode, you can also compare the speed and intelligence between Claude 3 Haiku and the Claude 2.1 model using a sample prompt to generate personalized email responses to address customer questions.

By choosing View API request, you can also access the model using code examples in the AWS Command Line Interface (AWS CLI) and AWS SDKs. Here is a sample of the AWS CLI command:

aws bedrock-runtime invoke-model \
     --model-id anthropic.claude-3-haiku-20240307-v1:0 \
     --body "{\"messages\":[{\"role\":\"user\",\"content\":[{\"type\":\"text\",\"text\":\"Write the test case for uploading the image to Amazon S3 bucket\\nCertainly! Here's an example of a test case for uploading an image to an Amazon S3 bucket using a testing framework like JUnit or TestNG for Java:\\n\\n...."}]}],\"anthropic_version\":\"bedrock-2023-05-31\",\"max_tokens\":2000}" \
     --cli-binary-format raw-in-base64-out \
     --region us-east-1 \
     invoke-model-output.txt

To make an API request with Claude 3, use the new Anthropic Claude Messages API format, which allows for more complex interactions such as image processing. If you use Anthropic Claude Text Completions API, you should upgrade from the Text Completions API.

Here is sample Python code to send a Message API request describing the image file:

def call_claude_haiku(base64_string):

    prompt_config = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 4096,
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": "image/png",
                            "data": base64_string,
                        },
                    },
                    {"type": "text", "text": "Provide a caption for this image"},
                ],
            }
        ],
    }

    body = json.dumps(prompt_config)

    modelId = "anthropic.claude-3-haiku-20240307-v1:0"
    accept = "application/json"
    contentType = "application/json"

    response = bedrock_runtime.invoke_model(
        body=body, modelId=modelId, accept=accept, contentType=contentType
    )
    response_body = json.loads(response.get("body").read())

    results = response_body.get("content")[0].get("text")
    return results

To learn more sample codes with Claude 3, see Get Started with Claude 3 on Amazon Bedrock, Diagrams to CDK/Terraform using Claude 3 on Amazon Bedrock, and Cricket Match Winner Prediction with Amazon Bedrock’s Anthropic Claude 3 Sonnet in the Community.aws.

Now available
Claude 3 Haiku is available now in the US West (Oregon) Region with more Regions coming soon; check the full Region list for future updates.

Claude 3 Haiku is the most cost-effective choice. For example, Claude 3 Haiku is cheaper, up to 68 percent of the price per 1,000 input/output tokens compared to Claude Instant, with higher levels of intelligence. To learn more, see Amazon Bedrock Pricing.

Give Claude 3 Haiku a try in the Amazon Bedrock console today and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

— Channy

Introducing the newest Heroes of the year – March 2024

2024-03-06 Taylor Jacobsen

Post Syndicated from Taylor Jacobsen original https://aws.amazon.com/blogs/aws/introducing-the-newest-heroes-of-the-year-march-2024/

AWS Heroes are inspirational thought leaders who go above and beyond to knowledge share in a variety of ways. You can find them speaking at local meetups, AWS Community Days, or even at re:Invent. And these technical experts are never done learning—they’re passionate about solving problems and creating content to enable the community to build faster on AWS. We’re excited to announce the first cohort of Heroes in 2024…

Let’s give a round of applause to our new Heroes!

Awedis Keofteian – Beirut, Lebanon

Community Hero Awedis Keofteian is a DevOps Engineer at Anghami. He has a strong background in DevOps practices, and he leverages modern technologies to enhance scalability, reliability, and efficiency in Anghami’s cloud-based architecture. His journey began as an AWS Community Builder, and over time, he took the helm as the leader of the AWS User Group in Beirut. Awedis is passionate about nurturing and supporting the growth of AWS communities, and shares his knowledge across DevOps, automation, serverless, and cloud technologies.

Daniel Aniszkiewicz – Wrocław, Poland

Security Hero Daniel Aniszkiewicz is a Senior Software Engineer at Algoteque International Hub. He co-organizes the Wrocław AWS User Group, and is passionate about contributing to the growth and engagement of the local AWS community. Daniel is also a seasoned speaker and loves to share his knowledge with others, such as presenting at re:Invent, AWS meetups, and AWS Community Days. He is particularly focused on promoting Amazon Verified Permissions and Cedar through workshops, blog posts, IaC templates, and open source projects.

Hazel Sáenz – Guatemala

Serverless Hero Hazel Sáenz is a Software Architect at Cognits. Her primary focus is modernizing on-premises applications to cloud environments using AWS, and predominantly designs high workload architectures in serverless frameworks. Hazel enjoys sharing her knowledge with the community through technical talks at local and international events, participating in AWS Summits, AWS Community Days, and meetups, as well as writing technical articles in both English and Spanish. Additionally, she is the leader of the AWS User Group Guatemala, where she excels at organizing inclusive events and sharing her knowledge with the community.

Kenta Goto – Tokyo, Japan

DevTools Hero Kenta Goto is a Backend Tech Lead and an enthusiastic contributor to AWS CDK. He has been selected as a top contributor and a trusted reviewer in AWS CDK, and serves as a maintainer for the community-driven CDK Construct Library. Kenta is also a conference speaker, having presented at the AWS Dev Day in Japan in 2022 and 2023. Furthermore, he actively contributes to the open source community by developing and publishing his self-made AWS tools and AWS CDK Construct libraries, which are used worldwide.

Martin Damovsky – Prague, Czech Republic

Community Hero Martin Damovsky is a Cloud Governance Lead at Ataccama.com, an AWS Partner providing Unified Data Management Solutions. He has been particularly interested in AWS Control Tower Account Factory for Terraform, Cloud Intelligence Dashboard, and security and govern tools, such as AWS Security Hub, Amazon GuardDuty, and AWS Config. Martin is a leader for AWS User Group Prague, and he enjoys sharing his knowledge with the greater AWS community through his blog and speaking at meetups, podcasts, and conferences.

Rafał Mituła – Warsaw, Poland

Community Hero Rafał Mituła is a Cloud Data Engineer and Architect within the Data & AI division at Chaos Gears. He is actively involved in the AWS community, co-organizing the AWS User Group Warsaw meetups and the AWS Community Day Poland conference. In addition to his technical and organizational roles, Rafał shares his expertise by speaking at conferences and leading workshops aimed at introducing new builders to AWS and data analytics, such as the AWS Data Engineering Immersion Days.

Sena Yakut – Izmir, Turkey

Security Hero Sena Yakut is a Senior Cloud Security Engineer at Lyrebird Studio. She has a master’s degree in cloud security, and builds security requirements for architectural designs, providing threat management and security concepts and services using AWS. Sena shares her knowledge through blog posts across various platforms, and engaging in discussions about cloud security at events, such as AWS Community Day Türkiye, and DevOpsDays Istanbul. As an active blogger and speaker, she enjoys learning new security features on AWS and informing others about them.

Tiago Rodrigues – Lisbon, Portugal

Community Hero Tiago Rodrigues is a Senior Cloud Consultant at tecRacer.com, an AWS Premier Partner and AWS Advanced Training Partner. He specializes in migrations from on-premises environments to the cloud, as well as modernizing architectures and implementing serverless solutions. Beyond his role, Tiago is deeply committed to knowledge sharing and actively contributes to the AWS community through engagements, such as the AWS User Group Lisbon, educational workshops, and guest lectures at universities. He is passionate about education and innovation, and developed an open source mobile app, AWSary, which is an AWS dictionary designed to provide solution architect diagram drawings and quick insights into AWS services.

Learn More

Please visit the AWS Heroes website if you’d like to learn more about the AWS Heroes program or to connect with a Hero near you.

— Taylor

Anthropic’s Claude 3 Sonnet foundation model is now available in Amazon Bedrock

2024-03-04 Channy Yun

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/anthropics-claude-3-sonnet-foundation-model-is-now-available-in-amazon-bedrock/

In September 2023, we announced a strategic collaboration with Anthropic that brought together their respective technology and expertise in safer generative artificial intelligence (AI), to accelerate the development of Anthropic’s Claude foundation models (FMs) and make them widely accessible to AWS customers. You can get early access to unique features of Anthropic’s Claude model in Amazon Bedrock to reimagine user experiences, reinvent your businesses, and accelerate your generative AI journeys.

In November 2023, Amazon Bedrock provided access to Anthropic’s Claude 2.1, which delivers key capabilities to build generative AI for enterprises. Claude 2.1 includes a 200,000 token context window, reduced rates of hallucination, improved accuracy over long documents, system prompts, and a beta tool use feature for function calling and workflow orchestration.

Today, Anthropic announced Claude 3, a new family of state-of-the-art AI models that allows customers to choose the exact combination of intelligence, speed, and cost that suits their business needs. The three models in the family are Claude 3 Haiku, the fastest and most compact model for near-instant responsiveness, Claude 3 Sonnet, the ideal balanced model between skills and speed, and Claude 3 Opus, a most intelligent offering for the top-level performance on highly complex tasks.

We’re also announcing the availability of Anthropic’s Claude 3 Sonnet today in Amazon Bedrock, with Claude 3 Opus and Claude 3 Haiku coming soon. For the vast majority of workloads, Claude 3 Sonnet model is two times faster than Claude 2 and Claude 2.1, with increased steerability, and new image-to-text vision capabilities.

With Claude 3 Sonnet’s availability in Amazon Bedrock, you can build cost-effective generative AI applications for enterprises that need intelligence, reliability, and speed. You can now use Anthropic’s latest model, Claude 3 Sonnet, in the Amazon Bedrock console.

Introduction of Anthropic’s Claude 3 Sonnet
Here are some key highlights about the new Claude 3 Sonnet model in Amazon Bedrock:

2x faster speed – Claude 3 has made significant gains in speed. For the vast majority of workloads, it is two times faster with the same level of intelligence as Anthropic’s most performant models, Claude 2 and Claude 2.1. This combination of speed and skill makes Claude 3 Sonnet the clear choice for tasks that require intelligent tasks demanding rapid responses, like knowledge retrieval or sales automation. This includes use cases like content generation, classification, data extraction, and research and retrieval or accurate searching over knowledge bases.

Increased steerability – Increased steerability of AI systems gives users more control over outputs and delivers predictable, higher-quality outcomes. It is significantly less likely to refuse to answer questions that border on the system’s guardrails to prevent harmful outputs. Claude 3 Sonnet is easier to steer and better at following directions in popular structured output formats like JSON—making it simpler for developers to build enterprise and frontier applications. This is particularly important in enterprise use cases such as autonomous vehicles, health and medical diagnoses, and algorithmic decision-making in sensitive domains such as financial services.

Image-to-text vision capabilities – Claude 3 offers vision capabilities that can process images and return text outputs. It is extremely capable at analyzing and understanding charts, graphs, technical diagrams, reports, and other visual assets. Claude 3 Sonnet achieves comparable performance to other best-in-class models with image processing capabilities, while maintaining a significant speed advantage.

Expanded language support – Claude 3 has improved understanding and responding in languages other than English, such as French, Japanese, and Spanish. This expanded language coverage allows Claude 3 Sonnet to better serve multinational corporations requiring AI services across different geographies and languages, as well as businesses requiring nuanced translation services. Claude 3 Sonnet is also stronger at coding and mathematics, as evidenced by Anthropic’s scores in evaluations such as grade-school math problems (GSM8K and Hendrycks) and Codex (HumanEval).

To learn more about Claude 3 Sonnet’s features and capabilities, visit Anthropic’s Claude on Amazon Bedrock and Anthropic Claude model in the AWS documentation.

Get started with Anthropic’s Claude 3 Sonnet in Amazon Bedrock
If you are new to using Anthropic models, go to the Amazon Bedrock console and choose Model access on the bottom left pane. Request access separately for Claude 3 Sonnet.

To test Claude 3 Sonnet in the console, choose Text or Chat under Playgrounds in the left menu pane. Then choose Select model and select Anthropic as the category and Claude 3 Sonnet as the model.

To test more Claude prompt examples, choose Load examples. You can view and run Claude 3 specific examples, such as advanced Q&A with citations, crafting a design brief, and non-English content generation.

By choosing View API request, you can also access the model via code examples in the AWS Command Line Interface (AWS CLI) and AWS SDKs. Here is a sample of the AWS CLI command:

aws bedrock-runtime invoke-model \
--model-id anthropic.claude-3-sonnet-v1:0 \
--body "{\"prompt\":\"Write the test case for uploading the image to Amazon S3 bucket\\nHere are some test cases for uploading an image to an Amazon S3 bucket:\\n\\n1. **Successful Upload Test Case**:\\n   - Test Data:\\n     - Valid image file (e.g., .jpg, .png, .gif)\\n     - Correct S3 bucket name\\n     - Correct AWS credentials (access key and secret access key)\\n   - Steps:\\n     1. Initialize the AWS S3 client with the correct credentials.\\n     2. Open the image file.\\n     3. Upload the image file to the specified S3 bucket.\\n     4. Verify that the upload was successful.\\n   - Expected Result: The image should be successfully uploaded to the S3 bucket.\\n\\n2. **Invalid File Type Test Case**:\\n   - Test Data:\\n     - Invalid file type (e.g., .txt, .pdf, .docx)\\n     - Correct S3 bucket name\\n     - Correct AWS credentials\\n   - Steps:\\n     1. Initialize the AWS S3 client with the correct credentials.\\n     2. Open the invalid file type.\\n     3. Attempt to upload the file to the specified S3 bucket.\\n     4. Verify that an appropriate error or exception is raised.\\n   - Expected Result: The upload should fail with an error or exception indicating an invalid file type.\\n\\nThese test cases cover various scenarios, including successful uploads, invalid file types, invalid bucket names, invalid AWS credentials, large file uploads, and concurrent uploads. By executing these test cases, you can ensure the reliability and robustness of your image upload functionality to Amazon S3.\",\"max_tokens_to_sample\":2000,\"temperature\":1,\"top_k\":250,\"top_p\":0.999,\"stop_sequences\":[\"\\n\\nHuman:\"],\"anthropic_version\":\"bedrock-2023-05-31\"}" \
--cli-binary-format raw-in-base64-out \
--region us-east-1 \
invoke-model-output.txt

Upload your image if you want to test image-to-text vision capabilities. I uploaded the featured image of this blog post and received a detailed description of this image.

You can process images via API and return text outputs in English and multiple other languages.

{
  "modelId": "anthropic.claude-3-sonnet-v1:0",
  "contentType": "application/json",
  "accept": "application/json",
  "body": {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1000,
    "system": "Please respond only in Spanish.",
    "messages": {
      "role": "user",
      "content": [
        {
          "type": "image",
          "source": {
            "type": "base64",
            "media_type": "image/jpeg",
            "data": "iVBORw..."
          }
        },
        {
          "type": "text",
          "text": "What's in this image?"
        }
      ]
    }
  }
}

To celebrate this launch, Neerav Kingsland, Head of Global Accounts at Anthropic, talks about the power of the Anthropic and AWS partnership.

“Anthropic at its core is a research company that is trying to create the safest large language models in the world, and through Amazon Bedrock we have a change to take that technology, distribute it to users globally, and do this in an extremely safe and data-secure manner.”

Now available
Claude 3 Sonnet is available today in the US East (N. Virginia) and US West (Oregon) Regions; check the full Region list for future updates. The availability of Anthropic’s Claude 3 Opus and Haiku in Amazon Bedrock also will be coming soon.

You will be charged for model inference and customization with the On-Demand and Batch mode, which allows you to use FMs on a pay-as-you-go basis without having to make any time-based term commitments. With the Provisioned Throughput mode, you can purchase model units for a specific base or custom model. To learn more, see Amazon Bedrock Pricing.

Give Anthropic’s Claude 3 Sonnet a try in the Amazon Bedrock console today and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

— Channy

Mistral AI models now available on Amazon Bedrock

2024-03-01 Donnie Prakoso

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/mistral-ai-models-now-available-on-amazon-bedrock/

Last week, we announced that Mistral AI models are coming to Amazon Bedrock. In that post, we elaborated on a few reasons why Mistral AI models may be a good fit for you. Mistral AI offers a balance of cost and performance, fast inference speed, transparency and trust, and is accessible to a wide range of users.

Today, we’re excited to announce the availability of two high-performing Mistral AI models, Mistral 7B and Mixtral 8x7B, on Amazon Bedrock. Mistral AI is the 7th foundation model provider offering cutting-edge models in Amazon Bedrock, joining other leading AI companies like AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon. This integration provides you the flexibility to choose optimal high-performing foundation models in Amazon Bedrock.

Mistral 7B is the first foundation model from Mistral AI, supporting English text generation tasks with natural coding capabilities. It is optimized for low latency with a low memory requirement and high throughput for its size. Mixtral 8x7B is a popular, high-quality, sparse Mixture-of-Experts (MoE) model, that is ideal for text summarization, question and answering, text classification, text completion, and code generation.

Here’s a quick look at Mistral AI models on Amazon Bedrock:

Getting Started with Mistral AI Models
To get started with Mistral AI models in Amazon Bedrock, first you need to get access to the models. On the Amazon Bedrock console, select Model access, and then select Manage model access. Next, select Mistral AI models, and then select Request model access.

Once you have the access to selected Mistral AI models, you can test the models with your prompts using Chat or Text in the Playgrounds section.

Programmatically Interact with Mistral AI Models
You can also use AWS Command Line Interface (CLI) and AWS Software Development Kit (SDK) to make various calls using Amazon Bedrock APIs. Following, is a sample code in Python that interacts with Amazon Bedrock Runtime APIs with AWS SDK:

import boto3
import json

bedrock = boto3.client(service_name="bedrock-runtime")

prompt = "<s>[INST] INSERT YOUR PROMPT HERE [/INST]"

body = json.dumps({
    "prompt": prompt,
    "max_tokens": 512,
    "top_p": 0.8,
    "temperature": 0.5,
})

modelId = "mistral.mistral-7b-instruct-v0:2"

accept = "application/json"
contentType = "application/json"

response = bedrock.invoke_model(
    body=body,
    modelId=modelId,
    accept=accept,
    contentType=contentType
)

print(json.loads(response.get('body').read()))

Mistral AI models in action
By integrating your application with AWS SDK to invoke Mistral AI models using Amazon Bedrock, you can unlock possibilities to implement various use cases. Here are a few of my personal favorite use cases using Mistral AI models with sample prompts. You can see more examples on Prompting Capabilities from the Mistral AI documentation page.

Text summarization — Mistral AI models extract the essence from lengthy articles so you quickly grasp key ideas and core messaging.

You are a summarization system. In clear and concise language, provide three short summaries in bullet points of the following essay.

# Essay:
{insert essay text here}

Personalization — The core AI capabilities of understanding language, reasoning, and learning, allow Mistral AI models to personalize answers with more human-quality text. The accuracy, explanation capabilities, and versatility of Mistral AI models make them useful at personalization tasks, because they can deliver content that aligns closely with individual users.

You are a mortgage lender customer service bot, and your task is to create personalized email responses to address customer questions. Answer the customer's inquiry using the provided facts below. Ensure that your response is clear, concise, and directly addresses the customer's question. Address the customer in a friendly and professional manner. Sign the email with "Lender Customer Support."

# Facts
<INSERT FACTS AND INFORMATION HERE>

# Email
{insert customer email here}

Code completion — Mistral AI models have an exceptional understanding of natural language and code-related tasks, which is essential for projects that need to juggle computer code and regular language. Mistral AI models can help generate code snippets, suggest bug fixes, and optimize existing code, accelerating your development process.

[INST] You are a code assistant. Your task is to generate a 5 valid JSON object based on the following properties:
name: 
lastname: 
address: 
Just generate the JSON object without explanations:
[/INST]

Things You Have to Know
Here are few additional information for you:

Availability — Mistral AI’s Mixtral 8x7B and Mistral 7B models in Amazon Bedrock are available in the US West (Oregon) Region.
Deep dive into Mistral 7B and Mixtral 8x7B — If you want to learn more about Mistral AI models on Amazon Bedrock, you might also enjoy this article titled “Mistral AI – Winds of Change” prepared by my colleague, Mike.

Now Available
Mistral AI models are available today in Amazon Bedrock, and we can’t wait to see what you’re going to build. Get yourself started by visiting Mistral AI on Amazon Bedrock.

Happy building,
— Donnie

AWS Weekly Roundup — .Net Runtime for AWS Lambda, PartyRock Hackathon, and more — February 26, 2024

2024-02-26 Veliswa Boya

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-net-runtime-for-aws-lambda-partyrock-hackathon-and-more-february-26-2024/

The Community AWS re:invent 2023 re:caps continue! Recently, I was invited to participate in one of these events hosted by the AWS User Group Kenya, and was able to learn and spend time with this amazing community.

AWS User Group Kenya

Last week’s launches
Here are some launches that got my attention during the previous week.

.NET 8 runtime for AWS Lambda – AWS Lambda now supports .NET 8 as both a managed runtime and container base image. This support provides you with .NET 8 features that include API enhancements, improved Native Ahead of Time (Native AOT) support, and improved performance. .NET 8 supports C# 12, F# 8, and PowerShell 7.4. You can develop Lambda functions in .NET 8 using the AWS Toolkit for Visual Studio, the AWS Extensions for .NET CLI, AWS Serverless Application Model (AWS SAM), AWS CDK, and other infrastructure as code tools.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional projects, programs, and news items that you might find interesting:

Earlier this month, I used this image to call attention to the PartyRock Hackathon that’s currently in progress. The deadline to join the hackathon is fast approaching so be sure to signup before time runs out.

Amazon API Gateway – Amazon API Gateway processed over 100 trillion API requests in 2023, and we continue to see growing demand for API-driven applications. API Gateway is a fully-managed service that enables you to create, publish, maintain, monitor, and secure APIs at any scale. Customers that onboarded large workloads on API Gateway in 2023 told us they chose the service for its availability, security, and serverless architecture. Those in regulated industries value API Gateway’s private endpoints, which are isolated from the public internet and only accessible from your Amazon Virtual Private Cloud (VPC).

AWS open source news and updates – My colleague Ricardo writes this weekly open source newsletter in which he highlights new open source projects, tools, and demos from the AWS Community.

Upcoming AWS events
Season 3 of the Build on Generative AI Twitch show has kicked off. Join every Monday on Twitch at 9AM PST/Noon EST/18h CET to learn among others, how you can build generative AI-enabled applications.

If you’re in the EMEA timezone, there is still time to register and watch the AWS Innovate Online Generative AI & Data Edition taking place on February 29. Innovate Online events are free, online, and designed to inspire and educate you about building on AWS. Whether you’re in the Americas, Asia Pacific & Japan, or EMEA region, learn here about future AWS Innovate Online events happening in your timezone.

AWS Community re:Invent re:Caps – Join a Community re:Cap event organized by volunteers from AWS User Groups and AWS Cloud Clubs around the world to learn about the latest announcements from AWS re:Invent.

You can browse all upcoming in-person and virtual events here.

That’s all for this week. Check back next Monday for another Weekly Roundup!

– Veliswa

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS.

Noise

Tag Archives: launch

Meta’s Llama 3 models are now available in Amazon Bedrock

Guardrails for Amazon Bedrock now available with new safety filters and privacy controls

Agents for Amazon Bedrock: Introducing a simplified creation and configuration experience

Amazon Bedrock model evaluation is now generally available

Import custom models in Amazon Bedrock (preview)

Amazon Titan Image Generator and watermark detection API are now available in Amazon Bedrock

Unify DNS management using Amazon Route 53 Profiles with multiple VPCs and AWS accounts

Amazon CloudWatch Internet Weather Map – View and analyze internet health

Anthropic’s Claude 3 Opus model is now available on Amazon Bedrock

Congratulations to the PartyRock generative AI hackathon winners

Tackle complex reasoning tasks with Mistral Large, now available on Amazon Bedrock

Introducing AWS Deadline Cloud: Set up a cloud-based render farm in minutes

Amazon GuardDuty EC2 Runtime Monitoring is now generally available

Run large-scale simulations with AWS Batch multi-container jobs

Run and manage open source InfluxDB databases with Amazon Timestream

Anthropic’s Claude 3 Haiku model is now available on Amazon Bedrock

Introducing the newest Heroes of the year – March 2024

Awedis Keofteian – Beirut, Lebanon

Daniel Aniszkiewicz – Wrocław, Poland

Hazel Sáenz – Guatemala

Kenta Goto – Tokyo, Japan

Martin Damovsky – Prague, Czech Republic

Rafał Mituła – Warsaw, Poland

Sena Yakut – Izmir, Turkey

Tiago Rodrigues – Lisbon, Portugal

Learn More

Anthropic’s Claude 3 Sonnet foundation model is now available in Amazon Bedrock

Mistral AI models now available on Amazon Bedrock

AWS Weekly Roundup — .Net Runtime for AWS Lambda, PartyRock Hackathon, and more — February 26, 2024

The collective thoughts of the interwebz