All posts by Channy Yun

Amazon Titan Image Generator v2 is now available in Amazon Bedrock

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/amazon-titan-image-generator-v2-is-now-available-in-amazon-bedrock/

Today, we are announcing the general availability of the Amazon Titan Image Generator v2 model with new capabilities in Amazon Bedrock. With Amazon Titan Image Generator v2, you can guide image creation using reference images, edit existing visuals, remove backgrounds, generate image variations, and securely customize the model to maintain brand style and subject consistency. This powerful tool streamlines workflows, boosts productivity, and brings creative visions to life.

Amazon Titan Image Generator v2 brings a number of new features in addition to all features of Amazon Titan Image Generator v1, including:

  • Image conditioning – Provide a reference image along with a text prompt, resulting in outputs that follow the layout and structure of the user-supplied reference.
  • Image guidance with color palette – Control precisely the color palette of generated images by providing a list of hex codes along with the text prompt.
  • Background removal – Automatically remove background from images containing multiple objects.
  • Subject consistency – Fine-tune the model to preserve a specific subject (for example, a particular dog, shoe, or handbag) in the generated images.

New features in Amazon Titan Image Generator v2
Before getting started, if you are new to using Amazon Titan models, go to the Amazon Bedrock console and choose Model access on the bottom left pane. To access the latest Amazon Titan models from Amazon, request access separately for Amazon Titan Image Generator G1 v2.

Here are details of the Amazon Titan Image Generator v2 in Amazon Bedrock:

Image conditioning
You can use the image conditioning feature to shape your creations with precision and intention. By providing a reference image (that is, a conditioning image), you can instruct the model to focus on specific visual characteristics, such as edges, object outlines, and structural elements, or segmentation maps that define distinct regions and objects within the reference image.

We support two types of image conditioning: Canny edge and segmentation.

  • The Canny edge algorithm is used to extract the prominent edges within the reference image, creating a map that the Amazon Titan Image Generator can then use to guide the generation process. You can “draw” the foundations of your desired image, and the model will then fill in the details, textures, and final aesthetic based on your guidance.
  • Segmentation provides an even more granular level of control. By supplying the reference image, you can define specific areas or objects within the image and instruct the Amazon Titan Image Generator to generate content that aligns with those defined regions. You can precisely control the placement and rendering of characters, objects, and other key elements.

Here are generation examples that use image conditioning.

To use the image conditioning feature, you can use Amazon Bedrock API, AWS SDK, or AWS Command Line Interface (AWS CLI) and choose CANNY_EDGE or SEGMENTATION for controlMode of textToImageParams with your reference image.

	"taskType": "TEXT_IMAGE",
	"textToImageParams": {
 		"text": "a cartoon deer in a fairy world.",
        "conditionImage": input_image, # Optional
        "controlMode": "CANNY_EDGE" # Optional: CANNY_EDGE | SEGMENTATION
        "controlStrength": 0.7 # Optional: weight given to the condition image. Default: 0.7
     }

The following a Python code example using AWS SDK for Python (Boto3) shows how to invoke Amazon Titan Image Generator v2 on Amazon Bedrock to use image conditioning.

import base64
import io
import json
import logging
import boto3
from PIL import Image
from botocore.exceptions import ClientError

def main():
    """
    Entrypoint for Amazon Titan Image Generator V2 example.
    """
    try:
        logging.basicConfig(level=logging.INFO,
                            format="%(levelname)s: %(message)s")

        model_id = 'amazon.titan-image-generator-v2:0'

        # Read image from file and encode it as base64 string.
        with open("/path/to/image", "rb") as image_file:
            input_image = base64.b64encode(image_file.read()).decode('utf8')

        body = json.dumps({
            "taskType": "TEXT_IMAGE",
            "textToImageParams": {
                "text": "a cartoon deer in a fairy world",
                "conditionImage": input_image,
                "controlMode": "CANNY_EDGE",
                "controlStrength": 0.7
            },
            "imageGenerationConfig": {
                "numberOfImages": 1,
                "height": 512,
                "width": 512,
                "cfgScale": 8.0
            }
        })

        image_bytes = generate_image(model_id=model_id,
                                     body=body)
        image = Image.open(io.BytesIO(image_bytes))
        image.show()

    except ClientError as err:
        message = err.response["Error"]["Message"]
        logger.error("A client error occurred: %s", message)
        print("A client error occured: " +
              format(message))
    except ImageError as err:
        logger.error(err.message)
        print(err.message)

    else:
        print(
            f"Finished generating image with Amazon Titan Image Generator V2 model {model_id}.")

def generate_image(model_id, body):
    """
    Generate an image using Amazon Titan Image Generator V2 model on demand.
    Args:
        model_id (str): The model ID to use.
        body (str) : The request body to use.
    Returns:
        image_bytes (bytes): The image generated by the model.
    """

    logger.info(
        "Generating image with Amazon Titan Image Generator V2 model %s", model_id)

    bedrock = boto3.client(service_name='bedrock-runtime')

    accept = "application/json"
    content_type = "application/json"

    response = bedrock.invoke_model(
        body=body, modelId=model_id, accept=accept, contentType=content_type
    )
    response_body = json.loads(response.get("body").read())

    base64_image = response_body.get("images")[0]
    base64_bytes = base64_image.encode('ascii')
    image_bytes = base64.b64decode(base64_bytes)

    finish_reason = response_body.get("error")

    if finish_reason is not None:
        raise ImageError(f"Image generation error. Error is {finish_reason}")

    logger.info(
        "Successfully generated image with Amazon Titan Image Generator V2 model %s", model_id)

    return image_bytes
	
class ImageError(Exception):
    "Custom exception for errors returned by Amazon Titan Image Generator V2"

    def __init__(self, message):
        self.message = message

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)

if __name__ == "__main__":
    main()

Color conditioning
Most designers want to generate images adhering to color branding guidelines so they seek control over color palette in the generated images.

With the Amazon Titan Image Generator v2, you can generate color-conditioned images based on a color palette—a list of hex colors provided as part of the inputs adhering to color branding guidelines. You can also provide a reference image as input (optional) to generate an image with provided hex colors while inheriting style from the reference image.

In this example, the prompt describes:
a jar of salad dressing in a rustic kitchen surrounded by fresh vegetables with studio lighting

The generated image reflects both the content of the text prompt and the specified color scheme to align with the brand’s color guidelines.

To use color conditioning feature, you can set taskType to COLOR_GUIDED_GENERATION with your prompt and hex codes.

       "taskType": "COLOR_GUIDED_GENERATION",
       "colorGuidedGenerationParam": {
             "text": "a jar of salad dressing in a rustic kitchen surrounded by fresh vegetables with studio lighting",                         
	         "colors": ['#ff8080', '#ffb280', '#ffe680', '#e5ff80'], # Optional: list of color hex codes 
             "referenceImage": input_image, #Optional
        }

Background removal
Whether you’re looking to composite an image onto a solid color backdrop or layer it over another scene, the ability to cleanly and accurately remove the background is an essential tool in the creative workflow. You can instantly remove the background from your images with a single step. Amazon Titan Image Generator v2 can intelligently detect and segment multiple foreground objects, ensuring that even complex scenes with overlapping elements are cleanly isolated.

The example shows an image of an iguana sitting on a tree in a forest. The model was able to identify the iguana as the main object and remove the forest background, replacing it with a transparent background. This lets the iguana stand out clearly without the distracting forest around it.

To use background removal feature, you can set taskType to BACKGROUND_REMOVAL with your input image.

    "taskType": "BACKGROUND_REMOVAL",
    "backgroundRemovalParams": {
 		"image": input_image,
    }

Subject consistency with fine-tuning
You can now seamlessly incorporate specific subjects into visually captivating scenes. Whether it’s a brand’s product, a company logo, or a beloved family pet, you can fine-tune the Amazon Titan model using reference images to learn the unique characteristics of the chosen subject.

Once the model is fine-tuned, you can simply provide a text prompt, and the Amazon Titan Generator will generate images that maintain a consistent depiction of the subject, placing it naturally within diverse, imaginative contexts. This opens up a world of possibilities for marketing, advertising, and visual storytelling.

For example, you could use an image with the caption Ron the dog during fine-tuning, give the prompt as Ron the dog wearing a superhero cape during inference with the fine-tuned model, and get a unique image in response.

To learn, visit model inference parameters and code examples for Amazon Titan Image Generator in the AWS documentation.

Now available
The Amazon Titan Generator v2 model is available today in Amazon Bedrock in the US East (N. Virginia) and US West (Oregon) Regions. Check the full Region list for future updates. To learn more, check out the Amazon Titan product page and the Amazon Bedrock pricing page.

Give Amazon Titan Image Generator v2 a try in Amazon Bedrock today, and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

Visit our community.aws site to find deep-dive technical content and to discover how our Builder communities are using Amazon Bedrock in their solutions.

Channy

Announcing Llama 3.1 405B, 70B, and 8B models from Meta in Amazon Bedrock

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/announcing-llama-3-1-405b-70b-and-8b-models-from-meta-in-amazon-bedrock/

Today, we are announcing the availability of Llama 3.1 models in Amazon Bedrock. The Llama 3.1 models are Meta’s most advanced and capable models to date. The Llama 3.1 models are a collection of 8B, 70B, and 405B parameter size models that demonstrate state-of-the-art performance on a wide range of industry benchmarks and offer new capabilities for your generative artificial intelligence (generative AI) applications.

All Llama 3.1 models support a 128K context length (an increase of 120K tokens from Llama 3) that has 16 times the capacity of Llama 3 models and improved reasoning for multilingual dialogue use cases in eight languages, including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

You can now use three new Llama 3.1 models from Meta in Amazon Bedrock to build, experiment, and responsibly scale your generative AI ideas:

  • Llama 3.1 405B (preview) is the world’s largest publicly available large language model (LLM) according to Meta. The model sets a new standard for AI and is ideal for enterprise-level applications and research and development (R&D). It is ideal for tasks like synthetic data generation where the outputs of the model can be used to improve smaller Llama models and model distillations to transfer knowledge to smaller models from the 405B model. This model excels at general knowledge, long-form text generation, multilingual translation, machine translation, coding, math, tool use, enhanced contextual understanding, and advanced reasoning and decision-making. To learn more, visit the AWS Machine Learning Blog about using Llama 3.1 405B to generate synthetic data for model distillation.
  • Llama 3.1 70B is ideal for content creation, conversational AI, language understanding, R&D, and enterprise applications. The model excels at text summarization and accuracy, text classification, sentiment analysis and nuance reasoning, language modeling, dialogue systems, code generation, and following instructions.
  • Llama 3.1 8B is best suited for limited computational power and resources. The model excels at text summarization, text classification, sentiment analysis, and language translation requiring low-latency inferencing.

Meta measured the performance of Llama 3.1 on over 150 benchmark datasets that span a wide range of languages and extensive human evaluations. As you can see in the following chart, Llama 3.1 outperforms Llama 3 in every major benchmarking category.

To learn more about Llama 3.1 features and capabilities, visit the Llama 3.1 Model Card from Meta and Llama models in the AWS documentation.

You can take advantage of Llama 3.1’s responsible AI capabilities, combined with the data governance and model evaluation features of Amazon Bedrock to build secure and reliable generative AI applications with confidence.

  • Guardrails for Amazon Bedrock – By creating multiple guardrails with different configurations tailored to specific use cases, you can use Guardrails to promote safe interactions between users and your generative AI applications by implementing safeguards customized to your use cases and responsible AI policies. With Guardrails for Amazon Bedrock, you can continually monitor and analyze user inputs and model responses that might violate customer-defined policies, detect hallucination in model responses that are not grounded in enterprise data or are irrelevant to the user’s query, and evaluate across different models including custom and third-party models. To get started, visit Create a guardrail in the AWS documentation.
  • Model evaluation on Amazon Bedrock – You can evaluate, compare, and select the best Llama models for your use case in just a few steps using either automatic evaluation or human evaluation. With model evaluation on Amazon Bedrock, you can choose automatic evaluation with predefined metrics such as accuracy, robustness, and toxicity. Alternatively, you can choose human evaluation workflows for subjective or custom metrics such as relevance, style, and alignment to brand voice. Model evaluation provides built-in curated datasets or you can bring in your own datasets. To get started, visit Get started with model evaluation in the AWS documentation.

To learn more about how to keep your data and applications secure and private in AWS, visit the Amazon Bedrock Security and Privacy page.

Getting started with Llama 3.1 models in Amazon Bedrock
If you are new to using Llama models from Meta, go to the Amazon Bedrock console and choose Model access on the bottom left pane. To access the latest Llama 3.1 models from Meta, request access separately for Llama 3.1 8B Instruct, Llama 3.1 70B Instruct, or Llama 3.1 405B Instruct.

To request to be considered for access to the preview of Llama 3.1 405B in Amazon Bedrock, contact your AWS account team or submit a support ticket via the AWS Management Console. When creating the support ticket, select Amazon Bedrock as the Service and Models as the Category.

To test the Llama 3.1 models in the Amazon Bedrock console, choose Text or Chat under Playgrounds in the left menu pane. Then choose Select model and select Meta as the category and Llama 3.1 8B Instruct, Llama 3.1 70B Instruct, or Llama 3.1 405B Instruct as the model.

In the following example I selected the Llama 3.1 405B Instruct model.

By choosing View API request, you can also access the model using code examples in the AWS Command Line Interface (AWS CLI) and AWS SDKs. You can use model IDs such as meta.llama3-1-8b-instruct-v1, meta.llama3-1-70b-instruct-v1 , or meta.llama3-1-405b-instruct-v1.

Here is a sample of the AWS CLI command:

aws bedrock-runtime invoke-model \
  --model-id meta.llama3-1-405b-instruct-v1:0 \
--body "{\"prompt\":\" [INST]You are a very intelligent bot with exceptional critical thinking[/INST] I went to the market and bought 10 apples. I gave 2 apples to your friend and 2 to the helper. I then went and bought 5 more apples and ate 1. How many apples did I remain with? Let's think step by step.\",\"max_gen_len\":512,\"temperature\":0.5,\"top_p\":0.9}" \
  --cli-binary-format raw-in-base64-out \
  --region us-east-1 \
  invoke-model-output.txt

You can use code examples for Llama models in Amazon Bedrock using AWS SDKs to build your applications using various programming languages. The following Python code examples show how to send a text message to Llama using the Amazon Bedrock Converse API for text generation.

import boto3
from botocore.exceptions import ClientError

# Create a Bedrock Runtime client in the AWS Region you want to use.
client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Set the model ID, e.g., Llama 3 8b Instruct.
model_id = "meta.llama3-1-405b-instruct-v1:0"

# Start a conversation with the user message.
user_message = "Describe the purpose of a 'hello world' program in one line."
conversation = [
    {
        "role": "user",
        "content": [{"text": user_message}],
    }
]

try:
    # Send the message to the model, using a basic inference configuration.
    response = client.converse(
        modelId=model_id,
        messages=conversation,
        inferenceConfig={"maxTokens": 512, "temperature": 0.5, "topP": 0.9},
    )

    # Extract and print the response text.
    response_text = response["output"]["message"]["content"][0]["text"]
    print(response_text)

except (ClientError, Exception) as e:
    print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
    exit(1)

You can also use all Llama 3.1 models (8B, 70B, and 405B) in Amazon SageMaker JumpStart. You can discover and deploy Llama 3.1 models with a few clicks in Amazon SageMaker Studio or programmatically through the SageMaker Python SDK. You can operate your models with SageMaker features such as SageMaker Pipelines, SageMaker Debugger, or container logs under your virtual private cloud (VPC) controls, which help provide data security.

The fine-tuning for Llama 3.1 models in Amazon Bedrock and Amazon SageMaker JumpStart will be coming soon. When you build fine-tuned models in SageMaker JumpStart, you will also be able to import your custom models into Amazon Bedrock. To learn more, visit Meta Llama 3.1 models are now available in Amazon SageMaker JumpStart on the AWS Machine Learning Blog.

For customers who want to deploy Llama 3.1 models on AWS through self-managed machine learning workflows for greater flexibility and control of underlying resources, AWS Trainium and AWS Inferentia-powered Amazon Elastic Compute Cloud (Amazon EC2) instances enable high performance, cost-effective deployment of Llama 3.1 models on AWS. To learn more, visit AWS AI chips deliver high performance and low cost for Meta Llama 3.1 models on AWS in the AWS Machine Learning Blog.

To celebrate this launch, Parkin Kent, Business Development Manager at Meta, talks about the power of the Meta and Amazon collaboration, highlighting how Meta and Amazon are working together to push the boundaries of what’s possible with generative AI.

Discover how businesses are leveraging Llama models in Amazon Bedrock to harness the power of generative AI. Nomura, a global financial services group spanning 30 countries and regions, is democratizing generative AI across its organization using Llama models in Amazon Bedrock.

Now available
Llama 3.1 8B and 70B models from Meta are generally available and Llama 450B model is preview today in Amazon Bedrock in the US West (Oregon) Region. To request to be considered for access to the preview of Llama 3.1 405B in Amazon Bedrock, contact your AWS account team or submit a support ticket. Check the full Region list for future updates. To learn more, check out the Llama in Amazon Bedrock product page and the Amazon Bedrock pricing page.

Give Llama 3.1 a try in the Amazon Bedrock console today, and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

Visit our community.aws site to find deep-dive technical content and to discover how our Builder communities are using Amazon Bedrock in their solutions. Let me know what you build with Llama 3.1 in Amazon Bedrock!

Channy

Vector search for Amazon MemoryDB is now generally available

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/vector-search-for-amazon-memorydb-is-now-generally-available/

Today, we are announcing the general availability of vector search for Amazon MemoryDB, a new capability that you can use to store, index, retrieve, and search vectors to develop real-time machine learning (ML) and generative artificial intelligence (generative AI) applications with in-memory performance and multi-AZ durability.

With this launch, Amazon MemoryDB delivers the fastest vector search performance at the highest recall rates among popular vector databases on Amazon Web Services (AWS). You no longer have to make trade-offs around throughput, recall, and latency, which are traditionally in tension with one another.

You can now use one MemoryDB database to store your application data and millions of vectors with single-digit millisecond query and update response times at the highest levels of recall. This simplifies your generative AI application architecture while delivering peak performance and reducing licensing cost, operational burden, and time to deliver insights on your data.

With vector search for Amazon MemoryDB, you can use the existing MemoryDB API to implement generative AI use cases such as Retrieval Augmented Generation (RAG), anomaly (fraud) detection, document retrieval, and real-time recommendation engines. You can also generate vector embeddings using artificial intelligence and machine learning (AI/ML) services like Amazon Bedrock and Amazon SageMaker and store them within MemoryDB.

Which use cases would benefit most from vector search for MemoryDB?
You can use vector search for MemoryDB for the following specific use cases:

1. Real-time semantic search for retrieval-augmented generation (RAG)
You can use vector search to retrieve relevant passages from a large corpus of data to augment a large language model (LLM). This is done by taking your document corpus, chunking them into discrete buckets of texts, and generating vector embeddings for each chunk with embedding models such as the Amazon Titan Multimodal Embeddings G1 model, then loading these vector embeddings into Amazon MemoryDB.

With RAG and MemoryDB, you can build real-time generative AI applications to find similar products or content by representing items as vectors, or you can search documents by representing text documents as dense vectors that capture semantic meaning.

2. Low latency durable semantic caching
Semantic caching is a process to reduce computational costs by storing previous results from the foundation model (FM) in-memory. You can store prior inferenced answers alongside the vector representation of the question in MemoryDB and reuse them instead of inferencing another answer from the LLM.

If a user’s query is semantically similar based on a defined similarity score to a prior question, MemoryDB will return the answer to the prior question. This use case will allow your generative AI application to respond faster with lower costs from making a new request to the FM and provide a faster user experience for your customers.

3. Real-time anomaly (fraud) detection
You can use vector search for anomaly (fraud) detection to supplement your rule-based and batch ML processes by storing transactional data represented by vectors, alongside metadata representing whether those transactions were identified as fraudulent or valid.

The machine learning processes can detect users’ fraudulent transactions when the net new transactions have a high similarity to vectors representing fraudulent transactions. With vector search for MemoryDB, you can detect fraud by modeling fraudulent transactions based on your batch ML models, then loading normal and fraudulent transactions into MemoryDB to generate their vector representations through statistical decomposition techniques such as principal component analysis (PCA).

As inbound transactions flow through your front-end application, you can run a vector search against MemoryDB by generating the transaction’s vector representation through PCA, and if the transaction is highly similar to a past detected fraudulent transaction, you can reject the transaction within single-digit milliseconds to minimize the risk of fraud.

Getting started with vector search for Amazon MemoryDB
Look at how to implement a simple semantic search application using vector search for MemoryDB.

Step 1. Create a cluster to support vector search
You can create a MemoryDB cluster to enable vector search within the MemoryDB console. Choose Enable vector search in the Cluster settings when you create or update a cluster. Vector search is available for MemoryDB version 7.1 and a single shard configuration.

Step 2. Create vector embeddings using the Amazon Titan Embeddings model
You can use Amazon Titan Text Embeddings or other embedding models to create vector embeddings, which is available in Amazon Bedrock. You can load your PDF file, split the text into chunks, and get vector data using a single API with LangChain libraries integrated with AWS services.

import redis
import numpy as np
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import BedrockEmbeddings

# Load a PDF file and split document
loader = PyPDFLoader(file_path=pdf_path)
        pages = loader.load_and_split()
        text_splitter = RecursiveCharacterTextSplitter(
            separators=["\n\n", "\n", ".", " "],
            chunk_size=1000,
            chunk_overlap=200,
        )
        chunks = loader.load_and_split(text_splitter)

# Create MemoryDB vector store the chunks and embedding details
client = RedisCluster(
        host=' mycluster.memorydb.us-east-1.amazonaws.com',
        port=6379,
        ssl=True,
        ssl_cert_reqs="none",
        decode_responses=True,
    )

embedding =  BedrockEmbeddings (
           region_name="us-east-1",
 endpoint_url=" https://bedrock-runtime.us-east-1.amazonaws.com",
    )

#Save embedding and metadata using hset into your MemoryDB cluster
for id, dd in enumerate(chucks*):
     y = embeddings.embed_documents([dd])
     j = np.array(y, dtype=np.float32).tobytes()
     client.hset(f'oakDoc:{id}', mapping={'embed': j, 'text': chunks[id] } )

Once you generate the vector embeddings using the Amazon Titan Text Embeddings model, you can connect to your MemoryDB cluster and save these embeddings using the MemoryDB HSET command.

Step 3. Create a vector index
To query your vector data, create a vector index using theFT.CREATE command. Vector indexes are also constructed and maintained over a subset of the MemoryDB keyspace. Vectors can be saved in JSON or HASH data types, and any modifications to the vector data are automatically updated in a keyspace of the vector index.

from redis.commands.search.field import TextField, VectorField

index = client.ft(idx:testIndex).create_index([
        VectorField(
            "embed",
            "FLAT",
            {
                "TYPE": "FLOAT32",
                "DIM": 1536,
                "DISTANCE_METRIC": "COSINE",
            }
        ),
        TextField("text")
        ]
    )

In MemoryDB, you can use four types of fields: numbers fields, tag fields, text fields, and vector fields. Vector fields support K-nearest neighbor searching (KNN) of fixed-sized vectors using the flat search (FLAT) and hierarchical navigable small worlds (HNSW) algorithm. The feature supports various distance metrics, such as euclidean, cosine, and inner product. We will use the euclidean distance, a measure of the angle distance between two points in vector space. The smaller the euclidean distance, the closer the vectors are to each other.

Step 4. Search the vector space
You can use FT.SEARCH and FT.AGGREGATE commands to query your vector data. Each operator uses one field in the index to identify a subset of the keys in the index. You can query and find filtered results by the distance between a vector field in MemoryDB and a query vector based on some predefined threshold (RADIUS).

from redis.commands.search.query import Query

# Query vector data
query = (
    Query("@vector:[VECTOR_RANGE $radius $vec]=>{$YIELD_DISTANCE_AS: score}")
     .paging(0, 3)
     .sort_by("vector score")
     .return_fields("id", "score")     
     .dialect(2)
)

# Find all vectors within 0.8 of the query vector
query_params = {
    "radius": 0.8,
    "vec": np.random.rand(VECTOR_DIMENSIONS).astype(np.float32).tobytes()
}

results = client.ft(index).search(query, query_params).docs

For example, when using cosine similarity, the RADIUS value ranges from 0 to 1, where a value closer to 1 means finding vectors more similar to the search center.

Here is an example result to find all vectors within 0.8 of the query vector.

[Document {'id': 'doc:a', 'payload': None, 'score': '0.243115246296'},
 Document {'id': 'doc:c', 'payload': None, 'score': '0.24981123209'},
 Document {'id': 'doc:b', 'payload': None, 'score': '0.251443207264'}]

To learn more, you can look at a sample generative AI application using RAG with MemoryDB as a vector store.

What’s new at GA
At re:Invent 2023, we released vector search for MemoryDB in preview. Based on customers’ feedback, here are the new features and improvements now available:

  • VECTOR_RANGE to allow MemoryDB to operate as a low latency durable semantic cache, enabling cost optimization and performance improvements for your generative AI applications.
  • SCORE to better filter on similarity when conducting vector search.
  • Shared memory to not duplicate vectors in memory. Vectors are stored within the MemoryDB keyspace and pointers to the vectors are stored in the vector index.
  • Performance improvements at high filtering rates to power the most performance-intensive generative AI applications.

Now available
Vector search is available in all Regions that MemoryDB is currently available. Learn more about vector search for Amazon MemoryDB in the AWS documentation.

Give it a try in the MemoryDB console and send feedback to the AWS re:Post for Amazon MemoryDB or through your usual AWS Support contacts.

Channy

In the Works – AWS Region in Taiwan

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/in-the-works-aws-region-in-taiwan/

Today, we’re announcing that a new AWS Region will be coming to Taiwan by early 2025. The new AWS Asia Pacific (Taipei) Region will consist of three Availability Zones at launch, and will give AWS customers in Taiwan the ability to run workloads and store data that must remain in Taiwan.

Each of the Availability Zones will be physically independent of the others in the Region – close enough to support applications that need low latency, yet sufficiently distant to significantly reduce the risk that an event at an Availability Zone level might impact business continuity.

The Availability Zones in this Region will be connected together through high-bandwidth, low-latency network connections over dedicated, fully redundant fiber. This connectivity supports applications that need synchronous replication between Availability Zones for availability or redundancy. You can take a peek at the AWS Global Infrastructure page to learn more about how we design and build Regions and Availability Zones.

We are currently working on Regions in Malaysia, Mexico, New Zealand, the Kingdom of Saudi Arabia, Thailand, and the AWS European Sovereign Cloud. The AWS Cloud operates 105 Availability Zones within 33 AWS Regions around the world, with announced plans for 21 more Availability Zones and seven more Regions, including Taiwan.

AWS in Taiwan
AWS has been investing and supporting customers and partners in Taiwan for more than 10 years. To support our customers in Taiwan, we have business development teams, solutions architects, partner managers, professional services consultants, support staff, and various other roles working in our Taipei office.

Other AWS infrastructure includes two Amazon CloudFront edge locations along with access to the AWS global backbone through multiple redundant submarine cables. You can access any other AWS Region (except Beijing and Ningxia) from AWS Direct Connect locations in Taipei, operated by Chief Telecom and Chunghwa Telecom. With AWS Direct Connect, your data that would have previously been transported over the internet is delivered through a private network connection between your facilities and AWS.

You can also use AWS Outposts in Taiwan, a family of fully managed solutions delivering AWS infrastructure and services to virtually any on-premises or edge location for a truly consistent hybrid experience. With AWS Local Zones in Taipei, you can deliver applications that require single-digit millisecond latency to end users.

AWS continues to invest in upskilling students, local developers and technical professionals, nontechnical professionals, and the next generation of IT leaders in Taiwan through offerings like AWS AcademyAWS Educate, and AWS Skill Builder. Since 2017, AWS has trained more than eight million people across the Asia Pacific-Japan region on cloud skills, including more than 100,000 people in Taiwan.

To learn more, join AWS Summit 2024 Taiwan in July; in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS.

AWS customers in Taiwan
AWS customers in Taiwan have been increasingly moving their applications to AWS and running their technology infrastructure in other AWS Regions around the world. With the addition of this new AWS Region, customers will be able to provide even lower latency to end users and use advanced technologies such as generative artificial intelligence (generative AI), Internet of Things (IoT), mobile services, and more, to drive innovation. This Region will give AWS customers the ability to run their workloads and store their content in Taiwan.

Here are some examples of customers using AWS to drive innovation:

Chunghwa Telecom is the largest integrated telecom provider in Taiwan. To improve AI data security and governance, they use Amazon Bedrock for a variety of generative AI applications, including automatically generating specifications documents for the software development lifecycle and crafting custom marketing campaigns. With Amazon Bedrock, Chunghwa Telecom is saving developer hours and has also developed an immersive, interactive virtual English teacher for the first time.

Gamania Group is a leader in the development and publication of online games in Taiwan. To maximize the value of running on AWS, they worked with AWS Training and Certification to enhance AWS skills across all of its departments, such as AWS Classroom training, AWS Well-Architected Framework, and AWS GameDay events. As a result, they reduced the time needed to make critical operational decisions by 50 percent, lowered its time-to-market by up to 70 percent, and accelerated the launch of new games.

KKCompany Technologies is a multimedia technology group building a music streaming platform, AI-powered streaming solution, and cloud intelligence service in Taiwan. The company specializes in generative AI, multimedia technology, and digital transformation consulting services for enterprises in Taiwan and Japan. You can find BlendVision, a cloud-based streaming solution in AWS Marketplace.

To learn more about Taiwan customer cases, visit AWS Customer Success Stories in English or our Traditional Chinese page.

Stay Tuned
We’ll announce the opening of this and the other Regions in future blog posts, so be sure to stay tuned! To learn more, visit the AWS Region in Taiwan page in Traditional Chinese.

Channy

Introducing Amazon GuardDuty Malware Protection for Amazon S3

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/introducing-amazon-guardduty-malware-protection-for-amazon-s3/

Today we are announcing the general availability of Amazon GuardDuty Malware Protection for Amazon Simple Storage Service (Amazon S3), an expansion of GuardDuty Malware Protection to detect malicious file uploads to selected S3 buckets. Previously, GuardDuty Malware Protection provided agentless scanning capabilities to identify malicious files on Amazon Elastic Block Store (Amazon EBS) volumes attached to Amazon Elastic Compute Cloud (Amazon EC2) and container workloads.

Now, you can continuously evaluate new objects uploaded to S3 buckets for malware and take action to isolate or eliminate any malware found. Amazon GuardDuty Malware Protection uses multiple Amazon Web Services (AWS) developed and industry-leading third-party malware scanning engines to provide malware detection without degrading the scale, latency, and resiliency profile of Amazon S3.

With GuardDuty Malware Protection for Amazon S3, you can use built-in malware and antivirus protection on your designated S3 buckets to help you remove the operational complexity and cost overhead associated with automating malicious file evaluation at scale. Unlike many existing tools used for malware analysis, this managed solution from GuardDuty does not require you to manage your own isolated data pipelines or compute infrastructure in each AWS account and AWS Region where you want to perform malware analysis.

Your development and security teams can work together to configure and oversee malware protection throughout your organization for select buckets where new uploaded data from untrusted entities is required to be scanned for malware. You can configure post-scan action in GuardDuty, such as object tagging, to inform downstream processing, or consume the scan status information provided through Amazon EventBridge to implement isolation of malicious uploaded objects.

Getting started with GuardDuty Malware Protection for your S3 bucket
To get started, in the GuardDuty console, select Malware Protection for S3 and choose Enable.

Enter the S3 bucket name or choose Browse S3 to select an S3 bucket name from a list of buckets that belong to the currently selected Region. You can select All the objects in the S3 bucket when you want GuardDuty to scan all the newly uploaded objects in the selected bucket. Or you can also select Objects beginning with a specific prefix when you want to scan the newly uploaded objects that belong to a specific prefix.

After scanning a newly uploaded S3 object, GuardDuty can add a predefined tag with the key as GuardDutyMalwareScanStatus and the value as the scan status:

  • NO_THREATS_FOUND – No threat found in the scanned object.
  • THREATS_FOUND – Potential threat detected during scan.
  • UNSUPPORTED – GuardDuty cannot scan this object because of size.
  • ACCESS_DENIED – GuardDuty cannot access object. Check permissions.
  • FAILED – GuardDuty could not scan the object.

When you want GuardDuty to add tags to your scanned S3 objects, select Tag objects. If you use tags, you can create policies to prevent objects from being accessed before the malware scan completes and prevent your application from accessing malicious objects.

Now, you must first create and attach an AWS Identity and Access Management (IAM) role that includes the required permissions:

  • EventBridge actions to create and manage the EventBridge managed rule so that Malware Protection for S3 can listen to your S3 Event Notifications.
  • Amazon S3 and EventBridge actions to send S3 Event Notifications to EventBridge for all events in this bucket.
  • Amazon S3 actions to access the uploaded S3 object and add a predefined tag to the scanned S3 object.
  • AWS Key Management Service (AWS KMS) key actions to access the object before scanning and putting a test object on buckets with the supported DSSE-KMS and SSE-KMS

To add these permissions, choose View permissions and copy the policy template and trust relationship template. These templates include placeholder values that you should replace with the appropriate values associated with your bucket and AWS account. You should also replace the placeholder value for the AWS KMS key ID.

Now, choose Attach permissions, which opens the IAM console in a new tab. You can choose to create a new IAM role or update an existing IAM role with the permissions from the copied templates. If you want to create or update your IAM role in advance, visit Prerequisite – Create or update IAM PassRole policy in the AWS documentation.

Finally, go back to the GuardDuty browser tab that has the IAM console open, choose your created or updated IAM role, and choose Enable.

Now, you will see Active in the protection Status column for this protected bucket.

Choose View all S3 malware findings to see the generated GuardDuty findings associated with your scanned S3 bucket. If you see the finding type S3Object:S3/MaliciousFile, GuardDuty has detected the listed S3 object as malicious. Choose the Threats detected section in the Findings details panel and follow the recommended remediation steps. To learn more, visit Remediating a potentially malicious S3 object in the AWS documentation.

Things to know
You can set up GuardDuty Malware Protection for your S3 buckets even without GuardDuty enabled for your AWS account. However, if you enable GuardDuty in your account, you can use the full monitoring of foundational sources, such as AWS CloudTrail management events, Amazon Virtual Private Cloud (Amazon VPC) Flow Logs, and DNS query logs, as well as malware protection features. You can also have security findings sent to AWS Security Hub and Amazon Detective for further investigation.

GuardDuty can scan files belonging to the following synchronous Amazon S3 storage classes: S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA, S3 One Zone-IA, and Amazon S3 Glacier Instant Retrieval. It will scan the file formats known to be used to spread or contain malware. At the launch, the feature supports file sizes up to 5 GB, including archive files with up to five levels and 1,000 files per level after it is decompressed.

As I said, GuardDuty will send scan metrics to your EventBridge for each protected S3 bucket. You can set up alarms and define post-scan actions, such as tagging the object or moving the malicious object to a quarantine bucket. To learn more about other monitoring options, such as Amazon CloudWatch metrics and S3 object tagging, visit Monitoring S3 object scan status in the AWS documentation.

Now available
Amazon GuardDuty Malware Protection for Amazon S3 is generally available today in all AWS Regions where GuardDuty is available, excluding China Regions and GovCloud (US) Regions.

The pricing is based on the GB volume of the objects scanned and number of objects evaluated per month. This feature comes with a limited AWS Free Tier, which includes 1,000 requests and 1 GB each month, pursuant to conditions for the first 12 months of account creation for new AWS accounts, or until June 11, 2025, for existing AWS accounts. To learn more, visit the Amazon GuardDuty pricing page.

Give GuardDuty Malware Protection for Amazon S3 a try in the GuardDuty console. For more information, visit the Amazon GuardDuty User Guide and send feedback to AWS re:Post for Amazon GuardDuty or through your usual AWS support contacts.

Channy

AWS Weekly Roundup: Amazon EC2 U7i Instances, Bedrock Converse API, AWS World IPv6 Day and more (June 3, 2024)

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-ec2-u7i-instances-bedrock-converse-api-aws-world-ipv6-day-and-more-june-3-2024/

Life is not always happy, there are difficult times. However, we can share our joys and sufferings with those we work with. The AWS Community is no exception.

Jeff Barr introduced two members of the AWS community who are dealing with health issues. Farouq Mousa is an AWS Community Builder and fighting brain cancer. Allen Helton is an AWS Serverless Hero and his young daughter is fighting leukemia.

Please donate to support Farauq and Olivia, Allen’s daughter to overcome their disease.

Last week’s launches
Here are some launches that got my attention:

Amazon EC2 high memory U7i Instances – These instances with up to 32 TiB of DDR5 memory and 896 vCPUs are powered by custom fourth generation Intel Xeon Scalable Processors (Sapphire Rapids). These high memory instances are designed to support large, in-memory databases including SAP HANA, Oracle, and SQL Server. To learn more, visit Jeff’s blog post.

New Amazon Connect analytics data lake – You can use a single source for contact center data including contact records, agent performance, Contact Lens insights, and more — eliminating the need to build and maintain complex data pipelines. Your organization can create your own custom reports using Amazon Connect data or combine data queried from third-party sources. To learn more, visit Donnie’s blog post.

Amazon Bedrock Converse API – This API provides developers a consistent way to invoke Amazon Bedrock models removing the complexity to adjust for model-specific differences such as inference parameters. With this API, you can write a code once and use it seamlessly with different models in Amazon Bedrock. To learn more, visit Dennis’s blog post to get started.

New Document widget for PartyRock – You can build, use, and share generative AI-powered apps for fun and for boosting personal productivity, using PartyRock. Its widgets display content, accept input, connect with other widgets, and generate outputs like text, images, and chats using foundation models. You can now use new document widget to integrate text content from files and documents directly into a PartyRock app.

30 days of alarm history in Amazon CloudWatch – You can view the history of your alarm state changes for up to 30 days prior. Previously, CloudWatch provided 2 weeks of alarm history. This extended history makes it easier to observe past behavior and review incidents over a longer period of time. To learn more, visit the CloudWatch alarms documentation section.

10x faster startup time in Amazon SageMaker Canvas – You can launch SageMaker Canvas in less than a minute and get started with your visual, no-code interface for machine learning 10x faster than before. Now, all new user profiles created in existing or new SageMaker domains can experience this accelerated startup time.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional news items and a Twitch show that you might find interesting:

Let us manage your relational database! – Jeff Barr ran a poll to better understand why some AWS customers still choose to host their own databases in the cloud. Working backwards, he highlights four issues that AWS managed database services address. Consider these before hosting your own database.

Amazon Bedrock Serverless Prompt Chaining – This repository provides examples of using AWS Step Functions and Amazon Bedrock to build complex, serverless, and highly scalable generative AI applications with prompt chaining.

AWS Merch Store Spring Sale – Do you want to buy AWS branded t-shirts, hats, bags, and so on? Get 15% off on all items now through June 7th.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS World IPv6 Day — Join us a free in-person celebration event on June 6, for technical presentations from AWS experts plus a workshop and whiteboarding session. You will learn how to get started with IPv6 and hear from customers who have started on the journey of IPv6 adoption. Check out your near city: San Francisco, Seattle, New YorkLondon, Mumbai, Bangkok, Singapore, Kuala Lumpur, Beijing, Manila, and Sydney.

AWS Summits — Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Stockholm (June 4), Madrid (June 5), and Washington, DC (June 26–27).

AWS re:Inforce — Join us for AWS re:Inforce (June 10–12) in Philadelphia, PA. AWS re:Inforce is a learning conference focused on AWS security solutions, cloud security, compliance, and identity. Connect with the AWS teams that build the security tools and meet AWS customers to learn about their security journeys.

AWS Community Days — Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Midwest | Columbus (June 13), Sri Lanka (June 27), Cameroon (July 13), New Zealand (August 15), Nigeria (August 24), and New York (August 28).

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Channy

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

A new generative engine and three voices are now generally available on Amazon Polly

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/a-new-generative-engine-and-three-voices-are-now-generally-available-on-amazon-polly/

Today, we are announcing the general availability of the generative engine of Amazon Polly with three voices: Ruth and Matthew in American English and Amy in British English. The new generative engine was trained with publicly available and proprietary data, a variety of voices, languages, and styles. It performs with the highest precision to render context-dependent prosody, pausing, spelling, dialectal properties, foreign word pronunciation, and more.

Amazon Polly is a machine learning (ML) service that converts text to lifelike speech, called text-to-speech (TTS) technology. Now, Amazon Polly includes high-quality, natural-sounding human-like voices in dozens of languages, so you can select the ideal voice and distribute your speech-enabled applications in many locales or countries.

With Amazon Polly, you can select various voice options, including neural, long-form, and generative voices, which deliver ground-breaking improvements in speech quality and produce human-like, highly expressive, and emotionally adept voices. You can store speech output in standard formats like MP3 or OGG, adjust the speech rate, pitch, or volume with Speech Synthesis Markup Language (SSML) tags, and quickly deliver lifelike voices and conversational user experiences with consistently fast response times.

What’s the new generative engine?
Amazon Polly now supports four voice engines: standard, neural, long-form, and generative voices.

Standard TTS voices, introduced in 2016 use traditional concatenative synthesis. This method strings together the phonemes of recorded speech, producing very natural-sounding synthesized speech. However, the inevitable variations in speech and the techniques used to segment the waveforms limit the quality of speech.

Neural TTS (NTTS) voices, introduced in 2019, use a sequence-to-sequence neural network that converts a sequence of phonemes into spectrograms, and a neural vocoder that converts the spectrograms into a continuous audio signal. The NTTS produces even higher quality human-like voices than its standard voices.

Long-form voices, introduced in 2023, are developed with cutting-edge deep learning TTS technology and designed to captivate listeners’ attention for longer content, such as news articles, training materials, or marketing videos.

In February 2024, Amazon scientists introduced a new research TTS model called Big Adaptive Streamable TTS with Emergent abilities (BASE). With this technology, Polly Generative engine is able to create human-like synthetically generated voices. You can use these voices as a knowledgeable customer assistant, a virtual trainer, or an experienced marketer.

Here are the new generative voices:

Name Locale Gender Language Sample prompt NTTS voices
Generative voices
Ruth en_US Female English (US) Selma was lying on the ground halfway down the steps. 'Selma! Selma!' we shouted in panic.
Matthew en_US Male English (US) The guards were standing outside with some of our neighbours, listening to a transistor radio. 'Any good news?' I asked. 'No, we're listening to the names of people who were killed yesterday,' Bruno replied.
Amy en_GB Female English (British) What are you looking at?' he said as he stood over me. They got off the bus and started searching the baggage compartment. The tension on the bus was like a dark, menacing cloud that hovered above us.

You can choose from these voice options to suit your application and use case. To learn more about the generative engine, visit Generative voices in the AWS documentation.

Get started with using generative voices
You can access the new voices using the AWS Management Console, AWS Command Line Interface (AWS CLI), or the AWS SDKs.

To get started, go to the Amazon Polly console in the US (N. Virginia) Region and choose Text-to-Speech menu in the left pane. If you select the voice of Ruth or Matthew in the language of English, US or Amy in English, UK, you can choose Generative engine. Input your text and listen to or download the generated voice output.

Using the CLI, you can list the voices that use the new generative engine:

$ aws polly describe-voices --output json --region us-east-1 \
| jq -r '.Voices[] | select(.SupportedEngines | index("generative")) | .Name'

Matthew
Amy
Ruth

Now, run the synthesize-speech CLI command to synthesize sample text to an audio file (hello.mp3) with the parameters of generative engine and a supported voice ID.

$ aws polly synthesize-speech --output-format mp3 --region us-east-1 \
  --text "Hello. This is my first generative voices!" \
  --voice-id Matthew --engine generative hello.mp3

To learn more code examples using AWS SDKs, visit Code and Application Examples in the AWS documentation. You can use Java and Python code examples, application examples such as web applications using Java or Python, or iOS and Android applications.

Now available
The new generative voices of Amazon Polly are now available today in the US East (N. Virginia) Region. You only pay for what you use based on the number of characters of text that you convert to speech. To learn more, visit our Amazon Polly Pricing page.

Give new generative voices a try in the Amazon Polly console today and send feedback to AWS re:Post for Amazon Polly or through your usual AWS Support contacts.

Channy

Amazon Q Business, now generally available, helps boost workforce productivity with generative AI

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/amazon-q-business-now-generally-available-helps-boost-workforce-productivity-with-generative-ai/

At AWS re:Invent 2023, we previewed Amazon Q Business, a generative artificial intelligence (generative AI)–powered assistant that can answer questions, provide summaries, generate content, and securely complete tasks based on data and information in your enterprise systems.

With Amazon Q Business, you can deploy a secure, private, generative AI assistant that empowers your organization’s users to be more creative, data-driven, efficient, prepared, and productive. During the preview, we heard lots of customer feedback and used that feedback to prioritize our enhancements to the service.

Today, we are announcing the general availability of Amazon Q Business with many new features, including custom plugins, and a preview of Amazon Q Apps, generative AI–powered customized and sharable applications using natural language in a single step for your organization.

In this blog post, I will briefly introduce the key features of Amazon Q Business with the new features now available and take a look at the features of Amazon Q Apps. Let’s get started!

Introducing Amazon Q Business
Amazon Q Business connects seamlessly to over 40 popular enterprise data sources and stores document and permission information, including Amazon Simple Storage Service (Amazon S3), Microsoft 365, and Salesforce. It ensures that you access content securely with existing credentials using single sign-on, according to your permissions, and also includes enterprise-level access controls.

Amazon Q Business makes it easy for users to get answers to questions like company policies, products, business results, or code, using its web-based chat assistant. You can point Amazon Q Business at your enterprise data repositories, and it’ll search across all data, summarize logically, analyze trends, and engage in dialog with users.

With Amazon Q Business, you can build secure and private generative AI assistants with enterprise-grade access controls at scale. You can also use administrative guardrails, document enrichment, and relevance tuning to customize and control responses that are consistent with your company’s guidelines.

Here are the key features of Amazon Q Business with new features now available:

End-user web experience
With the built-in web experience, you can ask a question, receive a response, and then ask follow-up questions and add new information with in-text source citations while keeping the context from the previous answer. You can only get a response from data sources that you have access to.

With general availability, we’re introducing a new content creation mode in the web experience. In this mode, Amazon Q Business does not use or access the enterprise content but instead uses generative AI models built into Amazon Q Business for creative use cases such as summarization of responses and crafting personalized emails. To use the content creation mode, you can turn off Respond from approved sources in the conversation settings.

To learn more, visit Using an Amazon Q Business web experience and Customizing an Amazon Q Business web experience in the AWS documentation.

Pre-built data connectors and plugins
You can connect, index, and sync your enterprise data using over 40 pre-built data connectors or an Amazon Kendra retriever, as well as web crawling or uploading your documents directly.

Amazon Q Business ingests content using a built-in semantic document retriever. It also retrieves and respects permission information such as access control lists (ACLs) to allow it to manage access to the data after retrieval. When the data is ingested, your data is secured with the Service-managed key of AWS Key Management Service (AWS KMS).

You can configure plugins to perform actions in enterprise systems, including Jira, Salesforce, ServiceNow, and Zendesk. Users can create a Jira issue or a Salesforce case while chatting in the chat assistant. You can also deploy a Microsoft Teams gateway or a Slack gateway to use an Amazon Q Business assistant in your teams or channels.

With general availability, you can build custom plugins to connect to any third-party application through APIs so that users can use natural language prompts to perform actions such as submitting time-off requests or sending meeting invites directly through Amazon Q Business assistant. Users can also search real-time data, such as time-off balances, scheduled meetings, and more.

When you choose Custom plugin, you can define an OpenAPI schema to connect your third-party application. You can upload the OpenAPI schema to Amazon S3 or copy it to the Amazon Q Business console in-line schema editor compatible with the Swagger OpenAPI specification.

To learn more, visit Data source connectors and Configure plugins in the AWS documentation.

Admin control and guardrails
You can configure global controls to give users the option to either generate large language model (LLM)-only responses or generate responses from connected data sources. You can specify whether all chat responses will be generated using only enterprise data or whether your application can also use its underlying LLM to generate responses when it can’t find answers in your enterprise data. You can also block specific words.

With topic-level controls, you can specify restricted topics and configure behavior rules in response to the topics, such as answering using enterprise data or blocking completely.

To learn more, visit Admin control and guardrails in the AWS documentation.

You can alter document metadata or attributes and content during the document ingestion process by configuring basic logic to specify a metadata field name, select a condition, and enter or select a value and target actions, such as update or delete. You can also use AWS Lambda functions to manipulate document fields and content, such as using optical character recognition (OCR) to extract text from images.

To learn more, visit Document attributes and types in Amazon Q Business and Document enrichment in Amazon Q Business in the AWS documentation.

Enhanced enterprise-grade security and management
Starting April 30, you will need to use AWS IAM Identity Center for user identity management of all new applications rather than using the legacy identity management. You can securely connect your workforce to Amazon Q Business applications either in the web experience or your own interface.

You can also centrally manage workforce access using IAM Identity Center alongside your existing IAM roles and policies. As the number of your accounts scales, IAM Identity Center gives you the option to use it as a single place to manage user access to all your applications. To learn more, visit Setting up Amazon Q Business with IAM Identity Center in the AWS documentation.

At general availability, Amazon Q Business is now integrated with various AWS services to securely connect and store the data and easily deploy and track access logs.

You can use AWS PrivateLink to access Amazon Q Business securely in your Amazon Virtual Private Cloud (Amazon VPC) environment using a VPC endpoint. You can use the Amazon Q Business template for AWS CloudFormation to easily automate the creation and provisioning of infrastructure resources. You can also use AWS CloudTrail to record actions taken by a user, role, or AWS service in Amazon Q Business.

Also, we support Federal Information Processing Standards (FIPS) endpoints, based on the United States and Canadian government standards and security requirements for cryptographic modules that protect sensitive information.

To learn more, visit Security in Amazon Q Business and Monitoring Amazon Q Business in the AWS documentation.

Build and share apps with new Amazon Q Apps (preview)
Today we are announcing the preview of Amazon Q Apps, a new capability within Amazon Q Business for your organization’s users to easily and quickly create generative AI-powered apps based on company data, without requiring any prior coding experience.

With Amazon Q Apps, users simply describe the app they want, in natural language, or they can take an existing conversation where Amazon Q Business helped them solve a problem. With a few clicks, Amazon Q Business will instantly generate an app that accomplishes their desired task that can be easily shared across their organization.

If you are familiar with PartyRock, you can easily use this code-free builder with the added benefit of connecting it to your enterprise data already with Amazon Q Business.

To create a new Amazon Q App, choose Apps in your web experience and enter a simple text expression for a task in the input box. You can try out samples, such as a content creator, interview question generator, meeting note summarizer, and grammar checker.

I will make a document assistant to review and correct a document using the following prompt:

You are a professional editor tasked with reviewing and correcting a document for grammatical errors, spelling mistakes, and inconsistencies in style and tone. Given a file, your goal is to recommend changes to ensure that the document adheres to the highest standards of writing while preserving the author’s original intent and meaning. You should provide a numbered list for all suggested revisions and the supporting reason.

When you choose the Generate button, a document editing assistant app will be automatically generated with two cards—one to upload a document file as an input and another text output card that gives edit suggestions.

When you choose the Add card button, you can add more cards, such as a user input, text output, file upload, or pre-configured plugin by your administrator. If you want to create a Jira ticket to request publishing a post in the corporate blog channel as an author, you can add a Jira Plugin with the result of edited suggestions from the uploaded file.

Once you are ready to share the app, choose the Publish button. You can securely share this app to your organization’s catalog for others to use, enhancing productivity. Your colleagues can choose shared apps, modify them, and publish their own versions to the organizational catalog instead of starting from scratch.

Choose Library to see all of the published Amazon Q Apps. You can search the catalog by labels and open your favorite apps.

Amazon Q Apps inherit robust security and governance controls from Amazon Q Business, including user authentication and access controls, which empower organizations to safely share apps across functions that warrant governed collaboration and innovation.

In the administrator console, you can see your Amazon Q Apps and control or remove them from the library.

To learn more, visit Amazon Q Apps in the AWS documentation.

Now available
Amazon Q Business is generally available today in the US East (N. Virginia) and US West (Oregon) Regions. We are launching two pricing subscription options.

The Amazon Q Business Lite ($3/user/month) subscription provides users access to the basic functionality of Amazon Q Business.

The Amazon Business Pro ($20/user/month) subscription gets users access to all features of Amazon Q Business, as well as Amazon Q Apps (preview) and Amazon Q in QuickSight (Reader Pro), which enhances business analyst and business user productivity using generative business intelligence capabilities.

You can use the free trial (50 users for 60 days) to experiment with Amazon Q Business. For more information about pricing options, visit Amazon Q Business Plan page.

To learn more about Amazon Q Business, you can study Amazon Q Business Getting Started, a free, self-paced digital course on AWS Skill Builder and Amazon Q Developer Center to get more sample codes.

Give it a try in the Amazon Q Business console today! For more information, visit the Amazon Q Business product page and the User Guide in the AWS documentation. Provide feedback to AWS re:Post for Amazon Q or through your usual AWS support contacts.

Channy

Meta’s Llama 3 models are now available in Amazon Bedrock

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/metas-llama-3-models-are-now-available-in-amazon-bedrock/

Today, we are announcing the general availability of Meta’s Llama 3 models in Amazon Bedrock. Meta Llama 3 is designed for you to build, experiment, and responsibly scale your generative artificial intelligence (AI) applications. New Llama 3 models are the most capable to support a broad range of use cases with improvements in reasoning, code generation, and instruction.

According to Meta’s Llama 3 announcement, the Llama 3 model family is a collection of pre-trained and instruction-tuned large language models (LLMs) in 8B and 70B parameter sizes. These models have been trained on over 15 trillion tokens of data—a training dataset seven times larger than that used for Llama 2 models, including four times more code, which supports an 8K context length that doubles the capacity of Llama 2.

You can now use two new Llama 3 models in Amazon Bedrock, further increasing model choice within Amazon Bedrock. These models provide the ability for you to easily experiment with and evaluate even more top foundation models (FMs) for your use case:

  • Llama 3 8B is ideal for limited computational power and resources, and edge devices. The model excels at text summarization, text classification, sentiment analysis, and language translation.
  • Llama 3 70B is ideal for content creation, conversational AI, language understanding, research development, and enterprise applications. The model excels at text summarization and accuracy, text classification and nuance, sentiment analysis and nuance reasoning, language modeling, dialogue systems, code generation, and following instructions.

Meta is also currently training additional Llama 3 models over 400B parameters in size. These 400B models will have new capabilities, including multimodality, multiple languages support, and a much longer context window. When released, these models will be ideal for content creation, conversational AI, language understanding, research and development (R&D), and enterprise applications.

Llama 3 models in action
If you are new to using Meta models, go to the Amazon Bedrock console and choose Model access on the bottom left pane. To access the latest Llama 3 models from Meta, request access separately for Llama 3 8B Instruct or Llama 3 70B Instruct.

To test the Meta Llama 3 models in the Amazon Bedrock console, choose Text or Chat under Playgrounds in the left menu pane. Then choose Select model and select Meta as the category and Llama 8B Instruct or Llama 3 70B Instruct as the model.

By choosing View API request, you can also access the model using code examples in the AWS Command Line Interface (AWS CLI) and AWS SDKs. You can use model IDs such as meta.llama3-8b-instruct-v1 or meta.llama3-70b-instruct-v1.

Here is a sample of the AWS CLI command:

$ aws bedrock-runtime invoke-model \
  --model-id meta.llama3-8b-instruct-v1:0 \
  --body "{\"prompt\":\"Simply put, the theory of relativity states that\\n the laws of physics are the same everywhere in the universe, and that the passage of time and the length of objects can vary depending on their speed and position in a gravitational field \",\"max_gen_len\":512,\"temperature\":0.5,\"top_p\":0.9}" \
  --cli-binary-format raw-in-base64-out \
  --region us-east-1 \
  invoke-model-output.txt

You can use code examples for Amazon Bedrock using AWS SDKs to build your applications with various programming languages. The following Python code examples show how to invoke the Llama 3 Chat model in Amazon Bedrock for text generation.

def invoke_llama3(self, prompt):
        try:
            body = {
                "prompt": prompt,
                "temperature": 0.5,
                "top_p": 0.9,
                "max_gen_len": 512,
            }

            response = self.bedrock_runtime_client.invoke_model(
                modelId="meta.llama3-8b-instruct-v1:0", body=json.dumps(body)
            )

            response_body = json.loads(response["body"].read())
            completion = response_body["generation"]

            return completion

        except ClientError:
            logger.error("Couldn't invoke Llama 3")
            raise

You can use these Llama 3 models for a variety of tasks, such as question answering, language translation, sentiment analysis, and more. Here is an example of Llama 3 models with few-shot in-context learning where we provide training samples available to the model.

Input: Translate English to French:
sea otter => loutre de mer
peppermint => menthe poivrée
plush girafe => girafe peluche
cheese =>
Output:  fromage
panda => panda
penguin => pingouin
polar bear => ours polaire
puffin => macareux
sloth => paresseux
walrus => morse
whale => baleine
zebra => z\u00e8bre

Translate English to German:
sea otter => Seeotter

You can also use Llama 3 instruct models optimized for dialogue use cases. The input to the instruct model endpoints is the previous history between the chat assistant and the user. So, you can ask questions contextual to the conversation that has happened so far and provide the system configuration, such as personas, which define the chat assistant’s behavior in the following format:

<|begin_of_text|><|start_header_id|>system<|end_header_id|>
[SYSTEM_TEXT]<|eot_id|><|start_header_id|>user<|end_header_id|>
[USER_TEXT]<|eot_id|><|start_header_id|>assistant<|end_header_id|>

In the following example, the user has a conversation with the assistant about tourist sites in Paris. Then, the user inquires about the first option recommended by the chat assistant.

Input: <|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n
I am going to Paris, what should I see?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n
Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:\n\n
1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.\n
2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.\n
3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.\n\n
These are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n
What is so great about #1?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n
Output: The Eiffel Tower! It's an iconic symbol of Paris and one of the most recognizable landmarks in the world. Here are some reasons why it's so great:\n\n
1. **Engineering marvel**: When it was built for the 1889 World's Fair, the Eiffel Tower was a groundbreaking feat of engineering. It was the tallest structure in the world at the time, standing at 324 meters (1,063 feet).\n
2. **Panoramic views**: The Eiffel Tower offers breathtaking 360-degree views of the city from its observation decks. You can see many of Paris' famous landmarks, such as the Seine River, the Louvre, and Montmartre.\n
3. **Romantic atmosphere**: The Eiffel Tower is often associated with romance and is a popular spot for proposals, weddings, and romantic dinners. The tower's iron latticework and twinkling lights at night add to its romantic ambiance.\n
4. **Historical significance**: The Eiffel Tower has played a significant role in French history, serving as a communication tower during World War I and a symbol of French culture and engineering.\n
5. **Accessibility**: The Eiffel Tower is easily accessible by public transportation or on foot, and there are various ways to experience it, including taking the stairs, elevator, or a guided tour.\n
6. **Nighttime spectacle**: The Eiffel Tower is even more magical at night, when it's illuminated with thousands of twinkling lights. The light show, known as the \"Eiffel Tower Sparkle,\" takes place every hour on the hour from dusk to midnight.\n
7. **Iconic status**: The Eiffel Tower has been featured in countless films, literature, and art, making it an instantly recognizable symbol of Paris and France.\n\n
Overall, the Eiffel Tower is a must-visit attraction in Paris, offering a unique combination of history, engineering, romance, and stunning views.

To learn more about the new prompt template and special tokens of Llama 3, check out Meta’s model cards and prompt formats or Llama Recipes in the GitHub repository.

Now available
Meta’s Llama 3 models are available today in Amazon Bedrock in the US East (N. Virginia) and US West (Oregon) Regions. Check the full Region list for future updates. To learn more, check out the Llama in Amazon Bedrock product page and pricing page.

Give Llama 3 a try in the Amazon Bedrock console today, and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

Visit our community.aws site to find deep-dive technical content and to discover how our Builder communities are using Amazon Bedrock in their solutions.

Channy

Anthropic’s Claude 3 Opus model is now available on Amazon Bedrock

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/anthropics-claude-3-opus-model-on-amazon-bedrock/

We are living in the generative artificial intelligence (AI) era; a time of rapid innovation. When Anthropic announced its Claude 3 foundation models (FMs) on March 4, we made Claude 3 Sonnet, a model balanced between skills and speed, available on Amazon Bedrock the same day. On March 13, we launched the Claude 3 Haiku model on Amazon Bedrock, the fastest and most compact member of the Claude 3 family for near-instant responsiveness.

Today, we are announcing the availability of Anthropic’s Claude 3 Opus on Amazon Bedrock, the most intelligent Claude 3 model, with best-in-market performance on highly complex tasks. It can navigate open-ended prompts and sight-unseen scenarios with remarkable fluency and human-like understanding, leading the frontier of general intelligence.

With the availability of Claude 3 Opus on Amazon Bedrock, enterprises can build generative AI applications to automate tasks, generate revenue through user-facing applications, conduct complex financial forecasts, and accelerate research and development across various sectors. Like the rest of the Claude 3 family, Opus can process images and return text outputs.

Claude 3 Opus shows an estimated twofold gain in accuracy over Claude 2.1 on difficult open-ended questions, reducing the likelihood of faulty responses. As enterprise customers rely on Claude across industries like healthcare, finance, and legal research, improved accuracy is essential for safety and performance.

How does Claude 3 Opus perform?
Claude 3 Opus outperforms its peers on most of the common evaluation benchmarks for AI systems, including undergraduate-level expert knowledge (MMLU), graduate-level expert reasoning (GPQA), basic mathematics (GSM8K), and more. It exhibits high levels of comprehension and fluency on complex tasks, leading the frontier of general intelligence.


Source: https://www.anthropic.com/news/claude-3-family

Here are a few supported use cases for the Claude 3 Opus model:

  • Task automation: planning and execution of complex actions across APIs, databases, and interactive coding
  • Research: brainstorming and hypothesis generation, research review, and drug discovery
  • Strategy: advanced analysis of charts and graphs, financials and market trends, and forecasting

To learn more about Claude 3 Opus’s features and capabilities, visit Anthropic’s Claude on Bedrock page and Anthropic Claude models in the Amazon Bedrock documentation.

Claude 3 Opus in action
If you are new to using Anthropic models, go to the Amazon Bedrock console and choose Model access on the bottom left pane. Request access separately for Claude 3 Opus.

2024-claude3-opus-2-model-access screenshot

To test Claude 3 Opus in the console, choose Text or Chat under Playgrounds in the left menu pane. Then choose Select model and select Anthropic as the category and Claude 3 Opus as the model.

To test more Claude prompt examples, choose Load examples. You can view and run examples specific to Claude 3 Opus, such as analyzing a quarterly report, building a website, and creating a side-scrolling game.

By choosing View API request, you can also access the model using code examples in the AWS Command Line Interface (AWS CLI) and AWS SDKs. Here is a sample of the AWS CLI command:

aws bedrock-runtime invoke-model \
     --model-id anthropic.claude-3-opus-20240229-v1:0 \
     --body "{\"messages\":[{\"role\":\"user\",\"content\":[{\"type\":\"text\",\"text\":\" Your task is to create a one-page website for an online learning platform.\\n\"}]}],\"anthropic_version\":\"bedrock-2023-05-31\",\"max_tokens\":2000,\"temperature\":1,\"top_k\":250,\"top_p\":0.999,\"stop_sequences\":[\"\\n\\nHuman:\"]}" \
     --cli-binary-format raw-in-base64-out \
     --region us-east-1 \
     invoke-model-output.txt

As I mentioned in my previous Claude 3 model launch posts, you need to use the new Anthropic Claude Messages API format for some Claude 3 model features, such as image processing. If you use Anthropic Claude Text Completions API and want to use Claude 3 models, you should upgrade from the Text Completions API.

My colleagues, Dennis Traub and Francois Bouteruche, are building code examples for Amazon Bedrock using AWS SDKs. You can learn how to invoke Claude 3 on Amazon Bedrock to generate text or multimodal prompts for image analysis in the Amazon Bedrock documentation.

Here is sample JavaScript code to send a Messages API request to generate text:

// claude_opus.js - Invokes Anthropic Claude 3 Opus using the Messages API.
import {
  BedrockRuntimeClient,
  InvokeModelCommand
} from "@aws-sdk/client-bedrock-runtime";

const modelId = "anthropic.claude-3-opus-20240229-v1:0";
const prompt = "Hello Claude, how are you today?";

// Create a new Bedrock Runtime client instance
const client = new BedrockRuntimeClient({ region: "us-east-1" });

// Prepare the payload for the model
const payload = {
  anthropic_version: "bedrock-2023-05-31",
  max_tokens: 1000,
  messages: [{
    role: "user",
    content: [{ type: "text", text: prompt }]
  }]
};

// Invoke Claude with the payload and wait for the response
const command = new InvokeModelCommand({
  contentType: "application/json",
  body: JSON.stringify(payload),
  modelId
});
const apiResponse = await client.send(command);

// Decode and print Claude's response
const decodedResponseBody = new TextDecoder().decode(apiResponse.body);
const responseBody = JSON.parse(decodedResponseBody);
const text = responseBody.content[0].text;
console.log(`Response: ${text}`);

Now, you can install the AWS SDK for JavaScript Runtime Client for Node.js and run claude_opus.js.

npm install @aws-sdk/client-bedrock-runtime
node claude_opus.js

For more examples in different programming languages, check out the code examples section in the Amazon Bedrock User Guide, and learn how to use system prompts with Anthropic Claude at Community.aws.

Now available
Claude 3 Opus is available today in the US West (Oregon) Region; check the full Region list for future updates.

Give Claude 3 Opus a try in the Amazon Bedrock console today and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

Channy

Introducing AWS Deadline Cloud: Set up a cloud-based render farm in minutes

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/introducing-aws-deadline-cloud-set-up-a-cloud-based-render-farm-in-minutes/

Customers in industries such as architecture, engineering, & construction (AEC) and media & entertainment (M&E) generate the final frames for film, TV, games, industrial design visualizations, and other digital media with a process called rendering, which takes 2D/3D digital content data and computes an output, such as an image or video file. Rendering also requires significant compute power, especially to generate 3D graphics and visual effects (VFX) with resolutions as high as 16K for films and TV. This constrains the number of rendering projects that customers can take on at once.

To address this growing demand for rendering high-resolution content, customers often build what are called “render farms” which combine the power of hundreds or thousands of computing nodes to process their rendering jobs. Render farms can traditionally take weeks or even months to build and deploy, and they require significant planning and upfront commitments to procure hardware.

As a result, customers increasingly are transitioning to scalable, cloud-based render farms for efficient production instead of a dedicated render farm on-premises, which can require extremely high fixed costs. But, rendering in the cloud still requires customers to manage their own infrastructure, build bespoke tooling to manage costs on a project-by-project basis, and monitor software licensing costs with their preferred partners themselves.

Today, we are announcing the general availability of AWS Deadline Cloud, a new fully managed service that enables creative teams to easily set up a render farm in minutes, scale to run more projects in parallel, and only pay for what resources they use. AWS Deadline Cloud provides a web-based portal with the ability to create and manage render farms, preview in-progress renders, view and analyze render logs, and easily track these costs.

With Deadline Cloud, you can go from zero to render faster with integrations of digital content creation (DCC) tools and customization tools are built-in. You can reduce the effort and development time required to tailor your rendering pipeline to the needs of each job. You also have the flexibility to use licenses you already own or they are provided by the service for third-party DCC software and renderers such as Maya, Nuke, and Houdini.

Concepts of AWS Deadline Cloud
AWS Deadline Cloud allows you to create and manage rendering projects and jobs on Amazon Elastic Compute Cloud (Amazon EC2) instances directly from DCC pipelines and workstations. You can create a rendering farm, a collection of queues, and fleets. A queue is where your submitted jobs are located and scheduled to be rendered. A fleet is a group of worker nodes that can support multiple queues. A queue can be processed by multiple fleets.

Before you can work on a project, you should have access to the required resources, and the associated farm must be integrated with AWS IAM Identity Center to manage workforce authentication and authorization. IT administrators can create and grant access permissions to users and groups at different levels, such as viewers, contributors, managers, or owners.

Here are four key components of Deadline Cloud:

  • Deadline Cloud monitor – You can access statuses, logs, and other troubleshooting metrics for jobs, steps, and tasks. The Deadline Cloud monitor provides real-time access and updates to job progress. It also provides access to logs and other troubleshooting metrics, and you can browse multiple farm, fleet, and queue listings to view system utilization.
  • Deadline Cloud submitter – You can submit a rendering job directly using AWS SDK or AWS Command Line Interface (AWS CLI). You can also submit from DCC software using a Deadline Cloud submitter, which is a DCC-integrated plugin that supports Open Job Description (OpenJD), an open source template specification. With it, artists can submit rendering jobs from a third-party DCC interface they are more familiar with, such as Maya or Nuke, to Deadline Cloud, where project resources are managed and jobs are monitored in one location.
  • Deadline Cloud budget manager – You can create and edit budgets to help manage project costs and view how many AWS resources are used and the estimated costs for those resources.
  • Deadline Cloud usage explorer – You can use the usage explorer to track approximate compute and licensing costs based on public pricing rates in Amazon EC2 and Usage-Based Licensing (UBL).

Get started with AWS Deadline Cloud
To get started with AWS Deadline Cloud, define and create a farm with Deadline Cloud monitor, download the Deadline Cloud submitter, and install plugins for your favorite DCC applications with just a few clicks. You can define your rendering jobs in your DCC application and submit them to your created farm within the plugin’s user interfaces.

The DCC plugins detect the necessary input scene data and build a job bundle that uploads to the Amazon Simple Storage Service (Amazon S3) bucket in your account, transfer to Deadline Cloud for rendering the job, and provide completed frames to the S3 bucket for your customers to access.

1. Define a farm with Deadline Cloud monitor
Let’s create your Deadline Cloud monitor infrastructure and define your farm first. In the Deadline Cloud console, choose Set up Deadline Cloud to define a farm with a guided experience, including queues and fleets, adding groups and users, choosing a service role, and adding tags to your resources.

In this step, to choose all the default settings for your Deadline Cloud resources, choose Skip to Review in Step 3 after monitor setup. Otherwise choose Next and customize your Deadline Cloud resources.

Set up your monitor’s infrastructure and enter your Monitor display name. This name makes the Monitor URL, a web portal to manage your farms, queues, fleets, and usages. You can’t change the monitor URL after you finish setting up. The AWS Region is the physical location of your rendering farm, so you should choose the closest Region from your studio to reduce the latency and improve data transfer speeds.

To access the monitor, you can create new users and groups and manage users (such as by assigning them groups, permissions, and applications) or delete users from your monitor. Users, groups, and permissions can also be managed in the IAM Identity Center. So, if you don’t set up the IAM Identity Center in your Region, you should enable it first. To learn more, visit Managing users in Deadline Cloud in the AWS documentation.

In Step 2, you can define farm details such as the name and description of your farm. In Additional farm settings, you can set an AWS Key Management Service (AWS KMS) key to encrypt your data and tags to assign AWS resources for filtering your resources or tracking your AWS costs. Your data is encrypted by default with a key that AWS owns and manages for you. To choose a different key, customize your encryption settings.

You can choose Skip to Review and Create to finish the quick setup process with the default settings.

Let’s look at more optional configurations! In the step for defining queue details, you can set up an S3 bucket for your queue. Job assets are uploaded as job attachments during the rendering process. Job attachments are stored in your defined S3 bucket. Additionally, you can set up the default budget action, service access roles, and environment variables for your queue.

In the step for defining fleet details, set the fleet name, description, Instance option (either Spot or On-Demand Instance), and Auto scaling configuration to define the number of instances and the fleet’s worker requirements. We set conservative worker requirements by default. These values can be updated at any time after setting up your render farm. To learn more, visit Manage Deadline Cloud fleets in the AWS documentation.

Worker instances define EC2 instance types with vCPUs and memory size, for example, c5.large, c5a.large, and c6i.large. You can filter up to 100 EC2 instance types by either allowing or excluding types of worker instances.

Review all of the information entered to create your farm and choose Create farm.

The progress of your Deadline Cloud onboarding is displayed, and a success message displays when your monitor and farm are ready for use. To learn more details about the process, visit Set up a Deadline Cloud monitor in the AWS documentation.

In the Dashboard in the left pane, you can visit the overview of the monitor, farms, users, and groups that you created.

Choose Monitor to visit a web portal to manage your farms, queues, fleets, usages, and budgets. After signing in to your user account, you can enter a web portal and explore the Deadline Cloud resources you created. You can also download a Deadline Cloud monitor desktop application with the same user experiences from the Downloads page.

To learn more about using the monitor, visit Using the Deadline Cloud monitor in the AWS documentation.

2. Set up a workstation and submit your render job to Deadline Cloud
Let’s set up a workstation for artists on their desktops by installing the Deadline Cloud submitter application so they can easily submit render jobs from within Maya, Nuke, and Houdini. Choose Downloads in the left menu pane and download the proper submitter installer for your operating system to test your render farm.

This program installs the latest integrated plugin for Deadline Cloud submitter for Maya, Nuke, and Houdini.

For example, open a Maya on your desktop and your asset. I have an example of a wrench file I’m going to test with. Choose Windows in the menu bar and Settings/Preferences in the sub menu. In the Plugin Manager, search for DeadlineCloudSubmitter. Select Loaded to load the Deadline Cloud submitter plugin.

If you are not already authenticated in the Deadline Cloud submitter, the Deadline Cloud Status tab will display. Choose Login and sign in with your user credentials in a browser sign-in window.

Now, select the Deadline Cloud shelf, then choose the orange deadline cloud logo on the ‘Deadline’ shelf to launch the submitter. From the submitter window, choose the farm and queue you want your render submitted to. If desired, in the Scene Settings tab, you can override the frame range, change the Output Path, or both.

If you choose Submit, the wrench turntable Maya file, along with all of the necessary textures and alembic caches, will be uploaded to Deadline Cloud and rendered on the farm. You can monitor rendering jobs in your Deadline Cloud monitor.

When your render is finished, as indicated by the Succeeded status in the job monitor, choose the job, Job Actions, and Download Output. To learn more about scheduling and monitoring jobs, visit Deadline Cloud jobs in the AWS documentation.

View your rendered image with an image viewing application such as DJView. The image will look like this:

To learn more in detail about the developer-side setup process using the command line, visit Setting up a developer workstation for Deadline Cloud in the AWS documentation.

3. Managing budgets and usage for Deadline Cloud
To help you manage costs for Deadline Cloud, you can use a budget manager to create and edit budgets. You can also use a usage explorer to view how many AWS resources are used and the estimated costs for those resources.

Choose Budgets on the Deadline Cloud monitor page to create your budget for your farm.

You can create budget amounts and limits and set automated actions to help reduce or stop additional spend against the budget.

Choose Usage in the Deadline Cloud monitor page to find real-time metrics on the activity happening on each farm. You can look at the farm’s costs by different variables, such as queue, job, or user. Choose various time frames to find usage during a specific period and look at usage trends over time.

The costs displayed in the usage explorer are approximate. Use them as a guide for managing your resources. There may be other costs from using other connected AWS resources, such as Amazon S3, Amazon CloudWatch, and other services that are not accounted for in the usage explorer.

To learn more, visit Managing budgets and usage for Deadline Cloud in the AWS documentation.

Now available
AWS Deadline Cloud is now available in US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), and Europe (Ireland) Regions.

Give AWS Deadline Cloud a try in the Deadline Cloud console. For more information, visit the Deadline Cloud product page, Deadline Cloud User Guide in the AWS documentation, and send feedback to AWS re:Post for AWS Deadline Cloud or through your usual AWS support contacts.

Channy

AWS Weekly Roundup — AWS Chips Taste Test, generative AI updates, Community Days, and more — April 1, 2024

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-aws-chips-taste-test-generative-ai-updates-community-days-and-more-april-1-2024/

Today is April Fool’s Day. About 10 years ago, some tech companies would joke about an idea that was thought to be fun and unfeasible on April 1st, to the delight of readers. Jeff Barr has also posted seemingly far-fetched ideas on this blog in the past, and some of these have surprisingly come true! Here are examples:

Year Joke Reality
2010 Introducing QC2 – the Quantum Compute Cloud, a production-ready quantum computer to solve certain types of math and logic problems with breathtaking speed. In 2019, we launched Amazon Braket, a fully managed service that allows scientists, researchers, and developers to begin experimenting with computers from multiple quantum hardware providers in a single place.
2011 Announcing AWS $NAME, a scalable event service to find and automatically integrate with your systems on the cloud, on premises, and even your house and room. In 2019, we introduced Amazon EventBridge to make it easy for you to integrate your own AWS applications with third-party applications. If you use AWS IoT Events, you can monitor and respond to events at scale from your IoT devices at home.
2012 New Amazon EC2 Fresh Servers to deliver a fresh (physical) EC2 server in 15 minutes using atmospheric delivery and communucation from a fleet of satellites. In 2021, we launched AWS Outposts Server, 1U/2U physical servers with built-in AWS services. In 2023, Project Kuiper completed successful tests of an optical mesh network in low Earth orbit. Now, we only need to develop satellite warehouse and atmospheric re-entry technology to follow Amazon PrimeAir’s drone delivery.
2013 PC2 – The New Punched Card Cloud, a new mf (mainframe) instance family, Mainframe Machine Images (MMI), tape storage, and punched card interfaces for mainframe computers used from the 1970s to ’80s. In 2022, we launched AWS Mainframe Modernization to help you modernize your mainframe applications and deploy them to AWS fully managed runtime environments.

Jeff returns! This year, we have AWS “Chips” Taste Test for him to indulge in, drawing unique parallels between chip flavors and silicon innovations. He compared the taste of “Golden Nacho Cheese,” “Al Chili Lime,” and “BBQ Training Wheels” with AWS Graviton, AWS Inferentia, and AWS Trainium chips.

What’s your favorite? Watch a fun video in the LinkedIn and X post of AWS social media channels.

Last week’s launches
If we stay curious, keep learning, and insist on high standards, we will continue to see more ideas turn into reality. The same goes for the generative artificial intelligence (generative AI) world. Here are some launches that utilize generative AI technology this week.

Knowledge Bases for Amazon BedrockAnthropic’s Claude 3 Sonnet foundation model (FM) is now generally available on Knowledge Bases for Amazon Bedrock to connect internal data sources for Retrieval Augmented Generation (RAG).

Knowledge Bases for Amazon Bedrock support metadata filtering, which improves retrieval accuracy by ensuring the documents are relevant to the query. You can narrow search results by specifying which documents to include or exclude from a query, resulting in more relevant responses generated by FMs such as Claude 3 Sonnet.

Finally, you can customize prompts and number of retrieval results in Knowledge Bases for Amazon Bedrock. With custom prompts, you can tailor the prompt instructions by adding context, user input, or output indicator(s), for the model to generate responses that more closely match your use case needs. You can now control the amount of information needed to generate a final response by adjusting the number of retrieved passages. To learn more these new features, visit Knowledge bases for Amazon Bedrock in the AWS documentation.

Amazon Connect Contact Lens – At AWS re:Invent 2023, we previewed a generative AI capability to summarize long customer conversations into succinct, coherent, and context-rich contact summaries to help improve contact quality and agent performance. These generative AI–powered post-contact summaries are now available in Amazon Connect Contact Lens.

Amazon DataZone – At AWS re:Invent 2023, we also previewed a generative AI–based capability to generate comprehensive business data descriptions and context and include recommendations on analytical use cases. These generative AI–powered recommendations for descriptions are now available in Amazon DataZone.

There are also other important launches you shouldn’t miss:

A new Local Zone in Miami, Florida – AWS Local Zones are an AWS infrastructure deployment that places compute, storage, database, and other select services closer to large populations, industry, and IT centers where no AWS Region exists. You can now use a new Local Zone in Miami, Florida, to run applications that require single-digit millisecond latency, such as real-time gaming, hybrid migrations, and live video streaming. Enable the new Local Zone in Miami (use1-mia2-az1) from the Zones tab in the Amazon EC2 console settings to get started.

New Amazon EC2 C7gn metal instance – You can use AWS Graviton based new C7gn bare metal instances to run applications that benefit from deep performance analysis tools, specialized workloads that require direct access to bare metal infrastructure, legacy workloads not supported in virtual environments, and licensing-restricted business-critical applications. The EC2 C7gn metal size comes with 64 vCPUs and 128 GiB of memory.

AWS Batch multi-container jobs – You can use multi-container jobs in AWS Batch, making it easier and faster to run large-scale simulations in areas like autonomous vehicles and robotics. With the ability to run multiple containers per job, you get the advanced scaling, scheduling, and cost optimization offered by AWS Batch, and you can use modular containers representing different components like 3D environments, robot sensors, or monitoring sidecars.

Amazon Guardduty EC2 Runtime Monitoring – We are announcing the general availability of Amazon GuardDuty EC2 Runtime Monitoring to expand threat detection coverage for EC2 instances at runtime and complement the anomaly detection that GuardDuty already provides by continuously monitoring VPC Flow Logs, DNS query logs, and AWS CloudTrail management events. You now have visibility into on-host, OS-level activities and container-level context into detected threats.

GitLab support for AWS CodeBuild – You can now use GitLab and GitLab self-managed as the source provider for your CodeBuild projects. You can initiate builds from changes in source code hosted in your GitLab repositories. To get started with CodeBuild’s new source providers, visit the AWS CodeBuild User Guide.

Retroactive support for AWS cost allocation tags – You can enable AWS cost allocation tags retroactively for up to 12 months. Previously, when you activated resource tags for cost allocation purposes, the tags only took effect prospectively. Submit a backfill request, specifying the duration of time you want the cost allocation tags to be backfilled. Once the backfill is complete, the cost and usage data from prior months will be tagged with the current cost allocation tags.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Some other updates and news about generative AI that you might have missed:

Amazon and Anthropic’s AI investiment – Read the latest milestone in our strategic collaboration and investment with Anthropic. Now, Anthropic is using AWS as its primary cloud provider and will use AWS Trainium and Inferentia chips for mission-critical workloads, including safety research and future FM development. Earlier this month, we announced access to Anthropic’s most powerful FM, Claude 3, on Amazon Bedrock. We announced availability of Sonnet on March 4 and Haiku on March 13. To learn more, watch the video introducing Claude on Amazon Bedrock.

Virtual building assistant built on Amazon Bedrock – BrainBox AI announced the launch of ARIA (Artificial Responsive Intelligent Assistant) powered by Amazon Bedrock. ARIA is designed to enhance building efficiency by assimilating seamlessly into the day-to-day processes related to building management. To learn more, read the full customer story and watch the video on how to reduce a building’s CO2 footprint with ARIA.

Solar models available on Amazon SageMaker JumpStart – Upstage Solar is a large language model (LLM) 100 percent pre-trained with Amazon SageMaker that outperforms and uses its compact size and powerful track record to specialize in purpose training, making it versatile across languages, domains, and tasks. Now, Solar Mini is available on Amazon SageMaker JumpStart. To learn more, watch how to deploy Solar models in SageMaker JumpStart.

AWS open source news and updates – My colleague Ricardo writes this weekly open source newsletter in which he highlights new open source projects, tools, and demos from the AWS Community. Last week’s highlight was news that Linux Foundation launched Valkey community, an open source alternative to the Redis in-memory, NoSQL data store.

Upcoming AWS Events
Check your calendars and sign up for upcoming AWS events:

AWS SummitAWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Paris (April 3), Amsterdam (April 9), Sydney (April 10–11), London (April 24), Berlin (May 15–16), and Seoul (May 16–17), Hong Kong (May 22), Milan (May 23), Dubai (May 29), Stockholm (June 4), and Madrid (June 5).

AWS re:Inforce – Explore cloud security in the age of generative AI at AWS re:Inforce, June 10–12 in Pennsylvania for two-and-a-half days of immersive cloud security learning designed to help drive your business initiatives. Read the story from AWS Chief Information Security Officer (CISO) Chris Betz about a bit of what you can expect at re:Inforce.

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Mumbai (April 6), Poland (April 11), Bay Area (April 12), Kenya (April 20), and Turkey (May 18).

You can browse all upcoming AWS led in-person and virtual events and developer-focused events such as AWS DevDay.

That’s all for this week. Check back next Monday for another Week in Review!

— Channy

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS.

Amazon GuardDuty EC2 Runtime Monitoring is now generally available

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/amazon-guardduty-ec2-runtime-monitoring-is-now-generally-available/

Amazon GuardDuty is a machine learning (ML)-based security monitoring and intelligent threat detection service that analyzes and processes various AWS data sources, continuously monitors your AWS accounts and workloads for malicious activity, and delivers detailed security findings for visibility and remediation.

I love the feature of GuardDuty Runtime Monitoring that analyzes operating system (OS)-level, network, and file events to detect potential runtime threats for specific AWS workloads in your environment. I first introduced the general availability of this feature for Amazon Elastic Kubernetes Service (Amazon EKS) resources in March 2023. Seb wrote about the expansion of the Runtime Monitoring feature to provide threat detection for Amazon Elastic Container Service (Amazon ECS) and AWS Fargate as well as the preview for Amazon Elastic Compute Cloud (Amazon EC2) workloads in Nov 2023.

Today, we are announcing the general availability of Amazon GuardDuty EC2 Runtime Monitoring to expand threat detection coverage for EC2 instances at runtime and complement the anomaly detection that GuardDuty already provides by continuously monitoring VPC Flow Logs, DNS query logs, and AWS CloudTrail management events. You now have visibility into on-host, OS-level activities and container-level context into detected threats.

With GuardDuty EC2 Runtime Monitoring, you can identify and respond to potential threats that might target the compute resources within your EC2 workloads. Threats to EC2 workloads often involve remote code execution that leads to the download and execution of malware. This could include instances or self-managed containers in your AWS environment that are connecting to IP addresses associated with cryptocurrency-related activity or to malware command-and-control related IP addresses.

GuardDuty Runtime Monitoring provides visibility into suspicious commands that involve malicious file downloads and execution across each step, which can help you discover threats during initial compromise and before they become business-impacting events. You can also centrally enable runtime threat detection coverage for accounts and workloads across the organization using AWS Organizations to simplify your security coverage.

Configure EC2 Runtime Monitoring in GuardDuty
With a few clicks, you can enable GuardDuty EC2 Runtime Monitoring in the GuardDuty console. For your first use, you need to enable Runtime Monitoring.

Any customers that are new to the EC2 Runtime Monitoring feature can try it for free for 30 days and gain access to all features and detection findings. The GuardDuty console shows how many days are left in the free trial.

Now, you can set up the GuardDuty security agent for the individual EC2 instances for which you want to monitor the runtime behavior. You can choose to deploy the GuardDuty security agent either automatically or manually. At GA, you can enable Automated agent configuration, which is a preferred option for most customers as it allows GuardDuty to manage the security agent on their behalf.

The agent will be deployed on EC2 instances with AWS Systems Manager and uses an Amazon Virtual Private Cloud (Amazon VPC) endpoint to receive the runtime events associated with your resource. If you want to manage the GuardDuty security agent manually, visit Managing the security agent Amazon EC2 instance manually in the AWS documentation. In multiple-account environments, delegated GuardDuty administrator accounts manage their member accounts using AWS Organizations. For more information, visit Managing multiple accounts in the AWS documentation.

When you enable EC2 Runtime Monitoring, you can find the covered EC2 instances list, account ID, and coverage status, and whether the agent is able to receive runtime events from the corresponding resource in the EC2 instance runtime coverage tab.

Even when the coverage status is Unhealthy, meaning it is not currently able to receive runtime findings, you still have defense in depth for your EC2 instance. GuardDuty continues to provide threat detection to the EC2 instance by monitoring CloudTrail, VPC flow, and DNS logs associated with it.

Check out GuardDuty EC2 Runtime security findings
When GuardDuty detects a potential threat and generates security findings, you can view the details of the healthy information.

Choose Findings in the left pane if you want to find security findings specific to Amazon EC2 resources. You can use the filter bar to filter the findings table by specific criteria, such as a Resource type of Instance. The severity and details of the findings differ based on the resource role, which indicates whether the EC2 resource was the target of suspicious activity or the actor performing the activity.

With today’s launch, we support over 30 runtime security findings for EC2 instances, such as detecting abused domains, backdoors, cryptocurrency-related activity, and unauthorized communications. For the full list, visit Runtime Monitoring finding types in the AWS documentation.

Resolve your EC2 security findings
Choose each EC2 security finding to know more details. You can find all the information associated with the finding and examine the resource in question to determine if it is behaving in an expected manner.

If the activity is authorized, you can use suppression rules or trusted IP lists to prevent false positive notifications for that resource. If the activity is unexpected, the security best practice is to assume the instance has been compromised and take the actions detailed in Remediating a potentially compromised Amazon EC2 instance in the AWS documentation.

You can integrate GuardDuty EC2 Runtime Monitoring with other AWS security services, such as AWS Security Hub or Amazon Detective. Or you can use Amazon EventBridge, allowing you to use integrations with security event management or workflow systems, such as Splunk, Jira, and ServiceNow, or trigger automated and semi-automated responses such as isolating a workload for investigation.

When you choose Investigate with Detective, you can find Detective-created visualizations for AWS resources to quickly and easily investigate security issues. To learn more, visit Integration with Amazon Detective in the AWS documentation.

Things to know
GuardDuty EC2 Runtime Monitoring support is now available for EC2 instances running Amazon Linux 2 or Amazon Linux 2023. You have the option to configure maximum CPU and memory limits for the agent. To learn more and for future updates, visit Prerequisites for Amazon EC2 instance support in the AWS documentation.

To estimate the daily average usage costs for GuardDuty, choose Usage in the left pane. During the 30-day free trial period, you can estimate what your costs will be after the trial period. At the end of the trial period, we charge you per vCPU hours tracked monthly for the monitoring agents. To learn more, visit the Amazon GuardDuty pricing page.

Enabling EC2 Runtime Monitoring also allows for a cost-saving opportunity on your GuardDuty cost. When the feature is enabled, you won’t be charged for GuardDuty foundational protection VPC Flow Logs sourced from the EC2 instances running the security agent. This is due to similar, but more contextual, network data available from the security agent. Additionally, GuardDuty would still process VPC Flow Logs and generate relevant findings so you will continue to get network-level security coverage even if the agent experiences downtime.

Now available
Amazon GuardDuty EC2 Runtime Monitoring is now available in all AWS Regions where GuardDuty is available, excluding AWS GovCloud (US) Regions and AWS China Regions. For a full list of Regions where EC2 Runtime Monitoring is available, visit Region-specific feature availability.

Give GuardDuty EC2 Runtime Monitoring a try in the GuardDuty console. For more information, visit the Amazon GuardDuty User Guide and send feedback to AWS re:Post for Amazon GuardDuty or through your usual AWS support contacts.

Channy

Anthropic’s Claude 3 Haiku model is now available on Amazon Bedrock

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/anthropics-claude-3-haiku-model-is-now-available-in-amazon-bedrock/

Last week, Anthropic announced their Claude 3 foundation model family. The family includes three models: Claude 3 Haiku, the fastest and most compact model for near-instant responsiveness; Claude 3 Sonnet, the ideal balanced model between skills and speed; and Claude 3 Opus, the most intelligent offering for top-level performance on highly complex tasks. AWS also announced the general availability of Claude 3 Sonnet in Amazon Bedrock.

Today, we are announcing the availability of Claude 3 Haiku on Amazon Bedrock. The Claude 3 Haiku foundation model is the fastest and most compact model of the Claude 3 family, designed for near-instant responsiveness and seamless generative artificial intelligence (AI) experiences that mimic human interactions. For example, it can read a data-dense research paper on arXiv (~10k tokens) with charts and graphs in less than three seconds.

With Claude 3 Haiku’s availability on Amazon Bedrock, you can build near-instant responsive generative AI applications for enterprises that need quick and accurate targeted performance. Like Sonnet and Opus, Haiku has image-to-text vision capabilities, can understand multiple languages besides English, and boasts increased steerability in a 200k context window.

Claude 3 Haiku use cases
Claude 3 Haiku is smarter, faster, and more affordable than other models in its intelligence category. It answers simple queries and requests with unmatched speed. With its fast speed and increased steerability, you can create AI experiences that seamlessly imitate human interactions.

Here are some use cases for using Claude 3 Haiku:

  • Customer interactions: quick and accurate support in live interactions, translations
  • Content moderation: catch risky behavior or customer requests
  • Cost-saving tasks: optimized logistics, inventory management, fast knowledge extraction from unstructured data

To learn more about Claude 3 Haiku’s features and capabilities, visit Anthropic’s Claude on Amazon Bedrock and Anthropic Claude models in the AWS documentation.

Claude 3 Haiku in action
If you are new to using Anthropic models, go to the Amazon Bedrock console and choose Model access on the bottom left pane. Request access separately for Claude 3 Haiku.

To test Claude 3 Haiku in the console, choose Text or Chat under Playgrounds in the left menu pane. Then choose Select model and select Anthropic as the category and Claude 3 Haiku as the model.

To test more Claude prompt examples, choose Load examples. You can view and run examples specific to Claude 3 Haiku, such as advanced Q&A with citations, crafting a design brief, and non-English content generation.

Using Compare mode, you can also compare the speed and intelligence between Claude 3 Haiku and the Claude 2.1 model using a sample prompt to generate personalized email responses to address customer questions.

By choosing View API request, you can also access the model using code examples in the AWS Command Line Interface (AWS CLI) and AWS SDKs. Here is a sample of the AWS CLI command:

aws bedrock-runtime invoke-model \
     --model-id anthropic.claude-3-haiku-20240307-v1:0 \
     --body "{\"messages\":[{\"role\":\"user\",\"content\":[{\"type\":\"text\",\"text\":\"Write the test case for uploading the image to Amazon S3 bucket\\nCertainly! Here's an example of a test case for uploading an image to an Amazon S3 bucket using a testing framework like JUnit or TestNG for Java:\\n\\n...."}]}],\"anthropic_version\":\"bedrock-2023-05-31\",\"max_tokens\":2000}" \
     --cli-binary-format raw-in-base64-out \
     --region us-east-1 \
     invoke-model-output.txt

To make an API request with Claude 3, use the new Anthropic Claude Messages API format, which allows for more complex interactions such as image processing. If you use Anthropic Claude Text Completions API, you should upgrade from the Text Completions API.

Here is sample Python code to send a Message API request describing the image file:

def call_claude_haiku(base64_string):

    prompt_config = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 4096,
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "media_type": "image/png",
                            "data": base64_string,
                        },
                    },
                    {"type": "text", "text": "Provide a caption for this image"},
                ],
            }
        ],
    }

    body = json.dumps(prompt_config)

    modelId = "anthropic.claude-3-haiku-20240307-v1:0"
    accept = "application/json"
    contentType = "application/json"

    response = bedrock_runtime.invoke_model(
        body=body, modelId=modelId, accept=accept, contentType=contentType
    )
    response_body = json.loads(response.get("body").read())

    results = response_body.get("content")[0].get("text")
    return results

To learn more sample codes with Claude 3, see Get Started with Claude 3 on Amazon Bedrock, Diagrams to CDK/Terraform using Claude 3 on Amazon Bedrock, and Cricket Match Winner Prediction with Amazon Bedrock’s Anthropic Claude 3 Sonnet in the Community.aws.

Now available
Claude 3 Haiku is available now in the US West (Oregon) Region with more Regions coming soon; check the full Region list for future updates.

Claude 3 Haiku is the most cost-effective choice. For example, Claude 3 Haiku is cheaper, up to 68 percent of the price per 1,000 input/output tokens compared to Claude Instant, with higher levels of intelligence. To learn more, see Amazon Bedrock Pricing.

Give Claude 3 Haiku a try in the Amazon Bedrock console today and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

Channy

Anthropic’s Claude 3 Sonnet foundation model is now available in Amazon Bedrock

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/anthropics-claude-3-sonnet-foundation-model-is-now-available-in-amazon-bedrock/

In September 2023, we announced a strategic collaboration with Anthropic that brought together their respective technology and expertise in safer generative artificial intelligence (AI), to accelerate the development of Anthropic’s Claude foundation models (FMs) and make them widely accessible to AWS customers. You can get early access to unique features of Anthropic’s Claude model in Amazon Bedrock to reimagine user experiences, reinvent your businesses, and accelerate your generative AI journeys.

In November 2023, Amazon Bedrock provided access to Anthropic’s Claude 2.1, which delivers key capabilities to build generative AI for enterprises. Claude 2.1 includes a 200,000 token context window, reduced rates of hallucination, improved accuracy over long documents, system prompts, and a beta tool use feature for function calling and workflow orchestration.

Today, Anthropic announced Claude 3, a new family of state-of-the-art AI models that allows customers to choose the exact combination of intelligence, speed, and cost that suits their business needs. The three models in the family are Claude 3 Haiku, the fastest and most compact model for near-instant responsiveness, Claude 3 Sonnet, the ideal balanced model between skills and speed, and Claude 3 Opus, a most intelligent offering for the top-level performance on highly complex tasks.

We’re also announcing the availability of Anthropic’s Claude 3 Sonnet today in Amazon Bedrock, with Claude 3 Opus and Claude 3 Haiku coming soon. For the vast majority of workloads, Claude 3 Sonnet model is two times faster than Claude 2 and Claude 2.1, with increased steerability, and new image-to-text vision capabilities.

With Claude 3 Sonnet’s availability in Amazon Bedrock, you can build cost-effective generative AI applications for enterprises that need intelligence, reliability, and speed. You can now use Anthropic’s latest model, Claude 3 Sonnet, in the Amazon Bedrock console.

Introduction of Anthropic’s Claude 3 Sonnet
Here are some key highlights about the new Claude 3 Sonnet model in Amazon Bedrock:

2x faster speed – Claude 3 has made significant gains in speed. For the vast majority of workloads, it is two times faster with the same level of intelligence as Anthropic’s most performant models, Claude 2 and Claude 2.1. This combination of speed and skill makes Claude 3 Sonnet the clear choice for tasks that require intelligent tasks demanding rapid responses, like knowledge retrieval or sales automation. This includes use cases like content generation, classification, data extraction, and research and retrieval or accurate searching over knowledge bases.

Increased steerability – Increased steerability of AI systems gives users more control over outputs and delivers predictable, higher-quality outcomes. It is significantly less likely to refuse to answer questions that border on the system’s guardrails to prevent harmful outputs. Claude 3 Sonnet is easier to steer and better at following directions in popular structured output formats like JSON—making it simpler for developers to build enterprise and frontier applications. This is particularly important in enterprise use cases such as autonomous vehicles, health and medical diagnoses, and algorithmic decision-making in sensitive domains such as financial services.

Image-to-text vision capabilities – Claude 3 offers vision capabilities that can process images and return text outputs. It is extremely capable at analyzing and understanding charts, graphs, technical diagrams, reports, and other visual assets. Claude 3 Sonnet achieves comparable performance to other best-in-class models with image processing capabilities, while maintaining a significant speed advantage.

Expanded language support – Claude 3 has improved understanding and responding in languages other than English, such as French, Japanese, and Spanish. This expanded language coverage allows Claude 3 Sonnet to better serve multinational corporations requiring AI services across different geographies and languages, as well as businesses requiring nuanced translation services. Claude 3 Sonnet is also stronger at coding and mathematics, as evidenced by Anthropic’s scores in evaluations such as grade-school math problems (GSM8K and Hendrycks) and Codex (HumanEval).

To learn more about Claude 3 Sonnet’s features and capabilities, visit Anthropic’s Claude on Amazon Bedrock and Anthropic Claude model in the AWS documentation.

Get started with Anthropic’s Claude 3 Sonnet in Amazon Bedrock
If you are new to using Anthropic models, go to the Amazon Bedrock console and choose Model access on the bottom left pane. Request access separately for Claude 3 Sonnet.

To test Claude 3 Sonnet in the console, choose Text or Chat under Playgrounds in the left menu pane. Then choose Select model and select Anthropic as the category and Claude 3 Sonnet as the model.

To test more Claude prompt examples, choose Load examples. You can view and run Claude 3 specific examples, such as advanced Q&A with citations, crafting a design brief, and non-English content generation.

By choosing View API request, you can also access the model via code examples in the AWS Command Line Interface (AWS CLI) and AWS SDKs. Here is a sample of the AWS CLI command:

aws bedrock-runtime invoke-model \
--model-id anthropic.claude-3-sonnet-v1:0 \
--body "{\"prompt\":\"Write the test case for uploading the image to Amazon S3 bucket\\nHere are some test cases for uploading an image to an Amazon S3 bucket:\\n\\n1. **Successful Upload Test Case**:\\n   - Test Data:\\n     - Valid image file (e.g., .jpg, .png, .gif)\\n     - Correct S3 bucket name\\n     - Correct AWS credentials (access key and secret access key)\\n   - Steps:\\n     1. Initialize the AWS S3 client with the correct credentials.\\n     2. Open the image file.\\n     3. Upload the image file to the specified S3 bucket.\\n     4. Verify that the upload was successful.\\n   - Expected Result: The image should be successfully uploaded to the S3 bucket.\\n\\n2. **Invalid File Type Test Case**:\\n   - Test Data:\\n     - Invalid file type (e.g., .txt, .pdf, .docx)\\n     - Correct S3 bucket name\\n     - Correct AWS credentials\\n   - Steps:\\n     1. Initialize the AWS S3 client with the correct credentials.\\n     2. Open the invalid file type.\\n     3. Attempt to upload the file to the specified S3 bucket.\\n     4. Verify that an appropriate error or exception is raised.\\n   - Expected Result: The upload should fail with an error or exception indicating an invalid file type.\\n\\nThese test cases cover various scenarios, including successful uploads, invalid file types, invalid bucket names, invalid AWS credentials, large file uploads, and concurrent uploads. By executing these test cases, you can ensure the reliability and robustness of your image upload functionality to Amazon S3.\",\"max_tokens_to_sample\":2000,\"temperature\":1,\"top_k\":250,\"top_p\":0.999,\"stop_sequences\":[\"\\n\\nHuman:\"],\"anthropic_version\":\"bedrock-2023-05-31\"}" \
--cli-binary-format raw-in-base64-out \
--region us-east-1 \
invoke-model-output.txt

Upload your image if you want to test image-to-text vision capabilities. I uploaded the featured image of this blog post and received a detailed description of this image.

You can process images via API and return text outputs in English and multiple other languages.

{
  "modelId": "anthropic.claude-3-sonnet-v1:0",
  "contentType": "application/json",
  "accept": "application/json",
  "body": {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1000,
    "system": "Please respond only in Spanish.",
    "messages": {
      "role": "user",
      "content": [
        {
          "type": "image",
          "source": {
            "type": "base64",
            "media_type": "image/jpeg",
            "data": "iVBORw..."
          }
        },
        {
          "type": "text",
          "text": "What's in this image?"
        }
      ]
    }
  }
}

To celebrate this launch, Neerav Kingsland, Head of Global Accounts at Anthropic, talks about the power of the Anthropic and AWS partnership.

“Anthropic at its core is a research company that is trying to create the safest large language models in the world, and through Amazon Bedrock we have a change to take that technology, distribute it to users globally, and do this in an extremely safe and data-secure manner.”

Now available
Claude 3 Sonnet is available today in the US East (N. Virginia) and US West (Oregon) Regions; check the full Region list for future updates. The availability of Anthropic’s Claude 3 Opus and Haiku in Amazon Bedrock also will be coming soon.

You will be charged for model inference and customization with the On-Demand and Batch mode, which allows you to use FMs on a pay-as-you-go basis without having to make any time-based term commitments. With the Provisioned Throughput mode, you can purchase model units for a specific base or custom model. To learn more, see Amazon Bedrock Pricing.

Give Anthropic’s Claude 3 Sonnet a try in the Amazon Bedrock console today and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

Channy

AWS Weekly Roundup — Happy Lunar New Year, IaC generator, NFL’s digital athlete, AWS Cloud Clubs, and more — February 12, 2024

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-happy-lunar-new-year-iac-generator-nfls-digital-athlete-aws-cloud-clubs-and-more-february-12-2024/

Happy Lunar New Year! Wishing you a year filled with joy, success, and endless opportunities! May the Year of the Dragon bring uninterrupted connections and limitless growth 🐉 ☁

In case you missed it, here’s outstanding news you need to know as you plan your year in early 2024.

AWS was named as a Leader in the 2023 Magic Quadrant for Strategic Cloud Platform Services. AWS is the longest-running Magic Quadrant Leader, with Gartner naming AWS a Leader for the thirteenth consecutive year. See Sebastian’s blog post to learn more. AWS has been named a Leader for the ninth consecutive year in the 2023 Gartner Magic Quadrant for Cloud Database Management Systems, and we have been positioned highest for ability to execute by providing a comprehensive set of services for your data foundation across all workloads, use cases, and data types. See Rahul Pathak’s blog post to learn more.

AWS also has been named a Leader in data clean room technology according to the IDC MarketScape: Worldwide Data Clean Room Technology 2024 Vendor Assessment (January 2024). This report evaluated data clean room technology vendors for use cases across industries. See the AWS for Industries Blog channel post to learn more.

Last Week’s Launches
Here are some launches that got my attention:

A new Local Zone in Houston, Texas – Local Zones are an AWS infrastructure deployment that places compute, storage, database, and other select services closer to large population, industry, and IT centers where no AWS Region exists. AWS Local Zones are available in the US in 15 other metro areas and globally in an additional 17 metros areas, allowing you to deliver low-latency applications to end users worldwide. You can enable the new Local Zone in Houston (us-east-1-iah-2a) from the Zones tab in the Amazon EC2 console settings.

AWS CloudFormation IaC generator – You can generate a template using AWS resources provisioned in your account that are not already managed by CloudFormation. With this launch, you can onboard workloads to Infrastructure as Code (IaC) in minutes, eliminating weeks of manual effort. You can then leverage the IaC benefits of automation, safety, and scalability for the workloads. Use the template to import resources into CloudFormation or replicate resources in a new account or Region. See the user guide and blog post to learn more.

A new look-and-feel of Amazon Bedrock console – Amazon Bedrock now offers an enhanced console experience with updated UI improves usability, responsiveness, and accessibility with more seamless support for dark mode. To get started with the new experience, visit the Amazon Bedrock console.

2024-bedrock-visual-refresh

One-click WAF integration on ALB – Application Load Balancer (ALB) now supports console integration with AWS WAF that allows you to secure your applications behind ALB with a single click. This integration enables AWS WAF protections as a first line of defense against common web threats for your applications that use ALB. You can use this one-click security protection provided by AWS WAF from the integrated services section of the ALB console for both new and existing load balancers.

Up to 49% price reduction for AWS Fargate Windows containers on Amazon ECS – Windows containers running on Fargate are now billed per second for infrastructure and Windows Server licenses that their containerized application requests. Along with the infrastructure pricing for on-demand, we are also reducing the minimum billing duration for Windows containers to 5 minutes (from 15 minutes) for any Fargate Windows tasks starting February 1st, 2024 (12:00am UTC). The infrastructure pricing and minimum billing period changes will automatically reflect in your monthly AWS bill. For more information on the specific price reductions, see our pricing page.

Introducing Amazon Data Firehose – We are renaming Amazon Kinesis Data Firehose to Amazon Data Firehose. Amazon Data Firehose is the easiest way to capture, transform, and deliver data streams into Amazon S3, Amazon Redshift, Amazon OpenSearch Service, Splunk, Snowflake, and other 3rd party analytics services. The name change is effective in the AWS Management Console, documentations, and product pages.

AWS Transfer Family integrations with Amazon EventBridge – AWS Transfer Family now enables conditional workflows by publishing SFTP, FTPS, and FTP file transfer events in near real-time, SFTP connectors file transfer event notifications, and Applicability Statement 2 (AS2) transfer operations to Amazon EventBridge. You can orchestrate your file transfer and file-processing workflows in AWS using Amazon EventBridge, or any workflow orchestration service of your choice that integrates with these events.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Some other updates and news that you might have missed:

NFL’s digital athlete in the Super Bowl – AWS is working with the National Football League (NFL) to take player health and safety to the next level. Using AI and machine learning, they are creating a precise picture of each player in training, practice, and games. You could see this technology in action, especially with the Super Bowl on the last Sunday!

Amazon’s commiting the responsible AI – On February 7, Amazon joined the U.S. Artificial Intelligence Safety Institute Consortium, established by the National Institute of Standards of Technology (NIST), to further our government and industry collaboration to advance safe and secure artificial intelligence (AI). Amazon will contribute compute credits to help develop tools to evaluate AI safety and help the institute set an interoperable and trusted foundation for responsible AI development and use.

Compliance updates in South Korea – AWS has completed the 2023 South Korea Cloud Service Providers (CSP) Safety Assessment Program, also known as the Regulation on Supervision on Electronic Financial Transactions (RSEFT) Audit Program. AWS is committed to helping our customers adhere to applicable regulations and guidelines, and we help ensure that our financial customers have a hassle-free experience using the cloud. Also, AWS has successfully renewed certification under the Korea Information Security Management System (K-ISMS) standard (effective from December 16, 2023, to December 15, 2026).

Join AWS Cloud Clubs CaptainsAWS Cloud Clubs are student-led user groups for post-secondary level students and independent learners. Interested in founding or co-founding a Cloud Club in your university or region? We are accepting applications from February 5-18, 2024.

Upcoming AWS Events
Check your calendars and sign up for upcoming AWS events:

AWS Innovate AI/ML and Data Edition – Join our free online conference to learn how you and your organization can leverage the latest advances in generative AI. You can register upcoming AWS Innovate Online event that fits your timezone in Asia Pacific & Japan (February 22), EMEA (February 29), and Americas (March 14).

AWS Public Sector events – Join us at the AWS Public Sector Symposium Brussels (March 12) to discover how the AWS Cloud can help you improve resiliency, develop sustainable solutions, and achieve your mission. AWS Public Sector Day London (March 19) gathers professionals from government, healthcare, and education sectors to tackle pressing challenges in United Kingdom public services.

Kicking off AWS Global Summits – AWS Summits are a series of free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Below is a list of available AWS Summit events taking place in April:

You can browse all upcoming AWS-led in-person and virtual events, and developer-focused events such as AWS DevDay.

That’s all for this week. Check back next Monday for another Week in Review!

— Channy

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Amazon ECS supports a native integration with Amazon EBS volumes for data-intensive workloads

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/amazon-ecs-supports-a-native-integration-with-amazon-ebs-volumes-for-data-intensive-workloads/

Today we are announcing that Amazon Elastic Container Service (Amazon ECS) supports an integration with Amazon Elastic Block Store (Amazon EBS), making it easier to run a wider range of data processing workloads. You can provision Amazon EBS storage for your ECS tasks running on AWS Fargate and Amazon Elastic Compute Cloud (Amazon EC2) without needing to manage storage or compute.

Many organizations choose to deploy their applications as containerized packages, and with the introduction of Amazon ECS integration with Amazon EBS, organizations can now run more types of workloads than before.

You can run data workloads requiring storage that supports high transaction volumes and throughput, such as extract, transform, and load (ETL) jobs for big data, which need to fetch existing data, perform processing, and store this processed data for downstream use. Because the storage lifecycle is fully managed by Amazon ECS, you don’t need to build any additional scaffolding to manage infrastructure updates, and as a result, your data processing workloads are now more resilient while simultaneously requiring less effort to manage.

Now you can choose from a variety of storage options for your containerized applications running on Amazon ECS:

  • Your Fargate tasks get 20 GiB of ephemeral storage by default. For applications that need additional storage space to download large container images or for scratch work, you can configure up to 200 GiB of ephemeral storage for your Fargate tasks.
  • For applications that span many tasks that need concurrent access to a shared dataset, you can configure Amazon ECS to mount the Amazon Elastic File System (Amazon EFS) file system to your ECS tasks running on both EC2 and Fargate. Common examples of such workloads include web applications such as content management systems, internal DevOps tools, and machine learning (ML) frameworks. Amazon EFS is designed to be available across a Region and can be simultaneously attached to many tasks.
  • For applications that need high-performance, low-cost storage that does not need to be shared across tasks, you can configure Amazon ECS to provision and attach Amazon EBS storage to your tasks running on both Amazon EC2 and Fargate. Amazon EBS is designed to provide block storage with low latency and high performance within an Availability Zone.

To learn more, see Using data volumes in Amazon ECS tasks and persistent storage best practices in the AWS documentation.

Getting started with EBS volume integration to your ECS tasks
You can configure the volume mount point for your container in the task definition and pass Amazon EBS storage requirements for your Amazon ECS task at runtime. For most use cases, you can get started by simply providing the size of the volume needed for the task. Optionally, you can configure all EBS volume attributes and the file system you want the volume formatted with.

1. Create a task definition
Go to the Amazon ECS console, navigate to Task definitions, and choose Create new task definition.

In the Storage section, choose Configure at deployment to set EBS volume as a new configuration type. You can provision and attach one volume per task for Linux file systems.

When you choose Configure at task definition creation, you can configure existing storage options such as bind mounts, Docker volumes, EFS volumes, Amazon FSx for Windows File Server volumes, or Fargate ephemeral storage.

Now you can select a container in the task definition, the source EBS volume, and provide a mount path where the volume will be mounted in the task.

You can also use $aws ecs register-task-definition --cli-input-json file://example.json command line to register a task definition to add an EBS volume. The following snippet is a sample, and task definitions are saved in JSON format.

{
    "family": "nginx"
    ...
    "containerDefinitions": [
        {
            ...
            "mountPoints": [
                "containerPath": "/foo",
                "sourceVoumne": "new-ebs-volume"
            ],
            "name": "nginx",
            "image": "nginx"
        }
    ],
    "volumes": [
       {
           "name": "/foo",
           "configuredAtRuntime": true
       }
    ]
}

2. Deploy and run your task with EBS volume
Now you can run a task by selecting your task in your ECS cluster. Go to your ECS cluster and choose Run new task. Note that you can select the compute options, the launch type, and your task definition.

Note: While this example goes through deploying a standalone task with an attached EBS volume, you can also configure a new or existing ECS service to use EBS volumes with the desired configuration.

You have a new Volume section where you can configure the additional storage. The volume name, type, and mount points are those that you defined in your task definition. Choose your EBS volume types, sizes (GiB), IOPs, and the desired throughput.

You cannot attach an existing EBS volume to an ECS task. But if you want to create a volume from an existing snapshot, you have the option to choose your snapshot ID. If you want to create a new volume, then you can leave this field empty. You can choose the file system type, either ext3 or ext4 file systems on Linux.

By default, when a task is terminated, Amazon ECS deletes the attached volume. If you need the data in the EBS volume to be retained after the task exits, check Delete on termination. Also, you need to create an AWS Identity and Access Management (IAM) role for volume management that contains the relevant permissions to allow Amazon ECS to make API calls on your behalf. For more information on this policy, see infrastructure role in the AWS documentation.

You can also configure encryption on your EBS volumes using either Amazon managed keys and customer managed keys. To learn more about the options, see our Amazon EBS encryption in the AWS documentation.

After configuring all task settings, choose Create to start your task.

3. Deploy and run your task with EBS volume
Once your task has started, you can see the volume information on the task definition details page. Choose a task and select the Volumes tab to find your created EBS volume details.

Your team can organize the development and operations of EBS volumes more efficiently. For example, application developers can configure the path where your application expects storage to be available in the task definition, and DevOps engineers can configure the actual EBS volume attributes at runtime when the application is deployed.

This allows DevOps engineers to deploy the same task definition to different environments with differing EBS volume configurations, for example, gp3 volumes in the development environments and io2 volumes in production.

Now available
Amazon ECS integration with Amazon EBS is available in nine AWS Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), and Europe (Stockholm). You only pay for what you use, including EBS volumes and snapshots. To learn more, see the Amazon EBS pricing page and Amazon EBS volumes in ECS in the AWS documentation.

Give it a try now and send feedback to our public roadmap, AWS re:Post for Amazon ECS, or through your usual AWS Support contacts.

Channy

P.S. Special thanks to Maish Saidel-Keesing, a senior enterprise developer advocate at AWS for his contribution in writing this blog post.

Happy New Year! AWS Weekly Roundup – January 8, 2024

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/happy-new-year-aws-weekly-roundup-january-8-2024/

Happy New Year! Cloud technologies, machine learning, and generative AI have become more accessible, impacting nearly every aspect of our lives. Amazon CTO Dr. Werner Vogels offers four tech predictions for 2024 and beyond:

  • Generative AI becomes culturally aware
  • FemTech finally takes off
  • AI assistants redefine developer productivity
  • Education evolves to match the speed of technology

Read how these technology trends will converge to help solve some of society’s most difficult problems. Download the Werner Vogels’ Tech Predictions for 2024 and Beyond ebook or read Werner’s All Things Distributed blog.

AWS re:Invent 2023To hear insights from AWS and industry thought leaders, grow your skills, and get inspired, watch AWS re:Invent 2023 videos on demand for keynotes, innovation talks, breakout sessions, and AWS Hero guide playlists.

Launches from the last few weeks
Since our last week in review on December 18, 2023, I’d like to highlight some launches from year end, as well as last week:

New AWS Canada West (Calgary) Region – We are opening a new and second Region and in Canada, AWS Canada West (Calgary). At the end of 2023, AWS had 33 AWS Regions and 105 Availability Zones (AZs) globally. We preannounced 12 additional AZs in four future Regions in Malaysia, New Zealand, Thailand, and the AWS European Sovereign Cloud. We will share more information on these Regions in 2024. Please stay tuned.

DNS over HTTPS in Amazon Route 53 Resolver – You can use the DNS over HTTPS (DoH) protocol for both inbound and outbound Route 53 Resolver endpoints. As the name suggests, DoH supports HTTP or HTTP/2 over TLS to encrypt the data exchanged for Domain Name System (DNS) resolutions.

Automatic enrollment to Amazon RDS Extended Support – Your MySQL 5.7 and PostgreSQL 11 database instances running on Amazon Aurora and Amazon RDS will be automatically enrolled into Amazon RDS Extended Support starting on February 29, 2024. You can have more control over when you want to upgrade the major version of your database after the community end of life (EoL).

New Amazon CloudWatch Network Monitor – This is a new feature of Amazon CloudWatch that helps monitor network availability and performance between AWS and your on-premises environments. Network Monitor needs zero manual instrumentation and gives you access to real-time network visibility to proactively and quickly identify issues within the AWS network and your own hybrid environment. For more information, read Monitor hybrid connectivity with Amazon CloudWatch Network Monitor.

Amazon Aurora PostgreSQL integrations with Amazon Bedrock – You can use two methods to integrate Aurora PostgreSQL databases with Amazon Bedrock to power generative AI applications. You can use the SQL query with Aurora ML integration with Amazon Bedrock and Aurora vector store with Knowledge Bases for Amazon Bedrock for Retrieval Augmented Generation (RAG).

New WordPress setup on Amazon Lightsail – Set up your WordPress website on Amazon Lightsail with the new workflow to eliminate complexity and time spent configuring your website. The workflow allows you to complete all the necessary steps, including setting up a Secure Sockets Layer (SSL) certificate to secure your website with HTTPS.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Here are some other news items that you may find interesting in the new year:

Book recommendations for AWS customer executives – Plan for the new year and catch up on what others are doing and thinking. AWS Enterprise Strategy team recommends what books are most important for our AWS customer executives to read.

Best practices for scaling AWS CDK adoption with Platform Engineering – A recent evolution in DevOps is the introduction of platform engineering teams to build services, toolchains, and documentation to support workload teams. This blog post introduces strategies and best practices for accelerating CDK adoption within your organization. You can learn how to scale the lessons learned from the pilot project across your organization through platform engineering.

High performance running HPC applications on AWS Graviton instances – When running the Parallel Lattice Boltzmann Solver (Palabos) on Amazon EC2 Hpc7g instances to solve computational fluid dynamics (CFD) problems, performance increased by up to 70% and price performance was up to 3x better than on the previous generation of Graviton instances.

The new AWS open source newsletter, #181 – Check up on all the latest open source content, which this week includes AWS Amplify, Amazon Corretto, dbt, Apache Flink, Karpenter, LangChain, Pinecone, and more.

Upcoming AWS Events
Check your calendars and sign up for these AWS events in the new year:

AWS at CES 2024 (January 9-12) – AWS will be representing some of the latest cloud services and solutions that are purpose built for the automotive, mobility, transportation, and manufacturing industries. Join us to learn about the latest cloud capabilities across generative AI, software define vehicles, product engineering, sustainability, new digital customer experiences, connected mobility, autonomous driving, and so much more in Amazon Experience Area.

APJ Builders Online Series (January 18) – This online conference is designed for you to learn core AWS concepts, and step-by-step architectural best practices, including demonstrations to help you get started and accelerate your success on AWS.

You can browse all upcoming AWS-led in-person and virtual events, and developer-focused events such as AWS DevDay.

That’s all for this week. Check back next Monday for another Week in Review!

— Channy

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Your MySQL 5.7 and PostgreSQL 11 databases will be automatically enrolled into Amazon RDS Extended Support

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/your-mysql-5-7-and-postgresql-11-databases-will-be-automatically-enrolled-into-amazon-rds-extended-support/

Today, we are announcing that your MySQL 5.7 and PostgreSQL 11 database instances running on Amazon Aurora and Amazon Relational Database Service (Amazon RDS) will be automatically enrolled into Amazon RDS Extended Support starting on February 29, 2024.

This will help avoid unplanned downtime and compatibility issues that can arise with automatically upgrading to a new major version. This provides you with more control over when you want to upgrade the major version of your database.

This automatic enrollment may mean that you will experience higher charges when RDS Extended Support begins. You can avoid these charges by upgrading your database to a newer DB version before the start of RDS Extended Support.

What is Amazon RDS Extended Support?
In September 2023, we announced Amazon RDS Extended Support, which allows you to continue running your database on a major engine version past its RDS end of standard support date on Amazon Aurora or Amazon RDS at an additional cost.

Until community end of life (EoL), the MySQL and PostgreSQL open source communities manage common vulnerabilities and exposures (CVE) identification, patch generation, and bug fixes for the respective engines. The communities release a new minor version every quarter containing these security patches and bug fixes until the database major version reaches community end of life. After the community end of life date, CVE patches or bug fixes are no longer available and the community considers those engines unsupported. For example, MySQL 5.7 and PostgreSQL 11 are no longer supported by the communities as of October and November 2023 respectively. We are grateful to the communities for their continued support of these major versions and a transparent process and timeline for transitioning to the newest major version.

With RDS Extended Support, Amazon Aurora and RDS takes on engineering the critical CVE patches and bug fixes for up to three years beyond a major version’s community EoL. For those 3 years, Amazon Aurora and RDS will work to identify CVEs and bugs in the engine, generate patches and release them to you as quickly as possible. Under RDS Extended Support, we will continue to offer support, such that the open source community’s end of support for an engine’s major version does not leave your applications exposed to critical security vulnerabilities or unresolved bugs.

You might wonder why we are charging for RDS Extended Support rather than providing it as part of the RDS service. It’s because the engineering work for maintaining security and functionality of community EoL engines requires AWS to invest developer resources for critical CVE patches and bug fixes. This is why RDS Extended Support is only charging customers who need the additional flexibility to stay on a version past community EoL.

RDS Extended Support may be useful to help you meet your business requirements for your applications if you have particular dependencies on a specific MySQL or PostgreSQL major version, such as compatibility with certain plugins or custom features. If you are currently running on-premises database servers or self-managed Amazon Elastic Compute Cloud (Amazon EC2) instances, you can migrate to Amazon Aurora MySQL-Compatible Edition, Amazon Aurora PostgreSQL-Compatible Edition, Amazon RDS for MySQL, Amazon RDS for PostgreSQL beyond the community EoL date, and continue to use these versions these versions with RDS Extended Support while benefiting from a managed service. If you need to migrate many databases, you can also utilize RDS Extended Support to split your migration into phases, ensuring a smooth transition without overwhelming IT resources.

In 2024, RDS Extended Support will be available for RDS for MySQL major versions 5.7 and higher, RDS for PostgreSQL major versions 11 and higher, Aurora MySQL-compatible version 2 and higher, and Aurora PostgreSQL-compatible version 11 and higher. For a list of all future supported versions, see Supported MySQL major versions on Amazon RDS and Amazon Aurora major versions in the AWS documentation.

Community major version RDS/Aurora version Community end of life date End of RDS standard support date Start of RDS Extended Support pricing End of RDS Extended Support
MySQL 5.7 RDS for MySQL 5.7 October 2023 February 29, 2024 March 1, 2024 February 28, 2027
Aurora MySQL 2 October 31, 2024 December 1, 2024
PostgreSQL 11 RDS for PostgreSQL 11 November 2023 March 31, 2024 April 1, 2024 March 31, 2027
Aurora PostgreSQL 11 February 29, 2024

RDS Extended Support is priced per vCPU per hour. Learn more about pricing details and timelines for RDS Extended Support at Amazon Aurora pricing, RDS for MySQL pricing, and RDS for PostgreSQL pricing. For more information, see the blog posts about Amazon RDS Extended Support for MySQL and PostgreSQL databases in the AWS Database Blog.

Why are we automatically enrolling all databases to Amazon RDS Extended Support?
We had originally informed you that RDS Extended Support would provide the opt-in APIs and console features in December 2023. In that announcement, we said that if you decided not to opt your database in to RDS Extended Support, it would automatically upgrade to a newer engine version starting on March 1, 2024. For example, you would be upgraded from Aurora MySQL 2 or RDS for MySQL 5.7 to Aurora MySQL 3 or RDS for MySQL 8.0 and from Aurora PostgreSQL 11 or RDS for PostgreSQL 11 to Aurora PostgreSQL 15 and RDS for PostgreSQL 15, respectively.

However, we heard lots of feedback from customers that these automatic upgrades may cause their applications to experience breaking changes and other unpredictable behavior between major versions of community DB engines. For example, an unplanned major version upgrade could introduce compatibility issues or downtime if applications are not ready for MySQL 8.0 or PostgreSQL 15.

Automatic enrollment in RDS Extended Support gives you additional time and more control to organize, plan, and test your database upgrades on your own timeline, providing you flexibility on when to transition to new major versions while continuing to receive critical security and bug fixes from AWS.

If you’re worried about increased costs due to automatic enrollment in RDS Extended Support, you can avoid RDS Extended Support and associated charges by upgrading before the end of RDS standard support.

How to upgrade your database to avoid RDS Extended Support charges
Although RDS Extended Support helps you schedule your upgrade on your own timeline, sticking with older versions indefinitely means missing out on the best price-performance for your database workload and incurring additional costs from RDS Extended Support.

MySQL 8.0 on Aurora MySQL, also known as Aurora MySQL 3, unlocks support for popular Aurora features, such as Global Database, Amazon RDS Proxy, Performance Insights, Parallel Query, and Serverless v2 deployments. Upgrading to RDS for MySQL 8.0 provides features including up to three times higher performance versus MySQL 5.7, such as Multi-AZ cluster deployments, Optimized Reads, Optimized Writes, and support for AWS Graviton2 and Graviton3-based instances.

PostgreSQL 15 on Aurora PostgreSQL supports the Aurora I/O Optimized configuration, Aurora Serverless v2, Babelfish for Aurora PostgreSQL, pgvector extension, Trusted Language Extensions for PostgreSQL (TLE), and AWS Graviton3-based instances as well as community enhancements. Upgrading to RDS for PostgreSQL 15 provides features such as Multi-AZ DB cluster deployments, RDS Optimized Reads, HypoPG extension, pgvector extension, TLEs for PostgreSQL, and AWS Graviton3-based instances.

Major version upgrades may make database changes that are not backward-compatible with existing applications. You should manually modify your database instance to upgrade to the major version. It is strongly recommended that you thoroughly test any major version upgrade on non-production instances before applying it to production to ensure compatibility with your applications. For more information about an in-place upgrade from MySQL 5.7 to 8.0, see the incompatibilities between the two versions, Aurora MySQL in-place major version upgrade, and RDS for MySQL upgrades in the AWS documentation. For the in-place upgrade from PostgreSQL 11 to 15, you can use the pg_upgrade method.

To minimize downtime during upgrades, we recommend using Fully Managed Blue/Green Deployments in Amazon Aurora and Amazon RDS. With just a few steps, you can use Amazon RDS Blue/Green Deployments to create a separate, synchronized, fully managed staging environment that mirrors the production environment. This involves launching a parallel green environment with upper version replicas of your production databases lower version. After validating the green environment, you can shift traffic over to it. Then, the blue environment can be decommissioned. To learn more, see Blue/Green Deployments for Aurora MySQL and Aurora PostgreSQL or Blue/Green Deployments for RDS for MySQL and RDS for PostgreSQL in the AWS documentation. In most cases, Blue/Green Deployments are the best option to reduce downtime, except for limited cases in Amazon Aurora or Amazon RDS.

For more information on performing a major version upgrade in each DB engine, see the following guides in the AWS documentation.

Now available
Amazon RDS Extended Support is now available for all customers running Amazon Aurora and Amazon RDS instances using MySQL 5.7, PostgreSQL 11, and higher major versions in AWS Regions, including the AWS GovCloud (US) Regions beyond the end of the standard support date in 2024. You don’t need to opt in to RDS Extended Support, and you get the flexibility to upgrade your databases and continued support for up to 3 years.

Learn more about RDS Extended Support in the Amazon Aurora User Guide and the Amazon RDS User Guide. For pricing details and timelines for RDS Extended Support, see Amazon Aurora pricing, RDS for MySQL pricing, and RDS for PostgreSQL pricing.

Please send feedback to AWS re:Post for Amazon RDS and Amazon Aurora or through your usual AWS Support contacts.

Channy