Tag Archives: news

The 4th Gen AMD EPYC LEGO Model You Have Dreamed Of

2024-10-06 Patrick Kennedy

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/the-4th-gen-amd-epyc-lego-model-you-have-dreamed-of/

Ever dream of combining big server processors and LEGO bricks? That dream is here with a 4th Gen AMD EPYC model built out of LEGO bricks.

The post The 4th Gen AMD EPYC LEGO Model You Have Dreamed Of appeared first on ServeTheHome.

NICE DCV is now Amazon DCV with 2024.0 release

2024-10-01 Sébastien Stormacq

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/nice-desktop-cloud-visualization-dcv-is-now-amazon-dcv/

Today, NICE DCV has a new name. So long NICE DCV, welcome Amazon DCV. Today, with the 2024.0 release, along with enhancements and bug fixes, NICE DCV is rebranded to Amazon DCV.

The new name is now also used to consistently refer to the DCV protocol powering AWS managed services such as Amazon AppStream 2.0 and Amazon WorkSpaces.

What is Amazon DCV
Amazon DCV is a high-performance remote display protocol. It lets you securely deliver remote desktops and application streaming from any cloud or data center to any device, over varying network conditions. By using Amazon DCV with Amazon Elastic Compute Cloud (Amazon EC2), you can run graphics-intensive applications remotely on EC2 instances. You can then stream the results to more modest client machines, which eliminates the need for expensive dedicated workstations.

Amazon DCV supports both Windows and major flavors of Linux operating systems on the server side, providing you flexibility to fit your organization’s needs. The client side, which receives the desktop and application streams, can be the native DCV client for Windows, Linux, or macOS, or a web browser. The DCV server and client transfer only encrypted pixels, not data, so no confidential data is downloaded from the DCV server. When you use Amazon DCV on Amazon Web Services (AWS) with EC2 instances, you can take advantage of the 108 AWS Availability Zones across 33 geographic Regions, plus 31 Local Zones, allowing your remote streaming services to scale globally.

Since Amazon acquired NICE 8 years ago, we’ve witnessed a diverse range of customers adopting DCV. From general-purpose users visualizing business applications to industry-specific professionals, DCV has proven to be versatile. For instance, artists have employed DCV to access powerful cloud workstations for their digital content creation and rendering tasks. In the healthcare sector, medical imaging professionals have used DCV for remote visualization and analysis of patient data. Geoscientists have used DCV to analyze reservoir simulation results, while engineers in manufacturing have used it to visualize computational fluid dynamics experiments. The education and IT support industries have benefited from collaborative sessions in DCV, in which multiple users can share a single desktop.

Notable customers include Quantic Dream, an award-winning game development studio that has harnessed DCV to create high-resolution, low-latency streaming services for their artists and developers. Tally Solutions, an enterprise resource planning (ERP) services provider, has employed DCV to securely stream its ERP software to thousands of customers. Volkswagen has used DCV to provide remote access to computer-aided engineering (CAE) applications for over 1,000 automotive engineers. Amazon Kuiper, an initiative to bring broadband connectivity to underserved communities, has used DCV for designing complex chips.

Within AWS, DCV has been adopted by several services to provide managed solutions to customers. For example, AppStream 2.0 uses DCV to offer secure, reliable, and scalable application streaming. Additionally, since 2020, Amazon WorkSpaces Streaming Protocol (WSP), which is built on DCV and optimized for high performance, is available for Amazon WorkSpaces customers. Today, we’re also phasing out the WSP name and replacing it with DCV. Going forward, you will have DCV as a primary protocol choice in Amazon WorkSpaces.

What’s new with version 2024.0
Amazon DCV 2024.0 introduces several fixes and enhancements for improved performance, security, and ease of use. The 2024.0 release now supports Ubuntu 24.04 LTS, bringing the latest security updates and extended long-term support to simplify system maintenance. The DCV client on Ubuntu 24.04 has built-in support for Wayland, offering better graphical rendering efficiency and enhanced application isolation. Additionally, DCV 2024.0 now enables the QUIC UDP protocol by default, allowing clients to benefit from an optimized streaming experience. The release also introduces the capability to blank the Linux host screen when a remote user is connected, preventing local access and interaction with the remote session.

How to get started
The easiest way to test DCV is to spin up a WorkSpaces instance from the WorkSpaces console, selecting one of the DCV-powered bundles, or to create an AppStream session. For this demo, however, I want to show you how to install DCV server on an EC2 instance.

I installed DCV server on two servers running on Amazon EC2, one running Windows Server 2022 and one running Ubuntu 24.04. I also installed the client on my macOS laptop. The client and server packages are available to download on our website. For both servers, make sure the security group authorizes inbound connections on UDP or TCP port 8443, the default port DCV uses.
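The security group requirement above can be sketched in Python. This is a minimal illustration, not part of the original walkthrough: the helper names `dcv_ingress_permissions` and `open_dcv_port`, the rule description, and the example CIDR range are all hypothetical, and the sketch opens both TCP and UDP so that the QUIC transport can be used alongside TCP.

```python
DCV_PORT = 8443  # default DCV port, per the text above


def dcv_ingress_permissions(cidr, port=DCV_PORT):
    """Build one ingress rule per transport protocol for the DCV port."""
    return [
        {
            "IpProtocol": protocol,
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": cidr, "Description": "Amazon DCV clients"}],
        }
        for protocol in ("tcp", "udp")
    ]


def open_dcv_port(group_id, cidr):
    """Apply the rules to a security group (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helper above works without AWS access

    ec2 = boto3.client("ec2")
    ec2.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=dcv_ingress_permissions(cidr),
    )


if __name__ == "__main__":
    # 203.0.113.0/24 is a documentation-only range; replace it with your client network.
    for rule in dcv_ingress_permissions("203.0.113.0/24"):
        print(rule["IpProtocol"], rule["FromPort"])
```

Restricting the source range to the networks your clients actually connect from is usually preferable to opening the port to every address.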

The Windows installation is straightforward: start the msi file, select Next at each step and voilà. It was installed in less time than it took me to write this sentence.

The installation on Linux deserves a bit more care. Amazon Machine Images (AMI) for EC2 servers don’t include any desktop or graphical components. As a prerequisite, I had to install the X Window System and a window manager, and configure X to let users connect and start a graphical user interface session on the server. Fortunately, all these steps are well documented. Here is a summary of the commands I used.

# install desktop packages 
$ sudo apt install ubuntu-desktop

# install a display manager (gdm3 is the GNOME Display Manager)
$ sudo apt install gdm3

# reboot
$ sudo reboot

After the reboot, I installed the DCV server packages:

# Install the server 
$ sudo apt install ./nice-dcv-server_2024.0.17794-1_amd64.ubuntu2404.deb
$ sudo apt install ./nice-xdcv_2024.0.625-1_amd64.ubuntu2404.deb

# (optional) install the DCV web viewer to allow clients to connect from a web browser
$ sudo apt install ./nice-dcv-web-viewer_2024.0.17794-1_amd64.ubuntu2404.deb

Because my server had no GPU, I also followed the documented steps to install the X11 dummy driver and configure X11 to use it.

Then, I started the service:

$ sudo systemctl enable dcvserver.service 
$ sudo systemctl start dcvserver.service 
$ sudo systemctl status dcvserver.service

I created a user at the operating system level and assigned a password and a home directory. Then, I checked my setup on the server before trying to connect from the client.

$ sudo dcv list-sessions
There are no sessions available.

$ sudo dcv create-session console --type virtual --owner seb

$ sudo dcv list-sessions
Session: 'console' (owner:seb type:virtual)
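If you want to check sessions from a script instead of by eye, the output shown above can be parsed with a few lines of Python. This is a sketch that assumes the output format is exactly what appears in this example; the `parse_sessions` helper is hypothetical, and other DCV versions may print additional fields.

```python
import re

# Matches lines such as: Session: 'console' (owner:seb type:virtual)
SESSION_LINE = re.compile(
    r"^Session: '(?P<id>[^']+)' \(owner:(?P<owner>\S+) type:(?P<type>\S+)\)$"
)


def parse_sessions(output):
    """Return one dict per session line found in `dcv list-sessions` output."""
    sessions = []
    for line in output.splitlines():
        match = SESSION_LINE.match(line.strip())
        if match:
            sessions.append(match.groupdict())
    return sessions


if __name__ == "__main__":
    sample = "Session: 'console' (owner:seb type:virtual)"
    print(parse_sessions(sample))
    # An empty list means no session exists yet.
    print(parse_sessions("There are no sessions available."))
```

On the server, the input string could come from `subprocess.run(["sudo", "dcv", "list-sessions"], capture_output=True, text=True).stdout`.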

Once my server configuration was ready, I started the DCV client on my laptop. I only had to enter the IP address of the server and the username and password of the user to initiate a session.

On my laptop, I opened a new DCV client window and connected to the other EC2 server. After a few seconds, I was able to work remotely with the Windows and the Ubuntu machines running in the cloud.

In this example, I focus on installing Amazon DCV on a single EC2 instance. However, when building your own service infrastructure, you may want to explore the other components that are part of the DCV offering: Amazon DCV Session Manager, Amazon DCV Access Console, and Amazon DCV Connection Gateway.

Pricing and availability
Amazon DCV is free of charge when used on AWS. You only pay for the usage of AWS resources or services, such as EC2 instances, Amazon WorkSpaces desktops, or Amazon AppStream 2.0. If you plan to use DCV with on-premises servers, check the list of license resellers on our website.

Now go build your own servers with DCV.

— seb

Farewell to the Tyan Server Brand As it is Giving Way to MiTAC Computing

2024-10-01 Patrick Kennedy

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/farewell-to-the-tyan-server-brand-as-it-is-giving-way-to-mitac-computing/

Farewell to the venerable Tyan server brand as it gives way to the umbrella MiTAC Computing brand for Q4 2024.

The post Farewell to the Tyan Server Brand As it is Giving Way to MiTAC Computing appeared first on ServeTheHome.

STH Q3 2024 Letter from the Editor The Coolest Quarter

2024-09-30 Patrick Kennedy

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/sth-q3-2024-letter-from-the-editor-the-coolest-quarter/

In our STH Q3 2024 Letter from the Editor, we go into the state of STH and some of the cool things we have been learning.

The post STH Q3 2024 Letter from the Editor The Coolest Quarter appeared first on ServeTheHome.

New 36GB SK hynix HBM3E 12-High in Volume Production

2024-09-28 Cliff Robinson

Post Syndicated from Cliff Robinson original https://www.servethehome.com/new-36gb-sk-hynix-hbm3e-12-high-in-volume-production/

SK hynix joins the 36GB HBM3E 12-high stack club, announcing that its HBM modules are now in volume production.

The post New 36GB SK hynix HBM3E 12-High in Volume Production appeared first on ServeTheHome.

Run your compute-intensive and general purpose workloads sustainably with the new Amazon EC2 C8g, M8g instances

2024-09-26 Veliswa Boya

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/run-your-compute-intensive-and-general-purpose-workloads-sustainably-with-the-new-amazon-ec2-c8g-m8g-instances/

Today we’re announcing general availability of the Amazon Elastic Compute Cloud (Amazon EC2) C8g and M8g instances.

C8g instances are AWS Graviton4 based and are ideal for compute-intensive workloads such as high performance computing (HPC), batch processing, gaming, video encoding, scientific modeling, distributed analytics, CPU-based machine learning (ML) inference, and ad serving.

Also Graviton4 based, M8g instances provide the best price performance for general purpose workloads. M8g instances are ideal for applications such as application servers, microservices, gaming servers, mid-size data stores, and caching fleets.

Here are some of the improvements available in both instance families. C8g and M8g instances offer larger instance sizes with up to three times more vCPUs (up to 48xl), three times the memory (up to 384 GiB for C8g and up to 768 GiB for M8g), 75 percent more memory bandwidth, and two times more L2 cache over equivalent 7g instances. This helps you to process larger amounts of data, scale up your workloads, improve time to results, and lower your total cost of ownership (TCO). These instances also offer up to 50 Gbps network bandwidth and up to 40 Gbps Amazon Elastic Block Store (Amazon EBS) bandwidth, compared to up to 30 Gbps network bandwidth and up to 20 Gbps Amazon EBS bandwidth on Graviton3-based instances. Similar to R8g instances, C8g and M8g instances offer two bare metal sizes (metal-24xl and metal-48xl). You can right-size your instances and deploy workloads that benefit from direct access to physical resources.
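To put the bandwidth figures in relative terms, here is a small worked calculation. It uses only the maximum values quoted above (50 versus 30 Gbps for networking, 40 versus 20 Gbps for EBS); the `percent_increase` helper is just for illustration.

```python
def percent_increase(old, new):
    """Relative increase from old to new, expressed as a percentage."""
    return (new - old) / old * 100


# Peak figures quoted in the paragraph above (Graviton3-based vs. C8g/M8g).
print(f"Network bandwidth: {percent_increase(30, 50):.1f}% higher")  # 66.7% higher
print(f"EBS bandwidth: {percent_increase(20, 40):.1f}% higher")      # 100.0% higher
```

These are peak numbers for the largest sizes; smaller sizes have lower limits, as the tables that follow show.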

The specs for the C8g instances are as follows.

Instance size	vCPUs	Memory (GiB)	Network bandwidth (Gbps)	EBS bandwidth (Gbps)
c8g.medium	1	2	Up to 12.5	Up to 10
c8g.large	2	4	Up to 12.5	Up to 10
c8g.xlarge	4	8	Up to 12.5	Up to 10
c8g.2xlarge	8	16	Up to 15	Up to 10
c8g.4xlarge	16	32	Up to 15	Up to 10
c8g.8xlarge	32	64	15	10
c8g.12xlarge	48	96	22.5	15
c8g.16xlarge	64	128	30	20
c8g.24xlarge	96	192	40	30
c8g.48xlarge	192	384	50	40
c8g.metal-24xl	96	192	40	30
c8g.metal-48xl	192	384	50	40

The specs for the M8g instances are as follows.

Instance size	vCPUs	Memory (GiB)	Network bandwidth (Gbps)	EBS bandwidth (Gbps)
m8g.medium	1	4	Up to 12.5	Up to 10
m8g.large	2	8	Up to 12.5	Up to 10
m8g.xlarge	4	16	Up to 12.5	Up to 10
m8g.2xlarge	8	32	Up to 15	Up to 10
m8g.4xlarge	16	64	Up to 15	Up to 10
m8g.8xlarge	32	128	15	10
m8g.12xlarge	48	192	22.5	15
m8g.16xlarge	64	256	30	20
m8g.24xlarge	96	384	40	30
m8g.48xlarge	192	768	50	40
m8g.metal-24xl	96	384	40	30
m8g.metal-48xl	192	768	50	40
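The two tables can drive a simple right-sizing helper. The sketch below encodes the vCPU counts of the virtualized sizes and the memory-per-vCPU ratio that every row implies (2 GiB for C8g, 4 GiB for M8g). The `smallest_fit` function and its selection rule (fewest vCPUs first, then least memory) are assumptions made for illustration; it ignores price, bandwidth, and the bare metal sizes.

```python
# vCPU counts for the virtualized sizes listed in the tables above.
SIZES = {
    "medium": 1, "large": 2, "xlarge": 4, "2xlarge": 8, "4xlarge": 16,
    "8xlarge": 32, "12xlarge": 48, "16xlarge": 64, "24xlarge": 96, "48xlarge": 192,
}

# GiB of memory per vCPU, as implied by every row of each table.
GIB_PER_VCPU = {"c8g": 2, "m8g": 4}


def smallest_fit(vcpus_needed, memory_gib_needed):
    """Return the instance type with the fewest vCPUs (then least memory) that fits."""
    candidates = []
    for family, ratio in GIB_PER_VCPU.items():
        for size, vcpus in SIZES.items():
            memory = vcpus * ratio
            if vcpus >= vcpus_needed and memory >= memory_gib_needed:
                candidates.append((vcpus, memory, f"{family}.{size}"))
    # Tuples compare element by element, so min() applies the selection rule.
    return min(candidates)[2] if candidates else None


if __name__ == "__main__":
    print(smallest_fit(8, 16))   # c8g.2xlarge
    print(smallest_fit(8, 24))   # m8g.2xlarge
    print(smallest_fit(200, 1))  # None
```

The second call shows why the memory ratio matters: a workload that needs 24 GiB on 8 vCPUs fits an M8g size with half the vCPUs of the smallest C8g size that qualifies.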

Good to know

AWS Graviton4 processors offer enhanced security with always-on memory encryption, dedicated caches for every vCPU, and support for pointer authentication.
These instances are built on the AWS Nitro System which is a rich collection of building blocks that offloads many of the traditional virtualization functions to dedicated hardware and software. It delivers high performance, high availability, and high security, thus reducing virtualization overhead.
The C8g and M8g instances are ideal for Linux-based workloads including containerized and microservices-based applications such as those running on Amazon Elastic Kubernetes Service (Amazon EKS) and Amazon Elastic Container Service (Amazon ECS), as well as applications written in popular programming languages such as C/C++, Rust, Go, Java, Python, .NET Core, Node.js, Ruby, and PHP.

Available now
C8g and M8g instances are available today in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Europe (Frankfurt) AWS Regions. As usual with Amazon EC2, you pay only for what you use. For more information, see Amazon EC2 Pricing. Check out the collection of AWS Graviton resources to help you start migrating your applications to Graviton instance types. You can also visit the AWS Graviton Fast Start program to begin your Graviton adoption journey.

To learn more, visit our Amazon EC2 instances page, and please send feedback to AWS re:Post for EC2 or through your usual AWS Support contacts.

– Veliswa

Introducing Llama 3.2 models from Meta in Amazon Bedrock: A new generation of multimodal vision and lightweight models

2024-09-25 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/introducing-llama-3-2-models-from-meta-in-amazon-bedrock-a-new-generation-of-multimodal-vision-and-lightweight-models/

In July, we announced the availability of Llama 3.1 models in Amazon Bedrock. Generative AI technology is improving at incredible speed and today, we are excited to introduce the new Llama 3.2 models from Meta in Amazon Bedrock.

Llama 3.2 offers multimodal vision and lightweight models representing Meta’s latest advancement in large language models (LLMs) and providing enhanced capabilities and broader applicability across various use cases. With a focus on responsible innovation and system-level safety, these new models demonstrate state-of-the-art performance on a wide range of industry benchmarks and introduce features that help you build a new generation of AI experiences.

These models are designed to inspire builders with image reasoning and are more accessible for edge applications, unlocking more possibilities with AI.

The Llama 3.2 collection of models is offered in various sizes, from lightweight text-only 1B and 3B parameter models suitable for edge devices to small and medium-sized 11B and 90B parameter models capable of sophisticated reasoning tasks, including multimodal support for high-resolution images. Llama 3.2 11B and 90B are the first Llama models to support vision tasks, with a new model architecture that integrates image encoder representations into the language model. The new models are designed to be more efficient for AI workloads, with reduced latency and improved performance, making them suitable for a wide range of applications.

All Llama 3.2 models support a 128K context length, maintaining the expanded token capacity introduced in Llama 3.1. Additionally, the models offer improved multilingual support for eight languages including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

In addition to the existing text capable Llama 3.1 8B, 70B, and 405B models, Llama 3.2 supports multimodal use cases. You can now use four new Llama 3.2 models — 90B, 11B, 3B, and 1B — from Meta in Amazon Bedrock to build, experiment, and scale your creative ideas:

Llama 3.2 90B Vision (text + image input) – Meta’s most advanced model, ideal for enterprise-level applications. This model excels at general knowledge, long-form text generation, multilingual translation, coding, math, and advanced reasoning. It also introduces image reasoning capabilities, allowing for image understanding and visual reasoning tasks. This model is ideal for the following use cases: image captioning, image-text retrieval, visual grounding, visual question answering and visual reasoning, and document visual question answering.

Llama 3.2 11B Vision (text + image input) – Well-suited for content creation, conversational AI, language understanding, and enterprise applications requiring visual reasoning. The model demonstrates strong performance in text summarization, sentiment analysis, code generation, and following instructions, with the added ability to reason about images. This model’s use cases are similar to those of the 90B version: image captioning, image-text retrieval, visual grounding, visual question answering and visual reasoning, and document visual question answering.

Llama 3.2 3B (text input) – Designed for applications requiring low-latency inferencing and limited computational resources. It excels at text summarization, classification, and language translation tasks. This model is ideal for the following use cases: mobile AI-powered writing assistants and customer service applications.

Llama 3.2 1B (text input) – The most lightweight model in the Llama 3.2 collection of models, perfect for retrieval and summarization for edge devices and mobile applications. This model is ideal for the following use cases: personal information management and multilingual knowledge retrieval.

In addition, Llama 3.2 is built on top of the Llama Stack, a standardized interface for building canonical toolchain components and agentic applications, making building and deploying easier than ever. Llama Stack API adapters and distributions are designed to make the most effective use of the Llama model capabilities, and they give customers the ability to benchmark Llama models across different vendors.

Meta has tested Llama 3.2 on over 150 benchmark datasets spanning multiple languages and conducted extensive human evaluations, demonstrating competitive performance with other leading foundation models. Let’s see how these models work in practice.

Using Llama 3.2 models in Amazon Bedrock
To get started with Llama 3.2 models, I navigate to the Amazon Bedrock console and choose Model access on the navigation pane. There, I request access for the new Llama 3.2 models: Llama 3.2 1B, 3B, 11B Vision, and 90B Vision.

To test the new vision capability, I open another browser tab and download from the Our World in Data website the Share of electricity generated by renewables chart in PNG format. The chart is very high resolution, and I resize it to be 1,024 pixels wide.

Back in the Amazon Bedrock console, I choose Chat under Playgrounds in the navigation pane, select Meta as the category, and choose the Llama 3.2 90B Vision model.

I use Choose files to select the resized chart image and use this prompt:

Based on this chart, which countries in Europe have the highest share?

I choose Run, and the model analyzes the image and returns its results.

I can also access the models programmatically using the AWS Command Line Interface (AWS CLI) and AWS SDKs. Compared to using the Llama 3.1 models, I only need to update the model IDs as described in the documentation. I can also use the new cross-region inference endpoints for the US and the EU Regions. These endpoints work for any Region within the US and the EU respectively. For example, the cross-region inference endpoints for the Llama 3.2 90B Vision model are:

us.meta.llama3-2-90b-instruct-v1:0
eu.meta.llama3-2-90b-instruct-v1:0
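The two IDs above differ only in their geographic prefix, so a script can derive the right one from the Region it runs in. The helper below is a hypothetical sketch: it assumes the prefix matches the first segment of the Region code (`us-west-2` gives `us`, `eu-central-1` gives `eu`) and rejects anything outside the two geographies mentioned here.

```python
BASE_MODEL_ID = "meta.llama3-2-90b-instruct-v1:0"
GEO_PREFIXES = ("us", "eu")  # the two geographies named in the text


def inference_profile_id(region, base_model_id=BASE_MODEL_ID):
    """Prefix the model ID with the geography inferred from the Region code."""
    prefix = region.split("-", 1)[0]
    if prefix not in GEO_PREFIXES:
        raise ValueError(f"No cross-region endpoint assumed for Region {region!r}")
    return f"{prefix}.{base_model_id}"


if __name__ == "__main__":
    print(inference_profile_id("us-west-2"))     # us.meta.llama3-2-90b-instruct-v1:0
    print(inference_profile_id("eu-central-1"))  # eu.meta.llama3-2-90b-instruct-v1:0
```

Check the Amazon Bedrock documentation for the Regions each endpoint actually covers before relying on this mapping.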

Here’s a sample AWS CLI command using the Amazon Bedrock Converse API. I use the --query parameter of the CLI to filter the result and only show the text content of the output message:

aws bedrock-runtime converse --messages '[{ "role": "user", "content": [ { "text": "Tell me the three largest cities in Italy." } ] }]' --model-id us.meta.llama3-2-90b-instruct-v1:0 --query 'output.message.content[*].text' --output text

In the output, I get the response message from the "assistant".

The three largest cities in Italy are:

1. Rome (Roma) - population: approximately 2.8 million
2. Milan (Milano) - population: approximately 1.4 million
3. Naples (Napoli) - population: approximately 970,000

It’s not much different if you use one of the AWS SDKs. For example, here’s how you can use Python with the AWS SDK for Python (Boto3) to analyze the same image as in the console example:

import boto3

MODEL_ID = "us.meta.llama3-2-90b-instruct-v1:0"
# MODEL_ID = "eu.meta.llama3-2-90b-instruct-v1:0"

IMAGE_NAME = "share-electricity-renewable-small.png"

bedrock_runtime = boto3.client("bedrock-runtime")

with open(IMAGE_NAME, "rb") as f:
    image = f.read()

user_message = "Based on this chart, which countries in Europe have the highest share?"

messages = [
    {
        "role": "user",
        "content": [
            {"image": {"format": "png", "source": {"bytes": image}}},
            {"text": user_message},
        ],
    }
]

response = bedrock_runtime.converse(
    modelId=MODEL_ID,
    messages=messages,
)
response_text = response["output"]["message"]["content"][0]["text"]
print(response_text)
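The example above reads only the first content block. A response message can carry several blocks, which is why the CLI example selects `content[*].text`. Here is a sketch of the same idea in Python; the `extract_text` helper and the sample response dictionary are made up for illustration and only mirror the shape used above.

```python
def extract_text(response):
    """Join the text of every text block in a Converse response message."""
    blocks = response["output"]["message"]["content"]
    return "\n".join(block["text"] for block in blocks if "text" in block)


if __name__ == "__main__":
    # Hand-written stand-in for a Converse response, not real model output.
    sample_response = {
        "output": {
            "message": {
                "role": "assistant",
                "content": [{"text": "Part one."}, {"text": "Part two."}],
            }
        }
    }
    print(extract_text(sample_response))
```

Blocks without a `text` key are skipped instead of raising a `KeyError`.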

Llama 3.2 models are also available in Amazon SageMaker JumpStart, a machine learning (ML) hub that makes it easy to deploy pre-trained models using the console or programmatically through the SageMaker Python SDK. From SageMaker JumpStart, you can also access and deploy new safeguard models that can help classify the safety level of model inputs (prompts) and outputs (responses), including Llama Guard 3 11B Vision, which are designed to support responsible innovation and system-level safety.

In addition, you can easily fine-tune Llama 3.2 1B and 3B models with SageMaker JumpStart today. Fine-tuned models can then be imported as custom models into Amazon Bedrock. Fine-tuning for the full collection of Llama 3.2 models in Amazon Bedrock and Amazon SageMaker JumpStart is coming soon.

The publicly available weights of Llama 3.2 models make it easier to deliver tailored solutions for custom needs. For example, you can fine-tune a Llama 3.2 model for a specific use case and bring it into Amazon Bedrock as a custom model, potentially outperforming other models in domain-specific tasks. Whether you’re fine-tuning for enhanced performance in areas like content creation, language understanding, or visual reasoning, Llama 3.2’s availability in Amazon Bedrock and SageMaker empowers you to create unique, high-performing AI capabilities that can set your solutions apart.

More on Llama 3.2 model architecture
Llama 3.2 builds upon the success of its predecessors with an advanced architecture designed for optimal performance and versatility:

Auto-regressive language model – At its core, Llama 3.2 uses an optimized transformer architecture, allowing it to generate text by predicting the next token based on the previous context.

Fine-tuning techniques – The instruction-tuned versions of Llama 3.2 employ two key techniques:

Supervised fine-tuning (SFT) – This process adapts the model to follow specific instructions and generate more relevant responses.
Reinforcement learning with human feedback (RLHF) – This advanced technique aligns the model’s outputs with human preferences, enhancing helpfulness and safety.

Multimodal capabilities – For the 11B and 90B Vision models, Llama 3.2 introduces a novel approach to image understanding:

Separately trained image reasoning adaptor weights are integrated with the core LLM weights.
These adaptors are connected to the main model through cross-attention mechanisms. Cross-attention allows one section of the model to focus on relevant parts of another component’s output, enabling information flow between different sections of the model.
When an image is input, the model treats the image reasoning process as a “tool use” operation, allowing for sophisticated visual analysis alongside text processing. In this context, tool use is the generic term used when a model uses external resources or functions to augment its capabilities and complete tasks more effectively.

Optimized inference – All models support grouped-query attention (GQA), which enhances inference speed and efficiency, particularly beneficial for the larger 90B model.
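In grouped-query attention, several query heads share one key/value head, which shrinks the key/value cache that has to be kept during inference. The sketch below shows only that head mapping. The head counts are hypothetical and not the actual Llama 3.2 configuration.

```python
def kv_head_for(query_head, n_query_heads, n_kv_heads):
    """Return the index of the key/value head shared by the given query head."""
    if n_query_heads % n_kv_heads != 0:
        raise ValueError("query heads must divide evenly into key/value heads")
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size


if __name__ == "__main__":
    # Hypothetical layout: 8 query heads sharing 2 key/value heads.
    print([kv_head_for(h, 8, 2) for h in range(8)])  # [0, 0, 0, 0, 1, 1, 1, 1]
```

With this layout the model stores keys and values for 2 heads instead of 8, while still computing 8 distinct attention patterns.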

This architecture enables Llama 3.2 to handle a wide range of tasks, from text generation and understanding to complex reasoning and image analysis, all while maintaining high performance and adaptability across different model sizes.
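The cross-attention mechanism mentioned for the Vision models can be illustrated with a generic, dependency-free sketch of scaled dot-product cross-attention. This is not Meta’s implementation: it omits the learned projection matrices, uses tiny hand-written vectors, and treats the "text" and "image" labels as placeholders for the two components.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Each query (text side) takes a weighted mix of the values (image side)."""
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        mixed = [
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights


if __name__ == "__main__":
    text_queries = [[1.0, 0.0]]              # one text-side vector
    image_keys = [[1.0, 0.0], [0.0, 1.0]]    # two image-side vectors
    image_values = [[10.0, 0.0], [0.0, 10.0]]
    outputs, weights = cross_attention(text_queries, image_keys, image_values)
    print(weights[0])  # the first image vector gets the larger weight
    print(outputs[0])
```

The query aligns with the first key, so the output leans toward the first value vector; that weighting is how one part of a model focuses on the relevant parts of another component’s output.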

Things to know
Llama 3.2 models from Meta are now generally available in Amazon Bedrock in the following AWS Regions:

Llama 3.2 1B and 3B models are available in the US West (Oregon) and Europe (Frankfurt) Regions, and are available in the US East (Ohio, N. Virginia) and Europe (Ireland, Paris) Regions via cross-region inference.
Llama 3.2 11B Vision and 90B Vision models are available in the US West (Oregon) Region, and are available in the US East (Ohio, N. Virginia) Regions via cross-region inference.

Check the full AWS Region list for future updates. To estimate your costs, visit the Amazon Bedrock pricing page.

To learn more about Llama 3.2 features and capabilities, visit the Llama models section of the Amazon Bedrock documentation. Give Llama 3.2 a try in the Amazon Bedrock console today, and send feedback to AWS re:Post for Amazon Bedrock.

You can find deep-dive technical content and discover how our Builder communities are using Amazon Bedrock at community.aws. Let us know what you build with Llama 3.2 in Amazon Bedrock!

— Danilo

Jamba 1.5 family of models by AI21 Labs is now available in Amazon Bedrock

2024-09-23 Antje Barth

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/jamba-1-5-family-of-models-by-ai21-labs-is-now-available-in-amazon-bedrock/

Today, we are announcing the availability of AI21 Labs’ powerful new Jamba 1.5 family of large language models (LLMs) in Amazon Bedrock. These models represent a significant advancement in long-context language capabilities, delivering speed, efficiency, and performance across a wide range of applications. The Jamba 1.5 family of models includes Jamba 1.5 Mini and Jamba 1.5 Large. Both models support a 256K token context window, structured JSON output, and function calling, and both can digest document objects.

AI21 Labs is a leader in building foundation models and artificial intelligence (AI) systems for the enterprise. Together, AI21 Labs and AWS are empowering customers across industries to build, deploy, and scale generative AI applications that solve real-world challenges and spark innovation through a strategic collaboration. With AI21 Labs’ advanced, production-ready models together with Amazon’s dedicated services and powerful infrastructure, customers can leverage LLMs in a secure environment to shape the future of how we process information, communicate, and learn.

What is Jamba 1.5?
Jamba 1.5 models leverage a unique hybrid architecture that combines the transformer model architecture with Structured State Space model (SSM) technology. This innovative approach allows Jamba 1.5 models to handle long context windows up to 256K tokens, while maintaining the high-performance characteristics of traditional transformer models. You can learn more about this hybrid SSM/transformer architecture in the Jamba: A Hybrid Transformer-Mamba Language Model whitepaper.

You can now use two new Jamba 1.5 models from AI21 in Amazon Bedrock:

Jamba 1.5 Large excels at complex reasoning tasks across all prompt lengths, making it ideal for applications that require high quality outputs on both long and short inputs.
Jamba 1.5 Mini is optimized for low-latency processing of long prompts, enabling fast analysis of lengthy documents and data.

Key strengths of the Jamba 1.5 models include:

Long context handling – With 256K token context length, Jamba 1.5 models can improve the quality of enterprise applications, such as lengthy document summarization and analysis, as well as agentic and RAG workflows.
Multilingual – Support for English, Spanish, French, Portuguese, Italian, Dutch, German, Arabic, and Hebrew.
Developer-friendly – Native support for structured JSON output, function calling, and digesting document objects.
Speed and efficiency – AI21 measured the performance of Jamba 1.5 models and shared that the models demonstrate up to 2.5X faster inference on long contexts than other models of comparable sizes. For detailed performance results, visit the Jamba model family announcement on the AI21 website.

Get started with Jamba 1.5 models in Amazon Bedrock
To get started with the new Jamba 1.5 models, go to the Amazon Bedrock console, choose Model access on the bottom left pane, and request access to Jamba 1.5 Mini or Jamba 1.5 Large.

To test the Jamba 1.5 models in the Amazon Bedrock console, choose the Text or Chat playground in the left menu pane. Then, choose Select model and select AI21 as the category and Jamba 1.5 Mini or Jamba 1.5 Large as the model.

By choosing View API request, you can get a code example of how to invoke the model using the AWS Command Line Interface (AWS CLI) with the current example prompt.

You can follow the code examples in the Amazon Bedrock documentation to access available models using AWS SDKs and to build your applications using various programming languages.

The following Python code example shows how to send a text message to Jamba 1.5 models using the Amazon Bedrock Converse API for text generation.

import boto3
from botocore.exceptions import ClientError

# Create a Bedrock Runtime client.
bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Set the model ID.
# model_id = "ai21.jamba-1-5-mini-v1:0"
model_id = "ai21.jamba-1-5-large-v1:0"

# Start a conversation with the user message.
user_message = "What are 3 fun facts about mambas?"
conversation = [
    {
        "role": "user",
        "content": [{"text": user_message}],
    }
]

try:
    # Send the message to the model, using a basic inference configuration.
    response = bedrock_runtime.converse(
        modelId=model_id,
        messages=conversation,
        inferenceConfig={"maxTokens": 256, "temperature": 0.7, "topP": 0.8},
    )

    # Extract and print the response text.
    response_text = response["output"]["message"]["content"][0]["text"]
    print(response_text)

except (ClientError, Exception) as e:
    print(f"ERROR: Can't invoke '{model_id}'. Reason: {e}")
    exit(1)
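To continue the conversation, the assistant message from the response can be appended to the list together with the next user message, and the whole list sent again. The helper below is a sketch of that bookkeeping only; `add_turn` is a hypothetical name, and the assistant message shown is a hand-written placeholder, not real model output.

```python
def add_turn(conversation, assistant_message, next_user_text):
    """Return a new conversation extended with the reply and a follow-up question."""
    return conversation + [
        assistant_message,
        {"role": "user", "content": [{"text": next_user_text}]},
    ]


if __name__ == "__main__":
    conversation = [
        {"role": "user", "content": [{"text": "What are 3 fun facts about mambas?"}]}
    ]
    # In the example above this would be response["output"]["message"].
    assistant_message = {"role": "assistant", "content": [{"text": "(model reply)"}]}
    conversation = add_turn(conversation, assistant_message, "Which one is the fastest?")
    print([message["role"] for message in conversation])  # ['user', 'assistant', 'user']
```

Passing the extended list as `messages` to `converse` gives the model the earlier turns as context.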

The Jamba 1.5 models are perfect for use cases like paired document analysis, compliance analysis, and question answering for long documents. They can easily compare information across multiple sources, check if passages meet specific guidelines, and handle very long or complex documents. You can find example code in the AI21-on-AWS GitHub repo. To learn more about how to prompt Jamba models effectively, check out AI21’s documentation.

Now available
AI21 Labs’ Jamba 1.5 family of models is generally available today in Amazon Bedrock in the US East (N. Virginia) AWS Region. Check the full Region list for future updates. To learn more, check out the AI21 Labs in Amazon Bedrock product page and pricing page.

Give Jamba 1.5 models a try in the Amazon Bedrock console today and send feedback to AWS re:Post for Amazon Bedrock or through your usual AWS Support contacts.

Visit our community.aws site to find deep-dive technical content and to discover how our Builder communities are using Amazon Bedrock in their solutions.

— Antje

AWS Weekly Roundup: Amazon EC2 X8g Instances, Amazon Q generative SQL for Amazon Redshift, AWS SDK for Swift, and more (Sep 23, 2024)

2024-09-23 Abhishek Gupta

Post Syndicated from Abhishek Gupta original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-ec2-x8g-instances-amazon-q-generative-sql-for-amazon-redshift-aws-sdk-for-swift-and-more-sep-23-2024/

AWS Community Days have been in full swing around the world. I am going to put the spotlight on AWS Community Day Argentina, where Jeff Barr delivered the keynote, gave talks, and shared his nuggets of wisdom with the community, including a fun story of how he once followed Bill Gates to a McDonald’s!

I encourage you to read about his experience.

Last week’s launches
Here are the launches that got my attention, starting off with the GA releases.

Amazon EC2 X8g Instances are now generally available – X8g instances are powered by AWS Graviton4 processors and deliver up to 60% better performance than AWS Graviton2-based Amazon EC2 X2gd instances. These instances offer larger sizes with up to 3x more vCPU (up to 48xlarge) and memory (up to 3TiB) than Graviton2-based X2gd instances.

Amazon Q generative SQL for Amazon Redshift is now generally available – Amazon Q generative SQL in Amazon Redshift Query Editor is an out-of-the-box web-based SQL editor for Amazon Redshift. It uses generative AI to analyze user intent, query patterns, and schema metadata to identify common SQL query patterns directly within Amazon Redshift, accelerating the query authoring process for users and reducing the time required to derive actionable data insights.

AWS SDK for Swift is now generally available – AWS SDK for Swift provides a modern, user-friendly, and native Swift interface for accessing Amazon Web Services from Apple platforms, AWS Lambda, and Linux-based Swift on Server applications. Now that it’s GA, customers can use AWS SDK for Swift for production workloads. Learn more in the AWS SDK for Swift Developer Guide.

AWS Amplify now supports long-running tasks with asynchronous server-side function calls – Developers can use AWS Amplify to invoke Lambda functions asynchronously for operations like generative AI model inferences, batch processing jobs, or message queuing without blocking the GraphQL API response. This improves responsiveness and scalability, especially for scenarios where immediate responses are not required or where long-running tasks need to be offloaded.

Amazon Keyspaces (for Apache Cassandra) now supports add-column for multi-Region tables – With this launch, you can modify the schema of your existing multi-Region tables in Amazon Keyspaces (for Apache Cassandra) to add new columns. You only have to modify the schema in one of its replica Regions and Keyspaces will replicate the new schema to the other Regions where the table exists.

Amazon Corretto 23 is now generally available – Amazon Corretto is a no-cost, multi-platform, production-ready distribution of OpenJDK. Corretto 23 is an OpenJDK 23 Feature Release that includes an updated Vector API, expanded pattern matching and switch expressions, and more. It will be supported through April 2025.

Use OR1 instances for existing Amazon OpenSearch Service domains – With OpenSearch 2.15, you can leverage OR1 instances for your existing Amazon OpenSearch Service domains by simply updating your existing domain configuration, and choosing OR1 instances for data nodes. This will seamlessly move domains running OpenSearch 2.15 to OR1 instances using a blue/green deployment.

Amazon S3 Express One Zone now supports AWS KMS with customer managed keys – By default, S3 Express One Zone encrypts all objects with server-side encryption using S3 managed keys (SSE-S3). With S3 Express One Zone support for customer managed keys, you have more options to encrypt and manage the security of your data. S3 Bucket Keys are always enabled when you use SSE-KMS with S3 Express One Zone, at no additional cost.

Use AWS Chatbot to interact with Amazon Bedrock agents from Microsoft Teams and Slack – Before, customers had to develop custom chat applications in Microsoft Teams or Slack and integrate them with Amazon Bedrock agents. Now they can invoke their Amazon Bedrock agents from chat channels by connecting the agent alias with an AWS Chatbot channel configuration.

AWS CodeBuild support for managed GitLab runners – Customers can configure their AWS CodeBuild projects to receive GitLab CI/CD job events and run them on ephemeral hosts. This feature allows GitLab jobs to integrate natively with AWS, providing security and convenience through features such as IAM, AWS Secrets Manager, AWS CloudTrail, and Amazon VPC.

We launched existing services in additional Regions:

Amazon Aurora PostgreSQL Optimized Reads is now available in the AWS GovCloud (US) Regions.
Amazon DocumentDB is now available in the Europe (Spain) and Africa (Cape Town) Regions.
Amazon MSK now extends support for Graviton3-based M7g instances in the Europe (London) Region.
Amazon EC2 G6 instances are now available in the Europe (Spain) Region, and High Memory instances are now available in the Africa (Cape Town) Region.

Other AWS news
Here are some additional projects, blog posts, and news items that you might find interesting:

Secure Cross-Cluster Communication in EKS – This post demonstrates how you can use Amazon VPC Lattice and Pod Identity to secure cross-EKS-cluster application communication, and includes an example that you can use as a reference to adapt to your own microservices applications.

Improve RAG performance using Cohere Rerank – This post focuses on improving search efficiency and accuracy in RAG systems using Cohere Rerank.

AWS open source news and updates – My colleague Ricardo Sueiras writes about open source projects, tools, and events from the AWS Community; check out Ricardo’s page for the latest updates.

Upcoming AWS events
Check your calendars and sign up for upcoming AWS events:

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world. Upcoming AWS Community Days are in Italy (Sep. 27), Taiwan (Sep. 28), Saudi Arabia (Sep. 28), Netherlands (Oct. 3), and Romania (Oct. 5).

Browse all upcoming AWS-led in-person and virtual events and developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Abhishek

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

AWS named as a Leader in the 2024 Gartner Magic Quadrant for Desktop as a Service (DaaS)

2024-09-19 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-named-as-a-leader-in-the-2024-gartner-magic-quadrant-for-desktop-as-a-service-daas/

The 2024 Gartner Magic Quadrant for DaaS (Desktop as a Service) positions AWS as a Leader for the first time. Last year we were recognized as a Challenger. We believe this is a result of our commitment to meet a wide range of customer needs by delivering a diverse portfolio of virtual desktop services with license portability (including Microsoft 365 Apps for Enterprise), our geographic strategy, and operational capabilities focused on cost optimization and automation. Also, our focus on easy-to-use interfaces for managing each aspect of our virtual desktop services means that our customers rarely need to make use of third-party tools.

You can access the complete 2024 Gartner Magic Quadrant for Desktop as a Service (DaaS) to learn more.

AWS DaaS Offerings
Let’s take a quick look at our lineup of DaaS offerings (part of our End User Computing portfolio):

Amazon WorkSpaces Family – Originally launched in early 2014 and enhanced frequently ever since, Amazon WorkSpaces gives you a desktop computing environment running Microsoft Windows, Ubuntu, Amazon Linux, or Red Hat Enterprise Linux in the cloud. Designed to support remote & hybrid workers, knowledge workers, developer workstations, and learning environments, WorkSpaces is available in sixteen AWS Regions, in your choice of six bundle sizes, including the GPU-equipped Graphics G4dn bundle. WorkSpaces Personal gives each user a persistent desktop — perfect for developers, knowledge workers, and others who need to install apps and save files or data. If your users do not need persistent desktops (often the case for contact centers, training, virtual learning, and back office access) you can use WorkSpaces Pools to simplify management and reduce costs. WorkSpaces Core provides managed virtual desktop infrastructure that is designed to work with third-party VDI solutions such as those from Citrix, Leostream, Omnissa, and Workspot.

Amazon WorkSpaces clients are available for desktops and tablets, with web access (Amazon WorkSpaces Secure Browser) and the Amazon WorkSpaces Thin Client providing even more choices. If you have the appropriate Windows 10 or 11 desktop license from Microsoft, you can bring your own license to the cloud (also known as BYOL), where it will run on hardware that is dedicated to you.

You can read about the Amazon WorkSpaces Family and review the WorkSpaces Features to learn more about what WorkSpaces has to offer.

Amazon AppStream 2.0 – Launched in late 2016, Amazon AppStream gives you instant, streamed access to SaaS applications and desktop applications without writing code or refactoring the application. You can easily scale applications and make them available to users across the globe without the need to manage any infrastructure. A wide range of compute, memory, storage, GPU, and operating system options let you empower remote workers, while also taking advantage of auto-scaling to avoid overprovisioning. Amazon AppStream offers three fleet types: Always on (instant connections), On-Demand (2 minutes to launch), and Elastic (for unpredictable demand). Pricing varies by type, with per second and per hour granularity for Windows and Linux; read Amazon AppStream 2.0 Pricing to learn more.

— Jeff;

Gartner does not endorse any vendor, product or service depicted in its research publications and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

GARTNER is a registered trademark and service mark of Gartner and Magic Quadrant is a registered trademark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and are used herein with permission. All rights reserved.

This graphic was published by Gartner, Inc. as part of a larger research document and should be evaluated in the context of the entire document. The Gartner document is available upon request from AWS.

Now available: Graviton4-powered memory-optimized Amazon EC2 X8g instances

2024-09-18 Jeff Barr

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/now-available-graviton4-powered-memory-optimized-amazon-ec2-x8g-instances/

Graviton4-powered, memory-optimized X8g instances are now available in ten virtual sizes and two bare metal sizes, with up to 3 TiB of DDR5 memory and up to 192 vCPUs. The X8g instances are our most energy efficient to date, with the best price performance and scale-up capability of any comparable EC2 Graviton instance. With a 16 to 1 ratio of memory to vCPU, these instances are designed for Electronic Design Automation, in-memory databases & caches, relational databases, real-time analytics, and memory-constrained microservices. The instances fully encrypt all high-speed physical hardware interfaces and also include additional AWS Nitro System and Graviton4 security features.

Over 50K AWS customers already make use of the existing roster of over 150 Graviton-powered instances. They run a wide variety of applications including Valkey, Redis, Apache Spark, Apache Hadoop, PostgreSQL, MariaDB, MySQL, and SAP HANA Cloud. Because they are available in twelve sizes, the new X8g instances are an even better host for these applications by allowing you to choose between scaling up (using a bigger instance) and scaling out (using more instances), while also providing additional flexibility for existing memory-bound workloads that are currently running on distinct instances.

The Instances
When compared to the previous generation (X2gd) instances, the X8g instances offer 3x more memory, 3x more vCPUs, more than twice as much EBS bandwidth (40 Gbps vs 19 Gbps), and twice as much network bandwidth (50 Gbps vs 25 Gbps).

The Graviton4 processors inside the X8g instances have twice as much L2 cache per core as the Graviton2 processors in the X2gd instances (2 MiB vs 1 MiB) along with 160% higher memory bandwidth, and can deliver up to 60% better compute performance.

The X8g instances are built using the 5th generation of AWS Nitro System and Graviton4 processors, which incorporate additional security features, including Branch Target Identification (BTI), which provides protection against low-level attacks that attempt to disrupt control flow at the instruction level. To learn more about this and Graviton4’s other security features, read How Amazon’s New CPU Fights Cybersecurity Threats and watch the re:Invent 2023 AWS Graviton session.

Here are the specs:

Instance Name	vCPUs	Memory (DDR5)	EBS Bandwidth	Network Bandwidth
x8g.medium	1	16 GiB	Up to 10 Gbps	Up to 12.5 Gbps
x8g.large	2	32 GiB	Up to 10 Gbps	Up to 12.5 Gbps
x8g.xlarge	4	64 GiB	Up to 10 Gbps	Up to 12.5 Gbps
x8g.2xlarge	8	128 GiB	Up to 10 Gbps	Up to 15 Gbps
x8g.4xlarge	16	256 GiB	Up to 10 Gbps	Up to 15 Gbps
x8g.8xlarge	32	512 GiB	10 Gbps	15 Gbps
x8g.12xlarge	48	768 GiB	15 Gbps	22.5 Gbps
x8g.16xlarge	64	1,024 GiB	20 Gbps	30 Gbps
x8g.24xlarge	96	1,536 GiB	30 Gbps	40 Gbps
x8g.48xlarge	192	3,072 GiB	40 Gbps	50 Gbps
x8g.metal-24xl	96	1,536 GiB	30 Gbps	40 Gbps
x8g.metal-48xl	192	3,072 GiB	40 Gbps	50 Gbps

The instances support ENA, ENA Express, and EFA Enhanced Networking. As you can see from the table above they provide a generous amount of EBS bandwidth, and support all EBS volume types including io2 Block Express, EBS General Purpose SSD, and EBS Provisioned IOPS SSD.
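The 16 to 1 memory-to-vCPU ratio can be checked directly against the spec table, and the same data can drive a simple scale-up choice. The sketch below copies the vCPU and memory columns from the table; the helper names are hypothetical, and picking the smallest size that fits a memory requirement is just one possible selection rule.

```python
# (instance name, vCPUs, memory in GiB), copied from the spec table above.
X8G_SIZES = [
    ("x8g.medium", 1, 16),
    ("x8g.large", 2, 32),
    ("x8g.xlarge", 4, 64),
    ("x8g.2xlarge", 8, 128),
    ("x8g.4xlarge", 16, 256),
    ("x8g.8xlarge", 32, 512),
    ("x8g.12xlarge", 48, 768),
    ("x8g.16xlarge", 64, 1024),
    ("x8g.24xlarge", 96, 1536),
    ("x8g.48xlarge", 192, 3072),
    ("x8g.metal-24xl", 96, 1536),
    ("x8g.metal-48xl", 192, 3072),
]


def memory_per_vcpu(vcpus, memory_gib):
    """Return GiB of memory per vCPU."""
    return memory_gib / vcpus


def smallest_size_for(required_memory_gib):
    """Return the first listed size with at least the required memory, or None."""
    for name, _vcpus, memory_gib in X8G_SIZES:
        if memory_gib >= required_memory_gib:
            return name
    return None


# Every size in the table has 16 GiB of memory per vCPU.
assert all(memory_per_vcpu(v, m) == 16 for _name, v, m in X8G_SIZES)

print(smallest_size_for(600))  # x8g.12xlarge (768 GiB is the first size >= 600 GiB)
```

A workload that needs more than 3,072 GiB gets `None` here, which is the point at which scaling out across several instances becomes the only option.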

X8g Instances in Action
Let’s take a look at some applications and use cases that can make use of 16 GiB of memory per vCPU and/or up to 3 TiB per instance:

Databases – X8g instances allow SAP HANA and SAP Data Analytics Cloud to handle larger and more ambitious workloads than before. Running on Graviton4 powered instances, SAP has measured up to 25% better performance for analytical workloads and up to 40% better performance for transactional workloads in comparison to the same workloads running on Graviton3 instances. X8g instances allow SAP to expand their Graviton-based usage to even larger memory bound solutions.

Electronic Design Automation – EDA workloads are central to the process of designing, testing, verifying, and taping out new generations of chips, including Graviton, Trainium, Inferentia, and those that form the building blocks for the Nitro System. AWS and many other chip makers have adopted the AWS Cloud for these workloads, taking advantage of scale and elasticity to supply each phase of the design process with the appropriate amount of compute power. This allows engineers to innovate faster because they are not waiting for results. Here’s a long-term snapshot from one of the clusters that was used to support development of Graviton4 in late 2022 and early 2023. As you can see this cluster runs at massive scale, with peaks as high as 5x normal usage:

You can see bursts of daily and weekly activity, and then a jump in overall usage during the tape-out phase. The instances in the cluster are on the large end of the size spectrum so the peaks represent several hundred thousand cores running concurrently. This ability to spin up compute when we need it and down when we don’t gives us access to unprecedented scale without a dedicated investment in hardware.

The new X8g instances will allow us and our EDA customers to run even more workloads on Graviton processors, reducing costs and decreasing energy consumption, while also helping to get new products to market faster than ever.

Available Now
X8g instances are available today in the US East (N. Virginia), US West (Oregon), and Europe (Frankfurt) AWS Regions in On Demand, Spot, Reserved Instance, Savings Plan, Dedicated Instance, and Dedicated Host form. To learn more, visit the X8g page.

Data engineering professional certificate: New hands-on specialization by DeepLearning.AI and AWS

2024-09-18 Betty Zheng (郑予彬)

Post Syndicated from Betty Zheng (郑予彬) original https://aws.amazon.com/blogs/aws/data-engineering-professional-certificate-new-hands-on-specialization-by-deeplearning-ai-and-aws/

Data engineers play a crucial role in the modern data-driven landscape, managing essential tasks from data ingestion and processing to transformation and serving. Their expertise is particularly valuable in the era of generative AI, where harnessing the value of vast datasets is paramount.

To empower aspiring and experienced data professionals, DeepLearning.AI and Amazon Web Services (AWS) have partnered to launch the Data Engineering Specialization, an advanced professional certificate on Coursera. This comprehensive program covers a wide range of data engineering concepts, tools, and techniques relevant to modern organizations. It’s designed for learners with some experience working with data who are interested in learning the fundamentals of data engineering. The specialization comprises four hands-on courses, each culminating in a Coursera course certificate upon completion.

Specialization overview

This Data Engineering Specialization is a joint initiative by AWS and DeepLearning.AI, a leading provider of world-class AI education founded by renowned machine learning (ML) pioneer Andrew Ng.

Joe Reis, a prominent figure in data engineering and coauthor of the bestselling book Fundamentals of Data Engineering, leads the program as a primary instructor. By providing a foundational framework, the curriculum ensures learners gain a holistic understanding of the data engineering lifecycle, while covering key aspects such as data architecture, orchestration, DataOps, and data management.

Further enhancing the learning experience, the program features hands-on labs and technical assessments hosted on the AWS Cloud. These practical, cloud-based exercises were designed in partnership with AWS technical experts, including Gal Heyne, Navnit Shukla, and Morgan Willis. Learners will apply theoretical concepts using AWS services and tools, such as Amazon Kinesis, AWS Glue, Amazon Simple Storage Service (Amazon S3), and Amazon Redshift, equipping them with hands-on skills and experience.

Specialization highlights

Participants will be introduced to several key learning opportunities.

Acquisition of core skills and strategies

The specialization equips data engineers with the ability to design data engineering solutions for various use cases, select the right technologies for their data architecture, and circumvent potential pitfalls. The skills gained universally apply across various platforms and technologies, offering learners a program that is versatile.

Unparalleled approach to data engineering education

Unlike conventional courses focused on specific technologies, this specialization provides a comprehensive understanding of data engineering fundamentals. It emphasizes the importance of aligning data engineering strategies with broader business goals, fostering a more integrated and effective approach to building and maintaining data solutions.

Holistic understanding of data engineering

By using the insights from the Fundamentals of Data Engineering book, the curriculum offers a well-rounded education that prepares professionals for success in data-driven industries.

Practical skills through AWS cloud labs

The hands-on labs hosted by AWS Partner Vocareum let learners apply the techniques directly in an AWS environment provided with the course. This practical experience is crucial for mastering the intricacies of data engineering and developing the skills needed to excel in the industry.

Why choose this specialization?

Structured learning path–The specialization is thoughtfully structured to provide a step-by-step learning journey, from foundational concepts to advanced applications.
Expert insights–Gain insights from the authors of Fundamentals of Data Engineering and other industry experts. Learn how to apply practical knowledge to build modern data architecture on the cloud, using cloud services for data engineering.
Hands-on experience–Engage in hands-on labs in the AWS Cloud, where you not only learn but also apply the knowledge in real-world scenarios.
Comprehensive curriculum–This program encompasses all aspects of the data engineering lifecycle, including data generation in source systems, ingestion, transformation, storage, and serving. It also addresses key undercurrents of data engineering, such as security, data management, and orchestration.

At the end of this specialization, learners will be well-equipped with the necessary skills and expertise to embark on a career in data engineering, an in-demand role at the core of any organization that is looking to use data to create value. Data-centric ML and analytics would not be possible without the foundation of data engineering.

Course modules

The Data Engineering Specialization comprises four courses:

Course 1–Introduction to Data Engineering–This foundational module explores the collaborative nature of data engineering, identifying key stakeholders and understanding their requirements. The course delves into a mental framework for building data engineering solutions, emphasizing holistic ecosystem understanding, critical factors like data quality and scalability, and effective requirements gathering. The course then examines the data engineering lifecycle, illustrating interconnections between stages. By showcasing the AWS data engineering stack, the course teaches how to use the right technologies. By the end of this course, learners will have the skills and mindset to tackle data engineering challenges and make informed decisions.
Course 2–Source Systems, Data Ingestion, and Pipelines–In this course, data engineers dive deep into the practical aspects of working with diverse data sources, ingestion patterns, and pipeline construction. Learners explore the characteristics of different data formats and the appropriate source systems for generating each type of data, equipping them with the knowledge to design effective data pipelines. The course covers the fundamentals of relational and NoSQL databases, including ACID compliance and CRUD operations, so that engineers learn to interact with a wide range of data source systems. The course covers the significance of cloud networking, resolving database connection issues, and using message queues and streaming platforms—crucial skills for creating strong and scalable data architectures. By mastering the concepts in this course, data engineers will be able to automate data ingestion processes, optimize connectivity, and establish the foundation for successful data engineering projects.
Course 3–Data Storage and Queries–This course equips data engineers with principles and best practices for designing robust, efficient data storage and querying solutions. Learners explore the data lake house concept, implementing a medallion-like architecture and using open table formats to build transactional data lakes. The course enhances SQL proficiency by teaching advanced queries, such as aggregations and joins on streaming data, while also exploring data warehouse and data lake capabilities. Learners compare storage performance and discover optimization strategies, like indexing. Data engineers can achieve high performance and scalability in data services by comprehending query execution and processing.
Course 4–Data Modeling, Transformation, and Serving–In this capstone course, data engineers explore advanced data modeling techniques, including data vault and star schemas. Learners differentiate between modeling approaches like Inmon and Kimball, gaining the ability to transform data and structure it for optimal analytical and ML use cases. The course equips data engineers with preprocessing skills for textual, image, and tabular data. Learners understand the distinctions between supervised and unsupervised learning, as well as classification and regression tasks, empowering them to design data solutions supporting a range of predictive applications. By mastering these data modeling, transformation, and serving concepts, data engineers can build robust, scalable, and business-aligned data architectures to deliver maximum value.

Enrollment

Whether you’re new to data engineering or looking to enhance your skills, this specialization provides a balanced mix of theory and hands-on experience through 4 courses, each culminating in a Coursera course certificate.

Embark on your data engineering journey from here:

By enrolling in these courses, you’ll also earn the DeepLearning.AI Data Engineering Professional Certificate upon completing all four courses.

Enroll now and take the first step towards mastering data engineering with this comprehensive and practical program, built on the foundation of Fundamentals of Data Engineering and powered by AWS.

Amazon S3 Express One Zone now supports AWS KMS with customer managed keys

2024-09-18 Elizabeth Fuentes

Post Syndicated from Elizabeth Fuentes original https://aws.amazon.com/blogs/aws/amazon-s3-express-one-zone-now-supports-aws-kms-with-customer-managed-keys/

Amazon S3 Express One Zone, a high-performance, single-Availability Zone (AZ) S3 storage class, now supports server-side encryption with AWS Key Management Service (KMS) keys (SSE-KMS). S3 Express One Zone already encrypts all objects stored in S3 directory buckets with Amazon S3 managed keys (SSE-S3) by default. Starting today, you can use AWS KMS customer managed keys to encrypt data at rest, with no impact on performance. This new encryption capability gives you an additional option to meet compliance and regulatory requirements when using S3 Express One Zone, which is designed to deliver consistent single-digit millisecond data access for your most frequently accessed data and latency-sensitive applications.

S3 directory buckets allow you to specify only one customer managed key per bucket for SSE-KMS encryption. Once the customer managed key is added, you cannot edit it to use a new key. On the other hand, with S3 general purpose buckets, you can use multiple KMS keys either by changing the default encryption configuration of the bucket or during S3 PUT requests. When using SSE-KMS with S3 Express One Zone, S3 Bucket Keys are always enabled. S3 Bucket Keys are free and reduce the number of requests to AWS KMS by up to 99%, optimizing both performance and costs.

Using SSE-KMS with Amazon S3 Express One Zone
To show you this new capability in action, I first create an S3 directory bucket in the Amazon S3 console following the steps to create an S3 directory bucket and use apne1-az4 as the Availability Zone. In Base name, I enter s3express-kms; a suffix that includes the Availability Zone ID is automatically added to create the final name. Then, I select the checkbox to acknowledge that Data is stored in a single Availability Zone.

In the Default encryption section, I choose Server-side encryption with AWS Key Management Service keys (SSE-KMS). Under AWS KMS Key I can Choose from your AWS KMS keys, Enter AWS KMS key ARN, or Create a KMS key. For this example, I select a previously created AWS KMS key from the list and then choose Create bucket.

Now, any new object I upload to this S3 directory bucket will be automatically encrypted using my AWS KMS key.
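The same default encryption setting could also be applied with an AWS SDK instead of the console. The following is a minimal sketch using boto3's `put_bucket_encryption`; the helper names and the default Region are assumptions for this example, and the bucket name and key ARN are whatever you created earlier. Because a directory bucket keeps a single customer managed key for its lifetime, treat this as a one-time setup step.

```python
def build_default_encryption(kms_key_arn):
    """Build the ServerSideEncryptionConfiguration for SSE-KMS."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_arn,
                },
                # S3 Bucket Keys are always enabled for SSE-KMS on directory buckets.
                "BucketKeyEnabled": True,
            }
        ]
    }


def apply_default_encryption(bucket_name, kms_key_arn, region_name="ap-northeast-1"):
    """Set SSE-KMS as the bucket's default encryption (region_name is an assumption)."""
    import boto3  # imported lazily so the config builder works without the SDK

    s3 = boto3.client("s3", region_name=region_name)
    s3.put_bucket_encryption(
        Bucket=bucket_name,
        ServerSideEncryptionConfiguration=build_default_encryption(kms_key_arn),
    )
```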

SSE-KMS with Amazon S3 Express One Zone in action
To use SSE-KMS with S3 Express One Zone via the AWS Command Line Interface (AWS CLI), you need an AWS Identity and Access Management (IAM) user or role with the following policy. This policy allows the CreateSession API operation, which is necessary to successfully upload and download encrypted files to and from your S3 directory bucket.

{
	"Version": "2012-10-17",
	"Statement": [
		{
			"Effect": "Allow",
			"Action": [
				"s3express:CreateSession"
			],
			"Resource": [
				"arn:aws:s3express:*:<account>:bucket/s3express-kms--apne1-az4--x-s3"
			]
		},
		{
			"Effect": "Allow",
			"Action": [
				"kms:Decrypt",
				"kms:GenerateDataKey"
			],
			"Resource": [
				"arn:aws:kms:*:<account>:key/<keyId>"
			]
		}
	]
}
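If you manage several buckets or keys, it can be convenient to generate this policy rather than edit the placeholders by hand. The sketch below fills in the same two statements from parameters. The function name is hypothetical, and the account ID and key ID in the usage line are placeholders rather than real identifiers.

```python
import json


def build_sse_kms_policy(account_id, bucket_name, key_id):
    """Return the IAM policy shown above with the placeholders filled in."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3express:CreateSession"],
                "Resource": [
                    f"arn:aws:s3express:*:{account_id}:bucket/{bucket_name}"
                ],
            },
            {
                "Effect": "Allow",
                "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
                "Resource": [f"arn:aws:kms:*:{account_id}:key/{key_id}"],
            },
        ],
    }


# Placeholder values for illustration only.
policy = build_sse_kms_policy(
    "111122223333", "s3express-kms--apne1-az4--x-s3", "EXAMPLE-KEY-ID"
)
print(json.dumps(policy, indent=2))
```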

With the PutObject command, I upload a new file named confidential-doc.txt to my S3 directory bucket.

aws s3api put-object --bucket s3express-kms--apne1-az4--x-s3 \
--key confidential-doc.txt \
--body confidential-doc.txt

When the command succeeds, I receive the following output:

{
    "ETag": "\"664469eeb92c4218bbdcf92ca559d03b\"",
    "ChecksumCRC32": "0duteA==",
    "ServerSideEncryption": "aws:kms",
    "SSEKMSKeyId": "arn:aws:kms:ap-northeast-1:<accountId>:key/<keyId>",
    "BucketKeyEnabled": true
}

Checking the object’s properties with HeadObject command, I see that it’s encrypted using SSE-KMS with the key that I created before:

aws s3api head-object --bucket s3express-kms--apne1-az4--x-s3 \
--key confidential-doc.txt

I get the following output:

 
{
    "AcceptRanges": "bytes",
    "LastModified": "2024-08-21T15:29:22+00:00",
    "ContentLength": 5,
    "ETag": "\"664469eeb92c4218bbdcf92ca559d03b\"",
    "ContentType": "binary/octet-stream",
    "ServerSideEncryption": "aws:kms",
    "Metadata": {},
    "SSEKMSKeyId": "arn:aws:kms:ap-northeast-1:<accountId>:key/<keyId>",
    "BucketKeyEnabled": true,
    "StorageClass": "EXPRESS_ONEZONE"
}
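The same check can be scripted. The sketch below inspects a HeadObject-style response and confirms that the object is encrypted with SSE-KMS and, optionally, with a specific key. The helper names are hypothetical, and the sample dictionary simply reuses fields from the output above, placeholders included.

```python
def is_sse_kms_encrypted(metadata, expected_key_arn=None):
    """Return True if the response reports SSE-KMS (and the expected key, if given)."""
    if metadata.get("ServerSideEncryption") != "aws:kms":
        return False
    if expected_key_arn is not None and metadata.get("SSEKMSKeyId") != expected_key_arn:
        return False
    return True


def check_object(bucket_name, key, expected_key_arn=None):
    """Call HeadObject with boto3 and apply the check above."""
    import boto3  # imported lazily so the check itself works without the SDK

    s3 = boto3.client("s3")
    return is_sse_kms_encrypted(
        s3.head_object(Bucket=bucket_name, Key=key), expected_key_arn
    )


# Fields copied from the HeadObject output above (placeholders left as-is).
sample = {
    "ServerSideEncryption": "aws:kms",
    "SSEKMSKeyId": "arn:aws:kms:ap-northeast-1:<accountId>:key/<keyId>",
    "BucketKeyEnabled": True,
    "StorageClass": "EXPRESS_ONEZONE",
}
print(is_sse_kms_encrypted(sample))  # True
```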

I download the encrypted object with GetObject:

aws s3api get-object --bucket s3express-kms--apne1-az4--x-s3 \
--key confidential-doc.txt output-confidential-doc.txt

As my session has the necessary permissions, the object is downloaded and decrypted automatically.

{
    "AcceptRanges": "bytes",
    "LastModified": "2024-08-21T15:29:22+00:00",
    "ContentLength": 5,
    "ETag": "\"664469eeb92c4218bbdcf92ca559d03b\"",
    "ContentType": "binary/octet-stream",
    "ServerSideEncryption": "aws:kms",
    "Metadata": {},
    "SSEKMSKeyId": "arn:aws:kms:ap-northeast-1:<accountId>:key/<keyId>",
    "BucketKeyEnabled": true,
    "StorageClass": "EXPRESS_ONEZONE"
}

For this second test, I use a different IAM user whose policy does not grant the necessary KMS key permissions to download the object. This attempt fails with an AccessDenied error, demonstrating that the SSE-KMS encryption is functioning as intended.

An error occurred (AccessDenied) when calling the CreateSession operation: Access Denied
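In application code, the same failure surfaces as a botocore `ClientError`. The following sketch shows one way to turn it into an actionable hint. The helper names and the wording of the hint are assumptions for this example; the permissions it lists are the ones from the IAM policy shown earlier.

```python
def describe_s3_error(error_response):
    """Map a botocore-style error response dict to a short hint."""
    error = error_response.get("Error", {})
    code = error.get("Code", "Unknown")
    if code == "AccessDenied":
        return (
            "AccessDenied: check that the caller is allowed "
            "s3express:CreateSession on the bucket, and kms:Decrypt and "
            "kms:GenerateDataKey on the KMS key."
        )
    return f"{code}: {error.get('Message', 'no message')}"


def download_object(bucket_name, key):
    """Return the object's bytes, or None if the request is rejected."""
    import boto3  # imported lazily so describe_s3_error works without the SDK
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")
    try:
        return s3.get_object(Bucket=bucket_name, Key=key)["Body"].read()
    except ClientError as err:
        print(describe_s3_error(err.response))
        return None
```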

This demonstration shows how SSE-KMS works seamlessly with S3 Express One Zone, providing an additional layer of security while maintaining ease of use for authorized users.

Things to know
Getting started – You can enable SSE-KMS for S3 Express One Zone using the Amazon S3 console, AWS CLI, or AWS SDKs. Set the default encryption configuration of your S3 directory bucket to SSE-KMS and specify your AWS KMS key. Remember, you can only use one customer managed key per S3 directory bucket for its lifetime.

Regions – S3 Express One Zone support for SSE-KMS using customer managed keys is available in all AWS Regions where S3 Express One Zone is currently available.

Performance – Using SSE-KMS with S3 Express One Zone does not impact request latency. You’ll continue to experience the same single-digit millisecond data access.

Pricing – You pay AWS KMS charges to generate and retrieve data keys used for encryption and decryption. Visit the AWS KMS pricing page for more details. In addition, when using SSE-KMS with S3 Express One Zone, S3 Bucket Keys are enabled by default for all data plane operations except for CopyObject and UploadPartCopy, and can’t be disabled. This reduces the number of requests to AWS KMS by up to 99%, optimizing both performance and costs.

AWS CloudTrail integration – You can audit SSE-KMS actions on S3 Express One Zone objects using AWS CloudTrail. Learn more about that in my previous blog post.
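The Getting started step above can also be scripted. Here is a minimal sketch, assuming the AWS SDK for Python (Boto3) and its put_bucket_encryption call. The helper names build_sse_kms_config and enable_sse_kms are hypothetical, and the S3 client is passed in as a parameter so the logic can be tried without AWS credentials.

```python
def build_sse_kms_config(kms_key_arn):
    """Build a default-encryption configuration that selects SSE-KMS."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_arn,
                },
                # The post notes that S3 Bucket Keys are enabled by default
                # for SSE-KMS on S3 Express One Zone and can't be disabled.
                "BucketKeyEnabled": True,
            }
        ]
    }


def enable_sse_kms(s3_client, bucket, kms_key_arn):
    """Set SSE-KMS as the default encryption of a bucket.

    s3_client is expected to be boto3.client("s3") (an assumption of this
    sketch); bucket and kms_key_arn are placeholders supplied by the caller.
    """
    return s3_client.put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=build_sse_kms_config(kms_key_arn),
    )
```

With real credentials, the call would look like enable_sse_kms(boto3.client("s3"), "s3express-kms--apne1-az4--x-s3", "<your key ARN>").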

– Eli.

Intel Creating Foundry Subsidiary and Announcing a Big AWS Win

2024-09-16 Patrick Kennedy

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/intel-creating-foundry-subsidiary-and-announcing-a-big-aws-win/

In a huge letter to employees, Intel outlined a number of moves including an Intel Foundry subsidiary, a big AWS win, and more

The post Intel Creating Foundry Subsidiary and Announcing a Big AWS Win appeared first on ServeTheHome.

AWS Weekly Roundup: Oracle Database@AWS, Amazon RDS, AWS PrivateLink, Amazon MSK, Amazon EventBridge, Amazon SageMaker and more

2024-09-16 Matheus Guimaraes

Post Syndicated from Matheus Guimaraes original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-oracle-databaseaws-amazon-rds-aws-privatelink-amazon-msk-amazon-eventbridge-amazon-sagemaker-and-more/

Hello, everyone!

It’s been an interesting week full of AWS news as usual, but also full of vibrant faces filling up the rooms in a variety of events happening this month.

Let’s start by covering some of the releases that have caught my attention this week.

My Top 3 AWS news of the week

Amazon RDS for MySQL zero-ETL integrations is now generally available and it comes with exciting new features. You are now able to configure zero-ETL integrations in your AWS CloudFormation templates, and you also now have the ability to set up multiple integrations from a source Amazon RDS for MySQL database with up to five Amazon Redshift warehouses. Lastly, you can now also apply data filters which determine which database and tables get automatically replicated. Read this blog post where I review aspects of this release and show you how to get started with data filtering if you want to know more. Incidentally, this release pairs well with another release this week: Amazon Redshift now allows you to alter the sort keys of tables replicated via zero-ETL integrations.

Oracle Database@AWS has been announced as part of a strategic partnership between Amazon Web Services (AWS) and Oracle. This offering allows customers to access Oracle Autonomous Database and Oracle Exadata Database Service directly within AWS, simplifying cloud migration for enterprise workloads. Key features include zero-ETL integration between Oracle and AWS services for real-time data analysis, enhanced security, and optimized performance for hybrid cloud environments. This collaboration addresses the growing demand for multi-cloud flexibility and efficiency. It will be available in preview later in the year with broader availability in 2025 as it expands to new Regions.

Amazon OpenSearch Service now supports version 2.15, featuring improvements in search performance, query optimization, and AI-powered application capabilities. Key updates include radial search for vector space queries, optimizations for neural sparse and hybrid search, and the ability to enable vector and hybrid search on existing indexes. It also introduces new features like a toxicity detection guardrail and an ML inference processor for enriching ingest pipelines. Read this guide to see how you can upgrade your Amazon OpenSearch Service domain.

So simple yet so good
These releases are simple in nature, but have a big impact.

AWS Resource Access Manager (RAM) now supports AWS PrivateLink – With this release, you can now securely share resources across AWS accounts with private connectivity, without exposing traffic to the public internet. This integration allows for more secure and streamlined access to shared services via VPC endpoints, improving network security and simplifying resource sharing across organizations.

AWS Network Firewall now supports AWS PrivateLink – Another quick security win: you can now securely access and manage Network Firewall resources without exposing traffic to the public internet.

AWS IAM Identity Center now enables users to customize their experience – You can set the language and visual mode preferences, including dark mode for improved readability and reduced eye strain. This update supports 12 different languages and enables users to adjust their settings for a more personalized experience when accessing AWS resources through the portal.

Others
Amazon EventBridge Pipes now supports customer managed KMS keys – You can now use your own AWS Key Management Service (KMS) keys for server-side encryption of data transferred between sources and targets, which offers more control and security over sensitive event data. The feature enhances security for point-to-point integrations without the need for custom integration code. See instructions on how to configure this in the updated documentation.

AWS Glue Data Catalog now supports enhanced storage optimization for Apache Iceberg tables – This includes automatic removal of unnecessary data files, orphan file management, and snapshot retention. These optimizations help reduce storage costs and improve query performance by continuously monitoring and compacting tables, making it easier to manage large-scale datasets stored in Amazon S3. See this Big Data blog post for a deep dive into this new feature.

Amazon MSK Replicator now supports the replication of Kafka topics across clusters while preserving identical topic names – This simplifies cross-cluster replication, allowing users to replicate data across Regions without needing to reconfigure client applications. This reduces setup complexity and supports more seamless failovers in multi-cluster streaming architectures. See this Amazon MSK Replicator developer guide to learn more about it.

Amazon SageMaker introduces sticky session routing for inference – This allows requests from the same client to be directed to the same model instance for the duration of a session, improving consistency and reducing latency, particularly in real-time inference scenarios like chatbots or recommendation systems, where session-based interactions are crucial. Read about how to configure it in this documentation guide.

Events
The AWS GenAI Lofts continue to pop up around the world! This week, developers had the opportunity to attend two very exciting events at the AWS GenAI Loft in San Francisco, including the “Generative AI on AWS” meetup last Tuesday, featuring discussions about extended reality, future AI tools, and more. Then things got playful on Thursday with the demonstration of an Amazon Bedrock-powered Minecraft bot and AI video game battles! If you’re around San Francisco before October 19th, make sure to check out the schedule to see the list of events that you can join.

Make sure to check out the AWS GenAI Loft in Sao Paulo, Brazil, which opened recently, and the AWS GenAI Loft in London, which opens September 30th. You can already start registering for events before they fill up including one called “The future of development” that offers a whole day of targeted learning for developers to help them accelerate their skills.

Our AWS communities have also been very busy throwing incredible events! I was privileged to be a speaker at AWS Community Day Belfast where I got to finally meet all of the organizers of this amazing thriving community in Northern Ireland. If you haven’t been to a community day, I really recommend you check them out! You are sure to leave energized by the dedication and passion of community leaders like Matt Coulter, Kristi Perreault, Matthew Wilson, Chloe McAteer, and their community members – not to mention the smiles all around. 🙂

Certifications
If you’ve been postponing taking an AWS certification exam, now is the perfect time! Register free for the AWS Certified: Associate Challenge before December 12, 2024 and get a 50% discount voucher to take any of the following exams: AWS Certified Solutions Architect – Associate, AWS Certified Developer – Associate, AWS Certified SysOps Administrator – Associate, or AWS Certified Data Engineer – Associate. My colleague Jenna Seybold has posted a collection of study material for each exam; check it out if you’re interested.

Also, don’t forget that the brand new AWS Certified AI Practitioner exam is now available. It is in beta stage, but you can already take it. If you pass it before February 15, 2025, you get an Early Adopter badge to add to your collection.

Conclusion
I hope you enjoyed the news this week!

Keep learning!

Amazon RDS for MySQL zero-ETL integration with Amazon Redshift, now generally available, enables near real-time analytics

2024-09-13 Matheus Guimaraes

Post Syndicated from Matheus Guimaraes original https://aws.amazon.com/blogs/aws/amazon-rds-for-mysql-zero-etl-integration-with-amazon-redshift-now-generally-available-enables-near-real-time-analytics/

Zero-ETL integrations help unify your data across applications and data sources for holistic insights and help break down data silos. They provide a fully managed, no-code, near real-time solution for making petabytes of transactional data available in Amazon Redshift within seconds of data being written into Amazon Relational Database Service (Amazon RDS) for MySQL. This eliminates the need to create your own ETL jobs, simplifying data ingestion, reducing your operational overhead, and potentially lowering your overall data processing costs. Last year, we announced the general availability of zero-ETL integration with Amazon Redshift for Amazon Aurora MySQL-Compatible Edition as well as the availability in preview of Aurora PostgreSQL-Compatible Edition, Amazon DynamoDB, and RDS for MySQL.

I am happy to announce that Amazon RDS for MySQL zero-ETL with Amazon Redshift is now generally available. This release also includes new features such as data filtering, support for multiple integrations, and the ability to configure zero-ETL integrations in your AWS CloudFormation template.

In this post, I’ll show how you can get started with data filtering and consolidating your data across multiple databases and data warehouses. For a step-by-step walkthrough on how to set up zero-ETL integrations, see this blog post, which describes the setup for Aurora MySQL-Compatible and offers a very similar experience.

Data filtering
Most companies, no matter the size, can benefit from adding filtering to their ETL jobs. A typical use case is to reduce data processing and storage costs by selecting only the subset of data needed to replicate from their production databases. Another is to exclude personally identifiable information (PII) from a report’s dataset. For example, a business in healthcare might want to exclude sensitive patient information when replicating data to build aggregate reports analyzing recent patient cases. Similarly, an e-commerce store may want to make customer spending patterns available to their marketing department, but exclude any identifying information. Conversely, there are certain cases when you might not want to use filtering, such as when making data available to fraud detection teams that need all the data in near real time to make inferences. These are just a few examples, so I encourage you to experiment and discover different use cases that might apply to your organization.

There are two ways to enable filtering in your zero-ETL integrations: when you first create the integration or by modifying an existing integration. Either way, you will find this option on the “Source” step of the zero-ETL creation wizard.

You apply filters by entering filter expressions that can be used to either include or exclude databases or tables from the dataset in the format of database*.table*. You can add multiple expressions and they will be evaluated in order from left to right.
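To make the left-to-right evaluation concrete, here is a small illustrative sketch in plain Python. It is not the service's implementation: it assumes that a later matching expression overrides an earlier one, that anything no expression matches falls back to a default, and that patterns behave like shell-style wildcards. The function name is_replicated and the sample database and table names are hypothetical.

```python
from fnmatch import fnmatchcase


def is_replicated(database, table, filters, default=True):
    """Return True if database.table would be replicated under `filters`.

    `filters` is an ordered list of (action, pattern) pairs, where action is
    "include" or "exclude" and pattern looks like "database*.table*".
    Expressions are applied left to right; ASSUMPTION: a later matching
    expression overrides an earlier one, and `default` applies when
    nothing matches.
    """
    name = f"{database}.{table}"
    replicated = default
    for action, pattern in filters:
        if action not in ("include", "exclude"):
            raise ValueError(f"unknown filter action: {action!r}")
        if fnmatchcase(name, pattern):
            replicated = action == "include"
    return replicated


# Hypothetical filter list: replicate the sales database except PII tables.
filters = [
    ("exclude", "*.*"),          # start from nothing
    ("include", "sales.*"),      # replicate every table in sales
    ("exclude", "sales.*_pii"),  # but keep tables ending in _pii out
]
```

Under these assumptions, sales.orders is replicated while sales.customers_pii and hr.employees are not, which mirrors the PII use case described earlier.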

If you’re modifying an existing integration, the new filtering rules apply from the moment you confirm your changes, and Amazon Redshift drops tables that are no longer part of the filter.

If you want to dive deeper, I recommend you read this blog post, which goes in depth into how you can set up data filters for Amazon Aurora zero-ETL integrations since the steps and concepts are very similar.

Create multiple zero-ETL integrations from a single database
You can now also set up integrations from a single RDS for MySQL database to up to five Amazon Redshift data warehouses. The only requirement is that you must wait for the first integration to finish setting up successfully before adding others.

This allows you to share transactional data with different teams while providing them ownership over their own data warehouses for their specific use cases. For example, you can also use this in conjunction with data filtering to fan out different sets of data to development, staging, and production Amazon Redshift clusters from the same Amazon RDS production database.

Another interesting scenario where this could be really useful is consolidation of Amazon Redshift clusters by using zero-ETL to replicate to different warehouses. You could also use Amazon Redshift materialized views to explore your data, power your Amazon QuickSight dashboards, share data, train jobs in Amazon SageMaker, and more.

Conclusion
RDS for MySQL zero-ETL integrations with Amazon Redshift allow you to replicate data for near real-time analytics without needing to build and manage complex data pipelines. The feature is generally available today with the ability to add filter expressions to include or exclude databases and tables from the replicated data sets. You can now also set up multiple integrations from the same source RDS for MySQL database to different Amazon Redshift warehouses or create integrations from different sources to consolidate data into one data warehouse.

This zero-ETL integration is available for RDS for MySQL versions 8.0.32 and later, Amazon Redshift Serverless, and Amazon Redshift RA3 instance types in supported AWS Regions.

In addition to using the AWS Management Console, you can also set up a zero-ETL integration via the AWS Command Line Interface (AWS CLI) and by using an AWS SDK such as boto3, the official AWS SDK for Python.
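As a rough sketch of the SDK route, the snippet below wraps the Boto3 RDS client's create_integration call. The wrapper name create_zero_etl_integration is hypothetical, and the ARNs and integration name are placeholders supplied by the caller. The client is passed in as a parameter so the function can be exercised without AWS credentials.

```python
def create_zero_etl_integration(rds_client, source_arn, target_arn, name):
    """Create a zero-ETL integration and return the API response.

    rds_client is expected to be boto3.client("rds") (an assumption of this
    sketch). source_arn identifies the RDS for MySQL database and target_arn
    the Amazon Redshift data warehouse.
    """
    return rds_client.create_integration(
        SourceArn=source_arn,
        TargetArn=target_arn,
        IntegrationName=name,
    )
```

To fan out to several warehouses, you could call this once per target ARN, waiting for the first integration to finish setting up before adding the others, as noted above.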

See the documentation to learn more about working with zero-ETL integrations.

— Matheus Guimaraes

JUPITER Exascale Supercomputer Starts Installation

2024-09-11 Cliff Robinson

Post Syndicated from Cliff Robinson original https://www.servethehome.com/jupiter-exascale-supercomputer-starts-installation-nvidia-arm-eviden/

The JUPITER Exascale Supercomputer is starting its modular data center installation for the 24000 NVIDIA GH200 European supercomputer

The post JUPITER Exascale Supercomputer Starts Installation appeared first on ServeTheHome.

Amazon SageMaker HyperPod introduces Amazon EKS support

2024-09-10 Elizabeth Fuentes

Post Syndicated from Elizabeth Fuentes original https://aws.amazon.com/blogs/aws/amazon-sagemaker-hyperpod-introduces-amazon-eks-support/

Today, we are pleased to announce Amazon Elastic Kubernetes Service (EKS) support in Amazon SageMaker HyperPod, purpose-built infrastructure engineered with resilience at its core for foundation model (FM) development. This new capability enables customers to orchestrate HyperPod clusters using EKS, combining the power of Kubernetes with Amazon SageMaker HyperPod’s resilient environment designed for training large models. Amazon SageMaker HyperPod helps efficiently scale across more than a thousand artificial intelligence (AI) accelerators, reducing training time by up to 40%.

Amazon SageMaker HyperPod now enables customers to manage their clusters using a Kubernetes-based interface. This integration allows seamless switching between Slurm and Amazon EKS for optimizing various workloads, including training, fine-tuning, experimentation, and inference. The CloudWatch Observability EKS add-on provides comprehensive monitoring capabilities, offering insights into CPU, network, disk, and other low-level node metrics on a unified dashboard. This enhanced observability extends to resource utilization across the entire cluster, node-level metrics, pod-level performance, and container-specific utilization data, facilitating efficient troubleshooting and optimization.

Launched at re:Invent 2023, Amazon SageMaker HyperPod has become a go-to solution for AI startups and enterprises looking to efficiently train and deploy large scale models. It is compatible with SageMaker’s distributed training libraries, which offer Model Parallel and Data Parallel software optimizations that help reduce training time by up to 20%. SageMaker HyperPod automatically detects and repairs or replaces faulty instances, enabling data scientists to train models uninterrupted for weeks or months. This allows data scientists to focus on model development, rather than managing infrastructure.

The integration of Amazon EKS with Amazon SageMaker HyperPod uses the advantages of Kubernetes, which has become popular for machine learning (ML) workloads due to its scalability and rich open-source tooling. Organizations often standardize on Kubernetes for building applications, including those required for generative AI use cases, as it allows reuse of capabilities across environments while meeting compliance and governance standards. Today’s announcement enables customers to scale and optimize resource utilization across more than a thousand AI accelerators. This flexibility enhances the developer experience, containerized app management, and dynamic scaling for FM training and inference workloads.

Amazon EKS support in Amazon SageMaker HyperPod strengthens resilience through deep health checks, automated node recovery, and job auto-resume capabilities, ensuring uninterrupted training for large scale and/or long-running jobs. Job management can be streamlined with the optional HyperPod CLI, designed for Kubernetes environments, though customers can also use their own CLI tools. Integration with Amazon CloudWatch Container Insights provides advanced observability, offering deeper insights into cluster performance, health, and utilization. Additionally, data scientists can use tools like Kubeflow for automated ML workflows. The integration also includes Amazon SageMaker managed MLflow, providing a robust solution for experiment tracking and model management.

At a high level, an Amazon SageMaker HyperPod cluster is created by the cloud admin using the HyperPod cluster API and is fully managed by the HyperPod service, removing the undifferentiated heavy lifting involved in building and optimizing ML infrastructure. Amazon EKS is used to orchestrate these HyperPod nodes, similar to how Slurm orchestrates HyperPod nodes, providing customers with a familiar Kubernetes-based administrator experience.

Let’s explore how to get started with Amazon EKS support in Amazon SageMaker HyperPod
I start by preparing the scenario, checking the prerequisites, and creating an Amazon EKS cluster with a single AWS CloudFormation stack following the Amazon SageMaker HyperPod EKS workshop, configured with VPC and storage resources.

To create and manage Amazon SageMaker HyperPod clusters, I can use either the AWS Management Console or AWS Command Line Interface (AWS CLI). Using the AWS CLI, I specify my cluster configuration in a JSON file. I choose the Amazon EKS cluster created previously as the orchestrator of the SageMaker HyperPod cluster. Then, I create the cluster worker nodes in an instance group that I call “worker-group-1”, with a private subnet. I set NodeRecovery to Automatic to enable automatic node recovery and, for OnStartDeepHealthChecks, I add InstanceStress and InstanceConnectivity to enable deep health checks.

cat > eli-cluster-config.json << EOL
{
    "ClusterName": "example-hp-cluster",
    "Orchestrator": {
        "Eks": {
            "ClusterArn": "${EKS_CLUSTER_ARN}"
        }
    },
    "InstanceGroups": [
        {
            "InstanceGroupName": "worker-group-1",
            "InstanceType": "ml.p5.48xlarge",
            "InstanceCount": 32,
            "LifeCycleConfig": {
                "SourceS3Uri": "s3://${BUCKET_NAME}",
                "OnCreate": "on_create.sh"
            },
            "ExecutionRole": "${EXECUTION_ROLE}",
            "ThreadsPerCore": 1,
            "OnStartDeepHealthChecks": [
                "InstanceStress",
                "InstanceConnectivity"
            ]
        },
  ....
    ],
    "VpcConfig": {
        "SecurityGroupIds": [
            "$SECURITY_GROUP"
        ],
        "Subnets": [
            "$SUBNET_ID"
        ]
    },
    "ResilienceConfig": {
        "NodeRecovery": "Automatic"
    }
}
EOL

You can add InstanceStorageConfigs to provision and mount additional Amazon EBS volumes on HyperPod nodes.

To create the cluster using the SageMaker HyperPod APIs, I run the following AWS CLI command:

aws sagemaker create-cluster \
--cli-input-json file://eli-cluster-config.json

The AWS command returns the ARN of the new HyperPod cluster.

{
"ClusterArn": "arn:aws:sagemaker:us-east-2:ACCOUNT-ID:cluster/wccy5z4n4m49"
}
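The same step can be sketched with the AWS SDK for Python (Boto3), assuming its SageMaker client's create_cluster call accepts the keys of the JSON file as keyword arguments, much like --cli-input-json does. The helper name create_cluster_from_file is hypothetical, and the client is passed in so the function can be tried without AWS credentials. Unlike the shell heredoc, this sketch does not expand variables such as ${EKS_CLUSTER_ARN}, so the file must already contain the final values.

```python
import json


def create_cluster_from_file(sagemaker_client, config_path):
    """Load a cluster definition from a JSON file and create the cluster.

    sagemaker_client is expected to be boto3.client("sagemaker") (an
    assumption of this sketch). Returns the ARN of the new cluster.
    """
    with open(config_path, encoding="utf-8") as handle:
        # json.load fails early on malformed JSON, for example a stray
        # trailing comma, before any API call is made.
        config = json.load(handle)
    response = sagemaker_client.create_cluster(**config)
    return response["ClusterArn"]
```

A call such as create_cluster_from_file(boto3.client("sagemaker"), "eli-cluster-config.json") would then return the cluster ARN shown above.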

I then verify the HyperPod cluster status in the SageMaker Console, waiting until the status changes to InService.

Alternatively, you can check the cluster status by running the describe-cluster command with the AWS CLI:

aws sagemaker describe-cluster --cluster-name example-hp-cluster
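Rather than re-running the command by hand, the wait can be automated. The following is a minimal polling sketch, assuming the Boto3 SageMaker client's describe_cluster call and its ClusterStatus response field. The function name wait_until_in_service, the polling interval, and the retry limit are illustrative choices.

```python
import time


def wait_until_in_service(sagemaker_client, cluster_name,
                          poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll the cluster status until it becomes InService.

    sagemaker_client is expected to be boto3.client("sagemaker") (an
    assumption of this sketch). Raises RuntimeError if the cluster reports
    Failed and TimeoutError if max_polls is exhausted.
    """
    for _ in range(max_polls):
        response = sagemaker_client.describe_cluster(ClusterName=cluster_name)
        status = response["ClusterStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"cluster {cluster_name} failed to start")
        sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"cluster {cluster_name} not InService after {max_polls} polls")
```

For the cluster defined earlier, this could be called as wait_until_in_service(boto3.client("sagemaker"), "example-hp-cluster").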

Once the cluster is ready, I can access the SageMaker HyperPod cluster nodes. For most operations, I can use kubectl commands to manage resources and jobs from my development environment, using the full power of Kubernetes orchestration while benefiting from SageMaker HyperPod’s managed infrastructure. On this occasion, for advanced troubleshooting or direct node access, I use AWS Systems Manager (SSM) to log into individual nodes, following the instructions in the Access your SageMaker HyperPod cluster nodes page.

To run jobs on the SageMaker HyperPod cluster orchestrated by EKS, I follow the steps outlined in the Run jobs on SageMaker HyperPod cluster through Amazon EKS page. You can use the HyperPod CLI and the native kubectl command to find available HyperPod clusters and submit training jobs (Pods). For managing ML experiments and training runs, you can use Kubeflow Training Operator, Kueue and Amazon SageMaker-managed MLflow.

Finally, in the SageMaker Console, I can view the Status and Kubernetes version of recently added EKS clusters, providing a comprehensive overview of my SageMaker HyperPod environment.

And I can monitor cluster performance and health insights using Amazon CloudWatch Container Insights.

Things to know
Here are some key things you should know about Amazon EKS support in Amazon SageMaker HyperPod:

Resilient Environment – This integration provides a more resilient training environment with deep health checks, automated node recovery, and job auto-resume. SageMaker HyperPod automatically detects, diagnoses, and recovers from faults, allowing you to continually train foundation models for weeks or months without disruption. This can reduce training time by up to 40%.

Enhanced GPU Observability – Amazon CloudWatch Container Insights provides detailed metrics and logs for your containerized applications and microservices. This enables comprehensive monitoring of cluster performance and health.

Scientist-Friendly Tool – This launch includes a custom HyperPod CLI for job management, Kubeflow Training Operators for distributed training, Kueue for scheduling, and integration with SageMaker Managed MLflow for experiment tracking. It also works with SageMaker’s distributed training libraries, which provide Model Parallel and Data Parallel optimizations to significantly reduce training time. These libraries, combined with auto-resumption of jobs, enable efficient and uninterrupted training of large models.

Flexible Resource Utilization – This integration enhances developer experience and scalability for FM workloads. Data scientists can efficiently share compute capacity across training and inference tasks. You can use your existing Amazon EKS clusters or create and attach new ones to HyperPod compute, and bring your own tools for job submission, queuing, and monitoring.

To get started with Amazon SageMaker HyperPod on Amazon EKS, you can explore resources such as the SageMaker HyperPod EKS Workshop, the aws-do-hyperpod project, and the awsome-distributed-training project. This release is generally available in the AWS Regions where Amazon SageMaker HyperPod is available, except Europe (London). For pricing information, visit the Amazon SageMaker Pricing page.

This blog post was a collaborative effort. I would like to thank Manoj Ravi, Adhesh Garg, Tomonori Shimomura, Alex Iankoulski, Anoop Saha, and the entire team for their significant contributions in compiling and refining the information presented here. Their collective expertise was crucial in creating this comprehensive article.

– Eli.

Stability AI’s best image generating models now in Amazon Bedrock

2024-09-04 Danilo Poccia

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/stability-ais-best-image-generating-models-now-in-amazon-bedrock/

Starting today, you can use three new text-to-image models from Stability AI in Amazon Bedrock: Stable Image Ultra, Stable Diffusion 3 Large, and Stable Image Core. These models greatly improve performance in multi-subject prompts, image quality, and typography and can be used to rapidly generate high-quality visuals for a wide range of use cases across marketing, advertising, media, entertainment, retail, and more.

These models excel in producing images with stunning photorealism, boasting exceptional detail, color, and lighting, addressing common challenges like rendering realistic hands and faces. The models’ advanced prompt understanding allows them to interpret complex instructions involving spatial reasoning, composition, and style.

The three new Stability AI models available in Amazon Bedrock cover different use cases:

Stable Image Ultra – Produces the highest quality, photorealistic outputs perfect for professional print media and large format applications. Stable Image Ultra excels at rendering exceptional detail and realism.

Stable Diffusion 3 Large – Strikes a balance between generation speed and output quality. Ideal for creating high-volume, high-quality digital assets like websites, newsletters, and marketing materials.

Stable Image Core – Optimized for fast and affordable image generation, great for rapidly iterating on concepts during ideation.

This table summarizes the models’ key features:

Features	Stable Image Ultra	Stable Diffusion 3 Large	Stable Image Core
Parameters	16 billion	8 billion	2.6 billion
Input	Text	Text or image	Text
Typography	Tailored for large-scale display	Tailored for large-scale display	Versatility and readability across different sizes and applications
Visual aesthetics	Photorealistic image output	Highly realistic with finer attention to detail	Good rendering; not as detail-oriented

One of the key improvements of Stable Image Ultra and Stable Diffusion 3 Large compared to Stable Diffusion XL (SDXL) is text quality in generated images, with fewer errors in spelling and typography thanks to their innovative Diffusion Transformer architecture, which implements two separate sets of weights for image and text but enables information flow between the two modalities.

Here are a few images created with these models.

Stable Image Ultra – Prompt: photo, realistic, a woman sitting in a field watching a kite fly in the sky, stormy sky, highly detailed, concept art, intricate, professional composition.

Stable Diffusion 3 Large – Prompt: comic-style illustration, male detective standing under a streetlamp, noir city, wearing a trench coat, fedora, dark and rainy, neon signs, reflections on wet pavement, detailed, moody lighting.

Stable Image Core – Prompt: professional 3d render of a white and orange sneaker, floating in center, hovering, floating, high quality, photorealistic.

Use cases for the new Stability AI models in Amazon Bedrock
Text-to-image models offer transformative potential for businesses across various industries and can significantly streamline creative workflows in marketing and advertising departments, enabling rapid generation of high-quality visuals for campaigns, social media content, and product mockups. By expediting the creative process, companies can respond more quickly to market trends and reduce time-to-market for new initiatives. Additionally, these models can enhance brainstorming sessions, providing instant visual representations of concepts that can spark further innovation.

For e-commerce businesses, AI-generated images can help create diverse product showcases and personalized marketing materials at scale. In the realm of user experience and interface design, these tools can quickly produce wireframes and prototypes, accelerating the design iteration process. The adoption of text-to-image models can lead to significant cost savings, increased productivity, and a competitive edge in visual communication across various business functions.

Here are some example use cases across different industries:

Advertising and Marketing

Stable Image Ultra for luxury brand advertising and photorealistic product showcases
Stable Diffusion 3 Large for high-quality product marketing images and print campaigns
Stable Image Core for rapid A/B testing of visual concepts for social media ads

E-commerce

Stable Image Ultra for high-end product customization and made-to-order items
Stable Diffusion 3 Large for most product visuals across an e-commerce site
Stable Image Core to quickly generate product images and keep listings up-to-date

Media and Entertainment

Stable Image Ultra for ultra-realistic key art, marketing materials, and game visuals
Stable Diffusion 3 Large for environment textures, character art, and in-game assets
Stable Image Core for rapid prototyping and concept art exploration

Now, let’s see these new models in action, first using the AWS Management Console, then with the AWS Command Line Interface (AWS CLI) and AWS SDKs.

Using the new Stability AI models in the Amazon Bedrock console
In the Amazon Bedrock console, I choose Model access from the navigation pane to enable access to the three new models in the Stability AI section.

Now that I have access, I choose Image in the Playgrounds section of the navigation pane. For the model, I choose Stability AI and Stable Image Ultra.

As a prompt, I type:

A stylized picture of a cute old steampunk robot with in its hands a sign written in chalk that says "Stable Image Ultra in Amazon Bedrock".

I leave all other options at their default values and choose Run. After a few seconds, I get what I asked for. Here’s the image:

Using Stable Image Ultra with the AWS CLI
While I am still in the console Image playground, I choose the three small dots in the corner of the playground window and then View API request. In this way, I can see the AWS Command Line Interface (AWS CLI) command equivalent to what I just did in the console:

aws bedrock-runtime invoke-model \
--model-id stability.stable-image-ultra-v1:0 \
--body "{\"prompt\":\"A stylized picture of a cute old steampunk robot with in its hands a sign written in chalk that says \\\"Stable Image Ultra in Amazon Bedrock\\\".\",\"mode\":\"text-to-image\",\"aspect_ratio\":\"1:1\",\"output_format\":\"jpeg\"}" \
--cli-binary-format raw-in-base64-out \
--region us-west-2 \
invoke-model-output.txt

To use Stable Image Core or Stable Diffusion 3 Large, I can replace the model ID.
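As a minimal sketch of swapping the model ID, the helper below builds the invoke_model arguments for any of the three models. The Ultra model ID and the body fields come from the command above; the other two model IDs, the helper name build_invoke_args, and the short keys in MODEL_IDS are assumptions for illustration and should be checked against the model list in the Amazon Bedrock console.

```python
import json

# The "ultra" ID is taken from the CLI command above. The other two IDs are
# assumptions; verify them in the Amazon Bedrock console before use.
MODEL_IDS = {
    "ultra": "stability.stable-image-ultra-v1:0",
    "core": "stability.stable-image-core-v1:0",  # assumed ID
    "sd3-large": "stability.sd3-large-v1:0",     # assumed ID
}


def build_invoke_args(model_key, prompt, aspect_ratio="1:1", output_format="png"):
    """Return keyword arguments for bedrock_runtime.invoke_model (hypothetical helper)."""
    body = {
        "prompt": prompt,
        "mode": "text-to-image",
        "aspect_ratio": aspect_ratio,
        "output_format": output_format,
    }
    return {"modelId": MODEL_IDS[model_key], "body": json.dumps(body)}
```

With a Boto3 client for bedrock-runtime, the result could be passed as bedrock_runtime.invoke_model(**build_invoke_args("core", "A cute robot")), assuming the same request body is accepted by all three models.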

The previous command outputs the image in Base64 format inside a JSON object in a text file.
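If jq is not at hand, the saved file can also be decoded with a few lines of Python. This is only a sketch: the function name save_image_from_response is hypothetical, and it assumes the response JSON holds the Base64 image in the first element of an images list, as the jq filter and the SDK example in this post do.

```python
import base64
import json


def save_image_from_response(json_path, image_path):
    """Decode the first Base64 image in a saved invoke-model response.

    Returns the number of bytes written to image_path.
    """
    with open(json_path, "r", encoding="utf-8") as f:
        response = json.load(f)
    # The encoded image is expected under the "images" key.
    image_bytes = base64.b64decode(response["images"][0])
    with open(image_path, "wb") as f:
        f.write(image_bytes)
    return len(image_bytes)
```

For the command above, which requests the jpeg output format, a call such as save_image_from_response("invoke-model-output.txt", "img.jpg") would write the decoded picture next to the response file.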

To get the image with a single command, I write the output JSON to standard output and use the jq tool to extract the encoded image so that it can be decoded on the fly. Because the request asks for the jpeg output format, the result is written to the img.jpg file. Here’s the full command:

aws bedrock-runtime invoke-model \
--model-id stability.stable-image-ultra-v1:0 \
--body "{\"prompt\":\"A stylized picture of a cute old steampunk robot with in its hands a sign written in chalk that says \\\"Stable Image Ultra in Amazon Bedrock\\\".\",\"mode\":\"text-to-image\",\"aspect_ratio\":\"1:1\",\"output_format\":\"jpeg\"}" \
--cli-binary-format raw-in-base64-out \
--region us-west-2 \
/dev/stdout | jq -r '.images[0]' | base64 --decode > img.jpg

Using Stable Image Ultra with AWS SDKs
Here’s how you can use Stable Image Ultra with the AWS SDK for Python (Boto3). This simple application interactively asks for a text-to-image prompt and then calls Amazon Bedrock to generate the image.

import base64
import boto3
import json
import os

MODEL_ID = "stability.stable-image-ultra-v1:0"

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-west-2")

print("Enter a prompt for the text-to-image model:")
prompt = input()

body = {
    "prompt": prompt,
    "mode": "text-to-image"
}
response = bedrock_runtime.invoke_model(modelId=MODEL_ID, body=json.dumps(body))

model_response = json.loads(response["body"].read())

base64_image_data = model_response["images"][0]

i, output_dir = 1, "output"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
while os.path.exists(os.path.join(output_dir, f"img_{i}.png")):
    i += 1

image_data = base64.b64decode(base64_image_data)

image_path = os.path.join(output_dir, f"img_{i}.png")
with open(image_path, "wb") as file:
    file.write(image_data)

print(f"The generated image has been saved to {image_path}")

The application writes the resulting image to an output directory, which it creates if it does not exist. To avoid overwriting existing files, the code looks for the first available file name in the img_<number>.png format.
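The file-naming step can also be factored into a small reusable function. The sketch below uses pathlib instead of os.path; the name next_image_path and its parameters are hypothetical, and it is one possible variant rather than the code used by the application above.

```python
from pathlib import Path


def next_image_path(output_dir="output", prefix="img_", suffix=".png"):
    """Return the first path of the form <prefix><number><suffix> that does not exist yet."""
    directory = Path(output_dir)
    # Create the output directory if it is missing; do nothing if it already exists.
    directory.mkdir(parents=True, exist_ok=True)
    i = 1
    while (directory / f"{prefix}{i}{suffix}").exists():
        i += 1
    return directory / f"{prefix}{i}{suffix}"
```

Note that the check and the later write are separate steps, so two processes running at the same time could still pick the same name; for a single interactive script this is usually acceptable.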

More examples of how to use Stable Diffusion models are available in the Code Library of the AWS Documentation.

Customer voices
Learn from Ken Hoge, Global Alliance Director, Stability AI, how Stable Diffusion models are reshaping the industry from text-to-image to video, audio, and 3D, and how Amazon Bedrock empowers customers with an all-in-one, secure, and scalable solution.

Step into a world where reading comes alive with Nicolette Han, Product Owner, Stride Learning. With support from Amazon Bedrock and AWS, Stride Learning’s Legend Library is transforming how young minds engage with and comprehend literature using AI to create stunning, safe illustrations for children’s stories.

Things to know
The new Stability AI models – Stable Image Ultra, Stable Diffusion 3 Large, and Stable Image Core – are available today in Amazon Bedrock in the US West (Oregon) AWS Region. With this launch, Amazon Bedrock offers a broader set of solutions to boost your creativity and accelerate content generation workflows. See the Amazon Bedrock pricing page to understand costs for your use case.

You can find more information on Stable Diffusion 3 in the research paper that describes in detail the underlying technology.

To start, see the Stability AI models section of the Amazon Bedrock User Guide. To discover how others are using generative AI in their solutions and learn with deep-dive technical content, visit community.aws.

— Danilo

The latest AWS Heroes have arrived – September 2024

2024-09-04 Taylor Jacobsen

Post Syndicated from Taylor Jacobsen original https://aws.amazon.com/blogs/aws/the-latest-aws-heroes-have-arrived-september-2024/

The AWS Heroes program recognizes outstanding individuals who are making meaningful contributions within the AWS community. These technical experts generously share their insights, best practices, and innovative solutions to help others create efficiencies and build faster on AWS. Heroes are thought leaders who have demonstrated a commitment to empowering the broader AWS community through their significant contributions and leadership.

Meet our newest cohort of AWS Heroes!

Faye Ellis – London, United Kingdom

Community Hero Faye Ellis is a Principal Training Architect at Pluralsight, where she specializes in helping organizations and individuals develop their AWS skills, and has taught AWS to millions of people worldwide. She is also committed to making a rewarding cloud career achievable for people all around the world. With over a decade of experience in the IT industry, she uses her expertise in designing and supporting mission-critical systems to help explain cloud technology in a way that is accessible and easy to understand.

Ilanchezhian Ganesamurthy – Chennai, India

Community Hero Ilanchezhian Ganesamurthy is currently the Director – Generative AI (GenAI) and Conversational AI (CAI) at Tietoevry, a leading Nordic IT services company. Since 2015, he has been actively involved with the AWS User Group India, and in 2018, he took on the role of co-organizer for the AWS User Group Chennai, which has grown to 4,800 members. Ilan champions diversity and inclusion, recognizing the importance of fostering the next generation of cloud experts. He is a strong supporter of AWS Cloud Clubs, leveraging his industry connections to help the Chennai chapter organize events and networking to empower aspiring cloud professionals.

Jaehyun Shin – Seoul, Korea

Community Hero Jaehyun Shin is a Site Reliability Engineer at MUSINSA, a Korean online fashion retailer. In 2017, he joined the AWS Korea User Group (AWSKRUG), where he has since served as a co-owner of the AWSKRUG Serverless and Seongsu Groups. During this time, Jaehyun was an AWS Community Builder, leveraging his expertise to mentor and nurture new Korea User Group leaders. He has also been an active event organizer for AWS Community Days and hands-on labs, further strengthening the AWS community in Korea.

Jimmy Dahlqvist – Malmö, Sweden

Serverless Hero Jimmy Dahlqvist is a Lead Cloud Architect and Advisor at Sigma Technology Cloud, an AWS Advanced Tier Services Partner and one of Sweden’s major consulting companies. In 2024, he started Serverless-Handbook as the home base of all his serverless adventures. Jimmy is also an AWS Certification Subject Matter Expert, and regularly participates in workshops ensuring AWS Certifications are fair for everyone.

Lee Gilmore – Newcastle, United Kingdom

Serverless Hero Lee Gilmore is a Principal Solutions Architect at Leighton, an AWS Consulting Partner based in Newcastle, North East England. With over two decades of experience in the tech industry, he has spent the past ten years specializing in serverless and cloud-native technologies. Lee is passionate about domain-driven design and event-driven architectures, which are central to his work. Additionally, he regularly authors in-depth articles on serverlessadvocate.com, shares full open-source solutions on GitHub, and frequently speaks at both local and international events.

Maciej Walkowiak – Berlin, Germany

DevTools Hero Maciej Walkowiak is an independent Java consultant based in Berlin, Germany. For nearly two decades, he has been helping companies ranging from startups to enterprises in architecting and developing fast, scalable, and easy-to-maintain Java applications. The great majority of these applications are based on the Spring Framework and Spring Boot, which are his favorite tools for building software. Since 2015, he has been deeply involved in the Spring ecosystem, and leads the Spring Cloud AWS project on GitHub—the bridge between AWS APIs and the Spring programming model.

Minoru Onda – Tokyo, Japan

Community Hero Minoru Onda is a Technology Evangelist at KDDI Agile Development Center Corporation (KAG). He joined the Japan AWS User Group (JAWS-UG) in 2021 and now leads the operations of three communities: the Tokyo chapter, the SRE chapter, and NW-JAWS. In recent years, he has been focusing on utilizing Generative AI on AWS, and co-authored an introductory technical book on Amazon Bedrock with community members, which was published in Japan.

Learn More

Visit the AWS Heroes website if you’d like to learn more about the AWS Heroes program or to connect with a Hero near you.

— Taylor

Noise