Tag Archives: announcements

Stability AI’s best image generating models now in Amazon Bedrock

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/stability-ais-best-image-generating-models-now-in-amazon-bedrock/

Starting today, you can use three new text-to-image models from Stability AI in Amazon Bedrock: Stable Image UltraStable Diffusion 3 Large, and Stable Image Core. These models greatly improve performance in multi-subject prompts, image quality, and typography and can be used to rapidly generate high-quality visuals for a wide range of use cases across marketing, advertising, media, entertainment, retail, and more.

These models excel in producing images with stunning photorealism, boasting exceptional detail, color, and lighting, addressing common challenges like rendering realistic hands and faces. The models’ advanced prompt understanding allows it to interpret complex instructions involving spatial reasoning, composition, and style.

The three new Stability AI models available in Amazon Bedrock cover different use cases:

Stable Image Ultra – Produces the highest quality, photorealistic outputs perfect for professional print media and large format applications. Stable Image Ultra excels at rendering exceptional detail and realism.

Stable Diffusion 3 Large – Strikes a balance between generation speed and output quality. Ideal for creating high-volume, high-quality digital assets like websites, newsletters, and marketing materials.

Stable Image Core – Optimized for fast and affordable image generation, great for rapidly iterating on concepts during ideation.

This table summarizes the model’s key features:

Features Stable Image Ultra Stable Diffusion 3 Large Stable Image Core
Parameters 16 billion 8 billion 2.6 billion
Input Text Text or image Text
Typography Tailored for
large-scale display
Tailored for
large-scale display
Versatility and readability across
different sizes and applications
Visual
aesthetics
Photorealistic
image output
Highly realistic with
finer attention to detail
Good rendering;
not as detail-oriented

One of the key improvements of Stable Image Ultra and Stable Diffusion 3 Large compared to Stable Diffusion XL (SDXL) is text quality in generated images, with fewer errors in spelling and typography thanks to its innovative Diffusion Transformer architecture, which implements two separate sets of weights for image and text but enables information flow between the two modalities.

Here are a few images created with these models.

Stable Image Ultra – Prompt: photo, realistic, a woman sitting in a field watching a kite fly in the sky, stormy sky, highly detailed, concept art, intricate, professional composition.

Stable Diffusion 3 Ultra – Prompt: photo, realistic, a woman sitting in a field watching a kite fly in the sky, stormy sky, highly detailed, concept art, intricate, professional composition.

Stable Diffusion 3 Large – Prompt: comic-style illustration, male detective standing under a streetlamp, noir city, wearing a trench coat, fedora, dark and rainy, neon signs, reflections on wet pavement, detailed, moody lighting.

Stable Diffusion 3 Large – Prompt: comic-style illustration, male detective standing under a streetlamp, noir city, wearing a trench coat, fedora, dark and rainy, neon signs, reflections on wet pavement, detailed, moody lighting.

Stable Image Core – Prompt: professional 3d render of a white and orange sneaker, floating in center, hovering, floating, high quality, photorealistic.

Stable Image Core – Prompt: Professional 3d render of a white and orange sneaker, floating in center, hovering, floating, high quality, photorealistic

Use cases for the new Stability AI models in Amazon Bedrock
Text-to-image models offer transformative potential for businesses across various industries and can significantly streamline creative workflows in marketing and advertising departments, enabling rapid generation of high-quality visuals for campaigns, social media content, and product mockups. By expediting the creative process, companies can respond more quickly to market trends and reduce time-to-market for new initiatives. Additionally, these models can enhance brainstorming sessions, providing instant visual representations of concepts that can spark further innovation.

For e-commerce businesses, AI-generated images can help create diverse product showcases and personalized marketing materials at scale. In the realm of user experience and interface design, these tools can quickly produce wireframes and prototypes, accelerating the design iteration process. The adoption of text-to-image models can lead to significant cost savings, increased productivity, and a competitive edge in visual communication across various business functions.

Here are some example use cases across different industries:

Advertising and Marketing

  • Stable Image Ultra for luxury brand advertising and photorealistic product showcases
  • Stable Diffusion 3 Large for high-quality product marketing images and print campaigns
  • Use Stable Image Core for rapid A/B testing of visual concepts for social media ads

E-commerce

  • Stable Image Ultra for high-end product customization and made-to-order items
  • Stable Diffusion 3 Large for most product visuals across an e-commerce site
  • Stable Image Core to quickly generate product images and keep listings up-to-date

Media and Entertainment

  • Stable Image Ultra for ultra-realistic key art, marketing materials, and game visuals
  • Stable Diffusion 3 Large for environment textures, character art, and in-game assets
  • Stable Image Core for rapid prototyping and concept art exploration

Now, let’s see these new models in action, first using the AWS Management Console, then with the AWS Command Line Interface (AWS CLI) and AWS SDKs.

Using the new Stability AI models in the Amazon Bedrock console
In the Amazon Bedrock console, I choose Model access from the navigation pane to enable access the three new models in the Stability AI section.

Now that I have access, I choose Image in the Playgrounds section of the navigation pane. For the model, I choose Stability AI and Stable Image Ultra.

As prompt, I type:

A stylized picture of a cute old steampunk robot with in its hands a sign written in chalk that says "Stable Image Ultra in Amazon Bedrock".

I leave all other options to their default values and choose Run. After a few seconds, I get what I asked. Here’s the image:

A stylized picture of a cute old steampunk robot with in its hands a sign written in chalk that says "Stable Image Ultra in Amazon Bedrock".

Using Stable Image Ultra with the AWS CLI
While I am still in the console Image playground, I choose the three small dots in the corner of the playground window and then View API request. In this way, I can see the AWS Command Line Interface (AWS CLI) command equivalent to what I just did in the console:

aws bedrock-runtime invoke-model \
--model-id stability.stable-image-ultra-v1:0 \
--body "{\"prompt\":\"A stylized picture of a cute old steampunk robot with in its hands a sign written in chalk that says \\\"Stable Image Ultra in Amazon Bedrock\\\".\",\"mode\":\"text-to-image\",\"aspect_ratio\":\"1:1\",\"output_format\":\"jpeg\"}" \
--cli-binary-format raw-in-base64-out \
--region us-west-2 \
invoke-model-output.txt

To use Stable Image Core or Stable Diffusion 3 Large, I can replace the model ID.

The previous command outputs the image in Base64 format inside a JSON object in a text file.

To get the image with a single command, I write the output JSON file to standard output and use the jq tool to extract the encoded image so that it can be decoded on the fly. The output is written in the img.png file. Here’s the full command:

aws bedrock-runtime invoke-model \
--model-id stability.stable-image-ultra-v1:0 \
--body "{\"prompt\":\"A stylized picture of a cute old steampunk robot with in its hands a sign written in chalk that says \\\"Stable Image Ultra in Amazon Bedrock\\\".\",\"mode\":\"text-to-image\",\"aspect_ratio\":\"1:1\",\"output_format\":\"jpeg\"}" \
--cli-binary-format raw-in-base64-out \
--region us-west-2 \
/dev/stdout | jq -r '.images[0]' | base64 --decode > img.png

Using Stable Image Ultra with AWS SDKs
Here’s how you can use Stable Image Ultra with the AWS SDK for Python (Boto3). This simple application interactively asks for a text-to-image prompt and then calls Amazon Bedrock to generate the image.

import base64
import boto3
import json
import os

MODEL_ID = "stability.stable-image-ultra-v1:0"

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-west-2")

print("Enter a prompt for the text-to-image model:")
prompt = input()

body = {
    "prompt": prompt,
    "mode": "text-to-image"
}
response = bedrock_runtime.invoke_model(modelId=MODEL_ID, body=json.dumps(body))

model_response = json.loads(response["body"].read())

base64_image_data = model_response["images"][0]

i, output_dir = 1, "output"
if not os.path.exists(output_dir):
    os.makedirs(output_dir)
while os.path.exists(os.path.join(output_dir, f"img_{i}.png")):
    i += 1

image_data = base64.b64decode(base64_image_data)

image_path = os.path.join(output_dir, f"img_{i}.png")
with open(image_path, "wb") as file:
    file.write(image_data)

print(f"The generated image has been saved to {image_path}")

The application writes the resulting image in an output directory that is created if not present. To not overwrite existing files, the code checks for existing files to find the first file name available with the img_<number>.png format.

More examples of how to use Stable Diffusion models are available in the Code Library of the AWS Documentation.

Customer voices
Learn from Ken Hoge, Global Alliance Director, Stability AI, how Stable Diffusion models are reshaping the industry from text-to-image to video, audio, and 3D, and how Amazon Bedrock empowers customers with an all-in-one, secure, and scalable solution.

Step into a world where reading comes alive with Nicolette Han, Product Owner, Stride Learning. With support from Amazon Bedrock and AWS, Stride Learning’s Legend Library is transforming how young minds engage with and comprehend literature using AI to create stunning, safe illustrations for children stories.

Things to know
The new Stability AI models – Stable Image Ultra,  Stable Diffusion 3 Large, and Stable Image Core – are available today in Amazon Bedrock in the US West (Oregon) AWS Region. With this launch, Amazon Bedrock offers a broader set of solutions to boost your creativity and accelerate content generation workflows. See the Amazon Bedrock pricing page to understand costs for your use case.

You can find more information on Stable Diffusion 3 in the research paper that describes in detail the underlying technology.

To start, see the Stability AI’s models section of the Amazon Bedrock User Guide. To discover how others are using generative AI in their solutions and learn with deep-dive technical content, visit community.aws.

Danilo

The latest AWS Heroes have arrived – September 2024

Post Syndicated from Taylor Jacobsen original https://aws.amazon.com/blogs/aws/the-latest-aws-heroes-have-arrived-september-2024/

The AWS Heroes program recognizes outstanding individuals who are making meaningful contributions within the AWS community. These technical experts generously share their insights, best practices, and innovative solutions to help others create efficiencies and build faster on AWS. Heroes are thought leaders who have demonstrated a commitment to empowering the broader AWS community through their significant contributions and leadership.

Meet our newest cohort of AWS Heroes!

Faye Ellis – London, United Kingdom

Community Hero Faye Ellis is a Principal Training Architect at Pluralsight, where she specializes in helping organizations and individuals to develop their AWS skills, and has taught AWS to millions of people worldwide. She is also committed to make a rewarding cloud career achievable for people all around the world. With over a decade of experience in the IT industry, she uses her expertise in designing and supporting mission critical systems to help explain cloud technology in a way that is accessible and easy to understand.

Ilanchezhian Ganesamurthy – Chennai, India

Community Hero Ilanchezhian Ganesamurthy is currently the Director – Generative AI (GenAI) and Conversational AI (CAI) at Tietoevry, a leading Nordic IT services company. Since 2015, he has been actively involved with the AWS User Group India, and in 2018, he took on the role of co-organizer for the AWS User Group Chennai, which has grown to 4,800 members. Ilan champions diversity and inclusion, recognizing the importance of fostering the next generation of cloud experts. He is a strong supporter of AWS Cloud Clubs, leveraging his industry connections to help the Chennai chapter organize events and networking to empower aspiring cloud professionals.

Jaehyun Shin – Seoul, Korea

Community Hero Jaehyun Shin is a Site Reliability Engineer at MUSINSA, a Korean online fashion retailer. In 2017, he joined the AWS Korea User Group (AWSKRUG), where he has since served as a co-owner of the AWSKRUG Serverless and Seongsu Groups. During this time, Jaehyun was an AWS Community Builder, leveraging his expertise to mentor and nurture new Korea User Group leaders. He has also been an active event organizer for AWS Community Days and hands-on labs, further strengthening the AWS community in Korea.

Jimmy Dahlqvist – Malmö, Sweden

Serverless Hero Jimmy Dahlqvist is a Lead Cloud Architect and Advisor at Sigma Technology Cloud, an AWS Advanced Tier Services Partner and one of Sweden’s major consulting companies. In 2024, he started Serverless-Handbook as the home base of all his serverless adventures. Jimmy is also an AWS Certification Subject Matter Expert, and regularly participates in workshops ensuring AWS Certifications are fair for everyone.

Lee Gilmore – Newcastle, United Kingdom

Serverless Hero Lee Gilmore is a Principal Solutions Architect at Leighton, an AWS Consulting Partner based in Newcastle, North East England. With over two decades of experience in the tech industry, he has spent the past ten years specializing in serverless and cloud-native technologies. Lee is passionate about domain-driven design and event-driven architectures, which are central to his work. Additionally, he regularly authors in-depth articles on serverlessadvocate.com, shares open-source full solutions on GitHub, and frequently speaks at both local and international events.

Maciej Walkowiak – Berlin, Germany

DevTools Hero Maciej Walkowiak is an independent Java consultant based in Berlin, Germany. For nearly two decades, he has been helping companies ranging from startups to enterprises in architecting and developing fast, scalable, and easy-to-maintain Java applications. The great majority of these applications are based on the Spring Framework and Spring Boot, which are his favorite tools for building software. Since 2015, he has been deeply involved in the Spring ecosystem, and leads the Spring Cloud AWS project on GitHub—the bridge between AWS APIs and the Spring programming model.

Minoru Onda – Tokyo, Japan

Community Hero Minoru Onda is a Technology Evangelist at KDDI Agile Development Center Corporation (KAG). He joined the Japan AWS User Group (JAWS-UG) in 2021 and now leads the operations of three communities: the Tokyo chapter, the SRE chapter, and NW-JAWS. In recent years, he has been focusing on utilizing Generative AI on AWS, and co-authored an introductory technical book on Amazon Bedrock with community members, which was published in Japan.

Learn More

Visit the AWS Heroes website if you’d like to learn more about the AWS Heroes program or to connect with a Hero near you.

Taylor

Introducing job queuing to scale your AWS Glue workloads

Post Syndicated from Noritaka Sekiyama original https://aws.amazon.com/blogs/big-data/introducing-job-queuing-to-scale-your-aws-glue-workloads/

Data is a key driver for your business. Data volume can increase significantly over time, and it often requires concurrent consumption of large compute resources. Data integration workloads can become increasingly concurrent as more and more applications demand access to data at the same time. In AWS, hundreds of thousands of customers use AWS Glue, a serverless data integration service, for integrating data across multiple data sources at scale. AWS Glue jobs can be triggered asynchronously via a schedule or event, or started synchronously, on-demand.

Your AWS account has quotas, also referred to as limits, which are the maximum number of service resources for your AWS account. AWS Glue quotas helps guarantee the availability of AWS Glue resources and prevents accidental over provisioning of resources. However, with large or spiky workloads, it can be challenging to manage job run concurrency or Data Processing Units (DPU) to stay under the service quotas.
Traditionally, when you hit the quota of concurrent Glue job runs, your jobs fail immediately.

Today, we are pleased to announce the general availability of AWS Glue job queuing. Job queuing increases scalability and improves the customer experience of managing AWS Glue jobs. With this new capability, you no longer need to manage concurrency of your AWS Glue job runs and attempt retries just to avoid job failures due to high concurrency. You can simply start your jobs, and when the job runs are in Waiting state, the AWS Glue job queuing feature staggers jobs automatically whenever possible. This increases your job success rates and the experience for large concurrency workloads.

This post demonstrates how job queuing helps you scale your Glue workloads and how job queuing works.

Use cases and benefits for job queuing

The following are common data integration use cases where many concurrent job runs are needed:

  • Many different data sources need to be read in parallel
  • Multiple large datasets need to be processed concurrently
  • Data is processed in an event-driven fashion, and many events occur at the same time

AWS Glue has the following service quotas per Region and account related to concurrent usage:

  • Max concurrent job runs per account
  • Max concurrent job runs per job
  • Max task DPUs per account

You can also configure maximum concurrency for individual jobs.

In the aforementioned typical use cases, when you run a job through the StartJobRun API or AWS Glue console, you may hit the upper limit defined at any of the discussed places. If this happens, your job fails immediately due to errors like ConcurrentRunsExceededException returned by the AWS Glue API endpoint.

Job queuing helps those typical use cases without forcing you to manage concurrency between all your job runs. You no longer need to make manual retries when you get ConcurrentRunsExceededException. Job queuing enqueues job runs when you hit the limit and automatically reattempts job runs when resources free up. It simplifies your daily operation and reduces latency for the retries. It also allows you to scale more with AWS Glue jobs.

In the next section, we describe how job queuing is configured.

Configure job queuing for Glue jobs

To enable job queuing on the AWS Glue Studio console, complete the following steps:

  1. Open AWS Glue console.
  2. Choose Jobs.
  3. Choose your job.
  4. Choose the Job details tab.
  5. For Job Run Queuing, select Enable job runs to be queued to run later when they cannot run immediately due to service quotas
  6. Choose Save.

In the next section, we describe how job queuing works.

How AWS Glue jobs work with job queuing

In the current job run lifecycle, the job-level and account-level limits are checked when a job starts, and the job moves to a Failed state when these limits are reached. With job queuing, your job run state goes into a Waiting state to be reattempted instead of Failed. The Waiting state means that job run is queued for retry after the limits have been exceeded or resources were not unavailable. Job queueing is another retry mechanism in addition to the customer-specified max retry.

AWS Glue job queuing will improve the success rates of job runs and reduce failures due to limits, but it doesn’t guarantee job run success. Limits and resources could still be unavailable by the time the reattempt run starts.

The following screenshot shows that two job runs are in the Waiting state:

The following limits are covered by job queuing:

  • Max concurrent job runs per account exceeded
  • Max concurrent job runs per job exceeded (which includes the account-level service quota as well as the configured parameter on the job)
  • Max concurrent DPUs exceeded
  • Resource unavailable due to IP address exhaustion in VPCs

The retry mechanism is configured to retry for a maximum of 15 minutes or 10 attempts, whichever comes first.

Here’s the state transition diagram for job runs when job queuing is enabled.

Considerations

Keep in mind the following considerations:

  • AWS Glue Flex jobs are not supported
  • With job queuing enabled, the parameter MaxRetries is not configurable for the same job

Conclusion

In this post, we described how the new job queuing capability helps you scale your AWS Glue job workload. You can start leveraging job queuing for your new jobs or existing jobs today. We are looking forward to hearing your feedback.


About the authors

Noritaka Sekiyama is a Principal Big Data Architect on the AWS Glue team. He works based in Tokyo, Japan. He is responsible for building software artifacts to help customers. In his spare time, he enjoys cycling with his road bike.

 Gyan Radhakrishnan is a Software Development Engineer on the AWS Glue team. He is working on designing and building end-to-end solutions for data intensive applications.

Simon Kern is a Software Development Engineer on the AWS Glue team. He is enthusiastic about serverless technologies, data engineering and building great services.

Dana Adylova is a Software Development Engineer on the AWS Glue team. She is working on building software for supporting data intensive applications. In her spare time, she enjoys knitting and reading sci-fi.

Matt Su is a Senior Product Manager on the AWS Glue team. He enjoys helping customers uncover insights and make better decisions using their data with AWS Analytic services. In his spare time, he enjoys skiing and gardening.

AWS Weekly Roundup: AWS Parallel Computing Service, Amazon EC2 status checks, and more (September 2, 2024)

Post Syndicated from Esra Kayabali original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-aws-parallel-computing-service-amazon-ec2-status-checks-and-more-september-2-2024/

With the arrival of September, AWS re:Invent 2024 is now 3 months away and I am very excited for the new upcoming services and announcements at the conference. I remember attending re:Invent 2019, just before the COVID-19 pandemic. It was the biggest in-person re:Invent with 60,000+ attendees and it was my second one. It was amazing to be in that atmosphere! Registration is now open for AWS re:Invent 2024. Come join us in Las Vegas for five exciting days of keynotes, breakout sessions, chalk talks, interactive learning opportunities, and career-changing connections!

Now let’s look at the last week’s new announcements.

Last week’s launches
Here are the launches that got my attention.

Announcing AWS Parallel Computing Service – AWS Parallel Computing Service (AWS PCS) is a new managed service that lets you run and scale high performance computing (HPC) workloads on AWS. You can build scientific and engineering models and run simulations using a fully managed Slurm scheduler with built-in technical support and a rich set of customization options. Tailor your HPC environment to your specific needs and integrate it with your preferred software stack. Build complete HPC clusters that integrates compute, storage, networking, and visualization resources, and seamlessly scale from zero to thousands of instances. To learn more, visit AWS Parallel Computing Service and read Channy’s blog post.

Amazon EC2 status checks now support reachability health of attached EBS volumes – You can now use Amazon EC2 status checks to directly monitor if the Amazon EBS volumes attached to your instances are reachable and able to complete I/O operations. With this new status check, you can quickly detect attachment issues or volume impairments that may impact the performance of your applications running on Amazon EC2 instances. You can further integrate these status checks within Auto Scaling groups to monitor the health of EC2 instances and replace impacted instances to ensure high availability and reliability of your applications. Attached EBS status checks can be used along with the instance status and system status checks to monitor the health of your instances. To learn more, refer to the Status checks for Amazon EC2 instances documentation.

Amazon QuickSight now supports sharing views of embedded dashboards – You can now share views of embedded dashboards in Amazon QuickSight. This feature allows you to enable more collaborative capabilities in your application with embedded QuickSight dashboards. Additionally, you can enable personalization capabilities such as bookmarks for anonymous users. You can share a unique link that displays only your changes while staying within the application, and use dashboard or console embedding to generate a shareable link to your application page with QuickSight’s reference encapsulated using the QuickSight Embedding SDK. QuickSight Readers can then send this shareable link to their peers. When their peer accesses the shared link, they are taken to the page on the application that contains the embedded QuickSight dashboard. For more information, refer to Embedded view documentation.

Amazon Q Business launches IAM federation for user identity authenticationAmazon Q Business is a fully managed service that deploys a generative AI business expert for your enterprise data. You can use the Amazon Q Business IAM federation feature to connect your applications directly to your identity provider to source user identity and user attributes for these applications. Previously, you had to sync your user identity information from your identity provider into AWS IAM Identity Center, and then connect your Amazon Q Business applications to IAM Identity Center for user authentication. At launch, Amazon Q Business IAM federation will support the OpenID Connect (OIDC) and SAML2.0 protocols for identity provider connectivity. To learn more, visit Amazon Q Business documentation.

Amazon Bedrock now supports cross-Region inferenceAmazon Bedrock announces support for cross-Region inference, an optional feature that enables you to seamlessly manage traffic bursts by utilizing compute across different AWS Regions. If you are using on-demand mode, you’ll be able to get higher throughput limits (up to 2x your allocated in-Region quotas) and enhanced resilience during periods of peak demand by using cross-Region inference. By opting in, you no longer have to spend time and effort predicting demand fluctuations. Instead, cross-Region inference dynamically routes traffic across multiple Regions, ensuring optimal availability for each request and smoother performance during high-usage periods. You can control where your inference data flows by selecting from a pre-defined set of Regions, helping you comply with applicable data residency requirements and sovereignty laws. Find the list at Supported Regions and models for cross-Region inference. To get started, refer to the Amazon Bedrock documentation or this Machine Learning blog.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

We launched existing services and instance types in additional Regions:

Other AWS events
AWS GenAI Lofts are collaborative spaces and immersive experiences that showcase AWS’s cloud and AI expertise, while providing startups and developers with hands-on access to AI products and services, exclusive sessions with industry leaders, and valuable networking opportunities with investors and peers. Find a GenAI Loft location near you and don’t forget to register.

Gen AI loft workshop

credit: Antje Barth

Upcoming AWS events
Check your calendar and sign up for upcoming AWS events:

AWS Summits are free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. AWS Summits for this year are coming to an end. There are 3 more left that you can still register: Jakarta (September 5), Toronto (September 11), and Ottawa (October 9).

AWS Community Days feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world. While AWS Summits 2024 are almost over, AWS Community Days are in full swing. Upcoming AWS Community Days are in Belfast (September 6), SF Bay Area (September 13), where our own Antje Barth is a keynote speaker, Argentina (September 14), and Armenia (September 14).

Browse all upcoming AWS led in-person and virtual events here.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Esra

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Announcing AWS Parallel Computing Service to run HPC workloads at virtually any scale

Post Syndicated from Channy Yun (윤석찬) original https://aws.amazon.com/blogs/aws/announcing-aws-parallel-computing-service-to-run-hpc-workloads-at-virtually-any-scale/

Today we are announcing AWS Parallel Computing Service (AWS PCS), a new managed service that helps customers set up and manage high performance computing (HPC) clusters so they seamlessly run their simulations at virtually any scale on AWS. Using the Slurm scheduler, they can work in a familiar HPC environment, accelerating their time to results instead of worrying about infrastructure.

In November 2018, we introduced AWS ParallelCluster, an AWS supported open-source cluster management tool that helps you to deploy and manage HPC clusters in the AWS Cloud. With AWS ParallelCluster, customers can also quickly build and deploy proof of concept and production HPC compute environments. They can use AWS ParallelCluster Command-Line interface, API, Python library, and the user interface installed from open source packages. They are responsible for updates, which can include tearing down and redeploying clusters. Many customers, though, have asked us for a fully managed AWS service to eliminate operational jobs in building and operating HPC environments.

AWS PCS simplifies HPC environments managed by AWS and is accessible through the AWS Management Console, AWS SDK, and AWS Command-Line Interface (AWS CLI). Your system administrators can create managed Slurm clusters that use their compute and storage configurations, identity, and job allocation preferences. AWS PCS uses Slurm, a highly scalable, fault-tolerant job scheduler used across a wide range of HPC customers, for scheduling and orchestrating simulations. End users such as scientists, researchers, and engineers can log in to AWS PCS clusters to run and manage HPC jobs, use interactive software on virtual desktops, and access data. You can bring their workloads to AWS PCS quickly, without significant effort to port code.

You can use fully managed NICE DCV remote desktops for remote visualization, and access job telemetry or application logs to enable specialists to manage your HPC workflows in one place.

AWS PCS is designed for a wide range of traditional and emerging, compute or data-intensive, engineering and scientific workloads across areas such as computational fluid dynamics, weather modeling, finite element analysis, electronic design automation, and reservoir simulations using familiar ways of preparing, executing, and analyzing simulations and computations.

Getting started with AWS Parallel Computing Service
To try out AWS PCS, you can use our tutorial for creating a simple cluster in the AWS documentation. First, you create a virtual private cloud (VPC) with an AWS CloudFormation template and shared storage in Amazon Elastic File System (Amazon EFS) within your account for the AWS Region where you will try AWS PCS. To learn more, visit Create a VPC and Create shared storage in the AWS documentation.

1. Create a cluster
In the AWS PCS console, choose Create cluster, a persistent resource for managing resources and running workloads.

Next, enter your cluster name and choose the controller size of your Slurm scheduler. You can choose Small (up to 32 nodes and 256 jobs), Medium (up to 512 nodes and 8,192 jobs), or Large (up to 2,048 nodes and 16,384 jobs) for the limits of cluster workloads. In the Networking section, choose your created VPC, subnet to launch the cluster, and security group applied to your cluster.

Optionally, you can set the Slurm configuration such as an idle time before compute nodes will scale down, a Prolog and Epilog scripts directory on launched compute nodes, and a resource selection algorithm parameter used by Slurm.

Choose Create cluster. It takes some time for the cluster to be provisioned.

2. Create compute node groups
After creating your cluster, you can create compute node groups, a virtual collection of Amazon Elastic Compute Cloud (Amazon EC2) instances that AWS PCS uses to provide interactive access to a cluster or run jobs in a cluster. When you define a compute node group, you specify common traits such as EC2 instance types, minimum and maximum instance count, target VPC subnets, Amazon Machine Image (AMI), purchase option, and custom launch configuration. Compute node groups require an instance profile to pass an AWS Identity and Access Management (IAM) role to an EC2 instance and an EC2 launch template that AWS PCS uses to configure EC2 instances it launches. To learn more, visit Create a launch template And Create an instance profile in the AWS documentation.

To create a compute node group in the console, go to your cluster and choose the Compute node groups tab and the Create compute node group button.

You can create two compute node groups: a login node group to be accessed by end users and a job node group to run HPC jobs.

To create a compute node group running HPC jobs, enter a compute node name and select a previously-created EC2 launch template, IAM instance profile, and subnets to launch compute nodes in your cluster VPC.

Next, choose your preferred EC2 instance types to use when launching compute nodes and the minimum and maximum instance count for scaling. I chose the hpc6a.48xlarge instance type and scale limit up to eight instances. For a login node, you can choose a smaller instance, such as one c6i.xlarge instance. You can also choose either the On-demand or Spot EC2 purchase option if the instance type supports. Optionally, you can choose a specific AMI.

Choose Create. It takes some time for the compute node group to be provisioned. To learn more, visit Create a compute node group to run jobs and Create a compute node group for login nodes in the AWS documentation.

3. Create and run your HPC jobs
After creating your compute node groups, you submit a job to a queue to run it. The job remains in the queue until AWS PCS schedules it to run on a compute node group, based on available provisioned capacity. Each queue is associated with one or more compute node groups, which provide the necessary EC2 instances to do the processing.

To create a queue in the console, go to your cluster and choose the Queues tab and the Create queue button.

Enter your queue name and choose your compute node groups assigned to your queue.

Choose Create and wait while the queue is being created.

When the login compute node group is active, you can use AWS Systems Manager to connect to the EC2 instance it created. Go to the Amazon EC2 console and choose your EC2 instance of the login compute node group. To learn more, visit Create a queue to submit and manage jobs and Connect to your cluster in the AWS documentation.

To run a job using Slurm, you prepare a submission script that specifies the job requirements and submit it to a queue with the sbatch command. Typically, this is done from a shared directory so the login and compute nodes have a common space for accessing files.

You can also run a message passing interface (MPI) job in AWS PCS using Slurm. To learn more, visit Run a single node job with Slurm or Run a multi-node MPI job with Slurm in the AWS documentation.

You can connect a fully-managed NICE DCV remote desktop for visualization. To get started, use the CloudFormation template from HPC Recipes for AWS GitHub repository.

In this example, I used the OpenFOAM motorBike simulation to calculate the steady flow around a motorcycle and rider. This simulation was run with 288 cores of three hpc6a instances. The output can be visualized in the ParaView session after logging in to the web interface of DCV instance.

Finally, after you are done HPC jobs with the cluster and node groups that you created, you should delete the resources that you created to avoid unnecessary charges. To learn more, visit Delete your AWS resources in the AWS documentation.

Things to know
Here are a couple of things that you should know about this feature:

  • Slurm versions – AWS PCS initially supports Slurm 23.11 and offers mechanisms designed to enable customers to upgrade their Slurm major versions once new versions are added. Additionally, AWS PCS is designed to automatically update the Slurm controller with patch versions. To learn more, visit Slurm versions in the AWS documentation.
  • Capacity Reservations – You can reserve EC2 capacity in a specific Availability Zone and for a specific duration using On-Demand Capacity Reservations to make sure that you have the necessary compute capacity available when you need it. To learn more, visit Capacity Reservations in the AWS documentation.
  • Network file systems – You can attach network storage volumes where data and files can be written and accessed, including Amazon FSx for NetApp ONTAP, Amazon FSx for OpenZFS, and Amazon File Cache as well as Amazon EFS and Amazon FSx for Lustre. You can also use self-managed volumes, such as NFS servers. To learn more, visit Network file systems in the AWS documentation.

Now available
AWS Parallel Computing Service is now available in the US East (N. Virginia), AWS US East (Ohio), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe (Frankfurt), Europe (Ireland), Europe (Stockholm) Regions.

AWS PCS launches all resources in your AWS account. You will be billed appropriately for those resources. For more information, see the AWS PCS Pricing page.

Give it a try and send feedback to AWS re:Post or through your usual AWS Support contacts.

Channy

P.S. Special thanks to Matthew Vaughn, a principal developer advocate at AWS for his contribution in creating a HPC testing environment.

2024 ISO and CSA STAR certificates now available with three additional services

Post Syndicated from Atulsing Patil original https://aws.amazon.com/blogs/security/2024-iso-and-csa-star-certificates-now-available-with-three-additional-services/

Amazon Web Services (AWS) successfully completed an onboarding audit with no findings for ISO 9001:2015, 27001:2022, 27017:2015, 27018:2019, 27701:2019, 20000-1:2018, and 22301:2019, and Cloud Security Alliance (CSA) STAR Cloud Controls Matrix (CCM) v4.0. Ernst and Young CertifyPoint auditors conducted the audit and reissued the certificates on July 22, 2024. The objective of the audit was to assess the level of compliance with the requirements of the applicable international standards.

During the audit, we added the following three AWS services to the scope of the certification:

For a full list of AWS services that are certified under ISO and CSA Star, see the AWS ISO and CSA STAR Certified page. Customers can also access the certifications in the AWS Management Console through AWS Artifact.

If you have feedback about this post, submit comments in the Comments section below.

Atul Patil

Atulsing Patil
Atulsing is a Compliance Program Manager at AWS. He has 27 years of consulting experience in information technology and information security management. Atulsing holds a master of science in electronics degree and professional certifications such as CCSP, CISSP, CISM, CDPSE, ISO 27001 Lead Auditor, HITRUST CSF, Archer Certified Consultant, and AWS CCP.

Nimesh Ravas

Nimesh Ravasa
Nimesh is a Compliance Program Manager at AWS. He leads multiple security and privacy initiatives within AWS. Nimesh has 15 years of experience in information security and holds CISSP, CDPSE, CISA, PMP, CSX, AWS Solutions Architect – Associate, and AWS Security Specialty certifications.

Chinmaee Parulekar

Chinmaee Parulekar
Chinmaee is a Compliance Program Manager at AWS. She has 5 years of experience in information security. Chinmaee holds a master of science degree in management information systems and professional certifications such as CISA.

Summer 2024 SOC report now available with 177 services in scope

Post Syndicated from Brownell Combs original https://aws.amazon.com/blogs/security/summer-2024-soc-report-now-available-with-177-services-in-scope/

We continue to expand the scope of our assurance programs at Amazon Web Services (AWS) and are pleased to announce that the Summer 2024 System and Organization Controls (SOC) 1 report is now available. The report covers 177 services over the 12-month period of July 1, 2023–June 30, 2024, so that customers have a full year of assurance with the report. This report demonstrates our continuous commitment to adhere to the heightened expectations for cloud service providers.

Going forward, we will issue SOC reports covering a 12-month period each quarter as follows:

Report Period covered
Spring SOC 1, 2, and 3 April 1–March 31
Summer SOC 1 July 1–June 30
Fall SOC 1, 2, and 3 October 1–September 30
Winter SOC 1 January 1–December 31

Customers can download the Summer 2024 SOC report through AWS Artifact, a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact.

AWS strives to continuously bring services into the scope of its compliance programs to help you meet your architectural and regulatory needs. If you have questions or feedback about SOC compliance, reach out to your AWS account team.

To learn more about our compliance and security programs, see AWS Compliance Programs. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.

If you have feedback about this post, submit comments in the Comments section below.

Brownell Combs
Brownell Combs

Brownell is a Compliance Program Manager at AWS. He leads multiple security and privacy initiatives within AWS. Brownell holds a master of science degree in computer science from University of Virginia and a bachelor of science degree in computer science from Centre College. He has over 20 years of experience in IT risk management and CISSP, CISA, CRISC, and GIAC GCLD certifications.
Paul Hong
Paul Hong

Paul is a Compliance Program Manager at AWS. He leads multiple security, compliance, and training initiatives within AWS, and has 10 years of experience in security assurance. Paul holds CISSP, CEH, and CPA certifications. He has a master’s degree in accounting information systems and a bachelor’s degree in business administration from James Madison University, Virginia.
Tushar Jain
Tushar Jain

Tushar is a Compliance Program Manager at AWS. He leads multiple security, compliance, and training initiatives within AWS. Tushar holds a master of business administration from Indian Institute of Management Shillong, and a bachelor of technology in electronics and telecommunication engineering from Marathwada University. He has over 12 years of experience in information security and holds CCSK and CSXF certifications.
Michael Murphy
Michael Murphy

Michael is a Compliance Program Manager at AWS. He leads multiple security and privacy initiatives within AWS. Michael has 12 years of experience in information security. He holds a master’s degree and a bachelor’s degree in computer engineering from Stevens Institute of Technology. He also holds CISSP, CRISC, CISA, and CISM certifications.
Nathan Samuel
Nathan Samuel

Nathan is a Compliance Program Manager at AWS. He leads multiple security and privacy initiatives within AWS. Nathan has a bachelor of commerce degree from the University of the Witwatersrand, South Africa, and has over 21 years of experience in security assurance. He holds the CISA, CRISC, CGEIT, CISM, CDPSE, and Certified Internal Auditor certifications.
ryan wilks
Ryan Wilks

Ryan is a Compliance Program Manager at AWS. He leads multiple security and privacy initiatives within AWS. Ryan has 13 years of experience in information security. He has a bachelor of arts degree from Rutgers University and holds ITIL, CISM, and CISA certifications.

Now open — AWS Asia Pacific (Malaysia) Region

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/now-open-aws-asia-pacific-malaysia-region/

In March of last year, Jeff Barr announced the plan for an AWS Region in Malaysia. Today, I’m pleased to share the general availability of the AWS Asia Pacific (Malaysia) Region with three Availability Zones and API name ap-southeast-5.

The AWS Asia Pacific (Malaysia) Region is the first infrastructure Region in Malaysia and the thirteenth Region in Asia Pacific, joining the existing Asia Pacific Regions in Hong Kong, Hyderabad, Jakarta, Melbourne, Mumbai, Osaka, Seoul, Singapore, Sydney, and Tokyo and the Mainland China Beijing and Ningxia Regions.

The Petronas Twin Towers in the heart of Kuala Lumpur’s central business district.

The new AWS Region in Malaysia will play a pivotal role in supporting the Malaysian government’s strategic Madani Economy Framework. This initiative aims to improve the living standards of all Malaysians by 2030 while supporting innovation in Malaysia and across ASEAN. The construction and operation of the new AWS Region is estimated to add approximately $12.1 billion (MYR 57.3 billion) to Malaysia’s gross domestic product (GDP) and will support an average of more than 3,500 full-time equivalent jobs at external businesses annually through 2038.

The AWS Region in Malaysia will help to meet the high demand for cloud services while supporting innovation in Malaysia and across Southeast Asia.

AWS in Malaysia
In 2016, Amazon Web Services (AWS) established a presence with its first AWS office in Malaysia. Since then, AWS has provided continuous investments in infrastructure and technology to help drive digital transformations in Malaysia in support of hundreds of thousands of active customers each month.

Amazon CloudFront – In 2017, AWS announced the launch of the first edge location in Malaysia, which helps improve performance and availability for end users. Today, there are four Amazon CloudFront locations in Malaysia.

AWS Direct Connect – To continue helping our customers in Malaysia improve application performance, secure data, and reduce networking costs, in 2017, AWS announced the opening of additional Direct Connect locations in Malaysia. Today, there are two AWS Direct Connect locations in Malaysia.

AWS Outposts – As a fully managed service that extends AWS infrastructure and AWS services, AWS Outposts is ideal for applications that need to run on-premises to meet low latency requirements. Since 2020, customers in Malaysia have been able to order AWS Outposts to be installed at their datacenters and on-premises locations.

AWS customers in Malaysia
Cloud adoption in Malaysia has been steadily gaining momentum in recent years. Here are some examples of AWS customers in Malaysia and how they are using AWS for various workloads:

PayNet – PayNet is Malaysia’s national payments network and shared central infrastructure for the financial market in Malaysia. PayNet uses AWS to run critical national payment workloads, including the MyDebit online cashless payments system and e-payment reporting.

Pos Malaysia Berhad (Pos Malaysia) – Pos Malaysia is the national post and parcel service provider, holding the sole mandate to deliver services under the universal postal service obligation for Malaysia. They migrated critical applications to AWS, which increased their business agility and ability to deliver enhanced customer experiences. Also, they scaled their compute capacity to handle deliveries to more than 11 million addresses and a network of more than 3,500 retail touchpoints using Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic Block Store (Amazon EBS), ensuring disruption-free services.

DerivDeriv, one of the world’s largest online brokers, is using Amazon Q Business to increase productivity, efficiency, and innovation in its operations across customer support, marketing, and recruiting departments. With Amazon Q Business, Deriv has been able to boost productivity and reduce onboarding time by 45 percent.

Asia Pacific University – As one of the leading tech universities in Malaysia, Asia Pacific University (APU) uses AWS serverless technology such as Lambda to reduce operational costs. The automated scalability of AWS services has led to high availability and faster deployment that ensure APU’s applications and services are accessible to the students and staff at all times, enhancing the overall user experience. 

Aerodyne – Aerodyne Group is a DT3 (Drone Tech, Data Tech, and Digital Transformation) solutions provider of drone-based enterprise solutions. They’re running their DRONOS software as a service (SaaS) platform on AWS to help drone operators worldwide grow their businesses.

Building cloud skills together
AWS and various organizations in Malaysia have been working closely to build necessary cloud skills for builders in Malaysia. Here are some of the initiatives:

Program AKAR powered by AWS re/Start – Program AKAR is the first financial services-aligned cloud skills program initiated by AWS and PayNet. This new program aims to bridge the growing skills gap in Malaysia’s digital economy by equipping university students with transferrable skills for careers in the sector. As part of this initial collaboration, PayNet, AWS re/Start, and WEPS have committed to starting the program with 100 students in 2024, with the first 50 from Asia Pacific University serving as a pilot. 

AWS Academy — AWS Academy aims to bridge the gap between industry and academia by preparing students for industry-recognized certifications and careers in the cloud with a free and ready-to-teach cloud computing curriculum. AWS Academy currently runs courses in 48 Malaysian universities, covering various domains. Since 2018, 23,000 students have been trained through this program.

AWS Skills Guild at PETRONAS – PETRONAS, a global energy and solutions provider with a presence in over 50 countries, has been an AWS customer since 2014. AWS is also collaborating with PETRONAS to train their employees using the AWS Skills Guild program.

AWS’s contribution to sustainability in Malaysia
With The Climate Pledge, Amazon is committed to reaching net-zero carbon across its business by 2040 and is on a path to powering its operations with 100 percent renewable energy by 2025.

In September 2023, AWS announced its collaboration with Petronas and Gentari, a global clean energy company, to accelerate sustainability and decarbonization efforts in the global energy transition. Shortly after, in December 2023, AWS customer PKT Logistics Group became the first Malaysian company to join over 300 global companies in The Climate Pledge to accelerate the world’s path to net-zero carbon.

In July 2024, AWS and Zero Waste Management collaborated on the first-ever AWS InCommunities Malaysia initiative, Green Wira Programme, to train educators to build sustainability initiatives in schools to advance Malaysia’s sustainable future.

Amazon is committed to investing and innovating across its businesses to help create a more sustainable future.

Things to know
AWS Community in Malaysia – Malaysia is also home to one AWS Hero, nine AWS Community Builders and about 9,000 community members of three AWS User Groups in various cities in Malaysia. If you’re interested in joining AWS User Groups Malaysia, visit their Meetup and Facebook pages.

AWS Global footprint – With this launch, AWS now spans 108 Availability Zones within 34 geographic Regions around the world. We have also announced plans for 18 more Availability Zones and six more AWS Regions in Mexico, New Zealand, the Kingdom of Saudi Arabia, Taiwan, Thailand, and the AWS European Sovereign Cloud.

Available now – The new Asia Pacific (Malaysia) Region is ready to support your business, and you can find a detailed list of the services available in this Region on the AWS Services by Region page.

To learn more, please visit the AWS Global Infrastructure page, and start building on ap-southeast-5!

Happy building!
— Donnie

AWS Lambda introduces recursive loop detection APIs

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/aws-lambda-introduces-recursive-loop-detection-apis/

This post is written by James Ngai, Senior Product Manager, AWS Lambda, and Aneel Murari, Senior Specialist SA, Serverless.

Today, AWS Lambda is announcing new recursive loop detection APIs that allow you to set recursive loop detection configuration on individual Lambda functions. This allows you to turn off recursive loop detection on functions that intentionally use recursive patterns, avoiding disruption of these workloads. You can use these APIs to avoid disruption to any intentionally recursive workflows as Lambda expands support of recursive loop detection to other AWS services.

Overview

AWS Lambda functions are triggered in response to events generated by various AWS services. These Lambda functions may interact with other AWS services by invoking the corresponding service APIs. Typically, the service and resource that generates the triggering event is distinct from the service and resource that the Lambda function calls. However, due to coding errors or configuration issues, there may be situations where these two resources are the same, leading to an infinite or recursive loop. Such misconfigurations can result in runaway workloads, which can incur unplanned usage and charges to your AWS account. For example, a Lambda function processes messages from an Amazon Simple Notification Service (SNS) topic but then puts the resulting notification back to the same SNS topic. This causes an infinite loop.

Lambda provides a built-in preventative guardrail that detects and stops functions running in a recursive or infinite loop between Lambda, Amazon Simple Queue Service (SQS), and SNS. This feature, known as recursive loop detection, is enabled by default for all Lambda functions. This serves as a protective mechanism against unintended usage and unexpected billing from runaway workloads.

Lambda uses an AWS X-Ray trace header primitive called “lineage” to track the number of times a function has been invoked with an event. When your function code sends an event using a supported AWS SDK version, Lambda increments the counter in the lineage header. If your function is then invoked with the same triggering event more than 16 times, Lambda stops the next invocation for that event and emits an Amazon CloudWatch RecursiveInvocationsDropped metric. If the function is invoked synchronously, Lambda returns a RecursiveInvocationException to the caller. For asynchronous invocations, Lambda sends the event to a dead-letter queue or on-failure destination if one is configured.

You do not need to configure active X-Ray tracing for this feature to work. For more information on this feature and an example scenario, please refer to Detecting and stopping recursive loops in AWS Lambda functions.

Although AWS generally discourages this practice due to the possibility of runaway workloads, some customers intentionally employ recursive patterns in their workflows. Previously, customers that run workloads that intentionally use recursive patterns could only opt-out of recursive loop detection on a per-account basis by contacting AWS Support. With these new APIs, customers can selectively opt-out of recursive loop detection on individual functions while maintaining this preventative guardrail for the remaining functions in their account that do not use recursive code.

Today we are introducing two new API actions for recursive loop detection:

  • GetFunctionRecursiveConfig returns details about a function’s recursive loop detection configuration.
  • PutFunctionRecursiveConfig sets the recursive loop detection configuration for a function. By default, recursive loop detection is turned ON for all functions.

How to use the new recursive loop detection APIs

You can configure recursive loop detection for Lambda functions through the Lambda Console, the AWS CLI, or Infrastructure as Code tools like AWS CloudFormation, AWS Serverless Application Model (AWS SAM), or AWS Cloud Development Kit (CDK). This new configuration option is supported in AWS SAM CLI version 1.123.0 and CDK v2.153.0.

If you turn recursive loop detection off for a function, the metric for RecursiveInvocationsDropped is no longer emitted for that function.

Turning off recursive loop detection on your function means that Lambda no longer prevents recursive invocations caused by misconfiguration. This may lead to unexpected usage and billing to your AWS account. You should explore alternate ways of architecting your workload that do not use recursive patterns. AWS recommends you exercise caution when turning off this guardrail feature.

Setting recursive loop detection configuration on a function using the Lambda Console

You can get recursive loop detection configuration in the AWS Lambda console:

  1. In the AWS Lambda Console, navigate to the Functions page. Select the function that uses intentionally recursive patterns.
  2. Select Configuration. You can find recursive loop detection controls under the Concurrency and recursion detection section.
  3. Recursive loop detection controls in the Lambda console

    Recursive loop detection controls in the Lambda console

  4. Recursive loop detection is turned on by default for all functions. You can change the recursive loop detection configuration of a function by choosing Edit.
  5. To turn off recursive loop detection for a function, select Allow recursive loops and select Save.
Setting to allow recursive loops

Setting to allow recursive loops

Setting recursive loop detection configuration using the AWS CLI

You can get the current recursion loop detection configuration of a Lambda function by using the following CLI command:

aws lambda get-function-recursion-config \
--region $AWS_REGION \
--function-name $FUNCTION_NAME

You can update the recursion loop detection configuration for a Lambda function by using the following CLI command:

aws lambda put-function-recursion-config \
--region $AWS_REGION \
--function-name $FUNCTION_NAME \
--recursive-loop Allow|Terminate

Make sure to set appropriate values for AWS_REGION and FUNCTION_NAME in the previous commands. Setting the put-function-recursion-config parameter to Allow turns off the default behavior of detecting recursive loops. Set this value to Terminate to switch back to default behavior.

Setting recursive loop detection configuration using AWS CloudFormation

You can control the recursive loop detection configuration for a Lambda function by setting the RecursiveLoop resource property in CloudFormation. Setting the value of this property to Allow turns off the default behavior of automatically detecting recursive loops. Set this property to Terminate if you want to switch it back to the default behavior. The following CloudFormation snippet shows RecursiveLoop set to Allow.

LambdaFunction:
    Type: AWS::Lambda::Function                                                                                                                                                                                    
    Properties:                                                                                                                                                                                       
      Code:                                                                                                                                                                                          
        S3Bucket:S3_BUCKET                                                                                                                                                                            
        S3Key: S3_KEY      
      Handler: com.example.App::handleRequest                                                                                                                                                        
      MemorySize: 1024
      Role:                                                                                                                                                                                          
        Fn::GetAtt:                                                                                                                                                                                  
        - LambdaFunctionRole                                                                                                                                                                         
        - Arn                                                                                                                                                                               
      Runtime: java17
      RecursiveLoop : Allow                                                                                                                                                                                                                                                                           
      Timeout: 20                                                                                                                                                                        
      TracingConfig:                                                                                                                                                                               
        Mode: Active                                                                                                                                                                                        

Extending recursive loop detection to additional AWS services

Today, recursive loop detection detects and stops loops between Lambda, SQS, and SNS after approximately 16 invocations. Lambda plans to extend support for recursive loop detection to additional AWS services. Using the APIs, you can turn off recursive loop detection for specific functions that use recursive patterns so that they are not impacted when Lambda expands recursive loop detection to additional AWS services in the future.

One way you can identify functions that use recursive patterns is by using the CloudWatch metric RecursiveInvocationsDropped.

  1. Set a CloudWatch alarm on all Lambda functions for the CloudWatch metric RecursiveInvocationsDropped. Configure the alarm to trigger when the metric is greater than a threshold of zero. Refer to CloudWatch documentation to set alarms. You can use the following CLI command to set this alarm:
  2. aws cloudwatch put-metric-alarm --alarm-name lambda-recursive-alarm --metric-name RecursiveInvocationsDropped --namespace AWS/Lambda --statistic Sum --period 60 --threshold 0 --comparison-operator GreaterThanOrEqualToThreshold --evaluation-periods 1 --alarm-actions $arn-of-sns-notification-topic
  3. When Lambda detects recursive invocations, it will emit the RecursiveInvocationsDropped metric, which will trigger the alarm. Note that Lambda will only detect and stop recursive invocations if all the services within the loop support recursive loop detection.
  4. Navigate to the CloudWatch Console and determine which function has emitted the RecursiveInvocationsDropped metric. On the Browse tab, under Metrics, choose to view metrics By Function Name and search for RecursiveInvocationsDropped. This will list all functions that have emitted that metric.
  5. RecursiveInvocationsDropped metric

    RecursiveInvocationsDropped metric

  6. Determine if recursion is the intended pattern for that function. If so, use the recursive loop detection API to turn off recursive loop detection for this function.

Conclusion

Lambda recursive loop detection automatically detects and stops recursive invocations between Lambda and supported services, preventing runaway workloads. In most cases, you should architect your workloads to avoid any recursive loops. In rare and special circumstances, you may want to turn off the default behavior on a case-by-case basis. The recursive loop detection APIs allow you to set recursive loop detection configuration on individual functions.

This feature is available in all AWS Regions where Lambda supports recursive loop detection.

To learn more about these APIs, refer to the AWS Lambda API Reference.

For more serverless learning resources, visit Serverless Land

Add macOS to your continuous integration pipelines with AWS CodeBuild

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/add-macos-to-your-continuous-integration-pipelines-with-aws-codebuild/

Starting today, you can build applications on macOS with AWS CodeBuild. You can now build artifacts on managed Apple M2 machines that run on macOS 14 Sonoma. AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces ready-to-deploy software packages.

Building, testing, signing, and distributing applications for Apple systems (iOS, iPadOS, watchOS, tvOS, and macOS) requires the use of Xcode, which runs exclusively on macOS. When you build for Apple systems in the AWS Cloud, it is very likely you configured your continuous integration and continuous deployment (CI/CD) pipeline to run on Amazon Elastic Cloud Compute (Amazon EC2) Mac instances.

Since we launched Amazon EC2 Mac in 2020, I have spent a significant amount of time with our customers in various industries and geographies, helping them configure and optimize their pipelines on macOS. In the simplest form, a customer’s pipeline might look like the following diagram.

iOS build pipeline on EC2 Mac

The pipeline starts when there is a new commit or pull request on the source code repository. The repository agent installed on the machine triggers various scripts to configure the environment, build and test the application, and eventually deploy it to App Store Connect.

Amazon EC2 Mac drastically simplifies the management and automation of macOS machines. As I like to describe it, an EC2 Mac instance has all the things I love from Amazon EC2 (Amazon Elastic Block Store (Amazon EBS) volumes, snapshots, virtual private clouds (VPCs), security groups, and more) applied to Mac minis running macOS in the cloud.

However, customers are left with two challenges. The first is to prepare the Amazon Machine Image (AMI) with all the required tools for the build. A minimum build environment requires Xcode, but it is very common to install Fastlane (and Ruby), as well as other build or development tools and libraries. Most organizations require multiple build environments for multiple combinations of macOS and Xcode versions.

The second challenge is to scale your build fleet according to the number and duration of builds. Large organizations typically have hundreds or thousands of builds per day, requiring dozens of build machines. Scaling in and out of that fleet helps to save on costs. EC2 Mac instances are reserved for your dedicated use. One instance is allocated to one dedicated host. Scaling a fleet of dedicated hosts requires a specific configuration.

To address these challenges and simplify the configuration and management of your macOS build machines, today we introduce CodeBuild for macOS.

CodeBuild for macOS is based on the recently introduced reserved capacity fleet, which contains instances powered by Amazon EC2 that are maintained by CodeBuild. With reserved capacity fleets, you configure a set of dedicated instances for your build environment. These machines remain idle, ready to process builds or tests immediately, which reduces build durations. With reserved capacity fleets, your machines are always running and will continue to incur costs as long as they’re provisioned.

CodeBuild provides a standard disk image (AMI) to run your builds. It contains preinstalled versions of Xcode, Fastlane, Ruby, Python, Node.js, and other popular tools for a development and build environment. The full list of tools installed is available in the documentation. Over time, we will provide additional disk images with updated versions of these tools. You can also bring your own custom disk image if you desire.

In addition, CodeBuild makes it easy to configure auto scaling. You tell us how much capacity you want, and we manage everything from there.

Let’s see CodeBuild for macOS in action
To show you how it works, I create a CI/CD pipeline for my pet project: getting started with AWS Amplify on iOS. This tutorial and its accompanying source code explain how to create a simple iOS app with a cloud-based backend. The app uses a GraphQL API (AWS AppSync), a NoSQL database (Amazon DynamoDB), a file-based storage (Amazon Simple Storage Service (Amazon S3)), and user authentication (Amazon Cognito). AWS Amplify for Swift is the piece that glues all these services together.

The tutorial and the source code of the app are available in a Git repository. It includes scripts to automate the build, test, and deployment of the app.

Configuring a new CI/CD pipeline with CodeBuild for macOS involves the following high-level steps:

  1. Create the build project.
  2. Create the dedicated fleet of machines.
  3. Configure one or more build triggers.
  4. Add a pipeline definition file (buildspec.yaml) to the project.

To get started, I open the AWS Management Console, select CodeBuild, and select Create project.

codebuild mac - 1

I enter a Project name and configure the connection to the Source code repository. I use GitHub in this example. CodeBuild also supports GitLab and BitBucket. The documentation has an up-to-date list of supported source code repositories.

codebuild mac - 2

For the Provisioning model, I select Reserved capacity. This is the only model where Amazon EC2 Mac instances are available. I don’t have a fleet defined yet, so I decide to create one on the flight while creating the build project. I select Create fleet.

codebuild mac - 3

On the Compute fleet configuration page, I enter a Compute fleet name and select macOS as Operating system. Under Compute, I select the amount of memory and the quantity of vCPUs needed for my build project, and the number of instances I want under Capacity.

For this example, I am happy to use the Managed image. It includes Xcode 15.4 and the simulator runtime for iOS 17.5, among other packages. You can read the list of packages preinstalled on this image in the documentation.

When finished, I select Create fleet to return to the CodeBuild project creation page.

CodeBuild - create fleet

As a next step, I tell CodeBuild to create a new service role to define the permissions I want for my build environment. In the context of this project, I must include permissions to pull an Amplify configuration and access AWS Secrets Manager. I’m not sharing step-by-step instructions to do so, but the sample project code contains the list of the permissions I added.

codebuild mac - 4

I can choose between providing my set of build commands in the project definition or in a buildspec.yaml file included in my project. I select the latter.

codebuild mac - 5

This is optional, but I want to upload the build artifact to an S3 bucket where I can archive each build. In the Artifact 1 – Primary section, I therefore select Amazon S3 as Type, and I enter a Bucket name and artifact Name. The file name to upload is specified in the buildspec.yaml file.

codebuild mac - 6

Down on the page, I configure the project trigger to add a GitHub WebHook. This will configure CodeBuild to start the build every time a commit or pull request is sent to my project on GitHub.

codebuild - webhook

Finally, I select the orange Create project button at the bottom of the page to create this project.

Testing my builds
My project already includes build scripts to prepare the build, build the project, run the tests, and deploy it to Apple’s TestFlight.

codebuild - project scripts

I add a buildspec.yaml file at the root of my project to orchestrate these existing scripts.

version: 0.2

phases:

  install:
    commands:
      - code/ci_actions/00_install_rosetta.sh
  pre_build:
    commands:
      - code/ci_actions/01_keychain.sh
      - code/ci_actions/02_amplify.sh
  build:
    commands:
      - code/ci_actions/03_build.sh
      - code/ci_actions/04_local_tests.sh
  post_build:
    commands:
      - code/ci_actions/06_deploy_testflight.sh
      - code/ci_actions/07_cleanup.sh
artifacts:
   name: $(date +%Y-%m-%d)-getting-started.ipa
   files:
    - 'getting started.ipa'
  base-directory: 'code/build-release'

I add this file to my Git repository and push it to GitHub with the following command: git commit -am "add buildpsec" buildpec.yaml

On the console, I can observe that the build has started.

codebuild - build history

When I select the build, I can see the log files or select Phase details to receive a high-level status of each phase of the build.

codebuild - phase details

When the build is successful, I can see the iOS application IPA file uploaded to my S3 bucket.

aws s3 ls

The last build script that CodeBuild executes uploads the binary to App Store Connect. I can observe new builds in the TestFlight section of the App Store Connect.

App Store Connect

Things to know
It takes 8-10 minutes to prepare an Amazon EC2 Mac instance and to accept the very first build. This is not specific to CodeBuild. The builds you submit during the machine preparation time are queued and will be run in order as soon as the machine is available.

CodeBuild for macOS works with reserved fleets. Contrary to on-demand fleets, where you pay per minute of build, reserved fleets are charged for the time the build machines are reserved for your exclusive usage, even when no builds are running. The capacity reservation follows the Amazon EC2 Mac 24-hour minimum allocation period, as required by the Software License Agreement for macOS (article 3.A.ii).

A fleet of machines can be shared across CodeBuild projects on your AWS account. The machines in the fleet are reserved for your exclusive use. Only CodeBuild can access the machines.

CodeBuild cleans the working directory between builds, but the machines are reused for other builds. It allows you to use the CodeBuild local cache mechanism to quickly restore selected files after a build. If you build different projects on the same fleet, be sure to reset any global state, such as the macOS keychain, and build artifacts, such as the SwiftPM and Xcode package caches, before starting a new build.

When you work with custom build images, be sure they are built for a 64-bit Mac-Arm architecture. You also must install and start the AWS Systems Manager Agent (SSM Agent). CodeBuild uses the SSM Agent to install its own agent and to manage the machine. Finally, make sure the AMI is available to the CodeBuild organization ARN.

CodeBuild for macOS is available in the following AWS Regions: US East (Ohio, N. Virginia), US West (Oregon), Asia Pacific (Sydney), and Europe (Frankfurt). These are the same Regions that offer Amazon EC2 Mac M2 instances.

Get started today and create your first CodeBuild project on macOS.

— seb

AWS Weekly Roundup: G6e instances, Karpenter, Amazon Prime Day metrics, AWS Certifications update and more (August 19, 2024)

Post Syndicated from Prasad Rao original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-g6e-instances-karpenter-amazon-prime-day-metrics-aws-certifications-update-and-more-august-19-2024/

You know what I find more exciting than the Amazon Prime Day sale? Finding out how Amazon Web Services (AWS) makes it all happen. Every year, I wait eagerly for Jeff Barr’s annual post to read the chart-topping metrics. The scale never ceases to amaze me.

This year, Channy Yun and Jeff Barr bring us behind the scenes of how AWS powered Prime Day 2024 for record-breaking sales. I will let you read the post for full details, but one metric that blows my mind every year is that of Amazon Aurora. On Prime Day, 6,311 Amazon Aurora database instances processed more than 376 billion transactions, stored 2,978 terabytes of data, and transferred 913 terabytes of data.

Amazon Box with checkbox showing a record breaking prime day event powered by AWS

Other news I’m excited to share is that registration is open for two new AWS Certification exams. You can now register for the beta version of the AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer – Associate. These certifications are for everyone—from line-of-business professionals to experienced machine learning (ML) engineers—and will help individuals prepare for in-demand artificial intelligence and machine learning (AI/ML) careers. You can prepare for your exam by following a four-step exam prep plan for AWS Certified AI Practitioner and AWS Certified Machine Learning Engineer – Associate.

Last week’s launches
Here are some launches that got my attention:

General availability of Amazon Elastic Compute Cloud (Amazon EC2) EC2 G6e instances – Powered by NVIDIA L40S Tensor Core GPUs, G6e instances can be used for a wide range of ML and spatial computing use cases. You can use G6e instances to deploy large language models (LLMs) with up to 13B parameters and diffusion models for generating images, video, and audio.

Release of Karpenter 1.0 – Karpenter is a flexible, efficient, and high-performance Kubernetes compute management solution. You can use Karpenter with Amazon Elastic Kubernetes Service (Amazon EKS) or any conformant Kubernetes cluster. To learn more, visit the Karpenter 1.0 launch post.

Drag-and-drop UI for Amazon SageMaker Pipelines – With this launch, you can now quickly create, execute, and monitor an end-to-end AI/ML workflow to train, fine-tune, evaluate, and deploy models without writing code. You can drag and drop various steps of the workflow and connect them together in the UI to compose an AI/ML workflow.

Split, move and modify Amazon EC2 On-Demand Capacity Reservations – With the new capabilities for managing Amazon EC2 On-Demand Capacity Reservations, you can split your Capacity Reservations, move capacity between Capacity Reservations, and modify your Capacity Reservation’s instance eligibility attribute. To learn more about these features, refer to Split off available capacity from an existing Capacity Reservation.

Document-level sync reports in Amazon Q Business – This new feature of Amazon Q Business provides you with a comprehensive document-level report including granular indexing status, metadata, and access control list (ACL) details for every document processed during a data source sync job. You have the visibility of the status of the documents Amazon Q Business attempted to crawl and index as well as the ability to troubleshoot why certain documents were not returned with the expected answers.

Landing zone version selection in AWS Control Tower – Starting with landing zone version 3.1 and above, you can update or reset in-place your landing zone on the current version, or upgrade to a version of your choice. To learn more, visit Select a landing zone version in the AWS Control Tower user guide.

Launch of AWS Support Official channel on AWS re:Post – You now have access to curated content for operating at scale on AWS, authored by AWS Support and AWS Managed Services (AMS) experts. In this new channel, you can find technical solutions for complex problems, operational best practices, and insights into AWS Support and AMS offerings. To learn more, visit the AWS Support Official channel on re:Post.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Regional expansion of AWS Services
Here are some of the expansions of AWS services into new AWS Regions that happened this week:

Amazon VPC Lattice is now available in 7 additional RegionsAmazon VPC Lattice is now available in US West (N. California), Africa (Cape Town), Europe (Milan), Europe (Paris), Asia Pacific (Mumbai), Asia Pacific (Seoul), and South America (São Paulo). With this launch, Amazon VPC Lattice is now generally available in 18 AWS Regions.

Amazon Q in QuickSight is now available in 5 additional Regions Amazon Q in QuickSight is now generally available in Asia Pacific (Mumbai), Canada (Central), Europe (Ireland), Europe (London), and South America (São Paulo), in addition to the existing US East (N. Virginia), US West (Oregon), and Europe (Frankfurt) Regions.

AWS Wickr is now available in the Europe (Zurich) RegionAWS Wickr adds Europe (Zurich) to the US East (N. Virginia), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (London), Europe (Frankfurt), and Europe (Stockholm) Regions that it’s available in.

You can browse the full list of AWS Services available by Region.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS re:Invent 2024 – Dive into the first-round session catalog. Explore all the different learning opportunities at AWS re:Invent this year and start building your agenda today. You’ll find sessions for all interests and learning styles.

AWS Summits – The 2024 AWS Summit season is starting to wrap up! Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Jakarta (September 5), and Toronto (September 11).

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Colombia (August 24), New York (August 28), Belfast (September 6), and Bay Area (September 13).

AWS GenAI Lofts – Meet AWS AI experts and attend talks, workshops, fireside chats, and Q&As with industry leaders. All lofts are free and are carefully curated to offer something for everyone to help you accelerate your journey with AI. There are lofts scheduled in San Francisco (August 14–September 27), São Paulo (September 2–November 20), London (September 30–October 25), Paris (October 8–November 25), and Seoul (November).

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Prasad

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Announcing AWS KMS Elliptic Curve Diffie-Hellman (ECDH) support

Post Syndicated from Patrick Palmer original https://aws.amazon.com/blogs/security/announcing-aws-kms-elliptic-curve-diffie-hellman-ecdh-support/

When using cryptography to protect data, protocol designers often prefer symmetric keys and algorithms for their speed and efficiency. However, when data is exchanged across an untrusted network such as the internet, it becomes difficult to ensure that only the exchanging parties can know the same key. Asymmetric key pairs and algorithms help to solve this problem by allowing a public key to be shared over an untrusted network. And by using a key agreement scheme, two parties can use each other’s public key in combination with their own private key to each derive the same shared secret.

We’re excited to announce that AWS Key Management Service (AWS KMS) now supports Elliptic Curve Diffie-Hellman (ECDH) key agreement on elliptic curve (ECC) KMS keys. You can use the new DeriveSharedSecret API action to enable two parties to establish a secure communication channel by using a derived shared secret.

In this blog post we provide an overview of the new API action and explain how it can help you establish secure communications by exchanging only public keys to obtain a derived shared secret. We then show example commands to demonstrate how AWS KMS and OpenSSL can be used by two parties to derive a shared secret.

With this new DeriveSharedSecret API action, customers can take an external party’s public key and, in combination with a private key that resides within AWS KMS, derive a shared secret which can be used to derive a symmetric encryption key with a key derivation function (KDF). Customers can then use this symmetric encryption key to encrypt data locally within their application.

The same external party can combine their own related private key with the customer’s corresponding public key from AWS KMS to derive the same shared secret.

Now that both parties have the same shared secret, they can generate a symmetric encryption key that can be used to encrypt and decrypt the data they exchange.

DeriveSharedSecret offers a simple and secure way for customers to use their private key from within their application, enabling new asymmetric cryptography use cases for keys protected by AWS KMS, such as elliptic curve integrated encryption scheme (ECIES) or end-to-end encryption (E2EE) schemes.

AWS KMS DeriveSharedSecret overview

The AWS KMS API Reference documentation covers the DeriveSharedSecret API action in more detail than we include in this post. We broadly describe how to interact with the API action, using the following steps:

  1. Create an elliptic curve (ECC) KMS key, selecting that the key be used for KEY_AGREEMENT and choosing one of the supported key specs. You will not be able to modify existing ECC keys to be used for key agreement.
  2. Have another party create an elliptic curve key that matches the key spec you defined for your KMS key.
  3. Retrieve the public key associated with your KMS key by using the existing GetPublicKey API action.
  4. Exchange public keys through a trusted means of exchange with the other party. Note that DeriveSharedSecret expects a base64-encoded DER-formatted public key.
  5. Use the other party’s public key as an input, along with your specified KEY_AGREEMENT key. The only key agreement algorithm supported by AWS KMS at launch is ECDH.
  6. The other party should use the public key retrieved from AWS KMS and the private key associated with their generated ECC key pair to derive a shared secret.

The result of the preceding steps is that both parties have the same output without exchanging secret information. Only public keys were exchanged between the two parties. The output of DeriveSharedSecret is the raw shared secret. This shared secret is the multiplication of points on the elliptic curves and can result in many more bytes than are needed for an encryption key. We recommend that customers use a KDF, following the National Institute of Standards and Technology (NIST) SP800-56A Rev. 3 section 5.8 guidance, to derive encryption keys from this shared secret.

For the purposes of this post, we will demonstrate the steps by using the AWS CLI and OpenSSL command line. AWS has incorporated best practices for customers within the AWS Encryption SDK. You can find more details at AWS KMS ECDH keyrings.

Example use case

An example use case where you might wish to use ECDH key agreement is for end-to-end encryption. Although protocols exist that provide a secure framework for secure communications (for example, within AWS Wickr), we will highlight the simplified high-level steps behind some of these protocols. In our example use case, Alice and Bob are both part of a messaging network. This network is managed by a centralized service, and this service must not be able to access Alice or Bob’s unencrypted messages.

Figure 1: High-level architecture for the service described in the example use case

Figure 1: High-level architecture for the service described in the example use case

As shown in Figure 1, Alice and Bob each have an ECC key pair and participate in the secret derivation by using ECDH, through the following steps:

  1. Alice registers her public key in the centralized key storage service. A detailed discussion of the key storage service is beyond the scope of this post.
  2. Bob, an AWS KMS user, calls the AWS KMS GetPublicKey action to obtain the public key for the ECC KMS key pair.
  3. Bob registers his public key in the same centralized key storage service.
  4. Alice, who wants to exchange encrypted messages with Bob, retrieves Bob’s public key from the centralized key storage service.
  5. Bob gets a notification that Alice wants to communicate with him, and he retrieves Alice’s public key from the centralized key storage service.
  6. Using Bob’s public key and her private key, Alice derives a shared secret by using her cryptography provider.
  7. Using Alice’s public key and his private key, Bob derives a shared secret by using DeriveSharedSecret.
  8. Alice and Bob now have an identical shared secret. From this shared secret, she can create a symmetric encryption key by using a suitable KDF. The symmetric encryption key can be used to create ciphertext that can be sent to Bob.

Example use case walkthrough

You can use the following steps to create a KMS key for ECDH use and derive a shared secret by using AWS KMS. For our demonstration purposes, the user Alice (from our example use case) is using OpenSSL as the cryptography tool. We will show how the AWS KMS user Bob and OpenSSL user Alice can derive a shared secret by using each other’s public key.

General prerequisites

You must have the following prerequisites in place in order to implement the solution:

  • AWS CLI — The latest version is recommended. The example here uses aws-cli/2.15.40 and aws-cli/1.32.110.
  • OpenSSL — The example here uses OpenSSL 3.3.0.
  • Both parties (Alice and Bob, from our example use case) have an ECC key on the same curve. The steps in the next section, Key creation prerequisite, explain how these keys can be created.

Key creation prerequisite

Alice and Bob must use the same ECC curve during key creation. The DeriveSharedSecret API action supports curves ECC_NIST_P256, ECC_NIST_P384, and ECC_NIST_P521, which map to P-256, P-384, and P-521 respectively in OpenSSL. The curves that AWS KMS supports are the curves approved by the U.S. National Institute of Standards and Technology (NIST). Additionally, AWS KMS supports the SM2 key spec only in Amazon Web Services China Regions.

Bob creates an asymmetric KMS key for key agreement purposes

Bob creates a key pair in AWS KMS by using the CreateKey API action. In the following example, Bob creates an ECC key pair with ECC_NIST_P256 for the KeySpec parameter and KEY_AGREEMENT for the KeyUsage parameter.

aws kms create-key \
--key-spec ECC_NIST_P256 \
--key-usage KEY_AGREEMENT \
--description "Example ECDH key pair"

The response looks something like this:

{
    "KeyMetadata": {
        "AWSAccountId": "111122223333",
        "KeyId": "a1b2c3d4-5678-90ab-cdef-EXAMPLE11111",
        "Arn": "arn:aws:kms:us-east-1:111122223333:key/a1b2c3d4-5678-90ab-cdef-EXAMPLE11111",
        "CreationDate": "2024-06-25T13:06:24.888000-07:00",
        "Enabled": true,
        "Description": "Example ECDH key pair",
        "KeyUsage": "KEY_AGREEMENT",
        "KeyState": "Enabled",
        "Origin": "AWS_KMS",
        "KeyManager": "CUSTOMER",
        "CustomerMasterKeySpec": "ECC_NIST_P256",
        "KeySpec": "ECC_NIST_P256",
        "KeyAgreementAlgorithms": [
            "ECDH"
        ],
        "MultiRegion": false
    }
}

You can follow the Creating asymmetric KMS keys documentation to see how to use the AWS Management Console to create a KMS key pair with the same properties as shown here. This example creates a KMS key with a default KMS key policy. You should review and configure your key policy according to the principle of least privilege, as appropriate for your environment.

Note: When a KMS key is created, it will be logged by AWS CloudTrail, a service that monitors and records activity within your account. API calls to the AWS KMS service are logged in CloudTrail, which you can use to audit access to KMS keys.

To allow your KMS key to be identified by a human-readable string rather than by the KeyId value, you can create an alias for the KMS key (replace the target-key-id value of a1b2c3d4-5678-90ab-cdef-EXAMPLE11111 with your KeyId value). This makes it easier to use and manage your KMS keys.

Bob creates an alias for his KMS key by using the CLI with the following command:

aws kms create-alias \
    --alias-name alias/example-ecdh-key \
    --target-key-id a1b2c3d4-5678-90ab-cdef-EXAMPLE11111 

Alice creates an ECC key for key agreement purposes by using OpenSSL

Using the ecparam and genkey option of OpenSSL, Alice creates a P-256 ECC key. The P-256 curve is represented by AWS KMS as ECC_NIST_P256.

Note: For ECDH to work, the curve of the OpenSSL ECC key must be same as the ECC KMS key created by the other party (Bob, in our example use case).

openssl ecparam -name P-256 \
        -genkey -out openssl_ecc_private_key.pem

Key exchange and secret derivation process

The following sections outline the steps that Alice and Bob will follow to share their public keys, retrieve one another’s public key, and then derive the same shared secret using AWS KMS and OpenSSL. The shared secrets derived by Alice and Bob respectively are then compared to show that they both derived the same shared secret.

Step 1: Alice generates and registers her OpenSSL public key with a central service

AWS KMS expects the public key in DER format. Therefore, in this example Alice creates a DER-format public key by using her ECC private key. Alice runs the following command to produce a DER-format file that contains her public key:

openssl ec -in openssl_ecc_private_key.pem \
        -pubout -outform DER \
        > openssl_ecc_public_key.bin.der

The file openssl_ecc_public_key.bin.der will have the public key in DER format, which Alice can store in the centralized key storage service (or send to anyone she would like to communicate with). Details about the centralized key storage service are beyond the scope of this post.

Step 2: Bob obtains the public key for his ECC KMS Key

To retrieve a copy of the public key for his ECC KMS key, Bob uses the GetPublicKey API action. Bob calls this API by using the AWS CLI command get-public-key, as follows:

aws kms get-public-key \
    --key-id alias/example-ecdh-key \
    --output text \
    --query PublicKey | base64 --decode > kms_ecdh_public_key.der

The returned PublicKey value is a DER-encoded X.509 public key. Because the AWS CLI is being used, the public key output is base64-encoded for readability purposes. This base64-encoded value is decoded by using the base64 command, and the decoded value is stored in the output file. The file kms_ecdh_public_key.der contains the DER-encoded public key.

Note: If you call this API by using one of the AWS SDKs, such as Boto3, then the returned PublicKey value is not base64-encoded.

In our example use case, Alice is using OpenSSL, which expects the public key in PEM format. Bob converts his DER-format public key into PEM format by using the following command:

openssl ec -pubin -inform DER -outform PEM \
        -in kms_ecdh_public_key.der \
        -out kms_ecdh_public_key.pem

The file kms_ecdh_public_key.pem contains the public key in PEM format.

Step 3: Bob registers his public key with the centralized key storage service

Bob saves his public key in PEM format, obtained in Step 2, in the centralized key storage service.

Step 4: Alice retrieves Bob’s public key to derive a shared secret

To perform ECDH key agreement, the two parties involved (Alice and Bob, in our example use case) need to exchange their public key with each other. Alice, who wants to send encrypted messages to Bob, retrieves Bob’s public key from the centralized key storage service.

Bob’s public key, kms_ecdh_public_key.pem, is already in PEM format as expected by OpenSSL.

Step 5: Bob retrieves Alice’s public key to derive a shared secret

To perform ECDH key agreement, the two parties involved, Alice and Bob, need to exchange their public key with each other. Bob gets a notification that Alice wants to communicate with him, and he retrieves Alice’s public key from the centralized key storage service.

Alice’s public key, openssl_ecc_public_key.bin.der, is already in DER format as expected by AWS KMS.

Step 6: Alice uses OpenSSL to derive the shared secret

Alice, using her private key and Bob’s public key, can derive the shared secret by using OpenSSL. Alice derives the shared secret by using the OpenSSL pkeyutl command with the derive option, as follows:

openssl pkeyutl -derive \
-inkey openssl_ecc_private_key.pem \
-peerkey kms_ecdh_public_key.pem > openssl.ss

The file openssl.ss will have the shared secret in binary format.

Step 7: Bob uses AWS KMS to derive the shared secret

Bob, using his private key (which remains securely within AWS KMS) and Alice’s public key, can derive the shared secret by using AWS KMS. The following example shows how Bob uses the DeriveSharedSecret API action with the AWS CLI command derive-shared-secret. At launch, the only supported key agreement algorithm is ECDH. Bob passes Alice’s public key for the PublicKey parameter.

aws kms derive-shared-secret \
--key-id alias/example-ecdh-key \
--public-key fileb://path/to/openssl_ecc_public_key.bin.der \
--key-agreement-algorithm ECDH \
--output text --query SharedSecret |base64 --decode > kms.ss

Because the AWS CLI is being used, the returned SharedSecret value is base64-encoded for readability purposes. Using the base64 --decode command, the decoded binary format is stored to the file.

Note: If you call this API by using one of the AWS SDKs, such as Boto3, then the returned SharedSecret value is not base64-encoded.

The file kms.ss will have the shared secret in binary format.

Step 8: Using the shared secret and a suitable KDF, Alice derives an encryption key to encrypt her communication to Bob

You can use the following command to compare the two files containing the derived shared secrets that were obtained in Steps 6 and 7 and verify that they are identical:

diff -qs openssl.ss kms.ss

Because these files are identical, we can see that the same secret was derived using both AWS KMS and OpenSSL.

Using the shared secret, Alice should then derive a symmetric encryption key by using a suitable KDF. She can use this symmetric encryption key to encrypt data and send the ciphertext to Bob.

This blog post does not cover the steps to derive that symmetric encryption key, because that can be a complex topic depending on your use case. However, we note that you should not use the raw shared secret as an encryption key because it is not uniform. In other words, the shared secret has a lot of entropy, but the byte string itself is not random.

NIST recommends that you use a KDF function over the raw shared secret (value Z as described in section 5.8 of NIST SP800-56A Rev. 3). The KDFs that are recommended are described in more detail in NIST SP800-56C Rev. 2. One such example is OpenSSL Single Step KDF (SSKDF) EVP_KDF-SS, but using this KDF involves choosing the other values, such as FixedInfo, carefully.

To help customers make the right choice for the resulting KDF to use on the shared secret, the AWS Encryption SDK now includes AWS KMS ECDH keyrings. The keyring is a construct within the AWS Encryption SDK that you implement within your code. The keyring handles the management of encryption keys while applying best practices to protect your data. You can use the keyring to reference your KMS keys for key agreement, and then call a function to encrypt data. Data will be encrypted by using a derived shared wrapping key following NIST recommendations, and the Encryption SDK applies key commitment to the ciphertext.

Summary

In this blog post, we highlighted how you can use the recently launched DeriveSharedSecret API action to securely derive a shared secret. You’ve seen how ECDH can be used between two parties without having to share secret information across untrusted networks. We explained how you can audit your AWS KMS key usage through AWS CloudTrail logs. We highlighted that you would need to use a KDF to generate a symmetric encryption key from the shared secret. We strongly recommend that you use the AWS Encryption SDK to encrypt your data, which helps make sure that the recommended NIST key derivation functions are used for generating symmetric encryption keys.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Patrick Palmer

Patrick Palmer
Patrick is a Principal Security Specialist Solutions Architect at AWS. He helps customers around the world use AWS services in a secure manner and specializes in cryptography. When not working, he enjoys spending time with his growing family and playing video games.

Raj Puttaiah

Raj Puttaiah
Raj is a Software Development Manager for AWS KMS. Raj leads the development of AWS KMS features, focusing on operational excellence. When not working, Raj spends time with his family hiking the beautiful Washington outdoors, and accompanying his two sons to their activities.

Michael Miller

Michael Miller
Michael is a Senior Solutions Architect at AWS, based in Ireland. He helps public sector customers across the UK and Ireland accelerate their cloud adoption journey and specializes in security and networking. In prior roles, Michael has been responsible for designing architectures and supporting implementations across various sectors including service providers, consultancies, and financial services organizations.

Organize content across business units with enterprise-wide data governance using Amazon DataZone domain units and authorization policies

Post Syndicated from David Victoria original https://aws.amazon.com/blogs/big-data/organize-content-across-business-units-with-enterprise-wide-data-governance-using-amazon-datazone-domain-units-and-authorization-policies/

Amazon DataZone has announced a set of new data governance capabilities—domain units and authorization policies—that enable you to create business unit-level or team-level organization and manage policies according to your business needs. With the addition of domain units, users can organize, create, search, and find data assets and projects associated with business units or teams. With authorization policies, those domain unit users can set access policies for creating projects and glossaries, and using compute resources within Amazon DataZone.

As an Amazon DataZone administrator, you can now create domain units (such as Sales or Marketing) under the top-level domain and assign domain unit owners to further manage the data team’s structure. Amazon DataZone users can log in to the portal to browse and search the catalog by domain units, and subscribe to data produced by specific business units. Additionally, authorization policies can be configured for a domain unit permitting actions such as who can create projects, metadata forms, and glossaries within their domain units. Authorized portal users can then log in to the Amazon DataZone portal and create entities such as projects and create metadata forms using the authorized projects.

Amazon DataZone enables you to discover, access, share, and govern data at scale across organizational boundaries, reducing the undifferentiated heavy lifting of making data and analytics tools accessible to everyone in the organization. With Amazon DataZone, data users like data engineers, data scientists, and data analysts can share and access data across AWS accounts using a unified data portal, allowing them to discover, use, and collaborate on this data across their teams and organizations. Additionally, data owners and data stewards can make data discovery simpler by adding business context to data while balancing access governance to the data in the UI.

In this post, we discuss common approaches to structuring domain units, use cases that customers in the healthcare and life sciences (HCLS) industry encounter, and how to get started with the new domain units and authorization policies features from Amazon DataZone.

Approaches to structuring domain units

Domains are top-level entities that encompass multiple domain units as sub-entities, each with specific policies. Organizations can adopt different approaches when defining and structuring domains and domain units. Some strategies align these units with data domains, whereas others follow organizational structures or lines of business. In this section, we explore a few examples of domains, domain units, and how to organize data assets and products within these constructs.

Domains aligned with the organization

Domain units can be built using the organizational structure, lines of businesses, or use cases. For example, HCLS organizations typically have a range of domains that encompass various aspects of their operations and services. Customers are using domains and domain units to improve searchability and findability of data assets within an organized tree-like structure, and enable individual organizational units to control their own authorization policies.

One of the core benefits of organizing entities as domain units is to enable search and self-service access across various domain units. The following are some common domain units within the HCLS sector:

  • Commercials – Commercial aspects of products or services related to the life sciences and activities such as market analysis, product positioning, pricing, distribution, and customer engagement. There could be several child domain units, such as contract research organization.
  • Research and development – Pharmaceutical and medical device development. Some examples of child domain units include drug discovery and clinical trials management.
  • Clinical services – Hospital and clinic management. Examples of child domain units include physician and nursing services.
  • Revenue cycle management – Patient billing and claims processing. Examples of child domain units include insurance and payer relations.

The following are common domains and domain units that apply across industries:

  • Supply chain and logistics – Procurement and inventory management.
  • Regulatory compliance and quality assurance – Compliance with industry specific regulations, quality management systems, and accreditation.
  • Marketing – Strategies, techniques, and practices aimed at promoting products, services, or ideas to potential customers. Some examples of child domain units are campaigns and events.
  • Sales – Sales process, key performance indicators (KPIs), and metrics.

For example, one of our customers, AWS Data Platform, uses Amazon DataZone to provide secure, trusted, convenient, and fast access to AWS business data.

“At AWS, our vision is to provide customers with reliable, secure, and self-service access to exabyte-scale data while ensuring data governance and compliance. With Amazon DataZone domain units, we are able to organize a vast and growing number of datasets to align with the organizational structure of the customers my teams serve internally. This simplifies data discovery and helps us organize business units’ data in a hierarchical manner for data-driven decision-making at AWS. Amazon DataZone authorization policies coupled with domain units enable a powerful yet flexible way of decentralizing data governance and helps tailor access policies to individual business units. With these features, we are able to reduce the undifferentiated heavy lift while building and managing data products.”

– Arnaud Mauvais, Director of Software Development at AWS.

Domains aligned with data ownership

The term data domain is crucial within the realm of data governance. It signifies a distinct field or classification of data that an organization oversees and regulates. Data domains form a foundational pillar in data governance frameworks. The concept of data domains plays a pivotal role in data governance, empowering organizations to systematically structure, administer, and harness their data assets. This strategic approach aligns data resources with business goals, fostering informed decision-making processes.

You can either define each data domain as a top-level domain or define a top-level data domain (for example, Organization) with several child domain units, such as:

  • Customer data – This domain unit includes all data related to customers, such as customer profiles. Several other child domain units with policies can be built within customer domain units, such as customer interactions and profiles.
  • Financial data – This domain unit encompasses data related to financial information.
  • Human resources data – This domain unit includes employee-related data.
  • Product data – This domain unit covers data related to products or services offered by the organization.

Authorization policies for domains and domain units

Amazon DataZone domain units provide you with a robust and flexible data governance solution tailored to your organizational structure. These domain units empower individual business lines or teams to establish their own authorization policies, enabling self-service governance over critical actions such as publishing data assets and utilizing compute resources within Amazon DataZone. The authorization policies enabled by domain units allow you to grant granular access rights to users and groups, empowering them to manage domain units, project memberships, and creation of content such as projects, metadata forms, glossaries and custom asset types.

Domain governance authorization policies help organizations maintain data privacy, confidentiality, and integrity by controlling and limiting access to sensitive or critical data. They also support data-driven decision-making by making sure authorized users have appropriate access to the information they need to perform their duties. Similarly, authorization policies can help organizations govern the management of organizational domains, collaboration, and metadata. These policies can help define roles like data governance owner, data product owners, and data stewards.

Additionally, these policies facilitate metadata management, glossary administration, and domain ownership, so data governance practices are aligned with the specific needs and requirements of each business line or team. By using domain units and their associated authorization policies, organizations can decentralize data governance responsibilities while maintaining a consistent and controlled approach to data asset and metadata management. This distributed governance model promotes ownership and accountability within individual business lines, fostering a culture of data stewardship and enabling more agile and responsive data management practices.

Use cases for domain units

Amazon DataZone domain units help customers in various industries securely and efficiently govern their data, collaborate on important data management initiatives, and help in complying with relevant regulations. These capabilities are particularly valuable for customers in industries with strict data privacy and security requirements, such as HCLS, financial services, and the public sector. Amazon DataZone domain units enable you to maintain control over your data while facilitating seamless collaboration and helping you adhere to regulations like Health Insurance Portability and Accountability Act (HIPAA), General Data Protection Regulation (GDPR), and others specific to your industry.

The following are key benefits of Amazon DataZone domain units for HCLS customers:

  • Secure and compliant data sharing – Amazon DataZone domain units help provide a secure mechanism for you to share sensitive data, such as protected health information (PHI) and personally identifiable information (PII). This helps organizations with regulatory requirements maintain the privacy and security of their data.
  • Scalable and flexible data management – Amazon DataZone domain units offer a scalable and flexible data management solution that enables you to manage and curate your data, while also enabling efficient data discovery and access.
  • Streamlined collaboration and governance – The platform provides a centralized and controlled environment for teams to collaborate on data-driven projects. It enables effective data governance, allowing you to define and enforce policies, provide clarity on who has access to data, and maintain control over sensitive information.
  • Granular authorization policies – Amazon DataZone domain units allow you to define and enforce fine-grained authorization policies, maintain tight control over your data, and streamline data-driven collaboration and governance across your teams.

Solution overview

On the AWS Management Console, the administrator (AWS account user) creates the Amazon DataZone domain. As the creator of the domain, they can choose to add other single sign-on (SSO) and AWS Identity and Access Management (IAM) users as owners to manage the domain. Under the domain, domain units (such as Sales, Marketing, and Finance) can be created to reflect a hierarchy that aligns with the organization’s data ecosystem. Ownership of these domain units can be assigned to business leaders, who may expand a hierarchy representing their data teams and later set policies that enable users and projects to perform specific actions. With the domain structure in place, you can organize your assets under appropriate domain units. The organization of assets to domain units starts with projects being assigned to a domain unit at time of creation and assets then being cataloged within the project. Catalog consumers then browse the domain hierarchy to find assets related to specific business functions. They can also search for assets using a domain unit as a search facet.

Domain units set the foundation for how authorization policies permit users to perform actions in Amazon DataZone, such as who can create and join projects. Amazon DataZone creates a set of managed authorization policies for every domain unit, and domain unit owners create grants within a policy to users and projects.

There are two Amazon DataZone entities that have policies created on them. The first is a domain unit where the owners can decide who may perform actions such as creating domains, projects, joining projects, creating metadata forms, and so on. The policies have an option to cascade the grant down through child domain units. These policies are managed through the Amazon DataZone portal, and their grants can be applied to two principal types:

  • User-based policies – These policies grant users (IAM, SSO, and SSO groups) permission to perform an action (such as create domain units and projects, join projects, and take ownership of domain units and projects)
  • Project-based policies – These policies grant a project permission to perform an action (such as create metadata forms, glossaries, or custom asset types)

The second Amazon DataZone entity is a blueprint (defines the tools and services for Amazon DataZone environments), where a data platform user (AWS account user) who owns the Amazon DataZone blueprint can decide which projects use their resources through environment profile creation on the Amazon DataZone portal. There are two approaches to specify which projects can use the blueprint to create an environment profile:

  • Account users can use domain units as a delegation mechanism to pass the trust of using the blueprint to a business leader (domain unit owner) on the Amazon DataZone portal
  • Account users can directly grant a specific project permission to use the blueprint

These policies can be managed through the console and Amazon DataZone portal.

The following figure is an example domain structure for the ABC Corp domain. Domain units are created under the ABC Corp domain with domain unit owners assigned. Authorization policies are applied for each domain unit and dictate the actions users and projects can perform.

For more information about Amazon DataZone components, refer to Amazon DataZone terminology and concepts.

In the following sections, we walk through the steps to get started with the data management governance capabilities in Amazon DataZone.

Create an Amazon DataZone domain

With Amazon DataZone, administrators log in to the console and create an Amazon DataZone domain. Additional domain unit owners can be added to help manage the domain. For more information, refer to Managing Amazon DataZone domains and user access.

Create domain units to represent your business units

To create a domain unit, complete the following steps:

  1. Log in to the DataZone data portal and choose Domain in toolbar to view your domain units.
  2. As the domain unit owner, choose Create Domain Unit.
  3. Provide your domain unit details (representing different lines of business).
  4. You can create additional domain units in a nested fashion.
  5. For each domain unit, assign owners to manage the domain unit and its authorization policies.

Apply authorization policies so domain units can self-govern

Amazon DataZone managed authorization policies are available for every domain unit, and domain unit owners can grant access through that policy to users and projects. Policies are either user-based (granted to users) or project-based (granted to projects).

  1. On the Authorization Policies tab of a domain unit, grant authorization policies to users or projects permitting them to perform certain actions. For this example, we choose Project creation policy for the Sales domain.
  2. Choose Add Policy Grant to add either select users and groups, all users, or all groups.

With this, a Sales team member can log in to the data portal and create projects under the Sales domain.

Conclusion

In this post, we discussed common approaches to structuring domain units, use cases that customers in the HCLS industry encounter, and how to get started with the new domain units and authorization policies features from Amazon DataZone.

Domain units provide clean separation between data areas, making the discoverability of data efficient for users. Authorization policies, in combination with domain units, provide the governance layer controlling access to the data and provide control over how the data is cataloged. Together, Amazon DataZone domain units and authorization policies make organization and governance possible across your data.

Amazon DataZone domain units and authorization policies are available in all AWS Regions where Amazon DataZone is available. To learn more, refer to Working with domain units.


About the Authors

David Victoria is a Senior Technical Product Manager with Amazon DataZone at AWS. He focuses on improving administration and governance capabilities needed for customers to support their analytics systems. He is passionate about helping customers realize the most value from their data in a secure, governed manner. Outside of work, he enjoys hiking, traveling, and making his newborn baby laugh.

Nora O Sullivan is a Senior Solutions Architect at AWS. She focuses on helping HCLS customers choose the right AWS services for their data and analytics needs so they can derive value from their data. Outside of work, she enjoys golfing and discovering new wines and authors.

Navneet Srivastava, a Principal Specialist and Analytics Strategy Leader, develops strategic plans for building an end-to-end analytical strategy for large biopharma, healthcare, and life sciences organizations. Navneet is responsible for helping life sciences organizations and healthcare companies deploy data governance and analytical applications, electronic medical records, devices, and AI/ML-based applications while educating customers about how to build secure, scalable, and cost-effective AWS solutions. His expertise spans across data analytics, data governance, AI, ML, big data, and healthcare-related technologies.

Introducing AWS Glue Data Quality anomaly detection

Post Syndicated from Noah Soprala original https://aws.amazon.com/blogs/big-data/introducing-aws-glue-data-quality-anomaly-detection/

Thousands of organizations build data integration pipelines to extract and transform data. They establish data quality rules to ensure the extracted data is of high quality for accurate business decisions. These rules commonly assess the data based on fixed criteria reflecting the current business state. However, when the business environment changes, data properties shift, rendering these fixed criteria outdated and causing poor data quality.

For example, a data engineer at a retail company established a rule that validates daily sales must exceed a 1-million-dollar threshold. After a few months, daily sales surpassed 2 million dollars, rendering the threshold obsolete. The data engineer couldn’t update the rules to reflect the latest thresholds due to lack of notification and the effort required to manually analyze and update the rule. Later in the month, business users noticed a 25% drop in their sales. After hours of investigation, the data engineers discovered that an extract, transform, and load (ETL) pipeline responsible for extracting data from some stores had failed without generating errors. The rule with outdated thresholds continued to operate successfully without detecting this anomaly.

Also, breaks or gaps that significantly deviate from the seasonal pattern can sometimes point to data quality issues. For instance, retail sales may be highest on weekends and holiday seasons while relatively low on weekdays. Divergence from this pattern may indicate data quality issues such as missing data from a store or shifts in business circumstances. Data quality rules with fixed criteria can’t detect seasonal patterns because this requires advanced algorithms that can learn from past patterns and capture seasonality to detect deviations. You need the ability spot anomalies with ease, enabling you to proactively detect data quality issues and make confident business decisions.

To address these challenges, we are excited to announce the general availability of anomaly detection capabilities in AWS Glue Data Quality. In this post, we demonstrate how this feature works with an example. We provide an AWS Cloud Formation template to deploy this setup and experiment with this feature.

For completeness and ease of navigation, you can explore all the following AWS Glue Data Quality blog posts. This will help you understand all the other capabilities of AWS Glue Data Quality, in addition to anomaly detection.

Solution overview

For our use case, a data engineer wants to measure and monitor data quality of the New York taxi ride dataset. The data engineer knows about a few rules, but wants to monitor critical columns and be notified about any anomalies in these columns. These columns include fare amount, and the data engineer wants to be notified about any major deviations. Another attribute is the number of rides, which varies during peak hours, mid-day hours, and night hours. Also, as the city grows, there will be gradual increase in the number of rides overall. We use anomaly detection to help set up and maintain rules for this seasonality and growing trend.

We demonstrate this feature with the following steps:

  1. Deploy a CloudFormation template that will generate 7 days of NYC taxi data.
  2. Create an AWS Glue ETL job and configure the anomaly detection capability.
  3. Run the job for 6 days and explore how AWS Glue Data Quality learns from data statistics and detects anomalies.

Set up resources with AWS CloudFormation

This post includes a CloudFormation template for a quick setup. You can review and customize it to suit your needs. The template generates the following resources:

  • An Amazon Simple Storage Service (Amazon S3) bucket (anomaly-detection-blog-<account-id>-<region>)
  • An AWS Identity and Access Management (IAM) policy to associate with the S3 bucket (anomaly-detection-blog-<account-id>-<region>)
  • An IAM role with AWS Glue run permission as well as read and write permission on the S3 bucket (anomaly_detection_blog_GlueServiceRole)
  • An AWS Glue database to catalog the data (anomaly_detection_blog_db)
  • An AWS Glue visual ETL job to generate sample data (anomaly_detection_blog_data_generator_job)

To create your resources, complete the following steps:

  1. Launch your CloudFormation stack in us-east-1.
  2. Keep all settings as default.
  3. Select I acknowledge that AWS CloudFormation might create IAM resources and choose Create stack.
  4. When the stack is complete, copy the AWS Glue script to the S3 bucket anomaly-detection-blog-<account-id>-<region>.
  5. Open AWS CloudShell.
  6. Run the following command; replace account-id and region as appropriate:
    aws s3 cp s3://aws-blogs-artifacts-public/BDB-4485/scripts/anomaly_detection_blog_data_generator_job.py s3://anomaly-detection-blog-<account-id>-<region>/scripts/anomaly_detection_blog_data_generator_job.py

Run the data generator job

As part of the CloudFormation template, a data generator AWS Glue job is provisioned in your AWS account. Complete the following steps to run the job:

  1. On the AWS Glue console, choose ETL jobs in the navigation pane.
  2. Choose the job
  3. Review the script on the Script
  4. On the Job details tab, verify the job run parameters in the Advanced section:
    1. bucket_name – The S3 bucket name where you want the data to be generated.
    2. bucket_prefix – The prefix in the S3 bucket.
    3. gluecatalog_database_name – The database name in the AWS Glue Data Catalog that was created by the CloudFormation template.
    4. gluecatalog_table_name – The table name to be created in the Data Catalog in the database.
  5. Choose Run to run this job.
  6. On the Runs tab, monitor the job until the Run status column shows as Succeeded.

When the job is complete, it will have generated the NYC taxi dataset for the date range of May 1, 2024, to May 7, 2024, in the specified S3 bucket and cataloged the table and partitions in the Data Catalog for year, month, day, and hour. This dataset contains 7 day of hourly rides that fluctuates between high and low on alternate days. For instance, on Monday, there are approximately 1,400 rides, on Tuesday around 700 rides, and this pattern continues. Of the 7 days, the first 5 days of data is non-anomalous. However, on the sixth day, an anomaly occurs where the number of rows jumps to around 2,200 and the fare_amount is set to an unusually high value of 95 for mid-day traffic.

Create an AWS Glue visual ETL job

Complete the following steps:

  1. On the AWS Glue console, create a new AWS Glue visual job named anomaly-detection-blog-visual.
  2. On the Job details tab, provide the IAM role created by the CloudFormation stack.
  3. On the Visual tab, add an S3 node for the data source.
  4. Provide the following parameters:
    1. For Database, choose anomaly_detection_blog_db.
    2. For Table, choose nyctaxi_raw.
    3. For Partition predicate, enter year==2024 AND month==5 AND day==1.
  1. Add the Evaluate Data Quality transform and add use the following rule for fare_amount:
    Rules = [
        ColumnValues "fare_amount" between 1 and 100
    ]

Because we’re still trying to understand the statistics on this metric, we start with a wide range rule, and after a few runs, we will analyze the results and fine-tune as needed.

Next, we add two analyzers: one for RowCount and another for distinct values of pulocationid.

  1. On the Anomaly detection tab, choose Add analyzer.
  2. For Statistics, enter RowCount.
  3. Add a second analyzer.
  4. For Statistics, enter DistinctValuesCount and for Columns, enter pulocationid.

Your final ruleset should look like the following code:

Rules = [
    ColumnValues "fare_amount" between 1 and 100
]
Analyzers = [
DistinctValuesCount "pulocationid",
RowCount
]
  1. Save the job.

We have now generated a synthetic NYC taxi dataset and authored an AWS Glue visual ETL job to read from this dataset and perform analysis with one rule and two analyzers.

Run and evaluate the visual ETL job

Before we run the job, let’s look at how anomaly detection works. In this example, we have configured one rule and two analyzers. Rules have thresholds to compare what good looks like. Sometimes, you might know the critical columns, but not know specific thresholds. Rules and analyzers gather data statistics or data profiles. In this example, AWS Glue Data Quality will gather four statistics (a ColumnValue rule will gather two statistics, namely minimum and maximum fare amount, and two analyzers will gather two statistics). After gathering three data points from three runs, AWS Glue Data Quality will predict the fourth run along with upper and lower bounds. It will then compare the predicted value with the actual value. When the actual value breaches the predicted upper or lower bounds, it will create an anomaly.

Let’s see this in action.

Run the job for 5 days and analyze results

Because the first 5 days of data is non-anomalous, it will set a baseline with seasonality for training the model. Complete the following steps to run the job five times, once for each day’s partition:

  1. Choose the S3 node on the Visual tab and go to its properties.
  2. Set the day field in the partition predicate to 1.
  3. Choose Run to run this job.
  4. Monitor the job on the Runs tab for Succeeded
  5. Repeat these steps four more times, each time incrementing the day field in the partition predicate. Run the jobs at more or less regular intervals to get a clean graph that simulates the automated scheduled pipeline.
  6. After five successful runs, go to the Data quality tab, where you should see the statistic gathered for fare_amount and RowCount.

The anomaly detection algorithm takes a minimum of three data points to learn and start predicting. After three runs, you may see multiple anomalies detected in your dataset. This is expected because every new trend is seen as an anomaly at first. As the algorithm processes more and more records, it learns from it and sets the upper and lower bounds on your data accurately. The upper and lower bound predictions are dependent on the interval between the job runs.

Also, we can observe that the data quality score is always 100% based on the generic fare_amount rule we set up. You can explore the statistics by choosing the View trends links for each of the metrics to deep dive into the values. For example, the following screenshot shows the values for minimum fare_amount over a set of runs.

The model has predicted the upper bound to be around 1.4 and the lower bound to be around 1.2 for the minimum statistic of the fare_amount metric. When these bounds are breached, it would be considered an anomaly.

Run the job for the sixth (anomalous) day and analyze results

For the sixth day, we process a file that has two known anomalies. With this run, you should see anomalies detected on the graph. Complete the following steps:

  1. Choose the S3 node on the Visual tab and go to its properties.
  2. Set the day field in the partition predicate to 6.
  3. Choose Run to run this job.
  4. Monitor the job on the Runs tab for Succeeded

You should see a screenshot as follows where two anomalies are detected as expected: one for fare_amount with a high value of 95 and one for RowCount with a value of 2776.

Notice that even though the fare_amount score was anomalous and high, the data quality score is still 100%. We will fix this later.

Let’s investigate the RowCount anomaly further. As shown in the following screenshot, if you expand the anomaly record, you can see how the prediction upper bound was breached to cause this anomaly.

Up until this point, we saw how a baseline was set for the model training and statistics collected. We also saw how an anomalous value in our dataset was flagged as an anomaly by the model.

Update data quality rules based on findings

Now that we understand the statistics, lets adjust our ruleset such that when the rules fail, the data quality score is impacted. We take rule recommendations from the anomaly detection feature and add them to the ruleset.

As shown earlier, when the anomaly is detected, it gives you rule recommendations to the right of the graph. For this case, the rule recommendation states the RowCount metric should be between 275.0–1966.0. Let’s update our visual job.

  1. Copy the rule under Rule Recommendations for RowCount.
  2. On the Visual tab, choose the Evaluate Data Quality node, go to its properties, and enter the rule in the rules editor.
  3. Repeat these steps for fare_amount.
  4. You can adjust your final ruleset to look as follows:
    Rules = [
        ColumnValues "fare_amount" <= 52, 
        RowCount between 100 and 1800
    ]
    Analyzers = [
    DistinctValuesCount "pulocationid",
    RowCount
    ]

  5. Save the job, but don’t run it yet.

So far, we have learned how to use statistics collected to adjust the rules and make sure our data quality score is accurate. But there is a problem—the anomalous values influence the model training, forcing the upper and lower bounds to adjust to the anomaly. We need to exclude those data points.

Exclude the RowCount anomaly

When an anomaly is detected in your dataset, the upper and lower bound prediction will adjust to it because it will assume it’s a seasonality by default. After investigation, if you believe that it is indeed an anomaly and not a seasonality, you should exclude the anomaly so it doesn’t impact future predictions.

Because our sixth run is an anomaly, you can complete the following steps to exclude it:

  1. On the Anomalies tab, select the anomaly row you want to exclude.
  2. On the Edit training inputs menu, choose Exclude anomaly.
  3. Choose Save and retrain.
  4. Choose the refresh icon.

If you need to view previous anomalous runs, navigate to the Data quality trend graph, hover over the anomaly data point, and choose View selected run results. This will take you to the job run on a new tab where you can follow the preceding steps to exclude the anomaly.

Alternatively, if you ran the job over a period of time and need to exclude multiple data points, you can do so from the Statistics tab:

  1. On the Data quality tab, go to the Statistics tab and choose View trends for RowCount.
  2. Select the value you want to exclude.
  3. On the Edit training inputs menu, choose Exclude anomaly.
  4. Choose Save and retrain.
  5. Choose the refresh icon.

It may take a few seconds to reflect the change.

The following figure shows how the model adjusted to the anomalies before exclusion.

The following figure shows how the model retrained itself after the anomalies were excluded.

Now that the predictions are adjusted, all future out-of-range values will be detected as anomalies again.

Now you can run the job for day 7, which has non-anomalous data, and explore the trends.

Add an anomaly detection rule

It can be challenging to modify the rule values with the growing business trends. For example, at some point in future, the NYC taxi rows will exceed the now anomalous RowCount value of 2200. As you run the job over a longer period of time, the model matures and fine-tunes itself to the incoming data. At that point, you can make anomaly detection a rule by itself so you don’t have to update the values and can stop the jobs or decrease the data quality score. When there is an anomaly in the dataset, it means that the quality of the data is not good and the data quality score should reflect that. Let’s add a DetectAnomalies rule for the RowCount metric.

  1. On the Visual tab, choose the Evaluate Data Quality node.
  2. For Rule types, search for and choose DetectAnomalies, then add the rule.

Your final ruleset should look like the following screenshot. Notice that you don’t have any values for RowCount.

This is the real power of anomaly detection in your ETL pipeline.

Seasonality use case

The following screenshot shows an example of a trend with a more in-depth seasonality. The NYC taxi dataset has a varying number of rides throughout the day depending on peak hours, mid-day hours, and night hours. The following anomaly detection job ran on the current timestamp every hour to capture the seasonality of the day, and the upper and lower bounds have adjusted to this seasonality. When the number of rides drops unexpectedly within that seasonality trend, it is detected as an anomaly.

We saw how a data engineer can build anomaly detection into their pipeline for the incoming flow of data being processed at regular interval. We also learned how you can make anomaly detection a rule after the model is mature and fail the job, if an anomaly is detected, to avoid redundant downstream processing.

Clean up

To clean up your resources, complete the following steps:

  1. On the Amazon S3 console, empty the S3 bucket created by the CloudFormation stack.
  2. On the AWS Glue console, delete the anomaly-detection-blog-visual AWS Glue job you created.
  3. If you deployed the CloudFormation stack, delete the stack on the AWS CloudFormation console.

Conclusion

This post demonstrated the new anomaly detection feature in AWS Glue Data Quality. Although data quality static and dynamic rules are very useful, they can’t capture data seasonality and how data changes as your business evolves. A machine learning model supporting anomaly detection can understand these complex changes and inform you of anomalies in the dataset. Also, the recommendations provided can help you author accurate data quality rules. You can also enable anomaly detection as a rule after the model has been trained over a longer period of time on a sufficient amount of data.

To learn more about AWS Glue Data Quality, check out AWS Glue Data Quality. If you have any comments or feedback, leave them in the comments section.


About the authors

Noah Soprala is a Solutions Architect based out of Dallas. He is a trusted advisor to his customers in the ISV industry and helps them build innovative solutions using AWS technologies. Noah has over 20+ years of experience in consulting, development and solution delivery.

Shovan Kanjilal is a Senior Analytics and Machine Learning Architect with Amazon Web Services. He is passionate about helping customers build scalable, secure and high-performance data solutions in the cloud.

Shiv Narayanan is a Technical Product Manager for AWS Glue’s data management capabilities like data quality, sensitive data detection and streaming capabilities. Shiv has over 20 years of data management experience in consulting, business development and product management.

Jesus Max Hernandez is a Software Development Engineer at AWS Glue. He joined the team after graduating from The University of Texas at El Paso, and the majority of his work has been in frontend development. Outside of work, you can find him practicing guitar or playing flag football.

Tyler McDaniel is a software development engineer on the AWS Glue team with diverse technical interests, including high-performance computing and optimization, distributed systems, and machine learning operations. He has eight years of experience in software and research roles.

Andrius Juodelis is a Software Development Engineer at AWS Glue with a keen interest in AI, designing machine learning systems, and data engineering.

Spring 2024 SOC 2 report now available in Japanese, Korean, and Spanish

Post Syndicated from Brownell Combs original https://aws.amazon.com/blogs/security/spring-2024-soc-2-report-now-available-in-japanese-korean-and-spanish/

Japanese | Korean | Spanish

At Amazon Web Services (AWS), we continue to listen to our customers, regulators, and stakeholders to understand their needs regarding audit, assurance, certification, and attestation programs. We are pleased to announce that the AWS System and Organization Controls (SOC) 2 report is now available in Japanese, Korean, and Spanish. This translated report will help drive greater engagement and alignment with customer and regulatory requirements across Japan, Korea, Latin America, and Spain.

The Japanese, Korean, and Spanish language versions of the report do not contain the independent opinion issued by the auditors, but you can find this information in the English language version. Stakeholders should use the English version as a complement to the Japanese, Korean, or Spanish versions.

Going forward, the following reports in each quarter will be translated. Spring and Fall SOC 1 controls are included in the Spring and Fall SOC 2 reports, so this translation schedule will provide year-round coverage of the English versions.

  • Spring SOC 2 (April 1 – March 31)
  • Summer SOC 1 (July 1 – June 30)
  • Fall SOC 2 (October 1 – September 30)
  • Winter SOC 1 (January 1 – December 31)

Customers can download the translated Spring 2024 SOC 2 reports in Japanese, Korean, and Spanish through AWS Artifact, a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact.

The Spring 2024 SOC 2 report includes a total of 177 services in scope. For up-to-date information, including when additional services are added, visit the AWS Services in Scope by Compliance Program webpage and choose SOC.

AWS strives to continuously bring services into scope of its compliance programs to help you meet your architectural and regulatory needs. Please reach out to your AWS account team if you have questions or feedback about SOC compliance.

To learn more about our compliance and security programs, see AWS Compliance Programs. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
 


Japanese version

第1四半期 2024 SOC 2 レポートの日本語、韓国語、スペイン語版の提供を開始

当社はお客様、規制当局、利害関係者の声に継続的に耳を傾け、Amazon Web Services (AWS) における監査、保証、認定、認証プログラムに関するそれぞれのニーズを理解するよう努めています。この度、AWS System and Organization Controls (SOC) 2 レポートが、日本語、韓国語、スペイン語で利用可能になりました。この翻訳版のレポートは、日本、韓国、ラテンアメリカ、スペインのお客様および規制要件との連携と協力体制を強化するためのものです。

本レポートの日本語、韓国語、スペイン語版には監査人による独立した第三者の意見は含まれていませんが、英語版には含まれています。利害関係者は、日本語、韓国語、スペイン語版の補足として英語版を参照する必要があります。

今後、四半期ごとの以下のレポートで翻訳版が提供されます。SOC 1 統制は、第1四半期 および 第3四半期 SOC 2 レポートに含まれるため、英語版と合わせ、1 年間のレポートの翻訳版すべてがこのスケジュールで網羅されることになります。

  • 第1四半期 SOC 2 (4 月 1 日〜3 月 31 日)
  • 第2四半期 SOC 1 (7 月 1 日〜6 月 30 日)
  • 第3四半期 SOC 2 (10 月 1 日〜9 月 30 日)
  • 第4四半期 SOC 1 (1 月 1 日〜12 月 31 日)

第1四半期 2024 SOC 2 レポートの日本語、韓国語、スペイン語版は AWS Artifact (AWS のコンプライアンスレポートをオンデマンドで入手するためのセルフサービスポータル) を使用してダウンロードできます。AWS マネジメントコンソール内の AWS Artifact にサインインするか、AWS Artifact の開始方法ページで詳細をご覧ください。

第1四半期 2024 SOC 2 レポートの対象範囲には合計 177 のサービスが含まれます。その他のサービスが追加される時期など、最新の情報については、コンプライアンスプログラムによる対象範囲内の AWS のサービスで [SOC] を選択してご覧いただけます。

AWS では、アーキテクチャおよび規制に関するお客様のニーズを支援するため、コンプライアンスプログラムの対象範囲に継続的にサービスを追加するよう努めています。SOC コンプライアンスに関するご質問やご意見については、担当の AWS アカウントチームまでお問い合わせください。

コンプライアンスおよびセキュリティプログラムに関する詳細については、AWS コンプライアンスプログラムをご覧ください。当社ではお客様のご意見・ご質問を重視しています。お問い合わせページより AWS コンプライアンスチームにお問い合わせください。
 


Korean version

2024년 춘계 SOC 2 보고서의 한국어, 일본어, 스페인어 번역본 제공

Amazon은 고객, 규제 기관 및 이해 관계자의 의견을 지속적으로 경청하여 Amazon Web Services (AWS)의 감사, 보증, 인증 및 증명 프로그램과 관련된 요구 사항을 파악하고 있습니다. AWS System and Organization Controls(SOC) 2 보고서가 이제 한국어, 일본어, 스페인어로 제공됨을 알려 드립니다. 이 번역된 보고서는 일본, 한국, 중남미, 스페인의 고객 및 규제 요건을 준수하고 참여도를 높이는 데 도움이 될 것입니다.

보고서의 일본어, 한국어, 스페인어 버전에는 감사인의 독립적인 의견이 포함되어 있지 않지만, 영어 버전에서는 해당 정보를 확인할 수 있습니다. 이해관계자는 일본어, 한국어 또는 스페인어 버전을 보완하기 위해 영어 버전을 사용해야 합니다.

앞으로 분기마다 다음 보고서가 번역본으로 제공됩니다. SOC 1 통제 조치는 춘계 및 추계 SOC 2 보고서에 포함되어 있으므로, 이 일정은 영어 버전과 함께 모든 번역된 언어로 연중 내내 제공됩니다.

  • 춘계 SOC 2(4/1~3/31)
  • 하계 SOC 1(7/1~6/30)
  • 추계 SOC 2(10/1~9/30)
  • 동계 SOC 1(1/1~12/31)

고객은 AWS 규정 준수 보고서를 필요할 때 이용할 수 있는 셀프 서비스 포털인 AWS Artifact를 통해 한국어, 일본어, 스페인어로 번역된 2024년 춘계 SOC 2 보고서를 다운로드할 수 있습니다. AWS Management Console의 AWS Artifact에 로그인하거나 Getting Started with AWS Artifact(AWS Artifact 시작하기)에서 자세한 내용을 알아보세요.

2024년 춘계 SOC 2 보고서에는 총 177개의 서비스가 포함됩니다. 추가 서비스가 추가되는 시기 등의 최신 정보는 AWS Services in Scope by Compliance Program(규정 준수 프로그램별 범위 내 AWS 서비스)에서 SOC를 선택하세요.

AWS는 고객이 아키텍처 및 규제 요구 사항을 충족할 수 있도록 지속적으로 서비스를 규정 준수 프로그램의 범위에 포함시키기 위해 노력하고 있습니다. SOC 규정 준수에 대한 질문이나 피드백이 있는 경우 AWS 계정 팀에 문의하시기 바랍니다.

규정 준수 및 보안 프로그램에 대한 자세한 내용은 AWS 규정 준수 프로그램을 참조하세요. 언제나 그렇듯이 AWS는 여러분의 피드백과 질문을 소중히 여깁니다. 문의하기 페이지를 통해 AWS 규정 준수 팀에 문의하시기 바랍니다.
 


Spanish version

El informe de SOC 2 primavera 2024 se encuentra disponible actualmente en japonés, coreano y español

Seguimos escuchando a nuestros clientes, reguladores y partes interesadas para comprender sus necesidades en relación con los programas de auditoría, garantía, certificación y acreditación en Amazon Web Services (AWS). Nos enorgullece anunciar que el informe de controles de sistema y organización (SOC) 2 de AWS se encuentra disponible en japonés, coreano y español. Estos informes traducidos ayudarán a impulsar un mayor compromiso y alineación con los requisitos normativos y de los clientes en Japón, Corea, Latinoamérica y España.

Estas versiones del informe en japonés, coreano y español no contienen la opinión independiente emitida por los auditores, pero se puede acceder a esta información en la versión en inglés del documento. Las partes interesadas deben usar la versión en inglés como complemento de las versiones en japonés, coreano y español.

De aquí en adelante, los siguientes informes trimestrales estarán traducidos. Dado que los controles SOC 1 se incluyen en los informes de primavera y otoño de SOC 2, esta programación brinda una cobertura anual para todos los idiomas traducidos cuando se la combina con las versiones en inglés.

  • SOC 2 primavera (del 1/4 al 31/3)
  • SOC 1 verano (del 1/7 al 30/6)
  • SOC 2 otoño (del 1/10 al 30/9)
  • SOC 1 invierno (del 1/1 al 31/12)

Los clientes pueden descargar los informes de SOC 2 primavera 2024 traducidos al japonés, coreano y español a través de AWS Artifact, un portal de autoservicio para el acceso bajo demanda a los informes de cumplimiento de AWS. Inicie sesión en AWS Artifact mediante la Consola de administración de AWS u obtenga más información en Introducción a AWS Artifact.

El informe de SOC 2 primavera 2024 incluye un total de 177 servicios que se encuentran dentro del alcance. Para acceder a información actualizada, que incluye novedades sobre cuándo se agregan nuevos servicios, consulte los Servicios de AWS en el ámbito del programa de conformidad y seleccione SOC.

AWS se esfuerza de manera continua por añadir servicios dentro del alcance de sus programas de conformidad para ayudarlo a cumplir con sus necesidades de arquitectura y regulación. Si tiene alguna pregunta o sugerencia sobre la conformidad de los SOC, no dude en comunicarse con su equipo de cuenta de AWS.

Para obtener más información sobre los programas de conformidad y seguridad, consulte los Programas de conformidad de AWS. Como siempre, valoramos sus comentarios y preguntas; de modo que no dude en comunicarse con el equipo de conformidad de AWS a través de la página Contacte con nosotros.

Brownell Combs

Brownell Combs
Brownell is a Compliance Program Manager at AWS, where he leads multiple security and privacy initiatives. Brownell holds a Master’s Degree in Computer Science from the University of Virginia and a Bachelor’s Degree in Computer Science from Centre College. He has over 20 years of experience in information technology risk management and CISSP, CISA, CRISC, and GIAC GCLD certifications.

Rodrigo Fiuza

Rodrigo Fiuza
Rodrigo is a Security Audit Manager at AWS, based in São Paulo. He leads audits, attestations, certifications, and assessments across Latin America, the Caribbean, and Europe. Rodrigo has worked in risk management, security assurance, and technology audits for the past 12 years.

Paul Hong

Paul Hong
Paul is a Compliance Program Manager at AWS. He leads multiple security, compliance, and training initiatives within AWS and has over 10 years of experience in security assurance. Paul is a CISSP, CEH, and CPA, and holds a Masters of Accounting Information Systems and a Bachelors of Business Administration from James Madison University, Virginia.

Hwee Hwang

Hwee Hwang
Hwee is an Audit Specialist at AWS based in Seoul, South Korea. Hwee is responsible for third-party and customer audits, certifications, and assessments in Korea. Hwee previously worked in security governance, risk, and compliance and is laser focused on building customers’ trust and providing them assurance in the cloud.

Tushar Jain

Tushar Jain
Tushar is a Compliance Program Manager at AWS, where he leads multiple security, compliance, and training initiatives. He holds a Master of Business Administration from Indian Institute of Management, Shillong, India and a Bachelor of Technology in Electronics and Telecommunication Engineering from Marathwada University, India. He has over 12 years of experience in information security and holds CCSK and CSXF certifications.

Eun Jin Kim

Eun Jin Kim
Eun Jin is a security assurance professional working as the Audit Program Manager at AWS. She mainly leads compliance programs in South Korea for the financial sector. She has more than 25 years of experience and holds a Master’s Degree in Management Information Systems from Carnegie Mellon University in Pittsburgh, Pennsylvania and a Master’s Degree in Law from George Mason University in Arlington, Virginia.

Michael Murphy

Michael Murphy
Michael is a Compliance Program Manager at AWS, where he leads multiple security and privacy initiatives. Michael has 12 years of experience in information security. He holds a Master’s Degree in Computer Engineering and a Bachelor’s Degree in Computer Engineering from Stevens Institute of Technology. He also holds CISSP, CRISC, CISA, and CISM certifications.

Nathan Samuel

Nathan Samuel
Nathan is a Compliance Program Manager at AWS, where he leads multiple security and privacy initiatives. Nathan has a Bachelor’s of Commerce degree from the University of the Witwatersrand, South Africa. He has 21 years’ experience in security assurance and holds the CISA, CRISC, CGEIT, CISM, CDPSE, and Certified Internal Auditor certifications.

Seul Un Sung

Seul Un Sung
Seul Un is a Security Assurance Audit Program Manager at AWS, where she has been leading South Korea audit programs, including K-ISMS and RSEFT, for the past four years. She has a Bachelor’s of Information Communication and Electronic Engineering degree from Ewha Womans University and has 14 years of experience in IT risk, compliance, governance, and audit, and holds the CISA certification.

Hidetoshi Takeuchi

Hidetoshi Takeuchi
Hidetoshi is a Senior Audit Program Manager at AWS, based in Japan, leading Japan and India security certification and authorization programs. Hidetoshi has led information technology, cyber security, risk management, compliance, security assurance, and technology audits for the past 28 years and holds the CISSP certifications.

ryan wilks

Ryan Wilks
Ryan is a Compliance Program Manager at AWS, where he leads multiple security and privacy initiatives. Ryan has 13 years of experience in information security. He has a bachelor of arts degree from Rutgers University and holds ITIL, CISM, and CISA certifications.

Introducing data products in Amazon DataZone: Simplify discovery and subscription with business use case based grouping

Post Syndicated from Jason Hines original https://aws.amazon.com/blogs/big-data/introducing-data-products-in-amazon-datazone-simplify-discovery-and-subscription-with-business-use-case-based-grouping/

We are excited to announce a new feature in Amazon DataZone that allows data producers to group data assets into well-defined, self-contained packages (data products) tailored for specific business use cases. For example, a marketing analysis data product can bundle various data assets such as marketing campaign data, pipeline data, and customer data. This simplifies the process for data consumers to find datasets, understand their context through shared metadata, and access comprehensive datasets for specific use cases through a single workflow. With the grouping capabilities of data products, data producers can manage and control access to the underlying data assets with just a few steps.

Customers often face challenges in locating and accessing the fragmented data they need, expending time and resources in the process. With Amazon DataZone, they can use data products to enhance data cataloging and subscription processes, aligning these more closely with business objectives while eliminating redundancy in handling individual assets.

In this post, we highlight the key benefits of data products, outline their essential features and workflows, and demonstrate how customers can use these features for easier publishing, discovery, and subscription.

Key benefits of data products

Customers use Amazon DataZone to create data meshes and adopt a culture that emphasizes data as a product. Amazon DataZone facilitates the publication of data assets from diverse sources that are enriched with their business context. It is crucial to organize assets into cohesive units with relational context to maximize the potential of data as a product and drive business use cases.

Amazon DataZone now offers the capability to group data assets with shared metadata into cohesive, business use case based data products, enhancing both the publishing and subscription processes. Data products provide three core benefits that help customers address their business challenges:

  • Simplified discovery – Data consumers can quickly identify interconnected data assets by searching for and finding them as a single unit. This reduces the time and effort required to find all relevant information and lowers the risk of missing important data.
  • Unified access model – Data products simplify access to data with a single request by implementing a unified access model. This eliminates the need for multiple permissions, speeding up the initiation of data analysis.
  • Reduced administrative overhead – By cataloging assets as data product units, data producers reduce administrative overhead by enabling metadata and access control management at the product level rather than individually. This makes access governance and data utilization more efficient, ensuring alignment with business goals and easy accessibility for its intended use. Data governance teams can monitor consumption rates for these data products, providing valuable insights into data literacy maturity.

For example, one of our customers, Natera, uses Amazon DataZone to create tailored datasets for their specific needs. Mirko Buholzer, VP of software engineering at Natera, says

“At Natera, our mission to revolutionize precision medicine depends on managing and leveraging our vast clinical and genomic data. With the Amazon DataZone data products feature, we can create tailored datasets for specific uses like reproductive health, oncology, or organ transplantation. This streamlines data discovery and access for our researchers and data scientists, enabling quick analysis of relevant data. Additionally, it will help physicians and patients gain deeper insights in combination with our clinical tests, ultimately improving patient outcomes.”

With data products, Amazon DataZone now supports business use case based grouping, enhancing data publishing, discovery, and subscription. This feature enables the following capabilities, as shown in the following image:

  • Data product creation and publishing – Producers can create data products by selecting assets from their project’s inventory, setting up shared metadata, and publishing these products to make them discoverable to consumers.
  • Data discovery and subscription – Consumers can search for and subscribe to data product units. Subscription requests are sent within a single workflow to producers for approval. Subscription approval processes, such as approve, reject, and revoke, ensure that access is managed securely. Once approved, access grants for the individual assets within the data product are automatically managed by the system.
  • Data product lifecycle management – Producers have control over the lifecycle of data products, including the ability to edit them and remove them from the catalog. When a producer edits product metadata or adds or removes assets from a data product, they republish it as a new version, and subscriptions are updated without any reapproval.

Solution overview

To demonstrate these capabilities and workflows, consider a use case where a product marketing team wants to drive a campaign on product adoption. To be successful, they need access to sales data, customer data, and review data of similar products. The sales data engineer, acting as the data producer, owns this data and understands the common requests from customers to access these different data assets for sales-related analysis. The data producer’s objective is to group these assets so consumers, such as the product marketing team, can find them together and seamlessly subscribe to perform analysis.

The following high-level implementation steps show how to achieve this use case with data products in Amazon DataZone and are detailed in the following sections.

  1. Data publisher creates and publishes data product
    1. Create data product – The data publisher (the project contributor for the producing project) provides a name and description and adds assets to the data product.
    2. Curate data product – The data publisher adds a readme, glossaries, and metadata forms to the data product.
    3. Publish data product – The data publisher publishes the data product to make it discoverable to consumers.
  2. Data consumer discovers and subscribes to data product
    1. Search data product – The data consumer (the project member of the consuming project) looks for the desired data product in the catalog.
    2. Request subscription – The data consumer submits a request to access the data product.
    3. Data owner approves subscription request – The data owner reviews and approves the subscription request.
    4. Review access approval and grant – The system manages access grants for the underlying assets.
    5. Query subscribed data – The data consumer receives approval and can now access and query the data assets within the subscribed data product.
  3. Data owner maintains lifecycle of data product
    1. Revise data product – The data owner (the project owner for the producing project) updates the data product as needed.
    2. Unpublish data product – The data owner removes the data product from the catalog if necessary.
    3. Delete data product – The data owner permanently deletes the data product if it is no longer needed.
    4. Revoke subscription – The data owner manages subscriptions and revokes access if required.

Prerequisites

To follow along with this post, ensure the publisher of the product sales data asset has ingested individual data assets into Amazon DataZone. In our use case, a data engineer in sales owns the following AWS Glue tables: customers, order_items, orders, products, reviews, and shipments. The data engineer has added a data source to bring these six data assets into the sales producer project inventory, ingesting the metadata in Amazon DataZone. For instructions on ingesting metadata for AWS Glue tables, refer to Create and run an Amazon DataZone data source for the AWS Glue Data Catalog. For Amazon Redshift, see Create and run an Amazon DataZone data source for Amazon Redshift.

On the producer side, a sales product project has been created with a data lake environment. A data source was created to ingest the technical metadata from the AWS Glue salesdb database, which contains the six AWS Glue tables mentioned previously. On the consumer side, a marketing consumer project with a data lake environment has been established.

Data publisher creates and publishes data product

Sign in to Amazon DataZone data portal as a data publisher in the sales producer project. You can now create a data product to group inventory assets relevant to the sales analysis use case. Use the following steps to create and publish a data product, as shown in the following screenshot.

  1. Select DATA in the top ribbon of the Sales Product Project
  2. Select Inventory data in the navigation pane
  3. Choose DATA PRODUCTS to create a data product

Create data product

Follow these steps to create a data product:

  1. Choose Create new data product. Under Details, in the name field, enter “Sales Data Product.” In the description, enter “A data product containing the following 6 assets: Product, Shipments, Order Items, Orders, Customers, and Reviews,” as shown in the following screenshot.
  2. Select Choose assets to add the data assets. Select CHOOSE on the right side next to each of the six data products. Be sure to go to the second page to select the sixth asset. After all are selected, choose the blue CHOOSE button at the bottom of the page, as shown in the following screenshot. Then choose Create to create the data product.

Curate data product

You can curate the sales data product by adding a readme, glossary term, and metadata forms to provide business context to the data product, as shown in the following screenshot.

  1. Choose Add terms under GLOSSARY TERMS. Select a glossary term that you have added to your glossary, for example, Sales. Refer to Create, edit, or delete a business glossary for how to create a business glossary.
  2. Choose Add metadata form to add a form such as a business owner. Refer to Create, edit, or delete metadata forms for how to create a metadata form. In this example, we added Ownership as a metadata form.

Publish data product

Follow these steps to publish a data product.

  1. Once all the necessary business metadata has been added, choose Publish to publish the data product to the business catalog, as shown in the following screenshot.
  2. In the pop-up, choose Publish data product.

The six data assets in the data product will also be published but will only be discoverable through the data product unless published individually. Consumers cannot subscribe to the individual data assets unless they are published and made discoverable in the catalog separately.

Data consumer discovers and subscribes to data product

Now, as the marketing user, inside of the marketing project, you can find and subscribe to the sales data product.

Search data product

Sign in to the Amazon DataZone data portal as a marketing user in the marketing consumer project. In the search bar, enter “sales” or any other metadata that you added to the sales data product.

Once you find the appropriate data product, select it. You can view the metadata added and see which data assets are included in the data product by selecting the DATA ASSETS tab, as shown in the following screenshot.

Request subscription

Choose Subscribe to bring up the Subscribe to Sales Data Product modal. Make sure the project is your consumer project, for example, Marketing Consumer Project. In Reason for request, enter “Running a marketing campaign for the latest sales play.” Choose SUBSCRIBE.

The request will be routed to the sales producer project for approval.

Data owner approves subscription request

Sign in to Amazon DataZone as the project owner for the sales producer project to approve the request. You will see an alert in the task notification bar. Choose the notification icon on the top right to see the notifications, then choose Subscription Request Created, as shown in the following screenshot.

You can also view incoming subscription requests by choosing DATA in the blue ribbon at the top. Then choose Incoming requests in the navigation pane, REQUESTED under Incoming requests, and then View request, as shown in the following screenshot.

On the Subscription request pop-up, you will see who requested access to the Sales Data Product, from which project, the requested date and time, and their reason for requesting it. You can enter a Decision comment and then choose APPROVE.

Review access approval and grant

The marketing consumer is now approved to access the six assets included in the sales data product. Sign in to Amazon DataZone as a marketing user in the marketing consumer project. A new event will appear, showing that the SUBSCRIPTION REQUEST APPROVED has been completed.

You can view this in two different ways. Choose the notification icon on the top right and then EVENTS under Notifications, as shown in the first following screenshot. Alternatively, select DATA in the blue ribbon bar, then Subscribed data, and then Data products, as shown in the second following screenshot.

Choose the Sales Data Product and then Data assets. Amazon DataZone will automatically add the six data assets to the AWS Glue tables that the marketing consumer can use. Wait until you see that all six assets have been added to one environment, as shown in the following screenshot, before proceeding.

Query subscribed data

Once you complete the previous step, return to the main page of the marketing consumer project by choosing Marketing Consumer Project in the top left pull-down project selector, then choose OVERVIEW. The data can now be consumed through the Amazon Athena deep link on the right side. Choose Query data to open Athena, as shown in the following screenshot. In the Open Amazon Athena window, choose Open Amazon Athena.

A new window will open where the marketing consumer has been federated into the role that Amazon DataZone uses for granting permissions to the marketing consumer project data lake environment. The workgroup defaults to the appropriate workgroup that Amazon DataZone manages. Make sure that the Database under Data is the sub_db for the marketing consumer data lake environment. There will be six tables listed that correspond to the original six data assets added to the sales data product. Run your query. In this case, we used a query that looked for the top five best-selling products, as shown in the following code snippet and screenshot.

SELECT p.product_name, SUM(oi.quantity) AS total_quantity FROM order_items oi JOIN products p ON oi.product_id = p.product_idGROUP BY p.product_nameORDER BY total_quantity DESC 
LIMIT 5;

Data owner maintains lifecycle of data product

Follow these steps to maintain the lifecycle of the data product.

Revise data product

The data owner updates the data product, which includes editing metadata and adding or removing assets as needed. For detailed instructions, refer to Republish data products.

The sales data engineer has been tasked with removing one of the assets, the reviews table, from the sales data product.

  1. Open the SALES PRODUCER PROJECT by selecting it from the top project selector.
  2. Select DATA in the top ribbon.
  3. Select Published data in the navigation pane.
  4. Choose DATA PRODUCTS on the right side.
  5. Choose Sales Data Product.

The following screenshot shows these steps.

Once in the data product, the data engineer can add and remove metadata or assets. In To change any of the assets in the data product, follow these steps, as shown in the following screenshot.

  1. Select ASSETS in Sales Data Product.
  2. Select any of the assets. For this example, we remove the Reviews
  3. Select the three dots on the right side.
  4. Select Remove asset.
  5. A pop-up will appear confirming that you want to remove the asset. Choose Remove. The Reviews asset will now have a status of Removing asset: This asset is still available to subscribers.
  6. Republish the data product to remove access to this asset from all subscribers. Choose REPUBLISH and REPUBLISH DATA PRODUCT in the pop-up.
  7. To confirm the asset has been removed, sign in to the marketing project as the consumer. Open the Amazon Athena deep link on the OVERVIEW After selecting the sub_db associated with the marketing consumer data lake environment, only five tables are visible because the Reviews table was removed from the data product, as shown in the following screenshot.

The consumer doesn’t have to take any action after a data product has been republished. If the data engineer had changed any of the business metadata, such as by adding a metadata form, updating the readme, or adding glossary terms and republishing, the consumer would see those changes reflected when viewing the data product under the subscribed data.

Unpublish data product

The data owner removes the data product from the catalog, making it no longer discoverable to the organization. You can choose to retain existing subscription access for the underlying assets. For detailed instructions, refer to refer to Unpublish data product.

Delete data product

The data owner permanently deletes the data product if it is no longer needed. Before deletion, you need to revoke all subscriptions. This action will not delete the underlying data assets. For detailed instructions, refer to Delete Data Product.

Revoke subscription

The data owner manages subscriptions and may revoke a subscription after it has been approved. For detailed instructions, refer to Revoke subscription.

Cleanup

To ensure no additional charges are incurred after testing, be sure to delete the Amazon DataZone domain. Refer to Delete domains for the process.

Conclusion

Data products are crucial for improving decision-making accuracy and speed in modern businesses. Beyond making raw data available, they offer strategic packaging, curation, and discoverability. Data products help customers address the difficulty of locating and accessing fragmented data, which reduces the time and resources needed to perform this important task.

Amazon DataZone already facilitates data cataloging from various sources. Building on this capability, this new feature streamlines data utilization by bundling data into purpose-built data products aligned with business goals. As a result, customers can unlock the full potential of their data.

The feature is supported in all the AWS commercial Regions where Amazon DataZone is currently available. To get started, check out the Working with data products.


About the authors

Jason Hines is a Senior Solutions Architect, at AWS, specializing in serving global customers in the Healthcare and Life Sciences industries. With over 25 years of experience, he has worked with numerous Fortune 100 companies across multiple verticals, bringing a wealth of knowledge and expertise to his role. Outside of work, Jason has a passion for an active lifestyle. He enjoys various outdoor activities such as hiking, scuba diving, and exploring nature. Maintaining a healthy work-life balance is essential to him.

Ramesh H Singh is a Senior Product Manager Technical (External Services) at AWS in Seattle, Washington, currently with the Amazon DataZone team. He is passionate about building high-performance ML/AI and analytics products that enable enterprise customers to achieve their critical goals using cutting-edge technology. Connect with him on LinkedIn.

Leonardo Gomez is a Principal Analytics Specialist Solutions Architect at AWS. He has over a decade of experience in data management, helping customers around the globe address their business and technical needs. Connect with him on LinkedIn.

AWS Weekly Roundup: Amazon Q Business, AWS CloudFormation, Amazon WorkSpaces update, and more (Aug 5, 2024)

Post Syndicated from Matheus Guimaraes original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-amazon-q-business-aws-cloudformation-amazon-workspaces-update-and-more-aug-5-2024/

Summer is reaching its peak for some of us around the globe, and many are heading out to their favorite holiday destinations to enjoy some time off. I just came back from holidays myself and I couldn’t help thinking about the key role that artificial intelligence (AI) plays in our modern world to help us scale the operation of simple things like traveling. Passport and identity verifications were quick, and thanks to the new airport security system rolling out across the world, so were my bag checks. I watched my backpack with a smile as it rolled along the security check belt with my computer, tablet, and portable game consoles all nicely tucked inside without any fuss.

If it wasn’t for AI, we wouldn’t be able to scale operations to keep up with population growth or the enormous volumes of data we generate on a daily basis. The advent of generative AI took this even further by unlocking the ability to put all this data to use in all kinds of creative ways, driving a new wave of exciting innovations that continues to elevate modern products and services.

This new landscape can be challenging for companies that are learning how generative AI can help them grow or succeed, such as startups. This is why I’m so excited about the AWS GenAI Lofts taking place in the next months around the world.

The AWS GenAI Lofts are collaborative spaces available in different cities around the world for a number of weeks. Startups, developers, investors, and industry experts can meet here while having access to AWS AI experts, and attend talks, workshops, fireside chats, and Q&As with industry leaders. All lofts are free and are carefully curated to offer something for everyone to help you accelerate your journey with AI. There are lofts scheduled in Bengaluru (July 29-Aug 9), San Francisco (Aug 14-Sept 27), Sao Paulo (Sept 2-Nov 20), London (Sept 30-Oct 25), Paris (Oct 8-Nov 25), and Seoul (Nov, pending exact dates). I highly encourage you to have a look at the agendas of a loft near you and drop in to learn more about GenAI and connect with others.

Last week’s launches
Here are some launches that got my attention last week.

Amazon Q Business cross-Region IdC — Amazon Q Business is a generative AI-powered assistant that deeply understands your business by providing connectors that you can easily set up to unify data from various sources such as Amazon S3, Microsoft 365, and more. You can then generate content, answer questions, and even automate tasks that are relevant and specific to your business. Q Business integrates with AWS IAM Identity Center to ensure that data can only be accessed by those who are authorized to do so. Previously, the IAM Identity Center instance had to be located in the same Region as the Q Business application. Now, you can connect to one in a different Region.

Git sync status changes publish to Amazon EventBridgeAWS CloudFormation Git sync is a very handy feature that can help streamline your DevOps operations by allowing you to automatically update your AWS CloudFormation stacks whenever you commit changes to the template or deployment file in source control. As of last week, any sync status change is published in near real-time as an event to EventBridge. This enables you to take your GitOps workflow further and stay on top of your Git repositories or resource sync status changes.

Some AWS Pinpoint’s capabilities are now under AWS End User Messaging — AWS Pinpoint’s SMS, MMS, push, and text to voice capabilities have been shuffled and now are offered through their own service called AWS End User Messaging. There is no impact to existing applications and no changes to APIs, the AWS Command Line Interface (AWS CLI), or IAM policies, however, the new name is now reflected on the AWS Management Console, AWS Billing console dashboard, documentation, and other places.

Amazon WorkSpaces updates — Microsoft Visual Studio Professional and Microsoft Visual Studio Enterprise 2022 are now added to the list of available license included applications on Workspaces Personal. Additionally, Amazon WorkSpaces Thin Client has received Carbon Trust verification. As verified by the Carbon Trust, the total lifecycle carbon emission is 77kg CO2e and 50% of the product is made from recycled materials.

GenAI for the Public Sector — There has been two significant launches that may interest those in the public sector looking into getting started with generative AI. Amazon Bedrock is now a FedRAMP High authorized service in the AWS GovCloud (US-West) Region. Additionally, both Llama 3 8B and Lllama 3 70B are now also available in that Region making this a perfect opportunity to start experimenting with Bedrock and Llama 3 if you have workloads in the AWS GovCloud (US-West) Region.

Customers in Germany can now sign up for AWS using their bank account — That means no debit or credit card is needed to create AWS accounts if you have a billing address in Germany. This can help simplify payment of AWS invoices for some businesses, as well as make it easier for others to get started on AWS.

Learning Materials

These are my recommended learning materials for this week.

AWS Skill Builder — This is more of a broad recommendation, but I’m still surprised that so many people never heard of AWS Skill Builder or have not tried it yet. There is so much learning you can do for free including a lot of hands-on courses. In July alone, AWS Skill Builder has launched 25 new digital training products including AWS SimulLearn and AWS Cloud Quest: Generative AI which are game-based learning experiences. Speaking of that, did you know that if you need to renew your Cloud Practitioner certification you can do it simply by playing the AWS Cloud Quest: Recertify Cloud Practioner game?

Get started with agentic code interpreter — Earlier last month we released a new capability on Agents for Amazon Bedrock which allows agents to dynamically generate and execute code within a secure sandboxed environment. As usual, my colleague Mike Chambers has created a great video and blog post on community.aws showing how you can start using it today.

That’s it for this week. Check back next Monday for another Weekly Roundup!

Amazon OpenSearch Serverless cost-effective search capabilities, at any scale

Post Syndicated from Satish Nandi original https://aws.amazon.com/blogs/big-data/amazon-opensearch-serverless-cost-effective-search-capabilities-at-any-scale/

We’re excited to announce the new lower entry cost for Amazon OpenSearch Serverless. With support for half (0.5) OpenSearch Compute Units (OCUs) for indexing and search workloads, the entry cost is cut in half. Amazon OpenSearch Serverless is a serverless deployment option for Amazon OpenSearch Service that you can use to run search and analytics workloads without the complexities of infrastructure management, shard tuning or data lifecycle management. OpenSearch Serverless automatically provisions and scales resources to provide consistently fast data ingestion rates and millisecond query response times during changing usage patterns and application demand. 

OpenSearch Serverless offers three types of collections to help meet your needs: Time-series, search, and vector. The new lower cost of entry benefits all collection types. Vector collections have come to the fore as a predominant workload when using OpenSearch Serverless as an Amazon Bedrock knowledge base. With the introduction of half OCUs, the cost for small vector workloads is halved. Time-series and search collections also benefit, especially for small workloads like proof-of-concept deployments and development and test environments.

A full OCU includes one vCPU, 6GB of RAM and 120GB of storage. A half OCU offers half a vCPU, 3 GB of RAM, and 60 GB of storage. OpenSearch Serverless scales up a half OCU first to one full OCU and then in one-OCU increments. Each OCU also uses Amazon Simple Storage Service (Amazon S3) as a backing store; you pay for data stored in Amazon S3 regardless of the OCU size. The number of OCUs needed for the deployment depends on the collection type, along with ingestion and search patterns. We will go over the details later in the post and contrast how the new half OCU base brings benefits. 

OpenSearch Serverless separates indexing and search computes, deploying sets of OCUs for each compute need. You can deploy OpenSearch Serverless in two forms: 1) Deployment with redundancy for production, and 2) Deployment without redundancy for development or testing.

Note: OpenSearch Serverless deploys two times the compute for both indexing and searching in redundant deployments.

OpenSearch Serverless Deployment Type

The following figure shows the architecture for OpenSearch Serverless in redundancy mode.

In redundancy mode, OpenSearch Serverless deploys two base OCUs for each compute set (indexing and search) across two Availability Zones. For small workloads under 60GB, OpenSearch Serverless uses half OCUs as the base size. The minimum deployment is four base units, two each for indexing and search. The minimum cost is approximately $350 per month (four half OCUs). All prices are quoted based on the US-East region and 30 days a month. During normal operation, all OCUs are in operation to serve traffic. OpenSearch Serverless scales up from this baseline as needed.

For non-redundant deployments, OpenSearch Serverless deploys one base OCU for each compute set, costing $174 per month (two half OCUs).

Redundant configurations are recommended for production deployments to maintain availability; if one Availability Zone goes down, the other can continue serving traffic. Non-redundant deployments are suitable for development and testing to reduce costs. In both configurations, you can set a maximum OCU limit to manage costs. The system will scale up to this limit during peak loads if necessary, but will not exceed it.

OpenSearch Serverless collections and resource allocations

OpenSearch Serverless uses compute units differently depending on the type of collection and keeps your data in Amazon S3. When you ingest data, OpenSearch Serverless writes it to the OCU disk and Amazon S3 before acknowledging the request, making sure of the data’s durability and the system’s performance. Depending on collection type, it additionally keeps data in the local storage of the OCUs, scaling to accommodate the storage and computer needs.

The time-series collection type is designed to be cost-efficient by limiting the amount of data kept in local storage, and keeping the remainder in Amazon S3. The number of OCUs needed depends on amount of data and the collection’s retention period. The number of OCUs OpenSearch Serverless uses for your workload is the larger of the default minimum OCUs, or the minimum number of OCUs needed to hold the most recent portion of your data, as defined by your OpenSearch Serverless data lifecycle policy. For example, if you ingest 1 TiB per day and have 30 day retention period, the size of the most recent data will be 1 TiB. You will need 20 OCUs [10 OCUs x 2] for indexing and another 20 OCUS [10 OCUs x 2] for search (based on the 120 GiB of storage per OCU). Access to older data in Amazon S3 raises the latency of the query responses. This tradeoff in query latency for older data is done to save on the OCUs cost.

The vector collection type uses RAM to store vector graphs, as well as disk to store indices. Vector collections keep index data in OCU local storage. When sizing for vector workloads both needs into account. OCU RAM limits are reached faster than OCU disk limits, causing vector collections to be bound by RAM space. 

OpenSearch Serverless allocates OCU resources for vector collections as follows. Considering full OCUs, it uses 2 GB for the operating system, 2 GB for the Java heap, and the remaining 2 GB for vector graphs. It uses 120 GB of local storage for OpenSearch indices. The RAM required for a vector graph depends on the vector dimensions, number of vectors stored, and the algorithm chosen. See Choose the k-NN algorithm for your billion-scale use case with OpenSearch for a review and formulas to help you pre-calculate vector RAM needs for your OpenSearch Serverless deployment.

Note: Many of the behaviors of the system are explained as of June 2024. Check back in coming months as new innovations continue to drive down cost.

Supported AWS Regions

The support for the new OCU minimums for OpenSearch Serverless is now available in all regions that support OpenSearch Serverless. See AWS Regional Services List for more information about OpenSearch Service availability. See the documentation to learn more about OpenSearch Serverless.

Conclusion

The introduction of half OCUs gives you a significant reduction in the base costs of Amazon OpenSearch Serverless. If you have a smaller data set, and limited usage, you can now take advantage of this lower cost. The cost-effective nature of this solution and simplified management of search and analytics workloads ensures seamless operation even as traffic demands vary.


About the authors 

Satish Nandi is a Senior Product Manager with Amazon OpenSearch Service. He is focused on OpenSearch Serverless and Geospatial and has years of experience in networking, security and ML and AI. He holds a BEng in Computer Science and an MBA in Entrepreneurship. In his free time, he likes to fly airplanes, hang glide, and ride his motorcycle.

Jon Handler is a Senior Principal Solutions Architect at Amazon Web Services based in Palo Alto, CA. Jon works closely with OpenSearch and Amazon OpenSearch Service, providing help and guidance to a broad range of customers who have search and log analytics workloads that they want to move to the AWS Cloud. Prior to joining AWS, Jon’s career as a software developer included four years of coding a large-scale, eCommerce search engine. Jon holds a Bachelor of the Arts from the University of Pennsylvania, and a Master of Science and a Ph. D. in Computer Science and Artificial Intelligence from Northwestern University.

OSPAR 2024 report now available with 163 services in scope

Post Syndicated from Joseph Goh original https://aws.amazon.com/blogs/security/ospar-2024-report-available-with-163-services-in-scope/

Amazon Web Services (AWS) is pleased to announce the completion of our annual Outsourced Service Provider’s Audit Report (OSPAR) audit cycle on July 1, 2024. The 2024 OSPAR certification cycle includes the addition of 10 new services in scope, bringing the total number of services in scope to 163 in the AWS Asia Pacific (Singapore) Region.

Newly added services in scope include the following:

The Association of Banks in Singapore (ABS) has established the Guidelines on Control Objectives and Procedures for Outsourced Service Providers to provide baseline controls criteria that Outsourced Service Providers (“OSPs”) operating in Singapore should have in place. Successfully completing the OSPAR assessment demonstrates that AWS has implemented a robust system of controls that adhere to these guidelines. This underscores our commitment to fulfill the security expectations for cloud service providers set by the financial services industry in Singapore.

Customers can use OSPAR to streamline their due diligence processes, thereby reducing the effort and costs associated with compliance. OSPAR remains a core assurance program for our financial services customers, as it is closely aligned with local regulatory requirements from the Monetary Authority of Singapore (MAS).

You can download the latest OSPAR report from AWS Artifact, a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact. The list of services in scope for OSPAR is available in the report, and is also available on the AWS Services in Scope by Compliance Program webpage.

As always, we’re committed to bringing new services into the scope of our OSPAR program based on your architectural and regulatory needs. If you have questions about the OSPAR report, contact your AWS account team.

If you have feedback about this post, submit comments in the Comments section below.

Joseph Goh

Joseph Goh
Joseph is the APJ ASEAN Lead at AWS, based in Singapore. He leads security audits, certifications, and compliance programs across the Asia Pacific region. Joseph is passionate about delivering programs that build trust with customers and providing them assurance on cloud security.

Introducing the Māori Data Lens for the Well-Architected Framework

Post Syndicated from Craig Hind original https://aws.amazon.com/blogs/architecture/introducing-the-maori-data-lens-for-the-well-architected-framework/

In Aotearoa New Zealand, we have been listening and learning to better understand Māori aspirations when using cloud technology. We have been learning from Māori customers, partners, and advisors who have helped us on this journey. A common theme was how to safeguard Māori data in a digital world. Together with a group of Māori advisers, we are excited to introduce the first iteration of a Māori Data Lens for the AWS Well-Architected Framework. This lens is the first of its kind for AWS globally that focuses on indigenous data, specifically Māori data considerations.

An AWS Well-Architected Framework lens is designed to provide a technology, industry, or domain specific perspective aligned with the AWS Well-Architected Framework. The Māori Data Lens allows customers to apply important Māori data considerations when designing and operating reliable, secure, efficient, and cost-effective systems in the cloud. This lens is designed to be a living resource that can grow and adapt alongside the evolving questions and considerations Māori have about how to secure and protect their data as a taonga (treasure). We hope this lens will be valuable in empowering individuals and organisations to design, build, and operate applications and workloads in the AWS Cloud in ways that can align with Māori values and expectations across the six pillars of the Well-Architected Framework: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, and Sustainability. These are not a rigid set of rules, but instead a set of guiding principles.

This lens is designed to complement invaluable te ao Māori knowledge and expertise. It’s important to consult, build trust, and reflect Māori voices in digital and technology choices. AWS customers can consult and partner with their Māori customers to build systems in a way that responsibly interact with their Māori data. This lens is a framework of practical questions and considerations. When combined with Māori knowledge and expertise, AWS customers can begin to use cloud technology in a way that empowers adherence to important cultural and ethical dimensions for safeguarding Māori data. As a starting point, we have sought to align the insights shared with us to our AWS best practices for architecting secure, reliable, and cost-effective applications in the AWS Cloud. Together, these guidelines support the durability and protection of Māori data.

At AWS, we have always believed it is essential that our customers have control over their data. We strive to give customers choice in how to secure and manage their data in the cloud in accordance with their needs. This has been true from the very beginning, when we were the only major cloud provider to allow our customers to control the geographic location of their data, never moving customer data without explicit instruction from the customer.

The launch of an AWS Local Zone in Auckland in 2023 and the coming AWS Region in Aotearoa gives all New Zealanders the choice to store their data onshore in Aotearoa New Zealand in the AWS Cloud without compromising on performance, innovation, scale, or security. We also know that some customers have needs that go beyond where their data is stored. We’re committed to expanding our understanding and our capability to help all customers meet their particular needs and best serve their own customers, to protect their data, and to meet legal and regulatory requirements.

Advice and feedback to date has been instrumental in helping to shape this resource, and we are deeply grateful for the ongoing partnership and insights. We recognise there are different perspectives, and that tikanga (protocol/practice) and experience among Māori on this topic continues to evolve. We welcome feedback on enhancing this resource to better serve the needs of our Māori customers and partners. To provide feedback, reach out to us using the feedback feature on the lens document or your local AWS account team.

We’d like to thank AWS partner HTK Group, as well as Māori technology and data experts who advised us on this work including Renata Hakiwai, Lee Timutimu, Nikora Ngaropo, Ngapera Riley, Wade Reweti, Atawhai Tibble and Eli Pohio. In their words:

“In this rapidly evolving digital landscape, the importance of understanding, organising, and harnessing data cannot be overstated. For our Māori communities, this holds even greater significance, as data can be a taonga or a treasure that represents the collective wisdom and knowledge passed down through generations.

Nā tō rourou, nā taku rourou, ka ora ai te iwi. With your food basket and my food basket, the people will thrive.

Mauri ora!”

Read the Māori Data Lens, or contact your AWS account team for more information.