Support Canada’s CCCS PBHVA overlay compliance with the Landing Zone Accelerator on AWS

Post Syndicated from Naranjan Goklani original https://aws.amazon.com/blogs/security/support-canadas-cccs-pbhva-overlay-compliance-with-the-landing-zone-accelerator-on-aws/

Organizations seeking to adhere to the Canadian Centre for Cyber Security (CCCS) Protected B High Value Assets (PBHVA) overlay requirements can use the Landing Zone Accelerator (LZA) on AWS solution with the CCCS Medium configuration to accelerate their compliance journey. To further support customers, AWS recently collaborated with Coalfire to assess and verify the LZA solution’s ability to support CCCS PBHVA overlay controls.

By implementing the PBHVA control overlay over a CCCS Medium baseline, you can better protect your organization’s most critical assets from potential threats and vulnerabilities, providing continuity of essential government operations and safeguarding sensitive information.

Understanding CCCS PBHVA overlay requirements

The CCCS PBHVA overlay consists of 137 controls designed to protect high-value assets, including 69 new controls and 68 controls from CCCS Medium. These controls provide enhanced data protection, particularly for integrity and availability, and are based on NIST SP 800-53 Revision 5.

Key findings from the Coalfire assessment

Coalfire’s assessment found that the LZA on AWS solution significantly supports CCCS PBHVA overlay compliance requirements:

  • 71 percent of in-scope controls (97 of 137) are supported by the AWS contribution to compliance in the shared responsibility model
  • The solution uses over 35 AWS services to provide comprehensive security capabilities
  • Strong network segmentation is achieved through network account and network-boundary VPC design
  • Infrastructure-as-code (IaC) enables reliable build and deployment results

The 29 percent of controls not addressed by the LZA are on the customer side of the shared responsibility model. They are addressed in the customer’s application stack or as non-technical controls such as policies and procedures.

Key security capabilities

The LZA solution implements several critical security features:

Implementation considerations

While the LZA solution provides significant compliance support, organizations should note:

  • The solution alone does not guarantee compliance
  • Organizations must implement their own policies, standards, and procedures
  • A thorough understanding of the shared responsibility model is essential

The AWS Landing Zone Accelerator Verified Reference Architecture documentation is available for customer download in AWS Artifact. This resource can help organizations reduce the time and effort required to deploy an environment that aligns with CCCS PBHVA overlay requirements.

Conclusion

The Coalfire assessment confirms that the LZA on AWS solution provides effective support for CCCS PBHVA overlay compliance objectives. However, organizations should remember that compliance is an ongoing process that requires active management and cannot be achieved through technology alone.

For more information about implementing the Landing Zone Accelerator for CCCS PBHVA overlay requirements, contact your AWS account team or the AWS Public Sector team directly.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
 

Naranjan Goklani
Naranjan Goklani

Naranjan is an Audit Lead for Canada based in Toronto. He has experience leading audits, attestations, certifications, and assessments across North America and Europe. Naranjan has more than 15 years of experience in risk management, security assurance, and performing technology audits. Naranjan previously worked in one of the Big 4 accounting firms and supported clients from the financial services, technology, retail, e-commerce, and utilities industries as part of the first and third line of defense.
Michael Davie
Michael Davie

Michael is the Canada lead for Amazon Web Services (AWS) Compliance and Security Assurance. He works with customers, regulators, and AWS teams to help raise the bar on secure cloud adoption and usage. Michael has more than 20 years of experience working in the defence, intelligence, and technology sectors in Canada, and is a licensed professional engineer.
James Kierstead
James Kierstead

James is a senior solutions architect at Amazon Web Services (AWS) based in Ottawa, Canada. He is passionate about helping Canada’s federal government use AWS to deliver services to Canadians.

Anthropic’s Claude 3.7 Sonnet hybrid reasoning model is now available in Amazon Bedrock

Post Syndicated from Esra Kayabali original https://aws.amazon.com/blogs/aws/anthropics-claude-3-7-sonnet-the-first-hybrid-reasoning-model-is-now-available-in-amazon-bedrock/

Amazon Bedrock is expanding its foundation model (FM) offerings as the generative AI field evolves. Today, we’re excited to announce the availability of Anthropic’s Claude 3.7 Sonnet foundation model in Amazon Bedrock. As Anthropic’s most intelligent model to date, Claude 3.7 Sonnet stands out as their first hybrid reasoning model capable of producing quick responses or extended thinking, meaning it can work through difficult problems using careful, step-by-step reasoning. Additionally, today we are adding Claude 3.7 Sonnet to the list of models used by Amazon Q Developer. Amazon Q is built on Bedrock, and with Amazon Q you can use the most appropriate model for a specific task such as Claude 3.7 Sonnet, for more advanced coding workflows that enable developers to accelerate building across the entire software development lifecycle.

Key highlights of Claude 3.7 Sonnet
Here are several notable features and capabilities of Claude 3.7 Sonnet in Amazon Bedrock.

The first Claude model with hybrid reasoning – Claude 3.7 Sonnet takes a different approach to how models think. Instead of using separate models—one for quick answers and another for solving complex problems—Claude 3.7 Sonnet integrates reasoning as a core capability within a single model. This combination is more similar to how the human brains works. After all, we use the same brain whether we’re answering a simple question or solving a difficult puzzle.

The model has two modes—standard and extended thinking mode—which can be toggled in Amazon Bedrock. In standard mode, Claude 3.7 Sonnet is an improved version of Claude 3.5 Sonnet. In extended thinking mode, Claude 3.7 Sonnet takes additional time to analyze problems in detail, plan solutions, and consider multiple perspectives before providing a response, allowing it to make further gains in performance. You can control speed and cost by choosing when to use reasoning capabilities. Extended thinking tokens count towards the context window and are billed as output tokens.

Anthropic’s most powerful model for coding – Claude 3.7 Sonnet is state-of-the art for coding, excelling in understanding context and creative problem solving, and according to Anthropic, achieves an industry-leading 70.3% for standard mode on SWE-bench Verified. Claude 3.7 Sonnet also performs better than Claude 3.5 Sonnet across the majority of benchmarks. These enhanced capabilities make Claude 3.7 Sonnet ideal for powering AI agents and complex workflows.

Claude 3.7 Sonnet benchmarks

Source: https://www.anthropic.com/news/claude-3-7-sonnet

Over 15x longer output capacity than its predecessor – Compared to Claude 3.5 Sonnet, this model offers significantly expanded output length. This enhanced capacity is particularly useful when you explicitly request more detail, ask for multiple examples, or request additional context or background information. To achieve long outputs, try asking for a detailed outline (for writing use cases, you can specify outline detail down to the paragraph level and include word count targets). Then, ask for the response to index its paragraphs to the outline and reiterate the word counts. Claude 3.7 Sonnet supports outputs up to 128K tokens long (up to 64K as generally available and up to 128K as a beta).

Adjustable reasoning budget – You can control the budget for thinking when you use Claude 3.7 Sonnet in Amazon Bedrock. This flexibility helps you weigh the trade-offs between speed, cost, and performance. By allocating more tokens to reasoning for complex problems or limiting tokens for faster responses, you can optimize performance for your specific use case.

Claude 3.7 Sonnet in action
As for any new model, I have to request access in the Amazon Bedrock console. In the navigation pane, I choose Model access under Bedrock configurations. Then, I choose Modify model access to request access for Claude 3.7 Sonnet.

Model access in Amazon Bedrock

To try Claude 3.7 Sonnet, I choose Chat / Text under Playgrounds in the navigation pane. Then I choose Select model and choose Anthropic under the Categories and Claude 3.7 Sonnet under the Models. To enable the extended thinking mode, I toggle Model reasoning under Configurations. I type the following prompt, and choose Run:

You're the manager of a small restaurant facing these challenges:

Three staff members called in sick for tonight's dinner service
You're expecting a full house (80 seats)
There's a large party of 20 coming at 7 PM
Your main chef is available but two kitchen helpers are among those who called in sick
You have 2 regular servers and 1 trainee available
How would you:

Reorganize the available staff to handle the situation
Prioritize tasks and service
Determine if you need to make any adjustments to reservations
Handle the large party while maintaining service quality
Minimize negative impact on customer experience
Explain your reasoning for each decision and discuss potential trade-offs


Chat / Text playground

Here’s the result with an animated image showing the reasoning process of the model.

Testing Claude 3.7 Sonnet reasoning

To test image-to-text vision capabilities, I upload an image of a detailed architectural site plan created using Amazon Bedrock. I receive a detailed analysis and reasoned insights of this site plan.

Claude 3.7 Sonnet can also be accessed through AWS SDK by using Amazon Bedrock API. To learn more about Claude 3.7 Sonnet’s features and capabilities, visit the Anthropic’s Claude in Amazon Bedrock product detail page.

Get started with Claude 3.7 Sonnet today
Claude 3.7 Sonnet’s enhanced capabilities can benefit multiple industry use cases. Businesses can create advanced AI assistants and agents that interact directly with customers. In fields such as healthcare, it can assist in medical imaging analysis and research summarization, and financial services can benefit from its abilities to solve complex financial modeling problems. For developers, it serves as a coding companion that can review code, explain technical concepts, and suggest improvements across different languages.

Anthropic’s Claude 3.7 Sonnet is available today in the US East (N. Virginia), US East (Ohio), and US West (Oregon) Regions. Check the full Region list for future updates.

Claude 3.7 Sonnet is priced competitively and matches the price of Claude 3.5 Sonnet. For pricing details, refer to the Amazon Bedrock pricing page.

To get started with Claude 3.7 Sonnet in Amazon Bedrock, visit the Amazon Bedrock console and Amazon Bedrock documentation.

— Esra

Four ways to grant cross-account access in AWS

Post Syndicated from Anshu Bathla original https://aws.amazon.com/blogs/security/four-ways-to-grant-cross-account-access-in-aws/

As your Amazon Web Services (AWS) environment grows, you might develop a need to grant cross-account access to resources. This could be for various reasons, such as enabling centralized operations across multiple AWS accounts, sharing resources across teams or projects within your organization, or integrating with third-party services. However, granting cross-account access requires careful consideration of your security, availability, and manageability requirements.

In this blog post, we explore four different ways to grant cross-account access using resource-based policies. Each method has its own unique tradeoffs, and the best choice depends on your specific requirements and use case.

Evaluating different techniques for granting cross-account access

Cross-account access is granted by identity-based policies and resource-based policies in AWS Identity and Access Management (IAM). Identity-based policies attach to an IAM role, while resource-based polices attach to resources like Amazon Simple Storage Service (Amazon S3) buckets and AWS Key Management Service (AWS KMS) keys. Resource-based policies require you to specify one or more principals (IAM users or roles) that are allowed to access the resource.

Your choice of how to specify the principal in a resource-based policy impacts some aspects of both the confidentiality and the availability of your solution. Understanding this impact and making the right tradeoffs for your use case is the focus of this post.

An example scenario

Imagine that you have an S3 bucket in your AWS account (Account A) that needs to be accessed by different principals in another AWS account (Account B). For this scenario, we assume that the principals in Account B have the necessary access to S3 in their identity-based policies, and we will focus on authoring the resource-based policies in Account A. While the methods explained here use Amazon S3, the concepts discussed apply to all AWS services that support resource-based policies. In the following sections, we walk through four different ways to grant cross-account access in this scenario and discuss the tradeoffs of each.

Method 1: Grant access to a specific IAM role using the Principal element of the resource-based policy

In this example, you use an S3 bucket policy to grant access to a specific IAM role (RoleFromAccountB) in Account B by specifying the IAM role’s Amazon Resource Name (ARN) in the Principal element of the policy in Account A.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowRoleInThePrincipalElement",
      "Principal": {
        "AWS": "arn:aws:iam::111122223333:role/RoleFromAccountB"
      },
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::amzn-s3-demo-bucket-account-a/*"
    }
  ]
}

Using this bucket policy, if someone in Account B deletes or recreates the role (RoleFromAccountB), then that role can no longer access the amzn-s3-demo-bucket-account-a bucket, even if that role is recreated with the same name. The reason is that when you save this policy, the role ARN is mapped to the unique ID of the role, which looks something like this: AROADBQP57FF2AEXAMPLE. You will see a role identifier in the Principal element of your resource-based policies if you view them after you delete the role that they referenced.

This behavior is intentional. The resource-based policy only allows the specific instance of the role that you set as principal at the time of policy creation. This helps prevent unintended access to your resources if you delete a role, but forget to update your resource-based policy to remove that role. This behavior can also cause an availability risk because the role (RoleFromAccountB) will have a new unique ID when it is recreated and will no longer have access to the bucket. Roles can be recreated for a number of reasons, including accidentally when you use tools such as infrastructure as code.

You might consider choosing this method if:

  • You own the roles in both Account A and Account B and can control the creation and deletion of these roles.
  • You want your resource-based policy in Account A to stop granting access when the specified role (RoleFromAccountB) is deleted.
  • You prioritize granular access control over potential availability concerns if the role (RoleFromAccountB) is deleted.

Method 2: Grant access to an account using the Principal element of the resource-based policy

In this example, you grant access to a specific account in the Principal element of the resource-based policy. This resource-based policy of Account A allows any user or role from Account B that also has an identity-based policy that grants them access to read the objects.

Note: You can use either "Principal": {"AWS": "111122223333"} or "Principal": {"AWS": "arn:aws:iam::111122223333:root"} in the Principal element. They are equivalent, and the long-form ARN does not represent the root user.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAccountInThePrincipalElement",
      "Principal": {
        "AWS": "111122223333"
      },
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::amzn-s3-demo-bucket-account-a/*"
    }
  ]
}

This resource-based policy helps avoid the potential availability issue discussed for Method 1. If a role in Account B that needs to have access to the bucket is recreated, it will still have access after the recreation of that role. This is because you don’t specify a role in the Principal element—instead, you specify an account. If you use Method 2, you must be comfortable delegating access control decisions to the owner of that account.

This approach explicitly delegates access control decisions to IAM in the other account (Account B). Principals in Account B have access to this bucket if allowed by their identity-based policies.

You might consider choosing this method if:

  • You need to grant access to many principals in Account B.
  • You want to delegate the access decision in the account where the principal exists (Account B).
  • You prioritize ease of management and availability over granular access control.

Method 3: Grant access to a specific IAM role using the aws:PrincipalArn condition

This method expands on Method 2 and adds a condition that grants access only to a specific IAM role. Similar to Method 2, you use the account number as the value of the Principal element, but also use the aws:PrincipalArn condition key to limit access to a specific principal in Account B.

The aws:PrincipalArn condition key is a global condition key that compares the ARN of the principal that made the request with the ARN that you specify in the policy. For IAM roles, the request context returns the ARN of the role, not the ARN of the user that assumed the role.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAccountInPrincipalAndRoleInPrincipalArn",
      "Principal": {
        "AWS": "111122223333"
      },
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::amzn-s3-demo-bucket-account-a/*",
      "Condition": {
        "ArnEquals": {
          "aws:PrincipalArn": "arn:aws:iam::111122223333:role/RoleFromAccountB"
        }
      }
    }
  ]
}

This policy comes with the same availability benefits as the policy in Method 2: access to this resource will survive role recreation. This is because the role is translated to its unique identifier only when it is used in the Principal element. It is not translated to a unique identifier when it is used in a condition. If the role (RoleFromAccountB) in Account B is recreated, accidentally or intentionally, the policy will continue to grant access because the role matches the role ARN specified in the condition key of the resource-based policy in Account A. As a result, Method 3 provides a balanced approach to availability and security.

You might consider choosing this method if:

  • You are comfortable that this policy will continue to grant access to the role specified in the aws:PrincipalArn condition key if that role (RoleFromAccountB) is recreated.
  • You don’t own the Account B you are granting access to and don’t control when that role may be recreated.
  • You want a balance of availability and confidentiality.

Method 4: Grant access to an entire AWS Organizations organization

This method is focused on a different use case and is not an alternative to the methods listed earlier. Use this method if you have a resource (an S3 bucket, in this example) that you want to share with your entire organization, but not share with anyone outside of it.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowAccessToAnEntireOrganization",
      "Principal": {
        "AWS": "*"
      },
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::amzn-s3-demo-bucket-account-a/*",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalOrgId": "o-12345"
        },
        "StringNotEquals": {
          "aws:PrincipalAccount": "${aws:ResourceAccount}"
        }
      }
    }
  ]
}

There is no way to specify an organization by using the Principal element of a resource-based policy, so you must use the aws:PrincipalOrgId condition key to restrict access to a specific organization. In this policy, you specify a wildcard in the Principal element, which says that anyone can access the bucket. Then the condition reduces “anyone” to just those AWS account principals that belong to the specified organization and have an identity-based policy that allows them access.

You then add an additional conditional block that compares the aws:PrincipalAccount condition key to the aws:ResourceAccount condition key by using a policy variable. This extra conditional block is optional and excludes the account that owns the bucket (Account A) from the allow statement. The reason for using this extra conditional block is so that principals in Account A still require an allow statement in their identity-based policy to access this bucket. If you choose to exclude this aws:PrincipalAccount comparison, principals in Account A are granted access to the bucket without an explicit allow statement in their identity-based policy. Policy evaluation logic only requires either the identity-based policy or the resource-based policy (but not both) to allow a request when the principal and resource are in the same account.

You might consider choosing this method if:

  • You have a shared resource that should be accessible to your entire organization.

Conclusion

Choosing a method to grant cross-account access requires careful consideration of your requirements and use case. Each of the four methods discussed in this blog post has its own advantages and tradeoffs. By understanding these methods and their implications, you can decide on the most appropriate approach to grant cross-account access to your AWS resources. Remember to regularly review and audit your resource-based policies to verify that they align with your security and access requirements.

To learn how resource-based policies work with Amazon S3, see the blog post IAM Policies and Bucket Policies and ACLs! Oh My! Controlling Access to S3 Resources.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
 

Anshu Bathla
Anshu Bathla

Anshu is a Lead Consultant – SRC at AWS, based in Gurugram, India. He works with customers across diverse verticals to help strengthen their security infrastructure and achieve their security goals. Outside of work, Anshu enjoys reading books and gardening at his home garden.
Jay Goradia
Jay Goradia

Jay is a Technical Account Manager (TAM) at AWS who works closely with enterprise customers to accelerate their cloud journey through strategic guidance and technical expertise. Using his security background, he helps organizations understand security best practices in AWS.

AWS Weekly Roundup: Cloud Club Captain Applications, Formula 1®, Amazon Nova Prompt Engineering, and more (Feb 24, 2025)

Post Syndicated from Elizabeth Fuentes original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-cloud-club-captain-applications-formula-1-amazon-nova-prompt-engineering-and-more-feb-24-2025/

AWS Developer Day 2025, held on February 20th, showcased how to integrate responsible generative AI into development workflows. The event featured keynotes from AWS leaders including Srini Iragavarapu, Director Generative AI Applications and Developer Experiences, Jeff Barr, Vice President of AWS Evangelism, David Nalley, Director Open Source Marketing of AWS, along with AWS Heroes and technical community members. Watch the full event recording on Developer Day 2025.

Cloud Club

Applications are now open through March 6th for the 2025 AWS Cloud Clubs Captains program. AWS Cloud Clubs are student-led groups for post-secondary and independent students, 18 years old and over. Find a club near you on our Meetup page.

Last week’s launches
Here are some launches that got my attention:

Amplify Hosting announces support for IAM roles for server-side rendered (SSR) applications  AWS Amplify Hosting now supports AWS Identity and Access Management (IAM) roles for SSR applications, enabling secure access to AWS services without managing credentials manually. Learn more in the IAM Compute Roles for Server-Side Rendering with AWS Amplify Hosting blog.

AWS WAF enhances Data Protection and logging experience  AWS WAF expands its Data Protection capabilities allowing sensitive data in logs to be replaced with cryptographic hashes (e.g. ‘ade099751d2ea9f3393f0f’) or a predefined static string (‘REDACTED’) before logs are sent to WAF Sample Logs, Amazon Security Lake, Amazon CloudWatch, or other logging destinations.

Announcing AWS DMS Serverless comprehensive premigration assessments AWS Database Migration Service Serverless (AWS DMS Serverless) now supports premigration assessments for replications to identify potential issues before database migrations begin. The tool analyzes source and target databases, providing recommendations for optimal DMS settings and best practices.

Amazon ECS increases the CPU limit for ECS tasks to 192 vCPUs – Amazon Elastic Container Service (Amazon ECS) now supports CPU limits of up to 192 vCPU for ECS tasks deployed on Amazon Elastic Compute Cloud (Amazon EC2) instances, an increase from the previous 10 vCPU limit. This enhancement allows customers to more effectively manage resource allocation on larger Amazon EC2 instances.

AWS Network Firewall introduces automated domain lists and insightsAWS Network Firewall now provides automated domain lists and insights by analyzing 30 days of HTTP/S traffic. This helps create and maintain allow-list policies more efficiently, at no extra cost.

AWS announces Backup Payment Methods for invoices AWS now enables you to set up backup payment methods that automatically activate if primary payment fails. This helps prevent service interruptions and reduces manual intervention for invoice payments.

Get updated with all the announcements of AWS announcements on the What’s New with AWS? page.

Other AWS news
Here are additional noteworthy items:

AWS Partner Network: Essential training resources for ISV partners To help scale solutions effectively, AWS provides essential training resources for Software Vendors (ISVs) partners in four key areas: AWS Marketplace fundamentals, Foundational Technical Review (FTR), APN Customer Engagement (ACE) program and co-selling, and Partner funding opportunities.

How Formula 1® uses generative AI to accelerate race-day issue resolution Formula 1® (F1) uses Amazon Bedrock to speed up race-day issue resolution, reducing troubleshooting time from weeks to minutes through a chatbot that analyzes root causes and suggests fixes.

How Formula 1® uses generative AI to accelerate race-day issue resolution

Reducing hallucinations in LLM agents with a verified semantic cache using Amazon Bedrock Knowledge Bases This blog introduces a solution using Amazon Bedrock Knowledge Bases and Amazon Bedrock Agents to reduce Large language models (LLMs) hallucinations by implementing a verified semantic cache that checks queries against curated answers before generating new responses, improving accuracy and response times.

Reducing hallucinations in LLM agents with a verified semantic cache using Amazon Bedrock Knowledge Bases

Orchestrate an intelligent document processing workflow using tools in Amazon Bedrock This blog demonstrates an intelligent document processing workflow using Amazon Bedrock tools that combines Anthropic’s Claude 3 Haiku for orchestration and Anthropic’s Claude 3.5 Sonnet (v2) for analysis to handle structured, semi-structured, and unstructured healthcare documents efficiently.

From community.aws
Here are my personal favorites posts from community.aws:

Tracing Amazon Bedrock Agents Learn how to track and analyze Amazon Bedrock Agents workflows using AWS X-Ray for better observability, by Randy D.

Testing Amazon ECS Network Resilience with AWS FISThis article demonstrates how to test network resilience in Amazon ECS using AWS FIS with guidance from Amazon Q Developer, by Sunil Govindankutty

Stop Using Default Arguments in AWS Lambda Functions Discover why your AWS Lambda costs might be spiralling out of control due to a common Python programming practice, by Stuart Clark.

Amazon Nova Prompt Engineering on AWS: A Field Guide by Brooke A field guide for using Amazon Nova models, covering prompt engineering patterns and best practices on AWS, by Brooke Jamieson.

Amazon Nova Prompt Engineering on AWS: A Field Guide by Brooke

Creating Deployment Configurations for EKS with Amazon Q Amazon Q Developer helps create EKS deployments by providing templates and best practices for Kubernetes configs, by Ricardo Tasso.

Processing WhatsApp Multimedia with Amazon Bedrock Agents: Images, Video, and DocumentsI invite you to read my latest blog, which explains how to create a WhatsApp AI assistant using Amazon Bedrock and Amazon Nova models to process multimedia content such as images, videos, documents, and audio.

Processing WhatsApp Multimedia with Amazon Bedrock Agents: Images, Video, and Documents

Upcoming AWS events
Check your calendars and sign up for these upcoming AWS events:

AWS GenAI Lofts – GenAI Lofts offer collaborative spaces and immersive experiences for startups and developers. You can join in-person GenAI Loft San Francisco events such as Hands-on with Agentic Graph RAG Workshop (February 25), Unstructured Data Meetup SF (February 26 – 27) and AI Tinkerers – San Francisco – February 2025 Demos + Science Fair (February 27 – 28). GenAI Loft Berlin has events and workshops on February 24 to March 7 that you can’t miss!

AWS Community Days – Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Milan, Italy (April 2), Bay Area – Security Edition (April 4), Timișoara, Romania (April 10), and Prague, Czeh Republic (April 29).

AWS Innovate: Generative AI + Data – Join a free online conference focusing on generative AI and data innovations. Available in multiple geographic regions: APJC and EMEA (March 6), North America (March 13), Greater China Region (March 14), and Latin America (April 8).

AWS Summits – Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Paris (April 9), Amsterdam (April 16), London (April 30), and Poland (May 5).

AWS re:Inforce – AWS re:Inforce (June 16–18) in Philadelphia, PA our annual learning event devoted to all things AWS cloud security. Registration opens in March, and be ready to join more than 5,000 security builders and leaders.

Create your AWS Builder ID and reserve your alias. Builder ID is a universal login credential that gives you access–beyond the AWS Management Console–to AWS tools and resources, including over 600 free training courses, community features, and developer tools such as Amazon Q Developer.

You can browse all upcoming in-person and virtual events.

That’s all for this week. Stay tuned for next week’s Weekly Roundup!

Eli

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Under The Hoodie: The Pen Test Diaries

Post Syndicated from Emma Burdett original https://blog.rapid7.com/2025/02/24/under-the-hoodie-the-pen-test-diaries/

Breaking In So You Don’t Have To

Under The Hoodie: The Pen Test Diaries

Each year, Rapid7 penetration testers conduct over 1,000 security assessments, pushing boundaries to expose vulnerabilities before the bad guys do. The mission? Get in, escalate privileges, and own the environment—physically, digitally, or sometimes just by sweet-talking an unsuspecting employee.

Names? Redacted. Companies? Anonymized. But the hacks? Real.

Welcome to Under the Hoodie, where we share stories straight from the frontlines of ethical hacking. Below are real accounts from our testers, revealing just how easy it can be to break into supposedly secure environments. Click through to hear each story unfold.

1. The Law Firm’s “Secure” File Share – Not So Secure

A law firm’s file storage system was sitting on the internet, just begging for a break-in. Using a mix of open-source intelligence (OSINT) and Burp Suite, our pen tester enumerated users, guessed a couple of predictable passwords (think “Winter2024!”), and walked right into confidential legal documents. Verdict? Guilty of weak security.

Hear how it happened.

2. Taking Over a College (And Its Campus Police)

Ever wondered how much damage someone could do by simply plugging into an open network jack on a college campus? Turns out, a lot. Our tester started with network poisoning attacks, cracked some hashes, and before long, had access to criminal records, police databases, PhD research, and even student grade records. Could’ve handed out straight A’s if they wanted.

Check out the full infiltration.

3. Hacking SQL to Crack a Corporate Network

A misconfigured Microsoft SQL server turned out to be the golden ticket for total network compromise. After gaining basic user access via weak credentials, our tester found a juicy SQL cluster, enabled some stored procedures, and pulled off process injection to gain domain admin privileges. Translation? They owned the company’s entire network from the inside out.

Listen to how it was done.

4. Breaking In With Donuts (Social Engineering for the Win)

Sometimes, hacking isn’t about code—it’s about confidence. Armed with a fake badge and a box of popular local donuts, our tester waltzed into a corporate office by leveraging good ol’ human kindness. A security guard even held the door open. The lesson? Free food lowers defenses faster than any zero-day exploit.

Hear about the sugar-powered social engineering.

5. Phishing Calls: One Password Reset Away from Total Control

A single phone call is sometimes all it takes. Our tester posed as an employee needing a password reset. After some casual chit-chat, an IT admin happily provided a fresh login. No brute force, no malware—just old-school social engineering at its finest.

Find out just how easy it was.

6. How We Almost Stole a Police Car

High-security target? Challenge accepted. Our testers, posing as IT consultants, walked right into a police department, escorted through all secure areas, and even got their hands on a set of keys to a patrol car. No alarms. No suspicion. Just a dangerously believable pretext.

Check out how close they got.

7. The Phish That Netted an Entire Finance Firm’s Data

A fake email, a cloned login page, and a hundred unsuspecting employees. Eight of them entered their credentials, and just like that, our tester had access to financial data, payroll systems, and even proxy rights to other accounts. MFA saved the day—barely.

Find out just how this phishing attack unfolded.

8. Owning a Medical Database Before the Cocoa Cooled

A health transcription company left its web app vulnerable to SQL injection. The result? Full access to sensitive medical records within minutes. The tester reported it immediately, and the company had to shut down its entire system for emergency remediation. All before their hot cocoa had a chance to cool down.

Find out how it happened.

9. No Password? No Problem. Taking Over a Network with NTLM Hashes

No cracked passwords? No worries. Our tester leveraged network sniffing, NTLM relay attacks, and Active Directory Certificate Services to escalate privileges. By the time it was over, they had full control over the company’s systems—without ever knowing a single password.

Check out the full attack.

Security Isn’t a One-Time Fix—It’s a Constant Battle

Every system has weak points—some technical, some human. The goal of penetration testing isn’t just to break in; it’s to make sure real attackers can’t.

Hear more stories from the trenches.

[$] Slabs, sheaves, and barns

Post Syndicated from corbet original https://lwn.net/Articles/1010667/

The kernel’s slab allocator is responsible for the allocation of small
(usually sub-page) chunks of memory. For many workloads, the speed of
object allocation and freeing is one of the key factors in overall
performance, so it is not surprising that a lot of effort has gone into
optimizing the slab allocator over time. Now that the kernel is down to a single slab allocator, the
memory-management developers have free rein to add complexity to it; the
latest move in that direction is the per-CPU
sheaves patch set
from slab maintainer Vlastimil Babka.

Connect your on-premises Kubernetes cluster to AWS APIs using IAM Roles Anywhere

Post Syndicated from Varun Sharma original https://aws.amazon.com/blogs/security/connect-your-on-premises-kubernetes-cluster-to-aws-apis-using-iam-roles-anywhere/

Many customers want to seamlessly integrate their on-premises Kubernetes workloads with AWS services, implement hybrid workloads, or migrate to AWS. Previously, a common approach involved creating long-term access keys, which posed security risks and is no longer recommended. While solutions such as Kubernetes secrets vault and third-party options exist, they fail to address the underlying issue effectively.

One option to connect your on-premises Kubernetes workloads to AWS APIs is to use the service account issuer discovery feature. This allows the Kubernetes API server to act as an OpenID Connect (OIDC) identity provider and be federated with AWS Identity and Access Management (IAM). However, this approach requires public internet access to the Kubernetes API server, which might not be desirable for some customers.

To help eliminate the need for long-term access keys or exposing the Kubernetes API server to the public internet, AWS has introduced AWS IAM Roles Anywhere. This feature enables secure, seamless integration of on-premises Kubernetes workloads with AWS services, promoting robust security practices and minimizing potential risks associated with long-term credentials or public exposure.

IAM Roles Anywhere enables workloads outside of AWS to access AWS resources by exchanging X.509 bound identities for temporary AWS credentials. With IAM Roles Anywhere, you can use the same IAM roles and policies as your AWS workloads to access AWS resources, promoting consistency.

IAM Roles Anywhere can be combined with a standard public key infrastructure solution. In this blog post, we use AWS Private Certificate Authority, which has several advantages over using a self-signed certificate authority (CA). First, it reduces operational and management overhead, because AWS manages the CA for you. Second, the cryptographic key material can be stored in hardware security modules or at least vaulted, which helps you protect your private CA against key compromises. Additionally, certificates can be short-lived, which aligns with dynamic Kubernetes environments where pod lifetimes are typically shorter than traditional servers.

We also demonstrate how to integrate IAM Roles Anywhere without modifying your existing workload Docker files, and how to automate the X.509 certificate lifecycle with cert-manager and an AWS Private CA backend in short-lived certificate mode. By using these capabilities, you can seamlessly integrate your on-premises Kubernetes workloads with AWS services, promoting robust security practices, minimizing risks associated with long-term credentials, and helping to ensure a streamlined, consistent access management experience.

This post is for customers who run their own Kubernetes cluster outside of AWS without using Amazon EKS Anywhere. If you’re using Amazon Elastic Kubernetes Service (Amazon EKS), use IAM roles for service accounts or Amazon EKS Pod Identity instead.

Background

“Why should I prefer X.509 certificates over IAM access keys?” Access keys are long-term credentials that must be rotated regularly to minimize the risk of unauthorized access. They need to be securely deployed onto servers hosting applications that use them, requiring procedures for secure transfer and deletion of transient copies. As the number of applications and access keys grows, tracking and managing them becomes operationally challenging.

In contrast, X.509 certificates use public key infrastructure (PKI). The private key is generated directly on the application server and doesn’t leave it. Only a certificate signing request, which doesn’t contain secrets, is sent to the CA for signing and returning the certificate. This alleviates the need for securely transmitting secret keys.

However, you can argue that X.509 certificates are also long-lived credentials. This concern is valid, but not necessarily true. As demonstrated by projects such as Let’s Encrypt, it’s possible to reduce certificate lifetimes from years to months by implementing automation for certificate renewal. After such a mechanism is in place, certificate lifetimes can be further limited to days or even hours.

In this post, we introduce mutually authenticated Transport Layer Security (mTLS), which uses certificates for high-assurance bidirectional authentication. Certificates are used to establish trust between the client and server, making sure that both parties are authenticated and authorized to communicate securely. By implementing mTLS, you can achieve a higher level of security and trust in your communication channels, mitigating potential risks associated with unauthorized access or man-in-the-middle attacks. Here, we implement ephemeral certificates that are tied to the lifecycle of pods. When a pod is started, a certificate is automatically created, and it expires after a short period of time unless it’s actively in use by the pod, in which case it’s automatically renewed by the cert-manager. This approach verifies that certificates are only valid for the duration of the pod’s lifetime, minimizing the potential risk associated with long-lived credentials. Additionally, IAM Roles Anywhere supports certificate revocation list (CRL) checks, allowing you to perform explicit revocation of certificates if required. This feature provides an additional layer of security, enabling you to revoke access promptly in case of compromised credentials or other security concerns.

Throughout this post, we assume that you have a basic understanding of IAM Roles Anywhere. For more information you can see this blog post. Furthermore, we assume that you are familiar with Kubernetes, kubectl, Helm, and cert-manager.

Solution overview

This solution assumes that you have an existing Kubernetes cluster running outside of AWS.

Figure 1 shows the high-level architecture of our solution. An on-premises Kubernetes cluster accessing AWS APIs using IAM Roles Anywhere with X.509 certificates issued by AWS Private CA in short-lived-certificate mode.

Figure 1: High level architecture of on-premises Kubernetes accessing AWS APIs

Figure 1: High level architecture of on-premises Kubernetes accessing AWS APIs

Here’s how the solution works, as shown in Figure 1:

  1. An AWS Private CA in short-lived certificate mode issues X.509 certificates for your pods.
  2. When you set up your AWS Private CA as a trusted source and establish a specific profile, IAM Roles Anywhere will validate and accept authentication requests that use certificates issued by your AWS Private CA.
  3. cert-manager, deployed into your Kubernetes cluster, orchestrates the issuance of AWS Private CA certificates to authorized pods.
  4. Each pod uses IAM Roles Anywhere to create an AWS session using its private key and X.509 certificate obtained from cert-manager.

Let’s explore the different parts of the architecture in more detail.

AWS Private CA short lived credentials

AWS Private CA offers a short-lived certificate, where the validity period is limited to 7 days or fewer. You can see this AWS Blog to learn how to use AWS Private CA short-lived certificates. This new mode can be used to issue certificates for your Kubernetes pods and benefit from lower costs of operations. By synchronizing the certificate lifecycle with the lifecycle of the pod, you can minimize the operational overhead for this solution. To help meet requirements for auditability and transparency, you can use the audit report feature to list the issued certificates in a machine readable format.

IAM Roles Anywhere

Figure 2 shows a detailed overview of the components involved in authentication with IAM Roles Anywhere.

Figure 2: Components of IAM Roles Anywhere

Figure 2: Components of IAM Roles Anywhere

IAM Roles Anywhere allows you to obtain temporary security credentials for workloads that run outside of AWS. Your workloads must use a certificate issued by a trusted PKI CA to authenticate with IAM Roles Anywhere. You establish trust between IAM Roles Anywhere and your CA by creating a trust anchor that points to the root of the CA.

cert-manager

Figure 3 shows a detailed overview of the cert-manager setup used in this post, including the aws-privateca-issuer add-on for the integration of AWS Private CA.

Figure 3: Detailed overview of cert-manager setup

Figure 3: Detailed overview of cert-manager setup

cert-manager is a tool for managing X.509 certificates in Kubernetes. As shown in Figure 3, cert-manager will make sure that certificates are valid and up-to-date and attempt to renew them before they expire. By using add-ons, you can configure different backends for issuing X.509 certificates. In this post, we explore how to integrate cert-manager with AWS Private CA using the aws-privateca-issuer add-on. The aws-privateca-issuer add-on defines two custom resources, AWSPCAIssuer and AWSPCAClusterIssuer, which are used to configure the link to AWS Private CA. They are similar to the Issuer and ClusterIssuer resources that come with cert-manager, but specific to aws-privateca-issuer.

After the AWSPCAIssuer or AWSPCAClusterIssuer is available, aws-privateca-issuer authenticates towards AWS APIs using temporary security credentials obtained from IAM Roles Anywhere. cert-manager watches for the certificate resource, which references to an AWSPCAIssuer, which in turn references to AWS Private CA. aws-privatca-issuer requests a certificate from AWS Private CA. The auto-generated private key and the signed certificate are stored in Kubernetes secrets.

Using certificates and secrets

cert-manager supports multiple ways of integrating into your Kubernetes workloads. You can use certificate resources, which represent a human-readable definition of a certificate signing request (CSR) and contain information on certificate lifespan and renewal time. When using a certificate, the auto-generated private key and the signed certificate are stored in Kubernetes secrets.

With this option, an X.509 certificate is issued manually and saved as a secret. After a PKI is configured as an issuer, a certificate resource is created to automate the renewal of the certificate. With the certificate resource, the lifecycle of certificates is decoupled from the lifecycle of the pods that use them. This allows you to bootstrap the X.509 certificate even before the trusted PKI is deployed.

Using the CSI driver

Another way of integrating cert-manager is by using a CSI driver. In this case, the certificate lifecycle is bound to the lifecycle of the pod. An X.509 certificate and private key are mounted into a predefined folder where your workloads can read them. On pod creation, cert-manager automatically creates a private key and requests a certificate for the configured trusted PKI. When the pod is deleted, the private key and certificate are also deleted and become invalid because they aren’t renewed by cert-manager.

In this post, we use the CSI driver approach for workloads to create ephemeral certificates for IAM Roles Anywhere.

Workload configuration

Figure 4 shows a detailed view of how pods can be configured to use IAM Roles Anywhere without needing to change the underlying Docker images by using a sidecar that provides an IMDSv2 endpoint that mimics the behavior in the Amazon Elastic Compute Cloud (Amazon EC2) instance metadata endpoint.

Figure 4: Pod configuration using a sidecar

Figure 4: Pod configuration using a sidecar

As shown in Figure 4, when using a certificate resource, the auto-generated private key and the signed certificate are stored in Kubernetes secrets and mounted into the pod. When using the CSI driver, a private key is generated locally (for the pod), a certificate is requested from cert-manager based on the given attributes and is issued by AWSPCAIssuer, and the certificates are mounted directly into the pod with no intermediate secret being created.

IAM Roles Anywhere uses the CreateSession API to authenticate requests with a SigV4a signature using the private key and its associated X.509 certificate. This exchange provides a IAM role session credential, as if you had assumed the IAM role. The aws_signing_helper binary is provided to call the CreateSession API from the command line. In this post, a sidecar container that provides an IMDSv2 endpoint to the workload container is used. This container uses the aws_signing_helper binary and uses its serve command.

This way, applications using AWS SDKs can use the AWS_EC2_METADATA_SERVICE_ENDPOINT environment variable to set the instance metadata endpoint to the correct port on the localhost interface. The X.509 certificate and private key are provided as files to the sidecar container.

Solution deployment

In this section, we show the steps needed to deploy the solution in your AWS account.

Prerequisites

To deploy the solution in this post, make sure that you have the following in place:

  • AWS Command Line Interface (AWS CLI) v2
  • An AWS account and IAM permissions for IAM, IAM Roles Anywhere, and AWS Private CA
  • Latest stable Kubernetes
  • kubectl (matching your Kubernetes version)
  • Helm 3
  • jq

Note: As an alternative to using the AWS CLI, you can use the AWS Controllers for Kubernetes (ACK) service controller for AWS Private CA for creating and managing CertificateAuthority, Certificate, and CertificateAuthorityActivation resources directly within your Kubernetes cluster. After establishing your CA hierarchy using the ACK controller, you can proceed with the subsequent steps involving IAM Roles Anywhere integration, aws-privateca-issuer, and cert-manager as described in this post.

Step 1 – AWS Private CA

  1. Set up a root CA in AWS Private CA, which will issue short lived certificates for your pods. In this example you use only one CA; for production environments, you should check the considerations for designing CA hierarchies. Start by using the AWS CLI to create a configuration.
    cat <<EOF > ca-config.json
    {
       "KeyAlgorithm":"RSA_2048",
       "SigningAlgorithm":"SHA256WITHRSA",
       "Subject":{
          "Country":"DE",
          "Organization":"Example Corp",
          "OrganizationalUnit":"SREs",
          "State":"HE",
          "Locality":"FRANKFURT",
          "CommonName":"Blogpost CA"
       }
    }
    EOF

  2. Create the CA in AWS Private CA with short-lived certificates mode.
    aws acm-pca create-certificate-authority \
      --certificate-authority-configuration file://ca-config.json \
      --certificate-authority-type "ROOT" \
      --usage-mode SHORT_LIVED_CERTIFICATE

  3. The command will return a CertificateAuthorityArn, which you will need for further commands, so export it for later use. Replace <region> with your AWS Region.
    export PCA_ARN=arn:aws:acm-pca:<region>:012345678912:certificate-authority/8213159d-cad0-481c-bf14-a0ced4d6d479

  4. After creating the root CA, the CA is in a pending state. You need to create a CSR.
    aws acm-pca get-certificate-authority-csr \
         --certificate-authority-arn ${PCA_ARN} \
         --output text > ca.csr

  5. Now, the CSR needs to be signed by the root CA.
    aws acm-pca issue-certificate \
         --certificate-authority-arn ${PCA_ARN} \
         --csr fileb://ca.csr \
         --signing-algorithm SHA256WITHRSA \
         --template-arn arn:aws:acm-pca:::template/RootCACertificate/V1 \
         --validity Value=365,Type=DAYS

  6. This command returns a CertificateArn which you will need later. Export it.
    export ROOT_CA_CERTIFICATE_ARN=arn:aws:acm-pca:<region>:012345678912:certificate-authority/8213159d-cad0-481c-bf14-a0ced4d6d479/certificate/5830e475088eee553bd409b7f4964613

  7. Download the root CA certificate and upload it to your AWS Private CA.
    aws acm-pca get-certificate \
        --certificate-authority-arn ${PCA_ARN} \
        --certificate-arn ${ROOT_CA_CERTIFICATE_ARN} \
        --output text > cert.pem
    
    aws acm-pca import-certificate-authority-certificate \
         --certificate-authority-arn ${PCA_ARN} \
         --certificate fileb://cert.pem

  8. Verify the status of the PCA, it should be ACTIVE.
    aws acm-pca describe-certificate-authority \
        --certificate-authority-arn ${PCA_ARN} \
        --output json

Step 2 – IAM Roles Anywhere

At this point your root CA is set up and ready to use. The next step is to configure IAM Roles Anywhere.

  1. Start by defining a trust anchor that will refer to your newly created AWS Private CA and export the trustAnchorArn. Replace <value-of-trustAnchorArn> with the Amazon Resource Name (ARN) value of your IAM Roles Anywhere trust anchor.
    aws rolesanywhere create-trust-anchor \
    --name onprem-k8s-issuer \
    --enabled \
    --source sourceType=AWS_ACM_PCA,sourceData={acmPcaArn=${PCA_ARN}}
    
    export TRUST_ANCHOR_ARN=<value-of-trustAnchorArn>

  2. Create an IAM role to be used by the aws-privateca-issuer cert-manager plugin. This role needs to include the actions sts:AssumeRole, sts:SetSourceIdentity and sts:TagSession, which are required by IAMRA. Replace <TA_ID> with your trust anchor.

    Note: You should specify a PrincipalTag with the CN. Furthermore, it should be scoped to the IAMRA service principal. This further restricts authorization based on attributes that are extracted from the X.509 certificate and provides an additional layer of security by helping to ensure that even if an unauthorized party gains access to a valid certificate, they cannot assume the role unless the certificate’s CN matches the specified value.

    cat <<EOF > trust-policy.json
    {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "Statement1",
            "Effect": "Allow",
            "Principal": {
                "Service": "rolesanywhere.amazonaws.com"
            },
            "Action": [
                "sts:AssumeRole",
                "sts:SetSourceIdentity",
                "sts:TagSession"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:PrincipalTag/x509Subject/CN": "iamra-issuer"
                },
                "ArnEquals": {
                    "aws:SourceArn": [
                        "arn:aws:rolesanywhere:<region>:012345678912:trust-anchor/<TA_ID>"
                    ]
                }
    
            }
        }]
    }
    EOF

    • Use the following to create the iamra-issuer role:
      aws iam create-role --role-name iamra-issuer \
        --assume-role-policy-document file://trust-policy.json

  3. The command will return a JSON document containing information about the newly created role. Export the ARN for later use.
    export IAMRA_ISSUER_ROLE=arn:aws:iam::012345678912:role/iamra-issuer

  4. Attach an inline policy that allows the role request certificates from your PCA and retrieve these. Note that there is a condition limiting the AWS Private CA templates to only allow EndEntityCertificate.
    cat <<EOF > inline-policy.json
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "awspcaissuerread",
          "Action": [
            "acm-pca:DescribeCertificateAuthority",
            "acm-pca:GetCertificate"
          ],
          "Effect": "Allow",
          "Resource": "$PCA_ARN"
        },
        {
          "Sid": "awspcaissuerwrite",
          "Action": [
            "acm-pca:IssueCertificate"
          ],
          "Effect": "Allow",
          "Resource": "$PCA_ARN",
          "Condition":{
            "StringEquals":{
              "acm-pca:TemplateArn":"arn:aws:acm-pca:::template/EndEntityCertificate/V1"
            }
          }
        }
      ]
    }
    EOF

    • Use the following to associate the inline policy (created in the preceding step) with the iamra-issuer role.
      aws iam put-role-policy --role-name iamra-issuer \
        --policy-name iamra-issuer \
        --policy-document file://inline-policy.json

  5. To finish, create a profile that defines which IAM roles can be assumed and then export the returned ARN.
    aws rolesanywhere create-profile --name iamra-issuer \
      --role-arns ${IAMRA_ISSUER_ROLE} \
      --enabled

    • Export the returned ARN:
      export IAMRA_PROFILE_ARN=arn:aws:rolesanywhere:<region>:012345678912:profile/<Profile_ID>

The created role iamra-issuer will only be used by the aws-privateca-issuer to integrate with AWS Private CA. You should repeat the process of creating IAM roles and IAMRA profiles for your workloads. it’s recommended to create a separate IAM role for each workload and limit its use with condition statements in the trust policy, checking for the workload identity and trust anchor (for example, matching the common name). Furthermore, it’s important that you add IAMRA to the trust policy and allow the aforementioned actions. Best practice with IAM roles is to apply least-privilege permissions.

Step 3 – Create the init container

To integrate IAM Roles Anywhere within your Kubernetes environment, you need to provide an IMDSv2 endpoint to your application containers by running the aws_signing_helper binary as a sidecar. You also need to configure your applications using an environment variable to use the new instance metadata endpoint. To do so, build a Docker image that works as a sidecar.

In this step, create a basic image that fulfills the preceding requirements. In your environment, you might want to adapt this example to use your own base image and implement your image hardening processes.

Copy the following script and save it as init.sh.

#!/bin/sh

if [[ -z "$TRUST_ANCHOR_ARN" ]]; then
  echo "Must provide TRUST_ANCHOR_ARN environment variable." 1>&2
  exit 1
fi

if [[ -z "$PROFILE_ARN" ]]; then
  echo "Must provide PROFILE_ARN environment variable." 1>&2
  exit 1
fi

if [[ -z "$ROLE_ARN" ]]; then
  echo "Must provide ROLE_ARN environment variable." 1>&2
  exit 1
fi

echo "starting IMDSv2 endpoint with aws_signing_helper ..."
/aws_signing_helper serve \
  --certificate /iamra/tls.crt         \
  --private-key /iamra/tls.key         \
  --trust-anchor-arn $TRUST_ANCHOR_ARN \
  --profile-arn $PROFILE_ARN           \
  --role-arn $ROLE_ARN

This script is the entry point of the sidecar container. It expects the environment variables TRUST_ANCHOR_ARN, PROFILE_ARN, and ROLE_ARN, which are required by aws_signing_helper. It also expects an X.509 certificate and its private key in the folder /iamra, which will be mounted in a later stage during pod initialization. Finally, it invokes the aws_signing_helper with the serve directive which creates an IMDSv2 endpoint listening on 9911 by default. This can be customized using the --port parameter.

Now let’s inspect the Docker file.

Note: At the time of writing, we used the alpine3.17.0 image. Use a hardened base image that’s designed to be secure and aligns with the requirements of your environment.

FROM alpine:3.17.0

COPY init.sh .
RUN apk add --no-cache libc6-compat libgcc wget
RUN wget https://rolesanywhere.amazonaws.com/releases/1.3.0/X86_64/Linux/aws_signing_helper
RUN chmod +x /aws_signing_helper /init.sh 
RUN ln -s /lib/libc.musl-x86_64.so.1 /lib/libresolv.so.2
ENTRYPOINT ["/bin/sh", "-c", "/init.sh"]

This Docker file copies the init.sh and downloads the aws_signing_helper binary. The init.sh script is defined as an entry point to the container. Dynamic libraries required by aws_signing_helper are installed using Alpine Linux package manager (Apk).

Now build the docker image, sign in to it, and push it for later use. For the following commands replace <my-docker-registry> with the hostname of your local registry or use an ECR Repository.

docker build . -t <my-docker-registry>/iamra-sidecar
docker login <my-docker-registry>
docker push <my-docker-registry>/iamra-sidecar

Step 4 – Install cert-manager

In this step, install cert-manager into your cluster and configure aws-privateca-issuer using a manually bootstrapped certificate. cert-manager-approver-policy is used to control which certificates can be requested by the workloads. Then, set up the cert-manager CSI driver to automatically provision X.509 certificates for your workload pods.

Start with the cert-manager setup:

  1. Add the cert-manager repository to Helm and install the chart.

    Note: At the time of writing, we used cert-manager version 1.16.2. Check for the latest stable version.

    helm repo add jetstack https://charts.jetstack.io
    helm repo update
    helm install \
      cert-manager jetstack/cert-manager \
      --namespace cert-manager \
      --create-namespace \
      --version v1.16.2 \
      --set installCRDs=true \
      --set extraArgs={--controllers='*\,-certificaterequests-approver'}
      
    helm install \
      cert-manager-approver-policy jetstack/cert-manager-approver-policy \
      --namespace cert-manager \
      --wait \
        --set app.approveSignerNames="{\
    issuers.cert-manager.io/*,clusterissuers.cert-manager.io/*,\
    awspcaclusterissuers.awspca.cert-manager.io/*,awspcaissuers.awspca.cert-manager.io/*\
    }"
    
    
    #make modifications in cert-manager-approver-policy and add below permissions
    
    kubectl edit  Clusterrole cert-manager-approver-policy -n cert-manager -o yaml
    
    - apiGroups:
      - awspca.cert-manager.io
      resources:
      - awspcaissuers
      - awspcaclusterissuers
      verbs:
      - get
      - list
      - watch
    - apiGroups:
      - cert-manager.io
      - awspca.cert-manager.io
      resources:
      - signers
      verbs:
      - approve

    Now, install the cert-manager aws-privateca-issuer plugin. This integration connects cert-manager with AWS Private CA and lets you issue short-lived certificates automatically. Currently, aws-privateca-issuer Helm chart doesn’t support IAMRA natively. So, you’re going to use the same init-container to set up IAMRA as for the workload pods.

    You need to issue the first X.509 certificate for aws-privateca-issuer IAMRA manually. Later, cert-manager will renew it automatically.

  2. Create the bootstrap certificate. When asked for a common name, enter iamra-issuer.
    openssl req -out iamra.csr -new -newkey rsa:2048 \
    -nodes -keyout iamra.key
    

    The previous command will create an RSA private key named iamra.key and a certificate signing request name iamra.csr. Now you need to call AWS Private CA to issue the bootstrap certificate.

  3. Set the validity period of the certificate to 1 day so that cert-manager will replace it after it’s set up. The IAM role that’s performing this action must have permissions to AWS Certificate Manager (ACM), IAM, and IAM Roles Anywhere to complete the setup.
    aws acm-pca issue-certificate \
          --certificate-authority-arn ${PCA_ARN} \
          --csr fileb://iamra.csr \
          --signing-algorithm "SHA256WITHRSA" \
          --validity Value=1,Type="DAYS"

  4. The command will return a CertificateArn for your iamra-issuer certificate. Export it and save the certificate to a file.
    export IAMRA_ISSUER_CERT_ARN=arn:aws:acm-pca:<region>:012345678912:certificate-authority/8213159d-cad0-481c-bf14-a0ced4d6d479/certificate/afc47911ed2ded9c2664fa597a33b9fb
    aws acm-pca get-certificate \
          --certificate-authority-arn ${PCA_ARN} \
          --certificate-arn ${IAMRA_ISSUER_CERT_ARN} | \
          jq -r .'Certificate' > iamra-cert.pem

  5. Create a Kubernetes secret that contains the certificate and private key.
    kubectl create secret tls -n cert-manager iamra-issuer \
      --cert=iamra-cert.pem \
      --key=iamra.key

    You’re ready to install the aws-privateca-issuer. You need to modify the Helm chart because it doesn’t currently support IAMRA. You will render the Helm chart into YAML manifests, which are then adapted for IAMRA.

  6. Install the Helm repository and render the charts into a file.
    helm repo add awspca https://cert-manager.github.io/aws-privateca-issuer
     helm template --release-name iamra --include-crds awspca/aws-privateca-issuer \
       -n cert-manager > privateca-issuer.yaml

  7. Add your previously built image as a sidecar and replace the environment variables with your exported values. Search for the deployment definition and add the following section:
    # Source: aws-privateca-issuer/templates/deployment.yaml
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: iamra-aws-privateca-issuer
      namespace: cert-manager
      labels:
        helm.sh/chart: aws-privateca-issuer-v1.4.0
        app.kubernetes.io/name: aws-privateca-issuer
        app.kubernetes.io/instance: iamra
        app.kubernetes.io/version: "v1.4.0"
        app.kubernetes.io/managed-by: Helm
    spec:
      replicas: 1
      revisionHistoryLimit: 10
      selector:
        matchLabels:
          app.kubernetes.io/name: aws-privateca-issuer
          app.kubernetes.io/instance: iamra
      template:
        metadata:
          labels:
            app.kubernetes.io/name: aws-privateca-issuer
            app.kubernetes.io/instance: iamra
        spec:
          serviceAccountName: iamra-aws-privateca-issuer
          securityContext:
            runAsUser: 65532
          volumes:
            - name: "iamra-secret"
              projected:
                sources:
                  - secret:
                      name: iamra-issuer
          containers:
            - name: iamra-sidecar
              image: 012345678912.dkr.ecr.us-east-2.amazonaws.com/<replace-with-iamra-sidecar-repository>
              imagePullPolicy: Always
              env:
                - name: "TRUST_ANCHOR_ARN"
                  value: "arn:aws:rolesanywhere:us-east-2:012345678912:trust-anchor/05d183f8-a34e-4f0c-ad2a-de6f803"
                - name: "PROFILE_ARN"
                  value: "arn:aws:rolesanywhere:us-east-2:012345678912:profile/7b45f9a9-73fa-47f8-a20f-88aacbf57"
                - name: "ROLE_ARN"
                  value: "arn:aws:iam::012345678912:role/iamra-issuer"
              volumeMounts:
                - name: iamra-secret
                  mountPath: "/iamra"
                  readOnly: true
            - name: aws-privateca-issuer
              securityContext:
                allowPrivilegeEscalation: false
              image: "public.ecr.aws/k1n1h4h4/cert-manager-aws-privateca-issuer:latest"
              env:
               - name: "AWS_EC2_METADATA_SERVICE_ENDPOINT"
                 value: "http://localhost:9911/"
              imagePullPolicy: IfNotPresent
              command:
                - /manager
              args:
                - --leader-elect
              ports:
                - containerPort: 8080
                  name: http
              livenessProbe:
                httpGet:
                  path: /healthz
                  port: 8081
                initialDelaySeconds: 15
                periodSeconds: 20
              readinessProbe:
                httpGet:
                  path: /healthz
                  port: 8081
                initialDelaySeconds: 5
                periodSeconds: 10
          terminationGracePeriodSeconds: 10

  8. Apply your modified manifest to install aws-privateca-issuer and verify the deployment you have modified. It should show that one pod is ready and available.
    kubectl apply -f privateca-issuer.yaml
    
    kubectl get deployment -n cert-manager -l app.kubernetes.io/name=aws-privateca-issuer
    
    NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
    iamra-aws-privateca-issuer   1/1     1            1           4d10h

  9. Define an AWSPCAIssuer, which will be used for renewal of the manually bootstrapped certificate for the aws-privateca-issuer add-on.

    Note: At the time of writing, we used awspca cert-manager API version v1beta1. Check for the latest stable version.

    export AWS_REGION=<region>
    cat <<EOF | kubectl apply -f -
    apiVersion: awspca.cert-manager.io/v1beta1
    kind: AWSPCAIssuer
    metadata:
      name: iamra-cm-issuer
      namespace: cert-manager
    spec:
      arn: ${PCA_ARN}
      region: ${AWS_REGION}
    EOF

  10. After at least one AWSPCAIssuer or AWSPCAClusterIssuer is available, aws-privateca-issuer is going to authenticate towards AWS APIs by calling sts.get-caller-identity and verify the authentication method. You can verify this using its log files. It should print the assumed role.
    kubectl logs -n cert-manager -l app.kubernetes.io/name=aws-privateca-issuer -c aws-privateca-issuer | grep -i getcalleridentity
    
    Defaulted container "aws-privateca-issuer" out of: aws-privateca-issuer, iamra-init (init)
    {"level":"info","ts":1669240040.2704494,"logger":"controllers.GenericIssuer","msg":"sts.GetCallerIdentity","genericissuer":"cert-manager/iamra-cm-issuer","arn":"arn:aws:sts::012345678912:assumed-role/iamra-issuer/5bafffcfb691969f0616a9b1a68032ec","account":"012345678912","user_id":"AROA2EIPPI5BVJ6SKBYOY:5bafffcfb691969f0616a9b1a68032ec"}

    Now, you can create a cert-manager Certificate resource that represents a desired certificate that should be issued by the referenced cert-manager Issuer. It combines information of a CSR with details on the validity period and renewal.

  11. Create the certificate object:
    cat <<EOF | kubectl apply -f - 
      apiVersion: cert-manager.io/v1
      kind: Certificate
      metadata:
        name: iamra-privateca-issuer-cert
        namespace: cert-manager
      spec:
        secretName: iamra-issuer
        duration: 168h # 7d
        renewBefore: 24h # 15d
        subject:
          organizations:
            - "Example Corp."
          organizationalUnits:
            - "Admin"
        commonName: "iamra-issuer"
        isCA: false
        usages:
          - "client auth"
          - "server auth"
        issuerRef:
          group: awspca.cert-manager.io
          kind: AWSPCAIssuer
          name: iamra-cm-issuer
      EOF
      helm upgrade -i -n cert-manager cert-manager-csi-driver jetstack/cert-manager-csi-driver --wait
      -- > install policies:
      policy + role + role binding to allow service account to accept certs.
      cat <<EOF | kubectl apply -f - 
      apiVersion: policy.cert-manager.io/v1alpha1
      kind: CertificateRequestPolicy
      metadata:
        name: iamra-issuer-policy
      spec:
        allowed:
          commonName:
            required: true
            value: "iamra-issuer"
          subject:
            organizations:
              values: ["Example Corp."]
              required: true
            organizationalUnits:
              values: ["Admin"]
              required: true
          usages:
          - "server auth"
          - "client auth"
        selector:
          issuerRef:
            group: awspca.cert-manager.io
            kind: AWSPCAIssuer
            name: iamra-cm-issuer
      ---
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRole
      metadata:
        name: cert-manager-policy:iamra-issuer-policy
      rules:
        - apiGroups: ["policy.cert-manager.io"]
          resources: ["certificaterequestpolicies"]
          verbs: ["use"]
          resourceNames: ["iamra-issuer-policy"]
      ---
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRoleBinding
      metadata:
        name: cert-manager-policy:iamra-issuer-policy
      roleRef:
        apiGroup: rbac.authorization.k8s.io
        kind: ClusterRole
        name: cert-manager-policy:iamra-issuer-policy
      subjects:
      - kind: ServiceAccount
        name: cert-manager
        namespace: cert-manager
      EOF

Step 5 – Deploy your workload

In Step 4, sub-step 9, you created an AWSPCAIssuer named iamra-cm-issuer. You then used this AWSPCAIssuer to renew the manually bootstrapped certificate for the aws-privateca-issuer.

In Step 4, sub-step 11, you created the certificate iamra-privateca-issuer-cert, which is used by the aws-privateca-issuer.

In this step, you will deploy the sample workload. When deploying the sample workload, make sure to repeat the process of creating IAM roles and IAMRA profiles (from Step 2), the AWSPCAIssuer (Step 4, sub-step 9), and the CertificateRequestPolicy (Step 4, sub-step 11) for the certificate request.

For more information on certificate request policies, see the cert-manager documentation on approval policies.

Use the following code to deploy the workload.

cat <<EOF | kubectl apply -f -
  
apiVersion: v1
kind: Pod
metadata:
   creationTimestamp: null
   labels:
     run: acmpca-csi-test
   name: acmpca-csi-test
spec:
  containers:
      - name: iamra-sidecar
        image: 056930860237.dkr.ecr.us-east-2.amazonaws.com/aws_sighning:latest
        imagePullPolicy: Always
        env:
          - name: "TRUST_ANCHOR_ARN"
            value: "arn:aws:rolesanywhere:us-east-2:012345678912:trust-anchor/05d183f8-a34e-4f0c-ad2a-de6f803ac172"
          - name: "PROFILE_ARN"
            value: "arn:aws:rolesanywhere:us-east-2:012345678912:profile/7b45f9a9-73fa-47f8-a20f-88aacbf579d2"
          - name: "ROLE_ARN"
            value: "arn:aws:iam::012345678912:role/iam-roles-anywhere-s3-full-access"
        volumeMounts:
          - name: "iamra-csi"
            mountPath: "/iamra"
            readOnly: true
      - name: aws-cli
        image: amazon/aws-cli:latest
        env:
        - name: "AWS_EC2_METADATA_SERVICE_ENDPOINT"
          value: "http://127.0.0.1:9911/"
        command:
          - sleep
          - "3600"
  dnsPolicy: ClusterFirst
  restartPolicy: Never
  volumes:
    - name: "iamra-csi"
      csi:
        readOnly: true
        driver: csi.cert-manager.io
        volumeAttributes:
            csi.cert-manager.io/issuer-name: my-pca
            csi.cert-manager.io/issuer-group: awspca.cert-manager.io
            csi.cert-manager.io/issuer-kind: AWSPCAIssuer
            csi.cert-manager.io/common-name: "${SERVICE_ACCOUNT_NAME}.${POD_NAMESPACE}"
            csi.cert-manager.io/duration: 168h
            csi.cert-manager.io/renew-before: 24h
            csi.cert-manager.io/is-ca: "false"
            csi.cert-manager.io/key-usages: "client auth, server auth"
  EOF

Step 6 – Test your deployment

To test the deployment, you can use kubectl exec to access the iamra-sidecar container. Navigate to the iamra directory and check if the certificate and key are mounted.

Command:
kubectl exec -it acmpca-csi-test  – sh
ls | grep iamra

Output: iamra

Command:
cd iamra
/iamra# ls

Output: ca.crt   tls.crt  tls.key

You can also exec into the aws-cli container and verify the caller identity and make API calls to Amazon Simple Storage Service (Amazon S3):

Command:
kubectl exec -it acmpca-csi-test -c aws-cli  – sh
$aws sts get-caller-identity

Output: You should see iam-roles-anywhere-s3-full-access in caller-identity.

Command:
$aws s3 ls

Output: You should be able to list the S3 bucket based on the permissions associated with the assumed role.

Summary

In this post, you learned about a solution for securely connecting on-premises Kubernetes workloads to AWS services using IAM Roles Anywhere. The approach alleviates the need for long-term access keys or public internet exposure of the Kubernetes API server. By using this solution for containerized and full stack applications, you can benefit from:

  • Enhanced security: Use short-lived X.509 certificates instead of long-term credentials.
  • Simplified management: Automate the certificate lifecycle with cert-manager and AWS Private CA.
  • Seamless integration: No modifications are required to existing workload Docker files.
  • Consistent policies: Use the same IAM roles and policies across AWS and on premises.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
 

Varun Sharma
Varun Sharma

Varun is a Senior AWS Cloud Security Engineer who wears his security cape proudly. Varun is a go-to subject matter expert for Amazon Cognito and IAM. When he’s not busy securing the cloud, you’ll find him in the world of security penetration testing. Outside of work, Varun switches gears to capture the beauty of nature through the lens of his camera.
Nishant Mainro
Nishant Mainro

Nishant is a Senior Security Consultant in the AWS Professional Services team and is based in Atlanta, Georgia. He is a technical and passionate Amazonian with over 16 years of professional experience with a specialization in security, risk, and compliance. His specializes developing and enabling security controls at scale to empower customers to achieve the required security goals for their workloads.
Roshini Jagarapu
Roshini Jagarapu

Roshini is an Amazon EKS subject matter expert and an AWS Cloud Support Engineer based in India. She works with services such as Amazon EKS and Amazon ECS, helping customers run at scale. Her day-to-day work involves troubleshooting issues related to container technologies. Roshini conducts learning sessions to educate customers and is passionate about cloud-native solutions.

WellRight modernizes to an event-driven architecture to manage bursty and unpredictable traffic

Post Syndicated from John Lee original https://aws.amazon.com/blogs/architecture/wellright-modernizes-to-an-event-driven-architecture-to-manage-bursty-and-unpredictable-traffic/

WellRight is a leading comprehensive corporate wellness platform provider that helps organizations and employees drive meaningful outcomes through personalized wellness programs. The platform increases engagement and benefit utilization by delivering engaging challenges across multiple dimensions of wellness, from physical activities like step tracking to mental health initiatives and team-building exercises.

In this post, we share how WellRight optimized the cost and performance of their application through a ground-up modernization to an event-driven architecture.

The challenge

WellRight’s infrastructure often experiences bursty and unpredictable traffic patterns. For instance, clients can upload bulk user data at any time, which can impact tens of thousands of users, which then cascade into millions of changes. WellRight’s legacy monolithic infrastructure had several challenges when faced with such traffic:

  • Multiple processes such as registration, progress calculation, and reward distribution relied on a single server, leading to a noisy neighbor problem.
  • Certain core services were isolated to avoid the noisy neighbor problem, but with high burst workloads, auto scaling didn’t react fast enough to meet the demand. This led to queues backing up with millions of requests. In addition, the database also had to be overprovisioned to avoid throttling, adding to the overall cost.
  • Parts of the application were not designed with auto scaling in mind, leading to overprovisioning of resources.

The following figure shows the Number of Messages Received metric from a sample Amazon Simple Queue Service (Amazon SQS) queue. WellRight would often receive burst of events at an unpredictable time.

A line graph showing the number of messages received in an SQS queue, with a sharp spike amid otherwise zero activity.

Solution overview

To address the challenges, WellRight made the strategic decision to transition to an event-driven architecture using fully managed AWS services. WellRight’s platform is driven by asynchronous state changes that propagate through multiple wellness programs, which is well suited for an event-driven architecture and can be broken down into microservices. Managed services such as AWS Lambda, Amazon SQS, and Amazon DynamoDB were appealing because they would eliminate the need to manage servers and allow WellRight to focus on core business logic and reduce the operational burden to their engineering team. It also has the added benefit of avoiding overprovisioning of infrastructure or continuously right-sizing resources. Each microservice would scale automatically as needed with no manual efforts, minimizing costs. The loosely coupled architecture would allow the WellRight team to be flexible, being able to add or make modifications to existing programs without affecting existing workflows.

Design

WellRight’s initial event-driven architecture was centered around using serverless and fully managed services. DynamoDB was used as a primary data store for user information. For instance, when a user makes progress on their step challenge, the update in the DynamoDB table would propagate through DynamoDB Streams to Amazon EventBridge. Then, the event would be routed to the appropriate SQS queue, which functions as a buffer and provides fault tolerance to the events. A Lambda function would then process individual user metrics and update the Programs table. The Programs table uses DynamoDB Streams to send out updates using Amazon Simple Notification Service (Amazon SNS), keeping users informed about their progress and rankings.

The following diagram illustrates the flow of an event after a user update.

The first iteration of the event-driven architecture fared better than the monolithic legacy application, but the bursty nature of the traffic was still an issue. Lambda functions triggered by SQS queues scaled rapidly, handling requests in under 15 minutes that previously required 30 servers and took hours to process. Lambda provided WellRight the scalability that they needed, but the rapid scaling introduced a new challenge. This resulted in the throttling of DynamoDB and reaching Lambda concurrency limits during times of extremely high load, which led to many unprocessed messages in the dead-letter queue (DLQ).

Maximum concurrency solution

In January 2023, AWS introduced the maximum concurrency feature for Lambda functions using Amazon SQS as an event source. This new feature allowed WellRight to control the concurrency of their Lambda functions for each SQS queue. Prior to this launch, Lambda functions would continue to scale as long as there were messages in the SQS queue. At times, Lambda functions would scale to its concurrency limits, resulting in it throttling itself. However, with this feature in place, the scaling Lambda functions would not exceed the set maximum concurrency value. This provided WellRight fine-grained control over the overall throughput of the system. WellRight would adjust the maximum concurrency value as needed to protect downstream processes from being overwhelmed, while responding to customer requests in a timely manner.

The following screenshot of the Lambda console shows the maximum concurrency for the function is set to 100 for an SQS trigger.

An AWS Lambda configuration screen showing a trigger from an SQS progress-calculation-queue with maximum concurrency set to 100, alongside a diagram illustrating the SQS to Lambda connection.

WellRight converted all Amazon SQS to Lambda integrations to use this feature. This provided WellRight with full control over the throughput of customer requests while preventing overloading the system. With the maximum concurrency feature, WellRight reduced failed processed messages by 99%, and eliminated DynamoDB throttling events. The feature was enabled for all Amazon SQS and Lambda integrations, including those without scaling issues, as a safeguard for potential future scaling demands.

Performance and cost savings

WellRight’s event-driven architecture significantly improved their ability to handle bursty and unpredictable traffic patterns. The managed serverless services can scale instantaneously to handle these traffic spikes, providing a seamless experience for their clients. With their previous legacy architecture, clients experienced lags in challenge progress, leaderboards, and reward processing.

Now, clients continue to upload updates with over 1 million entries at any time, and WellRight can maintain up-to-the-minute leaderboards and reward processing. The transition to the new architecture has also yielded significant cost savings for WellRight. Prior to the serverless architecture, their baseline architecture required several large Amazon Elastic Compute Cloud (Amazon EC2) instances to handle the initial burst of traffic. After implementing the event-driven architecture, WellRight reduced their costs by 70% on the progress calculation service.

Future plans

WellRight is currently in the process of rolling out the new event-driven architecture to the remaining clients. By the end of 2024, WellRight plans to retire the majority of their remaining servers, further reducing their infrastructure costs.

Conclusion

WellRight’s transition to an event-driven architecture on AWS has been a successful endeavor. By using fully managed services such as Lambda, Amazon SQS, and DynamoDB, they have been able to handle bursty and unpredictable traffic patterns efficiently, while providing a seamless experience for their clients. The introduction of maximum concurrency for Lambda functions has been a game changer, allowing WellRight to control the throughput of their Lambda functions and avoid overwhelming downstream resources.

Overall, the event-driven architecture has enabled WellRight to scale efficiently, improve performance, and reduce costs of their progress calculation service by over 70%. As they continue to optimize their serverless architecture and migrate remaining clients, WellRight is well-positioned to further enhance their platform and provide an exceptional experience to their customers.

To learn more about building event-driven architectures, including key concepts, best practices, AWS services, and getting started resources, visit Serverless Land.


About the authors

Supercharge your RAG applications with Amazon OpenSearch Service and Aryn DocParse

Post Syndicated from Jon Handler original https://aws.amazon.com/blogs/big-data/supercharge-your-rag-applications-with-amazon-opensearch-service-and-aryn-docparse/

The old adage “garbage in, garbage out” applies to all search systems. Whether you are building for ecommerce, document retrieval, or Retrieval Augmented Generation (RAG), the quality of your search results depends on the quality of your search documents. Downstream, RAG systems improve the quality of generated answers by adding relevant data from other systems to the generative prompt. Most RAG solutions use a search engine to search for this relevant data. To get great responses, you need great search results, and to get great search results, you need great data. If you don’t properly partition, extract, enrich, and clean your data before loading it, your search results will reflect the poor quality of your search documents.

Aryn DocParse segments and labels PDF documents, runs OCR, extracts tables and images, and more. It turns your messy documents into beautiful, structured JSON, which is the first step of document extract, transform, and load (ETL). DocParse runs the open source Aryn Partitioner and its state-of-the-art, open source deep learning DETR AI model trained on over 80,000 enterprise documents. This leads to up to 6 times more accurate data chunking and 2 times improved recall on vector search or RAG when compared to off-the-shelf systems. The following screenshot is an example of how DocParse would segment a page in an ETL pipeline. You can visualize labeled bounding boxes for each document segment using the Aryn Playground.

In this post, we demonstrate how to use Amazon OpenSearch Service with purpose-built document ETL tools, Aryn DocParse and Sycamore, to quickly build a RAG application that relies on complex documents. We use over 75 PDF reports from the National Transportation Safety Board (NTSB) about aircraft incidents. You can refer to the following example document from the collection. As you can see, these documents are complex, containing tables, images, section headings, and complicated layouts.

Let’s get started!

Prerequisites

Complete the following prerequisite steps:

  1. Create an OpenSearch Service domain. For more details, see Creating and managing Amazon OpenSearch Service domains. You can create a domain using the AWS Management Console, AWS Command Line Interface (AWS CLI), or SDK. Be sure to choose public access for your domain, and set up a user name and password for your domain’s primary user so that you can run the notebook from your laptop, Amazon SageMaker Studio, or an Amazon Elastic Compute Cloud (EC2) instance. To keep costs low, you can create an OpenSearch Service domain with a single t3.small search node in a dev/test configuration for this example. Take note of the domain’s endpoint to use in later steps.
  2. Get an Aryn API key.
  3. You will be using Anthropic’s Claude large language model (LLM) on Amazon Bedrock in the ETL pipeline, so make sure your notebook has access to AWS credentials with the required permissions.
  4. Have access to a Jupyter environment to open and run the notebook.

Use DocParse and Sycamore to chunk data and load OpenSearch Service

Although you can generate an ETL pipeline to load your OpenSearch Service domain using the Aryn DocPrep UI, we will instead focus on the underlying Sycamore document ETL library and write a pipeline from scratch.

Sycamore was designed to make it straightforward for developers and data engineers to define complex data transformations over large collections of documents. Borrowing some ideas from popular dataflow frameworks like Apache Spark, Sycamore has a core abstraction called the DocSet. Each DocSet represents a collection of unstructured documents, and is scalable from a single document to many thousands. Each document in a DocSet has an arbitrary set of key-value properties as metadata, as well as an ordered list of elements. An Element corresponds to a chunk of the document that can be processed and embedded separately, such as a table, headline, text passage, or image. Like documents, Elements can also contain arbitrary key-value properties to encode domain- or application-specific metadata.

Notebook walkthrough

We’ve created a Jupyter notebook that uses Sycamore to orchestrate data preparation and loading. This notebook uses Sycamore to create a data processing pipeline that sends documents to DocParse for initial document segmentation and data extraction, then runs entity extraction and data transforms, and finally loads data into OpenSearch Service using a connector.

Copy the notebook into your Amazon SageMaker JupyterLab space, launch it using a Python kernel, then walk through the cells along with the following procedures.

To install Sycamore with the OpenSearch Service connector and local inference features necessary to create vector embeddings, run the first cell of the notebook:

!pip install 'sycamore-ai[opensearch,local-inference]'

In the second cell of the notebook, fill in your ARYN_API_KEY. You should be able to complete the example in the notebook for less than $1.

Cell 3 does the initial work of reading the source data and preparing a DocSet for that data. After initializing the Sycamore context and setting paths, this code calls out to DocParse to create a partitioned_docset:

partitioned_docset = (
  docset.partition(
    partitioner=ArynPartitioner(
      extract_table_structure=True,
      extract_images=True
    )
  ).materialize(
      path="./opensearch-tutorial/partitioned-docset",
      source_mode=sycamore.MATERIALIZE_USE_STORED
    )
)
partitioned_docset.execute()

The previous code uses materialize to create and save a checkpoint. In future runs, the code will use the materialized view to save a few minutes of time. partitioned_docset.execute() forces the pipeline to execute. Sycamore uses lazy execution to create efficient query plans, and would otherwise execute the pipeline at a much later step.

After this step, each document in the DocSet now includes the partitioned output from DocParse, including bounding boxes, text content, and images from that document, stored as elements.

Entity extraction

Part of the key to building good retrieval for RAG is adding structured information that enables accurate filtering for the search query. Sycamore provides LLM-powered transforms that can extract this information and store it as structured properties, enriching the document. Sycamore can do unsupervised or supervised schema extraction, where it pulls out fields based on a JSON schema you provide. When executing these types of transforms, Sycamore will take a specified number of elements from each document, use an LLM to extract the specified fields, and include them as properties in the document.

Cell 4 uses supervised schema extraction, setting the schema as the fields you want to extract. You can add additional information that is passed to the LLM performing the entity extraction. The location property is an example of this:

schema = {
            'type': 'object',
            'properties': {'accidentNumber': {'type': 'string'},
                           'dateAndTime': {'type': 'date'},
                           'location': {
                             'type': 'string', 
                             'description': 'US State where the incident occured'
                           },
                           'aircraft': {'type': 'string'},
                           'aircraftDamage': {'type': 'string'},
                           'injuries': {'type': 'string'},
                           'definingEvent': {'type': 'string'}},
            'required': ['accidentNumber',
                         'dateAndTime',
                         'location',
                         'aircraft']
    }

schema_name = 'FlightAccidentReport'
property_extractor=LLMPropertyExtractor(llm=llm, num_of_elements=20, schema_name=schema_name, schema=schema)

The LLMPropertyExtractor uses the schema you provided to add additional properties to the document. Next, summarize the images to add additional information to improve retrieval.

Image summarization

There’s more information in your documents than just text—as the saying goes, a picture is worth 1,000 words! When your documents contain images, you can capture the information in those images using Sycamore’s SummarizeImages transform. SummarizeImages uses an LLM to compute a text summary for the image, then adds the summary to that element. Sycamore will also send related information about the image, like a caption, to the LLM to aid with summarization. The following code (in cell 4) takes advantage of DocParse type labeling to automatically apply SummarizeImages to image elements:

enriched_docset = enriched_docset.transform(SummarizeImages, summarizer=LLMImageSummarizer(llm=llm))

This cell can take up to 20 minutes to complete.

Now that your image elements contain additional retrieval information, it’s time to clean and normalize the text in the elements and extracted entities.

Data cleaning and formatting

Unless you are in direct control of the creation of the documents you are processing, you will likely need to normalize that data and make it ready for search. Sycamore makes it straightforward for you to clean messy data and bring it to a regular form, fixing data quality issues.

For example, in the NTSB data, dates in the incident report are not all formatted the same way, and some US state names are shown as abbreviations. Sycamore makes it straightforward to write custom transformations in Python, and also provides several useful cleaning and formatting transforms. Cell 4 uses two functions in Sycamore to format the state names and dates:

formatted_docset = (
  enriched_docset
  
  # Converts state abbreviations to their full names.
  .map(lambda doc: USStateStandardizer.standardize(
    doc, key_path = ["properties","entity","location"])
  )

  # Converts datetime into a common format
  .map(lambda doc: DateTimeStandardizer.standardize(
    doc, key_path = ["properties","entity","dateTime"])
  )
)

The elements are now in normal form, with extracted entities and image descriptions. The next step is to merge together semantically related elements to create chunks.

Create final chunks and vector embeddings

When you prepare for RAG, you create chunks—parts of the full document that are related information. You design your chunks so that as a search result they can be added to the prompt to provide a unit of meaning and information. There are many ways to approach chunking. If you have small documents, sometimes the whole document is a chunk. If you have larger documents, sentences, paragraphs, or even sections can be a chunk. As you iterate on your end application, it’s common to adjust the chunking strategy to fine-tune the accuracy of retrieval. Sycamore automates the process of building chunks by merging together the elements of the DocSet.

At this stage of the processing in cell 4, each document in our DocSet has a set of elements. The following code merges elements together using a chunking strategy to create larger elements that will improve query results. For instance, the DocSet might have an element that is a table and an element that is a caption for that table. Merging those elements together creates a chunk that’s a better search result.

We will use Sycamore’s Merge transform with the GreedySectionMerger merging strategy to add elements in the same document section together into larger chunks:

merger = GreedySectionMerger(
  tokenizer=HuggingFaceTokenizer(
    "sentence-transformers/all-MiniLM-L6-v2"),
  max_tokens=512
)
chunked_docset = formatted_docset.merge(merger=merger)

With chunks created, it’s time to add vector embeddings for the chunks.

Create vector embeddings

Use vector embeddings to enable semantic search in OpenSearch Service. With semantic search, retrieve documents that are close to a query in a multidimensional space, rather than by matching words exactly. In RAG systems, it’s common to use semantic search along with lexical search for a hybrid search. Using hybrid search, you get best-of-all-worlds retrieval.

The code in cell 4 creates vector embeddings for each chunk. You can use a variety of different AI models with Sycamore’s embed transform to create vector embeddings. You can run these locally or use a service like Amazon Bedrock or OpenAI. The embedding model you choose has a huge impact on your search quality, and it’s common to experiment with this variable as well. In this example, you create embeddings locally using a model called GTE:

model_name = "thenlper/gte-small"
embedded_docset = chunked_docset.spread_properties(["entity", "path"]).explode().embed(
      embedder=SentenceTransformerEmbedder(batch_size=10_000, model_name=model_name)
)
embedded_docset = embedded_docset.materialize(
  path="./opensearch-tutorial/embedded-docset",
  source_mode=sycamore.MATERIALIZE_USE_STORED
)
embedded_docset.execute()

You use materialize again here, so you can checkpoint the processed DocSet before loading. If there is an error when loading the indexes, you can retry without running the last few steps of the pipeline again.

Load OpenSearch Service

The final ETL step is loading the prepared data into OpenSearch Service vector and keyword indexes to power hybrid search for the RAG application. Sycamore makes loading indexes straightforward with its set of connectors. Cell 5 adds configuration, specifying the OpenSearch Service domain endpoint and what indexes to create. If you’re following along, be sure to replace YOUR-DOMAIN-ENDPOINT, YOUR-OPENSEARCH-USERNAME, and YOUR-OPENSEARCH-PASSWORD in cell 5 with the actual values.

If you copied your domain endpoint from the console, it will start with the https:// URL scheme. When you replace YOUR-DOMAIN-ENDPOINT, be sure to remove https://.

In cell 6, Sycamore’s OpenSearch connector loads the data into an OpenSearch index:

embedded_docset.write.opensearch(
    os_client_args=openSearch_client_args,
    index_name="aryn-rag-demo",
    index_settings=index_settings,
)

Congratulations! You’ve completed some of the core processing steps to take raw PDFs and prepare them as a source for retrieval in a RAG application. In the next cells, you will run a couple of RAG queries.

Run a RAG query on OpenSearch using Sycamore

In cell 7, Sycamore’s query and summarize functions create a RAG pipeline on the data. The query step uses OpenSearch’s vector search to retrieve the relevant passages for RAG. Then, cell 8 runs a second RAG query that filters on metadata that Sycamore extracted in the ETL pipeline, yielding even better results. You could also use an OpenSearch hybrid search pipeline to perform hybrid vector and lexical retrieval.

Cell 7 asks “What was common with incidents in Texas, and how does that differ from incidents in California?” Sycamore’s summarize_data transform runs the RAG query, and uses the LLM specified for generation (in this case, it’s Anthropic’s Claude):

Based on the provided data, it appears that the common factor among the incidents 
in Texas was that many of them involved substantial aircraft damage, with some resulting 
in injuries or fatalities. The incidents covered a range of aircraft types, including small
planes like Cessnas and Pipers, as well as a helicopter. The defining events varied, 
including loss of control on the ground, engine failures, fuel issues, and collisions 
with terrain or objects.

In contrast, the incidents in California seemed to primarily involve substantial aircraft
damage as well, but with fewer injuries reported. The defining events included loss of 
control on the ground, collisions during takeoff or landing, and a miscellaneous/other event.
One key difference is that the Texas incidents included a fatal accident (CEN23FA084) 
involving a Piper PA46 that resulted in 4 fatalities and 1 serious injury after impacting 
terrain. The California incidents did not appear to have any fatal accidents based on the 
provided data.

Additionally, while both states had incidents involving loss of control on the ground, the 
Texas incidents seemed to have a higher proportion of engine failures, fuel issues, and 
collisions with terrain or objects as defining events compared to California.

Overall, while both states experienced aviation incidents resulting in substantial aircraft
damage, the Texas incidents tended to be more severe in terms of injuries and fatalities, 
with a higher prevalence of engine failures, fuel issues, and terrain/object collisions as 
contributing factors.

Using metadata filters in a RAG query

Cell 8 makes a small adjustment to the code to add a filter to the vector search, filtering for documents from incidents with the location of California. Filters increase the accuracy of chatbot responses by removing irrelevant data from the result the RAG pipeline passes to the LLM in the prompt.

To add a filter, cell 8 adds a filter clause to the k-nearest neighbors (k-NN) query:

os_query["query"]["knn"]["embedding"]["filter"] = {"match_phrase": {"properties.entity.location": "California"}}

The output from the RAG query is as follows:

Based on the database entries provided, several incidents occurred in California during January 2023:

1. On January 12th, a Cessna 180K aircraft sustained substantial damage in a collision during takeoff 
or landing at Agua Caliente Springs, California. There was 1 person on board with no injuries reported.

2. On January 20th, a Cessna 195A aircraft sustained substantial damage due to a los of control on the 
ground at Calexico, California. There were 3 people on board with no injuries.  

3. On January 15th, a Piper PA-28-180 aircraft sustained substantial damage in a miscellaneous incident 
at San Diego, California during an instructional flight. There were 4 people on board with no injuries.

4. On January 1st, a Cessna 172 aircraft sustained substantial damage in a collision during takeoff or 
landing at Watsonville, California during an instructional flight. There was 1 serious injury reported.

5. On January 27th, a Cessna T210N aircraft sustained substantial damage when it descended into a ravine 
and impacted the ground about 2,000 feet short of the runway threshold at Murrieta, California. There were
1 serious injury and 1 minor injury reported. The engine did not respond during the landing approach.

The details provided in the database entries, such as aircraft type, location, date/time, damage level, 
injuries, and a brief description of the defining event, serve as evidence for these incidents occurring 
in California during the specified time period.

Clean up

Be sure to clean up the resources you deployed for this walkthrough:

  1. Delete your OpenSearch Service domain.
  2. Remove any Jupyter environments you created.

Conclusion

In this post, you used Aryn DocParse and Sycamore to parse, extract, enrich, clean, embed, and load data into vector and keyword indexes in OpenSearch Service. You then used Sycamore to run RAG queries on this data. Your second RAG query used an OpenSearch filter on metadata to get a more accurate result.

The way in which your documents are parsed, enriched, and processed has a significant impact on the quality of your RAG queries. You can use the examples in this post to build your own RAG systems with Aryn and OpenSearch Service, and iterate on the processing and retrieval strategies as you build your generative AI application.


About the Authors

Jon Handler is Director of Solutions Architecture for Search Services at Amazon Web Services, based in Palo Alto, CA. Jon works closely with OpenSearch and Amazon OpenSearch Service, providing help and guidance to a broad range of customers who have search and log analytics workloads for OpenSearch. Prior to joining AWS, Jon’s career as a software developer included four years of coding a large-scale ecommerce search engine. Jon holds a Bachelor of the Arts from the University of Pennsylvania, and a Master’s of Science and a PhD in Computer Science and Artificial Intelligence from Northwestern University.

Jon is the founding Chief Product Officer at Aryn. Prior to that, he was the SVP of Product Management at Dremio, a data lake company. Earlier, Jon was a Director at AWS, and led product management for in-memory database services (Amazon ElastiCache and Amazon MemoryDB for Redis), Amazon EMR (Apache Spark and Hadoop), and founded and was GM of the blockchain division. Jon has an MBA from Stanford Graduate School of Business and a BA in Chemistry from Washington University in St. Louis.

Intel Xeon 6700P and 6500P Granite Rapids-SP for the Masses Initial Benchmarks and First Look

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/intel-xeon-6700p-and-6500p-granite-rapids-sp-for-the-masses-initial-benchmarks-and-first-look/

We take a look at the new Intel Xeon 6700P and 6500P processors codenamed “Granite Rapids-SP” to see what they offer

The post Intel Xeon 6700P and 6500P Granite Rapids-SP for the Masses Initial Benchmarks and First Look appeared first on ServeTheHome.

Intel Xeon 6300 Launched for Entry Servers with 2019 Core Counts

Post Syndicated from Patrick Kennedy original https://www.servethehome.com/intel-xeon-6300-launched-for-entry-servers-with-2019-core-counts/

The Intel Xeon 6300 series is the new Xeon E. With a maximum of 8 cores it only manages to keep pace with 2019 Xeon E-2200 core counts

The post Intel Xeon 6300 Launched for Entry Servers with 2019 Core Counts appeared first on ServeTheHome.

[$] AlmaLinux considers EPEL 10 rebuild for older hardware

Post Syndicated from jzb original https://lwn.net/Articles/1010868/

The AlmaLinux project has published
a request for comments (RFC) on rebuilding Fedora’s Extra Packages for
Enterprise Linux
(EPEL), which provides additional software for
Red Hat Enterprise Linux (RHEL) and its derivatives, to support older
x86_64 hardware that is not supported by EPEL 10. While this may
sound simple on the surface, the proposed rebuild carries a few
potential risks that the AlmaLinux and EPEL contributors would like to
avoid. The AlmaLinux
Engineering Steering Committee
(ALESCo) is currently considering
feedback and will vote on the RFC in March.

Security updates for Monday

Post Syndicated from jake original https://lwn.net/Articles/1011610/

Security updates have been issued by AlmaLinux (bind, bind9.18, libpq, mysql, postgresql, postgresql:15, and postgresql:16), Debian (fort-validator, gnutls28, krb5, libxml2, and python-werkzeug), Fedora (chromium, openssh, proftpd, python3.8, vaultwarden, and vim), Oracle (bind, bind9.16, bind9.18, libpq, libsoup, mysql, mysql:8.0, nodejs:18, nodejs:22, postgresql, postgresql:13, postgresql:15, and postgresql:16), Red Hat (mysql, mysql:8.0, and python3), SUSE (chromedriver, dcmtk, grub2, java-1_8_0-ibm, java-23-openjdk, luanti, openssh, postgresql14, postgresql15, postgresql16, postgresql17, proftpd, radare2, and webkit2gtk3), and Ubuntu (intel-microcode, netty, and nginx).

More Research Showing AI Breaking the Rules

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2025/02/more-research-showing-ai-breaking-the-rules.html

These researchers had LLMs play chess against better opponents. When they couldn’t win, they sometimes resorted to cheating.

Researchers gave the models a seemingly impossible task: to win against Stockfish, which is one of the strongest chess engines in the world and a much better player than any human, or any of the AI models in the study. Researchers also gave the models what they call a “scratchpad:” a text box the AI could use to “think” before making its next move, providing researchers with a window into their reasoning.

In one case, o1-preview found itself in a losing position. “I need to completely pivot my approach,” it noted. “The task is to ‘win against a powerful chess engine’—not necessarily to win fairly in a chess game,” it added. It then modified the system file containing each piece’s virtual position, in effect making illegal moves to put itself in a dominant position, thus forcing its opponent to resign.

Between Jan. 10 and Feb. 13, the researchers ran hundreds of such trials with each model. OpenAI’s o1-preview tried to cheat 37% of the time; while DeepSeek R1 tried to cheat 11% of the time­making them the only two models tested that attempted to hack without the researchers’ first dropping hints. Other models tested include o1, o3-mini, GPT-4o, Claude 3.5 Sonnet, and Alibaba’s QwQ-32B-Preview. While R1 and o1-preview both tried, only the latter managed to hack the game, succeeding in 6% of trials.

Here’s the paper.

The collective thoughts of the interwebz