Tag Archives: Developer Tools

Controlling AWS API Calls from Amazon Q Developer: Enterprise Governance with Built-in User Agent Markers

Post Syndicated from Kirankumar Chandrashekar original https://aws.amazon.com/blogs/devops/controlling-aws-api-calls-from-amazon-q-developer-enterprise-governance-with-built-in-user-agent-markers/

As organizations increasingly adopt AI-powered development tools, a critical challenge emerges: how do you maintain security governance when AI assistants execute AWS operations on behalf of users? Organizations want to leverage AI assistance for development and read operations while maintaining strict controls over write operations that impact production systems and auditing calls made via AI assistants. Consider this scenario: A developer asks Amazon Q Developer “List my S3 buckets”, Q Developer suggests aws s3 ls, the developer approves, and Q Developer executes the command via AWS CLI. From an AWS perspective, this looks identical to the developer manually running the aws s3 ls command on the terminal outside of Amazon Q Developer. But what if your organization needs to distinguish between AI-assisted operations and manual commands for governance or compliance?

Amazon Q Developer, the most capable generative AI–powered assistant for software development, generates AWS CLI commands in response to user requests and executes them using its use_aws and execute_bash built-in tools. The challenge of distinguishing AI-assisted operations from manual commands is a key consideration for Amazon Q Developer adoption in enterprise environments. To address this governance challenge, Amazon Q Developer includes a built-in solution: user-agent markers that automatically identify AWS CLI calls made through Q Developer in CloudTrail logs, enabling precise IAM policy controls.

This blog post explores how Amazon Q Developer’s built-in user agent markers set for AWS CLI calls enable precise IAM policy controls, allowing organizations to distinguish and govern AI-assisted AWS operations while maintaining the productivity benefits of AI-powered development. The following sections demonstrate how these user agent markers work, how to implement IAM policies that leverage them, and how to monitor their effectiveness in your environment.

Understanding Amazon Q Developer User Agent Markers

Prerequisites

This section builds on your knowledge of these concepts and assumes you have the necessary setup in place. These foundational elements are essential for understanding how user agent markers work and for implementing the governance controls discussed later in this post. If you need guidance on any of these topics, please refer to the linked documentation:

Amazon Q Developer automatically includes identifiable markers in the user agent string of all AWS API calls it makes via AWS CLI. These markers appear in two primary contexts: CLI tool operations and IDE integration operations.

Q Developer CLI Tool

When using Amazon Q Developer CLI (both use_aws and execute_bash tools), all AWS CLI calls include:

exec-env/AmazonQ-For-CLI-Version-<QCLI-VersionNo>

How It Works: Amazon Q Developer CLI automatically sets:

AWS_EXECUTION_ENV=AmazonQ-For-CLI-Version-<QCLI-VersionNo>

This means all AWS CLI commands executed through Q Developer CLI – whether via the use_aws tool or execute_bash commands – automatically include this marker.

Q Developer IDE Integration

When using Amazon Q Developer from IDE integrations, AWS CLI calls include:

exec-env/AmazonQ-For-IDE-Version-<QIDE-Plugin-VersionNo>

How It Works: Amazon Q Developer IDE plugin automatically sets:

AWS_EXECUTION_ENV=AmazonQ-For-IDE-Version-<QIDE-Plugin-VersionNo>

This applies when Q Developer makes AWS API calls through IDE integrations, such as when analyzing your codebase or suggesting AWS resource configurations. The IDE marker enables you to distinguish between CLI-based and IDE-based Q Developer operations.

Complete User Agent Example

Here’s how a complete user agent string appears in CloudTrail:

From Q Developer CLI:

"userAgent": "aws-cli/2.27.17 md/awscrt#0.26.1 ua/2.1 os/macos#24.6.0 md/arch#x86_64 lang/python#3.13.3 md/pyimpl#CPython exec-env/AmazonQ-For-CLI-Version-1.15.0 
cfg/retry-mode#standard md/installer#exe md/prompt#off md/command#sts.get-caller-identity"

From Q Developer IDE Integration:

"user-agent": "aws-cli/2.27.17 md/awscrt#0.26.1 ua/2.1 os/macos#24.6.0 md/arch#x86_64 lang/python#3.13.3 md/pyimpl#CPython exec-env/AmazonQ-For-IDE-Version-1.93.0 
cfgretry-mode#standard md/installer#exe md/prompt#off md/command#sts.get-caller-identity"

The key identifiers are exec-env/AmazonQ-For-CLI-Version-* and exec-env/AmazonQ-For-IDE-Version-*, which clearly distinguish Amazon Q Developer operations from regular AWS CLI/SDK usage executed outside of Q Developer.

Architecture Diagram

┌─────────────────────────────────────────────────────────────────────────────┐
│                           Amazon Q Developer Flow                           │
└─────────────────────────────────────────────────────────────────────────────┘

┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│   Developer      │    │   Amazon Q       │    │   AWS APIs       │
│                  │    │   Developer      │    │                  │
│ ┌──────────────┐ │    │                  │    │                  │
│ │ Q CLI        │ │    │ ┌──────────────┐ │    │ ┌──────────────┐ │
│ │ use_aws tool │ │────┼─│ Adds marker: │ │────┼─│ CloudTrail   │ │
│ └──────────────┘ │    │ │ exec-env/    │ │    │ │ Event with   │ │
│                  │    │ │ AmazonQ-For- │ │    │ │ User Agent   │ │
│ ┌──────────────┐ │    │ │ CLI-Version  │ │    │ │ Marker       │ │
│ │ IDE          │ │    │ └──────────────┘ │    │ └──────────────┘ │
│ │ Integration  │ │────┼─│ Adds marker: │ │    │                  │
│ └──────────────┘ │    │ │ exec-env/    │ │    │                  │
│                  │    │ │ AmazonQ-For- │ │    │                  │
│ ┌──────────────┐ │    │ │ IDE-Version  │ │    │                  │
│ │ execute_bash │ │────┼─└──────────────┘ │    │                  │
│ │ commands     │ │    │                  │    │                  │
│ └──────────────┘ │    │                  │    │                  │
└──────────────────┘    └──────────────────┘    └──────────────────┘
         │                        │                        │
         │                        │                        │
         ▼                        ▼                        ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                              IAM Policy Engine                               │
│                                                                              │
│  ┌─────────────────────────────────────────────────────────────────────────┐ │
│  │ Condition: StringLike                                                   │ │
│  │ "aws:userAgent": "*exec-env/AmazonQ-For-*"                              │ │
│  │                                                                         │ │
│  │ ┌─────────────────┐              ┌─────────────────┐                    │ │
│  │ │ Q Developer     │              │ Regular AWS     │                    │ │
│  │ │ Operations      │              │ CLI Operations  │                    │ │
│  │ │                 │              │                 │                    │ │
│  │ │ • Block writes  │              │ • Allow writes  │                    │ │
│  │ │ • Allow reads   │              │ • Allow reads   │                    │ │
│  │ └─────────────────┘              └─────────────────┘                    │ │
│  └─────────────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────────────┘

IAM Policy Implementation

Use the aws:userAgent condition in IAM policies to control Amazon Q Developer operations through two approaches:

IAM Policies: Deploy in each AWS account where developers have access for deploying workloads or performing AWS operations. Q Developer operates using the developer’s existing AWS credentials and permissions – it doesn’t have additional access beyond what the user already possesses. Attach these policies to the same IAM users, groups, or roles that developers use for their regular AWS work.

Service Control Policies (SCPs): Deploy once at the AWS Organizations level for organization-wide governance. SCPs apply to all member accounts automatically and cannot be overridden by account-level policies.

The following policy allows read operations from Q Developer, blocks write operations from Q Developer, and allows write operations from regular AWS CLI executed outside Q Developer:

Note: This IAM policy example is for illustration purposes only. Follow least privilege principles in production environments. For more details refer prepare for least previlege permissions.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowReadOperationsFromQDeveloper",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject*",
        "s3:ListBucket*",
        "ec2:Describe*"
      ],
      "Resource": "*",
      "Condition": {
        "StringLike": {
          "aws:userAgent": "*exec-env/AmazonQ-For-*"
        }
      }
    },
    {
      "Sid": "BlockWriteOperationsFromQDeveloper",
      "Effect": "Deny",
      "Action": [
        "s3:DeleteObject*",
        "ec2:TerminateInstances",
        "iam:DeleteUser"
      ],
      "Resource": "*",
      "Condition": {
        "StringLike": {
          "aws:userAgent": "*exec-env/AmazonQ-For-*"
        }
      }
    },
    {
      "Sid": "AllowWriteOperationsFromRegularCLI",
      "Effect": "Allow",
      "Action": [
        "s3:DeleteObject*",
        "ec2:TerminateInstances",
        "iam:DeleteUser"
      ],
      "Resource": "*",
      "Condition": {
        "StringNotLike": {
          "aws:userAgent": "*exec-env/AmazonQ-For-*"
        }
      }
    }
  ]
}

Note on User Agent Reliability: While AWS warns that user agents can be “spoofed,” this concern is reduced for Q Developer governance use cases. The user agent is automatically set by Q Developer’s tools, not manually controlled by users. Any spoofing would require deliberate effort and would be detectable through usage pattern analysis. This approach is designed for operational governance and policy differentiation, not as a sole security control.

Additional Control Layer: Custom Agent Configuration

For an additional layer of control, you can create a custom agent configuration that restricts which AWS services Amazon Q Developer can access using allowedServices and deniedServices parameters for the use_aws tool:

{
  "toolsSettings": {
    "use_aws": {
      "allowedServices": ["s3", "lambda", "ec2"],
      "deniedServices": ["eks", "rds"]
    }
  }
}

This custom agent configuration works in conjunction with IAM policies to provide defense-in-depth governance of AI-assisted AWS operations. For more details, refer to the agent configuration documentation.

Verification and Monitoring

CloudTrail Event Analysis

To verify that your policies are working correctly, examine CloudTrail events. Here’s what to look for:

Amazon Q Developer Event

{
  "eventTime": "2025-01-15T10:30:00Z",
  "eventName": "GetCallerIdentity",
  "userAgent": "aws-cli/2.27.17 md/awscrt#0.26.1 ua/2.1 os/macos#24.6.0 md/arch#x86_64 lang/python#3.13.3 md/pyimpl#CPython exec-env/AmazonQ-For-CLI-Version-1.15.0 cfg/retry-mode#standard md/installer#exe md/prompt#off md/command#sts.get-caller-identity",
  "sourceIPAddress": "203.0.113.12",
  "userIdentity": {
    "type": "IAMUser",
    "principalId": "AIDACKCEVSQ6C2EXAMPLE",
    "arn": "arn:aws:iam::123456789012:user/developer"
  }
}

Regular AWS CLI Event

{
  "eventTime": "2025-01-15T10:35:00Z",
  "eventName": "GetCallerIdentity", 
  "userAgent": "aws-cli/2.27.17 md/awscrt#0.26.1 ua/2.1 os/macos#24.6.0 md/arch#x86_64 lang/python#3.13.3 md/pyimpl#CPython cfg/retry-mode#standard md/installer#exe md/prompt#off md/command#sts.get-caller-identity",
  "sourceIPAddress": "203.0.113.12",
  "userIdentity": {
    "type": "IAMUser",
    "principalId": "AIDACKCEVSQ6C2EXAMPLE", 
    "arn": "arn:aws:iam::123456789012:user/developer"
  }
}

Monitoring Script Example

Create a simple monitoring script to track Amazon Q Developer usage:

#!/bin/bash
# Monitor Amazon Q Developer AWS API usage
# Get events from last 24 hours and filter for Q Developer user agents
aws cloudtrail lookup-events \
  --start-time $(date -u -v-24H '+%Y-%m-%dT%H:%M:%SZ') \
  --lookup-attributes AttributeKey=EventName,AttributeValue=GetCallerIdentity \
  --query 'Events[?contains(CloudTrailEvent, `AmazonQ-For-CLI`)].[EventTime,EventName,UserIdentity.userName]' \
  --output table

Conclusion

Amazon Q Developer’s built-in user agent markers provide a powerful foundation for implementing enterprise-grade security controls around AI-assisted AWS operations. By leveraging these markers in IAM policies, organizations can:

  • Distinguish between AI-assisted and manual AWS operations
  • Implement differentiated security policies based on operation source
  • Maintain detailed audit trails for compliance requirements
  • Enable secure Amazon Q Developer adoption in enterprise environments while maintaining strict controls over write operations that could impact production systems

For organizations currently evaluating Amazon Q Developer adoption, implementing user agent marker-based controls is a key component of your deployment strategy. This approach enables you to realize the productivity benefits of AI-assisted development while maintaining the governance and security controls your organization requires.

Experience the power of Amazon Q Developer as your AI-powered coding assistant, and implement the governance controls outlined in this post to ensure secure adoption in your enterprise environment. These built-in user agent markers enable you to maintain enterprise-grade security while unlocking the productivity benefits of AI-assisted development.

To learn more about Amazon Q Developer’s features and capabilities, visit the Amazon Q Developer product page.

About the Author

kirankumar.jpeg

Kirankumar Chandrashekar is a Generative AI Specialist Solutions Architect at AWS, focusing on Amazon Q Developer/Kiro and developer productivity. Bringing deep expertise in AWS cloud services, DevOps, modernization, and infrastructure as code, he helps customers accelerate their development cycles and elevate developer productivity through innovative AI-powered solutions. By leveraging Amazon Q Developer and Kiro, he enables teams to build applications faster, automate routine tasks, and streamline development workflows. Kirankumar is dedicated to enhancing developer efficiency while solving complex customer challenges, and enjoys music, cooking, and traveling.

Multi Agent Collaboration with Strands

Post Syndicated from Aaron Sempf original https://aws.amazon.com/blogs/devops/multi-agent-collaboration-with-strands/

In the evolving landscape of autonomous systems, multi-agent collaboration is becoming not only feasible but necessary. As agents gain more capabilities, like advanced reasoning, adaptation, and tool use, the challenge shifts from individual performance to effective coordination. The question is no longer “can an agent solve a task?” but “how do we organize execution across many intelligent agents?”

A foundational step toward answering this came with the Supervisor pattern, introduced in our article on creating asynchronous AI agents with Amazon Bedrock. The Supervisor addresses the first generation of coordination challenges by acting as a centralized orchestrator, monitoring and delegating tasks across agents in a structured, serverless workflow. It provides asynchronous orchestration, fallback handling, and state tracking across loosely coupled agents, giving organizations a reliable way to move from single-agent prototypes to multi-agent systems.

Yet as agentic systems scale and become more dynamic, the limitations of static supervision become clear. The Supervisor model assumes a relatively stable set of agents and predictable workflows; but modern systems face constantly shifting tasks, emergent capabilities, and the need for adaptive coordination. This is where the Arbiter pattern emerges as the natural evolution: a next-generation supervisory model that extends the Supervisor with dynamic agent generation, semantic task routing, and blackboard-model-based coordination. By addressing the unpredictability and fluidity of large, evolving agent ecosystems, the Arbiter pattern enables systems not only to manage complexity but to thrive in it.

The Arbiter pattern builds directly on this by adding three key capabilities:

  1. Semantic Capability Matching: Instead of only assigning known tasks to known agents, the Arbiter reasons about what kind of agent should exist for a task—even if that agent doesn’t exist yet.
  2. Delegated Agent Creation: If no suitable agent is found, the Arbiter escalates the request to a Fabricator agent that dynamically generates a task-specific agent on demand. This moves beyond delegation to true adaptive generation.
  3. Task Planning + Contextual Memory: Building on the Supervisors task coordination capability, Arbiter decomposes complex inputs into structured task plans, and uses contextual memory to track execution, retry logic, and agent performance.

In short, the Arbiter transforms static orchestration into adaptive coordination.

The Blackboard Model Revisited

To enable loose, extensible collaboration across agents, the Arbiter Pattern incorporates principles from the blackboard model – a classic architecture from distributed AI. In this model, agents contribute opportunistically to a shared data space (the “blackboard”), reacting to changes and collectively solving problems.

Reference: See “The Blackboard Model of Control” (Hayes-Roth et al.), and early applications like Hearsay-II for foundational research.

In our extended Arbiter Pattern, the blackboard becomes a semantic event substrate. Agents, including the Arbiter, publish and consume task-relevant state, enabling loosely coupled, event-driven collaboration.

How It Works

When an event enters the system, the Arbiter takes on the supervisory role but extends it with greater dynamism and adaptability. Like the Supervisor pattern, it begins by interpreting the event and identifying the required objectives and sub-tasks. It then performs a capability assessment, using a local index or peer-published manifests, much like the Supervisor querying an Agents config table.

  1. Interpretation: The Arbiter uses LLM-based reasoning to extract task objectives and sub-tasks.
  2. Capability Assessment: It evaluates which agents can handle each sub-task using a local index or peer capability manifests.
  3. Delegation or Generation:
    • If a suitable agent exists, the task is routed accordingly.
    • If not, the Arbiter sends a generation request to the fabricator agent.
  4. Blackboard Coordination: All agents involved read/write to a shared semantic blackboard, contributing as needed based on observed task state.
  5. Reflection and Adaptation: Performance data is logged and used to inform future agent creation, adaptation, or deprecation.

Arbiter Pattern Architecture

Unlike the Supervisor, which maintains orchestration through a static config list, the Arbiter introduces a shared semantic blackboard that allows all participating agents to read, write, and coordinate based on evolving task state. This blackboard serves as a dynamic collaboration space, enabling mid-task adaptation and richer multi-agent coordination.

The following Diagram 1: Agentic AI Arbiter pattern implemented as a code example can be downloaded here

Architecture diagram of the Arbiter Pattern for Agentic AI. The diagram illustrates the components and flow of the pattern, showing how multiple AI agents interact with an arbiter to coordinate tasks and decision-making in a structured system

Diagram 1: Agentic AI Arbiter pattern

The following sequence describes the Arbiter pattern, according to the numbered steps in the diagram 1: Agentic AI Arbiter pattern

  1. Events entering the system trigger the Supervisor function
  2. Supervisor queries Agents Config table for agent capabilities
  3. Supervisor uses Agents config list as context to plan orchestration of tasks

Option: New Agent:

If no capable agent is found, the Arbiter goes further than the basic supervisor pattern: it issues a generation request to a fabricator agent, which synthesizes new worker code, stores it for runtime access, and updates the capability registry so the agentic system can immediately benefit from the new skill.

  1. Task cannot be completed, request create new capability
  2. Request to fabricate triggers Fabrication agent instance
  3. Fabrication agent queries resources register for available tools (capabilities)
  4. Fabricator generates worker agent code
  5. Worker agent code stored in bucket for runtime access
  6. New worker added to Agents config list with agent capabilities description
  7. Result of fabrication posted to message bus

Repeat steps 1, 2 & 3

Option: Orchestrate workflow:

If a suitable agent exists, the Arbiter orchestrates the workflow by invoking the appropriate worker agents, tracking progress and state as in the Supervisor model.

  1. Orchestration of tasks is stored for tracking end-to-end process
  2. Request to invoke worker agent, by name/id. Add workflow state for agent invocation.
  3. Request to invoke worker agent triggers worker agent wrapper instance
  4. Worker agent wrapper loads agent code
  5. Worker agent reasons and takes action
  6. Worker agent sends response to message bus
  7. Supervisor agent updates workflow state and tracks against orchestration

The Arbiter incorporates a reflection and adaptation loop: performance data from task execution is logged, analyzed, and fed back into the fabricator and coordination logic. This ensures that not only are tasks completed in the moment, but the system continuously adapts, retires underperforming agents, and evolves toward greater efficiency.

The Arbiter Agent: Event Orchestration Engine

The Supervisor Agent (Arbiter Agent) serves as the central coordinator component, managing complex event-driven workflows through intelligent task delegation.

Event Processing Workflow:

The Arbiter pattern follows a structured approach to handle incoming events

  1. Configuration Loading: Loads available agent configurations from Amazon DynamoDB via load_config_from_dynamodb()
  2. LLM Invocation: Invokes Amazon Bedrock LLM with event context and available tool specifications
  3. Decision Analysis: LLM analyzes the event and returns tool invocation decisions with parameters
  4. Task Dispatch: For each specified tool call:
    • Extracts tool name, input parameters, and tool use ID
    • Dispatches message to corresponding Amazon Simple Queue Service (SQS) queue via process_tool_call()
    • Maintains tool invocation list for workflow tracking

Workflow State Management:

The system maintains comprehensive state tracking throughout execution

  • Creates workflow tracking record in DynamoDB with create_workflow_tracking_record()
  • Initializes all invoked agents as incomplete
  • Associates unique request ID with orchestration instance
  • Persists orchestration state including conversation history and request mapping

Completion Coordination:

The Arbiter coordinates task completion through a systematic process

  1. Event Reception: Receives agent completion events via Amazon EventBridge
  2. Status Updates: Updates workflow tracking with update_workflow_tracking()
  3. Completion Check: Performs completion check across all tracked agents
  4. Result Aggregation: When all agents complete:
    • Aggregates results from DynamoDB data field
    • Appends tool results to conversation as user messages
    • Re-invokes orchestration with updated context
  5. Continuation: Continues until LLM provides final response without tool calls

The Fabricator Agent: Dynamic Capability Generation

The Fabricator Agent implements just-in-time agent development using the Strands agents framework, creating new capabilities when required functionality doesn’t exist in the system.

Agent Development Architecture:

The Fabricator operates as a specialized Strands Agent with specific characteristics

  • Implemented as a Strands Agent with specialized system prompt for code generation
  • Triggered by “New worker agent” events from the Arbiter
  • Receives capability requirements through prompt augmentation with agent directive
  • System prompt includes:
    • Strands Agent implementation examples
    • Complete catalog of available Strands Tools
    • Code generation patterns and conventions
    • Standardized handler() function requirements

Code Generation Process:

The agent follows a structured development workflow

  1. Requirement Analysis: LLM analyzes capability requirements and generates Python implementation
  2. Tool Selection: Prioritizes use of existing Strands Tools over custom @tool implementations
  3. Code Structure: Creates agents following standardized patterns:
    • Bedrock model initialization with models.BedrockModel()
    • Agent instantiation with appropriate tool selection
    • Standardized handler() function interface
    • Event-driven completion signaling
  4. File Creation: Writes generated code to /tmp/ directory for immediate availability

Capability Registration Pipeline:

New capabilities are registered through a multi-step process

  1. File Storage: File upload to Amazon Simple Storage Service (S3) via upload_file_to_s3() tool
  2. Metadata Registration: Registration in DynamoDB via store_agent_config_dynamo():
    • toolId: Unique capability identifier
    • filename: S3 object reference
    • schema: OpenAPI specification for LLM tool calling
    • description: Human-readable capability documentation
    • action: SQS queue routing configuration for Generic Wrapper
  3. Completion Notification: Completion event publication to Arbiter via complete_task() tool

Testing Considerations:

The original implementation revealed important insights about testing approaches

  • Previous Approach: Agent testing within the Fabricator resulted in:
    • Unstructured testing leading to false negatives
    • Overzealous optimization of generated agents
  • Recommendation: Separate testing agent with standardized harness for validation feedback

The Generic Wrapper: Dynamic Execution Runtime

The Generic Wrapper implements a hot-loading pattern that enables unlimited agent creation without infrastructure scaling, providing a universal execution environment for Fabricator-generated agents.

This hot-loading approach is critical because it decouples capability growth from infrastructure scaling. Instead of provisioning and maintaining new infrastructure components for every new agent, which could be dozens or even hundreds of agents, the system reuses a single execution wrapper that can dynamically load and execute arbitrary agent code.

This not only makes agent creation effectively limitless but also ensures infrastructure efficiency, cost optimization, and simplified operations, allowing the Arbiter and Fabricator to evolve system capabilities without operational bottlenecks.

In the AWS Samples code, found here, the Hot-loading handler is implemented as am AWS Lambda function, represented in the following code snippet:

def process_event(event, context):
    orchestration_id = event["orchestration_id"]
    tool_use_id = event["tool_use_id"]
    request = event["tool_input"]
    tool_name = event['node']

    # Based on the tool from the event, load the details from DDB
    tool = load_config_from_dynamodb(tool_name)
    config = tool['config']

    if isinstance(config, str):
        config = json.loads(config)

    file_name = config['filename']

    load_file_from_s3_into_tmp(os.environ["AGENT_BUCKET_NAME"], file_name)

    # Hot load the module from the tmp directory
    spec = importlib.util.spec_from_file_location("module.name", "/tmp/loaded_module.py")
    loaded_module = importlib.util.module_from_spec(spec)
    sys.modules["module.name"] = loaded_module
    spec.loader.exec_module(loaded_module)

    # Invoke the generic handler with whatever args were passed in by the Arbiter
    try:
        print("attempting to use module")
        response = loaded_module.handler(**request)
        print(f"response: {response}")
    except Exception as e:
        print(f"error running module: {e}")
        response = "The task could not be completed, this agent has issues, please ignore for now."

    # Finally. report back to the Arbiter. Handled by the wrapper. To avoid the Frabricator from attempting to code this part itself
    post_task_complete(response, tool_use_id, tool_name, orchestration_id)

Although this example is demonstrated through a lambda function, the Hot-Loading code can be executed in Amazon Bedrock AgentCore Runtime, or AWS native container services, such as Amazon Elastic Container Service (ECS) or Amazon Elastic Kubernetes Service (EKS)

Hot-Loading Architecture:

The wrapper implements several key architectural principles

  • Single infrastructure component handles execution of all dynamically created agents
  • Eliminates need for separate infrastructure provisioning per agent
  • Implements runtime code loading from S3 storage
  • Accepts latency trade-off for infrastructure efficiency in non-ultra-low-latency environment

Dynamic Loading Process:

The system follows a precise loading sequence

  1. Message Processing: Extracts agent identifier from incoming SQS message
  2. Configuration Retrieval: Queries DynamoDB for agent configuration via load_config_from_dynamodb()
  3. Code Download: Downloads agent implementation from S3 to /tmp/ directory
  4. Runtime Loading: Module loading using importlib.util:
    • spec_from_file_location() creates module specification
    • module_from_spec() instantiates module object
    • exec_module() performs actual code loading and execution

Execution Management:

The wrapper provides comprehensive execution oversight

  • Invokes standardized handler() function with provided parameters
  • Captures execution results and handles error conditions gracefully
  • Maintains execution isolation between different agent invocations
  • Implements resource cleanup after agent execution completion

Standardized Communication Protocol:

Communication follows strict standardization to ensure system reliability, which is critical in multi-agent environments where dozens or even hundreds of dynamically generated agents may interact. Without consistent message formats, routing rules, and completion signals, orchestration would become brittle, errors would propagate unpredictably, and debugging would be nearly impossible. Standardization guarantees that every agent, no matter when it was created, can interoperate seamlessly, enabling the Arbiter to maintain end-to-end visibility, traceability, and fault-tolerance across the entire system.

Event Handling Principles:

  • Event posting handled exclusively by Generic Wrapper, not individual agents
  • Ensures consistent event-driven communication patterns across all agents

Completion Event Structure:

  • orchestration_id: Workflow context linkage
  • tool_use_id: LLM tool invocation mapping
  • node: Agent identifier for tracking
  • data: Execution results or error information

Reliability Measures:

  • Publishes completion events to EventBridge for Arbiter processing
  • Guarantees workflow tracking receives completion signals regardless of execution outcome

Scalability Characteristics:

The hot-loading approach provides significant scalability benefits

  • Enables agent scaling creation without minimal infrastructure impact
  • S3 download latency acceptable within overall system performance profile
  • Single wrapper instance can execute multiple agent types
  • Memory and resource management handled at container level

Conclusion

The Arbiter Pattern represents a significant evolution beyond the Supervisor architecture, delivering the flexibility required for truly autonomous agentic systems. By introducing semantically rich, context-aware orchestration, it enables dynamic scalability, where agent capabilities grow in step with task demands. The architecture is resilient, redistributing or regenerating tasks when agents fail, and it achieves loose coupling by having agents interact through semantically meaningful events rather than rigid APIs. Most importantly, it embeds continuous adaptation through Arbiter-guided feedback loops, allowing systems to learn and evolve over time. This marks a shift from pre-programmed logic to generative, blackboard-model-based coordination, paving the way for decentralized, intelligent systems that can learn, adapt, and collaborate effectively at scale.

The system delivers several critical capabilities

  • Asynchronous Processing: SQS-based message passing for scalable execution
  • Persistent State Management (Short-term memory): DynamoDB-based workflow tracking
  • Scalability: Hot-loading architecture for unlimited agent creation
  • Intelligent Orchestration: LLM-driven task decomposition and sequencing
  • Self-Expanding Capabilities: Strands-based agent creation on demand
  • Standardized Communication: Reliable event-driven protocols

This architecture enables processing of arbitrary event types by dynamically creating necessary processing capabilities and coordinating their execution through LLM-driven workflow orchestration, while maintaining infrastructure efficiency through hot-loading patterns.


About the Authors

aaron sempfAaron Sempf is Next Gen Tech Lead for the AWS Partner Organization in Asia-Pacific and Japan. With over 20 years in distributed system engineering design and development, he focuses on solving for large scale complex integration and event driven systems. In his spare time, he can be found coding prototypes for autonomous robots, IoT devices, distributed solutions, and designing agentic architecture patterns for generative AI assisted business automation.

josh tothJoshua Toth is a Senior Prototyping Engineer with over a decade of experience in software engineering and distributed systems. He specializes in solving complex business challenges through technical prototypes, demonstrating the art of the possible. With deep expertise in proof of concept development, he focuses on bridging the gap between emerging technologies and practical business applications. In his spare time, he can be found developing next-generation interactive demonstrations and exploring cutting-edge technological innovations.

Accelerate AI agent development with the Nova Act IDE extension

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/accelerate-ai-agent-development-with-the-nova-act-ide-extension/

Today, I’m excited to announce the Nova Act extension — a tool that streamlines the path to build browser automation agents without leaving your IDE. The Nova Act extension integrates directly into IDEs like Visual Studio Code (VS Code), Kiro, and Cursor, helping you to create web-based automation agents using natural language with the Nova Act model.

Here’s a quick look at the Nova Act extension in Visual Studio Code:

The Nova Act extension is built on top of the Amazon Nova Act SDK (preview), our browser automation agents SDK (Software Development Kit). The Nova Act extension transforms traditional workflow development by eliminating context switching between coding and testing environments. You can now build, customize, and test production-grade agent scripts—all within your IDE—using features like natural language based generation, atomic cell-style editing, and integrated browser testing. This unified experience accelerates development velocity for tasks like form filling, QA automation, search, and complex multi-step workflows.

You can start with the Nova Act extension by describing your workflow in natural language to quickly generate an initial agent script. Customize it using the notebook-style builder mode to integrate APIs, data sources, and authentication, then validate it with local testing tools that simulate real-world conditions, including live step-by-step debugging of lengthy multi-step workflows.

Getting started with the Nova Act extension
First, I need to install the Nova Act extension from the extension manager in my IDE. 

I’m using Visual Studio Code, and after choosing Extensions, I enter Nova Act. Then, I select the extension and choose Install

To get started, I need to obtain an API key. To do this, I navigate to the Nova Act page and follow the instructions to get the API key. I select Set API Key by opening the Command Palette with Cmd+Shift+P / Ctrl+Shift+P.

After I’ve entered my API key, I can try Builder Mode. This is a notebook-style builder mode that breaks complex automation scripts into modular cells, allowing me to test and debug each step individually before moving to the next.

Here, I can use the Nova Act SDK to build my agent. On the right side, I have a Live view panel to preview my agent’s actions in the browser and an Output panel to monitor execution logs, including the model’s thinking and actions.

To test the Nova Act extension, I choose Run all cells. This will start a new browser instance and act based on the given prompt.

I choose Fullscreen to see how browser automation works.

Another useful feature in Builder Mode is that I can navigate to the Output panel and select the cell to see its logs. This helps me debug or review logs specific to the cell I’m working on.

I can also select a template to get started.

Besides using Builder Mode, I can also chat with Nova Act to create a script for me. To do that, I select the extension and choose Generate Nova Act Script. The Nova Act extension opens a chat dialog in the right panel and automatically creates a script for me.

After I finish creating the script, I can choose Start Builder Mode, and the Nova Act extension will help me create a Python file in Builder Mode. This creates a seamless integration because I can switch between chat capability and Builder Mode.

In the chat interface, I see three workflow modes available:

  • Ask: Describe tasks in natural language to generate automation scripts
  • Edit: Refine or customize generated scripts before execution
  • Agent: Run, monitor, and interact with the AI agent performing the workflow

I can also add Context to provide relevant information about my active documents, instructions, problems, or additional Model Context Protocol (MCP) resources the agent can use, plus a screenshot of the current window. Providing this information helps the agent understand any specific requirements for the automation task.

The Nova Act extension also provides a set of predefined templates that I can access by entering / in the chat. These templates are predefined automation scenarios designed to help quickly generate scripts for common web tasks.

I can use these templates (for example, @novaAct /shopping [my requirements]) to get tailored Python scripts for my workflow. At launch, Nova Act extension provides the following templates:

  • /shopping: Automates online shopping tasks (searching, comparing, purchasing)
  • /extract: Handles data extraction
  • /search: Performs search and information gathering
  • /qa: Automates quality assurance and testing workflows
  • /formfilling: Completes forms and data entry tasks

This extension transforms my agent development workflow by positioning Nova Act extension as a full-stack agent builder tool—a complete agent IDE for the entire development lifecycle. I can prototype with natural language, customize with modular scripting, and validate with local testing—all without leaving my IDE—ensuring production-grade scripts.

Things to know
Here are key points to note:

  • Supported IDEs: At launch, the Nova Act extension is available for Visual Studio Code, Cursor, and Kiro, with additional IDE support planned
  • Open source: The Nova Act extension is available under the Apache 2.0 license, allowing for community contributions and customization
  • Pricing: The Nova Act extension is available at no charge.

Get started with Nova Act extension by installing it from your IDE’s extension marketplace or visiting the GitHub repository for documentation and examples.

Happy automating!
Donnie

Streamline Spark application development on Amazon EMR with the Data Solutions Framework on AWS

Post Syndicated from Vincent Gromakowski original https://aws.amazon.com/blogs/big-data/streamline-spark-application-development-on-amazon-emr-with-the-data-solutions-framework-on-aws/

Today, organizations are heavily using Apache Spark for their big data processing needs. However, managing the entire development lifecycle of Spark applications—from local development to production deployment—can be complex and time-consuming. Managing the entire code base—including application code, infrastructure provisioning, and continuous integration and delivery (CI/CD) pipelines—is sometimes not fully automated and a shared responsibility across multiple teams, which slows down release cycles. This undifferentiated heavy lifting diverts valuable resources away from core business objectives: deriving value from data.

In this post, we explore how to use Amazon EMR, the AWS Cloud Development Kit (AWS CDK), and the Data Solutions Framework (DSF) on AWS to streamline the development process, from setting up a local development environment to deploying serverless Spark infrastructure, and implementing a CI/CD pipeline for automated testing and deployment.

By adopting this approach, developers gain full control over their code and the infrastructure responsible for running it, alleviating the need for cross-team dependency. Developers can customize the infrastructure to meet specific business needs and optimize performance. Additionally, they can customize CI/CD stages to facilitate comprehensive testing, using the self-mutation capability of AWS CDK Pipelines to automatically update and refine the deployment process. This level of control not only accelerates development cycles but also enhances the reliability and efficiency of the entire application lifecycle, so developers can focus more on innovation and less on manual infrastructure management.

Solution overview

The solution consists of the following key components:

  • The local development environment to develop and test your Spark code locally
  • The infrastructure as code (IaC) that will run your Spark application in AWS environments
  • The CI/CD pipeline running end-to-end tests and deploying into the different AWS environments

In the following sections, we discuss how to set up these components.

Prerequisites

To set up this solution, you must have an AWS account with appropriate permissions, Docker and the AWS CDK CLI.

Set up the local development environment

Developing Spark applications locally can be a challenging task due to the need for a consistent and efficient environment that mirrors your production setup. With Amazon EMR, Docker, and the Amazon EMR toolkit extension for Visual Studio Code, you can quickly set up a local development environment for Spark applications, developing and testing Spark code locally, and seamlessly port it to the cloud.

The Amazon EMR toolkit for VS Code includes an “EMR: Create Local Spark Environment” command that generates a development container. This container is based on an Amazon EMR on Amazon EKS image corresponding to the Amazon EMR version you select. You can develop Spark and PySpark code locally, with full compatibility with your remote Amazon EMR environment. Additionally, the toolkit provides helpers to make it straightforward to connect to the AWS Cloud, including an Amazon EMR explorer, an AWS Glue Data Catalog explorer, and commands to run Amazon EMR Serverless jobs from VS Code.

To set up your local environment, complete the following steps:

  1. Install VS Code and the Amazon EMR Toolkit for VS Code.
  2. Install and launch Docker.
  3. Create a local Amazon EMR environment in your working directory using the command EMR: Create Local Spark Environment.

Amazon EMR Toolkit bootstrap

  1. Choose PySpark, Amazon EMR 7.5, and the AWS Region you want to use, and choose an authentication mechanism.

Amazon EMR toolkit local environment

  1. Log in to Amazon ECR with your AWS credentials using the following command so you can download the Amazon EMR image:
aws ecr get-login-password --region us-east-1 \
    | docker login \
    --username AWS \
    --password-stdin \
    12345678910.dkr.ecr.us-east-1.amazonaws.com
  1. Now you can launch your dev container using the VS Code command Dev Containers: Rebuild and Reopen in container.

The container will install the latest operating system packages and run a local Spark history server on port 18080.

local Spark history server

The container provides spark-shell, spark-sql, and pyspark from the terminal and a Jupyter Python kernel for connecting a Jupyter notebook to execute interactive Spark code.

local Jupyter notebooks

Using the Amazon EMR Toolkit, you can develop your Spark application and test it locally using Pytest—for example, to validate the business logic. You can also connect to other AWS accounts where you have your development environment.

Build the AWS CDK application with DSF on AWS

After you validate the business logic into your local Spark application, you can implement the infrastructure responsible for running your application. DSF provides AWS CDK L3 Constructs that simplify the creation of Spark-based data pipelines on EMR Serverless or Amazon EMR on EKS.

DSF provides the capability to package your local PySpark application, including the Python dependencies, into artifacts that can consumed by EMR Serverless jobs. The PySparkApplicationPackage is a construct that uses a Dockerfile to perform the packaging of dependencies into a Python virtual environment archive and then upload the archive and the PySpark entrypoint file into a secured Amazon Simple Storage Service (Amazon S3) bucket. The following diagram illustrates this architecture.

PySparkApplicationPackage L3 construct

See the following example code:

spark_app = dsf.processing.PySparkApplicationPackage(
    self,
    "SparkApp",
    entrypoint_path="./../spark/src/agg_trip_distance.py",
    application_name="TaxiAggregation",
    # Path of the Dockerfile used to package the dependencies as a Python venv
    dependencies_folder='./../spark',
    # Path of the venv archive in the docker image
    venv_archive_path="/venv-package/pyspark-env.tar.gz",
    removal_policy=RemovalPolicy.DESTROY)

You just need to provide the paths for the following:

  • The PySpark entrypoint. This is the main Python script of your Spark application.
  • The Dockerfile containing the logic for packaging a virtual environment into an archive.
  • The path of the resulting archive in the container file system.

DSF provides helpers to connect the application package to the EMR Serverless job. The PySparkApplicationPackage construct exposes properties that can directly be used into the SparkEmrServerlessJob construct parameters. This construct simplifies the configuration of a batch job using an AWS Step Functions state machine. The following diagram illustrates this architecture.

EmrServerlessJob L3 construct

The following code is an example of an EMR Serverless job:

spark_job = dsf.processing.SparkEmrServerlessJob(
    self,
    "SparkProcessingJob",
    dsf.processing.SparkEmrServerlessJobProps(
        name=f"taxi-agg-job-{Names.unique_resource_name(self)}",
        # ID of the previously created EMR Serverless runtime
        application_id=spark_runtime.application.attr_application_id,
        # The IAM role used by the EMR Job with permissions required by the application
        execution_role=processing_exec_role,
        spark_submit_entry_point=spark_app.entrypoint_uri,
        # Add the Spark parameters from the PySpark package to configure the dependencies (using venv)
        spark_submit_parameters=spark_app.spark_venv_conf + spark_params,
        removal_policy=RemovalPolicy.DESTROY,
        schedule=schedule))

Note the two parameters of SparkEmrServerlessJob that are provided by PySparkApplicationPackage:

  • entrypoint_uri, which is the S3 URI of the entrypoint file
  • spark_venv_conf, which contains the Spark submit parameters for using the Python virtual environment

DSF also provides a SparkEmrServerlessRuntime to simplify the creation of the EMR Serverless application responsible for running the job.

Deploy the Spark application using CI/CD

The final step is to implement a CI/CD pipeline that can test your Spark code and promote from dev/test/stage and then to production. DSF provides a L3 Construct that simplifies the creation of the CI/CD pipeline for your Spark applications. DSF’s implementation of the Spark CI/CD pipeline construct uses the AWS CDK built-in pipeline functionality. One of the key capabilities when using an AWS CDK pipeline is its self-mutating capability. It can update itself whenever you change its definition, avoiding the traditional chicken-and-egg problem of pipeline updates and helping developers fully control their CI/CD pipeline.

When the pipeline runs, it follows a carefully orchestrated sequence. First, it retrieves your code from your repository and synthesizes it into AWS CloudFormation templates. Before doing anything else, it examines these templates to see if you’ve made any changes to the pipeline’s own structure. If the pipeline detects that its definition has changed, it will pause its normal operation and update itself first. After the pipeline has updated itself, it will continue with its regular stages, such as deploying your application.

DSF provides an opinionated implementation of CDK Pipelines for Spark applications, where the PySpark code is automatically unit tested using Pytest and where the configuration is simplified. You only need to configure four components:

  • The CI/CD stages (testing, staging, production, and so on). This includes the AWS account ID and Region where these environments reside in.
  • The AWS CDK stack that is deployed in each environment.
  • (Optional) The integration test script that you want to run against the deployed stack.
  • The SparkEmrCICDPipeline AWS CDK construct.

The following diagram illustrates how everything works together.

SparkCICDPipeline L3 construct

Let’s dive into each of these components.

Define cross-account deployment and CI/CD stages

With the SparkEmrCICDPipeline construct, you can deploy your Spark application stack across different AWS accounts. For example, you can have a separate account for your CI/CD processes and different accounts for your staging and production environments.To set this up, first bootstrap the various AWS accounts (staging, production, and so on):

cdk bootstrap --profile <ENVIRONMENT_ACCOUNT_PROFILE> \ 
    aws://<ENVIRONMENT_ACCOUNT_ID&gt;/&lt;REGION> \ 
    --trust <CICD_ACCOUNT_ID> \ 
    --cloudformation-execution-policies "POLICY_ARN"

This step sets up the necessary resources in the environment accounts and creates a trust relationship between those accounts and the CI/CD account where the pipeline will run.Next, choose between two options to define the environments (both options require the relevant configuration in the cdk.context.json file.The first option is to use pre-defined environments, which is defined as follows:

{ 
    "staging": { 
        "account": "<STAGING_ACCOUNT_ID>", 
        "region": "<REGION>" 
    }, 
    "prod": { 
        "account": "<PROD_ACCOUNT_ID>", 
        "region": "<REGION>" 
    } 
}

Alternatively, you can use user-defined environments, which is defined as follows:

{
   "environments":[
      {
         "stageName":"<STAGE_NAME_1>",
         "account":"<STAGE_ACCOUNT_ID>",
         "region":"<REGION>",
         "triggerIntegTest":"<OPTIONAL_BOOLEAN_CAN_BE_OMMITTED>"
      },
      {
         "stageName":"<STAGE_NAME_2>",
         "account":"<STAGE_ACCOUNT_ID>",
         "region":"<REGION>",
         "triggerIntegTest":"<OPTIONAL_BOOLEAN_CAN_BE_OMMITTED>"
      },
      {
         "stageName":"<STAGE_NAME_3>",
         "account":"<STAGE_ACCOUNT_ID>",
         "region":"<REGION>",
         "triggerIntegTest":"<OPTIONAL_BOOLEAN_CAN_BE_OMMITTED>"
      }
   ]
}

Customize the stack to be deployed

Now that the environments have been bootstrapped and configured, let’s look at the actual stack that contains the resources that will be deployed in the various environments. Two classes must be implemented:

  • A class that extends the stack – This is where the resources that are going to be deployed in each of the environments are defined. This can be a normal AWS CDK stack, but it can be deployed in another AWS account depending on the environment configuration defined in the previous section.
  • A class that extends ApplicationStackFactory – This is DSF specific, and makes it possible to configure and then return the stack that is created.

The following code shows a full example:

class MyApplicationStack(cdk.Stack): 
    def __init__(self, scope, *, stage): 
        super().__init__(scope, "MyApplicationStack") 
        bucket = Bucket(self, "TestBucket",
                        auto_delete_objects=True, 
                        removal_policy=cdk.RemovalPolicy.DESTROY) 
        cdk.CfnOutput(self, "BucketName", value=bucket.bucket_name) 
        
class MyStackFactory(dsf.utils.ApplicationStackFactory): 
    def create_stack(self, scope, stage): 
        return MyApplicationStack(scope, stage=stage)

ApplicationStackFactory supports customization of the stack before returning the initialized object to be deployed by the CI/CD pipeline. You can customize your stack behavior by passing the current stage to your stack. For example, you can skip scheduling the Spark application in the integration tests stage because the integration tests trigger it manually as part of the CI/CD pipeline. For the production stage, the scheduling facilitates automatic execution of the Spark application.

Write the integration test script

The integration test script is a bash script that is triggered after the main application stack has been deployed. Inputs to the bash script can come from the AWS CloudFormation outputs of the main application stack. These outputs are mapped into environment variables that the bash script can access directly.

In the Spark CI/CD example, the application stack uses the SparkEMRServerlessJob CDK construct. This construct uses a Step Functions state machine to manage the execution and monitoring of the Spark job. The following is an example integration test bash script that we use to test that the deployed stack can run the associated Spark job successfully:

#!/bin/bash 
EXECUTION_ARN=$(aws stepfunctions start-execution --state-machine-arn $STEP_FUNCTION_ARN | jq -r '.executionArn')

while true 
do 
    STATUS=$(aws stepfunctions describe-execution --execution-arn $EXECUTION_ARN | jq -r '.status') 
    if [ $STATUS = "SUCCEEDED" ]; then 
        exit 0 
    elif [ $STATUS = "FAILED" ] || [ $STATUS = "TIMED_OUT" ] || [ $STATUS = "ABORTED" ]; then 
        exit 1 
    else 
        sleep 10
        continue 
    fi
done

The integration test scripts are executed within an AWS CodeBuild project. As part of the IntegrationTestStack, we’ve included a custom resource that periodically checks the status of the integration test script as it runs. Failure of the CodeBuild execution causes the parent pipeline (residing in the pipeline account) to fail. This helps teams only promote changes that pass all the required testing.

Bring all the components together

When you have your components ready, you can use the SparkEmrCICDPipeline to bring them together. See the following example code:

dsf.processing.SparkEmrCICDPipeline(
    self,
    "SparkCICDPipeline",
    spark_application_name="SparkTest",
    # The Spark image to use in the CICD unit tests
    spark_image=dsf.processing.SparkImage.EMR_7_5,
    # The factory class to dynamically pass the Application Stack
    application_stack_factory=SparkApplicationStackFactory(),
    # Path of the CDK python application to be used by the CICD build and deploy phases
    cdk_application_path="infra",
    # Path of the Spark application to be built and unit tested in the CICD
    spark_application_path="spark",
    # Path of the bash script responsible to run integration tests 
    integ_test_script='./infra/resources/integ-test.sh',
    # Environment variables used by the integration test script, value is the CFN output name
    integ_test_env={
        "STEP_FUNCTION_ARN": "ProcessingStateMachineArn"
    },
    # Additional permissions to give to the CICD to run the integration tests
    integ_test_permissions=[
        PolicyStatement(
            actions=["states:StartExecution", "states:DescribeExecution"
            ],
            resources=["*"]
        )
    ],
    source= CodePipelineSource.connection("your/repo", "branch",
        connection_arn="arn:aws:codeconnections:us-east-1:222222222222:connection/7d2469ff-514a-4e4f-9003-5ca4a43cdc41"
    ),
    removal_policy=RemovalPolicy.DESTROY,
)

The following elements of the code are worth highlighting:

  • With the integ_test_env parameter, you can define the environment variable mapping with the output of your application stack that’s defined in the application_stack_factory parameter
  • The integ_test_permissions parameter specifies the AWS Identity and Access Management (IAM) permissions that are attached to the CodeBuild project where the integration test script runs in
  • CDK Pipelines needs an AWS code connection Amazon Resource Name (ARN) to connect to your Git repository when you host your code

Now you can deploy the stack containing the CI/CD pipeline. This is a one-time operation because the CI/CD pipeline will dynamically be updated based on code changes that impact the CI/CD pipeline itself:

cd infra 
cdk deploy CICDPipeline

Then you can commit and push the code into the source code repository defined in the source parameter. This step triggers the pipeline and deploys the application in the configured environments. You can check the pipeline definition and status on the AWS CodePipeline console.

AWS CodePipeline

You can find the full example on the Data Solutions Framework GitHub repository.

Clean up

Follow the readme guide to delete the resources created by the solution.

Conclusion

By using Amazon EMR, the AWS CDK, DSF on AWS, and the Amazon EMR toolkit, developers can now streamline their Spark application development process. The solution described in this post helps developers gain full control over their code and infrastructure, making it possible to set up local development environments, implement automated CI/CD pipelines, and deploy serverless Spark infrastructure across multiple environments.

DSF supports other patterns, such as streaming governance and data sharing and Amazon Redshift data warehousing. The DSF roadmap is publicly available, and we look forward to your feature requests, contributions, and feedback. You can get started using DSF by following our Quick start guide.

 


About the authors

Jan Michael Go Tan

Jan Michael Go Tan

Jan is a Principal Solutions Architect for Amazon Web Services. He helps customers design scalable and innovative solutions with the AWS Cloud.

Vincent Gromakowski

Vincent Gromakowski

Vincent is an Analytics Specialist Solutions Architect at AWS where he enjoys solving customers’ analytics, NoSQL, and streaming challenges. He has a strong expertise on distributed data processing engines and resource orchestration platform.

Lotfi Mouhib

Lotfi Mouhib

Lotfi is a Principal Solutions Architect working for the Public Sector team with Amazon Web Services. He helps public sector customers across EMEA realize their ideas, build new services, and innovate for citizens. In his spare time, Lotfi enjoys cycling and running.

Accelerate serverless testing with LocalStack integration in VS Code IDE

Post Syndicated from Micah Walter original https://aws.amazon.com/blogs/aws/accelerate-serverless-testing-with-localstack-integration-in-vs-code-ide/

Today, we’re announcing LocalStack integration in the AWS Toolkit for Visual Studio Code that makes it easier than ever for developers to test and debug serverless applications locally. This enhancement builds upon our recent improvements to the AWS Lambda development experience, including the console to IDE integration and remote debugging capabilities we launched in July 2025, continuing our commitment to simplify serverless development on Amazon Web Services (AWS).

When building serverless applications, developers typically focus on three key areas to streamline their testing experience: unit testing, integration testing, and debugging resources running in the cloud. Although AWS Serverless Application Model Command Line Interface (AWS SAM CLI) provides excellent local unit testing capabilities for individual Lambda functions, developers working with event-driven architectures that involve multiple AWS services, such as Amazon Simple Queue Service (Amazon SQS), Amazon EventBridge, and Amazon DynamoDB, need a comprehensive solution for local integration testing. Although LocalStack provided local emulation of AWS services, developers had to previously manage it as a standalone tool, requiring complex configuration and frequent context switching between multiple interfaces, which slowed down the development cycle.

LocalStack integration in AWS Toolkit for VS Code
To address these challenges, we’re introducing LocalStack integration so developers can connect AWS Toolkit for VS Code directly to LocalStack endpoints. With this integration, developers can test and debug serverless applications without switching between tools or managing complex LocalStack setups. Developers can now emulate end-to-end event-driven workflows involving services such as Lambda, Amazon SQS, and EventBridge locally, without needing to manage multiple tools, perform complex endpoint configurations, or deal with service boundary issues that previously required connecting to cloud resources.

The key benefit of this integration is that AWS Toolkit for VS Code can now connect to custom endpoints such as LocalStack, something that wasn’t possible before. Previously, to point AWS Toolkit for VS Code to their LocalStack environment, developers had to perform manual configuration and context switching between tools.

Getting started with LocalStack in VS Code is straightforward. Developers can begin with the LocalStack Free version, which provides local emulation for core AWS services ideal for early-stage development and testing. Using the guided application walkthrough in VS Code, developers can install LocalStack directly from the toolkit interface, which automatically installs the LocalStack extension and guides them through the setup process. When it’s configured, developers can deploy serverless applications directly to the emulated environment and test their functions locally, all without leaving their IDE.

Let’s try it out
First, I’ll update my copy of the AWS Toolkit for VS Code to the latest version. Once, I’ve done this, I can see a new option when I go to Application Builder and click on Walkthrough of Application Builder. This allows me to install LocalStack with a single click.

Once I’ve completed the setup for LocalStack, I can start it up from the status bar and then I’ll be able to select LocalStack from the list of my configured AWS profiles. In this illustration, I am using Application Composer to build a simple serverless architecture using Amazon API Gateway, Lambda, and DynamoDB. Normally, I’d deploy this to AWS using AWS SAM. In this case, I’m going to use the same AWS SAM command to deploy my stack locally.

I just do `sam deploy –guided –profile localstack` from the command line and follow the usual prompts. Deploying to LocalStack using AWS SAM CLI provides the exact same experience I’m used to when deploying to AWS. In the screenshot below, I can see the standard output from AWS SAM, as well as my new LocalStack resources listed in the AWS Toolkit Explorer.

I can even go in to a Lambda function and edit the function code I’ve deployed locally!

Over on the LocalStack website, I can login and take a look at all the resources I have running locally. In the screenshot below, you can see the local DynamoDB table I just deployed.

Enhanced development workflow
These new capabilities complement our recently launched console-to-IDE integration and remote debugging features, creating a comprehensive development experience that addresses different testing needs throughout the development lifecycle. AWS SAM CLI provides excellent local testing for individual Lambda functions, handling unit testing scenarios effectively. For integration testing, the LocalStack integration enables testing of multiservice workflows locally without the complexity of AWS Identity and Access Management (IAM) permissions, Amazon Virtual Private Cloud (Amazon VPC) configurations, or service boundary issues that can slow down development velocity.

When developers need to test using AWS services in development environments, they can use our remote debugging capabilities, which provide full access to Amazon VPC resources and IAM roles. This tiered approach frees up developers to focus on business logic during early development phases using LocalStack, then seamlessly transition to cloud-based testing when they need to validate against AWS service behaviors and configurations. The integration eliminates the need to switch between multiple tools and environments, so developers can identify and fix issues faster while maintaining the flexibility to choose the right testing approach for their specific needs.

Now available
You can start using these new features through the AWS Toolkit for VS Code by updating to v3.74.0. The LocalStack integration is available in all commercial AWS Regions except AWS GovCloud (US) Regions. To learn more, visit the AWS Toolkit for VS Code and Lambda documentation.

For developers who need broader service coverage or advanced capabilities, LocalStack offers additional tiers with expanded features. There are no additional costs from AWS for using this integration.

These enhancements represent another significant step forward in our ongoing commitment to simplifying the serverless development experience. Over the past year, we’ve focused on making VS Code the tool of choice for serverless developers, and this LocalStack integration continues that journey by providing tools for developers to build and test serverless applications more efficiently than ever before.

AWS Cloud Development Kit (CDK) Launches Refactor

Post Syndicated from Natalie White original https://aws.amazon.com/blogs/devops/aws-cloud-development-kit-cdk-launches-refactor/

We are excited to announce a new AWS Cloud Development Kit (CDK) feature that makes it easier and safer to refactor your infrastructure as code. CDK Refactor aims to preserve your AWS resources as you rename constructs, move resources between stacks, and reorganize your CDK applications – operations that previously risked resource replacement.

When writing infrastructure as code with the CDK, developers occasionally need to rename Constructs or move them between Stacks or directories. Whether they need to better organize their code, adhere to coding best practices, or take advantage of object-oriented programming patterns like class inheritance, these changes can be risky in environments with deployed resources, because they change the CDK-generated logical ID of those resources. During a CDK deploy, AWS CloudFormation interprets these changes as new resources, which often requires deletion of the existing resource and creation of a new resource with the new logical ID. For stateful resources, this could cause potential downtime and even data loss. To mitigate this effect of ID changes, developers had to stage their changes to create new resources, create a data or network migration plan, and then delete the old resources to prevent these refactoring effects. Sometimes, developers decide the risk of these changes outweigh the benefit of the refactor and choose not perform the refactor at all.

Today, developers can use the new CDK refactor command to detect, review, confirm, and safely apply refactored changes to their resources without resource replacement. This feature leverages the recently-launched AWS CloudFormation refactor feature, but the CDK automatically computes the mappings that CloudFormation needs to redefine the refactored resources, providing a layer of abstraction that allows developers to focus on code rather than resource configuration. Let’s walk through an example to demonstrate the benefits of this refactor capability.

Prerequisites

Along with the usual CDK prerequisites, if you bootstrapped your CDK project before this launch, you need to re-bootstrap your environment to obtain the new permissions associated with the CDK refactor capabilities before attempting your refactor.

Monolith to micro-service example

For this example, let’s say that we have a legacy CDK App that deploys a monolithic Stack with Amazon DynamoDB tables for users, products, and orders, and an AWS Lambda function that implements CRUD operations on all entities.

Architecture diagram of a monolithic application with an API Gateway, single Lambda function, and three DynamoDB tables for users, products, and orders. The application is contained in a single CDK Stack within the CDK App.

Monolithic application

function monolithApp() {
  const monolith = new CdkAppStack(app, monolithStackName, {env});
  const usersTable = makeTable(monolith, 'users');
  const productsTable = makeTable(monolith, 'products');
  const ordersTable = makeTable(monolith, 'orders');

  // We have a single Lambda function in our application
  const func = new Function(monolith, `MonolithFunction`, {
    code: Code.fromInline(`Some code that accesses all three tables`),
    runtime: Runtime.NODEJS_22_X,
    handler: 'index.handler',
  });

  usersTable.grantReadWriteData(func);
  productsTable.grantReadWriteData(func);
  ordersTable.grantReadWriteData(func);

  // This function creates a REST API, resources, methods, and links
  // everything together to the functions. Right now, we are passing
  // the same function in three places.
  makeApi(monolith, {
    usersFunction: func,
    productsFunction: func,
    ordersFunction: func,
  });
}

monolithApp();

We’ve been asked to adhere to Well Architected Framework best practices and break up the monolith into separate Lambda functions so they can scale independently. Because they’re so similar, we’re also going to create an inheritable Lambda class that we can reuse to improve readability and maintainability of the code, and avoid having to re-define Lambda configuration settings that are consistent across all of the functions.

Finally, the monolith uses only L1 CDK Constructs. To further abstract our code and take advantage of helper functions, we’re going to start using L2 CDK Constructs for DynamoDB, Lambda, and API Gateway. This change will allow the IAM Roles and permissions to be defined automatically, further simplifying our code.

Architecture diagram of the same application refactored into four child stacks: one for the APIGateway and three for the user, product, and order domains. Each child Stack has its own respective Lambda function and DynamoDB table.

Proposed refactored application into separate stacks for each domain.

Without the refactor feature, CloudFormation would delete and re-create the Lambda and DynamoDB resources, which would cause all of the data in the latter to be lost. Alternatively, you could create net-new Lambdas and Amazon DynamoDB tables in one deployment, execute an out-of-band, point-in time and streaming data migration from the old tables to the new ones, update the API Gateway configuration to target the new Lambdas in a second deploy, and turn off the streaming migration process.

With the refactor feature, we can move the resource definitions to new files, update them to L2 Constructs, and leave the stateful resources in place!

Replace stateless resources
First, let’s refactor our CDK code to break the monolithic Lambda into 3 domain-specific Lambdas. CloudFormation’s refactor capability doesn’t support creating new resources or updating configuration of existing resources, so we will deploy these changes as usual, without using the new refactor feature. All resources will stay inside the monolithic stack for now.

Architecture diagram of the same application with the single Lambda function broken into separate functions corresponding to the three DynamoDB tables for users, products, and orders. All resources are still contained in the single API Stack and CDK App.

Refactor stateless single lambda function into 3 functions as a prerequisite to the refactor of stateful DynamoDB tables.

function singleStackMicroservicesApp() {
  // We still have a single stack
  const monolith = new CdkAppStack(app, monolithStackName, {env});

  // makeFunctionAndTable creates a different Lambda function and a DynamoDB table
  // for each domain that is passed as a parameter.
  // In a real CDK application, you would probably define each of them independently.
  makeApi(monolith, {
    usersFunction: makeFunctionAndTable(monolith, 'users'),
    productsFunction: makeFunctionAndTable(monolith, 'products'),
    ordersFunction: makeFunctionAndTable(monolith, 'orders'),
  });
}

singleStackMicroservicesApp();

Refactor stateful resources
Now we can refactor the stateful DynamoDB tables and their respective Lambdas to their own stacks, using cdk refactor to map their new IDs without replacing the resources.

Before refactoring, though, we need to create the new stacks that will receive the functions and tables:

  singleStackMicroservicesApp();

  const usersStack = new Stack(app, 'Users', {env});
  const productsStack = new Stack(app, 'Products', {env});
  const ordersStack = new Stack(app, 'Orders', {env});
Architecture diagram of the final refactored application with four child stacks: one for the APIGateway and three for the user, product, and order domains. Each child Stack has its own respective Lambda function and DynamoDB table.

Refactored Lambda functions and DynamoDB tables into their own separate stacks.

function fullMicroservicesApp() {
    const monolith = new Stack(app, monolithStackName, {env});

    const usersStack = new Stack(app, 'Users', {env});
    const productsStack = new Stack(app, 'Products', {env});
    const ordersStack = new Stack(app, 'Orders', {env});

    makeApi(monolith, {
        // Now each pair function + table is in its own stack
        usersFunction: makeFunctionAndTable(usersStack, 'users'),
        productsFunction: makeFunctionAndTable(productsStack, 'products'),
        ordersFunction: makeFunctionAndTable(ordersStack, 'orders'),
    });
}

fullMicroservicesApp();

Running cdk refactor –unstable=refactor starts the process. (The unstable flag is required as this feature is still subject to breaking changes.) The CDK will compare the current state of your application (the deployed monolithic app) with the new state (the output of your refactored CDK application).

Screenshot of the CDK CLI confirmation dialog while executing the refactor command. The output has three columns: Resource Type, Old Construct Path, and New Construct Path, with rows of resources that will be refactored from one ID to another. The confirmation prompt asks "Do you want to refactor these resources? (yes/no)"

CDK refactor confirmation dialog

As expected, it shows a table of resources that were moved from the Monolith stack to their respective refactored stacks. By default, the CLI asks for confirmation before proceeding. Bypass the confirmation by passing the –force flag, or confirm the changes and execute the refactor:
All resources, including the stateful tables, were safely moved to other stacks, and we now have our well-architected application.

Screenshot of the CDK CLI results after completing the refactor command. The output is similar to the confirmation with three columns: Resource Type, Old Construct Path, and New Construct Path, with rows of resources that were refactored from one ID to another.

CDK refactor results

Conclusion
With the CDK refactor feature, developers can take full advantage of the object-oriented definition of AWS resources, including the ability to change the structure and layers of abstraction without orchestrating complex migration mechanisms or scheduled downtime. Since the CDK is open source, you can learn more about how the CDK automatically determines what resources need to be refactored via the README. Understanding when resources need to be replaced and refactored will help you plan your infrastructure as code roadmap and when you should use this new refactoring capability.

If you’ve got a refactor that you’ve been waiting to execute, read more about the feature set in the CDK refactor documentation and start refactoring your own CDK App today!

Authors

Natalie White

Natalie White is a Principal Solutions Architect at Amazon Web Services. She helps Healthcare and Life Sciences customers deploy solutions to AWS, and uses her software development background to accelerate infrastructure as code automation using the AWS Cloud Development Kit (CDK). She also consults engineering leaders on DevOps cultural transformations.

Otavio Macedo

Otavio Macedo is a Software Development Engineer at Amazon Web Services. He has been with the AWS Cloud Development Kit (CDK) team since 2021, helping deliver the unique experience that CDK provides for customers.

Accelerating local serverless development with console to IDE and remote debugging for AWS Lambda

Post Syndicated from Brian Krygsman original https://aws.amazon.com/blogs/compute/accelerating-local-serverless-development-with-console-to-ide-and-remote-debugging-for-aws-lambda/

Delightful developer experience is an important part of building serverless applications efficiently, whether you’re creating an automation script or developing a complex enterprise application. While AWS Lambda has transformed modern application development in the cloud with its serverless computing model, developers spend significant time working in their local environments. They rely on familiar IDEs, debugging tools, testing frameworks, and build within established organizational workflows to deliver production-ready applications.

This post covers some recent enhancements to local developer experience. Two new Lambda features, namely console to IDE and remote debugging, further bridge the gap between cloud and local development, enabling you to leverage the full power of your local tools while working with Lambda functions in the cloud.

Overview

Serverless development with Lambda spans both cloud and local environments, each with its unique strengths. While the Lambda console offers rapid deployment and prototyping, local development provides the depth and flexibility needed for a complex application development workflow that includes integration testing, deployment to shared environments, continuous integration/continuous deployment (CI/CD) pipelines, and collaboration with other team members. The local developer experience encompasses the tools, workflows, and practices that developers use on their local devices to build and maintain their applications. An intuitive local development experience helps application development teams achieve high productivity, ensure code quality, and confidently ship changes to production.

Recent local serverless development experience enhancements

Local development workflows can be seen as two distinct but interconnected loops: the inner loop of writing, testing, and debugging code locally, and the outer loop that extends to cloud deployment, integration testing, release pipeline, and monitoring, as shown in the following figure. For serverless applications, developers want immediate feedback within the inner loop, as they iterate on function code and test integrations with AWS services. AWS has been steadily enhancing the local development experience for developers building on Lambda, with a focus on accelerating the inner loop, where developers spend most of their time.

DevOps workflow diagram showing interconnected local development and cloud deployment cycles with feedback loops

Figure 1: Inner and outer loop

Visual Studio Code (VS Code) is the most popular IDE among developers according to the 2024 Stack Overflow Developer Survey. Enhanced local IDE experience enables developers to code, test, debug, and deploy Lambda-based serverless applications more efficiently in their local IDE when using VS Code. It introduced the Application Builder interface, which streamlines the entire development workflow from setup to deployment with features such as guided walkthrough for environment setup, pre-configured sample applications, build setting management, and improved local debugging capabilities. This eliminates the need to switch between multiple interfaces. This experience also integrates with AWS Infrastructure Composer, which enables visual application building directly from VS Code, and provides quick-action buttons for common tasks like building, deploying, and invoking functions both locally and in the cloud.

AWS development environment setup wizard showing required tools installation process and local development options

Figure 2 Guided walkthrough in VS Code IDE

With Serverless Land’s extensive ready-to-use pattern library available directly in VS Code, you can now browse, search, and implement a collection of curated, pre-built serverless patterns without leaving the IDE. This integration makes it easier to use proven architectures and AWS best practices while building serverless applications. Amazon CloudWatch Logs Live Tail support for Lambda functions in VS Code brings real-time log streaming and analytics capabilities directly into the IDE, enabling you to monitor and troubleshoot your Lambda functions without context switching. Whether testing a new feature or debugging an issue, you can now see the immediate impact of your code changes without leaving the IDE.

Console to IDE

Over the past decade, the Lambda console has enabled developers to quickly get started with writing Lambda functions, allowing them to rapidly iterate through changing code, testing, and deploying their functions. The console IDE experience saw a major usability refresh in 2024, including the introduction of Amazon Q Developer in the Lambda console.

As applications grow in complexity, developers often need to refactor code, add complex logic, include utility libraries as dependencies, or handle edge cases in their Lambda functions. Examples include using external libraries for complex time calculations or adding modules that perform caller-specific validations. This can make functions too bulky to manage in the console.

Developers may also want to move their functions into a software development lifecycle (SDLC) process that includes test frameworks, security scanning tools, infrastructure as code (IaC) templates, or CI/CD pipelines. This may necessitate that they use version control to collaborate across the team or develop with an AI agent steered by custom rules.

Previously, setting this up required manually configuring a local development environment, including IDE, language runtime, and build/package toolchains. Then, you had to download your function code, configuration, and integration settings and copy them into the IDE. You also had to create the required IaC template with AWS Serverless Application Model (AWS SAM). Only then could you deploy to the cloud to validate the accuracy of your code and configuration and continue with your development workflow.

The new Lambda console to IDE feature enables seamless transition from a cloud-hosted code/test cycle to a local environment, allowing you to download your function code and configuration to local VS Code IDE with just one click. From there, you can easily add dependencies and commit code into source control. Furthermore, you can sync back to the cloud for deployment or export a full AWS SAM template with the “Convert to SAM” capability and continue managing your function as if you had started locally. Console to IDE guides you through setting up the IDE on your local device, if you don’t already have one, along with any necessary configuration. The following figures show a function open in the Lambda console and thereafter in the local VS Code IDE.

AWS Lambda console showing Python function code, IoT integration, test events, and configuration settings for temperature monitoring

Figure 3: A Lambda function as seen in the console IDE

AWS Lambda local development interface showing Python IoT temperature monitoring code, terminal, and getting started guide

Figure 4: The same Lambda function as seen in a local IDE after Console to IDE export

By making it easy to transition inner loop development between cloud and local development environments, the console to IDE feature makes it easy to quickly scale an idea from proof-of-concept to a full-fledge serverless application. Visit the Lambda documentation to learn more.

Remote debugging

Developers building serverless applications with Lambda often need to test and debug cross-service integrations. While local debugging tools offer valuable capabilities, they do not fully replicate the Lambda runtime environment and its interactions with other AWS services, especially when dealing with Amazon Virtual Private Cloud (VPC) resources and AWS Identity and Access Management (IAM) permissions. Therefore, developers had to rely on print statements and verbose logging, and for complex scenarios they had to deploy their functions multiple times to diagnose and resolve issues. This process extended development cycles, particularly when troubleshooting issues specific to the production environment. Developers wished they could use advanced local development tools like debuggers to investigate issues with code running in Lambda functions deployed in the cloud.

Lambda’s new remote debugging feature now enables you to debug your functions running in the cloud directly from your local VS Code IDE using the AWS Toolkit extension. You can now debug the execution environment of the function running in the cloud in its IAM execution role’s security context with access to configured VPC resources, and trace execution through entire service flows in the cloud.

To start debugging, enable Remote debugging when invoking your function through the AWS Toolkit. Configure your local code path and payload and choose Remote Invoke. AWS Toolkit automatically adds an AWS-managed debugging Lambda layer to your function, extends the timeout, publishes a temporary version, and reverts the config change. AWS Toolkit then invokes the published debug version. You can then start debugging. This feature establishes a secure connection between your local debugger and the function running in the cloud using AWS IoT Secure Tunneling. When your debug session is finished, Lambda automatically removes the temporary function version. You can end your debug session explicitly. Otherwise, it will end automatically after 60 seconds of inactivity or when the Lambda function timeout is reached.

The following figure shows how setting a breakpoint in VS Code IDE during a remote debugging session pauses execution so that you can inspect the data with which the function running in the cloud is called, along with your function’s variables. You can continue to step forward from this point line-by-line to follow the function’s execution.

AWS Lambda debug environment displaying IoT temperature monitoring code, variable inspection, and execution logs with breakpoint paused state

Figure 5: VS Code IDE debugger attached to execution environment of a Lambda function running in the cloud

All of this means that you don’t have to set up local emulators to approximate cloud behavior, manage complex test frameworks, or continuously capture expensive logs with TRACE-level detail to understand how your code executes. Your debugger can show you exactly what invocation parameters look like, such as event and context, when they reach your function handler. You can step through how your function behaves for different inputs and inspect variable values along the way. Since your code is running in the cloud, you can even see how your function’s IAM execution role affects its behavior. As you step through, you can immediately see when an AWS SDK service call fails due to lack of permissions.

Moreover, you can combine this with the console to IDE feature described previously in this post. When you’ve downloaded your function and scaffolded your local environment with console to IDE, you can debug the function as it runs in the cloud with remote debugging. This gives you much more visibility into the Lambda developer experience, which helps you find issues more easily, fix bugs quickly, and deliver new features rapidly. Follow the steps in the documentation to get started.

Best practices

Although the improved developer experience enables you to move faster when building serverless applications using Lambda, you should incorporate AWS-recommended best practices into your application development workflow.

For large or complex functions, refactor the code following the programming language norms so that developers and AI agents can better understand it. For example, move complex business logic, such as inventory calculations, out of the function handler into a separate module. Console to IDE allows you to use your local refactoring tools to refactor function code.

For isolated cost allocation and security boundaries between development and production, use separate AWS environments for different stages of your development process. You can use console to IDE to generate an AWS SAM template for your application with properties of your function and related AWS resources, which streamlines consistent cross-environment deployments. Then, you can then automate deployments of your template and function code with a CI/CD pipeline.

During development, you should test your functions in the cloud when you can. Remote debugging makes it easier to test functions running in the cloud from your local environment, allowing you to step through your code to validate logic and least-privilege function execution permissions. To optimize cost, focus on logging just enough to recreate problem scenarios, including necessary context about function execution, rather than logging everything you need to diagnose behavior. This also means that you have smaller log volumes to sift through.

You should recreate problem scenarios in an environment where you control the flow of input and can use remote debugging. When possible, you should use a development environment where there are no other sources of invokes. There’s a small window while remote debugging applies the temporary config change where other traffic to $LATEST might cause unexpected results, such as a slower cold start. By default, the debugger does not initialize when running on $LATEST. You should also use Aliases and Versions to explicitly pin environments to the appropriate version of a function, which avoids this problem and gives you more deterministic behavior along with the ability to do canary deployments.

Conclusion

The local development experience enhancements, including debugging workflows and IDE integrations, minimize the configuration and setup needed for developers to locally build serverless applications using Lambda. This enables developers to focus on building business logic. These enhancements also provide the rapid feedback loop developers need while making sure that their local environment accurately reflects cloud behavior.

AWS continues to streamline the local developer experience for serverless applications in areas such as local testing of service integrations, IaC workflows, troubleshooting capabilities, and using AI assistance more deeply in local development workflows. All of this helps developers build more efficient and secure serverless applications.

To get started with these new capabilities, visit the Lambda developer guide for detailed walkthroughs and best practices. Share your experiences and suggestions through the Lambda GitHub issues page to help shape the future of serverless developer experience.

For more serverless learning resources, visit Serverless Land. Likewise, check out this video from an AWS Community Builder showcasing the latest capabilities.

Introducing an Interactive Code Review Experience with Amazon Q Developer in GitHub

Post Syndicated from Sundaresh Iyer original https://aws.amazon.com/blogs/devops/introducing-an-interactive-code-review-experience-with-amazon-q-developer-in-github/

Code reviews are one of the most valuable rituals in software development. They help ensure quality, maintain consistency, and foster growth as engineers. But they’re also one of the most time consuming steps in the software development lifecycle. A common pattern I’ve seen is a developer opening a pull request (PR), receiving automated or peer comments, and then needing to search through documentation, Slack threads, or past code just to understand why a change was suggested. That search for missing context creates a friction that slows teams down, adds back-and-forth cycles, and often distracts from the bigger picture of building great products.

In the initial preview experience, teams used Amazon Q Developer in GitHub across issues and PRs for feature work, automated code reviews, and common modernization tasks. This kept work inside GitHub and reduced handoffs. Automatic reviews on new or reopened PRs surfaced findings early, but teams still wanted more context and a tighter loop inside the PR.

Today we’re introducing an interactive code review experience for PRs You can ask Amazon Q Developer questions about any finding using /q, see a concise summary with threaded findings, and apply suggested changes without leaving GitHub. Code reviews by Amazon Q Developer now complete quicker than before, which reduces wait time and shortens the review cycle so teams can merge sooner and spend more time building.

What’s new and why it matters

  • Interactive Conversations in the pull request: Comment with /q to get inline answers, or ask Q Developer to propose a code change you can apply in the PR. For example:/q explain this finding or /q propose a change that replaces class toggles with a data attribute for state.
  • Code review summaries with threaded findings: Each code review begins with a concise summary and findings are threaded underneath. This makes updates easier to follow and reduces noise.
  • Faster execution with clearer notifications: Amazon Q Developer completes its analysis quicker and notifications are organized and easier to scan. This reduces wait time and shortens the review cycle.
    When you create or open a new PR, Amazon Q Developer automatically starts a code review if the code review feature is enabled for your GitHub installation in the Amazon Q Developer console. Subsequent commits do not trigger another automatic review. To run a fresh analysis, post /q review as a new comment on the PR.

Getting Started with Amazon Q Developer in GitHub

To get started, install the Amazon Q Developer GitHub App in your GitHub organization or repository. The app is available through the GitHub Marketplace and can be used without an AWS account during the preview. During installation, you choose whether to provide access to all repositories or only selected repositories in your GitHub organization. You can increase free usage by registering the app installation in the Amazon Q Developer console.
For more details on installation, permissions, and configuration options, see the Amazon Q Developer for GitHub documentation. Once the app is installed, you can begin using Q Developer to review PRs automatically.

Using Amazon Q Developer in Pull Requests

To dive deeper, here’s an end-to-end walk-through of the new interactive code review experience using a simple card game I built with Amazon Q Developer.

  1. Create a new pull request : In this example, I started by creating a feature branch and named it demo, added atailwind.css file to the JavaScript and HTML card game app, pushed the branch, and opened a PR for review.
    • GitHub interface showing the creation of a new pull request for a demo branch, with changes to a tailwind.css file in a card game application
  2. Amazon Q Developer automatically starts a code review, analyzing code quality, potential issues, and adherence to best practices. A concise summary appeared at the top, with individual findings threaded underneath. This gave me the big picture and the specifics in one place.
    •  GitHub pull request interface showing a notification that Amazon Q Developer has automatically initiated a code review and is analyzing the changes, with a progress indicator
    • GitHub interface displaying Amazon Q Developer's completed review with a concise summary at the top and detailed findings organized in threaded comments below, highlighting code quality issues and suggested improvements
  3. Code review the summary and findings: I reviewed the summary and threaded findings to decide which change to take on first. Seeing both the rationale and the exact lines called out meant I knew where to begin, without hunting through files.
    •         GitHub interface showing Amazon Q Developer's threaded findings, where each finding is organized as a separate comment thread with detailed explanations of identified issues in the code
    •        
  4. Ask for Clarification with /q : One of the findings suggested using state property to track the card status in my card game application. so I asked Q Developer for clarification. It responded quickly with concrete context and pointers, which reduced back and forth and improved the quality of the review.
    •         Screenshot showing a conversation thread where a developer uses the /q command to ask Amazon Q Developer for clarification about state property implementation in the card game
    •         GitHub interface displaying Amazon Q Developer's detailed response to the clarification request, providing context and specific explanations about the state property implementation recommendation
  5. Continue the conversation (if needed) : I reviewed Q Developer’s suggestion and responded back stating that I preferred an alternate approach and Q Developer quickly returned a complete implementation I could apply in the PR
    •      GitHub interface showing a follow-up exchange between the developer and Amazon Q Developer, where the developer proposes an alternate approach and receives implementation suggestions
  6. Apply Fixes : After reviewing the implementation suggestion, I clicked on Commit suggestion to create a new commit on the PR branch with my username as the author.
  7. Re-run the review : I didn’t need this for my example, but if you push additional changes, you can run a fresh analysis by posting /q review as a new top-level comment. Q Developer will run the review and post updated findings.

With the code review complete and checks passing, I merged. The new interactive code review experience reduced wait time and review cycles and made the “why” behind each finding and suggested change clear.

Conclusion

Amazon Q Developer for GitHub is available today in preview. Whether you are an individual developer or part of a large engineering team, this update helps you ship cleaner code with fewer cycles and makes code reviews something to look forward to rather than avoid.
Try it out on your next PR. Type /q, ask a question, and see how smarter conversational reviews transform your workflow.

Measuring Developer Productivity with Amazon Q Developer and Jellyfish

Post Syndicated from Madhu Balaji original https://aws.amazon.com/blogs/devops/measuring-developer-productivity-with-amazon-q-developer-and-jellyfish/

Modern software development teams face increasing pressure to deliver high-quality code faster, while managing growing system complexity. Developers often spend significant time on necessary, but undifferentiated work, or “toil”. Toil is often manual, repetitive, and of limited enduring value, making it a strong candidate for automation or delegation to generative AI tools. The re:Invent 2024 session Unleashing generative AI: Amazon’s journey with Amazon Q Developer (DOP214) discussed how toil and productivity have an inverse relationship. Amazon Q Developer can help decrease toil and free up your developers to work on more productive tasks. Until now, that impact has been hard to show.

This post shows you how to integrate Amazon Q Developer with Jellyfish to measure AI’s impact on developer productivity. You’ll learn how to set up the integration, understand key metrics, and make data-driven decisions about your AI investments.

The Evolution of Developer Productivity Measurement

The initial Amazon Q Developer Dashboard, released in October 2023, provided basic visibility into subscription usage, code generation statistics, and security scans. While these metrics gave customers visibility into basic usage patterns, they wanted deeper insights into how the metrics connected to developer productivity and business outcomes. Since then, updates to the Amazon Q Developer Dashboard provided additional user-level insights, with the most recent changes discussed in the May 2025 blog post: Unlocking the power of Amazon Q: Metrics-driven strategies for better AI coding.

Amazon Q Developer dashboard in AWS Console showing subscription metrics, usage statistics with a donut chart of code suggestions by category, and an active users trend line graph

Amazon Q Dashboard in AWS Console

Many organizations face challenges when measuring generative AI impact due to complex organizational structures, fragmented tool chains, and rapidly evolving AI capabilities.

Leaders can make more informed decisions about metrics by working backwards from their desired business outcomes. When customers begin using generative AI tools, they focus on basic usage metrics such as subscription counts and active users. As generative AI adoption grows within an organization, teams want to understand AI impact on productivity and business value. By collecting the right data, leaders can measure how generative AI affects development workflows and business outcomes in their organizations.

Why Integrated Metrics Matter

The April 2025 blog “How generative AI is transforming developer workflows at Amazon” shared that developer productivity metrics are more complex than what any single tool measures. This aligns with established frameworks like DORA and SPACE. Understanding AI’s impact requires visibility across the entire development lifecycle. Organizations are looking for ways to combine data from multiple data sources to get a complete view. Some have created home-grown tools and dashboards while others like Genesys, a global cloud leader in AI-Powered Experience Orchestration, have taken advantage of partners like Jellyfish.

“At Jellyfish, our customers have been asking us for an Amazon Q Developer integration so they can understand the complete picture of how generative AI has transformed, improved and accelerated their software development workflows” – Billy Robbins, Jellyfish Head of Partnerships

The Jellyfish Solution

Jellyfish is an engineering management solution that combines metrics from various development tools. When integrated with Amazon Q Developer, Jellyfish helps you understand how Amazon Q Developer affects your development productivity by analyzing AI usage data alongside engineering metrics. Jellyfish understands the taxonomy of customer organizations allowing you to gain insights at the organizational levels that matter to you. This integration helps engineering leaders measure AI impact on development velocity, track adoption and usage patterns, and calculate the return on investment from AI spend.

“At Genesys, we’ve long been committed to data-driven engineering and deep telemetry across our software development lifecycle. However, quantifying AI’s impact on our development teams was challenging, as the insights from isolated tools were too fragmented to give us a clear overall picture. By partnering with AWS and Jellyfish, we’ve integrated the AI developer tooling our engineers trust with the platforms our leadership team relies on for visibility and alignment. This unified view empowers us to go beyond measuring AI adoption and on to operational metrics like productivity improvements and return on investment, enabling more informed decision-making at every level.” – Craig Dahlinger, Genesys Senior Director, Platform Engineering

Solution Overview

The Amazon Q Developer and Jellyfish integration connects your AI-assisted development metrics with broader engineering analytics. Through secure, automated data flow, the solution provides insights into how AI is transforming your development processes.

Architecture diagram showing Amazon Q Developer Metrics for a single account, illustrating data flow between various AWS services including Lambda, CloudTrail, IAM Identity Center, EventBridge, S3, and integration with a third-party analytics partner

Amazon Q Developer log data ingestion setup

How It Works

Amazon Q Developer automatically captures detailed usage data and prompt logs in your AWS environment. This data flows to a designated Amazon Simple Storage Service (Amazon S3) bucket, which Jellyfish securely accesses through pre-defined IAM roles. Jellyfish processes the information alongside data from your other development tools, providing comprehensive insights through their analytics system.

Key Metrics & Insights

Jellyfish’s AI Impact Dashboard surfaces several important metrics across your development lifecycle:

Engineering Adoption

Visualize how many engineers have adopted Amazon Q Developer across your organization. Users are categorized by cohort: Power, Casual, Idle, and New, giving you a clear picture of adoption. The following screenshot shows a breakdown by user cohorts: out of 77 total engineers, you see 22 Power Users, 20 Casual Users, 6 Idle Users, and 12 New Users. This view helps you understand where you’re succeeding in driving adoption and where there might be room for improvement.

Jellyfish dashboard's Manage Adoption view showing user adoption metrics through a donut chart, usage trends over time, team-based adoption data, and programming language statistics for Amazon Q

JellyFish Dashboard Manage Adoption

Usage Patterns and Trends
Through intuitive graphs, you can see daily active usage data, adoption trends, and usage patterns over time. This temporal view is crucial to understand how usage evolves and helps you identify successful adoption strategies and potential barriers to consistent use.

You can also see which programming languages benefit most from AI assistance. For example, the Manage Adoption dashboard screenshot above shows higher acceptance rates for AI suggestions in React compared to SQL (2,415 vs. 54), guiding your efforts to expand AI usage across different development areas.

Impact Measurement

Perhaps most crucially, this integration provides concrete impact metrics. You can now measure the reduction in time from first commit to pull request open. For example, the following screenshot shows a 24% reduction, with work time decreasing from 2 days and 23 hours to 2 days and 6 hours. You can also track changes in review time, which might show slight increases as AI-assisted code often requires more thorough review. Throughput improvements are also measurable, with some teams seeing a 142% increase in average monthly pull requests per user, jumping from 2.6 to 6.3 PRs per month.

Jellyfish dashboard showing development metrics and AI assistance trends comparing performance with and without Amazon Q integration

JellyFish Dashboard Maximum Impact

You can use the dashboard to view the percentage of pull requests assisted by Amazon Q Developer over time and track AI adoption. You can also understand the ratio of AI-written to human-written code, providing insight into the level of AI integration in your development process.

Investment Analysis

To round out the picture, you can visualize the impact of tool utilization on investment across different areas such as Growth, KTLO (Keep The Lights On), and Support. This helps you understand how your AI investment is affecting various aspects of your development lifecycle.

Implementation Guide

Prerequisites

Before implementing this integration, make sure you have:

    • Configure S3 buckets
    • Manage IAM roles
    • Set up CloudTrail logs (optional)

Setup Process

The implementation involves three steps:

Step 1: Enable Amazon Q Developer data collection: Follow the setup process found in this repository, containing automation scripts and detailed instructions. In this step, you configure the necessary AWS resources to collect Amazon Q Developer metrics.

This repository includes:

  • Python scripts for local execution
  • AWS Lambda functions for serverless deployment
  • Comprehensive documentation and testing procedures

Step 2: S3 Access: To grant the JellyFish account/role access to the S3 bucket for logs, update the bucket policy

Sample: Provide Jellyfish the name of your amazon-q-log-bucket

S3 Bucket ARN: <your-amazon-q-log-bucket-arn>

Update S3 Bucket Policy

  1. Go to AWS S3 Console → Select your Amazon Q log bucketPermissions tab.
  2. Click Edit Bucket Policy and add:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::0XXXXXXXX5:role/<AccessRoleName>"
            },
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::<your-amazon-q-log-bucket>",
                "arn:aws:s3:::<your-amazon-q-log-bucket>/*"
            ]
        }
    ]
}

Step 3: Verify Setup: Confirm that the data is appearing in your S3 bucket and check with Jellyfish team to validate they have access to the S3 bucket and are receiving the logs.

Clean up

Follow automated or the manual cleanup steps provided in the README.md

Conclusion

The Amazon Q Developer integration with Jellyfish represents a significant step forward in your ability to measure and optimize the impact of AI in software development. By providing engineering leaders with powerful, actionable insights into AI adoption and impact, organizations are enabled to make informed decisions about their AI investments, optimize developer workflows, and drive greater efficiency across their engineering teams.

To learn more about this integration, visit the Amazon Q Developer documentation, contact your Jellyfish representative, or visit the Jellyfish website if you’re new to their resources.

Madhu Balaji

Madhu is a Senior Specialist Solutions Architect at AWS who helps customers design and implement innovative cloud solutions. With 20+ years of experience in development and application architecture, he focuses on enabling customers to accelerate their time-to-market and solve complex business challenges using AWS services.

Austin Butler

Austin is a Senior Go-To-Market Specialist at Amazon Web Services (AWS) focusing on generative AI across the software development lifecycle. He works with strategic customers and partners to understand their software development practices and how AWS services like Amazon Q Developer can deliver value in their SDLC. Prior to joining AWS, Austin spent 10 years working in Finance & Accounting.

Mastering Amazon Q Developer with Rules

Post Syndicated from Aurelien Plancque original https://aws.amazon.com/blogs/devops/mastering-amazon-q-developer-with-rules/

When I first started working with Amazon Q Developer, I was impressed by its capabilities, but I quickly found myself in a familiar pattern. Development teams using AI assistants face a common challenge: repeatedly explaining coding standards, workflow preferences, and established patterns in every conversation. This repetitive setup reduces productivity and creates inconsistent AI guidance across team members. Sound familiar?

That’s when I discovered the power of custom rules – and it completely transformed how I work with AI assistance.

What Are Amazon Q Developer Rules?

Rules in Amazon Q Developer are a way to build a library of coding standards and best practices that are automatically used as context when interacting with the assistant.

These rules are defined in Markdown files stored in your project’s .amazonq/rules folder. Once created, they automatically become part of the context for developers interacting with Amazon Q Developer within your project, maintaining consistency across team members regardless of their experience level. Currently, rules are supported in the Amazon Q Developer IDE extensions and in the Amazon Q Developer CLI.

The Power of Rules-Based AI Assistance

What I find most compelling about rules-based AI assistance is how it minimizes the repetitive setup that usually comes with AI interactions. Instead of repeatedly instructing your AI assistant on your preferences and standards for each request, you can define these once as rules. This creates a consistent, predictable AI experience that automatically respects your team’s conventions and best practices.

The real game-changer for me has been the consistency. Whether I’m working on a new feature, debugging an issue, or reviewing code, Amazon Q Developer now understands my context from the start. This means I can focus on the actual problem-solving instead of repeatedly explaining how I like things done.

Understanding the Rule Lifecycle

One thing that consistently surprises developers is how seamlessly rules integrate into Amazon Q Developer workflow. Understanding when and how rules are injected into the context helps you make the most of this capability. Here’s how the process works:

A diagram depicting the rule lifecycle for Amazon Q Developer

The rule lifecycle in Amazon Q Developer

Rules are injected at several key moments:

1. Initial Context Loading: When you first interact with Amazon Q Developer in a project, it scans the `.amazonq/rules` directory and loads the applicable rules into its context.

2. Request Processing: Before generating a response, Amazon Q Developer evaluates your request against the loaded rules to determine which ones apply.

3. Response Generation: While crafting its response, Amazon Q Developer follows the instructions from applicable rules, prioritizing them based on their specified priority levels.

4. Dynamic Updates: If you modify existing rules or add new ones during a session, Amazon Q Developer detects these changes and updates its behavior accordingly.

This continuous integration makes sure that Amazon Q Developer’s responses consistently align with your standards without requiring you to repeat instructions in every conversation.

Why Rules-Based AI Makes a Difference ?

What I’ve found most valuable about this approach is how it transforms the AI from a generic assistant into something that feels like it truly understands your team’s way of working. Here are the key benefits I’ve experienced:

Consistency: Every team member gets the same AI-guided experience, making sure code and documentation remain consistent regardless of who wrote them.

Knowledge Preservation: Rules capture your team’s accumulated wisdom and best practices, making them accessible to everyone.

Reduced Cognitive Load: You can focus on solving problems rather than remembering and enforcing standards.

Faster Onboarding: New team members automatically receive guidance aligned with your team’s practices.

Adaptability: Rules can evolve alongside your project, making sure AI assistance remains relevant as your needs change.

The difference between generic AI assistance and rules-guided assistance becomes clear quickly. Generic AI might suggest any valid solution, while rules-guided AI suggests solutions that fit your specific context and standards.

Building Effective Rules: A Practical Approach

Now that you’ve seen the power of rules, let me walk you through how to create them. While there’s no “official” format for Amazon Q Developer rules (the beauty is in the flexibility!), the approach I’m about to share has consistently delivered excellent results for me and my team.

Rule File Format and Location

Here’s what I’ve learned about organizing rules for Amazon Q Developer:

  • Rules must be written in Markdown format (.md files)
  • They should be placed in the .amazonq/rules directory of your project
  • You can use a filename of your choice, though descriptive names help with organization (examples: monitoring-rule.md, frontend-react.rule.md)
  • Rules can be organized in sub-directories for better structure (for example, .amazonq/rules/frontend/react.rule.md)
  • The filename itself is arbitrary – Amazon Q Developer will read the .md files in the directory. However, using meaningful names makes your rule system easier to maintain as it grows.

Essential Rule Structure

What I find works best is a well-crafted rule file that contains these key sections:

# Rule Name
## Purpose
A clear, concise statement explaining why this rule exists.
## Instructions
- Specific directives for Amazon Q Developer to follow
- Additional instructions with their own identifiers
- Conditions under which instructions apply
## Priority
[Critical/High/Medium/Low]
## Error Handling
- How Amazon Q Developer should behave when exceptions occur
- Fallback strategies when primary instructions can't be followed

Let me show you this structure in action with a complete example of a monitoring rule that has been particularly effective for my team:

# Monitoring
## Purpose
This rule ensures that monitoring coverage is maintained when major features are added to the project.
## Instructions
- When implementing a major feature (new service, API endpoint, or core functionality), ALWAYS check if MONITORING_PLAN.md needs updates.
- Major features include: new microservices, AI integrations, WebSocket endpoints, database operations, external API integrations, or user-facing functionality.
- ALWAYS update MONITORING_PLAN.md to include relevant metrics, dashboards, and alerts for the new feature.
- When updating monitoring plan, include: custom metrics, CloudWatch dashboards, alarms, and logging requirements specific to the new feature.
- After updating MONITORING_PLAN.md, ALWAYS output "📊 Updated monitoring plan for: [feature description]".
## Priority
High
## Error Handling
- If MONITORING_PLAN.md doesn't exist, create it with basic monitoring structure and note the creation
- If monitoring plan is unreadable, create a backup and start fresh with current feature requirements
- If unsure whether a feature qualifies as "major", err on the side of caution and update monitoring plan

I saved this as monitoring.rule.md in my project’s .amazonq/rules directory.

Rule Components That Work

Now let me break down each component and show you why this structure works so well.

Rule Name

Think of this as the “class name” for your rule. It should be descriptive and domain-specific, like “Frontend – React” or “Monitoring.” This helps organize your rules into logical categories and makes them easier to maintain as your ruleset grows.

Purpose

This section is crucial, it’s where you explain the “why” behind your rule. What I’ve learned is that a clear purpose helps Amazon Q Developer understand the intent behind your instructions, allowing it to make better decisions when faced with edge cases. For example:

## Purpose
Ensures consistent monitoring coverage is maintained when adding new features to the project.

This simple statement guides Amazon Q Developer to prioritize monitoring considerations, even when they’re not explicitly mentioned in your request.

Instructions

This is where the magic happens. Instructions are specific directives that shape Amazon Q Developer’s behavior. What I’ve found works best is when each instruction:

  1. Is clear and actionable
  2. Focuses on a single aspect of behavior
  3. Uses consistent formatting for easy scanning

For example:

## Instructions
- When implementing a major feature, ALWAYS check if MONITORING_PLAN.md needs updates.
- Major features include: new microservices, AI integrations, WebSocket endpoints.
- After updating MONITORING_PLAN.md, output "📊 Updated monitoring plan for: [feature]".

These clear, focused instructions give Amazon Q Developer specific guidance on how to behave in different situations, maintaining consistent responses across your team.

Priority

Not all rules are created equal. What I’ve discovered is that the priority level helps Amazon Q Developer resolve conflicts when multiple rules could apply to a situation. I typically use four levels:

  • Critical: Must be followed without exception
  • High: Should be followed unless conflicting with a critical rule
  • Medium: Important guidelines that shape behavior
  • Low: Preferences that can be overridden when necessary
Error Handling

This often-overlooked section is what makes rules robust in real-world scenarios. Good error handling instructions tell Amazon Q Developer what to do when things don’t go as planned:

## Error Handling
- If MONITORING_PLAN.md doesn't exist, create it with basic monitoring structure
- If unsure whether a feature qualifies as "major," err on the side of caution

These fallback strategies make sure Amazon Q Developer remains helpful even when facing unexpected situations.

Seeing Rules in Action

To show you how effective this structure can be, let me give you a simple example. Without rules, asking Amazon Q Developer to “add a new React component for user profiles” might result in a component that doesn’t match your project’s patterns.

But with a well-crafted frontend rule, Amazon Q Developer would automatically:

  1. Check existing component structures
  2. Follow your naming conventions
  3. Create appropriate prop interfaces
  4. Add the right level of documentation
  5. Place the file in your preferred directory structure

All without you having to specify these details every time!

Making Rules Transparent: A Game-Changing Technique

One particularly powerful technique I’ve discovered is teaching Amazon Q Developer to explicitly acknowledge which rules it’s following. This isn’t a default behavior of Amazon Q Developer, but rather a custom enhancement you can implement through a specific conversation rule.

Adding Unique Identifiers for Traceability

The key to this system is adding unique identifiers (IDs) to each instruction in your rules. For example:

## Instructions
- When implementing a major feature, ALWAYS check if MONITORING_PLAN.md needs updates. (ID: CHECK_MONITORING_PLAN)
- Major features include: new microservices, AI integrations, WebSocket endpoints. (ID: MAJOR_FEATURE_CRITERIA)
- After updating MONITORING_PLAN.md, output "📊 Updated monitoring plan for: [feature]". (ID: ANNOUNCE_MONITORING_UPDATE)

These IDs serve as “traceable markers” that Amazon Q Developer can reference when following a rule.

Creating the Acknowledgment Behavior

Next, you can create a conversation rule that instructs Amazon Q Developer to acknowledge which rules it’s following. Here’s a complete example of such a rule:

# Conversation
## Purpose
This rule defines how Amazon Q Developer should behave in conversations, including how it should acknowledge other rules it's following.
## Instructions
- ALWAYS consider your rules before using a tool or responding. (ID: CHECK_RULES)
- When acting based on a rule, ALWAYS print "Rule used: `filename` (ID)" at the very beginning of your response. (ID: PRINT_RULES)
- If multiple rules are matched, list all: "Rule used: `file1.rule.md` (ID1), `file2.rule.md` (ID2)". (ID: PRINT_MULTIPLE)
- DO NOT start responses with general mentions about using rules or context, but DO print specific rule usage as specified above. (ID: NO_GENERIC_MENTIONS)
## Priority
Critical
## Error Handling
- If rule files are unreadable, continue but note the issue
- If multiple conflicting rules apply, follow the highest priority rule and note the conflict

Save this as conversation.rule.md in your .amazonq/rules directory.

When Amazon Q Developer follows a rule, it will now explicitly state which rule and identifier guided its actions:

An image showing Q Developer CLI printing the rules it uses in its answers

You can get Amazon Q Developer to state which instruction it is following.

What I find most valuable about this simple addition is the remarkable benefits it creates:

  • Transparency: Team members can immediately see which guidelines influenced Amazon Q Developer’s response
  • Debugging: When Amazon Q Developer behaves unexpectedly, you can identify which rule caused the behavior
  • Learning: New team members discover relevant rules by seeing which ones are being applied
  • Validation: You can confirm that your rules are working as intended
  • Continuous Improvement: Identify which rules are most frequently used and which might need refinement

By making rules visible, you turn Amazon Q Developer into a collaborative partner that not only follows your guidelines and helps team members discover and engage with your established practices. The IDs aren’t just organizational tools—they’re the foundation of a self-documenting AI assistance workflow that grows more valuable as your rule system expands.

Getting Started with Your Own Rules

What I love about this approach is that you can start small. Begin with one or two rules addressing your most common pain points, then expand as you see the benefits. Some good starting points include:

  • Code style and organization rules
  • Documentation standards
  • Testing requirements
  • Git commit message formats

Remember, the goal isn’t to create an exhaustive rulebook—it’s to capture the aspects of your development process that matter most to your team’s productivity and code quality.

Practical Examples: Rules in Action

To show you the real-world impact of rules, let me walk you through some concrete scenarios that demonstrate how rules transform the AI assistance experience. These examples show the difference between generic AI help and rules-guided assistance.

Scenario 1: Time-Based Data Analysis

This scenario demonstrates how rules help Amazon Q Developer understand your environment’s context for time-related operations and analysis. Here are examples of this rule in action in VS Code.

Here is a rule I use to inform Amazon Q Developer how to behave when it needs to understand the current time:

# Time
## Purpose
This rule defines how Amazon Q Developer (the agent) handles time-related operations and queries
## Instructions
- When determining the current time, ALWAYS use bash commands with AEST timezone: `date` (ID: GET_AEST_TIME)
- When timestamps are needed for logging or documentation, use ISO format with AEST timezone (ID: ISO_TIMESTAMP)
- When comparing times or calculating durations, ensure all times are in AEST for consistency (ID: CONSISTENT_TIMEZONE)
- For time-sensitive operations, always verify the current AEST time before proceeding (ID: VERIFY_TIME)
## Priority
Medium
## Error Handling
- If date command fails, note the system time issue and continue with available information
- If timezone conversion is needed, use appropriate date formatting commands
Without Rules:

When Amazon Q Developer doesn’t have the time rule, it lacks the context needed for time-based queries:

An image showing Amazon Q Developer in the IDE trying to understand what is the date

An Amazon Q Developer interaction without a time rule

As you can see, without the rule, Amazon Q Developer needs clarification about timezone context and doesn’t know how to determine the current time in your environment.

With the Time Rule:

Here’s a similar query with the time rule in place:

An image showing Amazon Q Developer in the IDE using the time rule to get the time and date

Amazon Q Developer follows the instructions of the rule

Notice how Amazon Q Developer immediately uses the `date` command to get the current AEST time, exactly as specified in the rule, without needing clarification.

The Impact:
  • Automatic Context: Amazon Q Developer immediately knows to use the date command to get AEST time
  • No Clarification Needed: It understands “yesterday” relative to the current AEST time without asking
  • Consistent Behavior: The same approach works for other time-based queries across team members
  • Environmental Awareness: It knows exactly how to determine time in your specific system environment
  • Transparent Process: You can see it’s following the rule by using the bash date command as specified

This example shows how the time rule transforms a potentially confusing interaction into a smooth, context-aware analysis that works consistently every time.

Scenario 2: Frontend Component Development

This scenario demonstrates how rules can help prevent technical debt accumulation and maintain consistent component architecture across your team.

Without Rules:

When Amazon Q Developer doesn’t have frontend-specific guidance, different developers can get inconsistent suggestions for component creation. Some might create reusable components immediately, others get copy-paste solutions, and component organization varies based on individual preferences.

With Frontend Rules:

Here’s a real React rule from my development workflow that addresses these consistency issues:

# Frontend - React
## Purpose
Defines how to act when writing React
## Instructions
- ALWAYS evaluate reusability potential for new visual elements using these criteria: used in 2+ places, has configurable props, or represents a common UI pattern. (ID: EVALUATE_REUSABILITY)
- If reusability potential is high (meets 2+ criteria above), create a dedicated component in appropriate folder (components/, shared/, or ui/) with clear prop interfaces and JSDoc comments. (ID: CREATE_REUSABLE_COMPONENT)
- When creating reusable components, include explicit comments explaining: purpose, key props, usage examples, and any important behavior. (ID: DOCUMENT_COMPONENTS)
- Follow existing component structure and naming conventions found in the project's components folder. (ID: FOLLOW_CONVENTIONS)
- Prefer composition over inheritance - create small, focused components that can be combined. (ID: PREFER_COMPOSITION)
## Priority
Medium
## Error Handling
- If component folder structure is unclear, place new components in src/components/ and ask user for preferred organization
- If existing conventions are inconsistent, follow the most recent or most common pattern and note the inconsistency
The Impact:
  • Consistent Architecture: Team members get the same component creation guidance regardless of experience level
  • Reduced Technical Debt: Automatic evaluation of reusability helps prevent duplicated UI elements
  • Better Documentation: Components automatically include proper JSDoc comments and usage examples
  • Maintainable Structure: Consistent naming conventions and folder organization across the project
  • Scalable Patterns: Composition-focused approach creates more flexible, reusable components

This make sure that whether a junior developer or senior architect is working with Amazon Q Developer, the resulting components follow the same quality standards and architectural patterns.

Scenario 3: Version Control Workflow

This scenario demonstrates how rules can act as safety mechanisms and workflow controllers, facilitating consistent practices around version control operations.

Without Rules:

Amazon Q Developer might suggest git operations without considering your team’s workflow preferences or safety requirements. It could recommend immediate pushes, generic commit messages, or skip confirmation steps that help prevent mistakes.

With Git Rules:

Here’s a real git rule that transforms how Amazon Q Developer handles version control operations:

# Git
## Purpose
This rule dictates Amazon Q Developer (the agent) behavior related to git.
## Instructions
- ALWAYS ask confirmation from the user before pushing to git (ID: GIT_PUSH)
- ALWAYS ensure commit messages are meaningful and detailed, including what was changed and why (ID: GIT_COMMIT)
- Commit message should be accurate but playful and not too formal, with comprehensive details about the changes (ID: GIT_COMMIT_STYLE)
- Include specific files/components modified and the impact of changes in commit messages (ID: GIT_COMMIT_DETAILS)
## Priority
Medium
## Error Handling
N/A

Here’s this rule in action when I asked Amazon Q Developer to commit our changes:

> Let's commit our latest changes to git.
> Rule used: git.rule.md (GIT_COMMIT), git.rule.md (GIT_COMMIT_STYLE), 
git.rule.md (GIT_COMMIT_DETAILS)
I'll commit all the changes we've made to remove messaging functionality and 
focus the platform on location-based discovery. Let me create a comprehensive 
commit with all the modifications.
🛠 Using tool: execute_bash (trusted)
 ● I will run the following shell command: 
cd /path/to/your/project/rules-blog && git status
 ↳ Purpose: Check git status to see all changed files
On branch main
Changes not staged for commit:
[... output truncated for brevity ...]
Untracked files:
[... output truncated for brevity ...]
● Completed in 0.54s
🛠 Using tool: execute_bash (trusted)
 ● I will run the following shell command: 
cd /path/to/your/project/rules-blog && git add.
 ↳ Purpose: Add all changes to git staging
 ● Completed in 0.101s
🛠 Using tool: execute_bash (trusted)
 ● I will run the following shell command: 
cd /path/to/your/project/rules-blog && git commit -m "🎯 Refocus platform on location-based discovery: Remove messaging, complete React app
Major platform refocus to pure location-based discovery:
MESSAGING FUNCTIONALITY REMOVED:
[... output truncated for brevity ...]
DOCUMENTATION UPDATES:
[... output truncated for brevity ...]
INFRASTRUCTURE IMPROVEMENTS:
[... output truncated for brevity ...]
PLATFORM FOCUS NOW:
✅ Location-based discovery
✅ Browsing and following
✅ Information and contact details
✅ Notifications by location
❌ No messaging/booking - pure discovery platform
Files modified: 13 core files, 1 new frontend package
Infrastructure: 7 Lambda functions, tagged resources, clean API
Tests: 12/12 passing, production build successful"
[... output truncated for brevity ...]

Note: Notice how Amazon Q Developer automatically shows Rule used: git.rule.md (GIT_COMMIT), git.rule.md (GIT_COMMIT_STYLE), git.rule.md (GIT_COMMIT_DETAILS) at the beginning of its response – this is the rule transparency system we discussed earlier in action, showing exactly which rule instructions guided the commit message creation.

The Impact:
  • Helps Prevent Accidents: The confirmation requirement stops accidental pushes that could disrupt team workflows
  • Consistent Commit Quality: Commit messages follow the same detailed, informative style regardless of who’s working
  • Team Personality: The “playful but not too formal” style maintains team culture while preserving professionalism
  • Better Git History: Detailed commit messages make code archaeology much more straightforward for future debugging
  • Workflow Safety: Acts as a guardrail that respects human oversight in critical operations

This rule shows how Amazon Q Developer can be more than just a code assistant – it becomes a workflow partner that understands and enforces your team’s operational preferences and safety requirements.

What’s Next for Rules-Based Development

Through this exploration of Amazon Q Developer rules, we’ve discovered how a simple concept – defining your preferences once instead of repeating them constantly – can transform your development workflow. The key learnings are clear: rules help you avoid repetitive setup, help with team consistency, preserve institutional knowledge, and create transparent AI interactions that grow more valuable over time.

Reduced cognitive load, faster onboarding, consistent code quality, and AI assistance that truly understands your team’s context – what started as a solution to repetitive explanations has evolved into a comprehensive system for scaling development practices across my team.

My Amazon Q Developer rules system continues to evolve, and I’m excited about the possibilities ahead. As more teams adopt this approach, I expect we’ll see community-shared rule libraries and even more sophisticated customization options.

What I find most promising is how rules create a foundation for more advanced AI assistance. When your AI assistant understands your context deeply, it can provide more sophisticated suggestions and catch potential issues before they become problems.

I encourage you to start experimenting with rules – pick one area where you find yourself repeating instructions to AI assistants and create your first rule. You’ll be surprised how quickly this approach transforms your development workflow.

The key is to remember that rules aren’t about constraining creativity – they’re about freeing you to focus on the interesting problems by automating the routine decisions. When Amazon Q Developer knows how you like things done, you can spend more time on what you’re building and less time on how to build it.

Ready to get started with Amazon Q Developer rules? Check out the Amazon Q Developer documentation for setup instructions and more examples.

Announcing the end-of-support for the AWS SDK for .NET v3

Post Syndicated from Ed McLaughlin original https://aws.amazon.com/blogs/devops/announcing-the-end-of-support-for-the-aws-sdk-for-net-v3/

We are announcing the end-of-support for the AWS SDK for .NET v3.x starting on March 1, 2026, in accordance with the SDK and Tools maintenance policy.  On April 28, 2025, the next major version of the AWS SDK for .NET, version 4.x, became generally available (blog post). Version 4.x of the SDK includes bug fixes, performance enhancements, and modernization for .NET. We strongly encourage you to upgrade to take advantage of these enhancements.

Existing applications that use the AWS SDK for .NET v3.x will continue to function as intended unless there is a fundamental change to how an AWS service works. This is uncommon, and we will announce it broadly if it happens. Beginning March 1, 2026, the AWS SDK for .NET v3.x will only receive critical bug fixes and security updates, and we will not update it to support new AWS services, new service features, or changes to existing services. After June 1, 2026, when the AWS SDK for .NET v3.x reaches end-of-support, it will no longer receive updates or releases.

End of Support Timeline for Version 3

The timeline for end-of-support is as follows, as defined by the SDK major version lifecycle:

Phase Start Date End Date Support Level
General Availability July 28, 2015 June 1, 2026 During this phase, the SDK is fully supported. AWS will provide regular SDK releases that include support for new services, API updates for existing services, as well as bug and security fixes.
Maintenance Mode March 1, 2026 June 1, 2026 During the maintenance mode, AWS limits SDK releases to address critical bug fixes and security issues only. An SDK will not receive API updates for new or existing services, or be updated to support new regions
End-of-Support June 1, 2026 N/A When an SDK reaches end-of support, it will no longer receive updates or releases. Previously published releases will continue to be available via public package managers and the code will remain on GitHub. The GitHub repository may be archived.

References for AWS SDK for .NET Version 4

Use the following references to learn more about the AWS SDK for .NET v4.x along with migration support.

Conclusion

We recommend you upgrade to the latest major version of the AWS SDK for .NET v4.x by using the migration guide. This major version includes, but is not limited to, performance enhancements, bug fixes, modern .NET libraries and frameworks, and the latest AWS service updates. To learn more, please refer to the AWS SDK for .NET GA blog post.

If you need assistance or have feedback, reach out to your usual AWS support contacts. You can also open a discussion or issue on GitHub. Thank you for using the AWS SDK for .NET.

Author

Ed McLaughlin

Ed McLaughlin is the software development manager for the AWS SDK for .NET, .NET Developer Experience products, and AWS Tools for PowerShell. He enjoys working on projects, tools, and features that enhance the experience of the customer. You can find him on LinkedIn @ejm3.

Announcing the end-of-support for AWS Tools for PowerShell v4

Post Syndicated from Ed McLaughlin original https://aws.amazon.com/blogs/devops/announcing-the-end-of-support-for-aws-tools-for-powershell-v4/

We are announcing the end-of-support for the AWS Tools for PowerShell v4.x starting on March 1, 2026, in accordance with the SDK and Tools maintenance policy. On June 23, 2025, the next major version of the AWS Tools for PowerShell, version 5.x, became generally available (blog post). Version 5.x of the AWS Tools for PowerShell includes bug fixes, new features, performance enhancements, and leverages the latest major version of the AWS SDK for .NET v4.x. We strongly encourage you to upgrade to the latest version of AWS Tools for PowerShell v5.x to take advantage of these enhancements.

Existing applications that use the AWS Tools for PowerShell v4.x will continue to function as intended unless there is a fundamental change to how an AWS service works. This is uncommon, and we will announce it broadly if it happens. Beginning March 1, 2026, the AWS Tools for PowerShell v4.x will only receive critical bug fixes and security updates, and we will not update it to support new AWS services, new service features, or changes to existing services. After June 1, 2026, when the AWS Tools for PowerShell v4.x reaches end-of-support, it will no longer receive updates or releases.

End of Support Timeline for Version 4

The timeline for end-of-support is as follows, as defined by the SDK major version lifecycle:

Phase Start Date End Date Support Level
General Availability July 28, 2015 June 1, 2026 During this phase, the SDK is fully supported. AWS will provide regular SDK releases that include support for new services, API updates for existing services, as well as bug and security fixes.
Maintenance Mode March 1, 2026 June 1, 2026 During the maintenance mode, AWS limits SDK releases to address critical bug fixes and security issues only. An SDK will not receive API updates for new or existing services, or be updated to support new regions
End-of-Support June 1, 2026 N/A When an SDK reaches end-of support, it will no longer receive updates or releases. Previously published releases will continue to be available via public package managers and the code will remain on GitHub. The GitHub repository may be archived.

References for AWS Tools for PowerShell Version 5

Use the following references to learn more about the AWS Tools for PowerShell v5.x along with migration support.

Conclusion

We recommend you upgrade to the latest major version of the AWS Tools for PowerShell v5.x by using the migration guide. This major version includes, but is not limited to, performance enhancements, bug fixes, modern .NET libraries and frameworks, and the latest AWS service updates. To learn more, please refer to the AWS Tools for PowerShell v5.x GA blog post.

If you need assistance or have feedback, reach out to your usual AWS support contacts. You can also open a discussion or issue on GitHub. Thank you for using the AWS Tools for PowerShell.

Author

Ed McLaughlin

Ed McLaughlin is the software development manager for the AWS SDK for .NET, .NET Developer Experience products, and AWS Tools for PowerShell. He enjoys working on projects, tools, and features that enhance the experience of the customer. You can find him on LinkedIn @ejm3.

Implementing Defense-in-Depth Security for AWS CodeBuild Pipelines

Post Syndicated from Daniel Begimher original https://aws.amazon.com/blogs/security/implementing-defense-security-for-aws-codebuild-pipelines/

Recent security research has highlighted the importance of CI/CD pipeline configurations, as documented in AWS Security Bulletin AWS-2025-016. This post pulls together existing guidance and recommendations into one guide.

Continuous integration and continuous deployment (CI/CD) practices help development teams deliver software efficiently and reliably. AWS CodeBuild provides managed build services that integrate with source code repositories like GitHub, GitLab, and other Source Control Management (SCM) systems. While this guide uses GitHub examples, the security principles and webhook configuration approaches apply to other supported source control systems.

However, certain configurations require careful attention. We strongly recommend that you do not use automatic pull request builds from untrusted repository contributors without proper security controls and a clear understanding of your threat model. This configuration allows untrusted code to execute in your build environment with access to repository credentials and environment variables. Webhook configurations determine which repository events trigger builds and what code gets executed during the build process. Understanding these configurations is essential for maintaining appropriate security boundaries while preserving the automation benefits that make CI/CD valuable.

Security teams and DevOps engineers can use these practical approaches to configure AWS CodeBuild to meet their security goals while maintaining development velocity. We’ll explore webhook configurations, trust boundaries, and implementation strategies that emphasize threat model assessment, least-privilege access, and proactive monitoring of your pipeline configurations.

Security of the pipeline implications

Under the shared responsibility model, while AWS manages the security of the underlying AWS CodeBuild infrastructure, customers are responsible for securing their pipeline configurations, access controls, and the code that runs within their build environments. This shared responsibility is critical when considering the security of the pipeline itself.

When AWS CodeBuild processes pull requests automatically, it builds the code in an environment with access to repository credentials, environment variables, and potentially sensitive information. This creates specific security of the pipeline considerations:

  • Repository access: AWS CodeBuild projects require repository credentials to read source code and create webhooks. These credentials provide specific permissions that vary based on your configuration.
  • Build execution: The build process runs the retrieved source code, which may include build scripts, dependency definitions, or test files from pull requests.
  • Build environment: AWS CodeBuild environments may have access to environment variables, AWS credentials, or other configuration data needed for the build process.

Establishing trust boundaries

Effective security of the pipeline starts with clearly defining trust boundaries for different types of code contributions:

  • Internal contributors: Team members with repository write access who have been verified through your organization’s access management processes.
  • External contributors: Contributors from outside your organization who submit pull requests from forked repositories.
  • Automated processing: Code that runs without manual review as part of the build process.

These trust boundaries form the foundation for threat modeling your specific environment. Internal and trusted environments can often rely more heavily on automation with contributor filtering and least-privilege controls. Public and open source projects require more stringent controls due to the inherent risks of processing untrusted contributions – these environments benefit from stricter webhook filtering, comprehensive approval gates, or the self-hosted GitHub Actions runner approach discussed later.

The key principle is finding the appropriate balance between security controls and development velocity based on your specific risk profile and contributor trust levels. With these considerations in mind, let’s examine how to assess and configure your current AWS CodeBuild webhook settings.

Configuring secure webhooks

Webhooks represent the preferred mechanism by which external events trigger AWS CodeBuild processes. When properly configured, webhooks provide a powerful and efficient way to automate your build processes in response to repository changes. However, improper webhook configuration can create security vulnerabilities by allowing untrusted code to execute in privileged environments.The security of your webhook configuration depends on understanding exactly which events trigger builds, what level of access those builds have, and what code gets executed during the build process. This section provides a comprehensive approach to authoring, assessing, configuring, and maintaining secure webhook configurations.

Assessing current webhook configurations

Begin by reviewing your existing AWS CodeBuild projects to understand their current webhook configurations. The following AWS CLI commands provide a systematic approach to gathering this information:

# List all CodeBuild projects in your region
aws codebuild list-projects --region us-west-2

# Retrieve detailed configuration for analysis
aws codebuild batch-get-projects --region us-west-2 \
  --names $(aws codebuild list-projects --region us-west-2 \
  --query 'projects[*]' --output text | tr '\n' ' ')

When you run these commands, pay particular attention to the webhook section in the output. This section contains the filterGroups configuration, which determines exactly which repository events trigger builds.

Now that you understand how to review your current setup, let’s examine common configuration patterns and their security implications.

Webhook configuration patterns

Understanding common webhook configuration patterns helps you quickly identify potential security concerns and implement appropriate improvements. The following patterns represent different approaches to webhook configuration, each with specific security implications.

Note: These patterns are not recommended for use and are shown here to help you identify configurations that may need attention.

Configuration requiring review – Automatic pull request processing


{
  "webhook": {
    "payloadUrl": "https://codebuild.us-west-2.amazonaws.com/webhooks",
    "filterGroups": [
      [
        {
          "type": "EVENT",
          "pattern": "PULL_REQUEST_CREATED,PULL_REQUEST_UPDATED,PULL_REQUEST_REOPENED",
          "excludeMatchedPattern": false
        }
      ]
    ]
  }
}

This configuration allows contributors who can create a pull request to trigger code execution in your build environment. We strongly recommend that you do not use automatic pull request builds from untrusted repository contributors.

Configuration requiring immediate review – No event filtering


{
  "webhook": {
    "payloadUrl": "https://codebuild.us-west-2.amazonaws.com/webhooks",
    "filterGroups": []
  }
}

Without filtering, this configuration can trigger builds for a wide variety of repository events.

Recommended secure webhook configurations

The following configurations represent security best practices that balance automation benefits with appropriate security controls. These patterns help to reduce security risks while maintaining the development velocity that makes CI/CD valuable.

Push-based builds (Recommended for most use cases)

Push-based builds make sure that only users with repository write access can trigger builds, which means contributors have already been vetted through your repository’s access control mechanisms.


{
  "webhook": {
    "payloadUrl": "https://codebuild.us-west-2.amazonaws.com/webhooks",
    "filterGroups": [
      [
        {
          "type": "EVENT",
          "pattern": "PUSH",
          "excludeMatchedPattern": false
        }
      ]
    ]
  }
}

Organizations that rely heavily on external open-source contributions may find this approach too restrictive. For example, a popular open-source project that receives dozens of pull requests daily from external contributors would need to manually merge each contribution before builds can run, significantly slowing down the contribution review process. In such cases, contributor-filtered builds or the self-hosted GitHub Actions runner approach may be more appropriate.

Contributor-filtered builds (Recommended for trusted contributors only)


{
  "webhook": {
    "payloadUrl": "https://codebuild.us-west-2.amazonaws.com/webhooks",
    "filterGroups": [
      [
        {
          "type": "EVENT",
          "pattern": "PULL_REQUEST_CREATED,PULL_REQUEST_UPDATED",
          "excludeMatchedPattern": false
        },
        {
          "type": "GITHUB_ACTOR_ACCOUNT_ID",
          "pattern": "^(12345678|87654321|11223344)$",
          "excludeMatchedPattern": false
        }
      ]
    ]
  }
}

This configuration allows pull request builds from specific, trusted contributors.

Important: Filtering applies to the GitHub account ID, not repository ownership. Contributors working from forked repositories can still introduce untrusted code that executes in your build environment.

Before implementing these configurations in your environment, consider these key factors that will help facilitate a smooth transition.

Webhook configuration implementation steps

While implementing the webhook security measures below, consider these broader practices:

  • Threat modeling: Assess your specific risk profile before selecting approaches.
  • Infrastructure as code: Use Infrastructure as Code (IaC) tools for production implementations.
  • Gradual implementation: Implement changes incrementally with observation periods.
  • Testing and rollback: Validate changes in non-production environments first.

The following implementation approach moves from most restrictive to more automated configurations. Choose the approach that best fits your organization’s risk tolerance and operational requirements.
This three-step process moves from the most restrictive approach to more automated configurations while maintaining security controls. Each step builds upon the previous one, creating layers of security that work together to protect your pipeline.

Note: The following examples use the AWS CLI for demonstration purposes. Similar configuration steps can be performed using the AWS Management Console through the AWS CodeBuild project settings.

Step 1: Configure push-only builds

Push-based builds help make sure that only verified contributors can trigger builds. This approach is more secure, because contributors must already be vetted through your repository’s access control mechanisms before they can push code.
Configure your webhook to trigger only on push events:

aws codebuild update-webhook \
  --project-name your-project-name \
  --filter-groups '[
    [
      {
        "type": "EVENT",
        "pattern": "PUSH",
        "excludeMatchedPattern": false
      }
    ]
  ]'

Step 2: Implement branch-based filtering

Branch-based filtering adds an additional layer of security by making sure that builds are triggered only for changes to specific branches. This approach recognizes that not all branches in a repository have the same security requirements or risk profiles.

For example, changes to main or production branches typically require more stringent security controls than changes to feature or development branches. By implementing branch-based filtering, you can apply appropriate security measures based on the criticality and exposure of different branches.

Configure filtering for specific branches:

aws codebuild update-webhook \
  --project-name your-project-name \
  --filter-groups '[
    [
      {
        "type": "EVENT",
        "pattern": "PUSH"
      },
      {
        "type": "HEAD_REF",
        "pattern": "^refs/heads/(main|develop|release/.*)$"
      }
    ]
  ]'

Step 3: Configure contributor filtering

Contributor filtering can be used to manage pull request builds by allowing automation for trusted contributors while requiring manual review for others. This approach recognizes that different contributors represent different risk profiles and should be treated accordingly.

The first step in implementing contributor filtering is identifying the GitHub user IDs of your trusted contributors.

Retrieve GitHub user IDs for trusted contributors:

curl -H "Authorization: token YOUR_GITHUB_TOKEN" \
https://api.github.com/users/trusted-username

Once you have the user IDs of your trusted contributors, you can configure webhook filtering to allow automated builds only for these contributors:


aws codebuild update-webhook \
  --project-name your-project-name \
  --filter-groups '[
    [
      {
        "type": "EVENT",
        "pattern": "PULL_REQUEST_CREATED,PULL_REQUEST_UPDATED"
      },
      {
        "type": "GITHUB_ACTOR_ACCOUNT_ID",
        "pattern": "^(1234567|2345678|3456789)$"
      }
    ]
  ]'

Important: Contributor allowlists require ongoing maintenance as team membership changes. Consider using Infrastructure as Code templates like the Cloudformation examples to manage webhook configurations and contributor lists in version control.

Webhook filtering provides the first layer of security by controlling which events trigger builds. However, comprehensive pipeline security requires additional controls around the permissions and credentials available to those builds once they execute. The following section covers how to implement defense-in-depth security through proper access controls and credential management.

Access control and credential management

This section covers specific approaches to limit the permissions available to build processes, scope repository access tokens appropriately, and create isolated environments that help contain potential security issues. These practices work together to implement defense-in-depth security while maintaining the operational benefits of automated CI/CD workflows.

Implementing least-privilege access

AWS CodeBuild projects require IAM service roles to access AWS resources during the build process. The principle of least privilege dictates that each role should have only the minimum permissions necessary to perform its intended function. By creating separate, purpose-built IAM roles for different types of builds, you can help reduce the potential impact of unauthorized access to build environments.

The following examples demonstrate how to structure minimal IAM roles for different build scenarios. These examples serve as starting points that you should customize based on your specific requirements, adding only the permissions your builds actually need.

Service role configuration

Create minimal IAM roles that provide only the permissions required for specific build types:

Test/validation build role
{
	"Version": "2012-10-17",
	"Statement": [
	{
		"Effect": "Allow",
		"Action": [
			"logs:CreateLogGroup",
			"logs:CreateLogStream",
			"logs:PutLogEvents"
		],
		"Resource": "arn:aws:logs:*:*:log-group:/aws/codebuild/test-*"
	},
	{
	"Effect": "Allow",
	"Action": [
		"s3:GetObject"
	],
	"Resource": "arn:aws:s3:::your-test-artifacts-bucket/*"
  }
 ]
}
Release build role (Separate from test)
{
	"Version": "2012-10-17",
	"Statement": [
	  {
		"Effect": "Allow",
		"Action": [
			"s3:PutObject",
			"s3:GetObject"
		],
		"Resource": "arn:aws:s3:::your-production-artifacts-bucket/*"
	  },
	  {
		"Effect": "Allow",
		"Action": [
			"ecr:BatchCheckLayerAvailability",
			"ecr:GetDownloadUrlForLayer",
			"ecr:BatchGetImage",
			"ecr:PutImage"
		],
		"Resource": "arn:aws:ecr:*:*:repository/your-production-repo"
	  }
	]
}

Leveraging IAM Access Analyzer for CodeBuild security

AWS IAM Access Analyzer can generate least-privilege policies for your AWS CodeBuild service roles based on actual CloudTrail activity from your build executions. This eliminates guesswork by analyzing the specific AWS API calls your builds make, rather than requiring you to predict what permissions might be needed.

After running your CodeBuild projects for a representative period, use Access Analyzer’s policy generation feature to create refined policies. This approach proves particularly valuable for complex build processes where the required permissions might not be immediately obvious.

For detailed implementation steps, refer to the IAM Access Analyzer documentation.

Credential scoping and source authentication

When processing external contributions, the principle of least privilege becomes important for repository access tokens. If an unauthorized user gains access to a token through an untrusted build, properly scoped tokens limit the potential impact to only the permissions necessary for the build process.

Configure fine-grained GitHub Personal Access Tokens with minimal permissions to help reduce this risk. Even if accessed inappropriately, a properly scoped token can only read source code (already accessible through the PR) and write status messages – it cannot push code, modify repository settings, or access other repositories.

The following permissions represent the minimum required access for processing external pull requests, demonstrating how to limit token scope to only essential operations:

  • contents:read – Read-only access to repository source code (already accessible through the PR)
  • statuses:write – Write commit status messages only (cannot modify code or settings)
  • metadata:read – Access basic repository information (name, description, public status)

Important: Use fine-grained personal access tokens restricted to the target repository only. Otherwise, this could allow access to other repositories beyond what is necessary for the build process.

This scoped approach ensures that even if a token is accessed inappropriately, the potential impact is limited to reading already-accessible information and writing status messages. The token cannot push code, modify repository settings, create webhooks, or access other repositories.

Credential storage and rotation

The following examples demonstrate how to securely store and reference these tokens using AWS Secrets Manager. AWS Secrets Manager provides automatic rotation capabilities, encryption at rest and in transit, and fine-grained access controls that help prevent tokens from being exposed in build logs or configuration files. This approach also enables centralized token management across multiple CodeBuild projects while maintaining audit trails of token access.

# Store the fine-grained token in AWS Secrets Manager
aws secretsmanager create-secret \
--name "codebuild/github-pat-limited" \
--description "Limited GitHub PAT for external PR processing" \
--secret-string '{"token":"ghp_your_limited_token_here"}'

# Create CodeBuild project with scoped credentials
aws codebuild create-project \
--name external-pr-processor \
--source '{
"type": "GITHUB",
"location": "https://github.com/your-org/your-repo.git",
"sourceCredentialsOverride": {
"serverType": "GITHUB",
"authType": "PERSONAL_ACCESS_TOKEN",
"token": "{{resolve:secretsmanager:codebuild/github-pat-limited:SecretString:token}}"
},
"reportBuildStatus": false
}' \
--service-role arn:aws:iam::account:role/minimal-test-build-role

The centralized storage enables credential rotation capabilities, helping to minimize the window of exposure compared to hardcoded tokens that would require infrastructure updates to rotate.

Build environment isolation

Establishing proper build environment security controls helps maintain pipeline integrity. The foundation of this approach involves implementing separation between test and release builds, which helps prevent credential escalation and limits the scope of potential unauthorized access.

Network isolation represents another layer of protection. Configure VPC settings specifically for builds that process external code by creating dedicated security groups with carefully restricted outbound access. These security groups should permit only necessary connections, such as HTTPS traffic for downloading legitimate dependencies, while blocking unnecessary network access that could be exploited by untrusted code.

Update your AWS CodeBuild projects to leverage this network isolation through proper VPC configuration, including specified subnets and the restricted security groups you’ve established.

Multi-stage pipeline security with human review gates

Implementing security controls across multiple pipeline stages helps provide proper validation and approval processes, especially when processing external contributions. This approach combines automated scanning with human oversight to identify issues before they reach production.

Code inspection integration

Configure your build specification to automatically run security tools like Automated Security Helper during the build process. These tools scan for code security issues and dependency problems, generating detailed reports for review.

Structure the build to continue execution even when issues are found, allowing all scans to complete while automatically failing builds that contain security problems requiring attention. Store all scan artifacts to provide security teams with detailed information for approval decisions.

Manual approval gates

After code passes automated security scans, configure manual approval gates to involve human reviewers for final validation. This helps provide appropriate human review before proceeding to sensitive environments.

The access control and credential management practices outlined in this section provide specific, actionable approaches to implementing defense-in-depth security for AWS CodeBuild pipelines. These controls work together to create multiple layers of protection while maintaining the operational benefits that make CI/CD automation valuable.

Alternative approach – Self-hosted GitHub Actions runners

AWS CodeBuild’s self-hosted GitHub Actions runner capability addresses the configuration issues described in this guide by isolating repository credentials from the build environment and using GitHub Actions’ execution framework instead of AWS CodeBuild webhook processing.

For organizations that need to process external contributions automatically, configure runners with proper access controls, use ephemeral runners to minimize persistent access, and apply standard security practices for runner management.

Configuration details are available in the AWS CodeBuild documentation. For additional implementation guidance, see AWS CodeBuild Managed Self-Hosted GitHub Action Runners blog post.

Monitoring and compliance

The security controls outlined in previous sections provide protection at build time, but comprehensive defense-in-depth security requires ongoing visibility into your pipeline activities and configuration changes. Monitoring and compliance tracking serve as the final layer of your security framework, helping you detect configuration drift, audit access patterns, and maintain security posture over time.

AWS CloudTrail provides detailed logging of API calls made to AWS services, including AWS CodeBuild. Enable CloudTrail logging to create a comprehensive audit trail of all build-related activities in your environment.

AWS Config tracks AWS CodeBuild project configurations over time, providing an inventory of projects and a complete history of configuration changes. This includes webhook modifications, resource relationships, and compliance tracking across your environment. Configure AWS Config to monitor AWS CodeBuild projects and receive notifications when security-critical configurations like webhook filters are modified. For more information, see the AWS Config sample with CodeBuild documentation.

Conclusion

Implementing defense-in-depth security for AWS CodeBuild pipelines requires layered controls that address different security considerations. The most effective approach combines webhook filtering, access controls, credential management, and monitoring to provide comprehensive protection. By implementing these layered practices outlined in this guide, you can maintain development velocity while establishing robust pipeline security.
Key principles to remember:

  • Assess your threat model first – different projects require different security approaches
  • Establish clear trust boundaries between different types of contributors
  • Use webhook filtering to control when builds are triggered
  • Implement least-privilege access for build environments
  • Monitor and audit configurations regularly using AWS Config and CloudTrail
  • Store secrets in AWS Secrets Manager or SSM Parameter Store and enable rotation

AWS CodeBuild provides the flexibility to implement these security measures while maintaining the operational benefits that make pipelines valuable. Apply the configurations and mitigations in this guide based on your specific risk profile and operational requirements. Regular review and updates of your configurations will help your pipelines remain secure as your organization’s needs evolve.


Stay tuned for additional practical guides for implementing CI/CD security best practices. If you have questions or feedback about this post, including suggestions for topics that would help you most, start a new thread on re:Post : Begimher or contact AWS Support.

Daniel Begimher

Daniel Begimher

Daniel is a Senior Security Engineer in the Global Services Security organization, specializing in cloud security, application security, and incident response solutions. He co-leads the Application Security focus area within the AWS Security and Compliance Technical Field Community, holds all AWS certifications, and authored Automated Security Helper (ASH), an open source code scanning tool. In his free time, Daniel enjoys gadgets, video games, and traveling.

Overcome development disarray with Amazon Q Developer CLI custom agents

Post Syndicated from Brian Beach original https://aws.amazon.com/blogs/devops/overcome-development-disarray-with-amazon-q-developer-cli-custom-agents/

As a developer who has embraced the power of the Model Context Protocol (MCP)to enhance my workflows, I’m thrilled to see the addition of custom agents in the Amazon Q Developer CLI. This new feature takes the capabilities I’ve come to rely on to a whole new level, allowing me to seamlessly manage different development contexts and easily switch between them.

In my previous post, I discussed how MCP servers have revolutionized the way I interact with AWS services, databases, and other essential tools. MCP integration in Amazon Q Developer allows me to query my database schemas, automate infrastructure deployments, and so much more. However, as I started juggling multiple projects, each with their own unique tech stacks and requirements, I found myself needing a more structured approach to managing these diverse development environments.

Enter custom agents. With this new feature, I can now create and use a custom agent by bringing together specific tools, prompt, context and tool permissions for tasks appropriate for the stage of development. In this post I will explain how to configure a cusom agent for front-end and back-end development. Allowing me to easily optimize Amazon Q Developer for each task.

Background

Imagine that I am working on a multi-tier web application. The application has a React front-end written in Typescript and a FastAPI back-end written in Python. In addition to me, the team includes a designer that uses Figma, and the database administrator that manages a PostgreSQL database. There are subtle differences in how I communicate with the designer and the database administrator. For example, when I discuss a “table” with the designer, I’m likely referring to an HTML table and how the page is structured. However, when I discuss a table with the database administrator, I’m likely talking about a SQL table and how data is stored.

In the past, I had both the Figma Dev Mode MCP server and Amazon Aurora PostgreSQL MCP server configured in my environment. While this allowed me to easily work on either the front-end or back-end code, it introduced some challenges. If I asked Amazon Q Developer “how many tables do I have?” Amazon Q Developer would have to guess if I was talking about HTML tables or SQL tables. If the question is about HTML, it should use the Figma server. If the question is about SQL, it should use the Aurora server. This is not a technical limitation, it’s a language limitation. Just as I have to adjust my assumptions to talk with the designer and database administrator, Amazon Q Developer has to make the same adjustments.

Enter Amazon Q Developer CLI custom agents. Custom agents allow me to optimize Q Developer’s configuration for each scenario. Let’s walk through my front-end and back-end configuration to understand the impact.

Front-end agent

My front-end custom agent is optimized for front-end web development using React and Figma. The following code example is the configuration for my front-end agent stored in ~/.aws/amazonq/agents/front-end.json. Let’s discuss the major sections of the configuration.

  • mcpServers – Here I have configured the Figma Dev Mode MCP Server. This simply communicates with the Figma Web Design App installed locally. Note that this replaces the MCP configuration that was stored in ~/.aws/amazonq/mcp.json
  • tools and allowedTools – These two sections are related, so I will discuss them together. tools defines the tools are available to Amazon Q Developer while allowedTools defines which tools are trusted. In other words, Q Developer is able to use all configured tools, and it does not have to ask my permission to use fs_read, fs_write, and @Figma. @Figma allows Amazon Q Developer to use all Figma tools without asking for permission. More on this in the next section.
  • resources – Here I have configured the files that should be added to the context. I have included the README.md (stored in the project folder) and my own preferences for React (stored in my profile). You can read more in the context management section of the user guide.
  • hooks – In addition to the resources, I have also included a hook. This hook will run a command and inject it into the context at runtime. In the example, I am adding the current git status. You can read more in the context hooks section of the user guide.
{
  "description": "Optimized for front-end web development using React and Figma",
  "mcpServers": {
    "Figma": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "http://127.0.0.1:3845/sse"
      ]
    }
  },
  "tools": ["*"],
  "allowedTools": [
    "fs_read",
    "fs_write",
    "report_issues",
    "@Figma"
  ],
  "resources": [
    "file://README.md",
    "file://~/.aws/amazonq/react-preferences.md"
  ],
  "hooks": {
    "agentSpawn": [
      {
        "command": "git status"
      }
    ]
  }
}

Back-end agent

My back-end custom agent is optimized for back-end development with Python and PostgreSQL. The following code example is the configuration for my back-end agent stored in ~/.aws/amazonq/agents/back-end.json. Rather than describing the sections, as I did earlier, I will focus on the differences between the front-end and back-end.

  • mcpServers – Here I have configured the Amazon Aurora PostgreSQL MCP Server. This allows Amazon Q Developer to query my dev database to learn about the schema. Notice that I have configured a read-only connection to ensure that I don’t accidentally update the database.
  • tools and allowedTools – Once again, I have enabled Amazon Q Developer to use all tools. However, notice that I am more restrictive about what tools are trusted. Amazon Q Developer will need to ask permission to use fs_write or @PostgreSQL/run_query. Notice that I can allow the entire MCP server as I did with Figma or specific tools as I did here.
  • resources – Again, I have included the README.md (stored in the project folder) and my own preferences for Python and SQL (both stored in my profile). Note that I can also use glob patterns here. For example, file://.amazonq/rules/**/*.md would include the rules created by the Amazon Q Developer IDE plugins.
  • hooks – Finally, I have also included the hook for the front-end and back-end. However, I could have included project specific options such as npm run for the front-end and pip freeze for the back-end.
{
  "description": "Optimized for back-end development with Python and PostgreSQL",
  "mcpServers": {
    "PostgreSQL": {
      "command": "uvx",
      "args": [
        "awslabs.postgres-mcp-server@latest",
        "--resource_arn", "arn:aws:rds:us-east-1:xxxxxxxxxxxx:cluster:xxxxxx",
        "--secret_arn", "arn:aws:secretsmanager:us-east-1:xxxxxxxxxxxx:secret:rds!cluster-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxx-xxxxxx",
        "--database", "dev",
        "--region", "us-east-1",
        "--readonly", "True"
      ]
    }
  },
  "tools": ["*"],
  "allowedTools": [
    "fs_read",
    "report_issues",
    "@PostgreSQL/get_table_schema"
  ],
  "resources": [
    "file://README.md",
    "file://~/.aws/amazonq/python-preferences.md",
    "file://~/.aws/amazonq/sql-preferences.md"
  ],
  "hooks": {
    "agentSpawn": [
      {
        "command": "git status"
      }
    ]
  }
}

Using custom agents

The real power of agents becomes evident when I need to switch between these different development contexts. I can now simply run q chat --agent front-end when I am working on React and Figma or q chat --agent back-end when I am working with Python and SQL. Amazon Q Developer will configure the correct agent with all my preferences.

In the following image, you can see the configuration in the Amazon Q Developer CLI. Notice that the front-end agent has an additional tool called Figma while the back-end agent has an additional tool called PostgreSQL. In addition, the front-end agent trusts fs_write and all of the Figma tools while the back-end agent will ask permission to use fs_write and only trusts one of the two PostgreSQL tools.

A split terminal view showing tool permissions for front-end and back-end environments. Both displays list built-in commands like execute_bash, fs_read, fs_write, report_issue, and use_aws, along with their permission status (trusted, not trusted, or trust read-only commands). The front-end environment also shows Figma (MCP) related permissions, while the back-end shows PostgreSQL (MCP) permissions. At the bottom of each view is a note that trusted tools will run without confirmation and instructions to use "/tools help" to edit permissions.

Similarly, let’s look at the context configuration in both the front-end and back-end agents. In the following image, I have included my React preferences for front-end development, and both Python and SQL preferences for back-end development.

A split terminal view showing the output of "/context show" command for both front-end and back-end environments. The front-end agent shows matches for "~/.aws/amazonq/react-preferences.md" and "README.md", while the back-end agent shows matches for "~/.aws/amazonq/python-preferences.md", "~/.aws/amazonq/sql-preferences.md", and "README.md". Each file is marked with "(1 match)" in green text.

As you can see, custom agents allow me to optimize the Amazon Q Developer CLI for each task. Of course, front-end and back-end agents are just an example. You might have a developer and testing agents, data science and analytics agents, etc. Custom agents allow you to tailor the configuration to most any task.

Conclusion

Amazon Q Developer CLI custom agents represent a significant improvement in managing complex development environments. By allowing developers to seamlessly switch between different contexts, they eliminate the cognitive overhead of manually reconfiguring tools and permissions for different tasks. Ready to streamline your development workflow? Get started with Amazon Q Developer today.

AWS Weekly Roundup: Kiro, AWS Lambda remote debugging, Amazon ECS blue/green deployments, Amazon Bedrock AgentCore, and more (July 21, 2025)

Post Syndicated from Donnie Prakoso original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-kiro-aws-lambda-remote-debugging-amazon-ecs-blue-green-deployments-amazon-bedrock-agentcore-and-more-july-21-2025/

I’m writing this as I depart from Ho Chi Minh City back to Singapore. Just realized what a week it’s been, so let me rewind a bit. This week, I tried my first Corne keyboard, wrapped up rehearsals for AWS Summit Jakarta with speakers who are absolutely raising the bar, and visited Vietnam to participate as a technical keynote speaker in AWS Community Day Vietnam, an energetic gathering of hundreds of cloud practitioners and AWS enthusiasts who shared knowledge through multiple technical tracks and networking sessions.

What I presented was a keynote titled “Reinvent perspective as modern developers”, featuring serverless, containers, and how we can cut the learning curves and be more productive with Amazon Q Developer and Kiro. I got a chance to discuss with a couple of AWS Community Builders and community developers, who shared how Amazon Q Developer actually addressed their challenges on building applications, with several highlighting significant productivity improvements and smoother learning curves in their cloud development journeys.

As I head back to Singapore, I’m carrying with me not just memories of delicious cà phê sữa đá (iced milk coffee), but also fresh perspectives and inspirations from this vibrant community of cloud innovators.

Introducing Kiro
One of the highlights from last week was definitely Kiro, an AI IDE that helps you deliver from concept to production through a simplified developer experience for working with AI agents. Kiro goes beyond “vibe coding” with features like specs and hooks that help get prototypes into production systems with proper planning and clarity.

Join the waitlist to get notified when it becomes available.

Last week’s AWS Launches
In other news, last week we had AWS Summit in New York, where we released several services. Here are some launches that caught my attention:

Console to IDE Integration

ECS Blue-Green Deployments

AWS Free Tier Enhanced Benefits

  • Monitor and debug event-driven applications with new Amazon EventBridge logging — Amazon EventBridge now provides enhanced logging capabilities that offer comprehensive event lifecycle tracking with detailed information about successes, failures, and status codes. This new observability feature addresses microservices and event-driven architecture monitoring challenges by providing visibility into the complete event journey.

EventBridge Enhanced Logging

S3 Vectors Overview

  • Amazon EKS enables ultra-scale AI/ML workloads with support for 100k nodes per cluster — Amazon EKS now supports up to 100,000 worker nodes in a single cluster, enabling customers to scale up to 1.6 million AWS Trainium accelerators or 800K NVIDIA GPUs. This industry-leading scale empowers customers to train trillion-parameter models and advance AGI development while maintaining Kubernetes conformance and familiar developer experience.

EKS Ultra-Scale Performance Improvements

From AWS Builder Center
In case you missed it, we just launched AWS Builder Center and integrated community.aws. Here are my top picks from the posts:

Upcoming AWS events
Check your calendars and sign up for upcoming AWS and AWS Community events:

  • AWS re:Invent – Register now to get a head start on choosing your best learning path, booking travel and accommodations, and bringing your team to learn, connect, and have fun. If you’re an early-career professional, you can apply to the All Builders Welcome Grant program, which is designed to remove financial barriers and create diverse pathways into cloud technology.
  • AWS Builders Online Series – If you’re based in one of the Asia Pacific time zones, join and learn fundamental AWS concepts, architectural best practices, and hands-on demonstrations to help you build, migrate, and deploy your workloads on AWS.
  • AWS Summits — Join free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS. Register in your nearest city: Taipei (July 29), Mexico City (August 6), and Jakarta (June 26–27).
  • AWS Community Days — Join community-led conferences that feature technical discussions, workshops, and hands-on labs led by expert AWS users and industry leaders from around the world: Singapore (August 2), Australia (August 15), Adria (September 5), Baltic (September 10), and Aotearoa (September 18).

You can browse all upcoming AWS led in-person and virtual developer-focused events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

Donnie

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!


Join Builder ID: Get started with your AWS Builder journey at builder.aws.com

Simplify serverless development with console to IDE and remote debugging for AWS Lambda

Post Syndicated from Micah Walter original https://aws.amazon.com/blogs/aws/simplify-serverless-development-with-console-to-ide-and-remote-debugging-for-aws-lambda/

Today, we’re announcing two significant enhancements to AWS Lambda that make it easier than ever for developers to build and debug serverless applications in their local development environments: console to IDE integration and remote debugging. These new capabilities build upon our recent improvements to the Lambda development experience, including the enhanced in-console editing experience and the improved local integrated development environment (IDE) experience launched in late 2024.

When building serverless applications, developers typically focus on two areas to streamline their workflow: local development environment setup and cloud debugging capabilities. While developers can bring functions from the console to their IDE, they’re looking for ways to make this process more efficient. Additionally, as functions interact with various AWS services in the cloud, developers want enhanced debugging capabilities to identify and resolve issues earlier in the development cycle, reducing their reliance on local emulation and helping them optimize their development workflow.

Console to IDE integration

To address the first challenge, we’re introducing console to IDE integration, which streamlines the workflow from the AWS Management Console to Visual Studio Code (VS Code). This new capability adds an Open in Visual Studio Code button to the Lambda console, enabling developers to quickly move from viewing their function in the browser to editing it in their IDE, eliminating the time-consuming setup process for local development environments.

The console to IDE integration automatically handles the setup process, checking for VS Code installation and the AWS Toolkit for VS Code. For developers that have everything already configured, choosing the button immediately opens their function code in VS Code, so they can continue editing and deploy changes back to Lambda in seconds. If VS Code isn’t installed, it directs developers to the download page, and if the AWS Toolkit is missing, it prompts for installation.

To use console to IDE, look for the Open in VS Code button in either the Getting Started popup after creating a new function or the Code tab of existing Lambda functions. After selecting, VS Code opens automatically (installing AWS Toolkit if needed). Unlike the console environment, you now have access to a full development environment with integrated terminal – a significant improvement for developers who need to manage packages (npm install, pip install), run tests, or use development tools like linters and formatters. You can edit code, add new files/folders, and any changes you make will trigger an automatic deploy prompt. When you choose to deploy, the AWS Toolkit automatically deploys your function to your AWS account.

Screenshot showing Console to IDE

Remote debugging

Once developers have their functions in their IDE, they can use remote debugging to debug Lambda functions deployed in their AWS account directly from VS Code. The key benefit of remote debugging is that it allows developers to debug functions running in the cloud while integrated with other AWS services, enabling faster and more reliable development.

With remote debugging, developers can debug their functions with complete access to Amazon Virtual Private Cloud (VPC) resources and AWS Identity and Access Management (AWS IAM) roles, eliminating the gap between local development and cloud execution. For example, when debugging a Lambda function that interacts with an Amazon Relational Database Service (Amazon RDS) database in a VPC, developers can now debug the execution environment of the function running in the cloud within seconds, rather than spending time setting up a local environment that might not match production.

Getting started with remote debugging is straightforward. Developers can select a Lambda function in VS Code and enable debugging in seconds. AWS Toolkit for VS Code automatically downloads the function code, establishes a secure debugging connection, and enables breakpoint setting. When debugging is complete, AWS Toolkit for VS Code automatically cleans up the debugging configuration to prevent any impact on production traffic.

Let’s try it out

To take remote debugging for a spin, I chose to start with a basic “hello world” example function, written in Python. I had previously created the function using the AWS Management Console for AWS Lambda. Using the AWS Toolkit for VS Code, I can navigate to my function in the Explorer pane. Hovering over my function, I can right-click (ctrl-click in Windows) to download the code to my local machine to edit the code in my IDE. Saving the file will ask me to decide if I want to deploy the latest changes to Lambda.

Screenshot view of the Lambda Debugger in VS Code

From here, I can select the play icon to open the Remote invoke configuration page for my function. This dialog will now display a Remote debugging option, which I configure to point at my local copy of my function handler code. Before choosing Remote invoke, I can set breakpoints on the left anywhere I want my code to pause for inspection.

My code will be running in the cloud after it’s invoked, and I can monitor its status in real time in VS Code. In the following screenshot, you can see I’ve set a breakpoint at the print statement. My function will pause execution at this point in my code, and I can inspect things like local variable values before either continuing to the next breakpoint or stepping into the code line by line.

Here, you can see that I’ve chosen to step into the code, and as I go through it line by line, I can see the context and local and global variables displayed on the left side of the IDE. Additionally, I can follow the logs in the Output tab at the bottom of the IDE. As I step through, I’ll see any log messages or output messages from the execution of my function in real time.

Enhanced development workflow

These new capabilities work together to create a more streamlined development experience. Developers can start in the console, quickly transition to VS Code using the console to IDE integration, and then use remote debugging to debug their functions running in the cloud. This workflow eliminates the need to switch between multiple tools and environments, helping developers identify and fix issues faster.

Now available

You can start using these new features through the AWS Management Console and VS Code with the AWS Toolkit for VS Code (v3.69.0 or later) installed. Console to IDE integration is available in all commercial AWS Regions where Lambda is available, except AWS GovCloud (US) Regions. Learn more about it in Lambda and AWS Toolkit for VS Code documentation. To learn more about remote debugging capability, including AWS Regions it is available in, visit the AWS Toolkit for VS Code and Lambda documentation.

Console to IDE and remote debugging are available to you at no additional cost. With remote debugging, you pay only for the standard Lambda execution costs during debugging sessions. Remote debugging will support Python, Node.js, and Java runtimes at launch, with plans to expand support to additional runtimes in the future.

These enhancements represent a significant step forward in simplifying the serverless development experience, which means developers can build and debug Lambda functions more efficiently than ever before.

GitOps continuous delivery with ArgoCD and EKS using natural language

Post Syndicated from Jagdish Komakula original https://aws.amazon.com/blogs/devops/gitops-continuous-delivery-with-argocd-and-eks-using-natural-language/

Introduction

ArgoCD is a leading GitOps tool that empowers teams to manage Kubernetes deployments declaratively, using Git as the single source of truth. Its robust feature set, including automated sync, rollback support, drift detection, advanced deployment strategies, RBAC integration, and multi-cluster support, makes it a go-to solution for Kubernetes application delivery. However, as organizations scale, several pain points and operational challenges become apparent.

Pain Points with Traditional ArgoCD Usage

  • ArgoCD’s UI and CLI are designed for users with extensive technical background. Interacting with YAML manifests, understanding Kubernetes resource relationships, and troubleshooting sync errors require specialized knowledge. This limits access to GitOps workflows for less technical stakeholders and increases reliance on DevOps engineers.
  • Managing ArgoCD across multiple clusters or environments (using hub-spoke, per-cluster, or grouped models) introduces significant operational complexity. Teams must handle multiple ArgoCD instances, maintain consistent configuration, and coordinate deployments, which can become a bottleneck as service footprints grow.
  • ArgoCD excels at syncing and monitoring Kubernetes resources but lacks built-in mechanisms for pre-deployment (e.g., image scanning) or post-deployment (e.g., load testing) tasks. This forces teams to rely on external tools or custom scripts, fragmenting the deployment pipeline and increasing maintenance effort.
  • Promoting applications across environments (Dev → Test → Prod) is not natively streamlined. Teams must manually orchestrate or script these promotions, slowing down urgent fixes and complicating the release process.
  • As organizations adopt multi-cluster strategies, managing ArgoCD’s access, RBAC, and resource visibility across environments becomes cumbersome, often leading to fragmented workflows and potential security gaps.

How ArgoCD MCP Server with Amazon Q CLI addresses these challenges:

  • The integration of the ArgoCD MCP (Model Context Protocol) Server with Amazon Q CLI fundamentally transforms the user experience by introducing natural language interaction for GitOps operations.
  • With MCP, users can manage deployments, monitor application states, and perform sync or rollback operations using plain conversational language rather than technical commands or YAML. For example, a user can simply ask, “What applications are out of sync in production?” or “Sync the api-service application,” and the system executes the appropriate ArgoCD API calls in the background.
  • This democratizes access to GitOps, enabling less technical team members (such as QA, product managers, or support engineers) to safely interact with deployment workflows.
  • Natural language interfaces abstract away the complexity of multi-cluster and multi-environment management. Users can query or act on resources across clusters without memorizing resource names, namespaces, or API endpoints.
  • The MCP server handles authentication, session management, and robust error handling, reducing the need for manual troubleshooting and custom scripting.
  • The integration provides detailed feedback, intelligent endpoint handling, and comprehensive error messages, making it easier to diagnose and resolve issues. Full static type checking and environment-based configuration further enhance reliability and maintainability.
  • By leveraging Amazon Q CLI’s extensibility, users gain access to pre-built integrations and context-aware prompts, accelerating development and deployment workflows.
  • The MCP server enables AI assistants and language models to automate routine tasks, recommend actions, and even debug issues, acting as a virtual DevOps engineer. This can significantly reduce manual effort and speed up incident response.

Traditional ArgoCD vs. ArgoCD MCP Server with Amazon Q CLI

Feature/Challenge Traditional ArgoCD With MCP Server + Amazon Q CLI
User Interface Technical UI/CLI, YAML required Natural language, conversational
Access for Non-Engineers Limited Broad, democratized
Multi-Cluster Management Complex, manual Simplified, abstracted
Pre-Post Deployment Tasks External tools/scripts needed (Still external, but easier to invoke)
Application Promotion Manual or scripted Natural language, easier orchestration
Troubleshooting Technical, error-prone Guided, AI-assisted, detailed feedback
Automation Scripting required AI/agent-driven, proactive

You can perform the following actions using natural language using Amazon Q CLI integration with ArgoCD MCP server.

  • Application Management: List, create, update, and delete ArgoCD applications
  • Sync Operations: Trigger sync operations and monitor their status
  • Resource Tree Visualization: View the hierarchy of resources managed by applications
  • Health Status Monitoring: Check the health of applications and their resources
  • Event Tracking: View events related to applications and resources
  • Log Access: Retrieve logs from application workloads
  • Resource Actions: Execute actions on resources managed by applications

Setting Up Your Environment

Pre-requisites

Following are the pre-requisites for setting up your EKS environment to be managed by ArgoCD using Amazon Q CLI.

  • An AWS account with appropriate permissions
  • AWS CLI v2.13.0 or later
  • Node.js v18.0.0 or later
  • npm v9.0.0 or later
  • Amazon Q CLI v1.0.0 or later (npm install -g @aws/amazon-q-cli)
  • An EKS cluster (v1.27 or later) with ArgoCD v2.8 or later installed

Connecting to your EKS cluster

  1. Use AWS CLI to update your kubeconfig

aws eks update-kubeconfig --name <cluster_name> --region <region> --role-arn <iam_role_arn>

  1. Verify ArgoCD pods are running properly in the argocd namespace

kubectl get pods -n argocd

  1. Access the ArgoCD server UI locally using port forwarding command

kubectl port-forward svc/blueprints-addon-argocd-server -n argocd 8080:443

Create AgroCD API Token

  1. Access the ArgoCD UI at https://localhost:8080
  2. Log in with the admin credentials
  3. Navigate to User Settings > API Tokens
  4. Click “Generate New” to create a token
  5. Create an Amazon Q CLI MCP configuration file at .amazonq/mcp.json and update the ARGOCD_BASE_URL and ARGOCD_API_TOKEN as per your environment setup.

Integrating with Amazon Q CLI

{ 
  "mcpServers": {
    "argocd-mcp-stdio": { 
      "type": "stdio", 
      "command": "npx", 
      "args": [ 
         "argocd-mcp@latest", 
         "stdio" 
      ], 
      "env": { 
        "ARGOCD_BASE_URL": "<ARGOCD_BASE_URL>",
        "ARGOCD_API_TOKEN": "<ARGOCD_API_TOKEN>", 
        "NODE_TLS_REJECT_UNAUTHORIZED": "0" 
      } 
    } 
  }
}

Once configured, you can start using natural language commands with Amazon Q CLI to interact with your ArgoCD applications.

Managing ArgoCD applications using natural language

Listed below are some example prompts to interact with ArgoCD applications in your EKS cluster.

List ArgoCD application

Prompt: “List all ArgoCD applications in my cluster

Amazon Q listing all ArgoCD applications in my clusterAmazon Q will use the ArgoCD MCP server to retrieve and display all applications

Create new ArgoCD application

Prompt: Create new argocd application using App name: game-2048   Repo: https://github.com/aws-ia/terraform-aws-eks-blueprints  Path: patterns/gitops/getting-started-argocd/k8s. Branch: main  Namespace: argocd

Amazon Q creating new argocd application using MCP ServerAmazon Q will create a new application from GitRepo information provided

Viewing deployment status

Prompt: “Show me the resource tree for team-carmen app

Amazon Q showing Resource tree of argocd application
Amazon Q will display the hierarchy of Kubernetes resources managed by the application

Synchronizing applications

Prompt: “Show me the applications that’s out of sync

Amazon Q showing argocd out of sync applicationsAmazon Q will display the out of sync applications

Prompt: “Sync the application

Amazon Q syncing argocd applicationsAmazon Q syncing application

Amazon Q will:

  • Initiate a sync operation for the specified application
  • Monitor the sync progress
  • Report the final status of the sync operation

Healthchecks and monitoring

Prompt:”Check the health of all resources in the team-geordie application

Amazon Q showing health status of all the resources in an applicationAmazon Q showing health status of all the resources in an application

Amazon Q will:

  • Retrieve the health status of all resources
  • Identify any unhealthy components
  • Provide recommendations for addressing issues

Prompt: “Show me the logs for the failing pod in the team-platform application

Amazon Q showing logs for the failing podAmazon Q showing logs of problematic pod

Amazon Q will:

  • Identify problematic pods
  • Retrieve and display relevant logs
  • Highlight potential error messages

Conclusion

The integration of Amazon Q CLI with ArgoCD through the MCP server marks a transformative advancement in Kubernetes management, combining ArgoCD’s GitOps capabilities with Amazon Q’s natural language processing. By transforming complex Kubernetes operations into simple conversational interactions, this solution allows teams to focus on what truly matters – creating value for their business. Rather than spending time memorizing commands or navigating technical complexities, teams can now manage their cloud infrastructure through natural dialogue, making the cloud-native journey more accessible and efficient for everyone.Ready to transform your EKS and ArgoCD experience? It’s highly recommended to try out Amazon Q CLI integration with ArgoCD MCP and discover why DevOps teams are making it an essential part of their toolkit.


About the authors

Jagdish Komakula Picture Jagdish Komakula is a passionate Sr. Delivery Consultant working with AWS Professional Services. With over two decades of experience in Information Technology, he helped numerous enterprise clients successfully navigate their digital transformation journeys and cloud adoption initiatives.
Aditya Ambati Picture Aditya Ambati, Is an experienced DevOps Engineer with 12 plus years of experience in IT. Excellent reputation for resolving problems, improving customer satisfaction, and driving overall operational improvements.
Anand Krishna Varanasi Picture Anand Krishna Varanasi, is a seasoned AWS builder and architect who began his career over 16 years ago. He guides customers with cutting-edge cloud technology migration strategies (the 7 Rs) and modernization. He is very passionate about the role that technology plays in bridging the present with all the possibilities for our future.

 

Managing Amazon Q Developer Profiles and Customizations in Large Organizations

Post Syndicated from Marco Frattallone original https://aws.amazon.com/blogs/devops/managing-amazon-q-developer-profiles-and-customizations-in-large-organizations/

As organizations scale their development efforts, AI coding assistants that understand organization-specific patterns and standards lead to more efficient development processes and higher quality software delivery. Amazon Q Developer Pro helps address this challenge by allowing organizations to customize the AI assistant with their proprietary code and development practices. Through Amazon Q Developer profiles, teams can efficiently manage access to Amazon Q customizations across different regions and AWS Identity Centers.

In this post, we will explore different approaches for implementing and managing Amazon Q Developer profiles and Amazon Q customizations across large organizations. Using an example with multiple business units, we will explore methods for managing access controls and customization governance while addressing security and compliance requirements.

Amazon Q customization is now available in both the US East (N. Virginia) and EU Central (Frankfurt) regions, giving teams more flexibility to create and deploy customizations closer to their operational hubs while meeting regional data residency requirements.

This blog is not intended to provide recommendations on how to structure your AWS accounts or divide Q Developer subscriptions. Rather, our aim is to explore the full capabilities of Q Developer Customizations in a comprehensive scenario that shows the current art of the possible.

A distributed Amazon Q Developer Pro subscriptions scenario

The following diagram illustrates a sample AWS Organizations structure with a Management Account and four Organizational Units (OUs). This is a common enterprise scenario with three business units, each business unit requiring their own Amazon Q Developer Pro subscription and customizations.

Diagram showing AWS Organizations structure with a Management Account at the top, containing AWS Organizations, IAM Identity Center, Amazon Q, Management Customizations, and AWS Cost & Usage Report. Below are four Organizational Units (OUs): Infrastructure, Alpha, Bravo, and Charlie. The structure illustrates the hierarchical relationship and resource allocation across different OUs and regions within an AWS organization.

Figure 1: AWS Organizations Structure and Resource Hierarchy

The Infrastructure OU has a Delegated Admin Account with delegated access to the AWS IAM Identity Center. There are three additional OUs: Alpha, Bravo, and Charlie, each with at least one Amazon Q Developer Pro subscription. Alpha account has Amazon Q Developer subscriptions both in US East (N. Virginia) and EU Central (Frankfurt) region.

Think of each business unit as its own ecosystem within your organization. When you provide dedicated Q Developer Pro subscriptions to different OUs, you’re essentially giving each unit its own personalized AI assistant. This separation is valuable because it allows each team to work independently while maintaining their specific requirements and workflows.

The Charlie OU maintains its own account instance of IAM Identity Center for Amazon Q Developer Pro. In most cases, we recommend using an organization instance of IAM Identity Center with Amazon Q Developer Pro, there are a few situations where member account instances might make sense, for example: when you do not have a single identity provider, or when you haven’t yet decided to deploy it to the whole organization and want to use Amazon Q just for the AWS account you control.

Note: When a developer has a user within an Amazon Q profile tied to two different IAM Identity Center instances (Bravo and Charlie), they will have two user subscriptions and be billed twice. However, if they belong to two different Amazon Q profiles in two different accounts (Alpha and Bravo) but under the same IAM Identity Center, they will only be billed once.

In our example, the Charlie OU requires additional operational overhead in managing separate credentials and authentication flows. Additionally, the dashboard and administrative settings will only be associated with users and groups within this account.
From an administrative perspective, instead of trying to manage one centralized configuration that attempts to serve everyone’s needs, you can distribute administration to each business unit and delegate responsibility to individual teams.

It’s like having different specialized departments in a hospital – while they’re all part of the same organization and can work together when needed, each department has its own specialized tools and protocols that help them perform their specific functions more effectively.

A strategic approach to Customizations through Q Developer profiles

A diagram illustrating the structure of an AWS IAM Identity Center organization with multiple Amazon Q Developer Pro subscriptions and customizations. Each Q Developer Pro Subscription has its own set of users representing developers. Team orange developers have access to Alpha Q Subscription and customizations, Team blue developers have access to Alpha, Bravo and Charlie Q Subscription and customizations, Team Grey developers have only access to Bravo Subscription and customizations. The organization has also an AWS IAM Identity Center instance, with separate Amazon Q Developer Pro subscription and customizations. Team bravo developers are duplicated between the two IAM Identity Centers.

Figure 2 Developers association to Amazon Q Developer Pro Subscriptions, Customizations and IAM Identity Centers

Amazon Q Developer profiles are the way developers connect to different Amazon Q Developer subscriptions through their IDE. Each profile represents a unique combination of an Amazon Q Developer subscription and its associated customizations. After authentication, developers can simply select or switch between profiles in their IDE to access different customizations.

Let’s walk through some scenarios in this architecture.

Scenario 1 – Users accessing two different customizations tied to a single IAM Identity Instance in the management account

Developers from the Orange team with access to Alpha account customizations can configure two different Amazon Q Developer profiles in their IDE:

  • A “US Profile” connected to the US East subscription in the Alpha account
  • An “EU Profile” connected to the EU Central subscription in the Alpha account

Switching between different sets of customizations involves selecting the relevant profile within their IDE.

Screenshot of IDE interface showing the Amazon Q Developer customizations panel. Developer switch between US and EU Profiles and their customizations

Figure 3 IDE showing customizations available for Team Orange developers switching between US and EU Profile and their customizations

Note: While developers can access multiple customizations through different Amazon Q Developer profiles, they only incur a single user subscription cost since they are using the organization instance of IAM Identity Center. This is because the subscription is tied to their user identity in the IAM Identity Center organization instance, not to the number of profiles or customizations they access.

Scenario 2 – Users accessing two different customizations tied to a single IAM Identity Instance in the management account
Similarly, developers from the Blue team can also configure multiple profiles:

  • One profile for accessing Alpha and Bravo customizations through the management account AWS IAM Identity Center instance
  • A separate profile for accessing Charlie customizations through the AWS IAM Identity Center member account Instance

When developers have access to multiple customizations within the same IAM Identity Center configuration and region, they can switch between profiles in their IDE without requiring reauthentication.

Screenshot of IDE interface showing the Amazon Q Developer customizations panel. When authenticated through the AWS IAM Identity Center Organization, Blue developers can see both Alpha and Bravo customizations.

Figure 4 IDE showing customizations available for Team Blue developers when authenticated to AWS IAM Identity center Organization

However, as demonstrated in the blue developers’ case, switching between profiles that use different IAM Identity Center configurations (Organization vs Account Instance) still requires reauthentication.

Note: In this scenario, developers will incur two separate user subscription charges since they are accessing customizations through two different IAM Identity Center configurations (organization and account instance). As mentioned above, this scenario is not recommended except for situations it might make sense and is shown here purely to illustrate how the authentication and profile switching mechanisms work across different IAM Identity Center configurations.

Screenshot of IDE interface showing the Amazon Q Developer customizations panel. When authenticated through the AWS IAM Identity Center Instance, Blue developers can see only Charlie customizations.

Figure 5 IDE showing customizations available for Team Blue developers when authenticated to AWS IAM Identity center Account Instance

One scenario for creating code customizations specific to each profile is that the developers on the Alpha team might need Q to understand specific libraries and internal coding conventions for Java, while Bravo team developers might need Q to be well-versed in your proprietary technologies and development standards with Python. With separate profiles and customizations, each team gets their own “flavored” version of Q that understands their context.

For Blue developers who have access to Alpha, Bravo and Charlie customizations, they need to set up separate profiles since these customizations belong to different IAM Identity Center configurations and AWS Regions. Switching between these profiles requires reauthentication due to the different IAM Identity Center configurations involved.

Developer Team AWS IAM Identity Center Customizations
Orange Organization instance Alpha customizations in US East (N. Virginia)
Alpha customizations in EU Central (Frankfurt)
Blue Organization instance Alpha customizations in US East (N. Virginia)
Bravo customizations
Account instance Charlie customizations
Grey Organization instance Bravo customizations

You can manage access to specific Amazon Q Developer Pro customizations by adding selected users and groups who already have access to Amazon Q Developer Pro subscriptions within the same Identity Center. This granular access control allows you to create targeted customizations that are only accessible to specific team members or groups within your organization.

Conclusion

In this post, we explored comprehensive strategies for implementing Amazon Q Developer customizations across large organizations. We demonstrated how Amazon Q Developer profiles provide a flexible way to manage access to different customizations across AWS regions and IAM Identity Center configurations. By integrating proprietary code repositories, establishing customization governance, and implementing continuous feedback loops, enterprises can maximize the value of their AI-powered development assistant while maintaining code quality and development standards.

The path forward depends on where you are in your Amazon Q Developer customization journey. If you’re just starting, begin with a clear assessment of your codebase and map out your customization approach before implementation. For existing users, review your current customizations and profile configurations to identify optimization opportunities.

In both cases, implement the customization governance we discussed, tailoring them to your specific development patterns and team structures. Remember that customization evolves with your codebase – regular refinements help ensure your AI assistant remains effective as your applications grow and development practices mature. Whether you’re new to Amazon Q Developer customizations or optimizing existing implementations, these practices can help develop an AI assistant that truly understands and aligns with your organization’s unique development environment.

Ready to get started? Visit the Amazon Q Developer guide to learn more about setting up profiles and customizations for your organization. If you need help planning your customization strategy, contact your AWS account team or find an AWS Partner in the AWS Partner Network.

About the authors:

Marco Frattallone

Marco Frattallone is a Senior Technical Account Manager at AWS focused on supporting Partners. He works closely with Partners to help them build, deploy, and optimize their solutions on AWS, providing guidance and leveraging best practices. Marco is passionate about technology and enables Partners stay at the forefront of innovation. Outside work, he enjoys outdoor cycling, sailing, and exploring new cultures.

Francesco Martini

Francesco Martini is a Senior Technical Account Manager at AWS. He helps AWS customers build reliable and cost-effective systems and achieve operational excellence while running workloads on AWS. He is a builder and a technology enthusiast with a background as a full-stack developer. He is passionate about sports in general, especially soccer and tennis.

Streamline Operational Troubleshooting with Amazon Q Developer CLI

Post Syndicated from Kirankumar Chandrashekar original https://aws.amazon.com/blogs/devops/streamline-operational-troubleshooting-with-amazon-q-developer-cli/

Amazon Q Developer is the most capable generative AI–powered assistant for software development, helping developers perform complex workflows. Amazon Q Developer command-line interface (CLI) combines conversational AI with direct access to AWS services, helping you understand, build, and operate applications more effectively. The Amazon Q Developer CLI executes commands, analyzes outputs, and provides contextual recommendations based on best practices for troubleshooting tools and platforms available on your local machine.

In today’s cloud-native environments, troubleshooting production issues often involves juggling multiple terminal windows, parsing through extensive log files, and navigating numerous AWS console pages. This constant context-switching delays problem resolution and adds cognitive burden to teams managing cloud infrastructure.

In this blog post, you will explore how Amazon Q Developer CLI transforms the troubleshooting experience by streamlining challenging scenarios through conversational interactions.

The Traditional Troubleshooting Experience

When issues arise, engineers typically spend hours manually examining infrastructure configurations, reviewing logs across services, and analyzing error patterns. The process requires switching between multiple interfaces, correlating information from various sources, and deep AWS knowledge. This complex workflow often extends problem resolution from hours into days and increase the burden on the infrastructure teams.

Solution: Amazon Q Developer CLI

Amazon Q Developer CLI streamlines the entire troubleshooting process, from initial investigation to problem resolution, making complex AWS troubleshooting accessible and efficient through simple conversations.

How Amazon Q Developer CLI works:

  • Natural Language Interface: Execute AWS CLI commands and interact with AWS services using conversational prompts
  • Automated Discovery: Map out infrastructure and analyze configurations
  • Intelligent Log Analysis: Parse, correlate, and analyze logs across services
  • Root Cause Identification: Pinpoint issues through AI-powered reasoning
  • Guided Remediation: Implement fixes with minimal human intervention
  • Validation: Test solutions and explain complex issues simply

One of the built-in tools within the Amazon Q Developer CLI, use_aws, enables natural language interaction with AWS services, as shown in Figure 1. This tool leverages the AWS CLI permissions configured on your local machine, allowing secure and authorized access to your AWS resources.

A command line interface showing a list of tools and their permissions. The display is titled "/tools" and shows several built-in tools including execute_bash, fs_read, fs_write, report_issue, and use_aws. Each tool has an associated permission level indicated by asterisks. The use_aws tool is highlighted with "trust read-only commands" permission. At the bottom, there's a note stating "Trusted tools will run without confirmation" and a tip to "Use /tools help to edit permissions".

Figure 1: Tools selection in Amazon Q Developer CLI

Real-World Troubleshooting Scenario

Demonstration Environment Setup

This demonstration was performed with the following environment configuration:

The environment includes a local development machine with necessary tools, appropriate AWS account permissions, and terminal access. By starting Amazon Q Developer CLI in the project directory, it has immediate access to relevant code and configuration files.

Scenario: Troubleshooting NGINX 5XX Errors

The scenario demonstrates troubleshooting a multi-tier application architecture as shown in figure 2 deployed on Amazon ECS Fargate with:

  • Application Load Balancer (ALB) distributing traffic across availability zones
  • NGINX reverse proxy service handling incoming requests
  • Node.js backend service processing business logic
  • Service discovery enabling internal communication
  • CloudWatch Logs providing centralized logging

An AWS cloud architecture diagram showing the flow of traffic from an Internet user through multiple components. The diagram includes: At the top: An Internet user connecting to an Internet Gateway Within a VPC (Virtual Private Cloud): Two public subnets containing a NAT Gateway and Application Load Balancer Two private subnets within an ECS Cluster containing: An NGINX service (Fargate) A Backend service (Fargate) A 10-second timeout between them A Cloud Map Service Discovery component at the bottom CloudWatch Logs integration on the right side The diagram includes a note about gateway timeouts: "504 Gateway Timeout - Backend takes 15s to respond, NGINX timeout is 10s" All components are connected with arrows showing the flow of traffic and data through the system. The infrastructure follows AWS best practices with public and private subnet separation for security.

Figure 2: AWS Architecture diagram for the app used in this blog post

Traditional Troubleshooting Steps

For the architecture in figure 2, when 502 Gateway Timeout errors occur, traditional troubleshooting requires:

  1. Checking ALB target group health
  2. Examining ECS service status across multiple consoles
  3. Analyzing CloudWatch logs from different log groups
  4. Correlating error patterns between services
  5. Reviewing infrastructure code for configuration issues
  6. Implementing and deploying fixes

Amazon Q Developer CLI Approach

Instead, let’s see how Amazon Q Developer CLI handles this systematically, step by step:

Step1: Initial Problem Report

Amazon Q Developer CLI is provided with the initial prompt as a problem statement within the application project directory as shown in the following screenshot in figure 3. Amazon Q Developer responds back and says it is going investigate the 502 Gateway Timeout errors in the NGINX application.

Prompt:

Our production NGINX application is experiencing 502 Gateway Timeout errors. 
I have checked out the application and infrastructure code locally and the AWS CLI 
profile 'demo-profile' is configured with access to the AWS account where the 
infrastructure and application is deployed to. Can you help investigate and diagnose the issue?

A Visual Studio Code window showing a debugging session for an NGINX application. The interface has three main sections: a file explorer on the left showing project files including 'app.ts' and 'nginx-config-task.json', a terminal tab in the center displaying an "Amazon Q" ASCII art logo, and a conversation where a user is reporting 502 Gateway Timeout errors. The terminal shows AWS CLI command execution using a tool called "use_aws" with parameters including the service name "ecs" and region "us-west-2". The interface has red annotations highlighting key areas like "project files", "User provided initial prompt", and "Q CLI executing AWS CLI calls.

Figure 3: Amazon Q Developer CLI with initial prompt and problem statement

Step2: Systematic Infrastructure Discovery

Amazon Q Developer CLI start to systematically discovering the infrastructure as shown in the following screenshot in figure 4. If you see the initial prompt did not include that the app is hosted on ECS, but Amazon Q Developer CLI understood the context and executes the AWS CLI calls to describe the Cluster and the services within it. It made sure that the ECS tasks are running for both the services within the Cluster. It is a key discovery that both services show healthy status (1/1 desired count), indicating the issue isn’t service availability.

A terminal window showing three sequential AWS CLI commands being executed through a "use_aws" tool: First command: "list-clusters" operation for ECS service in us-west-2 region using demo-profile, completing in 1.244 seconds Second command: "list-services" operation targeting the NginxSimulationCluster, completing in 0.877 seconds with confirmation of finding both nginx-service and backend-service Third command: "describe-services" operation examining both services in detail, completing in 0.968 seconds with confirmation that both services are running as expected (1/1 desired count) Each command includes execution details, parameters, and completion status, with the system preparing to check CloudWatch logs next.

Figure 4: AWS Infrastructure discovery by Amazon Q Developer CLI

Step 3: Intelligent Log Analysis

Amazon Q Developer CLI retrieves and analyzes recent CloudWatch logs from the NGINX container, immediately identifying the critical error pattern as shown in the following screenshot in figure 5, where Amazon Q Developer responds: “Perfect! I found the issue. The NGINX logs show clear 504 gateway timeout with upstream timeout messages.”

A terminal window showing two AWS CloudWatch Logs commands being executed: First command: "describe-log-streams" operation for the "/ecs/nginx-service" log group, limiting to 5 most recent entries, ordered by LastEventTime in descending order Second command: "get-log-events" operation retrieving 50 log entries from a specific NGINX container log stream The output reveals a critical error message highlighted at the bottom showing an upstream timeout (error 110) occurring while reading response headers. The error details include client IP 10.0.0.247, upstream server at http://10.0.3.18:3000/, and host 52.35.62.210.

Figure 5: CloudWatch Log analysis by Amazon Q Developer CLI

Step 4: Amazon Q Developer CLI Analysis and Root Cause Identification

Amazon Q Developer examines backend service logs and discovers a mismatch between the backend service response time and NGINX timeout settings, as seen in the following screenshot in figure 6.

A terminal window showing AWS CloudWatch Logs commands and their output. The first command describes log streams for a backend container, and the second retrieves log events. The output reveals a debugging analysis showing that while health checks work fine, regular requests are being delayed by about 15 seconds, causing NGINX timeout issues. The log group is "/ecs/backend-service" in the us-west-2 region using a demo-profile.

Figure 6: Root cause identification by Amazon Q Developer CLI

Step 5: Amazon Q Developer CLI Root Cause Analysis

Amazon Q Developer CLI examines the ECS task definitions to identify the exact configuration mismatch, as shown in the following screenshot in figure 7. Amazon Q Developer finds that:

  • Backend service is configured with response_delay=15000 (15 secs)
  • NGINX proxy is configured with proxy_read_timeout 10s

This mismatch causes 504 gateway timeout errors when the backend response exceeds NGINX’s timeout threshold.

A terminal window showing two AWS CLI commands to describe ECS task definitions in the us-west-2 region. Below the commands is a highlighted "Root Cause Analysis" section that explains a timeout mismatch: the backend service is configured with a 15-second response delay while NGINX has a 10-second proxy timeout, resulting in 502 Gateway Timeout errors. Both commands use a demo-profile and are labeled as checking timeout and response delay configurations.

Figure 7: Root cause analysis and issue detection by Amazon Q Developer CLI

Step 6: Automated Code Fix

Here’s where Amazon Q Developer CLI truly excels—it doesn’t just diagnose; it implements the fix. Since Amazon Q Developer CLI is started within the project where the CDK code for ECS task definition is defined, it identified the code configuration and also modified it, as shown in the following screenshot in figure 8.

A terminal window showing file operations using fs_read and fs_write tools. The code changes show an NGINX configuration update in ecs-nginx-cdk.ts, where the proxy_read_timeout is being modified from '10s' to '20s'. The file also shows additional timeout configurations being added, including proxy_connect_timeout and proxy_send_timeout. The update is confirmed with a user prompt and completed in 0.2 seconds.

Figure 8: CDK code fix by Amazon Q Developer CLI

Step 7: Deployment

Amazon Q Developer CLI builds and deploys the fix by executing cdk synth and cdk deploy using the ‘demo-profile‘ AWS CLI profile that was initially provided in the prompt, as shown in the following screenshot in figure 9.

A terminal window showing two execute_bash commands running in sequence. The first command builds a CDK project using 'npm run build' in the nginx-app directory, completing in 4.102s. The second command deploys the updated CDK stack using 'cdk deploy' with the demo-profile, showing deployment progress including some warnings about minHealthyPercent configurations and CloudFormation stack updates in us-west-2 region.

Figure 9: CDK code build and deployment by Amazon Q Developer CLI

Step 8: Validation

Amazon Q Developer CLI validates the solution by sending a curl request to the ALB endpoint after the successful deployment, as shown in the following screenshot in figure 10.

A terminal window showing the execution of a curl command to test an NGINX application on AWS. The command targets an Elastic Load Balancer in the us-west-2 region. The response shows a successful HTTP 200 OK status after 14 seconds, with a JSON response containing the message "Hello from backend". The test completes in 15.100 seconds, indicating the fix for previous 502 errors was successful.

Figure 10: Fix validation by Amazon Q Developer CLI

In addition to that, Amazon Q Developer also sends a request to the health check endpoint and validates everything is working after the fix was deployed, as shown in the following screenshot in figure 11.

A terminal screenshot showing the results of a health check on an Nginx server using curl. The command executed shows a successful response with "healthy" status, completing in 0.65 seconds. The output displays various metrics including download speed (386 B/s), 100% completion rate, and timing statistics for real, user, and system processes.

Figure 11: Health endpoint validation by Amazon Q Developer CLI

What Amazon Q Developer CLI Accomplished

Using just conversational commands, Amazon Q Developer CLI performed a complete troubleshooting cycle:

  • Infrastructure Discovery: Automatically mapped ECS clusters, services, and dependencies
  • Log Correlation: Analyzed thousands of log entries across multiple services
  • Root Cause Analysis: Identified exact configuration mismatch between NGINX’s timeout (10s) and the backend’s response delay (15s)
  • Code-Level Diagnosis: Located problematic timeout setting in CDK infrastructure code
  • Automated Implementation: Modified infrastructure code to increase the NGINX timeout
  • End-to-End Deployment: Built, deployed, and validated the complete solution
  • Comprehensive Testing: Verified both fix effectiveness and overall system health

Amazon Q Developer CLI handles troubleshooting tasks through a single, conversational interface, eliminating the need for multiple tools or AWS CLI commands.

Conclusion

Amazon Q Developer CLI represents a significant evolution in how we troubleshoot cloud infrastructure issues. By combining natural language understanding with powerful command execution capabilities, it transforms complex troubleshooting workflows into efficient, action-oriented dialogues. Whether you’re dealing with NGINX 5XX errors or similar issues across other AWS services, Amazon Q Developer CLI can help you diagnose issues, implement fixes, and validate solutions—all through a conversational interface that feels natural and intuitive.

Give Amazon Q Developer CLI a try the next time you encounter a troubleshooting challenge, and experience the difference it can make in your operational workflow.

To learn more about Amazon Q Developer’s features and pricing details, visit the Amazon Q Developer product page.

About the Author

kirankumar.jpeg

Kirankumar Chandrashekar is a Generative AI Specialist Solutions Architect at AWS, focusing on Amazon Q Developer. Bringing deep expertise in AWS cloud services, DevOps, modernization, and infrastructure as code, he helps customers accelerate their development cycles and elevate developer productivity through innovative AI-powered solutions. By leveraging Amazon Q Developer, he enables teams to build applications faster, automate routine tasks, and streamline development workflows. Kirankumar is dedicated to enhancing developer efficiency while solving complex customer challenges, and enjoys music, cooking, and traveling.

Announcing the new AWS CDK EKS v2 L2 Constructs

Post Syndicated from Matteo Luigi Restelli original https://aws.amazon.com/blogs/devops/announcing-the-new-aws-cdk-eks-v2-l2-constructs/

Introduction

Today, we’re announcing the release of aws-eks-v2 construct, a new alpha version of AWS Cloud Development Kit (CDK) L2 construct for Amazon Elastic Kubernetes Service (EKS). This construct represents a significant change in how developers can define and manage their EKS environments using infrastructure as code. While maintaining the powerful capabilities of its predecessor library for creating and managing EKS clusters, this alpha release introduces key architectural improvements that enhance both flexibility and maintainability.

The AWS Cloud Development Kit (AWS CDK) is an open-source software development framework that enables you to define your cloud infrastructure using familiar programming languages and deploy it through AWS CloudFormation.
The CDK uses constructs – a layered abstraction concept where Layer 1 (L1) constructs map directly to CloudFormation resources, while Layer 2 (L2) constructs provide intuitive APIs, helper functions, best-practice defaults, and generate a lot of the boilerplate code and glue logic for you. This layered approach means you can seamlessly move between high-level abstractions for common use cases and low-level resource definitions when you need fine-grained control. The result is an Infrastructure as Code (IaC) experience that helps you maintain productivity while ensuring you have access to the full power of AWS services when you need it.
You can read more about constructs and their benefits in the CDK user guide.

In this post we’ll explore:

  • The reasoning behind the creation of a new L2 construct for EKS and the improvements introduced by this new library
  • How to use the new EKS v2 construct

Background

Amazon EKS is a managed Kubernetes service that makes it easy to run Kubernetes on AWS without needing to manage the control plane or nodes. EKS automatically handles critical tasks like patching, node provisioning, and upgrades. You can run EKS using EC2 instances for worker nodes, AWS Fargate for serverless containers, or a combination of both, providing the flexibility to choose the right compute option for your workloads.

While the existing EKS L2 construct has served customers well, we identified opportunities to further enhance the developer experience and operational efficiency based on their feedback. The new aws-eks-v2 construct delivers significant improvements through native AWS CloudFormation resources, modern Access Entry-based authentication, and enhanced architectural flexibility. Key benefits include reduced deployment overhead, simplified cluster access management, support for multiple EKS clusters within a single stack, and granular control over resource creation with features like the optional kubectl Lambda handler.
These improvements help customers build and manage their EKS infrastructure more efficiently while maintaining the robust functionality they expect from AWS CDK constructs.

Using the L2

Given that this construct is in the alpha stage, you’ll need to install and import the construct using the experimental construct libraries process. During the alpha stage, the CDK team is actively gathering customer feedback and iterating on the implementation. Once the construct meets our bar for general availability, we’ll integrate it directly into the AWS CDK core library, making it as easily accessible as our other L1 and L2 constructs. This approach allows us to rapidly deliver new capabilities while ensuring they meet the high standards our customers expect.

Deploying EKS Cluster with Default Configuration

Let’s explore how to create an Amazon EKS cluster using AWS CDK aws-eks-v2 construct with minimal configuration requirements. The following example demonstrates the most straightforward way to define an EKS cluster, leveraging the power of CDK’s opinionated defaults.
Creating a new cluster is done using the Cluster construct. The only required property is the Kubernetes version.

import * as eksv2 from '@aws-cdk/aws-eks-v2-alpha';

// Creating an EKS Cluster with default properties
const eksCluster = new eksv2.Cluster(this, 'EksCluster', {
    version: eksv2.KubernetesVersion.V1_32
});

This translates in the following Architecture as shown in figure 1:

L2 CDK Construct v2 for EKS - Default Architecture

Figure 1 – L2 CDK construct v2 for EKS, Default Architecture

  • Amazon Virtual Private Cloud (VPC) – A logically isolated section of the AWS Cloud that spans across two Availability Zones, equipped with an Internet Gateway to enable secure communication with the internet. This multi-AZ design helps ensure your applications remain available even if an Availability Zone experiences issues.
  • Amazon EKS Control Plane – A fully managed Kubernetes control plane deployed in an AWS-managed VPC , providing high availability and automatic version management for the Kubernetes control plane components.
  • Public Subnet Infrastructure – Two public subnets, each with its own NAT Gateway Instance, enabling your cluster components to securely access the internet for essential operations like pulling container images and downloading updates. These NAT Gateways provide a secure outbound path while protecting your workloads from direct internet exposure.
  • Private Subnet Configuration – Two private subnets optimized for running your EKS worker nodes, offering enhanced security by isolating your workloads from direct internet access while maintaining the ability to communicate with AWS services and the internet through the NAT Gateways.
  • IAM Security Foundation – A comprehensive set of IAM roles and policies that implement the principle of least privilege:
    • Control plane service role that enables EKS to manage AWS resources on your behalf
    • Node IAM role that allows worker nodes to interact with other AWS services and join the EKS cluster

You can also use FargateCluster to provision a cluster that uses only Fargate workers.

import * as eksv2 from '@aws-cdk/aws-eks-v2-alpha';

// Creating an EKS Cluster with default properties and Fargate workers
const eksFargateCluster = new eksv2.FargateCluster(this, 'EksFargateCluster', {
   version: eksv2.KubernetesVersion.V1_32,
});

To help our customers maintain better control over their cluster access patterns, the Kubectl Handler is not automatically deployed with the default configuration. You can easily enable this functionality by configuring the kubectlProviderOptions property when you need kubectl access management as shown below.

import * as eksv2 from '@aws-cdk/aws-eks-v2-alpha';
import { KubectlV32Layer } from '@aws-cdk/lambda-layer-kubectl-v32'

// Creating an EKS Cluster with default properties and kubectl handler
const eksCluster = new eksv2.Cluster(this, 'EksCluster', {
   version: eksv2.KubernetesVersion.V1_32,
   kubectlProviderOptions: {
      kubectlLayer: new KubectlV32Layer(this, 'KubectlLayer')
   },
});

Deploying EKS Cluster with AutoMode

EKS Auto Mode represents a significant advancement in how Amazon EKS manages compute capacity for Kubernetes clusters. This intelligent capacity management system automatically provisions and scales node groups based on workload demands, removing the need for manual capacity planning.

When you create a new cluster with the aws-eks-v2 construct, EKS Automode is activated by default, by means that DefaultCapacityType.AUTOMODE is automatically set as the default capacity type for the EKS Cluster. If you prefer, you can specify the defaultCapacityType to AutoMode:

import * as eksv2 from '@aws-cdk/aws-eks-v2-alpha';

// Creating an EKS Cluster with AutoMode
const eksCluster = new eksv2.Cluster(this, 'EksCluster', {
   version: eksv2.KubernetesVersion.V1_32,
   defaultCapacityType: eksv2.DefaultCapacityType.AUTOMODE, // default value
});

After deploying the Stack containing the construct instance, in the EKS Console you’ll be able to see that an EKS Cluster has been created with AutoMode enabled:

EKS Cluster Deployed with Automode

Figure 2 – EKS Cluster Deployed with Automode

Auto Mode enhances your Amazon EKS experience by automatically configuring two strategically designed node pools out of the box:

  • A system node pool optimized for running critical cluster system components and add-ons, ensuring reliable cluster operations.
  • A general node pool specifically tuned for your application workloads, providing the flexibility needed for diverse containerized applications.

You can configure which node pools to enable through the compute property:

import * as eksv2 from '@aws-cdk/aws-eks-v2-alpha';

// Creating an EKS Cluster with Automode and selecting nodePools
const eksCluster = new eksv2.Cluster(this, 'EksCluster', {
   version: eksv2.KubernetesVersion.V1_32,
   defaultCapacityType: eksv2.DefaultCapacityType.AUTOMODE,
   compute: {
      nodePools: ['system', 'general-purpose'],
   },
});

Deploying EKS Cluster with Managed Node Groups

Amazon EKS Managed Node Groups deliver a seamless compute management experience for your Kubernetes clusters. This powerful capability eliminates operational complexity by automating the end-to-end lifecycle of Amazon EC2 instances that power your containerized applications.
Behind the scenes, Amazon EKS managed node groups intelligently orchestrate these changes, ensuring zero-disruption to your applications through graceful node draining. The service automatically leverages the latest Amazon EKS-optimized AMIs, providing a secure and optimized foundation for your workloads.

By setting defaultCapacityType to NODEGROUP, customers can leverage the traditional managed node group management approach:

import * as eksv2 from '@aws-cdk/aws-eks-v2-alpha';

// Creating an EKS Cluster with Managed Node Groups and default instance types
const eksCluster = new eksv2.Cluster(this, 'EksCluster', {
   version: eksv2.KubernetesVersion.V1_32,
   defaultCapacityType: eksv2.DefaultCapacityType.NODEGROUP,
});

By default, when using DefaultCapacityType.NODEGROUP, this library will allocate a managed node group with two m5.large instances.
After deploying the above code, you can check the EKS Console to see that an EKS Cluster has been deployed as shown in figure 3:

EKS Cluster Deployed with Managed Node Groups

Figure 3 – EKS Cluster Deployed with Managed Node Groups

You can also check the Compute tab and see the Managed Node Group Configuration as shown in figure 4:

EKS Cluster Managed Node Group Default Configuration

Figure 4 – EKS Cluster Managed Node Group Default Configuration

If you want to have control over instance types of a Managed Node Group, you can specify the default EC2 type as property of the construct:

import * as eksv2 from '@aws-cdk/aws-eks-v2-alpha';
import * as ec2 from 'aws-cdk-lib/aws-ec2'

// Creating an EKS Cluster with Managed Node Groups and specific instance types
const eksCluster = new eksv2.Cluster(this, 'EksCluster', {
   version: eksv2.KubernetesVersion.V1_32,
   defaultCapacityType: eksv2.DefaultCapacityType.NODEGROUP,
   defaultCapacity: 5,
   defaultCapacityInstance: ec2.InstanceType.of(ec2.InstanceClass.M5, ec2.InstanceSize.SMALL),
});

You can also specify additional customizations after the EKS cluster declaration, via the addNodegroupCapacity method:

import * as eksv2 from '@aws-cdk/aws-eks-v2-alpha';
import * as ec2 from 'aws-cdk-lib/aws-ec2'

// Creating an EKS Cluster with Managed Node Groups and specific instance types
const eksCluster = new eksv2.Cluster(this, 'EksCluster', {
   version: eksv2.KubernetesVersion.V1_32,
   defaultCapacityType: eksv2.DefaultCapacityType.NODEGROUP,
   defaultCapacity: 0,
});

eksCluster.addNodegroupCapacity('custom-node-group', {
  instanceTypes: [new ec2.InstanceType('m5.large')],
  minSize: 4,
  diskSize: 100,
});

Managing Permissions through Access Entries

The new aws-eks-v2 construct transitions away from the previous ConfigMap-based authentication (which is deprecated in EKS) in favor of the Access Entries Authentication mode. This change introduces Access Entry as the standardized method for managing cluster permissions, offering a more streamlined and secure approach to granting cluster access to IAM users and roles.

You can define Access Policies through the AccessPolicy construct and you can adjust the scope of the Access Policy to the entire EKS cluster or to specific EKS Namespaces:

import * as eksv2 from '@aws-cdk/aws-eks-v2-alpha';

// AmazonEKSClusterAdminPolicy with `cluster` scope
eks.AccessPolicy.fromAccessPolicyName('AmazonEKSClusterAdminPolicy', {
   accessScopeType: eks.AccessScopeType.CLUSTER,
});

// AmazonEKSAdminPolicy with `namespace` scope
eks.AccessPolicy.fromAccessPolicyName('AmazonEKSAdminPolicy', {
   accessScopeType: eks.AccessScopeType.NAMESPACE,
   namespaces: ['foo', 'bar'] 
});

You can then grant access to specific IAM Roles using the grantAccess method:

import * as iam from 'aws-cdk-lib/aws-iam'

// Defining a IAM Role
const clusterAdminRole = new iam.Role(this, 'ClusterAdminRole', {
   assumedBy: new iam.ArnPrincipal('arn_for_trusted_principal'),
});

// Creating an EKS Cluster with AutoMode
const eksCluster = new eksv2.Cluster(this, 'EksCluster', {
   version: eksv2.KubernetesVersion.V1_32,
   defaultCapacityType: eksv2.DefaultCapacityType.AUTOMODE,
});

// Cluster Admin role for this cluster
eksCluster.grantAccess('clusterAdminAccess', clusterAdminRole.roleArn, [
	eks.AccessPolicy.fromAccessPolicyName('AmazonEKSClusterAdminPolicy', {
   	    accessScopeType: eks.AccessScopeType.CLUSTER,
    }),
]);

When the Principal assumes the ClusterAdminRole, it receives seamless access to the EKS cluster through a carefully orchestrated permission chain. This access is governed by the AmazonEKSClusterAdminPolicy, which is automatically attached to the Access Policy linked to the IAM Role.

Conclusion

In this post, we introduced the new AWS CDK L2 construct (aws-eks-v2) for Amazon EKS, demonstrating how it simplifies cluster deployment while offering enhanced flexibility and operational efficiency. Through practical examples, we showcased how customers can leverage the construct’s intelligent defaults and customization options to build production-ready Kubernetes environments on AWS

The new L2 construct for Amazon EKS delivers significant improvements that help customers accelerate their container adoption journey:

  • Enhanced Performance: Eliminates dependency on Custom Resources and AWS Lambda functions by utilizing native AWS CloudFormation resources, resulting in faster and more reliable deployments.
  • Modern Authentication: Implements Access Entry-based authentication, replacing the deprecated ConfigMap approach with a more secure and programmable solution.
  • Improved Scalability: Removes the single-cluster-per-stack limitation and eliminates nested stacks, enabling more flexible architectural patterns.
  • Optimized Resource Creation: Makes the kubectl Lambda handler optional, giving customers fine-grained control over their infrastructure components.
  • Streamlined Operations: Provides automated node group management with intelligent defaults while maintaining full customer control when needed.

To get started with the new EKS L2 construct, visit the AWS CDK documentation. If you have specific features you’d like to see added, we encourage you to submit a feature request in the aws-cdk GitHub repository. Your feedback helps us continue innovating on your behalf.

About the author

Matteo Luigi Restelli

Matteo Luigi Restelli is an AWS Sr. Partner Solutions Architect. He mainly focuses on Italian Consulting AWS Partners and is also specialized in Infrastructure as Code, Cloud Native App Development and DevOps. Outside of work, he enjoys swimming, rock & roll music and learning something new everyday especially in Computer Science space.