All posts by Hitesh Dodiya

Query Amazon Redshift using natural language with Kiro

Post Syndicated from Hitesh Dodiya original https://aws.amazon.com/blogs/big-data/query-amazon-redshift-using-natural-language-with-kiro/

It’s Monday morning and your VP pings you: “Revenue dropped 15 percent over the weekend. What happened?” The clock starts. You open the AWS Management Console, find the right Amazon Redshift cluster, open the query editor, and start hunting. Which database has the revenue data, analytics_db or reporting_db? Is the table called orders, transactions, or sales_events? You find it, but now you need the schema. Is the amount column total_amount, revenue, or order_value? 20 minutes in, you haven’t answered the question. You’ve been navigating infrastructure.

This scenario plays out daily across data teams, and it’s why the landscape is shifting. With AI agents entering the analytics workflow, a growing number of business users can now perform complex data analysis. They no longer need to file a ticket with the data engineering team and wait days for a response. The bottleneck is no longer SQL expertise. It’s the friction between having a question and getting an answer.

The Amazon Redshift MCP server paired with Kiro removes that friction. Instead of memorizing cluster endpoints, reverse-engineering schemas, and hand-writing SQL, you describe what you need in plain text and get results. That Monday morning question becomes a single sentence: “Show me daily revenue for the past two weeks, broken down by region.” Kiro finds the cluster, discovers the schema, writes the query, and returns the answer in seconds, not minutes.

In this post, you learn how to:

  1. Install and configure Kiro with the Amazon Redshift MCP server.
  2. Discover clusters, databases, and schemas using natural language.
  3. Run analytical queries and cross-cluster comparisons conversationally.
  4. Implement security best practices for production Amazon Redshift environments.

You can use Kiro in two forms: Kiro integrated development environment (IDE), a full desktop development environment, and Kiro command line interface (CLI), which brings the same AI capabilities directly to your terminal. The Redshift MCP server works with both. The CLI experience is particularly well suited for the conversational analytics workflow this post describes, because you can start querying your data warehouse from a terminal session without opening an IDE.

Important: Before using this integration with production Amazon Redshift environments, read the Security tips section. This section covers critical considerations around AWS Identity and Access Management (IAM) permissions and Kiro autonomy modes.

What is the Amazon Redshift MCP server?

The Model Context Protocol (MCP) is an open standard that provides AI agents with secure connections to external data sources and tools. The Amazon Redshift MCP server is an open source implementation that bridges the Kiro AI agent with your Amazon Redshift infrastructure.

With the Redshift MCP server, you can:

  • Automatically find both provisioned clusters and serverless workgroups with cluster discovery.
  • Browse databases, schemas, tables, and columns with metadata exploration.
  • Run SQL in READ ONLY mode with built-in safety protections with safe query execution.
  • Work with multiple clusters and workgroups simultaneously with multi-cluster support.

The server translates your natural language requests into the appropriate Amazon Redshift Data API calls and SQL queries. No manual endpoint configuration or SQL writing is required.

How the Redshift MCP server relates to the AWS MCP server

You might have noticed that AWS also offers the AWS MCP server (part of the Agent Toolkit for AWS), which provides broad access to AWS services, including the Redshift Data API. A common question is: if the AWS MCP server can already reach Redshift, why add a dedicated Redshift MCP server?

The two are complementary, not competing. The AWS MCP server gives Kiro general AWS capabilities (service decision guides, SDK usage guidance, troubleshooting skills, and access to AWS APIs). The Redshift MCP server adds a purpose-built analytics layer on top. It provides single-call query execution (compared to a minimum of three API calls for submit, poll, and fetch), read-only safety by default, transparent provisioned and serverless cluster handling, and dedicated metadata navigation tools. Upcoming features like query plan explanation, native identity propagation, cluster analysis, and UDF discovery will further extend this specialized layer.

You can use both together, or use the Amazon Redshift MCP server on its own. There’s no either-or requirement.

Setting it up

The following sections walk you through the installation and configuration process.

Prerequisites

Before you begin, make sure that you have:

On your machine:

  • Kiro IDE or Kiro CLI installed.
  • Python 3.10 or newer.
  • The uv package manager from Astral.

On AWS:

  • AWS credentials configured through the AWS Command Line Interface (AWS CLI), environment variables, or IAM roles.
  • At least one Amazon Redshift provisioned cluster or serverless workgroup.
  • IAM permissions for Amazon Redshift access (see the following section).

Step 1: Install the uv package manager

If you don’t have uv installed, run one of the following commands.

For macOS or Linux:

curl -LsSf https://astral.sh/uv/install.sh | sh

For Windows PowerShell:

powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

Then install Python 3.10 or newer if needed:

uv python install 3.10

Step 2: Configure IAM permissions

Your AWS identity needs the following permissions. Attach this policy to your IAM user or role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "redshift:DescribeClusters",
        "redshift:GetClusterCredentialsWithIAM",
        "redshift:GetClusterCredentials",
        "redshift-serverless:ListWorkgroups",
        "redshift-serverless:GetWorkgroup",
        "redshift-serverless:GetCredentials",
        "redshift-data:ExecuteStatement",
        "redshift-data:DescribeStatement",
        "redshift-data:GetStatementResult"
      ],
      "Resource": "*"
    }
  ]
}

You also need database-level permissions: SELECT on tables you want to query, USAGE on schemas you want to explore, and connection access to the target databases.

Step 3: Configure the MCP server in Kiro

Open (or create) your Kiro MCP configuration file and add the Amazon Redshift server.

For Kiro IDE:

  1. User-level configuration (applies globally): ~/.kiro/settings/mcp.json.
  2. Workspace-level configuration (applies to a specific project): .kiro/settings/mcp.json.

For Kiro CLI:

  1. User-level configuration: ~/.kiro/settings/mcp.json.
  2. Workspace-level configuration: .kiro/settings/mcp.json in your project directory.

The configuration format is the same for both. Add the following:

{
  "mcpServers": {
    "awslabs.redshift-mcp-server": {
      "command": "uvx",
      "args": ["awslabs.redshift-mcp-server@latest"],
      "env": {
        "AWS_PROFILE": "default",
        "AWS_DEFAULT_REGION": "us-east-1",
        "FASTMCP_LOG_LEVEL": "ERROR"
      },
      "disabled": false,
      "autoApprove": []
    }
  }
}

Tip: Replace AWS_PROFILE and AWS_DEFAULT_REGION with values matching your environment. If you use AWS_REGION instead of AWS_DEFAULT_REGION, AWS_REGION takes higher precedence.

Note: Windows users: use uv instead of uvx with additional args. See the Amazon Redshift MCP server documentation for the Windows-specific configuration.

Step 4: Verify the connection

In Kiro IDE, restart Kiro (or reconnect MCP servers from the MCP Server view in the Kiro feature panel). Then open the Kiro chat and enter the following prompt:

List my available Redshift clusters

If everything is configured correctly, Kiro discovers your clusters and lists them with details like cluster identifiers, status, node types, and endpoints.

Practical examples

The following scenarios show how Kiro and the Amazon Redshift MCP server work together in real-world workflows.

Tip: If you want Kiro to show the SQL that it executes alongside every result, tell it at the beginning of your session:

For every query you run against Amazon Redshift, show me the exact SQL you executed along with the results.

Kiro follows this instruction for the rest of the conversation. For a permanent setup, add this rule to a steering file (see the Tips section that follows) so you don’t need to repeat it. The following examples reflect this behavior.

Example 1: Discover and explore your infrastructure

Without Kiro: Log in to the AWS Management Console, navigate to Amazon Redshift, choose clusters, open the query editor, sign in to each database, and manually browse schemas.

With Kiro:

What Amazon Redshift clusters and serverless workgroups do I have?

Kiro responds with a structured overview:

Found two provisioned clusters and one serverless workgroup:

  • analytics-cluster (provisioned, 2x ra3.xlplus nodes, available)
  • reporting-cluster (provisioned, 4x ra3.4xlarge nodes, available)
  • etl-workgroup (serverless, available)

Then drill deeper:

What databases and schemas exist in analytics-cluster?

Kiro uses the MCP metadata tools to explore the cluster. No SQL is needed for discovery:

Database: analytics_db (local)

  • information_schema — system schema
  • pg_catalog — system schema
  • public — default schema
  • sales — user-created schema

Database: dev (local)

  • information_schema — system schema
  • pg_catalog — system schema
  • public — default schema

The only user-created schema is sales in the analytics_db database.

Show me the tables in the sales schema of the analytics database

There are five base tables in analytics_db.sales:

  1. customers
  2. order_items
  3. orders
  4. products
  5. returns
Describe the columns in the orders table
Column Type Nullable
order_id INTEGER NO
customer_id INTEGER NO
order_date TIMESTAMP NO
total_amount numeric(12,2) NO
status VARCHAR(20) YES

Each request builds on the previous context. Kiro remembers which cluster you’re exploring and navigates the hierarchy for you.

Example 2: On-demand data analysis in plain English

Without Kiro: Write SQL from scratch, guess at column names, run the query, fix syntax errors, iterate.

With Kiro:

Show me the top 10 customers by total purchase amount from the analytics-cluster,
including how many orders each customer placed

Kiro does the following:

  1. Identifies the relevant cluster and database.
  2. Explores the schema to find customer and order tables.
  3. Understands the table relationships.
  4. Generates and runs the appropriate SQL, showing you the query first.
  5. Presents the results in a readable format.

The response includes the query and results:

Query executed:

SELECT
    c.customer_id,
    c.customer_name,
    COUNT(o.order_id) AS total_orders,
    SUM(o.total_amount) AS total_spent
FROM sales.customers c
JOIN sales.orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_id, c.customer_name
ORDER BY total_spent DESC
LIMIT 10;
customer_id customer_name total_orders total_spent
4821 Acme Corp 347 $1,284,500.00
1190 Globex Inc 289 $1,102,340.00
7734 Initech LLC 256 $987,210.00

You didn’t need to remember table names, column names, or join conditions. Kiro figured it out by exploring the metadata first. You can see exactly what it ran, copy the SQL, and reuse it.

Example 3: Cross-cluster data comparison

Without Kiro: Open two SQL clients, sign in to different clusters, run similar queries, manually compare results in a spreadsheet.

With Kiro:

Compare the total sales revenue and transaction counts between
analytics-cluster and etl-workgroup for the last quarter.
Are there any discrepancies?

Kiro queries both environments, aligns the results, and provides a comparison, showing every query it runs along the way.

Query executed on analytics-cluster:

SELECT
    COUNT(*) AS total_records,
    SUM(total_amount) AS total_revenue,
    COUNT(CASE WHEN order_date >= DATE_TRUNC('quarter', CURRENT_DATE - INTERVAL '3 months')
        THEN 1 END) AS q4_transactions
FROM sales.orders;

Query executed on etl-workgroup:

SELECT
    COUNT(*) AS total_records,
    SUM(total_amount) AS total_revenue,
    COUNT(CASE WHEN order_date >= DATE_TRUNC('quarter', CURRENT_DATE - INTERVAL '3 months')
        THEN 1 END) AS q4_transactions
FROM sales.orders;

Sales data comparison: Analytics vs ETL.

Metric Analytics Cluster ETL Workgroup Status
Total Records 172,456 172,456 Match
Total Revenue $110,765,431 $110,765,431 Match
Q4 Transactions 38,061 38,061 Match

Because both queries are visible, you can verify that Kiro compared the same metrics on both sides, with no hidden assumptions.

Example 4: Schema documentation on demand

Without Kiro: Manually document table structures in a wiki or spreadsheet. Documentation frequently falls out of date.

With Kiro:

Generate documentation for the tables in the sales schema of analytics-cluster.
Include column names, data types, and notable relationships.

Kiro explores every table using the discovery tools and produces structured documentation.

Table: sales.customers

Column Type Nullable Description
customer_id INTEGER NO Primary key
customer_name VARCHAR(255) NO
email VARCHAR(255) YES
created_at TIMESTAMP NO

Table: sales.orders

Column Type Nullable Description
order_id INTEGER NO Primary key
customer_id INTEGER NO FK to customers
order_date TIMESTAMP NO
total_amount numeric(12,2) NO
status VARCHAR(20) YES

(continues for all tables in the schema…)

You can paste this directly into your team wiki or README, and anyone can rerun the same discovery flow to refresh the docs later.

Example 5: Troubleshooting data quality

Check the orders table in analytics-cluster for any NULL values
in the customer_id or total_amount columns.
How many records are affected?

Query executed:

SELECT
    COUNT(*) FILTER (WHERE customer_id IS NULL) AS null_customer_ids,
    COUNT(*) FILTER (WHERE total_amount IS NULL) AS null_total_amounts,
    COUNT(*) AS total_rows
FROM sales.orders;

Out of 172,456 rows: 0 NULL customer_ids, 3 NULL total_amounts.

Find duplicate order_ids in the sales.orders table and show me
a few examples with their timestamps

Query executed:

SELECT order_id, COUNT(*) AS occurrences, MIN(order_date) AS first_seen, MAX(order_date) AS last_seen
FROM sales.orders
GROUP BY order_id
HAVING COUNT(*) > 1
ORDER BY occurrences DESC
LIMIT 5;

Found two duplicate order_ids: order 88412 (two occurrences), order 91003 (two occurrences).

What's the date range of data in the events table?
Are there any gaps in the daily record counts?

Query executed:

WITH daily_counts AS (
    SELECT DATE(event_date) AS day, COUNT(*) AS record_count
    FROM sales.events
    GROUP BY DATE(event_date)
),
date_range AS (
    SELECT MIN(day) AS start_date, MAX(day) AS end_date FROM daily_counts
),
all_dates AS (
    SELECT (start_date + (n || ' days')::INTERVAL)::DATE AS day
    FROM date_range, generate_series(0, (end_date - start_date)) AS n
)
SELECT a.day AS missing_date
FROM all_dates a
LEFT JOIN daily_counts d ON a.day = d.day
WHERE d.day IS NULL
ORDER BY a.day;

Date range: 2024-01-01 to 2025-04-20. Found three missing dates: 2024-03-15, 2024-07-04, 2024-12-25 (likely holidays).

Every query is right there in the response. You can copy them into your own SQL client, modify them, or save them as reusable scripts.

Tips for getting the most out of Kiro and Redshift

  1. Start with discovery. Begin each session by asking Kiro to list your clusters and explore the database structure. This gives the agent context for subsequent queries.
  2. Be specific about which cluster. If you have multiple clusters, mention the cluster name in your request to avoid ambiguity.
  3. Iterate gradually. Start with simple questions and build complexity. Ask for a count before asking for a full breakdown.
  4. Use steering files for team conventions. Create a .kiro/steering/redshift.md file in your project with details about your cluster naming conventions, important schemas, and common query patterns. This gives Kiro persistent context about your environment.

Example steering file:

---
inclusion: auto
---

# Redshift Environment Context

## Clusters

- **analytics-cluster**: Primary analytics warehouse. Use database `analytics_db`.
- **etl-workgroup**: Serverless workgroup for ETL pipelines. Use database `staging_db`.

## Key Schemas

- `sales`: Customer transactions, orders, and revenue data
- `marketing`: Campaign performance and attribution data

## Conventions

- Always filter by `is_deleted = false` on soft-delete tables
- Date columns use `TIMESTAMP WITHOUT TIME ZONE` in UTC

## Query Transparency

- Always show the exact SQL query text being executed before or alongside the results.
  Users should be able to see, verify, and reuse every query that Kiro runs against Redshift.

That last Query Transparency rule is a small addition with a big impact. By default, Kiro might summarize results without showing the underlying SQL. Adding this steering instruction makes every query visible, which helps maintain consistent behavior across your data team and supports auditing, learning, and trust.

  1. Use hooks for automation. Set up agent hooks to run common validation queries automatically. For example, trigger a data quality check whenever you edit a specific SQL file.
  2. Verify important results. Always cross-check critical business findings with stakeholders before acting on them.

Security tips

When configured with least privilege IAM policies and Supervised mode, the Redshift MCP server provides multiple layers of protection. Under the AWS shared responsibility model, you are responsible for configuring access controls appropriately.

Safety layers at a glance

  • Data stays in your account. The MCP server runs locally, queries run inside your Amazon Redshift cluster, and no data is sent to third-party services.
  • Content not used for training. Enterprise users are excluded from service improvement usage. Free or individual tier users can opt out in Kiro settings.
  • Read-only by default. Every query is wrapped in a read-only transaction, preventing accidental writes.
  • Standard IAM controls. Kiro only gets permissions you’ve explicitly granted.
  • Supervised mode. Review each agent action before it’s applied in production environments.

Important caveat: The read-only enforcement only applies to queries routed through the MCP server’s execute_query tool. Kiro also has shell access. If your IAM credentials have write permissions, direct CLI calls (aws redshift-data execute-statement) bypass this guardrail entirely.

Takeaway: Use least privilege IAM policies (scoped to read and describe operations only) as your primary defense. Avoid broad policies like AdministratorAccess or AmazonRedshiftFullAccess. When IAM is properly scoped, even a direct CLI call cannot perform writes.

Verify Kiro’s output

Kiro is a powerful assistant, but it’s not infallible. Like any AI tool, it can misinterpret your intent, generate incorrect SQL, or present results that look plausible but are wrong. Examples include a misplaced join, a wrong filter, or an aggregation that silently excludes rows. This is especially important when working with production data where decisions have real business impact.

Treat Kiro’s output as a strong starting point, not a final answer. Review the SQL it generates before acting on the results. Use the Query Transparency steering rule (described in the Tips section) so you can see the exact query behind every result. When findings inform business decisions, validate them independently by running the query yourself, cross-checking with a colleague, or comparing against a known baseline.

Summary

Layer What it protects against
MCP server read-only mode Accidental writes through the MCP execute_query tool
Least privilege IAM policy Write operations via any path, including direct CLI calls
Kiro Supervised mode Unreviewed autonomous actions by the agent

Defense in depth: use these layers together for production environments.

What you can achieve with Kiro and Amazon Redshift

Before Now you can
Switch between the console, SQL clients, and documentation Use one interface for discovery, querying, and analysis
Memorize cluster endpoints, database names, and schemas Ask in plain text and let Kiro discover the structure
Write SQL from scratch for every query Describe what you want and get results
Manually compare data across clusters Run single-sentence cross-cluster analysis
Schema documentation is frequently stale Generate fresh docs on demand
Onboarding new analysts takes days New team members can explore immediately

Every minute you spend hunting for a table name or debugging a SQL syntax error is a minute that you’re not spending on actual analysis. You can reduce that overhead by letting Kiro handle the mechanical parts (discovery, navigation, and query construction) so you can focus on the questions that matter to your business.

Cleaning up

If you created Amazon Redshift resources specifically for this walkthrough, or if you no longer need the MCP server integration, follow these steps. They help you avoid ongoing charges and remove the configuration.

  • Remove the MCP server configuration.
  • Detach the IAM policy.
  • Delete test Amazon Redshift resources (if applicable).
  • Uninstall uv (optional).

Conclusion

In this post, you learned how to set up Kiro with the Amazon Redshift MCP server to query your data warehouse using natural language. You explored cluster discovery, schema browsing, analytical queries, cross-cluster comparisons, and data quality checks, all without writing SQL from scratch or switching between tools.

To go further:

  1. New to Amazon Redshift? Get started with Amazon Redshift to create your first cluster or serverless workgroup.
  2. Read the MCP protocol specification to understand how AI agents work with external tools.
  3. Visit kiro.dev for Kiro’s full capabilities, including specs, hooks, and steering files.

As you get comfortable with the basics, try combining steering files with agent hooks to automate recurring workflows like daily data quality checks or weekly schema documentation refreshes.


About the author

Hitesh Dodiya

Hitesh Dodiya

Hitesh is a Database Engineer at Amazon Redshift with over a decade of experience in databases, data warehousing, and analytics. He is passionate about leveraging Gen AI to deliver innovative solutions that solve real customer problems.