All posts by Ricardo Serafim

Amazon Redshift Serverless at 4 RPUs: High-value analytics at low cost

Post Syndicated from Ricardo Serafim original https://aws.amazon.com/blogs/big-data/amazon-redshift-serverless-at-4-rpus-high-value-analytics-at-low-cost/

Organizations across industries struggle with the economics of data analytics. High entry costs, complex capacity planning, and unpredictable workload demands create barriers that prevent teams from accessing the insights they need. Small businesses abandon analytics initiatives due to prohibitive minimums, and enterprises overprovision resources for development environments, leading to inefficient spending.

Amazon Redshift Serverless now addresses these challenges with 4 RPU configurations, helping you get started with a lower base capacity that runs scalable analytics workloads beginning at $1.50 per hour. This new option transforms the economics of data analytics with the flexibility to scale up automatically based on workload demands. You only pay for the compute capacity you consume, calculated on a per-second basis.

With 64 GB of memory and support for up to 32 TB of managed storage, this lower entry point offering addresses several common customer needs, including development and test environments that maintain separate workloads at lower cost and production workloads with variable demand that need cost-effective scaling. The configuration is particularly useful for test and development environments, departmental data warehouses, periodic reporting workloads, gaming analytics, and data mesh architectures with unpredictable usage patterns. Organizations just starting with cloud analytics can use this low-cost option while getting access to enterprise features like automatic scaling, built-in security, and seamless data lake integration.

In this post, we examine how this new sizing option makes Redshift Serverless accessible to smaller organizations while providing enterprises with cost-effective environments for development, testing, and variable workloads.

New 4 RPU minimum base capacity in Redshift Serverless

Redshift Serverless measures compute capacity using Redshift Processing Units (RPUs), where each RPU provides 16 GB of memory. With this new minimum base capacity, the 4 RPU configuration delivers a total of 64 GB of memory. It supports up to 32 TB of managed storage, with a maximum of 100 columns per table. The 4 RPU configuration is cost-efficient, and it’s designed for lighter workloads. When your workload requires additional resources, Redshift Serverless automatically scales up the compute capacity. After you have scaled beyond 4 RPUs, your data warehouse will continue using the higher RPU level to maintain consistent performance. This behavior provides workload stability while preserving the benefits of automatic scaling.

For workloads requiring more resources, such as tables with a large number of columns or higher concurrency requirements, you can choose higher base capacities ranging from 8 RPUs up to 1024 RPUs. This flexibility helps you start small and adjust your resources as your analytics requirements evolve.

Benefits of Redshift Serverless with 4 RPUs

This new feature offers the following benefits:

  • Cost-effective entry point – The new 4 RPU configuration is a low-cost option for cloud data warehousing, making enterprise-grade analytics accessible to organizations of various sizes, such as startups exploring their first data warehouse or established enterprises optimizing their analytics spending. For example, in the US East (N. Virginia) Region, the compute cost is $0.375 per RPU-hour. For a 4 RPU base capacity, this translates to $1.50 per hour of active workload time. Because you’re only charged when workloads are running, small-scale users can keep costs predictable and low. This configuration helps teams begin their analytics journey with minimal upfront commitment. Development teams can maintain dedicated environments for testing and experimentation without significant cost overhead.
  • Support for smaller datasets – With support for up to 32 TB of Redshift Managed Storage, the 4 RPU configuration is well-suited for smaller data warehouses. It can handle datasets ranging from a few gigabytes to tens of terabytes, making it ideal for startups, small businesses, or departments with limited data volumes.
  • Seamless integration with the AWS ecosystem – The 4 RPU configuration integrates seamlessly with other AWS services, such as Amazon Simple Storage Service (Amazon S3) for data lakes, AWS Glue for ETL (extract, transform, and load), and Amazon QuickSight for visualization. This makes it straightforward to build end-to-end analytics pipelines, even for smaller-scale projects. Additionally, Redshift data lake queries on external Amazon S3 data are included in the RPU billing, simplifying cost management.
  • Use case flexibility – The 4 RPU configuration proves valuable across numerous analytics scenarios. Development and testing environments benefit from cost-effective isolation, and departmental data warehouses can start small and scale as needed. Organizations running periodic reporting workloads or proof-of-concept projects can optimize costs by paying only for actual usage. Even small to medium-sized production workloads can use this configuration effectively.

Regardless of the use case, you can benefit from the full feature set of Redshift Serverless, including built-in security, data lake integration, and automated maintenance.

Use cases for Redshift Serverless with 4 RPU workgroups

The 4 RPU configuration is tailored for scenarios where lightweight compute resources suffice. The following are some practical use cases:

  • Small business analytics – Small businesses with limited data (less than 32 TB) can analyze sales, customer behavior, or operational metrics with cost-effective data warehouses. Running 10–20 daily ETL queries and occasional one-time queries remains cost-effective at this capacity.
  • Development and testing environments – The configuration is well-suited for development and test environments where full production resources aren’t needed. Data engineers can experiment with Redshift Serverless, prototype queries, or build proof-of-concept solutions without committing to higher RPU capacities. The 4 RPU configuration lowers the cost of continuous integration and delivery (CI/CD) testing of data pipelines. Teams can run automated integration tests and schema validations in isolated environments that mirror production systems while optimizing costs through per-second billing.
  • Analytics for startups – Startups can build robust product analytics capabilities without significant upfront investment. Teams can track customer behavior, feature adoption, and KPIs using familiar SQL queries, then connect business intelligence (BI) tools like Amazon QuickSight or Tableau for lightweight dashboarding.
  • Training and experimentation – Organizations can create dedicated sandbox environments for data analysts’ onboarding and experimentation with minimal budget impact. These environments are perfect for exploring analytics powered by large language models (LLMs), semantic layer development, or generative AI applications.
  • Data quality workflows – The feature efficiently supports scheduled jobs for data quality validation, checking data freshness, integrity, and conformance without dedicating high-capacity environments to routine QA tasks.
  • Enterprise team enablement – Large organizations can implement decentralized data warehousing strategies. Each department can operate its data warehouse aligned with specific needs and budgets, enabling department-level chargeback models.
  • Environment isolation – Organizations can create dedicated workgroups per environment (development, test, QA, UAT), providing complete isolation without sharing compute resources or risking cross-environment interference.
  • Data mesh architecture – Domain teams can operate independently while maintaining cost-efficiency. Each domain runs its workgroup for lightweight transformations, domain-specific marts, and KPI calculations. It offers a flexible sizing option in a data mesh architecture.
  • Event-driven analytics – The configuration is well suited for short-lived or event-triggered analytics tasks. Organizations can programmatically create workgroups through APIs for A/B test analysis, campaign performance summaries, or machine learning (ML) pipeline validation.
  • Low-volume one-time reporting – Organizations with infrequent or lightweight reporting needs, such as monthly financial summaries or dashboard refreshes, can use 4 RPUs to minimize costs while maintaining performance.

Cost considerations and best practices

Although the 4 RPU configuration is cost-effective, there are a few considerations to keep in mind to optimize expenses:

  • Billing – Redshift Serverless bills on a per-second basis with a 60-second minimum per query. For very short queries (such as subsecond), this can inflate costs. To mitigate this, batch queries where possible to maximize resource utilization within the 60-second window. For more information, see Amazon Redshift pricing.
  • Set usage limits – Use the Redshift Serverless console to set maximum RPU-hour limits (daily, weekly, or monthly) to prevent unexpected costs. You can configure alerts or automatically turn off queries when limits are reached. To learn more, see Setting usage limits, including setting RPU limits.
  • Monitor with system views – Query the SYS_SERVERLESS_USAGE system view to track RPU consumption and estimate query costs. For example, you can calculate daily costs by aggregating charged seconds and multiplying by the RPU rate.
  • Close transactions – Make sure transactions are explicitly closed (using COMMIT or ROLLBACK) to avoid idle sessions consuming RPUs, which can lead to unnecessary charges.
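The daily cost calculation described in the monitoring bullet can be sketched in a few lines of Python. This is a minimal illustration, not an official tool. It assumes you have already exported rows from SYS_SERVERLESS_USAGE as (day, charged_seconds) pairs, that charged_seconds represents charged RPU-seconds, and that the US East (N. Virginia) rate quoted in this post applies. The function name and the sample rows are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Rate quoted in this post for US East (N. Virginia); adjust for your Region.
RPU_HOUR_PRICE_USD = 0.375


def daily_costs(usage_rows, rpu_hour_price=RPU_HOUR_PRICE_USD):
    """Sum charged RPU-seconds per day and convert them to US dollars.

    usage_rows: iterable of (day, charged_seconds) pairs (assumed layout).
    """
    charged_by_day = defaultdict(int)
    for day, charged_seconds in usage_rows:
        charged_by_day[day] += charged_seconds
    # RPU-seconds / 3600 = RPU-hours; RPU-hours x price = cost.
    return {
        day: seconds / 3600 * rpu_hour_price
        for day, seconds in sorted(charged_by_day.items())
    }


# Hypothetical sample rows, for illustration only.
sample_rows = [
    (date(2025, 1, 1), 2400),
    (date(2025, 1, 1), 2400),
    (date(2025, 1, 2), 7200),
]

costs = daily_costs(sample_rows)
for day, cost in costs.items():
    print(day.isoformat(), f"${cost:.2f}")
```

Comparing these daily figures against your configured usage limits is one way to spot unexpected consumption early.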

The following is a practical example for a 4 RPU workgroup in US East (N. Virginia) at $0.375/RPU-hour, for a 10-minute query that runs daily. These are compute costs only. Primary storage capacity is billed as Redshift Managed Storage (RMS).

  • Workload duration: 10 minutes (600 seconds)
  • Cost: (600 seconds / 3600 seconds) × 4 RPUs × $0.375 = $0.25
  • Monthly cost (30 days): $0.25 × 30 = $7.50
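The same arithmetic can be written as a small Python helper, which also shows the effect of the 60-second minimum on very short queries. This is a simplified sketch: it assumes the workgroup stays at its 4 RPU base capacity and doesn't scale up during the query, and the function and constant names are illustrative.

```python
RPU_HOUR_PRICE_USD = 0.375   # US East (N. Virginia) rate used in this post
MIN_BILLED_SECONDS = 60      # minimum billed duration described above


def compute_cost(duration_seconds, rpus=4, rpu_hour_price=RPU_HOUR_PRICE_USD):
    """Estimate compute cost for one run at a fixed RPU level (no scaling)."""
    billed_seconds = max(duration_seconds, MIN_BILLED_SECONDS)
    return billed_seconds / 3600 * rpus * rpu_hour_price


daily = compute_cost(600)          # the 10-minute daily query
monthly = daily * 30
sub_second = compute_cost(0.5)     # still billed as 60 seconds

print(f"Daily: ${daily:.2f}, monthly: ${monthly:.2f}")
print(f"Sub-second query billed as 60 seconds: ${sub_second:.3f}")
```

The sub-second case illustrates why batching short queries into the same 60-second window helps keep costs down.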

Performance considerations

Although the 4 RPU configuration is cost-efficient, it’s designed for lighter workloads. For complex queries or datasets exceeding 32 TB, you need a base capacity of 8 to 24 RPUs, which supports up to 128 TB of storage. For more than 128 TB, you need 32 RPUs or more. If query performance is a priority, consider increasing the base capacity or enabling AI-driven scaling and optimization to optimize resources dynamically. Benchmark tests suggest that higher RPUs (such as 32 RPUs) significantly improve performance for complex queries. However, for simpler tasks, 4 RPUs deliver adequate throughput.
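The storage thresholds above can be expressed as a simple lookup. The following Python sketch considers managed storage size only. It doesn't account for query complexity, concurrency, or column counts, which might call for a higher base capacity. The function name is hypothetical.

```python
def minimum_base_rpus(storage_tb):
    """Return the smallest base capacity whose storage limit covers storage_tb.

    Thresholds follow this post: 4 RPUs up to 32 TB, 8 to 24 RPUs up to
    128 TB, and 32 RPUs or more beyond that.
    """
    if storage_tb < 0:
        raise ValueError("storage_tb must be non-negative")
    if storage_tb <= 32:
        return 4
    if storage_tb <= 128:
        return 8
    return 32


for size_tb in (10, 32, 100, 200):
    print(f"{size_tb} TB -> at least {minimum_base_rpus(size_tb)} RPUs")
```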

To monitor performance, use the Redshift Serverless console or CloudWatch metrics like ComputeCapacity and ComputeSeconds. The SYS_QUERY_HISTORY view can also help analyze query runtimes and identify bottlenecks.

Conclusion

Redshift Serverless with 4 RPUs represents a significant step forward in making enterprise-grade analytics more affordable and accessible to organizations of different sizes, such as a startup building its first analytics system, a development team looking to optimize testing environments, or an enterprise implementing a data mesh architecture. This new configuration combines the power and flexibility of Redshift Serverless with a cost-effective entry point, so teams can start small and scale seamlessly as their needs grow. The ability to begin with minimal commitment while maintaining access to advanced features like automatic scaling, built-in security, and seamless data lake integration makes this a compelling option for modern data analytics workloads. Combined with pay-per-second billing and intelligent resource management, Redshift Serverless with 4 RPUs delivers a strong balance of cost-efficiency and performance.

To get started with cost-effective analytics, visit the AWS Management Console to create your Redshift Serverless workgroup with 4 RPUs. For more information, refer to the Amazon Redshift Serverless Management Guide or Amazon Redshift best practices. Plan your analytics budget effectively using the AWS Pricing Calculator to estimate costs based on your specific workload patterns, or contact your AWS account team to discuss your particular use case.


About the authors

Ricardo Serafim

Ricardo is a Senior Analytics Specialist Solutions Architect at AWS. He has been helping companies with Data Warehouse solutions since 2007.

Ashish Agrawal

Ashish is a Principal Product Manager with Amazon Redshift, building cloud-based data warehouses and analytics cloud services. Ashish has over 25 years of experience in IT. Ashish has expertise in data warehouses, data lakes, and platform as a service. Ashish has been a speaker at worldwide technical conferences.

Andre Hass

Andre is a Senior Technical Account Manager at AWS, specialized in AWS Data Analytics workloads. With more than 20 years of experience in databases and data analytics, he helps customers optimize their data solutions and navigate complex technical challenges. When not immersed in the world of data, Andre can be found pursuing his passion for outdoor adventures. He enjoys camping, hiking, and exploring new destinations with his family on weekends or whenever an opportunity arises.

Unlock the power of optimization in Amazon Redshift Serverless

Post Syndicated from Ricardo Serafim original https://aws.amazon.com/blogs/big-data/unlock-the-power-of-optimization-in-amazon-redshift-serverless/

Amazon Redshift Serverless automatically scales compute capacity to match workload demands, measuring this capacity in Redshift Processing Units (RPUs). Although traditional scaling primarily responds to query queue times, the new AI-driven scaling and optimization feature offers a more sophisticated approach by considering multiple factors including query complexity and data volume. Intelligent scaling addresses key data warehouse challenges by preventing both over-provisioning of resources for performance and under-provisioning to save costs, particularly for workloads that fluctuate based on daily patterns or monthly cycles.

Amazon Redshift Serverless now offers enhanced flexibility in configuring workgroups through two primary methods. Users can either set a base capacity, specifying the baseline RPUs for query execution, with options ranging from 8 to 1024 RPUs and each RPU providing 16 GB of memory, or they can opt for the price-performance target. Amazon Redshift Serverless AI-driven scaling and optimization can adapt more precisely to diverse workload requirements and employs intelligent resource management, automatically adjusting resources during query execution for optimal performance. Consider using AI-driven scaling and optimization if your current workload requires 32 to 512 base RPUs. We don’t recommend using this feature for workloads that need fewer than 32 or more than 512 base RPUs.

In this post, we demonstrate how Amazon Redshift Serverless AI-driven scaling and optimization impacts performance and cost across different optimization profiles.

Options in AI-driven scaling and optimization

Amazon Redshift Serverless AI-driven scaling and optimization offers an intuitive slider interface, letting you balance price and performance goals. You can select from five optimization profiles, ranging from Optimized for Cost to Optimized for Performance, as shown in the following diagram. Your slider position determines how Amazon Redshift allocates resources and implements AI-driven scaling and optimizations, to achieve your desired price-performance target.

Sliding bar

The slider offers the following options:

  1. Optimized for Cost (1)
    • Prioritizes cost savings over performance
    • Allocates minimum resources in favor of saving on costs
    • Best for workloads where performance isn’t time-critical
  2. Cost-Balanced (25)
    • Balances towards cost savings while maintaining reasonable performance
    • Allocates moderate resources
    • Suitable for mixed workloads with some flexibility in query time
  3. Balanced (50)
    • Provides equal emphasis on cost efficiency and performance
    • Allocates optimal resources for most use cases
    • Ideal for general-purpose workloads
  4. Performance-Balanced (75)
    • Favors performance while maintaining some cost control
    • Allocates additional resources when needed
    • Suitable for workloads requiring consistently fast query elapsed time
  5. Optimized for Performance (100)
    • Maximizes performance regardless of cost
    • Provides maximum available resources
    • Best for time-critical workloads requiring fastest possible query delivery
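The five slider positions can be captured as a small enumeration, which can be handy when scripting comparisons across profiles. The following Python sketch uses the slider values from the preceding list. The class, member, and function names are illustrative and aren't part of any AWS SDK.

```python
from enum import IntEnum


class OptimizationProfile(IntEnum):
    """Slider values for the five price-performance profiles listed above."""
    OPTIMIZED_FOR_COST = 1
    COST_BALANCED = 25
    BALANCED = 50
    PERFORMANCE_BALANCED = 75
    OPTIMIZED_FOR_PERFORMANCE = 100


def profile_for_slider(value):
    """Map a slider value to its profile; raises ValueError if unsupported."""
    return OptimizationProfile(value)


for profile in OptimizationProfile:
    print(f"{profile.value:>3}  {profile.name}")
```

Rejecting values other than the five listed positions keeps a typo in a deployment script from silently selecting an unintended profile.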

Which workloads to consider for AI-driven scaling and optimizations

The Amazon Redshift Serverless AI-driven scaling and optimization capabilities can be applied to almost every analytical workload. Amazon Redshift will assess and apply optimizations according to your price-performance target—cost, balance, or performance.

Most analytical workloads operate on millions or even billions of rows and generate aggregations and complex calculations. These workloads have high variability for query patterns and number of queries. The Amazon Redshift Serverless AI-driven scaling and optimization will improve the price, performance, or both because it learns the patterns (the repeatability of your workload) and will allocate more resources towards performance improvements if you’re performance-focused or fewer resources if you’re cost-focused.

Cost-effectiveness of AI-driven scaling and optimization

To determine the effectiveness of Amazon Redshift Serverless AI-driven scaling and optimization, you first need to measure your current price-performance. We encourage you to do so by using sys_query_history to calculate the total elapsed time of your workload, noting the start time and end time. Then use sys_serverless_usage to calculate the cost for the same period. You can use the query from the Amazon Redshift documentation and add the same start and end times. This establishes your current price-performance and gives you a baseline to compare against.

If such a measurement isn’t practical because your workloads run continuously and you can’t determine a fixed start and end time, you can compare holistically instead: check your month-over-month cost, user sentiment towards performance and system stability, improvements in data delivery, or reductions in overall monthly processing times.
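As a rough illustration of the baseline measurement, the following Python sketch combines the two inputs over a fixed window. It assumes you have exported rows from sys_query_history as (start_time, elapsed_time in microseconds) pairs and rows from sys_serverless_usage as (start_time, charged_seconds) pairs. The row layout, function name, sample data, and the RPU-hour price are placeholders, so substitute the rate for your Region.

```python
from datetime import datetime


def baseline(query_rows, usage_rows, start, end, rpu_hour_price=0.375):
    """Total elapsed time and estimated cost for the window [start, end)."""
    elapsed_us = sum(e for t, e in query_rows if start <= t < end)
    charged = sum(c for t, c in usage_rows if start <= t < end)
    return {
        "elapsed_seconds": elapsed_us / 1_000_000,
        "cost_usd": charged / 3600 * rpu_hour_price,
    }


# Hypothetical sample data, for illustration only.
queries = [
    (datetime(2025, 1, 1, 10, 5), 30_000_000),
    (datetime(2025, 1, 1, 10, 20), 90_000_000),
    (datetime(2025, 1, 1, 12, 0), 5_000_000),   # outside the window
]
usage = [
    (datetime(2025, 1, 1, 10, 0), 3600),
    (datetime(2025, 1, 1, 10, 30), 3600),
    (datetime(2025, 1, 1, 11, 30), 3600),       # outside the window
]

result = baseline(
    queries, usage,
    start=datetime(2025, 1, 1, 10, 0),
    end=datetime(2025, 1, 1, 11, 0),
)
print(result)
```

Running the same calculation over an equivalent window after changing the price-performance target gives a like-for-like comparison.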

Benchmark conducted and results

We evaluated the optimization options using the TPC-DS 3 TB dataset from the AWS Labs GitHub repository (amazon-redshift-utils). We deployed this dataset across three Amazon Redshift Serverless workgroups configured as Optimized for Cost, Balanced, and Optimized for Performance. To create a realistic reporting environment, we configured three Amazon Elastic Compute Cloud (Amazon EC2) instances with JMeter (one per endpoint) and ran 15 selected TPC-DS queries concurrently for approximately 1 hour, as shown in the following screenshot.

We disabled the result cache to make sure Amazon Redshift Serverless ran all queries directly, providing accurate measurements. This setup helped us capture authentic performance characteristics across each optimization profile. Also, we designed our test environment without setting the Amazon Redshift Serverless workgroup max capacity parameter—a key configuration that controls the maximum RPUs available to your data warehouse. By removing this limit, we could clearly showcase how different configurations affect scaling behavior in our test endpoints.

Jmeter

Our comprehensive test plan included running each of the 15 queries 355 times, generating 5,325 queries per test cycle. The AI-driven scaling and optimization needs multiple iterations to identify patterns and optimize RPUs, so we ran this workload 10 times. Through these repetitions, the AI learned and adapted its behavior, processing a total of 53,250 queries throughout our testing period.

The testing revealed how the AI-driven scaling and optimization system adapts and optimizes performance across three distinct configuration profiles: Optimized for Cost, Balanced, and Optimized for Performance.

Queries and elapsed time

Although we ran the same core workload repeatedly, we used variable parameters in JMeter to generate different values for the WHERE clause conditions. This approach created similar but not identical workloads, introducing natural variations that showed how the system handles real-world scenarios with varying query patterns.

Our elapsed time analysis demonstrates how each configuration achieved its performance objectives. The following screenshot shows the average elapsed time for each endpoint.

Average Elapsed Time per Endpoint

The results matched our expectations: the Optimized for Performance configuration delivered significant speed improvements, running queries approximately two times as fast as the Balanced configuration and four times as fast as the Optimized for Cost setup.

The following screenshots show the elapsed time breakdown for each test.

Optimized for Cost - Elapsed Time

Balanced - Elapsed Time

Optimized for Performance - Elapsed Time

The following screenshot of the tenth and final test iteration shows distinct performance differences across configurations.

Per Configuration - Elapsed Time

To clarify further, we categorized our query elapsed times into three groups:

  • Short queries – Less than 10 seconds
  • Medium queries – From 10 seconds to 10 minutes
  • Long queries – More than 10 minutes

Considering our last test, the analysis shows:

Duration per configuration | Optimized for Cost | Balanced | Optimized for Performance
Short queries (<10 sec) | 1488 | 1743 | 3290
Medium queries (10 sec – 10 min) | 3633 | 3579 | 2035
Long queries (>10 min) | 204 | 3 | 0
TOTAL | 5325 | 5325 | 5325
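To make the grouping concrete, the following Python sketch classifies an elapsed time into the three groups and converts the counts from the preceding table into percentages. The function name is illustrative, and treating exactly 10 seconds and exactly 10 minutes as medium is our assumption about the boundaries.

```python
def duration_bucket(elapsed_seconds):
    """Classify a query by elapsed time using the three groups above."""
    if elapsed_seconds < 10:
        return "short"
    if elapsed_seconds <= 600:   # boundary handling is an assumption
        return "medium"
    return "long"


# Query counts from the final test iteration, as reported in the table.
counts = {
    "Optimized for Cost": {"short": 1488, "medium": 3633, "long": 204},
    "Balanced": {"short": 1743, "medium": 3579, "long": 3},
    "Optimized for Performance": {"short": 3290, "medium": 2035, "long": 0},
}

shares = {}
for name, groups in counts.items():
    total = sum(groups.values())
    shares[name] = {g: round(100 * n / total, 1) for g, n in groups.items()}
    print(name, total, shares[name])
```

Expressing the counts as shares of the 5,325 queries per test makes it easier to compare how each profile shifts queries between the groups.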

The configuration’s capacity directly impacts query elapsed time. The Optimized for Cost configuration limits resources to save money, resulting in longer query times, making it best suited for workloads that aren’t time critical, where cost savings are prioritized. The Balanced configuration provides moderate resource allocation, striking a middle ground by effectively handling medium-duration queries and maintaining reasonable performance for short queries while nearly eliminating long-running queries. In contrast, the Optimized for Performance configuration allocates more resources, which increases costs but delivers faster query results, making it best for latency-sensitive workloads where query speed is critical.

Capacity used during the tests

Our comparison of the three configurations reveals how Amazon Redshift Serverless AI-driven scaling and optimization technology adapts resource allocation to meet user expectations. The monitoring showed both Base RPU variations and distinct scaling patterns across configurations—scaling up aggressively for faster performance or maintaining lower RPUs to optimize costs.

The Optimized for Cost configuration starts at 128 RPUs and increases to 256 RPUs after three tests. To maintain cost-efficiency, this setup limits the maximum RPU allocation during scaling, even when facing query queuing.

In the following table, we can observe the costs for this Optimized for Cost configuration.

Test # | Starting RPUs | Scaled up to (RPUs) | Cost incurred
1 | 128 | 1408 | $254.17
2 | 128 | 1408 | $258.39
3 | 128 | 1408 | $261.92
4 | 256 | 1408 | $245.57
5 | 256 | 1408 | $247.11
6 | 256 | 1408 | $257.25
7 | 256 | 1408 | $254.27
8 | 256 | 1408 | $254.27
9 | 256 | 1408 | $254.11
10 | 256 | 1408 | $256.15

The strategic RPU allocation by Amazon Redshift Serverless helps optimize costs, as demonstrated in tests 3 and 4, where we observed significant cost savings. This is shown in the following graph.

Optimized for Cost - Cost Average

Although the Optimized for Cost configuration changed its base RPUs, the Balanced configuration kept its base RPUs unchanged but scaled up to 2176 RPUs, beyond the 1408 RPUs that were the maximum used by the Optimized for Cost setup. The following table shows the figures for the Balanced configuration.

Test # | Starting RPUs | Scaled up to (RPUs) | Cost incurred
1 | 192 | 2176 | $261.48
2 | 192 | 2112 | $270.90
3 | 192 | 2112 | $265.26
4 | 192 | 2112 | $260.20
5 | 192 | 2112 | $262.12
6 | 192 | 2112 | $253.18
7 | 192 | 2112 | $272.80
8 | 192 | 2112 | $272.80
9 | 192 | 2112 | $263.72
10 | 192 | 2112 | $243.28

The Balanced configuration, averaging $262.57 per test, delivered significantly better performance while costing only 3% more than the Optimized for Cost configuration, which averaged $254.32 per test. As demonstrated in the previous section, this performance advantage is evident in the elapsed time comparisons. The following graph shows the costs for the Balanced configuration.

Balanced - Cost Average

As expected for the Optimized for Performance configuration, resource usage was higher to deliver the higher performance. In this configuration, we can also observe that after two tests, the engine adapted to start with a higher number of RPUs to serve queries faster.

Test # | Starting RPUs | Scaled up to (RPUs) | Cost incurred
1 | 512 | 2753 | $295.07
2 | 512 | 2327 | $280.29
3 | 768 | 2560 | $333.52
4 | 768 | 2991 | $295.36
5 | 768 | 2479 | $308.72
6 | 768 | 2816 | $324.08
7 | 768 | 2413 | $300.45
8 | 768 | 2413 | $300.45
9 | 768 | 2107 | $321.07
10 | 768 | 2304 | $284.93

Despite a 19% cost increase in the third test, most subsequent tests remained below the $304.39 average cost.

Optimized for Performance - Cost Average

The Optimized for Performance configuration maximizes resource usage to achieve faster query times, prioritizing speed over cost efficiency.

The final cost-performance analysis reveals compelling results:

  • The Balanced configuration delivered twofold better performance while costing only 3.25% more than the Optimized for Cost setup.
  • The Optimized for Performance configuration achieved fourfold faster elapsed time with a 19.69% cost increase compared to the Optimized for Cost option.
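The cost comparison can be recomputed directly from the per-test costs in the three tables above. The following Python sketch does so. Values derived this way may differ slightly from rounded figures quoted in the text, and the helper name is illustrative.

```python
from statistics import mean

# Cost incurred per test (USD), copied from the three tables above.
costs = {
    "Optimized for Cost": [254.17, 258.39, 261.92, 245.57, 247.11,
                           257.25, 254.27, 254.27, 254.11, 256.15],
    "Balanced": [261.48, 270.90, 265.26, 260.20, 262.12,
                 253.18, 272.80, 272.80, 263.72, 243.28],
    "Optimized for Performance": [295.07, 280.29, 333.52, 295.36, 308.72,
                                  324.08, 300.45, 300.45, 321.07, 284.93],
}

averages = {name: mean(values) for name, values in costs.items()}


def percent_more(value, reference):
    """Percentage by which value exceeds reference."""
    return (value - reference) / reference * 100


reference = averages["Optimized for Cost"]
increase = {
    name: percent_more(avg, reference) for name, avg in averages.items()
}

for name in costs:
    print(f"{name}: average ${averages[name]:.2f}, "
          f"{increase[name]:+.2f}% vs. Optimized for Cost")
```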

The following chart illustrates our cost-performance findings:

Average Billing and Elapsed Time per Endpoint

It’s important to note that these results reflect our specific test scenario. Each workload has unique characteristics, and the performance and cost differences between configurations might vary significantly in other use cases. Our findings serve as a reference point rather than a universal benchmark. Additionally, we didn’t test two intermediate configurations available in Amazon Redshift Serverless: one between Optimized for Cost and Balanced, and another between Balanced and Optimized for Performance.

Conclusion

The test results demonstrate the effectiveness of Amazon Redshift Serverless AI-driven scaling and optimization across different workload requirements. These findings highlight how Amazon Redshift Serverless AI-driven scaling and optimization can help organizations find their ideal balance between cost and performance. Although our test results serve as a reference point, each organization should evaluate their specific workload requirements and price-performance targets. The flexibility of five different optimization profiles, combined with intelligent resource allocation, enables teams to fine-tune their data warehouse operations for optimal efficiency.

To get started with Amazon Redshift Serverless AI-driven scaling and optimization, we recommend:

  1. Establishing your current price-performance baseline
  2. Identifying your workload patterns and requirements
  3. Testing different optimization profiles with your specific workloads
  4. Monitoring and adjusting based on your results

By using these capabilities, organizations can achieve better resource utilization while meeting their specific performance and cost objectives.

Ready to optimize your Amazon Redshift Serverless workloads? Visit the AWS Management Console today to create an Amazon Redshift Serverless workgroup with AI-driven scaling and optimization and start exploring the different optimization profiles. For more information, check out our documentation on Amazon Redshift Serverless AI-driven scaling and optimization, or contact your AWS account team to discuss your specific use case.


About the Authors

Ricardo Serafim is a Senior Analytics Specialist Solutions Architect at AWS. He has been helping companies with Data Warehouse solutions since 2007.

Milind Oke is a Data Warehouse Specialist Solutions Architect based out of New York. He has been building data warehouse solutions for over 15 years and specializes in Amazon Redshift.

Andre Hass is a Senior Technical Account Manager at AWS, specialized in AWS Data Analytics workloads. With more than 20 years of experience in databases and data analytics, he helps customers optimize their data solutions and navigate complex technical challenges. When not immersed in the world of data, Andre can be found pursuing his passion for outdoor adventures. He enjoys camping, hiking, and exploring new destinations with his family on weekends or whenever an opportunity arises.

Amazon Redshift Serverless adds higher base capacity of up to 1024 RPUs

Post Syndicated from Ricardo Serafim original https://aws.amazon.com/blogs/big-data/amazon-redshift-serverless-adds-higher-base-capacity-of-up-to-1024-rpus/

In the rapidly evolving world of data and analytics, organizations are constantly seeking new ways to optimize their data infrastructure and unlock valuable insights. Amazon Redshift is changing the game for thousands of businesses every day by making analytics straightforward and more impactful. Fully managed, AI powered, and using parallel processing, Amazon Redshift helps companies uncover insights faster than ever. Whether you’re a small startup or a big player, Amazon Redshift helps you make smart decisions quickly and with the best price-performance at scale. Amazon Redshift Serverless is a pay-per-use serverless data warehousing service that eliminates the need for manual cluster provisioning and management. This approach is a game changer for organizations of all sizes with predictable or unpredictable workloads.

The key innovation of Redshift Serverless is its ability to automatically scale compute up or down based on your workload demands, maintaining optimal performance and cost-efficiency without manual intervention. Redshift Serverless allows you to specify the base data warehouse capacity the service uses to handle your queries for a steady level of performance on a well-known workload or use a price-performance target (AI-driven scaling and optimization), better suited in scenarios with fluctuating demands, optimizing costs while maintaining performance. The base capacity is measured in Redshift Processing Units (RPUs), where one RPU provides 16 GB of memory. Redshift Serverless defaults to a robust 128 RPUs, capable of analyzing petabytes of data, allowing you to scale up for more power or down for cost optimization, making sure that your data warehouse is optimally sized for your unique needs. By setting a higher base capacity, you can improve the overall performance of your queries, especially for data processing jobs that tend to consume a lot of compute resources. The more RPUs you allocate as the base capacity, the more memory and processing power Redshift Serverless will have available to tackle your most demanding workloads. This setting gives you the flexibility to optimize Redshift Serverless for your specific needs. If you have a lot of complex, resource-intensive queries, increasing the base capacity can help make sure those queries are executed efficiently, with little to no bottlenecks or delays.

In this post, we explore the new higher base capacity of 1024 RPUs in Redshift Serverless, which doubles the previous maximum of 512 RPUs. This enhancement helps you get high performance for workloads that contain highly complex queries and for write-intensive workloads with concurrent data ingestion and transformation tasks that require high throughput and low latency. Redshift Serverless can also scale up to 10 times the base capacity. The focus is on helping you find the right balance between performance and cost to meet your organization’s data warehousing needs. By adjusting the base capacity, you can tune Redshift Serverless to deliver the combination of speed and efficiency your workloads need.

The need for 1024 RPUs

Data warehousing workloads are increasingly demanding high-performance computing resources to meet the challenges of modern data processing requirements. The need for 1024 RPUs is driven by several key factors. First, many data warehousing use cases involve processing petabyte-sized historical datasets, whether for initial data loading or periodic reprocessing and querying. This is particularly prevalent in industries like healthcare, financial services, manufacturing, retail, and engineering, where third-party data sources can deliver petabytes of information that must be ingested in a timely manner. Additionally, the seasonal nature of many business processes, such as month-end or quarter-end reporting, creates periodic spikes in computational needs that require substantial scalable resources.

The complexity of the queries and analytics run against data warehouses has also grown exponentially, with many workloads now scanning and processing multi-petabyte datasets. This level of complex data processing requires substantial memory and parallel processing capabilities that can be effectively provided by a 1024 RPU configuration. Furthermore, the increasing integration of data warehouses with data lakes and other distributed data sources adds to the overall computational burden, necessitating high-performing, scalable solutions.

Also, many data warehousing environments are characterized by heavy write-intensive workloads, with concurrent data ingestion and transformation tasks that require a high-throughput, low-latency processing architecture. For workloads requiring access to extremely large volumes of data with complex joins, aggregations, and numerous columns that necessitate substantial memory usage, the 1024 RPU configuration can deliver the necessary performance to help meet demanding service level agreements (SLAs) and provide timely data availability for downstream business intelligence and decision-making processes. To control costs, you can set the maximum capacity (on the Limits tab of the workgroup configuration) to cap the resources the workgroup can use. The following screenshot shows an example.

MaxCapacity
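The same limit can also be set programmatically. The following is a minimal sketch, assuming the boto3 `redshift-serverless` client and its `update_workgroup` call with the `maxCapacity` parameter; the helper names are ours, and the call itself needs boto3 and AWS credentials.

```python
def max_capacity_request(workgroup_name: str, max_rpus: int) -> dict:
    """Build the keyword arguments for an UpdateWorkgroup call that caps capacity."""
    if max_rpus <= 0:
        raise ValueError("max_rpus must be positive")
    return {"workgroupName": workgroup_name, "maxCapacity": max_rpus}


def apply_max_capacity(workgroup_name: str, max_rpus: int) -> dict:
    """Send the update to Redshift Serverless (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    client = boto3.client("redshift-serverless")
    return client.update_workgroup(**max_capacity_request(workgroup_name, max_rpus))
```

For example, `apply_max_capacity("my-workgroup", 1024)` would cap a workgroup (the name is a placeholder) at 1024 RPUs.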

During the tests discussed later in this post, we compare using maximum capacity of 1024 RPUs vs. 512 RPUs.

When to consider using 1024 RPUs

Consider using 1024 RPUs in the following scenarios:

  • Complex and long-running queries – Large warehouses provide the compute power needed to process complex queries that involve multiple joins, aggregations, and calculations. For workloads analyzing terabytes or petabytes of data, the 1024 RPU capacity can significantly improve query completion times.
  • Data lake queries scanning large datasets – Queries that scan extensive data in external data lakes benefit from the additional compute resources. This provides faster processing and reduced latency, even for large-scale analytics.
  • High-memory queries – Queries requiring substantial memory—such as those with many columns, large intermediate results, or temporary tables—perform better with the increased capacity of a larger warehouse.
  • Accelerated data loading – Large capacity warehouses improve the performance of data ingestion tasks, such as loading massive datasets into the data warehouse. This is particularly beneficial for workloads involving frequent or high-volume data loads.
  • Performance-critical use cases – For applications or systems that demand low latency and high responsiveness, a 1024 RPU warehouse provides smooth operation by allocating sufficient compute resources to handle peak loads efficiently.

Balancing performance and cost

Choosing the right warehouse size requires evaluating your workload’s complexity and performance requirements. A larger warehouse size, such as 1024 RPUs, excels at handling computationally intensive tasks but should be balanced against cost-effectiveness. Consider testing your workload on different base capacities or using the Redshift Serverless price-performance slider to find the optimal setting.

When to avoid larger base capacity

Although larger warehouses offer powerful performance benefits, they might not always be the most cost-effective solution. Consider the following scenarios where a smaller base capacity might be more suitable:

  • Basic or small queries – Simple queries that process small datasets or involve minimal computation don’t require the high capacity of a 1024 RPU warehouse. In such cases, smaller warehouses can handle the workload effectively, avoiding unnecessary costs.
  • Cost-sensitive workloads – For workloads with predictable and moderate complexity, a smaller warehouse can deliver sufficient performance while keeping costs under control. Selecting a larger capacity might lead to overspending without proportional performance gains.

Comparison and cost-effectiveness

The previous maximum of 512 RPUs should suffice for most use cases, but some situations need more. At 512 RPUs, you get 8 TB of memory on your workgroup; with 1024 RPUs, it doubles to 16 TB. Consider a scenario where you ingest large volumes of data with the COPY command, such as healthcare datasets in the 30 TB (or larger) range.

To illustrate, we ingested the TPC-H 30 TB datasets available in the AWS Labs GitHub repository amazon-redshift-utils on the 512 RPU workgroup and the 1024 RPU workgroup.

The following graph provides detailed runtimes. We see an overall 44% performance improvement on 1024 RPUs vs. 512 RPUs. You will notice that the larger ingestion workloads show a greater performance improvement.

Ingestion

The cost for running 6,809 seconds at 512 RPUs in the US East (Ohio) AWS Region at $0.36 per RPU-hour is calculated as 6809 * 512 * 0.36 / 60 / 60 = $348.62.

The cost for running 3,811 seconds at 1024 RPUs in the US East (Ohio) Region at $0.36 per RPU-hour is calculated as 3811 * 1024 * 0.36 / 60 / 60 = $390.25.

The 1024 RPU workgroup ingests the 30 TB of data 44% faster, at a 12% higher cost, than the 512 RPU workgroup.
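The arithmetic above can be checked with a short Python sketch. It mirrors the formula used in this post (seconds * RPUs * price / 3600) with the quoted $0.36 per RPU-hour rate; the function and variable names are illustrative.

```python
PRICE_PER_RPU_HOUR = 0.36  # US East (Ohio) rate quoted in the post


def serverless_cost(seconds: float, rpus: int, price: float = PRICE_PER_RPU_HOUR) -> float:
    """Cost in USD for running `rpus` RPUs for `seconds` seconds."""
    return seconds * rpus * price / 3600


# Ingestion runtimes reported in the post for each workgroup.
cost_512 = serverless_cost(6809, 512)
cost_1024 = serverless_cost(3811, 1024)

# Relative change of the 1024 RPU run against the 512 RPU run.
time_saved_pct = (6809 - 3811) / 6809 * 100
extra_cost_pct = (cost_1024 - cost_512) / cost_512 * 100

print(f"512 RPUs:  ${cost_512:.2f}")
print(f"1024 RPUs: ${cost_1024:.2f}")
print(f"{time_saved_pct:.0f}% faster at {extra_cost_pct:.0f}% higher cost")
```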

Next, we ran the 22 TPC-H queries available in the AWS Samples GitHub repository redshift-benchmarks on the same two workgroups to compare query performance.

The following graph provides detailed runtimes for each of the 22 TPC-H queries. We see an overall 17% performance improvement on 1024 RPUs vs. 512 RPUs for a single session sequential query execution, even though performance improved for some and deteriorated for others.

Queries

When running 20 sessions concurrently, we see 62% performance improvement, from 6,903 seconds on 512 RPUs down to 2,592 seconds on 1024 RPUs, with each concurrent session running the 22 TPC-H queries in a different order.

Notice the stark difference in performance improvement seen for concurrent execution (62%) vs. serial execution (17%). The concurrent executions represent a typical production system where multiple concurrent sessions are running queries against the database. It’s important to base your proof of concept decisions on production-like scenarios with concurrent executions, and not only on sequential executions, which typically come from a single user running the proof of concept. The following table compares both tests.

Test | 512 RPUs | 1024 RPUs
Sequential (seconds) | 1,276 | 1,065
Concurrent executions (seconds) | 6,903 | 2,592
Total (seconds) | 8,179 | 3,657
Total ($) | $418.76 | $374.48

The total ($) is calculated by seconds * RPUs * 0.36 / 60 / 60.

1024 RPUs are able to run the TPC-H queries against 30 TB benchmark data 55% faster, and at 11% lower cost compared to 512 RPUs.
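To make the derivation of these figures explicit, here is a small Python sketch that recomputes the totals, costs, and percentage changes from the sequential and concurrent runtimes reported above. The data structure and helper name are our own; the numbers come from the table.

```python
# Runtimes in seconds, as reported in the comparison table.
RUNS = {
    512: {"sequential": 1276, "concurrent": 6903},
    1024: {"sequential": 1065, "concurrent": 2592},
}
PRICE_PER_RPU_HOUR = 0.36


def pct_reduction(baseline: float, candidate: float) -> float:
    """Percentage by which `candidate` is lower than `baseline`."""
    return (baseline - candidate) / baseline * 100


totals = {rpus: sum(times.values()) for rpus, times in RUNS.items()}
costs = {rpus: secs * rpus * PRICE_PER_RPU_HOUR / 3600 for rpus, secs in totals.items()}

for rpus in RUNS:
    print(f"{rpus} RPUs: {totals[rpus]} s, ${costs[rpus]:.2f}")

print(f"Sequential: {pct_reduction(1276, 1065):.0f}% faster")
print(f"Concurrent: {pct_reduction(6903, 2592):.0f}% faster")
print(f"Overall:    {pct_reduction(totals[512], totals[1024]):.0f}% faster, "
      f"{pct_reduction(costs[512], costs[1024]):.0f}% lower cost")
```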

Amazon Redshift offers system metadata views and system views, which are useful for tracking resource utilization. We analyzed additional metrics from the sys_query_history and sys_query_detail views to identify which specific parts of query execution experienced performance improvements or declines. Notice that 1024 RPUs with 16 TB of memory can hold a larger number of data blocks in memory, so it needed to fetch 35% fewer SSD blocks than 512 RPUs with 8 TB of memory. It also ran the larger workloads better, fetching 71% fewer remote Amazon S3 blocks than 512 RPUs. Finally, local disk spill to SSD (when a query can’t be allocated more memory) was reduced by 63%, and remote disk spill to S3 (when the SSD cache is fully occupied) was completely eliminated on 1024 RPUs. The following table summarizes the improvement of 1024 RPUs over 512 RPUs for each metric; negative values indicate a decline.

Metric | Improvement (percentage)
Elapsed time | 60%
Queue time | 23%
Runtime | 59%
Compile time | -8%
Planning time | 64%
Lockwait time | -31%
Local SSD blocks read | 35%
Remote S3 blocks read | 71%
Local disk spill to SSD | 63%
Remote disk spill to S3 | 100%
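The exact queries behind these metrics are not shown in this post. As a hedged sketch of how the timing rows could be collected, the following Python helper builds a query that totals the timing columns of SYS_QUERY_HISTORY for one test window. The column list reflects our understanding of that view, the mapping of "Runtime" to `execution_time` is an assumption, and the time window is hypothetical.

```python
from datetime import datetime

# Timing columns of SYS_QUERY_HISTORY (reported in microseconds).
TIMING_COLUMNS = (
    "elapsed_time",
    "queue_time",
    "execution_time",
    "compile_time",
    "planning_time",
    "lock_wait_time",
)


def timing_summary_sql(start: datetime, end: datetime) -> str:
    """Build a query that totals each timing column for one test window."""
    sums = ",\n       ".join(f"SUM({col}) AS total_{col}" for col in TIMING_COLUMNS)
    start_ts = start.isoformat(sep=" ", timespec="seconds")
    end_ts = end.isoformat(sep=" ", timespec="seconds")
    return (
        f"SELECT {sums}\n"
        "FROM sys_query_history\n"
        f"WHERE start_time >= '{start_ts}' AND end_time <= '{end_ts}';"
    )


# Hypothetical test window; substitute the start and end of your own run.
print(timing_summary_sql(datetime(2025, 1, 15, 9, 0), datetime(2025, 1, 15, 11, 0)))
```

Running the generated statement on each workgroup for its own test window gives one row of totals per configuration, which can then be compared metric by metric.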

The following are some run characteristic graphs captured from the Amazon Redshift console. To find these, choose Query and database monitoring and Resource monitoring under Monitoring in the navigation pane.

Thanks to the performance enhancement, queries completed sooner with 1024 RPUs than with 512 RPUs, resulting in connections ending faster.

The following graph illustrates the database connection with 512 RPUs.

Database Connections - 512 RPUs

The following graph illustrates the database connection with 1024 RPUs.

Database Connections - 1024 RPUs

Regarding query classification, there are three categories: short queries (less than 10 seconds), medium queries (10 seconds to 10 minutes), and long queries (more than 10 minutes). We observed that due to performance improvements, the 1024 RPU configuration resulted in fewer long queries compared to the 512 RPU configuration.
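The three categories above can be expressed as a small Python function. This is an illustrative sketch of the thresholds only; the sample durations are made up and do not come from the benchmark.

```python
from collections import Counter


def classify_query(duration_seconds: float) -> str:
    """Bucket a query by duration: short (<10 s), medium (10 s to 10 min), long (>10 min)."""
    if duration_seconds < 10:
        return "short"
    if duration_seconds <= 600:
        return "medium"
    return "long"


# Made-up durations in seconds, used only to demonstrate the bucketing.
sample_durations = [0.4, 8, 45, 599, 600, 601, 4000]
print(Counter(classify_query(d) for d in sample_durations))
```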

The following graph illustrates the query duration with 512 RPUs.

Duration of Queries (512 RPUs)

The following graph illustrates the query duration with 1024 RPUs.

Duration of Queries (1024 RPUs)

Due to the better performance, we noticed that the number of queries handled per second is higher on 1024 RPUs.

The following graph illustrates the queries completed per second with 512 RPUs.

Queries Per Second (512 RPUs)

The following graph illustrates the queries completed per second with 1024 RPUs.

Queries Per Second (1024 RPUs)

In the following graphs, we see that although the number of queries running looks similar, the 1024 RPU endpoint ends the queries faster, which means a smaller window to run the same number of queries.

The following graph illustrates the queries running with 512 RPUs.

Queries running (512 RPUs)

The following graph illustrates the queries running with 1024 RPUs.

Queries running (1024 RPUs)

There was no queuing when we compared both tests.

The following graph illustrates the queries queued with 512 RPUs.

Queries queued (512 RPUs)

The following graph illustrates the queries queued with 1024 RPUs.

Queries queued (1024 RPUs)

The following graph illustrates the query runtime breakdown with 512 RPUs.

Query Breakdown (512 RPUs)

The following graph illustrates the query runtime breakdown with 1024 RPUs.

Query Breakdown (1024 RPUs)

Queuing was largely avoided due to the automatic scaling feature offered by Redshift Serverless. By dynamically adding more resources, we can keep queries running and match the expected performance levels, even during usage peaks. You are able to set a maximum capacity to help prevent automatic scaling from exceeding your desired resource limits.

The following graph illustrates workgroup scaling with 512 RPUs. Redshift Serverless automatically scaled to 2x/1024 RPUs and peaked at 2.5x/1280 RPUs.

Workgroup Scaling With 512 RPUs

The following graph illustrates workgroup scaling with 1024 RPUs. Redshift Serverless automatically scaled to 2x/2048 RPUs and peaked at 3x/3072 RPUs.

Workgroup Scaling With 1024 RPUs

The following graph illustrates compute consumed with 512 RPUs.

Compute Consumed - 512 RPUs

The following graph illustrates compute consumed with 1024 RPUs.

Compute Consumed - 1024 RPUs

Conclusion

The introduction of the 1024 RPU capacity for Redshift Serverless marks a significant advancement in data warehousing capabilities, offering substantial benefits for organizations handling large-scale, complex data processing tasks. Ingestion performance in Redshift Serverless scales up with higher capacity. As evidenced by the benchmark tests in this post using the TPC-H dataset, this higher base capacity not only accelerates processing times, but can also prove more cost-effective for workloads as described in this post, demonstrating improvements such as 44% faster data ingestion, 62% better performance in concurrent query execution, and overall cost savings of 11% for combined workloads.

Given these impressive results, it’s crucial for organizations to evaluate their current data warehousing needs and consider running a proof of concept with the 1024 RPU configuration. Analyze your workload patterns using the Amazon Redshift monitoring tools, optimize your configurations accordingly, and don’t hesitate to engage with AWS experts for personalized advice. If your company is covered by an account team, ask them for a meeting. If not, post your analysis and question on AWS re:Post.

By taking these steps and staying informed about future developments, you can make sure that your organization fully takes advantage of Redshift Serverless, potentially unlocking new levels of performance and cost-efficiency in your data warehousing operations.


About the authors

Ricardo Serafim is a Senior Analytics Specialist Solutions Architect at AWS.

Harshida Patel is an Analytics Specialist Principal Solutions Architect with AWS.

Milind Oke is a Data Warehouse Specialist Solutions Architect based out of New York. He has been building data warehouse solutions for over 15 years and specializes in Amazon Redshift.

Achieve peak performance and boost scalability using multiple Amazon Redshift serverless workgroups and Network Load Balancer

Post Syndicated from Ricardo Serafim original https://aws.amazon.com/blogs/big-data/achieve-peak-performance-and-boost-scalability-using-multiple-amazon-redshift-serverless-workgroups-and-network-load-balancer/

As data analytics use cases grow, factors of scalability and concurrency become crucial for businesses. Your analytic solution architecture should be able to handle large data volumes at high concurrency and without compromising speed, thereby delivering a scalable high-performance analytics environment.

Amazon Redshift Serverless provides a fully managed, petabyte-scale, auto scaling cloud data warehouse to support high-concurrency analytics. It offers data analysts, developers, and scientists a fast, flexible analytic environment to gain insights from their data with optimal price-performance. Redshift Serverless auto scales during usage spikes, helping enterprises meet changing business demands cost-effectively. You can benefit from this simplicity without changing your existing analytics and business intelligence (BI) applications.

To help meet demanding performance needs like high concurrency, usage spikes, and fast query response times while optimizing costs, this post proposes using Redshift Serverless. The proposed solution aims to address three key performance requirements:

  • Support thousands of concurrent connections with high availability by using multiple Redshift Serverless endpoints behind a Network Load Balancer
  • Accommodate hundreds of concurrent queries with low-latency service level agreements through scalable and distributed workgroups
  • Enable subsecond response times for short queries against large datasets using the fast query processing of Amazon Redshift

The suggested architecture uses multiple Redshift Serverless endpoints accessed through a single Network Load Balancer client endpoint. The Network Load Balancer evenly distributes incoming requests across workgroups. This improves performance and reduces latency by scaling out resources to meet high throughput and low latency demands.

Solution overview

The following diagram outlines a Redshift Serverless architecture with multiple Amazon Redshift managed VPC endpoints behind a Network Load Balancer.

The following are the main components of this architecture:

  • Amazon Redshift data sharing – This allows you to securely share live data across Redshift clusters, workgroups, AWS accounts, and AWS Regions without manually moving or copying the data. Users can see up-to-date and consistent information in Amazon Redshift as soon as it’s updated. With Amazon Redshift data sharing, the ingestion can be done at the producer or consumer endpoint, allowing the other consumer endpoints to read and write the same data and thereby enabling horizontal scaling.
  • Network Load Balancer – This serves as the single point of contact for clients. The load balancer distributes incoming traffic across multiple targets, such as Redshift Serverless managed VPC endpoints. This increases the availability, scalability, and performance of your application. You can add one or more listeners to your load balancer. A listener checks for connection requests from clients, using the protocol and port that you configure, and forwards requests to a target group. A target group routes requests to one or more registered targets, such as Redshift Serverless managed VPC endpoints, using the protocol and the port number that you specify.
  • VPC – Redshift Serverless is provisioned in a VPC. By creating a Redshift managed VPC endpoint, you enable private access to Redshift Serverless from applications in another VPC. This design allows you to scale by having multiple VPCs as needed. The VPC endpoint provides a dedicated private IP address for each Redshift Serverless workgroup, which you register as a target in the Network Load Balancer target group.

Create an Amazon Redshift managed VPC endpoint

Complete the following steps to create the Amazon Redshift managed VPC endpoint:

  1. On the Redshift Serverless console, choose Workgroup configuration in the navigation pane.
  2. Choose a workgroup from the list.
  3. On the Data access tab, in the Redshift managed VPC endpoints section, choose Create endpoint.
  4. Enter the endpoint name. Create a name that is meaningful for your organization.
  5. The AWS account ID will be populated. This is your 12-digit account ID.
  6. Choose a VPC where the endpoint will be created.
  7. Choose a subnet ID. In the most common use case, this is a subnet where you have a client that you want to connect to your Redshift Serverless instance.
  8. Choose which VPC security groups to add. Each security group acts as a virtual firewall to control inbound and outbound traffic to resources protected by the security group, such as specific virtual desktop instances.

The following screenshot shows an example of this workgroup. Note down the IP address to use during the creation of the target group.

Repeat these steps to create all your Redshift Serverless workgroups.
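The console steps above can also be scripted. The following is a minimal sketch, assuming the boto3 `redshift-serverless` client and its `create_endpoint_access` call; the helper names are ours, and the response shape used by `private_ips` (`vpcEndpoint` with `networkInterfaces` entries carrying `privateIpAddress`) is an assumption to verify against your SDK version.

```python
def endpoint_access_request(endpoint_name, workgroup_name, subnet_ids, security_group_ids) -> dict:
    """Build keyword arguments for a CreateEndpointAccess call."""
    return {
        "endpointName": endpoint_name,
        "workgroupName": workgroup_name,
        "subnetIds": list(subnet_ids),
        "vpcSecurityGroupIds": list(security_group_ids),
    }


def private_ips(endpoint: dict) -> list:
    """Collect the private IP addresses from an endpoint description."""
    interfaces = endpoint.get("vpcEndpoint", {}).get("networkInterfaces", [])
    return [ni["privateIpAddress"] for ni in interfaces if "privateIpAddress" in ni]


def create_endpoint(endpoint_name, workgroup_name, subnet_ids, security_group_ids) -> dict:
    """Create the managed VPC endpoint (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without the SDK

    client = boto3.client("redshift-serverless")
    request = endpoint_access_request(endpoint_name, workgroup_name, subnet_ids, security_group_ids)
    return client.create_endpoint_access(**request)["endpoint"]
```

The IP addresses might not be populated until the endpoint finishes creating, so in practice you would describe the endpoint again once it is active and pass that description to `private_ips`.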

Add VPC endpoints for the target group for the Network Load Balancer

To add these VPC endpoints to the target group for the Network Load Balancer using Amazon Elastic Compute Cloud (Amazon EC2), complete the following steps:

  1. On the Amazon EC2 console, choose Target groups under Load Balancing in the navigation pane.
  2. Choose Create target group.
  3. For Choose a target type, select Instances to register targets by instance ID, or select IP addresses to register targets by IP address.
  4. For Target group name, enter a name for the target group.
  5. For Protocol, choose TCP or TCP_UDP.
  6. For Port, use 5439 (Amazon Redshift port).
  7. For IP address type, choose IPv4 or IPv6. This option is available only if the target type is Instances or IP addresses and the protocol is TCP or TLS. You must associate an IPv6 target group with a dual-stack load balancer. All targets in the target group must have the same IP address type. You can’t change the IP address type of a target group after you create it.
  8. For VPC, choose the VPC with the targets to register.
  9. Leave the default selections for the Health checks section, Attributes section, and Tags section.
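As a sketch of the same configuration in code, the following Python helpers build a TCP target group on port 5439 and register the endpoint IP addresses, assuming the boto3 `elbv2` client (`create_target_group` and `register_targets`). The choice of IP targets over IPv4, and the assumption that the endpoint IPs live in the target group's VPC, are ours.

```python
REDSHIFT_PORT = 5439  # Amazon Redshift port used throughout this post


def target_group_request(name: str, vpc_id: str) -> dict:
    """Build keyword arguments for a TCP target group with IP targets."""
    return {
        "Name": name,
        "Protocol": "TCP",
        "Port": REDSHIFT_PORT,
        "VpcId": vpc_id,
        "TargetType": "ip",
        "IpAddressType": "ipv4",
    }


def targets_for(ip_addresses) -> list:
    """Turn endpoint private IPs into target descriptions."""
    return [{"Id": ip, "Port": REDSHIFT_PORT} for ip in ip_addresses]


def create_and_register(name: str, vpc_id: str, ip_addresses) -> str:
    """Create the target group and register the IPs (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builders above work without the SDK

    elbv2 = boto3.client("elbv2")
    response = elbv2.create_target_group(**target_group_request(name, vpc_id))
    arn = response["TargetGroups"][0]["TargetGroupArn"]
    elbv2.register_targets(TargetGroupArn=arn, Targets=targets_for(ip_addresses))
    return arn
```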

Create a load balancer

After you create the target group, you can create your load balancer. We recommend using port 5439 (Amazon Redshift default port) for it.

The Network Load Balancer serves as a single-access endpoint and will be used on connections to reach Amazon Redshift. This allows you to add more Redshift Serverless workgroups and increase the concurrency transparently.
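The load balancer and its listener can be sketched the same way, again assuming the boto3 `elbv2` client (`create_load_balancer` and `create_listener`). The internal scheme is our assumption for a private analytics endpoint; all names and ARNs are placeholders.

```python
LISTENER_PORT = 5439  # Amazon Redshift default port, as recommended above


def load_balancer_request(name: str, subnet_ids) -> dict:
    """Build keyword arguments for an internal Network Load Balancer."""
    return {
        "Name": name,
        "Type": "network",
        "Scheme": "internal",
        "Subnets": list(subnet_ids),
    }


def listener_request(load_balancer_arn: str, target_group_arn: str) -> dict:
    """Build keyword arguments for a TCP listener that forwards to the target group."""
    return {
        "LoadBalancerArn": load_balancer_arn,
        "Protocol": "TCP",
        "Port": LISTENER_PORT,
        "DefaultActions": [{"Type": "forward", "TargetGroupArn": target_group_arn}],
    }


def create_nlb(name: str, subnet_ids, target_group_arn: str) -> str:
    """Create the load balancer and listener; return its DNS name (needs boto3)."""
    import boto3  # imported lazily so the builders above work without the SDK

    elbv2 = boto3.client("elbv2")
    lb = elbv2.create_load_balancer(**load_balancer_request(name, subnet_ids))["LoadBalancers"][0]
    elbv2.create_listener(**listener_request(lb["LoadBalancerArn"], target_group_arn))
    return lb["DNSName"]
```

Clients then connect to the returned DNS name on port 5439 instead of an individual workgroup endpoint.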

Testing the solution

We tested this architecture to run three BI reports with the TPC-DS dataset (cloud benchmark dataset) as our data. Amazon Redshift includes this dataset for free when you choose to load sample data (sample_data_dev database). The installation also provides the queries to test the setup.

Among all the queries from TPC-DS benchmark, we chose the following three to use as our report queries. We changed the first two report queries to use a CREATE TABLE AS SELECT (CTAS) query on temporary tables instead of the WITH clause to emulate options you can see on a typical BI tool. For our testing, we also disabled the result cache to make sure that Amazon Redshift would run the queries every time.

The set of queries contains the creation of temporary tables, a join between those tables, and the cleanup. The cleanup step drops tables. This isn’t needed because they’re deleted at the end of the session, but this aims to simulate all that the BI tool does.
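To illustrate the shape of such a session, the following Python sketch lists the statements in order: disable the result cache, create temporary tables with CTAS, join them, and clean up. These are not the report queries used in our tests; the temporary table names and the query itself are hypothetical, written against standard TPC-DS tables and columns.

```python
def report_session_statements() -> list:
    """Return the ordered SQL statements for one simulated BI report session."""
    return [
        # Make sure every run actually executes instead of hitting the result cache.
        "SET enable_result_cache_for_session TO off;",
        # CTAS on temporary tables, replacing what would otherwise be WITH clauses.
        "CREATE TEMP TABLE tmp_sales AS "
        "SELECT ss_item_sk, SUM(ss_ext_sales_price) AS revenue "
        "FROM store_sales JOIN date_dim ON ss_sold_date_sk = d_date_sk "
        "WHERE d_year = 2000 GROUP BY ss_item_sk;",
        "CREATE TEMP TABLE tmp_items AS SELECT i_item_sk, i_category FROM item;",
        # The report query joins the temporary tables.
        "SELECT i_category, SUM(revenue) AS revenue "
        "FROM tmp_sales JOIN tmp_items ON ss_item_sk = i_item_sk "
        "GROUP BY i_category;",
        # Cleanup, mirroring what a BI tool does even though the session would drop them.
        "DROP TABLE tmp_sales;",
        "DROP TABLE tmp_items;",
    ]


for statement in report_session_statements():
    print(statement)
```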

We used Apache JMeter to simulate clients invoking the requests. To learn more about how to use and configure Apache JMeter with Amazon Redshift, refer to Building high-quality benchmark tests for Amazon Redshift using Apache JMeter.

For the tests, we used the following configurations:

  • Test 1 – A single 96 RPU Redshift Serverless vs. three workgroups at 32 RPU each
  • Test 2 – A single 48 RPU Redshift Serverless vs. three workgroups at 16 RPU each

We tested three reports by spawning 100 sessions per report (300 total). There were 14 statements across the three reports (4,200 total). All sessions were triggered simultaneously.

The following table summarizes the tables used in the test.

Table Name | Row Count
Catalog_page | 93,744
Catalog_sales | 23,064,768
Customer_address | 50,000
Customer | 100,000
Date_dim | 73,049
Item | 144,000
Promotion | 2,400
Store_returns | 4,600,224
Store_sales | 46,086,464
Store | 96
Web_returns | 1,148,208
Web_sales | 11,510,144
Web_site | 240

Some tables were modified by ingesting more data than what the TPC-DS schema offers on Amazon Redshift. Data was reinserted into these tables to increase their size.

Test results

The following table summarizes our test results.

Test 1 | Time Consumed | Number of Queries | Cost | Max Scaled RPU | Performance
Single: 96 RPUs | 0:02:06 | 2,100 | $6 | 279 | Base
Parallel: 3x 32 RPUs | 0:01:06 | 2,100 | $1.20 | 96 | 48.03%
Parallel 1 (32 RPU) | 0:01:03 | 688 | $0.40 | 32 | 50.10%
Parallel 2 (32 RPU) | 0:01:03 | 703 | $0.40 | 32 | 50.13%
Parallel 3 (32 RPU) | 0:01:06 | 709 | $0.40 | 32 | 48.03%

Test 2 | Time Consumed | Number of Queries | Cost | Max Scaled RPU | Performance
Single: 48 RPUs | 0:01:55 | 2,100 | $3.30 | 168 | Base
Parallel: 3x 16 RPUs | 0:01:47 | 2,100 | $1.90 | 96 | 6.77%
Parallel 1 (16 RPU) | 0:01:47 | 712 | $0.70 | 36 | 6.77%
Parallel 2 (16 RPU) | 0:01:44 | 696 | $0.50 | 25 | 9.13%
Parallel 3 (16 RPU) | 0:01:46 | 692 | $0.70 | 35 | 7.79%

The preceding table shows that the parallel setup was faster than the single workgroup at a lower cost. Also, in our tests, even though the parallel setup in Test 1 had double the capacity of the parallel setup in Test 2, its cost was still 36% lower and it was 39% faster. Based on these results, we can conclude that for workloads that have high throughput (I/O), low latency, and high concurrency requirements, this architecture is cost-efficient and performant. Refer to the AWS Pricing Calculator for Network Load Balancer and VPC endpoints pricing.

Redshift Serverless automatically scales the capacity to deliver optimal performance during periods of peak workloads including spikes in concurrency of the workload. This is evident from the maximum scaled RPU results in the preceding table.

Recently released features of Redshift Serverless such as MaxRPU and AI-driven scaling were not used for this test. These new features can increase the price-performance of the workload even further.

We recommend enabling cross-zone load balancing on the Network Load Balancer because it distributes requests from clients to registered targets. Enabling cross-zone load balancing will help balance the requests among the Redshift Serverless managed VPC endpoints irrespective of the Availability Zone they are configured in. Also, if the Network Load Balancer receives traffic from only one server (same IP), you should always use an odd number of Redshift Serverless managed VPC endpoints behind the Network Load Balancer.
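Cross-zone load balancing is a load balancer attribute. As a minimal sketch, assuming the boto3 `elbv2` client and its `modify_load_balancer_attributes` call with the `load_balancing.cross_zone.enabled` key, it could be switched on as follows; the helper names and the ARN passed in are placeholders.

```python
def cross_zone_request(load_balancer_arn: str, enabled: bool = True) -> dict:
    """Build keyword arguments that turn cross-zone load balancing on or off."""
    return {
        "LoadBalancerArn": load_balancer_arn,
        "Attributes": [
            {
                "Key": "load_balancing.cross_zone.enabled",
                "Value": "true" if enabled else "false",
            }
        ],
    }


def set_cross_zone(load_balancer_arn: str, enabled: bool = True) -> dict:
    """Apply the attribute to the load balancer (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    elbv2 = boto3.client("elbv2")
    return elbv2.modify_load_balancer_attributes(**cross_zone_request(load_balancer_arn, enabled))
```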

Conclusion

In this post, we discussed a scalable architecture that increases the throughput of Redshift Serverless in low latency, high concurrency scenarios. Having multiple Redshift Serverless workgroups behind a Network Load Balancer can deliver a horizontally scalable solution at the best price-performance.

Additionally, Redshift Serverless uses AI techniques (currently in preview) to scale automatically with workload changes across all key dimensions—such as data volume changes, concurrent users, and query complexity—to meet and maintain your price-performance targets.

We hope this post provides you with valuable guidance. We welcome any thoughts or questions in the comments section.


About the Authors

Ricardo Serafim is a Senior Analytics Specialist Solutions Architect at AWS.

Harshida Patel is an Analytics Specialist Principal Solutions Architect with AWS.

Urvish Shah is a Senior Database Engineer at Amazon Redshift. He has more than a decade of experience working on databases, data warehousing and in analytics space. Outside of work, he enjoys cooking, travelling and spending time with his daughter.

Amol Gaikaiwari is a Sr. Redshift Specialist focused on helping customers realize their business outcomes with optimal Redshift price-performance. He loves to simplify data pipelines and enhance capabilities through adoption of latest Redshift features.