Lower your Amazon OpenSearch Service storage cost with gp3 Amazon EBS volumes

Post Syndicated from Siddhant Gupta original https://aws.amazon.com/blogs/big-data/lower-your-amazon-opensearch-service-storage-cost-with-gp3-amazon-ebs-volumes/

Amazon OpenSearch Service makes it easy for you to perform interactive log analytics, real-time application monitoring, website search, and more. OpenSearch is an open-source, distributed search and analytics suite comprising OpenSearch, a distributed search and analytics engine, and OpenSearch Dashboards, a UI and visualization tool. When you use Amazon OpenSearch Service, you configure a set of data nodes to store indexes and serve queries. The service supports instance types for data nodes with different storage options. Some supported Amazon Elastic Compute Cloud (Amazon EC2) instance types, like the R6GD or I3, have local NVMe disks. Others use Amazon Elastic Block Store (Amazon EBS) storage.

On July 2022, OpenSearch Service launched support for the next generation, general purpose SSD (gp3) EBS volumes. OpenSearch Service data nodes require low latency and high throughput storage to provide fast indexing and query. With gp3 EBS volumes, you get higher baseline performance (IOPS and throughput) at a 9.6% lower cost than with the previously offered gp2 EBS volume type. You can provision additional IOPS and throughput independent of volume size using gp3. gp3 volumes are also more stable because they don’t use burst credits. OpenSearch support for gp3 volumes includes doubling the limit on per-data node volume sizes. With these larger volumes, you can reduce the cost of passive data, increasing the amount of storage per node.

We recommend that you consider gp3 as the best Amazon EBS option for price/performance and flexibility. In this post, I discuss the basics of gp3 and various cost-saving use cases. Migrating from previous generation storage (gp2, PIOPS, and magnetic) volumes to the latest generation gp3 volumes allows you to reduce monthly storage costs and optimize instance utilization.

Comparing gp2 and gp3

gp3 is the successor to the general purpose SSD gp2 volume. The key benefits of gp3 include higher baseline performance, 9.6% lower cost, and the ability to provision higher performance regardless of volume. The following table summarizes the key differences between gp2 and gp3.

Volume type gp3 gp2
Volume size Depends on instance type. Max OpenSearch Service supports 24 TiB for R6g.12Xlarge. For the latest instance limits, see Amazon OpenSearch Service quotas. Depends on instance type. Max OpenSearch Service supports 12 TiB for R6g.12Xlarge.
Baseline IOPS 3,000 IOPS for volume size up to 1,024 GiB. For volumes above 1,024 GiB, you get 3 IOPS/GiB, without burst credit complexity. 3 IOPS/GiB (minimum 100 IOPS) to a maximum of 16,000 IOPS. Volumes smaller than 1 TiB can also burst up to 3,000 IOPS.
Max IOPS/volume 16,000 16,000
Baseline throughput 125 MiB/s free for volume size up to 170 GiB, or 250 MiB/s free for volume above 170 GiB. Between 125 MiB/s and 250 MiB/s, depending on the volume size.
Max throughput/volume 1,000 MiB/s 250 MiB/s
Price for us-east-1 Region
  • Storage – $0.122/GB-month.
  • IOPS – 3,000 IOPS free for volumes up to 1,024 GiB, or 3 IOPS/GiB free for volumes above 1,024 GiB. $0.008/provisioned IOPS-month over free limits.
  • Throughput – 125 MiB/s free for volumes up to 170 GiB, or +250 MiB/s free for every 3 TiB for volumes above 170 GiB. $0.064/provisioned MiB/s-month over free limits.
  • Storage – $0.135/GB-month.
  • IOPS and throughput provisioning not allowed.
Instance supported T3, C5, M5, R5, C6g, M6g, and R6g T2, C4, M4, R4, T3, C5, M5, R5, C6g, M6g, and R6g

Lower your monthly bills with gp3

The ability to provision IOPS and throughput independent of volume size and support for denser (twice as large) volume sizes are two significant advantages of gp3 adoption. Together, these benefits enable multiple use cases to lower your monthly bills. In this section, we present a few examples of pricing comparisons for OpenSearch domains.

gp2 vs. gp3

This is the most common scenario, in which existing gp2 customers switch to gp3 and immediately begin saving 9.6% due to the lower monthly price per GB for gp3 storage. You can also benefit from the fact that gp3 supports volume sizes two times larger for the R5, R6g, M5, and M6g instance families. This means that you don’t need to spin up new instances for denser storage requirements and can achieve higher storage on the same instance. OpenSearch Service currently supports a maximum of 24 TiB of gp3 storage on R6g.12Xlarge instances.

PIOPS (io1) vs. gp3

OpenSearch Service supports the PIOPS SSD (io1) EBS volume type. You can switch to gp3 and provision additional IOPS and throughput to meet your specific performance requirements. The following table compares the monthly cost of PIOPS (io1) and gp3 storage with R5.large.search instances for storage requirements of 6 TiB and 16000 IOPS. In this example, you would save 65% with gp3 adoption.

. PIOPS (io1) gp3
Instance cost

6 instances * $0.186/hr = $830/month

(r5.large.search can support up to 1 TiB storage for io1; to support 6 TiB we require six instances.)

3 instances * $0.167Hr = $372/month

(r6g.large.search can support up to 2 TiB storage for gp3; to support 6 TiB we require three instances.)

Storage cost (6 TiB)

6,597 GB * $0.169/GB-month = $1115/month

Notes:
(a) Price for PIOPS(io1) is $0.169 per GB/month.
(b) 6TiB = 6597 GB

6,597 GB * $0.122/GB-month = $805/month

Notes:
(a) Price for gp3 storage is $0.122 per GB/month.
(b) 6TiB = 6597 GB

PIOPS cost (16000 PIOPS)

16000 IOPS * $0.088/IOPS-month = $1408/month

Note: io1 PIOPS rate is $0.088 per IOPS-month.

18,000 IOPS is included in the price for 6 TiB volume of gp3; you don’t need to pay.

Note: 3 IOPS/ GiB Storage IOPS inlcued in price.

Total monthly bills $3,353/month $1,177/month

I3 vs. gp3

I3 instances include Non-Volatile Memory Express (NVMe) SSD-based instance storage optimized for low latency, very high random I/O performance, and high sequential read throughput, and delivers high IOPS. However, I3 uses older third-generation CPUs, and the largest storage supported size is 15 TiB with i3.16xlarge.search instance. You should consider using the largest generation instances such as R6g with gp3 storage to get lower cost and better performance over I3 instances.

To comprehend the cost advantage, let’s compare I3 and gp3 for 12 TiB of data storage needs. By switching to gp3 along with the current generation of instances, you can reduce your monthly bills by 56%, according to the calculations in the following table.

. I3.4xlarge gp3 with R6g.xlarge
On-demand instance cost for us-east-1 Region

4 instances * $1.99/hr = $5,922/month

Note: I3.4xlarge.search supports up to 3.8 TiB, so we require four instances to manage 12 TiB storage. Instance cost is $1.99/hr.

4 instances * $0.335/hr = $996/month

Note: R6g.xlarge.search supports up to 3 TiB with gp3, so we require four instances to manage 12 TiB. Instance cost is $0.335/hr.

Storage cost (12 TiB) N/A (included in instance price)

13,194 GB * $0.122/GB-month = $1,610/month

Notes:
(a) 12 TiB = 13,194 GB
(b) Storage cost is $0.122 per GB / month

Total monthly bills $5,922/month $2,606/month

UltraWarm vs. gp3

UltraWarm is designed to provide inexpensive access to infrequently accessed data, such as logs older than 30 days. Warm storage is useful for indexes that aren’t actively being written to, are queried less frequently, and don’t require high performance. If you have large and query-intensive workloads and are attempting to use UltraWarm to optimize costs but encountering higher query volumes than it can handle, you should consider moving some of the data volume to hot nodes with gp3 storage. UltraWarm will remain the least expensive option for your warm data (less-frequently accessed) type use cases, but you shouldn’t use it for hot data use cases. A combination of low-cost gp3 storage and denser instances can help you achieve cost-optimized higher performance for hot data.

The following table shows the monthly costs associated with running a 30 TiB UltraWarm workload, along with a comparison to the potential monthly costs of gp2 and gp3. With gp3, you can save up to 36% compared to gp2. Please note that UltraWarm setup does require hot data nodes; however, we excluded them in the UltraWarm column to focus on UltraWarm replacement costs with hot data nodes using gp2 and gp3.

. UltraWarm All Hot (gp2 with R6g.8xlarge) All Hot (gp3 with R6g.8xlarge)
Instance cost (On-demand)

2 UW large instances * $2.68/hr = $3,987/month

Note: ultrawarm1.large.search supports max 20 TiB, so we need two instances.

4 instances * $2.677/hr = $7,966/month

Note: r6g.8xlarge.search supports max 8 TiB with gp2, so we require four instances.

2 Instances * $2.677/hr= $3,984/month

Note: r6g.8xlarge.search supports max 16 TiB with gp3, so we only require two instances.

Storage cost (30 TiB)

32,985 GB * $0.024/GB-month = $792/month

Notes:
(1) Storage price is $0.024/per GB/month).
(2) 30 TiB = 32985 GB

32,985 GB * $0.135/GB-month = $4,453/month

Notes:
(1) Storage price is $0.135 per GB/month.
(2) 30 TiB = 32985 GB

32,985 GB * $0.122/GB-month = $4,024/month

Notes:
(1) Storage price is $0.122 per GB/month.
(2) 30 TiB = 32985 GB

Total Monthly Bills $4,779/month $12,419/month $8,008/month

All the preceding use cases are from a cost perspective. Before making any changes to the production environment, we recommend validating performance in a test environment for your unique workload and ensuring that configuration changes don’t result in performance degradation.

Optimize instance cost with gp3’s denser storage

OpenSearch Service increased the maximum volume size supported per instance for gp3 by 100% when compared to gp2 for the R5, R6g, M5, and M6g instance families due to gp3’s improved baseline performance. You can optimize your instance needs by taking advantage of the increased storage per instance volume. For example, R6g.large supports up to 2 TiB with gp3, but only 1 TiB with gp2. If you require support for 12 TiB of data storage, you can reconfigure your domains from six data nodes to three R6g.large in order to reduce your instance costs. For OpenSearch EBS instance-specific volume limits, refer to EBS volume size quotas.

Upgrade from gp2 to gp3

To use the EBS gp3 volume type, you must first upgrade your domain’s instances to supported instance types if they don’t already support gp3. For a list of OpenSearch Service supported instances, see EBS volume size quotas. The transition from gp2 to gp3 is seamless. You can upgrade domain configurations from existing EBS volume types such as gp2, Magnetic, and PIOS (io1) to gp3 through OpenSearch Service console or the UpdatedomainConfig API. The configuration change will initiate blue/green deployment, which runs in the background without impacting your online traffic and, depending on the data size, is complete in a few hours. Blue/green deployments run in the background, ensuring that your online traffic is uninterrupted and preventing data loss.

gp3 baseline performance, and additional provisioning limits

One of the gp3’s key features is the ability to scale IOPS and throughput independent of volume. When your application requires more performance, you can scale up to 16,000 IOPS and 1,000 MiB/s throughput for an additional fee. OpenSearch Service EBS gp3 delivers a baseline performance of 3,000 IOPS and 125 MiB/s throughput at any volume size. In addition, OpenSearch Service provisions additional IOPS and throughput for larger volumes to ensure optimal performance. For volumes above 1,024 GiB, you receive 3 IOPS/GiB, and for volumes above 170 GiB, you get an incremental 250 MiB/s for every 3 TiB of storage.

The following table outlines OpenSearch Service baseline IOPS and throughput, as well as the maximum amount you can provision. Note that your instance type may have additional limitations regarding how much and for how long it can support these performance baselines in a 24-hour period. For more information about instances and their limits, refer to Amazon EBS-optimized instances.

Additional performance customers can provisions

.. Baseline (included in storage price) Additional performance customers can provision
Volume Storage (in GiB) IOPS throughput (MiB/s) IOPS throughput (MiB/s)
170 3,000 125 13,000 875
172 3,000 250 13,000 750
1,024 3,000 250 13,000 750
1,025 3,075 250 12,925 750
3,000 9,000 250 7,000 750
3,001 9,003 500 6,997 500
6,000 18,000 500 NA 500
6,001 18,003 750 NA 250
9,001 27,003 1,000 NA NA
24,000 72,000 2,000 NA NA

Do you need additional performance?

In the majority of use cases, you don’t need to provision additional IOPS and throughput, and gp3 baseline performance should suffice. You can use Amazon CloudWatch metrics to find the usage patterns, and if you observe current limits of IOPS and throughput bottlenecking your index and query performance, you should provision additional performance. For more information, refer to EBS volume metrics.

Conclusion

This post explains how OpenSearch Service general purpose SSD gp3 volumes can significantly reduce monthly storage and instance costs, making them more cost-effective than gp2 volumes. Migration to gp3 volumes with the same size and performance configurations as gp2 is the quickest and simplest way to reduce costs. Additionally, you should also consider reducing instance costs by taking advantage of gp3’s support for denser storage per data node.

For more details, check out Amazon OpenSearch Service pricing and Configuration API reference for Amazon OpenSearch Service.


About the author

Siddhant Gupta is a Sr. Technical Product Manager at Amazon Web Services based in Hyderabad, India. Siddhant has been with Amazon for over five years and is currently working with the OpenSearch Service team, helping with new region launches, pricing strategy, and bringing EC2 and EBS innovations to OpenSearch Service customers . He is passionate about analytics and machine learning. In his free time, he loves traveling, fitness activities, spending time with his family and reading non-fiction books.