All posts by Siddhant Gupta

Introducing Cluster insights: Unified monitoring dashboard for Amazon OpenSearch Service clusters

2025-11-21 Siddhant Gupta

Post Syndicated from Siddhant Gupta original https://aws.amazon.com/blogs/big-data/introducing-cluster-insights-unified-monitoring-dashboard-for-amazon-opensearch-service-clusters/

Amazon OpenSearch Service clusters offer a wealth of operational metrics accessible through CloudWatch and the Amazon OpenSearch Service console to support effective performance monitoring and alert creation. Yet, pinpointing resiliency and performance challenges within your cluster can prove daunting. The process of identifying resource-intensive queries or understanding performance degradation trends can be time-consuming.

To address these challenges, we launched Cluster insights, which presents a unified dashboard delivering curated insights along with actionable mitigation steps. The dashboard displays detailed metrics at the node, index, and shard levels, coupled with a concise summary of security and resiliency best practices to uphold peak resiliency and availability.

This post guides you through setting up and using Cluster insights, including its key features and metrics. By the end, you’ll understand how to use Cluster insights to recognize and address performance and resiliency issues within your OpenSearch Service clusters.

Getting Started with Cluster insights

Cluster insights is available at no additional cost to OpenSearch Service users running OpenSearch version 2.17 or later. Accessing Cluster insights requires admin-level permissions for your OpenSearch domain. Cluster insights is available only through the OpenSearch UI, which supports multiple data sources, zero-downtime upgrades for your dashboard experience, and curated workspaces for effective team collaboration. You first need to associate a data source (your clusters) with an OpenSearch UI application. Detailed steps are described in the user guide. Your OpenSearch UI console experience will look like the following screenshots.

To access Cluster insights using the OpenSearch UI application:

In the Amazon OpenSearch Service console, navigate to OpenSearch UI (Dashboards) and choose the Application URL to access your OpenSearch UI application.
In the OpenSearch UI application, choose the settings icon in the bottom-left corner, then choose Data administration.
On the Data administration overview page, or under Manage data in the left navigation, select Cluster insights.

Cluster insights overview

The Cluster insights – Overview acts as a landing page to show health and insights for all connected OpenSearch domains. It is organized into five sections:

Current cluster status – Displays cluster health status (Green, Yellow, and Red) in a donut chart.
Insights trend – Tracks issue patterns over the past 30 days, helping you identify emerging problems and track resolution progress. This trend analysis becomes particularly valuable when monitoring the impact of operational changes or troubleshooting recurring issues.
Current open insights – Shows the count and severity breakdown of currently active insights across your clusters.
OpenSearch Service clusters – Lists all domains with their vital statistics such as health status, insights count, nodes, shards, and active queries.
Top insights by severity – Prioritizes issues that need immediate attention. Each insight comes with a clear description and specific recommendations, transforming complex monitoring data into actionable tasks. This prioritized view helps teams focus on critical issues first, whether they’re addressing shard size problems, disk space issues, or performance bottlenecks.

Together, these sections provide a comprehensive view of your OpenSearch Service infrastructure so you can assess cluster health, identify trends, and take action on critical issues from a single dashboard.

Cluster health

When you choose a specific cluster from the OpenSearch domains on the Cluster insights – Overview page, you will see cluster-specific details including health status, active insights, and performance metrics. The overview section displays cluster health along with essential metrics including the count of shards, nodes, and indices, and the total document size. You can also review the configuration best practices followed by the domain across resiliency and security areas.

The lower section contains a table of actionable insights that presents a detailed view of current issues. This table mirrors the insights from the landing page but focuses specifically on issues affecting the selected cluster. You can observe high-severity issues such as low disk space and shard count problems, as well as medium-severity concerns that may impact cluster performance.

Each insight entry serves as an interactive element – selecting any issue reveals an in-depth analysis complete with root cause identification and specific remediation steps. The table includes important metadata such as generation timestamps, severity levels, recommendation counts, and current status, so users can prioritize and address issues effectively.

Insight details

Every insight offers detailed analysis and actionable recommendations. Take the Shard Count insight as an example: selecting it reveals a comprehensive breakdown of the issue. You’ll see that your OpenSearch cluster has breached the number of shards allowed on the nodes based on its JVM heap size, along with a detailed list of affected resources.

The detailed view includes a resource map that precisely identifies each impacted node and index, displaying critical information such as node IDs, shard counts, and the indices contributing to the issue.

The recommendations are organized into two levels: cluster-level recommendations address overall architecture improvements, such as scaling your cluster or adjusting global shard allocation settings. Index-level recommendations provide specific actions for individual indices—for example, you might see suggestions to move idle shards to UltraWarm storage. These are shards without any search or indexing operations for the last 10 days and are at least 5 days old, making them ideal candidates for warm storage to reduce the active shard count. All of this guidance is available directly within the Cluster insights interface, eliminating the need to switch between different tools or consoles.
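The idle-shard rule quoted above can be written as a small predicate. This is only a sketch of the rule as stated in the text: the ShardActivity record and its field names are hypothetical and are not a Cluster insights data structure.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Thresholds quoted in the text above.
IDLE_WINDOW = timedelta(days=10)  # no search or indexing operations for this long
MIN_AGE = timedelta(days=5)       # shard must be at least this old


@dataclass
class ShardActivity:
    """Hypothetical record used only for this illustration."""
    index: str
    created_at: datetime
    last_activity: Optional[datetime]  # None means no search/indexing was ever seen


def is_idle(shard: ShardActivity, now: datetime) -> bool:
    """Return True when the shard matches the idle rule described above."""
    old_enough = now - shard.created_at >= MIN_AGE
    quiet = shard.last_activity is None or now - shard.last_activity >= IDLE_WINDOW
    return old_enough and quiet


now = datetime(2025, 11, 21)
logs = ShardActivity("logs-2025-10", datetime(2025, 10, 1), datetime(2025, 11, 1))
fresh = ShardActivity("logs-2025-11", datetime(2025, 11, 18), None)
print(is_idle(logs, now), is_idle(fresh, now))
```

Shards that pass this check are the ones the recommendation would suggest moving to UltraWarm storage.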

Node, Index, Shard, and Query view

Next to cluster health, you can review Node, Index, Shard, and Query details for a specific cluster. These views present critical metrics such as resource utilization (CPU, memory, disk) and search and indexing latency.

Node view

The Node view tab provides a comprehensive view of individual node performance across your cluster. This table displays critical metrics for each node including heat score indicating overall node health, resource utilization (CPU, memory, disk), search and indexing latency and rates, along with quick links to view top N shards and queries running on each node.

This view helps you identify nodes experiencing high resource utilization or performance degradation. You can drill deeper into each node by clicking on the node ID to view detailed time-based metrics showing resource usage trends over time. Additionally, you can click the top N shards link to navigate directly to the Shard View, automatically filtered to show only the shards running on the selected node, allowing you to pinpoint which specific shards are contributing to performance issues.

Index view

The Index view tab shows performance metrics aggregated at the index level. For each index, you can monitor document count and storage size, search latency and rate, indexing latency and rate, and access top N queries affecting the index. This perspective is valuable for understanding which indices are driving cluster load and identifying optimization opportunities at the index configuration level.

Shard view

The Shard view tab offers the most granular view of cluster performance by displaying metrics for individual shards. Each row shows shard ID and its assigned node, index association and resource pressure metrics (CPU, memory), along with search and indexing latency per shard. This detailed view enables you to pinpoint specific shards causing performance issues, identify shard placement imbalances, and take targeted remediation actions.

Query view

The Query view on the Cluster insights page presents live dashboards that break down execution stats, CPU and memory usage, and completion progress for every query. This helps you monitor which queries are driving the biggest resource consumption (the Top-N queries). With intuitive donut charts and scoreboards showing distribution by node, index, and user, this interface helps operators quickly pinpoint performance bottlenecks and heavy workloads, supporting targeted optimization and confident scaling decisions.

Query insights

In addition to Cluster insights, you can use Query insights to view the exact queries running and their latencies across the Expand, Query, and Fetch phases, which helps search developers further fine-tune their queries.

Conclusion

Cluster insights transforms OpenSearch Service cluster management from reactive troubleshooting to proactive optimization. By providing unified dashboards with heat scores and best practices across stability, resiliency, and security pillars, it offers visibility into your search infrastructure at the account level.

The actionable recommendations and step-by-step remediation guidance help users of all experience levels effectively resolve complex issues like shard imbalances and resource bottlenecks.

The integration with Query insights delivers real-time visibility into resource consumption patterns so that teams can identify and optimize performance-critical queries through detailed profiling and latency analysis.

For more information, see the Amazon OpenSearch Service User Guide.

Track Amazon OpenSearch Service configuration changes more easily with new visibility improvements

2024-02-06 Siddhant Gupta

Post Syndicated from Siddhant Gupta original https://aws.amazon.com/blogs/big-data/track-amazon-opensearch-service-configuration-with-improved-visibility/

Amazon OpenSearch Service offers multiple domain configuration settings to meet your workload-specific requirements. As part of standard service operations, you may be required to update these configuration settings on a regular basis. Recently, Amazon OpenSearch Service launched visibility improvements that allow you to track configuration changes more effectively. We’ve introduced granular and more descriptive configuration statuses that enable you to set up alarms and use them in automation to minimize manual monitoring.

We recommend that you take advantage of these visibility improvements in your applications. These changes are backward compatible: if your automations rely on the legacy processing parameter to determine configuration change status, they should continue to work without disruption. To simplify tracking of multiple in-flight configuration change requests, Amazon OpenSearch Service allows a configuration change request only when Domain Processing Status is Active. Additional details are in the section ‘Single configuration change at a time’.

Solution overview

Earlier, configuration change status visibility was available through processing parameters in the OpenSearch Service APIs (Application Programming Interface), and as a Domain Status field in the OpenSearch Service console. We have now introduced the following changes to improve the configuration update experience:

Introduced two new parameters, DomainProcessingStatus and ConfigChangeStatus, in the API responses. Similarly, added Domain Processing Status and Configuration Change Status fields in the console. These changes provide better visibility through multiple, intuitive statuses. Earlier statuses were limited to only two values: Active and Processing.
Ability to easily compare active and in-flight configurations for clarity. Earlier, it required multiple steps.
Amazon OpenSearch Service has now adopted the approach of allowing a single configuration change request at a time. There is no limit on the number of domain configuration changes you can bundle in a single request. However, you can submit the next configuration request when the previous request is complete and the domain processing status becomes Active. This improvement streamlines configuration updates and addresses previous challenges of tracking multiple, in-flight configuration change requests.
Ability to cancel a change request in case of a validation failure. Previously, when instances were unavailable, domains remained in processing state. Now, upon encountering any validation failure, you can cancel the change request and retry after some time.
Domain processing status turns to Active only after all the background activities, including shard movement, are complete. This means that you can confidently use the newly introduced statuses in your automation scripts without needing to infer whether all the internal processes, such as data movement, are complete.
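As a rough illustration of how an automation script might rely on the new status, the following sketch polls DescribeDomain until DomainProcessingStatus is Active. It assumes a client that behaves like boto3.client("opensearch") and that the status sits under the DomainStatus key of the response; the function name, polling interval, and timeout are illustrative choices, not part of the service.

```python
import time


def wait_until_active(client, domain_name, poll_seconds=30.0,
                      timeout_seconds=4 * 3600.0, sleep=time.sleep):
    """Poll DescribeDomain until DomainProcessingStatus is "Active".

    `client` is assumed to behave like boto3.client("opensearch"); the
    response shape read below is an assumption of this sketch.
    Returns every status observed, ending with "Active".
    """
    observed = []
    waited = 0.0
    while True:
        response = client.describe_domain(DomainName=domain_name)
        status = response["DomainStatus"]["DomainProcessingStatus"]
        observed.append(status)
        if status == "Active":
            return observed
        if status in ("Deleting", "Isolated"):
            # Waiting is unlikely to help here, so surface the state to the caller.
            raise RuntimeError(f"{domain_name} is {status}; not waiting for Active")
        if waited >= timeout_seconds:
            raise TimeoutError(f"{domain_name} still {status} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

A script would call wait_until_active(client, "my-domain") before submitting its next configuration change. The sleep parameter exists only so the loop can be exercised without real delays.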

How do you get granular details to track the configuration update status?

As part of recent improvements, Amazon OpenSearch Service introduced DomainProcessingStatus and ConfigChangeStatus parameters in the APIs along with the respective Domain Processing Status and Configuration Change Status fields in the console. You can rely on these statuses to get accurate and consistent information during different configuration change scenarios, like when configuration changes involve blue/green operations or without blue/green operations, and when configuration changes are triggered by the operator or by the OpenSearch Service. Let us explore these enhanced visibility experiences.

Domain processing status visibility: You can track the status of domain-level configuration changes through the Domain Processing Status field in the console. Similarly, API responses include the DomainProcessingStatus parameter. The values and a brief description of each are as follows:
1. Active: No configuration change is in progress. You can submit a new configuration change request.
2. Creating: New domain creation is in progress.
3. Modifying: This status indicates that one or more configuration changes, such as new data node addition, Amazon Elastic Block Store (Amazon EBS) GP3 storage provisioning, or setting up KMS keys, are in progress. In other words, changes made through the UpdateDomainConfig API set the status to Modifying. The ‘Modifying’ status also covers situations where domains require shard movement to complete configuration changes. Note: For backward compatibility, we have kept the behavior of the processing parameter unchanged in the API responses, and it is set to false as soon as the core configuration changes are complete, without waiting for shard movement completion.
4. Upgrading Engine Version: Engine version upgrades are in progress, such as from Elasticsearch version 7.9 to OpenSearch version 1.0.
5. Updating Service Software: This status refers to configuration changes related to service software updates.
6. Deleting: Domain deletion is progressing.
7. Isolated: This represents domains that are suspended due to a variety of reasons, such as account-related billing issues or domains that are not compliant with critical security patch updates.
Configuration change status visibility: Configuration changes can be initiated by the user (e.g., new data node addition, instance type change) or by the service (e.g., AutoTune and mandatory service software updates). You can find the latest status details through the Configuration Change Status field in the console, and through the ConfigChangeStatus parameter in API responses. Below are the values and a brief description of each:
1. Pending: Indicates that a configuration change request has been submitted.
2. Initializing: Service is initializing a configuration change request.
3. Validating: Service is validating the requested changes and resources required.
4. Validation Failed: Requested changes failed validation. At this point, no configuration changes are applied. Some possible validation failures could be the presence of red indices in the domain, unavailability of a chosen instance type, and low disk space. Here is a list of potential validation failures. During a validation failure event, you can cancel, retry, or edit configuration changes.
5. Awaiting user inputs: Covers scenarios where the user may be able to fix validation errors, such as an invalid KMS key. In this status, the user can edit the configuration changes.
6. Applying changes: Service is applying requested configuration changes.
7. Cancelled: While a request is in the Validation Failed status, you can either choose the Cancel button in the console or call the CancelDomainConfigChange API. All the applied changes that were part of the change request will be rolled back.
8. Completed: Requested configuration changes have been successfully completed.
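For automation, the configuration change statuses above reduce to a small lookup of what can be done next. The sketch below keys on the console labels exactly as listed; the action strings are shorthand invented for this example, and the values returned by the API in ConfigChangeStatus may be spelled differently from the console labels.

```python
# Console labels as listed above, mapped to an illustrative next step.
NEXT_STEP = {
    "Pending": "wait",
    "Initializing": "wait",
    "Validating": "wait",
    "Validation Failed": "cancel, retry, or edit",
    "Awaiting user inputs": "edit",
    "Applying changes": "wait",
    "Cancelled": "fix and resubmit",
    "Completed": "done",
}


def next_step(console_label: str) -> str:
    """Look up the next step for a status label, ignoring case and extra spaces."""
    key = " ".join(console_label.split()).casefold()
    for label, action in NEXT_STEP.items():
        if label.casefold() == key:
            return action
    raise ValueError(f"unknown configuration change status: {console_label!r}")


print(next_step("Validation Failed"))
```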

Console enhancements

The Amazon OpenSearch Service console offers enhanced visibility to track configuration change progress. Below are a few screenshots to give you an idea of these improvements.

Amazon OpenSearch Service console provides Domain Processing Status, Configuration Change Status, and Change ID fields. Note: To know the change details associated with the Change ID, you can use the DescribeDomainChangeProgress API.

Configuration change summary. To see a side-by-side comparison of your active configurations and requested changes, on the domain detail page, navigate to the cluster configuration tab and scroll down to the configuration change summary section. The Pending Changes field shows the status of the pending properties at that time and does not include changes that have been applied. You can also get similar details from the DescribeDomain and DescribeDomainConfig APIs through the ModifyingProperties parameter.
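A similar side-by-side view can be built from the ModifyingProperties parameter. The sketch below assumes each entry carries Name, ActiveValue, and PendingValue keys, as in the boto3 response for DescribeDomain; the sample property and values are made up for illustration.

```python
def pending_changes(domain_status):
    """Return (name, active value, pending value) for each in-flight property."""
    rows = []
    for prop in domain_status.get("ModifyingProperties", []):
        rows.append((prop["Name"], prop.get("ActiveValue", ""), prop.get("PendingValue", "")))
    return rows


def format_side_by_side(rows):
    """Render the rows as an aligned plain-text table."""
    table = [("Property", "Active", "Pending")] + [tuple(str(c) for c in r) for r in rows]
    widths = [max(len(row[i]) for row in table) for i in range(3)]
    lines = []
    for row in table:
        lines.append("  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip())
    return "\n".join(lines)


# Hypothetical DomainStatus fragment, for illustration only.
sample = {
    "ModifyingProperties": [
        {"Name": "ClusterConfig.InstanceCount", "ActiveValue": "3",
         "PendingValue": "6", "ValueType": "PLAIN_TEXT"},
    ]
}
print(format_side_by_side(pending_changes(sample)))
```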

Cancelling during validation failure. In the below screenshots, you can see a new option to cancel a change request when a configuration change request fails validation. For example, when you encounter a SubnetNotFound error, you can use the Cancel request button to roll back to the previous active configuration, fix the issue, and then retry the configuration update.

Single configuration change at a time

Previously, it was not straightforward to track the success and failure of individual change requests when several requests were made. To provide a simplified experience, OpenSearch Service now limits you to a single change request at a time. In a single configuration change request, you can bundle multiple changes at once. Once a configuration change request is submitted, it must be completed before you can request the next configuration change through the console or through the UpdateDomainConfig API. This simplified experience makes it easier to keep track of changes that have been requested and their most recent status. If your automation is written to call configuration change update APIs multiple times, then it should be updated to group multiple configuration changes in a single update call, or wait for individual updates to complete before you submit the next configuration change. You can update the domain configuration when the domain processing status becomes Active. For a list of changes that might need a blue/green deployment, please see here.
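One way to adapt automation that used to issue several update calls is to merge the individual changes into one request before calling UpdateDomainConfig. The following is a minimal sketch of that merge; bundle_changes is a hypothetical helper, and the domain name and values are placeholders.

```python
def bundle_changes(domain_name, *changes):
    """Merge several partial UpdateDomainConfig payloads into one request.

    Each change is a dict keyed by a top-level UpdateDomainConfig parameter
    (for example "ClusterConfig" or "EBSOptions"). Later changes win when
    two changes set the same nested key. The inputs are not modified.
    """
    request = {"DomainName": domain_name}
    for change in changes:
        for section, settings in change.items():
            if isinstance(settings, dict) and isinstance(request.get(section), dict):
                request[section] = {**request[section], **settings}
            elif isinstance(settings, dict):
                request[section] = dict(settings)
            else:
                request[section] = settings
    return request


request = bundle_changes(
    "my-domain",  # placeholder domain name
    {"ClusterConfig": {"InstanceCount": 6}},
    {"EBSOptions": {"EBSEnabled": True, "VolumeType": "gp3", "VolumeSize": 512}},
    {"ClusterConfig": {"InstanceType": "r6g.large.search"}},
)
# With a real client this becomes a single call, for example:
#   boto3.client("opensearch").update_domain_config(**request)
print(request)
```

Sending one merged request keeps the automation within the single-change-at-a-time rule instead of having later calls rejected while the first is still in progress.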

The below screenshot shows an example alert on the ‘Edit domain’ page informing the user that another change or update is in progress. OpenSearch Service no longer allows you to submit new configuration update requests, and the ‘Apply change’ button is disabled until the change in progress is completed.

API changes

You can use the DescribeDomain, DescribeDomainChangeProgress, and DescribeDomainConfig APIs to get detailed configuration update statuses. In addition, you can use CancelDomainConfigChange to cancel the change request in the event of a validation failure. You can refer to the Amazon OpenSearch Service API documentation here.
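The sketch below shows how these APIs might be combined to cancel a request only when it has failed validation. It assumes a client that behaves like boto3.client("opensearch"), and the field names ChangeProgressDetails and CancelledChangeIds and the status value "ValidationFailed" are assumptions about the response shape that should be checked against the API documentation.

```python
def cancel_if_validation_failed(client, domain_name):
    """Cancel the in-flight change only when it is in a validation-failed state.

    Returns the list of cancelled change IDs (empty if nothing was cancelled).
    Field names and the status string are assumptions of this sketch.
    """
    status = client.describe_domain(DomainName=domain_name)["DomainStatus"]
    progress = status.get("ChangeProgressDetails", {})
    if progress.get("ConfigChangeStatus") != "ValidationFailed":
        return []
    result = client.cancel_domain_config_change(DomainName=domain_name)
    return result.get("CancelledChangeIds", [])
```

After a successful cancel, the script can fix the underlying problem and resubmit the change once the domain processing status is Active again.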

Conclusion

In this post, we showed you how to get granular information about a configuration update request. These newly introduced changes will help you gain better visibility into the progress of configuration change requests, and easily distinguish between applied changes and pending ones. You need to ensure that the DomainProcessingStatus value is Active before submitting configuration change requests. The ability to cancel changes in the event of validation failures gives you better control in getting your domain out of the processing state in a self-service manner. Visit the product documentation to learn more.

About the Authors

Siddhant Gupta is a Sr. Technical Product Manager at Amazon Web Services based in Hyderabad, India. Siddhant has been with Amazon for over six years and is currently working with the OpenSearch Service team, helping with new region launches, pricing strategy, and bringing EC2 and EBS innovations to OpenSearch Service customers. He is passionate about analytics and machine learning. In his free time, he loves traveling, fitness activities, spending time with his family and reading non-fiction books.

Deniz Ercelebi is a Sr. UX Designer at Amazon OpenSearch Service. In her role she contributes to the creation, implementation, and successful delivery of design solutions for complex problems. Her personal drive is fueled by a passion for user experience, a dedication to customer-centric solutions, and a firm belief in collaborative innovation.

Shashank Gupta is a Sr. Software Developer at Amazon OpenSearch Service, specializing in the enhancement of the Managed service aspect of the platform. His primary focus is on optimizing the managed experience, spanning from the console to APIs and resource provisioning in an efficient manner. With a dedicated commitment to innovation, Shashank aims to elevate the overall customer experience by introducing inventive solutions within the service.

Lower your Amazon OpenSearch Service storage cost with gp3 Amazon EBS volumes

2022-11-24 Siddhant Gupta

Post Syndicated from Siddhant Gupta original https://aws.amazon.com/blogs/big-data/lower-your-amazon-opensearch-service-storage-cost-with-gp3-amazon-ebs-volumes/

Amazon OpenSearch Service makes it easy for you to perform interactive log analytics, real-time application monitoring, website search, and more. OpenSearch is an open-source, distributed search and analytics suite comprising OpenSearch, a distributed search and analytics engine, and OpenSearch Dashboards, a UI and visualization tool. When you use Amazon OpenSearch Service, you configure a set of data nodes to store indexes and serve queries. The service supports instance types for data nodes with different storage options. Some supported Amazon Elastic Compute Cloud (Amazon EC2) instance types, like the R6GD or I3, have local NVMe disks. Others use Amazon Elastic Block Store (Amazon EBS) storage.

In July 2022, OpenSearch Service launched support for the next generation, general purpose SSD (gp3) EBS volumes. OpenSearch Service data nodes require low latency and high throughput storage to provide fast indexing and queries. With gp3 EBS volumes, you get higher baseline performance (IOPS and throughput) at a 9.6% lower cost than with the previously offered gp2 EBS volume type. You can provision additional IOPS and throughput independent of volume size using gp3. gp3 volumes are also more stable because they don’t use burst credits. OpenSearch support for gp3 volumes includes doubling the limit on per-data node volume sizes. With these larger volumes, you can reduce the cost of passive data by increasing the amount of storage per node.

We recommend that you consider gp3 as the best Amazon EBS option for price/performance and flexibility. In this post, I discuss the basics of gp3 and various cost-saving use cases. Migrating from previous generation storage (gp2, PIOPS, and magnetic) volumes to the latest generation gp3 volumes allows you to reduce monthly storage costs and optimize instance utilization.

Comparing gp2 and gp3

gp3 is the successor to the general purpose SSD gp2 volume. The key benefits of gp3 include higher baseline performance, 9.6% lower cost, and the ability to provision higher performance regardless of volume size. The following table summarizes the key differences between gp2 and gp3.

Volume type	gp3	gp2
Volume size	Depends on instance type. Max OpenSearch Service supports 24 TiB for R6g.12Xlarge. For the latest instance limits, see Amazon OpenSearch Service quotas.	Depends on instance type. Max OpenSearch Service supports 12 TiB for R6g.12Xlarge.
Baseline IOPS	3,000 IOPS for volume size up to 1,024 GiB. For volumes above 1,024 GiB, you get 3 IOPS/GiB, without burst credit complexity.	3 IOPS/GiB (minimum 100 IOPS) to a maximum of 16,000 IOPS. Volumes smaller than 1 TiB can also burst up to 3,000 IOPS.
Max IOPS/volume	16,000	16,000
Baseline throughput	125 MiB/s free for volume size up to 170 GiB, or 250 MiB/s free for volume above 170 GiB.	Between 125 MiB/s and 250 MiB/s, depending on the volume size.
Max throughput/volume	1,000 MiB/s	250 MiB/s
Price for us-east-1 Region	Storage – $0.122/GB-month. IOPS – 3,000 IOPS free for volumes up to 1,024 GiB, or 3 IOPS/GiB free for volumes above 1,024 GiB. $0.008/provisioned IOPS-month over free limits. Throughput – 125 MiB/s free for volumes up to 170 GiB, or +250 MiB/s free for every 3 TiB for volumes above 170 GiB. $0.064/provisioned MiB/s-month over free limits.	Storage – $0.135/GB-month. IOPS and throughput provisioning not allowed.
Instance supported	T3, C5, M5, R5, C6g, M6g, and R6g	T2, C4, M4, R4, T3, C5, M5, R5, C6g, M6g, and R6g

Lower your monthly bills with gp3

The ability to provision IOPS and throughput independent of volume size and support for denser (twice as large) volume sizes are two significant advantages of gp3 adoption. Together, these benefits enable multiple use cases to lower your monthly bills. In this section, we present a few examples of pricing comparisons for OpenSearch domains.

gp2 vs. gp3

This is the most common scenario, in which existing gp2 customers switch to gp3 and immediately begin saving 9.6% due to the lower monthly price per GB for gp3 storage. You can also benefit from the fact that gp3 supports volume sizes two times larger for the R5, R6g, M5, and M6g instance families. This means that you don’t need to spin up new instances for denser storage requirements and can achieve higher storage on the same instance. OpenSearch Service currently supports a maximum of 24 TiB of gp3 storage on R6g.12Xlarge instances.
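The 9.6% figure follows directly from the us-east-1 storage prices in the comparison table. A short sketch of the arithmetic, where the 1,000 GB volume is an arbitrary example size:

```python
GP2_PRICE_PER_GB_MONTH = 0.135  # us-east-1 price from the comparison table
GP3_PRICE_PER_GB_MONTH = 0.122  # us-east-1 price from the comparison table

# Relative saving on the per-GB storage price.
saving_fraction = (GP2_PRICE_PER_GB_MONTH - GP3_PRICE_PER_GB_MONTH) / GP2_PRICE_PER_GB_MONTH


def monthly_storage_saving(storage_gb):
    """Dollars saved per month on storage alone when moving from gp2 to gp3."""
    return storage_gb * (GP2_PRICE_PER_GB_MONTH - GP3_PRICE_PER_GB_MONTH)


print(f"{saving_fraction:.1%} lower price per GB")
print(f"${monthly_storage_saving(1000):.2f} saved per month on 1,000 GB")
```

This covers storage price only; any IOPS or throughput provisioned above the free limits is billed separately.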

PIOPS (io1) vs. gp3

OpenSearch Service supports the PIOPS SSD (io1) EBS volume type. You can switch to gp3 and provision additional IOPS and throughput to meet your specific performance requirements. The following table compares the monthly cost of PIOPS (io1) storage on r5.large.search instances with gp3 storage on r6g.large.search instances for storage requirements of 6 TiB and 16,000 IOPS. In this example, you would save 65% with gp3 adoption.

.	PIOPS (io1)	gp3
Instance cost	6 instances * $0.186/hr = $830/month (r5.large.search can support up to 1 TiB storage for io1; to support 6 TiB we require six instances.)	3 instances * $0.167/hr = $372/month (r6g.large.search can support up to 2 TiB storage for gp3; to support 6 TiB we require three instances.)
Storage cost (6 TiB)	6,597 GB * $0.169/GB-month = $1115/month Notes: (a) Price for PIOPS(io1) is $0.169 per GB/month. (b) 6TiB = 6597 GB	6,597 GB * $0.122/GB-month = $805/month Notes: (a) Price for gp3 storage is $0.122 per GB/month. (b) 6TiB = 6597 GB
PIOPS cost (16000 PIOPS)	16000 IOPS * $0.088/IOPS-month = $1408/month Note: io1 PIOPS rate is $0.088 per IOPS-month.	18,000 IOPS is included in the price for 6 TiB volume of gp3; you don’t need to pay. Note: 3 IOPS/GiB of storage is included in the price.
Total monthly bills	$3,353/month	$1,177/month
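The table can be reproduced with a few lines of arithmetic. The sketch below assumes a 744-hour month, which is inferred from the instance costs in the table rather than stated in the post. Because the table rounds each line item, the computed totals land within a dollar of the printed ones.

```python
HOURS_PER_MONTH = 744  # assumption: 31-day month, inferred from the table's instance costs


def tib_to_gb(tib):
    """Convert binary TiB to decimal GB, rounded to a whole GB as in the table."""
    return round(tib * 2**40 / 10**9)


def monthly_cost(instances, hourly_rate, storage_gb, price_per_gb,
                 iops=0, price_per_iops=0.0):
    """Instance cost plus storage cost plus any separately billed IOPS."""
    instance_cost = instances * hourly_rate * HOURS_PER_MONTH
    storage_cost = storage_gb * price_per_gb
    iops_cost = iops * price_per_iops
    return instance_cost + storage_cost + iops_cost


storage_gb = tib_to_gb(6)
io1_total = monthly_cost(6, 0.186, storage_gb, 0.169, iops=16000, price_per_iops=0.088)
gp3_total = monthly_cost(3, 0.167, storage_gb, 0.122)  # baseline IOPS are included in the price
saving = 1 - gp3_total / io1_total

print(f"io1: ${io1_total:,.0f}/month, gp3: ${gp3_total:,.0f}/month, saving {saving:.0%}")
```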

I3 vs. gp3

I3 instances include Non-Volatile Memory Express (NVMe) SSD-based instance storage optimized for low latency, very high random I/O performance, and high sequential read throughput, and they deliver high IOPS. However, I3 uses older third-generation CPUs, and the largest supported storage size is 15 TiB with the i3.16xlarge.search instance. You should consider using the latest generation instances such as R6g with gp3 storage to get lower cost and better performance than I3 instances.

To understand the cost advantage, let’s compare I3 and gp3 for 12 TiB of data storage needs. By switching to gp3 along with the current generation of instances, you can reduce your monthly bills by 56%, according to the calculations in the following table.

.	I3.4xlarge	gp3 with R6g.xlarge
On-demand instance cost for us-east-1 Region	4 instances * $1.99/hr = $5,922/month Note: I3.4xlarge.search supports up to 3.8 TiB, so we require four instances to manage 12 TiB storage. Instance cost is $1.99/hr.	4 instances * $0.335/hr = $996/month Note: R6g.xlarge.search supports up to 3 TiB with gp3, so we require four instances to manage 12 TiB. Instance cost is $0.335/hr.
Storage cost (12 TiB)	N/A (included in instance price)	13,194 GB * $0.122/GB-month = $1,610/month Notes: (a) 12 TiB = 13,194 GB (b) Storage cost is $0.122 per GB / month
Total monthly bills	$5,922/month	$2,606/month

UltraWarm vs. gp3

UltraWarm is designed to provide inexpensive access to infrequently accessed data, such as logs older than 30 days. Warm storage is useful for indexes that aren’t actively being written to, are queried less frequently, and don’t require high performance. If you have large and query-intensive workloads and are attempting to use UltraWarm to optimize costs but encountering higher query volumes than it can handle, you should consider moving some of the data volume to hot nodes with gp3 storage. UltraWarm will remain the least expensive option for your warm data (less-frequently accessed) type use cases, but you shouldn’t use it for hot data use cases. A combination of low-cost gp3 storage and denser instances can help you achieve cost-optimized higher performance for hot data.

The following table shows the monthly costs associated with running a 30 TiB UltraWarm workload, along with a comparison to the potential monthly costs of gp2 and gp3. With gp3, you can save up to 36% compared to gp2. Please note that UltraWarm setup does require hot data nodes; however, we excluded them in the UltraWarm column to focus on UltraWarm replacement costs with hot data nodes using gp2 and gp3.

.	UltraWarm	All Hot (gp2 with R6g.8xlarge)	All Hot (gp3 with R6g.8xlarge)
Instance cost (On-demand)	2 UW large instances * $2.68/hr = $3,987/month Note: ultrawarm1.large.search supports max 20 TiB, so we need two instances.	4 instances * $2.677/hr = $7,966/month Note: r6g.8xlarge.search supports max 8 TiB with gp2, so we require four instances.	2 Instances * $2.677/hr= $3,984/month Note: r6g.8xlarge.search supports max 16 TiB with gp3, so we only require two instances.
Storage cost (30 TiB)	32,985 GB * $0.024/GB-month = $792/month Notes: (1) Storage price is $0.024 per GB/month. (2) 30 TiB = 32985 GB	32,985 GB * $0.135/GB-month = $4,453/month Notes: (1) Storage price is $0.135 per GB/month. (2) 30 TiB = 32985 GB	32,985 GB * $0.122/GB-month = $4,024/month Notes: (1) Storage price is $0.122 per GB/month. (2) 30 TiB = 32985 GB
Total Monthly Bills	$4,779/month	$12,419/month	$8,008/month

All the preceding use cases are from a cost perspective. Before making any changes to the production environment, we recommend validating performance in a test environment for your unique workload and ensuring that configuration changes don’t result in performance degradation.

Optimize instance cost with gp3’s denser storage

OpenSearch Service increased the maximum volume size supported per instance for gp3 by 100% when compared to gp2 for the R5, R6g, M5, and M6g instance families due to gp3’s improved baseline performance. You can optimize your instance needs by taking advantage of the increased storage per instance. For example, R6g.large supports up to 2 TiB with gp3, but only 1 TiB with gp2. If you require support for 6 TiB of data storage, you can reconfigure your domains from six data nodes to three R6g.large data nodes in order to reduce your instance costs. For OpenSearch EBS instance-specific volume limits, refer to EBS volume size quotas.
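The node count needed for a given amount of storage is a ceiling division by the per-node volume limit. The sketch below uses the R6g.large limits quoted above and looks only at storage capacity; real sizing also depends on CPU, memory, and replica requirements.

```python
import math

# Per-node EBS limits for R6g.large quoted above, in TiB.
MAX_TIB_PER_NODE = {"gp2": 1, "gp3": 2}


def nodes_needed(storage_tib, volume_type):
    """Smallest number of data nodes whose combined volume limit covers the storage."""
    return math.ceil(storage_tib / MAX_TIB_PER_NODE[volume_type])


for tib in (6, 12):
    print(f"{tib} TiB: gp2 needs {nodes_needed(tib, 'gp2')} nodes, "
          f"gp3 needs {nodes_needed(tib, 'gp3')} nodes")
```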

Upgrade from gp2 to gp3

To use the EBS gp3 volume type, you must first upgrade your domain’s instances to supported instance types if they don’t already support gp3. For a list of OpenSearch Service supported instances, see EBS volume size quotas. The transition from gp2 to gp3 is seamless. You can upgrade domain configurations from existing EBS volume types such as gp2, Magnetic, and PIOPS (io1) to gp3 through the OpenSearch Service console or the UpdateDomainConfig API. The configuration change initiates a blue/green deployment, which runs in the background without interrupting your online traffic or losing data and, depending on the data size, completes in a few hours.
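
For the API route, a minimal sketch using the boto3 `opensearch` client's `update_domain_config` call might look like the following. The domain name `my-domain`, the 100 GiB per-node volume size, and the helper function names are hypothetical placeholders; the IOPS and throughput values are the gp3 baseline quoted in this post. Treat it as a starting point under those assumptions, not as a prescribed procedure, and validate it against a test domain first.

```python
def build_gp3_request(domain_name, volume_size_gib, iops, throughput_mibps):
    """Build UpdateDomainConfig arguments that switch a domain's EBS volumes to gp3."""
    return {
        "DomainName": domain_name,
        "EBSOptions": {
            "EBSEnabled": True,
            "VolumeType": "gp3",
            "VolumeSize": volume_size_gib,   # GiB per data node
            "Iops": iops,
            "Throughput": throughput_mibps,  # MiB/s
        },
    }


def apply_update(request):
    """Send the request; needs AWS credentials and triggers a blue/green deployment."""
    import boto3  # imported lazily so the request can be built and inspected offline

    client = boto3.client("opensearch")
    return client.update_domain_config(**request)


# Hypothetical example values: inspect the request before sending it.
request = build_gp3_request("my-domain", 100, 3000, 125)
print(request)
```

Calling `apply_update(request)` is left out of the example on purpose, because it changes the domain configuration as soon as it runs.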

gp3 baseline performance, and additional provisioning limits

One of gp3’s key features is the ability to scale IOPS and throughput independently of volume size. OpenSearch Service EBS gp3 delivers a baseline performance of 3,000 IOPS and 125 MiB/s throughput at any volume size. In addition, OpenSearch Service provisions additional IOPS and throughput for larger volumes to ensure optimal performance: volumes above 1,024 GiB receive 3 IOPS/GiB, and volumes above 170 GiB receive an incremental 250 MiB/s for every 3 TiB of storage. When your application requires more performance, you can scale up to 16,000 IOPS and 1,000 MiB/s throughput for an additional fee.

The following table outlines OpenSearch Service baseline IOPS and throughput, as well as the maximum amount you can provision. Note that your instance type may have additional limitations regarding how much and for how long it can support these performance baselines in a 24-hour period. For more information about instances and their limits, refer to Amazon EBS-optimized instances.

Volume size (GiB)	Baseline IOPS (included in storage price)	Baseline throughput in MiB/s (included in storage price)	Additional IOPS you can provision	Additional throughput you can provision (MiB/s)
170	3,000	125	13,000	875
172	3,000	250	13,000	750
1,024	3,000	250	13,000	750
1,025	3,075	250	12,925	750
3,000	9,000	250	7,000	750
3,001	9,003	500	6,997	500
6,000	18,000	500	NA	500
6,001	18,003	750	NA	250
9,001	27,003	1,000	NA	NA
24,000	72,000	2,000	NA	NA
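
The rows above are consistent with a simple rule, sketched below in Python. This is an inference from the table and the preceding text, not an official formula: the function names are made up for this example, and the 3,000 GiB throughput step is read off the table rows (the text describes the step as 3 TiB). `None` stands in for the table's NA entries.

```python
import math

MAX_IOPS = 16_000        # provisioning ceiling quoted in the text
MAX_THROUGHPUT = 1_000   # MiB/s, provisioning ceiling quoted in the text


def gp3_baseline(size_gib):
    """Baseline (IOPS, throughput in MiB/s) for a volume, as inferred from the table."""
    iops = 3_000 if size_gib <= 1_024 else 3 * size_gib
    # Assumption: throughput steps up by 250 MiB/s per started 3,000 GiB block.
    throughput = 125 if size_gib <= 170 else 250 * math.ceil(size_gib / 3_000)
    return iops, throughput


def gp3_headroom(size_gib):
    """Additional (IOPS, throughput) that can still be provisioned, or None for NA."""
    iops, throughput = gp3_baseline(size_gib)
    extra_iops = MAX_IOPS - iops if iops < MAX_IOPS else None
    extra_throughput = MAX_THROUGHPUT - throughput if throughput < MAX_THROUGHPUT else None
    return extra_iops, extra_throughput


for size in (170, 1_025, 3_001, 6_000, 9_001):
    print(size, gp3_baseline(size), gp3_headroom(size))
```

Once the baseline of a large volume exceeds the provisioning ceiling, there is nothing further to buy, which is why the largest volumes show NA in both additional columns.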

Do you need additional performance?

In the majority of use cases, gp3 baseline performance should suffice and you don’t need to provision additional IOPS or throughput. You can use Amazon CloudWatch metrics to understand your usage patterns; if you observe that the current IOPS and throughput limits are bottlenecking your indexing and query performance, you should provision additional performance. For more information, refer to EBS volume metrics.
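
One way to turn that check into something repeatable is sketched below. It assumes you have already exported per-period values of the `ReadIOPS` and `WriteIOPS` metrics that OpenSearch Service publishes to CloudWatch; the sample numbers, the function name, and the 90% threshold are illustrative assumptions, not service guidance.

```python
def fraction_near_limit(read_iops, write_iops, limit_iops, threshold=0.9):
    """Share of datapoints whose combined IOPS reach threshold * limit_iops."""
    totals = [r + w for r, w in zip(read_iops, write_iops)]
    if not totals:
        return 0.0
    hot = sum(1 for total in totals if total >= threshold * limit_iops)
    return hot / len(totals)


# Made-up sample datapoints for a volume with the 3,000 IOPS baseline.
read_samples = [1000, 2000, 2500]
write_samples = [500, 900, 400]
share = fraction_near_limit(read_samples, write_samples, limit_iops=3000)
print(f"{share:.0%} of datapoints are within 10% of the IOPS limit")
```

A persistently high share suggests the volume is running close to its IOPS limit; occasional spikes on their own may not justify paying for additional performance.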

Conclusion

This post explains how OpenSearch Service general purpose SSD gp3 volumes can significantly reduce monthly storage and instance costs, making them more cost-effective than gp2 volumes. Migrating to gp3 volumes with the same size and performance configuration as gp2 is the quickest and simplest way to reduce costs. You should also consider reducing instance costs by taking advantage of gp3’s support for denser storage per data node.

For more details, check out Amazon OpenSearch Service pricing and Configuration API reference for Amazon OpenSearch Service.

About the author

Siddhant Gupta is a Sr. Technical Product Manager at Amazon Web Services based in Hyderabad, India. Siddhant has been with Amazon for over five years and is currently working with the OpenSearch Service team, helping with new region launches, pricing strategy, and bringing EC2 and EBS innovations to OpenSearch Service customers. He is passionate about analytics and machine learning. In his free time, he loves traveling, fitness activities, spending time with his family, and reading non-fiction books.
