Automating Index State Management for Amazon ES

Post Syndicated from Satya Vajrapu original https://aws.amazon.com/blogs/big-data/automating-index-state-management-for-amazon-es/

When it comes to time-series data, it’s more common to access new data over existing data, such as the last 4 hours or 1 day. Often, application teams are tasked with maintaining multiple indexes for diverse data workloads, which brings new requirements to set up a custom solution to manage the indexes’ lifecycle. This becomes tedious as the indexes grow and result in housekeeping overheads.

Amazon Elasticsearch Service (Amazon ES) now enables you to automate recurring index management activities. This avoids using any additional tools to manage the index lifecycle inside Elasticsearch. With Index State Management (ISM), you can create a policy that automates these operations based on index age, size, and other conditions, all from within your Amazon ES domain.

In this post, I discuss how you can implement a sample policy to automate routine index management tasks and apply them to indexes and index patterns.

Prerequisites

Before you get started, make sure you complete the following prerequisites:

  1. Have Elasticsearch 6.8 or later (required to use ISM and Ultrawarm).
  2. Set up a new Amazon ES domain with UltraWarm enabled.
  3. Make sure your user role has sufficient permissions to access the Kibana console of the Amazon ES domain. If required, validate and configure the access to your domains.

Use case

Ultrawarm for Amazon ES is a new low-cost storage tier that provides fast, interactive analytics on up to three petabytes of log data at one-tenth of the cost of the current Amazon ES storage tier. Although hot storage is used for indexing and providing fastest access, Ultrawarm complements the hot storage tier by providing less expensive storage for older and less-frequently accessed data, all while maintaining the same interactive analytics experience. Rather than attached storage, UltraWarm nodes use Amazon Simple Storage Service (Amazon S3) and a sophisticated caching solution to improve performance.

To demonstrate the functionality, I present a sample use case of handling time-series data. In this use case, we migrate a set of existing indexes that are initially in hot state and migrate them to Ultrawarm storage after a day. Upon migration, the data is stored in a service-managed S3 bucket as read only. We then delete the index after 2 days, assuming that index is no longer needed.

After we create the Amazon ES domain, complete the following steps:

  1. Log in using the Kibana UI endpoint.
  2. Wait for the domain status to turn active and choose the Kibana endpoint.
  3. On Kibana’s splash page, add all the sample data listed by choosing Try our sample data and choosing Add data.
  4. After adding the data, choose Index Management (the IM icon on the left navigation pane), which lands into the Index Policies page.
  5. Choose Create policy.
  6. For Name policy, enter ism-policy-sample.
  7. Replace the default policy with the following code:
    {
        "policy": {
            "description": "Lifecycle Management Policy",
            "default_state": "hot",
            "states": [
                {
                    "name": "hot",
                    "actions": [],
                    "transitions": [
                        {
                            "state_name": "warm",
                            "conditions": {
                                "min_index_age": "1d"
                            }
                        }
                    ]
                },
                {
                    "name": "warm",
                    "actions": [
                        {
                            "retry": {
                                "count": 5,
                                "backoff": "exponential",
                                "delay": "1h"
                            },
                            "warm_migration": {}
                        }
                    ],
                    "transitions": [
                        {
                            "state_name": "delete",
                            "conditions": {
                                "min_index_age": "2d"
                            }
                        }
                    ]
                },
                {
                    "name": "delete",
                    "actions": [
                        {
                            "notification": {
                                "destination": {
                                    "chime": {
                                        "url": "<CHIME_WEBHOOK_URL>"
                                    }
                                },
                                "message_template": {
                                    "source": "The index {{ctx.index}} is being deleted because of actioned policy {{ctx.policy_id}}",
                                    "lang": "mustache"
                                }
                            }
                        },
                        {
                            "delete": {}
                        }
                    ],
                    "transitions": []
                }
            ]
        }
    }
    

You can also use the ISM operations to programmatically work with policies and managed indexes. For example, to attach an ISM policy to an index at the time of creation, you invoke an API action. See the following code:

PUT index_1
{
  "settings": {
    "opendistro.index_state_management.policy_id": "ingest_policy",
    "opendistro.index_state_management.rollover_alias": "some_alias"
  },
  "aliases": {
    "some_alias": {
      "is_write_index": true
    }
  }
}

In this case, the ingest_policy is applied to index_1 with the rollover action defined in some_alias. For the list of complete ISM programmatic operations to work with policies and managed policies, see ISM API.

  1. Choose Create. You can now see your index policy on the Index Policies page.
  2. On the Indices page, search for kibana_sample, which should list all the sample data indexes you added earlier.
  3. Select all the indexes and choose Apply policy.
  4. From the Policy ID drop-down menu, choose the policy created in the previous step.
  5. Choose Apply.

The policy is now assigned and starts managing the indexes. On the Managed Indices page, you can observe the status as Initializing.

When initialization is complete, the status changes to Running.

You can also set a refresh frequency to refresh the managed indexes’ status information.

Demystifying the policy

In this section, I explain about the index policy tenets and how they’re structured.

Policies are JSON documents that define the following:

  • The states an index can be in
  • Any actions you want the plugin to take when an index enters the state
  • Conditions that must be met for an index to move or transition into a new state

The policy document begins with basic metadata like description, the default_state the index should enter, and finally a series of state definitions.

A state is the status that the managed index is currently in. A managed index can only be in one state at a time. Each state has associated actions that are run sequentially upon entering a state and transitions that are checked after all the actions are complete.

The first state is hot. In this use case, no actions are defined in this hot state; the managed indexes land in this state initially and then transition to warm. Transitions define the conditions that need to be met for a state to change (in this case, change to warm after the index crosses 24 hours). See the following code:

            {
                "name": "hot",
                "actions": [],
                "transitions": [
                    {
                        "state_name": "warm",
                        "conditions": {
                            "min_index_age": "1d"
                        }
                    }
                ]
            },

We can quickly verify the states on the console. The current state is hot and attempts the transition to warm after 1 day. The transition typically completes within an hour and is reflected under the Status column.

The warm state has actions defined to move the index to Ultrawarm storage. When the actions run successfully, the state has another transition to delete after the index ages 2 days. See the following code:

            {
                "name": "warm",
                "actions": [
                    {
                        "retry": {
                            "count": 5,
                            "backoff": "exponential",
                            "delay": "1h"
                        },
                        "warm_migration": {}
                    }
                ],
                "transitions": [
                    {
                        "state_name": "delete",
                        "conditions": {
                            "min_index_age": "2d"
                        }
                    }
                ]
            },

You can again verify the status of the managed indexes.

You can also verify this on the Amazon ES console under the Ultrawarm Storage usage column.

The third state of the policy document marks the indexes to delete based on the actions. This policy state assumes your index is non-critical and no longer receiving write requests; having zero replicas carries some risk of data loss.

The final delete state has two actions defined. The first action is self-explanatory; it sends a notification as defined in the message_template to the destination. See the following code:

            {
                "name": "delete",
                "actions": [
                    {
                        "notification": {
                            "destination": {
                                "chime": {
                                    "url": "<CHIME_WEBHOOK_URL>"
                                }
                            },
                            "message_template": {
                                "source": " The index {{ctx.index}} is being deleted because of actioned policy {{ctx.policy_id}}",
                                "lang": "mustache"
                            }
                        }
                    },
                    {
                        "delete": {}
                    }
                ],
                "transitions": []
            }

I have configured the notification endpoint to be on Amazon Chime <CHIME_WEBHOOK_URL>. For more information about using webhooks, see Webhooks for Amazon Chime.

You can also configure the notification to send to destinations like Slack or a webhook URL.

At this state, I have received the notification on the Chime webhook (see the following screenshot).

The following screenshot shows the index status on the console.

After the notification is successfully sent, the policy runs the next action in the state that is deleting the indexes. After this final state, the indexes no longer appear on the Managed Indices page.

Additional information on ISM policies

If you have an existing Amazon ES cluster with no Ultrawarm support (because of any missing prerequisite), you can use policy operations read_only and reduces_replicas to replace the warm state. The following code is the policy template for these two states:

            {
                "name": "reduce_replicas",
                "actions": [{
                  "replica_count": {
                    "number_of_replicas": 0
                  }
                }],
                "transitions": [{
                  "state_name": "read_only",
                  "conditions": {
                    "min_index_age": "2d"
                  }
                }]
            },
            {
                "name": "read_only",
                "actions": [
                    {
                        "read_only": {}
                      }
                ],
                "transitions": [
                    {
                        "state_name": "delete",
                        "conditions": {
                            "min_index_age": "3d"
                        }
                    }
                ]
            },

Summary

In this post, you learned how to use the index state management feature on Ultrawarm for Amazon ES. The walkthrough illustrated how to manage indexes using this plugin with a sample lifecycle policy.

For more information about the ISM plugin, see Index State Management. If you need enhancements or have other feature requests, please file an issue. To get involved with the project, see Contributing Guidelines.

A big takeaway for me as I evaluated the ISM plugin in Amazon ES was that the ISM plugin is fully compatible and works on Open Distro for Elasticsearch. For more information, see Index State Management in Open Distro for Elasticsearch. It can be useful for using Open Distro for Elasticsearch as an on-premises or internal solution while using a managed service for your production workloads.


About the Author

Satya Vajrapu is a DevOps Consultant with Amazon Web Services. He works with AWS customers to help design and develop various practices and tools in the DevOps toolchain.