All posts by Aish Gunasekar

Build multi-layer maps in Amazon OpenSearch Service

Post Syndicated from Aish Gunasekar original https://aws.amazon.com/blogs/big-data/build-multi-layer-maps-in-amazon-opensearch-service/

With the release of Amazon OpenSearch Service 2.5, you can create maps with multiple layers to visualize your geographical data. You can build each layer from a different index pattern to separate data sources. Organizing the map in layers makes it more straightforward to visualize, view, and analyze geographical data. The layering also helps fetch data from various sources, view different data at different zoom levels, and continuously refresh data for a real-time dataset. Additionally, in OpenSearch 2.6, you can add multi-layer maps to dashboard panes within OpenSearch Dashboards, which makes it more straightforward to analyze your geospatial data in the context of other visualizations. OpenSearch Service provisions all the resources for your OpenSearch cluster so you can focus on building the business value through maps and other features of OpenSearch rather than spending time managing your deployment.

In this post, we show how to use multi-layer maps in OpenSearch Service.

Solution overview

The multi-layer map helps users visualize data and gain insights using specific layers and zoom levels to help emphasize key messages. For our use case, we use multi-layer maps to support an example real-estate application. Your index will have fields such as location, address, availability, number of bedrooms, price, and more. Initially, we develop a map with state boundaries with aggregated data for an overview. As the user zooms in, they’ll see city boundaries and postal codes. As the user continues to zoom, they’ll see each house with information like the price, number of bedrooms, and so on. You will build various data layers from different data sources to accomplish this task. You can also add filter queries; for example, to only show properties that are available.

Prerequisites

Complete the following prerequisite steps to configure the sample data:

  1. Create an OpenSearch Service domain (version 2.7 or higher).
  2. Download the file bulk_request_realestate.txt.
  3. Copy and paste the entire contents of the downloaded file into the OpenSearch Dashboards console.
  4. Run the commands.

These commands create the realestate index and upload records into the catalog.
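For the map layers to find a geospatial field, the index needs a field mapped as a geopoint. The contents of bulk_request_realestate.txt are not reproduced in this post, so the following is only a minimal sketch of what a create-index body for such an index could look like. Every field name and type other than location is an assumption based on the fields listed in the solution overview.

```python
import json


def realestate_index_body():
    """Return a hypothetical create-index body for the realestate index.

    Only `location` (a geopoint) is confirmed by the post; the remaining
    field names and types are illustrative assumptions.
    """
    return {
        "mappings": {
            "properties": {
                "location": {"type": "geo_point"},    # geospatial field chosen in the layer
                "address": {"type": "text"},          # assumed
                "availability": {"type": "keyword"},  # assumed
                "bedrooms": {"type": "integer"},      # assumed
                "price": {"type": "long"},            # assumed
            }
        }
    }


def sample_listing(lat, lon, price, bedrooms):
    """Build one illustrative document whose location is a lat/lon object."""
    return {
        "location": {"lat": lat, "lon": lon},
        "price": price,
        "bedrooms": bedrooms,
    }


# Print the body in a form that could be pasted after "PUT realestate".
print(json.dumps(realestate_index_body(), indent=2))
```

A geopoint can be written in several formats; the sketch uses the object form with lat and lon keys.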

Now let’s visualize this data in a multi-layer map.

Add layers

To create your map and add layers, complete the following steps:

  1. On OpenSearch Dashboards, in the navigation pane, under OpenSearch plugins, choose Maps.
  2. Choose Create map.

You will see a default map (or basemap) loaded on the page with a Layers pane on the left. This serves as a canvas for the data. The OpenSearch basemap utilizes vector tiles, resulting in quicker loading speeds and seamless zooming compared to raster tile maps. It effectively accommodates zoom levels ranging from 0–22. 0 is the most zoomed out with the global view, and zoom level 22 is roughly a half-inch per pixel resolution. Any additional layers you add will appear in this pane.

  3. Choose Add layer.
  4. In the prompt, choose the option to add documents for the data layer.
  5. Under Documents, choose the index you created (realestate) as the data source.
  6. For the geospatial field, choose the field containing geopoints (such as location in this example).
  7. Keep the remaining settings at their default values, then choose Update.

In the Layers pane, a newly generated layer named New Layer 2 will be visible. Additionally, you will observe the presence of all geopoints on the map (green dots in the following screenshot).

Update the layer name and enable tooltips

For an overview of the various configuration choices accessible for the layers, choose New Layer 2. This action opens another pane containing three tabs: Data, Style, and Settings. Let’s modify the layer’s name to something more relevant and enable tooltips. Complete the following steps:

  1. On the Settings tab, replace New Layer 2 with realestate-data in the Name field.
  2. On the Data tab, scroll down to Tool tips, and select Tool tips.
  3. Enter region as the tooltip.
  4. Choose Update.

The altered name should now be visible in the left pane. The geopoints themselves don’t convey any information. However, with tooltips enabled, you can access comprehensive information depending on the fields selected. As you hover over the geopoints, you will observe the chosen tooltip information—in this instance, region CA.

Adjust zoom levels

The essence of this functionality lies in the ability to observe your data through distinct layers at varying zoom levels. To achieve this, generate a layer using the same process as before. The following example shows a new layer (locality) featuring tooltips displaying locality and postal code. You can also choose the color of your geographical points on the Style tab. On the Settings tab, you’ll encounter options for zoom levels, allowing you to input minimum and maximum values—like 4 and 6, for instance. Consequently, this indicates that the layer will be visible exclusively within this range of zoom levels.

In the Layers pane, you can observe three layers alongside the locality layer created in the previous step. A notification will indicate “Layer is hidden outside of zoom range 1-2.” This layer becomes visible as you zoom in.

The realestate-data layer is set with a default zoom range of 0–22, ensuring visibility across all levels unless manually hidden. The locality layer is configured to be visible exclusively within the zoom range of 1–2.

As shown in the following screenshot, the tooltip for the realestate-data layer remains visible even after the fourth zoom level. To access the tooltip information for the locality layer, choose the eye icon next to realestate-data to manually conceal this layer. Once completed, hovering over the geopoints will reveal the tooltip details for locality (postal code and locality).

The following are some key points to consider:

  • Each layer can be established with distinct colors for its geographical points. For instance, the realestate-data layer is depicted in green, while the locality layer uses orange.
  • It’s possible to observe geopoints in a color that wasn’t directly chosen. In the following screenshot, brown is visible due to the overlapping of two layers at the same zoom level.
  • You can observe the color shift to the layer’s designated color—orange—after realestate-data is manually hidden, because there’s no longer an overlap between the layers.

You can generate an additional layer designed to display tooltip data such as the count of beds, baths, price, and square footage. This layer will be active within the zoom range of 3–4.

To save your project, choose Save. Enter a title, such as realestate-multilayer-map, then choose Save again. Your multilayer map is now successfully saved!

Exploring the multi-level map

After you have established all the layers, take note of how the layers become visible or invisible at each zoom level. Observe the dynamic adjustments in tooltip information that correspond to these changes as you zoom.
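The visibility rules described in this section can be modeled in a few lines. This is a sketch of the behavior, not the Maps plugin's implementation: it assumes the minimum and maximum zoom bounds are inclusive, and the name property-details for the third layer is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    min_zoom: int = 0      # default range covers every zoom level
    max_zoom: int = 22
    hidden: bool = False   # models hiding a layer with the eye icon


def visible_layers(layers, zoom):
    """Return the names of layers shown at the given zoom level."""
    return [
        layer.name
        for layer in layers
        if not layer.hidden and layer.min_zoom <= zoom <= layer.max_zoom
    ]


# Zoom ranges taken from the text; the third layer's name is an assumption.
LAYERS = [
    Layer("realestate-data"),
    Layer("locality", min_zoom=1, max_zoom=2),
    Layer("property-details", min_zoom=3, max_zoom=4),
]

for zoom in (0, 2, 3, 10):
    print(zoom, visible_layers(LAYERS, zoom))
```

Because realestate-data keeps the default range, it overlaps every other layer until it is hidden manually, which matches the blended colors described earlier.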

Add a filter

After you have generated multiple layers and successfully visualized your geopoints, you might find that you are interested in specific properties, such as within a particular price range.

To add a filter at layer level, complete the following steps:

  1. In the right pane, on the Data tab, choose Filters.
  2. Input price as the filter criteria.
  3. Select is between as the operator.
  4. Enter 800000 for the start of the range and 1400000 for the end of the range.
  5. Choose Save to update the layer.

You’ll immediately observe the filter taking effect, resulting in the display of only the relevant data matching the filter.
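Expressed in query DSL, a price filter like this one corresponds to a range clause. The following sketch builds such a clause; the post does not say whether the upper bound of the is between operator is inclusive, so the inclusive bounds here are an assumption.

```python
def price_range_filter(low, high, field="price"):
    """Build a bool/filter query that keeps documents with low <= field <= high.

    Inclusive bounds are an assumption; the UI operator may treat the
    upper bound differently.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return {
        "bool": {
            "filter": [
                {"range": {field: {"gte": low, "lte": high}}}
            ]
        }
    }


query = price_range_filter(800000, 1400000)
print(query)
```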

An alternative method to establish a filter involves drawing shapes on the map, such as rectangles or polygons. In this instance, you’ll be utilizing the polygon option. (For API-based filtering, refer to APIs).

  1. Choose the polygon icon on the right side of the map.
  2. For Filter label, enter a name for the filter.
  3. Draw the shape over the map area that you want to select.
  4. For a polygon, select any starting point on the map (this point becomes a polygon vertex) and hover (do not drag) to each subsequent vertex and select that point.
  5. Make sure to select the starting point again to close the polygon, as shown in the following screenshot.
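For comparison, a polygon selection can also be written as a geographic query against the geopoint field. The sketch below builds a geo_polygon clause from a list of vertices. It is not necessarily the query the Maps plugin issues when you draw a shape; the helper name and the coordinates are illustrative assumptions.

```python
def polygon_filter(vertices, field="location"):
    """Build a geo_polygon query from (lat, lon) vertices.

    A repeated closing vertex (as drawn in the UI) is dropped, because
    this sketch treats the vertex list as an implicitly closed ring.
    """
    points = list(vertices)
    if len(points) > 1 and points[0] == points[-1]:
        points = points[:-1]
    if len(points) < 3:
        raise ValueError("a polygon needs at least three distinct vertices")
    return {
        "geo_polygon": {
            field: {
                "points": [{"lat": lat, "lon": lon} for lat, lon in points]
            }
        }
    }


# Hypothetical triangle; the last vertex repeats the first to close the shape.
drawn = [(37.0, -122.0), (37.0, -121.0), (38.0, -121.5), (37.0, -122.0)]
print(polygon_filter(drawn))
```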

Add a map to a dashboard

You can add this map to an existing or a new dashboard. Complete the following steps:

  1. On OpenSearch Dashboards, choose Create and choose Dashboard.

  2. Select Add an existing dashboard.
  3. Choose realestate-multilayer-map from the list.

You can see the new visualization on your dashboard.

  4. Choose Save and enter a title for the dashboard.

Conclusion

In this post, you effectively established multi-layer maps for data visualization, analyzed geographic data, observed various data at varying zoom levels, added tooltips for supplementary data visualization, and added multi-layer maps to dashboard panes within OpenSearch Dashboards to easily analyze your geospatial data. Refer to Using maps for an alternative use case and detailed information about map features.


About the authors

Aish Gunasekar is a Specialist Solutions Architect with a focus on Amazon OpenSearch Service. Her passion at AWS is to help customers design highly scalable architectures and help them in their cloud adoption journey. Outside of work, she enjoys hiking and baking.

Satish Nandi is a Senior Technical Product Manager for Amazon OpenSearch Service.

Jon Handler is a Senior Principal Solutions Architect at Amazon Web Services based in Palo Alto, CA. Jon works closely with OpenSearch and Amazon OpenSearch Service, providing help and guidance to a broad range of customers who have search and log analytics workloads that they want to move to the AWS Cloud. Prior to joining AWS, Jon’s career as a software developer included 4 years of coding a large-scale, ecommerce search engine. Jon holds a Bachelor of the Arts from the University of Pennsylvania, and a Master of Science and a PhD in Computer Science and Artificial Intelligence from Northwestern University.

Configure SAML federation for Amazon OpenSearch Serverless with Okta

Post Syndicated from Aish Gunasekar original https://aws.amazon.com/blogs/big-data/configure-saml-federation-for-amazon-opensearch-serverless-with-okta/

Modern applications apply security controls across many systems and their subsystems. Keeping all of these systems in sync would be a major undertaking if you tried to implement it separately. Centralized identity management is the way to maintain a single identity provider (IdP) that can authenticate actors and manage and distribute their rights.

OpenSearch is an open-source search and analytics suite that enables you to ingest, store, analyze, and visualize full text and log data. Amazon OpenSearch Serverless makes it simple to deploy, scale, and operate OpenSearch in the AWS Cloud, freeing you from the undifferentiated heavy lifting of sizing, scaling, and operating an OpenSearch cluster. When you use OpenSearch Serverless, you can integrate with your existing Security Assertion Markup Language 2.0 (SAML)-compliant IdP to provide granular access control for your OpenSearch Serverless collections. Our customers use a variety of IdPs, including AWS IAM Identity Center (successor to AWS SSO), Okta, Keycloak, Active Directory Federation Services (AD FS), and Auth0.

In this post, you will learn how to use Okta as your IdP and integrate it with OpenSearch Serverless to securely manage your users and groups for secure access to your data.

Solution overview

The flow of access requests is depicted in the following figure.

When you navigate to OpenSearch Dashboards, the workflow steps are as follows:

  1. OpenSearch Serverless generates a SAML authentication request.
  2. OpenSearch Serverless redirects your request back to the browser.
  3. The browser redirects to the Okta URL via the Okta application setup.
  4. Okta parses the SAML request, authenticates the user, and generates a SAML response.
  5. Okta returns the encoded SAML response to the browser.
  6. The browser sends the SAML response back to the OpenSearch Serverless Assertion Consumer Services (ACS) URL.
  7. ACS verifies the SAML response and logs in the user with the permissions defined in the data access policy.

Prerequisites

Complete the following prerequisite steps:

  1. Create an OpenSearch Serverless collection. For instructions, refer to Preview: Amazon OpenSearch Serverless – Run Search and Analytics Workloads without Managing Clusters.
  2. Make a note of your AWS account ID to use while configuring your application in Okta.
  3. Create an Okta account, which you will use as an IdP.
  4. Create users and a group in Okta:
    1. Log in to your Okta account, and in the navigation pane, choose Directory, then choose Groups.
    2. Choose Add Group and name it opensearch-serverless, then choose Save.
    3. Choose Assign People to add users.
    4. You can add users to the opensearch-serverless group by choosing the plus sign next to the user name, or you can choose Add All.
    5. Add your users, then choose Save.
    6. To create new users, choose People in the navigation pane under Directory, then choose Add Person.
    7. Provide your first name, last name, user name (email ID), and primary email address.
    8. For Password, choose Set by admin and First-time password.
    9. To create your user, choose Save.
    10. In the navigation pane, choose Groups, then choose the opensearch-serverless group you created earlier.

The following graphic gives a quick demonstration of setting up a user and group.

Configure an application in Okta

To configure an application in Okta, complete the following steps:

  1. Navigate to the Applications page on the Okta console.
  2. Choose App Integration, select SAML 2.0 web application, then choose Next.
  3. For Name, enter a name for the app (for example, myweblogs), then choose Next.
  4. Under Application ACS URL, enter the URL using the format https://collection.<REGION>.aoss.amazonaws.com/_saml/acs (replace <REGION> with the corresponding Region) to generate the IdP metadata.
  5. Select Use this for Recipient URL and Destination URL to use the same ACS URL as the recipient and destination.
  6. Specify aws:opensearch:<AWS-Account-ID> under Audience URI (SP Entity ID). This specifies who the assertion is intended for within the SAML assertion.
  7. Under Group Attribute Statements, enter a name that is relevant to your application, such as mygroup, and select unspecified as the name format. (Make a note of this name; you’ll need it later.)
  8. Select equals as the filter and enter opensearch-serverless.
  9. Select I’m a software vendor. I’d like to integrate my app with Okta and choose Finish.
  10. After an app is created, choose the sign-on tab, scroll down to the metadata details, and copy the value for Metadata URL.
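The values entered in the preceding steps follow fixed patterns, so it can help to generate them rather than type them by hand. The following sketch derives the ACS URL and Audience URI from a Region and account ID using the formats given above. The function name and the dictionary keys are hypothetical; they do not correspond to an Okta API.

```python
def okta_saml_settings(region, account_id,
                       group_attribute="mygroup",
                       group_name="opensearch-serverless"):
    """Return the values to enter in the Okta SAML app form.

    The dictionary keys are labels for this sketch only.
    """
    # AWS account IDs are 12-digit numbers.
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("account_id must be a 12-digit string")
    return {
        "acs_url": f"https://collection.{region}.aoss.amazonaws.com/_saml/acs",
        "audience_uri": f"aws:opensearch:{account_id}",
        "group_attribute": group_attribute,       # needed again on the AWS side
        "group_filter": {"equals": group_name},
    }


# Placeholder Region and account ID for illustration.
print(okta_saml_settings("us-west-2", "123456789012"))
```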

The following graphic gives a quick demonstration of setting up an application in Okta via the preceding steps.

Next, you associate the users and groups to the application that you created in the previous step.

  1. On the Applications page, choose the app you created earlier.
  2. On the Assignments tab, choose Assign.
  3. Select Assign To Groups and choose the group you wish to assign to (opensearch-serverless in this case).
  4. Choose Done.

The following graphic gives a quick demonstration of assigning groups to the application via the preceding steps.

Set up SAML on OpenSearch Serverless

In this section, you create a SAML provider that you’ll use for your OpenSearch Serverless collection. Complete the following steps:

  1. Open the OpenSearch Serverless console on a new tab.
  2. In the navigation pane, under Serverless, choose SAML authentication.
  3. Select Add SAML provider.
  4. Provide a recognizable name (for example, okta) and a description.
  5. Open a new tab and enter the copied metadata URL into your browser.

You should see the metadata for the Okta application.

  6. Take note of this metadata and copy it to your clipboard.
  7. On the OpenSearch Serverless console tab, enter this metadata in the Provide metadata from your IdP section.
  8. Under Additional settings, enter mygroup or the group attribute provided in the Okta configuration.
  9. Choose Create a SAML provider.

The SAML provider has now been created.

The following graphic gives a quick demonstration of setting up the SAML provider in OpenSearch Serverless via the preceding steps.

Update the data access policy

You need to configure the right permissions in the data access policies associated with your OpenSearch collection so your Okta group members can access the OpenSearch Dashboards endpoint.

  1. On the OpenSearch Serverless console, open your collection.
  2. Choose the data access policy associated with the collection in the Data Access section.
  3. Choose Edit.
  4. Choose Principals and Add a SAML principal.
  5. Select the SAML provider you created earlier and enter group/opensearch-serverless next to it.
  6. The OpenSearch Dashboards endpoint can be accessed by all group members. You can grant access to collections, indexes, or both.
  7. Choose Save.
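The console steps above produce a JSON data access policy behind the scenes. The following sketch shows roughly what such a policy could look like for a SAML group principal. The collection name, the account ID, and the read-only permission set are assumptions chosen for illustration; adjust them to the access you actually want to grant.

```python
import json


def saml_data_access_policy(account_id, provider, group, collection):
    """Build a read-only data access policy for one SAML group.

    The permissions listed are an assumed minimal read-only set.
    """
    return [
        {
            "Description": f"Read access for SAML group {group}",
            "Rules": [
                {
                    "ResourceType": "collection",
                    "Resource": [f"collection/{collection}"],
                    "Permission": ["aoss:DescribeCollectionItems"],
                },
                {
                    "ResourceType": "index",
                    "Resource": [f"index/{collection}/*"],
                    "Permission": ["aoss:DescribeIndex", "aoss:ReadDocument"],
                },
            ],
            # SAML principals are written as saml/<account>/<provider>/group/<name>.
            "Principal": [f"saml/{account_id}/{provider}/group/{group}"],
        }
    ]


# "my-collection" and the account ID are placeholders.
policy = saml_data_access_policy(
    "123456789012", "okta", "opensearch-serverless", "my-collection"
)
print(json.dumps(policy, indent=2))
```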

Log in to OpenSearch Dashboards

Now that you have set permissions to access the dashboards, choose the Dashboards URL under the general information for the OpenSearch Serverless collection. This should take you to the following URL:
https://collection-endpoint/_dashboards/

You will see a list with all the access options. Choose the SAML provider that you created (okta in this case) and log in using your Okta credentials. You will now be logged into OpenSearch Dashboards with the permissions that are part of the data access policy. You can perform searches or create visualizations from the dashboard.

Clean up

To avoid unwanted charges, delete the OpenSearch Serverless collection, data access policy, and SAML provider created as part of this demonstration.

Summary

In this post, you learned how to set up Okta as an IdP to access OpenSearch Dashboards using SAML. You also learned how to set up users and groups within Okta and configure their access to OpenSearch Dashboards. For more details, refer to SAML authentication for Amazon OpenSearch Serverless.

You can also refer to the Getting started with Amazon OpenSearch Serverless workshop to know more about OpenSearch Serverless.

If you have feedback about this post, submit it in the comments section. If you have questions about this post, start a new thread on the OpenSearch Service forum or contact AWS Support.


About the Authors

Aish Gunasekar is a Specialist Solutions Architect with a focus on Amazon OpenSearch Service. Her passion at AWS is to help customers design highly scalable architectures and help them in their cloud adoption journey. Outside of work, she enjoys hiking and baking.

Prashant Agrawal is a Sr. Search Specialist Solutions Architect with Amazon OpenSearch Service. He works closely with customers to help them migrate their workloads to the cloud and helps existing customers fine-tune their clusters to achieve better performance and save on cost. Before joining AWS, he helped various customers use OpenSearch and Elasticsearch for their search and log analytics use cases. When not working, you can find him traveling and exploring new places. In short, he likes doing Eat → Travel → Repeat.

Build a search application with Amazon OpenSearch Serverless

Post Syndicated from Aish Gunasekar original https://aws.amazon.com/blogs/big-data/build-a-search-application-with-amazon-opensearch-serverless/

In this post, we demonstrate how to build a simple web-based search application using the recently announced Amazon OpenSearch Serverless, a serverless option for Amazon OpenSearch Service that makes it easy to run petabyte-scale search and analytics workloads without having to think about clusters. The benefit of using OpenSearch Serverless as a backend for your search application is that it automatically provisions and scales the underlying resources based on the search traffic demands, so you don’t have to worry about infrastructure management. You can simply focus on building your search application and analyzing the results. OpenSearch Serverless is powered by the open-source OpenSearch project, which consists of a search engine, and OpenSearch Dashboards, a visualization tool to analyze your search results.

Solution overview

There are many ways to build a search application. In our example, we create a simple JavaScript front end and call Amazon API Gateway, which triggers an AWS Lambda function upon receiving user queries. As shown in the following diagram, API Gateway acts as a broker between the front end and the OpenSearch Serverless collection. When the user queries the front-end webpage, API Gateway passes requests to the Python Lambda function, which runs the queries on the OpenSearch Serverless collection and returns the search results.

To get started with the search application, you must first upload the relevant dataset, a movie catalog in this case, to the OpenSearch Serverless collection and index it to make it searchable.

Create a collection in OpenSearch Serverless

A collection in OpenSearch Serverless is a logical grouping of one or more indexes that represent a workload. You can create a collection using the AWS Management Console or AWS Software Development Kit (AWS SDK). Follow the steps in Preview: Amazon OpenSearch Serverless – Run Search and Analytics Workloads without Managing Clusters to create and configure a collection in OpenSearch Serverless.

Create an index and ingest data

After your collection is created and active, you can upload the movie data to an index in this collection. Indexes hold documents, and each document in this example represents a movie record. Documents are comparable to rows in a database table. Each document (the movie record) consists of 10 fields that are typically searched for in a movie catalog, like the director, actor, release date, genre, title, or plot of the movie. The following is a sample movie JSON document:

{
"directors": ["David Yates"],
"release_date": "2011-07-07T00:00:00Z",
"rating": 8.1,
"genres": ["Adventure", "Family", "Fantasy", "Mystery"],
"plot": "Harry, Ron and Hermione search for Voldemort's remaining Horcruxes in their effort to destroy the Dark Lord.",
"title": "Harry Potter and the Deathly Hallows: Part 2",
"rank": 131,
"running_time_secs": 7800,
"actors": ["Daniel Radcliffe", "Emma Watson", "Rupert Grint"],
"year": 2011
}

For the search catalog, you can upload the sample-movies.bulk dataset sourced from the Internet Movie Database (IMDb). OpenSearch Serverless offers the same ingestion pipeline and clients to ingest the data as OpenSearch Service, such as Fluentd, Logstash, and Postman. Alternatively, you can use the OpenSearch Dashboards Dev Tools to ingest and search the data without configuring any additional pipelines. To do so, log in to OpenSearch Dashboards using your SAML credentials and choose Dev tools.

To create a new index, use the PUT command followed by the index name:

PUT movies-index

A confirmation message is displayed upon successful creation of your index.

After the index is created, you can ingest documents into the index. OpenSearch provides the option to ingest multiple documents in one request using the _bulk request. Enter POST /_bulk in the left pane as shown in the following screenshot, then copy and paste the contents of the sample-movies.bulk file you downloaded earlier.

You have successfully created the movies index and uploaded 1,500 records into the catalog! Now let’s integrate the movie catalog with your search application.
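If you want to prepare your own records for a _bulk request, the body is newline-delimited JSON: an action line followed by the document itself, with a trailing newline at the end. The following sketch builds such a body from a list of dictionaries. It omits document IDs, which is a simplification and may differ from the format used in sample-movies.bulk.

```python
import json


def to_bulk_body(index, docs):
    """Serialize documents into a _bulk request body (newline-delimited JSON)."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                           # document line
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"


movies = [
    {"title": "Harry Potter and the Deathly Hallows: Part 2", "year": 2011},
]
print(to_bulk_body("movies-index", movies))
```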

Integrate the Lambda function with an OpenSearch Serverless endpoint

In this step, you create a Lambda function that queries the movie catalog in OpenSearch Serverless and returns the result. For more information, see our tutorial on creating a Lambda function for connecting to and querying an OpenSearch Service domain. You can reuse the same code by replacing the parameters to align to OpenSearch Serverless’s requirements. Replace <my-region> with your corresponding region (for example, us-west-2), use aoss instead of es for service, replace <hostname> with the OpenSearch collection endpoint, and <index-name> with your index (in this case, movies-index).

The following is a snippet of the Lambda code. You can find the complete code in the tutorial.

import boto3
import json
import requests
from requests_aws4auth import AWS4Auth

region = '<my-region>'
service = 'aoss'
credentials = boto3.Session().get_credentials()
awsauth = AWS4Auth(credentials.access_key, credentials.secret_key, region, service, session_token=credentials.token)

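host = '<hostname>'  # The OpenSearch Serverless collection endpoint, including https://
index = '<index-name>'
url = host + '/' + index + '/_search'

# Lambda execution starts here
def lambda_handler(event, context):
    ...  # Handler body omitted in this snippet; see the tutorial for the complete code.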

This Lambda function returns a list of movies based on a search string (such as movie title, director, or actor) provided by the user.
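Because the snippet above stops at the handler signature, the following sketch illustrates one way the request and response handling could be shaped. It is not the tutorial's code: the search call is passed in as a function so the logic can be exercised without network access, and the helper names and the permissive CORS header are assumptions.

```python
import json


def build_response(body_text, status_code=200):
    """Wrap a body string in the response shape API Gateway proxy integration expects."""
    return {
        "statusCode": status_code,
        "headers": {"Access-Control-Allow-Origin": "*"},  # assumed CORS policy
        "isBase64Encoded": False,
        "body": body_text,
    }


def handle_search(event, search):
    """Extract the q parameter and delegate to search(term), which returns JSON text."""
    params = event.get("queryStringParameters") or {}
    term = params.get("q")
    if not term:
        error = json.dumps({"error": "missing query parameter 'q'"})
        return build_response(error, status_code=400)
    return build_response(search(term))


# Stand-in for the signed HTTP request to the collection endpoint.
fake_search = lambda term: json.dumps({"hits": [], "term": term})
print(handle_search({"queryStringParameters": {"q": "Harry Potter"}}, fake_search))
```

In a real function, the search argument would wrap the signed request to the collection's _search URL built earlier in the snippet.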

Next, you need to configure the permissions in OpenSearch Serverless’s data access policy to let the Lambda function access the collection.

  1. On the Lambda console, navigate to your function.
  2. On the Configuration tab, in the Permissions section, under Execution role, copy the value for Role name.
  3. Add this role name as one of the principals of your movie-search collection’s data access policy.

Principals can be AWS Identity and Access Management (IAM) users, role ARNs, or SAML identities. These principals must be within the current AWS account.

After you add the role name as a principal, you can see the role ARN updated in your rule, as shown in the following screenshot.

Now you can grant collection and index permissions to this principal.

For more details about data access policies, refer to Data access control for Amazon OpenSearch Serverless. Skipping this step or not running it correctly will result in permission errors, and your Lambda code won’t be able to query the movie catalog.

Configure API Gateway

API Gateway acts as a front door for applications to access the code running on Lambda. To create, configure, and deploy the API for the GET method, refer to the steps in the tutorial. For API Gateway to pass the requests to the Lambda function, configure it as a trigger to invoke the Lambda function.

The next step is to integrate it with the front end.

Test the web application

To build the front-end UI, you can download the following sample JavaScript web service. Open the scripts/search.js file and update the apigatewayendpoint variable to point to your API Gateway endpoint:

var apigatewayendpoint = 'https://kxxxxxxzzz.execute-api.us-west-2.amazonaws.com/opensearch-api-test/';
// Update this variable to point to your API Gateway endpoint.

You can access the front-end application by opening index.html in your browser. When the user runs a query on the front-end application, it calls API Gateway and Lambda to serve up the content hosted in the OpenSearch Serverless collection.
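To check the API without the front end, you can build the same kind of request URL yourself. The sketch below only constructs the URL; the q parameter name comes from the Lambda query shown later, while the helper name and the example endpoint are hypothetical.

```python
from urllib.parse import urlencode


def search_url(endpoint, term):
    """Append the search term as the q query-string parameter."""
    return endpoint.rstrip("/") + "/?" + urlencode({"q": term})


# Placeholder endpoint; substitute your own API Gateway invoke URL.
print(search_url("https://example.invalid/opensearch-api-test/", "Harry Potter"))
```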

When you search the movie catalog, the Lambda function runs the following query:

    # Put the user query into the query DSL for more accurate search results.
    # Fields can be boosted with the ^ syntax (for example, "title^4"); none are boosted here.
    query = {
        "size": 25,
        "query": {
            "multi_match": {
                "query": event['queryStringParameters']['q'],
                "fields": ["title", "plot", "actors"]
            }
        }
    }

The query returns documents based on a provided query string. Let’s look at the parameters used in the query:

  • size – The size parameter is the maximum number of documents to return. In this case, a maximum of 25 results is returned.
  • multi_match – You use a match query when matching larger pieces of text, especially when you’re using OpenSearch’s relevance to sort your results. With a multi_match query, you can query across multiple fields specified in the query.
  • fields – The list of fields you are querying.

In a search for “Harry Potter,” the document with the matching term both in the title and plot fields appears higher than other documents with the matching term only in the title field.
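If you want matches in some fields to count more than others, multi_match accepts a boost per field using the ^ syntax. The following sketch builds the same query with optional boosts; the weights shown are arbitrary illustrations, not values from the post.

```python
def build_movie_query(term, boosts=None, size=25):
    """Build a multi_match query over title, plot, and actors.

    boosts maps a field name to a weight, for example {"title": 4}.
    """
    boosts = boosts or {}
    fields = []
    for name in ("title", "plot", "actors"):
        weight = boosts.get(name)
        # "title^4" means a match in title is weighted four times as heavily.
        fields.append(f"{name}^{weight}" if weight else name)
    return {
        "size": size,
        "query": {"multi_match": {"query": term, "fields": fields}},
    }


# Illustrative weights only.
print(build_movie_query("Harry Potter", boosts={"title": 4, "plot": 2}))
```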

Congratulations! You have configured and deployed a search application fronted by API Gateway, running Lambda functions for the queries served by OpenSearch Serverless.

Clean up

To avoid unwanted charges, delete the OpenSearch Serverless collection, Lambda function, and API Gateway that you created.

Conclusion

In this post, you learned how to build a simple search application using OpenSearch Serverless. With OpenSearch Serverless, you don’t have to worry about managing the underlying infrastructure. OpenSearch Serverless supports the same ingestion and query APIs as the OpenSearch Project. You can quickly get started by ingesting the data into your OpenSearch Serverless collection, and then perform searches on the data using your web interface.

In subsequent posts, we dive deeper into many other search queries and features that you can use to make your search application even more effective.

We would love to hear how you are building your search applications today. If you’re just getting started with OpenSearch Serverless, we recommend getting hands-on with the Getting started with Amazon OpenSearch Serverless workshop.


About the authors

Aish Gunasekar is a Specialist Solutions Architect with a focus on Amazon OpenSearch Service. Her passion at AWS is to help customers design highly scalable architectures and help them in their cloud adoption journey. Outside of work, she enjoys hiking and baking.

Pavani Baddepudi is a senior product manager working in search services at AWS. Her interests include distributed systems, networking, and security.

Simplify private network access for solutions using Amazon OpenSearch Service managed VPC endpoints

Post Syndicated from Aish Gunasekar original https://aws.amazon.com/blogs/big-data/simplify-private-network-access-for-solutions-using-amazon-opensearch-service-managed-vpc-endpoints/

Amazon OpenSearch Service makes it easy for you to perform interactive log analytics, real-time application monitoring, website search, and more. OpenSearch is an open source, distributed search and analytics suite. Amazon OpenSearch Service offers the latest versions of OpenSearch, support for 19 versions of Elasticsearch (1.5 to 7.10 versions), as well as visualization capabilities powered by OpenSearch Dashboards and Kibana (1.5 to 7.10 versions). Amazon OpenSearch Service currently has tens of thousands of active customers with hundreds of thousands of clusters under management processing trillions of requests per month.

To meet the needs of customers who want simplicity in their network setup with Amazon OpenSearch Service, you can now use Amazon OpenSearch Service-managed virtual private cloud (VPC) endpoints (powered by AWS PrivateLink) to connect your applications to Amazon OpenSearch Service domains launched in Amazon Virtual Private Cloud (Amazon VPC). With Amazon OpenSearch Service-managed VPC endpoints, you can privately access your Amazon OpenSearch Service domain from multiple VPCs in your account or other AWS accounts based on your application needs without configuring other service features such as VPC peering, AWS Transit Gateway (TGW), or other more complex network routing strategies that place an operational burden on your support and engineering teams.

The feature is built using AWS PrivateLink. AWS PrivateLink provides private connectivity between VPCs, supported AWS services, and your on-premises networks without exposing your traffic to the public internet. It provides you with the means to connect multiple application deployments effortlessly to your Amazon OpenSearch Service domains.

This post introduces Amazon OpenSearch Service-managed VPC endpoints, which build on top of AWS PrivateLink, and shows how you can access a private Amazon OpenSearch Service domain from one or more VPCs hosted in the same account, or even from VPCs hosted in other AWS accounts, using AWS PrivateLink managed by Amazon OpenSearch Service.

Amazon OpenSearch Service managed VPC endpoints

Before the launch of Amazon OpenSearch Service managed VPC endpoints, if you needed to gain access to your domain outside of your VPC, you had three options:

  • Use VPC peering to connect your VPC with other VPCs
  • Use AWS Transit Gateway to connect your VPC with other VPCs
  • Create your own implementation of an AWS PrivateLink setup

The first two options require you to set up your VPCs so that the Classless Inter-Domain Routing (CIDR) block ranges don’t overlap. If they do overlap, your options are more complicated. The third option, creating your own implementation of AWS PrivateLink, involves configuring a Network Load Balancer (NLB) and associating a target group with the NLB as one of the steps in the setup. The self-managed architecture discussed later in this post demonstrates these additional layers of complexity.
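To make the CIDR constraint concrete, here is a minimal sketch that checks a set of VPC CIDR blocks for overlaps using Python's standard `ipaddress` module. The VPC names and address ranges are hypothetical and chosen only for illustration; they don't come from the architecture in this post.

```python
import ipaddress
from itertools import combinations


def find_overlapping_cidrs(vpc_cidrs):
    """Return sorted pairs of VPC names whose CIDR blocks overlap.

    vpc_cidrs maps a VPC name to its CIDR block string.
    """
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in vpc_cidrs.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    ]


# Hypothetical VPC names and CIDR ranges, for illustration only.
example_vpcs = {
    "api-team-1": "10.0.0.0/16",
    "api-team-2": "10.0.128.0/17",    # falls inside 10.0.0.0/16
    "data-refinement": "10.1.0.0/16",
}

print(find_overlapping_cidrs(example_vpcs))
```

Any pair this check reports would rule out VPC peering or Transit Gateway routing between those two VPCs without readdressing one of them.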

With Amazon OpenSearch Service managed VPC endpoints (powered by AWS PrivateLink), these complex setups and processes are no longer needed!

You can access your private Amazon OpenSearch Service domain as if it were deployed in each of the VPCs that you want to connect to your domain. If you need private connectivity from your on-premises hybrid deployments, then AWS PrivateLink helps you extend access to your Amazon OpenSearch Service domain to your data centers with minimal effort.

By using AWS PrivateLink with Amazon OpenSearch Service, you can realize the following benefits:

  • You simplify your network architecture between hybrid, multi-VPC, and multi-account solutions
  • You address a multitude of compliance concerns by better controlling the traffic that moves between your solutions and Amazon OpenSearch Service domains

Shared search cluster for multiple development teams

Imagine that your company hosts a software as a service (SaaS) application that provides a search application programming interface (API) for the healthcare industry. Each team works on a different function of the API. The development teams, API team 1 and API team 2, are in two different AWS accounts, and each has its own VPC within its account. Another team (the data refinement team) works on ingestion and data refinement to populate the Amazon OpenSearch Service domain, which is hosted in the same account as API team 2 but in a different VPC. The teams share the domain during the development cycles to save costs and foster collaboration on the data modeling.

Solution overview

Self-managed AWS PrivateLink architecture to connect different VPCs

In this scenario, prior to Amazon OpenSearch Service managed VPC endpoints (powered by AWS PrivateLink), you would have to complete the following steps:

  1. Deploy an NLB in your VPC
  2. Create a target group that points to the IP addresses of the elastic network interfaces (ENIs) that Amazon OpenSearch Service creates in your VPC when it launches the domain
  3. Create an AWS PrivateLink deployment and reference your newly created NLB

When you implement the NLB, a target group can reference only IP addresses, Amazon EC2 instances, or an Application Load Balancer (ALB). If you reference IP addresses as targets, you have to build a process that detects changes in the IP addresses when the domain changes due to service-initiated or self-initiated blue/green deployments. You must maintain yet another complex process to ensure that your target groups always point to active ENIs, or you lose connectivity.

Typically, customers use an AWS Lambda function with scheduled events in Amazon CloudWatch. The Lambda function detects the current state by finding the ENIs that are marked as active and whose description matches the ENIs that your domain creates. You schedule the function to run within the time to live (TTL) of the Domain Name System (DNS) settings (typically 60 seconds) and compare the existing IP addresses in the target group with any new ones found when you query all ENIs in the VPC with a description referencing your domain. You then build a new target group with the deltas, swap the target groups, and drop the old one. It’s tricky, it’s complex, and you have to maintain the solution!
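The comparison step at the heart of that Lambda function can be sketched as a small pure function. This is an illustrative sketch, not the implementation customers actually run: the function name is hypothetical, and it assumes the two input lists have already been gathered elsewhere (for example, from the EC2 `DescribeNetworkInterfaces` and Elastic Load Balancing `DescribeTargetHealth` APIs). It only computes the delta; it doesn't build or swap target groups.

```python
def plan_target_changes(registered_ips, active_eni_ips):
    """Compare the IPs in the target group with the domain's active ENI IPs.

    Returns the IPs that need to be added to, and removed from, the
    target group so that it matches the currently active ENIs.
    """
    registered = set(registered_ips)
    active = set(active_eni_ips)
    return {
        "register": sorted(active - registered),    # new ENIs not yet targeted
        "deregister": sorted(registered - active),  # stale IPs with no active ENI
    }


# Hypothetical addresses: one ENI was replaced during a blue/green deployment.
changes = plan_target_changes(
    registered_ips=["10.0.1.10", "10.0.2.10"],
    active_eni_ips=["10.0.2.10", "10.0.3.10"],
)
print(changes)
```

An empty result for both keys means the target group is already in sync and the scheduled run has nothing to do.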

With the new simplified networking architecture, your teams go through the following steps.

OpenSearch Service managed VPC endpoints architecture (powered by AWS PrivateLink)

Because Amazon OpenSearch Service takes care of the infrastructure described previously (though not necessarily with the same implementation), all you need to do is create the connections using the instructions in our service documentation.
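As a rough illustration of what creating those connections can look like programmatically, the following sketch wraps two operations of the OpenSearch Service API, `authorize_vpc_endpoint_access` and `create_vpc_endpoint`, as exposed by the boto3 `opensearch` client. The helper names, ARN, account ID, subnet ID, and security group ID are all hypothetical, and the service documentation remains the authoritative source for the full procedure and required permissions.

```python
def authorize_account(client, domain_name, account_id):
    """Run by the domain owner: allow another AWS account to create
    managed VPC endpoints for the domain."""
    client.authorize_vpc_endpoint_access(DomainName=domain_name, Account=account_id)


def create_managed_endpoint(client, domain_arn, subnet_ids, security_group_ids):
    """Run in the consuming account: create a managed VPC endpoint in the
    given subnets and return its ID."""
    response = client.create_vpc_endpoint(
        DomainArn=domain_arn,
        VpcOptions={
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
    )
    return response["VpcEndpoint"]["VpcEndpointId"]


# Example usage (assumes boto3 is installed and credentials are configured;
# every identifier below is a placeholder):
#
#   import boto3
#   client = boto3.client("opensearch")
#   authorize_account(client, "search-domain", "111122223333")
#   endpoint_id = create_managed_endpoint(
#       client,
#       "arn:aws:es:us-east-1:444455556666:domain/search-domain",
#       ["subnet-0123456789abcdef0"],
#       ["sg-0123456789abcdef0"],
#   )
```

Passing the client in as a parameter keeps the helpers easy to exercise with a stub, and lets the domain owner and the consuming account each use their own credentials.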

Once you complete the steps in the instructions and remove your own implementation, your architecture is then simplified as seen in the following diagram.

At this point, the development teams (API team 1 and API team 2) can access the Amazon OpenSearch Service domain through the Amazon OpenSearch Service managed VPC endpoint. This option is highly scalable, with a simplified network architecture in which you don’t have to manage an NLB, target groups, or other additional resources. If the number of development teams and VPCs grows in the future, you associate the domain with an interface VPC endpoint in each new VPC. You can access the domain from VPCs in the same or different accounts, even if they have overlapping CIDR block IP ranges.

Conclusion

In this post, we walked through the architectural design for accessing an Amazon OpenSearch Service domain from different VPCs across different accounts using OpenSearch Service-managed VPC endpoints (powered by AWS PrivateLink). Using Transit Gateway, self-managed AWS PrivateLink, or VPC peering required complex networking strategies that increased operational burden. With the introduction of VPC endpoints for Amazon OpenSearch Service, your solutions are greatly simplified and, even better, the endpoints are managed for you!


About the authors

Aish Gunasekar is a Specialist Solutions architect with a focus on Amazon OpenSearch Service. Her passion at AWS is to help customers design highly scalable architectures and help them in their cloud adoption journey. Outside of work, she enjoys hiking and baking.

Kevin Fallis (@AWSCodeWarrior) is an AWS specialist search solutions architect. His passion at AWS is to help customers leverage the correct mix of AWS services to achieve success for their business goals. His after-work activities include family, DIY projects, carpentry, playing drums, and all things music.