All posts by Xiaoxue Xu

Implement secure hybrid and multicloud log ingestion with Amazon OpenSearch Ingestion

Post Syndicated from Xiaoxue Xu original https://aws.amazon.com/blogs/big-data/implement-secure-hybrid-and-multicloud-log-ingestion-with-amazon-opensearch-ingestion/

Running applications across hybrid or multicloud environments creates a common challenge: fragmented logs scattered across different platforms. This fragmentation complicates monitoring, slows troubleshooting, and reduces operational visibility. To address this, many organizations seek to implement secure log ingestion from all environments into a centralized platform.

Amazon OpenSearch Service provides a unified solution for real-time search, analytics, and log management across your entire infrastructure. Amazon OpenSearch Ingestion, a fully managed data collector, simplifies data processing with built-in capabilities to filter, transform, and enrich your logs before analysis.

However, securely sending logs from non-AWS environments presents a challenge. Every request to OpenSearch Ingestion requires AWS Signature Version 4 (AWS SigV4) authentication, traditionally requiring long-term credentials that introduce security risks. AWS Identity and Access Management Roles Anywhere solves this problem by providing temporary credentials for workloads running outside AWS.

In this post, we demonstrate how to configure Fluent Bit, a fast and flexible log processor and router supported by various operating systems, to securely send logs from any environment to OpenSearch Ingestion using IAM Roles Anywhere. This approach alleviates the need for long-term credentials while providing a comprehensive view of your application logs across all environments—improving security, simplifying operations, and enhancing your ability to quickly resolve issues.

Solutions overview

The solution in this post uses Fluent Bit to collect logs, retrieve temporary credentials from IAM Roles Anywhere, and sign HTTP log ingestion requests with AWS SigV4 before sending them to the OpenSearch Ingestion pipeline. The following diagram shows the architecture.

Architecture for log ingestion with AWS IAM Roles Anywhere

This solution provisions the following key components:

  • Certificate authority – For this post, we use AWS Private Certificate Authority (AWS Private CA) as the certificate authority (CA) source. Alternatively, you can integrate with an external CA; for more details, see IAM Roles Anywhere with an external certificate authority. Certificates issued from public CAs can’t be used as trust anchors for IAM Roles Anywhere.
  • X.509 Certificate – We use a sample private certificate stored in AWS Certificate Manager (ACM) and issued by AWS Private CA.
  • IAM Roles Anywhere configuration – This includes the following:
    • Trust anchor – Establishes trust between IAM Roles Anywhere and the specified CA.
    • IAM role – Grants permissions for log ingestion and trusts the IAM Roles Anywhere service principal. At minimum, this role must be granted permission for the osis:Ingest action.
    • Profile – Defines which roles IAM Roles Anywhere can assume and the maximum permissions granted with the temporary credentials.
  • OpenSearch Service domain – For this post, we use an OpenSearch Service domain, which is an AWS provisioned equivalent of an open source OpenSearch cluster. We create the domain within a virtual private cloud (VPC); see VPC versus public domains for more information. Alternatively, you can use an Amazon OpenSearch Serverless collection, which is an OpenSearch cluster that scales compute capacity based on your application’s needs.
  • OpenSearch Ingestion – This is configured to receive logs over HTTP as the pipeline source and forward them to the OpenSearch Service domain as the pipeline sink.

Connectivity between AWS and your hybrid or multicloud environments

You can access your OpenSearch Ingestion pipelines using an interface VPC endpoint with push-based HTTP source, which provides private IP address connectivity. For production environments, we recommend using these private connections through interface endpoints for enhanced security.

Setting up this connectivity requires additional configuration, such as creating an AWS Site-to-Site VPN connection with your hybrid and multicloud network. Although this post focuses on the log ingestion solution, you can find detailed guidance on network connectivity in the following resources:

How Fluent Bit retrieves temporary credentials using IAM Roles Anywhere

Using the HTTP output plugin, Fluent Bit can send logs to the OpenSearch Ingestion pipeline. The following diagram is a simplified view of how Fluent Bit retrieves AWS credentials.

How Fluent Bit retrieve AWS credentials

On Linux systems, Fluent Bit can use an AWS Command Line Interface (AWS CLI) profile that uses the credential_process parameter to trigger an external process. This external process is invoked to generate or retrieve credentials not directly supported by the AWS CLI.

The following are two common mechanisms for the external process:

Although both options are viable, this post focuses on IAM Roles Anywhere. In this setup, the AWS IAM Roles Anywhere Credential Helper is executed to handle the signing process for the CreateSession API. This returns credentials in a JSON format that Fluent Bit can consume.

As of this writing, the Fluent Bit aws_profile configuration is supported only on Linux. It is untested on other Unix-based systems (such as macOS) and is not implemented for Windows.

Prerequisites

Before you begin this walkthrough, make sure you have the following:

Deploy AWS resources with AWS CloudFormation

Follow these steps to deploy AWS resources required for this solution:

  1. Choose Launch Stack:


  2. Enter a unique name for Stack name. The default value is osis-with-iamra.
  3. Configure the stack parameters. Default values are provided in the following table.
Parameter Default value Description
CACommonName example.com Common Name for the CA
CACountry US Organization for the CA
CAOrganization Example Org Country for the CA
CAValidityInDays 1826 Validity period in days for the CA certificate
VPCCIDR 10.0.0.0/16 IPv4 CIDR range for the VPC used for OpenSearch Service domain
PublicSubnetCIDR 10.0.0.0/24 IPv4 CIDR range for public subnet
PrivateSubnet1CIDR 10.0.1.0/24 IPv4 CIDR range for private subnet
PrivateSubnet2CIDR 10.0.2.0/24 IPv4 CIDR range for private subnet
DomainName test-domain Name of the OpenSearch Service domain
PipelineName test-pipeline Name of the OpenSearch Ingestion pipeline
PipelineIngestionPath /test-ingestion-path Ingestion path for the OpenSearch Ingestion pipeline
  1. Select the acknowledgement check box and choose Create Stack.
    Stack deployment takes about 30 minutes to complete.
  2. When stack creation is complete, navigate to the Outputs tab on the AWS CloudFormation console and note down the values for the resources created.
    The following table summarizes the output values.
Output Description Example value
ACMCertificateArn Amazon Resource Name (ARN) of the ACM certificate. You will use this for exporting certificate and private key files using the AWS CLI in a later step. arn:aws:acm:aa-example-1:111122223333:certificate/a1b2c3d4-5678-90ab-cdef-EXAMPLE11111
CertificateAuthorityArn ARN of the Private CA. arn:aws:acm-pca:aa-example-1:111122223333:certificate-authority/a1b2c3d4-5678-90ab-cdef-EXAMPLE22222
TrustAnchorArn ARN of the IAM Roles Anywhere profile. You will use this value for configuring credential_process for IAM Roles Anywhere in a later step. arn:aws:rolesanywhere:aa-example-1:111122223333:trust-anchor/a1b2c3d4-5678-90ab-cdef-EXAMPLE33333
IngestionRoleArn ARN of the OpenSearch Ingestion role. You will use this value for configuring credential_process for IAM Roles Anywhere in a later step. arn:aws:iam::111122223333:role/role-name-with-path
ProfileArn ARN of the IAM Roles Anywhere profile. You will use this value for configuring credential_process for IAM Roles Anywhere in a later step. arn:aws:rolesanywhere:aa-example-1:111122223333:profile/a1b2c3d4-5678-90ab-cdef-EXAMPLE44444
OpenSearchDomainEndpoint Endpoint of the VPC OpenSearch domain. You will use this public endpoint for querying your index after ingestion. vpc-my-domain-123456789012.aa-example-1.es.amazonaws.com
PipelineEndpoint Endpoint of the OpenSearch Ingestion pipeline. You will use this public endpoint in the Fluent Bit configuration. my-pipeline-123456789012.aa-example-1.osis.amazonaws.com
PipelineIngestionPath Ingestion path for the OpenSearch Ingestion pipeline. /test-ingestion-path

Export a sample private certificate using CloudShell

Follow these steps to export the sample private certificate created by the CloudFormation stack:

  1. Open CloudShell. For more details, see Navigating the AWS CloudShell interface.
  2. Export the certificate ARN from the CloudFormation outputs. If you changed the stack name in the previous step, use that value for <stack-name>, otherwise use the default value osis-with-iamra.
export CERT_ARN=$(aws cloudformation describe-stacks \
    --stack-name <stack-name> \
    --query 'Stacks[0].Outputs[?OutputKey==`ACMCertificateArn`].OutputValue' \
    --output text)
  1. Extract the certificate and private key files:
# Generate and save the passphrase
export PASSPHRASE=$(openssl rand -base64 32)

# Export certificate using environment variables
aws acm export-certificate \
    --certificate-arn $CERT_ARN \
    --passphrase $(echo -n "$PASSPHRASE" | base64) \
    > cert_export.json

# Extract components to separate files
jq -r '.Certificate' cert_export.json > certificate.pem
jq -r '.PrivateKey' cert_export.json > encrypted_private_key.pem

# Decrypt the private key
openssl rsa -in encrypted_private_key.pem -out private_key.pem -passin pass:"$PASSPHRASE"

# Clear environment variables
unset PASSPHRASE CERT_ARN
  1. Download the extracted certificate and private key files from CloudShell:
    1. /home/cloudshell-user/certificate.pem
    2. /home/cloudshell-user/private_key.pem

Configure an AWS CLI profile

Follow these steps to configure an AWS CLI profile for your log ingestion environment:

  1. Store the downloaded certificate and private key to your environment. For an automated approach to generate and rotate certificates, see Set up AWS Private Certificate Authority to issue certificates for use with IAM Roles Anywhere.
  2. Create a new profile named osis-pipeline-credentials that invokes the credential process. Replace the placeholders with your specific values. Find the values for trusted-anchor-arn, profile-arn, and ingestion-role-arn in your CloudFormation stack outputs.
aws configure set profile.osis-pipeline-credentials.credential_process "</path/to/aws_signing_helper> credential-process \
      --certificate </path/to/certificate.pem> \
      --private-key </path/to/private_key.pem> \
      --trust-anchor-arn <trusted-anchor-arn> \
      --profile-arn <profile-arn> \
      --role-arn <ingestion-role-arn>"
  1. Verify your configuration. Open the ~/.aws/config file and confirm it contains a profile named osis-pipeline-credentials similar to the following:
[profile osis-pipeline-credentials]
credential_process = </path/to/aws_signing_helper> credential-process       --certificate </path/to/certificate.pem>       --private-key </path/to/private_key.pem>       --trust-anchor-arn <trusted-anchor-arn>       --profile-arn <profile-arn>       --role-arn <ingestion-role-arn>

Configure Fluent Bit

Run the following command to create a Fluent Bit configuration. Replace the placeholders with your specific values. Find the osis-pipeline-endpoint and pipeline-ingestion-path values in your CloudFormation stack outputs.

cat << 'EOF' > ~/fluent-bit.conf
[INPUT]
  name                  tail
  path                  /var/log/syslog
  read_from_head        true
  refresh_interval      5
[OUTPUT]
  name 			http
  match	   		*
  aws_service 		osis
  host 			<osis-pipeline-endpoint>
  port 			443
  uri 			<pipeline-ingestion-path> 
  format 		json
  aws_auth	 	true
  aws_region 		<aa-example-1>   
  aws_profile 		osis-pipeline-credentials
  tls 			On
EOF

This example configuration includes the following:

  • Uses the tail input plugin to monitor the /var/log/syslog file
  • Uses the http output plugin to flush log records to the OpenSearch Ingestion pipeline endpoint
  • Uses the osis-pipeline-credentials profile to obtain temporary AWS credentials for SigV4 authentication (aws_auth set to true)

Test the solution

Follow these steps to test the setup:

  1. Start the Fluent Bit client with the configuration file fluent-bit.conf that you created in the previous step. Replace the placeholder with the value applicable to your environment. For Ubuntu 24.04, the default path of the Fluent Bit client is /opt/fluent-bit/bin/fluent-bit. Adjust the path if using other distributions.

sudo AWS_CONFIG_FILE=~/.aws/config <path-to-fluent-bit> -c ~/fluent-bit.conf

  1. Because the solution in this post launched the OpenSearch Service domain within a VPC, you will need an environment that has connectivity to the VPC. For this post, we create a CloudShell VPC environment to run the commands in the next step. Find the VPC, subnet, and security group to use from your CloudFormation stack outputs.
  2. The solution that you deployed through AWS CloudFormation dynamically creates indexes based on ingestion timestamps, format logs-%{yyyy.MM.dd}. You can specify your preferred naming using OpenSearch Ingestion index management. You can query your OpenSearch index using your preferred tool to see the ingested logs from Fluent Bit. We use awscurl in a CloudShell environment as shown in the following example. Replace the placeholders with your specific values. Find the opensearch-domain-endpoint value in your CloudFormation stack outputs.
pip install awscurl

export OPENSEARCH_DOMAIN_ENDPOINT=https://<opensearch-domain-endpoint>

# List indices matching logs-%{yyyy.MM.dd} format and get most recent one to query
export INDEX=$(awscurl --service es "$OPENSEARCH_DOMAIN_ENDPOINT/_cat/indices?v" | grep -E "logs-[0-9]{4}\.[0-9]{2}\.[0-9]{2}" | sort -r | head -1 | awk '{print $3}')

awscurl --service es $OPENSEARCH_DOMAIN_ENDPOINT/$INDEX/_search \
        -X GET -H "Content-Type: application/json" \
        -d '{
            "size": 10,
            "sort": [
              {"@timestamp": {"order": "desc"}}
            ],
            "query": { "match_all": {} }
          }' | jq '.hits.hits[]._source'

The following is an example of the expected output:

{
  "date": 1732039662.399506,
  "log": "2024-11-19T18:07:42.399375+00:00 test-server fluent-bit[9986]: 200 OK",
  "@timestamp": "2024-11-19T18:07:42.812Z"
}
{
  "date": 1732039662.399501,
  "log": "2024-11-19T18:07:42.399224+00:00 test-server fluent-bit[9986]: [2024/11/19 18:07:42] [ info] [output:http:http.0] test-pipeline-123456789012.us-east-2.osis.amazonaws.com:443, HTTP status=200",
  "@timestamp": "2024-11-19T18:07:42.812Z"
}
...

Clean up

To avoid future charges, remove the deployed resources:

  1. Delete the CloudFormation stack.
  2. Remove generated files from CloudShell:

rm cert_export.json encrypted_private_key.pem certificate.pem private_key.pem

Conclusion

In this post, we demonstrated how to obtain temporary credentials from IAM Roles Anywhere and securely ingest logs from hybrid or multicloud environments into OpenSearch Service using OpenSearch Ingestion. This approach minimizes the risk of credential exposure while enabling centralized log collection from distributed workloads. This solution is particularly valuable for organizations managing complex infrastructures across multiple environments and looking to consolidate observability data in OpenSearch Service. For additional details, refer to the following resources:

If you have questions or feedback about this post, please leave them in the comments section.


About the Authors

Xiaoxue Xu is a Solutions Architect for AWS based in Toronto. She primarily works with financial services customers to help secure their workload and design scalable solutions on the AWS Cloud.

Simran Singh is a Senior Solutions Architect at AWS. In this role, he assists our large enterprise customers in meeting their key business objectives using AWS. His areas of expertise include artificial intelligence and machine learning, security, and improving the experience of developers building on AWS.

Managing identity source transition for AWS IAM Identity Center

Post Syndicated from Xiaoxue Xu original https://aws.amazon.com/blogs/security/managing-identity-source-transition-for-aws-iam-identity-center/

AWS IAM Identity Center manages user access to Amazon Web Services (AWS) resources, including both AWS accounts and applications. You can use IAM Identity Center to create and manage user identities within the Identity Center identity store or to connect seamlessly to other identity sources.

Organizations might change the configuration of their identity source in IAM Identity Center for various reasons. These include switching identity providers (IdPs), expanding their identity footprint, adopting new features, and so on. These transitions can disrupt user access and require planning to minimize downtime.

In this blog post, we walk you through the process of switching from one identity source to another and provide sample code that you can use to assist with the transition.

Background

The identity source configured in IAM Identity Center determines where users and groups are created and managed. Each organization can connect to only one identity source at a time. Identity Center supports three main identity source options:

  1. Identity Center directory: This is the default identity store for IAM Identity Center. You can use it to directly create and administer your users and groups within Identity Center without relying on an external provider.
  2. Active Directory: You can configure integration with an on-premises Active Directory or with AWS Managed Microsoft AD using AWS Directory Service. This integration enables you to use your existing Active Directory identities and group memberships.
  3. External IdP: You can continue using your current third-party IdPs that support SAML 2.0, such as Okta Universal Directory or Microsoft Entra ID (formerly Azure AD).

Understanding these identity source options can help you choose the source that best fits your user management needs based on your existing infrastructure and authentication requirements.

Figure 1 explains the flow of users and groups accessing AWS resources. This access is granted through the following:

Figure 1: Granting access to AWS resources for users and groups managed by an identity source in IAM Identity Center

Figure 1: Granting access to AWS resources for users and groups managed by an identity source in IAM Identity Center

When you change the identity source, the work required varies depending on the original and new sources. AWS documentation details these considerations. In minimal impact scenarios, assignments remain intact, although you need to force password resets or verify correct assertions from the new source. More disruptive scenarios delete users, groups, and their assignments. In those scenarios, you need to restore the deleted entries after changing the identity source.

Sample deployment

This deployment covers permission sets and application assignments’ backup and restore. These scripts associate assignments with unique user attributes such as UserName, and group attributes such as DisplayName. Attributes that might change during the user and group restoration process, such as UserId and GroupId, aren’t used.

What isn’t covered includes users, groups, permission sets, and applications backup and restore.

  • Users and groups backup and restore aren’t covered because they depend heavily on the format of the source and target IdPs.
  • Because we’re working with identity source switching, the permission sets will remain unchanged and applications will not be deleted.
  • If you’re changing IdPs as part of an AWS Region , the IAM Identity Center instance will be deleted. The applications and permission sets will be deleted in addition to assignments. In this case, you must redeploy the applications. See How to automate the review and validation of permissions for users and groups in AWS IAM Identity Center for information about backing up permission sets.

The sample scripts and detailed steps are available on GitHub.

Note: This solution is available in the GitHub aws-samples repository. You can report bugs or make feature requests through GitHub Issues. The builders of this solution can help with GitHub issues. Enterprise Support customers can reach out to their Technical Account Manager (TAM) for further questions or feature requests.

Walkthrough

In this section, we walk you through the process of transitioning to a new identity source in IAM Identity Center.

Step 1: Backup users, groups, and assignments from the current identity source

This step is critical to preserve users’ information and their associated access scope.

How to backup users and groups:

Note: For some external IdPs, there are native integrations with Active Directory, such as Okta AD integration and Ping One AD Connect. You can set up a native integration and sync users and groups data without needing to backup and restore that data.

Assignments can be backed up by running the backup.py file from GitHub. Replace <IDC_STORE_ID> with your Identity Store ID (it looks like d-1234567890), and replace <IDC_ARN> with the Amazon Resource Name (ARN) for your IAM Identity Center instance.

python3 backup.py --idc-id <IDC_STORE_ID> --idc-arn <IDC_ARN>

This script uses both IAM Identity Center APIs and Identity Store APIs as shown in Figure 2 that generates permission set assignments backup files (UserAssignments.json and GroupAssignments.json) and an application assignments backup file (AppAssignments.json).

Figure 2: APIs used for backing up assignments

Figure 2: APIs used for backing up assignments

Step 2: Restore and validate the backed-up users and groups in the target identity source

The target will become the new authoritative identity source. When done, verify that the group memberships and attributes have been correctly transferred.

  • If the target is an IAM Identity Center directory, use APIs such as CreateUser, CreateGroup, and CreateGroupMembership to restore from the previous backup file.
  • If the target is Active Directory or an external IdP, use the corresponding native import features or integration tools to restore.

Step 3: Configure IAM Identity Center to connect to the new identity source and synchronize users and groups

Update your IAM Identity Center configuration to point to the new source. If applicable, use tools such as configurable AD sync or automatic provisioning with SCIM to synchronize your restored identities.

WARNING: While the directory is being rebuilt, your users will not have access to AWS accounts or applications through IAM Identity Center until all assignments are restored in Step 4.

Step 4: Restore assignments to users and groups in the new identity source

The APIs used to restore assignments are as shown in Figure 3.

Figure 3: APIs used for restoring assignments

Figure 3: APIs used for restoring assignments

Assignments can be restored by running the restore.py file from GitHub. Replace <NEW_IDC_STORE_ID> with your newly configured Identity Store ID (it looks like d-1234567890) and replace <IDC_ARN> with the ARN for your IAM Identity Center instance.

python3 restore.py --idc-id <NEW_IDC_STORE_ID> --idc-arn <IDC_ARN>

This script uses the APIs illustrated in Figure 2 and picks up backup files (UserAssignments.json, GroupAssignments.json, and AppAssignments.json) from Step 1 by default. Account permission set assignment results are automatically retrieved five times using exponential backoff. If the result is other than SUCCEEDED after five retries, the principal ID will be marked as failed and exported in error logs.

Note: For AWS managed applications that maintain a separate identity source, using the CreateApplicationAssignments API to restore application assignments will not preserve user access. These applications typically have dependencies on the original identity source ID, or dependencies on UserId and GroupId from the original identity source. This dependency is represented by importing users or groups from IAM Identity Center during the AWS managed application creation process. Example AWS managed applications include Amazon SageMaker Studio and Amazon Q Developer. These applications must be restored on a case-by-case basis and can require redeployment of the application.

Step 5: Validate user access using the new identity source

Make sure that users can still access the expected accounts and applications.

Conclusion

Transitioning your identity source in IAM Identity Center requires careful planning and implementation. This post outlined the steps to manage this transition. By following these steps, you can streamline the transition process, providing a smooth and efficient transfer of user access with minimal downtime. To get started, see the GitHub repository. For related posts, visit the AWS Security Blog channel and search for IAM Identity Center.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.
 

Xiaoxue Xu
Xiaoxue Xu

Xiaoxue Xu is a Solutions Architect for AWS based in Toronto. She primarily works with Financial Services customers to help secure their workload and design scalable solutions on the AWS Cloud.
Kanwar Bajwa
Kanwar Bajwa

Kanwar Bajwa is a Principal Technical Account Manager at AWS who works with customers to optimize their use of AWS services and achieve their business objectives.
Yee Fei Ooi
Yee Fei Ooi

Yee Fei is a Solutions Architect supporting independent software vendor (ISV) customers in Singapore and is part of the Containers TFC. She enjoys helping her customers to grow their businesses and build solutions that automate, innovate, and improve efficiency.