All posts by Andries Engelbrecht

Access Snowflake Horizon Catalog data using catalog federation in the AWS Glue Data Catalog

2026-01-14 Andries Engelbrecht

Post Syndicated from Andries Engelbrecht original https://aws.amazon.com/blogs/big-data/access-snowflake-horizon-catalog-data-using-catalog-federation-in-the-aws-glue-data-catalog/

This is a guest post by Andries Engelbrecht, Principal Partner Solutions Engineer at Snowflake, in partnership with AWS.

AWS announced a new catalog federation feature that allows you to directly access data from Snowflake Horizon Catalog through the AWS Glue Data Catalog. This integration enables you to discover and query Horizon Catalog data in Iceberg format through REST endpoints while applying fine-grained access controls using AWS Lake Formation. The new catalog federation combined with Snowflake’s catalog-linked database feature means users can access data stored across AWS and Snowflake from a single point of entry, reducing data movement and associated costs by eliminating the need to duplicate data across platforms.

In this post, we show you how to connect the AWS Glue Data Catalog to Snowflake Horizon Catalog and query the data using AWS analytics services. We cover how to set up catalogs in Horizon Catalog and configure required permissions, create and configure the federation connection in AWS Glue, implement fine-grained access controls using AWS Lake Formation, and finally, query federated tables using Amazon Athena. This step-by-step approach guides you through the complete process of establishing an integration between your Snowflake and AWS data environments.

Business examples and key benefits

Catalog federation enables several critical business scenarios while delivering key operational and strategic benefits.

Common examples

This federation capability addresses several key business scenarios:

Governed, cross-platform analytics: Query data across AWS and Snowflake environments to improve data-driven decision making without data movement or duplication
Data mesh implementation: Enable secure and federated data discovery while maintaining domain-oriented ownership
Compliance management: Implement consistent access controls and auditing across platforms

Key benefits

Operational efficiency: Eliminate data duplication and reduce Extract Transform Load (ETL) workloads
Enhanced security: Centralize access control through AWS Lake Formation with fine-grained permissions
Cost optimization: Minimize data transfer and storage costs across platforms
Improved agility: Enable faster time to insights with direct query access
Simplified governance: Maintain unified compliance and audit framework

Solution overview

The solution uses catalog federation in the AWS Glue Data Catalog to integrate with Snowflake Horizon Catalog. This integration supports both Snowflake Horizon, where the catalog is internal to Snowflake, and external catalogs such as Apache Polaris, Snowflake Open Catalog (a managed service that hosts Apache Polaris), and others.

The following diagram illustrates how AWS Glue Data Catalog federates with Snowflake Horizon Catalog, enabling customers to directly access Iceberg-format data managed by Snowflake Horizon Catalog through the Glue Data Catalog.

Architecture diagram showing integration between AWS services and Snowflake using federated catalog connections through Apache Iceberg REST API.

The integration works through three main components:

Authentication: Uses the OAuth2 credentials of a Snowflake principal
Access Control: AWS Lake Formation manages fine-grained permissions
Query Access: AWS Analytics services like Amazon Athena can directly query the federated tables

Now, we walk through the step-by-step process of setting up this integration.

Prerequisites

Before you begin, confirm you have the following:

A Snowflake account.
(Optional) A Snowflake Open Catalog account.
An AWS Identity and Access Management (IAM) role that is a Lake Formation data lake administrator in your AWS account. A data lake administrator is an IAM principal that can register Amazon S3 locations, access the Data Catalog, grant Lake Formation permissions to other users, and view AWS CloudTrail. See Create a data lake administrator for more information. This IAM role needs access to:
Install or update the latest AWS Command Line Interface (CLI) version to run the AWS CLI commands. For instructions, refer to Installing or updating the latest version of the AWS CLI.

Configure Snowflake Horizon Catalog for Iceberg external access

Snowflake Horizon Catalog already supports managing Iceberg tables. For this walkthrough, you need to create Snowflake-managed Iceberg tables with data stored in Amazon S3.

Follow these steps in order:

Create an external volume for S3: First, create an external volume that points to your S3 bucket where Iceberg table data is stored. Follow the instructions in Create External Volume(s) for the Iceberg Tables on S3.
Create a database: Create a database to organize your tables. Refer to the Snowflake database creation documentation.
Create a schema: Create a schema within your database following the Snowflake schema creation guide.
Create an Iceberg table: Create your Iceberg table using the external volume. Follow the instructions to Create Iceberg Table.

After completing these steps, your Snowflake-managed Iceberg tables are ready to federate with AWS Glue Data Catalog.

Configure access control and authentication

To enable AWS Glue to access your Snowflake-managed Iceberg tables, you need to configure access control and obtain authentication credentials.

Step 1: Configure access control

Create a dedicated Snowflake role for external engine access to establish clear governance boundaries. Follow the instructions in Configure Access Control for external engines and set up the appropriate permissions for your Iceberg tables.

Step 2: Obtain an access token

Generate an access token for authenticating AWS Glue to Snowflake Horizon Catalog. Snowflake supports three authentication mechanisms:

External OAuth
Key-pair authentication
Programmatic Access Token (PAT)

Choose the authentication method that best fits your security requirements and follow the corresponding Snowflake documentation to generate your credentials.

Catalog Federation supports OAuth or custom authentication. For details on using OAuth, refer to Federate to Snowflake Iceberg Catalog.

For this post, we use custom authentication and generate an access token using a PAT. In the following command, replace <accountidentifier> with your Snowflake account identifier, <role_name> with the principal role, and <token_value> with the principal’s Programmatic Access Token.

curl --location 'https://<accountidentifier>.snowflakecomputing.com/polaris/api/catalog/v1/oauth/tokens' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencode 'grant_type=client_credentials' \
--data-urlencode 'scope=session:role:<role_name>' \
--data-urlencode 'client_secret=<token_value>'

Note down the access token that is generated.
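The same token request can be sketched in Python using only the standard library. This is a minimal sketch, not Snowflake's reference client: the function names are hypothetical, and the `access_token` response field is an assumption based on the usual OAuth2 response shape, so check it against the actual response you receive.

```python
import json
import urllib.parse
import urllib.request


def build_token_request(account_identifier: str, role_name: str, pat: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that mirrors the curl command above."""
    url = (
        f"https://{account_identifier}.snowflakecomputing.com"
        "/polaris/api/catalog/v1/oauth/tokens"
    )
    # Same three form fields as the --data-urlencode flags in the curl command.
    form = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "scope": f"session:role:{role_name}",
            "client_secret": pat,
        }
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def extract_access_token(response_body: str) -> str:
    """Return the token from the JSON response.

    Assumption: the response uses the standard OAuth2 field name "access_token".
    """
    return json.loads(response_body)["access_token"]
```

To send the request, pass the result of `build_token_request(...)` to `urllib.request.urlopen` and feed the decoded response body to `extract_access_token`.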

Step 3: Enable catalog federation

With access control configured and authentication credentials in hand, AWS Glue Catalog Federation can now connect to and access Snowflake’s Horizon Catalog.

Optional: Snowflake Open Catalog configuration

If you prefer to use Snowflake Open Catalog for Iceberg external access instead, refer to Sync a Snowflake-managed table with Snowflake Open Catalog for alternative setup instructions.

Set up Glue Catalog federation with Snowflake Horizon Catalog

Create a secret on AWS Secrets Manager

Choose Store a new secret and select Other type of secret for the secret type.
Set the key-value pair:
- Key: BEARER_TOKEN
- Value: The access token noted earlier
Choose Next and provide the secret name as horizon-secret.
Complete the setup by choosing Store.

Alternatively, you can use the CLI to create the secret by running the following command.

Replace your-access-token and your-region with your actual values:

aws secretsmanager create-secret \
    --name horizon-secret \
    --description "Snowflake Horizon access token" \
    --secret-string '{
        "BEARER_TOKEN": "your-access-token"
    }' \
    --region your-region
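If you script this step, generating the secret string with a JSON serializer avoids shell-quoting mistakes when the token contains special characters. The following is a minimal sketch; the helper name `create_secret_command` is hypothetical, and it only assembles the argument list for the CLI command shown above rather than calling AWS.

```python
import json
import shlex


def create_secret_command(access_token: str, region: str, name: str = "horizon-secret") -> list:
    """Return the argv list for the `aws secretsmanager create-secret` command above."""
    # json.dumps handles escaping of quotes and backslashes inside the token.
    secret_string = json.dumps({"BEARER_TOKEN": access_token})
    return [
        "aws", "secretsmanager", "create-secret",
        "--name", name,
        "--description", "Snowflake Horizon access token",
        "--secret-string", secret_string,
        "--region", region,
    ]


if __name__ == "__main__":
    # Print a copy-pasteable, safely quoted command line (placeholder values).
    print(shlex.join(create_secret_command("your-access-token", "your-region")))
```

The returned list can also be passed directly to `subprocess.run`, which bypasses the shell entirely.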

Create IAM role for catalog federation

As the catalog owner of a federated catalog in AWS Glue Data Catalog, you can use Lake Formation to implement comprehensive access controls for your data teams:

Access control options

You can implement access controls at different granularity levels depending on your governance needs:

Coarse-grained: Table-level permissions
Fine-grained: Column-level, row-level, and cell-level filtering
Tag-based: Dynamic access based on data classification tags

Lake Formation requires an IAM role with permissions to access the underlying S3 locations of your external catalog.

Create an IAM role that enables the Glue connection to access AWS Secrets Manager and, optionally, VPC configurations, and that enables Lake Formation to manage credential vending for the S3 bucket or prefix.

Required permissions

Secrets Manager access: The Glue connection requires permissions to retrieve secret values from Secrets Manager for OAuth tokens stored for your Snowflake service connection.
Amazon Virtual Private Cloud (VPC) Access (optional): When using VPC endpoints to restrict connectivity to your Snowflake Open Catalog account, the Glue connection needs permissions to describe and use VPC network interfaces. This configuration ensures secure, controlled access to both your stored credentials and network resources while maintaining proper isolation through VPC endpoints.
S3 bucket and AWS Key Management Service (KMS) key permission: The Glue connection requires S3 permissions to read certificates if used in the connection setup. Additionally, Lake Formation requires read permissions on the bucket/prefix where the remote catalog table data resides. If the data is encrypted using a KMS key, additional KMS permissions are required.

Setup steps:

Run the following AWS CLI commands, replacing the placeholders with your setup information.

Create a JSON file (for example, trust-policy.json) with the following trust policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": ["glue.amazonaws.com","lakeformation.amazonaws.com"]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

Use the aws iam create-role command, referencing the trust policy file:

aws iam create-role \
    --role-name LFDataAccessRole \
    --assume-role-policy-document file://<path_file_downloaded>/trust-policy.json

Next, create a JSON file (for example, permissions-policy.json) for the permissions. JSON does not allow comments, so keep only the statements that apply to your setup: the ec2 statement is needed only when you use a VPC, the s3:GetObject statement only when you use a custom certificate to sign requests, and the kms statement only when you use a customer managed encryption key for S3.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "secretsmanager:GetSecretValue",
                "secretsmanager:DescribeSecret"
            ],
            "Resource": [
                "<secrets manager ARN>"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "ec2:CreateNetworkInterface",
                "ec2:DeleteNetworkInterface",
                "ec2:DescribeNetworkInterfaces"
            ],
            "Resource": "*",
            "Condition": {
                "ArnEquals": {
                    "ec2:Vpc": "arn:aws:ec2:region:account-id:vpc/<vpc-id>",
                    "ec2:Subnet": [
                        "arn:aws:ec2:region:account-id:subnet/<subnet-id>"
                    ]
                }
            }
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject"
            ],
            "Resource": [
                "arn:aws:s3:::<bucketname>/<certpath>"
            ]
        },
        {
            "Effect": "Allow",
            "Action": [
                "kms:Decrypt",
                "kms:Encrypt"
            ],
            "Resource": [
                "<kmsKey>"
            ]
        }
    ]
}

Then, attach it to the role:

aws iam put-role-policy \
    --role-name LFDataAccessRole \
    --policy-name myaccesspolicies \
    --policy-document file://<path_file_downloaded>/permissions-policy.json
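Because several statements in the permissions policy are optional, it can be convenient to generate the file instead of hand-editing it. The following is a minimal sketch under that assumption; the function and parameter names are hypothetical, and the statements simply mirror the permissions listed in this section.

```python
import json
from typing import Optional, Sequence


def build_permissions_policy(
    secret_arn: str,
    vpc_arn: Optional[str] = None,
    subnet_arns: Sequence[str] = (),
    cert_object_arn: Optional[str] = None,
    kms_key_arn: Optional[str] = None,
) -> dict:
    """Assemble the role's permissions policy, including only the statements that apply."""
    # Always required: the Glue connection reads the token from Secrets Manager.
    statements = [
        {
            "Effect": "Allow",
            "Action": ["secretsmanager:GetSecretValue", "secretsmanager:DescribeSecret"],
            "Resource": [secret_arn],
        }
    ]
    if vpc_arn:
        # Optional: only when the connection runs inside a VPC.
        statements.append(
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:CreateNetworkInterface",
                    "ec2:DeleteNetworkInterface",
                    "ec2:DescribeNetworkInterfaces",
                ],
                "Resource": "*",
                "Condition": {
                    "ArnEquals": {"ec2:Vpc": vpc_arn, "ec2:Subnet": list(subnet_arns)}
                },
            }
        )
    if cert_object_arn:
        # Optional: only when a custom certificate is used to sign requests.
        statements.append(
            {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": [cert_object_arn]}
        )
    if kms_key_arn:
        # Optional: only when S3 data is encrypted with a customer managed key.
        statements.append(
            {
                "Effect": "Allow",
                "Action": ["kms:Decrypt", "kms:Encrypt"],
                "Resource": [kms_key_arn],
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    # Placeholder ARN; write the result to permissions-policy.json for put-role-policy.
    policy = build_permissions_policy("<secrets manager ARN>")
    print(json.dumps(policy, indent=4))
```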

Create federated catalog in Glue Data Catalog

AWS Glue supports the SNOWFLAKEICEBERGRESTCATALOG connection type for connecting Glue Data Catalog with Snowflake Horizon Catalog and Snowflake Open Catalog. This Glue connector supports OAuth2 authentication and includes additional configuration parameters like CASING_TYPE to customize how AWS Glue Data Catalog discovers metadata in the Snowflake Horizon Catalog accounts.

Choose Catalog in the left navigation pane and select Create catalog.
Choose the data source as Snowflake Horizon Catalog.
Provide the following information:
- Name: Name of the federated catalog in the Glue Data Catalog. For this post, we use federated_lakehousedb.
- Catalog name in Snowflake: Name of an existing catalog in Snowflake Horizon Catalog. It must exactly match the name in Horizon Catalog. For this post, we use LAKEHOUSEDB.
- For Connection details, choose New connection configurations:
  - Connection name: Name for the Glue connection. For this post, we use federatedconnection1.
  - Workspace URL: Horizon Iceberg REST Catalog (IRC) URL (format: https://<account_identifier>.snowflakecomputing.com)
  - Casing type: Choose Uppercase only.
  - Authentication:
    - Authentication type: Choose Custom. Alternatively, you can select OAuth2 authentication. For Custom authentication, an access token is created, refreshed, and managed by the customer’s application or system and stored using AWS Secrets Manager.
    - OAuth Secret: Provide the ARN of the Secrets Manager secret that was created in the previous step.

If you have AWS PrivateLink or a proxy set up, you can provide network details under Settings for network configurations (optional).
For Register Glue connection with Lake Formation:
- Choose the IAM role created earlier (LFDataAccessRole) to manage data access using Lake Formation.

To test the connection, choose Run test. After the connection information is validated, it shows as successful.

You can now create the catalog by selecting Create catalog.

Alternatively, you can use the AWS CLI to create the connection, register it with Lake Formation, and create the catalog, as in the following example commands:

# Create the Glue connection to Snowflake Horizon Catalog
aws glue create-connection \
    --connection-input '{
        "Name": "federatedconnection1",
        "ConnectionType": "SNOWFLAKEICEBERGRESTCATALOG",
        "ConnectionProperties": {
            "INSTANCE_URL": "<your-snowflake-account-URL>",
            "ROLE_ARN": "<ARN_of_LFDataAccessRole>",
            "CATALOG_CASING_FILTER": "UPPERCASE_ONLY"
        },
        "AuthenticationConfiguration": {
            "AuthenticationType": "CUSTOM",
            "SecretArn": "arn:aws:secretsmanager:<your-aws-region>:<your-aws-account-id>:secret:horizon-secret"
        }
    }' \
    --region <your-aws-region>

# Register the connection with Lake Formation
aws lakeformation register-resource \
    --resource-arn <ARN_of_federatedconnection1_connection> \
    --role-arn <ARN_of_LFDataAccessRole> \
    --with-federation \
    --with-privileged-access \
    --region <your-aws-region>

# Create the federated catalog that points at the Horizon catalog
aws glue create-catalog \
    --name federated_lakehousedb \
    --catalog-input '{
        "FederatedCatalog": {
            "Identifier": "LAKEHOUSEDB",
            "ConnectionName": "federatedconnection1"
        },
        "CreateTableDefaultPermissions": [],
        "CreateDatabaseDefaultPermissions": []
    }'
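Inline JSON inside shell quotes is easy to get wrong (a stray space or a typographic quote in a value is enough to break the catalog lookup), so you may prefer to generate the two payloads and pass them with file://. The following is a minimal sketch; the helper names are hypothetical, and the keys are copied from the CLI commands above.

```python
import json


def build_connection_input(name: str, instance_url: str, role_arn: str, secret_arn: str,
                           casing_filter: str = "UPPERCASE_ONLY") -> dict:
    """Payload for `aws glue create-connection --connection-input`."""
    return {
        "Name": name.strip(),
        "ConnectionType": "SNOWFLAKEICEBERGRESTCATALOG",
        "ConnectionProperties": {
            "INSTANCE_URL": instance_url.strip(),
            "ROLE_ARN": role_arn.strip(),
            "CATALOG_CASING_FILTER": casing_filter,
        },
        "AuthenticationConfiguration": {
            "AuthenticationType": "CUSTOM",
            "SecretArn": secret_arn.strip(),
        },
    }


def build_catalog_input(identifier: str, connection_name: str) -> dict:
    """Payload for `aws glue create-catalog --catalog-input`."""
    return {
        "FederatedCatalog": {
            # Must match the catalog name in Horizon Catalog exactly.
            "Identifier": identifier.strip(),
            # Must match the Glue connection name exactly; whitespace is stripped.
            "ConnectionName": connection_name.strip(),
        },
        "CreateTableDefaultPermissions": [],
        "CreateDatabaseDefaultPermissions": [],
    }


if __name__ == "__main__":
    with open("catalog-input.json", "w", encoding="utf-8") as handle:
        json.dump(build_catalog_input("LAKEHOUSEDB", "federatedconnection1"), handle, indent=4)
```

You could then run the create-catalog command with `--catalog-input file://catalog-input.json`.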

After the catalog is created, the Horizon databases and tables are listed under the federated catalog.

You can implement fine-grained access control on the tables by applying row and column filters using Lake Formation.
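As an illustration of a row and column filter, the following sketch builds the table-data payload for the Lake Formation `create-data-cells-filter` command. Several parts are assumptions rather than values from this post: the catalog ID format (account ID, colon, federated catalog name), the column names, and the filter expression are hypothetical, and the helper name is invented for this example.

```python
import json
from typing import Sequence


def build_data_cells_filter(catalog_id: str, database: str, table: str, filter_name: str,
                            row_filter_expression: str, column_names: Sequence[str] = ()) -> dict:
    """Payload for `aws lakeformation create-data-cells-filter --table-data`."""
    table_data = {
        "TableCatalogId": catalog_id,
        "DatabaseName": database,
        "TableName": table,
        "Name": filter_name,
        "RowFilter": {"FilterExpression": row_filter_expression},
    }
    if column_names:
        # Restrict the filter to an explicit list of columns.
        table_data["ColumnNames"] = list(column_names)
    else:
        # No list given: include every column (nothing excluded).
        table_data["ColumnWildcard"] = {"ExcludedColumnNames": []}
    return table_data


if __name__ == "__main__":
    payload = build_data_cells_filter(
        catalog_id="<your-aws-account-id>:federated_lakehousedb",  # assumed ID format
        database="public",
        table="customer",
        filter_name="customer_subset",                # hypothetical filter name
        row_filter_expression="c_mktsegment = 'AUTOMOBILE'",  # hypothetical column and value
        column_names=["c_custkey", "c_name"],         # hypothetical column names
    )
    print(json.dumps(payload, indent=4))
```

After creating the filter, you would grant SELECT on it to the relevant principals in Lake Formation.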

Query the data using Athena query editor:

Open the Amazon Athena console and run the following query to access the federated Horizon table:

SELECT * FROM "public"."customer" LIMIT 10;

Clean up

To clean up your resources, complete the following steps:

Drop the Snowflake database with CASCADE.
Drop the external volume created for the Iceberg tables on S3.
Drop the resources in Glue Data Catalog and Lake Formation created for this post.
Delete the IAM roles and S3 buckets used for this post.
Delete any VPC resources and KMS keys created for this post.

Conclusion

In this post, we demonstrated how to establish a secure connection between AWS Analytics services and Snowflake Horizon Catalog, enabling you to access your data from a single connected and governed view. You learned how to:

Configure catalog federation between AWS Glue Data Catalog and Snowflake Horizon Catalog
Set up OAuth2 authentication for secure access
Grant access to Iceberg tables in Snowflake Horizon Catalog using AWS Lake Formation
Query federated tables using Amazon Athena

You can follow the same steps to establish a secure connection with open-source catalog options such as Snowflake Open Catalog, a managed service for Apache Iceberg. Remember to clean up any resources you created while following this tutorial to avoid ongoing charges.

To further explore this solution in your environment, consider the following resources:

These resources can help you to implement and optimize this integration pattern for your specific use case. As you begin this journey, remember to start small, validate your architecture with test data, and gradually scale your implementation based on your organization’s needs. Stay tuned for future workshops and resources.

About the authors

Use Apache Iceberg in your data lake with Amazon S3, AWS Glue, and Snowflake

2024-04-03 Andries Engelbrecht

Post Syndicated from Andries Engelbrecht original https://aws.amazon.com/blogs/big-data/use-apache-iceberg-in-your-data-lake-with-amazon-s3-aws-glue-and-snowflake/

This post is co-written with Andries Engelbrecht and Scott Teal from Snowflake.

Businesses are constantly evolving, and data leaders are challenged every day to meet new requirements. For many enterprises and large organizations, it is not feasible to have one processing engine or tool to deal with the various business requirements. They understand that a one-size-fits-all approach no longer works, and recognize the value in adopting scalable, flexible tools and open data formats to support interoperability in a modern data architecture to accelerate the delivery of new solutions.

Customers are using AWS and Snowflake to develop purpose-built data architectures that provide the performance required for modern analytics and artificial intelligence (AI) use cases. Implementing these solutions requires data sharing between purpose-built data stores. This is why Snowflake and AWS are delivering enhanced support for Apache Iceberg to enable and facilitate data interoperability between data services.

Apache Iceberg is an open-source table format that provides reliability, simplicity, and high performance for large datasets with transactional integrity between various processing engines. In this post, we discuss the following:

Advantages of Iceberg tables for data lakes
Two architectural patterns for sharing Iceberg tables between AWS and Snowflake:
- Manage your Iceberg tables with AWS Glue Data Catalog
- Manage your Iceberg tables with Snowflake
The process of converting existing data lake tables to Iceberg tables without copying the data

Now that you have a high-level understanding of the topics, let’s dive into each of them in detail.

Advantages of Apache Iceberg

Apache Iceberg is a distributed, community-driven, Apache 2.0-licensed, 100% open-source data table format that helps simplify data processing on large datasets stored in data lakes. Data engineers use Apache Iceberg because it’s fast, efficient, and reliable at any scale and keeps records of how datasets change over time. Apache Iceberg offers integrations with popular data processing frameworks such as Apache Spark, Apache Flink, Apache Hive, Presto, and more.

Iceberg tables maintain metadata to abstract large collections of files, providing data management features including time travel, rollback, data compaction, and full schema evolution, reducing management overhead. Originally developed at Netflix before being open sourced to the Apache Software Foundation, Apache Iceberg was a blank-slate design to solve common data lake challenges like user experience, reliability, and performance. It is now supported by a robust community of developers focused on continually improving the project and adding new features that serve real user needs and provide optionality.

Transactional data lakes built on AWS and Snowflake

Snowflake provides various integrations for Iceberg tables with multiple storage options, including Amazon S3, and multiple catalog options, including AWS Glue Data Catalog and Snowflake. AWS provides integrations for various AWS services with Iceberg tables as well, including AWS Glue Data Catalog for tracking table metadata. Combining Snowflake and AWS gives you multiple options to build out a transactional data lake for analytical and other use cases such as data sharing and collaboration. By adding a metadata layer to data lakes, you get a better user experience, simplified management, and improved performance and reliability on very large datasets.

Manage your Iceberg table with AWS Glue

You can use AWS Glue to ingest, catalog, transform, and manage the data on Amazon Simple Storage Service (Amazon S3). AWS Glue is a serverless data integration service that allows you to visually create, run, and monitor extract, transform, and load (ETL) pipelines to load data into your data lakes in Iceberg format. With AWS Glue, you can discover and connect to more than 70 diverse data sources and manage your data in a centralized data catalog. Snowflake integrates with AWS Glue Data Catalog to access the Iceberg table catalog and the files on Amazon S3 for analytical queries. This greatly improves performance and compute cost in comparison to external tables on Snowflake, because the additional metadata improves pruning in query plans.

You can use this same integration to take advantage of the data sharing and collaboration capabilities in Snowflake. This can be very powerful if you have data in Amazon S3 and need to enable Snowflake data sharing with other business units, partners, suppliers, or customers.

The following architecture diagram provides a high-level overview of this pattern.

The workflow includes the following steps:

AWS Glue extracts data from applications, databases, and streaming sources. AWS Glue then transforms it and loads it into the data lake in Amazon S3 in Iceberg table format, while inserting and updating the metadata about the Iceberg table in AWS Glue Data Catalog.
The AWS Glue crawler generates and updates Iceberg table metadata and stores it in AWS Glue Data Catalog for existing Iceberg tables on an S3 data lake.
Snowflake integrates with AWS Glue Data Catalog to retrieve the snapshot location.
In the event of a query, Snowflake uses the snapshot location from AWS Glue Data Catalog to read Iceberg table data in Amazon S3.
Snowflake can query across Iceberg and Snowflake table formats. You can share data for collaboration with one or more accounts in the same Snowflake region. You can also use data in Snowflake for visualization using Amazon QuickSight, or use it for machine learning (ML) and artificial intelligence (AI) purposes with Amazon SageMaker.

Manage your Iceberg table with Snowflake

A second pattern also provides interoperability across AWS and Snowflake, but implements data engineering pipelines for ingestion and transformation to Snowflake. In this pattern, data is loaded to Iceberg tables by Snowflake through integrations with AWS services like AWS Glue or through other sources like Snowpipe. Snowflake then writes data directly to Amazon S3 in Iceberg format for downstream access by Snowflake and various AWS services, and Snowflake manages the Iceberg catalog that tracks snapshot locations across tables for AWS services to access.

Like the previous pattern, you can use Snowflake-managed Iceberg tables with Snowflake data sharing, but you can also use S3 to share datasets in cases where one party does not have access to Snowflake.

The following architecture diagram provides an overview of this pattern with Snowflake-managed Iceberg tables.

This workflow consists of the following steps:

In addition to loading data via the COPY command, Snowpipe, and the native Snowflake connector for AWS Glue, you can integrate data via Snowflake Data Sharing.
Snowflake writes Iceberg tables to Amazon S3 and updates metadata automatically with every transaction.
Iceberg tables in Amazon S3 are queried by Snowflake for analytical and ML workloads using services like QuickSight and SageMaker.
Apache Spark services on AWS can access snapshot locations from Snowflake via a Snowflake Iceberg Catalog SDK and directly scan the Iceberg table files in Amazon S3.

Comparing solutions

These two patterns highlight options available to data personas today to maximize their data interoperability between Snowflake and AWS using Apache Iceberg. But which pattern is ideal for your use case? If you’re already using AWS Glue Data Catalog and only require Snowflake for read queries, then the first pattern can integrate Snowflake with AWS Glue and Amazon S3 to query Iceberg tables. If you’re not already using AWS Glue Data Catalog and require Snowflake to perform reads and writes, then the second pattern is likely a good solution that allows for storing and accessing data from AWS.

Considering that reads and writes will probably operate on a per-table basis rather than the entire data architecture, it is advisable to use a combination of both patterns.

Migrate existing data lakes to a transactional data lake using Apache Iceberg

You can convert existing Parquet, ORC, and Avro-based data lake tables on Amazon S3 to Iceberg format to reap the benefits of transactional integrity while improving performance and user experience. There are several Iceberg table migration options (SNAPSHOT, MIGRATE, and ADD_FILES) for migrating existing data lake tables in-place to Iceberg format, which is preferable to rewriting all of the underlying data files—a costly and time-consuming effort with large datasets. In this section, we focus on ADD_FILES, because it’s useful for custom migrations.

With the ADD_FILES option, you can use AWS Glue to generate Iceberg metadata and statistics for an existing data lake table and create new Iceberg tables in AWS Glue Data Catalog for future use without needing to rewrite the underlying data. For instructions on generating Iceberg metadata and statistics using AWS Glue, refer to Migrate an existing data lake to a transactional data lake using Apache Iceberg or Convert existing Amazon S3 data lake tables to Snowflake Unmanaged Iceberg tables using AWS Glue.

This option requires that you pause data pipelines while converting the files to Iceberg tables, which is a straightforward process in AWS Glue because the destination just needs to be changed to an Iceberg table.
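In Spark, the three migration options correspond to Iceberg stored procedures invoked with CALL statements. The following sketch only renders those statements as strings; the helper name, the catalog name glue_catalog, and the table names are hypothetical, and you would pass the result to spark.sql(...) in a Spark session configured with the Iceberg extensions.

```python
def iceberg_migration_call(procedure: str, catalog: str, **named_args: str) -> str:
    """Render a Spark SQL CALL statement for an Iceberg migration procedure."""
    allowed = {"snapshot", "migrate", "add_files"}
    if procedure not in allowed:
        raise ValueError(f"unsupported procedure: {procedure!r}")
    # Named arguments use the `name => 'value'` form; single quotes are doubled to escape them.
    rendered = ", ".join(
        "{} => '{}'".format(key, value.replace("'", "''"))
        for key, value in named_args.items()
    )
    return f"CALL {catalog}.system.{procedure}({rendered})"


if __name__ == "__main__":
    # ADD_FILES: register the files of an existing table in an existing Iceberg table.
    print(iceberg_migration_call(
        "add_files", "glue_catalog", table="db.iceberg_tbl", source_table="db.parquet_tbl"))
    # SNAPSHOT: create an Iceberg table over the source files without altering the source.
    print(iceberg_migration_call(
        "snapshot", "glue_catalog", source_table="db.parquet_tbl", table="db.parquet_tbl_snap"))
    # MIGRATE: replace the source table with an Iceberg table in place.
    print(iceberg_migration_call("migrate", "glue_catalog", table="db.parquet_tbl"))
```

SNAPSHOT is a low-risk way to test a conversion because the source table stays untouched, whereas ADD_FILES gives you control over which files end up in which Iceberg table.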

Conclusion

In this post, you saw the two architecture patterns for implementing Apache Iceberg in a data lake for better interoperability across AWS and Snowflake. We also provided guidance on migrating existing data lake tables to Iceberg format.

Sign up for AWS Dev Day on April 10 to get hands-on not only with Apache Iceberg, but also with streaming data pipelines with Amazon Data Firehose and Snowpipe Streaming, and generative AI applications with Streamlit in Snowflake and Amazon Bedrock.

About the Authors

Andries Engelbrecht is a Principal Partner Solutions Architect at Snowflake and works with strategic partners. He is actively engaged with strategic partners like AWS supporting product and service integrations as well as the development of joint solutions with partners. Andries has over 20 years of experience in the field of data and analytics.

Deenbandhu Prasad is a Senior Analytics Specialist at AWS, specializing in big data services. He is passionate about helping customers build modern data architectures on the AWS Cloud. He has helped customers of all sizes implement data management, data warehouse, and data lake solutions.

Brian Dolan joined Amazon as a Military Relations Manager in 2012 after his first career as a Naval Aviator. In 2014, Brian joined Amazon Web Services, where he helped Canadian customers from startups to enterprises explore the AWS Cloud. Most recently, Brian was a member of the Non-Relational Business Development team as a Go-To-Market Specialist for Amazon DynamoDB and Amazon Keyspaces before joining the Analytics Worldwide Specialist Organization in 2022 as a Go-To-Market Specialist for AWS Glue.

Nidhi Gupta is a Sr. Partner Solution Architect at AWS. She spends her days working with customers and partners, solving architectural challenges. She is passionate about data integration and orchestration, serverless and big data processing, and machine learning. Nidhi has extensive experience leading the architecture design and production release and deployments for data workloads.

Scott Teal is a Product Marketing Lead at Snowflake and focuses on data lakes, storage, and governance.
