All posts by Xiaozang Li

Connect Amazon S3 File Gateway using AWS PrivateLink for Amazon S3

Post Syndicated from Xiaozang Li original https://aws.amazon.com/blogs/architecture/connect-amazon-s3-file-gateway-using-aws-privatelink-for-amazon-s3/

AWS Storage Gateway is a set of services that provides on-premises access to virtually unlimited cloud storage. You can extend your on-premises storage capacity, and move on-premises backups and archives to the cloud. It provides low-latency access to cloud storage by caching frequently accessed data on premises, while storing data securely and durably in the cloud. This simplifies storage management and reduces costs for hybrid cloud storage use.

You may have privacy and security concerns with sending and receiving data across the public internet. In this case, you can use AWS PrivateLink, which provides private connectivity between Amazon Virtual Private Cloud (VPC) and other AWS services.

In this blog post, we will demonstrate how to take advantage of Amazon S3 interface endpoints to connect your on-premises Amazon S3 File Gateway directly to AWS over a private connection. We will also review the steps for implementation using the AWS Management Console.

AWS Storage Gateway on-premises

Storage Gateway offers four different types of gateways to connect on-premises applications with cloud storage.

  • Amazon S3 File Gateway Provides a file interface for applications to seamlessly store files as objects in Amazon S3. These files can be accessed using open standard file protocols.
  • Amazon FSx File Gateway Optimizes on-premises access to Windows file shares on Amazon FSx.
  • Tape Gateway Replaces on-premises physical tapes with virtual tapes in AWS without changing existing backup workflows.
  • Volume Gateway –  Presents cloud-backed iSCSI block storage volumes to your on-premises applications.

We will illustrate the use of Amazon S3 File Gateway in this blog.

VPC endpoints for Amazon S3

AWS PrivateLink provides two types of VPC endpoints that you can use to connect to Amazon S3; Interface endpoints and Gateway endpoints. An interface endpoint is an elastic network interface with a private IP address. It serves as an entry point for traffic destined to a supported AWS service or a VPC endpoint service. A gateway VPC endpoint uses prefix lists as the IP route target in a VPC route table and supports routing traffic privately to Amazon S3 or Amazon DynamoDB. Both these endpoints securely connect to Amazon S3 over the Amazon network, and your network traffic does not traverse the internet.

Solution architecture for PrivateLink connectivity between AWS Storage Gateway and Amazon S3

Previously, AWS Storage Gateway did not support PrivateLink for Amazon S3 and Amazon S3 Access Points. Customers had to build and manage an HTTP proxy infrastructure within their VPC to connect their on-premises applications privately to S3 (see Figure 1). This infrastructure acted as a proxy for all the traffic originating from on-premises gateways to Amazon S3 through Amazon S3 Gateway endpoints. This setup would result in additional configuration and operational overhead. The HTTP proxy could also become a network performance bottleneck.

Figure 1. Connect to Amazon S3 Gateway endpoint using an HTTP proxy

Figure 1. Connect to Amazon S3 Gateway endpoint using an HTTP proxy

AWS Storage Gateway recently added support for AWS PrivateLink for Amazon S3 and Amazon S3 Access Points. Customers can now connect their on-premises Amazon S3 File Gateway directly to Amazon S3 through a private connection. This uses an Amazon S3 interface endpoint and doesn’t require an HTTP proxy. Additionally, customers can use Amazon S3 Access Points instead of bucket names to map file shares. This enables more granular access controls for applications connecting to AWS Storage Gateway file shares (see Figure 2).

Figure 2. AWS Storage Gateway now supports AWS PrivateLink for Amazon S3 endpoints and Amazon S3 Access Points

Figure 2. AWS Storage Gateway now supports AWS PrivateLink for Amazon S3 endpoints and Amazon S3 Access Points

Implement AWS PrivateLink between AWS Storage Gateway and an Amazon S3 endpoint

Let’s look at how to create an Amazon S3 File Gateway file share, which is associated with a Storage Gateway. This file share stores data in an Amazon S3 bucket. It uses AWS PrivateLink for Amazon S3 to transfer data to the S3 endpoint.

  1. Create an Amazon S3 bucket in your preferred Region.
  2. Create and configure an Amazon S3 File Gateway.
  3. Create an Interface endpoint for Amazon S3. Ensure that the S3 interface endpoint is created in the same Region as the S3 bucket.
  4. Customize the File share settings (see Figure 3).
Figure 3. Create file share using VPC endpoints for Amazon S3

Figure 3. Create file share using VPC endpoints for Amazon S3

Best practices:

  • Select the AWS Region where the Amazon S3 bucket is located. This ensures that the VPC endpoint and the storage bucket are in the same Region.
  • When creating the file share with PrivateLink for S3 enabled, you can either select the S3 VPC endpoint ID from the dropdown menu, or manually input the S3 VPC endpoint DNS name.
  • Note that the dropdown list of VPC endpoint IDs only contains the VPCs created by the current AWS account administrator. If you are using a shared VPC in an AWS Organization, you can manually enter the DNS name of the VPC endpoint created in the management account.

Be aware of PrivateLink pricing when using an S3 interface endpoint. The cost for each interface endpoint is based on usage per hour, the number of Availability Zones used, and the volume of data transferred over the endpoint. Additionally, each Amazon S3 VPC interface endpoint can be shared among multiple S3 File Gateways. Each file share associated with the Storage Gateway can be configured with or without PrivateLink. For workloads that do not need the private network connectivity, you can save on interface endpoints costs by creating a file share without PrivateLink.

Verify PrivateLink communication

Once you have set up an S3 File Gateway file share using PrivateLink for S3, you can verify that traffic is flowing over your private connectivity as follows:

1. Enable VPC Flow Log for the VPC hosting the S3 Interface endpoint. This also hosts the Virtual Private Gateway (VGW), which connects to the on-premises environment.

2. From your workstation, connect to your on-premises File Gateway over SMB or NFS protocol and upload a new file (see Figure 4).

Figure 4. Upload a sample file to on-premises Storage Gateway

Figure 4. Upload a sample file to on-premises Storage Gateway

3. Navigate to the S3 bucket associated with the file share.  After a few seconds, you should see that the new file has been successfully uploaded and appears in the S3 bucket (see Figure 5).

Figure 5. Verify that the sample file is uploaded to storage bucket

Figure 5. Verify that the sample file is uploaded to storage bucket

4. On the VPC flow log, look for the generated log events. You’ll see your S3 interface endpoint elastic network interface, your file gateway IP, Amazon S3 private IP, and port number, as shown in Figure 6. This verifies that the file was transferred over the private connection. If you do not see an entry, verify if the VPC Flow Logs have been enabled on the correct VPC and elastic network interface.

Figure 6. VPC Flow Log entry to verify connectivity using Private IPs

Figure 6. VPC Flow Log entry to verify connectivity using Private IPs

Summary

In this blog post, we have demonstrated how to use Amazon S3 File Gateway to transfer files to Amazon S3 buckets over AWS PrivateLink. Use this solution to securely copy your application data and files to cloud storage. This will also provide low latency access to that data from your on-premises applications.

Thanks for reading this blog post. If you have any feedback or questions, please add them in the comments section.

Further Reading: