Optimizing your AWS Infrastructure for Sustainability, Part III: Networking

Post Syndicated from Katja Philipp original https://aws.amazon.com/blogs/architecture/optimizing-your-aws-infrastructure-for-sustainability-part-iii-networking/

In Part I: Compute and Part II: Storage of this series, we introduced strategies to optimize the compute and storage layer of your AWS architecture for sustainability.

This blog post focuses on the network layer of your AWS infrastructure and proposes concepts to optimize your network utilization.

Optimizing the networking layer of your AWS infrastructure

When you make your applications available to more customers, the packets that travel across the network will increase. Similarly, the larger the size of data, as well as the more distance a packet has to travel, the more resources are required to transmit it. With growing number of application users, optimizing network traffic can ensure that network resource consumption is not growing linearly.

The recommendations in the following sections will help you use your resources more efficiently for the network layer of your workload.

Reducing the network traveled per request

Reducing the data sent over the network and optimizing the path a packet takes will result in a more efficient data transfer. The following table provides metrics related to some AWS services that can help you find potential network optimization opportunities.

Service	Metric/Check	Source
Amazon CloudFront	Cache hit rate	Viewing CloudFront and Lambda@Edge metrics
Amazon CloudFront	Cache hit rate	AWS Trusted Advisor check reference
Amazon Simple Storage Service (Amazon S3)	Data transferred in/out of a bucket	Metrics and dimensions
Amazon Simple Storage Service (Amazon S3)	Data transferred in/out of a bucket	AWS Trusted Advisor check reference
Amazon Elastic Compute Cloud (Amazon EC2)	NetworkPacketsIn/NetworkPacketsOut	List the available CloudWatch metrics for your instances
AWS Trusted Advisor	CloudFront Content Delivery Optimization	AWS Trusted Advisor check reference

We recommend the following concepts to optimize your network utilization.

Read local, write global

The following strategies allow users to read the data from the source closest to them; thus, fewer requests travel longer distances.

If you are operating within a single AWS Region, you should choose a Region that is near the majority of your users. The further your users are away from the Region, the further data needs to travel through the global network.
If your users are spread over multiple Regions, set up multiple copies of the data to reside in each Region. Amazon Relational Database Service (Amazon RDS) and Amazon Aurora let you set up cross-Region read replicas. Amazon DynamoDB global tables allow for fast performance and alleviate network load.

Use a content delivery network

Content delivery networks (CDNs) bring your data closer to the end user. When requested, they cache static content from the original server and deliver it to the user. This shortens the distance each packet has to travel.

CloudFront optimizes network utilization and delivers traffic over CloudFront’s globally distributed edge network. Figure 1 shows a global user base that accesses an S3 bucket directly versus serving cached data from edge locations.
Trusted Advisor includes a check that recommends whether you should use a CDN for your S3 buckets. It analyzes the data transferred out of your S3 bucket and flags the buckets that could benefit from a CloudFront distribution.

Figure 1. Comparison of accessing an S3 bucket directly versus via a CloudFront distribution/edge locations

Optimize CloudFront cache hit ratio

CloudFront caches different versions of an object depending upon the request headers (for example, language, date, or user-agent). You can further optimize your CDN distribution’s cache hit ratio (the number of times an object is served from the CDN versus from the origin) with a Trusted Advisor check. It automatically checks for headers that do not affect the object and then recommends a configuration to ignore those headers and not forward the request to the origin.

Use edge-oriented services

Edge computing brings data storage and computation closer to users. By implementing this approach, you can perform data preprocessing or run machine learning algorithms on the edge.

Edge-oriented services applied on gateways or directly onto user devices reduce network traffic because data does not need to be sent back to the cloud server.
One-time, low-latency tasks are a good fit for edge use cases, like when an autonomous vehicle needs to detect objects nearby. You should generally archive data that needs to be accessed by multiple parties in the cloud, but consider factors such as device hardware and privacy regulations first.
CloudFront Functions can run compute on edge locations and Lambda@Edge can generate Regional edge caches. AWS IoT Greengrass provides edge computing for Internet of Things (IoT) devices.

Reducing the size of data transmitted

Serve compressed files

In addition to caching static assets, you can further optimize network utilization by serving compressed files to your users. You can configure CloudFront to automatically compress objects, which results in faster downloads, leading to faster rendering of webpages.

Enhance Amazon EC2 network performance

Network packets consist of data that you are sending (frame) and the processing overhead information. If you use larger packets, you can pass more data in a single packet and decrease processing overhead.

Jumbo frames use the largest permissible packet that can be passed over the connection. Keep in mind that outside a single virtual private cloud (VPC), over virtual private network (VPN) or internet gateway, traffic is limited to a lower frame regardless of using jumbo frames.

Optimize APIs

If your payloads are large, consider reducing their size to reduce network traffic by compressing your messages for your REST API payloads. Use the right endpoint for your use case. Edge-optimized API endpoints are best suited for geographically distributed clients. Regional API endpoints are best suited for when you have a few clients with higher demands, because they can help reduce connection overhead. Caching your API responses will reduce network traffic and enhance responsiveness.

Conclusion

As your organization’s cloud adoption grows, knowing how efficient your resources are is crucial when optimizing your AWS infrastructure for environmental sustainability. Using the fewest number of resources possible and using them to their fullest will have the lowest impact on the environment.

Throughout this three-part blog post series, we introduced you to the following architectural concepts and metrics for the compute, storage, and network layers of your AWS infrastructure.

Reducing idle resources and maximizing utilization
Shaping demand to existing supply
Managing your data’s lifecycle
Using different storage tiers
Optimizing the path data travels through a network
Reducing the size of data transmitted

This is not an exhaustive list. We hope it is a starting point for you to consider the environmental impact of your resources and how you can build your AWS infrastructure to be more efficient and sustainable. Figure 2 shows an overview of how you can monitor related metrics with CloudWatch and Trusted Advisor.

Figure 2. Overview of services that integrate with CloudWatch and Trusted Advisor for monitoring metrics

Ready to get started? Check out the AWS Sustainability page to find out more about our commitment to sustainability. It provides information about renewable energy usage, case studies on sustainability through the cloud, and more.

Related information

AWS Architecture Monthly magazine, Sustainability issue, August 2021

Noise