Amazon Web Services (AWS) customers who establish shared infrastructure services in a multi-account environment through AWS Organizations and AWS Resource Access Manager (RAM) may find that the default permissions assigned to the management account are too broad. This may allow organizational accounts to share virtual private clouds (VPCs) with other accounts that shouldn’t have access. Many AWS customers, such as those in regulated industries or who handle sensitive data, may need tighter control of which AWS accounts can share VPCs and which accounts can access shared VPCs.
This blog post describes a mechanism for using service control policies (SCPs) to provide that granular control when you create VPC subnet resource shares within AWS Organizations. These organization policies create a preventative guardrail for controlling which accounts in AWS Organizations can share VPC resources, and with whom. The approach outlined here helps to ensure that your AWS accounts comply with your organization’s access control guidelines for VPC sharing.
A VPC sharing scenario in a multi-account environment
When you set up a multi-account environment in AWS, you can create a good foundation to support your cloud projects by incorporating AWS best practices. AWS Control Tower can automate the implementation of best practices and help your organization achieve its centralized governance and business agility goals. One AWS best practice is to create a shared service network account to consolidate networking components, such as subnets, that can be used by the rest of the organization without duplication of resources and costs. AWS RAM provides the ability to share VPC subnets across accounts. This helps to leverage the implicit routing within a VPC for applications that require a high degree of interconnectivity. VPC subnet sharing across accounts allows individual teams to co-locate their microservice application stacks in a single VPC, and provides several advantages:
Easier administration and management of VPCs in a central network account.
Separation of duties—in other words, network admins retain control over management of VPCs, network access control lists (network ACLs), and security groups, while application teams have permissions to deploy resources and workloads in those VPCs.
High density of Classless Inter-Domain Routing (CIDR) block usage for VPC subnets and avoidance of the problem of CIDR overlap that is encountered with multiple VPCs.
Reuse of network address translation (NAT) gateways, VPC interface endpoints, and avoidance of inter-VPC connectivity costs.
In order to allow for VPC subnets to be shared, you must turn on resource sharing from the management account for your AWS Organizations structure. See Shared VPC prerequisites for more information. This allows sharing of VPC subnets across any accounts within AWS Organizations. However, RAM resource sharing does not provide granular control over VPC shared access.
Let’s consider a customer organization that has set up a multi-account environment with AWS Organizations. The organization consists of segmented accounts and organization units (OUs). The following diagram shows such a multi-OU multi-account environment for a customer who has several teams using the AWS environment for their applications and initiatives.
Figure 1: VPC sharing in a customer’s multi-account AWS environment
The AWS environment is structured as follows:
The Infrastructure OU consists of AWS accounts that contain shared resources for the organization. This OU contains a central network account that contains all the networking resources to be shared with other AWS Organization accounts. Network administrators create a VPC in the networking account with a public and private subnet. This VPC is created for the purpose of sharing the subnets with other teams that need access for their workloads.
The Applications OU consists of AWS accounts that are used by several application teams. These are external and internal application stacks that require a VPC-based infrastructure.
The Data Science OU consists of AWS accounts that are used by teams working on data analytics applications and business intelligence (BI) tools. These applications use serverless data analytics tools for Extract-Transform-Load (ETL) pipelines and big data processing workloads. They have third-party BI tools that need to be hosted in AWS and used by the BI team of the organization for reporting.
Cloud administrators turn on resource sharing in AWS RAM from the management account for their organization. The network administrators operating within the network account create a resource share for the two subnets by using AWS RAM, and share them with the Applications OU so that the application teams can use the shared subnets.
However, this approach opens the door to sharing AWS VPC subnets from any AWS account to any other AWS account as long as the admin users of individual accounts have access to AWS RAM. An example of such unintended or unwanted sharing is when the Application OU account could share a resource with the Data Science OU account, bypassing the centralized network account to satisfy one-off project requests or Proof of Concepts (POCs) that violate the centralized VPC sharing policy.
As a security best practice, cloud administrators should follow the principle of least privilege and use granular controls for VPC sharing. The cloud administrator in this example wants to limit AWS Identity and Access Management (IAM) users with policies that restrict users’ access to AWS RAM to create resource shares. However, this setup can be cumbersome to manage when there are several OUs and numerous AWS accounts. For a more efficient way to have granular control over VPC sharing, you can enable security guardrails by using service control policies (SCPs), as described in the next section. We will walk you through example SCP policies that restrict VPC sharing within AWS Organizations.
Use service control policies to control VPC sharing
In the scenario we described earlier, cloud admins want to allow VPC sharing with the following constraints:
Allow only VPC subnet shares created from a network account
Allow VPC subnet shares to be shared only with certain OUs or AWS accounts
You can achieve these constraints by using the following service control policies.
Both of these service control policies are attached at the root of AWS Organizations, so they are applied to all underlying OUs and accounts. When a network administrator who has logged into the network account tries to create a VPC subnet resource share and associate it with the Application OU, the resource share is successfully created and available to both the Internalapp and Externalapp AWS accounts. However, when the network admin tries to associate the resource share with the DataAnalytics account (which lies outside of the permitted Application OU), the RAMControl2 SCP prevents that action, based on the first condition in the policy statement. The second condition on the RAMControl2 SCP prevents action for specific AWS accounts. In that case, you will see the following error.
Figure 2 Resource share creation is blocked by the SCP
Likewise, when an AWS account administrator of the Externalapp account creates a VPC in that account and tries to share it with the Internalapp account, an error is displayed and the SCP prevents that action. The RAMControl1 SCP prevents the action because it allows only the network account to create and associate resource shares.
When you use service control policies within a multi-account structure, it’s important to keep in mind the following considerations:
Customers apply SCPs for several governance and guardrail requirements. The SCPs mentioned here will be evaluated in conjunction with all other SCPs applied at that level in the hierarchy or inherited from above in the hierarchy. See How to use SCPs in AWS Organizations for more information.
AWS strongly recommends that you don’t attach SCPs to the root of your organization without thoroughly testing to ensure that you don’t inadvertently lock users out of key services thereby impacting AWS production environments.
While the example here specifies applying SCPs at the root level, you can take a similar approach if you want to control VPC sharing within a specific OU.
SCPs can be applied to shared resources other than VPC subnets for similar control. To find the complete list of resources that can be specified by using ram: RequestedResourceType, see How AWS RAM works with IAM.
VPC subnets can be shared with AWS accounts and OUs only within an organization. For more information, see the Limitations section in Working with shared VPCs.
This blog post provides a starting point for learning how to use SCPs to create a granular governance control for VPC sharing. See the IAM conditions keys for AWS RAM and Example SCPs for AWS RAM for more information that can help you implement this preventative guardrail in a way that’s suitable for your AWS environment.
Adding granular governance controls using SCP limits overly permissive sharing and prevents unauthorized resource sharing. Granular control of VPC sharing also helps you follow the AWS security best practice, the principle of least privilege, for access to VPCs. You can take advantage of organization-level SCPs for granular control of resources, which doesn’t require that you turn on resource sharing in AWS Organizations for services such as AWS Transit Gateway or Amazon Route 53 resolver rules.
If you have feedback about this post, submit comments in the Comments section below.
Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.
This guest post presents patterns for accessing an Amazon Managed Streaming for Apache Kafka cluster across your AWS account or Amazon Virtual Private Cloud (Amazon VPC) boundaries using AWS PrivateLink. In addition, the post discusses the pattern that the Transaction Banking team at Goldman Sachs (TxB) chose for their cross-account access, the reasons behind their decision, and how TxB satisfies its security requirements with Amazon MSK. Using Goldman Sachs’s implementation as a use case, this post aims to provide you with general guidance that you can use when implementing an Amazon MSK environment.
Amazon MSK is a fully managed service that makes it easy for you to build and run applications that use Apache Kafka to process streaming data. When you create an MSK cluster, the cluster resources are available to participants within the same Amazon VPC. This allows you to launch the cluster within specific subnets of the VPC, associate it with security groups, and attach IP addresses from your VPC’s address space through elastic network interfaces (ENIs). Network traffic between clients and the cluster stays within the AWS network, with internet access to the cluster not possible by default.
You may need to allow clients access to an MSK cluster in a different VPC within the same or a different AWS account. You have options such as VPC peering or a transit gateway that allow for resources in either VPC to communicate with each other as if they’re within the same network. For more information about access options, see Accessing an Amazon MSK Cluster.
Although these options are valid, this post focuses on a different approach, which uses AWS PrivateLink. Therefore, before we dive deep into the actual patterns, let’s briefly discuss when AWS PrivateLink is a more appropriate strategy for cross-account and cross-VPC access.
VPC peering, illustrated below, is a bidirectional networking connection between two VPCs that enables you to route traffic between them using private IPv4 addresses or IPv6 addresses.
VPC peering is more suited for environments that have a high degree of trust between the parties that are peering their VPCs. This is because, after a VPC peering connection is established, the two VPCs can have broad access to each other, with resources in either VPC capable of initiating a connection. You’re responsible for implementing fine-grained network access controls with security groups to make sure that only specific resources intended to be reachable are accessible between the peered VPCs.
You can only establish VPC peering connections across VPCs that have non-overlapping CIDRs. This can pose a challenge when you need to peer VPCs with overlapping CIDRs, such as when peering across accounts from different organizations.
Additionally, if you’re running at scale, you can have hundreds of Amazon VPCs, and VPC peering has a limit of 125 peering connections to a single Amazon VPC. You can use a network hub like transit gateway, which, although highly scalable in enabling you to connect thousands of Amazon VPCs, requires similar bidirectional trust and non-overlapping CIDRs as VPC peering.
In contrast, AWS PrivateLink provides fine-grained network access control to specific resources in a VPC instead of all resources by default, and is therefore more suited for environments that want to follow a lower trust model approach, thus reducing their risk surface. The following diagram shows a service provider VPC that has a service running on Amazon Elastic Compute Cloud (Amazon EC2) instances, fronted by a Network Load Balancer (NLB). The service provider creates a configuration called a VPC endpoint service in the service provider VPC, pointing to the NLB. You can share this endpoint service with another Amazon VPC (service consumer VPC), which can use an interface VPC endpoint powered by AWS PrivateLink to connect to the service. The service consumers use this interface endpoint to reach the end application or service directly.
AWS PrivateLink makes sure that the connections initiated to a specific set of network resources are unidirectional—the connection can only originate from the service consumer VPC and flow into the service provider VPC and not the other way around. Outside of the network resources backed by the interface endpoint, no other resources in the service provider VPC get exposed. AWS PrivateLink allows for VPC CIDR ranges to overlap, and it can relatively scale better because thousands of Amazon VPCs can consume each service.
VPC peering and AWS PrivateLink are therefore two connectivity options suited for different trust models and use cases.
Transaction Banking’s micro-account strategy
An AWS account is a strong isolation boundary that provides both access control and reduced blast radius for issues that may occur due to deployment and configuration errors. This strong isolation is possible because you need to deliberately and proactively configure flows that cross an account boundary. TxB designed a strategy that moves each of their systems into its own AWS account, each of which is called a TxBmicro-account. This strategy allows TxB to minimize the chances of a misconfiguration exposing multiple systems. For more information about TxB micro-accounts, see the video AWS re:Invent 2018: Policy Verification and Enforcement at Scale with AWS on YouTube.
To further complement the strong gains realized due to a TxB micro-account segmentation, TxB chose AWS PrivateLink for cross-account and cross-VPC access of their systems. AWS PrivateLink allows TxB service providers to expose their services as an endpoint service and use whitelisting to explicitly configure which other AWS accounts can create interface endpoints to these services. This also allows for fine-grained control of the access patterns for each service. The endpoint service definition only allows access to resources attached to the NLBs and thereby makes it easy to understand the scope of access overall. The one-way initiation of connection from a service consumer to a service provider makes sure that all connectivity is controlled on a point-to-point basis. Furthermore, AWS PrivateLink allows the CIDR blocks of VPCs to overlap between the TxB micro-accounts. Thus the use of AWS PrivateLink sets TxB up for future growth as a part of their default setup, because thousands of TxB micro-account VPCs can consume each service if needed.
MSK broker access patterns using AWS PrivateLink
As a part of their micro-account strategy, TxB runs an MSK cluster in its own dedicated AWS account, and clients that interact with this cluster are in their respective micro-accounts. Considering this setup and the preference to use AWS PrivateLink for cross-account connectivity, TxB evaluated the following two patterns for broker access across accounts.
Pattern 1: Front each MSK broker with a unique dedicated interface endpoint
In this pattern, each MSK broker is fronted with a unique dedicated NLB in the TxB MSK account hosting the MSK cluster. The TxB MSK account contains an endpoint service for every NLB and is shared with the client account. The client account contains interface endpoints corresponding to the endpoint services. Finally, DNS entries identical to the broker DNS names point to the respective interface endpoint. The following diagram illustrates this pattern in the US East (Ohio) Region.
After setup, clients from their own accounts talk to the brokers using their provisioned default DNS names as follows:
The client resolves the broker DNS name to the interface endpoint IP address inside the client VPC.
The client initiates a TCP connection to the interface endpoint IP over port 9094.
With AWS PrivateLink technology, this TCP connection is routed to the dedicated NLB setup for the respective broker listening on the same port within the TxB MSK account.
The NLB routes the connection to the single broker IP registered behind it on TCP port 9094.
The setup steps in this section are shown for the US East (Ohio) Region, please modify if using another region. In the TxB MSK account, complete the following:
Create a target group with target type as IP, protocol TCP, port 9094, and in the same VPC as the MSK cluster.
Register the MSK broker as a target by its IP address.
Create an NLB with a listener of TCP port 9094 and forwarding to the target group created in the previous step.
Enable the NLB for the same AZ and subnet as the MSK broker it fronts.
Create a Route 53 private hosted zone, with the domain name kafka.us-east-2.amazonaws.com, and associate it with the same VPC as the clients are in.
Create A-Alias records identical to the broker DNS names to avoid any TLS handshake failures and point it to the interface endpoints of the respective brokers.
Pattern 2: Front all MSK brokers with a single shared interface endpoint
In this second pattern, all brokers in the cluster are fronted with a single unique NLB that has cross-zone load balancing enabled. You make this possible by modifying each MSK broker’s advertised.listeners config to advertise a unique port. You create a unique NLB listener-target group pair for each broker and a single shared listener-target group pair for all brokers. You create an endpoint service configuration for this single NLB and share it with the client account. In the client account, you create an interface endpoint corresponding to the endpoint service. Finally, you create DNS entries identical to the broker DNS names that point to the single interface. The following diagram illustrates this pattern in the US East (Ohio) Region.
After setup, clients from their own accounts talk to the brokers using their provisioned default DNS names as follows:
The client resolves the broker DNS name to the interface endpoint IP address inside the client VPC.
The client initiates a TCP connection to the interface endpoint over port 9094.
The NLB listener within the TxB MSK account on port 9094 receives the connection.
The NLB listener’s corresponding target group load balances the request to one of the brokers registered to it (Broker 1). In response, Broker 1 sends back the advertised DNS name and port (9001) to the client.
The client resolves the broker endpoint address again to the interface endpoint IP and initiates a connection to the same interface endpoint over TCP port 9001.
This connection is routed to the NLB listener for TCP port 9001.
This NLB listener’s corresponding target group is configured to receive the traffic on TCP port 9094, and forwards the request on the same port to the only registered target, Broker 1.
The setup steps in this section are shown for the US East (Ohio) Region, please modify if using another region. In the TxB MSK account, complete the following:
Modify the port that the MSK broker is advertising by running the following command against each running broker. The following example command shows changing the advertised port on a specific broker b-1 to 9001. For each broker you run the below command against, you must change the values of bootstrap-server, entity-name, CLIENT_SECURE, REPLICATION and REPLICATION_SECURE. Please note that while modifying the REPLICATION and REPLICATION_SECURE values, -internal has to be appended to the broker name and the ports 9093 and 9095 shown below should not be changed.
Create a Route 53 private hosted zone, with the domain name kafka.us-east-2.amazonaws.com and associate it with the same VPC as the client is in.
Under this hosted zone, create A-Alias records identical to the broker DNS names to avoid any TLS handshake failures and point it to the interface endpoint.
This post shows both of these patterns to be using TLS on TCP port 9094 to talk to the MSK brokers. If your security posture allows the use of plaintext communication between the clients and brokers, these patterns apply in that scenario as well, using TCP port 9092.
With both of these patterns, if Amazon MSK detects a broker failure, it mitigates the failure by replacing the unhealthy broker with a new one. In addition, the new MSK broker retains the same IP address and has the same Kafka properties, such as any modified advertised.listener configuration.
Amazon MSK allows clients to communicate with the service on TCP ports 9092, 9094, and 2181. As a byproduct of modifying the advertised.listener in Pattern 2, clients are automatically asked to speak with the brokers on the advertised port. If there is a need for clients in the same account as Amazon MSK to access the brokers, you should create a new Route53 hosted zone in the Amazon MSK account with identical broker DNS names pointing to the NLB DNS name. The Route53 record sets override the MSK broker DNS and allow for all traffic to the brokers to go via the NLB.
Transaction Banking’s MSK broker access pattern
For broker access across TxB micro-accounts, TxB chose Pattern 1, where one interface endpoint per broker is exposed to the client account. TxB streamlined this overall process by automating the creation of the endpoint service within the TxB MSK account and the interface endpoints within the client accounts without any manual intervention.
At the time of cluster creation, the bootstrap broker configuration is retrieved by calling the Amazon MSK APIs and stored in AWS Systems Manager Parameter Store in the client account so that they can be retrieved on application startup. This enables clients to be agnostic of the Kafka broker’s DNS names being launched in a completely different account.
A key driver for TxB choosing Pattern 1 is that it avoids having to modify a broker property like the advertised port. Pattern 2 creates the need for TxB to track which broker is advertising which port and make sure new brokers aren’t reusing the same port. This adds the overhead of having to modify and track the advertised port of new brokers being launched live and having to create a corresponding listener-target group pair for these brokers. TxB avoided this additional overhead by choosing Pattern 1.
On the other hand, Pattern 1 requires the creation of additional dedicated NLBs and interface endpoint connections when more brokers are added to the cluster. TxB limits this management overhead through automation, which requires additional engineering effort.
Also, using Pattern 1 costs more compared to Pattern 2, because each broker in the cluster has a dedicated NLB and an interface endpoint. For a single broker, it costs $37.80 per month to keep the end-to-end connectivity infrastructure up. The breakdown of the monthly connectivity costs is as follows:
NLB running cost – 1 NLB x $0.0225 x 720 hours/month = $16.20/month
1 VPC endpoint spread across three AZs – 1 VPCE x 3 ENIs x $0.01 x 720 hours/month = $21.60/month
You want to minimize the management overhead associated with modifying broker properties, such as advertised port
You have automation that takes care of adding and removing infrastructure when new brokers are created or destroyed
Simplified and uniform deployments are primary drivers, with cost as a secondary concern
Transaction Banking’s security requirements for Amazon MSK
The TxB micro-account provides a strong application isolation boundary, and accessing MSK brokers using AWS PrivateLink using Pattern 1 allows for tightly controlled connection flows between these TxB micro-accounts. TxB further builds on this foundation through additional infrastructure and data protection controls available in Amazon MSK. For more information, see Security in Amazon Managed Streaming for Apache Kafka.
The following are the core security tenets that TxB’s internal security team require for using Amazon MSK:
Encryption at rest using Customer Master Key (CMK) – TxB uses the Amazon MSK managed offering of encryption at rest. Amazon MSK integrates with AWS Key Management Service (AWS KMS) to offer transparent server-side encryption to always encrypt your data at rest. When you create an MSK cluster, you can specify the AWS KMS CMK that AWS KMS uses to generate data keys that encrypt your data at rest. For more information, see Using CMKs and data keys.
Encryption in transit – Amazon MSK uses TLS 1.2 for encryption in transit. TxB makes client-broker encryption and encryption between the MSK brokers mandatory.
Client authentication with TLS – Amazon MSK uses AWS Certificate Manager Private Certificate Authority (ACM PCA) for client authentication. The ACM PCA can either be a root Certificate Authority (CA) or a subordinate CA. If it’s a root CA, you need to install a self-signed certificate. If it’s a subordinate CA, you can choose its parent to be an ACM PCA root, a subordinate CA, or an external CA. This external CA can be your own CA that issues the certificate and becomes part of the certificate chain when installed as the ACM PCA certificate. TxB takes advantage of this capability and uses certificates signed by ACM PCA that are distributed to the client accounts.
Authorization using Kafka Access Control Lists (ACLs) – Amazon MSK allows you to use the Distinguished Name of a client’s TLS certificates as the principal of the Kafka ACL to authorize client requests. To enable Kafka ACLs, you must first have client authentication using TLS enabled. TxB uses the Kafka Admin API to create Kafka ACLs for each topic using the certificate names of the certificates deployed on the consumer and producer client instances. For more information, see Apache Kafka ACLs.
This post illustrated how the Transaction Banking team at Goldman Sachs approaches an application isolation boundary through the TxB micro-account strategy and how AWS PrivateLink complements this strategy. Additionally, this post discussed how the TxB team builds connectivity to their MSK clusters across TxB micro-accounts and how Amazon MSK takes the undifferentiated heavy lifting away from TxB by allowing them to achieve their core security requirements. You can leverage this post as a reference to build a similar approach when implementing an Amazon MSK environment.
About the Authors
Robert L. Cossin is a Vice President at Goldman Sachs in New York. Rob joined Goldman Sachs in 2004 and has worked on many projects within the firm’s cash and securities flows. Most recently, Rob is a technical architect on the Transaction Banking team, focusing on cloud enablement and security.
Harsha W. Sharma is a Solutions Architect with AWS in New York. Harsha joined AWS in 2016 and works with Global Financial Services customers to design and develop architectures on AWS, and support their journey on the cloud.
IP networking is often seen as a means to an end, an abstract aspect of your business. You don’t say, “I really want a fast network…just to have a fast network.” Quite the contrary. As a business, you set out to accomplish your mission and goals, and then find you need applications to get there. In turn, these applications must speak to each other. Hence, the requirement for network connectivity is born. But networking is often overlooked; it’s often seen as a byproduct of the main objective of the business. But don’t be fooled—the network is a necessary component of any application.
If we did a comparison of an IP network to a house, the network could be analogous to the plumbing of the house, a foundational component. It must be planned somewhere between the architectural blueprints for the structure of the house and the carpenters closing and finishing the walls. It would be a major reconstruction to open the walls to fix the plumbing. This may be an overly simplified analogy, but it helps illustrate the headache involved for application teams if they forget to spend the time properly designing their network architecture.
Amazon Virtual Private Cloud (Amazon VPC) is a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network. You have complete control to customize and design software-defined networks based on your requirements.
Customers can launch VPCs through the AWS Management Console or use APIs to automate provisioning and operations. Within an AWS account, there is a default VPC where you can launch Amazon Elastic Cloud Compute (EC2) instances and other AWS resources, without any network configuration. While the default VPC is a great way to start, most customers need multiple isolated networks, which translates into having multiple VPCs (for example, creating a different VPC for prod and non-prod). As an application scales or must be integrated, different connectivity needs arise.
Customers are adopting a multi-account strategy and using the “AWS account” as a logical boundary for segmentation. The AWS account can segment different environments (dev, test, and prod), different applications or application teams, increase security, and help simplify pricing. This influences the networking architecture because for every AWS account, there is one or more VPCs, and oftentimes, these VPCs must connect to each other.
Network architecture should be carefully planned before provisioning applications to avoid running into common pit falls, including:
#1: Overlapping IPs are never fun
When it comes to designing network architectures, there is one universal truth: don’t use overlapping IP addresses. It doesn’t end well. When you want these applications to talk to each other, it won’t work without a Network Address Translation (NAT). You either have NAT in one direction or you may end up with NAT in two directions. This doesn’t scale; it adds an unnecessary operational burden to network operators that slows them down. In addition, if you find yourself in this situation, it’s a tedious process to get out of. The best option is to design a new network environment and slowly migrate resources to it.
#2: DNS must be planned for hybrid architectures
Customers migrating to the cloud can have Domain Name System (DNS) architectures that require a hybrid approach that considers the on-premises DNS domains and the requirements for cloud resources to share domain names. Often, on-premises resources need to resolve cloud resources and the other way around. AWS has simplified this with the introduction of Amazon Route 53 resolver hybrid endpoints that can forward to, or receive DNS queries from, on-premises servers. This removes the burden of customers having to manage EC2 instances for the purposes of DNS forwarding.
#3: Scalable architectures require dedicated network teams for architecture and operations
As networks grow, they become more important to the business. One of the age-old questions remains true: how much money does your business lose for every minute, every hour, and every day that the applications are offline? This helps you clearly see how dependent you are on the uptime and availability of your network. The optimal network architecture puts the control and operations of network management into the hands of the network team and enables application developers to focus on building great applications.
#4: AWS is constantly evolving and innovating, and so are network architectures
New services can simplify architectures, improve performance, and save costs. A service worth mentioning is AWS Transit Gateway, which lets you scale connectivity across thousands of Amazon VPCs, AWS accounts, and on-premises networks. Transit Gateway introduces a hub-and-spoke architecture with it being the hub and your VPCs being the spokes. It is highly scalable and performant, and it simplifies network operations. In addition, it enables advanced network architectures such as centralized firewalls for outbound (or egress) filtering, centralized NAT, or multicast.
For other designs, teams and applications may require tighter integration. That would be a good use case for VPC sharing. VPC sharing allows multiple AWS accounts to provision resources in the same centrally managed VPC. VPC sharing can allow network administrators to centralize network resources and have complete control. With a multi-account strategy, more accounts can be created and VPCs can be shared from the network team across to the application and development teams. This gives the network operations team complete control over the architecture and operations of the networking, and ensures that application teams don’t need to worry about properly configuring the network.
#5: Every network design is different
The AWS Well-Architected Framework provides a methodology to help guide customers to build resilient and fault-tolerant applications. But often, the pillars of the Well-Architected Framework are at odds with each other. For example, to build in resiliency may compete with cost optimization. You must define your own requirements. If you’re a large customer, you may prefer to build with more complexity because it keeps costs lower at high scale. In contrast, others may prefer to architect for simplicity and performance with a few added costs.
In closing, I recommend you read the whitepaper, Building a Scalable and Secure Multi-VPC AWS Network Infrastructure. We wrote this paper to help you create scalable, performant, highly available, and secure cloud networks that take your business and applications to the next level. We are in the midst of a technology revolution, and we are your partners in the journey.
Amazon Web Services (AWS) serves more than a million private and public sector organizations all over the world from its extensive and expanding global infrastructure.
Like other countries, organizations all around New Zealand are using AWS to change the way they operate. For example, Xero, a Wellington-based online accountancy software vendor, now serves customers in more than 100 countries, while the Department of Conservation provides its end users with virtual desktops running in Amazon Workspaces.
New Zealand doesn’t currently have a dedicated AWS Region. Geographically, the closest is Asia Pacific (Sydney), which is 2,000 kilometers (km) away, across a deep sea. While customers rely on AWS for business-critical workloads, they are well-served by New Zealand’s international connectivity.
To connect to Amazon’s network, our New Zealand customers have a range of options:
Public internet endpoints
Managed or software Virtual Private Networks (VPN)
All rely on the extensive internet infrastructure connecting New Zealand to the world.
The vast majority of internet traffic is carried over physical cables, while the percentage of traffic moving over satellite or wireless links is small by comparison.
Historically, cables were funded and managed by consortia of telecommunication providers. More recently, large infrastructure and service providers like AWS have contributed to or are building their own cable networks.
There are currently about 400 submarine cables in service globally. Modern submarine cables are fiber-optic, run for thousands of kilometers, and are protected by steel strands, plastic sheathing, copper, and a chemical water barrier. Over that distance, the signal can weaken—or attenuate—so signal repeaters are installed approximately every 50km to mitigate attenuation. Repeaters are powered by a charge running over the copper sheathing in the cable.
An example of submarine cable composition.. Source: WikiMedia Commons
For most of their run, these cables are about as thick as a standard garden hose. They are thicker, however, closer to shore and in areas where there’s a greater risk of damage by fishing nets, boat anchors, etc.
Cables can—and do—break, but redundancy is built into the network. According to Telegeography, there are 100 submarine cable faults globally every year. However, most faults don’t impact users meaningfully.
Southern Cross B: Takapuna (Auckland, New Zealand) -> Spencer Beach (Hawaii, USA) – 1.2 Tbps
A map of major submarine cables connecting to New Zealand. Source submarinecablemap.com
The four cables combined currently deliver 66 Tbps of available capacity. The Southern Cross NEXT cable is due to come online in 2020, which will add another 72 Tbps. These are, of course, potential capacities; it’s likely the “lit” capacity—the proportion of the cables’ overall capacity that is actually in use—is much lower.
Connecting to AWS from New Zealand
While understanding the physical infrastructure is important in practice, these details are not shared with customers. Connectivity options are evaluated on the basis of partner and AWS offerings, which include connectivity.
Customers connect to AWS in three main ways: over public endpoints, via site-to-site VPNs, and via Direct Connect (DX), all typically provided by partners.
Public Internet Endpoints
Customers can connect to public endpoints for AWS services over the public internet. Some services, like Amazon CloudFront, Amazon API Gateway, and Amazon WorkSpaces are generally used in this way.
Many organizations use a VPN to connect to AWS. It’s the simplest and lowest cost entry point to expose resources deployed in private ranges in an Amazon VPC. Amazon VPC allows customers to provision a logically isolated network segment, with fine-grained control of IP ranges, filtering rules, and routing.
AWS offers a managed site-to-site VPN service, which creates secure, redundant Internet Protocol Security (IPSec) VPNs, and also handles maintenance and high-availability while integrating with Amazon CloudWatch for robust monitoring.
If using an AWS managed VPN, the AWS endpoints have publicly routable IPs. They can be connected to over the public internet or via a Public Virtual Interface over DX (outlined below).
Customers can also deploy VPN appliances onto Amazon Elastic Compute Cloud (EC2) instances running in their VPC. These may be self-managed or provided by Amazon Marketplace sellers.
AWS also offers AWS Client VPN, for direct user access to AWS resources.
AWS Direct Connect
While connectivity over the internet is secure and flexible, it has one major disadvantage: it’s unpredictable. By design, traffic traversing the internet can take any path to reach its destination. Most of the time it works but occasionally routing conditions may reduce capacity or increase latency.
DX connections are either 1 or 10 Gigabits per second (Gbps). This capacity is dedicated to the customer; it isn’t shared, as other network users are never routed over the connection. This means customers can rely on consistent latency and bandwidth. The DX per-Gigabit transfer cost is lower than other egress mechanisms. For customers transferring large volumes of data, DX may be more cost effective than other means of connectivity.
Customers may publish their own 802.11q Virtual Local Area Network (VLAN) tags across the DX, and advertise routes via Border Gateway Protocol (BGP). A dedicated connection supports up to 50 private or public virtual interfaces. New Zealand does not have a physical point-of-presence for DX—users must procure connectivity to our Sydney Region. Many AWS Partner Network (APN) members in New Zealand offer this connectivity.
For customers who don’t want or need to manage VLANs to AWS—or prefer 1 Gbps or smaller links —APN partners offer hosted connections or hosted virtual interfaces. For more detail, please review our AWS Direct Connect Partners page.
There are physical limits to latency dictated by the speed of light, and the medium through which optical signals travel. Southern Cross publishes latency statistics, and it sees one-way latency of approximately 11 milliseconds (ms) over the 2,276km Alexandria to Whenuapai link. Double that for a round-trip to 22 ms.
In practice, we see customers achieving round-trip times from user workstations to Sydney in approximately 30-50 ms, assuming fair-weather internet conditions or DX links. Latency in Auckland (the largest city) tends to be on the lower end of that spectrum, while the rest of the country tends towards the higher end.
Bandwidth constraints are more often dictated by client hardware, but AWS and our partners offer up to 10 Gbps links, or smaller as required. For customers that require more than 10 Gbps over a single link, AWS supports Link Aggregation Groups (LAG).
As outlined above, there are a range of ways for customers to adopt AWS via secure, reliable, and performant networks. To discuss your use case, please contact an AWS Solutions Architect.
Since its inception, the Amazon Virtual Private Cloud (VPC) has acted as the embodiment of security and privacy for customers who are looking to run their applications in a controlled, private, secure, and isolated environment.
This logically isolated space has evolved, and in its evolution has increased the avenues that customers can take to create and manage multi-tenant environments with multiple integration points for access to resources on-premises.
This blog is a two-part series that begins with a look at the Amazon VPC as a single unit of networking in the AWS Cloud but eventually takes you to a world in which simplified architectures for establishing a global network of VPCs are possible.
From One VPC: Single Unit of Networking
To be successful with the AWS Virtual Private Cloud you first have to define success for today and what success might look like as your organization’s adoption of the AWS cloud increases and matures. In essence, your VPCs should be designed to satisfy the needs of your applications today and must be scalable to accommodate future needs.
Classless Inter-Domain Routing (CIDR) notations are used to denote the size of your VPC. AWS allows you specify a CIDR block between /16 and /28. The largest, /16, provides you with 65,536 IP addresses and the smallest possible allowed CIDR block, /28, provides you with 16 IP addresses. Note, the first four IP addresses and the last IP address in each subnet CIDR block are not available for you to use, and cannot be assigned to an instance.
AWS VPC supports both IPv4 and IPv6. It is required that you specify an IPv4 CIDR range when creating a VPC. Specifying an IPv6 range is optional.
Customers can specify ANY IPv4 address space for their VPC. This includes but is not limited to RFC 1918 addresses.
After creating your VPC, you divide it into subnets. In an AWS VPC, subnets are not isolation boundaries around your application. Rather, they are containers for routing policies.
Isolation is achieved by attaching an AWS Security Group (SG) to the EC2 instances that host your application. SGs are stateful firewalls, meaning that connections are tracked to ensure return traffic is allowed. They control inbound and outbound access to the elastic network interfaces that are attached to an EC2 instance. These should be tightly configured, only allowing access as needed.
It is our best practice that subnets should be created in categories. There two main categories; public subnets and private subnets. At minimum they should be designed as outlined in the below diagrams for IPv4 and IPv6 subnet design.
Recommended IPv4 subnet design pattern
Recommended IPv6 subnet design pattern
Subnet types are denoted by the ability and inability for applications and users on the internet to directly initiate access to infrastructure within a subnet.
Public subnets are attached to a route table that has a default route to the Internet via an Internet gateway.
Resources in a public subnet can have a public IP or Elastic IP (EIP) that has a NAT to the Elastic Network Interface (ENI) of the virtual machines or containers that hosts your application(s). This is a one-to-one NAT that is performed by the Internet gateway.
Illustration of public subnet access path to the Internet through the Internet Gateway (IGW)
A private subnet contains infrastructure that isn’t directly accessible from the Internet. Unlike the public subnet, this infrastructure only has private IPs.
Infrastructure in a private subnet gain access to resources or users on the Internet through a NAT infrastructure of sorts.
AWS natively provides NAT capability through the use of the NAT Gateway service. Customers can also create NAT instances that they manage or leverage third-party NAT appliances from the AWS Marketplace.
In most scenarios, it is recommended to use the AWS NAT Gateway as it is highly available (in a single Availability Zone) and is provided as a managed service by AWS. It supports 5 Gbps of bandwidth per NAT gateway and automatically scales up to 45 Gbps.
An AWS NAT gateway’s high availability is confined to a single Availability Zone. For high availability across AZs, it is recommended to have a minimum of two NAT gateways (in different AZs). This allows you to switch to an available NAT gateway in the event that one should become unavailable.
This approach allows you to zone your Internet traffic, reducing cross Availability Zone connections to the Internet. More details on NAT gateway are available here.
Illustration of an environment with a single NAT Gateway (NAT-GW)
Illustration of high availability with a multiple NAT Gateways (NAT-GW) attached to their own route table
Illustration of the failure of one NAT Gateway and the fail over to an available NAT Gateway by the manual changing of the default route next hop in private subnet A route table
AWS allocated IPv6 addresses are Global Unicast Addresses by default. That said, you can privatize these subnets by using an Egress-Only Internet Gateway (E-IGW), instead of a regular Internet gateway. E-IGWs are purposely built to prevents users and applications on the Internet from initiating access to infrastructure in your IPv6 subnet(s).
Illustration of internet access for hybrid IPv6 subnets through an Egress-Only Internet Gateway (E-IGW)
Applications hosted on instances living within a private subnet can have different access needs. Some require access to the Internet while others require access to databases, applications, and users that are on-premises. For this type of access, AWS provides two avenues: the Virtual Gateway and the Transit Gateway. The Virtual Gateway can only support a single VPC at a time, while the Transit Gateway is built to simplify the interconnectivity of tens to hundreds of VPCs and then aggregating their connectivity to resources on-premises. Given that we are looking at the VPC as a single unit of networking, all diagrams below contain illustrations of the Virtual Gateway which acts a WAN concentrator for your VPC.
Illustration of private subnets connecting to data center via a Virtual Gateway (VGW)
Illustration of private subnets connecting to Data Center via a VGW
Illustration of private subnets connecting to Data Center using AWS Direct Connect as primary and IPsec as backup
The above diagram illustrates a WAN connection between a VGW attached to a VPC and a customer’s data center.
For these scenarios you have the option to leverage an Amazon VPC Endpoint.
There are two types of VPC Endpoints: Gateway Endpoints and Interface Endpoints.
Gateway Endpoints only support Amazon S3 and Amazon DynamoDB. Upon creation, a gateway is added to your specified route table(s) and acts as the destination for all requests to the service it is created for.
Interface Endpoints differ significantly and can only be created for services that are powered by AWS PrivateLink.
Upon creation, AWS creates an interface endpoint consisting of one or more Elastic Network Interfaces (ENIs). Each AZ can support one interface endpoint ENI. This acts as a point of entry for all traffic destined to a specific PrivateLink service.
When an interface endpoint is created, associated DNS entries are created that point to the endpoint and each ENI that the endpoint contains. To access the PrivateLink service you must send your request to one of these hostnames.
As illustrated below, ensure the Private DNS feature is enabled for AWS public and Marketplace services:
Since interface endpoints leverage ENIs, customers can use cloud techniques they are already familiar with. The interface endpoint can be configured with a restrictive security group. These endpoints can also be easily accessed from both inside and outside the VPC. Access from outside a VPC can be accomplished through Direct Connect and VPN.
Illustration of a solution that leverages an interface and gateway endpoint
Customers can also create AWS Endpoint services for their applications or services running on-premises. This allows access to these services via an interface endpoint which can be extended to other VPCs (even if the VPCs themselves do not have Direct Connect configured).
At re:Invent 2018, AWS launched the feature VPC sharing, which helps customers control VPC sprawl by decoupling the boundary of an AWS account from the underlying VPC network that supports its infrastructure.
VPC sharing allows customers to centralize the management of network, its IP space and the access paths to resources external to the VPC. This method of centralization and reuse (of VPC components such as NAT Gateway and Direct Connect connections) results in a reduction of cost to manage and maintain this environment.
Great, but there are times when a customer needs to build networks with multiple VPCs in and across AWS regions. How should this be done and what are the best practices?
Learn about AWS Services & Solutions – September AWS Online Tech Talks
Join us this September to learn about AWS services and solutions. The AWS Online Tech Talks are live, online presentations that cover a broad range of topics at varying technical levels. These tech talks, led by AWS solutions architects and engineers, feature technical deep dives, live demonstrations, customer examples, and Q&A with AWS experts. Register Now!
Note – All sessions are free and in Pacific Time.
Tech talks this month:
September 23, 2019 | 11:00 AM – 12:00 PM PT – Build Your Hybrid Cloud Architecture with AWS – Learn about the extensive range of services AWS offers to help you build a hybrid cloud architecture best suited for your use case.
October 1, 2019 | 11:00 AM – 12:00 PM PT – Robots and STEM: AWS RoboMaker and AWS Educate Unite! – Come join members of the AWS RoboMaker and AWS Educate teams as we provide an overview of our education initiatives and walk you through the newly launched RoboMaker Badge.
October 2, 2019 | 9:00 AM – 10:00 AM PT – Deep Dive on Amazon EventBridge – Learn how to optimize event-driven applications, and use rules and policies to route, transform, and control access to these events that react to data from SaaS apps.
This post is contributed by Tony Pujals | Senior Developer Advocate, AWS
AWS recently increased the number of elastic network interfaces available when you run tasks on Amazon ECS. Use the account setting called awsvpcTrunking. If you use the Amazon EC2 launch type and task networking (awsvpc network mode), you can now run more tasks on an instance—5 to 17 times as many—as you did before.
As more of you embrace microservices architectures, you deploy increasing numbers of smaller tasks. AWS now offers you the option of more efficient packing per instance, potentially resulting in smaller clusters and associated savings.
To manage your own cluster of EC2 instances, use the EC2 launch type. Use task networking to run ECS tasks using the same networking properties as if tasks were distinct EC2 instances.
Task networking offers several benefits. Every task launched with awsvpc network mode has its own attached network interface, a primary private IP address, and an internal DNS hostname. This simplifies container networking and gives you more control over how tasks communicate, both with each other and with other services within their virtual private clouds (VPCs).
Task networking also lets you take advantage of other EC2 networking features like VPC Flow Logs. This feature lets you monitor traffic to and from tasks. It also provides greater security control for containers, allowing you to use security groups and network monitoring tools at a more granular level within tasks. For more information, see Introducing Cloud Native Networking for Amazon ECS Containers.
However, if you run container tasks on EC2 instances with task networking, you can face a networking limit. This might surprise you, particularly when an instance has plenty of free CPU and memory. The limit reflects the number of network interfaces available to support awsvpc network mode per container instance.
Raise network interface density limits with trunking
The good news is that AWS raised network interface density limits by implementing a networking feature on ECS called “trunking.” This is a technique for multiplexing data over a shared communication link.
If you’re migrating to microservices using AWS App Mesh, you should optimize network interface density. App Mesh requires awsvpc networking to provide routing control and visibility over an ever-expanding array of running tasks. In this context, increased network interface density might save money.
By opting for network interface trunking, you should see a significant increase in capacity—from 5 to 17 times more than the previous limit. For more information on the new task limits per container instance, see Supported Amazon EC2 Instance Types.
Applications with tasks not hitting CPU or memory limits also benefit from this feature through the more cost-effective “bin packing” of container instances.
Trunking is an opt-in feature
AWS chose to make the trunking feature opt-in due to the following factors:
Instance registration: While normal instance registration is straightforward with trunking, this feature increases the number of asynchronous instance registration steps that can potentially fail. Any such failures might add extra seconds to launch time.
Available IP addresses: The “trunk” belongs to the same subnet in which the instance’s primary network interface originates. This effectively reduces the available IP addresses and potentially the ability to scale out on other EC2 instances sharing the same subnet. The trunk consumes an IP address. With a trunk attached, there are two assigned IP addresses per instance, one for the primary interface and one for the trunk.
Differing customer preferences and infrastructure: If you have high CPU or memory workloads, you might not benefit from trunking. Or, you may not want awsvpc networking.
Consequently, AWS leaves it to you to decide if you want to use this feature. AWS might revisit this decision in the future, based on customer feedback. For now, your account roles or users must opt in to the awsvpcTrunking account setting to gain the benefits of increased task density per container instance.
Enable the ECS elastic network interface trunking feature to increase the number of network interfaces that can be attached to supported EC2 container instance types. You must meet the following prerequisites before you can launch a container instance with the increased network interface limits:
Your account must have the AWSServiceRoleForECS service-linked role for ECS.
You must opt into the awsvpcTrunking account setting.
Make sure that a service-linked role exists for ECS
A service-linked role is a unique type of IAM role linked to an AWS service (such as ECS). This role lets you delegate the permissions necessary to call other AWS services on your behalf. Because ECS is a service that manages resources on your behalf, you need this role to proceed.
In most cases, you won’t have to create a service-linked role. If you created or updated an ECS cluster, ECS likely created the service-linked role for you.
You can confirm that your service-linked role exists using the AWS CLI, as shown in the following code example:
Your account, IAM user, or role must opt in to the awsvpcTrunking account setting. Select this setting using the AWS CLI or the ECS console. You can opt in for an account by making awsvpcTrunking its default setting. Or, you can enable this setting for the role associated with the instance profile with which the instance launches. For instructions, see Account Settings.
After completing the prerequisites described in the preceding sections, launch a new container instance with increased network interface limits using one of the supported EC2 instance types.
Keep the following in mind:
It’s available with the latest variant of the ECS-optimized AMI.
It only affects creation of new container instances after opting into awsvpcTrunking.
It only affects tasks created with awsvpc network mode and EC2 launch type. Tasks created with the AWS Fargate launch type always have a dedicated network interface, no matter how many you launch.
If you seek to optimize the usage of your EC2 container instances for clusters that you manage, enable the increased network interface density feature with awsvpcTrunking. By following the steps outlined in this post, you can launch tasks using significantly fewer EC2 instances. This is especially useful if you embrace a microservices architecture, with its increasing numbers of lighter tasks.
Hopefully, you found this post informative and the proposed solution intriguing. As always, AWS welcomes all feedback or comment.
This post is contributed by Lulu Zhao | Software Development Engineer II, AWS
AWS X-Ray helps developers and DevOps engineers quickly understand how an application and its underlying services are performing. When it’s integrated with AWS App Mesh, the combination makes for a powerful analytical tool.
X-Ray helps to identify and troubleshoot the root causes of errors and performance issues. It’s capable of analyzing and debugging distributed applications, including those based on a microservices architecture. It offers insights into the impact and reach of errors and performance problems.
In this post, I demonstrate how to integrate it with App Mesh.
App Mesh is a service mesh based on the Envoy proxy that makes it easy to monitor and control microservices. App Mesh standardizes how your microservices communicate, giving you end-to-end visibility and helping to ensure high application availability.
With App Mesh, it’s easy to maintain consistent visibility and network traffic control for services built across multiple types of compute infrastructure. App Mesh configures each service to export monitoring data and implements consistent communications control logic across your application.
A service mesh is like a communication layer for microservices. All communication between services happens through the mesh. Customers use App Mesh to configure a service mesh that contains virtual services, virtual nodes, virtual routes, and corresponding routes.
However, it’s challenging to visualize the way that request traffic flows through the service mesh while attempting to identify latency and other types of performance issues. This is particularly true as the number of microservices increases.
It’s in exactly this area where X-Ray excels. To show a detailed workflow inside a service mesh, I implemented a tracing extension called X-Ray tracer inside Envoy. With it, I ensure that I’m tracing all inbound and outbound calls that are routed through Envoy.
Traffic routing with color app
The following example shows how X-Ray works with App Mesh. I used the Color App, a simple demo application, to showcase traffic routing.
This app has two Go applications that are included in the AWS X-Ray Go SDK: color-gateway and color-teller. The color-gateway application is exposed to external clients and responds to http://service-name:port/color, which retrieves color from color-teller. I deployed color-app using Amazon ECS. This image illustrates how color-gateway routes traffic into a virtual router and then into separate nodes using color-teller.
The following image shows client interactions with App Mesh in an X-Ray service map after requests have been made to the color-gateway and to color-teller.
There are two types of service nodes:
AWS::AppMesh::Proxy is generated by the X-Ray tracing extension inside Envoy.
After the Color app successfully launched, I made a request to color-gateway to fetch a color.
First, the Envoy proxy appmesh/colorgateway-vn in front of default-gateway received the request and routed it to the server default-gateway.
Then, default-gateway made a request to server default-colorteller-white to retrieve the color.
Instead of directly calling the color-teller server, the request went to the default-gateway Envoy proxy and the proxy routed the call to color-teller.
That’s the advantage of using the Envoy proxy. Envoy is a self-contained process that is designed to run in parallel with all application servers. All of the Envoy proxies form a transparent communication mesh through which each application sends and receives messages to and from localhost while remaining unaware of the broader network topology.
For App Mesh integration, the X-Ray tracer records the mesh name and virtual node name values and injects them into the segment JSON document. Here is an example:
To enable X-Ray tracing through App Mesh inside Envoy, you must set two environment variable configurations:
The first one enables X-Ray tracing using 127.0.0.1:2000 as the default daemon endpoint to which generated segments are sent. If the daemon you installed listens on a different port, you can specify a port value to override the default X-Ray daemon port by using the second configuration.
Currently, AWS X-Ray supports SDKs written in multiple languages (including Java, Python, Go, .NET, and .NET Core, Node.js, and Ruby) to help you implement your services. For more information, see Getting Started with AWS X-Ray.
Today, App Mesh is available as a public preview. In the coming months, we plan to add new functionality and integrations.
Why App Mesh?
Many of our customers are building applications with microservices architectures, breaking applications into many separate, smaller pieces of software that are independently deployed and operated. Microservices help to increase the availability and scalability of an application by allowing each component to scale independently based on demand. Each microservice interacts with the other microservices through an API.
When you start building more than a few microservices within an application, it becomes difficult to identify and isolate issues. These can include high latencies, error rates, or error codes across the application. There is no dynamic way to route network traffic when there are failures or when new containers need to be deployed.
You can address these problems by adding custom code and libraries into each microservice and using open source tools that manage communications for each microservice. However, these solutions can be hard to install, difficult to update across teams, and complex to manage for availability and resiliency.
AWS App Mesh implements a new architectural pattern that helps solve many of these challenges and provides a consistent, dynamic way to manage the communications between microservices. With App Mesh, the logic for monitoring and controlling communications between microservices is implemented as a proxy that runs alongside each microservice, instead of being built into the microservice code. The proxy handles all of the network traffic into and out of the microservice and provides consistency for visibility, traffic control, and security capabilities to all of your microservices.
Use App Mesh to model how all of your microservices connect. App Mesh automatically computes and sends the appropriate configuration information to each microservice proxy. This gives you standardized, easy-to-use visibility and traffic controls across your entire application. App Mesh uses Envoy, an open source proxy. That makes it compatible with a wide range of AWS partner and open source tools for monitoring microservices.
Using App Mesh, you can export observability data to multiple AWS and third-party tools, including Amazon CloudWatch, AWS X-Ray, or any third-party monitoring and tracing tool that integrates with Envoy. You can configure new traffic routing controls to enable dynamic blue/green canary deployments for your services.
Here’s a sample application with two services, where service A receives traffic from the internet and uses service B for some backend processing. You want to route traffic dynamically between services B and B’, a new version of B deployed to act as the canary.
First, create a mesh, a namespace that groups related microservices that must interact.
Next, create virtual nodes to represent services in the mesh. A virtual node can represent a microservice or a specific microservice version. In this example, service A and B participate in the mesh and you manage the traffic to service B using App Mesh.
Now, deploy your services with the required Envoy proxy and with a mapping to the node in the mesh.
After you have defined your virtual nodes, you can define how the traffic flows between your microservices. To do this, define a virtual router and routes for communications between microservices.
A virtual router handles traffic for your microservices. After you create a virtual router, you create routes to direct traffic appropriately. These routes include the connection requests that the route should accept, where they should go, and the weighted amount of traffic to send. All of these changes to adjust traffic between services is computed and sent dynamically to the appropriate proxies by App Mesh to execute your deployment.
You now have a virtual router set up that accepts all traffic from virtual node A sending to the existing version of service B, as well some traffic to the new version, B’.
Exporting metrics, logs, and traces
One of benefits about placing a proxy in front of every microservice is that you can automatically capture metrics, logs, and traces about the communication between your services. App Mesh enables you to easily collect and export this data to the tools of your choice. Envoy is already integrated with several tools like Prometheus and Datadog.
During the preview, we are adding support for AWS services such as Amazon CloudWatch and AWS X-Ray. We have a lot more integrations planned as well.
AWS App Mesh is available as a public preview and you can start using it today in the North Virginia, Ohio, Oregon, and Ireland AWS Regions. During the preview, we plan to add new features and want to hear your feedback. You can check out our GitHub repository for examples and our roadmap.
Performing network security assessments allows you to understand your cloud infrastructure and identify risks, but this process traditionally takes a lot of time and effort. You might need to run network port-scanning tools to test routing and firewall configurations, then validate what processes are listening on your instance network ports, before finally mapping the IPs identified in the port scan back to the host’s owner. To make this process simpler for our customers, AWS recently released the Network Reachability rules package in Amazon Inspector, our automated security assessment service that enables you to understand and improve the security and compliance of applications deployed on AWS. The existing Amazon Inspector host assessment rules packages check the software and configurations on your Amazon Elastic Compute Cloud (Amazon EC2) instances for vulnerabilities and deviations from best practices.
The new Network Reachability rules package analyzes your Amazon Virtual Private Cloud (Amazon VPC) network configuration to determine whether your EC2 instances can be reached from external networks such as the Internet, a virtual private gateway, AWS Direct Connect, or from a peered VPC. In other words, it informs you of potential external access to your hosts. It does this by analyzing all of your network configurations—like security groups, network access control lists (ACLs), route tables, and internet gateways (IGWs)—together to infer reachability. No packets need to be sent across the VPC network, nor must attempts be made to connect to EC2 instance network ports—it’s like packet-less network mapping and reconnaissance!
This new rules package is the first Amazon Inspector rules package that doesn’t require an Amazon Inspector agent to be installed on your Amazon EC2 instances. If you do optionally install the Amazon Inspector agent on your EC2 instances, the network reachability assessment will also report on the processes listening on those ports. In addition, installing the agent allows you to use Amazon Inspector host rules packages to check for vulnerabilities and security exposures in your EC2 instances.
To determine what is reachable, the Network Reachability rules package uses the latest technology from the AWS Provable Security initiative, referring to a suite of AWS technology powered by automated reasoning. It analyzes your AWS network configurations such as Amazon Virtual Private Clouds (VPCs), security groups, network access control lists (ACLs), and route tables to prove reachability of ports. What is automated reasoning, you ask? It’s fancy math that proves things are working as expected. In more technical terms, it’s a method of formal verification that automatically generates and checks mathematical proofs, which help to prove systems are functioning correctly. Note that Network Reachability only analyzes network configurations, so any other network controls, like on-instance IP filters or external firewalls, are not accounted for in the assessment. See documentation for more details.
Tim Kropp, Technology & Security Lead at Bridgewater Associates talked about how Bridgewater benefitted from Network Reachability Rules. “AWS provides tools for organizations to know if all their compliance, security, and availability requirements are being met. Technology created by the AWS Automated Reasoning Group, such as the Network Reachability Rules, allow us to continuously evaluate our live networks against these requirements. This grants us peace of mind that our most sensitive workloads exist on a network that we deeply understand.”
Network reachability assessments are priced per instance per assessment (instance-assessment). The free trial offers the first 250 instance-assessments for free within your first 90 days of usage. After the free trial, pricing is tiered based on your monthly volume. You can see pricing details here.
Using the Network Reachability rules package
Amazon Inspector makes it easy for you to run agentless network reachability assessments on all of your EC2 instances. You can do this with just a couple of clicks on the Welcome page of the Amazon Inspector console. First, use the check box to Enable network assessments, then select Run Once to run a single assessment or Run Weekly to run a weekly recurring assessment.
Figure 1: Assessment setup
Customizing the Network Reachability rules package
If you want to target a subset of your instances or modify the recurrence of assessments, you can select Advanced setup for guided steps to set up and run a custom assessment. For full customization options including getting notifications for findings, select Cancel and use the following steps.
Navigate to the Assessment targets page of the Amazon Inspector console to create an assessment target. You can select the option to include all instances within your account and AWS region, or you can assess a subset of your instances by adding tags to them in the EC2 console and inputting those tags when you create the assessment target. Give your target a name and select Save.
Figure 2: Assessment target
Optional agent installation: To get information about the processes listening on reachable ports, you’ll need to install the Amazon Inspector agent on your EC2 instances. If your instances allow the Systems Manager Run command, you can select the Install Agents option while creating your assessment target. Otherwise, you can follow the instructions here to install the Amazon Inspector agent on your instances before setting up and running the Amazon Inspector assessments using the steps above. In addition, installing the agent allows you to use Amazon Inspector host rules packages to check for vulnerabilities and security exposures in your EC2 instances.
Go to the Assessment templates page of the Amazon Inspector console. In the Target name field, select the assessment target that you created in step 1. From the Rules packages drop-down, select the Network Reachability-1.1 rules package. You can also set up a recurring schedule and notifications to your Amazon Simple Notification Service topic. (Learn more about Amazon SNS topics here). Now, select Create and Run. That’s it!
Alternately, you can run the assessment by selecting the template you just created from the Assessment templates page and then selecting Run, or by using the Amazon Inspector API.
You can view your findings on the Findings page in the Amazon Inspector console. You can also download a CSV of the findings from Amazon Inspector by using the Download button on the page, or you can use the AWS application programming interface (API) to retrieve findings in another application.
Note: You can create any CloudWatch Events rule and add your Amazon Inspector assessment as the target using the assessment template’s Amazon Resource Name (ARN), which is available in the console. You can use CloudWatch Events rules to automatically trigger assessment runs on a schedule or based on any other event. For example, you can trigger a network reachability assessment whenever there is a change to a security group or another VPC configuration, allowing you to automatically be alerted about insecure network exposure.
Understanding your EC2 instance network exposure
You can use this new rules package to analyze the accessibility of critical ports, as well as all other network ports. For critical ports, Amazon Inspector will show the exposure of each and will offer findings per port. When critical, well-known ports (based on Amazon’s standard guidance) are reachable, findings will be created with higher severities. When the Amazon Inspector agent is installed on the instance, each reachable port with a listener will also be reported. The following examples show network exposure from the Internet. There are analogous findings for network exposure via VPN, Direct Connect, or VPC peering. Read more about the finding types here.
Example finding for a well-known port open to the Internet, without installation of the Amazon Inspector Agent:
Figure 3: Finding for a well-known port open to the Internet
Example finding for a well-known port open to the Internet, with the Amazon Inspector Agent installed and a listening process (SSH):
Figure 4: Finding for a well-known port open to the Internet, with the Amazon Inspector Agent installed and a listening process (SSH)
Note that the findings provide the details on exactly how network access is allowed, including which VPC and subnet the instance is in. This makes tracking down the root cause of the network access straightforward. The recommendation includes information about exactly which Security Group you can edit to remove the access. And like all Amazon Inspector findings, these can be published to an SNS topic for additional processing, whether that’s to a ticketing system or to a custom AWS Lambda function. (See our blog post on automatic remediation of findings for guidance on how to do this.) For example, you could use Lambda to automatically remove ingress rules in the Security Group to address a network reachability finding.
With this new functionality from Amazon Inspector, you now have an easy way of assessing the network exposure of your EC2 instances and identifying and resolving unwanted exposure. We’ll continue to tailor findings to align with customer feedback. We encourage you to try out the Network Reachability Rules Package yourself and post any questions in the Amazon Inspector forum.
Want more AWS Security news? Follow us on Twitter.
Join us this month to learn about AWS services and solutions. New this month, we have a fireside chat with the GM of Amazon WorkSpaces and our 2nd episode of the “How to re:Invent” series. We’ll also cover best practices, deep dives, use cases and more! Join us and register today!
Containers June 25, 2018 | 09:00 AM – 09:45 AM PT – Running Kubernetes on AWS – Learn about the basics of running Kubernetes on AWS including how setup masters, networking, security, and add auto-scaling to your cluster.
June 28, 2018 | 01:00 PM – 01:45 PM PT – Fireside Chat: End User Collaboration on AWS – Learn how End User Compute services can help you deliver access to desktops and applications anywhere, anytime, using any device. IoT
The Mozilla blog has an article describing the addition of DNS over HTTPS (DoH) as an optional feature in the Firefox browser. “DoH support has been added to Firefox 62 to improve the way Firefox interacts with DNS. DoH uses encrypted networking to obtain DNS information from a server that is configured within Firefox. This means that DNS requests sent to the DoH cloud server are encrypted while old style DNS requests are not protected.” The configured server is hosted by Cloudflare, which has posted this privacy agreement about the service.
Businesses and organizations that rely on macOS server for essential office and data services are facing some decisions about the future of their IT services.
Apple recently announced that it is deprecating a significant portion of essential network services in macOS Server, as they described in a support statement posted on April 24, 2018, “Prepare for changes to macOS Server.” Apple’s note includes:
macOS Server is changing to focus more on management of computers, devices, and storage on your network. As a result, some changes are coming in how Server works. A number of services will be deprecated, and will be hidden on new installations of an update to macOS Server coming in spring 2018.
The note lists the services that will be removed in a future release of macOS Server, including calendar and contact support, Dynamic Host Configuration Protocol (DHCP), Domain Name Services (DNS), mail, instant messages, virtual private networking (VPN), NetInstall, Web server, and the Wiki.
Apple assures users who have already configured any of the listed services that they will be able to use them in the spring 2018 macOS Server update, but the statement ends with links to a number of alternative services, including hosted services, that macOS Server users should consider as viable replacements to the features it is removing. These alternative services are all FOSS (Free and Open-Source Software).
As difficult as this could be for organizations that use macOS server, this is not unexpected. Apple left the server hardware space back in 2010, when Steve Jobs announced the company was ending its line of Xserve rackmount servers, which were introduced in May, 2002. Since then, macOS Server has hardly been a prominent part of Apple’s product lineup. It’s not just the product itself that has lost some luster, but the entire category of SMB office and business servers, which has been undergoing a gradual change in recent years.
Some might wonder how important the news about macOS Server is, given that macOS Server represents a pretty small share of the server market. macOS Server has been important to design shops, agencies, education users, and small businesses that likely have been on Macs for ages, but it’s not a significant part of the IT infrastructure of larger organizations and businesses.
What Comes After macOS Server?
Lovers of macOS Server don’t have to fear having their Mac minis pried from their cold, dead hands quite yet. Installed services will continue to be available. In the fall of 2018, new installations and upgrades of macOS Server will require users to migrate most services to other software. Since many of the services of macOS Server were already open-source, this means that a change in software might not be required. It does mean more configuration and management required from those who continue with macOS Server, however.
Users can continue with macOS Server if they wish, but many will see the writing on the wall and look for a suitable substitute.
The Times They Are A-Changin’
For many people working in organizations, what is significant about this announcement is how it reflects the move away from the once ubiquitous server-based IT infrastructure. Services that used to be centrally managed and office-based, such as storage, file sharing, communications, and computing, have moved to the cloud.
In selecting the next office IT platforms, there’s an opportunity to move to solutions that reflect and support how people are working and the applications they are using both in the office and remotely. For many, this means including cloud-based services in office automation, backup, and business continuity/disaster recovery planning. This includes Software as a Service, Platform as a Service, and Infrastructure as a Service (Saas, PaaS, IaaS) options.
IT solutions that integrate well with the cloud are worth strong consideration for what comes after a macOS Server-based environment.
Synology NAS as a macOS Server Alternative
One solution that is becoming popular is to replace macOS Server with a device that has the ability to provide important office services, but also bridges the office and cloud environments. Using Network-Attached Storage (NAS) to take up the server slack makes a lot of sense. Many customers are already using NAS for file sharing, local data backup, automatic cloud backup, and other uses. In the case of Synology, their operating system, Synology DiskStation Manager (DSM), is Linux based, and integrates the basic functions of file sharing, centralized backup, RAID storage, multimedia streaming, virtual storage, and other common functions.
Since DSM is based on Linux, there are numerous server applications available, including many of the same ones that are available for macOS Server, which shares conceptual roots with Linux as it comes from BSD Unix.
Synology DiskStation Manager Package Center
According to Ed Lukacs, COO at 2FIFTEEN Systems Management in Salt Lake City, their customers have found the move from macOS Server to Synology NAS not only painless, but positive. DSM works seamlessly with macOS and has been faster for their customers, as well. Many of their customers are running Adobe Creative Suite and Google G Suite applications, so a workflow that combines local storage, remote access, and the cloud, is already well known to them. Remote users are supported by Synology’s QuickConnect or VPN.
Customers have been able to get up and running quickly, with only initial data transfers requiring some time to complete. After that, management of the NAS can be handled in-house or with the support of a Managed Service Provider (MSP).
Are You Sticking with macOS Server or Moving to Another Platform?
If you’re affected by this change in macOS Server, please let us know in the comments how you’re planning to cope. Are you using Synology NAS for server services? Please tell us how that’s working for you.
Regardless of your career path, there’s no denying that attending industry events can provide helpful career development opportunities — not only for improving and expanding your skill sets, but for networking as well. According to this article from PayScale.com, experts estimate that somewhere between 70-85% of new positions are landed through networking.
Narrowing our focus to networking opportunities with cloud computing professionals who’re working on tackling some of today’s most innovative and exciting big data solutions, attending big data-focused sessions at an AWS Global Summit is a great place to start.
AWS Global Summits are free events that bring the cloud computing community together to connect, collaborate, and learn about AWS. As the name suggests, these summits are held in major cities around the world, and attract technologists from all industries and skill levels who’re interested in hearing from AWS leaders, experts, partners, and customers.
In addition to networking opportunities with top cloud technology providers, consultants and your peers in our Partner and Solutions Expo, you’ll also hone your AWS skills by attending and participating in a multitude of education and training opportunities.
Here’s a brief sampling of some of the upcoming sessions relevant to big data professionals:
Be sure to check out the main page for AWS Global Summits, where you can see which cities have AWS Summits planned for 2018, register to attend an upcoming event, or provide your information to be notified when registration opens for a future event.
I was sure that somewhere there must be physically-lightweight sensors with simple power, simple networking, and a lightweight protocol that allowed them to squirt their data down the network with a minimum of overhead. So my interest was piqued when Jan-Piet Mens spoke at FLOSS UK’s Spring Conference on “Small Things for Monitoring”. Once he started passing working demonstration systems around the room without interrupting the demonstration, it was clear that MQTT was what I’d been looking for.
Join us this month to learn about some of the exciting new services and solution best practices at AWS. We also have our first re:Invent 2018 webinar series, “How to re:Invent”. Sign up now to learn more, we look forward to seeing you.
May 30, 2018 | 01:00 PM – 01:45 PM PT – Accelerating Life Sciences with HPC on AWS – Learn how you can accelerate your Life Sciences research workloads by harnessing the power of high performance computing on AWS.
May 22, 2018 | 11:00 AM – 11:45 AM PT – Hybrid Cloud Customer Use Cases on AWS – Learn how customers are leveraging AWS hybrid cloud capabilities to easily extend their datacenter capacity, deliver new services and applications, and ensure business continuity and disaster recovery.
May 31, 2018 | 11:00 AM – 11:45 AM PT – Using AWS IoT for Industrial Applications – Discover how you can quickly onboard your fleet of connected devices, keep them secure, and build predictive analytics with AWS IoT.
May 24, 2018 | 09:00 AM – 09:45 AM PT – Introducing AWS DeepLens – Learn how AWS DeepLens provides a new way for developers to learn machine learning by pairing the physical device with a broad set of tutorials, examples, source code, and integration with familiar AWS services.
May 30, 2018 | 11:00 AM – 11:45 AM PT – Accelerate Productivity by Computing at the Edge – Learn how AWS Snowball Edge support for compute instances helps accelerate data transfers, execute custom applications, and reduce overall storage costs.
The 4.17-rc4 kernel prepatch is out. “Two thirds of the 4.17-rc4 patch is drivers, which sounds about right. Media, networking, rdma, input, nvme, usb. A little bit of everything, in other words.” The codename has been changed, for the first time since 4.10, to “Merciless Moray”.
EC2’s H1 instances offer 2 to 16 terabytes of fast, dense storage for big data applications, optimized to deliver high throughput for sequential I/O. Enhanced Networking, 32 to 256 gigabytes of RAM, and Intel Xeon E5-2686 v4 processors running at a base frequency of 2.3 GHz round out the feature set.
I am happy to announce that we are reducing the On-Demand and Reserved Instance prices for H1 instances in the US East (N. Virginia), US East (Ohio), US West (Oregon), and EU (Ireland) Regions by 15%, effective immediately.
The cookie settings on this website are set to "allow cookies" to give you the best browsing experience possible. If you continue to use this website without changing your cookie settings or you click "Accept" below then you are consenting to this.