Tag Archives: Amazon VPC

Learn From Your VPC Flow Logs With Additional Meta-Data

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/learn-from-your-vpc-flow-logs-with-additional-meta-data/

Flow Logs for Amazon Virtual Private Cloud enables you to capture information about the IP traffic going to and from network interfaces in your VPC. Flow Logs data can be published to Amazon CloudWatch Logs or Amazon Simple Storage Service (S3).

Since we launched VPC Flow Logs in 2015, you have been using it for variety of use-cases like troubleshooting connectivity issues across your VPCs, intrusion detection, anomaly detection, or archival for compliance purposes. Until today, VPC Flow Logs provided information that included source IP, source port, destination IP, destination port, action (accept, reject) and status. Once enabled, a VPC Flow Log entry looks like the one below.

While this information was sufficient to understand most flows, it required additional computation and lookup to match IP addresses to instance IDs or to guess the directionality of the flow to come to meaningful conclusions.

Today we are announcing the availability of additional meta data to include in your Flow Logs records to better understand network flows. The enriched Flow Logs will allow you to simplify your scripts or remove the need for postprocessing altogether, by reducing the number of computations or lookups required to extract meaningful information from the log data.

When you create a new VPC Flow Log, in addition to existing fields, you can now choose to add the following meta-data:

  • vpc-id : the ID of the VPC containing the source Elastic Network Interface (ENI).
  • subnet-id : the ID of the subnet containing the source ENI.
  • instance-id : the Amazon Elastic Compute Cloud (EC2) instance ID of the instance associated with the source interface. When the ENI is placed by AWS services (for example, AWS PrivateLink, NAT Gateway, Network Load Balancer etc) this field will be “-
  • tcp-flags : the bitmask for TCP Flags observed within the aggregation period. For example, FIN is 0x01 (1), SYN is 0x02 (2), ACK is 0x10 (16), SYN + ACK is 0x12 (18), etc. (the bits are specified in “Control Bits” section of RFC793 “Transmission Control Protocol Specification”).
    This allows to understand who initiated or terminated the connection. TCP uses a three way handshake to establish a connection. The connecting machine sends a SYN packet to the destination, the destination replies with a SYN + ACK and, finally, the connecting machine sends an ACK. In the Flow Logs, the handshake is shown as two lines, with tcp-flags values of 2 (SYN), 18 (SYN + ACK).  ACK is reported only when it is accompanied with SYN (otherwise it would be too much noise for you to filter out).
  • type : the type of traffic : IPV4, IPV6 or Elastic Fabric Adapter.
  • pkt-srcaddr : the packet-level IP address of the source. You typically use this field in conjunction with srcaddr to distinguish between the IP address of an intermediate layer through which traffic flows, such as a NAT gateway.
  • pkt-dstaddr : the packet-level destination IP address, similar to the previous one, but for destination IP addresses.

To create a VPC Flow Log, you can use the AWS Management Console, the AWS Command Line Interface (CLI) or the CreateFlowLogs API and select which additional information and the order you want to consume the fields, for example:

Or using the AWS Command Line Interface (CLI) as below:

$ aws ec2 create-flow-logs --resource-type VPC \
                            --region eu-west-1 \
                            --resource-ids vpc-12345678 \
                            --traffic-type ALL  \
                            --log-destination-type s3 \
                            --log-destination arn:aws:s3:::sst-vpc-demo \
                            --log-format '${version} ${vpc-id} ${subnet-id} ${instance-id} ${interface-id} ${account-id} ${type} ${srcaddr} ${dstaddr} ${srcport} ${dstport} ${pkt-srcaddr} ${pkt-dstaddr} ${protocol} ${bytes} ${packets} ${start} ${end} ${action} ${tcp-flags} ${log-status}'

# be sure to replace the bucket name and VPC ID !

    "ClientToken": "1A....HoP=",
    "FlowLogIds": [
    "Unsuccessful": [] 

Enriched VPC Flow Logs are delivered to S3. We will automatically add the required S3 Bucket Policy to authorize VPC Flow Logs to write to your S3 bucket. VPC Flow Logs does not capture real-time log streams for your network interface, it might take several minutes to begin collecting and publishing data to the chosen destinations. Your logs will eventually be available on S3 at s3://<bucket name>/AWSLogs/<account id>/vpcflowlogs/<region>/<year>/<month>/<day>/

An SSH connection from my laptop with IP address to an EC2 instance would appear like this :

3 vpc-exxxxxx2 subnet-8xxxxf3 i-0bfxxxxxxaf eni-08xxxxxxa5 48xxxxxx93 IPv4 22 62897 6 5225 24 1566328660 1566328672 ACCEPT 18 OK
3 vpc-exxxxxx2 subnet-8xxxxf3 i-0bfxxxxxxaf eni-08xxxxxxa5 48xxxxxx93 IPv4 62897 22 6 4877 29 1566328660 1566328672 ACCEPT 2 OK is the private IP address of the EC2 instance, the one you see when you type ifconfig on the instance.  All flags are “OR”ed during aggregation period. When connection is short, probably both SYN and FIN (3), as well as SYN+ACK and FIN (19) will be set for the same lines.

Once a Flow Log is created, you can not add additional fields or modify the structure of the log to ensure you will not accidently break scripts consuming this data. Any modification will require you to delete and recreate the VPC Flow Logs. There is no additional cost to capture the extra information in the VPC Flow Logs, normal VPC Flow Log pricing applies, remember that Enriched VPC Flow Log records might consume more storage when selecting all fields.  We do recommend to select only the fields relevant to your use-cases.

Enriched VPC Flow Logs is available in all regions where VPC Flow logs is available, you can start to use it today.

— seb

PS: I heard from the team they are working on adding additional meta-data to the logs, stay tuned for updates.

How to add DNS filtering to your NAT instance with Squid

Post Syndicated from Nicolas Malaval original https://aws.amazon.com/blogs/security/how-to-add-dns-filtering-to-your-nat-instance-with-squid/

Note from September 4, 2019: We’ve updated this blog post, initially published on January 26, 2016. Major changes include: support of Amazon Linux 2, no longer having to compile Squid 3.5, and a high availability version of the solution across two availability zones.

Amazon Virtual Private Cloud (Amazon VPC) enables you to launch AWS resources on a virtual private network that you’ve defined. On an Amazon VPC, many people use network address translation (NAT) instances and NAT gateways to enable instances in a private subnet to initiate outbound traffic to the Internet, while preventing the instances from receiving inbound traffic initiated by someone on the Internet.

For security and compliance purposes, you might have to filter the requests initiated by these instances (also known as “egress filtering”). Using iptables rules, you could restrict outbound traffic with your NAT instance based on a predefined destination port or IP address. However, you might need to enforce more complex security policies, such as allowing requests to AWS endpoints only, or blocking fraudulent websites, which you can’t easily achieve by using iptables rules.

In this post, I discuss and give an example of how to use Squid, a leading open-source proxy, to implement a “transparent proxy” that can restrict both HTTP and HTTPS outbound traffic to a given set of Internet domains, while being fully transparent for instances in the private subnet.

The solution architecture

In this section, I present the architecture of the high availability NAT solution and explain how to configure Squid to filter traffic transparently. Later in this post, I’ll provide instructions about how to implement and test the solution.

The following diagram illustrates how the components in this process interact with each other. Squid Instance 1 intercepts HTTP/S requests sent by instances in Private Subnet 1, including the Testing Instance. Squid Instance 1 then initiates a connection with the destination host on behalf of the Testing Instance, which goes through the Internet gateway. This solution spans two Availability Zones, with Squid Instance 2 intercepting requests sent from the other Availability Zone. Note that you may adapt the solution to span additional Availability Zones.

Figure 1: The solution spans two Availability Zones

Figure 1: The solution spans two Availability Zones

Intercepting and filtering traffic

In each availability zone, the route table associated to the private subnet sends the outbound traffic to the Squid instance (see Route Tables for a NAT Device). Squid intercepts the requested domain, then applies the following filtering policy:

  • For HTTP requests, Squid retrieves the host header field included in all HTTP/1.1 request messages. This specifies the Internet host being requested.
  • For HTTPS requests, the HTTP traffic is encapsulated in a TLS connection between the instance in the private subnet and the remote host. Squid cannot retrieve the host header field because the header is encrypted. A feature called SslBump would allow Squid to decrypt the traffic, but this would not be transparent for the client because the certificate would be considered invalid in most cases. The feature I use instead, called SslPeekAndSplice, retrieves the Server Name Indication (SNI) from the TLS initiation. The SNI contains the requested Internet host. As a result, Squid can make filtering decisions without decrypting the HTTPS traffic.

Note 1: Some older client-side software stacks do not support SNI. Here are the minimum versions of some important stacks and programming languages that support SNI: Python 2.7.9 and 3.2, Java 7 JSSE, wget 1.14, OpenSSL 0.9.8j, cURL 7.18.1

Note 2: TLS 1.3 introduced an optional extension that allows the client to encrypt the SNI, which may prevent Squid from intercepting the requested domain.

The SslPeekAndSplice feature was introduced in Squid 3.5 and is implemented in the same Squid module as SslBump. To enable this module, Squid requires that you provide a certificate, though it will not be used to decode HTTPS traffic. The solution creates a certificate using OpenSSL.

mkdir /etc/squid/ssl
cd /etc/squid/ssl
openssl genrsa -out squid.key 4096
openssl req -new -key squid.key -out squid.csr -subj "/C=XX/ST=XX/L=squid/O=squid/CN=squid"
openssl x509 -req -days 3650 -in squid.csr -signkey squid.key -out squid.crt
cat squid.key squid.crt >> squid.pem        

The following code shows the Squid configuration file. For HTTPS traffic, note the ssl_bump directives instructing Squid to “peek” (retrieve the SNI) and then “splice” (become a TCP tunnel without decoding) or “terminate” the connection depending on the requested host.

visible_hostname squid
cache deny all

# Log format and rotation
logformat squid %ts.%03tu %6tr %>a %Ss/%03>Hs %<st %rm %ru %ssl::>sni %Sh/%<a %mt
logfile_rotate 10
debug_options rotate=10

# Handling HTTP requests
http_port 3128
http_port 3129 intercept
acl allowed_http_sites dstdomain "/etc/squid/whitelist.txt"
http_access allow allowed_http_sites

# Handling HTTPS requests
https_port 3130 cert=/etc/squid/ssl/squid.pem ssl-bump intercept
acl SSL_port port 443
http_access allow SSL_port
acl allowed_https_sites ssl::server_name "/etc/squid/whitelist.txt"
acl step1 at_step SslBump1
acl step2 at_step SslBump2
acl step3 at_step SslBump3
ssl_bump peek step1 all
ssl_bump peek step2 allowed_https_sites
ssl_bump splice step3 allowed_https_sites
ssl_bump terminate step2 all
http_access deny all       

The text file located at /etc/squid/whitelist.txt contains the list of whitelisted domains, with one domain per line. In this blog post, I’ll show you how to configure Squid to allow requests to *.amazonaws.com, which corresponds to AWS endpoints. Note that you can restrict access to a specific set of AWS services that you’ve defined (see Regions and Endpoints for a detailed list of endpoints), or you can set your own list of domains.

Note: An alternate approach is to use VPC endpoints to privately connect your VPC to supported AWS services without requiring access over the Internet (see VPC Endpoints). Some supported AWS services allow you to create a policy that controls the use of the endpoint to access AWS resources (see VPC Endpoint Policies, and VPC Endpoints for a list of supported services).

You may have noticed that Squid listens on port 3129 for HTTP traffic and 3130 for HTTPS. Because Squid cannot directly listen to 80 and 443, you have to redirect the incoming requests from instances in the private subnets to the Squid ports using iptables. You do not have to enable IP forwarding or add any FORWARD rule, as you would do with a standard NAT instance.

sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 3129
sudo iptables -t nat -A PREROUTING -p tcp --dport 443 -j REDIRECT --to-port 3130       

The solution stores the files squid.conf and whitelist.txt in an Amazon Simple Storage Service (S3) bucket and runs the following script every minute on the Squid instances to download and update the Squid configuration from S3. This makes it easy to maintain the Squid configuration from a central location. Note that it first validates the files with squid -k parse and then reload the configuration with squid -k reconfigure if no error was found.

        cp /etc/squid/* /etc/squid/old/
        aws s3 sync s3://<s3-bucket> /etc/squid
        squid -k parse && squid -k reconfigure || (cp /etc/squid/old/* /etc/squid/; exit 1)     

The solution then uses the CloudWatch Agent on the Squid instances to collect and store Squid logs in Amazon CloudWatch Logs. The log group /filtering-nat-instance/cache.log contains the error and debug messages that Squid generates and /filtering-nat-instance/access.log contains the access logs.

An access log record is a space-delimited string that has the following format:

<time> <response_time> <client_ip> <status_code> <size> <method> <url> <sni> <remote_host> <mime>

The following table describes the fields of an access log record.

timeRequest time in seconds since epoch
response_timeResponse time in milliseconds
client_ipClient source IP address
status_codeSquid request status and HTTP response code sent to the client. For example, a HTTP request to an unallowed domain logs TCP_DENIED/403, and a HTTPS request to a whitelisted domain logs TCP_TUNNEL/200
sizeTotal size of the response sent to client
methodRequest method like GET or POST.
urlRequest URL received from the client. Logged for HTTP requests only
sniDomain name intercepted in the SNI. Logged for HTTPS requests only
remote_hostSquid hierarchy status and remote host IP address
mimeMIME content type. Logged for HTTP requests only

The following are some examples of access log records:

1563718817.184 14 TCP_DENIED/403 3822 GET http://example.com/ - HIER_NONE/- text/html
1563718821.573 7 TAG_NONE/200 0 CONNECT example.com HIER_NONE/- -
1563718872.923 32 TCP_TUNNEL/200 22927 CONNECT calculator.s3.amazonaws.com ORIGINAL_DST/ –   

Designing a high availability solution

The Squid instances introduce a single point of failure for the private subnets. If a Squid instance fails, the instances in its associated private subnet cannot send outbound traffic anymore. The following diagram illustrates the architecture that I propose to address this situation within an Availability Zone.

Figure 2: The architecture to address if a Squid instance fails within an Availability Zone

Figure 2: The architecture to address if a Squid instance fails within an Availability Zone

Each Squid instance is launched in an Amazon EC2 Auto Scaling group that has a minimum size and a maximum size of one instance. A shell script is run at startup to configure the instances. That includes installing and configuring Squid (see Running Commands on Your Linux Instance at Launch).

The solution uses the CloudWatch Agent and its procstat plugin to collect the CPU usage of the Squid process every 10 seconds. For each Squid instance, the solution creates a CloudWatch alarm that watches this custom metric and goes to an ALARM state when a data point is missing. This can happen, for example, when Squid crashes or the Squid instance fails. Note that for my use case, I consider watching the Squid process a sufficient approach to determining the health status of a Squid instance, although it cannot detect eventual cases of the Squid process being alive but unable to forward traffic. As a workaround, you can use an end-to-end monitoring approach, like using witness instances in the private subnets to send test requests at regular intervals and collect the custom metric.

When an alarm goes to ALARM state, CloudWatch sends a notification to an Amazon Simple Notification Service (SNS) topic which then triggers an AWS Lambda function. The Lambda function marks the Squid instance as unhealthy in its Auto Scaling group, retrieves the list of healthy Squid instances based on the state of other CloudWatch alarms, and updates the route tables that currently route traffic to the unhealthy Squid instance to instead route traffic to the first available healthy Squid instance. While the Auto Scaling group automatically replaces the unhealthy Squid instance, private instances can send outbound traffic through the Squid instance in the other Availability Zone.

When the CloudWatch agent starts collecting the custom metric again on the replacement Squid instance, the alarm reverts to OK state. Similarly, CloudWatch sends a notification to the SNS topic, which then triggers the Lambda function. The Lambda function completes the lifecycle action (see Amazon EC2 Auto Scaling Lifecycle Hooks) to indicate that the replacement instance is ready to serve traffic, and updates the route table associated to the private subnet in the same availability zone to route traffic to the replacement instance.

Implementing and testing the solution

Now that you understand the architecture behind this solution, you can follow the instructions in this section to implement and test the solution in your AWS account.

Implementing the solution

First, you’ll use AWS CloudFormation to provision the required resources. Select the Launch Stack button below to open the CloudFormation console and create a stack from the template. Then, follow the on-screen instructions.

Select this image to open a link that starts building the CloudFormation stack

CloudFormation will create the following resources:

  • An Amazon Virtual Private Cloud (Amazon VPC) with an internet gateway attached.
  • Two public subnets and two private subnets on the Amazon VPC.
  • Three route tables. The first route table is associated to the public subnets to make them publicly accessible. The other two route tables are associated to the private subnets.
  • An S3 bucket to store the Squid configuration files, and two Lambda-based custom resources to add the files squid.conf and whitelist.txt to this bucket.
  • An IAM role to grant the Squid instances permissions to read from the S3 bucket and use the CloudWatch agent.
  • A security group to allow HTTP and HTTPS traffic from instances in the private subnets.
  • A launch configuration to specify the template of Squid instances. That includes commands to run at startup for automating the initial configuration.
  • Two Auto Scaling groups that use this launch configuration to launch the Squid instances.
  • A Lambda function to redirect the outbound traffic and recover a Squid instance when it fails.
  • Two CloudWatch alarms to watch the custom metric sent by Squid instances and trigger the Lambda function when the health status of Squid instances changes.
  • An EC2 instance in the first private subnet to test the solution, and an IAM role to grant this instance permissions to use the SSM agent. Session Manager, which I introduce in the next paragraph, uses this SSM agent (see Working with SSM Agent)

Testing the solution

After the stack creation has completed (it can take up to 10 minutes), connect onto the Testing Instance using Session Manager, a capability of AWS Systems Manager that lets you manage instances through an interactive shell without the need to open an SSH port:

  1. Open the AWS Systems Manager console.
  2. In the navigation pane, choose Session Manager.
  3. Choose Start Session.
  4. For Target instances, choose the option button to the left of Testing Instance.
  5. Choose Start Session.

Note: Session Manager makes calls to several AWS endpoints (see Working with SSM Agent). If you prefer to restrict access to a defined set of AWS services, make sure to whitelist the associated domains.

After the connection is made, you can test the solution with the following commands. Only the last three requests should return a valid response, because Squid allows traffic to *.amazonaws.com only.

curl http://www.amazon.com
curl https://www.amazon.com
curl http://calculator.s3.amazonaws.com/index.html
curl https://calculator.s3.amazonaws.com/index.html
aws ec2 describe-regions --region us-east-1         

To find the requests you just made in the access logs, here’s how to browse the Squid logs in Amazon CloudWatch Logs:

  1. Open the Amazon CloudWatch console.
  2. In the navigation pane, choose Logs.
  3. For Log Groups, choose the log group /filtering-nat-instance/access.log.
  4. Choose Search Log Group to view and search log records.

To test how the solution behaves when a Squid instance fails, you can terminate one of the Squid instances manually in the Amazon EC2 console. Then, watch the CloudWatch alarm change its state in the Amazon CloudWatch console, or watch the solution change the default route of the impacted route table in the Amazon VPC console.

You can now delete the CloudFormation stack to clean up the resources that were just created.

Discussion: Transparent or forward proxy?

The solution that I describe in this blog is fully transparent for instances in the private subnets, which means that instances don’t need to be aware of the proxy and can make requests as if they were behind a standard NAT instance. An alternate solution is to deploy a forward proxy in your Amazon VPC and configure instances in private subnets to use it (see the blog post How to set up an outbound VPC proxy with domain whitelisting and content filtering for an example). In this section, I discuss some of the differences between the two solutions.


A major drawback with forward proxies is that the proxy must be explicitly configured on every instance within the private subnets. For example, you can configure the HTTP_PROXY and HTTPS_PROXY environment variables on Linux instances, but some applications or services, like yum, require their own proxy configuration, or don’t support proxy usage. Note also that some AWS services and features, like Amazon EMR or Amazon SageMaker notebook instances, don’t support using a forward proxy at the time of this post. However, with TLS 1.3, a forward proxy is the only option to restrict outbound traffic if the SNI is encrypted.


Deploying a forward proxy on AWS usually consists of a load balancer distributing traffic to a set of proxy instances launched in an Auto Scaling group. Proxy instances can be launched or terminated dynamically depending on the demand (also known as “horizontal scaling”). With forward proxies, each route table can route traffic to a single instance at a time, and changing the type of the instance is the only way to increase or decrease the capacity (also known as “vertical scaling”).

The solution I present in this post does not dynamically adapt the instance type of the Squid instances based on the demand. However, you might consider a mechanism in which the traffic from a private subnet is temporarily redirected through another Availability Zone while the Squid instance is being relaunched by Auto Scaling with a smaller or larger instance type.


Deploying a centralized proxy solution and using it across multiple VPCs is a way of reducing cost and operational complexity.

With a forward proxy, instances in private subnets send IP packets to the proxy load balancer. Therefore, sharing a forward proxy across multiple VPCs only requires connectivity between the “instance VPCs” and a proxy VPC that has VPC Peering or equivalent capabilities.

With a transparent proxy, instances in private subnets sends IP packets to the remote host. VPC Peering does not support transitive routing (see Unsupported VPC Peering Configurations) and cannot be used to share a transparent proxy across multiple VPCs. However, you can now use an AWS Transit Gateway that acts as a network transit hub to share a transparent proxy across multiple VPCs. I give an example in the next section.

Sharing the solution across multiple VPCs using AWS Transit Gateway

In this section, I give an example of how to share a transparent proxy across multiple VPCs using AWS Transit Gateway. The architecture is illustrated in the following diagram. For the sake of simplicity, the diagram does not include Availability Zones.

Figure 3: The architecture for a transparent proxy across multiple VPCs using AWS Transit Gateway

Figure 3: The architecture for a transparent proxy across multiple VPCs using AWS Transit Gateway

Here’s how instances in the private subnet of “VPC App” can make requests via the shared transparent proxy in “VPC Shared:”

  1. When instances in VPC App make HTTP/S requests, the network packets they send have the public IP address of the remote host as the destination address. These packets are forwarded to the transit gateway, based on the route table associated to the private subnet.
  2. The transit gateway receives the packets and forwards them to VPC Shared, based on the default route of the transit gateway route table.
  3. Note that the transit gateway attachment resides in the transit gateway subnet. When the packets arrive in VPC Shared, they are forwarded to the Squid instance because the next destination has been determined based on the route table associated to the transit gateway subnet.
  4. The Squid instance makes requests on behalf of the source instance (“Instances” in the schema). Then, it sends the response to the source instance. The packets that it emits have the IP address of the source instance as the destination address and are forwarded to the transit gateway according to the route table associated to the public subnet.
  5. The transit gateway receives and forwards the response packets to VPC App.
  6. Finally, the response reaches the source instance.

In a high availability deployment, you could have one transit gateway subnet per Availability Zone that sends traffic to the Squid instance that resides in the same Availability Zone, or to the Squid instance in another Availability Zone if the instance in the same Availability Zone fails.

You could also use AWS Transit Gateway to implement a transparent proxy solution that scales horizontally. This allows you to add or remove proxy instances based on the demand, instead of changing the instance type. With this approach, you must deploy a fleet of proxy instances – launched by an Auto Scaling group, for example – and mount a VPN connection between each instance and the transit gateway. The proxy instances need to support ECMP (“Equal Cost Multipath routing”; see Transit Gateways) to equally spread the outbound traffic between instances. I don’t describe this alternative architecture further in this blog post.


In this post, I’ve shown how you can use Squid to implement a high availability solution that filters outgoing traffic to the Internet and helps meet your security and compliance needs, while being fully transparent for the back-end instances in your VPC. I’ve also discussed the key differences between transparent proxies and forward proxies. Finally, I gave an example of how to share a transparent proxy solution across multiple VPCs using AWS Transit Gateway.

If you have any questions or suggestions, please leave a comment below or on the Amazon VPC forum.

If you have feedback about this blog post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

Nicolas Malaval

Nicolas is a Solution Architect for Amazon Web Services. He lives in Paris and helps our healthcare customers in France adopt cloud technology and innovate with AWS. Before that, he spent three years as a Consultant for AWS Professional Services, working with enterprise customers.

Announcing improved VPC networking for AWS Lambda functions

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/announcing-improved-vpc-networking-for-aws-lambda-functions/

We’re excited to announce a major improvement to how AWS Lambda functions work with your Amazon VPC networks. With today’s launch, you will see dramatic improvements to function startup performance and more efficient usage of elastic network interfaces. These improvements are rolling out to all existing and new VPC functions, at no additional cost. Roll out begins today and continues gradually over the next couple of months across all Regions.

Lambda first supported VPCs in February 2016, allowing you to access resources in your VPCs or on-premises systems using an AWS Direct Connect link. Since then, we’ve seen customers widely use VPC connectivity to access many different services:

  • Relational databases such as Amazon RDS
  • Datastores such as Amazon ElastiCache for Redis or Amazon Elasticsearch Service
  • Other services running in Amazon EC2 or Amazon ECS

Today’s launch brings enhancements to how this connectivity between Lambda and your VPCs works.

Previous Lambda integration model for VPCs

It helps to understand some basics about the way that networking with Lambda works. Today, all of the compute infrastructure for Lambda runs inside of VPCs owned by the service.Lambda runs in a VPC controlled by the service

When you invoke a Lambda function from any invocation source and with any execution model (synchronous, asynchronous, or poll based), it occurs through the Lambda API. There is no direct network access to the execution environment where your functions run:Invoke access only through the Lambda API

By default, when your Lambda function is not configured to connect to your own VPCs, the function can access anything available on the public internet such as other AWS services, HTTPS endpoints for APIs, or services and endpoints outside AWS. The function then has no way to connect to your private resources inside of your VPC.

When you configure your Lambda function to connect to your own VPC, it creates an elastic network interface in your VPC and then does a cross-account attachment. These network interfaces allow network access from your Lambda functions to your private resources. These Lambda functions continue to run inside of the Lambda service’s VPC and can now only access resources over the network through your VPC.

All invocations for functions continue to come from the Lambda service’s API and you still do not have direct network access to the execution environment. The attached network interface exists inside of your account and counts towards limits that exist in your account for that Region.lambda mapped to a customer eni

With this model, there are a number of challenges that you might face. The time taken to create and attach new network interfaces can cause longer cold-starts, increasing the time it takes to spin up a new execution environment before your code can be invoked.

As your function’s execution environment scales to handle an increase in requests, more network interfaces are created and attached to the Lambda infrastructure. The exact number of network interfaces created and attached is a factor of your function configuration and concurrency.many ENIs mapped to Lambda environments

Every network interface created for your function is associated with and consumes an IP address in your VPC subnets. It counts towards your account level maximum limit of network interfaces.

As your function scales, you have to be mindful of several issues:

  • Managing the IP address space in your subnets
  • Reaching the account level network interface limit
  • The potential to hit the API rate limit on creating new network interfaces

What’s changing

Starting today, we’re changing the way that your functions connect to your VPCs. AWS Hyperplane, the Network Function Virtualization platform used for Network Load Balancer and NAT Gateway, has supported inter-VPC connectivity for offerings like AWS PrivateLink, and we are now leveraging Hyperplane to provide NAT capabilities from the Lambda VPC to customer VPCs.

The Hyperplane ENI is a managed network resource that the Lambda service controls, allowing multiple execution environments to securely access resources inside of VPCs in your account. Instead of the previous solution of mapping network interfaces in your VPC directly to Lambda execution environments, network interfaces in your VPC are mapped to the Hyperplane ENI and the functions connect using it.

Hyperplane still uses a cross account attached network interface that exists in your VPC, but in a greatly improved way:

  • The network interface creation happens when your Lambda function is created or its VPC settings are updated. When a function is invoked, the execution environment simply uses the pre-created network interface and quickly establishes a network tunnel to it. This dramatically reduces the latency that was previously associated with creating and attaching a network interface at cold start.
  • Because the network interfaces are shared across execution environments, typically only a handful of network interfaces are required per function. Every unique security group:subnet combination across functions in your account requires a distinct network interface. If a combination is shared across multiple functions in your account, we reuse the same network interface across functions.
  • Your function scaling is no longer directly tied to the number of network interfaces and Hyperplane ENIs can scale to support large numbers of concurrent function executions

Hyperplane ENIs are tied to a security group:subnet combination in your account. Functions in the same account that share the same security group:subnet pairing use the same network interfaces. This way, a single application with multiple functions but the same network and security configuration can benefit from the existing interface configuration.

Several things do not change in this new model:

  • Your Lambda functions still need the IAM permissions required to create and delete network interfaces in your VPC.
  • You still control the subnet and security group configurations of these network interfaces. You can continue to apply normal network security controls and follow best practices on VPC configuration.
  • You still have to use a NAT device(for example VPC NAT Gateway) to give a function internet access or use VPC endpoints to connect to services outside of your VPC.
  • Nothing changes about the types of resources that your functions can access inside of your VPCs.

As mentioned earlier, Hyperplane now creates a shared network interface when your Lambda function is first created or when its VPC settings are updated, improving function setup performance and scalability. This one-time setup can take up to 90 seconds to complete. If your Lambda function is invoked while the shared network interface is still being created, the invocation still succeeds but may see longer cold-start times. When setup is complete, your Lambda function no longer requires network interface creation at invocation time. If Lambda functions in an account go idle for consecutive weeks, the service will reclaim the unused Hyperplane resources and so very infrequently invoked functions may still see longer cold-start times.

As I mentioned earlier, you don’t have to enable this new capability yourself, and there are no new associated costs. Starting today, we are gradually rolling this out across all AWS Regions over the next couple of months. We will update this post on a Region by Region basis after the rollout has completed in a given Region.

If you have existing VPC-configured Lambda workloads, you will soon see your applications scale and respond to new invocations more quickly. You can see this change using tools like AWS X-Ray or those from Lambda’s third party partners.

The following screenshots show a before/after example taken with AWS X-Ray:

The first shows a function that was launched in a VPC, with a full cold start with network interface creation at invocation.

In this second image, you see the same function executed but this time in an account that has the new VPC networking capability. You can see the massive difference that this has made in function execution duration, with it dropping to 933 ms from 14.8 seconds!


On the AWS Lambda team, we consider “Simplicity” a core tenet to how we think about building services for our customers. We’re constantly evaluating ways that we can improve and simplify the experience that you see in building serverless applications with Lambda. These changes in how we connect with your VPCs improve the performance and scale for your Lambda functions. They enable you to harness the full power of serverless architectures.

How to set up an outbound VPC proxy with domain whitelisting and content filtering

Post Syndicated from Vesselin Tzvetkov original https://aws.amazon.com/blogs/security/how-to-set-up-an-outbound-vpc-proxy-with-domain-whitelisting-and-content-filtering/

Controlling outbound communication from your Amazon Virtual Private Cloud (Amazon VPC) to the internet is an important part of your overall preventive security controls. By limiting outbound traffic to certain trusted domains (called “whitelisting”) you help prevent instances from downloading malware, communicating with bot networks, or attacking internet hosts. It’s not practical to prevent all outbound web traffic, though. Often, you want to allow access to certain well-known domains (for example, to communicate with partners, to download software updates, or to communicate with AWS API endpoints). In this post, I’ll show you how to limit outbound web connections from your VPC to the internet, using a web proxy with custom domain whitelists or DNS content filtering services. The solution is scalable, highly available, and deploys in a fully automated way.

Solution benefits and deliverables

This solution is based on the open source HTTP proxy Squid. The proxy can be used for all workloads running in the VPC, like Amazon Elastic Compute Cloud (EC2) and AWS Fargate. The solution provides you with the following benefits:

  • An outbound proxy that permit connections to whitelisted domains that you define, while presenting customizable error messages when connections are attempted to unapproved domains.
  • Optional domain content filtering based on DNS, delivered by external services like OpenDNS, Quad9, CleanBrowsing, Yandex.DNS or others. For this option, you do need to be a customer of these external services.
  • Transparent encryption handling, due to the extraction of the domain information from the Server Name Indication (SNI) extension in TLS. Encryption in transit is preserved and end-to-end encryption is maintained.
  • An auto-scaling group with Elastic Load Balancing (ELB) Network Load Balancers that spread over several of your existing subnets (and Availability Zones) and scale based on CPU load.
  • One Elastic IP address per proxy instance for internet communication. Sometimes the web sites that you’re communicating want to know your IP address so they can accept traffic from you. Giving the proxies’ elastic IP addresses allows you to know what IP addresses your web connections will come from.
  • Proxy access logs delivered to CloudWatch Logs.
  • Proxy metrics, available in CloudWatch Metrics.
  • Automated solution deployment via AWS CloudFormation.

Out of scope

  • This solution does not serve applications that aren’t proxy capable. Deep packet inspection is also out of scope.
  • TLS encryption is kept end-to-end, and only the SNI extension is examined. For unencrypted traffic (HTTP), only the host header is analyzed.
  • DNS content filtering must be delivered by an external provider; this solution only integrates with it.

Services used, cost, and performance

The solution uses the following services:

In total, the solution costs a few dollars per day depending on the region and the bandwidth usage. If you are using a DNS filtering service, you may also be charged by the service provider.

Note: An existing VPC and internet gateway are prerequisites to this solution, and aren’t included in the pricing calculations.

Solution architecture


Figure 1: Solution overview

Figure 1: Solution overview

As shown in Figure 1:

  1. The solution is deployed automatically via an AWS CloudFormation template.
  2. CloudWatch Logs stores the Squid access log so that you can search and analyze it.
  3. The list of allowed (whitelisted) domains is stored in AWS Secrets Manager. The Amazon EC2 instance retrieves the domain list every 5 minutes via cronjob and updates the proxy configuration if the list has changed. The values in Secrets Manager are provisioned by CloudFormation and can be read only by the proxy EC2 instances.
  4. The client running on the EC2 instance must have proxy settings pointing toward the Network Load Balancer. The load balancer will forward the request to the fleet of proxies in the target group.


  1. You need an already deployed VPC, with public and private subnets spreading over several Availability Zones (AZs). You can find a description of how to set up your VPC environment at Default VPC Setup.
  2. You must have an internet gateway, with routing set up so that only traffic from a public subnet can reach the internet.

You don’t need to have a NAT (network translation address) gateway deployed since this function will be provided by the outbound proxy.

Integration with content filtering DNS services

If you require content filtering from an external company, like OpenDNS or Yandex.DNS, you must register and become a customer of that service. Many have free services, in addition to paid plans if you need advanced statistics and custom categories. This is your responsibility as the customer. (Learn more about the shared responsibility between AWS and the customer.)

Your DNS service provider will assign you a list of DNS IP addresses. You’ll need to enter the IP addresses when you provision (see Installation below).

If the DNS provider requires it, you may give them the source IPs of the proxies. There are four reserved IPs that you can find in the stack output (see Output parameters below).

Installation (one-time setup)

    1. Select the Launch Stack button to launch the CloudFormation template:
      The "Launch Stack" button

      Note: You must sign in your AWS Account in order to launch the stack in the required region. The stack content can also be downloaded here.

    2. Provide the following proxy parameters, as shown in Figure 2:
      • Allowed domains: Enter your whitelisted domains. Use a leading dot (“.”) to indicate subdomains.
      • Custom DNS servers (optional): List any DNS servers that will be used by the proxy. Leave the default value to use the default Amazon DNS server.
      • Proxy Port: Enter the listener port of the proxy.
      • Instance Type: Enter the EC2 instance type that you want to use for the proxies. Instance type will affect vertical scaling capabilities and solution cost. For more information, see Amazon EC2 Instance Types.
      • AMI ID to be used: This field is prepopulated with the Amazon Machine Image (AMI) ID found in AWS Systems Manager Parameter Store. By default, it will point toward the latest Amazon Linux 2 image. You do not need to adjust this value.
      • SSH Key name (optional): Enter the name of the SSH key for your proxy EC2 instances. This is relevant only for debugging, or if you need to log in on the proxy servers. Consider using AWS Systems Manager Session Manager instead of SSH.
    3. Next, provide the following network parameters, as shown in Figure 2:
      • VPC ID: The VPC where the solution will be deployed.
      • Public subnets: The subnets where the proxies will be deployed. Select between 2 and 3 subnets.
      • Private subnets: The subnets where the Network Load Balancer will be deployed. Select between 2 and 3 subnets.
      • Allowed client CIDR: The value you enter here will be added to the proxy security group. By default, the private IP range is allowed. The allowed block size is between a /32 netmask and an /8 netmask. This prevents you from using an open IP range like If you were to set an open IP range, your proxies would accept traffic from anywhere on the internet, which is a bad practice.


Figure 2: Launching the CloudFormation template

Figure 2: Launching the CloudFormation template


  • When you’ve entered all your proxy and network parameters, select Next. On the following wizard screens, you can keep the default values and select Next and Create Stack.


Find the output parameters

After the stack status has changed to “deployed,” you’ll need to note down the output parameters to configure your clients. Look for the following parameters in the Outputs tab of the stack:

  • The domain name of the proxy that should be configured on the client
  • The port of the proxy that should be configured on the client
  • 4 Elastic IP addresses for the proxy’s instances. These are used for outbound connections to Internet.
  • The CloudWatch Log Group, for access logs.
  • The Security Group that is attached to the proxies.
  • The Linux command to set the proxy. You can copy and paste this to your shell.
Figure 3: Stack output parameters

Figure 3: Stack output parameters

Use the proxy

Proxy setting parameters are specific to every application. Most Linux application use the environment variables http_proxy and https_proxy.

    1. Log in on the Linux EC2 instance that’s allowed to use the proxy.
    2. To set the shell parameter temporarily (only for the current shell session), execute the following export commands:
          [[email protected] ~]$ export http_proxy=http://<Proxy-DOMAIN>:<Proxy-Port>
          [[email protected] ~]$ export https_proxy=$http_proxy

      1. Replace <Proxy-DOMAIN> with the domain of the load balancer, which you can find in the stack output parameter.
      2. Replace <Proxy-Port> with the port of your proxy, which is also listed in the stack output parameter.


  1. Next, you can use cURL (for example) to test the connection. Replace <URL> with one of your whitelisted URLs:
            [[email protected] ~]$ curl -k <URL> -k                                                                
            <!DOCTYPE html>

  2. You can add the proxy parameter permanently to interactive and non-interactive shells. If you do this, you won’t need to set them again after reloading. Execute the following commands in your application shell:
            [[email protected] ~]$ echo 'export http_proxy=http://<Proxy-DOMAIN>:<Proxy-Port>' >> ~/.bashrc
            [[email protected] ~]$ echo 'export https_proxy=$http_proxy' >> ~/.bashrc
            [[email protected] ~]$ echo 'export http_proxy=http://<Proxy-DOMAIN>:<Proxy-Port>' >> ~/.bash_profile
            [[email protected] ~]$ echo 'export https_proxy=$http_proxy' >> ~/.bash_profile

    1. Replace <Proxy-DOMAIN> with the domain of the load balancer.
    2. Replace <Proxy-Port> with the port of your proxy.

Customize the access denied page

An error page will display when a user’s access is blocked or if there’s an internal error. You can adjust the look and feel of this page (HTML or styles) according to the Squid error directory tag.

Use the proxy access log

The proxy access log is an important tool for troubleshooting. It contains the client IP address, the destination domain, the port, and errors with timestamps. The access logs from Squid are uploaded to CloudWatch. You can find them from the CloudWatch console under Log Groups, with the prefix Proxy, as shown in the figure below.

Figure 4: CloudWatch log with access group

Figure 4: CloudWatch log with access group

You can use CloudWatch Insight to analyze and visualize your queries. See the following figure for an example of denied connections visualized on a timeline:

Figure 5: Access logs analysis with CloudWatch Insight

Figure 5: Access logs analysis with CloudWatch Insight

Monitor your metrics with CloudWatch

The main proxy metrics are upload every five minutes to CloudWatch Metrics in the proxy namespace:

  • client_http.errors /sec – errors in processing client requests per second
  • client_http.hits /sec – cache hits per second
  • client_http.kbytes_in /sec – client uploaded data per second
  • client_http.kbytes_out /sec – client downloaded data per second
  • client_http.requests /sec – number of requests per second
  • server.all.errors /sec – proxy server errors per second
  • server.all.kbytes_in /sec – proxy server uploaded data per second
  • server.all.kbytes_out /sec – proxy downloaded data per second
  • server.all.requests /sec – all requests sent by proxy server per second

In the figure below, you can see an example of metrics. For more information on metric use, see the Squid project information.

Figure 6: Example of CloudWatch metrics

Figure 6: Example of CloudWatch metrics

Manage the proxy configuration

From time to time, you may want to add or remove domains from the whitelist. To change your whitelisted domains, you must update the input values in the CloudFormation stack. This will cause the values stored in Secrets Manager to update as well. Every five minutes, the proxies will pull the list from Secrets Manager and update as needed. This means it can take up to five minutes for your change to propagate. The change will be propagated to all instances without terminating or deploying them.

Note that when the whitelist is updated, the Squid proxy processes are restarted, which will interrupt ALL connections passing through them at that time. This can be disruptive, so be careful about when you choose to adjust the whitelist.

If you want to change other CloudFormation parameters, like DNS or Security Group settings, you can again update the CloudFormation stack with new values. The CloudFormation stack will launch a new instance and terminate legacy instances (a rolling update).

You can change the proxy Squid configuration by editing the CloudFormation template (section AWS::CloudFormation::Init) and updating the stack. However, you should not do this unless you have advanced AWS and Squid experience.

Update the instances

To update your AMI, you can update the stack. If the AMI has been updated with a newer version, then a rolling update will redeploy the EC2 instances and Squid software. This automates the process of patching managed instances with both security-related and other updates. If the AMI has not changed, no update will be performed.

Alternately, you can terminate the instance, and the auto scaling group will launch a new instance with the latest updates for Squid and the OS, starting from scratch. This approach may lead to a short service interruption for the clients served on this instance, during the time in which the load balancer is switching to an active instance.


I’ve summarized a few common problems and solutions below.

I receive timeout at client application.
  • Check that you’ve configured the client application to use the proxy. (See Using a proxy, above.)
  • Check that the Security Group allows access from the client instance.
  • Verify that your NACL and routing table allow communication to and from the Network Load Balancer.
I receive an error page that access was blocked by the administrator.Check the stack input parameter for allowed domains. The domains must be comma separated. Included subdomains must start with dot. For example:

  • To include www.amazon.com, specify www.amazon.com
  • To include all subdomains of amazon.com as part of a list, specify .amazon.com
I received a 500 error page from the proxy.
  • Make sure that the proxy EC2 instance has internet access. The public subnets must have an Internet Gateway connected and set as the default route.
  • Check the DNS input parameter in the CloudFormation stack, if you use an external DNS service. Make sure the DNS provider has the correct proxy IPs (if you were required to provide them
The webpage doesn’t look as expected. There are fragments or styles missing.Many pages download content from multiple domains. You need to whitelist all of these domains. Use the access logs in CloudWatch Log to determine which domains are blocked, then update the stack.
On the proxy error page, I receive “unknown certificate issuer.”During the setup, a self-signed certificate for the squid error page is generated. If you need to add your own certificate, you can adapt the CloudFormation template. This requires moderate knowledge of Unix/Linux and AWS CloudFormation.


In this blog post, I showed you how you can configure an outbound proxy for controlling the internet communication from a VPC. If you need Squid support, you can find various offerings on the Squid Support page. AWS forums provides support for Amazon Elastic Compute Cloud (EC2). When you need AWS experts to help you plan, build, or optimise your infrastructure, consider engaging AWS Professional Services.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Vesselin Tzvetkov

Vesselin is senior security consultant at AWS Professional Services and is passionate about security architecture and engineering innovative solutions. Outside of technology, he likes classical music, philosophy, and sports. He holds a Ph.D. in security from TU-Darmstadt and a M.S. in electrical engineering from Bochum University in Germany.

New – VPC Traffic Mirroring – Capture & Inspect Network Traffic

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-vpc-traffic-mirroring/

Running a complex network is not an easy job. In addition to simply keeping it up and running, you need to keep an ever-watchful eye out for unusual traffic patterns or content that could signify a network intrusion, a compromised instance, or some other anomaly.

VPC Traffic Mirroring
Today we are launching VPC Traffic Mirroring. This is a new feature that you can use with your existing Virtual Private Clouds (VPCs) to capture and inspect network traffic at scale. This will allow you to:

Detect Network & Security Anomalies – You can extract traffic of interest from any workload in a VPC and route it to the detection tools of your choice. You can detect and respond to attacks more quickly than is possible with traditional log-based tools.

Gain Operational Insights – You can use VPC Traffic Mirroring to get the network visibility and control that will let you make security decisions that are better informed.

Implement Compliance & Security Controls – You can meet regulatory & compliance requirements that mandate monitoring, logging, and so forth.

Troubleshoot Issues – You can mirror application traffic internally for testing and troubleshooting. You can analyze traffic patterns and proactively locate choke points that will impair the performance of your applications.

You can think of VPC Traffic Mirroring as a “virtual fiber tap” that gives you direct access to the network packets flowing through your VPC. As you will soon see, you can choose to capture all traffic or you can use filters to capture the packets that are of particular interest to you, with an option to limit the number of bytes captured per packet. You can use VPC Traffic Mirroring in a multi-account AWS environment, capturing traffic from VPCs spread across many AWS accounts and then routing it to a central VPC for inspection.

You can mirror traffic from any EC2 instance that is powered by the AWS Nitro system (A1, C5, C5d, M5, M5a, M5d, R5, R5a, R5d, T3, and z1d as I write this).

Getting Started with VPC Traffic Mirroring
Let’s review the key elements of VPC Traffic Mirroring and then set it up:

Mirror Source – An AWS network resource that exists within a particular VPC, and that can be used as the source of traffic. VPC Traffic Mirroring supports the use of Elastic Network Interfaces (ENIs) as mirror sources.

Mirror Target – An ENI or Network Load Balancer that serves as a destination for the mirrored traffic. The target can be in the same AWS account as the Mirror Source, or in a different account for implementation of the central-VPC model that I mentioned above.

Mirror Filter – A specification of the inbound or outbound (with respect to the source) traffic that is to be captured (accepted) or skipped (rejected). The filter can specify a protocol, ranges for the source and destination ports, and CIDR blocks for the source and destination. Rules are numbered, and processed in order within the scope of a particular Mirror Session.

Traffic Mirror Session – A connection between a mirror source and target that makes use of a filter. Sessions are numbered, evaluated in order, and the first match (accept or reject) is used to determine the fate of the packet. A given packet is sent to at most one target.

You can set this up using the VPC Console, EC2 CLI, or the EC2 API, with CloudFormation support in the works. I’ll use the Console.

I already have ENI that I will use as my mirror source and destination (in a real-world use case I would probably use an NLB destination):

The MirrorTestENI_Source and MirrorTestENI_Destination ENIs are already attached to suitable EC2 instances. I open the VPC Console and scroll down to the Traffic Mirroring items, then click Mirror Targets:

I click Create traffic mirror target:

I enter a name and description, choose the Network Interface target type, and select my ENI from the menu. I add a Blog tag to my target, as is my practice, and click Create:

My target is created and ready to use:

Now I click Mirror Filters and Create traffic mirror filter. I create a simple filter that captures inbound traffic on three ports (22, 80, and 443), and click Create:

Again, it is created and ready to use in seconds:

Next, I click Mirror Sessions and Create traffic mirror session. I create a session that uses MirrorTestENI_Source, MainTarget, and MyFilter, allow AWS to choose the VXLAN network identifier, and indicate that I want the entire packet mirrored:

And I am all set. Traffic from my mirror source that matches my filter is encapsulated as specified in RFC 7348 and delivered to my mirror target. I can then use tools like Suricata to capture, analyze, and visualize it.

Things to Know
Here are a couple of things to keep in mind:

Sessions Per ENI – You can have up to three active sessions on each ENI.

Cross-VPC – The source and target ENIs can be in distinct VPCs as long as they are peered to each other or connected through Transit Gateway.

Scaling & HA – In most cases you should plan to mirror traffic to a Network Load Balancer and then run your capture & analysis tools on an Auto Scaled fleet of EC2 instances behind it.

Bandwidth – The replicated traffic generated by each instance will count against the overall bandwidth available to the instance. If traffic congestion occurs, mirrored traffic will be dropped first.

Now Available
VPC Traffic Mirroring is available now and you can start using it today in all commercial AWS Regions except Asia Pacific (Sydney), China (Beijing), and China (Ningxia). Support for those regions will be added soon. You pay an hourly fee (starting at $0.015 per hour) for each mirror source; see the VPC Pricing page for more info.



New – Use an AWS Transit Gateway to Simplify Your Network Architecture

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/new-use-an-aws-transit-gateway-to-simplify-your-network-architecture/

It is safe to say that Amazon Virtual Private Cloud is one of the most useful and central features of AWS. Our customers configure their VPCs in a wide variety of ways, and take advantage of numerous connectivity options and gateways including AWS Direct Connect (via Direct Connect Gateways), NAT Gateways, Internet Gateways, Egress-Only Internet Gateways, VPC Peering, AWS Managed VPN Connections, and PrivateLink.

Because our customers benefit from the isolation and access control that they can achieve using VPCs, subnets, route tables, security groups, and network ACLs, they create a lot of them. It is not uncommon to find customers with hundreds of VPCs distributed across AWS accounts and regions in order to serve multiple lines of business, teams, projects, and so forth.

Things get a bit more complex when our customers start to set up connectivity between their VPCs. All of the connectivity options that I listed above are strictly point-to-point, so the number of VPC-to-VPC connections grows quickly.

New AWS Transit Gateway
Today we are giving you the ability to use the new AWS Transit Gateway to build a hub-and-spoke network topology. You can connect your existing VPCs, data centers, remote offices, and remote gateways to a managed Transit Gateway, with full control over network routing and security, even if your VPCs, Active Directories, shared services, and other resources span multiple AWS accounts. You can simplify your overall network architecture, reduce operational overhead, and gain the ability to centrally manage crucial aspects of your external connectivity, including security. Last but not least, you can use Transit Gateways to consolidate your existing edge connectivity and route it through a single ingress/egress point.

Transit Gateways are easy to set up and to use, and are designed to be highly scalable and resilient. You can attach up to 5000 VPCs to each gateway and each attachment can handle up to 50 Gbits/second of bursty traffic. You can attach your AWS VPN connections to a Transit Gateway today, with Direct Connect planned for early 2019.

Creating a Transit Gateway
This new feature makes use of the new AWS Resource Manager, a new AWS service that makes it really easy for you to share AWS resources across accounts. An account that owns a resource simply creates a Resource Share and specifies a list of other AWS accounts that can access the resource. Transit Gateways are one of the first resource types that you can share in this fashion, with many others on the roadmap. I’ll have a lot more to say about this in the future; for now think of it as separating the concepts of ownership and access for a given AWS resource.

The first step is to create a Transit Gateway in my AWS account. I open up the VPC Console (CLI, API, and CloudFormation support is also available), select Transit Gateways and click Create Transit Gateway to get started. I enter a name and a description, an ASN for the Amazon side of the gateway, and can choose to automatically accept sharing requests from other accounts:

My gateway is available within minutes:

Now I can attach it to a VPC and select the subnet where I have workloads (at most one subnet per AZ):

After that, I head over to the Resource Access Manager Console and click Create a resource share:

I name my share, find and add my Transit Gateway to it, and add the AWS accounts that I want to share it with. I can also choose to share it with an Organization or an Organizational Unit (OU). I make my choices and click Create resource share to proceed:

My resource share is created and ready to go within a few minutes:

Next, I log in to the accounts that I shared the gateway with, head back to the RAM Console, locate the invitation, and click it:

I confirm that I am accepting the desired invitation, and click Accept resource share:

I confirm my intent, and then I can see the Transit Gateway as a resource that is shared with me:

The next step (not shown) is to attach it to the desired VPCs in this account!

As you can see, you can use Transit Gateways to simplify your networking model. You can easily build applications that span multiple VPCs and you can share network services across them without having to manage a complex network. For example, you can go from this:

To this:

You can also connect a Transit Gateway to a firewall or an IPS (Intrusiong Prevention System) and create a single VPC that handles all ingress and egress traffic for your network.

Things to Know
Here are a couple of other things that you should know about VPC Transit Gateways:

AWS Integration – The Transit Gateways publish metrics to Amazon CloudWatch and also generate VPC Flow Logs records.

VPN ECMP Support -You can enable Equal-Cost Multi-Path (ECMP) support on your VPN connections. If the connections advertise the same CIDR blocks, traffic will be distributed equally across them.

Routing Domains – You can use multiple route tables on the same Transit Gateway and use them to control routing on a per-attachment basis. You can isolate VPC traffic or divert traffic from certain VPCs to a separate inspection domain.

Security – You can use VPC security groups and network ACLs to control the flow of traffic between your on-premises networks.

Pricing – You pay a per-hour fee for each hour that a Transit Gateway is attached, along with a per-GB data processing fee.

Direct Connect – We are working on support for AWS Direct Connect.

Available Now
AWS Transit Gateways are available now and you can start using them today!



Connecting to and running ETL jobs across multiple VPCs using a dedicated AWS Glue VPC

Post Syndicated from Nivas Shankar original https://aws.amazon.com/blogs/big-data/connecting-to-and-running-etl-jobs-across-multiple-vpcs-using-a-dedicated-aws-glue-vpc/

Many organizations use a setup that includes multiple VPCs based on the Amazon VPC service, with databases isolated in separate VPCs for security, auditing, and compliance purposes. This blog post shows how you can use AWS Glue to perform extract, transform, load (ETL) and crawler operations for databases located in multiple VPCs.

The solution presented here uses a dedicated AWS Glue VPC and subnet to perform the following operations on databases located in different VPCs:

  • Scenario 1: Ingest data from an Amazon RDS for MySQL database, transform it in AWS Glue, and output the results to an Amazon Redshift data warehouse.
  • Scenario 2: Ingest data from an Amazon RDS for MySQL database, transform it in AWS Glue, and output the results to an Amazon RDS for PostgreSQL database.

In this blog post, we’ll go through the steps needed to build an ETL pipeline that consumes from one source in one VPC and outputs it to another source in a different VPC. We’ll set up in multiple VPCs to reproduce a situation where your database instances are in multiple VPCs for isolation related to security, audit, or other purposes.

For this solution, we create a VPC dedicated to AWS Glue. Next, we set up VPC peering between the AWS Glue VPC and all of the other database VPCs. Then we configure an Amazon S3 endpoint, route tables, security groups, and IAM so that AWS Glue can function properly. Lastly, we create AWS Glue connections and an AWS Glue job to perform the task at hand.

Step 1: Set up a VPC

To simulate these scenarios, we create four VPCs with their respective IPv4 CIDR ranges. (Note: CIDR ranges can’t overlap when you use VPC peering.)

VPC 1Amazon Redshift172.31.0.0/16
VPC 2Amazon RDS for MySQL172.32.0.0/16
VPC 3Amazon RDS for PostgreSQL172.33.0.0/16
VPC 4AWS Glue172.30.0.0/16

Key configuration notes:

  1. The AWS Glue VPC needs at least one private subnet for AWS Glue to use.
  2. Ensure that DNS hostnames are enabled for all of your VPCs (unless you plan to refer to your databases by IP address later on, which isn’t recommended).

Step 2: Set up a VPC peering connection

Next, we peer our VPCs together to ensure that AWS Glue can communicate with all of the database targets and sources. This approach is necessary because AWS Glue resources are created with private addresses only. Thus, they can’t use an internet gateway to communicate with public addresses, such as public database endpoints. If your database endpoints are public, you can alternatively use a network address translation (NAT) gateway with AWS Glue rather than peer across VPCs.

Create the following peering connections.

Peer 1172.30.0.0/16- VPC 4172.31.0.0/16- VPC 1
Peer 2172.30.0.0/16- VPC 4172.32.0.0/16 -VPC 2
Peer 3172.30.0.0/16- VPC 4172.33.0.0/16- VPC 3

These peering connections can be across separate AWS Regions if needed. The database VPCs are not peered together; they are all peered with the AWS Glue VPC instead. We do this because AWS Glue connects to each database from its own VPC. The databases don’t connect to each other.

Key configuration notes:

  1. Create a VPC peering connection, as described in the Amazon VPC documentation. Select the AWS Glue VPC as the requester and the VPC for your database as the accepter.
  2. Accept the VPC peering request. If you are peering to a different AWS Region, switch to that AWS Region to accept the request.

Important: Enable Domain Name Service (DNS) settings for each of the peering connections. Doing this ensures that AWS Glue can retrieve the private IP address of your database endpoints. Otherwise, AWS Glue resolves your database endpoints to public IP addresses. AWS Glue can’t connect to public IP addresses without a NAT gateway.

Step 3: Create an Amazon S3 endpoint for the AWS Glue subnet

We need to add an Amazon S3 endpoint to the AWS Glue VPC (VPC 4). During setup, associate the endpoint with the route table that your private subnet uses. For more details on creating an S3 endpoint for AWS Glue, see Amazon VPC Endpoints for Amazon S3 in the AWS Glue documentation.

AWS Glue uses S3 to store your scripts and temporary data to load into Amazon Redshift.

Step 4: Create a route table configuration

Add the following routes to the route tables used by the respective services’ subnets. These routes are configured along with existing settings.

VPC 4—AWS GlueDestinationTarget
Route table172.33.0.0/16- VPC 3Peer 3 VPC 1Peer 1 VPC 2Peer 2


VPC 1—Amazon RedshiftDestinationTarget
Route table172.30.0.0/16- VPC 4Peer 1


VPC 2—Amazon RDS MySQLDestinationTarget
Route table172.30.0.0/16- VPC 4Peer 2


VPC 3—Amazon RDS PostgreSQLDestinationTarget
Route table172.30.0.0/16- VPC 4Peer 3

Key configuration notes:

  • The route table for the AWS Glue VPC has peering connections to all VPCs. It has these so that AWS Glue can initiate connections to all of the databases.
  • All of the database VPCs have a peering connection back to the AWS Glue VPC. They have these connections to allow return traffic to reach AWS Glue.
  • Ensure that your S3 endpoint is present in the route table for the AWS Glue VPC.

Step 5: Update the database security groups

Each database’s security group must allow traffic to its listening port (3306, 5432, 5439, and so on) from the AWS Glue VPC for AWS Glue to be able to connect to it. It’s also a good idea to restrict the range of source IP addresses as much as possible.

There are two ways to accomplish this. If your AWS Glue job will be in the same AWS Region as the resource, you can define the source as the security group that you use for AWS Glue. If you are using AWS Glue to connect across AWS Regions, specify the IP range from the private subnet in the AWS Glue VPC instead. The examples following use a security group as our AWS Glue job, and data sources are all in the same AWS Region.

In addition to configuring the database’s security groups, AWS Glue requires a special security group that allows all inbound traffic from itself. Because it isn’t secure to allow traffic from, we create a self-referencing rule that simply allows all traffic originating from the security group. You can create a new security group for this purpose, or you can modify an existing security group. In the example following, we create a new security group to use later when AWS Glue connections are created.

The security group Amazon RDS for MySQL needs to allow traffic from AWS Glue:

Amazon RDS for PostgreSQL allows traffic to its listening port from the same:

Amazon Redshift does it as so:

AWS Glue does it as so:

Step 6: Set up IAM

Make sure that you have an AWS Glue IAM role with access to Amazon S3. You might want to provide your own policy for access to specific Amazon S3 resources. Data sources require s3:ListBucket and s3:GetObject permissions. Data targets require s3:ListBucket, s3:PutObject, and s3:DeleteObject permissions. For more information on creating an Amazon S3 policy for your resources, see Policies and Permissions in the IAM documentation.

The role should look like this:

Or you can create an S3 policy that’s more restricted to suit your use case.

Step 7: Set up an AWS Glue connection

The Amazon RDS for MySQL connection in AWS Glue should look like this:

The Amazon Redshift connection should look like this:

The Amazon RDS for PostgreSQL connection should look like this:

Step 8: Set up an AWS Glue job

Key configuration notes:

  1. Create a crawler to import table metadata from the source database (Amazon RDS for MySQL) into the AWS Glue Data Catalog. The scenario includes a database in the catalog named gluedb, to which the crawler adds the sample tables from the source Amazon RDS for MySQL database.
  2. Use either the source connection or destination connection to create a sample job as shown following. (This step is required for the AWS Glue job to establish a network connection and create the necessary elastic network interfaces with the databases’ VPCs and peered connections.)
  3. This scenario uses pyspark code and performs the load operation from Amazon RDS for MySQL to Amazon Redshift. The ingest from Amazon RDS for MySQL to Amazon RDS for PostgreSQL includes a similar job.
  4. After running the job, verify that the table exists in the target database and that the counts match.

The following screenshots show the steps to create a job in the AWS Glue Management Console.

Following are some of examples of loading data from source tables to target instances. These are simple one-to-one mappings, with no transformations applied. Notice that the data sources and data sink (target) connection configuration access multiple VPCs from a single AWS Glue job.

Sample script 1 (Amazon RDS for MySQL to Amazon Redshift)

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

datasource = glueContext.create_dynamic_frame.from_catalog(database = "gluedb", table_name = "mysqldb_events", transformation_ctx = "datasource")

datasink = glueContext.write_dynamic_frame.from_jdbc_conf(frame = datasource, catalog_connection = "Redshift", connection_options = {"dbtable": "mysqldb_events", "database": "dmartblog"}, redshift_tmp_dir = args["TempDir"], transformation_ctx = "datasink")


Sample script 2:  Amazon RDS for MySQL to Amazon RDS for PostgreSQL (can also change with other RDS endpoint)

import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args['JOB_NAME'], args)

datasource = glueContext.create_dynamic_frame.from_catalog(database = "gluedb", table_name = "mysqldb_events", transformation_ctx = "datasource")

datasink = glueContext.write_dynamic_frame.from_jdbc_conf(frame = dropnullfields3, catalog_connection = "vpc-pgsql", connection_options = {"dbtable": "mysqldb_events", "database": "mypgsql"}, transformation_ctx = "datasink")



In this blog post, you learn how to configure AWS Glue to run in a separate VPC so that it can execute jobs for databases located in multiple VPCs.

The benefits of doing this include the following:

  • A separate VPC and dedicated pool on the running AWS Glue job, isolated from database and compute nodes.
  • Dedicated ETL developer access to a single VPC for better security control and provisioning.


Additional Reading

If you found this post useful, be sure to check out Restrict access to your AWS Glue Data Catalog with resource-level IAM permissions and resource-based policies, and Using Amazon Redshift Spectrum, Amazon Athena, and AWS Glue with Node.js in Production.


About the Author

Nivas Shankar is a Senior Big Data Consultant at Amazon Web Services. He helps and works closely with enterprise customers building big data applications on the AWS platform. He holds a Masters degree in physics and is highly passionate about theoretical physics concepts. He enjoys spending time with his wife and two adorable kids. In his spare time, he takes his kids to tennis and football practice.



Ian Eberhart is a Cloud Support Engineer on the Big Data team for AWS Premium Support. He works with customers on a daily basis to find solutions for moving and sorting their data on the AWS platform. In his spare time, Ian enjoys seeing independent and weird movies, riding his bike, and hiking in the mountains.


How to connect to AWS Secrets Manager service within a Virtual Private Cloud

Post Syndicated from Divya Sridhar original https://aws.amazon.com/blogs/security/how-to-connect-to-aws-secrets-manager-service-within-a-virtual-private-cloud/

You can now use AWS Secrets Manager with Amazon Virtual Private Cloud (Amazon VPC) endpoints powered by AWS Privatelink and keep traffic between your VPC and Secrets Manager within the AWS network.

AWS Secrets Manager is a secrets management service that helps you protect access to your applications, services, and IT resources. This service enables you to rotate, manage, and retrieve database credentials, API keys, and other secrets throughout their lifecycle. When your application running within an Amazon VPC communicates with Secrets Manager, this communication traverses the public internet. By using Secrets Manager with Amazon VPC endpoints, you can now keep this communication within the AWS network and help meet your compliance and regulatory requirements to limit public internet connectivity. You can start using Secrets Manager with Amazon VPC endpoints by creating an Amazon VPC endpoint for Secrets Manager with a few clicks on the VPC console or via AWS CLI. Once you create the VPC endpoint, you can start using it without making any code or configuration changes in your application.

The diagram demonstrates how Secrets Manager works with Amazon VPC endpoints. It shows how I retrieve a secret stored in Secrets Manager from an Amazon EC2 instance. When the request is sent to Secrets Manager, the entire data flow is contained within the VPC and the AWS network.

Figure 1: How Secrets Manager works with Amazon VPC endpoints

Figure 1: How Secrets Manager works with Amazon VPC endpoints

Solution overview

In this post, I show you how to use Secrets Manager with an Amazon VPC endpoint. In this example, we have an application running on an EC2 instance in VPC named vpc-5ad42b3c. This application requires a database password to an RDS instance running in the same VPC. I have stored the database password in Secrets Manager. I will now show how to:

  1. Create an Amazon VPC endpoint for Secrets Manager using the VPC console.
  2. Use the Amazon VPC endpoint via AWS CLI to retrieve the RDS database secret stored in Secrets Manager from an application running on an EC2 instance.

Step 1: Create an Amazon VPC endpoint for Secrets Manager

  1. Open the Amazon VPC console, select Endpoints, and then select Create Endpoint.
  2. Select AWS Services as the Service category, and then, in the Service Name list, select the Secrets Manager endpoint service named com.amazonaws.us-west-2.secrets-manager.
    Figure 2: Options to select when creating an endpoint

    Figure 2: Options to select when creating an endpoint

  3. Specify the VPC you want to create the endpoint in. For this post, I chose the VPC named vpc-5ad42b3c where my RDS instance and application are running.
  4. To create a VPC endpoint, you need to specify the private IP address range in which the endpoint will be accessible. To do this, select the subnet for each Availability Zone (AZ). This restricts the VPC endpoint to the private IP address range specific to each AZ and also creates an AZ-specific VPC endpoint. Specifying more than one subnet-AZ combination helps improve fault tolerance and make the endpoint accessible from a different AZ in case of an AZ failure. Here, I specify subnet IDs for availability zones us-west-2a, us-west-2b, and us-west-2c:
    Figure 3: Specifying subnet IDs

    Figure 3: Specifying subnet IDs

  5. Select the Enable Private DNS Name checkbox for the VPC endpoint. Private DNS resolves the standard Secrets Manager DNS hostname https://secretsmanager.<region>.amazonaws.com. to the private IP addresses associated with the VPC endpoint specific DNS hostname. As a result, you can access the Secrets Manager VPC Endpoint via the AWS Command Line Interface (AWS CLI) or AWS SDKs without making any code or configuration changes to update the Secrets Manager endpoint URL.
    Figure 4: The "Enable Private DNS Name" checkbox

    Figure 4: The “Enable Private DNS Name” checkbox

  6. Associate a security group with this endpoint. The security group enables you to control the traffic to the endpoint from resources in your VPC. For this post, I chose to associate the security group named sg-07e4197d that I created earlier. This security group has been set up to allow all instances running within VPC vpc-5ad42b3c to access the Secrets Manager VPC endpoint. Select Create endpoint to finish creating the endpoint.
    Figure 5: Associate a security group and create the endpoint

    Figure 5: Associate a security group and create the endpoint

  7. To view the details of the endpoint you created, select the link on the console.
    Figure 6: Viewing the endpoint details

    Figure 6: Viewing the endpoint details

  8. The Details tab shows all the DNS hostnames generated while creating the Amazon VPC endpoint that can be used to connect to Secrets Manager. I can now use the standard endpoint secretsmanager.us-west-2.amazonaws.com or one of the VPC-specific endpoints to connect to Secrets Manager within vpc-5ad42b3c where my RDS instance and application also resides.
    Figure 7: The "Details" tab

    Figure 7: The “Details” tab

Step 2: Access Secrets Manager through the VPC endpoint

Now that I have created the VPC endpoint, all traffic between my application running on an EC2 instance hosted within VPC named vpc-5ad42b3c and Secrets Manager will be within the AWS network. This connection will use the VPC endpoint and I can use it to retrieve my RDS database secret stored in Secrets Manager. I can retrieve the secret via the AWS SDK or CLI. As an example, I can use the CLI command shown below to retrieve the current version of my RDS database secret:

$aws secretsmanager get-secret-value –secret-id MyDatabaseSecret –version-stage AWSCURRENT

Since my AWS CLI is configured for us-west-2 region, it uses the standard Secrets Manager endpoint URL https://secretsmanager.us-west-2.amazonaws.com. This standard endpoint automatically routes to the VPC endpoint since I enabled support for Private DNS hostname while creating the VPC endpoint. The above command will result in the following output:

  "ARN": "arn:aws:secretsmanager:us-west-2:123456789012:secret:MyDatabaseSecret-a1b2c3",
  "Name": "MyDatabaseSecret",
  "VersionId": "EXAMPLE1-90ab-cdef-fedc-ba987EXAMPLE",
  "SecretString": "{\n  \"username\":\"david\",\n  \"password\":\"BnQw&XDWgaEeT9XGTT29\"\n}\n",
  "VersionStages": [
  "CreatedDate": 1523477145.713


I’ve shown you how to create a VPC endpoint for AWS Secrets Manager and retrieve an RDS database secret using the VPC endpoint. Secrets Manager VPC Endpoints help you meet compliance and regulatory requirements about limiting public internet connectivity within your VPC. It enables your applications running within a VPC to use Secrets Manager while keeping traffic between the VPC and Secrets Manager within the AWS network. You can start using Amazon VPC Endpoints for Secrets Manager by creating endpoints in the VPC console or AWS CLI. Once created, your applications that interact with Secrets Manager do not require any code or configuration changes.

To learn more about connecting to Secrets Manager through a VPC endpoint, read the Secrets Manager documentation. For guidance about your overall VPC network structure, see Practical VPC Design.

If you have questions about this feature or anything else related to Secrets Manager, start a new thread in the Secrets Manager forum.

Want more AWS Security news? Follow us on Twitter.

Introducing Amazon API Gateway Private Endpoints

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/introducing-amazon-api-gateway-private-endpoints/

One of the biggest trends in application development today is the use of APIs to power the backend technologies supporting a product. Increasingly, the way mobile, IoT, web applications, or internal services talk to each other and to application frontends is using some API interface.

Alongside this trend of building API-powered applications is the move to a microservices application design pattern. A larger application is represented by many smaller application components, also typically communicating via API. The growth of APIs and microservices being used together is driven across all sorts of companies, from startups up through enterprises. The number of tools required to manage APIs at scale, securely, and with minimal operational overhead is growing as well.

Today, we’re excited to announce the launch of Amazon API Gateway private endpoints. This has been one of the most heavily requested features for this service. We believe this is going to make creating and managing private APIs even easier.

API Gateway overview

When API Gateway first launched, it came with what are now known as edge-optimized endpoints. These publicly facing endpoints came fronted with Amazon CloudFront, a global content delivery network with over 100 points of presence today.

Edge-optimized endpoints helped you reduce latency to clients accessing your API on the internet from anywhere; typically, mobile, IoT, or web-based applications. Behind API Gateway, you could back your API with a number of options for backend technologies: AWS Lambda, Amazon EC2, Elastic Load Balancing products such as Application Load Balancers or Classic Load Balancers, Amazon DynamoDB, Amazon Kinesis, or any publicly available HTTPS-based endpoint.

In February 2016, AWS launched the ability for AWS Lambda functions to access resources inside of an Amazon VPC. With this launch, you could build API-based services that did not require a publicly available endpoint. They could still interact with private services, such as databases, inside your VPC.

In November 2017, API Gateway launched regional API endpoints, which are publicly available endpoints without any preconfigured CDN in front of them. Regional endpoints are great for helping to reduce request latency when API requests originate from the same Region as your REST API. You can also configure your own CDN distribution, which allows you to protect your public APIs with AWS WAF, for example. With regional endpoints, nothing changed about the backend technologies supported.

At re:Invent 2017, we announced endpoint integrations inside a private VPC. With this capability, you can now have your backend running on EC2 be private inside your VPC without the need for a publicly accessible IP address or load balancer. Beyond that, you can also now use API Gateway to front APIs hosted by backends that exist privately in your own data centers, using AWS Direct Connect links to your VPC. Private integrations were made possible via VPC Link and Network Load Balancers, which support backends such as EC2 instances, Auto Scaling groups, and Amazon ECS using the Fargate launch type.

Combined with the other capabilities of API Gateway—such as Lambda authorizers, resource policies, canary deployments, SDK generation, and integration with Amazon Cognito User Pools—you’ve been able to build publicly available APIs, with nearly any backend you could want, securely, at scale, and with minimal operations overhead.

Private endpoints

Today’s launch solves one of the missing pieces of the puzzle, which is the ability to have private API endpoints inside your own VPC. With this new feature, you can still use API Gateway features, while securely exposing REST APIs only to the other services and resources inside your VPC, or those connected via Direct Connect to your own data centers.

Here’s how this works.

API Gateway private endpoints are made possible via AWS PrivateLink interface VPC endpoints. Interface endpoints work by creating elastic network interfaces in subnets that you define inside your VPC. Those network interfaces then provide access to services running in other VPCs, or to AWS services such as API Gateway. When configuring your interface endpoints, you specify which service traffic should go through them. When using private DNS, all traffic to that service is directed to the interface endpoint instead of through a default route, such as through a NAT gateway or public IP address.

API Gateway as a fully managed service runs its infrastructure in its own VPCs. When you interface with API Gateway publicly accessible endpoints, it is done through public networks. When they’re configured as private, the public networks are not made available to route your API. Instead, your API can only be accessed using the interface endpoints that you have configured.

Some things to note:

  • Because you configure the subnets in which your endpoints are made available, you control the availability of the access to your API Gateway hosted APIs. Make sure that you provide multiple interfaces in your VPC. In the above diagram, there is one endpoint in each subnet in each Availability Zone for which the VPC is configured.
  • Each endpoint is an elastic network interface configured in your VPC that has security groups configured. Network ACLs apply to the network interface as well.

For more information about endpoint limits, see Interface VPC Endpoints.

Setting up a private endpoint

Getting up and running with your private API Gateway endpoint requires just a few things:

  • A virtual private cloud (VPC) configured with at least one subnet and DNS resolution enabled.
  • A VPC endpoint with the following configuration:
    • Service name = “com.amazonaws.{region}.execute-api”
    • Enable Private DNS Name = enabled
    • A security group set to allow TCP Port 443 inbound from either an IP range in your VPC or another security group in your VPC
  • An API Gateway managed API with the following configuration:
    • Endpoint Type = “Private”
    • An API Gateway resource policy that allows access to your API from the VPC endpoint

Create the VPC

To create a VPC using AWS CloudFormation, choose Launch stack.

This VPC will have two private and two public subnets, one of each in an AZ, as seen in the CloudFormation Designer.

  1. Name the stack “PrivateAPIDemo”.
  2. Set the Environment to “Demo”. This has no real effect beyond tagging and naming certain resources accordingly.
  3. Choose Next.
  4. On the Options page, leave all of the defaults and choose Next.
  5. On the Review page, choose Create. It takes just a few moments for all of the resources in this template to be created.
  6. After the VPC has a status of “CREATE_COMPLETE”, choose Outputs and make note of the values for VpcId, both public and private subnets 1 and 2, and the endpoint security group.

Create the VPC endpoint for API Gateway

  1. Open the Amazon VPC console.
  2. Make sure that you are in the same Region in which you just created the above stack.
  3. In the left navigation pane, choose Endpoints, Create Endpoint.
  4. For Service category, keep it set to “AWS Services”.
  5. For Service Name, set it to “com.amazonaws.{region}.execute-api”.
  6. For VPC, select the one created earlier.
  7. For Subnets, select the two private labeled subnets from this VPC created earlier, one in each Availability Zone. You can find them labeled as “privateSubnet01” and “privateSubnet02”.
  8. For Enable Private DNS Name, keep it checked as Enabled for this endpoint.
  9. For Security Group, select the group named “EndpointSG”. It allows for HTTPS access to the endpoint for the entire VPC IP address range.
  10. Choose Create Endpoint.

Creating the endpoint takes a few moments to go through all of the interface endpoint lifecycle steps. You need the DNS names later so note them now.

Create the API

Follow the Pet Store example in the API Gateway documentation:

  1. Open the API Gateway console in the same Region as the VPC and private endpoint.
  2. Choose Create API, Example API.
  3. For Endpoint Type, choose Private.
  4. Choose Import.

Before deploying the API, create a resource policy to allow access to the API from inside the VPC.

  1. In the left navigation pane, choose Resource Policy.
  2. Choose Source VPC Whitelist from the three examples possible.
  3. Replace {{vpceID}} with the ID of your VPC endpoint.
  4. Choose Save.
  5. In the left navigation pane, select the new API and choose Actions, Deploy API.
    1. Choose [New Stage].
    2. Name the stage demo.
    3. Choose Deploy.

Your API is now fully deployed and available from inside your VPC. Next, test to confirm that it’s working.

Test the API

To emphasize the “privateness” of this API, test it from a resource that only lives inside your VPC and has no direct network access to it, in the traditional networking sense.

Launch a Lambda function inside the VPC, with no public access. To show its ability to hit the private API endpoint, invoke it using the console. The function is launched inside the private subnets inside the VPC without access to a NAT gateway, which would be required for any internet access. This works because Lambda functions are invoked using the service API, not any direct network access to the function’s underlying resources inside your VPC.

To create a Lambda function using CloudFormation, choose Launch stack.

All the code for this function is located inside of the template and the template creates just three resources, as shown in the diagram from Designer:

  • A Lambda function
  • An IAM role
  • A VPC security group
  1. Name the template LambdaTester, or something easy to remember.
  2. For the first parameter, enter a DNS name from your VPC endpoint. These can be found in the Amazon VPC console under Endpoints. For this example, use the endpoints that start with “vpce”. These are the private DNS names for them.For the API Gateway endpoint DNS, see the dashboard for your API Gateway API and copy the URL from the top of the page. Use just the endpoint DNS, not the “https://” or “/demo/” at the end.
  3. Select the same value for Environment as you did earlier in creating your VPC.
  4. Choose Next.
  5. Leave all options as the default values and choose Next.
  6. Select the check box next to I acknowledge that… and choose Create.
  7. When your stack reaches the “CREATE_COMPLETE” state, choose Resources.
  8. To go to the Lambda console for this function, choose the Physical ID of the AWS::Lambda::Function resource.

Note: If you chose a different environment than “Demo” for this example, modify the line “path: ‘/demo/pets’,” to the appropriate value.

  1. Choose Test in the top right of the Lambda console. You are prompted to create a test event to pass the function. Because you don’t need to take anything here for the function to call the internal API, you can create a blank payload or leave the default as shown. Choose Save.
  2. Choose Test again. This invokes the function and passes in the payload that you just saved. It takes just a few moments for the new function’s environment to spin to life and to call the code configured for it. You should now see the results of the API call to the PetStore API.

The JSON returned is from your API Gateway powered private API endpoint. Visit the API Gateway console to see activity on the dashboard and confirm again that this API was called by the Lambda function, as in the following screenshot:


Cleaning up from this demo requires a few simple steps:

  1. Delete the stack for your Lambda function.
  2. Delete the VPC endpoint.
  3. Delete the API Gateway API.
  4. Delete the VPC stack that you created first.


API Gateway private endpoints enable use cases for building private API–based services inside your own VPCs. You can now keep both the frontend to your API (API Gateway) and the backend service (Lambda, EC2, ECS, etc.) private inside your VPC. Or you can have networks using Direct Connect networks without the need to expose them to the internet in any way. All of this without the need to manage the infrastructure that powers the API gateway itself!

You can continue to use the advanced features of API Gateway such as custom authorizers, Amazon Cognito User Pools integration, usage tiers, throttling, deployment canaries, and API keys.

We believe that this feature greatly simplifies the growth of API-based microservices. We look forward to your feedback here, on social media, or in the AWS forums.

Analyze Apache Parquet optimized data using Amazon Kinesis Data Firehose, Amazon Athena, and Amazon Redshift

Post Syndicated from Roy Hasson original https://aws.amazon.com/blogs/big-data/analyzing-apache-parquet-optimized-data-using-amazon-kinesis-data-firehose-amazon-athena-and-amazon-redshift/

Amazon Kinesis Data Firehose is the easiest way to capture and stream data into a data lake built on Amazon S3. This data can be anything—from AWS service logs like AWS CloudTrail log files, Amazon VPC Flow Logs, Application Load Balancer logs, and others. It can also be IoT events, game events, and much more. To efficiently query this data, a time-consuming ETL (extract, transform, and load) process is required to massage and convert the data to an optimal file format, which increases the time to insight. This situation is less than ideal, especially for real-time data that loses its value over time.

To solve this common challenge, Kinesis Data Firehose can now save data to Amazon S3 in Apache Parquet or Apache ORC format. These are optimized columnar formats that are highly recommended for best performance and cost-savings when querying data in S3. This feature directly benefits you if you use Amazon Athena, Amazon Redshift, AWS Glue, Amazon EMR, or any other big data tools that are available from the AWS Partner Network and through the open-source community.

Amazon Connect is a simple-to-use, cloud-based contact center service that makes it easy for any business to provide a great customer experience at a lower cost than common alternatives. Its open platform design enables easy integration with other systems. One of those systems is Amazon Kinesis—in particular, Kinesis Data Streams and Kinesis Data Firehose.

What’s really exciting is that you can now save events from Amazon Connect to S3 in Apache Parquet format. You can then perform analytics using Amazon Athena and Amazon Redshift Spectrum in real time, taking advantage of this key performance and cost optimization. Of course, Amazon Connect is only one example. This new capability opens the door for a great deal of opportunity, especially as organizations continue to build their data lakes.

Amazon Connect includes an array of analytics views in the Administrator dashboard. But you might want to run other types of analysis. In this post, I describe how to set up a data stream from Amazon Connect through Kinesis Data Streams and Kinesis Data Firehose and out to S3, and then perform analytics using Athena and Amazon Redshift Spectrum. I focus primarily on the Kinesis Data Firehose support for Parquet and its integration with the AWS Glue Data Catalog, Amazon Athena, and Amazon Redshift.

Solution overview

Here is how the solution is laid out:



The following sections walk you through each of these steps to set up the pipeline.

1. Define the schema

When Kinesis Data Firehose processes incoming events and converts the data to Parquet, it needs to know which schema to apply. The reason is that many times, incoming events contain all or some of the expected fields based on which values the producers are advertising. A typical process is to normalize the schema during a batch ETL job so that you end up with a consistent schema that can easily be understood and queried. Doing this introduces latency due to the nature of the batch process. To overcome this issue, Kinesis Data Firehose requires the schema to be defined in advance.

To see the available columns and structures, see Amazon Connect Agent Event Streams. For the purpose of simplicity, I opted to make all the columns of type String rather than create the nested structures. But you can definitely do that if you want.

The simplest way to define the schema is to create a table in the Amazon Athena console. Open the Athena console, and paste the following create table statement, substituting your own S3 bucket and prefix for where your event data will be stored. A Data Catalog database is a logical container that holds the different tables that you can create. The default database name shown here should already exist. If it doesn’t, you can create it or use another database that you’ve already created.

CREATE EXTERNAL TABLE default.kfhconnectblog (
  awsaccountid string,
  agentarn string,
  currentagentsnapshot string,
  eventid string,
  eventtimestamp string,
  eventtype string,
  instancearn string,
  previousagentsnapshot string,
  version string
STORED AS parquet
LOCATION 's3://your_bucket/kfhconnectblog/'
TBLPROPERTIES ("parquet.compression"="SNAPPY")

That’s all you have to do to prepare the schema for Kinesis Data Firehose.

2. Define the data streams

Next, you need to define the Kinesis data streams that will be used to stream the Amazon Connect events.  Open the Kinesis Data Streams console and create two streams.  You can configure them with only one shard each because you don’t have a lot of data right now.

3. Define the Kinesis Data Firehose delivery stream for Parquet

Let’s configure the Data Firehose delivery stream using the data stream as the source and Amazon S3 as the output. Start by opening the Kinesis Data Firehose console and creating a new data delivery stream. Give it a name, and associate it with the Kinesis data stream that you created in Step 2.

As shown in the following screenshot, enable Record format conversion (1) and choose Apache Parquet (2). As you can see, Apache ORC is also supported. Scroll down and provide the AWS Glue Data Catalog database name (3) and table names (4) that you created in Step 1. Choose Next.

To make things easier, the output S3 bucket and prefix fields are automatically populated using the values that you defined in the LOCATION parameter of the create table statement from Step 1. Pretty cool. Additionally, you have the option to save the raw events into another location as defined in the Source record S3 backup section. Don’t forget to add a trailing forward slash “ / “ so that Data Firehose creates the date partitions inside that prefix.

On the next page, in the S3 buffer conditions section, there is a note about configuring a large buffer size. The Parquet file format is highly efficient in how it stores and compresses data. Increasing the buffer size allows you to pack more rows into each output file, which is preferred and gives you the most benefit from Parquet.

Compression using Snappy is automatically enabled for both Parquet and ORC. You can modify the compression algorithm by using the Kinesis Data Firehose API and update the OutputFormatConfiguration.

Be sure to also enable Amazon CloudWatch Logs so that you can debug any issues that you might run into.

Lastly, finalize the creation of the Firehose delivery stream, and continue on to the next section.

4. Set up the Amazon Connect contact center

After setting up the Kinesis pipeline, you now need to set up a simple contact center in Amazon Connect. The Getting Started page provides clear instructions on how to set up your environment, acquire a phone number, and create an agent to accept calls.

After setting up the contact center, in the Amazon Connect console, choose your Instance Alias, and then choose Data Streaming. Under Agent Event, choose the Kinesis data stream that you created in Step 2, and then choose Save.

At this point, your pipeline is complete.  Agent events from Amazon Connect are generated as agents go about their day. Events are sent via Kinesis Data Streams to Kinesis Data Firehose, which converts the event data from JSON to Parquet and stores it in S3. Athena and Amazon Redshift Spectrum can simply query the data without any additional work.

So let’s generate some data. Go back into the Administrator console for your Amazon Connect contact center, and create an agent to handle incoming calls. In this example, I creatively named mine Agent One. After it is created, Agent One can get to work and log into their console and set their availability to Available so that they are ready to receive calls.

To make the data a bit more interesting, I also created a second agent, Agent Two. I then made some incoming and outgoing calls and caused some failures to occur, so I now have enough data available to analyze.

5. Analyze the data with Athena

Let’s open the Athena console and run some queries. One thing you’ll notice is that when we created the schema for the dataset, we defined some of the fields as Strings even though in the documentation they were complex structures.  The reason for doing that was simply to show some of the flexibility of Athena to be able to parse JSON data. However, you can define nested structures in your table schema so that Kinesis Data Firehose applies the appropriate schema to the Parquet file.

Let’s run the first query to see which agents have logged into the system.

The query might look complex, but it’s fairly straightforward:

WITH dataset AS (
    from_iso8601_timestamp(eventtimestamp) AS event_ts,
      '$.agentstatus.name') AS current_status,
        '$.agentstatus.starttimestamp')) AS current_starttimestamp,
      '$.configuration.firstname') AS current_firstname,
      '$.configuration.lastname') AS current_lastname,
      '$.configuration.username') AS current_username,
      '$.configuration.routingprofile.defaultoutboundqueue.name') AS               current_outboundqueue,
      '$.configuration.routingprofile.inboundqueues[0].name') as current_inboundqueue,
      '$.agentstatus.name') as prev_status,
       '$.agentstatus.starttimestamp')) as prev_starttimestamp,
      '$.configuration.firstname') as prev_firstname,
      '$.configuration.lastname') as prev_lastname,
      '$.configuration.username') as prev_username,
      '$.configuration.routingprofile.defaultoutboundqueue.name') as current_outboundqueue,
      '$.configuration.routingprofile.inboundqueues[0].name') as prev_inboundqueue
  from kfhconnectblog
  where eventtype <> 'HEART_BEAT'
  current_status as status,
  current_username as username,
FROM dataset
WHERE eventtype = 'LOGIN' AND current_username <> ''
ORDER BY event_ts DESC

The query output looks something like this:

Here is another query that shows the sessions each of the agents engaged with. It tells us where they were incoming or outgoing, if they were completed, and where there were missed or failed calls.

WITH src AS (
     json_extract_scalar(currentagentsnapshot, '$.configuration.username') as username,
     cast(json_extract(currentagentsnapshot, '$.contacts') AS ARRAY(JSON)) as c,
     cast(json_extract(previousagentsnapshot, '$.contacts') AS ARRAY(JSON)) as p
  from kfhconnectblog
src2 AS (
  FROM src CROSS JOIN UNNEST (c, p) AS contacts(c_item, p_item)
dataset AS (
  json_extract_scalar(c_item, '$.contactid') as c_contactid,
  json_extract_scalar(c_item, '$.channel') as c_channel,
  json_extract_scalar(c_item, '$.initiationmethod') as c_direction,
  json_extract_scalar(c_item, '$.queue.name') as c_queue,
  json_extract_scalar(c_item, '$.state') as c_state,
  from_iso8601_timestamp(json_extract_scalar(c_item, '$.statestarttimestamp')) as c_ts,
  json_extract_scalar(p_item, '$.contactid') as p_contactid,
  json_extract_scalar(p_item, '$.channel') as p_channel,
  json_extract_scalar(p_item, '$.initiationmethod') as p_direction,
  json_extract_scalar(p_item, '$.queue.name') as p_queue,
  json_extract_scalar(p_item, '$.state') as p_state,
  from_iso8601_timestamp(json_extract_scalar(p_item, '$.statestarttimestamp')) as p_ts
FROM src2
  c_channel as channel,
  c_direction as direction,
  p_state as prev_state,
  c_state as current_state,
  c_ts as current_ts,
  c_contactid as id
FROM dataset
WHERE c_contactid = p_contactid
ORDER BY id DESC, current_ts ASC

The query output looks similar to the following:

6. Analyze the data with Amazon Redshift Spectrum

With Amazon Redshift Spectrum, you can query data directly in S3 using your existing Amazon Redshift data warehouse cluster. Because the data is already in Parquet format, Redshift Spectrum gets the same great benefits that Athena does.

Here is a simple query to show querying the same data from Amazon Redshift. Note that to do this, you need to first create an external schema in Amazon Redshift that points to the AWS Glue Data Catalog.

  json_extract_path_text(currentagentsnapshot,'agentstatus','name') AS current_status,
  json_extract_path_text(currentagentsnapshot, 'configuration','firstname') AS current_firstname,
  json_extract_path_text(currentagentsnapshot, 'configuration','lastname') AS current_lastname,
    'configuration','routingprofile','defaultoutboundqueue','name') AS current_outboundqueue,
FROM default_schema.kfhconnectblog

The following shows the query output:


In this post, I showed you how to use Kinesis Data Firehose to ingest and convert data to columnar file format, enabling real-time analysis using Athena and Amazon Redshift. This great feature enables a level of optimization in both cost and performance that you need when storing and analyzing large amounts of data. This feature is equally important if you are investing in building data lakes on AWS.


Additional Reading

If you found this post useful, be sure to check out Analyzing VPC Flow Logs with Amazon Kinesis Firehose, Amazon Athena, and Amazon QuickSight and Work with partitioned data in AWS Glue.

About the Author

Roy Hasson is a Global Business Development Manager for AWS Analytics. He works with customers around the globe to design solutions to meet their data processing, analytics and business intelligence needs. Roy is big Manchester United fan cheering his team on and hanging out with his family.




How to centralize DNS management in a multi-account environment

Post Syndicated from Mahmoud Matouk original https://aws.amazon.com/blogs/security/how-to-centralize-dns-management-in-a-multi-account-environment/

In a multi-account environment where you require connectivity between accounts, and perhaps connectivity between cloud and on-premises workloads, the demand for a robust Domain Name Service (DNS) that’s capable of name resolution across all connected environments will be high.

The most common solution is to implement local DNS in each account and use conditional forwarders for DNS resolutions outside of this account. While this solution might be efficient for a single-account environment, it becomes complex in a multi-account environment.

In this post, I will provide a solution to implement central DNS for multiple accounts. This solution reduces the number of DNS servers and forwarders needed to implement cross-account domain resolution. I will show you how to configure this solution in four steps:

  1. Set up your Central DNS account.
  2. Set up each participating account.
  3. Create Route53 associations.
  4. Configure on-premises DNS (if applicable).

Solution overview

In this solution, you use AWS Directory Service for Microsoft Active Directory (AWS Managed Microsoft AD) as a DNS service in a dedicated account in a Virtual Private Cloud (DNS-VPC).

The DNS service included in AWS Managed Microsoft AD uses conditional forwarders to forward domain resolution to either Amazon Route 53 (for domains in the awscloud.com zone) or to on-premises DNS servers (for domains in the example.com zone). You’ll use AWS Managed Microsoft AD as the primary DNS server for other application accounts in the multi-account environment (participating accounts).

A participating account is any application account that hosts a VPC and uses the centralized AWS Managed Microsoft AD as the primary DNS server for that VPC. Each participating account has a private, hosted zone with a unique zone name to represent this account (for example, business_unit.awscloud.com).

You associate the DNS-VPC with the unique hosted zone in each of the participating accounts, this allows AWS Managed Microsoft AD to use Route 53 to resolve all registered domains in private, hosted zones in participating accounts.

The following diagram shows how the various services work together:

Diagram showing the relationship between all the various services

Figure 1: Diagram showing the relationship between all the various services


In this diagram, all VPCs in participating accounts use Dynamic Host Configuration Protocol (DHCP) option sets. The option sets configure EC2 instances to use the centralized AWS Managed Microsoft AD in DNS-VPC as their default DNS Server. You also configure AWS Managed Microsoft AD to use conditional forwarders to send domain queries to Route53 or on-premises DNS servers based on query zone. For domain resolution across accounts to work, we associate DNS-VPC with each hosted zone in participating accounts.

If, for example, server.pa1.awscloud.com needs to resolve addresses in the pa3.awscloud.com domain, the sequence shown in the following diagram happens:

How domain resolution across accounts works

Figure 2: How domain resolution across accounts works


  • 1.1: server.pa1.awscloud.com sends domain name lookup to default DNS server for the name server.pa3.awscloud.com. The request is forwarded to the DNS server defined in the DHCP option set (AWS Managed Microsoft AD in DNS-VPC).
  • 1.2: AWS Managed Microsoft AD forwards name resolution to Route53 because it’s in the awscloud.com zone.
  • 1.3: Route53 resolves the name to the IP address of server.pa3.awscloud.com because DNS-VPC is associated with the private hosted zone pa3.awscloud.com.

Similarly, if server.example.com needs to resolve server.pa3.awscloud.com, the following happens:

  • 2.1: server.example.com sends domain name lookup to on-premise DNS server for the name server.pa3.awscloud.com.
  • 2.2: on-premise DNS server using conditional forwarder forwards domain lookup to AWS Managed Microsoft AD in DNS-VPC.
  • 1.2: AWS Managed Microsoft AD forwards name resolution to Route53 because it’s in the awscloud.com zone.
  • 1.3: Route53 resolves the name to the IP address of server.pa3.awscloud.com because DNS-VPC is associated with the private hosted zone pa3.awscloud.com.

Step 1: Set up a centralized DNS account

In previous AWS Security Blog posts, Drew Dennis covered a couple of options for establishing DNS resolution between on-premises networks and Amazon VPC. In this post, he showed how you can use AWS Managed Microsoft AD (provisioned with AWS Directory Service) to provide DNS resolution with forwarding capabilities.

To set up a centralized DNS account, you can follow the same steps in Drew’s post to create AWS Managed Microsoft AD and configure the forwarders to send DNS queries for awscloud.com to default, VPC-provided DNS and to forward example.com queries to the on-premise DNS server.

Here are a few considerations while setting up central DNS:

  • The VPC that hosts AWS Managed Microsoft AD (DNS-VPC) will be associated with all private hosted zones in participating accounts.
  • To be able to resolve domain names across AWS and on-premises, connectivity through Direct Connect or VPN must be in place.

Step 2: Set up participating accounts

The steps I suggest in this section should be applied individually in each application account that’s participating in central DNS resolution.

  1. Create the VPC(s) that will host your resources in participating account.
  2. Create VPC Peering between local VPC(s) in each participating account and DNS-VPC.
  3. Create a private hosted zone in Route 53. Hosted zone domain names must be unique across all accounts. In the diagram above, we used pa1.awscloud.com / pa2.awscloud.com / pa3.awscloud.com. You could also use a combination of environment and business unit: for example, you could use pa1.dev.awscloud.com to achieve uniqueness.
  4. Associate VPC(s) in each participating account with the local private hosted zone.

The next step is to change the default DNS servers on each VPC using DHCP option set:

  1. Follow these steps to create a new DHCP option set. Make sure in the DNS Servers to put the private IP addresses of the two AWS Managed Microsoft AD servers that were created in DNS-VPC:
    The "Create DHCP options set" dialog box

    Figure 3: The “Create DHCP options set” dialog box


  2. Follow these steps to assign the DHCP option set to your VPC(s) in participating account.

Step 3: Associate DNS-VPC with private hosted zones in each participating account

The next steps will associate DNS-VPC with the private, hosted zone in each participating account. This allows instances in DNS-VPC to resolve domain records created in these hosted zones. If you need them, here are more details on associating a private, hosted zone with VPC on a different account.

  1. In each participating account, create the authorization using the private hosted zone ID from the previous step, the region, and the VPC ID that you want to associate (DNS-VPC).
    aws route53 create-vpc-association-authorization –hosted-zone-id <hosted-zone-id> –vpc VPCRegion=<region>,VPCId=<vpc-id>
  2. In the centralized DNS account, associate DNS-VPC with the hosted zone in each participating account.
    aws route53 associate-vpc-with-hosted-zone –hosted-zone-id <hosted-zone-id> –vpc VPCRegion=<region>,VPCId=<vpc-id>

After completing these steps, AWS Managed Microsoft AD in the centralized DNS account should be able to resolve domain records in the private, hosted zone in each participating account.

Step 4: Setting up on-premises DNS servers

This step is necessary if you would like to resolve AWS private domains from on-premises servers and this task comes down to configuring forwarders on-premise to forward DNS queries to AWS Managed Microsoft AD in DNS-VPC for all domains in the awscloud.com zone.

The steps to implement conditional forwarders vary by DNS product. Follow your product’s documentation to complete this configuration.


I introduced a simplified solution to implement central DNS resolution in a multi-account environment that could be also extended to support DNS resolution between on-premise resources and AWS. This can help reduce operations effort and the number of resources needed to implement cross-account domain resolution.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Directory Service forum or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Securing messages published to Amazon SNS with AWS PrivateLink

Post Syndicated from Otavio Ferreira original https://aws.amazon.com/blogs/security/securing-messages-published-to-amazon-sns-with-aws-privatelink/

Amazon Simple Notification Service (SNS) now supports VPC Endpoints (VPCE) via AWS PrivateLink. You can use VPC Endpoints to privately publish messages to SNS topics, from an Amazon Virtual Private Cloud (VPC), without traversing the public internet. When you use AWS PrivateLink, you don’t need to set up an Internet Gateway (IGW), Network Address Translation (NAT) device, or Virtual Private Network (VPN) connection. You don’t need to use public IP addresses, either.

VPC Endpoints doesn’t require code changes and can bring additional security to Pub/Sub Messaging use cases that rely on SNS. VPC Endpoints helps promote data privacy and is aligned with assurance programs, including the Health Insurance Portability and Accountability Act (HIPAA), FedRAMP, and others discussed below.

VPC Endpoints for SNS in action

Here’s how VPC Endpoints for SNS works. The following example is based on a banking system that processes mortgage applications. This banking system, which has been deployed to a VPC, publishes each mortgage application to an SNS topic. The SNS topic then fans out the mortgage application message to two subscribing AWS Lambda functions:

  • Save-Mortgage-Application stores the application in an Amazon DynamoDB table. As the mortgage application contains personally identifiable information (PII), the message must not traverse the public internet.
  • Save-Credit-Report checks the applicant’s credit history against an external Credit Reporting Agency (CRA), then stores the final credit report in an Amazon S3 bucket.

The following diagram depicts the underlying architecture for this banking system:
Diagram depicting the architecture for the example banking system
To protect applicants’ data, the financial institution responsible for developing this banking system needed a mechanism to prevent PII data from traversing the internet when publishing mortgage applications from their VPC to the SNS topic. Therefore, they created a VPC endpoint to enable their publisher Amazon EC2 instance to privately connect to the SNS API. As shown in the diagram, when the VPC endpoint is created, an Elastic Network Interface (ENI) is automatically placed in the same VPC subnet as the publisher EC2 instance. This ENI exposes a private IP address that is used as the entry point for traffic destined to SNS. This ensures that traffic between the VPC and SNS doesn’t leave the Amazon network.

Set up VPC Endpoints for SNS

The process for creating a VPC endpoint to privately connect to SNS doesn’t require code changes: access the VPC Management Console, navigate to the Endpoints section, and create a new Endpoint. Three attributes are required:

  • The SNS service name.
  • The VPC and Availability Zones (AZs) from which you’ll publish your messages.
  • The Security Group (SG) to be associated with the endpoint network interface. The Security Group controls the traffic to the endpoint network interface from resources in your VPC. If you don’t specify a Security Group, the default Security Group for your VPC will be associated.

Help ensure your security and compliance

SNS can support messaging use cases in regulated market segments, such as healthcare provider systems subject to the Health Insurance Portability and Accountability Act (HIPAA) and financial systems subject to the Payment Card Industry Data Security Standard (PCI DSS), and is also in-scope with the following Assurance Programs:

The SNS API is served through HTTP Secure (HTTPS), and encrypts all messages in transit with Transport Layer Security (TLS) certificates issued by Amazon Trust Services (ATS). The certificates verify the identity of the SNS API server when encrypted connections are established. The certificates help establish proof that your SNS API client (SDK, CLI) is communicating securely with the SNS API server. A Certificate Authority (CA) issues the certificate to a specific domain. Hence, when a domain presents a certificate that’s issued by a trusted CA, the SNS API client knows it’s safe to make the connection.


VPC Endpoints can increase the security of your pub/sub messaging use cases by allowing you to publish messages to SNS topics, from instances in your VPC, without traversing the internet. Setting up VPC Endpoints for SNS doesn’t require any code changes because the SNS API address remains the same.

VPC Endpoints for SNS is now available in all AWS Regions where AWS PrivateLink is available. For information on pricing and regional availability, visit the VPC pricing page.
For more information and on-boarding, see Publishing to Amazon SNS Topics from Amazon Virtual Private Cloud in the SNS documentation.

If you have comments about this post, submit them in the Comments section below. If you have questions about anything in this post, start a new thread on the Amazon SNS forum or contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Reactive Microservices Architecture on AWS

Post Syndicated from Sascha Moellering original https://aws.amazon.com/blogs/architecture/reactive-microservices-architecture-on-aws/

Microservice-application requirements have changed dramatically in recent years. These days, applications operate with petabytes of data, need almost 100% uptime, and end users expect sub-second response times. Typical N-tier applications can’t deliver on these requirements.

Reactive Manifesto, published in 2014, describes the essential characteristics of reactive systems including: responsiveness, resiliency, elasticity, and being message driven.

Being message driven is perhaps the most important characteristic of reactive systems. Asynchronous messaging helps in the design of loosely coupled systems, which is a key factor for scalability. In order to build a highly decoupled system, it is important to isolate services from each other. As already described, isolation is an important aspect of the microservices pattern. Indeed, reactive systems and microservices are a natural fit.

Implemented Use Case
This reference architecture illustrates a typical ad-tracking implementation.

Many ad-tracking companies collect massive amounts of data in near-real-time. In many cases, these workloads are very spiky and heavily depend on the success of the ad-tech companies’ customers. Typically, an ad-tracking-data use case can be separated into a real-time part and a non-real-time part. In the real-time part, it is important to collect data as fast as possible and ask several questions including:,  “Is this a valid combination of parameters?,””Does this program exist?,” “Is this program still valid?”

Because response time has a huge impact on conversion rate in advertising, it is important for advertisers to respond as fast as possible. This information should be kept in memory to reduce communication overhead with the caching infrastructure. The tracking application itself should be as lightweight and scalable as possible. For example, the application shouldn’t have any shared mutable state and it should use reactive paradigms. In our implementation, one main application is responsible for this real-time part. It collects and validates data, responds to the client as fast as possible, and asynchronously sends events to backend systems.

The non-real-time part of the application consumes the generated events and persists them in a NoSQL database. In a typical tracking implementation, clicks, cookie information, and transactions are matched asynchronously and persisted in a data store. The matching part is not implemented in this reference architecture. Many ad-tech architectures use frameworks like Hadoop for the matching implementation.

The system can be logically divided into the data collection partand the core data updatepart. The data collection part is responsible for collecting, validating, and persisting the data. In the core data update part, the data that is used for validation gets updated and all subscribers are notified of new data.

Components and Services

Main Application
The main application is implemented using Java 8 and uses Vert.x as the main framework. Vert.x is an event-driven, reactive, non-blocking, polyglot framework to implement microservices. It runs on the Java virtual machine (JVM) by using the low-level IO library Netty. You can write applications in Java, JavaScript, Groovy, Ruby, Kotlin, Scala, and Ceylon. The framework offers a simple and scalable actor-like concurrency model. Vert.x calls handlers by using a thread known as an event loop. To use this model, you have to write code known as “verticles.” Verticles share certain similarities with actors in the actor model. To use them, you have to implement the verticle interface. Verticles communicate with each other by generating messages in  a single event bus. Those messages are sent on the event bus to a specific address, and verticles can register to this address by using handlers.

With only a few exceptions, none of the APIs in Vert.x block the calling thread. Similar to Node.js, Vert.x uses the reactor pattern. However, in contrast to Node.js, Vert.x uses several event loops. Unfortunately, not all APIs in the Java ecosystem are written asynchronously, for example, the JDBC API. Vert.x offers a possibility to run this, blocking APIs without blocking the event loop. These special verticles are called worker verticles. You don’t execute worker verticles by using the standard Vert.x event loops, but by using a dedicated thread from a worker pool. This way, the worker verticles don’t block the event loop.

Our application consists of five different verticles covering different aspects of the business logic. The main entry point for our application is the HttpVerticle, which exposes an HTTP-endpoint to consume HTTP-requests and for proper health checking. Data from HTTP requests such as parameters and user-agent information are collected and transformed into a JSON message. In order to validate the input data (to ensure that the program exists and is still valid), the message is sent to the CacheVerticle.

This verticle implements an LRU-cache with a TTL of 10 minutes and a capacity of 100,000 entries. Instead of adding additional functionality to a standard JDK map implementation, we use Google Guava, which has all the features we need. If the data is not in the L1 cache, the message is sent to the RedisVerticle. This verticle is responsible for data residing in Amazon ElastiCache and uses the Vert.x-redis-client to read data from Redis. In our example, Redis is the central data store. However, in a typical production implementation, Redis would just be the L2 cache with a central data store like Amazon DynamoDB. One of the most important paradigms of a reactive system is to switch from a pull- to a push-based model. To achieve this and reduce network overhead, we’ll use Redis pub/sub to push core data changes to our main application.

Vert.x also supports direct Redis pub/sub-integration, the following code shows our subscriber-implementation:

vertx.eventBus().<JsonObject>consumer(REDIS_PUBSUB_CHANNEL_VERTX, received -> {

JsonObject value = received.body().getJsonObject("value");

String message = value.getString("message");

JsonObject jsonObject = new JsonObject(message);



redis.subscribe(Constants.REDIS_PUBSUB_CHANNEL, res -> {

if (res.succeeded()) {

LOGGER.info("Subscribed to " + Constants.REDIS_PUBSUB_CHANNEL);

} else {




The verticle subscribes to the appropriate Redis pub/sub-channel. If a message is sent over this channel, the payload is extracted and forwarded to the cache-verticle that stores the data in the L1-cache. After storing and enriching data, a response is sent back to the HttpVerticle, which responds to the HTTP request that initially hit this verticle. In addition, the message is converted to ByteBuffer, wrapped in protocol buffers, and send to an Amazon Kinesis Data Stream.

The following example shows a stripped-down version of the KinesisVerticle:

public class KinesisVerticle extends AbstractVerticle {

private static final Logger LOGGER = LoggerFactory.getLogger(KinesisVerticle.class);

private AmazonKinesisAsync kinesisAsyncClient;

private String eventStream = "EventStream";


public void start() throws Exception {

EventBus eb = vertx.eventBus();

kinesisAsyncClient = createClient();

eventStream = System.getenv(STREAM_NAME) == null ? "EventStream" : System.getenv(STREAM_NAME);

eb.consumer(Constants.KINESIS_EVENTBUS_ADDRESS, message -> {

try {

TrackingMessage trackingMessage = Json.decodeValue((String)message.body(), TrackingMessage.class);

String partitionKey = trackingMessage.getMessageId();

byte [] byteMessage = createMessage(trackingMessage);

ByteBuffer buf = ByteBuffer.wrap(byteMessage);

sendMessageToKinesis(buf, partitionKey);



catch (KinesisException exc) {





Kinesis Consumer
This AWS Lambda function consumes data from an Amazon Kinesis Data Stream and persists the data in an Amazon DynamoDB table. In order to improve testability, the invocation code is separated from the business logic. The invocation code is implemented in the class KinesisConsumerHandler and iterates over the Kinesis events pulled from the Kinesis stream by AWS Lambda. Each Kinesis event is unwrapped and transformed from ByteBuffer to protocol buffers and converted into a Java object. Those Java objects are passed to the business logic, which persists the data in a DynamoDB table. In order to improve duration of successive Lambda calls, the DynamoDB-client is instantiated lazily and reused if possible.

Redis Updater
From time to time, it is necessary to update core data in Redis. A very efficient implementation for this requirement is using AWS Lambda and Amazon Kinesis. New core data is sent over the AWS Kinesis stream using JSON as data format and consumed by a Lambda function. This function iterates over the Kinesis events pulled from the Kinesis stream by AWS Lambda. Each Kinesis event is unwrapped and transformed from ByteBuffer to String and converted into a Java object. The Java object is passed to the business logic and stored in Redis. In addition, the new core data is also sent to the main application using Redis pub/sub in order to reduce network overhead and converting from a pull- to a push-based model.

The following example shows the source code to store data in Redis and notify all subscribers:

public void updateRedisData(final TrackingMessage trackingMessage, final Jedis jedis, final LambdaLogger logger) {

try {

ObjectMapper mapper = new ObjectMapper();

String jsonString = mapper.writeValueAsString(trackingMessage);

Map<String, String> map = marshal(jsonString);

String statusCode = jedis.hmset(trackingMessage.getProgramId(), map);


catch (Exception exc) {

if (null == logger)






public void notifySubscribers(final TrackingMessage trackingMessage, final Jedis jedis, final LambdaLogger logger) {

try {

ObjectMapper mapper = new ObjectMapper();

String jsonString = mapper.writeValueAsString(trackingMessage);

jedis.publish(Constants.REDIS_PUBSUB_CHANNEL, jsonString);


catch (final IOException e) {

log(e.getMessage(), logger);



Similarly to our Kinesis Consumer, the Redis-client is instantiated somewhat lazily.

Infrastructure as Code
As already outlined, latency and response time are a very critical part of any ad-tracking solution because response time has a huge impact on conversion rate. In order to reduce latency for customers world-wide, it is common practice to roll out the infrastructure in different AWS Regions in the world to be as close to the end customer as possible. AWS CloudFormation can help you model and set up your AWS resources so that you can spend less time managing those resources and more time focusing on your applications that run in AWS.

You create a template that describes all the AWS resources that you want (for example, Amazon EC2 instances or Amazon RDS DB instances), and AWS CloudFormation takes care of provisioning and configuring those resources for you. Our reference architecture can be rolled out in different Regions using an AWS CloudFormation template, which sets up the complete infrastructure (for example, Amazon Virtual Private Cloud (Amazon VPC), Amazon Elastic Container Service (Amazon ECS) cluster, Lambda functions, DynamoDB table, Amazon ElastiCache cluster, etc.).

In this blog post we described reactive principles and an example architecture with a common use case. We leveraged the capabilities of different frameworks in combination with several AWS services in order to implement reactive principles—not only at the application-level but also at the system-level. I hope I’ve given you ideas for creating your own reactive applications and systems on AWS.

About the Author

Sascha Moellering is a Senior Solution Architect. Sascha is primarily interested in automation, infrastructure as code, distributed computing, containers and JVM. He can be reached at [email protected]



Task Networking in AWS Fargate

Post Syndicated from Nathan Peck original https://aws.amazon.com/blogs/compute/task-networking-in-aws-fargate/

AWS Fargate is a technology that allows you to focus on running your application without needing to provision, monitor, or manage the underlying compute infrastructure. You package your application into a Docker container that you can then launch using your container orchestration tool of choice.

Fargate allows you to use containers without being responsible for Amazon EC2 instances, similar to how EC2 allows you to run VMs without managing physical infrastructure. Currently, Fargate provides support for Amazon Elastic Container Service (Amazon ECS). Support for Amazon Elastic Container Service for Kubernetes (Amazon EKS) will be made available in the near future.

Despite offloading the responsibility for the underlying instances, Fargate still gives you deep control over configuration of network placement and policies. This includes the ability to use many networking fundamentals such as Amazon VPC and security groups.

This post covers how to take advantage of the different ways of networking your containers in Fargate when using ECS as your orchestration platform, with a focus on how to do networking securely.

The first step to running any application in Fargate is defining an ECS task for Fargate to launch. A task is a logical group of one or more Docker containers that are deployed with specified settings. When running a task in Fargate, there are two different forms of networking to consider:

  • Container (local) networking
  • External networking

Container Networking

Container networking is often used for tightly coupled application components. Perhaps your application has a web tier that is responsible for serving static content as well as generating some dynamic HTML pages. To generate these dynamic pages, it has to fetch information from another application component that has an HTTP API.

One potential architecture for such an application is to deploy the web tier and the API tier together as a pair and use local networking so the web tier can fetch information from the API tier.

If you are running these two components as two processes on a single EC2 instance, the web tier application process could communicate with the API process on the same machine by using the local loopback interface. The local loopback interface has a special IP address of and hostname of localhost.

By making a networking request to this local interface, it bypasses the network interface hardware and instead the operating system just routes network calls from one process to the other directly. This gives the web tier a fast and efficient way to fetch information from the API tier with almost no networking latency.

In Fargate, when you launch multiple containers as part of a single task, they can also communicate with each other over the local loopback interface. Fargate uses a special container networking mode called awsvpc, which gives all the containers in a task a shared elastic network interface to use for communication.

If you specify a port mapping for each container in the task, then the containers can communicate with each other on that port. For example the following task definition could be used to deploy the web tier and the API tier:

  "family": "myapp"
  "containerDefinitions": [
      "name": "web",
      "image": "my web image url",
      "portMappings": [
          "containerPort": 80
      "memory": 500,
      "cpu": 10,
      "esssential": true
      "name": "api",
      "image": "my api image url",
      "portMappings": [
          "containerPort": 8080
      "cpu": 10,
      "memory": 500,
      "essential": true

ECS, with Fargate, is able to take this definition and launch two containers, each of which is bound to a specific static port on the elastic network interface for the task.

Because each Fargate task has its own isolated networking stack, there is no need for dynamic ports to avoid port conflicts between different tasks as in other networking modes. The static ports make it easy for containers to communicate with each other. For example, the web container makes a request to the API container using its well-known static port:


This sends a local network request, which goes directly from one container to the other over the local loopback interface without traversing the network. This deployment strategy allows for fast and efficient communication between two tightly coupled containers. But most application architectures require more than just internal local networking.

External Networking

External networking is used for network communications that go outside the task to other servers that are not part of the task, or network communications that originate from other hosts on the internet and are directed to the task.

Configuring external networking for a task is done by modifying the settings of the VPC in which you launch your tasks. A VPC is a fundamental tool in AWS for controlling the networking capabilities of resources that you launch on your account.

When setting up a VPC, you create one or more subnets, which are logical groups that your resources can be placed into. Each subnet has an Availability Zone and its own route table, which defines rules about how network traffic operates for that subnet. There are two main types of subnets: public and private.

Public subnets

A public subnet is a subnet that has an associated internet gateway. Fargate tasks in that subnet are assigned both private and public IP addresses:

A browser or other client on the internet can send network traffic to the task via the internet gateway using its public IP address. The tasks can also send network traffic to other servers on the internet because the route table can route traffic out via the internet gateway.

If tasks want to communicate directly with each other, they can use each other’s private IP address to send traffic directly from one to the other so that it stays inside the subnet without going out to the internet gateway and back in.

Private subnets

A private subnet does not have direct internet access. The Fargate tasks inside the subnet don’t have public IP addresses, only private IP addresses. Instead of an internet gateway, a network address translation (NAT) gateway is attached to the subnet:


There is no way for another server or client on the internet to reach your tasks directly, because they don’t even have an address or a direct route to reach them. This is a great way to add another layer of protection for internal tasks that handle sensitive data. Those tasks are protected and can’t receive any inbound traffic at all.

In this configuration, the tasks can still communicate to other servers on the internet via the NAT gateway. They would appear to have the IP address of the NAT gateway to the recipient of the communication. If you run a Fargate task in a private subnet, you must add this NAT gateway. Otherwise, Fargate can’t make a network request to Amazon ECR to download the container image, or communicate with Amazon CloudWatch to store container metrics.

Load balancers

If you are running a container that is hosting internet content in a private subnet, you need a way for traffic from the public to reach the container. This is generally accomplished by using a load balancer such as an Application Load Balancer or a Network Load Balancer.

ECS integrates tightly with AWS load balancers by automatically configuring a service-linked load balancer to send network traffic to containers that are part of the service. When each task starts, the IP address of its elastic network interface is added to the load balancer’s configuration. When the task is being shut down, network traffic is safely drained from the task before removal from the load balancer.

To get internet traffic to containers using a load balancer, the load balancer is placed into a public subnet. ECS configures the load balancer to forward traffic to the container tasks in the private subnet:

This configuration allows your tasks in Fargate to be safely isolated from the rest of the internet. They can still initiate network communication with external resources via the NAT gateway, and still receive traffic from the public via the Application Load Balancer that is in the public subnet.

Another potential use case for a load balancer is for internal communication from one service to another service within the private subnet. This is typically used for a microservice deployment, in which one service such as an internet user account service needs to communicate with an internal service such as a password service. Obviously, it is undesirable for the password service to be directly accessible on the internet, so using an internet load balancer would be a major security vulnerability. Instead, this can be accomplished by hosting an internal load balancer within the private subnet:

With this approach, one container can distribute requests across an Auto Scaling group of other private containers via the internal load balancer, ensuring that the network traffic stays safely protected within the private subnet.

Best Practices for Fargate Networking

Determine whether you should use local task networking

Local task networking is ideal for communicating between containers that are tightly coupled and require maximum networking performance between them. However, when you deploy one or more containers as part of the same task they are always deployed together so it removes the ability to independently scale different types of workload up and down.

In the example of the application with a web tier and an API tier, it may be the case that powering the application requires only two web tier containers but 10 API tier containers. If local container networking is used between these two container types, then an extra eight unnecessary web tier containers would end up being run instead of allowing the two different services to scale independently.

A better approach would be to deploy the two containers as two different services, each with its own load balancer. This allows clients to communicate with the two web containers via the web service’s load balancer. The web service could distribute requests across the eight backend API containers via the API service’s load balancer.

Run internet tasks that require internet access in a public subnet

If you have tasks that require internet access and a lot of bandwidth for communication with other services, it is best to run them in a public subnet. Give them public IP addresses so that each task can communicate with other services directly.

If you run these tasks in a private subnet, then all their outbound traffic has to go through an NAT gateway. AWS NAT gateways support up to 10 Gbps of burst bandwidth. If your bandwidth requirements go over this, then all task networking starts to get throttled. To avoid this, you could distribute the tasks across multiple private subnets, each with their own NAT gateway. It can be easier to just place the tasks into a public subnet, if possible.

Avoid using a public subnet or public IP addresses for private, internal tasks

If you are running a service that handles private, internal information, you should not put it into a public subnet or use a public IP address. For example, imagine that you have one task, which is an API gateway for authentication and access control. You have another background worker task that handles sensitive information.

The intended access pattern is that requests from the public go to the API gateway, which then proxies request to the background task only if the request is from an authenticated user. If the background task is in a public subnet and has a public IP address, then it could be possible for an attacker to bypass the API gateway entirely. They could communicate directly to the background task using its public IP address, without being authenticated.


Fargate gives you a way to run containerized tasks directly without managing any EC2 instances, but you still have full control over how you want networking to work. You can set up containers to talk to each other over the local network interface for maximum speed and efficiency. For running workloads that require privacy and security, use a private subnet with public internet access locked down. Or, for simplicity with an internet workload, you can just use a public subnet and give your containers a public IP address.

To deploy one of these Fargate task networking approaches, check out some sample CloudFormation templates showing how to configure the VPC, subnets, and load balancers.

If you have questions or suggestions, please comment below.