All posts by Nicolas Malaval

How to add DNS filtering to your NAT instance with Squid

Post Syndicated from Nicolas Malaval original https://aws.amazon.com/blogs/security/how-to-add-dns-filtering-to-your-nat-instance-with-squid/

Note from September 4, 2019: We’ve updated this blog post, initially published on January 26, 2016. Major changes include: support of Amazon Linux 2, no longer having to compile Squid 3.5, and a high availability version of the solution across two availability zones.

Amazon Virtual Private Cloud (Amazon VPC) enables you to launch AWS resources on a virtual private network that you’ve defined. In an Amazon VPC, many people use network address translation (NAT) instances and NAT gateways to enable instances in a private subnet to initiate outbound traffic to the Internet, while preventing the instances from receiving inbound traffic initiated by someone on the Internet.

For security and compliance purposes, you might have to filter the requests initiated by these instances (also known as “egress filtering”). Using iptables rules, you could restrict outbound traffic with your NAT instance based on a predefined destination port or IP address. However, you might need to enforce more complex security policies, such as allowing requests to AWS endpoints only, or blocking fraudulent websites, which you can’t easily achieve by using iptables rules.

In this post, I discuss and give an example of how to use Squid, a leading open-source proxy, to implement a “transparent proxy” that can restrict both HTTP and HTTPS outbound traffic to a given set of Internet domains, while being fully transparent for instances in the private subnet.

The solution architecture

In this section, I present the architecture of the high availability NAT solution and explain how to configure Squid to filter traffic transparently. Later in this post, I’ll provide instructions about how to implement and test the solution.

The following diagram illustrates how the components in this process interact with each other. Squid Instance 1 intercepts HTTP/S requests sent by instances in Private Subnet 1, including the Testing Instance. Squid Instance 1 then initiates a connection with the destination host on behalf of the Testing Instance, which goes through the Internet gateway. This solution spans two Availability Zones, with Squid Instance 2 intercepting requests sent from the other Availability Zone. Note that you may adapt the solution to span additional Availability Zones.
Figure 1: The solution spans two Availability Zones

Intercepting and filtering traffic

In each Availability Zone, the route table associated to the private subnet sends the outbound traffic to the Squid instance (see Route Tables for a NAT Device). Squid intercepts the requested domain, then applies the following filtering policy:

  • For HTTP requests, Squid retrieves the host header field included in all HTTP/1.1 request messages. This specifies the Internet host being requested.
  • For HTTPS requests, the HTTP traffic is encapsulated in a TLS connection between the instance in the private subnet and the remote host. Squid cannot retrieve the host header field because the header is encrypted. A feature called SslBump would allow Squid to decrypt the traffic, but this would not be transparent for the client because the certificate would be considered invalid in most cases. The feature I use instead, called SslPeekAndSplice, retrieves the Server Name Indication (SNI) from the TLS initiation. The SNI contains the requested Internet host. As a result, Squid can make filtering decisions without decrypting the HTTPS traffic.

Note 1: Some older client-side software stacks do not support SNI. Here are the minimum versions of some important stacks and programming languages that support SNI: Python 2.7.9 and 3.2, Java 7 JSSE, wget 1.14, OpenSSL 0.9.8j, cURL 7.18.1

Note 2: TLS 1.3 introduced an optional extension that allows the client to encrypt the SNI, which may prevent Squid from intercepting the requested domain.
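
As a quick illustration of the client side of this mechanism, the following OpenSSL command starts a TLS handshake with an explicit SNI value, which is exactly the field that Squid peeks at (the host shown is just an example; any TLS-enabled host works):

# Initiate a TLS handshake and send the host name as the SNI
openssl s_client -connect calculator.s3.amazonaws.com:443 -servername calculator.s3.amazonaws.com < /dev/null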

The SslPeekAndSplice feature was introduced in Squid 3.5 and is implemented in the same Squid module as SslBump. To enable this module, Squid requires that you provide a certificate, though it will not be used to decode HTTPS traffic. The solution creates a certificate using OpenSSL.


mkdir /etc/squid/ssl
cd /etc/squid/ssl
openssl genrsa -out squid.key 4096
openssl req -new -key squid.key -out squid.csr -subj "/C=XX/ST=XX/L=squid/O=squid/CN=squid"
openssl x509 -req -days 3650 -in squid.csr -signkey squid.key -out squid.crt
cat squid.key squid.crt >> squid.pem        
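
As an optional check, you can display the subject and validity dates of the certificate you just generated:

openssl x509 -in squid.crt -noout -subject -dates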

The following code shows the Squid configuration file. For HTTPS traffic, note the ssl_bump directives instructing Squid to “peek” (retrieve the SNI) and then “splice” (become a TCP tunnel without decoding) or “terminate” the connection depending on the requested host.


visible_hostname squid
cache deny all

# Log format and rotation
logformat squid %ts.%03tu %6tr %>a %Ss/%03>Hs %<st %rm %ru %ssl::>sni %Sh/%<a %mt
logfile_rotate 10
debug_options rotate=10

# Handling HTTP requests
http_port 3128
http_port 3129 intercept
acl allowed_http_sites dstdomain "/etc/squid/whitelist.txt"
http_access allow allowed_http_sites

# Handling HTTPS requests
https_port 3130 cert=/etc/squid/ssl/squid.pem ssl-bump intercept
acl SSL_port port 443
http_access allow SSL_port
acl allowed_https_sites ssl::server_name "/etc/squid/whitelist.txt"
acl step1 at_step SslBump1
acl step2 at_step SslBump2
acl step3 at_step SslBump3
ssl_bump peek step1 all
ssl_bump peek step2 allowed_https_sites
ssl_bump splice step3 allowed_https_sites
ssl_bump terminate step2 all
http_access deny all       

The text file located at /etc/squid/whitelist.txt contains the list of whitelisted domains, with one domain per line. In this blog post, I’ll show you how to configure Squid to allow requests to *.amazonaws.com, which corresponds to AWS endpoints. Note that you can restrict access to a specific set of AWS services that you’ve defined (see Regions and Endpoints for a detailed list of endpoints), or you can set your own list of domains.
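
For example, a whitelist.txt that allows AWS endpoints only would contain the single line below; the leading dot instructs Squid to match the domain and all of its subdomains:

.amazonaws.com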

Note: An alternate approach is to use VPC endpoints to privately connect your VPC to supported AWS services without requiring access over the Internet (see VPC Endpoints). Some supported AWS services allow you to create a policy that controls the use of the endpoint to access AWS resources (see VPC Endpoint Policies, and VPC Endpoints for a list of supported services).

You may have noticed that Squid listens on port 3129 for HTTP traffic and 3130 for HTTPS. Because Squid cannot directly listen to 80 and 443, you have to redirect the incoming requests from instances in the private subnets to the Squid ports using iptables. You do not have to enable IP forwarding or add any FORWARD rule, as you would do with a standard NAT instance.


# Redirect incoming HTTP traffic (port 80) to the Squid intercept port
sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 3129
# Redirect incoming HTTPS traffic (port 443) to the Squid intercept port
sudo iptables -t nat -A PREROUTING -p tcp --dport 443 -j REDIRECT --to-port 3130

The solution stores the files squid.conf and whitelist.txt in an Amazon Simple Storage Service (Amazon S3) bucket and runs the following script every minute on the Squid instances to download and update the Squid configuration from S3. This makes it easy to maintain the Squid configuration from a central location. Note that the script first validates the files with squid -k parse and then reloads the configuration with squid -k reconfigure if no error was found.


        # Save the current configuration, then download the latest files from S3
        cp /etc/squid/* /etc/squid/old/
        aws s3 sync s3://<s3-bucket> /etc/squid
        # Validate the new configuration and reload it; roll back if parsing fails
        squid -k parse && squid -k reconfigure || (cp /etc/squid/old/* /etc/squid/; exit 1)
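
The solution schedules this synchronization itself, but as a sketch, assuming you saved the script above as /usr/local/bin/squid-conf-sync (a hypothetical path), the following crontab entry would run it every minute:

# Hypothetical path; the solution installs its own schedule
* * * * * /usr/local/bin/squid-conf-sync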

The solution then uses the CloudWatch Agent on the Squid instances to collect and store Squid logs in Amazon CloudWatch Logs. The log group /filtering-nat-instance/cache.log contains the error and debug messages that Squid generates and /filtering-nat-instance/access.log contains the access logs.

An access log record is a space-delimited string that has the following format:

<time> <response_time> <client_ip> <status_code> <size> <method> <url> <sni> <remote_host> <mime>

The following list describes the fields of an access log record:

  • time: Request time in seconds since epoch
  • response_time: Response time in milliseconds
  • client_ip: Client source IP address
  • status_code: Squid request status and HTTP response code sent to the client. For example, an HTTP request to an unallowed domain logs TCP_DENIED/403, and an HTTPS request to a whitelisted domain logs TCP_TUNNEL/200
  • size: Total size of the response sent to the client
  • method: Request method, like GET or POST
  • url: Request URL received from the client. Logged for HTTP requests only
  • sni: Domain name intercepted in the SNI. Logged for HTTPS requests only
  • remote_host: Squid hierarchy status and remote host IP address
  • mime: MIME content type. Logged for HTTP requests only

The following are some examples of access log records:


1563718817.184 14 10.0.0.28 TCP_DENIED/403 3822 GET http://example.com/ - HIER_NONE/- text/html
1563718821.573 7 10.0.0.28 TAG_NONE/200 0 CONNECT 172.217.7.227:443 example.com HIER_NONE/- -
1563718872.923 32 10.0.0.28 TCP_TUNNEL/200 22927 CONNECT 52.216.187.19:443 calculator.s3.amazonaws.com ORIGINAL_DST/52.216.187.19 -
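
To search for specific records, such as denied requests, you can query the log group with the AWS CLI. The following command is a sketch; the filter pattern is an assumption based on the status codes described above:

aws logs filter-log-events \
    --log-group-name /filtering-nat-instance/access.log \
    --filter-pattern TCP_DENIED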

Designing a high availability solution

The Squid instances introduce a single point of failure for the private subnets. If a Squid instance fails, the instances in its associated private subnet cannot send outbound traffic anymore. The following diagram illustrates the architecture that I propose to address this situation within an Availability Zone.
Figure 2: The architecture to address if a Squid instance fails within an Availability Zone

Each Squid instance is launched in an Amazon EC2 Auto Scaling group that has a minimum size and a maximum size of one instance. A shell script is run at startup to configure the instances. That includes installing and configuring Squid (see Running Commands on Your Linux Instance at Launch).

The solution uses the CloudWatch Agent and its procstat plugin to collect the CPU usage of the Squid process every 10 seconds. For each Squid instance, the solution creates a CloudWatch alarm that watches this custom metric and goes to an ALARM state when a data point is missing. This can happen, for example, when Squid crashes or the Squid instance fails. Note that for my use case, I consider watching the Squid process a sufficient approach to determining the health status of a Squid instance, although it cannot detect possible cases where the Squid process is alive but unable to forward traffic. As a workaround, you can use an end-to-end monitoring approach, like using witness instances in the private subnets to send test requests at regular intervals and report the results as a custom metric.
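
To give an idea of how such an alarm can be expressed, here is a hedged AWS CLI sketch; the solution actually creates the alarms through CloudFormation, and the metric name, dimensions, and ARNs below are assumptions:

# Sketch only: an alarm that goes to ALARM when the procstat metric is missing.
# CPU usage is never below 0, so the alarm stays OK while data points arrive,
# and --treat-missing-data breaching triggers it when they stop.
aws cloudwatch put-metric-alarm \
    --alarm-name squid-instance-1-health \
    --namespace CWAgent \
    --metric-name procstat_cpu_usage \
    --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
    --statistic Average \
    --period 60 \
    --evaluation-periods 1 \
    --threshold 0 \
    --comparison-operator LessThanThreshold \
    --treat-missing-data breaching \
    --alarm-actions arn:aws:sns:eu-west-1:111122223333:squid-health-topic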

When an alarm goes to ALARM state, CloudWatch sends a notification to an Amazon Simple Notification Service (SNS) topic which then triggers an AWS Lambda function. The Lambda function marks the Squid instance as unhealthy in its Auto Scaling group, retrieves the list of healthy Squid instances based on the state of other CloudWatch alarms, and updates the route tables that currently route traffic to the unhealthy Squid instance to instead route traffic to the first available healthy Squid instance. While the Auto Scaling group automatically replaces the unhealthy Squid instance, private instances can send outbound traffic through the Squid instance in the other Availability Zone.
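
The route update that the Lambda function performs is roughly equivalent to the following AWS CLI call (the IDs are placeholders):

# Point the private subnet's default route at a healthy Squid instance
aws ec2 replace-route \
    --route-table-id rtb-0123456789abcdef0 \
    --destination-cidr-block 0.0.0.0/0 \
    --instance-id i-0123456789abcdef0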

When the CloudWatch agent starts collecting the custom metric again on the replacement Squid instance, the alarm reverts to OK state. Similarly, CloudWatch sends a notification to the SNS topic, which then triggers the Lambda function. The Lambda function completes the lifecycle action (see Amazon EC2 Auto Scaling Lifecycle Hooks) to indicate that the replacement instance is ready to serve traffic, and updates the route table associated to the private subnet in the same Availability Zone to route traffic to the replacement instance.
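
Completing the lifecycle action amounts to a call like the following (the group and hook names are placeholders):

# Tell the Auto Scaling group that the replacement instance is ready
aws autoscaling complete-lifecycle-action \
    --auto-scaling-group-name squid-asg-1 \
    --lifecycle-hook-name squid-launch-hook \
    --instance-id i-0123456789abcdef0 \
    --lifecycle-action-result CONTINUE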

Implementing and testing the solution

Now that you understand the architecture behind this solution, you can follow the instructions in this section to implement and test the solution in your AWS account.

Implementing the solution

First, you’ll use AWS CloudFormation to provision the required resources. Select the Launch Stack button below to open the CloudFormation console and create a stack from the template. Then, follow the on-screen instructions.

Select this image to open a link that starts building the CloudFormation stack

CloudFormation will create the following resources:

  • An Amazon Virtual Private Cloud (Amazon VPC) with an internet gateway attached.
  • Two public subnets and two private subnets on the Amazon VPC.
  • Three route tables. The first route table is associated to the public subnets to make them publicly accessible. The other two route tables are associated to the private subnets.
  • An S3 bucket to store the Squid configuration files, and two Lambda-based custom resources to add the files squid.conf and whitelist.txt to this bucket.
  • An IAM role to grant the Squid instances permissions to read from the S3 bucket and use the CloudWatch agent.
  • A security group to allow HTTP and HTTPS traffic from instances in the private subnets.
  • A launch configuration to specify the template of Squid instances. That includes commands to run at startup for automating the initial configuration.
  • Two Auto Scaling groups that use this launch configuration to launch the Squid instances.
  • A Lambda function to redirect the outbound traffic and recover a Squid instance when it fails.
  • Two CloudWatch alarms to watch the custom metric sent by Squid instances and trigger the Lambda function when the health status of Squid instances changes.
  • An EC2 instance in the first private subnet to test the solution, and an IAM role to grant this instance permissions to use the SSM agent. Session Manager, which I introduce in the next paragraph, uses this SSM agent (see Working with SSM Agent).

Testing the solution

After the stack creation has completed (it can take up to 10 minutes), connect to the Testing Instance using Session Manager, a capability of AWS Systems Manager that lets you manage instances through an interactive shell without the need to open an SSH port:

  1. Open the AWS Systems Manager console.
  2. In the navigation pane, choose Session Manager.
  3. Choose Start Session.
  4. For Target instances, choose the option button to the left of Testing Instance.
  5. Choose Start Session.

Note: Session Manager makes calls to several AWS endpoints (see Working with SSM Agent). If you prefer to restrict access to a defined set of AWS services, make sure to whitelist the associated domains.

After the connection is made, you can test the solution with the following commands. Only the last three requests should return a valid response, because Squid allows traffic to *.amazonaws.com only.


curl http://www.amazon.com
curl https://www.amazon.com
curl http://calculator.s3.amazonaws.com/index.html
curl https://calculator.s3.amazonaws.com/index.html
aws ec2 describe-regions --region us-east-1         

To find the requests you just made in the access logs, here’s how to browse the Squid logs in Amazon CloudWatch Logs:

  1. Open the Amazon CloudWatch console.
  2. In the navigation pane, choose Logs.
  3. For Log Groups, choose the log group /filtering-nat-instance/access.log.
  4. Choose Search Log Group to view and search log records.

To test how the solution behaves when a Squid instance fails, you can terminate one of the Squid instances manually in the Amazon EC2 console. Then, watch the CloudWatch alarm change its state in the Amazon CloudWatch console, or watch the solution change the default route of the impacted route table in the Amazon VPC console.
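
You can also observe the failover from the command line. The following sketch (the route table ID is a placeholder) shows the current default route of a private subnet, whose target should switch to the remaining healthy Squid instance:

aws ec2 describe-route-tables \
    --route-table-ids rtb-0123456789abcdef0 \
    --query "RouteTables[].Routes[?DestinationCidrBlock=='0.0.0.0/0']"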

You can now delete the CloudFormation stack to clean up the resources that were just created.

Discussion: Transparent or forward proxy?

The solution that I describe in this blog is fully transparent for instances in the private subnets, which means that instances don’t need to be aware of the proxy and can make requests as if they were behind a standard NAT instance. An alternate solution is to deploy a forward proxy in your Amazon VPC and configure instances in private subnets to use it (see the blog post How to set up an outbound VPC proxy with domain whitelisting and content filtering for an example). In this section, I discuss some of the differences between the two solutions.

Supportability

A major drawback with forward proxies is that the proxy must be explicitly configured on every instance within the private subnets. For example, you can configure the HTTP_PROXY and HTTPS_PROXY environment variables on Linux instances, but some applications or services, like yum, require their own proxy configuration, or don’t support proxy usage. Note also that some AWS services and features, like Amazon EMR or Amazon SageMaker notebook instances, don’t support using a forward proxy at the time of this post. However, with TLS 1.3, a forward proxy is the only option to restrict outbound traffic if the SNI is encrypted.

Scalability

Deploying a forward proxy on AWS usually consists of a load balancer distributing traffic to a set of proxy instances launched in an Auto Scaling group. Proxy instances can be launched or terminated dynamically depending on the demand (also known as “horizontal scaling”). With a transparent proxy, each route table can route traffic to a single instance at a time, and changing the type of the instance is the only way to increase or decrease the capacity (also known as “vertical scaling”).

The solution I present in this post does not dynamically adapt the instance type of the Squid instances based on the demand. However, you might consider a mechanism in which the traffic from a private subnet is temporarily redirected through another Availability Zone while the Squid instance is being relaunched by Auto Scaling with a smaller or larger instance type.

Mutualization

Deploying a centralized proxy solution and using it across multiple VPCs is a way of reducing cost and operational complexity.

With a forward proxy, instances in private subnets send IP packets to the proxy load balancer. Therefore, sharing a forward proxy across multiple VPCs only requires connectivity between the “instance VPCs” and a proxy VPC that has VPC Peering or equivalent capabilities.

With a transparent proxy, instances in private subnets send IP packets to the remote host. VPC Peering does not support transitive routing (see Unsupported VPC Peering Configurations) and cannot be used to share a transparent proxy across multiple VPCs. However, you can now use AWS Transit Gateway, which acts as a network transit hub, to share a transparent proxy across multiple VPCs. I give an example in the next section.

Sharing the solution across multiple VPCs using AWS Transit Gateway

In this section, I give an example of how to share a transparent proxy across multiple VPCs using AWS Transit Gateway. The architecture is illustrated in the following diagram. For the sake of simplicity, the diagram does not include Availability Zones.
Figure 3: The architecture for a transparent proxy across multiple VPCs using AWS Transit Gateway

Here’s how instances in the private subnet of “VPC App” can make requests via the shared transparent proxy in “VPC Shared”:

  1. When instances in VPC App make HTTP/S requests, the network packets they send have the public IP address of the remote host as the destination address. These packets are forwarded to the transit gateway, based on the route table associated to the private subnet.
  2. The transit gateway receives the packets and forwards them to VPC Shared, based on the default route of the transit gateway route table.
  3. Note that the transit gateway attachment resides in the transit gateway subnet. When the packets arrive in VPC Shared, they are forwarded to the Squid instance because the next destination has been determined based on the route table associated to the transit gateway subnet.
  4. The Squid instance makes requests on behalf of the source instance (“Instances” in the schema). Then, it sends the response to the source instance. The packets that it emits have the IP address of the source instance as the destination address and are forwarded to the transit gateway according to the route table associated to the public subnet.
  5. The transit gateway receives and forwards the response packets to VPC App.
  6. Finally, the response reaches the source instance.

In a high availability deployment, you could have one transit gateway subnet per Availability Zone that sends traffic to the Squid instance that resides in the same Availability Zone, or to the Squid instance in another Availability Zone if the instance in the same Availability Zone fails.

You could also use AWS Transit Gateway to implement a transparent proxy solution that scales horizontally. This allows you to add or remove proxy instances based on the demand, instead of changing the instance type. With this approach, you must deploy a fleet of proxy instances – launched by an Auto Scaling group, for example – and establish a VPN connection between each instance and the transit gateway. The proxy instances need to support ECMP (“Equal Cost Multipath routing”; see Transit Gateways) to equally spread the outbound traffic between instances. I don’t describe this alternative architecture further in this blog post.

Conclusion

In this post, I’ve shown how you can use Squid to implement a high availability solution that filters outgoing traffic to the Internet and helps meet your security and compliance needs, while being fully transparent for the back-end instances in your VPC. I’ve also discussed the key differences between transparent proxies and forward proxies. Finally, I gave an example of how to share a transparent proxy solution across multiple VPCs using AWS Transit Gateway.

If you have any questions or suggestions, please leave a comment below or on the Amazon VPC forum.

If you have feedback about this blog post, submit comments in the Comments section below.

Want more AWS Security news? Follow us on Twitter.

Nicolas Malaval

Nicolas is a Solution Architect for Amazon Web Services. He lives in Paris and helps our healthcare customers in France adopt cloud technology and innovate with AWS. Before that, he spent three years as a Consultant for AWS Professional Services, working with enterprise customers.

How to Record SSH Sessions Established Through a Bastion Host

Post Syndicated from Nicolas Malaval original https://blogs.aws.amazon.com/security/post/Tx2HSURNJZPUP68/How-to-Record-SSH-Sessions-Established-Through-a-Bastion-Host

A bastion host is a server whose purpose is to provide access to a private network from an external network, such as the Internet. Because of its exposure to potential attack, a bastion host must minimize the chances of penetration. For example, you can use a bastion host to mitigate the risk of allowing SSH connections from an external network to the Linux instances launched in a private subnet of your Amazon Virtual Private Cloud (VPC).

In this blog post, I will show you how to leverage a bastion host to record all SSH sessions established with Linux instances. Recording SSH sessions enables auditing and can help in your efforts to comply with regulatory requirements.

The solution architecture

In this section, I present the architecture of this solution and explain how you can configure the bastion host to record SSH sessions. Later in this post, I provide instructions about how to implement and test the solution.

Amazon VPC enables you to launch AWS resources on a virtual private network that you have defined. The bastion host runs on an Amazon EC2 instance that is typically in a public subnet of your Amazon VPC. Linux instances are in a subnet that is not publicly accessible, and they are set up with a security group that allows SSH access from the security group attached to the underlying EC2 instance running the bastion host. Bastion host users connect to the bastion host to connect to the Linux instances, as illustrated in the following diagram.

You can adapt this architecture to meet your own requirements. For example, you could have the bastion host in a separate Amazon VPC and a VPC peering connection between the two Amazon VPCs. What matters is that the bastion host remains the only source of SSH traffic to your Linux instances.

This blog post’s solution for recording SSH sessions resides on the bastion host only and requires no specific configuration of Linux instances. You configure the solution by running commands at launch as the root user on an Amazon Linux instance.

Note: It is a best practice to harden your bastion host because it is a critical point of network security. Hardening might include disabling unnecessary applications or services, tuning the network stack, and the like. I do not discuss hardening in detail in this blog post.

When a client connects to an Amazon Linux instance, the default behavior of OpenSSH, the SSH server, is to run an interactive shell. Instead, I configure OpenSSH to execute a custom script that wraps an interactive shell into a script command. By doing so, the script command records everything displayed on the terminal, including keyboard input and full-screen applications such as vim. You can replay the session from the resulting log files by using the scriptreplay command. See the step-by-step example in the “Testing the solution” section later in this blog post.
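
If you want to see how these two commands work before deploying anything, here is a minimal local sketch (the file names are arbitrary):

# Start recording; everything displayed on the terminal is captured
script --timing=demo.time demo.data
# ...type a few commands, then "exit" to stop recording...
# Replay the recorded session with realistic delays
scriptreplay --timing=demo.time demo.data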

Note that I intentionally block a few SSH features because they would allow users to create a direct connection between their local computer and the Linux instances, thereby bypassing the solution.

# Create a new folder for the log files
mkdir /var/log/bastion

# Allow ec2-user only to access this folder and its content
chown ec2-user:ec2-user /var/log/bastion
chmod -R 770 /var/log/bastion
setfacl -Rdm other:0 /var/log/bastion

# Make OpenSSH execute a custom script on logins
echo -e "\nForceCommand /usr/bin/bastion/shell" >> /etc/ssh/sshd_config

# Block some SSH features that bastion host users could use to circumvent 
# the solution
awk '!/AllowTcpForwarding/' /etc/ssh/sshd_config > temp && mv temp /etc/ssh/sshd_config
awk '!/X11Forwarding/' /etc/ssh/sshd_config > temp && mv temp /etc/ssh/sshd_config
echo "AllowTcpForwarding no" >> /etc/ssh/sshd_config
echo "X11Forwarding no" >> /etc/ssh/sshd_config

mkdir /usr/bin/bastion

cat > /usr/bin/bastion/shell << 'EOF'

# Check that the SSH client did not supply a command
if [[ -z $SSH_ORIGINAL_COMMAND ]]; then

  # The format of log files is /var/log/bastion/YYYY-MM-DD_HH-MM-SS_user
  LOG_FILE="`date --date="today" "+%Y-%m-%d_%H-%M-%S"`_`whoami`"
  LOG_DIR="/var/log/bastion/"

  # Print a welcome message
  echo ""
  echo "NOTE: This SSH session will be recorded"
  echo "AUDIT KEY: $LOG_FILE"
  echo ""

  # I suffix the log file name with a random string. I explain why 
  # later on.
  SUFFIX=`mktemp -u _XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX`

  # Wrap an interactive shell into "script" to record the SSH session
  script -qf --timing=$LOG_DIR$LOG_FILE$SUFFIX.time $LOG_DIR$LOG_FILE$SUFFIX.data --command=/bin/bash

else

  # The "script" program could be circumvented with some commands 
  # (e.g. bash, nc). Therefore, I intentionally prevent users 
  # from supplying commands.

  echo "This bastion supports interactive sessions only. Do not supply a command"
  exit 1

fi

EOF

# Make the custom script executable
chmod a+x /usr/bin/bastion/shell

# Bastion host users could overwrite and tamper with an existing log file 
# using "script" if they knew the exact file name. I take several measures 
# to obfuscate the file name:
# 1. Add a random suffix to the log file name.
# 2. Prevent bastion host users from listing the folder containing log 
# files. 
# This is done by changing the group owner of "script" and setting GID.
chown root:ec2-user /usr/bin/script
chmod g+s /usr/bin/script

# 3. Prevent bastion host users from viewing processes owned by other 
# users, because the log file name is one of the "script" 
# execution parameters.
mount -o remount,rw,hidepid=2 /proc
awk '!/proc/' /etc/fstab > temp && mv temp /etc/fstab
echo "proc /proc proc defaults,hidepid=2 0 0" >> /etc/fstab

# Restart the SSH service to apply /etc/ssh/sshd_config modifications.
service sshd restart

The preceding commands make OpenSSH execute a custom script on login, which records SSH sessions into log files stored in the folder /var/log/bastion. For durable storage, the log files are copied at a regular interval to an Amazon S3 bucket, using the aws s3 cp command, which follows.

cat > /usr/bin/bastion/sync_s3 << 'EOF'
# Copy log files to S3 with server-side encryption enabled.
# Then, if successful, delete log files that are older than a day.
LOG_DIR="/var/log/bastion/"
aws s3 cp $LOG_DIR s3://bucket-name/logs/ --sse --region region --recursive && find $LOG_DIR* -mtime +1 -exec rm {} \;

EOF

chmod 700 /usr/bin/bastion/sync_s3

At this point, OpenSSH is configured to record SSH sessions and the log files are copied to Amazon S3. In order to determine the origin of any action performed on the Linux instances using SSH, bastion host users are provided with personal user accounts on the bastion host and they use their personal SSH key pair to log in. Each user account receives the minimum required privileges so that bastion host users are unable to disable or tamper with the solution.

To ease the management of user accounts, the SSH public key of each bastion host user is uploaded to an S3 bucket. At a regular interval, the bastion host retrieves the public keys available in this bucket. For each public key, a user account is created if it does not already exist, and the SSH public key is copied to the bastion host to allow the user to log in with this key pair. For example, if the bastion host finds a file, john.pub, in the bucket, which is John’s SSH public key, it creates a user account, john, and the public key is copied to /home/john/.ssh/authorized_keys. If an SSH public key were to be removed from the S3 bucket, the bastion host would delete the related user account. Personal user account creations and deletions are logged in /var/log/bastion/users_changelog.txt.

The following commands create a shell script for managing personal user accounts and schedule a cron job to run the shell script every 5 minutes.

# Bastion host users should log in to the bastion host with 
# their personal SSH key pair. The public keys are stored on 
# S3 with the following naming convention: "username.pub". This 
# script retrieves the public keys, creates or deletes local user 
# accounts as needed, and copies the public key to 
# /home/username/.ssh/authorized_keys

cat > /usr/bin/bastion/sync_users << 'EOF'

# The file will log user changes
LOG_FILE="/var/log/bastion/users_changelog.txt"

# The function returns the user name from the public key file name.
# Example: public-keys/sshuser.pub => sshuser
get_user_name () {
  echo "$1" | sed -e 's/.*\///g' | sed -e 's/\.pub//g'
}

# For each public key available in the S3 bucket
aws s3api list-objects --bucket bucket-name --prefix public-keys/ --region region --output text --query 'Contents[?Size>`0`].Key' | sed -e 'y/\t/\n/' > ~/keys_retrieved_from_s3
while read line; do
  USER_NAME="`get_user_name "$line"`"

  # Make sure the user name is alphanumeric
  if [[ "$USER_NAME" =~ ^[a-z][-a-z0-9]*$ ]]; then

    # Create a user account if it does not already exist
    cut -d: -f1 /etc/passwd | grep -qx $USER_NAME
    if [ $? -eq 1 ]; then
      /usr/sbin/adduser $USER_NAME && \
      mkdir -m 700 /home/$USER_NAME/.ssh && \
      chown $USER_NAME:$USER_NAME /home/$USER_NAME/.ssh && \
      echo "$line" >> ~/keys_installed && \
      echo "`date --date="today" "+%Y-%m-%d %H-%M-%S"`: Creating user account for $USER_NAME ($line)" >> $LOG_FILE
    fi

    # Copy the public key from S3, if a user account was created 
    # from this key
    if [ -f ~/keys_installed ]; then
      grep -qx "$line" ~/keys_installed
      if [ $? -eq 0 ]; then
        aws s3 cp s3://bucket-name/$line /home/$USER_NAME/.ssh/authorized_keys --region region
        chmod 600 /home/$USER_NAME/.ssh/authorized_keys
        chown $USER_NAME:$USER_NAME /home/$USER_NAME/.ssh/authorized_keys
      fi
    fi

  fi
done < ~/keys_retrieved_from_s3

# Remove user accounts whose public key was deleted from S3
if [ -f ~/keys_installed ]; then
  sort -uo ~/keys_installed ~/keys_installed
  sort -uo ~/keys_retrieved_from_s3 ~/keys_retrieved_from_s3
  comm -13 ~/keys_retrieved_from_s3 ~/keys_installed | sed "s/\t//g" > ~/keys_to_remove
  while read line; do
    USER_NAME="`get_user_name "$line"`"
    echo "`date --date="today" "+%Y-%m-%d %H-%M-%S"`: Removing user account for $USER_NAME ($line)" >> $LOG_FILE
    /usr/sbin/userdel -r -f $USER_NAME
  done < ~/keys_to_remove
  comm -3 ~/keys_installed ~/keys_to_remove | sed "s/\t//g" > ~/tmp && mv ~/tmp ~/keys_installed
fi

EOF

chmod 700 /usr/bin/bastion/sync_users

cat > ~/mycron << EOF
*/5 * * * * /usr/bin/bastion/sync_s3
*/5 * * * * /usr/bin/bastion/sync_users
0 0 * * * yum -y update --security
EOF
crontab ~/mycron
rm ~/mycron

Be very careful when distributing the key pair associated with the instance running the bastion host. With this key pair, someone could log in as the root or ec2-user user and damage or tamper with the solution. You might even consider launching the instance without a key pair if the operations requiring root access, such as patching, can be scripted and automated.

Also, you should restrict the permissions on the S3 bucket by using bucket or IAM policies. For example, you could make the log files readable only by a compliance team and the SSH public keys managed by a DevOps team.
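
As an illustrative sketch (the bucket name, account ID, and role name are hypothetical), a bucket policy granting a compliance team’s role read access to the log files could look like the following:

# Hypothetical names; adapt the ARNs and bucket name to your own account
cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowComplianceTeamToReadLogs",
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::111122223333:role/ComplianceTeam"},
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::bucket-name/logs/*"
    }
  ]
}
EOF
aws s3api put-bucket-policy --bucket bucket-name --policy file://policy.json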

Implementing the solution

Now that you understand the architecture of this solution, you can follow the instructions in this section to implement this blog post’s solution in your AWS account.

First, you will create two new key pairs. The first key pair will be associated with the instance running the bastion host. The second key pair will be associated with an Amazon Linux instance launched in a private subnet and will be used as the SSH key pair for a bastion host user.

To manually create the two key pairs:

  1. Open the Amazon EC2 console and select a region from the navigation bar.
  2. Click Key Pairs in the left pane.
  3. Click Create Key Pair.
  4. In the Key pair name box, type bastion and click Create. Your browser downloads the private key file as bastion.pem.
  5. Repeat Steps 3 and 4 to create another key pair and name it sshuser.

You then will use AWS CloudFormation to provision the required resources. Click Create a Stack to open the CloudFormation console and create a CloudFormation stack from the template. Click Next and enter bastion in BastionKeyPair and sshuser in InstanceKeyPair. Then, follow the on-screen instructions.

CloudFormation creates the following resources:

  • An Amazon VPC with an Internet gateway attached.
  • A public subnet on this Amazon VPC with a new route table to make it publicly accessible.
  • A private subnet on this Amazon VPC that will not have access to the Internet, for the sake of simplicity.
  • An S3 bucket. The log files will be stored in a folder called logs and the SSH public keys in a folder called public-keys.
  • Two security groups. The first security group allows SSH traffic from the Internet, and the second security group allows SSH traffic from the first security group.
  • An IAM role to grant an EC2 instance permissions to upload log files to the S3 bucket and to read SSH public keys.
  • An Amazon Linux instance running the bastion host in the public subnet with the IAM role attached and the user data script entered to configure the solution.
  • An Amazon Linux instance in the private subnet.

After the stack creation has completed (it can take up to 10 minutes), click the Outputs tab in the CloudFormation console and note the values that the process returned: the name of the new S3 bucket, the public IP address of the bastion host, and the private IP address of the Linux instance.

Finally, you will upload the SSH public key for the key pair sshuser to the S3 bucket so that the bastion host creates a new user account:

  1. Retrieve the public key and save it locally to a file named sshuser.pub (see Retrieving the Public Key for Your Key Pair on Linux or Retrieving the Public Key for Your Key Pair on Windows).
  2. Open the S3 console and click the name of the bucket in the buckets list.
  3. Create a new folder called public-keys (see Creating a Folder), and upload the SSH public key sshuser.pub to this folder (see Uploading Objects into Amazon S3).
  4. Wait for a few minutes. You should see a new folder called logs in the bucket with a new file inside it that is recording events related to user account creation and deletion.

Testing the solution

You might recall that the key pair sshuser serves to log in to both the bastion host as a bastion host user and the Linux instance in the private subnet as the privileged ec2-user user. Therefore, to test this solution, you will use SSH agent forwarding to connect from the bastion host to the Linux instance without storing the private key on the bastion host. See this blog post for further information about SSH agent forwarding.

First, you will log in to the bastion host as sshuser with the -A argument to forward the SSH private key.

chmod 600 [path to sshuser.pem]
ssh-add [path to sshuser.pem]
ssh -A sshuser@[public IP of the bastion host] -i [path to sshuser.pem]

You should see a welcome message saying that the SSH session will be recorded, as shown in the following screenshot.

Write down the value of the audit key. Then, connect to the Linux instance, run some commands of your choice, and then close the SSH session.

ssh ec2-user@[private IP of the Linux instance]
[commands of your choice that will be recorded]
exit
exit

You will now replay the SSH session that was just recorded. Each session has two log files: one file contains the data displayed on the terminal, and the other contains the timing data that enables replay with realistic typing and output delays. For simplicity, you will connect as ec2-user to the bastion host and replay from the local copy of log files. 

Note: Under normal circumstances, you would not replay an SSH session on the bastion host, for two reasons. First, recall that you should strictly avoid using the privileged user account, ec2-user. Second, the bastion host does not have read permissions on the folder logs in the S3 bucket, and the log files that are older than a day are deleted from the bastion host. Instead, you would use another Linux instance with sufficient permissions on the S3 bucket to download and replay the log files.

ssh ec2-user@[public IP of the bastion host] -i [path to bastion.pem]
export LOGFILE=`ls /var/log/bastion/[audit key]*.data | cut -d. -f1`
scriptreplay --timing=$LOGFILE.time $LOGFILE.data

You can now delete the CloudFormation stack to clean up the resources that were just created. Note that you need to empty the S3 bucket before it can be deleted by CloudFormation (see Empty a Bucket).

Conclusion

A bastion host is a standard element of network security that provides secure access to private networks over SSH. In this blog post, I have shown you a way to leverage this bastion host to record SSH sessions, which you can play back and use for auditing purposes.

If you have comments, submit them in the “Comments” section below. If you have questions, please start a new thread on the Amazon VPC forum.

– Nicolas

How to Add DNS Filtering to Your NAT Instance with Squid

Post Syndicated from Nicolas Malaval original https://blogs.aws.amazon.com/security/post/TxFRX7UFUIT2GD/How-to-Add-DNS-Filtering-to-Your-NAT-Instance-with-Squid

Amazon Virtual Private Cloud (Amazon VPC) enables you to launch AWS resources on a virtual private network that you’ve defined. In an Amazon VPC, network address translation (NAT) instances, and more recently NAT gateways, are commonly used to enable instances in a private subnet to initiate outbound traffic to the Internet, but prevent the instances from receiving inbound traffic initiated by someone on the Internet.

For security and compliance purposes, you might have to filter the requests initiated by these instances. Using iptables rules, you could restrict outbound traffic with your NAT instance based on a predefined destination port or IP address. However, you may need to enforce more complex policies, such as allowing requests to AWS endpoints only, which cannot be achieved easily by using iptables rules.

In this post, I discuss and give an example of how Squid, a leading open-source proxy, can restrict both HTTP and HTTPS outbound traffic to a given set of Internet domains, while being fully transparent for instances in the private subnet. First, I explain briefly how to create the infrastructure resources required for this approach. Then, I provide step-by-step instructions to install, configure, and test Squid as a transparent proxy.

Keep in mind that a possible alternative solution could be to deploy a proxy (also known as a forward proxy) in your Amazon VPC. However, a major drawback is that the proxy must be explicitly configured on every instance in the private subnet. This could cause connectivity issues that would be difficult to troubleshoot, if not impossible to remediate, when the application does not support proxy usage.

Deploying the example

The following steps are for manually creating and configuring the required resources. Alternatively, you could use AWS CloudFormation to automate this procedure. Click Create a Stack to open the CloudFormation console and create an AWS CloudFormation stack from the template I developed. Follow the on-screen instructions and go directly to the “Testing the deployment” section later in this blog post when the stack creation has completed (it can take up to 20 minutes).

Note: If you need to allow requests to S3 buckets in the same region as your VPC, you could use a VPC endpoint for Amazon S3 instead (see VPC Endpoints). It enables you to create a private connection between your VPC and another AWS service without requiring access over the Internet. It also has a policy that controls the use of the endpoint to access Amazon S3 resources.

Set up a VPC and create two Amazon EC2 instances

The following steps take you through the manual creation and configuration of a VPC and two EC2 instances: one in a public subnet for deploying Squid, and another in a private subnet for testing the configuration. See the NAT Instances documentation for further details, because the prerequisites are similar.

To create and configure a VPC and two EC2 instances:

  1. Create a VPC (see Creating a VPC, if you need help with this step).
  2. Create two subnets (see Creating a Subnet): one called “Public Subnet” and another called “Private Subnet.”
  3. Create and attach an Internet Gateway to the VPC (see Attaching an Internet Gateway).
  4. Add a rule to the default route table that sends traffic destined outside the VPC (0.0.0.0/0) to the Internet Gateway (see Adding and Removing Routes from a Route Table).
  5. Add a rule to the VPC default security group that allows ingress SSH traffic (TCP 22) from 0.0.0.0/0 (see Adding Rules to a Security Group).
  6. Launch an Amazon Linux t2.micro instance called “Squid Instance” in the Public Subnet (make sure to use the Amazon Linux AMI, not the NAT instance AMI). Enable Auto-assign Public IP, choose the VPC default security group as the security group to attach, and select a valid key pair. Leave all other parameters as default (see Launching an Instance).
  7. After the instance starts running, disable the source/destination check (see Disable Source/Destination Checks).
  8. Create a new route table in the VPC. Add a rule that sends traffic destined outside the VPC (0.0.0.0/0) to the Squid instance and associate this new route table with the Private Subnet.
  9. Create an instance role called “Testing Instance Role” and attach the managed policy AmazonEC2ReadOnlyAccess (see detailed instructions).
  10. Finally, launch another Amazon Linux t2.micro instance called “Testing Instance” in the Private Subnet (make sure to use the Amazon Linux AMI, not the NAT instance AMI). Select Testing Instance Role as the instance role, attach the VPC default security group, and select the same key pair. Leave all other parameters as default.

The following diagram illustrates how the components in this process interact with each other. Squid Instance intercepts HTTP/S requests sent by Testing Instance. Squid Instance then initiates a connection with the destination host on behalf of Testing Instance, which goes through the Internet gateway.

Installing Squid

Squid intercepts the requested domain before applying the filtering policy:

For HTTP, Squid retrieves the Host header field included in all HTTP/1.1 request messages, which specifies the Internet host being requested.

For HTTPS, the HTTP traffic is encapsulated in a TLS connection between the instance in the private subnet and the remote host. Squid cannot retrieve the Host header field because the header is encrypted. A feature called SslBump would allow Squid to decrypt the traffic, but this would not be transparent for the client because the certificate would be considered invalid in most cases. The feature we use instead, called SslPeekAndSplice, retrieves the Server Name Indication (SNI) from the TLS initiation, which contains the requested Internet host. As a result, Squid can make filtering decisions without decrypting the HTTPS traffic.

Note: Some older client-side software stacks do not support SNI. These are the minimum versions of some important stacks and programming languages that support SNI: Python 2.7.9 and 3.2, Java 7 JSSE, wget 1.14, OpenSSL 0.9.8j, cURL 7.18.1

The feature SslPeekAndSplice was introduced in Squid 3.5. However, when this post was written, the Amazon Linux repository included Squid 3.1.10 (you can check whether Squid 3.5 is available using the command yum info squid). For the purpose of this post, I will compile and install Squid 3.5 from the official source code.

Note: This Squid installation includes the minimum features required for this example and is not intended for production purposes. You may prefer to adapt it to your own needs and install Squid from your own Red Hat Package Manager (RPM) package or from unofficial RPM packages for CentOS 6.

Squid installation instructions

To manually compile, install, and configure Squid on the Squid instance:

Connect to your Squid instance using SSH with the user ec2-user.

Install the prerequisite packages.

sudo yum update -y
sudo yum install -y perl gcc autoconf automake make sudo wget gcc-c++ libxml2-devel libcap-devel libtool libtool-ltdl-devel openssl openssl-devel

Go to this Squid page and retrieve the link to the tar.gz source code archive for the latest release (3.5.13 was the last release when this post was written). Use this link to download and extract the archive on the Squid instance.

SQUID_ARCHIVE=http://www.squid-cache.org/Versions/v3/3.5/squid-3.5.13.tar.gz
cd /tmp
wget $SQUID_ARCHIVE
tar xvf squid*.tar.gz
cd $(basename squid*.tar.gz .tar.gz)

Compile and install Squid with the minimum required options. This may take up to 15 minutes.

sudo ./configure --prefix=/usr --exec-prefix=/usr --libexecdir=/usr/lib64/squid --sysconfdir=/etc/squid --sharedstatedir=/var/lib --localstatedir=/var --libdir=/usr/lib64 --datadir=/usr/share/squid --with-logdir=/var/log/squid --with-pidfile=/var/run/squid.pid --with-default-user=squid --disable-dependency-tracking --enable-linux-netfilter --with-openssl --without-nettle

sudo make
sudo make install

Complete the Squid installation.

sudo adduser -M squid
sudo chown -R squid:squid /var/log/squid /var/cache/squid
sudo chmod 750 /var/log/squid /var/cache/squid
sudo touch /etc/squid/squid.conf
sudo chown -R root:squid /etc/squid/squid.conf
sudo chmod 640 /etc/squid/squid.conf
cat | sudo tee /etc/init.d/squid <<'EOF'
#!/bin/sh
# chkconfig: - 90 25
echo -n 'Squid service'
case "$1" in
start)
/usr/sbin/squid
;;
stop)
/usr/sbin/squid -k shutdown
;;
reload)
/usr/sbin/squid -k reconfigure
;;
*)
echo "Usage: `basename $0` {start|stop|reload}"
;;
esac
EOF
sudo chmod +x /etc/init.d/squid
sudo chkconfig squid on

Note: If you have installed Squid from a RPM package, you are not required to follow the previous instructions for installing Squid before proceeding to the next steps, because your Squid instance already has the required configuration.

Configuring and starting Squid

The SslPeekAndSplice feature is implemented in the same Squid module as SslBump. To enable this module, Squid requires that we provide a certificate, though it will not be used to decode HTTPS traffic. I create a certificate using OpenSSL.

sudo mkdir /etc/squid/ssl
cd /etc/squid/ssl
sudo openssl genrsa -out squid.key 2048
sudo openssl req -new -key squid.key -out squid.csr -subj "/C=XX/ST=XX/L=squid/O=squid/CN=squid"
sudo openssl x509 -req -days 3650 -in squid.csr -signkey squid.key -out squid.crt
sudo cat squid.key squid.crt | sudo tee squid.pem

Next, configure Squid to allow requests to *.amazonaws.com, which corresponds to AWS endpoints. Note that you can restrict access to a defined set of AWS services only. See AWS Regions and Endpoints for a detailed list of endpoints.

For HTTPS traffic, note the ssl_bump directives instructing Squid to “peek” (retrieve the SNI) and then “splice” (become a TCP tunnel without decoding) or “terminate” the connection depending on the requested host.

cat | sudo tee /etc/squid/squid.conf <<EOF
visible_hostname squid

#Handling HTTP requests
http_port 3129 intercept
acl allowed_http_sites dstdomain .amazonaws.com
#acl allowed_http_sites dstdomain [you can add other domains to permit]
http_access allow allowed_http_sites

#Handling HTTPS requests
https_port 3130 cert=/etc/squid/ssl/squid.pem ssl-bump intercept
acl SSL_port port 443
http_access allow SSL_port
acl allowed_https_sites ssl::server_name .amazonaws.com
#acl allowed_https_sites ssl::server_name [you can add other domains to permit]
acl step1 at_step SslBump1
acl step2 at_step SslBump2
acl step3 at_step SslBump3
ssl_bump peek step1 all
ssl_bump peek step2 allowed_https_sites
ssl_bump splice step3 allowed_https_sites
ssl_bump terminate step2 all

http_access deny all
EOF

You may have noticed that Squid listens on port 3129 for HTTP traffic and 3130 for HTTPS. Because Squid cannot directly listen to 80 and 443, we have to redirect the incoming requests from instances in the private subnet to the Squid ports using iptables. You do not have to enable IP Forwarding or add any FORWARD rule, as you would do with a standard NAT instance.

sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 3129
sudo iptables -t nat -A PREROUTING -p tcp --dport 443 -j REDIRECT --to-port 3130

You can now start Squid.

sudo service squid start
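
As a quick sanity check (assuming net-tools is installed, which it is by default on Amazon Linux), you can confirm that Squid is listening on the two intercept ports:

sudo netstat -lntp | grep squid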

Testing the deployment

The Testing Instance that was launched earlier can be used to test the configuration. Because this instance is not accessible from the Internet, you have to jump onto the Squid instance to log on to Testing Instance.

Because both instances were launched with the same key pair, you can connect to Squid Instance using the -A argument in order to forward the SSH key to Testing Instance.

ssh-add [key]
ssh -A ec2-user@[public IP of the Squid instance] -i [key]
ssh ec2-user@[private IP of the client instance] -i [key]

You can test the transparent proxy instance with the following commands. Only the last three requests should return a valid response, because Squid allows traffic to *.amazonaws.com only.

curl http://www.amazon.com
curl https://www.amazon.com
curl http://calculator.s3.amazonaws.com/index.html
curl https://calculator.s3.amazonaws.com/index.html
aws ec2 describe-regions --region us-east-1

You can now clean up the resources you just created.

Summary

In this blog post, I have shown how you can use Squid to filter outgoing traffic to the Internet and help meet your security and compliance needs, while being fully transparent for the back-end instances in your VPC.

I invite you to adapt this example to your own requirements. For example, you may implement a high-availability solution, similar to the solution described in High Availability for Amazon VPC NAT Instances: An Example; centralize Squid metrics and access logs, similar to the solution described in Using Squid Proxy Instances for Web Service Access in Amazon VPC: Another Example with AWS CodeDeploy and Amazon CloudWatch; or leverage other Squid features (logging, caching, etc.) for further visibility and control over outbound traffic.

If you have any questions or suggestions, please leave a comment below or on the IAM forum.

– Nicolas