Tag Archives: Amazon VPC

Reduce Cost and Increase Security with Amazon VPC Endpoints

Post Syndicated from Nigel Harris original https://aws.amazon.com/blogs/architecture/reduce-cost-and-increase-security-with-amazon-vpc-endpoints/

Introduction

This blog explains the benefits of using Amazon VPC endpoints and highlights a self-paced workshop that will help you to learn more about them. Amazon Virtual Private Cloud (Amazon VPC) enables you to launch AWS resources into a virtual network that you’ve defined. This virtual network resembles a traditional network that you’d operate in your own data center, with the benefits of using the scalable infrastructure of AWS.

A VPC endpoint allows you to privately connect your VPC to supported AWS services without requiring an Internet gateway, NAT device, VPN connection, or AWS Direct Connect connection. Endpoints are virtual devices that are horizontally scaled, redundant, and highly available VPC components. They allow communication between instances in your VPC and services without imposing availability risks or bandwidth constraints on your network traffic.

VPC endpoints enable you to reduce data transfer charges resulting from network communication between private VPC resources (such as Amazon Elastic Compute Cloud, or EC2, instances) and AWS services (such as Amazon Quantum Ledger Database, or QLDB). Without VPC endpoints configured, communications that originate from within a VPC and are destined for public AWS services must egress to the public internet to reach those services. This network path incurs outbound data transfer charges. Data transfer charges for traffic egressing from Amazon EC2 to the internet vary based on volume. However, at the time of writing, after the first 1 GB per month (at $0.00 per GB), transfers are charged at a rate of $0.09/GB in the US East (N. Virginia) Region. With VPC endpoints configured, communication between your VPC and the associated AWS service does not leave the Amazon network. If your workload requires you to transfer significant volumes of data between your VPC and AWS, you can reduce costs by leveraging VPC endpoints.
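
For example, at that rate, a workload transferring 5 TB (5,120 GB) per month from EC2 to a public AWS service endpoint over the internet path would incur roughly 5,120 GB × $0.09/GB ≈ $460 in monthly outbound data transfer charges; routing the same traffic through a VPC endpoint keeps it on the Amazon network instead.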

There are two types of VPC endpoints: interface endpoints and gateway endpoints. Amazon Simple Storage Service (S3) and Amazon DynamoDB are accessed using gateway endpoints. You can configure resource policies on both the gateway endpoint and the AWS resource that the endpoint provides access to. A VPC endpoint policy is an AWS Identity and Access Management (AWS IAM) resource policy that you can attach to an endpoint. It is a separate policy for controlling access from the endpoint to the specified service. This enables granular access control and private network connectivity from within a VPC. For example, you could create a policy that restricts access to a specific DynamoDB table through a VPC endpoint.
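
As an illustrative sketch of that example (the table name, account ID, and Region are placeholders), a gateway endpoint policy limiting access to a single DynamoDB table and its indexes might look like the following:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowAccessToSpecificTableOnly",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "dynamodb:*",
            "Resource": [
                "arn:aws:dynamodb:us-east-1:123456789012:table/MyTable",
                "arn:aws:dynamodb:us-east-1:123456789012:table/MyTable/index/*"
            ]
        }
    ]
}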

Figure 1: Accessing S3 via a Gateway VPC Endpoint

Interface endpoints enable you to connect to services powered by AWS PrivateLink. This includes a large number of AWS services, services hosted by other AWS customers and partners in their own VPCs, and supported AWS Marketplace partner services. Like gateway endpoints, interface endpoints can be secured using resource policies on the endpoint itself and the resource that the endpoint provides access to. Interface endpoints enable the use of security groups to restrict access to the endpoint.
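
As a sketch of how this is provisioned (all resource IDs are placeholders, and the service name assumes the QLDB session endpoint in us-east-1), you might create an interface endpoint for QLDB with the AWS CLI:

aws ec2 create-vpc-endpoint \
    --vpc-endpoint-type Interface \
    --vpc-id vpc-0123456789abcdef0 \
    --service-name com.amazonaws.us-east-1.qldb.session \
    --subnet-ids subnet-0123456789abcdef0 \
    --security-group-ids sg-0123456789abcdef0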

Figure 2: Accessing QLDB via an Interface VPC Endpoint

In larger multi-account AWS environments, network design can vary considerably. Consider an organization that has built a hub-and-spoke network with AWS Transit Gateway. VPCs have been provisioned into multiple AWS accounts, perhaps to facilitate network isolation or to enable delegated network administration. When deploying distributed architectures such as this, a popular approach is to build a “shared services” VPC, which provides access to services required by workloads in each of the VPCs. This might include directory services or VPC endpoints. Sharing resources from a central location instead of building them in each VPC may reduce administrative overhead and cost. This approach was outlined by my colleague Bhavin Desai in his blog post Centralized DNS management of hybrid cloud with Amazon Route 53 and AWS Transit Gateway.

Figure 3: Centralized VPC Endpoints (multiple VPCs)

Alternatively, an organization may have centralized its network and chosen to leverage VPC sharing to enable multiple AWS accounts to create application resources (such as Amazon EC2 instances, Amazon Relational Database Service (RDS) databases, and AWS Lambda functions) in a shared, centrally managed network. With either pattern, establishing a granular set of controls to limit access to resources can be critical to support organizational security and compliance objectives while maintaining operational efficiency.

Figure 4: Centralized VPC Endpoints (shared VPC)

Learn how with the VPC Endpoint Workshop

Understanding how to appropriately restrict access to endpoints and the services they provide connectivity to is an often-misunderstood topic. I recently authored a hands-on workshop to help customers learn how to provision appropriate levels of access. Continue to learn about Amazon VPC Endpoints by taking the VPC Endpoint Workshop and then improve the security posture of your cloud workloads by leveraging network controls and VPC endpoint policies to manage access to your AWS resources.

Using AWS Lambda IAM condition keys for VPC settings

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/using-aws-lambda-iam-condition-keys-for-vpc-settings/

You can now control the Amazon Virtual Private Cloud (VPC) settings for your AWS Lambda functions using AWS Identity and Access Management (IAM) condition keys. IAM condition keys enable you to further refine the conditions under which an IAM policy statement applies. You can use the new condition keys in IAM policies when granting permissions to create and update functions.

The three new condition keys for VPC settings are lambda:VpcIds, lambda:SubnetIds, and lambda:SecurityGroupIds. The keys allow you to ensure that users can only deploy functions connected to one or more allowed VPCs, subnets, and security groups. If users try to create or update a function with VPC settings that are not allowed, Lambda rejects the operation.

Understanding Lambda and VPCs

All of the Lambda compute infrastructure runs inside VPCs owned by the Lambda service. Lambda functions can only be invoked by calling the Lambda API. There is no direct network access to the execution environment where your functions run.

Non-VPC connected Lambda functions

When your Lambda function is not configured to connect to your own VPCs, the function can access anything available on the public internet. This includes other AWS services, HTTPS endpoints for APIs, or services and endpoints outside AWS. The function cannot directly connect to your private resources inside of your VPC.

VPC connected Lambda functions

You can configure a Lambda function to connect to private subnets in a VPC in your account. When a Lambda function is configured to use a VPC, the Lambda function still runs inside the AWS Lambda service VPC. The function then sends all network traffic through your VPC and abides by your VPC’s network controls. You can use these controls to define where your functions can connect using security groups and network ACLs. Function egress traffic comes from your own network address space, and you have network visibility using VPC flow logs.

You can restrict access to network locations, including the public internet. A Lambda function connected to a VPC has no internet access by default. To give your function access to the internet, you can route outbound traffic to a network address translation (NAT) gateway in a public subnet.

When you configure your Lambda function to connect to your own VPC, it uses a shared elastic network interface (ENI) managed by AWS Hyperplane. The connection creates a VPC-to-VPC NAT and does a cross-account attachment, which allows network access from your Lambda functions to your private resources.

AWS Lambda service VPC with VPC-to-VPC NAT to customer VPC

The Hyperplane ENI is a managed network interface resource that the Lambda service controls, and it sits in your VPC inside your account. Multiple execution environments share the ENI to securely access resources inside a VPC in your account. You still do not have direct network access to the execution environment.

When are ENIs created?

The network interface creation happens when your Lambda function is created or its VPC settings are updated. When a function is invoked, the execution environment uses the pre-created network interface and quickly establishes a network tunnel to it. This reduces the latency that was previously associated with creating and attaching a network interface during a cold start.

How many ENIs are required?

Because the network interfaces are shared across execution environments, typically only a handful of network interfaces are required per function. Every unique security group and subnet combination across functions in your account requires a distinct network interface. If multiple functions in the same account use the same security group and subnet pairing, they reuse the same network interface. This way, a single application with multiple functions but the same network and security configuration can benefit from the existing interface configuration.

Your function scaling is no longer directly tied to the number of network interfaces. Hyperplane ENIs can scale to support large numbers of concurrent function executions.

If your functions are not active for a long period of time, Lambda reclaims its network interfaces, and the function becomes idle and inactive. You must invoke an idle function to reactivate it. The first invocation fails and the function enters a pending state again until the network interface is available.

Using the new Lambda condition keys for VPC settings

With the new VPC condition keys, you can specify one or more required VPCs, subnets, and security groups. The lambda:VpcIds value is inferred from the subnets and security groups that the CreateFunction API caller provides.

The condition syntax is in the format "Condition":{"{condition-operator}":{"{condition-key}":"{condition-value}"}}. You can use condition operators with multiple keys and values to construct policy documents.

I have a private VPC configured with the following four subnets:

Private VPC subnets

I have a MySQL database instance running in my private VPC. The instance is running in us-east-1b in subnet subnet-046c0d0c487b0515b with a failover in us-east-1c in subnet subnet-091e180fa55fb8e83. I have an associated security group sg-0a56588b3406ee3d3 allowing access to the database. As this is a private subnet, I don’t allow internet access.

I want to ensure that any Lambda functions I create with my account must only connect to my private VPC.

  1. I create the following IAM policy document, which I attach to my account. It uses a Deny effect with a ForAllValues:StringNotEquals condition operator to specify a required VpcId:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Stmt159186333251",
                "Action": ["lambda:CreateFunction","lambda:UpdateFunctionConfiguration"],
                "Effect": "Deny",
                "Resource": "*",
                "Condition": {"ForAllValues:StringNotEquals": {"lambda:VpcIds":["vpc-0eebf3d0fe63a2db1"]}}
            }
        ]
    }

  2. I attempt to create a Lambda function that does not connect to my VPC by excluding --vpc-config in the API call.

    aws lambda create-function --function-name MyVPCLambda1 \
      --runtime python3.7 --handler helloworld.handler --zip-file fileb://vpccondition.zip \
      --region us-east-1 --role arn:aws:iam::123456789012:role/VPCConditionLambdaRole

  3. I receive an AccessDeniedException error with an explicit deny:

    Lambda function creation AccessDeniedException

  4. I attempt to create the Lambda function again, including any one of the subnets in my VPC along with the security group. I must include both the SubnetIds and SecurityGroupIds values with --vpc-config.
aws lambda create-function --function-name MyVPCLambda1 \
  --vpc-config "SubnetIds=['subnet-019c87c9b67742a8f'],SecurityGroupIds=['sg-0a56588b3406ee3d3']" \
  --runtime python3.7 --handler helloworld.handler --zip-file fileb://vpccondition.zip \
  --region us-east-1 --role arn:aws:iam::123456789012:role/VPCConditionLambdaRole

The function is created successfully.

Successfully create Lambda function connected to VPC

I also want to ensure that any Lambda functions created in my account must have the following in the configuration:

  • My private VPC
  • Both subnets containing my database instances
  • The security group including the MySQL database instance
  1. I amend my account IAM policy document to include restrictions for SubnetIds and SecurityGroupIds. I do not need to specify VpcIds, as this is inferred:

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Stmt159186333252",
                "Action": ["lambda:CreateFunction","lambda:UpdateFunctionConfiguration"],
                "Effect": "Deny",
                "Resource": "*",
                "Condition": {"ForAllValues:StringNotEquals": {"lambda:SubnetIds": ["subnet-046c0d0c487b0515b","subnet-091e180fa55fb8e83"]}}
            },
            {
                "Sid": "Stmt159186333253",
                "Action": ["lambda:CreateFunction","lambda:UpdateFunctionConfiguration"],
                "Effect": "Deny",
                "Resource": "*",
                "Condition": {"ForAllValues:StringNotEquals": {"lambda:SecurityGroupIds": ["sg-0a56588b3406ee3d3"]}}
            }
        ]
    }

  2. I try to create another Lambda function, using --vpc-config values with a subnet in my VPC that's not in the allowed list, along with the security group.

    aws lambda create-function --function-name MyVPCLambda2 \
      --vpc-config "SubnetIds=['subnet-019c87c9b67742a8f'],SecurityGroupIds=['sg-0a56588b3406ee3d3']" \
      --runtime python3.7 --handler helloworld.handler --zip-file fileb://vpccondition.zip \
      --region us-east-1 --role arn:aws:iam::123456789012:role/VPCConditionLambdaRole

    I receive an AccessDeniedException error.

  3. I retry, specifying both valid and allowed SubnetIds and SecurityGroupIds:
aws lambda create-function --function-name MyVPCLambda2 \
  --vpc-config "SubnetIds=['subnet-046c0d0c487b0515b','subnet-091e180fa55fb8e83'],SecurityGroupIds=['sg-0a56588b3406ee3d3']" \
  --runtime python3.7 --handler helloworld.handler --zip-file fileb://vpccondition.zip \
  --region us-east-1 --role arn:aws:iam::123456789012:role/VPCConditionLambdaRole

The function creation is successful.

Successfully create Lambda function connected to specific subnets and security groups

With these settings, I can ensure that I can only create Lambda functions with the allowed VPC network security settings.

Updating Lambda functions

When updating Lambda function configuration, you do not need to specify the VPC settings if they already exist. Lambda checks the existing VPC settings before making the authorization call to IAM.

The following command, which adds more memory to the Lambda function without specifying the VPC configuration, succeeds because the configuration already exists.

aws lambda update-function-configuration --function-name MyVPCLambda2 --memory-size 512

Lambda layer condition keys

Lambda also has another existing condition key – lambda:Layer.

Lambda layers allow you to share code and content between multiple Lambda functions, or even multiple applications.

The lambda:Layer condition key allows you to enforce that a function must include a particular layer, or one of an allowed group of layers. You can also prevent the use of layers entirely, or limit layers to only those published by your own accounts, blocking layers published by accounts that are not yours.
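
As an illustrative sketch (the account ID is a placeholder), the following policy denies creating or updating functions that reference any layer version published outside your account:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyLayersFromOtherAccounts",
            "Action": ["lambda:CreateFunction","lambda:UpdateFunctionConfiguration"],
            "Effect": "Deny",
            "Resource": "*",
            "Condition": {"ForAnyValue:StringNotLike": {"lambda:Layer": ["arn:aws:lambda:*:123456789012:layer:*:*"]}}
        }
    ]
}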

Conclusion

You can now control the VPC settings for your Lambda functions using IAM condition keys.

The new VPC setting condition keys are available in all AWS Regions where Lambda is available. To learn more about the new condition keys and view policy examples, see “Using IAM condition keys for VPC settings” and “Resource and Conditions for Lambda actions” in the Lambda Developer Guide. To learn more about using IAM condition keys, see “IAM JSON Policy Elements: Condition” in the IAM User Guide.

Building well-architected serverless applications: Controlling serverless API access – part 1

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-well-architected-serverless-applications-controlling-serverless-api-access-part-1/

This series of blog posts uses the AWS Well-Architected Tool with the Serverless Lens to help customers build and operate applications using best practices. In each post, I address the nine serverless-specific questions identified by the Serverless Lens along with the recommended best practices. See the Introduction post for a table of contents and explanation of the example application.

Security question SEC1: How do you control access to your serverless API?

Use authentication and authorization mechanisms to prevent unauthorized access, and enforce quota for public resources. By controlling access to your API, you can help protect against unauthorized access and prevent unnecessary use of resources.

AWS has a number of services to provide API endpoints including Amazon API Gateway and AWS AppSync.

Use Amazon API Gateway for RESTful and WebSocket APIs. Here is an example serverless web application architecture using API Gateway.

Example serverless application architecture using API Gateway

Use AWS AppSync for managed GraphQL APIs.

AWS AppSync overview diagram

The serverless airline example in this series uses AWS AppSync to provide the frontend, user-facing public API. The application also uses API Gateway to provide backend, internal, private REST APIs for the loyalty and payment services.

Good practice: Use an authentication and an authorization mechanism

Authentication and authorization are mechanisms for controlling and managing access to a resource. In this well-architected question, that is a serverless API. Authentication is verifying who a client or user is. Authorization is deciding whether they have the permission to access a resource. By enforcing authorization, you can prevent unauthorized access to your workload from non-authenticated users.

Integrate with an identity provider that can validate your API consumer’s identity. An identity provider is a system that provides user authentication as a service. The identity provider may use the XML-based Security Assertion Markup Language (SAML) or JSON Web Tokens (JWT) for authentication, and it may also federate with other identity management systems. JWT is an open standard that defines a way to securely transmit information between parties as a JSON object. JWTs are used with frameworks such as OAuth 2.0 for authorization and OpenID Connect (OIDC), which builds on OAuth 2.0 and adds authentication.

Only authorize access to consumers that have successfully authenticated. Use an identity provider rather than API keys as a primary authorization method. API keys are more suited to rate limiting and throttling.

Evaluate authorization mechanisms

Use AWS Identity and Access Management (IAM) for authorizing access to internal or private API consumers, or other AWS Managed Services like AWS Lambda.

For public, user-facing web applications, API Gateway accepts JWT authorizers for authenticating consumers. You can use either Amazon Cognito or OpenID Connect (OIDC).

App client authenticates and gets tokens

For custom authorization needs, you can use Lambda authorizers.

A Lambda authorizer (previously called a custom authorizer) is an AWS Lambda function which API Gateway calls for an authorization check when a client makes a request to an API method. This means you do not have to write custom authorization logic in a function behind an API. The Lambda authorizer function can validate a bearer token such as JWT, OAuth, or SAML, or request parameters and grant access. Lambda authorizers can be used when using an identity provider other than Amazon Cognito or AWS IAM, or when you require additional authorization customization.
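
As a minimal sketch of the shape of a token-based Lambda authorizer (the token check is a placeholder; a real implementation would validate a JWT, OAuth, or SAML token):

// A minimal TOKEN authorizer sketch: API Gateway passes the bearer token
// in event.authorizationToken and expects an IAM policy in the response.
exports.handler = async (event) => {
  // Placeholder validation only - replace with real token verification
  const effect = event.authorizationToken === 'allow' ? 'Allow' : 'Deny'

  return {
    principalId: 'example-user', // identifier for the caller
    policyDocument: {
      Version: '2012-10-17',
      Statement: [{
        Action: 'execute-api:Invoke',
        Effect: effect,
        Resource: event.methodArn // the API method being authorized
      }]
    }
  }
}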

Lambda authorizers

For more information, see the AWS Hero blog post, “The Complete Guide to Custom Authorizers with AWS Lambda and API Gateway”.

The AWS documentation also has a useful section on “Understanding Lambda Authorizers Auth Workflow with Amazon API Gateway”.

Enforce authorization for non-public resources within your API

Within API Gateway, you can enable native authorization for users authenticated using Amazon Cognito or AWS IAM. For authorizing users authenticated by other identity providers, use Lambda authorizers.

For example, within the serverless airline, the loyalty service uses a Lambda function to fetch loyalty points and next tier progress. AWS AppSync acts as the client, using an HTTP resolver to invoke the function via an API Gateway REST API /loyalty/{customerId}/get resource.

To ensure only AWS AppSync is authorized to invoke the API, IAM authorization is set within the API Gateway method request.

Viewing API Gateway IAM authorization

The serverless airline uses the AWS Serverless Application Model (AWS SAM) to deploy the backend infrastructure as code. This makes it easier to know which IAM role has access to the API. One of the benefits of using infrastructure as code is visibility into all deployed application resources, including IAM roles.

The loyalty service AWS SAM template contains the AppsyncLoyaltyRestApiIamRole.

  AppsyncLoyaltyRestApiIamRole:
    Type: AWS::IAM::Role
    Properties:
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: appsync.amazonaws.com
            Action: sts:AssumeRole
      Path: /
      Policies:
        - PolicyName: LoyaltyApiInvoke
          PolicyDocument:
            Version: 2012-10-17
            Statement:
              - Effect: Allow
                Action:
                  - execute-api:Invoke
                # arn:aws:execute-api:region:account-id:api-id/stage/METHOD_HTTP_VERB/Resource-path
                Resource: !Sub arn:aws:execute-api:${AWS::Region}:${AWS::AccountId}:${LoyaltyApi}/*/*/*

The IAM role specifies that appsync.amazonaws.com can perform an execute-api:Invoke on the specific API Gateway resource arn:aws:execute-api:${AWS::Region}:${AWS::AccountId}:${LoyaltyApi}/*/*/*.

Within AWS AppSync, you can enable native authorization for users authenticating using Amazon Cognito or AWS IAM. You can also use any external identity provider compliant with OpenID Connect (OIDC).

Improvement plan summary:

  1. Evaluate authorization mechanisms.
  2. Enforce authorization for non-public resources within your API.

Required practice: Use appropriate endpoint type and mechanisms to secure access to your API

APIs may have public or private endpoints. Consider public endpoints to serve consumers where they may not be part of your network perimeter. Consider private endpoints to serve consumers within your network perimeter where you may not want to expose the API publicly. Public and private endpoints may have different levels of security.

Determine your API consumer and choose an API endpoint type

For providing public content, use Amazon API Gateway or AWS AppSync public endpoints.

For providing content with restricted access, use Amazon API Gateway with authorization to specific resources, methods, and actions you want to restrict. For example, the serverless airline application uses AWS IAM to restrict access to the private loyalty API so only AWS AppSync can call it.

With AWS AppSync providing a GraphQL API, restrict access to specific data types, data fields, queries, mutations, or subscriptions.

You can create API Gateway private REST APIs that can only be accessed from your Amazon Virtual Private Cloud (VPC) by using an interface VPC endpoint.

API Gateway private endpoints

For more information, see “Choose an endpoint type to set up for an API Gateway API”.

Implement security mechanisms appropriate to your API endpoint

With Amazon API Gateway and AWS AppSync, for both public and private endpoints, there are a number of mechanisms for access control.

For providing content with restricted access, API Gateway REST APIs support native authorization using AWS IAM, Amazon Cognito user pools, and Lambda authorizers. Amazon Cognito user pools provide a managed user directory for authentication. For more detailed information, see the AWS Hero blog post, “Picking the correct authorization mechanism in Amazon API Gateway“.

You can also use resource policies to restrict content to a specific VPC, VPC endpoint, data center, or AWS account.
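
For example, a resource policy following this pattern (the VPC endpoint ID is a placeholder) denies invocation unless the request arrives through a specific VPC endpoint:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": "execute-api:Invoke",
            "Resource": "execute-api:/*"
        },
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "execute-api:Invoke",
            "Resource": "execute-api:/*",
            "Condition": {"StringNotEquals": {"aws:SourceVpce": "vpce-0123456789abcdef0"}}
        }
    ]
}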

API Gateway resource policies are different from IAM identity policies. IAM identity policies are attached to IAM users, groups, or roles. These policies define what that identity can do on which resources. For example, in the serverless airline, the IAM role AppsyncLoyaltyRestApiIamRole specifies that appsync.amazonaws.com can perform an execute-api:Invoke on the specific API Gateway resource arn:aws:execute-api:${AWS::Region}:${AWS::AccountId}:${LoyaltyApi}/*/*/*.

Resource policies are attached to resources such as an Amazon S3 bucket, or an API Gateway resource or method. The policies define what identities can access the resource.

IAM access is determined by a combination of identity policies and resource policies.

For more information on the differences, see “Identity-Based Policies and Resource-Based Policies”. To see which services support resource-based policies, see “AWS Services That Work with IAM”.

API Gateway HTTP APIs support JWT authorizers as a part of OpenID Connect (OIDC) and OAuth 2.0 frameworks.

API Gateway WebSocket APIs support AWS IAM and Lambda authorizers.

With AWS AppSync public endpoints, you can enable authorization with the following:

  • AWS IAM
  • Amazon Cognito User pools for email and password functionality
  • Social providers (Facebook, Google+, and Login with Amazon)
  • Enterprise federation with SAML

Within the serverless airline, AWS Amplify Console hosts the public, user-facing site. Amplify Console provides a git-based workflow for building, deploying, and hosting serverless web applications. Amplify Console manages the hosting of the frontend assets for single-page app (SPA) frameworks in addition to static websites, along with an optional serverless backend. Frontend assets are stored in S3, and the Amazon CloudFront global edge network distributes the web app globally.

The AWS Amplify CLI toolchain allows you to add backend resources using AWS CloudFormation.

Using Amplify CLI to add authentication

For the serverless airline, I use the Amplify CLI to add authentication using Amazon Cognito with the following command:

amplify add auth

When prompted, I specify the authentication parameters I require.

Amplify add auth

Amplify CLI creates a local CloudFormation template. Use the following command to deploy the updated authentication configuration to the cloud:

amplify push

Once the deployment is complete, I view the deployed authentication nested stack resources from within the CloudFormation Console. I see the Amazon Cognito user pool.

View Amplify authentication CloudFormation nested stack resources

For a more detailed walkthrough using Amplify CLI to add authentication for the serverless airline, see the build video.

For more information on Amplify CLI and authentication, see “Authentication with Amplify”.

Conclusion

To help protect against unauthorized access and prevent unnecessary use of serverless API resources, control access using authentication and authorization mechanisms.

In this post, I cover the different mechanisms for authorization available for API Gateway and AWS AppSync. I explain the different approaches for public or private endpoints and show how to use IAM to control access to internal or private API consumers. I walk through how to use the Amplify CLI to create an Amazon Cognito user pool.

This well-architected question will be continued in a future post where I continue using the Amplify CLI to add a GraphQL API. I will explain how to view JSON Web Tokens (JWT) claims, and how to use Cognito identity pools to grant temporary access to AWS services. I will also show how to use API keys and API Gateway usage plans for rate limiting and throttling requests.

Improve VPN Network Performance of AWS Hybrid Cloud with Global Accelerator

Post Syndicated from Anandprasanna Gaitonde original https://aws.amazon.com/blogs/architecture/improve-vpn-network-performance-of-aws-hybrid-cloud-with-global-accelerator/

Introduction

Connecting on-premises data centers to AWS using AWS Site-to-Site VPN to support distributed applications is a common practice. With business expansion and acquisitions, your company’s on-premises IT footprint may grow into various geographies, with multiple sites comprising on-premises data centers and co-location facilities. AWS Site-to-Site VPN supports throughput up to 1.25 Gbps, although the actual throughput can be lower for VPN connections in a different geographic location than the AWS Region, because the internet path between them has to traverse multiple networks. For globally distributed applications that interact with other applications and components located on-premises, these VPN connections can impact performance and user experience.

This blog post provides an architectural approach to improving the performance of such globally distributed applications. We’ll explain an architecture that utilizes AWS Global Accelerator to create highly performant connectivity in terms of latency and bandwidth for VPN connections that originate from distant geographies around the world. Using this architecture, you can optimize your inter-application traffic between remote sites and your AWS environment, which can lead to better application performance and customer experience.

Distributed application architecture in a hybrid cloud using VPN

The figure above shows a customer’s existing IT footprint spread across several locations in the U.S., Europe, and Asia Pacific (APAC), with the AWS environment set up in the us-east-1 Region. In this use case, a business application hosted in AWS has the following dependencies on remote data centers and is also accessed by remote corporate users:

  1. Communication with an application hosted in a data center in the EU
  2. Communication with a data center in the US, where corporate users access the AWS application over VPN
  3. Integration with a local API-based service in the APAC region

Site-to-Site VPN from a remote site to an AWS environment provides secure connectivity for this inter-application traffic, as well as traffic from users to the application. Sites closer to the us-east-1 region may see reasonably good network performance and latency. However, sites that are geographically remote may experience higher latencies and not-so-reliable network performance due to the number of network hops spanning multiple networks and possible congestion. In addition, varying network paths through the Internet backbone can also lead to increased latencies. This impacts the overall application performance, which can lead to an unsatisfactory customer experience.

Optimizing application performance with Accelerated VPN connections

The above diagram shows the business application hosted in a multi-VPC architecture on AWS, comprising a production VPC and a sandbox VPC, typical of customer environments. These VPCs are interconnected using AWS Transit Gateway, and the VPN connections from the three remote sites terminate at AWS Transit Gateway as VPN attachments.

To improve the user experience for the application, VPN attachments to AWS Transit Gateway are enabled with a feature called Accelerated Site-to-Site VPN. With this feature enabled, AWS Global Accelerator routes traffic from an on-premises network to the AWS edge location closest to your customer gateway. It uses the AWS global network to route traffic from the closest edge location through the AWS global backbone, thereby ensuring the traffic remains on an optimal network path. This translates into faster response times, increased throughput, and a better user experience, as described in this blog post about better performance for internet traffic with AWS Global Accelerator.

The Accelerated Site-to-Site VPN feature is enabled by creating accelerators that allow you to associate two Anycast static IPs from the Edge network. (Anycast is a network addressing and routing method that attributes a single IP address to multiple endpoints in a network.) These static IP addresses act as a fixed entry point to the VPN tunnel endpoints. This improves the availability and performance of your applications that need to interface with remote sites for their functionality. The above diagram shows three Edge locations, each one corresponding to the accelerators for each of the VPN connections. Since AWS Transit Gateway allows connectivity to multiple VPCs in your AWS environment, the benefit of improved network performance is extended to applications and workloads in VPCs connected to the transit gateway. This architecture scales as business demands and workloads continue to grow on AWS.

Configuring your VPN connections for the Acceleration

To make changes to your existing VPN, consider the following for enabling the acceleration:

  • If your existing VPN connections terminate on a virtual private gateway, you will need to create an AWS Transit Gateway and create VPC attachments from the application VPC to the Transit Gateway.
  • Existing VPN connections on Transit Gateway can’t be modified to take advantage of the acceleration, so you will need to tear down existing connections and set up new ones in the AWS console as shown below. Then, configure your customer gateway device to use the new Site-to-Site VPN connection and delete the old Site-to-Site VPN connection.

Create VPN connection

For more information and steps, see Creating a transit gateway VPN attachment.
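
If you prefer the AWS CLI, a sketch of creating the replacement accelerated VPN connection might look like the following (the gateway IDs are placeholders); acceleration can only be enabled when the connection is created:

aws ec2 create-vpn-connection \
    --type ipsec.1 \
    --transit-gateway-id tgw-0123456789abcdef0 \
    --customer-gateway-id cgw-0123456789abcdef0 \
    --options EnableAcceleration=true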

Accelerated VPN connections use two VPN tunnels per connection like a regular Site-to-Site VPN connection. For accelerated VPN connections, each tunnel uses a separate accelerator and a separate pool of IP addresses for the tunnel endpoint IP addresses. The IP addresses for the two VPN tunnels are selected from two separate network zones. This ensures high availability for your VPN connections and can handle any network disruptions within a particular zone. If an Edge location fails, the customer gateway can reinitiate the VPN tunnel to the same IP address and get connected to the nearest available Edge location, making it resilient. These are the outside IP addresses to which the customer gateway will connect, as shown below:

Outside IP addresses to which customer gateway will connect

Considerations

Accelerated VPN functionality provides benefits to architectures involved in communicating with remote data centers and on-premises locations, but there are some considerations to keep in mind:

  • Additional charges are involved due to the use of Global Accelerator when acceleration is enabled. Performance testing should be done to evaluate the benefit it provides to your application.
  • Don’t enable accelerated VPN when the customer gateway for your VPN connection is also in an AWS environment since that traffic already traverses through the AWS backbone.
  • Applications that require a consistent network performance and a dedicated private connection should consider moving to AWS Direct Connect.

From your remote data centers, you can use the Global Accelerator Speed Comparison tool to compare Global Accelerator download speeds against direct internet downloads from the AWS Region where your application resides. Note that while the tool uses TCP, the VPN uses the UDP protocol, so this is not a performance test of a VPN connection. However, it gives you a reasonable indication of the performance improvement for your VPN.

Summary

As you start adopting the cloud and migrating workloads to the AWS platform, you’ll realize the inherent benefits of scalability, high availability, and security to create fault-tolerant and production-grade applications. During this transition, you will have hybrid cloud environments utilizing VPN connectivity. Accelerated Site-to-Site VPN connections can provide you with performance improvements for your application traffic. This is a good alternative until your traffic demands and architecture considerations mandate the use of a dedicated network path using AWS Direct Connect from your remote locations to AWS.


Using Amazon EFS for AWS Lambda in your serverless applications

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/using-amazon-efs-for-aws-lambda-in-your-serverless-applications/

Serverless applications are event-driven, using ephemeral compute functions to integrate services and transform data. While AWS Lambda includes a 512-MB temporary file system for your code, this is an ephemeral scratch resource not intended for durable storage.

Amazon EFS is a fully managed, elastic, shared file system designed to be consumed by other AWS services, such as Lambda. With the release of Amazon EFS for Lambda, you can now easily share data across function invocations. You can also read large reference data files, and write function output to a persistent and shared store. There is no additional charge for using file systems from your Lambda function within the same VPC.

EFS for Lambda makes it simpler to use a serverless architecture to implement many common workloads. It opens new capabilities, such as building and importing large code libraries directly into your Lambda functions. Since the code is loaded dynamically, you can also ensure that the latest version of these libraries is always used by every new execution environment. For appending to existing files, EFS is also a preferred option over Amazon S3.

This blog post shows how to enable EFS for Lambda in your AWS account, and walks through some common use-cases.

Capabilities and behaviors of Lambda with EFS

EFS is built to scale on demand to petabytes of data, growing and shrinking automatically as files are written and deleted. When used with Lambda, your code has low-latency access to a file system where data is persisted after the function terminates.

EFS is a highly reliable, NFS-based regional service, with all data stored durably across multiple Availability Zones. It is cost-optimized, with no provisioning requirements and no purchase commitments. It uses built-in lifecycle management to optimize between an SSD performance class and an infrequent access class that offers 92% lower cost.

EFS offers two performance modes: General Purpose and Max I/O. General Purpose is suitable for most Lambda workloads, providing lower operational latency and higher performance for individual files.

You also choose between two throughput modes: bursting and provisioned. The bursting mode uses a credit system to determine when a file system can burst; with bursting, your throughput is calculated based upon the amount of data you are storing. Provisioned throughput is useful when you need more throughput than the bursting mode provides. Total throughput available is divided across the number of concurrent Lambda invocations.

The Lambda service mounts EFS file systems when the execution environment is prepared. This adds minimal latency when the function is invoked for the first time, often within hundreds of milliseconds. When the execution environment is already warm from previous invocations, the EFS mount is already available.

EFS can be used with Provisioned Concurrency for Lambda. When the reserved capacity is prepared, the Lambda service also configures and mounts the EFS file system. Since Provisioned Concurrency executes any initialization code, any libraries or packages consumed from EFS are downloaded at that point. In this use case, it’s recommended to use provisioned throughput when configuring EFS.

The EFS file system is shared across Lambda functions as it scales up the number of concurrent executions. As files are written by one instance of a Lambda function, all other instances can access and modify this data, depending upon the access point permissions. The EFS file system scales with your Lambda functions, supporting up to 25,000 concurrent connections.

Creating an EFS file system

Configuring EFS for Lambda is straightforward. I show how to do this in the AWS Management Console, but you can also use the AWS CLI, AWS SDK, AWS Serverless Application Model (AWS SAM), and AWS CloudFormation. EFS file systems are always created within a customer VPC, so Lambda functions using the EFS file system must all reside in the same VPC.

To create an EFS file system:

  1. Navigate to the EFS console.
  2. Choose Create File System.
    EFS: Create File System
  3. On the Configure network access page, select your preferred VPC. Only resources within this VPC can access this EFS file system. Accept the default mount targets, and choose Next Step.
  4. On Configure file system settings, you can choose to enable encryption of data at rest. Review this setting, then accept the other defaults and choose Next Step. This uses bursting mode instead of provisioned throughput.
  5. On the Configure client access page, choose Add access point.
    EFS: Add access point
  6. Enter the following parameters. This configuration creates a file system with open read/write permissions – read more about settings to secure your access points. Choose Next Step.
    EFS: Access points
  7. On the Review and create page, check your settings and choose Create File System.
  8. In the EFS console, you see the new file system and its configuration. Wait until the Mount target state changes to Available before proceeding to the next steps.

Alternatively, you can use CloudFormation to create the EFS access point. With the AWS::EFS::AccessPoint resource, the preceding configuration is defined as follows:

  AccessPointResource:
    Type: 'AWS::EFS::AccessPoint'
    Properties:
      FileSystemId: !Ref FileSystemResource
      PosixUser:
        Uid: "1000"
        Gid: "1000"
      RootDirectory:
        CreationInfo:
          OwnerGid: "1000"
          OwnerUid: "1000"
          Permissions: "0777"
        Path: "/lambda"

For more information, see the example setup template in the code repository.

Working with AWS Cloud9 and Amazon EC2

You can mount EFS access points on Amazon EC2 instances. This can be useful for browsing file systems contents and downloading files from other locations. The EFS console shows customized mount instructions directly under each created file system:

EFS customized mount instructions

The instance must have access to the same security group and reside in the same VPC as the EFS file system. After connecting via SSH to the EC2 instance, you mount the EFS mount target to a directory. You can also mount EFS in AWS Cloud9 instances using the terminal window.
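
For example, on an Amazon Linux instance, installing the EFS mount helper and mounting through the access point looks broadly like the following (the file system and access point IDs are placeholders):

# Install the EFS mount helper, create a mount point, and mount the access point
sudo yum install -y amazon-efs-utils
sudo mkdir -p /mnt/efs
sudo mount -t efs -o tls,accesspoint=fsap-0123456789abcdef0 fs-01234567:/ /mnt/efs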

Any files you write into the EFS file system are available to any Lambda functions using the same EFS file system. Similarly, any files written by Lambda functions are available to the EC2 instance.

Sharing large code packages with Lambda

EFS is useful for sharing software packages or binaries that are otherwise too large for Lambda layers. You can copy these to EFS and have Lambda use them as if they were installed in the Lambda deployment package.

For example, on EFS you can install Puppeteer, which runs a headless Chromium browser, using the following script run on an EC2 instance or AWS Cloud9 terminal:

  mkdir node && cd node
  npm init -y
  npm i puppeteer --save

Building packages in EC2 for EFS

You can then use this package from a Lambda function connected to this folder in the EFS file system. You include the Puppeteer package with the mount path in the require declaration:

const puppeteer = require('/mnt/efs/node/node_modules/puppeteer')

In Node.js, to avoid changing declarations manually, you can add the EFS mount path to the Node.js module search path by using app-module-path. Lambda functions support a range of other runtimes, including Python, Java, and Go. Many other runtimes offer similar ways to add the EFS path to the list of default package locations.
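
A sketch of that approach, assuming the app-module-path package is itself available to the function (for example, bundled in the deployment package):

// Add the EFS directory to the Node.js module search path
require('app-module-path').addPath('/mnt/efs/node/node_modules')

// Packages stored on EFS can now be required by name
const puppeteer = require('puppeteer')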

There is an important difference between using packages in EFS compared with Lambda layers. When you use Lambda layers to include packages, these are downloaded to an immutable code package. Any changes to the underlying layer do not affect existing functions published using that layer.

Since EFS is a dynamic binding, any changes or upgrades to packages are available immediately to the Lambda function when the execution environment is prepared. This means you can output a build process to an EFS mount, and immediately consume any new versions of the build from a Lambda function.

Configuring AWS Lambda to use EFS

Lambda functions that access EFS must run from within a VPC. Read this guide to learn more about setting up Lambda functions to access resources from a VPC. There are also sample CloudFormation templates you can use to configure private and public VPC access.

The execution role for the Lambda function must provide access to the VPC and EFS. For development and testing purposes, this post uses the AWSLambdaVPCAccessExecutionRole and AmazonElasticFileSystemClientFullAccess managed policies in IAM. For production systems, you should use more restrictive policies to control access to EFS resources.
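
For reference, attaching those managed policies to a function’s execution role with the AWS CLI might look like this (the role name is a placeholder):

# Attach the VPC and EFS managed policies to an existing execution role
aws iam attach-role-policy --role-name MyLambdaExecutionRole \
    --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole
aws iam attach-role-policy --role-name MyLambdaExecutionRole \
    --policy-arn arn:aws:iam::aws:policy/AmazonElasticFileSystemClientFullAccess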

Once your Lambda function is configured to use a VPC, next configure EFS in Lambda:

  1. Navigate to the Lambda console and select your function from the list.
  2. Scroll down to the File system panel, and choose Add file system.
    EFS: Add file system
  3. In the File system configuration:
  • From the EFS file system dropdown, select the required file system. From the Access point dropdown, choose the required EFS access point.
  • In the Local mount path, enter the path your Lambda function uses to access this resource. Enter an absolute path.
  • Choose Save.
    EFS: Add file system

The File system panel now shows the configuration of the EFS mount, and the function is ready to use EFS. Alternatively, you can use an AWS Serverless Application Model (SAM) template to add the EFS configuration to a function resource:

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Resources:
  MyLambdaFunction:
    Type: AWS::Serverless::Function
    Properties:
      ...
      FileSystemConfigs:
        - Arn: arn:aws:elasticfilesystem:us-east-1:xxxxxx:accesspoint/fsap-123abcdef12abcdef
          LocalMountPath: /mnt/efs

To learn more, see the SAM documentation on this feature.

Example applications

You can view and download these examples from this GitHub repository. To deploy, follow the instructions in the repo’s README.md file.

1. Processing large video files

The first example uses EFS to process a 60-minute MP4 video and create screenshots for each second of the recording. This uses the FFmpeg Linux package to process the video. After copying the MP4 to the EFS file location, invoke the Lambda function to create a series of JPG frames. The following code executes FFmpeg, passing the EFS mount path and input file as parameters:

const os = require('os')

const inputFile = process.env.INPUT_FILE
const efsPath = process.env.EFS_PATH

const { exec } = require('child_process')

const execPromise = async (command) => {
	console.log(command)
	return new Promise((resolve, reject) => {
		const ls = exec(command, function (error, stdout, stderr) {
		  if (error) {
		    console.log('Error: ', error)
		    reject(error)
		  }
		  console.log('stdout: ', stdout);
		  console.log('stderr: ', stderr);
		})
		
		ls.on('exit', function (code) {
		  console.log('Finished: ', code);
		  resolve()
		})
	})
}

// The Lambda handler
exports.handler = async function (eventObject, context) {
	await execPromise(`/opt/bin/ffmpeg -loglevel error -i ${efsPath}/${inputFile} -s 240x135 -vf fps=1 ${efsPath}/%d.jpg`)
}

In this example, the process writes more than 2000 individual JPG files back to the EFS file system during a single invocation:

Console output from sample application

2. Archiving large numbers of files

Using the output from the first application, the second example creates a single archive file from the JPG files. The code uses the Node.js archiver package for processing:

const outputFile = process.env.OUTPUT_FILE
const efsPath = process.env.EFS_PATH

const fs = require('fs')
const archiver = require('archiver')

// The Lambda handler
exports.handler = function (event) {

  const output = fs.createWriteStream(`${efsPath}/${outputFile}`)
  const archive = archiver('zip', {
    zlib: { level: 9 } // Sets the compression level.
  })
  
  output.on('close', function() {
    console.log(archive.pointer() + ' total bytes')
  })
  
  output.on('end', function() {
    console.log('Data has been drained')
  })
  
  archive.pipe(output)  

  // append files from a glob pattern
  archive.glob(`${efsPath}/*.jpg`)
  archive.finalize()
}

After executing this Lambda function, the resulting ZIP file is written back to the EFS file system:

Console output from second sample application.

3. Unzipping archives with a large number of files

The last example shows how to unzip an archive containing many files. This uses the Node.js unzipper package for processing:

const inputFile = process.env.INPUT_FILE
const efsPath = process.env.EFS_PATH
const destinationDir = process.env.DESTINATION_DIR

const fs = require('fs')
const unzipper = require('unzipper')

// The Lambda handler
exports.handler = function (event) {

  fs.createReadStream(`${efsPath}/${inputFile}`)
    .pipe(unzipper.Extract({ path: `${efsPath}/${destinationDir}` }))

}

Once this Lambda function is executed, the archive is unzipped into a destination directory in the EFS file system. This example shows the screenshots unzipped into the frames subdirectory:

Console output from third sample application.

Conclusion

EFS for Lambda allows you to share data across function invocations, read large reference data files, and write function output to a persistent and shared store. After configuring EFS, you provide the Lambda function with an access point ARN, allowing you to read and write to this file system. Lambda securely connects the function instances to the EFS mount targets in the same Availability Zone and subnet.

EFS opens a range of potential new use-cases for Lambda. In this post, I show how this enables you to access large code packages and binaries, and process large numbers of files. You can interact with the file system via EC2 or AWS Cloud9 and pass information to and from your Lambda functions.

EFS for Lambda is supported at launch in APN Partner solutions, including Epsagon, Lumigo, Datadog, HashiCorp Terraform, and Pulumi. To learn more about how to use EFS for Lambda, see the AWS News Blog post and read the documentation.

How Goldman Sachs builds cross-account connectivity to their Amazon MSK clusters with AWS PrivateLink

Post Syndicated from Robert L. Cossin original https://aws.amazon.com/blogs/big-data/how-goldman-sachs-builds-cross-account-connectivity-to-their-amazon-msk-clusters-with-aws-privatelink/

This guest post presents patterns for accessing an Amazon Managed Streaming for Apache Kafka cluster across your AWS account or Amazon Virtual Private Cloud (Amazon VPC) boundaries using AWS PrivateLink. In addition, the post discusses the pattern that the Transaction Banking team at Goldman Sachs (TxB) chose for their cross-account access, the reasons behind their decision, and how TxB satisfies its security requirements with Amazon MSK. Using Goldman Sachs’s implementation as a use case, this post aims to provide you with general guidance that you can use when implementing an Amazon MSK environment.

Overview

Amazon MSK is a fully managed service that makes it easy for you to build and run applications that use Apache Kafka to process streaming data. When you create an MSK cluster, the cluster resources are available to participants within the same Amazon VPC. This allows you to launch the cluster within specific subnets of the VPC, associate it with security groups, and attach IP addresses from your VPC’s address space through elastic network interfaces (ENIs). Network traffic between clients and the cluster stays within the AWS network, with internet access to the cluster not possible by default.

You may need to allow clients access to an MSK cluster in a different VPC within the same or a different AWS account. You have options such as VPC peering or a transit gateway that allow for resources in either VPC to communicate with each other as if they’re within the same network. For more information about access options, see Accessing an Amazon MSK Cluster.

Although these options are valid, this post focuses on a different approach, which uses AWS PrivateLink. Therefore, before we dive deep into the actual patterns, let’s briefly discuss when AWS PrivateLink is a more appropriate strategy for cross-account and cross-VPC access.

VPC peering, illustrated below, is a bidirectional networking connection between two VPCs that enables you to route traffic between them using private IPv4 addresses or IPv6 addresses.

VPC peering is more suited for environments that have a high degree of trust between the parties that are peering their VPCs. This is because, after a VPC peering connection is established, the two VPCs can have broad access to each other, with resources in either VPC capable of initiating a connection. You’re responsible for implementing fine-grained network access controls with security groups to make sure that only specific resources intended to be reachable are accessible between the peered VPCs.

You can only establish VPC peering connections across VPCs that have non-overlapping CIDRs. This can pose a challenge when you need to peer VPCs with overlapping CIDRs, such as when peering across accounts from different organizations.

Additionally, if you’re running at scale, you can have hundreds of Amazon VPCs, and VPC peering has a limit of 125 peering connections to a single Amazon VPC. You can use a network hub like transit gateway, which, although highly scalable in enabling you to connect thousands of Amazon VPCs, requires similar bidirectional trust and non-overlapping CIDRs as VPC peering.

In contrast, AWS PrivateLink provides fine-grained network access control to specific resources in a VPC instead of all resources by default, and is therefore more suited for environments that want to follow a lower trust model approach, thus reducing their risk surface. The following diagram shows a service provider VPC that has a service running on Amazon Elastic Compute Cloud (Amazon EC2) instances, fronted by a Network Load Balancer (NLB). The service provider creates a configuration called a VPC endpoint service in the service provider VPC, pointing to the NLB. You can share this endpoint service with another Amazon VPC (service consumer VPC), which can use an interface VPC endpoint powered by AWS PrivateLink to connect to the service. The service consumers use this interface endpoint to reach the end application or service directly.

AWS PrivateLink makes sure that the connections initiated to a specific set of network resources are unidirectional—the connection can only originate from the service consumer VPC and flow into the service provider VPC and not the other way around. Outside of the network resources backed by the interface endpoint, no other resources in the service provider VPC get exposed. AWS PrivateLink allows VPC CIDR ranges to overlap, and it scales better because thousands of Amazon VPCs can consume each service.

VPC peering and AWS PrivateLink are therefore two connectivity options suited for different trust models and use cases.

Transaction Banking’s micro-account strategy

An AWS account is a strong isolation boundary that provides both access control and reduced blast radius for issues that may occur due to deployment and configuration errors. This strong isolation is possible because you need to deliberately and proactively configure flows that cross an account boundary. TxB designed a strategy that moves each of their systems into its own AWS account, each of which is called a TxB micro-account. This strategy allows TxB to minimize the chances of a misconfiguration exposing multiple systems. For more information about TxB micro-accounts, see the video AWS re:Invent 2018: Policy Verification and Enforcement at Scale with AWS on YouTube.

To further complement the strong gains realized due to a TxB micro-account segmentation, TxB chose AWS PrivateLink for cross-account and cross-VPC access of their systems. AWS PrivateLink allows TxB service providers to expose their services as an endpoint service and use whitelisting to explicitly configure which other AWS accounts can create interface endpoints to these services. This also allows for fine-grained control of the access patterns for each service. The endpoint service definition only allows access to resources attached to the NLBs and thereby makes it easy to understand the scope of access overall. The one-way initiation of connection from a service consumer to a service provider makes sure that all connectivity is controlled on a point-to-point basis.  Furthermore, AWS PrivateLink allows the CIDR blocks of VPCs to overlap between the TxB micro-accounts. Thus the use of AWS PrivateLink sets TxB up for future growth as a part of their default setup, because thousands of TxB micro-account VPCs can consume each service if needed.

MSK broker access patterns using AWS PrivateLink

As a part of their micro-account strategy, TxB runs an MSK cluster in its own dedicated AWS account, and clients that interact with this cluster are in their respective micro-accounts. Considering this setup and the preference to use AWS PrivateLink for cross-account connectivity, TxB evaluated the following two patterns for broker access across accounts.

Pattern 1: Front each MSK broker with a unique dedicated interface endpoint

In this pattern, each MSK broker is fronted with a unique dedicated NLB in the TxB MSK account hosting the MSK cluster. The TxB MSK account contains an endpoint service for every NLB and is shared with the client account. The client account contains interface endpoints corresponding to the endpoint services. Finally, DNS entries identical to the broker DNS names point to the respective interface endpoint. The following diagram illustrates this pattern in the US East (Ohio) Region.

High-level flow

After setup, clients from their own accounts talk to the brokers using their provisioned default DNS names as follows:

  1. The client resolves the broker DNS name to the interface endpoint IP address inside the client VPC.
  2. The client initiates a TCP connection to the interface endpoint IP over port 9094.
  3. With AWS PrivateLink, this TCP connection is routed to the dedicated NLB set up for the respective broker, listening on the same port within the TxB MSK account.
  4. The NLB routes the connection to the single broker IP registered behind it on TCP port 9094.
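
For example, you can verify step 1 from a client instance: the broker name should resolve to a private IP address of the interface endpoint inside the client VPC, not to an address in the TxB MSK account. A quick check with dig (the broker name matches the example used later in this post; the returned address is illustrative of an interface endpoint IP):

$ dig +short b-1.exampleClusterName.abcde.c2.kafka.us-east-2.amazonaws.com
10.0.1.25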

High-level setup

The setup steps in this section are shown for the US East (Ohio) Region; modify them if you use another Region. In the TxB MSK account, complete the following:

  1. Create a target group with target type IP, protocol TCP, and port 9094 in the same VPC as the MSK cluster.
    • Register the MSK broker as a target by its IP address.
  2. Create an NLB with a listener of TCP port 9094 and forwarding to the target group created in the previous step.
    • Enable the NLB for the same AZ and subnet as the MSK broker it fronts.
  3. Create an endpoint service configuration for each NLB that requires acceptance, and grant permissions to the client account so it can create a connection to this endpoint service (a CLI sketch of these steps follows this list).
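
The following AWS CLI sketch shows these provider-side steps for a single broker. All resource IDs, ARNs, and the client account ID are placeholders; this illustrates the steps above rather than a complete deployment script:

# Step 1: target group (target type IP) in the MSK cluster's VPC, with the broker registered by IP
aws elbv2 create-target-group --name b1-tg --protocol TCP --port 9094 \
    --vpc-id vpc-0123456789abcdef0 --target-type ip
aws elbv2 register-targets --target-group-arn <b1-target-group-arn> --targets Id=10.0.1.25

# Step 2: internal NLB in the broker's subnet/AZ, with a TCP:9094 listener
aws elbv2 create-load-balancer --name b1-nlb --type network --scheme internal \
    --subnets subnet-0123456789abcdef0
aws elbv2 create-listener --load-balancer-arn <b1-nlb-arn> --protocol TCP --port 9094 \
    --default-actions Type=forward,TargetGroupArn=<b1-target-group-arn>

# Step 3: endpoint service that requires acceptance, shared with the client account
aws ec2 create-vpc-endpoint-service-configuration --acceptance-required \
    --network-load-balancer-arns <b1-nlb-arn>
aws ec2 modify-vpc-endpoint-service-permissions --service-id vpce-svc-0123456789abcdef0 \
    --add-allowed-principals arn:aws:iam::<client-account-id>:root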

In the client account, complete the following:

  1. Create an interface endpoint in the same VPC the client is in (this connection request needs to be accepted within the TxB MSK account).
  2. Create a Route 53 private hosted zone, with the domain name kafka.us-east-2.amazonaws.com, and associate it with the same VPC as the clients are in.
  3. Create A-Alias records identical to the broker DNS names to avoid any TLS handshake failures, and point them to the interface endpoints of the respective brokers (see the sketch after this list).
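
A CLI sketch of steps 2 and 3 follows. After the interface endpoint is created and accepted, its DnsEntries attribute provides the DNS name and hosted zone ID to use as the alias target. All IDs are placeholders:

# Step 2: private hosted zone associated with the client VPC
aws route53 create-hosted-zone --name kafka.us-east-2.amazonaws.com \
    --vpc VPCRegion=us-east-2,VPCId=vpc-0123456789abcdef0 \
    --caller-reference txb-msk-phz-001

# Look up the interface endpoint's DNS entries for use as the alias target
aws ec2 describe-vpc-endpoints --vpc-endpoint-ids vpce-0123456789abcdef0 \
    --query 'VpcEndpoints[0].DnsEntries[0]'

# Step 3: A-Alias record matching broker b-1's DNS name
aws route53 change-resource-record-sets --hosted-zone-id <private-zone-id> --change-batch '{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "b-1.exampleClusterName.abcde.c2.kafka.us-east-2.amazonaws.com",
      "Type": "A",
      "AliasTarget": {
        "HostedZoneId": "<dns-entry-hosted-zone-id>",
        "DNSName": "<dns-entry-dns-name>",
        "EvaluateTargetHealth": false
      }
    }
  }]
}'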

Pattern 2: Front all MSK brokers with a single shared interface endpoint

In this second pattern, all brokers in the cluster are fronted by a single NLB that has cross-zone load balancing enabled. You make this possible by modifying each MSK broker’s advertised.listeners config to advertise a unique port. You create a unique NLB listener-target group pair for each broker and a single shared listener-target group pair for all brokers. You create an endpoint service configuration for this single NLB and share it with the client account. In the client account, you create an interface endpoint corresponding to the endpoint service. Finally, you create DNS entries identical to the broker DNS names that point to the single interface endpoint. The following diagram illustrates this pattern in the US East (Ohio) Region.

High-level flow

After setup, clients from their own accounts talk to the brokers using their provisioned default DNS names as follows:

  1. The client resolves the broker DNS name to the interface endpoint IP address inside the client VPC.
  2. The client initiates a TCP connection to the interface endpoint over port 9094.
  3. The NLB listener within the TxB MSK account on port 9094 receives the connection.
  4. The NLB listener’s corresponding target group load balances the request to one of the brokers registered to it (Broker 1). In response, Broker 1 sends back the advertised DNS name and port (9001) to the client.
  5. The client resolves the broker endpoint address again to the interface endpoint IP and initiates a connection to the same interface endpoint over TCP port 9001.
  6. This connection is routed to the NLB listener for TCP port 9001.
  7. This NLB listener’s corresponding target group is configured to receive the traffic on TCP port 9094, and forwards the request on the same port to the only registered target, Broker 1.

High-level setup

The setup steps in this section are shown for the US East (Ohio) Region; modify them if you use another Region. In the TxB MSK account, complete the following:

  1. Modify the port that the MSK broker is advertising by running the following command against each running broker. The following example shows changing the advertised port on a specific broker b-1 to 9001. For each broker you run the command against, you must change the values of bootstrap-server, entity-name, CLIENT_SECURE, REPLICATION, and REPLICATION_SECURE. Note that when modifying the REPLICATION and REPLICATION_SECURE values, -internal must be appended to the broker name, and the ports 9093 and 9095 shown below should not be changed.
    ./kafka-configs.sh \
    --bootstrap-server b-1.exampleClusterName.abcde.c2.kafka.us-east-2.amazonaws.com:9094 \
    --entity-type brokers \
    --entity-name 1 \
    --alter \
    --command-config kafka_2.12-2.2.1/bin/client.properties \
    --add-config advertised.listeners=[\
    CLIENT_SECURE://b-1.exampleClusterName.abcde.c2.kafka.us-east-2.amazonaws.com:9001,\
    REPLICATION://b-1-internal.exampleClusterName.abcde.c2.kafka.us-east-2.amazonaws.com:9093,\
    REPLICATION_SECURE://b-1-internal.exampleClusterName.abcde.c2.kafka.us-east-2.amazonaws.com:9095]

  2. Create a target group with target type IP, protocol TCP, and port 9094 in the same VPC as the MSK cluster. The preceding diagram represents this as B-ALL.
    • Register all MSK brokers to B-ALL as targets by their IP addresses.
  3. Create target groups dedicated for each broker (B1, B2) with the same properties as B-ALL.
    • Register the respective MSK broker to each target group by its IP address.
  4. Perform the same steps for additional brokers if needed, creating a unique listener-target group pair corresponding to the advertised port of each broker.
  5. Create an NLB that is enabled for the same subnets that the MSK brokers are in and with cross-zone load balancing enabled.
    • Create a TCP listener for every broker’s advertised port (9001, 9002) that forwards to the corresponding target group you created (B1, B2).
    • Create a TCP listener on port 9094 that forwards to the B-ALL target group (a CLI sketch of these listeners follows this list).
  6. Create an endpoint service configuration for the NLB that requires acceptance and grant permissions to the client account to create a connection to this endpoint service.
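
A CLI sketch of the listener layout from step 5, assuming two brokers with advertised ports 9001 and 9002 (all ARNs are placeholders):

# Per-broker listeners on the advertised ports
aws elbv2 create-listener --load-balancer-arn <shared-nlb-arn> --protocol TCP --port 9001 \
    --default-actions Type=forward,TargetGroupArn=<b1-target-group-arn>
aws elbv2 create-listener --load-balancer-arn <shared-nlb-arn> --protocol TCP --port 9002 \
    --default-actions Type=forward,TargetGroupArn=<b2-target-group-arn>

# Shared listener on 9094, forwarding to the B-ALL target group
aws elbv2 create-listener --load-balancer-arn <shared-nlb-arn> --protocol TCP --port 9094 \
    --default-actions Type=forward,TargetGroupArn=<b-all-target-group-arn>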

In the client account, complete the following:

  1. Create an interface endpoint in the same VPC the client is in (this connection request needs to be accepted within the TxB MSK account).
  2. Create a Route 53 private hosted zone, with the domain name kafka.us-east-2.amazonaws.com and associate it with the same VPC as the client is in.
  3. Under this hosted zone, create A-Alias records identical to the broker DNS names to avoid any TLS handshake failures, and point them to the interface endpoint.

Both of these patterns use TLS on TCP port 9094 to talk to the MSK brokers. If your security posture allows plaintext communication between the clients and brokers, these patterns apply in that scenario as well, using TCP port 9092.

With both of these patterns, if Amazon MSK detects a broker failure, it mitigates the failure by replacing the unhealthy broker with a new one. In addition, the new MSK broker retains the same IP address and has the same Kafka properties, such as any modified advertised.listeners configuration.

Amazon MSK allows clients to communicate with the service on TCP ports 9092, 9094, and 2181. As a byproduct of modifying advertised.listeners in Pattern 2, clients are automatically asked to speak with the brokers on the advertised port. If clients in the same account as Amazon MSK need to access the brokers, you should create a new Route 53 hosted zone in the Amazon MSK account with identical broker DNS names pointing to the NLB DNS name. The Route 53 record sets override the MSK broker DNS and allow all traffic to the brokers to go via the NLB.

Transaction Banking’s MSK broker access pattern

For broker access across TxB micro-accounts, TxB chose Pattern 1, where one interface endpoint per broker is exposed to the client account. TxB streamlined this overall process by automating the creation of the endpoint service within the TxB MSK account and the interface endpoints within the client accounts without any manual intervention.

At the time of cluster creation, the bootstrap broker configuration is retrieved by calling the Amazon MSK APIs and stored in AWS Systems Manager Parameter Store in the client account so that it can be retrieved on application startup. This enables clients to remain agnostic of the fact that the Kafka brokers are launched in a completely different account.
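
A minimal sketch of this flow, assuming a hypothetical parameter name and a placeholder cluster ARN:

# Retrieve the TLS bootstrap broker string from the Amazon MSK API
BOOTSTRAP=$(aws kafka get-bootstrap-brokers \
    --cluster-arn <msk-cluster-arn> \
    --query BootstrapBrokerStringTls --output text)

# Store it in the client account so applications can read it at startup
aws ssm put-parameter --name /txb/msk/bootstrap-brokers --type String --value "$BOOTSTRAP"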

A key driver for TxB choosing Pattern 1 is that it avoids having to modify a broker property like the advertised port. Pattern 2 creates the need for TxB to track which broker is advertising which port and make sure new brokers aren’t reusing the same port. This adds the overhead of having to modify and track the advertised port of new brokers being launched live and having to create a corresponding listener-target group pair for these brokers. TxB avoided this additional overhead by choosing Pattern 1.

On the other hand, Pattern 1 requires the creation of additional dedicated NLBs and interface endpoint connections when more brokers are added to the cluster. TxB limits this management overhead through automation, which requires additional engineering effort.

Also, using Pattern 1 costs more compared to Pattern 2, because each broker in the cluster has a dedicated NLB and an interface endpoint. For a single broker, it costs $37.80 per month to keep the end-to-end connectivity infrastructure up. The breakdown of the monthly connectivity costs is as follows:

  • NLB running cost – 1 NLB x $0.0225 x 720 hours/month = $16.20/month
  • 1 VPC endpoint spread across three AZs – 1 VPCE x 3 ENIs x $0.01 x 720 hours/month = $21.60/month

Additional charges for NLB capacity used and AWS PrivateLink data processed apply. For more information about pricing, see Elastic Load Balancing pricing and AWS PrivateLink pricing.

To summarize, Pattern 1 is best applicable when:

  • You want to minimize the management overhead associated with modifying broker properties, such as advertised port
  • You have automation that takes care of adding and removing infrastructure when new brokers are created or destroyed
  • Simplified and uniform deployments are primary drivers, with cost as a secondary concern

Transaction Banking’s security requirements for Amazon MSK

The TxB micro-account provides a strong application isolation boundary, and accessing MSK brokers using AWS PrivateLink using Pattern 1 allows for tightly controlled connection flows between these TxB micro-accounts. TxB further builds on this foundation through additional infrastructure and data protection controls available in Amazon MSK. For more information, see Security in Amazon Managed Streaming for Apache Kafka.

The following are the core security tenets that TxB’s internal security team requires for using Amazon MSK:

  • Encryption at rest using Customer Master Key (CMK) – TxB uses the Amazon MSK managed offering of encryption at rest. Amazon MSK integrates with AWS Key Management Service (AWS KMS) to offer transparent server-side encryption to always encrypt your data at rest. When you create an MSK cluster, you can specify the AWS KMS CMK that AWS KMS uses to generate data keys that encrypt your data at rest. For more information, see Using CMKs and data keys.
  • Encryption in transit – Amazon MSK uses TLS 1.2 for encryption in transit. TxB makes client-broker encryption and encryption between the MSK brokers mandatory.
  • Client authentication with TLS – Amazon MSK uses AWS Certificate Manager Private Certificate Authority (ACM PCA) for client authentication. The ACM PCA can either be a root Certificate Authority (CA) or a subordinate CA. If it’s a root CA, you need to install a self-signed certificate. If it’s a subordinate CA, you can choose its parent to be an ACM PCA root, a subordinate CA, or an external CA. This external CA can be your own CA that issues the certificate and becomes part of the certificate chain when installed as the ACM PCA certificate. TxB takes advantage of this capability and uses certificates signed by ACM PCA that are distributed to the client accounts.
  • Authorization using Kafka Access Control Lists (ACLs) – Amazon MSK allows you to use the Distinguished Name of a client’s TLS certificate as the principal of the Kafka ACL to authorize client requests. To enable Kafka ACLs, you must first enable client authentication using TLS. TxB uses the Kafka Admin API to create Kafka ACLs for each topic using the certificate names of the certificates deployed on the consumer and producer client instances (see the sketch after this list). For more information, see Apache Kafka ACLs.
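
As an illustration of the last tenet, the following sketch uses the standard Kafka tooling to grant a producer certificate’s Distinguished Name write access to a topic. The ZooKeeper connection string, certificate CN, and topic name are placeholders; check the Amazon MSK documentation for the exact procedure in your cluster version:

bin/kafka-acls.sh --authorizer-properties zookeeper.connect=<zookeeper-connect-string> \
    --add --allow-principal "User:CN=producer.client.example.com" \
    --operation Write --topic example-topic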

Conclusion

This post illustrated how the Transaction Banking team at Goldman Sachs approaches an application isolation boundary through the TxB micro-account strategy and how AWS PrivateLink complements this strategy.  Additionally, this post discussed how the TxB team builds connectivity to their MSK clusters across TxB micro-accounts and how Amazon MSK takes the undifferentiated heavy lifting away from TxB by allowing them to achieve their core security requirements. You can leverage this post as a reference to build a similar approach when implementing an Amazon MSK environment.

About the Authors

Robert L. Cossin is a Vice President at Goldman Sachs in New York. Rob joined Goldman Sachs in 2004 and has worked on many projects within the firm’s cash and securities flows. Most recently, Rob is a technical architect on the Transaction Banking team, focusing on cloud enablement and security.

Harsha W. Sharma is a Solutions Architect with AWS in New York. Harsha joined AWS in 2016 and works with Global Financial Services customers to design and develop architectures on AWS, and support their journey on the cloud.


New – Amazon Simple Email Service (SES) for VPC Endpoints

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-amazon-simple-email-service-ses-for-vpc-endpoints/

Although chat and messaging applications have been popular, email has retained its place as a ubiquitous channel with the highest return on investment (ROI) because of its low barrier to entry, affordability, and ability to target specific recipients. To ensure that your organization’s marketing and transactional messages reach the end customer in a timely manner and drive deeper engagement, you need to partner with a mature and trusted email service provider that has built specialized expertise in delivering email at scale.

Amazon Simple Email Service (SES) has been a trustworthy, flexible, and affordable email service provider for developers and digital marketers since 2011. Amazon SES is a reliable, cost-effective service for businesses of all sizes that use email to keep in contact with their customers. Many businesses operate in industries that are highly secure and have strict security policies. So we have enhanced security and compliance features in Amazon SES, such as enabling you to configure DKIM using your own RSA key pair, and supporting HIPAA eligibility and FIPS 140-2 compliant endpoints, as well as regional expansions.

Today, I am pleased to announce that customers can now connect directly from a Virtual Private Cloud (VPC) to Amazon SES through a VPC endpoint, powered by AWS PrivateLink, in a secure and scalable manner. You can now access Amazon SES through your VPC without requiring an Internet gateway, NAT device, VPN connection, or AWS Direct Connect connection. When you use an interface VPC endpoint, communication between your VPC and Amazon SES APIs stays within the Amazon network, adding increased security.

With this launch, traffic to Amazon SES does not transit the Internet and never leaves the Amazon network, so you can securely connect your VPC to Amazon SES without imposing availability risks or bandwidth constraints on your network traffic. You can also centralize Amazon SES across your multi-account infrastructure and provide it as a service to your accounts without the need for an Internet gateway.

Amazon SES for VPC Endpoints – Getting Started
If you want to test sending emails from an EC2 instance in your default VPC, create a security group in the EC2 console with inbound rules that allow SMTP traffic from the private IP address of your instance.

To create the VPC endpoint for Amazon SES, use the Creating an Interface Endpoint procedure in the VPC console, select the com.amazonaws.region.email-smtp service name, and attach the security group that you just created.
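
If you prefer the CLI, the equivalent call might look like the following sketch. The IDs are placeholders, and the service name matches the Region used in the SMTP test below:

aws ec2 create-vpc-endpoint --vpc-endpoint-type Interface \
    --vpc-id vpc-0123456789abcdef0 \
    --service-name com.amazonaws.ap-southeast-2.email-smtp \
    --subnet-ids subnet-0123456789abcdef0 \
    --security-group-ids sg-0123456789abcdef0 \
    --private-dns-enabled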

After your endpoint becomes available, you can use SSH to connect to your EC2 instance and use the openssl command to test the connection or send email through the newly created endpoint. You can interact with the SMTP interface in the same way from your operating system’s command line.

$ openssl s_client -crlf -quiet -starttls smtp -connect email-smtp.ap-southeast-2.amazonaws.com:465
...
depth=2 C = US, O = Amazon, CN = Amazon Root CA 1
verify return:1
depth=1 C = US, O = Amazon, OU = Server CA 1B, CN = Amazon
verify return:1
depth=0 CN = email-smtp.ap-southeast-2.amazonaws.com
verify return:1
...
220 email-smtp.amazonaws.com ESMTP SimpleEmailService-d-ZIFLXXX 
HELO email-smtp.amazonaws.com
...
250 Ok

Note that VPC Endpoints currently do not support cross-region requests—ensure that you create your endpoint in the same region in which you plan to issue your API calls to Amazon SES.

Now Available!
Amazon SES for VPC Endpoints is generally available and you can use it in all regions where Amazon SES is available. There is no additional charge for using Amazon SES for VPC Endpoints. Take a look at the product page and the documentation to learn more. Please send feedback to AWS forum for Amazon SES or through your usual AWS support contacts.

Channy;

Use AWS Firewall Manager and VPC security groups to protect your applications hosted on EC2 instances

Post Syndicated from Kaustubh Phatak original https://aws.amazon.com/blogs/security/use-aws-firewall-manager-vpc-security-groups-to-protect-applications-hosted-on-ec2-instances/

You can use AWS Firewall Manager to centrally configure and manage Amazon Virtual Private Cloud (Amazon VPC) security groups across all your AWS accounts. This post will take you through the step-by-step instructions to apply common security group rules, audit your security groups, and detect unused and redundant rules in your security groups across your AWS environment.

In this post, I’ll show you how to create and enforce a master set of security group rules by using common security group policy, while still allowing developers to deploy and manage application-specific security group rules. In the example below, the security group rules you’ll create allow SSH access only from the public IP address of the bastion host, and to set a policy that prohibits any security group rules that allow SSH access from everywhere (port 22).

When you use Firewall Manager to centrally apply a common security group, you can do things such as ensure that all Application Load Balancers only talk to Amazon CloudFront, allow the Secure Shell (SSH) protocol only from specific IP ranges, or give system administrators access to a central database.

In many organizations, developers write their own security group rules for their applications. However, if you’re a security administrator, you want to audit the security group rules so you’ll know when a security group is misconfigured. Using an audit security group policy, you can set guardrails on which security group rules can or cannot be created across your organization. For example, you could allow security group rules only on ports 10-1000, or specify that you do not allow security group rules on port 23.

As an administrator, you also want to simplify operations by detecting unused and redundant security groups across your AWS accounts. You can use a managed audit policy to help identify unused and redundant security groups.

If you haven’t used these services before, here’s a quick overview:

  1. AWS Firewall Manager is a security management service that allows you to centrally configure and manage firewall rules across your accounts and applications in AWS Organizations, using AWS Config in the background. Using AWS Firewall Manager, you can easily roll out AWS WAF rules, create AWS Shield Advanced protections, and manage security groups for your Amazon Elastic Compute Cloud (Amazon EC2) and elastic network interface resource types in Amazon VPCs.
  2. VPC security groups act as a virtual, stateful firewall for your Amazon Elastic Compute Cloud (Amazon EC2) instance to control inbound and outbound traffic. You can specify separate rules for inbound and outbound traffic, and instances associated with a security group can’t talk to each other unless you add rules allowing it.

After you put the master set of security group rules in place, you’ll get notification of all non-compliant changes made by the developers. You can take remediation action if necessary using an audit security group policy. In this post, you’ll also set up a usage security group policy, so that you can flag unused security groups and merge redundant security groups for simpler administration.

Prerequisites

AWS Firewall Manager has the following prerequisites:

  • AWS Organizations: Your organization must be using AWS Organizations to manage your accounts, and All Features must be enabled. For more information, see Creating an Organization and Enabling All Features in Your Organization.
  • An administrator AWS Account: You must designate one of the AWS accounts in your organization as the administrator for Firewall Manager. This gives the account permission to deploy AWS WAF rules across the organization.
  • AWS Config: You must enable AWS Config for all of the accounts in your organization, so that AWS Firewall Manager can detect newly created resources. To enable AWS Config for all of the accounts in your organization, you can use the Enable AWS Config template on the StackSets Sample Templates page. For more information, see Getting Started with AWS Config.

Note: You’ll be charged $100 per policy per month. In the solution in this post, you’ll create three policies. In addition, AWS Config charges also apply. For more information, see AWS Firewall Manager pricing and AWS Config pricing.

Overview

The diagram below illustrates the following steps:

  1. Complete the prerequisites that were outlined in the prerequisites section above.
  2. Create a primary security group under AWS Firewall Manager. This is a VPC security group that gets replicated as a new security group to every resource within the policy scope.
  3. In AWS Firewall Manager, create policies that can be applied to individual application security groups by mapping them to specific application name/value tags. The policies you create will result in the generation of individual new security groups.
  4. Application developers can build additional app-specific security group rules created in the previous step.

 

Figure 1: Overview of solution

Create a common security group policy

You’ll begin by creating a common security group policy to push primary security group rules across all accounts.

  1. Sign in to the AWS Management Console using the AWS Firewall Manager administrator account that you set up in the prerequisites, and then open the Firewall Manager console.
  2. In the navigation pane, under AWS Firewall Manager, choose Security policies.
  3. Using the Filter menu, select the AWS Region where your application is hosted and choose Create policy. In my example, I choose US West (Oregon).
  4. For Policy type, choose Security group.
  5. For Security group policy type, choose Common security groups, then choose Next.
  6. Enter a policy name. In my example, I’ve named my policy Test_Common_Policy.
  7. Policy rules allow you to choose how the security groups in this policy are applied and maintained. For this tutorial, choose Apply the primary security groups to every resource within the policy scope and leave the other options unchecked. You can also choose to apply only one of these policies. Note that if you choose both check boxes, a local user won’t be able to modify the security group or add additional security groups.
  8. Choose Add primary security group to see all security groups in your account in your specified AWS Region. Select any one of your existing security groups, or create a new security group.
  9. (Optional) If you choose to create a new security group, you’ll be taken to the VPC dashboard where you can create your primary security group by following the Creating a Security Group documentation. In the primary security group, add the following:
    1. For Ingress Rules, choose Allow access on Port 22 from 203.0.113.1/32.
    2. For Egress Rules, choose Allow all traffic on all ports.
  10. After you select the primary security group, choose Add security group.
  11. For Policy action, for this example, choose Apply policy rules and identify resources that are non-compliant but do not auto remediate. By selecting this option, Firewall Manager will notify you of any non-compliant security groups, but will not auto-remediate. Choose Next.
  12. For Policy scope, select the following:
    1. For AWS accounts included in this policy, choose All accounts under my organization.
    2. For Resource Type to apply this policy, choose EC2 instances.
    3. For Criteria to select the resources to protect, choose Include only resources that have the specified tags.
    4. For Key, enter Env.
    5. For Value, enter Prod.

    Choose Next.

  13. Review the security policy, then choose Create policy.

 

Figure 2: Summary of Common Security Group policy

The security policy will review all the EC2 instances in your child accounts in your specified AWS Region and add the primary security group to the primary network interface of the Amazon EC2 instances. All primary interfaces of Amazon EC2 instances created in the future will also have this primary security group. If developers remove the rules of the primary security group, you’re notified when the Firewall Manager service marks the resource as non-compliant. You can then remediate by changing the security policy action to Apply policy rules and auto remediate any non-compliant resources, which removes the non-compliant security group rules. Alternatively, you can review the non-compliant resources, log in to the AWS account, and take remediation action manually.
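
If you would rather script the policy than use the console, the Firewall Manager PutPolicy API accepts an equivalent definition. The following is a sketch only: the security group ID is a placeholder, and you should verify the ManagedServiceData format against the current Firewall Manager documentation before relying on it:

aws fms put-policy --policy '{
  "PolicyName": "Test_Common_Policy",
  "SecurityServicePolicyData": {
    "Type": "SECURITY_GROUPS_COMMON",
    "ManagedServiceData": "{\"type\":\"SECURITY_GROUPS_COMMON\",\"revertManualSecurityGroupChanges\":false,\"exclusiveResourceSecurityGroupManagement\":false,\"securityGroups\":[{\"id\":\"sg-0123456789abcdef0\"}]}"
  },
  "ResourceType": "AWS::EC2::Instance",
  "ResourceTags": [{"Key": "Env", "Value": "Prod"}],
  "ExcludeResourceTags": false,
  "RemediationEnabled": false
}'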

Create an audit security group policy

Now, you’ll create an audit security group policy to enforce the guardrails. You’ll create a security group rule that allows port 22 access from an allowed IP subnet of 203.0.113.1/32 according to the security team’s recommendations.

  1. In the AWS Management Console, select AWS WAF and AWS Shield.
  2. In the navigation pane, under AWS Firewall Manager, choose Security policies.
  3. In the Filter, select the AWS Region where your application is hosted and choose Create policy. In my example, I will choose US West (Oregon).
  4. For Policy type, choose Security group. For Security group policy type, choose Auditing and enforcement guidelines for security group rules, then choose Next.
  5. Enter a policy name. In my example, I’ve named my policy Test_Audit_Policy.
  6. For Policy rules, select Allow any rules defined in audit security group.
  7. Choose Add audit security group to see all security groups in your account in your specified AWS Region. You can select a security group, or create a new security group.
  8. (Optional) If you choose to create a new security group, you’ll be taken to VPC dashboard where you can create your primary security group by following the Creating a Security Group documentation. In the audit security group, add the following:
    1. For Ingress Rules, choose Allow access on Port 22 from 203.0.113.1/32.
    2. For Egress Rules, choose Allow all traffic on all ports.
  9. After you select the audit security group, choose Add security group.
  10. For Policy action, you can only select Apply policy rules and identify resources that are non-compliant but do not auto remediate. By selecting this option, Firewall Manager will notify you of any non-compliant security groups, but will not auto-remediate. Choose Next.
  11. For Policy scope, select the following:
    1. For AWS accounts included in this policy, choose All accounts under my organization.
    2. For Resource type to apply this policy, choose Security groups.
    3. For Criteria to select the resources to protect, choose Include only resources that have the specified tags.
    4. For Key, enter Env.
    5. For Value, enter Prod.

    Choose Next.

  12. Review the security policy and choose Create policy.

 

Figure 3: Summary of Audit Security Group policy

The security policy will audit all the security groups in your child accounts in your specified AWS Region and will only allow security group ingress rules that allow port 22 access from 203.0.113.1/32. All security groups created in the future will also have this restriction. If Firewall Manager detects a security group that allows port 22 access from anywhere other than 203.0.113.1/32, you’re notified when the Firewall Manager service marks the resource as non-compliant. You can then remediate by editing the security policy action to Apply policy rules and auto remediate any non-compliant resources, which removes the non-compliant security group rules. Alternatively, you can review the non-compliant resources, log in to the AWS account, and take remediation action manually.

Create a usage security group policy

Lastly, you’ll create a usage security group policy to remove unused security groups, and to merge redundant security groups.

  1. In the AWS Management Console, select AWS WAF and Shield.
  2. In the navigation pane, under AWS Firewall Manager, choose Security policies. In the Filter, select the AWS Region where your application is hosted and choose Create policy. In my example, I am choosing US West (Oregon).
  3. For Policy type, choose Security group. For Security group policy type, choose Auditing and cleanup of unused and redundant security groups. Choose Next.
  4. Enter a policy name. In my example, I’ve named my policy Test_Usage_Policy.
  5. For Policy rules, select both the options: Security groups within this policy scope should be used by at least one resource and Security groups within this policy scope should not have similar content.
  6. For Policy action, select Apply policy rules and identify resources that are non-compliant but do not auto remediate. Choose Next.
  7. For Policy scope, select the following:
    1. For AWS accounts included in this policy, choose All accounts under my organization.
    2. For Resource type to apply this policy, choose Security groups.
    3. For Criteria to select the resources to protect, choose Include only resources that have the specified tags.
    4. For Key, enter Env.
    5. For Value, enter Prod.

    Choose Next.

  8. A pop-up warning message will appear. Select Exclude Firewall Manager admin account from the policy scope, so that security groups in the administrator account are not affected.
  9. Review the security policy and choose Create policy.

 

Figure 4: Summary of Usage Security Group policy

The security policy will review all the security groups in your child accounts in your specified AWS Region and check whether any security groups are not associated with any resource. The security policy will also check for duplicate security group rules. All security groups created in the future will also be checked. If Firewall Manager detects a security group that is not associated with any resource or has overlapping rules, you’ll be notified when the Firewall Manager service marks the resource as non-compliant. You can then remediate by editing the security policy action to Apply policy rules and auto remediate any non-compliant resources; non-compliant security groups are then removed (if unused) or merged into one security group (if redundant). Alternatively, you can review the non-compliant resources, log in to the AWS account, and take remediation action manually.

Conclusion

In this post, you learned how you can create AWS Firewall Manager rules using the console. Using both VPC security groups and AWS Firewall Manager, you created a deployment strategy that enables the developers in your organization to maintain a security mindset and begin coding security group rules, while at the same time ensuring that all applications are still protected by a set of security group rules defined by your organization’s security team. In addition, you have reduced the likelihood of misconfigured or overly permissive security groups, as well as the operational burden, by simplifying the security groups created in all your member accounts.

For further reading, see AWS Firewall Manager Update – Support for VPC Security Groups.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Firewall Manager forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Kaustubh Phatak

Kaustubh is a Cloud Support Engineer II at AWS. On a daily basis, he provides solutions for customers’ cloud architecture questions related to Networking, Security, and DevOps domain. Outside of the office, Kaustubh likes to play cricket, ping-pong, and soccer. He is also an avid console gamer.

Using VPC Sharing for a Cost-Effective Multi-Account Microservice Architecture

Post Syndicated from Anandprasanna Gaitonde original https://aws.amazon.com/blogs/architecture/using-vpc-sharing-for-a-cost-effective-multi-account-microservice-architecture/

Introduction

Many cloud-native organizations building modern applications have adopted a microservice architecture because of its flexibility, performance, and scalability. Even customers with legacy and monolithic application stacks are embarking on an application modernization journey and opting for this type of architecture. A microservice architecture allows applications to be composed of several loosely coupled discrete services that are independently deployable, scalable, and maintainable. These applications can comprise a large number of microservices, which often span multiple business units within an organization. These customers typically have a multi-account AWS environment with each AWS account belonging to an individual business unit. Their microservice implementations reside in the Virtual Private Clouds (VPCs) of their respective AWS accounts. You can set up a multi-account AWS environment incorporating best practices using AWS Landing Zone or AWS Control Tower.

This type of multi-account, multi-VPC architecture provides a good boundary and isolation for individual microservices and achieves a highly available, scalable, and secure architecture. However, for microservices that require a high degree of interconnectivity and are within the same trust boundaries, you can use other AWS capabilities to optimize cost and network management complexity.

This blog presents a cost-effective approach that requires less VPC management while still using separate accounts for billing and access control. This approach does not sacrifice scalability, high availability, fault tolerance, and security. To achieve a similar microservice architecture, you can share a VPC across AWS accounts using AWS Resource Access Manager (AWS RAM) and Network Load Balancer (NLB) support in a shared Amazon Virtual Private Cloud (VPC). This allows multiple microservices to coexist in the same VPC, even though they are developed by different business units.

Microservices architecture in a multi-VPC approach

In this architecture, microservices deployed across multiple VPCs use privately exposed endpoints for better security posture instead of going over the internet. This requires the customers to enable inter-VPC communication using the various networking capabilities of AWS as shown below:

microservices deployed across multiple VPCs use privately exposed endpoints

In the above reference architecture, we created a VPC in Account A, which is hosting the front end of the application across a fleet of Amazon Elastic Compute Cloud (Amazon EC2) instances using an AWS Auto Scaling group. For simplicity, we’ve illustrated a single public and private subnet for the application front end. In reality, this spans across multiple subnets across multiple Availability Zones (AZ) to support a highly available and fault-tolerant configuration.

To ensure security, the application must communicate privately to microservices mS1 and mS2 deployed in VPC of Account B and Account C respectively. For high availability, these microservices are also implemented using a fleet of Amazon EC2 instances with the Auto Scaling group spanning across multiple subnets/availability zones. For high-performance load balancing, they are fronted by a Network Load Balancer.

While this architecture shows an implementation using Amazon EC2, it can also use containerized services deployed using Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). These microservices may have interdependencies and invoke each other’s APIs to service requests from the application layer. This application-to-mS and mS-to-mS communication can be achieved using the following connectivity options:

When only a few VPC interconnections are required, Amazon VPC peering and AWS PrivateLink may be viable options. For a higher number of VPC interconnections, we recommend AWS Transit Gateway for better manageability of connections and routing through a centralized resource. However, depending on the amount of traffic, this can introduce significant costs to your architecture.

Alternative approach to microservice architecture using Network Load Balancers in a shared VPC

The above architecture pattern allows your individual microservice teams to continue to own their AWS resources that host their microservice implementation. But they can deploy them in a shared VPC owned by the central account, eliminating the need for inter-VPC network connections. You can share Amazon VPCs to use the implicit routing within a VPC for applications that require a high degree of interconnectivity and are within the same trust boundaries.

This architecture uses AWS RAM, which allows you to share the VPC Subnets from AWS Account A to participating AWS accounts within your AWS organization. When the subnets are shared, participant AWS accounts (Account B and Account C) can see the shared subnets in their own environment. They can then deploy their Amazon EC2 instances in those subnets. This is depicted in the diagram where the visibility of the shared subnets (SS1 and SS2) is extended to the participating accounts (Account B and Account C).

You can also deploy the NLB in these shared subnets. Then, each participant account owns all the AWS resources for their microservice stack, but it’s deployed in the VPC of Account A.

This allows your individual microservice teams to maintain control over load balancer configurations and Auto Scaling policies based on their specific microservices’ needs. At the same time, using AWS RAM, they are able to effectively use the existing VPC environment of Account A.

This architecture presents several benefits over the multi-VPC architecture discussed earlier:

  • You can deploy the entire application, including the individual microservices, into a single shared VPC. This is while still allowing individual microservice teams control over their AWS resources deployed in that VPC.
  • Since the entire architecture now resides in a single VPC, it doesn’t require other networking connectivity features. It can rely on intra-VPC traffic for communication between the application (API) layer and microservices.
  • This leads to reduction in cost of the architecture. While the AWS RAM functionality is free of charge, this also reduces the data transfer and per-connection costs incurred by other options such as VPC peering, AWS PrivateLink, and AWS Transit Gateway.
  • This maintains the isolation across the individual microservices and the application layer.  Participants can’t view, modify, or delete resources that belong to others or the VPC owner.
  • This also leads to effective utilization of your VPC CIDR block resources.
  • Since multiple subnets belonging to different Availability Zones are shared, the application and individual mS continues to take advantage of scalability, availability, and fault tolerance.

The following illustration shows how you can configure AWS RAM to set up the VPC subnet resource shares between owner Account A and participating Account B. The example below shows the sharing of private subnet SS1 using this method:


Accounts A and B Resource Share
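
The same share can be created with the AWS CLI. A sketch, assuming hypothetical account IDs (111111111111 for owner Account A, 222222222222 for participant Account B) and a placeholder subnet ID:

# Run in owner Account A: share subnet SS1 with participant Account B
aws ram create-resource-share --name shared-microservice-subnets \
    --resource-arns arn:aws:ec2:us-east-1:111111111111:subnet/subnet-0123456789abcdef0 \
    --principals 222222222222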

Once this subnet is shared, the participating Account B can launch its Network Load Balancer of its microservice ms1 in the shared VPC subnet as shown below:

Account B can launch its Network Load Balancer of its microservice ms1 in the shared VPC subnet

While this architecture has many advantages, there are important considerations:

  • This style of architecture is suitable when you are certain that the number of microservices is small enough to coexist in a single VPC without depleting the CIDR block of the shared subnets of the VPC.
  • If the traffic between these microservices is insignificant, then the cost benefit of this architecture over other options may not be substantial. This is due to the effect of traffic flow on data transfer cost.

Conclusion

AWS Cloud provides several options to build a microservices architecture. It is important to look at the characteristics of your application to determine which architectural choices to opt for. AWS RAM and the ability to deploy AWS resources (including Network Load Balancers) in a shared VPC help you eliminate inter-VPC traffic and associated networking costs, without sacrificing high availability, scalability, fault tolerance, or security for your application.

Top 10 Architecture Blog Posts of 2019

Post Syndicated from Annik Stahl original https://aws.amazon.com/blogs/architecture/top-10-architecture-blog-posts-of-2019/

As we wind our way toward 2020, I want to take a moment to first thank you, our readers, for spending time on our blog. We grew our audience quite a bit this year and the credit goes to our hard-working Solutions Architects and other blog post writers. Below are the top 10 Architecture blog posts written in 2019.

#10: How to Architect APIs for Scale and Security

by George Mao

George Mao, a Specialist Solutions Architect at AWS, focuses on serverless computing and has FIVE posts in the top ten this year. Way to go, George!

This post was the first in a series that focused on best practices and concepts you should be familiar with when you architect APIs for your applications.

Read George’s post.

#9: From One to Many: Evolving VPC Guidance

by Androski Spicer

Since its inception, the Amazon Virtual Private Cloud (VPC) has acted as the embodiment of security and privacy for customers who are looking to run their applications in a controlled, private, secure, and isolated environment.

This logically isolated space has evolved, and in its evolution has increased the avenues that customers can take to create and manage multi-tenant environments with multiple integration points for access to resources on-premises.

Read Androski’s post.

#8: Things to Consider When You Build REST APIs with Amazon API Gateway

by George Mao


This post dives deeper into the things an API architect or developer should consider when building REST APIs with Amazon API Gateway.

Read George’s post.

#7: How to Design Your Serverless Apps for Massive Scale

by George Mao


Serverless is one of the hottest design patterns in the cloud today, allowing you to focus on building and innovating, rather than worrying about the heavy lifting of server and OS operations. In this series of posts, we’ll discuss topics that you should consider when designing your serverless architectures. First, we’ll look at architectural patterns designed to achieve massive scale with serverless.

Read George’s post.

#6: Best Practices for Developing on AWS Lambda

by George Mao


One of the benefits of using Lambda, is that you don’t have to worry about server and infrastructure management. This means AWS will handle the heavy lifting needed to execute your AWS Lambda functions. Take advantage of this architecture with the tips in this post.

Read George’s post.

#5: Stream Amazon CloudWatch Logs to a Centralized Account for Audit and Analysis

by David Bailey

Figure 1 - Initial Landing Zone logging account resources

A key component of enterprise multi-account environments is logging. Centralized logging provides a single point of access to all salient logs generated across accounts and regions, and is critical for auditing, security and compliance. While some customers use the built-in ability to push Amazon CloudWatch Logs directly into Amazon Elasticsearch Service for analysis, others would prefer to move all logs into a centralized Amazon Simple Storage Service (Amazon S3) bucket location for access by several custom and third-party tools. In this blog post, David Bailey will show you how to forward existing and any new CloudWatch Logs log groups created in the future to a cross-account centralized logging Amazon S3 bucket.

Read David’s post.

#4: Updates to Serverless Architectural Patterns and Best Practices

by Drew Dennis

Drew wrote this post at about the halfway point between re:Invent 2018 and re:Invent 2019, where he revisited some of the recent serverless announcements we’ve made. These are all complementary to the patterns discussed in the re:Invent architecture track’s Serverless Architectural Patterns and Best Practices session.

Read Drew’s post.

#3: Understanding the Different Ways to Invoke Lambda Functions

by George Mao


In George’s first post of this series (#7 on this list), he talked about general design patterns to enable massive scale with serverless applications. In this post, he’ll review the different ways you can invoke Lambda functions and what you should be aware of with each invocation model.

Read George’s post.

#2: Using API Gateway as a Single Entry Point for Web Applications and API Microservices

by Anandprasanna Gaitonde and Mohit Malik

In this post, Anand and Mohit talk about a reference architecture that allows API Gateway to act as single entry point for external-facing, API-based microservices and web applications across multiple external customers by leveraging a different subdomain for each one.

Read Anand’s and Mohit’s post.

#1: 10 Things Serverless Architects Should Know

by Justin Pirtle

Building on the first three parts of the AWS Lambda scaling and best practices series where you learned how to design serverless apps for massive scale, AWS Lambda’s different invocation models, and best practices for developing with AWS Lambda, Justin invited you to take your serverless knowledge to the next level by reviewing 10 topics to deepen your serverless skills.

Read Justin’s post.

Thank You

Thanks again to all our readers and blog post writers. We look forward to learning and building amazing things together in the coming year.

Best of 2019

Coming soon: Updated Lambda states lifecycle for VPC networking

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/coming-soon-updated-lambda-states-lifecycle-for-vpc-networking/

On November 27, we announced that AWS Lambda now includes additional attributes in the function information returned by several Lambda API actions to better communicate the current “state” of your function when it is being created or updated. In our post “Tracking the state of AWS Lambda functions”, we covered the various states your Lambda function can be in, the conditions that lead to them, and how the Lambda service transitions the function through those states.

Our first feature using the function states lifecycle is a change to the recently announced improved VPC networking for AWS Lambda functions. As stated in the announcement post, Lambda creates the ENIs required for your function to connect to your VPCs, which can take 60–90 seconds to complete. We are updating this operation to explicitly place the function into a Pending state while pre-creating the required elastic network interface resources, and transitioning to an Active state after that process is completed. By doing this, we can use the lifecycle to complete the creation of these resources, and then reduce inconsistent invokes after the create/update has completed.

Most customers experience no impact from this change except for fewer long cold-starts due to network resource creation. As a reminder, any invocations or other API actions that operate on the function will fail during the time before the function is Active. To better assist you in adopting this behavior, we are rolling out this behavior for VPC configured functions in a phased manner. This post provides further details about timelines and approaches to both test the change before it is 100% live or delay it for your functions using a delay mechanism.

Changes to function create and update

On function create

During creation of new functions configured for VPC, your function remains in the Pending state until all VPC resources are created. You are not able to invoke the function or take any other Lambda API actions against it. After successful completion of the creation of these resources, your function transitions automatically to the Active state and is available for invokes and Lambda API actions. If the network resources fail to create then your function is placed in a Failed state.

On function update

During the update of functions configured for VPC, if there are any modifications to the VPC configuration, the function remains in the Active state, but shows in the InProgress status until all VPC resources are updated. During this time, any invokes go to the previous function code and configuration. After successful completion, the function LastUpdateStatus transitions automatically to Successful and all new invokes use the newly updated code and configuration. If the network resources fail to be created/updated then the LastUpdateStatus shows Failed, but the previous code and configuration remains in the Active state.

It’s important to note that creation or update of VPC resources can take between 60-90 seconds complete.

Change timeframe

As a reminder, all functions today show an Active state only. We are rolling out this change to create resources during the Pending state over a multiple phase period starting with the Begin Testing phase today, December 16, 2019. The phases allow you to update tooling for deploying and managing Lambda functions to account for this change. By the end of the update timeline, all accounts transition to using this new VPC resource create/update Lambda lifecycle.

Update timeline

December 16, 2019 – Begin Testing: You can now begin testing and updating any deployment or management tools you have to account for the upcoming lifecycle change. You can also use this time to update your function configuration to delay the change until the Delayed Update phase.

January 20, 2020 – General Update: All customers without the delayed update configuration begin seeing functions transition as described above under “On function create” and “On function update”.

February 17, 2020 – Delayed Update: The delay mechanism expires and customers now see the new VPC resource lifecycle applied during function create or update.

March 2, 2020 – Update End: All functions now have the new VPC resource lifecycle applied during function create or update.

Opt-in and delayed update configurations

Starting today, we are providing a mechanism for an opt-in, to allow you to update and test your tools and developer workflow processes for this change. We are also providing a mechanism to delay this change until the end of the Delayed Update phase. If you configure your functions for VPC and use the delayed update mechanism after the start of the General Update, your functions continue to experience a delayed first invocation due to VPC resource creation.

This mechanism operates on a function-by-function basis, so you can test and experiment individually without impacting your whole account. Once the General Update phase begins, all functions in an account that do not have the delayed update mechanism in place see the new lifecycle for their functions.

Both mechanisms work by adding a special string to the “Description” parameter of your Lambda functions. This string can be added as a prefix or suffix, or be the entire contents of the field.

To opt in:

aws:states:opt-in

To delay the update:

aws:states:opt-out

NOTE: The delay configuration mechanism has no effect after the Delayed Update phase ends.
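You can also set either marker from the command line. Here is a minimal sketch using a hypothetical function name; if your function already has a description, append the marker to it rather than overwriting the field:

# add the opt-in marker to a function's Description (function name is a placeholder)
aws lambda update-function-configuration \
    --function-name my-function \
    --description "aws:states:opt-in"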

Here is how this looks in the console:

  1. I add the opt-in configuration to my function’s Description.

    Opt-in in Description

  2. When I choose Save at the top, I see the update begin. During this time, I am blocked from executing tests, updating my code, and making some configuration changes against the function.

    Function updating

  3. After the update completes, I can once again run tests and other console commands.

    Function update successful

Once the opt-in is set for a function, updates on that function go through the update flow shown above. If I don’t change my function’s VPC configuration, updates to my function transition almost instantly to the Successful update status.

With this in place, you can now test your development workflow ahead of the General Update phase. Download the latest CLI (version 1.16.291 or greater) or SDKs in order to see function state and related attribute information.
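For example, with an updated CLI you can inspect a function's state and last update status directly; the function name below is a placeholder:

# requires AWS CLI version 1.16.291 or greater
aws lambda get-function-configuration \
    --function-name my-function \
    --query '{State: State, Reason: StateReason, LastUpdateStatus: LastUpdateStatus}'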

Conclusion

With function states, you have better clarity on how the resources required by your Lambda function are being created. This change does not impact the way that functions are invoked or how your code is executed. While this is a minor change to when resources are created for your Lambda function, the result is even better consistency of performance. Combined with the original announcement of improved VPC networking for Lambda, you experience better consistency for invokes, greatly reduced cold-starts, and fewer network resources created for your functions.

re:Invent 2019: Introducing the Amazon Builders’ Library (Part I)

Post Syndicated from Annik Stahl original https://aws.amazon.com/blogs/architecture/reinvent-2019-introducing-the-amazon-builders-library-part-i/

Today, I’m going to tell you about a new site we launched at re:Invent, the Amazon Builders’ Library, a collection of living articles covering topics across architecture, software delivery, and operations. You get to peek under the hood of how Amazon architects, releases, and operates the software underpinning Amazon.com and AWS.

Want to know how Amazon.com does what it does? This is for you. In this two-part series (the next one coming December 23), I’ll highlight some of the best architecture articles written by Amazon’s senior technical leaders and engineers.

Avoiding insurmountable queue backlogs

In queueing theory, the behavior of queues when they are short is relatively uninteresting. After all, when a queue is short, everyone is happy. It’s only when the queue is backlogged, when the line to an event goes out the door and around the corner, that people start thinking about throughput and prioritization.

In this article, I discuss strategies we use at Amazon to deal with queue backlog scenarios – design approaches we take to drain queues quickly and to prioritize workloads. Most importantly, I describe how to prevent queue backlogs from building up in the first place. In the first half, I describe scenarios that lead to backlogs, and in the second half, I describe many approaches used at Amazon to avoid backlogs or deal with them gracefully.

Read the full article by David Yanacek – Principal Engineer

Timeouts, retries, and backoff with jitter

Whenever one service or system calls another, failures can happen. These failures can come from a variety of factors: servers, networks, load balancers, software, operating systems, or even mistakes from system operators. We design our systems to reduce the probability of failure, but it is impossible to build systems that never fail. So at Amazon, we design our systems to tolerate and reduce the probability of failure, and to avoid magnifying a small percentage of failures into a complete outage. To build resilient systems, we employ three essential tools: timeouts, retries, and backoff.
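As a rough sketch of how these tools combine (the command, attempt limit, and sleep bounds below are placeholders, not a prescription from the article), a shell loop can retry a call with exponential backoff and full jitter:

max_attempts=5
attempt=1
# retry the command until it succeeds or the attempt budget is exhausted
until aws s3 cp ./payload.json s3://my-bucket/payload.json; do
  if [ "$attempt" -ge "$max_attempts" ]; then
    echo "giving up after $max_attempts attempts" >&2
    exit 1
  fi
  # full jitter: sleep a random duration between 0 and 2^attempt seconds
  sleep $(( RANDOM % (2 ** attempt) ))
  attempt=$(( attempt + 1 ))
done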

Read the full article by Marc Brooker, Senior Principal Engineer

Challenges with distributed systems

The moment we added our second server, distributed systems became the way of life at Amazon. When I started at Amazon in 1999, we had so few servers that we could give some of them recognizable names like “fishy” or “online-01”. However, even in 1999, distributed computing was not easy. Then as now, challenges with distributed systems involved latency, scaling, understanding networking APIs, marshalling and unmarshalling data, and the complexity of algorithms such as Paxos. As the systems quickly grew larger and more distributed, what had been theoretical edge cases turned into regular occurrences.

Developing distributed utility computing services, such as reliable long-distance telephone networks, or Amazon Web Services (AWS) services, is hard. Distributed computing is also weirder and less intuitive than other forms of computing because of two interrelated problems. Independent failures and nondeterminism cause the most impactful issues in distributed systems. In addition to the typical computing failures most engineers are used to, failures in distributed systems can occur in many other ways. What’s worse, it’s not always possible to know whether something failed.

Read the full article by Jacob Gabrielson, Senior Principal Engineer

Static stability using Availability Zones

At Amazon, the services we build must meet extremely high availability targets. This means that we need to think carefully about the dependencies that our systems take. We design our systems to stay resilient even when those dependencies are impaired. In this article, we’ll define a pattern that we use called static stability to achieve this level of resilience. We’ll show you how we apply this concept to Availability Zones, a key infrastructure building block in AWS and therefore a bedrock dependency on which all of our services are built.

Read the full article by Becky Weiss, Senior Principal Engineer, and Mike Furr, Principal Engineer

Check back in two weeks to read about some other architecture-based expert articles that let you in on how Amazon does what it does.

New for AWS Transit Gateway – Build Global Networks and Centralize Monitoring Using Network Manager

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/new-for-aws-transit-gateway-build-global-networks-and-centralize-monitoring-using-network-manager/

As your company grows and gets the benefits of a cloud-based infrastructure, your on-premises sites like offices and stores increasingly need high performance private connectivity to AWS and to other sites at a reasonable cost. Growing your network is hard, because traditional branch networks based on leased lines are costly, and they suffer from the same lack of elasticity and agility as traditional data centers.

At the same time, it becomes increasingly complex to manage and monitor a global network that is spread across AWS regions and on-premises sites. You need to stitch together data from these diverse locations. This results in an inconsistent operational experience, increased costs and efforts, and missed insights from the lack of visibility across different technologies.

Today, we want to make it easier to build, manage, and monitor global networks with the following new capabilities for AWS Transit Gateway:

  • Transit Gateway Inter-Region Peering
  • Accelerated Site-to-Site VPN
  • AWS Transit Gateway Network Manager

These new networking capabilities enable you to optimize your network using AWS’s global backbone, and to centrally visualize and monitor your global network. More specifically:

  • Inter-Region Peering and Accelerated VPN improve application performance by leveraging the AWS Global Network. In this way, you can reduce the number of leased lines required to operate your network, optimizing your cost and improving agility. Transit Gateway Inter-Region Peering sends inter-region traffic privately over AWS’s global network backbone. Accelerated VPN uses AWS Global Accelerator to route VPN traffic from remote locations through the closest AWS edge location to improve connection performance.
  • Network Manager reduces the operational complexity of managing a global network across AWS and on-premises. With Network Manager, you set up a global view of your private network simply by registering your Transit Gateways and on-premises resources. Your global network can then be visualized and monitored via a centralized operational dashboard.

These features allow you to optimize connectivity from on-premises sites to AWS and also between on-premises sites, by routing traffic through Transit Gateways and the AWS Global Network, and centrally managing through Network Manager.

Visualizing Your Global Network
In the Network Manager console, which you can reach from the Transit Gateways section of the Amazon Virtual Private Cloud console, you have an overview of your global networks. Each global network includes AWS and on-premises resources. Specifically, it provides a central point of management for your AWS Transit Gateways, the physical devices and sites connected to the Transit Gateways via Site-to-Site VPN connections, and the AWS Direct Connect locations attached to the Transit Gateways.

For example, this is the Geographic view of a global network covering North America and Europe with 5 Transit Gateways in 3 AWS Regions, 80 VPCs, 50 VPNs, 1 Direct Connect location, and 16 on-premises sites with 50 devices:

As I zoom in on the map, I get a description of what these nodes represent, for example whether they are AWS Regions, Direct Connect locations, or branch offices.

I can select any node in the map to get more information. For example, I select the US West (Oregon) AWS Region to see the details of the two Transit Gateways I am using there, including the state of all VPN connections, and the VPCs and VPNs handled by the selected Transit Gateways.

Selecting a site, I get a centralized view with the status of the VPN connections, including site metadata such as address, location, and description. For example, here are the details of the Colorado branch offices.

In the Topology panel, I see the logical relationship of all the resources in my network. On the left is the entire topology of my global network; on the right, the detail of the European part. Connection status is indicated by color in the topology view.

Selecting any node in the topology map displays details specific to the resource type (Transit Gateway, VPC, customer gateway, and so on) including links to the corresponding service in the AWS console to get more information and configure the resource.

Monitoring Your Global Network
Network Manager uses Amazon CloudWatch, which collects raw data and processes it into readable, near real-time metrics for data in/out, packets dropped, and VPN connection status.

These statistics are kept for 15 months, so that you can access historical information and gain a better perspective on how your web application or service is performing. You can also set alarms that watch for certain thresholds, and send notifications or take actions when those thresholds are met.
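For example, an alarm on sustained packet drops for a Transit Gateway could be sketched as follows. The AWS/TransitGateway namespace, the PacketDropCountNoRoute metric name, and all IDs/ARNs here are assumptions to verify against the CloudWatch documentation for your Region:

# alarm if more than 100 packets are dropped (no route) in three consecutive 5-minute periods
aws cloudwatch put-metric-alarm \
    --alarm-name tgw-dropped-packets \
    --namespace AWS/TransitGateway \
    --metric-name PacketDropCountNoRoute \
    --dimensions Name=TransitGateway,Value=tgw-0123456789abcdef0 \
    --statistic Sum \
    --period 300 \
    --evaluation-periods 3 \
    --threshold 100 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:network-alerts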

For example, these are the last 12 hours of Monitoring for the Transit Gateway in Europe (Ireland).

In the global network view, you have a single point of view of all events affecting your network, simplifying root cause analysis in case of issues. Clicking on any of the messages in the console takes you to a more detailed view in the Events tab.

Your global network events are also delivered by CloudWatch Events. Using simple rules that you can quickly set up, you can match events and route them to one or more target functions or streams. To process the same events, you can also use the additional capabilities offered by Amazon EventBridge.

Network Manager sends the following types of events (a sample rule sketch follows the list):

  • Topology changes, for example when a VPN connection is created for a transit gateway.
  • Routing updates, such as when a route is deleted in a transit gateway route table.
  • Status updates, for example in case a VPN tunnel’s BGP session goes down.
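As a sketch, a rule that forwards these events to an SNS topic might look like the following; the aws.networkmanager event source and the topic ARN are assumptions to check against the Network Manager documentation:

# match all Network Manager events and send them to an SNS topic
aws events put-rule \
    --name network-manager-events \
    --event-pattern '{"source": ["aws.networkmanager"]}'

aws events put-targets \
    --rule network-manager-events \
    --targets "Id"="1","Arn"="arn:aws:sns:us-east-1:123456789012:network-alerts"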

Configuring Your Global Network
To get your on-premises resources included in the above visualizations and monitoring, you need to input into Network Manager information about your on-premises devices, sites, and links. You also need to associate devices with the customer gateways they host for VPN connections.

Our software-defined wide area network (SD-WAN) partners, such as Cisco, Aruba, Silver Peak, and Aviatrix, have configured their SD-WAN devices to connect with Transit Gateway Network Manager in only a few clicks. Their SD-WANs also define the on-premises devices, sites, and links automatically in Network Manager. SD-WAN integrations enable you to include your on-premises network in the Network Manager global dashboard view without requiring you to input information manually.

Available Now
AWS Transit Gateway Network Manager is a global service available for Transit Gateways in the following regions: US East (N. Virginia), US East (Ohio), US West (N. California), US West (Oregon), Europe (Ireland), Europe (Frankfurt), Europe (London), Europe (Paris), Asia Pacific (Singapore), Asia Pacific (Tokyo), Asia Pacific (Seoul), Asia Pacific (Sydney), Asia Pacific (Mumbai), Canada (Central), South America (São Paulo).

There is no additional cost for using Network Manager. You pay for the network resources you use, like Transit Gateways, VPNs, and so on. Here you can find more information on pricing for VPN and Transit Gateway.

You can learn more in the documentation for Network Manager, Inter-Region Peering, and Accelerated VPN.

With these new features, you can take advantage of the performance of our AWS Global Network, and simplify network management and monitoring across your AWS and on-premises resources.

Danilo

New – VPC Ingress Routing – Simplifying Integration of Third-Party Appliances

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/new-vpc-ingress-routing-simplifying-integration-of-third-party-appliances/

When I was delivering the Architecting on AWS class, customers often asked me how to configure an Amazon Virtual Private Cloud to enforce the same network security policies in the cloud as they have on-premises. For example, to scan all ingress traffic with an Intrusion Detection System (IDS) appliance or to use the same firewall in the cloud as on-premises. Until today, the only answer I could provide was to route all traffic back from their VPC to an on-premises appliance or firewall in order to inspect the traffic with their usual networking gear before routing it back to the cloud. This is obviously not an ideal configuration; it adds latency and complexity.

Today, we announce new VPC networking routing primitives that allow you to route all incoming and outgoing traffic to/from an Internet Gateway (IGW) or Virtual Private Gateway (VGW) to a specific Amazon Elastic Compute Cloud (EC2) instance’s Elastic Network Interface. This means you can now configure your Virtual Private Cloud to send all traffic to an EC2 instance before the traffic reaches your business workloads. The instance typically runs network security tools to inspect or block suspicious network traffic (such as an IDS/IPS or firewall), or to perform any other network traffic inspection, before relaying the traffic to other EC2 instances.

How Does it Work?
To learn how it works, I wrote this CDK script to create a VPC with two public subnets: one subnet for the appliance and one subnet for a business application. The script launches two EC2 instances with public IP addresses, one in each subnet. The script creates the below architecture:

This is a regular VPC; the subnets have routing tables to the Internet Gateway and the traffic flows in and out as expected. The application instance hosts a static web site that is accessible from any browser. You can retrieve the application public DNS name from the EC2 Console (for your convenience, I also included the CLI version in the comments of the CDK script).

AWS_REGION=us-west-2
APPLICATION_IP=$(aws ec2 describe-instances                           \
                     --region $AWS_REGION                             \
                     --query "Reservations[].Instances[] | [?Tags[?Key=='Name' && Value=='application']].NetworkInterfaces[].Association.PublicDnsName"  \
                     --output text)
				   
curl -I $APPLICATION_IP

Configure Routing
To configure routing, you need to know the VPC ID, the ENI ID of the ENI attached to the appliance instance, and the Internet Gateway ID. Assuming you created the infrastructure using the CDK script I provided, here are the commands I use to find these three IDs (be sure to adjust to the AWS Region you use):

AWS_REGION=us-west-2
VPC_ID=$(aws cloudformation describe-stacks                              \
             --region $AWS_REGION                                        \
             --stack-name VpcIngressRoutingStack                         \
             --query "Stacks[].Outputs[?OutputKey=='VPCID'].OutputValue" \
             --output text)

ENI_ID=$(aws ec2 describe-instances                                       \
             --region $AWS_REGION                                         \
             --query "Reservations[].Instances[] | [?Tags[?Key=='Name' &&  Value=='appliance']].NetworkInterfaces[].NetworkInterfaceId" \
             --output text)

IGW_ID=$(aws ec2 describe-internet-gateways                               \
             --region $AWS_REGION                                         \
             --query "InternetGateways[] | [?Attachments[?VpcId=='${VPC_ID}']].InternetGatewayId" \
             --output text)

To route all incoming traffic through my appliance, I create a routing table for the Internet Gateway and I attach a rule to direct all traffic to the EC2 instance Elastic Network Interface (ENI):

# create a new routing table for the Internet Gateway
ROUTE_TABLE_ID=$(aws ec2 create-route-table                      \
                     --region $AWS_REGION                        \
                     --vpc-id $VPC_ID                            \
                     --query "RouteTable.RouteTableId"           \
                     --output text)

# create a route for 10.0.1.0/24 pointing to the appliance ENI
aws ec2 create-route                             \
    --region $AWS_REGION                         \
    --route-table-id $ROUTE_TABLE_ID             \
    --destination-cidr-block 10.0.1.0/24         \
    --network-interface-id $ENI_ID

# associate the routing table to the Internet Gateway
aws ec2 associate-route-table                      \
    --region $AWS_REGION                           \
    --route-table-id $ROUTE_TABLE_ID               \
    --gateway-id $IGW_ID

Alternatively, I can use the VPC Console under the new Edge Associations tab.

To route all application outgoing traffic through the appliance, I replace the default route for the application subnet to point to the appliance’s ENI:

SUBNET_ID=$(aws ec2 describe-instances                                  \
                --region $AWS_REGION                                    \
                --query "Reservations[].Instances[] | [?Tags[?Key=='Name' && Value=='application']].NetworkInterfaces[].SubnetId"    \
                --output text)
ROUTING_TABLE=$(aws ec2 describe-route-tables                           \
                    --region $AWS_REGION                                \
                    --query "RouteTables[?VpcId=='${VPC_ID}'] | [?Associations[?SubnetId=='${SUBNET_ID}']].RouteTableId" \
                    --output text)

# delete the existing default route (the one pointing to the internet gateway)
aws ec2 delete-route                       \
    --region $AWS_REGION                   \
    --route-table-id $ROUTING_TABLE        \
    --destination-cidr-block 0.0.0.0/0
	
# create a default route pointing to the appliance's ENI
aws ec2 create-route                          \
    --region $AWS_REGION                      \
    --route-table-id $ROUTING_TABLE           \
    --destination-cidr-block 0.0.0.0/0        \
    --network-interface-id $ENI_ID
	
aws ec2 associate-route-table       \
    --region $AWS_REGION            \
    --route-table-id $ROUTING_TABLE \
    --subnet-id $SUBNET_ID

Alternatively, I can use the VPC Console. Within the correct routing table, I select the Routes tab and click Edit routes to replace the default route (the one pointing to 0.0.0.0/0) to target the appliance’s ENI.

Now I have the routing configuration in place. The new routing looks like:

Configure the Appliance Instance
Finally, I configure the appliance instance to forward all traffic it receives. Your software appliance usually does that for you; no extra step is required when you use AWS Marketplace appliances. When using a plain Linux instance, two extra steps are required:

1. Connect to the EC2 appliance instance and configure IP traffic forwarding in the kernel:

APPLIANCE_ID=$(aws ec2 describe-instances  \
                   --region $AWS_REGION    \
                   --query "Reservations[].Instances[] | [?Tags[?Key=='Name' && Value=='appliance']].InstanceId" \
                   --output text)
aws ssm start-session --region $AWS_REGION --target $APPLIANCE_ID	

##
## once connected (you see the 'sh-4.2$' prompt), type:
##

sudo sysctl -w net.ipv4.ip_forward=1
sudo sysctl -w net.ipv6.conf.all.forwarding=1
exit

2. Configure the EC2 instance to accept traffic destined for addresses other than its own by disabling the Source/Destination check:

aws ec2 modify-instance-attribute --region $AWS_REGION \
                         --no-source-dest-check        \
                         --instance-id $APPLIANCE_ID

Now, the appliance is ready to forward traffic to the other EC2 instances. You can test this by pointing your browser (or using `cURL`) to the application instance.

APPLICATION_IP=$(aws ec2 describe-instances --region $AWS_REGION                          \
                     --query "Reservations[].Instances[] | [?Tags[?Key=='Name' && Value=='application']].NetworkInterfaces[].Association.PublicDnsName"  \
                     --output text)
				   
curl -I $APPLICATION_IP

To verify the traffic is really flowing through the appliance, you can enable the Source/Destination check on the instance again (use the --source-dest-check parameter with the modify-instance-attribute CLI command above). The traffic is blocked when the Source/Destination check is enabled.
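For reference, re-enabling the check is the same command with the positive form of the parameter:

# re-enable the Source/Destination check; traffic no longer flows through the appliance
aws ec2 modify-instance-attribute --region $AWS_REGION \
                         --source-dest-check          \
                         --instance-id $APPLIANCE_ID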

Cleanup
Should you use the CDK script I provided for this article, be sure to run cdk destroy when finished. This ensures you are not billed for the two EC2 instances I used for this demo. As I modified routing tables behind the back of AWS CloudFormation, I need to manually delete the routing tables, the subnet, and the VPC. The easiest way is to navigate to the VPC Console, select the VPC, and click Actions => Delete VPC. The console deletes all components in the correct order. You might need to wait 5-10 minutes after the end of cdk destroy before the console is able to delete the VPC.

From our Partners
During the beta test of these new routing capabilities, we granted early access to a collection of AWS partners. They provided us with tons of helpful feedback. Here are some of the blog posts that they wrote in order to share their experiences (I am updating this article with links as they are published):

  • 128 Technology
  • Aviatrix
  • Checkpoint
  • Cisco
  • Citrix
  • FireEye
  • Fortinet
  • HashiCorp
  • IBM Security
  • Lastline
  • Netscout
  • Palo Alto Networks
  • ShieldX Networks
  • Sophos
  • Trend Micro
  • Valtix
  • Vectra AI
  • Versa Networks

Availability
There is no additional cost to use Virtual Private Cloud ingress routing. It is available in all regions (including AWS GovCloud (US-West)) and you can start using it today.

You can learn more about gateway route tables in the updated VPC documentation.

What are the appliances you are going to use with this new VPC routing capability?

— seb

AWS Firewall Manager Update – Support for VPC Security Groups

Post Syndicated from Jeff Barr original https://aws.amazon.com/blogs/aws/aws-firewall-manager-update-support-for-vpc-security-groups/

I introduced you to AWS Firewall Manager last year, and showed you how you can use it to centrally configure and manage your AWS WAF rules and AWS Shield Advanced protections. AWS Firewall Manager makes use of AWS Organizations, and lets you build policies and apply them across multiple AWS accounts in a consistent manner.

Security Group Support
Today we are making AWS Firewall Manager even more useful, giving you the power to define, manage, and audit organization-wide policies for the use of VPC Security Groups.

You can use the policies to apply security groups to specified accounts and resources, check and manage the rules that are used in security groups, and find and then clean up unused and redundant security groups. You get real-time notification when misconfigured rules are detected, and can take corrective action from within the Firewall Manager Console.

In order to make use of this feature, you need to have an AWS Organization and AWS Config must be enabled for all of the accounts in it. You must also designate an AWS account as the Firewall Administrator. This account has permission to deploy AWS WAF rules, Shield Advanced protections, and security group rules across your organization.

Creating and Using Policies
After logging in to my organization’s root account, I open the Firewall Manager Console, and click Go to AWS Firewall Manager:

Then I click Security Policies in the AWS FMS section to get started. The console displays my existing policies (if any); I click Create policy to move ahead:

I select Security group as the Policy type and Common security groups as the Security group policy type, choose the target region, and click Next to proceed (I will examine the other policy types in a minute):

I give my policy a name (OrgDefault), choose a security group (SSH_Only), and opt to protect the group’s rules from changes, then click Next:

Now I define the scope of the policy. As you can see, I can choose the accounts, resource types, and even specifically tagged resources, before clicking Next:

I can also choose to exclude resources that are tagged in a particular way; this can be used to create an organization-wide policy that provides special privileges for a limited group of resources.

I review my policy, confirm that I must enable AWS Config and pay the associated charges, and click Create policy:

The policy takes effect immediately, and begins to evaluate compliance within 3-5 minutes. The Firewall Manager Policies page shows an overview:

I can click the policy to learn more:

Policies also have an auto-remediation option. While this can be enabled when the policy is created, our advice is to wait until after the policy has taken effect, so that you can see what will happen when you go ahead and enable auto-remediation:

Let’s take a look at the other two security group policy types:

Auditing and enforcement of security group rules – This policy type centers around an audit security group that can be used in one of two ways:

You can use this policy type when you want to establish guardrails that establish limits on the rules that can be created. For example, I could create a policy rule that allows inbound access from a specific set of IP addresses (perhaps a /24 used by my organization), and use it to detect any resource that is more permissive.

Auditing and cleanup of unused and redundant security groups – This policy type looks for security groups that are not being used, or that are redundant:

Available Now
You can start to use this feature today in the US East (N. Virginia), US East (Ohio), US West (Oregon), US West (N. California), Europe (Ireland), Europe (Frankfurt), Europe (London), Asia Pacific (Sydney), Asia Pacific (Tokyo), Asia Pacific (Singapore), and Asia Pacific (Seoul) Regions. You will be charged $100 per policy per month.

Jeff;

New Zealand Internet Connectivity to AWS

Post Syndicated from Cameron Tod original https://aws.amazon.com/blogs/architecture/new-zealand-internet-connectivity-to-aws/

Amazon Web Services (AWS) serves more than a million private and public sector organizations all over the world from its extensive and expanding global infrastructure.

As in other countries, organizations all around New Zealand are using AWS to change the way they operate. For example, Xero, a Wellington-based online accountancy software vendor, now serves customers in more than 100 countries, while the Department of Conservation provides its end users with virtual desktops running in Amazon Workspaces.

New Zealand doesn’t currently have a dedicated AWS Region. Geographically, the closest is Asia Pacific (Sydney), which is 2,000 kilometers (km) away across the Tasman Sea. While customers rely on AWS for business-critical workloads, they are well-served by New Zealand’s international connectivity.

To connect to Amazon’s network, our New Zealand customers have a range of options:

  • Public internet endpoints
  • Managed or software Virtual Private Networks (VPN)
  • AWS Direct Connect (DX).

All rely on the extensive internet infrastructure connecting New Zealand to the world.

International Connectivity

The vast majority of internet traffic is carried over physical cables, while the percentage of traffic moving over satellite or wireless links is small by comparison.

Historically, cables were funded and managed by consortia of telecommunication providers. More recently, large infrastructure and service providers like AWS have contributed to or are building their own cable networks.

There are currently about 400 submarine cables in service globally. Modern submarine cables are fiber-optic, run for thousands of kilometers, and are protected by steel strands, plastic sheathing, copper, and a chemical water barrier. Over that distance, the signal can weaken—or attenuate—so signal repeaters are installed approximately every 50km to mitigate attenuation. Repeaters are powered by a charge running over the copper sheathing in the cable.

An example of submarine cable composition. Source: Wikimedia Commons

For most of their run, these cables are about as thick as a standard garden hose. They are thicker, however, closer to shore and in areas where there’s a greater risk of damage by fishing nets, boat anchors, etc.

Cables can—and do—break, but redundancy is built into the network. According to Telegeography, there are 100 submarine cable faults globally every year. However, most faults don’t impact users meaningfully.

New Zealand is served by four main cables:

  1. Hawaiki: Sydney -> Mangawhai (Northland, NZ) -> Kapolei (Hawaii, USA) -> Hillsboro, Oregon (USA) – 44 Terabits per second (Tbps)
  2. Tasman Global Access: Raglan (Auckland, New Zealand) -> Narrabeen (NSW, Australia) – 20 Tbps
  3. Southern Cross A: Whenuapai (Auckland, New Zealand) -> Alexandria (NSW, Australia) – 1.2 Tbps
  4. Southern Cross B: Takapuna (Auckland, New Zealand) -> Spencer Beach (Hawaii, USA) – 1.2 Tbps
A map of major submarine cables connecting to New Zealand. Source: submarinecablemap.com

The four cables combined currently deliver 66 Tbps of available capacity. The Southern Cross NEXT cable is due to come online in 2020, which will add another 72 Tbps. These are, of course, potential capacities; it’s likely the “lit” capacity—the proportion of the cables’ overall capacity that is actually in use—is much lower.

Connecting to AWS from New Zealand

While understanding the physical infrastructure is useful background, in practice these details are not exposed to customers. Connectivity options are evaluated on the basis of partner and AWS offerings, which include connectivity.

Customers connect to AWS in three main ways: over public endpoints, via site-to-site VPNs, and via Direct Connect (DX), all typically provided by partners.

Public Internet Endpoints

Customers can connect to public endpoints for AWS services over the public internet. Some services, like Amazon CloudFront, Amazon API Gateway, and Amazon WorkSpaces are generally used in this way.

Network-level access can be controlled via various means depending on the service, whether that is Endpoint Policies for API Gateway, Security Groups, and Network Access Control Lists for Amazon Virtual Private Cloud (VPC), or Resource Policies for services such as Amazon S3, Amazon Simple Queue Service (SQS), or Amazon Key Management Service (KMS).

All services offer TLS or IPsec connectivity for secure encryption-in-motion.

Site-to-Site Virtual Private Network

Many organizations use a VPN to connect to AWS. It’s the simplest and lowest cost entry point to expose resources deployed in private ranges in an Amazon VPC. Amazon VPC allows customers to provision a logically isolated network segment, with fine-grained control of IP ranges, filtering rules, and routing.

AWS offers a managed site-to-site VPN service, which creates secure, redundant Internet Protocol Security (IPSec) VPNs, and also handles maintenance and high-availability while integrating with Amazon CloudWatch for robust monitoring.

If using an AWS managed VPN, the AWS endpoints have publicly routable IPs. They can be connected to over the public internet or via a Public Virtual Interface over DX (outlined below).

Customers can also deploy VPN appliances onto Amazon Elastic Compute Cloud (EC2) instances running in their VPC. These may be self-managed or provided by Amazon Marketplace sellers.

AWS also offers AWS Client VPN, for direct user access to AWS resources.

AWS Direct Connect

While connectivity over the internet is secure and flexible, it has one major disadvantage: it’s unpredictable. By design, traffic traversing the internet can take any path to reach its destination. Most of the time it works but occasionally routing conditions may reduce capacity or increase latency.

DX connections are either 1 or 10 Gigabits per second (Gbps). This capacity is dedicated to the customer; it isn’t shared, as other network users are never routed over the connection. This means customers can rely on consistent latency and bandwidth. The DX per-Gigabit transfer cost is lower than other egress mechanisms. For customers transferring large volumes of data, DX may be more cost effective than other means of connectivity.

Customers may publish their own 802.1Q Virtual Local Area Network (VLAN) tags across the DX, and advertise routes via Border Gateway Protocol (BGP). A dedicated connection supports up to 50 private or public virtual interfaces. New Zealand does not have a physical point-of-presence for DX—users must procure connectivity to our Sydney Region. Many AWS Partner Network (APN) members in New Zealand offer this connectivity.

For customers who don’t want or need to manage VLANs to AWS, or prefer 1 Gbps or smaller links, APN partners offer hosted connections or hosted virtual interfaces. For more detail, please review our AWS Direct Connect Partners page.

Performance

There are physical limits to latency dictated by the speed of light and the medium through which optical signals travel. Light in optical fiber propagates at roughly two-thirds of its vacuum speed, or about 200,000 km per second, so the 2,276 km Alexandria to Whenuapai link has a theoretical one-way minimum of roughly 11 milliseconds (ms). Southern Cross publishes latency statistics, and it sees one-way latency of approximately 11 ms over that link. Double that for a round-trip of 22 ms.

In practice, we see customers achieving round-trip times from user workstations to Sydney in approximately 30-50 ms, assuming fair-weather internet conditions or DX links. Latency in Auckland (the largest city) tends to be on the lower end of that spectrum, while the rest of the country tends towards the higher end.

Bandwidth constraints are more often dictated by client hardware, but AWS and our partners offer up to 10 Gbps links, or smaller as required. For customers that require more than 10 Gbps over a single link, AWS supports Link Aggregation Groups (LAG).

As outlined above, there are a range of ways for customers to adopt AWS via secure, reliable, and performant networks. To discuss your use case, please contact an AWS Solutions Architect.

 

Update: Issue affecting HashiCorp Terraform resource deletions after the VPC Improvements to AWS Lambda

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/update-issue-affecting-hashicorp-terraform-resource-deletions-after-the-vpc-improvements-to-aws-lambda/

On September 3, 2019, we announced an exciting update that improves the performance, scale, and efficiency of AWS Lambda functions when working with Amazon VPC networks. You can learn more about the improvements in the original blog post. These improvements represent a significant change in how elastic network interfaces (ENIs) are configured to connect to your VPCs. With this new model, we identified an issue where VPC resources, such as subnets, security groups, and VPCs, can fail to be destroyed via HashiCorp Terraform. More information about the issue can be found here. In this post, we help you identify whether this issue affects you and show the steps to resolve it.

How do I know if I’m affected by this issue?

This issue only affects you if you use HashiCorp Terraform to destroy environments. Versions of Terraform AWS Provider that are v2.30.0 or older are impacted by this issue. With these versions you may encounter errors when destroying environments that contain AWS Lambda functions, VPC subnets, security groups, and Amazon VPCs. Typically, terraform destroy fails with errors similar to the following:

Error deleting subnet: timeout while waiting for state to become 'destroyed' (last state: 'pending', timeout: 20m0s)

Error deleting security group: DependencyViolation: resource sg-<id> has a dependent object
        	status code: 400, request id: <guid>

Depending on which AWS Regions the VPC improvements have been rolled out to, you may encounter these errors in some Regions and not others.

How do I resolve this issue if I am affected?

You have two options to resolve this issue. The recommended option is to upgrade your Terraform AWS Provider to v2.31.0 or later. To learn more about upgrading the Provider, visit the Terraform AWS Provider Version 2 Upgrade Guide. You can find information and source code for the latest releases of the AWS Provider on this page. The latest version of the Terraform AWS Provider contains a fix for this issue as well as changes that improve the reliability of the environment destruction process. We highly recommend that you upgrade the Provider version as the preferred option to resolve this issue.

If you are unable to upgrade the Provider version, you can mitigate the issue by making changes to your Terraform configuration. You need to make the following sets of changes to your configuration:

  1. Add an explicit dependency, using a depends_on argument, to the aws_security_group and aws_subnet resources that you use with your Lambda functions. The dependency has to be added on the aws_security_group or aws_subnet and must target the aws_iam_role_policy_attachment (or aws_iam_policy) resource associated with the IAM role configured on the Lambda function. See the example below for more details.
  2. Override the delete timeout for all aws_security_group and aws_subnet resources. The timeout should be set to 40 minutes.

The following configuration file shows an example where these changes have been made (scroll to see the full code):

provider "aws" {
    region = "eu-central-1"
}
 
resource "aws_iam_role" "lambda_exec_role" {
  name = "lambda_exec_role"
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "lambda.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF
}
 
data "aws_iam_policy" "LambdaVPCAccess" {
  arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaVPCAccessExecutionRole"
}
 
resource "aws_iam_role_policy_attachment" "sto-lambda-vpc-role-policy-attach" {
  role       = "${aws_iam_role.lambda_exec_role.name}"
  policy_arn = "${data.aws_iam_policy.LambdaVPCAccess.arn}"
}
 
resource "aws_security_group" "allow_tls" {
  name        = "allow_tls"
  description = "Allow TLS inbound traffic"
  vpc_id      = "vpc-<id>"
 
  ingress {
    # TLS (change to whatever ports you need)
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    # Please restrict your ingress to only necessary IPs and ports.
    # Opening to 0.0.0.0/0 can lead to security vulnerabilities.
    cidr_blocks = ["0.0.0.0/0"]
  }
 
  egress {
    from_port       = 0
    to_port         = 0
    protocol        = "tcp"
    cidr_blocks     = ["0.0.0.0/0"]
  }
  
  timeouts {
    delete = "40m"
  }
  depends_on = ["aws_iam_role_policy_attachment.sto-lambda-vpc-role-policy-attach"]  
}
 
resource "aws_subnet" "main" {
  vpc_id     = "vpc-<id>"
  cidr_block = "172.31.68.0/24"

  timeouts {
    delete = "40m"
  }
  depends_on = ["aws_iam_role_policy_attachment.sto-lambda-vpc-role-policy-attach"]
}
 
resource "aws_lambda_function" "demo_lambda" {
    function_name = "demo_lambda"
    handler = "index.handler"
    runtime = "nodejs10.x"
    filename = "function.zip"
    source_code_hash = "${filebase64sha256("function.zip")}"
    role = "${aws_iam_role.lambda_exec_role.arn}"
    vpc_config {
     subnet_ids         = ["${aws_subnet.main.id}"]
     security_group_ids = ["${aws_security_group.allow_tls.id}"]
  }
}

The key block to note here is the following, which can be seen in both the “allow_tls” security group and “main” subnet resources:

timeouts {
  delete = "40m"
}
depends_on = ["aws_iam_role_policy_attachment.sto-lambda-vpc-role-policy-attach"]

These changes should be made to your Terraform configuration files before destroying your environments for the first time.

Can I delete resources remaining after a failed destroy operation?

Destroying environments without upgrading the provider or making the configuration changes outlined above may result in failures. As a result, you may have ENIs in your account that remain due to a failed destroy operation. These ENIs can be manually deleted a few minutes after the Lambda functions that use them have been deleted (typically within 40 minutes). Once the ENIs have been deleted, you can re-run terraform destroy.

One to Many: Evolving VPC Design

Post Syndicated from Androski Spicer original https://aws.amazon.com/blogs/architecture/one-to-many-evolving-vpc-design/

Since its inception, the Amazon Virtual Private Cloud (VPC) has acted as the embodiment of security and privacy for customers who are looking to run their applications in a controlled, private, secure, and isolated environment.

This logically isolated space has evolved, and in its evolution has increased the avenues that customers can take to create and manage multi-tenant environments with multiple integration points for access to resources on-premises.

This blog is a two-part series that begins with a look at the Amazon VPC as a single unit of networking in the AWS Cloud but eventually takes you to a world in which simplified architectures for establishing a global network of VPCs are possible.

From One VPC: Single Unit of Networking

To be successful with the AWS Virtual Private Cloud you first have to define success for today and what success might look like as your organization’s adoption of the AWS cloud increases and matures. In essence, your VPCs should be designed to satisfy the needs of your applications today and must be scalable to accommodate future needs.

Classless Inter-Domain Routing (CIDR) notations are used to denote the size of your VPC. AWS allows you to specify a CIDR block between /16 and /28. The largest, /16, provides you with 65,536 IP addresses and the smallest allowed CIDR block, /28, provides you with 16 IP addresses. Note that the first four IP addresses and the last IP address in each subnet CIDR block are not available for you to use, and cannot be assigned to an instance.

AWS VPC supports both IPv4 and IPv6. It is required that you specify an IPv4 CIDR range when creating a VPC. Specifying an IPv6 range is optional.

Customers can specify ANY IPv4 address space for their VPC. This includes but is not limited to RFC 1918 addresses.
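For example, creating a /16 VPC with an Amazon-provided IPv6 block takes a single call; the CIDR below is illustrative:

aws ec2 create-vpc \
    --cidr-block 10.0.0.0/16 \
    --amazon-provided-ipv6-cidr-block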

After creating your VPC, you divide it into subnets. In an AWS VPC, subnets are not isolation boundaries around your application. Rather, they are containers for routing policies.

Isolation is achieved by attaching an AWS Security Group (SG) to the EC2 instances that host your application. SGs are stateful firewalls, meaning that connections are tracked to ensure return traffic is allowed. They control inbound and outbound access to the elastic network interfaces that are attached to an EC2 instance. These should be tightly configured, only allowing access as needed.
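As a sketch of such tight configuration, the following rule admits HTTPS only from a known load balancer security group; both group IDs are hypothetical:

# allow inbound HTTPS only from the load balancer's security group
aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp \
    --port 443 \
    --source-group sg-0fedcba9876543210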

It is our best practice that subnets should be created in categories. There are two main categories: public subnets and private subnets. At a minimum, they should be designed as outlined in the diagrams below for IPv4 and IPv6 subnet design.

Recommended IPv4 subnet design pattern

Recommended IPv6 subnet design pattern

Subnet types are distinguished by whether applications and users on the internet can directly initiate access to infrastructure within a subnet.

Public Subnets

Public subnets are attached to a route table that has a default route to the Internet via an Internet gateway.

Resources in a public subnet can have a public IP or Elastic IP (EIP) that has a NAT to the Elastic Network Interface (ENI) of the virtual machines or containers that host your application(s). This is a one-to-one NAT that is performed by the Internet gateway.

Illustration of public subnet access path to the Internet through the Internet Gateway (IGW)

Private Subnets

A private subnet contains infrastructure that isn’t directly accessible from the Internet. Unlike the public subnet, this infrastructure only has private IPs.

Infrastructure in a private subnet gains access to resources or users on the Internet through a NAT infrastructure of sorts.

AWS natively provides NAT capability through the use of the NAT Gateway service. Customers can also create NAT instances that they manage or leverage third-party NAT appliances from the AWS Marketplace.

In most scenarios, it is recommended to use the AWS NAT Gateway as it is highly available (in a single Availability Zone) and is provided as a managed service by AWS. It supports 5 Gbps of bandwidth per NAT gateway and automatically scales up to 45 Gbps.

An AWS NAT gateway’s high availability is confined to a single Availability Zone. For high availability across AZs, it is recommended to have a minimum of two NAT gateways (in different AZs). This allows you to switch to an available NAT gateway in the event that one should become unavailable.

This approach allows you to zone your Internet traffic, reducing cross Availability Zone connections to the Internet. More details on NAT gateway are available here.
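A per-AZ NAT gateway setup can be sketched as below; repeat for each Availability Zone and point that AZ's private route table at its local gateway (the subnet and route table IDs are placeholders):

# allocate an Elastic IP and create the NAT gateway in a public subnet of this AZ
EIP_ID=$(aws ec2 allocate-address --domain vpc --query 'AllocationId' --output text)
NAT_ID=$(aws ec2 create-nat-gateway --subnet-id subnet-0123456789abcdef0 \
             --allocation-id $EIP_ID --query 'NatGateway.NatGatewayId' --output text)

# send this AZ's private subnet traffic to its local NAT gateway
aws ec2 create-route --route-table-id rtb-0123456789abcdef0 \
    --destination-cidr-block 0.0.0.0/0 --nat-gateway-id $NAT_ID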

Illustration of an environment with a single NAT Gateway (NAT-GW)

Illustration of high availability with a multiple NAT Gateways (NAT-GW) attached to their own route table

Illustration of the failure of one NAT Gateway and the fail over to an available NAT Gateway by the manual changing of the default route next hop in private subnet A route table

AWS-allocated IPv6 addresses are Global Unicast Addresses by default. That said, you can privatize these subnets by using an Egress-Only Internet Gateway (E-IGW) instead of a regular Internet gateway. E-IGWs are purpose-built to prevent users and applications on the Internet from initiating access to infrastructure in your IPv6 subnet(s).

Illustration of internet access for hybrid IPv6 subnets through an Egress-Only Internet Gateway (E-IGW)

Applications hosted on instances living within a private subnet can have different access needs. Some require access to the Internet while others require access to databases, applications, and users that are on-premises. For this type of access, AWS provides two avenues: the Virtual Gateway and the Transit Gateway. The Virtual Gateway can only support a single VPC at a time, while the Transit Gateway is built to simplify the interconnectivity of tens to hundreds of VPCs and then aggregating their connectivity to resources on-premises. Given that we are looking at the VPC as a single unit of networking, all diagrams below contain illustrations of the Virtual Gateway, which acts as a WAN concentrator for your VPC.

Illustration of private subnets connecting to a data center via a Virtual Gateway (VGW)

 

Illustration of private subnets connecting to Data Center using AWS Direct Connect as primary and IPsec as backup

The above diagram illustrates a WAN connection between a VGW attached to a VPC and a customer’s data center.

AWS provides two options for establishing a private connectivity between your VPC and on-premises network: AWS Direct Connect and AWS Site-to-Site VPN.

AWS Site-to-Site VPN configuration leverages IPSec, with each connection providing two redundant IPSec tunnels. AWS supports both static routing and dynamic routing (through the use of BGP).

BGP is recommended, as it allows dynamic route advertisement, high availability through failure detection, and fail over between tunnels in addition to decreased management complexity.

VPC Endpoints: Gateway & Interface Endpoints

Applications running inside your subnet(s) may need to connect to AWS public services (like Amazon S3, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), Amazon API Gateway, etc.) or to applications in another VPC that lives in another account. For example, you may have a database in one account that you would like to expose to applications that live in a completely different account and subnet.

For these scenarios you have the option to leverage an Amazon VPC Endpoint.

There are two types of VPC Endpoints: Gateway Endpoints and Interface Endpoints.

Gateway Endpoints only support Amazon S3 and Amazon DynamoDB. Upon creation, a gateway is added to your specified route table(s) and acts as the destination for all requests to the service it is created for.
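A gateway endpoint for S3 can be sketched with one call; the VPC ID, route table ID, and Region are placeholders:

aws ec2 create-vpc-endpoint \
    --vpc-id vpc-0123456789abcdef0 \
    --service-name com.amazonaws.us-east-1.s3 \
    --route-table-ids rtb-0123456789abcdef0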

Interface Endpoints differ significantly and can only be created for services that are powered by AWS PrivateLink.

Upon creation, AWS creates an interface endpoint consisting of one or more Elastic Network Interfaces (ENIs). Each AZ can support one interface endpoint ENI. This acts as a point of entry for all traffic destined to a specific PrivateLink service.

When an interface endpoint is created, associated DNS entries are created that point to the endpoint and each ENI that the endpoint contains. To access the PrivateLink service you must send your request to one of these hostnames.

Ensure the Private DNS feature is enabled for AWS public and Marketplace services.

Since interface endpoints leverage ENIs, customers can use cloud techniques they are already familiar with. The interface endpoint can be configured with a restrictive security group. These endpoints can also be easily accessed from both inside and outside the VPC. Access from outside a VPC can be accomplished through Direct Connect and VPN.
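As a sketch, an interface endpoint for Amazon SQS with private DNS and a restrictive security group might be created like this (all IDs and the Region are placeholders):

aws ec2 create-vpc-endpoint \
    --vpc-id vpc-0123456789abcdef0 \
    --vpc-endpoint-type Interface \
    --service-name com.amazonaws.us-east-1.sqs \
    --subnet-ids subnet-0123456789abcdef0 \
    --security-group-ids sg-0123456789abcdef0 \
    --private-dns-enabled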

Illustration of a solution that leverages an interface and gateway endpoint

Customers can also create AWS Endpoint services for their applications or services running on-premises. This allows access to these services via an interface endpoint which can be extended to other VPCs (even if the VPCs themselves do not have Direct Connect configured).

VPC Sharing

At re:Invent 2018, AWS launched the feature VPC sharing, which helps customers control VPC sprawl by decoupling the boundary of an AWS account from the underlying VPC network that supports its infrastructure.

VPC sharing uses AWS Resource Access Manager (RAM) to share subnets across accounts within the same AWS organization.

VPC sharing is defined as:

VPC sharing allows customers to centralize the management of network, its IP space and the access paths to resources external to the VPC. This method of centralization and reuse (of VPC components such as NAT Gateway and Direct Connect connections) results in a reduction of cost to manage and maintain this environment.
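Under the hood, sharing a subnet is a RAM resource share. A minimal sketch, with a hypothetical subnet ARN and a target account in the same organization:

aws ram create-resource-share \
    --name shared-network \
    --resource-arns arn:aws:ec2:us-east-1:111122223333:subnet/subnet-0123456789abcdef0 \
    --principals 444455556666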

Great, but there are times when a customer needs to build networks with multiple VPCs in and across AWS regions. How should this be done and what are the best practices?

This will be answered in part two of this blog.

 

 

Learn From Your VPC Flow Logs With Additional Meta-Data

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/learn-from-your-vpc-flow-logs-with-additional-meta-data/

Flow Logs for Amazon Virtual Private Cloud enables you to capture information about the IP traffic going to and from network interfaces in your VPC. Flow Logs data can be published to Amazon CloudWatch Logs or Amazon Simple Storage Service (S3).

Since we launched VPC Flow Logs in 2015, you have been using it for a variety of use cases: troubleshooting connectivity issues across your VPCs, intrusion detection, anomaly detection, or archival for compliance purposes. Until today, VPC Flow Logs provided information that included source IP, source port, destination IP, destination port, action (accept, reject), and status. Once enabled, a VPC Flow Log entry looks like the one below.

While this information was sufficient to understand most flows, it required additional computation and lookup to match IP addresses to instance IDs or to guess the directionality of the flow to come to meaningful conclusions.

Today we are announcing the availability of additional meta-data that you can include in your Flow Logs records to better understand network flows. The enriched Flow Logs allow you to simplify your scripts or remove the need for postprocessing altogether, by reducing the number of computations or lookups required to extract meaningful information from the log data.

When you create a new VPC Flow Log, in addition to existing fields, you can now choose to add the following meta-data:

  • vpc-id : the ID of the VPC containing the source Elastic Network Interface (ENI).
  • subnet-id : the ID of the subnet containing the source ENI.
  • instance-id : the Amazon Elastic Compute Cloud (EC2) instance ID of the instance associated with the source interface. When the ENI is placed by AWS services (for example, AWS PrivateLink, NAT Gateway, Network Load Balancer, etc.) this field will be “-”.
  • tcp-flags : the bitmask for TCP Flags observed within the aggregation period. For example, FIN is 0x01 (1), SYN is 0x02 (2), ACK is 0x10 (16), SYN + ACK is 0x12 (18), etc. (the bits are specified in “Control Bits” section of RFC793 “Transmission Control Protocol Specification”).
    This allows you to understand who initiated or terminated the connection. TCP uses a three way handshake to establish a connection. The connecting machine sends a SYN packet to the destination, the destination replies with a SYN + ACK and, finally, the connecting machine sends an ACK. In the Flow Logs, the handshake is shown as two lines, with tcp-flags values of 2 (SYN), 18 (SYN + ACK).  ACK is reported only when it is accompanied with SYN (otherwise it would be too much noise for you to filter out).
  • type : the type of traffic : IPV4, IPV6 or Elastic Fabric Adapter.
  • pkt-srcaddr : the packet-level IP address of the source. You typically use this field in conjunction with srcaddr to distinguish between the IP address of an intermediate layer through which traffic flows, such as a NAT gateway.
  • pkt-dstaddr : the packet-level destination IP address, similar to the previous one, but for destination IP addresses.

To create a VPC Flow Log, you can use the AWS Management Console, the AWS Command Line Interface (CLI), or the CreateFlowLogs API, and select which additional fields to include and the order in which they appear, for example:

Or using the AWS Command Line Interface (CLI) as below:

$ aws ec2 create-flow-logs --resource-type VPC \
                            --region eu-west-1 \
                            --resource-ids vpc-12345678 \
                            --traffic-type ALL  \
                            --log-destination-type s3 \
                            --log-destination arn:aws:s3:::sst-vpc-demo \
                            --log-format '${version} ${vpc-id} ${subnet-id} ${instance-id} ${interface-id} ${account-id} ${type} ${srcaddr} ${dstaddr} ${srcport} ${dstport} ${pkt-srcaddr} ${pkt-dstaddr} ${protocol} ${bytes} ${packets} ${start} ${end} ${action} ${tcp-flags} ${log-status}'

# be sure to replace the bucket name and VPC ID !

{
    "ClientToken": "1A....HoP=",
    "FlowLogIds": [
        "fl-12345678123456789"
    ],
    "Unsuccessful": [] 
}

Enriched VPC Flow Logs are delivered to S3. We will automatically add the required S3 Bucket Policy to authorize VPC Flow Logs to write to your S3 bucket. VPC Flow Logs does not capture real-time log streams for your network interface; it might take several minutes to begin collecting and publishing data to the chosen destinations. Your logs will eventually be available on S3 at s3://<bucket name>/AWSLogs/<account id>/vpcflowlogs/<region>/<year>/<month>/<day>/

An SSH connection from my laptop with IP address 90.90.0.200 to an EC2 instance would appear like this:

3 vpc-exxxxxx2 subnet-8xxxxf3 i-0bfxxxxxxaf eni-08xxxxxxa5 48xxxxxx93 IPv4 172.31.22.145 90.90.0.200 22 62897 172.31.22.145 90.90.0.200 6 5225 24 1566328660 1566328672 ACCEPT 18 OK
3 vpc-exxxxxx2 subnet-8xxxxf3 i-0bfxxxxxxaf eni-08xxxxxxa5 48xxxxxx93 IPv4 90.90.0.200 172.31.22.145 62897 22 90.90.0.200 172.31.22.145 6 4877 29 1566328660 1566328672 ACCEPT 2 OK

172.31.22.145 is the private IP address of the EC2 instance, the one you see when you type ifconfig on the instance. All flags are “OR”ed during the aggregation period. When a connection is short, both SYN and FIN (3), as well as SYN + ACK and FIN (19), will likely be set for the same lines.
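Because the custom --log-format above fixes the field order, you can filter these records with standard command-line tools. In that format, tcp-flags is the 20th field, so a rough sketch that lists flows initiated by the remote peer (SYN only) from a downloaded log file looks like this (the file name is a placeholder):

# print srcaddr, dstaddr, srcport, dstport for records where tcp-flags == 2 (SYN only)
gunzip -c flowlog.log.gz | awk '$20 == 2 { print $8, $9, $10, $11 }'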

Once a Flow Log is created, you cannot add additional fields or modify the structure of the log; this ensures you will not accidentally break scripts consuming this data. Any modification requires you to delete and recreate the VPC Flow Log. There is no additional cost to capture the extra information in the VPC Flow Logs; normal VPC Flow Log pricing applies. Remember that enriched VPC Flow Log records might consume more storage when you select all fields, so we recommend selecting only the fields relevant to your use cases.

Enriched VPC Flow Logs are available in all regions where VPC Flow Logs is available, and you can start using them today.

— seb

PS: I heard from the team they are working on adding additional meta-data to the logs, stay tuned for updates.