Tag Archives: Amazon VPC

AWS Week in Review – New Open-Source Updates for Snapchange, Cedar, and Jupyter Community Contributions – May 15, 2023

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/aws-week-in-review-new-open-source-updates-for-snapchange-cedar-and-jupyter-community-contributions-may-15-2023/

A new week has begun. Last week, there was a lot of news related to AWS. I have compiled a few announcements you need to know. Let’s get started right away!

Last Week’s Launches
Let’s take a look at some launches from the last week that I want to remind you of:

New Amazon EC2 I4g Instances – Powered by AWS Graviton2 processors, Amazon Elastic Compute Cloud (Amazon EC2) I4g instances improve real-time storage performance up to 2x compared to prior generation storage-optimized instances. Based on AWS Nitro SSDs that are custom-built by AWS and reduce both latency and latency variability, I4g instances are optimized for workloads that perform a high mix of random read/write and require very low I/O latency, such as transactional databases and real-time analytics. To learn more, see Jeff’s post.

Amazon Aurora I/O-Optimized – You can now choose between two storage configurations for Amazon Aurora DB clusters: Aurora Standard or Aurora I/O-Optimized. For applications with low-to-moderate I/Os, Aurora Standard is a cost-effective option.

For applications with high I/Os, Aurora I/O-Optimized provides improved price performance, predictable pricing, and up to 40 percent costs savings. To learn more, see my full blog post.

AWS Management Console Private Access – This is a new security feature that allows you to limit access to the AWS Management Console from your Virtual Private Cloud (VPC) or connected networks to a set of trusted AWS accounts and organizations. It is built on VPC endpoints, which use AWS PrivateLink to establish a private connection between your VPC and the console.

https://docs.aws.amazon.com/images/awsconsolehelpdocs/latest/gsg/images/console-private-access-verify.png

AWS Management Console Private Access is useful when you want to prevent users from signing in to unexpected AWS accounts from within your network. To learn more, see the AWS Management Console getting started guide.

One-Click Security Protection on the Amazon CloudFront Console – You can now secure your web applications and APIs with AWS WAF with a single click on the Amazon CloudFront console. CloudFront handles creating and configuring AWS WAF for you with out-of-the-box protections recommended by AWS and this simple and convenient way to protect applications at the time you create or edit your distribution.

You may continue to select a preconfigured AWS WAF web access control list (ACL) when you prefer to use an existing web ACL. To learn more, see Using AWS WAF to control access to your content in the AWS documentation.

Tracing AWS Lambda SnapStart Functions with AWS X-Ray – You can use AWS X-Ray traces to gain deeper visibility into your function’s performance and execution lifecycle, helping you identify errors and performance bottlenecks for your latency-sensitive Java applications built using SnapStart-enabled functions.

With X-Ray support for SnapStart-enabled functions, you can now see trace data about the restoration of the execution environment and execution of your function code. You can enable X-Ray for Java-based SnapStart-enabled Lambda functions running on Amazon Corretto 11 or 17. To learn more about X-Ray for SnapStart-enabled functions, visit the Lambda Developer Guide or read Marcia’s blog post.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Open Source Updates
Last week, we introduced new open-source projects and significant roadmap contributions to the Jupyter community.

Snapchange – Snapchange is a new open-source project to make fuzzing of a memory snapshot easier using KVM written by Rust. Snapchange enables a target binary to be fuzzed with minimal modifications, providing useful introspection that aids in fuzzing. Snapchange utilizes the features of the Linux kernel’s built-in virtual machine manager known as kernel virtual machine or KVM. To learn more, see the announcement post and GitHub repository.

Cedar – Cedar is a new open-source language for defining permissions as policies, which describes who should have access to what, and evaluating those policies. You can use Cedar to control access to resources such as photos in a photo-sharing app, compute nodes in a microservices cluster, or components in a workflow automation system. Cedar is also authorization-policy language used by the Amazon Verified Permissions, a scalable, fine-grained permissions management and authorization service for custom applications and AWS Verified Access managed services to validate each application request before granting access. To learn more, see the announcement post , Amazon Science blog post and Cedar playground to test sample policies.

Jupyter Community Contributions – We announced new contributions to Jupyter community to democratize generative artificial intelligence (AI) and scale machine learning (ML) workloads. We contributed two Jupyter extensions – Jupyter AI to bring generative AI to Jupyter notebooks and Amazon CodeWhisperer Jupyter extension to generate code suggestions for Python notebooks in JupyterLab. We also contributed three new capabilities to help you scale ML development faster: notebooks scheduling, SageMaker open-source distribution, and Amazon CodeGuru Jupyter extension. To learn more, see the announcement post and Jupyter on AWS.

To learn about weekly updates for open source at AWS, check out the latest AWS open source newsletter by Ricardo.

Upcoming AWS Events
Check your calendars and sign up for these AWS-led events:

AWS Serverless Innovation Day on May 17 – Join us for a free full-day virtual event to learn about AWS Serverless technologies and event-driven architectures from customers, experts, and leaders. Marcia outlined the agenda and main topics of this event in her post. You can register on the event page.

AWS Data Insights Day on May 24 – Join us for another virtual event to discover ways to innovate faster and more cost-effectively with data. Whether your data is stored in operational data stores, data lakes, streaming engines, or within your data warehouse, Amazon Redshift helps you achieve the best performance with the lowest spend. This event focuses on customer voices, deep-dive sessions, and best practices of Amazon Redshift. You can register on the event page.

AWS Silicon Innovation Day on June 21 – Join AWS leaders and experts showcasing AWS innovations in custom-designed EC2 chips built for high performance and scale in the cloud. AWS has designed and developed purpose-built silicon specifically for the cloud. You can understand AWS Silicons and how they can use AWS’s unique EC2 chip offerings to their benefit. You can register on the event page.

AWS re:Inforce 2023 – You can still register for AWS re:Inforce, in Anaheim, California, June 13–14.

AWS Global Summits – Sign up for the AWS Summit closest to your city: Hong Kong (May 23), India (May 25), Amsterdam (June 1), London (June 7), Washington DC (June 7-8), Toronto (June 14), Madrid (June 15), and Milano (June 22).

AWS Community Day – Join community-led conferences driven by AWS user group leaders closest to your city: Chicago (June 15), and Philippines (June 29–30).

You can browse all upcoming AWS-led in-person and virtual events, and developer-focused events such as AWS DevDay.

That’s all for this week. Check back next Monday for another Week in Review!

Channy

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

A walk through AWS Verified Access policies

Post Syndicated from Riggs Goodman III original https://aws.amazon.com/blogs/security/a-walk-through-aws-verified-access-policies/

AWS Verified Access helps improve your organization’s security posture by using security trust providers to grant access to applications. This service grants access to applications only when the user’s identity and the user’s device meet configured security requirements. In this blog post, we will provide an overview of trust providers and policies, then walk through a Verified Access policy for securing your corporate applications.

Understanding trust data and policies

Verified Access policies enable you to use trust data from trust providers and help protect access to corporate applications that are hosted on Amazon Web Services (AWS). When you create a Verified Access group or a Verified Access endpoint, you create a Verified Access policy, which is applied to the group or both the group and endpoint. Policies are written in Cedar, an AWS policy language. With Verified Access, you can express policies that use the trust data from the trust providers that you configure, such as corporate identity providers and device security state providers.

Verified Access receives trust data or claims from different trust providers. Currently, Verified Access supports two types of trust providers. The first type is an identity trust provider. Identity trust providers manage the identities of digital users, including the user’s email address, groups, and profile information. The second type of trust provider is a device trust provider. Device trust providers manage the device posture for users, including the OS version of the device, risk scores, and other metrics that reflect device posture. When a user makes a request to Verified Access, the request includes claims from the configured trust providers. Verified Access customers permit or forbid access to applications by evaluating the claims in Cedar policies. We will walk through the types of claims that are included from trust providers and the options for custom trust data.

End-to-end Cedar policy use cases

Let’s look at how to use policies with your applications. In general, you use Verified Access to control access to an application for purposes of authentication and initial authorization. This means that you use Verified Access to authenticate the user when they log in and to confirm that the device posture of the end device meets minimum criteria. For authorization logic to control access to actions and resources inside the application, you pass the identity claims to the application. The application uses the information to authorize users within the application after authentication. In other words, not every identity claim needs to be passed or checked in Verified Access to allow traffic to pass to the application. You can and should put additional logic in place to make decisions for users when they gain access to the backend application after initial authentication and authorization by Verified Access. From an identity perspective, this additional criteria might be an email address, a group, and possibly some additional claims. From a device perspective, Verified Access does not at this time pass device trust data to the end application. This means that you should use Verified Access to perform checks involving device posture.

We will explore the evolution of a policy by walking you through four use cases for Cedar policy. You can test the claim data and policies in the Verified Access Cedar Playground. For more information about Verified Access, see Verified Access policies and types of trust providers.

Use case 1: Basic policy

For many applications, you only need a simple policy to provide access to your users. This can include the identity information only. For example, let’s say that you want to write a policy that uses the user’s email address and matches a certain group that the user is part of. Within the Verified Access trust provider configuration, you can include “openid email groups” as the scope, and your OpenID Connect (OIDC) provider will include each claim associated with the scopes that you have configured with the OIDC provider. When the user John in this example uses case logs in to the OIDC provider, he receives the following claims from the OIDC provider. For this provider, the Verified Access Trust Provider is configured for “identity” to be the policy reference name.

{
  "identity": {
    "email": "[email protected]",
    "groups": [
      "finance",
      "employees"
    ]
  }
}

With these claims, you can write a policy that matches the email domain and the group, to allow access to the application, as follows.

permit(principal, action, resource)
when {
    // Returns true if the email ends in "@example.com"
    context.identity.email like "*@example.com" &&
    // Returns true if the user is part of the "finance" group
    context.identity.groups.contains("finance")
};

Use case 2: Custom claims within a policy

Many times, you are also interested in company-specific or custom claims from the identity provider. The claims that exist with the user endpoint are dependent on how you configure the identity provider. For OIDC providers, this is determined by the scopes that you define when you set up the identity provider. Verified Access uses OIDC scopes to authorize access to details of the user. This includes attributes such as the name, email address, email verification, and custom attributes. Each scope that you configure for the identity provider returns a set of user attributes, which we call claims. Depending on which claims you want to match on in your policy, you configure the scopes and claims in the OIDC provider, which the OIDC provider adds to the user endpoint. For a list of standard claims, including profile, email, name, and others, see the Standard Claims OIDC specification.

In this example use case, as your policy evolves from the basic policy, you decide to add additional company-specific claims to Verified Access. This includes both the business unit and the level of each employee. Within the Verified Access trust provider configuration, you can include “openid email groups profile” as the scope, and your OIDC provider will include each claim associated with the scopes that you have configured with the OIDC provider. Now, when the user John logs in to the OIDC provider, he receives the following claims from the OIDC provider, with both the business unit and role as claims from the “profile” scope in OIDC.

{
  "identity": {
    "email": "[email protected]",
    "groups": [
      "finance",
      "employees"
    ],
    "business_unit": "corp",
    "level": 8
  }
}

With these claims, the company can write a policy that matches the claims to allow access to the application, as follows.

permit(principal, action, resource)
when {
    // Returns true if the email ends in "@example.com"
    context.identity.email like "*@example.com" &&
    // Returns true if the user is part of the "finance" group
    context.identity.groups.contains("finance") &&
    // Returns true if the business unit is "corp"
    context.identity.business_unit == "corp" &&
    // Returns true if the level is greater than 6
    context.identity.level >= 6
};

Use case 3: Add a device trust provider to a policy

The other type of trust provider is a device trust provider. Verified Access supports two device trust providers today: CrowdStrike and Jamf. As detailed in the AWS Verified Access Request Verification Flow, for HTTP/HTTPS traffic, the extension in the web browser receives device posture information from the device agent on the user’s device. Each device trust provider determines what risk information and device information to include in the claims and how that information is formatted. Depending on the device trust provider, the claims are static or configurable.

In our example use case, with the evolution of the policy, you now add device trust provider checks to the policy. After you install the Verified Access browser extension on John’s computer, Verified Access receives the following claims from both the identity trust provider and the device trust provider, which uses the policy reference name “crwd”.

{
  "identity": {
    "email": "[email protected]",
    "groups": [
      "finance",
      "employees"
    ],
    "business_unit": "corp",
    "level": 8
  },
  "crwd": {
    "assessment": {
      "overall": 90,
      "os": 100,
      "sensor_config": 80,
      "version": "3.4.0"
    }
  }
}

With these claims, you can write a policy that matches the claims to allow access to the application, as follows.

permit(principal, action, resource)
when {
    // Returns true if the email ends in "@example.com"
    context.identity.email like "*@example.com" &&
    // Returns true if the user is part of the "finance" group
    context.identity.groups.contains("finance") &&
    // Returns true if the business unit is "corp"
    context.identity.business_unit == "corp" &&
    // Returns true if the level is greater than 6
    context.identity.level >= 6 &&
    // If the CrowdStrike agent is present
    ( context has "crwd" &&
      // The overall device score is greater or equal to 80 
      context.crwd.assessment.overall >= 80 )
};

For more information about these scores, see Third-party trust providers.

Use case 4: Multiple device trust providers

The final update to your policy comes in the form of multiple device trust providers. Verified Access provides the ability to match on multiple device trust providers in the same policy. This provides flexibility for your company, which in this example use case has different device trust providers installed on different types of users’ devices. For information about many of the claims that each device trust provider provides to AWS, see Third-party trust providers. However, for this updated policy, John’s claims do not change, but the new policy can match on either CrowdStrike’s or Jamf’s trust data. For Jamf, the policy reference name is “jamf”.

permit(principal, action, resource)
when {
    // Returns true if the email ends in "@example.com"
    context.identity.email like "*@example.com" &&
    // Returns true if the user is part of the "finance" group
    context.identity.groups.contains("finance") &&
    // Returns true if the business unit is "corp"
    context.identity.business_unit == "corp" &&
    // Returns true if the level is greater than 6
    context.identity.level >= 6 &&
    // If the CrowdStrike agent is present
    (( context has "crwd" &&
      // The overall device score is greater or equal to 80 
      context.crwd.assessment.overall >= 80 ) ||
    // If the Jamf agent is present
    ( context has "jamf" &&
      // The risk level is either LOW or SECURE
      ["LOW","SECURE"].contains(context.jamf.risk) ))
};

For more information about using Jamf with Verified Access, see Integrating AWS Verified Access with Jamf Device Identity.

Conclusion

In this blog post, we covered an overview of Cedar policy for AWS Verified Access, discussed the types of trust providers available for Verified Access, and walked through different use cases as you evolve your Cedar policy in Verified Access.

If you want to test your own policies and claims, see the Cedar Playground. If you want more information about Verified Access, see the AWS Verified Access documentation.

Want more AWS Security news? Follow us on Twitter.

Riggs Goodman III

Riggs Goodman III

Riggs Goodman III is the Senior Global Tech Lead for the Networking Partner Segment at Amazon Web Services (AWS). Based in Atlanta, Georgia, Riggs has over 17 years of experience designing and architecting networking solutions for both partners and customers.

Bashuman Deb

Bashuman Deb

Bashuman is a Principal Software Development Engineer with Amazon Web Services. He loves to create delightful experiences for customers when they interact with the AWS Network. He loves dabbling with software-defined-networks and virtualized multi-tenant implementations of network-protocols. He is baffled by the complexities of keeping global routing meshes in sync.

Best Practices for managing data residency in AWS Local Zones using landing zone controls

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/best-practices-for-managing-data-residency-in-aws-local-zones-using-landing-zone-controls/

This blog post is written by Abeer Naffa’, Sr. Solutions Architect, Solutions Builder AWS, David Filiatrault, Principal Security Consultant, and Jared Thompson Hybrid Edge SA Specialist.

In this post, we discuss how you can leverage AWS Control Tower landing zone and AWS Organizations custom policies – guardrails – at the root level, known as Service Control Policies (SCPS) to enable certain data residency requirements on AWS Local Zones. Using the suggested controls lets you limit the ability to store data, process data outside a geographic location, and keep your data within specific Local Zone(s).

Data residency is a critical consideration for organizations that collect and store sensitive information, such as Personal Identifiable Information (PII), financial data, and healthcare data. With the rise of cloud computing and the global nature of the Internet, it can be challenging for organizations to make sure that their data is being stored and processed in compliance with local laws and regulations.

One potential solution for addressing data residency challenges with AWS is utilizing Local Zones, which places AWS infrastructure in large metro areas. This enables organizations to store and process data in specific geographic locations. By using Local Zones, organizations can architect their environment to meet data residency requirements when an AWS Region is unavailable within the same legal jurisdiction. Local Zones can be configured to utilize landing zone to further adhere to data residency requirements.

A landing zone is a well-architected, multi-account AWS environment that is scalable and secure. This is a starting point from which your organization can quickly launch and deploy workloads and applications with confidence in your security and infrastructure environment.

When leveraging a Local Zone to meet data residency requirements, you must have control over the in-scope data movement from the Local Zones to the AWS Region. This can be accomplished by implementing landing zone best practices and the suggested guardrails. The main focus of this post is the custom policies that restrict data snapshots, prohibit data creation within the Region, and limit data transfer to the Region. Furthermore, this post covers which prerequisites organizations should consider before implementing these guardrails.

Prerequisites

Landing zones best practices and custom guardrails can help when data must remain in a specific locality where the Local Zone is also located. This can be completed by defining and enforcing policies for data storage and usage within the landing zone organization that you set up. The following prerequisites should be considered before implementing the suggested guardrails:

  1. AWS Local Zones

Local Zones are enabled through the Amazon Elastic Compute Cloud (Amazon EC2) console or the AWS Command Line Interface (AWS CLI).

You can start using available AWS services on the designated Local Zone directly from the console. Moreover, you can manage the Local Zone using the same tools and interfaces that you use in AWS Region.

2. AWS Control Tower landing zone

AWS Control Tower is a managed service that provides a pre-packaged set of best-practice blueprints for setting up and governing multi-account AWS environments. You must have Control Tower fully implemented in your environment before you can deploy custom guardrails.

Control Tower launches a key resource associated with your account, called a landing zone, which serves as a home for your organizations and their accounts.

Note that Control Tower relies on Organizations to create and manage multi-account structures.

  1. Set up the data residency guardrails

Using Organizations, you must make sure that the Local Zone is enabled within a workload account in the landing zones.

Figure 1 Landing Zones Accelerator Local Zones workload on AWS high level Architecture

Figure 1: Landing Zones Accelerator – Local Zones workload on AWS high level Architecture

Utilizing Local Zones for regulated components

The availability of Local Zones provides an excellent opportunity to meet data residency requirements and comply with local regulations that restrict the use of the Region outside of your required geo-political boundary. By leveraging Local Zones, organizations can maintain compliance while utilizing AWS services to support their business needs. AWS owns and manages the infrastructure, including hardware, software, and networking for Local Zones. However, as part of the shared responsibility model, customers are responsible for the security of their applications and data, including access control, data encryption, etc.

You must also comply with all applicable regulations and security standards, such as HIPAA, PCI DSS, and GDPR. You should conduct regular security assessments and implement appropriate security controls to protect their applications and data.

In this post, we consider a scenario where there is a single Local Zones location in a metro. However, you must analyze the specific requirements of your workloads and the relevant regulations that apply to determine the most appropriate high availability configurations for your case.

Local Zones workload data residency guardrails

Organizations provides central governance and management for multiple accounts. Central security administrators use SCPs with Organizations to establish controls to which all AWS Identity and Access Management (IAM) principals (users and roles) adhere.

Now you can use SCPs to set permission guardrails. The suggested preventative controls that leverage the implementation of SCPs for data residency on Local Zones are shown in the next paragraph. SCPs let you set permission guardrails by defining the maximum available permissions for IAM entities in an account and all accounts within the Organization root or Organizational Unit (OU). If an SCP denies an action for an account, then none of the entities in any member account, including the member account’s root user, can take that action, even if their IAM permissions let them. The guardrails set in SCPs apply to all IAM entities in the account, which include all users, roles, and the account root user.

 Upon finalizing these prerequisites, you can create the guardrails for the chosen OU.

Note that although the following guidelines serve as helpful guardrails – SCPs – for data residency, you should consult internally with legal and security teams for specific organizational requirements.

 To exercise better control over the workload in Local Zones and prevent data transfer from Local Zones or data storage outside of the Local Zones, consider implementing the following guardrails:

  1. When your data residency requirements require restricting data transfer/saving to the Region, consider the following guardrails:

a. Deny copying data from Local Zones to the Region for Amazon EC2), Amazon Relational Database Service (Amazon RDS), Amazon ElastiCache, and data sync “DenyCopyToRegion”.

As the available services in Local Zones can vary based on location, you must review the services available in the chosen Local Zone and Adjust the SCPs accordingly.

b. Deny Amazon Simple Storage Service (Amazon S3) put action to the Region “DenyPutObjectToRegionalBuckets”.

If your data residency requirements mandate restrictions on data storage in the Region, then consider implementing this guardrail to prevent the use of Amazon S3 in the Region.

c. If your data residency requirements mandate restrictions on data storage in the Region, then consider implementing “DenyDirectTransferToRegion”

Out of Scope is metadata such as tags or operational data such as AWS Key Management Service (AWS KMS) keys.

{
  "Version": "2012-10-17",
  "Statement": [
      {
      "Sid": "DenyCopyToRegion",
      "Action": [
        "ec2:ModifyImageAttribute",
        "ec2:CopyImage",  
        "ec2:CreateImage",
        "ec2:CreateInstanceExportTask",
        "ec2:ExportImage",
        "ec2:ImportImage",
        "ec2:ImportInstance",
        "ec2:ImportSnapshot",
        "ec2:ImportVolume",
        "rds:CreateDBSnapshot",
        "rds:CreateDBClusterSnapshot",
        "rds:ModifyDBSnapshotAttribute",
        "elasticache:CreateSnapshot",
        "elasticache:CopySnapshot",
        "datasync:Create*",
        "datasync:Update*"
      ],
      "Resource": "*",
      "Effect": "Deny"
    },
    {
      "Sid": "DenyDirectTransferToRegion",
      "Action": [
        "dynamodb:PutItem",
        "dynamodb:CreateTable",
        "ec2:CreateTrafficMirrorTarget",
        "ec2:CreateTrafficMirrorSession",
        "rds:CreateGlobalCluster",
        "es:Create*",
        "elasticfilesystem:C*",
        "elasticfilesystem:Put*",
        "storagegateway:Create*",
        "neptune-db:connect",
        "glue:CreateDevEndpoint",
        "glue:UpdateDevEndpoint",
        "datapipeline:CreatePipeline",
        "datapipeline:PutPipelineDefinition",
        "sagemaker:CreateAutoMLJob",
        "sagemaker:CreateData*",
        "sagemaker:CreateCode*",
        "sagemaker:CreateEndpoint",
        "sagemaker:CreateDomain",
        "sagemaker:CreateEdgePackagingJob",
        "sagemaker:CreateNotebookInstance",
        "sagemaker:CreateProcessingJob",
        "sagemaker:CreateModel*",
        "sagemaker:CreateTra*",
        "sagemaker:Update*",
        "redshift:CreateCluster*",
        "ses:Send*",
        "ses:Create*",
        "sqs:Create*",
        "sqs:Send*",
        "mq:Create*",
        "cloudfront:Create*",
        "cloudfront:Update*",
        "ecr:Put*",
        "ecr:Create*",
        "ecr:Upload*",
        "ram:AcceptResourceShareInvitation"
      ],
      "Resource": "*",
      "Effect": "Deny"
    },
    {
      "Sid": "DenyPutObjectToRegionalBuckets",
      "Action": [
        "s3:PutObject"
      ],
      "Resource": ["arn:aws:s3:::*"],
      "Effect": "Deny"
    }
  ]
}
  1. If your data residency requirements require limitations on data storage in the Region, then consider implementing this guardrail “DenyAllSnapshots” to restrict the use of snapshots in the Region.

Note that the following guardrail restricts the creation of snapshots on AWS Outposts as well. If you’re using Outposts in the same AWS account, then you may need to customize this guardrail to make sure that it aligns with your organization’s specific needs and requirements. For more information on data residency considerations for Outposts, please refer to Architecting for data residency with AWS Outposts rack and landing zone guardrails.

{
  "Version": "2012-10-17",
  "Statement": [

    {
      "Sid": "DenyAllSnapshots",
      "Effect":"Deny",
      "Action":[
        "ec2:CreateSnapshot",
        "ec2:CreateSnapshots",
        "ec2:CopySnapshot",
        "ec2:ModifySnapshotAttribute"
      ],
      "Resource":"arn:aws:ec2:*::snapshot/*"
    }
  ]
}
  1. This guardrail helps prevent the launch of EC2 instances or the creation of network interfaces by subnet as opposed to Local Zones You should keep data residency workloads within the Local Zones rather than the Region to make sure of better control over regulated workloads. This approach can help your organization achieve better control over data residency workloads and improve governance over your Organization.

 Make sure to update the Local Zones subnets < localzones_subnet_arns>.

{
"Version": "2012-10-17",
  "Statement":[{
    "Sid": "DenyNotLocalZonesSubnet",
    "Effect":"Deny",
    "Action": [
      "ec2:RunInstances",
      "ec2:CreateNetworkInterface"
    ],
    "Resource": [
      "arn:aws:ec2:*:*:network-interface/*"
    ],
    "Condition": {
      "ForAllValues:ArnNotEquals": {
        "ec2:Subnet": ["<localzones_subnet_arns>"]
      }
    }
  }]
}

Additional considerations

When implementing data residency guardrails on Local Zones, consider backup and disaster recovery strategies to make sure that your data is protected in the event of an outage or other unexpected events. This may include creating regular backups of your data, implementing disaster recovery plans and procedures, and using redundancy and failover systems to minimize the impact of any potential disruptions. Additionally, you should make sure that your backup and disaster recovery systems are compliant with any relevant data residency regulations and requirements. You should also test your backup and disaster recovery systems regularly to make sure that they are functioning as intended.

Additionally, the provided SCPs for Local Zones in the above example do not block the “logs:PutLogEvents”. Therefore, even if you implemented data residency guardrails on Local Zones, the application may log data to Amazon CloudWatch Logs in the Region.

Highlights

By default, application-level logs on Local Zones are not automatically sent to CloudWatch Logs in the Region. You can configure CloudWatch Logs agent on Local Zones to collect and send your application-level logs to CloudWatch Logs.

logs:PutLogEvents does transmit data to the Region, but it is not blocked by the provided SCPs, as it’s expected that most use cases still want to be able to use this logging API. However, if blocking is desired, then add the action to the first recommended guardrail. If you want specific roles to be allowed, then combine with the ArnNotLike condition example referenced in the Customization Guide.

Conclusion

The combined use of Local Zones and the suggested guardrails via Organizations policies enables you to exercise better control over the movement of the data. By creating a landing zone for your organization, you can apply SCPs to your Local Zones that will help make sure that your data remains within a specific geographic location, as required by the data residency regulations.

Note that, although custom guardrails can help you manage data residency on Local Zones, it’s critical to thoroughly review your policies, procedures, and configurations. This lets you make sure that they are compliant with all relevant data residency regulations and requirements. Regularly testing and monitoring your systems can help make sure that your data is protected and your organization stays compliant.

References

Simplify Service-to-Service Connectivity, Security, and Monitoring with Amazon VPC Lattice – Now Generally Available

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/simplify-service-to-service-connectivity-security-and-monitoring-with-amazon-vpc-lattice-now-generally-available/

At AWS re:Invent 2022, we introduced in preview Amazon VPC Lattice, a new capability of Amazon Virtual Private Cloud (Amazon VPC) that gives you a consistent way to connect, secure, and monitor communication between your services. With VPC Lattice, you can define policies for network access, traffic management, and monitoring to connect compute services across instances, containers, and serverless applications.

Today, I am happy to share that VPC Lattice is now generally available. Compared to the preview, you have access to new capabilities:

  • Services can use a custom domain name in addition to the domain name automatically generated by VPC Lattice. When using HTTPS, you can configure an SSL/TLS certificate that matches the custom domain name.
  • You can deploy the open-source AWS Gateway API Controller to use VPC Lattice with a Kubernetes-native experience. It uses the Kubernetes Gateway API to let you connect services across multiple Kubernetes clusters and services running on EC2 instances, containers, and serverless functions.
  • You can use an Application Load Balancer (ALB) or a Network Load Balancer (NLB) as a target for a service.
  • The IP address target type now supports IPv6 connectivity.

Let’s see some of these new features in practice.

Using Amazon VPC Lattice for Service-to-Service Connectivity
In my previous post introducing VPC Lattice, I show how to create a service network, associate multiple VPCs and services, and configure target groups for EC2 instances and Lambda functions. There, I also show how to route traffic based on request characteristics and how to use weighted routing. Weighted routing is really handy for blue/green and canary-style deployments or for migrating from one compute platform to another.

Now, let’s see how to use VPC Lattice to allow the services of an e-commerce application to communicate with each other. For simplicity, I only consider four services:

  • The Order service, running as a Lambda function.
  • The Inventory service, deployed as an Amazon Elastic Container Service (Amazon ECS) service in a dual-stack VPC supporting IPv6.
  • The Delivery service, deployed as an ECS service using an ALB to distribute traffic to the service tasks.
  • The Payment service, running on an EC2 instance.

First, I create a service network. The Order service needs to call the Inventory service (to check if an item is available for purchase), the Delivery service (to organize the delivery of the item), and the Payment service (to transfer the funds). The following diagram shows the service-to-service communication from the perspective of the service network.

Diagram describing the service network view of the e-commerce services.

These services run in different AWS accounts and multiple VPCs. VPC Lattice handles the complexity of setting up connectivity across VPC boundaries and permission across accounts so that service-to-service communication is as simple as an HTTP/HTTPS call.

The following diagram shows how the communication flows from an implementation point of view.

Diagram describing the implementation view of the e-commerce services.

The Order service runs in a Lambda function connected to a VPC. Because all the VPCs in the diagram are associated with the service network, the Order service is able to call the other services (Inventory, Delivery, and Payment) even if they are deployed in different AWS accounts and in VPCs with overlapping IP addresses.

Using a Network Load Balancer (NLB) as Target
The Inventory service runs in a dual-stack VPC. It’s deployed as an ECS service with an NLB to distribute traffic to the tasks in the service. To get the IPv6 addresses of the NLB, I look for the network interfaces used by the NLB in the EC2 console.

Console screenshot.

When creating the target group for the Inventory service, under Basic configuration, I choose IP addresses as the target type. Then, I select IPv6 for the IP Address type.

Console screenshot.

In the next step, I enter the IPv6 addresses of the NLB as targets. After the target group is created, the health checks test the targets to see if they are responding as expected.

Console screenshot.

Using an Application Load Balancer (ALB) as Target
Using an ALB as a target is even easier. When creating a target group for the Delivery service, under Basic configuration, I choose the new Application Load Balancer target type.

Console screenshot.

I select the VPC in which to look for the ALB and choose the Protocol version.

Console screenshot.

In the next step, I choose Register now and select the ALB from the dropdown. I use the default port used by the target group. VPC Lattice does not provide additional health checks for ALBs. However, load balancers already have their own health checks configured.

Console screenshot.

Using Custom Domain Names for Services
To call these services, I use custom domain names. For example, when I create the Payment service in the VPC console, I choose to Specify a custom domain configuration, enter a Custom domain name, and select an SSL/TLS certificate for the HTTPS listener. The Custom SSL/TLS certificate dropdown shows available certificates from AWS Certificate Manager (ACM).

Console screenshot.

Securing Service-to-Service Communications
Now that the target groups have been created, let’s see how I can secure the way services communicate with each other. To implement zero-trust authentication and authorization, I use AWS Identity and Access Management (IAM). When creating a service, I select the AWS IAM as Auth type.

I select the Allow only authenticated access policy template so that requests to services need to be signed using Signature Version 4, the same signing protocol used by AWS APIs. In this way, requests between services are authenticated by their IAM credentials, and I don’t have to manage secrets to secure their communications.

Console screenshot.

Optionally, I can be more precise and use an auth policy that only gives access to some services or specific URL paths of a service. For example, I can apply the following auth policy to the Order service to give to the Lambda function these permissions:

  • Read-only access (GET method) to the Inventory service /stock URL path.
  • Full access (any HTTP method) to the Delivery service /delivery URL path.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "<Order Service Lambda Function IAM Role ARN>"
            },
            "Action": "vpc-lattice-svcs:Invoke",
            "Resource": "<Inventory Service ARN>/stock",
            "Condition": {
                "StringEquals": {
                    "vpc-lattice-svcs:RequestMethod": "GET"
                }
            }
        },
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "<Order Service Lambda Function IAM Role ARN>"
            },
            "Action": "vpc-lattice-svcs:Invoke",
            "Resource": "<Delivery Service ARN>/delivery"
        }
    ]
}

Using VPC Lattice, I quickly configured the communication between the services of my e-commerce application, including security and monitoring. Now, I can focus on the business logic instead of managing how services communicate with each other.

Availability and Pricing
Amazon VPC Lattice is available today in the following AWS Regions: US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), and Europe (Ireland).

With VPC Lattice, you pay for the time a service is provisioned, the amount of data transferred through each service, and the number of requests. There is no charge for the first 300,000 requests every hour, and you only pay for requests above this threshold. For more information, see VPC Lattice pricing.

We designed VPC Lattice to allow incremental opt-in over time. Each team in your organization can choose if and when to use VPC Lattice. Other applications can connect to VPC Lattice services using standard protocols such as HTTP and HTTPS. By using VPC Lattice, you can focus on your application logic and improve productivity and deployment flexibility with consistent support for instances, containers, and serverless computing.

Simplify the way you connect, secure, and monitor your services with VPC Lattice.

Danilo

Week in Review – February 13, 2023

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/week-in-review-february-13-2023/

AWS announced 32 capabilities since we published the last Week in Review blog post a week ago. I also read a couple of other news and blog posts.

Here is my summary.

The VPC section of the AWS Management Console now allows you to visualize your VPC resources, such as the relationships between a VPC and its subnets, routing tables, and gateways. This visualization was available at VPC creation time only, and now you can go back to it using the Resource Map tab in the console. You can read the details in Channy’s blog post.

CloudTrail Lake now gives you the ability to ingest activity events from non-AWS sources. This lets you immutably store and then process activity events without regard to their origin–AWS, on-premises servers, and so forth. All of this power is available to you with a single API call: PutAuditEvents. We launched AWS CloudTrail Lake about a year ago. It is a managed organization-scale data lake that aggregates, immutably stores, and allows querying of events recorded by CloudTrail. You can use it for auditing, security investigation, and troubleshooting. Again, my colleague Channy wrote a post with the details.

There are three new Amazon CloudWatch metrics for asynchronous AWS Lambda function invocations: AsyncEventsReceived, AsyncEventAge, and AsyncEventsDropped. These metrics provide visibility for asynchronous Lambda function invocations. They help you to identify the root cause of processing issues such as throttling, concurrency limit, function errors, processing latency because of retries, or missing events. You can learn more and have access to a sample application in this blog post.

Amazon Simple Notification Service (Amazon SNS) now supports AWS X-Ray to visualize, analyze, and debug applications. Developers can now trace messages going through Amazon SNS, making it easier to understand or debug microservices or serverless applications.

Amazon EC2 Mac instances now support replacing root volumes for quick instance restoration. Stopping and starting EC2 Mac instances trigger a scrubbing workflow that can take up to one hour to complete. Now you can swap the root volume of the instance with an EBS snapshot or an AMI. It helps to reset your instance to a previous known state in 10–15 minutes only. This significantly speeds up your CI and CD pipelines.

Amazon Polly launches two new Japanese NTTS voices. Neural Text To Speech (NTTS) produces the most natural and human-like text-to-speech voices possible. You can try these voices in the Polly section of the AWS Management Console. With this addition, according to my count, you can now choose among 52 NTTS voices in 28 languages or language variants (French from France or from Quebec, for example).

The AWS SDK for Java now includes the AWS CRT HTTP Client. The HTTP client is the center-piece powering our SDKs. Every single AWS API call triggers a network call to our API endpoints. It is therefore important to use a low-footprint and low-latency HTTP client library in our SDKs. AWS created a common HTTP client for all SDKs using the C programming language. We also offer 11 wrappers for 11 programming languages, from C++ to Swift. When you develop in Java, you now have the option to use this common HTTP client. It provides up to 76 percent cold start time reduction on AWS Lambda functions and up to 14 percent less memory usage compared to the Netty-based HTTP client provided by default. My colleague Zoe has more details in her blog post.

X in Y Jeff started this section a while ago to list the expansion of new services and capabilities to additional Regions. I noticed 10 Regional expansions this week:

Other AWS News
This week, I also noticed these AWS news items:

My colleague Mai-Lan shared some impressive customer stories and metrics related to the use and scale of Amazon S3 Glacier. Check it out to learn how to put your cold data to work.

Space is the final (edge) frontier. I read this blog post published on avionweek.com. It explains how AWS helps to deploy AIML models on observation satellites to analyze image quality before sending them to earth, saving up to 40 percent satellite bandwidth. Interestingly, the main cause for unusable satellite images is…clouds.

Upcoming AWS Events
Check your calendars and sign up for these AWS events:

AWS re:Invent recaps in your area. During the re:Invent week, we had lots of new announcements, and in the next weeks, you can find in your area a recap of all these launches. All the events are posted on this site, so check it regularly to find an event nearby.

AWS re:Invent keynotes, leadership sessions, and breakout sessions are available on demand. I recommend that you check the playlists and find the talks about your favorite topics in one collection.

AWS Summits season will restart in Q2 2023. The dates and locations will be announced here. Paris and Sidney are kicking off the season on April 4th. You can register today to attend these in-person, free events (Paris, Sidney).

Stay Informed
That was my selection for this week! To better keep up with all of this news, do not forget to check out the following resources:

— seb
This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

How to choose between CoIP and Direct VPC routing modes on AWS Outposts rack

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/how-to-choose-between-coip-and-direct-vpc-routing-modes-on-aws-outposts-rack/

This blog post is written by Sumit Menaria, Senior Hybrid Solutions Architect AWS WWSO Core Services.

AWS Outposts Rack is a fully-managed service that extends AWS infrastructure, services, APIs, and tools to customer premises. By providing local access to AWS managed infrastructure and services, Outposts rack enables customers to build and run applications on premises using the same programming interfaces as in AWS Regions, while using local compute and storage resources for low latency, local data processing, and data residency needs.

There are various data sources on premises that you might want to connect from your Outpost. These sources can include field devices, on-premises databases, mainframes, storage arrays, or end users. Each Outpost supports a single Local Gateway (LGW) construct, which enables connectivity from your Outpost subnets to an on-premises network. Note that this post is specific to Outposts racks and a different method of local communication is used for AWS Outposts servers.

Two different options for facilitating communication between your Outpost based resources and on-premises network: Direct VPC routing and customer-owned IP pool. Both of these are mutually exclusive options, and routing works differently based on your choice of the mode. The two modes are the attributes of the LGW route table that your Outpost subnets’ VPC is associated with, which specifies the communication mode for the Outpost subnets.

Direct VPC routing mode

Direct VPC routing uses the private IP address of the instances in the VPC CIDR block to facilitate communication with your on-premises network. These addresses are advertised to your on-premises network with Border Gateway Protocol (BGP). Advertisement via BGP is only for the private IP addresses that belong to the subnets on your Outpost and have a route pointing to the LGW via the subnet’s route table. This type of routing is the default mode for Outposts Rack. In this mode, the LGW doesn’t perform Network Address Translation (NAT) for instances. Furthermore, you don’t have to assign an Elastic IP address to your Amazon Elastic Compute Cloud (Amazon EC2) instance from a (CoIP) to enable communication with your on-premises resources.

A diagram showing how an EC2 instance on an Outpost communicates with on-premises network using direct VPC routing mode

In this diagram, when the instance Y wants to communicate with an on-premises server, it traverses the LGW and can talk to the on-premises server using its source address (10.0.1.11) in the Subnet CIDR range (10.0.1.0/24) that is advertised over BGP from the LGW to the Customer Network Device. Similarly when the on-premises server wants to initiate communication with the Outpost based EC2 instance, it uses the instance’s private IP address (10.0.1.11) as the destination IP address to set up the connection.

CoIP mode

Utilizing CoIP mode means that you must provide a separate IP address range from your on-premises IP space for AWS to create an address pool, known as a CoIP. With CoIP, when an Outpost based resources, such as EC2 instances, Application Load Balancer (ALB), or Amazon Relational Database Service (Amazon RDS) instances, need to communicate to your on-premises network, the Local Gateway will perform 1:1 NAT from the resource’s private IP address from the Outpost subnet range to an IP address from the CoIP pool. The subnet-to-CoIP address mapping is done by assigning an Elastic IP (EIP) from the CoIP address range allocated for resources such as EC2 instances. To enable the communication with the CoIP pool from the on-premises network, and then the LGW advertises the CoIP pool through BGP over its peering with the Customer Network Device.

A diagram showing how an EC2 instance on an Outpost communicates with on-premises network using CoIP mode

In this diagram, when the instance Y wants to talk to an on-premises server, the traffic traverses the LGW and the source IP address (10.0.1.11) of the instance gets translated to an IP address (192.168.0.11) in the CoIP range that is associated with the instance. Similarly, when the on-premises server initiates the communication, the request will be sent with the CoIP address (192.168.0.11) of the instance as the destination IP address. This will be changed to the instance’s private IP address (10.0.1.11) via NAT at the LGW. The CoIP pool (192.168.0.0/26) is advertised via BGP to the Customer Network Device to provide the route to the on-premises environment for reaching the Outpost based resources.

When to choose CoIP routing mode

CoIP is particularly useful when you want to isolate your Outpost based workloads from the on-premises infrastructure and only need specific resources on the Outpost to be able to communicate to the on-premises infrastructure. This is useful in situations where large enterprise networks have hundreds of IP pools allocated and there is a high chance of overlap between IP addresses allocated to Outpost based VPCs/Subnets and those allocated to on-premises infrastructure. Furthermore, CoIP can act as another layer of security, as you may choose to allocate the CoIP addresses to only the resources which must communicate with the on-premises network. Then you can allocate for the rest of them using the subnet private IP address range for communication within the Outpost or Region based resources.

This means that you don’t need to have the number of IPs in the CoIP pool be equal to the number of resources on your Outpost. For example, you may choose to configure a /26 CoIP range and a /22 pool for subnets to meet your workload requirements.

CoIP mode can also be useful when using an external ALB on Outpost and you want to make it routable through the local internet connectivity. By using a smaller internet routable CoIP address range assignment for your external ALB, you can route traffic to the ALB on the Outpost without needing to traverse through the internet gateway (IGW) in the parent Region.

When to choose Direct VPC routing mode

You can choose Direct VPC routing if you don’t want the operational overhead of managing the additional IP pools for NAT between your Outpost based resources and on-premises network. There are also few applications which may not work well if there is an NAT of IPs between the two endpoints communicating with each other. Some examples you may see are Active Directory communication with on-premises based servers, or iSCSI mount of your instances as an additional storage to on-premises Storage Area Network (SAN). These applications may not work or may need additional tuning if they encounter NATed IP addresses between an Outpost based client on an EC2 instance and on-premises based server for a two-way communication.

When Direct VPC routing mode is used, multiple VPCs can be associated to an Outpost LGW route table, and the Outpost subnets with the LGW as the route target, are automatically advertised to the on-premises network through BGP. Therefore, you must make sure that appropriate IP planning is in place to avoid any overlap of the Outpost VPC/Subnet IP range with the on-premises IP range, as they are directly advertised from LGW toward the Customer Network Device. Having overlapping IP subnets in your network can lead to undesired effects on your application connectivity and you must pay special attention when allocating IP pools for your on-premises and Outpost based VPC address space. You can use Amazon VPC IP Address Manager (IPAM) to plan the IP space of your VPCs and CoIP pools, as well as add on-premises based IP Pools using manual allocation.

Conclusion

You can select either Direct VPC routing or CoIP mode for routing through an Outpost Local Gateway. Since this selection affects the routing for all of the subnets on your Outpost associated with the LGW route table, it should be selected based on your workload requirements and existing IP infrastructure planning. You can also change the LGW route table mode at a later stage. However, that involves network disruption and the creation of a new LGW route table. To learn more about Outposts Racks routing, visit the LGW Route table documentation.

New – Visualize Your VPC Resources from Amazon VPC Creation Experience

Post Syndicated from Channy Yun original https://aws.amazon.com/blogs/aws/new-visualize-your-vpc-resources-from-amazon-vpc-creation-experience/

Today we are announcing Amazon Virtual Private Cloud (Amazon VPC) resource map, a new feature that simplifies the VPC creation experience in the AWS Management Console. This feature displays your existing VPC resources and their routing visually on a single page, allowing you to quickly understand the architectural layout of the VPC.

A year ago, in March 2022, we launched a new VPC creation experience that streamlines the process of creating and connecting VPC resources. With just one click, even across multiple Availability Zones (AZs), you can create and connect VPC resources, eliminating more than 90 percent of the manual steps required in the past. The new creation experience is centered around an interactive diagram that displays a preview of the VPC architecture and updates as options are selected, providing a visual representation of the resources and their relationships within the VPC that you are about to create.

However, after the creation of the VPC, the diagram that was available during the creation experience that many of our customers loved was no longer available. Today we are changing that! With VPC resource map, you can quickly understand the architectural layout of the VPC, including the number of subnets, which subnets are associated with the public route table, and which route tables have routes to the NAT Gateway.

You can also get to the specific resource details by clicking on the resource. This eliminates the need for you to map out resource relationships mentally and hold the information in your head while working with your VPC, making the process much more efficient and less prone to mistakes.

Getting Started with VPC Resource Map
To get started, choose an existing VPC in the VPC console. In the details section, select the Resource map tab. Here, you can see the resources in your VPC and the relationships between those resources.

As you hover over a resource, you can see the related resources and the connected lines highlighted. If you click to select the resource, you can see a few lines of details and a link to see the details of the selected resource.

Getting Started with VPC Creation Experience
I want to explain how to use the VPC creation experience to improve your workflow to create a new VPC to make a high-availability three-tier VPC easily.

Choose Create VPC and select VPC and more in the VPC console. You can preview the VPC resources that you are about to create all on the same page.

In Name tag auto-generation, you can specify a prefix value for Name tags. This value is used to generate Name tags for all VPC resources in the preview. If I change the default value, which is project to channy, the Name tag in the preview changes to channy- something, such as channy-vpc. You can customize a Name tag per resource in the preview by clicking each resource and making changes.

You can easily change the default CIDR value (10.0.0.0/16) when you click the IPv4 CIDR block field to reveal the CIDR joystick. Use the left or right arrow to move to the previous (9.255.0.0/16) or next (10.0.1.0/16) CIDR block within the /16 network mask. You can also change the subnet mask to /17 by using the down arrow, or go back to /16 using the up arrow.

Choose the number of Availability Zones (AZs) up to 3. The number of public and private subnet types changes based on the number of AZs and shows the total number of each subnet type it will create.

I want a high-availability VPC in three AZs and select 6 for the number of private subnets. In the preview panel, you can see that there are 9 subnets. When I hover over channy-rtb-public, I can visually confirm that this route table is connected to three public subnets and also routed to the internet gateway (channy-igw). The dotted lines indicate routes to network node, and the solid lines indicate relationships such as implicit or explicit associations.

Adding NAT gateways and VPC endpoints is easy. You can simply change the number of NAT gateways in or per Availability Zone (AZ). Note that there is a charge for each NAT gateway. We always recommend having one NAT gateway per AZ and route traffic from subnets in an AZ to the NAT gateway in the same AZ for high availability and to avoid inter-AZ data charges.

To route traffic to Amazon Simple Storage Service (Amazon S3) buckets more securely, you can choose the S3 Gateway endpoint by default. The S3 Gateway endpoint is free of charge and does not use NAT gateways when moving data from private subnets.

You can create additional tags and assign them to all resources in the VPC in no time. I select Add new tag and enter environment for the Key and test for the Value. This key-value pair will be added to every resource here.

Choose Create VPC at the bottom of the page and see the resources and the IDs of those resources that are being created. Before creating, please validate resources from the preview.

Once all the resources are created, choose View VPC at the bottom. The button takes you directly to the VPC resource map, where you can see a visual representation of what you created.

Now Available
Amazon VPC resource map is now available in all AWS Regions where Amazon VPC is available, and you can start using it today.

The VPC resource map and creation experience now only displays VPC, subnets, route tables, internet gateway, NAT gateways, and Amazon S3 gateway. The Amazon VPC console teams and user experience teams will continue to improve the console experience using customer feedback.

To learn more, see the Amazon VPC User Guide, and please send feedback to AWS re:Post for Amazon VPC or through your usual AWS support contacts.

Channy

Automating your workload deployments in AWS Local Zones

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/automating-your-workload-deployments-in-aws-local-zones/

This blog post is written by Enrico Liguori, SA – Solutions Builder , WWPS Solution Architecture.

AWS Local Zones are a type of infrastructure deployment that places compute, storage,and other select AWS services close to large population and industry centers.

We now have a total of 32 Local Zones; 15 outside of the US (Bangkok, Buenos Aires, Copenhagen, Delhi, Hamburg, Helsinki, Kolkata, Lagos, Lima, Muscat, Perth, Querétaro, Santiago, Taipei, and Warsaw) and 17 in the US. We will continue to launch Local Zones in 21 metro areas in 18 countries, including Australia, Austria, Belgium, Brazil, Canada, Colombia, Czech Republic, Germany, Greece, India, Kenya, Netherlands, New Zealand, Norway, Philippines, Portugal, South Africa, and Vietnam.

Customers using AWS Local Zones can provision the infrastructure and services needed to host their workloads with the same APIs and tools for automation that they use in the AWS Region, included the AWS Cloud Development Kit (AWS CDK).

The AWS CDK is an open source software development framework to model and provision your cloud application resources using familiar programming languages, including TypeScript, JavaScript, Python, C#, and Java. For the solution in this post, we use Python.

Overview

In this post we demonstrate how to:

  1. Programmatically enable the Local Zone of your interest.
  2. Explore the supported APIs to check the types of Amazon Elastic Compute Cloud (Amazon EC2) instances available in a specific Local Zone and get their associated price per hour;
  3. Deploy a simple WordPress application in the Local Zone through AWS CDK.

Prerequisites

To be able to try the examples provided in this post, you must configure:

  1. AWS Command Line Interface (AWS CLI)
  2. Python version 3.8 or above
  3. AWS CDK

Enabling a Local Zone programmatically

To get started with Local Zones, you must first enable the Local Zone that you plan to use in your AWS account. In this tutorial, you can learn how to select the Local Zone that provides the lowest latency to your site and understand how to opt into the Local Zone from the AWS Management Console.

If you prefer to interact with AWS APIs programmatically, then you can enable the Local Zone of your interest by calling the ModifyAvailabilityZoneGroup API through the AWS CLI or one of the supported AWS SDKs.

The following examples show how to opt into the Atlanta Local Zone through the AWS CLI and through the Python SDK:

AWS CLI:

aws ec2 modify-availability-zone-group \
  --region us-east-1 \
  --group-name us-east-1-atl-1 \
  --opt-in-status opted-in

Python SDK:

ec2 = boto3.client('ec2', config=Config(region_name='us-east-1'))
response = ec2.modify_availability_zone_group(
                  GroupName='us-east-1-atl-1',
                  OptInStatus='opted-in'
           )

The opt in process takes approximately five minutes to complete. After this time, you can confirm the opt in status using the DescribeAvailabilityZones API.

From the AWS CLI, you can check the enabled Local Zones with:

aws ec2 describe-availability-zones --region us-east-1

Or, once again, we can use one of the supported SDKs. Here is an example using Phyton:

ec2 = boto3.client('ec2', config=Config(region_name='us-east-1'))
response = ec2.describe_availability_zones()

In both cases, a JSON object similar to the following, will be returned:

{
"State": "available",
"OptInStatus": "opted-in",
"Messages": [],
"RegionName": "us-east-1",
"ZoneName": "us-east-1-atl-1a",
"ZoneId": "use1-atl1-az1",
"GroupName": "us-east-1-atl-1",
"NetworkBorderGroup": "us-east-1-atl-1",
"ZoneType": "local-zone",
"ParentZoneName": "us-east-1d",
"ParentZoneId": "use1-az4"
}

The OptInStatus confirms that we successful enabled the Atlanta Local Zone and that we can now deploy resources in it.

How to check available EC2 instances in Local Zones

The set of instance types available in a Local Zone might change from one Local Zone to another. This means that before starting deploying resources, it’s a good practice to check which instance types are supported in the Local Zone.

After enabling the Local Zone, we can programmatically check the instance types that are available by using DescribeInstanceTypeOfferings. To use the API with Local Zones, we must pass availability-zone as the value of the LocationType parameter and use a Filter object to select the correct Local Zone that we want to check. The resulting AWS CLI command will look like the following example:

aws ec2 describe-instance-type-offerings --location-type "availability-zone" --filters 
Name=location,Values=us-east-1-atl-1a --region us-east-1

Using Python SDK:

ec2 = boto3.client('ec2', config=Config(region_name='us-east-1'))
response = ec2.describe_instance_type_offerings(
      LocationType='availability-zone',
      Filters=[
            {
            'Name': 'location',
            'Values': ['us-east-1-atl-1a']
            }
            ]
      )

How to check prices of EC2 instances in Local Zones

EC2 instances and other AWS resources in Local Zones will have different prices than in the parent Region. Check the pricing page for the complete list of pricing options and associated price-per-hour.

To access the pricing list programmatically, we can use the GetProducts API. The API returns the list of pricing options available for the AWS service specified in the ServiceCode parameter. We also recommend defining Filters to restrict the number of results returned. For example, to retrieve the On-Demand pricing list of a T3 Medium instance in Atlanta from the AWS CLI, we can use the following:

aws pricing get-products --format-version aws_v1 --service-code AmazonEC2 --region us-east-1 \
--filters 'Type=TERM_MATCH,Field=instanceType,Value=t3.medium' \
--filters 'Type=TERM_MATCH,Field=location,Value=US East (Atlanta)'

Similarly, with Python SDK we can use the following:

pricing = boto3.client('pricing',config=Config(region_name="us-east-1")) response = pricing.get_products(
         ServiceCode='AmazonEC2',
         Filters= [
          {
          "Type": "TERM_MATCH",
          "Field": "instanceType",
          "Value": "t3.medium"
          },
          {
          "Type": "TERM_MATCH",
          "Field": "regionCode",
          "Value": "us-east-1-atl-1"
          }
        ],
         FormatVersion='aws_v1',
)

Note that the Region specified in the CLI command and in Boto3, is the location of the AWS Price List service API endpoint. This API is available only in us-east-1 and ap-south-1 Regions.

Deploying WordPress in Local Zones using AWS CDK

In this section, we see how to use the AWS CDK and Python to deploy a simple non-production WordPress installation in a Local Zone.

Architecture overview

architecture overview

The AWS CDK stack will deploy a new standard Amazon Virtual Private Cloud (Amazon VPC) in the parent Region (us-east-1) that will be extended to the Local Zone. This creates two subnets associated with the Atlanta Local Zone: a public subnet to expose resources on the Internet, and a private subnet to host the application and database layers. Review the AWS public documentation for a definition of public and private subnets in a VPC.

The application architecture is made of the following:

  • A front-end in the private subnet where a WordPress application is installed, through a User Data script, in a type T3 medium EC2 instance.
  • A back-end in the private subnet where MySQL database is installed, through a User Data script, in a type T3 medium EC2 instance.
  • An Application Load Balancer (ALB) in the public subnet that will act as the entry point for the application.
  • A NAT instance to allow resources in the private subnet to initiate traffic to the Internet.

Clone the sample code from the AWS CDK examples repository

We can clone the AWS CDK code hosted on GitHub with:

$ git clone https://github.com/aws-samples/aws-cdk-examples.git

Then navigate to the directory aws-cdk-examples/python/vpc-ec2-local-zones using the following:

$ cd aws-cdk-examples/python/vpc-ec2-local-zones

Before starting the provisioning, let’s look at the code in the following sections.

Networking infrastructure

The networking infrastructure is usually the first building block that we must define. In AWS CDK, this can be done using the VPC construct:

import aws_cdk.aws_ec2 as ec2
vpc = ec2.Vpc(
            self,
            "Vpc",
            cidr=”172.31.100.0/24”,
            subnet_configuration=[
                ec2.SubnetConfiguration(
                    name = 'Public-Subnet',
                    subnet_type = ec2.SubnetType.PUBLIC,
                    cidr_mask = 26,
                ),
                ec2.SubnetConfiguration(
                    name = 'Private-Subnet',
                    subnet_type = ec2.SubnetType.PRIVATE_ISOLATED,
                    cidr_mask = 26,
                ),
            ]      
        )

Together with the VPC CIDR (i.e. 172.31.100.0/24), we define also the subnets configuration through the subnet_configuration parameter.

Note that in the subnet definitions above there is no specification of the Availability Zone or Local Zone that we want to associate them with. We can define this setting at the VPC level, overwriting the availability_zones method as shown here:

@property
def availability_zones(self):
   return [“us-east-1-atl-1a”]

As an alternative, you can use a Local Zone Name as the value of the availability_zones parameter in each Subnet definition. For a complete list of Local Zone Names, check out the Zone Names on the Local Zones Locations page.

Specifying ec2.SubnetType.PUBLIC  in the subnet_type parameter, AWS CDK  automatically creates an Internet Gateway (IGW) associated with our VPC and a default route in its routing table pointing to the IGW. With this setup, the Internet traffic will go directly to the IGW in the Local Zone without going through the parent AWS Region. For other connectivity options, check the AWS Local Zone User Guide.

The last piece of our networking infrastructure is a self-managed NAT instance. This will allow instances in the private subnet to communicate with services outside of the VPC and simultaneously prevent them from receiving unsolicited connection requests.

We can implement the best practices for NAT instances included in the AWS public documentation using a combination of parameters of the Instance construct, as shown here:

nat = ec2.Instance(self, "NATInstanceInLZ",
                 vpc=vpc,
                 security_group=self.create_nat_SG(vpc),
                 instance_type=ec2.InstanceType.of(ec2.InstanceClass.T3, ec2.InstanceSize.MEDIUM),
                 machine_image=ec2.MachineImage.latest_amazon_linux(),
                 user_data=ec2.UserData.custom(user_data),
                 vpc_subnets=ec2.SubnetSelection(availability_zones=[“us-east-1-atl-1a”], subnet_type=ec2.SubnetType.PUBLIC),
                 source_dest_check=False
                )

In the previous code example, we specify the following as parameters:

The final required step is to update the route table of the private subnet with the following:

priv_subnet.add_route("DefRouteToNAT",
            router_id=nat_instance.instance_id,
            router_type=ec2.RouterType.INSTANCE,
            destination_cidr_block="0.0.0.0/0",
            enables_internet_connectivity=True)

The application stack

The other resources, including the front-end instance managed by AutoScaling, the back-end instance, and ALB are deployed using the standard AWS CDK constructs. Note that the ALB service is only available in some Local Zones. If you plan to use a Local Zone where ALB isn’t supported, then you must deploy a load balancer on a self-managed EC2 instance, or use a load balancer available in AWS Marketplace.

Stack deployment

Next, let’s go through the AWS CDK bootstrapping process. This is required only for the first time that we use AWS CDK in a specific AWS environment (an AWS environment is a combination of an AWS account and Region).

$ cdk bootstrap

Now we can deploy the stack with the following:

$ cdk deploy

After the deployment is completed, we can connect to the application with a browser using the URL returned in the output of the cdk deploy command:

terminal screenshot

The WordPress install wizard will be displayed in the browser, thereby confirming that the deployment worked as expected:

The WordPress install wizard

Note that in this post we use the Local Zone in Atlanta. Therefore, we must deploy the stack in its parent Region, US East (N. Virginia). To select the Region used by the stack, configure the AWS CLI default profile.

Cleanup

To terminate the resources that we created in this post, you can simply run the following:

$ cdk destroy

Conclusion

In this post, we demonstrated how to interact programmatically with the different AWS APIs available for Local Zones. Furthermore, we deployed a simple WordPress application in the Atlanta Local Zone after analyzing the AWS CDK code used for the deployment.

We encourage you to try the examples provided in this post and get familiar with the programmatic configuration and deployment of resources in a Local Zone.

Stream VPC flow logs to Amazon OpenSearch Service via Amazon Kinesis Data Firehose

Post Syndicated from Chaitanya Shah original https://aws.amazon.com/blogs/big-data/stream-vpc-flow-logs-to-amazon-opensearch-service-via-amazon-kinesis-data-firehose/

Amazon Virtual Private Cloud (Amazon VPC) flow logs enable you to track the IP traffic going to and from the network interfaces in your VPC for your workloads. Analyzing VPC logs helps you understand how your applications are communicating over your VPC network with log records and acts as a main source of information to the network in your VPC. After collecting the flow logs, the next step is performing log analysis to understand user or application behavior and patterns to make informed decisions. You can analyze logs using log analytics tools such as Amazon OpenSearch Service.

Amazon Kinesis Data Firehose is a fully managed service for delivering near real-time streaming data to various destinations for storage and performing near real-time analytics. With its extensible data transformation capabilities, you can also streamline log processing and log delivery pipelines into a single Firehose delivery stream.

Amazon OpenSearch Service makes it easy for you to perform interactive log analytics, real-time application monitoring, website search, and more. Amazon OpenSearch is an open source, distributed search and analytics suite. Amazon OpenSearch Service offers the latest versions of OpenSearch, support for 19 versions of Elasticsearch (1.5 to 7.10 versions), as well as visualization capabilities powered by OpenSearch Dashboards and Kibana (1.5 to 7.10 versions). Amazon OpenSearch Service currently has tens of thousands of active customers with hundreds of thousands of clusters under management processing trillions of requests per month.

In this post, you will learn how to ingest VPC flow logs with Kinesis Data Firehose and deliver them to an Amazon OpenSearch Service for analysis using OpenSearch Service Dashboards.

Overview of solution

This solution uses native integration of VPC flow logs streaming to Kinesis Data Firehose. We use a Firehose delivery stream to buffer the streamed VPC flow logs, and deliver those to an OpenSearch Service destination endpoint. We use Amazon OpenSearch Service Dashboards to create an index pattern for the VPC flow logs to analyze and visualize the logs in a near-real time. The following diagram illustrates this architecture.

Solution Architecture

We walk you through the following high-level steps:

  1. Create an OpenSearch Service domain for storing and analyzing the VPC flow logs.
  2. Create a Firehose delivery stream to deliver the flow logs to the OpenSearch Service domain.
  3. Create a VPC flow log subscription to the delivery stream.
  4. Explore VPC flow logs in OpenSearch Service Dashboards
    • Create role mapping with an OpenSearch Service user to the Kinesis Data Firehose service role. Because we’re using a public access domain for OpenSearch Service, we have to map the delivery stream AWS Identity and Access Management (IAM) role to the OpenSearch Service primary user to deliver logs in bulk to the OpenSearch Service domain.
    • Create an index pattern in OpenSearch Service Dashboards to enable analysis and visualization of VPC logs.

Prerequisites

As a prerequisite, you need to create an Amazon Simple Storage Service (Amazon S3) bucket to store the Firehose delivery stream backups and failed logs.

Create an Amazon OpenSearch Service domain

For demonstration purposes, and to limit the costs, we create an OpenSearch Service domain with the Development and testing deployment type and public access to the dashboard. For instructions, refer to Create an Amazon OpenSearch Service domain. Note that we select Public access only for demo purposes. For production, we recommend using VPC access for security reasons.

When it’s complete, the OpenSearch Service domain shows as Active.

OpenSearch Domain

Create a Kinesis Data Firehose delivery stream

Now that your Amazon OpenSearch Service domain is active, you can create a Firehose delivery stream where VPC flow logs are streamed.

  1. On the Amazon Kinesis console, choose Kinesis Data Firehose in the navigation pane, then choose Create delivery stream.
  2. Choose Direct PUT as the source and set the destination as Amazon OpenSearch Service.
  3. For Delivery stream name, enter PUT-OPENSEARCH-STREAM-DEMO.Kinesis Delivery Stream
  4. In the Destination settings section, choose Browse and choose the previously created Amazon OpenSearch Service domain.
  5. For Index name, enter vpcflowlogs.
  6. For Index rotation, choose Every day.
  7. For this post, we set Buffer size to 5 and Buffer interval to 900.You can modify these settings to optimize ingestion throughput and near-real-time behavior.
    Kinesis Stream Destination setting
  1. In the Backup settings section, for Source record backup in Amazon S3, select Failed events only so you only save the data that fails to deliver to Amazon OpenSearch Service.
  2. For S3 bucket, choose Browse and choose the S3 bucket you created to store failed logs and backups.
  3. Optionally, you can input a prefix for backup files and error files.
  4. Select GZIP for Compression for data records.
  5. For Encryption for data records, select Disabled.Kinesis Stream - Backup Setting
  6. Expand Advanced settings, and for Amazon CloudWatch error logging, select Enabled.
  7. Choose Create delivery stream.Kinesis Stream - Advance Setting

When the delivery stream is active, proceed to the next step.

Create a VPC flow logs subscription

Now you create a VPC flow logs subscription for the Firehose delivery stream you created in the previous step.

  1. On the Amazon VPC console, choose Your VPCs.
  2. Select the VPC for which to create the flow log.
  3. On the Actions menu, choose Create flow log.VPC Flow Log
  4. Select All to send all flow log records to Amazon OpenSearch Service.

If you want to filter the flow logs, you can select either Accept or Reject.

  1. For Maximum aggregation interval, select 10 minutes or the minimum setting of 1 minute if you need the flow log data to be available for near-real-time analysis in Amazon OpenSearch Service.
  2. For Destination, select Send to Kinesis Firehose in the same account if the delivery stream is set up on the same account where you create the VPC flow logs.
  3. For Log record format, if you leave it at AWS default format, the flow logs are sent as version 2 format.

Alternatively, you can specify which fields you need the flow logs to capture and send to an Amazon OpenSearch Service. For more information on log format and available fields, refer to Flow log records.

  1. Choose Create flow log.Create VPC Flow Logs

Now let’s explore the VPC flow logs in Amazon OpenSearch Service.

Explore VPC flow logs in Amazon OpenSearch Service Dashboards

In the final step, we set up OpenSearch Service Dashboards to explore the VPC flow logs.

  1. On the OpenSearch Service console, choose Domains in the navigation pane.
  2. Choose the domain you created.
  3. Under OpenSearch Dashboards URL, choose the link to open a new tab.OpenSearch Dashboard
  4. Log in with the user you created during OpenSearch Service domain setup.OpenSearch Service Dashboard
  5. Select Private for Select your tenant, then choose Confirm.OpenSearch Service Dashboard Tenant

Because we used a public access domain for OpenSearch Service, you need to map the role created for the Firehose delivery stream to the OpenSearch Service Dashboards user, so that the delivery stream can deliver logs in bulk to the OpenSearch Service domain.

  1. On the menu icon, choose Security.
  2. Choose Roles.
  3. Choose the all_access role.OpenSearch Service All Access Role
  4. On the Mapped users tab, choose Manage mapping.OpenSearch Service Dashboard map role
  5. For Backend roles, enter the IAM role ARN created for the Firehose delivery stream.
  6. Choose Map.OpenSearch Service Dashboard Map role arn
  7. Now that mapping is complete, choose the menu icon, then choose Stack management.OpenSearch Service Dashboard Stack Management
  8. Choose Index Patterns, then choose Create index pattern.
  9. For Index pattern name, enter vpcflowlogs*.
  10. Choose Next step.OpenSearch Service Dashboard Create Index
  11. Navigate to the Discover menu option.You can see the VPC flow logs from your VPC in this dashboard. Now you can search and visualize the flow logs that are being streamed in near-real time to the OpenSearch Service domain.
    OpenSearch Service Dashboard Discover

Clean up

After you test out this solution, remember to delete all the resources you created to avoid incurring future charges:

  1. Delete your Amazon OpenSearch Service domain.
  2. Delete the VPC flow logs subscription.
  3. Delete the Firehose delivery stream.
  4. Delete the S3 bucket for the VPC flow logs backup and failed logs.
  5. If you created a new VPC and new resources in the VPC, delete the resources and VPC.

Conclusion

In this post, we walked through a solution of how integrate VPC flow logs with a Kinesis Data Firehose delivery stream and deliver it to an Amazon OpenSearch Service destination with no code and visualize it in OpenSearch Service Dashboards.

Try this new quick and hassle-free way of sending your VPC flow logs to an Amazon OpenSearch Service using Kinesis Data Firehose.


About the Author

Chaitanya Shah is a Sr. Technical Account Manager with AWS, based out of New York. He has over 22 years of experience working with enterprise customers. He loves to code and actively contributes to the AWS solutions labs to help customers solve complex problems. He provides guidance to AWS customers on best practices for their AWS Cloud migrations. He is also specialized in AWS data transfer and the data and analytics domain.

Building a healthcare data pipeline on AWS with IBM Cloud Pak for Data

Post Syndicated from Eduardo Monich Fronza original https://aws.amazon.com/blogs/architecture/building-a-healthcare-data-pipeline-on-aws-with-ibm-cloud-pak-for-data/

Healthcare data is being generated at an increased rate with the proliferation of connected medical devices and clinical systems. Some examples of these data are time-sensitive patient information, including results of laboratory tests, pathology reports, X-rays, digital imaging, and medical devices to monitor a patient’s vital signs, such as blood pressure, heart rate, and temperature.

These different types of data can be difficult to work with, but when combined they can be used to build data pipelines and machine learning (ML) models to address various challenges in the healthcare industry, like the prediction of patient outcome, readmission rate, or disease progression.

In this post, we demonstrate how to bring data from different sources, like Snowflake and connected health devices, to form a healthcare data lake on Amazon Web Services (AWS). We also explore how to use this data with IBM Watson to build, train, and deploy ML models. You can learn how to integrate model endpoints with clinical health applications to generate predictions for patient health conditions.

Solution overview

The main parts of the architecture we discuss are (Figure 1):

  1. Using patient data to improve health outcomes
  2. Healthcare data lake formation to store patient health information
  3. Analyzing clinical data to improve medical research
  4. Gaining operational insights from healthcare provider data
  5. Providing data governance to maintain the data privacy
  6. Building, training, and deploying an ML model
  7. Integration with the healthcare system
Data pipeline for the healthcare industry using IBM CP4D on AWS

Figure 1. Data pipeline for the healthcare industry using IBM CP4D on AWS

IBM Cloud Pak for Data (CP4D) is deployed on Red Hat OpenShift Service on AWS (ROSA). It provides the components IBM DataStage, IBM Watson Knowledge Catalogue, IBM Watson Studio, IBM Watson Machine Learning, plus a wide variety of connections with data sources available in a public cloud or on-premises.

Connected health devices, on the edge, use sensors and wireless connectivity to gather patient health data, such as biometrics, and send it to the AWS Cloud through Amazon Kinesis Data Firehose. AWS Lambda transforms the data that is persisted to Amazon Simple Storage Service (Amazon S3), making that information available to healthcare providers.

Amazon Simple Notification Service (Amazon SNS) is used to send notifications whenever there is an issue with the real-time data ingestion from the connected health devices. In case of failures, messages are sent via Amazon SNS topics for rectifying and reprocessing of failure messages.

DataStage performs ETL operations and move patient historical information from Snowflake into Amazon S3. This data, combined with the data from the connected health devices, form a healthcare data lake, which is used in IBM CP4D to build and train ML models.

The pipeline described in architecture uses Watson Knowledge Catalogue, which provides data governance framework and artifacts to enrich our data assets. It protects sensitive patient information from unauthorized access, like individually identifiable information, medical history, test results, or insurance information.

Data protection rules define how to control access to data, mask sensitive values, or filter rows from data assets. The rules are automatically evaluated and enforced each time a user attempts to access a data asset in any governed catalog of the platform.

After this, the datasets are published to Watson Studio projects, where they are used to train ML models. You can develop models using Jupyter Notebook, IBM AutoAI (low-code), or IBM SPSS modeler (no-code).

For the purpose of this use case, we used logistic regression algorithm for classifying and predicting the probability of an event, such as disease risk management to assist doctors in making critical medical decisions. You can also build ML models using algorithms like Classification, Random Forest, and K-Nearest Neighbor. These are widely used to predict disease risk.

Once the models are trained, they are exposed as endpoints with Watson Machine Learning and integrated with the healthcare application to generate predictions by analyzing patient symptoms.

The healthcare applications are a type of clinical software that offer crucial physiological insights and predict the effects of illnesses and possible treatments. It provides built-in dashboards that display patient information together with the patient’s overall metrics for outcomes and treatments. This can help healthcare practitioners gain insights into patient conditions. It also can help medical institutions prioritize patients with more risk factors and curate clinical and behavioral health plans.

Finally, we are using IBM Security QRadar XDR SIEM to collect, process, and aggregate Amazon Virtual Private Cloud (Amazon VPC) flow logs, AWS CloudTrail logs, and IBM CP4D logs. QRadar XDR uses this information to manage security by providing real-time monitoring, alerts, and responses to threats.

Healthcare data lake

A healthcare data lake can help health organizations turn data into insights. It is centralized, curated, and securely stores data on Amazon S3. It also enables you to break down data silos and combine different types of analytics to gain insights. We are using the DataStage, Kinesis Data Firehose, and Amazon S3 services to build the healthcare data lake.

Data governance

Watson Knowledge Catalogue provides an ML catalogue for data discovery, cataloging, quality, and governance. We define policies in Watson Knowledge Catalogue to enable data privacy and overall access to and utilization of this data. This includes sensitive data and personal information that needs to be handled through data protection, quality, and automation rules. To learn more about IBM data governance, please refer to Running a data quality analysis (Watson Knowledge Catalogue).

Build, train, and deploy the ML model

Watson Studio empowers data scientists, developers, and analysts to build, run, and manage AI models on IBM CP4D.

In this solution, we are building models using Watson Studio by:

  1. Promoting the governed data from Watson Knowledge Catalogue to Watson Studio for insights
  2. Using ETL features, such as built-in search, automatic metadata propagation, and simultaneous highlighting, to process and transform large amounts of data
  3. Training the model, including model technique selection and application, hyperparameter setting and adjustment, validation, ensemble model development and testing; algorithm selection; and model optimization
  4. Evaluating the model based on metric evaluation, confusion matrix calculations, KPIs, model performance metrics, model quality measurements for accuracy and precision
  5. Deploying the model on Watson Machine Learning using online deployments, which create an endpoint to generate a score or prediction in real time
  6. Integrating the endpoint with applications like health applications, as demonstrated in Figure 1

Conclusion

In this blog, we demonstrated how to use patient data to improve health outcomes by creating a healthcare data lake and analyzing clinical data. This can help patients and healthcare practitioners make better, faster decisions and prioritize cases. We also discussed how to build an ML model using IBM Watson and integrate it with healthcare applications for health analysis.

Additional resources

Simplify private network access for solutions using Amazon OpenSearch Service managed VPC endpoints

Post Syndicated from Aish Gunasekar original https://aws.amazon.com/blogs/big-data/simplify-private-network-access-for-solutions-using-amazon-opensearch-service-managed-vpc-endpoints/

Amazon OpenSearch Service makes it easy for you to perform interactive log analytics, real-time application monitoring, website search, and more. Amazon OpenSearch is an open source, distributed search and analytics suite. Amazon OpenSearch Service offers the latest versions of OpenSearch, support for 19 versions of Elasticsearch (1.5 to 7.10 versions), as well as visualization capabilities powered by OpenSearch Dashboards and Kibana (1.5 to 7.10 versions). Amazon OpenSearch Service currently has tens of thousands of active customers with hundreds of thousands of clusters under management processing trillions of requests per month.

To meet the needs of customers who want simplicity in their network setup with the Amazon OpenSearch Service, you can now use Amazon OpenSearch Service-managed virtual private cloud (VPC) endpoints (powered by AWS PrivateLink) to connect to your applications using Amazon OpenSearch Service domains launched in Amazon Virtual Private Cloud (VPC). With Amazon OpenSearch Service-managed VPC endpoints, you can privately access your Amazon OpenSearch Service domain from multiple VPCs in your account or other AWS accounts based on your application needs without configuring other services features such as VPC peering, AWS Transit Gateway (TGW), or other more complex network routing strategies that place operational burden on your support and engineering teams.

The feature is built using AWS PrivateLink. AWS PrivateLink provides private connectivity between VPCs, supported AWS services, and your on-premises networks without exposing your traffic to the public internet. It provides you with the means to connect multiple application deployments effortlessly to your Amazon OpenSearch Service domains.

This post introduces Amazon OpenSearch Service-managed VPC endpoints that build on top of AWS PrivateLink and shows how you can access a private Amazon OpenSearch Service from one or more VPCs hosted in the same account, or even VPCs hosted in other AWS accounts using AWS PrivateLink managed by Amazon OpenSearch Service.

­­­­Amazon OpenSearch Service managed VPC endpoints

Before the launch of Amazon OpenSearch Service managed VPC endpoints, if you needed to gain access to your domain outside of your VPC, you had three options:

  • Use VPC peering to connect your VPC with other VPCs
  • Use AWS Transit Gateway to connect your VPC with other VPCs
  • Create your own implementation of an AWS PrivateLink setup

The first two options require you to setup your VPCs so that the Classless Inter-Domain Routing (CIDR) block ranges don’t overlap. If they did, then your options are more complicated. The third option, create your own implementation of AWS PrivateLink, involve configuring a network load balancer (NLB) and associating a target group with the NLB as one of the steps in the setup. The architecture discussed in this post, demonstrates these additional layers of complexity.

With Amazon OpenSearch Service managed VPC endpoints (i.e., powered by AWS PrivateLink), these complex setups and processes are no longer needed!

You can access your Amazon OpenSearch Service private domain as if it were deployed in all the VPCs that you want to connect to your domain. If you need private connectivity from your on-premises hybrid deployments, then AWS PrivateLink helps you bring access from your Amazon OpenSearch Service domain to your data centers with minimal effort.

By using AWS PrivateLink with Amazon OpenSearch Service, you can realize the following benefits:

  • You simplify your network architecture between hybrid, multi-VPC, and multi account solutions
  • You address a multitude of compliance concerns by better controlling the traffic that moves between your solutions and Amazon OpenSearch Service domains

Shared search cluster for multiple development teams

Imagine that your company hosts a service as a software (SaaS) application that provides a search application programming interface (API) for the healthcare industry. Each team works on a different function of the API. The development teams API team 1 and API team 2 are in two different AWS accounts and each has their own VPCs within these accounts. Another team (data refinement team) works on the ingestion and data refinement to populate the Amazon OpenSearch Service domain hosted in the same account as API team 2 but in different VPC. Each team shares the domain during the development cycles to save costs and foster collaboration on the data modeling.

Solution overview

Self-managed AWS PrivateLink architecture to connect different VPCs

In this scenario prior to Amazon OpenSearch Service manage VPC endpoints (i.e., powered by AWS PrivateLink), you would have to create the following items:

  1. Deploy an NLB in your VPC
  2. Create a target group that points to the IP addresses of the Elastic Network Interfaces (ENIs), which the Amazon OpenSearch Service creates in your VPC and is used to launch the Amazon OpenSearch Service
  3. Create an AWS PrivateLink deployment and reference your newly created NLB

When you implement the NLB, a target group can only reference IP addresses, an Amazon EC2 instance, or an Application Load Balancer (ALB). If you referenced the IP addresses as targets, then you had to build a process that detected the changes in the IP address if the domain changed due to service initiated or self-initiated blue/green deployments. You must maintain yet another complex process to ensure that you always have active ENIs with which to point your target groups or you lose connectivity.

Typically, customers use an AWS Lambda with scheduled events in Amazon CloudWatch. This means that you use the AWS Lambda to detect the current state where the ENIs that provided the IP addresses were marked as active for the description that matched the ENIs your domain creates. You schedule AWS Lambda to wake up within the time to live (TTL) of the Domain Name Service (DNS) settings (typically 60 seconds) and compare the existing IP addresses in the target group with any new ones found when you query all ENIs with a description referencing your domain in the VPC. You then build a new target group with the deltas and you swap the target groups and drop the old one. It’s tricky, it’s complex, and you have to maintain the solution!

With the new simplified networking architecture, your teams go through the following steps.

OpenSearch Service managed VPC endpoints architecture (powered by AWS PrivateLink)

Since the Amazon OpenSearch Service takes care of the infrastructure described previously — but not necessarily on the same implementation — all you really need to concern yourself with is creating the connections using the instructions in our service documentation.

Once you complete the steps in the instructions and remove your own implementation, your architecture is then simplified as seen in the following diagram.

Once you complete the steps in the instructions and remove your own implementation, your architecture is then simplified.

At this point, the development teams (API team 1 and API team 2) can access the Amazon OpenSearch cluster via Amazon OpenSearch Service Managed VPC Endpoint. This option is highly scalable with a simplified network architecture in which you don’t have to worry about managing a NLB, or setting up target groups and the additional resources. If the number of development teams and VPCs grow in the future, you associate the domain with the associated interface VPC endpoint. You can access services in VPCs in same or different accounts, even if there are overlapping CIDR Block IP ranges.

Conclusion

In this post, we walked through the architectural design of accessing Amazon OpenSearch cluster from different VPCs across different accounts using OpenSearch Service-managed VPC endpoint (AWS PrivateLink). Using Transit Gateway, self-managed AWS PrivateLink or VPC peering required complex networking strategies that increased operation burden. With the introduction of VPC endpoints for Amazon OpenSearch Service, the complexity of your solutions is greatly simplified and what’s even better, it’s managed for you!


About the authors

Aish Gunasekar is a Specialist Solutions architect with a focus on Amazon OpenSearch Service. Her passion at AWS is to help customers design highly scalable architectures and help them in their cloud adoption journey. Outside of work, she enjoys hiking and baking.

Kevin Fallis (@AWSCodeWarrior) is an AWS specialist search solutions architect.  His passion at AWS is to help customers leverage the correct mix of AWS services to achieve success for their business goals. His after-work activities include family, DIY projects, carpentry, playing drums, and all things music.

Introducing VPC Lattice – Simplify Networking for Service-to-Service Communication (Preview)

Post Syndicated from Danilo Poccia original https://aws.amazon.com/blogs/aws/introducing-vpc-lattice-simplify-networking-for-service-to-service-communication-preview/

Modern applications are built using modular and distributed components. Each component is a service that implements its own subset of functionalities. To make these services communicate with each other, you need a way to let them discover where they are, authorize access, and route traffic. When troubleshooting issues, you need to keep communication configurations under control so that you can quickly understand what is happening at the application, service, and network levels. This can take a lot of your time.

Today, we are making available in preview Amazon VPC Lattice, a new capability of Amazon Virtual Private Cloud (Amazon VPC) that gives you a consistent way to connect, secure, and monitor communication between your services. With VPC Lattice, you can define policies for traffic management, network access, and monitoring so you can connect applications in a simple and consistent way across AWS compute services (instances, containers, and serverless functions). VPC Lattice automatically handles network connectivity between VPCs and accounts and network address translation between IPv4, IPv6, and overlapping IP addresses. VPC Lattice integrates with AWS Identity and Access Management (IAM) to give you the same authentication and authorization capabilities you are familiar with when interacting with AWS services today, but for your own service-to-service communication. With VPC Lattice, you have common controls to route traffic based on request characteristics and weighted routing for blue/green and canary-style deployments. For example, VPC Lattice allows you to mix and match compute types for a given service, which helps you modernize a monolith application architecture to microservices.

VPC Lattice is designed to be noninvasive, allowing teams across your organization to incrementally opt in over time. In this way, you are able to deliver applications faster by focusing on your application logic, while VPC Lattice handles service-to-service networking, security, and monitoring requirements.

How Amazon VPC Lattice Works
With VPC Lattice, you create a logical application layer network, called a service network, that connects clients and services across different VPCs and accounts, abstracting network complexity. A service network is a logical boundary that is used to automatically implement service discovery and connectivity as well as apply access and observability policies to a collection of services. It offers inter-application connectivity over HTTP/HTTPS and gRPC protocols within a VPC.

Once a VPC has been enabled for a service network, clients in the VPC will automatically be able to discover the services in the service network through DNS and will direct all inter-application traffic through VPC Lattice. You can use AWS Resource Access Manager (RAM) to control which accounts, VPCs, and applications can establish communication via VPC Lattice.

A service is an independently deployable unit of software that delivers a specific task or function. In VPC Lattice, a service is a logical component that can live in any VPC or account and can run on a mixture of compute types (virtual machines, containers, and serverless functions). A service configuration consists of:

  • One or two listeners that define the port and protocol that the service is expecting traffic on. Supported protocols are HTTP/1.1, HTTP/2, and gRPC, including HTTPS for TLS-enabled services.
  • Listeners have rules that consist of a priority, which specifies the order in which rules should be processed, one or more conditions that define when to apply the rule, and actions that forward traffic to target groups. Each listener has a default rule that takes effect when no additional rules are configured, or no conditions are met.
  • A target group is a collection of targets, or compute resources, that are running a specific workload you are trying to route toward. Targets can be Amazon Elastic Compute Cloud (Amazon EC2) instances, IP addresses, and Lambda functions. For Kubernetes workloads, VPC Lattice can target services and pods via the AWS Gateway Controller for Kubernetes. To have access to the AWS Gateway Controller for Kubernetes, you can join the preview.

VPC Lattice logical architecture.

To configure service access controls, you can use access policies. An access policy is an IAM resource policy that can be associated with a service network and individual services. With access policies, you can use the “PARC” (principal, action, resource, and condition) model to enforce context-specific access controls for services. For example, you can use an access policy to define which services can access a service you own. If you use AWS Organizations, you can limit access to a service network to a specific organization.

VPC Lattice also provides a service directory, a centralized view of the services that you own or have been shared with you via AWS RAM.

Using Amazon VPC Lattice
We expect people with different roles can use VPC Lattice. For example:

  • The service network administrator can:
    • Create and manage a service network.
    • Define access and monitoring for the service network.
    • Associate client and services.
    • Share the service network with other AWS accounts.
  • The service owner can:
    • Create and manage a service, including access and monitoring.
    • Define routing, for example, configuring listeners and rules that point to the target groups where the service is running.
    • Associate a service to service networks.

Let’s see how this works in practice. In this quick walkthrough, I am covering both roles.

Creating Two Backend Services
There is nothing specific to VPC Lattice in this section. I am just creating a couple of services, one running on Amazon EC2 and one on AWS Lambda, that I’ll use later when I configure networking with VPC Lattice.

In an Amazon Linux EC2 instance, I create a web app that replies “Hello from the instance” to HTTP requests. To allow access to the instance from clients coming via VPC Lattice, I add an inbound rule to the security group to allow TCP traffic on port 8080 from the VPC Lattice AWS-managed prefix list.

Here’s the app.py file. I am using Python and Flask for this app, but you don’t need to know them to follow along with the post.

from flask import Flask

app = Flask(__name__)

@app.route('/')
def index():
  return 'Hello from the instance'

@app.route('/<path>')
def somePath(path):
  return 'Hello from the instance at path "{}"'.format(path)

app.run(host='0.0.0.0', port=8080)

Here’s the requirements.txt file with the Python dependencies. There’s only one line because the only module I need is flask:

flask

I install the dependencies:

pip3 install -r requirements.txt

Then, I start the web app using the nohup command to keep it running in case I log out of the instance:

nohup flask run --host=0.0.0.0 --port 8080 &

On the EC2 instance, the web service is now listening to HTTP traffic on port 8080.

In the Lambda console, I create a simple function using the Node.js 18.x runtime that replies “Hello from the function” to all invocations.

exports.handler = async (event) => {
    const response = {
        statusCode: 200,
        body: JSON.stringify('Hello from the function'),
    };
    return response;
};

The two services are now both ready. Let’s use VPC Lattice to configure networking.

Creating VPC Lattice Target Groups
I start by creating two target groups, one for the EC2 instance and one for the Lambda function. In the VPC console, there is a new VPC Lattice section in the navigation pane. There, I choose Target groups and then Create target group.

For the first target group, I choose the Instances target type and enter a name.

Console screenshot.

I choose the protocol (HTTP) and port (8080) used by the web app running on the instance. I select the VPC where the instance is running and the protocol version (HTTP1).

Console screenshot.

Now I can configure the health check that will be used to test the target status. In this case, I use the default values proposed by the console.

Console screenshot.

In the next step, I can register the targets. I select the instance on which the web app is running from the list and choose to include it.

Console screenshot.

I review the selected targets (one instance in this case) and choose Submit.

In a similar way, I create a target group for the Lambda function. This time, I select the function from the list. I can choose which function version or function alias to use. For simplicity, I use the $LATEST version.

Console screenshot.

Creating VPC Lattice Services
Now that the target groups are ready, I choose Services in the navigation pane and then Create service. I enter a name and a description.

Console screenshot.

Now, I can choose the authentication type. If I choose None, the service network does not authenticate or authorize client access, and the auth policy, if present, is not used. I select AWS IAM and then, from the Apply policy template dropdown, the template that allows both authenticated and unauthenticated access.

Console screenshot.

In the Monitoring section, I turn on Access logs. As the destination for the access logs, I use an Amazon CloudWatch Log group that I created before. I also have the option to use an Amazon Simple Storage Service (Amazon S3) bucket or a Amazon Kinesis Data Firehose delivery stream.

Console screenshot.

In the next step, I define routing for the service. I choose Add listener. For the protocol, I configure the service to listen using HTTPS. In the default action, I choose to send two-thirds (Weight 20) of the requests to the instance target group and one-third (Weight 10) to the function target group.

Console screenshot.

Then, I add two additional rules. The first rule (Priority 10) sends all requests where the path is /to-instance to the instance target group.

Console screenshot.

The second rule (Priority 20) sends all traffic where the path is /to-function to the function target group.

Console screenshot.

In the next step, I am asked to associate the service with one or more service networks. I didn’t create a service network yet, so I skip this step for now and choose Next. I review the configuration and create the service.

Creating VPC Lattice Service Networks
Now, I create the service network so that I can associate the service and the VPCs I want to use. I choose Service network from the navigation pane and then Create service network. I enter a name and a description for the service network.

Console screenshot.

In the Associate services, I select the service I just created.

Console screenshot.

In the VPC associations, I select the VPC used by the instance where the web app runs. This can help in the future because it allows the web app to call other services associated with the service network.

Console screenshot.

Then, I select a second VPC where I have another EC2 instance that I want to use to run some tests.

Console screenshot.

For simplicity, in the Access section, I select the None auth type.

Console screenshot.

In the Monitoring section, I choose to send the access logs for the whole service network to an S3 bucket.

Console screenshot.

I review the summary of the configuration and create the service network. After a few seconds all service and VPC associations are active, and I can start using the service.

I write down the domain name of the service from the list of service associations.

Console screenshot.

Testing Access to the Service Using VPC Lattice
I look at the Routing tab of the service to find a nice recap of how the listener is handling routing towards the different target groups.

Console screenshot.

Then, I log into the EC2 instance in my second VPC and use curl to call the service domain name. As expected, I get about two-thirds of the responses from the instance and one-third from the function.

curl https://my-service-03e92ee54968d87ca.7d67968.vpc-lattice-svcs.us-west-2.on.aws
Hello from the instance

curl https://my-service-03e92ee54968d87ca.7d67968.vpc-lattice-svcs.us-west-2.on.aws
Hello from the instance

curl https://my-service-03e92ee54968d87ca.7d67968.vpc-lattice-svcs.us-west-2.on.aws
"Hello from the function"

When I call the /to-instance and /to-function paths, the additional rules forward the requests to the instance and the function, respectively.

curl https://my-service-03e92ee54968d87ca.7d67968.vpc-lattice-svcs.us-west-2.on.aws/to-instance
Hello from the instance "to-instance" path

curl https://my-service-03e92ee54968d87ca.7d67968.vpc-lattice-svcs.us-west-2.on.aws/to-function
"Hello from the function"

I can now review access to my service using the access log subscriptions I configured before.

For the service, I look in the CloudWatch Log group. There, I find a log stream containing detailed access information about the service.

Console screenshot.

The access log for all services associated with the service network is on the S3 bucket. I have only one service for now, but more are coming.

Console screenshot.

Available in Preview
Amazon VPC Lattice is available in preview in the US West (Oregon) Region.

VPC Lattice provides deployment consistency across AWS compute types so that you can connect your services across instances, containers, and serverless functions. You can use VPC Lattice to apply granular and rich traffic controls, such as policy-based routing and weighted targets to support blue/green and canary-style deployments.

VPC Lattice allows monitoring and troubleshooting service-to-service communication with detailed access logs and metrics that capture request type, volume of traffic, error rates, response time, and more. In this blog post, I only scratched the surface of what you can do with VPC Lattice.

Simplify the way you connect, secure, and monitor service-to-service communication with Amazon VPC Lattice.

Enrich VPC Flow Logs with resource tags and deliver data to Amazon S3 using Amazon Kinesis Data Firehose

Post Syndicated from Chaitanya Shah original https://aws.amazon.com/blogs/big-data/enrich-vpc-flow-logs-with-resource-tags-and-deliver-data-to-amazon-s3-using-amazon-kinesis-data-firehose/

VPC Flow Logs is an AWS feature that captures information about the network traffic flows going to and from network interfaces in Amazon Virtual Private Cloud (Amazon VPC). Visibility to the network traffic flows of your application can help you troubleshoot connectivity issues, architect your application and network for improved performance, and improve security of your application.

Each VPC flow log record contains the source and destination IP address fields for the traffic flows. The records also contain the Amazon Elastic Compute Cloud (Amazon EC2) instance ID that generated the traffic flow, which makes it easier to identify the EC2 instance and its associated VPC, subnet, and Availability Zone from where the traffic originated. However, when you have a large number of EC2 instances running in your environment, it may not be obvious where the traffic is coming from or going to simply based on the EC2 instance IDs or IP addresses contained in the VPC flow log records.

By enriching flow log records with additional metadata such as resource tags associated with the source and destination resources, you can more easily understand and analyze traffic patterns in your environment. For example, customers often tag their resources with resource names and project names. By enriching flow log records with resource tags, you can easily query and view flow log records based on an EC2 instance name, or identify all traffic for a certain project.

In addition, you can add resource context and metadata about the destination resource such as the destination EC2 instance ID and its associated VPC, subnet, and Availability Zone based on the destination IP in the flow logs. This way, you can easily query your flow logs to identify traffic crossing Availability Zones or VPCs.

In this post, you will learn how to enrich flow logs with tags associated with resources from VPC flow logs in a completely serverless model using Amazon Kinesis Data Firehose and the recently launched Amazon VPC IP Address Manager (IPAM), and also analyze and visualize the flow logs using Amazon Athena and Amazon QuickSight.

Solution overview

In this solution, you enable VPC flow logs and stream them to Kinesis Data Firehose. This solution enriches log records using an AWS Lambda function on Kinesis Data Firehose in a completely serverless manner. The Lambda function fetches resource tags for the instance ID. It also looks up the destination resource from the destination IP using the Amazon EC2 API and IPAM, and adds the associated VPC network context and metadata for the destination resource. It then stores the enriched log records in an Amazon Simple Storage Service (Amazon S3) bucket. After you have enriched your flow logs, you can query, view, and analyze them in a wide variety of services, such as AWS Glue, Athena, QuickSight, Amazon OpenSearch Service, as well as solutions from the AWS Partner Network such as Splunk and Datadog.

The following diagram illustrates the solution architecture.

Architecture

The workflow contains the following steps:

  1. Amazon VPC sends the VPC flow logs to the Kinesis Data Firehose delivery stream.
  2. The delivery stream uses a Lambda function to fetch resource tags for instance IDs from the flow log record and add it to the record. You can also fetch tags for the source and destination IP address and enrich the flow log record.
  3. When the Lambda function finishes processing all the records from the Kinesis Data Firehose buffer with enriched information like resource tags, Kinesis Data Firehose stores the result file in the destination S3 bucket. Any failed records that Kinesis Data Firehose couldn’t process are stored in the destination S3 bucket under the prefix you specify during delivery stream setup.
  4. All the logs for the delivery stream and Lambda function are stored in Amazon CloudWatch log groups.

Prerequisites

As a prerequisite, you need to create the target S3 bucket before creating the Kinesis Data Firehose delivery stream.

If using a Windows computer, you need PowerShell; if using a Mac, you need Terminal to run AWS Command Line Interface (AWS CLI) commands. To install the latest version of the AWS CLI, refer to Installing or updating the latest version of the AWS CLI.

Create a Lambda function

You can download the Lambda function code from the GitHub repo used in this solution. The example in this post assumes you are enabling all the available fields in the VPC flow logs. You can use it as is or customize per your needs. For example, if you intend to use the default fields when enabling the VPC flow logs, you need to modify the Lambda function with the respective fields. Creating this function creates an AWS Identity and Access Management (IAM) Lambda execution role.

To create your Lambda function, complete the following steps:

  1. On the Lambda console, choose Functions in the navigation pane.
  2. Choose Create function.
  3. Select Author from scratch.
  4. For Function name, enter a name.
  5. For Runtime, choose Python 3.8.
  6. For Architecture, select x86_64.
  7. For Execution role, select Create a new role with basic Lambda permissions.
  8. Choose Create function.

Create Lambda Function

You can then see code source page, as shown in the following screenshot, with the default code in the lambda_function.py file.

  1. Delete the default code and enter the code from the GitHub Lambda function aws-vpc-flowlogs-enricher.py.
  2. Choose Deploy.

VPC Flow Logs Enricher function

To enrich the flow logs with additional tag information, you need to create an additional IAM policy to give Lambda permission to describe tags on resources from the VPC flow logs.

  1. On the IAM console, choose Policies in the navigation pane.
  2. Choose Create policy.
  3. On the JSON tab, enter the JSON code as shown in the following screenshot.

This policy gives the Lambda function permission to retrieve tags for the source and destination IP and retrieve the VPC ID, subnet ID, and other relevant metadata for the destination IP from your VPC flow log record.

  1. Choose Next: Tags.

Tags

  1. Add any tags and choose Next: Review.

  1. For Name, enter vpcfl-describe-tag-policy.
  2. For Description, enter a description.
  3. Choose Create policy.

Create IAM Policy

  1. Navigate to the previously created Lambda function and choose Permissions in the navigation pane.
  2. Choose the role that was created by Lambda function.

A page opens in a new tab.

  1. On the Add permissions menu, choose Attach policies.

Add Permissions

  1. Search for the vpcfl-describe-tag-policy you just created.
  2. Select the vpcfl-describe-tag-policy and choose Attach policies.

Create the Kinesis Data Firehose delivery stream

To create your delivery stream, complete the following steps:

  1. On the Kinesis Data Firehose console, choose Create delivery stream.
  2. For Source, choose Direct PUT.
  3. For Destination, choose Amazon S3.

Kinesis Firehose Stream Source and Destination

After you choose Amazon S3 for Destination, the Transform and convert records section appears.

  1. For Data transformation, select Enable.
  2. Browse and choose the Lambda function you created earlier.
  3. You can customize the buffer size as needed.

This impacts on how many records the delivery stream will buffer before it flushes it to Amazon S3.

  1. You can also customize the buffer interval as needed.

This impacts how long (in seconds) the delivery stream will buffer the incoming records from the VPC.

  1. Optionally, you can enable Record format conversion.

If you want to query from Athena, it’s recommended to convert it to Apache Parquet or ORC and compress the files with available compression algorithms, such as gzip and snappy. For more performance tips, refer to Top 10 Performance Tuning Tips for Amazon Athena. In this post, record format conversion is disabled.

Transform and Conver records

  1. For S3 bucket, choose Browse and choose the S3 bucket you created as a prerequisite to store the flow logs.
  2. Optionally, you can specify the S3 bucket prefix. The following expression creates a Hive-style partition for year, month, and day:

AWSLogs/year=!{timestamp:YYYY}/month=!{timestamp:MM}/day=!{timestamp:dd}/

  1. Optionally, you can enable dynamic partitioning.

Dynamic partitioning enables you to create targeted datasets by partitioning streaming S3 data based on partitioning keys. The right partitioning can help you to save costs related to the amount of data that is scanned by analytics services like Athena. For more information, see Kinesis Data Firehose now supports dynamic partitioning to Amazon S3.

Note that you can enable dynamic partitioning only when you create a new delivery stream. You can’t enable dynamic partitioning for an existing delivery stream.

Destination Settings

  1. Expand Buffer hints, compression and encryption.
  2. Set the buffer size to 128 and buffer interval to 900 for best performance.
  3. For Compression for data records, select GZIP.

S3 Buffer settings

Create a VPC flow log subscription

Now you create a VPC flow log subscription for the Kinesis Data Firehose delivery stream you created.

Navigate to AWS CloudShell or Terminal/PowerShell for a Mac or Windows computer and run the following AWS CLI command to enable the subscription. Provide your VPC ID for the parameter --resource-ids and delivery stream ARN for the parameter --log-destination.

aws ec2 create-flow-logs \ 
--resource-type VPC \ 
--resource-ids vpc-0000012345f123400d \ 
--traffic-type ALL \ 
--log-destination-type kinesis-data-firehose \ 
--log-destination arn:aws:firehose:us-east-1:123456789101:deliverystream/PUT-Kinesis-Demo-Stream \ 
--max-aggregation-interval 60 \ 
--log-format '${account-id} ${action} ${az-id} ${bytes} ${dstaddr} ${dstport} ${end} ${flow-direction} ${instance-id} ${interface-id} ${log-status} ${packets} ${pkt-dst-aws-service} ${pkt-dstaddr} ${pkt-src-aws-service} ${pkt-srcaddr} ${protocol} ${region} ${srcaddr} ${srcport} ${start} ${sublocation-id} ${sublocation-type} ${subnet-id} ${tcp-flags} ${traffic-path} ${type} ${version} ${vpc-id}'

If you’re running CloudShell for the first time, it will take a few seconds to prepare the environment to run.

After you successfully enable the subscription for your VPC flow logs, it takes a few minutes depending on the intervals mentioned in the setup to create the log record files in the destination S3 folder.

To view those files, navigate to the Amazon S3 console and choose the bucket storing the flow logs. You should see the compressed interval logs, as shown in the following screenshot.

S3 destination bucket

You can download any file from the destination S3 bucket on your computer. Then extract the gzip file and view it in your favorite text editor.

The following is a sample enriched flow log record, with the new fields in bold providing added context and metadata of the source and destination IP addresses:

{'account-id': '123456789101',
 'action': 'ACCEPT',
 'az-id': 'use1-az2',
 'bytes': '7251',
 'dstaddr': '10.10.10.10',
 'dstport': '52942',
 'end': '1661285182',
 'flow-direction': 'ingress',
 'instance-id': 'i-123456789',
 'interface-id': 'eni-0123a456b789d',
 'log-status': 'OK',
 'packets': '25',
 'pkt-dst-aws-service': '-',
 'pkt-dstaddr': '10.10.10.11',
 'pkt-src-aws-service': 'AMAZON',
 'pkt-srcaddr': '52.52.52.152',
 'protocol': '6',
 'region': 'us-east-1',
 'srcaddr': '52.52.52.152',
 'srcport': '443',
 'start': '1661285124',
 'sublocation-id': '-',
 'sublocation-type': '-',
 'subnet-id': 'subnet-01eb23eb4fe5c6bd7',
 'tcp-flags': '19',
 'traffic-path': '-',
 'type': 'IPv4',
 'version': '5',
 'vpc-id': 'vpc-0123a456b789d',
 'src-tag-Name': 'test-traffic-ec2-1', 'src-tag-project': ‘Log Analytics’, 'src-tag-team': 'Engineering', 'dst-tag-Name': 'test-traffic-ec2-1', 'dst-tag-project': ‘Log Analytics’, 'dst-tag-team': 'Engineering', 'dst-vpc-id': 'vpc-0bf974690f763100d', 'dst-az-id': 'us-east-1a', 'dst-subnet-id': 'subnet-01eb23eb4fe5c6bd7', 'dst-interface-id': 'eni-01eb23eb4fe5c6bd7', 'dst-instance-id': 'i-06be6f86af0353293'}

Create an Athena database and AWS Glue crawler

Now that you have enriched the VPC flow logs and stored them in Amazon S3, the next step is to create the Athena database and table to query the data. You first create an AWS Glue crawler to infer the schema from the log files in Amazon S3.

  1. On the AWS Glue console, choose Crawlers in the navigation pane.
  2. Choose Create crawler.

Glue Crawler

  1. For Name¸ enter a name for the crawler.
  2. For Description, enter an optional description.
  3. Choose Next.

Glue Crawler properties

  1. Choose Add a data source.
  2. For Data source¸ choose S3.
  3. For S3 path, provide the path of the flow logs bucket.
  4. Select Crawl all sub-folders.
  5. Choose Add an S3 data source.

Add Data source

  1. Choose Next.

Data source classifiers

  1. Choose Create new IAM role.
  2. Enter a role name.
  3. Choose Next.

Configure security settings

  1. Choose Add database.
  2. For Name, enter a database name.
  3. For Description, enter an optional description.
  4. Choose Create database.

Create Database

  1. On the previous tab for the AWS Glue crawler setup, for Target database, choose the newly created database.
  2. Choose Next.

Set output and scheduling

  1. Review the configuration and choose Create crawler.

Create crawler

  1. On the Crawlers page, select the crawler you created and choose Run.

Run crawler

You can rerun this crawler when new tags are added to your AWS resources, so that they’re available for you to query from the Athena database.

Run Athena queries

Now you’re ready to query the enriched VPC flow logs from Athena.

  1. On the Athena console, open the query editor.
  2. For Database, choose the database you created.
  3. Enter the query as shown in the following screenshot and choose Run.

Athena query

The following code shows some of the sample queries you can run:

Select * from awslogs where "dst-az-id"='us-east-1a'
Select * from awslogs where "src-tag-project"='Log Analytics' or "dst-tag-team"='Engineering' 
Select "srcaddr", "srcport", "dstaddr", "dstport", "region", "az-id", "dst-az-id", "flow-direction" from awslogs where "az-id"='use1-az2' and "dst-az-id"='us-east-1a'

The following screenshot shows an example query result of the source Availability Zone to the destination Availability Zone traffic.

Athena query result

You can also visualize various charts for the flow logs stored in the S3 bucket via QuickSight. For more information, refer to Analyzing VPC Flow Logs using Amazon Athena, and Amazon QuickSight.

Pricing

For pricing details, refer to Amazon Kinesis Data Firehose pricing.

Clean up

To clean up your resources, complete the following steps:

  1. Delete the Kinesis Data Firehose delivery stream and associated IAM role and policies.
  2. Delete the target S3 bucket.
  3. Delete the VPC flow log subscription.
  4. Delete the Lambda function and associated IAM role and policy.

Conclusion

This post provided a complete serverless solution architecture for enriching VPC flow log records with additional information like resource tags using a Kinesis Data Firehose delivery stream and Lambda function to process logs to enrich with metadata and store in a target S3 file. This solution can help you query, analyze, and visualize VPC flow logs with relevant application metadata because resource tags have been assigned to resources that are available in the logs. This meaningful information associated with each log record wherever the tags are available makes it easy to associate log information to your application.

We encourage you to follow the steps provided in this post to create a delivery stream, integrate with your VPC flow logs, and create a Lambda function to enrich the flow log records with additional metadata to more easily understand and analyze traffic patterns in your environment.


About the Authors

Chaitanya Shah is a Sr. Technical Account Manager with AWS, based out of New York. He has over 22 years of experience working with enterprise customers. He loves to code and actively contributes to AWS solutions labs to help customers solve complex problems. He provides guidance to AWS customers on best practices for their AWS Cloud migrations. He is also specialized in AWS data transfer and in the data and analytics domain.

Vaibhav Katkade is a Senior Product Manager in the Amazon VPC team. He is interested in areas of network security and cloud networking operations. Outside of work, he enjoys cooking and the outdoors.

Building highly resilient applications with on-premises interdependencies using AWS Local Zones

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/building-highly-resilient-applications-with-on-premises-interdependencies-using-aws-local-zones/

This blog post is written by Rachel Rui Liu, Senior Solutions Architect.

AWS Local Zones are a type of infrastructure deployment that places compute, storage, database, and other select AWS services close to large population and industry centers.

Following the successful launch of the AWS Local Zones in 16 US cities since 2019, in Feb 2022, AWS announced plans to launch new AWS Local Zones in 32 metropolitan areas in 26 countries worldwide.

With Local Zones, we’ve seen use cases in two common categories.

The first category of use cases is for workloads that require extremely low latency between end-user devices and workload servers. For example, let’s consider media content creation and real-time multiplayer gaming. For these use cases, deploying the workload to a Local Zone can help achieve down to single-digit milliseconds latency between end-user devices and the AWS infrastructure, which is ideal for a good end-user experience.

This post will focus on addressing the second category of use cases, which is commonly seen in an enterprise hybrid architecture, where customers must achieve low latency between AWS infrastructure and existing on-premises data centers.  Compared to the first category of use cases, these use cases can tolerate slightly higher latency between the end-user devices and the AWS infrastructure. However, these workloads have dependencies to these on-premises systems, so the lowest possible latency between AWS infrastructure and on-premises data centers is required for better application performance. Here are a few examples of these systems:

  • Financial services sector mainframe workloads hosted on premises serving regional customers.
  • Enterprise Active Directory hosted on premise serving cloud and on-premises workloads.
  • Enterprise applications hosted on premises processing a high volume of locally generated data.

For workloads deployed in AWS, the time taken for each interaction with components still hosted in the on-premises data center is increased by the latency. In turn, this delays responses received by the end-user. The total latency accumulates and results in suboptimal user experiences.

By deploying modernized workloads in Local Zones, you can reduce latency while continuing to access systems hosted in on-premises data centers, thereby reducing the total latency for the end-user. At the same time, you can enjoy the benefits of agility, elasticity, and security offered by AWS, and can apply the same automation, compliance, and security best practices that you’ve been familiar with in the AWS Regions.

Enterprise workload resiliency with Local Zones

While designing hybrid architectures with Local Zones, resiliency is an important consideration. You want to route traffic to the nearest Local Zone for low latency. However, when disasters happen, it’s critical to fail over to the parent Region automatically.

Let’s look at the details of hybrid architecture design based on real world deployments from different angles to understand how the architecture achieves all of the design goals.

Hybrid architecture with resilient network connectivity

The following diagram shows a high-level overview of a resilient enterprise hybrid architecture with Local Zones, where you have redundant connections between the AWS Region, the Local Zone, and the corporate data center.

resillient network connectivity

Here are a few key points with this network connectivity design:

  1. Use AWS Direct Connect or Site-to-Site VPN to connect the corporate data center and AWS Region.
  2. Use Direct Connect or self-hosted VPN to connect the corporate data center and the Local Zone. This connection will provide dedicated low-latency connectivity between the Local Zone and corporate data center.
  3. Transit Gateway is a regional service. When attaching the VPC to AWS Transit Gateway, you can only add subnets provisioned in the Region. Instances on subnets in the Local Zone can still use Transit Gateway to reach resources in the Region.
  4. For subnets provisioned in the Region, the VPC route table should be configured to route the traffic to the corporate data center via Transit Gateway.
  5. For subnets provisioned in Local Zone, the VPC route table should be configured to route the traffic to the corporate data center via the self-hosted VPN instance or Direct Connect.

Hybrid architecture with resilient workload deployment

The next examples show a public and a private facing workload.

To simplify the diagram and focus on application layer architecture, the following diagrams assume that you are using Direct Connect to connect between AWS and the on-premises data center.

Example 1: Resilient public facing workload

With a public facing workload, end-user traffic will be routed to the Local Zone. If the Local Zone is unavailable, then the traffic will be routed to the Region automatically using an Amazon Route 53 failover policy.

public facing workload resilliency
Here are the key design considerations for this architecture:

  1. Deploy the workload in the Local Zone and put the compute layer in an AWS AutoScaling Group, so that the application can scale up and down depending on volume of requests.
  2. Deploy the workload in both the Local Zone and an AWS Region, and put the compute layer into an autoscaling group. The regional deployment will act as pilot light or warm standby with minimal footprint. But it can scale out when the Local Zone is unavailable.
  3. Two Application Load Balancers (ALBs) are required: one in the Region and one in the Local Zone. Each ALB will dispatch the traffic to each workload cluster inside the autoscaling group local to it.
  4. An internet gateway is required for public facing workloads. When using a Local Zone, there’s no extra configuration needed: define a single internet gateway and attach it to the VPC.

If you want to specify an Elastic IP address to be the workload’s public endpoint, the Local Zone will have a different address pool than the Region. Noting that BYOIP is unsupported for Local Zones.

  1. Create a Route 53 DNS record with “Failover” as the routing policy.
  • For the primary record, point it to the alias of the ALB in the Local Zone. This will set Local Zone as the preferred destination for the application traffic which minimizes latency for end-users.
  • For the secondary record, point it to the alias of the ALB in the AWS Region.
  • Enable health check for the primary record. If health check against the primary record fails, which indicates that the workload deployed in the Local Zone has failed to respond, then Route 53 will automatically point to the secondary record, which is the workload deployed in the AWS Region.

Example 2: Resilient private workload

For a private workload that’s only accessible by internal users, a few extra considerations must be made to keep the traffic inside of the trusted private network.

private workload resilliency

The architecture for resilient private facing workload has the same steps as public facing workload, but with some key differences. These include:

  1. Instead of using a public hosted zone, create private hosted zones in Route 53 to respond to DNS queries for the workload.
  2. Create the primary and secondary records in Route 53 just like the public workload but referencing the private ALBs.
  3. To allow end-users onto the corporate network (within offices or connected via VPN) to resolve the workload, use the Route 53 Resolver with an inbound endpoint. This allows end-users located on-premises to resolve the records in the private hosted zone. Route 53 Resolver is designed to be integrated with an on-premises DNS server.
  4. No internet gateway is required for hosting the private workload. You might need an internet gateway in the Local Zone for other purposes: for example, to host a self-managed VPN solution to connect the Local Zone with the corporate data center.

Hosting multiple workloads

Customers who host multiple workloads in a single VPC generally must consider how to segregate those workloads. As with workloads in the AWS Region, segregation can be implemented at a subnet or VPC level.

If you want to segregate workloads at the subnet level, you can extend your existing VPC architecture by provisioning extra sets of subnets to the Local Zone.

segregate workloads at subnet level

Although not shown in the diagram, for those of you using a self-hosted VPN to connect the Local Zone with an on-premises data center, the VPN solution can be deployed in a centralized subnet.

You can continue to use security groups, network access control lists (NACLs) , and VPC route tables – just as you would in the Region to segregate the workloads.

If you want to segregate workloads at the VPC level, like many of our customers do, within the Region, inter-VPC routing is generally handled by Transit Gateway. However, in this case, it may be undesirable to send traffic to the Region to reach a subnet in another VPC that is also extended to the Local Zone.

segregate workloads at VPC level

Key considerations for this design are as follows:

  1. Direct Connect is deployed to connect the Local Zone with the corporate data center. Therefore, each VPC will have a dedicated Virtual Private Gateway provisioned to allow association with the Direct Connect Gateway.
  2. To enable inter-VPC traffic within the Local Zone, peer the two VPCs together.
  3. Create a VPC route table in VPC A. Add a route for Subnet Y where the destination is the peering link. Assign this route table to Subnet X.
  4. Create a VPC route table in VPC B. Add a route for Subnet X where the destination is the peering link. Assign this route table to Subnet Y.
  5. If necessary, add routes for on-premises networks and the transit gateway to both route tables.

This design allows traffic between subnets X and Y to stay within the Local Zone, thereby avoiding any latency from the Local Zone to the AWS Region while still permitting full connectivity to all other networks.

Conclusion

In this post, we summarized the use cases for enterprise hybrid architecture with Local Zones, and showed you:

  • Reference architectures to host workloads in Local Zones with low-latency connectivity to corporate data centers and resiliency to enable fail over to the AWS Region automatically.
  • Different design considerations for public and private facing workloads utilizing this hybrid architecture.
  • Segregation and connectivity considerations when extending this hybrid architecture to host multiple workloads.

Hopefully you will be able to follow along with these reference architectures to build and run highly resilient applications with local system interdependencies using Local Zones.

Deploying Local Gateway Ingress Routing on AWS Outposts

Post Syndicated from Sheila Busser original https://aws.amazon.com/blogs/compute/deploying-local-gateway-ingress-routing-on-aws-outposts/

This post is written by Leonardo Solano, Senior Hybrid Cloud Solution Architect and Chris Lunsford, Senior Specialist Solutions Architect, AWS Outposts.

AWS Outposts lets customers use the same Amazon Virtual Private Cloud (VPC) security mechanisms, such as security groups and network access control lists, to control traffic flows for on-premises applications running on Outposts. Some customers, desiring additional security or consistency with on-premises systems, want the ability to inspect and filter incoming application traffic as it enters the Outpost. Ideally, they would like to deploy virtual appliances in front of the workloads running on Outposts.

Today, we are announcing a new feature called Outposts local lateway (LGW) ingress routing. This lets you create LGW inbound routes to redirect incoming traffic to an Amazon Elastic Compute Cloud (EC2) Elastic Network Interface (ENI) associated with an EC2 instance running on Outposts rack. The traffic is redirected for inspection before it reaches the workloads running on Outposts rack. Moreover, it lets the EC2 virtual appliance inspect, filter, or optimize the traffic in a similar way as VPC ingress routing in the Region.

Use case

A common use case for this feature is deploying a customer-preferred third-party virtual network appliance. The appliance can inspect, modify, or monitor the incoming traffic for policy compliance and forward compliant traffic on to the workloads running on the Outpost. A typical virtual appliance could be a firewall, intrusion detection system (IDS), or intrusion prevention system (IPS). The features provided by the virtual appliances vary, and they may include deep packet inspection, traffic optimization, and flow monitoring. This new Outposts rack feature modifies the default behavior of the local gateway routing table (LGW-RTB), and it lets customers redirect traffic coming into an Outposts deployment to the virtual appliance.

 Local Gateway Ingress Routing on Outposts Architecture

The new behavior?

Now you can create static routes in the LGW-RTB that target a specific ENI on the Outpost as the next hop. These static routes are propagated toward the customer network through the Border Gateway Protocol (BGP) peering sessions with the Customer Networking Devices. The on-premises network will route traffic to the specified Classless Inter-Domain Routing (CIDR) prefixes, as defined in the static routes, toward the Outposts Network Devices.

 Local Gateway Routing Table

In the preceeding diagram, the static route 198.19.33.248/29 has a longer prefix length than 198.19.33.240/28, and both routes will be propagated toward the customer network via BGP. The incoming traffic for the 198.19.33.248/29 prefix will be directed toward the ENI eni-1234example0. The architecture looks like the following diagram, where the security virtual appliance is seated between the LGW and a set of EC2 instances in Outposts.

Local Gateway Advertised routes

As ingress traffic is routed through the virtual appliance for inspection and filtering, the destination addresses of packets arriving at the ENI of the virtual appliance won’t match its ENI’s private IP address (the packets are transiting the instance). By default, the ENI will drop the inbound traffic unless you disable source/destination checking on the virtual appliance instance ENI settings. The following screenshot shows how you can disable the EC2 instance source/destination checking in the AWS console.

(aka, source-destination-check.png) . EC2 source/destination Check

Considerations for LGW ingress routing

Consider the following requirements when preparing to deploy LGW ingress routing:

  • The ENIs used as the next-hop target must be deployed in an Outposts Subnet.
  • The subnets must belong to a VPC associated with the LGW-RTB.
  • Routes with the longest matches are prioritized. If there are two with the same destination CIDR, then static routes are preferred over propagated ones.

Working with Outposts LGW ingress routing

The following output shows what the LGW route table looks like before applying the ingress routing feature:

{
    "Routes": [
        {
            "DestinationCidrBlock": "0.0.0.0/0",
            "LocalGatewayVirtualInterfaceGroupId": "lgw-vif-grp-XXX",
            "Type": "static",
            "State": "active",
            "LocalGatewayRouteTableId": "lgw-rtb-XXX",
            "LocalGatewayRouteTableArn": "arn:aws:ec2:>AWS-REGION>:<account-id>:local-gateway-route-table/lgw-rtb-XXX",
            "OwnerId": "<account-id>"
        },
        {
            "DestinationCidrBlock": "198.19.33.16/28",
            "CoipPoolId": "coip-pool-0000aaaabbbbcccc1111",
            "Type": "propagated",
            "State": "active",
            "LocalGatewayRouteTableId": "lgw-rtb-XXX",
            "LocalGatewayRouteTableArn": "arn:aws:ec2:<AWS-REGION>:<account-id>:local-gateway-route-table/lgw-XXX",
            "OwnerId": "<account-id>"
        },
        {
            "DestinationCidrBlock": "198.19.33.240/28",
            "CoipPoolId": "coip-pool-0000aaaabbbbcccc2222",
            "Type": "propagated",
            "State": "active",
            "LocalGatewayRouteTableId": "lgw-rtb-XXX",
            "LocalGatewayRouteTableArn": "arn:aws:ec2:<AWS-REGION>:<account-id>:local-gateway-route-table/lgw-XXX",
            "OwnerId": "<account-id>"
        }
     ]
}

The relevant change under an LGW-RTB before to add a local-gateway-route is the presence of the “propagated routes”. This represents the Outposts Subnets that can’t be deleted or modified with Next-Hop as specific ENIs present in Outposts. In the following section, we will cover how it will look after the creation of a local-gateway-route.

Configuring LGW ingress routing

To configure LGW ingress routing, you must provide the LGW route table ID, the ENI ID that will be utilized as a next-hop, and the destination CIDR block. Once you have identified those three parameters, you can configure LGW ingress routing via the This is shown in the following example, where the prefix 198.19.33.248/29 is routed to an Outpost. If the route points to an ENI attached to an instance, then the route will show as active. If the route points to an ENI that isn’t attached to an EC2 instance, then the route will show a blackhole state.

$ aws ec2 create-local-gateway-route \
  --local-gateway-route-table-id <lgw-rtb-id> \
  --network-interface-id <eni-id> \
  --destination-cidr-block 198.19.33.248/29
  
{
    "Route": {
        "DestinationCidrBlock": "198.19.33.248/29",
        "NetworkInterfaceId": "eni-id",
        "Type": "static",
        "State": "active",
        "LocalGatewayRouteTableId": "lgw-rtb-id",
        "LocalGatewayRouteTableArn": "arn:aws:ec2:<AWS-REGION>:<account-id>:local-gateway-route-table/<lgw-rtb-id>",
        "OwnerId": "<account-id>"
    }
}

Once LGW ingress routing has been configured, the LGW will route traffic destined to the 198.19.33.248/29 prefix to the target ENI. This must be present as part of the Outposts subnets. Note that the segment 198.19.33.248/29 is part of the Outposts CIDR range of 198.19.33.240/28. This belongs, in this case, to the Outposts customer-owned IP address (CoIP) CIDRs. When traffic follows a static route to an ENI, the packet destination address is preserved and isn’t translated to the private address of the ENI.

In this case, the new LGW-RTB will look like the following:

{
    "Routes": [
        {
            "DestinationCidrBlock": "0.0.0.0/0",
            "LocalGatewayVirtualInterfaceGroupId": "lgw-vif-grp-XXX",
            "Type": "static",
            "State": "active",
            "LocalGatewayRouteTableId": "lgw-rtb-XXX",
            "LocalGatewayRouteTableArn": "arn:aws:ec2:<AWS-REGION>:<account-id>:local-gateway-route-table/lgw-rtb-XXX",
            "OwnerId": "<account-id>"
        },
        {
            "DestinationCidrBlock": "198.19.33.16/28",
            "CoipPoolId": "coip-pool-0000aaaabbbbcccc1111",
            "Type": "propagated",
            "State": "active",
            "LocalGatewayRouteTableId": "lgw-rtb-XXX",
            "LocalGatewayRouteTableArn": "arn:aws:ec2:<AWS-REGION>:<account-id>:local-gateway-route-table/lgw-XXX",
            "OwnerId": "<account-id>"
        },
        {
            "DestinationCidrBlock": "198.19.33.240/28",
            "CoipPoolId": "coip-pool-0000aaaabbbbcccc1111",
            "Type": "propagated",
            "State": "active",
            "LocalGatewayRouteTableId": "lgw-rtb-XXX",
            "LocalGatewayRouteTableArn": "arn:aws:ec2:<AWS-REGION>:<account-id>:local-gateway-route-table/lgw-XXX",
            "OwnerId": "<account-id>"
        },
         {
            "DestinationCidrBlock": "198.19.33.248/29",
            "NetworkInterfaceId": "eni-XXX",
            "Type": "static",
            "State": "active",
            "LocalGatewayRouteTableId": "lgw-rtb-XXX",
            "LocalGatewayRouteTableArn": "arn:aws:ec2:<AWS-REGION>:<account-id>:local-gateway-route-table/lgw-rtb-XXX",
            "OwnerId": "<account-id>"
        }
     ]
}

In the AWS console, the LGW-RTB will show the new ingress routing route:

 (aka, LWG-RTB) Console Local Gateway Routing Table

Modifying LGW ingress routing

Utilize a similar AWS CLI command to the one that we used previously to create the LGW ingress routing route to modify existing routes. In this case, the command will be aws ec2 modify-local-gateway-route, and the arguments are the same as with the create command. Use this command when you want to shift inbound traffic from one EC2 instance to another – perhaps from an active to a standby network appliance while you perform required maintenance on the primary instance.

$ aws ec2 modify-local-gateway-route \
  --local-gateway-route-table-id <lgw-rtb-id> \
  --network-interface-id <new-eni-id> \
  --destination-cidr-block 198.19.33.248/29
{
    "Route": {
        "DestinationCidrBlock": "198.19.33.248/29",
        "NetworkInterfaceId": "new-eni-id",
        "Type": "static",
        "State": "active",
        "LocalGatewayRouteTableId": "lgw-rtb-id",
        "LocalGatewayRouteTableArn": "arn:aws:ec2:<AWS-REGION>:<account-id>:local-gateway-route-table/<lgw-rtb-id>",
        "OwnerId": "<account-id>"
    }
}

Conclusion

AWS Outposts LGW ingress routing allows AWS customers and partners to deploy virtual appliances on Outposts rack and direct inbound traffic through those appliances. The virtual appliance can inspect, filter, and optimize the ingress traffic before forwarding it on to the workloads running on Outposts rack, creating fine-grained network and security policies for your workloads. To learn more about AWS Outposts rack, visit the product overview page.

How to automate updates for your domain list in Route 53 Resolver DNS Firewall

Post Syndicated from Guillaume Neau original https://aws.amazon.com/blogs/security/how-to-automate-updates-for-your-domain-list-in-route-53-resolver-dns-firewall/

Note: This post includes links to third-party websites. AWS is not responsible for the content on those websites.


Following the release of Amazon Route 53 Resolver DNS Firewall, Amazon Web Services (AWS) published several blog posts to help you protect your Amazon Virtual Private Cloud (Amazon VPC) DNS resolution, including How to Get Started with Amazon Route 53 Resolver DNS Firewall for Amazon VPC and Secure your Amazon VPC DNS resolution with Amazon Route 53 Resolver DNS Firewall. Route 53 Resolver DNS Firewall provides managed domain lists that are fully maintained and kept up-to-date by AWS and that directly benefit from the threat intelligence that we gather, but you might want to create or import your own list to have full control over the DNS filtering.

In this blog post, you will find a solution to automate the management of your domain list by using AWS Lambda, Amazon EventBridge, and Amazon Simple Storage Service (Amazon S3). The solution in this post uses, as an example, the URLhaus open Response Policy Zone (RPZ) list, which generates a new file every five minutes.

Architecture overview

The solution is made of the following four components, as shown in Figure 1.

  1. An EventBridge scheduled rule to invoke the Lambda function on a schedule.
  2. A Lambda function that uses the AWS SDK to perform the automation logic.
  3. An S3 bucket to temporarily store the list of domains retrieved.
  4. Amazon Route 53 Resolver DNS Firewall.
    Figure 1: Architecture overview

    Figure 1: Architecture overview

After the solution is deployed, it works as follows:

  1. The scheduled rule invokes the Lambda function every 5 minutes to fetch the latest domain list available.
  2. The Lambda function fetches the list from URLhaus, parses the data retrieved, formats the data, uploads the list of domains into the S3 bucket, and invokes the Route 53 Resolver DNS Firewall importFirewallDomains API action.
  3. The domain list is then updated.

Implementation steps

As a first step, create your own domain list on the Route 53 Resolver DNS Firewall. Having your own domain list allows you to have full control of the list of domains to which you want to apply actions, as defined within rule groups.

To create your own domain list

  1. In the Route 53 console, in the left menu, choose Domain lists in the DNS firewall section.
  2. Choose the Add domain list button, enter a name for your owned domain list, and then enter a placeholder domain to initialize the domain list.
  3. Choose Add domain list to finalize the creation of the domain list.
    Figure 2: Expected view of the console

    Figure 2: Expected view of the console

The list from URLhaus contains more than a thousand records. You will use the ImportFirewallDomains endpoint to upload this list to DNS Firewall. The use of the ImportFirewallDomains endpoint requires that you first upload the list of domains and make the list available in an S3 bucket that is located in the same AWS Region as the owned domain list that you just created.

To create the S3 bucket

  1. In the S3 console, choose Create bucket.
  2. Under General configuration, configure the AWS Region option to be the same as the Region in which you created your domain list.
  3. Finalize the configuration of your S3 bucket, and then choose Create bucket.

Because a new file is created every five minutes, we recommend setting a lifecycle rule to automatically expire and delete files after 24 hours to optimize for cost and only save the most recent lists.

To create the Lambda function

  1. Follow the steps in the topic Creating an execution role in the IAM console to create an execution role. After step 4, when you configure permissions, choose Create Policy, and then create and add an IAM policy similar to the following example. This policy needs to:
    • Allow the Lambda function to put logs in Amazon CloudWatch.
    • Allow the Lambda function to have read and write access to objects placed in the created S3 bucket.
    • Allow the Lambda function to update the firewall domain list.
    • {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Action": [
                      "logs:CreateLogGroup",
                      "logs:CreateLogStream",
                      "logs:PutLogEvents"
                  ],
                  "Resource": "arn:aws:logs:<region>:<accountId>:*",
                  "Effect": "Allow"
              },
              {
                  "Action": [
                      "s3:PutObject",
                      "s3:GetObject"
                  ],
                  "Resource": "arn:aws:s3:::<DNSFW-BUCKET-NAME>/*",
                  "Effect": "Allow"
              },
              {
                  "Action": [
                      "route53resolver:ImportFirewallDomains"
                  ],
                  "Resource": "arn:aws:route53resolver:<region>:<accountId>:firewall-domain-list/<domain-list-id>",
                  "Effect": "Allow"
              }
          ]
      }

  2. (Optional) If you decide to use the example provided by AWS:
    • After cloning the repository: Build the layer following the instruction included in the readme.md and the provided script.
    • Zip the lambda.
    • In the left menu, select Layers then Create Layer. Enter a name for the layer, then select Upload a .zip file. Choose to upload the layer (node-axios-layer.zip).
    • As a compatible runtime, select: Node.js 16.x.
    • Select Create
  3. In the Lambda console, in the same Region as your domain list, choose Create function, and then do the following:
    • Choose your desired runtime and architecture.
    • (Optional) To use the code provided by AWS: Select Node.js 16.x as the runtime.
    • Choose Change the default execution role.
    • Choose Use an existing role, and then pick the role that you just created.
  4. After the Lambda function is created, in the left menu of the Lambda console, choose Functions, and then select the function you created.
    • For Code source, you can either enter the code of the Lambda function or choose the Upload from button and then choose the source for the code. AWS provides an example of functioning code on GitHub under a MIT-0 license.

    (optional) To use the code provided by AWS:

    • Choose the Upload from button and upload the zipped code example.
    • After the code is uploaded, edit the default Runtime settings: Choose the Edit button and set the handler to be equal to: LambdaRpz.handler
    • Edit the default Layers configuration, choose the Add a layer button, select Specify an ARN and enter the ARN of the layer created during the optional step 2.
    • Edit the environment variables of the function: Select the Edit button and define the three following variables:
      1. Key : FirewallDomainListId | Value : <domain-list-id>
      2. Key : region | Value : <region>
      3. Key : s3Prefix | Value : <DNSFW-BUCKET-NAME>

The code that you place in the function will be able to fetch the list from URLhaus, upload the list as a file to S3, and start the import of domains.

For the Lambda function to be invoked every 5 minutes, next you will create a scheduled rule with Amazon EventBridge.

To automate the invoking of the Lambda function

  1. In the EventBridge console, in the same AWS Region as your domain list, choose Create rule.
  2. For Rule type, choose Schedule.
  3. For Schedule pattern, select the option A schedule that runs at a regular rate, such as every 10 minutes, and under Rate expression set a rate of 5 minutes.
    Figure 3: Console view when configuring a schedule

    Figure 3: Console view when configuring a schedule

  4. To select the target, choose AWS service, choose Lambda function, and then select the function that you previously created.

After the solution is deployed, your domain list will be updated every 5 minutes and look like the view in Figure 4.

Figure 4: Console view of the created domain list after it has been updated by the Lambda function

Figure 4: Console view of the created domain list after it has been updated by the Lambda function

Code samples

You can use the samples in the amazon-route-53-resolver-firewall-automation-examples-2 GitHub repository to ease the automation of your domain list, and the associated updates. The repository contains script files to help you with the deployment process of the AWS CloudFormation template. Note that you need to have the AWS Command Line Interface (AWS CLI) installed and properly configured in order to use the files.

To deploy the CloudFormation stack

  1. If you haven’t done so already, create an S3 bucket to store the artifacts in the Region where you wish to deploy. This name of this bucket will then be referenced as ParamS3ArtifactBucket with a value of <DOC-EXAMPLE-BUCKET-ARTIFACT>
  2. Clone the repository locally.
    git clone https://github.com/aws-samples/amazon-route-53-resolver-firewall-automation-examples-2
  3. Build the Lambda function layer. From the /layer folder, use the provided script.
    . ./build-layer.sh
  4. Zip and upload the artifact to the bucket created in step 1. From the root folder, use the provided script.
    . ./zipupload.sh <ParamS3ArtifactBucket>
  5. Deploy the AWS CloudFormation stack by using either the AWS CLI or the CloudFormation console.
    • To deploy by using the AWS CLI, from the root folder, type the following command, making sure to replace <region>, <DOC-EXAMPLE-BUCKET-ARTIFACT>, <DNSFW-BUCKET-NAME>, and <DomainListName>with your own values.
      aws --region <region> cloudformation create-stack --stack-name DNSFWStack --capabilities CAPABILITY_NAMED_IAM --template-body file://./DNSFWStack.cfn.yaml --parameters ParameterKey=ParamS3ArtifactBucket,ParameterValue=<DOC-EXAMPLE-BUCKET-ARTIFACT> ParameterKey=ParamS3RpzBucket,ParameterValue=<DNSFW-BUCKET-NAME> ParameterKey=ParamFirewallDomainListName,ParameterValue=<DomainListName>

    • To deploy by using the console, do the following:
      1. In the CloudFormation console, choose Create stack, and then choose With new resources (standard).
      2. On the creation screen, choose Template is ready, and upload the provided DNSFWStack.cfn.yaml file.
      3. Enter a stack name and configure the requested parameters with your desired configuration and outcomes. These parameters include the following:
        • The name of your firewall domain list.
        • The name of the S3 bucket that contains Lambda artifacts.
        • The name of the S3 bucket that will be created to contain the files with the domain information from URLhaus.
      4. Acknowledge that the template requires IAM permission because it will create the role for the Lambda function and manage its IAM policy, and then choose Create stack.

After a few minutes, all the resources should be created and the CloudFormation stack is now deployed. After 5 minutes, your domain list should be updated, as shown in Figure 5.

Figure 5: Console view of CloudFormation after the stack has been deployed

Figure 5: Console view of CloudFormation after the stack has been deployed

Conclusions and cost

In this blog post, you learned about creating and automating the update of a domain list that you fully control. To go further, you can extend and replicate the architecture pattern to fetch domain names from other sources by editing the source code of the Lambda function.

After the solution is in place, in order for the filtering to be effective, you need to create a rule group referencing the domain list and associate the rule group with some of your VPCs.

For cost information, see the AWS Pricing Calculator. This solution will be invoked 60 (minutes) * 24 (hours) * 30 (days) / 5 (minutes) = 8,640 times per month, invoking the Lambda function that will run for an average of 400 minutes, storing an average of 0.5 GB in Amazon S3, and creating a domain list that averages 1,500 domains. According to our public pricing, and without factoring in the AWS Free Tier, this will incur the estimated total cost of $1.43 per month for the filtering of 1 million DNS requests.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

Want more AWS Security news? Follow us on Twitter.

Guillaume Neau

Guillaume Neau

Guillaume is a solutions architect of France with an expertise in information security that focus on building solutions that improve the life of citizens.

AWS and VMware Announce VMware Cloud on AWS integration with Amazon FSx for NetApp ONTAP

Post Syndicated from Veliswa Boya original https://aws.amazon.com/blogs/aws/aws-and-vmware-announce-vmware-cloud-on-aws-integration-with-amazon-fsx-for-netapp-ontap/

Our customers are looking for cost-effective ways to continue to migrate their applications to the cloud. VMware Cloud on AWS is a fully managed, jointly engineered service that brings VMware’s enterprise-class, software-defined data center architecture to the cloud. VMware Cloud on AWS offers our customers the ability to run applications across operationally consistent VMware vSphere-based public, private, and hybrid cloud environments by bringing VMware’s Software-Defined Data Center (SDDC) to AWS.

In 2021, we announced the fully managed shared storage service Amazon FSx for NetApp ONTAP. This service provides our customers with access to the popular features, performance, and APIs of ONTAP file systems with the agility, scalability, security, and resiliency of AWS, making it easier to migrate on-premises applications that rely on network-attached storage (NAS) appliances to AWS.

Today I’m excited to announce the general availability of VMware Cloud on AWS integration with Amazon FSx for NetApp ONTAP. Prior to this announcement, customers could only use VMware VSAN where they could scale datastore capacity with compute. Now, they can scale storage independently and SDDCs can be scaled with the additional storage capacity that is made possible by FSx for NetApp ONTAP.

Customers can already add storage to their SDDCs by purchasing additional hosts or by adding AWS native storage services such as Amazon S3, Amazon EFS, and Amazon FSx for providing storage to virtual machines (VMs) on existing hosts. You may be thinking that nothing about this announcement is new.

Well, with this amazing integration, our customers now have the flexibility to add an external datastore option to support their growing workload needs. If you are running into storage constraints or are continually met with unplanned storage demands, this integration provides a cost-effective way to incrementally add capacity without the need to purchase more hosts. By taking advantage of external datastores through FSx for NetApp ONTAP, you have the flexibility to add more storage capacity when your workloads require it.

An Overview of VMware Cloud on AWS Integration with Amazon FSx for NetApp ONTAP
There are two account connectivity options for enabling storage provisioned by FSx for NetApp ONTAP to be made available for mounting as a datastore to a VMware Cloud on AWS SDDC. Both options use a dedicated Amazon Virtual Private Cloud (Amazon VPC) for the FSx file system to prevent routing conflicts.

The first option is to create a new Amazon VPC under the same connected AWS account and have it connected with the VMware-owned Shadow VPC using VMware Transit Connect. The diagram below shows the architecture of this option:

The first option is to enable storage under the same customer-owned account

The first option is to enable storage under the same AWS connected account

The second option is to create a new AWS account, which by default comes with an Amazon VPC for the Region. Similar to the first option, VMware Transit Connect is used to attach this new VPC with the VMware-owned Shadow VPC. Here is a diagram showing the architecture of this option:

The second option is to enable storage provisioned by FSx for NetApp ONTAP by creating a new AWS account

The second option is to enable storage by creating a new AWS account

Getting Started with VMware Cloud on AWS Integration with Amazon FSx for NetApp ONTAP
The first step is to create an FSx for NetApp ONTAP file system in your AWS account. The steps that you will follow to do this are the same, whether you’re using the first or second path to provision and mount your NFS datastore.

  1. Open the Amazon FSx service page.
  2. On the dashboard, choose Create file system to start the file system creation wizard.
  3. On the Select file system type page, select Amazon FSx for NetApp ONTAP, and then click Next which takes you to the Create ONTAP file system page. Here select the Standard create method.

The following video shows a complete guide on how to create an FSx for NetApp ONTAP:

The same process can be found in this FSx for ONTAP User Guide.

After the file system is created, locate the NFS IP address under the Storage virtual machines tab. The NFS IP address is the floating IP that is used to manage access between file system nodes, and it is required for configuring VMware Transit Connect.

Location of the NFS IP address under the Storage virtual machines tab - AWS console

Location of the NFS IP address under the Storage virtual machines tab – AWS console

Location of the NFS IP address under the Storage virtual machines tab - AWS console

Location of the NFS IP address under the Storage virtual machines tab – AWS console

You are done with creating the FSx for NetApp ONTAP file system, and now you need to create an SDDC group and configure VMware Transit Connect. In order to do this, you need to navigate between the VMware Cloud Console and the AWS console.

Sign in to the VMware Cloud Console, then go to the SDDC page. Here locate the Actions button and select Create SDDC Group. Once you’ve done this, provide the required data for Name (in the following example I used “FSx SDDC Group” for the name) and Description. For Membership, only include the SDDC in question.

After the SDDC Group is created, it shows up in your list of SDDC Groups. Select the SDDC Group, and then go to the External VPC tab.

External VPC tab Add Account - VMC Console

External VPC tab Add Account – VMC Console

Once you are in the External VPC tab, click the ADD ACCOUNT button, then provide the AWS account that was used to provision the FSx file system, and then click Add.

Now it’s time for you to go back to the AWS console and sign in to the same AWS account where you created your Amazon FSx file system. Here navigate to the Resource Access Manager service page and click the Accept resource share button.

Resource Access Manager service page to access the Accept resource share button - AWS console

Resource Access Manager service page to access the Accept resource share button – AWS console

Return to the VMC Console. By now, the External VPC is in an ASSOCIATED state. This can take several minutes to update.

External VPC tab - VMC Console

External VPC tab – VMC Console

Next, you need to attach a Transit Gateway to the VPC. For this, navigate back to the AWS console. A step-by-step guide can be found in the AWS Transit Gateway documentation.

The following is an example that represents a typical architecture of a VPC attached to a Transit Gateway:

A typical architecture of a VPC attached to a Transit Gateway

A typical architecture of a VPC attached to a Transit Gateway

You are almost at the end of the process. You now need to accept the transit gateway attachment and for this you will navigate back to the VMware Cloud Console.

Accept the Transit Gateway attachment as follows:

  1. Navigating back to the SDDC Group, External VPC tab, select the AWS account ID used for creating your FSx NetApp ONTAP, and click Accept. This process may take a few minutes.
  2. Next, you need to add the routes so that the SDDC can see the FSx file system. This is done on the same External VPC tab, where you will find a table with the VPC. In that table, there is a button called Add Routes. In the Add Route section, add two routes:
    1. The CIDR of the VPC where the FSx file system was deployed.
    2. The floating IP address of the file system.
  3. Click Done to complete the route task.

In the AWS console, create the route back to the SDDC by locating VPC on the VPC service page and navigating to the Route Table as seen below.

VPC service page Route Table navigation - AWS console

VPC service page Route Table navigation – AWS console

Ensure that you have the correct inbound rules for the SDDC Group CIDR by locating Security Groups under VPC and finding the Security Group that is being used (it should be the default one) to allow the inbound rules for SDDC Group CIDR.

Security Groups under VPC that is being used to allow the inbound rules for SDDC Group CIDR

Security Groups under VPC that are being used to allow the inbound rules for SDDC Group CIDR

Lastly, mount the NFS Datastore in the VMware Cloud Console as follows:

  1. Locate your SDDC.
  2. After selecting the SDDC, Navigate to the Storage Tab.
  3. Click Attach Datastore to mount the NFS volume(s).
  4. The next step is to select which hosts in the SDDC to mount the datastore to and click Mount to complete the task.
Attach a new datastore

Attach A New Datastore

Available Today
Amazon FSx for NetApp ONTAP is available today for VMware Cloud on AWS customers in US East (Ohio), US East (N. Virginia), US West (Oregon), Asia Pacific (Mumbai), Asia Pacific (Seoul), Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Canada (Central), Europe (Frankfurt), Europe (Ireland), Europe (London), Europe (Milan), Europe (Paris), Europe (Stockholm), South America (São Paulo), AWS GovCloud (US-East), and AWS GovCloud (US-West).

Veliswa x

Building AWS Lambda governance and guardrails

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/building-aws-lambda-governance-and-guardrails/

When building serverless applications using AWS Lambda, there are a number of considerations regarding security, governance, and compliance. This post highlights how Lambda, as a serverless service, simplifies cloud security and compliance so you can concentrate on your business logic. It covers controls that you can implement for your Lambda workloads to ensure that your applications conform to your organizational requirements.

The Shared Responsibility Model

The AWS Shared Responsibility Model distinguishes between what AWS is responsible for and what customers are responsible for with cloud workloads. AWS is responsible for “Security of the Cloud” where AWS protects the infrastructure that runs all the services offered in the AWS Cloud. Customers are responsible for “Security in the Cloud”, managing and securing their workloads. When building traditional applications, you take on responsibility for many infrastructure services, including operating systems and network configuration.

Traditional application shared responsibility

Traditional application shared responsibility

One major benefit when building serverless applications is shifting more responsibility to AWS so you can concentrate on your business applications. AWS handles managing and patching the underlying servers, operating systems, and networking as part of running the services.

Serverless application shared responsibility

Serverless application shared responsibility

For Lambda, AWS manages the application platform where your code runs, which includes patching and updating the managed language runtimes. This reduces the attack surface while making cloud security simpler. You are responsible for the security of your code and AWS Identity and Access Management (IAM) to the Lambda service and within your function.

Lambda is SOCHIPAAPCI, and ISO-compliant. For more information, see Compliance validation for AWS Lambda and the latest Lambda certification and compliance readiness services in scope.

Lambda isolation

Lambda functions run in separate isolated AWS accounts that are dedicated to the Lambda service. Lambda invokes your code in a secure and isolated runtime environment within the Lambda service account. A runtime environment is a collection of resources running in a dedicated hardware-virtualized Micro Virtual Machines (MVM) on a Lambda worker node.

Lambda workers are bare metalEC2 Nitro instances, which are managed and patched by the Lambda service team. They have a maximum lease lifetime of 14 hours to keep the underlying infrastructure secure and fresh. MVMs are created by Firecracker, an open source virtual machine monitor (VMM) that uses Linux’s Kernel-based Virtual Machine (KVM) to create and manage MVMs securely at scale.

MVMs maintain a strong separation between runtime environments at the virtual machine hardware level, which increases security. Runtime environments are never reused across functions, function versions, or AWS accounts.

Isolation model for AWS Lambda workers

Isolation model for AWS Lambda workers

Network security

Lambda functions always run inside secure Amazon Virtual Private Cloud (Amazon VPCs) owned by the Lambda service. This gives the Lambda function access to AWS services and the public internet. There is no direct network inbound access to Lambda workers, runtime environments, or Lambda functions. All inbound access to a Lambda function only comes via the Lambda Invoke API, which sends the event object to the function handler.

You can configure a Lambda function to connect to private subnets in a VPC in your account if necessary, which you can control with IAM condition keys . The Lambda function still runs inside the Lambda service VPC but sends all network traffic through your VPC. Function outbound traffic comes from your own network address space.

AWS Lambda service VPC with VPC-to-VPC NAT to customer VPC

AWS Lambda service VPC with VPC-to-VPC NAT to customer VPC

To give your VPC-connected function access to the internet, route outbound traffic to a NAT gateway in a public subnet. Connecting a function to a public subnet doesn’t give it internet access or a public IP address, as the function is still running in the Lambda service VPC and then routing network traffic into your VPC.

All internal AWS traffic uses the AWS Global Backbone rather than traversing the internet. You do not need to connect your functions to a VPC to avoid connectivity to AWS services over the internet. VPC connected functions allow you to control and audit outbound network access.

You can use security groups to control outbound traffic for VPC-connected functions and network ACLs to block access to CIDR IP ranges or ports. VPC endpoints allow you to enable private communications with supported AWS services without internet access.

You can use VPC Flow Logs to audit traffic going to and from network interfaces in your VPC.

Runtime environment re-use

Each runtime environment processes a single request at a time. After Lambda finishes processing the request, the runtime environment is ready to process an additional request for the same function version. For more information on how Lambda manages runtime environments, see Understanding AWS Lambda scaling and throughput.

Data can persist in the local temporary filesystem path, in globally scoped variables, and in environment variables across subsequent invocations of the same function version. Ensure that you only handle sensitive information within individual invocations of the function by processing it in the function handler, or using local variables. Do not re-use files in the local temporary filesystem to process unencrypted sensitive data. Do not put sensitive or confidential information into Lambda environment variables, tags, or other freeform fields such as Name fields.

For more Lambda security information, see the Lambda security whitepaper.

Multiple accounts

AWS recommends using multiple accounts to isolate your resources because they provide natural boundaries for security, access, and billing. Use AWS Organizations to manage and govern individual member accounts centrally. You can use AWS Control Tower to automate many of the account build steps and apply managed guardrails to govern your environment. These include preventative guardrails to limit actions and detective guardrails to detect and alert on non-compliance resources for remediation.

Lambda access controls

Lambda permissions define what a Lambda function can do, and who or what can invoke the function. Consider the following areas when applying access controls to your Lambda functions to ensure least privilege:

Execution role

Lambda functions have permission to access other AWS resources using execution roles. This is an AWS principal that the Lambda service assumes which grants permissions using identity policy statements assigned to the role. The Lambda service uses this role to fetch and cache temporary security credentials, which are then available as environment variables during a function’s invocation. It may re-use them across different runtime environments that use the same execution role.

Ensure that each function has its own unique role with the minimum set of permissions..

Identity/user policies

IAM identity policies are attached to IAM users, groups, or roles. These policies allow users or callers to perform operations on Lambda functions. You can restrict who can create functions, or control what functions particular users can manage.

Resource policies

Resource policies define what identities have fine-grained inbound access to managed services. For example, you can restrict which Lambda function versions can add events to a specific Amazon EventBridge event bus. You can use resource-based policies on Lambda resources to control what AWS IAM identities and event sources can invoke a specific version or alias of your function. You also use a resource-based policy to allow an AWS service to invoke your function on your behalf. To see which services support resource-based policies, see “AWS services that work with IAM”.

Attribute-based access control (ABAC)

With attribute-based access control (ABAC), you can use tags to control access to your Lambda functions. With ABAC, you can scale an access control strategy by setting granular permissions with tags without requiring permissions updates for every new user or resource as your organization scales. You can also use tag policies with AWS Organizations to standardize tags across resources.

Permissions boundaries

Permissions boundaries are a way to delegate permission management safely. The boundary places a limit on the maximum permissions that a policy can grant. For example, you can use boundary permissions to limit the scope of the execution role to allow only read access to databases. A builder with permission to manage a function or with write access to the applications code repository cannot escalate the permissions beyond the boundary to allow write access.

Service control policies

When using AWS Organizations, you can use Service control policies (SCPs) to manage permissions in your organization. These provide guardrails for what actions IAM users and roles within the organization root or OUs can do. For more information, see the AWS Organizations documentation, which includes example service control policies.

Code signing

As you are responsible for the code that runs in your Lambda functions, you can ensure that only trusted code runs by using code signing with the AWS Signer service. AWS Signer digitally signs your code packages and Lambda validates the code package before accepting the deployment, which can be part of your automated software deployment process.

Auditing Lambda configuration, permissions and access

You should audit access and permissions regularly to ensure that your workloads are secure. Use the IAM console to view when an IAM role was last used.

IAM last used

IAM last used

IAM access advisor

Use IAM access advisor on the Access Advisor tab in the IAM console to review when was the last time an AWS service was used from a specific IAM user or role. You can use this to remove IAM policies and access from your IAM roles.

IAM access advisor

IAM access advisor

AWS CloudTrail

AWS CloudTrail helps you monitor, log, and retain account activity to provide a complete event history of actions across your AWS infrastructure. You can monitor Lambda API actions to ensure that only appropriate actions are made against your Lambda functions. These include CreateFunction, DeleteFunction, CreateEventSourceMapping, AddPermission, UpdateEventSourceMapping,  UpdateFunctionConfiguration, and UpdateFunctionCode.

AWS CloudTrail

AWS CloudTrail

IAM Access Analyzer

You can validate policies using IAM Access Analyzer, which provides over 100 policy checks with security warnings for overly permissive policies. To learn more about policy checks provided by IAM Access Analyzer, see “IAM Access Analyzer policy validation”.

You can also generate IAM policies based on access activity from CloudTrail logs, which contain the permissions that the role used in your specified date range.

IAM Access Analyzer

IAM Access Analyzer

AWS Config

AWS Config provides you with a record of the configuration history of your AWS resources. AWS Config monitors the resource configuration and includes rules to alert when they fall into a non-compliant state.

For Lambda, you can track and alert on changes to your function configuration, along with the IAM execution role. This allows you to gather Lambda function lifecycle data for potential audit and compliance requirements. For more information, see the Lambda Operators Guide.

AWS Config includes Lambda managed config rules such as lambda-concurrency-check, lambda-dlq-check, lambda-function-public-access-prohibited, lambda-function-settings-check, and lambda-inside-vpc. You can also write your own rules.

There are a number of other AWS services to help with security compliance.

  1. AWS Audit Manager: Collect evidence to help you audit your use of cloud services.
  2. Amazon GuardDuty: Detect unexpected and potentially unauthorized activity in your AWS environment.
  3. Amazon Macie: Evaluates your content to identify business-critical or potentially confidential data.
  4. AWS Trusted Advisor: Identify opportunities to improve stability, save money, or help close security gaps.
  5. AWS Security Hub: Provides security checks and recommendations across your organization.

Conclusion

Lambda makes cloud security simpler by taking on more responsibility using the AWS Shared Responsibility Model. Lambda implements strict workload security at scale to isolate your code and prevent network intrusion to your functions. This post provides guidance on assessing and implementing best practices and tools for Lambda to improve your security, governance, and compliance controls. These include permissions, access controls, multiple accounts, and code security. Learn how to audit your function permissions, configuration, and access to ensure that your applications conform to your organizational requirements.

For more serverless learning resources, visit Serverless Land.

Securely retrieving secrets with AWS Lambda

Post Syndicated from Julian Wood original https://aws.amazon.com/blogs/compute/securely-retrieving-secrets-with-aws-lambda/

AWS Lambda functions often need to access secrets, such as certificates, API keys, or database passwords. Storing secrets outside the function code in an external secrets manager helps to avoid exposing secrets in application source code. Using a secrets manager also allows you to audit and control access, and can help with secret rotation. Do not store secrets in Lambda environment variables, as these are visible to anyone who has access to view function configuration.

This post highlights some solutions to store secrets securely and retrieve them from within your Lambda functions.

AWS Partner Network (APN) member Hashicorp provides Vault to secure secrets and application data. Vault allows you to control access to your secrets centrally, across applications, systems, and infrastructure. You can store secrets in Vault and access them from a Lambda function to access a database, for example. The Vault Agent for AWS helps you authenticate with Vault, retrieve the database credentials, and then perform the queries. You can also use the Vault AWS Lambda extension to manage connectivity to Vault.

AWS Systems Manager Parameter Store enables you to store configuration data securely, including secrets, as parameter values. For information on Parameter Store pricing, see the documentation.

AWS Secrets Manager allows you to replace hardcoded credentials in your code with an API call to Secrets Manager to retrieve the secret programmatically. You can generate, protect, rotate, manage, and retrieve secrets throughout their lifecycle. By default, Secrets Manager does not write or cache the secret to persistent storage. Secrets Manager supports cross-account access to secrets. For information on Secrets Manager pricing, see the documentation.

Parameter Store integrates directly with Secrets Manager as a pass-through service for references to Secrets Manager secrets. Use this integration if you prefer using Parameter Store as a consistent solution for calling and referencing secrets across your applications. For more information, see “Referencing AWS Secrets Manager secrets from Parameter Store parameters.”

For an example application to show Secrets Manager functionality, deploy the example detailed in “How to securely provide database credentials to Lambda functions by using AWS Secrets Manager”.

When to retrieve secrets

When Lambda first invokes your function, it creates a runtime environment. It runs the function’s initialization (init) code, which is the code outside the main handler. Lambda then runs the function handler code as the invocation. This receives the event payload and processes your business logic. Subsequent invocations can use the same runtime environment.

You can retrieve secrets during each function invocation from within your handler code. This ensures that the secret value is always up to date but can lead to increased function duration and cost, as the function calls the secret manager during each invocation. There may also be additional retrieval costs from Secret Manager.

Retrieving secret during each invocation

Retrieving secret during each invocation

You can reduce costs and improve performance by retrieving the secret during the function init process. During subsequent invocations using the same runtime environment, your handler code can use the same secret.

Retrieving secret during function initialization.

Retrieving secret during function initialization.

The Serverless Land pattern example shows how to retrieve a secret during the init phase using Node.js and top-level await.

If a secret may change between subsequent invocations, ensure that your handler can check for the secret validity and, if necessary, retrieve the secret again.

Retrieve changed secret during subsequent invocation.

Retrieve changed secret during subsequent invocation.

You can also use Lambda extensions to retrieve secrets from Secrets Manager, cache them, and automatically refresh the cache based on a time value. The extension retrieves the secret from Secrets Manager before the init process and makes it available via a local HTTP endpoint. The function then retrieves the secret from the local HTTP endpoint, rather than directly from Secrets Manager, increasing performance. You can also share the extension with multiple functions, which can reduce function code. The extension handles refreshing the cache based on a configurable timeout value. This ensures that the function has the updated value, without handling the refresh in your function code, which increases reliability.

Using Lambda extensions to cache and refresh secret.

Using Lambda extensions to cache and refresh secret.

You can deploy the solution using the steps in Cache secrets using AWS Lambda extensions.

Lambda Powertools

Lambda Powertools provides a suite of utilities for Lambda functions to simplify the adoption of serverless best practices. AWS Lambda Powertools for Python and AWS Lambda Powertools for Java both provide a parameters utility that integrates with Secrets Manager.

from aws_lambda_powertools.utilities import parameters
def handler(event, context):
    # Retrieve a single secret
    value = parameters.get_secret("my-secret")
import software.amazon.lambda.powertools.parameters.SecretsProvider;
import software.amazon.lambda.powertools.parameters.ParamManager;

public class AppWithSecrets implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {
    // Get an instance of the Secrets Provider
    SecretsProvider secretsProvider = ParamManager.getSecretsProvider();

    // Retrieve a single secret
    String value = secretsProvider.get("/my/secret");

Rotating secrets

You should rotate secrets to prevent the misuse of your secrets. This helps you to replace long-term secrets with short-term ones, which reduces the risk of compromise.

Secrets Manager has built-in functionality to rotate secrets on demand or according to a schedule. Secrets Manager has native integrations with Amazon RDS, Amazon DocumentDB, and Amazon Redshift, using a Lambda function to manage the rotation process for you. It deploys an AWS CloudFormation stack and populates the function with the Amazon Resource Name (ARN) of the secret. You specify the permissions to rotate the credentials, and how often you want to rotate the secret. You can view and edit Secrets Manager rotation settings in the Secrets Manager console.

Secrets Manager rotation settings

Secrets Manager rotation settings

You can also create your own rotation Lambda function for other services.

Auditing secrets access

You should continually review how applications are using your secrets to ensure that the usage is as you expect. You should also log any changes to them so you can investigate any potential issues, and roll back changes if necessary.

When using Hashicorp Vault, use Audit devices to log all requests and responses to Vault. Audit devices can append logs to a file, write to syslog, or write to a socket.

Secrets Manager supports logging API calls using AWS CloudTrail. CloudTrail monitors and records all API calls for Secrets Manager as events. This includes calls from code calling the Secrets Manager APIs and access via the Secrets Manager console. CloudTrail data is considered sensitive, so you should use AWS KMS encryption to protect it.

The CloudTrail event history shows the requests to secretsmanager.amazonaws.com.

Viewing CloudTrail access to Secrets Manager

Viewing CloudTrail access to Secrets Manager

You can use Amazon EventBridge to respond to alerts based on specific operations recorded in CloudTrail. These include secret rotation or deleted secrets. You can also generate an alert if someone tries to use a version of a secret version while it is pending deletion. This may help identify and alert you when an outdated certificate is used.

Securing secrets

You must tightly control access to secrets because of their sensitive nature. Create AWS Identity and Access Management (IAM) policies and resource policies to enable minimal access to secrets. You can use role-based, as well as attribute-based, access control. This can prevent credentials from being accidentally used or compromised. For more information, see “Authentication and access control for AWS Secrets Manager”.

Secrets Manager supports encryption at rest using AWS Key Management Service (AWS KMS) using keys that you manage. Secrets are encrypted in transit using TLS by default, which requires request signing.

You can access secrets from inside an Amazon Virtual Private Cloud (Amazon VPC) without requiring internet access. Use AWS PrivateLink and configure a Secrets Manager specific VPC endpoint.

Do not store plaintext secrets in Lambda environment variables. Ensure that you do not embed secrets directly in function code, commit these secrets to code repositories, or log the secret to CloudWatch.

Conclusion

Using a secrets manager to store secrets such as certificates, API keys or database passwords helps to avoid exposing secrets in application source code. This post highlights some AWS and third-party solutions, such as Hashicorp Vault, to store secrets securely and retrieve them from within your Lambda functions.

Secrets Manager is the preferred AWS solution for storing and managing secrets. I explain when to retrieve secrets, including using Lambda extensions to cache secrets, which can reduce cost and improve performance.

You can use the Lambda Powertools parameters utility, which integrates with Secrets Manager. Rotating secrets reduces the risk of compromise and you can audit secrets using CloudTrail and respond to alerts using EventBridge. I also cover security considerations for controlling access to your secrets.

For more serverless learning resources, visit Serverless Land.

How Munich Re Automation Solutions Ltd built a digital insurance platform on AWS

Post Syndicated from Sid Singh original https://aws.amazon.com/blogs/architecture/how-munich-re-automation-solutions-ltd-built-a-digital-insurance-platform-on-aws/

Underwriting for life insurance can be quite manual and often time-intensive with lots of re-keying by advisers before underwriting decisions can be made and policies finally issued. In the digital age, people purchasing life insurance want self-service interactions with their prospective insurer. People want speed of transaction with time to cover reduced from days to minutes. While this has been achieved in the general insurance space with online car and home insurance journeys, this is not always the case in the life insurance space. This is where Munich Re Automation Solutions Ltd (MRAS) offers its customers, a competitive edge to shrink the quote-to-fulfilment process using their ALLFINANZ solution.

ALLFINANZ is a cloud-based life insurance and analytics solution to underwrite new life insurance business. It is designed to transform the end consumer’s journey, delivering everything they need to become a policyholder. The core digital services offered to all ALLFINANZ customers include Rulebook Hub, Risk Assessment Interview delivery, Decision Engine, deep analytics (including predictive modeling capabilities), and technical integration services—for example, API integration and SSO integration.

Current state architecture

The ALLFINANZ application began as a traditional three-tier architecture deployed within a datacenter. As MRAS migrated their workload to the AWS cloud, they looked at their regulatory requirements and the technology stack, and decided on the silo model of the multi-tenant SaaS system. Each tenant is provided a dedicated Amazon Virtual Private Cloud (VPC) that holds network and application components, fully isolated from other primary insurers.

As an entry point into the ALLFINANZ environment, MRAS uses Amazon Route 53 to route incoming traffic to the appropriate Amazon VPC. The routing relies on a model where subdomains are assigned to each tenant, for example the subdomain allfinanz.tenant1.munichre.cloud is the subdomain for tenant 1. The diagram below shows the ALLFINANZ architecture. Note: not all links between components are shown here for simplicity.

Current high-level solution architecture for the ALLFINANZ solution

Figure 1. Current high-level solution architecture for the ALLFINANZ solution

  1. The solution uses Route 53 as the DNS service, which provides two entry points to the SaaS solution for MRAS customers:
    • The URL allfinanz.<tenant-id>.munichre.cloud allows user access to the ALLFINANZ Interview Screen (AIS). The AIS can exist as a standalone application, or can be integrated with a customer’s wider digital point-of -sale process.
    • The URL api.allfinanz.<tenant-id>.munichre.cloud is used for accessing the application’s Web services and REST APIs.
  2. Traffic from both entry points flows through the load balancers. While HTTP/S traffic from the application user access entry point flows through an Application Load Balancer (ALB), TCP traffic from the REST API clients flows through a Network Load Balancer (NLB). Transport Layer Security (TLS) termination for user traffic happens at the ALB using certificates provided by the AWS Certificate Manager.  Secure communication over the public network is enforced through TLS validation of the server’s identity.
  3. Unlike application user access traffic, REST API clients use mutual TLS authentication to authenticate a customer’s server. Since NLB doesn’t support mutual TLS, MRAS opted for a solution to pass this traffic to a backend NGINX server for the TLS termination. Mutual TLS is enforced by using self-signed client and server certificates issued by a certificate authority that both the client and the server trust.
  4. Authenticated traffic from ALB and NGINX servers is routed to EC2 instances hosting the application logic. These EC2 instances are hosted in an auto-scaling group spanning two Availability Zones (AZs) to provide high availability and elasticity, therefore, allowing the application to scale to meet fluctuating demand.
  5. Application transactions are persisted in the backend Amazon Relational Database Service MySQL instances. This database layer is configured across multi-AZs, providing high availability and automatic failover.
  6. The application requires the capability to integrate evidence from data sources external to the ALLFINANZ service. This message sharing is enabled through the Amazon MQ managed message broker service for Apache Active MQ.
  7. Amazon CloudWatch is used for end-to-end platform monitoring through logs collection and application and infrastructure metrics and alerts to support ongoing visibility of the health of the application.
  8. Software deployment and associated infrastructure provisioning is automated through infrastructure as code using a combination of Git, Amazon CodeCommit, Ansible, and Terraform.
  9. Amazon GuardDuty continuously monitors the application for malicious activity and delivers detailed security findings for visibility and remediation. GuardDuty also allows MRAS to provide evidence of the application’s strong security posture to meet audit and regulatory requirements.

High availability, resiliency, and security

MRAS deploys their solution across multiple AWS AZs to meet high-availability requirements and ensure operational resiliency. If one AZ has an ongoing event, the solution will remain operational, as there are instances receiving production traffic in another AZ. As described above, this is achieved using ALBs and NLBs to distribute requests to the application subnets across AZs.

The ALLFINANZ solution uses private subnets to segregate core application components and the database storage platform. Security groups provide networking security measures at the elastic network interface level. MRAS restrict access from incoming connection requests to ranges of IP addresses by attaching security groups to the ALBs. Amazon Inspector monitors workloads for software vulnerabilities and unintended network exposure. AWS WAF is integrated with the ALB to protect from SQL injection or cross-site scripting attacks on the application.

Optimizing the existing workload

One of the key benefits of this architecture is that now MRAS can standardize the infrastructure configuration and ensure consistent versioning of the workload across tenants. This makes onboarding new tenants as simple as provisioning another VPC with the same infrastructure footprint.

MRAS are continuing to optimize their architecture iteratively, examining components to modernize to cloud-native components and evolving towards the pool model of multi-tenant SaaS architecture wherever possible. For example, MRAS centralized their per-tenant NAT gateway deployment to a centralized outbound Internet routing design using AWS Transit Gateway, saving approximately 30% on their overall NAT gateway spend.

Conclusion

The AWS global infrastructure has allowed MRAS to serve more than 40 customers in five AWS regions around the world. This solution improves customers’ experience and workload maintainability by standardizing and automating the infrastructure and workload configuration within a SaaS model, compared with multiple versions for the on-premise deployments. SaaS customers are also freed up from the undifferentiated heavy lifting of infrastructure operations, allowing them to focus on their business of underwriting for life insurance.

MRAS used the AWS Well-Architected Framework to assess their architecture and list key recommendations. AWS also offers Well-Architected SaaS Lens and AWS SaaS Factory Program, with a collection of resources to empower and enable insurers at any stage of their SaaS on AWS journey.