Tag Archives: Amazon Route 53

Minimizing Dependencies in a Disaster Recovery Plan

Post Syndicated from Randy DeFauw original https://aws.amazon.com/blogs/architecture/minimizing-dependencies-in-a-disaster-recovery-plan/

The Availability and Beyond whitepaper discusses the concept of static stability for improving resilience. What does static stability mean with regard to a multi-Region disaster recovery (DR) plan? What if the very tools that we rely on for failover are themselves impacted by a DR event?

In this post, you’ll learn how to reduce dependencies in your DR plan and manually control failover even if critical AWS services are disrupted. As a bonus, you’ll see how to use service control policies (SCPs) to help simulate a Regional outage, so that you can test failover scenarios more realistically.

Failover plan dependencies and considerations

Let’s dig into the DR scenario in more detail. Using Amazon Route 53 for Regional failover routing is a common pattern for DR events. In the simplest case, we’ve deployed an application in a primary Region and a backup Region. We have a Route 53 DNS record set with records for both Regions, and all traffic goes to the primary Region. In an event that triggers our DR plan, we manually or automatically switch the DNS records to direct all traffic to the backup Region.
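To make this concrete, here is a minimal boto3 sketch of that failover record pattern. This is an illustration, not configuration from this post; the hosted zone ID, record names, and health check ID are placeholders:

```python
import boto3

route53 = boto3.client("route53")

# Hypothetical values for illustration only.
HOSTED_ZONE_ID = "Z0123456789EXAMPLE"
PRIMARY_HEALTH_CHECK_ID = "11111111-2222-3333-4444-555555555555"

route53.change_resource_record_sets(
    HostedZoneId=HOSTED_ZONE_ID,
    ChangeBatch={
        "Comment": "Primary/backup failover records for app.example.com",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "app.example.com",
                    "Type": "CNAME",
                    "SetIdentifier": "primary-us-east-1",
                    "Failover": "PRIMARY",
                    "TTL": 60,
                    # Primary record fails over when this health check fails.
                    "HealthCheckId": PRIMARY_HEALTH_CHECK_ID,
                    "ResourceRecords": [{"Value": "app-primary.us-east-1.example.com"}],
                },
            },
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "app.example.com",
                    "Type": "CNAME",
                    "SetIdentifier": "backup-us-west-2",
                    "Failover": "SECONDARY",
                    "TTL": 60,
                    "ResourceRecords": [{"Value": "app-backup.us-west-2.example.com"}],
                },
            },
        ],
    },
)
```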

Relying on an automated health check to control Regional failover can be tricky. A health check might not be perfectly reliable if a Region is experiencing some type of degradation. Often, we prefer to initiate our DR plan manually and then let automation carry out the failover steps.

What are the dependencies that we’ve baked into this failover plan? First, Route 53, our DNS service, has to be available. It must continue to serve DNS queries, and we have to be able to change DNS records manually. Second, if we do not have a full set of resources already deployed in the backup Region, we must be able to deploy resources into it.

Both dependencies might violate static stability, because we are relying on resources in our DR plan that might be affected by the outage we’re seeing. Ideally, our ability to fail over and continue serving traffic shouldn’t depend on other services being available. How do we reduce these additional dependencies?

Static stability

Let’s look at our first dependency on Route 53 in terms of control planes and data planes. Briefly, a control plane is used to configure resources, and the data plane delivers services (see Understanding Availability Needs for a more complete definition).

The Route 53 data plane, which responds to DNS queries, is highly resilient across Regions. We can safely rely on it during the failure of any single Region. But let’s assume that for some reason we are not able to call on the Route 53 control plane.

Amazon Route 53 Application Recovery Controller (Route 53 ARC) was built to handle this scenario. It provisions a Route 53 health check that we can control manually with a Route 53 ARC routing control; changing the routing control state is a data plane operation. The Route 53 ARC data plane is highly resilient, backed by a cluster of five Regional endpoints. You can update the routing control state as long as three of the five Regional endpoints are available.
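As an illustration of what a data plane failover command might look like, here is a hedged boto3 sketch that tries each cluster endpoint in turn until one accepts the state change. The endpoint URLs and routing control ARN are placeholders; in practice you retrieve them from the Route 53 ARC configuration APIs (for example, describe_cluster):

```python
import boto3

# Placeholders: a real cluster exposes five Regional endpoints, retrieved
# with the route53-recovery-control-config describe_cluster API.
CLUSTER_ENDPOINTS = {
    "us-east-1": "https://aaaa.route53-recovery-cluster.us-east-1.amazonaws.com/v1",
    "ap-southeast-2": "https://bbbb.route53-recovery-cluster.ap-southeast-2.amazonaws.com/v1",
    "eu-west-1": "https://cccc.route53-recovery-cluster.eu-west-1.amazonaws.com/v1",
}
ROUTING_CONTROL_ARN = (
    "arn:aws:route53-recovery-control::111122223333:"
    "controlpanel/example/routingcontrol/example"
)

def set_routing_control(state: str) -> None:
    """Try each cluster endpoint until one accepts the data plane call."""
    for region, endpoint in CLUSTER_ENDPOINTS.items():
        try:
            client = boto3.client(
                "route53-recovery-cluster",
                region_name=region,
                endpoint_url=endpoint,
            )
            client.update_routing_control_state(
                RoutingControlArn=ROUTING_CONTROL_ARN,
                RoutingControlState=state,  # "On" or "Off"
            )
            return
        except Exception as error:  # keep trying the remaining endpoints
            print(f"{region} endpoint failed: {error}")
    raise RuntimeError("No cluster endpoint accepted the request")

# Fail over by turning this cell's routing control off (the backup cell's
# control would be turned on the same way).
set_routing_control("Off")
```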

Figure 1. Simple Regional failover scenario using Route 53 Application Recovery Controller

The second dependency, being able to deploy resources into the second Region, is not a concern if we run a fully scaled-out set of resources. We must make sure that our deployment mechanism doesn’t rely only on the primary Region. Most AWS services have Regional control planes, so this isn’t an issue.

The AWS Identity and Access Management (IAM) data plane is highly available in each Region, so you can authorize the creation of new resources as long as you’ve already defined the roles. Note: If you use federated authentication through an identity provider, you should test that the IdP does not itself have a dependency on another Region.

Testing your disaster recovery plan

Once we’ve identified our dependencies, we need to decide how to simulate a disaster scenario. Two mechanisms you can use for this are network access control lists (NACLs) and SCPs. NACLs let us restrict network traffic to our service endpoints, while SCPs define the maximum permissions for the target accounts. SCPs also let us simulate a Route 53 or IAM control plane outage by restricting access to the service.
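As a sketch of the simulation technique (not the exact policies from the sample repository below), the following boto3 snippet creates and attaches an SCP that denies Route 53 and IAM control plane calls for accounts in a dedicated test OU. The OU ID is a placeholder:

```python
import json
import boto3

organizations = boto3.client("organizations")

# Placeholder: a dedicated organizational unit used only for DR testing.
TEST_OU_ID = "ou-aaaa-11111111"

# Denying these actions simulates the Route 53 and IAM control planes
# being unreachable for the accounts under the test OU.
scp_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "SimulateControlPlaneOutage",
            "Effect": "Deny",
            "Action": ["route53:*", "iam:*"],
            "Resource": "*",
        }
    ],
}

policy = organizations.create_policy(
    Name="simulate-route53-iam-outage",
    Description="DR test: block Route 53 and IAM control plane calls",
    Type="SERVICE_CONTROL_POLICY",
    Content=json.dumps(scp_document),
)

organizations.attach_policy(
    PolicyId=policy["Policy"]["PolicySummary"]["Id"],
    TargetId=TEST_OU_ID,
)
```

Detaching the policy ends the simulated outage, which makes this a repeatable test you can run against a dedicated OU.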

For the end-to-end DR simulation, we’ve published an AWS Samples repository on GitHub that you can deploy. It evaluates Route 53 ARC capabilities when neither the Route 53 nor the IAM control plane is accessible.

By deploying test applications across us-east-1 and us-west-1 AWS Regions, we can simulate a real-world scenario that determines the business continuity impact, failover timing, and procedures required for successful failover with unavailable control planes.

Figure 2. Simulating Regional failover using service control policies

Before you conduct the test outlined in our scenario, we strongly recommend that you create a dedicated AWS testing environment with an AWS Organizations setup. Make sure that you don’t attach SCPs to your organization’s root but instead create a dedicated organizational unit (OU). You can use this pattern to test SCPs and ensure that you don’t inadvertently lock out users from key services.

Chaos engineering

Chaos engineering is the discipline of experimenting on a system to build confidence in its capability to withstand turbulent production conditions. Chaos engineering and its principles are important tools when you plan for disaster recovery. Even a simple distributed system may be too complex to operate reliably. It can be hard or impossible to plan for every failure scenario in non-trivial distributed systems, because of the number of failure permutations. Chaos experiments test these unknowns by injecting failures (for example, shutting down EC2 instances) or transient anomalies (for example, unusually high network latency).

In the context of multi-Region DR, these techniques can help challenge assumptions and expose vulnerabilities. For example, what happens if a health check passes but the system itself is unhealthy, or vice versa? What will you do if your entire monitoring system is offline in your primary Region, or too slow to be useful? Are there control plane operations that you rely on that themselves depend on a single AWS Region’s health, such as Amazon Route 53? How does your workload respond when 25% of network packets are lost? Does your application set reasonable timeouts or does it hang indefinitely when it experiences large network latencies?

Questions like these can feel overwhelming, so start with a few, then test and iterate. You might learn that your system can run acceptably in a degraded mode. Alternatively, you might find out that you need to be able to failover quickly. Regardless of the results, the exercise of performing chaos experiments and challenging assumptions is critical when developing a robust multi-Region DR plan.

Conclusion

In this blog, you learned about reducing dependencies in your DR plan. We showed how you can use Amazon Route 53 Application Recovery Controller to reduce a dependency on the Route 53 control plane, and how to simulate a Regional failure using SCPs. As you evaluate your own DR plan, be sure to take advantage of chaos engineering practices. Formulate questions and test your static stability assumptions. And of course, you can incorporate these questions into a custom lens when you run a Well-Architected review using the AWS Well-Architected Tool.

How Ribbon Communications Built a Scalable, Resilient Robocall Mitigation Platform

Post Syndicated from Siva Rajamani original https://aws.amazon.com/blogs/architecture/how-ribbon-communications-built-a-scalable-resilient-robocall-mitigation-platform/

Ribbon Communications provides communications software and IP and optical networking end-to-end solutions that deliver innovation, unparalleled scale, performance, and agility to service providers and enterprises.

Ribbon Communications is helping customers modernize their networks. In today’s data-hungry, 24/7 world, this equates to improved competitive positioning and business outcomes. Companies are migrating from on-premises equipment for telephony services and looking for equivalent as a service (aaS) offerings. But these solutions must still meet the stringent resiliency, availability, performance, and regulatory requirements of a telephony service.

The telephony world is inundated with robocalls. In the United States alone, there were an estimated 50.5 billion robocalls in 2021! In this blog post, we describe the Ribbon Identity Hub, a holistic solution for robocall mitigation. The Ribbon Identity Hub enables services that sign and verify caller identity, compliant with the ATIS standards under the STIR/SHAKEN framework. It also evaluates and scores calls for the probability of nuisance and fraud.

Ribbon Identity Hub is implemented in Amazon Web Services (AWS). It is a fully managed service for telephony service providers and enterprises. The solution is secure and multi-tenant, scales automatically, and spans multiple Regions, enabling Ribbon to offer managed services to a wide range of telephony customers. Ribbon ensures resiliency and performance with efficient use of resources in the telephony environment, where load ratios between busy and idle time can exceed 10:1.

Ribbon Identity Hub

The Ribbon Identity Hub services are separated into a data (call-transaction) plane, and a control plane.

Data plane (call-transaction)

The call-transaction processing is typically invoked on a per-call-setup basis where availability, resilience, and performance predictability are paramount. Additionally, due to high variability in load, automatic scaling is a prerequisite.

Figure 1. Data plane architecture

Several AWS services come together in a solution that meets all these important objectives:

  1. Amazon Elastic Container Service (ECS): The ECS services are set up for automatic scaling and span two Availability Zones. This provides the horizontal scaling capability, the self-healing capacity, and the resiliency across Availability Zones.
  2. Elastic Load Balancing – Application Load Balancer (ALB): This distributes incoming traffic to the ECS services as targets. It also offers:
    • Seamless integration with the ECS Auto Scaling group. As the group grows, traffic is directed to the new instances only when they are ready. As traffic drops, traffic is drained from the target instances for graceful scale down.
    • Full support for canary and linear upgrades with zero downtime. Full service availability is maintained without any change being perceptible to client devices.
  3. Amazon Simple Storage Service (S3): Transaction detail records associated with call-related requests must be securely and reliably maintained for over a year due to billing and other contractual obligations. Amazon S3 simplifies this task with high durability, lifecycle rules, and varied controls for retention.
  4. Amazon DynamoDB: Building resilient services is significantly easier when the compute processing can be stateless. Amazon DynamoDB facilitates such stateless architectures without compromise. Coupled with the availability of the Amazon DynamoDB Accelerator (DAX) caching layer, the solution can meet the extreme low latency operation requirements.
  5. AWS Key Management Service (KMS): Certain tenant configuration is highly confidential and requires elevated protection. Furthermore, the data is part of the state that must be recovered across Regions in disaster recovery scenarios. To meet the security requirements, KMS is used for envelope encryption using per-tenant keys. Multi-Region KMS keys facilitate the secure availability of this state across Regions without the need for application-level intervention when replicating encrypted data.
  6. Amazon Route 53: For telephony services, any non-transient service failure is unacceptable. In addition to providing a high degree of resiliency through its Multi-AZ architecture, Identity Hub also provides Regional-level high availability through its multi-Region active-active architecture. Route 53 with health checks provides dynamic rerouting of requests to alternate Regions within minutes (see the health check sketch after this list).
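As a hedged example of the health checks mentioned in item 6, this boto3 sketch creates an HTTPS health check against a hypothetical Regional entry point. Ribbon’s actual configuration isn’t published, so every value here is illustrative:

```python
import uuid
import boto3

route53 = boto3.client("route53")

# Endpoint details are placeholders for a Regional service entry point.
health_check = route53.create_health_check(
    CallerReference=str(uuid.uuid4()),  # must be unique per request
    HealthCheckConfig={
        "Type": "HTTPS",
        "FullyQualifiedDomainName": "use1.identityhub.example.com",
        "Port": 443,
        "ResourcePath": "/health",
        "RequestInterval": 10,   # fast interval for quick failure detection
        "FailureThreshold": 3,
    },
)
print(health_check["HealthCheck"]["Id"])
```

The returned health check ID would then be attached to the Region’s DNS records so Route 53 can reroute traffic when the check fails.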

Control plane

The Identity Hub control plane is used for customer configuration, status, and monitoring. The API is REST-based. Since this is not used on a call-by-call basis, the requirements around latency and performance are less stringent, though the requirements around high resiliency and dynamic scaling still apply. In this area, ease of implementation and maintainability are key.

Figure 2. Control plane architecture

The following AWS services implement our control plane:

  1. Amazon API Gateway: Coupled with a custom authorizer, API Gateway handles all the REST API credential verification and routing. Implementing an API is thereby reduced to implementing handlers for each resource, which form the application core of the API.
  2. AWS Lambda: All the REST API handlers are written as Lambda functions. By using Lambda’s serverless and concurrency features, the application automatically gains self-healing and auto-scaling capabilities. There is also a significant cost advantage, as billing is per millisecond of actual compute time used. This is significant for a control plane where usage is typically sparse and unpredictable.
  3. Amazon DynamoDB: In a stateless architecture with Lambda and API Gateway, all persistent state must be stored in an external database. The database must match the resilience and auto-scaling characteristics of the rest of the control plane. DynamoDB easily fits the requirements here.

The customer portal, in addition to providing the user interface for control plane REST APIs, also delivers a rich set of user-customizable dashboards and reporting capability. Here again, the availability of various AWS services simplifies the implementation, and remains non-intrusive to the central call-transaction processing.

Services used here include:

  1. AWS Glue: Enables extraction and transformation of raw transaction data into a format useful for reporting and dashboarding. AWS Glue is particularly useful here as the data available is regularly expanding, and the use cases for the reporting and dashboarding increase.
  2. Amazon QuickSight: Provides all the business intelligence (BI) functionality, including the ability for Ribbon to offer separate author and reader access to their users, and implements tenant-based access separation.

Conclusion

Ribbon has successfully deployed Identity Hub to enable cloud hosted telephony services to mitigate robocalls. Telephony requirements around resiliency, performance, and capacity were not compromised. Identity Hub offers the benefits of a 24/7 fully managed service requiring no additional customer on-premises equipment.

Choosing AWS services for Identity Hub gives Ribbon the ability to scale and meet future growth. The ability to dynamically scale the service in and out also brings significant cost advantages in telephony applications where busy hour traffic is significantly higher than idle time traffic. In addition, the availability of global AWS services facilitates the deployment of services in customer-local geographic locations to meet performance requirements or local regulatory compliance.

Creating a Multi-Region Application with AWS Services – Part 1, Compute and Security

Post Syndicated from Joe Chapman original https://aws.amazon.com/blogs/architecture/creating-a-multi-region-application-with-aws-services-part-1-compute-and-security/

Building a multi-Region application requires lots of preparation and work. Many AWS services have features to help you build and manage a multi-Region architecture, but identifying those capabilities across 200+ services can be overwhelming.

In this 3-part blog series, we’ll explore AWS services with features to assist you in building multi-Region applications. In Part 1, we’ll build a foundation with AWS security, networking, and compute services. In Part 2, we’ll add in data and replication strategies. Finally, in Part 3, we’ll look at the application and management layers.

Considerations before getting started

AWS Regions are built with multiple isolated and physically separate Availability Zones (AZs). This approach allows you to create highly available Well-Architected workloads that span AZs to achieve greater fault tolerance. There are three general reasons that you may need to expand beyond a single Region:

  • Expansion to a global audience: as an application grows and its user base becomes more geographically dispersed, there can be a need to reduce latencies for different parts of the world.
  • Reducing Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO) as part of a disaster recovery (DR) plan.
  • Local laws and regulations may have strict data residency and privacy requirements that must be followed.

Ensuring security, identity, and compliance

Creating a security foundation starts with proper authentication, authorization, and accounting to implement the principle of least privilege. AWS Identity and Access Management (IAM) operates in a global context by default. With IAM, you specify who can access which AWS resources and under what conditions. For workloads that use directory services, the AWS Directory Service for Microsoft Active Directory Enterprise Edition can be set up to automatically replicate directory data across Regions. This allows applications to reduce lookup latencies by using the closest directory and creates durability by spanning multiple Regions.

Applications that need to securely store, rotate, and audit secrets, such as database passwords, should use AWS Secrets Manager. It encrypts secrets with AWS Key Management Service (AWS KMS) keys and can replicate secrets to secondary Regions to ensure applications are able to obtain a secret in the closest Region.
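For example, here is a minimal boto3 sketch that adds a replica Region to an existing secret; the secret name and Region are illustrative:

```python
import boto3

secretsmanager = boto3.client("secretsmanager", region_name="us-east-1")

# Replicate an existing secret to a secondary Region so applications there
# can read it locally. Secret name and Region are placeholders.
secretsmanager.replicate_secret_to_regions(
    SecretId="prod/app/db-password",
    AddReplicaRegions=[
        {"Region": "us-west-2"},  # a per-replica "KmsKeyId" can also be set
    ],
)
```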

Encrypt everything all the time

AWS KMS can be used to encrypt data at rest, and is used extensively for encryption across AWS services. By default, keys are confined to a single Region. AWS KMS multi-Region keys can be created to replicate keys to a second Region, which eliminates the need to decrypt and re-encrypt data with a different key in each Region.
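A short boto3 sketch of that pattern, with illustrative Regions: create a multi-Region primary key, then replicate it so the same key material is usable in the second Region.

```python
import boto3

kms = boto3.client("kms", region_name="us-east-1")

# Create a multi-Region primary key in the first Region.
key = kms.create_key(
    Description="Multi-Region key for cross-Region data encryption",
    MultiRegion=True,
)

# Replicate it to a second Region; ciphertext produced by the primary key
# can then be decrypted with the replica without re-encryption.
kms.replicate_key(
    KeyId=key["KeyMetadata"]["KeyId"],
    ReplicaRegion="us-west-2",
)
```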

AWS CloudTrail logs user activity and API usage. Logs are created in each Region, but they can be centralized from multiple Regions and multiple accounts into a single Amazon Simple Storage Service (Amazon S3) bucket. As a best practice, these logs should be aggregated to an account that is only accessible to required security personnel to prevent misuse.

As your application expands to new Regions, AWS Security Hub can aggregate and link findings to a single Region to create a centralized view across accounts and Regions. These findings are continuously synced between Regions to keep you updated on global findings.

We put these features together in Figure 1.

Figure 1. Multi-Region security, identity, and compliance services

Building a global network

For resources launched into virtual networks in different Regions, Amazon Virtual Private Cloud (Amazon VPC) allows private routing between Regions and accounts with VPC peering. These resources can communicate using private IP addresses and do not require an internet gateway, VPN, or separate network appliances. This works well for smaller networks that only require a few peering connections. However, as the number of peered connections increases, the mesh of peered connections can become difficult to manage and troubleshoot.

AWS Transit Gateway can help reduce these difficulties by creating a central transitive hub to act as a cloud router. A Transit Gateway’s routing capabilities can expand to additional Regions with Transit Gateway inter-Region peering to create a globally distributed private network.

Building a reliable, cost-effective way to route users to distributed Internet applications requires highly available and scalable Domain Name System (DNS) records. Amazon Route 53 does exactly that.

Route 53 routing policies can route traffic to a record with the lowest latency, or automatically fail over a record. If a larger failure occurs, the Route 53 Application Recovery Controller can simplify the monitoring and failover process for application failures across Regions, AZs, and on-premises.

Amazon CloudFront’s content delivery network is truly global, built across 300+ points of presence (PoPs) spread throughout the world. Applications that have multiple possible origins, such as across Regions, can use CloudFront origin failover to automatically fail over the origin. CloudFront’s capabilities expand beyond serving content, with the ability to run compute at the edge. CloudFront Functions make it easy to run lightweight JavaScript functions, and AWS Lambda@Edge makes it easy to run Node.js and Python functions across these 300+ PoPs.

AWS Global Accelerator uses the AWS global network infrastructure to provide two static anycast IPs for your application. It automatically routes traffic to the closest Region deployment, and if a failure is detected it will automatically redirect traffic to a healthy endpoint within seconds.
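To illustrate, here is a hedged boto3 sketch that creates an accelerator, a TCP listener, and one Regional endpoint group; the load balancer ARN is a placeholder, and the Global Accelerator API is served from us-west-2:

```python
import boto3

# The Global Accelerator control plane is homed in us-west-2.
ga = boto3.client("globalaccelerator", region_name="us-west-2")

accelerator = ga.create_accelerator(
    Name="multi-region-app",
    IpAddressType="IPV4",
    Enabled=True,  # two static anycast IPs are allocated for the app
)

listener = ga.create_listener(
    AcceleratorArn=accelerator["Accelerator"]["AcceleratorArn"],
    Protocol="TCP",
    PortRanges=[{"FromPort": 443, "ToPort": 443}],
)

# One endpoint group per Region; the ALB ARN below is a placeholder.
ga.create_endpoint_group(
    ListenerArn=listener["Listener"]["ListenerArn"],
    EndpointGroupRegion="us-east-1",
    EndpointConfigurations=[
        {
            "EndpointId": "arn:aws:elasticloadbalancing:us-east-1:"
                          "111122223333:loadbalancer/app/my-alb/abc123",
            "Weight": 128,
        }
    ],
)
```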

Figure 2 brings these features together to create a global network across two Regions.

Figure 2. AWS VPC connectivity and content delivery

Building the compute layer

An Amazon Elastic Compute Cloud (Amazon EC2) instance is based on an Amazon Machine Image (AMI). An AMI specifies instance configurations such as the instance’s storage, launch permissions, and device mappings. When a new standard image needs to be created, EC2 Image Builder can be used to streamline copying AMIs to selected Regions.

Although EC2 instances and their associated Amazon Elastic Block Store (Amazon EBS) volumes live in a single AZ, Amazon Data Lifecycle Manager can automate the process of taking and copying EBS snapshots across Regions. This can enhance DR strategies by providing a relatively easy cold backup-and-restore option for EBS volumes.

As an architecture expands into multiple Regions, it can become difficult to track where instances are provisioned. Amazon EC2 Global View helps solve this by providing a centralized dashboard to see Amazon EC2 resources such as instances, VPCs, subnets, security groups, and volumes in all active Regions.

Microservice-based applications that use containers benefit from quicker start-up times. Amazon Elastic Container Registry (Amazon ECR) can help ensure this happens consistently across Regions with private image replication at the registry level. An ECR private registry can be configured for either cross-Region or cross-account replication to ensure your images are ready in secondary Regions when needed.
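A minimal boto3 sketch of registry-level cross-Region replication, with a placeholder account ID:

```python
import boto3

ecr = boto3.client("ecr", region_name="us-east-1")

# Replicate every repository in this registry to us-west-2; the registry
# (account) ID is a placeholder.
ecr.put_replication_configuration(
    replicationConfiguration={
        "rules": [
            {
                "destinations": [
                    {"region": "us-west-2", "registryId": "111122223333"},
                ],
            }
        ]
    }
)
```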

We bring these compute layer features together in Figure 3.

Figure 3. AMI and EBS snapshot copy across Regions

Summary

It’s important to create a solid foundation when architecting a multi-Region application. These foundations pave the way for you to move fast in a secure, reliable, and elastic way as you build out your application. In this post, we covered options across AWS security, networking, and compute services that have built-in functionality to take away some of the undifferentiated heavy lifting. We’ll cover data, application, and management services in future posts.

Ready to get started? We’ve chosen some AWS Solutions and AWS Blogs to help you!

Looking for more architecture content? AWS Architecture Center provides reference architecture diagrams, vetted architecture solutions, Well-Architected best practices, patterns, icons, and more!

What to Consider when Selecting a Region for your Workloads

Post Syndicated from Saud Albazei original https://aws.amazon.com/blogs/architecture/what-to-consider-when-selecting-a-region-for-your-workloads/

The AWS Cloud is an ever-growing network of Regions and points of presence (PoP), with a global network infrastructure that connects them together. With such a vast selection of Regions, costs, and services available, it can be challenging for startups to select the optimal Region for a workload. This decision must be made carefully, as it has a major impact on compliance, cost, performance, and services available for your workloads.

Evaluating Regions for deployment

There are four main factors that play into evaluating each AWS Region for a workload deployment:

  1. Compliance. If your workload contains data that is bound by local regulations, then selecting the Region that complies with the regulation overrides other evaluation factors. This applies to workloads that are bound by data residency laws where choosing an AWS Region located in that country is mandatory.
  2. Latency. A major factor to consider for user experience is latency. Reduced network latency can make a substantial impact on user experience. Choosing an AWS Region in close proximity to your user base location can achieve lower network latency. It can also increase communication quality, given that network packets have fewer exchange points to travel through.
  3. Cost. AWS services are priced differently from one Region to another. Some Regions have lower cost than others, which can result in a cost reduction for the same deployment.
  4. Services and features. Newer services and features are deployed to Regions gradually. Although all AWS Regions have the same service level agreement (SLA), some larger Regions are usually first to offer newer services, features, and software releases. Smaller Regions may not get these services or features in time for you to use them to support your workload.

Evaluating all these factors can make coming to a decision complicated. This is where your priorities as a business should influence the decision.

Assess potential Regions for the right option

Evaluate by shortlisting potential Regions.

  • Check if these Regions are compliant and have the services and features you need to run your workload using the AWS Regional Services website.
  • Check feature availability of each service and versions available, if your workload has specific requirements.
  • Calculate the cost of the workload on each Region using the AWS Pricing Calculator.
  • Test the network latency between your user base location and each AWS Region (a quick probe sketch follows this list).
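One simple way to approximate that latency test from a user’s location is to measure TCP connection time to a Regional endpoint, which roughly tracks network round-trip time. The endpoints below are illustrative choices:

```python
import socket
import time

# TCP connect time to a Regional service endpoint approximates the network
# round-trip latency from wherever this script runs.
REGION_ENDPOINTS = {
    "us-east-1": "dynamodb.us-east-1.amazonaws.com",
    "eu-west-1": "dynamodb.eu-west-1.amazonaws.com",
    "me-south-1": "dynamodb.me-south-1.amazonaws.com",
}

for region, host in REGION_ENDPOINTS.items():
    start = time.perf_counter()
    with socket.create_connection((host, 443), timeout=5):
        elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{region}: {elapsed_ms:.0f} ms")
```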

At this point, you should have a list of AWS Regions with varying cost and network latency that looks something like Table 1:

Region   | Compliance | Latency | Cost | Services / Features
---------|------------|---------|------|--------------------
Region A |            | 15 ms   | $$   |
Region B |            | 20 ms   | $$$  | X
Region C |            | 80 ms   | $    |

Table 1. Region evaluation matrix

Many workloads, such as high performance computing (HPC), analytics, and machine learning (ML), are not directly linked to a customer-facing application. These would not be sensitive to network latency, so you may want to select the Region with the lowest cost.

Alternatively, you may have a backend service for a game or mobile application in which network latency has a direct impact on user experience. Measure the difference in network latency between each Region, and determine if it is worth the increased cost. You can leverage the Amazon CloudFront edge network, which helps reduce latency and increases communication quality. This is because it uses a fully managed AWS network infrastructure, which connects your application to the edge location nearest to your users.

Multi-Region deployment

You can also split the workload across multiple Regions. The same workload may have some components that are sensitive to network latency and some that are not. You may determine you can benefit from both lower network latency and reduced cost at the same time. Here’s an example:

Figure 1. Multi-Region deployment optimized for feature availability

Figure 1 shows a serverless application deployed at the Bahrain Region (me-south-1) which has a close proximity to the customer base in Riyadh, Saudi Arabia. Application users enjoy a lower latency network connecting to the AWS Cloud. Analytics workloads are deployed in the Ireland Region (eu-west-1), which has a lower cost for Amazon Redshift and other features.

Note that data transfer between Regions is not free and, in this example, costs $0.115 per GB. However, even with this additional cost factored in, running the analytical workload in Ireland (eu-west-1) is still more cost-effective. You can also benefit from additional capabilities and features that may have not yet been released in the Bahrain (me-south-1) Region.

This multi-Region setup could also be beneficial for applications with a global user base. The application can be deployed in multiple secondary AWS Regions closer to the user base locations. It uses a primary AWS Region with a lower cost for consolidated services and latency-insensitive workloads.

Figure 2. Multi-Region deployment optimized for network latency

The architecture in Figure 2 allows an application to span multiple Regions to serve read requests with the lowest network latency possible. Each client will be routed to the nearest AWS Region. For read requests, an Amazon Route 53 latency routing policy will be used. For write requests, an endpoint routed to the primary Region will be used. This primary endpoint can also have periodic health checks so that it fails over to a secondary Region for disaster recovery (DR).
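A hedged boto3 sketch of that latency routing policy for the read endpoint; the zone ID and record targets are placeholders:

```python
import boto3

route53 = boto3.client("route53")

# One latency record per Region routes each reader to the nearest
# deployment. Zone ID and targets are placeholders.
route53.change_resource_record_sets(
    HostedZoneId="Z0123456789EXAMPLE",
    ChangeBatch={
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "read.example.com",
                    "Type": "CNAME",
                    "SetIdentifier": region,
                    "Region": region,  # latency routing policy
                    "TTL": 60,
                    "ResourceRecords": [
                        {"Value": f"read.{region}.example.com"}
                    ],
                },
            }
            for region in ("me-south-1", "eu-west-1")
        ]
    },
)
```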

Other factors may also apply for certain applications, such as ones that require Amazon EC2 Spot Instances. Regions differ in size, with some having three and others up to six Availability Zones (AZs). This results in varying Spot Instance capacity available for Amazon EC2. Choosing larger Regions offers larger Spot capacity. A multi-Region deployment offers the most Spot capacity.

Conclusion

Selecting the optimal AWS Region is an important first step when deploying new workloads. There are many other scenarios in which splitting the workload across multiple AWS Regions can result in a better user experience and cost reduction. The four factors mentioned in this blog post can be evaluated together to find the most appropriate Region to deploy your workloads.

If the workload is bound by any regulations, shortlist the Regions that are compliant. Measure the network latency between each Region and the location of the user base. Estimate the workload cost for each Region. Check that the shortlisted Regions have the services and features your workload requires. And finally, determine if your workload can benefit from running in multiple Regions.

Dive deeper into the AWS Global Infrastructure Website for more information.

Protect your remote workforce by using a managed DNS firewall and network firewall

Post Syndicated from Patrick Duffy original https://aws.amazon.com/blogs/security/protect-your-remote-workforce-by-using-a-managed-dns-firewall-and-network-firewall/

More of our customers are adopting flexible work-from-home and remote work strategies that use virtual desktop solutions, such as Amazon WorkSpaces and Amazon AppStream 2.0, to deliver their user applications. Securing these workloads benefits from a layered approach, and this post focuses on protecting your users at the network level. Customers can now apply these security measures by using Route 53 Resolver DNS Firewall and AWS Network Firewall, two managed services that provide layered protection for the customer’s virtual private cloud (VPC). This blog post provides recommendations for how you can build network protection for your remote workforce by using DNS Firewall and Network Firewall.

Overview

DNS Firewall helps you block DNS queries that are made for known malicious domains, while allowing DNS queries to trusted domains. DNS Firewall has a simple deployment model that makes it straightforward for you to start protecting your VPCs by using managed domain lists, as well as custom domain lists. With DNS Firewall, you can filter and regulate outbound DNS requests. The service inspects DNS requests that are handled by Route 53 Resolver and applies actions that you define to allow or block requests.

DNS Firewall consists of domain lists and rule groups. Domain lists include custom domain lists that you create and AWS managed domain lists. Rule groups are associated with VPCs and control the response for domain lists that you choose. You can configure rule groups at scale by using AWS Firewall Manager. Rule groups process in priority order and stop processing after a rule is matched.

Network Firewall helps customers protect their VPCs at the network layer. Network Firewall is an automatically scaling, highly available service that simplifies deployment and management for network administrators. With Network Firewall, you can perform inspection for inbound traffic, outbound traffic, traffic between VPCs, and traffic between VPCs and AWS Direct Connect or AWS VPN traffic. You can deploy stateless rules to allow or deny traffic based on the protocol, source and destination ports, and source and destination IP addresses. Additionally, you can deploy stateful rules that allow or block traffic based on domain lists, standard rule groups, or Suricata compatible intrusion prevention system (IPS) rules.

To configure Network Firewall, you need to create Network Firewall rule groups, a Network Firewall policy, and finally, a network firewall. Rule groups consist of stateless and stateful rule groups. For both types of rule groups, you need to estimate the capacity when you create the rule group. See the Network Firewall Developer Guide to learn how to estimate the capacity that is needed for the stateless and stateful rule engines.

This post shows you how to configure DNS Firewall and Network Firewall to protect your workload. You will learn how to create rules that prevent DNS queries to unapproved DNS servers, and that block resources by protocol, domain, and IP address. For the purposes of this post, we’ll show you how to protect a workload consisting of two Microsoft Active Directory domain controllers, an application server running QuickBooks, and Amazon WorkSpaces to deliver the QuickBooks application to end users, as shown in Figure 1.
 

Figure 1: An example architecture that includes domain controllers and QuickBooks hosted on EC2 and Amazon WorkSpaces for user virtual desktops

Configure DNS Firewall

DNS Firewall domain lists currently include two managed lists to block malware and botnet command-and-control networks, and you can also bring your own list. Your list can include any domain names that you have found to be malicious and any domains that you don’t want your workloads connecting to.

To configure DNS Firewall domain lists (console)

  1. Open the Amazon VPC console.
  2. In the navigation pane, under DNS Firewall, choose Domain lists.
  3. Choose Add domain list to configure a customer-owned domain list.
  4. In the domain list builder dialog box, do the following.
    1. Under Domain list name, enter a name.
    2. In the second dialog box, enter the list of domains you want to allow or block.
    3. Choose Add domain list.

When you create a domain list, you can enter a list of domains you want to block or allow. You also have the option to upload your domains by using a bulk upload. You can use wildcards when you add domains for DNS Firewall. Figure 2 shows an example of a custom domain list that matches the root domain and any subdomain of box.com, dropbox.com, and sharefile.com, to prevent users from using these file sharing platforms.
 

Figure 2: Domains added to a customer-owned domain list
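If you prefer to automate the console steps above, here is a hedged boto3 equivalent that creates the custom domain list shown in Figure 2; the list name is illustrative:

```python
import uuid
import boto3

resolver = boto3.client("route53resolver")

# Create a customer-owned domain list matching Figure 2; the wildcard
# entries cover all subdomains of each root domain.
domain_list = resolver.create_firewall_domain_list(
    CreatorRequestId=str(uuid.uuid4()),
    Name="blocked-file-sharing",
)

resolver.update_firewall_domains(
    FirewallDomainListId=domain_list["FirewallDomainList"]["Id"],
    Operation="ADD",
    Domains=[
        "box.com.", "*.box.com.",
        "dropbox.com.", "*.dropbox.com.",
        "sharefile.com.", "*.sharefile.com.",
    ],
)
```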

To configure DNS Firewall rule groups (console)

  1. Open the Amazon VPC console.
  2. In the navigation pane, under DNS Firewall, choose Rule group.
  3. Choose Create rule group to apply actions to domain lists.
  4. Enter a rule group name and optional description.
  5. Choose Add rule to add a managed or customer-owned domain list, and do the following.
    1. Enter a rule name and optional description.
    2. Choose Add my own domain list or Add AWS managed domain list.
    3. Select the desired domain list.
    4. Choose an action, and then choose Next.
  6. (Optional) Change the rule priority.
  7. (Optional) Add tags.
  8. Choose Create rule group.

When you create your rule group, you attach rules and set an action and priority for the rule. You can set rule actions to Allow, Block, or Alert. When you set the action to Block, you can return the following responses (see the API sketch after this list):

  • NODATA – Returns no response.
  • NXDOMAIN – Returns an unknown domain response.
  • OVERRIDE – Returns a custom CNAME response.
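A hedged boto3 sketch of the rule group workflow: create the group, attach a Block rule that returns NXDOMAIN, and associate the group with a VPC. The domain list ID and VPC ID are placeholders:

```python
import uuid
import boto3

resolver = boto3.client("route53resolver")

rule_group = resolver.create_firewall_rule_group(
    CreatorRequestId=str(uuid.uuid4()),
    Name="workspace-dns-rules",
)
rule_group_id = rule_group["FirewallRuleGroup"]["Id"]

# Block the custom list with an NXDOMAIN response; the domain list ID is
# a placeholder for the list created earlier.
resolver.create_firewall_rule(
    CreatorRequestId=str(uuid.uuid4()),
    FirewallRuleGroupId=rule_group_id,
    FirewallDomainListId="rslvr-fdl-0123456789abcdef",
    Priority=1,
    Action="BLOCK",
    BlockResponse="NXDOMAIN",
    Name="block-file-sharing",
)

# Associate the rule group with a VPC so it filters that VPC's DNS queries.
resolver.associate_firewall_rule_group(
    CreatorRequestId=str(uuid.uuid4()),
    FirewallRuleGroupId=rule_group_id,
    VpcId="vpc-0123456789abcdef0",
    Priority=101,
    Name="workspace-vpc-association",
)
```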

Figure 3 shows rules attached to the DNS firewall.
 

Figure 3: DNS Firewall rules

To associate your rule group to a VPC (console)

  1. Open the Amazon VPC console.
  2. In the navigation pane, under DNS Firewall, choose Rule group.
  3. Select the desired rule group.
  4. Choose Associated VPCs, and then choose Associate VPC.
  5. Select one or more VPCs, and then choose Associate.

The rule group will filter your DNS requests to Route 53 Resolver. Set your DNS servers’ forwarders to use your Route 53 Resolver.

To configure logging for your firewall’s activity, navigate to the Route 53 console and select your VPC under the Resolver section. You can configure multiple logging options, if required. You can choose to log to Amazon CloudWatch, Amazon Simple Storage Service (Amazon S3), or Amazon Kinesis Data Firehose. Select the VPC that you want to log queries for and add any tags that you require.

Configure Network Firewall

In this section, you’ll learn how to create Network Firewall rule groups, a firewall policy, and a network firewall.

Configure rule groups

Stateless rule groups are straightforward evaluations of a source and destination IP address, protocol, and port. It’s important to note that stateless rules don’t perform any deep inspection of network traffic.

Stateless rules have three options:

  • Pass – Pass the packet without further inspection.
  • Drop – Drop the packet.
  • Forward – Forward the packet to stateful rule groups.

Stateless rules inspect each packet in isolation in the order of priority and stop processing when a rule has been matched. This example doesn’t use a stateless rule, and simply uses the default firewall action to forward all traffic to stateful rule groups.

Stateful rule groups support deep packet inspection, traffic logging, and more complex rules. Stateful rule groups evaluate traffic based on standard rules, domain rules, or Suricata rules. Depending on the type of rule that you use, you can pass, drop, or create alerts on the traffic that is inspected.

To create a rule group (console)

  1. Open the Amazon VPC console.
  2. In the navigation pane, under AWS Network Firewall, choose Network Firewall rule groups.
  3. Choose Create Network Firewall rule group.
  4. Choose Stateful rule group or Stateless rule group.
  5. Enter the desired settings.
  6. Choose Create stateful rule group.

The example in Figure 4 uses standard rules to block outbound and inbound Server Message Block (SMB), Secure Shell (SSH), Network Time Protocol (NTP), DNS, and Kerberos traffic, which are common protocols used in our example workload. Network Firewall doesn’t inspect traffic between subnets within the same VPC or over VPC peering, so these rules won’t block local traffic. You can add rules with the Pass action to allow traffic to and from trusted networks.
 

Figure 4: Standard rules created to block unauthorized SMB, SSH, NTP, DNS, and Kerberos traffic

Blocking outbound DNS requests is a common strategy to verify that DNS traffic resolves only from local resolvers, such as your DNS server or the Route 53 Resolver. You can also use these rules to prevent inbound traffic to your VPC-hosted resources, as an additional layer of security beyond security groups. If a security group erroneously allows SMB access to a file server from external sources, Network Firewall will drop this traffic based on these rules.

Even though the DNS Firewall policy described in this blog post will block DNS queries for unauthorized sharing platforms, some users might attempt to bypass this block by modifying the HOSTS file on their Amazon WorkSpace. To counter this risk, you can add a domain rule to your firewall policy to block the box.com, dropbox.com, and sharefile.com domains, as shown in Figure 5.
 

Figure 5: A domain list rule to block box.com, dropbox.com, and sharefile.com
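As a sketch of that domain rule in API form, the following boto3 call creates a stateful rule group that denies HTTP and TLS traffic to those domains; the capacity value is a rough estimate for a rule set this small:

```python
import boto3

nfw = boto3.client("network-firewall")

# A stateful domain list rule group that drops HTTP/TLS traffic to the
# file sharing domains, even if a user bypasses DNS Firewall.
nfw.create_rule_group(
    RuleGroupName="deny-file-sharing-domains",
    Type="STATEFUL",
    Capacity=100,  # estimated capacity for this small rule set
    RuleGroup={
        "RulesSource": {
            "RulesSourceList": {
                "Targets": [".box.com", ".dropbox.com", ".sharefile.com"],
                "TargetTypes": ["HTTP_HOST", "TLS_SNI"],
                "GeneratedRulesType": "DENYLIST",
            }
        }
    },
)
```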

Configure firewall policy

You can use firewall policies to attach stateless and stateful rule groups to a single policy that is used by one or more network firewalls. Attach your rule groups to this policy and set your preferred default stateless actions. The default stateless actions will apply to any packets that don’t match a stateless rule group within the policy. You can choose separate actions for full packets and fragmented packets, depending on your needs, as shown in Figure 6.
 

Figure 6: Stateful rule groups attached to a firewall policy

You can choose to forward the traffic to be processed by any stateful rule groups that you have attached to your firewall policy. To bypass any stateful rule groups, you can select the Pass option.

To create a firewall policy (console)

  1. Open the Amazon VPC console.
  2. In the navigation pane, under AWS Network Firewall, choose Firewall policies.
  3. Choose Create firewall policy.
  4. Enter a name and description for the policy.
  5. Choose Add rule groups.
    1. Select the stateless default actions you want to use.
    2. For any stateless or stateful rule groups, choose Add rule groups to add any rule groups that you want to use.
  6. (Optional) Add tags.
  7. Choose Create firewall policy.
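A boto3 equivalent sketch of the console procedure above, assuming default actions that forward everything to the stateful engine; the rule group ARN is a placeholder:

```python
import boto3

nfw = boto3.client("network-firewall")

# Forward all packets (full and fragmented) to the stateful engine, then
# attach the stateful rule groups; the ARN below is a placeholder.
nfw.create_firewall_policy(
    FirewallPolicyName="remote-workforce-policy",
    FirewallPolicy={
        "StatelessDefaultActions": ["aws:forward_to_sfe"],
        "StatelessFragmentDefaultActions": ["aws:forward_to_sfe"],
        "StatefulRuleGroupReferences": [
            {
                "ResourceArn": "arn:aws:network-firewall:us-east-1:"
                               "111122223333:stateful-rulegroup/"
                               "deny-file-sharing-domains"
            },
        ],
    },
)
```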

Configure a network firewall

Configuring the network firewall requires you to attach the firewall to a VPC and select at least one subnet.

To create a network firewall (console)

  1. Open the Amazon VPC console.
  2. In the navigation pane, under AWS Network Firewall, choose Firewalls.
  3. Choose Create firewall.
  4. Under Firewall details, do the following:
    1. Enter a name for the firewall.
    2. Select the VPC.
    3. Select one or more Availability Zones and subnets, as needed.
  5. Under Associated firewall policy, do the following:
    1. Choose Associate an existing firewall policy.
    2. Select the firewall policy.
  6. (Optional) Add tags.
  7. Choose Create firewall.
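The same firewall can be created through the API; this hedged boto3 sketch uses placeholder VPC and subnet IDs, with two subnets in different Availability Zones as in the example that follows:

```python
import boto3

nfw = boto3.client("network-firewall")

# Two subnets in separate Availability Zones give the firewall endpoints
# high availability; all identifiers are placeholders.
nfw.create_firewall(
    FirewallName="remote-workforce-firewall",
    FirewallPolicyArn="arn:aws:network-firewall:us-east-1:111122223333:"
                      "firewall-policy/remote-workforce-policy",
    VpcId="vpc-0123456789abcdef0",
    SubnetMappings=[
        {"SubnetId": "subnet-0aaaaaaaaaaaaaaaa"},
        {"SubnetId": "subnet-0bbbbbbbbbbbbbbbb"},
    ],
)
```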

Two subnets in separate Availability Zones are used for the network firewall example shown in Figure 7, to provide high availability.
 

Figure 7: A network firewall configuration that includes multiple subnets

After the firewall is in the ready state, you’ll be able to see the endpoint IDs of the firewall endpoints, as shown in Figure 8. The endpoint IDs are needed when you update VPC route tables.
 

Figure 8: Firewall endpoint IDs

You can configure alert logs, flow logs, or both to be sent to Amazon S3, CloudWatch log groups, or Kinesis Data Firehose. Administrators configure alert logging to build proactive alerting and flow logging to use in troubleshooting and analysis.

Finalize the setup

After the firewall is created and ready, the last step to complete setup is to update the VPC route tables. Update your routing in the VPC to route traffic through the new network firewall endpoints. Update each public subnet’s route table to direct traffic to the firewall endpoint in the same Availability Zone. Update the internet gateway’s route table to direct traffic for each public subnet to the firewall endpoint in the matching Availability Zone. These routes are shown in Figure 9.
 

Figure 9: Network diagram of the firewall solution
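In API form, the route updates might look like the following hedged boto3 sketch; every resource ID and CIDR here is a placeholder:

```python
import boto3

ec2 = boto3.client("ec2")

# Send the public subnet's default route through the firewall endpoint in
# the same Availability Zone.
ec2.create_route(
    RouteTableId="rtb-0aaaaaaaaaaaaaaaa",        # public subnet route table
    DestinationCidrBlock="0.0.0.0/0",
    VpcEndpointId="vpce-0firewallendpoint1111",  # firewall endpoint, same AZ
)

# On the route table associated with the internet gateway, direct return
# traffic for each public subnet CIDR through the matching firewall endpoint.
ec2.create_route(
    RouteTableId="rtb-0bbbbbbbbbbbbbbbb",        # IGW edge route table
    DestinationCidrBlock="10.0.1.0/24",          # public subnet CIDR
    VpcEndpointId="vpce-0firewallendpoint1111",
)
```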

In this example architecture, Amazon WorkSpaces users are able to connect directly between private subnet 1 and private subnet 2 to access local resources. Security groups and Windows authentication control access from WorkSpaces to EC2-hosted workloads such as Active Directory, file servers, and SQL applications. For example, Microsoft Active Directory domain controllers are added to a security group that allows inbound ports 53, 389, and 445, as shown in Figure 10.
 

Figure 10: Domain controller security group inbound rules

Traffic from WorkSpaces will first resolve DNS requests by using the Active Directory domain controller. The domain controller uses the local Route 53 Resolver as a DNS forwarder, which DNS Firewall protects. Network traffic then flows from the private subnet to the NAT gateway, through the network firewall to the internet gateway. Response traffic flows back from the internet gateway to the network firewall, then to the NAT gateway, and finally to the user WorkSpace. This workflow is shown in Figure 11.
 

Figure 11: Traffic flow for allowed traffic

If a user attempts to connect to blocked internet resources, such as box.com, a botnet, or a malware domain, this will result in an NXDOMAIN response from DNS Firewall, and the connection will not proceed any further. This blocked traffic flow is shown in Figure 12.
  

Figure 12: Traffic flow when blocked by DNS Firewall

If a user attempts to initiate a DNS request to a public DNS server or attempts to access a public file server, this will result in a dropped connection by Network Firewall. The traffic will flow as expected from the user WorkSpace to the NAT gateway and from the NAT gateway to the network firewall, which inspects the traffic. The network firewall then drops the traffic when it matches a rule with the drop or block action, as shown in Figure 13. This configuration helps to ensure that your private resources only use approved DNS servers and internet resources. Network Firewall blocks unapproved domains and restricted protocols by using the standard rules you configured.
 

Figure 13: Traffic flow when blocked by Network Firewall

Take extra care to associate a route table with your internet gateway to route private subnet traffic to your firewall endpoints; otherwise, response traffic won’t make it back to your private subnets. Traffic will route from the private subnet up through the NAT gateway in its Availability Zone. The NAT gateway will pass the traffic to the network firewall endpoint in the same Availability Zone, which will process the rules and send allowed traffic to the internet gateway for the VPC. By using this method, you can block outbound network traffic with criteria that are more advanced than what is allowed by network ACLs.

Conclusion

Amazon Route 53 Resolver DNS Firewall and AWS Network Firewall help you protect your VPC workloads by inspecting network traffic and applying deep packet inspection rules to block unwanted traffic. This post focused on implementing Network Firewall in a virtual desktop workload that spans multiple Availability Zones. You’ve seen how to deploy a network firewall and update your VPC route tables. This solution can help increase the security of your workloads in AWS. If you have multiple VPCs to protect, consider enforcing your policies at scale by using AWS Firewall Manager, as outlined in this blog post.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Network Firewall forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Patrick Duffy

Patrick is a Solutions Architect in the Small Medium Business (SMB) segment at AWS. He is passionate about raising awareness and increasing security of AWS workloads. Outside work, he loves to travel and try new cuisines and enjoys a match in Magic Arena or Overwatch.

Using VPC Endpoints in Multi-Region Architectures with Route 53 Resolver

Post Syndicated from Michael Haken original https://aws.amazon.com/blogs/architecture/using-vpc-endpoints-in-multi-region-architectures-with-route-53-resolver/

Many customers are building multi-Region architectures on AWS. They might want to bring their systems closer to their end users, support disaster recovery (DR), or comply with data sovereignty requirements. Often, these architectures use Amazon Virtual Private Cloud (VPC) to host resources like Amazon EC2 instances, Amazon Relational Database Service (RDS) databases, and AWS Lambda functions. Typically, these VPCs are also connected using VPC peering or AWS Transit Gateway.

Within these VPC networks, customers also use AWS PrivateLink to deploy VPC endpoints. These endpoints provide private connectivity between VPCs and AWS services. They also support endpoint policies that allow customers to implement guardrails. As an example, customers frequently use endpoint policies to ensure that only IAM principals in their AWS Organization are accessing resources from their networks.

The challenge some customers have faced is that VPC endpoints can only be used to access resources in the same Region as the endpoint. For example, an Amazon Simple Storage Service (S3) VPC endpoint deployed in us-east-1 can only be used to access S3 buckets also located in us-east-1. To access a bucket in us-east-2, that traffic has to traverse the public internet. Ideally, customers want to keep this traffic within their private network and apply VPC endpoint policies, regardless of the Region where the resource is located.

Amazon Route 53 Resolver to the rescue

One of the ways we can solve this problem is with Amazon Route 53 Resolver. Route 53 Resolver provides inbound and outbound DNS services in a VPC. It allows you to resolve domain names for AWS resources in the Region where the resolver endpoint is deployed. It also allows you to forward DNS requests to other DNS servers based on rules you define. To consistently apply VPC endpoint policies to all traffic, we use Route 53 Resolver to steer traffic to VPC endpoints in each Region.

Figure 1. A multi-Region architecture with Route 53 Resolver and S3 endpoints

In this example shown in Figure 1, we have a workload that operates in us-east-1. It must access Amazon S3 buckets in us-east-2 and us-west-2. There is a VPC in each Region that is connected via VPC peering to the one in us-east-1. We’ve also deployed an inbound and outbound Route 53 Resolver endpoint in each VPC.

Finally, we also have Amazon S3 interface VPC endpoints in each VPC. These provide their own unique DNS names. They can be resolved to private IP addresses using VPC-provided DNS (the .2 address or the 169.254.169.253 address) or the inbound resolver IP addresses.

When the EC2 instance accesses a bucket in us-east-1, the Route 53 Resolver endpoint resolves the DNS name to the private IP address of the VPC endpoint. However, without an outbound rule, a DNS query for a bucket in another Region like us-east-2 would resolve to the public IP address of the S3 service. To solve this, we’re going to add four outbound rules to the resolver in us-east-1.

  • us-west-2.amazonaws.com
  • us-west-2.vpce.amazonaws.com
  • us-east-2.amazonaws.com
  • us-east-2.vpce.amazonaws.com
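As a hedged example, this boto3 sketch creates one of these four forwarding rules and associates it with the workload VPC. The outbound endpoint ID and the target IPs (the inbound resolver addresses in us-east-2) are placeholders:

```python
import uuid
import boto3

resolver = boto3.client("route53resolver", region_name="us-east-1")

# Forward queries for the us-east-2 VPC endpoint domain to the inbound
# resolver in the peered us-east-2 VPC. IDs and IPs are placeholders.
rule = resolver.create_resolver_rule(
    CreatorRequestId=str(uuid.uuid4()),
    Name="forward-us-east-2-vpce",
    RuleType="FORWARD",
    DomainName="us-east-2.vpce.amazonaws.com",
    ResolverEndpointId="rslvr-out-0123456789abcdef",
    TargetIps=[
        {"Ip": "10.2.0.10", "Port": 53},
        {"Ip": "10.2.1.10", "Port": 53},
    ],
)

# Associate the rule with the workload VPC in us-east-1.
resolver.associate_resolver_rule(
    ResolverRuleId=rule["ResolverRule"]["Id"],
    VPCId="vpc-0123456789abcdef0",
)
```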

These rules will forward the DNS request to the appropriate inbound Route 53 Resolver in the peered VPC. When there isn’t a VPC endpoint deployed for a service, the resolver will use its automatically created recursive rule to return the public IP address. Let’s look at how this works in Figure 2.

Figure 2. The workflow of resolving an out-of-Region S3 DNS name

  1. The EC2 instance runs a command to list a bucket in us-east-2. The DNS request first goes to the local Route 53 Resolver endpoint in us-east-1.
  2. The Route 53 Resolver in us-east-1 has an outbound rule matching the bucket’s domain name. This forwards all DNS queries for the domain us-east-2.vpce.amazonaws.com to the inbound Route 53 Resolver in us-east-2.
  3. The Route 53 Resolver in us-east-2 responds with the private IP address of the S3 interface VPC endpoint in its VPC. This is then returned to the EC2 instance.
  4. The EC2 instance sends the request to the S3 interface VPC endpoint in us-east-2.

This pattern can be easily extended to support any Region that your organization uses. Add additional VPCs in those Regions to host the Route 53 Resolver endpoints and VPC endpoints. Then, add additional outbound resolver rules for those Regions. You can also support additional AWS services by deploying VPC endpoints for them in each peered VPC that hosts the inbound Route 53 Resolver endpoint.

This architecture can be extended to provide a centralized capability to your entire business instead of supporting a single workload in a VPC. We’ll look at that next.

Scaling cross-Region VPC endpoints with Route 53 Resolver

In Figure 3, each Region has a centralized HTTP proxy fleet. This is located in a dedicated VPC with AWS service VPC endpoints and a Route 53 Resolver endpoint. Each workload VPC in the same Region connects to this VPC over Transit Gateway. All instances send their HTTP traffic to the proxies. The proxies manage resolving domain names and forwarding the traffic to the correct Region. Here, each Route 53 Resolver supports inbound DNS requests from other VPCs. It also has outbound rules to forward requests to the appropriate Region. Let’s walk through how this solution works.

Figure 3. Using Route 53 Resolver endpoints with central HTTP proxies

  1. The EC2 instance in us-east-1 runs a command to list a bucket in us-east-2. The HTTP request is sent to the proxy fleet in the same Region.
  2. The proxy fleet attempts to resolve the domain name of the bucket in us-east-2. The Route 53 Resolver in us-east-1 has an outbound rule for the domain us-east-2.vpce.amazonaws.com. This rule forwards the DNS query to the inbound Route 53 Resolver in us-east-2. The Route 53 Resolver in us-east-2 responds with the private IP address of the S3 interface endpoint in its VPC.
  3. The proxy server sends the request to the S3 interface endpoint in us-east-2 over the Transit Gateway connection. VPC endpoint policies are consistently applied to the request.

This solution (Figure 3) scales the previous implementation (Figure 2) to support multiple workloads across all of the in-use Regions. And it does this without duplicating VPC endpoints in every VPC.

If your environment doesn’t use HTTP proxies, you could alternatively deploy Route 53 Resolver outbound endpoints in each workload VPC. In this case, you have two options. The outbound rules can forward the DNS requests directly to the cross-Region inbound resolver, as in Figure 2. Or, there can be a single outbound rule to forward the DNS requests to a central inbound resolver in the same Region (see Figure 3). The first option reduces dependencies on a centralized service. The second option reduces the management overhead of creating and updating outbound rules.

Conclusion

Customers want a straightforward way to use VPC endpoints and endpoint policies for all Regions uniformly and consistently. Route 53 Resolver provides a solution using DNS. This ensures that requests to AWS services that support VPC endpoints stay within the VPC network, regardless of their Region.

Check out the documentation for Route 53 Resolver to learn more about how you can use DNS to simplify using VPC endpoints in multi-Region architectures.

Introducing Amazon Route 53 Application Recovery Controller

Post Syndicated from Sébastien Stormacq original https://aws.amazon.com/blogs/aws/amazon-route-53-application-recovery-controller/

I am pleased to announce the availability today of Amazon Route 53 Application Recovery Controller, a set of Amazon Route 53 capabilities that continuously monitors an application’s ability to recover from failures and controls application recovery across multiple AWS Availability Zones, AWS Regions, and on-premises environments. It helps you build applications that must deliver very high availability.

At AWS, the security and availability of your data and workloads are our top priorities. From the very beginning, AWS global infrastructure has allowed you to build application architectures that are resilient to different types of failures. When your business or application requires high availability, you typically use AWS global infrastructure to deploy redundant application replicas across AWS Availability Zones inside an AWS Region. Then, you use a Network or Application Load Balancer to route traffic to the appropriate replica. This architecture handles the requirements of the vast majority of workloads.

However, some industries and workloads have higher requirements for availability: availability at or above 99.99% with recovery time objectives (RTO) measured in seconds or minutes. Think about how real-time payment processing or trading engines can affect entire economies if disrupted. To address these requirements, you typically deploy multiple replicas across a variety of AWS Availability Zones, AWS Regions, and on-premises environments. Then, you use Amazon Route 53 to reliably route end users to the appropriate replica.

Amazon Route 53 Application Recovery Controller helps you to build these applications requiring very high availability and low RTO, typically those using active-active architectures, but other types of redundant architectures might also benefit from it. It is made up of two parts: readiness checks and routing controls.

Readiness checks continuously monitor AWS resource configurations, capacity, and network routing policies, and allow you to monitor for any changes that would affect the ability to execute a recovery operation. These checks ensure that the recovery environment is scaled and configured to take over when needed. They check the configuration of Auto Scaling groups, Amazon Elastic Compute Cloud (Amazon EC2) instances, Amazon Elastic Block Store (EBS) volumes, load balancers, Amazon Relational Database Service (RDS) instances, Amazon DynamoDB tables, and several others. For example, readiness checks verify AWS service limits to ensure enough capacity can be deployed in an AWS Region in case of failover. They also verify that the capacity and scaling characteristics of application replicas are the same across AWS Regions.

Routing controls help to rebalance traffic across application replicas during failures, to ensure that the application stays available. Routing controls work with Amazon Route 53 health checks to redirect traffic to an application replica, using DNS resolution. Routing controls improve traditional automated Amazon Route 53 health check-based failovers in three ways:

  • First, routing controls give you a way to fail over the entire application stack based on application metrics or partial failures, such as a 5% increase in error rates or elevated latency.
  • Second, routing controls give you safe and simple manual overrides. You can use them to shift traffic for maintenance purposes or to recover from failures when your monitors fail to detect an issue.
  • Third, routing controls can use a capability called safety rules to prevent common side effects associated with fully automated health checks, such as failing over to an unprepared replica or flapping between replicas.

To help you understand how Route 53 Application Recovery Controller works, I’ll walk you through the process I used to configure my own high availability application.

How It Works
For demo purposes, I built an application made up of a load balancer, an Auto Scaling group with two EC2 instances, and a global DynamoDB table. I wrote a CDK script to deploy the application in two AWS Regions: US East (N. Virginia) and US West (Oregon). The global DynamoDB table ensures data is replicated across the two AWS Regions. This is an active-standby architecture, as I described earlier.

The application is a multi-player TicTacToe game, an application that typically needs 99.99% availability or more :-). One DNS record (tictactoe.seb.go-aws.com) points to the load balancer in the US East (N. Virginia) region. The following diagram shows the architecture for this application:

Example application architecture

Preparing My Application
To configure Route 53 Application Recovery Controller for my application, I first deployed independent replicas of my application stack so that I can fail over traffic across the stacks. These copies are deployed across AWS high-availability boundaries, such as Availability Zones or AWS Regions. I chose to deploy my application replicas across multiple AWS Regions.

Then, I configured data replication across these independent replicas. I’m using DynamoDB global tables to help replicate my data.

Lastly, I configured each independent stack to expose a DNS name. This DNS name is the entry point into my application, such as a regional load balancer DNS name.

Terminology
Before I configure readiness check, let me share some basic terminology.

A cell defines the silo that contains my application’s independent units of failover. It groups all AWS resources that are required for my application to operate independently. For my demo, I have two cells: one per AWS Region where my application is deployed. A cell is typically aligned with AWS high-availability boundaries, such as AWS Regions or Availability Zones, but it can be smaller too. It is possible to have multiple cells in one Availability Zone. This is an effective way to reduce blast radius, especially when you follow one-cell-at-a-time change management practices.

definition of a cell

A recovery group is a collection of cells that represent an application or group of applications that I want to check for failover readiness. A recovery group typically consists of two or more cells that mirror each other in terms of functionality.

definition of a recovery group

A resource set is a set of AWS resources that can span multiple cells. For this demo, I have three resource sets: one for the two load balancers in us-east-1 and us-west-2, one for the two Auto Scaling groups in the two Regions, and one for the global DynamoDB table.

A readiness check validates the readiness of a set of AWS resources to be failed over to. In this example, I want to audit readiness for my load balancers, Auto Scaling groups, and DynamoDB table. I create a readiness check for the Auto Scaling groups. The service constantly monitors the instance types and counts in the groups to make sure that each group is scaled equally. I repeat the process for the load balancers and the global DynamoDB table.

definition of a resource set

To help determine recovery readiness for my application, Route 53 Application Recovery Controller continuously audits mismatches in capacity, AWS resource limits, and AWS throttle limits across application cells (Availability Zones or Regions). When Route 53 Application Recovery Controller detects a mismatch in limits, it raises an AWS Service Quota request for the resource across the cells. If Route 53 Application Recovery Controller detects a capacity mismatch in resources, I can take actions to align capacity across the cells. For example, I could trigger a scaling increase for my Auto Scaling groups.

Create a Readiness Check
To create a readiness check, I open the AWS Management Console and navigate to the Application Recovery Controller section under Route 53.

Create Recovery Group

To create a recovery group for my application, I navigate to the Getting Started section, then I choose Create recovery group.

Create Recovery Group - enter a name

I enter a name (for example AWSNewsBlogDemo) and then choose Next.

Create Recovery readiness - create cells

In Configure Architecture, I choose Add Cell, then I enter a cell name (AWSNewsBlogDemo-RegionWEST) and then choose Add Cell again to add a second cell. I enter AWSNewsBlogDemo-RegionEAST for the second cell. I choose Next to review my inputs, then I choose Create recovery group.

I now need to associate resources such as my load balancers, Auto Scaling groups, and DynamoDB table with my recovery group.

Create Resource Set

In the left navigation pane, I choose Resource Set and then I choose Create.

Create Resource Set - load balancers

I enter a name for my first resource set (for example, load_balancers). For Resource type, I choose Network Load Balancer or Application Load Balancer, and then choose Add to add the load balancer ARN.

I choose Add again to enter the second load balancer ARN, and then I choose Create resource set.

I repeat the process to create one resource set for the two Auto Scaling groups and a third resource set for the global DynamoDB table (one ARN). I now have three resource sets:

Create Resource Set - 3 Resource Sets

My last step is to create the readiness checks. This associates the resource sets with the cells in the recovery group.

Create Readiness Check

In Readiness check, I choose Create at the top right of the screen, then Readiness check.

Create Readiness Check Step 1

Step 1 (Create readiness check), I enter a name (for example, load_balancers). For Resource Type, I choose Network Load Balancer or Application Load Balancer and then choose Next.

Create Readiness Check Step 2

Step 2 (Add resource set), I keep the default selection Use an existing resource set and for Resource set name, I choose load_balancers and then I choose Next.

Step 3 (Apply readiness rules), I review the rules and then choose Next.

Recovery Group Options

Step 4 (Recovery Group Options), I keep the default selection Associate with an existing recovery group. For Recovery group name, I choose AWSNewsBlogDemo. Then, I associate the two cells (EAST and WEST) with the two load balancer ARNs. Be sure to associate the correct load balancer to each cell. The Region name is included in the ARN.

Step 5 (Review and create), I review my choices and then choose Create readiness check.

Three readiness checks

I repeat this process for the Auto Scaling group and the DynamoDB global table.

Recovery Groups in Ready mode

When all readiness checks in the group are green, the group has a status of Ready.

Now, I can configure and test the routing controls.

Terminology
Before I configure routing controls, let me share some basic terminology.

A cluster is a set of five redundant Regional endpoints against which you can execute API calls to update or get the state of routing controls. You can host multiple control panels and routing controls on one cluster.

A routing control is a simple on/off switch, hosted on a cluster, that you use to control routing of client traffic in and out of cells. When you create a routing control, you add a health check in Route 53 so that you can reroute traffic when you update the routing control in Route 53 Application Recovery Controller. The health checks must be associated with DNS failover records that front each application replica if you want to use them to route traffic with routing controls.

A control panel groups together a set of related routing controls.

Configure Routing Controls
I can use the Route 53 console or API actions to create a routing control for each AWS Region for my application. After I create routing controls, I create an Amazon Route 53 Application Recovery Controller health check for each one, and then associate each health check with a DNS failover record for my load balancers in each Region. Then, to fail over traffic between Regions, I change the routing control state for one routing control to off and another routing control state to on.

The first step is to create a cluster. A cluster is charged at $2.50 per hour. When you create a cluster to experiment with Route 53 Application Recovery Controller, be sure to delete it after your experimentation.

Create Cluster

In the left navigation pane, I navigate to the cluster panel and then I choose Create.

Create Cluster - enter name

I enter a name for my cluster and then choose Create cluster.

The cluster is in the Pending state for a few minutes. After a while, its status changes to Deployed.

After it’s deployed, I select the cluster name to discover the five redundant API endpoints. You must specify one of those endpoints when you build recovery tools to retrieve or set routing control states. You can use any of the cluster endpoints, but in complex or automated scenarios, we recommend that your systems be prepared to retry with each of the available endpoints, using a different endpoint with each retry request.

Routing Control Cluster Endpoints
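As a rough illustration of that retry recommendation, here is a minimal Python sketch. The endpoint URLs are placeholders standing in for the values copied from the cluster details page.

import boto3
from botocore.exceptions import ClientError, EndpointConnectionError

# The Regional endpoints returned for the cluster (placeholders; a real
# cluster exposes five of them).
CLUSTER_ENDPOINTS = [
    ("us-east-1", "https://host-aaaa.us-east-1.cluster.routing-control.amazonaws.com/v1"),
    ("us-west-2", "https://host-bbbb.us-west-2.cluster.routing-control.amazonaws.com/v1"),
    ("ap-southeast-2", "https://host-cccc.ap-southeast-2.cluster.routing-control.amazonaws.com/v1"),
]

def set_routing_control(routing_control_arn: str, state: str) -> None:
    """Try each cluster endpoint in turn until one accepts the update."""
    for region, endpoint_url in CLUSTER_ENDPOINTS:
        client = boto3.client(
            "route53-recovery-cluster",
            region_name=region,
            endpoint_url=endpoint_url,
        )
        try:
            client.update_routing_control_state(
                RoutingControlArn=routing_control_arn,
                RoutingControlState=state,  # "On" or "Off"
            )
            return
        except (ClientError, EndpointConnectionError):
            continue  # endpoint unavailable; try the next one
    raise RuntimeError("No cluster endpoint accepted the update")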

Traffic routing is managed through routing controls that are grouped in a control panel. You can create one or use the default one that is created for you.

Default Control Panel

I choose DefaultControlPanel.

Default Control Panel - Add routing control

I choose Add routing control.

Create Routing Control

I enter a name for my routing control (FailToWEST) and then choose Create routing control. I repeat the operation for the second routing control (FailToEAST).

Control Panel - Create Health Check

After the routing control is created, I choose it from the list. On the detail page, I choose Create health check to create a health check in Route 53.

I enter a name for the health check and then choose Create. I navigate to the Route 53 console to verify the health check was correctly created.

I create one health check for each routing control.

You might have noticed that the Control Panel provides a place where you can add Safety Rules. When you work with several routing controls at the same time, you might want some safeguards in place when you enable and disable them. These help you to avoid initiating a failover when a replica is not ready, or unintended consequences like turning both routing controls off and stopping all traffic flow. To create these safeguards, you create safety rules. For more information about safety rules, including usage examples, see the Route 53 Application Recovery Controller developer guide.

Now that the routing controls and the DNS health checks are in place, the last step is to route traffic to my application.

Adjust My DNS Settings
To route traffic to my application, I assign a DNS alias to the top-level entry point of the application in each cell. For this example, using the Route 53 console, I create two alias A records of type FAILOVER and associate each health check with its DNS record. The two records have the same record name: one is the primary record, and the other is the secondary record. For more information about Amazon Route 53 health checks, see the Amazon Route 53 developer guide.

DNS Alias Record Primary DNS Alias Record Secondary
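For reference, a minimal boto3 sketch of creating these two failover alias records might look like the following. The hosted zone ID, health check IDs, load balancer zone IDs, and DNS names are placeholders, not values from the demo.

import boto3

route53 = boto3.client("route53")

route53.change_resource_record_sets(
    HostedZoneId="Z0000000EXAMPLE",
    ChangeBatch={
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "tictactoe.seb.go-aws.com",
                    "Type": "A",
                    "SetIdentifier": "east",
                    "Failover": "PRIMARY",
                    # Route 53 ARC health check driven by the routing control
                    "HealthCheckId": "11111111-2222-3333-4444-555555555555",
                    "AliasTarget": {
                        "HostedZoneId": "Z35SXDOTRQ7X7K",  # placeholder ALB zone ID
                        "DNSName": "my-alb-east.us-east-1.elb.amazonaws.com",
                        "EvaluateTargetHealth": False,
                    },
                },
            },
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "tictactoe.seb.go-aws.com",
                    "Type": "A",
                    "SetIdentifier": "west",
                    "Failover": "SECONDARY",
                    "HealthCheckId": "66666666-7777-8888-9999-000000000000",
                    "AliasTarget": {
                        "HostedZoneId": "Z1H1FL5HABSF5",  # placeholder ALB zone ID
                        "DNSName": "my-alb-west.us-west-2.elb.amazonaws.com",
                        "EvaluateTargetHealth": False,
                    },
                },
            },
        ]
    },
)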

On the application recovery routing controls page, I enable one of the two routing controls.

Application recovery Control - enable one control state

As soon as I do, all the traffic pointed at tictactoe.seb.go-aws.com goes to the infrastructure deployed in us-east-1.

Testing My Setup
To test my setup, I first use the dig command in a terminal. It shows the DNS CNAME record that points to the load balancer deployed in us-east-1.

testing alias for us-east-1

I also test the application with a web browser. I observe the name tictactoe.seb.go-aws.com goes to us-east-1.

Tic Tac Toe application

Now, using the update-routing-control-state API action, the CLI, or the console, I turn off the routing control to the us-east-1 Region and turn on the one to the us-west-2 Region. When I use the CLI, I use the endpoints provided by my cluster.

aws route53-recovery-cluster update-routing-control-state \
     --routing-control-arn arn:aws:route53-recovery-control::012345678:controlpanel/xxx/routingcontrol/abcd \
     --routing-control-state On \
     --region us-west-2 \
     --endpoint-url https://host-xxx.us-west-2.cluster.routing-control.amazonaws.com/v1

In the console, I navigate to the control panel, select the routing control I want to change, and choose Change routing control states.

Changing routing control states

After less than a minute, the DNS address is updated. My application traffic is now routed to the us-west-2 Region.

DNS checked after a routing control state change

Readiness checks and routing controls provide a controlled failover for my application traffic, redirecting traffic from my active replica to my standby one, in another AWS Region. I can change the traffic routing manually, as I showed in the demo, or I can automate it using Amazon CloudWatch alarms based on technical and business metrics for my application.

Pricing
This new capability is charged on demand. There are no upfront costs. You are charged per readiness check and per cluster, per hour. Readiness checks are charged at $0.045 per hour. Clusters are charged at $2.50 per hour. In the demo example used for this blog post, there are three readiness checks and one cluster. The price per hour for this setup, excluding the application itself, is 3 x $0.045 + 1 x $2.50 = $2.635 per hour. For more details about the pricing, including an example, see the Route 53 pricing page.

This new capability is a global service that can be used to monitor and control application recovery for applications running in any of the public commercial AWS Regions. Give it a try and let us know what you think. As always, you can send feedback through your usual AWS Support contacts or post it on the AWS forum for Route 53 Application Recovery Controller.

— seb

Implementing Multi-Region Disaster Recovery Using Event-Driven Architecture

Post Syndicated from Vaibhav Shah original https://aws.amazon.com/blogs/architecture/implementing-multi-region-disaster-recovery-using-event-driven-architecture/

In this blog post, we share a reference architecture that uses a multi-Region active/passive strategy to implement a hot standby strategy for disaster recovery (DR).

We highlight the benefits of performing DR failover using an event-driven, serverless architecture, which provides high reliability, one of the pillars of the AWS Well-Architected Framework.

With the multi-Region active/passive strategy, your workloads operate in primary and secondary Regions with full capacity. The main traffic flows through the primary Region, and the secondary Region acts as a recovery Region in case of a disaster event. This makes your infrastructure more resilient and highly available and allows business continuity with minimal impact on production workloads. This blog post aligns with the Disaster Recovery Series that explains various DR strategies that you can implement based on your goals for recovery time objectives (RTO), recovery point objectives (RPO), and cost.

DR Strategies

Figure 1. DR strategies

Keeping RTO and RPO low

DR allows you to recover from various unforeseen failures that may make a Region unusable, including misconfiguration caused by human error, technical failures, and natural disasters. DR also mitigates the impact of disaster events and improves resiliency, helping you maintain service level agreements with minimal impact on business continuity.

As shown in Figure 2, the multi-Region active/passive strategy switches request traffic from the primary to the secondary Region by updating DNS records through Amazon Route 53 routing policies. This keeps RTO and RPO low.

DR implementation architecture on multi-Region active-passive workloads

Figure 2. DR implementation architecture on multi-Region active/passive workloads

Deploying your multi-Region workload with AWS CodePipeline

In the multi-Region active/passive strategy, your workload runs at full capacity in the primary and secondary AWS Regions, deployed using AWS CloudFormation. With AWS CodePipeline, one deploy stage within the pipeline deploys the stack to the primary Region (Figure 3). After that, the same stack is deployed to the secondary Region.

The workloads in the primary and secondary Regions will be treated as two different environments. However, they will run the same version of your application for consistency and availability in event of a failure.

Deploying new versions to Lambda using CodePipeline and CloudFormation in two Regions

Figure 3. Deploying new versions to Lambda using CodePipeline and CloudFormation in two Regions

Fail over with event-driven serverless architecture

The event-driven serverless architecture performs failover by updating the weights of the Route 53 records. This shifts the traffic flow from the primary to the secondary Region. The operation specifies the source Region (where the failover originates) and the destination Region.

For a given application, there will be two Route 53 records with the same name. The two records will point at two different endpoints for the application deployed in two different Regions.

The record pointing at the endpoint in the primary Region uses a weighted policy with a weight of 100, meaning that all request traffic is served by that endpoint. Similarly, the second record has a weight of 0 and points to the endpoint in the secondary Region, so none of the request traffic reaches it. This process can be repeated for multiple API endpoints. The information about the applications, such as DNS records, endpoints, hosted zone IDs, Regions, and weights, is stored in a DynamoDB table.

Amazon API Gateway then invokes an AWS Lambda function that scans each item in the Amazon DynamoDB table. The Lambda function updates the weighted policy of each Route 53 record and the weight attribute in the DynamoDB table.

The API Gateway, Lambda function, and global DynamoDB table are all deployed in both the primary and secondary Regions.
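As a rough illustration, here is a minimal sketch of such a failover Lambda function. The table name (dr-application-endpoints), its key schema, and its attribute names are hypothetical; the post does not specify them.

import boto3

dynamodb = boto3.resource("dynamodb")
route53 = boto3.client("route53")

# Hypothetical table: one item per weighted record, carrying the record
# name, endpoint, hosted zone ID, Region, and current weight.
table = dynamodb.Table("dr-application-endpoints")

def lambda_handler(event, context):
    """Invert the weight of every weighted record to shift all traffic."""
    for item in table.scan()["Items"]:
        new_weight = 0 if int(item["weight"]) == 100 else 100
        route53.change_resource_record_sets(
            HostedZoneId=item["hosted_zone_id"],
            ChangeBatch={
                "Changes": [{
                    "Action": "UPSERT",
                    "ResourceRecordSet": {
                        "Name": item["record_name"],
                        "Type": "CNAME",
                        "SetIdentifier": item["region"],
                        "Weight": new_weight,
                        "TTL": 60,
                        "ResourceRecords": [{"Value": item["endpoint"]}],
                    },
                }]
            },
        )
        # Keep the table in sync with what Route 53 now serves.
        table.update_item(
            Key={"record_name": item["record_name"], "region": item["region"]},
            UpdateExpression="SET weight = :w",
            ExpressionAttributeValues={":w": new_weight},
        )
    return {"statusCode": 200}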

Ensuring workload availability after a disaster

In the event of a disaster, the data in the affected Region must be available in the recovery Region. In this section, we discuss failover of databases such as DynamoDB tables and Amazon Relational Database Service (Amazon RDS) databases.

Global DynamoDB tables

With global tables, two replica tables coexist in two different Regions; any changes made to the table in the primary Region are replicated to the secondary Region and vice versa. Once failover occurs, request traffic moves through the recovery Region and connects to its databases, so your data and workloads remain available.

Amazon RDS database

Amazon RDS does not replicate data across Regions the way DynamoDB global tables do. However, Amazon Aurora offers cross-Region replication to read replicas, and Amazon RDS supports cross-Region backup replication.

To use backup replication, you’ll create an RDS database in the primary Region and enable backup replication in the configuration. In case of a DR event, you can restore the replicated backup to an Amazon RDS instance in the destination Region.

Approach

Initiating DR

The DR process can be initiated manually or automatically based on certain metrics, such as status checks and error rates. If the established thresholds for these metrics are breached, it signifies the workloads in the primary Region are failing.

You can initiate the DR process automatically by invoking an API call that can initiate backend automation. This allows you to measure how resilient and reliable your workload is and how quickly you can switch traffic to another Region if a real disaster happens.

Failover

The failover process is initiated by API Gateway invoking a Lambda function, as shown on the left side of Figure 2. The Lambda function then performs failover by updating the weights in the Route 53 records and the corresponding items in the DynamoDB table. Similar steps can be performed to fail over database endpoints.

Once failover is complete, you’ll want to monitor traffic using Amazon CloudWatch. The List the available CloudWatch metrics for your instances user guide provides common metrics for you to monitor for Amazon Elastic Compute Cloud (Amazon EC2). The Amazon ECS CloudWatch metrics user guide provides common metrics to monitor for Amazon Elastic Container Service (Amazon ECS).

Failback

Once failover is successful and you’ve proven that traffic is being successfully routed to the new Region, you’ll fail back to the primary Region. Monitor the same metrics in the secondary Region as you did in the Failover section.

Testing and results

Regularly test your DR process and evaluate the results and metrics. Based on the success and misses, you can make nearly continuous improvements in the DR process.

The critical applications within your organization will likely change over time, so it is important to evaluate which applications are mission critical and require an active/passive DR strategy. If an application is no longer mission critical, another DR strategy may be more appropriate.

Conclusion

The multi-Region active/passive strategy is one way to implement DR for applications hosted on AWS. It can fail over several applications in a short period of time by using serverless capabilities.

By using this strategy, your applications will be highly available and resilient to Regional failures. It supports business continuity, limits revenue losses, and your customers will see minimal impact on performance efficiency, one of the pillars of the AWS Well-Architected Framework. With this strategy, you can significantly reduce DR time, trading higher costs for lower RTO and RPO on critical applications.

Disaster Recovery (DR) Architecture on AWS, Part IV: Multi-site Active/Active

Post Syndicated from Seth Eliot original https://aws.amazon.com/blogs/architecture/disaster-recovery-dr-architecture-on-aws-part-iv-multi-site-active-active/

In my first blog post of this series, I introduced you to four strategies for disaster recovery (DR). My subsequent posts shared details on the backup and restore, pilot light, and warm standby active/passive strategies.

In this post, you’ll learn how to implement an active/active strategy to run your workload and serve requests in two or more distinct sites. Like other DR strategies, this enables your workload to remain available despite disaster events such as natural disasters, technical failures, or human actions.

DR strategies: Multi-site active/active

As we know from our now familiar DR strategies diagram (Figure 1), the multi-site active/active strategy will give you the lowest RTO (recovery time objective) and RPO (recovery point objective). However, this must be weighed against the potential cost and complexity of operating active stacks in multiple sites.

DR strategies

Figure 1. DR strategies

Implementing multi-site active/active

The architecture in Figure 2 shows you how to use AWS Regions as your active sites, creating a multi-Region active/active architecture. Only two Regions are shown, which is common, but more may be used. Each Region hosts a highly available, multi-Availability Zone (AZ) workload stack. In each Region, data is replicated live between the data stores and also backed up. This protects against disasters that include data deletion or corruption, since the data backup can be restored to the last known good state.

Multi-site active/active DR strategy

Figure 2. Multi-site active/active DR strategy

Traffic routing

Each regional stack serves production traffic. How you implement traffic routing determines which Region will receive a given request. Figure 2 shows Amazon Route 53, a highly available and scalable cloud Domain Name System (DNS), used for routing. Route 53 offers multiple routing policies. For example, the geolocation or latency routing policies are good choices for active/active deployments. For geolocation routing, you configure which Region a request goes to based on the origin location of the request. For latency routing, AWS automatically sends requests to the Region that provides the shortest round-trip time.

Your data governance strategy helps inform which routing policy to use. Geolocation routing lets you distribute requests in a deterministic way. This allows you to keep data for certain users within a specific Region, or you can control where write operations are routed to prevent contention. If optimizing for performance is your top priority, then latency routing is a good choice.
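As an illustration, latency records for two Regions could be created with a call along these lines. The hosted zone ID, record name, and endpoint names are placeholders.

import boto3

route53 = boto3.client("route53")

# Two latency records with the same name; Route 53 answers each query with
# the Region offering the lowest round-trip time for that client.
changes = [
    {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": "app.example.com",
            "Type": "CNAME",
            "SetIdentifier": region,
            "Region": region,
            "TTL": 60,
            "ResourceRecords": [{"Value": endpoint}],
        },
    }
    for region, endpoint in [
        ("us-east-1", "app-east.us-east-1.elb.amazonaws.com"),
        ("eu-west-1", "app-west.eu-west-1.elb.amazonaws.com"),
    ]
]
route53.change_resource_record_sets(
    HostedZoneId="Z0000000EXAMPLE",
    ChangeBatch={"Changes": changes},
)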

Read/write patterns

Read local/write local pattern

The Region to which a request is routed is called the “local Region” for that request. To maintain low latencies and reduce the potential for network error, serve all read and write requests from the local Region of your multi-Region active/active architecture.

I use Amazon DynamoDB for the example architecture in Figure 2. DynamoDB global tables replicate a table to multiple Regions. Writes to the table in any Region are replicated to other Regions within a second. This makes it a good choice when using the read local/write local pattern. However, there is the possibility of write contention if updates are made to the same item in different Regions at about the same time. To help ensure eventual consistency, DynamoDB global tables use a last writer wins reconciliation between concurrent updates. In this case, the data written by the first writer is lost. If your application cannot handle this and you require strong consistency, use another write pattern to avoid write contention.
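For reference, a table created in one Region can be extended into a global table by adding a replica. A minimal boto3 sketch, assuming a hypothetical table named game-state that uses the 2019.11.21 (current) global tables version:

import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Adding a replica in us-west-2 turns the existing table into a global table.
dynamodb.update_table(
    TableName="game-state",  # hypothetical table name
    ReplicaUpdates=[{"Create": {"RegionName": "us-west-2"}}],
)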

Read local/write global pattern

With a write global pattern, you choose a Region to be the global write Region and only accept writes in that Region. DynamoDB global tables are still an excellent choice for replicating data globally; however, you must ensure that locally received write requests are re-directed to the global write Region.

Amazon Aurora is another good choice. When deployed as an Aurora global database, a primary cluster is deployed to your global write Region, and read-only instances (Aurora Replicas) are deployed to other AWS Regions. Data is replicated to these read-only instances with typical latency of under a second. Aurora global database write forwarding (available using Aurora MySQL-Compatible Edition) allows Aurora Replicas in the secondary cluster to forward write operations to the primary cluster in the global write Region. This way, you can treat the read-only replicas in all your Regions as if they were read/write capable. Using write forwarding, the request travels over the AWS network and not the public internet, reducing latency.

Amazon ElastiCache for Redis also can replicate data across Regions. For example, to store session data, you write to your global write Region and use Global Datastore to ensure that this data is available to be read from other Regions.

Read local/write partitioned pattern

For write-heavy workloads with users located around the world, your application may not be suited to incur the round trip to the global write Region with every write. Consider using a write partitioned pattern to mitigate this. With this pattern, each item or record is assigned a home Region. This can be done based on the Region it was first written to. Or it can be based on a partition key in the record (such as user ID) by pre-assigning a home Region for each value of this key. As shown in Figure 3, records for this user are assigned to the left AWS Region as their home Region. The goal is to try to map records to a home Region close to where most write requests will originate.

Read local/write partitioned pattern for multi-site active/active DR strategy

Figure 3. Read local/write partitioned pattern for multi-site active/active DR strategy

When the user in Figure 3 travels away from home, they will read local, but writes will be routed back to their home Region. Usually writes will not incur long round trips as they are expected to typically come from near the home Region. Since writes are accepted in all Regions (for records homed to that respective Region), DynamoDB global tables, which accept writes in all Regions, are a good choice here also.

Failover

With a multi-Region active/active strategy, if your workload cannot operate in a Region, failover will route traffic away from the impacted Region to healthy Region(s). You can accomplish this with Route 53 by updating the DNS records. Make sure you set TTL (time to live) on these records low enough so that DNS resolvers will reflect your changes quickly enough to meet your RTO targets. Alternatively, you can use AWS Global Accelerator for routing and failover. It does not rely on DNS. Global Accelerator gives you two static IP addresses. You then configure which Regions user traffic goes to based on traffic dials and weights you set.
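As a sketch of the Global Accelerator option, shifting traffic away from an impacted Region amounts to updating the traffic dials on the endpoint groups. The endpoint group ARNs below are placeholders.

import boto3

# The Global Accelerator API is served from us-west-2.
ga = boto3.client("globalaccelerator", region_name="us-west-2")

# Dial the impacted Region down to 0% and let the healthy Region take 100%.
ga.update_endpoint_group(
    EndpointGroupArn="arn:aws:globalaccelerator::111122223333:accelerator/abcd/listener/ef01/endpoint-group/impacted",
    TrafficDialPercentage=0.0,
)
ga.update_endpoint_group(
    EndpointGroupArn="arn:aws:globalaccelerator::111122223333:accelerator/abcd/listener/ef01/endpoint-group/healthy",
    TrafficDialPercentage=100.0,
)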

If you’re using a write global pattern and the impacted Region is the global write Region, then a new Region needs to be promoted to be the new global write Region. If you’re using a write partitioned pattern, your workload must repartition so that the records homed in the impacted Region are assigned to one of the remaining Regions. Using write local, all Regions can accept writes. With no changes needed to the data storage layer, this pattern can have the fastest (near zero) RTO.

Conclusion

Consider the multi-site active/active strategy for your workload if you need DR with the quickest recovery time (lowest RTO) and least data loss (lowest RPO). Implementing it across Regions (multi-Region) is a good option if you are looking for the most separation and complete independence of your sites, or if you need to provide low latency access to the workload from users in globally diverse locations.

Also consider the trade-offs. Implementing and operating this strategy, particularly using multi-Region, can be more complicated and more expensive than other DR strategies. When implementing multi-Region active/active in AWS, you can choose the routing policy and the read/write pattern that are right for your workload.

Complying with DMARC across multiple accounts using Amazon SES

Post Syndicated from Brendan Paul original https://aws.amazon.com/blogs/messaging-and-targeting/complying-with-dmarc-across-multiple-accounts-using-amazon-ses/

Introduction

For enterprises of all sizes, email is a critical piece of infrastructure that supports large volumes of communication from an organization. As such, companies need a robust solution to deal with the complexities this may introduce. In some cases, companies have multiple domains that support several different business units and need a distributed way of managing email sending for those domains. For example, you might want different business units to have the ability to send emails from subdomains, or give a marketing company the ability to send emails on your behalf. Amazon Simple Email Service (Amazon SES) is a cost-effective, flexible, and scalable email service that enables developers to send mail from any application. One of the benefits of Amazon SES is that you can configure Amazon SES to authorize other users to send emails from addresses or domains that you own (your identities) using their own AWS accounts. When allowing other accounts to send emails from your domain, it is important to ensure this is done securely. Amazon SES allows you to send emails to your users using popular authentication methods such as DMARC. In this blog, we walk you through 1/ how to comply with DMARC when using Amazon SES and 2/ how to enable other AWS accounts to send authenticated emails from your domain.

DMARC: what is it, why is it important?

DMARC stands for “Domain-based Message Authentication, Reporting & Conformance”, and it is an email authentication protocol (DMARC.org). DMARC gives domain owners and email senders a way to protect their domain from being used by malicious actors in phishing or spoofing attacks. Email spoofing can be used as a way to compromise users’ financial or personal information by taking advantage of their trust of well-known brands. DMARC makes it easier for senders and recipients to determine whether or not an email was actually sent by the domain that it claims to have been sent by.

Solution Overview

In this solution, you will learn how to set up DKIM signing on Amazon SES, implement a DMARC Policy, and enable other accounts in your organization to send emails from your domain using Sending Authorization. When you set up DKIM signing, Amazon SES will attach a digital signature to all outgoing messages, allowing recipients to verify that the email came from your domain. You will then set your DMARC Policy, which tells an email receiver what to do if an email is not authenticated. Lastly, you will set up Sending Authorization so that other AWS accounts can send authenticated emails from your domain.

Prerequisites

In order to complete the example illustrated in this blog post, you will need to have:

  1. A domain in an Amazon Route 53 Hosted Zone or with a third-party provider. Note: You will need to add/update records for the domain. For this blog, we will be using Route 53.
  2. An AWS Organization
  3. A second AWS account, in a different AWS Organizations OU, from which to send Amazon SES emails. If you have not worked with AWS Organizations before, review the Organizations Getting Started Guide

How to comply with DMARC (DKIM and SPF) in Amazon SES

In order to comply with DMARC, you must authenticate your messages with either DKIM (DomainKeys Identified Mail), SPF (Sender Policy Framework), or both. DKIM allows you to send email messages with a cryptographic key, which enables email providers to determine whether or not the email is authentic. SPF defines what servers are allowed to send emails for their domain. To use SPF for DMARC compliance you need to set up a custom MAIL FROM domain in Amazon SES. To authenticate your emails with DKIM in Amazon SES, you have the option of using Easy DKIM with a verified sending identity or bringing your own DKIM key pair (BYODKIM).

In this blog, you will be setting up a sending identity.

Setting up DKIM Signing in Amazon SES

  1. Navigate to the Amazon SES Console 
  2. Select Verify a New Domain and enter the name of your domain
  3. Select Generate DKIM Settings
  4. Choose Verify This Domain
    1. This will generate the DNS records needed to complete domain verification, DKIM signing, and routing incoming mail.
    2. Note: When you initiate domain verification using the Amazon SES console or API, Amazon SES gives you the name and value to use for the TXT record. Add a TXT record to your domain’s DNS server using the specified Name and Value. Amazon SES domain verification is complete when Amazon SES detects the existence of the TXT record in your domain’s DNS settings.
  5. If you are using Route 53 as your DNS provider, choose the Use Route 53 button to update the DNS records automatically
    1. If you are not using Route 53, go to your third-party provider and add the TXT record to verify the domain as well as the three CNAME records to enable DKIM signing. You can also add the MX record at the end to route incoming mail to Amazon SES.
    2. A list of common DNS Providers and instructions on how to update the DNS records can be found in the Amazon SES documentation
  6. Choose Create Record Sets if you are using Route 53 as shown below, or choose Close after you have added the necessary records to your third-party DNS provider.

Note: in the case that you previously verified a domain, but did NOT generate the DKIM settings for your domain, follow the steps below. Skip these steps if this is not the case:

  1. Go to the Amazon SES Console, and select your domain
  2. Select the DKIM dropdown
  3. Choose Generate DKIM Settings and copy the three values in the record set shown
    1. You may also download the record set as a CSV file
  4. Navigate to the Route 53 console or your third-party DNS provider. Instructions on how to update the DNS records in your third-party provider can be found in the Amazon SES documentation
  5. Select the domain you are using
  6. Choose Create Record

  1. Enter the values that Amazon SES has generated for you, and add the three CNAME records to your domain
  2. Wait a few minutes, and go back to your domain in the Amazon SES Console
  3. Check that the DKIM status is verified

You also want to set up a custom MAIL FROM domain that you will use later on. To do so, follow the steps in the documentation.

Setting up a DMARC policy on your domain

DMARC policies are TXT records you place in DNS to define what happens to incoming emails that don’t align with the validations provided when setting up DKIM and SPF. With this policy, you can choose to allow the email to pass through, quarantine the email into a folder like junk or spam, or reject the email.

As a best practice, you should start with a DMARC policy that doesn’t reject all email traffic and collect reports on emails that don’t align to determine if they should be allowed. You can also set a percentage on the DMARC policy to perform filtering on a subset of emails to, for example, quarantine only 50% of the emails that don’t align. Once you are in a state where you can begin to reject non-compliant emails, flip the policy to reject failed authentications. When you set the DMARC policy for your domain, any subdomains that are authorized to send on behalf of your domain will inherit this policy and the same rule will apply. For more information on setting up a DMARC policy, see our documentation.

In a scenario where you have multiple subdomains sending emails, you should be setting the DMARC policy for the organizational domain that you own. For example, if you own the domain example.com and also want to use the sub-domain sender.example.com to send emails you can set the organizational DMARC policy (as a DNS TXT record) to:

Name: _dmarc.example.com
Type: TXT
Value: "v=DMARC1;p=quarantine;pct=50;rua=mailto:[email protected]"

This DMARC policy states that 50% of emails coming from example.com that fail authentication should be quarantined, and that a report of those failures should be sent to [email protected]. For your sender.example.com sub-domain, this policy is inherited unless you specify another DMARC policy for the sub-domain. If you want to be stricter on the sub-domain, you could add another DMARC policy like the one in the following record.

Name: _dmarc.sender.example.com
Type: TXT
Value: "v=DMARC1;p=reject;pct=100;rua=mailto:[email protected];ruf=mailto:[email protected]"

This policy would apply to emails coming from sender.example.com and would reject any email that fails authentication. It would also send aggregate feedback to [email protected] and detailed message-specific failure information to [email protected] for further analysis.
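If your domain is hosted in Route 53, a policy record like the ones above can be published with a call along these lines. The hosted zone ID and report address are placeholders; note that Route 53 requires TXT values to be wrapped in double quotes.

import boto3

route53 = boto3.client("route53")

# Publish the organizational DMARC policy as a TXT record.
route53.change_resource_record_sets(
    HostedZoneId="Z0000000EXAMPLE",
    ChangeBatch={
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "_dmarc.example.com",
                "Type": "TXT",
                "TTL": 300,
                "ResourceRecords": [
                    {"Value": '"v=DMARC1;p=quarantine;pct=50;rua=mailto:dmarc-reports@example.com"'}
                ],
            },
        }]
    },
)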

Sending Authorization in Amazon SES – Allowing Other Accounts to Send Authenticated Emails

Now that you have configured Amazon SES to comply with DMARC in the account that owns your identity, you may want to allow other accounts in your organization the ability to send emails in the same way. Using Sending Authorization, you can authorize other users or accounts to send emails from identities that you own and manage. An example of where this could be useful is an organization with multiple business units. Using sending authorization, a business unit’s application could send emails to its customers from the top-level domain. This application would be able to leverage the authentication settings of the identity owner without additional configuration. Another advantage is that if the business unit has its own subdomain, the top-level domain’s DKIM settings can apply to this subdomain, as long as you are using Easy DKIM in Amazon SES and have not set up Easy DKIM for the specific subdomains.

Setting up sending authorization across accounts

Before you set up sending authorization, note that working across multiple accounts can impact bounces, complaints, pricing, and quotas in Amazon SES. Amazon SES documentation provides a good understanding of the impacts when using multiple accounts. Specifically, delegated senders are responsible for bounces and complaints and can set up notifications to monitor such activities. These also count against the delegated senders account quotas. To set up Sending Authorization across accounts:

  1. Navigate to the Amazon SES Console from the account that owns the Domain
  2. Select Domains under Identity Management
  3. Select the domain that you want to set up sending authorization with
  4. Select View Details
  5. Expand Identity Policies and Click Create Policy
  6. You can either create a policy using the policy generator or create a custom policy. For the purposes of this blog, you will create a custom policy.
  7. For the custom policy, you will allow a particular Organizational Unit (OU) from your AWS Organization access to your domain. You can also limit access to particular accounts or other IAM principals. Use the following policy to allow a particular OU to access the domain:

{
  "Version": "2012-10-17",
  "Id": "AuthPolicy",
  "Statement": [
    {
      "Sid": "AuthorizeOU",
      "Effect": "Allow",
      "Principal": "*",
      "Action": [
        "SES:SendEmail",
        "SES:SendRawEmail"
      ],
      "Resource": "<Arn of Verified Domain>",
      "Condition": {
        "ForAnyValue:StringLike": {
          "aws:PrincipalOrgPaths": "<Organization Id>/<Root OU Id>/<Organizational Unit Id>"
        }
      }
    }
  ]
}

8. Make sure to replace the placeholder values with your verified domain ARN and the Organizations path of the OU you want to limit access to.

You can find more policy examples in the documentation. Note that you can configure sending authorization such that all accounts under your AWS Organization are authorized to send via a certain subdomain.
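If you prefer to attach the policy programmatically, a minimal boto3 sketch might look like the following. The identity name, resource ARN, and Organizations path are placeholders.

import json
import boto3

ses = boto3.client("ses")

# The same authorization policy shown above, with placeholder values.
policy = {
    "Version": "2012-10-17",
    "Id": "AuthPolicy",
    "Statement": [{
        "Sid": "AuthorizeOU",
        "Effect": "Allow",
        "Principal": "*",
        "Action": ["SES:SendEmail", "SES:SendRawEmail"],
        "Resource": "arn:aws:ses:us-east-1:111122223333:identity/example.com",
        "Condition": {
            "ForAnyValue:StringLike": {
                "aws:PrincipalOrgPaths": "o-exampleorgid/r-exampleroot/ou-exampleou/"
            }
        },
    }],
}

# Attach the policy to the verified domain identity.
ses.put_identity_policy(
    Identity="example.com",
    PolicyName="AuthPolicy",
    Policy=json.dumps(policy),
)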

Testing

You can now test the ability to send emails from your domain in a different AWS account. You will do this by creating a Lambda function to send a test email. Before you create the Lambda function, you will need to create an IAM role for the Lambda function to use.

Creating the IAM Role:

  1. Log in to your separate AWS account
  2. Navigate to the IAM Management Console
  3. Select Roles and choose Create role
  4. Under Choose a use case, select Lambda
  5. Choose Next: Permissions
  6. In the search bar, type SES and select the check box next to AmazonSESFullAccess
  7. Choose Next: Tags, then Next: Review
  8. Give the role a name of your choosing, and choose Create role

Navigate to Lambda Console

  1. Select Create Function
  2. Choose the box marked Author from Scratch
  3. Give the function a name of your choosing (Ex: TestSESfunction)
  4. In this demo, you will be using Python 3.8 runtime, but feel free to modify to your language of choice
  5. Select the Change default execution role dropdown, and choose the Use an existing role radio button
  6. Under Existing Role, choose the role that you created in the previous step, and create the function

Edit the function

  1. Navigate to the Function Code portion of the page and open the function’s Python file
  2. Replace the default code with the code shown below, ensuring that you put your own values in based on your resources
  3. Values needed:
    1. Test Email Address: an email address you have access to
      1. NOTE: If you are still operating in the Amazon SES Sandbox, this will need to be a verified email in Amazon SES. To verify an email in Amazon SES, follow the process here. Alternatively, here is how you can move out of the Amazon SES Sandbox
    2. SourceArn: The arn of your domain. This can be found in Amazon SES Console → Domains → <YourDomain> → Identity ARN
    3. ReturnPathArn: The same as your Source ARN
    4. Source: This should be your Mail FROM Domain @ your domain
      1. Your Mail FROM Domain can be found under Domains → <YourDomain> → Mail FROM Domain dropdown
      2. Ex: [email protected]
    5. Use the following function code for this example

import json
import boto3
from botocore.exceptions import ClientError

client = boto3.client('ses')
def lambda_handler(event, context):
    # Try to send the email.
    try:
        #Provide the contents of the email.
        response = client.send_email(
            Destination={
                'ToAddresses': [
                    '<[email protected]>',
                ],
            },
            Message={
                'Body': {
                    'Html': {
                        'Charset': 'UTF-8',
                        'Data': 'This email was sent with Amazon SES.',
                    },
                },
                'Subject': {
                    'Charset': 'UTF-8',
                    'Data': 'Amazon SES Test',
                },
            },
            SourceArn='<your-ses-identity-ARN>',
            ReturnPathArn='<your-ses-identity-ARN>',
            Source='<[email protected]>',
        )
    # Display an error if something goes wrong.
    except ClientError as e:
        print(e.response['Error']['Message'])
    else:
        print("Email sent! Message ID:"),
        print(response['ResponseMetadata']['RequestId'])

  1. Once you have replaced the appropriate values, choose the Deploy button to deploy your changes

Run a Test invocation

  1. After you have deployed your changes, select the “Test” Panel above your function code

  1. You can leave all of these keys and values as default, as the function does not use any event parameters
  2. Choose the Invoke button in the top right corner
  3. You should see the result of the successful invocation above the test event window

Verifying that the Email has been signed properly

Depending on your email provider, you may be able to check the DKIM signature directly in the application. As an example, for Outlook, right click on the message, and choose View Source from the menu. You should see a line that shows the Authentication Results and whether or not the DKIM/SPF checks passed. For Gmail, go to your inbox in the Gmail web app. Choose the message you wish to inspect, and choose the More icon. Choose View Original from the drop-down menu. You should then see the SPF and DKIM “PASS” results.

Cleanup

To clean up the resources in your account:

  1. Navigate to the Route53 Console
  2. Select the Hosted Zone you have been working with
  3. Select the CNAME, TXT, and MX records that you created earlier in this blog and delete them
  4. Navigate to the SES Console
  5. Select Domains
  6. Select the Domain that you have been working with
  7. Choose the Identity Policies drop-down and delete the policy that you created in this blog
  8. If you verified a domain for the sake of this blog: navigate to the Domains tab, select the domain and select Remove
  9. Navigate to the Lambda Console
  10. Select Functions
  11. Select the function that you created in this exercise
  12. Select Actions and delete the function

Conclusion

In this blog post, we demonstrated how to delegate sending and management of your sub-domains to other AWS accounts while also complying with DMARC when using Amazon SES. In order to do this, you set up a sending identity so that Amazon SES automatically adds a DKIM signature to your messages. Additionally, you created a custom MAIL FROM domain to comply with SPF. Lastly, you authorized another AWS account to send emails from a sub-domain managed in a different account, and tested this using a Lambda function. Allowing other accounts the ability to manage and send email from your sub-domains provides flexibility and scalability for your organization without compromising on security.

Now that you have set up DMARC authentication for multiple accounts in your environment, head to the AWS Messaging & Targeting Blog to see examples of how you can combine Amazon SES with other AWS Services!

If you have more questions about Amazon Simple Email Service, check out our FAQs or our Developer Guide.

If you have feedback about this post, submit comments in the Comments section below.

Using Route 53 Private Hosted Zones for Cross-account Multi-region Architectures

Post Syndicated from Anandprasanna Gaitonde original https://aws.amazon.com/blogs/architecture/using-route-53-private-hosted-zones-for-cross-account-multi-region-architectures/

This post was co-written by Anandprasanna Gaitonde, AWS Solutions Architect and John Bickle, Senior Technical Account Manager, AWS Enterprise Support

Introduction

Many AWS customers have internal business applications spread over multiple AWS accounts and on-premises to support different business units. In such environments, you may find a consistent view of DNS records and domain names between on-premises and different AWS accounts useful. Route 53 Private Hosted Zones (PHZs) and Resolver endpoints on AWS form an architectural best practice for centralized DNS in hybrid cloud environments. Your business units gain the flexibility and autonomy to manage the hosted zones for their applications and support multi-Region application environments for disaster recovery (DR) purposes.

This blog presents an architecture that provides a unified view of the DNS while allowing different AWS accounts to manage subdomains. It utilizes PHZs with overlapping namespaces and cross-account multi-region VPC association for PHZs to create an efficient, scalable, and highly available architecture for DNS.

Architecture Overview

You can set up a multi-account environment using services such as AWS Control Tower to host applications and workloads from different business units in separate AWS accounts. However, these applications have to conform to a naming scheme based on organization policies, which also simplifies management of the DNS hierarchy. As a best practice, integration with on-premises DNS is done by configuring Amazon Route 53 Resolver endpoints in a shared networking account. The following is an example of this architecture.

Route 53 PHZs and Resolver Endpoints

Figure 1 – Architecture Diagram

The customer in this example has on-premises applications under the customer.local domain. Applications hosted in AWS use subdomain delegation to aws.customer.local. The example here shows three applications that belong to three different teams, and those environments are located in their separate AWS accounts to allow for autonomy and flexibility. This architecture pattern follows the option of the “Multi-Account Decentralized” model as described in the whitepaper Hybrid Cloud DNS options for Amazon VPC.

This architecture involves three key components:

1. PHZ configuration: PHZ for the subdomain aws.customer.local is created in the shared Networking account. This is to support centralized management of PHZ for ancillary applications where teams don’t want individual control (Item 1a in Figure 1). However, for the key business applications, each of the teams or business units creates its own PHZ. For example, app1.aws.customer.local – Application1 in Account A, app2.aws.customer.local – Application2 in Account B, app3.aws.customer.local – Application3 in Account C (Item 1b in Figure 1). Application1 is a critical business application and has stringent DR requirements. A DR environment of this application is also created in us-west-2.

For a consistent view of DNS and efficient DNS query routing between the AWS accounts and on-premises, the best practice is to associate all the PHZs with the Networking Account. PHZs created in Accounts A, B, and C are associated with the VPC in the Networking Account by using cross-account association of Private Hosted Zones with VPCs. This creates overlapping domains from multiple PHZs for the VPCs of the networking account. It also overlaps with the parent sub-domain PHZ (aws.customer.local) in the Networking account. In cases where there are two or more PHZs with overlapping namespaces, the Route 53 Resolver routes traffic based on the most specific match, as described in the Developer Guide.

2. Route 53 Resolver endpoints for on-premises integration (Item 2 in Figure 1): The networking account is used to set up the integration with on-premises DNS using Route 53 Resolver endpoints, as shown in Resolving DNS queries between VPC and your network. Inbound and outbound Route 53 Resolver endpoints are created in the VPC in us-east-1 to serve as the integration between on-premises DNS and AWS. The DNS traffic between on-premises and AWS requires an AWS Site-to-Site VPN connection or an AWS Direct Connect connection to carry DNS and application traffic. For each Resolver endpoint, two or more IP addresses can be specified to map to different Availability Zones (AZs). This helps create a highly available architecture.

3. Route 53 Resolver rules (Item 3 in Figure): Forwarding rules are created only in the networking account to route DNS queries for on-premises domains (customer.local) to the on-premises DNS server. AWS Resource Access Manager (RAM) is used to share the rules to accounts A, B and C as mentioned in the section "Sharing forwarding rules with other AWS accounts and using shared rules" in the documentation. Account owners can now associate these shared rules with their VPCs the same way that they associate rules created in their own AWS accounts. If you share the rule with another AWS account, you also indirectly share the outbound endpoint that you specify in the rule, as described in the section "Considerations when creating inbound and outbound endpoints" in the documentation. This means you can use one outbound endpoint in a Region to forward DNS queries to your on-premises network from multiple VPCs, even if the VPCs were created in different AWS accounts. Resolver forwards DNS queries for the domain name that's specified in the rule to the outbound endpoint, which in turn forwards them to the on-premises DNS servers. The rules are created in both Regions in this architecture.
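
For reference, here is a minimal AWS CLI sketch of the three components above. It is illustrative only: the hosted zone ID, VPC, subnet, security group, endpoint, rule, and account identifiers are all placeholder assumptions.

# Component 1: authorize and create the cross-account PHZ association.
# Run the first command in the application account that owns the PHZ,
# and the second in the Networking account.
aws route53 create-vpc-association-authorization \
    --hosted-zone-id Z0123456789EXAMPLE \
    --vpc VPCRegion=us-east-1,VPCId=vpc-0net1234

aws route53 associate-vpc-with-hosted-zone \
    --hosted-zone-id Z0123456789EXAMPLE \
    --vpc VPCRegion=us-east-1,VPCId=vpc-0net1234

# Component 2: outbound Resolver endpoint in the Networking account,
# with IP addresses in two Availability Zones for high availability.
aws route53resolver create-resolver-endpoint \
    --creator-request-id outbound-ep-001 \
    --name onprem-outbound \
    --direction OUTBOUND \
    --security-group-ids sg-0abc1234 \
    --ip-addresses SubnetId=subnet-0aaa1111 SubnetId=subnet-0bbb2222

# Component 3: forwarding rule for the on-premises domain, shared
# to the application accounts with AWS RAM.
aws route53resolver create-resolver-rule \
    --creator-request-id fwd-rule-001 \
    --name customer-local \
    --rule-type FORWARD \
    --domain-name customer.local \
    --resolver-endpoint-id rslvr-out-example \
    --target-ips Ip=10.10.0.10,Port=53 Ip=10.10.0.11,Port=53

aws ram create-resource-share \
    --name customer-local-rule \
    --resource-arns arn:aws:route53resolver:us-east-1:111111111111:resolver-rule/rslvr-rr-example \
    --principals 222222222222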

This architecture provides the following benefits:

  1. Resilient and scalable
  2. Uses the VPC+2 endpoint, local caching and Availability Zone (AZ) isolation
  3. Minimal forwarding hops
  4. Lower cost: optimal use of Resolver endpoints and forwarding rules

To handle DR, here are some other considerations:

  • For app1.aws.customer.local, the same PHZ is associated with VPC in us-west-2 region. While VPCs are regional, the PHZ is a global construct. The same PHZ is accessible from VPCs in different regions.
  • Failover routing policy is set up in the PHZ and failover records are created. However, Route 53 health checkers (being outside of the VPC) require a public IP for your applications. Because these business applications are internal to the organization, a metric-based health check with Amazon CloudWatch can be configured instead, as mentioned in Configuring failover in a private hosted zone (a CLI sketch of such a health check follows this list).
  • Resolver endpoints are created in VPC in another region (us-west-2) in the networking account. This allows on-premises servers to failover to these secondary Resolver inbound endpoints in case the region goes down.
  • A second set of forwarding rules is created in the networking account, which uses the outbound endpoint in us-west-2. These are shared with Account A and then associated with VPC in us-west-2.
  • In addition, to have DR across multiple on-premises locations, the on-premises servers should have a secondary backup DNS on-premises as well (not shown in the diagram).

This ensures a simple DNS architecture for the DR setup, and seamless failover for applications in case of a Region failure.
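
The metric-based health check mentioned in the list above can be created with the AWS CLI. The following is a hedged sketch; the caller reference, alarm name, and Region are placeholder assumptions, and the referenced CloudWatch alarm must already exist:

# Create a Route 53 health check that tracks a CloudWatch alarm instead of
# probing a public endpoint (suitable for internal-only applications).
aws route53 create-health-check \
    --caller-reference app1-dr-check-001 \
    --health-check-config '{
        "Type": "CLOUDWATCH_METRIC",
        "AlarmIdentifier": {
            "Region": "us-east-1",
            "Name": "app1-primary-health-alarm"
        },
        "InsufficientDataHealthStatus": "LastKnownStatus"
    }'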

Considerations

  • If Application 1 needs to communicate to Application 2, then the PHZ from Account A must be shared with Account B. DNS queries can then be routed efficiently for those VPCs in different accounts.
  • Create additional IP addresses in a single AZ/subnet for the resolver endpoints, to handle large volumes of DNS traffic.
  • Look at Considerations while using Private Hosted Zones before implementing such architectures in your AWS environment.

Summary

Hybrid cloud environments can utilize the features of Route 53 Private Hosted Zones such as overlapping namespaces and the ability to perform cross-account and multi-region VPC association. This creates a unified DNS view for your application environments. The architecture allows for scalability and high availability for business applications.

The Satellite Ear Tag that is Changing Cattle Management

Post Syndicated from Karen Hildebrand original https://aws.amazon.com/blogs/architecture/the-satellite-ear-tag-that-is-changing-cattle-management/

Most cattle are not raised in cities—they live on cattle stations, large open plains, and tracts of land largely unpopulated by humans. It’s hard to keep connected with the herd. Cattle don’t often carry their own mobile phones, and they don’t pay a mobile phone bill. Naturally, the areas in which cattle live often do not have cellular connectivity or reception. But they now have one way to stay connected: a world-first satellite ear tag.

Ceres Tag co-founders Melita Smith and David Smith recognized the problem given their own farming background. David explained that they needed to know simple things to begin with, such as:

  • Where are they?
  • How many are out there?
  • What are they doing?
  • What condition are they in?
  • Are they OK?

Later, the questions advanced to:

  • Which are the higher performing animals that I want to keep?
  • Where do I start when rounding them up?
  • As assets, can I get better financing and insurance if I can prove their location, existence, and condition?

To answer these questions, Ceres Tag first had to solve the biggest challenge, and it was not getting cattle to carry mobile phones and pay mobile phone bills to generate the revenue needed for greater coverage. David and Melita knew they needed help developing a new method of tracking, but in a way that aligned with current livestock practices. Their idea of a satellite connected ear tag came to life through close partnership and collaboration with CSIRO, Australia’s national science agency. They brought expertise to the problem, and rallied together teams of experts across public and private partnerships, never accepting “that’s not been done before” as a reason to curtail their innovation.

 

Figure 1: How Ceres Tag works in practice

Thinking Big: Ceres Tag Protocol

Melita and David constructed their idea and brought the physical hardware to reality. This meant finding strategic partners to build hardware, connectivity partners that provided global coverage at a cost that was tenable to cattle operators, integrations with existing herd management platforms and a global infrastructure backbone that allowed their solution to scale. They showed resilience, tenacity and persistence that are often traits attributed to startup founders and lifelong agricultural advocates. Explaining the purpose of the product often requires some unique approaches to defining the value proposition while fundamentally breaking down existing ways of thinking about things. As David explained, “We have an internal saying, ‘As per Ceres Tag protocol …..’ to help people to see the problem through a new lens.” This persistence led to the creation of an easy to use ear tagging applicator and a two-prong smart ear tag. The ear tag connects via satellite for data transmission, providing connectivity to more than 120 countries in the world and 80% of the earth’s surface.

The Ceres Tag applicator, smart tag, and global satellite connectivity

Figure 2: The Ceres Tag applicator, smart tag, and global satellite connectivity

Unlocking the blocker: data-driven insights

With the hardware and connectivity challenges solved, Ceres Tag turned to how the data-driven insights would be delivered. The company needed to select a technology partner that understood their global customer base, and what it means to deliver a low latency solution for web, mobile, and API-driven solutions. David once again knew the power of leveraging the team around him to find the best solution. The evaluation of cloud providers was led by Lewis Frost, COO, and Heidi Perrett, Data Platform Manager. Ceres Tag ultimately chose to partner with AWS and use the AWS Cloud as the backbone for the Ceres Tag Management System.

Ceres Tag conceptual diagram

Figure 3: Ceres Tag conceptual diagram

The Ceres Tag Management System houses the data and metadata about each tag, enabling the traceability of that tag throughout each animal’s life cycle. This includes verification of who should have access to the animal’s health records and history. Based on the nature of the data being stored and transmitted, security of the application is critical. As a startup, it was important for Ceres Tag to keep costs low, but also to be able to scale based on growth and usage as it expands globally.

Ceres Tag is able to quickly respond to customers regardless of geography, routing traffic to the appropriate end point. They accomplish this by leveraging Amazon CloudFront as the Content Delivery Network (CDN) for traffic distribution of front-end requests and Amazon Route 53 for DNS routing. A multi-Availability Zone deployment and AWS Application Load Balancer distribute incoming traffic across multiple targets, increasing the availability of the application.

Ceres Tag is using AWS Fargate to provide a serverless compute environment that matches the pay-as-you-go usage-based model. AWS also provides many advanced security features and architecture guidance that has helped to implement and evaluate best practice security posture across all of the environments. Authentication is handled by Amazon Cognito, which allows Ceres Tag to scale easily by supporting millions of users. It leverages easy-to-use features like sign-in with social identity providers, such as Facebook, Google, and Amazon, and enterprise identity providers via SAML 2.0.

The data captured from the ear tag will be ingested via AWS PrivateLink. By providing a private endpoint to access your services, AWS PrivateLink ensures your traffic is not exposed to the public internet. It also makes it easy to connect services across different accounts and VPCs to significantly simplify your network architecture. In leveraging a satellite connectivity provider running on AWS, Ceres Tag will benefit from the AWS Ground Station infrastructure leveraged by the provider in addition to the streaming IoT database.

 

Agile website delivery with Hugo and AWS Amplify

Post Syndicated from Nigel Harris original https://aws.amazon.com/blogs/devops/agile-website-delivery-with-hugo-and-aws-amplify/

In this post, we show how you can rapidly configure and deploy a website using Hugo (a static website generator), an AWS Cloud9 integrated development environment (IDE) for content editing, AWS CodeCommit for source code control, and AWS Amplify to implement a source code-controlled, automated deployment process.

When hosting a website on AWS, you can choose from several options. One popular option is to use Amazon Simple Storage Service (Amazon S3) to host a static website. If you prefer full access to the infrastructure hosting your website, you can use the NGINX Quick Start to quickly deploy web server infrastructure using AWS CloudFormation.

Static website generators such as Hugo and MkDocs accelerate the website content generation process, and can be a valuable tool when trying to rapidly deliver technical documentation or similar content. Without them, the content creation process typically requires programming in HTML and CSS.

Hugo is written in Go and available under the Apache 2.0 license. It provides several themes (collections of layouts) that accelerate website creation by drastically reducing the need to focus on format. You can author content in Markdown and output in multiple languages and formats (including ebook formats). Excellent examples of public websites built using Hugo include Digital.gov and Kubernetes.io.
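
To get a feel for Hugo before building the full pipeline, a minimal local workflow looks like the following sketch. The site name is arbitrary, and the theme shown is the public hugo-theme-learn project used later in this post:

# Scaffold a new Hugo site, add the Learn theme, and preview it locally.
hugo new site my-docs-site
cd my-docs-site
git init
git submodule add https://github.com/matcornic/hugo-theme-learn.git themes/learn
echo 'theme = "learn"' >> config.toml
hugo new chapter1/_index.md
hugo server -D    # live-reloading preview at http://localhost:1313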

 

Solution overview

This solution illustrates how to provision a hosted, source code-controlled Hugo generated website using CodeCommit and Amplify Console. The provisioned website is configured with a custom subdomain and an SSL certificate. We use an AWS Cloud9 IDE to enable content creation in the cloud.

 

Setting up an AWS Cloud9 IDE

Start by provisioning an AWS Cloud9 IDE. AWS Cloud9 environments run using Amazon Elastic Compute Cloud (Amazon EC2). You need to provision your AWS Cloud9 environment into an existing public subnet in an Amazon Virtual Private Cloud (Amazon VPC) within your AWS account. You can complete this in the following steps:

1. Access your AWS account with an identity that has administrative privileges. If you don’t have an AWS account, you can create one.

2. Create a new AWS Cloud9 environment using the wizard on the AWS Cloud9 console.

3. Enter a name for your environment and an optional description.

4. Choose Next step.

Naming your Cloud 9 environment

5. In the Environment settings section, for Environment type, select Create a new EC2 instance for environment (direct access).

6. For Instance type, select your preferred instance type (the default, t2.micro, works for this use case).

7. Under Network settings, for Network (VPC), choose a VPC that you wish to deploy your AWS Cloud9 instance into. You may wish to use your default VPC, which is suitable for the purpose of this tutorial.

8. Choose a public subnet from this VPC for deployment.

Cloud9 Settings

9. Leave all other settings unchanged and choose Next step.
10. Review your choices and choose Create environment.

Environment creation takes a few minutes to complete. When the environment is ready, you receive access to the AWS Cloud9 IDE in your browser. We return to it shortly to develop content for your Hugo website.

Your Cloud9 Desktop

Configuring a source code repository to track content changes

Static website generators enable rapid changes to website content and layout. Source control management (SCM) systems provide a revision history for your code, and allow you to revert to previous versions of a project when unintended changes are introduced. SCM systems become increasingly important as the velocity of change and the number of team members introducing change increases.

You now create a source code repository to track changes to your content. You use CodeCommit, a fully-managed source control service that hosts secure Git-based repositories.

1. In a new browser, sign in to the CodeCommit Console and create a new repository.

2. For Repository name, enter amplify-website.

3. For Description, enter an appropriate description.

4. Choose Create.

Create repository

Repository creation takes just a few moments.

5. In the Connection steps section, choose the appropriate method to connect to your repository based on how you accessed your AWS account.

For this post, I signed in to my AWS account using federated access, so I choose the HTTPS Git Remote CodeCommit (HTTPS-GRC) tab. This is the recommended connection method for this sign-in type. You can also configure a connection to your repository using SSH or Git credentials over HTTPS. SSH and Git credentials over HTTPS are appropriate methods if you have signed in to your AWS account as an AWS Identity and Access Management (IAM) user. The Amazon CodeCommit console provides additional information regarding each of these connection types, including links to supporting documentation.

Connect to Repo

 

Configuring and deploying an example website

You’re now ready to configure and deploy your website.

1. Return to the browser with your AWS Cloud9 IDE and place your cursor in the lower terminal pane of the IDE.

The terminal pane provides Bash shell access on the EC2 instance running AWS Cloud9.

You now create a Hugo website. The website design is based on Hugo-theme-learn. Themes are collections of Hugo layouts that take all the hassle out of building your website. Learn is a multilingual-ready theme authored by Mathieu Cornic, designed for building technical documentation websites.

Hugo provides a variety of themes on their website. Many of the themes include bundled example website content that you can easily adapt by following the accompanying theme documentation.

2. Enter the following code to download an existing example website stored as a .zip file, extract it, and commit the contents into CodeCommit from your AWS Cloud9 IDE:

cd ~/environment
aws s3 cp s3://ee-assets-prod-us-east-1/modules/3c5ba9cb6ff44465b96993d210f67147/v1/example-website.zip ~/environment/example-website.zip
unzip example-website.zip
rm example-website.zip

The following screenshot shows your output.

example website copy commands

 

Next, we run commands to create a directory to host your website, and then create a new default branch called main (formerly referred to as the master branch), local to our AWS Cloud9 instance. We then copy files into place from the example website. After adding and committing them locally, we push all our changes to the remote AWS CodeCommit repository.

3. Enter the following code:

mkdir ~/environment/amplify-website/
cd ~/environment/amplify-website/
git init
git remote add origin codecommit::us-east-1://amplify-website
git remote -v
git checkout -b main
cp -rp ~/environment/example-website/* ~/environment/amplify-website/
git add *
git commit -am "first commit"
git push -u origin main
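
The example website bundle you just committed already includes an amplify.yml build specification at the repository root, which Amplify uses later in this walkthrough. For reference, a minimal build specification for a Hugo site looks roughly like the following sketch; your build commands and output directory may differ:

version: 1
frontend:
  phases:
    build:
      commands:
        # Hugo renders the site into the public/ directory by default.
        - hugo
  artifacts:
    baseDirectory: public
    files:
      - '**/*'
  cache:
    paths: []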

Deployment and hosting is achieved by using Amplify Console, a static web hosting service that accelerates your application release cycle by providing a simple CI/CD workflow for building and deploying static web applications.

4. On the Amplify console, under Deploy, choose Get Started.

Amplify banner

5. On the Get started with the Amplify Console page, select AWS CodeCommit as your source code repository.

6. Choose Continue.

Amplify get started page

7. On the Add repository branch page, for Recently updated repositories, choose your repository.

8. For Branch, choose main.

9. Choose Next.

add branch

On the Configure build settings page, Amplify automatically uses the amplify.yml file for build settings for your deployment. You committed this into your source code repository in the previous step. The amplify.yml file is detected from the root of your website directory structure.

10. Choose Next.

Amplify configure build settings

11. On the review page, choose Save and deploy.

Amplify builds and deploys your Amplify website within minutes, and shows you its progress. When deployment is complete, you can access the website to see the sample content.

amplify website

The following screenshot shows your example website.

sample website

 

Promoting changes to the website

We can now update the line of text in the home page and commit and publish this change.

1. Return to the browser with your AWS Cloud9 IDE and place your cursor in the lower terminal pane of the IDE.

2. On the navigation pane, choose the file ~/environment/amplify-website/workshop/content/_index.en.md.

The contents of the file open under a new tab in the upper pane.

3. Change the string First Line of Text to First Update to Website.

content change

4. From the File menu, choose Save to save the changes you have made to the _index.en.md file.

save content changes

5. Commit the changes and push to CodeCommit by running the following command in the lower terminal pane in AWS Cloud9:

git add *; git commit -am "homepage update"; git push origin main

The output in your AWS Cloud9 terminal should appear similar to the following screenshot.

commit output

6. Return to the Amplify Console and observe how the committed change in CodeCommit is automatically detected. Amplify runs deployment steps to push your changes to the website.

amplify deploy changes

7. Access the URL of your website after this update is complete to verify that the first line of text on your home page has changed.

updated website

You can repeat this process to make source-code controlled, automated changes to your website.

Adding a custom domain

Adding a custom domain to your Amplify configuration makes it easier for clients to access your content. You can register new domains using Amazon Route 53 or, if you have an existing domain registered outside of AWS, you can integrate it with Route 53 and Amplify. For our use case, the domain hugoonamplify.com was registered using a third-party registrar (NameCheap). You can manage DNS configurations for domains registered outside of AWS using Route 53.

Start by configuring a public hosted zone in Route 53.

1. On the Route 53 console, choose Hosted zones.

2. Choose Create hosted zone.

hosted zones

3. For Domain name, enter hugoonamplify.com.

4. For Description, enter an appropriate description.

5. For Type, select Public hosted zone.

hosted zones configuration

6. Choose Create hosted zone.

7. Save the addresses of the name servers that respond to client DNS lookup requests for the custom domain.

create hosted zone

8. In a separate browser, access the console of your DNS registrar.

9. Configure a custom DNS name servers setting on the console of the third-party domain name registrar.

This configuration specifies the Route 53 assigned name servers as authoritative DNS for our custom domain. For this use case, propagation of this change may take up to 48 hours.

namecheap console

10. Use https://who.is to verify that the AWS name servers are listed correctly for your custom domain to internet clients.

whois lookup

You can now set up your custom domain in Amplify. Amplify helps you configure DNS and set up SSL for your desired custom domain.

domain management

11. On the Amplify Console, under App settings, choose Domain management.

12. Choose Add domain.

13. For Domain, enter your custom domain name (hugoonamplify.com).

14. Choose Configure domain.

15. For Subdomain, I want to set up only www, and choose to exclude the root of my custom domain.

16. Choose Save.

Amplify begins the process of creating the SSL certificates. Amplify sends a notification that it’s issuing an SSL certificate to secure traffic to the custom domain.

ssl domain management

After a few moments, it proceeds to SSL configuration and indicates that domain ownership verification is in progress.

ssl domain management configuration

Amplify verifies domain ownership by creating a sample CNAME record in your hosted zone file. When ownership is verified, the domain is propagated onto an Amazon CloudFront distribution managed by the Amplify service, and domain activation is complete.

ssl domain management configured

Clients can now access the website using the custom domain name www.hugoonamplify.com.

access website via custom domain

 

Establishing a subdomain for development

You can create a development website in Amplify that is aligned to a development code branch in CodeCommit, which enables testing changes prior to production release.

1. Access the AWS Cloud9 IDE and use the terminal to enter the following commands to create a development branch and push changes to CodeCommit using the current content from the main branch with a single content change:

git checkout -b development
git branch
git remote -v
git add *; git commit -am "first development commit";
git push -u origin development

2. Open and edit the file ~/environment/amplify-website/workshop/content/_index.en.md and change the string Update to Website to something else.

Alternatively, run the following Unix sed command from the terminal in AWS Cloud9 to make that content change:

sed -i 's/Update to Website/Update to Development/g' ~/environment/amplify-website/workshop/content/_index.en.md

3. Commit and push your change with the following code:

git add *; git commit -am "second development commit"; git push -u origin development

You now configure a subdomain in Amplify to allow developers to review changes.

4. Return to the amplify-website app.

5. Choose Connect branch.

connect branch

6. For Branch, choose the development branch you created and committed code into.

7. Choose Next.

add development branch

Amplify builds a second website based on the contents of the development branch. You can see the instance of your website matched to the development code branch on Amplify Console.

amplify two branches

8. Access the domain management menu item in your Amplify application to add a friendly subdomain.

9. Edit the domain and add a subdomain item with a name of your choice (for example, dev).

10. Associate it to the development branch containing the committed code and content changes.

11. Choose Add.

add dev domain

You can access the subdomain to verify the changes.

verify domain

Controlling access to development

You may wish to restrict access to new content as it’s deployed into the development website.

1. On Amplify Console, choose your application.

2. Choose Access control.

3. Under Access control settings, choose your preferred settings.

You have the option to restrict access globally or on a branch-by-branch basis. For this use case, we create a simple password protection for a user named developer on the development branch and site.

access control settings

 

Cleaning up

Unless you plan to keep the website you have constructed, you can quickly clean up provisioned assets and avoid any unnecessary costs.

1. On Amplify Console, select the app you created.

2. From the Actions drop-down menu, choose Delete app.

3. In the pop-up window, confirm the deletion.

4. On the CodeCommit dashboard, select the repository you created.

5. Choose Delete.

6. In the pop-up window, confirm the deletion.

7. On the AWS Cloud9 dashboard, select the IDE you created.

8. Choose Delete.

9. In the pop-up window, confirm the deletion.

 

Conclusion

Hugo is a powerful tool that enables accelerated delivery of content in a variety of formats including image portfolios, online resume presentation, blogging, and technical documentation. Amplify Console provides a convenient, easy-to-use, static web hosting service that can greatly accelerate delivery of static content.

When combining Hugo with Amplify Console, you can rapidly deploy websites in minutes with features such as friendly URLS, environments matched to code branches, and encryption (SSL). Visit gohugo.io to find out more about Hugo. For more information about how Amplify Console can help you rapidly deploy Hugo and other modern web applications, see the AWS Amplify Console User Guide.

Nigel Harris

Nigel Harris

Nigel Harris is an Enterprise Solutions Architect at Amazon Web Services. He works with AWS customers to provide guidance and technical assistance on AWS architectures.

How to configure an LDAPS endpoint for Simple AD

Post Syndicated from Marco Sommella original https://aws.amazon.com/blogs/security/how-to-configure-ldaps-endpoint-for-simple-ad/

In this blog post, we show you how to configure an LDAPS (LDAP over SSL or TLS) encrypted endpoint for Simple AD so that you can extend Simple AD over untrusted networks. Our solution uses Network Load Balancer (NLB) for SSL/TLS termination. The data is then decrypted and sent to Simple AD. Network Load Balancer offers integrated certificate management, SSL/TLS termination, and the ability to use a scalable Amazon Elastic Compute Cloud (Amazon EC2) backend to process decrypted traffic. Network Load Balancer also tightly integrates with Amazon Route 53, enabling you to use a custom domain for the LDAPS endpoint. To simplify testing and deployment, we have provided an AWS CloudFormation template to provision the NLB.

Simple AD, which is powered by Samba 4, supports basic Active Directory (AD) authentication features such as users, groups, and the ability to join domains. Simple AD also includes an integrated Lightweight Directory Access Protocol (LDAP) server. LDAP is a standard application protocol for accessing and managing directory information. You can use the BIND operation from Simple AD to authenticate LDAP client sessions. This makes LDAP a common choice for centralized authentication and authorization for services such as Secure Shell (SSH), client-based virtual private networks (VPNs), and many other applications. Authentication, the process of confirming the identity of a principal, typically involves the transmission of highly sensitive information such as user names and passwords. To protect this information in transit over untrusted networks, companies often require encryption as part of their information security strategy.

This post assumes that you understand concepts such as Amazon Virtual Private Cloud (Amazon VPC) and its components, including subnets, routing, internet and network address translation (NAT) gateways, DNS, and security groups. If needed, you should familiarize yourself with these concepts and review the solution overview and prerequisites in the next section before proceeding with the deployment.

Note: This solution is intended for use by clients who require only an LDAPS endpoint. If your requirements extend beyond this, you should consider accessing the Simple AD servers directly or by using AWS Directory Service for Microsoft AD.

Solution overview

The following description explains the Simple AD LDAPS environment. The AWS CloudFormation template creates the Network Load Balancer.

  1. The LDAP client sends an LDAPS request to the NLB on TCP port 636.
  2. The NLB terminates the SSL/TLS session and decrypts the traffic using a certificate. The NLB sends the decrypted LDAP traffic to Simple AD on TCP port 389.
  3. The Simple AD servers send an LDAP response to the NLB. The NLB encrypts the response and sends it to the client.

The following diagram illustrates how the solution works and shows the prerequisites (listed in the following section).

Figure 1: LDAPS with Simple AD Architecture

Figure 1: LDAPS with Simple AD Architecture

Note: Amazon VPC prevents third parties from intercepting traffic within the VPC. Because of this, the VPC protects the decrypted traffic between the NLB and Simple AD. The NLB encryption provides an additional layer of security for client connections and protects traffic coming from hosts outside the VPC.

Prerequisites

  1. Our approach requires an Amazon VPC with one public and two private subnets. If you don’t have an Amazon VPC that meets that requirement, use the following instructions to set up a sample environment:
    1. Identify an AWS Region that supports Simple AD and network load balancing.
    2. Identify two Availability Zones in that Region to use with Simple AD. The Availability Zones are needed as parameters in the AWS CloudFormation template used later in this process.
    3. Create or choose an Amazon VPC in the region you chose.
    4. Enable DNS support within your VPC so you can use Route 53 to resolve the LDAPS endpoint.
    5. Create two private subnets, one per Availability Zone. The Simple AD servers use the subnets that you create.
    6. Create a public subnet in the same VPC.
    7. The LDAP service requires a DNS domain that resolves within your VPC and from your LDAP clients. If you don’t have an existing DNS domain, create a private hosted zone and associate it with your VPC. To avoid encryption protocol errors, you must ensure that the DNS domain name is consistent across your Route 53 zone and in the SSL/TLS certificate.
  2. Make sure you’ve completed the Simple AD prerequisites.
  3. You can use a certificate issued by your preferred certificate authority or a certificate issued by AWS Certificate Manager (ACM). If you don’t have a certificate authority, you can create a self-signed certificate by following the instructions in section 2 (Create a certificate).

Note: To prevent unauthorized direct connections to your Simple AD servers, you can modify the Simple AD security group on port 389 to block traffic from locations outside of the Simple AD VPC. You can find the security group in the Amazon EC2 console by creating a search filter for your Simple AD directory ID. It is also important to allow the Simple AD servers to communicate with each other as shown on Simple AD Prerequisites.

Solution deployment

This solution includes 5 main parts:

  1. Create a Simple AD directory.
  2. (Optional) Create an SSL/TLS certificate, if you don’t already have one.
  3. Create the NLB by using the supplied AWS CloudFormation template.
  4. Create a Route 53 record.
  5. Test LDAPS access using an Amazon Linux 2 client.

1. Create a Simple AD directory

With the prerequisites completed, your first step is to create a Simple AD directory in your private VPC subnets.

To create a Simple AD directory:

  1. In the Directory Service console navigation pane, choose Directories and then choose Set up directory.
  2. Choose Simple AD.

    Figure 2: Select directory type

    Figure 2: Select directory type

  3. Provide the following information:
    1. Directory Size: The size of the directory. The options are Small or Large. Which you should choose depends on the anticipated size of your directory.
    2. Directory DNS: The fully qualified domain name (FQDN) of the directory, such as corp.example.com.

      Note: You will need the directory FQDN when you test your solution.

    3. NetBIOS name: The short name for the directory, such as corp.
    4. Administrator password: The password for the directory administrator. The directory creation process creates an administrator account with the user name Administrator and this password. Don’t lose this password, because it can’t be recovered. You also need this password for testing LDAPS access in a later step.
    5. Description: An optional description for the directory.
    Figure 3: Directory information

    Figure 3: Directory information

  4. Select the VPC and subnets, and then choose Next:
    • VPC: Use the dropdown list to select the VPC to install the directory in.
    • Subnets: Use the dropdown lists to select two private subnets for the directory servers. The two subnets must be in different Availability Zones. Make a note of the VPC and subnet IDs to use as input parameters for the AWS CloudFormation template. In the following example, the subnets are in the us-east-1a and us-east-1c Availability Zones.
    Figure 4: Choose VPC and subnets

    Figure 4: Choose VPC and subnets

  5. Review the directory information and make any necessary changes. When the information is correct, choose Create directory.

    Figure 5: Review and create the directory

    Figure 5: Review and create the directory

  6. It takes several minutes to create the directory. From the AWS Directory Service console, refresh the screen periodically and wait until the directory Status value changes to Active before continuing.
  7. When the status has changed to Active, choose your Simple AD directory and note the two IP addresses in the DNS address section. You will enter them in a later step when you run the AWS CloudFormation template.

Note: How to administer your Simple AD implementation is out of scope for this post. See the documentation to add users, groups, or instances to your directory. Also see the previous blog post, How to Manage Identities in Simple AD Directories.
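
If you prefer to script this step, the directory can also be created with the AWS CLI. The following is a sketch; the domain name, password, and network identifiers are placeholder assumptions:

# Create a Small Simple AD directory across the two private subnets.
aws ds create-directory \
    --name corp.example.com \
    --short-name corp \
    --password 'YourStrongAdminPassw0rd!' \
    --size Small \
    --description "Simple AD for the LDAPS endpoint" \
    --vpc-settings VpcId=vpc-0abc1234,SubnetIds=subnet-0aaa1111,subnet-0bbb2222

# Poll until Stage is Active, then note the DnsIpAddrs values.
aws ds describe-directories \
    --query 'DirectoryDescriptions[].{Id:DirectoryId,Stage:Stage,Dns:DnsIpAddrs}'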

2. Add a certificate

Now that you have a Simple AD directory, you need an SSL/TLS certificate. The certificate will be used with the NLB to secure the LDAPS endpoint. You then import the certificate into ACM, which is integrated with the NLB.

As mentioned earlier, you can use a certificate issued by your preferred certificate authority or a certificate issued by AWS Certificate Manager (ACM).

(Optional) Create a self-signed certificate

If you don’t already have a certificate authority, you can use the following instructions to generate a self-signed certificate using OpenSSL.

Note: OpenSSL is a standard, open source library that supports a wide range of cryptographic functions, including the creation and signing of x509 certificates.

Use the command line interface to create a certificate:

  1. You must have a system with OpenSSL installed to complete this step. If you don’t have OpenSSL, you can install it on Amazon Linux by running the command sudo yum install openssl. If you don’t have access to an Amazon Linux instance you can create one with SSH access enabled to proceed with this step. Use the command line to run the command openssl version to see if you already have OpenSSL installed.
    $ openssl version
    OpenSSL 1.0.1k-fips 8 Jan 2015
    

  2. Create a private key using the openssl genrsa command.
    $ openssl genrsa 2048 > privatekey.pem
    Generating RSA private key, 2048 bit long modulus
    ......................................................................................................................................................................+++
    ..........................+++
    e is 65537 (0x10001)
    

  3. Generate a certificate signing request (CSR) using the openssl req command. Provide the requested information for each field. The Common Name is the FQDN for your LDAPS endpoint (for example, ldap.corp.example.com). The Common Name must use the domain name you will later register in Route 53. You will encounter certificate errors if the names do not match.
    $ openssl req -new -key privatekey.pem -out server.csr
    You are about to be asked to enter information that will be incorporated into your certificate request.
    

  4. Use the openssl x509 command to sign the certificate. The following example uses the private key from the previous step (privatekey.pem) and the signing request (server.csr) to create a public certificate named server.crt that is valid for 365 days. This certificate must be updated within 365 days to avoid disruption of LDAPS functionality.
    $ openssl x509 -req -sha256 -days 365 -in server.csr -signkey privatekey.pem -out server.crt
    Signature ok
    subject=/C=XX/L=Default City/O=Default Company Ltd/CN=ldap.corp.example.com
    Getting Private key
    

  5. You should see three files: privatekey.pem, server.crt, and server.csr.
    $ ls
    privatekey.pem server.crt server.csr
    

  6. Restrict access to the private key.
    $ chmod 600 privatekey.pem
    

Note: Keep the private key and public certificate to use later. You can discard the signing request, because you are using a self-signed certificate and not using a certificate authority. Always store the private key in a secure location, and avoid adding it to your source code.

Import a certificate

For this step, you can either use a certificate obtained from a certificate authority, or a self-signed certificate that you created using the optional procedure above.

  1. In the ACM console, choose Import a certificate.
  2. Using a Linux text editor, paste the contents of your certificate file (called server.crt if you followed the procedure above) in the Certificate body box.
  3. Using a Linux text editor, paste the contents of your privatekey.pem file in the Certificate private key box. (For a self-signed certificate, you can leave the Certificate chain box blank.)
  4. Choose Review and import. Confirm the information and choose Import.
  5. Take note of the Amazon Resource Name (ARN) of the imported certificate.

3. Create the NLB by using the supplied AWS CloudFormation template

Now that you have a Simple AD directory and SSL/TLS certificate, you’re ready to use the AWS CloudFormation template to create the NLB.

Create the NLB:

  1. Load the AWS CloudFormation template to deploy an internal NLB. After you load the template, provide the input parameters from the following table:

    Input parameters and their descriptions:
    VPCId: The target VPC for this solution. Must be the VPC where you deployed Simple AD; available on your Simple AD directory details page.
    SubnetId1: The Simple AD primary subnet. This information is available on your Simple AD directory details page.
    SubnetId2: The Simple AD secondary subnet. This information is available on your Simple AD directory details page.
    SimpleADPriIP: The primary Simple AD server IP. This information is available on your Simple AD directory details page.
    SimpleADSecIP: The secondary Simple AD server IP. This information is available on your Simple AD directory details page.
    LDAPSCertificateARN: The Amazon Resource Name (ARN) for the SSL certificate. This information is available in the ACM console.
  2. Enter the input parameters and choose Next.
  3. On the Options page, accept the defaults and choose Next.
  4. On the Review page, confirm the details and choose Create. The stack will be created in approximately 5 minutes.
  5. Wait until the AWS CloudFormation stack status is CREATE_COMPLETE before starting the next procedure, Create a Route 53 record.
  6. Go to Outputs and note the FQDN of your new NLB. The FQDN is in the output variable named LDAPSURL.

    Note: You can find the parameters of your Simple AD on the directory details page by choosing your Simple AD in the Directory Service console.

4. Create a Route 53 record

The next step is to create a Route 53 record in your private hosted zone so that clients can resolve your LDAPS endpoint.

Note: Don’t start this procedure until the AWS CloudFormation stack status is CREATE_COMPLETE.

Create a Route 53 record:

  1. If you don’t have an existing DNS domain for use with LDAP, create a private hosted zone and associate it with your VPC. The hosted zone name should be consistent with your Simple AD (for example, corp.example.com).
  2. When the AWS CloudFormation stack is in CREATE_COMPLETE status, locate the value of the LDAPSURL on the Outputs tab of the stack. Copy this value for use in the next step.
  3. On the Route 53 console, choose Hosted Zones and then choose the zone you used for the Common Name value for your self-signed certificate. Choose Create Record Set and enter the following information:
    1. Name: A short name for the record set (remember that the FQDN has to match the Common Name of your certificate).
    2. Type: Leave as A – IPv4 address.
    3. Alias: Select Yes.
    4. Alias Target: Paste the value of the LDAPSURL from the Outputs tab of the stack.
  4. Leave the defaults for Routing Policy and Evaluate Target Health, and choose Create.
Figure 6: Create a Route 53 record

Figure 6: Create a Route 53 record
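
If you prefer to create the record from the CLI, the following sketch shows the equivalent change batch. The private hosted zone ID, the NLB’s own hosted zone ID, and the NLB DNS name are placeholder assumptions (every NLB has an associated Route 53 hosted zone ID, which you can retrieve with elbv2 describe-load-balancers):

aws route53 change-resource-record-sets \
    --hosted-zone-id Z0PRIVATEZONE \
    --change-batch '{
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "ldap.corp.example.com",
                "Type": "A",
                "AliasTarget": {
                    "HostedZoneId": "Z0NLBZONE",
                    "DNSName": "internal-ldaps-nlb-0123456789.elb.us-east-1.amazonaws.com",
                    "EvaluateTargetHealth": false
                }
            }
        }]
    }'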

5. Test LDAPS access using an Amazon Linux 2 client

At this point, you’re ready to test your LDAPS endpoint from an Amazon Linux client.

Test LDAPS access:

  1. Create an Amazon Linux 2 instance with SSH access enabled to test the solution. Launch the instance on one of the public subnets in your VPC. Make sure the IP assigned to the instance is in the trusted IP range you specified in the security group associated with the Simple AD.
  2. Use SSH to sign in to the instance and complete the following steps to verify access.
    1. Install the openldap-clients package and any required dependencies:
      sudo yum install -y openldap-clients
      

    2. Add the server.crt file to the /etc/openldap/certs/ directory so that the LDAPS client will trust your SSL/TLS certificate. You can download the certificate directly from the NLB and save it in the proper format, copy the file using Secure Copy, or create it using a text editor:
      openssl s_client -connect <LDAPSURL>:636 -showcerts </dev/null 2>/dev/null | openssl x509 -outform PEM > server.crt 
      

      Replace <LDAPSURL> with the FQDN of your NLB; the address can be found in the Outputs section of the CloudFormation stack.

    3. Edit the /etc/openldap/ldap.conf file to define the environment variables:
      • BASE: The Simple AD directory name.
      • URI: Your DNS alias.
      • TLS_CACERT: The path to your public certificate.
      • TLSCACertificateFile: The path to your self-signed certificate authority. If you used the instructions in section 2 (Create a certificate) to create a certificate, the path will be /etc/ssl/certs/ca-bundle.crt.

      Here’s an example of the file:

      BASE dc=corp,dc=example,dc=com
      URI ldaps://ldap.corp.example.com
      TLS_CACERT /etc/openldap/certs/server.crt
      TLSCACertificateFile /etc/ssl/certs/ca-bundle.crt
      

  3. To test the solution, query the directory through the LDAPS endpoint, as shown in the following command. Replace corp.example.com with your domain name and use the Administrator password that you configured in step 3 of section 1 (Create a Simple AD directory).
    $ ldapsearch -D "Administrator@corp.example.com" -W sAMAccountName=Administrator
    

  4. The response will include the directory information in LDAP Data Interchange Format (LDIF) for the administrator distinguished name (DN) from your Simple AD LDAP server.
    # extended LDIF
    #
    # LDAPv3
    # base <dc=corp,dc=example,dc=com> (default) with scope subtree
    # filter: sAMAccountName=Administrator
    # requesting: ALL
    #
    
    # Administrator, Users, corp.example.com
    dn: CN=Administrator,CN=Users,DC=corp,DC=example,DC=com
    objectClass: top
    objectClass: person
    objectClass: organizationalPerson
    objectClass: user
    description: Built-in account for administering the computer/domain
    instanceType: 4
    whenCreated: 20170721123204.0Z
    uSNCreated: 3223
    name: Administrator
    objectGUID:: l3h0HIiKO0a/ShL4yVK/vw==
    userAccountControl: 512
    …
    

You can now use the LDAPS endpoint for directory operations and authentication within your environment. Here are a few resources to learn more about how to interact with an LDAPS endpoint:

Troubleshooting

If the ldapsearch command returns something like the following error, there are a few things you can do to help identify issues.

ldap_sasl_bind(SIMPLE): Can't contact LDAP server (-1)
  1. You might be able to obtain additional error details by adding the -d1 debug flag to the ldapsearch command.
    $ ldapsearch -D "Administrator@corp.example.com" -W sAMAccountName=Administrator -d1
    

  2. Verify that the parameters in ldap.conf match your configured LDAPS URI endpoint and that all parameters can be resolved by DNS. You can use the following dig command, substituting your configured endpoint DNS name.
    $ dig ldap.corp.example.com
    

  3. Confirm that the client instance you’re connecting from is in the trusted IP range you specified in the security group associated with your Simple AD directory.
  4. Confirm that the path to your public SSL/TLS certificate, configured in ldap.conf as TLS_CACERT, is correct. You configured this as part of step 2 in section 5 (Test LDAPS access using an Amazon Linux 2 client). You can check your SSL/TLS connection with the following command, replacing ldap.corp.example.com with the DNS name of your endpoint.
    $ echo -n | openssl s_client -connect ldap.corp.example.com:636
    

  5. Verify that the status of your Simple AD IPs is Healthy in the Amazon EC2 console.
    1. Open the EC2 console and choose Load Balancing and then Target Groups in the navigation pane.
    2. Choose your LDAPS target and then choose Targets.

Conclusion

You can use NLB to provide an LDAPS endpoint for Simple AD and transport sensitive authentication information over untrusted networks. You can explore using LDAPS to authenticate SSH users or integrate with other software solutions that support LDAP authentication. The AWS CloudFormation template for this solution is available on GitHub.

If you have comments about this post, submit them in the Comments section below. If you have questions about or issues implementing this solution, start a new thread on the AWS Directory Service forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Marco Sommella

Marco Sommella

Marco is a Cloud Support Engineer II in the Windows Team based in Dublin. He is a Subject Matter Expert on Directory Service and EC2 Windows. Marco has over 10 years of experience as a Windows and Linux system administrator and is passionate about automation coding. He is actively involved in AWS Systems Manager public Automations released by AWS Support and AWS EC2.

Cameron Worrell

Cameron Worrell

Cameron is a Solutions Architect with a passion for security and enterprise transformation. He joined AWS in 2015.

Log your VPC DNS queries with Route 53 Resolver Query Logs

Post Syndicated from Martin Beeby original https://aws.amazon.com/blogs/aws/log-your-vpc-dns-queries-with-route-53-resolver-query-logs/

The Amazon Route 53 team has just launched a new feature called Route 53 Resolver Query Logs, which will let you log all DNS queries made by resources within your Amazon Virtual Private Cloud. Whether it’s an Amazon Elastic Compute Cloud (EC2) instance, an AWS Lambda function, or a container, if it lives in your Virtual Private Cloud and makes a DNS query, then this feature will log it; you are then able to explore and better understand how your applications are operating.

Our customers explained to us that DNS query logs were important to them. Some wanted the logs so that they could be compliant with regulations, others wished to monitor DNS querying behavior so they could spot security threats. Others simply wanted to troubleshoot application issues that were related to DNS. The team listened to our customers and developed what I have found to be an elegant and easy-to-use solution.

From knowing very little about the Route 53 Resolver, I was able to configure query logging and have it working with barely a second glance at the documentation, which I assure you is a testament to the intuitiveness of the feature rather than to any significant experience I have with Route 53 or DNS query logging.

You can choose to have the DNS query logs sent to one of three AWS services: Amazon CloudWatch Logs, Amazon Simple Storage Service (S3), and Amazon Kinesis Data Firehose. The target service you choose will depend mainly on what you want to do with the data. If you have compliance mandates (for example, Australia’s Information Security Registered Assessors Program), then maybe storing the logs in Amazon Simple Storage Service (S3) is a good option. If you plan to monitor and analyze DNS queries in real time, or you integrate your logs with a third-party data analysis tool like Kibana or a SIEM tool like Splunk, then perhaps Amazon Kinesis Data Firehose is the option for you. For those of you who want an easy way to search, query, monitor metrics, or raise alarms, Amazon CloudWatch Logs is a great choice, and this is what I will show in the following demo.

Over in the Route 53 Console, near the Resolver menu section, I see a new item called Query logging. Clicking on this takes me to a screen where I can configure the logging.

The dashboard shows the current configurations that are setup. I click Configure query logging to get started.

The console asks me to fill out some necessary information, such as a friendly name; I’ve named mine demoNewsBlog.

I am now prompted to select the destination where I would like my logs to be sent. I choose the CloudWatch Logs log group and select the option to Create log group. I give my new log group the name /aws/route/demothebeebsnet.

Next, I need to select what VPC I would like to log queries for. Any resource that sits inside the VPCs I choose here will have their DNS queries logged. You are also able to add tags to this configuration. I am in the habit of tagging anything that I use as part of a demo with the tag demo. This is so I can easily distinguish between demo resources and live resources in my account.

Finally, I press the Configure query logging button, and the configuration is saved. Within a few moments, the service has successfully enabled the query logging in my VPC.
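
If you prefer the CLI, a sketch of the equivalent configuration follows; the request ID, account ID, configuration ID, and VPC ID are placeholder assumptions:

# Create the query logging configuration, sending logs to CloudWatch Logs.
aws route53resolver create-resolver-query-log-config \
    --creator-request-id qlc-001 \
    --name demoNewsBlog \
    --destination-arn arn:aws:logs:us-east-1:111111111111:log-group:/aws/route/demothebeebsnet

# Associate the configuration with the VPC whose DNS queries you want logged.
aws route53resolver associate-resolver-query-log-config \
    --resolver-query-log-config-id rqlc-example \
    --resource-id vpc-0abc1234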

After a few minutes, I log into the Amazon CloudWatch Logs console and can see that the logs have started to appear.

As you can see below, I was quickly able to start searching my logs and running queries using Amazon CloudWatch Logs Insights.
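
For example, the following sketch starts a Logs Insights query that surfaces the most frequently queried domain names over the past hour (the query_name field comes from the Resolver query log format; the time window uses GNU date, as found on Amazon Linux):

aws logs start-query \
    --log-group-name /aws/route/demothebeebsnet \
    --start-time $(date -d '1 hour ago' +%s) \
    --end-time $(date +%s) \
    --query-string 'stats count(*) as queries by query_name | sort queries desc | limit 10'
# start-query returns a queryId; retrieve the results with:
# aws logs get-query-results --query-id <queryId>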

There is a lot you can do with the Amazon CloudWatch Logs service, for example, I could use CloudWatch Metric Filters to automatically generate metrics or even create dashboards. While putting this demo together, I also discovered a feature inside of Amazon CloudWatch Logs called Contributor Insights that enables you to analyze log data and create time series that display top talkers. Very quickly, I was able to produce this graph, which lists out the most common DNS queries over time.

Route 53 Resolver Query Logs is available in all AWS Commercial Regions that support Route 53 Resolver Endpoints, and you can get started using either the API or the AWS Console. You do not pay for the Route 53 Resolver Query Logs, but you will pay for handling the logs in the destination service that you choose. So, for example, if you decided to use Amazon Kinesis Data Firehose, then you will incur the regular charges for handling logs with the Amazon Kinesis Data Firehose service.

Happy Logging

— Martin

Automated Disaster Recovery using CloudEndure

Post Syndicated from Ryan Jaeger original https://aws.amazon.com/blogs/architecture/automated-disaster-recovery-using-cloudendure/

There are any number of events that cause IT outages and impact business continuity. These could include unexpected infrastructure or application outages caused by flooding, earthquakes, fires, hardware failures, or even malicious attacks. Cloud computing opens a new door to support disaster recovery strategies, with benefits such as elasticity, agility, speed to innovate, and cost savings—all which aid new disaster recovery solutions.

With AWS, organizations can acquire IT resources on-demand, and pay only for the resources they use. Automating disaster recovery (DR) has always been challenging. This blog post shows how you can use automation to allow the orchestration of recovery to eliminate manual processes. CloudEndure Disaster Recovery, an AWS Company, Amazon Route 53, and AWS Lambda are the building blocks to deliver a cost-effective automated DR solution. The example in this post demonstrates how you can recover a production web application with sub-second Recovery Point Objectives (RPOs) and Recovery Time Objectives (RTOs) in minutes.

As part of a DR strategy, knowing RPOs and RTOs will determine what kind of solution architecture you need. The RPO represents the point in time of the last recoverable data point (for example, the “last backup”). Any disaster after that point would result in data loss.

The time from the outage to restoration is the RTO. Minimizing RTO and RPO is a cost tradeoff. Restoring from backups and recreating infrastructure after the event is the lowest cost but highest RTO. Conversely, the highest cost and lowest RTO is a solution running a duplicate auto-failover environment.

Solution Overview

CloudEndure is an automated IT resilience solution that lets you recover your environment from unexpected infrastructure or application outages, data corruption, ransomware, or other malicious attacks. It utilizes block-level Continuous Data Protection (CDP), which ensures that target machines are spun up in their most current state during a disaster or drill, so that you can achieve sub-second RPOs. In the event of a disaster, CloudEndure triggers a highly automated machine conversion process and a scalable orchestration engine that can spin up machines in the target AWS Region within minutes. This process enables you to achieve RTOs in minutes. The CloudEndure solution uses a software agent that installs on physical or virtual servers. It connects to a self-service, web-based user console, which then issues an API call to the selected AWS target Region to create a Staging Area in the customer’s AWS account designated to receive the source machine’s replicated data.

Architecture

In the above example, a webserver and database server have the CloudEndure Agent installed, and the disk volumes on each server are replicated to a staging environment in the customer’s AWS account. The CloudEndure Replication Server receives the encrypted data replication traffic and writes to the appropriate corresponding EBS volumes. It’s also possible to configure data replication traffic to use VPN or AWS Direct Connect.

With this current setup, if an infrastructure or application outage occurs, a failover to AWS is executed by manually starting the process from the CloudEndure Console. When this happens, CloudEndure creates EC2 instances from the synchronized target EBS volumes. After the failover completes, additional manual steps are needed to change the website’s DNS entry to point to the IP address of the failed over webserver.

Could the CloudEndure failover and DNS update be automated? Yes.

Amazon Route 53 is a highly available and scalable Domain Name System (DNS) web service with three main functions: domain registration, DNS routing, and health checking. A configured Route 53 health check monitors the endpoint of a webserver. If the health check fails over a specified period, an alarm is raised to execute an AWS Lambda function to start the CloudEndure failover process. In addition to health checks, Route 53 DNS Failover allows the DNS record for the webserver to be automatically updated based on a healthy endpoint. Now the previously manual process of updating the DNS record to point to the restored web server is automated. You can also build Route 53 DNS Failover configurations to support decision trees to handle complex configurations.

To illustrate this, the following builds on the example by having a primary, secondary, and tertiary DNS Failover choice for the web application:

How Health Checks Work in Complex Amazon Route 53 Configurations

When the CloudEndure failover action executes, it takes several minutes until the target EC2 instance is launched and configured by CloudEndure. A static web page hosted on S3 can be returned to end users to improve communication while the failover is happening.

To support this example, the Amazon Route 53 DNS failover decision tree can be configured with a primary, secondary, and tertiary failover choice. The decision tree logic for the scenario is the following (a sketch of the corresponding record sets appears after the list):

  1. If the primary health check passes, return the primary webserver.
  2. Else, if the secondary health check passes, return the failover webserver.
  3. Else, return the S3 static site.
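
The first two levels of this tree map directly onto Route 53 failover routing. The following is a minimal sketch using boto3; the hosted zone ID, health check IDs, and IP addresses are placeholders, and the health checks are assumed to already exist.

    import boto3

    route53 = boto3.client("route53")

    HOSTED_ZONE_ID = "Z0000000EXAMPLE"  # placeholder hosted zone
    PRIMARY_HEALTH_CHECK = "11111111-aaaa-bbbb-cccc-000000000001"
    SECONDARY_HEALTH_CHECK = "11111111-aaaa-bbbb-cccc-000000000002"

    route53.change_resource_record_sets(
        HostedZoneId=HOSTED_ZONE_ID,
        ChangeBatch={
            "Changes": [
                {   # Rule 1: primary webserver, returned while its health check passes
                    "Action": "UPSERT",
                    "ResourceRecordSet": {
                        "Name": "www.example.com.",
                        "Type": "A",
                        "SetIdentifier": "primary",
                        "Failover": "PRIMARY",
                        "TTL": 60,
                        "HealthCheckId": PRIMARY_HEALTH_CHECK,
                        "ResourceRecords": [{"Value": "198.51.100.10"}],
                    },
                },
                {   # Rule 2: failover webserver, used when the primary is unhealthy
                    "Action": "UPSERT",
                    "ResourceRecordSet": {
                        "Name": "www.example.com.",
                        "Type": "A",
                        "SetIdentifier": "secondary",
                        "Failover": "SECONDARY",
                        "TTL": 60,
                        "HealthCheckId": SECONDARY_HEALTH_CHECK,
                        "ResourceRecords": [{"Value": "203.0.113.20"}],
                    },
                },
            ]
        },
    )

Because each Route 53 failover pair has only a PRIMARY and a SECONDARY, the tertiary choice (rule 3) is implemented by chaining: the SECONDARY record above would instead be an alias to a second failover pair, whose own SECONDARY is an alias record pointing at the S3 static website endpoint.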

When the Route 53 health check monitoring the primary webserver endpoint fails, a CloudWatch alarm enters the ALARM state after a set period. This CloudWatch alarm then executes a Lambda function that calls the CloudEndure API to begin the failover. A sketch of such a function follows.
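
The following is a minimal sketch of that Lambda function, assuming the CloudEndure API token, project ID, and machine ID are provided as environment variables. The endpoint paths and payload shapes shown here are illustrative assumptions rather than a verified contract; consult the CloudEndure API documentation for the exact calls.

    import json
    import os
    import urllib.request

    API_BASE = "https://console.cloudendure.com/api/latest"  # assumed base URL

    def handler(event, context):
        # Authenticate with an API token (payload shape is an assumption).
        login = urllib.request.Request(
            f"{API_BASE}/login",
            data=json.dumps({"userApiToken": os.environ["CLOUDENDURE_API_TOKEN"]}).encode(),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(login) as resp:
            session_cookie = resp.headers["Set-Cookie"]

        # Start a recovery launch for the protected machine (assumed endpoint).
        launch = urllib.request.Request(
            f"{API_BASE}/projects/{os.environ['PROJECT_ID']}/launchMachines",
            data=json.dumps({
                "launchType": "RECOVERY",
                "items": [{"machineId": os.environ["MACHINE_ID"]}],
            }).encode(),
            headers={"Content-Type": "application/json", "Cookie": session_cookie},
            method="POST",
        )
        with urllib.request.urlopen(launch) as resp:
            return {"status": resp.status}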

In the screenshot below, both health checks are reporting “Unhealthy” and the primary health check’s alarm is in the ALARM state. At this point, the DNS failover logic should be returning the path to the static S3 site, and the Lambda function should have executed to start the CloudEndure failover.

The following architecture illustrates the completed scenario:

Conclusion

Having a disaster recovery strategy is critical for business continuity. The benefits of AWS combined with CloudEndure Disaster Recovery create a non-disruptive DR solution that provides minimal RTO and RPO while reducing total cost of ownership. CloudWatch alarms combined with AWS Lambda for serverless computing are building blocks for a variety of automation scenarios.

Architecting multiple microservices behind a single domain with Amazon API Gateway

Post Syndicated from James Beswick original https://aws.amazon.com/blogs/compute/architecting-multiple-microservices-behind-a-single-domain-with-amazon-api-gateway/

This post is courtesy of Roberto Iturralde, Solutions Architect.

Today’s modern architectures are increasingly microservices-based, with separate engineering teams working independently on services with their own feature requirements and deployment pipelines. The benefits of this approach include increased agility and release velocity.

Microservice architectures also come with some challenges, particularly when they make up parts of a public service or API. These include enforcing engineering and security standards and collating application logs and metrics for a cross-service operational view.

It’s also important to have the microservices feel like a cohesive product to external customers, for authentication and metering in particular:

  • The engineering teams want autonomy.
  • The security team wants a cross-service view and to make it easy for the teams to adhere to the organization’s guidelines.
  • Customers want to feel like they’re using a unified product.

The AWS toolbox

AWS offers many services that you can weave together to meet these needs.

Amazon API Gateway is a fully managed service for deploying and managing a unified front door to your applications. It has features for routing your domain’s traffic to different backing microservices, enforcing consistent authentication and authorization with fine-grained permissions across them, and implementing consistent API throttling and usage metering. The microservice that backs a given API can live in another AWS account, and you don’t have to expose it to the internet.

Amazon Cognito is a user management service with rich support for authentication and authorization of users. You can manage those users within Amazon Cognito or federate them from external identity providers (IdPs). Amazon Cognito can vend JSON Web Tokens and integrates natively with API Gateway to support OAuth scopes for fine-grained API access.
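
As one concrete illustration of that integration, the following sketch attaches a Cognito user pool authorizer to an API Gateway REST API with boto3. The API ID and user pool ARN are placeholders.

    import boto3

    apigw = boto3.client("apigateway")

    # Create a Cognito user pool authorizer on an existing REST API.
    authorizer = apigw.create_authorizer(
        restApiId="abc123",  # placeholder REST API ID
        name="product-user-pool",
        type="COGNITO_USER_POOLS",
        providerARNs=[
            "arn:aws:cognito-idp:us-east-1:111122223333:userpool/us-east-1_EXAMPLE"
        ],
        identitySource="method.request.header.Authorization",
    )

    # Individual methods then reference authorizer["id"] with authorization
    # type COGNITO_USER_POOLS, optionally restricting access by OAuth scope.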

Amazon CloudWatch is a monitoring and management service that collects and visualizes data across AWS services. CloudWatch dashboards are customizable home pages that can contain graphs showing metrics and alarms. You can customize these to represent a specific microservice, a collection of microservices that comprise a product, or any other meaningful view with fine-grained access control to the dashboard.

AWS X-Ray is an analysis and debugging tool designed for distributed applications. It has tools to help gain insight into the performance of your microservices, and the APIs that front them, to measure and debug any potential customer impact.

AWS Service Catalog allows the central management and self-service creation of AWS resources that meet your organization’s guidelines and best practices. You can require separate permissions for managing catalog entries from deploying catalog entries, allowing a central team to define and publish templates for resources across the company.

Architectural options

There are many options for how you can combine these AWS services to meet your requirements. Your decisions may also depend on your expertise with AWS. The following features are common to all the designs below:

  • Amazon Route 53 has registered custom domains and hosts their DNS. You could also use an external registrar and DNS service.
  • AWS Certificate Manager (ACM) manages Transport Layer Security (TLS) certificates for the custom domains that route traffic to API Gateway APIs in a given account.
  • Amazon Cognito manages the users who access the APIs in API Gateway.
  • Service Catalog holds catalog products for API Gateway APIs that adhere to the organizational guidelines and best practices, such as security configuration and default API throttling. Microservice teams have permission to create an API pointed to their service and configure specific parameters, with approvals required for production environments. For more information, see Standardizing infrastructure delivery in distributed environments using AWS Service Catalog.

The following shows common design patterns and their high-level benefits and challenges.

Single AWS account

Microservices, their fronting API Gateway APIs, and supporting services are in the same AWS account. This account also includes core AWS services such as the following:

  • Route 53 for domain name registration and DNS
  • ACM for managing server certificates for your domain
  • Amazon Cognito for user management
  • Service Catalog for the catalog of best-practice product templates to use across the organization

Single AWS account example

Use this approach if you do not yet have a multi-account strategy or if you use AWS native tools for observability. With a single AWS account, the microservices can share the same networking topology, and so more easily communicate with each other when needed. With all the API Gateway APIs in the same AWS account, you can configure API throttling, metering, authentication, and authorization features for a unified experience for customers. You can also route traffic to a given API using subdomains or base path mapping in API Gateway, as the sketch below shows.
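
The following is a minimal sketch of base path mapping with boto3, routing two microservice APIs under one custom domain. The domain name, API IDs, and stage name are placeholders, and the custom domain (with its ACM certificate) is assumed to already exist.

    import boto3

    apigw = boto3.client("apigateway")

    # Map each microservice under its own base path, for example:
    #   https://api.example.com/orders -> orders API
    #   https://api.example.com/users  -> users API
    for base_path, api_id in [("orders", "abc123"), ("users", "def456")]:
        apigw.create_base_path_mapping(
            domainName="api.example.com",
            basePath=base_path,
            restApiId=api_id,
            stage="prod",
        )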

With a single AWS account, you can manage the TLS certificates for your domains in one place and make them available to all API Gateway APIs. Having the microservices and their API Gateway APIs in the same AWS account also gives more complete X-Ray service maps, given that X-Ray currently can’t analyze traces across AWS accounts. Similarly, you have a complete view of the metrics all AWS services publish to CloudWatch, which lets you create CloudWatch dashboards that span the API Gateway APIs and their backing microservices.

There is an increased blast radius with this architecture, because the microservices share the same account. The microservices can impact each other through shared AWS service limits or mistakes by team members on other microservice teams. Most AWS services support tagging for cost allocation and granular access control, but there are some features of AWS services that do not. Because of this, it’s more difficult to separate the costs of each microservice completely.

Separate AWS accounts

When using separate AWS accounts, each API Gateway API lives in the same AWS account as its backing microservice. Separate AWS accounts hold the Service Catalog portfolio, domain registration (using Route 53), and aggregated logs from the microservices. The organization account, security account, and other core accounts are discussed further in the AWS Landing Zone Solution.

Separate AWS accounts

Use this architecture if you have a mature multi-account strategy and existing tooling for cross-account observability. In this approach, an AWS account encapsulates a microservice completely, for cost isolation and reduced blast radius. With the API Gateway API in the same account as the backing microservice, you have a complete view of the microservice in CloudWatch and X-Ray.

You can only meter API usage by microservice because API Gateway usage plans can’t track activity across accounts. Implement a process to ensure each customer’s API Gateway API key is the same across accounts for a smooth customer experience.

API Gateway base path mappings are local to an AWS account, so you must use subdomains to separate the microservices that comprise a product under a single domain. You still have a complete view of each microservice in the CloudWatch dashboards and X-Ray console of its own AWS account, but a view across microservices requires aggregation in a central AWS account or an external tool.

Central API account

Using a central API account is similar to the separate account architecture, except the API Gateway APIs are in a central account.

Central API account

This architecture is the best approach for most users. It offers a balance of the benefits of microservice separation with the unification of particular services for a better end-user experience. Each microservice has an AWS account, which isolates it from the other services and reduces the risk of AWS service limit contention or accidents due to sharing the account with other engineering teams.

Because each microservice lives in a separate account, that account’s bill captures all the costs for that microservice. You can track the API costs, which are in the shared API account, using tags on API Gateway resources.

While the microservices are isolated in separate AWS accounts, the API Gateway throttling, metering, authentication, and authorization features are centralized for a consistent experience for customers. You can use subdomains or API Gateway base path mappings to route traffic to different API Gateway APIs. Also, the TLS certificates for your domains are centrally managed and available to all API Gateway APIs.

CloudWatch metrics, X-Ray traces, and application logs for a given microservice and its fronting API Gateway API are now split across accounts. Unify these in a central AWS account or a third-party tool.

Conclusion

The breadth of the AWS Cloud presents many architectural options to customers. When designing your systems, it’s essential to understand the benefits and challenges of design decisions before implementing a solution.

This post walked you through three common architectural patterns for allowing independent microservice teams to operate behind a unified domain presented to your customers. The best approach for your organization depends on your priorities, experience, and familiarity with AWS.

Creating static custom domain endpoints with Amazon MQ to simplify broker modification and scaling

Post Syndicated from Rachel Richardson original https://aws.amazon.com/blogs/compute/creating-static-custom-domain-endpoints-with-amazon-mq/

This post is courtesy of Wallace Printz, Senior Solutions Architect, AWS, and Christian Mueller, Senior Solutions Architect, AWS.

Many cloud-native application architectures take advantage of the point-to-point and publish-subscribe (“pub-sub”) model of message-based communication between application components. This architecture is generally more resilient to failure because of the loose coupling and because message processing failures can be retried. It’s also more efficient because individual application components can independently scale up or down to maintain message-processing SLAs, compared to monolithic application architectures. Synchronous (REST-based) systems are tightly coupled. A problem in a synchronous downstream dependency has an immediate impact on the upstream callers.

Retries from upstream callers can all too easily fan out and amplify problems. Amazon SQS and Amazon SNS are fully managed messaging services, but they are not necessarily the right tool for the job in some cases. For applications requiring messaging protocols such as JMS, NMS, AMQP, STOMP, MQTT, and WebSocket, AWS provides Amazon MQ. Amazon MQ is a managed message broker service for Apache ActiveMQ that makes it easy to set up and operate message brokers in the cloud.

Amazon MQ provides two managed broker deployment connection options: public brokers and private brokers. Public brokers receive internet-accessible IP addresses, while private brokers receive only private IP addresses from the corresponding CIDR range in their VPC subnet.

In some cases, for security purposes, you may prefer to place brokers in a private subnet while still allowing access to them through a persistent public endpoint, such as a subdomain of your corporate domain, like mq.example.com.

In this post, we explain how to provision private Amazon MQ brokers behind a secure public load balancer endpoint using an example subdomain.

Architecture overview

There are several reasons one might want to deploy this architecture beyond the security aspects.

First, human-readable URLs are easier for people to parse when reviewing operations and troubleshooting, such as deploying updates to mq-dev.example.com before mq-prod.example.com.

Second, maintaining static URLs for your brokers helps reduce the necessity of modifying client code when performing maintenance on the brokers.

Third, this pattern allows you to vertically scale your brokers without changing the client code or even notifying the clients that changes have been made.

Finally, the same architecture described here works for a network of brokers configuration as well, whereby you could horizontally scale your brokers without impacting the client code.

Prerequisites

This blog post assumes some familiarity with AWS networking fundamentals, such as VPCs, subnets, load balancers, and Amazon Route 53.

When you are finished, the architecture should be set up as shown in the following diagram. For ease of visualization, we demonstrate with a pair of brokers using the active-standby option.

Solution Overview

Amazon MQ solution overview

The client-to-broker traffic flow is as follows.

  • First, the client service tries to connect with a failover URL to the domain endpoint set up in Route 53. If a client loses the connection, the failover URL allows the client to automatically try to reconnect to the broker.
  • The client looks up the domain name in Route 53, and Route 53 returns the IP address of the Network Load Balancer.
  • The client creates a Secure Sockets Layer (SSL) connection to the Network Load Balancer, which presents an SSL certificate provided by AWS Certificate Manager (ACM). The Network Load Balancer selects a healthy broker from its target group and creates a separate SSL connection between the Network Load Balancer and the broker. This provides secure, end-to-end SSL-encrypted messaging between client and brokers.

In this diagram, the healthy broker connection is shown in the solid line. The standby broker, which does not reply to connection requests and is therefore marked as unhealthy in the target group, is shown in the dashed line.

Solution walkthrough

To build this architecture, build the network segmentation first, then the Amazon MQ brokers, and finally the network routing.

Setup

First, you need the following resources:

  • A VPC
  • One private subnet per Availability Zone
  • One public subnet for your bastion host (if desired)

This demonstration VPC uses the 20.0.0.0/16 CIDR range.

Additionally, you must create a custom security group for your brokers. Set up this security group to allow traffic from your Network Load Balancer and, if using a network of brokers, among the brokers as well.

Because this VPC is not used for any other workloads, this demonstration allows all incoming traffic originating within the VPC, including from the Network Load Balancer, through to the brokers on the following ports:

  • OpenWire communication port of 61617
  • Apache ActiveMQ console port of 8162

If you are using a different protocol, adjust the port numbers accordingly. A sketch of the corresponding security group rules follows.
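
The following is a minimal boto3 sketch of that security group, assuming placeholder values for the VPC ID and CIDR range.

    import boto3

    ec2 = boto3.client("ec2")

    VPC_ID = "vpc-0123456789abcdef0"   # placeholder VPC ID
    VPC_CIDR = "20.0.0.0/16"           # the demonstration VPC CIDR range

    sg = ec2.create_security_group(
        GroupName="amazon-mq-brokers",
        Description="OpenWire and ActiveMQ console traffic from within the VPC",
        VpcId=VPC_ID,
    )

    ec2.authorize_security_group_ingress(
        GroupId=sg["GroupId"],
        IpPermissions=[
            {   # OpenWire messaging traffic
                "IpProtocol": "tcp", "FromPort": 61617, "ToPort": 61617,
                "IpRanges": [{"CidrIp": VPC_CIDR}],
            },
            {   # Apache ActiveMQ web console (also used for health checks)
                "IpProtocol": "tcp", "FromPort": 8162, "ToPort": 8162,
                "IpRanges": [{"CidrIp": VPC_CIDR}],
            },
        ],
    )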

Create an Amazon MQ security group

Building the Amazon MQ brokers

Now that you have the network segmentation set up, build the Amazon MQ brokers. As mentioned previously, this demonstration uses the active-standby pair of private brokers option.

Configure the broker settings by selecting a broker name, instance type, ActiveMQ console user, and password first.

Configure Amazon MQ broker settings

In the Additional Settings area, place the brokers in your previously selected VPC and the associated private subnets.

Configure Amazon MQ additional settings

Finally, select the existing Security Group previously discussed, and make sure that the Public Accessibility option is set to No.

Set Amazon MQ security group settings

That’s it for the brokers. When provisioning is complete, the Amazon MQ dashboard should look like the one shown in the following screenshot. Note the IP addresses of the brokers and the ActiveMQ web console URLs for later.

Amazon MQ dashboard

Configuring a Load Balancer Target Group

The next step in the build process is to configure the load balancer’s target group. This demonstration uses the private IP addresses of the brokers as targets for the Network Load Balancer.

Create and name a target group, select the IP option under Target type, and make sure to select TLS under Protocol and 61617 under Port, as well as the VPC in which your brokers reside. It is important to configure the health check settings so traffic is routed only to active brokers: select the TCP protocol and override the health check port to 8162, the Apache ActiveMQ console port.

Do not use the OpenWire port as the target group health check port. Because the Network Load Balancer may not be able to recognize the host as healthy on that port, it is better to use the ActiveMQ web console port.

Next, add the brokers’ IP addresses as targets. You can find the broker IP addresses in the Amazon MQ console page after they complete provisioning. Make sure to add both the active and the standby broker to the target group so that when reboots occur, the Network Load Balancer routes traffic to whichever broker is active. A sketch of these steps follows.
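
The following boto3 sketch creates the target group and registers both brokers, assuming placeholder values for the VPC ID and the broker IP addresses.

    import boto3

    elbv2 = boto3.client("elbv2")

    VPC_ID = "vpc-0123456789abcdef0"          # placeholder VPC ID
    BROKER_IPS = ["20.0.1.10", "20.0.2.10"]   # active and standby broker IPs

    tg = elbv2.create_target_group(
        Name="amazon-mq-brokers",
        Protocol="TLS",
        Port=61617,
        VpcId=VPC_ID,
        TargetType="ip",
        HealthCheckProtocol="TCP",
        HealthCheckPort="8162",   # ActiveMQ console port, not the OpenWire port
    )
    tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]

    # Register both brokers; the standby fails the health check until it becomes active.
    elbv2.register_targets(
        TargetGroupArn=tg_arn,
        Targets=[{"Id": ip, "Port": 61617} for ip in BROKER_IPS],
    )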

You may be pursuing a more dynamic environment for scaling brokers up and down to handle the demands of a variable message load. In that case, as you scale to add more brokers, make sure that you also add them to the target group.

AWS Lambda is a great way to programmatically add or remove broker IP addresses from this target group.

Creating a Network Load Balancer

Next, create a Network Load Balancer. This demo uses an internet-facing load balancer with TLS listeners on port 61617, and routes traffic to brokers’ VPC and private subnets.

Configure a network load balancer

Clients must securely connect to the Network Load Balancer, so this demo uses an ACM certificate for the subdomain registered in Route 53, such as mq.example.com. For simplicity, ACM certificate provisioning is not shown. For more information, see Request a Public Certificate.

Make sure that the ACM certificate is provisioned in the same Region as your Network Load Balancer, or the certificate is not displayed in the selection menu.

Next, select the target group that you just created, and select TLS for the connection between the Network Load Balancer and the brokers. Similarly, select the health checks on TCP port 8162.

If all went well, you see the brokers’ IP addresses listed as targets. From here, review your settings and confirm that you’d like to deploy the Network Load Balancer. A sketch of the equivalent API calls follows.
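
The following boto3 sketch creates the Network Load Balancer and its TLS listener, assuming placeholder subnet IDs, a placeholder ACM certificate ARN, and the target group ARN from the previous step.

    import boto3

    elbv2 = boto3.client("elbv2")

    lb = elbv2.create_load_balancer(
        Name="amazon-mq-nlb",
        Type="network",
        Scheme="internet-facing",
        Subnets=["subnet-0123456789abcdef0", "subnet-0fedcba9876543210"],
    )
    lb_arn = lb["LoadBalancers"][0]["LoadBalancerArn"]

    # TLS listener that terminates with the ACM certificate and forwards to the brokers.
    elbv2.create_listener(
        LoadBalancerArn=lb_arn,
        Protocol="TLS",
        Port=61617,
        Certificates=[{"CertificateArn": "arn:aws:acm:us-west-2:111122223333:certificate/example"}],
        DefaultActions=[{
            "Type": "forward",
            "TargetGroupArn": "arn:aws:elasticloadbalancing:us-west-2:111122223333:targetgroup/amazon-mq-brokers/0123456789abcdef",
        }],
    )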

Configuring Route 53

The last step in this build is to configure Route 53 to serve traffic at the subdomain of your choice to your Network Load Balancer.

Go to your Route 53 hosted zone, and create a new subdomain record set, such as mq.example.com, that matches the ACM certificate that you previously created. In the Type field, select A – IPv4 address, then select Yes for Alias, which allows you to select the Network Load Balancer as the alias target. Select the Network Load Balancer that you just created from the Alias Target menu and save the record set. A sketch of the equivalent API call follows.
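
The following boto3 sketch creates that alias record, looking up the load balancer’s DNS name and canonical hosted zone ID first. The load balancer ARN and hosted zone ID are placeholders.

    import boto3

    elbv2 = boto3.client("elbv2")
    route53 = boto3.client("route53")

    lb = elbv2.describe_load_balancers(
        LoadBalancerArns=[
            "arn:aws:elasticloadbalancing:us-west-2:111122223333:loadbalancer/net/amazon-mq-nlb/0123456789abcdef"
        ]
    )["LoadBalancers"][0]

    route53.change_resource_record_sets(
        HostedZoneId="Z0000000EXAMPLE",   # placeholder: the example.com hosted zone
        ChangeBatch={
            "Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "mq.example.com.",
                    "Type": "A",
                    "AliasTarget": {
                        "HostedZoneId": lb["CanonicalHostedZoneId"],
                        "DNSName": lb["DNSName"],
                        "EvaluateTargetHealth": False,
                    },
                },
            }]
        },
    )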

Testing broker connectivity

And that’s it!

There’s an important advantage to this architecture. When you create Amazon MQ active-standby brokers, the Amazon MQ service provides two endpoints. Only one broker host is active at a time, and when configuration changes or other reboot events occur, the standby broker becomes active and the active broker goes to standby. The typical connection string, when there is an option to connect to multiple brokers, is similar to the following:

"failover:(ssl://b-ce452fbe-2581-4003-8ce2-4185b1377b43-1.mq.us-west-2.amazonaws.com:61617,ssl://b-ce452fbe-2581-4003-8ce2-4185b1377b43-2.mq.us-west-2.amazonaws.com:61617)"

In this architecture, you use only a single connection URL, but you still want to use the failover protocol to force reconnection if the connection is dropped for any reason.

For ease of use, this solution relies on the Amazon MQ workshop client application code from re:Invent 2018. To test this solution, set the connection URL to the following:

"failover:(ssl://mq.example.com:61617

Run the producer and consumer clients in separate terminal windows.

The messages are sent and received successfully across the internet, while the brokers are hidden behind the Network Load Balancer.

Logging into the broker’s ActiveMQ console

But what if we want to log in to the broker’s ActiveMQ web console?

There are three options. Due to the security group rules, only traffic originating from inside the VPC is allowed to the brokers.

  • Use a VPN connection from the corporate network to the VPC. Most customers likely use this option, but for rapid testing there is a simpler, more cost-effective method: the bastion host described below.
  • Connect to the brokers’ web console through a Route 53 subdomain, which requires creating a separate port 8162 listener on the existing Network Load Balancer and a separate TLS target group on port 8162 for the brokers.
  • Use a bastion host to proxy traffic to the web console.

To use a bastion host, create a small Linux EC2 instance in your public subnet, and make sure that:

  • The EC2 instance has a public IP address.
  • You have access to the SSH key pair.
  • It is placed in a security group that allows SSH port 22 traffic from your location.

For simplicity, this step is not shown, but this demonstration uses a t3.micro Amazon Linux 2 host with all default options as the bastion.

Creating a forwarding tunnel

Next, create a forwarding tunnel through an SSH connection to the bastion host. The example command below keeps a persistent SSH connection that forwards port 8162 through the bastion host’s public IP address.

For example, the command could be:

ssh -D 8162 -C -q -N -i <my-key-pair-name>.pem ec2-user@<ec2-ip-address>

You can also configure a browser to tunnel traffic through your proxy.

We have chosen to demonstrate with Firefox. Configure the network settings to use a manual proxy on localhost on the Apache ActiveMQ console port of 8162. Open the Firefox Connection Settings, and in the Configure Proxy Access to the Internet section, select Manual proxy configuration. Set the SOCKS Host to localhost and Port to 8162, leaving the other fields empty.

Finally, use the Apache ActiveMQ console URL provided in the Amazon MQ web console details page to connect to the broker through the proxy.

ActiveMQ screenshot

Conclusion

Congratulations! You’ve successfully built a highly available Amazon MQ broker pair in a private subnet. You’ve layered your security defense by putting the brokers behind a highly scalable Network Load Balancer, and you’ve configured routing from a single custom subdomain URL to multiple brokers with health checks built in.

To learn more about Amazon MQ and scalable broker communication patterns, we highly recommend the following resources:

Keep on building!

Simplify DNS management in a multi-account environment with Route 53 Resolver

Post Syndicated from Mahmoud Matouk original https://aws.amazon.com/blogs/security/simplify-dns-management-in-a-multiaccount-environment-with-route-53-resolver/

In a previous post, I showed you a solution to implement central DNS in a multi-account environment that simplified DNS management by reducing the number of servers and forwarders you needed when implementing cross-account and AWS-to-on-premises domain resolution. With the release of the Amazon Route 53 Resolver service, you now have access to a native conditional forwarder that will simplify hybrid DNS resolution even more.

In this post, I’ll show you a modernized solution to centralize DNS management in a multi-account environment by using Route 53 Resolver. This solution allows you to resolve domains across multiple accounts and between workloads running on AWS and on-premises without the need to run a domain controller in AWS.

Solution overview

My solution shows you how to solve three primary use cases for domain resolution:

  • Resolving on-premises domains from workloads running in your VPCs.
  • Resolving private domains in your AWS environment from workloads running on-premises.
  • Resolving private domains between workloads running in different AWS accounts.

The following diagram shows the high-level architecture.

Figure 1: Solution architecture diagram

In this architecture:

  1. This is the Amazon-provided default DNS server for the central DNS VPC, which we’ll refer to as the DNS-VPC. This is the second IP address in the VPC CIDR range (as illustrated, this is 172.27.0.2). This default DNS server will be the primary domain resolver for all workloads running in participating AWS accounts.
  2. This shows the Route 53 Resolver endpoints. The inbound endpoint will receive queries forwarded from on-premises DNS servers and from workloads running in participating AWS accounts. The outbound endpoint will be used to forward domain queries from AWS to on-premises DNS.
  3. This shows conditional forwarding rules. For this architecture, we need two rules, one to forward domain queries for onprem.private zone to the on-premises DNS server through the outbound gateway, and a second rule to forward domain queries for awscloud.private to the resolver inbound endpoint in DNS-VPC.
  4. This indicates that these two forwarding rules are shared with all other AWS accounts through AWS Resource Access Manager and are associated with all VPCs in these accounts.
  5. This shows the private hosted zone created in each account with a unique subdomain of awscloud.private.
  6. This shows the on-premises DNS server with conditional forwarders configured to forward queries to the awscloud.private zone to the IP addresses of the Resolver inbound endpoint.

Note: This solution doesn’t require VPC peering or connectivity between the source/destination VPCs and the DNS-VPC.

How it works

Now, I’m going to show how the domain resolution flow of this architecture works according to the three use-cases I’m focusing on.

First use case

Figure 2: Use case for resolving on-premises domains from workloads running in AWS

First, I’ll look at resolving on-premises domains from workloads running in AWS. If the server with private domain host1.acc1.awscloud.private attempts to resolve the address host1.onprem.private, here’s what happens:

  1. The DNS query routes to the default DNS server of the VPC that hosts host1.acc1.awscloud.private.
  2. Because the VPC is associated with the forwarding rules shared from the central DNS account, the default Amazon-provided DNS in the VPC evaluates these rules.
  3. In this example, one of the rules indicates that queries for onprem.private should be forwarded to an on-premises DNS server, so the query is forwarded accordingly.
  4. Because the forwarding rule is associated with the Resolver outbound endpoint, the query is forwarded through this endpoint to the on-premises DNS server.

In this flow, the DNS query that was initiated in one of the participating accounts has been forwarded to the centralized DNS server which, in turn, forwarded this to the on-premises DNS.

Second use case

Next, here’s how on-premises workloads will be able to resolve private domains in your AWS environment:
Figure 3: Use case for how on-premises workloads will be able to resolve private domains in your AWS environment

In this case, the query for host1.acc1.awscloud.private is initiated from an on-premises host. Here’s what happens next:

  1. The domain query is forwarded to on-premises DNS server.
  2. The query is then forwarded to the Resolver inbound endpoint via a conditional forwarder rule on the on-premises DNS server.
  3. The query reaches the default DNS server for DNS-VPC.
  4. Because DNS-VPC is associated with the private hosted zone acc1.awscloud.private, the default DNS server will be able to resolve this domain.

In this case, the DNS query has been initiated on-premises and forwarded to centralized DNS on the AWS side through the inbound endpoint.

Third use case

Finally, you might need to resolve domains across multiple AWS accounts. Here’s how you could achieve this:
Figure 4: Use case for how to resolve domains across multiple AWS accounts

Let’s say that host1 in host1.acc1.awscloud.private attempts to resolve the domain host2.acc2.awscloud.private. Here’s what happens:

  1. The domain query is sent to the default DNS server for the VPC hosting source machine (host1).
  2. Because the VPC is associated with the shared forwarding rules, these rules will be evaluated.
  3. A rule indicates that queries for the awscloud.private zone should be forwarded to the IP addresses of the resolver inbound endpoint in DNS-VPC, where the Amazon-provided default DNS resolves the query.
  4. Because DNS-VPC is associated with the acc2.awscloud.private hosted zone, the default DNS will use auto-defined rules to resolve this domain.

This use case explains the AWS-to-AWS case where the DNS query has been initiated on one participating account and forwarded to central DNS for resolution of domains in another AWS account. Now, I’ll look at what it takes to build this solution in your environment.

How to deploy the solution

I’ll show you how to configure this solution in four steps:

  1. Set up a centralized DNS account.
  2. Set up each participating account.
  3. Create private hosted zones and Route 53 associations.
  4. Configure on-premises DNS forwarders.

Step 1: Set up a centralized DNS account

In this step, you’ll set up resources in the centralized DNS account. Primarily, this includes the DNS-VPC, Resolver endpoints, and forwarding rules.

  1. Create a VPC to act as DNS-VPC according to your business scenario, either using the web console or from an AWS Quick Start. You can review common scenarios in the Amazon VPC user guide; one very common scenario is a VPC with public and private subnets.
  2. Create resolver endpoints. You need to create an outbound endpoint to forward DNS queries to on-premises DNS and an inbound endpoint to receive DNS queries forwarded from on-premises workloads and other AWS accounts.
  3. Create two forwarding rules. The first rule forwards DNS queries for the zone onprem.private to your on-premises DNS server IP addresses, and the second rule forwards DNS queries for the zone awscloud.private to the IP addresses of the resolver inbound endpoint (see the sketch after this list).
  4. After creating the rules, associate them with DNS-VPC that was created in step #1. This will allow the Route 53 Resolver to start forwarding domain queries accordingly.
  5. Finally, you need to share the two forwarding rules with all participating accounts. To do that, you’ll use AWS Resource Access Manager and you can share the rules with your entire AWS Organization or with specific accounts.
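
The following boto3 sketch covers steps 3 through 5: creating the two rules, associating them with DNS-VPC, and sharing them through AWS Resource Access Manager. All IDs, IP addresses, and the organization ARN are placeholders.

    import boto3

    resolver = boto3.client("route53resolver")
    ram = boto3.client("ram")

    OUTBOUND_ENDPOINT_ID = "rslvr-out-0123456789abcdef0"   # placeholder
    INBOUND_ENDPOINT_IPS = ["172.27.1.10", "172.27.2.10"]  # placeholder
    ONPREM_DNS_IPS = ["10.0.0.2", "10.0.0.3"]              # placeholder

    # Rule 1: forward onprem.private queries to the on-premises DNS servers.
    onprem_rule = resolver.create_resolver_rule(
        CreatorRequestId="onprem-rule-1",
        Name="forward-onprem-private",
        RuleType="FORWARD",
        DomainName="onprem.private",
        TargetIps=[{"Ip": ip, "Port": 53} for ip in ONPREM_DNS_IPS],
        ResolverEndpointId=OUTBOUND_ENDPOINT_ID,
    )

    # Rule 2: forward awscloud.private queries to the inbound endpoint in DNS-VPC.
    aws_rule = resolver.create_resolver_rule(
        CreatorRequestId="awscloud-rule-1",
        Name="forward-awscloud-private",
        RuleType="FORWARD",
        DomainName="awscloud.private",
        TargetIps=[{"Ip": ip, "Port": 53} for ip in INBOUND_ENDPOINT_IPS],
        ResolverEndpointId=OUTBOUND_ENDPOINT_ID,
    )

    # Associate both rules with DNS-VPC so the resolver starts using them.
    for rule in (onprem_rule, aws_rule):
        resolver.associate_resolver_rule(
            ResolverRuleId=rule["ResolverRule"]["Id"],
            VPCId="vpc-0123456789abcdef0",   # placeholder DNS-VPC ID
        )

    # Share both rules with the AWS Organization through Resource Access Manager.
    ram.create_resource_share(
        name="central-dns-forwarding-rules",
        resourceArns=[
            onprem_rule["ResolverRule"]["Arn"],
            aws_rule["ResolverRule"]["Arn"],
        ],
        principals=["arn:aws:organizations::111122223333:organization/o-exampleorgid"],
    )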

Note: To be able to forward domain queries to your on-premises DNS server, you need connectivity between your data center and DNS-VPC, which could be established either using site-to-site VPN or AWS Direct Connect.

Step 2: Set up participating accounts

For each participating account, you need to configure your VPCs to use the shared forwarding rules, and you need to create a private hosted zone for each account.

  • Accept the shared rules from AWS Resource Access Manager. This step is not required if the rules were shared to your AWS Organization. Then, associate the forwarding rules with the VPCs that host your workloads in each account. Once associated, the resolver will start forwarding DNS queries according to the rules.

At this point, you should be able to resolve on-premises domains from workloads running in any VPC associated with the shared forwarding rules. To create private domains in AWS, you need to create Private Hosted Zones.

Step 3: Create private hosted zones

In this step, you need to create a private hosted zone in each account with a subdomain of awscloud.private. Use unique names for each private hosted zone to avoid domain conflicts in your environment (for example, acc1.awscloud.private or dev.awscloud.private).

  1. Create a private hosted zone in each participating account with a subdomain of awscloud.private, and associate it with VPCs running in that account (see the sketch after this list).
  2. Associate the private hosted zone with DNS-VPC. This allows the centralized DNS-VPC to resolve domains in the private hosted zone and act as a DNS resolver between AWS accounts.
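
The following boto3 sketch creates such a private hosted zone in a participating account, using placeholder values for the zone name, Region, and VPC ID.

    import boto3

    route53 = boto3.client("route53")

    route53.create_hosted_zone(
        Name="acc1.awscloud.private",           # unique subdomain per account
        CallerReference="acc1-private-zone-1",  # any unique string
        HostedZoneConfig={"PrivateZone": True},
        VPC={"VPCRegion": "us-east-1", "VPCId": "vpc-0123456789abcdef0"},
    )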

Because the private hosted zone and DNS-VPC are in different accounts, you need to associate the private hosted zone with DNS-VPC. To do that, you need to create authorization from the account that owns the private hosted zone and accept this authorization from the account that owns DNS-VPC. You can do that using AWS CLI:

  1. In each participating account, create the authorization using the private hosted zone ID, the region, and the VPC ID that you want to associate (DNS-VPC).
    
        aws route53 create-vpc-association-authorization --hosted-zone-id <hosted-zone-id> --vpc VPCRegion=<region>,VPCId=<vpc-id>
    

  2. In the centralized DNS account, associate the DNS-VPC with the hosted zone in each participating account.
    
        aws route53 associate-vpc-with-hosted-zone --hosted-zone-id <hosted-zone-id> --vpc VPCRegion=<region>,VPCId=<vpc-id>    
    

Step 4: Configure on-premises DNS forwarders

To be able to resolve subdomains within the awscloud.private domain from workloads running on-premises, you need to configure conditional forwarding rules that forward domain queries to the two IP addresses of the resolver inbound endpoints that were created in the central DNS account. Note that this requires connectivity between your data center and DNS-VPC, which could be established using either a site-to-site VPN or AWS Direct Connect.

Additional considerations and limitations

Thanks to the flexibility of Route 53 Resolver and conditional forwarding rules, you can control which queries to send to central DNS and which to resolve locally in the same account. This is particularly important when you plan to use some AWS services, such as AWS PrivateLink or Amazon Elastic File System (Amazon EFS), because domain names associated with these services need to be resolved locally in the account that owns them. In this section, I describe two use cases that require additional consideration.

  1. Interface VPC Endpoints (AWS PrivateLink)

    When you create an AWS PrivateLink interface endpoint, AWS generates endpoint-specific DNS hostnames that you can use to communicate with the service. For AWS services and AWS Marketplace partner services, you can optionally enable private DNS for the endpoint. This option associates a private hosted zone with your VPC. The hosted zone contains a record set for the default DNS name for the service (for example, ec2.us-east-1.amazonaws.com) that resolves to the private IP addresses of the endpoint network interfaces in your VPC. This enables you to make requests to the service using its default DNS hostname instead of the endpoint-specific DNS hostnames.

    If you use private DNS for your endpoint, you have to resolve DNS queries to the endpoint locally in the account and use the default DNS provided by AWS. So, in this case, I recommend that you resolve domain queries in amazonaws.com locally and not forward these queries to central DNS.

  2. Mounting EFS with a DNS name

    You can mount an Amazon EFS file system on an Amazon EC2 instance using DNS names. The file system DNS name automatically resolves to the mount target’s IP address in the Availability Zone of the connecting Amazon EC2 instance. To be able to do that, the VPC must use the default DNS provided by Amazon to resolve EFS DNS names.

    If you plan to use EFS in your environment, I recommend that you resolve EFS DNS names locally and avoid sending these queries to central DNS, because clients would then not receive answers optimized for their Availability Zone, which might result in higher operation latencies and less durability.

Summary

In this post, I introduced a simplified solution for implementing central DNS resolution in a multi-account and hybrid environment. This solution uses Amazon Route 53 Resolver, AWS Resource Access Manager, and native Route 53 capabilities, and it reduces complexity and operational effort by removing the need for custom DNS servers or forwarders in your AWS environment.

If you have feedback about this blog post, submit comments in the Comments section below. If you have questions about this blog post, start a new thread in the AWS forums.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Mahmoud Matouk

Mahmoud is part of our worldwide public sector Solutions Architects team, helping higher education customers build innovative, secure, and highly available solutions using various AWS services.