Tag Archives: DNS

How Rust and Wasm power Cloudflare’s 1.1.1.1

Post Syndicated from Anbang Wen original https://blog.cloudflare.com/big-pineapple-intro/

How Rust and Wasm power Cloudflare's 1.1.1.1

How Rust and Wasm power Cloudflare's 1.1.1.1

On April 1, 2018, Cloudflare announced the 1.1.1.1 public DNS resolver. Over the years, we added the debug page for troubleshooting, global cache purge, 0 TTL for zones on Cloudflare, Upstream TLS, and 1.1.1.1 for families to the platform. In this post, we would like to share some behind the scenes details and changes.

When the project started, Knot Resolver was chosen as the DNS resolver. We started building a whole system on top of it, so that it could fit Cloudflare’s use case. Having a battle tested DNS recursive resolver, as well as a DNSSEC validator, was fantastic because we could spend our energy elsewhere, instead of worrying about the DNS protocol implementation.

Knot Resolver is quite flexible in terms of its Lua-based plugin system. It allowed us to quickly extend the core functionality to support various product features, like DoH/DoT, logging, BPF-based attack mitigation, cache sharing, and iteration logic override. As the traffic grew, we reached certain limitations.

Lessons we learned

Before going any deeper, let’s first have a bird’s-eye view of a simplified Cloudflare data center setup, which could help us understand what we are going to talk about later. At Cloudflare, every server is identical: the software stack running on one server is exactly the same as on another server, only the configuration may be different. This setup greatly reduces the complexity of fleet maintenance.

How Rust and Wasm power Cloudflare's 1.1.1.1
Figure 1 Data center layout

The resolver runs as a daemon process, kresd, and it doesn’t work alone. Requests, specifically DNS requests, are load-balanced to the servers inside a data center by Unimog. DoH requests are terminated at our TLS terminator. Configs and other small pieces of data can be delivered worldwide by Quicksilver in seconds. With all the help, the resolver can concentrate on its own goal – resolving DNS queries, and not worrying about transport protocol details. Now let’s talk about 3 key areas we wanted to improve here – blocking I/O in plugins, a more efficient use of cache space, and plugin isolation.

Callbacks blocking the event loop

Knot Resolver has a very flexible plugin system for extending its core functionality. The plugins are called modules, and they are based on callbacks. At certain points during request processing, these callbacks will be invoked with current query context. This gives a module the ability to inspect, modify, and even produce requests / responses. By design, these callbacks are supposed to be simple, in order to avoid blocking the underlying event loop. This matters because the service is single threaded, and the event loop is in charge of serving many requests at the same time. So even just one request being held up in a callback means that no other concurrent requests can be progressed until the callback finishes.

The setup worked well enough for us until we needed to do blocking operations, for example, to pull data from Quicksilver before responding to the client.

Cache efficiency

As requests for a domain could land on any node inside a data center, it would be wasteful to repetitively resolve a query when another node already has the answer. By intuition, the latency could be improved if the cache could be shared among the servers, and so we created a cache module which multicasted the newly added cache entries. Nodes inside the same data center could then subscribe to the events and update their local cache.

The default cache implementation in Knot Resolver is LMDB. It is fast and reliable for small to medium deployments. But in our case, cache eviction shortly became a problem. The cache itself doesn’t track for any TTL, popularity, etc. When it’s full, it just clears all the entries and starts over. Scenarios like zone enumeration could fill the cache with data that is unlikely to be retrieved later.

Furthermore, our multicast cache module made it worse by amplifying the less useful data to all the nodes, and led them to the cache high watermark at the same time. Then we saw a latency spike because all the nodes dropped the cache and started over around the same time.

Module isolation

With the list of Lua modules increasing, debugging issues became increasingly difficult. This is because a single Lua state is shared among all the modules, so one misbehaving module could affect another. For example, when something went wrong inside the Lua state, like having too many coroutines, or being out of memory, we got lucky if the program just crashed, but the resulting stack traces were hard to read. It is also difficult to forcibly tear down, or upgrade, a running module as it not only has state in the Lua runtime, but also FFI, so memory safety is not guaranteed.

Hello BigPineapple

We didn’t find any existing software that would meet our somewhat niche requirements, so eventually we started building something ourselves. The first attempt was to wrap Knot Resolver’s core with a thin service written in Rust (modified edgedns).

This proved to be difficult due to having to constantly convert between the storage, and C/FFI types, and some other quirks (for example, the ABI for looking up records from cache expects the returned records to be immutable until the next call, or the end of the read transaction). But we learned a lot from trying to implement this sort of split functionality where the host (the service) provides some resources to the guest (resolver core library), and how we would make that interface better.

In the later iterations, we replaced the entire recursive library with a new one based around an async runtime; and a redesigned module system was added to it, sneakily rewriting the service into Rust over time as we swapped out more and more components. That async runtime was tokio, which offered a neat thread pool interface for running both non-blocking and blocking tasks, as well as a good ecosystem for working with other crates (Rust libraries).

After that, as the futures combinators became tedious, we started converting everything to async/await. This was before the async/await feature that landed in Rust 1.39, which led us to use nightly (Rust beta) for a while and had some hiccups. When the async/await stabilized, it enabled us to write our request processing routine ergonomically, similar to Go.

All the tasks can be run concurrently, and certain I/O heavy ones can be broken down into smaller pieces, to benefit from a more granular scheduling. As the runtime executes tasks on a threadpool, instead of a single thread, it also benefits from work stealing. This avoids a problem we previously had, where a single request taking a lot of time to process, that blocks all the other requests on the event loop.

How Rust and Wasm power Cloudflare's 1.1.1.1
Figure 2 Components overview

Finally, we forged a platform that we are happy with, and we call it BigPineapple. The figure above shows an overview of its main components and the data flow between them. Inside BigPineapple, the server module gets inbound requests from the client, validates and transforms them into unified frame streams, which can then be processed by the worker module. The worker module has a set of workers, whose task is to figure out the answer to the question in the request. Each worker interacts with the cache module to check if the answer is there and still valid, otherwise it drives the recursor module to recursively iterate the query. The recursor doesn’t do any I/O, when it needs anything, it delegates the sub-task to the conductor module. The conductor then uses outbound queries to get the information from upstream nameservers. Through the whole process, some modules can interact with the Sandbox module, to extend its functionality by running the plugins inside.

Let’s look at some of them in more detail, and see how they helped us overcome the problems we had before.

Updated I/O architecture

A DNS resolver can be seen as an agent between a client and several authoritative nameservers: it receives requests from the client, recursively fetches data from the upstream nameservers, then composes the responses and sends them back to the client. So it has both inbound and outbound traffic, which are handled by the server and the conductor component respectively.

The server listens on a list of interfaces using different transport protocols. These are later abstracted into streams of “frames”. Each frame is a high level representation of a DNS message, with some extra metadata. Underneath, it can be a UDP packet, a segment of TCP stream, or the payload of a HTTP request, but they are all processed the same way. The frame is then converted into an asynchronous task, which in turn is picked up by a set of workers in charge of resolving these tasks. The finished tasks are converted back into responses, and sent back to the client.

This “frame” abstraction over the protocols and their encodings simplified the logic used to regulate the frame sources, such as enforcing fairness to prevent starving and controlling pacing to protect the server from being overwhelmed. One of the things we’ve learned with the previous implementations is that, for a service open to the public, a peak performance of the I/O matters less than the ability to pace clients fairly. This is mainly because the time and computational cost of each recursive request is vastly different (for example a cache hit from a cache miss), and it’s difficult to guess it beforehand. The cache misses in recursive service not only consume Cloudflare’s resources, but also the resources of the authoritative nameservers being queried, so we need to be mindful of that.

On the other side of the server is the conductor, which manages all the outbound connections. It helps to answer some questions before reaching out to the upstream: Which is the fastest nameserver to connect to in terms of latency? What to do if all the nameservers are not reachable? What protocol to use for the connection, and are there any better options? The conductor is able to make these decisions by tracking the upstream server’s metrics, such as RTT, QoS, etc. With that knowledge, it can also guess for things like upstream capacity, UDP packet loss, and take necessary actions, e.g. retry when it thinks the previous UDP packet didn’t reach the upstream.

How Rust and Wasm power Cloudflare's 1.1.1.1
Figure 3 I/O conductor

Figure 3 shows a simplified data flow about the conductor. It is called by the exchanger mentioned above, with upstream requests as input. The requests will be deduplicated first: meaning in a small window, if a lot of requests come to the conductor and ask for the same question, only one of them will pass, the others are put into a waiting queue. This is common when a cache entry expires, and can reduce unnecessary network traffic. Then based on the request and upstream metrics, the connection instructor either picks an open connection if available, or generates a set of parameters. With these parameters, the I/O executor is able to connect to the upstream directly, or even take a route via another Cloudflare data center using our Argo Smart Routing technology!

The cache

Caching in a recursive service is critical as a server can return a cached response in under one millisecond, while it will be hundreds of milliseconds to respond on a cache miss. As the memory is a finite resource (and also a shared resource in Cloudflare’s architecture), more efficient use of space for cache was one of the key areas we wanted to improve. The new cache is implemented with a cache replacement data structure (ARC), instead of a KV store. This makes good use of the space on a single node, as less popular entries are progressively evicted, and the data structure is resistant to scans.

Moreover, instead of duplicating the cache across the whole data center with multicast, as we did before, BigPineapple is aware of its peer nodes in the same data center, and relays queries from one node to another if it cannot find an entry in its own cache. This is done by consistent hashing the queries onto the healthy nodes in each data center. So, for example, queries for the same registered domain go through the same subset of nodes, which not only increases the cache hit ratio, but also helps the infrastructure cache, which stores information about performance and features of nameservers.

How Rust and Wasm power Cloudflare's 1.1.1.1
Figure 4 Updated data center layout

Async recursive library

The recursive library is the DNS brain of BigPineapple, as it knows how to find the answer to the question in the query. Starting from the root, it breaks down the client query into subqueries, and uses them to collect knowledge recursively from various authoritative nameservers on the internet. The product of this process is the answer. Thanks to the async/await it can be abstracted as a function like such:

async fn resolve(Request, Exchanger) → Result<Response>;

The function contains all the logic necessary to generate a response to a given request, but it doesn’t do any I/O on its own. Instead, we pass an Exchanger trait(Rust interface) that knows how to exchange DNS messages with upstream authoritative nameservers asynchronously. The exchanger is usually called at various await points – for example, when a recursion starts, one of the first things it does is that it looks up the closest cached delegation for the domain. If it doesn’t have the final delegation in cache, it needs to ask what nameservers are responsible for this domain and wait for the response, before it can proceed any further.

Thanks to this design, which decouples the “waiting for some responses” part from the recursive DNS logic, it is much easier to test by providing a mock implementation of the exchanger. In addition, it makes the recursive iteration code (and DNSSEC validation logic in particular) much more readable, as it’s written sequentially instead of being scattered across many callbacks.

Fun fact: writing a DNS recursive resolver from scratch is not fun at all!

Not only because of the complexity of DNSSEC validation, but also because of the necessary “workarounds” needed for various RFC incompatible servers, forwarders, firewalls, etc. So we ported deckard into Rust to help test it. Additionally, when we started migrating over to this new async recursive library, we first ran it in “shadow” mode: processing real world query samples from the production service, and comparing differences. We’ve done this in the past on Cloudflare’s authoritative DNS service as well. It is slightly more difficult for a recursive service due to the fact that a recursive service has to look up all the data over the Internet, and authoritative nameservers often give different answers for the same query due to localization, load balancing and such, leading to many false positives.

In December 2019, we finally enabled the new service on a public test endpoint (see the announcement) to iron out remaining issues before slowly migrating the production endpoints to the new service. Even after all that, we continued to find edge cases with the DNS recursion (and DNSSEC validation in particular), but fixing and reproducing these issues has become much easier due to the new architecture of the library.

Sandboxed plugins

Having the ability to extend the core DNS functionality on the fly is important for us, thus BigPineapple has its redesigned plugin system. Before, the Lua plugins run in the same memory space as the service itself, and are generally free to do what they want. This is convenient, as we can freely pass memory references between the service and modules using C/FFI. For example, to read a response directly from cache without having to copy to a buffer first. But it is also dangerous, as the module can read uninitialized memory, call a host ABI using a wrong function signature, block on a local socket, or do other undesirable things, in addition the service doesn’t have a way to restrict these behaviors.

So we looked at replacing the embedded Lua runtime with JavaScript, or native modules, but around the same time, embedded runtimes for WebAssembly (Wasm for short) started to appear. Two nice properties of WebAssembly programs are that it allows us to write them in the same language as the rest of the service, and that they run in an isolated memory space. So we started modeling the guest/host interface around the limitations of WebAssembly modules, to see how that would work.

BigPineapple’s Wasm runtime is currently powered by Wasmer. We tried several runtimes over time like Wasmtime, WAVM in the beginning, and found Wasmer was simpler to use in our case. The runtime allows each module to run in its own instance, with an isolated memory and a signal trap, which naturally solved the module isolation problem we described before. In addition to this, we can have multiple instances of the same module running at the same time. Being controlled carefully, the apps can be hot swapped from one instance to another without missing a single request! This is great because the apps can be upgraded on the fly without a server restart. Given that the Wasm programs are distributed via Quicksilver, BigPineapple’s functionality can be safely changed worldwide within a few seconds!

To better understand the WebAssembly sandbox, several terms need to be introduced first:

  • Host: the program which runs the Wasm runtime. Similar to a kernel, it has full control through the runtime over the guest applications.
  • Guest application: the Wasm program inside the sandbox. Within a restricted environment, it can only access its own memory space, which is provided by the runtime, and call the imported Host calls. We call it an app for short.
  • Host call: the functions defined in the host that can be imported by the guest. Comparable to syscall, it’s the only way guest apps can access the resources outside the sandbox.
  • Guest runtime: a library for guest applications to easily interact with the host. It implements some common interfaces, so an app can just use async, socket, log and tracing without knowing the underlying details.

Now it’s time to dive into the sandbox, so stay awhile and listen. First let’s start from the guest side, and see what a common app lifespan looks like. With the help of the guest runtime, guest apps can be written similar to regular programs. So like other executables, an app begins with a start function as an entrypoint, which is called by the host upon loading. It is also provided with arguments as from the command line. At this point, the instance normally does some initialization, and more importantly, registers callback functions for different query phases. This is because in a recursive resolver, a query has to go through several phases before it gathers enough information to produce a response, for example a cache lookup, or making subrequests to resolve a delegation chain for the domain, so being able to tie into these phases is necessary for the apps to be useful for different use cases. The start function can also run some background tasks to supplement the phase callbacks, and store global state. For example – report metrics, or pre-fetch shared data from external sources, etc. Again, just like how we write a normal program.

But where do the program arguments come from? How could a guest app send log and metrics? The answer is, external functions.

How Rust and Wasm power Cloudflare's 1.1.1.1
Figure 5 Wasm based Sandbox

In figure 5, we can see a barrier in the middle, which is the sandbox boundary, that separates the guest from the host. The only way one side can reach out to the other, is via a set of functions exported by the peer beforehand. As in the picture, the “hostcalls” are exported by the host, imported and called by the guest; while the “trampoline” are guest functions that the host has knowledge of.

It is called trampoline because it is used to invoke a function or a closure inside a guest instance that’s not exported. The phase callbacks are one example of why we need a trampoline function: each callback returns a closure, and therefore can’t be exported on instantiation. So a guest app wants to register a callback, it calls a host call with the callback address “hostcall_register_callback(pre_cache, #30987)”, when the callback needs to be invoked, the host cannot just call that pointer as it’s pointing to the guest’s memory space. What it can do instead is, to leverage one of the aforementioned trampolines, and give it the address of the callback closure: “trampoline_call(#30987)”.

Isolation overhead
Like a coin that has two sides, the new sandbox does come with some additional overhead. The portability and isolation that WebAssembly offers bring extra cost. Here, we’ll list two examples.

Firstly, guest apps are not allowed to read host memory. The way it works is the guest provides a memory region via a host call, then the host writes the data into the guest memory space. This introduces a memory copy that would not be needed if we were outside the sandbox. The bad news is, in our use case, the guest apps are supposed to do something on the query and/or the response, so they almost always need to read data from the host on every single request. The good news, on the other hand, is that during a request life cycle, the data won’t change. So we pre-allocate a bulk of memory in the guest memory space right after the guest app instantiates. The allocated memory is not going to be used, but instead serves to occupy a hole in the address space. Once the host gets the address details, it maps a shared memory region with the common data needed by the guest into the guest’s space. When the guest code starts to execute, it can just access the data in the shared memory overlay, and no copy is needed.

Another issue we ran into was when we wanted to add support for a modern protocol, oDoH, into BigPineapple. The main job of it is to decrypt the client query, resolve it, then encrypt the answers before sending it back. By design, this doesn’t belong to core DNS, and should instead be extended with a Wasm app. However, the WebAssembly instruction set doesn’t provide some crypto primitives, such as AES and SHA-2, which prevents it from getting the benefit of host hardware. There is ongoing work to bring this functionality to Wasm with WASI-crypto. Until then, our solution for this is to simply delegate the HPKE to the host via host calls, and we already saw 4x performance improvements, compared to doing it inside Wasm.

Async in Wasm
Remember the problem we talked about before that the callbacks could block the event loop? Essentially, the problem is how to run the sandboxed code asynchronously. Because no matter how complex the request processing callback is, if it can yield, we can put an upper bound on how long it is allowed to block. Luckily, Rust’s async framework is both elegant and lightweight. It gives us the opportunity to use a set of guest calls to implement the “Future”s.

In Rust, a Future is a building block for asynchronous computations. From the user’s perspective, in order to make an asynchronous program, one has to take care of two things: implement a pollable function that drives the state transition, and place a waker as a callback to wake itself up, when the pollable function should be called again due to some external event (e.g. time passes, socket becomes readable, and so on). The former is to be able to progress the program gradually, e.g. read buffered data from I/O and return a new state indicating the status of the task: either finished, or yielded. The latter is useful in case of task yielding, as it will trigger the Future to be polled when the conditions that the task was waiting for are fulfilled, instead of busy looping until it’s complete.

Let’s see how this is implemented in our sandbox. For a scenario when the guest needs to do some I/O, it has to do so via the host calls, as it is inside a restricted environment. Assuming the host provides a set of simplified host calls which mirror the basic socket operations: open, read, write, and close, the guest can have its pseudo poller defined as below:

fn poll(&mut self, wake: fn()) -> Poll {
	match hostcall_socket_read(self.sock, self.buffer) {
    	    HostOk  => Poll::Ready,
    	    HostEof => Poll::Pending,
	}
}

Here the host call reads data from a socket into a buffer, depending on its return value, the function can move itself to one of the states we mentioned above: finished(Ready), or yielded(Pending). The magic happens inside the host call. Remember in figure 5, that it is the only way to access resources? The guest app doesn’t own the socket, but it can acquire a “handle” via “hostcall_socket_open”, which will in turn create a socket on the host side, and return a handle. The handle can be anything in theory, but in practice using integer socket handles map well to file descriptors on the host side, or indices in a vector or slab. By referencing the returned handle, the guest app is able to remotely control the real socket. As the host side is fully asynchronous, it can simply relay the socket state to the guest. If you noticed that the waker function isn’t used above, well done! That’s because when the host call is called, it not only starts opening a socket, but also registers the current waker to be called then the socket is opened (or fails to do so). So when the socket becomes ready, the host task will be woken up, it will find the corresponding guest task from its context, and wakes it up using the trampoline function as shown in figure 5. There are other cases where a guest task needs to wait for another guest task, an async mutex for example. The mechanism here is similar: using host calls to register wakers.

All of these complicated things are encapsulated in our guest async runtime, with easy to use API, so the guest apps get access to regular async functions without having to worry about the underlying details.

(Not) The End

Hopefully, this blog post gave you a general idea of the innovative platform that powers 1.1.1.1. It is still evolving. As of today, several of our products, such as 1.1.1.1 for Families, AS112, and Gateway DNS, are supported by guest apps running on BigPineapple. We are looking forward to bringing new technologies into it. If you have any ideas, please let us know in the community or via email.

One of our most requested features is here: DNS record comments and tags

Post Syndicated from Hannes Gerhart original https://blog.cloudflare.com/dns-record-comments/

One of our most requested features is here: DNS record comments and tags

One of our most requested features is here: DNS record comments and tags

Starting today, we’re adding support on all zone plans to add custom comments on your DNS records. Users on the Pro, Business and Enterprise plan will also be able to tag DNS records.

DNS records are important

DNS records play an essential role when it comes to operating a website or a web application. In general, they are used to mapping human-readable hostnames to machine-readable information, most commonly IP addresses. Besides mapping hostnames to IP addresses they also fulfill many other use cases like:

  • Ensuring emails can reach your inbox, by setting up MX records.
  • Avoiding email spoofing and phishing by configuring SPF, DMARC and DKIM policies as TXT records.
  • Validating a TLS certificate by adding a TXT (or CNAME) record.
  • Specifying allowed certificate authorities that can issue certificates on behalf of your domain by creating a CAA record.
  • Validating ownership of your domain for other web services (website hosting, email hosting, web storage, etc.) – usually by creating a TXT record.
  • And many more.

With all these different use cases, it is easy to forget what a particular DNS record is for and it is not always possible to derive the purpose from the name, type and content of a record. Validation TXT records tend to be on seemingly arbitrary names with rather cryptic content. When you then also throw multiple people or teams into the mix who have access to the same domain, all creating and updating DNS records, it can quickly happen that someone modifies or even deletes a record causing the on-call person to get paged in the middle of the night.

Enter: DNS record comments & tags 📝

Starting today, everyone with a zone on Cloudflare can add custom comments on each of their DNS records via the API and through the Cloudflare dashboard.

One of our most requested features is here: DNS record comments and tags

To add a comment, just click on the Edit action of the respective DNS record and fill out the Comment field. Once you hit Save, a small icon will appear next to the record name to remind you that this record has a comment. Hovering over the icon will allow you to take a quick glance at it without having to open the edit panel.

One of our most requested features is here: DNS record comments and tags

What you also can see in the screenshot above is the new Tags field. All users on the Pro, Business, or Enterprise plans now have the option to add custom tags to their records. These tags can be just a key like “important” or a key-value pair like “team:DNS” which is separated by a colon. Neither comments nor tags have any impact on the resolution or propagation of the particular DNS record, and they’re only visible to people with access to the zone.

Now we know that some of our users love automation by using our API. So if you want to create a number of zones and populate all their DNS records by uploading a zone file as part of your script, you can also directly include the DNS record comments and tags in that zone file. And when you export a zone file, either to back up all records of your zone or to easily move your zone to another account on Cloudflare, it will also contain comments and tags. Learn more about importing and exporting comments and tags on our developer documentation.

;; A Records
*.mycoolwebpage.xyz.     1      IN  A    192.0.2.3
mycoolwebpage.xyz.       1      IN  A    203.0.113.1 ; Contact Hannes for details.
sub1.mycoolwebpage.xyz.  1      IN  A    192.0.2.2 ; Test origin server. Can be deleted eventually. cf_tags=testing
sub1.mycoolwebpage.xyz.  1      IN  A    192.0.2.1 ; Production origin server. cf_tags=important,prod,team:DNS

;; MX Records
mycoolwebpage.xyz.       1      IN  MX   1 mailserver1.example.
mycoolwebpage.xyz.       1      IN  MX   2 mailserver2.example.

;; TXT Records
mycoolwebpage.xyz.       86400	IN  TXT  "v=spf1 ip4:192.0.2.0/24 -all" ; cf_tags=important,team:EMAIL
sub1.mycoolwebpage.xyz.  86400  IN  TXT  "hBeFxN3qZT40" ; Verification record for service XYZ. cf_tags=team:API

New filters

It might be that your zone has hundreds or thousands of DNS records, so how on earth would you find all the records that belong to the same team or that are needed for one particular application?

For this we created a new filter option in the dashboard. This allows you to not only filter for comments or tags but also for other record data like name, type, content, or proxy status. The general search bar for a quick and broader search will still be available, but it cannot (yet) be used in conjunction with the new filters.

One of our most requested features is here: DNS record comments and tags

By clicking on the “Add filter” button, you can select individual filters that are connected with a logical AND. So if I wanted to only look at TXT records that are tagged as important, I would add these filters:

One more thing (or two)

Another change we made is to replace the Advanced button with two individual actions: Import and Export, and Dashboard Display Settings.

You can find them in the top right corner under DNS management. When you click on Import and Export you have the option to either export all existing DNS records (including their comments and tags) into a zone file or import new DNS records to your zone by uploading a zone file.

The action Dashboard Display Settings allows you to select which special record types are shown in the UI. And there is an option to toggle showing the record tags inline under the respective DNS record or just showing an icon if there are tags present on the record.

And last but not least, we increased the width of the DNS record table as part of this release. The new table makes better use of the existing horizontal space and allows you to see more details of your DNS records, especially if you have longer subdomain names or content.

Try it now

DNS record comments and tags are available today. Just navigate to the DNS tab of your zone in the Cloudflare dashboard and create your first comment or tag. If you are not yet using Cloudflare DNS, sign up for free in just a few minutes.

Learn more about DNS record comments and tags on our developer documentation.

Cloudflare is joining the AS112 project to help the Internet deal with misdirected DNS queries

Post Syndicated from Hunts Chen original https://blog.cloudflare.com/the-as112-project/

Cloudflare is joining the AS112 project to help the Internet deal with misdirected DNS queries

Cloudflare is joining the AS112 project to help the Internet deal with misdirected DNS queries

Today, we’re excited to announce that Cloudflare is participating in the AS112 project, becoming an operator of this community-operated, loosely-coordinated anycast deployment of DNS servers that primarily answer reverse DNS lookup queries that are misdirected and create significant, unwanted load on the Internet.

With the addition of Cloudflare global network, we can make huge improvements to the stability, reliability and performance of this distributed public service.

What is AS112 project

The AS112 project is a community effort to run an important network service intended to handle reverse DNS lookup queries for private-only use addresses that should never appear in the public DNS system. In the seven days leading up to publication of this blog post, for example, Cloudflare’s 1.1.1.1 resolver received more than 98 billion of these queries — all of which have no useful answer in the Domain Name System.

Some history is useful for context. Internet Protocol (IP) addresses are essential to network communication. Many networks make use of IPv4 addresses that are reserved for private use, and devices in the network are able to connect to the Internet with the use of network address translation (NAT), a process that maps one or more local private addresses to one or more global IP addresses and vice versa before transferring the information.

Your home Internet router most likely does this for you. You will likely find that, when at home, your computer has an IP address like 192.168.1.42. That’s an example of a private use address that is fine to use at home, but can’t be used on the public Internet. Your home router translates it, through NAT, to an address your ISP assigned to your home and that can be used on the Internet.

Here are the reserved “private use” addresses designated in RFC 1918.

Address block Address range Number of addresses
10.0.0.0/8 10.0.0.0 – 10.255.255.255 16,777,216
172.16.0.0/12 172.16.0.0 – 172.31.255.255 1,048,576
192.168.0.0/16 192.168.0.0 – 192.168.255.255 65,536

(Reserved private IPv4 network ranges)

Although the reserved addresses themselves are blocked from ever appearing on the public Internet, devices and programs in private environments may occasionally originate DNS queries corresponding to those addresses. These are called “reverse lookups” because they ask the DNS if there is a name associated with an address.

Reverse DNS lookup

A reverse DNS lookup is an opposite process of the more commonly used DNS lookup (which is used every day to translate a name like www.cloudflare.com to its corresponding IP address). It is a query to look up the domain name associated with a given IP address, in particular those addresses associated with routers and switches. For example, network administrators and researchers use reverse lookups to help understand paths being taken by data packets in the network, and it’s much easier to understand meaningful names than a meaningless number.

A reverse lookup is accomplished by querying DNS servers for a pointer record (PTR). PTR records store IP addresses with their segments reversed, and by appending “.in-addr.arpa” to the end. For example, the IP address 192.0.2.1 will have the PTR record stored as 1.2.0.192.in-addr.arpa. In IPv6, PTR records are stored within the “.ip6.arpa” domain instead of “.in-addr.arpa.”. Below are some query examples using the dig command line tool.

# Lookup the domain name associated with IPv4 address 172.64.35.46
# “+short” option make it output the short form of answers only
$ dig @1.1.1.1 PTR 46.35.64.172.in-addr.arpa +short
hunts.ns.cloudflare.com.

# Or use the shortcut “-x” for reverse lookups
$ dig @1.1.1.1 -x 172.64.35.46 +short
hunts.ns.cloudflare.com.

# Lookup the domain name associated with IPv6 address 2606:4700:58::a29f:2c2e
$ dig @1.1.1.1 PTR e.2.c.2.f.9.2.a.0.0.0.0.0.0.0.0.0.0.0.0.8.5.0.0.0.0.7.4.6.0.6.2.ip6.arpa. +short
hunts.ns.cloudflare.com.

# Or use the shortcut “-x” for reverse lookups
$ dig @1.1.1.1 -x 2606:4700:58::a29f:2c2e +short  
hunts.ns.cloudflare.com.

The problem that private use addresses cause for DNS

The private use addresses concerned have only local significance and cannot be resolved by the public DNS. In other words, there is no way for the public DNS to provide a useful answer to a question that has no global meaning. It is therefore a good practice for network administrators to ensure that queries for private use addresses are answered locally. However, it is not uncommon for such queries to follow the normal delegation path in the public DNS instead of being answered within the network. That creates unnecessary load.

By definition of being private use, they have no ownership in the public sphere, so there are no authoritative DNS servers to answer the queries. At the very beginning, root servers respond to all these types of queries since they serve the IN-ADDR.ARPA zone.

Over time, due to the wide deployment of private use addresses and the continuing growth of the Internet, traffic on the IN-ADDR.ARPA DNS infrastructure grew and the load due to these junk queries started to cause some concern. Therefore, the idea of offloading IN-ADDR.ARPA queries related to private use addresses was proposed. Following that, the use of anycast for distributing authoritative DNS service for that idea was subsequently proposed at a private meeting of root server operators. And eventually the AS112 service was launched to provide an alternative target for the junk.

The AS112 project is born

To deal with this problem, the Internet community set up special DNS servers called “blackhole servers” as the authoritative name servers that respond to the reverse lookup of the private use address blocks 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16 and the link-local address block 169.254.0.0/16 (which also has only local significance). Since the relevant zones are directly delegated to the blackhole servers, this approach has come to be known as Direct Delegation.

The first two blackhole servers set up by the project are: blackhole-1.iana.org and blackhole-2.iana.org.

Any server, including DNS name server, needs an IP address to be reachable. The IP address must also be associated with an Autonomous System Number (ASN) so that networks can recognize other networks and route data packets to the IP address destination. To solve this problem, a new authoritative DNS service would be created but, to make it work, the community would have to designate IP addresses for the servers and, to facilitate their availability, an AS number that network operators could use to reach (or provide) the new service.

The selected AS number (provided by American Registry for Internet Numbers) and namesake of the project, was 112. It was started by a small subset of root server operators, later grown to a group of volunteer name server operators that include many other organizations. They run anycasted instances of the blackhole servers that, together, form a distributed sink for the reverse DNS lookups for private network and link-local addresses sent to the public Internet.

A reverse DNS lookup for a private use address would see responses like in the example below, where the name server blackhole-1.iana.org is authoritative for it and says the name does not exist, represented in DNS responses by NXDOMAIN.

$ dig @blackhole-1.iana.org -x 192.168.1.1 +nord

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 23870
;; flags: qr aa; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;1.1.168.192.in-addr.arpa.	IN	PTR

;; AUTHORITY SECTION:
168.192.in-addr.arpa.	10800	IN	SOA	168.192.in-addr.arpa. nobody.localhost. 42 86400 43200 604800 10800

At the beginning of the project, node operators set up the service in the direct delegation fashion (RFC 7534). However, adding delegations to this service requires all AS112 servers to be updated, which is difficult to ensure in a system that is only loosely-coordinated. An alternative approach using DNAME redirection was subsequently introduced by RFC 7535 to allow new zones to be added to the system without reconfiguring the blackhole servers.

Direct delegation

DNS zones are directly delegated to the blackhole servers in this approach.

RFC 7534 defines the static set of reverse lookup zones for which AS112 name servers should answer authoritatively. They are as follows:

  • 10.in-addr-arpa
  • 16.172.in-addr.arpa
  • 17.172.in-addr.arpa
  • 18.172.in-addr.arpa
  • 19.172.in-addr.arpa
  • 20.172.in-addr.arpa
  • 21.172.in-addr.arpa
  • 22.172.in-addr.arpa
  • 23.172.in-addr.arpa
  • 24.172.in-addr.arpa
  • 25.172.in-addr.arpa
  • 26.172.in-addr.arpa
  • 27.172.in-addr.arpa
  • 28.172.in-addr.arpa
  • 29.172.in-addr.arpa
  • 30.172.in-addr.arpa
  • 31.172.in-addr.arpa
  • 168.192.in-addr.arpa
  • 254.169.in-addr.arpa (corresponding to the IPv4 link-local address block)

Zone files for these zones are quite simple because essentially they are empty apart from the required  SOA and NS records. A template of the zone file is defined as:

  ; db.dd-empty
   ;
   ; Empty zone for direct delegation AS112 service.
   ;
   $TTL    1W
   @  IN  SOA  prisoner.iana.org. hostmaster.root-servers.org. (
                                  1         ; serial number
                                  1W      ; refresh
                                  1M      ; retry
                                  1W      ; expire
                                  1W )    ; negative caching TTL
   ;
          NS     blackhole-1.iana.org.
          NS     blackhole-2.iana.org.

IP addresses of the direct delegation name servers are covered by the single IPv4 prefix 192.175.48.0/24 and the IPv6 prefix 2620:4f:8000::/48.

Name server IPv4 address IPv6 address
blackhole-1.iana.org 192.175.48.6 2620:4f:8000::6
blackhole-2.iana.org 192.175.48.42 2620:4f:8000::42

DNAME redirection

Firstly, what is DNAME? Introduced by RFC 6672, a DNAME record or Delegation Name Record creates an alias for an entire subtree of the domain name tree. In contrast, the CNAME record creates an alias for a single name and not its subdomains. For a received DNS query, the DNAME record instructs the name server to substitute all those appearing in the left hand (owner name) with the right hand (alias name). The substituted query name, like the CNAME, may live within the zone or may live outside the zone.

Like the CNAME record, the DNS lookup will continue by retrying the lookup with the substituted name. For example, if there are two DNS zone as follows:

# zone: example.com
www.example.com.	A		203.0.113.1
foo.example.com.	DNAME	example.net.

# zone: example.net
example.net.		A		203.0.113.2
bar.example.net.	A		203.0.113.3

The query resolution scenarios would look like this:

Query (Type + Name) Substitution Final result
A www.example.com (no DNAME, don’t apply) 203.0.113.1
DNAME foo.example.com (don’t apply to the owner name itself) example.net
A foo.example.com (don’t apply to the owner name itself) <NXDOMAIN>
A bar.foo.example.com bar.example.net 203.0.113.2

RFC 7535 specifies adding another special zone, empty.as112.arpa, to support DNAME redirection for AS112 nodes. When there are new zones to be added, there is no need for AS112 node operators to update their configuration: instead, the zones’ parents will set up DNAME records for the new domains with the target domain empty.as112.arpa. The redirection (which can be cached and reused) causes clients to send future queries to the blackhole server that is authoritative for the target zone.

Note that blackhole servers do not have to support DNAME records themselves, but they do need to configure the new zone to which root servers will redirect queries at. Considering there may be existing node operators that do not update their name server configuration for some reasons and in order to not cause interruption to the service, the zone was delegated to a new blackhole server instead – blackhole.as112.arpa.

This name server uses a new pair of IPv4 and IPv6 addresses, 192.31.196.1 and 2001:4:112::1, so queries involving DNAME redirection will only land on those nodes operated by entities that also set up the new name server. Since it is not necessary for all AS112 participants to reconfigure their servers to serve empty.as112.arpa from this new server for this system to work, it is compatible with the loose coordination of the system as a whole.

The zone file for empty.as112.arpa is defined as:

   ; db.dr-empty
   ;
   ; Empty zone for DNAME redirection AS112 service.
   ;
   $TTL    1W
   @  IN  SOA  blackhole.as112.arpa. noc.dns.icann.org. (
                                  1         ; serial number
                                  1W      ; refresh
                                  1M      ; retry
                                  1W      ; expire
                                  1W )    ; negative caching TTL
   ;
          NS     blackhole.as112.arpa.

The addresses of the new DNAME redirection name server are covered by the single IPv4 prefix 192.31.196.0/24 and the IPv6 prefix 2001:4:112::/48.

Name server IPv4 address IPv6 address
blackhole.as112.arpa 192.31.196.1 2001:4:112::1

Node identification

RFC 7534 recommends every AS112 node also to host the following metadata zones as well: hostname.as112.net and hostname.as112.arpa.

These zones only host TXT records and serve as identifiers for querying metadata information about an AS112 node. At Cloudflare nodes, the zone files look like this:

$ORIGIN hostname.as112.net.
;
$TTL    604800
;
@       IN  SOA     ns3.cloudflare.com. dns.cloudflare.com. (
                       1                ; serial number
                       604800           ; refresh
                       60               ; retry
                       604800           ; expire
                       604800 )         ; negative caching TTL
;
            NS      blackhole-1.iana.org.
            NS      blackhole-2.iana.org.
;
            TXT     "Cloudflare DNS, <DATA_CENTER_AIRPORT_CODE>"
            TXT     "See http://www.as112.net/ for more information."
;

$ORIGIN hostname.as112.arpa.
;
$TTL    604800
;
@       IN  SOA     ns3.cloudflare.com. dns.cloudflare.com. (
                       1                ; serial number
                       604800           ; refresh
                       60               ; retry
                       604800           ; expire
                       604800 )         ; negative caching TTL
;
            NS      blackhole.as112.arpa.
;
            TXT     "Cloudflare DNS, <DATA_CENTER_AIRPORT_CODE>"
            TXT     "See http://www.as112.net/ for more information."
;

Helping AS112 helps the Internet

As the AS112 project helps reduce the load on public DNS infrastructure, it plays a vital role in maintaining the stability and efficiency of the Internet. Being a part of this project aligns with Cloudflare’s mission to help build a better Internet.

Cloudflare is one of the fastest global anycast networks on the planet, and operates one of the largest, highly performant and reliable DNS services. We run authoritative DNS for millions of Internet properties globally. We also operate the privacy- and performance-focused public DNS resolver 1.1.1.1 service. Given our network presence and scale of operations, we believe we can make a meaningful contribution to the AS112 project.

How we built it

We’ve publicly talked about the Cloudflare in-house built authoritative DNS server software, rrDNS, several times in the past, but haven’t talked much about the software we built to power the Cloudflare public resolver – 1.1.1.1. This is an opportunity to shed some light on the technology we used to build 1.1.1.1, because this AS112 service is built on top of the same platform.

A platform for DNS workloads

Cloudflare is joining the AS112 project to help the Internet deal with misdirected DNS queries

We’ve created a platform to run DNS workloads. Today, it powers 1.1.1.1, 1.1.1.1 for Families, Oblivious DNS over HTTPS (ODoH), Cloudflare WARP and Cloudflare Gateway.

The core part of the platform is a non-traditional DNS server, which has a built-in DNS recursive resolver and a forwarder to forward queries to other servers. It consists of four key modules:

  1. A highly efficient listener module that accepts connections for incoming requests.
  2. A query router module that decides how a query should be resolved.
  3. A conductor module that figures out the best way of exchanging DNS messages with upstream servers.
  4. A sandbox environment to host guest applications.

The DNS server itself doesn’t include any business logic, instead the guest applications run in the sandbox environment can implement concrete business logic such as request filtering, query processing, logging, attack mitigation, cache purging, etc.

The server is written in Rust and the sandbox environment is built on top of a WebAssembly runtime. The combination of Rust and WebAssembly allow us to implement high efficient connection handling, request filtering and query dispatching modules, while having the flexibility of implementing custom business logic in a safe and efficient manner.

The host exposes a set of APIs, called hostcalls, for the guest applications to accomplish a variety of tasks. You can think of them like syscalls on Linux. Here are few examples functions provided by the hostcalls:

  • Obtain the current UNIX timestamp
  • Lookup geolocation data of IP addresses
  • Spawn async tasks
  • Create local sockets
  • Forward DNS queries to designated servers
  • Register callback functions of the sandbox hooks
  • Read current request information, and write responses
  • Emit application logs, metric data points and tracing spans/events

The DNS request lifecycle is broken down into phases. A request phase is a point in processing at which sandboxed apps can be called to change the course of request resolution. And each guest application can register callbacks for each phase.

Cloudflare is joining the AS112 project to help the Internet deal with misdirected DNS queries

AS112 guest application

The AS112 service is built as a guest application written in Rust and compiled to WebAssembly. The zones listed in RFC 7534 and RFC 7535 are loaded as static zones in memory and indexed as a tree data structure. Incoming queries are answered locally by looking up entries in the zone tree.

A router setting in the app manifest is added to tell the host what kind of DNS queries should be processed by the guest application, and a fallback_action setting is added to declare the expected fallback behavior.

# Declare what kind of queries the app handles.
router = [
    # The app is responsible for all the AS112 IP prefixes.
    "dst in { 192.31.196.0/24 192.175.48.0/24 2001:4:112::/48 2620:4f:8000::/48 }",
]

# If the app fails to handle the query, servfail should be returned.
fallback_action = "fail"

The guest application, along with its manifest, is then compiled and deployed through a deployment pipeline that leverages Quicksilver to store and replicate the assets worldwide.

The guest application is now up and running, but how does the DNS query traffic destined to the new IP prefixes reach the DNS server? Do we have to restart the DNS server every time we add a new guest application? Of course there is no need. We use software we developed and deployed earlier, called Tubular. It allows us to change the addresses of a service on the fly. With the help of Tubular, incoming packets destined to the AS112 service IP prefixes are dispatched to the right DNS server process without the need to make any change or release of the DNS server itself.

Meanwhile, in order to make the misdirected DNS queries land on the Cloudflare network in the first place, we use BYOIP (Bringing Your Own IPs to Cloudflare), a Cloudflare product that can announce customer’s own IP prefixes in all our locations. The four AS112 IP prefixes are boarded onto the BYOIP system, and will be announced by it globally.

Testing

How can we ensure the service we set up does the right thing before we announce it to the public Internet? 1.1.1.1 processes more than 13 billion of these misdirected queries every day, and it has logic in place to directly return NXDOMAIN for them locally, which is a recommended practice per RFC 7534.

However, we are able to use a dynamic rule to change how the misdirected queries are handled in Cloudflare testing locations. For example, a rule like following:

phase = post-cache and qtype in { PTR } and colo in { test1 test2 } and qname-suffix in { 10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.arpa 18.172.in-addr.arpa 19.172.in-addr.arpa 20.172.in-addr.arpa 21.172.in-addr.arpa 22.172.in-addr.arpa 23.172.in-addr.arpa 24.172.in-addr.arpa 25.172.in-addr.arpa 26.172.in-addr.arpa 27.172.in-addr.arpa 28.172.in-addr.arpa 29.172.in-addr.arpa 30.172.in-addr.arpa 31.172.in-addr.arpa 168.192.in-addr.arpa 254.169.in-addr.arpa } forward 192.175.48.6:53

The rule instructs that in data center test1 and test2, when the DNS query type is PTR, and the query name ends with those in the list, forward the query to server 192.175.48.6 (one of the AS112 service IPs) on port 53.

Because we’ve provisioned the AS112 IP prefixes in the same node, the new AS112 service will receive the queries and respond to the resolver.

It’s worth mentioning that the above-mentioned dynamic rule that intercepts a query at the post-cache phase, and changes how the query gets processed, is executed by a guest application too, which is named override. This app loads all dynamic rules, parses the DSL texts and registers callback functions at phases declared by each rule. And when an incoming query matches the expressions, it executes the designated actions.

Public reports

We collect the following metrics to generate the public statistics that an AS112 operator is expected to share to the operator community:

  • Number of queries by query type
  • Number of queries by response code
  • Number of queries by protocol
  • Number of queries by IP versions
  • Number of queries with EDNS support
  • Number of queries with DNSSEC support
  • Number of queries by ASN/Data center

We’ll serve the public statistics page on the Cloudflare Radar website. We are still working on implementing the required backend API and frontend of the page – we’ll share the link to this page once it is available.

What’s next?

We are going to announce the AS112 prefixes starting December 15, 2022.

After the service is launched, you can run a dig command to check if you are hitting an AS112 node operated by Cloudflare, like:

$ dig @blackhole-1.iana.org TXT hostname.as112.arpa +short

"Cloudflare DNS, SFO"
"See http://www.as112.net/ for more information."

How to control non-HTTP and non-HTTPS traffic to a DNS domain with AWS Network Firewall and AWS Lambda

Post Syndicated from Tyler Applebaum original https://aws.amazon.com/blogs/security/how-to-control-non-http-and-non-https-traffic-to-a-dns-domain-with-aws-network-firewall-and-aws-lambda/

Security and network administrators can control outbound access from a virtual private cloud (VPC) to specific destinations by using a service like AWS Network Firewall. You can use stateful rule groups to control outbound access to domains for HTTP and HTTPS by default in Network Firewall. In this post, we’ll walk you through how to accomplish this access control for non-HTTP and non-HTTPS traffic, such as SSH (Secure Shell). This solution is extensible to other protocols with static port assignments.

In the example scenario in this post, the network administrator needs to permit outbound SSH access on port 22/tcp to a third-party domain, example.org, from a group of Amazon Elastic Compute Cloud (Amazon EC2) instances that sits inside of a protected VPC that restricts outbound SSH traffic with Network Firewall. Non-HTTP traffic can’t currently be controlled with a domain rule in Network Firewall.

This solution allows administrators to control outbound access to a given domain in a granular way, by resolving the domain name inside of an AWS Lambda function, and updating a Network Firewall rule variable with the results of the DNS query. This solution further restricts specific non-HTTP and non-HTTPS traffic to those allowed domains to only what is explicitly specified by the administrator.

Solution overview

Figure 1 provides an overview of the solution and the resulting traffic flow.

Figure 1: Overview of the solution and the resulting traffic flow

Figure 1: Overview of the solution and the resulting traffic flow

The solution workflow is as follows:

  1. An Amazon EventBridge rule invokes the Lambda function every 10 minutes. You can modify this frequency to meet your needs. You should consider the time-to-live (TTL) record of the DNS record that you are configuring when choosing this interval.
  2. The Lambda function performs the DNS lookup for the provided domain, and updates a variable in an existing Network Firewall rule group. The rule group changes take a few seconds to fully apply to the nodes in your Network Firewall deployment.
  3. The newly created Network Firewall rule group is associated with the Network Firewall policy to control traffic.
  4. Traffic from the instances in your VPC flows through the Network Firewall endpoint, and if allowed, is routed through an internet gateway to the target server.

Prerequisites

This solution has the following prerequisites:

  1. An AWS account. If you don’t have an AWS account, create and activate one.
  2. An existing VPC with default routing to an internet gateway through a network firewall that has a firewall policy attached to it. The example rule included in the solution’s AWS CloudFormation template expects the firewall policy to use the default action order for stateful rule groups. If you don’t have an existing network firewall associated with your VPC, see the AWS Network Firewall Developer Guide to get started. For a walkthrough of the Network Firewall configuration and rules engine, see the blog post Hands-on walkthrough of the AWS Network Firewall flexible rules engine – Part 1.
  3. A DNS domain that you provide, which allows traffic for the protocol and port (or ports) that you plan to allow traffic to. This DNS domain needs to resolve to an IPv4 address or set of addresses; IPv6 is not supported, at this point.

Deploy the solution

We’ve provided a CloudFormation template to deploy this solution, which is located in the GitHub repository that accompanies this blog post.

To deploy the solution

  1. Download the CloudFormation template from our GitHub repository.
  2. Sign in to your AWS account and select the AWS Region where your Network Firewall is deployed.
  3. Navigate to the CloudFormation service.
  4. Choose Stacks > Create Stack > With new resources (standard).
  5. In the Specify template section, choose Upload a template file.
  6. Choose Choose file, navigate to where you saved the CloudFormation template, and upload it. Then choose Next.
  7. Specify a stack name for your CloudFormation stack.
  8. In the Parameters section, for the Domain parameter, specify the name of the domain to which you will control access. The default value is set to example.org; however, note that the actual example.org doesn’t allow SSH traffic.
  9. The remaining parameters have defaults to allow outbound SSH traffic to the specified domain. Adjust the LambdaJobFrequency variable so that it corresponds with the TTL of the DNS record that it will resolve. This allows the Lambda function to keep the IP address of the DNS record up to date, in the event that it changes. After you’ve configured the parameters, choose Next.
    Figure 2: CloudFormation stack parameters

    Figure 2: CloudFormation stack parameters

  10. On the Configure stack options page, specify any further options needed or keep the default options, and then choose Next.
  11. On the Review page, review the stack and parameters and select the check box to acknowledge that this template will create IAM resources. Choose Create Stack.
  12. Check the stack creation status. Upon successful completion, the status shows CREATE_COMPLETE.
    Figure 3: The successful creation of the CloudFormation stack

    Figure 3: The successful creation of the CloudFormation stack

Test the solution

Before you test the newly created rule, make sure that the Lambda function has been invoked at least once from the EventBridge rule.

To verify the Lambda function results

  1. In the AWS Management Console, navigate to the Lambda function Network-Firewall-Resolver-Function, and on the Monitor tab, choose View logs in CloudWatch.
    Figure 4: Navigating to view logs in CloudWatch

    Figure 4: Navigating to view logs in CloudWatch

  2. Select the most recent log stream.
  3. Verify that that a log line contains the entry StatefulRuleGroup updated successfully.
    Figure 5: Examining the CloudWatch logs to verify that the Lambda function ran successfully

    Figure 5: Examining the CloudWatch logs to verify that the Lambda function ran successfully

  4. Associate the stateful rule group that was created by the stack, Lambda-Managed-Stateful-Rule with the existing Network Firewall policy that is attached to your VPC. To do this:
    1. Navigate to VPC > Network Firewall > Firewall Policies and select your existing firewall policy.
    2. In the Stateful rule groups section, for Actions, choose Add unmanaged stateful rule groups.
  5. Select the check box for Lambda-Managed-Stateful-Rule, and then choose Add stateful rule group.
  6. When the newly provisioned Lambda function runs successfully, it will resolve the IPv4 address for the domain (example.org) and associate the address with the stateful rule variable IP_NET. To validate that this has happened, do the following:
    1. Navigate to VPC > Network Firewall > Network Firewall rule groups.
    2. Choose the Lambda-Managed-Stateful-Rule rule group.
    3. Navigate to the rule variable section, and choose IP_NET. If the Lambda function successfully resolved the provided domain name, the variable will contain the IPv4 addresses for the domain you provided, as shown in Figure 6.
      Figure 6: Validating the rule variable details

      Figure 6: Validating the rule variable details

  7. Test the rule by attempting to connect to the domain that you specified in the CloudFormation template. Use an EC2 instance within the VPC that the network firewall rule is associated with, and attempt to establish an SSH connection to the domain that you specified. As shown by the SSH key negotiation in Figure 7, traffic is allowed through the network firewall, as intended.
    Figure 7: SSH connectivity to the domain was successful

    Figure 7: SSH connectivity to the domain was successful

    You can also configure the rule to drop the SSH connection, rather than permit it. To do this:

    1. Navigate to VPC > Network Firewall > Network Firewall rule groups.
    2. Choose the Lambda-Managed-Stateful-Rule rule group. In the Rules section, choose Edit Rules.
    3. Modify the rule to take the Drop action, and save the rule group.

    As shown by the lack of response from the host in Figure 8, the SSH connection cannot be established anymore.

    Figure 8: An SSH connection cannot be established, due to the connection timing out

    Figure 8: An SSH connection cannot be established, due to the connection timing out

Cleanup

Follow the steps in this section to remove the resources created by this solution.

To remove the resources

  1. Sign in to your AWS account where you deployed the CloudFormation stack and navigate to the Network Firewall console.
  2. In the Stateful rule groups section, select the check box for Lambda-Managed-Stateful-Rule. For Actions, choose Disassociate from policy.
    Figure 9: Disassociating the stateful rule from the existing policy

    Figure 9: Disassociating the stateful rule from the existing policy

  3. Navigate to the CloudFormation console, select the stack that you created, and then choose Delete. Upon successful deletion, the resources created by the stack will be deleted.

Conclusion

In this post, we’ve demonstrated how security and network administrators have the ability to permit or restrict non-HTTP and non-HTTPS traffic to a given domain by using Network Firewall. With this solution, administrators can enforce granular port- and protocol-level control to third-party domains. To learn more about rule group configuration in AWS Network Firewall, see Managing your own rule groups in the Developer Guide.

 
If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support. You can also start a new thread on AWS Network Firewall re:Post to get answers from the community.

Want more AWS Security news? Follow us on Twitter.

Tyler Applebaum

Tyler Applebaum

Tyler is a Sr. Solutions Architect in the Charlotte, NC area helping customers migrate to AWS and modernize their applications. He has previous experience as a network engineer working in healthcare and finance.

Bhavin Lakhani

Bhavin is a Technical Account Manager in the US West, Northern California, helping customers with their AWS Enterprise Support, Operational, and Architectural needs. He has cloud infrastructure and database platform leadership experience in his previous roles. He loves soccer and is an ardent “Arsenal” fan.

Goodbye, Alexa. Hello, Cloudflare Radar Domain Rankings

Post Syndicated from Celso Martinho original https://blog.cloudflare.com/radar-domain-rankings/

Goodbye, Alexa. Hello, Cloudflare Radar Domain Rankings

Goodbye, Alexa. Hello, Cloudflare Radar Domain Rankings

The Internet is a living organism. Technology changes, shifts in human behavior, social events, intentional disruptions, and other occurrences change the Internet in unpredictable ways, even to the trained eye.

Cloudflare Radar has long been the place to visit for accessing data and getting unique insights into how people and organizations are using the Internet across the globe, as well as those unpredictable changes to the Internet.

One of the most popular features on Radar has always been the “Most Popular Domains,” with both global and country-level perspectives. Domain usage signals provide a proxy for user behavior over time and are a good representation of what people are doing on the Internet.

Today, we’re going one step further and launching a new dataset called Radar Domain Rankings (Beta). Domain Rankings is based on aggregated 1.1.1.1 resolver data that is anonymized in accordance with our privacy commitments. The dataset aims to identify the top most popular domains based on how people use the Internet globally, without tracking individuals’ Internet use.

There are a few reasons why we’re doing this now. One is obviously to improve our Radar features with better data and incorporate new learnings. But also, ranking lists are used all over the Internet in all sorts of systems. One of the most used and trusted sources of domain rankings was Alexa, but that service was recently deprecated. We believe we are in a good position to provide a strong alternative.

Let’s see how we built it.

Differences in domain names

Before we dig into the data science behind Domain Rankings, it’s important to understand what a domain and DNS are. Internet domain names are human-readable dot-separated letters, digits and hyphens that correspond to a network resource, like a server or a website. However, your computer and applications don’t know what to do with a domain name; they need IP addresses to send and receive information over the network. DNS is the system that converts, or resolves, a domain name into an IP address. Think of it as an Internet phonebook for domain names.

Note: This is a simplification. A new standard called Internationalized Domain Names, or IDN, allows using Unicode strings in domain names.

Each dot defines a new hierarchy level, reading right to left. Domains can have multiple levels of depth. The highest level corresponds to country code top-level domains (ccTLDs) like .uk, .fr or .pt, or generic top-level domains (gTLDs) like .com, .org, or .net. These are normally assigned to and managed by either country-level entities or administrative organizations operating a registry.

Then there are the second-level domains like cloudflare.com or google.com. These are normally purchased and registered by individuals or organizations, which are then free to create and manage as many hostnames and hierarchy levels as they want.

Unfortunately, however, there are exceptions. For instance, many countries use second-level domain registration. One such example is the United Kingdom, where commercial domains can only be registered under the .co.uk hierarchy. That’s why Google in the UK isn’t google.uk, but rather google.co.uk.

But that’s not all. Some countries use 3rd level domain registrations. One example is Japan, which offers Regional Domain registration under cities like *.aisai.aichi.jp.

Projects like the Public Suffix List are a good starting point for understanding the variations involved, and how they affect validations and assumptions in other systems, such as cookies in web browsers.

Domain Rankings takes some of this nuance into account to inform the definition of our current ruleset:

  • We boil everything down to second-level domains, such as cloudflare.com or google.com.
  • However, if the second level is .edu, .com, .org, .gov, .net, .gov, .net, .co or .mil, then we use third-level domains.
  • We don’t distinguish between what we think is a website or an infrastructure system. A domain represents an Internet-available resource.
  • We will also semi-automate, curate and maintain a list of domains that map to popular platforms and services in the future. Example: fb.audio, fb.com, fb.watch, all map to a “facebook” platform.

Defining popularity

Definitions are important. We established what we consider a domain, but what does domain popularity mean exactly? Our research showed that the volume of traffic generated to a given domain doesn’t really work as a proxy for what we perceive as popular. Instead, Domain Rankings looks at the size of the population of users that look up a domain per unit of time. The more people who are interested in a domain, the more popular it is.

Sounds pretty straightforward, right? Well, it’s not. Our databases don’t have cookies, IPs, or other tracking artifacts, and we strip information that leads to identifying an individual from all of our data, by design.

The good news, however, is that we do a very good job at identifying automated traffic (for instance, you can read about Bot Management and how we use Machine Learning to detect bots in HTTP traffic in our blog) and we found we could develop a reasonable proxy for the unique users metric without sacrificing privacy (using other data points that we store for a limited period of time, like the ASN and high-level geolocation information of the request or the Cloudflare data center that served it).

Domain Rankings’ popularity metric is best described as the estimated relative size of the user population that accesses a domain over some period of time.

Our approach

We announced 1.1.1.1, our privacy-first consumer DNS resolver in 2018, and over the years it’s grown to become one of the top DNS services in the world. 1.1.1.1 is also part of a Research Agreement with APNIC in which we collaborate with them doing public research and DNS data insights.

The data we collect from it honors our privacy commitments, and is aggregated and stripped of any information that could lead to identifying or tracking users. We conducted a privacy examination by a Big Four accounting firm to determine whether the 1.1.1.1 resolver was effectively configured to meet our privacy commitments. You can read more about it in this blog, and the full report is publicly available on our compliance page.

Even without this personally identifying information, the resulting collection is vast and representative of Internet activity.

The 1.1.1.1 service is used in many ways. Regular (human) Internet users use it as their DNS resolver, either because they explicitly configured it in their devices, or their ISP did, or because they use WARP, or their browser uses 1.1.1.1 under the hood. However, servers and cloud infrastructure, IoT devices, home routers, and bots also use 1.1.1.1 extensively, which creates a lot of challenges for us when trying to identify human traffic.

We’ve been using DNS data to calculate the top and trending domains found on both the global and country pages on Cloudflare Radar. It’s been quite a learning experience trying to improve these lists. We have implemented aggregations, counts, filters, handling exceptions, and tried reducing noise, and yet they’re far from perfect. We felt that there had to be a better way.

We’ve spent the last six months building a variety of machine learning models to help us predict the rank of a domain.

Building the model was no easy feat. We experimented with multiple regression types first, to know exactly what the model was doing, and then more complex algorithms to get better performance. We played with different datasets, changed the population groups, variables (features), and combinations of variables, and used synthetic data.

After evaluation, one of our first conclusions was that building a model that could produce good results for the highest ranked domains and the long tail would be difficult.

The paper “A Long Way to the Top: Significance, Structure, and Stability of Internet Top Lists” describes this problem well. “The ranking of domains in the long tail should be based on significantly smaller and hence less reliable numbers.” Talking to our Research Team who submitted the collaboration paper “Toppling Top Lists: Evaluating the Accuracy of Popular Website Lists” to IMC 2022, got us to the same conclusion: the most popular domains (like google.com and facebook.com) have feature values disproportionally higher than the lower-ranked domains.

Therefore, we selected the two models that performed best. One model was trained on the population with the highest feature values, uses more features, and is used to generate the ordered top 100 domain list. A second model was trained on a more general group of domains, uses fewer features, and is used to get the top one million most popular domains, which we then divide into ranking buckets.

These buckets are ranked, but each bucket’s contents are intentionally unordered. For example, the second bucket of 10,000 most popular domains includes the set of domains that rank from 10,001 to 20,000, but give no further indication of the individual ranking of domains in that bucket. Given the size of some of these buckets and the window of time we use to populate them, they will inherently be exposed to more instability, too. We feel this is a good compromise between the described natural uncertainties of our long tail model and providing a reasonable idea of how close to the top a domain is.

Results

It’s important to mention there is no global view that can establish the perfect rank, and there’s no easy mechanism to confirm if a ranking is, ultimately, good. Data-driven results are always subject to some bias and skewing, related to the context of the organizations and systems that collect them. Sometimes all that can be done is to be transparent about potential sources of bias. The geographical distribution of customers and users, product characteristics, platform features, and behavioral diversity play an essential role in the final result. We are presenting the Cloudflare view, what we see.

Having said this, Cloudflare sits in a privileged position and handles a significant amount of Internet traffic. We have plenty of signals we can extract from our aggregated data, and believe that makes it possible to generate high quality domain rankings.

Domain Rankings are available today. You can head up to the Domains page and check it out:

  • Ordered list of the top 100 most popular domains globally and per country, based on our first model. Last 24 hours, updated daily.
  • Unordered global most popular domains datasets divided into buckets of the following sizes: 200, 500, 1,000, 2,000, 5,000, 10,000, 20,000, 50,000, 100,000, 200,000, 500,000, 1,000,000. Last 7 days, updated weekly.
Goodbye, Alexa. Hello, Cloudflare Radar Domain Rankings

Next steps

We will keep improving Domain Rankings and monitoring the results. Anyone can access them on Cloudflare Radar, read the results, and download the CSV files.

Feel free to explore our Domain Rankings and share feedback with us.

Dig through SERVFAILs with EDE

Post Syndicated from Stanley Chiang original https://blog.cloudflare.com/dig-through-servfails-with-ede/

Dig through SERVFAILs with EDE

Dig through SERVFAILs with EDE

It can be frustrating to get errors (SERVFAIL response codes) returned from your DNS queries. It can be even more frustrating if you don’t get enough information to understand why the error is occurring or what to do next. That’s why back in 2020, we launched support for Extended DNS Error (EDE) Codes to 1.1.1.1.

As a quick refresher, EDE codes are a proposed IETF standard enabled by the Extension Mechanisms for DNS (EDNS) spec. The codes return extra information about DNS or DNSSEC issues without touching the RCODE so that debugging is easier.

Now we’re happy to announce we will return more error code types and include additional helpful information to further improve your debugging experience. Let’s run through some examples of how these error codes can help you better understand the issues you may face.

To try for yourself, you’ll need to run the dig or kdig command in the terminal. For dig, please ensure you have v9.11.20 or above. If you are on macOS 12.1, by default you only have dig 9.10.6. Install an updated version of BIND to fix that.

Let’s start with the output of an example dig command without EDE support.

% dig @1.1.1.1 dnssec-failed.org +noedns

; <<>> DiG 9.18.0 <<>> @1.1.1.1 dnssec-failed.org +noedns
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 8054
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;dnssec-failed.org.		IN	A

;; Query time: 23 msec
;; SERVER: 1.1.1.1#53(1.1.1.1) (UDP)
;; WHEN: Thu Mar 17 10:12:57 PDT 2022
;; MSG SIZE  rcvd: 35

In the output above, we tried to do DNSSEC validation on dnssec-failed.org. It returns a SERVFAIL, but we don’t have context as to why.

Now let’s try that again with 1.1.1.1’s EDE support.

% dig @1.1.1.1 dnssec-failed.org +dnssec

; <<>> DiG 9.18.0 <<>> @1.1.1.1 dnssec-failed.org +dnssec
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 34492
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 1232
; EDE: 9 (DNSKEY Missing): (no SEP matching the DS found for dnssec-failed.org.)
;; QUESTION SECTION:
;dnssec-failed.org.		IN	A

;; Query time: 15 msec
;; SERVER: 1.1.1.1#53(1.1.1.1) (UDP)
;; WHEN: Fri Mar 04 12:53:45 PST 2022
;; MSG SIZE  rcvd: 103

We can see there is still a SERVFAIL. However, this time there is also an EDE Code 9 which stands for “DNSKey Missing”. Accompanying that, we also have additional information saying “no SEP matching the DS found” for dnssec-failed.org. That’s better!

Another nifty feature is that we will return multiple errors when appropriate, so you can debug each one separately. In the example below, we returned a SERVFAIL with three different error codes: “Unsupported DNSKEY Algorithm”, “No Reachable Authority”, and “Network Error”.

dig @1.1.1.1 [domain] +dnssec

; <<>> DiG 9.18.0 <<>> @1.1.1.1 [domain] +dnssec
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 55957
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags: do; udp: 1232
; EDE: 1 (Unsupported DNSKEY Algorithm): (no supported DNSKEY algorithm for [domain].)
; EDE: 22 (No Reachable Authority): (at delegation [domain].)
; EDE: 23 (Network Error): (135.181.58.79:53 rcode=REFUSED for [domain] A)
;; QUESTION SECTION:
;[domain].		IN	A

;; Query time: 1197 msec
;; SERVER: 1.1.1.1#53(1.1.1.1) (UDP)
;; WHEN: Wed Mar 02 13:41:30 PST 2022
;; MSG SIZE  rcvd: 202

Here’s a list of the additional codes we now support:

Error Code Number Error Code Name
1 Unsupported DNSKEY Algorithm
2 Unsupported DS Digest Type
5 DNSSEC Indeterminate
7 Signature Expired
8 Signature Not Yet Valid
9 DNSKEY Missing
10 RRSIGs Missing
11 No Zone Key Bit Set
12 NSEC Missing

We have documented all the error codes we currently support with additional information you may find helpful. Refer to our dev docs for more information.

How we improved DNS record build speed by more than 4,000x

Post Syndicated from Alex Fattouche original https://blog.cloudflare.com/dns-build-improvement/

How we improved DNS record build speed by more than 4,000x

How we improved DNS record build speed by more than 4,000x

Since my previous blog about Secondary DNS, Cloudflare’s DNS traffic has more than doubled from 15.8 trillion DNS queries per month to 38.7 trillion. Our network now spans over 270 cities in over 100 countries, interconnecting with more than 10,000 networks globally. According to w3 stats, “Cloudflare is used as a DNS server provider by 15.3% of all the websites.” This means we have an enormous responsibility to serve DNS in the fastest and most reliable way possible.

Although the response time we have on DNS queries is the most important performance metric, there is another metric that sometimes goes unnoticed. DNS Record Propagation time is how long it takes changes submitted to our API to be reflected in our DNS query responses. Every millisecond counts here as it allows customers to quickly change configuration, making their systems much more agile. Although our DNS propagation pipeline was already known to be very fast, we had identified several improvements that, if implemented, would massively improve performance. In this blog post I’ll explain how we managed to drastically improve our DNS record propagation speed, and the impact it has on our customers.

How DNS records are propagated

Cloudflare uses a multi-stage pipeline that takes our customers’ DNS record changes and pushes them to our global network, so they are available all over the world.

How we improved DNS record build speed by more than 4,000x

The steps shown in the diagram above are:

  1. Customer makes a change to a record via our DNS Records API (or UI).
  2. The change is persisted to the database.
  3. The database event triggers a Kafka message which is consumed by the Zone Builder.
  4. The Zone Builder takes the message, collects the contents of the zone from the database and pushes it to Quicksilver, our distributed KV store.
  5. Quicksilver then propagates this information to the network.

Of course, this is a simplified version of what is happening. In reality, our API receives thousands of requests per second. All POST/PUT/PATCH/DELETE requests ultimately result in a DNS record change. Each of these changes needs to be actioned so that the information we show through our API and in the Cloudflare dashboard is eventually consistent with the information we use to respond to DNS queries.

Historically, one of the largest bottlenecks in the DNS propagation pipeline was the Zone Builder, shown in step 4 above. Responsible for collecting and organizing records to be written to our global network, our Zone Builder often ate up most of the propagation time, especially for larger zones. As we continue to scale, it is important for us to remove any bottlenecks that may exist in our systems, and this was clearly identified as one such bottleneck.

Growing pains

When the pipeline shown above was first announced, the Zone Builder received somewhere between 5 and 10 DNS record changes per second. Although the Zone Builder at the time was a massive improvement on the previous system, it was not going to last long given the growth that Cloudflare was and still is experiencing. Fast-forward to today, we receive on average 250 DNS record changes per second, a staggering 25x growth from when the Zone Builder was first announced.

How we improved DNS record build speed by more than 4,000x

The way that the Zone Builder was initially designed was quite simple. When a zone changed, the Zone Builder would grab all the records from the database for that zone and compare them with the records stored in Quicksilver. Any differences were fixed to maintain consistency between the database and Quicksilver.

This is known as a full build. Full builds work great because each DNS record change corresponds to one zone change event. This means that multiple events can be batched and subsequently dropped if needed. For example, if a user makes 10 changes to their zone, this will result in 10 events. Since the Zone Builder grabs all the records for the zone anyway, there is no need to build the zone 10 times. We just need to build it once after the final change has been submitted.

What happens if the zone contains one million records or 10 million records? This is a very real problem, because not only is Cloudflare scaling, but our customers are scaling with us. Today our largest zone currently has millions of records. Although our database is optimized for performance, even one full build containing one million records took up to 35 seconds, largely caused by database query latency. In addition, when the Zone Builder compares the zone contents with the records stored in Quicksilver, we need to fetch all the records from Quicksilver for the zone, adding time. However, the impact doesn’t just stop at the single customer. This also eats up more resources from other services reading from the database and slows down the rate at which our Zone Builder can build other zones.

Per-record build: a new build type

Many of you might already have the solution to this problem in your head:

Why doesn’t the Zone Builder just query the database for the record that has changed and propagate just the single record?

Of course this is the correct solution, and the one we eventually ended up at. However, the road to get there was not as simple as it might seem.

Firstly, our database uses a series of functions that, at zone touch time, create a PostgreSQL Queue (PGQ) event that ultimately gets turned into a Kafka event. Initially, we had no distinction for individual record events, which meant our Zone Builder had no idea what had actually changed until it queried the database.

Next, the Zone Builder is still responsible for DNS zone settings in addition to records. Some examples of DNS zone settings include custom nameserver control and DNSSEC control. As a result, our Zone Builder needed to be aware of specific build types to ensure that they don’t step on each other. Furthermore, per-record builds cannot be batched in the same way that zone builds can because each event needs to be actioned separately.

As a result, a brand new scheduling system needed to be written. Lastly, Quicksilver interaction needed to be re-written to account for the different types of schedulers. These issues can be broken down as follows:

  1. Create a new Kafka event pipeline for record changes that contain information about the changed record.
  2. Separate the Zone Builder into a new type of scheduler that implements some defined scheduler interface.
  3. Implement the per-record scheduler to read events one by one in the correct order.
  4. Implement the new Quicksilver interface for the per-record scheduler.

Below is a high level diagram of how the new Zone Builder looks internally with the new scheduler types.

How we improved DNS record build speed by more than 4,000x

It is critically important that we lock between these two schedulers because it would otherwise be possible for the full build scheduler to overwrite the per-record scheduler’s changes with stale data.

It is important to note that none of this per-record architecture would be possible without the use of Cloudflare’s black lie approach to negative answers with DNSSEC. Normally, in order to properly serve negative answers with DNSSEC, all the records within the zone must be canonically sorted. This is needed in order to maintain a list of references from the apex record through all the records in the zone. With this normal approach to negative answers, a single record that has been added to the zone requires collecting all records to determine its insertion point within this sorted list of names.

Bugs

I would love to be able to write a Cloudflare blog where everything went smoothly; however, that is never the case. Bugs happen, but we need to be ready to react to them and set ourselves up so that next time this specific bug cannot happen.

In this case, the major bug we discovered was related to the cleanup of old records in Quicksilver. With the full Zone Builder, we have the luxury of knowing exactly what records exist in both the database and in Quicksilver. This makes writing and cleaning up a fairly simple task.

When the per-record builds were introduced, record events such as creates, updates, and deletes all needed to be treated differently. Creates and deletes are fairly simple because you are either adding or removing a record from Quicksilver. Updates introduced an unforeseen issue due to the way that our PGQ was producing Kafka events. Record updates only contained the new record information, which meant that when the record name was changed, we had no way of knowing what to query for in Quicksilver in order to clean up the old record. This meant that any time a customer changed the name of a record in the DNS Records API, the old record would not be deleted. Ultimately, this was fixed by replacing those specific update events with both a creation and a deletion event so that the Zone Builder had the necessary information to clean up the stale records.

None of this is rocket surgery, but we spend engineering effort to continuously improve our software so that it grows with the scaling of Cloudflare. And it’s challenging to change such a fundamental low-level part of Cloudflare when millions of domains depend on us.

Results

Today, all DNS Records API record changes are treated as per-record builds by the Zone Builder. As I previously mentioned, we have not been able to get rid of full builds entirely; however, they now represent about 13% of total DNS builds. This 13% corresponds to changes made to DNS settings that require knowledge of the entire zone’s contents.

How we improved DNS record build speed by more than 4,000x

When we compare the two build types as shown below we can see that per-record builds are on average 150x faster than full builds. The build time below includes both database query time and Quicksilver write time.

How we improved DNS record build speed by more than 4,000x

From there, our records are propagated to our global network through Quicksilver.

The 150x improvement above is with respect to averages, but what about that 4000x that I mentioned at the start? As you can imagine, as the size of the zone increases, the difference between full build time and per-record build time also increases. I used a test zone of one million records and ran several per-record builds, followed by several full builds. The results are shown in the table below:

Build Type Build Time (ms)
Per Record #1 6
Per Record #2 7
Per Record #3 6
Per Record #4 8
Per Record #5 6
Full #1 34032
Full #2 33953
Full #3 34271
Full #4 34121
Full #5 34093

We can see that, given five per-record builds, the build time was no more than 8ms. When running a full build however, the build time lasted on average 34 seconds. That is a build time reduction of 4250x!

Given the full build times for both average-sized zones and large zones, it is apparent that all Cloudflare customers are benefitting from this improved performance, and the benefits only improve as the size of the zone increases. In addition, our Zone Builder uses less database and Quicksilver resources meaning other Cloudflare systems are able to operate at increased capacity.

Next Steps

The results here have been very impactful, though we think that we can do even better. In the future, we plan to get rid of full builds altogether by replacing them with zone setting builds. Instead of fetching the zone settings in addition to all the records, the zone setting builder would just fetch the settings for the zone and propagate that to our global network via Quicksilver. Similar to the per-record builds, this is a difficult challenge due to the complexity of zone settings and the number of actors that touch it. Ultimately if this can be accomplished, we can officially retire the full builds and leave it as a reminder in our git history of the scale at which we have grown over the years.

In addition, we plan to introduce a batching system that will collect record changes into groups to minimize the number of queries we make to our database and Quicksilver.

Does solving these kinds of technical and operational challenges excite you? Cloudflare is always hiring for talented specialists and generalists within our Engineering and other teams.

Wildcard proxy for everyone

Post Syndicated from Hannes Gerhart original https://blog.cloudflare.com/wildcard-proxy-for-everyone/

Wildcard proxy for everyone

Wildcard proxy for everyone

Today, I have the pleasure to announce that we’re giving everyone the ability to proxy DNS wildcard records. Previously, this feature was only available to our Enterprise customers. After many of our free and pay-as-you-go users reached out, we decided that this feature should be available to everyone.

What is a wildcard DNS record?

A DNS record usually maps a domain name to one or multiple IP addresses or another resource associated with that name, so it’s a one-to-many mapping. Let’s look at an example:

Wildcard proxy for everyone

When I do a DNS lookup for the IP address of subdomain1.mycoolwebpage.xyz, I get two IP addresses back, because I have added two A records on that subdomain:

$ dig subdomain1.mycoolwebpage.xyz -t a +short
192.0.2.1
192.0.2.2

I could specify the target of all subdomains like this, with one or multiple DNS records per subdomain. But what if I have hundreds or even thousands of subdomains that I all want to point to the same resource?

This is where a wildcard DNS record comes in. By using the asterisk symbol "*" in the Name field, I can create one or multiple DNS records that are used as the response for all subdomains that are not specifically covered by another DNS record (more on this later). So the wildcard record you can see in the screenshot above is covering *.mycoolwebpage.xyz, meaning all subdomains of mycoolwebpage.xyz. This can also be done on deeper levels, like on *.www.mycoolwebpage.xyz.

If I perform a lookup for subdomain2.mycoolwebpage.xyz, the target I specified in the wildcard record will be used as the response. Again, this is only happening because there is no DNS record specifically for this subdomain.

$ dig subdomain2.mycoolwebpage.xyz -t a +short
192.0.2.3

And it is often overlooked that a wildcard record does not only cover the level it is set on directly, but deeper levels, as well:

$ dig some.deep.label.subdomain2.mycoolwebpage.xyz -t a +short 
192.0.2.3

Also, a wildcard DNS record does not cover the apex of the zone (in this example the apex is mycoolwebpage.xyz).

A few more things to know about wildcard records

Below you can find additional rules that apply to wildcard DNS records you should be aware of:

Wildcards are only supported on the first label. Meaning something like subdomain.*.mycoolwebpage.xyz is not a wildcard on the level of the asterisk character. If you create a DNS record with that name, the asterisk is interpreted as the literal character and not as the wildcard operator.

You cannot create wildcards on multiple levels. So if you create a DNS record on *.*.mycoolwebpage.xyz, only the first asterisk is interpreted as a wildcard while the second one is interpreted as the literal “*” character.

Wildcards will be applied for multiple levels. But a specific record on any equal or lower level will terminate anything on or below this specific record — independent of the type of that specific record. Here is an example. If you have only these two records on your domain

subdomain1.mycoolwebpage.xyz  TXT  “some text”
*.mycoolwebpage.xyz  	A  192.0.2.3

the wildcard record will be used for queries going to any subdomain of mycoolwebpage.xyz except subdomain1.mycoolwebpage.xyz or anything below that specific label, like deeper.label.subdomain1.mycoolwebpage.xyz — simply because there already exists a record on subdomain1.mycoolwebpage.xyz. However, the wildcard will be used for deeper labels that are not below the specific record on subdomain1 — for example, deeper.label.subdomain2.mycoolwebpage.xyz.

To expand on this rule: if you think of DNS as a tree starting from the root zone (see the diagram below), simply the existence of a branch terminates the wildcard for all records on that branch. In the example above the wildcard was terminated for anything on the label subdomain1 and below, but even if there only exists a record on a deeper level, anything above will also be terminating the wildcard. This example should make it clear. If you only have the following two records on your domain, as shown in the diagram below

some.deep.label.subdomain1.mycoolwebpage.xyz  TXT  “some other text”
*.mycoolwebpage.xyz  	A  192.0.2.3

a query to label.subdomain1.mycoolwebpage.xyz for an A record is not covered by the wildcard because it is a node on the existing branch ending in the TXT record above.

Wildcard proxy for everyone

Wildcard records only cover the record type they are specified for. If you add a wildcard A record for *.mycoolwebpage.xyz it will not cover queries specifying AAAA records (or any other type). But as mentioned in the previous point, a record on a specific label will terminate the wildcard for this label and everything below even if it’s a different record type.

All the above and more can be found in RFC4592. Not the type to read through complex RFCs but still generally interested in how DNS works, go check out Julia Evans’ wizard zines about DNS, she did a great job explaining all the complexities about DNS in an easy to digest way.

What is a proxied wildcard DNS record?

Cloudflare provides a range of features (including Caching, Firewall, or Workers) that require you to proxy the specific hostname you want to use these features on. You can proxy DNS records of the type A, AAAA, and CNAME. These record types are used to specify the origin server of a hostname which expects traffic via HTTP/S.

Proxying a wildcard DNS record works exactly as proxying a specific record. In the Cloudflare dashboard, navigate to the DNS app and either create a new wildcard record or edit an existing record and toggle the proxy status to Proxied. Previously, we only allowed this on wildcard records if the domain was upgraded to the Enterprise plan, but this feature is now available on all plan levels!

Once you have enabled the proxy status of your wildcard DNS record, Cloudflare nameservers will respond with two Cloudflare anycast IPs instead of the origin IP(s) you have specified for that record. These Cloudflare IPs are advertised on our global network from more than 275 locations in more than 100 countries.

$ dig subdomain2.mycoolwebpage.xyz -t a +short
104.18.35.126
172.64.152.130

In the example above, this will ensure that all HTTP/S requests sent to subdomain2.mycoolwebpage.xyz or any other subdomain that is covered by the proxied wildcard DNS record are proxied by Cloudflare’s network, specifically the closest Cloudflare data center. Go see for yourself and pick a random subdomain of mycoolwebpage.xyz. You will see a simple page that is generated using Cloudflare Workers:

Wildcard proxy for everyone

And the cool thing is that you don’t even have to think about creating a TLS certificate. By default, Cloudflare will issue and automatically renew a certificate for your zone apex (mycoolwebpage.xyz) and all subdomains on the next level (*.mycoolwebpage.xyz).

Wildcard proxy for everyone

If you want to proxy a wildcard DNS record on a deeper level like *.www.mycoolwebpage.xyz you can subscribe to Cloudflare Advanced Certificate Manager and get a certificate that is covering that wildcard like this:

Wildcard proxy for everyone

Try it yourself on your domain

If you are not already using Cloudflare DNS for your domain, it is very easy to move from your existing DNS provider and can be done in a few minutes. Head over to our developer documentation for detailed instructions on how to change your authoritative nameservers.

Announcing experimental DDR in 1.1.1.1

Post Syndicated from Christopher Wood original https://blog.cloudflare.com/announcing-ddr-support/

Announcing experimental DDR in 1.1.1.1

Announcing experimental DDR in 1.1.1.1

1.1.1.1 sees approximately 600 billion queries per day. However, proportionally, most queries sent to this resolver are over cleartext: 89% over UDP and TCP combined, and the remaining 11% are encrypted. We care about end-user privacy and would prefer to see all of these queries sent to us over an encrypted transport using DNS-over-TLS or DNS-over-HTTPS. Having a mechanism by which clients could discover support for encrypted protocols such as DoH or DoT will help drive this number up and lead to more name encryption on the Internet. That’s where DDR – or Discovery of Designated Resolvers – comes into play. As of today, 1.1.1.1 supports the latest version of DDR so clients can automatically upgrade non-secure UDP and TCP connections to secure connections. In this post, we’ll describe the motivations for DDR, how the mechanism works, and, importantly, how you can test it out as a client.

DNS transports and public resolvers

We initially launched our public recursive resolver service 1.1.1.1 over three years ago, and have since seen its usage steadily grow. Today, it is one of the fastest public recursive resolvers available to end-users, supporting the latest security and privacy DNS transports such as HTTP/3 for DNS-over-HTTPS (DoH), as well as Oblivious DoH.

As a public resolver, all clients, regardless of type, are typically manually configured based on a user’s desired performance, security, and privacy requirements. This choice reflects answers to two separate but related types of questions:

  1. What recursive resolver should be used to answer my DNS queries? Does the resolver perform well? Does the recursive resolver respect my privacy?
  2. What protocol should be used to speak to this particular recursive resolver? How can I keep my DNS data safe from eavesdroppers that should otherwise not have access to it?

The second question primarily concerns technical matters. In particular, whether or not a recursive resolver supports DoH is simple enough to answer. Either the recursive resolver does or does not support it!

In contrast, the first question is primarily a matter of policy. For example, consider the question of choosing between a local network-provided DNS recursive resolver and a public recursive resolver. How do resolver features (including DoH support, for example) influence this decision? How does the resolver’s privacy policy regarding data use and retention influence this decision? More generally, what information about recursive resolver capabilities is available to clients in making this decision and how is this information delivered to clients?

These policy questions have been the topic of substantial debate in the Internet Engineering Task Force (IETF), the standards body where DoH was standardized, and is the one facet of the Adaptive DNS Discovery (ADD) Working Group, which is chartered to work on the following items (among others):

– Define a mechanism that allows clients to discover DNS resolvers that support encryption and that are available to the client either on the public Internet or on private or local networks.

– Define a mechanism that allows communication of DNS resolver information to clients for use in selection decisions. This could be part of the mechanism used for discovery, above.

In other words, the ADD Working Group aims to specify mechanisms by which clients can obtain the information they need to answer question (1). Critically, one of those pieces of information is what encrypted transport protocols the recursive resolver supports, which would answer question (2).

As the answer to question (2) is purely technical and not a matter of policy, the ADD Working Group was able to specify a workable solution that we’ve implemented and tested with existing clients. Before getting into the details of how it works, let’s dig into the problem statement here and see what’s required to address it.

Threat model and problem statement

The DDR problem is relatively straightforward: given the IP address of a DNS recursive resolver, how can one discover parameters necessary for speaking to the same resolver using an encrypted transport? (As above, discovering parameters for a different resolver is a distinctly different problem that pertains to policy and is therefore out of scope.)

This question is only meaningful insofar as using encryption helps protect against some attacker. Otherwise, if the network was trusted, encryption would add no value! A direct consequence is that this question assumes the network – for some definition of “the network” – is untrusted and encryption helps protect against this network.

But what exactly is the network here? In practice, the topology typically looks like the following:

Announcing experimental DDR in 1.1.1.1
Typical DNS configuration from DHCP

Again, for DNS discovery to have any meaning, we assume that either the ISP or home network – or both – is untrusted and malicious. The setting here depends on the client and the network they are attached to, but it’s generally simplest to assume the ISP network is untrusted.

This question also makes one important assumption: clients know the desired recursive resolver address. Why is this important? Typically, the IP address of a DNS recursive resolver is provided via Dynamic Host Configuration Protocol (DHCP). When a client joins a network, it uses DHCP to learn information about the network, including the default DNS recursive resolver. However, DHCP is a famously unauthenticated protocol, which means that any active attacker on the network can spoof the information, as shown below.

Announcing experimental DDR in 1.1.1.1
Unauthenticated DHCP discovery

One obvious attacker vector would be for the attacker to redirect DNS traffic from the network’s desired recursive resolver to an attacker-controlled recursive resolver. This has important implications on the threat model for discovery.

First, there is currently no known mechanism for encrypted DNS discovery in the presence of an active attacker that can influence the client’s view of the recursive resolver’s address. In other words, to make any meaningful improvement, DNS discovery assumes the client’s view of the DNS recursive resolver address is correct (and obtained through some secure mechanism). A second implication is that the attacker can simply block any attempt of client discovery, preventing upgrade to encrypted transports. This seems true of any interactive discovery mechanism. As a result, DNS discovery must relax this attacker’s capabilities somewhat: rather than add, drop, or modify packets, the attacker can only add or modify packets.

Altogether, this threat model lets us sharpen the DNS discovery problem statement: given the IP address of a DNS recursive resolver, how can one securely discover parameters necessary for speaking to the same resolver using an encrypted transport in the presence of an active attacker that can add or modify packets? It should be infeasible, for example, for the attacker to redirect the client from the resolver that it knows at the outset to one the attacker controls.

So how does this work, exactly?

DDR mechanics

DDR depends on two mechanisms:

  1. Certificate-based authentication of encrypted DNS resolvers.
  2. SVCB records for encoding and communicating DNS parameters.

Certificates allow resolvers to prove authority for IP addresses. For example, if you view the certificate for one.one.one.one, you’ll see several IP addresses listed under the SubjectAlternativeName extension, including 1.1.1.1.

Announcing experimental DDR in 1.1.1.1
SubjectAltName list of the one.one.one.one certificate

SVCB records are extensible key-value stores that can be used for conveying information about services to clients. Example information includes the supported application protocols, including HTTP/3, as well as keying material like that used for TLS Encrypted Client Hello.

How does DDR combine these two to solve the discovery problem above? In three simple steps:

  1. Clients query the expected DNS resolver for its designations and their parameters with a special-purpose SVCB record.
  2. Clients open a secure connection to the designated resolver, for example, one.one.one.one, authenticating the resolver against the one.one.one.one name.
  3. Clients check that the designated resolver is additionally authenticated for the IP address of the origin resolver. That is, the certificate for one.one.one.one, the designated resolver, must include the IP address 1.1.1.1, the original designator resolver.

If this validation completes, clients can then use the secure connection to the designated resolver. In pictures, this is as follows:

Announcing experimental DDR in 1.1.1.1
DDR discovery process

This demonstrates that the encrypted DNS resolver is authoritative for the client’s original DNS resolver. Or, in other words, that the original resolver and the encrypted resolver are effectively  “the same.” An encrypted resolver that does not include the originally requested resolver IP address on its certificate would fail the validation, and clients are not expected to follow the designated upgrade path. This entire process is referred to as “Verified Discovery” in the DDR specification.

Experimental deployment and next steps

To enable more encrypted DNS on the Internet and help the standardization process, 1.1.1.1 now has experimental support for DDR. You can query it directly to find out:

$ dig +short @1.1.1.1 _dns.resolver.arpa type64

QUESTION SECTION
_dns.resolver.arpa.               IN SVCB 

ANSWER SECTION
_dns.resolver.arpa.                           300    IN SVCB  1 one.one.one.one. alpn="h2,h3" port="443" ipv4hint="1.1.1.1,1.0.0.1" ipv6hint="2606:4700:4700::1111,2606:4700:4700::1001" key7="/dns-query{?name}"
_dns.resolver.arpa.                           300    IN SVCB  2 one.one.one.one. alpn="dot" port="853" ipv4hint="1.1.1.1,1.0.0.1" ipv6hint="2606:4700:4700::1111,2606:4700:4700::1001"

ADDITIONAL SECTION
one.one.one.one.                              300    IN AAAA  2606:4700:4700::1111
one.one.one.one.                              300    IN AAAA  2606:4700:4700::1001
one.one.one.one.                              300    IN A     1.1.1.1
one.one.one.one.                              300    IN A     1.0.0.1

This command sends a SVCB query (type64) for the reserved name _dns.resolver.arpa to 1.1.1.1. The output lists the contents of this record, including the DoH and DoT designation parameters. Let’s walk through the contents of this record:

_dns.resolver.arpa.                           300    IN SVCB  1 one.one.one.one. alpn="h2,h3" port="443" ipv4hint="1.1.1.1,1.0.0.1" ipv6hint="2606:4700:4700::1111,2606:4700:4700::1001" key7="/dns-query{?name}"

This says that the DoH target one.one.one.one is accessible over port 443 (port=”443”) using either HTTP/2 or HTTP/3 (alpn=”h2,h3”), and the DoH path (key7) for queries is “/dns-query{?name}”.

Moving forward

DDR is a simple mechanism that lets clients automatically upgrade to encrypted transport protocols for DNS queries without any manual configuration. At the end of the day, users running compatible clients will enjoy a more private Internet experience. Happily, both Microsoft and Apple recently announced experimental support for this emerging standard, and we’re pleased to help them and other clients test support.
Going forward, we hope to help add support for DDR to open source DNS resolver software such as dnscrypt-proxy and Bind. If you’re interested in helping us continue to drive adoption of encrypted DNS and related protocols to help build a better Internet, we’re hiring!

Announcing Foundation DNS — Cloudflare’s new premium DNS offering

Post Syndicated from Hannes Gerhart original https://blog.cloudflare.com/foundation-dns/

Announcing Foundation DNS — Cloudflare’s new premium DNS offering

Announcing Foundation DNS — Cloudflare’s new premium DNS offering

Today, we’re announcing Foundation DNS, Cloudflare’s new premium DNS offering that provides unparalleled reliability, supreme performance and is able to meet the most complex requirements of infrastructure teams.

Let’s talk money first

When you’re signing an enterprise DNS deal, usually DNS providers request three inputs from you in order to generate a quote:

  • Number of zones
  • Total DNS queries per month
  • Total DNS records across all zones

Some are considerably more complicated and many have pricing calculators or opaque “Contact Us” pricing. Planning a budget around how you may grow brings unnecessary complexity, and we think we can do better. Why not make this even simpler? Here you go: We decided to charge Foundation DNS based on a single input for our enterprise customers: Total DNS queries per month. This way, we expect to save companies money and even more importantly, remove complexity from their DNS bill.

And don’t worry, just like the rest of our products, DDoS mitigation is still unmetered. There won’t be any hidden overage fees in case your nameservers are DDoS’d or the number of DNS queries exceeds your quota for a month or two.

Why is DNS so important?

Announcing Foundation DNS — Cloudflare’s new premium DNS offering

The Domain Name System (DNS) is nearly as old as the Internet itself. It was originally defined in RFC882 and RFC883 in 1983 out of the need to create a mapping between hostnames and IP addresses. Back then, the authors wisely stated: “[The Internet] is a large system and is likely to grow much larger.” [RFC882]. Today there are almost 160 Million domain names just under the .com, one of the largest Top Level Domains (TLD) [source].

By design, DNS is a hierarchical and highly distributed system, but as an end user you usually only communicate with a resolver (1) that is either assigned or operated by your Internet Service Provider (ISP) or directly configured by your employer or yourself. The resolver communicates with one of the root servers (2), the responsible TLD server (3) and the authoritative nameserver (4) of the domain in question. In many cases all of these four parties are operated by a different entity and located in different regions, maybe even continents.

Announcing Foundation DNS — Cloudflare’s new premium DNS offering

As we have seen in the recent past, if your DNS infrastructure goes down you are in serious trouble, and it likely will cost you a lot of money and potentially damage your reputation. So as a domain owner you want that DNS lookups to your domain are answered 100% of the time and ideally as quickly as possible. So what can you do? You cannot influence which resolver your users have configured. You cannot influence the root server. You can choose which TLD server is involved by picking a domain name with the respective TLD. But if you are bound to a certain TLD for other reasons then that is out of your control as well. What you can easily influence is the provider for your authoritative nameservers. So let’s take a closer look at Cloudflare’s authoritative DNS offering.

A look at Cloudflare’s Authoritative DNS

Authoritative DNS is one of our oldest products, and we have spent a lot of time making it great. All DNS queries are answered from our global anycast network with a presence in more than 250 cities. This way we can deliver supreme performance while always guaranteeing global availability. And of course, we leverage our extensive experience in mitigating DDoS attacks to prevent anyone from knocking down our nameservers and with that the domains of our customers.

DNS is critically important to Cloudflare because up until the release of Magic Transit, DNS was how every user on the Internet was directed to Cloudflare to protect and accelerate our customer’s applications. If our DNS answers were slow, Cloudflare was slow. If our DNS answers were unavailable, Cloudflare was unavailable. Speed and reliability of our authoritative DNS is paramount to the speed and reliability of Cloudflare, as it is to our customers. We have also had our customers push our DNS infrastructure as they’ve grown with Cloudflare. Today our largest customer zone has more than 3 million records and the top 5 are reaching almost 10 million records combined. Those customers rely on Cloudflare to push new DNS record updates to our edge in seconds, not minutes. Due to this importance and our customer’s needs, over the years we have grown our dedicated DNS engineering team focused on keeping our DNS stack fast and reliable.

The security of the DNS ecosystem is also important. Cloudflare has always been a proponent of DNSSEC. Signing and validating DNS answers through DNSSEC ensures that an on-path attacker cannot hijack answers and redirect traffic. Cloudflare has always offered DNSSEC for free on all plan levels, and it will continue to be a no charge option for Foundation DNS. For customers who also choose to use Cloudflare as a registrar, simple one-click deployment of DNSSEC is another key feature that ensures our customers’ domains are not hijacked, and their users are protected. We support RFC 8078 for one-click deployment on external registrars as well.

But there are other issues that can bring parts of the Internet to a halt and these are mostly out of our control: route leaks or even worse route hijacking. While DNSSEC can help with mitigating route hijacks, unfortunately not all recursive resolvers will validate DNSSEC. And even if the resolver does validate, a route leak or hijack to your nameservers will still result in downtime. If all your nameserver IPs are affected by such an event, your domain becomes unresolvable.

With many providers each of your nameservers usually resolves to only one IPv4 and one IPv6 address. If that IP address is not reachable — for example because of network congestion or, even worse, a route leak — the entire nameserver becomes unavailable leading to your domain becoming unresolvable. Even worse, some providers even use the same IP subnet for all their nameservers. So if there is an issue with that subnet all nameservers are down.

Let’s take a look at an example:

$ dig aws.com ns +short              
ns-1500.awsdns-59.org.
ns-164.awsdns-20.com.
ns-2028.awsdns-61.co.uk.
ns-917.awsdns-50.net.

$ dig ns-1500.awsdns-59.org. +short
205.251.197.220
$ dig ns-164.awsdns-20.com. +short
205.251.192.164
$ dig ns-2028.awsdns-61.co.uk. +short
205.251.199.236
$ dig ns-917.awsdns-50.net. +short
205.251.195.149

All nameserver IPs are part of 205.251.192.0/21. Thankfully, AWS is now signing their ranges through RPKI and this makes it less likely to leak… provided that the resolver ISP is validating RPKI. But if the resolver ISP does not validate RPKI and should this subnet be leaked or hijacked, resolvers wouldn’t be able to reach any of the nameservers and aws.com would become unresolvable.

It goes without saying that Cloudflare signs all of our routes and are pushing the rest of the Internet to minimize the impact of route leaks, but what else can we do to ensure that our DNS systems remain resilient through route leaks while we wait for RPKI to be ubiquitously deployed?

Today, when you’re using Cloudflare DNS on the Free, Pro, Business or Enterprise plan, your domain gets two nameservers of the structure <name>.ns.cloudflare.com where <name> is a random first name.

$ dig isbgpsafeyet.com ns +short
tom.ns.cloudflare.com.
kami.ns.cloudflare.com.

Now, as we learned before, in order for a domain to be available, its nameservers have to be available. This is why each of these nameservers resolves to 3 anycast IPv4 and 3 anycast IPv6 addresses.

$ dig tom.ns.cloudflare.com a +short
173.245.59.147
108.162.193.147
172.64.33.147

$ dig tom.ns.cloudflare.com aaaa +short
2606:4700:58::adf5:3b93
2803:f800:50::6ca2:c193
2a06:98c1:50::ac40:2193

The essential detail to notice here is that each of the 3 IPv4 and 3 IPv6 addresses is from a different /8 IPv4 (/45 for IPv6) block. So in order for your nameservers to become unavailable via IPv4, the route leak would have to affect exactly the corresponding subnets across all three /8 IPv4 blocks. This type of event, while theoretically is possible, is virtually impossible in practical terms.

How can this be further improved?

Customers using Foundation DNS will be assigned a new set of advanced nameservers hosted on foundationdns.com and foundationdns.net. These nameservers will be even more resilient than the default Cloudflare nameservers. We will be announcing more details about how we’re achieving this early next year, so stay tuned. All external Cloudflare domains (such as cloudflare.com) will transition to these nameservers in the new year.

There is even more

We’re glad to announce that we are launching two highly requested features:

  • Support for outgoing zone transfers for Secondary DNS
  • Logpush for authoritative and secondary DNS queries

Both of them will be available as part of Foundation DNS and to enterprise customers without any additional costs. Let’s take a closer look at each of these and see how they make our DNS offering even better.

Support for outgoing zone transfers for Secondary DNS

What is Secondary DNS, and why is it important? Many large enterprises have requirements to use more than one DNS provider for redundancy in case one provider becomes unavailable. They can achieve this by adding their domain’s DNS records on two independent platforms and manually keeping the zone files in sync — this is referred to as “multi-primary” setup. With Secondary DNS there are two mechanisms how this can be automated using a “primary-secondary” setup:

  • DNS NOTIFY: The primary nameserver notifies the secondary on every change on the zone. Once the secondary receives the NOTIFY, it sends a zone transfer request to the primary to get in sync with it.
  • SOA query: Here, the secondary nameserver regularly queries the SOA record of the zone and checks if the serial number that can be found on the SOA record is the same with the latest serial number the secondary has stored in it’s SOA record of the zone. If there is a new version of the zone available, it sends a zone transfer request to the primary to get those changes.

Alex Fattouche has written a very insightful blog post about how Secondary DNS works behind the scenes if you want to learn more about it. Another flavor of the primary-secondary setup is to hide the primary, thus referred to as “hidden primary”. The difference of this setup is that only the secondary nameservers are authoritative — in other words configured at the domain’s registrar. The diagram below illustrates the different setups.

Announcing Foundation DNS — Cloudflare’s new premium DNS offering

Since 2018, we have been supporting primary-secondary setups where Cloudflare takes the role of the secondary nameserver. This means from our perspective that we are accepting incoming zone transfers from the primary nameservers.

Announcing Foundation DNS — Cloudflare’s new premium DNS offering

Starting today, we are now also supporting outgoing zone transfers, meaning taking the role of the primary nameserver with one or multiple external secondary nameservers receiving zone transfers from Cloudflare. Exactly as for incoming transfers, we are supporting

  • zone transfers via AXFR and IXFR
  • automatic notifications via DNS NOTIFY to trigger zone transfers on every change
  • signed transfers using TSIG to ensure zone files are authenticated during transfer

Logpush for authoritative and secondary DNS

Here at Cloudflare we love logs. In Q3 2021, we processed 28 Million HTTP requests per second and 13.6 Million DNS queries per second on average and blocked 76 Billion threats each day. All these events are stored as logs for a limited time frame in order to provide our users near real-time analytics in the dashboard. For those customers who want to — or have to — permanently store these logs we’ve built Logpush back in 2019. Logpush allows you to stream logs in near real time to one of our analytics partners Microsoft Azure Sentinel, Splunk, Datadog and Sumo Logic or to any cloud storage destination with R2-compatible API.

Announcing Foundation DNS — Cloudflare’s new premium DNS offering

Today, we’re adding one additional data set for Logpush: DNS logs. In order to configure Logpush and stream DNS logs for your domain, just head over to the Cloudflare dashboard, create a new Logpush job, select DNS logs and configure the log fields you’re interested in:

Announcing Foundation DNS — Cloudflare’s new premium DNS offering

Check out our developer documentation for detailed instructions on how to do this through the API and for a thorough description of the new DNS log fields.

One more thing (or two…)

When looking at the entirety of DNS within your infrastructure, it’s important to review how your traffic is flowing through your systems and how that traffic is behaving. At the end of the day, there is only so much processing power, memory, server capacity, and overall compute resources available. One of the best and important tools we have available is Load Balancing and Health Monitoring!

Cloudflare has provided a Load Balancing solution since 2016, supporting customers to leverage their existing resources in a scalable and intelligent manner. But our Load Balancer was limited to A, AAAA, and CNAME records. This covered a lot of major use cases required by customers, but did not cover all of them. Many customers have more needs such as load balancing MX or email server traffic, SRV records to declare which ports and weight that respective traffic should travel across for a specific service, HTTPS records to ensure the respective traffic uses the secure protocol regardless of port and many more. We want to ensure that our customers’ needs are covered and support their ability to align business goals with technical implementation.

We are happy to announce that we have added additional Health Monitoring methods to support Load Balancing MX, SRV, HTTPS and TXT record traffic without any additional configuration necessary. Create your respective DNS records in Cloudflare and set your Load Balancer as the destination…it’s as easy as that! By leveraging ICMP Ping, SMTP, and UDP-ICMP methods, customers will always have a pulse on the health of their servers and be able to apply intelligent steering decisions based on the respective health information.

When thinking about intelligent steering, there is no one size fits all answer. Different businesses have different needs, especially when looking at where your servers are located around the globe, and where your customers are situated. A common rule of thumb to follow is to place servers where your customers are. This ensures they have the most performant and localized experience possible. One common scenario is to steer your traffic based on where your end-user request originates and create a mapping to the server closest to that area. Cloudflare’s geo steering capability allows our customers to do just that — easily create a mapping of regions to pools, ensuring if we see a request originate from Eastern Europe, to send that request to the proper server to suffice that request. But sometimes, regions can be quite large and lend to issues around not being able to tightly couple together that mapping as closely as one might like.

Today, we are very excited to announce country support within our Geo Steering functionality. Now, customers will be able to choose either one of our thirteen regions, or a specific country to map against their pools to give further granularity and control to how customers traffic should behave as it travels through their system. Both country-level steering and our new health monitoring methods to support load balancing more DNS records will be available in January 2022!

Advancing the DNS Ecosystem

Furthermore, we have some other exciting news to share: We’re finishing the work on Multi-Signer DNSSEC (RFC8901) and plan to roll this out in Q1 2022. Why is this important? Two common requirements of large enterprises are:

  • Redundancy: Having multiple DNS providers responding authoritatively for their domains
  • Authenticity: Deploying DNSSEC to ensure DNS responses can be properly authenticated

Both can be achieved by having the primary nameserver sign the domain and transfer its DNS records plus the record signatures to the secondary nameserver which will serve both as is. This setup is supported with Cloudflare Secondary DNS today. What cannot be supported when transferring pre-signed zones are non-standard DNS features like country-level steering. This is where Multi-Signer DNSSEC comes in. Both DNS providers need to know the signing keys of the other provider and perform their own online (or on-the-fly) signing. If you’re curious to learn more about how Multi-Signer DNSSEC works, go check out this excellent blog post published by APNIC.

Last but not least, Cloudflare is joining the DNS Operations, Analysis, and Research Center (DNS-OARC) as a gold member. Together with other researchers and operators of DNS infrastructure we want to tackle the most challenging problems and continuously work on implementing new standards and features.

While we’ve been at DNS since day one of Cloudflare, we’re still just getting started. We know there are more granular and specific features our future customers will ask of us and the launch of Foundation DNS is our stake in the ground that we will continue to invest in all levels of DNS while building the most feature rich enterprise DNS platform on the planet. If you have ideas, let us know what you’ve always dreamed your DNS provider would do. If you want to help build these features, we are hiring.

Cloudflare blocks an almost 2 Tbps multi-vector DDoS attack

Post Syndicated from Omer Yoachimik original https://blog.cloudflare.com/cloudflare-blocks-an-almost-2-tbps-multi-vector-ddos-attack/

Cloudflare blocks an almost 2 Tbps multi-vector DDoS attack

Cloudflare blocks an almost 2 Tbps multi-vector DDoS attack

Earlier this week, Cloudflare automatically detected and mitigated a DDoS attack that peaked just below 2 Tbps — the largest we’ve seen to date. This was a multi-vector attack combining DNS amplification attacks and UDP floods. The entire attack lasted just one minute. The attack was launched from approximately 15,000 bots running a variant of the original Mirai code on IoT devices and unpatched GitLab instances.

Cloudflare blocks an almost 2 Tbps multi-vector DDoS attack
DDoS attack peaking just below 2 Tbps‌‌

Network-layer DDoS attacks increased by 44%

Last quarter, we saw multiple terabit-strong DDoS attacks and this attack continues this trend of increased attack intensity. Another key finding from our Q3 DDoS Trends report was that network-layer DDoS attacks actually increased by 44% quarter-over-quarter. While the fourth quarter is not over yet, we have, again, seen multiple terabit-strong attacks that targeted Cloudflare customers.

Cloudflare blocks an almost 2 Tbps multi-vector DDoS attack
DDoS attacks peaking at 1-1.4 Tbps

How did Cloudflare mitigate this attack?

To begin with, our systems constantly analyze traffic samples “out-of-path” which allows us to asynchronously detect DDoS attacks without causing latency or impacting performance. Once the attack traffic was detected (within sub-seconds), our systems generated a real-time signature that surgically matched against the attack patterns to mitigate the attack without impacting legitimate traffic.

Once generated, the fingerprint is propagated as an ephemeral mitigation rule to the most optimal location in the Cloudflare edge for cost-efficient mitigation. In this specific case, as with most L3/4 DDoS attacks, the rule was pushed in-line into the Linux kernel eXpress Data Path (XDP) to drop the attack packet at wirespeed.

Cloudflare blocks an almost 2 Tbps multi-vector DDoS attack
A conceptual diagram of Cloudflare’s DDoS protection systems

Read more about Cloudflare’s DDoS Protection systems.

Helping build a better Internet

Cloudflare’s mission is to help build a better Internet — one that is secure, faster, and more reliable for everyone. The DDoS team’s vision is derived from this mission: our goal is to make the impact of DDoS attacks a thing of the past. Whether it’s the Meris botnet that launched some of the largest HTTP DDoS attacks on record, the recent attacks on VoIP providers or this Mirai-variant that’s DDoSing Internet properties, Cloudflare’s network automatically detects and mitigates DDoS attacks. Cloudflare provides a secure, reliable, performant, and customizable platform for Internet properties of all types.

For more information about Cloudflare’s DDoS protection, reach out to us or have a go with a hands-on evaluation of Cloudflare’s Free plan here.

Five Great (free!) Ways to Get Started With Cloudflare

Post Syndicated from John Engates original https://blog.cloudflare.com/five-free-ways-to-get-started-with-cloudflare/

Five Great (free!) Ways to Get Started With Cloudflare

Five Great (free!) Ways to Get Started With Cloudflare

I joined Cloudflare a few weeks ago, and as someone new to the company, there’s a ton of information to absorb. I have always learned best by doing, so I decided to use Cloudflare like a brand-new user. Cloudflare customers range from individuals with a simple website to companies in the Fortune 100. I’m currently exploring Cloudflare from the perspective of the individual, so I signed up for a free account and logged into the dashboard. Just like getting into a new car, I want to turn all the dials and push all the buttons. I looked for things that would be fun and easy to do and would deliver some immediate value. Now I want to share the best ones with you.

Here are my five ways to get started with Cloudflare. These should be easy for anyone, and they’re free. You’ll likely even save some money and improve your privacy and security in the process. Let’s go!

1. Transfer or register a domain with Cloudflare Registrar

If you’re like me, you’ve acquired a few (dozen) Internet domains for things like personalizing your email address, a web page for your nature photography hobby, or maybe a side business. You probably registered them at one or more of the popular domain name registrars, and you pay around $15 per year for each domain. I did an audit and found I was spending a shocking amount each year to maintain my domains, and they were spread across three different registrars.

Cloudflare makes it easy to transfer domains from other registrars and doesn’t charge a markup for domain registrar services. Let me say that again; there is zero price markup for domain registration with Cloudflare Registrar. You’ll pay exactly what Cloudflare pays. For example, a .com domain registered with Cloudflare currently costs half of what I was paying at other registrars.

Not only will you save on the domain registration, but Cloudflare doesn’t nickel-and-dime you like registrars who charge extra for WHOIS privacy and transfer lock and then sneakily bundle their website hosting services. It all adds up.

To get started registering or transferring a domain, log into the Cloudflare Dashboard, click “Add a Site,” and bring your domains to Cloudflare.

Five Great (free!) Ways to Get Started With Cloudflare

2. Configure DNS on Cloudflare DNS

DNS servers do the work of translating hostnames into IP addresses. To put a domain name to use on the Internet, you can create DNS records to point to your website and email provider. Every time someone wants to put a website or Internet application online, this process must happen so the rest of us can find it. Cloudflare’s DNS dashboard makes it simple to configure DNS records. For transfers, Cloudflare will even copy records from your existing DNS service to prevent any disruption.

Five Great (free!) Ways to Get Started With Cloudflare

The Cloudflare DNS dashboard will also improve security on your domains with DNSSEC, protect your domains from email spoofing with DMARC, and enforce other DNS best practices.

I’ve now moved all my domains to Cloudflare DNS, which is a big win for me for security and simplicity. I can see them all in one place, and I’m more confident with the increased level of control and protection I have for my domains.

3. Set up a blog with Cloudflare Pages

Once I moved my domains, I was eager to set up a new website. I have been thinking lately it would be fun to have a place to post my photos where they can stand out and won’t get lost in the stream of social media. It’s been a while since I’ve built a website from scratch, but it’s fun getting back to basics. In the old days, to host a website you’d set up a dedicated web server or use a shared web host to serve your site. Today, many web hosts provide ready-to-go templates for websites and make hosting as easy as one click to set up a new site.

I wanted to learn by doing, so I took the do-it-yourself route. What I discovered in the process is an architecture called Jamstack. It’s a bit different from the traditional way of building and hosting websites. With Jamstack, your site doesn’t live at a traditional hosting provider, nor is it dynamically generated from CGI scripts and a database. Your content is now stored on a code repository like GitHub. The site is pre-generated as a static site and then deployed and delivered directly from Cloudflare’s network.

I used a Jamstack static site generator called Hugo to build my photo blog, pushed it to GitHub, and used Cloudflare Pages to generate the content and host my site. Now that it’s configured, there’s zero work necessary to maintain it. Jamstack, combined with Pages, alleviates the regular updates required to keep up with security patches, and there are no web servers or database services to break. Delivered from Cloudflare’s edge network, the site scales effortlessly, and it’s blazingly fast from a user perspective.

By the way, you don’t need to register a domain to deploy to Pages. Cloudflare will generate a pages.dev site that you can use.

For extra credit, have a look at the Cloudflare Workers serverless platform. Workers will allow you to write and deploy even more advanced custom code and run it across Cloudflare’s globally distributed network.

4. Protect your network with Cloudflare for Teams

At first, it wasn’t evident to me how I was going to use Cloudflare for Teams. I initially thought it was only for larger organizations. After all, I’m sitting here in my home office, and I’m just a team of one. Digging into the product more, it became clear that Teams is about privacy and security for groups of any size.

We’ve discussed the impressive Cloudflare DNS infrastructure, and you can take advantage of the Cloudflare DNS resolver for your devices at home by simply configuring them to point to Cloudflare 1.1.1.1 DNS servers. But for more granular control and detailed logging, you should try the DNS infrastructure built into the Cloudflare for Teams Gateway feature.

When you point your home network to Cloudflare for Teams DNS servers, your dashboard will populate with logs of all DNS requests coming from your network. You can set up rules to block DNS requests for various categories, including known malware, phishing, adult sites, and other questionable content. You’ll see the logs instantly and can add or remove categories as needed. If you trigger one of the rules, Cloudflare will display a page that shows you’ve hit one of these blocked sites.

Five Great (free!) Ways to Get Started With Cloudflare

Malware can bypass DNS, so filtering DNS is no silver bullet. Think of DNS filtering as another layer of defense that may help you avoid nefarious sites in the first place. For example, known phishing sites sent as URLs via email won’t resolve and will be blocked before they affect you. Additionally, DNS logs should give you visibility into what’s happening on the network and that may lead you to implement even better security in other areas.

Five Great (free!) Ways to Get Started With Cloudflare

There’s so much more to Cloudflare for Teams than DNS filtering, but I wanted to give you just a little taste of what you can do with it quickly and for free.

5. Secure your traffic with the Cloudflare 1.1.1.1 app and WARP

Finally, let’s discuss the challenge of securing Internet communications on your mobile phones, tablets, and devices at home and while traveling. We know that the SSL/TLS encryption on secure websites provides a degree of protection, but the apps you use and sites you visit are still visible to your ISP and upstream network operators. Some providers sell this data or use it to target you with ads.

If you install the 1.1.1.1 app, Cloudflare will create an always-on, encrypted tunnel from your device to the nearest Cloudflare data center and secure your Internet traffic. We call this Cloudflare WARP. WARP not only encrypts your traffic but can even help accelerate it by routing intelligently across the Cloudflare network.

WARP is a compelling VPN replacement without the risks associated with some shady VPN providers who may also want to sell your data. Remember, Cloudflare will never sell your data!

The Cloudflare WARP client combined with Cloudflare for Teams gives you enhanced visibility into DNS queries and unlocks some advanced traffic management and filtering capabilities. And it’s all free for small teams.

Five Great (free!) Ways to Get Started With Cloudflare

Hopefully, my exploration of the Cloudflare product portfolio gives you some ideas of what you can do to make your life a little easier or your team more secure. I’m just scratching the surface, and I’m excited to keep learning what’s possible with Cloudflare. I’ll continue to share what I learn, and I encourage you to experiment with some of these capabilities yourself and let me know how it goes.

How to use domain with Amazon SES in multiple accounts or regions

Post Syndicated from Leonardo Azize original https://aws.amazon.com/blogs/messaging-and-targeting/how-to-use-domain-with-amazon-ses-in-multiple-accounts-or-regions/

Sometimes customers want to use their email domain with Amazon Simples Email Service (Amazon SES) across multiple accounts, or the same account but across multiple regions.

For example, AnyCompany is an insurance company with marketing and operations business units. The operations department sends transactional emails every time customers perform insurance simulations. The marketing department sends email advertisements to existing and prospective customers. Since they are different organizations inside AnyCompany, they want to have their own Amazon SES billing. At the same time, they still want to use the same AnyCompany domain.

Other use-cases include customers who want to setup multi-region redundancy, need to satisfy data residency requirements, or need to send emails on behalf of several different clients. In all of these cases, customers can use different regions, in the same or across different accounts.

This post shows how to verify and configure your domain on Amazon SES across multiple accounts or multiple regions.

Overview of solution

You can use the same domain with Amazon SES across multiple accounts or regions. Your options are: different accounts but the same region, different accounts and different regions, and the same account but different regions.

In all of these scenarios, you will have two SES instances running, each sending email for example.com domain – let’s call them SES1 and SES2. Every time you configure a domain in Amazon SES it will generate a series of DNS records you will have to add on your domain authoritative DNS server, which is unique for your domain. Those records are different for each SES instance.

You will need to modify your DNS to add one TXT record, with multiple values, for domain verification. If you decide to use DomainKeys Identified Mail (DKIM), you will modify your DNS to add six CNAME records, three records from each SES instance.

When you configure a domain on Amazon SES, you can also configure a MAIL FROM domain. If you decide to do so, you will need to modify your DNS to add one TXT record for Sender Policy Framework (SPF) and one MX record for bounce and complaint notifications that email providers send you.

Furthermore, your domain can be configured to support DMAC for email spoofing detection. It will rely on SPF or DKIM configured above. Below we walk you through these steps.

  • Verify domain
    You will take TXT values from both SES1 and SES2 instances and add them in DNS, so SES can validate you own the domain
  • Complying with DMAC
    You will add a TXT value with DMAC policy that applies to your domain. This is not tied to any specific SES instance
  • Custom MAIL FROM Domain and SPF
    You will take TXT and MX records related from your MAIL FROM domain from both SES1 and SES2 instances and add them in DNS, so SES can comply with DMARC

Here is a sample matrix of the various configurations:

Two accounts, same region Two accounts, different regions One account, two regions
TXT records for domain verification*

1 record with multiple values

_amazonses.example.com = “VALUE FROM SES1”
“VALUE FROM SES2”

CNAMES for DKIM verification

6 records, 3 from each SES instance

record1-SES1._domainkey.example.com = VALUE FROM SES1
record2-SES1._domainkey.example.com = VALUE FROM SES1
record3-SES1._domainkey.example.com = VALUE FROM SES1
record1-SES2._domainkey.example.com = VALUE FROM SES2
record2-SES2._domainkey.example.com = VALUE FROM SES2
record3-SES2._domainkey.example.com = VALUE FROM SES2

TXT record for DMARC

1 record. It is not related to SES instance or region

_dmarc.example.com = DMARC VALUE

MAIL FROM MX record to define message sender for SES

1 record for entire region

mail.example.com = 10 feedback-smtp.us-east-1.amazonses.com

2 records, one for each region

mail1.example.com = 10 feedback-smtp.us-east-1.amazonses.com
mail2.example.com = 10 feedback-smtp.eu-west-1.amazonses.com

MAIL FROM TXT record for SPF

1 record for entire region

mail.example.com = “v=spf1 include:amazonses.com ~all”

2 records, one for each region

mail1.example.com = “v=spf1 include:amazonses.com ~all”
mail2.example.com = “v=spf1 include:amazonses.com ~all”

* Considering your DNS supports multiple values for a TXT record

Setup SES1 and SES2

In this blog, we call SES1 your primary or existing SES instance. We assume that you have already setup SES, but if not, you can still follow the instructions and setup both at the same time. The settings on SES2 will differ slightly, and therefore you will need to add new DNS entries to support the two-instance setup.

In this document we will use configurations from the “Verification,” “DKIM,” and “Mail FROM Domain” sections of the SES Domains screen and configure SES2 and setup DNS correctly for the two-instance configuration.

Verify domain

Amazon SES requires that you verify, in DNS, your domain, to confirm that you own it and to prevent others from using it. When you verify an entire domain, you are verifying all email addresses from that domain, so you don’t need to verify email addresses from that domain individually.

You can instruct multiple SES instances, across multiple accounts or regions to verify your domain.  The process to verify your domain requires you to add some records in your DNS provider. In this post I am assuming Amazon Route 53 is an authoritative DNS server for example.com domain.

Verifying a domain for SES purposes involves initiating the verification in SES console, and adding DNS records and values to confirm you have ownership of the domain. SES will automatically check DNS to complete the verification process. We assume you have done this step for SES1 instance, and have a _amazonses.example.com TXT record with one value already in your DNS. In this section you will add a second value, from SES2, to the TXT record. If you do not have SES1 setup in DNS, complete these steps twice, once for SES1 and again for SES2. This will prove to both SES instances that you own the domain and are entitled to send email from them.

Initiate Verification in SES Console

Just like you have done on SES1, in the second SES instance (SES2) initiate a verification process for the same domain; in our case example.com

  1. Sign in to the AWS Management Console and open the Amazon SES console.
  2. In the navigation pane, under Identity Management, choose Domains.
  3. Choose Verify a New Domain.
  4. In the Verify a New Domain dialog box, enter the domain name (i.e. example.com).
  5. If you want to set up DKIM signing for this domain, choose Generate DKIM Settings.
  6. Click on Verify This Domain.
  7. In the Verify a New Domain dialog box, you will see a Domain Verification Record Set containing a Name, a Type, and a Value. Copy Name and Value and store them for the step below, where you will add this value to DNS.
    (This information is also available by choosing the domain name after you close the dialog box.)

To complete domain verification, add a TXT record with the displayed Name and Value to your domain’s DNS server. For information about Amazon SES TXT records and general guidance about how to add a TXT record to a DNS server, see Amazon SES domain verification TXT records.

Add DNS Values for SES2

To complete domain verification for your second account, edit current _amazonses TXT record and add the Value from the SES2 to it. If you do not have an _amazonses TXT record create it, and add the Domain Verification values from both SES1 and SES2 to it. We are showing how to add record to Route 53 DNS, but the steps should be similar in any DNS management service you use.

  1. Sign in to the AWS Management Console and open the Amazon Route 53 console.
  2. In the navigation pane, choose Hosted zones.
  3. Choose the domain name you are verifying.
  4. Choose the _amazonses TXT record you created when you verified your domain for SES1.
  5. Under Record details, choose Edit record.
  6. In the Value box, go to the end of the existing attribute value, and then press Enter.
  7. Add the attribute value for the additional account or region.
  8. Choose Save.
  9. To validate, run the following command:
    dig TXT _amazonses.example.com +short
  10. You should see the two values returned:
    "4AjLMzUu4nSjrz4QVqDD8rXq8X2AHr+JhGSl4foiMmU="
    "abcde12345Sjrz4QVqDD8rXq8X2AHr+JhGSl4foiMmU="

Please note:

  1. if your DNS provider does not allow underscores in record names, you can omit _amazonses from the Name.
  2. to help you easily identify this record within your domain’s DNS settings, you can optionally prefix the Value with “amazonses:”.
  3. some DNS providers automatically append the domain name to DNS record names. To avoid duplication of the domain name, you can add a period to the end of the domain name in the DNS record. This indicates that the record name is fully qualified and the DNS provider need not append an additional domain name.
  4. if your DNS server does not support two values for a TXT record, you can have one record named _amazonses.example.com and another one called example.com.

Finally, after some time SES will complete its validation of the domain name and you should see the “pending validation” change to “verified”.

Verify DKIM

DomainKeys Identified Mail (DKIM) is a standard that allows senders to sign their email messages with a cryptographic key. Email providers then use these signatures to verify that the messages weren’t modified by a third party while in transit.

An email message that is sent using DKIM includes a DKIM-Signature header field that contains a cryptographically signed representation of the message. A provider that receives the message can use a public key, which is published in the sender’s DNS record, to decode the signature. Email providers then use this information to determine whether messages are authentic.

When you enable DKIM it generates CNAME records you need to add into your DNS. As it generates different values for each SES instance, you can use DKIM with multiple accounts and regions.

To complete the DKIM verification, copy the three (3) DKIM Names and Values from SES1 and three (3) from SES2 and add them to your DNS authoritative server as CNAME records.

You will know you are successful because, after some time SES will complete the DKIM verification and the “pending verification” will change to “verified”.

Configuring for DMARC compliance

Domain-based Message Authentication, Reporting and Conformance (DMARC) is an email authentication protocol that uses Sender Policy Framework (SPF) and/or DomainKeys Identified Mail (DKIM) to detect email spoofing. In order to comply with DMARC, you need to setup a “_dmarc” DNS record and either SPF or DKIM, or both. The DNS record for compliance with DMARC is setup once per domain, but SPF and DKIM require DNS records for each SES instance.

  1. Setup “_dmarc” record in DNS for your domain; one time per domain. See instructions here
  2. To validate it, run the following command:
    dig TXT _dmarc.example.com +short
    "v=DMARC1;p=quarantine;pct=25;rua=mailto:[email protected]"
  3. For DKIM and SPF follow the instructions below

Custom MAIL FROM Domain and SPF

Sender Policy Framework (SPF) is an email validation standard that’s designed to prevent email spoofing. Domain owners use SPF to tell email providers which servers are allowed to send email from their domains. SPF is defined in RFC 7208.

To comply with Sender Policy Framework (SPF) you will need to use a custom MAIL FROM domain. When you enable MAIL FROM domain in SES console, the service generates two records you need to configure in your DNS to document who is authorized to send messages for your domain. One record is MX and another TXT; see screenshot for mail.example.com. Save these records and enter them in your DNS authoritative server for example.com.

Configure MAIL FROM Domain for SES2

  1. Open the Amazon SES console at https://console.aws.amazon.com/ses/.
  2. In the navigation pane, under Identity Management, choose Domains.
  3. In the list of domains, choose the domain and proceed to the next step.
  4. Under MAIL FROM Domain, choose Set MAIL FROM Domain.
  5. On the Set MAIL FROM Domain window, do the following:
    • For MAIL FROM domain, enter the subdomain that you want to use as the MAIL FROM domain. In our case mail.example.com.
    • For Behavior if MX record not found, choose one of the following options:
      • Use amazonses.com as MAIL FROM – If the custom MAIL FROM domain’s MX record is not set up correctly, Amazon SES will use a subdomain of amazonses.com. The subdomain varies based on the AWS Region in which you use Amazon SES.
      • Reject message – If the custom MAIL FROM domain’s MX record is not set up correctly, Amazon SES will return a MailFromDomainNotVerified error. Emails that you attempt to send from this domain will be automatically rejected.
    • Click Set MAIL FROM Domain.

You will need to complete this step on SES1, as well as SES2. The MAIL FROM records are regional and you will need to add them both to your DNS authoritative server.

Set MAIL FROM records in DNS

From both SES1 and SES2, take the MX and TXT records provided by the MAIL FROM configuration and add them to the DNS authoritative server. If SES1 and SES2 are in the same region (us-east-1 in our example) you will publish exactly one MX record (mail.example.com in our example) into DNS, pointing to endpoint for that region. If SES1 and SES2 are in different regions, you will create two different records (mail1.example.com and mail2.example.com) into DNS, each pointing to endpoint for specific region.

Verify MX record

Example of MX record where SES1 and SES2 are in the same region

dig MX mail.example.com +short
10 feedback-smtp.us-east-1.amazonses.com.

Example of MX records where SES1 and SES2 are in different regions

dig MX mail1.example.com +short
10 feedback-smtp.us-east-1.amazonses.com.

dig MX mail2.example.com +short
10 feedback-smtp.eu-west-1.amazonses.com.

Verify if it works

On both SES instances (SES1 and SES2), check that validations are complete. In the SES Console:

  • In Verification section, Status should be “verified” (in green color)
  • In DKIM section, DKIM Verification Status should be “verified” (in green color)
  • In MAIL FROM Domain section, MAIL FROM domain status should be “verified” (in green color)

If you have it all verified on both accounts or regions, it is correctly configured and ready to use.

Conclusion

In this post, we explained how to verify and use the same domain for Amazon SES in multiple account and regions and maintaining the DMARC, DKIM and SPF compliance and security features related to email exchange.

While each customer has different necessities, Amazon SES is flexible to allow customers decide, organize, and be in control about how they want to uses Amazon SES to send email.

Author bio

Leonardo Azize Martins is a Cloud Infrastructure Architect at Professional Services for Public Sector.

His background is on development and infrastructure for web applications, working on large enterprises.

When not working, Leonardo enjoys time with family, read technical content, watch movies and series, and play with his daughter.

Contributor

Daniel Tet is a senior solutions architect at AWS specializing in Low-Code and No-Code solutions. For over twenty years, he has worked on projects for Franklin Templeton, Blackrock, Stanford Children’s Hospital, Napster, and Twitter. He has a Bachelor of Science in Computer Science and an MBA. He is passionate about making technology easy for common people; he enjoys camping and adventures in nature.

 

What happened on the Internet during the Facebook outage

Post Syndicated from Celso Martinho original https://blog.cloudflare.com/during-the-facebook-outage/

What happened on the Internet during the Facebook outage

It’s been a few days now since Facebook, Instagram, and WhatsApp went AWOL and experienced one of the most extended and rough downtime periods in their existence.

When that happened, we reported our bird’s-eye view of the event and posted the blog Understanding How Facebook Disappeared from the Internet where we tried to explain what we saw and how DNS and BGP, two of the technologies at the center of the outage, played a role in the event.

In the meantime, more information has surfaced, and Facebook has published a blog post giving more details of what happened internally.

As we said before, these events are a gentle reminder that the Internet is a vast network of networks, and we, as industry players and end-users, are part of it and should work together.

In the aftermath of an event of this size, we don’t waste much time debating how peers handled the situation. We do, however, ask ourselves the more important questions: “How did this affect us?” and “What if this had happened to us?” Asking and answering these questions whenever something like this happens is a great and healthy exercise that helps us improve our own resilience.

Today, we’re going to show you how the Facebook and affiliate sites downtime affected us, and what we can see in our data.

1.1.1.1

1.1.1.1 is a fast and privacy-centric public DNS resolver operated by Cloudflare, used by millions of users, browsers, and devices worldwide. Let’s look at our telemetry and see what we find.

First, the obvious. If we look at the response rate, there was a massive spike in the number of SERVFAIL codes. SERVFAILs can happen for several reasons; we have an excellent blog called Unwrap the SERVFAIL that you should read if you’re curious.

In this case, we started serving SERVFAIL responses to all facebook.com and whatsapp.com DNS queries because our resolver couldn’t access the upstream Facebook authoritative servers. About 60x times more than the average on a typical day.

What happened on the Internet during the Facebook outage

If we look at all the queries, not specific to Facebook or WhatsApp domains, and we split them by IPv4 and IPv6 clients, we can see that our load increased too.

As explained before, this is due to a snowball effect associated with applications and users retrying after the errors and generating even more traffic. In this case, 1.1.1.1 had to handle more than the expected rate for A and AAAA queries.

What happened on the Internet during the Facebook outage

Here’s another fun one.

DNS vs. DoT and DoH. Typically, DNS queries and responses are sent in plaintext over UDP (or TCP sometimes), and that’s been the case for decades now. Naturally, this poses security and privacy risks to end-users as it allows in-transit attacks or traffic snooping.

With DNS over TLS (DoT) and DNS over HTTPS, clients can talk DNS using well-known, well-supported encryption and authentication protocols.

Our learning center has a good article on “DNS over TLS vs. DNS over HTTPS” that you can read. Browsers like Chrome, Firefox, and Edge have supported DoH for some time now, WAP uses DoH too, and you can even configure your operating system to use the new protocols.

When Facebook went offline, we saw the number of DoT+DoH SERVFAILs responses grow by over x300 vs. the average rate.

What happened on the Internet during the Facebook outage
What happened on the Internet during the Facebook outage
What happened on the Internet during the Facebook outage

So, we got hammered with lots of requests and errors, causing traffic spikes to our 1.1.1.1 resolver and causing an unexpected load in the edge network and systems. How did we perform during this stressful period?

Quite well. 1.1.1.1 kept its cool and continued serving the vast majority of requests around the famous 10ms mark. An insignificant fraction of p95 and p99 percentiles saw increased response times, probably due to timeouts trying to reach Facebook’s nameservers.

What happened on the Internet during the Facebook outage

Another interesting perspective is the distribution of the ratio between SERVFAIL and good DNS answers, by country. In theory, the higher this ratio is, the more the country uses Facebook. Here’s the map with the countries that suffered the most:

What happened on the Internet during the Facebook outage

Here’s the top twelve country list, ordered by those that apparently use Facebook, WhatsApp and Instagram the most:

Country SERVFAIL/Good Answers ratio
Turkey 7.34
Grenada 4.84
Congo 4.44
Lesotho 3.94
Nicaragua 3.57
South Sudan 3.47
Syrian Arab Republic 3.41
Serbia 3.25
Turkmenistan 3.23
United Arab Emirates 3.17
Togo 3.14
French Guiana 3.00

Impact on other sites

When Facebook, Instagram, and WhatsApp aren’t around, the world turns to other places to look for information on what’s going on, other forms of entertainment or other applications to communicate with their friends and family. Our data shows us those shifts. While Facebook was going down, other services and platforms were going up.

To get an idea of the changing traffic patterns we look at DNS queries as an indicator of increased traffic to specific sites or types of site.

Here are a few examples.

Other social media platforms saw a slight increase in use, compared to normal.

What happened on the Internet during the Facebook outage

Traffic to messaging platforms like Telegram, Signal, Discord and Slack got a little push too.

What happened on the Internet during the Facebook outage

Nothing like a little gaming time when Instagram is down, we guess, when looking at traffic to sites like Steam, Xbox, Minecraft and others.

What happened on the Internet during the Facebook outage

And yes, people want to know what’s going on and fall back on news sites like CNN, New York Times, The Guardian, Wall Street Journal, Washington Post, Huffington Post, BBC, and others:

What happened on the Internet during the Facebook outage

Attacks

One could speculate that the Internet was under attack from malicious hackers. Our Firewall doesn’t agree; nothing out of the ordinary stands out.

What happened on the Internet during the Facebook outage

Network Error Logs

Network Error Logging, NEL for short, is an experimental technology supported in Chrome. A website can issue a Report-To header and ask the browser to send reports about network problems, like bad requests or DNS issues, to a specific endpoint.

Cloudflare uses NEL data to quickly help triage end-user connectivity issues when end-users reach our network. You can learn more about this feature in our help center.

If Facebook is down and their DNS isn’t responding, Chrome will start reporting NEL events every time one of the pages in our zones fails to load Facebook comments, posts, ads, or authentication buttons. This chart shows it clearly.​​

What happened on the Internet during the Facebook outage

WARP

Cloudflare announced WARP in 2019, and called it “A VPN for People Who Don’t Know What V.P.N. Stands For” and offered it for free to its customers. Today WARP is used by millions of people worldwide to securely and privately access the Internet on their desktop and mobile devices. Here’s what we saw during the outage by looking at traffic volume between WARP and Facebook’s network:

What happened on the Internet during the Facebook outage

You can see how the steep drop in Facebook ASN traffic coincides with the start of the incident and how it compares to the same period the day before.

Our own traffic

People tend to think of Facebook as a place to visit. We log in, and we access Facebook, we post. It turns out that Facebook likes to visit us too, quite a lot. Like Google and other platforms, Facebook uses an army of crawlers to constantly check websites for data and updates. Those robots gather information about websites content, such as its titles, descriptions, thumbnail images, and metadata. You can learn more about this on the “The Facebook Crawler” page and the Open Graph website.

Here’s what we see when traffic is coming from the Facebook ASN, supposedly from crawlers, to our CDN sites:

What happened on the Internet during the Facebook outage

The robots went silent.

What about the traffic coming to our CDN sites from Facebook User-Agents? The gap is indisputable.

What happened on the Internet during the Facebook outage

We see about 30% of a typical request rate hitting us. But it’s not zero; why is that?

We’ll let you know a little secret. Never trust User-Agent information; it’s broken. User-Agent spoofing is everywhere. Browsers, apps, and other clients deliberately change the User-Agent string when they fetch pages from the Internet to hide, obtain access to certain features, or bypass paywalls (because pay-walled sites want sites like Facebook to index their content, so that then they get more traffic from links).

Fortunately, there are newer, and privacy-centric standards emerging like User-Agent Client Hints.

Core Web Vitals

Core Web Vitals are the subset of Web Vitals, an initiative by Google to provide a unified interface to measure real-world quality signals when a user visits a web page. Such signals include Largest Contentful Paint (LCP), First Input Delay (FID), and Cumulative Layout Shift (CLS).

We use Core Web Vitals with our privacy-centric Web Analytics product and collect anonymized data on how end-users experience the websites that enable this feature.

One of the metrics we can calculate using these signals is the page load time. Our theory is that if a page includes scripts coming from external sites (for example, Facebook “like” buttons, comments, ads), and they are unreachable, its total load time gets affected.

We used a list of about 400 domains that we know embed Facebook scripts in their pages and looked at the data.

What happened on the Internet during the Facebook outage

Now let’s look at the Largest Contentful Paint. LCP marks the point in the page load timeline when the page’s main content has likely loaded. The faster the LCP is, the better the end-user experience.

What happened on the Internet during the Facebook outage

Again, the page load experience got visibly degraded.

The outcome seems clear. The sites that use Facebook scripts in their pages took 1.5x more time to load their pages during the outage, with some of them taking more than 2x the usual time. Facebook’s outage dragged the performance of  some other sites down.

Conclusion

When Facebook, Instagram, and WhatsApp went down, the Web felt it. Some websites got slower or lost traffic, other services and platforms got unexpected load, and people lost the ability to communicate or do business normally.

Hypotheses About What Happened to Facebook

Post Syndicated from Bozho original https://techblog.bozho.net/hypotheses-about-what-happened-to-facebook/

Facebook was down. I’d recommend reading Cloudflare’s summary. Then I recommend reading Facebook’s own account on the incident. But let me expand on that. Facebook published announcements and withdrawals for certain BGP prefixes which lead to removing its DNS servers from “the map of the internet” – they told everyone “the part of our network where our DNS servers are doesn’t exist”. That was the result of a backbone self-inflicted failure due to a bug in the auditing tool that checks whether the commands executed aren’t doing harmful things.

Facebook owns a lot of IPs. According to RIPEstat they are part of 399 prefixes (147 of them are IPv4). The DNS servers are located in two of those 399. Facebook uses a.ns.facebook.com, b.ns.facebook.com, c.ns.facebook.com and d.ns.facebook.com, which get queries whenever someone wants to know the IPs of Facebook-owned domains. These four nameservers are served by the same Autonomous System from just two prefixes – 129.134.30.0/23 and 185.89.218.0/23. Of course “4 nameservers” is a logical construct, there are probably many actual servers behind that (using anycast).

I wrote a simple “script” to fetch all the withdrawals and announcements for all Facebook-owned prefixes (from the great API of RIPEstats). Facebook didn’t remove itself from the map entirely. As CloudFlare points out, it was just some prefixes that are affected. It can be just these two, or a few others as well, but it seems that just a handful were affected. If we sort the resulting CSV from the above script by withdrawals, we’ll notice that 129.134.30.0/23 and 185.89.218.0/23 are the pretty high up (alongside 185.89 and 123.134 with a /24, which are all included in the /23). Now that perfectly matches Facebook’s account that their nameservers automatically withdraw themselves if they fail to connect to other parts of the infrastructure. Everything may have also been down, but the logic for withdrawal is present only in the networks that have nameservers in them.

So first, let me make three general observations that are not as obvious and as universal as they may sound, but they are worth discussing:

  • Use longer DNS TTLs if possible – if Facebook had 6 hour TTL on its domains, we may have not figured out that their name servers are down. This is hard to ask for such a complex service that uses DNS for load-balancing and geographical distribution, but it’s worth considering. Also, if they killed their backbone and their entire infrastructure was down anyway, the DNS TTL would not have solved the issue. But
  • We need improved caching logic for DNS. It can’t be just “present or not”; DNS caches may keep “last known good state” in case of SERVFAIL and fallback to that. All of those DNS resolvers that had to ask the authoritative nameserver “where can I find facebook.com” knew where to find facebook.com just a minute ago. Then they got a failure and suddenly they are wondering “oh, where could Facebook be?”. It’s not that simple, of course, but such cache improvement is worth considering. And again, if their entire infrastructure was down, this would not have helped.
  • Consider having an authoritative nameserver outside your main AS. If something bad happens to your AS routes (regardless of the reason), you may still have DNS working. That may have downsides – generally, it will be hard to manage and sync your DNS infrastructure. But at least having a spare set of nameservers and the option to quickly point glue records there is worth considering. It would not have saved Facebook in this case, as again, they claim the entire infrastructure was inaccessible due to a “broken” backbone.
  • Have a 100% test coverage on critical tools, such as the auditing tool that had a bug. 100% test coverage is rarely achievable in any project, but in such critical tools it’s a must.

The main explanation is the accidental outage. This is what Facebook engineers explain in the blogpost and other accounts, and that’s what seems to have happened. However, there are alternative hypotheses floating around, so let me briefly discuss all of the options.

  • Accidental outage due to misconfiguration – a very likely scenario. These things may happen to everyone and Facebook is known for it “break things” mentality, so it’s not unlikely that they just didn’t have the right safeguards in place and that someone ran a buggy update. The scenarios why and how that may have happened are many, and we can’t know from the outside (even after Facebook’s brief description). This remains the primary explanation, following my favorite Hanlon’s razor. A bug in the audit tool is absolutely realistic (btw, I’d love Facebook to publish their internal tools).
  • Cyber attack – It cannot be known by the data we have, but this would be a sophisticated attack that gained access to their BGP administration interface, which I would assume is properly protected. Not impossible, but a 6-hour outage of a social network is not something a sophisticated actor (e.g. a nation state) would invest resources in. We can’t rule it out, as this might be “just a drill” for something bigger to follow. If I were an attacker that wanted to take Facebook down, I’d try to kill their DNS servers, or indeed, “de-route” them. If we didn’t know that Facebook lets its DNS servers cut themselves from the network in case of failures, the fact that so few prefixes were updated might be in indicator of targeted attack, but this seems less and less likely.
  • Deliberate self-sabotage1.5 billion records are claimed to be leaked yesterday. At the same time, a Facebook whistleblower is testifying in the US congress. Both of these news are potentially damaging to Facebook reputation and shares. If they wanted to drown the news and the respective share price plunge in a technical story that few people understand but everyone is talking about (and then have their share price rebound, because technical issues happen to everyone), then that’s the way to do it – just as a malicious actor would do, but without all the hassle to gain access from outside – de-route the prefixes for the DNS servers and you have a “perfect” outage. These coincidences have lead people to assume such a plot, but from the observed outage and the explanation given by Facebook on why the DNS prefixes have been automatically withdrawn, this sounds unlikely.

Distinguishing between the three options is actually hard. You can mask a deliberate outage as an accident, a malicious actor can make it look like a deliberate self-sabotage. That’s why there are speculations. To me, however, by all of the data we have in RIPEStat and the various accounts by CloudFlare, Facebook and other experts, it seems that a chain of mistakes (operational and possibly design ones) lead to this.

The post Hypotheses About What Happened to Facebook appeared first on Bozho's tech blog.

Facebook Is Down

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2021/10/facebook-is-down.html

Facebook — along with Instagram and WhatsApp — went down globally today. Basically, someone deleted their BGP records, which made their DNS fall apart.

…at approximately 11:39 a.m. ET today (15:39 UTC), someone at Facebook caused an update to be made to the company’s Border Gateway Protocol (BGP) records. BGP is a mechanism by which Internet service providers of the world share information about which providers are responsible for routing Internet traffic to which specific groups of Internet addresses.

In simpler terms, sometime this morning Facebook took away the map telling the world’s computers how to find its various online properties. As a result, when one types Facebook.com into a web browser, the browser has no idea where to find Facebook.com, and so returns an error page.

In addition to stranding billions of users, the Facebook outage also has stranded its employees from communicating with one another using their internal Facebook tools. That’s because Facebook’s email and tools are all managed in house and via the same domains that are now stranded.

What I heard is that none of the employee keycards work, since they have to ping a now-unreachable server. So people can’t get into buildings and offices.

And every third-party site that relies on “log in with Facebook” is stuck as well.

The fix won’t be quick:

As a former network admin who worked on the internet at this level, I anticipate Facebook will be down for hours more. I suspect it will end up being Facebook’s longest and most severe failure to date before it’s fixed.

We all know the security risks of monocultures.

EDITED TO ADD (10/6): Good explanation of what happened. Shorter from Jonathan Zittrain: “Facebook basically locked its keys in the car.”

Understanding How Facebook Disappeared from the Internet

Post Syndicated from Tom Strickx original https://blog.cloudflare.com/october-2021-facebook-outage/

Understanding How Facebook Disappeared from the Internet

Understanding How Facebook Disappeared from the Internet

“Facebook can’t be down, can it?”, we thought, for a second.

Today at 1651 UTC, we opened an internal incident entitled “Facebook DNS lookup returning SERVFAIL” because we were worried that something was wrong with our DNS resolver 1.1.1.1.  But as we were about to post on our public status page we realized something else more serious was going on.

Social media quickly burst into flames, reporting what our engineers rapidly confirmed too. Facebook and its affiliated services WhatsApp and Instagram were, in fact, all down. Their DNS names stopped resolving, and their infrastructure IPs were unreachable. It was as if someone had “pulled the cables” from their data centers all at once and disconnected them from the Internet.

How’s that even possible?

Meet BGP

BGP stands for Border Gateway Protocol. It’s a mechanism to exchange routing information between autonomous systems (AS) on the Internet. The big routers that make the Internet work have huge, constantly updated lists of the possible routes that can be used to deliver every network packet to their final destinations. Without BGP, the Internet routers wouldn’t know what to do, and the Internet wouldn’t work.

The Internet is literally a network of networks, and it’s bound together by BGP. BGP allows one network (say Facebook) to advertise its presence to other networks that form the Internet. As we write Facebook is not advertising its presence, ISPs and other networks can’t find Facebook’s network and so it is unavailable.

The individual networks each have an ASN: an Autonomous System Number. An Autonomous System (AS) is an individual network with a unified internal routing policy. An AS can originate prefixes (say that they control a group of IP addresses), as well as transit prefixes (say they know how to reach specific groups of IP addresses).

Cloudflare’s ASN is AS13335. Every ASN needs to announce its prefix routes to the Internet using BGP; otherwise, no one will know how to connect and where to find us.

Our learning center has a good overview of what BGP and ASNs are and how they work.

In this simplified diagram, you can see six autonomous systems on the Internet and two possible routes that one packet can use to go from Start to End. AS1 → AS2 → AS3 being the fastest, and AS1 → AS6 → AS5 → AS4 → AS3 being the slowest, but that can be used if the first fails.

Understanding How Facebook Disappeared from the Internet

At 1658 UTC we noticed that Facebook had stopped announcing the routes to their DNS prefixes. That meant that, at least, Facebook’s DNS servers were unavailable. Because of this Cloudflare’s 1.1.1.1 DNS resolver could no longer respond to queries asking for the IP address of facebook.com or instagram.com.

route-views>show ip bgp 185.89.218.0/23
% Network not in table
route-views>

route-views>show ip bgp 129.134.30.0/23
% Network not in table
route-views>

Meanwhile, other Facebook IP addresses remained routed but weren’t particularly useful since without DNS Facebook and related services were effectively unavailable:

route-views>show ip bgp 129.134.30.0   
BGP routing table entry for 129.134.0.0/17, version 1025798334
Paths: (24 available, best #14, table default)
  Not advertised to any peer
  Refresh Epoch 2
  3303 6453 32934
    217.192.89.50 from 217.192.89.50 (138.187.128.158)
      Origin IGP, localpref 100, valid, external
      Community: 3303:1004 3303:1006 3303:3075 6453:3000 6453:3400 6453:3402
      path 7FE1408ED9C8 RPKI State not found
      rx pathid: 0, tx pathid: 0
  Refresh Epoch 1
route-views>

We keep track of all the BGP updates and announcements we see in our global network. At our scale, the data we collect gives us a view of how the Internet is connected and where the traffic is meant to flow from and to everywhere on the planet.

A BGP UPDATE message informs a router of any changes you’ve made to a prefix advertisement or entirely withdraws the prefix. We can clearly see this in the number of updates we received from Facebook when checking our time-series BGP database. Normally this chart is fairly quiet: Facebook doesn’t make a lot of changes to its network minute to minute.

But at around 15:40 UTC we saw a peak of routing changes from Facebook. That’s when the trouble began.

Understanding How Facebook Disappeared from the Internet

If we split this view by routes announcements and withdrawals, we get an even better idea of what happened. Routes were withdrawn, Facebook’s DNS servers went offline, and one minute after the problem occurred, Cloudflare engineers were in a room wondering why 1.1.1.1 couldn’t resolve facebook.com and worrying that it was somehow a fault with our systems.

Understanding How Facebook Disappeared from the Internet

With those withdrawals, Facebook and its sites had effectively disconnected themselves from the Internet.

DNS gets affected

As a direct consequence of this, DNS resolvers all over the world stopped resolving their domain names.

➜  ~ dig @1.1.1.1 facebook.com
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 31322
;facebook.com.			IN	A
➜  ~ dig @1.1.1.1 whatsapp.com
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 31322
;whatsapp.com.			IN	A
➜  ~ dig @8.8.8.8 facebook.com
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 31322
;facebook.com.			IN	A
➜  ~ dig @8.8.8.8 whatsapp.com
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 31322
;whatsapp.com.			IN	A

This happens because DNS, like many other systems on the Internet, also has its routing mechanism. When someone types the https://facebook.com URL in the browser, the DNS resolver, responsible for translating domain names into actual IP addresses to connect to, first checks if it has something in its cache and uses it. If not, it tries to grab the answer from the domain nameservers, typically hosted by the entity that owns it.

If the nameservers are unreachable or fail to respond because of some other reason, then a SERVFAIL is returned, and the browser issues an error to the user.

Again, our learning center provides a good explanation on how DNS works.

Understanding How Facebook Disappeared from the Internet

Due to Facebook stopping announcing their DNS prefix routes through BGP, our and everyone else’s DNS resolvers had no way to connect to their nameservers. Consequently, 1.1.1.1, 8.8.8.8, and other major public DNS resolvers started issuing (and caching) SERVFAIL responses.

But that’s not all. Now human behavior and application logic kicks in and causes another exponential effect. A tsunami of additional DNS traffic follows.

This happened in part because apps won’t accept an error for an answer and start retrying, sometimes aggressively, and in part because end-users also won’t take an error for an answer and start reloading the pages, or killing and relaunching their apps, sometimes also aggressively.

This is the traffic increase (in number of requests) that we saw on 1.1.1.1:

Understanding How Facebook Disappeared from the Internet

So now, because Facebook and their sites are so big, we have DNS resolvers worldwide handling 30x more queries than usual and potentially causing latency and timeout issues to other platforms.

Fortunately, 1.1.1.1 was built to be Free, Private, Fast (as the independent DNS monitor DNSPerf can attest), and scalable, and we were able to keep servicing our users with minimal impact.

The vast majority of our DNS requests kept resolving in under 10ms. At the same time, a minimal fraction of p95 and p99 percentiles saw increased response times, probably due to expired TTLs having to resort to the Facebook nameservers and timeout. The 10 seconds DNS timeout limit is well known amongst engineers.

Understanding How Facebook Disappeared from the Internet

Impacting other services

People look for alternatives and want to know more or discuss what’s going on. When Facebook became unreachable, we started seeing increased DNS queries to Twitter, Signal and other messaging and social media platforms.

Understanding How Facebook Disappeared from the Internet

We can also see another side effect of this unreachability in our WARP traffic to and from Facebook’s affected ASN 32934. This chart shows how traffic changed from 15:45 UTC to 16:45 UTC compared with three hours before in each country. All over the world WARP traffic to and from Facebook’s network simply disappeared.

Understanding How Facebook Disappeared from the Internet

The Internet

Today’s events are a gentle reminder that the Internet is a very complex and interdependent system of millions of systems and protocols working together. That trust, standardization, and cooperation between entities are at the center of making it work for almost five billion active users worldwide.

Update

At around 21:00 UTC we saw renewed BGP activity from Facebook’s network which peaked at 21:17 UTC.

Understanding How Facebook Disappeared from the Internet

This chart shows the availability of the DNS name ‘facebook.com’ on Cloudflare’s DNS resolver 1.1.1.1. It stopped being available at around 15:50 UTC and returned at 21:20 UTC.

Understanding How Facebook Disappeared from the Internet

Undoubtedly Facebook, WhatsApp and Instagram services will take further time to come online but as of 22:28 UTC Facebook appears to be reconnected to the global Internet and DNS working again.

Tackling Email Spoofing and Phishing

Post Syndicated from Hannes Gerhart original https://blog.cloudflare.com/tackling-email-spoofing/

Tackling Email Spoofing and Phishing

Tackling Email Spoofing and Phishing

Today we’re rolling out a new tool to tackle email spoofing and phishing and improve email deliverability: The new Email Security DNS Wizard can be used to create DNS records that prevent others from sending malicious emails on behalf of your domain. This new feature also warns users about insecure DNS configurations on their domain and shows recommendations on how to fix them. The feature will first be rolled out to users on the Free plan and over the next weeks be made available for Pro, Business and Enterprise customers, as well.

Tackling Email Spoofing and Phishing

Before we dive into what magic this wizard is capable of, let’s take a step back and take a look at the problem: email spoofing and phishing.

What is email spoofing and phishing?

Spoofing is the process of posing as someone else which can be used in order to gain some kind of illicit advantage. One example is domain spoofing where someone hosts a website like mycoolwebpaqe.xyz  to trick users of mycoolwebpage.xyz to provide sensitive information without knowing they landed on a false website. When looking at the address bar side by side in a browser, it’s very hard to spot the difference.

Tackling Email Spoofing and Phishing

Then, there is email spoofing. In order to understand how this works, let’s take a look at a Cloudflare product update email I received on my personal email address. With most email providers you can look at the full source of an email which contains a number of headers and of course the body of the email.

Date: Thu, 23 Sep 2021 10:30:02 -0500 (CDT)
From: Cloudflare <[email protected]>
Reply-To: produ[email protected]
To: <my_personal_email_address>

Above you can see four headers of the email, when it was received, who it came from, who I should reply to, and my personal email address. The value of the From header is used to display the sender in my email program.

Tackling Email Spoofing and Phishing

When I receive an email as above, I automatically assume this email has been sent from Cloudflare. However, nobody is stopped from sending an email with a modified From header from their mail server. If my email provider is not performing the right checks, which we will cover later in this blog post, I could be tricked into believing that an email was sent from Cloudflare, but it actually was not.

Tackling Email Spoofing and Phishing

This brings us to the second attack type: phishing. Let’s say a malicious actor has successfully used email spoofing to send emails to your company’s customers that seem to originate from one of your corporate service emails. The content of these emails look exactly like a legitimate email from your company using the same styling and format. The email text could be an urgent message to update some account information including a hyperlink to the alleged web portal. If the receiving mail server of a user does not flag the email as spam or insecure origin, the user might click on the link which could execute malicious code or lead them to a spoofed domain asking for sensitive information.

According to the FBI’s 2020 Internet Crime Report, phishing was the most common cyber crime in 2020 with over 240,000 victims leading to a loss of over $50M. And the number of victims has more than doubled since 2019 and is almost ten times higher than in 2018.

Tackling Email Spoofing and Phishing

In order to understand how most phishing attacks are carried out, let’s take a closer look at the findings of the 2020 Verizon Data Breach Investigations Report. It lies out that phishing accounts for more than 80% of all “social actions”, another word for social engineering attacks, making it by far the most common type of such an attack. Furthermore, the report states that 96% of social actions are sent via email and only 3% through a website and 1% via phone or text.

Tackling Email Spoofing and Phishing

This clearly shows that email phishing is a serious problem causing Internet users a big headache. So let’s see what domain owners can do to stop bad actors from misusing their domain for email phishing.

How can DNS help prevent this?

Luckily, there are three anti-spoofing mechanisms already built into the Domain Name System (DNS):

  • Sender Policy Framework (SPF)
  • DomainKeys Identified Mail (DKIM)
  • Domain-based Message Authentication Reporting and Conformance (DMARC)

However, it is not trivial to configure them correctly, especially for someone less experienced. In case your configuration is too strict, legitimate emails will be dropped or marked as spam. And if you keep your configuration too relaxed, your domain might be misused for email spoofing.

Sender Policy Framework (SPF)

SPF is used to specify which IP addresses and domains are permitted to send email on behalf of your domain. An SPF policy is published as a TXT record on your domain, so everyone can access it via DNS. Let’s inspect the TXT record for cloudflare.com:

cloudflare.com 	TXT	"v=spf1 ip4:199.15.212.0/22 ip4:173.245.48.0/20 include:_spf.google.com include:spf1.mcsv.net include:spf.mandrillapp.com include:mail.zendesk.com include:stspg-customer.com include:_spf.salesforce.com -all"

An SPF TXT record always needs to start with v=spf1. It usually contains a list of IP addresses of sending email servers using the ip4 or ip6 mechanism. The include mechanism is used to reference another SPF record on another domain. This is usually done if you are relying on other providers that need to send emails on our behalf. You can see a few examples in the SPF record of cloudflare.com above: we’re using Zendesk as customer support software and Mandrill for marketing and transactional emails.

Finally, there is the catch-all mechanism -all which specifies how all incoming, but unspecified emails should be treated The catch-all mechanism is preceded by a qualifier that can be either + (Allow), ~ (Softfail) or – (Fail). Using the Allow qualifier is not recommended as it basically makes the SPF record useless and allows all IP addresses and domains to send email on behalf of your domain. Softfail is interpreted differently by receiving mail servers, marking an email as Spam or insecure, depending on the server. Fail tells a server not to accept any emails originating from unspecified sources.

Tackling Email Spoofing and Phishing

The diagram above shows the steps taken to ensure a received email is SPF compliant.

  1. The email is sent from the IP address 203.0.113.10 and contains a From header with the value of [email protected].
  2. After receiving the email, the receiver queries the SPF record on mycoolwebpage.xyz to retrieve the SPF policy for this domain.
  3. The receiver checks if the sending IP address 203.0.113.10 is listed in the SPF record. If it is, the email succeeds the SPF check. If it is not, the qualifier of the all mechanism defines the outcome.

For a full list of all mechanisms and more details about SPF, refer to RFC7208.

DomainKeys Identified Mail (DKIM)

Okay, with SPF we’ve ensured that only permitted IP addresses and domains can send emails on behalf of cloudflare.com. But what if one of the IPs changes owner without us noticing and updating the SPF record? Or what if someone else who is also using Google’s email server on the same IP tries to spoof emails coming from cloudflare.com?

This is where DKIM comes in. DKIM provides a mechanism to cryptographically sign parts of an email — usually the body and certain headers — using public key encryption. Before we dive into how this works, let’s take a look at the DKIM record used for cloudflare.com:

google._domainkey.cloudflare.com.   TXT   "v=DKIM1; k=rsa; p=MIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKBgQDMxbNxA2V84XMpZgzMgHHey3TQFvHkwlPF2a11Ex6PGD71Sp8elVMMCdZhPYqDlzbehg9aWVwPz0+n3oRD73o+JXoSswgUXPV82O8s8dGny/BAJklo0+y+Bh2Op4YPGhClT6mRO2i5Qiqo4vPCuc6GB34Fyx7yhreDNKY9BNMUtQIDAQAB"

The structure of a DKIM record is <selector>._domainkey.<domain>, where the selector is provided by your email provider. The content of the DKIM TXT record always starts with v=DKIM1 followed by the public key. We can see the public key type, referenced by the k tag, and the public key itself, preceded by the p tag.

Below is a simplified sequence how the signing and DKIM check work:

  1. The sending email server creates a hash from the email content.
  2. The sending email server encrypts this hash with the private DKIM key.
  3. The email is sent, containing the encrypted hash.
  4. The receiving email server retrieves the public key from the DKIM TXT record published on the email domain.
  5. The receiving email server decrypts the hash using the public DKIM key.
  6. The receiving email server generates the hash from the email content.
  7. If the decrypted hash and the generated hash match, email authenticity is proven. Otherwise, the DKIM check fails.

The full DKIM specification can be found in RFC4871 and RFC5672.

Domain-based Message Authentication Reporting and Conformance (DMARC)

Domain-based Message Authentication Reporting and Conformance, that’s definitely a mouthful. Let’s focus on two words: Reporting and Conformance. DMARC provides exactly that. Regular reports let the email sender know how many emails are non-conforming and potentially spoofed. Conformance helps provide a clear signal to the email receiver how to treat non-conforming emails. Email receivers might impose their own policies for emails that fail SPF or DKIM checks even without a DMARC record. However, the policy configured on the DMARC record is an explicit instruction by the email sender, so it increases the confidence for email receivers what to do with non-conforming emails.

Here is the DMARC record for cloudflare.com

_dmarc.cloudflare.com.   TXT   "v=DMARC1; p=reject; pct=100; rua=mailto:[email protected]"

The DMARC TXT record is always set on the _dmarc subdomain of the email domain and — similar to SPF and DKIM — the content needs to start with v=DMARC1. Then we see three additional tags:

The policy tag (p) indicates how email receivers should treat emails that fail SPF or DKIM checks. Possible values are none, reject, and quarantine. The none policy is also called monitoring only and allows emails failing the checks to still be accepted. By specifying quarantine, email receivers will put SPF or DKIM non-conforming emails in the Spam folder. With reject, emails are dropped and not delivered at all if they fail SPF or DKIM.

The percentage tag (pct) can be used to apply the specified policy to a subset of incoming emails. This is helpful if you’re just rolling out DMARC and want to make sure everything is configured correctly by testing on a subset.

The last tag we can find on the record is the reporting URI (rua). This is used to specify an email address that will receive aggregate reports (usually daily) about non-conforming emails.

Tackling Email Spoofing and Phishing

Above, we can see how DMARC works step by step.

  1. An email is sent with a From header containing [email protected].
  2. The receiver queries SPF, DKIM and DMARC records from the domain mycoolwebpage.xyz to retrieve the required policies and the DKIM public key.
  3. The receiver performs SPF and DKIM checks as outlined previously. If both succeed, the email is accepted and delivered to the inbox. If either SPF or DKIM check fails, the DMARC policy will be followed and determines if the email is still accepted, dropped or sent to the spam folder.
  4. Finally, the receiver sends back an aggregate report. Depending on the email specified in the rua tag this report could also be sent to a different email server which is responsible for that email address.

Other optional tags and the complete DMARC specification is described in RFC7489.

A few numbers on the current adoption

Now that we’ve learned what the problem is and how to tackle it using SPF, DKIM, and DMARC, let’s see how widely they’re adopted.

Dmarc.org has published the adoption of DMARC records of domains in a representative dataset. It shows that by the end of 2020, still less than 50% of domains even had a DMARC record. And remember, without a DMARC record there is no clear enforcement of SPF and DKIM checks. The study further shows that, of the domains that have a DMARC record, more than 65% are using the monitoring only policy (p=none), so there is a significant potential to drive this adoption higher. The study does not mention if these domains are sending or receiving emails, but even if they didn’t, a secure configuration should include a DMARC record (more about this later).

Tackling Email Spoofing and Phishing

Another report from August 1, 2021 tells a similar story for domains that belong to entities in the banking sector. Of 2,881 banking entities in the United States, only 44% have published a DMARC record on their domain. Of those that have a DMARC record, roughly 2 out of 5 have set the DMARC policy to None and another 8% are considered “Misconfigured”. Denmark has a very high adoption of DMARC on domains in the banking sector of 94%, in contrast to Japan where only 13% of domains are using DMARC. SPF adoption is on average significantly higher than DMARC, which might have to do with the fact that the SPF standard was first introduced as experimental RFC in 2006 and DMARC only became a standard in 2015.

Country Number of entities SPF present DMARC present
Denmark 53 91% 94%
UK 83 88% 76%
Canada 24 96% 67%
USA 2,881 91% 44%
Germany 39 74% 36%
Japan 90 82% 13%

This shows us there is quite some room for improvement.

Enter: Email Security DNS Wizard

So what are we doing to increase the adoption of SPF, DKIM, and DMARC and tackle email spoofing and phishing? Enter the new Cloudflare Email Security DNS Wizard.

Starting today, when you’re navigating to the DNS tab of the Cloudflare dashboard, you’ll see two new features:

  1. A new section called Email Security
Tackling Email Spoofing and Phishing
  1. New warnings about insecure configurations
Tackling Email Spoofing and Phishing

In order to start using the Email Security DNS Wizard, you can either directly click the link in the warning which brings you to the relevant section of the wizard or click Configure in the new Email Security section. The latter will show you the following available options:

Tackling Email Spoofing and Phishing

There are two scenarios. You’re using your domain to send email, or you don’t. If you do, you can configure SPF, DKIM, and DMARC records by following a few simple steps. Here you can see the steps for SPF:

If your domain is not sending email, there is an easy way to configure all three records with just one click:

Tackling Email Spoofing and Phishing

Once you click Submit, this will create all three records configured in such a way that all emails will fail the checks and be rejected by incoming email servers.

example.com			TXT	"v=spf1 -all"
*._domainkey.example.com.	TXT	"v=DKIM1; p="
_dmarc.example.com.		TXT	"v=DMARC1; p=reject; sp=reject; adkim=s; aspf=s;"

Help tackle email spoofing and phishing

Out you go and make sure your domain is secured against email spoofing and phishing. Just head over to the DNS tab in the Cloudflare dashboard or if you are not yet using Cloudflare DNS, sign up for free in just a few minutes on cloudflare.com.

If you want to read more about SPF, DKIM and DMARC, go check out our new learning pages about email related DNS records.

Oblivious DNS-over-HTTPS

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2020/12/oblivious-dns-over-https.html

This new protocol, called Oblivious DNS-over-HTTPS (ODoH), hides the websites you visit from your ISP.

Here’s how it works: ODoH wraps a layer of encryption around the DNS query and passes it through a proxy server, which acts as a go-between the internet user and the website they want to visit. Because the DNS query is encrypted, the proxy can’t see what’s inside, but acts as a shield to prevent the DNS resolver from seeing who sent the query to begin with.

IETF memo.

The paper:

Abstract: The Domain Name System (DNS) is the foundation of a human-usable Internet, responding to client queries for host-names with corresponding IP addresses and records. Traditional DNS is also unencrypted, and leaks user information to network operators. Recent efforts to secure DNS using DNS over TLS (DoT) and DNS over HTTPS (DoH) havebeen gaining traction, ostensibly protecting traffic and hiding content from on-lookers. However, one of the criticisms ofDoT and DoH is brought to bear by the small number of large-scale deployments (e.g., Comcast, Google, Cloudflare): DNS resolvers can associate query contents with client identities in the form of IP addresses. Oblivious DNS over HTTPS (ODoH) safeguards against this problem. In this paper we ask what it would take to make ODoH practical? We describe ODoH, a practical DNS protocol aimed at resolving this issue by both protecting the client’s content and identity. We implement and deploy the protocol, and perform measurements to show that ODoH has comparable performance to protocols like DoH and DoT which are gaining widespread adoption,while improving client privacy, making ODoH a practical privacy enhancing replacement for the usage of DNS.

Slashdot thread.

How to protect a self-managed DNS service against DDoS attacks using AWS Global Accelerator and AWS Shield Advanced

Post Syndicated from Chido Chemambo original https://aws.amazon.com/blogs/security/how-to-protect-a-self-managed-dns-service-against-ddos-attacks-using-aws-global-accelerator-and-aws-shield-advanced/

In this blog post, I show you how to improve the distributed denial of service (DDoS) resilience of your self-managed Domain Name System (DNS) service by using AWS Global Accelerator and AWS Shield Advanced. You can use those services to incorporate some of the techniques used by Amazon Route 53 to protect against DDoS attacks.

DNS routes users to your application by quickly translating a human-readable domain name to a machine-readable IP address. When protecting the availability of your application against DDoS attacks, it’s important to consider every part of the stack, including domain name resolution. The recommended best practice is to create hosted zones on Route 53, a scalable, highly available DNS service that’s protected against large DDoS attacks and query floods. Route 53 uses anycast routing to serve DNS queries from more than 150 edge locations around the globe. With anycast routing, DNS queries are served from locations that are closer to your users and the globally distributed DDoS mitigation capacity of Amazon Web Services (AWS) reduces the impact of attacks.

Optionally, you can also build your own DNS service on Amazon Elastic Compute Cloud (Amazon EC2). For example, you can run your own proprietary DNS server to take advantage of custom features that you wrote to integrate with an existing DNS service that isn’t running on AWS. When you register a domain name, you’re usually required to provide at least two name servers that can respond to queries from your users. It’s possible to build a DNS service on only two instances, but that provides limited DDoS resilience.

Solution overview

To protect your self-managed DNS service using this solution, you need a strong understanding of DNS and how to operate a distributed, self-managed DNS service on Amazon EC2. This solution improves upon an existing self-managed DNS service by significantly enhancing its ability to withstand DDoS attacks. There are two components that you add to your application:

  • You use Global Accelerator to provide your application with two static IP addresses that act as a fixed entry point to Amazon EC2 instances in multiple AWS Regions. Global Accelerator uses anycast to route your traffic to a point of entry close to the source of the traffic. In addition to providing availability and performance benefits, this gives you access to global DDoS mitigation capacity through AWS.
  • You use Shield Advanced to monitor the availability of your application and automatically engage the AWS Shield Response Team (SRT) if its availability is affected by a DDoS attack. When you associate a Route 53 health check to your protected resources, Shield Advanced uses the health of the application as an input for detection and as a signal to SRT to contact your operations center when needed. You can also engage with SRT to write custom mitigations for your application. For your self-managed DNS service use case, this can include mitigations like DNS packet validation and suspicion scoring that gives a higher priority to queries that are more likely to be legitimate traffic for your application.

As part of this solution, you will build a DNS canary that uses Amazon CloudWatch to update the status of a Route 53 health check if your self-managed DNS service stops responding to queries. An example architecture using Amazon EC2 based DNS behind Global Accelerator and Shield is shown in figure 1.

Figure 1: Amazon EC2 based DNS behind Global Accelerator and Shield

Figure 1: Amazon EC2 based DNS behind Global Accelerator and Shield

Create and configure an accelerator

To begin, create an accelerator and add your existing DNS servers as endpoints. The newly created accelerator will receive queries and forward them to your DNS service.

To create and configure an accelerator

Step 1: Create an accelerator

  1. Navigate to the AWS Global Accelerator dashboard.
  2. Choose Create accelerator.
  3. Enter a name for your accelerator.
  4. Choose Next.

Step 2: Add listeners

Since DNS uses both TCP and UDP protocols, you must create separate listeners to handle requests for each protocol.

At the Add Listeners step, enter the following:

  1. Ports: 53
  2. Protocol: TCP
  3. Client affinity: None

Choose Add listener again to add the UDP listener. Enter the following:

  1. Ports: 53
  2. Protocol: UDP
  3. Client affinity: None
  4. Choose Next

To learn more about the different options available in this step, see To create a listener in Getting started with AWS Global Accelerator.

Step 3: Add endpoint groups

Starting with the TCP listener, enter the following settings:

  1. Region: Choose a Region that your DNS instances are located in, for example, us-east-1.
  2. Traffic dial: 100
  3. If you have additional DNS instances in another AWS Region, choose Add endpoint group and repeat steps a) and b), entering the appropriate Region.
  4. Repeat steps a) through c) to add endpoint groups for the UDP listener, and then choose Next.

To learn more about the different options available in this step, for example, Traffic dial, see the Add endpoint groups in Getting started with AWS Global Accelerator.

Step 4: Add endpoints

Starting with the TCP listener, enter the following in the form boxes for each Region specified in the previous step:

  1. Endpoint type: Select EC2 instance from the drop-down list.
  2. Endpoint: Select a DNS instance from the drop-down list.
  3. Weight: 128

If you have additional DNS instances in the Region, choose Add endpoint and repeat the preceding steps, but select a DNS instance that hasn’t been added as an endpoint.

Repeat all of the preceding steps for the UDP listener, then choose Create accelerator.

To learn more about the different options available in this step, see the Add endpoints in Getting started with AWS Global Accelerator.

Step 5: Verification

When you choose the Create accelerator button, you’re redirected to a Global Accelerator console page that lists all the accelerators in your account. On this page, you can view the global IPs and DNS name allocated to your newly created accelerator, in addition to the current status.

Wait until the status of the accelerators changes to Deployed before proceeding with any tests.

Configure Shield Advanced and Shield Advanced proactive engagement

Protect your accelerator with Shield Advanced, monitor the health of your application, and configure proactive engagement. When you turn on proactive engagement, the SRT will directly contact you if an Amazon Route 53 health check associated with your protected resource becomes unhealthy during an event that’s detected by Shield Advanced.

To configure proactive engagement

Step 1: Create a Route 53 health check

If you already have a Route 53 health check that monitors the health of your DNS service, you can proceed to step 2 of this section. If you don’t yet have a health check, you can use this AWS CloudFormation template to create one. The template will:

  1. Create a Lambda function that queries your DNS server through the accelerator global IPs. This function posts metrics to CloudWatch to indicate whether the query was successful or not.
  2. Create a CloudWatch alarm that will detect when DNS queries fail.
  3. Create a Route 53 health check that tracks the CloudWatch alarm and changes status to unhealthy when the alarm changes to the Alarm state.

Step 2: Subscribe to Shield Advanced

Please note that with AWS Shield Advanced, you pay a monthly fee of $3,000 per month per organization. In addition, you also pay for AWS Shield Advanced Data Transfer usage fees for AWS resources enabled for advanced protection.

  1. Navigate to the AWS Shield console.
  2. In the AWS Shield navigation bar, choose Getting started, and then choose Subscribe to Shield Advanced.
  3. On the Subscribe to Shield Advanced page, read the terms of agreement, and then select all of the check boxes to indicate that you accept the terms.
  4. Choose Subscribe to Shield Advanced.

Step 3: Add resources to protect

  1. Do one of the following, depending on if you were already subscribed to Shield Advanced.
    • If you just subscribed to Shield Advanced by completing Step 2 above, choose Add resources to protect.
    • If you were already subscribed to Shield Advanced, open the Shield console and choose Protected Resources, and then choose Add resources to protect.
  2. In the Choose resources to protect with Shield Advanced page, select the Regions and resource types that you want to protect, then choose Load resources.
  3. Select the resources that you want to protect, and then choose Protect with Shield Advanced.
  4. In the Configure health check based DDoS detection page, under the Protected resources section, select a Route 53 health check to add—either one that you created previously, or a health check created by the AWS CloudFormation template—as the Associated Health Check.
  5. Choose Next until you reach the Review and configure DDoS mitigation and visibility page, and then review the settings and choose Finish configuration.

Step 4: Add contacts

  1. Navigate to the Overview tab of the AWS Shield console.
  2. In the Proactive engagements and contacts section, choose Edit under the Contacts heading.
  3. In the Add contact form, add the contact’s Email, Phone number, and Notes.
  4. Choose Save.

Step 5: Request proactive engagement

  1. Choose Edit proactive engagement feature.
  2. Select Enable.
  3. Choose Save.

Step 6: Configuration review with the SRT

After you enable proactive engagement, the state will be Proactive engagement requested and pending.

SRT will contact you to schedule a configuration review. The review will include a review of your Route 53 health check configuration and a consultation about custom mitigations that can be configured to support your DNS use case. Following this review, SRT will complete your request to enable proactive engagement.

Summary

DNS is a foundational part of the user experience for any application that is accessed via a human readable domain name. Your DNS service should be highly available, DDoS resilient, and accessible to your users with minimal latency. If you run your own DNS service on Amazon EC2, you can improve the DDoS resiliency using Global Accelerator and Shield Advanced. This solution provides your users with a low latency path to your DNS service and provides you with some of the DDoS mitigation that protects Route 53. To learn more about DDoS best practices, see AWS Best Practices for DDoS Resiliency.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the AWS Shield forum or contact AWS Support.

Want more AWS Security how-to content, news, and feature announcements? Follow us on Twitter.

Author

Chido Chemambo

Chido is a Security Engineer on the AWS Shield Team with 12 years of experience in the telecommunications industry. He specializes in network security and enjoys working with colleagues to improve AWS Shield, and with customers to improve their cloud architectures. Outside of work, Chido enjoys jumping rope, improving his development skills, and watching English Premier League soccer and Formula 1.