Tag Archives: firewall

Building even faster interpreters in Rust

Post Syndicated from Zak Cutner original https://blog.cloudflare.com/building-even-faster-interpreters-in-rust/

Building even faster interpreters in Rust

Building even faster interpreters in Rust

At Cloudflare, we’re constantly working on improving the performance of our edge — and that was exactly what my internship this summer entailed. I’m excited to share some improvements we’ve made to our popular Firewall Rules product over the past few months.

Firewall Rules lets customers filter the traffic hitting their site. It’s built using our engine, Wirefilter, which takes powerful boolean expressions written by customers and matches incoming requests against them. Customers can then choose how to respond to traffic which matches these rules. We will discuss some in-depth optimizations we have recently made to Wirefilter, so you may wish to get familiar with how it works if you haven’t already.

Minimizing CPU usage

As a new member of the Firewall team, I quickly learned that performance is important — even in our security products. We look for opportunities to make our customers’ Internet properties faster where it’s safe to do so, maximizing both security and performance.

Our engine is already heavily used, powering all of Firewall Rules. But we have bigger plans. More and more products like our Web Application Firewall (WAF) will be running behind our Wirefilter-based engine, and it will become responsible for eating up a sizable chunk of our total CPU usage before long.

How to measure performance?

Measuring performance is a notoriously tricky task, and as you can probably imagine trying to do this in a highly distributed environment (aka Cloudflare’s edge) does not help. We’ve been surprised in the past by optimizations that look good on paper, but, when tested out in production, just don’t seem to do much.

Our solution? Performance measurement as a service — an isolated and reproducible benchmark for our Firewall engine and a framework for engineers to easily request runs and view results. It’s worth noting that we took a lot of inspiration from the fantastic Rust Compiler benchmarks to build this.

Building even faster interpreters in Rust
Our benchmarking framework, showing how performance during different stages of processing Wirefilter expressions has changed over time [1].

What to measure?

Our next challenge was to find some meaningful performance metrics. Some experimentation quickly uncovered that time was far too volatile a measure for meaningful comparisons, so we turned to hardware counters [2]. It’s not hard to find tools to measure these (perf and VTune are two such examples), although they (mostly) don’t allow control over which parts of the program are recorded. In our case, we wished to individually record measurements for different stages of filter processing — parsing, compilation, analysis, and execution.

Once again we took inspiration from the Rust compiler, and its self-profiling options, using the perf_event_open API to record counters from inside our binary. We then output something like the following, which our framework can easily ingest and store for later visualization.

Building even faster interpreters in Rust
Output of our benchmarks in JSON Lines format, showing a list of recordings for each combination of hardware counter and Wirefilter processing stage. We’ve used 10 repeats here for readability, but we use around 20, in addition to 5 warmup rounds, within our framework.

Whilst we mainly focussed on metrics relating to CPU usage, we also use a combination of getrusage and clear_refs to find the maximum resident set size (RSS). This is useful to understand the memory impact of particular algorithms in addition to CPU.

But the challenge was not over. Cloudflare’s standard CI agents use virtualization and sandboxing for security and convenience, but this makes accessing hardware counters virtually impossible. Running our benchmarks on a dedicated machine gave us access to these counters, and ensured more reproducible results.

Speeding up the speed test

Our benchmarks were designed from the outset to take an important place in our development process. For instance, we now perform a full benchmark run before releasing each new version to detect performance regressions.

But with our benchmarks in place, it quickly became clear that we had a problem. Our benchmarks simply weren’t fast enough — at least if we wanted to complete them in less than a few hours! The problem was we have a very large number  of filters. Since our engine would never usually execute requests against this many filters at once it was proving incredibly costly. We came up with a few tricks to cut this down…

  • Deduplication. It turns out that only around a third of filters are structurally unique (something that is easy to check as Wirefilter can helpfully serialize to JSON). We managed to cut down a great deal of time by ignoring duplicate filters in our benchmarks.
  • Sampling. Still, we had too many filters and random sampling presented an easy solution. A more subtle challenge was to make sure that the random sample was always the same to maintain reproducibility.
  • Partitioning. We worried that deduplication and sampling would cause us to miss important cases that are useful to optimize. By first partitioning filters by Wirefilter language feature, we can ensure we’re getting a good range of filters. It also helpfully gives us more detail about where specifically the impact of a performance change is.

Most of these are trade-offs, but very necessary ones which allow us to run continual benchmarks without development speed grinding to a halt. At the time of writing, we’ve managed to get a benchmark run down to around 20 minutes using these ideas.

Optimizing our engine

With a benchmarking framework in place, we were ready to begin testing optimizations. But how do you optimize an interpreter like Wirefilter? Just-in-time (JIT) compilation, selective inlining and replication were some ideas floating around in the word of interpreters that seemed attractive. After all, we previously wrote about the cost of dynamic dispatch in Wirefilter. All of these techniques aim to reduce that effect.

However, running some real filters through a profiler tells a different story. Most execution time, around 65%, is spent not resolving dynamic dispatch calls but instead performing operations like comparison and searches. Filters currently in production tend to be pretty light on functions, but throw in a few more of these and even less time would be spent on dynamic dispatch. We suspect that even a fair chunk of the remaining 35% is actually spent reading the memory of request fields.

FunctionCPU time
`matches` operator0.6%
`in` operator1.1%
`eq` operator11.8%
`contains` operator51.5%
Everything else35.0%
Breakdown of CPU time while executing a typical production filter.

An adventure in substring searching

By now, you shouldn’t be surprised that the contains operator was one of the first in line for optimization. If you’ve ever written a Firewall Rule, you’re probably already familiar with what it does — it checks whether a substring is present in the field you are matching against. For example, the following expression would match when the host is “example.com” or “www.example.net”, but not when it is “cloudflare.com”. In string searching algorithms, this is commonly referred to as finding a ‘needle’ (“example”) within a ‘haystack’ (“example.com”).

http.host contains “example”

How does this work under the hood? Ordinarily, we may have used Rust’s `String::contains` function but Wirefilter also allows raw byte expressions that don’t necessarily conform to UTF-8.

http.host contains 65:78:61:6d:70:6c:65

We therefore used the memmem crate which performs a two-way substring search algorithm on raw bytes.

Sounds good, right? It was, and it was working pretty well, although we’d noticed that rewriting `contains` filters using regular expressions could bizarrely often make them faster.

http.host matches “example”

Regular expressions are great, but since they’re far more powerful than the `contains` operator, they shouldn’t be faster than a specialized algorithm in simple cases like this one.

Something was definitely up. It turns out that Rust’s regex library comes equipped with a whole host of specialized matchers for what it deems to be simple expressions like this. The obvious question was whether we could therefore simply use the regex library. Interestingly, you may not have realized that the popular ripgrep tool does just that when searching for fixed-string patterns.

However, our use case is a little different. Since we’re building an interpreter (and we’re using dynamic dispatch in any case), we would prefer to dispatch to a specialized case for `contains` expressions, rather than matching on some enum deep within the regex crate when the filter is executed. What’s more, there are some pretty cool things being done to perform substring searching that leverages SIMD instruction sets. So we wired up our engine to some previous work by Wojciech Muła and the results were fantastic.

BenchmarkImprovement
Expressions using `contains` operator72.3%
‘Simple’ expressions0.0%
All expressions31.6%
Improvements in instruction count using Wojciech Muła’s sse4-strstr library over the memmem crate with Wirefilter.

I encourage you to read more on “Algorithm 1”, which we used, but it works something like this (I’ve changed the order a little to help make it clearer). It’s worth reading up on SIMD instructions if you’re unfamiliar with them — they’re the essence behind what makes this algorithm fast.

  1. We fill one SIMD register with the first byte of the needle being searched for, simply repeated over and over.
  2. We load as much of our haystack as we can into another SIMD register and perform a bitwise equality operation with our previous register.
  3. Now, any position in the resultant register that is 0 cannot be the start of the match since it doesn’t start with the same byte of the needle.
  4. We now repeat this process with the last byte of the needle, offsetting the haystack, to rule out any positions that don’t end with the same byte as the needle.
  5. Bitwise ANDing these two results together, we (hopefully) have now drastically reduced our potential matches.
  6. Each of the remaining potential matches can be checked manually using a memcmp operation. If we find a match, then we’re done.
  7. If not, we continue with the next part of our haystack and repeat until we’ve checked the entire thing.

When it goes wrong

You may be wondering what happens if our haystack doesn’t fit neatly into registers. In the original algorithm, nothing. It simply continues reading into the oblivion after the end of the haystack until the last register is full, and uses a bitmask to ignore potential false-positives from this additional region of memory.

As we mentioned, security is our priority when it comes to optimizations, so we could never deploy something with this kind of behaviour. We ended up porting Muła’s library to Rust (we’ve also open-sourced the crate!) and performed an overlapping registers modification found in ARM’s blog.

It’s best illustrated by example — notice the difference between how we would fill registers on an imaginary SIMD instruction-set with 4-byte registers.

Before modification

Building even faster interpreters in Rust
How registers are filled in the original implementation for the haystack “abcdefghij”, red squares indicate out of bounds memory.

After modification

Building even faster interpreters in Rust
How registers are filled with the overlapping modification for the same haystack, notice how ‘g’ and ‘h’ each appear in two registers.

In our case, repeating some bytes within two different registers will never change the final outcome, so this modification is allowed as-is. However, in reality, we found it was better to use a bitmask to exclude repeated parts of the final register and minimize the number of memcmp calls.

What if the haystack is too small to even fill a single register? In this case, we can’t use our overlapping trick since there’s nothing to overlap with. Our solution is straightforward: while we were primarily targeting AVX2, which can store 32-bytes in a lane, we can easily move down to another instruction set with smaller registers that the haystack can fit into. In reality, we don’t currently go any smaller than SSE2. Beyond this, we instead use an implementation of the Rabin-Karp searching algorithm which appears to perform well.

Instruction setRegister size
AVX232 bytes
SSE216 bytes
SWAR (u64)8 bytes
SWAR (u32)4 bytes
Register sizes in different SIMD instruction sets [3]. We did not consider AVX512 since support for this is not widespread enough.

Is it always fast?

Choosing the first and last bytes of the needle to rule out potential matches is a great idea. It means that when it does come to performing a memcmp, we can ignore these, as we know they already match. Unfortunately, as Muła points out, this also makes the algorithm susceptible to a worst-case attack in some instances.

Let’s give an expression that a customer might write to illustrate this.

http.request.uri.path contains “/wp-admin/”

If we try to search for this within a very long sequence of ‘/’s, we will find a potential match in every position and make lots of calls to memcmp — essentially performing a slow bruteforce substring search.

Clearly we need to choose different bytes from the needle. But which ones should we choose? For each choice, an adversary can always find a slightly different, but equally troublesome, worst case. We instead use randomness to throw off our would-be adversary, picking the first byte of the needle as before, but then choosing another random byte to use.

Our new version is unsurprisingly slower than Muła’s, yet it still exhibits a great improvement over both the memmem and regex crates. Performance, but without sacrificing safety.

BenchmarkImprovement
sse4-strstr (original)sliceslice (our version)
Expressions using `contains` operator72.3%49.1%
‘Simple’ expressions0.0%0.1%
All expressions31.6%24.0%
Improvements in instruction count of using sse4-strstr and sliceslice over the memmem crate with Wirefilter.

What’s next?

This is only a small taste of the performance work we’ve been doing, and we have much more yet to come. Nevertheless, none of this would have been possible without the support of my manager Richard and my mentor Elie, who contributed a lot of these ideas. I’ve learned so much over the past few months, but most of all that Cloudflare is an amazing place to be an intern!

[1] Since our benchmarks are not run within a production environment, results in this post do not represent traffic on our edge.

[2] We found instruction counts to be a particularly stable measure, and CPU cycles a particularly unstable one.

[3] Note that SWAR is technically not an instruction set, but instead uses regular registers like vector registers.

Enhancing site security with new Lightsail firewall features

Post Syndicated from Emma White original https://aws.amazon.com/blogs/compute/enhancing-site-security-with-new-lightsail-firewall-features/

This post is contributed by Mike Coleman, AWS Senior Developer Advocate – Lightsail

Amazon Lightsail provides an easy way to get started with AWS for many customers. The service balances ease of use, security, and flexibility. The Lightsail firewall now offers additional features to help customers secure their Lightsail instances. This update offers three new capabilities:

  • The ability to specify source IP addresses for firewall rules
  • Explicitly allowing or disallowing remote access to instances via Lightsail’s web-based console
  • Support for PING

This blog explores each of these new features in detail, starting with source IP addresses.

Before this update, any open ports in the Lightsail firewall were open to the internet. In many cases, this is a reasonable approach. For example, for new WordPress servers, you likely need broad public access.

However, in some cases you want to restrict access to an instance. If you are staging a new website and it’s not ready for publication, you may want to limit access. One way to ensure that only certain people can visit the site is to only allow certain IP addresses to connect.

Another common use case is limiting remote access to an instance. With the new changes to the Lightsail firewall, you would be able to limit SSH or RDP access by source IP address. Additionally, you can now enable or disable remote access via Lightsail’s built-in web client.

Access can be restricted from one or more IP addresses (for example, the IP address for your home computer) or a continuous range of IP addresses (such as the address range for your corporate network).

Next, I review how you configure these options to restrict remote access via SSH to a single source IP address.

Finding your IP address

Most computers do not have an internet routable IP address assigned. Internet routable IP addresses are scarce and usually assigned to your internet gateway device. The devices on the network are assigned private IP addresses. To communicate between the private IP network and the internet, the network router typically uses network address translation (NAT).

This tutorial assumes you are using NAT. This means the IP address used to restrict SSH access is the IP routable address of your network gateway device (usually your wireless router). Consequently, this limits access to all devices on the network behind this IP address.

There are many ways to find your internet routable IP address. You can log into your network gateway device and find it there (consult your device’s user manual for more details). Alternatively, use one of several public services to determine your IP address – search online for “what is my IP” to list several options.

Restricting SSH access to a single IP address

  1. Start by creating a new Lightsail instance – you can select any blueprint.
  2. Once the instance state shows Running, choose the name of the instance to open the Instance details page.
    firewall test instance
  3. Choose Networking from the menu.
    networking tab
  4. Scroll down to find the current firewall settings. Under Allow connections from, it lists Any IP address for all of the applications. To change this, choose the edit icon for the SSH rule.
    IP address
  5. Check the box next to Restrict to IP address and enter your internet routable IP address under Source IP or range.restrict IP address
    Note: The next section shows how to restrict access from Lightsail’s browser-based SSH client. Currently, Allow Lightsail browser SSH box is checked.
  6. Choose Save.

Now, SSH into your Lightsail instance from your local machine. You can learn more about how to connect to your Lightsail instance using SSH from our documentation.

You should be able to connect to your instance successfully. Next, you test the connection from a different IP address. You do this by restricting a different IP address, and attempting to connect again:

  1. Edit the SSH firewall rule again follows the instructions above. This time, under IP or IP Range enter 192.168.2.150.
  2. Choose Save.

Attempt to connect to your instance once more. The connection fails because your IP address does not match an IP address in the range.

Restricting access from the Lightsail browser-based SSH client

The browser-based SSH client makes it easy to access instances without needing to manage SSH keys on locally. However, there may be cases where you must disable browser-based access.

To do this:

  1. Navigate to the firewall rules for the instance you created earlier.
  2. Choose the edit icon for the SSH rule.edit IP address
  3. Uncheck the box next to Allow Lightsail browser SSH. Choose Save.
  4. From the menu, choose Connect, then choose Connect using SSH. The browser window opens, but you are not connected to the instance.

PING Support

There is now support for PING, a command line utility used to check if a computer is reachable over the network. PING sends a packet to a remote computer, which sends a simple response back. Before this release, you could not PING Lightsail instances.

To activate this feature, add a firewall rule:

  1. Navigate to the networking page for your instance.      add rule to firewall<
  2. Under the firewall section, choose +Add rule.
  3. From the application list, choose PING (ICMP). Choose Save.
  4. From a terminal window on your local machine, send a ping command to your Lightsail instance’s IP address. You can find the IP address from the Connect tab of the instance details page or from instance card on the Lightsail home page.
ping -c 5 192.168.2.143

You see a response similar to:

<ping -c 5 192.168.2.143
PING 192.168.2.143 (192.168.2.143): 56 data bytes
64 bytes from 192.168.2.143: icmp_seq=0 ttl=54 time=19.383 ms
64 bytes from 192.168.2.143: icmp_seq=1 ttl=54 time=16.821 ms
64 bytes from 192.168.2.143: icmp_seq=2 ttl=54 time=16.363 ms
64 bytes from 192.168.2.143: icmp_seq=3 ttl=54 time=27.335 ms
64 bytes from 192.168.2.143: icmp_seq=4 ttl=54 time=19.429 ms

--- 192.168.2.143 ping statistics ---
5 packets transmitted, 5 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 16.363/19.866/27.335/3.943 ms

Conclusion

In this blog I covered how you can increase the security of your Lightsail instances by taking advantage of three new features: source IP restrictions, limiting access to the Lightsail browser SSH and RDP clients, and the addition of PING (ICMP) as an application type. These new features provide you an extra level of flexibility and security when deploying applications on Lightsail.

To learn more about the Lightsail firewall, see the documentation. Additionally, there are Getting Started tutorials for Lightsail, including launching a LAMP stack application or .NET application.

Cloudflare Bot Management: machine learning and more

Post Syndicated from Alex Bocharov original https://blog.cloudflare.com/cloudflare-bot-management-machine-learning-and-more/

Cloudflare Bot Management: machine learning and more

Introduction

Cloudflare Bot Management: machine learning and more

Building Cloudflare Bot Management platform is an exhilarating experience. It blends Distributed Systems, Web Development, Machine Learning, Security and Research (and every discipline in between) while fighting ever-adaptive and motivated adversaries at the same time.

This is the ongoing story of Bot Management at Cloudflare and also an introduction to a series of blog posts about the detection mechanisms powering it. I’ll start with several definitions from the Bot Management world, then introduce the product and technical requirements, leading to an overview of the platform we’ve built. Finally, I’ll share details about the detection mechanisms powering our platform.

Let’s start with Bot Management’s nomenclature.

Some Definitions

Bot – an autonomous program on a network that can interact with computer systems or users, imitating or replacing a human user’s behavior, performing repetitive tasks much faster than human users could.

Good bots – bots which are useful to businesses they interact with, e.g. search engine bots like Googlebot, Bingbot or bots that operate on social media platforms like Facebook Bot.

Bad bots – bots which are designed to perform malicious actions, ultimately hurting businesses, e.g. credential stuffing bots, third-party scraping bots, spam bots and sneakerbots.

Cloudflare Bot Management: machine learning and more

Bot Management – blocking undesired or malicious Internet bot traffic while still allowing useful bots to access web properties by detecting bot activity, discerning between desirable and undesirable bot behavior, and identifying the sources of the undesirable activity.

WAF – a security system that monitors and controls network traffic based on a set of security rules.

Gathering requirements

Cloudflare has been stopping malicious bots from accessing websites or misusing APIs from the very beginning, at the same time helping the climate by offsetting the carbon costs from the bots. Over time it became clear that we needed a dedicated platform which would unite different bot fighting techniques and streamline the customer experience. In designing this new platform, we tried to fulfill the following key requirements.

  • Complete, not complex – customers can turn on/off Bot Management with a single click of a button, to protect their websites, mobile applications, or APIs.
  • Trustworthy – customers want to know whether they can trust the website visitor is who they say they are and provide a certainty indicator for that trust level.
  • Flexible – customers should be able to define what subset of the traffic Bot Management mitigations should be applied to, e.g. only login URLs, pricing pages or sitewide.
  • Accurate – Bot Management detections should have a very small error, e.g. none or very few human visitors ever should be mistakenly identified as bots.
  • Recoverable – in case a wrong prediction was made, human visitors still should be able to access websites as well as good bots being let through.

Moreover, the goal for new Bot Management product was to make it work well on the following use cases:

Cloudflare Bot Management: machine learning and more

Technical requirements

Additionally to the product requirements above, we engineers had a list of must-haves for the new Bot Management platform. The most critical were:

  • Scalability – the platform should be able to calculate a score on every request, even at over 10 million requests per second.
  • Low latency – detections must be performed extremely quickly, not slowing down request processing by more than 100 microseconds, and not requiring additional hardware.
  • Configurability – it should be possible to configure what detections are applied on what traffic, including on per domain/data center/server level.
  • Modifiability – the platform should be easily extensible with more detection mechanisms, different mitigation actions, richer analytics and logs.
  • Security – no sensitive information from one customer should be used to build models that protect another customer.
  • Explainability & debuggability – we should be able to explain and tune predictions in an intuitive way.

Equipped with these requirements, back in 2018, our small team of engineers got to work to design and build the next generation of Cloudflare Bot Management.

Meet the Score

“Simplicity is the ultimate sophistication.”
– Leonardo Da Vinci

Cloudflare operates on a vast scale. At the time of this writing, this means covering 26M+ Internet properties, processing on average 11M requests per second (with peaks over 14M), and examining more than 250 request attributes from different protocol levels. The key question is how to harness the power of such “gargantuan” data to protect all of our customers from modern day cyberthreats in a simple, reliable and explainable way?

Bot management is hard. Some bots are much harder to detect and require looking at multiple dimensions of request attributes over a long time, and sometimes a single request attribute could give them away. More signals may help, but are they generalizable?

When we classify traffic, should customers decide what to do with it or are there decisions we can make on behalf of the customer? What concept could possibly address all these uncertainty problems and also help us to deliver on the requirements from above?

As you might’ve guessed from the section title, we came up with the concept of Trusted Score or simply The Scoreone thing to rule them all – indicating the likelihood between 0 and 100 whether a request originated from a human (high score) vs. an automated program (low score).

Cloudflare Bot Management: machine learning and more
“One Ring to rule them all” by idreamlikecrazy, used under CC BY / Desaturated from original

Okay, let’s imagine that we are able to assign such a score on every incoming HTTP/HTTPS request, what are we or the customer supposed to do with it? Maybe it’s enough to provide such a score in the logs. Customers could then analyze them on their end, find the most frequent IPs with the lowest scores, and then use the Cloudflare Firewall to block those IPs. Although useful, such a process would be manual, prone to error and most importantly cannot be done in real time to protect the customer’s Internet property.

Fortunately, around the same time we started worked on this system , our colleagues from the Firewall team had just announced Firewall Rules. This new capability provided customers the ability to control requests in a flexible and intuitive way, inspired by the widely known Wireshark®  language. Firewall rules supported a variety of request fields, and we thought – why not have the score be one of these fields? Customers could then write granular rules to block very specific attack types. That’s how the cf.bot_management.score field was born.

Having a score in the heart of Cloudflare Bot Management addressed multiple product and technical requirements with one strike – it’s simple, flexible, configurable, and it provides customers with telemetry about bots on a per request basis. Customers can adjust the score threshold in firewall rules, depending on their sensitivity to false positives/negatives. Additionally, this intuitive score allows us to extend our detection capabilities under the hood without customers needing to adjust any configuration.

So how can we produce this score and how hard is it? Let’s explore it in the following section.

Architecture overview

What is powering the Bot Management score? The short answer is a set of microservices. Building this platform we tried to re-use as many pipelines, databases and components as we could, however many services had to be built from scratch. Let’s have a look at overall architecture (this overly simplified version contains Bot Management related services):

Cloudflare Bot Management: machine learning and more

Core Bot Management services

In a nutshell our systems process data received from the edge data centers, produce and store data required for bot detection mechanisms using the following technologies:

  • Databases & data storesKafka, ClickHouse, Postgres, Redis, Ceph.
  • Programming languages – Go, Rust, Python, Java, Bash.
  • Configuration & schema management – Salt, Quicksilver, Cap’n Proto.
  • Containerization – Docker, Kubernetes, Helm, Mesos/Marathon.

Each of these services is built with resilience, performance, observability and security in mind.

Edge Bot Management module

All bot detection mechanisms are applied on every request in real-time during the request processing stage in the Bot Management module running on every machine at Cloudflare’s edge locations. When a request comes in we extract and transform the required request attributes and feed them to our detection mechanisms. The Bot Management module produces the following output:

Firewall fieldsBot Management fields
cf.bot_management.score – an integer indicating the likelihood between 0 and 100 whether a request originated from an automated program (low score) to a human (high score).
cf.bot_management.verified_bot – a boolean indicating whether such request comes from a Cloudflare whitelisted bot.
cf.bot_management.static_resource – a boolean indicating whether request matches file extensions for many types of static resources.

Cookies – most notably it produces cf_bm, which helps manage incoming traffic that matches criteria associated with bots.

JS challenges – for some of our detections and customers we inject into invisible JavaScript challenges, providing us with more signals for bot detection.

Detection logs – we log through our data pipelines to ClickHouse details about each applied detection, used features and flags, some of which are used for analytics and customer logs, while others are used to debug and improve our models.

Once the Bot Management module has produced the required fields, the Firewall takes over the actual bot mitigation.

Firewall integration

The Cloudflare Firewall’s intuitive dashboard enables users to build powerful rules through easy clicks and also provides Terraform integration. Every request to the firewall is inspected against the rule engine. Suspicious requests can be blocked, challenged or logged as per the needs of the user while legitimate requests are routed to the destination, based on the score produced by the Bot Management module and the configured threshold.

Cloudflare Bot Management: machine learning and more

Firewall rules provide the following bot mitigation actions:

  • Log – records matching requests in the Cloudflare Logs provided to customers.
  • Bypass – allows customers to dynamically disable Cloudflare security features for a request.
  • Allow – matching requests are exempt from challenge and block actions triggered by other Firewall Rules content.
  • Challenge (Captcha) – useful for ensuring that the visitor accessing the site is human, and not automated.
  • JS Challenge – useful for ensuring that bots and spam cannot access the requested resource; browsers, however, are free to satisfy the challenge automatically.
  • Block – matching requests are denied access to the site.

Our Firewall Analytics tool, powered by ClickHouse and GraphQL API, enables customers to quickly identify and investigate security threats using an intuitive interface. In addition to analytics, we provide detailed logs on all bots-related activity using either the Logpull API and/or LogPush, which provides the easy way to get your logs to your cloud storage.

Cloudflare Workers integration

In case a customer wants more flexibility on what to do with the requests based on the score, e.g. they might want to inject new, or change existing, HTML page content, or serve incorrect data to the bots, or stall certain requests, Cloudflare Workers provide an option to do that. For example, using this small code-snippet, we can pass the score back to the origin server for more advanced real-time analysis or mitigation:

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})
 
async function handleRequest(request) {
  request = new Request(request);
 
  request.headers.set("Cf-Bot-Score", request.cf.bot_management.score)
 
  return fetch(request);
}

Now let’s have a look into how a single score is produced using multiple detection mechanisms.

Detection mechanisms

Cloudflare Bot Management: machine learning and more

The Cloudflare Bot Management platform currently uses five complementary detection mechanisms, producing their own scores, which we combine to form the single score going to the Firewall. Most of the detection mechanisms are applied on every request, while some are enabled on a per customer basis to better fit their needs.

Cloudflare Bot Management: machine learning and more
Cloudflare Bot Management: machine learning and more

Having a score on every request for every customer has the following benefits:

  • Ease of onboarding – even before we enable Bot Management in active mode, we’re able to tell how well it’s going to work for the specific customer, including providing historical trends about bot activity.
  • Feedback loop – availability of the score on every request along with all features has tremendous value for continuous improvement of our detection mechanisms.
  • Ensures scaling – if we can compute for score every request and customer, it means that every Internet property behind Cloudflare is a potential Bot Management customer.
  • Global bot insights – Cloudflare is sitting in front of more than 26M+ Internet properties, which allows us to understand and react to the tectonic shifts happening in security and threat intelligence over time.

Overall globally, more than third of the Internet traffic visible to Cloudflare is coming from bad bots, while Bot Management customers have the ratio of bad bots even higher at ~43%!

Cloudflare Bot Management: machine learning and more
Cloudflare Bot Management: machine learning and more

Let’s dive into specific detection mechanisms in chronological order of their integration with Cloudflare Bot Management.

Machine learning

The majority of decisions about the score are made using our machine learning models. These were also the first detection mechanisms to produce a score and to on-board customers back in 2018. The successful application of machine learning requires data high in Quantity, Diversity, and Quality, and thanks to both free and paid customers, Cloudflare has all three, enabling continuous learning and improvement of our models for all of our customers.

At the core of the machine learning detection mechanism is CatBoost  – a high-performance open source library for gradient boosting on decision trees. The choice of CatBoost was driven by the library’s outstanding capabilities:

  • Categorical features support – allowing us to train on even very high cardinality features.
  • Superior accuracy – allowing us to reduce overfitting by using a novel gradient-boosting scheme.
  • Inference speed – in our case it takes less than 50 microseconds to apply any of our models, making sure request processing stays extremely fast.
  • C and Rust API – most of our business logic on the edge is written using Lua, more specifically LuaJIT, so having a compatible FFI interface to be able to apply models is fantastic.

There are multiple CatBoost models run on Cloudflare’s Edge in the shadow mode on every request on every machine. One of the models is run in active mode, which influences the final score going to Firewall. All ML detection results and features are logged and recorded in ClickHouse for further analysis, model improvement, analytics and customer facing logs. We feed both categorical and numerical features into our models, extracted from request attributes and inter-request features built using those attributes, calculated and delivered by the Gagarin inter-requests features platform.

We’re able to deploy new ML models in a matter of seconds using an extremely reliable and performant Quicksilver configuration database. The same mechanism can be used to configure which version of an ML model should be run in active mode for a specific customer.

A deep dive into our machine learning detection mechanism deserves a blog post of its own and it will cover how do we train and validate our models on trillions of requests using GPUs, how model feature delivery and extraction works, and how we explain and debug model predictions both internally and externally.

Heuristics engine

Not all problems in the world are the best solved with machine learning. We can tweak the ML models in various ways, but in certain cases they will likely underperform basic heuristics. Often the problems machine learning is trying to solve are not entirely new. When building the Bot Management solution it became apparent that sometimes a single attribute of the request could give a bot away. This means that we can create a bunch of simple rules capturing bots in a straightforward way, while also ensuring lowest false positives.

The heuristics engine was the second detection mechanism integrated into the Cloudflare Bot Management platform in 2019 and it’s also applied on every request. We have multiple heuristic types and hundreds of specific rules based on certain attributes of the request, some of which are very hard to spoof. When any of the requests matches any of the heuristics – we assign the lowest possible score of 1.

The engine has the following properties:

  • Speed – if ML model inference takes less than 50 microseconds per model, hundreds of heuristics can be applied just under 20 microseconds!
  • Deployability – the heuristics engine allows us to add new heuristic in a matter of seconds using Quicksilver, and it will be applied on every request.
  • Vast coverage – using a set of simple heuristics allows us to classify ~15% of global traffic and ~30% of Bot Management customers’ traffic as bots. Not too bad for a few if conditions, right?
  • Lowest false positives – because we’re very sure and conservative on the heuristics we add, this detection mechanism has the lowest FP rate among all detection mechanisms.
  • Labels for ML – because of the high certainty we use requests classified with heuristics to train our ML models, which then can generalize behavior learnt from from heuristics and improve detections accuracy.

So heuristics gave us a lift when tweaked with machine learning and they contained a lot of the intuition about the bots, which helped to advance the Cloudflare Bot Management platform and allowed us to onboard more customers.

Behavioral analysis

Machine learning and heuristics detections provide tremendous value, but both of them require human input on the labels, or basically a teacher to distinguish between right and wrong. While our supervised ML models can generalize well enough even on novel threats similar to what we taught them on, we decided to go further. What if there was an approach which doesn’t require a teacher, but rather can learn to distinguish bad behavior from the normal behavior?

Enter the behavioral analysis detection mechanism, initially developed in 2018 and integrated with the Bot Management platform in 2019. This is an unsupervised machine learning approach, which has the following properties:

  • Fitting specific customer needs – it’s automatically enabled for all Bot Management customers, calculating and analyzing normal visitor behavior over an extended period of time.
  • Detects bots never seen before – as it doesn’t use known bot labels, it can detect bots and anomalies from the normal behavior on specific customer’s website.
  • Harder to evade – anomalous behavior is often a direct result of the bot’s specific goal.

Please stay tuned for a more detailed blog about behavioral analysis models and the platform powering this incredible detection mechanism, protecting many of our customers from unseen attacks.

Verified bots

So far we’ve discussed how to detect bad bots and humans. What about good bots, some of which are extremely useful for the customer website? Is there a need for a dedicated detection mechanism or is there something we could use from previously described detection mechanisms? While the majority of good bot requests (e.g. Googlebot, Bingbot, LinkedInbot) already have low score produced by other detection mechanisms, we also need a way to avoid accidental blocks of useful bots. That’s how the Firewall field cf.bot_management.verified_bot came into existence in 2019, allowing customers to decide for themselves whether they want to let all of the good bots through or restrict access to certain parts of the website.

The actual platform calculating Verified Bot flag deserves a detailed blog on its own, but in the nutshell it has the following properties:

  • Validator based approach – we support multiple validation mechanisms, each of them allowing us to reliably confirm good bot identity by clustering a set of IPs.
  • Reverse DNS validator – performs a reverse DNS check to determine whether or not a bots IP address matches its alleged hostname.
  • ASN Block validator – similar to rDNS check, but performed on ASN block.
  • Downloader validator – collects good bot IPs from either text files or HTML pages hosted on bot owner sites.
  • Machine learning validator – uses an unsupervised learning algorithm, clustering good bot IPs which are not possible to validate through other means.
  • Bots Directory – a database with UI that stores and manages bots that pass through the Cloudflare network.
Cloudflare Bot Management: machine learning and more
Bots directory UI sample‌‌

Using multiple validation methods listed above, the Verified Bots detection mechanism identifies hundreds of unique good bot identities, belonging to different companies and categories.

JS fingerprinting

When it comes to Bot Management detection quality it’s all about the signal quality and quantity. All previously described detections use request attributes sent over the network and analyzed on the server side using different techniques. Are there more signals available, which can be extracted from the client to improve our detections?

As a matter of fact there are plenty, as every browser has unique implementation quirks. Every web browser graphics output such as canvas depends on multiple layers such as hardware (GPU) and software (drivers, operating system rendering). This highly unique output allows precise differentiation between different browser/device types. Moreover, this is achievable without sacrificing website visitor privacy as it’s not a supercookie, and it cannot be used to track and identify individual users, but only to confirm that request’s user agent matches other telemetry gathered through browser canvas API.

This detection mechanism is implemented as a challenge-response system with challenge injected into the webpage on Cloudflare’s edge. The challenge is then rendered in the background using provided graphic instructions and the result sent back to Cloudflare for validation and further action such as  producing the score. There is a lot going on behind the scenes to make sure we get reliable results without sacrificing users’ privacy while being tamper resistant to replay attacks. The system is currently in private beta and being evaluated for its effectiveness and we already see very promising results. Stay tuned for this new detection mechanism becoming widely available and the blog on how we’ve built it.

This concludes an overview of the five detection mechanisms we’ve built so far. It’s time to sum it all up!

Summary

Cloudflare has the unique ability to collect data from trillions of requests flowing through its network every week. With this data, Cloudflare is able to identify likely bot activity with Machine Learning, Heuristics, Behavioral Analysis, and other detection mechanisms. Cloudflare Bot Management integrates seamlessly with other Cloudflare products, such as WAF  and Workers.

Cloudflare Bot Management: machine learning and more

All this could not be possible without hard work across multiple teams! First of all thanks to everybody on the Bots Team for their tremendous efforts to make this platform come to life. Other Cloudflare teams, most notably: Firewall, Data, Solutions Engineering, Performance, SRE, helped us a lot to design, build and support this incredible platform.

Cloudflare Bot Management: machine learning and more
Bots team during Austin team summit 2019 hunting bots with axes 🙂

Lastly, there are more blogs from the Bots series coming soon, diving into internals of our detection mechanisms, so stay tuned for more exciting stories about Cloudflare Bot Management!

Stream Firewall Events directly to your SIEM

Post Syndicated from Patrick R. Donahue original https://blog.cloudflare.com/stream-firewall-events-directly-to-your-siem/

Stream Firewall Events directly to your SIEM

Stream Firewall Events directly to your SIEM

The highest trafficked sites using Cloudflare receive billions of requests per day. But only about 5% of those requests typically trigger security rules, whether they be “managed” rules such as our WAF and DDoS protections, or custom rules such as those configured by customers using our powerful Firewall Rules and Rate Limiting engines.

When enforcement is taken on a request that interrupts the flow of malicious traffic, a Firewall Event is logged with detail about the request including which rule triggered us to take action and what action we took, e.g., challenged or blocked outright.

Previously, if you wanted to ingest all of these events into your SIEM or logging platform, you had to take the whole firehose of requests—good and bad—and then filter them client side. If you’re paying by the log line or scaling your own storage solution, this cost can add up quickly. And if you have a security team monitoring logs, they’re being sent a lot of extraneous data to sift through before determining what needs their attention most.

As of today, customers using Cloudflare Logs can create Logpush jobs that send only Firewall Events. These events arrive much faster than our existing HTTP requests logs: they are typically delivered to your logging platform within 60 seconds of sending the response to the client.

In this post we’ll show you how to use Terraform and Sumo Logic, an analytics integration partner, to get this logging set up live in just a few minutes.

Process overview

The steps below take you through the process of configuring Cloudflare Logs to push security events directly to your logging platform. For purposes of this tutorial, we’ve chosen Sumo Logic as our log destination, but you’re free to use any of our analytics partners, or any logging platform that can read from cloud storage such as AWS S3, Azure Blob Storage, or Google Cloud Storage.

To configure Sumo Logic and Cloudflare we make use of Terraform, a popular Infrastructure-as-Code tool from HashiCorp. If you’re new to Terraform, see Getting started with Terraform and Cloudflare for a guided walkthrough with best practice recommendations such as how to version and store your configuration in git for easy rollback.

Once the infrastructure is in place, you’ll send a malicious request towards your site to trigger the Cloudflare Web Application Firewall, and watch as the Firewall Events generated by that request shows up in Sumo Logic about a minute later.

Stream Firewall Events directly to your SIEM

Prerequisites

Install Terraform and Go

First you’ll need to install Terraform. See our Developer Docs for instructions.

Next you’ll need to install Go. The easiest way on macOS to do so is with Homebrew:

$ brew install golang
$ export GOPATH=$HOME/go
$ mkdir $GOPATH

Go is required because the Sumo Logic Terraform Provider is a “community” plugin, which means it has to be built and installed manually rather than automatically through the Terraform Registry, as will happen later for the Cloudflare Terraform Provider.

Install the Sumo Logic Terraform Provider Module

The official installation instructions for installing the Sumo Logic provider can be found on their GitHub Project page, but here are my notes:

$ mkdir -p $GOPATH/src/github.com/terraform-providers && cd $_
$ git clone https://github.com/SumoLogic/sumologic-terraform-provider.git
$ cd sumologic-terraform-provider
$ make install

Prepare Sumo Logic to receive Cloudflare Logs

Install Sumo Logic livetail utility

While not strictly necessary, the livetail tool from Sumo Logic makes it easy to grab the Cloudflare Logs challenge token we’ll need in a minute, and also to view the fruits of your labor: seeing a Firewall Event appear in Sumo Logic shortly after the malicious request hit the edge.

On macOS:

$ brew cask install livetail
...
==> Verifying SHA-256 checksum for Cask 'livetail'.
==> Installing Cask livetail
==> Linking Binary 'livetail' to '/usr/local/bin/livetail'.
🍺  livetail was successfully installed!

Generate Sumo Logic Access Key

This step assumes you already have a Sumo Logic account. If not, you can sign up for a free trial here.

  1. Browse to https://service.$ENV.sumologic.com/ui/#/security/access-keys where $ENV should be replaced by the environment you chose on signup.
  2. Click the “+ Add Access Key” button, give it a name, and click “Create Key”
  3. In the next step you’ll save the Access ID and Access Key that are provided as environment variables, so don’t close this modal until you do.

Generate Cloudflare Scoped API Token

  1. Log in to the Cloudflare Dashboard
  2. Click on the profile icon in the top-right corner and then select “My Profile”
  3. Select “API Tokens” from the nav bar and click “Create Token”
  4. Click the “Get started” button next to the “Create Custom Token” label

On the Create Custom Token screen:

  1. Provide a token name, e.g., “Logpush – Firewall Events”
  2. Under Permissions, change Account to Zone, and then select Logs and Edit, respectively, in the two drop-downs to the right
  3. Optionally, change Zone Resources and IP Address Filtering to restrict restrict access for this token to specific zones or from specific IPs

Click “Continue to summary” and then “Create token” on the next screen. Save the token somewhere secure, e.g., your password manager, as it’ll be needed in just a minute.

Set environment variables

Rather than add sensitive credentials to source files (that may get submitted to your source code repository), we’ll set environment variables and have the Terraform modules read from them.

$ export CLOUDFLARE_API_TOKEN="<your scoped cloudflare API token>"
$ export CF_ZONE_ID="<tag of zone you wish to send logs for>"

We’ll also need your Sumo Logic environment, Access ID, and Access Key:

$ export SUMOLOGIC_ENVIRONMENT="eu"
$ export SUMOLOGIC_ACCESSID="<access id from previous step>"
$ export SUMOLOGIC_ACCESSKEY="<access key from previous step>"

Create the Sumo Logic Collector and HTTP Source

We’ll create a directory to store our Terraform project in and build it up as we go:

$ mkdir -p ~/src/fwevents && cd $_

Then we’ll create the Collector and HTTP source that will store and provide Firewall Events logs to Sumo Logic:

$ cat <<'EOF' | tee main.tf
##################
### SUMO LOGIC ###
##################
provider "sumologic" {
    environment = var.sumo_environment
    access_id = var.sumo_access_id
}

resource "sumologic_collector" "collector" {
    name = "CloudflareLogCollector"
    timezone = "Etc/UTC"
}

resource "sumologic_http_source" "http_source" {
    name = "firewall-events-source"
    collector_id = sumologic_collector.collector.id
    timezone = "Etc/UTC"
}
EOF

Then we’ll create a variables file so Terraform has credentials to communicate with Sumo Logic:

$ cat <<EOF | tee variables.tf
##################
### SUMO LOGIC ###
##################
variable "sumo_environment" {
    default = "$SUMOLOGIC_ENVIRONMENT"
}

variable "sumo_access_id" {
    default = "$SUMOLOGIC_ACCESSID"
}
EOF

With our Sumo Logic configuration set, we’ll initialize Terraform with terraform init and then preview what changes Terraform is going to make by running terraform plan:

$ terraform init

Initializing the backend...

Initializing provider plugins...

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.


------------------------------------------------------------------------

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # sumologic_collector.collector will be created
  + resource "sumologic_collector" "collector" {
      + destroy        = true
      + id             = (known after apply)
      + lookup_by_name = false
      + name           = "CloudflareLogCollector"
      + timezone       = "Etc/UTC"
    }

  # sumologic_http_source.http_source will be created
  + resource "sumologic_http_source" "http_source" {
      + automatic_date_parsing       = true
      + collector_id                 = (known after apply)
      + cutoff_timestamp             = 0
      + destroy                      = true
      + force_timezone               = false
      + id                           = (known after apply)
      + lookup_by_name               = false
      + message_per_request          = false
      + multiline_processing_enabled = true
      + name                         = "firewall-events-source"
      + timezone                     = "Etc/UTC"
      + url                          = (known after apply)
      + use_autoline_matching        = true
    }

Plan: 2 to add, 0 to change, 0 to destroy.

------------------------------------------------------------------------

Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.

Assuming everything looks good, let’s execute the plan:

$ terraform apply -auto-approve
sumologic_collector.collector: Creating...
sumologic_collector.collector: Creation complete after 3s [id=108448215]
sumologic_http_source.http_source: Creating...
sumologic_http_source.http_source: Creation complete after 0s [id=150364538]

Apply complete! Resources: 2 added, 0 changed, 0 destroyed.

Success! At this point you could log into the Sumo Logic web interface and confirm that your Collector and HTTP Source were created successfully.

Create a Cloudflare Logpush Job

Before we’ll start sending logs to your collector, you need to demonstrate the ability to read from it. This validation step prevents accidental (or intentional) misconfigurations from overrunning your logs.

Tail the Sumo Logic Collector and await the challenge token

In a new shell window—you should keep the current one with your environment variables set for use with Terraform—we’ll start tailing Sumo Logic for events sent from the firewall-events-source HTTP source.

The first time that you run livetail you’ll need to specify your Sumo Logic Environment, Access ID and Access Key, but these values will be stored in the working directory for subsequent runs:

$ livetail _source=firewall-events-source
### Welcome to Sumo Logic Live Tail Command Line Interface ###
1 US1
2 US2
3 EU
4 AU
5 DE
6 FED
7 JP
8 CA
Please select Sumo Logic environment: 
See http://help.sumologic.com/Send_Data/Collector_Management_API/Sumo_Logic_Endpoints to choose the correct environment. 3
### Authenticating ###
Please enter your Access ID: <access id>
Please enter your Access Key <access key>
### Starting Live Tail session ###

Request and receive challenge token

Before requesting a challenge token, we need to figure out where Cloudflare should send logs.

We do this by asking Terraform for the receiver URL of the recently created HTTP source. Note that we modify the URL returned slightly as Cloudflare Logs expects sumo:// rather than https://.

$ export SUMO_RECEIVER_URL=$(terraform state show sumologic_http_source.http_source | grep url | awk '{print $3}' | sed -e 's/https:/sumo:/; s/"//g')

$ echo $SUMO_RECEIVER_URL
sumo://endpoint1.collection.eu.sumologic.com/receiver/v1/http/<redacted>

With URL in hand, we can now request the token.

$ curl -sXPOST -H "Content-Type: application/json" -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" -d '{"destination_conf":"'''$SUMO_RECEIVER_URL'''"}' https://api.cloudflare.com/client/v4/zones/$CF_ZONE_ID/logpush/ownership

{"errors":[],"messages":[],"result":{"filename":"ownership-challenge-bb2912e0.txt","message":"","valid":true},"success":true}

Back in the other window where your livetail is running you should see something like this:

{"content":"eyJhbGciOiJkaXIiLCJlbmMiOiJBMTI4R0NNIiwidHlwIjoiSldUIn0..WQhkW_EfxVy8p0BQ.oO6YEvfYFMHCTEd6D8MbmyjJqcrASDLRvHFTbZ5yUTMqBf1oniPNzo9Mn3ZzgTdayKg_jk0Gg-mBpdeqNI8LJFtUzzgTGU-aN1-haQlzmHVksEQdqawX7EZu2yiePT5QVk8RUsMRgloa76WANQbKghx1yivTZ3TGj8WquZELgnsiiQSvHqdFjAsiUJ0g73L962rDMJPG91cHuDqgfXWwSUqPsjVk88pmvGEEH4AMdKIol0EOc-7JIAWFBhcqmnv0uAXVOH5uXHHe_YNZ8PNLfYZXkw1xQlVDwH52wRC93ohIxg.pHAeaOGC8ALwLOXqxpXJgQ","filename":"ownership-challenge-bb2912e0.txt"}

Copy the content value from above into an environment variable, as you’ll need it in a minute to create the job:

$ export LOGPUSH_CHALLENGE_TOKEN="<content value>"

Create the Logpush job using the challenge token

With challenge token in hand, we’ll use Terraform to create the job.

First you’ll want to choose the log fields that should be sent to Sumo Logic. You can enumerate the list by querying the dataset:

$ curl -sXGET -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" https://api.cloudflare.com/client/v4/zones/$CF_ZONE_ID/logpush/datasets/firewall_events/fields | jq .
{
  "errors": [],
  "messages": [],
  "result": {
    "Action": "string; the code of the first-class action the Cloudflare Firewall took on this request",
    "ClientASN": "int; the ASN number of the visitor",
    "ClientASNDescription": "string; the ASN of the visitor as string",
    "ClientCountryName": "string; country from which request originated",
    "ClientIP": "string; the visitor's IP address (IPv4 or IPv6)",
    "ClientIPClass": "string; the classification of the visitor's IP address, possible values are: unknown | clean | badHost | searchEngine | whitelist | greylist | monitoringService | securityScanner | noRecord | scan | backupService | mobilePlatform | tor",
    "ClientRefererHost": "string; the referer host",
    "ClientRefererPath": "string; the referer path requested by visitor",
    "ClientRefererQuery": "string; the referer query-string was requested by the visitor",
    "ClientRefererScheme": "string; the referer url scheme requested by the visitor",
    "ClientRequestHTTPHost": "string; the HTTP hostname requested by the visitor",
    "ClientRequestHTTPMethodName": "string; the HTTP method used by the visitor",
    "ClientRequestHTTPProtocol": "string; the version of HTTP protocol requested by the visitor",
    "ClientRequestPath": "string; the path requested by visitor",
    "ClientRequestQuery": "string; the query-string was requested by the visitor",
    "ClientRequestScheme": "string; the url scheme requested by the visitor",
    "Datetime": "int or string; the date and time the event occurred at the edge",
    "EdgeColoName": "string; the airport code of the Cloudflare datacenter that served this request",
    "EdgeResponseStatus": "int; HTTP response status code returned to browser",
    "Kind": "string; the kind of event, currently only possible values are: firewall",
    "MatchIndex": "int; rules match index in the chain",
    "Metadata": "object; additional product-specific information. Metadata is organized in key:value pairs. Key and Value formats can vary by Cloudflare security product and can change over time",
    "OriginResponseStatus": "int; HTTP origin response status code returned to browser",
    "OriginatorRayName": "string; the RayId of the request that issued the challenge/jschallenge",
    "RayName": "string; the RayId of the request",
    "RuleId": "string; the Cloudflare security product-specific RuleId triggered by this request",
    "Source": "string; the Cloudflare security product triggered by this request",
    "UserAgent": "string; visitor's user-agent string"
  },
  "success": true
}

Then you’ll append your Cloudflare configuration to the main.tf file:

$ cat <<EOF | tee -a main.tf

##################
### CLOUDFLARE ###
##################
provider "cloudflare" {
  version = "~> 2.0"
}

resource "cloudflare_logpush_job" "firewall_events_job" {
  name = "fwevents-logpush-job"
  zone_id = var.cf_zone_id
  enabled = true
  dataset = "firewall_events"
  logpull_options = "fields=RayName,Source,RuleId,Action,EdgeResponseStatusDatetime,EdgeColoName,ClientIP,ClientCountryName,ClientASNDescription,UserAgent,ClientRequestHTTPMethodName,ClientRequestHTTPHost,ClientRequestHTTPPath&timestamps=rfc3339"
  destination_conf = replace(sumologic_http_source.http_source.url,"https:","sumo:")
  ownership_challenge = "$LOGPUSH_CHALLENGE_TOKEN"
}
EOF

And add to the variables.tf file:

$ cat <<EOF | tee -a variables.tf

##################
### CLOUDFLARE ###
##################
variable "cf_zone_id" {
  default = "$CF_ZONE_ID"
}

Next we re-run terraform init to install the latest Cloudflare Terraform Provider Module. You’ll need to make sure you have at least version 2.6.0 as this is the version in which we added Logpush job support:

$ terraform init

Initializing the backend...

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "cloudflare" (terraform-providers/cloudflare) 2.6.0...

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

With the latest Terraform installed, we check out the plan and then apply:

$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.

sumologic_collector.collector: Refreshing state... [id=108448215]
sumologic_http_source.http_source: Refreshing state... [id=150364538]

------------------------------------------------------------------------

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # cloudflare_logpush_job.firewall_events_job will be created
  + resource "cloudflare_logpush_job" "firewall_events_job" {
      + dataset             = "firewall_events"
      + destination_conf    = "sumo://endpoint1.collection.eu.sumologic.com/receiver/v1/http/(redacted)"
      + enabled             = true
      + id                  = (known after apply)
      + logpull_options     = "fields=RayName,Source,RuleId,Action,EdgeResponseStatusDatetime,EdgeColoName,ClientIP,ClientCountryName,ClientASNDescription,UserAgent,ClientRequestHTTPMethodName,ClientRequestHTTPHost,ClientRequestHTTPPath&timestamps=rfc3339"
      + name                = "fwevents-logpush-job"
      + ownership_challenge = "(redacted)"
      + zone_id             = "(redacted)"
    }

Plan: 1 to add, 0 to change, 0 to destroy.

------------------------------------------------------------------------

Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.
$ terraform apply --auto-approve
sumologic_collector.collector: Refreshing state... [id=108448215]
sumologic_http_source.http_source: Refreshing state... [id=150364538]
cloudflare_logpush_job.firewall_events_job: Creating...
cloudflare_logpush_job.firewall_events_job: Creation complete after 3s [id=13746]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.

Success! Last step is to test your setup.

Testing your setup by sending a malicious request

The following step assumes that you have the Cloudflare WAF turned on. Alternatively, you can create a Firewall Rule to match your request and generate a Firewall Event that way.

First make sure that livetail is running as described earlier:

$ livetail "_source=firewall-events-source"
### Authenticating ###
### Starting Live Tail session ###

Then in a browser make the following request https://example.com/<script>alert()</script>. You should see the following returned:

Stream Firewall Events directly to your SIEM

And a few moments later in livetail:

{"RayName":"58830d3f9945bc36","Source":"waf","RuleId":"958052","Action":"log","EdgeColoName":"LHR","ClientIP":"203.0.113.69","ClientCountryName":"gb","ClientASNDescription":"NTL","UserAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36","ClientRequestHTTPMethodName":"GET","ClientRequestHTTPHost":"upinatoms.com"}
{"RayName":"58830d3f9945bc36","Source":"waf","RuleId":"958051","Action":"log","EdgeColoName":"LHR","ClientIP":"203.0.113.69","ClientCountryName":"gb","ClientASNDescription":"NTL","UserAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36","ClientRequestHTTPMethodName":"GET","ClientRequestHTTPHost":"upinatoms.com"}
{"RayName":"58830d3f9945bc36","Source":"waf","RuleId":"973300","Action":"log","EdgeColoName":"LHR","ClientIP":"203.0.113.69","ClientCountryName":"gb","ClientASNDescription":"NTL","UserAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36","ClientRequestHTTPMethodName":"GET","ClientRequestHTTPHost":"upinatoms.com"}
{"RayName":"58830d3f9945bc36","Source":"waf","RuleId":"973307","Action":"log","EdgeColoName":"LHR","ClientIP":"203.0.113.69","ClientCountryName":"gb","ClientASNDescription":"NTL","UserAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36","ClientRequestHTTPMethodName":"GET","ClientRequestHTTPHost":"upinatoms.com"}
{"RayName":"58830d3f9945bc36","Source":"waf","RuleId":"973331","Action":"log","EdgeColoName":"LHR","ClientIP":"203.0.113.69","ClientCountryName":"gb","ClientASNDescription":"NTL","UserAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36","ClientRequestHTTPMethodName":"GET","ClientRequestHTTPHost":"upinatoms.com"}
{"RayName":"58830d3f9945bc36","Source":"waf","RuleId":"981176","Action":"drop","EdgeColoName":"LHR","ClientIP":"203.0.113.69","ClientCountryName":"gb","ClientASNDescription":"NTL","UserAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36","ClientRequestHTTPMethodName":"GET","ClientRequestHTTPHost":"upinatoms.com"}

Note that for this one malicious request Cloudflare Logs actually sent 6 separate Firewall Events to Sumo Logic. The reason for this is that this specific request triggered a variety of different Managed Rules: #958051, 958052, 973300, 973307, 973331, and 981176.

Seeing it all in action

Here’s a demo of  launching livetail, making a malicious request in a browser, and then seeing the result sent from the Cloudflare Logpush job:

Stream Firewall Events directly to your SIEM

Firewall Analytics: Now available to all paid plans

Post Syndicated from Alex Cruz Farmer original https://blog.cloudflare.com/updates-to-firewall-analytics/

Firewall Analytics: Now available to all paid plans

Firewall Analytics: Now available to all paid plans

Our Firewall Analytics tool enables customers to quickly identify and investigate security threats using an intuitive interface. Until now, this tool had only been available to our Enterprise customers, who have been using it to get detailed insights into their traffic and better tailor their security configurations. Today, we are excited to make Firewall Analytics available to all paid plans and share details on several recent improvements we have made.

All paid plans are now able to take advantage of these capabilities, along with several important enhancements we’ve made to improve our customers’ workflow and productivity.

Firewall Analytics: Now available to all paid plans

Increased Data Retention and Adaptive Sampling

Previously, Enterprise customers could view 14 days of Firewall Analytics for their domains. Today we’re increasing that retention to 30 days, and again to 90 days in the coming months. Business and Professional plan zones will get 30 and 3 days of retention, respectively.

In addition to the extended retention, we are introducing adaptive sampling to guarantee that Firewall Analytics results are displayed in the Cloudflare Dashboard quickly and reliably, even when you are under a massive attack or otherwise receiving a large volume of requests.

Adaptive sampling works similar to Netflix: when your internet connection runs low on bandwidth, you receive a slightly downscaled version of the video stream you are watching. When your bandwidth recovers, Netflix then upscales back to the highest quality available.

Firewall Analytics does this sampling on each query, ensuring that customers see the best precision available in the UI given current load on the zone. When results are sampled, the sampling rate will be displayed as shown below:

Firewall Analytics: Now available to all paid plans

Event-Based Logging

As adoption of our expressive Firewall Rules engine has grown, one consistent ask we’ve heard from customers is for a more streamlined way to see all Firewall Events generated by a specific rule. Until today, if a malicious request matched multiple rules, only the last one to execute was shown in the Activity Log, requiring customers to click into the request to see if the rule they’re investigating was listed as an “Additional match”.

To streamline this process, we’ve changed how the Firewall Analytics UI interacts with the Activity Log. Customers can now filter by a specific rule (or any other criteria) and see a row for each event generated by that rule. This change also makes it easier to review all requests that would have been blocked by a rule by creating it in Log mode first before changing it to Block.

Firewall Analytics: Now available to all paid plans

Challenge Solve Rates to help reduce False Positives

When our customers write rules to block undesired, automated traffic they want to make sure they’re not blocking or challenging desired traffic, e.g., humans wanting to make a purchase should be allowed but not bots scraping pricing.

To help customers determine what percent of CAPTCHA challenges returned to users may have been unnecessary, i.e., false positives, we are now showing the Challenge Solve Rate (CSR) for each rule. If you’re seeing rates higher than expected, e.g., for your Bot Management rules, you may want to relax the rule criteria. If the rate you see is 0% indicating that no CAPTCHAs are being solved, you may want to change the rule to Block outright rather than challenge.

Firewall Analytics: Now available to all paid plans

Hovering over the CSR rate will reveal the number of CAPTCHAs issued vs. solved:

Firewall Analytics: Now available to all paid plans

Exporting Firewall Events

Business and Enterprise customers can now export a set of 500 events from the Activity Log. The data exported are those events that remain after any selected filters have been applied.

Firewall Analytics: Now available to all paid plans

Column Customization

Sometimes the columns shown in the Activity Log do not contain the details you want to see to analyze the threat. When this happens, you can now click “Edit Columns” to select the fields you want to see. For example, a customer diagnosing a Bot related issue may want to also view the User-Agent and the source country whereas a customer investigating a DDoS attack may want to see IP addresses, ASNs, Path, and other attributes. You can now customize what you’d like to see as shown below.

Firewall Analytics: Now available to all paid plans

We would love to hear your feedback and suggestions, so feel free to reach out to us via our Community forums or through your Customer Success team.

If you’d like to receive more updates like this one directly to your inbox, please subscribe to our Blog!

Adding a Hardware Backdoor to a Networked Computer

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2019/10/adding_a_hardwa.html

Interesting proof of concept:

At the CS3sthlm security conference later this month, security researcher Monta Elkins will show how he created a proof-of-concept version of that hardware hack in his basement. He intends to demonstrate just how easily spies, criminals, or saboteurs with even minimal skills, working on a shoestring budget, can plant a chip in enterprise IT equipment to offer themselves stealthy backdoor access…. With only a $150 hot-air soldering tool, a $40 microscope, and some $2 chips ordered online, Elkins was able to alter a Cisco firewall in a way that he says most IT admins likely wouldn’t notice, yet would give a remote attacker deep control.

Supercharging Firewall Events for Self-Serve

Post Syndicated from Alex Cruz Farmer original https://blog.cloudflare.com/supercharging-firewall-events-for-self-serve/

Supercharging Firewall Events for Self-Serve

Today, I’m very pleased to announce the release of a completely overhauled version of our Firewall Event log to our Free, Pro and Business customers. This new Firewall Events log is now available in your Dashboard, and you are not required to do anything to receive this new capability.

Supercharging Firewall Events for Self-Serve

No more modals!

We have done away with those pesky modals, providing a much smoother user experience. To review more detailed information about an event, you simply click anywhere on the event list row.

Supercharging Firewall Events for Self-Serve

In the expanded view, you are provided with all the information you may need to identify or diagnose issues with your Firewall or find more details about a potential threat to your application.

Additional matches per event

Cloudflare has several Firewall features to give customers granular control of their security. With this control comes some complexity when debugging why a request was stopped by the Firewall. To help clarify what happened, we have provided an “Additional matches” count at the bottom for events triggered by multiple services or rules for the same request. Clicking the number expands a list showing each rule and service along with the corresponding action.

Supercharging Firewall Events for Self-Serve

Search for any field within a Firewall Event

This is one of my favourite parts of our new Firewall Event Log. Many of our customers have expressed their frustration with the difficulty of pinpointing specific events. This is where our new search capabilities come into their own. Customers can now filter and freeform search for any field that is visible in a Firewall Event!

Let’s say you want to find all the requests originating from a specific ISP or country where your Firewall Rules issued a JavaScript challenge. There are two different ways to do this in the UI.

Firstly, when in the detail view, you can create an include or exclude filter for that field value.

Supercharging Firewall Events for Self-Serve

Secondly, you can create a freeform filter using the “+ Add Filter” button at the top, or edit one of the already filtered fields:

Supercharging Firewall Events for Self-Serve

As illustrated above, with our WAF Managed Rules enabled in log only, we can see all the rules which would have triggered if this was a legitimate attack. This allows you to confirm that your configuration is working as expected.

Scoping your search to a specific date and time

In our old Firewall Event Log, to find an event, users had to traverse through many pages to find Events from a specific date. The last major change we have added is the capability to select a time window to view events between two points in time over the last 2 weeks. In the time selection window, Free and Pro customers can choose a 24 hour time window and our Business customers can view up to 72 hours.

Supercharging Firewall Events for Self-Serve

We want your feedback!

We need your help! Please feel free to leave any feedback on our Community forums, or open a Support ticket with any problems you find. Your feedback is critical to our product improvement process, and we look forward to hearing from you.

Protecting Project Galileo websites from HTTP attacks

Post Syndicated from Maxime Guerreiro original https://blog.cloudflare.com/protecting-galileo-websites/

Protecting Project Galileo websites from HTTP attacks

Yesterday, we celebrated the fifth anniversary of Project Galileo. More than 550 websites are part of this program, and they have something in common: each and every one of them has been subject to attacks in the last month. In this blog post, we will look at the security events we observed between the 23 April 2019 and 23 May 2019.

Project Galileo sites are protected by the Cloudflare Firewall and Advanced DDoS Protection which contain a number of features that can be used to detect and mitigate different types of attack and suspicious traffic. The following table shows how each of these features contributed to the protection of sites on Project Galileo.

Firewall Feature

Requests Mitigated

Distinct originating IPs

Sites Affected (approx.)

Firewall
Rules

78.7M

396.5K

~ 30

Security
Level

41.7M

1.8M

~ 520

Access
Rules

24.0M

386.9K

~ 200

Browser
Integrity Check

9.4M

32.2K

~ 500

WAF

4.5M

163.8K

~ 200

User-Agent
Blocking

2.3M

1.3K

~ 15

Hotlink
Protection

2.0M

686.7K

~ 40

HTTP
DoS

1.6M

360

1

Rate
Limit

623.5K

6.6K

~ 15

Zone
Lockdown

9.7K

2.8K

~ 10

WAF (Web Application Firewall)

Although not the most impressive in terms of blocked requests, the WAF is the most interesting as it identifies and blocks malicious requests, based on heuristics and rules that are the result of seeing attacks across all of our customers and learning from those. The WAF is available to all of our paying customers, protecting them against 0-days, SQL/XSS exploits and more. For the Project Galileo customers the WAF rules blocked more than 4.5 million requests in the month that we looked at, matching over 130 WAF rules and approximately 150k requests per day.

Protecting Project Galileo websites from HTTP attacks
Heat map showing the attacks seen on customer sites (rows) per day (columns)

This heat map may initially appear confusing but reading one is easy once you know what to expect so bear with us! It is a table where each line is a website on Project Galileo and each column is a day. The color represents the number of requests triggering WAF rules – on a scale from 0 (white) to a lot (dark red). The darker the cell, the more requests were blocked on this day.

We observe malicious traffic on a daily basis for most websites we protect. The average Project Galileo site saw malicious traffic for 27 days in the 1 month observed, and for almost 60% of the sites we noticed daily events.

Fortunately, the vast majority of websites only receive a few malicious requests per day, likely from automated scanners. In some cases, we notice a net increase in attacks against some websites – and a few websites are under a constant influx of attacks.

Protecting Project Galileo websites from HTTP attacks
Heat map showing the attacks blocked for each WAF rule (rows) per day (columns)

This heat map shows the WAF rules that blocked requests by day. At first, it seems some rules are useless as they never match malicious requests, but this plot makes it obvious that some attack vectors become active all of a sudden (isolated dark cells). This is especially true for 0-days, malicious traffic starts once an exploit is published and is very active on the first few days. The dark active lines are the most common malicious requests, and these WAF rules protect against things like XSS and SQL injection attacks.

DoS (Denial of Service)

A DoS attack prevents legitimate visitors from accessing a website by flooding it with bad traffic.  Due to the way Cloudflare works, websites protected by Cloudflare are immune to many DoS vectors, out of the box. We block layer 3 and 4 attacks, which includes SYN floods and UDP amplifications. DNS nameservers, often described as the Internet’s phone book, are fully managed by Cloudflare, and protected – visitors know how to reach the websites.

Protecting Project Galileo websites from HTTP attacks
Line plot – requests per second to a website under DoS attack

Can you spot the attack?

As for layer 7 attacks (for instance, HTTP floods), we rely on Gatebot, an automated tool to detect, analyse and block DoS attacks, so you can sleep. The graph shows the requests per second we received on a zone, and whether or not it reached the origin server. As you can see, the bad traffic was identified automatically by Gatebot, and more than 1.6 million requests were blocked as a result.

Firewall Rules

For websites with specific requirements we provide tools to allow customers to block traffic to precisely fit their needs. Customers can easily implement complex logic using Firewall Rules to filter out specific chunks of traffic, block IPs / Networks / Countries using Access Rules and Project Galileo sites have done just that. Let’s see a few examples.

Firewall Rules allows website owners to challenge or block as much or as little traffic as they desire, and this can be done as a surgical tool “block just this request” or as a general tool “challenge every request”.

For instance, a well-known website used Firewall Rules to prevent twenty IPs from fetching specific pages. 3 of these IPs were then used to send a total of 4.5 million requests over a short period of time, and the following chart shows the requests seen for this website. When this happened Cloudflare, mitigated the traffic ensuring that the website remains available.

Protecting Project Galileo websites from HTTP attacks
Cumulative line plot. Requests per second to a website

Another website, built with WordPress, is using Cloudflare to cache their webpages. As POST requests are not cacheable, they always hit the origin machine and increase load on the origin server – that’s why this website is using firewall rules to block POST requests, except on their administration backend. Smart!

Website owners can also deny or challenge requests based on the visitor’s IP address, Autonomous System Number (ASN) or Country. Dubbed Access Rules, it is enforced on all pages of a website – hassle-free.

For example, a news website is using Cloudflare’s Access Rules to challenge visitors from countries outside of their geographic region who are accessing their website. We enforce the rules globally even for cached resources, and take care of GeoIP database updates for them, so they don’t have to.

The Zone Lockdown utility restricts a specific URL to specific IP addresses. This is useful to protect an internal but public path being accessed by external IP addresses. A non-profit based in the United Kingdom is using Zone Lockdown to restrict access to their WordPress’ admin panel and login page, hardening their website without relying on non official plugins. Although it does not prevent very sophisticated attacks, it shields them against automated attacks and phishing attempts – as even if their credentials are stolen, they can’t be used as easily.

Rate Limiting

Cloudflare acts as a CDN, caching resources and happily serving them, reducing bandwidth used by the origin server … and indirectly the costs. Unfortunately, not all requests can be cached and some requests are very expensive to handle. Malicious users may abuse this to increase load on the server, and website owners can rely on our Rate Limit to help them: they define thresholds, expressed in requests over a time span, and we make sure to enforce this threshold. A non-profit fighting against poverty relies on rate limits to protect their donation page, and we are glad to help!

Security Level

Last but not least, one of Cloudflare’s greatest assets is our threat intelligence. With such a wide lens of the threat landscape, Cloudflare uses our Firewall data, combined with machine learning to curate our IP Reputation databases. This data is provided to all Cloudflare customers, and is configured through our Security Level feature. Customers then may define their threshold sensitivity, ranging  from Essentially Off to I’m Under Attack. For every incoming request, we ask visitors to complete a challenge if the score is above a customer defined threshold. This system alone is responsible for 25% of the requests we mitigated: it’s extremely easy to use, and it constantly learns from the other protections.

Conclusion

When taken together, the Cloudflare Firewall features provide our Project Galileo customers comprehensive and effective security that enables them to ensure their important work is available. The majority of security events were handled automatically, and this is our strength – security that is always on, always available, always learning.

[$] Bpfilter (and user-mode blobs) for 4.18

Post Syndicated from corbet original https://lwn.net/Articles/755919/rss

In February, the bpfilter mechanism was
first posted to the mailing lists. Bpfilter is meant to be a replacement
for the current in-kernel firewall/packet-filtering code. It provides
little functionality itself; instead, it creates a set of hooks that can
run BPF programs to make the packet-filtering decisions. A version of that patch set has been merged
into the net-next tree for 4.18. It will not be replacing any existing
packet filters in its current form, but it does feature a significant
change to one of its more controversial features: the new user-mode helper
mechanism.

Protecting your API using Amazon API Gateway and AWS WAF — Part I

Post Syndicated from Chris Munns original https://aws.amazon.com/blogs/compute/protecting-your-api-using-amazon-api-gateway-and-aws-waf-part-i/

This post courtesy of Thiago Morais, AWS Solutions Architect

When you build web applications or expose any data externally, you probably look for a platform where you can build highly scalable, secure, and robust REST APIs. As APIs are publicly exposed, there are a number of best practices for providing a secure mechanism to consumers using your API.

Amazon API Gateway handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including traffic management, authorization and access control, monitoring, and API version management.

In this post, I show you how to take advantage of the regional API endpoint feature in API Gateway, so that you can create your own Amazon CloudFront distribution and secure your API using AWS WAF.

AWS WAF is a web application firewall that helps protect your web applications from common web exploits that could affect application availability, compromise security, or consume excessive resources.

As you make your APIs publicly available, you are exposed to attackers trying to exploit your services in several ways. The AWS security team published a whitepaper solution using AWS WAF, How to Mitigate OWASP’s Top 10 Web Application Vulnerabilities.

Regional API endpoints

Edge-optimized APIs are endpoints that are accessed through a CloudFront distribution created and managed by API Gateway. Before the launch of regional API endpoints, this was the default option when creating APIs using API Gateway. It primarily helped to reduce latency for API consumers that were located in different geographical locations than your API.

When API requests predominantly originate from an Amazon EC2 instance or other services within the same AWS Region as the API is deployed, a regional API endpoint typically lowers the latency of connections. It is recommended for such scenarios.

For better control around caching strategies, customers can use their own CloudFront distribution for regional APIs. They also have the ability to use AWS WAF protection, as I describe in this post.

Edge-optimized API endpoint

The following diagram is an illustrated example of the edge-optimized API endpoint where your API clients access your API through a CloudFront distribution created and managed by API Gateway.

Regional API endpoint

For the regional API endpoint, your customers access your API from the same Region in which your REST API is deployed. This helps you to reduce request latency and particularly allows you to add your own content delivery network, as needed.

Walkthrough

In this section, you implement the following steps:

  • Create a regional API using the PetStore sample API.
  • Create a CloudFront distribution for the API.
  • Test the CloudFront distribution.
  • Set up AWS WAF and create a web ACL.
  • Attach the web ACL to the CloudFront distribution.
  • Test AWS WAF protection.

Create the regional API

For this walkthrough, use an existing PetStore API. All new APIs launch by default as the regional endpoint type. To change the endpoint type for your existing API, choose the cog icon on the top right corner:

After you have created the PetStore API on your account, deploy a stage called “prod” for the PetStore API.

On the API Gateway console, select the PetStore API and choose Actions, Deploy API.

For Stage name, type prod and add a stage description.

Choose Deploy and the new API stage is created.

Use the following AWS CLI command to update your API from edge-optimized to regional:

aws apigateway update-rest-api \
--rest-api-id {rest-api-id} \
--patch-operations op=replace,path=/endpointConfiguration/types/EDGE,value=REGIONAL

A successful response looks like the following:

{
    "description": "Your first API with Amazon API Gateway. This is a sample API that integrates via HTTP with your demo Pet Store endpoints", 
    "createdDate": 1511525626, 
    "endpointConfiguration": {
        "types": [
            "REGIONAL"
        ]
    }, 
    "id": "{api-id}", 
    "name": "PetStore"
}

After you change your API endpoint to regional, you can now assign your own CloudFront distribution to this API.

Create a CloudFront distribution

To make things easier, I have provided an AWS CloudFormation template to deploy a CloudFront distribution pointing to the API that you just created. Click the button to deploy the template in the us-east-1 Region.

For Stack name, enter RegionalAPI. For APIGWEndpoint, enter your API FQDN in the following format:

{api-id}.execute-api.us-east-1.amazonaws.com

After you fill out the parameters, choose Next to continue the stack deployment. It takes a couple of minutes to finish the deployment. After it finishes, the Output tab lists the following items:

  • A CloudFront domain URL
  • An S3 bucket for CloudFront access logs
Output from CloudFormation

Output from CloudFormation

Test the CloudFront distribution

To see if the CloudFront distribution was configured correctly, use a web browser and enter the URL from your distribution, with the following parameters:

https://{your-distribution-url}.cloudfront.net/{api-stage}/pets

You should get the following output:

[
  {
    "id": 1,
    "type": "dog",
    "price": 249.99
  },
  {
    "id": 2,
    "type": "cat",
    "price": 124.99
  },
  {
    "id": 3,
    "type": "fish",
    "price": 0.99
  }
]

Set up AWS WAF and create a web ACL

With the new CloudFront distribution in place, you can now start setting up AWS WAF to protect your API.

For this demo, you deploy the AWS WAF Security Automations solution, which provides fine-grained control over the requests attempting to access your API.

For more information about deployment, see Automated Deployment. If you prefer, you can launch the solution directly into your account using the following button.

For CloudFront Access Log Bucket Name, add the name of the bucket created during the deployment of the CloudFormation stack for your CloudFront distribution.

The solution allows you to adjust thresholds and also choose which automations to enable to protect your API. After you finish configuring these settings, choose Next.

To start the deployment process in your account, follow the creation wizard and choose Create. It takes a few minutes do finish the deployment. You can follow the creation process through the CloudFormation console.

After the deployment finishes, you can see the new web ACL deployed on the AWS WAF console, AWSWAFSecurityAutomations.

Attach the AWS WAF web ACL to the CloudFront distribution

With the solution deployed, you can now attach the AWS WAF web ACL to the CloudFront distribution that you created earlier.

To assign the newly created AWS WAF web ACL, go back to your CloudFront distribution. After you open your distribution for editing, choose General, Edit.

Select the new AWS WAF web ACL that you created earlier, AWSWAFSecurityAutomations.

Save the changes to your CloudFront distribution and wait for the deployment to finish.

Test AWS WAF protection

To validate the AWS WAF Web ACL setup, use Artillery to load test your API and see AWS WAF in action.

To install Artillery on your machine, run the following command:

$ npm install -g artillery

After the installation completes, you can check if Artillery installed successfully by running the following command:

$ artillery -V
$ 1.6.0-12

As the time of publication, Artillery is on version 1.6.0-12.

One of the WAF web ACL rules that you have set up is a rate-based rule. By default, it is set up to block any requesters that exceed 2000 requests under 5 minutes. Try this out.

First, use cURL to query your distribution and see the API output:

$ curl -s https://{distribution-name}.cloudfront.net/prod/pets
[
  {
    "id": 1,
    "type": "dog",
    "price": 249.99
  },
  {
    "id": 2,
    "type": "cat",
    "price": 124.99
  },
  {
    "id": 3,
    "type": "fish",
    "price": 0.99
  }
]

Based on the test above, the result looks good. But what if you max out the 2000 requests in under 5 minutes?

Run the following Artillery command:

artillery quick -n 2000 --count 10  https://{distribution-name}.cloudfront.net/prod/pets

What you are doing is firing 2000 requests to your API from 10 concurrent users. For brevity, I am not posting the Artillery output here.

After Artillery finishes its execution, try to run the cURL request again and see what happens:

 

$ curl -s https://{distribution-name}.cloudfront.net/prod/pets

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML><HEAD><META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<TITLE>ERROR: The request could not be satisfied</TITLE>
</HEAD><BODY>
<H1>ERROR</H1>
<H2>The request could not be satisfied.</H2>
<HR noshade size="1px">
Request blocked.
<BR clear="all">
<HR noshade size="1px">
<PRE>
Generated by cloudfront (CloudFront)
Request ID: [removed]
</PRE>
<ADDRESS>
</ADDRESS>
</BODY></HTML>

As you can see from the output above, the request was blocked by AWS WAF. Your IP address is removed from the blocked list after it falls below the request limit rate.

Conclusion

In this first part, you saw how to use the new API Gateway regional API endpoint together with Amazon CloudFront and AWS WAF to secure your API from a series of attacks.

In the second part, I will demonstrate some other techniques to protect your API using API keys and Amazon CloudFront custom headers.

masscan, macOS, and firewall

Post Syndicated from Robert Graham original https://blog.erratasec.com/2018/05/masscan-macos-and-firewall.html

One of the more useful features of masscan is the “–banners” check, which connects to the TCP port, sends some request, and gets a basic response back. However, since masscan has it’s own TCP stack, it’ll interfere with the operating system’s TCP stack if they are sharing the same IPv4 address. The operating system will reply with a RST packet before the TCP connection can be established.

The way to fix this is to use the built-in packet-filtering firewall to block those packets in the operating-system TCP/IP stack. The masscan program still sees everything before the packet-filter, but the operating system can’t see anything after the packet-filter.

Note that we are talking about the “packet-filter” firewall feature here. Remember that macOS, like most operating systems these days, has two separate firewalls: an application firewall and a packet-filter firewall. The application firewall is the one you see in System Settings labeled “Firewall”, and it controls things based upon the application’s identity rather than by which ports it uses. This is normally “on” by default. The packet-filter is normally “off” by default and is of little use to normal users.

Also note that macOS changed packet-filters around version 10.10.5 (“Yosemite”, October 2014). The older one is known as “ipfw“, which was the default firewall for FreeBSD (much of macOS is based on FreeBSD). The replacement is known as PF, which comes from OpenBSD. Whereas you used to use the old “ipfw” command on the command line, you now use the “pfctl” command, as well as the “/etc/pf.conf” configuration file.

What we need to filter is the source port of the packets that masscan will send, so that when replies are received, they won’t reach the operating-system stack, and just go to masscan instead. To do this, we need find a range of ports that won’t conflict with the operating system. Namely, when the operating system creates outgoing connections, it randomly chooses a source port within a certain range. We want to use masscan to use source ports in a different range.

To figure out the range macOS uses, we run the following command:

sysctl net.inet.ip.portrange.first net.inet.ip.portrange.last

On my laptop, which is probably the default for macOS, I get the following range. Sniffing with Wireshark confirms this is the range used for source ports for outgoing connections.

net.inet.ip.portrange.first: 49152
net.inet.ip.portrange.last: 65535

So this means I shouldn’t use source ports anywhere in the range 49152 to 65535. On my laptop, I’ve decided to use for masscan the ports 40000 to 41023. The range masscan uses must be a power of 2, so here I’m using 1024 (two to the tenth power).

To configure masscan, I can either type the parameter “–source-port 40000-41023” every time I run the program, or I can add the following line to /etc/masscan/masscan.conf. Remember that by default, masscan will look in that configuration file for any configuration parameters, so you don’t have to keep retyping them on the command line.

source-port = 40000-41023

Next, I need to add the following firewall rule to the bottom of /etc/pf.conf:

block in proto tcp from any to any port 40000 >< 41024

However, we aren’t done yet. By default, the packet-filter firewall is off on some versions of macOS. Therefore, every time you reboot your computer, you need to enable it. The simple way to do this is on the command line run:

pfctl -e

Or, if that doesn’t work, try:

pfctl -E

If the firewall is already running, then you’ll need to load the file explicitly (or reboot):

pfctl -f /etc/pf.conf

You can check to see if the rule is active:

pfctl -s rules

Supply-Chain Security

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2018/05/supply-chain_se.html

Earlier this month, the Pentagon stopped selling phones made by the Chinese companies ZTE and Huawei on military bases because they might be used to spy on their users.

It’s a legitimate fear, and perhaps a prudent action. But it’s just one instance of the much larger issue of securing our supply chains.

All of our computerized systems are deeply international, and we have no choice but to trust the companies and governments that touch those systems. And while we can ban a few specific products, services or companies, no country can isolate itself from potential foreign interference.

In this specific case, the Pentagon is concerned that the Chinese government demanded that ZTE and Huawei add “backdoors” to their phones that could be surreptitiously turned on by government spies or cause them to fail during some future political conflict. This tampering is possible because the software in these phones is incredibly complex. It’s relatively easy for programmers to hide these capabilities, and correspondingly difficult to detect them.

This isn’t the first time the United States has taken action against foreign software suspected to contain hidden features that can be used against us. Last December, President Trump signed into law a bill banning software from the Russian company Kaspersky from being used within the US government. In 2012, the focus was on Chinese-made Internet routers. Then, the House Intelligence Committee concluded: “Based on available classified and unclassified information, Huawei and ZTE cannot be trusted to be free of foreign state influence and thus pose a security threat to the United States and to our systems.”

Nor is the United States the only country worried about these threats. In 2014, China reportedly banned antivirus products from both Kaspersky and the US company Symantec, based on similar fears. In 2017, the Indian government identified 42 smartphone apps that China subverted. Back in 1997, the Israeli company Check Point was dogged by rumors that its government added backdoors into its products; other of that country’s tech companies have been suspected of the same thing. Even al-Qaeda was concerned; ten years ago, a sympathizer released the encryption software Mujahedeen Secrets, claimed to be free of Western influence and backdoors. If a country doesn’t trust another country, then it can’t trust that country’s computer products.

But this trust isn’t limited to the country where the company is based. We have to trust the country where the software is written — and the countries where all the components are manufactured. In 2016, researchers discovered that many different models of cheap Android phones were sending information back to China. The phones might be American-made, but the software was from China. In 2016, researchers demonstrated an even more devious technique, where a backdoor could be added at the computer chip level in the factory that made the chips ­ without the knowledge of, and undetectable by, the engineers who designed the chips in the first place. Pretty much every US technology company manufactures its hardware in countries such as Malaysia, Indonesia, China and Taiwan.

We also have to trust the programmers. Today’s large software programs are written by teams of hundreds of programmers scattered around the globe. Backdoors, put there by we-have-no-idea-who, have been discovered in Juniper firewalls and D-Link routers, both of which are US companies. In 2003, someone almost slipped a very clever backdoor into Linux. Think of how many countries’ citizens are writing software for Apple or Microsoft or Google.

We can go even farther down the rabbit hole. We have to trust the distribution systems for our hardware and software. Documents disclosed by Edward Snowden showed the National Security Agency installing backdoors into Cisco routers being shipped to the Syrian telephone company. There are fake apps in the Google Play store that eavesdrop on you. Russian hackers subverted the update mechanism of a popular brand of Ukrainian accounting software to spread the NotPetya malware.

In 2017, researchers demonstrated that a smartphone can be subverted by installing a malicious replacement screen.

I could go on. Supply-chain security is an incredibly complex problem. US-only design and manufacturing isn’t an option; the tech world is far too internationally interdependent for that. We can’t trust anyone, yet we have no choice but to trust everyone. Our phones, computers, software and cloud systems are touched by citizens of dozens of different countries, any one of whom could subvert them at the demand of their government. And just as Russia is penetrating the US power grid so they have that capability in the event of hostilities, many countries are almost certainly doing the same thing at the consumer level.

We don’t know whether the risk of Huawei and ZTE equipment is great enough to warrant the ban. We don’t know what classified intelligence the United States has, and what it implies. But we do know that this is just a minor fix for a much larger problem. It’s doubtful that this ban will have any real effect. Members of the military, and everyone else, can still buy the phones. They just can’t buy them on US military bases. And while the US might block the occasional merger or acquisition, or ban the occasional hardware or software product, we’re largely ignoring that larger issue. Solving it borders on somewhere between incredibly expensive and realistically impossible.

Perhaps someday, global norms and international treaties will render this sort of device-level tampering off-limits. But until then, all we can do is hope that this particular arms race doesn’t get too far out of control.

This essay previously appeared in the Washington Post.

AWS Online Tech Talks – May and Early June 2018

Post Syndicated from Devin Watson original https://aws.amazon.com/blogs/aws/aws-online-tech-talks-may-and-early-june-2018/

AWS Online Tech Talks – May and Early June 2018  

Join us this month to learn about some of the exciting new services and solution best practices at AWS. We also have our first re:Invent 2018 webinar series, “How to re:Invent”. Sign up now to learn more, we look forward to seeing you.

Note – All sessions are free and in Pacific Time.

Tech talks featured this month:

Analytics & Big Data

May 21, 2018 | 11:00 AM – 11:45 AM PT Integrating Amazon Elasticsearch with your DevOps Tooling – Learn how you can easily integrate Amazon Elasticsearch Service into your DevOps tooling and gain valuable insight from your log data.

May 23, 2018 | 11:00 AM – 11:45 AM PTData Warehousing and Data Lake Analytics, Together – Learn how to query data across your data warehouse and data lake without moving data.

May 24, 2018 | 11:00 AM – 11:45 AM PTData Transformation Patterns in AWS – Discover how to perform common data transformations on the AWS Data Lake.

Compute

May 29, 2018 | 01:00 PM – 01:45 PM PT – Creating and Managing a WordPress Website with Amazon Lightsail – Learn about Amazon Lightsail and how you can create, run and manage your WordPress websites with Amazon’s simple compute platform.

May 30, 2018 | 01:00 PM – 01:45 PM PTAccelerating Life Sciences with HPC on AWS – Learn how you can accelerate your Life Sciences research workloads by harnessing the power of high performance computing on AWS.

Containers

May 24, 2018 | 01:00 PM – 01:45 PM PT – Building Microservices with the 12 Factor App Pattern on AWS – Learn best practices for building containerized microservices on AWS, and how traditional software design patterns evolve in the context of containers.

Databases

May 21, 2018 | 01:00 PM – 01:45 PM PTHow to Migrate from Cassandra to Amazon DynamoDB – Get the benefits, best practices and guides on how to migrate your Cassandra databases to Amazon DynamoDB.

May 23, 2018 | 01:00 PM – 01:45 PM PT5 Hacks for Optimizing MySQL in the Cloud – Learn how to optimize your MySQL databases for high availability, performance, and disaster resilience using RDS.

DevOps

May 23, 2018 | 09:00 AM – 09:45 AM PT.NET Serverless Development on AWS – Learn how to build a modern serverless application in .NET Core 2.0.

Enterprise & Hybrid

May 22, 2018 | 11:00 AM – 11:45 AM PTHybrid Cloud Customer Use Cases on AWS – Learn how customers are leveraging AWS hybrid cloud capabilities to easily extend their datacenter capacity, deliver new services and applications, and ensure business continuity and disaster recovery.

IoT

May 31, 2018 | 11:00 AM – 11:45 AM PTUsing AWS IoT for Industrial Applications – Discover how you can quickly onboard your fleet of connected devices, keep them secure, and build predictive analytics with AWS IoT.

Machine Learning

May 22, 2018 | 09:00 AM – 09:45 AM PTUsing Apache Spark with Amazon SageMaker – Discover how to use Apache Spark with Amazon SageMaker for training jobs and application integration.

May 24, 2018 | 09:00 AM – 09:45 AM PTIntroducing AWS DeepLens – Learn how AWS DeepLens provides a new way for developers to learn machine learning by pairing the physical device with a broad set of tutorials, examples, source code, and integration with familiar AWS services.

Management Tools

May 21, 2018 | 09:00 AM – 09:45 AM PTGaining Better Observability of Your VMs with Amazon CloudWatch – Learn how CloudWatch Agent makes it easy for customers like Rackspace to monitor their VMs.

Mobile

May 29, 2018 | 11:00 AM – 11:45 AM PT – Deep Dive on Amazon Pinpoint Segmentation and Endpoint Management – See how segmentation and endpoint management with Amazon Pinpoint can help you target the right audience.

Networking

May 31, 2018 | 09:00 AM – 09:45 AM PTMaking Private Connectivity the New Norm via AWS PrivateLink – See how PrivateLink enables service owners to offer private endpoints to customers outside their company.

Security, Identity, & Compliance

May 30, 2018 | 09:00 AM – 09:45 AM PT – Introducing AWS Certificate Manager Private Certificate Authority (CA) – Learn how AWS Certificate Manager (ACM) Private Certificate Authority (CA), a managed private CA service, helps you easily and securely manage the lifecycle of your private certificates.

June 1, 2018 | 09:00 AM – 09:45 AM PTIntroducing AWS Firewall Manager – Centrally configure and manage AWS WAF rules across your accounts and applications.

Serverless

May 22, 2018 | 01:00 PM – 01:45 PM PTBuilding API-Driven Microservices with Amazon API Gateway – Learn how to build a secure, scalable API for your application in our tech talk about API-driven microservices.

Storage

May 30, 2018 | 11:00 AM – 11:45 AM PTAccelerate Productivity by Computing at the Edge – Learn how AWS Snowball Edge support for compute instances helps accelerate data transfers, execute custom applications, and reduce overall storage costs.

June 1, 2018 | 11:00 AM – 11:45 AM PTLearn to Build a Cloud-Scale Website Powered by Amazon EFS – Technical deep dive where you’ll learn tips and tricks for integrating WordPress, Drupal and Magento with Amazon EFS.

 

 

 

 

OMG The Stupid It Burns

Post Syndicated from Robert Graham original https://blog.erratasec.com/2018/04/omg-stupid-it-burns.html

This article, pointed out by @TheGrugq, is stupid enough that it’s worth rebutting.

The article starts with the question “Why did the lessons of Stuxnet, Wannacry, Heartbleed and Shamoon go unheeded?“. It then proceeds to ignore the lessons of those things.
Some of the actual lessons should be things like how Stuxnet crossed air gaps, how Wannacry spread through flat Windows networking, how Heartbleed comes from technical debt, and how Shamoon furthers state aims by causing damage.
But this article doesn’t cover the technical lessons. Instead, it thinks the lesson should be the moral lesson, that we should take these things more seriously. But that’s stupid. It’s the sort of lesson people teach you that know nothing about the topic. When you have nothing of value to contribute to a topic you can always take the moral high road and criticize everyone for being morally weak for not taking it more seriously. Obviously, since doctors haven’t cured cancer yet, it’s because they don’t take the problem seriously.
The article continues to ignore the lesson of these cyber attacks and instead regales us with a list of military lessons from WW I and WW II. This makes the same flaw that many in the military make, trying to understand cyber through analogies with the real world. It’s not that such lessons could have no value, it’s that this article contains a poor list of them. It seems to consist of a random list of events that appeal to the author rather than events that have bearing on cybersecurity.
Then, in case we don’t get the point, the article bullies us with hyperbole, cliches, buzzwords, bombastic language, famous quotes, and citations. It’s hard to see how most of them actually apply to the text. Rather, it seems like they are included simply because he really really likes them.
The article invests much effort in discussing the buzzword “OODA loop”. Most attacks in cyberspace don’t have one. Instead, attackers flail around, trying lots of random things, overcoming defense with brute-force rather than an understanding of what’s going on. That’s obviously the case with Wannacry: it was an accident, with the perpetrator experimenting with what would happen if they added the ETERNALBLUE exploit to their existing ransomware code. The consequence was beyond anybody’s ability to predict.
You might claim that this is just the first stage, that they’ll loop around, observe Wannacry’s effects, orient themselves, decide, then act upon what they learned. Nope. Wannacry burned the exploit. It’s essentially removed any vulnerable systems from the public Internet, thereby making it impossible to use what they learned. It’s still active a year later, with infected systems behind firewalls busily scanning the Internet so that if you put a new system online that’s vulnerable, it’ll be taken offline within a few hours, before any other evildoer can take advantage of it.
See what I’m doing here? Learning the actual lessons of things like Wannacry? The thing the above article fails to do??
The article has a humorous paragraph on “defense in depth”, misunderstanding the term. To be fair, it’s the cybersecurity industry’s fault: they adopted then redefined the term. That’s why there’s two separate articles on Wikipedia: one for the old military term (as used in this article) and one for the new cybersecurity term.
As used in the cybersecurity industry, “defense in depth” means having multiple layers of security. Many organizations put all their defensive efforts on the perimeter, and none inside a network. The idea of “defense in depth” is to put more defenses inside the network. For example, instead of just one firewall at the edge of the network, put firewalls inside the network to segment different subnetworks from each other, so that a ransomware infection in the customer support computers doesn’t spread to sales and marketing computers.
The article talks about exploiting WiFi chips to bypass the defense in depth measures like browser sandboxes. This is conflating different types of attacks. A WiFi attack is usually considered a local attack, from somebody next to you in bar, rather than a remote attack from a server in Russia. Moreover, far from disproving “defense in depth” such WiFi attacks highlight the need for it. Namely, phones need to be designed so that successful exploitation of other microprocessors (namely, the WiFi, Bluetooth, and cellular baseband chips) can’t directly compromise the host system. In other words, once exploited with “Broadpwn”, a hacker would need to extend the exploit chain with another vulnerability in the hosts Broadcom WiFi driver rather than immediately exploiting a DMA attack across PCIe. This suggests that if PCIe is used to interface to peripherals in the phone that an IOMMU be used, for “defense in depth”.
Cybersecurity is a young field. There are lots of useful things that outsider non-techies can teach us. Lessons from military history would be well-received.
But that’s not this story. Instead, this story is by an outsider telling us we don’t know what we are doing, that they do, and then proceeds to prove they don’t know what they are doing. Their argument is based on a moral suasion and bullying us with what appears on the surface to be intellectual rigor, but which is in fact devoid of anything smart.
My fear, here, is that I’m going to be in a meeting where somebody has read this pretentious garbage, explaining to me why “defense in depth” is wrong and how we need to OODA faster. I’d rather nip this in the bud, pointing out if you found anything interesting from that article, you are wrong.

Portspoof – Spoof All Ports Open & Emulate Valid Services

Post Syndicated from Darknet original https://www.darknet.org.uk/2018/04/portspoof-spoof-all-ports-open-emulate-valid-services/?utm_source=rss&utm_medium=social&utm_campaign=darknetfeed

Portspoof – Spoof All Ports Open & Emulate Valid Services

The primary goal of the Portspoof program is to enhance your system security through a set of new camouflage techniques which spoof all ports open and also emulate valid services on every port. As a result, any attackers port scan results will become fairly meaningless and will require hours of effort to accurately identify which ports have real services on and which do not.

The tool is meant to be a lightweight, fast, portable and secure addition to any firewall system or security system.

Read the rest of Portspoof – Spoof All Ports Open & Emulate Valid Services now! Only available at Darknet.

WannaCry after one year

Post Syndicated from Robert Graham original https://blog.erratasec.com/2018/03/wannacry-after-one-year.html

In the news, Boeing (an aircraft maker) has been “targeted by a WannaCry virus attack”. Phrased this way, it’s implausible. There are no new attacks targeting people with WannaCry. There is either no WannaCry, or it’s simply a continuation of the attack from a year ago.


It’s possible what happened is that an anti-virus product called a new virus “WannaCry”. Virus families are often related, and sometimes a distant relative gets called the same thing. I know this watching the way various anti-virus products label my own software, which isn’t a virus, but which virus writers often include with their own stuff. The Lazarus group, which is believed to be responsible for WannaCry, have whole virus families like this. Thus, just because an AV product claims you are infected with WannaCry doesn’t mean it’s the same thing that everyone else is calling WannaCry.

Famously, WannaCry was the first virus/ransomware/worm that used the NSA ETERNALBLUE exploit. Other viruses have since added the exploit, and of course, hackers use it when attacking systems. It may be that a network intrusion detection system detected ETERNALBLUE, which people then assumed was due to WannaCry. It may actually have been an nPetya infection instead (nPetya was the second major virus/worm/ransomware to use the exploit).

Or it could be the real WannaCry, but it’s probably not a new “attack” that “targets” Boeing. Instead, it’s likely a continuation from WannaCry’s first appearance. WannaCry is a worm, which means it spreads automatically after it was launched, for years, without anybody in control. Infected machines still exist, unnoticed by their owners, attacking random machines on the Internet. If you plug in an unpatched computer onto the raw Internet, without the benefit of a firewall, it’ll get infected within an hour.

However, the Boeing manufacturing systems that were infected were not on the Internet, so what happened? The narrative from the news stories imply some nefarious hacker activity that “targeted” Boeing, but that’s unlikely.

We have now have over 15 years of experience with network worms getting into strange places disconnected and even “air gapped” from the Internet. The most common reason is laptops. Somebody takes their laptop to some place like an airport WiFi network, and gets infected. They put their laptop to sleep, then wake it again when they reach their destination, and plug it into the manufacturing network. At this point, the virus spreads and infects everything. This is especially the case with maintenance/support engineers, who often have specialized software they use to control manufacturing machines, for which they have a reason to connect to the local network even if it doesn’t have useful access to the Internet. A single engineer may act as a sort of Typhoid Mary, going from customer to customer, infecting each in turn whenever they open their laptop.

Another cause for infection is virtual machines. A common practice is to take “snapshots” of live machines and save them to backups. Should the virtual machine crash, instead of rebooting it, it’s simply restored from the backed up running image. If that backup image is infected, then bringing it out of sleep will allow the worm to start spreading.

Jake Williams claims he’s seen three other manufacturing networks infected with WannaCry. Why does manufacturing seem more susceptible? The reason appears to be the “killswitch” that stops WannaCry from running elsewhere. The killswitch uses a DNS lookup, stopping itself if it can resolve a certain domain. Manufacturing networks are largely disconnected from the Internet enough that such DNS lookups don’t work, so the domain can’t be found, so the killswitch doesn’t work. Thus, manufacturing systems are no more likely to get infected, but the lack of killswitch means the virus will continue to run, attacking more systems instead of immediately killing itself.

One solution to this would be to setup sinkhole DNS servers on the network that resolve all unknown DNS queries to a single server that logs all requests. This is trivially setup with most DNS servers. The logs will quickly identify problems on the network, as well as any hacker or virus activity. The side effect is that it would make this killswitch kill WannaCry. WannaCry isn’t sufficient reason to setup sinkhole servers, of course, but it’s something I’ve found generally useful in the past.

Conclusion

Something obviously happened to the Boeing plant, but the narrative is all wrong. Words like “targeted attack” imply things that likely didn’t happen. Facts are so loose in cybersecurity that it may not have even been WannaCry.

The real story is that the original WannaCry is still out there, still trying to spread. Simply put a computer on the raw Internet (without a firewall) and you’ll get attacked. That, somehow, isn’t news. Instead, what’s news is whenever that continued infection hits somewhere famous, like Boeing, even though (as Boeing claims) it had no important effect.

What’s New in Qubes 4 (Linux Journal)

Post Syndicated from jake original https://lwn.net/Articles/748450/rss

Linux Journal has a look at Qubes 4, which is due to be released in the next month or so. It has undergone a refactoring of sorts. “Another major change in Qubes 4 relates to the GUI VM manager. In past releases, this program provided a graphical way for you to start, stop and pause VMs. It also allowed you to change all your VM settings, firewall rules and even which applications appeared in the VM’s menu. It also provided a GUI way to back up and restore VMs. With Qubes 4, a lot has changed. The ultimate goal with Qubes 4 is to replace the VM manager with standalone tools that replicate most of the original functionality.