New ways to troubleshoot Cloudflare Access ‘blocked’ messages

Post Syndicated from Kenny Johnson original https://blog.cloudflare.com/403-logs-cloudflare-access/

New ways to troubleshoot Cloudflare Access 'blocked' messages

Cloudflare Access is the industry’s easiest Zero Trust access control solution to deploy and maintain. Users can connect via Access to reach the resources and applications that power your team, all while Cloudflare’s network enforces least privilege rules and accelerates their connectivity.

Enforcing least privilege rules can lead to accidental blocks for legitimate users. Over the past year, we have focused on adding tools to make it easier for security administrators to troubleshoot why legitimate users are denied access. These block reasons were initially limited to users denied access due to information about their identity (e.g. wrong identity provider group, email address not in the Access policy, etc.)

Zero Trust access control extends beyond identity and device. Cloudflare Access allows for rules that enforce how a user connects. These rules can include their location, IP address, the presence of our Secure Web Gateway and other controls.

Starting today, you can investigate those allow or block decisions based on how a connection was made with the same level of ease that you can troubleshoot user identity. We’re excited to help more teams make the migration to a Zero Trust model as easy as possible and ensure the ongoing maintenance is a significant reduction compared to their previous private network.

Why was I blocked?

All Zero Trust deployments start and end with identity. In a Zero Trust model, you want your resources (and the network protecting them) to have zero trust by default of any incoming connection or request. Instead, every attempt should have to prove to the network that they should be allowed to connect.

Organizations provide users with a mechanism of proof by integrating their identity provider (IdP) like Azure Active Directory or Okta. With Cloudflare, teams can integrate multiple providers simultaneously to help users connect during activities like mergers or to allow contractors to reach specific resources. Users authenticate with their provider and Cloudflare Access uses that to determine if we should trust a given request or connection.

After integrating identity, most teams start to layer on new controls like device posture. In some cases, or in every case, the resources are so sensitive that you want to ensure only approved users connecting from managed, healthy devices are allowed to connect.

While that model significantly improves security, it can also create strain for IT teams managing remote or hybrid workforces. Troubleshooting “why” a user cannot reach a resource becomes a guessing game over chat. Earlier this year, we launched a new tool to tell you exactly why a user’s identity or device posture did not meet the rules that your administrators created.

What about how they connected?

As organizations advance in their Zero Trust journey, they add rules that go beyond the identity of the user or the posture of the device. For example, some teams might have regulatory restrictions that prevent users from accessing sensitive data from certain countries. Other enterprises need to understand the network context before granting access.

These adaptive controls enforce decisions around how a user connects. The user (and their device) might otherwise be allowed, but their current context like location or network prohibits them from doing so. These checks can extend to automated services, too, like a trusted chatbot that uses a service token to connect to your internal ticketing system.

While user and device posture checks require at least one step of authentication, these contextual rules can consist of policies that make it simple for a bad actor to retry over and over again like an IP address check. While the user will still be denied, that kind of information can overwhelm and flood your logs while you attempt to investigate what should be a valid login attempt.

With today’s release, your team can now have the best of both worlds.

However, other checks are not based on a user’s identity, these include looking at a device’s properties, network context, location, presence of a certificate and more. Requests that fail these “non-identity” checks are immediately blocked. These requests are immediately blocked in order to prevent a malicious user from seeing which identity providers are used by a business.

Additionally, these blocks were not logged in order to avoid overloading the Access request logs of an individual account. A malicious user attempting hundreds of requests or a misconfigured API making thousands of requests should not cloud a security admin’s ability to analyze legitimate user Access requests.

These logs would immediately become overloaded if every blocked request were logged. However, we heard from users that in some situations, especially during initial setup, it is helpful to see individual block requests even for non-identity checks.

We have released a GraphQL API that allows Access administrators to look up a specific blocked request by RayID, User or Application. The API response will return a full output of the properties of the associated request which makes it much easier to diagnose why a specific request was blocked.

In addition to the GraphQL API, we also improved the user facing block page to include additional detail about a user’s session. This will make it faster for end users and administrators to diagnose why a legitimate user was not allowed access.

How does it work?

Collecting blocked request logs for thousands of Access customers presented an interesting scale challenge. A single application in a single customer account could have millions of blocked requests in a day, multiple that out across all protected applications across all Access customers and the number of logs start to get large quickly.

We were able to leverage our existing analytics pipeline that was built to handle the scale of our global network which is far beyond the scale of Access. The analytics pipeline is configured to intelligently begin sampling data if an individual account begins generating too many requests. The majority of customers will have all non-identity block logs captured while accounts generating large traffic volumes will retain a significant portion to diagnose issues.

How can I get started?

We have built an example guide to use the GraphQL API to diagnose Access block reasons. These logs can be manually checked using an GraphQL API client or periodically ingested into a log storage database.

We know that achieving a Zero Trust Architecture is a journey and a significant part of that is troubleshooting and initial configuration. We are committed to making Cloudflare Zero Trust the easiest Zero Trust solution to troubleshoot and configure at scale. Keep an eye out for additional announcements in the coming months that make Cloudflare Zero Trust even easier to troubleshoot.

If you don’t already have Cloudflare Zero Trust set up, getting started is easy – see the platform yourself with 50 free seats by signing up here.

Or if you would like to talk with a Cloudflare representative about your overall Zero Trust strategy, reach out to us here for a consultation.

For those who already know and love Cloudflare Zero Trust, this feature is enabled for all accounts across all pricing tiers.

Noise

New ways to troubleshoot Cloudflare Access ‘blocked’ messages

Why was I blocked?

What about how they connected?

How does it work?

How can I get started?

The collective thoughts of the interwebz