Tag Archives: Guest Post

How Picsart leverages Cloudflare’s Developer Platform to build globally performant services

Post Syndicated from Mark Dembo original https://blog.cloudflare.com/picsart-move-to-workers-huge-performance-gains


Delivering great user experiences with a global user base can be challenging. While serving requests quickly when you start out in a local market is straightforward, doing so for a global audience is much more difficult. Why? Even under optimal conditions, you cannot be faster than the speed of light, which brings single data center solutions to their performance limits.

In this post, we will cover how Picsart improved the performance of one of its most critical services by moving from a centralized architecture to a globally distributed service built on Cloudflare. Our serverless compute platform, Workers, distributed throughout 310+ cities around the world, and our globally distributed Workers KV storage allowed them to improve their performance significantly and drive real business impact.

Success driven by data-driven insights

Picsart is one of the world’s largest digital creation platforms and a long-standing Cloudflare partner. At its core, an advanced tech stack powers its comprehensive features, including AI-driven photo and video editing tools and community-driven content sharing. With its infrastructure spanning across multiple cloud environments and on-prem deployments, Picsart is engineered to handle billions of daily requests from its huge mobile and web user base and API integrations. For over a decade, Cloudflare has been integral to Picsart, providing support for performant content delivery and securing its digital ecosystem.  

Similar to many other tech giants, Picsart approaches product development in a data-driven way. At the core of the innovation is Picsart’s remote configuration and experimentation platform, which enables product managers, UX researchers, and others to segment their user base into different test groups. These test groups might get to see slightly different implementations of features or designs of the Picsart app. Users might also get early access to experimental features or see different in-app promotions. In combination with constant monitoring of relevant KPIs, this allows for informed product decisions based on user preference and business impact.

On each app start, the client device sends a request to the remote configuration service for the latest setup tailored to the user’s session. The assignment of experiments relies on factors like the operating system and previous sessions, making each request unique and uncachable. Picsart’s app showcases extensive remote configuration capabilities, enabling adjustments to nearly every element. This results in a response containing a 1.5 MB configuration file for mobile clients. While the long-term solution is to reduce the file size, which has grown over time as more teams adopted the powerful service, this is not possible in the near or mid-term as it requires a major rewrite of all clients.

This setup request is blocking in the “hot path” during app start, as the results of this request will decide how the app itself looks and behaves. Hence, performance is critical. To ensure users do not wait too long, Picsart apps will wait 1500ms on mobile for the request to complete – if it does not complete in time, the user will not be assigned a test group and the app will fall back to default settings.

The clock is ticking

While a 1500ms round trip time seems like a sufficiently large time budget, the data suggested otherwise. Before the improvements were implemented, a staggering 50% of devices could not complete the requests in time. How come? In these 1.5 seconds the following steps need to complete:

  1. The request must travel from the users’ devices to the centralized backend servers
  2. The server processes the request based on dozens of user attributes provided in the request and metadata for thousands of defined remote configuration variations, running experiments, and segments. Using all this information, the server selects the right variation of each remote setting entry and builds the response payload.
  3. The response must travel from the centralized backend servers to the user devices.

Looking at the data, it was clear to the Picsart team that their backend service was already well-optimized, taking only 30 milliseconds, a tiny fraction of the available time budget, to process each of the billions of monthly requests. The bulk of the request time came from network latency. Especially with mobile devices, last mile performance can be very volatile, eating away a significant amount of the available time budget. Not only that, but the data was clear: users closer to the origin server had a much higher chance of making the round trip in time versus users out of region. It quickly became obvious that Picsart, fueled by its global success, had outgrown a single-region setup.

To the drawing board

A solution that comes to mind would be to replicate the existing cloud infrastructure in multiple regions and use global load balancing to minimize the distance a request needs to travel. However, this introduces significant overhead and cost. On the infrastructure side, it is not only the additional compute instances and database clusters that incur cost, but also cross-region data transfer to keep data in sync. Moreover, technical teams would need to operate and monitor infrastructure in multiple regions, which can add a lot to the complexity and cognitive load, leading to decreased development velocity and productivity loss.

Picsart instead looked to Cloudflare – we already had a long-lasting relationship for Application Performance and Security, and they aimed to use our Developer Platform to tackle the problem.

Workers and Workers KV seemed like the ideal solution. Both compute and data are globally distributed in 310+ locations around the world, resulting in a shorter distance between end users and the experimentation service. Not only that, but Cloudflare’s global-by-default approach allows for deployment with minimal overhead, and in contrast to other considered solutions, no additional fees to distribute the data around the globe.

No race without a clock

The objective for the refactor of the experimentation service was to increase the share of devices that successfully receive experimentation configuration within the set time budget.

But how to measure success? While synthetic testing can be useful in many situations, Picsart opted to come up with another clever solution:

During development, the Picsart engineers had already added a testing endpoint to the web and mobile versions of their app that sends a duplicate request to the new endpoint, discarding the response and swallowing all potential errors. This allows them to collect timing data based on real-user metrics without impacting the app’s performance and reliability.

A simplified version of this pattern for a web client could look like this:

// API endpoint URLs
const prodUrl = 'https://prod.example.com/';
const devUrl = 'https://new.example.com/';

// Function to collect metrics
const collectMetrics = (duration) => {
    console.log('Request duration:', duration);
    // …
};

// Function to fetch data from an endpoint and call collectMetrics
const fetchData = async (url, options) => {
    const startTime = performance.now();
    
    try {
        const response = await fetch(url, options);
        const endTime = performance.now();
        const duration = endTime - startTime;
        collectMetrics(duration);
        return await response.json();
    } catch (error) {
        console.error('Error fetching data:', error);
    }
};

// Fetching data from both endpoints
async function fetchDataFromBothEndpoints() {
    try {
        const result1 = await fetchData(prodUrl, { method: 'POST' /* request body and headers elided */ });
        console.log('Result from endpoint 1:', result1);

        // Fetching data from the second endpoint without awaiting its completion
        fetchData(devUrl, { method: 'POST' /* request body and headers elided */ });
    } catch (error) {
        console.error('Error fetching data from both endpoints:', error);
    }
}

fetchDataFromBothEndpoints();

Using existing data analytics tools, Picsart was able to analyze the performance of the new services from day one, starting with a dummy endpoint and a ‘hello world’ response. With that, a v0 was created that did not yet have the correct logic, but simulated reading multiple values from KV and returning a response of realistic size back to the end user.
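
A v0 along these lines can be as simple as the sketch below: read a few KV entries and return a payload of roughly the production size. The CONFIG_KV namespace binding and the key names are illustrative assumptions, not Picsart’s actual setup:

export default {
    async fetch(request, env) {
        // Simulate the real workload: read several values from Workers KV...
        const parts = await Promise.all([
            env.CONFIG_KV.get('dummy_part_1', { type: 'text' }),
            env.CONFIG_KV.get('dummy_part_2', { type: 'text' }),
            env.CONFIG_KV.get('dummy_part_3', { type: 'text' }),
        ]);

        // ...and return a response of realistic size (about 1.5 MB for mobile
        // clients), so real-user metrics reflect production-like payloads.
        return new Response(parts.join(''), {
            headers: { 'content-type': 'application/json' }
        });
    }
};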

The need for a do-over

In the initial phase, outcomes fell short of expectations. Surprisingly, requests were slower despite the service’s proximity to end users. What caused this setback? Subsequent investigation unveiled multiple culprits and design patterns in need of optimization.

Data segmentation

The previous, stateful solution operated on a single massive “blob” of data exceeding 100MB in size. Loading this into memory incurred a few seconds of initial startup time, but once the VM completed the task, request handling was fast, benefiting from the readily available data in memory.

However, this approach doesn’t seamlessly transition to the serverless realm. Unlike long-running VMs, Worker isolates have short lifespans. Repeatedly parsing large JSON objects led to prolonged compute durations. Simply parsing four KV entries of 25MB each (KV maximum value size is 25MB) on each request was not a feasible option.

The Picsart team went back to solution design and embarked on a journey to optimize their system’s execution time, resulting in a series of impactful improvements.

The fundamental insight that guided the solution was the unnecessary overhead that was involved in loading and parsing data irrelevant to the user’s specific context. The 100MB configuration file contained configurations for all platforms and locations worldwide – a setup that was far from efficient in a globally distributed, serverless compute environment. For instance, when processing requests from users in the United States, there was no need to fetch configurations targeted for users in other countries, or for different platforms.

To address this inefficiency, the Picsart team stored the configuration of each platform and country in separate KV records. This targeted strategy meant that for a request originating from a US user on an Android device, our system would only fetch and parse the KV record specific to Android users in the US, thereby excluding all irrelevant data. This resulted in approximately 600 KV records, each with a maximum size of 10MB. While this leads to data duplication on the KV storage side, it decreases the amount of data that needs to be parsed upon request. As Cloudflare operates in over 120 countries around the world, only a subset of records were needed in each location. Hence, the increase in cardinality had minimal impact on KV cache performance, as demonstrated by more than 99.5% of KV reads being served from local cache.

Key                                  Size
settings_part1.json                  25MB
settings_part2.json                  25MB
….

Before (simplified)

Key                                  Size
com.picsart.studio_apple_us.json     6.1MB
com.picsart.studio_apple_de.json     6.1MB
com.picsart.studio_android_us.json   5.9MB

After (simplified)
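
A minimal Worker sketch of this per-platform, per-country lookup could look like the following. The CONFIG_KV namespace binding, the query parameter used to convey the platform, and the key format are illustrative assumptions rather than Picsart’s actual implementation:

export default {
    async fetch(request, env) {
        // Cloudflare populates request.cf.country with the caller's country code.
        const country = (request.cf?.country || 'US').toLowerCase();
        // Assumption: the client sends its platform, e.g. ?platform=android
        const platform = new URL(request.url).searchParams.get('platform') || 'android';

        // Fetch only the record relevant to this platform/country combination.
        const key = `com.picsart.studio_${platform}_${country}.json`;
        const config = await env.CONFIG_KV.get(key, { type: 'text' });

        if (config === null) {
            return new Response('Configuration not found', { status: 404 });
        }
        return new Response(config, {
            headers: { 'content-type': 'application/json' }
        });
    }
};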

This approach was a significant move for Picsart as they transitioned from a regional cloud setup to Cloudflare’s globally distributed connectivity cloud. By serving data from locations close to end users, they were able to combat the high network latency of their previous setup. This strategy radically transformed the data-handling process, which unlocked two major benefits:

  • Performance Gains: By ensuring that only the relevant subset of data is fetched and parsed based on the user’s platform and geographical location, wall time and compute resources required for these operations could be significantly reduced.
  • Scalability and Flexibility: the granular segmentation of data enables effortless scaling of the service for new features or regional content. Adding support for new applications now only requires inserting new, standalone KV records in contrast to the previous solution where this would require increasing the size of the single record.

Immutable updates

Now that changes to the configuration were segmented by app, country, and platform, this also allowed for individual updates of the configuration in KV. KV storage showcases its best performance when records are updated infrequently but read very often. This pattern leverages KV’s fundamental design to cache values at edge locations upon reads, ensuring that subsequent queries for the same record are swiftly served by local caches rather than requiring a trip back to KV’s centralized data centers. This architecture is fundamental for minimizing latency and maximizing the speed of data retrieval across a globally distributed platform.

A crucial requirement for Picsart’s experimentation system was the ability to propagate updates of remote configuration values immediately. Updating existing records would require very short cache TTLs and even the minimum KV cache TTL of 60 seconds was considered unacceptable for the dynamic nature of the feature flagging. Moreover, setting short TTLs also impacts the cache hit ratio and the overall KV performance, specifically in regions with low traffic.

To reconcile the need for both rapid updates and efficient caching, Picsart adopted an innovative approach: making KV records immutable. Instead of modifying existing records, they opted to create new records with each configuration change. By appending the content hash to the KV key and writing new records after each update, Picsart ensured that each record was unique and immutable. This allowed them to leverage higher cache TTLs, as these records would never be updated.

Key                                       Cache TTL
com.picsart.studio_apple_us.json          60s
….

Before (simplified)

Key                                       Cache TTL
com.picsart.studio_apple_us_b58b59.json   86400s
com.picsart.studio_apple_us_273678.json   86400s

After (simplified)

There was a catch, though. The service must now keep track of the correct KV keys to use. The Picsart team addressed this challenge by storing references to the latest KV keys in the environment variables of the Worker.

Each configuration change triggers a new KV pair to be written and the services’ environment variables to be updated. As global Workers deployments take mere seconds, changes to the experimentation and configuration data are near-instantaneously globally available.
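
A minimal sketch of the read path, assuming a hypothetical CONFIG_KEY_APPLE_US environment variable that is updated on each deployment and the same hypothetical CONFIG_KV namespace binding as in the earlier sketch:

export default {
    async fetch(request, env) {
        // The key is immutable (it embeds a content hash computed at publish time),
        // so a long cacheTtl is safe: the value behind this key never changes.
        // e.g. CONFIG_KEY_APPLE_US = "com.picsart.studio_apple_us_b58b59.json"
        const key = env.CONFIG_KEY_APPLE_US;
        const config = await env.CONFIG_KV.get(key, { type: 'text', cacheTtl: 86400 });

        return new Response(config, {
            headers: { 'content-type': 'application/json' }
        });
    }
};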

JSON serialization & alternatives

Following the previous improvements, the Picsart team made another significant discovery: only a small fraction of configuration data is needed to assign the experiments, while the remaining majority of the data comprises JSON values for the remote configuration payloads. While the service must deliver the latter in the response, the data is not required during the initial processing phase.

The initial implementation used KV’s get() method to retrieve the configuration data with the parameter type=json, which parses the KV value into an object. This is very compute-intensive compared to using the get() method with the parameter type=text, which simply returns the value as a string. In the context of Picsart’s data, the bulk of the CPU cycles were wasted on parsing JSON data that was not needed to perform the required business logic.

What if the data structure and code could be changed in such a way that only the data needed to assign experiments was parsed as JSON, while the configuration values were treated as text? Picsart went ahead with a new approach: splitting the KV records into two, creating a small 300KB record for the metadata, which can be quickly parsed to an object, and a 9.7MB record of configuration values. The extracted configuration values are delimited by newline characters. The respective line number is used as reference in the metadata entry, so that the respective configuration value for an experiment can be merged back into the payload response later.


{
  "name": "shape_replace_items",
  "default_value": "<large json object>",
  "segments": [
    {
      "id": "f1244",
      "value": "<Another large json object>"
    },
    {
      "id": "a2lfd",
      "value": "<Yet another large json object>"
    }
  ]
}
Before: Metadata and Values in one JSON object (simplified)

// com.picsart.studio_apple_am_metadata

1 {
2   "name": "shape_replace_items",
3   "default_value": 1,
4   "segments": [
5     {
6       "id": "f1244",
7       "value": 2
8     },
9     {
10       "id": "a2lfd",
11      "value": 3
12     }
13   ]
14 }


// com.picsart.studio_apple_am_values

1 "<large json object>"
2 "<Another json object>"
3 "<Yet another json object>"

After: Metadata and Values are split (simplified)

After calculating the experiments and selecting the correct variants based solely on the small metadata entry, the service constructs a JSON string for the response containing placeholders for the actual values that reference the corresponding line numbers in the separate text file. To finalize the response, the server replaces the placeholders with the corresponding serialized JSON strings from the text file. This approach circumvents the need for parsing and re-serializing large JSON objects and helps to avoid a significant computational overhead.

Note that the process of parsing the metadata JSON and determining the correct experiments as well as the loading of the large file with configuration values are executed in parallel, saving additional milliseconds.
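
A simplified sketch of this assembly step is shown below. The key names, the shape of the metadata, and the assignLine() helper are illustrative assumptions (using the same hypothetical CONFIG_KV binding as before); the real assignment logic is far more involved:

// Hypothetical helper: picks the line number of the value to use for one
// setting, based on the user's segment memberships (real logic elided).
const assignLine = (setting, userSegments) => {
    const match = setting.segments.find((s) => userSegments.includes(s.id));
    return match ? match.value : setting.default_value;
};

const buildConfigResponse = async (env, userSegments) => {
    // Load the small metadata JSON and the large newline-delimited values file
    // in parallel; only the metadata is parsed as JSON.
    const [metadata, valuesText] = await Promise.all([
        env.CONFIG_KV.get('com.picsart.studio_apple_am_metadata', { type: 'json' }),
        env.CONFIG_KV.get('com.picsart.studio_apple_am_values', { type: 'text' }),
    ]);
    const lines = valuesText.split('\n');

    // Splice the pre-serialized JSON values into the response by line number,
    // avoiding any parse/re-serialize of the large configuration payloads.
    // (Assumption: the metadata record is an array of settings like the example above.)
    const parts = metadata.map((setting) => {
        const lineNumber = assignLine(setting, userSegments);
        return `"${setting.name}": ${lines[lineNumber - 1]}`;
    });

    return new Response(`{${parts.join(',')}}`, {
        headers: { 'content-type': 'application/json' }
    });
};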

By minimizing the size of the JSON data that needed to be parsed and leveraging a more efficient method for constructing the final response, the Picsart team managed to not only reduce the response times but also optimize the compute resource usage. This approach reflects a broader principle applicable across the tech industry: that efficiency, particularly in serverless architectures, can often be dramatically improved by rethinking data structure and utilization.

Getting a head start

The server-side changes, moving from a single-region setup to Cloudflare’s global architecture, paid off massively. Median response times globally dropped by more than 1 second, which was already a huge improvement for the team. However, looking at the new data, the team found two more opportunities for client-side optimization.

As the web and mobile apps call the service at startup, most of the time no active connection to the servers is alive, and establishing that connection at request time costs valuable milliseconds.

For the web version, adding a preconnect hint on initial page load showed a positive impact. For the mobile app version, the Picsart team took it a step further. Investigation showed that before the connection could be established, three modules had to complete initialization: the error tracker, the HTTP client, and the SDK. Reordering the modules to initialize the HTTP client first allowed the connection to be established in parallel with the initialization of the SDK and error tracker, again saving time. This resulted in another 200ms improvement for end users.
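
For the web case, the hint can be injected early during page load; a minimal sketch, reusing the hypothetical endpoint URL from the earlier example, might look like this:

// Inject a preconnect hint so the browser can set up DNS, TCP, and TLS to the
// configuration endpoint before the configuration request is actually made.
const link = document.createElement('link');
link.rel = 'preconnect';
link.href = 'https://new.example.com';
document.head.appendChild(link);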

Setting a new personal best

The day had come, and it was time for the phased rollout: web first, then the mobile apps. The team watched the dashboards with suspense and was pleasantly surprised: the rollout was successful, and billions of requests were handled with ease.

Share of successfully delivered experiments

The result? The Picsart apps are loading faster than ever for millions of users worldwide, while the share of successfully delivered experiments increased from 50% to 85%. Median response time dropped from 1500 ms to 280 ms. On the web, response times dropped to 70 ms, since the response payload is smaller than on mobile. This translates to real business impact for Picsart, as they can now deliver more personalized and data-driven experiences to even more of their users.

A bright future ahead

Picsart is already thinking of the next generation of experimentation. To integrate with Cloudflare even further, the plan is to use Durable Objects to store hundreds of millions of user data records in a decentralized fashion, enabling even more powerful experiments without impacting performance. This is possible thanks to Durable Objects’ underlying architecture that stores the user data in-region, close to the respective end user device.

Beyond that, Picsart’s experimentation team is also planning to onboard external B2B customers to their experimentation platform as Cloudflare’s developer platform provides them with the scale and global network to handle more traffic and data with ease.

Get started yourself

If you’re also working on or with an application that would benefit from Cloudflare’s speed and scale, check out our developer documentation and tutorials, and join our developer Discord to get community support and inspiration.

How to Accelerate Your SOAR Program to Full Speed in Less Than a Year

Post Syndicated from Ryan Fried original https://blog.rapid7.com/2022/09/21/how-to-accelerate-your-soar-program-to-full-speed-in-less-than-a-year/

Every new technology comes with a learning curve specific to your organization. First you learn the basics, then you accelerate. Rapid7’s offerings are no different.

As a Senior Information Security Engineer at Brooks, I have firsthand experience with this process. I oversaw the implementation of Rapid7’s security orchestration, automation, and response (SOAR) product, InsightConnect, within my organization. We went from zero to 20+ workflows in just one year. Here are some reflections and advice about setting up a SOAR program, through the lens of my story about that successful and innovative year.

Workflow 1: Let Rapid7 hold your hand

In a previous blog regarding our initial deployment of InsightConnect, I shared key advice about how to set up a SOAR tool and get the program started. Looking back on that successful process, I believe that you should start with a goal that’s manageable – and delivers immediate value to help prove and cement the value of the initiative. For example, a phishing-related workflow is a great place to start. But there are other options as well, depending on your organization’s needs. Consider the following questions:

  • What pain point within your organization presents an immediate need?
  • What processes do you already want or need to try to automate?

Consider your team’s key technologies as well, but as you think through these questions, approach the solution in a technology-agnostic way. Instead, focus on the process, which can usually be applied to multiple technologies, and the corresponding desired outcome.

After that, you’ll want to work with your security analysts (assuming you’re not the security analyst!) to determine their pain points as well. What are the most common alerts they get? Where do they spend the most time? Or my favorite question to ask, “What requires the most browser tabs?” Your immediate focus should be how to make their job easier and more efficient.

From there, lean heavily on Rapid7’s product resources and services, and especially existing workflows that you can find in the Rapid7 Extensions Library – this will cut your work in half.

Workflows 2 to 5: Integrate with Slack and Teams

Once you’ve gone live with your first workflow, continue to look to the Rapid7 Extensions Library for workflows you can download and adapt to your needs. Some of the best examples of that use Slack or Microsoft Teams as the primary interface – you can find them easily by searching for workflows by category. And when you find an appropriate workflow, don’t get caught up on the specific technologies in the workflow. Again, focus on the process that you’re automating – after all, blocking an IP on one firewall is essentially the same as blocking an IP on another firewall, as it’s just a matter of swapping the integration plugin.

A major reason I advise starting with Slack and Teams-related workflows is that they’re the most numerous in the library and are valuable to most organizations. But this is the point where buy-in from key stakeholders across your organization becomes essential. Work with whoever runs your Teams or enterprise Slack account to input the appropriate API keys – they’re an extended part of your security automation team.

From there, look into workflows for incident response and enrichment – again, in the Extensions Library. Searching VirusTotal, forcing a password reset, or revoking Office 365 access can be very useful areas of automation, since you likely conduct those processes a lot. They can take a lot of time because they often rely on other teams when integrations and automations aren’t already in place. Since time is of the essence in a phishing-related compromise, they’re super impactful.

One reason response and enrichment workflows are so useful for Workflows 2 to 5 is that they help you understand that SOAR is not just about full automation. In fact, it’s about supporting human decision-making. Many security decisions require human insight and experience to make the right call. What SOAR can do is automatically collect the necessary context, tee up the decision to the security analyst, and then broadly automate the execution of those decisions.

Workflows 6 to 10: Hone in on your analysts’ pain points

At this point, it’s going to become easier for you and your team to build and implement your own, more heavily customized workflows. You’ll understand things like decision trees, loops, and markdown cards – essential tools to take your security automation workflows to the next level. You’ll then be prepared to start customizing more workflows specifically catered to the needs of your organization and your analysts. Start here:

  1. Find out what your analysts’ top 5 alerts are. They’ll likely be something along the lines of DNS, EDR, Firewalls, or email-related alerts.
  2. Return to the Rapid7 workflow library to find existing workflows you can adopt and customize to address those alert categories.

Expect to commit a couple of hours here and there over a couple of weeks to perfect each workflow to fit your organization. This may sound like a lot, but I promise – the lift isn’t too hard. The Rapid7 Extensions Library and the tool itself do a lot of the lifting for you!

Workflows 10 to 20: Take your workflows to the next level

Once you’ve implemented roughly 10 workflows, you’re ready to start honing in on specific pain points that likely require a bit more customization – for example, ad-hoc actions for investigations like revoking active Office 365 sessions, searching for and deleting specific emails, or automatically blocking likely malicious URLs based on threat intelligence feeds you’re subscribed to.

The more you create, the more comfortable you’ll be creating workflows from scratch. In my experience, by the time you get to 20 workflows, you should expect that you or a team member could get a typical workflow designed and shipped in 1 to 2 weeks, assuming they spend 4 to 8 hours a week on it. Check out two of my team’s prized custom workflows.

However, that’s not to say that you can’t still make existing workflows your own. It’s to your benefit to keep up with the latest developments in Rapid7’s marketplace. I check the marketplace every few weeks and subscribe to the newsletter for new workflows or plugins. I’ve also learned to use the plugin API to make custom API calls for plugins Rapid7 doesn’t yet have!

In my next blog, I’ll take a deeper dive into the why and how of high-value security automation workflows. I’ll also give you some insights into the benefits we’ve seen at Brooks thanks to our SOAR program.

Deploying a SOAR Tool Doesn’t Have to Be Hard: I’ve Done It Twice

Post Syndicated from Ryan Fried original https://blog.rapid7.com/2022/07/21/deploying-a-soar-tool-doesnt-have-to-be-hard-ive-done-it-twice/

As the senior information security engineer at Brooks, an international running shoe and apparel company, I can appreciate the challenge of launching a security orchestration, automation, and response (SOAR) tool for the first time. I’ve done it at two different companies, so I’ll share some lessons learned and examples of how we got over some speed bumps and past friction points. I’ll also describe the key steps that helped us create a solid SOAR program.

At Brooks we selected Rapid7’s InsightConnect (ICON) as our security automation tool after a thorough product review. I was familiar with ICON because I had used it at a previous company. There are other SOAR tools out there, but InsightConnect is my preferred option based on my experience, its integrations, support, and Rapid7’s track record of innovation in SOAR. InsightConnect is embedded in everything we do now. We use it to slash analyst time spent on manual, repetitive tasks and to streamline our incident response and vulnerability management processes.

When you’re starting out with SOAR, there are two important things you need to put in place.

  • One is getting buy-in from your active directory (AD) team on the automation process and the role they need to play. At Brooks, we have yearly goals that are broken down into quarters, so getting it on their quarterly goals as part of our overall SOAR goal was really important. This also applies to other areas of the IT and security organizations.
  • The second is getting all the integrations set up within the first 30 to 60 days. It’s critical because your automation tool is only as good as the integrations you have deployed. Maybe 50% to 60% of them fall under IT security, but the other 30% or 40% are still pretty important, given how dependent security teams are on other organizations and their systems. So, getting buy-in from the teams that own those systems and setting up all the integrations are key.

Start with collaboration and build trust

A successful SOAR program requires trust and collaboration with your internal partners – essentially, engineering and operations and the team that sets up your active directory domain – because they help set up the integrations that the security automations depend on. You need to develop that trust because IT teams often hesitate when it comes to automation.

In conversations with these teams, let them know you won’t be completely automating things like blocking websites or deleting users. In addition, stress that almost everything being done will require human interaction and oversight. We’re just enriching and accelerating many of the processes we already have in place. Therefore, it will free up their time in addition to ours, because it accomplishes things they already do for us. And remember, we have the ability to see if something happened that may have been caused by the SOAR tool, so it’s automation combined with human decision-making.

For example, say something stops working. The team asks you: “Hey, what’s changed?” With ICON up and running, you can search within seconds to see, for example, what firewall changes have happened within the last 24 hours. What logins have occurred? Are there any user account lockouts? I can search that in seconds. Before, it used to take me 15 to 30 minutes to get back to them with a response. Not any more. That’s what I call fast troubleshooting.

Meet with your security analysts and explain the workflows

Right from the beginning, it’s important to meet with your security analysts and explain the initial workflows you’ve created. Then, get them thinking about the top five alerts that happen most often and consume a lot of their time, and what information they need from those alerts. For instance, with two-factor authentication logs, the questions might be, “What’s the device name? Who’s the user’s manager? What’s their location?” Then, you can work in the SOAR tool to get that information for them. This will help them see the benefit firsthand.

This approach helps with analyst retention because the automation becomes the platform glue for all of your other tools. It also reduces the time your analysts have to spend on repetitive drudge work. Now, they’re able to give more confident answers if something shows up in the environment, and they can focus on more creative work.

Dedicate a resource to SOAR

I believe it’s important to have one person dedicated to the SOAR project at least half-time for the first six months. This is where teams can come up short. When the staff and time commitment is there, the process quickly expands beyond simple tasks. Then you’re thinking, “What else can I automate? What additional workflows can I pick up from the Rapid7 workflow marketplace and customize for our own use?”

Take advantage of the Rapid7 Extensions Library

The good news is you don’t need to build workflows (playbooks) from scratch. The Rapid7 Extensions Library contains hundreds of workflows which you can use as a core foundation for your needs. Then you can tweak the last 15% to 20% to make the workflow fit even better. These pre-built workflows help you hit the ground running. Think of them not as ready-to-go tools, but more as workflow ideas and curated best practices. The first time I used InsightConnect, I used the phishing workflow and started seeing value in less than two weeks.

Implementing a security automation tool within a company’s network environment can be a challenge if you don’t come at it the right way. I know because I’ve been there. But Rapid7’s InsightConnect makes it easier by enabling almost anything you can imagine. With a SOAR solution, your analysts will spend less time on drudge work and more time optimizing your security environment. These are real benefits I’ve seen firsthand at Brooks. You can have them as well by following this simple approach. Best of luck.

Maximize Your VM Investment: Fix Vulnerabilities Faster With Automox + Rapid7

Post Syndicated from Nicholas Colyer original https://blog.rapid7.com/2022/05/16/maximize-your-vm-investment-fix-vulnerabilities-faster-with-automox-rapid7/

The Rapid7 InsightConnect Extension Library is getting bigger! We’ve teamed up with IT operations platform Automox to release a new plugin and technology alliance that closes the aperture of attack for vulnerability findings and automates remediation. Using the Automox Plugin for Rapid7 InsightConnect in conjunction with InsightVM, customers are able to:

  • Automate discovery-to-remediation of vulnerability findings
  • Query Automox device details via Slack or Microsoft Teams

Getting started with Automox within InsightConnect

Automox is an IT Operations platform that fully automates the process of endpoint management across Windows, macOS, Linux, and third-party software — including Adobe, Java, Firefox, Chrome, and Windows.

The Automox InsightConnect Plugin allows mutual customers of Rapid7 and Automox to expand their capabilities between products, ultimately streamlining cyber security outcomes and operational effectiveness. Seamlessly transition CVE-based vulnerability findings through discovery to remediation, and perform device queries without needing to leave Slack or Microsoft Teams!

Example workflows you can start leveraging now with the Automox Plugin

  • Generate Rapid7 InsightVM Report and Upload to Automox Vulnerability Sync: An example workflow that leverages threat context for assets and prioritizes them for remediation. An InsightVM report is automatically generated and uploaded using Automox’s Vulnerability Sync for easy remediation, saving internal teams precious time and effort in managing critical emerging threats – from start to finish.
  • Automox Device Lookup from Microsoft Teams: An example workflow that lets a user query a device in Automox directly from Microsoft Teams.
  • Automox Device Lookup from Slack: An example workflow that lets a user query a device in Automox directly from Slack.

For more information or to start using this plugin, access and install the Automox Plugin for Rapid7 InsightConnect through the Rapid7 Extension Library.

4 Fallacies That Keep SMBs Vulnerable to Ransomware, Pt. 2

Post Syndicated from Ryan Weeks original https://blog.rapid7.com/2022/03/31/4-fallacies-that-keep-smbs-vulnerable-to-ransomware-pt-2/

This post is co-authored by Chris Henderson, Senior Director of Information Security at Datto, Inc.

Welcome back for the second and final installment of our blog series on the fallacies and biases that perpetuate ransomware risk for SMBs. In part one, we examined how flawed thinking and a sense of helplessness are obstacles to taking action against ransomware. In this final part, we will examine fallacies 3 and 4: the ways SMBs often fail to “trust but verify” the security safety of their critical business partners, and how prior investments affect their forward-looking mitigation decisions.

3. Failing to trust but verify

“You seem like someone I can trust to help support and grow my business.”

Stranger danger

When SMBs create business partnerships, they do so with a reasonable expectation that their partners will do the right things to keep both sides safe. SMBs are effectively placing trust in strangers. As humans, we (often unconsciously) decide whom to trust based on how they make us feel or whether they remind us of a past positive experience. Rarely have SMBs done a deep enough examination to determine if that level of trust is truly warranted, especially when it comes to protecting against ransomware.

We reasonably — but perhaps incorrectly — expect a few key things from these business partners, namely that they will:

  • Be rational actors that can be relied on to make informed decisions that maximize benefits for us
  • Exercise rational choice in our best interests
  • Operate with the same level of due care that a reasonable, prudent person would use under the same or similar circumstances, in decisions that affect our business – akin to a fiduciary

Rational actor model

According to economic theory, a rational actor maximizes benefits for themselves first, exercising rational choice to determine whether an option is right for them. That begs the question: To what extent do SMBs understand whether business objectives are aligned such that what is right for their business partners’ cyber protection is also right for them? In the SMB space, too often the answer is based on trust alone and not on any sort of verification, or what mature security programs call third-party due diligence.

If I harm you, I harm myself

Increasingly, ransomware attacks are relying on our business relationships (a.k.a. supply chains) to facilitate attacks on targets. End targets may be meticulously selected, but they could instead be targets of opportunity, and sometimes they are even impacted as collateral damage. In any case, in this ransomware environment, it is critical for SMBs to reassess the level of trust they place in their business partners, as their cyber posture is now part of yours. You share the risk.

Trust is a critical component of business relationships, but trust in a business partner’s security must be verified upon establishment of the relationship and reaffirmed periodically thereafter. It is a reasonable expectation that, given this ransomware environment, your business partners will be able to prove that they take both their and your protection as being in your mutual best interests. They must be able to speak to and demonstrate how they work toward that objective.

Acknowledge and act

Trust is no longer enough — SMBs have to verify. Unfortunately, there is no one-size-fits-all process for diligence, but a good place to start is with a serious conversation about your business partners’ attitudes, beliefs, current readiness, and their investments in cyber resilience, ransomware prevention, and recovery. During that conversation, ask a few key questions:

  1. Do you have cyber insurance coverage for a ransomware incident that affects both you and your customers? Tip: Ask them to provide you proof of coverage.
  2. What cybersecurity program framework do you follow, and to what extent have you accomplished operating effectiveness against that framework? Tip: Ask to see materials from audits or assessments as evidence.
  3. Has your security posture been validated by an independent third party? Tip: Ask to see materials from audits or assessments as evidence.
  4. When was the last time you, or a customer of yours, suffered a cybersecurity incident, and how did you respond? Tip: Ask for a reference from a customer they’ve helped recover from a ransomware incident.

4. “We can’t turn back now; we’ve come too far”

“We have already spent so much time and made significant investments in IT solutions to achieve our business objectives. It wouldn’t make sense at this point to abandon our solutions, given what we’ve already invested.”

Sunk cost

Ransomware threat actors seek businesses whose IT solutions — when improperly developed, deployed, configured, or maintained — make compromise and infection easy. Such solutions are currently a primary access vector for ransomware, as they can be difficult to retrofit security into. When that happens, we are faced with a decision to migrate platforms, which can be costly and disruptive.

This decision point is one of the most difficult for SMBs, as it’s very easy to fall into a sunk cost fallacy — the tendency to follow through on an endeavor if we’ve already invested time, effort, or money into it, whether or not the current costs outweigh the benefits.

It’s easy to look backward at all the work done to get an IT solution to this point and exceedingly difficult to accept a large part of that investment as a sunk cost. The reality is that it doesn’t matter how much time has been invested in IT solutions. If security is not a core feature of the solution, then the long-term risk to an SMB’s business is greater than any sunk cost.

Acknowledge and act

Sunk costs burn because they feel like a failure — knowing what we know now, we should have made a different decision. New information is always presenting itself, and the security landscape is changing constantly around us. It’s impossible to foresee every shift, so our best defense is to remain agile and pivot when and as necessary. Acknowledge that there will be sunk costs on this journey, and allowing those to stand in the way of reasonable action is the real failure.

Moving forward

“There’s a brighter tomorrow that’s just down the road. Do not look back; you are not going that way” – Mary Engelbreit

Realizing your SMB has real cyber risk exposure to ransomware requires overcoming a series of logical fallacies and cognitive biases. Once you understand and accept that reality, it’s imperative not to buy into learned helplessness, because you need not be a victim. An SMB’s size and agility can be an advantage.

From here, re-evaluate your business partnerships and level of trust when it comes to cybersecurity. Be willing to make decisions that accept prior investments may just be sunk cost, but that the benefits of change to become more cyber resilient outweigh the risks of not changing in the long run.

4 Fallacies That Keep SMBs Vulnerable to Ransomware, Pt. 1

Post Syndicated from Ryan Weeks original https://blog.rapid7.com/2022/03/24/four-fallacies-that-keep-smbs-vulnerable-to-ransomware-pt-1/

Ransomware has focused on big-game hunting of large enterprises in the past years, and those events often make the headlines. The risk can be even more serious for small and medium-sized businesses (SMBs), who struggle to both understand the changing nature of the threats and lack the resources to become cyber resilient. Ransomware poses a greater threat to SMBs’ core ability to continue to operate, as recovery can be impossible or expensive beyond their means.

SMBs commonly seek assistance from managed services providers (MSPs) for their foundational IT needs to run their business — MSPs have been the virtual CIOs for SMBs for years. Increasingly, SMBs are also turning to their MSP partners to help them fight the threat of ransomware, implicitly asking them to also take on the role of a virtual CISO, too. These MSPs have working knowledge of ransomware and are uniquely situated to assist SMBs that are ready to go on a cyber resilience journey.

With this expert assistance available, one would think that we would be making more progress on ransomware. However, MSPs are still meeting resistance when working to implement a cyber resilience plan for many SMBs.

In our experience working with MSPs and hearing the challenges they face with SMBs, we have come to the conclusion that much of this resistance they meet is based on under-awareness, biases, or fallacies.

In this two-part blog series, we will present four common mistakes SMBs make when thinking about ransomware risk, allowing you to examine your own beliefs and draw new conclusions. We contend that until SMBs overcome their resistance to resilience improvement and do the work to unwind these critical flaws in thinking, ransomware will continue to be a growing and existential problem for them.

1. Relying on flawed thinking

“I’m concerned about the potential impacts of ransomware, but I do not have anything valuable that an attacker would want, so ransomware is not likely to happen to me.”

Formal fallacies

These arguments are the most common form of resistance toward implementing adequate cyber resilience for SMBs, and they create a rationalization for inaction as well as a false sense of safety. However, they are formal fallacies, relying on common beliefs that are partially informed by cognitive biases.

Formal fallacies can best be classified simply as deductively invalid arguments that typically commit an easily recognizable logical error when properly examined. Either the premises are untrue, or the argument is invalid due to a logical flaw.

Looking at this argument, the conclusion “ransomware will not happen to me” is the logical conclusion of the prior statement, “I have nothing of value to an attacker.” The flaw in this argument is that the attacker does not need the data they steal or hold ransom to be intrinsically valuable to them — they only need it to be valuable to the attack target.

Data that is intrinsically valuable is nice to have for an attacker, as they can monetize it outside of the attack by exfiltrating it and selling it (potentially multiple times), but the primary objective is to hold it ransom, because you need it to run your business. Facing this fact, we can see that the conclusion “ransomware will not happen to me” is logically invalid based on the premise “I have nothing of value to an attacker.”

Confirmation bias

The belief “ransomware will not happen to me” can also be a standalone argument. The challenge here is that the premise of the argument is unknown. This means we need data to support probability. With insufficient reporting data to capture accurate rates of ransomware on SMBs, this is problematic and can lead to confirmation bias. If I can’t find data on others like me as an SMB, then I may conclude that this confirms I’m not at risk.

Anchoring bias

I may be able to find data in aggregate that states that my SMB’s industries are not as commonly targeted. This piece of data can lead to an anchoring bias, which is the tendency to rely heavily on the first piece of information we are given. While ransomware might not be as common in your industry, that does not mean it does not exist. We need to research further rather than latching onto this data to anchor our belief.

Acknowledge and act

The best way to combat these formal fallacies and biases is for the SMB and their MSP to acknowledge these beliefs and act to challenge them through proper education. Below are some of the most effective exercises we have seen SMBs and MSPs use to better educate themselves on real versus perceived ransomware risk likelihood:

  1. Threat profiling is an exercise that collects information, from vendor partners and open-source intelligence sources, to inform which threat actors are likely to target the business, using which tactics.
  2. Data flow diagrams can help you to map out your unique operating environment and see how all your systems connect together to better inform how data moves and resides within your IT environment.
  3. A risk assessment uses the threat profile information and overlays on the data flow diagram to determine where the business is most susceptible to attacker tactics.
  4. Corrective action planning is the last exercise, where you prioritize the largest gaps in protection using a threat- and risk-informed approach.

2. Being resigned to victimhood

“Large companies and enterprises get hit with ransomware all the time. As an SMB, I don’t stand a chance. I don’t have the resources they do. This is hopeless; there’s nothing I can do about it.”

Victim mentality

This past year has seen a number of companies that were supposedly “too large and well-funded to be hacked” reporting ransomware breaches. It feels like there is a constant stream of information reinforcing the mentality that, even with a multi-million dollar security program, an SMB will not be able to effectively defend against the adverse outcomes from ransomware. This barrage of information can make them feel a loss of control and that the world is against them.

Learned helplessness

These frequent negative outcomes for “prepared” organizations are building a sense of learned helplessness, or powerlessness, within the SMB space. If a well-funded and organized company can’t stop ransomware, why should we even try?

This mentality takes a binary view on a ransomware attack, viewing it as an all-or-nothing event. In reality, there are degrees of success of a ransomware attack. The goal of becoming immune to ransomware can spark feelings of learned helplessness, but if you reframe it as minimizing the damage a successful attack will have, this allows you to regain a sense of control in what otherwise may feel like an impossible effort.

Pessimism bias

This echo chamber of successful attacks (and thus presumed unsuccessful mitigations) is driving a pessimism bias. As empathetic beings, we feel the pain of these attacked organizations as though it were our own. We then tie this negative emotion to our expectation of an event (i.e. a ransomware attack), creating the expectation of a negative outcome for our own organization.

Acknowledge and act

Biases and beliefs shape our reality. If an SMB believes they are going to fall victim to ransomware and fails to protect against it, they actually make that exact adverse outcome more likely.

Despite the fear and uncertainty, the most important variable missing from this mental math is environment complexity. The more complex the environment, the more difficult it is to protect. SMBs have an advantage over their large-business counterparts, as the SMB IT environment is usually easier to control with the right in-house tech staff and/or MSP partners. That means SMBs are better situated than large companies to deter and recover from attacks — with the right strategic investments.

Check back with us next week, when we’ll tackle the third and fourth major fallacies that hold SMBs back from securing themselves against ransomware.

An Inside Look at CISA’s Supply Chain Task Force

Post Syndicated from Chad Kliewer, MS, CISSP, CCSP original https://blog.rapid7.com/2022/03/14/an-inside-look-at-cisas-supply-chain-task-force/

When one mentions supply chains these days, we tend to think of microchips from China causing delays in automobile manufacturing or toilet paper disappearing from store shelves. Sure, there are some chips in the communications infrastructure, but the cyber supply chain is mostly about virtual things – the ones you can’t actually touch.  

In 2018, the Cybersecurity and Infrastructure Security Agency (CISA) established the Information and Communications Technology (ICT) Supply Chain Risk Management (SCRM) Task Force as a public-private joint effort to build partnerships and enhance ICT supply chain resilience. To date, the Task Force has worked on 7 Executive Orders from the White House that underscore the importance of supply chain resilience in critical infrastructure.

Background

The ICT-SCRM Task Force is made up of members from the following sectors:

  • Information Technology (IT) – Over 40 IT companies, including service providers, hardware, software, and cloud have provided input.
  • Communications – Nearly 25 communications associations and companies are included, with representation from the wireline, wireless, broadband, and broadcast areas.
  • Government – More than 30 government organizations and agencies are represented on the Task Force.

These three sector groups touch nearly every facet of critical infrastructure that businesses and government require. The Task Force is dedicated to identifying threats and developing solutions to enhance resilience by reducing the attack surface of critical infrastructure. This diverse group is poised perfectly to evaluate existing practices and elevate them to new heights by enhancing existing standards and frameworks with up-to-date practical advice.

Working groups

The core of the task force is the working groups. These groups are created and disbanded as needed to address core areas of the cyber supply chain. Some of the working groups have been concentrating on areas like:

  • The legal risks of information sharing
  • Evaluating supply chain threats
  • Identifying criteria for building Qualified Bidder Lists and Qualified Manufacturer Lists
  • The impacts of the COVID-19 pandemic on supply chains
  • Creating a vendor supply chain risk management template

Ongoing efforts

After two years of producing some great resources and rather large reports, the ICT-SCRM Task Force recognized the need to ensure organizations of all sizes can take advantage of the group’s resources, even if they don’t have a dedicated risk management professional at their disposal. This led to the creation of both a Small and Medium Business (SMB) working group, as well as one dedicated to Product Marketing.

The SMB working group chose to review and adapt the Vendor SCRM template for use by small and medium businesses, which shows the template can be a great resource for companies and organizations of all sizes.  

Out of this template, the group described three cyber supply chain scenarios that an SMB (or any size organization, really) could encounter. From that, the group further simplified the process by creating an Excel spreadsheet that is easy for SMBs to share with their prospective vendors and partners as a tool to evaluate their cybersecurity posture. Most importantly, the document does not promote a checkbox approach to cybersecurity — it allows for partial compliance, with room provided for explanations. It also allows many of the questions to be removed if the prospective partner possesses a SOC1/2 certification, thereby eliminating duplication in questions.

What the future holds

At the time of this writing, the Product Marketing and SMB working groups are hard at work making sure everyone, including the smallest businesses, are using the ICT-SCRM Task Force Resources to their fullest potential. Additional workstreams are being developed and will be announced soon, and these will likely include expansion with international partners and additional critical-infrastructure sectors.

For more information, you can visit the CISA ICT-SCRM Task Force website.

Exploring WebAssembly AI Services on Cloudflare Workers

Post Syndicated from Guest Author original https://blog.cloudflare.com/exploring-webassembly-ai-services-on-cloudflare-workers/

This is a guest post by Videet Parekh, Abelardo Lopez-Lagunas, Sek Chai at Latent AI.

Edge networks present a significant opportunity for Artificial Intelligence (AI) performance and applicability. AI technologies already make it possible to run compelling applications like object and voice recognition, navigation, and recommendations.

AI at the edge presents a host of benefits. One is scalability: it is simply impractical to send all data to a centralized cloud. In fact, one study has predicted that billions of IoT devices will generate 90 zettabytes of data globally by 2025. Another is privacy: many users are reluctant to move their personal data to the cloud, whereas data processed at the edge is more ephemeral.

When AI services are distributed away from centralized data centers and closer to the service edge, it becomes possible to enhance overall application speed without moving data unnecessarily. However, there are still challenges in making deep-cloud AI run efficiently on edge hardware. Here, we use the term deep-cloud to refer to highly centralized, massively sized data centers. Deploying edge AI services can be hard because AI is both compute- and memory-bandwidth-intensive. We need to tune the AI models so that computational latency and bandwidth requirements can be radically reduced for the edge.

The Case for Distributed AI Services

Edge network infrastructure for distributed AI is already widely available. Edge networks like Cloudflare serve a significant proportion of today’s Internet traffic and can serve as the bridge between devices and the centralized cloud. Highly performant AI services are possible because distributed processing sits in close spatial proximity to the edge data.

We at Latent AI are exploring ways to deploy AI at the edge, with technology that transforms and compresses AI models for the edge. Our edge AI models are many orders of magnitude smaller than the sensor data they process (kilobytes or megabytes for the edge AI model, compared to petabytes of edge data). We are exploring using WebAssembly (WASM) within the Cloudflare Workers environment. We want to identify possible operating points for distributed AI services by exploring achievable performance on the available edge infrastructure.

Architectural Approach for Exploration

WebAssembly (WASM) is a new open-standard format for programs that run on the Web. It is a popular way to enable high-performance web-based applications. WASM is closer to machine code, and thus faster than JavaScript (JS), even when JS is JIT-compiled. Compiler optimizations, already done ahead of time, reduce the overhead of fetching and parsing application code. Today, WASM offers the flexibility and portability of JS at near-optimal performance of compiled machine code.
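
For readers new to WASM, here is a minimal, hypothetical sketch of loading a compiled module and calling one of its exports from JavaScript; the add.wasm file and its add export are placeholders, not part of our toolchain:

// Minimal example: load and call a WebAssembly module from JavaScript in Node.
// "add.wasm" and its exported "add" function are hypothetical placeholders.
const fs = require("fs");

async function run() {
  const bytes = fs.readFileSync("./add.wasm");                // read the compiled module
  const { instance } = await WebAssembly.instantiate(bytes);  // compile and instantiate it
  console.log(instance.exports.add(2, 3));                    // call the exported function
}

run();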

AI models have notoriously large memory demands because of their high parameter counts. Cloudflare already supports WASM through its Wrangler CLI, and we chose to use it for our exploration. Wrangler is the open-source CLI tool used to manage Workers, and is designed to enable a smooth developer experience.

How Latent AI Accelerates Distributed AI Services

Latent AI’s mission is to enable ambient computing, regardless of any resource constraints. We build developer tools that greatly reduce the computing resources needed to process AI on the edge while remaining completely hardware-agnostic.

Latent AI’s tools significantly compress AI models to reduce their memory size. We have shown up to 10x compression for state-of-the-art models. This capability addresses the load-time latencies challenging many edge network deployments. We also offer an optimized runtime that executes a neural network natively. The result is a 2-3x runtime speedup without any hardware-specific accelerators. This performance boost offers fast and efficient inference at the edge.

Our compression uses quantization algorithms to convert the AI model’s parameters from 32-bit floating point to 16-bit or 8-bit representations, with minimal loss of accuracy. The key benefits of moving to lower bit precision are higher power efficiency and less storage. AI inference can then be processed using more efficient parallel processor hardware on the continuum of platforms at the distributed edge.
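
As a rough illustration of the idea (not Latent AI’s actual algorithm, which uses calibration, per-channel scales, and more), a toy symmetric 8-bit quantizer might look like this:

// Toy symmetric 8-bit quantization of a Float32 weight array, for illustration only.
function quantizeInt8(weights) {
  const maxAbs = Math.max(...weights.map(Math.abs));
  const scale = maxAbs / 127;                            // map [-maxAbs, maxAbs] onto [-127, 127]
  const q = Int8Array.from(weights, w => Math.round(w / scale));
  return { q, scale };
}

function dequantize({ q, scale }) {
  return Float32Array.from(q, v => v * scale);           // approximate reconstruction
}

const { q, scale } = quantizeInt8([0.12, -0.5, 0.33]);
console.log(q, dequantize({ q, scale }));                // one byte per weight instead of four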

Optimized AI services can process data closest to the source and perform inferences at the distributed edge.

Selecting Real-World WASM Neural Network Examples

For our exploration, we use state-of-the-art deep neural networks called MobileNet. MobileNets are designed specifically for embedded platforms such as smartphones, and can achieve high recognition accuracy in visual object detection. We compress MobileNet AI models to be small and fast, in order to represent the variety of use cases that can be deployed as distributed AI services. Please see this blog for more details on the AI model architecture.

We use the MobileNetV2 model variant for our exploration. The models are trained to detect different sets of visual objects: (1) a larger model with 10 classes derived from the ImageNet dataset, and (2) a smaller version with just two classes derived from the COCO dataset. ImageNet and COCO are public, open-source image datasets used as benchmarks for AI models; images are labeled with detected objects such as persons, vehicles, bicycles, traffic lights, etc. Using Latent AI’s compression tool, we were able to compress and compile the MobileNetV2 models into WASM programs. In WASM form, we can achieve fast and efficient processing of the AI model with a small storage footprint.

We want WASM neural networks to be as fast and efficient as possible. We spun up a Workers app to accept an image from a client, convert and preprocess the image into a cleaned data array, run it through the model, and then return a class for that image. For both the large and small MobileNetV2 models, we create three variants with different bit precision (32-bit floating point, 16-bit integer, and 8-bit integer). The average memory and inference times for the large AI model are 110ms and 189ms, respectively; for the smaller AI model, they are 159ms and 15ms, respectively.
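
The Worker code itself isn’t included in this post, but the flow described above might look roughly like the sketch below; preprocess and runInference are hypothetical stand-ins for the WASM-backed MobileNetV2 code:

// Hypothetical outline of the Workers app described above; preprocess() and
// runInference() are stubs standing in for the WASM-backed MobileNetV2 code.
addEventListener("fetch", event => {
  event.respondWith(handleRequest(event.request));
});

async function handleRequest(request) {
  if (request.method !== "POST") {
    return new Response("Send an image via POST", { status: 405 });
  }
  const imageBytes = new Uint8Array(await request.arrayBuffer()); // raw image from the client
  const tensor = preprocess(imageBytes);    // resize/normalize into a cleaned data array
  const label = await runInference(tensor); // run the compiled model and pick the top class
  return new Response(JSON.stringify({ label }), {
    headers: { "Content-Type": "application/json" },
  });
}

// Stubs only: the real versions decode the image and call into the WASM module.
function preprocess(bytes) { return bytes; }
async function runInference(tensor) { return "person"; }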

Our analysis suggests that overall processing can be improved by reducing the overhead of memory operations. For the large model, lowering bit precision to 8 bits reduces the share of latency spent on memory operations from 48% to 26%. For the small model, memory load times dominate the inference computation, with over 90% of the latency spent in memory operations.

It is important to note that our results are based on our initial exploration, which focused more on functionality than on optimization. We made sure the results are consistent by averaging our measurements over 50-100 iterations. We acknowledge that there are still network- and system-related latencies that can be further optimized, but we believe the early results described here show promise for AI model inference on the distributed edge.

Comparison of memory and inference processing times for large and small DNNs.

Learning from Real-World WASM Neural Network Example

What lessons can we draw from our example use case?

First of all, we recommend a minimal compute and memory footprint for AI models deployed to the network edge. A small footprint allows the data types of WASM AI models to line up better, reducing memory load overhead. WASM practitioners know that WASM speed-ups come from the tight coupling between the JavaScript API and native machine code. Because WASM code does not need to speculate on data types, compilation for WASM can be parallelized and better optimized.

Furthermore, we encourage running AI models at 8-bit precision to reduce overall size. These 8-bit AI models are readily compressed and compiled for the target hardware, greatly reducing the overhead of hosting the models for inference. For video imagery, there is also no overhead to convert digitized raw data (e.g., image files stored as integers) to the floating-point values a floating-point AI model would require.
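
To illustrate that last point with a hypothetical snippet: an 8-bit model can consume stored image bytes as-is, while a floating-point model needs an extra conversion pass (and four times the memory) first:

// Illustration: an 8-bit model can consume stored image bytes directly,
// while a float32 model needs an extra conversion pass (and 4x the memory).
const rawPixels = new Uint8Array([12, 200, 37, 255]); // e.g. decoded image data

const int8Input = rawPixels;                                   // 8-bit model: use the bytes as-is
const floatInput = Float32Array.from(rawPixels, p => p / 255); // float model: convert and rescale

console.log(int8Input.byteLength, floatInput.byteLength);      // 4 vs 16 bytes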

Finally, we suggest using a smart cache for AI models so that Workers can essentially reduce memory load times and focus solely on neural network inference at runtime. Again, 8-bit models allow more AI models to be hosted and ready for inference. Referring to our exploratory results, hosted small AI models can be served at approximately 15ms inference time, offering a very compelling user experience with low latency and local processing. The WASM API also provides a significant performance increase over pure-JS toolchains like TensorFlow.js. For example, against the 189ms inference time for the large AI model on WASM, we observed around 1500ms with a TensorFlow.js workflow, approximately an 8x difference in compute latency.
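
One possible shape for such a cache, sketched under the assumption that the model ships as a Workers WASM binding (here called MODEL_WASM, a hypothetical name), is to keep the instantiated model in the isolate’s global scope so repeated requests skip the load step:

// Keep the instantiated model in the isolate's global scope so repeated
// requests skip the memory-load step and only pay inference time.
// MODEL_WASM is a hypothetical Workers WASM module binding.
let cachedModel = null;

async function getModel() {
  if (!cachedModel) {
    cachedModel = await WebAssembly.instantiate(MODEL_WASM, {}); // first request pays the load cost
  }
  return cachedModel; // later requests reuse the warm instance
}

addEventListener("fetch", event => {
  event.respondWith(
    getModel().then(model =>
      new Response(`model ready with ${Object.keys(model.exports).length} exports`)
    )
  );
});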

Unlocking the Future of the Distributed Edge

With exceedingly optimized WASM neural networks, distributed edge networks can move the inference closer to users, offering new edge AI services closer to the source of the data. With Latent AI technology to compress and compile WASM neural networks, the distributed edge networks can (1) host more models, (2) offer lower latency responses, and (3) potentially lower power utilization with more efficient computing.

Example person detected using a small AI model, 10x compressed to 150KB.

Imagine, for example, that the small AI model described earlier can distinguish whether a person is in a video feed. Digital systems, e.g., doorbell and doorway entry cameras, can hook up to Cloudflare Workers to verify whether a person is present in the camera’s field of view. Similarly, other AI services could conduct sound analyses to check for broken windows and water leaks. With these distributed AI services, applications can run without access to deep-cloud services. Furthermore, the sensor platform can be built with ultra-low-cost, low-power hardware in very compact form factors.

Application developers can now offer AI services with neural networks trained, compressed, and compiled natively as WASM neural networks. Latent AI developer tools can compress WASM neural networks and provide WASM runtimes offering blazingly fast inference for the device and infrastructure edge. With scale and speed baked in, developers can easily create high-performance experiences for their users, wherever they are, at any scale. More importantly, we can scale enterprise applications on the edge, while offering the desired return on investment using edge networks.

About Latent AI

Latent AI is an early-stage venture spinout of SRI International. Our mission is to enable developers and change the way we think about building AI for the edge. We develop software tools designed to help companies add AI to edge devices and to empower users with new smart IoT applications. For more information about the availability of LEIP SDK, please feel free to contact us at [email protected] or check out our website.

Let’s build a Cloudflare Worker with WebAssembly and Haskell

Post Syndicated from Cristhian Motoche original https://blog.cloudflare.com/cloudflare-worker-with-webassembly-and-haskell/

This is a guest post by Cristhian Motoche of Stack Builders.

At Stack Builders, we believe that Haskell’s system of expressive static types offers many benefits to the software industry and the world-wide community that depends on our services. In order to fully realize these benefits, it is necessary to have proper training and access to an ecosystem that allows for reliable deployment of services. In exploring the tools that help us run our systems based on Haskell, our developer Cristhian Motoche has created a tutorial that shows how to compile Haskell to WebAssembly using Asterius for deployment on Cloudflare.

What is a Cloudflare Worker?

Cloudflare Workers is a serverless platform that allows us to run our code on the edge of the Cloudflare infrastructure. It’s built on Google V8, so it’s possible to write functionality in JavaScript or any other language that targets WebAssembly.

WebAssembly is a portable binary instruction format that can be executed fast in a memory-safe sandboxed environment. For this reason, it’s especially useful for tasks that need to perform resource-demanding and self-contained operations.

Why use Haskell to target WebAssembly?

Haskell is a pure functional language that can target WebAssembly. As such, it helps developers break down complex problems into small functions that can later be composed into larger tasks. Additionally, it’s statically typed and has type inference, so it will complain if there are type errors at compile time. Because of that and much more, Haskell is a good source language for targeting WebAssembly.

From Haskell to WebAssembly

We’ll use Asterius to target WebAssembly from Haskell. It’s a well documented tool that is updated often and supports a lot of Haskell features.

First, as suggested in the documentation, we’ll use podman to pull the Asterius prebuilt container from Docker hub. In this tutorial, we will use Asterius version 200617, which works with GHC 8.8.

podman run -it --rm -v $(pwd):/workspace -w /workspace terrorjack/asterius:200617

Now we’ll create a Haskell module in a file called fact.hs that will export a pure function:

module Factorial (fact) where

fact :: Int -> Int
fact n = go n 1
  where
    go 0 acc = acc
    go n acc = go (n - 1) (n*acc)

foreign export javascript "fact" fact :: Int -> Int

In this module, we define a pure function called fact, optimized with tail recursion and exported using the Asterius JavaScript FFI, so that it can be called when a WebAssembly module is instantiated in JavaScript.

Next, we’ll create a JavaScript file called fact_node.mjs that contains the following code:

import * as rts from "./rts.mjs";
import module from "./fact.wasm.mjs";
import req from "./fact.req.mjs";

async function handleModule(m) {
  const i = await rts.newAsteriusInstance(Object.assign(req, {module: m}));
  const result = await i.exports.fact(5);
  console.log(result);
}

module.then(handleModule);

This code imports rts.mjs (the common runtime), the WebAssembly loader, and the required parameters for the Asterius instance. It creates a new Asterius instance, calls the exported function fact with the input 5, and prints the result.

You have probably noticed that fact is called asynchronously. This is the case for any function exported by Asterius, even a pure one.

Next, we’ll compile this code using the Asterius command line interface (CLI) ahc-link, and we’ll run the JavaScript code in Node:

ahc-link \
  --input-hs fact.hs \
  --no-main \
  --export-function=fact \
  --run \
  --input-mjs fact_node.mjs \
  --output-dir=node

This command takes fact.hs as the Haskell input file, specifies that no main function is exported, and exports the fact function. Additionally, it takes fact_node.mjs as the entry JavaScript file (replacing the default generated one), and places the generated code in a directory called node.

Running the ahc-link command from above will print the following output in the console:

[INFO] Compiling fact.hs to WebAssembly
...
[INFO] Running node/fact.mjs
120

As you can see, the code runs in Node and prints the result of fact (120) to the console.

Push your code to Cloudflare Workers

Now we’ll set everything up for deploying our code to Cloudflare Workers.

First, let’s add a metadata.json file with the following content:

{
  "body_part": "script",
  "bindings": [
    {
      "type": "wasm_module",
      "name": "WASM",
      "part": "wasm"
    }
  ]
}

This file is needed to specify the wasm_module binding. The name value corresponds to the global variable to access the WebAssembly module from your Worker code. In our example, it’s going to have the name WASM.

Our next step is to define the entry point of the Workers script, in a file called fact_cfw.mjs:

import * as rts from "./rts.mjs";
import fact from "./fact.req.mjs";

async function handleFact(param) {
  const i = await rts.newAsteriusInstance(Object.assign(fact, { module: WASM }));
  return await i.exports.fact(param);
}

async function handleRequest(req) {
  if (req.method == "POST") {
    const data = await req.formData();
    const param = parseInt(data.get("param"));
    if (param) {
      const resp = await handleFact(param);
      return new Response(resp, {status: 200});
    } else {
      return new Response(
        "Expecting 'param' in request to be an integer",
        {status: 400},
      );
    }
  }
  return new Response("Method not allowed", {status: 405});
}

addEventListener("fetch", event => {
  event.respondWith(handleRequest(event.request))
})

There are a few interesting things to point out in this code:

  1. We import rts.mjs and fact.req.mjs to load the exported functions from our WebAssembly module.
  2. handleFact is an asynchronous function that creates an Asterius instance with the WASM module, provided as a Workers global variable, and calls the exported function fact with some input.
  3. handleRequest handles the request of the Worker. It expects a POST request, with a parameter called param in the request body. If param is a number, it calls handleFact to respond with the result of fact.
  4. Using the Service Workers API, we listen to the fetch event that will respond with the result of handleRequest.

We need to build and bundle our code in a single JavaScript file, because Workers only accepts one script per worker. Fortunately, Asterius comes with Parcel.js, which will bundle all the necessary code in a single JavaScript file.

ahc-link \
  --input-hs fact.hs \
  --no-main \
  --export-function=fact \
  --input-mjs fact_cfw.mjs \
  --bundle \
  --browser \
  --output-dir worker

ahc-link will generate some files inside a directory called worker. For our Workers, we’re only interested in the JavaScript file (fact.js) and the WebAssembly module (fact.wasm). Now, it’s time to submit both of them to Workers. We can do this with the provided REST API.

Make sure you have an account id ($CF_ACCOUNT_ID), a name for your script ($SCRIPT_NAME), and an API Token ($CF_API_TOKEN):

cd worker
curl -X PUT "https://api.cloudflare.com/client/v4/accounts/$CF_ACCOUNT_ID/workers/scripts/$SCRIPT_NAME" \
     -H  "Authorization: Bearer $CF_API_TOKEN" \
     -F "[email protected];type=application/json" \
     -F "[email protected];type=application/javascript" \
     -F "[email protected];type=application/wasm"

Now, visit the Workers UI, where you can use the editor to view, edit, and test the script. You can also enable it on a workers.dev subdomain ($CFW_SUBDOMAIN); in that case, you could then simply:

curl -X POST $CFW_SUBDOMAIN \
       -H 'Content-Type: application/x-www-form-urlencoded' \
       --data 'param=5'

Beyond a simple Haskell file

So far, we’ve created a WebAssembly module that exports a pure Haskell function we ran in Workers. However, we can also create and build a Cabal project using Asterius ahc-cabal CLI, and then use ahc-dist to compile it to WebAssembly.

First, let’s create the project:

ahc-cabal init -m -p cabal-cfw-example

Then, let’s add some dependencies to our cabal project. The cabal file will look like this:

cabal-version:       2.4
name:                cabal-cfw-example
version:             0.1.0.0
license:             NONE

executable cabal-cfw-example
  ghc-options: -optl--export-function=handleReq
  main-is:             Main.hs
  build-depends:
    base,
    bytestring,
    aeson >=1.5 && < 1.6,
    text
  default-language:    Haskell2010

It’s a simple cabal file, except for the -optl--export-function=handleReq ghc flag. This is necessary when exporting a function from a cabal project.

In this example, we’ll define a simple User record and derive its JSON instances automatically using Template Haskell!

{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TemplateHaskell   #-}

module Main where

import           Asterius.Types
import           Control.Monad
import           Data.Aeson                 hiding (Object)
import qualified Data.Aeson                 as A
import           Data.Aeson.TH
import qualified Data.ByteString.Lazy.Char8 as B8
import           Data.Text


main :: IO ()
main = putStrLn "CFW Cabal"

data User =
  User
    { name :: Text
    , age  :: Int
    }

$(deriveJSON defaultOptions 'User)

NOTE: It’s not necessary to create a Cabal project for this example, because the prebuilt container comes with a lot of prebuilt packages (aeson included). Nevertheless, it will help us show the potential of ahc-cabal and ahc-dist.

Next, we’ll define handleReq, which we’ll export using JavaScript FFI just like we did before.

handleReq :: JSString -> JSString -> IO JSObject
handleReq method rawBody =
  case fromJSString method of
    "POST" ->
      let eitherUser :: Either String User
          eitherUser = eitherDecode (B8.pack $ fromJSString rawBody)
       in case eitherUser of
            Right _  -> js_new_response (toJSString "Success!") 200
            Left err -> js_new_response (toJSString err) 400
    _ -> js_new_response (toJSString "Not a valid method") 405

foreign export javascript "handleReq" handleReq :: JSString -> JSString -> IO JSObject

foreign import javascript "new Response($1, {\"status\": $2})"
  js_new_response :: JSString -> Int -> IO JSObject

This time, we define js_new_response, a Haskell foreign import that creates a JavaScript Response object. handleReq takes two string parameters from JavaScript and uses them to prepare a response.

Now let’s build and install the binary in the current directory:

ahc-cabal new-install --installdir . --overwrite-policy=always

This will generate a binary for our executable, called cabal-cfw-example. We’re going to use ahc-dist to take that binary and target WebAssembly:

ahc-dist --input-exe cabal-cfw-example --export-function=handleReq --no-main --input-mjs cabal_cfw_example.mjs --bundle --browser

cabal_cfw_example.mjs contains the following code:

import * as rts from "./rts.mjs";
import cabal_cfw_example from "./cabal_cfw_example.req.mjs";

async function handleRequest(req) {
  const i = await rts.newAsteriusInstance(Object.assign(cabal_cfw_example, { module: WASM }));
  const body = await req.text();
  return await i.exports.handleReq(req.method, body);
}

addEventListener("fetch", event => {
  event.respondWith(handleRequest(event.request))
});

Finally, we can deploy our code to Workers by defining a metadata.json file and uploading the script and the WebAssembly module using Workers API as we did before.

Caveats

Workers limits the file size of your JavaScript and WebAssembly. Therefore, you need to be careful with any dependencies you add.

Conclusion

Stack Builders builds better software for better living through technologies like expressive static types. We used Asterius to compile Haskell to WebAssembly and deployed it to Cloudflare Workers using the Workers API. Asterius supports a lot of Haskell features (e.g. Template Haskell) and it provides an easy-to-use JavaScript FFI to interact with JavaScript. Additionally, it provides prebuilt containers that contain a lot of Haskell packages, so you can start writing a script right away.

Following this approach, we can write functional type-safe code in Haskell, target it to WebAssembly, and publish it to Workers, which runs on the edge of the Cloudflare infrastructure.

For more content, check out our blogs and tutorials!

Rendering React on the Edge with Flareact and Cloudflare Workers

Post Syndicated from Guest Author original https://blog.cloudflare.com/rendering-react-on-the-edge-with-flareact-and-cloudflare-workers/

The following is a guest post from Josh Larson, Engineer at Vox Media.

Imagine you’re the maintainer of a high-traffic media website, and your DNS is already hosted on Cloudflare.

Page speed is critical. You need to get content to your audience as quickly as possible on every device. You also need to render ads in a speedy way to maintain a good user experience and make money to support your journalism.

One solution would be to render your site statically and cache it at the edge. This would help ensure you have top-notch delivery speed because you don’t need a server to return a response. However, your site has decades worth of content. If you wanted to make even a small change to the site design, you would need to regenerate every single page during your next deploy. This would take ages.

Another issue is that your site would be static — and future updates to content or new articles would not be available until you deploy again.

That’s not going to work.

Another solution would be to render each page dynamically on your server. This ensures you can return a dynamic response for new or updated articles.

However, you’re going to need to pay for some beefy servers to be able to handle spikes in traffic and respond to requests in a timely manner. You’ll also probably need to implement a system of internal caches to optimize the performance of your app, which could lead to a more complicated development experience. That also means you’ll be at risk of a thundering herd problem if, for any reason, your cache becomes invalidated.

Neither of these solutions are great, and you’re forced to make a tradeoff between one of these two approaches.

Thankfully, you’ve recently come across a project like Next.js which offers a hybrid approach: static-site generation along with incremental regeneration. You’re in love with the patterns and developer experience in Next.js, but you’d also love to take advantage of the Cloudflare Workers platform to host your site.

Cloudflare Workers allow you to run your code on the edge quickly, efficiently and at scale. Instead of paying for a server to host your code, you can host it directly inside the datacenter — reducing the number of network trips required to load your application. In a perfect world, we wouldn’t need to find hosting for a Next.js site, because Cloudflare offers the same JavaScript hosting functionality with the Workers platform. With their dynamic runtime and edge caching capabilities, we wouldn’t need to worry about making a tradeoff between static and dynamic for our site.

Unfortunately, frameworks like Next.js and Cloudflare Workers don’t mesh together particularly well due to technical constraints. Until now:

I’m excited to announce Flareact, a new open-source React framework built for Cloudflare Workers.

With Flareact, you don’t need to make the tradeoff between a static site and a dynamic application.

Flareact allows you to render your React apps at the edge rather than on the server. It is modeled after Next.js, which means it supports file-based page routing, dynamic page paths and edge-side data fetching APIs.

Not only are Flareact pages rendered at the edge — they’re also cached at the edge using the Cache API. This allows you to provide a dynamic content source for your app without worrying about traffic spikes or response times.
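
As a rough sketch of what edge caching with the Cache API looks like in a Worker (this is not Flareact’s actual internals, just the underlying primitive):

// Sketch of edge caching with the Cache API; not Flareact's actual code.
addEventListener("fetch", event => {
  event.respondWith(handleRequest(event));
});

async function handleRequest(event) {
  const cache = caches.default;                 // the Worker's default edge cache
  const cached = await cache.match(event.request);
  if (cached) return cached;                    // cache hit: serve instantly from the edge

  const html = "<h1>Rendered at the edge</h1>"; // stand-in for a rendered React page
  const response = new Response(html, {
    headers: {
      "Content-Type": "text/html",
      "Cache-Control": "max-age=60",            // let the edge keep it for 60 seconds
    },
  });
  event.waitUntil(cache.put(event.request, response.clone()));
  return response;
}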

With no servers or origins to deal with, your site is instantly available to your audience. Cloudflare Workers gives you a 0ms cold start and responses from the edge within milliseconds.

You can check out the docs and get started now.

To get started manually, install the latest wrangler, and use the handy wrangler generate command below to create your first project:

npm i @cloudflare/wrangler -g
wrangler generate my-project https://github.com/flareact/flareact-template

What’s the big deal?

Hosting React apps on Cloudflare Workers Sites is not a new concept. In fact, you’ve always been able to deploy a create-react-app project to Workers Sites in addition to static versions of other frameworks like Gatsby and Next.js.

However, Flareact renders your React application at the edge. This allows you to provide an initial server response with HTML markup — which can be helpful for search engine crawlers. You can also cache the response at the edge and optionally invalidate that cache on a timed basis — meaning your static markup will be regenerated if you need it to be fresh.

This isn’t a new pattern: Next.js has done the hard work in defining the shape of this API with SSG support and Incremental Static Regeneration. While there are nuanced differences in the implementation between Flareact and Next.js, they serve a similar purpose: to get your application to your end-user in the quickest and most-scalable way possible.

A focus on developer experience

A magical developer experience is a crucial ingredient to any successful product.

As a longtime fan and user of Next.js, I wanted to experiment with running the framework on Cloudflare Workers. However, Next.js and its APIs are framed around the Node.js HTTP Server API, while Cloudflare Workers use V8 isolates and are modeled after the FetchEvent type.

Since we don’t have typical access to a filesystem inside V8 isolates, it’s tough to mimic the environment required to run a dynamic Next.js server at the edge. Though projects like Fab have come up with workarounds, I decided to approach the project with a clean slate and use existing patterns established in Next.js in a brand-new framework.

As a developer, I absolutely love the simplicity of exporting an asynchronous function from my page to have it supply props to the component. Flareact implements this pattern by allowing you to export a getEdgeProps function. This is similar to getStaticProps in Next.js, and it matches the expected return shape of that function in Next.js — including a revalidate parameter. Learn more about data fetching in Flareact.
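
A page using this pattern might look like the following sketch; the pages/posts.js path, the placeholder API URL, and the one-hour revalidate value are illustrative:

// pages/posts.js -- an illustrative Flareact page using getEdgeProps.
import React from "react";

export async function getEdgeProps() {
  // Runs at the edge, not in the browser; the API URL is a placeholder.
  const posts = await fetch("https://api.example.com/posts").then(res => res.json());

  return {
    props: { posts },
    revalidate: 3600, // regenerate the cached page at most once per hour
  };
}

export default function Posts({ posts }) {
  return (
    <ul>
      {posts.map(post => (
        <li key={post.id}>{post.title}</li>
      ))}
    </ul>
  );
}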

I was also inspired by the API Routes feature of Next.js when I implemented the API Routes feature of Flareact — enabling you to write standard Cloudflare Worker scripts directly within your React app.

I hope porting over an existing Next.js project to Flareact is a breeze!

How it works

When a FetchEvent request comes in, Flareact inspects the URL pathname to decide how to handle it:

If the request is for a page or for page props, it checks the cache for that request and returns it if there’s a hit. If there is a cache miss, it generates the page request or props function, stores the result in the cache, and returns the response.

If the request is for an API route, it sends the entire FetchEvent along to the user-defined API function, allowing the user to respond as they see fit.

If you want your cached page to be revalidated after a certain amount of time, you can return an additional revalidate property from getEdgeProps(). This instructs Flareact to cache the endpoint for that number of seconds before generating a new response.

Finally, if the request is for a static asset, it returns it directly from Workers KV.
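
Put together, a simplified and hypothetical version of that dispatch might look like the sketch below; handleApiRoute, serveStaticAsset, and handlePage are placeholder names rather than Flareact’s real internals:

// Simplified, hypothetical sketch of the routing described above.
addEventListener("fetch", event => {
  event.respondWith(route(event));
});

async function route(event) {
  const { pathname } = new URL(event.request.url);

  if (pathname.startsWith("/api/")) {
    return handleApiRoute(event);           // hand the whole FetchEvent to user code
  }
  if (pathname.startsWith("/static/")) {
    return serveStaticAsset(event.request); // static assets come out of Workers KV
  }

  // Pages and page props: try the edge cache first, render on a miss.
  const cache = caches.default;
  const cached = await cache.match(event.request);
  if (cached) return cached;

  const response = await handlePage(event.request);
  event.waitUntil(cache.put(event.request, response.clone()));
  return response;
}

// Placeholder implementations so the sketch is self-contained.
async function handleApiRoute(event) { return new Response("api"); }
async function serveStaticAsset(request) { return new Response("asset"); }
async function handlePage(request) { return new Response("<h1>page</h1>"); }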

The Worker

The core responsibilities of the Worker (or, in a traditional SSR framework, the server) are to:

  1. Render the initial React page component into static HTML markup.
  2. Provide the initial page props as a JSON object, embedded into the static markup in a script tag.
  3. Load the client-side JavaScript bundles and stylesheets necessary to render the interactive page.

One challenge in building Flareact is that Webpack targets the webworker output rather than the node output. This makes it difficult to inform the Worker which pages exist in the filesystem, since there is no filesystem access.

To get around this, Flareact leverages require.context, a Webpack-specific API, to inspect the project and build a manifest of pages on the client and the worker. I’d love to replace this with a smarter bundling strategy on the client-side eventually.
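
For readers unfamiliar with require.context, here is an illustrative use of it to build a page manifest; the pages directory name and the key normalization are assumptions, not Flareact’s exact code:

// Illustrative use of Webpack's require.context to build a page manifest.
// The "./pages" directory and the key normalization are assumptions.
const context = require.context("./pages", true, /\.js$/);

const pages = {};
for (const key of context.keys()) {
  // "./about.js" -> "/about"
  const route = key.replace(/^\./, "").replace(/\.js$/, "");
  pages[route] = context(key).default; // the page component's default export
}

console.log(Object.keys(pages)); // e.g. ["/index", "/about"]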

The Client

In addition to handling incoming Worker requests, Flareact compiles a client bundle containing the code necessary for routing, data fetching and more from the browser.

The core responsibilities of the client are to:

  1. Listen for routing events
  2. Fetch the necessary page component and its props from the worker over AJAX

Building a client router from scratch has been a challenge. It listens for changes to the internal route state, updates the URL pathname with pushState, makes an AJAX request to the worker for the page props, and then updates the current component in the render tree with the requested page.
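
A stripped-down sketch of that flow is shown below; the /_flareact/props endpoint name is an assumption made for illustration, not Flareact’s real API:

// Minimal sketch of client-side navigation; not Flareact's real router.
// The "/_flareact/props" endpoint name is an assumption made for illustration.
async function loadPage(pathname, renderPage) {
  const props = await fetch(`/_flareact/props${pathname}`) // ask the Worker for page props
    .then(res => res.json());
  renderPage(pathname, props);                             // swap the component in the render tree
}

function navigate(pathname, renderPage) {
  history.pushState({}, "", pathname); // update the address bar without a full reload
  return loadPage(pathname, renderPage);
}

// Back/forward buttons re-render without pushing a new history entry.
window.addEventListener("popstate", () => {
  loadPage(window.location.pathname, (path, props) => {
    console.log("render", path, props); // stand-in for the React re-render
  });
});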

It was fun building a flareact/link component similar to next/link:

import Link from "flareact/link";

export default function Index() {
  return (
    <div>
      <Link href="/about">
        <a>Go to About</a>
      </Link>
    </div>
  );
}

I also set out to build a custom version of next/head for Flareact. As it turns out, this was non-trivial! With lots of interesting stuff going on behind the scenes to support SSR and client-side routing events, I decided to make flareact/head a simple wrapper around react-helmet instead:

import Head from "flareact/head";

export default function Index() {
  return (
    <div>
      <Head>
        <title>My page title</title>
      </Head>
      <h1>Hello, world.</h1>
    </div>
  );
}

Local Development

The local developer experience of Flareact leverages the new wrangler dev command, sending server requests through a local tunnel to the Cloudflare edge and back to your machine.


This is a huge win for productivity, since you don’t need to manually build and deploy your application to see how it will perform in a production environment.

It’s also a really exciting update to the serverless toolchain. Running a robust development environment in a serverless world has always been a challenge, since your code is executing in a non-traditional context. Tunneling local code to the edge and back is such a great addition to Cloudflare’s developer experience.

Use cases

Flareact is a great candidate for a lot of Jamstack-adjacent applications, like blogs or static marketing sites.

It could also be used for more dynamic applications, with robust API functions and authentication mechanisms — all implemented using Cloudflare Workers.

Imagine building a high-traffic e-commerce site with Flareact, where both site reliability and dynamic rendering for things like price changes and stock availability are crucial.

There are also untold possibilities for integrating the Workers KV into your edge props or API functions as a first-class database solution. No need to reach for an externally-hosted database!

While the project is still in its early days, here are a couple of real-world examples:

The road ahead

I have to be honest: creating a server-side rendered React framework with little prior knowledge was very difficult. There’s still a ton to learn, and Flareact has a long way to go to reach parity with Next.js in the areas of optimization and production-readiness.

Here’s what I’m hoping to add to Flareact in the near future:

  • Smarter client bundling and Webpack chunks to reduce individual page weight
  • A more feature-complete client-side router
  • The ability to extend and customize the root document of the app
  • Support for more style frameworks (CSS-in-JS, Sass, CSS modules, etc)
  • A more stable development environment
  • Documentation and support for environment variables, secrets and KV namespaces
  • A guide for deploying from GitHub Actions and other CI tools

If the project sounds interesting to you, be sure to check out the source code on GitHub. Contributors are welcome!